>The key point is the "assumptions about distribution of errors" and a 
>(alone) can't determine that.

True, they can't determine that, but they need that information to devise a 
sampling scheme that tells you how much you need to sample, where and when, to 
assure yourself, with a given degree of probability, that the full data 
set (population in stat terms) does not have more than a chosen % of errors.
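
To give a feel for the "how much" part, here is a rough sketch of the simplest case I can think of. It assumes errors are independent, the population is large compared with the sample, and you only accept the data set if the sample contains zero errors. The function name and parameters are my own, purely for illustration.

```python
import math

def zero_error_sample_size(max_error_rate, confidence):
    """Smallest n such that finding no errors in n independent random
    samples supports, at the given confidence, the claim that the
    population error rate is below max_error_rate.

    Reasoning: if the true rate were at least p, the chance of seeing
    zero errors in n samples is at most (1 - p)**n. We want that to be
    no more than 1 - confidence, so n >= ln(1 - confidence) / ln(1 - p).
    """
    if not (0 < max_error_rate < 1 and 0 < confidence < 1):
        raise ValueError("both arguments must lie strictly between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(1 - max_error_rate))

# Example: at most 1% errors, 95% confidence
n = zero_error_sample_size(0.01, 0.95)
```

If the independence assumption fails (the dependent or clustered cases), this number is no longer valid on its own, which is exactly why the distribution of errors has to be known first.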

Are all errors random, binomially distributed, and independent? Then use 
random sampling.
Are they likely to be dependent, i.e. if a machine has one error, is it 
likely to have more? Then stratify sampling by machine.
Are they clustered by time, e.g. more likely to occur at higher volumes 
than at lower? Then weight the sampling toward the high-volume periods, 
such as the end of the day.
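
As a minimal sketch of those three schemes, assuming each record is a dict with hypothetical "id", "machine" and "hour" fields (none of these names come from any real system), something like the following would do. The weighted version draws without replacement using random keys u**(1/w), which is one standard way to do it, not the only one.

```python
import random
from collections import Counter, defaultdict

def simple_random_sample(records, n, rng):
    """Independent errors: every record is equally likely to be picked."""
    return rng.sample(records, n)

def stratified_sample(records, per_stratum, rng, key="machine"):
    """Dependent errors: draw the same number of records from each machine."""
    strata = defaultdict(list)
    for r in records:
        strata[r[key]].append(r)
    sample = []
    for group in strata.values():
        sample.extend(rng.sample(group, min(per_stratum, len(group))))
    return sample

def volume_weighted_sample(records, n, rng, key="hour"):
    """Time-clustered errors: records from busier hours are more likely
    to be picked. Each record's weight is the volume of its hour; the
    n records with the largest keys u**(1/weight) form the sample."""
    volume = Counter(r[key] for r in records)
    ranked = sorted(
        records,
        key=lambda r: rng.random() ** (1.0 / volume[r[key]]),
        reverse=True,
    )
    return ranked[:n]

# Made-up data: three machines, volume rising through the day.
records = []
for machine in ("A", "B", "C"):
    for hour in range(9, 18):
        for _ in range((hour - 8) * 5):
            records.append({"id": len(records), "machine": machine, "hour": hour})

rng = random.Random(0)
plain = simple_random_sample(records, 20, rng)
by_machine = stratified_sample(records, 5, rng)
by_volume = volume_weighted_sample(records, 30, rng)
```

Which one is appropriate depends entirely on what is known about how the errors arise, which is the point made above.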

>If you develop a software product, with say 100 features, do you only test
>a sample of the 100 features, or do you test all 100?

We test all 100. Some test cases may exercise more than one feature, and 
more important (riskier, more expensive if got wrong) features are tested 
more thoroughly (e.g. more stringent tests, more failure modes, more 
interactions) than less important ones (which may be satisfied by a simple 
visual check). 
But bear in mind that that is testing behaviour and characteristics, not 
auditing transaction data.

>If an accountant audits the books of a company, would the shareholders
>be happy if the accountant only checked a sample of the records?

They are, because sampling is all that auditors do. Auditors have many 
sampling schemes, and some use whatever prior knowledge they have to point 
them in the right direction; see above.

I used to know this stuff when I worked in quality control thirty years ago, 
but it has faded now. A few years ago, when I worked in the TCD Statistics 
department (http://www.tcd.ie/Statistics), my colleagues could do this kind 
of thing very easily. As I now understand that most people on this list are 
CS academics, maybe it's time to talk to your colleagues?

