Someone asked me how to determine the sample size for a study involving a diagnostic test. It seems tricky, because most studies of diagnostic tests don’t have a formal hypothesis. What you need to do instead is specify a particular statistic that you are interested in estimating and then select a sample size so that the confidence interval for this estimate is reasonably precise.
For example, suppose you want to estimate the sensitivity (Sn) and specificity (Sp) of a diagnostic test. Your best guess is that sensitivity will be at least 75% and specificity will be at least 90%. The formula for a 95% confidence interval for Sn or Sp would be
$$Sn \pm 1.96\sqrt{\frac{Sn(1-Sn)}{n_a}} \quad\text{and}\quad Sp \pm 1.96\sqrt{\frac{Sp(1-Sp)}{n_n}}$$
where n~a~ and n~n~ are the numbers of abnormal (diseased) and normal (healthy) patients in the study.
A sample of 50 abnormal and 50 normal patients would give a 95% confidence interval of plus/minus 0.12 for Sn and plus/minus 0.083 for Sp. This seems like a reasonable amount of precision. A sample of 75 in each group would provide slightly narrower confidence intervals (plus/minus 0.098 and plus/minus 0.068 respectively). Your choice of sample size depends in large part on the number of patients you can recruit and on the balance between maximizing precision and minimizing the amount of time you spend on this project.
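If you want to check these plus/minus values yourself, here is a minimal Python sketch. It assumes the usual normal approximation for a proportion (1.96 standard errors on either side of the estimate), which matches the numbers quoted above. The function names `ci_half_width` and `n_for_half_width` are my own illustrative choices, not part of any library.

```python
import math


def ci_half_width(p, n, z=1.96):
    """Half-width of the normal-approximation confidence interval
    for a proportion p estimated from n patients (assumed method)."""
    return z * math.sqrt(p * (1 - p) / n)


def n_for_half_width(p, half_width, z=1.96):
    """Smallest n whose normal-approximation half-width is at most
    half_width (hypothetical helper; it just inverts the formula)."""
    return math.ceil(z ** 2 * p * (1 - p) / half_width ** 2)


# Best guesses from the example: Sn = 0.75, Sp = 0.90.
for n in (50, 75):
    sn = ci_half_width(0.75, n)
    sp = ci_half_width(0.90, n)
    print(f"n = {n} per group: Sn +/- {sn:.3f}, Sp +/- {sp:.3f}")
```

You can also turn the question around: `n_for_half_width(0.75, 0.10)` gives the number of abnormal patients needed to estimate Sn to within plus/minus 0.10. Keep in mind that the normal approximation can be rough when the proportion is close to 0 or 1 or the group is small.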
Suppose instead that you wanted to estimate the area under the curve (AUC) for a Receiver Operating Characteristic curve (ROC curve). The formula for the standard error here is a bit messier. The web page
I have a whole section on my website for determining the appropriate sample size, including a page on sample size for a diagnostic test, which I have just updated to include the above example.
You can find an earlier version of this page on my original website.