*Dear Professor Mean, How big should a study of a diagnostic test be? I want to estimate a sample size for the sensitivity and specificity of a test. I guess confidence intervals would address this, but is there a calculation analogous to a power analysis that would apply to figure out the size of the groups beforehand? – Jovial John*
Sample size is not important. Just ask for enough in the research grant so that you can buy your consulting statistician a new computer. Just kidding. You are actually pretty close to having the right answer.
Power calculations are appropriate only when you have a research hypothesis. The emphasis in a study of a diagnostic test is estimation. You want accurate estimates of sensitivity, specificity and/or likelihood ratios.
**When the focus is on estimation, you determine the sample size through the width of the confidence interval**. You specify how precise (how narrow) you want your confidence intervals to be. This determines your sample size.
**For sensitivity and specificity, use the standard formulas for a binomial proportion**. The formulas can be found in any introductory statistics book. For a likelihood ratio, the formulas are a bit more complex, but the same principle applies.
For example, suppose you want to estimate the sensitivity (Sn) and specificity (Sp) of a diagnostic test. Your best guess is that sensitivity will be at least 75% and specificity will be at least 90%. The formula for a 95% confidence interval for Sn or Sp would be

$$Sn \pm 1.96\sqrt{\frac{Sn(1-Sn)}{n_a}} \quad\text{or}\quad Sp \pm 1.96\sqrt{\frac{Sp(1-Sp)}{n_n}}$$

where n~a~ and n~n~ are the number of abnormal (diseased) and normal (healthy) patients in the study. You assess abnormal and normal under the gold standard, of course.
A sample of 50 abnormal and 50 normal patients would give a 95% confidence interval of plus/minus 0.12 for Sn and plus/minus 0.083 for Sp. This seems like a reasonable amount of precision. A sample of 75 in each group would provide slightly narrower confidence intervals (plus/minus 0.098 and plus/minus 0.068 respectively). Your choice of sample size depends in large part on the number of patients you can recruit, and on the balance between maximizing precision and minimizing the amount of time you spend on this project.
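If you want to check these half-widths yourself, or turn the question around and ask how many patients a given precision requires, a short Python sketch along the following lines should do. It assumes the usual normal-approximation interval for a binomial proportion with a multiplier of 1.96; the function names `half_width` and `patients_needed` are made up for this illustration.

```python
import math

Z_95 = 1.96  # normal multiplier for a two-sided 95% interval (assumed)


def half_width(p, n, z=Z_95):
    """Plus/minus margin of the normal-approximation interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)


def patients_needed(p, target, z=Z_95):
    """Smallest group size whose plus/minus margin is no larger than target."""
    return math.ceil(z**2 * p * (1 - p) / target**2)


# Planning guesses from the text: Sn = 0.75, Sp = 0.90
for n in (50, 75):
    print(n, round(half_width(0.75, n), 3), round(half_width(0.90, n), 3))

# Group sizes for a margin of plus/minus 0.10 (a hypothetical target)
print(patients_needed(0.75, 0.10), patients_needed(0.90, 0.10))
```

Note that the abnormal and normal groups are sized separately: a guess closer to 50% needs more patients for the same margin, so sensitivity is the more demanding of the two here.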
Suppose instead that you wanted to estimate the area under the curve (AUC) for a Receiver Operating Characteristic curve (ROC curve). The formula for a standard error here is a bit messier. The web page [...] the standard error would be 0.044, which is reasonably small. With 75 in each group, the standard error would be 0.036.
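The text here does not spell out which formula or which AUC value lies behind these standard errors. One widely used approximation is the one from Hanley and McNeil (1982), and the sketch below uses it as a stand-in, which is an assumption on my part. The planning value of 0.8 for the AUC is also an assumption, as is the function name `auc_standard_error`. With equal groups of 50 and 75, this produces standard errors of roughly 0.045 and 0.036, close to the figures quoted above.

```python
import math


def auc_standard_error(auc, n_abnormal, n_normal):
    """Hanley-McNeil (1982) approximation to the standard error of an AUC."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc**2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_abnormal - 1) * (q1 - auc**2)
        + (n_normal - 1) * (q2 - auc**2)
    ) / (n_abnormal * n_normal)
    return math.sqrt(variance)


ASSUMED_AUC = 0.8  # hypothetical planning value, not taken from the text

for n in (50, 75):
    print(n, auc_standard_error(ASSUMED_AUC, n, n))
```

As with sensitivity and specificity, you would try a few group sizes and pick the smallest one whose standard error you can live with.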
Jovial John wants to know how many subjects to include in a research study of a new diagnostic test. Professor Mean explains that you should select a sample size that will make the confidence interval for sensitivity and/or specificity sufficiently narrow.
You can find an earlier version of this page on my original website.