Unbalanced sample sizes for evaluating a diagnostic test

I get a lot of questions about unbalanced sample sizes. Quite often the mechanics of the research protocol make it easier to find a lot of patients in one group and only a few in another group.

For example, someone is evaluating a diagnostic test and notes that only 16% of the patients in the study will actually have the disease being tested for. Will this cause any bias, he wonders? Any loss in precision?

You will lose some precision, but the imbalance by itself does not cause any bias. Most studies of diagnostic tests have an imbalance, sometimes quite extreme. If a disease is rare, then the sensitivity, which uses the number of patients with the disease in the denominator, will have a lot less precision than the specificity, which uses the number of healthy patients in the denominator. The only cure is to recruit enough patients overall to ensure that there is a reasonable number of patients with the disease.
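Here is a rough sketch of how much precision you lose. It uses the 16% figure from the question above, but the total of 200 patients, the assumed sensitivity and specificity of 0.90, and the target half-width of 0.05 are all hypothetical numbers that I picked for illustration. The simple Wald interval is also just one convenient choice; other interval methods would give somewhat different widths.

```python
import math

def wald_half_width(p, n, z=1.96):
    """Half-width of an approximate 95% Wald interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)

def required_total(p, half_width, prevalence, z=1.96):
    """Total patients needed so that the diseased group is large enough
    to estimate a proportion p to within +/- half_width."""
    n_group = math.ceil(z**2 * p * (1 - p) / half_width**2)
    return math.ceil(n_group / prevalence)

total = 200          # hypothetical total sample size
prevalence = 0.16    # from the question above
n_diseased = round(total * prevalence)   # denominator for sensitivity
n_healthy = total - n_diseased           # denominator for specificity

assumed_sens = 0.90  # hypothetical value
assumed_spec = 0.90  # hypothetical value

hw_sens = wald_half_width(assumed_sens, n_diseased)
hw_spec = wald_half_width(assumed_spec, n_healthy)

print(f"Diseased: {n_diseased}, healthy: {n_healthy}")
print(f"Sensitivity half-width: {hw_sens:.3f}")
print(f"Specificity half-width: {hw_spec:.3f}")

# Total recruitment needed to pin sensitivity down to +/- 0.05
total_needed = required_total(assumed_sens, 0.05, prevalence)
print(f"Total patients needed: {total_needed}")
```

With these assumed numbers, the interval for sensitivity is more than twice as wide as the interval for specificity, even though both are estimated without bias.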

Biases can occur if you have leaky groups (e.g., some patients classified as healthy actually have the disease, because your gold standard for diagnosing the disease misses a few of them). A lot has been written about this problem and other possible sources of bias. Here are a few references:

Diagnostic Strategies for Common Medical Problems. Black ER, MD, Bordley DR, MD, Tape TG, MD, Panzer RJ, MD (1999) Philadelphia, Pennsylvania: American College of Physicians.
Effect of misclassification of causes of death in verbal autopsy: can it be adjusted? Chandramohan D, Setel P, Quigley M. Int J Epidemiol 2001: 30(3); 509-14.
Index for rating predictive accuracy of screening tests. Choi BC. Methods Inf Med 1982: 21(3); 149-53.
Sensitivity and specificity of a single diagnostic test in the presence of work-up bias. Choi BC. J Clin Epidemiol 1992: 45(6); 581-6.
Determining the value of additional surrogate exposure data for improving the estimate of an odds ratio. Dahm PF, Gail MH, Rosenberg PS, Pee D. Stat Med 1995: 14(23); 2581-98.
Correlation between electroretinogram findings and molecular analysis in the Duchenne muscular dystrophy phenotype. DeBecker I, Riddell DC, Dooley JM, Tremblay F. Journal of Ophthalmology 1994: 78(9); 719-22.
Diagnostic bias and toxic shock syndrome. Harvey M, Horwitz RI, Feinstein AR. Am J Med 1984: 76(3); 351-60.
Estimating the error rates of diagnostic tests. Hui SL, Walter SD. Biometrics 1980: 36(1); 167-71.
Inferences for likelihood ratios in the absence of a “gold standard”. Joseph L, Gyorkos T. Medical Decision Making 1996: 16(4); 412-17.
Evaluating Medical Tests: Objective and Quantitative Guidelines. Kraemer HC (1992) Newbury Park, CA: Sage Publications.
Random effects models in latent class analysis for evaluating accuracy of diagnostic tests. Qu Y, Tan M, Kutner MH. Biometrics 1996: 52(3); 797-810.
Teaching evidence-based medicine: caveats and challenges. Welch HG, Lurie JD. Acad Med 2000: 75(3); 235-40.
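
To make the leaky group problem mentioned before this list a bit more concrete, here is a small sketch of how an imperfect gold standard can distort the apparent specificity. It rests on assumptions of my own choosing: the gold standard misses a fixed fraction of diseased patients but never labels a healthy patient as diseased, and the errors of the new test are independent of the errors of the gold standard given true disease status. The function name and all of the numbers except the 16% prevalence are hypothetical.

```python
def apparent_accuracy(prevalence, true_sens, true_spec, ref_sens):
    """Apparent sensitivity and specificity of a test when the gold
    standard misses a fraction (1 - ref_sens) of diseased patients.

    Assumes the gold standard never calls a healthy patient diseased and
    that test errors are independent of gold standard errors given the
    true disease status.
    """
    missed = prevalence * (1 - ref_sens)   # diseased patients labelled healthy
    healthy = 1 - prevalence               # truly healthy patients
    # Everyone the gold standard calls diseased really is diseased,
    # so the apparent sensitivity is unchanged under these assumptions.
    apparent_sens = true_sens
    # The "healthy" group is a mix of truly healthy and missed diseased patients.
    test_negative = healthy * true_spec + missed * (1 - true_sens)
    apparent_spec = test_negative / (healthy + missed)
    return apparent_sens, apparent_spec

# Hypothetical values: the gold standard misses 10% of diseased patients.
sens, spec = apparent_accuracy(prevalence=0.16, true_sens=0.90,
                               true_spec=0.95, ref_sens=0.90)
print(f"Apparent sensitivity: {sens:.3f}")
print(f"Apparent specificity: {spec:.3f} (true value 0.950)")
```

Under these assumptions the apparent specificity falls below the true value, and recruiting more patients does not fix it. That is the difference between a bias and a loss in precision.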

Here are some guidelines for critically evaluating research on diagnostic tests.

User’s guide to the surgical literature: how to use an article about a diagnostic test. Bhandari M, Montori VM, Swiontkowski MF, Guyatt GH. J Bone Joint Surg Am 2003: 85-A(6); 1133-40.
Toward a checklist for reporting of studies of diagnostic accuracy of medical tests. Bruns DE, Huth EJ, Magid E, Young DS. Clinical Chemistry 2000: 46(7); 893-5.
Evidence-based diagnostic radiology. Dixon AK. Lancet 1997: 350(9076); 509-12.
Clinical problem solving and diagnostic decision making: selective review of the cognitive literature. Elstein AS, Schwarz A. BMJ 2002: 324(7339); 729-32.
How to read a paper. Papers that report diagnostic or screening tests. Greenhalgh T. BMJ 1997: 315(7107); 540-3.
Users' guides to the medical literature. III. How to use an article about a diagnostic test. A. Are the results of the study valid? Evidence-Based Medicine Working Group. Jaeschke R, Guyatt G, Sackett DL. JAMA 1994: 271(5); 389-91.
Users' guides to the medical literature. III. How to use an article about a diagnostic test. B. What are the results and will they help me in caring for my patients? The Evidence-Based Medicine Working Group. Jaeschke R, Guyatt GH, Sackett DL. JAMA 1994: 271(9); 703-7.
Evidence base of clinical diagnosis: Evaluation of diagnostic procedures. Knottnerus JA, van Weel C, Muris JWM. BMJ 2002: 324(7335); 477-480.
Evidence base of clinical diagnosis: The architecture of diagnostic research. Sackett DL, Haynes RB. BMJ 2002: 324(7336); 539-541.
The evaluation of diagnostic tests: principles, problems, and new developments. Sox HC. Annu Rev Med 1996: 47; 463-71.
Communicating accuracy of tests to general practitioners: a controlled study. Steurer J, Fischer JE, Bachmann LM, Koller M, ter Riet G. BMJ 2002: 324(7341); 824-6.
Methodology for a multicenter study of serious infections in young infants in developing countries. The WHO Young Infants Study Group. The WHO Young Infants Study Group. Pediatr Infect Dis J 1999: 18(10 Suppl); S8-16.
Evidence base of clinical diagnosis: Rational, cost effective use of investigations in clinical practice. Winkens R, Dinant GJ. BMJ 2002: 324(7340); 783.

You can find an earlier version of this page on my original website.

Steve Simon

2004-08-05