Four hundred or more participants needed for stable contingency table estimates of clinical prediction rule performance

Peter Kent, Eleanor Boyle, Jennifer L Keating, Hanne B Albert, Jan Hartvigsen

Research output: Contribution to journal › Journal article › Research › peer-review

Abstract

OBJECTIVE: To quantify variability in the results of statistical analyses based on contingency tables and discuss the implications for the choice of sample size for studies that derive clinical prediction rules.

STUDY DESIGN AND SETTING: An analysis of three pre-existing sets of large cohort data (n = 4,062 to 8,674) was performed. In each dataset, repeated random sampling at various sample sizes, from n = 100 up to n = 2,000, was performed 100 times at each sample size, and the variability in estimates of sensitivity, specificity, positive and negative likelihood ratios, post-test probabilities, odds ratios and risk/prevalence ratios was calculated for each sample size.
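The resampling procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' analysis code: the cohort is synthetic, the helper names (table_metrics, make_population, sampling_spread) are hypothetical, samples are drawn without replacement (the abstract does not say which scheme was used), and only a subset of the reported statistics is computed.

```python
import random


def table_metrics(tp, fp, fn, tn):
    """Accuracy statistics for one 2x2 table; None where a ratio is undefined."""
    def ratio(num, den):
        return num / den if den else None

    sens = ratio(tp, tp + fn)
    spec = ratio(tn, tn + fp)
    both = sens is not None and spec is not None
    return {
        "sensitivity": sens,
        "specificity": spec,
        "lr_positive": ratio(sens, 1 - spec) if both else None,
        "lr_negative": ratio(1 - sens, spec) if both else None,
        "odds_ratio": ratio(tp * tn, fp * fn),
    }


def make_population(size, prevalence, sens, spec, seed=0):
    """Hypothetical cohort of (test_positive, outcome_present) pairs."""
    rng = random.Random(seed)
    cohort = []
    for _ in range(size):
        outcome = rng.random() < prevalence
        hit_rate = sens if outcome else 1 - spec
        cohort.append((rng.random() < hit_rate, outcome))
    return cohort


def sampling_spread(population, n, repeats=100, seed=0):
    """Draw `repeats` samples of size n; return (min, max) of each statistic."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(repeats):
        sample = rng.sample(population, n)  # without replacement (assumption)
        tp = sum(1 for test, outcome in sample if test and outcome)
        fp = sum(1 for test, outcome in sample if test and not outcome)
        fn = sum(1 for test, outcome in sample if not test and outcome)
        tn = n - tp - fp - fn
        estimates.append(table_metrics(tp, fp, fn, tn))
    spread = {}
    for key in estimates[0]:
        # Skip repeats where the statistic was undefined (zero denominator).
        values = [e[key] for e in estimates if e[key] is not None]
        spread[key] = (min(values), max(values)) if values else None
    return spread


if __name__ == "__main__":
    # Synthetic cohort; all parameter values here are illustrative assumptions.
    cohort = make_population(5000, prevalence=0.3, sens=0.8, spec=0.8)
    for n in (100, 400, 2000):
        low, high = sampling_spread(cohort, n)["sensitivity"]
        print(f"n={n}: sensitivity ranged from {low:.2f} to {high:.2f}")
```

Running the script prints the range of sensitivity estimates across 100 repeated samples at each size, which gives a rough feel for how sample-specific variability shrinks as n grows.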

RESULTS: There were very wide, and statistically significant, differences in estimates derived from contingency tables from the same dataset when calculated in sample sizes below 400 people, and typically this variability stabilized in samples of 400 to 600 people. Although estimates of prevalence also varied significantly in samples below 600 people, that relationship only explains a small component of the variability in these statistical parameters.

CONCLUSION: To reduce sample-specific variability, contingency tables should consist of 400 participants or more when used to derive clinical prediction rules or test their performance.

Original language: English
Journal: Journal of Clinical Epidemiology
Volume: 82
Pages (from-to): 137–148
ISSN: 0895-4356
DOI: 10.1016/j.jclinepi.2016.10.004
Publication status: Published - 2017

Fingerprint

Decision Support Techniques
Sample Size
Odds Ratio

Cite this

@article{a51997d5b01840719303ca73f8d2bba6,
title = "Four hundred or more participants needed for stable contingency table estimates of clinical prediction rule performance",
abstract = "OBJECTIVE: To quantify variability in the results of statistical analyses based on contingency tables and discuss the implications for the choice of sample size for studies that derive clinical prediction rules. STUDY DESIGN AND SETTING: An analysis of three pre-existing sets of large cohort data (n = 4,062 to 8,674) was performed. In each dataset, repeated random-sampling of various sample sizes, from n = 100 up to n = 2,000, was performed 100 times at each sample size and the variability in estimates of sensitivity, specificity, positive and negative likelihood ratios, post-test probabilities, odds ratios and risk/prevalence ratios, for each sample size was calculated. RESULTS: There were very wide, and statistically significant, differences in estimates derived from contingency tables from the same dataset when calculated in sample sizes below 400 people, and typically this variability stabilized in samples of 400 to 600 people. Although estimates of prevalence also varied significantly in samples below 600 people, that relationship only explains a small component of the variability in these statistical parameters. CONCLUSION: To reduce sample-specific variability, contingency tables should consist of 400 participants or more when used to derive clinical prediction rules or test their performance.",
author = "Peter Kent and Eleanor Boyle and Keating, {Jennifer L} and Albert, {Hanne B} and Jan Hartvigsen",
note = "Copyright {\textcopyright} 2016 Elsevier Inc. All rights reserved.",
year = "2017",
doi = "10.1016/j.jclinepi.2016.10.004",
language = "English",
volume = "82",
pages = "137--148",
journal = "Journal of Clinical Epidemiology",
issn = "0895-4356",
publisher = "Elsevier",

}

Four hundred or more participants needed for stable contingency table estimates of clinical prediction rule performance. / Kent, Peter; Boyle, Eleanor; Keating, Jennifer L; Albert, Hanne B; Hartvigsen, Jan.

In: Journal of Clinical Epidemiology, Vol. 82, 2017, p. 137–148.

TY - JOUR

T1 - Four hundred or more participants needed for stable contingency table estimates of clinical prediction rule performance

AU - Kent, Peter

AU - Boyle, Eleanor

AU - Keating, Jennifer L

AU - Albert, Hanne B

AU - Hartvigsen, Jan

N1 - Copyright © 2016 Elsevier Inc. All rights reserved.

PY - 2017

Y1 - 2017

N2 - OBJECTIVE: To quantify variability in the results of statistical analyses based on contingency tables and discuss the implications for the choice of sample size for studies that derive clinical prediction rules. STUDY DESIGN AND SETTING: An analysis of three pre-existing sets of large cohort data (n = 4,062 to 8,674) was performed. In each dataset, repeated random-sampling of various sample sizes, from n = 100 up to n = 2,000, was performed 100 times at each sample size and the variability in estimates of sensitivity, specificity, positive and negative likelihood ratios, post-test probabilities, odds ratios and risk/prevalence ratios, for each sample size was calculated. RESULTS: There were very wide, and statistically significant, differences in estimates derived from contingency tables from the same dataset when calculated in sample sizes below 400 people, and typically this variability stabilized in samples of 400 to 600 people. Although estimates of prevalence also varied significantly in samples below 600 people, that relationship only explains a small component of the variability in these statistical parameters. CONCLUSION: To reduce sample-specific variability, contingency tables should consist of 400 participants or more when used to derive clinical prediction rules or test their performance.

AB - OBJECTIVE: To quantify variability in the results of statistical analyses based on contingency tables and discuss the implications for the choice of sample size for studies that derive clinical prediction rules. STUDY DESIGN AND SETTING: An analysis of three pre-existing sets of large cohort data (n = 4,062 to 8,674) was performed. In each dataset, repeated random-sampling of various sample sizes, from n = 100 up to n = 2,000, was performed 100 times at each sample size and the variability in estimates of sensitivity, specificity, positive and negative likelihood ratios, post-test probabilities, odds ratios and risk/prevalence ratios, for each sample size was calculated. RESULTS: There were very wide, and statistically significant, differences in estimates derived from contingency tables from the same dataset when calculated in sample sizes below 400 people, and typically this variability stabilized in samples of 400 to 600 people. Although estimates of prevalence also varied significantly in samples below 600 people, that relationship only explains a small component of the variability in these statistical parameters. CONCLUSION: To reduce sample-specific variability, contingency tables should consist of 400 participants or more when used to derive clinical prediction rules or test their performance.

U2 - 10.1016/j.jclinepi.2016.10.004

DO - 10.1016/j.jclinepi.2016.10.004

M3 - Journal article

C2 - 27847252

VL - 82

SP - 137

EP - 148

JO - Journal of Clinical Epidemiology

JF - Journal of Clinical Epidemiology

SN - 0895-4356

ER -