Re-examining False Positives in Health Data Analysis
Author Information
Author(s): Terry Haines, Richard Beare, Velandai Srikanth
Primary Institution: Monash University
Hypothesis
How does the abundance of health data affect the rate of false positive findings in research?
Conclusion
The study found that, given abundant data, analysts can produce spurious statistically significant findings in a substantial proportion of studies.
Supporting Evidence
- The cumulative type 1 error rate was 26.8% for the 24-data-point set.
- The type 1 error rate was 21.9% for the 24,000-data-point set.
- Analysts can manufacture spurious significant findings in roughly one in four to five studies.
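The arithmetic behind the "one in four to five" figure can be checked with a short sketch. It converts each reported rate into a "one in N" ratio and, purely as an illustration, estimates how many independent tests would produce the same cumulative rate. The per-test threshold of 0.05 and the independence of the tests are assumptions, not details from the study, and the function names are hypothetical.

```python
import math


def one_in_n(rate):
    """Express a proportion as 'one in N' (for example 0.25 -> 4.0)."""
    return 1.0 / rate


def equivalent_independent_tests(cumulative_rate, alpha=0.05):
    """Number k of independent tests at level alpha whose family-wise
    error rate 1 - (1 - alpha)**k equals cumulative_rate.

    ASSUMPTION: alpha = 0.05 and independent tests; the summary states neither.
    """
    return math.log(1.0 - cumulative_rate) / math.log(1.0 - alpha)


if __name__ == "__main__":
    # Rates reported in the summary for the two simulated data sets.
    for label, rate in (("24 data points", 0.268), ("24,000 data points", 0.219)):
        print(label,
              "-> one in", round(one_in_n(rate), 1),
              "| equivalent independent tests:",
              round(equivalent_independent_tests(rate), 1))
```

Under these assumptions, both reported rates correspond to only a handful of independent looks at the data, which suggests how few extra analyses are needed to reach error rates of this size.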
Takeaway
When researchers have a lot of health data, they might accidentally find results that look important but aren't really true.
Methodology
The study used 1,000 Monte Carlo simulations of a pre-post intervention study with a parallel control site.
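A minimal sketch of this kind of simulation is shown below. It is not the authors' code: the three analyses, the known unit variance, the z-tests, and the split of 24 observations into four cells of six are all assumptions chosen to keep the example small. Every data set is generated with no true effect, so any significant result is a false positive.

```python
import math
import random
from statistics import NormalDist, fmean

CELLS = ("ctl_pre", "ctl_post", "int_pre", "int_post")


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def simulate_once(rng, n_per_cell):
    """Simulate one null pre-post study with a control site and return
    the p-values of three analyses an analyst might try (all hypothetical)."""
    # No true effect anywhere: every observation is N(0, 1).
    means = {c: fmean(rng.gauss(0.0, 1.0) for _ in range(n_per_cell)) for c in CELLS}
    se = 1.0 / math.sqrt(n_per_cell)  # standard error of one cell mean (sigma = 1 assumed known)

    # Analysis 1: difference-in-differences (combines four cell means).
    did = (means["int_post"] - means["int_pre"]) - (means["ctl_post"] - means["ctl_pre"])
    # Analysis 2: post-period comparison between sites.
    post_only = means["int_post"] - means["ctl_post"]
    # Analysis 3: pre-post change within the intervention site only.
    pre_post = means["int_post"] - means["int_pre"]

    return [
        two_sided_p(did / (2.0 * se)),
        two_sided_p(post_only / (math.sqrt(2.0) * se)),
        two_sided_p(pre_post / (math.sqrt(2.0) * se)),
    ]


def cumulative_type1_rate(n_sims=1000, n_per_cell=6, alpha=0.05, seed=0):
    """Share of null simulations in which at least one analysis is significant."""
    rng = random.Random(seed)
    hits = sum(min(simulate_once(rng, n_per_cell)) < alpha for _ in range(n_sims))
    return hits / n_sims


if __name__ == "__main__":
    print("cumulative type 1 error rate:", cumulative_type1_rate())
```

Because the analyst in this sketch reports a finding if any one of the three analyses is significant, the cumulative rate exceeds the nominal 5% even though each individual test is correctly calibrated. Adding further analyses would push the rate higher.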
Potential Biases
Running multiple comparisons on the same data raises the chance of spurious findings, which can bias a study's conclusions.
Limitations
The study focuses primarily on the type 1 error rate and may not address other types of error, such as type 2 errors (missed true effects).