Understanding Population Structure in Biological Data
Author Information
Author(s): Jonathan Carlson, Carl Kadie, Simon Mallal, David Heckerman, Philip Awadalla
Primary Institution: Microsoft Research
Hypothesis
Can hierarchical population structure confound the identification of correlations in biological data?
Conclusion
The study demonstrates that generative models of the confounding process can correct for hierarchical population structure in biological data, improving the identification of true associations, although no single model handles every form of confounding.
Supporting Evidence
- The study identifies two distinct confounding processes: coevolution and conditional influence.
- Generative models were shown to effectively correct for confounding effects in biological data.
- Results indicate that no single method is best for addressing all forms of confounding.
Takeaway
When analyzing biological data, shared ancestry among the sampled individuals or sequences can make unrelated traits look correlated. Modeling that structure explicitly helps separate real associations from artifacts.
Methodology
The study compares several methods that correct for confounding in discrete data with hierarchical population structure, then applies generative models to real biological data.
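To make the confounding problem concrete, the sketch below simulates two binary traits that evolve independently of each other, first down a shared balanced binary tree and then as unstructured independent samples, and counts how often a naive chi-square test declares them associated. It is an illustration only, not one of the methods from the study: the tree shape, the per-branch flip probability, the trial count, and all function names are assumptions chosen for the example. The critical value 3.841 is the standard 5% threshold for a chi-square statistic with one degree of freedom.

```python
import random


def evolve_on_tree(depth, flip_prob, rng):
    """Evolve one binary trait down a balanced binary tree; return leaf states."""
    states = [rng.random() < 0.5]  # random root state
    for _ in range(depth):
        children = []
        for parent in states:
            for _child in range(2):
                # Each child inherits the parent's state, flipping with flip_prob.
                flipped = rng.random() < flip_prob
                children.append((not parent) if flipped else parent)
        states = children
    return states


def chi_square_2x2(x, y):
    """Pearson chi-square statistic for two binary vectors (0.0 if degenerate)."""
    n = len(x)
    a = sum(1 for xi, yi in zip(x, y) if xi and yi)
    b = sum(1 for xi, yi in zip(x, y) if xi and not yi)
    c = sum(1 for xi, yi in zip(x, y) if not xi and yi)
    d = n - a - b - c
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a trait with no variation carries no evidence
    return n * (a * d - b * c) ** 2 / denom


def false_positive_rate(sampler, trials, rng, critical=3.841):
    """Fraction of trials in which two independent traits test as associated."""
    hits = 0
    for _ in range(trials):
        x = sampler(rng)
        y = sampler(rng)  # generated independently of x
        if chi_square_2x2(x, y) > critical:
            hits += 1
    return hits / trials


DEPTH = 7         # assumed tree depth: 2**7 = 128 leaves
FLIP_PROB = 0.05  # assumed per-branch probability of a state change
TRIALS = 500


def tree_sampler(rng):
    return evolve_on_tree(DEPTH, FLIP_PROB, rng)


def unstructured_sampler(rng):
    return [rng.random() < 0.5 for _ in range(2 ** DEPTH)]


rng = random.Random(0)
tree_rate = false_positive_rate(tree_sampler, TRIALS, rng)
unstructured_rate = false_positive_rate(unstructured_sampler, TRIALS, rng)
print(f"false-positive rate with shared tree:    {tree_rate:.3f}")
print(f"false-positive rate without structure:   {unstructured_rate:.3f}")
```

Because leaves in the same clade share most of their history, the tree-structured samples contain far fewer effectively independent observations than the 128 leaves suggest, so the naive test rejects the null much more often than its nominal 5% level. Correction methods of the kind the study compares aim to account for that shared history.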
Potential Biases
The models' assumptions about how variables relate to one another may bias the results.
Limitations
The models may not apply to all forms of confounding, and their effectiveness can vary with the specific biological context.
Participant Demographics
The study involved HIV sequences and HLA data from 205 individuals.
Statistical Information
P-Value
p=0.0001
Statistical Significance
p<0.05