Prediction of glycosylation sites using random forests
2008
Predicting Glycosylation Sites with Random Forests
Sample size: 261
publication
Evidence: high
Author Information
Author(s): Stephen E. Hamby, Jonathan D. Hirst
Primary Institution: School of Chemistry, University of Nottingham
Hypothesis
Can the random forest algorithm improve the prediction of glycosylation sites in proteins?
Conclusion
The study developed an accurate predictor for glycosylation sites, significantly outperforming existing methods.
Supporting Evidence
- The GPP program predicts glycosylation sites with an accuracy of 90.8% for Ser, 92.0% for Thr, and 92.8% for Asn.
- The random forest algorithm outperformed existing glycosylation predictors.
- The study used a large dataset from O-GLYCBASE for training and validation.
Takeaway
The researchers built a computer program that predicts where sugars attach to proteins, and it is correct more than 90% of the time.
Methodology
The study used a random forest algorithm and pairwise patterns to predict glycosylation sites from protein sequences.
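This summary does not say how the pairwise patterns are encoded, so the following is only a minimal sketch under stated assumptions. It assumes a pairwise pattern is a pair of residues at two fixed offsets from a candidate Ser, Thr, or Asn, taken from a short window around that site. The function names, the window half-width `flank`, and the `-` padding character are hypothetical choices for illustration, not taken from the paper.

```python
def candidate_windows(sequence, targets="STN", flank=3):
    """Yield (position, window) for every Ser/Thr/Asn in the sequence.

    Assumption: windows are 2 * flank + 1 residues wide, and positions
    beyond either end of the protein are padded with '-'.
    """
    padded = "-" * flank + sequence + "-" * flank
    for i, residue in enumerate(sequence):
        if residue in targets:
            # The candidate residue sits at the centre of the window.
            yield i, padded[i : i + 2 * flank + 1]


def pairwise_features(window):
    """Return the set of pairwise patterns present in a window.

    Each pattern is (offset_a, residue_a, offset_b, residue_b), with
    offsets measured relative to the central candidate residue.
    Padding positions are skipped.
    """
    flank = len(window) // 2
    features = set()
    for a in range(len(window)):
        for b in range(a + 1, len(window)):
            if window[a] != "-" and window[b] != "-":
                features.add((a - flank, window[a], b - flank, window[b]))
    return features


if __name__ == "__main__":
    # Toy sequence, purely illustrative.
    for position, window in candidate_windows("MNSTA", flank=1):
        print(position, window, sorted(pairwise_features(window)))
```

Each candidate site then becomes a set of present or absent patterns, which could be turned into a binary feature vector and passed to any random forest implementation. Separate models per residue type would match the per-residue accuracies reported above, though that split is also an assumption here.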
Limitations
The models generated by random forest can be challenging to interpret, and the dataset may not cover all glycosylation types.
Statistical Information
Statistical Significance
p<0.05
Digital Object Identifier (DOI)