Prediction of glycosylation sites using random forests
2008

Predicting Glycosylation Sites with Random Forests

Sample size: 261 | Publication | Evidence: high

Author Information

Author(s): Stephen E. Hamby, Jonathan D. Hirst

Primary Institution: School of Chemistry, University of Nottingham

Hypothesis

Can the random forest algorithm improve the prediction of glycosylation sites in proteins?

Conclusion

The authors developed an accurate glycosylation site predictor, GPP, that significantly outperformed existing methods.

Supporting Evidence

  • The GPP program predicts glycosylation sites with an accuracy of 90.8% for Ser, 92.0% for Thr, and 92.8% for Asn.
  • The random forest algorithm outperformed existing glycosylation predictors.
  • The study used a large dataset from O-GLYCBASE for training and validation.
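
The per-residue accuracies above are reported separately for Ser, Thr, and Asn. As a minimal sketch of how such a breakdown can be computed, assuming accuracy means the fraction of candidate sites classified correctly, the following groups predictions by residue type. The function name and the toy records are hypothetical and are not data from the study.

```python
from collections import defaultdict

def accuracy_by_residue(records):
    """Compute accuracy per residue type.

    records: iterable of (residue, true_label, predicted_label) tuples,
    where labels are 1 (glycosylated) or 0 (not glycosylated).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for residue, truth, predicted in records:
        total[residue] += 1
        if truth == predicted:
            correct[residue] += 1
    # Fraction of correctly classified sites for each residue type
    return {residue: correct[residue] / total[residue] for residue in total}

# Toy, made-up records for illustration only (not results from the paper)
toy = [("S", 1, 1), ("S", 0, 1), ("T", 1, 1), ("N", 0, 0)]
result = accuracy_by_residue(toy)
```

Reporting accuracy per residue type rather than as one pooled figure keeps a well-predicted residue from masking a poorly predicted one.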

Takeaway

The researchers created a computer program that can guess where sugars attach to proteins, and it does a really good job at it.

Methodology

The study used a random forest algorithm and pairwise patterns to predict glycosylation sites from protein sequences.
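
To make the idea of sequence-derived features concrete, here is a minimal sketch of one way to turn a protein sequence into inputs for a classifier such as a random forest. It assumes that each Ser, Thr, or Asn is a candidate site, that a fixed window of neighboring residues is extracted around it, and that a "pairwise pattern" is a pair of residues at given offsets from the site. The window size, the padding character, and this exact encoding are assumptions for illustration and may differ from the definitions used in the paper.

```python
from itertools import combinations

WINDOW = 3  # residues on each side of the candidate site (assumed value)
PAD = "-"   # placeholder for positions beyond the ends of the sequence

def candidate_windows(sequence, residues="STN", flank=WINDOW):
    """Yield (index, residue, window) for every Ser/Thr/Asn in the sequence."""
    padded = PAD * flank + sequence + PAD * flank
    for i, aa in enumerate(sequence):
        if aa in residues:
            # The slice is centred on position i of the original sequence
            yield i, aa, padded[i:i + 2 * flank + 1]

def pairwise_features(window):
    """Return the set of (offset_a, residue_a, offset_b, residue_b) pairs.

    Offsets are relative to the central candidate residue, which is itself
    excluded, as are padding positions.
    """
    flank = len(window) // 2
    positions = [
        (k - flank, aa)
        for k, aa in enumerate(window)
        if k != flank and aa != PAD
    ]
    return {(oa, a, ob, b) for (oa, a), (ob, b) in combinations(positions, 2)}

# Hypothetical five-residue sequence used only to show the output shape
sites = list(candidate_windows("MNGTA", flank=2))
features = pairwise_features(sites[0][2])
```

Each candidate site would then be represented by the presence or absence of such pairs, and a random forest would be trained on sites with known glycosylation status.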

Limitations

The models generated by random forest can be challenging to interpret, and the dataset may not cover all glycosylation types.

Statistical Information

Statistical Significance

p < 0.05

Digital Object Identifier (DOI)

10.1186/1471-2105-9-500
