Predicting disordered regions in proteins using the profiles of amino acid indices
2009

Predicting Disordered Regions in Proteins

Sample size: 472 | Publication | Evidence: moderate

Author Information

Author(s): Han Pengfei, Zhang Xiuzhen, Feng Zhi-Ping

Primary Institution: RMIT University

Hypothesis

Can amino acid indices and Random Forest models effectively predict disordered regions in proteins?

Conclusion

The algorithms DRaai-L and DRaai-S predict disordered regions in proteins effectively, and DRaai-L outperforms many existing methods based on amino acid composition.

Supporting Evidence

  • The algorithms DRaai-L and DRaai-S achieved areas under the ROC curve of 85.1% and 81.2%, respectively.
  • DRaai-L outperformed many existing algorithms based on amino acid composition.
  • The study utilized a comprehensive dataset from DisProt and CASP7 for training and testing.
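For readers unfamiliar with the area under the ROC curve (AUC) quoted above, the following is a minimal sketch of how that metric can be computed from per-residue labels and prediction scores. It is not the authors' evaluation code. The function name `roc_auc` and the convention of 1 for disordered and 0 for ordered residues are assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve for binary labels (1 = disordered, 0 = ordered).

    Computed as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as half a win.
    This pairwise form is O(n_pos * n_neg), which is fine for small examples.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

An AUC of 1.0 means every disordered residue is scored above every ordered one, while 0.5 corresponds to random ranking. The reported 85.1% and 81.2% correspond to 0.851 and 0.812 on this scale.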

Takeaway

Scientists created computer programs to help find parts of proteins that are flexible and don't have a fixed shape, which is important for understanding how proteins work.

Methodology

The study trained Random Forest machine learning models on profiles of amino acid indices to predict disordered regions in proteins.
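As a rough illustration of what a profile of an amino acid index can look like, the sketch below averages one index over a sliding window centred on each residue. The choice of the Kyte-Doolittle hydropathy scale, the window width of 9, and the function name `index_profile` are all assumptions for illustration; this summary does not state which indices or window sizes the authors used.

```python
# Kyte-Doolittle hydropathy values, used here only as one example of an
# amino acid index (an assumption; not necessarily an index from the study).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def index_profile(sequence, index, window=9):
    """Return the windowed mean of `index` centred on each residue.

    Windows are truncated at the sequence ends. Residues missing from
    `index` (for example 'X') are skipped when averaging; a window with
    no known residues yields 0.0.
    """
    half = window // 2
    profile = []
    for i in range(len(sequence)):
        start = max(0, i - half)
        end = min(len(sequence), i + half + 1)
        values = [index[aa] for aa in sequence[start:end] if aa in index]
        profile.append(sum(values) / len(values) if values else 0.0)
    return profile
```

Profiles from several indices could be stacked into one feature vector per residue and passed to a Random Forest classifier, for instance scikit-learn's `RandomForestClassifier`, although the authors' actual feature set and software may differ.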

Potential Biases

The training datasets contain unequal proportions of ordered and disordered residues, and this imbalance may bias the models.
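One common way to counter this kind of imbalance is to weight each class inversely to its frequency during training. The sketch below shows that heuristic. It is an assumption offered for illustration, since this summary does not describe how, or whether, the authors corrected for imbalance. The function name `balanced_class_weights` is hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes receive weights above 1 and common classes below 1,
    so each class contributes equally in total.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Example: 8 ordered residues (0) and 2 disordered residues (1).
weights = balanced_class_weights([0] * 8 + [1] * 2)
```

This is the same formula scikit-learn applies when `class_weight="balanced"` is requested, so the resulting dictionary could also be passed explicitly as `class_weight` to a classifier that accepts it.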

Limitations

The prediction accuracy for short disordered regions is lower due to limited training data.

Statistical Information

P-Value and Statistical Significance

p<0.05

Digital Object Identifier (DOI)

10.1186/1471-2105-10-S1-S42
