Predicting Disordered Regions in Proteins
Author Information
Author(s): Han Pengfei, Zhang Xiuzhen, Feng Zhi-Ping
Primary Institution: RMIT University
Hypothesis
Can amino acid indices and Random Forest models effectively predict disordered regions in proteins?
Conclusion
The DRaai-L and DRaai-S algorithms outperform many existing methods for predicting disordered regions in proteins.
Supporting Evidence
- The algorithms DRaai-L and DRaai-S achieved areas under the ROC curve of 85.1% and 81.2%, respectively.
- DRaai-L outperformed many existing algorithms based on amino acid composition.
- The study utilized a comprehensive dataset from DisProt and CASP7 for training and testing.
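The area under the ROC curve (AUC) quoted above can be read as the probability that a randomly chosen disordered residue receives a higher score than a randomly chosen ordered one. The sketch below computes it that way from per-residue labels and scores. The function name and the toy data are illustrative assumptions, not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for a disordered residue, 0 for an ordered residue.
    scores: predicted disorder score per residue (higher = more disordered).
    Tied pairs count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 3 disordered and 3 ordered residues.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
```

The pairwise loop is quadratic in the number of residues, which is fine for illustration; a rank-based formulation would be preferable on full datasets.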
Takeaway
Scientists created computer programs to help find parts of proteins that are flexible and don't have a fixed shape, which is important for understanding how proteins work.
Methodology
The study used Random Forest machine learning models and amino acid indices to predict disordered regions in proteins.
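As a rough illustration of how an amino acid index can be turned into per-residue features, the sketch below averages an index over a sliding window centred on each residue. The choice of the Kyte-Doolittle hydropathy scale, the window size, and the function name are assumptions made for this example; the summary does not state which indices or window sizes the authors used.

```python
# Kyte-Doolittle hydropathy scale, used here only as an example index.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_features(sequence, index=KYTE_DOOLITTLE, half_window=2):
    """Return one feature per residue: the mean index value in its window.

    The window covers half_window residues on each side and is truncated
    at the sequence ends. Residues missing from the index (for example
    "X") raise KeyError; this sketch does not handle them.
    """
    features = []
    n = len(sequence)
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        values = [index[aa] for aa in sequence[lo:hi]]
        features.append(sum(values) / len(values))
    return features


# Hypothetical example sequence.
example = window_features("MKRRSSEEL")
```

Features of this kind, computed for several indices, could then be passed to a Random Forest implementation such as scikit-learn's `RandomForestClassifier`, with one row per residue.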
Potential Biases
The imbalance between ordered and disordered residues in the training datasets may bias the models.
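One common way to counter this kind of class imbalance is to weight each class inversely to its frequency during training. The sketch below computes such weights; it is a generic technique shown for illustration, and the summary does not say whether or how the authors corrected for imbalance. The function name and the toy label counts are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes get weights above 1 and frequent classes get weights
    below 1, so each class contributes equally in total.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy example (hypothetical): 8 ordered residues (0), 2 disordered (1).
weights = balanced_class_weights([0] * 8 + [1] * 2)
```

With these toy labels the disordered class receives four times the weight of the ordered class, offsetting its four-fold lower frequency.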
Limitations
Prediction accuracy is lower for short disordered regions than for long ones because less training data is available for them.
Statistical Information
P-Value
p<0.05
Statistical Significance
p<0.05