Improving Literature Classification with Gene Ontology
Author Information
Author(s): Jin Bo, Muller Brian, Zhai Chengxiang, Lu Xinghua
Primary Institution: Medical University of South Carolina
Hypothesis
Can multi-label classification methods that exploit the structure of the Gene Ontology graph improve the automatic annotation of biomedical literature?
Conclusion
Graph-based multi-label classification methods significantly outperform conventional flat multi-label classification approaches for protein annotation based on literature.
Supporting Evidence
- Graph-based methods significantly improve predictions of Gene Ontology terms relative to the flat multi-label baseline.
- The study utilized a dataset of 36,423 MEDLINE entries for evaluation.
- Graph-based classifiers can suggest annotations closely related to true annotations.
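One way to make "closely related" concrete is to measure how many edges separate a predicted term from the true term in the ontology graph. The sketch below is illustrative only: the hierarchy is a toy with made-up term names (not real GO identifiers), and the undirected shortest-path distance is an assumed measure, since this summary does not say which relatedness measure the study used.

```python
from collections import deque

# Toy is-a hierarchy (child -> list of parents).
# Term names are hypothetical and stand in for Gene Ontology terms.
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalytic_activity": ["root"],
    "protein_binding": ["binding"],
    "dna_binding": ["binding"],
    "kinase_activity": ["catalytic_activity"],
}


def build_undirected(parents):
    """Turn the child -> parents map into an undirected adjacency map."""
    adjacency = {term: set() for term in parents}
    for child, term_parents in parents.items():
        for parent in term_parents:
            adjacency[child].add(parent)
            adjacency[parent].add(child)
    return adjacency


def graph_distance(parents, predicted, true_term):
    """Number of edges on the shortest path between two terms (BFS).

    Returns None if the terms are not connected.
    """
    adjacency = build_undirected(parents)
    seen = {predicted}
    queue = deque([(predicted, 0)])
    while queue:
        term, dist = queue.popleft()
        if term == true_term:
            return dist
        for neighbour in adjacency[term]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None
```

Under this assumed measure, predicting "dna_binding" when the true term is "protein_binding" is a near miss (the two terms are siblings), whereas predicting "kinase_activity" is further away in the graph.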
Takeaway
This study shows that using the graph structure of the Gene Ontology helps computers classify scientific papers about proteins more accurately, making it easier to label them correctly.
Methodology
The study evaluated three graph-based multi-label classification algorithms against a conventional flat multi-label algorithm using a dataset of biomedical literature.
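The difference between a flat and a graph-aware approach can be sketched in a few lines. This summary does not name the three graph-based algorithms, so the example below is not a reconstruction of any of them. It shows one generic, assumed way of using the hierarchy: take independent per-term scores (the flat view) and then add every ancestor of each predicted term so that the label set is consistent with the graph. The term names, scores, and the 0.5 threshold are all hypothetical.

```python
# Toy is-a hierarchy (child -> list of parents); names are hypothetical.
PARENTS = {
    "root": [],
    "binding": ["root"],
    "protein_binding": ["binding"],
    "dna_binding": ["binding"],
}


def ancestors(parents, term):
    """Return all ancestors of a term by walking parent links."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def flat_predict(scores, threshold=0.5):
    """Flat multi-label decision: each term is thresholded on its own."""
    return {term for term, score in scores.items() if score >= threshold}


def graph_consistent_predict(scores, parents, threshold=0.5):
    """Flat decision followed by ancestor closure over the hierarchy."""
    predicted = flat_predict(scores, threshold)
    closed = set(predicted)
    for term in predicted:
        closed |= ancestors(parents, term)
    return closed


# Hypothetical per-term scores for one document.
scores = {"root": 0.9, "binding": 0.4, "protein_binding": 0.8, "dna_binding": 0.1}
flat_labels = flat_predict(scores)
graph_labels = graph_consistent_predict(scores, PARENTS)
```

Here the flat decision keeps "protein_binding" but drops its parent "binding", which is inconsistent with the hierarchy; the graph-aware step restores the missing ancestor.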
Limitations
The methods may need further improvement to meet real-world annotation needs, and their performance depends on the quality of the training data.