Using Natural Language Processing to Build Bacterial Regulatory Networks
Author Information
Author(s): Carlos Rodríguez-Penagos, Heladia Salgado, Irma Martínez-Flores, Julio Collado-Vides
Primary Institution: Universidad Nacional Autónoma de México
Hypothesis
Can automatic annotation using text-mining techniques complement manual curation of biological databases?
Conclusion
Manually curating the output of automatic text processing is an effective complement to a more detailed review of the literature.
Supporting Evidence
- The NLP system recreated 45% of the manually curated RegulonDB.
- The system also identified new interactions that had not previously been curated.
- A novel Regulatory Interaction Markup Language was proposed for better data representation.
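
One way to picture the comparison behind these points is as a set overlap between curated and automatically extracted interactions. The sketch below is illustrative only: the triple format, the function names, and the toy data are assumptions, and none of it comes from RegulonDB or from the study's actual evaluation procedure.

```python
def recovered_fraction(curated, extracted):
    """Fraction of curated interactions that also appear in the extracted set."""
    curated = set(curated)
    if not curated:
        return 0.0
    return len(curated & set(extracted)) / len(curated)


def novel_candidates(curated, extracted):
    """Extracted interactions absent from the curated set (candidates for manual review)."""
    return set(extracted) - set(curated)


# Hypothetical placeholder triples: (regulator, target, effect).
curated = {
    ("RegA", "geneX", "activation"),
    ("RegA", "geneY", "repression"),
    ("RegB", "geneZ", "activation"),
    ("RegC", "geneW", "repression"),
}
extracted = {
    ("RegA", "geneX", "activation"),
    ("RegB", "geneZ", "activation"),
    ("RegD", "geneV", "activation"),
}

fraction = recovered_fraction(curated, extracted)
candidates = novel_candidates(curated, extracted)
```

In this toy case two of the four curated triples are recovered, and one extracted triple is left over as a candidate for a human curator to check.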
Takeaway
This study shows that computers can help scientists find important information in research papers about how genes work together, but humans still need to check the results.
Methodology
A rule-based Natural Language Processing system was implemented to create networks of regulatory interactions from various collections of abstracts and full-text papers.
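
As a rough illustration of what a rule-based approach can look like, the sketch below applies a single regular-expression rule to sentences and collects the matches into a regulator-to-targets network. The pattern, the verb list, and every name in it are assumptions made for this example; this summary does not describe the rules the actual system used.

```python
import re
from collections import defaultdict

# Hypothetical rule: "<regulator> activates|represses|regulates [the] [expression of] <target>"
PATTERN = re.compile(
    r"\b(?P<regulator>\w+)\s+(?P<verb>activates|represses|regulates)\s+"
    r"(?:the\s+)?(?:(?:expression|transcription)\s+of\s+)?(?:the\s+)?(?P<target>\w+)"
)

# Map each trigger verb to an effect label (illustrative labels only).
EFFECT = {
    "activates": "activation",
    "represses": "repression",
    "regulates": "unspecified",
}


def extract_interactions(text):
    """Return (regulator, target, effect) triples matched by the rule."""
    return [
        (m["regulator"], m["target"], EFFECT[m["verb"]])
        for m in PATTERN.finditer(text)
    ]


def build_network(sentences):
    """Group extracted interactions by regulator."""
    network = defaultdict(set)
    for sentence in sentences:
        for regulator, target, effect in extract_interactions(sentence):
            network[regulator].add((target, effect))
    return network


# Made-up sentences with placeholder names, not text from any real abstract.
sentences = [
    "RegA activates the transcription of geneX.",
    "RegB represses geneY under these conditions.",
    "RegA regulates the expression of geneZ.",
]
network = build_network(sentences)
```

A single surface pattern like this misses passive constructions, negation, and coreference, which is one reason the output of such a system still needs manual review.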
Limitations
The evaluation relies on the completeness and accuracy of the RegulonDB database, which may itself not be exhaustive.