Automatic reconstruction of a bacterial regulatory network using Natural Language Processing
2007

Using Natural Language Processing to Build Bacterial Regulatory Networks

publication Evidence: moderate

Author Information

Author(s): Carlos Rodríguez-Penagos, Heladia Salgado, Irma Martínez-Flores, Julio Collado-Vides

Primary Institution: Universidad Nacional Autónoma de México

Hypothesis

Can automatic annotation using text-mining techniques complement manual curation of biological databases?

Conclusion

Manually curating the output of automatic text processing is a good complement to a more detailed review of the literature.

Supporting Evidence

  • The NLP system recreated 45% of the manually curated regulatory network in RegulonDB.
  • New interactions were identified that were not previously curated.
  • A novel Regulatory Interaction Markup Language was proposed for better data representation.
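A figure such as the 45% above can be read as the share of reference interactions that the automatic system also finds, while interactions outside the reference become candidates for curator review. The sketch below illustrates that comparison under stated assumptions: the function names, the triple representation (regulator, effect, target), and the toy data are all hypothetical and are not taken from the paper or from RegulonDB.

```python
from typing import Set, Tuple

# An interaction is modelled here as (regulator, effect, target).
# This representation is an assumption made for illustration only.
Interaction = Tuple[str, str, str]


def recovered_fraction(predicted: Set[Interaction],
                       reference: Set[Interaction]) -> float:
    """Share of reference interactions that also appear in the predictions."""
    if not reference:
        raise ValueError("reference network must not be empty")
    return len(predicted & reference) / len(reference)


def novel_candidates(predicted: Set[Interaction],
                     reference: Set[Interaction]) -> Set[Interaction]:
    """Predicted interactions absent from the reference, for manual review."""
    return predicted - reference


# Toy data, invented for this example (not results from the study).
reference = {
    ("CRP", "activation", "araC"),
    ("LexA", "repression", "recA"),
    ("FNR", "activation", "narG"),
    ("LacI", "repression", "lacZ"),
}
predicted = {
    ("CRP", "activation", "araC"),
    ("LexA", "repression", "recA"),
    ("FNR", "repression", "sodA"),
}

fraction = recovered_fraction(predicted, reference)
candidates = novel_candidates(predicted, reference)
```

With this toy input, two of the four reference interactions are recovered, and the one remaining prediction is flagged as a candidate that a curator would still need to check.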

Takeaway

This study shows that computers can help scientists find important information in research papers about how genes work together, but humans still need to check the results.

Methodology

A rule-based Natural Language Processing system was implemented to create networks of regulatory interactions from various collections of abstracts and full-text papers.
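To make the idea of rule-based extraction concrete, here is a minimal sketch that matches "regulator verb target" patterns in text with a regular expression. It is not the system described in the paper: the entity lexicon, the verb rules, and the function name `extract_interactions` are assumptions chosen for illustration, and a real system would need far richer linguistic rules.

```python
import re
from typing import List, Tuple

# Assumed toy lexicon of regulator and gene names (not from the paper).
ENTITIES = ["CRP", "FNR", "LexA", "araC", "recA", "lacZ"]

# Assumed trigger verbs mapped to an effect label (not from the paper).
VERB_RULES = {
    "activates": "activation",
    "induces": "activation",
    "represses": "repression",
    "inhibits": "repression",
}

_ENTITY = "|".join(re.escape(name) for name in ENTITIES)
_VERB = "|".join(re.escape(verb) for verb in VERB_RULES)

# One rule: <regulator> <verb> [the] [expression of] [the] <target>
PATTERN = re.compile(
    rf"\b(?P<regulator>{_ENTITY})\b\s+(?P<verb>{_VERB})\s+"
    rf"(?:the\s+)?(?:expression\s+of\s+)?(?:the\s+)?"
    rf"\b(?P<target>{_ENTITY})\b"
)


def extract_interactions(text: str) -> List[Tuple[str, str, str]]:
    """Return (regulator, effect, target) triples matched by the rule."""
    return [
        (m.group("regulator"), VERB_RULES[m.group("verb")], m.group("target"))
        for m in PATTERN.finditer(text)
    ]


sample = "LexA represses recA. CRP activates the expression of araC."
triples = extract_interactions(sample)
```

Each extracted triple corresponds to one edge in a regulatory network. Sentences that do not fit the rule are silently skipped, which is one reason the output of such a system benefits from manual review.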

Limitations

The evaluation relies on RegulonDB being complete and accurate, but the database may not be exhaustive.

Digital Object Identifier (DOI)

10.1186/1471-2105-8-293
