Pharmspresso: a text mining tool for extraction of pharmacogenomic concepts and relationships from full text
2009

Pharmspresso: A Tool for Extracting Pharmacogenomic Information

Sample size: 45 publications
Evidence: moderate

Author Information

Author(s): Yael Garten, Russ B. Altman

Primary Institution: Stanford University

Hypothesis

We hypothesized that, with minor modifications, Textpresso would be useful for the task of identifying and extracting pharmacogenomic relationships.

Conclusion

Pharmspresso is a text analysis tool that extracts pharmacogenomic concepts from the literature automatically and thus captures our current understanding of gene-drug interactions in a computable form.

Supporting Evidence

  • Pharmspresso identified 78%, 61%, and 74% of target gene, polymorphism, and drug concepts, respectively.
  • The current corpus contains 1,025 full-text articles from 343 different journals.
  • Pharmspresso can mark up 1,000 articles in less than 5 minutes on a single-core, consumer-grade PC.

Takeaway

Pharmspresso is a computer program that helps scientists find important information about how genes and drugs interact by reading full articles instead of just summaries.

Methodology

Pharmspresso was evaluated on 45 human-curated articles by comparing the gene, polymorphism, and drug concepts it extracted against those identified by human evaluators.
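
The percentages reported under Supporting Evidence read as the fraction of human-identified concepts that the tool also found. A minimal sketch of that kind of calculation follows, assuming a simple case-insensitive set comparison. The function name `recall` and the gene lists are invented for illustration; they are not data or code from the paper.

```python
def recall(curated, extracted):
    """Fraction of curated concepts that also appear in the extracted set."""
    # Assumption: concepts are compared case-insensitively as plain strings.
    curated_norm = {c.lower() for c in curated}
    extracted_norm = {e.lower() for e in extracted}
    if not curated_norm:
        return 0.0
    return len(curated_norm & extracted_norm) / len(curated_norm)


# Toy data, invented for this example only.
curated_genes = {"CYP2D6", "ABCB1", "VKORC1", "TPMT"}
extracted_genes = {"cyp2d6", "ABCB1", "TPMT", "BRCA1"}

gene_recall = recall(curated_genes, extracted_genes)
print(f"gene recall: {gene_recall:.0%}")
```

In this toy case three of the four curated genes are recovered, and the extra extracted gene does not affect the score, since the measure only counts curated concepts that were found.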

Potential Biases

The regular expressions used in Pharmspresso match imprecisely, which can produce false positives or cause some mentions to be missed.
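
Both failure modes can be seen in a small sketch. The two patterns below are hypothetical and were written only for this example; they are not the expressions Pharmspresso uses, and the sample sentence is invented.

```python
import re

# Hypothetical pattern for rs-number polymorphism identifiers (e.g. rs1045642).
RS_ID = re.compile(r"\brs\d+\b")

# Hypothetical, deliberately loose pattern for one-letter amino-acid
# substitutions such as V158M: letter, digits, letter.
LOOSE_SUBSTITUTION = re.compile(r"\b[A-Z]\d+[A-Z]\b")


def find_mentions(pattern, text):
    """Return every substring of text matched by pattern, in order."""
    return [m.group(0) for m in pattern.finditer(text)]


# Invented sample sentence.
text = (
    "Carriers of rs1045642 and Val158Met were compared; "
    "samples were stored in plate A12B."
)

rs_hits = find_mentions(RS_ID, text)
sub_hits = find_mentions(LOOSE_SUBSTITUTION, text)
print(rs_hits)
print(sub_hits)
```

Here the rs-number pattern finds its identifier, but the loose substitution pattern matches the plate label "A12B" (a false positive) and skips "Val158Met" (a missed mention), because that variant is written with three-letter amino-acid codes.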

Limitations

Pharmspresso may miss some polymorphisms that are only described in tables or images, and it relies on a predefined corpus of articles.

Digital Object Identifier (DOI)

10.1186/1471-2105-10-S2-S6
