MBA: a literature mining system for extracting biomedical abbreviations
2009

MBA: A System for Extracting Biomedical Abbreviations

Sample size: 294 publications

Evidence: moderate

Author Information

Author(s): Xu Yun, Wang ZhiHao, Lei YiMing, Zhao YuZhong, Xue Yu

Primary Institution: University of Science and Technology of China

Hypothesis

A systematic method for extracting biomedical abbreviations can improve the identification of both acronym-type and non-acronym-type abbreviations.

Conclusion

The MBA system effectively extracts biomedical abbreviations and outperforms existing methods.

Supporting Evidence

  • MBA achieved a recall of 88% at a precision of 91% on the Medstract gold-standard evaluation corpus.
  • The system identified 162 <short form, long form> pairs, of which 147 were correct (147/162 ≈ 91% precision).
  • Compared with other algorithms, MBA's 91% precision and 88% recall gave the better overall result, according to the authors.
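
The precision figure can be checked directly from the pair counts in the list above. The snippet below is a minimal sketch: the function names are illustrative, and the gold-standard pair count needed for recall is not given in this summary, so `recall` is defined but deliberately not called with a guessed value.

```python
def precision(correct: int, extracted: int) -> float:
    """Fraction of extracted pairs that are correct."""
    if extracted <= 0:
        raise ValueError("extracted must be positive")
    return correct / extracted


def recall(correct: int, gold_total: int) -> float:
    """Fraction of gold-standard pairs that were recovered.

    gold_total is the number of pairs in the gold standard; this summary
    does not state it, so callers must supply it from the original paper.
    """
    if gold_total <= 0:
        raise ValueError("gold_total must be positive")
    return correct / gold_total


# Counts reported above: 162 pairs identified, 147 of them correct.
p = precision(147, 162)
print(f"precision = {p:.1%}")  # prints "precision = 90.7%"
```

Rounded to the nearest percent, 90.7% matches the reported 91%.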

Takeaway

The MBA system helps scientists understand abbreviations in medical papers by finding their full meanings.

Methodology

The study developed a literature mining system that classifies abbreviations and identifies their definitions using a scoring method and alignment algorithms.
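
To make the idea of pairing a short form with its definition concrete, here is a small sketch of parenthesis-based candidate detection followed by a right-to-left character alignment. It is not the MBA scoring method, which this summary does not specify; it is a simplified heuristic in the style of the well-known Schwartz and Hearst approach, and every function name, the length limits, and the search-window rule are assumptions chosen for illustration.

```python
import re


def find_candidates(sentence):
    """Yield (short_form, text_before_parenthesis) for each '... (SHORT)'."""
    for match in re.finditer(r"\(([^()]{2,10})\)", sentence):
        short = match.group(1).strip()
        # Assumed filter: a single token that starts with a letter or digit.
        if not short or " " in short or not short[0].isalnum():
            continue
        yield short, sentence[:match.start()].rstrip()


def align_long_form(short, preceding):
    """Match short-form characters right to left inside the preceding text.

    Returns the candidate long form, or None when alignment fails.
    """
    # Assumed search window: at most min(|short| + 5, 2 * |short|) words.
    max_words = min(len(short) + 5, len(short) * 2)
    text = " ".join(preceding.split()[-max_words:])
    s, l = len(short) - 1, len(text) - 1
    while s >= 0:
        ch = short[s].lower()
        if not ch.isalnum():  # ignore punctuation inside the short form
            s -= 1
            continue
        # Scan left for ch; the first short-form character must begin a word.
        while l >= 0 and (
            text[l].lower() != ch
            or (s == 0 and l > 0 and text[l - 1].isalnum())
        ):
            l -= 1
        if l < 0:
            return None
        s -= 1
        l -= 1
    return text[l + 1:]


def extract_pairs(sentence):
    """Return a list of (short_form, long_form) pairs found in a sentence."""
    pairs = []
    for short, preceding in find_candidates(sentence):
        long_form = align_long_form(short, preceding)
        if long_form:
            pairs.append((short, long_form))
    return pairs


print(extract_pairs("We used magnetic resonance imaging (MRI) scans."))
```

A heuristic like this only finds abbreviations introduced in parentheses and whose letters appear in order in the definition, which is why non-acronym-type abbreviations call for the classification and scoring steps described above.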

Limitations

The system may miss abbreviations that are not set off by parentheses, and because it relies on statistical methods it may fail to recognize rare abbreviations.

Digital Object Identifier (DOI)

10.1186/1471-2105-10-14