MBA: A System for Extracting Biomedical Abbreviations
Author Information
Author(s): Xu Yun, Wang ZhiHao, Lei YiMing, Zhao YuZhong, Xue Yu
Primary Institution: University of Science and Technology of China
Hypothesis
A systematic method for extracting biomedical abbreviations can improve the identification of both acronym-type and non-acronym-type abbreviations.
Conclusion
The MBA system extracts biomedical abbreviations with 91% precision and 88% recall on the Medstract gold-standard evaluation corpus, and is reported to outperform existing methods.
Supporting Evidence
- MBA achieved a recall of 88% at a precision of 91% on the Medstract gold-standard evaluation corpus.
- The system identified 162 <short form, long form> pairs, of which 147 were correct (147/162, about 91% precision).
- In the comparison with other algorithms, MBA's 91% precision and 88% recall were reported as the better result.
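The precision figure can be checked directly from the reported pair counts. The short sketch below only redoes that arithmetic; the variable names are illustrative, and the implied gold-standard size is a back-of-the-envelope estimate derived from the rounded 88% recall, not a number stated in this summary.

```python
# Counts reported in the summary above.
extracted_pairs = 162  # <short form, long form> pairs returned by MBA
correct_pairs = 147    # pairs judged correct

# Precision = correct pairs / all extracted pairs.
precision = correct_pairs / extracted_pairs
print(f"precision = {precision:.3f}")  # 0.907, i.e. about 91%

# Recall = correct pairs / gold-standard pairs, so the reported (rounded)
# recall implies an approximate gold-standard size. This is an estimate only.
reported_recall = 0.88
implied_gold_pairs = correct_pairs / reported_recall
print(f"implied gold-standard pairs ~ {implied_gold_pairs:.0f}")
```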
Takeaway
The MBA system helps scientists understand abbreviations in medical papers by finding their full meanings.
Methodology
The study developed a literature mining system that classifies abbreviations and identifies their definitions using a scoring method and alignment algorithms.
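To make the idea of aligning a short form with a candidate long form and then scoring the match concrete, here is a minimal sketch. It uses a generic greedy right-to-left character match and a simple word-initial ratio as the score; both are illustrative stand-ins, not MBA's actual alignment algorithm or scoring formula, and the function names `align` and `score` are hypothetical.

```python
from typing import List, Optional


def align(short_form: str, long_form: str) -> Optional[List[int]]:
    """Greedy right-to-left alignment (illustrative, not MBA's algorithm).

    Returns the index in long_form matched by each alphanumeric character
    of short_form, or None if the characters cannot be matched in order.
    """
    s = short_form.lower()
    l = long_form.lower()
    positions = []
    j = len(l) - 1
    for i in range(len(s) - 1, -1, -1):
        ch = s[i]
        if not ch.isalnum():
            continue  # ignore punctuation inside the short form
        # Scan left for ch; the first short-form character must begin a word.
        while j >= 0 and (l[j] != ch or (i == 0 and j > 0 and l[j - 1].isalnum())):
            j -= 1
        if j < 0:
            return None
        positions.append(j)
        j -= 1
    positions.reverse()
    return positions


def score(short_form: str, long_form: str) -> float:
    """Fraction of matched characters that start a word (assumed scoring rule)."""
    positions = align(short_form, long_form)
    if not positions:
        return 0.0
    word_initial = sum(
        1 for p in positions if p == 0 or not long_form[p - 1].isalnum()
    )
    return word_initial / len(positions)


# Acronym-type pair: every letter is a word initial.
print(align("HMM", "hidden Markov model"))  # [0, 7, 14]
print(score("HMM", "hidden Markov model"))  # 1.0

# Non-acronym-type pair: letters come from inside a single word.
print(align("Tyr", "tyrosine"))             # [0, 1, 2]

# No valid alignment.
print(align("ATP", "hidden Markov model"))  # None
```

Under this assumed score, acronym-type pairs land near 1.0 and non-acronym-type pairs such as Tyr/tyrosine score lower, which hints at why a system would classify the abbreviation type before deciding how to judge a candidate definition.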
Limitations
The system may miss abbreviations whose definitions are not set off by parentheses, and because it relies on statistical methods it may fail to recognize rare abbreviations.
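The parenthesis limitation is easy to illustrate. The sketch below assumes a candidate finder that looks only for short parenthesised tokens; the pattern and the function name `parenthesised_candidates` are hypothetical and are not taken from MBA itself.

```python
import re

# Hypothetical pattern: a parenthesised token of 2 to 10 characters
# that starts with a letter or digit.
CANDIDATE = re.compile(r"\(([A-Za-z0-9][\w-]{1,9})\)")


def parenthesised_candidates(sentence: str) -> list:
    """Return short-form candidates that appear inside parentheses."""
    return CANDIDATE.findall(sentence)


with_parens = "Samples were analysed by polymerase chain reaction (PCR)."
without_parens = "Polymerase chain reaction, or PCR, was used to analyse samples."

print(parenthesised_candidates(with_parens))     # ['PCR']
print(parenthesised_candidates(without_parens))  # []
```

The second sentence defines the same abbreviation, but a parenthesis-driven finder never proposes it as a candidate, so no later alignment or scoring step gets the chance to recover it.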
Digital Object Identifier (DOI)