Ontology-driven Indexing of Public Datasets for Translational Bioinformatics
Author Information
Author(s): Shah Nigam H, Jonquet Clement, Chiang Annie P, Butte Atul J, Chen Rong, Musen Mark A
Primary Institution: Centre for Biomedical Informatics, School of Medicine, Stanford University
Hypothesis
Can we effectively map text annotations of gene expression datasets to concepts in the UMLS for better data integration?
Conclusion
The study demonstrates that mapping text annotations of microarray datasets to UMLS concepts enables better identification and integration of datasets across different repositories.
Supporting Evidence
- The study processed annotations of 369 GEO (Gene Expression Omnibus) datasets and 1045 TMAD (Stanford Tissue Microarray Database) datasets.
- High precision and recall were achieved in identifying disease-related datasets.
- A prototype system was developed to enable ontology-based querying of biomedical data.
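For readers unfamiliar with the two metrics named above, the sketch below shows how precision and recall are conventionally computed for a retrieval task such as finding disease-related datasets. The function name and the example identifiers are illustrative assumptions, not values or code from the study.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for one query.

    retrieved: dataset IDs the system returned (hypothetical IDs here)
    relevant:  dataset IDs a human judged to be truly relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    # Precision: fraction of returned datasets that are relevant.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall: fraction of relevant datasets that were returned.
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up example: 4 datasets returned, 3 truly relevant, 2 in common.
p, r = precision_recall({"a", "b", "c", "d"}, {"a", "b", "e"})
```

Empty inputs return 0.0 rather than raising, which is one common convention; other evaluations treat those cases as undefined.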
Takeaway
Labeling datasets with standard ontology concepts, rather than relying on free text alone, helps researchers find related biomedical datasets across repositories more easily.
Methodology
The study developed a prototype system that processes text metadata from various biomedical resources and maps it to ontology concepts.
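The general idea of mapping free-text metadata to ontology concepts and then querying by concept can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' system: the term dictionary, the concept IDs (placeholders, not real UMLS identifiers), the dataset IDs and descriptions, and the simple whole-word matching are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical term-to-concept dictionary. The IDs are placeholders,
# NOT real UMLS concept identifiers.
TERM_TO_CONCEPT = {
    "melanoma": "X001",
    "skin cancer": "X002",
    "breast cancer": "X003",
    "breast carcinoma": "X003",  # a synonym maps to the same concept
}


def annotate(text):
    """Return the concept IDs whose terms occur in the text as whole words."""
    lowered = text.lower()
    found = set()
    for term, concept in TERM_TO_CONCEPT.items():
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            found.add(concept)
    return found


def build_index(datasets):
    """Build an inverted index from concept ID to the dataset IDs it annotates."""
    index = defaultdict(set)
    for dataset_id, text in datasets.items():
        for concept in annotate(text):
            index[concept].add(dataset_id)
    return index


# Made-up dataset descriptions standing in for repository metadata.
DATASETS = {
    "geo-1": "Expression profiling of breast carcinoma biopsies",
    "tmad-7": "Tissue microarray of breast cancer and melanoma samples",
    "geo-2": "Time course of yeast heat shock response",
}
INDEX = build_index(DATASETS)
```

Looking up `INDEX["X003"]` returns both `geo-1` and `tmad-7`, even though their descriptions use different wording. That is the benefit of indexing by concept rather than by raw text.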
Limitations
The study primarily focuses on human datasets and may not generalize to other types of data.
Digital Object Identifier (DOI)