Integrative Disease Classification Using Microarray Data
Author Information
Author(s): Liu Chun-Chi, Hu Jianjun, Kalakrishnan Mrinal, Huang Haiyan, Zhou Xianghong Jasmine
Primary Institution: University of Southern California
Hypothesis
Can heterogeneous microarray datasets from public repositories be integrated for effective disease classification?
Conclusion
The study shows that integrating multiple microarray datasets improves disease classification accuracy.
Supporting Evidence
- ManiSVM achieved an overall accuracy of 70.7%, outperforming SVM (58.8%) by 11.9 percentage points.
- Classification accuracy increased with the number of homogeneous training datasets.
- 12% of disease classes achieved accuracy higher than 80%.
Takeaway
This study found a way to combine microarray datasets from different sources to identify diseases more accurately, which could support diagnosis.
Methodology
The study used a new classification approach called ManiSVM, which combines manifold data transformation with SVM learning, and evaluated performance using leave-one-dataset-out cross-validation.
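The leave-one-dataset-out evaluation can be illustrated with a minimal sketch: each dataset in turn is held out for testing while the classifier is trained on all the others, so no test sample shares a dataset with any training sample. Everything below is an assumption made for illustration. The toy feature vectors, labels, and dataset IDs are invented, the function names are hypothetical, and a simple nearest-centroid classifier stands in for ManiSVM, whose manifold transformation and SVM step are not reproduced here.

```python
from collections import defaultdict
from math import dist


def leave_one_dataset_out(dataset_ids):
    """Yield (held_out_id, train_indices, test_indices) for each dataset."""
    for held_out in sorted(set(dataset_ids)):
        train = [i for i, d in enumerate(dataset_ids) if d != held_out]
        test = [i for i, d in enumerate(dataset_ids) if d == held_out]
        yield held_out, train, test


def fit_centroids(X, y):
    """Stand-in classifier (NOT ManiSVM): compute the mean vector per class."""
    sums = {}
    counts = defaultdict(int)
    for x, label in zip(X, y):
        if label not in sums:
            sums[label] = [0.0] * len(x)
        sums[label] = [s + v for s, v in zip(sums[label], x)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def predict(centroids, x):
    """Return the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], x))


def lodo_accuracy(X, y, dataset_ids):
    """Overall accuracy pooled across all held-out datasets."""
    correct = 0
    total = 0
    for _, train, test in leave_one_dataset_out(dataset_ids):
        centroids = fit_centroids([X[i] for i in train],
                                  [y[i] for i in train])
        for i in test:
            correct += predict(centroids, X[i]) == y[i]
            total += 1
    return correct / total


# Invented toy data: three datasets (A, B, C), two samples each.
X = [[0.0, 0.1], [1.0, 0.9], [0.2, 0.0], [0.9, 1.1], [0.1, 0.2], [1.1, 1.0]]
y = ["healthy", "disease", "healthy", "disease", "healthy", "disease"]
dataset_ids = ["A", "A", "B", "B", "C", "C"]

if __name__ == "__main__":
    for held_out, train, test in leave_one_dataset_out(dataset_ids):
        print(f"held out {held_out}: train={train} test={test}")
    print("accuracy:", lodo_accuracy(X, y, dataset_ids))
```

Splitting by dataset rather than by individual sample is the design choice that matters here: a random per-sample split would let samples from the same dataset appear on both sides of the split.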
Potential Biases
Training and testing samples drawn from the same dataset may be correlated, which can bias accuracy estimates.
Limitations
Classification performance may be affected by imprecise mapping to UMLS concepts and by the varying sizes of the datasets.
Statistical Information
P-Value
3.50E-05
Statistical Significance
p<0.05
Digital Object Identifier (DOI)