Data mining methods in the prediction of Dementia: A real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests
2011

Comparing Methods to Predict Dementia

Sample size: 400 | Publication | Evidence: moderate

Author Information

Author(s): João Maroco, Dina Silva, Ana Rodrigues, Manuela Guerreiro, Isabel Santana, Alexandre de Mendonça

Primary Institution: Unidade de Investigação em Psicologia e Saúde & Departamento de Estatística, ISPA - Instituto Universitário

Hypothesis

Newer statistical classification methods derived from data mining and machine learning can improve the accuracy, sensitivity, and specificity of predictions obtained from neuropsychological testing for dementia.

Conclusion

Random Forests and Linear Discriminant Analysis are the most effective methods for predicting dementia from neuropsychological tests.

Supporting Evidence

  • All classifiers performed better than chance alone (p < 0.05).
  • Support Vector Machines showed the largest overall classification accuracy (Median = 0.76).
  • Random Forest ranked second in overall accuracy (Median = 0.73).
  • Linear Discriminant Analysis showed acceptable overall accuracy (Median = 0.66).
  • Sensitivity was low for Support Vector Machines (Median = 0.3).
  • Random Forests and Linear Discriminant Analysis ranked first when sensitivity and specificity were considered together.
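
The three measures quoted above can pull in different directions, so a small worked example may help. The sketch below computes accuracy, sensitivity and specificity from confusion-matrix counts. The function name and the counts are hypothetical, chosen only to illustrate how a classifier can reach high overall accuracy while missing most dementia cases; they are not taken from the study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity and specificity from confusion-matrix counts.

    tp / fn: dementia cases correctly / incorrectly classified
    tn / fp: non-dementia cases correctly / incorrectly classified
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,       # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),       # share of dementia cases detected
        "specificity": tn / (tn + fp),       # share of non-dementia cases detected
    }


# Hypothetical counts (assumption, not study data): 125 dementia cases and
# 275 non-dementia cases, with most dementia cases missed.
example = classification_metrics(tp=38, fn=87, tn=270, fp=5)

for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

With these illustrative counts the accuracy is 0.77 even though sensitivity is only about 0.30, which is why accuracy alone is a weak basis for choosing a screening method when the classes are unequal in size.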

Takeaway

This study compared different ways to predict whether older people with memory problems will develop dementia, and found Random Forests and Linear Discriminant Analysis to be the most effective.

Methodology

The study compared seven data mining classifiers with three traditional classifiers, using neuropsychological test scores as predictors, in a sample of elderly patients with Mild Cognitive Impairment.
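
Because the results are reported as medians, each classifier was presumably evaluated over repeated data splits. The sketch below shows one common way to build such splits, stratified k-fold assignment, in plain Python. The function name, the choice of five folds and the random seed are assumptions made for illustration, since this summary does not describe the resampling scheme. The class counts are the ones given under Participant Demographics.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds while preserving class proportions."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal indices round-robin into the folds.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Class sizes from the summary: 275 Mild Cognitive Impairment, 125 Dementia.
labels = ["MCI"] * 275 + ["Dementia"] * 125
folds = stratified_folds(labels, k=5)

for number, fold in enumerate(folds, start=1):
    dementia = sum(1 for idx in fold if labels[idx] == "Dementia")
    print(f"fold {number}: {len(fold)} patients, {dementia} with dementia")
```

Each fold serves once as the test set while the remaining folds train the classifier, and the per-fold accuracy, sensitivity and specificity can then be summarized with `statistics.median`.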

Potential Biases

The performance of classifiers may depend on the tuning parameters chosen, which could introduce bias.

Limitations

The sample size may limit the performance of some data mining methods, and the results come from a single dataset, so they may not generalize to other samples.

Participant Demographics

The sample consisted of 400 elderly patients, with 275 diagnosed with Mild Cognitive Impairment and 125 with Dementia.

Statistical Information

P-Value

p<0.05

Statistical Significance

p<0.05

Digital Object Identifier (DOI)

10.1186/1756-0500-4-299
