On the Representability of Complete Genomes by Multiple Competing Finite-Context (Markov) Models
2011

Using Markov Models to Describe Complete Genomes

Sample size: 11 · publication · Evidence: moderate

Author Information

Author(s): Pinho Armando J., Ferreira Paulo J. S. G., Neves António J. R., Bastos Carlos A. C.

Primary Institution: Signal Processing Lab, IEETA/DETI, University of Aveiro, Aveiro, Portugal

Hypothesis

How well can complete genomes be described using exclusively a combination of Markov models?

Conclusion

Multiple competing Markov models can explain entire genomes almost as well as advanced DNA compression methods.

Supporting Evidence

  • The study found that Markov models can effectively describe DNA sequences.
  • Results showed that, for small genomes, finite-context models performed better than more complex DNA compression methods.
  • The research provides evidence that local models can compete with advanced compression techniques.

Takeaway

This study shows that simple models can describe complex DNA sequences almost as well as more complicated methods.

Methodology

The study used multiple competing finite-context models of different orders to analyze DNA sequences from eleven species.
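The idea of competing finite-context models can be illustrated with a minimal sketch. It is not the authors' implementation: the block-wise selection rule, the additive smoothing parameter `alpha`, the orders `(1, 2, 4)`, the block size, and the names `FiniteContextModel` and `compete` are all assumptions made for illustration. Each order-k model predicts the next base from the previous k bases, and for every block the model that spends the fewest bits is declared the winner. The cost of signalling which model won is ignored here.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


class FiniteContextModel:
    """Order-k Markov model over ACGT with additive smoothing (assumed estimator)."""

    def __init__(self, order, alpha=1.0):
        self.order = order
        self.alpha = alpha
        # Maps a context string of length `order` to per-symbol counts.
        self.counts = defaultdict(lambda: dict.fromkeys(ALPHABET, 0))

    def cost(self, context, symbol):
        """Bits needed to code `symbol` after `context`, given the counts so far."""
        c = self.counts[context]
        total = sum(c.values())
        p = (c[symbol] + self.alpha) / (total + self.alpha * len(ALPHABET))
        return -math.log2(p)

    def update(self, context, symbol):
        self.counts[context][symbol] += 1


def compete(sequence, orders=(1, 2, 4), block_size=8):
    """Return (bits per base, winning order for each block).

    Every model codes every block causally (cost first, then update),
    and only the cheapest model's bits are charged for that block.
    """
    models = [FiniteContextModel(k) for k in orders]
    max_order = max(orders)
    total_bits = 0.0
    winners = []
    # Skip the first max_order bases so every model has a full context.
    for start in range(max_order, len(sequence), block_size):
        end = min(start + block_size, len(sequence))
        block_bits = []
        for m in models:
            bits = 0.0
            for i in range(start, end):
                context = sequence[i - m.order:i]
                bits += m.cost(context, sequence[i])
                m.update(context, sequence[i])
            block_bits.append(bits)
        best = min(range(len(models)), key=block_bits.__getitem__)
        winners.append(orders[best])
        total_bits += block_bits[best]
    coded = len(sequence) - max_order
    return total_bits / coded, winners


if __name__ == "__main__":
    bits_per_base, winners = compete("ACGT" * 50)
    print(f"{bits_per_base:.3f} bits per base")
    print("winning orders per block:", winners)
```

A value below 2 bits per base means the models found structure that a uniform code over the four bases would miss. Because every model only sees a short local context, a sketch like this also shows why long-range repetitions are out of reach for this kind of model.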

Limitations

The models may not capture long-range correlations and repetitions in DNA sequences.

Participant Demographics

The study analyzed DNA sequences from eleven different species.

Digital Object Identifier (DOI)

10.1371/journal.pone.0021588
