nGASP – the nematode genome annotation assessment project
Author Information
Author(s): Coghlan Avril, Fiedler Tristan J, McKay Sheldon J, Flicek Paul, Harris Todd W, Blasiar Darin, Stein Lincoln D
Primary Institution: Wellcome Trust Sanger Institute
Hypothesis
The nGASP project aims to assess the accuracy of protein-coding gene prediction software in C. elegans and apply this knowledge to other Caenorhabditis species.
Conclusion
The study establishes a baseline of gene prediction accuracy in Caenorhabditis genomes and guides the choice of gene-finders for annotating newly sequenced genomes.
Supporting Evidence
- The most accurate gene-finders were 'combiner' algorithms.
- The median gene-level sensitivity of combiners was 78% and their specificity was 42%.
- Combiners improved sensitivity of predictions above those based on expressed sequence alignments alone.
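As an illustration of how a per-category median like the one above could be derived, the following minimal sketch groups per-submission scores by gene-finder category and takes the median of each. The category labels, the function name `median_by_category`, and all of the numbers are hypothetical toy values chosen for the example; they are not nGASP results.

```python
from collections import defaultdict
from statistics import median

# Hypothetical rows: (category, gene-level sensitivity %, gene-level specificity %).
# These are made-up toy values for illustration, NOT results from the study.
submissions = [
    ("combiner", 80.0, 45.0),
    ("combiner", 70.0, 40.0),
    ("combiner", 75.0, 35.0),
    ("ab_initio", 50.0, 30.0),
    ("ab_initio", 60.0, 20.0),
]


def median_by_category(rows):
    """Return {category: (median sensitivity, median specificity)}."""
    grouped = defaultdict(list)
    for category, sensitivity, specificity in rows:
        grouped[category].append((sensitivity, specificity))
    return {
        category: (
            median(sn for sn, _ in scores),
            median(sp for _, sp in scores),
        )
        for category, scores in grouped.items()
    }


medians = median_by_category(submissions)
print(medians)
```

Reporting a median rather than a mean keeps a single unusually weak or strong submission from dominating the summary for its category.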
Takeaway
Scientists tested how well different computer programs find genes in the DNA of tiny nematode worms. Programs that combine several sources of evidence ('combiners') performed best.
Methodology
Seventeen groups submitted 47 prediction sets for 10 Mb of the C. elegans genome. Each set was evaluated for sensitivity and specificity against reference gene sets.
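The gene-level evaluation described here can be sketched as follows. This is a simplified illustration, not the project's actual scoring pipeline. It assumes the convention common in gene-prediction assessments, where sensitivity is the fraction of reference genes predicted correctly and specificity is the fraction of predicted genes that are correct. It also assumes a prediction counts as correct only when its coding exon coordinates match a reference gene exactly. The function name `gene_level_accuracy`, the separate `sn_reference` and `sp_reference` arguments, and all coordinates are hypothetical.

```python
from typing import Dict, Set, Tuple

# A gene model is represented here as a tuple of (start, end) coding exons.
# This representation is an assumption made for the example.
GeneModel = Tuple[Tuple[int, int], ...]


def gene_level_accuracy(
    predicted: Set[GeneModel],
    sn_reference: Set[GeneModel],
    sp_reference: Set[GeneModel],
) -> Dict[str, float]:
    """Score predictions against (possibly different) reference gene sets.

    sensitivity = correctly predicted reference genes / all reference genes
    specificity = correct predictions / all predictions
    """
    found = len(predicted & sn_reference)
    correct = len(predicted & sp_reference)
    sensitivity = found / len(sn_reference) if sn_reference else 0.0
    specificity = correct / len(predicted) if predicted else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity}


# Toy coordinates, invented for illustration only.
reference = {
    ((100, 200), (300, 400)),
    ((1000, 1200),),
    ((2000, 2100), (2200, 2300)),
    ((5000, 5100),),
}
predicted = {
    ((100, 200), (300, 400)),      # exact match
    ((1000, 1200),),               # exact match
    ((2000, 2100), (2250, 2300)),  # one exon boundary is wrong
    ((7000, 7100),),               # no counterpart in the reference
    ((8000, 8100),),               # no counterpart in the reference
}

# The same set is passed twice here for simplicity.
scores = gene_level_accuracy(predicted, reference, reference)
print(scores)
```

Taking the two reference sets as separate arguments lets sensitivity and specificity be scored against different sets when required.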
Potential Biases
Sensitivity and specificity were assessed against different reference gene sets, which may bias the evaluation.
Limitations
The evaluation relies on curated gene models, and genes missing from the reference sets could cause correct predictions to be scored as errors.
Participant Demographics
Seventeen groups worldwide participated in the project.
Statistical Information
P-Value
0.04
Statistical Significance
p<0.05