Cargando…
Comparative Evaluation of Intron Prediction Methods and Detection of Plant Genome Annotation Using Intron Length Distributions
Intron prediction is an important problem of the constantly updated genome annotation. Using two model plant (rice and Arabidopsis) genomes, we compared two well-known intron prediction tools: the Blast-Like Alignment Tool (BLAT) and Sim4cc. The results showed that each of the tools had its own adva...
Autores principales: | , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Korea Genome Organization
2012
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3475488/ https://www.ncbi.nlm.nih.gov/pubmed/23105930 http://dx.doi.org/10.5808/GI.2012.10.1.58 |
_version_ | 1782246956630802432 |
---|---|
author | Yang, Long Cho, Hwan-Gue |
author_facet | Yang, Long Cho, Hwan-Gue |
author_sort | Yang, Long |
collection | PubMed |
description | Intron prediction is an important problem of the constantly updated genome annotation. Using two model plant (rice and Arabidopsis) genomes, we compared two well-known intron prediction tools: the Blast-Like Alignment Tool (BLAT) and Sim4cc. The results showed that each of the tools had its own advantages and disadvantages. BLAT predicted more than 99% introns of whole genomic introns with a small number of false-positive introns. Sim4cc was successful at finding the correct introns with a false-negative rate of 1.02% to 4.85%, and it needed a longer run time than BLAT. Further, we evaluated the intron information of 10 complete plant genomes. As non-coding sequences, intron lengths are not limited by a triplet codon frame; so, intron lengths have three phases: a multiple of three bases (3n), a multiple of three bases plus one (3n + 1), and a multiple of three bases plus two (3n + 2). It was widely accepted that the percentages of the 3n, 3n + 1, and 3n + 2 introns were quite similar in genomes. Our studies showed that 80% (8/10) of species were similar in terms of the number of three phases. The percentages of 3n introns in Ostreococcus lucimarinus was excessive (47.7%), while in Ostreococcus tauri, it was deficient (29.1%). This discrepancy could have been the result of errors in intron prediction. It is suggested that a three-phase evaluation is a fast and effective method of detecting intron annotation problems. |
format | Online Article Text |
id | pubmed-3475488 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Korea Genome Organization |
record_format | MEDLINE/PubMed |
spelling | pubmed-34754882012-10-26 Comparative Evaluation of Intron Prediction Methods and Detection of Plant Genome Annotation Using Intron Length Distributions Yang, Long Cho, Hwan-Gue Genomics Inf Article Intron prediction is an important problem of the constantly updated genome annotation. Using two model plant (rice and Arabidopsis) genomes, we compared two well-known intron prediction tools: the Blast-Like Alignment Tool (BLAT) and Sim4cc. The results showed that each of the tools had its own advantages and disadvantages. BLAT predicted more than 99% introns of whole genomic introns with a small number of false-positive introns. Sim4cc was successful at finding the correct introns with a false-negative rate of 1.02% to 4.85%, and it needed a longer run time than BLAT. Further, we evaluated the intron information of 10 complete plant genomes. As non-coding sequences, intron lengths are not limited by a triplet codon frame; so, intron lengths have three phases: a multiple of three bases (3n), a multiple of three bases plus one (3n + 1), and a multiple of three bases plus two (3n + 2). It was widely accepted that the percentages of the 3n, 3n + 1, and 3n + 2 introns were quite similar in genomes. Our studies showed that 80% (8/10) of species were similar in terms of the number of three phases. The percentages of 3n introns in Ostreococcus lucimarinus was excessive (47.7%), while in Ostreococcus tauri, it was deficient (29.1%). This discrepancy could have been the result of errors in intron prediction. It is suggested that a three-phase evaluation is a fast and effective method of detecting intron annotation problems. Korea Genome Organization 2012-03 2012-03-31 /pmc/articles/PMC3475488/ /pubmed/23105930 http://dx.doi.org/10.5808/GI.2012.10.1.58 Text en Copyright © 2012 by The Korea Genome Organization http://creativecommons.org/licenses/by-nc/3.0 It is identical to the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/). |
spellingShingle | Article Yang, Long Cho, Hwan-Gue Comparative Evaluation of Intron Prediction Methods and Detection of Plant Genome Annotation Using Intron Length Distributions |
title | Comparative Evaluation of Intron Prediction Methods and Detection of Plant Genome Annotation Using Intron Length Distributions |
title_full | Comparative Evaluation of Intron Prediction Methods and Detection of Plant Genome Annotation Using Intron Length Distributions |
title_fullStr | Comparative Evaluation of Intron Prediction Methods and Detection of Plant Genome Annotation Using Intron Length Distributions |
title_full_unstemmed | Comparative Evaluation of Intron Prediction Methods and Detection of Plant Genome Annotation Using Intron Length Distributions |
title_short | Comparative Evaluation of Intron Prediction Methods and Detection of Plant Genome Annotation Using Intron Length Distributions |
title_sort | comparative evaluation of intron prediction methods and detection of plant genome annotation using intron length distributions |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3475488/ https://www.ncbi.nlm.nih.gov/pubmed/23105930 http://dx.doi.org/10.5808/GI.2012.10.1.58 |
work_keys_str_mv | AT yanglong comparativeevaluationofintronpredictionmethodsanddetectionofplantgenomeannotationusingintronlengthdistributions AT chohwangue comparativeevaluationofintronpredictionmethodsanddetectionofplantgenomeannotationusingintronlengthdistributions |