Empirical comparison of ab initio repeat finding programs

Bibliographic Details
Main authors: Saha, Surya; Bridges, Susan; Magbanua, Zenaida V.; Peterson, Daniel G.
Format: Text
Language: English
Published: Oxford University Press 2008
Subjects: Computational Biology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2367713/
https://www.ncbi.nlm.nih.gov/pubmed/18287116
http://dx.doi.org/10.1093/nar/gkn064
author Saha, Surya
Bridges, Susan
Magbanua, Zenaida V.
Peterson, Daniel G.
collection PubMed
description Identification of dispersed repetitive elements can be difficult, especially when elements share little or no homology with previously described repeats. Consequently, a growing number of computational tools have been designed to identify repetitive elements in an ab initio manner, i.e. without using prior sequence data. Here we present the results of side-by-side evaluations of six of the most widely used ab initio repeat finding programs. Using sequence from rice chromosome 12, tools were compared with regard to time requirements, ability to find known repeats, utility in identifying potential novel repeats, number and types of repeat elements recognized and compactness of family descriptions. The study reveals profound differences in the utility of the tools with some identifying virtually their entire substrate as repetitive, others making reasonable estimates of repetition, and some missing almost all repeats. Of note, even when tools recognized similar numbers of repeats they often showed marked differences in the nature and number of repeat families identified. Within the context of this comparative study, ReAS and RepeatScout showed the most promise in analysis of sequence reads and assembled genomic regions, respectively. Our results should help biologists identify the program(s), if any, that is best suited for their needs.
format Text
id pubmed-2367713
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2367713 2008-05-07 Empirical comparison of ab initio repeat finding programs Saha, Surya Bridges, Susan Magbanua, Zenaida V. Peterson, Daniel G. Nucleic Acids Res Computational Biology
Oxford University Press 2008-04 2008-02-20 /pmc/articles/PMC2367713/ /pubmed/18287116 http://dx.doi.org/10.1093/nar/gkn064 Text en © 2008 The Author(s) http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Empirical comparison of ab initio repeat finding programs
topic Computational Biology