phRAIDER: Pattern-Hunter based Rapid Ab Initio Detection of Elementary Repeats

Motivation: Transposable elements (TEs) and repetitive DNA make up a sizable fraction of eukaryotic genomes, and their annotation is crucial to the study of the structure, organization, and evolution of any newly sequenced genome. Although RepeatMasker and nHMMER are useful for identifying these rep...

Bibliographic Details
Main authors: Schaeffer, Carly E., Figueroa, Nathaniel D., Liu, Xiaolin, Karro, John E.
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4908342/
https://www.ncbi.nlm.nih.gov/pubmed/27307619
http://dx.doi.org/10.1093/bioinformatics/btw258
author Schaeffer, Carly E.
Figueroa, Nathaniel D.
Liu, Xiaolin
Karro, John E.
collection PubMed
description Motivation: Transposable elements (TEs) and repetitive DNA make up a sizable fraction of eukaryotic genomes, and their annotation is crucial to the study of the structure, organization, and evolution of any newly sequenced genome. Although RepeatMasker and nHMMER are useful for identifying these repeats, they require a pre-compiled repeat library, which is not always available. De novo identification tools such as Recon, RepeatScout or RepeatGluer identify TEs purely from sequence content, but are either limited by runtimes that prohibit whole-genome use or degrade in quality in the presence of substitutions that disrupt the sequence patterns. Results: phRAIDER is a de novo TE identification tool that addresses the issue of excessive runtime without sacrificing sensitivity compared with competing tools. The underlying model is a new definition of elementary repeats that incorporates the PatternHunter spaced seed model, allowing for greater sensitivity in the presence of genomic substitutions. Compared with the premier tool in the literature, RepeatScout, phRAIDER shows an average 10× speedup on any single human chromosome and can process the whole human genome in just over three hours. Here we discuss the tool, the theoretical model underlying it, and the results demonstrating its effectiveness. Availability and implementation: phRAIDER is an open source tool available from https://github.com/karroje/phRAIDER. Contact: karroje@miamiOH.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-4908342
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4908342 2016-06-17 phRAIDER: Pattern-Hunter based Rapid Ab Initio Detection of Elementary Repeats Schaeffer, Carly E. Figueroa, Nathaniel D. Liu, Xiaolin Karro, John E. Bioinformatics Ismb 2016 Proceedings July 8 to July 12, 2016, Orlando, Florida
Oxford University Press 2016-06-15 2016-06-11 /pmc/articles/PMC4908342/ /pubmed/27307619 http://dx.doi.org/10.1093/bioinformatics/btw258 Text en © The Author 2016. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title phRAIDER: Pattern-Hunter based Rapid Ab Initio Detection of Elementary Repeats
topic Ismb 2016 Proceedings July 8 to July 12, 2016, Orlando, Florida
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4908342/
https://www.ncbi.nlm.nih.gov/pubmed/27307619
http://dx.doi.org/10.1093/bioinformatics/btw258