RepeatFiller newly identifies megabases of aligning repetitive sequences and improves annotations of conserved non-exonic elements

BACKGROUND: Transposons and other repetitive sequences make up a large part of complex genomes. Repetitive sequences can be co-opted into a variety of functions and thus provide a source for evolutionary novelty. However, comprehensively detecting ancestral repeats that align between species is diff...

Bibliographic Details
Main authors: Osipova, Ekaterina; Hecker, Nikolai; Hiller, Michael
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6862929/
https://www.ncbi.nlm.nih.gov/pubmed/31742600
http://dx.doi.org/10.1093/gigascience/giz132
_version_ 1783471667072204800
author Osipova, Ekaterina
Hecker, Nikolai
Hiller, Michael
author_sort Osipova, Ekaterina
collection PubMed
description BACKGROUND: Transposons and other repetitive sequences make up a large part of complex genomes. Repetitive sequences can be co-opted into a variety of functions and thus provide a source for evolutionary novelty. However, comprehensively detecting ancestral repeats that align between species is difficult because considering all repeat-overlapping seeds in alignment methods that rely on the seed-and-extend heuristic results in prohibitively high runtimes. RESULTS: Here, we show that ignoring repeat-overlapping alignment seeds when aligning entire genomes misses numerous alignments between repetitive elements. We present a tool, RepeatFiller, that improves genome alignments by incorporating previously undetected local alignments between repetitive sequences. By applying RepeatFiller to genome alignments between human and 20 other representative mammals, we uncover between 22 and 84 Mb of previously undetected alignments that mostly overlap transposable elements. We further show that the increased alignment coverage improves the annotation of conserved non-exonic elements, both by discovering numerous novel transposon-derived elements that evolve under constraint and by removing thousands of elements that are not under constraint in placental mammals. CONCLUSIONS: RepeatFiller contributes to comprehensively aligning repetitive genomic regions, which facilitates studying transposon co-option and genome evolution. Source code: https://github.com/hillerlab/GenomeAlignmentTools
format Online
Article
Text
id pubmed-6862929
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6862929 2019-11-25 RepeatFiller newly identifies megabases of aligning repetitive sequences and improves annotations of conserved non-exonic elements. Osipova, Ekaterina; Hecker, Nikolai; Hiller, Michael. Gigascience. Technical Note. Oxford University Press 2019-11-19. /pmc/articles/PMC6862929/ /pubmed/31742600 http://dx.doi.org/10.1093/gigascience/giz132 Text en © The Author(s) 2019. Published by Oxford University Press.
http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title RepeatFiller newly identifies megabases of aligning repetitive sequences and improves annotations of conserved non-exonic elements
topic Technical Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6862929/
https://www.ncbi.nlm.nih.gov/pubmed/31742600
http://dx.doi.org/10.1093/gigascience/giz132