Fine-grained annotation and classification of de novo predicted LTR retrotransposons

Long terminal repeat (LTR) retrotransposons and endogenous retroviruses (ERVs) are transposable elements in eukaryotic genomes well suited for computational identification. De novo identification tools determine the position of potential LTR retrotransposon or ERV insertions in genomic sequences. Fo...

Bibliographic Details
Main authors: Steinbiss, Sascha; Willhoeft, Ute; Gremme, Gordon; Kurtz, Stefan
Format: Text
Language: English
Published: Oxford University Press 2009
Subjects: Computational Biology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2790888/
https://www.ncbi.nlm.nih.gov/pubmed/19786494
http://dx.doi.org/10.1093/nar/gkp759
_version_ 1782175145653174272
author Steinbiss, Sascha
Willhoeft, Ute
Gremme, Gordon
Kurtz, Stefan
author_facet Steinbiss, Sascha
Willhoeft, Ute
Gremme, Gordon
Kurtz, Stefan
author_sort Steinbiss, Sascha
collection PubMed
description Long terminal repeat (LTR) retrotransposons and endogenous retroviruses (ERVs) are transposable elements in eukaryotic genomes well suited for computational identification. De novo identification tools determine the position of potential LTR retrotransposon or ERV insertions in genomic sequences. For further analysis, it is desirable to obtain an annotation of the internal structure of such candidates. This article presents LTRdigest, a novel software tool for automated annotation of internal features of putative LTR retrotransposons. It uses local alignment and hidden Markov model-based algorithms to detect retrotransposon-associated protein domains as well as primer binding sites and polypurine tracts. As an example, we used LTRdigest results to identify 88 (near) full-length ERVs in the chromosome 4 sequence of Mus musculus, separating them from truncated insertions and other repeats. Furthermore, we propose a work flow for the use of LTRdigest in de novo LTR retrotransposon classification and perform an exemplary de novo analysis on the Drosophila melanogaster genome as a proof of concept. Using a new method solely based on the annotations generated by LTRdigest, 518 potential LTR retrotransposons were automatically assigned to 62 candidate groups. Representative sequences from 41 of these 62 groups were matched to reference sequences with >80% global sequence similarity.
format Text
id pubmed-2790888
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2790888 2009-12-09 Fine-grained annotation and classification of de novo predicted LTR retrotransposons Steinbiss, Sascha Willhoeft, Ute Gremme, Gordon Kurtz, Stefan Nucleic Acids Res Computational Biology Oxford University Press 2009-11 2009-09-28 /pmc/articles/PMC2790888/ /pubmed/19786494 http://dx.doi.org/10.1093/nar/gkp759 Text en © The Author(s) 2009. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/2.5/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Computational Biology
Steinbiss, Sascha
Willhoeft, Ute
Gremme, Gordon
Kurtz, Stefan
Fine-grained annotation and classification of de novo predicted LTR retrotransposons
title Fine-grained annotation and classification of de novo predicted LTR retrotransposons
title_full Fine-grained annotation and classification of de novo predicted LTR retrotransposons
title_fullStr Fine-grained annotation and classification of de novo predicted LTR retrotransposons
title_full_unstemmed Fine-grained annotation and classification of de novo predicted LTR retrotransposons
title_short Fine-grained annotation and classification of de novo predicted LTR retrotransposons
title_sort fine-grained annotation and classification of de novo predicted ltr retrotransposons
topic Computational Biology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2790888/
https://www.ncbi.nlm.nih.gov/pubmed/19786494
http://dx.doi.org/10.1093/nar/gkp759
work_keys_str_mv AT steinbisssascha finegrainedannotationandclassificationofdenovopredictedltrretrotransposons
AT willhoeftute finegrainedannotationandclassificationofdenovopredictedltrretrotransposons
AT gremmegordon finegrainedannotationandclassificationofdenovopredictedltrretrotransposons
AT kurtzstefan finegrainedannotationandclassificationofdenovopredictedltrretrotransposons