
De novo identification of LTR retrotransposons in eukaryotic genomes

BACKGROUND: LTR retrotransposons are a class of mobile genetic elements containing two similar long terminal repeats (LTRs). Currently, LTR retrotransposons are annotated in eukaryotic genomes mainly through the conventional homology searching approach. Hence, it is limited to annotating known eleme...


Bibliographic Details
Main authors: Rho, Mina; Choi, Jeong-Hyeon; Kim, Sun; Lynch, Michael; Tang, Haixu
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1858694/
https://www.ncbi.nlm.nih.gov/pubmed/17407597
http://dx.doi.org/10.1186/1471-2164-8-90
author Rho, Mina
Choi, Jeong-Hyeon
Kim, Sun
Lynch, Michael
Tang, Haixu
collection PubMed
description BACKGROUND: LTR retrotransposons are a class of mobile genetic elements containing two similar long terminal repeats (LTRs). Currently, LTR retrotransposons are annotated in eukaryotic genomes mainly through conventional homology searching, an approach that is limited to annotating known elements. RESULTS: In this paper, we report a de novo computational method that can identify new LTR retrotransposons without relying on a library of known elements. Specifically, our method identifies intact LTR retrotransposons by using an approximate string matching technique and protein domain analysis. In addition, it identifies partially deleted or solo LTRs using profile Hidden Markov Models (pHMMs). As a result, this method can identify all types of LTR retrotransposons de novo. We tested this method on two pairs of eukaryotic genomes, C. elegans vs. C. briggsae and D. melanogaster vs. D. pseudoobscura. LTR retrotransposons in C. elegans and D. melanogaster have been intensively studied using conventional annotation methods. Compared with previous work, we identified new intact LTR retroelements and new putative families, which suggests that new retroelements may remain to be discovered even in well-studied organisms. To assess the sensitivity and accuracy of our method, we compared our results with those of a previously published method, LTR_STRUC, which predominantly identifies full-length LTR retrotransposons. In summary, both methods identified a comparable number of intact LTR retroelements. However, our method identified nearly all known elements in C. elegans, while LTR_STRUC missed about 1/3 of them. Our method also identified more known LTR retroelements than LTR_STRUC in the D. melanogaster genome. We also identified some LTR retroelements in the other two genomes, C. briggsae and D. pseudoobscura, which have not been completely finished. In contrast, the conventional method failed to identify those elements. Finally, the phylogenetic and chromosomal distributions of the identified elements are discussed. CONCLUSION: We report a novel method for de novo identification of LTR retrotransposons in eukaryotic genomes with favorable performance over the existing methods.
format Text
id pubmed-1858694
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1858694 2007-04-28 De novo identification of LTR retrotransposons in eukaryotic genomes Rho, Mina Choi, Jeong-Hyeon Kim, Sun Lynch, Michael Tang, Haixu BMC Genomics Methodology Article BioMed Central 2007-04-03 /pmc/articles/PMC1858694/ /pubmed/17407597 http://dx.doi.org/10.1186/1471-2164-8-90 Text en Copyright © 2007 Rho et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title De novo identification of LTR retrotransposons in eukaryotic genomes
topic Methodology Article