
Detection of transposable elements by their compositional bias

BACKGROUND: Transposable elements (TE) are mobile genetic entities present in nearly all genomes. Previous work has shown that TEs tend to have a different nucleotide composition than the host genes, either considering codon usage bias or dinucleotide frequencies. We show here how these compositiona...


Bibliographic Details
Main authors: Andrieu, Olivier, Fiston, Anna-Sophie, Anxolabéhère, Dominique, Quesneville, Hadi
Format: Text
Language: English
Published: BioMed Central 2004
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC497039/
https://www.ncbi.nlm.nih.gov/pubmed/15251040
http://dx.doi.org/10.1186/1471-2105-5-94
_version_ 1782121671495254016
author Andrieu, Olivier
Fiston, Anna-Sophie
Anxolabéhère, Dominique
Quesneville, Hadi
author_facet Andrieu, Olivier
Fiston, Anna-Sophie
Anxolabéhère, Dominique
Quesneville, Hadi
author_sort Andrieu, Olivier
collection PubMed
description BACKGROUND: Transposable elements (TEs) are mobile genetic entities present in nearly all genomes. Previous work has shown that TEs tend to differ from host genes in nucleotide composition, whether measured by codon usage bias or by dinucleotide frequencies. We show here how these compositional differences can be used as a tool for detection and analysis of TE sequences. RESULTS: We compared the composition of TE sequences and host gene sequences using probabilistic models of nucleotide sequences. We used hidden Markov models (HMMs), which take into account the base composition of the sequences (occurrences of words n nucleotides long, with n ranging here from 1 to 4) and the heterogeneity between coding and non-coding parts of sequences. We analyzed three sets of sequences, containing class I TEs, class II TEs and genes respectively, in three species: Drosophila melanogaster, Caenorhabditis elegans and Arabidopsis thaliana. Each of these sets had a distinct, homogeneous composition, enabling us to distinguish between the two classes of TE and the genes. However, the particular base composition of the TEs differed among the three species studied. CONCLUSIONS: This approach can be used to detect and annotate TEs in genomic sequences and complements the current homology-based TE detection methods. Furthermore, the HMM method is able to identify the parts of a sequence in which the nucleotide composition resembles that of a coding region of a TE. This is useful for the detailed annotation of TE sequences, which may contain an ancient, highly diverged coding region that is no longer fully functional.
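The compositional comparison described in the abstract can be illustrated with a much simpler model than the one the authors used. The sketch below is not the paper's HMM: it assumes a plain fixed-order Markov chain per sequence class (no hidden states, so no coding/non-coding heterogeneity) and scores a query by log-likelihood under each class. The function names (`train_markov`, `log_likelihood`, `classify`), the pseudocount, and the toy training sequences are all hypothetical and chosen only for illustration. With `order=2` each model captures the frequencies of words 3 nucleotides long, which falls within the n = 1 to 4 range mentioned above.

```python
import math
from collections import defaultdict
from itertools import product

ALPHABET = "ACGT"

def train_markov(seqs, order=2, pseudocount=1.0):
    """Estimate log P(base | previous `order` bases) with additive smoothing."""
    counts = defaultdict(lambda: defaultdict(float))
    for seq in seqs:
        seq = seq.upper()
        for i in range(order, len(seq)):
            ctx, base = seq[i - order:i], seq[i]
            # Skip windows containing ambiguous symbols such as N.
            if set(ctx + base) <= set(ALPHABET):
                counts[ctx][base] += 1
    model = {}
    for ctx in map("".join, product(ALPHABET, repeat=order)):
        total = sum(counts[ctx].values()) + pseudocount * len(ALPHABET)
        model[ctx] = {
            b: math.log((counts[ctx][b] + pseudocount) / total)
            for b in ALPHABET
        }
    return model

def log_likelihood(seq, model, order=2):
    """Sum of log transition probabilities along the sequence."""
    seq = seq.upper()
    ll = 0.0
    for i in range(order, len(seq)):
        ctx, base = seq[i - order:i], seq[i]
        if ctx in model and base in model[ctx]:
            ll += model[ctx][base]
    return ll

def classify(seq, models, order=2):
    """Return the best-scoring class name and all per-class scores."""
    scores = {name: log_likelihood(seq, m, order) for name, m in models.items()}
    return max(scores, key=scores.get), scores

# Toy training sets (hypothetical, not real TE or gene sequences).
te_like = ["ATATTTAAATATTATAAATTTATA", "TTTAAATATATATTTAAAT"]
gene_like = ["GCGCCGGCGGCCGCGCCGGC", "CCGGCGCGGCGCCGCGG"]

models = {
    "te": train_markov(te_like, order=2),
    "gene": train_markov(gene_like, order=2),
}

if __name__ == "__main__":
    for query in ("ATATTTAAATAT", "GCGCCGGCGG"):
        label, scores = classify(query, models, order=2)
        print(query, label, scores)
```

A real analysis would train on curated sets for each class (for example class I TEs, class II TEs and genes, as in the abstract) and for each species separately, since the abstract notes that TE composition differed among the three species studied.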
format Text
id pubmed-497039
institution National Center for Biotechnology Information
language English
publishDate 2004
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-497039 2004-07-31 Detection of transposable elements by their compositional bias Andrieu, Olivier Fiston, Anna-Sophie Anxolabéhère, Dominique Quesneville, Hadi BMC Bioinformatics Methodology Article
BioMed Central 2004-07-13 /pmc/articles/PMC497039/ /pubmed/15251040 http://dx.doi.org/10.1186/1471-2105-5-94 Text en Copyright © 2004 Andrieu et al; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.
spellingShingle Methodology Article
Andrieu, Olivier
Fiston, Anna-Sophie
Anxolabéhère, Dominique
Quesneville, Hadi
Detection of transposable elements by their compositional bias
title Detection of transposable elements by their compositional bias
title_full Detection of transposable elements by their compositional bias
title_fullStr Detection of transposable elements by their compositional bias
title_full_unstemmed Detection of transposable elements by their compositional bias
title_short Detection of transposable elements by their compositional bias
title_sort detection of transposable elements by their compositional bias
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC497039/
https://www.ncbi.nlm.nih.gov/pubmed/15251040
http://dx.doi.org/10.1186/1471-2105-5-94
work_keys_str_mv AT andrieuolivier detectionoftransposableelementsbytheircompositionalbias
AT fistonannasophie detectionoftransposableelementsbytheircompositionalbias
AT anxolabeheredominique detectionoftransposableelementsbytheircompositionalbias
AT quesnevillehadi detectionoftransposableelementsbytheircompositionalbias