
Automated recognition of retroviral sequences in genomic data—RetroTector(©)

Eukaryotic genomes contain many endogenous retroviral sequences (ERVs). ERVs are often severely mutated and therefore difficult to detect. A platform-independent (Java) program package, RetroTector(©) (ReTe), was constructed. It has three basic modules: (i) detection of candidate long terminal repeats...


Bibliographic Details
Main authors: Sperber, Göran O.; Airola, Tove; Jern, Patric; Blomberg, Jonas
Format: Text
Language: English
Published: Oxford University Press 2007
Subjects: Computational Biology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1976444/
https://www.ncbi.nlm.nih.gov/pubmed/17636050
http://dx.doi.org/10.1093/nar/gkm515
author Sperber, Göran O.
Airola, Tove
Jern, Patric
Blomberg, Jonas
collection PubMed
description Eukaryotic genomes contain many endogenous retroviral sequences (ERVs). ERVs are often severely mutated and therefore difficult to detect. A platform-independent (Java) program package, RetroTector(©) (ReTe), was constructed. It has three basic modules: (i) detection of candidate long terminal repeats (LTRs), (ii) detection of chains of conserved retroviral motifs fulfilling distance constraints and (iii) attempted reconstruction of original retroviral protein sequences, combining alignment, codon statistics and properties of protein ends. Other features are prediction of additional open reading frames, automated database collection, graphical presentation and automatic classification. ReTe favors elements >1000 bp long because it depends on the order of, and distances between, retroviral fragments. It detects single or low-copy-number elements. ReTe assigned a ‘retroviral’ score of 890–2827 to 10 exogenous retroviruses from seven genera, and accurately predicted their genes. In a simulated model, ReTe was robust against mutational decay. The human genome was analyzed in 1–2 days on a Linux cluster. Retroviral sequences were detected in divergent vertebrate genomes. Most ReTe-detected chains coincided with RepeatMasker output and the HERVd database. ReTe did not report most of the evolutionarily old HERV-L-related and MaLR sequences, and is not yet tailored for single-LTR detection. Nevertheless, ReTe rationally detects and annotates many retroviral sequences.
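Module (ii) in the description above, chaining conserved motif hits under distance constraints, can be illustrated with a small dynamic-programming sketch. This is not RetroTector's actual algorithm or code (the package itself is written in Java); the `Hit` class, the `best_chain` function, the motif labels, the scores and the distance limits are all hypothetical, chosen only to show why a method of this kind depends on the order of, and distances between, retroviral fragments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    motif: str    # hypothetical motif label, e.g. "GAG"
    pos: int      # position of the hit in the sequence (bp)
    score: float  # hypothetical per-hit match score


def best_chain(hits, limits):
    """Return (chain, total_score) for the highest-scoring chain of hits.

    limits maps an allowed consecutive motif pair (upstream, downstream)
    to an inclusive (min_dist, max_dist) range in bp. Pairs that are not
    listed may not follow each other in a chain.
    """
    if not hits:
        return [], 0.0
    ordered = sorted(hits, key=lambda h: h.pos)
    best = [h.score for h in ordered]   # best chain score ending at each hit
    prev = [None] * len(ordered)        # back-pointers for chain recovery
    for j, down in enumerate(ordered):
        for i in range(j):
            up = ordered[i]
            bounds = limits.get((up.motif, down.motif))
            if bounds is None:
                continue
            min_dist, max_dist = bounds
            dist = down.pos - up.pos
            if min_dist <= dist <= max_dist and best[i] + down.score > best[j]:
                best[j] = best[i] + down.score
                prev[j] = i
    end = max(range(len(ordered)), key=best.__getitem__)
    chain = []
    k = end
    while k is not None:
        chain.append(ordered[k])
        k = prev[k]
    chain.reverse()
    return chain, best[end]


# Illustrative data only: labels, positions, scores and limits are made up.
example_limits = {
    ("GAG", "PRO"): (500, 2000),
    ("PRO", "POL"): (300, 1500),
    ("GAG", "POL"): (800, 3500),
}
example_hits = [
    Hit("GAG", 100, 10), Hit("PRO", 150, 5), Hit("PRO", 1200, 8),
    Hit("POL", 2000, 12), Hit("POL", 9000, 20),
]
chain, total = best_chain(example_hits, example_limits)
print([(h.motif, h.pos) for h in chain], total)
```

In this toy input the isolated high-scoring hit at position 9000 loses to the three-hit chain, because the chain's combined score is higher once the order and distance limits are satisfied. Short fragments that carry only one motif cannot form a chain at all, which is consistent with the stated preference for elements longer than 1000 bp.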
format Text
id pubmed-1976444
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-1976444 2007-09-26 Automated recognition of retroviral sequences in genomic data—RetroTector(©) Sperber, Göran O. Airola, Tove Jern, Patric Blomberg, Jonas Nucleic Acids Res Computational Biology
Oxford University Press 2007-08 2007-07-17 /pmc/articles/PMC1976444/ /pubmed/17636050 http://dx.doi.org/10.1093/nar/gkm515 Text en © 2007 The Author(s) http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Automated recognition of retroviral sequences in genomic data—RetroTector(©)
topic Computational Biology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1976444/
https://www.ncbi.nlm.nih.gov/pubmed/17636050
http://dx.doi.org/10.1093/nar/gkm515