Automated recognition of retroviral sequences in genomic data—RetroTector(©)
Eukaryotic genomes contain many endogenous retroviral sequences (ERVs). ERVs are often severely mutated and therefore difficult to detect. A platform-independent (Java) program package, RetroTector(©) (ReTe), was constructed. It has three basic modules: (i) detection of candidate long terminal repeats...
Main authors: | Sperber, Göran O.; Airola, Tove; Jern, Patric; Blomberg, Jonas |
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press 2007 |
Subjects: | Computational Biology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1976444/ https://www.ncbi.nlm.nih.gov/pubmed/17636050 http://dx.doi.org/10.1093/nar/gkm515 |
_version_ | 1782135085660635136 |
---|---|
author | Sperber, Göran O.; Airola, Tove; Jern, Patric; Blomberg, Jonas |
author_facet | Sperber, Göran O.; Airola, Tove; Jern, Patric; Blomberg, Jonas |
author_sort | Sperber, Göran O. |
collection | PubMed |
description | Eukaryotic genomes contain many endogenous retroviral sequences (ERVs). ERVs are often severely mutated and therefore difficult to detect. A platform-independent (Java) program package, RetroTector(©) (ReTe), was constructed. It has three basic modules: (i) detection of candidate long terminal repeats (LTRs), (ii) detection of chains of conserved retroviral motifs fulfilling distance constraints and (iii) attempted reconstruction of original retroviral protein sequences, combining alignment, codon statistics and properties of protein ends. Other features are prediction of additional open reading frames, automated database collection, graphical presentation and automatic classification. ReTe favors elements >1000 bp long because it depends on the order of, and distances between, retroviral fragments. It detects single or low-copy-number elements. ReTe assigned a ‘retroviral’ score of 890–2827 to 10 exogenous retroviruses from seven genera, and accurately predicted their genes. In a simulated model, ReTe was robust against mutational decay. The human genome was analyzed in 1–2 days on a Linux cluster. Retroviral sequences were detected in divergent vertebrate genomes. Most ReTe-detected chains were coincident with RepeatMasker output and the HERVd database. ReTe did not report most of the evolutionarily old HERV-L-related and MalR sequences, and is not yet tailored for single LTR detection. Nevertheless, ReTe rationally detects and annotates many retroviral sequences. |
format | Text |
id | pubmed-1976444 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-1976444 2007-09-26 Automated recognition of retroviral sequences in genomic data—RetroTector(©) Sperber, Göran O.; Airola, Tove; Jern, Patric; Blomberg, Jonas Nucleic Acids Res Computational Biology Oxford University Press 2007-08 2007-07-17 /pmc/articles/PMC1976444/ /pubmed/17636050 http://dx.doi.org/10.1093/nar/gkm515 Text en © 2007 The Author(s) http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Computational Biology Sperber, Göran O.; Airola, Tove; Jern, Patric; Blomberg, Jonas Automated recognition of retroviral sequences in genomic data—RetroTector(©) |
title | Automated recognition of retroviral sequences in genomic data—RetroTector(©) |
title_full | Automated recognition of retroviral sequences in genomic data—RetroTector(©) |
title_fullStr | Automated recognition of retroviral sequences in genomic data—RetroTector(©) |
title_full_unstemmed | Automated recognition of retroviral sequences in genomic data—RetroTector(©) |
title_short | Automated recognition of retroviral sequences in genomic data—RetroTector(©) |
title_sort | automated recognition of retroviral sequences in genomic data—retrotector(©) |
topic | Computational Biology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1976444/ https://www.ncbi.nlm.nih.gov/pubmed/17636050 http://dx.doi.org/10.1093/nar/gkm515 |
work_keys_str_mv | AT sperbergorano automatedrecognitionofretroviralsequencesingenomicdataretrotector AT airolatove automatedrecognitionofretroviralsequencesingenomicdataretrotector AT jernpatric automatedrecognitionofretroviralsequencesingenomicdataretrotector AT blombergjonas automatedrecognitionofretroviralsequencesingenomicdataretrotector |
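
The description in this record mentions "detection of chains of conserved retroviral motifs fulfilling distance constraints". The record does not say how RetroTector implements this step, so the following is only a minimal sketch of the general idea, not the program's actual algorithm. It assumes a simple dynamic-programming chain search. The motif labels (`GAG1`, `PRO1`, `POL1`, `ENV1`), the gap bounds in `ADJACENT_GAPS`, the hit scores, and the helper names `Hit`, `gap_bounds` and `best_chain` are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    motif: str      # hypothetical motif label
    position: int   # start coordinate of the hit in the sequence (bp)
    score: float    # positive match score for this hit


# Hypothetical 5'-to-3' motif order, and assumed (min_bp, max_bp) gaps
# between each pair of neighbouring motifs. None of these values come
# from RetroTector itself.
MOTIF_ORDER = ["GAG1", "PRO1", "POL1", "ENV1"]
ADJACENT_GAPS = [(500, 2500), (300, 2000), (1000, 4000)]


def gap_bounds(first, second):
    """Allowed distance range from `first` to `second`, or None if out of order.

    Bounds for non-neighbouring motifs are the sums of the neighbouring
    bounds, so a chain may skip a motif that was lost to mutation.
    """
    i, j = MOTIF_ORDER.index(first), MOTIF_ORDER.index(second)
    if i >= j:
        return None
    lo = sum(ADJACENT_GAPS[k][0] for k in range(i, j))
    hi = sum(ADJACENT_GAPS[k][1] for k in range(i, j))
    return lo, hi


def best_chain(hits):
    """Return (total_score, chain) for the highest-scoring valid chain."""
    ordered = sorted(hits, key=lambda h: h.position)
    if not ordered:
        return 0.0, []
    best = [h.score for h in ordered]   # best chain score ending at each hit
    prev = [None] * len(ordered)        # back-pointers for chain recovery
    for i, cur in enumerate(ordered):
        for j in range(i):
            bounds = gap_bounds(ordered[j].motif, cur.motif)
            if bounds is None:
                continue
            lo, hi = bounds
            gap = cur.position - ordered[j].position
            if lo <= gap <= hi and best[j] + cur.score > best[i]:
                best[i] = best[j] + cur.score
                prev[i] = j
    end = max(range(len(ordered)), key=best.__getitem__)
    chain = []
    k = end
    while k is not None:
        chain.append(ordered[k])
        k = prev[k]
    chain.reverse()
    return best[end], chain


# Made-up hits: one POL1 hit at 9000 bp lies too far from the others.
hits = [
    Hit("GAG1", 1000, 5.0),
    Hit("POL1", 9000, 9.0),
    Hit("PRO1", 2200, 4.0),
    Hit("POL1", 3500, 6.0),
    Hit("ENV1", 6000, 3.0),
]
total, chain = best_chain(hits)
print(total, [(h.motif, h.position) for h in chain])
```

In this toy input the isolated `POL1` hit at 9000 bp has the highest individual score but cannot be linked to any other hit within the assumed bounds, so the ordered four-motif chain wins. This mirrors, in a very simplified way, why a method based on order and distance would favor longer elements over isolated fragments.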