
RIO: Analyzing proteomes by automated phylogenomics using resampled inference of orthologs

BACKGROUND: When analyzing protein sequences using sequence similarity searches, orthologous sequences (that diverged by speciation) are more reliable predictors of a new protein's function than paralogous sequences (that diverged by gene duplication). The utility of phylogenetic information in...


Bibliographic Details
Main Authors:	Zmasek, Christian M; Eddy, Sean R
Format:	Text
Language:	English
Published:	BioMed Central 2002
Subjects:	Methodology article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC116988/ https://www.ncbi.nlm.nih.gov/pubmed/12028595 http://dx.doi.org/10.1186/1471-2105-3-14

author	Zmasek, Christian M; Eddy, Sean R
collection	PubMed
description	BACKGROUND: When analyzing protein sequences using sequence similarity searches, orthologous sequences (that diverged by speciation) are more reliable predictors of a new protein's function than paralogous sequences (that diverged by gene duplication). The utility of phylogenetic information in high-throughput genome annotation ("phylogenomics") is widely recognized, but existing approaches are either manual or not explicitly based on phylogenetic trees. RESULTS: Here we present RIO (Resampled Inference of Orthologs), a procedure for automated phylogenomics using explicit phylogenetic inference. RIO analyses are performed over bootstrap resampled phylogenetic trees to estimate the reliability of orthology assignments. We also introduce supplementary concepts that are helpful for functional inference. RIO has been implemented as a Perl pipeline connecting several C and Java programs. It is available at http://www.genetics.wustl.edu/eddy/forester/. A web server is at http://www.rio.wustl.edu/. RIO was tested on the Arabidopsis thaliana and Caenorhabditis elegans proteomes. CONCLUSION: The RIO procedure is particularly useful for the automated detection of first representatives of novel protein subfamilies. We also describe how some orthologies can be misleading for functional inference.
format	Text
id	pubmed-116988
institution	National Center for Biotechnology Information
language	English
publishDate	2002
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-116988 2002-07-10 BMC Bioinformatics Methodology article BioMed Central 2002-05-16 /pmc/articles/PMC116988/ /pubmed/12028595 http://dx.doi.org/10.1186/1471-2105-3-14 Text en Copyright ©2002 Zmasek and Eddy; licensee BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.
title	RIO: Analyzing proteomes by automated phylogenomics using resampled inference of orthologs
topic	Methodology article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC116988/ https://www.ncbi.nlm.nih.gov/pubmed/12028595 http://dx.doi.org/10.1186/1471-2105-3-14
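
The abstract states that RIO runs over bootstrap resampled phylogenetic trees to estimate the reliability of orthology assignments. The following is a minimal Python sketch of that idea, not the authors' Perl/C/Java pipeline. It assumes each bootstrap gene tree already has its internal nodes labeled as speciation ("S") or duplication ("D"); how RIO derives those labels is not described in this record. The tree encoding and the names `leaves`, `orthologs_of`, and `orthology_support` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Assumed encoding (hypothetical): a tree is either a leaf name (str) or a
# tuple (event, left, right), where event is "S" (speciation) or "D" (duplication).


def leaves(tree):
    """Return the set of leaf names under a tree node."""
    if isinstance(tree, str):
        return {tree}
    _, left, right = tree
    return leaves(left) | leaves(right)


def orthologs_of(tree, query):
    """Return the leaves whose last common ancestor with `query` is a speciation node."""
    result = set()
    node = tree
    # Walk from the root down to the query; at each ancestor, every leaf in the
    # sibling subtree has that ancestor as its last common ancestor with the query.
    while not isinstance(node, str):
        event, left, right = node
        if query in leaves(left):
            inside, outside = left, right
        elif query in leaves(right):
            inside, outside = right, left
        else:
            raise ValueError(f"{query!r} is not a leaf of this tree")
        if event == "S":
            result |= leaves(outside)
        node = inside
    if node != query:
        raise ValueError(f"{query!r} is not a leaf of this tree")
    return result


def orthology_support(trees, query):
    """Fraction of resampled trees in which each sequence is an ortholog of `query`."""
    counts = Counter()
    for tree in trees:
        counts.update(orthologs_of(tree, query))
    return {name: counts[name] / len(trees) for name in sorted(counts)}


# Four toy "bootstrap" trees over a query Q and two other sequences, O and P.
bootstrap_trees = [
    ("S", ("D", "Q", "P"), "O"),
    ("S", ("D", "Q", "P"), "O"),
    ("D", ("S", "Q", "O"), "P"),
    ("S", ("S", "Q", "P"), "O"),
]
support = orthology_support(bootstrap_trees, "Q")
print(support)
```

In this toy input, O is an ortholog of Q in all four trees and P in only one, so the support values separate a well-supported orthology call from a weakly supported one.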
