FlowerPower: clustering proteins into domain architecture classes for phylogenomic inference of protein function

Bibliographic Details
Main Authors:	Krishnamurthy, Nandini; Brown, Duncan; Sjölander, Kimmen
Format:	Text
Language:	English
Published:	BioMed Central 2007
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1796606/ https://www.ncbi.nlm.nih.gov/pubmed/17288570 http://dx.doi.org/10.1186/1471-2148-7-S1-S12

author	Krishnamurthy, Nandini; Brown, Duncan; Sjölander, Kimmen
collection	PubMed
description	BACKGROUND: Function prediction by transfer of annotation from the top database hit in a homology search has been shown to be prone to systematic error. Phylogenomic analysis reduces these errors by inferring protein function within the evolutionary context of the entire family. However, accuracy of function prediction for multi-domain proteins depends on all members having the same overall domain structure. By contrast, most common homolog detection methods are optimized for retrieving local homologs, and do not address this requirement. RESULTS: We present FlowerPower, a novel clustering algorithm designed for the identification of global homologs as a precursor to structural phylogenomic analysis. Similar to methods such as PSI-BLAST, FlowerPower employs an iterative approach to clustering sequences. However, rather than using a single HMM or profile to expand the cluster, FlowerPower identifies subfamilies using the SCI-PHY algorithm and then selects and aligns new homologs using subfamily hidden Markov models. FlowerPower is shown to outperform BLAST, PSI-BLAST and the UCSC SAM-Target 2K methods at discriminating between proteins in the same domain architecture class and those having different overall domain structures. CONCLUSION: Structural phylogenomic analysis enables biologists to avoid the systematic errors associated with annotation transfer; clustering sequences based on sharing the same domain architecture is a critical first step in this process. FlowerPower is shown to consistently identify homologous sequences having the same domain architecture as the query. AVAILABILITY: FlowerPower is available as a webserver at .
format	Text
id	pubmed-1796606
institution	National Center for Biotechnology Information
language	English
publishDate	2007
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1796606 2007-02-09 FlowerPower: clustering proteins into domain architecture classes for phylogenomic inference of protein function. Krishnamurthy, Nandini; Brown, Duncan; Sjölander, Kimmen. BMC Evol Biol. Research. BioMed Central 2007-02-08 /pmc/articles/PMC1796606/ /pubmed/17288570 http://dx.doi.org/10.1186/1471-2148-7-S1-S12 Text en Copyright © 2007 Krishnamurthy et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	FlowerPower: clustering proteins into domain architecture classes for phylogenomic inference of protein function
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1796606/ https://www.ncbi.nlm.nih.gov/pubmed/17288570 http://dx.doi.org/10.1186/1471-2148-7-S1-S12