
Protein function prediction using domain families

Here we assessed the use of domain families for predicting the functions of whole proteins. These 'functional families' (FunFams) were derived using a protocol that combines sequence clustering with supervised cluster evaluation, relying on available high-quality Gene Ontology (GO) annotat...


Bibliographic Details
Main Authors: Rentzsch, Robert; Orengo, Christine A
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3584934/
https://www.ncbi.nlm.nih.gov/pubmed/23514456
http://dx.doi.org/10.1186/1471-2105-14-S3-S5
_version_ 1782261080796430336
author Rentzsch, Robert
Orengo, Christine A
author_facet Rentzsch, Robert
Orengo, Christine A
author_sort Rentzsch, Robert
collection PubMed
description Here we assessed the use of domain families for predicting the functions of whole proteins. These 'functional families' (FunFams) were derived using a protocol that combines sequence clustering with supervised cluster evaluation, relying on available high-quality Gene Ontology (GO) annotation data in the latter step. In essence, the protocol groups domain sequences belonging to the same superfamily into families based on the GO annotations of their parent proteins. An initial test based on enzyme sequences confirmed that the FunFams resemble enzyme (domain) families much better than do families produced by sequence clustering alone. For the CAFA 2011 experiment, we further associated the FunFams with GO terms probabilistically. All target proteins were first submitted to domain superfamily assignment, followed by FunFam assignment and, eventually, function assignment. The latter included an integration step for multi-domain target proteins. The CAFA results put our domain-based approach among the top ten of 31 competing groups and 56 prediction methods, confirming that it outperforms simple pairwise whole-protein sequence comparisons.
format Online
Article
Text
id pubmed-3584934
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3584934 2013-03-11 Protein function prediction using domain families Rentzsch, Robert Orengo, Christine A BMC Bioinformatics Proceedings Here we assessed the use of domain families for predicting the functions of whole proteins. These 'functional families' (FunFams) were derived using a protocol that combines sequence clustering with supervised cluster evaluation, relying on available high-quality Gene Ontology (GO) annotation data in the latter step. In essence, the protocol groups domain sequences belonging to the same superfamily into families based on the GO annotations of their parent proteins. An initial test based on enzyme sequences confirmed that the FunFams resemble enzyme (domain) families much better than do families produced by sequence clustering alone. For the CAFA 2011 experiment, we further associated the FunFams with GO terms probabilistically. All target proteins were first submitted to domain superfamily assignment, followed by FunFam assignment and, eventually, function assignment. The latter included an integration step for multi-domain target proteins. The CAFA results put our domain-based approach among the top ten of 31 competing groups and 56 prediction methods, confirming that it outperforms simple pairwise whole-protein sequence comparisons. BioMed Central 2013-02-28 /pmc/articles/PMC3584934/ /pubmed/23514456 http://dx.doi.org/10.1186/1471-2105-14-S3-S5 Text en Copyright ©2013 Rentzsch and Orengo; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Proceedings
Rentzsch, Robert
Orengo, Christine A
Protein function prediction using domain families
title Protein function prediction using domain families
title_full Protein function prediction using domain families
title_fullStr Protein function prediction using domain families
title_full_unstemmed Protein function prediction using domain families
title_short Protein function prediction using domain families
title_sort protein function prediction using domain families
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3584934/
https://www.ncbi.nlm.nih.gov/pubmed/23514456
http://dx.doi.org/10.1186/1471-2105-14-S3-S5
work_keys_str_mv AT rentzschrobert proteinfunctionpredictionusingdomainfamilies
AT orengochristinea proteinfunctionpredictionusingdomainfamilies
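
The abstract mentions an integration step that combines function assignments for multi-domain target proteins, but this record does not say how that step works. The sketch below is only an illustration of one plausible way to merge per-domain GO term probabilities into protein-level scores. The noisy-OR rule, the function name integrate_domain_predictions, and the example probabilities are all assumptions made for illustration, not the authors' published method.

```python
from collections import defaultdict


def integrate_domain_predictions(domain_predictions):
    """Merge per-domain GO term probabilities into protein-level scores.

    domain_predictions: a list with one dict per domain, each mapping a
    GO term ID to the probability assigned via that domain's family.

    ASSUMPTION: terms are combined with a noisy-OR rule,
    P(term) = 1 - prod_d (1 - p_d). The source record does not describe
    the actual integration rule used in the paper.
    """
    # For each term, track the probability that no domain supports it.
    prob_unsupported = defaultdict(lambda: 1.0)
    for predictions in domain_predictions:
        for term, p in predictions.items():
            if not 0.0 <= p <= 1.0:
                raise ValueError(f"probability out of range for {term}: {p}")
            prob_unsupported[term] *= 1.0 - p
    return {term: 1.0 - q for term, q in prob_unsupported.items()}


# Hypothetical two-domain protein; the probabilities are made up.
example = [
    {"GO:0003824": 0.5, "GO:0016787": 0.25},  # domain 1
    {"GO:0003824": 0.5},                      # domain 2
]
print(integrate_domain_predictions(example))
# prints {'GO:0003824': 0.75, 'GO:0016787': 0.25}
```

Under this assumed rule, a term supported by several domains scores higher than it would from any single domain, while a term seen in only one domain keeps that domain's probability unchanged.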