Partially-supervised protein subclass discovery with simultaneous annotation of functional residues

BACKGROUND: The study of functional subfamilies of protein domain families and the identification of the residues which determine substrate specificity is an important question in the analysis of protein domains. One way to address this question is the use of clustering methods for protein sequence...

Bibliographic Details
Main authors: Georgi, Benjamin; Schultz, Jörg; Schliep, Alexander
Format: Text
Language: English
Published: BioMed Central 2009
Subjects: Research Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2777906/
https://www.ncbi.nlm.nih.gov/pubmed/19857261
http://dx.doi.org/10.1186/1472-6807-9-68
author Georgi, Benjamin
Schultz, Jörg
Schliep, Alexander
collection PubMed
description BACKGROUND: The study of functional subfamilies of protein domain families and the identification of the residues which determine substrate specificity is an important question in the analysis of protein domains. One way to address this question is the use of clustering methods for protein sequence data and approaches to predict functional residues based on such clusterings. The locations of putative functional residues in known protein structures provide insights into how different substrate specificities are reflected on the protein structure level. RESULTS: We have developed an extension of the context-specific independence mixture model clustering framework which allows for the integration of experimental data. As these are usually known only for a few proteins, our algorithm implements a partially-supervised learning approach. We discover domain subfamilies and predict functional residues for four protein domain families: phosphatases, pyridoxal dependent decarboxylases, WW and SH3 domains to demonstrate the usefulness of our approach. CONCLUSION: The partially-supervised clustering revealed biologically meaningful subfamilies even for highly heterogeneous domains and the predicted functional residues provide insights into the basis of the different substrate specificities.
format Text
id pubmed-2777906
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2777906 2009-11-17 BMC Struct Biol Research Article BioMed Central 2009-10-26 /pmc/articles/PMC2777906/ /pubmed/19857261 http://dx.doi.org/10.1186/1472-6807-9-68 Text en Copyright © 2009 Georgi et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Partially-supervised protein subclass discovery with simultaneous annotation of functional residues
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2777906/
https://www.ncbi.nlm.nih.gov/pubmed/19857261
http://dx.doi.org/10.1186/1472-6807-9-68