A framework for protein structure classification and identification of novel protein structures

BACKGROUND: Protein structure classification plays a central role in understanding the function of a protein molecule with respect to all known proteins in a structure database. With the rapid increase in the number of new protein structures, the need for automated and accurate methods for protein c...

Bibliographic Details
Main authors: Kim, You Jung; Patel, Jignesh M
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1622760/
https://www.ncbi.nlm.nih.gov/pubmed/17042958
http://dx.doi.org/10.1186/1471-2105-7-456
author Kim, You Jung
Patel, Jignesh M
collection PubMed
description BACKGROUND: Protein structure classification plays a central role in understanding the function of a protein molecule with respect to all known proteins in a structure database. With the rapid increase in the number of new protein structures, the need for automated and accurate methods for protein classification is increasingly important. RESULTS: In this paper we present a unified framework for protein structure classification and identification of novel protein structures. The framework consists of a set of components for comparing, classifying, and clustering protein structures. These components allow us to accurately classify proteins into known folds, to detect new protein folds, and to provide a way of clustering the new folds. In our evaluation with SCOP 1.69, our method correctly classifies 86.0%, 87.7%, and 90.5% of new domains at family, superfamily, and fold levels. Furthermore, for protein domains that belong to new domain families, our method is able to produce clusters that closely correspond to the new families in SCOP 1.69. As a result, our method can also be used to suggest new classification groups that contain novel folds. CONCLUSION: We have developed a method called proCC for automatically classifying and clustering domains. The method is effective in classifying new domains and suggesting new domain families, and it is also very efficient. A web site offering access to proCC is freely available at
format Text
id pubmed-1622760
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1622760 2006-10-26
A framework for protein structure classification and identification of novel protein structures
Kim, You Jung
Patel, Jignesh M
BMC Bioinformatics
Methodology Article
BioMed Central 2006-10-16
/pmc/articles/PMC1622760/
/pubmed/17042958
http://dx.doi.org/10.1186/1471-2105-7-456
Text en
Copyright © 2006 Kim and Patel; licensee BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A framework for protein structure classification and identification of novel protein structures
topic Methodology Article