CMASA: an accurate algorithm for detecting local protein structural similarity and its application to enzyme catalytic site annotation

BACKGROUND: The rapid development of structural genomics has resulted in many proteins of "unknown function" being deposited in the Protein Data Bank (PDB); thus, the functional prediction of these proteins has become a challenge for structural bioinformatics. Several sequence-based and structure...

Bibliographic Details
Main authors: Li, Gong-Hua, Huang, Jing-Fei
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2936402/
https://www.ncbi.nlm.nih.gov/pubmed/20796320
http://dx.doi.org/10.1186/1471-2105-11-439
author Li, Gong-Hua
Huang, Jing-Fei
collection PubMed
description BACKGROUND: The rapid development of structural genomics has resulted in many proteins of "unknown function" being deposited in the Protein Data Bank (PDB), and predicting the function of these proteins has become a challenge for structural bioinformatics. Several sequence-based and structure-based methods have been developed to predict protein function, but their accuracy, sensitivity, and computational speed still need to be improved. Here, an accurate algorithm, CMASA (Contact MAtrix based local Structural Alignment algorithm), has been developed to predict the unknown functions of proteins from local protein structural similarity. The algorithm was evaluated on a test set including 164 enzyme families and compared with other methods. RESULTS: The evaluation shows that CMASA is highly accurate (0.96), sensitive (0.86), and fast enough for large-scale functional annotation. Compared with both sequence-based and global structure-based methods, CMASA can find not only remotely homologous proteins but also active site convergence. Compared with other methods based on local structure comparison, CMASA performs better than both FFF (a method that uses geometry to predict protein function) and SPASM (a local structure alignment method); it is also more sensitive than PINTS and more accurate than JESS (both local structure alignment methods). CMASA was applied to annotate the enzyme catalytic sites of the non-redundant PDB and suggested at least 166 putative catalytic sites that cannot be observed with the Catalytic Site Atlas (CSA). CONCLUSIONS: CMASA is an accurate algorithm for detecting local protein structural similarity, and it has several advantages in predicting enzyme active sites. CMASA can be used for large-scale enzyme active site annotation. CMASA is available through a mail-based server (http://159.226.149.45/other1/CMASA/CMASA.htm).
format Text
id pubmed-2936402
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2936402 2011-07-08 CMASA: an accurate algorithm for detecting local protein structural similarity and its application to enzyme catalytic site annotation Li, Gong-Hua Huang, Jing-Fei BMC Bioinformatics Methodology Article BioMed Central 2010-08-27 /pmc/articles/PMC2936402/ /pubmed/20796320 http://dx.doi.org/10.1186/1471-2105-11-439 Text en Copyright ©2010 Li and Huang; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title CMASA: an accurate algorithm for detecting local protein structural similarity and its application to enzyme catalytic site annotation
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2936402/
https://www.ncbi.nlm.nih.gov/pubmed/20796320
http://dx.doi.org/10.1186/1471-2105-11-439