Systematic comparison of SCOP and CATH: a new gold standard for protein structure analysis

Bibliographic Details
Main Authors: Csaba, Gergely; Birzele, Fabian; Zimmer, Ralf
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2678134/
https://www.ncbi.nlm.nih.gov/pubmed/19374763
http://dx.doi.org/10.1186/1472-6807-9-23
_version_ 1782166825652453376
author Csaba, Gergely
Birzele, Fabian
Zimmer, Ralf
collection PubMed
description BACKGROUND: SCOP and CATH are widely used as gold standards to benchmark novel protein structure comparison methods as well as to train machine learning approaches for protein structure classification and prediction. The two hierarchies result from different protocols which may result in differing classifications of the same protein. Ignoring such differences leads to problems when being used to train or benchmark automatic structure classification methods. Here, we propose a method to compare SCOP and CATH in detail and discuss possible applications of this analysis. RESULTS: We create a new mapping between SCOP and CATH and define a consistent benchmark set which is shown to largely reduce errors made by structure comparison methods such as TM-Align and has useful further applications, e.g. for machine learning methods being trained for protein structure classification. Additionally, we extract additional connections in the topology of the protein fold space from the orthogonal features contained in SCOP and CATH. CONCLUSION: Via an all-to-all comparison, we find that there are large and unexpected differences between SCOP and CATH w.r.t. their domain definitions as well as their hierarchic partitioning of the fold space on every level of the two classifications. A consistent mapping of SCOP and CATH can be exploited for automated structure comparison and classification. AVAILABILITY: Benchmark sets and an interactive SCOP-CATH browser are available at .
format Text
id pubmed-2678134
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2678134 2009-05-07 Systematic comparison of SCOP and CATH: a new gold standard for protein structure analysis Csaba, Gergely Birzele, Fabian Zimmer, Ralf BMC Struct Biol Methodology Article BioMed Central 2009-04-17 /pmc/articles/PMC2678134/ /pubmed/19374763 http://dx.doi.org/10.1186/1472-6807-9-23 Text en Copyright © 2009 Csaba et al; licensee BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Systematic comparison of SCOP and CATH: a new gold standard for protein structure analysis
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2678134/
https://www.ncbi.nlm.nih.gov/pubmed/19374763
http://dx.doi.org/10.1186/1472-6807-9-23