
CLIPS-1D: analysis of multiple sequence alignments to deduce for residue-positions a role in catalysis, ligand-binding, or protein structure

BACKGROUND: One aim of the in silico characterization of proteins is to identify all residue-positions, which are crucial for function or structure. Several sequence-based algorithms exist, which predict functionally important sites. However, with respect to sequence information, many functionally a...


Bibliographic Details
Main authors: Janda, Jan-Oliver, Busch, Markus, Kück, Fabian, Porfenenko, Mikhail, Merkl, Rainer
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3391178/
https://www.ncbi.nlm.nih.gov/pubmed/22480135
http://dx.doi.org/10.1186/1471-2105-13-55
_version_ 1782237488029368320
author Janda, Jan-Oliver
Busch, Markus
Kück, Fabian
Porfenenko, Mikhail
Merkl, Rainer
author_facet Janda, Jan-Oliver
Busch, Markus
Kück, Fabian
Porfenenko, Mikhail
Merkl, Rainer
author_sort Janda, Jan-Oliver
collection PubMed
description BACKGROUND: One aim of the in silico characterization of proteins is to identify all residue-positions that are crucial for function or structure. Several sequence-based algorithms exist that predict functionally important sites. However, with respect to sequence information, many functionally and structurally important sites are hard to distinguish, and consequently a large number of incorrectly predicted functional sites has to be expected. This is why we were interested in designing a new classifier that differentiates between functionally and structurally important sites and in assessing its performance on representative datasets. RESULTS: We have implemented CLIPS-1D, which predicts a role in catalysis, ligand-binding, or protein structure for residue-positions in a mutually exclusive manner. By analyzing a multiple sequence alignment, the algorithm scores conservation as well as abundance of residues at individual sites and their local neighborhood and categorizes by means of a multiclass support vector machine. A cross-validation confirmed that residue-positions involved in catalysis were identified with state-of-the-art quality; the mean MCC-value was 0.34. For structurally important sites, prediction quality was considerably higher (mean MCC = 0.67). For ligand-binding sites, prediction quality was lower (mean MCC = 0.12), because binding sites and structurally important residue-positions share conservation and abundance values, which makes their separation difficult. We show that classification success varies for residues in a class-specific manner. This is why our algorithm computes residue-specific p-values, which allow for the statistical assessment of each individual prediction. CLIPS-1D is available as a Web service at http://www-bioinf.uni-regensburg.de/. CONCLUSIONS: CLIPS-1D is a classifier whose prediction quality has been determined separately for catalytic sites, ligand-binding sites, and structurally important sites. It generates hypotheses about residue-positions important for a set of homologous proteins and focuses on conservation and abundance signals. Thus, the algorithm can be applied in cases where function cannot be transferred from well-characterized proteins by means of sequence comparison.
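The description reports prediction quality as a mean MCC (Matthews correlation coefficient) per class. As a minimal sketch of how such a per-class value can be computed for mutually exclusive predictions, the following treats each class one-vs-rest and applies the standard MCC formula. The function names and the class labels (`"catalytic"`, `"ligand"`, `"structural"`, `"none"`) are hypothetical and chosen for illustration only; this is not the CLIPS-1D implementation, and the record does not state how the authors averaged over proteins or folds.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention,
    assumed here).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def per_class_mcc(true_labels, predicted_labels, classes):
    """One-vs-rest MCC for each class of a mutually exclusive labeling.

    Label names are hypothetical placeholders, not taken from the paper.
    """
    scores = {}
    for cls in classes:
        tp = tn = fp = fn = 0
        for truth, pred in zip(true_labels, predicted_labels):
            if pred == cls and truth == cls:
                tp += 1
            elif pred == cls:
                fp += 1
            elif truth == cls:
                fn += 1
            else:
                tn += 1
        scores[cls] = mcc(tp, tn, fp, fn)
    return scores


if __name__ == "__main__":
    # Toy data: one label per residue-position (illustrative only).
    truth = ["catalytic", "ligand", "structural", "none", "none", "structural"]
    pred = ["catalytic", "structural", "structural", "none", "ligand", "structural"]
    for name, value in per_class_mcc(
        truth, pred, ["catalytic", "ligand", "structural"]
    ).items():
        print(name, round(value, 2))
```

An MCC of 1 indicates perfect agreement, 0 indicates no better than chance, and -1 indicates total disagreement, which is why it is often preferred over accuracy when important sites are rare relative to all residue-positions.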
format Online
Article
Text
id pubmed-3391178
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3391178 2012-07-09 CLIPS-1D: analysis of multiple sequence alignments to deduce for residue-positions a role in catalysis, ligand-binding, or protein structure Janda, Jan-Oliver Busch, Markus Kück, Fabian Porfenenko, Mikhail Merkl, Rainer BMC Bioinformatics Research Article BioMed Central 2012-04-05 /pmc/articles/PMC3391178/ /pubmed/22480135 http://dx.doi.org/10.1186/1471-2105-13-55 Text en Copyright ©2012 Janda et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title CLIPS-1D: analysis of multiple sequence alignments to deduce for residue-positions a role in catalysis, ligand-binding, or protein structure
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3391178/
https://www.ncbi.nlm.nih.gov/pubmed/22480135
http://dx.doi.org/10.1186/1471-2105-13-55