H2rs: Deducing evolutionary and functionally important residue positions by means of an entropy and similarity based analysis of multiple sequence alignments
BACKGROUND: The identification of functionally important residue positions is an important task of computational biology. Methods of correlation analysis allow for the identification of pairs of residue positions, whose occupancy is mutually dependent due to constraints imposed by protein structure...
Main authors: | Janda, Jan-Oliver; Popal, Ajmal; Bauer, Jochen; Busch, Markus; Klocke, Michael; Spitzer, Wolfgang; Keller, Jörg; Merkl, Rainer |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4021312/ https://www.ncbi.nlm.nih.gov/pubmed/24766829 http://dx.doi.org/10.1186/1471-2105-15-118 |
_version_ | 1782316216801558528 |
---|---|
author | Janda, Jan-Oliver Popal, Ajmal Bauer, Jochen Busch, Markus Klocke, Michael Spitzer, Wolfgang Keller, Jörg Merkl, Rainer |
author_sort | Janda, Jan-Oliver |
collection | PubMed |
description | BACKGROUND: The identification of functionally important residue positions is an important task of computational biology. Methods of correlation analysis allow for the identification of pairs of residue positions, whose occupancy is mutually dependent due to constraints imposed by protein structure or function. A common measure assessing these dependencies is the mutual information, which is based on Shannon’s information theory that utilizes probabilities only. Consequently, such approaches do not consider the similarity of residue pairs, which may degrade the algorithm’s performance. One typical algorithm is H2r, which characterizes each individual residue position k by the conn(k)-value, which is the number of significantly correlated pairs it belongs to. RESULTS: To improve specificity of H2r, we developed a revised algorithm, named H2rs, which is based on the von Neumann entropy (vNE). To compute the corresponding mutual information, a matrix A is required, which assesses the similarity of residue pairs. We determined A by deducing substitution frequencies from contacting residue pairs observed in the homologs of 35 809 proteins, whose structure is known. In analogy to H2r, the enhanced algorithm computes a normalized conn(k)-value. Within the framework of H2rs, only statistically significant vNE values were considered. To decide on significance, the algorithm calculates a p-value by performing a randomization test for each individual pair of residue positions. The analysis of a large in silico testbed demonstrated that specificity and precision were higher for H2rs than for H2r and two other methods of correlation analysis. The gain in prediction quality is further confirmed by a detailed assessment of five well-studied enzymes. The outcome of H2rs and of a method that predicts contacting residue positions (PSICOV) overlapped only marginally. H2rs can be downloaded from http://www-bioinf.uni-regensburg.de. CONCLUSIONS: Considering substitution frequencies for residue pairs by means of the von Neumann entropy and a p-value improved the success rate in identifying important residue positions. The integration of proven statistical concepts and normalization allows for an easier comparison of results obtained with different proteins. Comparing the outcome of the local method H2rs and of the global method PSICOV indicates that such methods supplement each other and have different scopes of application. |
format | Online Article Text |
id | pubmed-4021312 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4021312 2014-05-28 H2rs: Deducing evolutionary and functionally important residue positions by means of an entropy and similarity based analysis of multiple sequence alignments Janda, Jan-Oliver Popal, Ajmal Bauer, Jochen Busch, Markus Klocke, Michael Spitzer, Wolfgang Keller, Jörg Merkl, Rainer BMC Bioinformatics Methodology Article BioMed Central 2014-04-27 /pmc/articles/PMC4021312/ /pubmed/24766829 http://dx.doi.org/10.1186/1471-2105-15-118 Text en Copyright © 2014 Janda et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | H2rs: Deducing evolutionary and functionally important residue positions by means of an entropy and similarity based analysis of multiple sequence alignments |
topic | Methodology Article |
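
The abstract describes a pipeline in which each pair of alignment columns is scored, the score is tested for significance with a randomization test, and each position k is characterized by conn(k), the number of significantly correlated pairs it belongs to. The following is a minimal sketch of that general idea, not the authors' implementation. It deliberately uses plain Shannon mutual information rather than the von Neumann entropy, because the similarity matrix A is not given in this record, and it omits the normalization of conn(k). The function names, the significance level `alpha`, the number of shuffles, and the toy alignment are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def mutual_information(col_k, col_l):
    """Shannon mutual information (in bits) between two alignment columns."""
    n = len(col_k)
    count_k = Counter(col_k)
    count_l = Counter(col_l)
    count_kl = Counter(zip(col_k, col_l))
    mi = 0.0
    for (a, b), c in count_kl.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_k[a] * count_l[b]))
    return mi


def randomization_p_value(col_k, col_l, n_shuffles=200, rng=None):
    """Estimate a p-value by shuffling one column and re-scoring the pair."""
    rng = rng or random.Random(0)
    observed = mutual_information(col_k, col_l)
    shuffled = list(col_l)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        # Small tolerance so floating-point noise does not hide ties.
        if mutual_information(col_k, shuffled) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_shuffles + 1)


def conn_values(alignment, alpha=0.05, n_shuffles=200, seed=0):
    """Count, per column, the pairs whose score is significant (unnormalized)."""
    rng = random.Random(seed)
    columns = list(zip(*alignment))
    conn = [0] * len(columns)
    for k in range(len(columns)):
        for l in range(k + 1, len(columns)):
            p = randomization_p_value(columns[k], columns[l], n_shuffles, rng)
            if p < alpha:
                conn[k] += 1
                conn[l] += 1
    return conn


# Hypothetical toy alignment: columns 0 and 1 co-vary (A with K, D with R),
# column 2 is invariant, column 3 varies independently of the others.
toy_alignment = ["AKGS"] * 6 + ["AKGT"] * 6 + ["DRGS"] * 6 + ["DRGT"] * 6

print(conn_values(toy_alignment))
```

In this toy case only the coupled columns 0 and 1 are expected to receive a nonzero count. A similarity-aware score such as the one described in the abstract would replace `mutual_information` while leaving the randomization test and the counting step structurally unchanged.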