CoeViz: a web-based tool for coevolution analysis of protein residues
BACKGROUND: Proteins generally perform their function in a folded state. Residues forming an active site, whether it is a catalytic center or interaction interface, are frequently distant in a protein sequence. Hence, traditional sequence-based prediction methods focusing on a single residue (or a s...
Main authors: | Baker, Frazier N.; Porollo, Aleksey |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2016 |
Subjects: | Software |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4782369/ https://www.ncbi.nlm.nih.gov/pubmed/26956673 http://dx.doi.org/10.1186/s12859-016-0975-z |
_version_ | 1782419940398071808 |
---|---|
author | Baker, Frazier N. Porollo, Aleksey |
author_facet | Baker, Frazier N. Porollo, Aleksey |
author_sort | Baker, Frazier N. |
collection | PubMed |
description | BACKGROUND: Proteins generally perform their function in a folded state. Residues forming an active site, whether it is a catalytic center or interaction interface, are frequently distant in a protein sequence. Hence, traditional sequence-based prediction methods focusing on a single residue (or a short window of residues) at a time may have difficulties in identifying and clustering the residues constituting a functional site, especially when a protein has multiple functions. Evolutionary information encoded in multiple sequence alignments is known to greatly improve sequence-based predictions. Identification of coevolving residues further advances the protein structure and function annotation by revealing cooperative pairs and higher order groupings of residues. RESULTS: We present a new web-based tool (CoeViz) that provides a versatile analysis and visualization of pairwise coevolution of amino acid residues. The tool computes three covariance metrics: mutual information, chi-square statistic, Pearson correlation, and one conservation metric: joint Shannon entropy. Implemented adjustments of covariance scores include phylogeny correction, corrections for sequence dissimilarity and alignment gaps, and the average product correction. Visualization of residue relationships is enhanced by hierarchical cluster trees, heat maps, circular diagrams, and the residue highlighting in protein sequence and 3D structure. Unlike other existing tools, CoeViz is not limited to analyzing conserved domains or protein families and can process long, unstructured and multi-domain proteins thousands of residues long. Two examples are provided to illustrate the use of the tool for identification of residues (1) involved in enzymatic function, (2) forming short linear functional motifs, and (3) constituting a structural domain. 
CONCLUSIONS: CoeViz represents a practical resource for a quick sequence-based protein annotation for molecular biologists, e.g., for identifying putative functional clusters of residues and structural domains. CoeViz also can serve computational biologists as a resource of coevolution matrices, e.g., for developing machine learning-based prediction models. The presented tool is integrated in the POLYVIEW-2D server (http://polyview.cchmc.org/) and available from resulting pages of POLYVIEW-2D. |
format | Online Article Text |
id | pubmed-4782369 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-47823692016-03-09 CoeViz: a web-based tool for coevolution analysis of protein residues Baker, Frazier N. Porollo, Aleksey BMC Bioinformatics Software BACKGROUND: Proteins generally perform their function in a folded state. Residues forming an active site, whether it is a catalytic center or interaction interface, are frequently distant in a protein sequence. Hence, traditional sequence-based prediction methods focusing on a single residue (or a short window of residues) at a time may have difficulties in identifying and clustering the residues constituting a functional site, especially when a protein has multiple functions. Evolutionary information encoded in multiple sequence alignments is known to greatly improve sequence-based predictions. Identification of coevolving residues further advances the protein structure and function annotation by revealing cooperative pairs and higher order groupings of residues. RESULTS: We present a new web-based tool (CoeViz) that provides a versatile analysis and visualization of pairwise coevolution of amino acid residues. The tool computes three covariance metrics: mutual information, chi-square statistic, Pearson correlation, and one conservation metric: joint Shannon entropy. Implemented adjustments of covariance scores include phylogeny correction, corrections for sequence dissimilarity and alignment gaps, and the average product correction. Visualization of residue relationships is enhanced by hierarchical cluster trees, heat maps, circular diagrams, and the residue highlighting in protein sequence and 3D structure. Unlike other existing tools, CoeViz is not limited to analyzing conserved domains or protein families and can process long, unstructured and multi-domain proteins thousands of residues long. 
Two examples are provided to illustrate the use of the tool for identification of residues (1) involved in enzymatic function, (2) forming short linear functional motifs, and (3) constituting a structural domain. CONCLUSIONS: CoeViz represents a practical resource for a quick sequence-based protein annotation for molecular biologists, e.g., for identifying putative functional clusters of residues and structural domains. CoeViz also can serve computational biologists as a resource of coevolution matrices, e.g., for developing machine learning-based prediction models. The presented tool is integrated in the POLYVIEW-2D server (http://polyview.cchmc.org/) and available from resulting pages of POLYVIEW-2D. BioMed Central 2016-03-08 /pmc/articles/PMC4782369/ /pubmed/26956673 http://dx.doi.org/10.1186/s12859-016-0975-z Text en © Baker and Porollo. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Software Baker, Frazier N. Porollo, Aleksey CoeViz: a web-based tool for coevolution analysis of protein residues |
title | CoeViz: a web-based tool for coevolution analysis of protein residues |
title_full | CoeViz: a web-based tool for coevolution analysis of protein residues |
title_fullStr | CoeViz: a web-based tool for coevolution analysis of protein residues |
title_full_unstemmed | CoeViz: a web-based tool for coevolution analysis of protein residues |
title_short | CoeViz: a web-based tool for coevolution analysis of protein residues |
title_sort | coeviz: a web-based tool for coevolution analysis of protein residues |
topic | Software |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4782369/ https://www.ncbi.nlm.nih.gov/pubmed/26956673 http://dx.doi.org/10.1186/s12859-016-0975-z |
work_keys_str_mv | AT bakerfraziern coevizawebbasedtoolforcoevolutionanalysisofproteinresidues AT porolloaleksey coevizawebbasedtoolforcoevolutionanalysisofproteinresidues |
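
The description field in this record names mutual information, joint Shannon entropy, and the average product correction among the scores CoeViz reports. The record does not say how CoeViz computes them, so the following is only a minimal sketch of the textbook definitions applied to a made-up four-sequence alignment. The function names and the toy alignment are assumptions for illustration, and gap handling, sequence weighting, and phylogeny correction are deliberately left out.

```python
from collections import Counter
from math import log2

# Hypothetical toy alignment (not from the article): 4 sequences, 4 columns.
msa = ["AKLV", "AKIV", "GRLV", "GRIV"]

def column(msa, i):
    """Return the residues found in alignment column i."""
    return [seq[i] for seq in msa]

def entropy(symbols):
    """Shannon entropy (bits) of the empirical distribution of symbols."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())

def joint_entropy(msa, i, j):
    """Joint Shannon entropy of the residue pairs in columns i and j."""
    return entropy(list(zip(column(msa, i), column(msa, j))))

def mutual_information(msa, i, j):
    """MI(i, j) = H(i) + H(j) - H(i, j)."""
    return (entropy(column(msa, i)) + entropy(column(msa, j))
            - joint_entropy(msa, i, j))

def apc_matrix(msa):
    """Average product correction: MI(i,j) - mean_i * mean_j / overall mean."""
    n_cols = len(msa[0])
    mi = [[mutual_information(msa, i, j) if i != j else 0.0
           for j in range(n_cols)] for i in range(n_cols)]
    # Mean MI of each column with every other column (diagonal excluded).
    col_mean = [sum(row) / (n_cols - 1) for row in mi]
    overall = sum(col_mean) / n_cols
    if overall == 0:
        # No covariation signal anywhere, so nothing to correct.
        return [[0.0] * n_cols for _ in range(n_cols)]
    return [[mi[i][j] - col_mean[i] * col_mean[j] / overall if i != j else 0.0
             for j in range(n_cols)] for i in range(n_cols)]
```

In the toy alignment, columns 0 and 1 change together while column 2 varies independently and column 3 is invariant, so only the first pair receives a nonzero mutual information score.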