
CoeViz: a web-based tool for coevolution analysis of protein residues

BACKGROUND: Proteins generally perform their function in a folded state. Residues forming an active site, whether it is a catalytic center or interaction interface, are frequently distant in a protein sequence. Hence, traditional sequence-based prediction methods focusing on a single residue (or a s...


Bibliographic Details
Main Authors: Baker, Frazier N., Porollo, Aleksey
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4782369/
https://www.ncbi.nlm.nih.gov/pubmed/26956673
http://dx.doi.org/10.1186/s12859-016-0975-z
_version_ 1782419940398071808
author Baker, Frazier N.
Porollo, Aleksey
author_facet Baker, Frazier N.
Porollo, Aleksey
author_sort Baker, Frazier N.
collection PubMed
description BACKGROUND: Proteins generally perform their function in a folded state. Residues forming an active site, whether it is a catalytic center or interaction interface, are frequently distant in a protein sequence. Hence, traditional sequence-based prediction methods focusing on a single residue (or a short window of residues) at a time may have difficulties in identifying and clustering the residues constituting a functional site, especially when a protein has multiple functions. Evolutionary information encoded in multiple sequence alignments is known to greatly improve sequence-based predictions. Identification of coevolving residues further advances the protein structure and function annotation by revealing cooperative pairs and higher order groupings of residues. RESULTS: We present a new web-based tool (CoeViz) that provides a versatile analysis and visualization of pairwise coevolution of amino acid residues. The tool computes three covariance metrics: mutual information, chi-square statistic, Pearson correlation, and one conservation metric: joint Shannon entropy. Implemented adjustments of covariance scores include phylogeny correction, corrections for sequence dissimilarity and alignment gaps, and the average product correction. Visualization of residue relationships is enhanced by hierarchical cluster trees, heat maps, circular diagrams, and the residue highlighting in protein sequence and 3D structure. Unlike other existing tools, CoeViz is not limited to analyzing conserved domains or protein families and can process long, unstructured and multi-domain proteins thousands of residues long. Two examples are provided to illustrate the use of the tool for identification of residues (1) involved in enzymatic function, (2) forming short linear functional motifs, and (3) constituting a structural domain. CONCLUSIONS: CoeViz represents a practical resource for a quick sequence-based protein annotation for molecular biologists, e.g., for identifying putative functional clusters of residues and structural domains. CoeViz also can serve computational biologists as a resource of coevolution matrices, e.g., for developing machine learning-based prediction models. The presented tool is integrated in the POLYVIEW-2D server (http://polyview.cchmc.org/) and available from resulting pages of POLYVIEW-2D.
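The description names mutual information, joint Shannon entropy, and the average product correction as metrics computed over a multiple sequence alignment. The following is a minimal Python sketch of how such quantities can be computed for pairs of alignment columns. It is not CoeViz's implementation: the function names are hypothetical, gaps are treated as an ordinary symbol, and no phylogeny or sequence-dissimilarity weighting is applied. The alignment is assumed to be a list of equal-length strings with at least two columns.

```python
import math
from collections import Counter


def entropy(symbols):
    """Shannon entropy (in bits) of a list of hashable symbols."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def joint_entropy(col_a, col_b):
    """Joint Shannon entropy of two alignment columns."""
    return entropy(list(zip(col_a, col_b)))


def mutual_information(col_a, col_b):
    """MI(a, b) = H(a) + H(b) - H(a, b)."""
    return entropy(col_a) + entropy(col_b) - joint_entropy(col_a, col_b)


def mi_matrix(alignment):
    """Symmetric matrix of pairwise MI between alignment columns."""
    length = len(alignment[0])
    cols = [[seq[i] for seq in alignment] for i in range(length)]
    mi = [[0.0] * length for _ in range(length)]
    for i in range(length):
        for j in range(i + 1, length):
            mi[i][j] = mi[j][i] = mutual_information(cols[i], cols[j])
    return mi


def average_product_correction(mi):
    """Subtract mean(i) * mean(j) / overall_mean from each off-diagonal score."""
    length = len(mi)
    # The diagonal is zero, so each row sum covers only the other columns.
    means = [sum(row) / (length - 1) for row in mi]
    overall = sum(means) / length
    corrected = [[0.0] * length for _ in range(length)]
    if overall == 0:
        return corrected  # no covariation signal anywhere
    for i in range(length):
        for j in range(length):
            if i != j:
                corrected[i][j] = mi[i][j] - means[i] * means[j] / overall
    return corrected


# Toy alignment (made up for illustration): columns 0 and 1 covary,
# column 2 varies independently of both.
toy_alignment = ["ACA", "ACG", "GTA", "GTG"]
toy_mi = mi_matrix(toy_alignment)
toy_corrected = average_product_correction(toy_mi)
```

In the toy alignment, the first two columns change together while the third does not, so only the first pair retains a positive score after correction.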
format Online
Article
Text
id pubmed-4782369
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4782369 2016-03-09 CoeViz: a web-based tool for coevolution analysis of protein residues Baker, Frazier N. Porollo, Aleksey BMC Bioinformatics Software BioMed Central 2016-03-08 /pmc/articles/PMC4782369/ /pubmed/26956673 http://dx.doi.org/10.1186/s12859-016-0975-z Text en © Baker and Porollo. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Software
Baker, Frazier N.
Porollo, Aleksey
CoeViz: a web-based tool for coevolution analysis of protein residues
title CoeViz: a web-based tool for coevolution analysis of protein residues
title_full CoeViz: a web-based tool for coevolution analysis of protein residues
title_fullStr CoeViz: a web-based tool for coevolution analysis of protein residues
title_full_unstemmed CoeViz: a web-based tool for coevolution analysis of protein residues
title_short CoeViz: a web-based tool for coevolution analysis of protein residues
title_sort coeviz: a web-based tool for coevolution analysis of protein residues
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4782369/
https://www.ncbi.nlm.nih.gov/pubmed/26956673
http://dx.doi.org/10.1186/s12859-016-0975-z
work_keys_str_mv AT bakerfraziern coevizawebbasedtoolforcoevolutionanalysisofproteinresidues
AT porolloaleksey coevizawebbasedtoolforcoevolutionanalysisofproteinresidues