Correlated mutations via regularized multinomial regression
BACKGROUND: In addition to sequence conservation, protein multiple sequence alignments contain evolutionary signal in the form of correlated variation among amino acid positions. This signal indicates positions in the sequence that influence each other, and can be applied for the prediction of intra...
Main authors: | Sreekumar, Janardanan; ter Braak, Cajo JF; van Ham, Roeland CHJ; van Dijk, Aalt DJ |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2011 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3247924/ https://www.ncbi.nlm.nih.gov/pubmed/22082126 http://dx.doi.org/10.1186/1471-2105-12-444 |
_version_ | 1782220192060801024 |
---|---|
author | Sreekumar, Janardanan ter Braak, Cajo JF van Ham, Roeland CHJ van Dijk, Aalt DJ |
author_sort | Sreekumar, Janardanan |
collection | PubMed |
description | BACKGROUND: In addition to sequence conservation, protein multiple sequence alignments contain evolutionary signal in the form of correlated variation among amino acid positions. This signal indicates positions in the sequence that influence each other, and can be applied for the prediction of intra- or intermolecular contacts. Although various approaches exist for the detection of such correlated mutations, in general these methods utilize only pairwise correlations. Hence, they tend to conflate direct and indirect dependencies. RESULTS: We propose RMRCM, a method for Regularized Multinomial Regression in order to obtain Correlated Mutations from protein multiple sequence alignments. Importantly, our method is not restricted to pairwise (column-column) comparisons only, but takes into account the network nature of relationships between protein residues in order to predict residue-residue contacts. The use of regularization ensures that the number of predicted links between columns in the multiple sequence alignment remains limited, preventing overprediction. Using simulated datasets we analyzed the performance of our approach in predicting residue-residue contacts, and studied how it is influenced by various types of noise. For various biological datasets, validation with protein structure data indicates a good performance of the proposed algorithm for the prediction of residue-residue contacts, in comparison to previous results. RMRCM can also be applied to predict interactions (in addition to only predicting interaction sites or contact sites), as demonstrated by predicting PDZ-peptide interactions. CONCLUSIONS: A novel method is presented, which uses regularized multinomial regression in order to obtain correlated mutations from protein multiple sequence alignments. AVAILABILITY: R-code of our implementation is available via http://www.ab.wur.nl/rmrcm |
format | Online Article Text |
id | pubmed-3247924 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3247924 2011-12-30 Correlated mutations via regularized multinomial regression Sreekumar, Janardanan ter Braak, Cajo JF van Ham, Roeland CHJ van Dijk, Aalt DJ BMC Bioinformatics Research Article BioMed Central 2011-11-14 /pmc/articles/PMC3247924/ /pubmed/22082126 http://dx.doi.org/10.1186/1471-2105-12-444 Text en Copyright ©2011 Sreekumar et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Sreekumar, Janardanan ter Braak, Cajo JF van Ham, Roeland CHJ van Dijk, Aalt DJ Correlated mutations via regularized multinomial regression |
title | Correlated mutations via regularized multinomial regression |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3247924/ https://www.ncbi.nlm.nih.gov/pubmed/22082126 http://dx.doi.org/10.1186/1471-2105-12-444 |
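
The description above outlines the idea of regressing one alignment column on all the others and letting regularization keep the number of predicted links small. A minimal sketch of that idea follows; it is not the authors' RMRCM R implementation. The L1 penalty, the proximal gradient solver, the absence of an intercept, the scoring rule (sum of absolute coefficients per predictor column), the toy alignment, and every function name are assumptions made for illustration only.

```python
import math

def one_hot_features(msa, target):
    """Encode every non-target column as (column, residue) indicator features."""
    names = sorted({(k, seq[k]) for seq in msa
                    for k in range(len(seq)) if k != target})
    index = {name: i for i, name in enumerate(names)}
    rows = []
    for seq in msa:
        x = [0.0] * len(names)
        for k, aa in enumerate(seq):
            if k != target:
                x[index[(k, aa)]] = 1.0
        rows.append(x)
    return names, rows

def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def fit_l1_multinomial(rows, labels, lam=0.05, lr=0.5, steps=300):
    """Proximal gradient descent on the softmax loss plus lam * L1 (no intercept)."""
    classes = sorted(set(labels))
    n, p = len(rows), len(rows[0])
    w = [[0.0] * p for _ in classes]
    for _ in range(steps):
        grad = [[0.0] * p for _ in classes]
        for x, y in zip(rows, labels):
            z = [sum(wc[i] * x[i] for i in range(p)) for wc in w]
            m = max(z)  # subtract the max logit for numerical stability
            e = [math.exp(v - m) for v in z]
            s = sum(e)
            for c, cls in enumerate(classes):
                err = e[c] / s - (1.0 if cls == y else 0.0)
                for i in range(p):
                    grad[c][i] += err * x[i] / n
        # Gradient step followed by soft-thresholding (the L1 proximal step).
        w = [[soft_threshold(w[c][i] - lr * grad[c][i], lr * lam)
              for i in range(p)] for c in range(len(classes))]
    return classes, w

def link_scores(msa, target, **kwargs):
    """Score each predictor column by the summed |coefficients| of its features."""
    names, rows = one_hot_features(msa, target)
    labels = [seq[target] for seq in msa]
    _, w = fit_l1_multinomial(rows, labels, **kwargs)
    scores = {}
    for i, (col, _) in enumerate(names):
        scores[col] = scores.get(col, 0.0) + sum(abs(wc[i]) for wc in w)
    return scores

# Hypothetical toy alignment: column 1 covaries with column 0 (A-D, K-E),
# while column 2 (G/S) is balanced and independent of column 0.
TOY_MSA = ["ADG", "ADS", "KEG", "KES"] * 5
scores = link_scores(TOY_MSA, target=0)
print(scores)
```

On this toy input the covarying column 1 should receive a positive score, while the penalty should shrink the coefficients of the independent column 2 to zero, so no link is predicted for it. Repeating the fit with each column as the target in turn would give a full column-by-column link matrix; how such scores are thresholded or combined in the published method is not stated in this record.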