
Correlated mutations via regularized multinomial regression

BACKGROUND: In addition to sequence conservation, protein multiple sequence alignments contain evolutionary signal in the form of correlated variation among amino acid positions. This signal indicates positions in the sequence that influence each other, and can be applied for the prediction of intra...


Bibliographic Details
Main authors: Sreekumar, Janardanan; ter Braak, Cajo JF; van Ham, Roeland CHJ; van Dijk, Aalt DJ
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3247924/
https://www.ncbi.nlm.nih.gov/pubmed/22082126
http://dx.doi.org/10.1186/1471-2105-12-444
author Sreekumar, Janardanan
ter Braak, Cajo JF
van Ham, Roeland CHJ
van Dijk, Aalt DJ
collection PubMed
description BACKGROUND: In addition to sequence conservation, protein multiple sequence alignments contain evolutionary signal in the form of correlated variation among amino acid positions. This signal indicates positions in the sequence that influence each other, and can be applied for the prediction of intra- or intermolecular contacts. Although various approaches exist for the detection of such correlated mutations, in general these methods utilize only pairwise correlations. Hence, they tend to conflate direct and indirect dependencies.
RESULTS: We propose RMRCM, a method for Regularized Multinomial Regression in order to obtain Correlated Mutations from protein multiple sequence alignments. Importantly, our method is not restricted to pairwise (column-column) comparisons only, but takes into account the network nature of relationships between protein residues in order to predict residue-residue contacts. The use of regularization ensures that the number of predicted links between columns in the multiple sequence alignment remains limited, preventing overprediction. Using simulated datasets we analyzed the performance of our approach in predicting residue-residue contacts, and studied how it is influenced by various types of noise. For various biological datasets, validation with protein structure data indicates a good performance of the proposed algorithm for the prediction of residue-residue contacts, in comparison to previous results. RMRCM can also be applied to predict interactions (in addition to only predicting interaction sites or contact sites), as demonstrated by predicting PDZ-peptide interactions.
CONCLUSIONS: A novel method is presented, which uses regularized multinomial regression in order to obtain correlated mutations from protein multiple sequence alignments.
AVAILABILITY: R-code of our implementation is available via http://www.ab.wur.nl/rmrcm
format Online
Article
Text
id pubmed-3247924
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3247924 2011-12-30 Correlated mutations via regularized multinomial regression. Sreekumar, Janardanan; ter Braak, Cajo JF; van Ham, Roeland CHJ; van Dijk, Aalt DJ. BMC Bioinformatics. Research Article. BioMed Central 2011-11-14 /pmc/articles/PMC3247924/ /pubmed/22082126 http://dx.doi.org/10.1186/1471-2105-12-444 Text en Copyright ©2011 Sreekumar et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Correlated mutations via regularized multinomial regression
topic Research Article