
A unified statistical model of protein multiple sequence alignment integrating direct coupling and insertions

The multiple sequence alignment (MSA) of a protein family provides a wealth of information in terms of the conservation pattern of amino acid residues not only at each alignment site but also between distant sites. In order to statistically model the MSA incorporating both short-range and long-range...


Bibliographic Details
Main author: Kinjo, Akira R.
Format: Online Article Text
Language: English
Published: The Biophysical Society of Japan (BSJ) 2016
Subjects: Regular Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5042171/
https://www.ncbi.nlm.nih.gov/pubmed/27924257
http://dx.doi.org/10.2142/biophysico.13.0_45
collection PubMed
description The multiple sequence alignment (MSA) of a protein family provides a wealth of information in terms of the conservation pattern of amino acid residues not only at each alignment site but also between distant sites. In order to statistically model the MSA incorporating both short-range and long-range correlations as well as insertions, I have derived a lattice gas model of the MSA based on the principle of maximum entropy. The partition function, obtained by the transfer matrix method with a mean-field approximation, accounts for all possible alignments with all possible sequences. The model parameters for short-range and long-range interactions were determined by a self-consistent condition and by a Gaussian approximation, respectively. Using this model with and without long-range interactions, I analyzed the globin and V-set domains by increasing the “temperature” and by “mutating” a site. The correlations between residue conservation and various measures of the system’s stability indicate that the long-range interactions make the conservation pattern more specific to the structure, and increasingly stabilize better conserved residues.
id pubmed-5042171
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-5042171 2016-12-06. Biophys Physicobiol. Regular Article. The Biophysical Society of Japan (BSJ), 2016-04-22. /pmc/articles/PMC5042171/ /pubmed/27924257 http://dx.doi.org/10.2142/biophysico.13.0_45 Text, en. © 2016 The Biophysical Society of Japan. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Regular Article