Regional Context in the Alignment of Biological Sequence Pairs
Sequence divergence derives from either point substitution or indel (insertion or deletion) processes. We investigated the rates of these two processes both in protein and non-protein coding DNA. We aligned sequence pairs using two pair-hidden Markov models (PHMMs) conjoined by one silent state. The...
Main authors: | Sammut, Raymond; Huttley, Gavin |
---|---|
Format: | Text |
Language: | English |
Published: | Springer-Verlag 2010 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3064887/ https://www.ncbi.nlm.nih.gov/pubmed/21107551 http://dx.doi.org/10.1007/s00239-010-9409-0 |
_version_ | 1782200931065004032 |
---|---|
author | Sammut, Raymond Huttley, Gavin |
collection | PubMed |
description | Sequence divergence derives from either point substitution or indel (insertion or deletion) processes. We investigated the rates of these two processes in both protein-coding and non-protein-coding DNA. We aligned sequence pairs using two pair-hidden Markov models (PHMMs) conjoined by one silent state. Each PHMM had its own set of parameters to model rates in its respective region. The aim was to test the hypothesis that the indel mutation rate mimics the point mutation rate. That is, indels are found less often in conserved regions (slow point substitution rate) and more often in non-conserved regions (fast point substitution rate). Both polypeptides and rRNA molecules in our data exhibited a clear distinction between slow and fast rates of the two processes. These two rates served as surrogates for conserved and non-conserved secondary structure components, respectively. With polypeptides, we found that both the fast indel rate and the fast replacement rate were co-located with hydrophilic residues. We also found that the average concordance of our alignments with the corresponding curated alignments improves markedly when the model allows either of the two fast rates to co-locate with hydrophilic residues. With rRNA molecules, our model did not detect co-location between the fast indel rate and the fast substitution rate. Nevertheless, coupling the indel rates with the point substitution rates across the two regions markedly increased model fit. This result suggests that rRNA pairwise alignments should be modeled after allowing for the two processes to vary simultaneously and independently in the two regions. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s00239-010-9409-0) contains supplementary material, which is available to authorized users. |
format | Text |
id | pubmed-3064887 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | Springer-Verlag |
record_format | MEDLINE/PubMed |
spelling | pubmed-3064887 2011-04-21 Regional Context in the Alignment of Biological Sequence Pairs Sammut, Raymond Huttley, Gavin J Mol Evol Article Springer-Verlag 2010-11-24 2011 /pmc/articles/PMC3064887/ /pubmed/21107551 http://dx.doi.org/10.1007/s00239-010-9409-0 Text en © The Author(s) 2010 https://creativecommons.org/licenses/by-nc/4.0/ This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited. |
title | Regional Context in the Alignment of Biological Sequence Pairs |
topic | Article |