Graph-based modeling of tandem repeats improves global multiple sequence alignment
Tandem repeats (TRs) are often present in proteins with crucial functions, responsible for resistance, pathogenicity and associated with infectious or neurodegenerative diseases. This motivates numerous studies of TRs and their evolution, requiring accurate multiple sequence alignment. TRs may be lo...
Main Authors: | Szalkowski, Adam M.; Anisimova, Maria |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2013 |
Subjects: | Methods Online |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3783189/ https://www.ncbi.nlm.nih.gov/pubmed/23877246 http://dx.doi.org/10.1093/nar/gkt628 |
author | Szalkowski, Adam M. Anisimova, Maria |
collection | PubMed |
description | Tandem repeats (TRs) are often present in proteins with crucial functions, responsible for resistance, pathogenicity and associated with infectious or neurodegenerative diseases. This motivates numerous studies of TRs and their evolution, requiring accurate multiple sequence alignment. TRs may be lost or inserted at any position of a TR region by replication slippage or recombination, but current methods assume fixed unit boundaries, and yet are of high complexity. We present a new global graph-based alignment method that does not restrict TR unit indels by unit boundaries. TR indels are modeled separately and penalized using the phylogeny-aware alignment algorithm. This ensures enhanced accuracy of reconstructed alignments, disentangling TRs and measuring indel events and rates in a biologically meaningful way. Our method detects not only duplication events but also all changes in TR regions owing to recombination, strand slippage and other events inserting or deleting TR units. We evaluate our method by simulation incorporating TR evolution, by either sampling TRs from a profile hidden Markov model or by mimicking strand slippage with duplications. The new method is illustrated on a family of type III effectors, a pathogenicity determinant in agriculturally important bacteria Ralstonia solanacearum. We show that TR indel rate variation contributes to the diversification of this protein family. |
format | Online Article Text |
id | pubmed-3783189 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-3783189 2013-09-30 Nucleic Acids Res Methods Online Oxford University Press 2013-09 2013-07-22 Text en © The Author(s) 2013. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Graph-based modeling of tandem repeats improves global multiple sequence alignment |
topic | Methods Online |