
Graph-based modeling of tandem repeats improves global multiple sequence alignment

Tandem repeats (TRs) are often present in proteins with crucial functions, responsible for resistance, pathogenicity and associated with infectious or neurodegenerative diseases. This motivates numerous studies of TRs and their evolution, requiring accurate multiple sequence alignment. TRs may be lo...


Bibliographic Details
Main authors: Szalkowski, Adam M., Anisimova, Maria
Format: Online Article Text
Language: English
Published: Oxford University Press 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3783189/
https://www.ncbi.nlm.nih.gov/pubmed/23877246
http://dx.doi.org/10.1093/nar/gkt628
author Szalkowski, Adam M.
Anisimova, Maria
collection PubMed
description Tandem repeats (TRs) are often present in proteins with crucial functions, responsible for resistance, pathogenicity and associated with infectious or neurodegenerative diseases. This motivates numerous studies of TRs and their evolution, requiring accurate multiple sequence alignment. TRs may be lost or inserted at any position of a TR region by replication slippage or recombination, but current methods assume fixed unit boundaries, and yet are of high complexity. We present a new global graph-based alignment method that does not restrict TR unit indels by unit boundaries. TR indels are modeled separately and penalized using the phylogeny-aware alignment algorithm. This ensures enhanced accuracy of reconstructed alignments, disentangling TRs and measuring indel events and rates in a biologically meaningful way. Our method detects not only duplication events but also all changes in TR regions owing to recombination, strand slippage and other events inserting or deleting TR units. We evaluate our method by simulation incorporating TR evolution, by either sampling TRs from a profile hidden Markov model or by mimicking strand slippage with duplications. The new method is illustrated on a family of type III effectors, a pathogenicity determinant in agriculturally important bacteria Ralstonia solanacearum. We show that TR indel rate variation contributes to the diversification of this protein family.
format Online
Article
Text
id pubmed-3783189
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3783189 2013-09-30 Graph-based modeling of tandem repeats improves global multiple sequence alignment Szalkowski, Adam M. Anisimova, Maria Nucleic Acids Res Methods Online Oxford University Press 2013-09 2013-07-22 /pmc/articles/PMC3783189/ /pubmed/23877246 http://dx.doi.org/10.1093/nar/gkt628 Text en © The Author(s) 2013. Published by Oxford University Press.
http://creativecommons.org/licenses/by/3.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Graph-based modeling of tandem repeats improves global multiple sequence alignment
topic Methods Online