Cargando…
Accelerating phylogeny-aware alignment with indel evolution using short time Fourier transform
Recently we presented a frequentist dynamic programming (DP) approach for multiple sequence alignment based on the explicit model of indel evolution Poisson Indel Process (PIP). This phylogeny-aware approach produces evolutionary meaningful gap patterns and is robust to the ‘over-alignment’ bias. De...
Autores principales: | , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Oxford University Press
2020
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7671320/ https://www.ncbi.nlm.nih.gov/pubmed/33575636 http://dx.doi.org/10.1093/nargab/lqaa092 |
_version_ | 1783610907340832768 |
---|---|
author | Maiolo, Massimo Ulzega, Simone Gil, Manuel Anisimova, Maria |
author_facet | Maiolo, Massimo Ulzega, Simone Gil, Manuel Anisimova, Maria |
author_sort | Maiolo, Massimo |
collection | PubMed |
description | Recently we presented a frequentist dynamic programming (DP) approach for multiple sequence alignment based on the explicit model of indel evolution Poisson Indel Process (PIP). This phylogeny-aware approach produces evolutionary meaningful gap patterns and is robust to the ‘over-alignment’ bias. Despite linear time complexity for the computation of marginal likelihoods, the overall method’s complexity is cubic in sequence length. Inspired by the popular aligner MAFFT, we propose a new technique to accelerate the evolutionary indel based alignment. Amino acid sequences are converted to sequences representing their physicochemical properties, and homologous blocks are identified by multi-scale short-time Fourier transform. Three three-dimensional DP matrices are then created under PIP, with homologous blocks defining sparse structures where most cells are excluded from the calculations. The homologous blocks are connected through intermediate ‘linking blocks’. The homologous and linking blocks are aligned under PIP as independent DP sub-matrices and their tracebacks merged to yield the final alignment. The new algorithm can largely profit from parallel computing, yielding a theoretical speed-up estimated to be proportional to the cubic power of the number of sub-blocks in the DP matrices. We compare the new method to the original PIP approach and demonstrate it on real data. |
format | Online Article Text |
id | pubmed-7671320 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-76713202021-02-10 Accelerating phylogeny-aware alignment with indel evolution using short time Fourier transform Maiolo, Massimo Ulzega, Simone Gil, Manuel Anisimova, Maria NAR Genom Bioinform Methods Article Recently we presented a frequentist dynamic programming (DP) approach for multiple sequence alignment based on the explicit model of indel evolution Poisson Indel Process (PIP). This phylogeny-aware approach produces evolutionary meaningful gap patterns and is robust to the ‘over-alignment’ bias. Despite linear time complexity for the computation of marginal likelihoods, the overall method’s complexity is cubic in sequence length. Inspired by the popular aligner MAFFT, we propose a new technique to accelerate the evolutionary indel based alignment. Amino acid sequences are converted to sequences representing their physicochemical properties, and homologous blocks are identified by multi-scale short-time Fourier transform. Three three-dimensional DP matrices are then created under PIP, with homologous blocks defining sparse structures where most cells are excluded from the calculations. The homologous blocks are connected through intermediate ‘linking blocks’. The homologous and linking blocks are aligned under PIP as independent DP sub-matrices and their tracebacks merged to yield the final alignment. The new algorithm can largely profit from parallel computing, yielding a theoretical speed-up estimated to be proportional to the cubic power of the number of sub-blocks in the DP matrices. We compare the new method to the original PIP approach and demonstrate it on real data. Oxford University Press 2020-11-06 /pmc/articles/PMC7671320/ /pubmed/33575636 http://dx.doi.org/10.1093/nargab/lqaa092 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of NAR Genomics and Bioinformatics. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Methods Article Maiolo, Massimo Ulzega, Simone Gil, Manuel Anisimova, Maria Accelerating phylogeny-aware alignment with indel evolution using short time Fourier transform |
title | Accelerating phylogeny-aware alignment with indel evolution using short time Fourier transform |
title_full | Accelerating phylogeny-aware alignment with indel evolution using short time Fourier transform |
title_fullStr | Accelerating phylogeny-aware alignment with indel evolution using short time Fourier transform |
title_full_unstemmed | Accelerating phylogeny-aware alignment with indel evolution using short time Fourier transform |
title_short | Accelerating phylogeny-aware alignment with indel evolution using short time Fourier transform |
title_sort | accelerating phylogeny-aware alignment with indel evolution using short time fourier transform |
topic | Methods Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7671320/ https://www.ncbi.nlm.nih.gov/pubmed/33575636 http://dx.doi.org/10.1093/nargab/lqaa092 |
work_keys_str_mv | AT maiolomassimo acceleratingphylogenyawarealignmentwithindelevolutionusingshorttimefouriertransform AT ulzegasimone acceleratingphylogenyawarealignmentwithindelevolutionusingshorttimefouriertransform AT gilmanuel acceleratingphylogenyawarealignmentwithindelevolutionusingshorttimefouriertransform AT anisimovamaria acceleratingphylogenyawarealignmentwithindelevolutionusingshorttimefouriertransform |