
ARPIP: Ancestral Sequence Reconstruction with Insertions and Deletions under the Poisson Indel Process

Modern phylogenetic methods allow inference of ancestral molecular sequences given an alignment and phylogeny relating present-day sequences. This provides insight into the evolutionary history of molecules, helping to understand gene function and to study biological processes such as adaptation and...


Bibliographic Details
Main authors: Jowkar, Gholamhossein; Pečerska, Jūlija; Maiolo, Massimo; Gil, Manuel; Anisimova, Maria
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10275563/
https://www.ncbi.nlm.nih.gov/pubmed/35866991
http://dx.doi.org/10.1093/sysbio/syac050
author Jowkar, Gholamhossein
Pečerska, Jūlija
Maiolo, Massimo
Gil, Manuel
Anisimova, Maria
collection PubMed
description Modern phylogenetic methods allow inference of ancestral molecular sequences given an alignment and phylogeny relating present-day sequences. This provides insight into the evolutionary history of molecules, helping to understand gene function and to study biological processes such as adaptation and convergent evolution across a variety of applications. Here, we propose a dynamic programming algorithm for fast joint likelihood-based reconstruction of ancestral sequences under the Poisson Indel Process (PIP). Unlike previous approaches, our method, named ARPIP, enables the reconstruction with insertions and deletions based on an explicit indel model. Consequently, inferred indel events have an explicit biological interpretation. Likelihood computation is achieved in linear time with respect to the number of sequences. Our method consists of two steps, namely finding the most probable indel points and reconstructing ancestral sequences. First, we find the most likely indel points and prune the phylogeny to reflect the insertion and deletion events per site. Second, we infer the ancestral states on the pruned subtree in a manner similar to FastML. We applied ARPIP (Ancestral Reconstruction under PIP) on simulated data sets and on real data from the Betacoronavirus genus. ARPIP reconstructs both the indel events and substitutions with a high degree of accuracy. Our method fares well when compared to established state-of-the-art methods such as FastML and PAML. Moreover, the method can be extended to explore both optimal and suboptimal reconstructions, include rate heterogeneity through time and more. We believe it will expand the range of novel applications of ancestral sequence reconstruction. [Ancestral sequences; dynamic programming; evolutionary stochastic process; indel; joint ancestral sequence reconstruction; maximum likelihood; Poisson Indel Process; phylogeny; SARS-CoV.]
format Online
Article
Text
id pubmed-10275563
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10275563 2023-06-17 ARPIP: Ancestral Sequence Reconstruction with Insertions and Deletions under the Poisson Indel Process Jowkar, Gholamhossein Pečerska, Jūlija Maiolo, Massimo Gil, Manuel Anisimova, Maria Syst Biol Regular Articles Oxford University Press 2022-07-22 /pmc/articles/PMC10275563/ /pubmed/35866991 http://dx.doi.org/10.1093/sysbio/syac050 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of the Society of Systematic Biologists. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title ARPIP: Ancestral Sequence Reconstruction with Insertions and Deletions under the Poisson Indel Process
topic Regular Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10275563/
https://www.ncbi.nlm.nih.gov/pubmed/35866991
http://dx.doi.org/10.1093/sysbio/syac050