AliSim: A Fast and Versatile Phylogenetic Sequence Simulator for the Genomic Era
Sequence simulators play an important role in phylogenetics. Simulated data has many applications, such as evaluating the performance of different methods, hypothesis testing with parametric bootstraps, and, more recently, generating data for training machine-learning applications. Many sequence sim...
Main Authors: | Ly-Trong, Nhan, Naser-Khdour, Suha, Lanfear, Robert, Minh, Bui Quang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | Resources |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9113491/ https://www.ncbi.nlm.nih.gov/pubmed/35511713 http://dx.doi.org/10.1093/molbev/msac092 |
_version_ | 1784709594280361984 |
---|---|
author | Ly-Trong, Nhan Naser-Khdour, Suha Lanfear, Robert Minh, Bui Quang |
author_facet | Ly-Trong, Nhan Naser-Khdour, Suha Lanfear, Robert Minh, Bui Quang |
author_sort | Ly-Trong, Nhan |
collection | PubMed |
description | Sequence simulators play an important role in phylogenetics. Simulated data has many applications, such as evaluating the performance of different methods, hypothesis testing with parametric bootstraps, and, more recently, generating data for training machine-learning applications. Many sequence simulation programmes exist, but the most feature-rich programmes tend to be rather slow, and the fastest programmes tend to be feature-poor. Here, we introduce AliSim, a new tool that can efficiently simulate biologically realistic alignments under a large range of complex evolutionary models. To achieve high performance across a wide range of simulation conditions, AliSim implements an adaptive approach that combines the commonly used rate matrix and probability matrix approaches. AliSim takes 1.4 h and 1.3 GB RAM to simulate alignments with one million sequences or sites, whereas popular software Seq-Gen, Dawg, and INDELible require 2–5 h and 50–500 GB of RAM. We provide AliSim as an extension of the IQ-TREE software version 2.2, freely available at www.iqtree.org, and a comprehensive user tutorial at http://www.iqtree.org/doc/AliSim. |
format | Online Article Text |
id | pubmed-9113491 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9113491 2022-05-18 AliSim: A Fast and Versatile Phylogenetic Sequence Simulator for the Genomic Era Ly-Trong, Nhan Naser-Khdour, Suha Lanfear, Robert Minh, Bui Quang Mol Biol Evol Resources Sequence simulators play an important role in phylogenetics. Simulated data has many applications, such as evaluating the performance of different methods, hypothesis testing with parametric bootstraps, and, more recently, generating data for training machine-learning applications. Many sequence simulation programmes exist, but the most feature-rich programmes tend to be rather slow, and the fastest programmes tend to be feature-poor. Here, we introduce AliSim, a new tool that can efficiently simulate biologically realistic alignments under a large range of complex evolutionary models. To achieve high performance across a wide range of simulation conditions, AliSim implements an adaptive approach that combines the commonly used rate matrix and probability matrix approaches. AliSim takes 1.4 h and 1.3 GB RAM to simulate alignments with one million sequences or sites, whereas popular software Seq-Gen, Dawg, and INDELible require 2–5 h and 50–500 GB of RAM. We provide AliSim as an extension of the IQ-TREE software version 2.2, freely available at www.iqtree.org, and a comprehensive user tutorial at http://www.iqtree.org/doc/AliSim. Oxford University Press 2022-05-03 /pmc/articles/PMC9113491/ /pubmed/35511713 http://dx.doi.org/10.1093/molbev/msac092 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of Society for Molecular Biology and Evolution. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Resources Ly-Trong, Nhan Naser-Khdour, Suha Lanfear, Robert Minh, Bui Quang AliSim: A Fast and Versatile Phylogenetic Sequence Simulator for the Genomic Era |
title | AliSim: A Fast and Versatile Phylogenetic Sequence Simulator for the Genomic Era |
topic | Resources |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9113491/ https://www.ncbi.nlm.nih.gov/pubmed/35511713 http://dx.doi.org/10.1093/molbev/msac092 |