
Phylogenetic inference using generative adversarial networks

MOTIVATION: The application of machine learning approaches in phylogenetics has been impeded by the vast model space associated with inference. Supervised machine learning approaches require data from across this space to train models. Because of this, previous approaches have typically been limited...
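For a sense of the model space the abstract refers to: the number of unrooted, fully resolved tree topologies on n labelled taxa is (2n - 5)!!, a standard combinatorial result. The sketch below is illustrative only and is not part of phyloGAN; the function name `num_unrooted_topologies` is a hypothetical helper. It reproduces the three topologies of an unrooted quartet and shows how quickly the count grows for the 6- and 15-taxon cases mentioned in the full abstract.

```python
def num_unrooted_topologies(n_taxa: int) -> int:
    """Count unrooted, fully resolved tree topologies on n labelled taxa.

    Uses the double factorial (2n - 5)!! = 1 * 3 * 5 * ... * (2n - 5).
    Hypothetical helper for illustration; not taken from phyloGAN.
    """
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5 (empty product for n = 3).
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


if __name__ == "__main__":
    for n in (4, 6, 15):
        print(n, num_unrooted_topologies(n))
    # 4 3
    # 6 105
    # 15 7905853580625
```

A supervised classifier needs training data for every candidate topology, which is feasible for the 3 quartet topologies but not for the trillions of topologies at 15 taxa. This is the motivation the abstract gives for a search-based approach.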


Bibliographic Details
Main Authors: Smith, Megan L, Hahn, Matthew W
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10500083/
https://www.ncbi.nlm.nih.gov/pubmed/37669126
http://dx.doi.org/10.1093/bioinformatics/btad543
_version_ 1785105848968675328
author Smith, Megan L
Hahn, Matthew W
collection PubMed
description MOTIVATION: The application of machine learning approaches in phylogenetics has been impeded by the vast model space associated with inference. Supervised machine learning approaches require data from across this space to train models. Because of this, previous approaches have typically been limited to inferring relationships among unrooted quartets of taxa, where there are only three possible topologies. Here, we explore the potential of generative adversarial networks (GANs) to address this limitation. GANs consist of a generator and a discriminator: at each step, the generator aims to create data that is similar to real data, while the discriminator attempts to distinguish generated and real data. By using an evolutionary model as the generator, we use GANs to make evolutionary inferences. Since a new model can be considered at each iteration, heuristic searches of complex model spaces are possible. Thus, GANs offer a potential solution to the challenges of applying machine learning in phylogenetics. RESULTS: We developed phyloGAN, a GAN that infers phylogenetic relationships among species. phyloGAN takes as input a concatenated alignment, or a set of gene alignments, and infers a phylogenetic tree either considering or ignoring gene tree heterogeneity. We explored the performance of phyloGAN for up to 15 taxa in the concatenation case and 6 taxa when considering gene tree heterogeneity. Error rates are relatively low in these simple cases. However, run times are slow and performance metrics suggest issues during training. Future work should explore novel architectures that may result in more stable and efficient GANs for phylogenetics. AVAILABILITY AND IMPLEMENTATION: phyloGAN is available on GitHub: https://github.com/meganlsmith/phyloGAN/.
format Online
Article
Text
id pubmed-10500083
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10500083 2023-09-15 Phylogenetic inference using generative adversarial networks Smith, Megan L Hahn, Matthew W Bioinformatics Original Paper
Oxford University Press 2023-09-05 /pmc/articles/PMC10500083/ /pubmed/37669126 http://dx.doi.org/10.1093/bioinformatics/btad543 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Phylogenetic inference using generative adversarial networks
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10500083/
https://www.ncbi.nlm.nih.gov/pubmed/37669126
http://dx.doi.org/10.1093/bioinformatics/btad543