A Deep-Learning Approach for Inference of Selective Sweeps from the Ancestral Recombination Graph

Bibliographic Details
Main authors: Hejase, Hussein A, Mo, Ziyi, Campagna, Leonardo, Siepel, Adam
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8789311/
https://www.ncbi.nlm.nih.gov/pubmed/34888675
http://dx.doi.org/10.1093/molbev/msab332
_version_ 1784639738842447872
author Hejase, Hussein A
Mo, Ziyi
Campagna, Leonardo
Siepel, Adam
author_sort Hejase, Hussein A
collection PubMed
description Detecting signals of selection from genomic data is a central problem in population genetics. Coupling the rich information in the ancestral recombination graph (ARG) with a powerful and scalable deep-learning framework, we developed a novel method to detect and quantify positive selection: Selection Inference using the Ancestral recombination graph (SIA). Built on a Long Short-Term Memory (LSTM) architecture, a particular type of recurrent neural network (RNN), SIA can be trained to explicitly infer a full range of selection coefficients, as well as the allele frequency trajectory and time of selection onset. We benchmarked SIA extensively on simulations under a European human demographic model and found that it performs as well as or better than some of the best available methods, including state-of-the-art machine-learning and ARG-based methods. In addition, we used SIA to estimate selection coefficients at several loci associated with human phenotypes of interest. SIA detected novel signals of selection particular to the European (CEU) population at the MC1R and ABCC11 loci. It also recapitulated signals of selection at the LCT locus and several pigmentation-related genes. Finally, we reanalyzed polymorphism data from a collection of recently radiated southern capuchino seedeater taxa in the genus Sporophila to quantify the strength of selection, and improved the power of our previous methods to detect partial soft sweeps. Overall, SIA uses deep learning to leverage the ARG and thereby provides new insight into how selective sweeps shape genomic diversity.
format Online
Article
Text
id pubmed-8789311
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8789311 2022-01-26 A Deep-Learning Approach for Inference of Selective Sweeps from the Ancestral Recombination Graph. Hejase, Hussein A; Mo, Ziyi; Campagna, Leonardo; Siepel, Adam. Mol Biol Evol. Methods. Oxford University Press 2021-11-22. /pmc/articles/PMC8789311/ /pubmed/34888675 http://dx.doi.org/10.1093/molbev/msab332 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title A Deep-Learning Approach for Inference of Selective Sweeps from the Ancestral Recombination Graph
topic Methods
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8789311/
https://www.ncbi.nlm.nih.gov/pubmed/34888675
http://dx.doi.org/10.1093/molbev/msab332