Localizing and Classifying Adaptive Targets with Trend Filtered Regression

Identifying genomic locations of natural selection from sequence data is an ongoing challenge in population genetics. Current methods utilizing information combined from several summary statistics typically assume no correlation of summary statistics regardless of the genomic location from which the...

Bibliographic Details
Main authors: Mughal, Mehreen R; DeGiorgio, Michael
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6409434/
https://www.ncbi.nlm.nih.gov/pubmed/30398642
http://dx.doi.org/10.1093/molbev/msy205
author Mughal, Mehreen R
DeGiorgio, Michael
collection PubMed
description Identifying genomic locations of natural selection from sequence data is an ongoing challenge in population genetics. Current methods utilizing information combined from several summary statistics typically assume no correlation of summary statistics regardless of the genomic location from which they are calculated. However, due to linkage disequilibrium, summary statistics calculated at nearby genomic positions are highly correlated. We introduce an approach termed Trendsetter that accounts for the similarity of statistics calculated from adjacent genomic regions through trend filtering, while reducing the effects of multicollinearity through regularization. Our penalized regression framework has high power to detect sweeps, is capable of classifying sweep regions as either hard or soft, and can be applied to other selection scenarios as well. We find that Trendsetter is robust to both extensive missing data and strong background selection, and has comparable power to similar current approaches. Moreover, the model learned by Trendsetter can be viewed as a set of curves modeling the spatial distribution of summary statistics in the genome. Application to human genomic data revealed positively selected regions previously discovered such as LCT in Europeans and EDAR in East Asians. We also identified a number of novel candidates and show that populations with greater relatedness share more sweep signals.
format Online
Article
Text
id pubmed-6409434
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6409434 2019-03-15 Mol Biol Evol Discoveries Oxford University Press 2019-02 2018-11-06 /pmc/articles/PMC6409434/ /pubmed/30398642 http://dx.doi.org/10.1093/molbev/msy205 Text en © The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Localizing and Classifying Adaptive Targets with Trend Filtered Regression
topic Discoveries
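
The abstract describes Trendsetter as a penalized regression that uses trend filtering to keep coefficients for adjacent genomic regions similar, with regularization to limit multicollinearity. The sketch below illustrates that general idea only. It is not the authors' implementation, and the exact penalty used by Trendsetter is not given in this record. It assumes an L1 penalty on discrete differences of neighboring coefficients plus an L1 sparsity term; the names `difference`, `trend_filter_penalty`, `lam_trend`, and `lam_sparse` are hypothetical.

```python
from typing import List, Sequence


def difference(values: Sequence[float], order: int) -> List[float]:
    """Apply the discrete difference operator `order` times."""
    out = list(values)
    for _ in range(order):
        # Each pass replaces the sequence with differences of neighbors.
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def trend_filter_penalty(
    coefs: Sequence[float],
    order: int = 1,
    lam_trend: float = 1.0,
    lam_sparse: float = 1.0,
) -> float:
    """Assumed penalty: lam_trend * ||D^order b||_1 + lam_sparse * ||b||_1.

    `coefs` holds one summary statistic's coefficients, ordered by
    genomic position (hypothetical layout for this illustration).
    """
    trend = sum(abs(d) for d in difference(coefs, order))
    sparse = sum(abs(c) for c in coefs)
    return lam_trend * trend + lam_sparse * sparse


# Two hypothetical coefficient profiles across seven adjacent windows.
smooth = [0, 1, 2, 3, 2, 1, 0]   # changes gradually between neighbors
jagged = [0, 3, -1, 3, -2, 1, 0]  # jumps between neighbors

smooth_cost = trend_filter_penalty(smooth, order=1, lam_sparse=0.0)
jagged_cost = trend_filter_penalty(jagged, order=1, lam_sparse=0.0)
print(smooth_cost, jagged_cost)
```

Under this assumed penalty, the jagged profile costs more than the smooth one, so a fit that minimizes loss plus penalty is pushed toward coefficient curves that vary gradually along the genome. Raising `order` to 2 penalizes changes in slope instead of changes in level.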