Flexible Mixture Model Approaches That Accommodate Footprint Size Variability for Robust Detection of Balancing Selection
Long-term balancing selection typically leaves narrow footprints of increased genetic diversity, and therefore most detection approaches only achieve optimal performances when sufficiently small genomic regions (i.e., windows) are examined. Such methods are sensitive to window sizes and suffer subst...
Main authors: | Cheng, Xiaoheng; DeGiorgio, Michael |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2020 |
Subjects: | Methods |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7820363/ https://www.ncbi.nlm.nih.gov/pubmed/32462188 http://dx.doi.org/10.1093/molbev/msaa134 |
_version_ | 1783639194297434112 |
---|---|
author | Cheng, Xiaoheng DeGiorgio, Michael |
author_facet | Cheng, Xiaoheng DeGiorgio, Michael |
author_sort | Cheng, Xiaoheng |
collection | PubMed |
description | Long-term balancing selection typically leaves narrow footprints of increased genetic diversity, and therefore most detection approaches achieve optimal performance only when sufficiently small genomic regions (i.e., windows) are examined. Such methods are sensitive to window size and suffer substantial losses in power when windows are large. Here, we employ mixture models to construct a set of five composite likelihood ratio test statistics, which we collectively term B statistics. These statistics are agnostic to window size and can operate on diverse forms of input data. Through simulations, we show that they exhibit power comparable to that of the best-performing current methods and retain high power regardless of window size. They also display considerable robustness to high mutation rates and uneven recombination landscapes, as well as to an array of other common confounding scenarios. Moreover, we applied a specific version of the B statistics, termed B(2), to a human population-genomic data set and recovered many top candidates from prior studies, including the then-uncharacterized STPG2 and CCDC169–SOHLH2, both of which are related to gamete functions. We further applied B(2) to a bonobo population-genomic data set. In addition to the MHC-DQ genes, we uncovered several novel candidate genes, such as KLRD1, involved in viral defense, and SCN9A, associated with pain perception. Finally, we show that our methods can be extended to account for multiallelic balancing selection, and we have integrated the set of statistics into open-source software named BalLeRMix for future applications by the scientific community. |
format | Online Article Text |
id | pubmed-7820363 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-7820363 2021-01-26 Flexible Mixture Model Approaches That Accommodate Footprint Size Variability for Robust Detection of Balancing Selection Cheng, Xiaoheng DeGiorgio, Michael Mol Biol Evol Methods Oxford University Press 2020-10-04 /pmc/articles/PMC7820363/ /pubmed/32462188 http://dx.doi.org/10.1093/molbev/msaa134 Text en © The Author(s) 2020. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methods Cheng, Xiaoheng DeGiorgio, Michael Flexible Mixture Model Approaches That Accommodate Footprint Size Variability for Robust Detection of Balancing Selection |
title | Flexible Mixture Model Approaches That Accommodate Footprint Size Variability for Robust Detection of Balancing Selection |
title_sort | flexible mixture model approaches that accommodate footprint size variability for robust detection of balancing selection |
topic | Methods |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7820363/ https://www.ncbi.nlm.nih.gov/pubmed/32462188 http://dx.doi.org/10.1093/molbev/msaa134 |