
Dissecting Genomic Determinants of Positive Selection with an Evolution-Guided Regression Model

In evolutionary genomics, it is fundamentally important to understand how characteristics of genomic sequences, such as gene expression level, determine the rate of adaptive evolution. While numerous statistical methods, such as the McDonald–Kreitman (MK) test, are available to examine the associati...


Bibliographic Details
Main author: Huang, Yi-Fei
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects: Methods
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8763110/
https://www.ncbi.nlm.nih.gov/pubmed/34597406
http://dx.doi.org/10.1093/molbev/msab291
_version_ 1784633880861474816
author Huang, Yi-Fei
collection PubMed
description In evolutionary genomics, it is fundamentally important to understand how characteristics of genomic sequences, such as gene expression level, determine the rate of adaptive evolution. While numerous statistical methods, such as the McDonald–Kreitman (MK) test, are available to examine the association between genomic features and the rate of adaptation, we currently lack a statistical approach to disentangle the independent effect of a genomic feature from the effects of other correlated genomic features. To address this problem, I present a novel statistical model, the MK regression, which augments the MK test with a generalized linear model. Analogous to the classical multiple regression model, the MK regression can analyze multiple genomic features simultaneously to infer the independent effect of a genomic feature, holding constant all other genomic features. Using the MK regression, I identify numerous genomic features driving positive selection in chimpanzees. These features include well-known ones, such as local mutation rate, residue exposure level, tissue specificity, and immune genes, as well as new features not previously reported, such as gene expression level and metabolic genes. In particular, I show that highly expressed genes may have a higher adaptation rate than their weakly expressed counterparts, even though a higher expression level may impose stronger negative selection. Also, I show that metabolic genes may have a higher adaptation rate than their nonmetabolic counterparts, possibly due to recent changes in diet in primate evolution. Overall, the MK regression is a powerful approach to elucidate the genomic basis of adaptation.
format Online Article Text
id pubmed-8763110
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8763110 2022-01-18 Dissecting Genomic Determinants of Positive Selection with an Evolution-Guided Regression Model Huang, Yi-Fei Mol Biol Evol Methods Oxford University Press 2021-10-01 /pmc/articles/PMC8763110/ /pubmed/34597406 http://dx.doi.org/10.1093/molbev/msab291 Text en
© The Author(s) 2021. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Dissecting Genomic Determinants of Positive Selection with an Evolution-Guided Regression Model
topic Methods
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8763110/
https://www.ncbi.nlm.nih.gov/pubmed/34597406
http://dx.doi.org/10.1093/molbev/msab291