Inference of Gene Flow between Species under Misspecified Models

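Genomic sequence data provide a rich source of information about the history of species divergence and interspecific hybridization or introgression. Despite recent advances in genomics and statistical methods, it remains challenging to infer gene flow, and as a result, one may have to estimate introgression rates and times under misspecified models. Here we use mathematical analysis and computer simulation to examine estimation bias and issues of interpretation when the model of gene flow is misspecified in analysis of genomic datasets, for example, if introgression is assigned to the wrong lineages. In the case of two species, we establish a correspondence between the migration rate in the continuous migration model and the introgression probability in the introgression model. When gene flow occurs continuously through time but in the analysis is assumed to occur at a fixed time point, common evolutionary parameters such as species divergence times are surprisingly well estimated. However, the time of introgression tends to be estimated towards the recent end of the period of continuous gene flow. When introgression events are assigned incorrectly to the parental or daughter lineages, introgression times tend to collapse onto species divergence times, with introgression probabilities underestimated. Overall, our analyses suggest that the simple introgression model is useful for extracting information concerning between-specific gene flow and divergence even when the model may be misspecified. However, for reliable inference of gene flow it is important to include multiple samples per species, in particular, from hybridizing species.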

Bibliographic Details
Main Authors: Huang, Jun, Thawornwattana, Yuttapong, Flouri, Tomáš, Mallet, James, Yang, Ziheng
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9729068/
https://www.ncbi.nlm.nih.gov/pubmed/36317198
http://dx.doi.org/10.1093/molbev/msac237
author Huang, Jun
Thawornwattana, Yuttapong
Flouri, Tomáš
Mallet, James
Yang, Ziheng
collection PubMed
description Genomic sequence data provide a rich source of information about the history of species divergence and interspecific hybridization or introgression. Despite recent advances in genomics and statistical methods, it remains challenging to infer gene flow, and as a result, one may have to estimate introgression rates and times under misspecified models. Here we use mathematical analysis and computer simulation to examine estimation bias and issues of interpretation when the model of gene flow is misspecified in analysis of genomic datasets, for example, if introgression is assigned to the wrong lineages. In the case of two species, we establish a correspondence between the migration rate in the continuous migration model and the introgression probability in the introgression model. When gene flow occurs continuously through time but in the analysis is assumed to occur at a fixed time point, common evolutionary parameters such as species divergence times are surprisingly well estimated. However, the time of introgression tends to be estimated towards the recent end of the period of continuous gene flow. When introgression events are assigned incorrectly to the parental or daughter lineages, introgression times tend to collapse onto species divergence times, with introgression probabilities underestimated. Overall, our analyses suggest that the simple introgression model is useful for extracting information concerning between-specific gene flow and divergence even when the model may be misspecified. However, for reliable inference of gene flow it is important to include multiple samples per species, in particular, from hybridizing species.
format Online
Article
Text
id pubmed-9729068
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9729068 2022-12-08 Inference of Gene Flow between Species under Misspecified Models Huang, Jun Thawornwattana, Yuttapong Flouri, Tomáš Mallet, James Yang, Ziheng Mol Biol Evol Methods
Oxford University Press 2022-11-01 /pmc/articles/PMC9729068/ /pubmed/36317198 http://dx.doi.org/10.1093/molbev/msac237 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Inference of Gene Flow between Species under Misspecified Models
topic Methods