Inference of Gene Flow between Species under Misspecified Models
Genomic sequence data provide a rich source of information about the history of species divergence and interspecific hybridization or introgression. Despite recent advances in genomics and statistical methods, it remains challenging to infer gene flow, and as a result, one may have to estimate intro...
Main authors: | Huang, Jun; Thawornwattana, Yuttapong; Flouri, Tomáš; Mallet, James; Yang, Ziheng |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9729068/ https://www.ncbi.nlm.nih.gov/pubmed/36317198 http://dx.doi.org/10.1093/molbev/msac237 |
_version_ | 1784845408108806144 |
---|---|
author | Huang, Jun; Thawornwattana, Yuttapong; Flouri, Tomáš; Mallet, James; Yang, Ziheng |
author_sort | Huang, Jun |
collection | PubMed |
description | Genomic sequence data provide a rich source of information about the history of species divergence and interspecific hybridization or introgression. Despite recent advances in genomics and statistical methods, it remains challenging to infer gene flow, and as a result, one may have to estimate introgression rates and times under misspecified models. Here we use mathematical analysis and computer simulation to examine estimation bias and issues of interpretation when the model of gene flow is misspecified in analysis of genomic datasets, for example, if introgression is assigned to the wrong lineages. In the case of two species, we establish a correspondence between the migration rate in the continuous migration model and the introgression probability in the introgression model. When gene flow occurs continuously through time but in the analysis is assumed to occur at a fixed time point, common evolutionary parameters such as species divergence times are surprisingly well estimated. However, the time of introgression tends to be estimated towards the recent end of the period of continuous gene flow. When introgression events are assigned incorrectly to the parental or daughter lineages, introgression times tend to collapse onto species divergence times, with introgression probabilities underestimated. Overall, our analyses suggest that the simple introgression model is useful for extracting information concerning between-specific gene flow and divergence even when the model may be misspecified. However, for reliable inference of gene flow it is important to include multiple samples per species, in particular, from hybridizing species. |
format | Online Article Text |
id | pubmed-9729068 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-9729068 2022-12-08 Inference of Gene Flow between Species under Misspecified Models Huang, Jun; Thawornwattana, Yuttapong; Flouri, Tomáš; Mallet, James; Yang, Ziheng Mol Biol Evol Methods Oxford University Press 2022-11-01 /pmc/articles/PMC9729068/ /pubmed/36317198 http://dx.doi.org/10.1093/molbev/msac237 Text en © The Author(s) 2022. Published by Oxford University Press on behalf of Society for Molecular Biology and Evolution. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Inference of Gene Flow between Species under Misspecified Models |
topic | Methods |