
Gene flow analysis method, the D-statistic, is robust in a wide parameter space

BACKGROUND: We evaluated the sensitivity of the D-statistic, a parsimony-like method widely used to detect gene flow between closely related species. This method has been applied to a variety of taxa with a wide range of divergence times. However, its parameter space and thus its applicability to a...


Bibliographic Details
Main authors: Zheng, Yichen; Janke, Axel
Format: Online Article Text
Language: English
Published: BioMed Central 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5759368/
https://www.ncbi.nlm.nih.gov/pubmed/29310567
http://dx.doi.org/10.1186/s12859-017-2002-4
_version_ 1783291182278770688
author Zheng, Yichen
Janke, Axel
author_facet Zheng, Yichen
Janke, Axel
author_sort Zheng, Yichen
collection PubMed
description BACKGROUND: We evaluated the sensitivity of the D-statistic, a parsimony-like method widely used to detect gene flow between closely related species. This method has been applied to a variety of taxa with a wide range of divergence times. However, its parameter space, and thus its applicability to a wide taxonomic range, has not been systematically studied. Divergence time, population size, time of gene flow, distance of the outgroup and number of loci were examined in a sensitivity analysis. RESULTS: The sensitivity study shows that the primary determinant of the D-statistic is the relative population size, i.e. the population size scaled by the number of generations since divergence. This is consistent with the fact that the main confounding factor in gene flow detection is incomplete lineage sorting, which dilutes the signal. The sensitivity of the D-statistic is also affected by the direction of gene flow and by the size and number of loci. In addition, we examined the ability of the f-statistics, [Formula: see text] and [Formula: see text], to estimate the fraction of a genome affected by gene flow. These statistics are difficult to apply to practical questions in biology because the time at which gene flow happened is usually unknown, but they can be used to compare datasets with identical or similar demographic backgrounds. CONCLUSIONS: The D-statistic, as a method to detect gene flow, is robust against a wide range of genetic distances (divergence times) but is sensitive to population size. The D-statistic should only be applied with critical reservation to taxa where population sizes are large relative to branch lengths in generations. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-017-2002-4) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-5759368
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5759368 2018-01-16 Gene flow analysis method, the D-statistic, is robust in a wide parameter space Zheng, Yichen Janke, Axel BMC Bioinformatics Research Article BioMed Central 2018-01-08 /pmc/articles/PMC5759368/ /pubmed/29310567 http://dx.doi.org/10.1186/s12859-017-2002-4 Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Zheng, Yichen
Janke, Axel
Gene flow analysis method, the D-statistic, is robust in a wide parameter space
title Gene flow analysis method, the D-statistic, is robust in a wide parameter space
title_full Gene flow analysis method, the D-statistic, is robust in a wide parameter space
title_fullStr Gene flow analysis method, the D-statistic, is robust in a wide parameter space
title_full_unstemmed Gene flow analysis method, the D-statistic, is robust in a wide parameter space
title_short Gene flow analysis method, the D-statistic, is robust in a wide parameter space
title_sort gene flow analysis method, the d-statistic, is robust in a wide parameter space
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5759368/
https://www.ncbi.nlm.nih.gov/pubmed/29310567
http://dx.doi.org/10.1186/s12859-017-2002-4
work_keys_str_mv AT zhengyichen geneflowanalysismethodthedstatisticisrobustinawideparameterspace
AT jankeaxel geneflowanalysismethodthedstatisticisrobustinawideparameterspace
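
The abstract in this record names the D-statistic but does not give its formula. In its commonly used form (often called the ABBA-BABA test), it is computed on a four-taxon tree (((P1, P2), P3), outgroup) as D = (nABBA - nBABA) / (nABBA + nBABA), where ABBA sites are those at which P2 and P3 share a derived allele and BABA sites are those at which P1 and P3 do. The Python sketch below illustrates that standard definition only; it is not taken from the article, the function names and the toy sequences are hypothetical, and it assumes one haploid sequence per taxon with the outgroup carrying the ancestral allele.

```python
VALID_BASES = set("ACGT")


def count_patterns(p1, p2, p3, outgroup):
    """Count ABBA and BABA sites in four aligned sequences.

    Assumes one haploid sequence per taxon and treats the outgroup
    allele as ancestral (an illustrative simplification).
    """
    abba = baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        # Skip gaps, ambiguity codes and any other non-ACGT symbols.
        if not {a, b, c, o} <= VALID_BASES:
            continue
        if a == o and b == c and b != o:
            abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c and a != o:
            baba += 1  # P1 and P3 share the derived allele
    return abba, baba


def d_statistic(abba, baba):
    """Return D = (ABBA - BABA) / (ABBA + BABA)."""
    total = abba + baba
    if total == 0:
        raise ValueError("no ABBA or BABA sites; D is undefined")
    return (abba - baba) / total


# Toy alignment (made up for illustration, not real data).
p1 = "ACGATN"
p2 = "TCAGCT"
p3 = "TCGGCT"
outgroup = "ACAATA"

abba, baba = count_patterns(p1, p2, p3, outgroup)
print(abba, baba, d_statistic(abba, baba))
```

Under this sign convention, a positive D indicates an excess of allele sharing between P2 and P3, and a negative D an excess between P1 and P3; with incomplete lineage sorting alone the two patterns are expected to be equally frequent. The sketch leaves out significance testing, which in practice is usually done by resampling across genomic blocks.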