Gene flow analysis method, the D-statistic, is robust in a wide parameter space
BACKGROUND: We evaluated the sensitivity of the D-statistic, a parsimony-like method widely used to detect gene flow between closely related species. This method has been applied to a variety of taxa with a wide range of divergence times. However, its parameter space and thus its applicability to a...
Main authors: | Zheng, Yichen; Janke, Axel |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central, 2018 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5759368/ https://www.ncbi.nlm.nih.gov/pubmed/29310567 http://dx.doi.org/10.1186/s12859-017-2002-4 |
_version_ | 1783291182278770688 |
---|---|
author | Zheng, Yichen Janke, Axel |
collection | PubMed |
description | BACKGROUND: We evaluated the sensitivity of the D-statistic, a parsimony-like method widely used to detect gene flow between closely related species. This method has been applied to a variety of taxa with a wide range of divergence times. However, its parameter space, and thus its applicability to a wide taxonomic range, has not been systematically studied. Divergence time, population size, time of gene flow, distance of outgroup and number of loci were examined in a sensitivity analysis. RESULTS: The sensitivity study shows that the primary determinant of the D-statistic is the relative population size, i.e. the population size scaled by the number of generations since divergence. This is consistent with the fact that the main confounding factor in gene flow detection is incomplete lineage sorting, which dilutes the signal. The sensitivity of the D-statistic is also affected by the direction of gene flow and by the size and number of loci. In addition, we examined the ability of the f-statistics, [Formula: see text] and [Formula: see text], to estimate the fraction of a genome affected by gene flow; while these statistics are difficult to apply to practical questions in biology due to lack of knowledge of when the gene flow happened, they can be used to compare datasets with identical or similar demographic backgrounds. CONCLUSIONS: The D-statistic, as a method to detect gene flow, is robust against a wide range of genetic distances (divergence times) but is sensitive to population size. The D-statistic should be applied only with critical reservation to taxa where population sizes are large relative to branch lengths in generations. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-017-2002-4) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-5759368 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-5759368 2018-01-16 Gene flow analysis method, the D-statistic, is robust in a wide parameter space Zheng, Yichen Janke, Axel BMC Bioinformatics Research Article BioMed Central 2018-01-08 /pmc/articles/PMC5759368/ /pubmed/29310567 http://dx.doi.org/10.1186/s12859-017-2002-4 Text en © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Gene flow analysis method, the D-statistic, is robust in a wide parameter space |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5759368/ https://www.ncbi.nlm.nih.gov/pubmed/29310567 http://dx.doi.org/10.1186/s12859-017-2002-4 |
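
The record's description calls the D-statistic a parsimony-like method but does not give its formula. As a minimal sketch, assuming the standard ABBA-BABA formulation for a four-taxon tree (((P1, P2), P3), Outgroup), D can be computed from counts of the two discordant site patterns as D = (nABBA - nBABA) / (nABBA + nBABA). The function name `d_statistic` and the per-site tuple layout below are hypothetical choices for illustration and are not taken from the article.

```python
from typing import Iterable, Tuple


def d_statistic(sites: Iterable[Tuple[str, str, str, str]]) -> Tuple[float, int, int]:
    """Return (D, n_abba, n_baba) for sites given as (P1, P2, P3, outgroup) alleles.

    Assumption: the outgroup allele is treated as ancestral ("A") and the
    allele carried by P3, when different, as derived ("B").
    """
    n_abba = 0
    n_baba = 0
    for p1, p2, p3, out in sites:
        if p3 == out:
            # P3 carries the ancestral allele: the site is uninformative.
            continue
        if p1 == out and p2 == p3:
            n_abba += 1  # pattern A B B A: P2 shares the derived allele with P3
        elif p2 == out and p1 == p3:
            n_baba += 1  # pattern B A B A: P1 shares the derived allele with P3
    total = n_abba + n_baba
    if total == 0:
        raise ValueError("no ABBA or BABA sites found")
    return (n_abba - n_baba) / total, n_abba, n_baba


# Toy input (made-up sites, not data from the article).
example_sites = [
    ("A", "G", "G", "A"),  # ABBA
    ("A", "G", "G", "A"),  # ABBA
    ("A", "G", "G", "A"),  # ABBA
    ("G", "A", "G", "A"),  # BABA
    ("A", "A", "G", "A"),  # derived allele only in P3: ignored
    ("A", "A", "A", "A"),  # invariant site: ignored
]
d_value, n_abba, n_baba = d_statistic(example_sites)
print(d_value, n_abba, n_baba)
```

Under incomplete lineage sorting alone the two patterns are expected to be equally frequent, so D is expected to be near zero; an excess of one pattern is read as a signal of gene flow between P3 and one of the sister lineages.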