
The influence of a priori grouping on inference of genetic clusters: simulation study and literature review of the DAPC method

Inference of genetic clusters is a key aim of population genetics, sparking development of numerous analytical methods. Within these, there is a conceptual divide between finding de novo structure versus assessment of a priori groups. Recently developed, Discriminant Analysis of Principal Components...


Bibliographic Details
Main authors: Miller, Joshua M.; Cullingham, Catherine I.; Peery, Rhiannon M.
Format: Online Article Text
Language: English
Published: Springer International Publishing 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7553915/
https://www.ncbi.nlm.nih.gov/pubmed/32753664
http://dx.doi.org/10.1038/s41437-020-0348-2
_version_ 1783593702396002304
author Miller, Joshua M.
Cullingham, Catherine I.
Peery, Rhiannon M.
collection PubMed
description Inference of genetic clusters is a key aim of population genetics and has sparked the development of numerous analytical methods. Among these, there is a conceptual divide between finding de novo structure and assessing a priori groups. The recently developed Discriminant Analysis of Principal Components (DAPC) combines discriminant analysis (DA) with principal component (PC) analysis. When applying DAPC, the groups used in the DA (specified a priori or described de novo) need to be carefully assessed. While DAPC has rapidly become a core technique, the sensitivity of the method to misspecification of groups, and how it is being applied empirically, are unknown. To address this, we conducted a simulation study examining the influence of a priori versus de novo group designations, and a literature review of how DAPC is being applied. We found that with a priori groupings, distance between genetic clusters reflected underlying F_ST. However, when migration rates were high and groups were described de novo, there was considerable inaccuracy, both in the number of genetic clusters suggested and in the placement of individuals into those clusters. Nearly all (90.1%) of 224 studies surveyed used DAPC to find de novo clusters, and for the majority (62.5%) the stated goal matched the results. However, most studies (52.3%) omit key run parameters, preventing repeatability and transparency. Therefore, we present recommendations for standard reporting of parameters used in DAPC analyses. The influence of groupings in genetic clustering is not unique to DAPC, and researchers need to consider their goal and which methods will be most appropriate.
format Online
Article
Text
id pubmed-7553915
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-7553915 2020-10-19 The influence of a priori grouping on inference of genetic clusters: simulation study and literature review of the DAPC method Miller, Joshua M. Cullingham, Catherine I. Peery, Rhiannon M. Heredity (Edinb) Review Article
Springer International Publishing 2020-08-04 2020-11 /pmc/articles/PMC7553915/ /pubmed/32753664 http://dx.doi.org/10.1038/s41437-020-0348-2 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title The influence of a priori grouping on inference of genetic clusters: simulation study and literature review of the DAPC method
topic Review Article