Recommendations for improving statistical inference in population genomics

The field of population genomics has grown rapidly in response to the recent advent of affordable, large-scale sequencing technologies. As opposed to the situation during the majority of the 20th century, in which the development of theoretical and statistical population genetic insights outpaced th...

Bibliographic Details
Main Authors: Johri, Parul, Aquadro, Charles F., Beaumont, Mark, Charlesworth, Brian, Excoffier, Laurent, Eyre-Walker, Adam, Keightley, Peter D., Lynch, Michael, McVean, Gil, Payseur, Bret A., Pfeifer, Susanne P., Stephan, Wolfgang, Jensen, Jeffrey D.
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects: Consensus View
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9154105/
https://www.ncbi.nlm.nih.gov/pubmed/35639797
http://dx.doi.org/10.1371/journal.pbio.3001669
_version_ 1784717970016043008
author Johri, Parul
Aquadro, Charles F.
Beaumont, Mark
Charlesworth, Brian
Excoffier, Laurent
Eyre-Walker, Adam
Keightley, Peter D.
Lynch, Michael
McVean, Gil
Payseur, Bret A.
Pfeifer, Susanne P.
Stephan, Wolfgang
Jensen, Jeffrey D.
author_sort Johri, Parul
collection PubMed
description The field of population genomics has grown rapidly in response to the recent advent of affordable, large-scale sequencing technologies. As opposed to the situation during the majority of the 20th century, in which the development of theoretical and statistical population genetic insights outpaced the generation of data to which they could be applied, genomic data are now being produced at a far greater rate than they can be meaningfully analyzed and interpreted. With this wealth of data has come a tendency to focus on fitting specific (and often rather idiosyncratic) models to data, at the expense of a careful exploration of the range of possible underlying evolutionary processes. For example, the approach of directly investigating models of adaptive evolution in each newly sequenced population or species often neglects the fact that a thorough characterization of ubiquitous nonadaptive processes is a prerequisite for accurate inference. We here describe the perils of these tendencies, present our consensus views on current best practices in population genomic data analysis, and highlight areas of statistical inference and theory that are in need of further attention. Thereby, we argue for the importance of defining a biologically relevant baseline model tuned to the details of each new analysis, of skepticism and scrutiny in interpreting model fitting results, and of carefully defining addressable hypotheses and underlying uncertainties.
format Online
Article
Text
id pubmed-9154105
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9154105 2022-06-01 Recommendations for improving statistical inference in population genomics Johri, Parul Aquadro, Charles F. Beaumont, Mark Charlesworth, Brian Excoffier, Laurent Eyre-Walker, Adam Keightley, Peter D. Lynch, Michael McVean, Gil Payseur, Bret A. Pfeifer, Susanne P. Stephan, Wolfgang Jensen, Jeffrey D. PLoS Biol Consensus View
Public Library of Science 2022-05-31 /pmc/articles/PMC9154105/ /pubmed/35639797 http://dx.doi.org/10.1371/journal.pbio.3001669 Text en © 2022 Johri et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Recommendations for improving statistical inference in population genomics
topic Consensus View