Expanding the stdpopsim species catalog, and lessons learned for realistic genome simulations

Bibliographic Details
Main authors: Lauterbur, M Elise, Cavassim, Maria Izabel A, Gladstein, Ariella L, Gower, Graham, Pope, Nathaniel S, Tsambos, Georgia, Adrion, Jeffrey, Belsare, Saurabh, Biddanda, Arjun, Caudill, Victoria, Cury, Jean, Echevarria, Ignacio, Haller, Benjamin C, Hasan, Ahmed R, Huang, Xin, Iasi, Leonardo Nicola Martin, Noskova, Ekaterina, Obsteter, Jana, Pavinato, Vitor Antonio Correa, Pearson, Alice, Peede, David, Perez, Manolo F, Rodrigues, Murillo F, Smith, Chris CR, Spence, Jeffrey P, Teterina, Anastasia, Tittes, Silas, Unneberg, Per, Vazquez, Juan Manuel, Waples, Ryan K, Wohns, Anthony Wilder, Wong, Yan, Baumdicker, Franz, Cartwright, Reed A, Gorjanc, Gregor, Gutenkunst, Ryan N, Kelleher, Jerome, Kern, Andrew D, Ragsdale, Aaron P, Ralph, Peter L, Schrider, Daniel R, Gronau, Ilan
Format: Online Article Text
Language: English
Published: eLife Sciences Publications, Ltd 2023
Subjects: Genetics and Genomics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10328510/
https://www.ncbi.nlm.nih.gov/pubmed/37342968
http://dx.doi.org/10.7554/eLife.84874
collection PubMed
description Simulation is a key tool in population genetics for both methods development and empirical research, but producing simulations that recapitulate the main features of genomic datasets remains a major obstacle. Today, more realistic simulations are possible thanks to large increases in the quantity and quality of available genetic data, and the sophistication of inference and simulation software. However, implementing these simulations still requires substantial time and specialized knowledge. These challenges are especially pronounced for simulating genomes for species that are not well-studied, since it is not always clear what information is required to produce simulations with a level of realism sufficient to confidently answer a given question. The community-developed framework stdpopsim seeks to lower this barrier by facilitating the simulation of complex population genetic models using up-to-date information. The initial version of stdpopsim focused on establishing this framework using six well-characterized model species (Adrion et al., 2020). Here, we report on major improvements made in the new release of stdpopsim (version 0.2), which includes a significant expansion of the species catalog and substantial additions to simulation capabilities. Features added to improve the realism of the simulated genomes include non-crossover recombination and provision of species-specific genomic annotations. Through community-driven efforts, we expanded the number of species in the catalog more than threefold and broadened coverage across the tree of life. During the process of expanding the catalog, we have identified common sticking points and developed the best practices for setting up genome-scale simulations. We describe the input data required for generating a realistic simulation, suggest good practices for obtaining the relevant information from the literature, and discuss common pitfalls and major considerations. These improvements to stdpopsim aim to further promote the use of realistic whole-genome population genetic simulations, especially in non-model organisms, making them available, transparent, and accessible to everyone.
format Online
Article
Text
id pubmed-10328510
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher eLife Sciences Publications, Ltd
record_format MEDLINE/PubMed
spelling pubmed-10328510 2023-07-08 Expanding the stdpopsim species catalog, and lessons learned for realistic genome simulations. eLife, Genetics and Genomics. eLife Sciences Publications, Ltd, 2023-06-21. /pmc/articles/PMC10328510/ /pubmed/37342968 http://dx.doi.org/10.7554/eLife.84874 Text en © 2023, Lauterbur et al. This article is distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use and redistribution provided that the original author and source are credited.
title Expanding the stdpopsim species catalog, and lessons learned for realistic genome simulations
topic Genetics and Genomics