Fregene: Simulation of realistic sequence-level data in populations and ascertained samples

BACKGROUND: FREGENE simulates sequence-level data over large genomic regions in large populations. Because, unlike coalescent simulators, it works forwards through time, it allows complex scenarios of selection, demography, and recombination to be modelled simultaneously. Detailed tracking of sites...

Bibliographic Details
Main authors: Chadeau-Hyam, Marc, Hoggart, Clive J, O'Reilly, Paul F, Whittaker, John C, De Iorio, Maria, Balding, David J
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2542380/
https://www.ncbi.nlm.nih.gov/pubmed/18778480
http://dx.doi.org/10.1186/1471-2105-9-364
author Chadeau-Hyam, Marc
Hoggart, Clive J
O'Reilly, Paul F
Whittaker, John C
De Iorio, Maria
Balding, David J
collection PubMed
description BACKGROUND: FREGENE simulates sequence-level data over large genomic regions in large populations. Because, unlike coalescent simulators, it works forwards through time, it allows complex scenarios of selection, demography, and recombination to be modelled simultaneously. Detailed tracking of sites under selection is implemented in FREGENE and provides the opportunity to test theoretical predictions and gain new insights into mechanisms of selection. We describe here the main functionalities of both FREGENE and SAMPLE, a companion program that can replicate association study datasets. RESULTS: We report detailed analyses of six large simulated datasets that we have made publicly available. Three demographic scenarios are modelled: one panmictic, one substructured with migration, and one complex scenario that mimics the principal features of genetic variation in major worldwide human populations. For each scenario there is one neutral simulation, and one with a complex pattern of selection. CONCLUSION: FREGENE and the simulated datasets will be valuable for assessing the validity of models for selection, demography and population genetic parameters, as well as the efficacy of association studies. FREGENE's principal advantages are modelling flexibility and computational efficiency. It is open source and object-oriented. As such, it can be customised and the range of models extended.
format Text
id pubmed-2542380
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2542380 2008-09-18 Fregene: Simulation of realistic sequence-level data in populations and ascertained samples Chadeau-Hyam, Marc Hoggart, Clive J O'Reilly, Paul F Whittaker, John C De Iorio, Maria Balding, David J BMC Bioinformatics Software BACKGROUND: FREGENE simulates sequence-level data over large genomic regions in large populations. Because, unlike coalescent simulators, it works forwards through time, it allows complex scenarios of selection, demography, and recombination to be modelled simultaneously. Detailed tracking of sites under selection is implemented in FREGENE and provides the opportunity to test theoretical predictions and gain new insights into mechanisms of selection. We describe here the main functionalities of both FREGENE and SAMPLE, a companion program that can replicate association study datasets. RESULTS: We report detailed analyses of six large simulated datasets that we have made publicly available. Three demographic scenarios are modelled: one panmictic, one substructured with migration, and one complex scenario that mimics the principal features of genetic variation in major worldwide human populations. For each scenario there is one neutral simulation, and one with a complex pattern of selection. CONCLUSION: FREGENE and the simulated datasets will be valuable for assessing the validity of models for selection, demography and population genetic parameters, as well as the efficacy of association studies. FREGENE's principal advantages are modelling flexibility and computational efficiency. It is open source and object-oriented. As such, it can be customised and the range of models extended. BioMed Central 2008-09-08 /pmc/articles/PMC2542380/ /pubmed/18778480 http://dx.doi.org/10.1186/1471-2105-9-364 Text en Copyright © 2008 Chadeau-Hyam et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Fregene: Simulation of realistic sequence-level data in populations and ascertained samples
topic Software