
Simulating a population genomics data set using FlowSim

BACKGROUND: The field of population genetics uses the genetic composition of populations to study the effects of ecological and evolutionary factors, including selection, genetic drift, mating structure, and migration. Until recently, these studies were usually based upon the analysis of relatively f...

Full description

Bibliographic Details
Main author: Malde, Ketil
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3942619/
https://www.ncbi.nlm.nih.gov/pubmed/24479665
http://dx.doi.org/10.1186/1756-0500-7-68
_version_ 1782479101033971712
author Malde, Ketil
author_facet Malde, Ketil
author_sort Malde, Ketil
collection PubMed
description BACKGROUND: The field of population genetics uses the genetic composition of populations to study the effects of ecological and evolutionary factors, including selection, genetic drift, mating structure, and migration. Until recently, these studies were usually based upon the analysis of relatively few (typically 10–20) DNA markers on samples from multiple populations. In contrast, high-throughput sequencing provides large amounts of data and consequently very high-resolution genetic information. Recent technological developments are rapidly making this a cost-effective alternative. In addition, sequencing allows both the direct study of genomic differences between populations and the discovery of single nucleotide polymorphism markers that can subsequently be used in high-throughput genotyping. Much of the analysis in population genetics was developed before large-scale sequencing became feasible. Methods often do not take into account the characteristics of the different sequencing technologies and, consequently, may not always be well suited to this kind of data. RESULTS: Although the FlowSim suite of tools originally targeted simulation of de novo 454 genomics data, recent developments and enhancements also make it suitable for simulating other kinds of data. We examine its application to population genomics, and provide examples and supplementary scripts and utilities to aid in this task. CONCLUSIONS: Simulation is an important tool to study and develop methods in many fields, and here we demonstrate how to simulate a high-throughput sequencing dataset for population genomics.
format Online
Article
Text
id pubmed-3942619
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3942619 2014-03-06 Simulating a population genomics data set using FlowSim Malde, Ketil BMC Res Notes Research Article BACKGROUND: The field of population genetics uses the genetic composition of populations to study the effects of ecological and evolutionary factors, including selection, genetic drift, mating structure, and migration. Until recently, these studies were usually based upon the analysis of relatively few (typically 10–20) DNA markers on samples from multiple populations. In contrast, high-throughput sequencing provides large amounts of data and consequently very high-resolution genetic information. Recent technological developments are rapidly making this a cost-effective alternative. In addition, sequencing allows both the direct study of genomic differences between populations and the discovery of single nucleotide polymorphism markers that can subsequently be used in high-throughput genotyping. Much of the analysis in population genetics was developed before large-scale sequencing became feasible. Methods often do not take into account the characteristics of the different sequencing technologies and, consequently, may not always be well suited to this kind of data. RESULTS: Although the FlowSim suite of tools originally targeted simulation of de novo 454 genomics data, recent developments and enhancements also make it suitable for simulating other kinds of data. We examine its application to population genomics, and provide examples and supplementary scripts and utilities to aid in this task. CONCLUSIONS: Simulation is an important tool to study and develop methods in many fields, and here we demonstrate how to simulate a high-throughput sequencing dataset for population genomics. BioMed Central 2014-01-31 /pmc/articles/PMC3942619/ /pubmed/24479665 http://dx.doi.org/10.1186/1756-0500-7-68 Text en Copyright © 2014 Malde; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Malde, Ketil
Simulating a population genomics data set using FlowSim
title Simulating a population genomics data set using FlowSim
title_full Simulating a population genomics data set using FlowSim
title_fullStr Simulating a population genomics data set using FlowSim
title_full_unstemmed Simulating a population genomics data set using FlowSim
title_short Simulating a population genomics data set using FlowSim
title_sort simulating a population genomics data set using flowsim
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3942619/
https://www.ncbi.nlm.nih.gov/pubmed/24479665
http://dx.doi.org/10.1186/1756-0500-7-68
work_keys_str_mv AT maldeketil simulatingapopulationgenomicsdatasetusingflowsim