
SimBA: simulation algorithm to fit extant-population distributions

BACKGROUND: Simulation of populations with specified characteristics, such as allele frequencies and linkage disequilibrium, is an integral component of many studies, including in-silico breeding optimization. Since the accuracy and sensitivity of population simulation are critical to the quality of...


Bibliographic Details
Main authors: Parida, Laxmi, Haiminen, Niina
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects: Sequence Analysis (Methods)
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4372275/
https://www.ncbi.nlm.nih.gov/pubmed/25886895
http://dx.doi.org/10.1186/s12859-015-0525-0
author Parida, Laxmi
Haiminen, Niina
collection PubMed
description BACKGROUND: Simulation of populations with specified characteristics, such as allele frequencies and linkage disequilibrium, is an integral component of many studies, including in-silico breeding optimization. Since the accuracy and sensitivity of population simulation are critical to the quality of the output of the applications that use them, accurate algorithms are required to provide a strong foundation for the methods in these studies. RESULTS: In this paper we present SimBA (Simulation using Best-fit Algorithm), a non-generative approach based on a combination of stochastic techniques and discrete methods. We optimize a hill climbing algorithm and extend the framework to include multiple subpopulation structures. Additionally, we show that SimBA is very sensitive to the input specifications, i.e., very similar but distinct input characteristics result in distinct outputs with high fidelity to the specified distributions. This property of the simulation is not explicitly modeled or studied by previous methods. CONCLUSIONS: We show that SimBA outperforms the existing population simulation methods in terms of both accuracy and time efficiency. Not only does it construct populations that meet the input specifications more stringently than other published methods, but it is also easy to use. It does not require explicit parameter adaptations or calibrations. It can also work with input specified as distributions, without the exemplar matrix or population that some methods require. SimBA is available at http://researcher.ibm.com/project/5669.
format Online
Article
Text
id pubmed-4372275
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4372275 2015-03-25 SimBA: simulation algorithm to fit extant-population distributions Parida, Laxmi Haiminen, Niina BMC Bioinformatics Sequence Analysis (Methods) BioMed Central 2015-03-14 /pmc/articles/PMC4372275/ /pubmed/25886895 http://dx.doi.org/10.1186/s12859-015-0525-0 Text en © Parida and Haiminen; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title SimBA: simulation algorithm to fit extant-population distributions
topic Sequence Analysis (Methods)
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4372275/
https://www.ncbi.nlm.nih.gov/pubmed/25886895
http://dx.doi.org/10.1186/s12859-015-0525-0