A Comparison of Gene Region Simulation Methods

BACKGROUND: Accurately modeling linkage disequilibrium (LD) in simulations is essential to correctly evaluate new and existing association methods. To date, little research has compared how well existing gene region simulation methods reproduce the LD structure of an existing gene region. Here we...

Bibliographic Details
Main Authors: Hendricks, Audrey E., Dupuis, Josée, Gupta, Mayetri, Logue, Mark W., Lunetta, Kathryn L.
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects: Research Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3399793/
https://www.ncbi.nlm.nih.gov/pubmed/22815869
http://dx.doi.org/10.1371/journal.pone.0040925
author Hendricks, Audrey E.
Dupuis, Josée
Gupta, Mayetri
Logue, Mark W.
Lunetta, Kathryn L.
collection PubMed
description BACKGROUND: Accurately modeling linkage disequilibrium (LD) in simulations is essential to correctly evaluate new and existing association methods. To date, little research has compared how well existing gene region simulation methods reproduce the LD structure of an existing gene region. Here we compare the ability of three approaches to accurately simulate the LD within a gene region: HapSim (2005), Hapgen (2009), and a minor extension to simple haplotype resampling.
METHODOLOGY/PRINCIPAL FINDINGS: To observe the variation and bias of each method, we compare the simulated pairwise LD measures and minor allele frequencies to the original HapMap data in an extensive simulation study. When possible, we also evaluate the effects of changing parameters. HapSim produces samples of haplotypes with lower LD, on average, than the original haplotype set, whereas neither our resampling method nor Hapgen introduces this bias. The variation introduced across the replicates by our resampling method is quite small and may not provide enough sampling variability for a generalizable simulation study.
CONCLUSION: We recommend using Hapgen to simulate replicate haplotypes from a gene region. Hapgen produces moderate sampling variation between the replicates while retaining the overall unique LD structure of the gene region.
format Online
Article
Text
id pubmed-3399793
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3399793 2012-07-19
A Comparison of Gene Region Simulation Methods
Hendricks, Audrey E.
Dupuis, Josée
Gupta, Mayetri
Logue, Mark W.
Lunetta, Kathryn L.
PLoS One
Research Article
Public Library of Science 2012-07-18
/pmc/articles/PMC3399793/
/pubmed/22815869
http://dx.doi.org/10.1371/journal.pone.0040925
Text en
Hendricks et al.
http://creativecommons.org/licenses/by/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Comparison of Gene Region Simulation Methods
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3399793/
https://www.ncbi.nlm.nih.gov/pubmed/22815869
http://dx.doi.org/10.1371/journal.pone.0040925
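
The abstract's comparison of simulated and original haplotypes rests on two summaries: per-SNP minor allele frequencies and pairwise LD. The following is a minimal sketch of that kind of check, not the authors' code. It assumes haplotypes are coded as 0/1 lists, uses r² as one common pairwise LD measure (the abstract does not say which measures were used), and draws replicates by plain resampling with replacement rather than the paper's extended resampling method, whose details the abstract does not give. All function names and the toy reference panel are hypothetical.

```python
import random
from itertools import combinations


def minor_allele_freqs(haps):
    """Minor allele frequency at each SNP of a 0/1 haplotype matrix."""
    n = len(haps)
    freqs = []
    for j in range(len(haps[0])):
        p = sum(h[j] for h in haps) / n  # frequency of allele coded 1
        freqs.append(min(p, 1 - p))
    return freqs


def pairwise_r2(haps):
    """r^2 for every SNP pair; pairs with a monomorphic SNP are skipped."""
    n = len(haps)
    m = len(haps[0])
    p = [sum(h[j] for h in haps) / n for j in range(m)]
    r2 = {}
    for a, b in combinations(range(m), 2):
        denom = p[a] * (1 - p[a]) * p[b] * (1 - p[b])
        if denom == 0:
            continue  # r^2 is undefined when either SNP does not vary
        p_ab = sum(h[a] * h[b] for h in haps) / n  # frequency of the 1-1 haplotype
        d = p_ab - p[a] * p[b]  # LD coefficient D
        r2[(a, b)] = d * d / denom
    return r2


def resample_haplotypes(haps, size, rng):
    """Draw `size` haplotypes with replacement (plain resampling only)."""
    return [rng.choice(haps) for _ in range(size)]


# Toy reference panel: 6 haplotypes x 3 SNPs (illustrative values, not HapMap data).
reference = [
    [0, 0, 1],
    [0, 0, 1],
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
replicate = resample_haplotypes(reference, len(reference), rng)

ref_r2 = pairwise_r2(reference)
rep_r2 = pairwise_r2(replicate)

# Mean signed difference in r^2 over SNP pairs defined in both sets;
# a consistently negative value across many replicates would indicate
# that a simulation method lowers LD relative to the reference.
shared = [pair for pair in ref_r2 if pair in rep_r2]
mean_diff = (
    sum(rep_r2[pair] - ref_r2[pair] for pair in shared) / len(shared)
    if shared
    else float("nan")
)
print("reference MAF:", minor_allele_freqs(reference))
print("mean r^2 difference (replicate - reference):", mean_diff)
```

In a study like the one described, this comparison would be repeated over many replicates per method, so that both the average difference (bias) and its spread (sampling variation) can be examined.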