
Haplotype association analysis of combining unrelated case-control and triads with consideration of population stratification

Combining data collected under different study designs, such as family trios and unrelated case-control samples, yields more power and is more cost-effective than analyzing each dataset separately. However, a potential concern is population stratification (PS) among unrelated case-control sampl...


Bibliographic Details
Main Authors: Wen, Shu-Hui, Tsai, Miao-Yu
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2014
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4028876/
https://www.ncbi.nlm.nih.gov/pubmed/24860592
http://dx.doi.org/10.3389/fgene.2014.00103
_version_ 1782317120577601536
author Wen, Shu-Hui
Tsai, Miao-Yu
author_facet Wen, Shu-Hui
Tsai, Miao-Yu
author_sort Wen, Shu-Hui
collection PubMed
description Combining data collected under different study designs, such as family trios and unrelated case-control samples, yields more power and is more cost-effective than analyzing each dataset separately. However, a potential concern is population stratification (PS) among unrelated case-control samples, and analyses that integrate data should address this confounding effect. In this paper, we develop a simpler method, the haplotype generalized linear model (HGLM), that tests and estimates haplotype effects on disease risk and can be modified to account for PS when combining data. We propose to combine information across aggregations of haplotype weighted-counts estimated separately from population case-control data and trio data, and then to perform GLM analysis. Furthermore, we present an analysis-of-variance framework based on haplotype weighted-counts for detecting whether it is appropriate to combine the two data sources, as well as a modified HGLM with clustering methods for addressing PS. We evaluate the statistical properties in terms of accuracy, false positive rate (FPR), and empirical power using simulated data with regard to various disease risks, sample sizes, multi-SNP haplotypes, and the presence of PS. Our simulation results indicate that HGLM performs comparably to likelihood-based haplotype association analysis, particularly when the haplotype effects are moderate, but may not perform well when dealing with lengthy haplotypes in small samples. In the presence of PS, the modified HGLM remains valid, with a satisfactory nominal level and small bias. Overall, HGLM appears to be successful in combining data and is simple to implement in standard statistical software.
format Online
Article
Text
id pubmed-4028876
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-4028876 2014-05-23 Haplotype association analysis of combining unrelated case-control and triads with consideration of population stratification Wen, Shu-Hui Tsai, Miao-Yu Front Genet Genetics Combining data collected under different study designs, such as family trios and unrelated case-control samples, yields more power and is more cost-effective than analyzing each dataset separately. However, a potential concern is population stratification (PS) among unrelated case-control samples, and analyses that integrate data should address this confounding effect. In this paper, we develop a simpler method, the haplotype generalized linear model (HGLM), that tests and estimates haplotype effects on disease risk and can be modified to account for PS when combining data. We propose to combine information across aggregations of haplotype weighted-counts estimated separately from population case-control data and trio data, and then to perform GLM analysis. Furthermore, we present an analysis-of-variance framework based on haplotype weighted-counts for detecting whether it is appropriate to combine the two data sources, as well as a modified HGLM with clustering methods for addressing PS. We evaluate the statistical properties in terms of accuracy, false positive rate (FPR), and empirical power using simulated data with regard to various disease risks, sample sizes, multi-SNP haplotypes, and the presence of PS. Our simulation results indicate that HGLM performs comparably to likelihood-based haplotype association analysis, particularly when the haplotype effects are moderate, but may not perform well when dealing with lengthy haplotypes in small samples. In the presence of PS, the modified HGLM remains valid, with a satisfactory nominal level and small bias. Overall, HGLM appears to be successful in combining data and is simple to implement in standard statistical software. Frontiers Media S.A.
2014-04-29 /pmc/articles/PMC4028876/ /pubmed/24860592 http://dx.doi.org/10.3389/fgene.2014.00103 Text en Copyright © 2014 Wen and Tsai. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Wen, Shu-Hui
Tsai, Miao-Yu
Haplotype association analysis of combining unrelated case-control and triads with consideration of population stratification
title Haplotype association analysis of combining unrelated case-control and triads with consideration of population stratification
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4028876/
https://www.ncbi.nlm.nih.gov/pubmed/24860592
http://dx.doi.org/10.3389/fgene.2014.00103
work_keys_str_mv AT wenshuhui haplotypeassociationanalysisofcombiningunrelatedcasecontrolandtriadswithconsiderationofpopulationstratification
AT tsaimiaoyu haplotypeassociationanalysisofcombiningunrelatedcasecontrolandtriadswithconsiderationofpopulationstratification