Haplotype association analysis of combining unrelated case-control and triads with consideration of population stratification
Combining data collected under different study designs, such as family trios and unrelated case-control samples, yields more power and is more cost-effective than analyzing each data set separately. However, a potential concern is population stratification (PS) among unrelated case-control sampl...
Main authors: | Wen, Shu-Hui; Tsai, Miao-Yu |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2014 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4028876/ https://www.ncbi.nlm.nih.gov/pubmed/24860592 http://dx.doi.org/10.3389/fgene.2014.00103 |
_version_ | 1782317120577601536 |
---|---|
author | Wen, Shu-Hui Tsai, Miao-Yu |
author_facet | Wen, Shu-Hui Tsai, Miao-Yu |
author_sort | Wen, Shu-Hui |
collection | PubMed |
description | Combining data collected under different study designs, such as family trios and unrelated case-control samples, yields more power and is more cost-effective than analyzing each data set separately. However, a potential concern is population stratification (PS) among unrelated case-control samples, and analyses integrating data should address this confounding effect. In this paper, we develop a simpler method, the haplotype generalized linear model (HGLM), that tests and estimates haplotype effects on disease risk and allows for modification against PS when combining data. We propose to combine information across aggregations of haplotype weighted-counts estimated from population case-control data and trio data separately, and to perform subsequent GLM analysis. Furthermore, we present an analysis of variance framework based on haplotype weighted-counts for detecting whether it is appropriate to combine the two data sources, as well as a modified HGLM with clustering methods for addressing PS. We evaluate the statistical properties in terms of accuracy, false positive rate (FPR) and empirical power using simulated data with regard to various disease risks, sample sizes, multi-SNP haplotypes and the presence of PS. Our simulation results indicate that HGLM performs comparably well with the likelihood-based haplotype association analysis, particularly when the haplotype effects are moderate, but may not perform well when dealing with lengthy haplotypes for small sample sizes. In the presence of PS, the modified HGLM remains valid and has satisfactory nominal level and small bias. Overall, HGLM appears to be successful in combining data and is simple to implement in standard statistical software. |
format | Online Article Text |
id | pubmed-4028876 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-40288762014-05-23 Haplotype association analysis of combining unrelated case-control and triads with consideration of population stratification Wen, Shu-Hui Tsai, Miao-Yu Front Genet Genetics Combining data collected under different study designs, such as family trios and unrelated case-control samples, yields more power and is more cost-effective than analyzing each data set separately. However, a potential concern is population stratification (PS) among unrelated case-control samples, and analyses integrating data should address this confounding effect. In this paper, we develop a simpler method, the haplotype generalized linear model (HGLM), that tests and estimates haplotype effects on disease risk and allows for modification against PS when combining data. We propose to combine information across aggregations of haplotype weighted-counts estimated from population case-control data and trio data separately, and to perform subsequent GLM analysis. Furthermore, we present an analysis of variance framework based on haplotype weighted-counts for detecting whether it is appropriate to combine the two data sources, as well as a modified HGLM with clustering methods for addressing PS. We evaluate the statistical properties in terms of accuracy, false positive rate (FPR) and empirical power using simulated data with regard to various disease risks, sample sizes, multi-SNP haplotypes and the presence of PS. Our simulation results indicate that HGLM performs comparably well with the likelihood-based haplotype association analysis, particularly when the haplotype effects are moderate, but may not perform well when dealing with lengthy haplotypes for small sample sizes. In the presence of PS, the modified HGLM remains valid and has satisfactory nominal level and small bias. Overall, HGLM appears to be successful in combining data and is simple to implement in standard statistical software. Frontiers Media S.A. 
2014-04-29 /pmc/articles/PMC4028876/ /pubmed/24860592 http://dx.doi.org/10.3389/fgene.2014.00103 Text en Copyright © 2014 Wen and Tsai. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Wen, Shu-Hui Tsai, Miao-Yu Haplotype association analysis of combining unrelated case-control and triads with consideration of population stratification |
title | Haplotype association analysis of combining unrelated case-control and triads with consideration of population stratification |
title_full | Haplotype association analysis of combining unrelated case-control and triads with consideration of population stratification |
title_fullStr | Haplotype association analysis of combining unrelated case-control and triads with consideration of population stratification |
title_full_unstemmed | Haplotype association analysis of combining unrelated case-control and triads with consideration of population stratification |
title_short | Haplotype association analysis of combining unrelated case-control and triads with consideration of population stratification |
title_sort | haplotype association analysis of combining unrelated case-control and triads with consideration of population stratification |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4028876/ https://www.ncbi.nlm.nih.gov/pubmed/24860592 http://dx.doi.org/10.3389/fgene.2014.00103 |
work_keys_str_mv | AT wenshuhui haplotypeassociationanalysisofcombiningunrelatedcasecontrolandtriadswithconsiderationofpopulationstratification AT tsaimiaoyu haplotypeassociationanalysisofcombiningunrelatedcasecontrolandtriadswithconsiderationofpopulationstratification |