A Hybrid Likelihood Model for Sequence-Based Disease Association Studies

In the past few years, case-control studies of common diseases have shifted their focus from single genes to whole exomes. New sequencing technologies now routinely detect hundreds of thousands of sequence variants in a single study, many of which are rare or even novel. The limitation of classical...

Bibliographic Details
Main authors: Chen, Yun-Ching, Carter, Hannah, Parla, Jennifer, Kramer, Melissa, Goes, Fernando S., Pirooznia, Mehdi, Zandi, Peter P., McCombie, W. Richard, Potash, James B., Karchin, Rachel
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3554549/
https://www.ncbi.nlm.nih.gov/pubmed/23358228
http://dx.doi.org/10.1371/journal.pgen.1003224
author Chen, Yun-Ching
Carter, Hannah
Parla, Jennifer
Kramer, Melissa
Goes, Fernando S.
Pirooznia, Mehdi
Zandi, Peter P.
McCombie, W. Richard
Potash, James B.
Karchin, Rachel
author_sort Chen, Yun-Ching
collection PubMed
description In the past few years, case-control studies of common diseases have shifted their focus from single genes to whole exomes. New sequencing technologies now routinely detect hundreds of thousands of sequence variants in a single study, many of which are rare or even novel. The limitations of classical single-marker association analysis for rare variants have been a challenge in such studies. A new generation of statistical methods for case-control association studies has been developed to meet this challenge. A common approach to association analysis of rare variants is to use burden-style collapsing methods, which combine rare variant data within individuals across or within genes. Here, we propose a new hybrid likelihood model that combines a burden test with a test of the position distribution of variants. In extensive simulations and on empirical data from the Dallas Heart Study, the new model demonstrates consistently good power, in particular when applied to a gene set (e.g., multiple candidate genes with shared biological function or pathway), when rare variants cluster in key functional regions of a gene, and when protective variants are present. When applied to data from an ongoing sequencing study of bipolar disorder (191 cases, 107 controls), the model identifies seven gene sets with nominal p-values < 0.05, of which one, the MAPK signaling pathway (KEGG), reaches trend-level significance after correcting for multiple testing.
format Online
Article
Text
id pubmed-3554549
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3554549 2013-01-28 PLoS Genet Research Article
Public Library of Science 2013-01-24 /pmc/articles/PMC3554549/ /pubmed/23358228 http://dx.doi.org/10.1371/journal.pgen.1003224 Text en © 2013 Chen et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title A Hybrid Likelihood Model for Sequence-Based Disease Association Studies
topic Research Article