
Bayesian copy number detection and association in large-scale studies

BACKGROUND: Germline copy number variants (CNVs) increase risk for many diseases, yet detection of CNVs and quantifying their contribution to disease risk in large-scale studies is challenging due to biological and technical sources of heterogeneity that vary across the genome within and between sam...


Bibliographic Details
Main Authors: Cristiano, Stephen; McKean, David; Carey, Jacob; Bracci, Paige; Brennan, Paul; Chou, Michael; Du, Mengmeng; Gallinger, Steven; Goggins, Michael G.; Hassan, Manal M.; Hung, Rayjean J.; Kurtz, Robert C.; Li, Donghui; Lu, Lingeng; Neale, Rachel; Olson, Sara; Petersen, Gloria; Rabe, Kari G.; Fu, Jack; Risch, Harvey; Rosner, Gary L.; Ruczinski, Ingo; Klein, Alison P.; Scharpf, Robert B.
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7487704/
https://www.ncbi.nlm.nih.gov/pubmed/32894098
http://dx.doi.org/10.1186/s12885-020-07304-3
_version_ 1783581542047547392
author Cristiano, Stephen
McKean, David
Carey, Jacob
Bracci, Paige
Brennan, Paul
Chou, Michael
Du, Mengmeng
Gallinger, Steven
Goggins, Michael G.
Hassan, Manal M.
Hung, Rayjean J.
Kurtz, Robert C.
Li, Donghui
Lu, Lingeng
Neale, Rachel
Olson, Sara
Petersen, Gloria
Rabe, Kari G.
Fu, Jack
Risch, Harvey
Rosner, Gary L.
Ruczinski, Ingo
Klein, Alison P.
Scharpf, Robert B.
author_facet Cristiano, Stephen
McKean, David
Carey, Jacob
Bracci, Paige
Brennan, Paul
Chou, Michael
Du, Mengmeng
Gallinger, Steven
Goggins, Michael G.
Hassan, Manal M.
Hung, Rayjean J.
Kurtz, Robert C.
Li, Donghui
Lu, Lingeng
Neale, Rachel
Olson, Sara
Petersen, Gloria
Rabe, Kari G.
Fu, Jack
Risch, Harvey
Rosner, Gary L.
Ruczinski, Ingo
Klein, Alison P.
Scharpf, Robert B.
author_sort Cristiano, Stephen
collection PubMed
description BACKGROUND: Germline copy number variants (CNVs) increase risk for many diseases, yet detection of CNVs and quantifying their contribution to disease risk in large-scale studies is challenging due to biological and technical sources of heterogeneity that vary across the genome within and between samples. METHODS: We developed an approach called CNPBayes to identify latent batch effects in genome-wide association studies involving copy number, to provide probabilistic estimates of integer copy number across the estimated batches, and to fully integrate the copy number uncertainty in the association model for disease. RESULTS: Applying a hidden Markov model (HMM) to identify CNVs in a large multi-site Pancreatic Cancer Case Control study (PanC4) of 7598 participants, we found CNV inference was highly sensitive to technical noise that varied appreciably among participants. Applying CNPBayes to this dataset, we found that the major sources of technical variation were linked to sample processing by the centralized laboratory and not the individual study sites. Modeling the latent batch effects at each CNV region hierarchically, we developed probabilistic estimates of copy number that were directly incorporated in a Bayesian regression model for pancreatic cancer risk. Candidate associations aided by this approach include deletions of 8q24 near regulatory elements of the tumor oncogene MYC and of Tumor Suppressor Candidate 3 (TUSC3). CONCLUSIONS: Laboratory effects may not account for the major sources of technical variation in genome-wide association studies. This study provides a robust Bayesian inferential framework for identifying latent batch effects, estimating copy number, and evaluating the role of copy number in heritable diseases.
format Online
Article
Text
id pubmed-7487704
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-74877042020-09-16 Bayesian copy number detection and association in large-scale studies Cristiano, Stephen McKean, David Carey, Jacob Bracci, Paige Brennan, Paul Chou, Michael Du, Mengmeng Gallinger, Steven Goggins, Michael G. Hassan, Manal M. Hung, Rayjean J. Kurtz, Robert C. Li, Donghui Lu, Lingeng Neale, Rachel Olson, Sara Petersen, Gloria Rabe, Kari G. Fu, Jack Risch, Harvey Rosner, Gary L. Ruczinski, Ingo Klein, Alison P. Scharpf, Robert B. BMC Cancer Research Article BACKGROUND: Germline copy number variants (CNVs) increase risk for many diseases, yet detection of CNVs and quantifying their contribution to disease risk in large-scale studies is challenging due to biological and technical sources of heterogeneity that vary across the genome within and between samples. METHODS: We developed an approach called CNPBayes to identify latent batch effects in genome-wide association studies involving copy number, to provide probabilistic estimates of integer copy number across the estimated batches, and to fully integrate the copy number uncertainty in the association model for disease. RESULTS: Applying a hidden Markov model (HMM) to identify CNVs in a large multi-site Pancreatic Cancer Case Control study (PanC4) of 7598 participants, we found CNV inference was highly sensitive to technical noise that varied appreciably among participants. Applying CNPBayes to this dataset, we found that the major sources of technical variation were linked to sample processing by the centralized laboratory and not the individual study sites. Modeling the latent batch effects at each CNV region hierarchically, we developed probabilistic estimates of copy number that were directly incorporated in a Bayesian regression model for pancreatic cancer risk. Candidate associations aided by this approach include deletions of 8q24 near regulatory elements of the tumor oncogene MYC and of Tumor Suppressor Candidate 3 (TUSC3). 
CONCLUSIONS: Laboratory effects may not account for the major sources of technical variation in genome-wide association studies. This study provides a robust Bayesian inferential framework for identifying latent batch effects, estimating copy number, and evaluating the role of copy number in heritable diseases. BioMed Central 2020-09-07 /pmc/articles/PMC7487704/ /pubmed/32894098 http://dx.doi.org/10.1186/s12885-020-07304-3 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research Article
Cristiano, Stephen
McKean, David
Carey, Jacob
Bracci, Paige
Brennan, Paul
Chou, Michael
Du, Mengmeng
Gallinger, Steven
Goggins, Michael G.
Hassan, Manal M.
Hung, Rayjean J.
Kurtz, Robert C.
Li, Donghui
Lu, Lingeng
Neale, Rachel
Olson, Sara
Petersen, Gloria
Rabe, Kari G.
Fu, Jack
Risch, Harvey
Rosner, Gary L.
Ruczinski, Ingo
Klein, Alison P.
Scharpf, Robert B.
Bayesian copy number detection and association in large-scale studies
title Bayesian copy number detection and association in large-scale studies
title_full Bayesian copy number detection and association in large-scale studies
title_fullStr Bayesian copy number detection and association in large-scale studies
title_full_unstemmed Bayesian copy number detection and association in large-scale studies
title_short Bayesian copy number detection and association in large-scale studies
title_sort bayesian copy number detection and association in large-scale studies
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7487704/
https://www.ncbi.nlm.nih.gov/pubmed/32894098
http://dx.doi.org/10.1186/s12885-020-07304-3