
Copy number analysis of whole-genome data using BIC-seq2 and its application to detection of cancer susceptibility variants

Whole-genome sequencing data allow detection of copy number variation (CNV) at high resolution. However, estimation based on read coverage along the genome suffers from bias due to GC content and other factors. Here, we develop an algorithm called BIC-seq2 that combines normalization of the data at...

Bibliographic Details
Main Authors: Xi, Ruibin; Lee, Semin; Xia, Yuchao; Kim, Tae-Min; Park, Peter J
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5772337/
https://www.ncbi.nlm.nih.gov/pubmed/27260798
http://dx.doi.org/10.1093/nar/gkw491
_version_ 1783293392624549888
author Xi, Ruibin
Lee, Semin
Xia, Yuchao
Kim, Tae-Min
Park, Peter J
collection PubMed
description Whole-genome sequencing data allow detection of copy number variation (CNV) at high resolution. However, estimation based on read coverage along the genome suffers from bias due to GC content and other factors. Here, we develop an algorithm called BIC-seq2 that combines normalization of the data at the nucleotide level and Bayesian information criterion-based segmentation to detect both somatic and germline CNVs accurately. Analysis of simulation data showed that this method outperforms existing methods. We apply this algorithm to low coverage whole-genome sequencing data from peripheral blood of nearly a thousand patients across eleven cancer types in The Cancer Genome Atlas (TCGA) to identify cancer-predisposing CNV regions. We confirm known regions and discover new ones including those covering KMT2C, GOLPH3, ERBB2 and PLAG1. Analysis of colorectal cancer genomes in particular reveals novel recurrent CNVs including deletions at two chromatin-remodeling genes RERE and NPM2. This method will be useful to many researchers interested in profiling CNVs from whole-genome sequencing data.
format Online
Article
Text
id pubmed-5772337
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-5772337 2018-01-23
Nucleic Acids Res
Genomics
Oxford University Press 2016-07-27 2016-06-03
/pmc/articles/PMC5772337/ /pubmed/27260798 http://dx.doi.org/10.1093/nar/gkw491
Text en
© The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
http://creativecommons.org/licenses/by-nc/4.0/
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Copy number analysis of whole-genome data using BIC-seq2 and its application to detection of cancer susceptibility variants
topic Genomics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5772337/
https://www.ncbi.nlm.nih.gov/pubmed/27260798
http://dx.doi.org/10.1093/nar/gkw491