
RefCNV: Identification of Gene-Based Copy Number Variants Using Whole Exome Sequencing

With rapid advances in DNA sequencing technologies, whole exome sequencing (WES) has become a popular approach for detecting somatic mutations in oncology studies. The initial intent of WES was to characterize single nucleotide variants, but it was observed that the number of sequencing reads that m...


Bibliographic Details
Main Authors: Chang, Lun-Ching, Das, Biswajit, Lih, Chih-Jian, Si, Han, Camalier, Corinne E., McGregor, Paul M., Polley, Eric
Format: Online Article Text
Language: English
Published: Libertas Academica 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4849420/
https://www.ncbi.nlm.nih.gov/pubmed/27147817
http://dx.doi.org/10.4137/CIN.S36612
author Chang, Lun-Ching
Das, Biswajit
Lih, Chih-Jian
Si, Han
Camalier, Corinne E.
McGregor, Paul M.
Polley, Eric
collection PubMed
description With rapid advances in DNA sequencing technologies, whole exome sequencing (WES) has become a popular approach for detecting somatic mutations in oncology studies. The initial intent of WES was to characterize single nucleotide variants, but it was observed that the number of sequencing reads that mapped to a genomic region correlated with the DNA copy number variants (CNVs). We propose a method RefCNV that uses a reference set to estimate the distribution of the coverage for each exon. The construction of the reference set includes an evaluation of the sources of variability in the coverage distribution. We observed that the processing steps had an impact on the coverage distribution. For each exon, we compared the observed coverage with the expected normal coverage. Thresholds for determining CNVs were selected to control the false-positive error rate. RefCNV prediction correlated significantly (r = 0.96–0.86) with CNV measured by digital polymerase chain reaction for MET (7q31), EGFR (7p12), or ERBB2 (17q12) in 13 tumor cell lines. The genome-wide CNV analysis showed a good overall correlation (Spearman’s coefficient = 0.82) between RefCNV estimation and publicly available CNV data in Cancer Cell Line Encyclopedia. RefCNV also showed better performance than three other CNV estimation methods in genome-wide CNV analysis.
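The comparison of observed coverage against a reference set, as summarized in the description, can be illustrated with a minimal sketch. This is not the RefCNV implementation: the function names `build_reference` and `call_exons`, the log2 transform, the normal approximation, and the `alpha` parameter are all assumptions made for illustration only.

```python
import math
from statistics import NormalDist, mean, stdev


def build_reference(ref_coverage):
    """Estimate a per-exon coverage distribution from normal samples.

    ref_coverage maps an exon id to a list of normalized coverage values
    (hypothetical input; at least two non-identical values per exon).
    Returns exon id -> (mean, standard deviation) of log2 coverage.
    """
    reference = {}
    for exon, values in ref_coverage.items():
        logs = [math.log2(v) for v in values]
        reference[exon] = (mean(logs), stdev(logs))
    return reference


def call_exons(sample_coverage, reference, alpha=0.01):
    """Label each exon as gain, loss, or neutral.

    The cutoff is the two-sided normal quantile for alpha, an assumed way
    of bounding the per-exon false-positive rate under normal copy number.
    """
    z_cut = NormalDist().inv_cdf(1 - alpha / 2)
    calls = {}
    for exon, coverage in sample_coverage.items():
        mu, sd = reference[exon]
        z = (math.log2(coverage) - mu) / sd
        if z > z_cut:
            calls[exon] = "gain"
        elif z < -z_cut:
            calls[exon] = "loss"
        else:
            calls[exon] = "neutral"
    return calls


# Toy data: three exons, four normal reference samples each.
reference = build_reference({
    "exonA": [0.9, 1.0, 1.1, 1.0],
    "exonB": [0.9, 1.0, 1.1, 1.0],
    "exonC": [0.9, 1.0, 1.1, 1.0],
})
calls = call_exons({"exonA": 4.0, "exonB": 1.0, "exonC": 0.2}, reference)
print(calls)
```

In this sketch a smaller `alpha` widens the neutral band, trading sensitivity for fewer false calls; how the published method selects its thresholds is not stated in this record.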
format Online
Article
Text
id pubmed-4849420
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Libertas Academica
record_format MEDLINE/PubMed
spelling pubmed-4849420 2016-05-04 Cancer Inform Methodology Libertas Academica 2016-04-27 /pmc/articles/PMC4849420/ /pubmed/27147817 http://dx.doi.org/10.4137/CIN.S36612 Text en © 2016 the author(s), publisher and licensee Libertas Academica Ltd. This is an open-access article distributed under the terms of the Creative Commons CC-BY-NC 3.0 license.
title RefCNV: Identification of Gene-Based Copy Number Variants Using Whole Exome Sequencing
topic Methodology