
CNVkit: Genome-Wide Copy Number Detection and Visualization from Targeted DNA Sequencing

Germline copy number variants (CNVs) and somatic copy number alterations (SCNAs) are of significant importance in syndromic conditions and cancer. Massively parallel sequencing is increasingly used to infer copy number information from variations in the read depth in sequencing data. However, this a...


Bibliographic Details
Main Authors: Talevich, Eric; Shain, A. Hunter; Botton, Thomas; Bastian, Boris C.
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4839673/
https://www.ncbi.nlm.nih.gov/pubmed/27100738
http://dx.doi.org/10.1371/journal.pcbi.1004873
_version_ 1782428161799094272
author Talevich, Eric
Shain, A. Hunter
Botton, Thomas
Bastian, Boris C.
author_sort Talevich, Eric
collection PubMed
description Germline copy number variants (CNVs) and somatic copy number alterations (SCNAs) are of significant importance in syndromic conditions and cancer. Massively parallel sequencing is increasingly used to infer copy number information from variations in the read depth in sequencing data. However, this approach has limitations in the case of targeted re-sequencing, which leaves gaps in coverage between the regions chosen for enrichment and introduces biases related to the efficiency of target capture and library preparation. We present a method for copy number detection, implemented in the software package CNVkit, that uses both the targeted reads and the nonspecifically captured off-target reads to infer copy number evenly across the genome. This combination achieves both exon-level resolution in targeted regions and sufficient resolution in the larger intronic and intergenic regions to identify copy number changes. In particular, we successfully inferred copy number at equivalent to 100-kilobase resolution genome-wide from a platform targeting as few as 293 genes. After normalizing read counts to a pooled reference, we evaluated and corrected for three sources of bias that explain most of the extraneous variability in the sequencing read depth: GC content, target footprint size and spacing, and repetitive sequences. We compared the performance of CNVkit to copy number changes identified by array comparative genomic hybridization. We packaged the components of CNVkit so that it is straightforward to use and provides visualizations, detailed reporting of significant features, and export options for integration into existing analysis pipelines. CNVkit is freely available from https://github.com/etal/cnvkit.
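The normalization step summarized in the description above (read counts compared against a pooled reference, followed by bias correction) can be illustrated with a minimal sketch. This is not CNVkit's implementation: the function names, the pseudocount, and the rolling-median GC correction are assumptions chosen for illustration, and the depths in the usage example are made-up numbers.

```python
import math
from statistics import median

PSEUDOCOUNT = 0.5  # hypothetical value; avoids log2(0) for empty bins


def log2_depth(depths):
    """Convert per-bin read depths to log2 scale."""
    return [math.log2(d + PSEUDOCOUNT) for d in depths]


def median_center(values):
    """Shift values so their median is zero (removes library-size effects)."""
    m = median(values)
    return [v - m for v in values]


def pooled_reference(normal_samples):
    """Per-bin median of median-centered log2 depths across normal samples."""
    centered = [median_center(log2_depth(s)) for s in normal_samples]
    return [median(column) for column in zip(*centered)]


def log2_ratios(sample_depths, reference):
    """Log2 ratio of one sample against the pooled reference, bin by bin."""
    sample = median_center(log2_depth(sample_depths))
    return [s - r for s, r in zip(sample, reference)]


def correct_gc_bias(ratios, gc_fractions, window=51):
    """Subtract a rolling-median trend computed over bins sorted by GC content.

    An illustrative stand-in for a GC correction, not the published method.
    """
    order = sorted(range(len(ratios)), key=lambda i: gc_fractions[i])
    half = window // 2
    corrected = list(ratios)
    for rank, i in enumerate(order):
        lo = max(0, rank - half)
        hi = min(len(order), rank + half + 1)
        trend = median(ratios[j] for j in order[lo:hi])
        corrected[i] = ratios[i] - trend
    return corrected


if __name__ == "__main__":
    # Made-up depths for four bins in three normal samples and one test sample.
    normals = [[100, 200, 50, 100], [100, 200, 50, 100], [100, 200, 50, 100]]
    tumor = [100, 200, 100, 100]  # third bin has doubled depth
    reference = pooled_reference(normals)
    print([round(r, 2) for r in log2_ratios(tumor, reference)])
```

Median-centering each sample before pooling keeps differences in total sequencing depth from appearing as copy number changes. A real analysis would also need segmentation and the other corrections the description names (target footprint size and spacing, repetitive sequences), which this sketch omits.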
format Online
Article
Text
id pubmed-4839673
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4839673 2016-04-29 CNVkit: Genome-Wide Copy Number Detection and Visualization from Targeted DNA Sequencing Talevich, Eric Shain, A. Hunter Botton, Thomas Bastian, Boris C. PLoS Comput Biol Research Article
Public Library of Science 2016-04-21 /pmc/articles/PMC4839673/ /pubmed/27100738 http://dx.doi.org/10.1371/journal.pcbi.1004873 Text en © 2016 Talevich et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title CNVkit: Genome-Wide Copy Number Detection and Visualization from Targeted DNA Sequencing
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4839673/
https://www.ncbi.nlm.nih.gov/pubmed/27100738
http://dx.doi.org/10.1371/journal.pcbi.1004873