CNVkit: Genome-Wide Copy Number Detection and Visualization from Targeted DNA Sequencing
Germline copy number variants (CNVs) and somatic copy number alterations (SCNAs) are of significant importance in syndromic conditions and cancer. Massively parallel sequencing is increasingly used to infer copy number information from variations in the read depth in sequencing data. However, this a...
Main Authors: | Talevich, Eric; Shain, A. Hunter; Botton, Thomas; Bastian, Boris C. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2016 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4839673/ https://www.ncbi.nlm.nih.gov/pubmed/27100738 http://dx.doi.org/10.1371/journal.pcbi.1004873 |
_version_ | 1782428161799094272 |
---|---|
author | Talevich, Eric Shain, A. Hunter Botton, Thomas Bastian, Boris C. |
author_facet | Talevich, Eric Shain, A. Hunter Botton, Thomas Bastian, Boris C. |
author_sort | Talevich, Eric |
collection | PubMed |
description | Germline copy number variants (CNVs) and somatic copy number alterations (SCNAs) are of significant importance in syndromic conditions and cancer. Massively parallel sequencing is increasingly used to infer copy number information from variations in the read depth in sequencing data. However, this approach has limitations in the case of targeted re-sequencing, which leaves gaps in coverage between the regions chosen for enrichment and introduces biases related to the efficiency of target capture and library preparation. We present a method for copy number detection, implemented in the software package CNVkit, that uses both the targeted reads and the nonspecifically captured off-target reads to infer copy number evenly across the genome. This combination achieves both exon-level resolution in targeted regions and sufficient resolution in the larger intronic and intergenic regions to identify copy number changes. In particular, we successfully inferred copy number at equivalent to 100-kilobase resolution genome-wide from a platform targeting as few as 293 genes. After normalizing read counts to a pooled reference, we evaluated and corrected for three sources of bias that explain most of the extraneous variability in the sequencing read depth: GC content, target footprint size and spacing, and repetitive sequences. We compared the performance of CNVkit to copy number changes identified by array comparative genomic hybridization. We packaged the components of CNVkit so that it is straightforward to use and provides visualizations, detailed reporting of significant features, and export options for integration into existing analysis pipelines. CNVkit is freely available from https://github.com/etal/cnvkit. |
format | Online Article Text |
id | pubmed-4839673 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-4839673 2016-04-29 CNVkit: Genome-Wide Copy Number Detection and Visualization from Targeted DNA Sequencing Talevich, Eric Shain, A. Hunter Botton, Thomas Bastian, Boris C. PLoS Comput Biol Research Article Germline copy number variants (CNVs) and somatic copy number alterations (SCNAs) are of significant importance in syndromic conditions and cancer. Massively parallel sequencing is increasingly used to infer copy number information from variations in the read depth in sequencing data. However, this approach has limitations in the case of targeted re-sequencing, which leaves gaps in coverage between the regions chosen for enrichment and introduces biases related to the efficiency of target capture and library preparation. We present a method for copy number detection, implemented in the software package CNVkit, that uses both the targeted reads and the nonspecifically captured off-target reads to infer copy number evenly across the genome. This combination achieves both exon-level resolution in targeted regions and sufficient resolution in the larger intronic and intergenic regions to identify copy number changes. In particular, we successfully inferred copy number at equivalent to 100-kilobase resolution genome-wide from a platform targeting as few as 293 genes. After normalizing read counts to a pooled reference, we evaluated and corrected for three sources of bias that explain most of the extraneous variability in the sequencing read depth: GC content, target footprint size and spacing, and repetitive sequences. We compared the performance of CNVkit to copy number changes identified by array comparative genomic hybridization. We packaged the components of CNVkit so that it is straightforward to use and provides visualizations, detailed reporting of significant features, and export options for integration into existing analysis pipelines. CNVkit is freely available from https://github.com/etal/cnvkit. Public Library of Science 2016-04-21 /pmc/articles/PMC4839673/ /pubmed/27100738 http://dx.doi.org/10.1371/journal.pcbi.1004873 Text en © 2016 Talevich et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Talevich, Eric Shain, A. Hunter Botton, Thomas Bastian, Boris C. CNVkit: Genome-Wide Copy Number Detection and Visualization from Targeted DNA Sequencing |
title | CNVkit: Genome-Wide Copy Number Detection and Visualization from Targeted DNA Sequencing |
title_full | CNVkit: Genome-Wide Copy Number Detection and Visualization from Targeted DNA Sequencing |
title_fullStr | CNVkit: Genome-Wide Copy Number Detection and Visualization from Targeted DNA Sequencing |
title_full_unstemmed | CNVkit: Genome-Wide Copy Number Detection and Visualization from Targeted DNA Sequencing |
title_short | CNVkit: Genome-Wide Copy Number Detection and Visualization from Targeted DNA Sequencing |
title_sort | cnvkit: genome-wide copy number detection and visualization from targeted dna sequencing |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4839673/ https://www.ncbi.nlm.nih.gov/pubmed/27100738 http://dx.doi.org/10.1371/journal.pcbi.1004873 |
work_keys_str_mv | AT talevicheric cnvkitgenomewidecopynumberdetectionandvisualizationfromtargeteddnasequencing AT shainahunter cnvkitgenomewidecopynumberdetectionandvisualizationfromtargeteddnasequencing AT bottonthomas cnvkitgenomewidecopynumberdetectionandvisualizationfromtargeteddnasequencing AT bastianborisc cnvkitgenomewidecopynumberdetectionandvisualizationfromtargeteddnasequencing |