A robust model for read count data in exome sequencing experiments and implications for copy number variant calling
Motivation: Exome sequencing has proven to be an effective tool to discover the genetic basis of Mendelian disorders. It is well established that copy number variants (CNVs) contribute to the etiology of these disorders. However, calling CNVs from exome sequence data is challenging. A typical read d...
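Main authors: | Plagnol, Vincent; Curtis, James; Epstein, Michael; Mok, Kin Y.; Stebbings, Emma; Grigoriadou, Sofia; Wood, Nicholas W.; Hambleton, Sophie; Burns, Siobhan O.; Thrasher, Adrian J.; Kumararatne, Dinakantha; Doffinger, Rainer; Nejentsev, Sergey |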
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2012 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3476336/ https://www.ncbi.nlm.nih.gov/pubmed/22942019 http://dx.doi.org/10.1093/bioinformatics/bts526 |
_version_ | 1782247083080679424 |
---|---|
author | Plagnol, Vincent; Curtis, James; Epstein, Michael; Mok, Kin Y.; Stebbings, Emma; Grigoriadou, Sofia; Wood, Nicholas W.; Hambleton, Sophie; Burns, Siobhan O.; Thrasher, Adrian J.; Kumararatne, Dinakantha; Doffinger, Rainer; Nejentsev, Sergey |
author_facet | Plagnol, Vincent; Curtis, James; Epstein, Michael; Mok, Kin Y.; Stebbings, Emma; Grigoriadou, Sofia; Wood, Nicholas W.; Hambleton, Sophie; Burns, Siobhan O.; Thrasher, Adrian J.; Kumararatne, Dinakantha; Doffinger, Rainer; Nejentsev, Sergey |
author_sort | Plagnol, Vincent |
collection | PubMed |
description | Motivation: Exome sequencing has proven to be an effective tool to discover the genetic basis of Mendelian disorders. It is well established that copy number variants (CNVs) contribute to the etiology of these disorders. However, calling CNVs from exome sequence data is challenging. A typical read depth strategy consists of using another sample (or a combination of samples) as a reference to control for the variability at the capture and sequencing steps. However, technical variability between samples complicates the analysis and can create spurious CNV calls. Results: Here, we introduce ExomeDepth, a new CNV calling algorithm designed to control for this technical variability. ExomeDepth uses a robust model for the read count data and uses this model to build an optimized reference set in order to maximize the power to detect CNVs. As a result, ExomeDepth is effective across a wider range of exome datasets than the previously existing tools, even for small (e.g. one to two exons) and heterozygous deletions. We used this new approach to analyse exome data from 24 patients with primary immunodeficiencies. Depending on data quality and the exact target region, we find between 170 and 250 exonic CNV calls per sample. Our analysis identified two novel causative deletions in the genes GATA2 and DOCK8. Availability: The code used in this analysis has been implemented into an R package called ExomeDepth and is available at the Comprehensive R Archive Network (CRAN). Contact: v.plagnol@ucl.ac.uk Supplementary Information: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-3476336 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-34763362012-12-12 A robust model for read count data in exome sequencing experiments and implications for copy number variant calling Plagnol, Vincent Curtis, James Epstein, Michael Mok, Kin Y. Stebbings, Emma Grigoriadou, Sofia Wood, Nicholas W. Hambleton, Sophie Burns, Siobhan O. Thrasher, Adrian J. Kumararatne, Dinakantha Doffinger, Rainer Nejentsev, Sergey Bioinformatics Original Papers Motivation: Exome sequencing has proven to be an effective tool to discover the genetic basis of Mendelian disorders. It is well established that copy number variants (CNVs) contribute to the etiology of these disorders. However, calling CNVs from exome sequence data is challenging. A typical read depth strategy consists of using another sample (or a combination of samples) as a reference to control for the variability at the capture and sequencing steps. However, technical variability between samples complicates the analysis and can create spurious CNV calls. Results: Here, we introduce ExomeDepth, a new CNV calling algorithm designed to control for this technical variability. ExomeDepth uses a robust model for the read count data and uses this model to build an optimized reference set in order to maximize the power to detect CNVs. As a result, ExomeDepth is effective across a wider range of exome datasets than the previously existing tools, even for small (e.g. one to two exons) and heterozygous deletions. We used this new approach to analyse exome data from 24 patients with primary immunodeficiencies. Depending on data quality and the exact target region, we find between 170 and 250 exonic CNV calls per sample. Our analysis identified two novel causative deletions in the genes GATA2 and DOCK8. Availability: The code used in this analysis has been implemented into an R package called ExomeDepth and is available at the Comprehensive R Archive Network (CRAN). 
Contact: v.plagnol@ucl.ac.uk Supplementary Information: Supplementary data are available at Bioinformatics online. Oxford University Press 2012-11-01 2012-08-31 /pmc/articles/PMC3476336/ /pubmed/22942019 http://dx.doi.org/10.1093/bioinformatics/bts526 Text en © The Author 2012. Published by Oxford University Press. http://creativecommons.org/licenses/by/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Papers Plagnol, Vincent Curtis, James Epstein, Michael Mok, Kin Y. Stebbings, Emma Grigoriadou, Sofia Wood, Nicholas W. Hambleton, Sophie Burns, Siobhan O. Thrasher, Adrian J. Kumararatne, Dinakantha Doffinger, Rainer Nejentsev, Sergey A robust model for read count data in exome sequencing experiments and implications for copy number variant calling |
title | A robust model for read count data in exome sequencing experiments and implications for copy number variant calling |
title_sort | robust model for read count data in exome sequencing experiments and implications for copy number variant calling |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3476336/ https://www.ncbi.nlm.nih.gov/pubmed/22942019 http://dx.doi.org/10.1093/bioinformatics/bts526 |