A robust model for read count data in exome sequencing experiments and implications for copy number variant calling

Motivation: Exome sequencing has proven to be an effective tool to discover the genetic basis of Mendelian disorders. It is well established that copy number variants (CNVs) contribute to the etiology of these disorders. However, calling CNVs from exome sequence data is challenging. A typical read d...

Bibliographic Details
Main authors: Plagnol, Vincent; Curtis, James; Epstein, Michael; Mok, Kin Y.; Stebbings, Emma; Grigoriadou, Sofia; Wood, Nicholas W.; Hambleton, Sophie; Burns, Siobhan O.; Thrasher, Adrian J.; Kumararatne, Dinakantha; Doffinger, Rainer; Nejentsev, Sergey
Format: Online Article Text
Language: English
Published: Oxford University Press 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3476336/
https://www.ncbi.nlm.nih.gov/pubmed/22942019
http://dx.doi.org/10.1093/bioinformatics/bts526
author Plagnol, Vincent
Curtis, James
Epstein, Michael
Mok, Kin Y.
Stebbings, Emma
Grigoriadou, Sofia
Wood, Nicholas W.
Hambleton, Sophie
Burns, Siobhan O.
Thrasher, Adrian J.
Kumararatne, Dinakantha
Doffinger, Rainer
Nejentsev, Sergey
author_sort Plagnol, Vincent
collection PubMed
description Motivation: Exome sequencing has proven to be an effective tool to discover the genetic basis of Mendelian disorders. It is well established that copy number variants (CNVs) contribute to the etiology of these disorders. However, calling CNVs from exome sequence data is challenging. A typical read depth strategy consists of using another sample (or a combination of samples) as a reference to control for the variability at the capture and sequencing steps. However, technical variability between samples complicates the analysis and can create spurious CNV calls. Results: Here, we introduce ExomeDepth, a new CNV calling algorithm designed to control for this technical variability. ExomeDepth uses a robust model for the read count data and uses this model to build an optimized reference set in order to maximize the power to detect CNVs. As a result, ExomeDepth is effective across a wider range of exome datasets than the previously existing tools, even for small (e.g. one to two exons) and heterozygous deletions. We used this new approach to analyse exome data from 24 patients with primary immunodeficiencies. Depending on data quality and the exact target region, we find between 170 and 250 exonic CNV calls per sample. Our analysis identified two novel causative deletions in the genes GATA2 and DOCK8. Availability: The code used in this analysis has been implemented into an R package called ExomeDepth and is available at the Comprehensive R Archive Network (CRAN). Contact: v.plagnol@ucl.ac.uk Supplementary Information: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-3476336
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-34763362012-12-12 A robust model for read count data in exome sequencing experiments and implications for copy number variant calling Plagnol, Vincent Curtis, James Epstein, Michael Mok, Kin Y. Stebbings, Emma Grigoriadou, Sofia Wood, Nicholas W. Hambleton, Sophie Burns, Siobhan O. Thrasher, Adrian J. Kumararatne, Dinakantha Doffinger, Rainer Nejentsev, Sergey Bioinformatics Original Papers Motivation: Exome sequencing has proven to be an effective tool to discover the genetic basis of Mendelian disorders. It is well established that copy number variants (CNVs) contribute to the etiology of these disorders. However, calling CNVs from exome sequence data is challenging. A typical read depth strategy consists of using another sample (or a combination of samples) as a reference to control for the variability at the capture and sequencing steps. However, technical variability between samples complicates the analysis and can create spurious CNV calls. Results: Here, we introduce ExomeDepth, a new CNV calling algorithm designed to control for this technical variability. ExomeDepth uses a robust model for the read count data and uses this model to build an optimized reference set in order to maximize the power to detect CNVs. As a result, ExomeDepth is effective across a wider range of exome datasets than the previously existing tools, even for small (e.g. one to two exons) and heterozygous deletions. We used this new approach to analyse exome data from 24 patients with primary immunodeficiencies. Depending on data quality and the exact target region, we find between 170 and 250 exonic CNV calls per sample. Our analysis identified two novel causative deletions in the genes GATA2 and DOCK8. Availability: The code used in this analysis has been implemented into an R package called ExomeDepth and is available at the Comprehensive R Archive Network (CRAN). 
Contact: v.plagnol@ucl.ac.uk Supplementary Information: Supplementary data are available at Bioinformatics online. Oxford University Press 2012-11-01 2012-08-31 /pmc/articles/PMC3476336/ /pubmed/22942019 http://dx.doi.org/10.1093/bioinformatics/bts526 Text en © The Author 2012. Published by Oxford University Press. http://creativecommons.org/licenses/by/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A robust model for read count data in exome sequencing experiments and implications for copy number variant calling
title_sort robust model for read count data in exome sequencing experiments and implications for copy number variant calling
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3476336/
https://www.ncbi.nlm.nih.gov/pubmed/22942019
http://dx.doi.org/10.1093/bioinformatics/bts526