Conpair: concordance and contamination estimator for matched tumor–normal pairs

Bibliographic Details
Main authors: Bergmann, Ewa A., Chen, Bo-Juen, Arora, Kanika, Vacic, Vladimir, Zody, Michael C.
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5048070/
https://www.ncbi.nlm.nih.gov/pubmed/27354699
http://dx.doi.org/10.1093/bioinformatics/btw389
Description
Summary:
Motivation: Sequencing of matched tumor and normal samples is the standard study design for reliable detection of somatic alterations. However, even very low levels of cross-sample contamination significantly impact calling of somatic mutations, because contaminant germline variants can be incorrectly interpreted as somatic. There are currently no sequence-only based methods that reliably estimate contamination levels in tumor samples, which frequently display copy number changes. As a solution, we developed Conpair, a tool for detection of sample swaps and cross-individual contamination in whole-genome and whole-exome tumor–normal sequencing experiments.
Results: On a ladder of in silico contaminated samples, we demonstrated that Conpair reliably measures contamination levels as low as 0.1%, even in presence of copy number changes. We also estimated contamination levels in glioblastoma WGS and WXS tumor–normal datasets from TCGA and showed that they strongly correlate with tumor–normal concordance, as well as with the number of germline variants called as somatic by several widely-used somatic callers.
Availability and Implementation: The method is available at: https://github.com/nygenome/conpair.
Contact: egrabowska@gmail.com or mczody@nygenome.org
Supplementary information: Supplementary data are available at Bioinformatics online.
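Illustrative note: the summary's point that contaminant germline variants can be mistaken for somatic ones can be made concrete with simple arithmetic. The sketch below is not Conpair's method or code. It assumes a copy-neutral diploid site where the sequenced individual is homozygous reference, and the function names and parameters are hypothetical. Under those assumptions, a contaminating individual contributes alternate-allele reads at a fraction proportional to the contamination level, which at low levels resembles a low-frequency somatic call.

```python
def expected_contaminant_vaf(contamination: float,
                             contaminant_alt_copies: int,
                             ploidy: int = 2) -> float:
    """Expected ALT read fraction caused purely by contamination.

    Assumptions (illustrative only, not taken from the article):
    - the host sample is homozygous reference at this site,
    - host and contaminant are both copy-neutral with the given ploidy,
    - reads are sampled in proportion to DNA content.
    """
    if not 0.0 <= contamination <= 1.0:
        raise ValueError("contamination must be a fraction in [0, 1]")
    if not 0 <= contaminant_alt_copies <= ploidy:
        raise ValueError("contaminant_alt_copies must be between 0 and ploidy")
    return contamination * contaminant_alt_copies / ploidy


def expected_alt_reads(depth: int,
                       contamination: float,
                       contaminant_alt_copies: int,
                       ploidy: int = 2) -> float:
    """Expected number of ALT-supporting reads at a given sequencing depth."""
    return depth * expected_contaminant_vaf(
        contamination, contaminant_alt_copies, ploidy
    )


if __name__ == "__main__":
    # 2% contamination, contaminant homozygous for the ALT allele, 80x depth.
    print(expected_contaminant_vaf(0.02, 2))
    print(expected_alt_reads(80, 0.02, 2))
```

Tumor copy number changes break the diploid assumption used here, which is the complication the summary says a contamination estimator for tumor samples has to handle.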