BAMixChecker: an automated checkup tool for matched sample pairs in NGS cohort

Bibliographic Details
Main Authors: Chun, Hein, Kim, Sangwoo
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6853765/
https://www.ncbi.nlm.nih.gov/pubmed/31197312
http://dx.doi.org/10.1093/bioinformatics/btz479
Description
Summary:
SUMMARY: Mislabeling in the process of next-generation sequencing is a frequent problem that can cause an entire genomic analysis to fail, and a regular cohort-level checkup is needed to ensure that it has not occurred. We developed a new, automated tool (BAMixChecker) that accurately detects sample mismatches from a given BAM file cohort with minimal user intervention. BAMixChecker uses a flexible, data-specific set of single-nucleotide polymorphisms and detects orphan (unpaired) and swapped (mispaired) samples based on genotype-concordance score and entropy-based file name analysis. BAMixChecker shows ∼100% accuracy in real WES, RNA-Seq and targeted sequencing data cohorts, even for small panels (<50 genes). BAMixChecker provides an HTML-style report that graphically outlines the sample matching status in tables and heatmaps, with which users can quickly inspect any mismatch events.
AVAILABILITY AND IMPLEMENTATION: BAMixChecker is available at https://github.com/heinc1010/BAMixChecker
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
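
The idea of pairing samples by genotype concordance can be illustrated with a minimal Python sketch. This is not BAMixChecker's implementation: the function names, the genotype dictionaries, the file names, and the 0.7 cutoff are all hypothetical, and the sketch assumes genotypes have already been called at a shared set of SNP sites. It covers only the concordance step and leaves out the entropy-based file name analysis.

```python
from itertools import combinations


def concordance(gt_a, gt_b):
    """Fraction of SNP sites called in both samples that have identical genotypes."""
    shared = [site for site in gt_a if site in gt_b]
    if not shared:
        return 0.0
    same = sum(1 for site in shared if gt_a[site] == gt_b[site])
    return same / len(shared)


def classify_pairs(genotypes, threshold=0.7):
    """Return (matched, orphans) for a cohort.

    genotypes maps a file name to {site_id: genotype_string}.
    threshold is an illustrative cutoff, not a value taken from the paper.
    """
    matched = []
    for a, b in combinations(sorted(genotypes), 2):
        score = concordance(genotypes[a], genotypes[b])
        if score >= threshold:
            matched.append((a, b, score))
    # A sample that matches no other sample is reported as an orphan.
    paired = {name for a, b, _ in matched for name in (a, b)}
    orphans = sorted(set(genotypes) - paired)
    return matched, orphans


# Hypothetical toy cohort: two files from one individual, one unrelated file.
genotypes = {
    "P1_N.bam": {"rs1": "0/1", "rs2": "1/1", "rs3": "0/0", "rs4": "0/1"},
    "P1_T.bam": {"rs1": "0/1", "rs2": "1/1", "rs3": "0/0", "rs4": "0/1"},
    "P2_N.bam": {"rs1": "0/0", "rs2": "0/1", "rs3": "1/1", "rs4": "0/0"},
}
matched, orphans = classify_pairs(genotypes)
```

In this toy cohort the two P1 files agree at every site and are matched, while P2_N.bam has no partner and is listed as an orphan. Detecting a swapped pair would additionally require comparing the genotype-based pairing against the pairing implied by the file names.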