nPoRe: n-polymer realigner for improved pileup-based variant calling

Bibliographic Details
Main Authors: Dunn, Tim; Blaauw, David; Das, Reetuparna; Narayanasamy, Satish
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10022090/
https://www.ncbi.nlm.nih.gov/pubmed/36927439
http://dx.doi.org/10.1186/s12859-023-05193-4
Description
Summary: Despite recent improvements in nanopore basecalling accuracy, germline variant calling of small insertions and deletions (INDELs) remains poor. Although precision and recall for single nucleotide polymorphisms (SNPs) now exceed 99.5%, INDEL recall remains below 80% for standard R9.4.1 flow cells. We show that read phasing and realignment can recover a significant portion of false-negative INDELs. In particular, we extend Needleman-Wunsch affine gap alignment by introducing new gap penalties for more accurately aligning repeated n-polymer sequences such as homopolymers ([Formula: see text]) and tandem repeats ([Formula: see text]). At the same precision, haplotype phasing improves INDEL recall from 63.76 to [Formula: see text] and nPoRe realignment improves it further to [Formula: see text].
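
For orientation, the baseline that the summary says nPoRe extends, Needleman-Wunsch global alignment with affine gap penalties, can be sketched as below. This is a minimal illustration of the standard three-matrix (Gotoh) recurrence, not the authors' implementation. The function name `affine_gap_score` and all scoring values are assumptions chosen for the example, and the n-polymer gap penalties themselves are not specified in this record, so they are not reproduced here.

```python
NEG_INF = float("-inf")


def affine_gap_score(a, b, match=1, mismatch=-1, gap_open=-2, gap_extend=-1):
    """Global alignment score with affine gaps (standard Gotoh recurrence).

    A gap of length k costs gap_open + k * gap_extend.
    All scoring values are illustrative assumptions, not nPoRe's.
    """
    n, m = len(a), len(b)
    # M: a[i-1] aligned to b[j-1]
    # X: a[i-1] aligned to a gap (deletion relative to a)
    # Y: b[j-1] aligned to a gap (insertion relative to a)
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]

    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_open + i * gap_extend  # leading gap of length i
    for j in range(1, m + 1):
        Y[0][j] = gap_open + j * gap_extend  # leading gap of length j

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            # Open a new gap (pay open + extend) or extend an existing one.
            X[i][j] = max(
                M[i - 1][j] + gap_open + gap_extend,
                Y[i - 1][j] + gap_open + gap_extend,
                X[i - 1][j] + gap_extend,
            )
            Y[i][j] = max(
                M[i][j - 1] + gap_open + gap_extend,
                X[i][j - 1] + gap_open + gap_extend,
                Y[i][j - 1] + gap_extend,
            )

    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    # A homopolymer length difference: one gap of length 2 is preferred
    # over two separate gaps of length 1 under affine penalties.
    print(affine_gap_score("AAAA", "AA"))
```

Under these assumed scores, a single long gap is cheaper than several short ones, which is the property an n-polymer-aware penalty scheme would presumably refine for homopolymers and tandem repeats.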