DOCEST—fast and accurate estimator of human NGS sequencing depth and error rate


Bibliographic Details
Main authors: Kaplinski, Lauris; Möls, Märt; Puurand, Tarmo; Remm, Maido
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10460481/
https://www.ncbi.nlm.nih.gov/pubmed/37641716
http://dx.doi.org/10.1093/bioadv/vbad084
Description
Summary: MOTIVATION: Accurate estimation of next-generation sequencing depth of coverage is needed for detecting the copy number of repeated elements in the human genome. The common methods for estimating sequencing depth are based on counting the number of reads mapped to the genome or subgenomic regions. Such methods are sensitive to the mapping quality. The presence of contamination or a large deviation of an individual genome from the reference may introduce bias in depth estimation. RESULTS: Here, we present an algorithm and implementation for estimating both the sequencing depth and error rate from unmapped reads using a uniquely filtered k-mer set. On simulated reads with 20× coverage, the margin of error was less than 0.01%. At 0.01× coverage in the presence of 10-fold contamination, the precision was within 2% for depth and within 10% for error rate. AVAILABILITY AND IMPLEMENTATION: The DOCEST program and database can be downloaded from https://bioinfo.ut.ee/docest/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.