DOCEST—fast and accurate estimator of human NGS sequencing depth and error rate
MOTIVATION: Accurate estimation of next-generation sequencing depth of coverage is needed for detecting the copy number of repeated elements in the human genome. The common methods for estimating sequencing depth are based on counting the number of reads mapped to the genome or subgenomic regions. S...
Main authors: | Kaplinski, Lauris; Möls, Märt; Puurand, Tarmo; Remm, Maido |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2023 |
Subjects: | Application Note |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10460481/ https://www.ncbi.nlm.nih.gov/pubmed/37641716 http://dx.doi.org/10.1093/bioadv/vbad084 |
_version_ | 1785097653610086400 |
---|---|
author | Kaplinski, Lauris Möls, Märt Puurand, Tarmo Remm, Maido |
author_facet | Kaplinski, Lauris Möls, Märt Puurand, Tarmo Remm, Maido |
author_sort | Kaplinski, Lauris |
collection | PubMed |
description | MOTIVATION: Accurate estimation of next-generation sequencing depth of coverage is needed for detecting the copy number of repeated elements in the human genome. The common methods for estimating sequencing depth are based on counting the number of reads mapped to the genome or subgenomic regions. Such methods are sensitive to the mapping quality. The presence of contamination or the large deviance of an individual genome from the reference may introduce bias in depth estimation. RESULTS: Here, we present an algorithm and implementation for estimating both the sequencing depth and error rate from unmapped reads using a uniquely filtered k-mer set. On simulated reads with 20× coverage, the margin of error was less than 0.01%. At 0.01× coverage and the presence of 10-fold contamination, the precision was within 2% for depth and within 10% for error rate. AVAILABILITY AND IMPLEMENTATION: DOCEST program and database can be downloaded from https://bioinfo.ut.ee/docest/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online. |
format | Online Article Text |
id | pubmed-10460481 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10460481 2023-08-28 DOCEST—fast and accurate estimator of human NGS sequencing depth and error rate Kaplinski, Lauris Möls, Märt Puurand, Tarmo Remm, Maido Bioinform Adv Application Note MOTIVATION: Accurate estimation of next-generation sequencing depth of coverage is needed for detecting the copy number of repeated elements in the human genome. The common methods for estimating sequencing depth are based on counting the number of reads mapped to the genome or subgenomic regions. Such methods are sensitive to the mapping quality. The presence of contamination or the large deviance of an individual genome from the reference may introduce bias in depth estimation. RESULTS: Here, we present an algorithm and implementation for estimating both the sequencing depth and error rate from unmapped reads using a uniquely filtered k-mer set. On simulated reads with 20× coverage, the margin of error was less than 0.01%. At 0.01× coverage and the presence of 10-fold contamination, the precision was within 2% for depth and within 10% for error rate. AVAILABILITY AND IMPLEMENTATION: DOCEST program and database can be downloaded from https://bioinfo.ut.ee/docest/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online. Oxford University Press 2023-07-18 /pmc/articles/PMC10460481/ /pubmed/37641716 http://dx.doi.org/10.1093/bioadv/vbad084 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Application Note Kaplinski, Lauris Möls, Märt Puurand, Tarmo Remm, Maido DOCEST—fast and accurate estimator of human NGS sequencing depth and error rate |
title | DOCEST—fast and accurate estimator of human NGS sequencing depth and error rate |
title_full | DOCEST—fast and accurate estimator of human NGS sequencing depth and error rate |
title_fullStr | DOCEST—fast and accurate estimator of human NGS sequencing depth and error rate |
title_full_unstemmed | DOCEST—fast and accurate estimator of human NGS sequencing depth and error rate |
title_short | DOCEST—fast and accurate estimator of human NGS sequencing depth and error rate |
title_sort | docest—fast and accurate estimator of human ngs sequencing depth and error rate |
topic | Application Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10460481/ https://www.ncbi.nlm.nih.gov/pubmed/37641716 http://dx.doi.org/10.1093/bioadv/vbad084 |
work_keys_str_mv | AT kaplinskilauris docestfastandaccurateestimatorofhumanngssequencingdepthanderrorrate AT molsmart docestfastandaccurateestimatorofhumanngssequencingdepthanderrorrate AT puurandtarmo docestfastandaccurateestimatorofhumanngssequencingdepthanderrorrate AT remmmaido docestfastandaccurateestimatorofhumanngssequencingdepthanderrorrate |