
An adaptive decorrelation method removes Illumina DNA base-calling errors caused by crosstalk between adjacent clusters

Base-calling accuracy is crucial for high-throughput DNA sequencing and downstream analysis such as read mapping and genome assembly. Accordingly, we set out to reduce DNA sequencing errors in Illumina systems by correcting three kinds of crosstalk in the cluster intensity data. We discover...

Bibliographic Details
Main authors: Wang, Bo; Wan, Lin; Wang, Anqi; Li, Lei M.
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5316982/
https://www.ncbi.nlm.nih.gov/pubmed/28216647
http://dx.doi.org/10.1038/srep41348
collection PubMed
description Base-calling accuracy is crucial for high-throughput DNA sequencing and downstream analysis such as read mapping and genome assembly. Accordingly, we set out to reduce DNA sequencing errors in Illumina systems by correcting three kinds of crosstalk in the cluster intensity data. We discovered that signal crosstalk between adjacent clusters accounts for a large portion of sequencing errors in Illumina systems, even after correcting color crosstalk caused by the overlap of dye emission spectra and phasing/pre-phasing caused by out-of-step nucleotide synthesis. Importantly, spatial crosstalk between adjacent clusters is cluster-specific and often asymmetric, so it cannot be corrected by existing deconvolution methods. Therefore, we introduce a novel mathematical method able to estimate and remove spatial crosstalk, thereby reducing base-calling errors from Illumina systems by 44–69% at a given mapping rate. Furthermore, the resolution gained from this work provides new room for higher throughput of DNA sequencing and of general measurement systems using fluorescence-based imaging technology. The resulting base-caller 3Dec is available for academic users at http://github.com/flishwnag/3dec. It not only reduces errors by 62.1% compared to the standard pipeline, but its implementation is also fast enough for daily sequencing.
id pubmed-5316982
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-5316982 2017-02-24 Sci Rep Article Nature Publishing Group 2017-02-20 /pmc/articles/PMC5316982/ /pubmed/28216647 http://dx.doi.org/10.1038/srep41348 Text en Copyright © 2017, The Author(s) http://creativecommons.org/licenses/by-nc-sa/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/4.0/