An adaptive decorrelation method removes Illumina DNA base-calling errors caused by crosstalk between adjacent clusters
Base-calling accuracy is crucial for high-throughput DNA sequencing and downstream analysis such as read mapping and genome assembly. Accordingly, we made an endeavor to reduce DNA sequencing errors of Illumina systems by correcting three kinds of crosstalk in the cluster intensity data. We discover...
Main authors: | Wang, Bo; Wan, Lin; Wang, Anqi; Li, Lei M. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group, 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5316982/ https://www.ncbi.nlm.nih.gov/pubmed/28216647 http://dx.doi.org/10.1038/srep41348 |
_version_ | 1782508931187212288 |
---|---|
author | Wang, Bo Wan, Lin Wang, Anqi Li, Lei M. |
author_facet | Wang, Bo Wan, Lin Wang, Anqi Li, Lei M. |
author_sort | Wang, Bo |
collection | PubMed |
description | Base-calling accuracy is crucial for high-throughput DNA sequencing and downstream analysis such as read mapping and genome assembly. Accordingly, we made an endeavor to reduce DNA sequencing errors of Illumina systems by correcting three kinds of crosstalk in the cluster intensity data. We discovered that signal crosstalk between adjacent clusters accounts for a large portion of sequencing errors in Illumina systems, even after correcting color crosstalk caused by the overlap of dye emission spectra and phasing/pre-phasing caused by out-of-step nucleotide synthesis. Interestingly and importantly, spatial crosstalk between adjacent clusters is cluster-specific and often asymmetric, which cannot be corrected by existing deconvolution methods. Therefore, we introduce a novel mathematical method able to estimate and remove spatial crosstalk, thereby reducing base-calling errors by 44–69% at a given mapping rate from Illumina systems. Furthermore, the resolution gained from this work provides new room for higher throughput of DNA sequencing and of general measurement systems using fluorescence-based imaging technology. The resulting base-caller 3Dec is available for academic users at http://github.com/flishwnag/3dec. Not only does it reduce 62.1% errors compared to the standard pipeline, but also its implementation is fast enough for daily sequencing. |
format | Online Article Text |
id | pubmed-5316982 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Nature Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-5316982 2017-02-24 An adaptive decorrelation method removes Illumina DNA base-calling errors caused by crosstalk between adjacent clusters Wang, Bo Wan, Lin Wang, Anqi Li, Lei M. Sci Rep Article Base-calling accuracy is crucial for high-throughput DNA sequencing and downstream analysis such as read mapping and genome assembly. Accordingly, we made an endeavor to reduce DNA sequencing errors of Illumina systems by correcting three kinds of crosstalk in the cluster intensity data. We discovered that signal crosstalk between adjacent clusters accounts for a large portion of sequencing errors in Illumina systems, even after correcting color crosstalk caused by the overlap of dye emission spectra and phasing/pre-phasing caused by out-of-step nucleotide synthesis. Interestingly and importantly, spatial crosstalk between adjacent clusters is cluster-specific and often asymmetric, which cannot be corrected by existing deconvolution methods. Therefore, we introduce a novel mathematical method able to estimate and remove spatial crosstalk, thereby reducing base-calling errors by 44–69% at a given mapping rate from Illumina systems. Furthermore, the resolution gained from this work provides new room for higher throughput of DNA sequencing and of general measurement systems using fluorescence-based imaging technology. The resulting base-caller 3Dec is available for academic users at http://github.com/flishwnag/3dec. Not only does it reduce 62.1% errors compared to the standard pipeline, but also its implementation is fast enough for daily sequencing. Nature Publishing Group 2017-02-20 /pmc/articles/PMC5316982/ /pubmed/28216647 http://dx.doi.org/10.1038/srep41348 Text en Copyright © 2017, The Author(s) http://creativecommons.org/licenses/by-nc-sa/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/4.0/ |
spellingShingle | Article Wang, Bo Wan, Lin Wang, Anqi Li, Lei M. An adaptive decorrelation method removes Illumina DNA base-calling errors caused by crosstalk between adjacent clusters |
title | An adaptive decorrelation method removes Illumina DNA base-calling errors caused by crosstalk between adjacent clusters |
title_full | An adaptive decorrelation method removes Illumina DNA base-calling errors caused by crosstalk between adjacent clusters |
title_fullStr | An adaptive decorrelation method removes Illumina DNA base-calling errors caused by crosstalk between adjacent clusters |
title_full_unstemmed | An adaptive decorrelation method removes Illumina DNA base-calling errors caused by crosstalk between adjacent clusters |
title_short | An adaptive decorrelation method removes Illumina DNA base-calling errors caused by crosstalk between adjacent clusters |
title_sort | adaptive decorrelation method removes illumina dna base-calling errors caused by crosstalk between adjacent clusters |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5316982/ https://www.ncbi.nlm.nih.gov/pubmed/28216647 http://dx.doi.org/10.1038/srep41348 |