Sequencing DNA with nanopores: Troubles and biases

Oxford Nanopore Technologies’ (ONT) long read sequencers offer access to longer DNA fragments than previous sequencer generations, at the cost of a higher error rate. While many papers have studied read correction methods, few have addressed the detailed characterization of observed errors, a task c...

Bibliographic Details
Main authors: Delahaye, Clara; Nicolas, Jacques
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8486125/
https://www.ncbi.nlm.nih.gov/pubmed/34597327
http://dx.doi.org/10.1371/journal.pone.0257521
author Delahaye, Clara
Nicolas, Jacques
author_facet Delahaye, Clara
Nicolas, Jacques
author_sort Delahaye, Clara
collection PubMed
description Oxford Nanopore Technologies’ (ONT) long read sequencers offer access to longer DNA fragments than previous sequencer generations, at the cost of a higher error rate. While many papers have studied read correction methods, few have addressed the detailed characterization of observed errors, a task complicated by frequent changes in chemistry and software in ONT technology. The MinION sequencer is now more stable and this paper proposes an up-to-date view of its error landscape, using the most mature flowcell and basecaller. We studied Nanopore sequencing error biases on both bacterial and human DNA reads. We found that, although Nanopore sequencing is expected not to suffer from GC bias, GC content is a crucial parameter with respect to errors. In particular, low-GC reads have fewer errors than high-GC reads (about 6% and 8% respectively). The error profile for homopolymeric regions or regions with short repeats, the source of about half of all sequencing errors, also depends on the GC rate and mainly shows deletions, although there are some reads with long insertions. Another interesting finding is that the quality measure, although over-estimated, offers valuable information to predict the error rate as well as the abundance of reads. We supplemented this study with an analysis of a rapeseed RNA read set and showed a higher level of errors, with a higher level of deletions, in these data. Finally, we have implemented an open source pipeline for long-term monitoring of the error profile, which enables users to easily compute the various analyses presented in this work, including for future developments of the sequencing device. Overall, we hope this work will provide a basis for the design of better error-correction methods.
format Online
Article
Text
id pubmed-8486125
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8486125 2021-10-02 Sequencing DNA with nanopores: Troubles and biases Delahaye, Clara Nicolas, Jacques PLoS One Research Article Oxford Nanopore Technologies’ (ONT) long read sequencers offer access to longer DNA fragments than previous sequencer generations, at the cost of a higher error rate. While many papers have studied read correction methods, few have addressed the detailed characterization of observed errors, a task complicated by frequent changes in chemistry and software in ONT technology. The MinION sequencer is now more stable and this paper proposes an up-to-date view of its error landscape, using the most mature flowcell and basecaller. We studied Nanopore sequencing error biases on both bacterial and human DNA reads. We found that, although Nanopore sequencing is expected not to suffer from GC bias, GC content is a crucial parameter with respect to errors. In particular, low-GC reads have fewer errors than high-GC reads (about 6% and 8% respectively). The error profile for homopolymeric regions or regions with short repeats, the source of about half of all sequencing errors, also depends on the GC rate and mainly shows deletions, although there are some reads with long insertions. Another interesting finding is that the quality measure, although over-estimated, offers valuable information to predict the error rate as well as the abundance of reads. We supplemented this study with an analysis of a rapeseed RNA read set and showed a higher level of errors, with a higher level of deletions, in these data. Finally, we have implemented an open source pipeline for long-term monitoring of the error profile, which enables users to easily compute the various analyses presented in this work, including for future developments of the sequencing device. Overall, we hope this work will provide a basis for the design of better error-correction methods.
Public Library of Science 2021-10-01 /pmc/articles/PMC8486125/ /pubmed/34597327 http://dx.doi.org/10.1371/journal.pone.0257521 Text en © 2021 Delahaye, Nicolas https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Delahaye, Clara
Nicolas, Jacques
Sequencing DNA with nanopores: Troubles and biases
title Sequencing DNA with nanopores: Troubles and biases
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8486125/
https://www.ncbi.nlm.nih.gov/pubmed/34597327
http://dx.doi.org/10.1371/journal.pone.0257521