Causalcall: Nanopore Basecalling Using a Temporal Convolutional Network
Nanopore sequencing is promising because of its long read length and high speed. During sequencing, a strand of DNA/RNA passes through a biological nanopore, which causes the current in the pore to fluctuate. During basecalling, context-dependent current measurements are translated into the base seq...
Main authors: | Zeng, Jingwen; Cai, Hongmin; Peng, Hong; Wang, Haiyan; Zhang, Yue; Akutsu, Tatsuya |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2020 |
Subjects: | Genetics |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6984161/ https://www.ncbi.nlm.nih.gov/pubmed/32038706 http://dx.doi.org/10.3389/fgene.2019.01332 |
_version_ | 1783491608832901120 |
---|---|
author | Zeng, Jingwen Cai, Hongmin Peng, Hong Wang, Haiyan Zhang, Yue Akutsu, Tatsuya |
author_facet | Zeng, Jingwen Cai, Hongmin Peng, Hong Wang, Haiyan Zhang, Yue Akutsu, Tatsuya |
author_sort | Zeng, Jingwen |
collection | PubMed |
description | Nanopore sequencing is promising because of its long read length and high speed. During sequencing, a strand of DNA/RNA passes through a biological nanopore, which causes the current in the pore to fluctuate. During basecalling, context-dependent current measurements are translated into the base sequence of the DNA/RNA strand. Accurate and fast basecalling is vital for downstream analyses such as genome assembly and detecting single-nucleotide polymorphisms and genomic structural variants. However, owing to the various changes in DNA/RNA molecules, noise during sequencing, and limitations of basecalling methods, accurate basecalling remains a challenge. In this paper, we propose Causalcall, which uses an end-to-end temporal convolution-based deep learning model for accurate and fast nanopore basecalling. Developed on a temporal convolutional network (TCN) and a connectionist temporal classification decoder, Causalcall directly identifies base sequences of varying lengths from current measurements in long time series. In contrast to the basecalling models using recurrent neural networks (RNNs), the convolution-based model of Causalcall can speed up basecalling by matrix computation. Experiments on multiple species have demonstrated the great potential of the TCN-based model to improve basecalling accuracy and speed when compared to an RNN-based model. Besides, experiments on genome assembly indicate the utility of Causalcall in reference-based genome assembly. |
format | Online Article Text |
id | pubmed-6984161 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-69841612020-02-07 Causalcall: Nanopore Basecalling Using a Temporal Convolutional Network Zeng, Jingwen Cai, Hongmin Peng, Hong Wang, Haiyan Zhang, Yue Akutsu, Tatsuya Front Genet Genetics Nanopore sequencing is promising because of its long read length and high speed. During sequencing, a strand of DNA/RNA passes through a biological nanopore, which causes the current in the pore to fluctuate. During basecalling, context-dependent current measurements are translated into the base sequence of the DNA/RNA strand. Accurate and fast basecalling is vital for downstream analyses such as genome assembly and detecting single-nucleotide polymorphisms and genomic structural variants. However, owing to the various changes in DNA/RNA molecules, noise during sequencing, and limitations of basecalling methods, accurate basecalling remains a challenge. In this paper, we propose Causalcall, which uses an end-to-end temporal convolution-based deep learning model for accurate and fast nanopore basecalling. Developed on a temporal convolutional network (TCN) and a connectionist temporal classification decoder, Causalcall directly identifies base sequences of varying lengths from current measurements in long time series. In contrast to the basecalling models using recurrent neural networks (RNNs), the convolution-based model of Causalcall can speed up basecalling by matrix computation. Experiments on multiple species have demonstrated the great potential of the TCN-based model to improve basecalling accuracy and speed when compared to an RNN-based model. Besides, experiments on genome assembly indicate the utility of Causalcall in reference-based genome assembly. Frontiers Media S.A. 2020-01-20 /pmc/articles/PMC6984161/ /pubmed/32038706 http://dx.doi.org/10.3389/fgene.2019.01332 Text en Copyright © 2020 Zeng, Cai, Peng, Wang, Zhang and Akutsu http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Zeng, Jingwen Cai, Hongmin Peng, Hong Wang, Haiyan Zhang, Yue Akutsu, Tatsuya Causalcall: Nanopore Basecalling Using a Temporal Convolutional Network |
title | Causalcall: Nanopore Basecalling Using a Temporal Convolutional Network |
title_full | Causalcall: Nanopore Basecalling Using a Temporal Convolutional Network |
title_fullStr | Causalcall: Nanopore Basecalling Using a Temporal Convolutional Network |
title_full_unstemmed | Causalcall: Nanopore Basecalling Using a Temporal Convolutional Network |
title_short | Causalcall: Nanopore Basecalling Using a Temporal Convolutional Network |
title_sort | causalcall: nanopore basecalling using a temporal convolutional network |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6984161/ https://www.ncbi.nlm.nih.gov/pubmed/32038706 http://dx.doi.org/10.3389/fgene.2019.01332 |
work_keys_str_mv | AT zengjingwen causalcallnanoporebasecallingusingatemporalconvolutionalnetwork AT caihongmin causalcallnanoporebasecallingusingatemporalconvolutionalnetwork AT penghong causalcallnanoporebasecallingusingatemporalconvolutionalnetwork AT wanghaiyan causalcallnanoporebasecallingusingatemporalconvolutionalnetwork AT zhangyue causalcallnanoporebasecallingusingatemporalconvolutionalnetwork AT akutsutatsuya causalcallnanoporebasecallingusingatemporalconvolutionalnetwork |
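
The description field in this record names two building blocks: a temporal convolutional network, whose convolutions are causal (each output depends only on current and past samples), and a connectionist temporal classification (CTC) decoder, which turns per-frame labels into a base sequence of variable length. The following is a minimal, generic sketch of those two ideas in plain Python. It is not the Causalcall implementation: the function names, the label order in `ALPHABET`, the blank index, and the use of greedy (best-path) decoding are all assumptions made for illustration.

```python
from typing import List, Sequence

BLANK = 0            # assumed index of the CTC blank label
ALPHABET = "-ACGT"   # hypothetical label order: blank, then the four DNA bases


def causal_dilated_conv1d(
    signal: Sequence[float], kernel: Sequence[float], dilation: int = 1
) -> List[float]:
    """Causal 1D convolution: out[t] uses signal[t], signal[t - d], signal[t - 2d], ..."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            idx = t - k * dilation   # look back only, never forward
            if idx >= 0:             # samples before the start count as zero
                acc += weight * signal[idx]
        out.append(acc)
    return out


def ctc_greedy_decode(frame_labels: Sequence[int]) -> str:
    """Best-path CTC decoding: collapse repeated labels, then drop blanks."""
    bases = []
    prev = None
    for label in frame_labels:
        if label != prev and label != BLANK:
            bases.append(ALPHABET[label])
        prev = label
    return "".join(bases)


if __name__ == "__main__":
    # Four current samples, kernel of width 2, dilation 2.
    print(causal_dilated_conv1d([1.0, 2.0, 3.0, 4.0], [1.0, 1.0], dilation=2))
    # Nine frames collapse to a four-base sequence.
    print(ctc_greedy_decode([1, 1, 0, 1, 2, 2, 0, 0, 4]))
```

Because every output position in the convolution is computed independently of the others, all positions can be evaluated in parallel as a matrix operation, which is the speed argument the description makes against recurrent models. The blank label in the decoder is what lets two identical bases in a row (the `A A` above) survive the collapse step.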