
Causalcall: Nanopore Basecalling Using a Temporal Convolutional Network

Nanopore sequencing is promising because of its long read length and high speed. During sequencing, a strand of DNA/RNA passes through a biological nanopore, which causes the current in the pore to fluctuate. During basecalling, context-dependent current measurements are translated into the base seq...


Bibliographic Details
Main authors: Zeng, Jingwen; Cai, Hongmin; Peng, Hong; Wang, Haiyan; Zhang, Yue; Akutsu, Tatsuya
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2020
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6984161/
https://www.ncbi.nlm.nih.gov/pubmed/32038706
http://dx.doi.org/10.3389/fgene.2019.01332
_version_ 1783491608832901120
author Zeng, Jingwen
Cai, Hongmin
Peng, Hong
Wang, Haiyan
Zhang, Yue
Akutsu, Tatsuya
collection PubMed
description Nanopore sequencing is promising because of its long read length and high speed. During sequencing, a strand of DNA/RNA passes through a biological nanopore, which causes the current in the pore to fluctuate. During basecalling, context-dependent current measurements are translated into the base sequence of the DNA/RNA strand. Accurate and fast basecalling is vital for downstream analyses such as genome assembly and detecting single-nucleotide polymorphisms and genomic structural variants. However, owing to the various changes in DNA/RNA molecules, noise during sequencing, and limitations of basecalling methods, accurate basecalling remains a challenge. In this paper, we propose Causalcall, which uses an end-to-end temporal convolution-based deep learning model for accurate and fast nanopore basecalling. Developed on a temporal convolutional network (TCN) and a connectionist temporal classification decoder, Causalcall directly identifies base sequences of varying lengths from current measurements in long time series. In contrast to the basecalling models using recurrent neural networks (RNNs), the convolution-based model of Causalcall can speed up basecalling by matrix computation. Experiments on multiple species have demonstrated the great potential of the TCN-based model to improve basecalling accuracy and speed when compared to an RNN-based model. Besides, experiments on genome assembly indicate the utility of Causalcall in reference-based genome assembly.
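The description names two building blocks: a temporal (causal) convolution over the current signal and a connectionist temporal classification (CTC) decoder that yields a base sequence of variable length. The following is a minimal sketch of both ideas in plain NumPy, not the Causalcall implementation. The function names, the label order in `ALPHABET`, the blank index, and the use of greedy decoding are all assumptions made for illustration; the record does not specify them.

```python
import numpy as np

BLANK = 0            # assumed index of the CTC blank label
ALPHABET = "-ACGT"   # hypothetical label order: blank first, then the four bases


def causal_dilated_conv(x, w, dilation=1):
    """Compute y[t] = sum_k w[k] * x[t - k*dilation], treating x as zero before t = 0."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    pad = (len(w) - 1) * dilation
    # Pad on the left only, so no output depends on a future sample.
    xp = np.concatenate([np.zeros(pad), x])
    y = np.zeros(len(x))
    for k, wk in enumerate(w):
        start = pad - k * dilation
        y += wk * xp[start:start + len(x)]
    return y


def ctc_greedy_decode(label_ids):
    """Collapse runs of repeated labels, then drop blanks (greedy CTC decoding)."""
    out = []
    prev = None
    for i in label_ids:
        if i != prev and i != BLANK:
            out.append(ALPHABET[i])
        prev = i
    return "".join(out)


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0]
    print(causal_dilated_conv(signal, [1.0, 1.0], dilation=2))
    print(ctc_greedy_decode([1, 1, 0, 1, 2, 2, 0, 0, 3]))
```

Because each output of the convolution depends only on a fixed window of past samples, all time steps can be computed together as array operations, which is the property the description contrasts with step-by-step recurrent models.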
format Online
Article
Text
id pubmed-6984161
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-6984161 2020-02-07 Front Genet Genetics Frontiers Media S.A. 2020-01-20 /pmc/articles/PMC6984161/ /pubmed/32038706 http://dx.doi.org/10.3389/fgene.2019.01332 Text en Copyright © 2020 Zeng, Cai, Peng, Wang, Zhang and Akutsu http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Causalcall: Nanopore Basecalling Using a Temporal Convolutional Network
topic Genetics