
Halcyon: an accurate basecaller exploiting an encoder–decoder model with monotonic attention

MOTIVATION: In recent years, nanopore sequencing technology has enabled inexpensive long-read sequencing, which promises reads longer than a few thousand bases. Such long-read sequences contribute to the precise detection of structural variations and accurate haplotype phasing. However, deciphering...

Bibliographic Details
Main authors: Konishi, Hiroki; Yamaguchi, Rui; Yamaguchi, Kiyoshi; Furukawa, Yoichi; Imoto, Seiya
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8189681/
https://www.ncbi.nlm.nih.gov/pubmed/33165508
http://dx.doi.org/10.1093/bioinformatics/btaa953
_version_ 1783705535856508928
author Konishi, Hiroki
Yamaguchi, Rui
Yamaguchi, Kiyoshi
Furukawa, Yoichi
Imoto, Seiya
collection PubMed
description MOTIVATION: In recent years, nanopore sequencing technology has enabled inexpensive long-read sequencing, which promises reads longer than a few thousand bases. Such long-read sequences contribute to the precise detection of structural variations and accurate haplotype phasing. However, deciphering precise DNA sequences from noisy and complicated nanopore raw signals remains a crucial demand for downstream analyses based on higher-quality nanopore sequencing, although various basecallers have been introduced to date. RESULTS: To address this need, we developed a novel basecaller, Halcyon, that incorporates neural-network techniques frequently used in the field of machine translation. Our model employs monotonic-attention mechanisms to learn semantic correspondences between nucleotides and signal levels without any pre-segmentation against input signals. We evaluated performance with a human whole-genome sequencing dataset and demonstrated that Halcyon outperformed existing third-party basecallers and achieved competitive performance against the latest Oxford Nanopore Technologies’ basecallers. AVAILABILITY AND IMPLEMENTATION: The source code (halcyon) can be found at https://github.com/relastle/halcyon.
format Online
Article
Text
id pubmed-8189681
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8189681 2021-06-10 Halcyon: an accurate basecaller exploiting an encoder–decoder model with monotonic attention Konishi, Hiroki Yamaguchi, Rui Yamaguchi, Kiyoshi Furukawa, Yoichi Imoto, Seiya Bioinformatics Original Papers MOTIVATION: In recent years, nanopore sequencing technology has enabled inexpensive long-read sequencing, which promises reads longer than a few thousand bases. Such long-read sequences contribute to the precise detection of structural variations and accurate haplotype phasing. However, deciphering precise DNA sequences from noisy and complicated nanopore raw signals remains a crucial demand for downstream analyses based on higher-quality nanopore sequencing, although various basecallers have been introduced to date. RESULTS: To address this need, we developed a novel basecaller, Halcyon, that incorporates neural-network techniques frequently used in the field of machine translation. Our model employs monotonic-attention mechanisms to learn semantic correspondences between nucleotides and signal levels without any pre-segmentation against input signals. We evaluated performance with a human whole-genome sequencing dataset and demonstrated that Halcyon outperformed existing third-party basecallers and achieved competitive performance against the latest Oxford Nanopore Technologies’ basecallers. AVAILABILITY AND IMPLEMENTATION: The source code (halcyon) can be found at https://github.com/relastle/halcyon. Oxford University Press 2020-12-07 /pmc/articles/PMC8189681/ /pubmed/33165508 http://dx.doi.org/10.1093/bioinformatics/btaa953 Text en © The Author(s) 2020. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Halcyon: an accurate basecaller exploiting an encoder–decoder model with monotonic attention
topic Original Papers