DeepSimulator: a deep simulator for Nanopore sequencing

MOTIVATION: Oxford Nanopore sequencing is a rapidly developed sequencing technology in recent years. To keep pace with the explosion of the downstream data analytical tools, a versatile Nanopore sequencing simulator is needed to complement the experimental data as well as to benchmark those newly de...

Bibliographic Details
Main Authors: Li, Yu; Han, Renmin; Bi, Chongwei; Li, Mo; Wang, Sheng; Gao, Xin
Format: Online Article Text
Language: English
Published: Oxford University Press 2018
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6129308/
https://www.ncbi.nlm.nih.gov/pubmed/29659695
http://dx.doi.org/10.1093/bioinformatics/bty223
author Li, Yu
Han, Renmin
Bi, Chongwei
Li, Mo
Wang, Sheng
Gao, Xin
collection PubMed
description MOTIVATION: Oxford Nanopore sequencing is a rapidly developed sequencing technology in recent years. To keep pace with the explosion of the downstream data analytical tools, a versatile Nanopore sequencing simulator is needed to complement the experimental data as well as to benchmark those newly developed tools. However, all the currently available simulators are based on simple statistics of the produced reads, which have difficulty in capturing the complex nature of the Nanopore sequencing procedure, the main task of which is the generation of raw electrical current signals. RESULTS: Here we propose a deep learning based simulator, DeepSimulator, to mimic the entire pipeline of Nanopore sequencing. Starting from a given reference genome or assembled contigs, we simulate the electrical current signals by a context-dependent deep learning model, followed by a base-calling procedure to yield simulated reads. This workflow mimics the sequencing procedure more naturally. The thorough experiments performed across four species show that the signals generated by our context-dependent model are more similar to the experimentally obtained signals than the ones generated by the official context-independent pore model. In terms of the simulated reads, we provide a parameter interface to users so that they can obtain the reads with different accuracies ranging from 83 to 97%. The reads generated by the default parameter have almost the same properties as the real data. Two case studies demonstrate the application of DeepSimulator to benefit the development of tools in de novo assembly and in low coverage SNP detection. AVAILABILITY AND IMPLEMENTATION: The software can be accessed freely at: https://github.com/lykaust15/DeepSimulator. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-6129308
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6129308 2018-09-12 DeepSimulator: a deep simulator for Nanopore sequencing Li, Yu Han, Renmin Bi, Chongwei Li, Mo Wang, Sheng Gao, Xin Bioinformatics Original Papers Oxford University Press 2018-09-01 2018-04-06 /pmc/articles/PMC6129308/ /pubmed/29659695 http://dx.doi.org/10.1093/bioinformatics/bty223 Text en © The Author(s) 2018. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title DeepSimulator: a deep simulator for Nanopore sequencing
topic Original Papers