A multi-task convolutional deep neural network for variant calling in single molecule sequencing

The accurate identification of DNA sequence variants is an important, but challenging task in genomics. It is particularly difficult for single molecule sequencing, which has a per-nucleotide error rate of ~5–15%. Meeting this demand, we developed Clairvoyante, a multi-task five-layer convolutional...

Bibliographic Details
Main Authors:	Luo, Ruibang; Sedlazeck, Fritz J.; Lam, Tak-Wah; Schatz, Michael C.
Format:	Online Article Text
Language:	English
Published:	Nature Publishing Group UK 2019
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6397153/ https://www.ncbi.nlm.nih.gov/pubmed/30824707 http://dx.doi.org/10.1038/s41467-019-09025-z

_version_	1783399368947138560
author	Luo, Ruibang; Sedlazeck, Fritz J.; Lam, Tak-Wah; Schatz, Michael C.
author_facet	Luo, Ruibang; Sedlazeck, Fritz J.; Lam, Tak-Wah; Schatz, Michael C.
author_sort	Luo, Ruibang
collection	PubMed
description	The accurate identification of DNA sequence variants is an important, but challenging task in genomics. It is particularly difficult for single molecule sequencing, which has a per-nucleotide error rate of ~5–15%. Meeting this demand, we developed Clairvoyante, a multi-task five-layer convolutional neural network model for predicting variant type (SNP or indel), zygosity, alternative allele and indel length from aligned reads. For the well-characterized NA12878 human sample, Clairvoyante achieves 99.67, 95.78, 90.53% F1-score on 1KP common variants, and 98.65, 92.57, 87.26% F1-score for whole-genome analysis, using Illumina, PacBio, and Oxford Nanopore data, respectively. Training on a second human sample shows Clairvoyante is sample agnostic and finds variants in less than 2 h on a standard server. Furthermore, we present 3,135 variants that are missed using Illumina but supported independently by both PacBio and Oxford Nanopore reads. Clairvoyante is available open-source (https://github.com/aquaskyline/Clairvoyante), with modules to train, utilize and visualize the model.
format	Online Article Text
id	pubmed-6397153
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	Nature Publishing Group UK
record_format	MEDLINE/PubMed
spelling	pubmed-6397153 2019-03-04 Nat Commun Article Nature Publishing Group UK 2019-03-01 /pmc/articles/PMC6397153/ /pubmed/30824707 http://dx.doi.org/10.1038/s41467-019-09025-z Text en © The Author(s) 2019 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title	A multi-task convolutional deep neural network for variant calling in single molecule sequencing
title_sort	multi-task convolutional deep neural network for variant calling in single molecule sequencing
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6397153/ https://www.ncbi.nlm.nih.gov/pubmed/30824707 http://dx.doi.org/10.1038/s41467-019-09025-z
