A multi-task convolutional deep neural network for variant calling in single molecule sequencing
The accurate identification of DNA sequence variants is an important, but challenging task in genomics. It is particularly difficult for single molecule sequencing, which has a per-nucleotide error rate of ~5–15%. Meeting this demand, we developed Clairvoyante, a multi-task five-layer convolutional...
Main authors: | Luo, Ruibang; Sedlazeck, Fritz J.; Lam, Tak-Wah; Schatz, Michael C. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6397153/ https://www.ncbi.nlm.nih.gov/pubmed/30824707 http://dx.doi.org/10.1038/s41467-019-09025-z |
_version_ | 1783399368947138560 |
---|---|
author | Luo, Ruibang Sedlazeck, Fritz J. Lam, Tak-Wah Schatz, Michael C. |
author_facet | Luo, Ruibang Sedlazeck, Fritz J. Lam, Tak-Wah Schatz, Michael C. |
author_sort | Luo, Ruibang |
collection | PubMed |
description | The accurate identification of DNA sequence variants is an important, but challenging task in genomics. It is particularly difficult for single molecule sequencing, which has a per-nucleotide error rate of ~5–15%. Meeting this demand, we developed Clairvoyante, a multi-task five-layer convolutional neural network model for predicting variant type (SNP or indel), zygosity, alternative allele and indel length from aligned reads. For the well-characterized NA12878 human sample, Clairvoyante achieves 99.67, 95.78, 90.53% F1-score on 1KP common variants, and 98.65, 92.57, 87.26% F1-score for whole-genome analysis, using Illumina, PacBio, and Oxford Nanopore data, respectively. Training on a second human sample shows Clairvoyante is sample agnostic and finds variants in less than 2 h on a standard server. Furthermore, we present 3,135 variants that are missed using Illumina but supported independently by both PacBio and Oxford Nanopore reads. Clairvoyante is available open-source (https://github.com/aquaskyline/Clairvoyante), with modules to train, utilize and visualize the model. |
format | Online Article Text |
id | pubmed-6397153 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-63971532019-03-04 A multi-task convolutional deep neural network for variant calling in single molecule sequencing Luo, Ruibang Sedlazeck, Fritz J. Lam, Tak-Wah Schatz, Michael C. Nat Commun Article The accurate identification of DNA sequence variants is an important, but challenging task in genomics. It is particularly difficult for single molecule sequencing, which has a per-nucleotide error rate of ~5–15%. Meeting this demand, we developed Clairvoyante, a multi-task five-layer convolutional neural network model for predicting variant type (SNP or indel), zygosity, alternative allele and indel length from aligned reads. For the well-characterized NA12878 human sample, Clairvoyante achieves 99.67, 95.78, 90.53% F1-score on 1KP common variants, and 98.65, 92.57, 87.26% F1-score for whole-genome analysis, using Illumina, PacBio, and Oxford Nanopore data, respectively. Training on a second human sample shows Clairvoyante is sample agnostic and finds variants in less than 2 h on a standard server. Furthermore, we present 3,135 variants that are missed using Illumina but supported independently by both PacBio and Oxford Nanopore reads. Clairvoyante is available open-source (https://github.com/aquaskyline/Clairvoyante), with modules to train, utilize and visualize the model. Nature Publishing Group UK 2019-03-01 /pmc/articles/PMC6397153/ /pubmed/30824707 http://dx.doi.org/10.1038/s41467-019-09025-z Text en © The Author(s) 2019 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Luo, Ruibang Sedlazeck, Fritz J. Lam, Tak-Wah Schatz, Michael C. A multi-task convolutional deep neural network for variant calling in single molecule sequencing |
title | A multi-task convolutional deep neural network for variant calling in single molecule sequencing |
title_full | A multi-task convolutional deep neural network for variant calling in single molecule sequencing |
title_fullStr | A multi-task convolutional deep neural network for variant calling in single molecule sequencing |
title_full_unstemmed | A multi-task convolutional deep neural network for variant calling in single molecule sequencing |
title_short | A multi-task convolutional deep neural network for variant calling in single molecule sequencing |
title_sort | multi-task convolutional deep neural network for variant calling in single molecule sequencing |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6397153/ https://www.ncbi.nlm.nih.gov/pubmed/30824707 http://dx.doi.org/10.1038/s41467-019-09025-z |
work_keys_str_mv | AT luoruibang amultitaskconvolutionaldeepneuralnetworkforvariantcallinginsinglemoleculesequencing AT sedlazeckfritzj amultitaskconvolutionaldeepneuralnetworkforvariantcallinginsinglemoleculesequencing AT lamtakwah amultitaskconvolutionaldeepneuralnetworkforvariantcallinginsinglemoleculesequencing AT schatzmichaelc amultitaskconvolutionaldeepneuralnetworkforvariantcallinginsinglemoleculesequencing AT luoruibang multitaskconvolutionaldeepneuralnetworkforvariantcallinginsinglemoleculesequencing AT sedlazeckfritzj multitaskconvolutionaldeepneuralnetworkforvariantcallinginsinglemoleculesequencing AT lamtakwah multitaskconvolutionaldeepneuralnetworkforvariantcallinginsinglemoleculesequencing AT schatzmichaelc multitaskconvolutionaldeepneuralnetworkforvariantcallinginsinglemoleculesequencing |