DeepSV: accurate calling of genomic deletions from high-throughput sequencing data using deep convolutional neural network

BACKGROUND: Calling genetic variations from sequence reads is an important problem in genomics. There are many existing methods for calling various types of variations. Recently, Google developed a method for calling single nucleotide polymorphisms (SNPs) based on deep learning. Their method visuali...

Bibliographic Details
Main authors: Cai, Lei, Wu, Yufeng, Gao, Jingyang
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6909530/
https://www.ncbi.nlm.nih.gov/pubmed/31830921
http://dx.doi.org/10.1186/s12859-019-3299-y
author Cai, Lei
Wu, Yufeng
Gao, Jingyang
collection PubMed
description BACKGROUND: Calling genetic variations from sequence reads is an important problem in genomics. There are many existing methods for calling various types of variations. Recently, Google developed a method for calling single nucleotide polymorphisms (SNPs) based on deep learning. Their method visualizes sequence reads in the form of images. These images are then used to train a deep neural network model, which is used to call SNPs. This raises a research question: can deep learning be used to call more complex genetic variations such as structural variations (SVs) from sequence data? RESULTS: In this paper, we extend this high-level approach to the problem of calling structural variations. We present DeepSV, an approach based on deep learning for calling long deletions from sequence reads. DeepSV is based on a novel method of visualizing sequence reads. The visualization is designed to capture multiple sources of information in the sequence data that are relevant to long deletions. DeepSV also implements techniques for working with noisy training data. DeepSV trains a model from the visualized sequence reads and calls deletions based on this model. We demonstrate that DeepSV outperforms existing methods in terms of accuracy and efficiency of deletion calling on the data from the 1000 Genomes Project. CONCLUSIONS: Our work shows that deep learning can potentially lead to effective calling of different types of genetic variations that are more complex than SNPs.
format Online
Article
Text
id pubmed-6909530
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6909530 2019-12-19 DeepSV: accurate calling of genomic deletions from high-throughput sequencing data using deep convolutional neural network Cai, Lei Wu, Yufeng Gao, Jingyang BMC Bioinformatics Research Article BioMed Central 2019-12-12 /pmc/articles/PMC6909530/ /pubmed/31830921 http://dx.doi.org/10.1186/s12859-019-3299-y Text en © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title DeepSV: accurate calling of genomic deletions from high-throughput sequencing data using deep convolutional neural network
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6909530/
https://www.ncbi.nlm.nih.gov/pubmed/31830921
http://dx.doi.org/10.1186/s12859-019-3299-y