
CNNdel: Calling Structural Variations on Low Coverage Data Based on Convolutional Neural Networks

Many structural variation (SV) detection methods have been proposed following the popularization of next-generation sequencing (NGS). These SV calling methods use different SV-property-dependent features; however, they all suffer from poor accuracy when run on low coverage sequences. The union of...
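The full abstract in this record says that CNNdel gathers SV candidates reported by multiple tools before extracting features and training a CNN. The gathering step can be sketched as an interval union. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Deletion` record, the `merge_candidates` helper, the tool labels, and the rule that any overlap merges two calls are all hypothetical.

```python
from typing import NamedTuple


class Deletion(NamedTuple):
    """One deletion call; the field layout is an assumption for this sketch."""
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    caller: str  # hypothetical label of the tool that reported the call


def merge_candidates(calls):
    """Union overlapping deletion calls per chromosome.

    Returns a list of dicts with the merged interval and the set of
    callers that supported it. Merging on any overlap is an assumed rule.
    """
    merged = []
    for c in sorted(calls, key=lambda x: (x.chrom, x.start, x.end)):
        last = merged[-1] if merged else None
        if last and last["chrom"] == c.chrom and c.start < last["end"]:
            # Overlaps the previous candidate: extend it and record the caller.
            last["end"] = max(last["end"], c.end)
            last["callers"].add(c.caller)
        else:
            merged.append(
                {"chrom": c.chrom, "start": c.start, "end": c.end,
                 "callers": {c.caller}}
            )
    return merged


calls = [
    Deletion("chr1", 100, 200, "toolA"),
    Deletion("chr1", 150, 260, "toolB"),
    Deletion("chr1", 500, 600, "toolA"),
    Deletion("chr2", 100, 200, "toolB"),
]
candidates = merge_candidates(calls)
```

Each merged candidate keeps the set of supporting callers, which is the kind of per-candidate information a downstream classifier could use alongside features taken from the aligned reads.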


Bibliographic Details
Main Authors: Wang, Jing; Ling, Cheng; Gao, Jingyang
Format: Online Article Text
Language: English
Published: Hindawi 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5467383/
https://www.ncbi.nlm.nih.gov/pubmed/28630866
http://dx.doi.org/10.1155/2017/6375059
author Wang, Jing
Ling, Cheng
Gao, Jingyang
collection PubMed
description Many structural variation (SV) detection methods have been proposed following the popularization of next-generation sequencing (NGS). These SV calling methods use different SV-property-dependent features; however, they all suffer from poor accuracy when run on low coverage sequences. The union of results from these tools achieves fairly high sensitivity but still produces low accuracy on low coverage sequence data. That is, the call sets of these methods contain many false positives. In this paper, we present CNNdel, an approach for calling deletions from paired-end reads. CNNdel gathers SV candidates reported by multiple tools and then extracts features from aligned BAM files at the positions of the candidates. With labeled feature-expressed candidates as a training set, CNNdel trains convolutional neural networks (CNNs) to distinguish true unlabeled candidates from false ones. Results show that CNNdel works well with NGS reads from 26 low coverage genomes of the 1000 Genomes Project. The paper demonstrates that convolutional neural networks can automatically assign priority to SV features and effectively reduce false positives.
format Online
Article
Text
id pubmed-5467383
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-5467383 2017-06-19 CNNdel: Calling Structural Variations on Low Coverage Data Based on Convolutional Neural Networks. Wang, Jing; Ling, Cheng; Gao, Jingyang. Biomed Res Int. Research Article. Hindawi 2017 2017-05-28 /pmc/articles/PMC5467383/ /pubmed/28630866 http://dx.doi.org/10.1155/2017/6375059 Text en Copyright © 2017 Jing Wang et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title CNNdel: Calling Structural Variations on Low Coverage Data Based on Convolutional Neural Networks
topic Research Article