DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences

Modeling the properties and functions of DNA sequences is an important but challenging task in the broad field of genomics. This task is particularly difficult for non-coding DNA, the vast majority of which is still poorly understood in terms of function. A powerful predictive model for the functio...

Bibliographic Details
Main Authors: Quang, Daniel; Xie, Xiaohui
Format: Online Article Text
Language: English
Published: Oxford University Press 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4914104/
https://www.ncbi.nlm.nih.gov/pubmed/27084946
http://dx.doi.org/10.1093/nar/gkw226
_version_ 1782438510588854272
author Quang, Daniel
Xie, Xiaohui
author_facet Quang, Daniel
Xie, Xiaohui
author_sort Quang, Daniel
collection PubMed
description Modeling the properties and functions of DNA sequences is an important but challenging task in the broad field of genomics. This task is particularly difficult for non-coding DNA, the vast majority of which is still poorly understood in terms of function. A powerful predictive model for the function of non-coding DNA can have enormous benefit for both basic science and translational research because over 98% of the human genome is non-coding and 93% of disease-associated variants lie in these regions. To address this need, we propose DanQ, a novel hybrid convolutional and bi-directional long short-term memory recurrent neural network framework for predicting non-coding function de novo from sequence. In the DanQ model, the convolution layer captures regulatory motifs, while the recurrent layer captures long-term dependencies between the motifs in order to learn a regulatory ‘grammar’ to improve predictions. DanQ improves considerably upon other models across several metrics. For some regulatory markers, DanQ can achieve over a 50% relative improvement in the area under the precision-recall curve metric compared to related models. We have made the source code available at the GitHub repository http://github.com/uci-cbcl/DanQ.
format Online
Article
Text
id pubmed-4914104
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-4914104 2016-06-22 DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences Quang, Daniel Xie, Xiaohui Nucleic Acids Res Methods Online Oxford University Press 2016-06-20 2016-04-15 /pmc/articles/PMC4914104/ /pubmed/27084946 http://dx.doi.org/10.1093/nar/gkw226 Text en © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
spellingShingle Methods Online
Quang, Daniel
Xie, Xiaohui
DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences
title DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences
title_sort danq: a hybrid convolutional and recurrent deep neural network for quantifying the function of dna sequences
topic Methods Online
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4914104/
https://www.ncbi.nlm.nih.gov/pubmed/27084946
http://dx.doi.org/10.1093/nar/gkw226
work_keys_str_mv AT quangdaniel danqahybridconvolutionalandrecurrentdeepneuralnetworkforquantifyingthefunctionofdnasequences
AT xiexiaohui danqahybridconvolutionalandrecurrentdeepneuralnetworkforquantifyingthefunctionofdnasequences
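
The abstract states that the convolution layer captures regulatory motifs. As a rough illustration of that one idea, the sketch below one-hot encodes a DNA string and slides a single hand-written filter along it, followed by ReLU and max pooling. It is a toy in plain Python, not the DanQ implementation: the function names, the "TATA" filter, and the pooling size are all assumptions made for this example, and the recurrent (bi-directional LSTM) stage described in the abstract is not modeled here.

```python
# Toy sketch (not the DanQ implementation): one-hot encoding plus a single
# convolution filter scanned along a DNA sequence. All names are hypothetical.

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of 4 floats in A, C, G, T order.

    Characters outside ACGT (for example N) become an all-zero row.
    """
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def conv1d_relu(encoded, kernel):
    """Slide `kernel` (width x 4 weights) over the sequence, then apply ReLU."""
    width = len(kernel)
    out = []
    for start in range(len(encoded) - width + 1):
        score = sum(
            encoded[start + i][j] * kernel[i][j]
            for i in range(width)
            for j in range(4)
        )
        out.append(max(0.0, score))
    return out


def max_pool(values, size):
    """Non-overlapping max pooling; a trailing partial window is kept."""
    return [max(values[i:i + size]) for i in range(0, len(values), size)]


# Assumed filter: +1 where the base matches the motif "TATA", -1 otherwise.
# A trained network would learn such weights from data instead.
TATA_KERNEL = [[1.0 if b == m else -1.0 for b in BASES] for m in "TATA"]

if __name__ == "__main__":
    seq = "GGCTATAAGC"
    activations = conv1d_relu(one_hot(seq), TATA_KERNEL)
    print(activations)
    print(max_pool(activations, 4))
```

For the sequence `GGCTATAAGC`, the only non-zero activation is at index 3, where `TATA` begins. In a hybrid model of the kind the abstract describes, pooled activations like these would be passed on to the recurrent layer, which is where dependencies between motifs are learned.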