
Ensemble of Deep Recurrent Neural Networks for Identifying Enhancers via Dinucleotide Physicochemical Properties

Enhancers are short deoxyribonucleic acid (DNA) fragments that play an important role in gene expression. Because they may lie far from the gene they act on, identifying enhancers is difficult. There are many published works focused on identif...


Bibliographic Details
Main Authors: Tan, Kok Keng; Le, Nguyen Quoc Khanh; Yeh, Hui-Yuan; Chua, Matthew Chin Heng
Format: Online Article Text
Language: English
Published: MDPI 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6678823/
https://www.ncbi.nlm.nih.gov/pubmed/31340596
http://dx.doi.org/10.3390/cells8070767
_version_ 1783441193338667008
author Tan, Kok Keng
Le, Nguyen Quoc Khanh
Yeh, Hui-Yuan
Chua, Matthew Chin Heng
collection PubMed
description Enhancers are short deoxyribonucleic acid (DNA) fragments that play an important role in gene expression. Because they may lie far from the gene they act on, identifying enhancers is difficult. Many published works have focused on identifying enhancers from their sequence information; however, their performance still leaves room for improvement. Using deep learning methods, this study proposes an ensemble of classifiers based on deep recurrent neural networks for predicting enhancers. The input features of the deep ensemble networks were generated from six types of dinucleotide physicochemical properties, which outperformed the other features. In summary, our ensemble model identified enhancers with a sensitivity of 75.5%, specificity of 76%, accuracy of 75.5%, and MCC of 0.51. For classifying enhancers as strong or weak, our model reached a sensitivity of 83.15%, specificity of 45.61%, accuracy of 68.49%, and MCC of 0.312. Compared with the benchmark result, our results were better on most evaluation metrics. The results show that deep model ensembles have the potential to improve on the best results achieved to date with shallow machine learning methods.
format Online
Article
Text
id pubmed-6678823
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-6678823 2019-08-19 Cells Article MDPI 2019-07-23 /pmc/articles/PMC6678823/ /pubmed/31340596 http://dx.doi.org/10.3390/cells8070767 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Ensemble of Deep Recurrent Neural Networks for Identifying Enhancers via Dinucleotide Physicochemical Properties
topic Article