iEnhancer-ECNN: identifying enhancers and their strength using ensembles of convolutional neural networks

BACKGROUND: Enhancers are non-coding DNA fragments which are crucial in gene regulation (e.g. transcription and translation). Having high locational variation and free scattering in 98% of non-encoding genomes, enhancer identification is, therefore, more complicated than other genetic factors. To ad...

Bibliographic Details
Main Authors:	Nguyen, Quang H.; Nguyen-Vo, Thanh-Hoang; Le, Nguyen Quoc Khanh; Do, Trang T.T.; Rahardja, Susanto; Nguyen, Binh P.
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2019
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6929481/ https://www.ncbi.nlm.nih.gov/pubmed/31874637 http://dx.doi.org/10.1186/s12864-019-6336-3

_version_	1783482710585507840
author	Nguyen, Quang H.; Nguyen-Vo, Thanh-Hoang; Le, Nguyen Quoc Khanh; Do, Trang T.T.; Rahardja, Susanto; Nguyen, Binh P.
author_facet	Nguyen, Quang H.; Nguyen-Vo, Thanh-Hoang; Le, Nguyen Quoc Khanh; Do, Trang T.T.; Rahardja, Susanto; Nguyen, Binh P.
author_sort	Nguyen, Quang H.
collection	PubMed
description	BACKGROUND: Enhancers are non-coding DNA fragments that are crucial in gene regulation (e.g. transcription and translation). Because enhancers show high locational variation and are freely scattered in 98% of non-encoding genomes, identifying them is more complicated than identifying other genetic factors. To address this biological issue, several in silico studies have sought to identify and classify enhancer sequences among a myriad of DNA sequences using computational advances. Although recent studies have improved performance, shortfalls in these learning models remain. To overcome the limitations of existing learning models, we introduce iEnhancer-ECNN, an efficient prediction framework that uses one-hot encoding and k-mers for data transformation and ensembles of convolutional neural networks for model construction, to identify enhancers and classify their strength. The benchmark dataset from Liu et al.’s study was used to develop and evaluate the ensemble models. A comparative analysis between iEnhancer-ECNN and existing state-of-the-art methods was done to fairly assess the model performance. RESULTS: Our experimental results demonstrate that iEnhancer-ECNN performs better than other state-of-the-art methods on the same dataset. The accuracies of the ensemble model for enhancer identification (layer 1) and enhancer classification (layer 2) are 0.769 and 0.678, respectively. Compared to other related studies, the improvements in the Area Under the Receiver Operating Characteristic Curve (AUC), sensitivity, and Matthews’s correlation coefficient (MCC) of our models are remarkable, especially for the layer 2 model, at about 11.0%, 46.5%, and 65.0%, respectively. CONCLUSIONS: iEnhancer-ECNN outperforms previously proposed methods with significant improvement in most of the evaluation metrics. Strong gains in the MCC of both layers are highly meaningful in assuring the stability of our models.
format	Online Article Text
id	pubmed-6929481
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-6929481 2019-12-30 iEnhancer-ECNN: identifying enhancers and their strength using ensembles of convolutional neural networks Nguyen, Quang H.; Nguyen-Vo, Thanh-Hoang; Le, Nguyen Quoc Khanh; Do, Trang T.T.; Rahardja, Susanto; Nguyen, Binh P. BMC Genomics Research BioMed Central 2019-12-24 /pmc/articles/PMC6929481/ /pubmed/31874637 http://dx.doi.org/10.1186/s12864-019-6336-3 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	iEnhancer-ECNN: identifying enhancers and their strength using ensembles of convolutional neural networks
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6929481/ https://www.ncbi.nlm.nih.gov/pubmed/31874637 http://dx.doi.org/10.1186/s12864-019-6336-3
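
The description above names one-hot encoding and k-mers as the data transformations applied to DNA sequences. The record does not say how the authors implement or combine them, so the following is only a minimal sketch of the two generic techniques. The function names `one_hot_encode` and `kmers` are hypothetical, and mapping unknown bases such as `N` to an all-zero row is an assumption made for illustration.

```python
# Generic sketch of one-hot encoding and k-mer extraction for DNA.
# Not the authors' code: names and the handling of unknown bases
# are assumptions made for illustration.

NUCLEOTIDES = "ACGT"


def one_hot_encode(seq):
    """Return one 4-element row per base, in A, C, G, T order.

    Bases outside A/C/G/T (for example N) become all-zero rows.
    """
    index = {base: i for i, base in enumerate(NUCLEOTIDES)}
    encoded = []
    for base in seq.upper():
        row = [0] * len(NUCLEOTIDES)
        if base in index:
            row[index[base]] = 1
        encoded.append(row)
    return encoded


def kmers(seq, k):
    """Return every overlapping substring of length k, left to right."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


if __name__ == "__main__":
    example = "ACGTA"
    for base, row in zip(example, one_hot_encode(example)):
        print(base, row)
    print(kmers(example, 3))
```

A sequence of length n yields an n-by-4 one-hot matrix and n - k + 1 overlapping k-mers; either representation can then be fed to a convolutional model.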
