
Sequence based prediction of enhancer regions from DNA random walk

Regulatory elements play a critical role in the development of eukaryotic organisms by controlling the spatio-temporal pattern of gene expression. Enhancers are one class of these elements; they contribute to the regulation of gene expression through chromatin looping or eRNA expression. Experimental iden...


Bibliographic Details
Main authors:	Singh, Anand Pratap, Mishra, Sarthak, Jabin, Suraiya
Format:	Online Article Text
Language:	English
Published:	Nature Publishing Group UK 2018
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6206163/ https://www.ncbi.nlm.nih.gov/pubmed/30374023 http://dx.doi.org/10.1038/s41598-018-33413-y

author	Singh, Anand Pratap Mishra, Sarthak Jabin, Suraiya
collection	PubMed
description	Regulatory elements play a critical role in the development of eukaryotic organisms by controlling the spatio-temporal pattern of gene expression. Enhancers are one class of these elements; they contribute to the regulation of gene expression through chromatin looping or eRNA expression. Experimental identification of a novel enhancer is costly, which has driven interest in computational approaches for predicting enhancer regions in a genome. Existing computational approaches have primarily been based on training with high-throughput data such as transcription factor binding sites (TFBS), DNA methylation, and histone modification marks. Purely sequence-based approaches, on the other hand, are promising because they are not biased by the complexity or context specificity of such datasets. In sequence-based approaches, machine learning models are trained either directly on sequences or on sequence features to classify sequences as enhancers or non-enhancers. In this paper, we derived statistical and nonlinear dynamic features, along with k-mer features, from experimentally validated sequences taken from the Vista Enhancer Browser through a random walk model, and applied different machine learning methods to predict whether an input test sequence is an enhancer. Experimental results demonstrate the success of the proposed model based on an ensemble method, with area under the curve (AUC) of 0.86, 0.89, and 0.87 in B cells, T cells, and natural killer cells, respectively, for the histone marks dataset.
format	Online Article Text
id	pubmed-6206163
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	Nature Publishing Group UK
record_format	MEDLINE/PubMed
spelling	pubmed-6206163 2018-11-01 Sequence based prediction of enhancer regions from DNA random walk Singh, Anand Pratap Mishra, Sarthak Jabin, Suraiya Sci Rep Article
Nature Publishing Group UK 2018-10-29 /pmc/articles/PMC6206163/ /pubmed/30374023 http://dx.doi.org/10.1038/s41598-018-33413-y Text en © The Author(s) 2018 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
title	Sequence based prediction of enhancer regions from DNA random walk
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6206163/ https://www.ncbi.nlm.nih.gov/pubmed/30374023 http://dx.doi.org/10.1038/s41598-018-33413-y
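
The description above mentions features derived from a DNA random walk together with k-mer features. The record does not say how bases are mapped to steps or which statistics are used, so the following is only a minimal sketch under assumptions: it uses the common purine/pyrimidine convention (A and G step +1, C and T step -1), a handful of simple summary statistics, and overlapping k-mer frequencies. The function names (`dna_walk`, `walk_stats`, `kmer_frequencies`, `features`) are hypothetical and are not taken from the article.

```python
from collections import Counter
from itertools import product
from statistics import mean, pstdev

# Assumed mapping (not stated in the record): purines +1, pyrimidines -1.
STEP = {"A": 1, "G": 1, "C": -1, "T": -1}


def dna_walk(seq):
    """Return the cumulative walk; unknown bases (e.g. N) add a step of 0."""
    position, walk = 0, []
    for base in seq.upper():
        position += STEP.get(base, 0)
        walk.append(position)
    return walk


def walk_stats(walk):
    """Simple summary statistics of a non-empty walk (illustrative choice)."""
    return {
        "mean": mean(walk),
        "std": pstdev(walk),
        "min": min(walk),
        "max": max(walk),
        "end": walk[-1],
    }


def kmer_frequencies(seq, k=2):
    """Relative frequency of each overlapping k-mer over the ACGT alphabet.

    k-mers containing other symbols are counted in the total but not reported,
    so the reported frequencies may sum to less than 1.
    """
    seq = seq.upper()
    n_windows = max(len(seq) - k + 1, 1)
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {
        "".join(kmer): counts["".join(kmer)] / n_windows
        for kmer in product("ACGT", repeat=k)
    }


def features(seq, k=2):
    """Combine walk statistics and k-mer frequencies into one feature dict."""
    feats = walk_stats(dna_walk(seq))
    feats.update(kmer_frequencies(seq, k))
    return feats
```

A feature dictionary like the one returned by `features("AGCTTGCA")` could then be passed to any classifier; the ensemble method and the nonlinear dynamic features reported in the description are not reproduced here.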