ncPro-ML: An integrated computational tool for identifying non-coding RNA promoters in multiple species

The promoter is located near the transcription start sites and regulates transcription initiation of the gene. Accurate identification of promoters is essential for understanding the mechanism of gene regulation. Since experimental methods are costly and ineffective, developing efficient and accurat...

Bibliographic Details
Main Authors: Tang, Qiang, Nie, Fulei, Kang, Juanjuan, Chen, Wei
Format: Online Article Text
Language: English
Published: Research Network of Computational and Structural Biotechnology 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7509369/
https://www.ncbi.nlm.nih.gov/pubmed/33005306
http://dx.doi.org/10.1016/j.csbj.2020.09.001
_version_ 1783585580650594304
author Tang, Qiang
Nie, Fulei
Kang, Juanjuan
Chen, Wei
author_facet Tang, Qiang
Nie, Fulei
Kang, Juanjuan
Chen, Wei
author_sort Tang, Qiang
collection PubMed
description The promoter is located near the transcription start sites and regulates transcription initiation of the gene. Accurate identification of promoters is essential for understanding the mechanism of gene regulation. Since experimental methods are costly and ineffective, developing efficient and accurate computational tools to identify promoters is necessary. Although a series of methods have been proposed for identifying promoters, none of them is able to identify the promoters of non-coding RNA (ncRNA). In the present work, a new method called ncPro-ML was proposed to identify the promoter of ncRNA in Homo sapiens and Mus musculus, in which different kinds of sequence encoding schemes were used to convert DNA sequences into feature vectors. To test the length effect, for each species, datasets including sequences with different lengths were built. The results demonstrated that ncPro-ML achieved the best performance based on the dataset with the sequence length of 221 nucleotides for human and mouse. The performance of ncPro-ML was also satisfactory in both the independent dataset test and the cross-species test. The results indicate that the proposed predictor can serve as a powerful tool for the discovery of ncRNA promoters. In addition, a web-server for ncPro-ML was developed, which can be freely accessed at http://www.bio-bigdata.cn/ncPro-ML/.
format Online
Article
Text
id pubmed-7509369
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Research Network of Computational and Structural Biotechnology
record_format MEDLINE/PubMed
spelling pubmed-75093692020-09-30 ncPro-ML: An integrated computational tool for identifying non-coding RNA promoters in multiple species Tang, Qiang Nie, Fulei Kang, Juanjuan Chen, Wei Comput Struct Biotechnol J Research Article The promoter is located near the transcription start sites and regulates transcription initiation of the gene. Accurate identification of promoters is essential for understanding the mechanism of gene regulation. Since experimental methods are costly and ineffective, developing efficient and accurate computational tools to identify promoters is necessary. Although a series of methods have been proposed for identifying promoters, none of them is able to identify the promoters of non-coding RNA (ncRNA). In the present work, a new method called ncPro-ML was proposed to identify the promoter of ncRNA in Homo sapiens and Mus musculus, in which different kinds of sequence encoding schemes were used to convert DNA sequences into feature vectors. To test the length effect, for each species, datasets including sequences with different lengths were built. The results demonstrated that ncPro-ML achieved the best performance based on the dataset with the sequence length of 221 nucleotides for human and mouse. The performance of ncPro-ML was also satisfactory in both the independent dataset test and the cross-species test. The results indicate that the proposed predictor can serve as a powerful tool for the discovery of ncRNA promoters. In addition, a web-server for ncPro-ML was developed, which can be freely accessed at http://www.bio-bigdata.cn/ncPro-ML/. Research Network of Computational and Structural Biotechnology 2020-09-10 /pmc/articles/PMC7509369/ /pubmed/33005306 http://dx.doi.org/10.1016/j.csbj.2020.09.001 Text en © 2020 The Author(s) http://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
spellingShingle Research Article
Tang, Qiang
Nie, Fulei
Kang, Juanjuan
Chen, Wei
ncPro-ML: An integrated computational tool for identifying non-coding RNA promoters in multiple species
title ncPro-ML: An integrated computational tool for identifying non-coding RNA promoters in multiple species
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7509369/
https://www.ncbi.nlm.nih.gov/pubmed/33005306
http://dx.doi.org/10.1016/j.csbj.2020.09.001