ncPro-ML: An integrated computational tool for identifying non-coding RNA promoters in multiple species
The promoter is located near the transcription start sites and regulates transcription initiation of the gene. Accurate identification of promoters is essential for understanding the mechanism of gene regulation. Since experimental methods are costly and ineffective, developing efficient and accurat...
Main authors: | Tang, Qiang; Nie, Fulei; Kang, Juanjuan; Chen, Wei |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Research Network of Computational and Structural Biotechnology 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7509369/ https://www.ncbi.nlm.nih.gov/pubmed/33005306 http://dx.doi.org/10.1016/j.csbj.2020.09.001 |
_version_ | 1783585580650594304 |
---|---|
author | Tang, Qiang Nie, Fulei Kang, Juanjuan Chen, Wei |
author_facet | Tang, Qiang Nie, Fulei Kang, Juanjuan Chen, Wei |
author_sort | Tang, Qiang |
collection | PubMed |
description | The promoter is located near the transcription start site and regulates transcription initiation of the gene. Accurate identification of promoters is essential for understanding the mechanism of gene regulation. Since experimental methods are costly and ineffective, developing efficient and accurate computational tools to identify promoters is necessary. Although a series of methods have been proposed for identifying promoters, none of them is able to identify the promoters of non-coding RNA (ncRNA). In the present work, a new method called ncPro-ML was proposed to identify ncRNA promoters in Homo sapiens and Mus musculus, in which different kinds of sequence encoding schemes were used to convert DNA sequences into feature vectors. To test the effect of sequence length, datasets containing sequences of different lengths were built for each species. The results demonstrated that ncPro-ML achieved its best performance on the dataset with a sequence length of 221 nucleotides for both human and mouse. The performance of ncPro-ML was also satisfactory in both the independent dataset test and the cross-species test. The results indicate that the proposed predictor can serve as a powerful tool for the discovery of ncRNA promoters. In addition, a web server for ncPro-ML was developed, which can be freely accessed at http://www.bio-bigdata.cn/ncPro-ML/. |
format | Online Article Text |
id | pubmed-7509369 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Research Network of Computational and Structural Biotechnology |
record_format | MEDLINE/PubMed |
spelling | pubmed-75093692020-09-30 ncPro-ML: An integrated computational tool for identifying non-coding RNA promoters in multiple species Tang, Qiang Nie, Fulei Kang, Juanjuan Chen, Wei Comput Struct Biotechnol J Research Article The promoter is located near the transcription start site and regulates transcription initiation of the gene. Accurate identification of promoters is essential for understanding the mechanism of gene regulation. Since experimental methods are costly and ineffective, developing efficient and accurate computational tools to identify promoters is necessary. Although a series of methods have been proposed for identifying promoters, none of them is able to identify the promoters of non-coding RNA (ncRNA). In the present work, a new method called ncPro-ML was proposed to identify ncRNA promoters in Homo sapiens and Mus musculus, in which different kinds of sequence encoding schemes were used to convert DNA sequences into feature vectors. To test the effect of sequence length, datasets containing sequences of different lengths were built for each species. The results demonstrated that ncPro-ML achieved its best performance on the dataset with a sequence length of 221 nucleotides for both human and mouse. The performance of ncPro-ML was also satisfactory in both the independent dataset test and the cross-species test. The results indicate that the proposed predictor can serve as a powerful tool for the discovery of ncRNA promoters. In addition, a web server for ncPro-ML was developed, which can be freely accessed at http://www.bio-bigdata.cn/ncPro-ML/. Research Network of Computational and Structural Biotechnology 2020-09-10 /pmc/articles/PMC7509369/ /pubmed/33005306 http://dx.doi.org/10.1016/j.csbj.2020.09.001 Text en © 2020 The Author(s) http://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Research Article Tang, Qiang Nie, Fulei Kang, Juanjuan Chen, Wei ncPro-ML: An integrated computational tool for identifying non-coding RNA promoters in multiple species |
title | ncPro-ML: An integrated computational tool for identifying non-coding RNA promoters in multiple species |
title_full | ncPro-ML: An integrated computational tool for identifying non-coding RNA promoters in multiple species |
title_fullStr | ncPro-ML: An integrated computational tool for identifying non-coding RNA promoters in multiple species |
title_full_unstemmed | ncPro-ML: An integrated computational tool for identifying non-coding RNA promoters in multiple species |
title_short | ncPro-ML: An integrated computational tool for identifying non-coding RNA promoters in multiple species |
title_sort | ncpro-ml: an integrated computational tool for identifying non-coding rna promoters in multiple species |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7509369/ https://www.ncbi.nlm.nih.gov/pubmed/33005306 http://dx.doi.org/10.1016/j.csbj.2020.09.001 |
work_keys_str_mv | AT tangqiang ncpromlanintegratedcomputationaltoolforidentifyingnoncodingrnapromotersinmultiplespecies AT niefulei ncpromlanintegratedcomputationaltoolforidentifyingnoncodingrnapromotersinmultiplespecies AT kangjuanjuan ncpromlanintegratedcomputationaltoolforidentifyingnoncodingrnapromotersinmultiplespecies AT chenwei ncpromlanintegratedcomputationaltoolforidentifyingnoncodingrnapromotersinmultiplespecies |