PredLnc-GFStack: A Global Sequence Feature Based on a Stacked Ensemble Learning Method for Predicting lncRNAs from Transcripts
Long non-coding RNAs (lncRNAs) are a class of RNAs longer than 200 base pairs (bp) that do not encode proteins; nevertheless, lncRNAs have many vital biological functions. A large number of novel transcripts were discovered as a result of the development of high-throughput sequencin...
Main authors: | Liu, Shuai; Zhao, Xiaohan; Zhang, Guangyan; Li, Weiyang; Liu, Feng; Liu, Shichao; Zhang, Wen |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6770532/ https://www.ncbi.nlm.nih.gov/pubmed/31484412 http://dx.doi.org/10.3390/genes10090672 |
_version_ | 1783455495332298752 |
---|---|
author | Liu, Shuai Zhao, Xiaohan Zhang, Guangyan Li, Weiyang Liu, Feng Liu, Shichao Zhang, Wen |
author_sort | Liu, Shuai |
collection | PubMed |
description | Long non-coding RNAs (lncRNAs) are a class of RNAs longer than 200 base pairs (bp) that do not encode proteins; nevertheless, lncRNAs have many vital biological functions. A large number of novel transcripts were discovered as a result of the development of high-throughput sequencing technology. As a result, computational methods for lncRNA prediction are in great demand. In this paper, we consider global sequence features and propose a stacked ensemble learning-based method to predict lncRNAs from transcripts, abbreviated as PredLnc-GFStack. We extract the critical features from the candidate feature list using a genetic algorithm (GA) and then employ stacked ensemble learning to construct the PredLnc-GFStack model. Computational experiments show that PredLnc-GFStack outperforms several state-of-the-art methods for lncRNA prediction. Furthermore, PredLnc-GFStack demonstrates an outstanding ability for cross-species ncRNA prediction. |
format | Online Article Text |
id | pubmed-6770532 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-6770532 2019-10-30 PredLnc-GFStack: A Global Sequence Feature Based on a Stacked Ensemble Learning Method for Predicting lncRNAs from Transcripts Liu, Shuai Zhao, Xiaohan Zhang, Guangyan Li, Weiyang Liu, Feng Liu, Shichao Zhang, Wen Genes (Basel) Article Long non-coding RNAs (lncRNAs) are a class of RNAs longer than 200 base pairs (bp) that do not encode proteins; nevertheless, lncRNAs have many vital biological functions. A large number of novel transcripts were discovered as a result of the development of high-throughput sequencing technology. As a result, computational methods for lncRNA prediction are in great demand. In this paper, we consider global sequence features and propose a stacked ensemble learning-based method to predict lncRNAs from transcripts, abbreviated as PredLnc-GFStack. We extract the critical features from the candidate feature list using a genetic algorithm (GA) and then employ stacked ensemble learning to construct the PredLnc-GFStack model. Computational experiments show that PredLnc-GFStack outperforms several state-of-the-art methods for lncRNA prediction. Furthermore, PredLnc-GFStack demonstrates an outstanding ability for cross-species ncRNA prediction. MDPI 2019-09-03 /pmc/articles/PMC6770532/ /pubmed/31484412 http://dx.doi.org/10.3390/genes10090672 Text en © 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
title | PredLnc-GFStack: A Global Sequence Feature Based on a Stacked Ensemble Learning Method for Predicting lncRNAs from Transcripts |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6770532/ https://www.ncbi.nlm.nih.gov/pubmed/31484412 http://dx.doi.org/10.3390/genes10090672 |