IIMLP: integrated information-entropy-based method for LncRNA prediction

BACKGROUND: The prediction of long non-coding RNA (lncRNA) has attracted great attention from researchers, as a growing body of evidence indicates that various complex human diseases are closely related to lncRNAs. In the era of biomedical big data, in addition to the prediction of lncRNAs by biological exp...

Bibliographic Details
Main authors: Li, Junyi; Li, Huinian; Ye, Xiao; Zhang, Li; Xu, Qingzhe; Ping, Yuan; Jing, Xiaozhu; Jiang, Wei; Liao, Qing; Liu, Bo; Wang, Yadong
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8117603/
https://www.ncbi.nlm.nih.gov/pubmed/33980144
http://dx.doi.org/10.1186/s12859-020-03884-w
author Li, Junyi
Li, Huinian
Ye, Xiao
Zhang, Li
Xu, Qingzhe
Ping, Yuan
Jing, Xiaozhu
Jiang, Wei
Liao, Qing
Liu, Bo
Wang, Yadong
author_sort Li, Junyi
collection PubMed
description BACKGROUND: The prediction of long non-coding RNA (lncRNA) has attracted great attention from researchers, as a growing body of evidence indicates that various complex human diseases are closely related to lncRNAs. In the era of biomedical big data, in addition to the prediction of lncRNAs by biological experimental methods, many computational methods based on machine learning have been proposed to make better use of lncRNA sequence resources. RESULTS: We developed a lncRNA prediction method that integrates information-entropy-based features with machine learning algorithms. We calculate generalized topological entropy and generate 6 novel features for lncRNA sequences. Using these 6 features together with other features such as the open reading frame, we apply support vector machine, XGBoost and random forest algorithms to distinguish human lncRNAs. We compare our method with one that uses more K-mer features, and the results show that our method achieves a higher area under the curve, up to 99.7905%. CONCLUSIONS: We developed an accurate and efficient method that uses novel information-entropy features to analyze and classify lncRNAs. Our method can also be extended to research on other functional elements in DNA sequences.
format Online
Article
Text
id pubmed-8117603
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8117603 2021-05-13 IIMLP: integrated information-entropy-based method for LncRNA prediction Li, Junyi; Li, Huinian; Ye, Xiao; Zhang, Li; Xu, Qingzhe; Ping, Yuan; Jing, Xiaozhu; Jiang, Wei; Liao, Qing; Liu, Bo; Wang, Yadong BMC Bioinformatics Research
BioMed Central 2021-05-13 /pmc/articles/PMC8117603/ /pubmed/33980144 http://dx.doi.org/10.1186/s12859-020-03884-w Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title IIMLP: integrated information-entropy-based method for LncRNA prediction
topic Research