StackCirRNAPred: computational classification of long circRNA from other lncRNA based on stacking strategy
BACKGROUND: CircRNAs are essential for the regulation of post-transcriptional gene expression, including as miRNA sponges, and play an important role in disease development. Several computational tools have recently been proposed to predict circRNA, but because each uses only one classifier, there is still muc...
Main Authors: | Wang, Xin; Liu, Yadong; Li, Jie; Wang, Guohua |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9793644/ https://www.ncbi.nlm.nih.gov/pubmed/36575368 http://dx.doi.org/10.1186/s12859-022-05118-7 |
_version_ | 1784859881324412928 |
---|---|
author | Wang, Xin Liu, Yadong Li, Jie Wang, Guohua |
author_facet | Wang, Xin Liu, Yadong Li, Jie Wang, Guohua |
author_sort | Wang, Xin |
collection | PubMed |
description | BACKGROUND: CircRNAs are essential for the regulation of post-transcriptional gene expression, including as miRNA sponges, and play an important role in disease development. Several computational tools have recently been proposed to predict circRNA, but because each uses only one classifier, there is still much room to improve performance. RESULTS: We propose StackCirRNAPred, a computational method that classifies long circRNA from other lncRNA using a stacking strategy. Because a single feature might not distinguish circRNA well from other lncRNA, we first extracted features from different sources, including nucleic acid composition, sequence spatial features and physicochemical properties, and Alu and tandem repeats. We then applied the stacking strategy to integrate the better-performing classifiers: RF, LightGBM and XGBoost. This allows the model to incorporate these features more flexibly. StackCirRNAPred performed significantly better than other tools, with precision, accuracy, F1, recall and MCC of 0.843, 0.833, 0.831, 0.819 and 0.666, respectively. When tested directly on the mouse dataset, StackCirRNAPred was still significantly better than other methods, with precision, accuracy, F1, recall and MCC of 0.837, 0.839, 0.839, 0.841 and 0.677, respectively. CONCLUSIONS: We proposed StackCirRNAPred, based on a stacking strategy, to distinguish long circRNAs from other lncRNAs. With the test results demonstrating the validity and robustness of StackCirRNAPred, we hope it will complement existing circRNA prediction methods and be helpful in downstream research. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-022-05118-7. |
format | Online Article Text |
id | pubmed-9793644 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9793644 2022-12-28 StackCirRNAPred: computational classification of long circRNA from other lncRNA based on stacking strategy Wang, Xin Liu, Yadong Li, Jie Wang, Guohua BMC Bioinformatics Research BioMed Central 2022-12-27 /pmc/articles/PMC9793644/ /pubmed/36575368 http://dx.doi.org/10.1186/s12859-022-05118-7 Text en © The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Wang, Xin Liu, Yadong Li, Jie Wang, Guohua StackCirRNAPred: computational classification of long circRNA from other lncRNA based on stacking strategy |
title | StackCirRNAPred: computational classification of long circRNA from other lncRNA based on stacking strategy |
title_full | StackCirRNAPred: computational classification of long circRNA from other lncRNA based on stacking strategy |
title_fullStr | StackCirRNAPred: computational classification of long circRNA from other lncRNA based on stacking strategy |
title_full_unstemmed | StackCirRNAPred: computational classification of long circRNA from other lncRNA based on stacking strategy |
title_short | StackCirRNAPred: computational classification of long circRNA from other lncRNA based on stacking strategy |
title_sort | stackcirrnapred: computational classification of long circrna from other lncrna based on stacking strategy |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9793644/ https://www.ncbi.nlm.nih.gov/pubmed/36575368 http://dx.doi.org/10.1186/s12859-022-05118-7 |
work_keys_str_mv | AT wangxin stackcirrnapredcomputationalclassificationoflongcircrnafromotherlncrnabasedonstackingstrategy AT liuyadong stackcirrnapredcomputationalclassificationoflongcircrnafromotherlncrnabasedonstackingstrategy AT lijie stackcirrnapredcomputationalclassificationoflongcircrnafromotherlncrnabasedonstackingstrategy AT wangguohua stackcirrnapredcomputationalclassificationoflongcircrnafromotherlncrnabasedonstackingstrategy |
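
The description field reports precision, accuracy, F1, recall and MCC for the human and mouse test sets. As a reading aid, the sketch below shows how these five metrics are conventionally defined for a binary classifier and checks that the reported F1 values agree with the reported precision and recall. The function names and the confusion-matrix counts in the second example are hypothetical and chosen only for illustration; the record does not give the underlying counts, and this is not the authors' evaluation code.

```python
from math import sqrt


def f1_from_precision_recall(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives (for example, circRNA as the positive class).
    """
    def safe_div(num: float, den: float) -> float:
        # Return 0.0 instead of raising when a denominator is zero.
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    accuracy = safe_div(tp + tn, tp + fp + fn + tn)
    f1 = f1_from_precision_recall(precision, recall)
    # Matthews correlation coefficient.
    mcc_den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = safe_div(tp * tn - fp * fn, mcc_den)
    return {
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
        "recall": recall,
        "mcc": mcc,
    }


# Consistency check using the precision/recall values quoted in the record.
print(round(f1_from_precision_recall(0.843, 0.819), 3))  # 0.831
print(round(f1_from_precision_recall(0.837, 0.841), 3))  # 0.839 (mouse)

# Hypothetical counts, for illustration only (not from the paper).
print(binary_metrics(tp=8, fp=2, fn=2, tn=8))
```

Both recomputed F1 values match the figures in the description (0.831 and 0.839), which suggests the reported metrics are internally consistent. Accuracy and MCC cannot be rechecked the same way without the class counts.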