DLRAPom: a hybrid pipeline of Optimized XGBoost-guided integrative multiomics analysis for identifying targetable disease-related lncRNA–miRNA–mRNA regulatory axes
A reliable, easy-to-operate screening pipeline for disease-related noncoding RNA regulatory axes is urgently needed. To address this, we designed a hybrid pipeline, disease-related lncRNA–miRNA–mRNA regulatory axis prediction from multiomics (DLRAPom), to id...
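Main authors: | Shen, Chen; Li, Huiyu; Li, Miao; Niu, Yu; Liu, Jing; Zhu, Li; Gui, Hongsheng; Han, Wei; Wang, Huiying; Zhang, Wenpei; Wang, Xiaochen; Luo, Xiao; Sun, Yu; Yan, Jiangwei; Guan, Fanglin |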
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2022 |
Subjects: | Problem Solving Protocol |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8921741/ https://www.ncbi.nlm.nih.gov/pubmed/35224615 http://dx.doi.org/10.1093/bib/bbac046 |
_version_ | 1784669385910124544 |
---|---|
author | Shen, Chen Li, Huiyu Li, Miao Niu, Yu Liu, Jing Zhu, Li Gui, Hongsheng Han, Wei Wang, Huiying Zhang, Wenpei Wang, Xiaochen Luo, Xiao Sun, Yu Yan, Jiangwei Guan, Fanglin |
author_sort | Shen, Chen |
collection | PubMed |
description | A reliable, easy-to-operate screening pipeline for disease-related noncoding RNA regulatory axes is urgently needed. To address this, we designed a hybrid pipeline, disease-related lncRNA–miRNA–mRNA regulatory axis prediction from multiomics (DLRAPom), to identify risk biomarkers and disease-related lncRNA–miRNA–mRNA regulatory axes by adding a novel machine learning model to conventional analysis and combining it with experimental validation. The pipeline consists of four parts: selecting hub biomarkers by conventional bioinformatics analysis, discovering the most essential protein-coding biomarkers with a novel machine learning model, extracting the key lncRNA–miRNA–mRNA axis and validating it experimentally. Our study is the first to propose a pipeline that predicts interactions among lncRNA, miRNA and mRNA by combining WGCNA and XGBoost. In contrast to previously reported methods, we developed an Optimized XGBoost model to reduce overfitting on multiomics data, thereby improving the generalization ability of the overall model for integrated multiomics analysis. Applied to gestational diabetes mellitus (GDM), the pipeline predicted nine risk protein-coding biomarkers and some potential lncRNA–miRNA–mRNA regulatory axes, all of which correlated with GDM. Among those regulatory axes, the MALAT1/hsa-miR-144-3p/IRS1 axis was predicted to be the key axis and was identified as being associated with GDM for the first time. In short, as a flexible pipeline, DLRAPom can contribute to research on the molecular pathogenesis of diseases by effectively predicting potential disease-related noncoding RNA regulatory networks and providing promising candidates for functional research on disease pathogenesis. |
format | Online Article Text |
id | pubmed-8921741 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-8921741 2022-03-15 Brief Bioinform Problem Solving Protocol Oxford University Press 2022-02-26 /pmc/articles/PMC8921741/ /pubmed/35224615 http://dx.doi.org/10.1093/bib/bbac046 Text en © The Author(s) 2022. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | DLRAPom: a hybrid pipeline of Optimized XGBoost-guided integrative multiomics analysis for identifying targetable disease-related lncRNA–miRNA–mRNA regulatory axes |
topic | Problem Solving Protocol |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8921741/ https://www.ncbi.nlm.nih.gov/pubmed/35224615 http://dx.doi.org/10.1093/bib/bbac046 |
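
The description above lists "extracting the key lncRNA–miRNA–mRNA axis" as one step of the pipeline. The record does not say how DLRAPom implements that step, so the following is only a minimal sketch of one common way to assemble candidate axes: joining lncRNA–miRNA pairs with miRNA–mRNA pairs on the shared miRNA and keeping axes whose mRNA is a selected hub biomarker. The function name `build_axes` and every interaction pair other than MALAT1/hsa-miR-144-3p/IRS1 are hypothetical placeholders, not data from the paper.

```python
from collections import defaultdict


def build_axes(lnc_mirna_pairs, mirna_mrna_pairs, hub_mrnas):
    """Join lncRNA-miRNA and miRNA-mRNA pairs on the shared miRNA.

    Hypothetical helper for illustration only; it is not the DLRAPom code.
    Returns a sorted list of (lncRNA, miRNA, mRNA) tuples whose mRNA is
    one of the given hub biomarkers.
    """
    # Index the lncRNAs by the miRNA they are paired with.
    lncs_by_mirna = defaultdict(set)
    for lnc, mirna in lnc_mirna_pairs:
        lncs_by_mirna[mirna].add(lnc)

    axes = set()
    for mirna, mrna in mirna_mrna_pairs:
        if mrna not in hub_mrnas:
            continue  # keep only axes that end in a selected hub mRNA
        for lnc in lncs_by_mirna.get(mirna, ()):
            axes.add((lnc, mirna, mrna))
    return sorted(axes)


# Toy input. Only the MALAT1 / hsa-miR-144-3p / IRS1 names come from the
# abstract; LNC_A, LNC_B, miR-X, miR-Y, miR-Z and GENE_1 are placeholders.
lnc_mirna = [("MALAT1", "hsa-miR-144-3p"), ("LNC_A", "miR-X"), ("LNC_B", "miR-Y")]
mirna_mrna = [("hsa-miR-144-3p", "IRS1"), ("miR-X", "GENE_1"), ("miR-Z", "IRS1")]
hub = {"IRS1"}

axes = build_axes(lnc_mirna, mirna_mrna, hub)
for axis in axes:
    print("/".join(axis))
```

In this toy input the only surviving axis is the one named in the abstract, because `GENE_1` is not a hub mRNA and `miR-Z` has no paired lncRNA. A real analysis would take the pair lists from interaction databases or correlation analysis, which this sketch does not attempt to model.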