
XGBCDA: a multiple heterogeneous networks-based method for predicting circRNA-disease associations

BACKGROUND: Biological experiments have demonstrated that circRNA plays an essential role in various biological processes and human diseases. However, it is time-consuming and costly to merely conduct biological experiments to detect the association between circRNA and diseases. Accordingly, develop...

Bibliographic Details
Main authors: Shen, Siyuan, Liu, Junyi, Zhou, Cheng, Qian, Yurong, Deng, Lei
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9632006/
https://www.ncbi.nlm.nih.gov/pubmed/36329528
http://dx.doi.org/10.1186/s12920-021-01054-2
author Shen, Siyuan
Liu, Junyi
Zhou, Cheng
Qian, Yurong
Deng, Lei
collection PubMed
description BACKGROUND: Biological experiments have demonstrated that circRNAs play an essential role in various biological processes and human diseases. However, relying on biological experiments alone to detect associations between circRNAs and diseases is time-consuming and costly. Accordingly, an efficient computational model for predicting circRNA-disease associations is urgently needed. METHODS: In this research, we propose a method based on multiple heterogeneous networks, named XGBCDA, to predict circRNA-disease associations. The method first extracts original features, namely statistical features and graph theory features, from an integrated circRNA similarity network, a disease similarity network and a circRNA-disease association network, and then passes these original features to an XGBoost classifier to learn latent features. The latent features are represented using the trees learned by the XGBoost model: for each tree, the index of the leaf that an instance finally falls into is encoded with 1-of-K coding. Finally, the method combines the latent features from XGBoost with the original features to train the final model for predicting associations between circRNAs and diseases. RESULTS: In tenfold cross-validation, XGBCDA reaches an area under the ROC curve of 0.9860. In addition, the method performs well in case studies of colorectal cancer, gastric cancer and cervical cancer. CONCLUSION: With its strong performance in predicting potential circRNA-disease associations, XGBCDA is a promising tool to assist biomedical researchers in circRNA-disease association prediction.
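The latent-feature step in the METHODS text can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy feature values, and the tree sizes are all hypothetical, and the leaf indices are assumed to have already been remapped to the range 0..K-1 within each tree. With the xgboost Python package, raw leaf assignments could be obtained from `Booster.predict(..., pred_leaf=True)`, but those are node ids that would need such a remapping first. The sketch shows only the 1-of-K coding of leaf indices and the concatenation with the original features.

```python
from typing import List, Sequence


def one_hot_leaves(leaf_idx: Sequence[int], leaves_per_tree: Sequence[int]) -> List[int]:
    """1-of-K encode the leaf an instance falls into in each tree, then concatenate."""
    if len(leaf_idx) != len(leaves_per_tree):
        raise ValueError("need exactly one leaf index per tree")
    encoded: List[int] = []
    for idx, k in zip(leaf_idx, leaves_per_tree):
        if not 0 <= idx < k:
            raise ValueError(f"leaf index {idx} out of range for a tree with {k} leaves")
        block = [0] * k   # one slot per leaf of this tree
        block[idx] = 1    # mark the leaf the instance reached
        encoded.extend(block)
    return encoded


def combine_features(original: Sequence[float],
                     leaf_idx: Sequence[int],
                     leaves_per_tree: Sequence[int]) -> List[float]:
    """Concatenate the original features with the one-hot latent features."""
    latent = one_hot_leaves(leaf_idx, leaves_per_tree)
    return list(original) + [float(v) for v in latent]


# Toy instance (made-up values): two original features, three trees
# with 4, 2 and 3 leaves; the instance lands in leaves 2, 0 and 1.
original = [0.42, 3.0]
leaf_idx = [2, 0, 1]
leaves_per_tree = [4, 2, 3]

latent = one_hot_leaves(leaf_idx, leaves_per_tree)
combined = combine_features(original, leaf_idx, leaves_per_tree)
print(latent)
print(combined)
```

The combined vector is what a final classifier would be trained on in this sketch; each tree contributes exactly one non-zero entry to the latent part.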
format Online
Article
Text
id pubmed-9632006
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9632006 2022-11-04 XGBCDA: a multiple heterogeneous networks-based method for predicting circRNA-disease associations Shen, Siyuan Liu, Junyi Zhou, Cheng Qian, Yurong Deng, Lei BMC Med Genomics Research
BioMed Central 2022-11-03 /pmc/articles/PMC9632006/ /pubmed/36329528 http://dx.doi.org/10.1186/s12920-021-01054-2 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title XGBCDA: a multiple heterogeneous networks-based method for predicting circRNA-disease associations
topic Research