
KISL: knowledge-injected semi-supervised learning for biological co-expression network modules

The exploration of important biomarkers associated with cancer development is crucial for diagnosing cancer, designing therapeutic interventions, and predicting prognoses. The analysis of gene co-expression provides a systemic perspective on gene networks and can be a valuable tool for mining biomar...

Bibliographic Details
Main authors: Xiao, Gangyi, Guan, Renchu, Cao, Yangkun, Huang, Zhenyu, Xu, Ying
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Genetics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10185879/
https://www.ncbi.nlm.nih.gov/pubmed/37205122
http://dx.doi.org/10.3389/fgene.2023.1151962
author Xiao, Gangyi
Guan, Renchu
Cao, Yangkun
Huang, Zhenyu
Xu, Ying
collection PubMed
description The exploration of important biomarkers associated with cancer development is crucial for diagnosing cancer, designing therapeutic interventions, and predicting prognoses. The analysis of gene co-expression provides a systemic perspective on gene networks and can be a valuable tool for mining biomarkers. The main objective of co-expression network analysis is to discover highly synergistic sets of genes, and the most widely used method is weighted gene co-expression network analysis (WGCNA). WGCNA measures gene correlation with the Pearson correlation coefficient and uses hierarchical clustering to identify gene modules. The Pearson correlation coefficient reflects only the linear dependence between variables, and the main drawback of hierarchical clustering is that once two objects are clustered together, the process cannot be reversed, so inappropriate cluster divisions cannot be readjusted. Moreover, existing co-expression network analysis methods rely on unsupervised methods that do not use prior biological knowledge for module delineation. Here we present a method for identifying outstanding modules in a co-expression network using a knowledge-injected semi-supervised learning approach (KISL), which uses a priori biological knowledge and a semi-supervised clustering method to address these issues in current gene co-expression network (GCN)-based clustering methods. Because gene-gene relationships are complex, we use distance correlation to measure both linear and non-linear dependence between genes. Eight RNA-seq datasets of cancer samples are used to validate the method's effectiveness. In all eight datasets, the KISL algorithm outperformed WGCNA on the silhouette coefficient, Calinski-Harabasz index, and Davies-Bouldin index. According to the results, KISL clusters had better cluster evaluation values and better gene module aggregation. Enrichment analysis of the identified modules demonstrated their effectiveness in discovering modular structures in biological co-expression networks. In addition, as a general method, KISL can be applied to various co-expression network analyses based on similarity metrics. Source code for KISL and the related scripts is available online at https://github.com/Mowonhoo/KISL.git.
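The abstract's contrast between Pearson correlation and distance correlation can be illustrated with a small sketch. The code below is not taken from the KISL repository; the function names `pearson` and `distance_correlation` are hypothetical, and the estimator is the standard (biased) sample distance correlation computed from double-centered pairwise distance matrices, written in plain Python for clarity rather than speed.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    return cov / denom if denom else 0.0


def _centered_distances(values):
    """Pairwise |a - b| distance matrix with row, column and grand means removed."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in dist]
    grand_mean = sum(row_means) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [dist[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Biased sample distance correlation of two equal-length 1-D sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvar_x = sum(v * v for row in a for v in row) / n ** 2
    dvar_y = sum(v * v for row in b for v in row) / n ** 2
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0  # at least one variable is constant
    return math.sqrt(max(dcov2, 0.0) / denom)


if __name__ == "__main__":
    # Toy "expression profiles": gene_b depends on gene_a, but not linearly.
    gene_a = [-3, -2, -1, 0, 1, 2, 3]
    gene_b = [v * v for v in gene_a]
    print("Pearson:", pearson(gene_a, gene_b))
    print("Distance correlation:", distance_correlation(gene_a, gene_b))
```

On the symmetric quadratic toy data the Pearson coefficient vanishes while the distance correlation stays clearly positive, which is the kind of non-linear dependence the abstract says Pearson correlation misses. For real expression matrices a vectorized implementation would be preferable, since this version is quadratic in the number of samples for every gene pair.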
format Online
Article
Text
id pubmed-10185879
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10185879 2023-05-17 Front Genet Genetics Frontiers Media S.A. 2023-05-02 /pmc/articles/PMC10185879/ /pubmed/37205122 http://dx.doi.org/10.3389/fgene.2023.1151962 Text en
Copyright © 2023 Xiao, Guan, Cao, Huang and Xu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title KISL: knowledge-injected semi-supervised learning for biological co-expression network modules
topic Genetics