
K-Module Algorithm: An Additional Step to Improve the Clustering Results of WGCNA Co-Expression Networks

Among biological networks, co-expression networks have been widely studied. One of the most commonly used pipelines for the construction of co-expression networks is weighted gene co-expression network analysis (WGCNA), which can identify highly co-expressed clusters of genes (modules). WGCNA identi...

Bibliographic Details
Main authors: Hou, Jie, Ye, Xiufen, Li, Chuanlong, Wang, Yixing
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7828115/
https://www.ncbi.nlm.nih.gov/pubmed/33445666
http://dx.doi.org/10.3390/genes12010087
_version_ 1783640930722512896
author Hou, Jie
Ye, Xiufen
Li, Chuanlong
Wang, Yixing
collection PubMed
description Among biological networks, co-expression networks have been widely studied. One of the most commonly used pipelines for constructing co-expression networks is weighted gene co-expression network analysis (WGCNA), which identifies clusters of highly co-expressed genes (modules). WGCNA identifies gene modules using hierarchical clustering. The major drawback of hierarchical clustering is that once two objects are clustered together, the merge cannot be reversed; an unsuitable assignment therefore cannot be re-adjusted. In this paper, we calculate the similarity matrix for WGCNA with the distance correlation to construct a gene co-expression network, and present a new approach, the k-module algorithm, to improve the WGCNA clustering results. This method assigns each gene to the module with which it has the highest mean connectivity. The algorithm re-adjusts the results of hierarchical clustering while retaining the advantages of the dynamic tree cut method. The validity of the algorithm is verified on six microarray and RNA-seq datasets. The k-module algorithm requires fewer iterations, which leads to lower complexity. We verify that the gene modules obtained by the k-module algorithm have high enrichment scores and strong stability. Our method improves upon hierarchical clustering and can be applied to any clustering algorithm based on a similarity matrix; it is not limited to gene co-expression network analysis.
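The reassignment step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `k_module_reassign`, the `max_iter` parameter, the simultaneous update of all genes in each pass, and the exclusion of a gene's self-similarity from its own module's mean are all assumptions made for illustration. It takes any symmetric similarity matrix and an initial module labelling (for example, one produced by hierarchical clustering) and moves each gene to the module with which it has the highest mean similarity until no label changes.

```python
import numpy as np


def k_module_reassign(similarity, labels, max_iter=100):
    """Move each gene to the module with the highest mean similarity to it.

    Hypothetical sketch: `similarity` is an (n, n) symmetric matrix and
    `labels` holds one initial module label per gene.
    """
    sim = np.asarray(similarity, dtype=float)
    labels = np.asarray(labels).copy()
    modules = np.unique(labels)  # the set of module labels stays fixed
    n = len(labels)
    self_sim = np.diag(sim)

    for _ in range(max_iter):
        # scores[i, j] = mean similarity between gene i and module j
        scores = np.full((n, len(modules)), -np.inf)
        for j, m in enumerate(modules):
            members = labels == m
            if not members.any():
                continue  # module emptied in an earlier pass
            totals = sim[:, members].sum(axis=1)
            counts = np.full(n, float(members.sum()))
            # Assumption: a gene's similarity to itself is not counted.
            totals[members] -= self_sim[members]
            counts[members] -= 1
            valid = counts > 0
            scores[valid, j] = totals[valid] / counts[valid]

        new_labels = modules[np.argmax(scores, axis=1)]
        if np.array_equal(new_labels, labels):
            break  # converged: no gene changed module
        labels = new_labels
    return labels


# Toy example: two blocks of three genes; gene 2 starts in the wrong module.
sim = np.full((6, 6), 0.1)
sim[:3, :3] = 0.9
sim[3:, 3:] = 0.9
np.fill_diagonal(sim, 1.0)
print(k_module_reassign(sim, [1, 1, 2, 2, 2, 2]).tolist())  # [1, 1, 1, 2, 2, 2]
```

In the toy example, gene 2 has a mean similarity of 0.9 to module 1 and 0.1 to the other members of module 2, so it is moved to module 1 in the first pass and the labelling is stable in the second. Because the sketch only needs a similarity matrix, it could follow any initial clustering, which matches the abstract's remark that the approach is not limited to co-expression networks.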
format Online
Article
Text
id pubmed-7828115
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7828115 2021-01-25 K-Module Algorithm: An Additional Step to Improve the Clustering Results of WGCNA Co-Expression Networks Hou, Jie Ye, Xiufen Li, Chuanlong Wang, Yixing Genes (Basel) Article MDPI 2021-01-12 /pmc/articles/PMC7828115/ /pubmed/33445666 http://dx.doi.org/10.3390/genes12010087 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland.
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title K-Module Algorithm: An Additional Step to Improve the Clustering Results of WGCNA Co-Expression Networks
topic Article