K-Module Algorithm: An Additional Step to Improve the Clustering Results of WGCNA Co-Expression Networks
Among biological networks, co-expression networks have been widely studied. One of the most commonly used pipelines for the construction of co-expression networks is weighted gene co-expression network analysis (WGCNA), which can identify highly co-expressed clusters of genes (modules). WGCNA identi...
Main Authors: | Hou, Jie, Ye, Xiufen, Li, Chuanlong, Wang, Yixing |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7828115/ https://www.ncbi.nlm.nih.gov/pubmed/33445666 http://dx.doi.org/10.3390/genes12010087 |
_version_ | 1783640930722512896 |
---|---|
author | Hou, Jie Ye, Xiufen Li, Chuanlong Wang, Yixing |
author_sort | Hou, Jie |
collection | PubMed |
description | Among biological networks, co-expression networks have been widely studied. One of the most commonly used pipelines for the construction of co-expression networks is weighted gene co-expression network analysis (WGCNA), which can identify highly co-expressed clusters of genes (modules). WGCNA identifies gene modules using hierarchical clustering. The major drawback of hierarchical clustering is that once two objects are clustered together, it cannot be reversed; thus, re-adjustment of the unbefitting decision is impossible. In this paper, we calculate the similarity matrix with the distance correlation for WGCNA to construct a gene co-expression network, and present a new approach called the k-module algorithm to improve the WGCNA clustering results. This method can assign all genes to the module with the highest mean connectivity with these genes. This algorithm re-adjusts the results of hierarchical clustering while retaining the advantages of the dynamic tree cut method. The validity of the algorithm is verified using six datasets from microarray and RNA-seq data. The k-module algorithm has fewer iterations, which leads to lower complexity. We verify that the gene modules obtained by the k-module algorithm have high enrichment scores and strong stability. Our method improves upon hierarchical clustering, and can be applied to general clustering algorithms based on the similarity matrix, not limited to gene co-expression network analysis. |
format | Online Article Text |
id | pubmed-7828115 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-7828115 2021-01-25 K-Module Algorithm: An Additional Step to Improve the Clustering Results of WGCNA Co-Expression Networks Hou, Jie Ye, Xiufen Li, Chuanlong Wang, Yixing Genes (Basel) Article Among biological networks, co-expression networks have been widely studied. One of the most commonly used pipelines for the construction of co-expression networks is weighted gene co-expression network analysis (WGCNA), which can identify highly co-expressed clusters of genes (modules). WGCNA identifies gene modules using hierarchical clustering. The major drawback of hierarchical clustering is that once two objects are clustered together, it cannot be reversed; thus, re-adjustment of the unbefitting decision is impossible. In this paper, we calculate the similarity matrix with the distance correlation for WGCNA to construct a gene co-expression network, and present a new approach called the k-module algorithm to improve the WGCNA clustering results. This method can assign all genes to the module with the highest mean connectivity with these genes. This algorithm re-adjusts the results of hierarchical clustering while retaining the advantages of the dynamic tree cut method. The validity of the algorithm is verified using six datasets from microarray and RNA-seq data. The k-module algorithm has fewer iterations, which leads to lower complexity. We verify that the gene modules obtained by the k-module algorithm have high enrichment scores and strong stability. Our method improves upon hierarchical clustering, and can be applied to general clustering algorithms based on the similarity matrix, not limited to gene co-expression network analysis. MDPI 2021-01-12 /pmc/articles/PMC7828115/ /pubmed/33445666 http://dx.doi.org/10.3390/genes12010087 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Hou, Jie Ye, Xiufen Li, Chuanlong Wang, Yixing K-Module Algorithm: An Additional Step to Improve the Clustering Results of WGCNA Co-Expression Networks |
title | K-Module Algorithm: An Additional Step to Improve the Clustering Results of WGCNA Co-Expression Networks |
title_sort | k-module algorithm: an additional step to improve the clustering results of wgcna co-expression networks |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7828115/ https://www.ncbi.nlm.nih.gov/pubmed/33445666 http://dx.doi.org/10.3390/genes12010087 |
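
The description above states that the k-module algorithm assigns each gene to the module with which it has the highest mean connectivity, re-adjusting the modules produced by hierarchical clustering. The following is a minimal sketch of that reassignment idea only, not the authors' implementation. The abstract gives no details on update order, stopping rule, or tie handling, so the batch update, the `max_iter` cap, the "stay put on ties" rule, and the names `mean_connectivity` and `k_module_reassign` are all assumptions made for illustration. The adjacency matrix would normally come from WGCNA; here it is a small hand-made example.

```python
def mean_connectivity(adj, gene, members):
    """Mean adjacency between `gene` and the other genes in `members`."""
    others = [j for j in members if j != gene]
    if not others:
        return None  # module has no other members to compare against
    return sum(adj[gene][j] for j in others) / len(others)


def k_module_reassign(adj, labels, max_iter=100):
    """Iteratively move each gene to the module with the highest mean connectivity.

    adj    : symmetric n x n list of lists with adjacency/similarity values
    labels : initial module label per gene (e.g. from hierarchical clustering)
    Returns (final_labels, iterations_used).
    Assumption: all genes are updated together from the previous labels.
    """
    labels = list(labels)
    for iteration in range(1, max_iter + 1):
        # Group gene indices by their current module label.
        modules = {}
        for gene, lab in enumerate(labels):
            modules.setdefault(lab, []).append(gene)

        new_labels = []
        for gene, current in enumerate(labels):
            best_label = current
            best_score = mean_connectivity(adj, gene, modules[current])
            for lab in sorted(modules):
                score = mean_connectivity(adj, gene, modules[lab])
                # Strict comparison: on a tie the gene keeps its current module.
                if score is not None and (best_score is None or score > best_score):
                    best_label, best_score = lab, score
            new_labels.append(best_label)

        if new_labels == labels:  # no gene changed module: converged
            return labels, iteration
        labels = new_labels
    return labels, max_iter


# Toy example: two tight groups of three genes; gene 2 starts in the wrong module.
block = [0, 0, 0, 1, 1, 1]
adj = [[0.0 if i == j else (0.9 if block[i] == block[j] else 0.1)
        for j in range(6)] for i in range(6)]
final, n_iter = k_module_reassign(adj, [0, 0, 1, 1, 1, 1])
print(final, n_iter)  # [0, 0, 0, 1, 1, 1] 2
```

In the toy run, the first pass moves gene 2 into module 0 and the second pass confirms that no further gene changes module, which mirrors the description's point that a decision made by hierarchical clustering can be revised afterwards.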