Identifying potential biomarkers for non-obstructive azoospermia using WGCNA and machine learning algorithms

OBJECTIVE: The causes and mechanisms of non-obstructive azoospermia (NOA) are complex, and an effective therapeutic strategy has yet to be developed. This study aimed to analyse the pathogenesis of NOA at the molecular level and to identify the core regulatory genes, which could be uti...

Bibliographic Details
Main Authors: Tang, Qizhen; Su, Quanxin; Wei, Letian; Wang, Kenan; Jiang, Tao
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Endocrinology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10579891/
https://www.ncbi.nlm.nih.gov/pubmed/37854191
http://dx.doi.org/10.3389/fendo.2023.1108616
_version_ 1785121828365139968
author Tang, Qizhen
Su, Quanxin
Wei, Letian
Wang, Kenan
Jiang, Tao
collection PubMed
description OBJECTIVE: The causes and mechanisms of non-obstructive azoospermia (NOA) are complex, and an effective therapeutic strategy has yet to be developed. This study aimed to analyse the pathogenesis of NOA at the molecular level and to identify the core regulatory genes, which could be utilised as potential biomarkers.
METHODS: Three NOA microarray datasets (GSE45885, GSE108886, and GSE145467) were collected from the GEO database and merged to form the training set; a further dataset (GSE45887) was defined as the validation set. Differential gene analysis, consensus cluster analysis, and WGCNA were used to identify preliminary signature genes, and enrichment analysis was then applied to these signature genes. Next, four machine learning algorithms (RF, SVM, GLM, and XGB) were used to detect the potential biomarkers most closely associated with NOA. Finally, a diagnostic model was constructed from these potential biomarkers and visualised as a nomogram. The differential expression and predictive reliability of the biomarkers were confirmed using the validation set. In addition, a competing endogenous RNA network was constructed to identify the regulatory mechanisms of the potential biomarkers, and the CIBERSORT algorithm was used to calculate the immune infiltration status of the samples.
RESULTS: A total of 215 differentially expressed genes (DEGs) were identified between the NOA and control groups (27 upregulated and 188 downregulated). WGCNA identified 1123 genes in the MEblue module as target genes highly correlated with NOA positivity. The NOA samples were divided into 2 clusters using consensus clustering; 1027 genes in the MEblue module, screened by WGCNA, were considered target genes highly correlated with NOA classification. The 129 overlapping genes were then established as signature genes. The XGB algorithm, which had the highest AUC (AUC=0.946) and the lowest residual value, was used to further screen the signature genes. IL20RB, C9orf117, HILS1, PAOX, and DZIP1 were identified as potential NOA biomarkers. The five-biomarker model achieved a higher AUC (up to 0.982) than any single-biomarker model, and its results were verified in the validation set.
CONCLUSIONS: Because IL20RB, C9orf117, HILS1, PAOX, and DZIP1 showed the strongest association with NOA, these five genes could serve as potential therapeutic targets for NOA patients. Furthermore, the model constructed from these five genes, which had the highest diagnostic accuracy, may be an effective biomarker model that warrants further experimental validation.
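Two steps in the abstract above can be illustrated with a minimal sketch: taking the overlap of several gene lists to obtain signature genes, and scoring a model by AUC. The sketch assumes that the overlap is a plain set intersection and that AUC is computed in its rank-based (Mann-Whitney) form; the record does not say how the authors implemented either step. The function names, gene identifiers, labels, and scores below are all hypothetical and are not data from the study.

```python
from typing import Iterable, Sequence


def overlap_genes(*gene_sets: Iterable[str]) -> set[str]:
    """Return the genes present in every supplied list (the overlap)."""
    sets = [set(g) for g in gene_sets]
    if not sets:
        return set()
    return set.intersection(*sets)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random case outscores a random control.

    Ties count as half a win (Mann-Whitney U formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy inputs; gene names are placeholders, not study results.
degs = {"GENE_1", "GENE_2", "GENE_3", "GENE_4"}
module_by_status = {"GENE_1", "GENE_2", "GENE_3", "GENE_5"}
module_by_cluster = {"GENE_1", "GENE_2", "GENE_4", "GENE_5"}
signature = overlap_genes(degs, module_by_status, module_by_cluster)

labels = [1, 1, 1, 0, 0]            # 1 = case, 0 = control (made up)
scores = [0.9, 0.8, 0.4, 0.5, 0.1]  # made-up model outputs
toy_auc = auc(labels, scores)
```

The pairwise form of AUC is quadratic in sample size, which is acceptable for cohorts of the size typical of microarray datasets; a sort-based rank computation would be the usual choice for larger inputs.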
format Online
Article
Text
id pubmed-10579891
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10579891 2023-10-18 Identifying potential biomarkers for non-obstructive azoospermia using WGCNA and machine learning algorithms Tang, Qizhen Su, Quanxin Wei, Letian Wang, Kenan Jiang, Tao Front Endocrinol (Lausanne) Endocrinology Frontiers Media S.A. 2023-10-03 /pmc/articles/PMC10579891/ /pubmed/37854191 http://dx.doi.org/10.3389/fendo.2023.1108616 Text en Copyright © 2023 Tang, Su, Wei, Wang and Jiang https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Identifying potential biomarkers for non-obstructive azoospermia using WGCNA and machine learning algorithms
topic Endocrinology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10579891/
https://www.ncbi.nlm.nih.gov/pubmed/37854191
http://dx.doi.org/10.3389/fendo.2023.1108616