Identification and validation of diagnostic signature genes in non-obstructive azoospermia by machine learning

Non-obstructive azoospermia (NOA) is a common cause of male infertility, yet no specific diagnostic indicators exist. In this study, we used the human testis datasets GSE45885, GSE45887, and GSE108886 from the GEO database as training datasets and screened six signature genes (all expressed at low levels in the NOA...

Bibliographic Details
Main authors: Ran, Lingxiang; Gao, Zhixiang; Chen, Qiu; Cui, Fengmei; Liu, Xiaolong; Xue, Boxin
Format: Online Article Text
Language: English
Published: Impact Journals 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10257997/
https://www.ncbi.nlm.nih.gov/pubmed/37227814
http://dx.doi.org/10.18632/aging.204749
author Ran, Lingxiang
Gao, Zhixiang
Chen, Qiu
Cui, Fengmei
Liu, Xiaolong
Xue, Boxin
collection PubMed
description Non-obstructive azoospermia (NOA) is a common cause of male infertility, yet no specific diagnostic indicators exist. In this study, we used the human testis datasets GSE45885, GSE45887, and GSE108886 from the GEO database as training datasets and screened six signature genes (all expressed at low levels in the NOA group) using the Boruta algorithm and Lasso regression: C12orf54, TSSK6, OR2H1, FER1L5, C9orf153, and XKR3. The diagnostic efficacy of these genes was examined by constructing models with the LightGBM algorithm: for internal validation, the AUC (area under the curve) of both the ROC and Precision-Recall curves was 1.0 (p < 0.05). For the external validation dataset GSE145467 (human testis), the AUC of the ROC curve was 0.9 and that of the Precision-Recall curve was 0.833 (p < 0.05). Next, we determined the cellular localization of these genes using the human testis single-cell RNA sequencing dataset GSE149512; all were localized to spermatids. In addition, the downstream regulatory mechanisms of these genes in spermatids were inferred with the GSEA algorithm: C12orf54 may be involved in the repression of E2F-related and MYC-related pathways, TSSK6 and C9orf153 may be involved in the repression of MYC-related pathways, and FER1L5 may be involved in the repression of the spermatogenesis pathway. Finally, we constructed a mouse model of NOA using X-ray irradiation, and quantitative real-time PCR showed that C12orf54, TSSK6, OR2H1, FER1L5, and C9orf153 were all expressed at low levels in the NOA group. In summary, we identified novel signature genes of NOA using machine learning methods and validated them experimentally, which will be helpful for its early diagnosis.
format Online
Article
Text
id pubmed-10257997
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Impact Journals
record_format MEDLINE/PubMed
spelling pubmed-10257997 2023-06-13 Aging (Albany NY) Research Paper
Impact Journals 2023-05-24 /pmc/articles/PMC10257997/ /pubmed/37227814 http://dx.doi.org/10.18632/aging.204749 Text en Copyright: © 2023 Ran et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0) (https://creativecommons.org/licenses/by/3.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Identification and validation of diagnostic signature genes in non-obstructive azoospermia by machine learning
topic Research Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10257997/
https://www.ncbi.nlm.nih.gov/pubmed/37227814
http://dx.doi.org/10.18632/aging.204749