Identification of Diagnostic CpG Signatures in Patients with Gestational Diabetes Mellitus via Epigenome-Wide Association Study Integrated with Machine Learning

BACKGROUND: Gestational diabetes mellitus (GDM) is the most prevalent metabolic disease during pregnancy, but the diagnosis is controversial and lagging partly due to the lack of useful biomarkers. CpG methylation is involved in the development of GDM. However, the specific CpG methylation sites ser...

Bibliographic Details
Main authors: Liu, Yan; Geng, Hui; Duan, Bide; Yang, Xiuzhi; Ma, Airong; Ding, Xiaoyan
Format: Online Article Text
Language: English
Published: Hindawi 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8162250/
https://www.ncbi.nlm.nih.gov/pubmed/34104645
http://dx.doi.org/10.1155/2021/1984690
_version_ 1783700671954944000
author Liu, Yan
Geng, Hui
Duan, Bide
Yang, Xiuzhi
Ma, Airong
Ding, Xiaoyan
author_facet Liu, Yan
Geng, Hui
Duan, Bide
Yang, Xiuzhi
Ma, Airong
Ding, Xiaoyan
author_sort Liu, Yan
collection PubMed
description BACKGROUND: Gestational diabetes mellitus (GDM) is the most prevalent metabolic disease during pregnancy, but the diagnosis is controversial and lagging partly due to the lack of useful biomarkers. CpG methylation is involved in the development of GDM. However, the specific CpG methylation sites serving as diagnostic biomarkers of GDM remain unclear. Here, we aimed to explore CpG signatures and establish the predicting model for the GDM diagnosis. METHODS: DNA methylation data of GSE88929 and GSE102177 were obtained from the GEO database, followed by the epigenome-wide association study (EWAS). GO and KEGG pathway analyses were performed by using the clusterProfiler package of R. The PPI network was constructed in the STRING database and Cytoscape software. The SVM model was established, in which the β-values of selected CpG sites were the predictor variable and the occurrence of GDM was the outcome variable. RESULTS: We identified 62 significant CpG methylation sites in the GDM samples compared with the control samples. GO and KEGG analyses based on the 62 CpG sites demonstrated that several essential cellular processes and signaling pathways were enriched in the system. A total of 12 hub genes related to the identified CpG sites were found in the PPI network. The SVM model based on the selected CpGs within the promoter region, including cg00922748, cg05216211, cg05376185, cg06617468, cg17097119, and cg22385669, was established, and the AUC values of the training set and testing set in the model were 0.8138 and 0.7576. The AUC value of the independent validation set of GSE102177 was 0.6667. CONCLUSION: We identified potential diagnostic CpG signatures by EWAS integrated with the SVM model. The SVM model based on the identified 6 CpG sites reliably predicted the GDM occurrence, contributing to the diagnosis of GDM. Our finding provides new insights into the cross-application of EWAS and machine learning in GDM investigation.
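The AUC values quoted in the description can be read as the probability that a model scores a randomly chosen GDM sample above a randomly chosen control sample. The sketch below computes that quantity directly from two lists of scores in plain Python. It is an illustration only: the function name `auc` and the example scores are hypothetical and are not taken from the study, which reports an SVM trained on CpG β-values.

```python
from itertools import product


def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties count as half a win.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("need at least one sample in each class")
    wins = 0.0
    for p, n in product(scores_pos, scores_neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier scores, invented for illustration only.
gdm_scores = [0.9, 0.4, 0.7]      # samples labelled GDM
control_scores = [0.2, 0.5, 0.1]  # samples labelled control

example_auc = auc(gdm_scores, control_scores)
print(f"AUC = {example_auc:.4f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives a scale for reading the training, testing, and independent-validation values reported in the description.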
format Online
Article
Text
id pubmed-8162250
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-81622502021-06-07 Identification of Diagnostic CpG Signatures in Patients with Gestational Diabetes Mellitus via Epigenome-Wide Association Study Integrated with Machine Learning Liu, Yan Geng, Hui Duan, Bide Yang, Xiuzhi Ma, Airong Ding, Xiaoyan Biomed Res Int Research Article BACKGROUND: Gestational diabetes mellitus (GDM) is the most prevalent metabolic disease during pregnancy, but the diagnosis is controversial and lagging partly due to the lack of useful biomarkers. CpG methylation is involved in the development of GDM. However, the specific CpG methylation sites serving as diagnostic biomarkers of GDM remain unclear. Here, we aimed to explore CpG signatures and establish the predicting model for the GDM diagnosis. METHODS: DNA methylation data of GSE88929 and GSE102177 were obtained from the GEO database, followed by the epigenome-wide association study (EWAS). GO and KEGG pathway analyses were performed by using the clusterProfiler package of R. The PPI network was constructed in the STRING database and Cytoscape software. The SVM model was established, in which the β-values of selected CpG sites were the predictor variable and the occurrence of GDM was the outcome variable. RESULTS: We identified 62 significant CpG methylation sites in the GDM samples compared with the control samples. GO and KEGG analyses based on the 62 CpG sites demonstrated that several essential cellular processes and signaling pathways were enriched in the system. A total of 12 hub genes related to the identified CpG sites were found in the PPI network. The SVM model based on the selected CpGs within the promoter region, including cg00922748, cg05216211, cg05376185, cg06617468, cg17097119, and cg22385669, was established, and the AUC values of the training set and testing set in the model were 0.8138 and 0.7576. The AUC value of the independent validation set of GSE102177 was 0.6667. 
CONCLUSION: We identified potential diagnostic CpG signatures by EWAS integrated with the SVM model. The SVM model based on the identified 6 CpG sites reliably predicted the GDM occurrence, contributing to the diagnosis of GDM. Our finding provides new insights into the cross-application of EWAS and machine learning in GDM investigation. Hindawi 2021-05-19 /pmc/articles/PMC8162250/ /pubmed/34104645 http://dx.doi.org/10.1155/2021/1984690 Text en Copyright © 2021 Yan Liu et al. https://creativecommons.org/licenses/by/4.0/This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Liu, Yan
Geng, Hui
Duan, Bide
Yang, Xiuzhi
Ma, Airong
Ding, Xiaoyan
Identification of Diagnostic CpG Signatures in Patients with Gestational Diabetes Mellitus via Epigenome-Wide Association Study Integrated with Machine Learning
title Identification of Diagnostic CpG Signatures in Patients with Gestational Diabetes Mellitus via Epigenome-Wide Association Study Integrated with Machine Learning
title_full Identification of Diagnostic CpG Signatures in Patients with Gestational Diabetes Mellitus via Epigenome-Wide Association Study Integrated with Machine Learning
title_fullStr Identification of Diagnostic CpG Signatures in Patients with Gestational Diabetes Mellitus via Epigenome-Wide Association Study Integrated with Machine Learning
title_full_unstemmed Identification of Diagnostic CpG Signatures in Patients with Gestational Diabetes Mellitus via Epigenome-Wide Association Study Integrated with Machine Learning
title_short Identification of Diagnostic CpG Signatures in Patients with Gestational Diabetes Mellitus via Epigenome-Wide Association Study Integrated with Machine Learning
title_sort identification of diagnostic cpg signatures in patients with gestational diabetes mellitus via epigenome-wide association study integrated with machine learning
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8162250/
https://www.ncbi.nlm.nih.gov/pubmed/34104645
http://dx.doi.org/10.1155/2021/1984690