
Integrated genomic analysis defines molecular subgroups in dilated cardiomyopathy and identifies novel biomarkers based on machine learning methods

Aim: As the most common cardiomyopathy, dilated cardiomyopathy (DCM) often leads to progressive heart failure and sudden cardiac death. This study was designed to investigate the molecular subgroups of DCM. Methods: Three datasets of DCM were downloaded from the GEO database (GSE17800, GSE79962 and GSE3...


Bibliographic Details
Main Authors: Ye, Ling-Fang; Weng, Jia-Yi; Wu, Li-Da
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Genetics
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9941670/
https://www.ncbi.nlm.nih.gov/pubmed/36824437
http://dx.doi.org/10.3389/fgene.2023.1050696
_version_ 1784891338287742976
author Ye, Ling-Fang
Weng, Jia-Yi
Wu, Li-Da
author_facet Ye, Ling-Fang
Weng, Jia-Yi
Wu, Li-Da
author_sort Ye, Ling-Fang
collection PubMed
description Aim: As the most common cardiomyopathy, dilated cardiomyopathy (DCM) often leads to progressive heart failure and sudden cardiac death. This study was designed to investigate the molecular subgroups of DCM. Methods: Three DCM datasets (GSE17800, GSE79962 and GSE3585) were downloaded from the GEO database. After log2 transformation and background correction with the “limma” package in R, the three datasets were merged into a metadata cohort. Consensus clustering was performed with the “ConsensusClusterPlus” package to uncover the molecular subgroups of DCM, and the clinical characteristics of the subgroups were compared in detail. Weighted gene co-expression network analysis (WGCNA), based on subgroup-specific gene expression signatures, was used to further explore the gene modules specific to each molecular subgroup and their biological functions. Two machine learning methods, the LASSO regression algorithm and the SVM-RFE algorithm, were used to screen for genetic biomarkers, whose ability to discriminate between molecular subgroups was evaluated with receiver operating characteristic (ROC) curves. Results: Based on gene expression profiles, heart tissue samples from patients with DCM were clustered into three molecular subgroups. No statistically significant differences in age, body mass index (BMI) or left ventricular internal diameter at end-diastole (LVIDD) were found among the three subgroups. However, left ventricular ejection fraction (LVEF) showed that patients in subgroup 2 were in worse condition than those in the other subgroups. Some of the WGCNA gene modules (pink, black and grey) were significantly related to cardiac function, and each molecular subgroup had specific gene modules that function in modulating the occurrence and progression of DCM. The LASSO regression and SVM-RFE algorithms were then used to screen for genetic biomarkers of molecular subgroup 2, which included TCEAL4, ISG15, RWDD1, ALG5, MRPL20, JTB and LITAF. The ROC curves showed that all of these biomarkers had favorable discriminative performance. Conclusion: Patients in different molecular subgroups have unique gene expression patterns and different clinical characteristics. More personalized treatment, guided by gene expression patterns, should be pursued.
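The abstract evaluates each candidate biomarker by its ROC curve. As a rough illustration of what that evaluation measures, the sketch below computes the area under the ROC curve (AUC) for a single gene as the probability that a sample from the target subgroup has a higher expression value than a sample from outside it. This is a minimal sketch and not the authors' pipeline: the function name `roc_auc` and the expression values are hypothetical and made up for illustration only.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Return the ROC AUC of `scores` for binary `labels` (1 = positive, 0 = negative).

    Uses the rank interpretation of AUC: the fraction of positive/negative
    pairs in which the positive sample has the higher score (ties count 0.5).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical log2 expression values of one candidate gene (invented numbers,
# not data from the study) and whether each sample belongs to subgroup 2.
expression = [8.1, 7.9, 8.4, 6.2, 6.5, 7.5, 7.2, 6.1]
in_subgroup_2 = [1, 1, 1, 0, 0, 0, 1, 0]

auc = roc_auc(expression, in_subgroup_2)
print(f"AUC = {auc:.4f}")
```

An AUC near 1.0 means the gene separates the subgroup almost perfectly, while 0.5 means it does no better than chance. The pairwise loop is quadratic in the number of samples, which is acceptable for cohorts of this size; a rank-based formula would scale better.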
format Online
Article
Text
id pubmed-9941670
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9941670 2023-02-22 Integrated genomic analysis defines molecular subgroups in dilated cardiomyopathy and identifies novel biomarkers based on machine learning methods Ye, Ling-Fang Weng, Jia-Yi Wu, Li-Da Front Genet Genetics Frontiers Media S.A. 2023-02-07 /pmc/articles/PMC9941670/ /pubmed/36824437 http://dx.doi.org/10.3389/fgene.2023.1050696 Text en Copyright © 2023 Ye, Weng and Wu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Genetics
Ye, Ling-Fang
Weng, Jia-Yi
Wu, Li-Da
Integrated genomic analysis defines molecular subgroups in dilated cardiomyopathy and identifies novel biomarkers based on machine learning methods
title Integrated genomic analysis defines molecular subgroups in dilated cardiomyopathy and identifies novel biomarkers based on machine learning methods
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9941670/
https://www.ncbi.nlm.nih.gov/pubmed/36824437
http://dx.doi.org/10.3389/fgene.2023.1050696