Multiple-model machine learning identifies potential functional genes in dilated cardiomyopathy

INTRODUCTION: Machine learning (ML) has become widely used in many fields, including disease diagnosis in healthcare. However, a single algorithm is limited in its ability to explore diagnostic value in dilated cardiomyopathy (DCM). We aim to develop a novel overall normalized sum weight of...

Bibliographic Details
Main authors: Zhang, Lin; Lin, Yexiang; Wang, Kaiyue; Han, Lifeng; Zhang, Xue; Gao, Xiumei; Li, Zheng; Zhang, Houliang; Zhou, Jiashun; Yu, Heshui; Fu, Xuebin
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Cardiovascular Medicine
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9874116/
https://www.ncbi.nlm.nih.gov/pubmed/36712235
http://dx.doi.org/10.3389/fcvm.2022.1044443
_version_ 1784877734081593344
author Zhang, Lin
Lin, Yexiang
Wang, Kaiyue
Han, Lifeng
Zhang, Xue
Gao, Xiumei
Li, Zheng
Zhang, Houliang
Zhou, Jiashun
Yu, Heshui
Fu, Xuebin
author_sort Zhang, Lin
collection PubMed
description INTRODUCTION: Machine learning (ML) has become widely used in many fields, including disease diagnosis in healthcare. However, a single algorithm is limited in its ability to explore diagnostic value in dilated cardiomyopathy (DCM). We aim to develop a novel overall normalized sum weight of multiple-model ML methods to assess diagnostic value in DCM. METHODS: Gene expression data were selected from previously published databases (six eligible microarray datasets, 386 samples) according to eligibility criteria. Two microarray datasets were used for training; the others were used as testing sets (ratio 5:1). In total, we identified 20 differentially expressed genes (DEGs) between DCM and control individuals (7 upregulated and 13 downregulated). RESULTS: We developed six classification ML methods to identify potential candidate genes based on their overall weights. Three genes, serine proteinase inhibitor A3 (SERPINA3), frizzled-related protein (FRP) 3 (FRZB), and ficolin 3 (FCN3), were finally identified by receiver operating characteristic (ROC) analysis. Interestingly, all three genes correlated considerably with plasma cells. Importantly, in both the training and testing sets, the areas under the curve (AUCs) for SERPINA3, FRZB, and FCN3 were greater than 0.88. The AUC of SERPINA3 was notably high (0.940 in the training sets and 0.918 in the testing sets), indicating that it is a potentially functional gene in DCM. In particular, plasma levels of SERPINA3, FCN3, and FRZB in DCM patients differed significantly from those in healthy controls. DISCUSSION: SERPINA3, FRZB, and FCN3 might be potential diagnostic targets for DCM. Further verification work could be undertaken.
format Online
Article
Text
id pubmed-9874116
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9874116 2023-01-26 Front Cardiovasc Med Cardiovascular Medicine Frontiers Media S.A. 2023-01-11 /pmc/articles/PMC9874116/ /pubmed/36712235 http://dx.doi.org/10.3389/fcvm.2022.1044443 Text en Copyright © 2023 Zhang, Lin, Wang, Han, Zhang, Gao, Li, Zhang, Zhou, Yu and Fu. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Multiple-model machine learning identifies potential functional genes in dilated cardiomyopathy
topic Cardiovascular Medicine
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9874116/
https://www.ncbi.nlm.nih.gov/pubmed/36712235
http://dx.doi.org/10.3389/fcvm.2022.1044443