A cost-effective machine learning-based method for preeclampsia risk assessment and driver genes discovery
BACKGROUND: The placenta, as a unique exchange organ between mother and fetus, is essential for successful human pregnancy and fetal health. Preeclampsia (PE) caused by placental dysfunction contributes to both maternal and infant morbidity and mortality. Accurate identification of PE patients plays...
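Main authors: | Wang, Hao; Zhang, Zhaoyue; Li, Haicheng; Li, Jinzhao; Li, Hanshuang; Liu, Mingzhu; Liang, Pengfei; Xi, Qilemuge; Xing, Yongqiang; Yang, Lei; Zuo, Yongchun |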
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9972636/ https://www.ncbi.nlm.nih.gov/pubmed/36849879 http://dx.doi.org/10.1186/s13578-023-00991-y |
_version_ | 1784898363967143936 |
---|---|
author | Wang, Hao Zhang, Zhaoyue Li, Haicheng Li, Jinzhao Li, Hanshuang Liu, Mingzhu Liang, Pengfei Xi, Qilemuge Xing, Yongqiang Yang, Lei Zuo, Yongchun |
author_facet | Wang, Hao Zhang, Zhaoyue Li, Haicheng Li, Jinzhao Li, Hanshuang Liu, Mingzhu Liang, Pengfei Xi, Qilemuge Xing, Yongqiang Yang, Lei Zuo, Yongchun |
author_sort | Wang, Hao |
collection | PubMed |
description | BACKGROUND: The placenta, as a unique exchange organ between mother and fetus, is essential for successful human pregnancy and fetal health. Preeclampsia (PE) caused by placental dysfunction contributes to both maternal and infant morbidity and mortality. Accurate identification of PE patients plays a vital role in the formulation of treatment plans. However, the traditional clinical methods of PE have a high misdiagnosis rate. RESULTS: Here, we first designed a computational biology method that used single-cell transcriptome (scRNA-seq) of healthy pregnancy (38 wk) and early-onset PE (28–32 wk) to identify pathological cell subpopulations and predict PE risk. Based on machine learning methods and feature selection techniques, we observed that the Tuning ReliefF (TURF) score hybrid with XGBoost (TURF_XGB) achieved optimal performance, with 92.61% accuracy and 92.46% recall for classifying nine cell subpopulations of healthy placentas. Biological landscapes of placenta heterogeneity could be mapped by the 110 marker genes screened by TURF_XGB, which revealed the superiority of the TURF feature mining. Moreover, we processed the PE dataset with LASSO to obtain 497 biomarkers. Integration analysis of the above two gene sets revealed that dendritic cells were closely associated with early-onset PE, and C1QB and C1QC might drive preeclampsia by mediating inflammation. In addition, an ensemble model-based risk stratification card was developed to classify preeclampsia patients, and its area under the receiver operating characteristic curve (AUC) could reach 0.99. For broader accessibility, we designed an accessible online web server (http://bioinfor.imu.edu.cn/placenta). CONCLUSION: Single-cell transcriptome-based preeclampsia risk assessment using an ensemble machine learning framework is a valuable asset for clinical decision-making. C1QB and C1QC may be involved in the development and progression of early-onset PE by affecting the complement and coagulation cascades pathway that mediates inflammation, which has important implications for better understanding the pathogenesis of PE. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13578-023-00991-y. |
format | Online Article Text |
id | pubmed-9972636 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-99726362023-03-01 A cost-effective machine learning-based method for preeclampsia risk assessment and driver genes discovery Wang, Hao Zhang, Zhaoyue Li, Haicheng Li, Jinzhao Li, Hanshuang Liu, Mingzhu Liang, Pengfei Xi, Qilemuge Xing, Yongqiang Yang, Lei Zuo, Yongchun Cell Biosci Research BACKGROUND: The placenta, as a unique exchange organ between mother and fetus, is essential for successful human pregnancy and fetal health. Preeclampsia (PE) caused by placental dysfunction contributes to both maternal and infant morbidity and mortality. Accurate identification of PE patients plays a vital role in the formulation of treatment plans. However, the traditional clinical methods of PE have a high misdiagnosis rate. RESULTS: Here, we first designed a computational biology method that used single-cell transcriptome (scRNA-seq) of healthy pregnancy (38 wk) and early-onset PE (28–32 wk) to identify pathological cell subpopulations and predict PE risk. Based on machine learning methods and feature selection techniques, we observed that the Tuning ReliefF (TURF) score hybrid with XGBoost (TURF_XGB) achieved optimal performance, with 92.61% accuracy and 92.46% recall for classifying nine cell subpopulations of healthy placentas. Biological landscapes of placenta heterogeneity could be mapped by the 110 marker genes screened by TURF_XGB, which revealed the superiority of the TURF feature mining. Moreover, we processed the PE dataset with LASSO to obtain 497 biomarkers. Integration analysis of the above two gene sets revealed that dendritic cells were closely associated with early-onset PE, and C1QB and C1QC might drive preeclampsia by mediating inflammation. In addition, an ensemble model-based risk stratification card was developed to classify preeclampsia patients, and its area under the receiver operating characteristic curve (AUC) could reach 0.99. 
For broader accessibility, we designed an accessible online web server (http://bioinfor.imu.edu.cn/placenta). CONCLUSION: Single-cell transcriptome-based preeclampsia risk assessment using an ensemble machine learning framework is a valuable asset for clinical decision-making. C1QB and C1QC may be involved in the development and progression of early-onset PE by affecting the complement and coagulation cascades pathway that mediates inflammation, which has important implications for better understanding the pathogenesis of PE. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13578-023-00991-y. BioMed Central 2023-02-28 /pmc/articles/PMC9972636/ /pubmed/36849879 http://dx.doi.org/10.1186/s13578-023-00991-y Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
|
spellingShingle | Research Wang, Hao Zhang, Zhaoyue Li, Haicheng Li, Jinzhao Li, Hanshuang Liu, Mingzhu Liang, Pengfei Xi, Qilemuge Xing, Yongqiang Yang, Lei Zuo, Yongchun A cost-effective machine learning-based method for preeclampsia risk assessment and driver genes discovery |
title | A cost-effective machine learning-based method for preeclampsia risk assessment and driver genes discovery |
title_sort | cost-effective machine learning-based method for preeclampsia risk assessment and driver genes discovery |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9972636/ https://www.ncbi.nlm.nih.gov/pubmed/36849879 http://dx.doi.org/10.1186/s13578-023-00991-y |
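
The description field above reports three evaluation metrics: accuracy and recall for the nine-class cell subpopulation classifier, and AUC for the risk stratification model. As a rough illustration of what those numbers measure, here is a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the toy labels and scores are invented for the example, and macro-averaging of recall is an assumption, since the record does not say how recall was averaged across classes.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall (assumed averaging scheme)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = defaultdict(int)  # samples per true class
    hits = defaultdict(int)   # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[c] / total[c] for c in total) / len(total)


def roc_auc(labels, scores):
    """Binary AUC: probability that a random positive sample scores
    higher than a random negative one, counting ties as one half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy multi-class labels (invented, not from the study).
    true_cells = ["a", "b", "a", "c"]
    pred_cells = ["a", "b", "c", "c"]
    print(accuracy(true_cells, pred_cells))
    print(macro_recall(true_cells, pred_cells))

    # Toy binary risk scores (invented, not from the study).
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The pairwise AUC loop is quadratic in the number of samples, which is fine for a small illustration; a rank-based formulation or a library routine would be the usual choice for real datasets.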