
Polycystic Ovary Syndrome Detection Machine Learning Model Based on Optimized Feature Selection and Explainable Artificial Intelligence

Polycystic ovary syndrome (PCOS) has been classified as a severe health problem common among women globally. Early detection and treatment of PCOS reduce the possibility of long-term complications, such as an increased risk of developing type 2 diabetes and gestational diabetes. Therefore, effe...


Bibliographic Details
Main authors: Elmannai, Hela, El-Rashidy, Nora, Mashal, Ibrahim, Alohali, Manal Abdullah, Farag, Sara, El-Sappagh, Shaker, Saleh, Hager
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10137609/
https://www.ncbi.nlm.nih.gov/pubmed/37189606
http://dx.doi.org/10.3390/diagnostics13081506
_version_ 1785032507204304896
author Elmannai, Hela
El-Rashidy, Nora
Mashal, Ibrahim
Alohali, Manal Abdullah
Farag, Sara
El-Sappagh, Shaker
Saleh, Hager
author_facet Elmannai, Hela
El-Rashidy, Nora
Mashal, Ibrahim
Alohali, Manal Abdullah
Farag, Sara
El-Sappagh, Shaker
Saleh, Hager
author_sort Elmannai, Hela
collection PubMed
description Polycystic ovary syndrome (PCOS) has been classified as a severe health problem common among women globally. Early detection and treatment of PCOS reduce the possibility of long-term complications, such as an increased risk of developing type 2 diabetes and gestational diabetes. Therefore, effective and early PCOS diagnosis will help healthcare systems reduce the disease’s problems and complications. Machine learning (ML) and ensemble learning have recently shown promising results in medical diagnostics. The main goal of our research is to provide model explanations, both local and global, to ensure efficiency, effectiveness, and trust in the developed model. Feature selection methods are combined with different types of ML models (logistic regression (LR), random forest (RF), decision tree (DT), naive Bayes (NB), support vector machine (SVM), k-nearest neighbors (KNN), XGBoost, and AdaBoost) to obtain the optimal feature subset and the best model. Stacking ML models that combine the best base ML models with a meta-learner are proposed to improve performance. Bayesian optimization is used to optimize the ML models. A combination of SMOTE (Synthetic Minority Oversampling Technique) and ENN (Edited Nearest Neighbour) is used to address the class imbalance. The experiments were conducted on a benchmark PCOS dataset with two train/test split ratios, 70:30 and 80:20. The results showed that the stacking ML model with REF feature selection recorded the highest accuracy, 100%, compared to the other models.
format Online
Article
Text
id pubmed-10137609
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10137609 2023-04-28 Polycystic Ovary Syndrome Detection Machine Learning Model Based on Optimized Feature Selection and Explainable Artificial Intelligence Elmannai, Hela El-Rashidy, Nora Mashal, Ibrahim Alohali, Manal Abdullah Farag, Sara El-Sappagh, Shaker Saleh, Hager Diagnostics (Basel) Article Polycystic ovary syndrome (PCOS) has been classified as a severe health problem common among women globally. Early detection and treatment of PCOS reduce the possibility of long-term complications, such as an increased risk of developing type 2 diabetes and gestational diabetes. Therefore, effective and early PCOS diagnosis will help healthcare systems reduce the disease’s problems and complications. Machine learning (ML) and ensemble learning have recently shown promising results in medical diagnostics. The main goal of our research is to provide model explanations, both local and global, to ensure efficiency, effectiveness, and trust in the developed model. Feature selection methods are combined with different types of ML models (logistic regression (LR), random forest (RF), decision tree (DT), naive Bayes (NB), support vector machine (SVM), k-nearest neighbors (KNN), XGBoost, and AdaBoost) to obtain the optimal feature subset and the best model. Stacking ML models that combine the best base ML models with a meta-learner are proposed to improve performance. Bayesian optimization is used to optimize the ML models. A combination of SMOTE (Synthetic Minority Oversampling Technique) and ENN (Edited Nearest Neighbour) is used to address the class imbalance. The experiments were conducted on a benchmark PCOS dataset with two train/test split ratios, 70:30 and 80:20. The results showed that the stacking ML model with REF feature selection recorded the highest accuracy, 100%, compared to the other models. MDPI 2023-04-21 /pmc/articles/PMC10137609/ /pubmed/37189606 http://dx.doi.org/10.3390/diagnostics13081506 Text en © 2023 by the authors.
https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Elmannai, Hela
El-Rashidy, Nora
Mashal, Ibrahim
Alohali, Manal Abdullah
Farag, Sara
El-Sappagh, Shaker
Saleh, Hager
Polycystic Ovary Syndrome Detection Machine Learning Model Based on Optimized Feature Selection and Explainable Artificial Intelligence
title Polycystic Ovary Syndrome Detection Machine Learning Model Based on Optimized Feature Selection and Explainable Artificial Intelligence
title_full Polycystic Ovary Syndrome Detection Machine Learning Model Based on Optimized Feature Selection and Explainable Artificial Intelligence
title_fullStr Polycystic Ovary Syndrome Detection Machine Learning Model Based on Optimized Feature Selection and Explainable Artificial Intelligence
title_full_unstemmed Polycystic Ovary Syndrome Detection Machine Learning Model Based on Optimized Feature Selection and Explainable Artificial Intelligence
title_short Polycystic Ovary Syndrome Detection Machine Learning Model Based on Optimized Feature Selection and Explainable Artificial Intelligence
title_sort polycystic ovary syndrome detection machine learning model based on optimized feature selection and explainable artificial intelligence
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10137609/
https://www.ncbi.nlm.nih.gov/pubmed/37189606
http://dx.doi.org/10.3390/diagnostics13081506
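The description field of this record mentions combining SMOTE with ENN to handle class imbalance. The following is a minimal, standard-library-only sketch of that idea, not the authors' implementation: SMOTE adds synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours, and ENN then drops any sample whose label disagrees with the majority of its nearest neighbours. All function names (`nearest`, `smote`, `enn`, `smote_enn`, `make_toy`), the choice of `k=3`, the majority-vote ENN rule, and the toy two-feature data are assumptions made for illustration; the record does not specify these details, and the data below is synthetic, not the PCOS dataset.

```python
import math
import random
from collections import Counter


def nearest(idx, points, k, candidates):
    """Indices of the k candidates closest to points[idx], excluding idx itself."""
    others = [j for j in candidates if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]


def smote(X, y, minority, k=3, rng=None):
    """Add synthetic minority samples until a binary problem is balanced.

    Assumes at least two minority samples exist.
    """
    rng = rng or random.Random(0)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    n_new = (len(y) - len(min_idx)) - len(min_idx)  # majority minus minority
    X_out, y_out = list(X), list(y)
    for _ in range(max(n_new, 0)):
        i = rng.choice(min_idx)                    # random minority sample
        j = rng.choice(nearest(i, X, k, min_idx))  # one of its minority neighbours
        gap = rng.random()                         # interpolation factor in [0, 1)
        X_out.append(tuple(a + gap * (b - a) for a, b in zip(X[i], X[j])))
        y_out.append(minority)
    return X_out, y_out


def enn(X, y, k=3):
    """Edited Nearest Neighbours (majority-vote variant, an assumption here):
    keep a sample only if most of its k nearest neighbours share its label."""
    everyone = range(len(X))
    keep = []
    for i in everyone:
        votes = Counter(y[j] for j in nearest(i, X, k, everyone))
        if votes.most_common(1)[0][0] == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


def smote_enn(X, y, minority, k=3, seed=0):
    """Oversample with SMOTE, then clean the result with ENN."""
    X_s, y_s = smote(X, y, minority, k, random.Random(seed))
    return enn(X_s, y_s, k)


def make_toy(seed=42):
    """Hypothetical imbalanced toy data: 40 majority points, 8 minority points."""
    rng = random.Random(seed)
    X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(40)]
    X += [(rng.gauss(2.5, 1), rng.gauss(2.5, 1)) for _ in range(8)]
    return X, [0] * 40 + [1] * 8


if __name__ == "__main__":
    X, y = make_toy()
    X_res, y_res = smote_enn(X, y, minority=1)
    print("before:", Counter(y), "after:", Counter(y_res))
```

In a real pipeline the resampling would be applied to the training portion only (for either the 70:30 or the 80:20 split), so that synthetic points never leak into the held-out test data. Library implementations such as imbalanced-learn offer additional options (for example, a stricter ENN rule that requires all neighbours to agree) that this sketch does not reproduce.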