Early prediction of medical students' performance in high-stakes examinations using machine learning approaches

INTRODUCTION: Since the advent of medical education systems, managing high-stakes exams has been a top priority and challenge for all policymakers. However, considering machine learning (ML) techniques as a replacement for medical licensing examinations, particularly during crises such as the COVID-...

Bibliographic Details
Main Authors: Mastour, Haniye, Dehghani, Toktam, Moradi, Ehsan, Eslami, Saeid
Format: Online Article Text
Language: English
Published: Elsevier 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10372649/
https://www.ncbi.nlm.nih.gov/pubmed/37519702
http://dx.doi.org/10.1016/j.heliyon.2023.e18248
author Mastour, Haniye
Dehghani, Toktam
Moradi, Ehsan
Eslami, Saeid
collection PubMed
description INTRODUCTION: Since the advent of medical education systems, managing high-stakes exams has been a top priority and challenge for all policymakers. However, considering machine learning (ML) techniques as a replacement for medical licensing examinations, particularly during crises such as the COVID-19 outbreak, could be an effective solution. This study uses ML models to develop a framework for predicting medical students' performance on high-stakes exams, such as the Comprehensive Medical Basic Sciences Examination (CMBSE). MATERIAL AND METHODS: Prediction of students' status and score on high-stakes examinations faces several challenges, including an imbalanced number of failing and passing students, a large number of heterogeneous and complex features, and the need to identify at-risk and top-performing students. In this study, two major categories of ML approaches are compared: first, classic models (logistic regression (LR), support vector machine (SVM), and k-nearest neighbors (KNN)), and second, ensemble models (voting, bagging (BG), random forests (RF), adaptive boosting (ADA), extreme gradient boosting (XGB), and stacking). RESULTS: To evaluate the models' discrimination ability, they are assessed using a real dataset containing information on medical students over a five-year period (n = 1005). The findings indicate that ensemble ML models demonstrate optimal performance in predicting CMBSE status (RF and stacking). Similarly, among the classic regressors, LR exhibited the lowest root-mean-square deviation (RMSD) (0.134) and the highest coefficient of determination (R2) (0.62), whereas the RF model had the lowest RMSD (0.077) and the highest R2 (0.80) overall. Furthermore, Anatomical Sciences, Biochemistry, Parasitology, and Entomology grade point average (GPA) and grades demonstrated the strongest positive correlation with the outcomes. CONCLUSION: Comparing classic and ensemble ML models revealed that ensemble models are superior to classic models. Therefore, the presented framework could be considered a suitable alternative for the CMBSE and other comparable medical licensing examinations.
format Online
Article
Text
id pubmed-10372649
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-10372649 2023-07-28 Early prediction of medical students' performance in high-stakes examinations using machine learning approaches Mastour, Haniye Dehghani, Toktam Moradi, Ehsan Eslami, Saeid Heliyon Research Article INTRODUCTION: Since the advent of medical education systems, managing high-stakes exams has been a top priority and challenge for all policymakers. However, considering machine learning (ML) techniques as a replacement for medical licensing examinations, particularly during crises such as the COVID-19 outbreak, could be an effective solution. This study uses ML models to develop a framework for predicting medical students' performance on high-stakes exams, such as the Comprehensive Medical Basic Sciences Examination (CMBSE). MATERIAL AND METHODS: Prediction of students' status and score on high-stakes examinations faces several challenges, including an imbalanced number of failing and passing students, a large number of heterogeneous and complex features, and the need to identify at-risk and top-performing students. In this study, two major categories of ML approaches are compared: first, classic models (logistic regression (LR), support vector machine (SVM), and k-nearest neighbors (KNN)), and second, ensemble models (voting, bagging (BG), random forests (RF), adaptive boosting (ADA), extreme gradient boosting (XGB), and stacking). RESULTS: To evaluate the models' discrimination ability, they are assessed using a real dataset containing information on medical students over a five-year period (n = 1005). The findings indicate that ensemble ML models demonstrate optimal performance in predicting CMBSE status (RF and stacking). Similarly, among the classic regressors, LR exhibited the lowest root-mean-square deviation (RMSD) (0.134) and the highest coefficient of determination (R2) (0.62), whereas the RF model had the lowest RMSD (0.077) and the highest R2 (0.80) overall. Furthermore, Anatomical Sciences, Biochemistry, Parasitology, and Entomology grade point average (GPA) and grades demonstrated the strongest positive correlation with the outcomes. CONCLUSION: Comparing classic and ensemble ML models revealed that ensemble models are superior to classic models. Therefore, the presented framework could be considered a suitable alternative for the CMBSE and other comparable medical licensing examinations. Elsevier 2023-07-13 /pmc/articles/PMC10372649/ /pubmed/37519702 http://dx.doi.org/10.1016/j.heliyon.2023.e18248 Text en © 2023 The Authors https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title Early prediction of medical students' performance in high-stakes examinations using machine learning approaches
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10372649/
https://www.ncbi.nlm.nih.gov/pubmed/37519702
http://dx.doi.org/10.1016/j.heliyon.2023.e18248
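note The abstract in this record compares regressors by root-mean-square deviation (RMSD) and coefficient of determination (R2). As a reading aid, the following is a minimal sketch of how these two metrics are commonly computed and how they rank two sets of predictions. The function names and the score values are hypothetical illustrations, not the authors' code or data; for RMSD lower is better, and for R2 higher is better.

```python
import math


def rmsd(y_true, y_pred):
    """Root-mean-square deviation between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical normalized exam scores (assumed values, not from the study).
y_true = [0.2, 0.4, 0.6, 0.8]
close_pred = [0.25, 0.35, 0.65, 0.75]  # predictions near the observed scores
rough_pred = [0.4, 0.2, 0.8, 0.6]      # predictions farther from them

for name, pred in [("close", close_pred), ("rough", rough_pred)]:
    print(f"{name}: RMSD={rmsd(y_true, pred):.3f}, R2={r2(y_true, pred):.3f}")
```

In this sketch the closer predictions yield the smaller RMSD and the larger R2, which is the direction in which a better-fitting model moves on both metrics.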