A case study of using AI for General Certificate of Secondary Education (GCSE) grade prediction in a selective independent school in England

Bibliographic Details
Main author: Denes, Gyorgy
Format: Online Article Text
Language: English
Published: The Author(s). Published by Elsevier Ltd. 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9883073/
http://dx.doi.org/10.1016/j.caeai.2023.100129
Description
Summary: The COVID-19 pandemic has created significant challenges for UK schools, but a time of cancelled exams and uncertainty around future examinations can provide opportunities to explore novel assessment methods. Hence, the 2020 proposal of the Ofqual algorithm, which combines teachers' estimated grades and schools' historical performance, seemed timely. However, the algorithmically calculated grades resulted in a public backlash and withdrawal of the proposal. While the failed Ofqual algorithm could be considered an example of AI, we do not yet have a thorough understanding of its numerical accuracy and how it performs in comparison to other AI models. This paper investigates this novel application: the potential use of a range of AI models as assessment tools in a selective, independent, secondary school in England. The following questions were examined: (1) how accurate are modern AI models in predicting GCSE exam grades? (2) what are the differences in model accuracy across subjects and can these be explained by qualitative differences in teachers' grading practices? Results indicate that while models yield acceptable mean absolute errors, individual mispredictions can be larger than desired. Subject differences highlighted that grading subjectivity is less significant in science, technology, engineering, and maths (STEM) subjects, which could explain why objective models fail to predict non-STEM grades more frequently. In summary, numerical results indicate that grade prediction could be an interesting novel application of AI, but more research is needed to reduce outliers.
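
Illustrative note: the summary contrasts an acceptable mean absolute error with individual mispredictions that are larger than desired. The minimal sketch below shows how both can hold at once. The grade lists are invented for illustration on the 9 to 1 GCSE scale and are not data from the article, and the function names are hypothetical helpers, not the author's code.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all students."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def max_absolute_error(actual, predicted):
    """Largest single misprediction, in grade points."""
    return max(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical grades (NOT from the article): seven exact predictions
# and one student over-predicted by three grades.
actual = [7, 8, 6, 9, 5, 7, 8, 4]
predicted = [7, 8, 6, 9, 5, 7, 8, 7]

mae = mean_absolute_error(actual, predicted)
worst = max_absolute_error(actual, predicted)

print(f"Mean absolute error: {mae:.3f} grades")
print(f"Largest misprediction: {worst} grades")
```

In this made-up case the average error is well under half a grade, yet one student is off by three grades, which is the kind of outlier the summary says needs further research.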