
Contrasting Classical and Machine Learning Approaches in the Estimation of Value-Added Scores in Large-Scale Educational Data

There is no consensus on which statistical model estimates school value-added (VA) most accurately. To date, the two most common statistical models used for the calculation of VA scores are two classical methods: linear regression and multilevel models. These models have the advantage of being relat...


Bibliographic Details
Main authors:	Levy, Jessica; Mussack, Dominic; Brunner, Martin; Keller, Ulrich; Cardoso-Leite, Pedro; Fischbach, Antoine
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2020
Subjects:	Psychology
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7472739/ https://www.ncbi.nlm.nih.gov/pubmed/32973639 http://dx.doi.org/10.3389/fpsyg.2020.02190

_version_	1783579043540500480
author	Levy, Jessica; Mussack, Dominic; Brunner, Martin; Keller, Ulrich; Cardoso-Leite, Pedro; Fischbach, Antoine
author_facet	Levy, Jessica; Mussack, Dominic; Brunner, Martin; Keller, Ulrich; Cardoso-Leite, Pedro; Fischbach, Antoine
author_sort	Levy, Jessica
collection	PubMed
description	There is no consensus on which statistical model estimates school value-added (VA) most accurately. To date, the two most common statistical models used for the calculation of VA scores are two classical methods: linear regression and multilevel models. These models have the advantage of being relatively transparent and thus understandable for most researchers and practitioners. However, these statistical models are bound to certain assumptions (e.g., linearity) that might limit their prediction accuracy. Machine learning methods, which have yielded spectacular results in numerous fields, may be a valuable alternative to these classical models. Although big data is not new in general, it is relatively new in the realm of social sciences and education. New types of data require new data analytical approaches. Such techniques have already evolved in fields with a long tradition in crunching big data (e.g., gene technology). The objective of the present paper is to competently apply these “imported” techniques to education data, more precisely VA scores, and assess when and how they can extend or replace the classical psychometrics toolbox. The different models include linear and non-linear methods and extend classical models with the most commonly used machine learning methods (i.e., random forest, neural networks, support vector machines, and boosting). We used representative data of 3,026 students in 153 schools who took part in the standardized achievement tests of the Luxembourg School Monitoring Program in grades 1 and 3. Multilevel models outperformed classical linear and polynomial regressions, as well as different machine learning models. However, it could be observed that across all schools, school VA scores from different model types correlated highly. Yet, the percentage of disagreements as compared to multilevel models was not trivial and real-life implications for individual schools may still be dramatic depending on the model type used. 
Implications of these results and possible ethical concerns regarding the use of machine learning methods for decision-making in education are discussed.
format	Online Article Text
id	pubmed-7472739
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-74727392020-09-23 Contrasting Classical and Machine Learning Approaches in the Estimation of Value-Added Scores in Large-Scale Educational Data Levy, Jessica Mussack, Dominic Brunner, Martin Keller, Ulrich Cardoso-Leite, Pedro Fischbach, Antoine Front Psychol Psychology There is no consensus on which statistical model estimates school value-added (VA) most accurately. To date, the two most common statistical models used for the calculation of VA scores are two classical methods: linear regression and multilevel models. These models have the advantage of being relatively transparent and thus understandable for most researchers and practitioners. However, these statistical models are bound to certain assumptions (e.g., linearity) that might limit their prediction accuracy. Machine learning methods, which have yielded spectacular results in numerous fields, may be a valuable alternative to these classical models. Although big data is not new in general, it is relatively new in the realm of social sciences and education. New types of data require new data analytical approaches. Such techniques have already evolved in fields with a long tradition in crunching big data (e.g., gene technology). The objective of the present paper is to competently apply these “imported” techniques to education data, more precisely VA scores, and assess when and how they can extend or replace the classical psychometrics toolbox. The different models include linear and non-linear methods and extend classical models with the most commonly used machine learning methods (i.e., random forest, neural networks, support vector machines, and boosting). We used representative data of 3,026 students in 153 schools who took part in the standardized achievement tests of the Luxembourg School Monitoring Program in grades 1 and 3. Multilevel models outperformed classical linear and polynomial regressions, as well as different machine learning models. 
However, it could be observed that across all schools, school VA scores from different model types correlated highly. Yet, the percentage of disagreements as compared to multilevel models was not trivial and real-life implications for individual schools may still be dramatic depending on the model type used. Implications of these results and possible ethical concerns regarding the use of machine learning methods for decision-making in education are discussed. Frontiers Media S.A. 2020-08-21 /pmc/articles/PMC7472739/ /pubmed/32973639 http://dx.doi.org/10.3389/fpsyg.2020.02190 Text en Copyright © 2020 Levy, Mussack, Brunner, Keller, Cardoso-Leite and Fischbach. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle	Psychology Levy, Jessica Mussack, Dominic Brunner, Martin Keller, Ulrich Cardoso-Leite, Pedro Fischbach, Antoine Contrasting Classical and Machine Learning Approaches in the Estimation of Value-Added Scores in Large-Scale Educational Data
title	Contrasting Classical and Machine Learning Approaches in the Estimation of Value-Added Scores in Large-Scale Educational Data
topic	Psychology
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7472739/ https://www.ncbi.nlm.nih.gov/pubmed/32973639 http://dx.doi.org/10.3389/fpsyg.2020.02190


Similar items