Cargando…
On the Treatment of Missing Item Responses in Educational Large-Scale Assessment Data: An Illustrative Simulation Study and a Case Study Using PISA 2018 Mathematics Data
Missing item responses are prevalent in educational large-scale assessment studies such as the programme for international student assessment (PISA). The current operational practice scores missing item responses as wrong, but several psychometricians have advocated for a model-based treatment based...
Autor principal: | |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
MDPI
2021
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8700118/ https://www.ncbi.nlm.nih.gov/pubmed/34940395 http://dx.doi.org/10.3390/ejihpe11040117 |
_version_ | 1784620679583236096 |
---|---|
author | Robitzsch, Alexander |
author_facet | Robitzsch, Alexander |
author_sort | Robitzsch, Alexander |
collection | PubMed |
description | Missing item responses are prevalent in educational large-scale assessment studies such as the programme for international student assessment (PISA). The current operational practice scores missing item responses as wrong, but several psychometricians have advocated for a model-based treatment based on latent ignorability assumption. In this approach, item responses and response indicators are jointly modeled conditional on a latent ability and a latent response propensity variable. Alternatively, imputation-based approaches can be used. The latent ignorability assumption is weakened in the Mislevy-Wu model that characterizes a nonignorable missingness mechanism and allows the missingness of an item to depend on the item itself. The scoring of missing item responses as wrong and the latent ignorable model are submodels of the Mislevy-Wu model. In an illustrative simulation study, it is shown that the Mislevy-Wu model provides unbiased model parameters. Moreover, the simulation replicates the finding from various simulation studies from the literature that scoring missing item responses as wrong provides biased estimates if the latent ignorability assumption holds in the data-generating model. However, if missing item responses are generated such that they can only be generated from incorrect item responses, applying an item response model that relies on latent ignorability results in biased estimates. The Mislevy-Wu model guarantees unbiased parameter estimates if the more general Mislevy-Wu model holds in the data-generating model. In addition, this article uses the PISA 2018 mathematics dataset as a case study to investigate the consequences of different missing data treatments on country means and country standard deviations. Obtained country means and country standard deviations can substantially differ for the different scaling models. In contrast to previous statements in the literature, the scoring of missing item responses as incorrect provided a better model fit than a latent ignorable model for most countries. Furthermore, the dependence of the missingness of an item from the item itself after conditioning on the latent response propensity was much more pronounced for constructed-response items than for multiple-choice items. As a consequence, scaling models that presuppose latent ignorability should be refused from two perspectives. First, the Mislevy-Wu model is preferred over the latent ignorable model for reasons of model fit. Second, in the discussion section, we argue that model fit should only play a minor role in choosing psychometric models in large-scale assessment studies because validity aspects are most relevant. Missing data treatments that countries can simply manipulate (and, hence, their students) result in unfair country comparisons. |
format | Online Article Text |
id | pubmed-8700118 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-87001182021-12-24 On the Treatment of Missing Item Responses in Educational Large-Scale Assessment Data: An Illustrative Simulation Study and a Case Study Using PISA 2018 Mathematics Data Robitzsch, Alexander Eur J Investig Health Psychol Educ Article Missing item responses are prevalent in educational large-scale assessment studies such as the programme for international student assessment (PISA). The current operational practice scores missing item responses as wrong, but several psychometricians have advocated for a model-based treatment based on latent ignorability assumption. In this approach, item responses and response indicators are jointly modeled conditional on a latent ability and a latent response propensity variable. Alternatively, imputation-based approaches can be used. The latent ignorability assumption is weakened in the Mislevy-Wu model that characterizes a nonignorable missingness mechanism and allows the missingness of an item to depend on the item itself. The scoring of missing item responses as wrong and the latent ignorable model are submodels of the Mislevy-Wu model. In an illustrative simulation study, it is shown that the Mislevy-Wu model provides unbiased model parameters. Moreover, the simulation replicates the finding from various simulation studies from the literature that scoring missing item responses as wrong provides biased estimates if the latent ignorability assumption holds in the data-generating model. However, if missing item responses are generated such that they can only be generated from incorrect item responses, applying an item response model that relies on latent ignorability results in biased estimates. The Mislevy-Wu model guarantees unbiased parameter estimates if the more general Mislevy-Wu model holds in the data-generating model. In addition, this article uses the PISA 2018 mathematics dataset as a case study to investigate the consequences of different missing data treatments on country means and country standard deviations. Obtained country means and country standard deviations can substantially differ for the different scaling models. In contrast to previous statements in the literature, the scoring of missing item responses as incorrect provided a better model fit than a latent ignorable model for most countries. Furthermore, the dependence of the missingness of an item from the item itself after conditioning on the latent response propensity was much more pronounced for constructed-response items than for multiple-choice items. As a consequence, scaling models that presuppose latent ignorability should be refused from two perspectives. First, the Mislevy-Wu model is preferred over the latent ignorable model for reasons of model fit. Second, in the discussion section, we argue that model fit should only play a minor role in choosing psychometric models in large-scale assessment studies because validity aspects are most relevant. Missing data treatments that countries can simply manipulate (and, hence, their students) result in unfair country comparisons. MDPI 2021-12-14 /pmc/articles/PMC8700118/ /pubmed/34940395 http://dx.doi.org/10.3390/ejihpe11040117 Text en © 2021 by the author. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Robitzsch, Alexander On the Treatment of Missing Item Responses in Educational Large-Scale Assessment Data: An Illustrative Simulation Study and a Case Study Using PISA 2018 Mathematics Data |
title | On the Treatment of Missing Item Responses in Educational Large-Scale Assessment Data: An Illustrative Simulation Study and a Case Study Using PISA 2018 Mathematics Data |
title_full | On the Treatment of Missing Item Responses in Educational Large-Scale Assessment Data: An Illustrative Simulation Study and a Case Study Using PISA 2018 Mathematics Data |
title_fullStr | On the Treatment of Missing Item Responses in Educational Large-Scale Assessment Data: An Illustrative Simulation Study and a Case Study Using PISA 2018 Mathematics Data |
title_full_unstemmed | On the Treatment of Missing Item Responses in Educational Large-Scale Assessment Data: An Illustrative Simulation Study and a Case Study Using PISA 2018 Mathematics Data |
title_short | On the Treatment of Missing Item Responses in Educational Large-Scale Assessment Data: An Illustrative Simulation Study and a Case Study Using PISA 2018 Mathematics Data |
title_sort | on the treatment of missing item responses in educational large-scale assessment data: an illustrative simulation study and a case study using pisa 2018 mathematics data |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8700118/ https://www.ncbi.nlm.nih.gov/pubmed/34940395 http://dx.doi.org/10.3390/ejihpe11040117 |
work_keys_str_mv | AT robitzschalexander onthetreatmentofmissingitemresponsesineducationallargescaleassessmentdataanillustrativesimulationstudyandacasestudyusingpisa2018mathematicsdata |