A Comparison of Classical and Modern Measures of Internal Consistency

Three measures of internal consistency – Kuder-Richardson Formula 20 (KR20), Cronbach’s alpha (α), and person separation reliability (R) – are considered. KR20 and α are common measures in classical test theory, whereas R is developed in modern test theory and, more precisely, in Rasch measurement....

Bibliographic Details
Main Authors: Anselmi, Pasquale; Colledani, Daiana; Robusto, Egidio
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2019
Subjects: Psychology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6904350/
https://www.ncbi.nlm.nih.gov/pubmed/31866905
http://dx.doi.org/10.3389/fpsyg.2019.02714
collection PubMed
description Three measures of internal consistency – Kuder-Richardson Formula 20 (KR20), Cronbach’s alpha (α), and person separation reliability (R) – are considered. KR20 and α are common measures in classical test theory, whereas R is developed in modern test theory and, more precisely, in Rasch measurement. These three measures specify the observed variance as the sum of true variance and error variance. However, they differ in the way in which these quantities are obtained. KR20 uses the error variance of an “average” respondent from the sample, which overestimates the error variance of respondents with high or low scores. Conversely, R uses the actual average error variance of the sample. KR20 and α use respondents’ test scores in calculating the observed variance. This is potentially misleading because test scores are not linear representations of the underlying variable, whereas calculation of variance requires linearity. By contrast, if the data fit the Rasch model, the measures estimated for each respondent are on a linear scale and are thus numerically suitable for calculating the observed variance. Given these differences, R is expected to be a better index of internal consistency than KR20 and α. The present work compares the three measures on simulated data sets with dichotomous and polytomous items. It is shown that all the estimates of internal consistency decrease as the skewness of the score distribution increases, with R decreasing to a larger extent. Thus, R is more conservative than KR20 and α, and prevents test users from believing a test has better measurement characteristics than it actually has. In addition, it is shown that Rasch-based infit and outfit person statistics can be used for handling data sets with random responses. Two options are described. The first involves computing a more conservative estimate of internal consistency. The second involves detecting individuals with random responses. When there are a few individuals with a consistent number of random responses, infit and outfit allow almost all of them to be correctly detected. Once these individuals are removed, a “cleaned” data set is obtained that can be used for computing a less biased estimate of internal consistency.
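The three indices named in the description can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy data, and the use of population variances are assumptions made here for illustration, and the Rasch person measures and standard errors are taken as given inputs (in practice they would come from Rasch estimation software). The sketch uses the textbook forms: α and KR20 as k/(k-1) times one minus the ratio of summed item variances to total-score variance, and R as the share of the observed variance of person measures that is not error variance.

```python
from statistics import mean, pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(scores[0])
    # Variance of each item across respondents (population form, an assumption).
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    # Variance of the raw test scores, i.e. the observed variance.
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def kr20(scores):
    """KR20 for dichotomous (0/1) items; item variance is p * (1 - p)."""
    k = len(scores[0])
    n = len(scores)
    p = [sum(row[j] for row in scores) / n for j in range(k)]
    pq_sum = sum(pj * (1 - pj) for pj in p)
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - pq_sum / total_var)


def person_separation_reliability(measures, std_errors):
    """R = (observed variance - mean squared standard error) / observed variance.

    `measures` are Rasch person measures (logits) and `std_errors` their
    standard errors; both are hypothetical inputs supplied by the caller.
    """
    observed_var = pvariance(measures)
    error_var = mean([se ** 2 for se in std_errors])
    return (observed_var - error_var) / observed_var


# Toy dichotomous data: 5 respondents, 4 items (illustrative only).
toy_scores = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]

alpha_value = cronbach_alpha(toy_scores)
kr20_value = kr20(toy_scores)
r_value = person_separation_reliability([-1.0, 0.0, 1.0], [0.5, 0.5, 0.5])
```

On dichotomous data KR20 and α coincide, because the variance of a 0/1 item equals p(1 - p). The difference highlighted in the description lies in R, which averages each respondent's own squared standard error instead of relying on a single error term derived from raw scores.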
id pubmed-6904350
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-6904350 2019-12-20 A Comparison of Classical and Modern Measures of Internal Consistency. Anselmi, Pasquale; Colledani, Daiana; Robusto, Egidio. Front Psychol. Psychology. Frontiers Media S.A. 2019-12-04 /pmc/articles/PMC6904350/ /pubmed/31866905 http://dx.doi.org/10.3389/fpsyg.2019.02714 Text en
Copyright © 2019 Anselmi, Colledani and Robusto. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
topic Psychology