Accuracy of performance-test linking based on a many-facet Rasch model
Performance assessments, in which human raters assess examinee performance in practical tasks, have attracted much attention in various assessment contexts involving measurement of higher-order abilities. However, difficulty persists in that ability measurement accuracy strongly depends on rater and...
Main Author: | Uto, Masaki |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer US 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8367909/ https://www.ncbi.nlm.nih.gov/pubmed/33169286 http://dx.doi.org/10.3758/s13428-020-01498-x |
_version_ | 1783739112044363776 |
---|---|
author | Uto, Masaki |
author_facet | Uto, Masaki |
author_sort | Uto, Masaki |
collection | PubMed |
description | Performance assessments, in which human raters assess examinee performance in practical tasks, have attracted much attention in various assessment contexts involving measurement of higher-order abilities. However, difficulty persists in that ability measurement accuracy strongly depends on rater and task characteristics such as rater severity and task difficulty. To resolve this problem, various item response theory (IRT) models incorporating rater and task parameters, including many-facet Rasch models (MFRMs), have been proposed. When applying such IRT models to datasets comprising results of multiple performance tests administered to different examinees, test linking is needed to unify the scale for model parameters estimated from individual test results. In test linking, test administrators generally need to design multiple tests such that raters and tasks partially overlap. The accuracy of linking under this design is highly reliant on the numbers of common raters and tasks. However, the numbers of common raters and tasks required to ensure high accuracy in test linking remain unclear, making it difficult to determine appropriate test designs. We therefore empirically evaluate the accuracy of IRT-based performance-test linking under common rater and task designs. Concretely, we conduct evaluations through simulation experiments that examine linking accuracy based on a MFRM while changing numbers of common raters and tasks with various factors that possibly affect linking accuracy. |
format | Online Article Text |
id | pubmed-8367909 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Springer US |
record_format | MEDLINE/PubMed |
spelling | pubmed-83679092021-08-31 Accuracy of performance-test linking based on a many-facet Rasch model Uto, Masaki Behav Res Methods Article Performance assessments, in which human raters assess examinee performance in practical tasks, have attracted much attention in various assessment contexts involving measurement of higher-order abilities. However, difficulty persists in that ability measurement accuracy strongly depends on rater and task characteristics such as rater severity and task difficulty. To resolve this problem, various item response theory (IRT) models incorporating rater and task parameters, including many-facet Rasch models (MFRMs), have been proposed. When applying such IRT models to datasets comprising results of multiple performance tests administered to different examinees, test linking is needed to unify the scale for model parameters estimated from individual test results. In test linking, test administrators generally need to design multiple tests such that raters and tasks partially overlap. The accuracy of linking under this design is highly reliant on the numbers of common raters and tasks. However, the numbers of common raters and tasks required to ensure high accuracy in test linking remain unclear, making it difficult to determine appropriate test designs. We therefore empirically evaluate the accuracy of IRT-based performance-test linking under common rater and task designs. Concretely, we conduct evaluations through simulation experiments that examine linking accuracy based on a MFRM while changing numbers of common raters and tasks with various factors that possibly affect linking accuracy. 
Springer US 2020-11-09 2021 /pmc/articles/PMC8367909/ /pubmed/33169286 http://dx.doi.org/10.3758/s13428-020-01498-x Text en © The Author(s) 2020 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Uto, Masaki Accuracy of performance-test linking based on a many-facet Rasch model |
title | Accuracy of performance-test linking based on a many-facet Rasch model |
title_full | Accuracy of performance-test linking based on a many-facet Rasch model |
title_fullStr | Accuracy of performance-test linking based on a many-facet Rasch model |
title_full_unstemmed | Accuracy of performance-test linking based on a many-facet Rasch model |
title_short | Accuracy of performance-test linking based on a many-facet Rasch model |
title_sort | accuracy of performance-test linking based on a many-facet rasch model |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8367909/ https://www.ncbi.nlm.nih.gov/pubmed/33169286 http://dx.doi.org/10.3758/s13428-020-01498-x |
work_keys_str_mv | AT utomasaki accuracyofperformancetestlinkingbasedonamanyfacetraschmodel |