Cargando…

Re-conceptualising and accounting for examiner (cut-score) stringency in a ‘high frequency, small cohort’ performance test

Variation in examiner stringency is an ongoing problem in many performance settings such as in OSCEs, and usually is conceptualised and measured based on scores/grades examiners award. Under borderline regression, the standard within a station is set using checklist/domain scores and global grades a...

Descripción completa

Detalles Bibliográficos
Autor principal: Homer, Matt
Formato: Online Artículo Texto
Lenguaje:English
Publicado: Springer Netherlands 2020
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8041694/
https://www.ncbi.nlm.nih.gov/pubmed/32876815
http://dx.doi.org/10.1007/s10459-020-09990-x
_version_ 1783677986725167104
author Homer, Matt
author_facet Homer, Matt
author_sort Homer, Matt
collection PubMed
description Variation in examiner stringency is an ongoing problem in many performance settings such as in OSCEs, and usually is conceptualised and measured based on scores/grades examiners award. Under borderline regression, the standard within a station is set using checklist/domain scores and global grades acting in combination. This complexity requires a more nuanced view of what stringency might mean when considering sources of variation of cut-scores in stations. This study uses data from 349 administrations of an 18-station, 36 candidate single circuit OSCE for international medical graduates wanting to practice in the UK (PLAB2). The station-level data was gathered over a 34-month period up to July 2019. Linear mixed models are used to estimate and then separate out examiner (n = 547), station (n = 330) and examination (n = 349) effects on borderline regression cut-scores. Examiners are the largest source of variation in cut-scores accounting for 56% of variance in cut-scores, compared to 6% for stations, < 1% for exam and 37% residual. Aggregating to the exam level tends to ameliorate this effect. For 96% of examinations, a ‘fair’ cut-score, equalising out variation in examiner stringency that candidates experience, is within one standard error of measurement (SEM) of the actual cut-score. The addition of the SEM to produce the final pass mark generally ensures the public is protected from almost all false positives in the examination caused by examiner cut-score stringency acting in candidates’ favour.
format Online
Article
Text
id pubmed-8041694
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Springer Netherlands
record_format MEDLINE/PubMed
spelling pubmed-80416942021-04-27 Re-conceptualising and accounting for examiner (cut-score) stringency in a ‘high frequency, small cohort’ performance test Homer, Matt Adv Health Sci Educ Theory Pract Article Variation in examiner stringency is an ongoing problem in many performance settings such as in OSCEs, and usually is conceptualised and measured based on scores/grades examiners award. Under borderline regression, the standard within a station is set using checklist/domain scores and global grades acting in combination. This complexity requires a more nuanced view of what stringency might mean when considering sources of variation of cut-scores in stations. This study uses data from 349 administrations of an 18-station, 36 candidate single circuit OSCE for international medical graduates wanting to practice in the UK (PLAB2). The station-level data was gathered over a 34-month period up to July 2019. Linear mixed models are used to estimate and then separate out examiner (n = 547), station (n = 330) and examination (n = 349) effects on borderline regression cut-scores. Examiners are the largest source of variation in cut-scores accounting for 56% of variance in cut-scores, compared to 6% for stations, < 1% for exam and 37% residual. Aggregating to the exam level tends to ameliorate this effect. For 96% of examinations, a ‘fair’ cut-score, equalising out variation in examiner stringency that candidates experience, is within one standard error of measurement (SEM) of the actual cut-score. The addition of the SEM to produce the final pass mark generally ensures the public is protected from almost all false positives in the examination caused by examiner cut-score stringency acting in candidates’ favour. Springer Netherlands 2020-09-02 2021 /pmc/articles/PMC8041694/ /pubmed/32876815 http://dx.doi.org/10.1007/s10459-020-09990-x Text en © The Author(s) 2020 https://creativecommons.org/licenses/by/4.0/Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) .
spellingShingle Article
Homer, Matt
Re-conceptualising and accounting for examiner (cut-score) stringency in a ‘high frequency, small cohort’ performance test
title Re-conceptualising and accounting for examiner (cut-score) stringency in a ‘high frequency, small cohort’ performance test
title_full Re-conceptualising and accounting for examiner (cut-score) stringency in a ‘high frequency, small cohort’ performance test
title_fullStr Re-conceptualising and accounting for examiner (cut-score) stringency in a ‘high frequency, small cohort’ performance test
title_full_unstemmed Re-conceptualising and accounting for examiner (cut-score) stringency in a ‘high frequency, small cohort’ performance test
title_short Re-conceptualising and accounting for examiner (cut-score) stringency in a ‘high frequency, small cohort’ performance test
title_sort re-conceptualising and accounting for examiner (cut-score) stringency in a ‘high frequency, small cohort’ performance test
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8041694/
https://www.ncbi.nlm.nih.gov/pubmed/32876815
http://dx.doi.org/10.1007/s10459-020-09990-x
work_keys_str_mv AT homermatt reconceptualisingandaccountingforexaminercutscorestringencyinahighfrequencysmallcohortperformancetest