Is human classification by experienced untrained observers a gold standard in fixation detection?

Manual classification is still a common method to evaluate event detection algorithms. The procedure is often as follows: Two or three human coders and the algorithm classify a significant quantity of data. In the gold standard approach, deviations from the human classifications are considered to be...


Bibliographic Details
Main authors: Hooge, Ignace T. C., Niehorster, Diederick C., Nyström, Marcus, Andersson, Richard, Hessels, Roy S.
Format: Online Article Text
Language: English
Published: Springer US 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7875941/
https://www.ncbi.nlm.nih.gov/pubmed/29052166
http://dx.doi.org/10.3758/s13428-017-0955-x
author Hooge, Ignace T. C.
Niehorster, Diederick C.
Nyström, Marcus
Andersson, Richard
Hessels, Roy S.
collection PubMed
description Manual classification is still a common method to evaluate event detection algorithms. The procedure is often as follows: Two or three human coders and the algorithm classify a significant quantity of data. In the gold standard approach, deviations from the human classifications are considered to be due to mistakes of the algorithm. However, little is known about human classification in eye tracking. To what extent do the classifications from a larger group of human coders agree? Twelve experienced but untrained human coders classified fixations in 6 min of adult and infant eye-tracking data. When using the sample-based Cohen’s kappa, the classifications of the humans agreed near perfectly. However, we found substantial differences between the classifications when we examined fixation duration and number of fixations. We hypothesized that the human coders applied different (implicit) thresholds and selection rules. Indeed, when spatially close fixations were merged, most of the classification differences disappeared. On the basis of the nature of these intercoder differences, we concluded that fixation classification by experienced untrained human coders is not a gold standard. To bridge the gap between agreement measures (e.g., Cohen’s kappa) and eye movement parameters (fixation duration, number of fixations), we suggest the use of the event-based F1 score and two new measures: the relative timing offset (RTO) and the relative timing deviation (RTD).
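
The contrast the abstract draws between sample-based agreement and event-level parameters can be illustrated with a small sketch. This is not the authors' code: the two label sequences are hypothetical, and the helper names `cohens_kappa` and `count_fixations` are introduced here only for illustration. Cohen's kappa is computed in its standard form, (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance.

```python
from itertools import groupby


def cohens_kappa(a, b):
    """Sample-based Cohen's kappa for two equal-length label sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(a)
    labels = set(a) | set(b)
    # Observed agreement: fraction of samples given the same label.
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of each coder's marginal label frequencies.
    p_expected = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    if p_expected == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


def count_fixations(labels):
    """Number of uninterrupted runs of fixation samples (label 1)."""
    return sum(1 for key, _ in groupby(labels) if key == 1)


# Hypothetical data: 1 = fixation sample, 0 = non-fixation sample.
# Coder B splits coder A's first fixation with a 2-sample gap.
coder_a = [0] * 5 + [1] * 40 + [0] * 5 + [1] * 45 + [0] * 5
coder_b = [0] * 5 + [1] * 18 + [0] * 2 + [1] * 20 + [0] * 5 + [1] * 45 + [0] * 5

kappa = cohens_kappa(coder_a, coder_b)
print(f"kappa = {kappa:.3f}")
print("fixations:", count_fixations(coder_a), "vs", count_fixations(coder_b))
```

In this toy case the two coders disagree on only 2 of 100 samples, so kappa stays high, yet one coder reports two fixations and the other three. Filling the short gap, as in the merging of nearby fixations mentioned in the abstract, would make the event counts agree.
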
format Online
Article
Text
id pubmed-7875941
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Springer US
record_format MEDLINE/PubMed
spelling pubmed-7875941 2021-02-22 Behav Res Methods Article Springer US 2017-10-19 2018 /pmc/articles/PMC7875941/ /pubmed/29052166 http://dx.doi.org/10.3758/s13428-017-0955-x Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
title Is human classification by experienced untrained observers a gold standard in fixation detection?
topic Article