Is human classification by experienced untrained observers a gold standard in fixation detection?
Manual classification is still a common method to evaluate event detection algorithms. The procedure is often as follows: Two or three human coders and the algorithm classify a significant quantity of data. In the gold standard approach, deviations from the human classifications are considered to be...
Main authors: | Hooge, Ignace T. C.; Niehorster, Diederick C.; Nyström, Marcus; Andersson, Richard; Hessels, Roy S. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer US, 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7875941/ https://www.ncbi.nlm.nih.gov/pubmed/29052166 http://dx.doi.org/10.3758/s13428-017-0955-x |
_version_ | 1783649870990540800 |
---|---|
author | Hooge, Ignace T. C.; Niehorster, Diederick C.; Nyström, Marcus; Andersson, Richard; Hessels, Roy S. |
author_sort | Hooge, Ignace T. C. |
collection | PubMed |
description | Manual classification is still a common method to evaluate event detection algorithms. The procedure is often as follows: Two or three human coders and the algorithm classify a significant quantity of data. In the gold standard approach, deviations from the human classifications are considered to be due to mistakes of the algorithm. However, little is known about human classification in eye tracking. To what extent do the classifications from a larger group of human coders agree? Twelve experienced but untrained human coders classified fixations in 6 min of adult and infant eye-tracking data. When using the sample-based Cohen’s kappa, the classifications of the humans agreed near perfectly. However, we found substantial differences between the classifications when we examined fixation duration and number of fixations. We hypothesized that the human coders applied different (implicit) thresholds and selection rules. Indeed, when spatially close fixations were merged, most of the classification differences disappeared. On the basis of the nature of these intercoder differences, we concluded that fixation classification by experienced untrained human coders is not a gold standard. To bridge the gap between agreement measures (e.g., Cohen’s kappa) and eye movement parameters (fixation duration, number of fixations), we suggest the use of the event-based F1 score and two new measures: the relative timing offset (RTO) and the relative timing deviation (RTD). |
format | Online Article Text |
id | pubmed-7875941 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Springer US |
record_format | MEDLINE/PubMed |
spelling | pubmed-7875941 2021-02-22 Behav Res Methods Article Springer US 2017-10-19 2018 /pmc/articles/PMC7875941/ /pubmed/29052166 http://dx.doi.org/10.3758/s13428-017-0955-x Text en © The Author(s) 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. |
title | Is human classification by experienced untrained observers a gold standard in fixation detection? |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7875941/ https://www.ncbi.nlm.nih.gov/pubmed/29052166 http://dx.doi.org/10.3758/s13428-017-0955-x |
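
The abstract contrasts sample-based agreement (Cohen's kappa) with event-level measures such as the event-based F1 score. The following is a minimal Python sketch of that contrast for two coders, not the authors' implementation. The function names, the binary label coding (1 = fixation, 0 = other), and the matching rule (greedy one-to-one matching of events that overlap in time) are assumptions made for illustration; the article may define event matching differently.

```python
from itertools import groupby


def cohen_kappa(a, b):
    """Sample-based Cohen's kappa for two equal-length label sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(a)
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each coder's marginal label frequencies.
    labels = set(a) | set(b)
    p_chance = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    if p_chance == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (p_observed - p_chance) / (1.0 - p_chance)


def to_events(labels):
    """Turn per-sample labels (1 = fixation) into (start, end) pairs, end exclusive."""
    events, index = [], 0
    for value, run in groupby(labels):
        length = len(list(run))
        if value == 1:
            events.append((index, index + length))
        index += length
    return events


def event_f1(ref_events, test_events):
    """Event-based F1 with an assumed rule: greedy one-to-one temporal overlap."""
    if not ref_events and not test_events:
        return 1.0
    used, hits = set(), 0
    for ref_start, ref_end in ref_events:
        for j, (test_start, test_end) in enumerate(test_events):
            if j not in used and test_start < ref_end and ref_start < test_end:
                used.add(j)
                hits += 1
                break
    # F1 = 2*TP / (2*TP + FP + FN), which equals 2*hits / (n_ref + n_test).
    return 2 * hits / (len(ref_events) + len(test_events))


if __name__ == "__main__":
    # Hypothetical data: coder B splits coder A's first fixation in two.
    coder_a = [1, 1, 1, 1, 0, 0, 1, 1, 1, 1]
    coder_b = [1, 1, 0, 1, 0, 0, 1, 1, 1, 1]
    print("kappa:", cohen_kappa(coder_a, coder_b))
    print("fixations:", len(to_events(coder_a)), "vs", len(to_events(coder_b)))
    print("event F1:", event_f1(to_events(coder_a), to_events(coder_b)))
```

In this toy case the coders disagree on a single sample, yet they report different numbers of fixations and different fixation durations, which is the kind of discrepancy the abstract describes.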