Detecting Careless Responding in Survey Data Using Stochastic Gradient Boosting

Careless responding is a bias in survey responses that disregards the actual item content, constituting a threat to the factor structure, reliability, and validity of psychological measurements. Different approaches have been proposed to detect aberrant responses such as probing questions that direc...

Bibliographic Details
Main authors: Schroeders, Ulrich; Schmidt, Christoph; Gnambs, Timo
Format: Online Article Text
Language: English
Published: SAGE Publications 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8725053/
https://www.ncbi.nlm.nih.gov/pubmed/34992306
http://dx.doi.org/10.1177/00131644211004708
author Schroeders, Ulrich
Schmidt, Christoph
Gnambs, Timo
collection PubMed
description Careless responding is a bias in survey responses that disregards the actual item content, constituting a threat to the factor structure, reliability, and validity of psychological measurements. Different approaches have been proposed to detect aberrant responses such as probing questions that directly assess test-taking behavior (e.g., bogus items), auxiliary or paradata (e.g., response times), or data-driven statistical techniques (e.g., Mahalanobis distance). In the present study, gradient boosted trees, a state-of-the-art machine learning technique, are introduced to identify careless respondents. The performance of the approach was compared with established techniques previously described in the literature (e.g., statistical outlier methods, consistency analyses, and response pattern functions) using simulated data and empirical data from a web-based study, in which diligent versus careless response behavior was experimentally induced. In the simulation study, gradient boosting machines outperformed traditional detection mechanisms in flagging aberrant responses. However, this advantage did not transfer to the empirical study. In terms of precision, the results of both traditional and the novel detection mechanisms were unsatisfactory, although the latter incorporated response times as additional information. The comparison between the results of the simulation and the online study showed that responses in real-world settings seem to be much more erratic than can be expected from the simulation studies. We critically discuss the generalizability of currently available detection methods and provide an outlook on future research on the detection of aberrant response patterns in survey research.
format Online
Article
Text
id pubmed-8725053
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher SAGE Publications
record_format MEDLINE/PubMed
spelling pubmed-8725053 2022-01-05 Detecting Careless Responding in Survey Data Using Stochastic Gradient Boosting Schroeders, Ulrich Schmidt, Christoph Gnambs, Timo Educ Psychol Meas Article
SAGE Publications 2021-04-19 2022-02 /pmc/articles/PMC8725053/ /pubmed/34992306 http://dx.doi.org/10.1177/00131644211004708 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ This article is distributed under the terms of the Creative Commons Attribution 4.0 License (https://creativecommons.org/licenses/by/4.0/), which permits any use, reproduction, and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).
title Detecting Careless Responding in Survey Data Using Stochastic Gradient Boosting
topic Article