
Ensuring survey research data integrity in the era of internet bots

We used an internet-based survey platform to conduct a cross-sectional survey regarding the impact of COVID-19 on the LGBTQ+ population in the United States. While this method of data collection was quick and inexpensive, the data collected required extensive cleaning due to the infiltration of bot...


Bibliographic Details
Main Authors:	Griffin, Marybec; Martino, Richard J.; LoSchiavo, Caleb; Comer-Carruthers, Camilla; Krause, Kristen D.; Stults, Christopher B.; Halkitis, Perry N.
Format:	Online Article Text
Language:	English
Published:	Springer Netherlands 2021
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8490963/ https://www.ncbi.nlm.nih.gov/pubmed/34629553 http://dx.doi.org/10.1007/s11135-021-01252-1

_version_	1784578651326513152
author	Griffin, Marybec; Martino, Richard J.; LoSchiavo, Caleb; Comer-Carruthers, Camilla; Krause, Kristen D.; Stults, Christopher B.; Halkitis, Perry N.
author_sort	Griffin, Marybec
collection	PubMed
description	We used an internet-based survey platform to conduct a cross-sectional survey regarding the impact of COVID-19 on the LGBTQ+ population in the United States. While this method of data collection was quick and inexpensive, the data collected required extensive cleaning due to the infiltration of bots. Based on this experience, we provide recommendations for ensuring data integrity. Recruitment conducted between May 7 and 8, 2020 resulted in an initial sample of 1251 responses. The Qualtrics survey was disseminated via social media and professional association listservs. After noticing data discrepancies, research staff developed a rigorous data cleaning protocol. A second wave of recruitment was conducted on June 11–12, 2020 using the original recruitment methods. The five-step data cleaning protocol led to the removal of 773 (61.8%) surveys from the initial dataset, resulting in a sample of 478 participants in the first wave of data collection. The protocol led to the removal of 46 (31.9%) surveys from the second two-day wave of data collection, resulting in a sample of 98 participants in the second wave of data collection. After verifying the two-day pilot process was effective at screening for bots, the survey was reopened for a third wave of data collection resulting in a total of 709 responses, which were identified as an additional 514 (72.5%) valid participants and led to the removal of an additional 194 (27.4%) possible bots. The final analytic sample consists of 1090 participants. Although a useful and efficient research tool, especially among hard-to-reach populations, internet-based research is vulnerable to bots and mischievous responders, despite survey platforms’ built-in protections. Beyond the depletion of research funds, bot infiltration threatens data integrity and may disproportionately harm research with marginalized populations. Based on our experience, we recommend the use of strategies such as qualitative questions, duplicate demographic questions, and incentive raffles to reduce the likelihood of mischievous respondents. These protections can be undertaken to ensure data integrity and facilitate research on vulnerable populations.
format	Online Article Text
id	pubmed-8490963
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	Springer Netherlands
record_format	MEDLINE/PubMed
spelling	pubmed-8490963 2021-10-05 Qual Quant Article Springer Netherlands 2021-10-05 2022 /pmc/articles/PMC8490963/ /pubmed/34629553 http://dx.doi.org/10.1007/s11135-021-01252-1 Text en © The Author(s), under exclusive licence to Springer Nature B.V. 2021 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title	Ensuring survey research data integrity in the era of internet bots
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8490963/ https://www.ncbi.nlm.nih.gov/pubmed/34629553 http://dx.doi.org/10.1007/s11135-021-01252-1
work_keys_str_mv	AT griffinmarybec ensuringsurveyresearchdataintegrityintheeraofinternetbots AT martinorichardj ensuringsurveyresearchdataintegrityintheeraofinternetbots AT loschiavocaleb ensuringsurveyresearchdataintegrityintheeraofinternetbots AT comercarrutherscamilla ensuringsurveyresearchdataintegrityintheeraofinternetbots AT krausekristend ensuringsurveyresearchdataintegrityintheeraofinternetbots AT stultschristopherb ensuringsurveyresearchdataintegrityintheeraofinternetbots AT halkitisperryn ensuringsurveyresearchdataintegrityintheeraofinternetbots
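
The abstract recommends duplicate demographic questions and qualitative questions as bot screens, but this record does not spell out the five-step cleaning protocol. The Python sketch below is therefore only an illustration of that general idea, not the authors' method: the `Response` fields, the `is_suspect` rule, and the three-word threshold are all hypothetical. The `removal_rate` helper simply reproduces the percentages reported in the abstract from its own counts.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """One survey submission; every field name here is hypothetical."""
    response_id: str
    age_first: int    # age as answered early in the survey
    age_repeat: int   # the same question asked again later
    free_text: str    # answer to an open-ended (qualitative) question


def is_suspect(r: Response, min_words: int = 3) -> bool:
    """Flag a response whose repeated answers disagree or whose free text is very short.

    The threshold is an assumption for illustration, not a published cutoff.
    """
    inconsistent = r.age_first != r.age_repeat
    too_short = len(r.free_text.split()) < min_words
    return inconsistent or too_short


def removal_rate(removed: int, total: int) -> float:
    """Percentage of submissions removed, rounded to one decimal place."""
    return round(100 * removed / total, 1)


# Made-up example data, not taken from the study.
responses = [
    Response("r1", 34, 34, "Lost my job and moved back home"),
    Response("r2", 27, 45, "Lost my job and moved back home"),
    Response("r3", 51, 51, "good"),
]
kept = [r for r in responses if not is_suspect(r)]
print([r.response_id for r in kept])

# Counts from the abstract: wave 1 removed 773 of 1251; wave 2 removed 46 of 46 + 98.
print(removal_rate(773, 1251))
print(removal_rate(46, 46 + 98))
```

A rule this simple would likely need manual review alongside it, since a genuine participant can mistype an age or give a terse answer.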
