
The Early Detection of Fraudulent COVID-19 Products From Twitter Chatter: Data Set and Baseline Approach Using Anomaly Detection

BACKGROUND: Social media has served as a lucrative platform for spreading misinformation and for promoting fraudulent products for the treatment, testing, and prevention of COVID-19. This has resulted in the issuance of many warning letters by the US Food and Drug Administration (FDA). While social...


Bibliographic Details
Main Authors:	Sarker, Abeed; Lakamana, Sahithi; Liao, Ruqi; Abbas, Aamir; Yang, Yuan-Chi; Al-Garadi, Mohammed
Format:	Online Article Text
Language:	English
Published:	JMIR Publications 2023
Subjects:	Original Paper
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10131818/ https://www.ncbi.nlm.nih.gov/pubmed/37113382 http://dx.doi.org/10.2196/43694

_version_	1785031260409692160
author	Sarker, Abeed; Lakamana, Sahithi; Liao, Ruqi; Abbas, Aamir; Yang, Yuan-Chi; Al-Garadi, Mohammed
author_sort	Sarker, Abeed
collection	PubMed
description	BACKGROUND: Social media has served as a lucrative platform for spreading misinformation and for promoting fraudulent products for the treatment, testing, and prevention of COVID-19. This has resulted in the issuance of many warning letters by the US Food and Drug Administration (FDA). While social media continues to serve as the primary platform for the promotion of such fraudulent products, it also presents the opportunity to identify these products early by using effective social media mining methods. OBJECTIVE: Our objectives were to (1) create a data set of fraudulent COVID-19 products that can be used for future research and (2) propose a method using data from Twitter for automatically detecting heavily promoted COVID-19 products early. METHODS: We created a data set from FDA-issued warnings during the early months of the COVID-19 pandemic. We used natural language processing and time-series anomaly detection methods for automatically detecting fraudulent COVID-19 products early from Twitter. Our approach is based on the intuition that increases in the popularity of fraudulent products lead to corresponding anomalous increases in the volume of chatter regarding them. We compared the anomaly signal generation date for each product with the corresponding FDA letter issuance date. We also performed a brief manual analysis of chatter associated with 2 products to characterize their contents. RESULTS: FDA warning issue dates ranged from March 6, 2020, to June 22, 2021, and 44 key phrases representing fraudulent products were included. From 577,872,350 posts made between February 19 and December 31, 2020, which are all publicly available, our unsupervised approach detected 34 out of 44 (77.3%) signals about fraudulent products earlier than the FDA letter issuance dates, and an additional 6 (13.6%) within a week following the corresponding FDA letters. Content analysis revealed misinformation, information, political, and conspiracy theories to be prominent topics. CONCLUSIONS: Our proposed method is simple, effective, easy to deploy, and does not require high-performance computing machinery unlike deep neural network–based methods. The method can be easily extended to other types of signal detection from social media data. The data set may be used for future research and the development of more advanced methods.
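The idea described under METHODS (an anomalous rise in chatter volume for a product key phrase, compared against the FDA letter date) can be illustrated with a minimal sketch. The abstract does not say which anomaly detector the authors used, so the trailing-window threshold below is an assumed stand-in, not their method. The function names, the window size, the multiplier `k`, the daily counts, and the letter date are all hypothetical values chosen for illustration.

```python
from datetime import date, timedelta
from statistics import mean, pstdev


def first_anomaly(counts, window=7, k=3.0):
    """Return the index of the first day whose count exceeds the trailing
    mean plus k standard deviations, or None if no day qualifies.

    Assumption: a simple trailing-window threshold rule, used here only
    as an example of time-series anomaly detection.
    """
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        # Floor the spread at 1.0 so a nearly flat history does not
        # turn a one-post increase into a signal.
        sigma = max(pstdev(history), 1.0)
        if counts[i] > mu + k * sigma:
            return i
    return None


def lead_time_days(signal_date, letter_date):
    """Days by which the signal precedes the letter (negative if later)."""
    return (letter_date - signal_date).days


def share(n, total):
    """Percentage of n in total, rounded to one decimal place."""
    return round(100 * n / total, 1)


# Hypothetical daily post counts for one key phrase, starting March 1, 2020.
start = date(2020, 3, 1)
daily_counts = [4, 5, 3, 6, 4, 5, 4, 5, 40, 55, 30, 12]

idx = first_anomaly(daily_counts)
signal_date = start + timedelta(days=idx)
letter_date = date(2020, 3, 20)  # made-up letter date, not from the data set
lead = lead_time_days(signal_date, letter_date)

print(signal_date, lead)
print(share(34, 44), share(6, 44))  # proportions reported in RESULTS
```

A positive lead time means the signal was generated before the letter date, which is the comparison the abstract reports for 34 of the 44 key phrases. The last line only rechecks the reported percentages from the counts given in RESULTS.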
format	Online Article Text
id	pubmed-10131818
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	JMIR Publications
record_format	MEDLINE/PubMed
spelling	pubmed-10131818 2023-04-26 JMIR Infodemiology Original Paper JMIR Publications 2023-03-14 /pmc/articles/PMC10131818/ /pubmed/37113382 http://dx.doi.org/10.2196/43694 Text en ©Abeed Sarker, Sahithi Lakamana, Ruqi Liao, Aamir Abbas, Yuan-Chi Yang, Mohammed Al-Garadi. Originally published in JMIR Infodemiology (https://infodemiology.jmir.org), 14.03.2023. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Infodemiology, is properly cited. The complete bibliographic information, a link to the original publication on https://infodemiology.jmir.org/, as well as this copyright and license information must be included.
title	The Early Detection of Fraudulent COVID-19 Products From Twitter Chatter: Data Set and Baseline Approach Using Anomaly Detection
topic	Original Paper
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10131818/ https://www.ncbi.nlm.nih.gov/pubmed/37113382 http://dx.doi.org/10.2196/43694
