
Detecting COVID-19-Related Fake News Using Feature Extraction

Since its emergence in December 2019, there have been numerous posts and news regarding the COVID-19 pandemic in social media, traditional print, and electronic media. These sources have information from both trusted and non-trusted medical sources. Furthermore, the news from these media are spread...


Bibliographic Details
Main Authors: Khan, Suleman, Hakak, Saqib, Deepa, N., Prabadevi, B., Dev, Kapal, Trelova, Silvia
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Public Health
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8764372/
https://www.ncbi.nlm.nih.gov/pubmed/35059379
http://dx.doi.org/10.3389/fpubh.2021.788074
author Khan, Suleman
Hakak, Saqib
Deepa, N.
Prabadevi, B.
Dev, Kapal
Trelova, Silvia
collection PubMed
description Since its emergence in December 2019, there have been numerous posts and news regarding the COVID-19 pandemic in social media, traditional print, and electronic media. These sources have information from both trusted and non-trusted medical sources. Furthermore, the news from these media are spread rapidly. Spreading a piece of deceptive information may lead to anxiety, unwanted exposure to medical remedies, tricks for digital marketing, and may lead to deadly factors. Therefore, a model for detecting fake news from the news pool is essential. In this work, the dataset which is a fusion of news related to COVID-19 that has been sourced from data from several social media and news sources is used for classification. In the first step, preprocessing is performed on the dataset to remove unwanted text, then tokenization is carried out to extract the tokens from the raw text data collected from various sources. Later, feature selection is performed to avoid the computational overhead incurred in processing all the features in the dataset. The linguistic and sentiment features are extracted for further processing. Finally, several state-of-the-art machine learning algorithms are trained to classify the COVID-19-related dataset. These algorithms are then evaluated using various metrics. The results show that the random forest classifier outperforms the other classifiers with an accuracy of 88.50%.
format Online
Article
Text
id pubmed-8764372
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8764372 2022-01-19 Front Public Health Public Health Frontiers Media S.A. 2022-01-04 Text en Copyright © 2022 Khan, Hakak, Deepa, Prabadevi, Dev and Trelova. https://creativecommons.org/licenses/by/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Detecting COVID-19-Related Fake News Using Feature Extraction
topic Public Health
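
The description field above outlines a pipeline of preprocessing, tokenization, and extraction of linguistic and sentiment features before classification. The following is a minimal sketch of those first steps only, not the authors' implementation: the word lists POSITIVE and NEGATIVE, the regular expressions, and the chosen feature names are all hypothetical and picked for illustration. The resulting feature dictionaries could then be passed to a classifier such as a random forest, which the record reports as the best-performing model.

```python
import re

# Hypothetical mini-lexicons, for illustration only; the record does not
# say which sentiment resources the authors used.
POSITIVE = {"safe", "effective", "good", "recover"}
NEGATIVE = {"deadly", "fake", "hoax", "danger"}

URL_RE = re.compile(r"https?://\S+")
TOKEN_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def preprocess(text):
    """Lowercase the text and replace URLs with a space (assumed cleaning steps)."""
    return URL_RE.sub(" ", text.lower())


def tokenize(text):
    """Split cleaned text into alphanumeric tokens, keeping hyphenated terms whole."""
    return TOKEN_RE.findall(text)


def extract_features(raw):
    """Return a few simple linguistic and sentiment features for one post."""
    tokens = tokenize(preprocess(raw))
    n = len(tokens)
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return {
        "n_tokens": n,
        "avg_token_len": sum(map(len, tokens)) / n if n else 0.0,
        "n_exclaim": raw.count("!"),
        # Lexicon-based polarity in [-1, 1]; 0.0 for empty input.
        "sentiment": (pos - neg) / n if n else 0.0,
    }


if __name__ == "__main__":
    print(extract_features("COVID-19 is a HOAX! See http://example.com now"))
```

Keeping each stage as a separate function makes it easy to swap in a different tokenizer or sentiment lexicon without touching the rest of the sketch.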