Detecting COVID-19-Related Fake News Using Feature Extraction
Since COVID-19 emerged in December 2019, numerous posts and news items about the pandemic have appeared in social media, traditional print, and electronic media. These outlets carry information from both trusted and untrusted medical sources. Furthermore, news from these media spreads...
Main authors: | Khan, Suleman; Hakak, Saqib; Deepa, N.; Prabadevi, B.; Dev, Kapal; Trelova, Silvia |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2022 |
Subjects: | Public Health |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8764372/ https://www.ncbi.nlm.nih.gov/pubmed/35059379 http://dx.doi.org/10.3389/fpubh.2021.788074 |
_version_ | 1784634149415419904 |
---|---|
author | Khan, Suleman; Hakak, Saqib; Deepa, N.; Prabadevi, B.; Dev, Kapal; Trelova, Silvia |
author_sort | Khan, Suleman |
collection | PubMed |
description | Since COVID-19 emerged in December 2019, numerous posts and news items about the pandemic have appeared in social media, traditional print, and electronic media. These outlets carry information from both trusted and untrusted medical sources. Furthermore, news from these media spreads rapidly. Spreading deceptive information may cause anxiety, unwanted exposure to medical remedies, and digital marketing tricks, and may even have deadly consequences. A model for detecting fake news in the news pool is therefore essential. This work classifies a dataset that fuses COVID-19-related news drawn from several social media and news sources. First, the dataset is preprocessed to remove unwanted text, and the raw text collected from the various sources is tokenized. Next, feature selection is performed to avoid the computational overhead of processing every feature in the dataset, and linguistic and sentiment features are extracted for further processing. Finally, several state-of-the-art machine learning algorithms are trained to classify the COVID-19-related dataset and are evaluated using various metrics. The results show that the random forest classifier outperforms the other classifiers, with an accuracy of 88.50%. |
format | Online Article Text |
id | pubmed-8764372 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8764372 2022-01-19 Detecting COVID-19-Related Fake News Using Feature Extraction Khan, Suleman; Hakak, Saqib; Deepa, N.; Prabadevi, B.; Dev, Kapal; Trelova, Silvia Front Public Health Public Health Frontiers Media S.A. 2022-01-04 /pmc/articles/PMC8764372/ /pubmed/35059379 http://dx.doi.org/10.3389/fpubh.2021.788074 Text en Copyright © 2022 Khan, Hakak, Deepa, Prabadevi, Dev and Trelova. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Detecting COVID-19-Related Fake News Using Feature Extraction |
topic | Public Health |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8764372/ https://www.ncbi.nlm.nih.gov/pubmed/35059379 http://dx.doi.org/10.3389/fpubh.2021.788074 |
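
The description above outlines a pipeline of preprocessing, tokenization, linguistic and sentiment feature extraction, and metric-based evaluation. The following is a minimal sketch of those steps under stated assumptions, not the authors' implementation: the function names, the chosen features, and the tiny `POSITIVE`/`NEGATIVE` word lists are all hypothetical, and only the Python standard library is used.

```python
import re
import string

# Hypothetical mini-lexicons for illustration only; this record does not
# say which sentiment resources the paper actually used.
POSITIVE = {"safe", "effective", "recover", "cure"}
NEGATIVE = {"deadly", "hoax", "danger", "panic"}

URL_RE = re.compile(r"https?://\S+")
# Map every ASCII punctuation character to a space.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(text):
    """Lowercase the text, drop URLs and punctuation, collapse whitespace."""
    text = URL_RE.sub(" ", text.lower())
    text = text.translate(PUNCT_TABLE)
    return " ".join(text.split())


def tokenize(text):
    """Split preprocessed text into whitespace-delimited tokens."""
    return preprocess(text).split()


def extract_features(raw):
    """Return a few simple linguistic and sentiment features for one post."""
    tokens = tokenize(raw)
    n = len(tokens) or 1  # avoid division by zero on empty input
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return {
        "n_tokens": float(len(tokens)),
        "avg_token_len": sum(len(t) for t in tokens) / n,
        "n_exclaim": float(raw.count("!")),
        "sentiment": (pos - neg) / n,
    }


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(a == b for a, b in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    post = "COVID-19 is a HOAX! See http://example.com/x"
    print(tokenize(post))
    print(extract_features(post))
    print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```

Feature dictionaries like these could then be converted to vectors and passed to a classifier such as a random forest. That training step is omitted here to keep the sketch dependency-free.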