An ensemble predictive analytics of COVID-19 infodemic tweets using bag of words
Fake COVID-19 tweets appear legitimate and appealing to unsuspecting internet users because of a lack of prior knowledge of the novel pandemic. Such news could be misleading, counterproductive, unethical, and unprofessional, and sometimes constitutes a clog in the wheel of global efforts toward flatteni...
Main authors: | Olaleye, T.O., Arogundade, O.T., Abayomi-Alli, A., Adesemowo, A.K. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8137711/ http://dx.doi.org/10.1016/B978-0-12-824536-1.00004-6 |
_version_ | 1783695661458259968 |
---|---|
author | Olaleye, T.O. Arogundade, O.T. Abayomi-Alli, A. Adesemowo, A.K. |
author_facet | Olaleye, T.O. Arogundade, O.T. Abayomi-Alli, A. Adesemowo, A.K. |
author_sort | Olaleye, T.O. |
collection | PubMed |
description | Fake COVID-19 tweets appear legitimate and appealing to unsuspecting internet users because of a lack of prior knowledge of the novel pandemic. Such news could be misleading, counterproductive, unethical, and unprofessional, and sometimes constitutes a clog in the wheel of global efforts toward flattening the virus spread curve. Therefore, aside from the COVID-19 pandemic itself, dealing with fake news and myths about the virus constitutes an infodemic issue that must be tackled to ensure that only valid information is consumed by the public. Following the research approach, this chapter aims at a predictive analytics of COVID-19 infodemic tweets that generates a classification rule and validates genuine information from verified, accredited health institutions and sources. Vote ensemble classifiers formed from the base classifiers SMO, Voted Perceptron, Liblinear, Reptree, and Decision Stump were deployed on a dataset of 81,456 tokenized Bag of Words, which encapsulate 2964 COVID-19 tweet instances and 3169 extracted numeric vector attributes. The experimental result shows a novel 99.93% prediction accuracy on 10-fold cross-validation, and the information gain of each of the 3169 extracted attributes is ranked to ascertain the most significant COVID-19 tweet words for the detection system. Other performance metrics, including ROC area and Relief-F, validate the reliability of the model and return SMO as the most efficient base classifier. The thrust of the model centered more on the trustworthiness of the COVID-19 tweet source than on the truthfulness of the tweet, which underscores the prominence of verified health institutions and contributes to the discourse on the inhibition and impact of fake news, especially during societal pandemics. The COVID-19 infodemic detection algorithm provides insight into a new spin on fake news in the age of social media and the era of pandemics. |
format | Online Article Text |
id | pubmed-8137711 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
record_format | MEDLINE/PubMed |
spelling | pubmed-8137711 2021-05-21 An ensemble predictive analytics of COVID-19 infodemic tweets using bag of words Olaleye, T.O. Arogundade, O.T. Abayomi-Alli, A. Adesemowo, A.K. Data Science for COVID-19 Article 2021 2021-05-21 /pmc/articles/PMC8137711/ http://dx.doi.org/10.1016/B978-0-12-824536-1.00004-6 Text en Copyright © 2021 Elsevier Inc. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active. |
spellingShingle | Article Olaleye, T.O. Arogundade, O.T. Abayomi-Alli, A. Adesemowo, A.K. An ensemble predictive analytics of COVID-19 infodemic tweets using bag of words |
title | An ensemble predictive analytics of COVID-19 infodemic tweets using bag of words |
title_full | An ensemble predictive analytics of COVID-19 infodemic tweets using bag of words |
title_fullStr | An ensemble predictive analytics of COVID-19 infodemic tweets using bag of words |
title_full_unstemmed | An ensemble predictive analytics of COVID-19 infodemic tweets using bag of words |
title_short | An ensemble predictive analytics of COVID-19 infodemic tweets using bag of words |
title_sort | ensemble predictive analytics of covid-19 infodemic tweets using bag of words |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8137711/ http://dx.doi.org/10.1016/B978-0-12-824536-1.00004-6 |
work_keys_str_mv | AT olaleyeto anensemblepredictiveanalyticsofcovid19infodemictweetsusingbagofwords AT arogundadeot anensemblepredictiveanalyticsofcovid19infodemictweetsusingbagofwords AT abayomiallia anensemblepredictiveanalyticsofcovid19infodemictweetsusingbagofwords AT adesemowoak anensemblepredictiveanalyticsofcovid19infodemictweetsusingbagofwords AT olaleyeto ensemblepredictiveanalyticsofcovid19infodemictweetsusingbagofwords AT arogundadeot ensemblepredictiveanalyticsofcovid19infodemictweetsusingbagofwords AT abayomiallia ensemblepredictiveanalyticsofcovid19infodemictweetsusingbagofwords AT adesemowoak ensemblepredictiveanalyticsofcovid19infodemictweetsusingbagofwords |
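
Illustrative note: the description field mentions three techniques: a bag-of-words representation, ranking attributes by information gain, and a Vote ensemble. The sketch below shows those ideas in plain Python on a tiny invented dataset. It is not the authors' pipeline: the four example tweets, their labels, the whitespace tokenizer, and the reduction of each attribute to "word present or absent" are all assumptions made for illustration, and the named base classifiers (SMO, Voted Perceptron, and so on) are not implemented here.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple, assumed tokenizer)."""
    return text.lower().split()


def bag_of_words(docs):
    """Return (vocabulary, rows); each row holds per-word counts for one document."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    rows = [[Counter(tokenize(doc))[word] for word in vocab] for doc in docs]
    return vocab, rows


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(column, labels):
    """Information gain of the binary attribute 'word is present' for the class."""
    present = [y for x, y in zip(column, labels) if x > 0]
    absent = [y for x, y in zip(column, labels) if x == 0]
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in (present, absent) if part
    )
    return entropy(labels) - remainder


def majority_vote(predictions):
    """Hard voting: the label predicted by the most base classifiers."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical toy data, invented for this sketch only.
docs = [
    "wash hands often",
    "garlic cures covid",
    "wear a mask",
    "garlic kills virus",
]
labels = ["genuine", "fake", "genuine", "fake"]

vocab, rows = bag_of_words(docs)
columns = list(zip(*rows))  # one column of counts per vocabulary word

# Rank words from most to least informative about the class label.
ranked = sorted(
    ((information_gain(col, labels), word) for col, word in zip(columns, vocab)),
    reverse=True,
)

for gain, word in ranked[:3]:
    print(f"{word}: {gain:.3f}")

# Combine three hypothetical base-classifier outputs for one tweet.
print(majority_vote(["fake", "genuine", "fake"]))
```

In this toy set the word "garlic" separates the two classes perfectly, so it ranks first. On real data the ranking would depend on the tokenizer, the labels, and how numeric attributes are discretized.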