Disinformation: analysis and identification

We present an extensive study on disinformation, which is defined as information that is false and misleading and intentionally shared to cause harm. Through this work, we aim to answer the following questions: Can we automatically and accurately classify a news article as containing disinformation?...

Bibliographic Details
Main Authors: Pathak, Archita; Srihari, Rohini K.; Natu, Nihit
Format: Online Article Text
Language: English
Published: Springer US 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8212793/
https://www.ncbi.nlm.nih.gov/pubmed/34177355
http://dx.doi.org/10.1007/s10588-021-09336-x
author Pathak, Archita
Srihari, Rohini K.
Natu, Nihit
collection PubMed
description We present an extensive study on disinformation, which is defined as information that is false and misleading and intentionally shared to cause harm. Through this work, we aim to answer the following questions: Can we automatically and accurately classify a news article as containing disinformation? What characteristics of disinformation differentiate it from other types of benign information? We conduct this study in the context of two significant events: the US elections of 2016 and the 2020 COVID pandemic. We build a series of classifiers to (i) examine linguistic clues exhibited by different types of fake news articles, (ii) analyze “clickbaityness” of disinformation headlines, and (iii) finally, perform fine-grained, veracity-based article classification through a natural language inference (NLI) module for automated disinformation verification; this utilizes a manually curated set of evidence sources. For the latter, we built a new dataset that is annotated with generic, veracity-based labels and ground truth evidence supporting each label. The veracity labels were formulated based on examining standards used by reputable fact-checking organizations. We show that disinformation derives features from both propaganda and mainstream news, making it more challenging to detect. However, there is significant potential for automating the fact-checking process to incorporate the degree of veracity. We provide error analysis that illustrates the challenges involved in the automated fact-checking task and identifies factors that may improve this process in future work. Finally, we also describe the implementation of a web app that extracts important entities and actions from a given article and searches the web to gather evidence from credible sources. The evidence articles are then used to generate a veracity label that can assist manual fact-checkers engaged in combating disinformation.
format Online
Article
Text
id pubmed-8212793
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Springer US
record_format MEDLINE/PubMed
spelling pubmed-8212793
2021-06-21
Disinformation: analysis and identification
Pathak, Archita
Srihari, Rohini K.
Natu, Nihit
Comput Math Organ Theory
S.I. : SBP-BRiMS2020
Springer US 2021-06-18 2021
/pmc/articles/PMC8212793/
/pubmed/34177355
http://dx.doi.org/10.1007/s10588-021-09336-x
Text en
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021
This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title Disinformation: analysis and identification
topic S.I. : SBP-BRiMS2020
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8212793/
https://www.ncbi.nlm.nih.gov/pubmed/34177355
http://dx.doi.org/10.1007/s10588-021-09336-x