Disinformation: analysis and identification
We present an extensive study on disinformation, which is defined as information that is false and misleading and intentionally shared to cause harm. Through this work, we aim to answer the following questions: Can we automatically and accurately classify a news article as containing disinformation?...
Main Authors: | Pathak, Archita; Srihari, Rohini K.; Natu, Nihit |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer US 2021 |
Subjects: | S.I. : SBP-BRiMS2020 |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8212793/ https://www.ncbi.nlm.nih.gov/pubmed/34177355 http://dx.doi.org/10.1007/s10588-021-09336-x |
_version_ | 1783709707431575552 |
---|---|
author | Pathak, Archita; Srihari, Rohini K.; Natu, Nihit |
author_facet | Pathak, Archita; Srihari, Rohini K.; Natu, Nihit |
author_sort | Pathak, Archita |
collection | PubMed |
description | We present an extensive study on disinformation, which is defined as information that is false and misleading and intentionally shared to cause harm. Through this work, we aim to answer the following questions: Can we automatically and accurately classify a news article as containing disinformation? What characteristics of disinformation differentiate it from other types of benign information? We conduct this study in the context of two significant events: the US elections of 2016 and the 2020 COVID pandemic. We build a series of classifiers to (i) examine linguistic clues exhibited by different types of fake news articles, (ii) analyze “clickbaityness” of disinformation headlines, and (iii) finally, perform fine-grained, veracity-based article classification through a natural language inference (NLI) module for automated disinformation verification; this utilizes a manually curated set of evidence sources. For the latter, we built a new dataset that is annotated with generic, veracity-based labels and ground truth evidence supporting each label. The veracity labels were formulated based on examining standards used by reputable fact-checking organizations. We show that disinformation derives features from both propaganda and mainstream news, making it more challenging to detect. However, there is significant potential for automating the fact-checking process to incorporate the degree of veracity. We provide error analysis that illustrates the challenges involved in the automated fact-checking task and identifies factors that may improve this process in future work. Finally, we also describe the implementation of a web app that extracts important entities and actions from a given article and searches the web to gather evidence from credible sources. The evidence articles are then used to generate a veracity label that can assist manual fact-checkers engaged in combating disinformation. |
format | Online Article Text |
id | pubmed-8212793 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Springer US |
record_format | MEDLINE/PubMed |
spelling | pubmed-8212793 2021-06-21 Disinformation: analysis and identification Pathak, Archita Srihari, Rohini K. Natu, Nihit Comput Math Organ Theory S.I. : SBP-BRiMS2020 Springer US 2021-06-18 2021 /pmc/articles/PMC8212793/ /pubmed/34177355 http://dx.doi.org/10.1007/s10588-021-09336-x Text en © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | S.I. : SBP-BRiMS2020 Pathak, Archita Srihari, Rohini K. Natu, Nihit Disinformation: analysis and identification |
title | Disinformation: analysis and identification |
title_full | Disinformation: analysis and identification |
title_fullStr | Disinformation: analysis and identification |
title_full_unstemmed | Disinformation: analysis and identification |
title_short | Disinformation: analysis and identification |
title_sort | disinformation: analysis and identification |
topic | S.I. : SBP-BRiMS2020 |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8212793/ https://www.ncbi.nlm.nih.gov/pubmed/34177355 http://dx.doi.org/10.1007/s10588-021-09336-x |
work_keys_str_mv | AT pathakarchita disinformationanalysisandidentification AT sriharirohinik disinformationanalysisandidentification AT natunihit disinformationanalysisandidentification |