A Multiple change-point detection framework on linguistic characteristics of real versus fake news articles

Extracting information from textual data of news articles has been proven to be significant in developing efficient fake news detection systems. Pointedly, to fight disinformation, researchers concentrated on extracting information which focuses on exploiting linguistic characteristics that are comm...

Bibliographic Details
Main authors: Petrou, Nikolas; Christodoulou, Chrysovalantis; Anastasiou, Andreas; Pallis, George; Dikaiakos, Marios D.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10100634/
https://www.ncbi.nlm.nih.gov/pubmed/37055455
http://dx.doi.org/10.1038/s41598-023-32952-3
_version_ 1785025320117600256
author Petrou, Nikolas
Christodoulou, Chrysovalantis
Anastasiou, Andreas
Pallis, George
Dikaiakos, Marios D.
author_facet Petrou, Nikolas
Christodoulou, Chrysovalantis
Anastasiou, Andreas
Pallis, George
Dikaiakos, Marios D.
author_sort Petrou, Nikolas
collection PubMed
description Extracting information from textual data of news articles has been proven to be significant in developing efficient fake news detection systems. Pointedly, to fight disinformation, researchers concentrated on extracting information which focuses on exploiting linguistic characteristics that are common in fake news and can aid in detecting false content automatically. Even though these approaches were proven to have high performance, the research community proved that both the language as well as the word use in literature are evolving. Therefore, the objective of this paper is to explore the linguistic characteristics of fake news and real ones over time. To achieve this, we establish a large dataset containing linguistic characteristics of various articles over the years. In addition, we introduce a novel framework where the articles are classified in specified topics based on their content and the most informative linguistic features are extracted using dimensionality reduction methods. Eventually, the framework detects the changes of the extracted linguistic features on real and fake news articles over the time incorporating a novel change-point detection method. By employing our framework for the established dataset, we noticed that the linguistic characteristics which concern the article’s title seem to be significantly important in capturing important movements in the similarity level of “Fake” and “Real” articles.
format Online
Article
Text
id pubmed-10100634
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10100634 2023-04-14 A Multiple change-point detection framework on linguistic characteristics of real versus fake news articles Petrou, Nikolas Christodoulou, Chrysovalantis Anastasiou, Andreas Pallis, George Dikaiakos, Marios D. Sci Rep Article
Nature Publishing Group UK 2023-04-13 /pmc/articles/PMC10100634/ /pubmed/37055455 http://dx.doi.org/10.1038/s41598-023-32952-3 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Article
Petrou, Nikolas
Christodoulou, Chrysovalantis
Anastasiou, Andreas
Pallis, George
Dikaiakos, Marios D.
A Multiple change-point detection framework on linguistic characteristics of real versus fake news articles
title A Multiple change-point detection framework on linguistic characteristics of real versus fake news articles
title_full A Multiple change-point detection framework on linguistic characteristics of real versus fake news articles
title_fullStr A Multiple change-point detection framework on linguistic characteristics of real versus fake news articles
title_full_unstemmed A Multiple change-point detection framework on linguistic characteristics of real versus fake news articles
title_short A Multiple change-point detection framework on linguistic characteristics of real versus fake news articles
title_sort multiple change-point detection framework on linguistic characteristics of real versus fake news articles
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10100634/
https://www.ncbi.nlm.nih.gov/pubmed/37055455
http://dx.doi.org/10.1038/s41598-023-32952-3
work_keys_str_mv AT petrounikolas amultiplechangepointdetectionframeworkonlinguisticcharacteristicsofrealversusfakenewsarticles
AT christodoulouchrysovalantis amultiplechangepointdetectionframeworkonlinguisticcharacteristicsofrealversusfakenewsarticles
AT anastasiouandreas amultiplechangepointdetectionframeworkonlinguisticcharacteristicsofrealversusfakenewsarticles
AT pallisgeorge amultiplechangepointdetectionframeworkonlinguisticcharacteristicsofrealversusfakenewsarticles
AT dikaiakosmariosd amultiplechangepointdetectionframeworkonlinguisticcharacteristicsofrealversusfakenewsarticles
AT petrounikolas multiplechangepointdetectionframeworkonlinguisticcharacteristicsofrealversusfakenewsarticles
AT christodoulouchrysovalantis multiplechangepointdetectionframeworkonlinguisticcharacteristicsofrealversusfakenewsarticles
AT anastasiouandreas multiplechangepointdetectionframeworkonlinguisticcharacteristicsofrealversusfakenewsarticles
AT pallisgeorge multiplechangepointdetectionframeworkonlinguisticcharacteristicsofrealversusfakenewsarticles
AT dikaiakosmariosd multiplechangepointdetectionframeworkonlinguisticcharacteristicsofrealversusfakenewsarticles