Thai Fake News Detection Based on Information Retrieval, Natural Language Processing and Machine Learning

Fake news is a big problem in every society. Fake news must be detected and its sharing should be stopped before it causes further damage to the country. Spotting fake news is challenging because of its dynamics. In this research, we propose a framework for robust Thai fake news detection. The frame...

Bibliographic Details
Main author: Meesad, Phayung
Format: Online Article Text
Language: English
Published: Springer Singapore 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8382114/
https://www.ncbi.nlm.nih.gov/pubmed/34458858
http://dx.doi.org/10.1007/s42979-021-00775-6
author Meesad, Phayung
collection PubMed
description Fake news is a big problem in every society. Fake news must be detected and its sharing should be stopped before it causes further damage to the country. Spotting fake news is challenging because of its dynamics. In this research, we propose a framework for robust Thai fake news detection. The framework comprises three main modules, including information retrieval, natural language processing, and machine learning. This research has two phases: the data collection phase and the machine learning model building phase. In the data collection phase, we obtained data from Thai online news websites using web-crawler information retrieval, and we analyzed the data using natural language processing techniques to extract good features from web data. For comparison, we selected some well-known classification Machine Learning models, including Naïve Bayesian, Logistic Regression, K-Nearest Neighbor, Multilayer Perceptron, Support Vector Machine, Decision Tree, Random Forest, Rule-Based Classifier, and Long Short-Term Memory. The comparison study on the test set showed that Long Short-Term Memory was the best model, and we deployed an automatic online fake news detection web application.
format Online
Article
Text
id pubmed-8382114
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Springer Singapore
record_format MEDLINE/PubMed
spelling pubmed-8382114
2021-08-23
Thai Fake News Detection Based on Information Retrieval, Natural Language Processing and Machine Learning
Meesad, Phayung
SN Comput Sci
Original Research
Fake news is a big problem in every society. Fake news must be detected and its sharing should be stopped before it causes further damage to the country. Spotting fake news is challenging because of its dynamics. In this research, we propose a framework for robust Thai fake news detection. The framework comprises three main modules, including information retrieval, natural language processing, and machine learning. This research has two phases: the data collection phase and the machine learning model building phase. In the data collection phase, we obtained data from Thai online news websites using web-crawler information retrieval, and we analyzed the data using natural language processing techniques to extract good features from web data. For comparison, we selected some well-known classification Machine Learning models, including Naïve Bayesian, Logistic Regression, K-Nearest Neighbor, Multilayer Perceptron, Support Vector Machine, Decision Tree, Random Forest, Rule-Based Classifier, and Long Short-Term Memory. The comparison study on the test set showed that Long Short-Term Memory was the best model, and we deployed an automatic online fake news detection web application.
Springer Singapore 2021-08-23 2021
/pmc/articles/PMC8382114/
/pubmed/34458858
http://dx.doi.org/10.1007/s42979-021-00775-6
Text en
© The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd 2021
This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title Thai Fake News Detection Based on Information Retrieval, Natural Language Processing and Machine Learning
topic Original Research