
Enhancing Spam Message Classification and Detection Using Transformer-Based Embedding and Ensemble Learning

Over the last decade, the Short Message Service (SMS) has become a primary communication channel. Nevertheless, its popularity has also given rise to so-called SMS spam. Spam messages are annoying and potentially malicious, exposing SMS users to credential theft and data loss. To...


Bibliographic Details
Main authors: Ghourabi, Abdallah, Alohaly, Manar
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10146782/
https://www.ncbi.nlm.nih.gov/pubmed/37112202
http://dx.doi.org/10.3390/s23083861
author Ghourabi, Abdallah
Alohaly, Manar
collection PubMed
description Over the last decade, the Short Message Service (SMS) has become a primary communication channel. Nevertheless, its popularity has also given rise to so-called SMS spam. Spam messages are annoying and potentially malicious, exposing SMS users to credential theft and data loss. To mitigate this persistent threat, we propose a new model for SMS spam detection based on pre-trained Transformers and Ensemble Learning. The proposed model uses a text embedding technique that builds on recent advances in the GPT-3 Transformer. This technique provides a high-quality representation that can improve detection results. In addition, we used an Ensemble Learning method in which four machine learning models were grouped into one model that performed significantly better than its separate constituent parts. The model was evaluated experimentally on the SMS Spam Collection Dataset. The results showed state-of-the-art performance that exceeded all previous works, with accuracy reaching 99.91%.
format Online
Article
Text
id pubmed-10146782
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10146782 2023-04-29 Sensors (Basel) Article MDPI 2023-04-10 /pmc/articles/PMC10146782/ /pubmed/37112202 http://dx.doi.org/10.3390/s23083861 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Enhancing Spam Message Classification and Detection Using Transformer-Based Embedding and Ensemble Learning
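
The description above says that four machine learning models were grouped into one ensemble, but this record does not state which models were used or how their outputs were combined. The following is a minimal sketch of one common combination rule, weighted soft voting, offered purely as an illustration. The function names, the example probabilities, the equal weights, and the 0.5 threshold are all assumptions and are not taken from the article.

```python
from typing import Sequence


def soft_vote(spam_probs: Sequence[float], weights: Sequence[float]) -> float:
    """Return the weighted average of each model's predicted spam probability.

    Assumption: every base model outputs a probability in [0, 1] that the
    message is spam. The article's actual combination rule may differ.
    """
    if len(spam_probs) != len(weights):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(p * w for p, w in zip(spam_probs, weights)) / total


def classify(spam_probs: Sequence[float],
             weights: Sequence[float],
             threshold: float = 0.5) -> str:
    """Label a message 'spam' when the ensemble score reaches the threshold."""
    return "spam" if soft_vote(spam_probs, weights) >= threshold else "ham"


# Hypothetical outputs of four base models for a single message.
example_probs = [0.9, 0.6, 0.4, 0.8]
equal_weights = [1.0, 1.0, 1.0, 1.0]
example_label = classify(example_probs, equal_weights)
```

With equal weights this reduces to a plain average of the four scores, so a single dissenting model (0.4 in the example) does not overturn the others. In practice each score would come from a classifier trained on the message embeddings.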