Enhancing Spam Message Classification and Detection Using Transformer-Based Embedding and Ensemble Learning
Over the last decade, the Short Message Service (SMS) has become a primary communication channel. Nevertheless, its popularity has also given rise to the so-called SMS spam. These messages, i.e., spam, are annoying and potentially malicious by exposing SMS users to credential theft and data loss. To...
Main authors: | Ghourabi, Abdallah; Alohaly, Manar |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10146782/ https://www.ncbi.nlm.nih.gov/pubmed/37112202 http://dx.doi.org/10.3390/s23083861 |
_version_ | 1785034660940611584 |
---|---|
author | Ghourabi, Abdallah Alohaly, Manar |
author_facet | Ghourabi, Abdallah Alohaly, Manar |
author_sort | Ghourabi, Abdallah |
collection | PubMed |
description | Over the last decade, the Short Message Service (SMS) has become a primary communication channel. Nevertheless, its popularity has also given rise to the so-called SMS spam. These messages, i.e., spam, are annoying and potentially malicious by exposing SMS users to credential theft and data loss. To mitigate this persistent threat, we propose a new model for SMS spam detection based on pre-trained Transformers and Ensemble Learning. The proposed model uses a text embedding technique that builds on the recent advancements of the GPT-3 Transformer. This technique provides a high-quality representation that can improve detection results. In addition, we used an Ensemble Learning method where four machine learning models were grouped into one model that performed significantly better than its separate constituent parts. The experimental evaluation of the model was performed using the SMS Spam Collection Dataset. The obtained results showed a state-of-the-art performance that exceeded all previous works with an accuracy that reached 99.91%. |
format | Online Article Text |
id | pubmed-10146782 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10146782 2023-04-29 Enhancing Spam Message Classification and Detection Using Transformer-Based Embedding and Ensemble Learning Ghourabi, Abdallah Alohaly, Manar Sensors (Basel) Article Over the last decade, the Short Message Service (SMS) has become a primary communication channel. Nevertheless, its popularity has also given rise to the so-called SMS spam. These messages, i.e., spam, are annoying and potentially malicious by exposing SMS users to credential theft and data loss. To mitigate this persistent threat, we propose a new model for SMS spam detection based on pre-trained Transformers and Ensemble Learning. The proposed model uses a text embedding technique that builds on the recent advancements of the GPT-3 Transformer. This technique provides a high-quality representation that can improve detection results. In addition, we used an Ensemble Learning method where four machine learning models were grouped into one model that performed significantly better than its separate constituent parts. The experimental evaluation of the model was performed using the SMS Spam Collection Dataset. The obtained results showed a state-of-the-art performance that exceeded all previous works with an accuracy that reached 99.91%. MDPI 2023-04-10 /pmc/articles/PMC10146782/ /pubmed/37112202 http://dx.doi.org/10.3390/s23083861 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
title | Enhancing Spam Message Classification and Detection Using Transformer-Based Embedding and Ensemble Learning |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10146782/ https://www.ncbi.nlm.nih.gov/pubmed/37112202 http://dx.doi.org/10.3390/s23083861 |
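
The description in this record says that four machine learning models were grouped into one ensemble operating on text embeddings, but it does not name the four models or the rule used to combine them. The following is only a minimal sketch of one common combination rule, soft voting (a weighted average of each model's spam probability). The names `soft_vote`, `classify`, and `toy_models`, the two-dimensional toy "embeddings", and the 0.5 threshold are all hypothetical and are not taken from the article.

```python
from typing import Callable, Sequence

# Assumption: a "model" is any function mapping an embedding vector to P(spam).
Model = Callable[[Sequence[float]], float]


def soft_vote(models: Sequence[Model], weights: Sequence[float],
              embedding: Sequence[float]) -> float:
    """Return the weighted average of the spam probabilities of all models."""
    if not models or len(models) != len(weights):
        raise ValueError("need at least one model and exactly one weight per model")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(w * m(embedding) for m, w in zip(models, weights))
    return weighted / total_weight


def classify(models: Sequence[Model], weights: Sequence[float],
             embedding: Sequence[float], threshold: float = 0.5) -> str:
    """Label a message 'spam' if the ensemble probability reaches the threshold."""
    return "spam" if soft_vote(models, weights, embedding) >= threshold else "ham"


def clamp(x: float) -> float:
    """Keep a score inside the probability range [0, 1]."""
    return min(1.0, max(0.0, x))


# Four hypothetical stand-in models; real ones would be trained classifiers
# fed with embeddings from a pre-trained Transformer.
toy_models = [
    lambda e: clamp(e[0]),
    lambda e: clamp(e[1]),
    lambda e: clamp((e[0] + e[1]) / 2),
    lambda e: clamp(max(e)),
]
equal_weights = [1.0, 1.0, 1.0, 1.0]

if __name__ == "__main__":
    print(classify(toy_models, equal_weights, [0.9, 0.7]))
    print(classify(toy_models, equal_weights, [0.1, 0.2]))
```

Averaging probabilities rather than counting hard votes lets a confident model outweigh several uncertain ones; whether the article uses this rule, hard voting, or stacking cannot be determined from this record.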