Analysis of e-Mail Spam Detection Using a Novel Machine Learning-Based Hybrid Bagging Technique

e-mail service providers and consumers find it challenging to distinguish between spam and nonspam e-mails. The purpose of spammers is to spread false information by sending annoying messages that catch the attention of the public. Various spam identification techniques have been suggested and evalu...

Bibliographic Details
Main Author:	Rayan, Alanazi
Format:	Online Article Text
Language:	English
Published:	Hindawi 2022
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9381222/ https://www.ncbi.nlm.nih.gov/pubmed/35983156 http://dx.doi.org/10.1155/2022/2500772

_version_	1784769030573260800
author	Rayan, Alanazi
author_facet	Rayan, Alanazi
author_sort	Rayan, Alanazi
collection	PubMed
description	e-mail service providers and consumers find it challenging to distinguish between spam and nonspam e-mails. The purpose of spammers is to spread false information by sending annoying messages that catch the attention of the public. Various spam identification techniques have been suggested and evaluated in the past, but the results show that more research in this regard is required to enhance accuracy and to reduce training time and error rate. Thus, this research proposes a novel machine learning-based hybrid bagging method for e-mail spam identification by combining two machine learning methods: random forest and J48 (decision tree). The proposed framework categorizes the e-mail into ham and spam. The database is split into multiple sets and provided as input to each method in this procedure. Moreover, tokenization, stemming, and stop word removal are performed in the preprocessing stage. Further, correlation feature selection (CFS) is employed in this research to select the required features from the preprocessed data. The effectiveness of the presented method is evaluated in terms of true-negative rates, accuracy, recall, precision, false-positive rate, f-measure, and false-negative rate; the outcomes of three studies are compared. According to the results, the presented hybrid bagged model-based SMD technology achieved 98 percent accuracy.
format	Online Article Text
id	pubmed-9381222
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Hindawi
record_format	MEDLINE/PubMed
spelling	pubmed-9381222 2022-08-17 Analysis of e-Mail Spam Detection Using a Novel Machine Learning-Based Hybrid Bagging Technique Rayan, Alanazi Comput Intell Neurosci Research Article e-mail service providers and consumers find it challenging to distinguish between spam and nonspam e-mails. The purpose of spammers is to spread false information by sending annoying messages that catch the attention of the public. Various spam identification techniques have been suggested and evaluated in the past, but the results show that more research in this regard is required to enhance accuracy and to reduce training time and error rate. Thus, this research proposes a novel machine learning-based hybrid bagging method for e-mail spam identification by combining two machine learning methods: random forest and J48 (decision tree). The proposed framework categorizes the e-mail into ham and spam. The database is split into multiple sets and provided as input to each method in this procedure. Moreover, tokenization, stemming, and stop word removal are performed in the preprocessing stage. Further, correlation feature selection (CFS) is employed in this research to select the required features from the preprocessed data. The effectiveness of the presented method is evaluated in terms of true-negative rates, accuracy, recall, precision, false-positive rate, f-measure, and false-negative rate; the outcomes of three studies are compared. According to the results, the presented hybrid bagged model-based SMD technology achieved 98 percent accuracy. Hindawi 2022-08-09 /pmc/articles/PMC9381222/ /pubmed/35983156 http://dx.doi.org/10.1155/2022/2500772 Text en Copyright © 2022 Alanazi Rayan. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Research Article Rayan, Alanazi Analysis of e-Mail Spam Detection Using a Novel Machine Learning-Based Hybrid Bagging Technique
title	Analysis of e-Mail Spam Detection Using a Novel Machine Learning-Based Hybrid Bagging Technique
title_full	Analysis of e-Mail Spam Detection Using a Novel Machine Learning-Based Hybrid Bagging Technique
title_fullStr	Analysis of e-Mail Spam Detection Using a Novel Machine Learning-Based Hybrid Bagging Technique
title_full_unstemmed	Analysis of e-Mail Spam Detection Using a Novel Machine Learning-Based Hybrid Bagging Technique
title_short	Analysis of e-Mail Spam Detection Using a Novel Machine Learning-Based Hybrid Bagging Technique
title_sort	analysis of e-mail spam detection using a novel machine learning-based hybrid bagging technique
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9381222/ https://www.ncbi.nlm.nih.gov/pubmed/35983156 http://dx.doi.org/10.1155/2022/2500772
work_keys_str_mv	AT rayanalanazi analysisofemailspamdetectionusinganovelmachinelearningbasedhybridbaggingtechnique
