Life-long phishing attack detection using continual learning

Bibliographic Details
Main authors: Ejaz, Asif; Mian, Adnan Noor; Manzoor, Sanaullah
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10352299/
https://www.ncbi.nlm.nih.gov/pubmed/37460588
http://dx.doi.org/10.1038/s41598-023-37552-9
_version_ 1785074482662080512
author Ejaz, Asif
Mian, Adnan Noor
Manzoor, Sanaullah
author_facet Ejaz, Asif
Mian, Adnan Noor
Manzoor, Sanaullah
author_sort Ejaz, Asif
collection PubMed
description Phishing is an identity theft that employs social engineering methods to get confidential data from unwary users. A phisher frequently attempts to trick the victim into clicking a URL that leads to a malicious website. Many phishing attack victims lose their credentials and digital assets daily. This study demonstrates how the performance of traditional machine learning (ML)-based phishing detection models deteriorates over time. This failure is due to drastic changes in feature distributions caused by new phishing techniques and technological evolution over time. This paper explores continual learning (CL) techniques for sustained phishing detection performance over time. To demonstrate this behavior, we collect phishing and benign samples for three consecutive years from 2018 to 2020 and divide them into six datasets to evaluate traditional ML and proposed CL algorithms. We train a vanilla neural network (VNN) model in the CL fashion using deep feature embedding of HTML contents. We compare the proposed CL algorithms with the VNN model trained from scratch and with transfer learning (TL). We show that CL algorithms maintain accuracy over time with a tolerable deterioration of 2.45%. In contrast, VNN and TL-based models’ performance deteriorates by over 20.65% and 8%, respectively.
format Online
Article
Text
id pubmed-10352299
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-103522992023-07-19 Life-long phishing attack detection using continual learning Ejaz, Asif Mian, Adnan Noor Manzoor, Sanaullah Sci Rep Article Phishing is an identity theft that employs social engineering methods to get confidential data from unwary users. A phisher frequently attempts to trick the victim into clicking a URL that leads to a malicious website. Many phishing attack victims lose their credentials and digital assets daily. This study demonstrates how the performance of traditional machine learning (ML)-based phishing detection models deteriorates over time. This failure is due to drastic changes in feature distributions caused by new phishing techniques and technological evolution over time. This paper explores continual learning (CL) techniques for sustained phishing detection performance over time. To demonstrate this behavior, we collect phishing and benign samples for three consecutive years from 2018 to 2020 and divide them into six datasets to evaluate traditional ML and proposed CL algorithms. We train a vanilla neural network (VNN) model in the CL fashion using deep feature embedding of HTML contents. We compare the proposed CL algorithms with the VNN model trained from scratch and with transfer learning (TL). We show that CL algorithms maintain accuracy over time with a tolerable deterioration of 2.45%. In contrast, VNN and TL-based models’ performance deteriorates by over 20.65% and 8%, respectively. 
Nature Publishing Group UK 2023-07-17 /pmc/articles/PMC10352299/ /pubmed/37460588 http://dx.doi.org/10.1038/s41598-023-37552-9 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) .
spellingShingle Article
Ejaz, Asif
Mian, Adnan Noor
Manzoor, Sanaullah
Life-long phishing attack detection using continual learning
title Life-long phishing attack detection using continual learning
title_full Life-long phishing attack detection using continual learning
title_fullStr Life-long phishing attack detection using continual learning
title_full_unstemmed Life-long phishing attack detection using continual learning
title_short Life-long phishing attack detection using continual learning
title_sort life-long phishing attack detection using continual learning
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10352299/
https://www.ncbi.nlm.nih.gov/pubmed/37460588
http://dx.doi.org/10.1038/s41598-023-37552-9
work_keys_str_mv AT ejazasif lifelongphishingattackdetectionusingcontinuallearning
AT mianadnannoor lifelongphishingattackdetectionusingcontinuallearning
AT manzoorsanaullah lifelongphishingattackdetectionusingcontinuallearning