
Infrequent Pattern Detection for Reliable Network Traffic Analysis Using Robust Evolutionary Computation

Bibliographic Details
Main authors: Rashid, A. N. M. Bazlur; Ahmed, Mohiuddin; Pathan, Al-Sakib Khan
Format: Online Article Text
Language: English
Published: MDPI 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8123319/
https://www.ncbi.nlm.nih.gov/pubmed/33922954
http://dx.doi.org/10.3390/s21093005
author Rashid, A. N. M. Bazlur
Ahmed, Mohiuddin
Pathan, Al-Sakib Khan
collection PubMed
description Anomaly detection is important in many domains, including cybersecurity, yet cybersecurity datasets contain many rare anomalies, or infrequent patterns, whose detection is computationally expensive. These datasets also consist of many features, most of them irrelevant, which lowers the classification performance of machine learning algorithms. Hence, feature selection (FS), i.e., selecting only the relevant features, is an essential preprocessing step in cybersecurity data analysis. Although many FS approaches have been proposed in the literature, cooperative co-evolution (CC)-based FS approaches can be more suitable for cybersecurity data preprocessing in the Big Data scenario. Accordingly, in this paper, we applied our previously proposed CC-based FS with random feature grouping (CCFSRFG) to a benchmark cybersecurity dataset as the preprocessing step. Both the dataset with the original features and the dataset with the reduced number of features were used for infrequent pattern detection. The experimental analysis was performed and evaluated using 10 unsupervised anomaly detection techniques; the proposed approach is therefore termed Unsupervised Infrequent Pattern Detection (UIPD). We then compared the experimental results with and without FS in terms of true positive rate (TPR). The analysis indicates that the largest TPR improvement, 385.91% when using FS, was achieved by the cluster-based local outlier factor (CBLOF) in detecting the backdoor infrequent pattern. Furthermore, the highest overall infrequent pattern detection TPR, across all infrequent patterns, improved by 61.47% using the clustering-based multivariate Gaussian outlier score (CMGOS) with FS.
format Online
Article
Text
id pubmed-8123319
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8123319 2021-05-16 Sensors (Basel) Article MDPI 2021-04-25 /pmc/articles/PMC8123319/ /pubmed/33922954 http://dx.doi.org/10.3390/s21093005 Text en
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Infrequent Pattern Detection for Reliable Network Traffic Analysis Using Robust Evolutionary Computation
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8123319/
https://www.ncbi.nlm.nih.gov/pubmed/33922954
http://dx.doi.org/10.3390/s21093005
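
The description in this record reports results as true positive rate (TPR) with and without feature selection, and mentions random feature grouping as part of the CC-based FS step. The Python sketch below illustrates both ideas under stated assumptions and is not the authors' CCFSRFG or UIPD implementation. The function names, the toy labels, and the definition of improvement as a percentage change relative to the no-FS baseline are all assumptions made for illustration; the record itself does not define how the reported percentages were computed.

```python
import random


def random_feature_groups(n_features, n_groups, seed=None):
    """Randomly partition feature indices 0..n_features-1 into n_groups groups.

    Illustrative only: a cooperative co-evolution scheme would hand each
    group to its own subpopulation; that part is not shown here.
    """
    rng = random.Random(seed)
    indices = list(range(n_features))
    rng.shuffle(indices)
    # Deal the shuffled indices round-robin so group sizes differ by at most one.
    return [sorted(indices[g::n_groups]) for g in range(n_groups)]


def true_positive_rate(y_true, y_pred, positive=1):
    """TPR = TP / (TP + FN), computed over samples whose true label is positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("no positive samples in y_true")
    return tp / (tp + fn)


def relative_improvement(tpr_without_fs, tpr_with_fs):
    """Percentage change in TPR relative to the no-FS baseline (assumed definition)."""
    if tpr_without_fs == 0:
        raise ValueError("baseline TPR is zero; relative improvement is undefined")
    return (tpr_with_fs - tpr_without_fs) / tpr_without_fs * 100.0


# Hypothetical toy labels: 1 marks an infrequent pattern, 0 marks normal traffic.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
pred_without_fs = [1, 0, 0, 0, 0, 0, 1, 0]  # detector run on all features
pred_with_fs = [1, 1, 1, 0, 0, 0, 0, 0]     # detector run on selected features

tpr_base = true_positive_rate(y_true, pred_without_fs)  # 1 of 4 positives found
tpr_fs = true_positive_rate(y_true, pred_with_fs)       # 3 of 4 positives found
improvement = relative_improvement(tpr_base, tpr_fs)
groups = random_feature_groups(n_features=10, n_groups=3, seed=0)

print(f"TPR without FS: {tpr_base:.2f}, with FS: {tpr_fs:.2f}")
print(f"Relative TPR improvement: {improvement:.2f}%")
print(f"Random feature groups: {groups}")
```

Under this assumed definition, an improvement above 100% (such as the 385.91% quoted in the description) simply means the TPR with FS is several times the baseline TPR; it does not imply a TPR greater than 1.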