An improved X-means and isolation forest based methodology for network traffic anomaly detection

Anomaly detection in network traffic is becoming a challenging task due to the complexity of large-scale networks and the proliferation of various social network applications. In the actual industrial environment, only recently obtained unlabelled data can be used as the training set. The accuracy o...

Bibliographic Details
Main authors: Feng, Yifan; Cai, Weihong; Yue, Haoyu; Xu, Jianlong; Lin, Yan; Chen, Jiaxin; Hu, Zijun
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects: Research Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8803200/
https://www.ncbi.nlm.nih.gov/pubmed/35100305
http://dx.doi.org/10.1371/journal.pone.0263423
author Feng, Yifan
Cai, Weihong
Yue, Haoyu
Xu, Jianlong
Lin, Yan
Chen, Jiaxin
Hu, Zijun
collection PubMed
description Anomaly detection in network traffic is becoming a challenging task due to the complexity of large-scale networks and the proliferation of various social network applications. In the actual industrial environment, only recently obtained unlabelled data can be used as the training set. The accuracy of the abnormal ratio in the training set as prior knowledge has a great influence on the performance of the commonly used unsupervised algorithms. In this study, an anomaly detection algorithm based on X-means and iForest is proposed, named X-iForest, which clusters the standard Euclidean distance between the abnormal points and the normal cluster centre to achieve secondary filtering by using X-means. We compared X-iForest with seven mainstream unsupervised algorithms in terms of the AUC and anomaly detection rates. A large number of experiments showed that X-iForest has notable advantages over other algorithms and can be well applied to anomaly detection of large-scale network traffic data.
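The secondary filtering step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a first-stage detector such as iForest has already split the data into normal and flagged points, it reads "standard Euclidean distance" as the standardized Euclidean distance (each coordinate scaled by its standard deviation), and it uses a simple one-dimensional two-means split as a stand-in for X-means. The function names `standardized_distance`, `two_means_1d` and `secondary_filter` are hypothetical.

```python
import math
from statistics import mean, pstdev


def standardized_distance(point, centre, scales):
    """Euclidean distance with each coordinate divided by its scale."""
    return math.sqrt(
        sum(((p - c) / s) ** 2 for p, c, s in zip(point, centre, scales))
    )


def two_means_1d(values, iterations=50):
    """Return a threshold splitting 1-D values into a low and a high group.

    Assumption: a fixed two-cluster split stands in for X-means, which
    would choose the number of clusters itself.
    """
    low, high = min(values), max(values)
    for _ in range(iterations):
        threshold = (low + high) / 2
        low_group = [v for v in values if v <= threshold]
        high_group = [v for v in values if v > threshold]
        if not high_group:  # all values identical: nothing to split
            break
        new_low, new_high = mean(low_group), mean(high_group)
        if (new_low, new_high) == (low, high):  # centres stopped moving
            break
        low, high = new_low, new_high
    return (low + high) / 2


def secondary_filter(normal_points, flagged_points):
    """Keep only flagged points that lie in the far-distance group."""
    if not flagged_points:
        return []
    columns = list(zip(*normal_points))
    centre = [mean(col) for col in columns]
    # Guard against zero spread in a coordinate.
    scales = [pstdev(col) or 1.0 for col in columns]
    distances = [
        standardized_distance(p, centre, scales) for p in flagged_points
    ]
    threshold = two_means_1d(distances)
    return [p for p, d in zip(flagged_points, distances) if d > threshold]


# Hypothetical toy data: two flagged points sit near the normal cluster
# (likely false alarms) and two sit far away.
normal = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
flagged = [(1.2, 1.1), (0.9, 1.3), (8.0, 9.0), (9.0, 8.5)]
confirmed = secondary_filter(normal, flagged)
```

In this toy run the two flagged points close to the normal centre are dropped and the two distant ones are retained, which is the intent of a second filtering pass when the true anomaly ratio is not known in advance.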
format Online
Article
Text
id pubmed-8803200
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8803200 2022-02-01 PLoS One Research Article Public Library of Science 2022-01-31 /pmc/articles/PMC8803200/ /pubmed/35100305 http://dx.doi.org/10.1371/journal.pone.0263423 Text en © 2022 Feng et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title An improved X-means and isolation forest based methodology for network traffic anomaly detection
topic Research Article