An improved X-means and isolation forest based methodology for network traffic anomaly detection
Anomaly detection in network traffic is becoming a challenging task due to the complexity of large-scale networks and the proliferation of various social network applications. In the actual industrial environment, only recently obtained unlabelled data can be used as the training set. The accuracy o...
Main authors: | Feng, Yifan; Cai, Weihong; Yue, Haoyu; Xu, Jianlong; Lin, Yan; Chen, Jiaxin; Hu, Zijun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2022 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8803200/ https://www.ncbi.nlm.nih.gov/pubmed/35100305 http://dx.doi.org/10.1371/journal.pone.0263423 |
_version_ | 1784642822524108800 |
---|---|
author | Feng, Yifan Cai, Weihong Yue, Haoyu Xu, Jianlong Lin, Yan Chen, Jiaxin Hu, Zijun |
author_sort | Feng, Yifan |
collection | PubMed |
description | Anomaly detection in network traffic is becoming a challenging task due to the complexity of large-scale networks and the proliferation of various social network applications. In the actual industrial environment, only recently obtained unlabelled data can be used as the training set. The accuracy of the abnormal ratio in the training set as prior knowledge has a great influence on the performance of the commonly used unsupervised algorithms. In this study, an anomaly detection algorithm based on X-means and iForest is proposed, named X-iForest, which clusters the standard Euclidean distance between the abnormal points and the normal cluster centre to achieve secondary filtering by using X-means. We compared X-iForest with seven mainstream unsupervised algorithms in terms of the AUC and anomaly detection rates. A large number of experiments showed that X-iForest has notable advantages over other algorithms and can be well applied to anomaly detection of large-scale network traffic data. |
format | Online Article Text |
id | pubmed-8803200 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-8803200 2022-02-01 An improved X-means and isolation forest based methodology for network traffic anomaly detection Feng, Yifan Cai, Weihong Yue, Haoyu Xu, Jianlong Lin, Yan Chen, Jiaxin Hu, Zijun PLoS One Research Article Anomaly detection in network traffic is becoming a challenging task due to the complexity of large-scale networks and the proliferation of various social network applications. In the actual industrial environment, only recently obtained unlabelled data can be used as the training set. The accuracy of the abnormal ratio in the training set as prior knowledge has a great influence on the performance of the commonly used unsupervised algorithms. In this study, an anomaly detection algorithm based on X-means and iForest is proposed, named X-iForest, which clusters the standard Euclidean distance between the abnormal points and the normal cluster centre to achieve secondary filtering by using X-means. We compared X-iForest with seven mainstream unsupervised algorithms in terms of the AUC and anomaly detection rates. A large number of experiments showed that X-iForest has notable advantages over other algorithms and can be well applied to anomaly detection of large-scale network traffic data. Public Library of Science 2022-01-31 /pmc/articles/PMC8803200/ /pubmed/35100305 http://dx.doi.org/10.1371/journal.pone.0263423 Text en © 2022 Feng et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
title | An improved X-means and isolation forest based methodology for network traffic anomaly detection |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8803200/ https://www.ncbi.nlm.nih.gov/pubmed/35100305 http://dx.doi.org/10.1371/journal.pone.0263423 |
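
The description above mentions a secondary filtering step that clusters the standardised Euclidean distance between candidate abnormal points and the normal cluster centre. The record gives no further detail, so the following is only a rough sketch of that idea under stated assumptions: the candidates are taken as already flagged by some first-stage detector (such as iForest), the normal centre is the per-feature mean of the points not flagged, and a plain 1-D 2-means split stands in for X-means (which would also choose the number of clusters). All function names are hypothetical and not taken from the article.

```python
import math
from statistics import mean, pstdev


def standardized_distance(point, centre, scales):
    """Euclidean distance after dividing each feature difference by its scale."""
    return math.sqrt(sum(((p - c) / s) ** 2 for p, c, s in zip(point, centre, scales)))


def two_means_1d(values, iterations=50):
    """Label 1-D values as near (0) or far (1) with plain 2-means.

    Assumption: this is a simple stand-in for X-means, not the article's method.
    """
    if not values:
        return []
    lo, hi = min(values), max(values)
    if lo == hi:
        return [0] * len(values)  # nothing to separate
    labels = []
    for _ in range(iterations):
        new_labels = [0 if abs(v - lo) <= abs(v - hi) else 1 for v in values]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        lo = mean(v for v, lab in zip(values, labels) if lab == 0)
        hi = mean(v for v, lab in zip(values, labels) if lab == 1)
    return labels


def secondary_filter(normal_points, candidate_points):
    """Keep only the candidates whose distance to the normal centre falls in the far cluster."""
    dims = list(zip(*normal_points))
    centre = [mean(d) for d in dims]
    # Assumption: population standard deviation per feature; fall back to 1.0 if constant.
    scales = [pstdev(d) or 1.0 for d in dims]
    distances = [standardized_distance(p, centre, scales) for p in candidate_points]
    labels = two_means_1d(distances)
    return [p for p, lab in zip(candidate_points, labels) if lab == 1]


# Toy data: two candidates sit close to the normal cluster, two lie far away.
normal = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
candidates = [(1.2, 0.9), (0.4, 1.3), (9.0, 8.0), (-7.0, 10.0)]
kept = secondary_filter(normal, candidates)
```

In this toy run the two candidates near the normal cluster are dropped and the two distant ones are kept, which is the intended effect of a second pass that reduces dependence on a preset abnormal ratio.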