
Anomaly Detection for Individual Sequences with Applications in Identifying Malicious Tools †

Anomaly detection refers to the problem of identifying abnormal behaviour within a set of measurements. In many cases, one has some statistical model for normal data, and wishes to identify whether new data fit the model or not. However, in others, while there are normal data to learn from, there is...


Bibliographic Details
Main authors:	Siboni, Shachar; Cohen, Asaf
Format:	Online Article Text
Language:	English
Published:	MDPI 2020
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7517183/ https://www.ncbi.nlm.nih.gov/pubmed/33286421 http://dx.doi.org/10.3390/e22060649

_version_	1783587171355066368
author	Siboni, Shachar; Cohen, Asaf
author_facet	Siboni, Shachar; Cohen, Asaf
author_sort	Siboni, Shachar
collection	PubMed
description	Anomaly detection refers to the problem of identifying abnormal behaviour within a set of measurements. In many cases, one has some statistical model for normal data, and wishes to identify whether new data fit the model or not. However, in others, while there are normal data to learn from, there is no statistical model for this data, and there is no structured parameter set to estimate. Thus, one is forced to assume an individual sequences setup, where there is no given model or any guarantee that such a model exists. In this work, we propose a universal anomaly detection algorithm for one-dimensional time series that is able to learn the normal behaviour of systems and alert for abnormalities, without assuming anything on the normal data, or anything on the anomalies. The suggested method utilizes new information measures that were derived from the Lempel–Ziv (LZ) compression algorithm in order to optimally and efficiently learn the normal behaviour (during learning), and then estimate the likelihood of new data (during operation) and classify it accordingly. We apply the algorithm to key problems in computer security, as well as a benchmark anomaly detection data set, all using simple, single-feature time-indexed data. The first is detecting Botnets Command and Control (C&C) channels without deep inspection. We then apply it to the problems of malicious tools detection via system calls monitoring and data leakage identification. We conclude with the New York City (NYC) taxi data. Finally, while using information theoretic tools, we show that an attacker’s attempt to maliciously fool the detection system by trying to generate normal data is bound to fail, either due to a high probability of error or because of the need for huge amounts of resources.
format	Online Article Text
id	pubmed-7517183
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-7517183 2020-11-09 Anomaly Detection for Individual Sequences with Applications in Identifying Malicious Tools † Siboni, Shachar; Cohen, Asaf Entropy (Basel) Article MDPI 2020-06-12 /pmc/articles/PMC7517183/ /pubmed/33286421 http://dx.doi.org/10.3390/e22060649 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle	Article Siboni, Shachar Cohen, Asaf Anomaly Detection for Individual Sequences with Applications in Identifying Malicious Tools †
title	Anomaly Detection for Individual Sequences with Applications in Identifying Malicious Tools †
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7517183/ https://www.ncbi.nlm.nih.gov/pubmed/33286421 http://dx.doi.org/10.3390/e22060649
