
A simple method for unsupervised anomaly detection: An application to Web time series data

We propose a simple anomaly detection method that is applicable to unlabeled time series data and is sufficiently tractable, even for non-technical entities, by using the density ratio estimation based on the state space model. Our detection rule is based on the ratio of log-likelihoods estimated by...


Bibliographic Details
Main authors: Yoshihara, Keisuke; Takahashi, Kei
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8752013/
https://www.ncbi.nlm.nih.gov/pubmed/35015791
http://dx.doi.org/10.1371/journal.pone.0262463
collection PubMed
description We propose a simple anomaly detection method that is applicable to unlabeled time series data and is sufficiently tractable, even for non-technical entities, by using density ratio estimation based on a state space model. Our detection rule is based on the ratio of log-likelihoods estimated by the dynamic linear model, i.e., the ratio of the log-likelihood under our model to that under an over-dispersed model, which we call the NULL model. Using the Yahoo S5 data set and the Numenta Anomaly Benchmark data set, both publicly available and commonly used benchmarks, we find that our method performs better than or comparably to existing methods. This result implies that it is essential in time series anomaly detection to incorporate the specific information on time series data into the model. In addition, to show the applicability of our method to unlabeled real-world data, we apply it to unlabeled Web time series data, specifically daily page views and average session duration on an electronic commerce site that sells insurance products. We find that the increase in page views caused by e-mail newsletter deliveries is less likely to contribute to completing an insurance contract. The result also suggests the importance of monitoring more than one time series simultaneously.
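The detection rule described above can be illustrated with a small sketch. This is not the authors' implementation: the record does not give the model specification, so the code assumes a local-level dynamic linear model fitted by a Kalman filter, and it assumes the over-dispersed NULL model is the same model with an inflated observation variance. The function names and the parameters `obs_var`, `state_var`, and `inflation` are hypothetical.

```python
import math


def local_level_loglik(y, obs_var, state_var, m0=0.0, c0=1e6):
    """One-step-ahead predictive log-likelihoods of a local-level DLM.

    Assumed model (not taken from the record):
        y_t  = mu_t + v_t,      v_t ~ N(0, obs_var)
        mu_t = mu_{t-1} + w_t,  w_t ~ N(0, state_var)
    """
    m, c = m0, c0  # filtered mean and variance of the level (diffuse start)
    logliks = []
    for obs in y:
        r = c + state_var   # prior variance of the level at time t
        q = r + obs_var     # predictive variance of the observation
        e = obs - m         # one-step-ahead forecast error
        logliks.append(-0.5 * (math.log(2.0 * math.pi * q) + e * e / q))
        k = r / q           # Kalman gain
        m = m + k * e       # filtered mean
        c = r * (1.0 - k)   # filtered variance
    return logliks


def loglik_ratio_scores(y, obs_var=1.0, state_var=0.1, inflation=100.0):
    """Ratio of model log-likelihood to NULL log-likelihood at each time step.

    The NULL model here is an assumption: the same DLM with its observation
    variance multiplied by `inflation`. With the defaults both log-likelihoods
    are strictly negative, so the division is safe and a larger ratio means
    the observation is less plausible under the model than under the NULL.
    """
    model = local_level_loglik(y, obs_var, state_var)
    null = local_level_loglik(y, obs_var * inflation, state_var)
    return [a / b for a, b in zip(model, null)]


# Toy series: a slow oscillation around 10 with one injected spike at index 30.
series = [10.0 + 0.5 * math.sin(i) for i in range(50)]
series[30] = 30.0
scores = loglik_ratio_scores(series)
most_anomalous = max(range(len(scores)), key=scores.__getitem__)
```

In this toy series the injected spike receives the largest score. Any threshold for flagging a point as anomalous would be a tuning choice that the record does not specify.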
id pubmed-8752013
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-8752013 2022-01-12 PLoS One Research Article Public Library of Science 2022-01-11 Text en
license © 2022 Yoshihara, Takahashi. This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
topic Research Article