
Novel semi-metrics for multivariate change point analysis and anomaly detection

This paper proposes a new method for determining similarity and anomalies between time series, most practically effective in large collections of (likely related) time series, by measuring distances between structural breaks within such a collection. We introduce a class of semi-metric distance meas...


Bibliographic Details
Main authors: James, Nick; Menzies, Max; Azizi, Lamiae; Chan, Jennifer
Format: Online Article Text
Language: English
Published: Elsevier B.V. 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7329734/
https://www.ncbi.nlm.nih.gov/pubmed/32834249
http://dx.doi.org/10.1016/j.physd.2020.132636
author James, Nick
Menzies, Max
Azizi, Lamiae
Chan, Jennifer
collection PubMed
description This paper proposes a new method for determining similarity and anomalies between time series, most practically effective in large collections of (likely related) time series, by measuring distances between structural breaks within such a collection. We introduce a class of semi-metric distance measures, which we term MJ distances. These semi-metrics provide an advantage over existing options such as the Hausdorff and Wasserstein metrics. We prove they have desirable properties, including better sensitivity to outliers, while experiments on simulated data demonstrate that they uncover similarity within collections of time series more effectively. Semi-metrics carry a potential disadvantage: without the triangle inequality, they may not satisfy a “transitivity property of closeness.” We analyse this failure with proof and introduce a computational method to investigate, in which we demonstrate that our semi-metrics violate transitivity infrequently and mildly. Finally, we apply our methods to cryptocurrency and measles data, introducing a judicious application of eigenvalue analysis.
format Online
Article
Text
id pubmed-7329734
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Elsevier B.V.
record_format MEDLINE/PubMed
spelling pubmed-7329734 2020-07-02 Novel semi-metrics for multivariate change point analysis and anomaly detection James, Nick Menzies, Max Azizi, Lamiae Chan, Jennifer Physica D Article Elsevier B.V. 2020-11 2020-07-02 /pmc/articles/PMC7329734/ /pubmed/32834249 http://dx.doi.org/10.1016/j.physd.2020.132636 Text en © 2020 Elsevier B.V. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website.
Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
title Novel semi-metrics for multivariate change point analysis and anomaly detection
topic Article