
Cluster-based stability evaluation in time series data sets


Bibliographic Details
Main authors: Klassen, Gerhard; Tatusch, Martha; Conrad, Stefan
Format: Online Article Text
Language: English
Published: Springer US 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9746592/
https://www.ncbi.nlm.nih.gov/pubmed/36531973
http://dx.doi.org/10.1007/s10489-022-04231-7
author Klassen, Gerhard
Tatusch, Martha
Conrad, Stefan
collection PubMed
description In modern data analysis, time is often considered just another feature. Yet time has a special role that is regularly overlooked. Procedures are usually designed only for time-independent data and are therefore often unsuitable for the temporal aspect of the data. This is especially the case for clustering algorithms. Although there are a few evolutionary approaches for time-dependent data, evaluating them, and therefore selecting among them, is difficult for the user. In this paper, we present a general evaluation measure that examines clusterings with respect to their temporal stability and thus provides information about the achieved quality. For this purpose, we examine the temporal stability of time series with respect to their cluster neighbors, the temporal stability of clusters with respect to their composition, and finally draw conclusions about the temporal stability of the entire clustering. We summarise these components in a parameter-free toolkit that we call Cluster Over-Time Stability Evaluation (CLOSE). In addition, we present a fuzzy variant which we call FCSETS (Fuzzy Clustering Stability Evaluation of Time Series). These toolkits enable a number of advanced applications. One of these is parameter selection for any type of clustering algorithm. We demonstrate parameter selection as an example and evaluate results of classical clustering algorithms against a well-known evolutionary clustering algorithm. We then introduce a method for outlier detection in time series data based on CLOSE. We demonstrate the practicality of our approaches on three real-world data sets and one generated data set.
format Online
Article
Text
id pubmed-9746592
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer US
record_format MEDLINE/PubMed
spelling pubmed-9746592 2022-12-14 Cluster-based stability evaluation in time series data sets Klassen, Gerhard; Tatusch, Martha; Conrad, Stefan Appl Intell (Dordr) Article
Springer US 2022-12-13 /pmc/articles/PMC9746592/ /pubmed/36531973 http://dx.doi.org/10.1007/s10489-022-04231-7 Text en © The Author(s) 2022. Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
title Cluster-based stability evaluation in time series data sets
topic Article