A Fast Weighted Fuzzy C-Medoids Clustering for Time Series Data Based on P-Splines

The rapid growth of digital information has produced massive amounts of time series data with rich features. Most time series data are noisy and contain outlier samples, which degrades clustering performance. To efficiently discover the hidden statistical information in the data...

Bibliographic Details
Main authors: Xu, Jiucheng; Hou, Qinchen; Qu, Kanglin; Sun, Yuanhao; Meng, Xiangru
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9414275/
https://www.ncbi.nlm.nih.gov/pubmed/36015930
http://dx.doi.org/10.3390/s22166163
author Xu, Jiucheng
Hou, Qinchen
Qu, Kanglin
Sun, Yuanhao
Meng, Xiangru
collection PubMed
description The rapid growth of digital information has produced massive amounts of time series data with rich features. Most time series data are noisy and contain outlier samples, which degrades clustering performance. To efficiently discover the hidden statistical information in the data, this study proposes a fast weighted fuzzy C-medoids clustering algorithm based on P-splines (PS-WFCMdd) for time series datasets. Specifically, the P-spline method is used to fit functional data to the original time series, and the resulting smoothed data are used as the input to the clustering algorithm, improving its ability to handle the dataset. Then, we define a new weighting method to further reduce the influence of outlier sample points in the weighted fuzzy C-medoids clustering process and improve the robustness of our algorithm. We propose using the third version of Mueen's algorithm for similarity search (MASS 3) to measure the similarity between time series quickly and accurately, further improving clustering efficiency. Our new algorithm is compared with several other time series clustering algorithms, and its performance is evaluated experimentally on different types of time series examples. The experimental results show that our new method can speed up data processing and that its overall performance on each clustering evaluation index is relatively good.
format Online
Article
Text
id pubmed-9414275
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9414275 2022-08-27 Sensors (Basel) Article MDPI 2022-08-17 /pmc/articles/PMC9414275/ /pubmed/36015930 http://dx.doi.org/10.3390/s22166163 Text en
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title A Fast Weighted Fuzzy C-Medoids Clustering for Time Series Data Based on P-Splines
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9414275/
https://www.ncbi.nlm.nih.gov/pubmed/36015930
http://dx.doi.org/10.3390/s22166163