
The water supply association analysis method in Shenzhen based on kmeans clustering discretization and apriori algorithm

Since water supply association analysis plays an important role in attribution analysis of water supply fluctuation, how to carry out effective association analysis has become a critical problem. However, the current techniques and methods used for association analysis are not very effective because...


Bibliographic Details
Main Authors: Liu, Xin; Sang, Xuefeng; Chang, Jiaxuan; Zheng, Yang; Han, Yuping
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8341608/
https://www.ncbi.nlm.nih.gov/pubmed/34351977
http://dx.doi.org/10.1371/journal.pone.0255684
_version_ 1783733948742893568
author Liu, Xin
Sang, Xuefeng
Chang, Jiaxuan
Zheng, Yang
Han, Yuping
author_sort Liu, Xin
collection PubMed
description Water supply association analysis plays an important role in attributing water supply fluctuations, so carrying out this analysis effectively has become a critical problem. However, the techniques and methods currently used for association analysis are not very effective because they are based on continuous data. Continuous data generally exhibit varying degrees of monotone relationship, which easily affects the analysis results. Multicollinearity between continuous data also distorts these analytical methods and may generate incorrect results. Moreover, such methods do not reveal the association rules and value intervals that link features to water supply. The lack of an effective analysis method therefore hinders water supply association analysis. Association rules and value intervals of features obtained from association analysis help to identify the causes of water supply fluctuation and the interval within which water supply fluctuates, and so provide better support for water supply dispatching. Yet the association rules and value intervals between features and water supply are not fully understood. This study proposes a data mining method that couples kmeans clustering discretization with the apriori algorithm. Kmeans discretizes the data to produce a one-hot encoding that apriori can process, and the discretization also avoids the influence of monotone relationships and multicollinearity on the analysis results. All rules must finally be validated to filter out spurious ones. The results show that the proposed method is an effective association analysis method. It not only obtains valid strong association rules between features and water supply, but also indicates whether the association between a feature and water supply is direct or indirect. The method also yields the value intervals of features, the degree of association between features, and the confidence probability of each rule.
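The pipeline outlined in the description (kmeans discretization, item encoding, apriori rule mining) can be sketched as follows. This is a minimal illustration and not the authors' implementation: it uses only the Python standard library, the function names (`kmeans_1d`, `intervals`, `apriori`, `rules`) are hypothetical, the temperature and supply values are invented toy numbers, the thresholds are arbitrary, and the rule validation step mentioned in the description is not shown.

```python
from itertools import combinations


def kmeans_1d(values, k, iters=50):
    """Cluster scalar values into k groups and return one cluster label per value."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centres across the sorted data.
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centres[c]))
            groups[nearest].append(v)
        # Move each centre to the mean of its group (keep it if the group is empty).
        updated = [sum(g) / len(g) if g else centres[i] for i, g in enumerate(groups)]
        if updated == centres:
            break
        centres = updated
    return [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]


def intervals(values, labels):
    """Return {label: (min, max)}: the value interval covered by each cluster."""
    out = {}
    for v, lab in zip(values, labels):
        lo, hi = out.get(lab, (v, v))
        out[lab] = (min(lo, v), max(hi, v))
    return out


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for every itemset meeting min_support."""
    n = len(transactions)
    current = {frozenset([item]) for t in transactions for item in t}
    frequent, size = {}, 1
    while current:
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        size += 1
        # Join step: merge frequent itemsets into candidates one item larger.
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        # Prune step: every (size - 1)-subset of a candidate must be frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, size - 1))}
    return frequent


def rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    out = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for ante in map(frozenset, combinations(itemset, r)):
                confidence = support / frequent[ante]
                if confidence >= min_confidence:
                    out.append((ante, itemset - ante, support, confidence))
    return out


# Invented toy data (assumption): daily temperature and water supply volume.
temperature = [18, 19, 20, 30, 31, 32, 19, 31]
supply = [100, 102, 101, 150, 152, 149, 103, 151]

temp_labels = kmeans_1d(temperature, k=2)
supply_labels = kmeans_1d(supply, k=2)

# Each record becomes a set of (feature, cluster) items, equivalent to a one-hot row.
transactions = [frozenset({("temperature", t), ("supply", s)})
                for t, s in zip(temp_labels, supply_labels)]

found = rules(apriori(transactions, min_support=0.4), min_confidence=0.8)
for ante, cons, support, confidence in found:
    print(sorted(ante), "->", sorted(cons), support, confidence)
```

Each cluster label can be mapped back to a value interval with `intervals(temperature, temp_labels)`, which is how a rule over discrete items can be read as a rule over ranges of the original continuous feature.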
format Online
Article
Text
id pubmed-8341608
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8341608 2021-08-06 The water supply association analysis method in Shenzhen based on kmeans clustering discretization and apriori algorithm Liu, Xin Sang, Xuefeng Chang, Jiaxuan Zheng, Yang Han, Yuping PLoS One Research Article Public Library of Science 2021-08-05 /pmc/articles/PMC8341608/ /pubmed/34351977 http://dx.doi.org/10.1371/journal.pone.0255684 Text en © 2021 Liu et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Liu, Xin
Sang, Xuefeng
Chang, Jiaxuan
Zheng, Yang
Han, Yuping
The water supply association analysis method in Shenzhen based on kmeans clustering discretization and apriori algorithm
title The water supply association analysis method in Shenzhen based on kmeans clustering discretization and apriori algorithm
title_sort water supply association analysis method in shenzhen based on kmeans clustering discretization and apriori algorithm
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8341608/
https://www.ncbi.nlm.nih.gov/pubmed/34351977
http://dx.doi.org/10.1371/journal.pone.0255684