A Clustering Algorithm for Multi-Modal Heterogeneous Big Data With Abnormal Data

Abnormal and missing data pose problems for traditional clustering of multi-modal heterogeneous big data. To address this, this paper establishes a multi-view heterogeneous big data clustering algorithm based on improved K-means clustering. First, for the b...

Bibliographic Details
Main Authors: Yan, An, Wang, Wei, Ren, Yi, Geng, HongWei
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2021
Subjects: Neuroscience
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8236595/
https://www.ncbi.nlm.nih.gov/pubmed/34194310
http://dx.doi.org/10.3389/fnbot.2021.680613
_version_ 1783714571437998080
author Yan, An
Wang, Wei
Ren, Yi
Geng, HongWei
author_facet Yan, An
Wang, Wei
Ren, Yi
Geng, HongWei
author_sort Yan, An
collection PubMed
description Abnormal and missing data pose problems for traditional clustering of multi-modal heterogeneous big data. To address this, this paper establishes a multi-view heterogeneous big data clustering algorithm based on improved K-means clustering. First, for big data that involve heterogeneous data, we propose an advanced K-means algorithm, built on multi-view data analysis of a multi-view heterogeneous system, to determine the similarity detection metrics. Then, a BP neural network is used to predict the missing attribute values, complete the missing data, and restore the structure of the big data in its heterogeneous state. Finally, we propose a data denoising algorithm to denoise the abnormal data. Based on these methods, we construct a framework, named BPK-means, to resolve the problems of data abnormalities and missing data. The approach is evaluated through a rigorous performance evaluation study. Compared with the original algorithm, both theoretical verification and experimental results show that the accuracy of the proposed method is greatly improved.
format Online
Article
Text
id pubmed-8236595
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8236595 2021-06-29 A Clustering Algorithm for Multi-Modal Heterogeneous Big Data With Abnormal Data Yan, An Wang, Wei Ren, Yi Geng, HongWei Front Neurorobot Neuroscience Abnormal and missing data pose problems for traditional clustering of multi-modal heterogeneous big data. To address this, this paper establishes a multi-view heterogeneous big data clustering algorithm based on improved K-means clustering. First, for big data that involve heterogeneous data, we propose an advanced K-means algorithm, built on multi-view data analysis of a multi-view heterogeneous system, to determine the similarity detection metrics. Then, a BP neural network is used to predict the missing attribute values, complete the missing data, and restore the structure of the big data in its heterogeneous state. Finally, we propose a data denoising algorithm to denoise the abnormal data. Based on these methods, we construct a framework, named BPK-means, to resolve the problems of data abnormalities and missing data. The approach is evaluated through a rigorous performance evaluation study. Compared with the original algorithm, both theoretical verification and experimental results show that the accuracy of the proposed method is greatly improved. Frontiers Media S.A. 2021-06-14 /pmc/articles/PMC8236595/ /pubmed/34194310 http://dx.doi.org/10.3389/fnbot.2021.680613 Text en Copyright © 2021 Yan, Wang, Ren and Geng. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Neuroscience
Yan, An
Wang, Wei
Ren, Yi
Geng, HongWei
A Clustering Algorithm for Multi-Modal Heterogeneous Big Data With Abnormal Data
title A Clustering Algorithm for Multi-Modal Heterogeneous Big Data With Abnormal Data
title_full A Clustering Algorithm for Multi-Modal Heterogeneous Big Data With Abnormal Data
title_fullStr A Clustering Algorithm for Multi-Modal Heterogeneous Big Data With Abnormal Data
title_full_unstemmed A Clustering Algorithm for Multi-Modal Heterogeneous Big Data With Abnormal Data
title_short A Clustering Algorithm for Multi-Modal Heterogeneous Big Data With Abnormal Data
title_sort clustering algorithm for multi-modal heterogeneous big data with abnormal data
topic Neuroscience
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8236595/
https://www.ncbi.nlm.nih.gov/pubmed/34194310
http://dx.doi.org/10.3389/fnbot.2021.680613
work_keys_str_mv AT yanan aclusteringalgorithmformultimodalheterogeneousbigdatawithabnormaldata
AT wangwei aclusteringalgorithmformultimodalheterogeneousbigdatawithabnormaldata
AT renyi aclusteringalgorithmformultimodalheterogeneousbigdatawithabnormaldata
AT genghongwei aclusteringalgorithmformultimodalheterogeneousbigdatawithabnormaldata
AT yanan clusteringalgorithmformultimodalheterogeneousbigdatawithabnormaldata
AT wangwei clusteringalgorithmformultimodalheterogeneousbigdatawithabnormaldata
AT renyi clusteringalgorithmformultimodalheterogeneousbigdatawithabnormaldata
AT genghongwei clusteringalgorithmformultimodalheterogeneousbigdatawithabnormaldata
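
The description field above names three stages: predicting missing attribute values with a BP (backpropagation) neural network, denoising abnormal data, and K-means clustering. The sketch below strings together one simple version of each stage on synthetic data, only to make the flow concrete. It is not the authors' BPK-means implementation. The helper names (`train_bp`, `impute`, `drop_abnormal`, `kmeans`), the toy data, the k-nearest-neighbour abnormality rule, the Euclidean distance, and the farthest-first initialisation are all assumptions made for illustration; this record does not specify the paper's similarity metric or denoising rule.

```python
import math
import random


def train_bp(inputs, targets, hidden=4, epochs=300, lr=0.05, seed=0):
    """Fit a one-hidden-layer tanh network by per-sample backpropagation."""
    rnd = random.Random(seed)
    n_in = len(inputs[0])
    w1 = [[rnd.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(hidden)]
    w2 = [rnd.uniform(-0.5, 0.5) for _ in range(hidden + 1)]  # last weight is the bias

    def forward(x):
        xb = list(x) + [1.0]
        h = [math.tanh(sum(w * v for w, v in zip(ws, xb))) for ws in w1]
        return xb, h, sum(w * v for w, v in zip(w2, h + [1.0]))

    for _ in range(epochs):
        for x, t in zip(inputs, targets):
            xb, h, y = forward(x)
            err = y - t  # derivative of 0.5 * (y - t)^2 with respect to y
            for j in range(hidden):
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)  # error sent back through tanh
                w2[j] -= lr * err * h[j]
                for i, v in enumerate(xb):
                    w1[j][i] -= lr * grad_h * v
            w2[hidden] -= lr * err
    return lambda x: forward(x)[2]


def impute(rows, col):
    """Replace None in column `col` with the network's prediction from the other columns."""
    def others(r):
        return [v for i, v in enumerate(r) if i != col]
    known = [r for r in rows if r[col] is not None]
    predict = train_bp([others(r) for r in known], [r[col] for r in known])
    return [list(r) if r[col] is not None
            else r[:col] + [predict(others(r))] + r[col + 1:] for r in rows]


def drop_abnormal(rows, k=3, factor=5.0):
    """Flag rows to keep: k-th nearest-neighbour distance <= factor * median of those distances."""
    kth = [sorted(math.dist(r, s) for s in rows)[k] for r in rows]  # index 0 is the row itself
    limit = factor * sorted(kth)[len(kth) // 2]
    return [d <= limit for d in kth]


def kmeans(rows, k, iters=20):
    """Lloyd's algorithm with farthest-first initialisation; returns (centers, labels)."""
    centers = [rows[0]]
    while len(centers) < k:
        centers.append(max(rows, key=lambda r: min(math.dist(r, c) for c in centers)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: math.dist(r, centers[j])) for r in rows]
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


# Synthetic stand-in data: two tight groups, a few missing values, two abnormal records.
rnd = random.Random(0)
data = [[rnd.gauss(c, 0.03) for _ in range(3)] for c in [0.2] * 30 + [0.8] * 30]
for i in (5, 12, 40, 51):
    data[i][2] = None  # simulate a missing attribute
data += [[1.8, -0.8, 1.6], [-0.9, 1.7, -0.7]]  # abnormal records

filled = impute(data, col=2)                   # stage 1: complete the missing data
keep = drop_abnormal(filled)                   # stage 2: remove abnormal records
clean = [r for r, ok in zip(filled, keep) if ok]
centers, labels = kmeans(clean, k=2)           # stage 3: cluster what remains
print(len(clean), "records kept;", "centers:", centers)
```

In this sketch, imputation runs before denoising, following the order in which the description lists the steps. One consequence is that the abnormal records also take part in training the network, so a more careful pipeline might screen them first.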