Classification of the drifting data streams using heterogeneous diversified dynamic class-weighted ensemble

Data streams can be defined as continuous streams of data coming from different sources and in different forms. Streams are often very dynamic, and their underlying structure usually changes over time, which may result in a phenomenon called concept drift. When solving predictive problems using the...

Bibliographic Details
Main Authors: Sarnovsky, Martin; Kolarik, Michal
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2021
Subjects: Algorithms and Analysis of Algorithms
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8022634/
https://www.ncbi.nlm.nih.gov/pubmed/33834113
http://dx.doi.org/10.7717/peerj-cs.459
collection PubMed
description Data streams can be defined as continuous streams of data coming from different sources and in different forms. Streams are often very dynamic, and their underlying structure usually changes over time, which may result in a phenomenon called concept drift. When solving predictive problems using the streaming data, traditional machine learning models trained on historical data may become invalid when such changes occur. Adaptive models equipped with mechanisms to reflect the changes in the data have proved suitable for handling drifting streams. Adaptive ensemble models represent a popular group of these methods used in the classification of drifting data streams. In this paper, we present a heterogeneous adaptive ensemble model for data stream classification, which utilizes a dynamic class weighting scheme and a mechanism to maintain the diversity of the ensemble members. Our main objective was to design a model consisting of a heterogeneous group of base learners (Naive Bayes, k-NN, Decision trees) with an adaptive mechanism which, besides the performance of the members, also takes into account the diversity of the ensemble. The model was experimentally evaluated on both real-world and synthetic datasets. We compared the presented model with other existing adaptive ensemble methods, both from the perspective of predictive performance and computational resource requirements.
id pubmed-8022634
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-8022634 2021-04-07 PeerJ Comput Sci PeerJ Inc. 2021-04-01 /pmc/articles/PMC8022634/ /pubmed/33834113 http://dx.doi.org/10.7717/peerj-cs.459 Text en © 2021 Sarnovsky and Kolarik https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
topic Algorithms and Analysis of Algorithms