Cargando…

PEARL: Probabilistic Exact Adaptive Random Forest with Lossy Counting for Data Streams

In order to adapt random forests to the dynamic nature of data streams, the state-of-the-art technique discards trained trees and grows new trees when concept drifts are detected. This is particularly wasteful when recurrent patterns exist. In this work, we introduce a novel framework called PEARL,...

Descripción completa

Detalles Bibliográficos
Autores principales: Wu, Ocean, Koh, Yun Sing, Dobbie, Gillian, Lacombe, Thomas
Formato: Online Artículo Texto
Lenguaje:English
Publicado: 2020
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206241/
http://dx.doi.org/10.1007/978-3-030-47436-2_2
_version_ 1783530375712079872
author Wu, Ocean
Koh, Yun Sing
Dobbie, Gillian
Lacombe, Thomas
author_facet Wu, Ocean
Koh, Yun Sing
Dobbie, Gillian
Lacombe, Thomas
author_sort Wu, Ocean
collection PubMed
description In order to adapt random forests to the dynamic nature of data streams, the state-of-the-art technique discards trained trees and grows new trees when concept drifts are detected. This is particularly wasteful when recurrent patterns exist. In this work, we introduce a novel framework called PEARL, which uses both an exact technique and a probabilistic graphical model with Lossy Counting, to replace drifted trees with relevant trees built from the past. The exact technique utilizes pattern matching to find the set of drifted trees, that co-occurred in predictions in the past. Meanwhile, a probabilistic graphical model is being built to capture the tree replacements among recurrent concept drifts. Once the graphical model becomes stable, it replaces the exact technique and finds relevant trees in a probabilistic fashion. Further, Lossy Counting is applied to the graphical model which brings an added theoretical guarantee for both error rate and space complexity. We empirically show our technique outperforms baselines in terms of cumulative accuracy on both synthetic and real-world datasets.
format Online
Article
Text
id pubmed-7206241
institution National Center for Biotechnology Information
language English
publishDate 2020
record_format MEDLINE/PubMed
spelling pubmed-72062412020-05-08 PEARL: Probabilistic Exact Adaptive Random Forest with Lossy Counting for Data Streams Wu, Ocean Koh, Yun Sing Dobbie, Gillian Lacombe, Thomas Advances in Knowledge Discovery and Data Mining Article In order to adapt random forests to the dynamic nature of data streams, the state-of-the-art technique discards trained trees and grows new trees when concept drifts are detected. This is particularly wasteful when recurrent patterns exist. In this work, we introduce a novel framework called PEARL, which uses both an exact technique and a probabilistic graphical model with Lossy Counting, to replace drifted trees with relevant trees built from the past. The exact technique utilizes pattern matching to find the set of drifted trees, that co-occurred in predictions in the past. Meanwhile, a probabilistic graphical model is being built to capture the tree replacements among recurrent concept drifts. Once the graphical model becomes stable, it replaces the exact technique and finds relevant trees in a probabilistic fashion. Further, Lossy Counting is applied to the graphical model which brings an added theoretical guarantee for both error rate and space complexity. We empirically show our technique outperforms baselines in terms of cumulative accuracy on both synthetic and real-world datasets. 2020-04-17 /pmc/articles/PMC7206241/ http://dx.doi.org/10.1007/978-3-030-47436-2_2 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
spellingShingle Article
Wu, Ocean
Koh, Yun Sing
Dobbie, Gillian
Lacombe, Thomas
PEARL: Probabilistic Exact Adaptive Random Forest with Lossy Counting for Data Streams
title PEARL: Probabilistic Exact Adaptive Random Forest with Lossy Counting for Data Streams
title_full PEARL: Probabilistic Exact Adaptive Random Forest with Lossy Counting for Data Streams
title_fullStr PEARL: Probabilistic Exact Adaptive Random Forest with Lossy Counting for Data Streams
title_full_unstemmed PEARL: Probabilistic Exact Adaptive Random Forest with Lossy Counting for Data Streams
title_short PEARL: Probabilistic Exact Adaptive Random Forest with Lossy Counting for Data Streams
title_sort pearl: probabilistic exact adaptive random forest with lossy counting for data streams
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206241/
http://dx.doi.org/10.1007/978-3-030-47436-2_2
work_keys_str_mv AT wuocean pearlprobabilisticexactadaptiverandomforestwithlossycountingfordatastreams
AT kohyunsing pearlprobabilisticexactadaptiverandomforestwithlossycountingfordatastreams
AT dobbiegillian pearlprobabilisticexactadaptiverandomforestwithlossycountingfordatastreams
AT lacombethomas pearlprobabilisticexactadaptiverandomforestwithlossycountingfordatastreams