Cargando…
PEARL: Probabilistic Exact Adaptive Random Forest with Lossy Counting for Data Streams
In order to adapt random forests to the dynamic nature of data streams, the state-of-the-art technique discards trained trees and grows new trees when concept drifts are detected. This is particularly wasteful when recurrent patterns exist. In this work, we introduce a novel framework called PEARL,...
Autores principales: | , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
2020
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206241/ http://dx.doi.org/10.1007/978-3-030-47436-2_2 |
_version_ | 1783530375712079872 |
---|---|
author | Wu, Ocean Koh, Yun Sing Dobbie, Gillian Lacombe, Thomas |
author_facet | Wu, Ocean Koh, Yun Sing Dobbie, Gillian Lacombe, Thomas |
author_sort | Wu, Ocean |
collection | PubMed |
description | In order to adapt random forests to the dynamic nature of data streams, the state-of-the-art technique discards trained trees and grows new trees when concept drifts are detected. This is particularly wasteful when recurrent patterns exist. In this work, we introduce a novel framework called PEARL, which uses both an exact technique and a probabilistic graphical model with Lossy Counting, to replace drifted trees with relevant trees built from the past. The exact technique utilizes pattern matching to find the set of drifted trees, that co-occurred in predictions in the past. Meanwhile, a probabilistic graphical model is being built to capture the tree replacements among recurrent concept drifts. Once the graphical model becomes stable, it replaces the exact technique and finds relevant trees in a probabilistic fashion. Further, Lossy Counting is applied to the graphical model which brings an added theoretical guarantee for both error rate and space complexity. We empirically show our technique outperforms baselines in terms of cumulative accuracy on both synthetic and real-world datasets. |
format | Online Article Text |
id | pubmed-7206241 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
record_format | MEDLINE/PubMed |
spelling | pubmed-72062412020-05-08 PEARL: Probabilistic Exact Adaptive Random Forest with Lossy Counting for Data Streams Wu, Ocean Koh, Yun Sing Dobbie, Gillian Lacombe, Thomas Advances in Knowledge Discovery and Data Mining Article In order to adapt random forests to the dynamic nature of data streams, the state-of-the-art technique discards trained trees and grows new trees when concept drifts are detected. This is particularly wasteful when recurrent patterns exist. In this work, we introduce a novel framework called PEARL, which uses both an exact technique and a probabilistic graphical model with Lossy Counting, to replace drifted trees with relevant trees built from the past. The exact technique utilizes pattern matching to find the set of drifted trees, that co-occurred in predictions in the past. Meanwhile, a probabilistic graphical model is being built to capture the tree replacements among recurrent concept drifts. Once the graphical model becomes stable, it replaces the exact technique and finds relevant trees in a probabilistic fashion. Further, Lossy Counting is applied to the graphical model which brings an added theoretical guarantee for both error rate and space complexity. We empirically show our technique outperforms baselines in terms of cumulative accuracy on both synthetic and real-world datasets. 2020-04-17 /pmc/articles/PMC7206241/ http://dx.doi.org/10.1007/978-3-030-47436-2_2 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Article Wu, Ocean Koh, Yun Sing Dobbie, Gillian Lacombe, Thomas PEARL: Probabilistic Exact Adaptive Random Forest with Lossy Counting for Data Streams |
title | PEARL: Probabilistic Exact Adaptive Random Forest with Lossy Counting for Data Streams |
title_full | PEARL: Probabilistic Exact Adaptive Random Forest with Lossy Counting for Data Streams |
title_fullStr | PEARL: Probabilistic Exact Adaptive Random Forest with Lossy Counting for Data Streams |
title_full_unstemmed | PEARL: Probabilistic Exact Adaptive Random Forest with Lossy Counting for Data Streams |
title_short | PEARL: Probabilistic Exact Adaptive Random Forest with Lossy Counting for Data Streams |
title_sort | pearl: probabilistic exact adaptive random forest with lossy counting for data streams |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206241/ http://dx.doi.org/10.1007/978-3-030-47436-2_2 |
work_keys_str_mv | AT wuocean pearlprobabilisticexactadaptiverandomforestwithlossycountingfordatastreams AT kohyunsing pearlprobabilisticexactadaptiverandomforestwithlossycountingfordatastreams AT dobbiegillian pearlprobabilisticexactadaptiverandomforestwithlossycountingfordatastreams AT lacombethomas pearlprobabilisticexactadaptiverandomforestwithlossycountingfordatastreams |