Cargando…
Sequential Learning of Principal Curves: Summarizing Data Streams on the Fly
When confronted with massive data streams, summarizing data with dimension reduction methods such as PCA raises theoretical and algorithmic pitfalls. A principal curve acts as a nonlinear generalization of PCA, and the present paper proposes a novel algorithm to automatically and sequentially learn...
Autores principales: | , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
MDPI
2021
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8622390/ https://www.ncbi.nlm.nih.gov/pubmed/34828234 http://dx.doi.org/10.3390/e23111534 |
_version_ | 1784605682359599104 |
---|---|
author | Li, Le Guedj, Benjamin |
author_facet | Li, Le Guedj, Benjamin |
author_sort | Li, Le |
collection | PubMed |
description | When confronted with massive data streams, summarizing data with dimension reduction methods such as PCA raises theoretical and algorithmic pitfalls. A principal curve acts as a nonlinear generalization of PCA, and the present paper proposes a novel algorithm to automatically and sequentially learn principal curves from data streams. We show that our procedure is supported by regret bounds with optimal sublinear remainder terms. A greedy local search implementation (called slpc, for sequential learning principal curves) that incorporates both sleeping experts and multi-armed bandit ingredients is presented, along with its regret computation and performance on synthetic and real-life data. |
format | Online Article Text |
id | pubmed-8622390 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-86223902021-11-27 Sequential Learning of Principal Curves: Summarizing Data Streams on the Fly Li, Le Guedj, Benjamin Entropy (Basel) Article When confronted with massive data streams, summarizing data with dimension reduction methods such as PCA raises theoretical and algorithmic pitfalls. A principal curve acts as a nonlinear generalization of PCA, and the present paper proposes a novel algorithm to automatically and sequentially learn principal curves from data streams. We show that our procedure is supported by regret bounds with optimal sublinear remainder terms. A greedy local search implementation (called slpc, for sequential learning principal curves) that incorporates both sleeping experts and multi-armed bandit ingredients is presented, along with its regret computation and performance on synthetic and real-life data. MDPI 2021-11-18 /pmc/articles/PMC8622390/ /pubmed/34828234 http://dx.doi.org/10.3390/e23111534 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Li, Le Guedj, Benjamin Sequential Learning of Principal Curves: Summarizing Data Streams on the Fly |
title | Sequential Learning of Principal Curves: Summarizing Data Streams on the Fly |
title_full | Sequential Learning of Principal Curves: Summarizing Data Streams on the Fly |
title_fullStr | Sequential Learning of Principal Curves: Summarizing Data Streams on the Fly |
title_full_unstemmed | Sequential Learning of Principal Curves: Summarizing Data Streams on the Fly |
title_short | Sequential Learning of Principal Curves: Summarizing Data Streams on the Fly |
title_sort | sequential learning of principal curves: summarizing data streams on the fly |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8622390/ https://www.ncbi.nlm.nih.gov/pubmed/34828234 http://dx.doi.org/10.3390/e23111534 |
work_keys_str_mv | AT lile sequentiallearningofprincipalcurvessummarizingdatastreamsonthefly AT guedjbenjamin sequentiallearningofprincipalcurvessummarizingdatastreamsonthefly |