Cargando…
On randomized sketching algorithms and the Tracy–Widom law
There is an increasing body of work exploring the integration of random projection into algorithms for numerical linear algebra. The primary motivation is to reduce the overall computational cost of processing large datasets. A suitably chosen random projection can be used to embed the original data...
Autores principales: | , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Springer US
2023
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9852177/ https://www.ncbi.nlm.nih.gov/pubmed/36691583 http://dx.doi.org/10.1007/s11222-022-10148-5 |
_version_ | 1784872556321308672 |
---|---|
author | Ahfock, Daniel Astle, William J. Richardson, Sylvia |
author_facet | Ahfock, Daniel Astle, William J. Richardson, Sylvia |
author_sort | Ahfock, Daniel |
collection | PubMed |
description | There is an increasing body of work exploring the integration of random projection into algorithms for numerical linear algebra. The primary motivation is to reduce the overall computational cost of processing large datasets. A suitably chosen random projection can be used to embed the original dataset in a lower-dimensional space such that key properties of the original dataset are retained. These algorithms are often referred to as sketching algorithms, as the projected dataset can be used as a compressed representation of the full dataset. We show that random matrix theory, in particular the Tracy–Widom law, is useful for describing the operating characteristics of sketching algorithms in the tall-data regime when the sample size n is much greater than the number of variables d. Asymptotic large sample results are of particular interest as this is the regime where sketching is most useful for data compression. In particular, we develop asymptotic approximations for the success rate in generating random subspace embeddings and the convergence probability of iterative sketching algorithms. We test a number of sketching algorithms on real large high-dimensional datasets and find that the asymptotic expressions give accurate predictions of the empirical performance. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s11222-022-10148-5. |
format | Online Article Text |
id | pubmed-9852177 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Springer US |
record_format | MEDLINE/PubMed |
spelling | pubmed-98521772023-01-21 On randomized sketching algorithms and the Tracy–Widom law Ahfock, Daniel Astle, William J. Richardson, Sylvia Stat Comput Article There is an increasing body of work exploring the integration of random projection into algorithms for numerical linear algebra. The primary motivation is to reduce the overall computational cost of processing large datasets. A suitably chosen random projection can be used to embed the original dataset in a lower-dimensional space such that key properties of the original dataset are retained. These algorithms are often referred to as sketching algorithms, as the projected dataset can be used as a compressed representation of the full dataset. We show that random matrix theory, in particular the Tracy–Widom law, is useful for describing the operating characteristics of sketching algorithms in the tall-data regime when the sample size n is much greater than the number of variables d. Asymptotic large sample results are of particular interest as this is the regime where sketching is most useful for data compression. In particular, we develop asymptotic approximations for the success rate in generating random subspace embeddings and the convergence probability of iterative sketching algorithms. We test a number of sketching algorithms on real large high-dimensional datasets and find that the asymptotic expressions give accurate predictions of the empirical performance. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s11222-022-10148-5. Springer US 2023-01-19 2023 /pmc/articles/PMC9852177/ /pubmed/36691583 http://dx.doi.org/10.1007/s11222-022-10148-5 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) . |
spellingShingle | Article Ahfock, Daniel Astle, William J. Richardson, Sylvia On randomized sketching algorithms and the Tracy–Widom law |
title | On randomized sketching algorithms and the Tracy–Widom law |
title_full | On randomized sketching algorithms and the Tracy–Widom law |
title_fullStr | On randomized sketching algorithms and the Tracy–Widom law |
title_full_unstemmed | On randomized sketching algorithms and the Tracy–Widom law |
title_short | On randomized sketching algorithms and the Tracy–Widom law |
title_sort | on randomized sketching algorithms and the tracy–widom law |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9852177/ https://www.ncbi.nlm.nih.gov/pubmed/36691583 http://dx.doi.org/10.1007/s11222-022-10148-5 |
work_keys_str_mv | AT ahfockdaniel onrandomizedsketchingalgorithmsandthetracywidomlaw AT astlewilliamj onrandomizedsketchingalgorithmsandthetracywidomlaw AT richardsonsylvia onrandomizedsketchingalgorithmsandthetracywidomlaw |