Rate Distortion Theory for Descriptive Statistics

Rate distortion theory was developed for optimizing lossy compression of data, but it also has applications in statistics. In this paper, we illustrate how rate distortion theory can be used to analyze various datasets. The analysis involves testing, identification of outliers, choice of compression...

Bibliographic Details
Main Author: Harremoës, Peter
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10047654/
https://www.ncbi.nlm.nih.gov/pubmed/36981344
http://dx.doi.org/10.3390/e25030456
author Harremoës, Peter
collection PubMed
description Rate distortion theory was developed for optimizing lossy compression of data, but it also has applications in statistics. In this paper, we illustrate how rate distortion theory can be used to analyze various datasets. The analysis involves testing, identification of outliers, choice of compression rate, calculation of optimal reconstruction points, and assigning “descriptive confidence regions” to the reconstruction points. We study four models or datasets of increasing complexity: clustering, Gaussian models, linear regression, and a dataset describing orientations of early Islamic mosques. These examples illustrate how rate distortion analysis may serve as a common framework for handling different statistical problems.
format Online
Article
Text
id pubmed-10047654
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10047654 2023-03-29 Rate Distortion Theory for Descriptive Statistics Harremoës, Peter Entropy (Basel) Article Rate distortion theory was developed for optimizing lossy compression of data, but it also has applications in statistics. In this paper, we illustrate how rate distortion theory can be used to analyze various datasets. The analysis involves testing, identification of outliers, choice of compression rate, calculation of optimal reconstruction points, and assigning “descriptive confidence regions” to the reconstruction points. We study four models or datasets of increasing complexity: clustering, Gaussian models, linear regression, and a dataset describing orientations of early Islamic mosques. These examples illustrate how rate distortion analysis may serve as a common framework for handling different statistical problems. MDPI 2023-03-05 /pmc/articles/PMC10047654/ /pubmed/36981344 http://dx.doi.org/10.3390/e25030456 Text en © 2023 by the author. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Rate Distortion Theory for Descriptive Statistics
topic Article