Entropy Estimation Using a Linguistic Zipf–Mandelbrot–Li Model for Natural Sequences
Entropy estimation faces numerous challenges when applied to various real-world problems. Our interest is in divergence and entropy estimation algorithms capable of rapid estimation for natural sequence data such as human and synthetic languages. This typically requires a large amount of data; however, we propose a new approach based on a rank-based analytic Zipf–Mandelbrot–Li probabilistic model. Unlike previous approaches, which do not consider the nature of the probability distribution in relation to language, here we introduce a novel analytic Zipfian model which includes linguistic constraints. This provides more accurate distributions for natural sequences such as natural or synthetic emergent languages. Results are given which indicate the performance of the proposed ZML model. We derive an entropy estimation method which incorporates the linguistic constraint-based Zipf–Mandelbrot–Li model into a new non-equiprobable coincidence counting algorithm, which is shown to be effective for tasks such as entropy rate estimation with limited data.
Main Authors: | Back, Andrew D., Wiles, Janet |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI, 2021 |
Subjects: | Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8468050/ https://www.ncbi.nlm.nih.gov/pubmed/34573725 http://dx.doi.org/10.3390/e23091100 |
_version_ | 1784573561139101696 |
---|---|
author | Back, Andrew D. Wiles, Janet |
author_facet | Back, Andrew D. Wiles, Janet |
author_sort | Back, Andrew D. |
collection | PubMed |
description | Entropy estimation faces numerous challenges when applied to various real-world problems. Our interest is in divergence and entropy estimation algorithms capable of rapid estimation for natural sequence data such as human and synthetic languages. This typically requires a large amount of data; however, we propose a new approach based on a rank-based analytic Zipf–Mandelbrot–Li probabilistic model. Unlike previous approaches, which do not consider the nature of the probability distribution in relation to language, here we introduce a novel analytic Zipfian model which includes linguistic constraints. This provides more accurate distributions for natural sequences such as natural or synthetic emergent languages. Results are given which indicate the performance of the proposed ZML model. We derive an entropy estimation method which incorporates the linguistic constraint-based Zipf–Mandelbrot–Li model into a new non-equiprobable coincidence counting algorithm, which is shown to be effective for tasks such as entropy rate estimation with limited data. |
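As a rough illustration of the rank-based Zipf–Mandelbrot family the abstract builds on — probabilities decaying as p(k) ∝ 1/(k+q)^s over ranks k — the sketch below computes such a distribution and its Shannon entropy in bits. This is a minimal sketch, not the paper's linguistically constrained ZML parameterization or its coincidence-counting estimator; the parameter values s and q are illustrative assumptions.

```python
import math

def zipf_mandelbrot(n, s=1.0, q=2.7):
    """Normalized Zipf-Mandelbrot probabilities p(k) ~ 1/(k + q)^s for ranks k = 1..n.

    s and q are illustrative defaults, not values from the paper.
    """
    weights = [1.0 / (k + q) ** s for k in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def entropy_bits(p):
    """Shannon entropy H(p) = -sum p_k log2 p_k, in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)

# A 1000-rank distribution: probabilities decrease with rank, and the
# entropy lies strictly between 0 and log2(1000) (the uniform maximum).
p = zipf_mandelbrot(1000)
H = entropy_bits(p)
```

Estimating such an entropy directly from frequency counts is data-hungry, which is the motivation the abstract gives for a model-based (analytic ZML) estimator.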
format | Online Article Text |
id | pubmed-8468050 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-84680502021-09-27 Entropy Estimation Using a Linguistic Zipf–Mandelbrot–Li Model for Natural Sequences Back, Andrew D. Wiles, Janet Entropy (Basel) Article Entropy estimation faces numerous challenges when applied to various real-world problems. Our interest is in divergence and entropy estimation algorithms capable of rapid estimation for natural sequence data such as human and synthetic languages. This typically requires a large amount of data; however, we propose a new approach based on a rank-based analytic Zipf–Mandelbrot–Li probabilistic model. Unlike previous approaches, which do not consider the nature of the probability distribution in relation to language, here we introduce a novel analytic Zipfian model which includes linguistic constraints. This provides more accurate distributions for natural sequences such as natural or synthetic emergent languages. Results are given which indicate the performance of the proposed ZML model. We derive an entropy estimation method which incorporates the linguistic constraint-based Zipf–Mandelbrot–Li model into a new non-equiprobable coincidence counting algorithm, which is shown to be effective for tasks such as entropy rate estimation with limited data. MDPI 2021-08-24 /pmc/articles/PMC8468050/ /pubmed/34573725 http://dx.doi.org/10.3390/e23091100 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Back, Andrew D. Wiles, Janet Entropy Estimation Using a Linguistic Zipf–Mandelbrot–Li Model for Natural Sequences |
title | Entropy Estimation Using a Linguistic Zipf–Mandelbrot–Li Model for Natural Sequences |
title_full | Entropy Estimation Using a Linguistic Zipf–Mandelbrot–Li Model for Natural Sequences |
title_fullStr | Entropy Estimation Using a Linguistic Zipf–Mandelbrot–Li Model for Natural Sequences |
title_full_unstemmed | Entropy Estimation Using a Linguistic Zipf–Mandelbrot–Li Model for Natural Sequences |
title_short | Entropy Estimation Using a Linguistic Zipf–Mandelbrot–Li Model for Natural Sequences |
title_sort | entropy estimation using a linguistic zipf–mandelbrot–li model for natural sequences |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8468050/ https://www.ncbi.nlm.nih.gov/pubmed/34573725 http://dx.doi.org/10.3390/e23091100 |
work_keys_str_mv | AT backandrewd entropyestimationusingalinguisticzipfmandelbrotlimodelfornaturalsequences AT wilesjanet entropyestimationusingalinguisticzipfmandelbrotlimodelfornaturalsequences |