The Lorenz Curve: A Proper Framework to Define Satisfactory Measures of Symbol Dominance, Symbol Diversity, and Information Entropy
Novel measures of symbol dominance ($d_{C1}$ and $d_{C2}$), symbol diversity ($D_{C1} = N(1 - d_{C1})$ and $D_{C2} = N(1 - d_{C2})$), and information entropy ($H_{C1} = \log_2 D_{C1}$ and $H_{C2} = \log_2 D_{C2}$) are derived from Lorenz-consistent statistics that I had previously proposed to quanti...
Main author: | Camargo, Julio A. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7517034/ https://www.ncbi.nlm.nih.gov/pubmed/33286315 http://dx.doi.org/10.3390/e22050542 |
_version_ | 1783587136392396800 |
---|---|
author | Camargo, Julio A. |
collection | PubMed |
description | Novel measures of symbol dominance ($d_{C1}$ and $d_{C2}$), symbol diversity ($D_{C1} = N(1 - d_{C1})$ and $D_{C2} = N(1 - d_{C2})$), and information entropy ($H_{C1} = \log_2 D_{C1}$ and $H_{C2} = \log_2 D_{C2}$) are derived from Lorenz-consistent statistics that I had previously proposed to quantify dominance and diversity in ecology. Here, $d_{C1}$ refers to the average absolute difference between the relative abundances of dominant and subordinate symbols, with its value being equivalent to the maximum vertical distance from the Lorenz curve to the 45-degree line of equiprobability; $d_{C2}$ refers to the average absolute difference between all pairs of relative symbol abundances, with its value being equivalent to twice the area between the Lorenz curve and the 45-degree line of equiprobability; $N$ is the number of different symbols or maximum expected diversity. These Lorenz-consistent statistics are compared with statistics based on Shannon’s entropy and Rényi’s second-order entropy to show that the former have better mathematical behavior than the latter. The use of $d_{C1}$, $D_{C1}$, and $H_{C1}$ is particularly recommended, as only changes in the allocation of relative abundance between dominant ($p_d > 1/N$) and subordinate ($p_s < 1/N$) symbols are of real relevance for probability distributions to achieve the reference distribution ($p_i = 1/N$) or to deviate from it. |
format | Online Article Text |
id | pubmed-7517034 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-75170342020-11-09 The Lorenz Curve: A Proper Framework to Define Satisfactory Measures of Symbol Dominance, Symbol Diversity, and Information Entropy Camargo, Julio A. Entropy (Basel) Article Novel measures of symbol dominance ($d_{C1}$ and $d_{C2}$), symbol diversity ($D_{C1} = N(1 - d_{C1})$ and $D_{C2} = N(1 - d_{C2})$), and information entropy ($H_{C1} = \log_2 D_{C1}$ and $H_{C2} = \log_2 D_{C2}$) are derived from Lorenz-consistent statistics that I had previously proposed to quantify dominance and diversity in ecology. Here, $d_{C1}$ refers to the average absolute difference between the relative abundances of dominant and subordinate symbols, with its value being equivalent to the maximum vertical distance from the Lorenz curve to the 45-degree line of equiprobability; $d_{C2}$ refers to the average absolute difference between all pairs of relative symbol abundances, with its value being equivalent to twice the area between the Lorenz curve and the 45-degree line of equiprobability; $N$ is the number of different symbols or maximum expected diversity. These Lorenz-consistent statistics are compared with statistics based on Shannon’s entropy and Rényi’s second-order entropy to show that the former have better mathematical behavior than the latter. The use of $d_{C1}$, $D_{C1}$, and $H_{C1}$ is particularly recommended, as only changes in the allocation of relative abundance between dominant ($p_d > 1/N$) and subordinate ($p_s < 1/N$) symbols are of real relevance for probability distributions to achieve the reference distribution ($p_i = 1/N$) or to deviate from it. MDPI 2020-05-13 /pmc/articles/PMC7517034/ /pubmed/33286315 http://dx.doi.org/10.3390/e22050542 Text en © 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). |
title | The Lorenz Curve: A Proper Framework to Define Satisfactory Measures of Symbol Dominance, Symbol Diversity, and Information Entropy |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7517034/ https://www.ncbi.nlm.nih.gov/pubmed/33286315 http://dx.doi.org/10.3390/e22050542 |
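
The statistics described in the abstract can be illustrated with a minimal Python sketch. The record gives only the geometric characterization of the two dominance measures, so the closed forms used here are assumptions based on standard Lorenz-curve identities: the maximum vertical distance to the 45-degree line is taken as the sum of $p_d - 1/N$ over dominant symbols, and twice the area between the curve and that line is taken as the sum of $|p_i - p_j|$ over all pairs $i < j$, divided by $N$. The function name `lorenz_measures` and the dictionary keys are hypothetical, chosen for illustration only.

```python
import math


def lorenz_measures(p):
    """Dominance, diversity and entropy statistics for a probability vector p.

    Assumed closed forms (not quoted from the record):
      d_C1 = sum of (p_d - 1/N) over symbols with p_d > 1/N
      d_C2 = sum of |p_i - p_j| over all pairs i < j, divided by N
    The relations D = N * (1 - d) and H = log2(D) are taken from the abstract.
    """
    n = len(p)
    if n == 0 or min(p) < 0 or abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("p must be a non-empty probability vector summing to 1")

    ref = 1.0 / n  # reference (equiprobable) abundance p_i = 1/N

    # Maximum vertical distance from the Lorenz curve to the 45-degree line.
    d1 = sum(pi - ref for pi in p if pi > ref)

    # Twice the area between the Lorenz curve and the 45-degree line.
    d2 = sum(abs(p[i] - p[j]) for i in range(n) for j in range(i + 1, n)) / n

    div1 = n * (1 - d1)  # D_C1
    div2 = n * (1 - d2)  # D_C2

    return {
        "d_C1": d1,
        "d_C2": d2,
        "D_C1": div1,
        "D_C2": div2,
        "H_C1": math.log2(div1),
        "H_C2": math.log2(div2),
    }
```

Under these assumptions, the distribution (0.5, 0.3, 0.2) gives $d_{C1} = 1/6$ and $d_{C2} = 0.2$, hence $D_{C1} = 2.5$ and $D_{C2} = 2.4$. A uniform distribution gives zero dominance, diversity $N$, and entropy $\log_2 N$.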