The Lorenz Curve: A Proper Framework to Define Satisfactory Measures of Symbol Dominance, Symbol Diversity, and Information Entropy

Novel measures of symbol dominance (d_C1 and d_C2), symbol diversity (D_C1 = N (1 − d_C1) and D_C2 = N (1 − d_C2)), and information entropy (H_C1 = log_2 D_C1 and H_C2 = log_2 D_C2) are derived from Lorenz-consistent statistics that I had previously proposed to quanti...

Bibliographic Details
Main author: Camargo, Julio A.
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7517034/
https://www.ncbi.nlm.nih.gov/pubmed/33286315
http://dx.doi.org/10.3390/e22050542
_version_ 1783587136392396800
author Camargo, Julio A.
collection PubMed
description Novel measures of symbol dominance (d_C1 and d_C2), symbol diversity (D_C1 = N (1 − d_C1) and D_C2 = N (1 − d_C2)), and information entropy (H_C1 = log_2 D_C1 and H_C2 = log_2 D_C2) are derived from Lorenz-consistent statistics that I had previously proposed to quantify dominance and diversity in ecology. Here, d_C1 refers to the average absolute difference between the relative abundances of dominant and subordinate symbols, with its value being equivalent to the maximum vertical distance from the Lorenz curve to the 45-degree line of equiprobability; d_C2 refers to the average absolute difference between all pairs of relative symbol abundances, with its value being equivalent to twice the area between the Lorenz curve and the 45-degree line of equiprobability; N is the number of different symbols or maximum expected diversity. These Lorenz-consistent statistics are compared with statistics based on Shannon’s entropy and Rényi’s second-order entropy to show that the former have better mathematical behavior than the latter. The use of d_C1, D_C1, and H_C1 is particularly recommended, as only changes in the allocation of relative abundance between dominant (p_d > 1/N) and subordinate (p_s < 1/N) symbols are of real relevance for probability distributions to achieve the reference distribution (p_i = 1/N) or to deviate from it.
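The relationships in the abstract can be illustrated with a minimal Python sketch. This record does not reproduce the paper's formulas, so the two dominance values below are computed from the geometric equivalences the abstract states (maximum vertical distance from the Lorenz curve to the 45-degree line, and twice the area between them). That choice is an assumption, and the function names `lorenz_dominance`, `diversity`, and `entropy_bits` are hypothetical.

```python
import math


def lorenz_dominance(p):
    """Return (d1, d2) for a list of relative abundances p that sums to 1.

    Assumption: d1 is taken as the maximum vertical distance between the
    Lorenz curve and the 45-degree line, which equals the total excess of
    the dominant symbols (p_i > 1/N) over 1/N. d2 is taken as twice the
    area between the Lorenz curve and the 45-degree line, which equals
    the sum of |p_i - p_j| over all ordered pairs divided by 2N.
    """
    n = len(p)
    d1 = sum(x - 1.0 / n for x in p if x > 1.0 / n)
    d2 = sum(abs(a - b) for a in p for b in p) / (2.0 * n)
    return d1, d2


def diversity(d, n):
    """Diversity D = N (1 - d), as given in the abstract."""
    return n * (1.0 - d)


def entropy_bits(big_d):
    """Entropy H = log_2 D, as given in the abstract."""
    return math.log2(big_d)


if __name__ == "__main__":
    p = [0.5, 0.25, 0.25]
    d1, d2 = lorenz_dominance(p)
    for label, d in (("first", d1), ("second", d2)):
        big_d = diversity(d, len(p))
        print(label, round(d, 4), round(big_d, 4), round(entropy_bits(big_d), 4))
```

Under these assumptions, a uniform distribution gives zero dominance, diversity N, and entropy log_2 N, while a distribution concentrated on a single symbol gives diversity 1 and entropy 0.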
format Online
Article
Text
id pubmed-7517034
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7517034 2020-11-09 The Lorenz Curve: A Proper Framework to Define Satisfactory Measures of Symbol Dominance, Symbol Diversity, and Information Entropy Camargo, Julio A. Entropy (Basel) Article MDPI 2020-05-13 /pmc/articles/PMC7517034/ /pubmed/33286315 http://dx.doi.org/10.3390/e22050542 Text en © 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title The Lorenz Curve: A Proper Framework to Define Satisfactory Measures of Symbol Dominance, Symbol Diversity, and Information Entropy
topic Article