The Lorenz Curve: A Proper Framework to Define Satisfactory Measures of Symbol Dominance, Symbol Diversity, and Information Entropy

Bibliographic Details
Main author: Camargo, Julio A.
Format: Online Article Text
Language: English
Published: MDPI 2020
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7517034/
https://www.ncbi.nlm.nih.gov/pubmed/33286315
http://dx.doi.org/10.3390/e22050542
Description
Summary: Novel measures of symbol dominance ($d_{C1}$ and $d_{C2}$), symbol diversity ($D_{C1} = N(1 - d_{C1})$ and $D_{C2} = N(1 - d_{C2})$), and information entropy ($H_{C1} = \log_2 D_{C1}$ and $H_{C2} = \log_2 D_{C2}$) are derived from Lorenz-consistent statistics that I had previously proposed to quantify dominance and diversity in ecology. Here, $d_{C1}$ refers to the average absolute difference between the relative abundances of dominant and subordinate symbols, with its value being equivalent to the maximum vertical distance from the Lorenz curve to the 45-degree line of equiprobability; $d_{C2}$ refers to the average absolute difference between all pairs of relative symbol abundances, with its value being equivalent to twice the area between the Lorenz curve and the 45-degree line of equiprobability; $N$ is the number of different symbols or maximum expected diversity. These Lorenz-consistent statistics are compared with statistics based on Shannon's entropy and Rényi's second-order entropy to show that the former have better mathematical behavior than the latter. The use of $d_{C1}$, $D_{C1}$, and $H_{C1}$ is particularly recommended, as only changes in the allocation of relative abundance between dominant ($p_d > 1/N$) and subordinate ($p_s < 1/N$) symbols are of real relevance for probability distributions to achieve the reference distribution ($p_i = 1/N$) or to deviate from it.
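
The geometric definitions in the summary can be illustrated with a minimal Python sketch. It is not taken from the article: the function name `lorenz_measures` and the dictionary keys are hypothetical, and the Lorenz curve is assumed to be the piecewise-linear curve through the cumulative sums of the relative abundances sorted in ascending order. Under that assumption, the two dominance values are read directly off the curve as the maximum vertical distance to the 45-degree line and as twice the area between the curve and that line, and the diversity and entropy values follow from the relations stated in the summary.

```python
import math


def lorenz_measures(p):
    """Sketch of the Lorenz-based dominance, diversity, and entropy values.

    `p` is a sequence of non-negative symbol abundances; it is normalized
    here so that the relative abundances sum to 1. All names are
    illustrative assumptions, not the article's notation.
    """
    n = len(p)
    total = sum(p)
    shares = sorted(x / total for x in p)  # ascending order for the Lorenz curve

    cum = 0.0         # Lorenz curve height at x = k / n
    max_gap = 0.0     # largest vertical distance to the 45-degree line
    trapezoids = 0.0  # twice the area under the piecewise-linear curve
    for k, share in enumerate(shares, start=1):
        prev = cum
        cum += share
        max_gap = max(max_gap, k / n - cum)
        trapezoids += (prev + cum) / n

    d1 = max_gap           # maximum vertical distance
    d2 = 1.0 - trapezoids  # twice the area between curve and diagonal
    div1 = n * (1.0 - d1)  # D = N (1 - d)
    div2 = n * (1.0 - d2)
    return {
        "d1": d1, "d2": d2,
        "D1": div1, "D2": div2,
        "H1": math.log2(div1), "H2": math.log2(div2),  # H = log2 D
    }


if __name__ == "__main__":
    for name, value in lorenz_measures([0.5, 0.3, 0.2]).items():
        print(f"{name} = {value:.4f}")
```

For the three-symbol distribution (0.5, 0.3, 0.2), only the first symbol is dominant, and the sketch gives a first dominance value of 0.5 − 1/3 ≈ 0.167 and a first diversity value of 2.5; the pairwise differences 0.2, 0.3, and 0.1 sum to 0.6, so the second dominance value is 0.6/3 = 0.2 and the second diversity value is 2.4. For a uniform distribution both dominance values are 0 and both diversity values equal N.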