
The Lorenz Curve: A Proper Framework to Define Satisfactory Measures of Symbol Dominance, Symbol Diversity, and Information Entropy

Novel measures of symbol dominance (d_C1 and d_C2), symbol diversity (D_C1 = N(1 − d_C1) and D_C2 = N(1 − d_C2)), and information entropy (H_C1 = log_2 D_C1 and H_C2 = log_2 D_C2) are derived from Lorenz-consistent statistics that I had previously proposed to quanti...


Bibliographic Details
Main author:	Camargo, Julio A.
Format:	Online Article Text
Language:	English
Published:	MDPI 2020
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7517034/ https://www.ncbi.nlm.nih.gov/pubmed/33286315 http://dx.doi.org/10.3390/e22050542

_version_	1783587136392396800
author	Camargo, Julio A.
author_facet	Camargo, Julio A.
author_sort	Camargo, Julio A.
collection	PubMed
description	Novel measures of symbol dominance (d_C1 and d_C2), symbol diversity (D_C1 = N(1 − d_C1) and D_C2 = N(1 − d_C2)), and information entropy (H_C1 = log_2 D_C1 and H_C2 = log_2 D_C2) are derived from Lorenz-consistent statistics that I had previously proposed to quantify dominance and diversity in ecology. Here, d_C1 refers to the average absolute difference between the relative abundances of dominant and subordinate symbols, with its value being equivalent to the maximum vertical distance from the Lorenz curve to the 45-degree line of equiprobability; d_C2 refers to the average absolute difference between all pairs of relative symbol abundances, with its value being equivalent to twice the area between the Lorenz curve and the 45-degree line of equiprobability; N is the number of different symbols or maximum expected diversity. These Lorenz-consistent statistics are compared with statistics based on Shannon’s entropy and Rényi’s second-order entropy to show that the former have better mathematical behavior than the latter. The use of d_C1, D_C1, and H_C1 is particularly recommended, as only changes in the allocation of relative abundance between dominant (p_d > 1/N) and subordinate (p_s < 1/N) symbols are of real relevance for probability distributions to achieve the reference distribution (p_i = 1/N) or to deviate from it.
format	Online Article Text
id	pubmed-7517034
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-75170342020-11-09 The Lorenz Curve: A Proper Framework to Define Satisfactory Measures of Symbol Dominance, Symbol Diversity, and Information Entropy Camargo, Julio A. Entropy (Basel) Article Novel measures of symbol dominance (d_C1 and d_C2), symbol diversity (D_C1 = N(1 − d_C1) and D_C2 = N(1 − d_C2)), and information entropy (H_C1 = log_2 D_C1 and H_C2 = log_2 D_C2) are derived from Lorenz-consistent statistics that I had previously proposed to quantify dominance and diversity in ecology. Here, d_C1 refers to the average absolute difference between the relative abundances of dominant and subordinate symbols, with its value being equivalent to the maximum vertical distance from the Lorenz curve to the 45-degree line of equiprobability; d_C2 refers to the average absolute difference between all pairs of relative symbol abundances, with its value being equivalent to twice the area between the Lorenz curve and the 45-degree line of equiprobability; N is the number of different symbols or maximum expected diversity. These Lorenz-consistent statistics are compared with statistics based on Shannon’s entropy and Rényi’s second-order entropy to show that the former have better mathematical behavior than the latter. The use of d_C1, D_C1, and H_C1 is particularly recommended, as only changes in the allocation of relative abundance between dominant (p_d > 1/N) and subordinate (p_s < 1/N) symbols are of real relevance for probability distributions to achieve the reference distribution (p_i = 1/N) or to deviate from it. MDPI 2020-05-13 /pmc/articles/PMC7517034/ /pubmed/33286315 http://dx.doi.org/10.3390/e22050542 Text en © 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle	Article Camargo, Julio A. The Lorenz Curve: A Proper Framework to Define Satisfactory Measures of Symbol Dominance, Symbol Diversity, and Information Entropy
title	The Lorenz Curve: A Proper Framework to Define Satisfactory Measures of Symbol Dominance, Symbol Diversity, and Information Entropy
title_full	The Lorenz Curve: A Proper Framework to Define Satisfactory Measures of Symbol Dominance, Symbol Diversity, and Information Entropy
title_fullStr	The Lorenz Curve: A Proper Framework to Define Satisfactory Measures of Symbol Dominance, Symbol Diversity, and Information Entropy
title_full_unstemmed	The Lorenz Curve: A Proper Framework to Define Satisfactory Measures of Symbol Dominance, Symbol Diversity, and Information Entropy
title_short	The Lorenz Curve: A Proper Framework to Define Satisfactory Measures of Symbol Dominance, Symbol Diversity, and Information Entropy
title_sort	lorenz curve: a proper framework to define satisfactory measures of symbol dominance, symbol diversity, and information entropy
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7517034/ https://www.ncbi.nlm.nih.gov/pubmed/33286315 http://dx.doi.org/10.3390/e22050542
work_keys_str_mv	AT camargojulioa thelorenzcurveaproperframeworktodefinesatisfactorymeasuresofsymboldominancesymboldiversityandinformationentropy AT camargojulioa lorenzcurveaproperframeworktodefinesatisfactorymeasuresofsymboldominancesymboldiversityandinformationentropy
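
The definitions in the description field can be illustrated with a small numerical sketch. This record does not give the article's closed-form expressions, so the formulas below are assumptions chosen to match the geometric descriptions: the dominance measure described as the maximum vertical distance from the Lorenz curve to the 45-degree line is computed as half the sum of |p_i − 1/N|, and the one described as twice the area between the Lorenz curve and that line is computed as the sum of |p_i − p_j| over all pairs divided by N. The function names are hypothetical, and the two Lorenz-curve helpers are included only to cross-check the closed forms against the geometry.

```python
import math


def dominance_first(p):
    """Assumed closed form: half the total absolute deviation from 1/N."""
    n = len(p)
    return 0.5 * sum(abs(pi - 1.0 / n) for pi in p)


def dominance_second(p):
    """Assumed closed form: sum of pairwise absolute differences, divided by N."""
    n = len(p)
    total = sum(abs(p[i] - p[j]) for i in range(n) for j in range(i + 1, n))
    return total / n


def diversity(p, d):
    """Diversity D = N (1 - d), as stated in the description."""
    return len(p) * (1.0 - d)


def entropy(p, d):
    """Entropy H = log_2 D, as stated in the description."""
    return math.log2(diversity(p, d))


def lorenz_points(p):
    """Cumulative shares of the abundances sorted in ascending order."""
    points, running = [0.0], 0.0
    for pi in sorted(p):
        running += pi
        points.append(running)
    return points


def lorenz_max_distance(p):
    """Maximum vertical gap between the diagonal and the Lorenz curve."""
    n = len(p)
    curve = lorenz_points(p)
    return max(k / n - curve[k] for k in range(n + 1))


def lorenz_twice_area(p):
    """Twice the area between the diagonal and the piecewise-linear Lorenz curve."""
    n = len(p)
    curve = lorenz_points(p)
    # Trapezoid rule over N equal-width segments.
    under = sum((curve[k] + curve[k + 1]) / (2.0 * n) for k in range(n))
    return 2.0 * (0.5 - under)


if __name__ == "__main__":
    p = [0.4, 0.3, 0.2, 0.1]  # illustrative distribution, not from the article
    d1, d2 = dominance_first(p), dominance_second(p)
    print("first dominance:", d1, "Lorenz check:", lorenz_max_distance(p))
    print("second dominance:", d2, "Lorenz check:", lorenz_twice_area(p))
    print("diversities:", diversity(p, d1), diversity(p, d2))
    print("entropies:", entropy(p, d1), entropy(p, d2))
```

Under these assumed formulas, a uniform distribution gives zero dominance, diversity N, and entropy log_2 N, while a distribution concentrated on a single symbol gives dominance 1 − 1/N, diversity 1, and entropy 0.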
