Network-Based and Binless Frequency Analyses

We introduce and develop a new network-based and binless methodology to perform frequency analyses and produce histograms. In contrast with traditional frequency analysis techniques that use fixed intervals to bin values, we place a range ±ζ around each individual value in a data set and count the n...
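
The counting step described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' published script: the function names (`binless_degrees`, `network_mode`, `largest_cluster_fraction`), the inclusive treatment of the ±ζ boundary, and the sample data are all assumptions made here. For one-dimensional data, two values are linked when they differ by at most ζ, so the largest cluster can be found by scanning the gaps between sorted neighbours.

```python
from bisect import bisect_left, bisect_right


def binless_degrees(values, zeta):
    """For each value, count the OTHER values lying within +/- zeta of it.

    Assumption: the boundary is inclusive (|a - b| <= zeta).
    """
    ordered = sorted(values)
    degrees = []
    for v in values:
        lo = bisect_left(ordered, v - zeta)
        hi = bisect_right(ordered, v + zeta)
        degrees.append(hi - lo - 1)  # subtract 1 to exclude the value itself
    return degrees


def network_mode(values, zeta):
    """Return (value, degree) for the value with the most connections."""
    degrees = binless_degrees(values, zeta)
    best = max(range(len(values)), key=degrees.__getitem__)
    return values[best], degrees[best]


def largest_cluster_fraction(values, zeta):
    """Proportion of values in the largest connected cluster.

    In one dimension, a cluster is a run of sorted values whose
    consecutive gaps are all <= zeta.
    """
    ordered = sorted(values)
    if not ordered:
        return 0.0
    largest = current = 1
    for prev, nxt in zip(ordered, ordered[1:]):
        current = current + 1 if nxt - prev <= zeta else 1
        largest = max(largest, current)
    return largest / len(ordered)


if __name__ == "__main__":
    data = [10, 12, 13, 14, 50, 51, 90]  # hypothetical sample data
    print(binless_degrees(data, 2.5))
    print(network_mode(data, 2.5))
    # Scan candidate ranges to see where the largest-cluster proportion
    # stops changing; the exact stability criterion is left open here.
    for zeta in (0.5, 1, 2.5, 5, 40):
        print(zeta, largest_cluster_fraction(data, zeta))
```

Scanning several values of ζ, as in the last loop, only exposes how the largest-cluster proportion changes with the range; how to judge when it is stable enough is not specified in this record and is left to the reader.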

Bibliographic Details
Main Authors: Derrible, Sybil; Ahmad, Nasir
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4631440/
https://www.ncbi.nlm.nih.gov/pubmed/26529207
http://dx.doi.org/10.1371/journal.pone.0142108
_version_ 1782398862894301184
author Derrible, Sybil
Ahmad, Nasir
author_facet Derrible, Sybil
Ahmad, Nasir
author_sort Derrible, Sybil
collection PubMed
description We introduce and develop a new network-based and binless methodology to perform frequency analyses and produce histograms. In contrast with traditional frequency analysis techniques that use fixed intervals to bin values, we place a range ±ζ around each individual value in a data set and count the number of values within that range, which allows us to compare every value in a data set with every other. In essence, the methodology is identical to the construction of a network, where two values are connected if they lie within a given range (±ζ). The value with the highest degree (i.e., most connections) is therefore taken as the mode of the distribution. To select an optimal range, we look at the stability of the proportion of nodes in the largest cluster. The methodology is validated by sampling 12 typical distributions, and it is applied to a number of real-world data sets with both spatial and temporal components. The methodology can be applied to any data set and provides a robust means to uncover meaningful patterns and trends. A free Python script and a tutorial are also made available to facilitate the application of the method.
format Online
Article
Text
id pubmed-4631440
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4631440 2015-11-13 Network-Based and Binless Frequency Analyses Derrible, Sybil Ahmad, Nasir PLoS One Research Article Public Library of Science 2015-11-03 /pmc/articles/PMC4631440/ /pubmed/26529207 http://dx.doi.org/10.1371/journal.pone.0142108 Text en © 2015 Derrible, Ahmad http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Derrible, Sybil
Ahmad, Nasir
Network-Based and Binless Frequency Analyses
title Network-Based and Binless Frequency Analyses
title_full Network-Based and Binless Frequency Analyses
title_fullStr Network-Based and Binless Frequency Analyses
title_full_unstemmed Network-Based and Binless Frequency Analyses
title_short Network-Based and Binless Frequency Analyses
title_sort network-based and binless frequency analyses
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4631440/
https://www.ncbi.nlm.nih.gov/pubmed/26529207
http://dx.doi.org/10.1371/journal.pone.0142108
work_keys_str_mv AT derriblesybil networkbasedandbinlessfrequencyanalyses
AT ahmadnasir networkbasedandbinlessfrequencyanalyses