
Network Analytics Enabled by Generating a Pool of Network Variants from Noisy Data

Mapping network nodes and edges to communities and network functions is crucial to gaining a higher level of understanding of the network structure and functions. Such mappings are particularly challenging to design for covert social networks, which intentionally hide their structure and functions t...

Bibliographic Details
Main Authors:	Mandviwalla, Aamir, Elsisy, Amr, Atique, Muhammad Saad, Kuzmin, Konstantin, Gaiteri, Chris, Szymanski, Boleslaw K.
Format:	Online Article Text
Language:	English
Published:	MDPI 2023
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10453411/ https://www.ncbi.nlm.nih.gov/pubmed/37628148 http://dx.doi.org/10.3390/e25081118

_version_	1785095929624264704
author	Mandviwalla, Aamir Elsisy, Amr Atique, Muhammad Saad Kuzmin, Konstantin Gaiteri, Chris Szymanski, Boleslaw K.
author_facet	Mandviwalla, Aamir Elsisy, Amr Atique, Muhammad Saad Kuzmin, Konstantin Gaiteri, Chris Szymanski, Boleslaw K.
author_sort	Mandviwalla, Aamir
collection	PubMed
description	Mapping network nodes and edges to communities and network functions is crucial to gaining a higher level of understanding of the network structure and functions. Such mappings are particularly challenging to design for covert social networks, which intentionally hide their structure and functions to protect important members from attacks or arrests. Here, we focus on correctly inferring the structures and functions of such networks, but our methodology can be broadly applied. Without the ground truth, knowledge about the allocation of nodes to communities and network functions, no single network based on the noisy data can represent all plausible communities and functions of the true underlying network. To address this limitation, we apply a generative model that randomly distorts the original network based on the noisy data, generating a pool of statistically equivalent networks. Each unique generated network is recorded, while each duplicate of the already recorded network just increases the repetition count of that network. We treat each such network as a variant of the ground truth with the probability of arising in the real world approximated by the ratio of the count of this network’s duplicates plus one to the total number of all generated networks. Communities of variants with frequently occurring duplicates contain persistent patterns shared by their structures. Using Shannon entropy, we can find a variant that minimizes the uncertainty for operations planned on the network. Repeatedly generating new pools of networks from the best network of the previous step for several steps lowers the entropy of the best new variant. If the entropy is too high, the network operators can identify nodes, the monitoring of which can achieve the most significant reduction in entropy. Finally, we also present a heuristic for constructing a new variant, which is not randomly generated but has the lowest expected cost of operating on the distorted mappings of network nodes to communities and functions caused by noisy data.
format	Online Article Text
id	pubmed-10453411
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-10453411 2023-08-26 Network Analytics Enabled by Generating a Pool of Network Variants from Noisy Data Mandviwalla, Aamir Elsisy, Amr Atique, Muhammad Saad Kuzmin, Konstantin Gaiteri, Chris Szymanski, Boleslaw K. Entropy (Basel) Article Mapping network nodes and edges to communities and network functions is crucial to gaining a higher level of understanding of the network structure and functions. Such mappings are particularly challenging to design for covert social networks, which intentionally hide their structure and functions to protect important members from attacks or arrests. Here, we focus on correctly inferring the structures and functions of such networks, but our methodology can be broadly applied. Without the ground truth, knowledge about the allocation of nodes to communities and network functions, no single network based on the noisy data can represent all plausible communities and functions of the true underlying network. To address this limitation, we apply a generative model that randomly distorts the original network based on the noisy data, generating a pool of statistically equivalent networks. Each unique generated network is recorded, while each duplicate of the already recorded network just increases the repetition count of that network. We treat each such network as a variant of the ground truth with the probability of arising in the real world approximated by the ratio of the count of this network’s duplicates plus one to the total number of all generated networks. Communities of variants with frequently occurring duplicates contain persistent patterns shared by their structures. Using Shannon entropy, we can find a variant that minimizes the uncertainty for operations planned on the network. Repeatedly generating new pools of networks from the best network of the previous step for several steps lowers the entropy of the best new variant. If the entropy is too high, the network operators can identify nodes, the monitoring of which can achieve the most significant reduction in entropy. Finally, we also present a heuristic for constructing a new variant, which is not randomly generated but has the lowest expected cost of operating on the distorted mappings of network nodes to communities and functions caused by noisy data. MDPI 2023-07-26 /pmc/articles/PMC10453411/ /pubmed/37628148 http://dx.doi.org/10.3390/e25081118 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle	Article Mandviwalla, Aamir Elsisy, Amr Atique, Muhammad Saad Kuzmin, Konstantin Gaiteri, Chris Szymanski, Boleslaw K. Network Analytics Enabled by Generating a Pool of Network Variants from Noisy Data
title	Network Analytics Enabled by Generating a Pool of Network Variants from Noisy Data
title_full	Network Analytics Enabled by Generating a Pool of Network Variants from Noisy Data
title_fullStr	Network Analytics Enabled by Generating a Pool of Network Variants from Noisy Data
title_full_unstemmed	Network Analytics Enabled by Generating a Pool of Network Variants from Noisy Data
title_short	Network Analytics Enabled by Generating a Pool of Network Variants from Noisy Data
title_sort	network analytics enabled by generating a pool of network variants from noisy data
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10453411/ https://www.ncbi.nlm.nih.gov/pubmed/37628148 http://dx.doi.org/10.3390/e25081118
work_keys_str_mv	AT mandviwallaaamir networkanalyticsenabledbygeneratingapoolofnetworkvariantsfromnoisydata AT elsisyamr networkanalyticsenabledbygeneratingapoolofnetworkvariantsfromnoisydata AT atiquemuhammadsaad networkanalyticsenabledbygeneratingapoolofnetworkvariantsfromnoisydata AT kuzminkonstantin networkanalyticsenabledbygeneratingapoolofnetworkvariantsfromnoisydata AT gaiterichris networkanalyticsenabledbygeneratingapoolofnetworkvariantsfromnoisydata AT szymanskiboleslawk networkanalyticsenabledbygeneratingapoolofnetworkvariantsfromnoisydata
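
The pooling step summarized in the description field (record each unique generated network, count its duplicates, and estimate its probability as duplicates plus one over the total number of generated networks) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the independent edge-flip noise model in `distort`, the `flip_prob` parameter, the toy four-node network, and the use of Shannon entropy over the whole pool. The record does not describe the authors' generative model or how they apply entropy to select a variant, so this is not their implementation.

```python
import math
import random
from collections import Counter
from itertools import combinations


def distort(edges, nodes, flip_prob, rng):
    """Hypothetical noise model (an assumption, not the paper's):
    flip each possible undirected edge independently."""
    variant = set(edges)
    for pair in combinations(sorted(nodes), 2):
        if rng.random() < flip_prob:
            variant ^= {pair}  # add the edge if absent, remove it if present
    return frozenset(variant)  # hashable, so identical variants compare equal


def generate_pool(edges, nodes, n_samples, flip_prob=0.05, seed=0):
    """Generate variants; the Counter records each unique network once
    and increments its count for every duplicate."""
    rng = random.Random(seed)
    return Counter(distort(edges, nodes, flip_prob, rng) for _ in range(n_samples))


def variant_probabilities(pool):
    """Probability of a variant = (duplicates + 1) / total generated.
    The stored count already equals duplicates + 1."""
    total = sum(pool.values())
    return {variant: count / total for variant, count in pool.items()}


def shannon_entropy(probabilities):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Toy observed network; edges are stored as sorted node pairs.
nodes = ["a", "b", "c", "d"]
observed = {("a", "b"), ("b", "c"), ("c", "d")}

pool = generate_pool(observed, nodes, n_samples=1000)
probs = variant_probabilities(pool)
entropy = shannon_entropy(probs.values())
most_frequent_variant, occurrences = pool.most_common(1)[0]
```

Storing each variant as a `frozenset` of sorted node pairs makes duplicate detection a dictionary lookup; a real pipeline would likely need a canonical form suited to its own network representation.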
