
Assessing Low-Intensity Relationships in Complex Networks

Many large network data sets are noisy and contain links representing low-intensity relationships that are difficult to differentiate from random interactions. This is especially relevant for high-throughput data from systems biology, large-scale ecological data, but also for Web 2.0 data on human i...


Bibliographic Details
Main Authors:	Spitz, Andreas; Gimmler, Anna; Stoeck, Thorsten; Zweig, Katharina Anna; Horvát, Emőke-Ágnes
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2016
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4838277/ https://www.ncbi.nlm.nih.gov/pubmed/27096435 http://dx.doi.org/10.1371/journal.pone.0152536

_version_	1782427965153345536
author	Spitz, Andreas; Gimmler, Anna; Stoeck, Thorsten; Zweig, Katharina Anna; Horvát, Emőke-Ágnes
author_facet	Spitz, Andreas; Gimmler, Anna; Stoeck, Thorsten; Zweig, Katharina Anna; Horvát, Emőke-Ágnes
author_sort	Spitz, Andreas
collection	PubMed
description	Many large network data sets are noisy and contain links representing low-intensity relationships that are difficult to differentiate from random interactions. This is especially relevant for high-throughput data from systems biology, large-scale ecological data, but also for Web 2.0 data on human interactions. In these networks with missing and spurious links, it is possible to refine the data based on the principle of structural similarity, which assesses the shared neighborhood of two nodes. By using similarity measures to globally rank all possible links and choosing the top-ranked pairs, true links can be validated, missing links inferred, and spurious observations removed. While many similarity measures have been proposed to this end, there is no general consensus on which one to use. In this article, we first contribute a set of benchmarks for complex networks from three different settings (e-commerce, systems biology, and social networks) and thus enable a quantitative performance analysis of classic node similarity measures. Based on this, we then propose a new methodology for link assessment called z* that assesses the statistical significance of the number of their common neighbors by comparison with the expected value in a suitably chosen random graph model and which is a consistently top-performing algorithm for all benchmarks. In addition to a global ranking of links, we also use this method to identify the most similar neighbors of each single node in a local ranking, thereby showing the versatility of the method in two distinct scenarios and augmenting its applicability. Finally, we perform an exploratory analysis on an oceanographic plankton data set and find that the distribution of microbes follows similar biogeographic rules as those of macroorganisms, a result that rejects the global dispersal hypothesis for microbes.
format	Online Article Text
id	pubmed-4838277
institution	National Center for Biotechnology Information
language	English
publishDate	2016
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-4838277 2016-04-29 Assessing Low-Intensity Relationships in Complex Networks Spitz, Andreas; Gimmler, Anna; Stoeck, Thorsten; Zweig, Katharina Anna; Horvát, Emőke-Ágnes PLoS One Research Article Public Library of Science 2016-04-20 /pmc/articles/PMC4838277/ /pubmed/27096435 http://dx.doi.org/10.1371/journal.pone.0152536 Text en © 2016 Spitz et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle	Research Article Spitz, Andreas Gimmler, Anna Stoeck, Thorsten Zweig, Katharina Anna Horvát, Emőke-Ágnes Assessing Low-Intensity Relationships in Complex Networks
title	Assessing Low-Intensity Relationships in Complex Networks
title_full	Assessing Low-Intensity Relationships in Complex Networks
title_fullStr	Assessing Low-Intensity Relationships in Complex Networks
title_full_unstemmed	Assessing Low-Intensity Relationships in Complex Networks
title_short	Assessing Low-Intensity Relationships in Complex Networks
title_sort	assessing low-intensity relationships in complex networks
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4838277/ https://www.ncbi.nlm.nih.gov/pubmed/27096435 http://dx.doi.org/10.1371/journal.pone.0152536
work_keys_str_mv	AT spitzandreas assessinglowintensityrelationshipsincomplexnetworks AT gimmleranna assessinglowintensityrelationshipsincomplexnetworks AT stoeckthorsten assessinglowintensityrelationshipsincomplexnetworks AT zweigkatharinaanna assessinglowintensityrelationshipsincomplexnetworks AT horvatemokeagnes assessinglowintensityrelationshipsincomplexnetworks
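
The description field above outlines the general idea of ranking node pairs by how far their number of common neighbors exceeds what a random graph model would predict. The sketch below illustrates that idea only; it is not the authors' z* method. It assumes a simple hypergeometric null model (each node's neighbors drawn uniformly from the remaining nodes), which this record does not specify, and the function names `common_neighbor_z_scores` and `rank_pairs` are hypothetical.

```python
import math
from itertools import combinations


def common_neighbor_z_scores(adjacency):
    """Z-score of the common-neighbor count for every node pair.

    adjacency: dict mapping each node to the set of its neighbors
    (undirected simple graph). The null model is an assumption made
    for this sketch: neighbors are drawn uniformly at random, so the
    overlap of two neighborhoods is hypergeometric.
    """
    nodes = sorted(adjacency)
    pool = len(nodes) - 2  # nodes that could be a common neighbor of a pair
    scores = {}
    if pool < 2:
        return scores
    for u, v in combinations(nodes, 2):
        nu = adjacency[u] - {v}  # neighbors of u, ignoring a direct u-v link
        nv = adjacency[v] - {u}
        observed = len(nu & nv)
        p = len(nu) / pool
        mean = len(nv) * p
        var = len(nv) * p * (1 - p) * (pool - len(nv)) / (pool - 1)
        if var <= 0:
            continue  # degenerate pair: no variation under the null model
        scores[(u, v)] = (observed - mean) / math.sqrt(var)
    return scores


def rank_pairs(adjacency):
    """Global ranking of node pairs, highest z-score first."""
    scores = common_neighbor_z_scores(adjacency)
    return sorted(scores, key=scores.get, reverse=True)


# Toy graph (hypothetical data): a and b are not linked but share both neighbors.
example_graph = {
    "a": {"c", "d"},
    "b": {"c", "d"},
    "c": {"a", "b"},
    "d": {"a", "b", "e"},
    "e": {"d", "f"},
    "f": {"e"},
}
top_pair = rank_pairs(example_graph)[0]
```

In this toy graph the unlinked pair with the most surprising overlap ranks first, which is the kind of pair a global ranking would flag as a candidate missing link. Restricting the ranking to pairs that contain one fixed node would give the local, per-node ranking mentioned in the description.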
