
A novel probabilistic generator for large-scale gene association networks

MOTIVATION: Gene expression data provide an opportunity for reverse-engineering gene-gene associations using network inference methods. However, it is difficult to assess the performance of these methods because the true underlying network is unknown in real data. Current benchmarks address this pro...


Bibliographic Details
Main authors: Grimes, Tyler; Datta, Somnath
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8589155/
https://www.ncbi.nlm.nih.gov/pubmed/34767561
http://dx.doi.org/10.1371/journal.pone.0259193
author Grimes, Tyler
Datta, Somnath
collection PubMed
description MOTIVATION: Gene expression data provide an opportunity for reverse-engineering gene-gene associations using network inference methods. However, it is difficult to assess the performance of these methods because the true underlying network is unknown in real data. Current benchmarks address this problem by subsampling a known regulatory network to conduct simulations. But the topology of regulatory networks can vary greatly across organisms or tissues, and reference-based generators, such as GeneNetWeaver, are not designed to capture this heterogeneity. This means, for example, that benchmark results from the E. coli regulatory network will not carry over to other organisms or tissues. In contrast, probabilistic generators do not require a reference network, and they have the potential to capture a rich distribution of topologies. This makes probabilistic generators an ideal approach for obtaining a robust benchmarking of network inference methods. RESULTS: We propose a novel probabilistic network generator that (1) provides an alternative that addresses the inherent limitation of reference-based generators, (2) is able to create realistic gene association networks, and (3) captures the heterogeneity found across gold-standard networks better than existing generators used in practice. Eight organism-specific and 12 human tissue-specific gold-standard association networks are considered. Several measures of global topology are used to determine the similarity of generated networks to the gold standards. Along with demonstrating the variability of network structure across organisms and tissues, we show that the commonly used “scale-free” model is insufficient for replicating these structures. AVAILABILITY: This generator is implemented in the R package “SeqNet” and is available on CRAN (https://cran.r-project.org/web/packages/SeqNet/index.html).
format Online
Article
Text
id pubmed-8589155
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8589155 2021-11-13 A novel probabilistic generator for large-scale gene association networks Grimes, Tyler Datta, Somnath PLoS One Research Article Public Library of Science 2021-11-12 /pmc/articles/PMC8589155/ /pubmed/34767561 http://dx.doi.org/10.1371/journal.pone.0259193 Text en © 2021 Grimes, Datta https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A novel probabilistic generator for large-scale gene association networks
topic Research Article