
Assessment of weighted topological overlap (wTO) to improve fidelity of gene co-expression networks

BACKGROUND: For more than a decade, gene expression data sets have been used as a basis for the construction of co-expression networks used in systems biology investigations, leading to many important discoveries in a wide range of subjects spanning human disease to evolution and the development of or...


Bibliographic Details
Main authors: Voigt, André, Almaas, Eivind
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6350380/
https://www.ncbi.nlm.nih.gov/pubmed/30691386
http://dx.doi.org/10.1186/s12859-019-2596-9
author Voigt, André
Almaas, Eivind
collection PubMed
description BACKGROUND: For more than a decade, gene expression data sets have been used as a basis for the construction of co-expression networks used in systems biology investigations, leading to many important discoveries in a wide range of subjects spanning human disease to evolution and the development of organisms. A commonly encountered challenge in such investigations is that of first detecting, and subsequently removing, spurious correlations (i.e. links) in these networks. While access to a large number of measurements per gene would reduce this problem, often only a small number of measurements are available. The weighted Topological Overlap (wTO) measure, which incorporates information from the shared network-neighborhood of a given gene-pair into a single score, is a metric that is frequently used with the implicit expectation of producing higher-quality networks. However, the actual extent to which wTO improves on the accuracy of a co-expression analysis has not been quantified. RESULTS: Here, we used a large-sample biological data set containing 338 gene-expression measurements per gene as a reference system. From these data, we generated ensembles consisting of 10, 20 and 50 randomly selected measurements to emulate low-quality data sets, finding that the wTO measure consistently generates more robust scores than simple correlation calculations do. Furthermore, for the data sets consisting of only 10 and 20 samples per gene, we find that wTO serves as a better predictor of the correlation scores generated from the full data set. However, we find that using wTO as a score for network building substantially alters several topological aspects of the resulting networks, with no conclusive evidence that the resulting structure is more accurate. Importantly, we find that the widely used approach of applying a soft-threshold modifier to link weights prior to computing the wTO substantially decreases the robustness of the resulting wTO network, but increases the predictive power of wTO networks with regard to the reference correlation (soft threshold) network, particularly as the size of the data sets increases. CONCLUSION: Our analysis demonstrates that, in agreement with previous assumptions, the wTO approach is capable of significantly improving the fidelity of co-expression networks, and that this effect is especially evident for cases of low-sample-number gene-expression data sets. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2596-9) contains supplementary material, which is available to authorized users.
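The description above names the wTO measure, the soft-threshold modifier and the random subsampling of measurements, but the record gives no formula or code. The following is a minimal sketch of those three steps in Python with NumPy, assuming one commonly used wTO formulation, wTO_ij = (sum_u a_iu a_uj + a_ij) / (min(k_i, k_j) + 1 - |a_ij|) with k_i = sum_u |a_iu|. The function names, the exponent `beta=6` and the synthetic data are illustrative assumptions, not taken from the article.

```python
import numpy as np

def soft_threshold(corr, beta=6):
    """Unsigned soft-threshold adjacency |r|**beta (beta is an assumed, illustrative value)."""
    return np.abs(np.asarray(corr, dtype=float)) ** beta

def wto(adj):
    """Weighted topological overlap for a symmetric adjacency matrix.

    Assumed formulation:
    (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - |a_ij|)
    """
    a = np.array(adj, dtype=float)      # copy so the caller's matrix is untouched
    np.fill_diagonal(a, 0.0)            # no self-links; also drops u = i and u = j below
    k = np.abs(a).sum(axis=1)           # weighted connectivity of each gene
    shared = a @ a                      # shared-neighbour term sum_u a_iu * a_uj
    denom = np.minimum.outer(k, k) + 1.0 - np.abs(a)
    w = (shared + a) / denom
    np.fill_diagonal(w, 1.0)            # convention: a gene overlaps fully with itself
    return w

def subsample_corr(expr, n_samples, rng):
    """Pearson correlation between genes (rows) using a random subset of measurements (columns)."""
    cols = rng.choice(expr.shape[1], size=n_samples, replace=False)
    return np.corrcoef(expr[:, cols])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    expr = rng.normal(size=(20, 338))           # synthetic stand-in: 20 genes x 338 measurements
    reference = np.corrcoef(expr)               # correlation from the full data set
    small = subsample_corr(expr, 10, rng)       # emulated 10-sample data set
    print(wto(small).shape, wto(soft_threshold(small)).shape, reference.shape)
```

Comparing `wto(small)` and `small` against `reference` over many random subsamples would give a rough analogue of the robustness and predictive-power comparison summarized in the description, under the assumptions stated here.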
format Online
Article
Text
id pubmed-6350380
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6350380 2019-02-04 Assessment of weighted topological overlap (wTO) to improve fidelity of gene co-expression networks Voigt, André Almaas, Eivind BMC Bioinformatics Research Article BioMed Central 2019-01-28 /pmc/articles/PMC6350380/ /pubmed/30691386 http://dx.doi.org/10.1186/s12859-019-2596-9 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Assessment of weighted topological overlap (wTO) to improve fidelity of gene co-expression networks
topic Research Article