Assessment of weighted topological overlap (wTO) to improve fidelity of gene co-expression networks
BACKGROUND: For more than a decade, gene expression data sets have been used as basis for the construction of co-expression networks used in systems biology investigations, leading to many important discoveries in a wide range of subjects spanning human disease to evolution and the development of or...
Main Authors: | Voigt, André, Almaas, Eivind |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2019 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6350380/ https://www.ncbi.nlm.nih.gov/pubmed/30691386 http://dx.doi.org/10.1186/s12859-019-2596-9 |
_version_ | 1783390443300454400 |
---|---|
author | Voigt, André Almaas, Eivind |
author_sort | Voigt, André |
collection | PubMed |
description | BACKGROUND: For more than a decade, gene expression data sets have been used as basis for the construction of co-expression networks used in systems biology investigations, leading to many important discoveries in a wide range of subjects spanning human disease to evolution and the development of organisms. A commonly encountered challenge in such investigations is first that of detecting, then subsequently removing, spurious correlations (i.e. links) in these networks. While access to a large number of measurements per gene would reduce this problem, often only a small number of measurements are available. The weighted Topological Overlap (wTO) measure, which incorporates information from the shared network-neighborhood of a given gene-pair into a single score, is a metric that is frequently used with the implicit expectation of producing higher-quality networks. However, the actual extent to which wTO improves on the accuracy of a co-expression analysis has not been quantified. RESULTS: Here, we used a large-sample biological data set containing 338 gene-expression measurements per gene as a reference system. From these data, we generated ensembles consisting of 10, 20 and 50 randomly selected measurements to emulate low-quality data sets, finding that the wTO measure consistently generates more robust scores than what results from simple correlation calculations. Furthermore, for the data sets consisting of only 10 and 20 samples per gene, we find that wTO serves as a better predictor of the correlation scores generated from the full data set. However, we find that using wTO as a score for network building substantially alters several topographical aspects of the resulting networks, with no conclusive evidence that the resulting structure is more accurate. 
Importantly, we find that the much used approach of applying a soft-threshold modifier to link weights prior to computing the wTO substantially decreases the robustness of the resulting wTO network, but increases the predictive power of wTO networks with regards to the reference correlation (soft threshold) network, particularly as the size of the data sets increases. CONCLUSION: Our analysis demonstrates that, in agreement with previous assumptions, the wTO approach is capable of significantly improving the fidelity of co-expression networks, and that this effect is especially evident for cases of low-sample number gene-expression data sets. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-2596-9) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-6350380 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6350380 2019-02-04 Assessment of weighted topological overlap (wTO) to improve fidelity of gene co-expression networks Voigt, André Almaas, Eivind BMC Bioinformatics Research Article BioMed Central 2019-01-28 /pmc/articles/PMC6350380/ /pubmed/30691386 http://dx.doi.org/10.1186/s12859-019-2596-9 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
title | Assessment of weighted topological overlap (wTO) to improve fidelity of gene co-expression networks |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6350380/ https://www.ncbi.nlm.nih.gov/pubmed/30691386 http://dx.doi.org/10.1186/s12859-019-2596-9 |