Generating confidence intervals on biological networks
BACKGROUND: In the analysis of networks we frequently require the statistical significance of some network statistic, such as measures of similarity for the properties of interacting nodes. The structure of the network may introduce dependencies among the nodes and it will in general be necessary to...
Main authors: | Thorne, Thomas; Stumpf, Michael PH |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2007 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2241843/ https://www.ncbi.nlm.nih.gov/pubmed/18053130 http://dx.doi.org/10.1186/1471-2105-8-467 |
_version_ | 1782150539488788480 |
---|---|
author | Thorne, Thomas Stumpf, Michael PH |
author_facet | Thorne, Thomas Stumpf, Michael PH |
author_sort | Thorne, Thomas |
collection | PubMed |
description | BACKGROUND: In the analysis of networks we frequently require the statistical significance of some network statistic, such as measures of similarity for the properties of interacting nodes. The structure of the network may introduce dependencies among the nodes, and it will in general be necessary to account for these dependencies in the statistical analysis. To this end we require some form of Null model of the network: generally, rewired replicates of the network are generated which preserve only the degree (number of interactions) of each node. We show that this can fail to capture important features of network structure, and may result in unrealistic significance levels, when potentially confounding additional information is available. METHODS: We present a new network resampling Null model which takes into account the degree sequence as well as available biological annotations. Using gene ontology information as an illustration, we show how this information can be accounted for in the resampling approach, and the impact such information has on the assessment of statistical significance of correlations and motif abundances in the Saccharomyces cerevisiae protein interaction network. An algorithm, GOcardShuffle, is introduced to allow for the efficient construction of an improved Null model for network data. RESULTS: We use the protein interaction network of S. cerevisiae; correlations between the evolutionary rates and expression levels of interacting proteins, and their statistical significance, were assessed for Null models which condition on different aspects of the available data. The novel GOcardShuffle approach results in a Null model for annotated network data which appears to describe the properties of real biological networks better. CONCLUSION: An improved approach to the statistical analysis of biological network data, which conditions on the available biological information, leads to qualitatively different results from approaches which ignore such annotations. In particular, we demonstrate that the effects of the biological organization of the network can be sufficient to explain the observed similarity of interacting proteins. |
format | Text |
id | pubmed-2241843 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2241843 2008-02-14 Generating confidence intervals on biological networks Thorne, Thomas Stumpf, Michael PH BMC Bioinformatics Methodology Article BioMed Central 2007-11-30 /pmc/articles/PMC2241843/ /pubmed/18053130 http://dx.doi.org/10.1186/1471-2105-8-467 Text en Copyright © 2007 Thorne and Stumpf; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Article Thorne, Thomas Stumpf, Michael PH Generating confidence intervals on biological networks |
title | Generating confidence intervals on biological networks |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2241843/ https://www.ncbi.nlm.nih.gov/pubmed/18053130 http://dx.doi.org/10.1186/1471-2105-8-467 |
work_keys_str_mv | AT thornethomas generatingconfidenceintervalsonbiologicalnetworks AT stumpfmichaelph generatingconfidenceintervalsonbiologicalnetworks |
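
The abstract describes the conventional Null model as rewired replicates of the network that preserve only the degree of each node. The following is a minimal Python sketch of that baseline, using the standard double edge swap, together with an empirical p-value computed against the replicates. It is not the GOcardShuffle algorithm, whose details are not given in this record; the function names, the parameters (`n_swaps`, `max_tries`, `seed`) and the add-one p-value convention are illustrative assumptions.

```python
import random


def rewire_preserving_degree(edges, n_swaps, seed=None, max_tries=None):
    """Return a rewired copy of an undirected simple graph (hypothetical helper).

    Each accepted step replaces edges (a, b) and (c, d) with (a, d) and (c, b),
    which leaves the degree of every node unchanged. Proposals that would create
    a self-loop or a duplicate edge are rejected. The input is assumed to contain
    no self-loops and no repeated edges.
    """
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    if len(edge_list) < 2:
        return edge_list  # nothing to swap
    edge_set = {frozenset(e) for e in edge_list}
    if max_tries is None:
        max_tries = 100 * n_swaps  # assumed cap so the loop always terminates
    done = tries = 0
    while done < n_swaps and tries < max_tries:
        tries += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # random orientation, so both possible pairings can occur
        if len({a, b, c, d}) < 4:
            continue  # edges share a node: conservatively skip these proposals
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in edge_set or new2 in edge_set:
            continue  # swap would duplicate an existing edge
        edge_set.discard(frozenset((a, b)))
        edge_set.discard(frozenset((c, d)))
        edge_set.update((new1, new2))
        edge_list[i] = (a, d)
        edge_list[j] = (c, b)
        done += 1
    return edge_list


def degrees(edges):
    """Map each node to its number of interactions."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


def empirical_p_value(observed, null_values):
    """One-sided empirical p-value with the common add-one correction."""
    hits = sum(1 for x in null_values if x >= observed)
    return (1 + hits) / (1 + len(null_values))


if __name__ == "__main__":
    toy = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (0, 3)]
    replicate = rewire_preserving_degree(toy, n_swaps=20, seed=1)
    print(degrees(replicate) == degrees(toy))
```

In use, a network statistic (for example, a correlation between properties of interacting nodes) would be computed on the observed network and on many such replicates, and the replicate values passed to `empirical_p_value`. Because this sketch conditions only on degree, it corresponds to the baseline that the abstract argues can give unrealistic significance levels when annotations are available.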