Generating confidence intervals on biological networks

BACKGROUND: In the analysis of networks we frequently require the statistical significance of some network statistic, such as measures of similarity for the properties of interacting nodes. The structure of the network may introduce dependencies among the nodes and it will in general be necessary to...

Bibliographic Details
Main authors:	Thorne, Thomas; Stumpf, Michael PH
Format:	Text
Language:	English
Published:	BioMed Central 2007
Subjects:	Methodology Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2241843/ https://www.ncbi.nlm.nih.gov/pubmed/18053130 http://dx.doi.org/10.1186/1471-2105-8-467

author	Thorne, Thomas; Stumpf, Michael PH
collection	PubMed
description	BACKGROUND: In the analysis of networks we frequently require the statistical significance of some network statistic, such as measures of similarity for the properties of interacting nodes. The structure of the network may introduce dependencies among the nodes and it will in general be necessary to account for these dependencies in the statistical analysis. To this end we require some form of Null model of the network: generally rewired replicates of the network are generated which preserve only the degree (number of interactions) of each node. We show that this can fail to capture important features of network structure, and may result in unrealistic significance levels, when potentially confounding additional information is available. METHODS: We present a new network resampling Null model which takes into account the degree sequence as well as available biological annotations. Using gene ontology information as an illustration we show how this information can be accounted for in the resampling approach, and the impact such information has on the assessment of statistical significance of correlations and motif-abundances in the Saccharomyces cerevisiae protein interaction network. An algorithm, GOcardShuffle, is introduced to allow for the efficient construction of an improved Null model for network data. RESULTS: We use the protein interaction network of S. cerevisiae; correlations between the evolutionary rates and expression levels of interacting proteins and their statistical significance were assessed for Null models which condition on different aspects of the available data. The novel GOcardShuffle approach results in a Null model for annotated network data which appears to describe the properties of real biological networks better. CONCLUSION: An improved approach to the statistical analysis of biological network data, which conditions on the available biological information, leads to qualitatively different results compared to approaches which ignore such annotations. In particular we demonstrate that the effects of the biological organization of the network can be sufficient to explain the observed similarity of interacting proteins.
format	Text
id	pubmed-2241843
institution	National Center for Biotechnology Information
language	English
publishDate	2007
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2241843 2008-02-14 Generating confidence intervals on biological networks Thorne, Thomas; Stumpf, Michael PH BMC Bioinformatics Methodology Article BioMed Central 2007-11-30 /pmc/articles/PMC2241843/ /pubmed/18053130 http://dx.doi.org/10.1186/1471-2105-8-467 Text en Copyright © 2007 Thorne and Stumpf; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Generating confidence intervals on biological networks
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2241843/ https://www.ncbi.nlm.nih.gov/pubmed/18053130 http://dx.doi.org/10.1186/1471-2105-8-467
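
The abstract contrasts the authors' annotation-aware null model with the common baseline of rewired replicates that preserve only each node's degree. As a rough illustration of that baseline only (this is not GOcardShuffle, whose details are not given in this record), the following minimal Python sketch rewires an undirected simple graph by repeated double-edge swaps. The function names `degree_preserving_rewire` and `degrees`, the parameters, and the toy edge list are all hypothetical choices made for this example.

```python
import random
from collections import Counter


def degrees(edges):
    """Count how many edges touch each node (hypothetical helper)."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return counts


def degree_preserving_rewire(edges, n_swaps, seed=None, max_tries_per_swap=100):
    """Return a rewired copy of an undirected simple graph.

    Each accepted step swaps the endpoints of two edges, (a, b) and (c, d),
    into (a, d) and (c, b), so every node keeps its original degree.
    Swaps that would create a self-loop or a duplicate edge are rejected.
    """
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    if len(edge_list) < 2:
        return edge_list  # nothing to swap
    edge_set = {frozenset(e) for e in edge_list}
    done, tries = 0, 0
    while done < n_swaps and tries < n_swaps * max_tries_per_swap:
        tries += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # pick the orientation of the second edge at random
        if a == d or c == b:
            continue  # would create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 == new2 or new1 in edge_set or new2 in edge_set:
            continue  # would create a duplicate edge
        edge_set.discard(frozenset((a, b)))
        edge_set.discard(frozenset((c, d)))
        edge_set.update((new1, new2))
        edge_list[i] = (a, d)
        edge_list[j] = (c, b)
        done += 1
    return edge_list


# Toy network (made up for illustration, not data from the article).
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2), (1, 3)]
replicate = degree_preserving_rewire(toy_edges, n_swaps=20, seed=1)
print(degrees(replicate) == degrees(toy_edges))
```

Computing a network statistic on many such replicates gives an empirical null distribution for that statistic. The abstract's point is that conditioning on degree alone can give unrealistic significance levels when annotations such as gene ontology terms are also available.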
