Measuring rank robustness in scored protein interaction networks

BACKGROUND: Protein interaction databases often provide confidence scores for each recorded interaction based on the available experimental evidence. Protein interaction networks (PINs) are then built by thresholding on these scores, so that only interactions of sufficiently high quality are include...

Bibliographic Details
Main authors: Bozhilova, Lyuba V.; Whitmore, Alan V.; Wray, Jonny; Reinert, Gesine; Deane, Charlotte M.
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6714100/
https://www.ncbi.nlm.nih.gov/pubmed/31462221
http://dx.doi.org/10.1186/s12859-019-3036-6
author Bozhilova, Lyuba V.
Whitmore, Alan V.
Wray, Jonny
Reinert, Gesine
Deane, Charlotte M.
collection PubMed
description BACKGROUND: Protein interaction databases often provide confidence scores for each recorded interaction based on the available experimental evidence. Protein interaction networks (PINs) are then built by thresholding on these scores, so that only interactions of sufficiently high quality are included. These networks are used to identify biologically relevant motifs or nodes using metrics such as degree or betweenness centrality. This type of analysis can be sensitive to the choice of threshold. If a node metric is to be useful for extracting biological signal, it should induce similar node rankings across PINs obtained at different reasonable confidence score thresholds.
RESULTS: We propose three measures (rank continuity, identifiability, and instability) to evaluate how robust a node metric is to changes in the score threshold. We apply our measures to twenty-five metrics and identify four as the most robust: the number of edges in the step-1 ego network, as well as the leave-one-out differences in average redundancy, average number of edges in the step-1 ego network, and natural connectivity. Our measures show good agreement across PINs from different species and data sources. Analysis of synthetically generated scored networks shows that robustness results are context-specific, and depend both on network topology and on how scores are placed across network edges.
CONCLUSION: Due to the uncertainty associated with protein interaction detection, and therefore network structure, for PIN analysis to be reproducible, it should yield similar results across different confidence score thresholds. We demonstrate that while certain node metrics are robust with respect to threshold choice, this is not always the case. Promisingly, our results suggest that there are some metrics that are robust across networks constructed from different databases, and different scoring procedures.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-019-3036-6) contains supplementary material, which is available to authorized users.
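The thresholding idea in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it does not implement rank continuity, identifiability, or instability, whose definitions are given only in the article itself. It simply builds two networks from the same scored edge list at different thresholds and checks how much the top-ranked nodes by degree overlap. The edge list, the function names, and the Jaccard-style top-k overlap are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical scored edge list: (protein_a, protein_b, confidence score in [0, 1]).
SCORED_EDGES = [
    ("A", "B", 0.95),
    ("A", "C", 0.90),
    ("A", "D", 0.72),
    ("B", "C", 0.65),
    ("C", "D", 0.80),
    ("D", "E", 0.55),
    ("E", "F", 0.91),
]


def threshold_network(scored_edges, threshold):
    """Keep only interactions whose score is at or above the threshold."""
    return [(u, v) for u, v, score in scored_edges if score >= threshold]


def degree(edges):
    """Degree of every node that appears in the thresholded network."""
    counts = Counter()
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1
    return counts


def top_k(metric, k):
    """Set of the k highest-valued nodes; ties are broken alphabetically."""
    ranked = sorted(metric, key=lambda node: (-metric[node], node))
    return set(ranked[:k])


def top_k_overlap(scored_edges, low, high, k):
    """Jaccard overlap of the top-k degree nodes at two score thresholds.

    This is an illustrative stand-in, not one of the article's measures.
    """
    top_low = top_k(degree(threshold_network(scored_edges, low)), k)
    top_high = top_k(degree(threshold_network(scored_edges, high)), k)
    union = top_low | top_high
    return len(top_low & top_high) / len(union) if union else 1.0


if __name__ == "__main__":
    print(top_k_overlap(SCORED_EDGES, low=0.5, high=0.9, k=2))
```

An overlap near 1 would suggest that, on this toy data, the degree ranking is insensitive to the threshold; a low value would suggest the opposite. Any other node metric could be swapped in for `degree` to compare metrics in the same way.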
format Online
Article
Text
id pubmed-6714100
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6714100 2019-09-04 Measuring rank robustness in scored protein interaction networks Bozhilova, Lyuba V. Whitmore, Alan V. Wray, Jonny Reinert, Gesine Deane, Charlotte M. BMC Bioinformatics Research Article BioMed Central 2019-08-28 /pmc/articles/PMC6714100/ /pubmed/31462221 http://dx.doi.org/10.1186/s12859-019-3036-6 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Measuring rank robustness in scored protein interaction networks
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6714100/
https://www.ncbi.nlm.nih.gov/pubmed/31462221
http://dx.doi.org/10.1186/s12859-019-3036-6