Link prediction and feature relevance in knowledge networks: A machine learning approach
We propose a supervised machine learning approach to predict partnership formation between universities. We focus on successful joint R&D projects funded by the Horizon 2020 programme in three research domains: Social Sciences and Humanities, Physical and Engineering Sciences, and Life Sciences....
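
As a rough illustration of how such a supervised link-prediction problem can be set up, the sketch below turns pairs of universities into one row each, with node-level attributes, network-derived attributes, and a binary link outcome. Every name and value here (`universities`, `past_links`, `new_links`, `dyad_row`, the chosen features) is a hypothetical assumption for illustration and is not taken from the article.

```python
from itertools import combinations

# Hypothetical toy data: node-level attributes per university.
universities = {
    "U1": {"country": "IT", "size": 40_000},
    "U2": {"country": "IT", "size": 25_000},
    "U3": {"country": "DE", "size": 30_000},
    "U4": {"country": "FR", "size": 12_000},  # newcomer: no past links
}

# Hypothetical past collaborations (undirected), used for network features.
past_links = {("U1", "U2"), ("U1", "U3"), ("U2", "U3")}
# Hypothetical outcome: pairs that go on to form a new joint project.
new_links = {("U1", "U4")}


def neighbours(node, edges):
    """Return the set of nodes sharing an edge with `node`."""
    return {b if a == node else a for a, b in edges if node in (a, b)}


def dyad_row(u, v):
    """Build one supervised-learning row for the pair (u, v)."""
    nu, nv = neighbours(u, past_links), neighbours(v, past_links)
    return {
        "pair": (u, v),
        # Node-level features
        "same_country": int(universities[u]["country"] == universities[v]["country"]),
        "size_gap": abs(universities[u]["size"] - universities[v]["size"]),
        # Network-derived features
        "common_neighbours": len(nu & nv),
        "degree_product": len(nu) * len(nv),
        # Binary target: does the pair form a link?
        "link": int((u, v) in new_links or (v, u) in new_links),
    }


rows = [dyad_row(u, v) for u, v in combinations(sorted(universities), 2)]

# Restricted setting: keep only node-level features and the target.
node_only = [{k: r[k] for k in ("same_country", "size_gap", "link")} for r in rows]
```

Any classifier can then be trained on `rows` or on `node_only`. In this toy data the newcomer `U4` has all-zero network features, which is why a model restricted to node-level attributes is the natural one to apply to institutions with no collaboration history.
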
Main Authors: | Zinilli, Antonio; Cerulli, Giovanni |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2023 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10688692/ https://www.ncbi.nlm.nih.gov/pubmed/38032965 http://dx.doi.org/10.1371/journal.pone.0290018 |
_version_ | 1785152216582062080 |
---|---|
author | Zinilli, Antonio; Cerulli, Giovanni |
author_facet | Zinilli, Antonio; Cerulli, Giovanni |
author_sort | Zinilli, Antonio |
collection | PubMed |
description | We propose a supervised machine learning approach to predict partnership formation between universities. We focus on successful joint R&D projects funded by the Horizon 2020 programme in three research domains: Social Sciences and Humanities, Physical and Engineering Sciences, and Life Sciences. We perform two related analyses: link formation prediction and feature importance detection. In predicting link formation, we consider two settings: one including all features, both exogenous (pertaining to the node) and endogenous (pertaining to the network); and one including only exogenous features (thus removing the network attributes of the nodes). Using out-of-sample cross-validated accuracy, we obtain 91% prediction accuracy when both types of attributes are used, and around 67% when using only the exogenous ones. This indicates that partnership predictive power is on average about 24 percentage points higher for universities already incumbent in the programme than for newcomers (for which network attributes are unavailable). As for feature importance, by computing super-learner average partial effects and elasticities, we find that the endogenous attributes are the most relevant in affecting the probability of generating a link, and we observe a largely negative elasticity of the link probability to feature changes, fairly uniform across attributes and domains. |
format | Online Article Text |
id | pubmed-10688692 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-10688692 2023-12-01 Link prediction and feature relevance in knowledge networks: A machine learning approach Zinilli, Antonio; Cerulli, Giovanni PLoS One Research Article Public Library of Science 2023-11-30 /pmc/articles/PMC10688692/ /pubmed/38032965 http://dx.doi.org/10.1371/journal.pone.0290018 Text en © 2023 Zinilli, Cerulli https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Zinilli, Antonio Cerulli, Giovanni Link prediction and feature relevance in knowledge networks: A machine learning approach |
title | Link prediction and feature relevance in knowledge networks: A machine learning approach |
title_sort | link prediction and feature relevance in knowledge networks: a machine learning approach |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10688692/ https://www.ncbi.nlm.nih.gov/pubmed/38032965 http://dx.doi.org/10.1371/journal.pone.0290018 |
work_keys_str_mv | AT zinilliantonio linkpredictionandfeaturerelevanceinknowledgenetworksamachinelearningapproach AT cerulligiovanni linkpredictionandfeaturerelevanceinknowledgenetworksamachinelearningapproach |
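
The description field above mentions average partial effects and elasticities of the link probability. A minimal sketch of how these two quantities can be approximated by finite differences is given below. It assumes a made-up logistic model (`COEF`, `link_probability`) as a stand-in for a trained predictor; it is not the super-learner used in the article, and the feature names and sample values are hypothetical. The last two lines only restate the arithmetic behind the reported accuracies, treating the "around 67%" figure as exactly 67.

```python
import math

# Hypothetical fitted model: a logistic function with made-up coefficients,
# standing in for any trained link-probability predictor.
COEF = {"intercept": -1.0, "common_neighbours": 0.8, "size_gap": -0.5}


def link_probability(x):
    """Predicted probability of a link for one feature dictionary."""
    z = COEF["intercept"] + sum(COEF[k] * x[k] for k in x)
    return 1.0 / (1.0 + math.exp(-z))


def average_partial_effect(data, feature, h=1e-4):
    """Mean finite-difference derivative of the predicted probability."""
    effects = []
    for x in data:
        bumped = dict(x, **{feature: x[feature] + h})
        effects.append((link_probability(bumped) - link_probability(x)) / h)
    return sum(effects) / len(effects)


def average_elasticity(data, feature, h=1e-4):
    """Mean of (dp/dx) * (x / p): % change in p per 1% change in x."""
    values = []
    for x in data:
        bumped = dict(x, **{feature: x[feature] + h})
        p = link_probability(x)
        slope = (link_probability(bumped) - p) / h
        values.append(slope * x[feature] / p)
    return sum(values) / len(values)


# Hypothetical (rescaled) observations.
sample = [
    {"common_neighbours": 0.0, "size_gap": 1.5},
    {"common_neighbours": 1.0, "size_gap": 0.7},
    {"common_neighbours": 3.0, "size_gap": 0.2},
]

ape = average_partial_effect(sample, "common_neighbours")
elas = average_elasticity(sample, "size_gap")

# Accuracy gap from the abstract: percentage points versus relative gain.
gap_points = 91 - 67              # 24 percentage points
relative_gain = gap_points / 67   # roughly 0.36, i.e. about 36% in relative terms
```

With these invented coefficients the partial effect of `common_neighbours` is positive and the elasticity with respect to `size_gap` is negative. The signs follow from the chosen coefficients and say nothing about the article's estimates.
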