Link prediction and feature relevance in knowledge networks: A machine learning approach

We propose a supervised machine learning approach to predict partnership formation between universities. We focus on successful joint R&D projects funded by the Horizon 2020 programme in three research domains: Social Sciences and Humanities, Physical and Engineering Sciences, and Life Sciences....

Bibliographic Details
Main Authors: Zinilli, Antonio; Cerulli, Giovanni
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10688692/
https://www.ncbi.nlm.nih.gov/pubmed/38032965
http://dx.doi.org/10.1371/journal.pone.0290018
_version_ 1785152216582062080
author Zinilli, Antonio
Cerulli, Giovanni
collection PubMed
description We propose a supervised machine learning approach to predict partnership formation between universities. We focus on successful joint R&D projects funded by the Horizon 2020 programme in three research domains: Social Sciences and Humanities, Physical and Engineering Sciences, and Life Sciences. We perform two related analyses: link formation prediction, and feature importance detection. In predicting link formation, we consider two settings: one including all features, both exogenous (pertaining to the node) and endogenous (pertaining to the network); and one including only exogenous features (thus removing the network attributes of the nodes). Using out-of-sample cross-validated accuracy, we obtain 91% prediction accuracy when both types of attributes are used, and around 67% when using only the exogenous ones. This indicates that partnership predictive power is on average 24 percentage points higher for universities already incumbent in the programme than for newcomers (for which network attributes are unknown). As for feature importance, by computing super-learner average partial effects and elasticities, we find that the endogenous attributes are the most relevant in affecting the probability of generating a link, and observe a largely negative elasticity of the link probability with respect to feature changes, fairly uniform across attributes and domains.
format Online
Article
Text
id pubmed-10688692
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10688692 2023-12-01 Link prediction and feature relevance in knowledge networks: A machine learning approach Zinilli, Antonio Cerulli, Giovanni PLoS One Research Article
Public Library of Science 2023-11-30 /pmc/articles/PMC10688692/ /pubmed/38032965 http://dx.doi.org/10.1371/journal.pone.0290018 Text en © 2023 Zinilli, Cerulli https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Link prediction and feature relevance in knowledge networks: A machine learning approach
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10688692/
https://www.ncbi.nlm.nih.gov/pubmed/38032965
http://dx.doi.org/10.1371/journal.pone.0290018
work_keys_str_mv AT zinilliantonio linkpredictionandfeaturerelevanceinknowledgenetworksamachinelearningapproach
AT cerulligiovanni linkpredictionandfeaturerelevanceinknowledgenetworksamachinelearningapproach
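
The description field of this record reports two accuracy figures (91% with all features, about 67% with exogenous features only) and a negative elasticity of the link probability. The sketch below is only an illustration of that arithmetic, not the authors' method: it computes the gap between the two accuracies both in percentage points and as a relative gain, and evaluates a point elasticity for a hypothetical one-feature logistic link probability. The function names, the logistic form, and the coefficients (0.5, -0.8) are assumptions made for this example; the paper itself uses a super learner whose details are not given in this record.

```python
import math


def accuracy_gap(acc_full, acc_exogenous):
    """Return (gap in percentage points, relative gain in percent)."""
    gap_pp = (acc_full - acc_exogenous) * 100
    relative_pct = (acc_full - acc_exogenous) / acc_exogenous * 100
    return gap_pp, relative_pct


def link_probability(x, intercept, slope):
    """Hypothetical logistic link probability (a stand-in, not the paper's model)."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))


def elasticity(x, intercept, slope, eps=1e-6):
    """Point elasticity (dP/dx) * (x / P), using a central finite difference."""
    p = link_probability(x, intercept, slope)
    dp_dx = (
        link_probability(x + eps, intercept, slope)
        - link_probability(x - eps, intercept, slope)
    ) / (2 * eps)
    return dp_dx * x / p


# Accuracies taken from the description field of this record.
gap_pp, relative_pct = accuracy_gap(0.91, 0.67)
print(f"gap: {gap_pp:.0f} percentage points, relative gain: {relative_pct:.1f}%")

# Made-up coefficients: a negative slope yields a negative elasticity.
print(f"elasticity at x=2.0: {elasticity(2.0, 0.5, -0.8):.3f}")
```

The first printed line distinguishes the 24-point difference between 91% and 67% from the relative improvement over the exogenous-only setting, which is larger (about 36%). The elasticity value depends entirely on the illustrative coefficients and carries no empirical meaning.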