
The Language of Innovation

Predicting innovation is a peculiar problem in data science. Following its definition, an innovation is always a never-seen-before event, leaving no room for traditional supervised learning approaches. Here we propose a strategy to address the problem in the context of innovative patents, by definin...


Bibliographic Details
Main authors: Tacchella, Andrea; Napoletano, Andrea; Pietronero, Luciano
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects: Research Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7194493/
https://www.ncbi.nlm.nih.gov/pubmed/32352986
http://dx.doi.org/10.1371/journal.pone.0230107
collection PubMed
description Predicting innovation is a peculiar problem in data science. By definition, an innovation is always a never-seen-before event, leaving no room for traditional supervised learning approaches. Here we propose a strategy to address the problem in the context of innovative patents, by defining innovations as never-seen-before associations of technologies and exploiting self-supervised learning techniques. We think of the technological codes present in patents as a vocabulary and of the whole technological corpus as written in a specific, evolving language. We leverage this structure with techniques borrowed from Natural Language Processing, embedding technologies in a high-dimensional Euclidean space where relative positions are representative of learned semantics. Proximity in this space is an effective predictor of specific innovation events and outperforms a wide range of standard link-prediction metrics. The success of patented innovations follows complex dynamics characterized by different patterns, which we analyze in detail with specific examples. The methods proposed in this paper provide a completely new way of understanding and forecasting innovation, by tackling it from a revealing perspective and opening interesting scenarios for a number of applications and further analytic approaches.
id pubmed-7194493
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-7194493 2020-05-12 The Language of Innovation. Tacchella, Andrea; Napoletano, Andrea; Pietronero, Luciano. PLoS One. Research Article. Public Library of Science 2020-04-30. /pmc/articles/PMC7194493/ /pubmed/32352986 http://dx.doi.org/10.1371/journal.pone.0230107 Text en © 2020 Tacchella et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
topic Research Article