The Language of Innovation
Predicting innovation is a peculiar problem in data science. Following its definition, an innovation is always a never-seen-before event, leaving no room for traditional supervised learning approaches. Here we propose a strategy to address the problem in the context of innovative patents, by definin...
Main authors: | Tacchella, Andrea; Napoletano, Andrea; Pietronero, Luciano |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2020 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7194493/ https://www.ncbi.nlm.nih.gov/pubmed/32352986 http://dx.doi.org/10.1371/journal.pone.0230107 |
_version_ | 1783528349481566208 |
---|---|
author | Tacchella, Andrea Napoletano, Andrea Pietronero, Luciano |
author_facet | Tacchella, Andrea Napoletano, Andrea Pietronero, Luciano |
author_sort | Tacchella, Andrea |
collection | PubMed |
description | Predicting innovation is a peculiar problem in data science. Following its definition, an innovation is always a never-seen-before event, leaving no room for traditional supervised learning approaches. Here we propose a strategy to address the problem in the context of innovative patents, by defining innovations as never-seen-before associations of technologies and exploiting self-supervised learning techniques. We think of technological codes present in patents as a vocabulary and the whole technological corpus as written in a specific, evolving language. We leverage such structure with techniques borrowed from Natural Language Processing by embedding technologies in a high-dimensional Euclidean space where relative positions are representative of learned semantics. Proximity in this space is an effective predictor of specific innovation events and outperforms a wide range of standard link-prediction metrics. The success of patented innovations follows complex dynamics characterized by different patterns, which we analyze in detail with specific examples. The methods proposed in this paper provide a completely new way of understanding and forecasting innovation, by tackling it from a revealing perspective and opening interesting scenarios for a number of applications and further analytic approaches. |
format | Online Article Text |
id | pubmed-7194493 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-7194493 2020-05-12 The Language of Innovation Tacchella, Andrea Napoletano, Andrea Pietronero, Luciano PLoS One Research Article Predicting innovation is a peculiar problem in data science. Following its definition, an innovation is always a never-seen-before event, leaving no room for traditional supervised learning approaches. Here we propose a strategy to address the problem in the context of innovative patents, by defining innovations as never-seen-before associations of technologies and exploiting self-supervised learning techniques. We think of technological codes present in patents as a vocabulary and the whole technological corpus as written in a specific, evolving language. We leverage such structure with techniques borrowed from Natural Language Processing by embedding technologies in a high-dimensional Euclidean space where relative positions are representative of learned semantics. Proximity in this space is an effective predictor of specific innovation events and outperforms a wide range of standard link-prediction metrics. The success of patented innovations follows complex dynamics characterized by different patterns, which we analyze in detail with specific examples. The methods proposed in this paper provide a completely new way of understanding and forecasting innovation, by tackling it from a revealing perspective and opening interesting scenarios for a number of applications and further analytic approaches. Public Library of Science 2020-04-30 /pmc/articles/PMC7194493/ /pubmed/32352986 http://dx.doi.org/10.1371/journal.pone.0230107 Text en © 2020 Tacchella et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Tacchella, Andrea Napoletano, Andrea Pietronero, Luciano The Language of Innovation |
title | The Language of Innovation |
title_full | The Language of Innovation |
title_fullStr | The Language of Innovation |
title_full_unstemmed | The Language of Innovation |
title_short | The Language of Innovation |
title_sort | language of innovation |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7194493/ https://www.ncbi.nlm.nih.gov/pubmed/32352986 http://dx.doi.org/10.1371/journal.pone.0230107 |
work_keys_str_mv | AT tacchellaandrea thelanguageofinnovation AT napoletanoandrea thelanguageofinnovation AT pietroneroluciano thelanguageofinnovation AT tacchellaandrea languageofinnovation AT napoletanoandrea languageofinnovation AT pietroneroluciano languageofinnovation |
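
The description field in this record frames innovation prediction as proximity between technology codes in an embedding space. The following is a minimal sketch of that idea, not the authors' implementation: the technology codes, the 3-dimensional vectors, the `seen_pairs` set, and the helper names `cosine` and `rank_unseen_pairs` are all hypothetical, and cosine similarity is an assumed choice of proximity measure that the record does not specify.

```python
import math
from itertools import combinations

# Hypothetical toy embeddings for made-up technology codes (illustrative only;
# a real model would learn high-dimensional vectors from a patent corpus).
embeddings = {
    "A01B": [0.9, 0.1, 0.0],
    "A01C": [0.8, 0.2, 0.1],
    "H04L": [0.0, 0.9, 0.4],
    "G06F": [0.1, 0.8, 0.5],
}

# Assumed toy data: code pairs that have already appeared together in a patent.
seen_pairs = {frozenset(("A01B", "A01C"))}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def rank_unseen_pairs(embeddings, seen_pairs):
    """Return never-seen-before code pairs, most similar first."""
    candidates = [
        (a, b)
        for a, b in combinations(sorted(embeddings), 2)
        if frozenset((a, b)) not in seen_pairs
    ]
    return sorted(
        candidates,
        key=lambda pair: cosine(embeddings[pair[0]], embeddings[pair[1]]),
        reverse=True,
    )


ranked = rank_unseen_pairs(embeddings, seen_pairs)
for a, b in ranked:
    print(a, b, round(cosine(embeddings[a], embeddings[b]), 3))
```

In this sketch, pairs near the top of `ranked` are the candidate associations that the proximity heuristic would flag as most likely to appear next; evaluating such a ranking against later patents is outside the scope of the example.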