
Disambiguation of patent inventors and assignees using high-resolution geolocation data

Patent data represent a significant source of information on innovation, knowledge production, and the evolution of technology through networks of citations, co-invention and co-assignment. A major obstacle to extracting useful information from this data is the problem of name disambiguation: linkin...


Bibliographic Details
Main Authors: Morrison, Greg; Riccaboni, Massimo; Pammolli, Fabio
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5433392/
https://www.ncbi.nlm.nih.gov/pubmed/28509897
http://dx.doi.org/10.1038/sdata.2017.64
author Morrison, Greg
Riccaboni, Massimo
Pammolli, Fabio
collection PubMed
description Patent data represent a significant source of information on innovation, knowledge production, and the evolution of technology through networks of citations, co-invention and co-assignment. A major obstacle to extracting useful information from this data is the problem of name disambiguation: linking alternate spellings of individuals or institutions to a single identifier to uniquely determine the parties involved in knowledge production and diffusion. In this paper, we describe a new algorithm that uses high-resolution geolocation to disambiguate both inventors and assignees on about 8.5 million patents found in the European Patent Office (EPO), under the Patent Cooperation Treaty (PCT), and in the US Patent and Trademark Office (USPTO). We show this disambiguation is consistent with a number of ground-truth benchmarks of both assignees and inventors, significantly outperforming the use of undisambiguated names to identify unique entities. A significant benefit of this work is the high quality assignee disambiguation with coverage across the world coupled with an inventor disambiguation (that is competitive with other state of the art approaches) in multiple patent offices.
format Online
Article
Text
id pubmed-5433392
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-5433392 2017-05-19 Sci Data Data Descriptor Nature Publishing Group 2017-05-16 /pmc/articles/PMC5433392/ /pubmed/28509897 http://dx.doi.org/10.1038/sdata.2017.64 Text en Copyright © 2017, The Author(s) http://creativecommons.org/licenses/by/4.0 This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files made available in this article.
title Disambiguation of patent inventors and assignees using high-resolution geolocation data
topic Data Descriptor
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5433392/
https://www.ncbi.nlm.nih.gov/pubmed/28509897
http://dx.doi.org/10.1038/sdata.2017.64