Disambiguation of patent inventors and assignees using high-resolution geolocation data
Patent data represent a significant source of information on innovation, knowledge production, and the evolution of technology through networks of citations, co-invention and co-assignment. A major obstacle to extracting useful information from this data is the problem of name disambiguation: linkin...
Main authors: | Morrison, Greg; Riccaboni, Massimo; Pammolli, Fabio |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group, 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5433392/ https://www.ncbi.nlm.nih.gov/pubmed/28509897 http://dx.doi.org/10.1038/sdata.2017.64 |
_version_ | 1783236843804819456 |
---|---|
author | Morrison, Greg; Riccaboni, Massimo; Pammolli, Fabio |
author_facet | Morrison, Greg; Riccaboni, Massimo; Pammolli, Fabio |
author_sort | Morrison, Greg |
collection | PubMed |
description | Patent data represent a significant source of information on innovation, knowledge production, and the evolution of technology through networks of citations, co-invention and co-assignment. A major obstacle to extracting useful information from this data is the problem of name disambiguation: linking alternate spellings of individuals or institutions to a single identifier to uniquely determine the parties involved in knowledge production and diffusion. In this paper, we describe a new algorithm that uses high-resolution geolocation to disambiguate both inventors and assignees on about 8.5 million patents found in the European Patent Office (EPO), under the Patent Cooperation Treaty (PCT), and in the US Patent and Trademark Office (USPTO). We show this disambiguation is consistent with a number of ground-truth benchmarks of both assignees and inventors, significantly outperforming the use of undisambiguated names to identify unique entities. A significant benefit of this work is the high quality assignee disambiguation with coverage across the world coupled with an inventor disambiguation (that is competitive with other state of the art approaches) in multiple patent offices. |
format | Online Article Text |
id | pubmed-5433392 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Nature Publishing Group |
record_format | MEDLINE/PubMed |
spelling | pubmed-54333922017-05-19 Disambiguation of patent inventors and assignees using high-resolution geolocation data Morrison, Greg Riccaboni, Massimo Pammolli, Fabio Sci Data Data Descriptor Patent data represent a significant source of information on innovation, knowledge production, and the evolution of technology through networks of citations, co-invention and co-assignment. A major obstacle to extracting useful information from this data is the problem of name disambiguation: linking alternate spellings of individuals or institutions to a single identifier to uniquely determine the parties involved in knowledge production and diffusion. In this paper, we describe a new algorithm that uses high-resolution geolocation to disambiguate both inventors and assignees on about 8.5 million patents found in the European Patent Office (EPO), under the Patent Cooperation Treaty (PCT), and in the US Patent and Trademark Office (USPTO). We show this disambiguation is consistent with a number of ground-truth benchmarks of both assignees and inventors, significantly outperforming the use of undisambiguated names to identify unique entities. A significant benefit of this work is the high quality assignee disambiguation with coverage across the world coupled with an inventor disambiguation (that is competitive with other state of the art approaches) in multiple patent offices. Nature Publishing Group 2017-05-16 /pmc/articles/PMC5433392/ /pubmed/28509897 http://dx.doi.org/10.1038/sdata.2017.64 Text en Copyright © 2017, The Author(s) http://creativecommons.org/licenses/by/4.0 This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. 
The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files made available in this article. |
spellingShingle | Data Descriptor Morrison, Greg Riccaboni, Massimo Pammolli, Fabio Disambiguation of patent inventors and assignees using high-resolution geolocation data |
title | Disambiguation of patent inventors and assignees using high-resolution geolocation data |
title_full | Disambiguation of patent inventors and assignees using high-resolution geolocation data |
title_fullStr | Disambiguation of patent inventors and assignees using high-resolution geolocation data |
title_full_unstemmed | Disambiguation of patent inventors and assignees using high-resolution geolocation data |
title_short | Disambiguation of patent inventors and assignees using high-resolution geolocation data |
title_sort | disambiguation of patent inventors and assignees using high-resolution geolocation data |
topic | Data Descriptor |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5433392/ https://www.ncbi.nlm.nih.gov/pubmed/28509897 http://dx.doi.org/10.1038/sdata.2017.64 |
work_keys_str_mv | AT morrisongreg disambiguationofpatentinventorsandassigneesusinghighresolutiongeolocationdata AT riccabonimassimo disambiguationofpatentinventorsandassigneesusinghighresolutiongeolocationdata AT pammollifabio disambiguationofpatentinventorsandassigneesusinghighresolutiongeolocationdata |