
Automated Patent Categorization and Guided Patent Search using IPC as Inspired by MeSH and PubMed



Bibliographic Details
Main authors: Eisinger, Daniel; Tsatsaronis, George; Bundschus, Markus; Wieneke, Ulrich; Schroeder, Michael
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3632996/
https://www.ncbi.nlm.nih.gov/pubmed/23734562
http://dx.doi.org/10.1186/2041-1480-4-S1-S3
author Eisinger, Daniel
Tsatsaronis, George
Bundschus, Markus
Wieneke, Ulrich
Schroeder, Michael
author_sort Eisinger, Daniel
collection PubMed
description Document search on PubMed, the pre-eminent database for biomedical literature, relies on the annotation of its documents with relevant terms from the Medical Subject Headings ontology (MeSH) for improving recall through query expansion. Patent documents are another important information source, though they are considerably less accessible. One option to expand patent search beyond pure keywords is the inclusion of classification information: Since every patent is assigned at least one class code, it should be possible for these assignments to be automatically used in a similar way as the MeSH annotations in PubMed. In order to develop a system for this task, it is necessary to have a good understanding of the properties of both classification systems. This report describes our comparative analysis of MeSH and the main patent classification system, the International Patent Classification (IPC). We investigate the hierarchical structures as well as the properties of the terms/classes respectively, and we compare the assignment of IPC codes to patents with the annotation of PubMed documents with MeSH terms. Our analysis shows a strong structural similarity of the hierarchies, but significant differences of terms and annotations. The low number of IPC class assignments and the lack of occurrences of class labels in patent texts imply that current patent search is severely limited. To overcome these limits, we evaluate a method for the automated assignment of additional classes to patent documents, and we propose a system for guided patent search based on the use of class co-occurrence information and external resources.
format Online
Article
Text
id pubmed-3632996
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3632996 2013-04-25 Automated Patent Categorization and Guided Patent Search using IPC as Inspired by MeSH and PubMed. Eisinger, Daniel; Tsatsaronis, George; Bundschus, Markus; Wieneke, Ulrich; Schroeder, Michael. J Biomed Semantics. Proceedings.
BioMed Central 2013-04-15 /pmc/articles/PMC3632996/ /pubmed/23734562 http://dx.doi.org/10.1186/2041-1480-4-S1-S3 Text en Copyright © 2013 Eisinger et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Automated Patent Categorization and Guided Patent Search using IPC as Inspired by MeSH and PubMed
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3632996/
https://www.ncbi.nlm.nih.gov/pubmed/23734562
http://dx.doi.org/10.1186/2041-1480-4-S1-S3