Towards a Semantic Lexicon for Biological Language Processing

This paper explores the use of the resources in the National Library of Medicine's Unified Medical Language System (UMLS) for the construction of a lexicon useful for processing texts in the field of molecular biology. A lexicon is constructed from overlapping terms in the UMLS SPECIALIST lexic...

Bibliographic Details
Main author: Verspoor, Karin
Format: Text
Language: English
Published: Hindawi Publishing Corporation 2005
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2448606/
https://www.ncbi.nlm.nih.gov/pubmed/18629302
http://dx.doi.org/10.1002/cfg.451
_version_ 1782157181431316480
author Verspoor, Karin
author_facet Verspoor, Karin
author_sort Verspoor, Karin
collection PubMed
description This paper explores the use of the resources in the National Library of Medicine's Unified Medical Language System (UMLS) for the construction of a lexicon useful for processing texts in the field of molecular biology. A lexicon is constructed from overlapping terms in the UMLS SPECIALIST lexicon and the UMLS Metathesaurus to obtain both morphosyntactic and semantic information for terms, and the coverage of a domain corpus is assessed. Over 77% of tokens in the domain corpus are found in the constructed lexicon, validating the lexicon's coverage of the most frequent terms in the domain and indicating that the constructed lexicon is potentially an important resource for biological text processing.
format Text
id pubmed-2448606
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-2448606 2008-07-14 Towards a Semantic Lexicon for Biological Language Processing Verspoor, Karin Comp Funct Genomics Research Article This paper explores the use of the resources in the National Library of Medicine's Unified Medical Language System (UMLS) for the construction of a lexicon useful for processing texts in the field of molecular biology. A lexicon is constructed from overlapping terms in the UMLS SPECIALIST lexicon and the UMLS Metathesaurus to obtain both morphosyntactic and semantic information for terms, and the coverage of a domain corpus is assessed. Over 77% of tokens in the domain corpus are found in the constructed lexicon, validating the lexicon's coverage of the most frequent terms in the domain and indicating that the constructed lexicon is potentially an important resource for biological text processing. Hindawi Publishing Corporation 2005 /pmc/articles/PMC2448606/ /pubmed/18629302 http://dx.doi.org/10.1002/cfg.451 Text en Copyright © 2005 Hindawi Publishing Corporation. http://creativecommons.org/licenses/by/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Verspoor, Karin
Towards a Semantic Lexicon for Biological Language Processing
title Towards a Semantic Lexicon for Biological Language Processing
title_full Towards a Semantic Lexicon for Biological Language Processing
title_fullStr Towards a Semantic Lexicon for Biological Language Processing
title_full_unstemmed Towards a Semantic Lexicon for Biological Language Processing
title_short Towards a Semantic Lexicon for Biological Language Processing
title_sort towards a semantic lexicon for biological language processing
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2448606/
https://www.ncbi.nlm.nih.gov/pubmed/18629302
http://dx.doi.org/10.1002/cfg.451
work_keys_str_mv AT verspoorkarin towardsasemanticlexiconforbiologicallanguageprocessing