Semi-Supervised Learning to Identify UMLS Semantic Relations

The UMLS Semantic Network is constructed by experts and requires periodic expert review to update. We propose and implement a semi-supervised approach for automatically identifying UMLS semantic relations from narrative text in PubMed. Our method analyzes biomedical narrative text to collect semanti...

Bibliographic Details
Main Authors: Luo, Yuan; Uzuner, Ozlem
Format: Online Article Text
Language: English
Published: American Medical Informatics Association 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4419772/
https://www.ncbi.nlm.nih.gov/pubmed/25954580
_version_ 1782369639122075648
author Luo, Yuan
Uzuner, Ozlem
author_facet Luo, Yuan
Uzuner, Ozlem
author_sort Luo, Yuan
collection PubMed
description The UMLS Semantic Network is constructed by experts and requires periodic expert review to update. We propose and implement a semi-supervised approach for automatically identifying UMLS semantic relations from narrative text in PubMed. Our method analyzes biomedical narrative text to collect semantic entity pairs, and extracts multiple semantic, syntactic and orthographic features for the collected pairs. We experiment with seeded k-means clustering with various distance metrics. We create and annotate a ground truth corpus according to the top two levels of the UMLS semantic relation hierarchy. We evaluate our system on this corpus and characterize the learning curves of different clustering configurations. KL divergence consistently performs best on the held-out test data. With full seeding, we obtain macro-averaged F-measures above 70% for clustering the top-level UMLS relations (2-way), and above 50% for clustering the second-level relations (7-way).
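The description above mentions seeded k-means clustering with KL divergence as the distance. The following is a minimal sketch of that general technique, not the authors' implementation: it assumes each entity pair is already represented as a probability vector, that seeds are used only to initialize the centroids, and that a small smoothing constant (`eps`) guards against log(0). The names `kl_divergence`, `mean_vector`, `seeded_kmeans`, and the toy data are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) for two discrete distributions; eps is an assumed smoothing term."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def mean_vector(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def seeded_kmeans(points, seeds, n_iter=20):
    """Cluster `points` with k-means, initializing centroids from labeled seeds.

    points: list of probability vectors (hypothetical feature distributions).
    seeds:  dict mapping a cluster label to indices of labeled seed points.
    Seeds only initialize the centroids; their labels may change later.
    """
    labels = sorted(seeds)
    centroids = {c: mean_vector([points[i] for i in seeds[c]]) for c in labels}
    assignment = None
    for _ in range(n_iter):
        # Assign each point to the centroid with the smallest KL(point || centroid).
        new_assignment = [
            min(labels, key=lambda c: kl_divergence(x, centroids[c]))
            for x in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Recompute each centroid as the mean of its current members.
        for c in labels:
            members = [x for x, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = mean_vector(members)
    return assignment, centroids


# Toy data (made up for illustration): six 3-bin distributions, one seed per cluster.
points = [
    [0.80, 0.10, 0.10],
    [0.70, 0.20, 0.10],
    [0.10, 0.10, 0.80],
    [0.20, 0.10, 0.70],
    [0.75, 0.15, 0.10],
    [0.10, 0.20, 0.70],
]
seeds = {"A": [0], "B": [2]}
assignment, centroids = seeded_kmeans(points, seeds)
print(assignment)
```

The arithmetic mean is used for the centroid update because it minimizes the summed KL(point || centroid) over a cluster; a different divergence direction or distance metric would call for a different update rule.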
format Online
Article
Text
id pubmed-4419772
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher American Medical Informatics Association
record_format MEDLINE/PubMed
spelling pubmed-4419772 2015-05-07 Semi-Supervised Learning to Identify UMLS Semantic Relations Luo, Yuan Uzuner, Ozlem AMIA Jt Summits Transl Sci Proc Articles The UMLS Semantic Network is constructed by experts and requires periodic expert review to update. We propose and implement a semi-supervised approach for automatically identifying UMLS semantic relations from narrative text in PubMed. Our method analyzes biomedical narrative text to collect semantic entity pairs, and extracts multiple semantic, syntactic and orthographic features for the collected pairs. We experiment with seeded k-means clustering with various distance metrics. We create and annotate a ground truth corpus according to the top two levels of the UMLS semantic relation hierarchy. We evaluate our system on this corpus and characterize the learning curves of different clustering configurations. KL divergence consistently performs best on the held-out test data. With full seeding, we obtain macro-averaged F-measures above 70% for clustering the top-level UMLS relations (2-way), and above 50% for clustering the second-level relations (7-way). American Medical Informatics Association 2014-04-07 /pmc/articles/PMC4419772/ /pubmed/25954580 Text en ©2014 AMIA - All rights reserved. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose
spellingShingle Articles
Luo, Yuan
Uzuner, Ozlem
Semi-Supervised Learning to Identify UMLS Semantic Relations
title Semi-Supervised Learning to Identify UMLS Semantic Relations
title_full Semi-Supervised Learning to Identify UMLS Semantic Relations
title_fullStr Semi-Supervised Learning to Identify UMLS Semantic Relations
title_full_unstemmed Semi-Supervised Learning to Identify UMLS Semantic Relations
title_short Semi-Supervised Learning to Identify UMLS Semantic Relations
title_sort semi-supervised learning to identify umls semantic relations
topic Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4419772/
https://www.ncbi.nlm.nih.gov/pubmed/25954580
work_keys_str_mv AT luoyuan semisupervisedlearningtoidentifyumlssemanticrelations
AT uzunerozlem semisupervisedlearningtoidentifyumlssemanticrelations