
Care episode retrieval: distributional semantic models for information retrieval in the clinical domain

Patients' health-related information is stored in electronic health records (EHRs) by health service providers. These records include sequential documentation of care episodes in the form of clinical notes. EHRs are used throughout the health care sector by professionals, administrators and pat...


Bibliographic Details
Main authors: Moen, Hans; Ginter, Filip; Marsi, Erwin; Peltonen, Laura-Maria; Salakoski, Tapio; Salanterä, Sanna
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4474584/
https://www.ncbi.nlm.nih.gov/pubmed/26099735
http://dx.doi.org/10.1186/1472-6947-15-S2-S2
author Moen, Hans
Ginter, Filip
Marsi, Erwin
Peltonen, Laura-Maria
Salakoski, Tapio
Salanterä, Sanna
collection PubMed
description Patients' health-related information is stored in electronic health records (EHRs) by health service providers. These records include sequential documentation of care episodes in the form of clinical notes. EHRs are used throughout the health care sector by professionals, administrators and patients, primarily for clinical purposes, but also for secondary purposes such as decision support and research. The vast amounts of information in EHR systems complicate information management and increase the risk of information overload. Therefore, clinicians and researchers need new tools to manage the information stored in the EHRs. A common use case is, given a (possibly unfinished) care episode, to retrieve the most similar care episodes among the records. This paper presents several methods for information retrieval, focusing on care episode retrieval, based on textual similarity, where similarity is measured through domain-specific modelling of the distributional semantics of words. Models include variants of random indexing and the semantic neural network model word2vec. Two novel methods are introduced that utilize the ICD-10 codes attached to care episodes to better induce domain-specificity in the semantic model. We report on experimental evaluation of care episode retrieval that circumvents the lack of human judgements regarding episode relevance. Results suggest that several of the methods proposed outperform a state-of-the-art search engine (Lucene) on the retrieval task.
format Online
Article
Text
id pubmed-4474584
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-44745842015-06-25 Care episode retrieval: distributional semantic models for information retrieval in the clinical domain Moen, Hans Ginter, Filip Marsi, Erwin Peltonen, Laura-Maria Salakoski, Tapio Salanterä, Sanna BMC Med Inform Decis Mak Proceedings Patients' health related information is stored in electronic health records (EHRs) by health service providers. These records include sequential documentation of care episodes in the form of clinical notes. EHRs are used throughout the health care sector by professionals, administrators and patients, primarily for clinical purposes, but also for secondary purposes such as decision support and research. The vast amounts of information in EHR systems complicate information management and increase the risk of information overload. Therefore, clinicians and researchers need new tools to manage the information stored in the EHRs. A common use case is, given a - possibly unfinished - care episode, to retrieve the most similar care episodes among the records. This paper presents several methods for information retrieval, focusing on care episode retrieval, based on textual similarity, where similarity is measured through domain-specific modelling of the distributional semantics of words. Models include variants of random indexing and the semantic neural network model word2vec. Two novel methods are introduced that utilize the ICD-10 codes attached to care episodes to better induce domain-specificity in the semantic model. We report on experimental evaluation of care episode retrieval that circumvents the lack of human judgements regarding episode relevance. Results suggest that several of the methods proposed outperform a state-of-the art search engine (Lucene) on the retrieval task. BioMed Central 2015-06-15 /pmc/articles/PMC4474584/ /pubmed/26099735 http://dx.doi.org/10.1186/1472-6947-15-S2-S2 Text en Copyright © 2015 Moen et al.; licensee BioMed Central Ltd. 
http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Care episode retrieval: distributional semantic models for information retrieval in the clinical domain
topic Proceedings