Test collections for electronic health record-based clinical information retrieval

OBJECTIVES: To create test collections for evaluating clinical information retrieval (IR) systems and advancing clinical IR research. MATERIALS AND METHODS: Electronic health record (EHR) data, including structured and free-text data, from 45 000 patients who are a part of the Mayo Clinic Biobank co...

Bibliographic Details
Main authors: Wang, Yanshan; Wen, Andrew; Liu, Sijia; Hersh, William; Bedrick, Steven; Liu, Hongfang
Format: Online Article Text
Language: English
Published: Oxford University Press 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6824517/
https://www.ncbi.nlm.nih.gov/pubmed/31709390
http://dx.doi.org/10.1093/jamiaopen/ooz016
author Wang, Yanshan
Wen, Andrew
Liu, Sijia
Hersh, William
Bedrick, Steven
Liu, Hongfang
collection PubMed
description OBJECTIVES: To create test collections for evaluating clinical information retrieval (IR) systems and advancing clinical IR research. MATERIALS AND METHODS: Electronic health record (EHR) data, including structured and free-text data, from 45 000 patients who are a part of the Mayo Clinic Biobank cohort was retrieved from the clinical data warehouse. The clinical IR system indexed a total of 42 million free-text EHR documents. The search queries consisted of 56 topics developed through a collaboration between Mayo Clinic and Oregon Health & Science University. We described the creation of test collections, including a to-be-evaluated document pool using five retrieval models, and human assessment guidelines. We analyzed the relevance judgment results in terms of human agreement and time spent, and results of three levels of relevance, and reported performance of five retrieval models. RESULTS: The two judges had a moderate overall agreement with a Kappa value of 0.49, spent a consistent amount of time judging the relevance, and were able to identify easy and difficult topics. The conventional retrieval model performed best on most topics while a concept-based retrieval model had better performance on the topics requiring conceptual level retrieval. DISCUSSION: IR can provide an alternate approach to leveraging clinical narratives for patient information discovery as it is less dependent on semantics. Our study showed the feasibility of test collections along with a few challenges. CONCLUSION: The conventional test collections for evaluating the IR system show potential for successfully evaluating clinical IR systems with a few challenges to be investigated.
format Online
Article
Text
id pubmed-6824517
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-6824517 2019-11-06 Test collections for electronic health record-based clinical information retrieval Wang, Yanshan Wen, Andrew Liu, Sijia Hersh, William Bedrick, Steven Liu, Hongfang JAMIA Open Research and Applications
Oxford University Press 2019-06-04 /pmc/articles/PMC6824517/ /pubmed/31709390 http://dx.doi.org/10.1093/jamiaopen/ooz016 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of the American Medical Informatics Association. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Test collections for electronic health record-based clinical information retrieval
topic Research and Applications