Test collections for electronic health record-based clinical information retrieval
OBJECTIVES: To create test collections for evaluating clinical information retrieval (IR) systems and advancing clinical IR research. MATERIALS AND METHODS: Electronic health record (EHR) data, including structured and free-text data, from 45 000 patients who are a part of the Mayo Clinic Biobank co...
Main Authors: | Wang, Yanshan; Wen, Andrew; Liu, Sijia; Hersh, William; Bedrick, Steven; Liu, Hongfang |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2019 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6824517/ https://www.ncbi.nlm.nih.gov/pubmed/31709390 http://dx.doi.org/10.1093/jamiaopen/ooz016 |
_version_ | 1783464747689050112 |
---|---|
author | Wang, Yanshan Wen, Andrew Liu, Sijia Hersh, William Bedrick, Steven Liu, Hongfang |
author_facet | Wang, Yanshan Wen, Andrew Liu, Sijia Hersh, William Bedrick, Steven Liu, Hongfang |
author_sort | Wang, Yanshan |
collection | PubMed |
description | OBJECTIVES: To create test collections for evaluating clinical information retrieval (IR) systems and advancing clinical IR research. MATERIALS AND METHODS: Electronic health record (EHR) data, including structured and free-text data, from 45 000 patients who are a part of the Mayo Clinic Biobank cohort was retrieved from the clinical data warehouse. The clinical IR system indexed a total of 42 million free-text EHR documents. The search queries consisted of 56 topics developed through a collaboration between Mayo Clinic and Oregon Health & Science University. We described the creation of test collections, including a to-be-evaluated document pool using five retrieval models, and human assessment guidelines. We analyzed the relevance judgment results in terms of human agreement and time spent, and results of three levels of relevance, and reported performance of five retrieval models. RESULTS: The two judges had a moderate overall agreement with a Kappa value of 0.49, spent a consistent amount of time judging the relevance, and were able to identify easy and difficult topics. The conventional retrieval model performed best on most topics while a concept-based retrieval model had better performance on the topics requiring conceptual level retrieval. DISCUSSION: IR can provide an alternate approach to leveraging clinical narratives for patient information discovery as it is less dependent on semantics. Our study showed the feasibility of test collections along with a few challenges. CONCLUSION: The conventional test collections for evaluating the IR system show potential for successfully evaluating clinical IR systems with a few challenges to be investigated. |
format | Online Article Text |
id | pubmed-6824517 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-68245172019-11-06 Test collections for electronic health record-based clinical information retrieval Wang, Yanshan Wen, Andrew Liu, Sijia Hersh, William Bedrick, Steven Liu, Hongfang JAMIA Open Research and Applications OBJECTIVES: To create test collections for evaluating clinical information retrieval (IR) systems and advancing clinical IR research. MATERIALS AND METHODS: Electronic health record (EHR) data, including structured and free-text data, from 45 000 patients who are a part of the Mayo Clinic Biobank cohort was retrieved from the clinical data warehouse. The clinical IR system indexed a total of 42 million free-text EHR documents. The search queries consisted of 56 topics developed through a collaboration between Mayo Clinic and Oregon Health & Science University. We described the creation of test collections, including a to-be-evaluated document pool using five retrieval models, and human assessment guidelines. We analyzed the relevance judgment results in terms of human agreement and time spent, and results of three levels of relevance, and reported performance of five retrieval models. RESULTS: The two judges had a moderate overall agreement with a Kappa value of 0.49, spent a consistent amount of time judging the relevance, and were able to identify easy and difficult topics. The conventional retrieval model performed best on most topics while a concept-based retrieval model had better performance on the topics requiring conceptual level retrieval. DISCUSSION: IR can provide an alternate approach to leveraging clinical narratives for patient information discovery as it is less dependent on semantics. Our study showed the feasibility of test collections along with a few challenges. CONCLUSION: The conventional test collections for evaluating the IR system show potential for successfully evaluating clinical IR systems with a few challenges to be investigated. 
Oxford University Press 2019-06-04 /pmc/articles/PMC6824517/ /pubmed/31709390 http://dx.doi.org/10.1093/jamiaopen/ooz016 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of the American Medical Informatics Association. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Research and Applications Wang, Yanshan Wen, Andrew Liu, Sijia Hersh, William Bedrick, Steven Liu, Hongfang Test collections for electronic health record-based clinical information retrieval |
title | Test collections for electronic health record-based clinical information retrieval |
title_full | Test collections for electronic health record-based clinical information retrieval |
title_fullStr | Test collections for electronic health record-based clinical information retrieval |
title_full_unstemmed | Test collections for electronic health record-based clinical information retrieval |
title_short | Test collections for electronic health record-based clinical information retrieval |
title_sort | test collections for electronic health record-based clinical information retrieval |
topic | Research and Applications |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6824517/ https://www.ncbi.nlm.nih.gov/pubmed/31709390 http://dx.doi.org/10.1093/jamiaopen/ooz016 |
work_keys_str_mv | AT wangyanshan testcollectionsforelectronichealthrecordbasedclinicalinformationretrieval AT wenandrew testcollectionsforelectronichealthrecordbasedclinicalinformationretrieval AT liusijia testcollectionsforelectronichealthrecordbasedclinicalinformationretrieval AT hershwilliam testcollectionsforelectronichealthrecordbasedclinicalinformationretrieval AT bedricksteven testcollectionsforelectronichealthrecordbasedclinicalinformationretrieval AT liuhongfang testcollectionsforelectronichealthrecordbasedclinicalinformationretrieval |
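
The description field above reports a Kappa value of 0.49 between the two judges across three levels of relevance. As a rough illustration of how such an agreement score can be computed, here is a minimal sketch of Cohen's kappa. It assumes the reported statistic is Cohen's kappa (the record only says "Kappa"), and the label coding and the judgment lists are hypothetical examples, not data from the study.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each rater's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical judgments (assumed coding):
# 0 = not relevant, 1 = partially relevant, 2 = relevant
judge_1 = [2, 2, 1, 0, 0, 1, 2, 0, 1, 0]
judge_2 = [2, 1, 1, 0, 0, 2, 2, 0, 0, 0]

kappa = cohen_kappa(judge_1, judge_2)
print(round(kappa, 3))
```

Kappa corrects raw agreement for the agreement expected by chance, so two judges who agree on 70% of documents can still score well below 0.7 when a few labels dominate.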