Test collections for electronic health record-based clinical information retrieval

OBJECTIVES: To create test collections for evaluating clinical information retrieval (IR) systems and advancing clinical IR research. MATERIALS AND METHODS: Electronic health record (EHR) data, including structured and free-text data, from 45 000 patients who are a part of the Mayo Clinic Biobank co...

Bibliographic Details
Main authors:	Wang, Yanshan; Wen, Andrew; Liu, Sijia; Hersh, William; Bedrick, Steven; Liu, Hongfang
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2019
Subjects:	Research and Applications
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6824517/ https://www.ncbi.nlm.nih.gov/pubmed/31709390 http://dx.doi.org/10.1093/jamiaopen/ooz016

_version_	1783464747689050112
author	Wang, Yanshan; Wen, Andrew; Liu, Sijia; Hersh, William; Bedrick, Steven; Liu, Hongfang
author_sort	Wang, Yanshan
collection	PubMed
description	OBJECTIVES: To create test collections for evaluating clinical information retrieval (IR) systems and advancing clinical IR research. MATERIALS AND METHODS: Electronic health record (EHR) data, including structured and free-text data, from 45 000 patients who are a part of the Mayo Clinic Biobank cohort was retrieved from the clinical data warehouse. The clinical IR system indexed a total of 42 million free-text EHR documents. The search queries consisted of 56 topics developed through a collaboration between Mayo Clinic and Oregon Health & Science University. We described the creation of test collections, including a to-be-evaluated document pool using five retrieval models, and human assessment guidelines. We analyzed the relevance judgment results in terms of human agreement and time spent, and results of three levels of relevance, and reported performance of five retrieval models. RESULTS: The two judges had a moderate overall agreement with a Kappa value of 0.49, spent a consistent amount of time judging the relevance, and were able to identify easy and difficult topics. The conventional retrieval model performed best on most topics while a concept-based retrieval model had better performance on the topics requiring conceptual level retrieval. DISCUSSION: IR can provide an alternate approach to leveraging clinical narratives for patient information discovery as it is less dependent on semantics. Our study showed the feasibility of test collections along with a few challenges. CONCLUSION: The conventional test collections for evaluating the IR system show potential for successfully evaluating clinical IR systems with a few challenges to be investigated.
format	Online Article Text
id	pubmed-6824517
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-6824517 2019-11-06 Test collections for electronic health record-based clinical information retrieval Wang, Yanshan; Wen, Andrew; Liu, Sijia; Hersh, William; Bedrick, Steven; Liu, Hongfang JAMIA Open Research and Applications
Oxford University Press 2019-06-04 /pmc/articles/PMC6824517/ /pubmed/31709390 http://dx.doi.org/10.1093/jamiaopen/ooz016 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of the American Medical Informatics Association. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title	Test collections for electronic health record-based clinical information retrieval
topic	Research and Applications
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6824517/ https://www.ncbi.nlm.nih.gov/pubmed/31709390 http://dx.doi.org/10.1093/jamiaopen/ooz016
