Implementation of a Cohort Retrieval System for Clinical Data Repositories Using the Observational Medical Outcomes Partnership Common Data Model: Proof-of-Concept System Validation

BACKGROUND: Widespread adoption of electronic health records has enabled the secondary use of electronic health record data for clinical research and health care delivery. Natural language processing techniques have shown promise in their capability to extract the information embedded in unstructure...

Bibliographic Details
Main Authors:	Liu, Sijia; Wang, Yanshan; Wen, Andrew; Wang, Liwei; Hong, Na; Shen, Feichen; Bedrick, Steven; Hersh, William; Liu, Hongfang
Format:	Online Article Text
Language:	English
Published:	JMIR Publications 2020
Subjects:	Original Paper
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7576539/ https://www.ncbi.nlm.nih.gov/pubmed/33021486 http://dx.doi.org/10.2196/17376

_version_	1783598035679313920
author	Liu, Sijia; Wang, Yanshan; Wen, Andrew; Wang, Liwei; Hong, Na; Shen, Feichen; Bedrick, Steven; Hersh, William; Liu, Hongfang
author_sort	Liu, Sijia
collection	PubMed
description	BACKGROUND: Widespread adoption of electronic health records has enabled the secondary use of electronic health record data for clinical research and health care delivery. Natural language processing techniques have shown promise in their capability to extract the information embedded in unstructured clinical data, and information retrieval techniques provide flexible and scalable solutions that can augment natural language processing systems for retrieving and ranking relevant records. OBJECTIVE: In this paper, we present the implementation of a cohort retrieval system that can execute textual cohort selection queries on both structured data and unstructured text—Cohort Retrieval Enhanced by Analysis of Text from Electronic Health Records (CREATE). METHODS: CREATE is a proof-of-concept system that leverages a combination of structured queries and information retrieval techniques on natural language processing results to improve cohort retrieval performance using the Observational Medical Outcomes Partnership Common Data Model to enhance model portability. The natural language processing component was used to extract common data model concepts from textual queries. We designed a hierarchical index to support the common data model concept search utilizing information retrieval techniques and frameworks. RESULTS: Our case study on 5 cohort identification queries, evaluated using the precision at 5 information retrieval metric at both the patient-level and document-level, demonstrates that CREATE achieves a mean precision at 5 of 0.90, which outperforms systems using only structured data or only unstructured text with mean precision at 5 values of 0.54 and 0.74, respectively. CONCLUSIONS: The implementation and evaluation of Mayo Clinic Biobank data demonstrated that CREATE outperforms cohort retrieval systems that only use one of either structured data or unstructured text in complex textual cohort queries.
format	Online Article Text
id	pubmed-7576539
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	JMIR Publications
record_format	MEDLINE/PubMed
spelling	pubmed-7576539 2020-10-27 Implementation of a Cohort Retrieval System for Clinical Data Repositories Using the Observational Medical Outcomes Partnership Common Data Model: Proof-of-Concept System Validation Liu, Sijia; Wang, Yanshan; Wen, Andrew; Wang, Liwei; Hong, Na; Shen, Feichen; Bedrick, Steven; Hersh, William; Liu, Hongfang JMIR Med Inform Original Paper JMIR Publications 2020-10-06 /pmc/articles/PMC7576539/ /pubmed/33021486 http://dx.doi.org/10.2196/17376 Text en ©Sijia Liu, Yanshan Wang, Andrew Wen, Liwei Wang, Na Hong, Feichen Shen, Steven Bedrick, William Hersh, Hongfang Liu. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 06.10.2020. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.
title	Implementation of a Cohort Retrieval System for Clinical Data Repositories Using the Observational Medical Outcomes Partnership Common Data Model: Proof-of-Concept System Validation
topic	Original Paper
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7576539/ https://www.ncbi.nlm.nih.gov/pubmed/33021486 http://dx.doi.org/10.2196/17376
