
A Framework (SOCRATex) for Hierarchical Annotation of Unstructured Electronic Health Records and Integration Into a Standardized Medical Database: Development and Usability Study

BACKGROUND: Although electronic health records (EHRs) have been widely used in secondary assessments, clinical documents are relatively less utilized owing to the lack of standardized clinical text frameworks across different institutions. OBJECTIVE: This study aimed to develop a framework for proce...

Bibliographic Details
Main Authors: Park, Jimyung, You, Seng Chan, Jeong, Eugene, Weng, Chunhua, Park, Dongsu, Roh, Jin, Lee, Dong Yun, Cheong, Jae Youn, Choi, Jin Wook, Kang, Mira, Park, Rae Woong
Format: Online Article Text
Language: English
Published: JMIR Publications 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8044740/
https://www.ncbi.nlm.nih.gov/pubmed/33783361
http://dx.doi.org/10.2196/23983
_version_ 1783678552218009600
author Park, Jimyung
You, Seng Chan
Jeong, Eugene
Weng, Chunhua
Park, Dongsu
Roh, Jin
Lee, Dong Yun
Cheong, Jae Youn
Choi, Jin Wook
Kang, Mira
Park, Rae Woong
author_sort Park, Jimyung
collection PubMed
description BACKGROUND: Although electronic health records (EHRs) have been widely used in secondary assessments, clinical documents are relatively less utilized owing to the lack of standardized clinical text frameworks across different institutions.
OBJECTIVE: This study aimed to develop a framework for processing unstructured clinical documents of EHRs and integration with standardized structured data.
METHODS: We developed a framework known as Staged Optimization of Curation, Regularization, and Annotation of clinical text (SOCRATex). SOCRATex has the following four aspects: (1) extracting clinical notes for the target population and preprocessing the data, (2) defining the annotation schema with a hierarchical structure, (3) performing document-level hierarchical annotation using the annotation schema, and (4) indexing annotations for a search engine system. To test the usability of the proposed framework, proof-of-concept studies were performed on EHRs. We defined three distinctive patient groups and extracted their clinical documents (ie, pathology reports, radiology reports, and admission notes). The documents were annotated and integrated into the Observational Medical Outcomes Partnership (OMOP)-common data model (CDM) database. The annotations were used for creating Cox proportional hazard models with different settings of clinical analyses to measure (1) all-cause mortality, (2) thyroid cancer recurrence, and (3) 30-day hospital readmission.
RESULTS: Overall, 1055 clinical documents of 953 patients were extracted and annotated using the defined annotation schemas. The generated annotations were indexed into an unstructured textual data repository. Using the annotations of pathology reports, we identified that node metastasis and lymphovascular tumor invasion were associated with all-cause mortality among colon and rectum cancer patients (both P=.02). The other analyses involving measuring thyroid cancer recurrence using radiology reports and 30-day hospital readmission using admission notes in depressive disorder patients also showed results consistent with previous findings.
CONCLUSIONS: We propose a framework for hierarchical annotation of textual data and integration into a standardized OMOP-CDM medical database. The proof-of-concept studies demonstrated that our framework can effectively process and integrate diverse clinical documents with standardized structured data for clinical research.
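The annotation and indexing stages listed under METHODS, stages (2) to (4), can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the schema fields (`tumor`, `node_metastasis`, `lymphovascular_invasion`), the function names, the two sample annotations, and the in-memory dictionary standing in for a search engine are hypothetical and are not taken from the SOCRATex implementation.

```python
from typing import Any

# Hypothetical hierarchical annotation schema for a pathology report.
# Leaves name the expected Python type; nested dicts express the hierarchy.
SCHEMA: dict[str, Any] = {
    "note_id": int,
    "tumor": {
        "site": str,
        "node_metastasis": bool,
        "lymphovascular_invasion": bool,
    },
}


def validate(annotation: dict[str, Any], schema: dict[str, Any]) -> bool:
    """Return True if the annotation has the schema's keys and each leaf
    value is an instance of the type named in the schema."""
    if annotation.keys() != schema.keys():
        return False
    for key, expected in schema.items():
        value = annotation[key]
        if isinstance(expected, dict):
            if not isinstance(value, dict) or not validate(value, expected):
                return False
        elif not isinstance(value, expected):
            return False
    return True


def flatten(annotation: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Turn nested keys into dotted paths, e.g. 'tumor.site'."""
    flat: dict[str, Any] = {}
    for key, value in annotation.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def index_annotation(index: dict[int, dict[str, Any]], annotation: dict[str, Any]) -> None:
    """Validate one document-level annotation and store it by note id."""
    if not validate(annotation, SCHEMA):
        raise ValueError("annotation does not match the schema")
    index[annotation["note_id"]] = flatten(annotation)


def search(index: dict[int, dict[str, Any]], field: str, value: Any) -> list[int]:
    """Return the ids of notes whose flattened field equals the value."""
    return sorted(nid for nid, doc in index.items() if doc.get(field) == value)


if __name__ == "__main__":
    index: dict[int, dict[str, Any]] = {}
    # Made-up sample annotations, not data from the study.
    index_annotation(index, {"note_id": 1, "tumor": {
        "site": "colon", "node_metastasis": True, "lymphovascular_invasion": False}})
    index_annotation(index, {"note_id": 2, "tumor": {
        "site": "rectum", "node_metastasis": False, "lymphovascular_invasion": False}})
    print(search(index, "tumor.node_metastasis", True))
```

Flattening to dotted paths is one simple way to make a hierarchical annotation searchable by field; a real search engine would typically index the nested document directly.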
format Online
Article
Text
id pubmed-8044740
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-8044740 2021-04-22 JMIR Med Inform Original Paper JMIR Publications 2021-03-30 /pmc/articles/PMC8044740/ /pubmed/33783361 http://dx.doi.org/10.2196/23983 Text en
©Jimyung Park, Seng Chan You, Eugene Jeong, Chunhua Weng, Dongsu Park, Jin Roh, Dong Yun Lee, Jae Youn Cheong, Jin Wook Choi, Mira Kang, Rae Woong Park. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 30.03.2021.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.
title A Framework (SOCRATex) for Hierarchical Annotation of Unstructured Electronic Health Records and Integration Into a Standardized Medical Database: Development and Usability Study
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8044740/
https://www.ncbi.nlm.nih.gov/pubmed/33783361
http://dx.doi.org/10.2196/23983