
The Generation of a Lung Cancer Health Factor Distribution Using Patient Graphs Constructed From Electronic Medical Records: Retrospective Study

BACKGROUND: Electronic medical records (EMRs) of patients with lung cancer (LC) capture a variety of health factors. Understanding the distribution of these factors will help identify key factors for risk prediction in preventive screening for LC. OBJECTIVE: We aimed to generate an integrated biomed...


Bibliographic Details
Main Authors: Chen, Anjun, Huang, Ran, Wu, Erman, Han, Ruobing, Wen, Jian, Li, Qinghua, Zhang, Zhiyong, Shen, Bairong
Format: Online Article Text
Language: English
Published: JMIR Publications 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9736747/
https://www.ncbi.nlm.nih.gov/pubmed/36427233
http://dx.doi.org/10.2196/40361
_version_ 1784847110395396096
author Chen, Anjun
Huang, Ran
Wu, Erman
Han, Ruobing
Wen, Jian
Li, Qinghua
Zhang, Zhiyong
Shen, Bairong
author_sort Chen, Anjun
collection PubMed
description BACKGROUND: Electronic medical records (EMRs) of patients with lung cancer (LC) capture a variety of health factors. Understanding the distribution of these factors will help identify key factors for risk prediction in preventive screening for LC. OBJECTIVE: We aimed to generate an integrated biomedical graph from EMR data and Unified Medical Language System (UMLS) ontology for LC, and to generate an LC health factor distribution from a hospital EMR of approximately 1 million patients. METHODS: The data were collected from 2 sets of 1397 patients with and those without LC. A patient-centered health factor graph was plotted with 108,000 standardized data, and a graph database was generated to integrate the graphs of patient health factors and the UMLS ontology. With the patient graph, we calculated the connection delta ratio (CDR) for each of the health factors to measure the relative strength of the factor’s relationship to LC. RESULTS: The patient graph had 93,000 relations between the 2794 patient nodes and 650 factor nodes. An LC graph with 187 related biomedical concepts and 188 horizontal biomedical relations was plotted and linked to the patient graph. Searching the integrated biomedical graph with any number or category of health factors resulted in graphical representations of relationships between patients and factors, while searches using any patient presented the patient’s health factors from the EMR and the LC knowledge graph (KG) from the UMLS in the same graph. Sorting the health factors by CDR in descending order generated a distribution of health factors for LC. The top 70 CDR-ranked factors of disease, symptom, medical history, observation, and laboratory test categories were verified to be concordant with those found in the literature. 
CONCLUSIONS: By collecting standardized data of thousands of patients with and those without LC from the EMR, it was possible to generate a hospital-wide patient-centered health factor graph for graph search and presentation. The patient graph could be integrated with the UMLS KG for LC and thus enable hospitals to bring continuously updated international standard biomedical KGs from the UMLS for clinical use in hospitals. CDR analysis of the graph of patients with LC generated a CDR-sorted distribution of health factors, in which the top CDR-ranked health factors were concordant with the literature. The resulting distribution of LC health factors can be used to help personalize risk evaluation and preventive screening recommendations.
format Online
Article
Text
id pubmed-9736747
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-9736747 2022-12-11 J Med Internet Res Original Paper JMIR Publications 2022-11-25 /pmc/articles/PMC9736747/ /pubmed/36427233 http://dx.doi.org/10.2196/40361 Text en ©Anjun Chen, Ran Huang, Erman Wu, Ruobing Han, Jian Wen, Qinghua Li, Zhiyong Zhang, Bairong Shen. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 25.11.2022. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.
title The Generation of a Lung Cancer Health Factor Distribution Using Patient Graphs Constructed From Electronic Medical Records: Retrospective Study
title_sort generation of a lung cancer health factor distribution using patient graphs constructed from electronic medical records: retrospective study
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9736747/
https://www.ncbi.nlm.nih.gov/pubmed/36427233
http://dx.doi.org/10.2196/40361