Extending electronic medical records vector models with knowledge graphs to improve hospitalization prediction

BACKGROUND: Artificial intelligence methods applied to electronic medical records (EMRs) hold the potential to help physicians save time by sharpening their analysis and decisions, thereby improving the health of patients. On the one hand, machine learning algorithms have proven their effectiveness...

Bibliographic Details
Main Authors: Gazzotti, Raphaël, Faron, Catherine, Gandon, Fabien, Lacroix-Hugues, Virginie, Darmon, David
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8861628/
https://www.ncbi.nlm.nih.gov/pubmed/35193692
http://dx.doi.org/10.1186/s13326-022-00261-9
_version_ 1784654928355000320
author Gazzotti, Raphaël
Faron, Catherine
Gandon, Fabien
Lacroix-Hugues, Virginie
Darmon, David
author_facet Gazzotti, Raphaël
Faron, Catherine
Gandon, Fabien
Lacroix-Hugues, Virginie
Darmon, David
author_sort Gazzotti, Raphaël
collection PubMed
description BACKGROUND: Artificial intelligence methods applied to electronic medical records (EMRs) hold the potential to help physicians save time by sharpening their analysis and decisions, thereby improving the health of patients. On the one hand, machine learning algorithms have proven their effectiveness in extracting information and exploiting knowledge extracted from data. On the other hand, knowledge graphs capture human knowledge by relying on conceptual schemas and formalization and supporting reasoning. Leveraging knowledge graphs that are legion in the medical field, it is possible to pre-process and enrich data representation used by machine learning algorithms. Medical data standardization is an opportunity to jointly exploit the richness of knowledge graphs and the capabilities of machine learning algorithms. METHODS: We propose to address the problem of hospitalization prediction for patients with an approach that enriches vector representation of EMRs with information extracted from different knowledge graphs before learning and predicting. In addition, we performed an automatic selection of features resulting from knowledge graphs to distinguish noisy ones from those that can benefit the decision making. We report the results of our experiments on the PRIMEGE PACA database that contains more than 600,000 consultations carried out by 17 general practitioners (GPs). RESULTS: A statistical evaluation shows that our proposed approach improves hospitalization prediction. More precisely, injecting features extracted from cross-domain knowledge graphs in the vector representation of EMRs given as input to the prediction algorithm significantly increases the F1 score of the prediction. CONCLUSIONS: By injecting knowledge from recognized reference sources into the representation of EMRs, it is possible to significantly improve the prediction of medical events. 
Future work would be to evaluate the impact of a feature selection step coupled with a combination of features extracted from several knowledge graphs. A possible avenue is to study more hierarchical levels and properties related to concepts, as well as to integrate more semantic annotators to exploit unstructured data.
format Online
Article
Text
id pubmed-8861628
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-88616282022-02-22 Extending electronic medical records vector models with knowledge graphs to improve hospitalization prediction Gazzotti, Raphaël Faron, Catherine Gandon, Fabien Lacroix-Hugues, Virginie Darmon, David J Biomed Semantics Research BACKGROUND: Artificial intelligence methods applied to electronic medical records (EMRs) hold the potential to help physicians save time by sharpening their analysis and decisions, thereby improving the health of patients. On the one hand, machine learning algorithms have proven their effectiveness in extracting information and exploiting knowledge extracted from data. On the other hand, knowledge graphs capture human knowledge by relying on conceptual schemas and formalization and supporting reasoning. Leveraging knowledge graphs that are legion in the medical field, it is possible to pre-process and enrich data representation used by machine learning algorithms. Medical data standardization is an opportunity to jointly exploit the richness of knowledge graphs and the capabilities of machine learning algorithms. METHODS: We propose to address the problem of hospitalization prediction for patients with an approach that enriches vector representation of EMRs with information extracted from different knowledge graphs before learning and predicting. In addition, we performed an automatic selection of features resulting from knowledge graphs to distinguish noisy ones from those that can benefit the decision making. We report the results of our experiments on the PRIMEGE PACA database that contains more than 600,000 consultations carried out by 17 general practitioners (GPs). RESULTS: A statistical evaluation shows that our proposed approach improves hospitalization prediction. More precisely, injecting features extracted from cross-domain knowledge graphs in the vector representation of EMRs given as input to the prediction algorithm significantly increases the F1 score of the prediction. 
CONCLUSIONS: By injecting knowledge from recognized reference sources into the representation of EMRs, it is possible to significantly improve the prediction of medical events. Future work would be to evaluate the impact of a feature selection step coupled with a combination of features extracted from several knowledge graphs. A possible avenue is to study more hierarchical levels and properties related to concepts, as well as to integrate more semantic annotators to exploit unstructured data. BioMed Central 2022-02-22 /pmc/articles/PMC8861628/ /pubmed/35193692 http://dx.doi.org/10.1186/s13326-022-00261-9 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Gazzotti, Raphaël
Faron, Catherine
Gandon, Fabien
Lacroix-Hugues, Virginie
Darmon, David
Extending electronic medical records vector models with knowledge graphs to improve hospitalization prediction
title Extending electronic medical records vector models with knowledge graphs to improve hospitalization prediction
title_full Extending electronic medical records vector models with knowledge graphs to improve hospitalization prediction
title_fullStr Extending electronic medical records vector models with knowledge graphs to improve hospitalization prediction
title_full_unstemmed Extending electronic medical records vector models with knowledge graphs to improve hospitalization prediction
title_short Extending electronic medical records vector models with knowledge graphs to improve hospitalization prediction
title_sort extending electronic medical records vector models with knowledge graphs to improve hospitalization prediction
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8861628/
https://www.ncbi.nlm.nih.gov/pubmed/35193692
http://dx.doi.org/10.1186/s13326-022-00261-9
work_keys_str_mv AT gazzottiraphael extendingelectronicmedicalrecordsvectormodelswithknowledgegraphstoimprovehospitalizationprediction
AT faroncatherine extendingelectronicmedicalrecordsvectormodelswithknowledgegraphstoimprovehospitalizationprediction
AT gandonfabien extendingelectronicmedicalrecordsvectormodelswithknowledgegraphstoimprovehospitalizationprediction
AT lacroixhuguesvirginie extendingelectronicmedicalrecordsvectormodelswithknowledgegraphstoimprovehospitalizationprediction
AT darmondavid extendingelectronicmedicalrecordsvectormodelswithknowledgegraphstoimprovehospitalizationprediction