Embedding electronic health records onto a knowledge network recognizes prodromal features of multiple sclerosis and predicts diagnosis

OBJECTIVE: Early identification of chronic diseases is a pillar of precision medicine as it can lead to improved outcomes, reduction of disease burden, and lower healthcare costs. Predictions of a patient’s health trajectory have been improved through the application of machine learning approaches t...

Bibliographic Details
Main Authors: Nelson, Charlotte A, Bove, Riley, Butte, Atul J, Baranzini, Sergio E
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8800523/
https://www.ncbi.nlm.nih.gov/pubmed/34915552
http://dx.doi.org/10.1093/jamia/ocab270
_version_ 1784642279411023872
author Nelson, Charlotte A
Bove, Riley
Butte, Atul J
Baranzini, Sergio E
collection PubMed
description OBJECTIVE: Early identification of chronic diseases is a pillar of precision medicine as it can lead to improved outcomes, reduction of disease burden, and lower healthcare costs. Predictions of a patient’s health trajectory have been improved through the application of machine learning approaches to electronic health records (EHRs). However, these methods have traditionally relied on “black box” algorithms that can process large amounts of data but are unable to incorporate domain knowledge, thus limiting their predictive and explanatory power. Here, we present a method for incorporating domain knowledge into clinical classifications by embedding individual patient data into a biomedical knowledge graph. MATERIALS AND METHODS: A modified version of the PageRank algorithm was implemented to embed millions of deidentified EHRs into a biomedical knowledge graph (SPOKE). This resulted in high-dimensional, knowledge-guided patient health signatures (ie, SPOKEsigs) that were subsequently used as features in a random forest environment to classify patients at risk of developing a chronic disease. RESULTS: Our model predicted disease status of 5752 subjects 3 years before they were diagnosed with multiple sclerosis (MS) (AUC = 0.83). SPOKEsigs outperformed predictions using EHRs alone, and the biological drivers of the classifiers provided insight into the underpinnings of prodromal MS. CONCLUSION: Using data from EHRs as input, SPOKEsigs describe patients at both the clinical and biological levels. We provide a clinical use case for detecting MS up to 5 years prior to documented diagnosis in the clinic and illustrate the biological features that distinguish the prodromal MS state.
format Online
Article
Text
id pubmed-8800523
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-8800523 2022-01-31 Embedding electronic health records onto a knowledge network recognizes prodromal features of multiple sclerosis and predicts diagnosis Nelson, Charlotte A Bove, Riley Butte, Atul J Baranzini, Sergio E J Am Med Inform Assoc Research and Applications
Oxford University Press 2021-12-16 /pmc/articles/PMC8800523/ /pubmed/34915552 http://dx.doi.org/10.1093/jamia/ocab270 Text en © The Author(s) 2021. Published by Oxford University Press on behalf of the American Medical Informatics Association. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Embedding electronic health records onto a knowledge network recognizes prodromal features of multiple sclerosis and predicts diagnosis
topic Research and Applications
work_keys_str_mv AT nelsoncharlottea embeddingelectronichealthrecordsontoaknowledgenetworkrecognizesprodromalfeaturesofmultiplesclerosisandpredictsdiagnosis
AT boveriley embeddingelectronichealthrecordsontoaknowledgenetworkrecognizesprodromalfeaturesofmultiplesclerosisandpredictsdiagnosis
AT butteatulj embeddingelectronichealthrecordsontoaknowledgenetworkrecognizesprodromalfeaturesofmultiplesclerosisandpredictsdiagnosis
AT baranzinisergioe embeddingelectronichealthrecordsontoaknowledgenetworkrecognizesprodromalfeaturesofmultiplesclerosisandpredictsdiagnosis
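
Illustrative note on the method summarized in the description field: the abstract states that a modified PageRank algorithm embeds each patient's EHR data into the SPOKE knowledge graph to produce a signature vector. The record does not say what the modification is, so the sketch below uses plain personalized PageRank (random walk with restart) instead, on a tiny made-up graph. The node names, the TOY_GRAPH structure, the restart value, and the function name are all assumptions for illustration and are not taken from the article; the random forest step is omitted.

```python
# Minimal sketch: personalized PageRank (random walk with restart) on a toy,
# undirected graph. All names and values here are hypothetical; this is NOT
# SPOKE and NOT the authors' modified algorithm.

# Adjacency lists for a toy graph (every edge is listed in both directions).
TOY_GRAPH = {
    "symptom_a": ["gene_1"],
    "symptom_b": ["gene_1", "gene_2"],
    "gene_1": ["symptom_a", "symptom_b", "disease_x"],
    "gene_2": ["symptom_b"],
    "disease_x": ["gene_1"],
}


def personalized_pagerank(adj, seeds, restart=0.15, tol=1e-10, max_iter=1000):
    """Return a dict node -> score for a walk that restarts at the seed nodes.

    adj:   dict mapping each node to a list of its neighbours
    seeds: dict mapping seed nodes (e.g. concepts found in one patient's
           record) to non-negative weights (e.g. occurrence counts)
    """
    nodes = sorted(adj)
    total = sum(seeds.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("at least one seed must be a node of the graph")

    # Restart distribution: seed weights normalised to sum to 1.
    prior = {n: seeds.get(n, 0.0) / total for n in nodes}
    rank = dict(prior)

    for _ in range(max_iter):
        # Every node first receives its share of the restart mass.
        nxt = {n: restart * prior[n] for n in nodes}
        for n in nodes:
            neighbours = adj[n]
            if neighbours:
                # Spread the remaining mass evenly over the neighbours.
                share = (1.0 - restart) * rank[n] / len(neighbours)
                for m in neighbours:
                    nxt[m] += share
            else:
                # Dangling node: send its mass back to the seed nodes.
                for m in nodes:
                    nxt[m] += (1.0 - restart) * rank[n] * prior[m]
        delta = sum(abs(nxt[n] - rank[n]) for n in nodes)
        rank = nxt
        if delta < tol:
            break
    return rank


# A hypothetical patient whose record mentions symptom_a twice and symptom_b once.
signature = personalized_pagerank(TOY_GRAPH, {"symptom_a": 2, "symptom_b": 1})
for node in sorted(signature):
    print(f"{node}: {signature[node]:.4f}")
```

The resulting dict plays the role of a patient signature: nodes that the record never mentions (here gene_1, gene_2, disease_x) still receive a score because probability mass flows along graph edges. In a pipeline like the one the abstract describes, such vectors would then be passed to a classifier as features.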