Federated Learning of Electronic Health Records to Improve Mortality Prediction in Hospitalized Patients With COVID-19: Machine Learning Approach

Bibliographic Details
Main authors: Vaid, Akhil, Jaladanki, Suraj K, Xu, Jie, Teng, Shelly, Kumar, Arvind, Lee, Samuel, Somani, Sulaiman, Paranjpe, Ishan, De Freitas, Jessica K, Wanyan, Tingyi, Johnson, Kipp W, Bicak, Mesude, Klang, Eyal, Kwon, Young Joon, Costa, Anthony, Zhao, Shan, Miotto, Riccardo, Charney, Alexander W, Böttinger, Erwin, Fayad, Zahi A, Nadkarni, Girish N, Wang, Fei, Glicksberg, Benjamin S
Format: Online Article Text
Language: English
Published: JMIR Publications 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7842859/
https://www.ncbi.nlm.nih.gov/pubmed/33400679
http://dx.doi.org/10.2196/24207
Description
Summary:
BACKGROUND: Machine learning models require large datasets that may be siloed across different health care institutions. Machine learning studies that focus on COVID-19 have been limited to single-hospital data, which limits model generalizability.
OBJECTIVE: We aimed to use federated learning, a machine learning technique that avoids locally aggregating raw clinical data across multiple institutions, to predict mortality in hospitalized patients with COVID-19 within 7 days.
METHODS: Patient data were collected from the electronic health records of 5 hospitals within the Mount Sinai Health System. Logistic regression with L1 regularization/least absolute shrinkage and selection operator (LASSO) and multilayer perceptron (MLP) models were trained by using local data at each site. We developed a pooled model with combined data from all 5 sites, and a federated model that only shared parameters with a central aggregator.
RESULTS: The LASSO(federated) model outperformed the LASSO(local) model at 3 hospitals, and the MLP(federated) model performed better than the MLP(local) model at all 5 hospitals, as determined by the area under the receiver operating characteristic curve. The LASSO(pooled) model outperformed the LASSO(federated) model at all hospitals, and the MLP(federated) model outperformed the MLP(pooled) model at 2 hospitals.
CONCLUSIONS: The federated learning of COVID-19 electronic health record data shows promise in developing robust predictive models without compromising patient privacy.
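The federated setup described in METHODS, in which each site trains on its own data and shares only parameters with a central aggregator, can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not state the aggregation rule, so the sketch assumes sample-size-weighted parameter averaging (the common federated averaging scheme), uses an L1-penalized logistic regression without an intercept, and runs on synthetic data. All function names, hyperparameters, and the data generator are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def local_update(w, X, y, lr=0.1, epochs=5, l1=0.001):
    """Train on one site's data, starting from the shared weights.

    Assumption: proximal gradient descent (gradient step followed by
    soft-thresholding) stands in for whatever LASSO solver a site would use.
    """
    w = list(w)  # copy so the aggregator's weights are not mutated
    n = len(y)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj / n
        for j in range(len(w)):
            step = w[j] - lr * grad[j]
            # Soft-thresholding: the proximal operator of the L1 penalty.
            w[j] = math.copysign(max(abs(step) - lr * l1, 0.0), step)
    return w


def federated_average(site_weights, site_sizes):
    """Central aggregator: average parameters, weighted by site sample size."""
    total = float(sum(site_sizes))
    dim = len(site_weights[0])
    return [
        sum(w[j] * n for w, n in zip(site_weights, site_sizes)) / total
        for j in range(dim)
    ]


def train_federated(sites, n_features, rounds=20):
    """Only parameter vectors leave a site; raw records (X, y) never do."""
    w = [0.0] * n_features
    for _ in range(rounds):
        updates = [local_update(w, X, y) for X, y in sites]
        w = federated_average(updates, [len(y) for _, y in sites])
    return w


def make_synthetic_sites(n_sites=5, n_features=4, seed=0):
    """Hypothetical stand-in for per-hospital data; not real patient data."""
    rng = random.Random(seed)
    true_w = [rng.gauss(0, 1) for _ in range(n_features)]
    sites = []
    for _ in range(n_sites):
        n = rng.randint(50, 200)
        X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n)]
        y = [
            1.0 if rng.random() < sigmoid(sum(a * b for a, b in zip(true_w, xi))) else 0.0
            for xi in X
        ]
        sites.append((X, y))
    return sites


if __name__ == "__main__":
    sites = make_synthetic_sites()
    print(train_federated(sites, n_features=4))
```

In this sketch a "local" model would correspond to calling `local_update` on a single site only, and a "pooled" model to concatenating all sites' rows before training. The comparison reported in RESULTS would additionally require an evaluation metric such as the area under the receiver operating characteristic curve, which is omitted here.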