Federated Learning of Electronic Health Records to Improve Mortality Prediction in Hospitalized Patients With COVID-19: Machine Learning Approach
Main authors: | |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | JMIR Publications 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7842859/ https://www.ncbi.nlm.nih.gov/pubmed/33400679 http://dx.doi.org/10.2196/24207 |
Summary: | BACKGROUND: Machine learning models require large datasets that may be siloed across different health care institutions. Machine learning studies that focus on COVID-19 have been limited to single-hospital data, which limits model generalizability. OBJECTIVE: We aimed to use federated learning, a machine learning technique that avoids locally aggregating raw clinical data across multiple institutions, to predict mortality in hospitalized patients with COVID-19 within 7 days. METHODS: Patient data were collected from the electronic health records of 5 hospitals within the Mount Sinai Health System. Logistic regression with L1 regularization/least absolute shrinkage and selection operator (LASSO) and multilayer perceptron (MLP) models were trained by using local data at each site. We developed a pooled model with combined data from all 5 sites, and a federated model that only shared parameters with a central aggregator. RESULTS: The LASSO(federated) model outperformed the LASSO(local) model at 3 hospitals, and the MLP(federated) model performed better than the MLP(local) model at all 5 hospitals, as determined by the area under the receiver operating characteristic curve. The LASSO(pooled) model outperformed the LASSO(federated) model at all hospitals, and the MLP(federated) model outperformed the MLP(pooled) model at 2 hospitals. CONCLUSIONS: The federated learning of COVID-19 electronic health record data shows promise in developing robust predictive models without compromising patient privacy. |
---|
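The summary describes a federated model in which each site shares only parameters with a central aggregator. A minimal sketch of that idea follows, under several assumptions not stated in the record: the aggregator uses sample-weighted averaging (FedAvg style), the model is plain logistic regression without the L1 penalty, and the five sites hold synthetic toy data. All function names (`local_update`, `federated_average`, `train_federated`, `make_site`) are hypothetical and are not taken from the article.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def local_update(weights, X, y, lr=0.1, epochs=5):
    """Run a few epochs of gradient descent on one site's private data."""
    w = list(weights)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def federated_average(site_weights, site_sizes):
    """Aggregate site parameters, weighting each site by its sample count
    (an assumed aggregation rule)."""
    total = sum(site_sizes)
    dim = len(site_weights[0])
    return [
        sum(w[j] * n for w, n in zip(site_weights, site_sizes)) / total
        for j in range(dim)
    ]


def train_federated(sites, dim, rounds=20):
    """Only parameter vectors leave a site; raw rows never do."""
    global_w = [0.0] * dim
    for _ in range(rounds):
        updates = [local_update(global_w, X, y) for X, y in sites]
        sizes = [len(X) for X, _ in sites]
        global_w = federated_average(updates, sizes)
    return global_w


def make_site(n, seed):
    """Synthetic stand-in for one hospital: a bias term plus one feature."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x1 = rng.uniform(-1, 1)
        X.append([1.0, x1])
        y.append(1 if x1 > 0 else 0)
    return X, y


# Five toy sites of different sizes, mirroring the five-hospital setup.
sites = [make_site(n, seed) for seed, n in enumerate([40, 60, 80, 50, 70])]
global_w = train_federated(sites, dim=2)
```

In this sketch the central aggregator sees only the weight vectors returned by `local_update`. Comparing `global_w` against a model fitted on one site alone, or on all rows combined, would loosely mirror the local, pooled, and federated comparison described in the summary.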