
Communication Efficient Federated Generalized Tensor Factorization for Collaborative Health Data Analytics

Modern healthcare systems, knitted together by a web of entities (e.g., hospitals, clinics, pharmacy companies), are collecting a huge volume of healthcare data from a large number of individuals, covering various medical procedures, medications, diagnoses, and lab tests. To extract meaningful medical concepts (i.e...

Bibliographic Details
Main authors: Ma, Jing, Zhang, Qiuchen, Lou, Jian, Xiong, Li, Ho, Joyce C.
Format: Online Article Text
Language: English
Published: 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8404412/
https://www.ncbi.nlm.nih.gov/pubmed/34467367
http://dx.doi.org/10.1145/3442381.3449832
author Ma, Jing
Zhang, Qiuchen
Lou, Jian
Xiong, Li
Ho, Joyce C.
collection PubMed
description Modern healthcare systems, knitted together by a web of entities (e.g., hospitals, clinics, pharmacy companies), are collecting a huge volume of healthcare data from a large number of individuals, covering various medical procedures, medications, diagnoses, and lab tests. To extract meaningful medical concepts (i.e., phenotypes) from such higher-arity relational healthcare data, tensor factorization has proven to be an effective approach and has received increasing research attention, owing to its intrinsic capability to represent high-dimensional data. Recently, federated learning has offered a privacy-preserving paradigm for collaborative learning among different entities, which seemingly makes it an ideal way to extend tensor factorization-based collaborative phenotyping to sensitive personal health data. However, existing attempts at federated tensor factorization come with various limitations, including restriction to classic tensor factorization, high communication cost, and reduced accuracy. We propose a communication-efficient federated generalized tensor factorization, which is flexible enough to choose from a variety of losses to best suit different types of data in practice. We design a three-level communication reduction strategy tailored to generalized tensor factorization, which is able to reduce the uplink communication cost by up to 99.90%. In addition, we theoretically prove that our algorithm does not compromise convergence speed despite the aggressive communication compression. Extensive experiments on two real-world electronic health record datasets demonstrate the efficiency improvements in terms of computation and communication cost.
format Online
Article
Text
id pubmed-8404412
institution National Center for Biotechnology Information
language English
publishDate 2021
record_format MEDLINE/PubMed
spelling pubmed-8404412 2021-08-30 Communication Efficient Federated Generalized Tensor Factorization for Collaborative Health Data Analytics Ma, Jing Zhang, Qiuchen Lou, Jian Xiong, Li Ho, Joyce C. Proc Int World Wide Web Conf Article
2021-04 /pmc/articles/PMC8404412/ /pubmed/34467367 http://dx.doi.org/10.1145/3442381.3449832 Text en https://creativecommons.org/licenses/by/4.0/ This paper is published under the Creative Commons Attribution 4.0 International (CC-BY 4.0) license. Authors reserve their rights to disseminate the work on their personal and corporate Web sites with the appropriate attribution.
title Communication Efficient Federated Generalized Tensor Factorization for Collaborative Health Data Analytics
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8404412/
https://www.ncbi.nlm.nih.gov/pubmed/34467367
http://dx.doi.org/10.1145/3442381.3449832