FedVoting: A Cross-Silo Boosting Tree Construction Method for Privacy-Preserving Long-Term Human Mobility Prediction

Bibliographic Details
Main Authors: Liu, Yinghao; Fan, Zipei; Song, Xuan; Shibasaki, Ryosuke
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8708522/
https://www.ncbi.nlm.nih.gov/pubmed/34960376
http://dx.doi.org/10.3390/s21248282
Description
Summary: The prediction of human mobility can help resolve many kinds of urban problems, such as traffic congestion, and promote commercial activities, such as targeted advertising. However, the personal GPS data this requires raise privacy concerns. Individual organizations can collect only limited data and have difficulty sharing them. The data therefore sit in “isolated islands” and cannot collectively contribute to improving the performance of applications. Federated learning (FL) can address this: multiple entities collaborate to train a collective model while their raw data stay stored locally and are never exchanged or transferred. However, simply combining existing models with FL impairs both performance and practicality for long-term human mobility prediction, because long-term mobility data are irregular and complex. Therefore, we explore an optimized construction method based on the highly efficient gradient-boosting decision tree (GBDT) model with FL and propose a novel federated voting (FedVoting) mechanism, which aggregates an ensemble of differential privacy (DP)-protected GBDTs through multiple training, cross-validation, and voting processes to generate the optimal model, achieving both good performance and privacy protection. Experiments show high accuracy in long-term predictions of special-event attendance and point-of-interest visits. Compared with training a model independently for each silo (organization) and with state-of-the-art baselines, the FedVoting method achieves a significant accuracy improvement, almost comparable to centralized training, at a negligible cost in privacy exposure.
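The train, cross-validate, and vote loop named in the summary can be illustrated with a minimal sketch. This is one plausible reading of the abstract, not the authors' algorithm: the one-split regressor is only a stand-in for a DP-protected GBDT, the Laplace noise on leaf values is a crude placeholder for a real differential privacy mechanism, and all names (`train_noisy_stump`, `fed_voting`, `noise_scale`) are hypothetical.

```python
import random
from collections import Counter

def train_noisy_stump(data, threshold, noise_scale, rng):
    """Stand-in for a DP-protected GBDT (assumption): a one-split regressor
    whose leaf means are perturbed with Laplace noise. noise_scale must be > 0."""
    left = [y for x, y in data if x <= threshold]
    right = [y for x, y in data if x > threshold]

    def noisy_mean(values):
        mean = sum(values) / len(values) if values else 0.0
        # The difference of two exponential samples is Laplace(0, noise_scale).
        rate = 1 / noise_scale
        return mean + rng.expovariate(rate) - rng.expovariate(rate)

    left_value, right_value = noisy_mean(left), noisy_mean(right)
    return lambda x: left_value if x <= threshold else right_value

def mse(model, data):
    """Mean squared error of a model on one silo's local (x, y) records."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

def fed_voting(candidates, silos):
    """Each silo scores every candidate on its own data and votes for the
    lowest-error one. Only models and votes cross silo boundaries."""
    votes = Counter()
    for data in silos:
        errors = [mse(model, data) for model in candidates]
        votes[errors.index(min(errors))] += 1
    # Ties are broken in favor of the lowest candidate index.
    winner = max(range(len(candidates)), key=lambda i: votes[i])
    return winner, dict(votes)

# Toy usage: three silos, each trains one candidate on its local data.
rng = random.Random(0)
silos = [[(x, 1.0 if x > 5 else 0.0) for x in range(s, s + 10)] for s in range(3)]
candidates = [train_noisy_stump(d, threshold=5, noise_scale=0.01, rng=rng) for d in silos]
winner, votes = fed_voting(candidates, silos)
```

In this sketch a silo also votes on its own model; whether FedVoting excludes self-evaluation, and how it combines more than one top-voted model, is not stated in the summary.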