Missing Traffic Data Imputation with a Linear Generative Model Based on Probabilistic Principal Component Analysis

Even with the ubiquitous sensing data in intelligent transportation systems, such as the mobile sensing of vehicle trajectories, traffic estimation is still faced with the data missing problem due to the detector faults or limited number of probe vehicles as mobile sensors. Such data missing issue p...

Bibliographic Details
Main authors: Huang, Liping; Li, Zhenghuan; Luo, Ruikang; Su, Rong
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9824200/
https://www.ncbi.nlm.nih.gov/pubmed/36616802
http://dx.doi.org/10.3390/s23010204
_version_ 1784866351089713152
author Huang, Liping
Li, Zhenghuan
Luo, Ruikang
Su, Rong
author_facet Huang, Liping
Li, Zhenghuan
Luo, Ruikang
Su, Rong
author_sort Huang, Liping
collection PubMed
description Even with the ubiquitous sensing data in intelligent transportation systems, such as the mobile sensing of vehicle trajectories, traffic estimation is still faced with the data missing problem due to the detector faults or limited number of probe vehicles as mobile sensors. Such data missing issue poses an obstacle for many further explorations, e.g., the link-based traffic status modeling. Although many studies have focused on tackling this kind of problem, existing studies mainly focus on the situation in which data are missing at random and ignore the distinction between links of missing data. In the practical scenario, traffic speed data are always missing not at random (MNAR). The distinction for recovering missing data on different links has not been studied yet. In this paper, we propose a general linear model based on probabilistic principal component analysis (PPCA) for solving MNAR traffic speed data imputation. Furthermore, we propose a metric, i.e., Pearson score (p-score), for distinguishing links and investigate how the model performs on links with different p-score values. Experimental results show that the new model outperforms the typically used PPCA model, and missing data on links with higher p-score values can be better recovered.
format Online
Article
Text
id pubmed-9824200
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9824200 2023-01-08 Missing Traffic Data Imputation with a Linear Generative Model Based on Probabilistic Principal Component Analysis Huang, Liping Li, Zhenghuan Luo, Ruikang Su, Rong Sensors (Basel) Article Even with the ubiquitous sensing data in intelligent transportation systems, such as the mobile sensing of vehicle trajectories, traffic estimation is still faced with the data missing problem due to the detector faults or limited number of probe vehicles as mobile sensors. Such data missing issue poses an obstacle for many further explorations, e.g., the link-based traffic status modeling. Although many studies have focused on tackling this kind of problem, existing studies mainly focus on the situation in which data are missing at random and ignore the distinction between links of missing data. In the practical scenario, traffic speed data are always missing not at random (MNAR). The distinction for recovering missing data on different links has not been studied yet. In this paper, we propose a general linear model based on probabilistic principal component analysis (PPCA) for solving MNAR traffic speed data imputation. Furthermore, we propose a metric, i.e., Pearson score (p-score), for distinguishing links and investigate how the model performs on links with different p-score values. Experimental results show that the new model outperforms the typically used PPCA model, and missing data on links with higher p-score values can be better recovered. MDPI 2022-12-25 /pmc/articles/PMC9824200/ /pubmed/36616802 http://dx.doi.org/10.3390/s23010204 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Huang, Liping
Li, Zhenghuan
Luo, Ruikang
Su, Rong
Missing Traffic Data Imputation with a Linear Generative Model Based on Probabilistic Principal Component Analysis
title Missing Traffic Data Imputation with a Linear Generative Model Based on Probabilistic Principal Component Analysis
title_sort missing traffic data imputation with a linear generative model based on probabilistic principal component analysis
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9824200/
https://www.ncbi.nlm.nih.gov/pubmed/36616802
http://dx.doi.org/10.3390/s23010204