
An Expressway ETC Missing Data Restoration Model Considering Multi-Attribute Features

Electronic toll collection (ETC) data mining has become one of the hotspots in the research of intelligent expressway extension applications. Ensuring the integrity of ETC data stands as a critical measure in upholding data quality. ETC data are typical structured data, and although deep learning ho...

Bibliographic Details
Main Authors: Zou, Fumin, Zhou, Zhaoyi, Cai, Qiqin, Guo, Feng, Zhang, Xinyi
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10647695/
https://www.ncbi.nlm.nih.gov/pubmed/37960444
http://dx.doi.org/10.3390/s23218745
_version_ 1785135167427313664
author Zou, Fumin
Zhou, Zhaoyi
Cai, Qiqin
Guo, Feng
Zhang, Xinyi
author_facet Zou, Fumin
Zhou, Zhaoyi
Cai, Qiqin
Guo, Feng
Zhang, Xinyi
author_sort Zou, Fumin
collection PubMed
description Electronic toll collection (ETC) data mining has become a research hotspot in extended applications for intelligent expressways. Ensuring the integrity of ETC data is critical to maintaining data quality. ETC data are typical structured data, and although deep learning holds great potential for ETC data restoration, its application to structured data is still at an early stage. To address these issues, we propose an expressway ETC missing transaction data restoration model that considers multi-attribute features (MAF). First, we employ an entity embedding neural network (EENN) to automatically learn representations of categorical features in a multi-dimensional space, and obtain the embedding vectors from the trained network. Then, we use long short-term memory (LSTM) neural networks to extract the changing patterns of vehicle speeds across several consecutive sections. Finally, we merge the processed features with the other features and feed them to a three-layer multilayer perceptron (MLP), which completes the ETC data restoration. To validate the proposed method, we conducted extensive tests on real ETC datasets and compared it with methods commonly used for structured data restoration. The experimental results show that the proposed method significantly outperforms the others in restoration accuracy on two different datasets. The sample comprised around 400,000 entries. Compared with the best existing method, our method improved restoration accuracy by 19.06% on non-holiday ETC datasets. The MAE and RMSE reached their best values of 12.394 and 23.815, respectively. The model's fit to the dataset also peaked ([Formula: see text] = 0.993). Meanwhile, the restoration stability of our method on holiday datasets increased by 5.82%. An ablation experiment showed that the EENN and LSTM modules contributed 7.60% and 9% to the restoration accuracy, and 4.68% and 7.29% to the restoration stability. This study indicates that the proposed method not only significantly improves the quality of ETC data but also meets the timeliness requirements of big data mining analysis.
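The description reports MAE, RMSE, and a fitting degree without defining them. The sketch below shows how such metrics are conventionally computed. It assumes the fitting degree is the coefficient of determination (R squared), which this record does not state. The function names and sample values are hypothetical and are not taken from the article.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between observed and restored values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error; penalises large errors more than MAE."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumed here to be the 'fitting degree' mentioned in the description.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and restored values (illustrative only).
observed = [100.0, 120.0, 140.0, 160.0]
restored = [102.0, 118.0, 143.0, 159.0]

print(mae(observed, restored))
print(rmse(observed, restored))
print(r_squared(observed, restored))
```

Lower MAE and RMSE indicate a more accurate restoration, while an R squared value close to 1 indicates that the restored values explain most of the variance in the observed data.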
format Online
Article
Text
id pubmed-10647695
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-106476952023-10-26 An Expressway ETC Missing Data Restoration Model Considering Multi-Attribute Features Zou, Fumin Zhou, Zhaoyi Cai, Qiqin Guo, Feng Zhang, Xinyi Sensors (Basel) Article Electronic toll collection (ETC) data mining has become one of the hotspots in the research of intelligent expressway extension applications. Ensuring the integrity of ETC data stands as a critical measure in upholding data quality. ETC data are typical structured data, and although deep learning holds great potential in the ETC data restoration field, its applications in structured data are still in the early stages. To address these issues, we propose an expressway ETC missing transaction data restoration model considering multi-attribute features (MAF). Initially, we employ an entity embedding neural network (EENN) to automatically learn the representation of categorical features in multi-dimensional space, subsequently obtaining embedding vectors from networks that have been adequately trained. Then, we use long short-term memory (LSTM) neural networks to extract the changing patterns of vehicle speeds across several continuous sections. Ultimately, we merge the processed features with other features as input, using a three-layer multilayer perceptron (MLP) to complete the ETC data restoration. To validate the effectiveness of the proposed method, we conducted extensive tests using real ETC datasets and compared it with methods commonly used for structured data restoration. The experimental results demonstrate that the proposed method significantly outperforms others in restoration accuracy on two different datasets. Specifically, our sample data size reached around 400,000 entries. Compared to the currently best method, our method improved the restoration accuracy by 19.06% on non-holiday ETC datasets. The MAE and RMSE values reached optimal levels of 12.394 and 23.815, respectively. 
The fitting degree of the model to the dataset also reached its peak ([Formula: see text] = 0.993). Meanwhile, the restoration stability of our method on holiday datasets increased by 5.82%. An ablation experiment showed that the EENN and LSTM modules contributed 7.60% and 9% to the restoration accuracy, as well as 4.68% and 7.29% to the restoration stability. This study indicates that the proposed method not only significantly improves the quality of ETC data but also meets the timeliness requirements of big data mining analysis. MDPI 2023-10-26 /pmc/articles/PMC10647695/ /pubmed/37960444 http://dx.doi.org/10.3390/s23218745 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Zou, Fumin
Zhou, Zhaoyi
Cai, Qiqin
Guo, Feng
Zhang, Xinyi
An Expressway ETC Missing Data Restoration Model Considering Multi-Attribute Features
title An Expressway ETC Missing Data Restoration Model Considering Multi-Attribute Features
title_full An Expressway ETC Missing Data Restoration Model Considering Multi-Attribute Features
title_fullStr An Expressway ETC Missing Data Restoration Model Considering Multi-Attribute Features
title_full_unstemmed An Expressway ETC Missing Data Restoration Model Considering Multi-Attribute Features
title_short An Expressway ETC Missing Data Restoration Model Considering Multi-Attribute Features
title_sort expressway etc missing data restoration model considering multi-attribute features
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10647695/
https://www.ncbi.nlm.nih.gov/pubmed/37960444
http://dx.doi.org/10.3390/s23218745
work_keys_str_mv AT zoufumin anexpresswayetcmissingdatarestorationmodelconsideringmultiattributefeatures
AT zhouzhaoyi anexpresswayetcmissingdatarestorationmodelconsideringmultiattributefeatures
AT caiqiqin anexpresswayetcmissingdatarestorationmodelconsideringmultiattributefeatures
AT guofeng anexpresswayetcmissingdatarestorationmodelconsideringmultiattributefeatures
AT zhangxinyi anexpresswayetcmissingdatarestorationmodelconsideringmultiattributefeatures
AT zoufumin expresswayetcmissingdatarestorationmodelconsideringmultiattributefeatures
AT zhouzhaoyi expresswayetcmissingdatarestorationmodelconsideringmultiattributefeatures
AT caiqiqin expresswayetcmissingdatarestorationmodelconsideringmultiattributefeatures
AT guofeng expresswayetcmissingdatarestorationmodelconsideringmultiattributefeatures
AT zhangxinyi expresswayetcmissingdatarestorationmodelconsideringmultiattributefeatures