General-Purpose Automated Machine Learning for Transportation: A Case Study of Auto-sklearn for Traffic Forecasting

Bibliographic Details
Main Authors: Angarita-Zapata, Juan S., Masegosa, Antonio D., Triguero, Isaac
Format: Online Article Text
Language: English
Published: 2020
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7274664/
http://dx.doi.org/10.1007/978-3-030-50143-3_57
Description
Summary: Currently, there are no guidelines for determining the most suitable machine learning pipelines (i.e. the workflow from data preprocessing to model selection and validation) for Traffic Forecasting (TF) problems. Although automated machine learning (AutoML) has proved successful at the model selection problem in other application areas, only a few papers have explored the performance of general-purpose AutoML methods, purely based on optimisation, when tackling TF. In this paper, we provide a thorough exploration of the benefits of Auto-sklearn for TF, as a general-purpose AutoML method that follows a hybrid search strategy combining optimisation with meta-learning and ensemble learning. In particular, we focus on how well Auto-sklearn recommends competitive machine learning pipelines to forecast traffic, modelled as a multi-class imbalanced classification problem, over different time horizons at two spatial scales (point and road segment) and in two environments (freeway and urban). Concretely, we test the following scenarios: I) a hybrid search strategy with all three components (optimisation, meta-learning, ensemble learning), II) a strategy based on meta-learning and ensemble learning, and III) a strategy based on estimating the best-performing pipeline among those suggested by meta-learning. Experimental results show that the meta-learning component of Auto-sklearn does not work properly on TF problems and that the optimisation component contributes little to the final predictive performance.