
CAPformer: Pedestrian Crossing Action Prediction Using Transformer

Anticipating pedestrian crossing behavior in urban scenarios is a challenging task for autonomous vehicles. Early this year, a benchmark comprising the JAAD and PIE datasets was released. In the benchmark, several state-of-the-art methods have been ranked. However, most of the ranked temporal mode...


Bibliographic Details
Main Authors:	Lorenzo, Javier; Alonso, Ignacio Parra; Izquierdo, Rubén; Ballardini, Augusto Luis; Saz, Álvaro Hernández; Llorca, David Fernández; Sotelo, Miguel Ángel
Format:	Online Article Text
Language:	English
Published:	MDPI 2021
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8433949/ https://www.ncbi.nlm.nih.gov/pubmed/34502584 http://dx.doi.org/10.3390/s21175694

author	Lorenzo, Javier; Alonso, Ignacio Parra; Izquierdo, Rubén; Ballardini, Augusto Luis; Saz, Álvaro Hernández; Llorca, David Fernández; Sotelo, Miguel Ángel
collection	PubMed
description	Anticipating pedestrian crossing behavior in urban scenarios is a challenging task for autonomous vehicles. Early this year, a benchmark comprising the JAAD and PIE datasets was released. In the benchmark, several state-of-the-art methods have been ranked. However, most of the ranked temporal models rely on recurrent architectures. We propose what is, to the best of our knowledge, the first self-attention alternative, based on the transformer architecture, which has had enormous success in natural language processing (NLP) and, more recently, in computer vision. Our architecture is composed of several branches that fuse video and kinematic data. The video branch is based on one of two architectures: RubiksNet or TimeSformer. The kinematic branch is based on different configurations of the transformer encoder. Several experiments have been performed, mainly focusing on pre-processing of the input data, which highlight problems with two kinematic data sources: pose keypoints and ego-vehicle speed. The results of our proposed model are comparable to those of PCPA, the best-performing model in the benchmark, reaching an F1 score of nearly [Formula: see text] against [Formula: see text]. Furthermore, using only bounding box coordinates and image data, our model surpasses PCPA by a larger margin ([Formula: see text] vs. [Formula: see text]). Our model has proven to be a valid alternative to recurrent architectures, providing advantages such as parallelization and whole-sequence processing, which allow it to learn relationships between samples that are not possible with recurrent architectures.
format	Online Article Text
id	pubmed-8433949
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-8433949 2021-09-12 CAPformer: Pedestrian Crossing Action Prediction Using Transformer. Sensors (Basel), Article. MDPI 2021-08-24. /pmc/articles/PMC8433949/ /pubmed/34502584 http://dx.doi.org/10.3390/s21175694 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title	CAPformer: Pedestrian Crossing Action Prediction Using Transformer
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8433949/ https://www.ncbi.nlm.nih.gov/pubmed/34502584 http://dx.doi.org/10.3390/s21175694
