Cargando…

QuantLaneNet: A 640-FPS and 34-GOPS/W FPGA-Based CNN Accelerator for Lane Detection

Lane detection is one of the most fundamental problems in the rapidly developing field of autonomous vehicles. With the dramatic growth of deep learning in recent years, many models have achieved a high accuracy for this task. However, most existing deep-learning methods for lane detection face two...

Descripción completa

Detalles Bibliográficos
Autores principales: Lam, Duc Khai, Du, Cam Vinh, Pham, Hoai Luan
Formato: Online Artículo Texto
Lenguaje:English
Publicado: MDPI 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10422460/
https://www.ncbi.nlm.nih.gov/pubmed/37571445
http://dx.doi.org/10.3390/s23156661
_version_ 1785089215892029440
author Lam, Duc Khai
Du, Cam Vinh
Pham, Hoai Luan
author_facet Lam, Duc Khai
Du, Cam Vinh
Pham, Hoai Luan
author_sort Lam, Duc Khai
collection PubMed
description Lane detection is one of the most fundamental problems in the rapidly developing field of autonomous vehicles. With the dramatic growth of deep learning in recent years, many models have achieved a high accuracy for this task. However, most existing deep-learning methods for lane detection face two main problems. First, most early studies usually follow a segmentation approach, which requires much post-processing to extract the necessary geometric information about the lane lines. Second, many models fail to reach real-time speed due to the high complexity of model architecture. To offer a solution to these problems, this paper proposes a lightweight convolutional neural network that requires only two small arrays for minimum post-processing, instead of segmentation maps for the task of lane detection. This proposed network utilizes a simple lane representation format for its output. The proposed model can achieve 93.53% accuracy on the TuSimple dataset. A hardware accelerator is proposed and implemented on the Virtex-7 VC707 FPGA platform to optimize processing time and power consumption. Several techniques, including data quantization to reduce data width down to 8-bit, exploring various loop-unrolling strategies for different convolution layers, and pipelined computation across layers, are optimized in the proposed hardware accelerator architecture. This implementation can process at 640 FPS while consuming only 10.309 W, equating to a computation throughput of 345.6 GOPS and energy efficiency of 33.52 GOPS/W.
format Online
Article
Text
id pubmed-10422460
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-104224602023-08-13 QuantLaneNet: A 640-FPS and 34-GOPS/W FPGA-Based CNN Accelerator for Lane Detection Lam, Duc Khai Du, Cam Vinh Pham, Hoai Luan Sensors (Basel) Article Lane detection is one of the most fundamental problems in the rapidly developing field of autonomous vehicles. With the dramatic growth of deep learning in recent years, many models have achieved a high accuracy for this task. However, most existing deep-learning methods for lane detection face two main problems. First, most early studies usually follow a segmentation approach, which requires much post-processing to extract the necessary geometric information about the lane lines. Second, many models fail to reach real-time speed due to the high complexity of model architecture. To offer a solution to these problems, this paper proposes a lightweight convolutional neural network that requires only two small arrays for minimum post-processing, instead of segmentation maps for the task of lane detection. This proposed network utilizes a simple lane representation format for its output. The proposed model can achieve 93.53% accuracy on the TuSimple dataset. A hardware accelerator is proposed and implemented on the Virtex-7 VC707 FPGA platform to optimize processing time and power consumption. Several techniques, including data quantization to reduce data width down to 8-bit, exploring various loop-unrolling strategies for different convolution layers, and pipelined computation across layers, are optimized in the proposed hardware accelerator architecture. This implementation can process at 640 FPS while consuming only 10.309 W, equating to a computation throughput of 345.6 GOPS and energy efficiency of 33.52 GOPS/W. MDPI 2023-07-25 /pmc/articles/PMC10422460/ /pubmed/37571445 http://dx.doi.org/10.3390/s23156661 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Lam, Duc Khai
Du, Cam Vinh
Pham, Hoai Luan
QuantLaneNet: A 640-FPS and 34-GOPS/W FPGA-Based CNN Accelerator for Lane Detection
title QuantLaneNet: A 640-FPS and 34-GOPS/W FPGA-Based CNN Accelerator for Lane Detection
title_full QuantLaneNet: A 640-FPS and 34-GOPS/W FPGA-Based CNN Accelerator for Lane Detection
title_fullStr QuantLaneNet: A 640-FPS and 34-GOPS/W FPGA-Based CNN Accelerator for Lane Detection
title_full_unstemmed QuantLaneNet: A 640-FPS and 34-GOPS/W FPGA-Based CNN Accelerator for Lane Detection
title_short QuantLaneNet: A 640-FPS and 34-GOPS/W FPGA-Based CNN Accelerator for Lane Detection
title_sort quantlanenet: a 640-fps and 34-gops/w fpga-based cnn accelerator for lane detection
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10422460/
https://www.ncbi.nlm.nih.gov/pubmed/37571445
http://dx.doi.org/10.3390/s23156661
work_keys_str_mv AT lamduckhai quantlaneneta640fpsand34gopswfpgabasedcnnacceleratorforlanedetection
AT ducamvinh quantlaneneta640fpsand34gopswfpgabasedcnnacceleratorforlanedetection
AT phamhoailuan quantlaneneta640fpsand34gopswfpgabasedcnnacceleratorforlanedetection