QuantLaneNet: A 640-FPS and 34-GOPS/W FPGA-Based CNN Accelerator for Lane Detection
Lane detection is one of the most fundamental problems in the rapidly developing field of autonomous vehicles. With the dramatic growth of deep learning in recent years, many models have achieved a high accuracy for this task. However, most existing deep-learning methods for lane detection face two...
Main authors: | Lam, Duc Khai; Du, Cam Vinh; Pham, Hoai Luan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10422460/ https://www.ncbi.nlm.nih.gov/pubmed/37571445 http://dx.doi.org/10.3390/s23156661 |
_version_ | 1785089215892029440 |
---|---|
author | Lam, Duc Khai Du, Cam Vinh Pham, Hoai Luan |
author_facet | Lam, Duc Khai Du, Cam Vinh Pham, Hoai Luan |
author_sort | Lam, Duc Khai |
collection | PubMed |
description | Lane detection is one of the most fundamental problems in the rapidly developing field of autonomous vehicles. With the dramatic growth of deep learning in recent years, many models have achieved a high accuracy for this task. However, most existing deep-learning methods for lane detection face two main problems. First, most early studies usually follow a segmentation approach, which requires much post-processing to extract the necessary geometric information about the lane lines. Second, many models fail to reach real-time speed due to the high complexity of model architecture. To offer a solution to these problems, this paper proposes a lightweight convolutional neural network that requires only two small arrays for minimum post-processing, instead of segmentation maps for the task of lane detection. This proposed network utilizes a simple lane representation format for its output. The proposed model can achieve 93.53% accuracy on the TuSimple dataset. A hardware accelerator is proposed and implemented on the Virtex-7 VC707 FPGA platform to optimize processing time and power consumption. Several techniques, including data quantization to reduce data width down to 8-bit, exploring various loop-unrolling strategies for different convolution layers, and pipelined computation across layers, are optimized in the proposed hardware accelerator architecture. This implementation can process at 640 FPS while consuming only 10.309 W, equating to a computation throughput of 345.6 GOPS and energy efficiency of 33.52 GOPS/W. |
format | Online Article Text |
id | pubmed-10422460 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10422460 2023-08-13 QuantLaneNet: A 640-FPS and 34-GOPS/W FPGA-Based CNN Accelerator for Lane Detection Lam, Duc Khai Du, Cam Vinh Pham, Hoai Luan Sensors (Basel) Article Lane detection is one of the most fundamental problems in the rapidly developing field of autonomous vehicles. With the dramatic growth of deep learning in recent years, many models have achieved a high accuracy for this task. However, most existing deep-learning methods for lane detection face two main problems. First, most early studies usually follow a segmentation approach, which requires much post-processing to extract the necessary geometric information about the lane lines. Second, many models fail to reach real-time speed due to the high complexity of model architecture. To offer a solution to these problems, this paper proposes a lightweight convolutional neural network that requires only two small arrays for minimum post-processing, instead of segmentation maps for the task of lane detection. This proposed network utilizes a simple lane representation format for its output. The proposed model can achieve 93.53% accuracy on the TuSimple dataset. A hardware accelerator is proposed and implemented on the Virtex-7 VC707 FPGA platform to optimize processing time and power consumption. Several techniques, including data quantization to reduce data width down to 8-bit, exploring various loop-unrolling strategies for different convolution layers, and pipelined computation across layers, are optimized in the proposed hardware accelerator architecture. This implementation can process at 640 FPS while consuming only 10.309 W, equating to a computation throughput of 345.6 GOPS and energy efficiency of 33.52 GOPS/W. MDPI 2023-07-25 /pmc/articles/PMC10422460/ /pubmed/37571445 http://dx.doi.org/10.3390/s23156661 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Lam, Duc Khai Du, Cam Vinh Pham, Hoai Luan QuantLaneNet: A 640-FPS and 34-GOPS/W FPGA-Based CNN Accelerator for Lane Detection |
title | QuantLaneNet: A 640-FPS and 34-GOPS/W FPGA-Based CNN Accelerator for Lane Detection |
title_full | QuantLaneNet: A 640-FPS and 34-GOPS/W FPGA-Based CNN Accelerator for Lane Detection |
title_fullStr | QuantLaneNet: A 640-FPS and 34-GOPS/W FPGA-Based CNN Accelerator for Lane Detection |
title_full_unstemmed | QuantLaneNet: A 640-FPS and 34-GOPS/W FPGA-Based CNN Accelerator for Lane Detection |
title_short | QuantLaneNet: A 640-FPS and 34-GOPS/W FPGA-Based CNN Accelerator for Lane Detection |
title_sort | quantlanenet: a 640-fps and 34-gops/w fpga-based cnn accelerator for lane detection |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10422460/ https://www.ncbi.nlm.nih.gov/pubmed/37571445 http://dx.doi.org/10.3390/s23156661 |
work_keys_str_mv | AT lamduckhai quantlaneneta640fpsand34gopswfpgabasedcnnacceleratorforlanedetection AT ducamvinh quantlaneneta640fpsand34gopswfpgabasedcnnacceleratorforlanedetection AT phamhoailuan quantlaneneta640fpsand34gopswfpgabasedcnnacceleratorforlanedetection |
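
The performance figures quoted in the description field (640 FPS, 10.309 W, 345.6 GOPS, 33.52 GOPS/W) are related by simple arithmetic. The sketch below is only a consistency check on those published numbers, not code from the paper. The per-frame values are derived here under the assumption that the quoted throughput and power are sustained while running at 640 FPS, and all function names are hypothetical.

```python
# Figures quoted in the record's description field.
FPS = 640.0              # frames processed per second
POWER_W = 10.309         # power consumption in watts
THROUGHPUT_GOPS = 345.6  # computation throughput in giga-operations per second


def energy_efficiency(gops: float, watts: float) -> float:
    """Energy efficiency in GOPS/W: throughput divided by power."""
    return gops / watts


def gop_per_frame(gops: float, fps: float) -> float:
    """Implied operations per frame, in giga-operations (derived, not quoted)."""
    return gops / fps


def energy_per_frame_mj(watts: float, fps: float) -> float:
    """Implied energy per frame in millijoules (derived, not quoted)."""
    return watts / fps * 1000.0


efficiency = energy_efficiency(THROUGHPUT_GOPS, POWER_W)
ops_per_frame = gop_per_frame(THROUGHPUT_GOPS, FPS)
frame_energy = energy_per_frame_mj(POWER_W, FPS)

print(f"Energy efficiency: {efficiency:.2f} GOPS/W")
print(f"Implied work per frame: {ops_per_frame:.2f} GOP")
print(f"Implied energy per frame: {frame_energy:.2f} mJ")
```

The computed efficiency rounds to the 33.52 GOPS/W stated in the description, which the title further rounds to 34 GOPS/W.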