
Efficient Multi-Scale Stereo-Matching Network Using Adaptive Cost Volume Filtering

While recent deep learning-based stereo-matching networks have shown outstanding advances, some challenges remain unsolved. First, most state-of-the-art stereo models employ 3D convolutions for 4D cost volume aggregation, which limits the deployment of networks for resource-limited mobile en...


Bibliographic Details
Main authors:	Jeon, Suyeon; Heo, Yong Seok
Format:	Online Article Text
Language:	English
Published:	MDPI 2022
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9371070/ https://www.ncbi.nlm.nih.gov/pubmed/35898003 http://dx.doi.org/10.3390/s22155500

_version_	1784767020409028608
author	Jeon, Suyeon Heo, Yong Seok
author_facet	Jeon, Suyeon Heo, Yong Seok
author_sort	Jeon, Suyeon
collection	PubMed
description	While recent deep learning-based stereo-matching networks have shown outstanding advances, some challenges remain unsolved. First, most state-of-the-art stereo models employ 3D convolutions for 4D cost volume aggregation, which limits the deployment of networks in resource-limited mobile environments owing to heavy consumption of computation and memory. Although some efficient networks exist, most of them are still too computationally expensive to run on mobile computing devices in real time. Second, most stereo networks supervise cost volumes only indirectly, through a disparity regression loss computed with the softargmax function. This causes problems in ambiguous regions, such as object boundaries, because many unreasonable cost distributions remain possible, which results in overfitting. A few works deal with this problem by generating an artificial cost distribution using only the ground truth disparity value, which is insufficient to fully regularize the cost volume. To address these problems, we first propose an efficient multi-scale sequential feature fusion network (MSFFNet). Specifically, we connect multi-scale SFF modules in parallel with a cross-scale fusion function to generate a set of cost volumes with different scales. These cost volumes are then effectively combined using the proposed interlaced concatenation method. Second, we propose an adaptive cost-volume-filtering (ACVF) loss function that directly supervises our estimated cost volume. The proposed ACVF loss directly adds constraints to the cost volume using the probability distribution generated from the ground truth disparity map and the distribution estimated by a teacher network that achieves higher accuracy. Results of several experiments using representative datasets for stereo matching show that our proposed method is more efficient than previous methods. Our network architecture uses fewer parameters and generates reasonable disparity maps faster than existing state-of-the-art stereo models. Concretely, our network achieves 1.01 EPE with a runtime of 42 ms, 2.92 M parameters, and 97.96 G FLOPs on the Scene Flow test set. Compared with PSMNet, our method is 89% faster and 7% more accurate with 45% fewer parameters.
format	Online Article Text
id	pubmed-9371070
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-9371070 2022-08-12 Efficient Multi-Scale Stereo-Matching Network Using Adaptive Cost Volume Filtering Jeon, Suyeon Heo, Yong Seok Sensors (Basel) Article MDPI 2022-07-23 /pmc/articles/PMC9371070/ /pubmed/35898003 http://dx.doi.org/10.3390/s22155500 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle	Article Jeon, Suyeon Heo, Yong Seok Efficient Multi-Scale Stereo-Matching Network Using Adaptive Cost Volume Filtering
title	Efficient Multi-Scale Stereo-Matching Network Using Adaptive Cost Volume Filtering
title_sort	efficient multi-scale stereo-matching network using adaptive cost volume filtering
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9371070/ https://www.ncbi.nlm.nih.gov/pubmed/35898003 http://dx.doi.org/10.3390/s22155500
work_keys_str_mv	AT jeonsuyeon efficientmultiscalestereomatchingnetworkusingadaptivecostvolumefiltering AT heoyongseok efficientmultiscalestereomatchingnetworkusingadaptivecostvolumefiltering
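
The abstract contrasts indirect supervision of the cost volume (a regression loss on the softargmax disparity) with direct supervision of the per-pixel disparity distribution. The NumPy sketch below illustrates only that contrast and is not the paper's ACVF loss or MSFFNet code. The function names, the `(D, H, W)` cost layout, the lower-cost-is-better convention, and the Laplacian target with its `scale` parameter are all assumptions introduced for illustration.

```python
import numpy as np

def disparity_probabilities(cost):
    """Softmax over the disparity axis of a (D, H, W) matching-cost volume.
    Assumption: lower cost means a better match, so we softmax over -cost."""
    logits = -cost
    shifted = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    prob = np.exp(shifted)
    return prob / prob.sum(axis=0, keepdims=True)

def softargmax_disparity(cost):
    """Expected disparity under the softmax distribution (softargmax)."""
    prob = disparity_probabilities(cost)
    d = np.arange(cost.shape[0], dtype=np.float64).reshape(-1, 1, 1)
    return (prob * d).sum(axis=0)

def laplacian_target(gt_disp, max_disp, scale=1.0):
    """Hypothetical unimodal target centred on the ground-truth disparity."""
    d = np.arange(max_disp, dtype=np.float64).reshape(-1, 1, 1)
    logits = -np.abs(d - gt_disp[None]) / scale
    p = np.exp(logits - logits.max(axis=0, keepdims=True))
    return p / p.sum(axis=0, keepdims=True)

def cross_entropy(target, prob, eps=1e-12):
    """Mean per-pixel cross-entropy between target and predicted distributions."""
    return float(-(target * np.log(prob + eps)).sum(axis=0).mean())

# One pixel, 11 disparity candidates, ground-truth disparity 5.
unimodal = np.full((11, 1, 1), 10.0)
unimodal[5] = 0.0                      # single clear match at d = 5
bimodal = np.full((11, 1, 1), 10.0)
bimodal[2] = 0.0                       # two equally good matches, as can
bimodal[8] = 0.0                       # happen near an object boundary

gt = np.full((1, 1), 5.0)
target = laplacian_target(gt, max_disp=11)

for name, cost in (("unimodal", unimodal), ("bimodal", bimodal)):
    disp = softargmax_disparity(cost)[0, 0]
    ce = cross_entropy(target, disparity_probabilities(cost))
    print(f"{name}: softargmax disparity = {disp:.3f}, cross-entropy = {ce:.3f}")
```

Both cost columns regress to a disparity of 5, so a regression loss alone cannot tell them apart, while the cross-entropy against the target distribution penalizes the bimodal column more heavily. This is the kind of ambiguity that direct cost-volume supervision is meant to remove.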


Similar items