
Based on cross-scale fusion attention mechanism network for semantic segmentation for street scenes

Semantic segmentation is a fundamental task in computer vision in which every pixel is assigned a specific semantic class. High-accuracy segmentation algorithms are difficult to deploy on embedded systems and mobile devices, and despite the rapid development of semantic segmentation, the balance between speed and accuracy still needs improvement. To address these problems, we created a cross-scale fusion attention mechanism network called CFANet, which fuses feature maps from different scales. We first design a novel efficient residual module (ERM) that applies both dilated convolution and factorized convolution; our CFANet is mainly constructed from ERMs. We then design a new multi-branch channel attention mechanism (MCAM) to refine the feature maps at different levels. Experimental results show that CFANet achieves 70.6% and 67.7% mean intersection over union (mIoU) on the Cityscapes and CamVid datasets, respectively, with inference speeds of 118 FPS and 105 FPS on an NVIDIA RTX 2080Ti GPU, using only 0.84M parameters.

Bibliographic Details
Main Authors: Ye, Xin; Gao, Lang; Chen, Jichen; Lei, Mingyue
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10501793/
https://www.ncbi.nlm.nih.gov/pubmed/37719330
http://dx.doi.org/10.3389/fnbot.2023.1204418
_version_ 1785106186194911232
author Ye, Xin
Gao, Lang
Chen, Jichen
Lei, Mingyue
author_facet Ye, Xin
Gao, Lang
Chen, Jichen
Lei, Mingyue
author_sort Ye, Xin
collection PubMed
description Semantic segmentation is a fundamental task in computer vision in which every pixel is assigned a specific semantic class. High-accuracy segmentation algorithms are difficult to deploy on embedded systems and mobile devices, and despite the rapid development of semantic segmentation, the balance between speed and accuracy still needs improvement. To address these problems, we created a cross-scale fusion attention mechanism network called CFANet, which fuses feature maps from different scales. We first design a novel efficient residual module (ERM) that applies both dilated convolution and factorized convolution; our CFANet is mainly constructed from ERMs. We then design a new multi-branch channel attention mechanism (MCAM) to refine the feature maps at different levels. Experimental results show that CFANet achieves 70.6% and 67.7% mean intersection over union (mIoU) on the Cityscapes and CamVid datasets, respectively, with inference speeds of 118 FPS and 105 FPS on an NVIDIA RTX 2080Ti GPU, using only 0.84M parameters.
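The mean intersection over union (mIoU) figures reported in the abstract follow the standard per-class IoU definition: for each class, intersection over union of predicted and ground-truth pixel masks, averaged over classes. The sketch below is a minimal, generic illustration of that metric on made-up toy label maps; the function name `mean_iou` and the toy arrays are illustrative assumptions, not the authors' evaluation code.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection over union (mIoU) across classes.

    pred, target: integer label maps of identical shape.
    Classes absent from both maps are skipped so they do not
    drag the average down with an undefined 0/0 ratio.
    """
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        inter = np.logical_and(pred_c, target_c).sum()
        union = np.logical_or(pred_c, target_c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))

# Toy 2x3 label maps with two classes (purely illustrative)
pred = np.array([[0, 0, 1],
                 [1, 1, 0]])
target = np.array([[0, 1, 1],
                   [1, 1, 0]])
print(mean_iou(pred, target, num_classes=2))  # mean of IoU 2/3 and 3/4, ~0.708
```

On real benchmarks such as Cityscapes, the same computation is typically accumulated over a whole validation set via a confusion matrix rather than per image, but the per-class ratio being averaged is the same.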
format Online
Article
Text
id pubmed-10501793
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10501793 2023-09-15 Based on cross-scale fusion attention mechanism network for semantic segmentation for street scenes Ye, Xin Gao, Lang Chen, Jichen Lei, Mingyue Front Neurorobot Neuroscience Semantic segmentation is a fundamental task in computer vision in which every pixel is assigned a specific semantic class. High-accuracy segmentation algorithms are difficult to deploy on embedded systems and mobile devices, and despite the rapid development of semantic segmentation, the balance between speed and accuracy still needs improvement. To address these problems, we created a cross-scale fusion attention mechanism network called CFANet, which fuses feature maps from different scales. We first design a novel efficient residual module (ERM) that applies both dilated convolution and factorized convolution; our CFANet is mainly constructed from ERMs. We then design a new multi-branch channel attention mechanism (MCAM) to refine the feature maps at different levels. Experimental results show that CFANet achieves 70.6% and 67.7% mean intersection over union (mIoU) on the Cityscapes and CamVid datasets, respectively, with inference speeds of 118 FPS and 105 FPS on an NVIDIA RTX 2080Ti GPU, using only 0.84M parameters. Frontiers Media S.A. 2023-08-31 /pmc/articles/PMC10501793/ /pubmed/37719330 http://dx.doi.org/10.3389/fnbot.2023.1204418 Text en Copyright © 2023 Ye, Gao, Chen and Lei. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Neuroscience
Ye, Xin
Gao, Lang
Chen, Jichen
Lei, Mingyue
Based on cross-scale fusion attention mechanism network for semantic segmentation for street scenes
title Based on cross-scale fusion attention mechanism network for semantic segmentation for street scenes
title_full Based on cross-scale fusion attention mechanism network for semantic segmentation for street scenes
title_fullStr Based on cross-scale fusion attention mechanism network for semantic segmentation for street scenes
title_full_unstemmed Based on cross-scale fusion attention mechanism network for semantic segmentation for street scenes
title_short Based on cross-scale fusion attention mechanism network for semantic segmentation for street scenes
title_sort based on cross-scale fusion attention mechanism network for semantic segmentation for street scenes
topic Neuroscience
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10501793/
https://www.ncbi.nlm.nih.gov/pubmed/37719330
http://dx.doi.org/10.3389/fnbot.2023.1204418
work_keys_str_mv AT yexin basedoncrossscalefusionattentionmechanismnetworkforsemanticsegmentationforstreetscenes
AT gaolang basedoncrossscalefusionattentionmechanismnetworkforsemanticsegmentationforstreetscenes
AT chenjichen basedoncrossscalefusionattentionmechanismnetworkforsemanticsegmentationforstreetscenes
AT leimingyue basedoncrossscalefusionattentionmechanismnetworkforsemanticsegmentationforstreetscenes