Based on cross-scale fusion attention mechanism network for semantic segmentation for street scenes
Semantic segmentation is a fundamental task in computer vision: every pixel in an image is assigned a specific semantic class. High-accuracy segmentation algorithms are difficult to deploy on embedded systems and mobile devices, and despite the rapid development of semantic segmentation, the balance between speed and accuracy still needs improvement. To address these problems, we propose CFANet, a cross-scale fusion attention mechanism network that fuses feature maps from different scales. We first design a novel efficient residual module (ERM) that applies both dilated convolution and factorized convolution; CFANet is mainly constructed from ERMs. We then design a new multi-branch channel attention mechanism (MCAM) to refine the feature maps at different levels. Experimental results show that CFANet achieves 70.6% and 67.7% mean intersection over union (mIoU) on the Cityscapes and CamVid datasets, respectively, with inference speeds of 118 FPS and 105 FPS on an NVIDIA RTX 2080Ti GPU and 0.84M parameters.
| Main authors: | Ye, Xin; Gao, Lang; Chen, Jichen; Lei, Mingyue |
|---|---|
| Format: | Online Article Text |
| Language: | English |
| Published: | Frontiers Media S.A., 2023 |
| Subjects: | Neuroscience |
| Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10501793/ https://www.ncbi.nlm.nih.gov/pubmed/37719330 http://dx.doi.org/10.3389/fnbot.2023.1204418 |
_version_ | 1785106186194911232 |
---|---|
author | Ye, Xin; Gao, Lang; Chen, Jichen; Lei, Mingyue
author_facet | Ye, Xin; Gao, Lang; Chen, Jichen; Lei, Mingyue
author_sort | Ye, Xin |
collection | PubMed |
description | Semantic segmentation is a fundamental task in computer vision: every pixel in an image is assigned a specific semantic class. High-accuracy segmentation algorithms are difficult to deploy on embedded systems and mobile devices, and despite the rapid development of semantic segmentation, the balance between speed and accuracy still needs improvement. To address these problems, we propose CFANet, a cross-scale fusion attention mechanism network that fuses feature maps from different scales. We first design a novel efficient residual module (ERM) that applies both dilated convolution and factorized convolution; CFANet is mainly constructed from ERMs. We then design a new multi-branch channel attention mechanism (MCAM) to refine the feature maps at different levels. Experimental results show that CFANet achieves 70.6% and 67.7% mean intersection over union (mIoU) on the Cityscapes and CamVid datasets, respectively, with inference speeds of 118 FPS and 105 FPS on an NVIDIA RTX 2080Ti GPU and 0.84M parameters.
format | Online Article Text |
id | pubmed-10501793 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-10501793 2023-09-15 Based on cross-scale fusion attention mechanism network for semantic segmentation for street scenes Ye, Xin; Gao, Lang; Chen, Jichen; Lei, Mingyue Front Neurorobot Neuroscience Semantic segmentation is a fundamental task in computer vision: every pixel in an image is assigned a specific semantic class. High-accuracy segmentation algorithms are difficult to deploy on embedded systems and mobile devices, and despite the rapid development of semantic segmentation, the balance between speed and accuracy still needs improvement. To address these problems, we propose CFANet, a cross-scale fusion attention mechanism network that fuses feature maps from different scales. We first design a novel efficient residual module (ERM) that applies both dilated convolution and factorized convolution; CFANet is mainly constructed from ERMs. We then design a new multi-branch channel attention mechanism (MCAM) to refine the feature maps at different levels. Experimental results show that CFANet achieves 70.6% and 67.7% mean intersection over union (mIoU) on the Cityscapes and CamVid datasets, respectively, with inference speeds of 118 FPS and 105 FPS on an NVIDIA RTX 2080Ti GPU and 0.84M parameters. Frontiers Media S.A. 2023-08-31 /pmc/articles/PMC10501793/ /pubmed/37719330 http://dx.doi.org/10.3389/fnbot.2023.1204418 Text en Copyright © 2023 Ye, Gao, Chen and Lei. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Neuroscience; Ye, Xin; Gao, Lang; Chen, Jichen; Lei, Mingyue; Based on cross-scale fusion attention mechanism network for semantic segmentation for street scenes
title | Based on cross-scale fusion attention mechanism network for semantic segmentation for street scenes |
title_full | Based on cross-scale fusion attention mechanism network for semantic segmentation for street scenes |
title_fullStr | Based on cross-scale fusion attention mechanism network for semantic segmentation for street scenes |
title_full_unstemmed | Based on cross-scale fusion attention mechanism network for semantic segmentation for street scenes |
title_short | Based on cross-scale fusion attention mechanism network for semantic segmentation for street scenes |
title_sort | based on cross-scale fusion attention mechanism network for semantic segmentation for street scenes |
topic | Neuroscience |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10501793/ https://www.ncbi.nlm.nih.gov/pubmed/37719330 http://dx.doi.org/10.3389/fnbot.2023.1204418 |
work_keys_str_mv | AT yexin basedoncrossscalefusionattentionmechanismnetworkforsemanticsegmentationforstreetscenes AT gaolang basedoncrossscalefusionattentionmechanismnetworkforsemanticsegmentationforstreetscenes AT chenjichen basedoncrossscalefusionattentionmechanismnetworkforsemanticsegmentationforstreetscenes AT leimingyue basedoncrossscalefusionattentionmechanismnetworkforsemanticsegmentationforstreetscenes |
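The abstract's efficient residual module combines dilated and factorized convolutions. A common motivation for factorizing a k×k kernel into a k×1 followed by a 1×k kernel is the parameter saving; the arithmetic below is a general illustration of that saving (the 64-channel width is an assumption for the example, not a figure from the paper, and dilation itself does not change the parameter count):

```python
def conv2d_weights(c_in, c_out, kh, kw):
    # Number of weight parameters in a 2D convolution (bias ignored):
    # one kh x kw kernel per (input channel, output channel) pair.
    return c_in * c_out * kh * kw

# A full 3x3 convolution over 64 channels.
full = conv2d_weights(64, 64, 3, 3)
# The same 3x3 receptive field factorized into 3x1 followed by 1x3.
factorized = conv2d_weights(64, 64, 3, 1) + conv2d_weights(64, 64, 1, 3)
print(full, factorized)  # the factorized pair uses 2/3 of the parameters
```

In general the ratio is 2k/k² = 2/k, so the saving grows with kernel size; the trade-off is that the factorized pair can only express separable kernels exactly.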