Sound source localization based on residual network and channel attention module

This paper presents a sound source localization (SSL) model based on a residual network and a channel attention mechanism. The method takes the combination of the log-Mel spectrogram and the generalized cross-correlation phase transform (GCC-PHAT) as input features, and extracts the time–frequency informati...

Bibliographic Details
Main Authors:	Hu, Fucai, Song, Xiaohui, He, Ruhan, Yu, Yongsheng
Format:	Online Article Text
Language:	English
Published:	Nature Publishing Group UK 2023
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10070247/ https://www.ncbi.nlm.nih.gov/pubmed/37012391 http://dx.doi.org/10.1038/s41598-023-32657-7

_version_	1785018985872359424
author	Hu, Fucai Song, Xiaohui He, Ruhan Yu, Yongsheng
author_facet	Hu, Fucai Song, Xiaohui He, Ruhan Yu, Yongsheng
author_sort	Hu, Fucai
collection	PubMed
description	This paper presents a sound source localization (SSL) model based on a residual network and a channel attention mechanism. The method takes the combination of the log-Mel spectrogram and the generalized cross-correlation phase transform (GCC-PHAT) as input features, and extracts time–frequency information using the residual structure and the channel attention mechanism, thus obtaining better localization performance. Residual blocks are introduced to extract deeper features: they allow more layers to be stacked for high-level features while avoiding vanishing or exploding gradients. The attention mechanism is applied in the feature extraction stage of the proposed SSL model so that it can focus on the most important information in the input features. We use signals collected by a microphone array to explore the performance of the model under different features and to find the most suitable input features for the proposed method. We compare our method with other models on a public dataset. Experimental results show a substantial improvement in sound source localization performance.
format	Online Article Text
id	pubmed-10070247
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Nature Publishing Group UK
record_format	MEDLINE/PubMed
spelling	pubmed-10070247 2023-04-05 Sound source localization based on residual network and channel attention module Hu, Fucai Song, Xiaohui He, Ruhan Yu, Yongsheng Sci Rep Article Nature Publishing Group UK 2023-04-03 /pmc/articles/PMC10070247/ /pubmed/37012391 http://dx.doi.org/10.1038/s41598-023-32657-7 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
spellingShingle	Article Hu, Fucai Song, Xiaohui He, Ruhan Yu, Yongsheng Sound source localization based on residual network and channel attention module
title	Sound source localization based on residual network and channel attention module
title_sort	sound source localization based on residual network and channel attention module
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10070247/ https://www.ncbi.nlm.nih.gov/pubmed/37012391 http://dx.doi.org/10.1038/s41598-023-32657-7

Similar Items