An Efficient Time-Domain End-to-End Single-Channel Bird Sound Separation Network

SIMPLE SUMMARY: Automatic bird sound recognition using artificial intelligence technology has been widely used to identify bird species recently. However, the bird sounds recorded in the wild are usually mixed sounds, which can affect the accuracy of identification. In this paper, we utilized massiv...

Bibliographic Details
Main Authors:	Zhang, Chengyun; Chen, Yonghuan; Hao, Zezhou; Gao, Xinghui
Format:	Online Article Text
Language:	English
Published:	MDPI 2022
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9686777/ https://www.ncbi.nlm.nih.gov/pubmed/36428345 http://dx.doi.org/10.3390/ani12223117

_version_	1784835837581590528
author	Zhang, Chengyun Chen, Yonghuan Hao, Zezhou Gao, Xinghui
author_facet	Zhang, Chengyun Chen, Yonghuan Hao, Zezhou Gao, Xinghui
author_sort	Zhang, Chengyun
collection	PubMed
description	SIMPLE SUMMARY: Automatic bird sound recognition based on artificial intelligence has recently been widely used to identify bird species. However, bird sounds recorded in the wild are usually mixtures, which can reduce identification accuracy. In this paper, we used a large amount of bird sound data and proposed an efficient time-domain single-channel bird sound separation network. The proposed network achieved good separation performance and fast separation speed while greatly reducing the consumption of computational resources. Our work may help to discriminate individual birds and study the interactions between them, as well as to enable automatic identification of bird species on various mobile and edge computing devices. ABSTRACT: Bird sounds have distinct characteristics for each species, and they are an important way for birds to communicate and transmit information. However, bird sounds recorded in the field are usually mixed, which makes it challenging to identify different bird species and to perform associated tasks. In this study, based on the supervised learning framework, we propose a bird sound separation network, a dual-path tiny transformer network, that directly performs end-to-end separation of mixed-species bird sounds in the time domain. This separation network is mainly composed of a dual-path network and a simplified transformer structure, which greatly reduces the computational resources the network requires. Experimental results show that the proposed separation network has good separation performance (SI-SNRi reaches 19.3 dB and SDRi reaches 20.1 dB), while its parameters and floating-point operations are greatly reduced compared with DPRNN and DPTNet, which means higher separation efficiency and faster separation speed. The good separation performance and high separation efficiency indicate that the proposed separation network is valuable for distinguishing individual birds and studying the interactions between them, as well as for automatic identification of bird species on a variety of mobile or edge computing devices.
format	Online Article Text
id	pubmed-9686777
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-9686777 2022-11-25 An Efficient Time-Domain End-to-End Single-Channel Bird Sound Separation Network Zhang, Chengyun Chen, Yonghuan Hao, Zezhou Gao, Xinghui Animals (Basel) Article MDPI 2022-11-11 /pmc/articles/PMC9686777/ /pubmed/36428345 http://dx.doi.org/10.3390/ani12223117 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle	Article Zhang, Chengyun Chen, Yonghuan Hao, Zezhou Gao, Xinghui An Efficient Time-Domain End-to-End Single-Channel Bird Sound Separation Network
title	An Efficient Time-Domain End-to-End Single-Channel Bird Sound Separation Network
title_full	An Efficient Time-Domain End-to-End Single-Channel Bird Sound Separation Network
title_fullStr	An Efficient Time-Domain End-to-End Single-Channel Bird Sound Separation Network
title_full_unstemmed	An Efficient Time-Domain End-to-End Single-Channel Bird Sound Separation Network
title_short	An Efficient Time-Domain End-to-End Single-Channel Bird Sound Separation Network
title_sort	efficient time-domain end-to-end single-channel bird sound separation network
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9686777/ https://www.ncbi.nlm.nih.gov/pubmed/36428345 http://dx.doi.org/10.3390/ani12223117
work_keys_str_mv	AT zhangchengyun anefficienttimedomainendtoendsinglechannelbirdsoundseparationnetwork AT chenyonghuan anefficienttimedomainendtoendsinglechannelbirdsoundseparationnetwork AT haozezhou anefficienttimedomainendtoendsinglechannelbirdsoundseparationnetwork AT gaoxinghui anefficienttimedomainendtoendsinglechannelbirdsoundseparationnetwork AT zhangchengyun efficienttimedomainendtoendsinglechannelbirdsoundseparationnetwork AT chenyonghuan efficienttimedomainendtoendsinglechannelbirdsoundseparationnetwork AT haozezhou efficienttimedomainendtoendsinglechannelbirdsoundseparationnetwork AT gaoxinghui efficienttimedomainendtoendsinglechannelbirdsoundseparationnetwork
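
The description above reports separation quality as SI-SNRi (scale-invariant signal-to-noise ratio improvement) in dB. This record does not give the formula, so the following is only a minimal sketch of the definition commonly used in source separation work, assuming the paper follows it. The function names, the toy sine-wave signals, and the `eps` stabilizer are illustrative assumptions, not taken from the article.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length sample sequences.

    Assumed (commonly used) definition: remove the mean of both signals,
    project the estimate onto the target, and compare the energy of that
    projection with the energy of the residual.
    """
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")
    n = len(target)
    est_mean = sum(estimate) / n
    tgt_mean = sum(target) / n
    est = [x - est_mean for x in estimate]
    tgt = [x - tgt_mean for x in target]

    # Projection of the estimate onto the target direction.
    dot = sum(e * t for e, t in zip(est, tgt))
    tgt_energy = sum(t * t for t in tgt) + eps
    scale = dot / tgt_energy
    s_target = [scale * t for t in tgt]

    # Everything in the estimate that the target does not explain.
    e_noise = [e - s for e, s in zip(est, s_target)]

    signal_energy = sum(s * s for s in s_target)
    noise_energy = sum(v * v for v in e_noise)
    return 10.0 * math.log10((signal_energy + eps) / (noise_energy + eps))


def si_snr_improvement(estimate, mixture, target):
    """SI-SNRi: gain of the separated estimate over the unprocessed mixture."""
    return si_snr(estimate, target) - si_snr(mixture, target)


def make_tone(freq_hz, n_samples=800, sample_rate=8000):
    """Hypothetical helper: a pure sine tone standing in for one bird call."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]


# Toy data: two tones play the role of two overlapping bird sounds.
target = make_tone(440)
interferer = make_tone(1000)
mixture = [t + v for t, v in zip(target, interferer)]
# A hypothetical separator output that leaves 10% of the interferer behind.
estimate = [t + 0.1 * v for t, v in zip(target, interferer)]

print(f"SI-SNRi: {si_snr_improvement(estimate, mixture, target):.2f} dB")
```

Because the estimate is projected onto the target before the energies are compared, rescaling the separator output does not change the score, which is why this metric is often preferred over plain SNR for time-domain separation networks.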
