Cargando…

Improved Transformer-Based Dual-Path Network with Amplitude and Complex Domain Feature Fusion for Speech Enhancement

Most previous speech enhancement methods only predict amplitude features, but more and more studies have proved that phase information is crucial for speech quality. Recently, there have also been some methods to choose complex features, but complex masks are difficult to estimate. Removing noise wh...

Descripción completa

Detalles Bibliográficos
Autores principales:	Ye, Moujia, Wan, Hongjie
Formato:	Online Artículo Texto
Lenguaje:	English
Publicado:	MDPI 2023
Materias:	Article
Acceso en línea:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9955017/ https://www.ncbi.nlm.nih.gov/pubmed/36832595 http://dx.doi.org/10.3390/e25020228

Descripción
Sumario:	Most previous speech enhancement methods only predict amplitude features, but more and more studies have proved that phase information is crucial for speech quality. Recently, there have also been some methods to choose complex features, but complex masks are difficult to estimate. Removing noise while maintaining good speech quality at low signal-to-noise ratios is still a problem. This study proposes a dual-path network structure for speech enhancement that can model complex spectra and amplitudes simultaneously, and introduces an attention-aware feature fusion module to fuse the two features to facilitate overall spectrum recovery. In addition, we improve a transformer-based feature extraction module that can efficiently extract local and global features. The proposed network achieves better performance than the baseline models in experiments on the Voice Bank + DEMAND dataset. We also conducted ablation experiments to verify the effectiveness of the dual-path structure, the improved transformer, and the fusion module, and investigated the effect of the input-mask multiplication strategy on the results.

Improved Transformer-Based Dual-Path Network with Amplitude and Complex Domain Feature Fusion for Speech Enhancement

Ejemplares similares