A Survey on Low-Latency DNN-Based Speech Enhancement

This paper presents recent advances in low-latency, single-channel, deep neural network-based speech enhancement systems. The sources of latency and their acceptable values in different applications are described. This is followed by an analysis of the constraints imposed on neural network architect...

Bibliographic Details
Main Author: Drgas, Szymon
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9921748/
https://www.ncbi.nlm.nih.gov/pubmed/36772421
http://dx.doi.org/10.3390/s23031380
_version_ 1784887386291830784
author Drgas, Szymon
author_facet Drgas, Szymon
author_sort Drgas, Szymon
collection PubMed
description This paper presents recent advances in low-latency, single-channel, deep neural network-based speech enhancement systems. The sources of latency and their acceptable values in different applications are described. This is followed by an analysis of the constraints imposed on neural network architectures. Specifically, the causal units used in deep neural networks are presented and discussed in the context of their properties, such as the number of parameters, the receptive field, and computational complexity. This is followed by a discussion of techniques used to reduce the computational complexity and memory requirements of the neural networks used in this task. Finally, the techniques used by the winners of the latest speech enhancement challenges (DNS, Clarity) are shown and compared.
format Online
Article
Text
id pubmed-9921748
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9921748 2023-02-12 A Survey on Low-Latency DNN-Based Speech Enhancement Drgas, Szymon Sensors (Basel) Review This paper presents recent advances in low-latency, single-channel, deep neural network-based speech enhancement systems. The sources of latency and their acceptable values in different applications are described. This is followed by an analysis of the constraints imposed on neural network architectures. Specifically, the causal units used in deep neural networks are presented and discussed in the context of their properties, such as the number of parameters, the receptive field, and computational complexity. This is followed by a discussion of techniques used to reduce the computational complexity and memory requirements of the neural networks used in this task. Finally, the techniques used by the winners of the latest speech enhancement challenges (DNS, Clarity) are shown and compared. MDPI 2023-01-26 /pmc/articles/PMC9921748/ /pubmed/36772421 http://dx.doi.org/10.3390/s23031380 Text en © 2023 by the author. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Review
Drgas, Szymon
A Survey on Low-Latency DNN-Based Speech Enhancement
title A Survey on Low-Latency DNN-Based Speech Enhancement
title_full A Survey on Low-Latency DNN-Based Speech Enhancement
title_fullStr A Survey on Low-Latency DNN-Based Speech Enhancement
title_full_unstemmed A Survey on Low-Latency DNN-Based Speech Enhancement
title_short A Survey on Low-Latency DNN-Based Speech Enhancement
title_sort survey on low-latency dnn-based speech enhancement
topic Review
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9921748/
https://www.ncbi.nlm.nih.gov/pubmed/36772421
http://dx.doi.org/10.3390/s23031380
work_keys_str_mv AT drgasszymon asurveyonlowlatencydnnbasedspeechenhancement
AT drgasszymon surveyonlowlatencydnnbasedspeechenhancement