A Survey on Low-Latency DNN-Based Speech Enhancement
This paper presents recent advances in low-latency, single-channel, deep neural network-based speech enhancement systems. The sources of latency and their acceptable values in different applications are described. This is followed by an analysis of the constraints imposed on neural network architect...
Main author: | |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9921748/ https://www.ncbi.nlm.nih.gov/pubmed/36772421 http://dx.doi.org/10.3390/s23031380 |
_version_ | 1784887386291830784 |
---|---|
author | Drgas, Szymon |
author_facet | Drgas, Szymon |
author_sort | Drgas, Szymon |
collection | PubMed |
description | This paper presents recent advances in low-latency, single-channel, deep neural network-based speech enhancement systems. The sources of latency and their acceptable values in different applications are described. This is followed by an analysis of the constraints imposed on neural network architectures. Specifically, the causal units used in deep neural networks are presented and discussed in the context of their properties, such as the number of parameters, the receptive field, and computational complexity. This is followed by a discussion of techniques used to reduce the computational complexity and memory requirements of the neural networks used in this task. Finally, the techniques used by the winners of the latest speech enhancement challenges (DNS, Clarity) are shown and compared. |
format | Online Article Text |
id | pubmed-9921748 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-99217482023-02-12 A Survey on Low-Latency DNN-Based Speech Enhancement Drgas, Szymon Sensors (Basel) Review This paper presents recent advances in low-latency, single-channel, deep neural network-based speech enhancement systems. The sources of latency and their acceptable values in different applications are described. This is followed by an analysis of the constraints imposed on neural network architectures. Specifically, the causal units used in deep neural networks are presented and discussed in the context of their properties, such as the number of parameters, the receptive field, and computational complexity. This is followed by a discussion of techniques used to reduce the computational complexity and memory requirements of the neural networks used in this task. Finally, the techniques used by the winners of the latest speech enhancement challenges (DNS, Clarity) are shown and compared. MDPI 2023-01-26 /pmc/articles/PMC9921748/ /pubmed/36772421 http://dx.doi.org/10.3390/s23031380 Text en © 2023 by the author. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Review Drgas, Szymon A Survey on Low-Latency DNN-Based Speech Enhancement |
title | A Survey on Low-Latency DNN-Based Speech Enhancement |
title_full | A Survey on Low-Latency DNN-Based Speech Enhancement |
title_fullStr | A Survey on Low-Latency DNN-Based Speech Enhancement |
title_full_unstemmed | A Survey on Low-Latency DNN-Based Speech Enhancement |
title_short | A Survey on Low-Latency DNN-Based Speech Enhancement |
title_sort | survey on low-latency dnn-based speech enhancement |
topic | Review |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9921748/ https://www.ncbi.nlm.nih.gov/pubmed/36772421 http://dx.doi.org/10.3390/s23031380 |
work_keys_str_mv | AT drgasszymon asurveyonlowlatencydnnbasedspeechenhancement AT drgasszymon surveyonlowlatencydnnbasedspeechenhancement |