Cargando…
Non-intrusive speech quality assessment with attention-based ResNet-BiLSTM
Speech quality is frequently affected by a variety factors in online conferencing applications, such as background noise, reverberation, packet loss and network jitter. In real scenarios, it is impossible to obtain a clean reference signal for evaluating the quality of the conferencing speech. There...
Autores principales: | , , , , , , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
Springer London
2023
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10088708/ https://www.ncbi.nlm.nih.gov/pubmed/37362228 http://dx.doi.org/10.1007/s11760-023-02559-2 |
Sumario: | Speech quality is frequently affected by a variety factors in online conferencing applications, such as background noise, reverberation, packet loss and network jitter. In real scenarios, it is impossible to obtain a clean reference signal for evaluating the quality of the conferencing speech. Therefore, an effective non-intrusive speech quality assessment (NISQA) method is necessary. In this paper, we propose a new network framework for NISQA based on ResNet and BiLSTM. ResNet is utilized to extract local features, while BiLSTM is used to integrate representative features with long-term time dependencies and sequential characteristics. Considering that ResNet may result in the loss of context information when applied to the NISQA task, we propose a variant of ResNet which can preserve the time series information of the conferencing speech. The experimental results demonstrate that the proposed method has a high correlation with the mean opinion score of clean, noisy and processed speech. |
---|