
A scalable speech coding scheme using compressive sensing and orthogonal mapping based vector quantization

A novel scalable speech coding scheme based on Compressive Sensing (CS), which can operate at bit rates from 3.275 to 7.275 kbps, is designed and implemented in this paper. The CS based speech coding offers the benefit of combined compression and encryption with inherent de-noising and bit rate scala...


Bibliographic Details
Main authors: Sankar, M.S. Arun, Sathidevi, P.S.
Format: Online Article Text
Language: English
Published: Elsevier 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6541890/
https://www.ncbi.nlm.nih.gov/pubmed/31193755
http://dx.doi.org/10.1016/j.heliyon.2019.e01820
author Sankar, M.S. Arun
Sathidevi, P.S.
collection PubMed
description This paper designs and implements a novel scalable speech coding scheme based on Compressive Sensing (CS) that operates at bit rates from 3.275 to 7.275 kbps. CS-based speech coding offers combined compression and encryption, with inherent de-noising and bit rate scalability. Because speech is non-stationary, its sparsifying basis varies over time, which makes recovery from CS measurements very complex. In this work, the complexity of the recovery process is reduced by assigning a suitable basis to each frame of the speech signal based on its statistical properties. Because the quality of the reconstructed speech depends on the sensing matrix used at the transmitter, a variant of the Binary Permuted Block Diagonal (BPBD) matrix is also proposed, which performs better than the commonly used Gaussian random matrix. To improve coding efficiency, the formant filter coefficients are quantized using conventional Vector Quantization (VQ), and an orthogonal mapping based VQ is developed to quantize the CS measurements. The proposed scheme offers listening quality for reconstructed speech similar to that of the Adaptive Multi-Rate Narrowband (AMR-NB) codec at 6.7 kbps and Enhanced Voice Services (EVS) at 7.2 kbps. Owing to the inherent de-noising property of CS, the proposed scheme does not require a separate de-noising block. Bit rate scalability is achieved by varying the number of random measurements and the number of levels for orthogonal mapping in the VQ stage of the measurements.
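The measure-then-recover pipeline summarized in the description can be illustrated with a minimal sketch. This is not the authors' method: the record does not give the frame length, the per-frame basis selection rule, the exact BPBD variant, or the recovery algorithm. The sketch assumes a fixed DCT sparsifying basis, a toy frame that is exactly sparse in that basis, Orthogonal Matching Pursuit (OMP) as a stand-in recovery algorithm, and a simplified binary block-diagonal matrix with permuted columns. The names `dct_basis`, `binary_block_diagonal`, `omp`, and `demo` are hypothetical.

```python
import numpy as np


def dct_basis(n):
    """Orthonormal DCT synthesis matrix Psi, so that a frame is x = Psi @ s."""
    k = np.arange(n)
    # Rows index frequency, columns index time: cos(pi * (2t + 1) * f / (2n)).
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C *= np.sqrt(2.0 / n)
    C[0, :] /= np.sqrt(2.0)
    return C.T


def binary_block_diagonal(m, n, rng):
    """Simplified stand-in (assumption) for a binary permuted block-diagonal matrix.

    Each row sums n // m samples; the columns are then randomly permuted.
    """
    assert n % m == 0, "n must be a multiple of m in this sketch"
    b = n // m
    Phi = np.kron(np.eye(m), np.ones((1, b))) / np.sqrt(b)
    return Phi[:, rng.permutation(n)]


def omp(A, y, k):
    """Orthogonal Matching Pursuit: greedily select up to k columns of A."""
    norms = np.maximum(np.linalg.norm(A, axis=0), 1e-12)
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Pick the column most correlated with the current residual.
        idx = int(np.argmax(np.abs(A.T @ residual) / norms))
        if idx in support:
            break
        support.append(idx)
        # Re-fit all selected coefficients by least squares.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    s = np.zeros(A.shape[1])
    s[support] = coef
    return s


def demo(n=256, m=128, k=4, seed=0):
    """Relative reconstruction error for a Gaussian and a binary sensing matrix."""
    rng = np.random.default_rng(seed)
    Psi = dct_basis(n)
    s = np.zeros(n)
    s[rng.choice(n, size=k, replace=False)] = rng.uniform(1.0, 2.0, size=k)
    x = Psi @ s  # toy frame, exactly k-sparse in the DCT basis
    matrices = {
        "gaussian": rng.standard_normal((m, n)) / np.sqrt(m),
        "binary_block_diagonal": binary_block_diagonal(m, n, rng),
    }
    errors = {}
    for name, Phi in matrices.items():
        y = Phi @ x  # m < n compressive measurements
        x_hat = Psi @ omp(Phi @ Psi, y, k)
        errors[name] = float(np.linalg.norm(x - x_hat) / np.linalg.norm(x))
    return errors


if __name__ == "__main__":
    print(demo())
```

Calling `demo()` returns the relative reconstruction error for each sensing matrix on the toy frame. Real speech frames are only approximately sparse, so these numbers say nothing about the listening quality reported in the paper. Changing `m` illustrates the scalability idea in the description: fewer measurements mean fewer values to quantize and transmit.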
format Online
Article
Text
id pubmed-6541890
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-6541890 2019-06-03 Heliyon Article
Elsevier 2019-05-28 /pmc/articles/PMC6541890/ /pubmed/31193755 http://dx.doi.org/10.1016/j.heliyon.2019.e01820 Text en © 2019 The Authors. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title A scalable speech coding scheme using compressive sensing and orthogonal mapping based vector quantization
topic Article