A Speech Recognition Method Based on Domain-Specific Datasets and Confidence Decision Networks

This paper proposes a speech recognition method based on a domain-specific language speech network (DSL-Net) and a confidence decision network (CD-Net). The method involves automatically training on a domain-specific dataset, using pre-trained model parameters for transfer learning, and obtaining a do...

Bibliographic Details
Main Authors:	Dong, Zhe, Ding, Qianqian, Zhai, Weifeng, Zhou, Meng
Format:	Online Article Text
Language:	English
Published:	MDPI 2023
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10346893/ https://www.ncbi.nlm.nih.gov/pubmed/37447886 http://dx.doi.org/10.3390/s23136036

_version_	1785073421342736384
author	Dong, Zhe Ding, Qianqian Zhai, Weifeng Zhou, Meng
author_facet	Dong, Zhe Ding, Qianqian Zhai, Weifeng Zhou, Meng
author_sort	Dong, Zhe
collection	PubMed
description	This paper proposes a speech recognition method based on a domain-specific language speech network (DSL-Net) and a confidence decision network (CD-Net). The method automatically trains on a domain-specific dataset, using pre-trained model parameters for transfer learning, to obtain a domain-specific speech model. Importance sampling weights are set for the trained domain-specific speech model, which is then integrated with the speech model trained on the benchmark dataset. Based on the lexicon and language model, this integration automatically expands the lexical content of the model to accommodate the input speech. The adaptation addresses out-of-vocabulary words, which are likely to arise in most realistic scenarios, and uses external knowledge sources to extend the existing language model. In doing so, the approach makes the language model more adaptable to new domains or scenarios and improves the prediction accuracy of the model. Domain-specific vocabulary is recognized with a deep fully convolutional neural network (DFCNN) and a connectionist temporal classification (CTC)-based approach. A confidence-based classifier was added to improve the accuracy and robustness of the overall approach. In the experiments, the method was tested on a proprietary domain audio dataset and compared with an automatic speech recognition (ASR) system trained on a large-scale dataset. In experimental verification, the model improved accuracy from 82% to 91% in the medical domain. The inclusion of domain-specific datasets resulted in a 5% to 7% enhancement over the baseline, while the introduction of model confidence further improved the baseline by 3% to 5%. These findings demonstrate the significance of incorporating domain-specific datasets and model confidence in advancing speech recognition technology.
format	Online Article Text
id	pubmed-10346893
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-10346893 2023-07-15 A Speech Recognition Method Based on Domain-Specific Datasets and Confidence Decision Networks Dong, Zhe Ding, Qianqian Zhai, Weifeng Zhou, Meng Sensors (Basel) Article MDPI 2023-06-29 /pmc/articles/PMC10346893/ /pubmed/37447886 http://dx.doi.org/10.3390/s23136036 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title	A Speech Recognition Method Based on Domain-Specific Datasets and Confidence Decision Networks
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10346893/ https://www.ncbi.nlm.nih.gov/pubmed/37447886 http://dx.doi.org/10.3390/s23136036