
Accent Recognition with Hybrid Phonetic Features

The performance of voice-controlled systems is usually influenced by accented speech. To make these systems more robust, frontend accent recognition (AR) technologies have received increased attention in recent years. As accent is a high-level abstract feature that has a profound relationship with l...


Bibliographic Details
Main Authors: Zhang, Zhan; Wang, Yuehai; Yang, Jianyi
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8469688/
https://www.ncbi.nlm.nih.gov/pubmed/34577464
http://dx.doi.org/10.3390/s21186258
_version_ 1784573999818211328
author Zhang, Zhan
Wang, Yuehai
Yang, Jianyi
author_facet Zhang, Zhan
Wang, Yuehai
Yang, Jianyi
author_sort Zhang, Zhan
collection PubMed
description The performance of voice-controlled systems is usually influenced by accented speech. To make these systems more robust, frontend accent recognition (AR) technologies have received increased attention in recent years. As accent is a high-level abstract feature that has a profound relationship with language knowledge, AR is more challenging than other language-agnostic audio classification tasks. In this paper, we use an auxiliary automatic speech recognition (ASR) task to extract language-related phonetic features. Furthermore, we propose a hybrid structure that incorporates the embeddings of both a fixed acoustic model and a trainable acoustic model, making the language-related acoustic feature more robust. We conduct several experiments on the AESRC dataset. The results demonstrate that our approach can obtain an 8.02% relative improvement compared with the Transformer baseline, showing the merits of the proposed method.
format Online
Article
Text
id pubmed-8469688
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8469688 2021-09-27 Accent Recognition with Hybrid Phonetic Features Zhang, Zhan Wang, Yuehai Yang, Jianyi Sensors (Basel) Article The performance of voice-controlled systems is usually influenced by accented speech. To make these systems more robust, frontend accent recognition (AR) technologies have received increased attention in recent years. As accent is a high-level abstract feature that has a profound relationship with language knowledge, AR is more challenging than other language-agnostic audio classification tasks. In this paper, we use an auxiliary automatic speech recognition (ASR) task to extract language-related phonetic features. Furthermore, we propose a hybrid structure that incorporates the embeddings of both a fixed acoustic model and a trainable acoustic model, making the language-related acoustic feature more robust. We conduct several experiments on the AESRC dataset. The results demonstrate that our approach can obtain an 8.02% relative improvement compared with the Transformer baseline, showing the merits of the proposed method. MDPI 2021-09-18 /pmc/articles/PMC8469688/ /pubmed/34577464 http://dx.doi.org/10.3390/s21186258 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Zhang, Zhan
Wang, Yuehai
Yang, Jianyi
Accent Recognition with Hybrid Phonetic Features
title Accent Recognition with Hybrid Phonetic Features
title_full Accent Recognition with Hybrid Phonetic Features
title_fullStr Accent Recognition with Hybrid Phonetic Features
title_full_unstemmed Accent Recognition with Hybrid Phonetic Features
title_short Accent Recognition with Hybrid Phonetic Features
title_sort accent recognition with hybrid phonetic features
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8469688/
https://www.ncbi.nlm.nih.gov/pubmed/34577464
http://dx.doi.org/10.3390/s21186258
work_keys_str_mv AT zhangzhan accentrecognitionwithhybridphoneticfeatures
AT wangyuehai accentrecognitionwithhybridphoneticfeatures
AT yangjianyi accentrecognitionwithhybridphoneticfeatures