Confusion2Vec 2.0: Enriching ambiguous spoken language representations with subwords

Word vector representations enable machines to encode human language for spoken language understanding and processing. Confusion2vec, motivated by human speech production and perception, is a word vector representation which encodes ambiguities present in human spoken language in addition to seman...

Bibliographic Details
Main Authors: Gurunath Shivakumar, Prashanth; Georgiou, Panayiotis; Narayanan, Shrikanth
Format: Online Article Text
Language: English
Published: Public Library of Science 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8896703/
https://www.ncbi.nlm.nih.gov/pubmed/35245327
http://dx.doi.org/10.1371/journal.pone.0264488
_version_ 1784663219347914752
author Gurunath Shivakumar, Prashanth
Georgiou, Panayiotis
Narayanan, Shrikanth
author_facet Gurunath Shivakumar, Prashanth
Georgiou, Panayiotis
Narayanan, Shrikanth
author_sort Gurunath Shivakumar, Prashanth
collection PubMed
description Word vector representations enable machines to encode human language for spoken language understanding and processing. Confusion2vec, motivated by human speech production and perception, is a word vector representation which encodes ambiguities present in human spoken language in addition to semantic and syntactic information. Confusion2vec provides a robust spoken language representation by considering inherent human language ambiguities. In this paper, we propose a novel word vector space estimation by unsupervised learning on lattices output by an automatic speech recognition (ASR) system. We encode each word in Confusion2vec vector space by its constituent subword character n-grams. We show that the subword encoding helps better represent the acoustic perceptual ambiguities in human spoken language via information modeled on lattice-structured ASR output. The usefulness of the proposed Confusion2vec representation is evaluated using analogy and word similarity tasks designed for assessing semantic, syntactic and acoustic word relations. We also show the benefits of subword modeling for acoustic ambiguity representation on the task of spoken language intent detection. The results significantly outperform existing word vector representations when evaluated on erroneous ASR outputs, providing improvements of up to 13.12% relative to the previous state of the art in intent detection on the ATIS benchmark dataset. We demonstrate that Confusion2vec subword modeling eliminates the need for retraining/adapting the natural language understanding models on ASR transcripts.
format Online
Article
Text
id pubmed-8896703
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-8896703 2022-03-05 Confusion2Vec 2.0: Enriching ambiguous spoken language representations with subwords Gurunath Shivakumar, Prashanth Georgiou, Panayiotis Narayanan, Shrikanth PLoS One Research Article
Public Library of Science 2022-03-04 /pmc/articles/PMC8896703/ /pubmed/35245327 http://dx.doi.org/10.1371/journal.pone.0264488 Text en © 2022 Gurunath Shivakumar et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Gurunath Shivakumar, Prashanth
Georgiou, Panayiotis
Narayanan, Shrikanth
Confusion2Vec 2.0: Enriching ambiguous spoken language representations with subwords
title Confusion2Vec 2.0: Enriching ambiguous spoken language representations with subwords
title_full Confusion2Vec 2.0: Enriching ambiguous spoken language representations with subwords
title_fullStr Confusion2Vec 2.0: Enriching ambiguous spoken language representations with subwords
title_full_unstemmed Confusion2Vec 2.0: Enriching ambiguous spoken language representations with subwords
title_short Confusion2Vec 2.0: Enriching ambiguous spoken language representations with subwords
title_sort confusion2vec 2.0: enriching ambiguous spoken language representations with subwords
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8896703/
https://www.ncbi.nlm.nih.gov/pubmed/35245327
http://dx.doi.org/10.1371/journal.pone.0264488
work_keys_str_mv AT gurunathshivakumarprashanth confusion2vec20enrichingambiguousspokenlanguagerepresentationswithsubwords
AT georgioupanayiotis confusion2vec20enrichingambiguousspokenlanguagerepresentationswithsubwords
AT narayananshrikanth confusion2vec20enrichingambiguousspokenlanguagerepresentationswithsubwords
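
Illustrative note on the description field: the abstract states that each word is encoded by its constituent subword character n-grams. The sketch below shows one common way such an encoding can be composed, following the fastText convention of wrapping a word in boundary markers and averaging the vectors of its n-grams. This record does not give the paper's n-gram range, marker scheme, or training procedure, so the range 3 to 6, the "<" and ">" markers, and the hash-seeded toy vectors (a stand-in for learned embeddings) are all assumptions made for illustration. The function names are hypothetical.

```python
import math
import random
import zlib
from typing import List


def char_ngrams(word: str, n_min: int = 3, n_max: int = 6) -> List[str]:
    """Return the character n-grams of a word wrapped in boundary markers.

    The markers and the default 3..6 range are assumptions (fastText-style).
    """
    marked = f"<{word}>"
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(marked) - n + 1):
            grams.append(marked[i:i + n])
    return grams


def toy_ngram_vector(gram: str, dim: int = 8) -> List[float]:
    """Deterministic toy vector seeded by a hash of the n-gram.

    Stand-in for a learned n-gram embedding; not a trained model.
    """
    rng = random.Random(zlib.crc32(gram.encode("utf-8")))
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def word_vector(word: str, dim: int = 8) -> List[float]:
    """Compose a word vector as the average of its n-gram vectors."""
    grams = char_ngrams(word)
    total = [0.0] * dim
    for gram in grams:
        for k, value in enumerate(toy_ngram_vector(gram, dim)):
            total[k] += value
    return [value / len(grams) for value in total]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Acoustically confusable words often share spelling fragments.
    shared = set(char_ngrams("their")) & set(char_ngrams("there"))
    print(sorted(shared))
    print(cosine(word_vector("their"), word_vector("there")))
```

Because a vector is assembled from n-grams rather than looked up per word, a word unseen in training still receives a representation from whatever n-grams it shares with known words. With toy vectors the similarity value printed above carries no linguistic meaning; only the composition step is being illustrated.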