
Visualization of Speech Perception Analysis via Phoneme Alignment: A Pilot Study

Objective: Speech tests assess the ability of people with hearing loss to comprehend speech with a hearing aid or cochlear implant. The tests are usually at the word or sentence level. However, few tests analyze errors at the phoneme level, so there is a need for an automated program to visualize in real time the accuracy of phonemes in these tests.


Bibliographic Details
Main Authors: Ratnanather, J. Tilak, Wang, Lydia C., Bae, Seung-Ho, O'Neill, Erin R., Sagi, Elad, Tward, Daniel J.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Neurology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8787339/
https://www.ncbi.nlm.nih.gov/pubmed/35087462
http://dx.doi.org/10.3389/fneur.2021.724800
author Ratnanather, J. Tilak
Wang, Lydia C.
Bae, Seung-Ho
O'Neill, Erin R.
Sagi, Elad
Tward, Daniel J.
author_sort Ratnanather, J. Tilak
collection PubMed
description Objective: Speech tests assess the ability of people with hearing loss to comprehend speech with a hearing aid or cochlear implant. The tests are usually at the word or sentence level. However, few tests analyze errors at the phoneme level, so there is a need for an automated program to visualize in real time the accuracy of phonemes in these tests. Method: The program reads in stimulus-response pairs and obtains their phonemic representations from an open-source digital pronouncing dictionary. The stimulus phonemes are aligned with the response phonemes via a modification of the Levenshtein minimum edit distance algorithm. Alignment is achieved via dynamic programming with costs for insertions, deletions, and substitutions modified according to phonological features. The accuracy of each phoneme is scored with the F1-score. Accuracy is visualized with respect to place and manner (consonants) or height (vowels). Confusion matrices for the phonemes are used in an information transfer analysis of ten phonological features. A histogram of the information transfer for the features over a frequency-like range is presented as a phonemegram. Results: The program was applied to two datasets. One consisted of test data at the sentence and word levels. Stimulus-response sentence pairs from six volunteers with different degrees of hearing loss and modes of amplification were analyzed; four volunteers listened to sentences from a mobile auditory training app while two listened to sentences from a clinical speech test. Stimulus-response word pairs from three lists were also analyzed. The other dataset consisted of published stimulus-response pairs from experiments in which 31 participants with cochlear implants listened to 400 Basic English Lexicon sentences from different talkers at four SNR levels. In all cases, visualization was obtained in real time, and analysis of 12,400 actual and random pairs showed that the program was robust to the nature of the pairs.
Conclusion: It is possible to automate, in real time, the alignment of phonemes extracted from stimulus-response pairs from speech tests. The alignment then makes it possible to visualize the accuracy of responses via phonological features in two ways. Such visualization of phoneme alignment and accuracy could aid clinicians and scientists.
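The alignment step described under Method can be sketched in code. The following is a minimal illustration of feature-weighted Levenshtein alignment with per-phoneme F1 scoring, not the published implementation: the toy feature table, the Jaccard-style substitution cost, and the flat insertion/deletion cost of 1.0 are all assumptions made for this example.

```python
from collections import Counter

# Toy phonological feature table (ARPAbet-style symbols). The real program
# covers the full phoneme inventory; this subset is illustrative only.
FEATURES = {
    "P":  {"consonant", "bilabial", "stop", "voiceless"},
    "B":  {"consonant", "bilabial", "stop", "voiced"},
    "T":  {"consonant", "alveolar", "stop", "voiceless"},
    "AE": {"vowel", "low", "front"},
    "EH": {"vowel", "mid", "front"},
}

INS_DEL_COST = 1.0  # assumed flat insertion/deletion cost


def sub_cost(a, b):
    """Substitution cost from phonological feature overlap (0 = identical)."""
    if a == b:
        return 0.0
    fa, fb = FEATURES[a], FEATURES[b]
    # Jaccard distance: phonemes sharing more features substitute more cheaply.
    return 1.0 - len(fa & fb) / len(fa | fb)


def align(stimulus, response):
    """Dynamic-programming alignment of two phoneme sequences.

    Returns (total cost, list of (stimulus, response) phoneme pairs,
    with None marking an insertion or deletion).
    """
    n, m = len(stimulus), len(response)
    # dp[i][j] = minimum cost of aligning stimulus[:i] with response[:j]
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * INS_DEL_COST
    for j in range(1, m + 1):
        dp[0][j] = j * INS_DEL_COST
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = min(
                dp[i - 1][j - 1] + sub_cost(stimulus[i - 1], response[j - 1]),
                dp[i - 1][j] + INS_DEL_COST,  # phoneme deleted from stimulus
                dp[i][j - 1] + INS_DEL_COST,  # phoneme inserted in response
            )
    # Trace back to recover the aligned pairs.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and dp[i][j] == dp[i - 1][j - 1]
                + sub_cost(stimulus[i - 1], response[j - 1])):
            pairs.append((stimulus[i - 1], response[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + INS_DEL_COST:
            pairs.append((stimulus[i - 1], None))
            i -= 1
        else:
            pairs.append((None, response[j - 1]))
            j -= 1
    return dp[n][m], list(reversed(pairs))


def phoneme_f1(pairs):
    """Per-phoneme F1 over aligned pairs (one assumed reading of the metric)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for s, r in pairs:
        if s == r:
            tp[s] += 1
        else:
            if s is not None:
                fn[s] += 1  # stimulus phoneme missed or misheard
            if r is not None:
                fp[r] += 1  # response phoneme wrongly reported
    return {p: 2 * tp[p] / (2 * tp[p] + fp[p] + fn[p])
            for p in set(tp) | set(fp) | set(fn)}


# "bat" (B AE T) heard as "pat" (P AE T): one cheap voicing substitution.
cost, pairs = align(["B", "AE", "T"], ["P", "AE", "T"])
# cost == 0.4; pairs == [("B", "P"), ("AE", "AE"), ("T", "T")]
```

Because the costs are feature-based, a voicing confusion such as B→P is cheaper than pairing a vowel against a consonant, so the dynamic program prefers linguistically plausible alignments over the flat-cost Levenshtein ones.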
format Online
Article
Text
id pubmed-8787339
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-8787339 2022-01-26 Front Neurol Neurology Frontiers Media S.A. 2022-01-11 /pmc/articles/PMC8787339/ /pubmed/35087462 http://dx.doi.org/10.3389/fneur.2021.724800 Text en Copyright © 2022 Ratnanather, Wang, Bae, O'Neill, Sagi and Tward. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Visualization of Speech Perception Analysis via Phoneme Alignment: A Pilot Study
topic Neurology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8787339/
https://www.ncbi.nlm.nih.gov/pubmed/35087462
http://dx.doi.org/10.3389/fneur.2021.724800