Automatic detection of prosodic boundaries in spontaneous speech

Automatic speech recognition (ASR) and natural language processing (NLP) are expected to benefit from an effective, simple, and reliable method to automatically parse conversational speech. The ability to parse conversational speech depends crucially on the ability to identify boundaries between pro...

Bibliographic Details
Main authors: Biron, Tirza, Baum, Daniel, Freche, Dominik, Matalon, Nadav, Ehrmann, Netanel, Weinreb, Eyal, Biron, David, Moses, Elisha
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8092678/
https://www.ncbi.nlm.nih.gov/pubmed/33939754
http://dx.doi.org/10.1371/journal.pone.0250969
_version_ 1783687677746348032
author Biron, Tirza
Baum, Daniel
Freche, Dominik
Matalon, Nadav
Ehrmann, Netanel
Weinreb, Eyal
Biron, David
Moses, Elisha
author_facet Biron, Tirza
Baum, Daniel
Freche, Dominik
Matalon, Nadav
Ehrmann, Netanel
Weinreb, Eyal
Biron, David
Moses, Elisha
author_sort Biron, Tirza
collection PubMed
description Automatic speech recognition (ASR) and natural language processing (NLP) are expected to benefit from an effective, simple, and reliable method to automatically parse conversational speech. The ability to parse conversational speech depends crucially on the ability to identify boundaries between prosodic phrases. This is done naturally by the human ear, yet has proved surprisingly difficult to achieve reliably and simply in an automatic manner. Efforts to date have focused on detecting phrase boundaries using a variety of linguistic and acoustic cues. We propose a method which does not require model training and utilizes two prosodic cues that are based on ASR output. Boundaries are identified using discontinuities in speech rate (pre-boundary lengthening and phrase-initial acceleration) and silent pauses. The resulting phrases preserve syntactic validity, exhibit pitch reset, and compare well with manual tagging of prosodic boundaries. Collectively, our findings support the notion of prosodic phrases that represent coherent patterns across textual and acoustic parameters.
format Online
Article
Text
id pubmed-8092678
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-80926782021-05-07 Automatic detection of prosodic boundaries in spontaneous speech Biron, Tirza Baum, Daniel Freche, Dominik Matalon, Nadav Ehrmann, Netanel Weinreb, Eyal Biron, David Moses, Elisha PLoS One Research Article Automatic speech recognition (ASR) and natural language processing (NLP) are expected to benefit from an effective, simple, and reliable method to automatically parse conversational speech. The ability to parse conversational speech depends crucially on the ability to identify boundaries between prosodic phrases. This is done naturally by the human ear, yet has proved surprisingly difficult to achieve reliably and simply in an automatic manner. Efforts to date have focused on detecting phrase boundaries using a variety of linguistic and acoustic cues. We propose a method which does not require model training and utilizes two prosodic cues that are based on ASR output. Boundaries are identified using discontinuities in speech rate (pre-boundary lengthening and phrase-initial acceleration) and silent pauses. The resulting phrases preserve syntactic validity, exhibit pitch reset, and compare well with manual tagging of prosodic boundaries. Collectively, our findings support the notion of prosodic phrases that represent coherent patterns across textual and acoustic parameters. Public Library of Science 2021-05-03 /pmc/articles/PMC8092678/ /pubmed/33939754 http://dx.doi.org/10.1371/journal.pone.0250969 Text en © 2021 Biron et al https://creativecommons.org/licenses/by/4.0/This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Biron, Tirza
Baum, Daniel
Freche, Dominik
Matalon, Nadav
Ehrmann, Netanel
Weinreb, Eyal
Biron, David
Moses, Elisha
Automatic detection of prosodic boundaries in spontaneous speech
title Automatic detection of prosodic boundaries in spontaneous speech
title_full Automatic detection of prosodic boundaries in spontaneous speech
title_fullStr Automatic detection of prosodic boundaries in spontaneous speech
title_full_unstemmed Automatic detection of prosodic boundaries in spontaneous speech
title_short Automatic detection of prosodic boundaries in spontaneous speech
title_sort automatic detection of prosodic boundaries in spontaneous speech
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8092678/
https://www.ncbi.nlm.nih.gov/pubmed/33939754
http://dx.doi.org/10.1371/journal.pone.0250969