
A tool for efficient and accurate segmentation of speech data: announcing POnSS

Bibliographic Details
Main authors: Rodd, Joe; Decuyper, Caitlin; Bosker, Hans Rutger; ten Bosch, Louis
Format: Online Article Text
Language: English
Published: Springer US 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8062395/
https://www.ncbi.nlm.nih.gov/pubmed/32869139
http://dx.doi.org/10.3758/s13428-020-01449-6
author Rodd, Joe
Decuyper, Caitlin
Bosker, Hans Rutger
ten Bosch, Louis
collection PubMed
description Despite advances in automatic speech recognition (ASR), human input is still essential for producing research-grade segmentations of speech data. Conventional approaches to manual segmentation are very labor-intensive. We introduce POnSS, a browser-based system that is specialized for the task of segmenting the onsets and offsets of words, which combines aspects of ASR with limited human input. In developing POnSS, we identified several sub-tasks of segmentation, and implemented each of these as separate interfaces for the annotators to interact with to streamline their task as much as possible. We evaluated segmentations made with POnSS against a baseline of segmentations of the same data made conventionally in Praat. We observed that POnSS achieved comparable reliability to segmentation using Praat, but required 23% less annotator time investment. Because of its greater efficiency without sacrificing reliability, POnSS represents a distinct methodological advance for the segmentation of speech data.
format Online
Article
Text
id pubmed-8062395
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Springer US
record_format MEDLINE/PubMed
spelling pubmed-8062395 2021-05-05 A tool for efficient and accurate segmentation of speech data: announcing POnSS Rodd, Joe; Decuyper, Caitlin; Bosker, Hans Rutger; ten Bosch, Louis Behav Res Methods Article Despite advances in automatic speech recognition (ASR), human input is still essential for producing research-grade segmentations of speech data. Conventional approaches to manual segmentation are very labor-intensive. We introduce POnSS, a browser-based system that is specialized for the task of segmenting the onsets and offsets of words, which combines aspects of ASR with limited human input. In developing POnSS, we identified several sub-tasks of segmentation, and implemented each of these as separate interfaces for the annotators to interact with to streamline their task as much as possible. We evaluated segmentations made with POnSS against a baseline of segmentations of the same data made conventionally in Praat. We observed that POnSS achieved comparable reliability to segmentation using Praat, but required 23% less annotator time investment. Because of its greater efficiency without sacrificing reliability, POnSS represents a distinct methodological advance for the segmentation of speech data. Springer US 2020-08-31 2021 /pmc/articles/PMC8062395/ /pubmed/32869139 http://dx.doi.org/10.3758/s13428-020-01449-6 Text en © The Author(s) 2020 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material.
If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title A tool for efficient and accurate segmentation of speech data: announcing POnSS
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8062395/
https://www.ncbi.nlm.nih.gov/pubmed/32869139
http://dx.doi.org/10.3758/s13428-020-01449-6