A tool for efficient and accurate segmentation of speech data: announcing POnSS
Despite advances in automatic speech recognition (ASR), human input is still essential for producing research-grade segmentations of speech data. Conventional approaches to manual segmentation are very labor-intensive. We introduce POnSS, a browser-based system that is specialized for the task of se...
Main authors: | Rodd, Joe; Decuyper, Caitlin; Bosker, Hans Rutger; ten Bosch, Louis |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer US 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8062395/ https://www.ncbi.nlm.nih.gov/pubmed/32869139 http://dx.doi.org/10.3758/s13428-020-01449-6 |
_version_ | 1783681753942065152 |
---|---|
author | Rodd, Joe Decuyper, Caitlin Bosker, Hans Rutger ten Bosch, Louis |
author_facet | Rodd, Joe Decuyper, Caitlin Bosker, Hans Rutger ten Bosch, Louis |
author_sort | Rodd, Joe |
collection | PubMed |
description | Despite advances in automatic speech recognition (ASR), human input is still essential for producing research-grade segmentations of speech data. Conventional approaches to manual segmentation are very labor-intensive. We introduce POnSS, a browser-based system that is specialized for the task of segmenting the onsets and offsets of words, which combines aspects of ASR with limited human input. In developing POnSS, we identified several sub-tasks of segmentation, and implemented each of these as separate interfaces for the annotators to interact with to streamline their task as much as possible. We evaluated segmentations made with POnSS against a baseline of segmentations of the same data made conventionally in Praat. We observed that POnSS achieved comparable reliability to segmentation using Praat, but required 23% less annotator time investment. Because of its greater efficiency without sacrificing reliability, POnSS represents a distinct methodological advance for the segmentation of speech data. |
format | Online Article Text |
id | pubmed-8062395 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Springer US |
record_format | MEDLINE/PubMed |
spelling | pubmed-8062395 2021-05-05 A tool for efficient and accurate segmentation of speech data: announcing POnSS Rodd, Joe Decuyper, Caitlin Bosker, Hans Rutger ten Bosch, Louis Behav Res Methods Article Despite advances in automatic speech recognition (ASR), human input is still essential for producing research-grade segmentations of speech data. Conventional approaches to manual segmentation are very labor-intensive. We introduce POnSS, a browser-based system that is specialized for the task of segmenting the onsets and offsets of words, which combines aspects of ASR with limited human input. In developing POnSS, we identified several sub-tasks of segmentation, and implemented each of these as separate interfaces for the annotators to interact with to streamline their task as much as possible. We evaluated segmentations made with POnSS against a baseline of segmentations of the same data made conventionally in Praat. We observed that POnSS achieved comparable reliability to segmentation using Praat, but required 23% less annotator time investment. Because of its greater efficiency without sacrificing reliability, POnSS represents a distinct methodological advance for the segmentation of speech data. Springer US 2020-08-31 2021 /pmc/articles/PMC8062395/ /pubmed/32869139 http://dx.doi.org/10.3758/s13428-020-01449-6 Text en © The Author(s) 2020 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Rodd, Joe Decuyper, Caitlin Bosker, Hans Rutger ten Bosch, Louis A tool for efficient and accurate segmentation of speech data: announcing POnSS |
title | A tool for efficient and accurate segmentation of speech data: announcing POnSS |
title_full | A tool for efficient and accurate segmentation of speech data: announcing POnSS |
title_fullStr | A tool for efficient and accurate segmentation of speech data: announcing POnSS |
title_full_unstemmed | A tool for efficient and accurate segmentation of speech data: announcing POnSS |
title_short | A tool for efficient and accurate segmentation of speech data: announcing POnSS |
title_sort | tool for efficient and accurate segmentation of speech data: announcing ponss |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8062395/ https://www.ncbi.nlm.nih.gov/pubmed/32869139 http://dx.doi.org/10.3758/s13428-020-01449-6 |
work_keys_str_mv | AT roddjoe atoolforefficientandaccuratesegmentationofspeechdataannouncingponss AT decuypercaitlin atoolforefficientandaccuratesegmentationofspeechdataannouncingponss AT boskerhansrutger atoolforefficientandaccuratesegmentationofspeechdataannouncingponss AT tenboschlouis atoolforefficientandaccuratesegmentationofspeechdataannouncingponss AT roddjoe toolforefficientandaccuratesegmentationofspeechdataannouncingponss AT decuypercaitlin toolforefficientandaccuratesegmentationofspeechdataannouncingponss AT boskerhansrutger toolforefficientandaccuratesegmentationofspeechdataannouncingponss AT tenboschlouis toolforefficientandaccuratesegmentationofspeechdataannouncingponss |