
Identifying languages in a novel dataset: ASMR-whispered speech

INTRODUCTION: The Autonomous Sensory Meridian Response (ASMR) is a combination of sensory phenomena involving electrostatic-like tingling sensations, which emerge in response to certain stimuli. Despite the overwhelming popularity of ASMR on social media, no open-source databases on ASMR related...


Bibliographic Details
Main authors:	Song, Meishu; Yang, Zijiang; Parada-Cabaleiro, Emilia; Jing, Xin; Yamamoto, Yoshiharu; Schuller, Björn
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2023
Subjects:	Neuroscience
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10308374/ https://www.ncbi.nlm.nih.gov/pubmed/37397449 http://dx.doi.org/10.3389/fnins.2023.1120311

author	Song, Meishu; Yang, Zijiang; Parada-Cabaleiro, Emilia; Jing, Xin; Yamamoto, Yoshiharu; Schuller, Björn
collection	PubMed
description	INTRODUCTION: The Autonomous Sensory Meridian Response (ASMR) is a combination of sensory phenomena involving electrostatic-like tingling sensations, which emerge in response to certain stimuli. Despite the overwhelming popularity of ASMR on social media, no open-source databases of ASMR-related stimuli are yet available, which leaves this phenomenon mostly inaccessible to the research community and thus almost completely unexplored. In this regard, we present the ASMR Whispered-Speech (ASMR-WS) database. METHODS: ASMR-WS is a novel database of whispered speech, specifically tailored to promote the development of ASMR-like unvoiced Language Identification (unvoiced-LID) systems. The ASMR-WS database encompasses 38 videos, for a total duration of 10 h and 36 min, and includes seven target languages (Chinese, English, French, Italian, Japanese, Korean, and Spanish). Along with the database, we present baseline results for unvoiced-LID on the ASMR-WS database. RESULTS: Our best results on the seven-class problem, based on segments of 2 s length, a CNN classifier, and MFCC acoustic features, achieved an unweighted average recall of 85.74% and an accuracy of 90.83%. DISCUSSION: For future work, we would like to focus more deeply on the duration of speech samples, as we see varied results with the combinations applied herein. To enable further research in this area, the ASMR-WS database, as well as the partitioning considered in the presented baseline, is made accessible to the research community.
format	Online Article Text
id	pubmed-10308374
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-10308374 2023-06-30 Identifying languages in a novel dataset: ASMR-whispered speech. Song, Meishu; Yang, Zijiang; Parada-Cabaleiro, Emilia; Jing, Xin; Yamamoto, Yoshiharu; Schuller, Björn. Front Neurosci. Neuroscience. Frontiers Media S.A. 2023-06-15. /pmc/articles/PMC10308374/ /pubmed/37397449 http://dx.doi.org/10.3389/fnins.2023.1120311 Text en. Copyright © 2023 Song, Yang, Parada-Cabaleiro, Jing, Yamamoto and Schuller.
https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	Identifying languages in a novel dataset: ASMR-whispered speech
topic	Neuroscience
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10308374/ https://www.ncbi.nlm.nih.gov/pubmed/37397449 http://dx.doi.org/10.3389/fnins.2023.1120311
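
The abstract in this record reports two figures for the seven-class problem: unweighted average recall (UAR) and accuracy. The two can diverge when classes are unevenly represented, because UAR averages the per-class recalls with equal weight while accuracy counts every segment equally. The following is a minimal sketch of both metrics, not the authors' evaluation code; the function names and the toy labels are assumptions made for illustration and are not taken from the ASMR-WS database.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of segments whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def unweighted_average_recall(y_true, y_pred):
    """Mean of the per-class recalls, each class weighted equally."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    total = defaultdict(int)    # segments per true class
    correct = defaultdict(int)  # correctly classified segments per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy, imbalanced two-class example (hypothetical labels, not ASMR-WS data).
y_true = ["en"] * 8 + ["ja"] * 2
y_pred = ["en"] * 8 + ["en", "ja"]

acc = accuracy(y_true, y_pred)                   # 9 of 10 segments correct
uar = unweighted_average_recall(y_true, y_pred)  # mean of recalls 1.0 and 0.5
print(f"accuracy={acc:.2f} UAR={uar:.2f}")
```

In this toy case the majority class dominates accuracy, whereas UAR is pulled down by the weaker recall on the minority class, which is why UAR is often reported alongside accuracy for multi-class problems.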
