Degraded and computer-generated speech processing in a bonobo

The human auditory system is capable of processing human speech even in situations when it has been heavily degraded, such as during noise-vocoding, when frequency domain-based cues to phonetic content are strongly reduced. This has contributed to arguments that speech processing is highly specializ...

Bibliographic Details
Main Authors: Lahiff, Nicole J., Slocombe, Katie E., Taglialatela, Jared, Dellwo, Volker, Townsend, Simon W.
Format: Online Article Text
Language: English
Published: Springer Berlin Heidelberg 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9652166/
https://www.ncbi.nlm.nih.gov/pubmed/35595881
http://dx.doi.org/10.1007/s10071-022-01621-9
_version_ 1784828409385320448
author Lahiff, Nicole J.
Slocombe, Katie E.
Taglialatela, Jared
Dellwo, Volker
Townsend, Simon W.
author_facet Lahiff, Nicole J.
Slocombe, Katie E.
Taglialatela, Jared
Dellwo, Volker
Townsend, Simon W.
author_sort Lahiff, Nicole J.
collection PubMed
description The human auditory system is capable of processing human speech even in situations when it has been heavily degraded, such as during noise-vocoding, when frequency domain-based cues to phonetic content are strongly reduced. This has contributed to arguments that speech processing is highly specialized and likely a de novo evolved trait in humans. Previous comparative research has demonstrated that a language competent chimpanzee was also capable of recognizing degraded speech, and therefore that the mechanisms underlying speech processing may not be uniquely human. However, to form a robust reconstruction of the evolutionary origins of speech processing, additional data from other closely related ape species is needed. Specifically, such data can help disentangle whether these capabilities evolved independently in humans and chimpanzees, or if they were inherited from our last common ancestor. Here we provide evidence of processing of highly varied (degraded and computer-generated) speech in a language competent bonobo, Kanzi. We took advantage of Kanzi’s existing proficiency with touchscreens and his ability to report his understanding of human speech through interacting with arbitrary symbols called lexigrams. Specifically, we asked Kanzi to recognise both human (natural) and computer-generated forms of 40 highly familiar words that had been degraded (noise-vocoded and sinusoidal forms) using a match-to-sample paradigm. Results suggest that—apart from noise-vocoded computer-generated speech—Kanzi recognised both natural and computer-generated voices that had been degraded, at rates significantly above chance. Kanzi performed better with all forms of natural voice speech compared to computer-generated speech. 
This work provides additional support for the hypothesis that the processing apparatus necessary to deal with highly variable speech, including for the first time in nonhuman animals, computer-generated speech, may be at least as old as the last common ancestor we share with bonobos and chimpanzees. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s10071-022-01621-9.
format Online
Article
Text
id pubmed-9652166
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer Berlin Heidelberg
record_format MEDLINE/PubMed
spelling pubmed-9652166 2022-11-15 Degraded and computer-generated speech processing in a bonobo Lahiff, Nicole J. Slocombe, Katie E. Taglialatela, Jared Dellwo, Volker Townsend, Simon W. Anim Cogn Original Paper Springer Berlin Heidelberg 2022-05-20 2022 /pmc/articles/PMC9652166/ /pubmed/35595881 http://dx.doi.org/10.1007/s10071-022-01621-9 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle Original Paper
Lahiff, Nicole J.
Slocombe, Katie E.
Taglialatela, Jared
Dellwo, Volker
Townsend, Simon W.
Degraded and computer-generated speech processing in a bonobo
title Degraded and computer-generated speech processing in a bonobo
title_full Degraded and computer-generated speech processing in a bonobo
title_fullStr Degraded and computer-generated speech processing in a bonobo
title_full_unstemmed Degraded and computer-generated speech processing in a bonobo
title_short Degraded and computer-generated speech processing in a bonobo
title_sort degraded and computer-generated speech processing in a bonobo
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9652166/
https://www.ncbi.nlm.nih.gov/pubmed/35595881
http://dx.doi.org/10.1007/s10071-022-01621-9
work_keys_str_mv AT lahiffnicolej degradedandcomputergeneratedspeechprocessinginabonobo
AT slocombekatiee degradedandcomputergeneratedspeechprocessinginabonobo
AT taglialatelajared degradedandcomputergeneratedspeechprocessinginabonobo
AT dellwovolker degradedandcomputergeneratedspeechprocessinginabonobo
AT townsendsimonw degradedandcomputergeneratedspeechprocessinginabonobo