
Causal inference of asynchronous audiovisual speech

During speech perception, humans integrate auditory information from the voice with visual information from the face. This multisensory integration increases perceptual precision, but only if the two cues come from the same talker; this requirement has been largely ignored by current models of speech perception. We describe a generative model of multisensory speech perception that includes this critical step of determining the likelihood that the voice and face information have a common cause. A key feature of the model is that it is based on a principled analysis of how an observer should solve this causal inference problem using the asynchrony between two cues and the reliability of the cues. This allows the model to make predictions about the behavior of subjects performing a synchrony judgment task, predictive power that does not exist in other approaches, such as post-hoc fitting of Gaussian curves to behavioral data. We tested the model predictions against the performance of 37 subjects performing a synchrony judgment task viewing audiovisual speech under a variety of manipulations, including varying asynchronies, intelligibility, and visual cue reliability. The causal inference model outperformed the Gaussian model across two experiments, providing a better fit to the behavioral data with fewer parameters. Because the causal inference model is derived from a principled understanding of the task, model parameters are directly interpretable in terms of stimulus and subject properties.
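
The abstract describes the computation only in words. As a rough illustration, here is a minimal Python sketch of the kind of Bayesian causal-inference calculation it alludes to (in the spirit of standard causal-inference models of cue combination). The Gaussian sensory-noise likelihood, the uniform distribution for independent causes, and all parameter values are assumptions for illustration, not the authors' actual model or fitted parameters.

```python
# Illustrative sketch of a Bayesian causal-inference synchrony judgment.
# All names and values here are hypothetical, not the published model.
import math

def p_common_cause(measured_asynchrony_ms,
                   sensory_noise_ms=80.0,      # cue reliability (illustrative)
                   prior_common=0.5,           # prior p(C = 1) (illustrative)
                   asynchrony_range_ms=1000.0):
    """Posterior probability that voice and face share a common cause,
    given the measured audiovisual asynchrony."""
    # Likelihood under a common cause: the measured asynchrony is
    # Gaussian noise around zero, with width set by cue reliability.
    like_common = (math.exp(-0.5 * (measured_asynchrony_ms / sensory_noise_ms) ** 2)
                   / (sensory_noise_ms * math.sqrt(2.0 * math.pi)))
    # Likelihood under independent causes: the asynchrony is assumed
    # uniform over a wide plausible range.
    like_separate = 1.0 / asynchrony_range_ms
    # Bayes' rule combines the two likelihoods with the prior.
    return (like_common * prior_common) / (
        like_common * prior_common + like_separate * (1.0 - prior_common))

# An observer reports "synchronous" when the posterior favors a common cause,
# so larger asynchronies yield fewer "synchronous" reports.
for dt in (0.0, 100.0, 250.0, 500.0):
    print(f"asynchrony {dt:5.0f} ms -> p(common cause) = {p_common_cause(dt):.3f}")
```

Because the posterior falls off with asynchrony and scales with cue reliability, a model of this form makes testable predictions about synchrony judgments, rather than merely describing them after the fact as a fitted Gaussian curve does.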

Bibliographic Details
Main Authors: Magnotti, John F., Ma, Wei Ji, Beauchamp, Michael S.
Format: Online Article (Text)
Language: English
Published: Frontiers Media S.A., 2013
Subjects: Psychology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3826594/
https://www.ncbi.nlm.nih.gov/pubmed/24294207
http://dx.doi.org/10.3389/fpsyg.2013.00798
Collection: PubMed
Record ID: pubmed-3826594
Institution: National Center for Biotechnology Information
Record Format: MEDLINE/PubMed
Journal: Front Psychol (Psychology); published online 2013-11-13

Copyright © 2013 Magnotti, Ma and Beauchamp. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.