Deterministic response strategies in a trial-and-error learning task

Trial-and-error learning is a universal strategy for establishing which actions are beneficial or harmful in new environments. However, learning stimulus-response associations solely via trial-and-error is often suboptimal, as in many settings dependencies among stimuli and responses can be exploite...

Bibliographic Details
Main authors: Mohr, Holger; Zwosta, Katharina; Markovic, Dimitrije; Bitzer, Sebastian; Wolfensteller, Uta; Ruge, Hannes
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6289466/
https://www.ncbi.nlm.nih.gov/pubmed/30496285
http://dx.doi.org/10.1371/journal.pcbi.1006621
_version_ 1783379965102784512
author Mohr, Holger
Zwosta, Katharina
Markovic, Dimitrije
Bitzer, Sebastian
Wolfensteller, Uta
Ruge, Hannes
collection PubMed
description Trial-and-error learning is a universal strategy for establishing which actions are beneficial or harmful in new environments. However, learning stimulus-response associations solely via trial-and-error is often suboptimal, as in many settings dependencies among stimuli and responses can be exploited to increase learning efficiency. Previous studies have shown that in settings featuring such dependencies, humans typically engage high-level cognitive processes and employ advanced learning strategies to improve their learning efficiency. Here we analyze in detail the initial learning phase of a sample of human subjects (N = 85) performing a trial-and-error learning task with deterministic feedback and hidden stimulus-response dependencies. Using computational modeling, we find that the standard Q-learning model cannot sufficiently explain human learning strategies in this setting. Instead, newly introduced deterministic response models, which are theoretically optimal and transform stimulus sequences unambiguously into response sequences, provide the best explanation for 50.6% of the subjects. Most of the remaining subjects either show a tendency towards generic optimal learning (21.2%) or at least partially exploit stimulus-response dependencies (22.3%), while a few subjects (5.9%) show no clear preference for any of the employed models. After the initial learning phase, asymptotic learning performance during the subsequent practice phase is best explained by the standard Q-learning model. Our results show that human learning strategies in the presented trial-and-error learning task go beyond merely associating stimuli and responses via incremental reinforcement. Specifically during initial learning, high-level cognitive processes support sophisticated learning strategies that increase learning efficiency while keeping memory demands and computational efforts bounded. The good asymptotic fit of the Q-learning model indicates that these cognitive processes are successively replaced by the formation of stimulus-response associations over the course of learning.
format Online
Article
Text
id pubmed-6289466
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6289466 2018-12-28 Deterministic response strategies in a trial-and-error learning task Mohr, Holger; Zwosta, Katharina; Markovic, Dimitrije; Bitzer, Sebastian; Wolfensteller, Uta; Ruge, Hannes PLoS Comput Biol Research Article Public Library of Science 2018-11-29 /pmc/articles/PMC6289466/ /pubmed/30496285 http://dx.doi.org/10.1371/journal.pcbi.1006621 Text en © 2018 Mohr et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Deterministic response strategies in a trial-and-error learning task
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6289466/
https://www.ncbi.nlm.nih.gov/pubmed/30496285
http://dx.doi.org/10.1371/journal.pcbi.1006621