Altered Statistical Learning and Decision-Making in Methamphetamine Dependence: Evidence from a Two-Armed Bandit Task

Understanding how humans weigh long-term and short-term goals is important for both basic cognitive science and clinical neuroscience, as substance users need to balance the appeal of an immediate high vs. the long-term goal of sobriety. We use a computational model to identify learning and decision...

Bibliographic Details
Main Authors: Harlé, Katia M., Zhang, Shunan, Schiff, Max, Mackey, Scott, Paulus, Martin P., Yu, Angela J.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2015
Subjects: Psychology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4683191/
https://www.ncbi.nlm.nih.gov/pubmed/26733906
http://dx.doi.org/10.3389/fpsyg.2015.01910
author Harlé, Katia M.
Zhang, Shunan
Schiff, Max
Mackey, Scott
Paulus, Martin P.
Yu, Angela J.
collection PubMed
description Understanding how humans weigh long-term and short-term goals is important for both basic cognitive science and clinical neuroscience, as substance users need to balance the appeal of an immediate high vs. the long-term goal of sobriety. We use a computational model to identify learning and decision-making abnormalities in methamphetamine-dependent individuals (MDI, n = 16) vs. healthy control subjects (HCS, n = 16), in a two-armed bandit task. In this task, subjects repeatedly choose between two arms with fixed but unknown reward rates. Each choice not only yields potential immediate reward but also information useful for long-term reward accumulation, thus pitting exploration against exploitation. We formalize the task as comprising a learning component, the updating of estimated reward rates based on ongoing observations, and a decision-making component, the choice among options based on current beliefs and uncertainties about reward rates. We model the learning component as iterative Bayesian inference (the Dynamic Belief Model), and the decision component using five competing decision policies: Win-stay/Lose-shift (WSLS), ε-Greedy, τ-Switch, Softmax, Knowledge Gradient. HCS and MDI significantly differ in how they learn about reward rates and use them to make decisions. HCS learn from past observations but weigh recent data more, and their decision policy is best fit as Softmax. MDI are more likely to follow the simple learning-independent policy of WSLS, and among MDI best fit by Softmax, they have more pessimistic prior beliefs about reward rates and are less likely to choose the option estimated to be most rewarding. Neurally, MDI's tendency to avoid the most rewarding option is associated with a lower gray matter volume of the thalamic dorsal lateral nucleus. More broadly, our work illustrates the ability of our computational framework to help reveal subtle learning and decision-making abnormalities in substance use.
format Online
Article
Text
id pubmed-4683191
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-4683191 2016-01-05 Front Psychol Psychology Frontiers Media S.A. 2015-12-18 /pmc/articles/PMC4683191/ /pubmed/26733906 http://dx.doi.org/10.3389/fpsyg.2015.01910 Text en Copyright © 2015 Harlé, Zhang, Schiff, Mackey, Paulus and Yu. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Altered Statistical Learning and Decision-Making in Methamphetamine Dependence: Evidence from a Two-Armed Bandit Task
topic Psychology
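
The learning and decision components named in the description can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes the Dynamic Belief Model takes the form in which, on each trial, an arm's reward rate stays unchanged with probability `gamma` and is otherwise redrawn from a fixed prior, with beliefs held on a discrete grid. The names `GRID`, `dbm_update`, `softmax_choice`, `wsls_choice`, and `run`, the uniform prior, and all parameter values (`gamma`, `beta`, `rates`, `trials`) are hypothetical choices made for the example. Only the Softmax and WSLS policies are shown.

```python
import math
import random

# Candidate reward rates on a discrete grid in (0, 1) (an assumed discretization).
GRID = [(i + 0.5) / 50 for i in range(50)]


def uniform_prior():
    """Flat prior over the grid (an assumption; any normalized prior works)."""
    return [1.0 / len(GRID)] * len(GRID)


def dbm_update(belief, reward, prior, gamma):
    """One trial of belief updating for a single arm.

    Assumed form: mix the previous belief with the prior (weight gamma on the
    previous belief), then apply Bayes' rule if the arm was observed.
    Pass reward=None for an arm that was not chosen on this trial.
    """
    mixed = [gamma * b + (1.0 - gamma) * p for b, p in zip(belief, prior)]
    if reward is None:
        return mixed
    likelihood = [theta if reward else 1.0 - theta for theta in GRID]
    post = [m * l for m, l in zip(mixed, likelihood)]
    total = sum(post)
    return [p / total for p in post]


def mean(belief):
    """Expected reward rate under a belief."""
    return sum(theta * b for theta, b in zip(GRID, belief))


def softmax_choice(means, beta, rng):
    """Choose an arm with probability proportional to exp(beta * mean)."""
    top = max(means)  # subtract the maximum for numerical stability
    weights = [math.exp(beta * (m - top)) for m in means]
    return rng.choices(range(len(means)), weights=weights)[0]


def wsls_choice(prev_arm, prev_reward, rng):
    """Win-stay/Lose-shift for two arms: repeat after a win, switch after a loss."""
    if prev_arm is None:
        return rng.randrange(2)
    return prev_arm if prev_reward else 1 - prev_arm


def run(policy, rates=(0.3, 0.7), trials=200, gamma=0.9, beta=5.0, seed=0):
    """Simulate one two-armed bandit session and return the total reward."""
    rng = random.Random(seed)
    prior = uniform_prior()
    beliefs = [prior[:], prior[:]]
    prev_arm, prev_reward, total = None, None, 0
    for _ in range(trials):
        if policy == "wsls":
            arm = wsls_choice(prev_arm, prev_reward, rng)
        else:
            arm = softmax_choice([mean(b) for b in beliefs], beta, rng)
        reward = 1 if rng.random() < rates[arm] else 0
        # The chosen arm gets the observation; the other arm only drifts
        # back toward the prior.
        beliefs = [
            dbm_update(b, reward if i == arm else None, prior, gamma)
            for i, b in enumerate(beliefs)
        ]
        total += reward
        prev_arm, prev_reward = arm, reward
    return total
```

Calling `run("softmax")` and `run("wsls")` with the same seed gives one simulated session per policy. In this sketch, setting `gamma` below 1 is what makes recent observations count more than older ones, and WSLS ignores the learned beliefs entirely, which matches its description as learning-independent.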