
Computing reward‐prediction error: an integrated account of cortical timing and basal‐ganglia pathways for appetitive and aversive learning

There are two prevailing notions regarding the involvement of the corticobasal ganglia system in value‐based learning: (i) the direct and indirect pathways of the basal ganglia are crucial for appetitive and aversive learning, respectively, and (ii) the activity of midbrain dopamine neurons represen...


Bibliographic Details
Main authors:	Morita, Kenji; Kawaguchi, Yasuo
Format:	Online Article Text
Language:	English
Published:	John Wiley and Sons Inc. 2015
Subjects:	Computational Neuroscience
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5034842/ https://www.ncbi.nlm.nih.gov/pubmed/26095906 http://dx.doi.org/10.1111/ejn.12994

author	Morita, Kenji; Kawaguchi, Yasuo
collection	PubMed
description	There are two prevailing notions regarding the involvement of the corticobasal ganglia system in value‐based learning: (i) the direct and indirect pathways of the basal ganglia are crucial for appetitive and aversive learning, respectively, and (ii) the activity of midbrain dopamine neurons represents reward‐prediction error. Although (ii) constitutes a critical assumption of (i), it remains elusive how (ii) holds given (i), with the basal‐ganglia influence on the dopamine neurons. Here we present a computational neural‐circuit model that potentially resolves this issue. Based on the latest analyses of the heterogeneous corticostriatal neurons and connections, our model posits that the direct and indirect pathways, respectively, represent the values of upcoming and previous actions, and up‐regulate and down‐regulate the dopamine neurons via the basal‐ganglia output nuclei. This explains how the difference between the upcoming and previous values, which constitutes the core of reward‐prediction error, is calculated. Simultaneously, it predicts that blockade of the direct/indirect pathway causes a negative/positive shift of reward‐prediction error and thereby impairs learning from positive/negative error, i.e. appetitive/aversive learning. Through simulation of reward‐reversal learning and punishment‐avoidance learning, we show that our model could indeed account for the experimentally observed features that are suggested to support notion (i) and could also provide predictions on neural activity. We also present a behavioral prediction of our model, through simulation of inter‐temporal choice, on how the balance between the two pathways relates to the subject's time preference. These results indicate that our model, incorporating the heterogeneity of the cortical influence on the basal ganglia, is expected to provide a closed‐circuit mechanistic understanding of appetitive/aversive learning.
format	Online Article Text
id	pubmed-5034842
institution	National Center for Biotechnology Information
language	English
publishDate	2015
publisher	John Wiley and Sons Inc.
record_format	MEDLINE/PubMed
spelling	pubmed-5034842 2016-10-03 Computing reward‐prediction error: an integrated account of cortical timing and basal‐ganglia pathways for appetitive and aversive learning. Morita, Kenji; Kawaguchi, Yasuo. Eur J Neurosci. Computational Neuroscience. John Wiley and Sons Inc. 2015-07-25 2015-08 /pmc/articles/PMC5034842/ /pubmed/26095906 http://dx.doi.org/10.1111/ejn.12994 Text en © 2015 The Authors. European Journal of Neuroscience published by Federation of European Neuroscience Societies and John Wiley & Sons Ltd. This is an open access article under the terms of the Creative Commons Attribution‐NonCommercial‐NoDerivs (http://creativecommons.org/licenses/by-nc-nd/4.0/) License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.
title	Computing reward‐prediction error: an integrated account of cortical timing and basal‐ganglia pathways for appetitive and aversive learning
topic	Computational Neuroscience
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5034842/ https://www.ncbi.nlm.nih.gov/pubmed/26095906 http://dx.doi.org/10.1111/ejn.12994
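
The description above says that the direct and indirect pathways carry the values of the upcoming and previous actions, and that blocking either pathway shifts the reward‐prediction error in opposite directions. The following is a minimal sketch of that arithmetic only, not the authors' neural‐circuit model. It assumes a standard temporal‐difference form for the error; the function name, the discount factor `gamma`, and the two gain parameters are hypothetical and do not come from the record.

```python
def reward_prediction_error(reward, upcoming_value, previous_value,
                            gamma=1.0, direct_gain=1.0, indirect_gain=1.0):
    """Temporal-difference style reward-prediction error (illustrative only).

    Assumption: the direct pathway contributes the value of the upcoming
    action (up-regulating dopamine) and the indirect pathway contributes
    the value of the previous action (down-regulating dopamine). A gain
    of 0.0 stands in for a complete blockade of that pathway.
    """
    direct_term = direct_gain * gamma * upcoming_value
    indirect_term = indirect_gain * previous_value
    return reward + direct_term - indirect_term


# Hypothetical values: no reward, equal upcoming and previous values.
intact = reward_prediction_error(0.0, 0.5, 0.5)
direct_blocked = reward_prediction_error(0.0, 0.5, 0.5, direct_gain=0.0)
indirect_blocked = reward_prediction_error(0.0, 0.5, 0.5, indirect_gain=0.0)

print(intact, direct_blocked, indirect_blocked)
```

With these illustrative numbers the intact error is zero, removing the direct term makes it negative, and removing the indirect term makes it positive, which matches the direction of the shifts stated in the description.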
