Functional role of opponent, dopamine modulated D1/D2 plasticity in prediction error-driven reinforcement learning in the basal ganglia

Jitsev, Jenia; Morrison, Abigail; Abraham, Nobi; Tittgemeyer, Marc
doi:10.12751/nncn.bc2013.0164
000141530 001__ 141530
000141530 005__ 20240313094906.0
000141530 0247_ $$2doi$$a10.12751/nncn.bc2013.0164
000141530 037__ $$aFZJ-2013-06698
000141530 041__ $$aeng
000141530 1001_ $$0P:(DE-Juel1)158080$$aJitsev, Jenia$$b0$$eCorresponding author$$ufzj
000141530 1112_ $$aBernstein Conference on Computational Neuroscience$$cTuebingen$$d2013-09-24 - 2013-09-27$$wGermany
000141530 245__ $$aFunctional role of opponent, dopamine modulated D1/D2 plasticity in prediction error-driven reinforcement learning in the basal ganglia
000141530 260__ $$bG-Node$$c2013
000141530 300__ $$a162 - 163
000141530 3367_ $$2ORCID$$aCONFERENCE_PAPER
000141530 3367_ $$033$$2EndNote$$aConference Paper
000141530 3367_ $$2BibTeX$$aINPROCEEDINGS
000141530 3367_ $$2DRIVER$$aconferenceObject
000141530 3367_ $$2DataCite$$aOutput Types/Conference Paper
000141530 3367_ $$0PUB:(DE-HGF)8$$2PUB:(DE-HGF)$$aContribution to a conference proceedings$$bcontrib$$mcontrib$$s1568970454_27324
000141530 520__ $$aHere, we introduce a spiking actor-critic network model of learning from both reward and punishment in the basal ganglia. Both the dorsal (actor) and ventral (critic) striatum are assumed to contain populations of D1 and D2 medium spiny neurons (MSNs). In the ventral striatum, this allows separate representation of positive and negative expected outcomes by the respective D1/D2 MSN populations, which we hypothesize to reside in the shell of the nucleus accumbens. The positive and negative outcome expectations are fed to dopamine (DA) neurons in the ventral tegmental area (VTA), which compute the total prediction error and signal it by DA release. Based on recent experimental work [1], the DA level is assumed to modulate the plasticity of D1 and D2 synapses in opposing ways, inducing LTP at D1 and LTD at D2 synapses when it is high, and vice versa when it is low. Crucially, this form of opponent plasticity implements a temporal-difference (TD)-like update of both positive and negative outcome expectations and adapts action preferences appropriately. We implemented the network in the NEST simulator [2] using leaky integrate-and-fire spiking neurons and designed a battery of experiments in various grid-world tasks. Across the tasks, the network learns both to approach delayed rewards and to consistently avoid punishments, which posed severe difficulties for the previous model without D1/D2 segregation [3]. The model thus highlights the functional role of D1/D2 MSN segregation within the striatum in implementing appropriate TD-like learning from both reward and punishment, and explains the necessity of the opposing direction of DA-dependent plasticity found at synapses converging on distinct striatal MSN types.
The approach can be further extended to study how abnormal D1/D2 plasticity may lead to a reorganization of the basal ganglia network towards pathological, dysfunctional states, such as those observed in Parkinson's disease under progressive dopamine depletion.
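The opponent update rule described in the abstract can be illustrated with a minimal tabular sketch. This is not the authors' spiking NEST model: it is a rate-free abstraction under several assumptions that do not come from the record, namely a five-state corridor (punishment at the left end, reward at the right end), the learning rate and discount factor, exhaustive sweeps over state-action pairs instead of a sampled agent, and all names (`opponent_update`, `run`, `v_d1`, `a_d1`, and so on). It only shows how a single signed prediction error can drive D1 and D2 weights in opposite directions.

```python
LEFT, RIGHT = 0, 1
N_STATES = 5  # hypothetical corridor: state 0 is punished, state 4 is rewarded


def opponent_update(d1, d2, key, delta, lr):
    """Dopamine-gated opponent plasticity on one D1/D2 weight pair.

    delta > 0 (high DA): potentiate D1 (LTP), depress D2 (LTD).
    delta < 0 (low DA):  depress D1, potentiate D2.
    Weights are clipped at zero so each remains a non-negative magnitude.
    """
    d1[key] = max(0.0, d1[key] + lr * delta)
    d2[key] = max(0.0, d2[key] - lr * delta)


def run(n_sweeps=500, lr=0.1, gamma=0.9):
    inner = range(1, N_STATES - 1)  # non-terminal states 1..3
    # Critic: D1 holds the expected positive outcome, D2 the expected negative one.
    v_d1 = {s: 0.0 for s in inner}
    v_d2 = {s: 0.0 for s in inner}
    # Actor: one D1 and one D2 weight per state-action pair.
    a_d1 = {(s, a): 0.0 for s in inner for a in (LEFT, RIGHT)}
    a_d2 = dict(a_d1)

    def value(s):
        # Net expectation; terminal states carry no further expectation.
        return v_d1[s] - v_d2[s] if s in v_d1 else 0.0

    for _ in range(n_sweeps):
        for s in inner:
            for a in (LEFT, RIGHT):
                s_next = s - 1 if a == LEFT else s + 1
                r = {0: -1.0, N_STATES - 1: 1.0}.get(s_next, 0.0)
                # Total prediction error, standing in for the DA signal.
                delta = r + gamma * value(s_next) - value(s)
                opponent_update(v_d1, v_d2, s, delta, lr)       # critic
                opponent_update(a_d1, a_d2, (s, a), delta, lr)  # actor

    values = {s: value(s) for s in inner}
    prefs = {k: a_d1[k] - a_d2[k] for k in a_d1}
    return values, prefs


if __name__ == "__main__":
    values, prefs = run()
    print("net values:", values)
    print("action preferences (D1 minus D2):", prefs)
```

In this sketch the net value is positive next to the rewarded end and negative next to the punished end, and the D1-minus-D2 preference favours moving right in every inner state. Preferences are left unbounded for brevity; a real model would need some form of normalisation or saturation.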
000141530 536__ $$0G:(DE-HGF)POF2-311$$a311 - Signaling pathways, cell and tumor biology (POF2-311)$$cPOF2-311$$fPOF II$$x0
000141530 536__ $$0G:(DE-Juel1)HGF-SystemsBiology$$aHASB - Helmholtz Alliance on Systems Biology (HGF-SystemsBiology)$$cHGF-SystemsBiology$$fHASB-2008-2012$$x1
000141530 536__ $$0G:(DE-Juel1)HGF-SMHB-2013-2017$$aSMHB - Supercomputing and Modelling for the Human Brain (HGF-SMHB-2013-2017)$$cHGF-SMHB-2013-2017$$fSMHB$$x2
000141530 536__ $$0G:(DE-HGF)B1175.01.12$$aW2Morrison - W2/W3 Professorinnen Programm der Helmholtzgemeinschaft (B1175.01.12)$$cB1175.01.12$$x3
000141530 588__ $$aDataset connected to DataCite
000141530 7001_ $$0P:(DE-HGF)0$$aAbraham, Nobi$$b1
000141530 7001_ $$0P:(DE-HGF)0$$aTittgemeyer, Marc$$b2
000141530 7001_ $$0P:(DE-Juel1)151166$$aMorrison, Abigail$$b3$$ufzj
000141530 773__ $$a10.12751/nncn.bc2013.0164
000141530 8564_ $$uhttps://portal.g-node.org/abstracts/bc13/#/doi/nncn.bc2013.0164
000141530 909CO $$ooai:juser.fz-juelich.de:141530$$pVDB
000141530 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)158080$$aForschungszentrum Jülich GmbH$$b0$$kFZJ
000141530 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)151166$$aForschungszentrum Jülich GmbH$$b3$$kFZJ
000141530 9131_ $$0G:(DE-HGF)POF2-311$$1G:(DE-HGF)POF2-310$$2G:(DE-HGF)POF2-300$$3G:(DE-HGF)POF2$$4G:(DE-HGF)POF$$aDE-HGF$$bGesundheit$$lKrebsforschung$$vSignaling pathways, cell and tumor biology$$x0
000141530 9141_ $$y2013
000141530 9201_ $$0I:(DE-Juel1)INM-6-20090406$$kINM-6$$lComputational and Systems Neuroscience$$x0
000141530 9201_ $$0I:(DE-Juel1)IAS-6-20130828$$kIAS-6$$lTheoretical Neuroscience$$x1
000141530 980__ $$acontrib
000141530 980__ $$aVDB
000141530 980__ $$aI:(DE-Juel1)INM-6-20090406
000141530 980__ $$aI:(DE-Juel1)IAS-6-20130828
000141530 980__ $$aUNRESTRICTED
000141530 981__ $$aI:(DE-Juel1)IAS-6-20130828