A computational model of cortical-striatal mediation of speed-accuracy tradeoff and habit formation emerging from anatomical gradients in dopamine physiology and reinforcement learning
Decision making – committing to a single action from a plethora of viable alternatives – is a necessity for all motile creatures, each moving a single body to many possible destinations. Some decisions are better than others. For example, a rat deciding between one path that leads to a piece of cheese and another that leads to the jaws of a cat has a clear reason to prefer one choice over the other. Two criteria for optimizing decision outcomes are to decide as accurately as possible – choosing the course of action most likely to result in the preferred outcome – and to decide as fast as possible. Because these criteria often conflict, decision making has an inherent “speed-accuracy tradeoff”. Presented here is a computational neural model of decision making that incorporates neurobiological design principles for optimizing this tradeoff via reward-guided transfers of control between two sensory processing systems with different speed/accuracy characteristics. The model incorporates anatomical and physiological evidence that dopamine, the key neurotransmitter in reinforcement learning, has varying effects in different sub-regions of the basal ganglia, a subcortical structure that interfaces with the neocortex to control behavior. Based on the observed differences between these sub-regions, the model proposes that gradual adaptations of synaptic links by reinforcement learning signals lead to rapid changes in the speed and accuracy of decision making, by assigning control of behavior to alternative cortical representations. Chapter one draws conceptual links from experimental data to the design of the proposed model. Chapter two applies the model to speed-accuracy tradeoffs and habit formation by simulating forced-choice paradigms. Several robust behavioral phenomena are replicated.
By isolating reinforcement learning factors that control the speed and depth of habit formation, the model can help explain why all substances that strongly and synergistically affect such factors share a high potential for habit formation or habit abatement. To illustrate such potential applications of the current model, chapter three investigates the effects of varying model parameters in accordance with the known neurochemical effects of some major habit-forming substances, such as cocaine and ethanol.