Correlates of Reward-Predictive Value in Learning-Related Hippocampal Neural Activity
MetadataShow full item record
Citation (published version)Okatan, Murat. "Correlates of Reward-Predictive Value in Learning-Related Hippocampal Neural Activity" Hippocampus 19(5): 487-506. (2009)
Temporal difference learning (TD) is a popular algorithm in machine learning. Two learning signals that are derived from this algorithm, the predictive value and the prediction error, have been shown to explain changes in neural activity and behavior during learning across species. Here, the predictive value signal is used to explain the time course of learning-related changes in the activity of hippocampal neurons in monkeys performing an associative learning task. The TD algorithm serves as the centerpiece of a joint probability model for the learning-related neural activity and the behavioral responses recorded during the task. The neural component of the model consists of spiking neurons that compete and learn the reward-predictive value of task-relevant input signals. The predictive-value signaled by these neurons influences the behavioral response generated by a stochastic decision stage, which constitutes the behavioral component of the model. It is shown that the time course of the changes in neural activity and behavioral performance generated by the model exhibits key features of the experimental data. The results suggest that information about correct associations may be expressed in the hippocampus before it is detected in the behavior of a subject. In this way, the hippocampus may be among the earliest brain areas to express learning and drive the behavioral changes associated with learning. Correlates of reward-predictive value may be expressed in the hippocampus through rate remapping within spatial memory representations, they may represent reward-related aspects of a declarative or explicit relational memory representation of task contingencies, or they may correspond to reward-related components of episodic memory representations. These potential functions are discussed in connection with hippocampal cell assembly sequences and their reverse reactivation during the awake state. The results provide further support for the proposal that neural processes underlying learning may be implementing a temporal difference-like algorithm.
RightsCopyright 2009 Wiley-Liss, Inc., A Wiley Company Re-use of this article is permitted in accordance with the Creative Commons Deed, Attribution 2.5, which does not permit commercial exploitation.