The Reward Positivity Tracks Positive Reward Prediction Errors From Feedback to Cues During Reinforcement Learning

Yifan Gao, Robert Wilson, Galit Karpov, Travis E. Baker

Published online on May 12, 2026

Abstract

Psychophysiology, Volume 63, Issue 5, May 2026.

How does the brain learn to predict rewards? According to temporal difference (TD) learning theory, reward prediction errors (RPEs) should shift from the time of outcome delivery to earlier predictive cues as stimulus‐action‐outcome associations are learned. The reward positivity, an electrophysiological signal believed to index sensitivity of the anterior midcingulate cortex to positive RPEs, should progressively transfer from feedback to predictive cues during learning. However, this core prediction of the reward positivity has remained largely untested. We recorded the EEG from 73 healthy adults performing a probabilistic selection task (PST) with extended training trials. The reward positivity amplitude was measured at both feedback and cue presentation during early and late training phases (first and second halves, respectively). To examine individual differences in RPE‐related learning processes, we split participants into rapid and slow learner groups and fit Q‐learning models to estimate separate learning rates for positive and negative feedback. Results showed clear evidence of temporal backpropagation: early in learning, the reward positivity appeared when feedback was delivered, but late in learning it disappeared from feedback and instead emerged when predictive cues appeared. Rapid learners showed more pronounced shifts in reward positivity from feedback to cue, consistent with their higher learning rates. These findings provide the first clear evidence that reward positivity demonstrates the temporal backpropagation predicted by TD learning theory. The results validate reward positivity as a neural marker of positive RPEs and highlight the importance of examining both cue‐related and feedback‐related brain responses to fully characterize reinforcement learning processes.
Our findings have important implications for understanding individual differences in reinforcement learning and for interpreting the reward positivity in clinical populations and across the lifespan.
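
The feedback-to-cue shift that TD learning predicts can be illustrated with a toy simulation. The sketch below is not the authors' Q-learning model or their task: it assumes a single cue that is always rewarded, a pre-cue baseline value fixed at zero, and hypothetical learning rates (`alpha=0.05` for a "slow" learner, `alpha=0.30` for a "rapid" one). The function names `simulate_td` and `half_means` are illustrative only.

```python
def simulate_td(alpha, n_trials=60, reward=1.0, gamma=1.0):
    """Simulate TD learning for one cue that is always followed by reward.

    Returns two lists with the per-trial RPE at cue onset and at feedback.
    Assumption: the value of the pre-cue state stays at 0 because cue
    onset cannot be predicted.
    """
    v_cue = 0.0  # learned value of the cue
    cue_rpes, feedback_rpes = [], []
    for _ in range(n_trials):
        # RPE at cue onset: discounted cue value minus the baseline (0).
        cue_rpe = gamma * v_cue - 0.0
        # RPE at feedback: obtained reward minus what the cue predicted.
        feedback_rpe = reward - v_cue
        # TD update of the cue value from the feedback RPE.
        v_cue += alpha * feedback_rpe
        cue_rpes.append(cue_rpe)
        feedback_rpes.append(feedback_rpe)
    return cue_rpes, feedback_rpes


def half_means(values):
    """Mean of the first and second halves (early vs. late training)."""
    half = len(values) // 2
    early = sum(values[:half]) / half
    late = sum(values[half:]) / (len(values) - half)
    return early, late


# Hypothetical learning rates, chosen only to contrast two learner types.
slow_cue, slow_fb = simulate_td(alpha=0.05)
fast_cue, fast_fb = simulate_td(alpha=0.30)

if __name__ == "__main__":
    for label, cue, fb in [("slow", slow_cue, slow_fb),
                           ("rapid", fast_cue, fast_fb)]:
        cue_early, cue_late = half_means(cue)
        fb_early, fb_late = half_means(fb)
        print(f"{label}: cue RPE {cue_early:.2f} -> {cue_late:.2f}, "
              f"feedback RPE {fb_early:.2f} -> {fb_late:.2f}")
```

In this toy setting the feedback RPE shrinks and the cue RPE grows across training, and the transfer is faster with the higher learning rate, mirroring the qualitative pattern described in the abstract. A probabilistic task with separate learning rates for positive and negative feedback, as in the study, would need a fuller model.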