Research Article: Stress enhances model-free reinforcement learning only after negative outcome

Date Published: July 19, 2017

Publisher: Public Library of Science

Author(s): Heyeon Park, Daeyeol Lee, Jeanyung Chey, Thomas Boraud.

http://doi.org/10.1371/journal.pone.0180588

Abstract

Previous studies found that stress shifts behavioral control by promoting habits while decreasing goal-directed behaviors during reward-based decision-making. It is, however, unclear how stress disrupts the relative contribution of the two systems controlling reward-seeking behavior, i.e., model-free (or habit) and model-based (or goal-directed). Here, we investigated whether stress biases the contribution of model-free and model-based reinforcement learning processes differently depending on the valence of the outcome, and whether stress alters the learning rate, i.e., how quickly information from the new environment is incorporated into choices. Participants were randomly assigned to either a stress or a control condition, and performed a two-stage Markov decision-making task in which the reward probabilities underwent periodic reversals without notice. We found that stress increased the contribution of model-free reinforcement learning only after negative outcomes. Furthermore, stress decreased the learning rate. The results suggest that stress diminishes one’s ability to make adaptive choices in multiple aspects of reinforcement learning. This finding has implications for understanding how stress facilitates maladaptive habits, such as addictive behavior, and other dysfunctional behaviors associated with stress in clinical and educational contexts.

Partial Text

Reward-seeking behaviors can be described by two different computational principles that might be supported by distinct neuroanatomical substrates [1–4]. On the one hand, a goal-directed controller selects behaviors expected to produce the best outcomes according to knowledge of the decision-maker’s environment and motivational state. The process by which this knowledge is updated, and by which the outcomes expected from alternative actions are derived from it, is referred to as model-based reinforcement learning (RL). On the other hand, a habit controller relies on expected outcome values that are adjusted incrementally by trial and error, and results in automatic and less computationally demanding action selection. Accordingly, the goal-directed and habit systems might favor different actions when the motivational status of the actor or the properties of the environment change rapidly. However, precisely how the balance between these two controllers is adjusted across different behavioral settings remains poorly understood.
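The contrast between the two controllers can be illustrated with a minimal Python sketch. It is not the model fitted in the study: the function names, the 0.7/0.3 transition probabilities, the second-stage values, and the learning rate are all hypothetical values chosen for illustration. The model-free value is a cached number nudged toward each observed outcome, whereas the model-based value is recomputed from knowledge of how first-stage actions lead to second-stage states.

```python
def model_free_update(q, reward, alpha):
    """Habit-like learning: move a cached action value a fraction
    alpha of the way toward the outcome that was just observed."""
    return q + alpha * (reward - q)


def model_based_value(transition_probs, stage2_values):
    """Goal-directed evaluation: expected value of a first-stage action,
    computed from a model of which second-stage states it leads to."""
    return sum(p * v for p, v in zip(transition_probs, stage2_values))


# Hypothetical numbers, for illustration only.
q_mf = model_free_update(q=0.5, reward=0.0, alpha=0.2)        # after an unrewarded trial
q_mb = model_based_value(transition_probs=[0.7, 0.3],          # assumed transition model
                         stage2_values=[0.8, 0.2])             # assumed state values
print(q_mf, q_mb)
```

Because the two quantities are computed differently, they can disagree about which action is best after the environment changes, which is the situation in which the balance between the controllers matters.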

This study was approved by the Seoul National University Institutional Review Board (SNUIRB), and all participants provided written informed consent.

In this study, we found that stress impaired reward-seeking behavior and demonstrated that the inferior performance under stress might be due to at least two different mechanisms. First, stress increased the influence of model-free reinforcement learning, particularly the likelihood of switching to an alternative choice when the previous choice led to an undesirable outcome. Second, stress decreased the learning rate, namely, the degree to which new information is incorporated into trial-by-trial decision making. These findings suggest that maladaptive choice behavior under stress might be attributable to both a slower learning rate and the strengthening of model-free RL after a negative outcome.
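To make the role of the learning rate concrete, the following Python sketch counts how many unrewarded trials a simple incremental learner needs before its value estimate for a previously rewarded option falls to a threshold after a reversal. This is an illustrative assumption, not the analysis used in the study: the function name `trials_to_adapt`, the starting value, the threshold, and the learning rates are all hypothetical.

```python
def trials_to_adapt(alpha, start_value=1.0, threshold=0.5):
    """Count consecutive unrewarded trials until a value estimate,
    updated with learning rate alpha, is no longer above threshold."""
    q = start_value
    n = 0
    while q > threshold:
        q = q + alpha * (0.0 - q)  # incremental update toward a reward of 0
        n += 1
    return n


# Hypothetical learning rates: a faster and a slower learner.
fast = trials_to_adapt(alpha=0.5)
slow = trials_to_adapt(alpha=0.1)
print(fast, slow)
```

In this toy setting, the learner with the lower learning rate takes more trials to register that the option has stopped paying off, which is one way a reduced learning rate could slow adaptation to unannounced reversals.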

This study characterized the effect of stress on adaptive decision making by placing participants in a changing environment and modeling their choice behavior within a computational framework of reinforcement learning. We found that stress strengthened the tendency of the habitual, model-free RL process to shift away from unrewarded actions, and that it also hindered participants from incorporating new information into their subsequent choices. These findings provide insight into the mechanism by which stress diminishes the ability to behave flexibly in reward-based decision making, and have significant implications for understanding and treating stress-related maladaptive conditions characterized by enhanced habitual behavior, such as addiction and impulse control disorders.

 


 
