Research Article: Minimizing endpoint variability through reinforcement learning during reaching movements involving shoulder, elbow and wrist

Date Published: July 18, 2017

Publisher: Public Library of Science

Author(s): David Marc Anton Mehler, Alexandra Reichenbach, Julius Klein, Jörn Diedrichsen, Robert J. van Beers.


Reaching movements require the coordinated action of multiple joints. The human skeleton is redundant for this task because different joint configurations can lead to the same endpoint in space. How do people learn to use combinations of joints that maximize success in goal-directed motor tasks? To answer this question, we used a 3-degree-of-freedom manipulandum to measure shoulder, elbow and wrist joint movements during reaching in a plane. We tested whether a shift in the relative contribution of the wrist and elbow joints to a reaching movement could be learned under an implicit reinforcement regime. Unknown to the participants, we decreased the task success for certain joint configurations (wrist flexion or extension, respectively) by adding random variability to the endpoint feedback. Conversely, the opposite wrist postures were rewarded in the two experimental groups (flexion and extension group). We found that the joint configuration slowly shifted towards movements that provided more control over the endpoint and hence higher task success. While the overall learning was significant, only the group that was guided to extend the wrist joint more during the movement showed substantial learning. Importantly, all changes in movement pattern occurred independently of conscious awareness of the experimental manipulation. These findings suggest that the motor system is generally sensitive to its output variability and can optimize joint-space solutions that minimize task-relevant output variability. We discuss biomechanical biases (e.g., a joint's range of motion) that could impose hurdles to the learning process.

Partial Text

Learning a new motor skill often requires the coordinated action of several joints. The biomechanics of the human body equip us with abundant degrees of freedom, meaning that many different movements in joint space achieve the same task goal. How the brain picks one of the options for executing a motor action remains an important question in motor neuroscience [1]. When performing a backhand stroke in tennis, for example, different combinations of joint movement in the trunk, shoulder, elbow and wrist yield a successful hit. However, some joint configurations allow for more control over the racket, and thereby reduce the variability of the returning ball trajectory and increase the success of achieving the desired action [2]. The many years of training required to become a motor expert are, to some degree, spent on acquiring the optimal movement solutions in joint space. What are the learning mechanisms that underlie this process?
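The redundancy described above can be illustrated with a minimal sketch of a planar three-joint arm. Everything specific in it is an assumption made for illustration and is not taken from the study: the segment lengths, the target position, the helper names `forward` and `joints_for_hand_angle`, and the choice to parameterize the redundant dimension by the absolute orientation of the hand. The sketch shows that two clearly different shoulder, elbow and wrist configurations place the endpoint on the same target.

```python
import math

# Hypothetical segment lengths in metres (upper arm, forearm, hand); not from the study.
LENGTHS = (0.30, 0.30, 0.08)


def forward(joints, lengths=LENGTHS):
    """Endpoint (x, y) of a planar 3-link arm; joints are relative angles in radians."""
    x = y = 0.0
    absolute = 0.0
    for angle, length in zip(joints, lengths):
        absolute += angle  # each joint rotates relative to the previous segment
        x += length * math.cos(absolute)
        y += length * math.sin(absolute)
    return x, y


def joints_for_hand_angle(target, hand_angle, lengths=LENGTHS):
    """Return (shoulder, elbow, wrist) that reach `target` with the hand segment
    pointing along `hand_angle` (absolute orientation, radians)."""
    l1, l2, l3 = lengths
    # Position of the wrist joint once the hand orientation is fixed.
    wx = target[0] - l3 * math.cos(hand_angle)
    wy = target[1] - l3 * math.sin(hand_angle)
    # Standard two-link inverse kinematics for shoulder and elbow.
    cos_elbow = (wx * wx + wy * wy - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    if abs(cos_elbow) > 1:
        raise ValueError("target not reachable with this hand orientation")
    elbow = math.acos(cos_elbow)
    shoulder = math.atan2(wy, wx) - math.atan2(l2 * math.sin(elbow),
                                               l1 + l2 * math.cos(elbow))
    wrist = hand_angle - shoulder - elbow
    return shoulder, elbow, wrist


TARGET = (0.0, 0.45)  # hypothetical target position in metres
config_a = joints_for_hand_angle(TARGET, math.pi / 2)
config_b = joints_for_hand_angle(TARGET, math.pi / 2 + 0.4)
```

Calling `forward(config_a)` and `forward(config_b)` returns the same endpoint up to floating-point error, although the wrist angles of the two configurations differ. The one-dimensional family of solutions obtained by varying `hand_angle` is the kind of joint-space freedom that the learning manipulation acts on.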

To our knowledge, this study is the first to successfully induce implicit reinforcement learning in joint space of the arm. We investigated planar reaching movements in a redundant task setting that involved the coordination of shoulder, elbow and wrist joints [3]. The learning goal was to change the arm configuration at the endpoint towards larger wrist flexion or extension. The implicit teaching signal was the amount of variability added to the visual feedback of the end-effector position, or in other words, the controllability of the visual cursor. In most previous reinforcement learning studies, participants were made aware of the critical task dimension at the beginning of the experiment. In these studies, the manipulation of task success alone yielded learning [27,28,35]. In contrast, a recent study by Manley et al. [23] indicated that task success alone was not a sufficient teaching signal when participants were unaware of the critical dimension. However, the authors showed that added extrinsic noise could serve as a successful teaching signal even in the absence of explicit awareness.
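The teaching signal described above, variability added to the cursor depending on wrist posture, can be sketched as follows. This is a rough illustration under stated assumptions rather than the study's implementation: the text here does not give the actual noise function, so the linear-with-ceiling mapping, the `gain` and `max_sd` values, the sign convention (positive wrist angle meaning flexion) and the function names `noise_sd` and `cursor_feedback` are all hypothetical.

```python
import random


def noise_sd(wrist_angle, rewarded_direction, gain=0.01, max_sd=0.02):
    """Standard deviation (metres) of the noise added to the cursor.

    Assumed convention: wrist_angle > 0 is flexion, < 0 is extension (radians).
    rewarded_direction is +1 for a flexion group and -1 for an extension group.
    Postures in the rewarded direction get no added noise; postures in the
    opposite direction get noise that grows linearly up to max_sd.
    """
    penalized_amount = max(0.0, -rewarded_direction * wrist_angle)
    return min(max_sd, gain * penalized_amount)


def cursor_feedback(endpoint, wrist_angle, rewarded_direction, rng):
    """Displayed cursor position: true endpoint plus posture-dependent Gaussian noise."""
    sd = noise_sd(wrist_angle, rewarded_direction)
    return (endpoint[0] + rng.gauss(0.0, sd),
            endpoint[1] + rng.gauss(0.0, sd))


rng = random.Random(0)  # seeded only to make the sketch reproducible
# Extension group (rewarded_direction = -1): an extended wrist is shown veridically,
# while a flexed wrist produces a jittered cursor around the same true endpoint.
shown_extended = cursor_feedback((0.0, 0.45), -0.3, -1, rng)
shown_flexed = cursor_feedback((0.0, 0.45), 0.3, -1, rng)
```

In this sketch the participant never receives an explicit cue about the wrist; the only consequence of a penalized posture is a less controllable cursor, which is the sense in which the signal is implicit.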



