An online reinforcement learning method to improve control adaptability in robot-aided rehabilitation
Tamantini C.;Lauretti C.;Zollo L.
2025-01-01
Abstract
Rehabilitation robotics enables consistent and personalized therapy but still relies on complex, expert-driven tuning of control parameters. To address this, a reinforcement learning strategy based on Q-learning is proposed to autonomously adapt key parameters during upper-limb rehabilitation, without requiring prior task-specific knowledge. A systematic evaluation is conducted across combinations of control parameters (radial stiffness and execution time), performance-based reward functions (pointing accuracy and movement smoothness), and exploration strategies (ε-greedy and Upper Confidence Bound). The Q-learning agent selects discrete actions (increase, decrease, or maintain the current value) for each control parameter, enabling real-time adaptation based on observed performance. The method is validated using a KUKA robotic arm in experiments involving 16 right-handed healthy subjects (13 males, 3 females) and 8 right-handed individuals simulating impaired motor behavior (5 males, 3 females). Motion signals are acquired through internal robot sensors, while a wearable physiological monitoring system defines the Q-learning agent state. Reward improvement and exploration ratio are analyzed as key performance indicators and statistically compared across all tested conditions using the Mann–Whitney test. The results demonstrate that the proposed algorithm effectively adjusts control parameters online, with performance influenced by the reward function, exploration strategy, and selected control actions. Reward improvements of 0.11±0.09 (ε-greedy exploration with the pointing-based reward) and 0.13±0.11 (Upper Confidence Bound exploration with the smoothness-based reward) were observed in healthy subjects, indicating enhancements in pointing accuracy and movement smoothness. In simulated pathological cases, improvements of 0.08±0.13 and 0.06±0.16 were observed, respectively.
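The abstract describes a tabular Q-learning loop with discrete increase/decrease/maintain actions per control parameter and ε-greedy exploration. The Python sketch below illustrates what such a loop could look like; the class name, hyperparameters, reward placeholder, and the toy stiffness simulation are illustrative assumptions, not the paper's implementation.

```python
import random
from collections import defaultdict

# Assumed discrete action set per control parameter:
# decrease, maintain, or increase the current value.
ACTIONS = (-1, 0, +1)

class QLearningTuner:
    """Minimal tabular Q-learning sketch for online tuning of one
    control parameter (e.g. radial stiffness or execution time).
    All names and hyperparameters are illustrative assumptions."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.2):
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability (epsilon-greedy)
        self.q = defaultdict(float)  # Q[(state, action)] -> value

    def select_action(self, state):
        # Epsilon-greedy: explore with probability epsilon,
        # otherwise pick the action with the highest Q-value.
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning temporal-difference update.
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])


def smoothness_reward(jerk_metric):
    # Placeholder performance-based reward: smoother motion
    # (lower jerk) yields a higher reward. The paper's actual
    # reward definitions are not reproduced here.
    return 1.0 / (1.0 + jerk_metric)


if __name__ == "__main__":
    tuner = QLearningTuner()
    stiffness, step = 100.0, 10.0  # assumed initial value and increment
    state = 0                      # assumed discretized physiological state
    for trial in range(50):
        action = tuner.select_action(state)
        stiffness += action * step
        # Simulated observation: smoother motion near an assumed optimum.
        jerk = abs(stiffness - 130.0) / 100.0 + random.random() * 0.05
        reward = smoothness_reward(jerk)
        next_state = state  # single-state toy example
        tuner.update(state, action, reward, next_state)
        state = next_state
    print(f"Tuned stiffness after 50 trials: {stiffness:.1f}")
```

An Upper Confidence Bound variant, the second exploration strategy evaluated in the paper, would replace the ε-greedy rule in `select_action` with an exploration bonus derived from per-action visit counts rather than random action choice.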