An online reinforcement learning method to improve control adaptability in robot-aided rehabilitation
Tamantini C.;Lauretti C.;Zollo L.
2025-01-01
Abstract
Rehabilitation robotics enables consistent and personalized therapy but still relies on complex, expert-driven tuning of control parameters. To address this, a reinforcement learning strategy based on Q-learning is proposed to autonomously adapt key parameters during upper-limb rehabilitation, without requiring prior task-specific knowledge. A systematic evaluation is conducted across combinations of control parameters (radial stiffness and execution time), performance-based reward functions (pointing accuracy and movement smoothness), and exploration strategies (ε-greedy and Upper Confidence Bound). The Q-learning agent selects discrete actions (increase, decrease, or maintain the current value) for each control parameter, enabling real-time adaptation based on observed performance. The method is validated using a KUKA robotic arm in experiments involving 16 right-handed healthy subjects (13 males, 3 females) and 8 right-handed individuals simulating impaired motor behavior (5 males, 3 females). Motion signals are acquired through internal robot sensors, while a wearable physiological monitoring system defines the Q-learning agent state. Reward improvement and exploration ratio are analyzed as key performance indicators and statistically compared across all tested conditions using the Mann–Whitney test. The results demonstrate that the proposed algorithm effectively adjusts control parameters online, with performance influenced by the reward function, exploration strategy, and selected control actions. Reward improvements of 0.11±0.09 (ε-greedy exploration with the pointing-based reward) and 0.13±0.11 (Upper Confidence Bound exploration with the smoothness-based reward) were observed in healthy subjects, indicating enhancements in pointing accuracy and movement smoothness. In simulated pathological cases, improvements of 0.08±0.13 and 0.06±0.16 were observed, respectively.
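The abstract describes a tabular Q-learning loop with discrete increase/decrease/maintain actions per control parameter and ε-greedy exploration. The Python sketch below illustrates what such a loop could look like; the class name, hyperparameters, reward placeholder, and the toy stiffness simulation are illustrative assumptions, not the paper's implementation.

```python
import random
from collections import defaultdict

# Assumed discrete action set per control parameter:
# decrease, maintain, or increase the current value.
ACTIONS = (-1, 0, +1)

class QLearningTuner:
    """Minimal tabular Q-learning sketch for online tuning of one
    control parameter (e.g. radial stiffness or execution time).
    All names and hyperparameters are illustrative assumptions."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.2):
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability (epsilon-greedy)
        self.q = defaultdict(float)  # Q[(state, action)] -> value

    def select_action(self, state):
        # Epsilon-greedy: explore with probability epsilon,
        # otherwise pick the action with the highest Q-value.
        if random.random() < self.epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning temporal-difference update.
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])


def smoothness_reward(jerk_metric):
    # Placeholder performance-based reward: smoother motion
    # (lower jerk) yields a higher reward. The paper's actual
    # reward definitions are not reproduced here.
    return 1.0 / (1.0 + jerk_metric)


if __name__ == "__main__":
    tuner = QLearningTuner()
    stiffness, step = 100.0, 10.0  # assumed initial value and increment
    state = 0                      # assumed discretized physiological state
    for trial in range(50):
        action = tuner.select_action(state)
        stiffness += action * step
        # Simulated observation: smoother motion near an assumed optimum.
        jerk = abs(stiffness - 130.0) / 100.0 + random.random() * 0.05
        reward = smoothness_reward(jerk)
        next_state = state  # single-state toy example
        tuner.update(state, action, reward, next_state)
        state = next_state
    print(f"Tuned stiffness after 50 trials: {stiffness:.1f}")
```

An Upper Confidence Bound variant, the second exploration strategy evaluated in the paper, would replace the ε-greedy rule in `select_action` with an exploration bonus derived from per-action visit counts rather than random action choice.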