Robust Upper Limb Kinematic Reconstruction Using a RGB-D Camera
Cordella F.
2024-01-01
Abstract
In this letter, we propose a new approach for human motion reconstruction based on a Gaussian Mixture Probability Hypothesis Density (GM-PHD) filter applied to human joint positions extracted from an RGB-D camera (e.g., Kinect). Existing inference approaches require a proper association between measurements and joints, which cannot be maintained when occlusions occur during multi-target tracking. The proposed GM-PHD filter recursively estimates the number and states of each group of targets. Furthermore, we embed kinematic constraints in the inference process to guarantee robustness to occlusions. We evaluate the accuracy of both the proposed approach and the default tracking provided by a Kinect device by comparing them with a motion analysis system (i.e., a Vicon optoelectronic system), including in the presence of occlusions of one or more body joints. Experimental results show that the filter outperforms the baseline commercial solution available in the Kinect device, reducing the hand position and elbow flexion errors by 55.8% and 36.3%, respectively. In addition, to evaluate the applicability of the approach in real-world settings, we employ it in a gesture-based interface to remotely control a drone. The user is able to move the drone to a target position with a 100% success rate.
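To illustrate what "recursively estimates the number and states" of targets means in practice, the following is a minimal sketch of one predict/update cycle of a generic GM-PHD filter for a one-dimensional state. It is not the authors' implementation: it omits the kinematic constraints, as well as the pruning and merging steps usually applied to the mixture. The random-walk motion model, the names `Component`, `predict`, `update`, and `estimated_count`, and all parameter values (`p_s`, `p_d`, `q`, `r`, `clutter`) are assumptions chosen for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class Component:
    """One Gaussian term of the PHD intensity (hypothetical helper type)."""
    w: float  # weight: expected number of targets carried by this term
    m: float  # mean (1-D position)
    P: float  # variance


def gauss(z, mean, var):
    """Density of a scalar normal distribution N(z; mean, var)."""
    return math.exp(-0.5 * (z - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def predict(components, births, p_s=0.99, q=0.01):
    """Prediction step, assuming a random-walk model (x_k = x_{k-1} + noise).

    p_s is the assumed survival probability, q the assumed process variance.
    """
    survived = [Component(p_s * c.w, c.m, c.P + q) for c in components]
    return survived + [Component(b.w, b.m, b.P) for b in births]


def update(predicted, measurements, p_d=0.9, r=0.01, clutter=1e-3):
    """Update step, assuming the measurement is the position plus noise.

    p_d is the assumed detection probability, r the assumed measurement
    variance, clutter the assumed (constant) clutter intensity.
    No measurement-to-target association is computed: every measurement
    updates every predicted component, and the weights sort it out.
    """
    # Missed-detection terms keep targets alive when no measurement arrives.
    updated = [Component((1.0 - p_d) * c.w, c.m, c.P) for c in predicted]
    for z in measurements:
        group = []
        for c in predicted:
            s = c.P + r   # innovation variance
            k = c.P / s   # Kalman gain
            group.append(Component(p_d * c.w * gauss(z, c.m, s),
                                   c.m + k * (z - c.m),
                                   (1.0 - k) * c.P))
        norm = clutter + sum(g.w for g in group)
        updated.extend(Component(g.w / norm, g.m, g.P) for g in group)
    return updated


def estimated_count(components):
    """The expected number of targets is the sum of the mixture weights."""
    return sum(c.w for c in components)


if __name__ == "__main__":
    prior = [Component(1.0, 0.0, 0.05), Component(1.0, 1.0, 0.05)]
    posterior = update(predict(prior, births=[]), measurements=[0.02, 0.98])
    print(estimated_count(posterior))
```

In this sketch the number of targets is never fixed in advance: it is read off the posterior weights, which is what allows a PHD-type filter to avoid an explicit association between measurements and joints.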