Real-time human pose estimation and gesture recognition from depth images using superpixels and SVM classifier.

Kim H, Lee S, Lee D, Choi S, Ju J, Myung H - Sensors (Basel) (2015)

Bottom Line: The gesture yielding the smallest comparison error is chosen as the recognized gesture. To prevent recognition when a person performs a gesture that is not registered, we derive the maximum allowable comparison errors by comparing each registered gesture with the other gestures. The experimental results show that our method performs fairly well and is applicable in real environments.


Affiliation: Urban Robotics Laboratory (URL), Dept. Civil and Environmental Engineering, Korea Advanced Institute of Science and Technology (KAIST), 291 Daehak-ro, Yuseong-gu, Daejeon 305-338, Korea. sskhk05@kaist.ac.kr.

ABSTRACT
In this paper, we present human pose estimation and gesture recognition algorithms that use only depth information. The proposed methods are designed to run on a CPU (central processing unit) alone, so that they can operate on a low-cost platform such as an embedded board. The human pose estimation method is based on an SVM (support vector machine) and superpixels, without prior knowledge of a human body model. In the gesture recognition method, gestures are recognized from the pose information of a human body. To recognize gestures regardless of motion speed, the proposed method utilizes a keyframe extraction method. Gesture recognition is performed by comparing input keyframes with the keyframes of registered gestures. The gesture yielding the smallest comparison error is chosen as the recognized gesture. To prevent recognition when a person performs a gesture that is not registered, we derive the maximum allowable comparison errors by comparing each registered gesture with the other gestures. We evaluated our method using a dataset that we generated. The experimental results show that our method performs fairly well and is applicable in real environments.
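As a rough illustration of the matching rule in the abstract, the sketch below (not the authors' code; the error metric, keyframe alignment, and threshold derivation are all assumptions made for illustration) compares an input keyframe sequence against every registered gesture, picks the gesture with the smallest comparison error, and rejects the result when that error exceeds the gesture's maximum allowable comparison error:

```python
# Minimal sketch of keyframe-based gesture matching; all details assumed.
import numpy as np

def comparison_error(input_kf, gesture_kf):
    """Mean distance between aligned keyframe pose vectors (assumed metric)."""
    n = min(len(input_kf), len(gesture_kf))
    return float(np.mean([np.linalg.norm(a - b)
                          for a, b in zip(input_kf[:n], gesture_kf[:n])]))

def max_allowable_errors(registered):
    """Assumed derivation: each gesture's maximum allowable comparison error
    is its smallest error against any *other* registered gesture."""
    return {name: min(comparison_error(kf, other_kf)
                      for other, other_kf in registered.items() if other != name)
            for name, kf in registered.items()}

def recognize(input_kf, registered, max_errors):
    """Return the best-matching registered gesture, or None if unregistered."""
    errors = {name: comparison_error(input_kf, kf)
              for name, kf in registered.items()}
    best = min(errors, key=errors.get)
    # Reject the match if even the best error exceeds the allowable maximum,
    # so an unregistered motion is not forced onto a registered gesture.
    return best if errors[best] <= max_errors[best] else None

# Toy usage with one-dimensional "poses": recognizes "raise".
registered = {"raise": [np.array([0.0]), np.array([1.0])],
              "lower": [np.array([1.0]), np.array([0.0])]}
thresholds = max_allowable_errors(registered)
print(recognize([np.array([0.0]), np.array([0.9])], registered, thresholds))
```

Deriving each gesture's threshold from its smallest error against the other registered gestures means an unregistered motion, which typically matches no gesture better than the gestures match each other, is rejected rather than misrecognized.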


Figure 4 (f4-sensors-15-12410): Example of the measurement update step when the hand occludes the torso. The hand tracker extracts the depth measurements for hand candidates. The final hand position is estimated by the depth measurement with the smallest Mahalanobis distance from the previous hand position.

Mentions: After removing the misclassified superpixels, each joint position of Bi is estimated as the central moment of the superpixels labeled as the corresponding body part Bi. An example of a pose estimation result is shown in Figure 3c. However, when the hands occlude the torso, none of the superpixels are classified as hands. To handle this case, a hand tracker is designed to estimate the hand position even when the classifier provides no hand information. The hand tracker is based on a Kalman filter, which consists of two steps, a state prediction step and a measurement update step, computed at every frame in a background process. In the state prediction step, the state is predicted from the previous hand position and the hand position difference, i.e., Δx, Δy and Δz. In the measurement update step, the hand tracker extracts depth measurements for hand candidates within an ROI (region of interest) computed from the previous hand position and Δx, Δy and Δz; the hand position is then updated with the depth measurement that has the smallest Mahalanobis distance from the previous hand position. If the hand position can be acquired from the classification results, it is corrected by those results; otherwise, the Kalman filter output is used as the hand position. An example of the measurement update procedure when the hand occludes the torso is shown in Figure 4. The rationale for applying a linear Kalman filter as the hand tracker is twofold. First, hand movements are continuous, so the hand position can be predicted from the previous hand position and its position difference. Second, hand movements faster than the operating speed of the overall algorithm (15 frames per second in our experimental setting) are not considered, so the tracker can follow ordinary hand movements continuously at that rate.
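To make the two-step procedure concrete, here is a minimal sketch of such a linear Kalman-filter hand tracker. The state layout [x, y, z, Δx, Δy, Δz] follows the paragraph, while the noise covariances, the class API, and the candidate-selection details are assumptions not given in the paper:

```python
# Sketch of a linear Kalman-filter hand tracker; Q, R, and the API are
# illustrative assumptions, not values from the paper.
import numpy as np

class HandTracker:
    def __init__(self, init_pos):
        self.x = np.hstack([init_pos, np.zeros(3)])   # position + difference
        self.P = np.eye(6)                            # state covariance
        self.F = np.eye(6)                            # transition model:
        self.F[:3, 3:] = np.eye(3)                    #   pos += difference
        self.H = np.hstack([np.eye(3), np.zeros((3, 3))])  # observe position
        self.Q = np.eye(6) * 1e-2                     # process noise (assumed)
        self.R = np.eye(3) * 1e-1                     # measurement noise (assumed)

    def predict(self):
        # State prediction step: previous position advanced by its difference.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:3]

    def select_measurement(self, candidates, prev_pos):
        # Among hand-candidate depth points (gathered within the ROI), keep
        # the one with the smallest Mahalanobis distance from prev_pos.
        S_inv = np.linalg.inv(self.P[:3, :3] + self.R)
        d = [float((c - prev_pos) @ S_inv @ (c - prev_pos)) for c in candidates]
        return candidates[int(np.argmin(d))]

    def update(self, z):
        # Measurement update step with the selected candidate z.
        y = z - self.H @ self.x                       # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)      # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(6) - K @ self.H) @ self.P
        return self.x[:3]
```

Per frame, such a tracker would call predict(), collect candidate depth points inside the ROI derived from the predicted position, pass the Mahalanobis-nearest candidate to update(), and, whenever the classifier does label hand superpixels, simply overwrite the estimate with that classification result, as the paragraph describes.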

