Jointly Learning Multiple Sequential Dynamics for Human Action Recognition.

Liu AA, Su YT, Nie WZ, Yang ZX - PLoS ONE (2015)

Bottom Line: For visual representation, we propose the part-induced spatiotemporal action unit sequence to represent each action sample with multiple partwise sequential feature subspaces. For model learning, we propose the multi-task conditional random fields (MTCRFs) model to discover the sequence-specific structure and the sequence-shared relationship. Specifically, the multi-chain graph structure and the corresponding probabilistic model are designed to represent the interaction among multiple part-induced action unit sequences.


Affiliation: School of Electronic Information Engineering, Tianjin University, Tianjin 300072, China.

ABSTRACT
Discovering visual dynamics during human actions is a challenging task for human action recognition. To deal with this problem, we theoretically propose the multi-task conditional random fields model and explore its application to human action recognition. For visual representation, we propose the part-induced spatiotemporal action unit sequence to represent each action sample with multiple partwise sequential feature subspaces. For model learning, we propose the multi-task conditional random fields (MTCRFs) model to discover the sequence-specific structure and the sequence-shared relationship. Specifically, the multi-chain graph structure and the corresponding probabilistic model are designed to represent the interaction among multiple part-induced action unit sequences. Moreover, we propose the model learning and inference methods to discover the temporal context within each individual action unit sequence and the latent correlation among different body parts. Extensive experiments demonstrate the superiority of the proposed method on two popular RGB human action datasets, KTH and TJU, and on the MSR Daily Activity 3D depth dataset.
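To make the multi-chain idea concrete, the short Python sketch below shows one way a score over several part-induced action unit sequences could be assembled: each chain keeps its own observation and transition potentials (the sequence-specific structure), while a cross-chain coupling term ties the hidden states of different body parts at the same frame (the sequence-shared relationship). This is only an illustrative sketch under assumed array shapes and potential forms, not the authors' implementation; every function and variable name here is hypothetical.

# Illustrative sketch of a multi-chain CRF-style score; NOT the paper's MTCRFs
# implementation. Shapes and potential forms are assumptions for illustration.
import numpy as np

def multichain_score(obs, hidden, label, emission_w, transition_w, label_w, coupling_w):
    """Unnormalized log-score of one action sample under one candidate label.

    obs          : list of C arrays, each (T, D)  -- one observation sequence per body part
    hidden       : list of C int arrays, each (T,) -- hidden action-unit states per chain
    label        : int                              -- candidate action class
    emission_w   : (C, H, D)    per-chain observation weights (sequence-specific)
    transition_w : (C, H, H)    per-chain transition weights (temporal context within a chain)
    label_w      : (L, C, H)    compatibility between hidden states and the action label
    coupling_w   : (C, C, H, H) cross-chain coupling (sequence-shared relationship)
    """
    C, T = len(obs), obs[0].shape[0]
    score = 0.0
    for c in range(C):
        for t in range(T):
            h = hidden[c][t]
            score += emission_w[c, h] @ obs[c][t]              # observation potential
            score += label_w[label, c, h]                      # state-label potential
            if t > 0:
                score += transition_w[c, hidden[c][t - 1], h]  # within-chain temporal context
        for c2 in range(c + 1, C):                             # latent correlation among parts
            for t in range(T):
                score += coupling_w[c, c2, hidden[c][t], hidden[c2][t]]
    return score

# Tiny toy example: C=3 body-part chains, T=5 frames, D=8-dim features,
# H=4 hidden action-unit states, L=6 action classes (all values random).
rng = np.random.default_rng(0)
C, T, D, H, L = 3, 5, 8, 4, 6
obs = [rng.normal(size=(T, D)) for _ in range(C)]
hidden = [rng.integers(0, H, size=T) for _ in range(C)]
params = (rng.normal(size=(C, H, D)), rng.normal(size=(C, H, H)),
          rng.normal(size=(L, C, H)), rng.normal(size=(C, C, H, H)))
scores = [multichain_score(obs, hidden, y, *params) for y in range(L)]
print("highest-scoring label:", int(np.argmax(scores)))

In a full model of this kind, such a score would be exponentiated and normalized over all hidden-state assignments and candidate labels, and inference would marginalize the hidden states (as the HCRF family does) rather than fix a single assignment.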



Fig 8 (pone.0130884.g008). Action category-wise comparison among HL-HCRF, LL-MTCRFs, ML-MTCRFs, and All-MTCRFs on KTH, TJU, and MDA.

Mentions: The action category-wise comparison among HL-HCRF, LL-MTCRFs, ML-MTCRFs, and All-MTCRFs on the three datasets is shown in Fig 8a, 8b and 8c. On KTH, All-MTCRFs performs as well as or better than the other methods in 4 out of 6 actions. In particular, it raises the accuracy on Jogging, the most challenging action in KTH, from around 80% to 91.2%. On TJU, All-MTCRFs ranks 1st in 17 out of 22 actions and improves the accuracy on 4 actions (clapping, jacks, p-jump, draw-circle) by more than 10%. On MDA, in the depth modality, All-MTCRFs likewise ranks 1st in 14 out of 16 actions and improves the accuracy on 6 actions (drink, read book, write on a paper, sit still, toss paper, lie down on sofa) by more than 20%. This comparison demonstrates that the proposed method is more robust to the high intra-class variation caused by more complex actions and diverse environments. The action-wise comparison among the first three methods in Fig 8 shows that LL-MTCRFs performs as well as or better than HL-HCRF and ML-MTCRFs in 3 out of 6 actions on KTH, 16 out of 22 on TJU, and 12 out of 16 on MDA. This further demonstrates that the body part-induced ST-AUS representation is more discriminative for describing local dynamics and consequently facilitates human action recognition.

