Jointly Learning Multiple Sequential Dynamics for Human Action Recognition.

Liu AA, Su YT, Nie WZ, Yang ZX - PLoS ONE (2015)

Bottom Line: For visual representation, we propose the part-induced spatiotemporal action unit sequence to represent each action sample with multiple partwise sequential feature subspaces. For model learning, we propose the multi-task conditional random fields (MTCRFs) model to discover the sequence-specific structure and the sequence-shared relationship. Specifically, the multi-chain graph structure and the corresponding probabilistic model are designed to represent the interaction among multiple part-induced action unit sequences.


Affiliation: School of Electronic Information Engineering, Tianjin University, Tianjin 300072, China.

ABSTRACT
Discovering visual dynamics during human actions is a challenging task for human action recognition. To deal with this problem, we propose the multi-task conditional random fields model and explore its application to human action recognition. For visual representation, we propose the part-induced spatiotemporal action unit sequence to represent each action sample with multiple partwise sequential feature subspaces. For model learning, we propose the multi-task conditional random fields (MTCRFs) model to discover the sequence-specific structure and the sequence-shared relationship. Specifically, the multi-chain graph structure and the corresponding probabilistic model are designed to represent the interaction among multiple part-induced action unit sequences. Moreover, we propose model learning and inference methods to discover the temporal context within each individual action unit sequence and the latent correlation among different body parts. Extensive experiments demonstrate the superiority of the proposed method on two popular RGB human action datasets, KTH and TJU, and on the depth dataset MSR Daily Activity 3D.
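
As a hedged illustration only (the exact potentials, latent variables, and parameter tying of MTCRFs are defined in the full paper, not here), a multi-chain latent-state CRF over C part-induced sequences x^1, ..., x^C with action label y is often written in a form such as

  P(y | x^{1:C}) \propto \sum_{h^{1:C}} \exp\Big( \sum_{c=1}^{C} \sum_{t} \big[ \psi(y, h^{c}_{t}, x^{c}_{t}) + \phi(y, h^{c}_{t-1}, h^{c}_{t}) \big] + \sum_{t} \xi(y, h^{1}_{t}, \dots, h^{C}_{t}) \Big),

where h^{c}_{t} is the latent state of chain c at step t, the \psi and \phi potentials capture the sequence-specific temporal context within each chain, and the cross-chain potential \xi captures the sequence-shared correlation among body parts.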



Fig 1 (pone.0130884.g001). Partwise spatio-temporal action unit sequence. Note that *, ×, +, o denote the space-time interest points in different part areas, respectively; different colors denote different body parts.

Mentions: 1) ST-AUS representation: We propose the partwise spatio-temporal action unit sequence (ST-AUS), as shown in Fig 1. We utilize prior knowledge of body structure to define seven body parts: head, left/right limbs, left/right legs, and left/right feet. One specific part region lasting for T frames is considered an action unit, and the action units belonging to one part region in an action video are considered a partwise action unit sequence. Each partwise action unit sequence focuses on the dynamics of a specific body area. Consequently, each action video can be represented in multiple sequential feature spaces.
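
To make this construction concrete, the sketch below shows one plausible way to assemble partwise action unit sequences from per-frame interest-point descriptors that have already been assigned to body-part regions. The function and parameter names (build_st_aus, UNIT_LENGTH, CODEBOOK_SIZE) and the bag-of-words quantization are illustrative assumptions, not the authors' implementation.

import numpy as np

# Assumed setting: each frame yields a list of (part_id, descriptor) pairs for
# its space-time interest points; descriptor extraction and part assignment
# are done elsewhere.
NUM_PARTS = 7        # head, left/right limbs, left/right legs, left/right feet
UNIT_LENGTH = 10     # T frames per action unit (assumed value)
CODEBOOK_SIZE = 128  # size of a hypothetical per-unit bag-of-words codebook

def build_st_aus(frame_descriptors, codebook):
    """Return one sequence of action-unit histograms per body part.

    frame_descriptors: list over frames; each entry is a list of
        (part_id, descriptor) pairs for that frame's interest points.
    codebook: (CODEBOOK_SIZE, D) array of visual words.
    """
    num_frames = len(frame_descriptors)
    sequences = [[] for _ in range(NUM_PARTS)]
    for start in range(0, num_frames - UNIT_LENGTH + 1, UNIT_LENGTH):
        window = frame_descriptors[start:start + UNIT_LENGTH]
        for part in range(NUM_PARTS):
            # Collect this part's descriptors over the T-frame window.
            descs = [d for frame in window for p, d in frame if p == part]
            hist = np.zeros(CODEBOOK_SIZE)
            for d in descs:
                # Quantize each descriptor to its nearest visual word.
                word = np.argmin(np.linalg.norm(codebook - d, axis=1))
                hist[word] += 1
            if hist.sum() > 0:
                hist /= hist.sum()
            sequences[part].append(hist)  # one action unit for this part
    return sequences  # NUM_PARTS partwise sequential feature subspaces

Each returned sequence would then be treated as one chain in the multi-chain model, so the seven partwise sequences of a video become its multiple sequential feature subspaces.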

