Jointly Learning Multiple Sequential Dynamics for Human Action Recognition.

Liu AA, Su YT, Nie WZ, Yang ZX - PLoS ONE (2015)

Bottom Line: For visual representation, we propose the part-induced spatiotemporal action unit sequence to represent each action sample with multiple partwise sequential feature subspaces. For model learning, we propose the multi-task conditional random fields (MTCRFs) model to discover the sequence-specific structure and the sequence-shared relationship. Specifically, the multi-chain graph structure and the corresponding probabilistic model are designed to represent the interaction among multiple part-induced action unit sequences.

View Article: PubMed Central - PubMed

Affiliation: School of Electronic Information Engineering, Tianjin University, Tianjin 300072, China.

ABSTRACT
Discovering visual dynamics during human actions is a challenging task for human action recognition. To deal with this problem, we theoretically propose the multi-task conditional random fields model and explore its application to human action recognition. For visual representation, we propose the part-induced spatiotemporal action unit sequence to represent each action sample with multiple partwise sequential feature subspaces. For model learning, we propose the multi-task conditional random fields (MTCRFs) model to discover the sequence-specific structure and the sequence-shared relationship. Specifically, the multi-chain graph structure and the corresponding probabilistic model are designed to represent the interaction among multiple part-induced action unit sequences. Moreover, we propose the model learning and inference methods to discover the temporal context within individual action unit sequences and the latent correlation among different body parts. Extensive experiments demonstrate the superiority of the proposed method on two popular RGB human action datasets, KTH & TJU, and the depth dataset MSR Daily Activity 3D.

No MeSH data available.


pone.0130884.g003: Graph structure of MTCRFs.

Mentions: We design the specific graph structure 𝓖 = {𝓥, 𝓔} (Fig 3) for the MTCRFs model. 𝓥 denotes the node set, including both observation nodes and hidden state nodes. 𝓔 denotes the edge set, including the transitions between adjacent hidden states and the correlations between the hidden states and the action label. In terms of the designed graph structure, each action video, depicted by the extracted ST-AUS representation, can be represented by P parallel sequences with the chain structure, where s^p denotes the individual part-induced ST-AUS that represents the temporal dynamics of a specific body part during one action. S is assigned a specific action label A ∈ 𝓐. To model the state transition within an individual ST-AUS (s^p), we utilize the hidden state layer to correlate the adjacent observations, in which h^p denotes the hidden state sequence corresponding to s^p. Each hidden state in h^p is a member of a finite discrete set 𝓛^p of the pth ST-AUS. All the hidden states are correlated by the edges between individual hidden states and the action label node. Consequently, all ST-AUSs can be correlated and will contribute to the modeling and inference of the action category.
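The multi-chain structure described above can be sketched in code. The following is an illustrative construction only, not the authors' implementation: it builds the node and edge sets for P part-induced chains of length T, with emission edges (observation to hidden state), transition edges within each chain, and an edge from every hidden state to the single action-label node, which is what couples the P chains. All identifiers here (`build_mtcrf_graph`, the node naming scheme) are hypothetical.

```python
def build_mtcrf_graph(P, T):
    """Return (nodes, edges) for a P-chain MTCRFs-style graph G = {V, E}.

    Nodes: one action-label node 'A'; per chain p and time step t, an
    observation node x_p_t and a hidden-state node h_p_t.
    Edges: emission (x_p_t, h_p_t), chain transition (h_p_{t-1}, h_p_t),
    and label edge (h_p_t, 'A') correlating every hidden state with the
    action label node.
    """
    nodes = {"A"}
    edges = set()
    for p in range(P):
        for t in range(T):
            x, h = f"x_{p}_{t}", f"h_{p}_{t}"
            nodes.update({x, h})
            edges.add((x, h))        # observation -> hidden state
            edges.add((h, "A"))      # hidden state -> action label
            if t > 0:
                edges.add((f"h_{p}_{t-1}", h))  # transition within chain
    return nodes, edges

nodes, edges = build_mtcrf_graph(P=3, T=4)
# 1 label node + 3 chains * 4 steps * 2 node types = 25 nodes
# per chain: 4 emission + 4 label + 3 transition edges = 11 -> 33 edges
```

Because every hidden state connects to the shared label node, inference over the action category aggregates evidence from all P part-induced sequences at once, which is the stated rationale for the multi-chain design.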

