Limits...
A hierarchical probabilistic model for rapid object categorization in natural scenes.

He X, Yang Z, Tsien JZ - PLoS ONE (2011)

Bottom Line: This amazing ability of rapid categorization has motivated many computational models.Most of these models require extensive training to obtain a decision boundary in a very high dimensional (e.g., ∼6,000 in a leading model) feature space and often categorize objects in natural scenes by categorizing the context that co-occurs with objects when objects do not occupy large portions of the scenes.These results suggest that coarse PDs of object categories based on natural object structures and statistical operations on these PDs may underlie the human ability to rapidly categorize scenes.

View Article: PubMed Central - PubMed

Affiliation: Brain and Behavior Discovery Institute, Georgia Health Sciences University, Augusta, Georgia, United States of America.

ABSTRACT
Humans can categorize objects in complex natural scenes within 100-150 ms. This amazing ability of rapid categorization has motivated many computational models. Most of these models require extensive training to obtain a decision boundary in a very high dimensional (e.g., ∼6,000 in a leading model) feature space and often categorize objects in natural scenes by categorizing the context that co-occurs with objects when objects do not occupy large portions of the scenes. It is thus unclear how humans achieve rapid scene categorization.To address this issue, we developed a hierarchical probabilistic model for rapid object categorization in natural scenes. In this model, a natural object category is represented by a coarse hierarchical probability distribution (PD), which includes PDs of object geometry and spatial configuration of object parts. Object parts are encoded by PDs of a set of natural object structures, each of which is a concatenation of local object features. Rapid categorization is performed as statistical inference. Since the model uses a very small number (∼100) of structures for even complex object categories such as animals and cars, it requires little training and is robust in the presence of large variations within object categories and in their occurrences in natural scenes. Remarkably, we found that the model categorized animals in natural scenes and cars in street scenes with a near human-level performance. We also found that the model located animals and cars in natural scenes, thus overcoming a flaw in many other models which is to categorize objects in natural context by categorizing contextual features. These results suggest that coarse PDs of object categories based on natural object structures and statistical operations on these PDs may underlie the human ability to rapidly categorize scenes.

Show MeSH

Related in: MedlinePlus

Statistics of object structures.(A), Relative occurring frequency of animal structures. (B), Examples of animal structures. The vertical axis indicates the percentage of the animals in the dataset by which the structures were shared. (C), The total numbers of structures that were shared by different percentage of the animals in the dataset. (D)–(F), Same format as (A)–(C) respectively for car structures.
© Copyright Policy
Related In: Results  -  Collection


getmorefigures.php?uid=PMC3102072&req=5

pone-0020002-g005: Statistics of object structures.(A), Relative occurring frequency of animal structures. (B), Examples of animal structures. The vertical axis indicates the percentage of the animals in the dataset by which the structures were shared. (C), The total numbers of structures that were shared by different percentage of the animals in the dataset. (D)–(F), Same format as (A)–(C) respectively for car structures.

Mentions: We then examined the statistics of object structures compiled from a set of animals and cars. We found that simpler structures occur more frequently and are shared by more objects and most structures are shared by only a few objects. Figure 5A shows the normalized frequency of 3,100 structures shared by at least 10% of the animals in the training set. The most frequent structure in animals is a patch with a dark spot at the lower left. The 1,000th, 2,000th, and 3,000th structures are a fur patch, a patch of zebra strip, and a patch of deer head respectively. Figure 5B and C show examples of structures and the total number of structures shared by different percentage of the animals in the training set respectively. There are only 3 structures shared by 90% of the animals while there are 1,734 structures shared by 10% of the animals.


A hierarchical probabilistic model for rapid object categorization in natural scenes.

He X, Yang Z, Tsien JZ - PLoS ONE (2011)

Statistics of object structures.(A), Relative occurring frequency of animal structures. (B), Examples of animal structures. The vertical axis indicates the percentage of the animals in the dataset by which the structures were shared. (C), The total numbers of structures that were shared by different percentage of the animals in the dataset. (D)–(F), Same format as (A)–(C) respectively for car structures.
© Copyright Policy
Related In: Results  -  Collection

Show All Figures
getmorefigures.php?uid=PMC3102072&req=5

pone-0020002-g005: Statistics of object structures.(A), Relative occurring frequency of animal structures. (B), Examples of animal structures. The vertical axis indicates the percentage of the animals in the dataset by which the structures were shared. (C), The total numbers of structures that were shared by different percentage of the animals in the dataset. (D)–(F), Same format as (A)–(C) respectively for car structures.
Mentions: We then examined the statistics of object structures compiled from a set of animals and cars. We found that simpler structures occur more frequently and are shared by more objects and most structures are shared by only a few objects. Figure 5A shows the normalized frequency of 3,100 structures shared by at least 10% of the animals in the training set. The most frequent structure in animals is a patch with a dark spot at the lower left. The 1,000th, 2,000th, and 3,000th structures are a fur patch, a patch of zebra strip, and a patch of deer head respectively. Figure 5B and C show examples of structures and the total number of structures shared by different percentage of the animals in the training set respectively. There are only 3 structures shared by 90% of the animals while there are 1,734 structures shared by 10% of the animals.

Bottom Line: This amazing ability of rapid categorization has motivated many computational models.Most of these models require extensive training to obtain a decision boundary in a very high dimensional (e.g., ∼6,000 in a leading model) feature space and often categorize objects in natural scenes by categorizing the context that co-occurs with objects when objects do not occupy large portions of the scenes.These results suggest that coarse PDs of object categories based on natural object structures and statistical operations on these PDs may underlie the human ability to rapidly categorize scenes.

View Article: PubMed Central - PubMed

Affiliation: Brain and Behavior Discovery Institute, Georgia Health Sciences University, Augusta, Georgia, United States of America.

ABSTRACT
Humans can categorize objects in complex natural scenes within 100-150 ms. This amazing ability of rapid categorization has motivated many computational models. Most of these models require extensive training to obtain a decision boundary in a very high dimensional (e.g., ∼6,000 in a leading model) feature space and often categorize objects in natural scenes by categorizing the context that co-occurs with objects when objects do not occupy large portions of the scenes. It is thus unclear how humans achieve rapid scene categorization.To address this issue, we developed a hierarchical probabilistic model for rapid object categorization in natural scenes. In this model, a natural object category is represented by a coarse hierarchical probability distribution (PD), which includes PDs of object geometry and spatial configuration of object parts. Object parts are encoded by PDs of a set of natural object structures, each of which is a concatenation of local object features. Rapid categorization is performed as statistical inference. Since the model uses a very small number (∼100) of structures for even complex object categories such as animals and cars, it requires little training and is robust in the presence of large variations within object categories and in their occurrences in natural scenes. Remarkably, we found that the model categorized animals in natural scenes and cars in street scenes with a near human-level performance. We also found that the model located animals and cars in natural scenes, thus overcoming a flaw in many other models which is to categorize objects in natural context by categorizing contextual features. These results suggest that coarse PDs of object categories based on natural object structures and statistical operations on these PDs may underlie the human ability to rapidly categorize scenes.

Show MeSH
Related in: MedlinePlus