A hierarchical probabilistic model for rapid object categorization in natural scenes.

He X, Yang Z, Tsien JZ - PLoS ONE (2011)

Bottom Line: This amazing ability of rapid categorization has motivated many computational models. Most of these models require extensive training to obtain a decision boundary in a very high dimensional (e.g., ∼6,000 in a leading model) feature space and often categorize objects in natural scenes by categorizing the context that co-occurs with objects when objects do not occupy large portions of the scenes. These results suggest that coarse PDs of object categories based on natural object structures and statistical operations on these PDs may underlie the human ability to rapidly categorize scenes.

Affiliation: Brain and Behavior Discovery Institute, Georgia Health Sciences University, Augusta, Georgia, United States of America.

ABSTRACT
Humans can categorize objects in complex natural scenes within 100-150 ms. This amazing ability of rapid categorization has motivated many computational models. Most of these models require extensive training to obtain a decision boundary in a very high dimensional (e.g., ∼6,000 in a leading model) feature space and often categorize objects in natural scenes by categorizing the context that co-occurs with objects when objects do not occupy large portions of the scenes. It is thus unclear how humans achieve rapid scene categorization. To address this issue, we developed a hierarchical probabilistic model for rapid object categorization in natural scenes. In this model, a natural object category is represented by a coarse hierarchical probability distribution (PD), which includes PDs of object geometry and spatial configuration of object parts. Object parts are encoded by PDs of a set of natural object structures, each of which is a concatenation of local object features. Rapid categorization is performed as statistical inference. Since the model uses a very small number (∼100) of structures for even complex object categories such as animals and cars, it requires little training and is robust in the presence of large variations within object categories and in their occurrences in natural scenes. Remarkably, we found that the model categorized animals in natural scenes and cars in street scenes with a near human-level performance. We also found that the model located animals and cars in natural scenes, thus overcoming a flaw in many other models which is to categorize objects in natural context by categorizing contextual features. These results suggest that coarse PDs of object categories based on natural object structures and statistical operations on these PDs may underlie the human ability to rapidly categorize scenes.

Figure 2 (pone-0020002-g002): Coarse models of geometry of animals (medium-body animals) and cars in natural scenes. (A) Any animal in natural scenes was modeled by two ellipses, one for the head and one for the body. Any car in natural scenes was modeled by one ellipse. (B) Size (left) and orientation (right) distributions of animal heads in natural scenes. (C) Size (left) and orientation (right) distributions of animal bodies in natural scenes. (D) Size (left) and orientation (right) distributions of cars in natural scenes.

Mentions: To model human performance on scene categorization, we developed coarse models of object geometry in natural context. We modeled any animal in natural scenes by two ellipses, one for the head and one for the body (Figure 2A). For this purpose, we segmented animals from a set of training scenes by hand. Although current computer vision algorithms can do a decent job on this task (e.g., [30]), we segmented manually because we needed accurate segmentation for compiling object structures (see the following sections). After segmentation, we fitted a multi-dimensional Gaussian PD to the histogram of the parameters of the two ellipses. The distribution of the sizes of animal heads in the dataset of animal scenes had a peak at (35 pixels, 24 pixels) (left panel in Figure 2B). The distribution of the orientations of animal heads had a peak at 87° and a standard deviation of 35° (right panel in Figure 2B). The distribution of the sizes of animal bodies had a peak at (37 pixels, 31 pixels) (left panel in Figure 2C). The distribution of the orientations of animal bodies had a peak at 178° and a standard deviation of 38° (right panel in Figure 2C).
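
The Gaussian fitting step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `fit_gaussian` and `log_density` are hypothetical, the samples are synthetic (drawn around the reported head peak of (35, 24) pixels and 87°), the size spreads of 8 and 6 pixels are assumed for illustration only, and orientation is treated as an ordinary linear variable, which ignores angular wrap-around.

```python
import numpy as np

def fit_gaussian(params):
    """Maximum-likelihood fit of a multivariate Gaussian to row-wise samples."""
    params = np.asarray(params, dtype=float)
    mean = params.mean(axis=0)
    # rowvar=False: each column is one variable; bias=True gives the ML estimate.
    cov = np.cov(params, rowvar=False, bias=True)
    return mean, cov

def log_density(x, mean, cov):
    """Log of the multivariate Gaussian density evaluated at x."""
    diff = np.asarray(x, dtype=float) - mean
    k = mean.size
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)  # squared Mahalanobis distance
    return -0.5 * (k * np.log(2.0 * np.pi) + logdet + maha)

# Synthetic stand-in for hand-segmented head ellipses:
# columns are (major size in pixels, minor size in pixels, orientation in degrees).
rng = np.random.default_rng(0)
true_mean = np.array([35.0, 24.0, 87.0])  # peaks reported in the text
true_std = np.array([8.0, 6.0, 35.0])     # 35 deg is reported; 8 and 6 are assumed
samples = rng.normal(true_mean, true_std, size=(500, 3))

head_mean, head_cov = fit_gaussian(samples)

# A candidate ellipse near the fitted peak scores higher than an atypical one.
typical = log_density([34.0, 25.0, 90.0], head_mean, head_cov)
atypical = log_density([80.0, 5.0, 0.0], head_mean, head_cov)
```

The same two functions could be applied to the body ellipses or to the single car ellipse; comparing `log_density` values is one simple way such a fitted PD might be used to score a candidate ellipse.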

