Extraction of surface-related features in a recurrent model of V1-V2 interactions.

Weidenbacher U, Neumann H - PLoS ONE (2009)

Bottom Line: The approach is based on feedforward and feedback mechanisms found in visual cortical areas V1 and V2. Unlike previous proposals which treat localized junction configurations as 2D image features, we link them to mechanisms of apparent surface segregation. As a consequence, we demonstrate how junctions can change their perceptual representation depending on the scene context and the spatial configuration of boundary fragments.


Affiliation: Institute of Neural Information Processing, University of Ulm, Ulm, Germany. ulrich.weidenbacher@uni-ulm.de

ABSTRACT

Background: Humans can effortlessly segment surfaces and objects from two-dimensional (2D) images that are projections of the 3D world. The projection from 3D to 2D partially occludes surfaces, depending on their position in depth and on the viewpoint. One way for the human visual system to infer monocular depth cues could be to extract and interpret occlusions. It has been suggested that the perception of contour junctions, in particular T-junctions, may be used as a cue for occlusion of opaque surfaces. Furthermore, X-junctions could be used to signal occlusion of transparent surfaces.

Methodology/principal findings: In this contribution, we propose a neural model that suggests how surface-related cues for occlusion can be extracted from a 2D luminance image. The approach is based on feedforward and feedback mechanisms found in visual cortical areas V1 and V2. In a first step, contours are completed over time by generating groupings of like-oriented contrasts. A few iterations of feedforward and feedback processing lead to a stable representation of completed contours and, at the same time, to a suppression of image noise. In a second step, contour junctions are localized and read out from the distributed representation of boundary groupings. Moreover, surface-related junctions are made explicit and evaluated to interact so as to generate surface segmentations in static images. In addition, we compare our extracted junction signals with a standard computer vision approach for junction detection to demonstrate that our approach outperforms simple, purely feedforward approaches.

Conclusions/significance: A model is proposed that uses feedforward and feedback mechanisms to combine contextually relevant features in order to generate consistent boundary groupings of surfaces. Perceptually important junction configurations are robustly extracted from neural representations to signal cues for occlusion and transparency. Unlike previous proposals, which treat localized junction configurations as 2D image features, we link them to mechanisms of apparent surface segregation. As a consequence, we demonstrate how junctions can change their perceptual representation depending on the scene context and the spatial configuration of boundary fragments.


Figure 3 (pone-0005909-g003): Our model simulates cells of two areas in the visual cortex, visual areas V1 and V2. Each model (sub-)area is designed with respect to a basic building block scheme (bottom, left). The scheme consists of three subsequent steps, namely filtering, modulation and centre-surround inhibition. This scheme is applied three times in our model architecture (left), corresponding to upper and lower area V1 and area V2. In this model, modulatory input (provided by feedback from area V2) is only used in lower area V1. Otherwise the default modulatory input is set to 1 (which leaves the signal unchanged). The lower part of area V1 is modelled by simple and complex cells for initial contrast extraction. Note that each cell pool consists of 12 oriented filters equally spaced between 0° and 180°. The upper part of V1 is modelled by end-stop and bipole cells, which both receive input from lower V1. The additively combined signals are further passed to area V2, where long-range lateral connections are modelled by V2 bipole cells. Note that “•” stands for a multiplicative connection of filter subfields as employed in V2, whereas “○” stands for an additive connection as employed in V1. Finally, the output of area V2 is used as a feedback signal, which closes the recurrent loop between areas V1 and V2.
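The functional difference between the additive (“○”, V1) and multiplicative (“•”, V2) subfield combinations can be sketched in a few lines. The lobe activations below are made-up numbers for illustration, not the model's equations (those are given in Appendix S1):

```python
# Illustrative 1-D sketch of the two subfield combination rules:
# a bipole cell pools contour evidence from two lobes flanking a
# position along its preferred orientation.

def bipole_additive(left, right):
    """V1-style ("o") combination: either lobe alone can drive the cell."""
    return left + right

def bipole_multiplicative(left, right):
    """V2-style (".") combination: both lobes must carry evidence, so
    long-range grouping bridges a gap only when it is flanked on both sides."""
    return left * right

# Gap in a contour, aligned evidence on both sides:
assert bipole_additive(0.8, 0.7) > 0        # responds
assert bipole_multiplicative(0.8, 0.7) > 0  # responds: interpolates the gap

# Line ending, evidence on one side only:
assert bipole_additive(0.8, 0.0) > 0         # still responds
assert bipole_multiplicative(0.8, 0.0) == 0  # silent: no illusory extension
```

The multiplicative rule is what keeps V2 grouping from hallucinating contours past line endings, while the additive rule lets V1 stages pass on any available evidence.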

Mentions: In this section, we explain the individual model parts in more detail. For a precise mathematical description of the model and its different processing stages the reader is referred to Appendix S1. The detailed model architecture is illustrated in Figure 3.
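The three-step building block and the recurrent V1-V2 loop from the Figure 3 caption can be wired up schematically as follows. This is a purely illustrative sketch under our own assumptions: the filtering and normalisation operators are placeholders for the model's actual equations (see Appendix S1), and all names are ours:

```python
import numpy as np

N_ORIENT = 12  # oriented filters equally spaced between 0 and 180 degrees

def building_block(signal, modulation=1.0):
    """One (sub-)area: filtering -> modulation -> centre-surround inhibition.
    With the default modulation of 1 the signal passes through unchanged,
    as stated in the caption; feedback raises it above 1 where V2 agrees."""
    filtered = signal                      # placeholder for oriented filtering
    modulated = filtered * modulation      # multiplicative modulation step
    surround = modulated.mean()            # crude stand-in for the surround pool
    return modulated / (1.0 + surround)    # divisive centre-surround inhibition

rng = np.random.default_rng(0)
contrast = rng.random(N_ORIENT)            # stand-in for complex-cell responses

v2 = np.zeros(N_ORIENT)
for _ in range(4):                         # a few recurrent iterations settle
    v1_lower = building_block(contrast, modulation=1.0 + v2)  # only stage with feedback
    v1_upper = building_block(v1_lower)    # end-stop/bipole stage, no feedback
    v2 = building_block(v1_upper)          # V2 long-range grouping stage
```

Note how the V2 output re-enters only the lower V1 stage as a gain on existing contrast evidence, so feedback can enhance but never create responses, which is the behaviour the caption describes.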

