Object segmentation from motion discontinuities and temporal occlusions--a biologically inspired model.

Beck C, Ognibeni T, Neumann H - PLoS ONE (2008)

Bottom Line: We propose a model derived from mechanisms found in visual areas V1, MT, and MSTl of human and primate cortex that achieves robust detection along motion boundaries. In addition, we discuss how this model is related to neurophysiological findings. The model was successfully tested both with artificial and real sequences including self and object motion.


Affiliation: Institute for Neural Information Processing, University of Ulm, Ulm, Germany. cornelia.beck@uni-ulm.de

ABSTRACT

Background: Optic flow is an important cue for object detection. Humans are able to perceive objects in a scene using only kinetic boundaries, and can perform the task even when other shape cues are not provided. These kinetic boundaries are characterized by the presence of motion discontinuities in a local neighbourhood. In addition, temporal occlusions appear along the boundaries as the object in front covers the background and the objects that are spatially behind it.

Methodology/principal findings: From a technical point of view, detecting motion boundaries for segmentation based on optic flow is a difficult task, because the flow detected along such boundaries is generally not reliable. We propose a model derived from mechanisms found in visual areas V1, MT, and MSTl of human and primate cortex that achieves robust detection along motion boundaries. It includes two separate mechanisms, one for the detection of motion discontinuities and one for the detection of occlusion regions, based on how neurons respond to spatial and temporal contrast, respectively. The mechanisms are embedded in a biologically inspired architecture that integrates information from different model components of visual processing through feedback connections. In particular, mutual interactions between the detection of motion discontinuities and temporal occlusions allow a considerable improvement of the kinetic boundary detection.

Conclusions/significance: A new model is proposed that uses optic flow cues to detect motion discontinuities and object occlusion. We suggest that by combining these results for motion discontinuities and object occlusion, object segmentation within the model can be improved. This idea could also be applied in other models for object segmentation. In addition, we discuss how this model is related to neurophysiological findings. The model was successfully tested both with artificial and real sequences including self and object motion.


pone-0003807-g005: Detection of occlusion regions. To detect occlusions and disocclusions in the motion sequence, we compare the motion energy at each spatial position that was estimated using the past frame pair t−1/t0 and using the future frame pair t0/t1. A high difference typically occurs at occlusion and disocclusion positions due to regions that are only visible in t−1 or t1 and thus entail very ambiguous motion estimates.

Mentions: Generating reliable motion estimates at motion boundaries is difficult, because in the occlusion regions no corresponding local image structure can be found between frames t−1 and t0. The lack of local estimates can lead to motion bleeding in these regions: salient estimates from the neighbourhood, such as those of the object generating the occlusion, propagate into the occlusion regions. The propagation can be limited if the motion estimates within the occluded region are strong. For this purpose, we extended the model for motion detection by a mechanism of temporal integration [24]. The underlying idea is that motion estimates from t−1/t0 (“past frame pair”) will fail to yield the correct optic flow for the image regions containing occlusions. The past frame t−1 contains occlusion regions where parts of the background are covered, while they are visible in frame t0 (see Fig. 4). This problem can be solved by using motion cues from one additional future frame to compute the correspondences between t0 and t1 (“future frame pair”), where the occlusion regions are visible in both frames (assuming coherent motion of the object). The estimates of the two frame pairs are then used as parallel input to V1Model. The occlusion regions are thus filled mainly with estimates from the future frame pair, as the past frame pair contributes few motion estimates at these positions. For the disocclusion regions, it is mainly the input from the past frame pair that matters. Using this specific property of occlusions, we are able to compute reliable estimates for occlusion regions without explicitly detecting these regions. This mechanism thereby offers a good basis for subsequent higher-level processing that relies on dense and stable optic flow, as in MSTlModel.
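The parallel use of both frame pairs can be illustrated with a minimal sketch. It is not the authors' implementation: the array layout (height × width × velocity hypotheses), the function names `combine_frame_pairs` and `dominant_source`, and the choice to combine the two inputs by simple addition are all assumptions made for illustration. The text only states that both estimates are fed to V1Model in parallel.

```python
import numpy as np


def combine_frame_pairs(past, future):
    """Combine hypothesis activity from the past and future frame pairs.

    Both inputs are non-negative arrays of shape (H, W, K), where K is the
    number of velocity hypotheses per position (assumed layout). Addition is
    an assumed combination rule, not taken from the paper.
    """
    past = np.asarray(past, dtype=float)
    future = np.asarray(future, dtype=float)
    if past.shape != future.shape:
        raise ValueError("past and future activity must have the same shape")
    return past + future


def dominant_source(past, future):
    """Report which frame pair contributes more activity at each position.

    Returns +1 where the future pair dominates, -1 where the past pair
    dominates, and 0 where both contribute equally.
    """
    e_past = np.asarray(past, dtype=float).sum(axis=-1)
    e_future = np.asarray(future, dtype=float).sum(axis=-1)
    return np.where(e_future > e_past, 1, np.where(e_past > e_future, -1, 0))


# Toy example: one row, three positions, two velocity hypotheses.
past = np.zeros((1, 3, 2))
future = np.zeros((1, 3, 2))
past[0, 0] = [1.0, 0.0]    # position 0: matched in both frame pairs
future[0, 0] = [1.0, 0.0]
future[0, 1] = [0.0, 2.0]  # position 1: only the future pair yields estimates
past[0, 2] = [3.0, 0.0]    # position 2: only the past pair yields estimates

combined = combine_frame_pairs(past, future)
source = dominant_source(past, future)
```

In this toy input, the position without past estimates is filled entirely from the future pair and vice versa, which is the property the paragraph relies on: no explicit occlusion detection is needed to obtain a dense combined estimate.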
On the other hand, the activity provided by the different frame pairs can be processed further by adding neurons for the detection of occlusion and disocclusion regions. The model is extended by a temporal on-center-off-surround mechanism that responds strongly if the motion energy changes at the local position. A change in local motion energy is a strong cue for occlusions, as the non-matchable points in an occlusion region entail low motion energy locally. Temporal motion contrast neurons that respond strongly to changes from low to high motion energy indicate disocclusion regions; those that detect changes from high to low motion energy indicate occlusions (Fig. 5). The motion energy at each position is computed by summing the number of hypotheses generated in a small spatial surround. Eq. 7 describes how the activity in TOModel is computed at time t0. This processing step is carried out after feedback from MTModel has supported the creation of motion hypotheses. The computation is very cheap, as the main extra effort is computing the difference of motion energies (see Eq. 7).
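The motion-energy comparison described above can be sketched as follows. This is a simplified illustration and not a reproduction of Eq. 7, which is not included in this excerpt. The names `motion_energy` and `temporal_contrast`, the activity threshold `eps`, the square surround of radius `radius`, and the plain half-wave rectified difference are assumptions. The sketch also assumes that the change is measured from the past frame pair to the future frame pair; following the wording of the text, a low-to-high change would then be read as disocclusion and a high-to-low change as occlusion.

```python
import numpy as np


def motion_energy(activity, radius=1, eps=1e-6):
    """Count active motion hypotheses in a small spatial surround.

    `activity` has shape (H, W, K). A hypothesis counts as generated when its
    activity exceeds `eps` (assumed threshold). Counts are summed over a
    (2 * radius + 1) x (2 * radius + 1) window with zero padding.
    """
    count = (np.asarray(activity, dtype=float) > eps).sum(axis=-1).astype(float)
    padded = np.pad(count, radius, mode="constant")
    h, w = count.shape
    energy = np.zeros_like(count)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            energy += padded[dy:dy + h, dx:dx + w]
    return energy


def temporal_contrast(past, future, radius=1):
    """Half-wave rectified change in motion energy between the frame pairs.

    Returns (low_to_high, high_to_low), where the change is taken from the
    past pair to the future pair (assumed direction).
    """
    diff = motion_energy(future, radius) - motion_energy(past, radius)
    low_to_high = np.maximum(diff, 0.0)
    high_to_low = np.maximum(-diff, 0.0)
    return low_to_high, high_to_low


# Toy example: two hypotheses appear at the centre only in the future pair.
past = np.zeros((5, 5, 4))
future = np.zeros((5, 5, 4))
future[2, 2, :2] = 1.0

rise, fall = temporal_contrast(past, future, radius=1)
```

As the text notes, the extra cost over the motion estimation itself is small: one surround sum per frame pair and one difference per position.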

