Carmi, Ran and Itti, Lawrence (2004) Disentangling topdown from bottom up influences on attentional allocation in dynamic scenes. In: 11th Joint Symposium on Neural Computation, May 15 2004, University of Southern California. (Unpublished) http://resolver.caltech.edu/CaltechJSNC:2004.poster002
Full text not available from this repository.
Motivation: Attentional allocation is determined by the interplay between bottom-up and top-down influences. Here we try to quantify the relative contributions of different influences on attentional allocation in dynamic scenes, and to examine how they change over time.
Methods: To manipulate the availability of top-down influences on attentional allocation, heterogeneous video clips were cut into clippets (M=2s), which were scrambled and re-assembled into MTV-style clips. Two groups of eight subjects each were instructed to "follow the main actors and actions". One group viewed the original stimuli while the other group viewed the MTV-style clips. Eye positions were recorded using an ISCAN eye-tracker (240 Hz, yielding a total of more than a million samples for each group), and segmented into saccades, blinks, and fixation/smooth pursuit periods. A saliency-based model of attention capture (Itti & Koch 2000) was used to probe the relative contribution of bottom-up influences on attentional allocation, based on a novel performance metric, the Chance-Adjusted Saliency Accumometric (CASA). CASA values were computed as the weighted sum of differences between normalized saliency at human vs. random saccade targets.
Results: Total CASA based on the full saliency model was 6% higher in the MTV group than in the original group. In both the original and MTV groups, CASA based on either motion or flicker features alone was ~95% of the CASA based on the full saliency model. CASA based on either color, intensity, or orientation features alone was ~66% of the full-model CASA. Generally, CASA values for earlier saccades after stimulus onset (clip or clippet start) were higher than for later saccades, but they tapered off and fluctuated around a fairly high value after the first several saccades.
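The abstract describes CASA only as a weighted sum of differences between normalized saliency at human vs. random saccade targets; it does not give the weighting scheme or the normalization. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `casa_sketch`, the pairing of each human saccade target with one random target, and the default uniform weights are all hypothetical choices made for illustration.

```python
def casa_sketch(human_saliency, random_saliency, weights=None):
    """Weighted sum of (human - random) normalized saliency differences.

    Assumptions (not from the abstract): inputs are equal-length sequences
    of saliency values already normalized to [0, 1], one pair per saccade,
    and weights are rescaled to sum to 1 (uniform if not supplied).
    """
    n = len(human_saliency)
    if n == 0 or n != len(random_saliency):
        raise ValueError("need two non-empty sequences of equal length")
    if weights is None:
        weights = [1.0] * n  # hypothetical default: every saccade counts equally
    if len(weights) != n:
        raise ValueError("weights must match the number of saccades")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Positive result: human targets were more salient than chance targets.
    return sum(w / total * (h - r)
               for w, h, r in zip(weights, human_saliency, random_saliency))


# Illustrative values only, not data from the study.
human = [0.9, 0.8, 0.7]
chance = [0.3, 0.2, 0.4]
score = casa_sketch(human, chance)
```

With uniform weights this reduces to the mean saliency difference, so a score near zero would indicate that saccade targets were no more salient than randomly chosen locations.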
Conclusions: The 6% CASA difference between the original and MTV groups shows that eliminating visual context beyond the first ~2s of viewing barely increased the overall relative weight of bottom-up influences on attentional allocation. Our results imply that the relative weight of top-down influences on attentional allocation in dynamic scenes does not increase with viewing time (beyond the first ~2s). We also found that motion and flicker are each roughly 1.5 times as strong as color, intensity, or orientation as bottom-up attractors of attention (~95% vs. ~66% of the full-model CASA).
Item Type: Conference or Workshop Item (Poster)
Additional Information: Copy of Poster will be included
Usage Policy: You are granted permission for individual, educational, research and non-commercial reproduction, distribution, display and performance of this work in any format
Deposited By: Imported from CaltechJSNC
Deposited On: 07 Jun 2004
Last Modified: 24 Oct 2011 21:36