Carmi, Ran and Itti, Lawrence (2004) Disentangling topdown from bottom up influences on attentional allocation in dynamic scenes. In: 11th Joint Symposium on Neural Computation, May 15 2004, University Of Southern California. (Unpublished) https://resolver.caltech.edu/CaltechJSNC:2004.poster002
Full text not available from this repository.
Abstract
Motivation: Attentional allocation is determined by the interplay between bottom-up and top-down influences. Here we quantify the relative contributions of different influences on attentional allocation in dynamic scenes and examine how they change over time.
Methods: To manipulate the availability of top-down influences on attentional allocation, heterogeneous video clips were cut into clippets (M=2s), which were scrambled and re-assembled into MTV-style clips. Two groups of eight subjects each were instructed to "follow the main actors and actions". One group viewed the original stimuli while the other group viewed the MTV-style clips. Eye positions were recorded using an ISCAN eye-tracker (240Hz, yielding a total of more than a million samples for each group) and segmented into saccades, blinks, and fixation/smooth pursuit periods. A saliency-based model of attention capture (Itti & Koch 2000) was used to probe the relative contribution of bottom-up influences on attentional allocation, based on a novel performance metric: the Chance-Adjusted Saliency Accumometric (CASA). CASA values were computed as the weighted sum of differences between normalized saliency at human vs. random saccade targets.
Results: Total CASA based on the full saliency model was 6% higher in the MTV group than in the original group. In both the original and MTV groups, CASA based on either motion or flicker features alone was ~95% of the CASA based on the full saliency model. CASA based on either color, intensity, or orientation features alone was ~66% of the full-model CASA. Generally, CASA values for earlier saccades after stimulus onset (clip or clippet start) were higher than for later saccades, but they tapered off and fluctuated around a fairly high value after the first several saccades.
Conclusions: The 6% CASA difference between the original and MTV groups shows that eliminating visual context beyond the first ~2s of viewing barely increased the overall relative weight of bottom-up influences on attentional allocation. Our results imply that the relative weight of top-down influences on attentional allocation in dynamic scenes does not increase with viewing time (beyond the first ~2s). We also found that motion or flicker alone is roughly 1.5 times as strong as color, intensity, or orientation alone as a bottom-up attractor of attention (~95% vs. ~66% of the full-model CASA).
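The Methods describe CASA only as a weighted sum of differences between normalized saliency at human vs. random saccade targets. The Python sketch below illustrates one way such a chance-adjusted score could be computed; it is not the authors' implementation. The function names (`normalize`, `casa_sketch`), the max-based normalization, the uniform default weights, and the uniform sampling of random targets are all assumptions made for illustration, since the abstract does not specify them.

```python
import random


def normalize(saliency_map):
    """Scale a 2D saliency map (list of rows) so that its maximum is 1.0.

    Assumption: the abstract does not say how saliency was normalized.
    """
    peak = max(max(row) for row in saliency_map)
    if peak <= 0:
        return [[0.0 for _ in row] for row in saliency_map]
    return [[value / peak for value in row] for row in saliency_map]


def casa_sketch(saliency_maps, human_targets, weights=None, n_random=100, seed=0):
    """Hypothetical chance-adjusted saliency score.

    saliency_maps[i] is the saliency map at the time of saccade i;
    human_targets[i] is the (row, col) landing point of that saccade.
    Returns the weighted sum over saccades of
        saliency at human target - mean saliency at random targets.
    """
    if not human_targets:
        return 0.0
    if weights is None:
        # Assumption: equal weight per saccade; the actual weights are unspecified.
        weights = [1.0 / len(human_targets)] * len(human_targets)
    rng = random.Random(seed)
    total = 0.0
    for smap, (row, col), weight in zip(saliency_maps, human_targets, weights):
        norm = normalize(smap)
        n_rows, n_cols = len(norm), len(norm[0])
        # Chance baseline: mean normalized saliency at uniformly sampled locations.
        chance = sum(
            norm[rng.randrange(n_rows)][rng.randrange(n_cols)]
            for _ in range(n_random)
        ) / n_random
        total += weight * (norm[row][col] - chance)
    return total
```

Under these assumptions, a saccade that lands on the most salient location of a map scores above zero, while a featureless (uniform) map scores exactly zero because human and random targets are equally salient.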
Item Type: | Conference or Workshop Item (Poster) |
---|---|
Additional Information: | Copy of Poster will be included |
Record Number: | CaltechJSNC:2004.poster002 |
Persistent URL: | https://resolver.caltech.edu/CaltechJSNC:2004.poster002 |
Usage Policy: | You are granted permission for individual, educational, research and non-commercial reproduction, distribution, display and performance of this work in any format |
ID Code: | 2 |
Collection: | CaltechCONF |
Deposited By: | Imported from CaltechJSNC |
Deposited On: | 07 Jun 2004 |
Last Modified: | 03 Oct 2019 22:49 |