Cue Normalization Schemes in Saliency-based Visual Attention Models

Saliency-based visual attention models compute visual saliency by combining conspicuity maps derived from various visual cues. Because the cues differ in nature, the maps to be combined have distinct dynamic ranges, so a normalization scheme is required. The scheme traditionally used is an instantaneous peak-to-peak normalization. This scheme performs poorly, however, when the relative contribution of the cues varies significantly, for instance when the type of scene changes: when the scene under study becomes unsaturated or, worse, when it loses all chromaticity. To remedy this drawback, this paper proposes an alternative normalization scheme that scales each conspicuity map with respect to a long-term estimate of its maximum, a value learned beforehand from a large number of images. The advantage of the new method is first illustrated by several examples in which both normalization schemes are compared. The paper then presents an evaluation in which the computed visual saliency of a set of 40 images is compared with human attention, as derived from the eye movements of a population of 20 subjects. The better performance of the new normalization scheme demonstrates its ability to handle scenes of varying type, where cue contributions vary widely. The proposed scheme thus seems preferable in any general-purpose model of visual attention.
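
The contrast between the two normalization schemes can be illustrated with a minimal sketch. It is not the paper's implementation: the function names are hypothetical, maps are plain nested lists, and the way the long-term maximum is estimated here (the mean of per-image maxima over a training set) is an assumption, since the abstract only states that the value is learned from a large number of images.

```python
# Illustrative sketch only. All names are hypothetical, and the estimator
# of the long-term maximum is an assumption, not the paper's method.

def peak_to_peak(cmap):
    """Instantaneous normalization: rescale one map to [0, 1]
    using only that map's own minimum and maximum."""
    flat = [v for row in cmap for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # flat map: nothing stands out
        return [[0.0 for _ in row] for row in cmap]
    return [[(v - lo) / (hi - lo) for v in row] for row in cmap]

def estimate_long_term_max(training_maps):
    """Assumed estimator: average of the per-map maxima of a training set."""
    maxima = [max(v for row in m for v in row) for m in training_maps]
    return sum(maxima) / len(maxima)

def long_term(cmap, learned_max):
    """Long-term normalization: divide by a fixed, previously learned maximum,
    so a weak cue stays weak instead of being stretched to full range."""
    return [[v / learned_max for v in row] for row in cmap]

# A nearly achromatic scene: the color conspicuity map holds only faint responses.
weak_color_map = [[0.0, 0.02],
                  [0.01, 0.0]]

# Hypothetical training maps from colorful scenes, with maxima 0.9 and 1.1.
learned_max = estimate_long_term_max([[[0.1, 0.9]], [[1.1, 0.3]]])

stretched = peak_to_peak(weak_color_map)           # faint response amplified to 1.0
preserved = long_term(weak_color_map, learned_max)  # faint response stays small
```

In this toy case the peak-to-peak result gives the faint color response the same weight as a strong cue, whereas the long-term scaling keeps it near its original magnitude, which is the behavior the abstract attributes to the proposed scheme.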

Published in:
Proceedings of the 2nd International Cognitive Vision Workshop, 1-7
Presented at:
2nd International Cognitive Vision Workshop, Graz, Austria, May 13, 2006

Note: The status of this file is: EPFL only

Record created 2011-07-28, last modified 2018-11-26
