000209286 001__ 209286
000209286 005__ 20190317000221.0
000209286 0247_ $$2doi$$a10.5075/epfl-thesis-6700
000209286 02470 $$2urn$$aurn:nbn:ch:bel-epfl-thesis6700-8
000209286 02471 $$2nebis$$a10470756
000209286 037__ $$aTHESIS
000209286 041__ $$aeng
000209286 088__ $$a6700
000209286 245__ $$aAre All Pixels Equally Important? Towards Multi-Level Salient Object Detection
000209286 269__ $$a2015
000209286 260__ $$aLausanne$$bEPFL$$c2015
000209286 336__ $$aTheses
000209286 502__ $$aDr R. Boulic (president) ; Prof. S. Süsstrunk (thesis director) ; Prof. M. Kankanhalli, Prof. A. Baskurt, Prof. R. Hersch (examiners)
000209286 520__ $$aWhen we look at our environment, we primarily pay attention to visually distinctive objects. We refer to these objects as visually important or salient. Our visual system dedicates most of its processing resources to analyzing these salient objects. An analogous resource allocation can be performed in computer vision, where a salient object detector identifies objects of interest as a pre-processing step. In the literature, salient object detection is treated as a foreground-background segmentation problem. This approach assumes that there is no variation in object importance: only the most salient object(s) are detected as foreground. In this thesis, we challenge this conventional methodology of salient-object detection and introduce multi-level object saliency. In other words, not all pixels are equally important. The well-known salient-object ground-truth datasets contain images with single objects and thus are not suited to evaluating the varying importance of objects. In contrast, many natural images contain multiple objects. The saliency levels of these objects depend on two key factors. First, eye-fixation duration is longer for visually and semantically informative image regions; a difference in fixation duration should therefore reflect a variation in object importance. Second, visual perception is subjective; hence the saliency of an object should be measured by averaging the perception of a group of people. In other words, objective saliency can be regarded as collective human attention. In order to better represent natural images and to measure the saliency levels of objects, we thus collect new images containing multiple objects and create a Comprehensive Object Saliency (COS) dataset. We provide ground-truth multi-level salient-object maps via eye-tracking and crowd-sourcing experiments. We then propose three salient-object detectors. Our first technique is based on multi-scale linear filtering and can detect salient objects of various sizes. The second method uses a bilateral-filtering approach and is capable of producing uniform object saliency values. Our third method employs image segmentation and machine learning and is robust against image noise and texture. This segmentation-based method performs best on the existing datasets compared to both our other methods and the state-of-the-art methods. The state-of-the-art salient-object detectors are not designed to assess the relative importance of objects or to provide multi-level saliency values. We thus introduce an Object-Awareness Model (OAM) that estimates the saliency levels of objects by using their position and size information. We then modify and extend our segmentation-based salient-object detector with the OAM and propose a Comprehensive Salient Object Detection (CSD) method that is capable of performing multi-level salient-object detection. We show that the CSD method significantly outperforms the state-of-the-art methods on the COS dataset. We use our salient-object detectors as a pre-processing step in three applications. First, we show that multi-level salient-object detection provides more relevant semantic image tags than conventional salient-object detection. Second, we employ our salient-object detector to detect salient objects in videos in real time. Third, we use multi-level object-saliency values in context-aware image compression and obtain perceptually better compression than standard JPEG at the same file size.
000209286 6531_ $$asaliency
000209286 6531_ $$amulti-level
000209286 6531_ $$asalient-object detection
000209286 6531_ $$asegmentation
000209286 6531_ $$aimage tagging
000209286 700__ $$0245739$$aYildirim, Gökhan$$g200257
000209286 720_2 $$0241946$$aSüsstrunk, Sabine$$edir.$$g125681
000209286 8564_ $$s64814629$$uhttps://infoscience.epfl.ch/record/209286/files/EPFL_TH6700.pdf$$yn/a$$zn/a
000209286 909C0 $$0252320$$pIVRL$$xU10429
000209286 909CO $$ooai:infoscience.tind.io:209286$$pthesis-bn2018$$pDOI$$pIC$$pthesis$$qDOI2$$qGLOBAL_SET
000209286 917Z8 $$x108898
000209286 918__ $$aIC$$cISC$$dEDIC2005-2015
000209286 919__ $$aIVRL
000209286 920__ $$a2015-7-10$$b2015
000209286 970__ $$a6700/THESES
000209286 973__ $$aEPFL$$sPUBLISHED
000209286 980__ $$aTHESIS