Abstract

Can a machine learn how to segment different objects in real-world images without any prior knowledge about the delineation of the classes? In this paper, we demonstrate that this task is indeed possible. We address the problem by training a Convolutional Neural Network (CNN) model with weakly labeled images, \emph{i.e.}, images for which the only knowledge assumed on each sample is the presence or absence of an object. The model, trained in a one-vs-all scheme, learns representations that distinguish image patches belonging to the class of interest from those belonging to the background. The per-pixel segmentation is obtained by applying the model to the patch surrounding each pixel and assigning the inferred class to that pixel. Our system is trained on a subset of the ImageNet dataset. The experiments are validated on two classes that are challenging to segment: cats and dogs. We show both quantitatively and qualitatively that the model achieves good accuracy on these classes in the Pascal VOC 2012 competition, without using any prior segmentation knowledge. The model is powerful in the sense that it learns to segment objects without the use of costly, fully labeled segmentation datasets.
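As a rough illustration of the patch-wise inference step described above, the following sketch applies a binary (one-vs-all) patch classifier to the patch surrounding each pixel and assigns the inferred class to that pixel. The network architecture, patch size, and stride are illustrative assumptions and not the authors' exact configuration.

```python
# Minimal sketch of patch-wise segmentation with a binary one-vs-all CNN.
# Architecture, patch size, and stride are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchClassifier(nn.Module):
    """Binary classifier: does this patch contain the class of interest?"""
    def __init__(self, patch_size=64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
        )
        # Two outputs: background vs. class of interest
        self.head = nn.Linear(64 * (patch_size // 4) ** 2, 2)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

def segment(image, model, patch_size=64, stride=8):
    """Assign to each pixel the class inferred for its surrounding patch."""
    _, h, w = image.shape
    pad = patch_size // 2
    padded = F.pad(image.unsqueeze(0), (pad, pad, pad, pad), mode="reflect")
    mask = torch.zeros(h, w, dtype=torch.long)
    model.eval()
    with torch.no_grad():
        for y in range(0, h, stride):
            for x in range(0, w, stride):
                patch = padded[:, :, y:y + patch_size, x:x + patch_size]
                label = model(patch).argmax(dim=1).item()
                # Coarse fill between sampled positions for efficiency
                mask[y:y + stride, x:x + stride] = label
    return mask

if __name__ == "__main__":
    model = PatchClassifier()
    image = torch.rand(3, 128, 128)     # stand-in for a real RGB image
    print(segment(image, model).shape)  # torch.Size([128, 128])
```

In practice, striding over the image rather than visiting every pixel trades segmentation resolution for speed; with stride 1 the loop produces a true per-pixel labeling as described in the abstract.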
