Indoor Scene Parsing with Instance Segmentation, Semantic Labeling and Support Relationship Inference

Over the years, indoor scene parsing has attracted growing interest in the computer vision community. Existing methods have typically focused on different subtasks of this challenging problem. In particular, some aim at segmenting the image into regions, such as object or surface instances, while others aim at inferring the semantic labels of given regions, or their support relationships. These tasks are typically treated separately. However, they are strongly connected: good regions should respect the semantic labels; support can only be defined for meaningful regions; and support relationships strongly depend on semantics. In this paper, we therefore introduce an approach to jointly segment the instances and infer their semantic labels and support relationships from a single input image. By exploiting a hierarchical segmentation, we formulate our problem as that of jointly finding the regions in the hierarchy that correspond to instances and estimating their class labels and pairwise support relationships. We express this via a Markov Random Field, which allows us to further encode links between the different types of variables. Inference in this model can be done exactly via integer linear programming, and we learn its parameters in a structural SVM framework. Our experiments on NYUv2 demonstrate the benefits of reasoning jointly about all these subtasks of indoor scene parsing.
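
To make the idea of "finding the regions in the hierarchy that correspond to instances" concrete, here is a minimal sketch of a much simpler version of the problem. It is not the method of the paper. It assumes each region has only made-up per-class scores, ignores support relationships and every other pairwise term, and therefore uses a small dynamic program over the tree rather than a Markov Random Field solved by integer linear programming. The names Region, best_cut and all scores are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Region:
    """Node of a hypothetical segmentation hierarchy (illustrative only)."""
    name: str
    scores: Dict[str, float]  # made-up score for each candidate class label
    children: List["Region"] = field(default_factory=list)


def best_cut(node: Region) -> Tuple[float, List[Tuple[str, str]]]:
    """Return (total score, [(region name, label), ...]) for the best cut.

    A cut selects regions so that every leaf is covered exactly once:
    either this node is kept as one instance, or it is split and each
    child subtree is cut recursively.
    """
    # Best label for this region if it is kept as a single instance.
    label, own = max(node.scores.items(), key=lambda kv: kv[1])
    keep = (own, [(node.name, label)])
    if not node.children:
        return keep

    # Otherwise split: sum the best cuts of all child subtrees.
    total, chosen = 0.0, []
    for child in node.children:
        child_score, child_regions = best_cut(child)
        total += child_score
        chosen += child_regions

    return keep if own >= total else (total, chosen)


# Toy hierarchy with invented scores.
root = Region("room", {"wall": 1.0}, [
    Region("left", {"bed": 2.0, "floor": 0.5}),
    Region("right", {"table": 0.5}, [
        Region("r1", {"table": 1.5}),
        Region("r2", {"floor": 1.25}),
    ]),
])

score, instances = best_cut(root)
print(score, instances)
```

In this toy case the root and the "right" region are split, because their children score higher in total. Once pairwise terms such as support relationships link regions in different subtrees, this per-subtree recursion no longer applies, which is one reason a joint formulation with exact integer linear programming inference, as described in the abstract, becomes relevant.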

Published in:
30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), 6269-6275
Presented at:
Conference on Computer Vision and Pattern Recognition
New York, IEEE


Record created 2017-04-18, last modified 2020-04-20
