Abstract

Dense conditional random fields (CRFs) have become a popular framework for modeling several problems in computer vision such as stereo correspondence and multiclass semantic segmentation. By modeling long-range interactions, dense CRFs provide a labeling that captures finer detail than their sparse counterparts. Currently, the state-of-the-art algorithm performs mean-field inference using a filter-based method but fails to provide a strong theoretical guarantee on the quality of the solution. A question naturally arises as to whether it is possible to obtain a maximum a posteriori (MAP) estimate of a dense CRF using a principled method. Within this paper, we show that this is indeed possible. Specifically, we will show that, by using a filter-based method, continuous relaxations of the MAP problem can be optimized efficiently using state-of-the-art algorithms. Specifically, we will solve a quadratic programming relaxation using the Frank-Wolfe algorithm and a linear programming relaxation by developing a proximal minimization framework. By exploiting labeling consistency in the higher-order potentials and utilizing the filter-based method, we are able to formulate the above algorithms such that each iteration has a complexity linear in the number of classes and random variables. The presented algorithms can be applied to any labeling problem using a dense CRF with sparse higher-order potentials. In this paper, we use semantic segmentation as an example application as it demonstrates the ability of the algorithm to scale to dense CRFs with large dimensions. We perform experiments on the Pascal dataset to indicate that the presented algorithms are able to attain lower energies than the mean-field inference method.

Details

Actions