Abstract

This paper presents a novel deep architecture for weakly-supervised temporal action localization that not only generates segment-level action responses but also propagates segment-level responses to the neighborhood in a form of graph Laplacian regularization. Specifically, our approach consists of two sub-modules; a class activation module to estimate the action score map over time through the action classifiers, and a graph regularization module to refine the estimated action score map by solving a quadratic programming problem with the predicted segment-level semantic affinities. Since these two modules are integrated with fully differentiable layers, the proposed networks can be jointly trained in an end-to-end manner. Experimental results on Thumos14 and ActivityNet1.2 demonstrate that the proposed method provides outstanding performances in weakly-supervised temporal action localization.

Details

Actions