Abstract

Person retrieval aims to match images of a specified pedestrian identity against an extensive database. Since extracting effective features is crucial to a high-performance retrieval system, recent significant progress has been achieved by part-based models that construct robust local representations on top of vertically striped part features. However, such models rely on predefined partitioning strategies, keeping the number and size of partitions identical even when input images vary considerably. This fixed setting reduces flexibility and robustness in capturing visual variance. The primary reason for this negative effect is that a fixed partitioning strategy cannot cope with (a) the significant variance in pose, illumination, and viewpoint that is common in pedestrian image datasets, and (b) the inference errors and body misalignment introduced by the prepositive pedestrian detection or human pose estimation module. In this paper, we tackle this problem by introducing the novel Adaptive Partition Network (APN). The APN utilizes deep reinforcement learning, applying an agent to dynamically generate optimal partitioning strategies for different input images. The agent inside the APN is optimized with the policy gradient algorithm to maximize the reward of choosing the best partition setting. By leveraging supervision cues from the objective partitioning strategies generated on a set of held-out training images, the agent is trained jointly with the other parts of the APN, which ensures the APN's robustness and generalization ability. Extensive experimental results on multiple datasets, including CUHK03, DukeMTMC and Market-1501, demonstrate the superiority of APN over state-of-the-art models.

Details