Learning to Match Aerial Images with Deep Attentive Architectures

Altwaijry, Hani; Trulls, Eduard; Hays, James; Fua, Pascal; Belongie, Serge

doi:10.1109/CVPR.2016.385

conference paper

2016

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

Image matching is a fundamental problem in Computer Vision. In the context of feature-based matching, SIFT and its variants have long excelled in a wide array of applications. However, for ultra-wide baselines, as in the case of aerial images captured under large camera rotations, the appearance variation goes beyond the reach of SIFT and RANSAC. In this paper we propose a data-driven, deep learning-based approach that sidesteps local correspondence by framing the problem as a classification task. Furthermore, we demonstrate that local correspondences can still be useful. To do so we incorporate an attention mechanism to produce a set of probable matches, which allows us to further increase performance. We train our models on a dataset of urban aerial imagery consisting of 'same' and 'different' pairs, collected for this purpose, and characterize the problem via a human study with annotations from Amazon Mechanical Turk. We demonstrate that our models outperform the state-of-the-art on ultra-wide baseline matching, and close the gap with human performance.

Type

conference paper

DOI

10.1109/CVPR.2016.385

Author(s)

Altwaijry, Hani

Trulls, Eduard

Hays, James

Fua, Pascal

Belongie, Serge

Date Issued

2016

Published in

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition

Start page

3539

End page

3547

Subjects

Deep Learning

Stereo

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units

CVLAB

Event name	Event place	Event date
Computer Vision and Pattern Recognition	Las Vegas, Nevada, USA	June 27-30, 2016

Available on Infoscience

April 11, 2016

Use this identifier to reference this record

https://infoscience.epfl.ch/handle/20.500.14299/125600