This work presents a neural network for retrieving images from text queries. The proposed network is composed of two main modules: the first extracts a global image representation from local block descriptors, while the second solves the retrieval problem from that extracted representation. Both modules are trained jointly to minimize a loss related to retrieval performance. This approach is shown to be advantageous compared to previous models relying on unsupervised feature extraction: average precision over Corel queries reaches 26.2% for our model, versus 21.6% for PAMIR, the best alternative.
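As a rough illustration only (not the authors' implementation; all names, dimensions, and the pooling/scoring choices below are hypothetical), the two-module pipeline the abstract describes — aggregating local block descriptors into a global image representation, then scoring that representation against a text query to rank images — could be sketched as:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: each image yields 16 local block descriptors of
# size 64, mapped into a 32-d space shared with the query embedding.
N_BLOCKS, D_BLOCK, D_GLOBAL = 16, 64, 32

# Module 1: extract a global image representation from local block
# descriptors (here: a linear projection of each block, then mean pooling).
W_img = rng.standard_normal((D_BLOCK, D_GLOBAL)) * 0.1

def image_representation(blocks: np.ndarray) -> np.ndarray:
    """blocks: (N_BLOCKS, D_BLOCK) array -> (D_GLOBAL,) global vector."""
    return (blocks @ W_img).mean(axis=0)

# Module 2: score the global representation against a text-query embedding
# (here: a plain dot product; in the paper both modules are trained jointly
# on a retrieval-related loss, which this sketch does not attempt).
def score(query_vec: np.ndarray, blocks: np.ndarray) -> float:
    return float(query_vec @ image_representation(blocks))

# Retrieval: rank a small collection of images by score for one query.
images = [rng.standard_normal((N_BLOCKS, D_BLOCK)) for _ in range(5)]
query = rng.standard_normal(D_GLOBAL)
ranking = sorted(range(len(images)),
                 key=lambda i: score(query, images[i]), reverse=True)
print(ranking)  # image indices, best-scoring first
```

The point of the sketch is only the data flow: local descriptors enter Module 1, a single global vector comes out, and Module 2 turns that vector plus a query into a ranking score.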
Type: report
Date Issued: 2006
Publisher: IDIAP
Written at: EPFL
Available on Infoscience: February 11, 2010