Mi, Li; Li, Siran; Chappuis, Christel; Tuia, Devis

Knowledge-Aware Cross-Modal Text-Image Retrieval for Remote Sensing Images

Published: 2022-09-01
Deposited: 2023-02-09
https://infoscience.epfl.ch/handle/20.500.14299/194690

Abstract: Image-based retrieval in large Earth observation archives is difficult because one must navigate thousands of candidate matches with only the query image as a guide. Using text as the query language makes the retrieval system more usable, but the system then faces the diversity of visual signals, which a short caption alone cannot summarize. As a matching-based task, cross-modal text-image retrieval therefore often suffers from information asymmetry between texts and images. To address this challenge, we propose a Knowledge-aware Cross-modal Retrieval (KCR) method for remote sensing text-image retrieval. By mining relevant information from an external knowledge graph, KCR enriches the textual scope of the search query and alleviates the information gap between texts and images, enabling better matching. Experimental results on two commonly used remote sensing text-image retrieval benchmarks show that the proposed knowledge-aware method outperforms state-of-the-art methods.

Type: text :: conference output :: conference proceedings :: conference paper
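
To make the pipeline the abstract outlines concrete, below is a minimal sketch of knowledge-enriched text-to-image matching. Everything in it is a hypothetical stand-in: KNOWLEDGE_BASE, enrich_query, and the toy hash-based encoder are illustrative only, not KCR's actual knowledge graph mining or trained encoders, which this record does not describe.

```python
# Minimal, hypothetical sketch of knowledge-enriched cross-modal retrieval.
# All names and data below are illustrative; the real KCR method uses trained
# text/image encoders and an external knowledge graph not shown in this record.
import zlib
import numpy as np

DIM = 64  # assumed embedding dimensionality
rng = np.random.default_rng(0)

# Stand-in for an external knowledge graph: short descriptions keyed by
# entities that might appear in a remote sensing caption (made-up data).
KNOWLEDGE_BASE = {
    "airport": "area with runways, taxiways and terminal buildings",
    "harbor": "sheltered coastal area where ships load and unload",
    "farmland": "cultivated fields arranged in regular parcels",
}

def enrich_query(caption: str) -> str:
    """Append knowledge descriptions of entities mentioned in the caption,
    widening the textual scope of the query (the enrichment step the
    abstract describes)."""
    extras = [desc for entity, desc in KNOWLEDGE_BASE.items() if entity in caption]
    return " ".join([caption, *extras])

def encode_text(text: str) -> np.ndarray:
    """Toy deterministic 'encoder': hashes the text into a unit vector.
    A real system would use a trained cross-modal text encoder."""
    seed = zlib.crc32(text.encode())
    v = np.random.default_rng(seed).standard_normal(DIM)
    return v / np.linalg.norm(v)

def random_unit(dim: int) -> np.ndarray:
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

# Placeholder image embeddings for a tiny archive (one unit vector per image);
# in practice these come from an image encoder sharing the text embedding space.
archive = {f"image_{i}": random_unit(DIM) for i in range(5)}

query = enrich_query("aerial view of an airport next to farmland")
q = encode_text(query)

# Rank archive images by cosine similarity to the enriched text query
# (all vectors are unit-norm, so the dot product equals cosine similarity).
ranking = sorted(archive.items(), key=lambda kv: float(q @ kv[1]), reverse=True)
for name, emb in ranking[:3]:
    print(f"{name}: {float(q @ emb):+.3f}")
```

Ranking by cosine similarity in a shared embedding space is the standard setup for matching-based cross-modal retrieval; the knowledge step only widens the text side before encoding, which is the asymmetry-reducing idea the abstract claims.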