Infoscience

conference paper not in proceedings

Multi-task prompt-RSVQA to explicitly count objects on aerial images

Tartini-Chappuis, Christel  
•
Sertic, Charlotte  
•
Santacroce, Nicolas  
September 1, 2023
British Machine Vision Conference (BMVC) workshops

Remote Sensing Visual Question Answering (RSVQA) was introduced to enable wider use of Earth Observation images through natural language, but it remains a challenging task, in particular for counting questions. To address this specific challenge, we propose a modular Multi-task prompt-RSVQA model based on object detection and question answering modules. By creating a semantic bottleneck that describes the image and by providing a visual answer, our model allows users to assess the visual grounding of the answer and to better interpret the prediction. A set of ablation studies is designed to assess the contributions of the different modules, and evaluation metrics are discussed for a finer-grained assessment. Experiments demonstrate competitive results against literature baselines and a zero-shot VQA model. In particular, for numerical Counting questions, our proposed model predicts answers that are consistently closer to the ground truth.
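
To illustrate why a distance-based measure can give a finer-grained view of counting answers than exact-match accuracy alone, here is a minimal Python sketch. It is not taken from the paper: the function names, the choice of mean absolute error as the distance, and all numbers are assumptions made up for illustration.

```python
from statistics import mean


def exact_match_accuracy(predictions, targets):
    """Fraction of predicted counts that equal the ground-truth count."""
    return float(mean(1.0 if p == t else 0.0 for p, t in zip(predictions, targets)))


def mean_absolute_error(predictions, targets):
    """Average absolute distance between predicted and ground-truth counts."""
    return float(mean(abs(p - t) for p, t in zip(predictions, targets)))


# Hypothetical counts, invented for illustration only (not results from the paper).
targets = [3, 10, 0, 7]
model_a = [4, 9, 1, 6]    # never exact, but always off by one
model_b = [3, 25, 0, 20]  # exact half of the time, far off otherwise

for name, preds in [("model_a", model_a), ("model_b", model_b)]:
    acc = exact_match_accuracy(preds, targets)
    mae = mean_absolute_error(preds, targets)
    print(f"{name}: accuracy={acc:.2f}, mean absolute error={mae:.2f}")
```

In this made-up example, the second set of predictions scores higher on exact-match accuracy, while the first stays much closer to the ground truth on average. A distance-based metric is meant to expose that kind of difference.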

Type
conference paper not in proceedings
Author(s)
Tartini-Chappuis, Christel  
Sertic, Charlotte  
Santacroce, Nicolas  
Castillo Navarro, Javiera  
Lobry, Sylvain
Le Saux, Bertrand
Tuia, Devis  
Date Issued

2023-09-01

URL

Open Version

https://workshops.proceedings.bmvc2023.org/WorkshoponMachineVisionforEarthObservation/6/CameraReady/6.pdf
Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
ECEO  
Event name

British Machine Vision Conference (BMVC) workshops

Event place

Aberdeen

Event date

November 20-24, 2023

Available on Infoscience
March 6, 2024
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/205803