Infoscience

conference paper not in proceedings

Multi-task prompt-RSVQA to explicitly count objects on aerial images

Tartini-Chappuis, Christel • Sertic, Charlotte • Santacroce, Nicolas • et al.
September 1, 2023
British Machine Vision Conference (BMVC) workshops

Introduced to enable wider use of Earth Observation images through natural language, Remote Sensing Visual Question Answering (RSVQA) remains a challenging task, in particular for questions related to counting. To address this specific challenge, we propose a modular Multi-task prompt-RSVQA model based on object detection and question answering modules. By creating a semantic bottleneck that describes the image and by providing a visual answer, our model allows users to assess the visual grounding of the answer and better interpret the prediction. A set of ablation studies is designed to assess the contributions of the different modules, and evaluation metrics are discussed for a finer-grained assessment. Experiments demonstrate competitive results against literature baselines and a zero-shot VQA model. In particular, for numerical Counting questions, our proposed model predicts answers that are consistently closer to the ground truth.
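
As a rough illustration of the modular idea in the abstract (detections summarized into a textual semantic bottleneck, which a question answering step then reads), here is a minimal toy sketch. It is not the authors' implementation: the `Detection` class, the 0.5 score threshold, the sentence template, and all function names are assumptions made for this example, and the string-matching "question answering" step stands in for what the paper describes as a learned module. The mean absolute error is included only as one plausible way to measure how close a predicted count is to the ground truth.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Detection:
    label: str    # object class from a hypothetical detector
    score: float  # detector confidence in [0, 1]


def describe(detections, threshold=0.5):
    """Summarize confident detections as a sentence (the textual bottleneck)."""
    counts = Counter(d.label for d in detections if d.score >= threshold)
    if not counts:
        return "The image contains no detected objects."
    parts = [f"{n} {label}" for label, n in sorted(counts.items())]
    return "The image contains " + ", ".join(parts) + "."


def count_from_description(description, label):
    """Toy stand-in for a question answering module: read a count back."""
    body = description.rstrip(".").replace("The image contains ", "")
    for part in body.split(", "):
        n, _, name = part.partition(" ")
        if name == label and n.isdigit():
            return int(n)
    return 0  # label not mentioned in the description


def mean_absolute_error(predicted, truth):
    """Average absolute distance between predicted and true counts."""
    return sum(abs(p - t) for p, t in zip(predicted, truth)) / len(truth)


detections = [
    Detection("car", 0.9),
    Detection("car", 0.8),
    Detection("building", 0.7),
    Detection("car", 0.3),  # below the threshold, so ignored
]
description = describe(detections)
print(description)
print(count_from_description(description, "car"))
```

Because the answer is derived from the intermediate description, a user can inspect that description (and the detections behind it) to check whether a predicted count is grounded in what was actually detected.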

Name: 6.pdf
Type: N/a
Access type: openaccess
License Condition: CC BY
Size: 2.46 MB
Format: Adobe PDF
Checksum (MD5): f7cb4ed25807e8a4e6b635af01e8de3e

Contact: infoscience@epfl.ch

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved