Infoscience

Logo EPFL, École polytechnique fédérale de Lausanne

doctoral thesis

Making Computer Vision Models Robust and Adaptive

Yeo, Shuqing Teresa  
2023

Visual perception is indispensable for many real-world applications. However, perception models deployed in the real world will encounter numerous and unpredictable distribution shifts, for example, changes in geographic locations, motion blur, and adverse weather conditions, among many others. Thus, to be useful in the real world, these models need to generalize to the complex distribution shifts that can occur. This thesis focuses on three directions aimed at achieving this goal.

For the first direction, we introduce two robustness mechanisms. Both are training-time mechanisms: inductive biases are incorporated at training time, and the weights of the models are frozen at test time. The first mechanism ensembles predictions from a diverse set of cues. As each cue responds differently to a distribution shift, we adopt a principled way of merging these predictions and show that it can result in a robust final prediction. The second mechanism is motivated by the rigidity and biases of existing datasets. For example, datasets may contain mostly scenes from developed countries or mostly professional photographs. Here, we aim to control pre-trained generative models to generate targeted training data that accounts for these biases, which we can then use to fine-tune our models.
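
As an illustration of what merging per-cue predictions can look like, the sketch below uses inverse-variance weighting, a common way to combine estimates that each come with an uncertainty. This is an assumption for illustration only: the abstract does not state which merging rule the thesis uses, and the function name `merge_predictions` and the example values are hypothetical.

```python
from typing import Sequence


def merge_predictions(means: Sequence[float], variances: Sequence[float]) -> float:
    """Inverse-variance weighted average of per-cue predictions for one pixel.

    Assumption: each cue supplies a prediction and a positive variance that
    expresses how uncertain that prediction is.
    """
    if len(means) != len(variances) or not means:
        raise ValueError("need at least one prediction and one variance per prediction")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    # A cue with low variance (high confidence) receives a large weight.
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    return sum(w * m for w, m in zip(weights, means)) / total


# Hypothetical depth estimates (in metres) for one pixel from three cues.
# The third cue is assumed to be degraded by a shift and reports high variance.
cue_means = [2.0, 2.2, 5.0]
cue_variances = [0.1, 0.1, 10.0]
merged = merge_predictions(cue_means, cue_variances)
```

With these made-up numbers, the merged value stays close to the two confident cues, because the uncertain cue contributes very little weight.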

Training-time robustness mechanisms attempt to anticipate the shifts that can occur. However, distribution shifts can be unpredictable, and models may return unreliable predictions if a shift was not accounted for at training time. Thus, for our second direction, we propose to incorporate test-time adaptation mechanisms so that models can adapt to shifts as they occur. To do so, we create a closed-loop system that learns to use feedback signals computed from the environment. We show that this system is able to adapt efficiently at test time.

For the last direction, we introduce a benchmark for testing models on realistic shifts. These shifts are obtained from a set of image transformations that take the geometry of the scene into account, which makes them more likely to occur in the real world. We show that they can expose the vulnerabilities of existing models.
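
To make the idea of a geometry-aware transformation concrete, the sketch below applies fog whose strength grows with scene depth, using the standard atmospheric scattering model (transmission `exp(-beta * depth)`). This is a hypothetical example: the abstract does not list the benchmark's transformations, and the function name `add_fog` and the parameters `beta` and `airlight` are assumptions.

```python
import math
from typing import List


def add_fog(
    image: List[List[float]],
    depth: List[List[float]],
    beta: float = 0.5,
    airlight: float = 1.0,
) -> List[List[float]]:
    """Blend each pixel toward `airlight` according to its depth.

    `image` holds grayscale intensities in [0, 1]; `depth` holds the
    distance of each pixel from the camera, with the same shape.
    """
    if len(image) != len(depth) or any(
        len(img_row) != len(depth_row) for img_row, depth_row in zip(image, depth)
    ):
        raise ValueError("image and depth must have the same shape")
    foggy = []
    for img_row, depth_row in zip(image, depth):
        row = []
        for intensity, z in zip(img_row, depth_row):
            # Transmission: fraction of scene light that survives the fog.
            t = math.exp(-beta * z)
            row.append(intensity * t + airlight * (1.0 - t))
        foggy.append(row)
    return foggy


# Two pixels with the same intensity: one at the camera, one 10 units away.
foggy = add_fog([[0.2, 0.2]], [[0.0, 10.0]])
```

The near pixel is left unchanged while the distant pixel is almost fully washed out, which a depth-agnostic corruption such as a uniform brightness change would not reproduce.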

Name: EPFL_TH9215.pdf
Type: N/a
Access type: openaccess
License Condition: copyright
Size: 88.83 MB
Format: Adobe PDF
Checksum (MD5): 07598875573a797911698cbc2cb526ea

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved