Bilinear Multimodal Discriminator for Adversarial Domain Adaptation with Privileged Information
Over the past few years, deep Convolutional Neural Networks have shown outstanding performance on semantic segmentation, an essential tool that self-driving cars need to understand their environments. However, their training relies on large datasets with pixel-level ground-truth annotations, which are costly and tedious to produce on real data, making application to new situations difficult. In this context, Unsupervised Domain Adaptation (UDA) from synthetic data is an approach of great interest, since it leverages cost-free labeled synthetic datasets to help generalize to unlabeled real ones. In this paper, we propose a new adversarial training strategy for UDA that uses additional privileged information on the synthetic domain during training to improve transfer to the real one. Our method introduces a multimodal discriminator for adversarial training, featuring a bilinear fusion between representations of the segmentation and the privileged information, to best exploit the alignment between modalities. We evaluate our approach on the real-world Cityscapes dataset, using synthetic labeled data with depth as privileged information from the SYNTHIA dataset, and show competitive results.
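To make the idea of a bilinear multimodal discriminator concrete, below is a minimal illustrative sketch in PyTorch. It is not the paper's exact architecture: the class name, the low-rank projection size `rank`, the layer widths, and the use of predicted depth as the privileged modality are all assumptions made for illustration. The sketch fuses a segmentation probability map and a depth map through a low-rank bilinear (Hadamard-product) interaction before a small patch-level source/target classifier.

```python
import torch
import torch.nn as nn

class BilinearMultimodalDiscriminator(nn.Module):
    """Illustrative sketch (hypothetical names and sizes): a patch-level
    discriminator that fuses two modality maps, e.g. segmentation softmax
    and predicted depth, via a low-rank bilinear interaction before
    classifying source (synthetic) vs. target (real)."""

    def __init__(self, seg_channels: int, priv_channels: int, rank: int = 64):
        super().__init__()
        # Project each modality into a shared `rank`-dimensional space.
        self.proj_seg = nn.Conv2d(seg_channels, rank, kernel_size=1)
        self.proj_priv = nn.Conv2d(priv_channels, rank, kernel_size=1)
        # Classifier head applied to the fused features.
        self.head = nn.Sequential(
            nn.Conv2d(rank, 64, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 1, kernel_size=4, stride=2, padding=1),
        )

    def forward(self, seg_map: torch.Tensor, priv_map: torch.Tensor) -> torch.Tensor:
        # Low-rank bilinear fusion: the element-wise product of the two
        # projections approximates a full bilinear form while staying cheap.
        fused = torch.tanh(self.proj_seg(seg_map)) * torch.tanh(self.proj_priv(priv_map))
        return self.head(fused)  # per-patch source/target logits


if __name__ == "__main__":
    # 19 classes as in Cityscapes-style segmentation, 1-channel depth (assumed).
    disc = BilinearMultimodalDiscriminator(seg_channels=19, priv_channels=1)
    seg = torch.softmax(torch.randn(2, 19, 64, 128), dim=1)  # dummy segmentation probs
    depth = torch.randn(2, 1, 64, 128)                       # dummy depth prediction
    print(disc(seg, depth).shape)  # torch.Size([2, 1, 16, 32])
```

In an adversarial UDA setup of this kind, such a discriminator would be trained to distinguish source from target fused features, while the segmentation network is trained to fool it; the bilinear fusion lets the alignment signal depend jointly on both modalities rather than on each one independently.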