conference paper

MulT: An End-to-End Multitask Learning Transformer

Bhattacharjee, Deblina • Zhang, Tong • Suesstrunk, Sabine • Salzmann, Mathieu
January 1, 2022
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

We propose an end-to-end Multitask Learning Transformer framework, named MulT, to simultaneously learn multiple high-level vision tasks, including depth estimation, semantic segmentation, reshading, surface normal estimation, 2D keypoint detection, and edge detection. Based on the Swin transformer model, our framework encodes the input image into a shared representation and makes predictions for each vision task using task-specific transformer-based decoder heads. At the heart of our approach is a shared attention mechanism modeling the dependencies across the tasks. We evaluate our model on several multitask benchmarks, showing that our MulT framework outperforms both the state-of-the-art multitask convolutional neural network models and all the respective single-task transformer models. Our experiments further highlight the benefits of sharing attention across all the tasks, and demonstrate that our MulT model is robust and generalizes well to new domains. Our project website is at https://ivrl.github.io/MulT/.
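
The abstract describes the design only at a high level. As a rough illustration, the pure-Python sketch below shows one possible reading of "a shared representation, task-specific heads, and shared attention": a single attention map is computed once from the shared token features and reused by every task head, each of which has its own value projection. Everything in it is an assumption made for illustration: the function names, the tiny matrix sizes, the random weights, and the choice to share the attention weights in exactly this way. It does not reproduce the paper's Swin encoder, its decoders, or its actual attention mechanism.

```python
import math
import random

# Task names taken from the abstract; the identifiers themselves are hypothetical.
TASKS = ["depth", "segmentation", "reshading", "normals", "keypoints2d", "edges"]


def matmul(a, b):
    """Multiply an (n x d) matrix by a (d x m) matrix, both given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax_rows(matrix):
    """Apply a numerically stable softmax to each row."""
    result = []
    for row in matrix:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


def random_matrix(rows, cols, rng):
    """Small random weights; a stand-in for learned parameters."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def init_params(dim, head_dim, rng):
    """One shared query/key projection, plus one value projection per task (assumed layout)."""
    return {
        "w_q": random_matrix(dim, head_dim, rng),
        "w_k": random_matrix(dim, head_dim, rng),
        "w_v": {task: random_matrix(dim, head_dim, rng) for task in TASKS},
    }


def shared_attention(tokens, w_q, w_k):
    """Scaled dot-product attention weights computed once from the shared tokens."""
    q = matmul(tokens, w_q)
    k = matmul(tokens, w_k)
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scale = math.sqrt(len(w_q[0]))
    scores = [[v / scale for v in row] for row in matmul(q, k_t)]
    return softmax_rows(scores)


def multitask_forward(tokens, params):
    """Reuse one attention map across all task heads (the illustrative assumption)."""
    attn = shared_attention(tokens, params["w_q"], params["w_k"])
    outputs = {}
    for task in TASKS:
        values = matmul(tokens, params["w_v"][task])  # task-specific values
        outputs[task] = matmul(attn, values)          # same attention for every task
    return attn, outputs


if __name__ == "__main__":
    rng = random.Random(0)
    tokens = random_matrix(5, 8, rng)  # stand-in for the encoder's shared representation
    attn, outputs = multitask_forward(tokens, init_params(8, 4, rng))
    for task, features in outputs.items():
        print(task, len(features), "x", len(features[0]))
```

In this sketch the tasks are coupled only through the attention map they all reuse. A real model would stack such blocks, use multiple heads, and map each head's features to a dense prediction.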

Type
conference paper
DOI
10.1109/CVPR52688.2022.01172
Web of Science ID

WOS:000870759105011

Author(s)
Bhattacharjee, Deblina  
Zhang, Tong  
Suesstrunk, Sabine  
Salzmann, Mathieu  
Date Issued

2022-01-01

Publisher

IEEE COMPUTER SOC

Publisher place

Los Alamitos

Published in
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
ISBN of the book

978-1-6654-6946-3

Series title/Series vol.

IEEE Conference on Computer Vision and Pattern Recognition

Start page

12021

End page

12031

Subjects

  • Computer Science, Artificial Intelligence
  • Imaging Science & Photographic Technology
  • Computer Science

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
IVRL  
Event name

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Event place

New Orleans, LA

Event date

Jun 18-24, 2022

Available on Infoscience
December 19, 2022
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/193275