Infoscience
research article

Linear Complexity Self-Attention With 3rd Order Polynomials

Babiloni, Francesca • Marras, Ioannis • Deng, Jiankang • et al.
November 1, 2023
IEEE Transactions on Pattern Analysis and Machine Intelligence

Self-attention mechanisms and non-local blocks have become crucial building blocks for state-of-the-art neural architectures thanks to their unparalleled ability to capture long-range dependencies in the input. However, their cost is quadratic in the number of spatial positions, which makes them impractical in many real-world applications. In this work, we analyze these methods through a polynomial lens and show that self-attention can be seen as a special case of a 3rd-order polynomial. Within this polynomial framework, we are able to design polynomial operators capable of accessing the same data patterns as non-local and self-attention blocks while reducing the complexity from quadratic to linear. As a result, we propose two modules (Poly-NL and Poly-SA) that can be used as "drop-in" replacements for more complex non-local and self-attention layers in state-of-the-art CNNs and ViT architectures. Our modules achieve comparable, if not better, performance across a wide range of computer vision tasks while keeping a complexity equivalent to that of a standard linear layer.
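
To make the complexity claim concrete, here is a minimal, hypothetical sketch in Python/NumPy. It contrasts standard softmax self-attention, whose score matrix grows quadratically with the number of positions n, against a generic third-order polynomial interaction built from element-wise products and a pooled global context, which stays linear in n. The function names, projection matrices, and pooling choice are illustrative assumptions and not the paper's exact Poly-NL or Poly-SA formulations.

import numpy as np

def softmax_attention(x, wq, wk, wv):
    # Standard self-attention: the (n, n) score matrix is quadratic in n.
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def third_order_poly(x, w1, w2, w3):
    # Hypothetical linear-complexity 3rd-order interaction (illustration only):
    # each position interacts with a pooled global summary instead of with
    # every other position, so the cost stays O(n) rather than O(n^2).
    ctx = (x @ w1).mean(axis=0, keepdims=True)   # global context, shape (1, d)
    gate = (x @ w2) * ctx                        # 2nd-order term, shape (n, d)
    return x * (gate @ w3)                       # 3rd-order term, shape (n, d)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 64, 32                                # n spatial positions, d channels
    x = rng.standard_normal((n, d))
    ws = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3)]
    print(softmax_attention(x, *ws).shape)       # (64, 32)
    print(third_order_poly(x, *ws).shape)        # (64, 32)

Both functions map an (n, d) input to an (n, d) output, but only the polynomial variant avoids materializing the n-by-n attention map.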

Type
research article
DOI
10.1109/TPAMI.2022.3231971
Web of Science ID
WOS:001085050900002
Author(s)
Babiloni, Francesca
Marras, Ioannis
Deng, Jiankang
Kokkinos, Filippos
Maggioni, Matteo
Chrysos, Grigorios
Torr, Philip
Zafeiriou, Stefanos
Date Issued
2023-11-01
Publisher
IEEE Computer Society
Published in
IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume
45
Issue
11
Start page
12726
End page
12737
Subjects
Technology • Neural Networks • Tensors • Computer Architecture • Transformers • Complexity Theory • Standards • Kernel • Non-Local Blocks • Polynomial Expansion • Self-Attention
Editorial or Peer reviewed
REVIEWED

Written at
EPFL
EPFL units
LIONS
Available on Infoscience
February 19, 2024
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/204045
Contact: infoscience@epfl.ch
Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved.