Infoscience
EPFL, École polytechnique fédérale de Lausanne

conference paper

Linear Attention for Efficient Bidirectional Sequence Modeling

Afzal, Arshia • Abad Rocamora, Elias • Candogan, Leyla • et al.
December 2025
39th Conference on Neural Information Processing Systems (NeurIPS 2025) [forthcoming publication]

Linear Transformers and State Space Models have emerged as efficient alternatives to softmax Transformers for causal sequence modeling, enabling parallel training via matrix multiplication and efficient RNN-style inference. However, despite their success in causal tasks, no unified framework exists for applying Linear Transformers to bidirectional sequence modeling. We introduce LION, the first framework to systematically extend Linear Transformers to the bidirectional setting. LION generalizes three core representations commonly used in the causal case (full Linear Attention, bidirectional RNN, and chunkwise parallel form) to the bidirectional setting. These forms are theoretically equivalent and enable models to exploit the strengths of each during training and inference. We prove that a broad class of Linear Transformers can be extended using LION and validate our framework via three core examples based on the choice of decay type: LION-LIT, the bidirectional extension of [25]; LION-D, based on [44]; and LION-S, a variant using selective decay [34, 13]. Across standard bidirectional tasks, LION enables models to match or exceed the performance of softmax Transformers, while offering significantly faster training and more efficient inference than existing State Space Models.
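The abstract's combination of an RNN form with bidirectionality can be illustrated with a minimal numpy sketch. This is not the paper's implementation (the function names and the scalar `decay` parameter are illustrative assumptions): each direction runs linear attention as a recurrence over an outer-product state, and the two passes are summed with the doubly counted diagonal term removed. With `decay=1.0` this recovers full (non-causal) linear attention, `Y = Q Kᵀ V`.

```python
import numpy as np

def linear_attention_scan(Q, K, V, decay=1.0, reverse=False):
    """One directional pass of linear attention in RNN form.

    State update: S_t = decay * S_{t-1} + k_t v_t^T
    Output:       y_t = q_t S_t
    Q, K: (T, d) arrays; V: (T, d_v) array.
    """
    T, d = Q.shape
    d_v = V.shape[1]
    order = range(T - 1, -1, -1) if reverse else range(T)
    S = np.zeros((d, d_v))
    Y = np.zeros((T, d_v))
    for t in order:
        S = decay * S + np.outer(K[t], V[t])  # rank-1 state update
        Y[t] = Q[t] @ S                       # readout at position t
    return Y

def lion_bidirectional(Q, K, V, decay=1.0):
    """Bidirectional output as forward pass + backward pass, minus the
    diagonal term (q_t . k_t) v_t, which both directions count once."""
    fwd = linear_attention_scan(Q, K, V, decay, reverse=False)
    bwd = linear_attention_scan(Q, K, V, decay, reverse=True)
    diag = np.einsum('td,td->t', Q, K)[:, None] * V
    return fwd + bwd - diag
```

The same output could equally be computed in the full-attention form (one masked-matrix multiply) or chunkwise; the RNN form shown here is the one that gives constant memory per step at inference.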

Files

Name: 23483_Linear_Attention_for_Eff.pdf
Type: Main Document
Version: Accepted version
Access type: Open access
License Condition: N/A
Size: 5.3 MB
Format: Adobe PDF
Checksum (MD5): 99bd5b4a393441df608816da3a121298

Contact: infoscience@epfl.ch

Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved.