Infoscience
conference paper

Beyond Autoregression: Fast LLMs via Self-Distillation Through Time

Deschenaux, Justin Samuel • Gulcehre, Caglar
January 22, 2025
Proceedings of the Thirteenth International Conference on Learning Representations (ICLR) 2025 [Forthcoming publication]
13th International Conference on Learning Representations (ICLR 2025)

Autoregressive (AR) Large Language Models (LLMs) have demonstrated significant success across numerous tasks. However, the AR modeling paradigm presents certain limitations; for instance, contemporary autoregressive LLMs are trained to generate one token at a time, which can result in noticeable latency. Recent advances have indicated that search and repeated sampling can enhance performance in various applications, such as theorem proving, code generation, and alignment, by utilizing greater computational resources during inference. In this study, we demonstrate that diffusion language models are capable of generating at least 32 tokens simultaneously, while exceeding the performance of AR models in text quality and on the LAMBADA natural language understanding benchmark. This outcome is achieved through a novel distillation method for discrete diffusion models, which reduces the number of inference steps by a factor of 32-64. Practically, our models, even without caching, can generate tokens at a rate that is up to 8 times faster than AR models employing KV caching, and we anticipate further improvements with the inclusion of caching. Moreover, we demonstrate the efficacy of our approach for diffusion language models with up to 860M parameters.
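For intuition only, the following is a minimal, hypothetical sketch of the general idea behind distilling a masked (absorbing-state) discrete diffusion denoiser so that fewer sampling steps are needed: a student is trained to match, in a single step, what the teacher produces after two denoising steps. The toy model, vocabulary, masking schedule, and KL objective are illustrative assumptions and do not reproduce the paper's actual self-distillation-through-time method.

# Hypothetical sketch only: distill two teacher denoising steps into one student step.
# All names, sizes, and the masking schedule are illustrative assumptions, not the
# paper's actual architecture or training recipe.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, SEQ_LEN, MASK_ID = 100, 32, 99  # toy vocabulary; MASK_ID is the absorbing token


class TinyDenoiser(nn.Module):
    """Predicts a distribution over the vocabulary at every (possibly masked) position."""

    def __init__(self, d=64):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, d)
        layer = nn.TransformerEncoderLayer(d, nhead=4, batch_first=True)
        self.body = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d, VOCAB)

    def forward(self, x):
        return self.head(self.body(self.emb(x)))  # (batch, length, VOCAB) logits


def teacher_two_steps(teacher, x_masked):
    """Run the teacher for two denoising steps; the student should match this in one."""
    with torch.no_grad():
        first = teacher(x_masked).argmax(-1)
        x_mid = torch.where(x_masked == MASK_ID, first, x_masked)
        # Re-mask half of the freshly decoded positions to mimic an intermediate noise level.
        remask = (x_masked == MASK_ID) & (torch.rand_like(x_mid, dtype=torch.float) < 0.5)
        x_mid = torch.where(remask, torch.full_like(x_mid, MASK_ID), x_mid)
        return teacher(x_mid)


teacher, student = TinyDenoiser(), TinyDenoiser()
teacher.eval()
student.load_state_dict(teacher.state_dict())  # initialize the student from the teacher
opt = torch.optim.AdamW(student.parameters(), lr=1e-4)

for _ in range(10):  # toy training loop on random data
    x = torch.randint(0, VOCAB - 1, (8, SEQ_LEN))
    corrupt = torch.rand(8, SEQ_LEN) < 0.6  # mask 60% of positions
    x_masked = torch.where(corrupt, torch.full_like(x, MASK_ID), x)
    target = teacher_two_steps(teacher, x_masked).log_softmax(-1)
    pred = student(x_masked).log_softmax(-1)
    loss = F.kl_div(pred, target, log_target=True, reduction="batchmean")
    opt.zero_grad()
    loss.backward()
    opt.step()

Presumably this kind of step-halving would be applied repeatedly to reach the 32-64x reduction in inference steps reported in the abstract; the exact schedule and loss are described in the publication itself.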

Files
Name: 2410.21035v2.pdf
Type: Main Document
Version:
Access type: openaccess
License Condition: CC BY
Size: 1.83 MB
Format: Adobe PDF
Checksum (MD5): a32e86361996f990226285474d380eda
