Infoscience
 
conference paper

Avant-Garde: Empowering GPUs with Scaled Numeric Formats

Gil, Minseong • Ha, Dongho • Harma, Simla Burcu, et al.
June 20, 2025
Proceedings of the 52nd Annual International Symposium on Computer Architecture
The 52nd Annual International Symposium on Computer Architecture

The escalating computational and memory demands of deep neural networks have outpaced chip density improvements, making arithmetic density a key bottleneck for GPUs. Scaled numeric formats, such as FP8 and Microscaling (MX), improve arithmetic density by applying adaptive scaling factors across varying block sizes and multiple scaling hierarchies. Unfortunately, supporting diverse scaled numeric formats often requires GPUs to rely on software-based implementations, increasing instruction and register overhead and degrading performance. We propose Avant-Garde, a GPU microarchitecture that natively supports diverse scaled numeric formats by converting them into a consistent single-level internal representation. Avant-Garde integrates an Operand Transformer, a hardware module that dynamically flattens multi-level scaling formats into single-level internal representations, a novel Tensor Core, and an optimized data layout to eliminate instruction and register overhead. Our evaluations show that Avant-Garde achieves up to 74% higher throughput and 44% lower execution time, while maintaining accuracy within 0.2% compared to conventional GPUs.
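The conversion the abstract describes, collapsing multi-level scaling into a single-level internal representation, can be sketched in software. The sketch below is only an analogy of the idea (not the paper's hardware design), assuming MX-like blocks of 32 elements and power-of-two scales; the names `two_level_scales` and `flatten_to_single_level` are hypothetical, introduced here for illustration:

```python
import numpy as np

# Illustrative sketch: a two-level scaled format (per-tensor scale plus
# MX-style per-block scales) folded into one effective per-block scale,
# analogous to what the paper's Operand Transformer does in hardware.
# Block size 32 and power-of-two scales are assumptions for illustration.

BLOCK = 32

def two_level_scales(x, block=BLOCK):
    """Derive a per-tensor scale and per-block scales (two levels)."""
    tensor_scale = 2.0 ** np.floor(np.log2(np.max(np.abs(x))))
    blocks = x.reshape(-1, block)
    block_max = np.max(np.abs(blocks), axis=1) / tensor_scale
    block_scale = 2.0 ** np.floor(np.log2(block_max))
    return tensor_scale, block_scale

def flatten_to_single_level(tensor_scale, block_scale):
    """Fold both scaling levels into a single per-block scale, so
    downstream arithmetic sees a uniform single-level representation."""
    return tensor_scale * block_scale

x = np.arange(1.0, 65.0)            # 64 values -> 2 blocks of 32
ts, bs = two_level_scales(x)
flat = flatten_to_single_level(ts, bs)
# Dequantizing with the flattened scale is equivalent to applying
# both scaling levels in sequence:
q = np.round(x.reshape(-1, BLOCK) / flat[:, None])
recon_flat = q * flat[:, None]
recon_two = np.round(x.reshape(-1, BLOCK) / (ts * bs[:, None])) * ts * bs[:, None]
assert np.allclose(recon_flat, recon_two)
```

Because folding the two levels is a single multiply per block, the flattened form removes the per-instruction bookkeeping that a software implementation would otherwise pay, which is plausibly why the paper targets this conversion with dedicated hardware.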

Type
conference paper
DOI
10.1145/3695053.3731100
Author(s)
Gil, Minseong

Korea University

Ha, Dongho
Harma, Simla Burcu  

École Polytechnique Fédérale de Lausanne

Yoon, Myung Kuk

Ewha Womans University

Falsafi, Babak  

École Polytechnique Fédérale de Lausanne

Ro, Won Woo

Yonsei University

Oh, Yunho

Korea University

Date Issued

2025-06-20

Publisher

ACM

Publisher place

New York, NY, USA

Published in
Proceedings of the 52nd Annual International Symposium on Computer Architecture
ISBN of the book

979-8-4007-1261-6

Start page

153

End page

165

Subjects

  • GPU
  • Deep Neural Network
  • Scaled Numeric Format

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
PARSA  
Event name
The 52nd Annual International Symposium on Computer Architecture
Event acronym
ISCA '25
Event place
Tokyo, Japan
Event date
2025-06-21 - 2025-06-25

Funder and Grant Number(s)

National Research Foundation of Korea
NRF-2022R1C1C1011021, RS-2024-00357037, RS-2025-00553645

Microsoft Research PhD Fellowship

Swiss National Science Foundation
200021_212757

Available on Infoscience
June 25, 2025
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/251538
  • Contact
  • infoscience@epfl.ch


Infoscience is a service managed and provided by the Library and IT Services of EPFL. © EPFL, all rights reserved