conference paper

Stop Wasting your Cache! Bringing Machine Learning into Cache Computing

Petrolo, Vincenzo • Guella, Flavia • Caon, Michele • Masera, Guido • Martina, Maurizio
July 7, 2025
Proceedings of the 22nd ACM International Conference on Computing Frontiers: Workshops and Special Sessions
CF '25 Companion: 22nd ACM International Conference on Computing Frontiers

The rapid evolution of Machine Learning (ML) workloads, particularly Deep Neural Networks (DNNs) and Transformer-based models, has intensified demands on computing architectures and exposed the memory bottleneck of traditional von Neumann systems. To address this challenge, this paper investigates the mapping of fundamental ML operations onto ARCANE, a Near-Memory Computing (NMC) architecture that integrates Vector Processing Units (VPUs) directly within the data cache. ARCANE offers a flexible ISA extension (xmnmc) that abstracts memory management, reducing data movement and improving performance. We specifically explore ARCANE’s acceleration capabilities when executing fundamental DNN and Transformer-based operations. Experimental results show that, with a contained area overhead, ARCANE achieves consistent speedups over conventional CPU approaches: up to 150× in 2D convolution, 305× in the Linear layer, and over 32× in Fused-Weight Self-Attention (FWSA). These findings underline ARCANE’s significant benefits for the efficient deployment of edge-oriented ML workloads.
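For orientation, two of the kernels named in the abstract, 2D convolution and the Linear layer, can be written as plain scalar loops, which is roughly the kind of computation a conventional CPU executes element by element. This is a minimal sketch only: it is not the authors' benchmark code and does not use ARCANE or the xmnmc extension. The function names, the valid padding, and the unit stride are assumptions made for illustration.

```python
# Illustrative reference kernels; names and conventions are hypothetical,
# not taken from the paper.

def conv2d(image, kernel):
    """2D convolution with valid padding and stride 1.

    Computed as cross-correlation (no kernel flip), the usual DNN convention.
    `image` and `kernel` are lists of rows.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            acc = 0
            # Each output element reads a kh x kw window of the input.
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


def linear(x, weight, bias):
    """Linear layer y = W x + b, with `weight` given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weight, bias)]


if __name__ == "__main__":
    print(conv2d([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[1, 0], [0, 1]]))
    print(linear([1, 2], [[1, 0], [0, 1], [1, 1]], [0, 1, 2]))
```

Both loops touch every operand through ordinary loads, which illustrates the data movement that the abstract says near-memory execution is meant to reduce.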

Type
conference paper
DOI
10.1145/3706594.3726983
Author(s)
Petrolo, Vincenzo

Polytechnic University of Turin

Guella, Flavia

Polytechnic University of Turin

Caon, Michele  

EPFL

Masera, Guido

Polytechnic University of Turin

Martina, Maurizio

Polytechnic University of Turin

Date Issued

2025-07-07

Publisher

ACM

Publisher place

New York, NY, USA

Published in
Proceedings of the 22nd ACM International Conference on Computing Frontiers: Workshops and Special Sessions
ISBN of the book

979-8-4007-1393-4

Start page

86

End page

89

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
ESL  
Event name

CF '25 Companion: 22nd ACM International Conference on Computing Frontiers

Event acronym

CF '25 Companion

Event place

Cagliari, Italy

Event date

2025-05-28 - 2025-05-30

Available on Infoscience
July 10, 2025
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/252110