Infoscience

research article

Non-Structured DNN Weight Pruning--Is It Beneficial in Any Platform?

Ma, Xiaolong
•
Lin, Sheng
•
Ye, Shaokai  
2022
IEEE Transactions on Neural Networks and Learning Systems

Large deep neural network (DNN) models pose a key challenge to energy efficiency because off-chip DRAM accesses consume significantly more energy than arithmetic or SRAM operations. This motivates intensive research on model compression, which follows two main approaches. Weight pruning exploits redundancy in the number of weights and can be performed in either a non-structured manner, which offers higher flexibility and a higher pruning rate but incurs index accesses because of the irregular weight layout, or a structured manner, which preserves the full matrix structure at a lower pruning rate. Weight quantization exploits redundancy in the number of bits per weight. Compared to pruning, quantization is much more hardware-friendly and has become a "must-do" step for FPGA and ASIC implementations. Thus, any evaluation of the effectiveness of pruning should be performed on top of quantization. The key open question is: with quantization, which kind of pruning (non-structured or structured) is more beneficial? This question is fundamental because the answer determines which design aspects deserve focus, so as to avoid the diminishing returns of certain optimizations. This article provides a definitive answer to the question for the first time. First, we build ADMM-NN-S by extending and enhancing ADMM-NN, a recently proposed joint weight pruning and quantization framework, with algorithmic support for structured pruning, dynamic ADMM regulation, and masked mapping and retraining. Second, we develop a methodology for a fair and fundamental comparison of non-structured and structured pruning in terms of both storage and computation efficiency. Our results show that ADMM-NN-S consistently outperforms the prior art: 1) it achieves 348×, 36×, and 8× overall weight pruning on LeNet-5, AlexNet, and ResNet-50, respectively, with (almost) zero accuracy loss, and 2) we demonstrate for the first time that fully binarized DNNs (all layers) can be lossless in accuracy in many cases.
These results provide a strong baseline and lend credibility to our study. Based on the proposed comparison framework, with the same accuracy and quantization, the results show that non-structured pruning is not competitive in terms of either storage or computation efficiency. Thus, we conclude that structured pruning has greater potential than non-structured pruning. We encourage the community to focus on studying DNN inference acceleration with structured sparsity.
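The storage trade-off described in the abstract can be illustrated with a small NumPy sketch. This is not the ADMM-NN-S method or the comparison methodology of the article: it uses simple magnitude-based pruning, and the function names, the 64×64 random matrix, the 4-bit weights, and the fixed 4-bit index per surviving weight are all assumptions chosen for illustration. It shows only why index overhead matters once weights are quantized to few bits: each non-structured nonzero costs weight bits plus index bits, whereas a structurally pruned matrix stays dense and needs no indices.

```python
import numpy as np


def prune_non_structured(w, keep_ratio):
    """Keep the largest-magnitude entries anywhere in the matrix (hypothetical helper)."""
    k = max(1, int(round(keep_ratio * w.size)))
    # The k-th largest absolute value serves as the pruning threshold.
    threshold = np.sort(np.abs(w), axis=None)[-k]
    return np.where(np.abs(w) >= threshold, w, 0.0)


def prune_structured_rows(w, keep_ratio):
    """Drop whole rows with the smallest L2 norm; the result is a smaller dense matrix."""
    k = max(1, int(round(keep_ratio * w.shape[0])))
    norms = np.linalg.norm(w, axis=1)
    keep = np.sort(np.argsort(norms)[-k:])
    return w[keep]


def storage_bits_non_structured(nonzeros, weight_bits, index_bits):
    # Assumption: every surviving weight carries one fixed-width index.
    return nonzeros * (weight_bits + index_bits)


def storage_bits_structured(kept_weights, weight_bits):
    # A dense matrix needs no per-weight index.
    return kept_weights * weight_bits


def break_even_keep_ratio(keep_ratio, weight_bits, index_bits):
    """Structured keep ratio with the same storage as a non-structured keep ratio."""
    return keep_ratio * (weight_bits + index_bits) / weight_bits


rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))

sparse = prune_non_structured(w, keep_ratio=0.10)       # 10x non-structured pruning
dense_small = prune_structured_rows(w, keep_ratio=0.25)  # 4x structured pruning

nnz = int(np.count_nonzero(sparse))
ns_bits = storage_bits_non_structured(nnz, weight_bits=4, index_bits=4)
s_bits = storage_bits_structured(dense_small.size, weight_bits=4)
equiv = break_even_keep_ratio(0.10, weight_bits=4, index_bits=4)

print(f"non-structured: {nnz} nonzeros, {ns_bits} bits")
print(f"structured:     {dense_small.size} weights, {s_bits} bits")
print(f"10x non-structured pruning stores like {1 / equiv:.1f}x structured pruning")
```

Under these assumed bit widths, the index doubles the cost of each surviving weight, so a 10× non-structured pruning rate occupies the same storage as a 5× structured one. The fewer bits quantization leaves per weight, the larger the share of storage taken by indices. Whether this erases the advantage in practice depends on the achievable pruning rates at equal accuracy, which is what the article evaluates.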

Type
research article
DOI
10.1109/TNNLS.2021.3063265
Web of Science ID

WOS:000732091500001

Author(s)
Ma, Xiaolong
Lin, Sheng
Ye, Shaokai  
He, Zhezhi
Zhang, Linfeng
Yuan, Geng
Tan, Sia Huat
Li, Zhengang
Fan, Deliang
Qian, Xuehai
Date Issued

2022

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC

Published in
IEEE Transactions on Neural Networks and Learning Systems
Volume

33

Issue

9

Start page

4930

End page

4944

Subjects

Computer Science, Artificial Intelligence • Computer Science, Hardware & Architecture • Computer Science, Theory & Methods • Engineering, Electrical & Electronic • Computer Science • Engineering • quantization (signal) • redundancy • computational modeling • acceleration • degradation • random access memory • indexes • deep neural network (dnn) • hardware acceleration • quantization • weight pruning

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
UPMWMATHIS  
Available on Infoscience
January 1, 2022
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/184246