conference paper

Adaptive Quantization for Deep Neural Network

Zhou, Yiren
•
Moosavi Dezfooli, Seyed Mohsen  
•
Cheung, Ngai-Man
March 19, 2018
Thirty-Second AAAI Conference on Artificial Intelligence / Thirtieth Innovative Applications of Artificial Intelligence Conference / Eighth AAAI Symposium on Educational Advances in Artificial Intelligence
32nd AAAI Conference on Artificial Intelligence / 30th Innovative Applications of Artificial Intelligence Conference / 8th AAAI Symposium on Educational Advances in Artificial Intelligence

In recent years, Deep Neural Networks (DNNs) have been rapidly adopted across many applications, with increasingly complex architectures. Their performance gains generally come with high computational cost and large memory consumption, which may not be affordable on mobile platforms. Deep model quantization can reduce the computation and memory costs of DNNs, making it possible to deploy complex DNNs on mobile devices. In this work, we propose an optimization framework for deep model quantization. First, we propose a measurement that estimates the effect of parameter quantization errors in individual layers on the overall model prediction accuracy. Then, we propose an optimization process based on this measurement for finding the optimal quantization bit-width for each layer. This is the first work that theoretically analyzes the relationship between parameter quantization errors of individual layers and model accuracy. Our new quantization algorithm outperforms previous quantization optimization methods and achieves a 20-40% higher compression rate than equal bit-width quantization at the same model prediction accuracy.
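The quantities the abstract refers to (per-layer bit-width, parameter quantization error, and compression relative to equal bit-width quantization) can be illustrated with a minimal sketch. This is not the paper's measurement or optimization procedure, which this record does not describe. It assumes a plain uniform quantizer over each layer's weight range, and every function name, layer size, and bit-width below is hypothetical.

```python
import random


def quantize_uniform(weights, bits):
    """Round each weight to the nearest of 2**bits evenly spaced levels
    spanning [min(weights), max(weights)]. Assumed quantizer, for illustration."""
    if bits < 1:
        raise ValueError("bits must be at least 1")
    lo, hi = min(weights), max(weights)
    if hi == lo:
        return list(weights)  # a constant layer is represented exactly
    step = (hi - lo) / (2 ** bits - 1)
    return [lo + round((w - lo) / step) * step for w in weights]


def quantization_noise(weights, bits):
    """Squared L2 norm of the error introduced by quantizing one layer."""
    quantized = quantize_uniform(weights, bits)
    return sum((w - q) ** 2 for w, q in zip(weights, quantized))


def model_bits(layer_sizes, bit_widths):
    """Total storage in bits when layer i holds layer_sizes[i] parameters
    stored at bit_widths[i] bits each."""
    return sum(n * b for n, b in zip(layer_sizes, bit_widths))


def compression_gain(layer_sizes, adaptive_bits, equal_bits):
    """Size under equal bit-width divided by size under per-layer bit-widths.
    A value of 1.25 would mean a 25% higher compression rate."""
    equal = model_bits(layer_sizes, [equal_bits] * len(layer_sizes))
    return equal / model_bits(layer_sizes, adaptive_bits)


if __name__ == "__main__":
    rng = random.Random(0)
    layer = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # hypothetical layer weights
    for bits in (2, 4, 8):
        print(bits, quantization_noise(layer, bits))

    sizes = [1_000, 100_000, 10_000]   # hypothetical parameter counts per layer
    adaptive = [8, 4, 6]               # hypothetical per-layer bit-widths
    print(compression_gain(sizes, adaptive, equal_bits=6))
```

The sketch only shows why per-layer bit-widths can matter: spending fewer bits on a large layer shrinks the model far more than spending extra bits on a small layer grows it. Whether accuracy is preserved depends on how sensitive each layer is to its quantization error, which is what the paper's measurement is meant to estimate.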

Type
conference paper
DOI
10.1609/aaai.v32i1.11623
Web of Science ID

WOS:000485488904084

Author(s)
Zhou, Yiren
Moosavi Dezfooli, Seyed Mohsen  
Cheung, Ngai-Man
Frossard, Pascal  
Date Issued

2018-03-19

Publisher

Association for the Advancement of Artificial Intelligence

Publisher place

Palo Alto

Published in
Thirty-Second AAAI Conference on Artificial Intelligence / Thirtieth Innovative Applications of Artificial Intelligence Conference / Eighth AAAI Symposium on Educational Advances in Artificial Intelligence
Start page

4596

End page

4604

Editorial or Peer reviewed

REVIEWED

Written at

EPFL

EPFL units
LTS4  
Event name
32nd AAAI Conference on Artificial Intelligence / 30th Innovative Applications of Artificial Intelligence Conference / 8th AAAI Symposium on Educational Advances in Artificial Intelligence
Event place
New Orleans, LA
Event date
February 2-7, 2018

Available on Infoscience
March 19, 2018
Use this identifier to reference this record
https://infoscience.epfl.ch/handle/20.500.14299/145640