Dynamic Model Pruning with Feedback

Lin, Tao; Stich, Sebastian Urban; Barba Flores, Luis Felipe; Dmitriev, Daniil; Jaggi, Martin

conference paper


2020

ICLR - International Conference on Learning Representations

8th International Conference on Learning Representations (ICLR)

Deep neural networks often have millions of parameters. This can hinder their deployment to low-end devices, not only because of high memory requirements but also because of increased latency at inference. We propose a novel model compression method that generates a sparse trained model without additional overhead: by (i) allowing dynamic allocation of the sparsity pattern and (ii) incorporating a feedback signal to reactivate prematurely pruned weights, we obtain a performant sparse model in a single training pass (retraining is not needed, but can further improve performance). We evaluate the method on CIFAR-10 and ImageNet and show that the resulting sparse models can reach the state-of-the-art performance of dense models, and that they surpass all previously proposed pruning schemes (which come without feedback mechanisms).
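The two ingredients named in the abstract can be illustrated with a small sketch. It is not taken from the paper: it assumes magnitude-based masking, a toy squared-error objective, and a "feedback" step in which the gradient is evaluated at the pruned weights but applied to a dense copy, so that a weight masked out earlier can grow back and re-enter the mask. The names `magnitude_mask`, `grad_squared_error`, and `train_sparse` are hypothetical.

```python
import random


def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that keeps the largest-magnitude entries.

    Assumption: pruning is by magnitude; the abstract does not fix the criterion.
    """
    n_keep = len(weights) - int(sparsity * len(weights))
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(order[:n_keep])
    return [1.0 if i in keep else 0.0 for i in range(len(weights))]


def grad_squared_error(weights, x, y):
    """Gradient of (w . x - y)^2 with respect to w for one sample."""
    err = sum(w * xi for w, xi in zip(weights, x)) - y
    return [2.0 * err * xi for xi in x]


def train_sparse(data, dim, sparsity=0.5, lr=0.05, epochs=200, seed=0):
    """Single training pass that returns sparse weights and the final mask."""
    rng = random.Random(seed)
    dense = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    for _ in range(epochs):
        for x, y in data:
            # (i) Dynamic sparsity pattern: the mask is recomputed at every step.
            mask = magnitude_mask(dense, sparsity)
            pruned = [m * w for m, w in zip(mask, dense)]
            # (ii) Feedback (assumed form): gradient taken at the pruned model,
            # update applied to the dense weights, so pruned entries keep moving.
            grad = grad_squared_error(pruned, x, y)
            dense = [w - lr * g for w, g in zip(dense, grad)]
    mask = magnitude_mask(dense, sparsity)
    return [m * w for m, w in zip(mask, dense)], mask
```

Calling `train_sparse(data, dim=4, sparsity=0.5)` on a list of `(x, y)` pairs returns a weight vector with half of its entries zeroed, together with the mask that selected them. Keeping a dense copy costs extra memory during training only; the model that is returned is sparse.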

Name: dynamic_model_pruning_with_feedback.pdf
Type: Publisher's version
Access type: openaccess
Size: 1.67 MB
Format: Adobe PDF
Checksum (MD5): 1245400bc25ab6429e8f5295668907fe