research article

Multi-Agent Reinforcement Learning for Hyperparameter Optimization of Deep Convolutional Neural Networks

Iranfar, Arman • Zapater Sancho, Marina • Atienza Alonso, David
2021
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Deep Convolutional Neural Networks (DCNNs) now play a significant role in many application domains, such as computer vision, medical imaging, and image processing. Nonetheless, designing a DCNN that can outperform the state of the art is a manual, challenging, and time-consuming task because of the extremely large design space that arises from the large number of layers and their corresponding hyperparameters. In this work, we address the challenge of hyperparameter optimization of DCNNs through a novel Multi-Agent Reinforcement Learning (MARL)-based approach that eliminates human effort. In particular, we adapt Q-learning and define one learning agent per layer, splitting the design space into smaller, independent design sub-spaces so that each agent fine-tunes the hyperparameters of its assigned layer with respect to a global reward. Moreover, we provide a novel formation of the Q-tables along with a new update rule that facilitates communication among agents. Our MARL-based approach is data-driven and able to consider an arbitrary set of design objectives and constraints. We apply our MARL-based solution to several well-known DCNNs, including GoogLeNet, VGG, and U-Net, and to various datasets for image classification and semantic segmentation. Our results show that, compared to the original CNNs, the MARL-based approach reduces the model size, training time, and inference time by up to 83x, 52%, and 54%, respectively, without any degradation in accuracy. Moreover, our approach is highly competitive with state-of-the-art neural architecture search methods in terms of the accuracy and parameter count of the designed CNNs, while significantly reducing the optimization cost.
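
To illustrate the per-layer agent idea described in the abstract, the following Python sketch shows stateless tabular Q-learning agents, one per layer, each choosing a hyperparameter value from its own sub-space with an epsilon-greedy policy and updating its own table with a shared global reward. The layer names, candidate values, toy reward function, and simplified update are illustrative assumptions only; the paper's specific Q-table formation and communication-aware update rule are not reproduced here.

    # Minimal sketch (not the authors' implementation) of per-layer agents
    # tuning hyperparameters against a shared global reward.
    import random

    # Hypothetical per-layer design sub-spaces: each agent picks one option.
    LAYER_CHOICES = {
        "conv1": [16, 32, 64],     # number of filters (assumed)
        "conv2": [32, 64, 128],
        "fc":    [128, 256, 512],  # fully connected width (assumed)
    }

    ALPHA, EPSILON, EPISODES = 0.1, 0.2, 200

    # One table per layer agent, indexed by action (bandit-style simplification).
    q_tables = {layer: [0.0] * len(choices) for layer, choices in LAYER_CHOICES.items()}

    def global_reward(config):
        """Toy stand-in for an accuracy/size trade-off; a real setup would
        train and evaluate the DCNN defined by `config`."""
        size = sum(config.values())
        return 1.0 / (1.0 + size / 512.0)

    for _ in range(EPISODES):
        # Each agent independently picks an action for its own layer.
        actions = {}
        for layer, choices in LAYER_CHOICES.items():
            if random.random() < EPSILON:
                actions[layer] = random.randrange(len(choices))
            else:
                q = q_tables[layer]
                actions[layer] = max(range(len(q)), key=q.__getitem__)
        config = {layer: LAYER_CHOICES[layer][a] for layer, a in actions.items()}
        r = global_reward(config)
        # Every agent is updated with the same global reward.
        for layer, a in actions.items():
            q_tables[layer][a] += ALPHA * (r - q_tables[layer][a])

    best = {layer: LAYER_CHOICES[layer][max(range(len(q)), key=q.__getitem__)]
            for layer, q in q_tables.items()}
    print("Selected configuration:", best)

Because each agent only explores its own layer's sub-space, the joint search space grows additively rather than multiplicatively with the number of layers, which is the motivation for the per-layer decomposition described above.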
