Authors: Kristensen, Andreas Toftegaard; Giterman, Robert; Balatsoukas-Stimming, Alexios; Burg, Andreas
Dates: 2021-03-26; 2021-03-26; 2021-03-26; 2020-01-01
DOI: 10.1109/ICASSP40776.2020.9054764
Handle: https://infoscience.epfl.ch/handle/20.500.14299/176226
Web of Science ID: WOS:000615970401169
Abstract: Neural networks have become indispensable for a wide range of applications, but they suffer from high computational and memory requirements, requiring optimizations from the algorithmic description of the network down to the hardware implementation. Moreover, the high rate of innovation in machine learning makes it important that hardware implementations provide a high level of programmability to support current and future requirements of neural networks. In this work, we present a flexible hardware accelerator for neural networks, called Lupulus, supporting various methods for scheduling and mapping of operations onto the accelerator. Lupulus was implemented in a 28 nm FD-SOI technology and demonstrates a peak performance of 380 GOPS/GHz with latencies of 21.4 ms and 183.6 ms for the convolutional layers of AlexNet and VGG-16, respectively.
Subjects: Acoustics; Engineering, Electrical & Electronic; Engineering
Title: Lupulus: A Flexible Hardware Accelerator for Neural Networks
Type: text::conference output::conference proceedings::conference paper
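The abstract quotes a peak throughput in GOPS/GHz and per-network latencies for the convolutional layers of AlexNet and VGG-16. As a rough illustration of the workload behind such numbers, the sketch below estimates the multiply-accumulate (MAC) count of AlexNet's convolutional layers from their commonly cited shapes and converts an assumed effective throughput into a latency. The layer shapes, the `effective_gops` value, and the helper names are assumptions for illustration only and are not taken from the Lupulus paper; the actual latency of the accelerator also depends on its clock frequency, scheduling, and memory bandwidth, which the abstract does not state.

```python
# Back-of-envelope estimate of convolutional-layer work for AlexNet.
# Layer shapes follow the commonly cited AlexNet configuration; they are
# assumptions for illustration and do not come from the Lupulus paper.

def conv_macs(out_h, out_w, out_c, k_h, k_w, in_c, groups=1):
    """MACs for one conv layer: one MAC per kernel weight per output element."""
    return out_h * out_w * out_c * k_h * k_w * (in_c // groups)

# (out_h, out_w, out_c, k_h, k_w, in_c, groups)
alexnet_conv_layers = [
    (55, 55,  96, 11, 11,   3, 1),  # conv1
    (27, 27, 256,  5,  5,  96, 2),  # conv2
    (13, 13, 384,  3,  3, 256, 1),  # conv3
    (13, 13, 384,  3,  3, 384, 2),  # conv4
    (13, 13, 256,  3,  3, 384, 2),  # conv5
]

total_macs = sum(conv_macs(*layer) for layer in alexnet_conv_layers)
total_ops = 2 * total_macs  # count multiply and add as separate operations

# Hypothetical sustained throughput (GOPS actually achieved, not the
# peak GOPS/GHz figure quoted in the abstract).
effective_gops = 60.0
latency_ms = total_ops / (effective_gops * 1e9) * 1e3

print(f"AlexNet conv MACs: {total_macs / 1e6:.0f} M")
print(f"Estimated conv-layer latency at {effective_gops:.0f} GOPS: {latency_ms:.1f} ms")
```

Running this gives roughly 0.67 G MACs (about 1.3 G operations) for the convolutional layers, so a sustained throughput of a few tens of GOPS lands in the tens-of-milliseconds range quoted in the abstract; the same arithmetic applied to VGG-16, which has roughly an order of magnitude more convolutional work, explains the correspondingly larger latency.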