Artificial Neural Network Training on an Optical Processor via Direct Feedback Alignment
Artificial Neural Networks (ANNs) are typically trained via the back-propagation (BP) algorithm. This approach has been extremely successful: current models like GPT-3 have O(10^11) parameters, are trained on O(10^11) words, and produce awe-inspiring results. However, there are good reasons to look for alternative training methods: with current algorithms and hardware constraints, sometimes only half the available computing power is actually used. This is due to a complicated interplay between the size of the ANN, the available memory, throughput limitations of interconnects, the architecture of the network of computers, and the training algorithm. Training a model like the aforementioned GPT-3 takes months and costs millions. A different training paradigm, which could make clever use of specialized hardware, may train large ANNs more efficiently.
2023
979-8-3503-4600-8
979-8-3503-4599-5
REVIEWED
EPFL
Event name | Event place | Event date |
 | Munich, Germany | June 26-30, 2023 |