Computing has long been central to technological progress: each leap in compute power has enabled applications once thought impossible, such as smartphones. Recently, the exponentially growing compute demand of artificial intelligence (AI) has exceeded the capabilities of single electronic processors. Although these processors are mature and powerful, hosting large AI models on a single unit is impractical, and scaling compute power through parallelization runs into interconnection bottlenecks. This thesis investigates optical processors for AI computing, leveraging the high bandwidth, low latency, and dense parallelism of optics. It begins by discussing the challenges electronic processors face in supporting AI workloads, particularly the limits on data transfer between accelerators and memory. The thesis then introduces the Multilayer Light Processor (MLP), a programmable three-dimensional free-space optical architecture that can function as both a neural processor and an optical circuit switch. The MLP is implemented using a spatial light modulator and is programmed through a differentiable model optimized with backpropagation. It serves as a reconfigurable, multi-purpose optical switching element for data centers. Unlike traditional optical switches, the MLP achieves low loss at high port counts through multilayer wavefront shaping. Furthermore, the thesis presents an optical neural processing framework based on the MLP that achieves nonlinear data processing with linear optics by cascading multiple linear layers within a single MLP. This approach enables deep neural networks to be realized in the optical domain, harnessing the extensive interconnectivity provided by diffraction and synthesizing nonlinearity through multiple scattering instead of relying on optoelectronic conversions. The thesis also explores multimode fibers (MMFs) and integrated lithium niobate (LN) waveguides as nonlinear optical processing media.
By leveraging the nonlinear propagation of data encoded on high-intensity pulses within MMFs and LN waveguides, these systems perform classification tasks at a level comparable to traditional neural networks, but with fewer parameters and lower energy consumption. This approach aligns with reservoir computing, a computational framework that uses the inherent physics of a system to process data. In conclusion, this thesis contributes to the field of optical computing by introducing novel architectures and methodologies that overcome electronic interconnection limitations and employ optical-domain nonlinearity for efficient optical neural processing. Through the development of the MLP for optical neural networks and circuit switching, and the exploration of optical reservoir computing with MMFs and LN waveguides, this work lays the foundation for future all-optical AI computing systems. The thesis concludes with a vision of fully optical AI compute clusters.
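The reservoir computing idea mentioned in the abstract can be illustrated with a small numerical sketch. It is not the thesis's experimental setup: the random complex mixing matrices, the Kerr-like intensity-dependent phase, the toy XOR-style task, and all function names and parameters (`make_reservoir`, `reservoir_features`, `gamma`, `ridge`) are assumptions chosen for illustration. The point it shows is only the division of labor: the nonlinear medium is fixed and never trained, and a single linear readout is fitted on the detected intensities.

```python
import numpy as np

def make_reservoir(n_in, n_modes, n_steps, seed=0):
    """Draw fixed random complex mixing matrices (assumed stand-in for the medium)."""
    rng = np.random.default_rng(seed)
    shapes = [(n_in, n_modes)] + [(n_modes, n_modes)] * (n_steps - 1)
    return [(rng.normal(size=s) + 1j * rng.normal(size=s)) / np.sqrt(2 * s[0])
            for s in shapes]

def reservoir_features(x, mixers, gamma=1.0):
    """Propagate inputs through the fixed medium and detect output intensities."""
    field = x.astype(complex)
    for m in mixers:
        field = field @ m                                        # linear mode coupling
        field = field * np.exp(1j * gamma * np.abs(field) ** 2)  # Kerr-like phase (assumption)
    return np.abs(field) ** 2                                    # square-law detection

def add_bias(features):
    """Append a constant column so the readout can learn an offset."""
    return np.hstack([features, np.ones((features.shape[0], 1))])

def fit_readout(features, targets, ridge=1e-3):
    """Train only the linear readout, using ridge regression."""
    f = add_bias(features)
    return np.linalg.solve(f.T @ f + ridge * np.eye(f.shape[1]), f.T @ targets)

def predict(features, weights):
    """Classify by the sign of the linear readout."""
    return np.sign(add_bias(features) @ weights)

# Toy task that is not linearly separable in the raw inputs: label = sign(x1 * x2).
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=(400, 2))
y = np.sign(x[:, 0] * x[:, 1])

mixers = make_reservoir(n_in=2, n_modes=32, n_steps=3)
feats = reservoir_features(x, mixers)
w = fit_readout(feats, y)
acc = np.mean(predict(feats, w) == y)
print(f"training accuracy: {acc:.2f}")
```

Only the 33 readout weights are fitted here; the mixing matrices play the role of the physical medium and stay untouched, which is where the reduction in trainable parameters comes from in this kind of scheme.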
EPFL_TH11523.pdf