Programming Light Propagation for Efficient Artificial Intelligence
The complexity of contemporary artificial intelligence (AI) models is growing exponentially, and their skill sets are expanding with it. These models can create language or image content that mimics human creativity and understanding, and they are becoming indispensable tools in education, research, and healthcare. However, state-of-the-art AI models require immense computational power, which leads to higher energy consumption and environmental impact. This doctoral thesis aims to leverage optical computing for AI by developing hardware and algorithmic tools that reduce energy dissipation while overcoming scaling limitations. Optics offers inherent parallelizability and low loss, whereas in electronics signals cannot overlap and losses are higher. The thesis first explores the propagation of ultrashort pulses in multimode optical fibers as a platform for neural networks (NNs). Tight confinement of light within the fiber over long distances enhances nonlinear effects, so this platform enables efficient high-dimensional nonlinear information processing through light-matter interactions, and it improves accuracy on machine learning tasks when combined with a single-layer digital classifier. By configuring the optical nonlinearities with a metamodel of the platform, we achieve a 99% reduction in model complexity compared to digital electronics-based NNs. Beyond improving our understanding of optical nonlinearities, this work demonstrates their potential for energy-efficient AI. The thesis also introduces structural nonlinearity, which provides programmable, complex, nonlinear transformations of data using low-power continuous-wave lasers. The nonlinear relationship between the input data and the output light field arises from modulating the beam multiple times with the same input. In experimental classification tasks, structural nonlinearity consistently outperforms linear optical networks.
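The idea behind structural nonlinearity can be illustrated with a minimal numerical sketch. This is not the thesis's experimental setup: it assumes amplitude modulation of the field by the data vector, and it stands in for the optical propagation between modulation planes with fixed random complex matrices. The names `propagate`, `random_mixer`, `one_pass`, and `two_pass` are hypothetical. With a single modulation the output field is linear in the data; modulating twice with the same data makes the output quadratic in it.

```python
import numpy as np

def propagate(x, mixers, e_in):
    """Modulate the field by the data x before each fixed linear mixing step."""
    field = e_in
    for m in mixers:
        field = m @ (x * field)  # element-wise modulation, then linear optics
    return field

rng = np.random.default_rng(0)
n = 8  # number of modes/pixels (assumed for illustration)

def random_mixer():
    # Hypothetical stand-in for free-space or scattering propagation.
    return (rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))) / np.sqrt(2 * n)

mixers = [random_mixer() for _ in range(2)]
e_in = np.ones(n, dtype=complex)  # uniform input beam
a = rng.uniform(0.1, 1.0, size=n)
b = rng.uniform(0.1, 1.0, size=n)

def one_pass(x):
    return propagate(x, mixers[:1], e_in)

def two_pass(x):
    return propagate(x, mixers, e_in)

# Additivity holds for one modulation and fails for two.
linear_gap = np.linalg.norm(one_pass(a + b) - one_pass(a) - one_pass(b))
nonlinear_gap = np.linalg.norm(two_pass(a + b) - two_pass(a) - two_pass(b))
print(f"one pass additivity error:  {linear_gap:.2e}")
print(f"two pass additivity error:  {nonlinear_gap:.2e}")
```

In this toy model the polynomial order of the output in the data grows with the number of modulation passes, while every individual optical element remains linear in the field.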
Training "deep" optical neural networks (ONNs), in which multiple optical layers process input representations consecutively, demands specialized training algorithms. This thesis develops two approaches to address this challenge. The first defines a digital model of the optical system and continuously updates it during backpropagation training. This method interfaces seamlessly with existing neural networks and models nonlinear optical systems with high precision, four orders of magnitude faster than analytical simulation. The second approach defines a local loss function for each optical layer; each layer is programmed with a small number of digital parameters that are optimized against its own loss. This eliminates the need for precise system characterization, so the ONN remains resilient to system drifts and can complete training in a single pass over the physical experiment. Generative AI methods, which require thousands of steps to generate each sample, are even more resource-intensive than classification tasks and incur significant time and energy costs. The final chapter of this thesis proposes an optical solution that shapes light propagation with passive modulation layers designed using a physics-based machine learning algorithm. These layers implement denoising diffusion models that generate images from random Gaussian signals without requiring power or active control in the computational layers. This optical generative AI method demonstrates scalability equivalent to that of its digital counterparts while consuming two orders of magnitude less energy.
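For context on why diffusion-based generation takes many steps, the following sketch shows the standard digital forward (noising) process of a denoising diffusion model. It is not the optical implementation described in the thesis. The linear noise schedule, the step count of 1000, and the names `make_schedule` and `noise_sample` are assumptions chosen for illustration. Generation runs this process in reverse, removing a small amount of noise at each of the steps.

```python
import numpy as np

def make_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule (assumed) and cumulative signal fraction."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alpha_bar = np.cumprod(1.0 - betas)
    return betas, alpha_bar

def noise_sample(x0, t, alpha_bar, rng):
    """Closed-form forward process: x_t = sqrt(abar_t) x0 + sqrt(1 - abar_t) eps."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps

rng = np.random.default_rng(0)
betas, alpha_bar = make_schedule()
x0 = rng.uniform(-1.0, 1.0, size=(16, 16))  # placeholder "image"

x_early, eps_early = noise_sample(x0, 10, alpha_bar, rng)   # mostly signal
x_late, eps_late = noise_sample(x0, 999, alpha_bar, rng)    # almost pure noise

# A denoiser that predicts eps exactly would recover x0 from x_t.
x0_rec = (x_early - np.sqrt(1.0 - alpha_bar[10]) * eps_early) / np.sqrt(alpha_bar[10])
print("signal fraction at first and last step:", alpha_bar[0], alpha_bar[-1])
```

Because the signal fraction `alpha_bar` decays gradually, the reverse process needs one denoising operation per noise level, which is the per-sample cost the abstract refers to.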
File: EPFL_TH10843.pdf (main document, open access, Adobe PDF, 16.13 MB, checksum 777011dd30bb4b951da1fd712735769e)