Bound and Conquer: Properties, Algorithms, and Applications of Multi-Camera Imaging

Using more than one camera to capture a scene has a long history in photography and the imaging industry. Inspired by how the human visual system operates, researchers and companies have long tried to compensate for the 3-D information lost during image acquisition by employing multiple cameras that view a scene from different viewpoints. In recent years, advances in imaging hardware have led to explosive growth in multi-camera systems composed of tens to hundreds of cameras. These include light-field systems as well as other multi-camera architectures. Motivated by this trend, we set out to develop an image-based urban localisation system. While developing this system, we realised that, to make the best use of the available computational power and attain the best possible performance, we needed to understand thoroughly how the performance of an imaging system is affected by the number of views, especially since the extra storage and computational cost of processing additional views are not negligible. In this thesis, we provide an overview of our image-based localisation system and study the fundamental limits of the reconstruction accuracy of multi-camera systems. We prove that, under certain conditions, the lowest achievable reconstruction error of a multi-camera system decays quadratically with the number of cameras in the system. We also analyse state-of-the-art reconstruction algorithms to characterise the properties that allow an algorithm to attain the optimal decay of the reconstruction error. We introduce the concept of consistency and prove that consistent reconstruction algorithms asymptotically achieve the optimal quadratic decay rate.
Although in this work we formally analyse only two geometric reconstruction problems, we believe that the proposed framework could also lay the foundation for a robust analysis of other, more complex problems, such as structure from motion and simultaneous localisation and mapping, in the presence of a very large number of cameras.

