000150471 001__ 150471
000150471 005__ 20190619003302.0
000150471 0247_ $$2doi$$a10.5075/epfl-thesis-4830
000150471 02470 $$2urn$$aurn:nbn:ch:bel-epfl-thesis4830-1
000150471 02471 $$2nebis$$a6131191
000150471 037__ $$aTHESIS
000150471 041__ $$aeng
000150471 088__ $$a4830
000150471 245__ $$aDAISY: A Fast Descriptor for Dense Wide Baseline Stereo and Multiview Reconstruction
000150471 269__ $$a2010
000150471 260__ $$bEPFL$$c2010$$aLausanne
000150471 300__ $$a173
000150471 336__ $$aTheses
000150471 520__ $$aStereo reconstruction is a fundamental problem of computer vision. It has been studied for more than three decades, and significant progress has been made in recent years, as evidenced by the quality of the models now being produced. This progress is closely related to advances in other fields. With the emergence of low-cost, high-quality cameras, we now live in an era where abundant data is available for reconstruction. The multitude of images from numerous capture sources has aroused new interest in the stereo vision community because of new challenges, such as robustness to photometric and geometric variability and scalability with respect to the number of images and the image resolution. In this thesis, we aim to find efficient, and therefore practical, algorithmic solutions for the two extreme ends of the stereo vision problem: first, we consider the case of only two input images, where the cameras are placed far from each other, and then we investigate large-scale multi-view reconstruction for ultra-high-resolution image sets. Each problem has unique challenges: in the first, we need to handle the large perspective distortions that the image texture undergoes; in the second, we need to design an algorithm that can scale to very large sets of ultra-high-resolution images using only a single standard computer. For the first problem, we design an efficient dense image descriptor, called DAISY, that is robust not only to photometric transforms such as brightness and contrast changes but also to the perspective effects that viewpoint changes produce. We use the DAISY descriptor as a photo-consistency measure in an expectation-maximization framework with a global graph-cuts optimization algorithm to estimate depth and occlusion maps. We demonstrate successful results on a variety of data sets, some of which have laser-scanned ground truth. 
After estimating the depth and occlusion maps, we introduce a technique to improve the surface reconstruction in occluded areas by extracting normal cues using simple binary classifiers trained over DAISY-like features. For the large-scale, ultra-high-resolution multi-view stereo problem, we design a very efficient local optimization algorithm for the depth estimation framework, in place of the global one developed in the first part of the thesis. Scalability over the number of images is handled by representing the scene with a set of depth maps, and scalability over the image resolution is handled by using a local approach for depth map estimation. We demonstrate state-of-the-art quality results for very large sets of very high-resolution images, computed on a single standard computer in comparatively short computation times. Overall, we show that using a distinctive and robust descriptor to measure photo-consistency allows us to avoid many of the complex stages that other algorithms use, without sacrificing the accuracy of the results, and thus to scale up to large data sets easily.
000150471 6531_ $$acomputer vision
000150471 6531_ $$adense local descriptor
000150471 6531_ $$aDAISY
000150471 6531_ $$ascene reconstruction
000150471 6531_ $$awide baseline stereo
000150471 6531_ $$aultra-high resolution
000150471 6531_ $$alarge scale multi-view stereo
000150471 6531_ $$avision par ordinateur
000150471 6531_ $$adescripteur local dense
000150471 6531_ $$aDAISY
000150471 6531_ $$areconstruction de scène
000150471 6531_ $$awide baseline stéréo
000150471 6531_ $$atrès haute résolution
000150471 6531_ $$amulti-vue stéréo à large échelle
000150471 700__ $$0242709$$g170333$$aTola, Engin
000150471 720_2 $$aFua, Pascal$$edir.$$g112366$$0240252
000150471 8564_ $$uhttps://infoscience.epfl.ch/record/150471/files/EPFL_TH4830.pdf$$zTexte intégral / Full text$$s147483271$$yTexte intégral / Full text
000150471 909C0 $$xU10659$$0252087$$pCVLAB
000150471 909CO $$pthesis-public$$pDOI$$pIC$$ooai:infoscience.tind.io:150471$$qGLOBAL_SET$$pthesis$$pthesis-bn2018$$qDOI2
000150471 918__ $$dEDIC2005-2015$$cISIM$$aIC
000150471 919__ $$aCVLAB
000150471 920__ $$b2010
000150471 970__ $$a4830/THESES
000150471 973__ $$sPUBLISHED$$aEPFL
000150471 980__ $$aTHESIS