000227934 001__ 227934
000227934 005__ 20190509132609.0
000227934 0247_ $$2doi$$a10.5075/epfl-thesis-7589
000227934 02470 $$2urn$$aurn:nbn:ch:bel-epfl-thesis7589-7
000227934 02471 $$2nebis$$a10890274
000227934 037__ $$aTHESIS
000227934 041__ $$aeng
000227934 088__ $$a7589
000227934 245__ $$aVision-based detection of aircrafts and UAVs
000227934 269__ $$a2017
000227934 260__ $$aLausanne$$bEPFL$$c2017
000227934 300__ $$a116
000227934 336__ $$aTheses
000227934 502__ $$aDr Denis Gillet (president); Prof. Pascal Fua, Prof. Vincent Lepetit (thesis directors); Prof. Bertrand Merminod, Prof. Andrew Zisserman, Dr Hervé Jégou (examiners)
000227934 520__ $$aUnmanned Aerial Vehicles are becoming increasingly popular for a broad variety of tasks ranging from aerial imagery to object delivery. As the areas where drones can be used efficiently expand, so does the risk of collision with other flying objects. Avoiding such collisions would be a relatively easy task if all the aircraft in the neighboring airspace could communicate with each other and share their location information. However, it is often the case that either location information is unavailable (e.g. flying in GPS-denied environments) or communication is not possible (e.g. different communication channels or a non-cooperative flight scenario). To ensure flight safety in such situations, drones need a way to autonomously detect other objects intruding into the neighboring airspace. Vision-based collision avoidance is of particular interest because cameras generally consume less power and are more lightweight than active sensor alternatives such as radars and lasers. We have therefore developed a set of increasingly sophisticated algorithms to provide drones with a visual collision avoidance capability.  First, we present a novel method for detecting flying objects such as drones and planes that occupy a small part of the camera field of view, possibly move in front of complex backgrounds, and are filmed by a moving camera. Solving this problem requires combining motion and appearance information, as neither of the two alone can provide reliable enough detections. We therefore propose a machine learning technique that operates on spatio-temporal cubes of image intensities, in which individual patches are aligned using an object-centric, regression-based motion stabilization algorithm.  Second, to reduce the need to collect a large training dataset and annotate it manually, we introduce a way to generate realistic synthetic images. Given only a small set of real examples and a coarse 3D model of the object, synthetic data can be generated in arbitrary quantities and used to supplement the real examples when training a detector. The key ingredient of our method is that the synthetically generated images need to be as close as possible to the real ones not in terms of image quality, but in terms of the features used by the machine learning algorithm.  Third, although the aforementioned approach yields a substantial increase in performance with AdaBoost and DPM detectors, it does not generalize well to Convolutional Neural Networks, which have become the state of the art. This happens because, as we add more and more synthetic data, the CNNs begin to overfit to the synthetic images at the expense of the real ones. We therefore propose a novel deep domain adaptation technique that allows real and synthetic images to be combined efficiently without overfitting to either of the two. Whereas most adaptation techniques aim at learning features that are invariant to the differences between images coming from different sources (real and synthetic), we instead model this difference explicitly with a special two-stream architecture. We evaluate our approach on three different datasets and show its effectiveness for various classification and regression tasks.
000227934 6531_ $$acomputer vision
000227934 6531_ $$aunmanned aerial vehicles
000227934 6531_ $$aobject detection
000227934 6531_ $$amotion compensation
000227934 6531_ $$asynthetic data generation
000227934 6531_ $$amachine learning
000227934 6531_ $$adeep learning
000227934 6531_ $$adomain adaptation
000227934 700__ $$0246627$$aRozantsev, Artem$$g222094
000227934 720_2 $$0240252$$aFua, Pascal$$edir.$$g112366
000227934 720_2 $$0240235$$aLepetit, Vincent$$edir.$$g149007
000227934 8564_ $$s14486639$$uhttps://infoscience.epfl.ch/record/227934/files/EPFL_TH7589.pdf$$yn/a$$zn/a
000227934 909C0 $$0252087$$pCVLAB$$xU10659
000227934 909CO $$ooai:infoscience.tind.io:227934$$pthesis-bn2018$$pDOI$$pIC$$pthesis$$qDOI2$$qGLOBAL_SET
000227934 917Z8 $$x108898
000227934 918__ $$aIC$$cIINFCOM$$dEDIC
000227934 919__ $$aCVLAB
000227934 920__ $$a2017-5-5$$b2017
000227934 970__ $$a7589/THESES
000227934 973__ $$aEPFL$$sPUBLISHED
000227934 980__ $$aTHESIS