000253619 001__ 253619 000253619 005__ 20190509131751.0 000253619 0247_ $$2doi$$a10.5075/epfl-thesis-8391 000253619 037__ $$aTHESIS 000253619 041__ $$aeng 000253619 088__ $$a8391 000253619 245__ $$aLocalizing the Source of an Epidemic Using Few Observations 000253619 260__ $$c2018$$bEPFL$$aLausanne 000253619 269__ $$a2018 000253619 300__ $$a165 000253619 336__ $$aTheses 000253619 502__ $$aProf. Pascal Frossard (président) ; Prof. Patrick Thiran, Dr Laura Elisa Celis (directeurs) ; Prof. Marcel Salathé, Dr Manuel Gomez-Rodriguez, Prof. Bruno Sinopoli (rapporteurs) 000253619 520__ $$aLocalizing the source of an epidemic is a crucial task in many contexts, including the detection of malicious users in social networks and the identification of patient zeros of disease outbreaks. The difficulty of this task lies in the strict limitations on the data available: In most cases, when an epidemic spreads, only few individuals, who we will call sensors, provide information about their state. Furthermore, as the spread of an epidemic usually depends on a large number of variables, accounting for all the possible spreading patterns that could explain the available data can easily result in prohibitive computational costs. Therefore, in the field of source localization, there are two central research directions: The design of practical and reliable algorithms for localizing the source despite the limited data, and the optimization of data collection, i.e., the identification of the most informative sensors. In this dissertation we contribute to both these directions. We consider network epidemics starting from an unknown source. The only information available is provided by a set of sensor nodes that reveal if and when they become infected. We study how many sensors are needed to guarantee the identification of the source. A set of sensors that guarantees the identification of the source is called a double resolving set (DRS); the minimum size of a DRS is called the double metric dimension (DMD). Computing the DMD is, in general, hard, hence estimating it with bounds is desirable. We focus on G(N,p) random networks for which we derive tight bounds for the DMD. We show that the DMD is a non-monotonic function of the parameter p, hence there are critical parameter ranges in which source localization is particularly difficult. Again building on the relationship between source localization and DRSs, we move to optimizing the choice of a fixed number K of sensors. First, we look at the case of trees where the uniqueness of paths makes the problem simpler. For this case, we design polynomial time algorithms for selecting K sensors that optimize certain metrics of interest. Next, turning to general networks, we show that the optimal sensor set depends on the distribution of the time it takes for an infected node u to infect a non-infected neighbor v, which we call the transmission delay from u to v. We consider both a low- and a high-variance regime for the transmission delays. We design algorithms for sensor placement in both cases, and we show that they yield an improvement of up to 50% over state-of-the-art methods. Finally, we propose a framework for source localization where some sensors (called dynamic sensors) can be added while the epidemic spreads and the localization progresses. We design an algorithm for joint source localization and dynamic sensor placement; This algorithm can handle two regimes: offline localization, where we localize the source after the epidemic spread, and online localization, where we localize the source while the epidemic is ongoing. We conduct an empirical study of offline and online localization and show that, by using dynamic sensors, the number of sensors we need to localize the source is up to 10 times less with respect to a strategy where all sensors are deployed a priori. We also study the resistance of our methods to high-variance transmission delays and show that, even in this setting, using dynamic sensors, the source can be localized with less than 5% of the nodes being sensors. 000253619 6531_ $$aSource localization 000253619 6531_ $$aNetwork epidemics 000253619 6531_ $$aSensor placement 000253619 700__ $$0247061$$aSpinelli, Brunella Marta$$g226024 000253619 720_2 $$aThiran, Patrick$$edir.$$g103925 000253619 720_2 $$aCelis, Laura Elisa$$edir.$$g245193 000253619 8564_ $$uhttps://infoscience.epfl.ch/record/253619/files/EPFL_TH8391.pdf$$s2786789 000253619 8564_ $$uhttps://infoscience.epfl.ch/record/253619/files/EPFL_TH8391.gif?subformat=icon$$s6826$$xicon 000253619 8564_ $$uhttps://infoscience.epfl.ch/record/253619/files/EPFL_TH8391.jpg?subformat=icon-180$$s8181$$xicon-180 000253619 8564_ $$uhttps://infoscience.epfl.ch/record/253619/files/EPFL_TH8391.jpg?subformat=icon-700$$s44392$$xicon-700 000253619 8564_ $$uhttps://infoscience.epfl.ch/record/253619/files/EPFL_TH8391.pdf?subformat=pdfa$$s4604708$$xpdfa 000253619 909C0 $$pLCA3 000253619 909CO $$qGLOBAL_SET$$pthesis$$pDOI$$pIC$$ooai:infoscience.epfl.ch:253619$$qDOI2 000253619 918__ $$cIINFCOM$$aIC$$dEDIC 000253619 919__ $$aLCA3 000253619 920__ $$a2018-03-23$$b2018 000253619 970__ $$a8391/THESES 000253619 973__ $$sPUBLISHED$$aEPFL 000253619 980__ $$aTHESIS