222795
20190509132601.0
doi
10.5075/epfl-thesis-7302
urn
urn:nbn:ch:bel-epfl-thesis7302-3
nebis
10745325
THESIS
eng
7302
From data to structures
graph learning under smoothness assumptions and applications in data science
2016
Lausanne
EPFL
2016
116
Theses
Prof. Jean-Philippe Thiran (president); Prof. Pierre Vandergheynst (thesis director); Prof. Patrick Thiran, Prof. Mauro Maggioni, Prof. Pierre Borgnat (examiners)
Weighted undirected graphs are a simple yet powerful way to encode structure in data. A first question regarding such graphs is how to use them effectively to improve machine learning tasks. A second, more important question is how to obtain such a graph directly from the data. These two questions guide the direction of this thesis. We first explore the use of graphs in matrix completion, with direct implications for recommendation systems. In such systems, side-information graphs arise naturally from user behavior and interactions, and from product profiling. Given incomplete object ratings by users, and known graphs encoding user and object similarities, we propose a new matrix completion model that exploits these graphs to achieve better recommendations. Our model is convex and is solved efficiently with the Alternating Direction Method of Multipliers (ADMM). It achieves better recommendations than standard convex matrix completion, especially when very few ratings are known a priori, while remaining robust to the quality of the graphs. Turning to the second question, we are interested in learning a graph from data. Data is usually assumed to be smooth on a graph, an assumption associated with graph sparsity and connectivity. Using this assumption, we provide a general graph learning framework that is convex and easy to optimize. We classify graph learning models by their prior structural information as edge-local, node-local, or global. Edge-local models are the cheapest computationally, learning each edge independently. Exponential-decay graphs and $\epsilon$-neighborhood graphs, the most prevalent weighting schemes in the literature, are instances of such models within our framework. We provide a new model that interpolates between these two and achieves better performance in many settings. Node-local models take into account the neighborhood of each node. While they require iterative algorithms, they are stronger models that achieve better results.
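The edge-local weighting scheme and the smoothness assumption mentioned in the abstract can be made concrete with a short sketch. The following Python snippet is illustrative only (it is not code from the thesis; the bandwidth `sigma` and the toy data are assumptions for demonstration): it builds an exponential-decay graph, where each weight depends only on the distance between its two endpoints, and evaluates smoothness through the Laplacian quadratic form $\mathrm{tr}(X^\top L X)$.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

def exp_decay_graph(X, sigma=1.0):
    """Edge-local weighting: W_ij = exp(-||x_i - x_j||^2 / sigma^2).

    Each weight depends only on its two endpoints, so every edge
    can be computed independently of the rest of the graph.
    """
    d2 = squareform(pdist(X, 'sqeuclidean'))
    W = np.exp(-d2 / sigma**2)
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W

def smoothness(W, X):
    """Laplacian quadratic form tr(X^T L X) = 1/2 sum_ij W_ij ||x_i - x_j||^2.

    Small values mean the signal X varies little across strongly
    weighted edges, i.e. X is smooth on the graph.
    """
    L = np.diag(W.sum(axis=1)) - W  # combinatorial Laplacian
    return np.trace(X.T @ L @ X)

# Toy check on two well-separated clusters: the coordinates themselves
# form a smooth signal on the exponential-decay graph.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.1, (10, 2)), rng.normal(5, 0.1, (10, 2))])
W = exp_decay_graph(X, sigma=1.0)
print(smoothness(W, X))
```

A signal that assigns distant values to strongly connected nodes yields a much larger quadratic form, which is the sense in which smoothness couples the data to the graph's sparsity pattern.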
We provide a new node-local model for general graph learning under smoothness assumptions that redefines the state of the art. We also discuss the previous state of the art, two global models within our framework. We provide efficient primal-dual algorithms for solving both our model and the most prevalent of the previous ones. Our model performs best in many settings. Large-scale applications require graphs that are easy to compute, whose cost scales gracefully with the number of nodes, and that need no parameter tuning. We provide the first large-scale graph learning solution, based on approximate nearest neighbors, with a cost of $\mathcal{O}(m\log(m))$ for $m$ nodes. Our solution automatically selects the necessary parameters given a desired sparsity level. We finally explore learning graphs that change over time. In many applications graphs are not static, and data comes from different versions of them. In such cases no single graph correctly summarizes connectivity, and we need to learn several graphs, one for each selected time window. Assuming that graphs change slowly, we provide two new models, one based on a known global model and one on our own node-local model. Our convex time-varying-graph models can use any convex differentiable loss function to penalize change over time, have a cost linear in the number of time windows, and are easy to parallelize. The resulting graphs fit the data better and provide a good compromise between temporal detail and computational cost.
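As an illustration of the kind of convex objective involved in smoothness-based graph learning (a sketch consistent with the framework described above, not a verbatim statement of the thesis's model), the weight matrix $W$ can be learned by trading off smoothness against connectivity and sparsity terms:

```latex
\min_{\substack{W \ge 0,\; W = W^\top \\ \operatorname{diag}(W) = 0}}
\;\; \|W \circ Z\|_{1,1}
\;-\; \alpha\, \mathbf{1}^\top \log(W\mathbf{1})
\;+\; \frac{\beta}{2}\, \|W\|_F^2,
\qquad Z_{ij} = \|x_i - x_j\|^2
```

Here the first term equals $2\,\mathrm{tr}(X^\top L X)$ and rewards smoothness, the logarithmic barrier on the node degrees $W\mathbf{1}$ forces every node to stay connected, and the Frobenius term controls the spread of the edge weights; $\alpha$ and $\beta$ balance the three. Being convex, such an objective is amenable to the primal-dual algorithms mentioned in the abstract.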
graph
graph learning
matrix completion
smoothness on graphs
total cumulative energy residual (TCER)
large scale graph learning
time varying graph
network inference
Laplacian estimation
convex optimization
246432
Kalofolias, Vasilis
211494
240428
Vandergheynst, Pierre
dir.
120906
2453269
http://infoscience.epfl.ch/record/222795/files/EPFL_TH7302.pdf
n/a
n/a
252392
LTS2
U10380
oai:infoscience.tind.io:222795
thesis
thesis-bn2018
DOI
STI
DOI2
GLOBAL_SET
108898
108898
STI
IEL
EDIC
LTS2
2016-11-03
2016
7302/THESES
EPFL
PUBLISHED
THESIS