Home > From data to structures > HTML MARC |

000222795 001__ 222795 000222795 005__ 20190317000549.0 000222795 0247_ $$2doi$$a10.5075/epfl-thesis-7302 000222795 02470 $$2urn$$aurn:nbn:ch:bel-epfl-thesis7302-3 000222795 02471 $$2nebis$$a10745325 000222795 037__ $$aTHESIS 000222795 041__ $$aeng 000222795 088__ $$a7302 000222795 245__ $$aFrom data to structures$$bgraph learning under smoothness assumptions and applications in data science 000222795 269__ $$a2016 000222795 260__ $$aLausanne$$bEPFL$$c2016 000222795 300__ $$a116 000222795 336__ $$aTheses 000222795 502__ $$aProf. Jean-Philippe Thiran (président) ; Prof. Pierre Vandergheynst (directeur de thèse) ; Prof. Patrick Thiran, Prof. Mauro Maggioni, Prof. Pierre Borgnat (rapporteurs) 000222795 520__ $$aWeighted undirected graphs are a simple, yet powerful way to encode structure in data. A first question we need to address regarding such graphs is how to use them effectively to enhance machine learning problems. A second but more important question is how to obtain such a graph directly from the data. These two questions lead the direction of this thesis. We first explore the use of graphs in matrix completion, with direct implications in recommendation systems. In such systems side information graphs arise naturally, by comparing user behavior and interactions, and by product profiling. Given incomplete object ratings by users, and known graphs for user and object similarities, we propose a new matrix completion model that exploits the graphs to achieve better recommendations. Our model is convex and is solved efficiently with an Alternating Direction Multipliers method (ADMM). It achieves better recommendations than standard convex matrix completion, especially when very few ratings are known a priori, while it is robust to the graph quality. Continuing with the second question, we are interested in learning a graph from data. Data is usually assumed to be smooth on a graph, an assumption associated with graph sparsity and connectivity. Using this assumption, we provide a general graph learning framework that is convex and easy to optimize. We classify graph learning according to the prior structure information in edge-local, node-local or global. Edge-local models are the cheapest computationally, learning each edge independently. Exponential decay graphs and $\epsilon$-neighborhood graphs, the most prevalent weighting schemes in the literature, are such models of our framework. We provide a new model that interpolates between these two, and achieves better performance in many settings. Node-local models take into account the neighborhood of each edge. While they require iterative algorithms, they are stronger models achieving better results. We provide a new node-local model for general graph learning under smoothness assumptions, that redefines the state of the art. We discuss the previous state of the art, two global models of our framework. We provide efficient primal-dual based algorithms for efficiently solving our model and the most prevalent of the previous ones. Our model performs best in many settings. Large scale applications require graphs that are easy to compute, with cost that scales gracefully with respect to the number of nodes, and do not need parameter tuning. We provide the first large scale graph learning solution, based on approximate nearest neighbors with cost of $\mathcal{O}(m\log(m))$ for $m$ nodes. Our solution automatically finds the needed parameters given the sparsity level needed. We finally explore learning graphs that change in time. In many applications graphs are not static, and data comes from different versions of them. In such cases there is no one graph that correctly summarizes connectivity, and we need to learn several graphs, one for each selected time window. Assuming that graphs change slowly, we provide two new models based on a known global and our own node-local one. Our convex time-varying-graph models can use any convex differentiable loss function for time-change penalization, have a cost linear to the number of time windows, and are easy to parallelize. The obtained graphs fit the data better and provide a good compromise between detail in time and computational cost. 000222795 6531_ $$agraph 000222795 6531_ $$agraph learning 000222795 6531_ $$amatrix completion 000222795 6531_ $$asmoothness on graphs 000222795 6531_ $$atotal cumulative energy residual (TCER) 000222795 6531_ $$alarge scale graph learning 000222795 6531_ $$atime varying graph 000222795 6531_ $$anetwork inference 000222795 6531_ $$aLaplacian estimation 000222795 6531_ $$aconvex optimization 000222795 700__ $$0246432$$aKalofolias, Vasilis$$g211494 000222795 720_2 $$0240428$$aVandergheynst, Pierre$$edir.$$g120906 000222795 8564_ $$s2453269$$uhttps://infoscience.epfl.ch/record/222795/files/EPFL_TH7302.pdf$$yn/a$$zn/a 000222795 909C0 $$0252392$$pLTS2$$xU10380 000222795 909CO $$ooai:infoscience.tind.io:222795$$pthesis$$pthesis-bn2018$$pDOI$$pSTI$$qDOI2$$qGLOBAL_SET 000222795 917Z8 $$x108898 000222795 917Z8 $$x108898 000222795 918__ $$aSTI$$cIEL$$dEDIC 000222795 919__ $$aLTS2 000222795 920__ $$a2016-11-3$$b2016 000222795 970__ $$a7302/THESES 000222795 973__ $$aEPFL$$sPUBLISHED 000222795 980__ $$aTHESIS