Definitive Consensus for Distributed Data Inference
Inference from data is of key importance in many applications of informatics. The current trend is to perform such inference with machine learning algorithms. Moreover, in many applications it is either required or preferable to infer from the data in a distributed manner. Many practical difficulties arise because, in many distributed applications, we avoid transferring the data or parts of it owing to cost, privacy and computation considerations. It would also be advantageous if the final knowledge attained through distributed data inference were common to every participating computing node. The key to achieving this is the distributed average consensus algorithm, referred to herein simply as the consensus algorithm. It has been used in many applications; its original purpose was to estimate the expectation of scalar-valued data distributed over a network of machines without a central node. Notably, the algorithm allows the final outcome to be the same for every participating node. Using the consensus algorithm as the centrepiece makes the task of distributed data inference feasible. However, several difficulties hinder its direct applicability, and we concentrate on the consensus algorithm with the purpose of addressing them. There are two main concerns. First, the consensus algorithm converges only asymptotically, so maximum accuracy is achieved only if the algorithm is left to run for a large number of iterations. Second, the accuracy attained at any iteration is correlated with the standard deviation of the initial value distribution. The consensus algorithm is therefore inherently imprecise at finite time, which makes the learning process harder. We solve this problem by introducing the definitive consensus algorithm.
This algorithm attains maximum precision in a finite number of iterations, namely a number of iterations equal to the diameter of the graph, in a distributed and decentralised manner. Additionally, we introduce the nonlinear consensus algorithm and the adaptive consensus algorithm. These are modifications of the original consensus algorithm that allow improved precision with fewer iterations in cases of unknown, partially known and stochastically time-varying network topologies. The definitive consensus algorithm can be incorporated in a distributed data inference framework. We approach the problem of data inference from the perspective of machine learning. Specifically, we tailor this distributed inference framework to machine learning on a communication network with the data partitioned across the participating computing nodes. In particular, the framework is detailed and applied to the case of a multilayer feed-forward neural network with error back-propagation. A substantial examination of its performance, and a comparison with the non-distributed case, is provided. A theoretical foundation for the definitive consensus algorithm is provided, and its superior performance is validated by numerical experiments. A brief theoretical examination of the nonlinear and adaptive consensus algorithms justifies their improved performance with respect to the original consensus algorithm, and extensive numerical simulations compare both with the original consensus algorithm. The most important contributions of this research are the definitive consensus algorithm and the distributed data inference framework. Their combination yields a decentralised distributed process over a communication network that is capable of inference in agreement over the entire network.
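As an illustration of the two concerns raised above, the following is a minimal sketch of the standard linear average consensus iteration, not of the definitive, nonlinear or adaptive algorithms introduced in the thesis. The ring topology, the step size of 0.3, the initial values and all function names are assumptions chosen for the example.

```python
def consensus_step(values, neighbors, step):
    """One synchronous iteration: each node moves toward its neighbours' values."""
    return [
        x + step * sum(values[j] - x for j in neighbors[i])
        for i, x in enumerate(values)
    ]


def run_consensus(values, neighbors, step, iterations):
    """Iterate and record the worst-case distance to the true average."""
    mean = sum(values) / len(values)
    errors = []
    for _ in range(iterations):
        values = consensus_step(values, neighbors, step)
        errors.append(max(abs(x - mean) for x in values))
    return values, errors


# Hypothetical 5-node ring; every node only talks to its two neighbours.
neighbors = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
initial = [1.0, 5.0, 3.0, 9.0, 2.0]  # assumed local scalar values, average 4.0

final, errors = run_consensus(initial, neighbors, step=0.3, iterations=50)

# Same topology, initial values spread ten times wider.
wide = [10.0 * x for x in initial]
_, errors_wide = run_consensus(wide, neighbors, step=0.3, iterations=50)

print("final values:", final)
print("error after 5 and 50 iterations:", errors[4], errors[49])
print("error after 5 iterations, wider spread:", errors_wide[4])
```

Every node approaches the network average without any central node, but the error only shrinks geometrically and never reaches zero in exact arithmetic. Because the iteration is linear, widening the spread of the initial values widens the error at every iteration by the same factor.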
Keywords: Consensus; Distributed; Learning; Distributed algorithms; Distributed inference; Distributed data; Collaborative learning; Agreement; Nonlinear consensus; Adaptive consensus; Definitive consensus; Finite time consensus; Networks; Communication networks; Neural networks; Graphs; Graph diameter; Graph radius; Groebner bases; Multilinear polynomial systems of equations; Nonlinear optimisation; Dynamical systems
Thesis, École polytechnique fédérale de Lausanne (EPFL), no. 5026 (2011)
Programme doctoral Informatique, Communications et Information
Faculté informatique et communications
Institut de systèmes de communication
Laboratoire de systèmes non linéaires
Record created on 2011-03-03, modified on 2016-12-12