Files

Abstract

Computer networks are an important part of our life. Yet, they are traditionally hard to manage and operate. The recent shift to Software-Defined Networking (SDN) is promised to change the way in which the networks are run by their operators. However, SDN is still in its initial stage and as such it lags behind traditional networks in terms of correctness and reliability; both the properties being vitally important for the network operators. Meanwhile, general purpose computers were following Moore's law for years and as a result, together with algorithmic improvements, the costs of computing have been declining --- we have more processing power available to us than anytime before. In this dissertation I strive to reduce the correctness and reliability gap of SDN and help speed up the adoption of SDN by providing tools that help programmers and operators gain confidence in their networks. These tools leverage the today's trend in the increasing availability of vast amounts of computing power and combine it with the past research on systematic validation, optimization and satisfiability solving. The first tool this thesis introduces, NICE, focuses on debugging of an SDN controller. In particular, while SDN is conceptually a centralized system with the controller in the command, SDN is still a distributed system. As such, programmers might miss various event interleavings and race conditions that lead to a buggy behavior. NICE uncovers such insidious bugs by systematically exploring possible network states by a novel combination of two techniques -- symbolic execution and model checking. The combination of the two techniques and SDN-related heuristics let NICE explore possible scenarios deeper than any existing technique alone. This thesis also introduces Monocle, a tool aimed at ensuring reliability of SDN switches. Specifically, Monocle continuously verifies that a switch data plane processes packets as configured by the SDN controller. Such monitoring is important in the presence of silent errors ranging from switch firmware bugs, through memory corruption, to transient inconsistencies. To verify that the switch data plane works as intended, Monocle constructs data plane packets and injects them into the network. Such packet construction is, however, a rather intricate problem when considering a complicated nature of the switch packet matching mechanism in the presence of multiple overlapping rules. In the thesis I formally describe the constraints that the generated packets need to satisfy, and I leverage an off-the-shelf satisfiability solver to solve them. Finally, this thesis introduces ESPRES, a tool focused on scheduling network updates. I first observe that big network updates such as traffic engineering or failure re-routing usually contain many subupdates that are independent (e.g, updates of different flows). Then I conjecture that while the length of the total update is bottlenecked by the slowest switch, the time when an individual subupdate finishes can be greatly improved for a majority of them. To achieve this goal in an online fashion, ESPRES uses a greedy mechanism to reorder individual switch commands according to the current situation on different switches.

Details

PDF