Toward Predictable Performance in Software Packet-Processing Platforms
Contention for shared resources (caches, memory controllers, buses, NICs) is widely assumed to be a hurdle in optimizing and predicting the performance of multicore software systems, especially packet-processing systems, which make extensive use of these resources. Recent projects have demonstrated the feasibility of building such systems with high and predictable performance, but they all make a crucial simplifying assumption: that all processing cores see identical input and run identical code on it, and that all packets incur the same kind of conventional packet processing (e.g., IP forwarding). To generalize to realistic scenarios, we must accommodate multiple clients running a range of both standard and sophisticated packet-processing applications. This raises the question of how to configure and run a varied mix of packet-processing applications on modern multicore hardware so as to achieve both ease of programmability and optimal, predictable performance. We set out to answer this question and reach a surprising conclusion: in a software-based packet-processing platform, if we follow a few basic programming and configuration rules, (a) contention affects performance in a predictable manner and (b) the effect depends very little on how we schedule different applications on different cores, so contention-aware scheduling brings little benefit. In short, the most efficient way to manage contention is to mostly ignore it.