Inferring Scalability from Program Pseudocode

Recent trends have led hardware manufacturers to place multiple processing cores on a single chip, making parallel programming the intended way of taking advantage of the increased processing power. However, bringing concurrency to average programmers is considered one of the major challenges in computer science today. The difficulty lies not only in writing correct parallel programs, but also in achieving the required efficiency and performance. For parallel programs, performance is not only about obtaining low execution times on a fixed number of cores, but also about maintaining efficiency as the number of available cores increases.

Ideally, programmers should have in their toolkit techniques that can be applied when designing parallel programs, before any code is available for testing on production hardware, and that can then predict scalability once the program is implemented. Existing methods are either unreliable at predicting scalability, as is the case with disjoint-access parallelism, or do not apply to lock-based programs, which currently make up a large part of existing concurrent programs. Furthermore, the effort of applying some of these techniques outweighs the time required to implement, debug, and test the program on real hardware.

In this thesis we study the problem of predicting the scalability of concurrent programs without implementing them. This allows programmers, while still in the design phase of a concurrent algorithm, to choose only one or a few promising solutions to be implemented, debugged, and tested on production hardware. We first consider disjoint-access parallelism, an existing property that applies only to a very restricted class of programs. After an extensive practical evaluation spanning a variety of scenarios, we find it to be ineffective at predicting scalability.
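To make the notion of disjoint-access parallelism concrete, the sketch below contrasts a single global lock with per-slot locks on a shared array: when operations touch disjoint slots, per-slot locking lets them proceed without contending on a common lock. This is only a minimal illustration of the underlying intuition; the access pattern, slot count, and helper names are hypothetical and do not reproduce the thesis's evaluation.

```python
import threading

N_SLOTS = 8      # hypothetical number of independent data items
N_THREADS = 4
OPS = 1000

# Disjoint-access style: one lock per slot, so two operations that
# touch different slots never wait on each other. A single global
# lock would instead serialize every operation.
slots = [0] * N_SLOTS
locks = [threading.Lock() for _ in range(N_SLOTS)]

def worker(tid):
    for i in range(OPS):
        s = (tid * 31 + i) % N_SLOTS   # hypothetical access pattern
        with locks[s]:                 # only contends if slots collide
            slots[s] += 1

threads = [threading.Thread(target=worker, args=(t,)) for t in range(N_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every increment is accounted for regardless of interleaving.
assert sum(slots) == N_THREADS * OPS
```

As the thesis's evaluation shows, satisfying this property does not by itself guarantee that a program scales in practice; actual behavior depends on factors such as cache traffic and the workload's real access pattern.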
For predicting the scalability of more general concurrent algorithms, we propose the obstruction degree, a new scalability metric based on the consistency requirements of algorithms. It applies to programs using locks, invalidation primitives, and transactional memory. Our metric allows programmers to compare two given algorithms as well as to predict their scalability limit, the maximum number of processors to which they can scale, thus allowing programmers to choose appropriately sized hardware for running their programs.

We also examine the composition of relaxed memory transactions, in order to combine the ease of programming offered by transactional memory with the increased scalability of transactions that circumvent the traditional transactional model. We present outheritance, a property we show to be both necessary and sufficient for ensuring the correct composition of relaxed transactions, and we show how to calculate the obstruction degree of compositions that use this new property. We use outheritance to build OE-STM, a new software transactional memory algorithm whose elastic transactions compose correctly.
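The obstruction degree itself is defined in the body of the thesis, not in this abstract. As a stand-in illustration of what a "scalability limit" means, the snippet below uses the classic Amdahl's-law bound, under which any serialized fraction of work caps the achievable speedup no matter how many cores are added. This is explicitly not the thesis's metric, only a familiar analogue of the idea that a program has a maximum useful processor count.

```python
def amdahl_speedup(serial_fraction, cores):
    # Classic Amdahl's-law speedup: NOT the obstruction degree from
    # the thesis, just an illustration of a scalability ceiling.
    # serial_fraction is the share of work that cannot be parallelized.
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

# With 5% serial work, speedup can never exceed 1/0.05 = 20x,
# so buying hardware beyond a certain size yields little benefit.
s16 = amdahl_speedup(0.05, 16)        # roughly 9.14x
s1m = amdahl_speedup(0.05, 10**6)     # still below the 20x ceiling
assert s16 < s1m < 20.0
```

A metric that exposes such a limit at design time, as the obstruction degree aims to do from an algorithm's consistency requirements, lets programmers size their hardware before writing any code.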


Related material