Improving Scalability of Symbolic Execution for Software with Complex Environment Interfaces

Bucur, Stefan

doi:10.5075/epfl-thesis-6719

Bucur, Stefan

2015

Download

Formats

Format
BibTeX
MARC
MARCXML
DublinCore
EndNote
NLM
RefWorks
RIS

Files

Abstract

Manual software testing is laborious and prone to human error. Yet, among practitioners, it is the most popular method for quality assurance. Automating the test case generation promises better effectiveness, especially for exposing corner-case bugs. Symbolic execution stands out as an automated testing technique that has no false positives, it eventually enumerates all feasible program executions, and can prioritize executions of interest. However, path explosionâthe fact that the number of program executions is typically at least exponential in the size of the programâhinders the applicability of symbolic execution in the real world, where software commonly reaches millions of lines of code. In practice, large systems can be efficiently executed symbolically by exploiting their modularity and thus symbolically execute the different parts of the system separately. However, a component typically depends on its environment to perform its task. Thus, a symbolic execution engine needs to provide an environment interface that is efficient, while maintaining accuracy and completeness. This conundrum is known as the environment problem. Systematically addressing the environment problem is challenging, as its instantiation depends on the nature of the environment and its interface. This thesis addresses two instances of the environment problem in symbolic execution, which are at opposite ends of the spectrum of interface stability: (1) system software interacting with an operating system with stable and well-documented semantics (e.g., POSIX), and (2) high-level programs written in dynamic languages, such as Python, Ruby, or JavaScript, whose semantics and interfaces are continuously evolving. To address the environment problem for stable operating system interfaces, this thesis introduces the idea of splitting an operating system model into a core set of primitives built into the engine at host level and, on top of it, the full operating system interface emulated inside the guest. As few as two primitives are sufficient to support a complex interface such as POSIX: threads with synchronization and address spaces with shared memory. We prototyped this idea in the Cloud9 symbolic execution platform. Cloud9's accurate and efficient POSIX model exposes hard-to-reproduce bugs in systems such as UNIX utilities, web servers, and distributed systems. Cloud9 is available at http://cloud9.epfl.ch. For programs written in high-level interpreted languages, this thesis introduces the idea of using the language interpreter as an "executable language specification". The interpreter runs inside a low-level (e.g., x86) symbolic execution engine while it executes the target program. The aggregate system acts as a high-level symbolic execution engine for the program. To manage the complexity of symbolically executing the entire interpreter, this thesis introduces Class-Uniform Path Analysis (CUPA), an algorithm for prioritizing paths that groups paths into equivalence classes according to a coverage goal. We built a prototype of these ideas in the form of Chef, a symbolic execution platform for interpreted languages that generates up to 1000 times more tests in popular Python and Lua packages compared to a plain execution of the interpreters. Chef is available at http://dslab.epfl.ch/proj/chef/.

Details

Title Improving Scalability of Symbolic Execution for Software with Complex Environment Interfaces

Author(s) Bucur, Stefan

Advisor(s)

Candea, George

Pagination 124

Date 2015

Publisher Lausanne, EPFL

Keywords

symbolic execution; program environments; systems software; interpreted languages

Language English

DOI https://doi.org/10.5075/epfl-thesis-6719

Other identifier(s) urn: urn:nbn:ch:bel-epfl-thesis6719-1

Laboratories DSLAB

Record Appears in Scientific production and competences > I&C - School of Computer and Communication Sciences > IINFCOM > DSLAB - Dependable Systems Laboratory
Scientific production and competences > EPFL Theses
Work produced at EPFL
Published
Theses

Record creation date 2015-07-08

Actions

Preview

Select file: