Proving Memory Safety of Floating-Point Computations by Combining Static and Dynamic Program Analysis
Whitebox fuzzing is a novel form of security testing based on runtime symbolic execution and constraint solving. Over the last couple of years, whitebox fuzzers have found dozens of new security vulnerabilities (buffer overflows) in Windows and Linux applications, including codecs, image viewers and media players. Those types of applications tend to use floating-point instructions available on modern processors, yet existing whitebox fuzzers and SMT constraint solvers do not handle floating-point arithmetic. Are there new security vulnerabilities lurking in floating-point code? A naive solution would be to extend symbolic execution to floating-point (FP) instructions (months of work), extend SMT solvers to reason about FP constraints (months of work), and then face more complex constraints and an even worse path explosion problem. Instead, we propose an alternative approach, based on the rough intuition that FP code should only perform memory-safe data-processing of the ``payload'' of an image or video file, while the non-FP part of the application should deal with buffer allocations and memory address computations, with only the latter being prone to buffer overflows and other security critical bugs. Our approach combines (1) a lightweight local path-insensitive ``may'' static analysis of FP instructions with (2) a high-precision whole-program path-sensitive ``must'' dynamic analysis of non-FP instructions. The aim of this combination is to prove memory safety of the FP part and a form of non-interference between the FP part and the non-FP part with respect to memory address computations. We have implemented our approach using two existing tools for, respectively, static and dynamic x86 binary analysis. We present preliminary results of experiments with standard JPEG, GIF and ANI Windows parsers. For a given test suite of diverse input files, our mixed static/dynamic analysis is able to prove memory safety of FP code in those parsers for a small upfront static analysis cost and a marginal runtime expense compared to regular runtime symbolic execution.
Record created on 2011-07-25, modified on 2016-08-09