Billions and Billions of Constraints: Whitebox Fuzz Testing in Production
Ella Bounimova, Patrice Godefroid, and David Molnar
Track: Technical Research
We report experiences with constraint-based whitebox fuzz testing in production across hundreds of large Windows applications and over 500 machine years of computation from 2007 to 2013. Whitebox fuzzing leverages symbolic execution on binary traces and constraint solving to construct new inputs to a program. These inputs execute previously uncovered paths or trigger security vulnerabilities. Whitebox fuzzing has found one-third of all file fuzzing bugs during the development of Windows 7, saving millions of dollars in potential security vulnerabilities. The technique is in use today across multiple products at Microsoft. We describe key challenges with running whitebox fuzzing in production. We give principles for addressing these challenges and describe two new systems built from these principles: SAGAN, which collects data from every fuzzing run for further analysis, and JobCenter, which controls deployment of our whitebox fuzzing infrastructure across commodity virtual machines. Since June 2010, SAGAN has logged over 3.4 billion constraints solved, millions of symbolic executions, and tens of millions of test cases generated. Our work represents the largest scale deployment of whitebox fuzzing to date, including the largest usage ever for a Satisfiability Modulo Theories (SMT) solver. We present specific data analyses that improved our production use of whitebox fuzzing. Finally we report data on the performance of constraint solving and dynamic test generation that points toward future research problems.