LLSC: A Parallel Symbolic Execution Compiler for LLVM IR

LLSC: A Parallel Symbolic Execution Compiler for LLVM IR🔗

Guannan Wei,
Shanyin Tan,
Oliver Bračevac,
and Tiark Rompf

LLSC has been succeeded by GenSym: see paper and code.

1 Abstract🔗

We present LLSC, a prototype compiler for nondeterministic parallel symbolic execution of the LLVM intermediate representation (IR). Given an LLVM IR program, LLSC generates code preserving the symbolic execution semantics and orchestrating solver invocations. The generated code runs efficiently, since the code has eliminated the interpretation overhead and explores multiple paths in parallel. To the best of our knowledge, LLSC is the first compiler for fork-based symbolic execution semantics that can generate parallel execution code.

In this demonstration paper, we present the current development and preliminary evaluation of LLSC. The principle behind LLSC is to automatically specialize a symbolic interpreter via the 1st Futamura projection, a fundamental connection between interpreters and compilers. The symbolic interpreter is written in an expressive high-level language equipped with a multi-stage programming facility. We demonstrate the run time performance through a set of benchmark programs, showing that LLSC outperforms interpretation-based symbolic execution engines in significant ways.

PDF | Source code | Screencast

The demonstration paper has been accpeted at ESEC/FSE 21:

LLSC: A Parallel Symbolic Execution Compiler for LLVM IR (Demo)
Guannan Wei, Shangyin Tan, Oliver Bračevac, Tiark Rompf
Proceedings of The 29th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2021)

2 A Running Example using the Docker Image🔗

We provide a prebuild Docker image (of linux/amd64) that contains LLSC and all its dependencies. Run the following command will pull the prebuild Docker image to the local machine:

docker pull guannanwei/llsc-dev:0.1

To start a container llsc_demo instantiating this image, run the following command:

docker run --name llsc_demo --rm --ulimit='stack=-1:-1' -it guannanwei/llsc-dev:0.1 bash

Then, we should see the prompt of bash, for example:

root@80add122e534:/#

We need to pull and switch to the fse21demo branch:

cd /llsc/sai
git pull origin fse21demo
git checkout fse21demo

The folder demo contains necessary scripts and benchmark programs used in the paper. After change the working directory into demo, we first run the gen_jar.sh file to assemble compiled Scala/Java files into a jar file:

cd demo
./gen_jar.sh

The first time executing gen_jar.sh will take a while to download dependencies and to compile source files. After it finishes, we should be able to see that llsc.jar appears in the current demo folder.

To run llsc.jar, we can simply use scala llsc.jar. The script llsc provides a shallow wrapper of this command.

The folder benchmark contains C source files and their compiled LLVM IR programs (.ll). To demonstrate how to use LLSC, we use the quicksort example, which is a program that takes an array of 5 symbolic elements (see its corresponding C file). Symbolically executing this program will generate 5! (i.e. 120) different sorted concrete arrays, therefore we expect to see 120 valid paths that will be discovered by symbolic execution.

We compile the quicksort.ll LLVM IR using the following command:

./llsc benchmarks/quicksort.ll demo_qsort @main 0

demo_qsort is the task name of this compilation, and @main is the entrance function name. The last argument 0 is the number of symbolic arguments we would like to pass to the entrance function (in this case @main does not take arguments). After that, the generated C++ files that perform symbolic execution are stored in llsc_gen/demo_qsort. We then can run make to compile those C++ files to an executable, which is also named demo_qsort:

cd llsc_gen/demo_qsort
make -j 4

Next, we shall just run the executable with argument 1, which specify the number of threads we want to use:

./demo_qsort 1

While running, the program prints the block and path coverage information. After it finishes, the program prints the time spent to explore such numbers of blocks/paths, for example:

[0.665 s] #blocks: 10/10; #paths: 120

Meanwhile, the test cases generated (by invoking the STP solver) for these 120 paths are stored in the folder tests.