Sometimes, when working on a complex piece of software — or even a simple program or library with a non-trivial number of moving parts — you run into an issue and have absolutely no idea how to reduce your code to the minimum necessary in order to reproduce the error.
For the past several months I've been blocked by some trait-resolution errors in Symtern, my general-purpose interner crate. I knew the errors were somehow related to my attempts to work around the lack of associated-type constructors — but how? Some of the errors rustc was printing didn't even have line numbers!
This is exactly the kind of problem that software like C-Reduce is meant to solve: by iteratively removing parts of an input file and calling a user-provided executable to test whether the modified input still triggers the unwanted behavior, C-Reduce is able to reliably reduce test cases. And it works with Rust programs! With a little massaging we can use it for entire crates, even those with external dependencies.
C-Reduce works something like this:
1. The user runs creduce with two command-line arguments: the name of a test-driver executable and the name of a file containing source code that triggers "interesting" behavior.

       $ creduce driver.sh crate.rs

2. C-Reduce runs the driver executable with no arguments. If the source file named on creduce's command line is still "interesting", the driver should exit with a status of 0; any other exit status ends the current reduction attempt. (Continue to step 3 only when the driver returns 0.)

   A driver executable might be, for example, a simple shell script that compiles the named source and checks that the compiler emits the expected error:

       #!/bin/sh
       rustc crate.rs > rustc.out 2>&1
       # Check for ICEs
       grep -q 'internal compiler error' rustc.out
       exit $?

   We could just as easily run a generated executable and check for aberrant behavior instead:

       #!/bin/sh
       # Name the output explicitly so we know which file to run.
       rustc -o a.out crate.rs && ./a.out > a.out.log
       grep -q 'wrong branch' a.out.log
       exit $?

3. In parallel, C-Reduce masticates the source file using a set of transforms that reduce its size. After each transform completes, it returns to step 2 with the transform's output as the new source file; C-Reduce keeps only the smallest transformed source files that remain "interesting".

4. Steps 2 and 3 repeat until no further reduction occurs. At this point, C-Reduce prints the smallest transformed source file to its standard output and replaces the original source file with it.
Note that this explanation is a description of "delta debugging" rather than of C-Reduce in particular, and fails to account for many of the latter's more salient qualities.
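To make the loop concrete, here is a deliberately naive sketch of the same idea in shell. It is not how C-Reduce is implemented: it tries only one transform (deleting a single line), runs nothing in parallel, and knows nothing about syntax. The function name reduce_lines and its arguments are made up for this illustration.

```sh
#!/bin/sh
# Naive line-at-a-time reducer (illustrative only; NOT C-Reduce's algorithm).
#
# Usage: reduce_lines FILE DRIVER
#   FILE    file to shrink in place
#   DRIVER  command run with no arguments; exit status 0 means FILE is
#           still "interesting"
reduce_lines() {
    file=$1
    driver=$2
    changed=1
    # Keep making passes over the file until a full pass removes nothing.
    while [ "$changed" -eq 1 ]; do
        changed=0
        total=$(( $(wc -l < "$file") ))
        n=1
        while [ "$n" -le "$total" ]; do
            cp "$file" "$file.orig"
            sed "${n}d" "$file.orig" > "$file"    # candidate: drop line n
            if "$driver"; then
                # Still interesting: keep the smaller file. Line n now
                # holds what used to be line n+1, so n stays put.
                total=$((total - 1))
                changed=1
            else
                # Not interesting any more: restore and move on.
                cp "$file.orig" "$file"
                n=$((n + 1))
            fi
        done
    done
    rm -f "$file.orig"
}
```

Calling reduce_lines crate.rs ./driver.sh would shrink crate.rs one line at a time for as long as driver.sh keeps exiting with 0. The real tool gets much further than this because its transforms understand the structure of the code.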
Earlier I promised a way to use C-Reduce on an entire crate — yet C-Reduce can only chew on a single input file at a time! We'll have to squeeze the crate a little to make it work.
C-Reduce expects us to feed it a single input file to munch on; luckily, rustc provides a simple way to "expand" all mod declarations, much as #include lines in C or C++ source are expanded by the C preprocessor:
$ rustc -Z unstable-options --pretty=normal src/lib.rs > crate.rs
(At the time of writing, you'll get a warning if you do this using a non-nightly compiler; it's not entirely clear if Rust's compiler devs plan to explicitly support anything like this on stable in the future.)
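Before handing the flattened file to C-Reduce, it may be worth confirming that no out-of-line module declarations (mod foo;) survived the expansion. The following is a rough heuristic, not a parser: the function name find_unexpanded_mods is hypothetical, and the pattern will also match declarations that appear inside comments or string literals.

```sh
#!/bin/sh
# Print (with line numbers) any `mod name;` declarations left in FILE.
# Like grep, exits with status 0 if at least one was found, 1 otherwise.
# Heuristic only: it does not understand comments, strings, or macros.
find_unexpanded_mods() {
    grep -nE '^[[:space:]]*(pub(\([^)]*\))?[[:space:]]+)?mod[[:space:]]+[A-Za-z_][A-Za-z0-9_]*[[:space:]]*;' "$1"
}
```

If find_unexpanded_mods crate.rs prints nothing, every module it could recognize has been inlined as a mod name { ... } block.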
Although we now have our single source file, we still need to worry about passing the right flags to rustc so it can find any crates named in extern crate declarations.
Why don't we ask Cargo how to do that?
$ cargo build -v 2>&1 \
| grep 'Running `rustc' | tail -n 1 \
| sed -e 's/^.*Running `\(.\+\)`$/\1/g'
This should print the rustc
invocation Cargo uses for the current crate.
We'll need to adjust the resulting command a little before we use it:
- src/lib.rs or src/main.rs must be replaced with the name of the single source file we created earlier (crate.rs).
- The --out-dir /path/to/my-crate/target/debug/deps option should probably be removed, since C-Reduce will be running our driver script in parallel, using a separate temporary directory for each run.
We expect (or hope) that any extern crate
lines will be eliminated by
C-Reduce anyway, so going to the extra trouble of supporting external
crates lets us skip manually extricating any external dependencies from
our code — a big win.
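The extraction and the two adjustments described above can be sketched as a pair of small shell filters. The function names extract_rustc_cmd and adjust_rustc_cmd are made up for this example, and the sed patterns assume that neither the paths in the command nor the replacement file name contain spaces, "#", or "&".

```sh
#!/bin/sh
# Read `cargo build -v` output on stdin and print the last rustc command
# line it reports (the text between the backticks after "Running").
extract_rustc_cmd() {
    grep 'Running `rustc' | tail -n 1 | sed -e 's/^.*Running `\(.*\)`$/\1/'
}

# Read a rustc command line on stdin and print an adjusted copy.
# Usage: adjust_rustc_cmd NEW_SOURCE
#   - drops the --out-dir option together with its argument
#   - replaces the crate root (src/lib.rs or src/main.rs) with NEW_SOURCE
adjust_rustc_cmd() {
    sed -e 's# --out-dir [^ ]*##' \
        -e "s# [^ ]*src/lib\.rs# $1#" \
        -e "s# [^ ]*src/main\.rs# $1#"
}
```

Chained together, cargo build -v 2>&1 | extract_rustc_cmd | adjust_rustc_cmd crate.rs should print a command suitable for pasting into a driver script.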
While we could do all this manually, typing the commands directly into a terminal, it's both cumbersome and unnecessary. Enter my favorite underappreciated tool, GNU Make.
Using Make allows us to automate most of the steps involved.
The first rule defined in a makefile is the default goal, used when no goal is
specified on Make's command line. We probably want that target to be the one
that calls creduce
:
.PHONY: reduce
SOURCE ?= crate.rs
DRIVER ?= driver.sh

# $^ expands to every prerequisite: the driver first, then the source.
reduce: $(DRIVER) $(SOURCE)
	creduce $^
For the sake of flexibility we've made it possible to specify an alternate driver script or source file on the Make command line with e.g.
$ make DRIVER=alternate-driver.sh
or
$ make SOURCE=alternate-source.rs
Specifying both would just be an overly-verbose way of calling C-Reduce directly; specifying only one of them allows us to depend on the Makefile's machinery for the other.
Next we'll define the rule for creating our expanded source file on-demand:
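# GNU Make's wildcard function does not recurse into subdirectories,
# so we list the crate's sources with find instead.
crate.rs: $(shell find src -name '*.rs')
	cargo rustc -- -Z unstable-options --pretty=normal > $@ \
	  || (rm -f $@; exit 1)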
Defined this way, it will be updated whenever we change one of our crate's sources. Good.
Now let's tackle the driver script. Its build rule is this:
driver.sh: Makefile
	$(file > $@,$(DRIVER_SCRIPT))
	chmod +x $@
We've used GNU Make's file
builtin instead of an echo
shell command in
order to bypass the shell quoting we would otherwise need.
We'll define the script itself right in the Makefile (thus the dependency),
which enables us to fetch and transform the rustc
invocation when
DRIVER_SCRIPT
is expanded.
# To use shell variables in the driver script, we'll need to escape
# Make's variable expansion using `$$`.
define DRIVER_SCRIPT
#!/bin/sh
# If you need to check that C-Reduce hasn't removed important parts of
# your source, do it here (before we go to the trouble of compiling).
#
#grep -q 'some_required_pattern' $(SOURCE) || exit 1

# Compile the source.
$(RUSTC_CMD) > rustc.out 2>&1

# Check for expected output from rustc.
grep -q 'internal compiler error' rustc.out
exit $$?
endef
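RUSTC_CMD is defined as follows,

RUSTC_CMD = $(patsubst src/%.rs,$(SOURCE),$(shell $(CARGO_BUILD_CMD)))

with CARGO_BUILD_CMD defined as a simply-expanded variable (:=):

CARGO_BUILD_CMD := cargo build -v 2>&1 | grep 'Running `rustc' | tail -n 1 | sed -e 's/^.*Running `\(.*\)`/\1/' -e 's/ --out-dir [^ ]*//'

We strip the --out-dir option and its argument with sed rather than with patsubst because patsubst works on one whitespace-separated word at a time, so a pattern containing a space can never match.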
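The commented-out grep in the driver script hints at a second check: that the reducer hasn't deleted the very code we care about. One way to keep both checks together is a small helper like the one below. This is only a sketch; the name is_interesting and its four arguments are hypothetical, and the patterns you pass would depend on the error you are chasing.

```sh
#!/bin/sh
# Decide whether a candidate is still "interesting".
#
# Usage: is_interesting SOURCE LOG ERROR_PATTERN REQUIRED_PATTERN
#   SOURCE            the candidate source file
#   LOG               captured compiler output for that candidate
#   ERROR_PATTERN     text the compiler output must contain
#   REQUIRED_PATTERN  text the source must still contain
is_interesting() {
    source_file=$1
    log_file=$2
    error_pattern=$3
    required_pattern=$4
    # Reject candidates from which the important code has been removed.
    grep -q -e "$required_pattern" "$source_file" || return 1
    # Accept only if the compiler reported the error we are looking for.
    grep -q -e "$error_pattern" "$log_file"
}
```

Inside a driver script this would be called after the compile step, followed by exit $? so that C-Reduce sees the result. Making the error pattern as specific as possible helps keep the reduction from drifting toward some unrelated error.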