Introduction

We present here a corpus of Python programs annotated with contracts.
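To give a concrete impression of the annotations, the following is a minimal sketch of such a program, assuming the icontract library; the function itself is hypothetical and merely illustrates how pre- and post-conditions are attached to ordinary Python code.

    from typing import List

    import icontract


    @icontract.require(lambda lst: len(lst) > 0)          # pre-condition: non-empty input
    @icontract.ensure(lambda lst, result: result in lst)  # post-condition: result comes from the input
    def find_max(lst: List[int]) -> int:
        """Return the largest element of a non-empty list."""
        return max(lst)

The pre-condition documents which inputs are valid; as discussed below, it is exactly this information that automatic testing tools exploit when they generate inputs.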

The corpus includes:

Design-by-contract is still not widely practiced in Python, due to factors such as the community's unfamiliarity with the concept and with the available tools. As a consequence, we could not find a sufficiently large and representative code base that would serve as a good testbed for the automatic tools.

Hence, we employ the solutions to the aforementioned exercises as a benchmark to compare and evaluate different approaches to the automatic testing of Python code.

We expect this data set to help us discover blind spots. The tools usually generate the inputs to the functions based on the type annotations and pre-conditions, so the corpus helps us answer questions such as:

  • Which families of pre-conditions are supported?

  • What are the limits of a testing tool and which pre-conditions are not supported?

  • Which kinds of inputs can be generated in a computationally efficient way?
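As an illustration of these questions, consider the following sketch, assuming the Hypothesis library; the function under test is hypothetical. The input strategy is derived from the type annotation, while the pre-condition is enforced by filtering out invalid samples.

    from hypothesis import given, strategies as st


    def middle(lst):
        """Return the middle element; pre-condition: the list is non-empty."""
        assert len(lst) > 0
        return lst[len(lst) // 2]


    # Inputs are drawn according to the annotated type (a list of integers);
    # the pre-condition is satisfied by rejecting empty lists.
    @given(st.lists(st.integers()).filter(lambda lst: len(lst) > 0))
    def test_middle(lst):
        assert middle(lst) in lst

Rejection-based filtering works well for a loose pre-condition such as this one, but becomes computationally expensive for very restrictive ones; a tool that constructs valid inputs directly (here, st.lists(st.integers(), min_size=1)) avoids the issue. Such differences are precisely what the corpus is meant to expose.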

We hope that this code base benefits not only the community of tool developers but also the tool users by exposing the trade-offs involved. In particular, users should learn from practical examples how the tools differ, understand their limits, and recognize in which cases one tool is better than another.