Recipes

Learning curve. In our experience, it does not take long for programmers to pick up and start applying design-by-contracts. Although only anecdotal, we observed that it takes junior programmers one to two weeks to get up to speed and start writing meaningful contracts in their code.

In general, writing contracts tends to be easy: while some conditions are really hard, most conditions are rather trivial. However, it takes a bit till it “clicks” so that writing contracts becomes a second nature when writing code. To help people with learning and training, we compiled a list of common recipes (i.e. patterns) that we often observed during the development.

Benefits of the recipes. In our experience, oftentimes it suffices if you just write down the obvious contracts. This will usually already substantially improve the correctness of your code and catch many bugs. To that end, recipes help you so that you do not have to spend so much mental energy on contracts, while reaping the benefits of the low-hanging fruits.

Suggestions. The following list of recipes is of course far from comprehensive. We collected the patterns based on our every-day programming, so they are intrinsically biased towards the areas we work on. If you developed contract patterns of your own that you do not see listed here, please feel free to suggest extensions! The easiest way for us to organize suggestions is if you create a new issue on our GitHub.

Corpus. We collected an extensive corpus of Python programs annotated with contracts, Python-by-contract corpus,. The corpus should serve both education purposes and as a testbed for the tools and libraries in the ecosystem.

We will use the snippets from the corpus as examples for the individual recipes. In the following, we will omit to put an explicit identifier of the corpus for brevity.

Serialize / Deserialize Pair

De/serialization data from different formats is a commonplace in many programs. Whenever you have a pair (a serialization and a de-serialization function), you should make it explicit in the contract how their inputs are connected.

For example, tokenizing a text and then converting the tokens back to text should give you the original input text.

From ethz_eprog_2019/exercise_11/problem_02.py#L159:

@ensure(
    lambda text, result:
    tokens_to_text(result) == text  # type: ignore
)
def tokenize(text: str) -> List[Token]:
    ...

@ensure(lambda tokens, result: tokens == tokenize(result))
def tokens_to_text(tokens: Sequence[Token]) -> str:
    ...

This is a particularly nice scenario for auto-testing tools such as crosshair and icontract-hypothesis. Since the input is usually completely defined, the auto-testing tools can be readily used and often reveal relevant bugs in practice.

Encapsulation of Immutable Types

Certain contracts repeat throughout the code. If they operate on immutable data types, such as Sequence or int, you can encapsulate them into a new data type. Instead of using __init__, you use __new__ for the construction and simply cast the input object to that new type after the pre-conditions were checked.

Consider Lines, a data structure representing an immutable sequence of strings which do not contain new-line characters.

From common.py#L18:

class Lines(DBC):
"""Represent a sequence of text lines."""

    # fmt: off
    @require(
        lambda lines:
        all('\n' not in line and '\r' not in line for line in lines)
    )
    # fmt: on
    def __new__(cls, lines: Sequence[str]) -> "Lines":
        return cast(Lines, lines)

Another good example is an identifier, a string complying to a certain pattern.

From ethz_eprog_2019/exercise_11/problem_02.py#L242:

IDENTIFIER_RE = re.compile(r"[a-zA-Z_][a-zA-Z0-9]*")


class Identifier(DBC, str):

    @require(lambda value: IDENTIFIER_RE.fullmatch(value))
    def __new__(cls, value: str) -> "Identifier":
        return cast(Identifier, value)

This pattern is very efficient — the underlying structure is left as-is and reaps all the benefits of efficient implementation — while it gives you additional static and runtime guarantees.

Note

If you use this pattern with icontract-hypothesis, you can not directly inherit from types such as Sequence as this will cause problems with the underlying Hypothesis library. Please see Hypothesis issue #2951.

For the time being (2021-07-01), the best approach is to annotate your data structure with all the methods that you actually need. See the class Lines at common.py#L18 for a full example.

Unique Elements in a Sequence

If you want to assert that the elements of a sequence are unique, you can use Pigeonhole Principle. Namely, if the elements in a collection are unique, the size of the corresponding set is equal to the size of the collection:

some_collection = list(...)
is_unique = len(set(some_collection)) == some_collection

From aoc2020/day_22_crab_combat.py#L40:

class Deck(DBC):
    """Represent a deck of cards."""

    @require(
        lambda cards:
        len(set(cards)) == len(cards),
        "Unique cards"
    )
    def __init__(self, cards: Sequence[int]) -> None:
        ...

Contracts on Elements of a Collection

We can use built-in functions all and any to model the contracts on the elements of a collection. Coupled with generator expressions, this gives very readable and elegant conditions.

From aoc2020/day_11_seating_system.py#L52:

import re

class Layout:
    """Represent a seat layout."""

    @require(
        lambda table:
        len(table) > 0
        and len(table[0]) > 0
        and all(
            len(row) == len(table[0])
            for row in table
        )
    )
    @require(
        lambda table:
        all(
            re.fullmatch(r"[L#.]", cell)
            for row in table
            for cell in row
        )
    )
    def __init__(self, table: List[List[str]]) -> None:
        """Initialize with the given values."""
        ...

Elements of a Sequence Sorted

If we need to assert that the elements of a sequence are sorted, we can used the built-in function sorted to succinctly formulate the condition:

some_collection = list(...)
is_sorted = sorted(some_collection) == some_collection

From ethz_eprog_2019/exercise_03/problem_04.py#L38:

assert all("".join(sorted(key)) == key for key in TO_NUMBER)

Non-overlapping Sorted Ranges

Ranges (also called intervals) are usually defined with a start and an end. In many problems they are given in a sorted sequence where no overlap is expected. For example, if you have to model tags of a text, where the tags do not overlap.

Before we formulate the condition, let us first introduce a helper function to iterate over two consecutive elements in a sequence.

From common.py#L88:

def pairwise(iterable: Iterable[T]) -> Iterable[Tuple[T, T]]:
    """Iterate over ``(s0, s1, s2, ...)`` as ``((s0, s1), (s1, s2), ...)``."""
    previous = None  # type: Optional[T]
    for current in iterable:
        if previous is not None:
            yield previous, current

        previous = current

Assuming the ranges are sorted and non-overlapping, with inclusive start and exclusive end, the end of the previous range must equal the start of the current range.

From ethz_eprog_2019/exercise_12/problem_01.py#L88:

class Token(DBC):
    @require(lambda start, end: start < end)
    def __init__(self, ..., start: int, end: int, ...) -> None:
        ...

and from ethz_eprog_2019/exercise_12/problem_01.py#L147:

@ensure(
    lambda result:
    all(
        token1.end == token2.start
        for token1, token2 in common.pairwise(result)
    ),
    "Tokens consecutive"
)
def tokenize(text: str) -> List[Token]:
    """Tokenize the given ``text``."""
    ...

If you want to allow “holes” between the ranges, just change the equality to the less comparison:

token1.end < token2.start

Material Conditional (“If … then …”)

You can toggle contracts at load time of a module using enabled argument (see Toggling Contracts). This mechanism can not be used at runtime as we need toggling to be efficient and performed only once.

However, some contracts depend on the runtime values. For example, a condition on some value might apply only if the value is negative, or if a value is not None.

We can use the material conditional to formulate “if … then …” conditioning in the contracts. The implication, A B, means that the logical statement B must hold if the logical statement A holds.

Unlike Eiffel Programming Language, Python does not provide the implication operator, but we can rewrite the material implication as: ¬A B. In Python, this is written as: not A or B.

From ethz_eprog_2019/exercise_05/problem_03.py#L217:

class Range:
    start: Final[float]
    end: Final[float]

    ...

@ensure(
    lambda ranges, value, result:
    not (value < ranges[0].start) or result == -1,
    "Value not covered in ranges => bin not found"
)
def bin_index(ranges: BinRanges, value: float) -> int:
    """Find the index of the bin range among ``ranges`` corresponding to ``value``."""
    ...

This pattern also nicely plays with sphinx-icontract which renders the material implication with the proper symbol . See the generated documentation of bin_index.

Compare against a Redundant Implementation

When you need to get a complex algorithm right, a common approach is to provide multiple redundant implementations. The hope is that the bugs will not replicate across the implementations. For example, the same task is given to different teams, or the same problem is tackled with different algorithms.

We observe often in practice that only two implementations are sufficient:

  1. An optimized complex one, which has a high probability of bugs, and

  2. A naïve inefficient one. This implementation takes substantially longer to run or does not scale with the large input at all. However, it is much easier to read and verify, and the probability of bugs is much smaller.

You assert then in code that the output of the optimized implementation matches the naïve one.

From ethz_eprog_2019/exercise_04/problem_01.py#L39, where we compute the Sieve of Eratosthenes:

@ensure(
    lambda result:
    all(
        naive_is_prime(number)
        for number in result
    )
)
def sieve(limit: int) -> List[int]:
    """
    Apply the Sieve of Eratosthenes on the numbers up to ``limit``.
    :return: list of prime numbers till ``limit``
    """
    ...

Usually you toggle this contract using enabled argument at load time so that it is only applied in the testing environments where you know that the input is small enough for the naïve implementation (see Toggling Contracts):

IN_TESTING_ENVIRONMENT = ...

@ensure(
    lambda result:
    all(
        naive_is_prime(number)
        for number in result
    ),
    enabled=IN_TESTING_ENVIRONMENT
)

Alternatively, you can use material conditional to limit the contract on small inputs at runtime (see the recipe Material Conditional (“If … then …”)):

IN_TESTING_ENVIRONMENT = ...

@ensure(
    lambda result:
    not (max(result) < 100 and len(result) <= 10)
    or all(
        naive_is_prime(number)
        for number in result
    )
)

Exclusive Or (“Either … or …”)

Python already provides an exclusive or operator (^), so we can directly use it to model exclusive properties in the contracts. For example, a function returns either a valid result or an error message (but not both). Another example is if a function expects either one or the other argument to be specified (but, again, not both).

From ethz_eprog_2019/exercise_08/problem_05.py#L59:

@ensure(
    lambda pos, result:
    all(
        (next_pos.x == pos.x and next_pos.y != pos.y)
        ^ (next_pos.x != pos.x and next_pos.y == pos.y)
        for next_pos in result
    ),
    "Next is either in x- or in y-direction"
)
def list_next_positions(pos: Position) -> Sequence[Position]:
    """List all the possible next positions based on the current position ``pos``."""
    ...

Intermediate Variables

Long and complex contracts can become unwieldy, especially if certain expressions are re-used throughout the condition. Python’s Assignment Expressions (a.k.a. walrus operator, :=) is particularly useful in such situations as it allows you to introduce a bit of “procedural” programming into the generally declarative conditions.

From ethz_eprog_2019/exercise_02/problem_03.py#L53:

@ensure(
    lambda result:
    all(
        (
                center := len(line) // 2,
                line[:center] == line[center + 1:][::-1]
                if len(line) % 2 == 1
                else line[:center] == line[center:][::-1]
        )[1]
        for line in result
    ),
    "Horizontal symmetry"
)
def draw(width: int, height: int) -> Lines:
    """Draw the pattern of the size ``width`` x ``height`` and return the text lines."""
    ...

Note

The walrus operator assigns the computed value to a variable and evaluates the value back. Hence we use a tuple to make the code “procedural” and re-use the assigned variable in the second element of the tuple.

Note

The introduction of the walrus operator (PEP 572) did spark some controversy and even lead to resigning of Guido van Rossum as Benevolent Dictator For Life (see this article about PEP 572 on lwn.net). With that said, you have to be careful not to abuse the feature.

For example, if the condition becomes too complex, it may be worth considering refactoring the code into a separate function (and testing it independently as well). Putting contract condition in a separate function has a disadvantage, though, that you do not get as informative violation messages (see Section Usage) and they hinder the tools like icontract-hypothesis.