- The first challenge was solved twice: using an abstract program and using a Java program. All other programs are written in the ASM style abstract programming language of KIV, which is a language with while loops and recursive procedures that operates on arbitrary algebraic data types.
- The problems were solved by Gidon Ernst, Gerhard Schellhorn, Kurt Stenzel and Bogdan Tofan. Each of us solved one problem and contributed to one other.
- No changes to the downloadable KIV system were necessary to do the proofs. We did a few minor tweaks to the xml-outputs you can browse here compared to what you will get from the downloadable system (e.g., we added a page explaining the color coding of proof tree nodes).

The underlying boolean array data structure is modeled as an algebraic array of booleans, by actualizing the generic elements of the array datastructure from KIV's library (specification array) with elements of type boolean, see specification boolarr. Note that array indices in KIV are natural numbers, not integers. To conform with the requirements of having itegers as indices in the algorithm, we use a standard conversion function "i->n: int -> nat", which computes the absolute value and is specified in the library specification intnat.

In specification arr2list we define the central behavioral correctness conditions "issorted(ar)" and "isperm(ar0, ar1)". The sortedness of a boolean array "ar" is defined in a straightforward manner. Whether a boolean array "ar0" is a permutation of another boolean array "ar1" is defined using the existing library theory that defines permutations on lists (specification list-perm). That is, ar0 is a permutation of ar1 iff the corresponding lists of boolean values are permutations. The conversion from boolean arrays to boolean lists (type "list") is done by a recursively defined function "arr2list: array, nat -> list", where the nat parameter gives the upper index bound for conversion (typically the size of the array # ar minus one.). Finally, the order on booleans is defined.

The central proof of correctness and termination can be found in the theorem base of specification 2-way-sort. The proof of theorem two_way_sort-corr-term shows that two_way_sort# terminates, (the double-angle "strong diamond" brackets are the wp-operators expressing total correctness) and gives a sorted permutation of the array. The decreasing term for induction is i→n(j + 1) -- i->n(i)". The central properties of the loop invariant are:

- Below index i, all values of the current array are "false"
- Above index j, all values of the current array are "true"
- The current array is a permutation of the initial array

The proof uses the standard invariant rule for total correctness with this invariant and this decreasing term (see the rule arguments of ), with variables adjusted to match the current goal.

The main lemma required is perm-switch, which is already proved over library specification list-perm. It is used to ensure that swapping of array values results in a permutation of the initial array.

The verification given does not check whether array access during execution of the program is safe: access / modification out of bounds would return an unspecified value / array (so we could surely not prove that the result is a permutation), but would not abort the program. The second solution with Java avoids this problem. Another technique to guarantee safety is to use access programs as in Challenge 3.

- We consider a (compilable) Java-implementation of the given algorithm (included in this class definition). It works on an array data structure in an explicit heap "st" of type "store" (see specification javastore)
- The proof additionally ensures safe array access, safe heap access, ...

The main proof of safety and total correctness is two-way-sort-corr-term in the theorem base of the toplevel specification. It uses the diamond operator of Dynamic Logic for termination (which is sufficient for deterministic programs). An additional postcodition "st[_mode] = noval" ensures that the program terminates regularly without raising an exception. In particular, no "ArrayIndexOutOfBoundsException" exception has occured.

The proof proceeds much like the one from solution 1. It uses symbolic execution and a Hoare-style loop invariant which contains the same core correctness conditions as in solution 1. Additionally, some type and structure information for safe heap access is required to ensure that the program executes "regularly". We do not explain them here (see, e.g., the KIV documentation for more information). The termination argument uses the same variant as in solution 1. Note that for every array access and update a goal with "ArrayIndexOutOfBoundsException" is created if KIV can not ensure automatically that the case is impossible (steps where an exception is thrown are red, filled circles in the proof tree). That such a case is impossible then has to be shown interactively.

- Implement the reduction routine. TOPRED performs a toplevel reduction step, RED1 a single reduction step, and RED implements the full reduction. Sind RED uses RED1 which uses TOPRED a proof for RED usually requires suitable theorems for RED1 and TOPRED.
- Prove that if RED(t) returns t' then t can be reduced to t' and t' and is irreducible. The proof uses induction on the number of procedure calls (both RED and RED1 are implemented as recursive procedures). Bounded procedure calls are part of the Dynamic Logic calculus of KIV. An important lemma is that
RED1 always terminates and if it did not reduce the term then it is indeed irreducible and otherwise RED1 behaves as ->.

We also proved the other direction, i.e. if t can be reduced to an irreducible t' then RED(t) terminates and returns t'. This proof uses induction on the number of steps in t ->* t' (defined in specification reduce).

A required auxiliary theorem is if t -> t1 and t -> t2 then t1 = t2 (a very strong local confluence property).

Theorem RED-iff-reduce is the if-and-only-if theorem.

As a fun theorem we also proved that SII(SII) loops directly for the program (i.e. without using the specification or the if-and-only-if theorem). The proof again uses induction on the number of procedure calls. - Prove that RED terminates on any term not containing S by induction on the size of the term. (Without S only K reduction steps are possible which make the term smaller.) A size function is generated automatically for data types in KIV. The lemmas state that if RED1 or TOPRED reduced the term then its size has decreased.
- Prove that RED(ks n) returns K for even n, and (K K) for odd n where ks 0 = K, and ks (n+1) = ((ks n) K). The proof uses induction on n and the induction hypothesis must be applied for n - 2. The lemma about RED1 states that (ks n) is irreducible for n < 2 and otherwise returns (ks (n - 2))..

A straightforward specification of the given queue algorithms in KIV's abstract programming language is defined in specification queue-impl. The buffer's data array may contain elements of a generic type "elem" (instead of the specific type "int"). Type "ring_buffer" is defined in specification ring-buffer with constructor "mk" and access functions ".data", ".size", ".first" and ".len" respectively. To ensure safe access to the buffer's data in the queue algorithms, we use additional procedures "get#(i, ar; getval)" and "put#(i, putval; ar)" on an array "ar", to get respectively set a value at position "i" in "ar". (A semicolon separates input from in-output parameters.) These access operations diverge (abort) whenever the access index "i" is out of bounds and thus a proof can not be completed. For the test harness, we define a further operation "assert(b;)", which aborts if the boolean input value "b" is false and skips otherwise.

- The buffer's size is not zero and equal to the size of the data array
- The buffer's ".first"-index is never out of bounds and the buffer's ".len" is non-negative and does not exceed the buffer's size

- The represented queue must have the same size as the buffer and contain the elements that are computed by a recursive conversion function "tolist: ring_buffer, nat --> queue"
- Function "tolist" basically returns the elements from ".first" to "(.first + .len) % .size" in the buffers ".data" array

Basically, we show that the create method establishes a valid initial state where the invariant and the abstraction predicate hold. Then we show that each operation maintains the invariant and a valid queue representation. The main proofs can be found in the theorem base of specification refinement.

The proof of correctness of a push operation (theorem "push-corr") on a non-full ring buffer shows that the algorithm is access safe, maintains the invariant and corresponds to an abstract push of the input value on the represented algebraic queue. Preservation of the invariant is straightforward to prove. The preservation of the abstraction predicate basically discerns whether the access has to wrap around or not. It benefits from KIV's library theory for modulo operations (see specification int-div).

The proof of correctness of a pop operation (theorem "pop-corr") on a non-empty ring buffer shows that the algorithm is access safe, maintains the invariant and corresponds to an abstract pop on the algebraic queue, which returns the queues oldest value. Preservation of the invariant and abstraction predicate again benefits from KIV's library theory.

The proof of the test harness (theorem theorem test), automatically symbolically steps through the code until termination, i.e., no intermediate assertion is violated.An overview over the project can be found here. The implementation of the procedures `build#`

and `build_rec#`

is in specification build.
Here is a list of auxiliary lemmas, and here are the
main theorems.

The types are defined in the specifications tree, treeorfail and list:

```
tree = Leaf | Node(l : tree, r : tree)
list = [] | (head : int) + (pop : list)
treeorfail = fail | some(t : tree)
```

Variable conventions in the following:

```
t,t0,t1 : tree
to : treeorfail
d,h,i,j : int
x,y : list
```

The proof relies on a function

```
depths : tree × int → list
```

that computes the list of depths of leaves. It is defined recursively as

```
depths(Leaf, i) = i '
depths(Node(t0,t1), i) = depths(t0, i + 1) + depths(t1, i + 1)
```

where `i '`

is the singleton list consisting of `i`

and `x + y`

is list concatenation (`+`

is overloaded). We show, that no tree has an empty list of depths.

```
depths-nonnil: depths(t,i) ≠ []
```

Also, we need a maximum function on lists (undefined for `[]`

):

```
max : list → int
max(i ') = i
max(i + j + x) = max(max(i,j), max(min(i,j) + x))
```

where `min`

and `max`

are overloaded on pairs of integers. The recursive case extracts the greater of two leading elements.

Lemmas `max-ge`

(proof) and `max-in`

(proof) assert that `max`

computes the greatest element in the list.

```
max-ge: i ∊ x → x ≤ max(x)
max-in: x ≠ [] → max(x) ∊ x
```

Let `to`

be the result of program `build#`

.

For soundness we prove

`⟨ build#(x; to) ⟩ to ≠ fail → x = depths(to.t, 0)`

that is whenever

`build#`

returns successfully then the depths of the result are as specified by the input.For completeness we prove

`⟨ build#(x; to) ⟩ to = fail → ¬ ∃ t. x = depths(t, 0)`

that is whenever

`build#`

fails then there is no tree with depths as specified by the input.For termination we have to find an upper bound

`j`

of the recursion depth of`build_rec#`

The upper bound can be derived from the termination conditions `d = head(x)`

and `d > head(x)`

: an upper bound `j`

is `max(x) ≤ j`

.

The main work is done in two lemmas for `build_rec#`

(soundness and completeness). The proofs goes by induction on `j`

and generalize the upper bound (precondition) to

```
d ≤ j ∧ max(x) ≤ j
```

as well as the postconditions

```
to ≠ fail → x0 = depths(to.t, d) + x
to = fail → ¬ ∃ x1,t. x0 = depths(t, d) + x1
```

where `x0`

is the initial list and `x`

the returned one.

`build#`

Proofs of `build#`

rely on theorems for `build_rec#`

, instantiating the upper bound by `max(x)`

and `d`

by `0`

. There are three cases

- The precondition
`x ≠ [], 0 ≤ max(x)`

holds, then we can apply the theorem for the recursive call, dispatch it via the contract and yield the postcondition. - If
`x = []`

then the recursive call fails immediately and so does`build#`

. By lemma`depths-nonnil`

(see above) there is no matching tree. - Otherwise, if
`max(x) < 0`

then in particular`head(x) < 0`

, thus`build_rec#`

fails as well (`h < d`

), indicating no matching tree. As of`x ≠ []`

`build#`

fails also, propagating the postcondition.

Please note that the main theorems given in the two following subsections are applied manually to make them clearly visible in the proofs. Their application could be automated by the simplifier.

`build_rec#`

Theorem

```
x ≠ [] ∧ d ≤ j ∧ max(x) ≤ j ∧ x0 = x
→ ⟨ build_rec#(d; x, to) ⟩
(to ≠ fail → x0 = depths(to.t, d) + x)
```

The proof goes by induction over `j - d`

where `j`

is the upper bound of the recursion. A successfull run of `build_rec#`

has two recursive calls, that can both be handled by the induction hypothesis. The first recursive call splits some prefix of the list and is uncritical.

The second recursive call is more tricky: to apply induction a second time, we need the fact that the maximum of the remaining list is not greater than the maximum of the whole list, expressed by a theorem to distribute `max`

over list concatenation:

```
max-concat:
x ≠ [] ∧ y ≠ [] → max(x + y) = max(max(x), max(y))
```

which is proved by induction over the sum of the lists' lengths.

The resulting two subtrees are be combined straight forward by `depths-rec`

, completing the induction step.

`build_rec#`

Theorem

```
x ≠ [] ∧ d ≤ j ∧ max(x) ≤ j ∧ x0 = x
→ ⟨ build_rec#(d; x, to) ⟩
(to = fail → ¬ ∃ t,x1. x0 = depths(t, d) + x1)
```

For completeness, we need the fact that a unique tree can be built from a (prefix of a) list of depths:

```
unique-prefix:
depths(t0,i) + x = depths(t1,i) + y ↔ (t0 = t1 ∧ x = y)
```

It is proved by structural induction one tree (`t0`

).

The theorem is used to show that whenever the recursion `fail`

s, there is no other possible way the tree may have been rebuilt. We show the contraposition of the original statement: Assuming there is valid tree `t`

on level `d`

(with `depths(t,d)`

prefix of the list) we show that the call does not fail, by finding a witnesses for both recursive calls.

We perform case distinction over the success of the first recursive call:

- If there is no valid prefix (returns
`fail`

) of the list we are done (contradicts assumption) - If there is a valid prefix, then soundness of
`build_rec#`

gives us the corresponding tree`t0`

such that`depths(t0, d+1)`

is a prefix of the list.- If
`t`

is a leaf we are done. - If
`t`

is a node then by the unique prefix property its left subtree must be equal to`t0`

. The right subtree of`t`

thus serves as a witness for a successfull second recursive call (quantifier instantiation in proof step 32).

- If

Termination is trivially established by the induction in both the soundness and completeness proofs. Termination is expressed simply by

```
⟨ build#(x; to) ⟩ true
```

The two tests are validated by symbolic execution.

- The (parameter) type of nodes is 'elem'. This is already the parameter of lists and sets from the library. Note that we do not fix the set of available nodes to be finite (although the type set contains finite sets only).
- The successor function succ is declared in the signature, no axioms are given.
- reach(a,b,n) means that there is path between a and b of length n.
- reach<(a,b,n) means that b is reachable from a in less than n steps.
- reach(a,S,n) means that all nodes in S are reachable with a shortest path of length n.
- The specification also defines paths (based on list from the library), and the existence of a path is proven equivalent to reachability, but paths are not used in the proofs and invariants of the algorithm.

Predicate INV states that

- a node 'b' is visited, if either it is at a depth less than n + 1 or if there is a predecessor node 'a' of 'b' at depth n that has been removed from C.
- all nodes in C are at level n (i.e. reachable from source with n, but not fewer steps), all nodes in N are at level n +1
- all nodes of level n + 1 are either already in N or have a predessor in C and are not in V (figuring out that we need 'not in V' to avoid getting nodes in the proof that do not exist in reality was the hard part for us).

Predicate INVR states that

- dest is reachable from succ with a minimal path of length n0, which is at least as big as the current counter.
- Either there is some node in C, which reaches dest in n - n0 steps or there is a node in N, which reaches dest in n - n0 -1 steps (and n0 < n).

- INV and INVR hold at the start of the algorithm: (theorems INV-init and INVR-init).
- INV and INVR are preserved by the two set operation of ADD: (theorems INV-step and INVR-step). The latter requires the tweak, that the node v removed from C must different from dest.
- INV and INVR are preserved when N := C is executed (when C was empty): (theorems INV-nextlevel and INVR-nextlevel).

The correctness proof of BFS-correct-if-reachable is trickier. It is expressed as a total correctness operator using a strong diamond operator (two opening/closing angle brackets) which expresses the weakest precondition (wp-operator): all runs of the nondeterministic program terminate and make the postcondition true.

The proof requires INV and INVR as invariant. In addition 'done implies dest in C' is needed again. Finally, an empty C must imply an empty N (and even though the property is simple, it was hard to find).

The proof does not use the invariant rule (here, since the implemented rule does not allow to give a lexical order; in general, since inductive proofs are often easier than figuring out an invariant). Instead, the proof first weakens the precondition to the invariant. (technically, using the cut rule of sequent calculus with the invariant as rule argument, and then weakening).

Then a nested (structural) induction over the difference between n and n0, and the size of C is done (red nodes in the proof tree). Unwinding the loop (purple nodes) once then leads to a state where either the inner induction hypothesis (C does not become empty) or the outer induction hypothesis (d is increased) is applicable (green nodes doing quantifier instantiation). The lemmas for INV and INVR are used. They make it unnessary to unfold of INV and INVR during symbolic execution.