VSTTE Verification Competition : KIV Solutions

More information on the competition and the problems given can be found here: https://sites.google.com/site/vstte2012/compet

General comments

The first challenge was solved twice: using an abstract program and using a Java program. All other programs are written in the ASM style abstract programming language of KIV, which is a language with while loops and recursive procedures that operates on arbitrary algebraic data types.
The problems were solved by Gidon Ernst, Gerhard Schellhorn, Kurt Stenzel and Bogdan Tofan. Each of us solved one problem and contributed to one other.
No changes to the downloadable KIV system were necessary to do the proofs. We did a few minor tweaks to the xml-outputs you can browse here compared to what you will get from the downloadable system (e.g., we added a page explaining the color coding of proof tree nodes).

Challenge 1 : Two-Way Sort

Solution 1: Abstract Program

Our first solution of Challenge 1 implements the given two-way-sort algorithm of a boolean array using KIV's abstract programming language. To get an overview over the first solution, have a short look at the development graph of the project, which gives an overview over the specification structure. The main program "two_way_sort#" is defined in the toplevel specification 2-way-sort. It takes an abstract boolean array as in-out parameter (type "array") and implements the given algorithm in a straightforward manner.

The underlying boolean array data structure is modeled as an algebraic array of booleans, by actualizing the generic elements of the array datastructure from KIV's library (specification array) with elements of type boolean, see specification boolarr. Note that array indices in KIV are natural numbers, not integers. To conform with the requirements of having itegers as indices in the algorithm, we use a standard conversion function "i->n: int -> nat", which computes the absolute value and is specified in the library specification intnat.

In specification arr2list we define the central behavioral correctness conditions "issorted(ar)" and "isperm(ar0, ar1)". The sortedness of a boolean array "ar" is defined in a straightforward manner. Whether a boolean array "ar0" is a permutation of another boolean array "ar1" is defined using the existing library theory that defines permutations on lists (specification list-perm). That is, ar0 is a permutation of ar1 iff the corresponding lists of boolean values are permutations. The conversion from boolean arrays to boolean lists (type "list") is done by a recursively defined function "arr2list: array, nat -> list", where the nat parameter gives the upper index bound for conversion (typically the size of the array # ar minus one.). Finally, the order on booleans is defined.

The central proof of correctness and termination can be found in the theorem base of specification 2-way-sort. The proof of theorem two_way_sort-corr-term shows that two_way_sort# terminates, (the double-angle "strong diamond" brackets are the wp-operators expressing total correctness) and gives a sorted permutation of the array. The decreasing term for induction is i→n(j + 1) -- i->n(i)". The central properties of the loop invariant are:

Below index i, all values of the current array are "false"
Above index j, all values of the current array are "true"
The current array is a permutation of the initial array

The proof uses the standard invariant rule for total correctness with this invariant and this decreasing term (see the rule arguments of ), with variables adjusted to match the current goal.

The main lemma required is perm-switch, which is already proved over library specification list-perm. It is used to ensure that swapping of array values results in a permutation of the initial array.

The verification given does not check whether array access during execution of the program is safe: access / modification out of bounds would return an unspecified value / array (so we could surely not prove that the result is a permutation), but would not abort the program. The second solution with Java avoids this problem. Another technique to guarantee safety is to use access programs as in Challenge 3.

Solution 2: Java Program

To get an overview over the second solution, have a short look at the development graph. The solution includes the rather large library needed for Java (orange specifications). The central new features of Solution II are:

We consider a (compilable) Java-implementation of the given algorithm (included in this class definition). It works on an array data structure in an explicit heap "st" of type "store" (see specification javastore)
The proof additionally ensures safe array access, safe heap access, ...

The central behavioral correctness properties are defined in specification arr2bitlist, which similar to solution 1 maps a reference to an array stored in a heap st to a bitlist using the conversion function "getbooleanarray: reference, store --> bitlist". Sortedness and permutations are defined as in solution 1 (predicates "issorted" and "perm").

The main proof of safety and total correctness is two-way-sort-corr-term in the theorem base of the toplevel specification. It uses the diamond operator of Dynamic Logic for termination (which is sufficient for deterministic programs). An additional postcodition "st[_mode] = noval" ensures that the program terminates regularly without raising an exception. In particular, no "ArrayIndexOutOfBoundsException" exception has occured.

The proof proceeds much like the one from solution 1. It uses symbolic execution and a Hoare-style loop invariant which contains the same core correctness conditions as in solution 1. Additionally, some type and structure information for safe heap access is required to ensure that the program executes "regularly". We do not explain them here (see, e.g., the KIV documentation for more information). The termination argument uses the same variant as in solution 1. Note that for every array access and update a goal with "ArrayIndexOutOfBoundsException" is created if KIV can not ensure automatically that the case is impossible (steps where an exception is thrown are red, filled circles in the proof tree). That such a case is impossible then has to be shown interactively.

Challenge 2: Combinators

Given a formal specification of the SK(I) combinators and rewriting rules the challenge was to implement the reduction as a program and prove certain properties about it. To get an overview over the solution, have a short look at the development graph the project, which gives an overview over the specification structure.

Implement the reduction routine. TOPRED performs a toplevel reduction step, RED1 a single reduction step, and RED implements the full reduction. Sind RED uses RED1 which uses TOPRED a proof for RED usually requires suitable theorems for RED1 and TOPRED.
Prove that if RED(t) returns t' then t can be reduced to t' and t' and is irreducible. The proof uses induction on the number of procedure calls (both RED and RED1 are implemented as recursive procedures). Bounded procedure calls are part of the Dynamic Logic calculus of KIV. An important lemma is that RED1 always terminates and if it did not reduce the term then it is indeed irreducible and otherwise RED1 behaves as ->.
We also proved the other direction, i.e. if t can be reduced to an irreducible t' then RED(t) terminates and returns t'. This proof uses induction on the number of steps in t ->* t' (defined in specification reduce).
A required auxiliary theorem is if t -> t1 and t -> t2 then t1 = t2 (a very strong local confluence property).
Theorem RED-iff-reduce is the if-and-only-if theorem.
As a fun theorem we also proved that SII(SII) loops directly for the program (i.e. without using the specification or the if-and-only-if theorem). The proof again uses induction on the number of procedure calls.
Prove that RED terminates on any term not containing S by induction on the size of the term. (Without S only K reduction steps are possible which make the term smaller.) A size function is generated automatically for data types in KIV. The lemmas state that if RED1 or TOPRED reduced the term then its size has decreased.
Prove that RED(ks n) returns K for even n, and (K K) for odd n where ks 0 = K, and ks (n+1) = ((ks n) K). The proof uses induction on n and the induction hypothesis must be applied for n - 2. The lemma about RED1 states that (ks n) is irreducible for n < 2 and otherwise returns (ks (n - 2))..

Challenge 3: Ring Buffer

The task is to verify an array-based sequential queue implementation. To get an overview over the solution, have a short look at the development graph the project, which gives an overview over the specification structure.

The Algorithm

A straightforward specification of the given queue algorithms in KIV's abstract programming language is defined in specification queue-impl. The buffer's data array may contain elements of a generic type "elem" (instead of the specific type "int"). Type "ring_buffer" is defined in specification ring-buffer with constructor "mk" and access functions ".data", ".size", ".first" and ".len" respectively. To ensure safe access to the buffer's data in the queue algorithms, we use additional procedures "get#(i, ar; getval)" and "put#(i, putval; ar)" on an array "ar", to get respectively set a value at position "i" in "ar". (A semicolon separates input from in-output parameters.) These access operations diverge (abort) whenever the access index "i" is out of bounds and thus a proof can not be completed. For the test harness, we define a further operation "assert(b;)", which aborts if the boolean input value "b" is false and skips otherwise.

Invariants and Lemmas

The central invariant "inv" used in the verification is defined in specification ring-buffer-inv. The main invariant properties are

The buffer's size is not zero and equal to the size of the data array
The buffer's ".first"-index is never out of bounds and the buffer's ".len" is non-negative and does not exceed the buffer's size

The proof of correctness of the implementation wrt an abstract FIFO queue semantics uses refinement. The abstract data structure "queue" is an algebraic list along with standard operations, defined in specification queue. The proof of safety, correctness and the verification of the test harness are in specification refinement. This specification also defines the abstraction predicate "abs", which relates a ring-buffer to a queue of its elements as follows:

The represented queue must have the same size as the buffer and contain the elements that are computed by a recursive conversion function "tolist: ring_buffer, nat --> queue"
Function "tolist" basically returns the elements from ".first" to "(.first + .len) % .size" in the buffers ".data" array

The Main Proofs

Basically, we show that the create method establishes a valid initial state where the invariant and the abstraction predicate hold. Then we show that each operation maintains the invariant and a valid queue representation. The main proofs can be found in the theorem base of specification refinement.

The proof of correctness of a push operation (theorem "push-corr") on a non-full ring buffer shows that the algorithm is access safe, maintains the invariant and corresponds to an abstract push of the input value on the represented algebraic queue. Preservation of the invariant is straightforward to prove. The preservation of the abstraction predicate basically discerns whether the access has to wrap around or not. It benefits from KIV's library theory for modulo operations (see specification int-div).

The proof of correctness of a pop operation (theorem "pop-corr") on a non-empty ring buffer shows that the algorithm is access safe, maintains the invariant and corresponds to an abstract pop on the algebraic queue, which returns the queues oldest value. Preservation of the invariant and abstraction predicate again benefits from KIV's library theory.

The proof of the test harness (theorem theorem test), automatically symbolically steps through the code until termination, i.e., no intermediate assertion is violated.

Challenge 4: Tree Reconstruction

An overview over the project can be found here. The implementation of the procedures build# and build_rec# is in specification build. Here is a list of auxiliary lemmas, and here are the main theorems.

Types

The types are defined in the specifications tree, treeorfail and list:

tree = Leaf | Node(l : tree, r : tree)
list = [] | (head : int) + (pop : list)

treeorfail = fail | some(t : tree)

Variable conventions in the following:

t,t0,t1 : tree
to : treeorfail
d,h,i,j : int
x,y : list

Supporting definitions

The proof relies on a function

depths : tree × int → list

that computes the list of depths of leaves. It is defined recursively as

depths(Leaf, i)        = i '
depths(Node(t0,t1), i) = depths(t0, i + 1) + depths(t1, i + 1)

where i ' is the singleton list consisting of i and x + y is list concatenation (+ is overloaded). We show, that no tree has an empty list of depths.

depths-nonnil: depths(t,i) ≠ []

Also, we need a maximum function on lists (undefined for []):

max : list → int

max(i ') = i
max(i + j + x) = max(max(i,j), max(min(i,j) + x))

where min and max are overloaded on pairs of integers. The recursive case extracts the greater of two leading elements.

Lemmas max-ge (proof) and max-in (proof) assert that max computes the greatest element in the list.

max-ge: i ∊ x → x ≤ max(x)
max-in: x ≠ [] → max(x) ∊ x

Approach

Let to be the result of program build#.

For soundness we prove

⟨ build#(x; to) ⟩ to ≠ fail → x = depths(to.t, 0)

that is whenever build# returns successfully then the depths of the result are as specified by the input.
For completeness we prove

⟨ build#(x; to) ⟩ to = fail → ¬ ∃ t. x = depths(t, 0)

that is whenever build# fails then there is no tree with depths as specified by the input.
For termination we have to find an upper bound j of the recursion depth of build_rec#

The upper bound can be derived from the termination conditions d = head(x) and d > head(x): an upper bound j is max(x) ≤ j.

The main work is done in two lemmas for build_rec# (soundness and completeness). The proofs goes by induction on j and generalize the upper bound (precondition) to

d ≤ j ∧ max(x) ≤ j

as well as the postconditions

to ≠ fail → x0 = depths(to.t, d) + x
to = fail → ¬ ∃ x1,t. x0 = depths(t, d) + x1

where x0 is the initial list and x the returned one.

Proofs

Soundness/completeness of `build#`

Proofs of build# rely on theorems for build_rec#, instantiating the upper bound by max(x) and d by 0. There are three cases

The precondition x ≠ [], 0 ≤ max(x) holds, then we can apply the theorem for the recursive call, dispatch it via the contract and yield the postcondition.
If x = [] then the recursive call fails immediately and so does build#. By lemma depths-nonnil (see above) there is no matching tree.
Otherwise, if max(x) < 0 then in particular head(x) < 0, thus build_rec# fails as well (h < d), indicating no matching tree. As of x ≠ [] build# fails also, propagating the postcondition.

Please note that the main theorems given in the two following subsections are applied manually to make them clearly visible in the proofs. Their application could be automated by the simplifier.

Soundness of `build_rec#`

Theorem

x ≠ [] ∧ d ≤ j ∧ max(x) ≤ j ∧ x0 = x
  → ⟨ build_rec#(d; x, to) ⟩ 
      (to ≠ fail → x0 = depths(to.t, d) + x)

The proof goes by induction over j - d where j is the upper bound of the recursion. A successfull run of build_rec# has two recursive calls, that can both be handled by the induction hypothesis. The first recursive call splits some prefix of the list and is uncritical.

The second recursive call is more tricky: to apply induction a second time, we need the fact that the maximum of the remaining list is not greater than the maximum of the whole list, expressed by a theorem to distribute max over list concatenation:

max-concat:
x ≠ [] ∧ y ≠ [] →  max(x + y) = max(max(x), max(y))

which is proved by induction over the sum of the lists' lengths.

The resulting two subtrees are be combined straight forward by depths-rec, completing the induction step.

Completeness of `build_rec#`

Theorem

x ≠ [] ∧ d ≤ j ∧ max(x) ≤ j ∧ x0 = x
  → ⟨ build_rec#(d; x, to) ⟩ 
      (to = fail → ¬ ∃ t,x1. x0 = depths(t, d) + x1)

For completeness, we need the fact that a unique tree can be built from a (prefix of a) list of depths:

unique-prefix:
depths(t0,i) + x = depths(t1,i) + y   ↔   (t0 = t1 ∧ x = y)

It is proved by structural induction one tree (t0).

The theorem is used to show that whenever the recursion fails, there is no other possible way the tree may have been rebuilt. We show the contraposition of the original statement: Assuming there is valid tree t on level d (with depths(t,d) prefix of the list) we show that the call does not fail, by finding a witnesses for both recursive calls.

We perform case distinction over the success of the first recursive call:

If there is no valid prefix (returns fail) of the list we are done (contradicts assumption)
If there is a valid prefix, then soundness of build_rec# gives us the corresponding tree t0 such that depths(t0, d+1) is a prefix of the list.
- If t is a leaf we are done.
- If t is a node then by the unique prefix property its left subtree must be equal to t0. The right subtree of t thus serves as a witness for a successfull second recursive call (quantifier instantiation in proof step 32).

Termination, Test Cases

Termination is trivially established by the induction in both the soundness and completeness proofs. Termination is expressed simply by

⟨ build#(x; to) ⟩ true

The two tests are validated by symbolic execution.

Challenge 5: Breadth-First Search

To get an overview over the project, first have a short look at the development graph the project, which gives an overview over the specication structure. The main algorithm is defined in specification BFS. The two required theorems are proven over this specification : BFS-correct-if-reachable is soundness and BFS-returns-fail-implies-not-reachable is completeness. The main invariants necessary are defined as two predicates INV and INVR in specification INV, with auxiliary definitons in specification graph. The following subsections give more detailed informations.

The Algorithm

The main algorithm is defined in specification BFS. The algorithm uses ASM style nondeterministic choose operations. An auxiliary routine ADD implements the 'for each w in succ(v)' loop. Lemma ADD-lemma shows that in essence ADD executes two set operations. ADD uses N and C as input and output, therefore the procedure is declared nonfunctional. The main program BFS either returns a natural number n (embedded as 'some(n)') or fail. Note that counter d from the problem description is named n, and is a natural number here (using an integer as in the problem description would just add unnecessary complexity). A small adjustment (using a done-flag) was necessary to adapt the algorithm to the procedure style used in KIV, which has no return statements, but returns the result in the output parameter 'norfail'.

Auxiliary Definitions

The algorithm is based on finite sets as defined in KIV's library. These come with an empty set, operations ++/ -- to add/delete an element as well as with other standard set operations. Specifically for the algorithm we have defined some auxiliary reachability predicates in specification graph:

The (parameter) type of nodes is 'elem'. This is already the parameter of lists and sets from the library. Note that we do not fix the set of available nodes to be finite (although the type set contains finite sets only).
The successor function succ is declared in the signature, no axioms are given.
reach(a,b,n) means that there is path between a and b of length n.
reach<(a,b,n) means that b is reachable from a in less than n steps.
reach(a,S,n) means that all nodes in S are reachable with a shortest path of length n.
The specification also defines paths (based on list from the library), and the existence of a path is proven equivalent to reachability, but paths are not used in the proofs and invariants of the algorithm.

Invariants and Lemmas

The main invariants of the algorithm are given in specification INV. Predicate INV was sufficient for completness, predicate INVR was additionally needed for correctness. Note that the counter d of the problem description is named n in the axioms (and the algorithm), argument n0 in INVR is the depth in which the node 'dest' can be found.

Predicate INV states that

a node 'b' is visited, if either it is at a depth less than n + 1 or if there is a predecessor node 'a' of 'b' at depth n that has been removed from C.
all nodes in C are at level n (i.e. reachable from source with n, but not fewer steps), all nodes in N are at level n +1
all nodes of level n + 1 are either already in N or have a predessor in C and are not in V (figuring out that we need 'not in V' to avoid getting nodes in the proof that do not exist in reality was the hard part for us).

Predicate INVR states that

dest is reachable from succ with a minimal path of length n0, which is at least as big as the current counter.
Either there is some node in C, which reaches dest in n - n0 steps or there is a node in N, which reaches dest in n - n0 -1 steps (and n0 < n).

The theorems over specification INV prove that

INV and INVR hold at the start of the algorithm: (theorems INV-init and INVR-init).
INV and INVR are preserved by the two set operation of ADD: (theorems INV-step and INVR-step). The latter requires the tweak, that the node v removed from C must different from dest.
INV and INVR are preserved when N := C is executed (when C was empty): (theorems INV-nextlevel and INVR-nextlevel).

The proof for INVR already assumes that INV is known. All proofs make use of the simplifier rules (lemmas used as rewrite, elimination, forward rules) proven in graph, and of course they use lots of such rules from the library.

The Two Main Proofs

The completeness proof BFS-returns-fail-implies-not-reachable is the easier one, since it is a partial correctness assertion expressed with the box operator of Dynamic Logic (DL), that corresponds to the wlp-operator of weakest precondition calculus. The required invariant for the main loop (can be found as the rule argument at this (red) node, which is an application of the DL version of Hoare's invariant rule. Note that the variables are adjusted to match the variables of the current goal) in the proof tree) consists of predicate INV in conjuction with the property that done implies that dest is a member of set C. Of course the precondtion, that 'dest' is not reachable from 'source' is invariant too.

The correctness proof of BFS-correct-if-reachable is trickier. It is expressed as a total correctness operator using a strong diamond operator (two opening/closing angle brackets) which expresses the weakest precondition (wp-operator): all runs of the nondeterministic program terminate and make the postcondition true.

The proof requires INV and INVR as invariant. In addition 'done implies dest in C' is needed again. Finally, an empty C must imply an empty N (and even though the property is simple, it was hard to find).

The proof does not use the invariant rule (here, since the implemented rule does not allow to give a lexical order; in general, since inductive proofs are often easier than figuring out an invariant). Instead, the proof first weakens the precondition to the invariant. (technically, using the cut rule of sequent calculus with the invariant as rule argument, and then weakening).

Then a nested (structural) induction over the difference between n and n0, and the size of C is done (red nodes in the proof tree). Unwinding the loop (purple nodes) once then leads to a state where either the inner induction hypothesis (C does not become empty) or the outer induction hypothesis (d is increased) is applicable (green nodes doing quantifier instantiation). The lemmas for INV and INVR are used. They make it unnessary to unfold of INV and INVR during symbolic execution.

[Imprint] [Data Privacy Statement] [E-Mail]

VSTTE Verification Competition : KIV Solutions

Challenge 1 : Two-Way Sort

Solution 1: Abstract Program

Solution 2: Java Program

Challenge 2: Combinators

Challenge 3: Ring Buffer

The Algorithm

Invariants and Lemmas

The Main Proofs

Challenge 4: Tree Reconstruction

Types

Supporting definitions

Approach

Proofs

Soundness/completeness of build#

Soundness of build_rec#

Completeness of build_rec#

Termination, Test Cases

Challenge 5: Breadth-First Search

The Algorithm

Auxiliary Definitions

Invariants and Lemmas

The Two Main Proofs

Soundness/completeness of `build#`

Soundness of `build_rec#`

Completeness of `build_rec#`