COST Verification Competition : KIV Solutions

More information on the competition can be found here: http://foveoos2011.cost-ic0701.org/verification-competition

General comments

All programs are written in the abstract programming language of KIV, which is a while language that operates on arbitrary algebraic data types.

Challenge 1 : Maximum of an array

Given: A non-empty integer array a.
Verify that the index returned by the method max() given below points to an element maximal in the array.

public class Max {

    public static int max(int[] a) {
        int x = 0;
        int y = a.length-1;

        while (x != y) {
            if (a[x] <= a[y]) x++;
            else y--;
        }
        return x;
    }
}

Solution

A maximum function on arrays of natural numbers is specified here. The axiom max-gt says that the maximum is the greatest value in the array. The axiom max-in says that the maximum function returns a value from the array.
The main theorem is challenge1. The proof assumes that the array is not empty.
The proof uses the invariant that the maximum of the array is still within the indices x,y (which are m0,m in the proof).
We have proven termination using the variant y - x (m - m0 in the proof).

A slightly optimized solution which defines a simpler max-function can be found here. A further solution uses KIV's Java calculus to verify a Java implementation of the challenge here.

Challenge 2: Maximum in a tree

Given: A non-empty binary tree, where every node carries an integer. Implement and verify a program that computes the maximum of the values in the tree. Please base your program on the following data structure signature:

public class Tree {

    int value;
    Tree left;
    Tree right;

}

You may represent empty trees as null references or as you consider appropriate.

Solution

Basically this refines an algebraic maximum function defined on freely generated trees to a maximum procedure computing on pointer based trees, using lightweight separation logic to establish disjunctness.

Algebraic trees are specified as follows: tree = leaf | branch(left : tree, val : int, right : tree); (actually, values are of sort elem, which gets instantiated by int later)
Specification challenge2 defines an algebraic maximum function with four cases over branch nodes, whether left or right subtrees are leaves or not, the maximum of leaf nodes is undefined.
Lemmas max-in and max-max assert that the value of the maximum function is in fact a value in the tree (element predicate) and that all values in the tree are less or equal than the maximum.
Specification node defines tree nodes for the linked data structure (the Tree class above). The difference to algebraic trees is this is a pointer structure (using heaps H : ref → node). node = mkdnode(left : ref, val : int, right : ref)
The separation logic specification is instantiated in node-heap and an abstraction function tr is defined in linked-tree recursively over the algebraic tree tr(r, leaf) = emp ∧ r = null tr(r, branch(t0, i, t1)) = ∃ r0, r1. r => mknode(r0, i, r1) * tr(r0, t0) * tr(r1, t1) where r => nd denotes the singleton heap with H[r] = nd.
The main theorem is challenge2. It assumens a non-null root (as the maximum is undefined for the empty tree). The proof goes by well-founded induction over the size of the algebraic tree t with root r, from an initial heap characterized as tr(r, t) * P The additional P accounts for the frame rule (which is not generally valid for our abstract programs) and gets instantiated with those parts of the heap that are irrelevant to the induction hypothesis.

The algorithm is simple (encoded as an abstract KIV procedure MAX#)

i := root.val

if root.left != null then
    i := max(i, max(root.left))

if root.right != null then
    i := max(i, max(root.right))

return i

Challenge 3: Two equal elements

Given: An integer array a of length n+2 with n>=2. It is known that at least two values stored in the array appear twice (i.e., there are at least two duplets).
Implement and verify a program finding such two values.
You may assume that the array contains values between 0 and n-1.

Solution

The challenge is solved in two steps using the algorithms defined in specification finddup. The algorithms are both defined over arrays with generic elements (an actualization with integers could be done).
The solution calls two very similar simple algorithms and is not very efficient (O(n^2)).
FINDDUPS# finds the first duplicate using two nested while loops, which compute the two indices, where the duplicate can be found.
FINDDUPSSND# finds a duplicate where the value is not equal to some given value a. It is called with a = the first array element found.
Verification of both algorithms (theorems challenge3a and challenge3b) is very similar.
The main theorem challenge3 calls the two algorithms sequentially and just composes the results.

Challenge 4: Cyclic list

Given: A Java linked data structure with the signature:

public class Node {

    Node next;
    
    public boolean cyclic() {
        //...
    }

}

Implement and verify the method cyclic() to return true when the data structure is cyclic (i.e., this Node can be reached by following next links) and false when it is not.

Solution

Specification challenge4 contains the two algorithms of Floyd and Brent, both modified to check for null-pointers from the wikipedia page.

The two main proofs are challenge4-Floyd and challenge4-Brent. Both proofs are rather tricky.

They first distinguish between cyclic and acyclic structures, since the two cases need different variants for termination.

The noncyclic case induces over the difference of the length of the list minus the number of steps done for the faster pointer (the "hare"). For Floyd this needs wellfounded induction since it decreases by two on every iteration.

The cyclic case needs the characterizations cyclic-floyd-def and cyclic-brent-def (two theorems) from specification cyclic, to get the variant right.

cyclic-floyd-def gives an r0 such that r0 is reachable from r as well as from r0 in n steps. Termination is then, since n minus the steps of the slower pointer (the "tortoise") decreases.

cyclic-brent-def gives an r0 that is reachable in 2^n - 1 steps, such that from r0 a cycle with m ≤ 2^n steps starts. Termination is then, since 2^n -1 + m minus the steps of the faster pointer decreases by one in each loop iteration.

Since the variants of both proofs require existentially quantified variables from the invariant, the proof does not work by simply applying the invariant rule. Instead we generalize the precondition to the invariant with the cut formula rule (blue color). This gives two goals: one that proves precondition → invariant, in the other we get both precondition and invariant, and weaken (brown color) the initial values of the variables modified in the loop. The existential quantifier gets dropped by simplification. Then we use induction over the size of the variant (red color).

The proof proceeds by exiting the loop in the base case. For the recursive case, the loop is unwound once (violet colour), then the induction hypothesis is applied (green color for the three structural inductions, red color for the one wellfounded induction).

In specification cyclic, theorems cyclic-floyd-def and cyclic-brent-def are two of 6 characterizations of cyclicity, that are all proven to be equivalent. The other four are:

The original axiom cyclic-def defines cyclic(r), iff an r0 is reachable from r in both m and n steps, where m < n.
Theorem cyclic-alt-def says, that cyclic(r) iff some r0 is reachable from r, such that r0 is reachable from itself in n * n0 steps for all n0 (n is the length of the cycle), so it is slighly stronger that the original axiom.
Theorem cyclic-weak-def says this is equivalent to that for all lengths n an element is reachable from r (the equivalence proof is based on finiteness of the heap).
Finally, theorem legal-noncylic-equiv-reaches-null says that for a heap H without dangling pointers (predicate legal in specification cyclic), r is acyclic iff it reaches the null pointer in finitely many steps.

Theorems that are often used are:

cyclic-all-ncyclic from specification cyclic, which says that if a cycle of length n starts at r, then going n0 * n steps (n0 cycles) forward from any r0, that is reachable from r, will again end at r0.
reach-n-split-le from the library specification path-reachable, which says that if r0 is reachable from r in n steps, then for all m ≤ n, there is an r1, that is reached from r in m steps, and r1 reaches r0 in n - m steps.

[Imprint] [Data Privacy Statement] [E-Mail]