Verification of the ShearSort algorithm
This page gives a description of the verification
of the ShearSort algorithm with KIV, that sorts a matrix
using parallel sorts of rows and columns.
The verification of this algorithm was one
of the challenges of the VerifyThis competition 2021.
A description of the challenge can be found at:
Challenge 3  ShearSort ⤓ (PDF) (external link to ETH Zürich)
An earlier proof with KIV that uses a sequence of sequential sorts
instead of a parallel sort has already been uploaded earlier as part of the
KIV solutions for VerifyThis 2021.
This is now a full solution, that also verifies a concurrent version,
including a version of the BubbleSortAlgorithm that is verified
with RelyGuarantee calculus (the old version just had a specification
of the sorting algorithm used for a row/column). The solution
also has been changed from using monomorphic data types to polymorphic ones,
which removes some of the actualized specifications that were used earlier.
More details on the concepts used here are now given in the submitted paper
[KIV21] (a copy can be obtained by mailing the authors).
Check the overview
over the specifications first. From there you
can also find links to the theorem base of each
specification, and to all proofs.
Here is a description of the main important ones:
 Specifications matrix,
matrix+array, and
rwocolupd
specify (twodimensional) matrices and operations on them,
e.g. "#r mx" is its number of rows "mx[m,n,a]" is the modified matrix
mx, where The element in row m, column n has been updated to a.
"mx.toArr" concatenates the rows of the matrix into one array,
"mx.toArc" does the same, but columnwise.
 The basic predicate used is count(a, p) from specification
filtercount,
which counts the number of elements that satisfy predicate p. Since this is a generally
useful predicate, its specification has been put into the library. It is used with
p = "λ j. j < i" to count the number of elements less than i (i.e. zeros)
and with p = "λ j. ¬ j < i" to count the number of ones.
A "clean0row" is a row where this count is zero, i.e. where all
elements are less than i.
 Predicates #clean0rows" and "#clean1rows"
from the toplevel specification ShearSort
count the number of consecutive clean
rows from the bottom/top with respect to element i.
"#dirtyrows" are the other rows in the middle of the matrix,
and "mx.dirtyrows≤(bound)" states that there at most bound dirty rows.
Initially, predicats #clean0rows/#clean1rows were specified to count all rows that have 0/1 only
(not just the consecutive ones at the bottom /top),
but with this definition the main lemma "every iteration (consisting
of sorting rows/columns) halves #dirtyrows" is not provable.
 "colsorted(mx0, n, mx)" specifies that matrix mx is a permutation
of matrix mx0, where the first n columns have been sorted (in ascending order),
while the remaining ones are identical. The index n is needed while
sorting columns, the index is equal to "#c mx" after the loop.
 Similarly, "srowsorted(mx0, n, mx)" specifies that the first
n columns of mx0 have been snakesorted in mx: Even rows have been
sorted in ascending order, odd rows in descending order (where sorted in descending
order is described simply sorted in ascending order for the reverted row).
Note that "srowsorted(mx, #r mx, mx)" simply indicates that all rows
are snakesorted.
 "rowsincreasing(mx,r)" indicates that all rows below row r are
increasing: the elements in row r0 are all smaller than each element
from row r1, when r0 < r1 < r. In the proof the predicate is always
used with r = #r mx, the second argument is just used for inductive lemmas.
The toplevel specification
ShearSort
contains the algorithm to sort a matrix (named pshearsort#).
The basic program to sort a single column in ascending order
(psortcol#) is given in specification
Bubblesort.
It is a version of Bubblesort
(the most optimized version given on the Wikipedia page)
adjusted to sort one column of a matrix.
It has been
verified to be correct
using RelyGuarantee calculus
(the rely used is to assume that other parallel sorts
do not modify the respective column)
The necessary invariant is shown as an annotation.
To demonstrate the (small) overhead of using RG calculus
and sorting a column instead of an array,
specification Bubblesort
contains the standard version for arrays
and a correctness proof using wpcalculus.
The main proof of the ShearSort algorithm basically reduces
to the following lemmas:
 sorting rows never changes #clean0rows or #clean1rows"
as stated by lemmas
rowpermsclean0unchanged and
rowpermsclean1unchanged.
Therefore the number of dirty rows also stays the same when sorting rows.
 The main difficulty for showing that the invariant of shearsort holds is therefore to
show that when a rowsorted matrix mx0 is sorted columnwise the
resulting matrix mx has at most half the dirtyrows of the mx0.
This is asserted by lemma
colsortdirtyboundhalf.
We could not find a precise argument on the internet,
our proof essentially follows the proof of [Hagerup16].
The proof is broken down into several lemmas:
 First,
the final matrix mx has a column cmax with a maximum number n of zeros
(Lemma existsmax0column
proves that every matrix has such a column). Since mx is columnsorted, the 0's in
column cmax are in the first n rows
(Lemma sortedpermcountsplits), and all rows above
n are 1's only.
 Since all other columns have fewer 0's (also in low rows)
it is implied that the number of clean 1rows in mx is #r mx  n as stated
by Lemma max0columndeterminesclean1rows.
 The number of zeros in column cmax of mx0 is n too
(columnsorting does not change this number),
and since all rows below #clean0rows(mx0) contain 0's only, there must be
some k >= 0 of extra rows such that n = #clean0rows(mx0,i) + k. k can
be at most the number of dirty rows in mx0, since no zero
is in a clean 1row of mx0.
The k additional 0's in column cmax are distributed among even and odd rows,
so counting zeros in even/odd rows in mx0[cmax]c above #clean0rows(mx0) will give
two number keven and kodd, such that k = keven + kodd.

Formally, "countevn(a, p)/countodd(a, p)" are used count the elements of an array (here:
a column of the matrix) that have an even/odd index and satisfy predicate p.
Since k ≤ #dirtyrows(mx0, i), we get that both
keven, kodd ≤ #dirtyrows(mx0, i) /↑ 2
(Lemmas countevnmaxcol0bounded and
countoddmaxcol0bounded).
 Since mx0 is snakerowsorted, all columns left of cmax have at least
keven zero's, all columns right of cmax have at least kodd zeros above #clean0rows(mx0,i).
(Lemma rowsortedsmallercolevnbounded and
rowsortedbiggercoloddbounded).
Therefore every column of mx0 has at least #clean0rows(mx0,i) + min(keven,kodd) zeros.
 Again, since the number of zero's in a column does not change
(Lemma max0columnclean0rowsbounded)
the number #clean0rows(mx,i) of clean 0rows in mx is at least
#clean0rows(mx0,i) + min(keven,kodd) too. Combining these facts,
the number of clean 0rows increases by min(keven,kodd),
the number of clean 1rows drops by #dirtyrows(mx0,i)  k,
Since k  min(keven,kodd) = max(keven,kodd) ≤ #dirtyrows(mx0, i) /↑ 2
the main lemma colsortdirtyboundhalf is implied.
 After "log2↑(#r mx)" iterations (where log2↑(0) = 1, and log2↑(n)
is the logarithm with base 2 rounded up to the next integer otherwise)
of row and column sorting the number of dirty columns has decreased
to at most one (see Lemma log2ublem).
Since this is the case for every element i, it can be proved
that the matrix then has increasing rows, i.e rowsincreasing(mx,#r mx)".
This is stated by Lemma dirtyrowsonerowsincreasing.
With increasing rows, sorting all rows in ascending order produces
a sorted (not snakesorted!) matrix, expressed as
the postcondition "ordered≤(mx.toArr)"
of the total correctness theorem
pshearsortcorrect.
The proof effort for the sequential version of the
case study was about two weeks. Proving the two versions
of bubblesort and setting up the instances needed a few hours.
The main effort in adding concurrency was then defining
and implementing the foralloperator, the corresponding rule,
and the rule used to combine wpcalculus reasoning for the main
algorithm with reasoning using RGcalculus
for the parallel sorts (see [KIV21] for more details).
Bibliography
[Hagerup16] 
T. Hagerup
Course notes: Introduction to Parallel Algorithms

[KIV21] 
G, Schellhorn, S. Bodenmüller, M. Bitterlich, W .Reif
Software & System Verification with KIV
submitted to : Symposium on the occasion of Reiner Hähnle's 60th Birthday
will appear in Springer LNCS

Used basic libraries:
Back to the overview of all KIV projects.
For general informations on KIV and a download site for the whole system refer to this page.
[Imprint] [Data Privacy Statement]