# Property-based testing

Today we discuss why property-based testing is a useful tool for verifying the correctness of complex programs. In addition, we apply the technique to test a sample algorithm comprehensively -- without and then with the aid of a testing library.

We will write the test code in Scala. Scala is fundamentally a functional programming language that retains some imperative language constructs. Knowledge of Scala is not required to follow this post, but it could help you focus on the idea being presented, as opposed to language semantics.

The completed test suite is hosted on GitHub as a gist for reference.

## What is Property-based testing?

Property-based testing involves providing a set of properties for the function or method under test. Effectively, we define a specification that our implementation should meet.

The usual approach, writing a limited set of assertions, may be satisfactory when testing simple logic:

"

itredirects an unauthenticated user to login".

But it is not necessarily sufficient when our outcomes are not binary:

"

italways returns a uniformly distributed random integer".

Property-based testing has its origins in functional programming, where data is modelled immutably, and functions are pure. Specifying properties has a direct correspondence to the way proofs of correctness are constructed in maths.

Consider the addition operation on integers. What conditions does it meet?

Associativity:

`$\left( i + j \right) + k = i + \left( j + k \right) \space \forall \space i, j, k \in \mathbb{Z}$`

Identity:

`$i + 0 = 0 + i = i \space \forall \space i \in \mathbb{Z}$`

Commutativity:

`$i + j = j + i \space \forall \space i, j \in \mathbb{Z}$`

We can assert that these properties hold by verifying that they apply for arbitrary integer triples. In a similar manner, property-based testing can also be employed to check algorithms that effect mutable data. We use it to test an implementation of mergesort.

## Mergesort

This implementation of mergesort is transcribed to Scala from Algorithms by Sedgewick and Wayne.

It consists of two subroutines:

- Split the array in two, recursively sorting each partition
- Merge the sorted partitions

```
object Sorting {
/**
* Sorts an array with respect to an `Ordering`, using the mergesort algorithm.
*/
def mergeSort[T: ClassTag](a: Array[T])(implicit o: Ordering[T]): Unit = {
val aux = new Array[T](a.length) // Use a common auxiliary array
// Sort the subsequence [l, h]
def sort(l: Int, h: Int): Unit = {
if(l < h) {
val m = l + (h-l) / 2 // Account for integer overflow
sort(l, m)
sort(m+1, h)
if(o.gt(a(m), a(m+1))) {
merge(l, m, h) // Only merge if [l, h] is not in order
}
}
}
// Merge the sorted subsequences [l, m] and (m, h]
def merge(l: Int, m: Int, h: Int): Unit = {
for(i <- l until h+1){
aux(i) = a(i)
}
var i = l
var j = m + 1
for(k <- l until h+1) {
if(i > m) {
a(k) = aux(j)
j += 1
} else if(j > h) {
a(k) = aux(i)
i += 1
} else if(o.lt(aux(j), aux(i))) {
a(k) = aux(j)
j += 1
} else {
a(k) = aux(i)
i += 1
} // n.b. Stable implementation of the algorithm
}
}
sort(0, a.length-1) // Run the routine on the whole array
}
}
```

Marking the `Ordering`

parameter as `implicit`

tells the compiler to "look" for an instance in scope if it has not been passed-in by the caller. Many built-in types, such as `Int`

, come with an `Ordering`

defined.

Note that we have to use `ClassTags`

to account for JVM type-erasure.

The `Sorting`

namespace is also an appropriate place to keep a utility to check array elements are ordered:

```
def isSorted[T](a: Array[T])(implicit o: Ordering[T]): Boolean = {
val indices = 0 until a.length-1
val shifted = 1 until a.length
indices
.zip(shifted)
.forall { case (i, j) => o.lteq(a(i), a(j)) }
}
```

## Testing "by hand"

A sorting function is defined by two properties:

It must permute its

**input**to form its**output**`$\left(a_i \mid i \in I \right) = \left(a'_i \mid i \in I \right)$`

It must totally-order its input to form its output (the comparer defines the total-ordering)

`$a'_i \leq a'_j \space \forall \space i, j \in I \space \text{such that} \space i < j$`

We have described arrays as a family, and the primed elements belong to the permuted array.

Since a sorting function has an infinite domain, it is not possible to verify these properties without resorting to **sampling**. Here is one strategy to programmatically generate array samples:

- Progressively create arrays of increasing size (up to a maximum)
- Large arrays encode exponentially more states, so generate more of these

Our sort function can act on generic arrays, but to generate samples, we need to specialise on a type. We choose `Int`

; however, there is no reason another type that has a total-ordering defined cannot be used instead -- such as `Char`

or `Double`

.

```
def arrays: Stream[Array[Int]] = {
def sample(size: Int): Array[Int] =
Array.fill(size)(rnd.nextInt())
def samples(size: Int): Stream[Array[Int]] =
Stream.fill(math.pow(2, size).toInt)(sample(size)) // Sample 2^size times
Stream.range(0, maxSize).flatMap(samples)
} // `maxSize` and `rnd` come from the test class
```

Language and machine constraints limit integer arrays of size `n`

to `math.pow(2*Int.MaxValue, n)`

permutations. We only take `math.pow(2, n)`

samples to keep the time it takes to run tests manageable.

Note that, although it is not evident from the caller's perspective, the second argument of `Stream.fill`

is evaluated lazily, so every sample will be distinct.

Using ScalaTest as the test runner, the test-code ends up being quite lean:

```
class SortingTest extends FlatSpec {
val rnd = new Random()
val maxSize = 20
// Context manager to give tests access to sample arrays
def withFixture(testCode: Array[Int] => Boolean): Unit = {
val maybeFailed = arrays.find(a => !testCode(a))
maybeFailed.foreach(a =>
fail(a.mkString("[", ", ", "]"))
)
}
"mergeSort" should "produce a sorted array" in withFixture { array =>
mergeSort(array)
isSorted(array)
}
it should "rearrange keys" in withFixture { array =>
val before = new Histogram(array)
mergeSort(array)
val after = new Histogram(array)
before == after
}
}
```

When verifying property two, we relied on a histogram abstraction to count array elements:

```
class Histogram[K](keys: Seq[K]) {
private val underlying =
keys.foldLeft(Map.empty[K, Int]) { (m, k) =>
val count = m.getOrElse(k, 0) + 1
m + (k -> count)
}
override def equals(histogram: Any): Boolean =
histogram match {
case h: Histogram[K] =>
h.underlying.equals(underlying)
case _ => false
}
override def hashCode: Int = underlying.hashCode
}
```

The strategy in use to generate samples does not adequately cover the situation where arrays to sort are saturated with duplicate keys. Additionally, the tests fail to verify that our implementation of mergesort is stable. We can address both shortcomings with extra code, but it is easier to use ScalaCheck.

## Testing with ScalaCheck

ScalaCheck is a library designed to aid property-based testing. It is inspired by Haskell's QuickCheck.

To use ScalaCheck, we need to be aware of two abstract data types it exports:

`Gen[T]`

is a monad that encodes all the information necessary to produce samples of type`T`

`Prop`

verifies a property by sampling a generator that is passed to it

Here is the (naïve) strategy for sample generation, re-written using ScalaCheck:

```
def unsaturated: Gen[Array[Int]] =
Gen.containerOf[Array, Int](Gen.posNum)
```

ScalaCheck will choose the maximum sample size to generate when running the test.

To test cases where duplicate keys are common, we modify the generator that creates keys to limit itself to choose from the range `[0, sqrt(size))`

, where `size`

is the length of the array being filled.

```
def saturated: Gen[Array[Int]] = {
// One possible way of saturating the array with duplicate keys
val sized = Gen.sized(s => Gen.choose(0, Math.sqrt(s).toInt))
Gen.containerOf[Array, Int](sized)
}
```

A stable sorting algorithm is one that ensures that any two keys which compare equally maintain their relative positions in the array. To test `mergeSort`

is stable we create arrays loaded with tuples, sorting them by the second element in the tuple *and then* the first. We should find that the mutated array is sorted with respect to a compound order which orders the tuples by comparing first elements and, if necessary, breaks ties on the second element. (Actually, this is the default ordering for tuples in Scala.)

Here is the resulting test code, without the generators listed above:

```
object SortingSpecification extends Properties("mergeSort") {
type Pair = (Int, Int) // a type alias
def pairs: Gen[Array[Pair]] = {
val n = Gen.choose(0, 5)
val pair = for {
i <- n
j <- n
} yield (i, j) // Syntax sugar for n.flatMap(i => n.map(j => (i, j)))
Gen.containerOf[Array, Pair](pair)
}
property("isSorted") = Prop.forAll(unsaturated) { a =>
mergeSort(a)
isSorted(a)
}
property("keys") = Prop.forAll(unsaturated) { a =>
val before = new Histogram(a)
mergeSort(a)
val after = new Histogram(a)
before == after
}
property("isStable") = Prop.forAll(pairs) { a =>
val byI = Ordering.by[Pair, Int](_._1)
val byJ = Ordering.by[Pair, Int](_._2)
mergeSort(a)(classTag[Pair], byJ)
mergeSort(a)(classTag[Pair], byI)
isSorted(a)
}
}
```

By using ScalaCheck, we have traded some transparency for the ability to abstract away the details of test case generation. Also, ScalaCheck can perform test case minimisation, which is useful when debugging.

## Further reading

Mergesort was chosen to demonstrate property-based testing because it is the only performant sorting algorithm that is also stable. System sorts typically use a variant of mergesort for sorting reference types, and quicksort for primitives. Sedgewick and Wayne discuss this in more detail.

If you want more insight into how ScalaCheck works, it would be worth reading the book Functional Programming in Scala. The eight-chapter walks the reader through designing such a library.