Notes from lecture 4 – Testing
The most effective means for ensuring software quality is testing. You have all seen testing in earlier programming courses, but today we’re going to review how it works and perhaps push beyond what you’ve seen in the past.
Here’s a (buggy) implementation of merge sort.
#lang racket
(require rackunit)

(define (merge-sort l)
  (cond
    [(empty? l) '()]
    [else
     (merge (take l (/ (length l) 2))
            (drop l (/ (length l) 2)))]))

(define (merge l1 l2)
  (match* (l1 l2)
    [('() _) l2]
    [(_ '()) l1]
    [((cons hd1 tl1) (cons hd2 tl2))
     (cond
       [(< hd1 hd2) (cons hd1 (merge tl1 l2))]
       [(= hd1 hd2) (cons hd1 (merge tl1 tl2))]
       [(> hd1 hd2) (cons hd2 (merge l1 tl2))])]))

(check-equal? (merge-sort '()) '())
(check-equal? (merge-sort '(1 2)) '(1 2))
(check-equal? (merge-sort '(2 1)) '(1 2))
(check-equal? (merge-sort '(1 3 2 4)) '(1 2 3 4))
Although this code is definitely buggy, all of the test cases shown here pass. Stop at this point. Open up DrRacket, copy and paste this code into it, and see if you can write some additional test cases that uncover bugs in merge-sort.
Unit tests like these are very important; they are the first line of defense in software reliability. But today I want to show you another technique, and I encourage you to find ways to try it out in your own work for this class.
One thing we can do is call merge-sort on some random inputs and then check whether it is well-behaved. So there are two aspects to tackle: deciding what being well-behaved means and generating random inputs.
Here are three ways to check that a function is well-behaved:
1. Check to make sure that the function never crashes. This has the advantage that it applies to nearly all functions, but the disadvantage that there are many other ways to go wrong than simply crashing, so we’ll miss some bugs.
2. Compare this function to some other implementation of a sorting function; in this case we might use Racket’s standard library sort function.
3. Check something specific about the function we’re testing. Sorting has two especially important properties: the output must be a permutation of the input and the output must be in sorted order.
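As an illustration of the third option, the two sorting properties could be written as predicates along the following lines. This is only a sketch: the names sorted?, permutation?, and sorts-correctly? are made up for this example and are not part of the lecture code, and the library sort stands in for the function under test.

```racket
#lang racket
(require rackunit)

;; sorted? : (listof real) -> boolean
;; True when every element is <= the element that follows it.
(define (sorted? l)
  (or (empty? l)
      (empty? (rest l))
      (and (<= (first l) (second l))
           (sorted? (rest l)))))

;; permutation? : list list -> boolean
;; True when both lists hold the same elements the same number of times.
;; Works by deleting one matching occurrence from l2 per element of l1.
(define (permutation? l1 l2)
  (cond
    [(empty? l1) (empty? l2)]
    [(member (first l1) l2)
     (permutation? (rest l1) (remove (first l1) l2))]
    [else #f]))

;; sorts-correctly? : (list -> list) (listof real) -> boolean
;; Hypothetical helper: does sort-fn produce a sorted permutation of input?
(define (sorts-correctly? sort-fn input)
  (define output (sort-fn input))
  (and (sorted? output) (permutation? input output)))

;; The library sort passes; two deliberately wrong "sorts" do not.
(check-true (sorts-correctly? (lambda (l) (sort l <)) '(3 1 2 1)))
(check-false (sorts-correctly? identity '(2 1)))          ; not sorted
(check-false (sorts-correctly? remove-duplicates '(1 1))) ; not a permutation
```

Neither predicate needs a second sorting implementation, which is what makes this option usable when no reference implementation exists. The permutation check here takes quadratic time, which should be fine for short test inputs.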
In the interest of time, let’s just go with the second option. If you didn’t already have a second implementation of the function you wanted to test, though, you’d have to use one of the other options. For the random inputs, here is one way to generate random lists of natural numbers:
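;; random-lon : -> (listof natural)
;; A list of random length whose elements are random natural numbers.
(define (random-lon)
  (for/list ([i (in-range (random-natural))])
    (random-natural)))

;; random-natural : -> natural
;; Stops with probability 1/10 at each step, so any natural number is
;; possible but very large ones are rare.
(define (random-natural)
  (cond
    [(zero? (random 10)) 0]
    [else (+ 1 (random-natural))]))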
;; Compare merge-sort against the library sort on 100 random lists.
(for ([x (in-range 100)])
  (define l (random-lon))
  (define sl (sort l <))
  (define ms (merge-sort l))
  (unless (equal? sl ms)
    (error 'bug! "\ninput: ~s\noutput: ~s\ncorrect: ~s\n" l ms sl)))
Give this a try and see if you can find and fix all the bugs in the code above. Word to the wise: if you get a counterexample, try running a few more times to see if you get a smaller example.
We won’t talk about it in lecture today, but once you’ve got a buggy input there is an automatic way to shrink it. This is super useful when debugging: you get smaller inputs (which is helpful in its own right), and you also know that no smaller input triggered the bug, which gives you a surprising amount of information.
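To give a rough idea of what shrinking can look like, here is a minimal greedy sketch. It is not the technique from any particular library, and the names remove-each, shrink, and has-duplicate? are invented for this example. It assumes the bug is captured by a predicate fails? that returns #t on a bad input, and it only shrinks by deleting list elements (it never makes the numbers themselves smaller).

```racket
#lang racket
(require rackunit)

;; remove-each : list -> (listof list)
;; All the lists obtained by deleting exactly one element of l.
(define (remove-each l)
  (for/list ([i (in-range (length l))])
    (append (take l i) (drop l (+ i 1)))))

;; shrink : (list -> boolean) list -> list
;; Assumes (fails? input) is #t. Repeatedly moves to a list with one
;; element deleted that still fails, and stops when no such list exists.
(define (shrink fails? input)
  (define smaller
    (for/first ([candidate (in-list (remove-each input))]
                #:when (fails? candidate))
      candidate))
  ;; for/first returns #f when nothing matched; '() counts as true in Racket.
  (if smaller
      (shrink fails? smaller)
      input))

;; Hypothetical stand-in for "this input triggers the bug":
;; the list contains some element more than once.
(define (has-duplicate? l)
  (< (length (remove-duplicates l)) (length l)))

(check-equal? (shrink has-duplicate? '(5 3 9 3 7 1)) '(3 3))
```

The result of this sketch is only locally minimal: deleting any single element from it makes the failure go away. That is a weaker guarantee than "no smaller input triggers the bug", but it is often enough to point at the cause. To use something like this with merge-sort, the fails? predicate would probably also need to treat an exception as a failure, for example by wrapping the call in with-handlers.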