On this page:
1.1 Syntax
1.2 Dynamic semantics
1.2.1 Errors
1.3 Static semantics
1.3.1 Type safety
1.3.1.1 Preservation
1.3.1.2 Progress
1.4 Termination

1 The let-zl language

1.1 Syntax

The let-zl language has expressions defined as follows:

There are two kinds of literal expressions: integer literals n and the empty list nil. Additionally, we build longer lists with cons(e₁, e₂), our traditional cons, which creates a linked list node with first e₁ and rest e₂. We have two elimination forms for integers, e₁ + e₂ and e₁ * e₂. Additionally, we have elimination forms for lists, car e and cdr e. Finally, we have variables x, and we have a form of sharing in let x = e₁ in e₂, which binds x to the value of e₁ in e₂.
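The grammar can be transcribed into a small data representation. The sketch below encodes terms as Python tuples tagged with the production name; the constructor names are a convention of ours, not the notes' notation:

```python
# A tagged-tuple encoding of let-zl terms (hypothetical names):
#   e ::= n | nil | cons(e, e) | e + e | e * e
#       | car e | cdr e | x | let x = e in e

def num(n):          return ('num', n)
NIL =                ('nil',)
def cons(e1, e2):    return ('cons', e1, e2)
def plus(e1, e2):    return ('plus', e1, e2)
def times(e1, e2):   return ('times', e1, e2)
def car(e):          return ('car', e)
def cdr(e):          return ('cdr', e)
def var(x):          return ('var', x)
def let(x, e1, e2):  return ('let', x, e1, e2)

# let x = 1 + 2 in cons(x, nil)
example = let('x', plus(num(1), num(2)), cons(var('x'), NIL))
```

The same encoding is reused in the sketches that follow.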

1.2 Dynamic semantics

We might have a decent guess as to what this language means, but to be precise, we will define its dynamic semantics using a rewriting system, which registers computation by rewriting expressions to expressions and eventually (hopefully) to values:

We define values—final results—to include numbers n, the empty list nil, and pairs of values cons(v₁, v₂).
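Assuming terms are encoded as tagged Python tuples like ('plus', e1, e2) (a convention of ours, not the notes' notation), the grammar of values can be checked with a predicate:

```python
# Values v ::= n | nil | cons(v, v)
def is_value(e):
    tag = e[0]
    if tag in ('num', 'nil'):
        return True
    if tag == 'cons':
        return is_value(e[1]) and is_value(e[2])
    return False  # sums, products, car/cdr, variables, and lets are not values
```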

The reduction relation describes a single computation step, and has a case for each kind of basic computation step that our language performs. For example, here is how we perform addition:

The [plus] rule says that to reduce an addition expression whose operands are already reduced to numbers n₁ and n₂, we add the numbers in the metalanguage. The E[·] portion of each term is the evaluation context, which means that addition can be performed not just on whole terms, but within terms, according to a grammar given below.

The multiplication is similar, also allowing multiplication within any evaluation context:
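Assuming terms encoded as tagged Python tuples like ('plus', e1, e2) (our own convention), the [plus] and [times] contractions, which fire only once both operands are number values, can be sketched as:

```python
def step_arith(e):
    # Contract a [plus] or [times] redex; return None if e is not one.
    # The surrounding evaluation context E is handled separately.
    if e[0] in ('plus', 'times') and e[1][0] == 'num' and e[2][0] == 'num':
        n1, n2 = e[1][1], e[2][1]
        return ('num', n1 + n2 if e[0] == 'plus' else n1 * n2)
    return None
```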

We have two rules for getting the first and rest of a list:

These say that if we have a cons (pair) of values cons(v₁, v₂), then car extracts the first value v₁ and cdr extracts the second value v₂.
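In the same assumed tuple encoding (a convention of ours, not the notes' notation), the two list contractions look like:

```python
def is_value(e):
    return e[0] in ('num', 'nil') or \
           (e[0] == 'cons' and is_value(e[1]) and is_value(e[2]))

def step_list(e):
    # [car]/[cdr]: project out of a cons of values;
    # return None if e is not such a redex.
    if e[0] in ('car', 'cdr') and e[1][0] == 'cons' and is_value(e[1]):
        return e[1][1] if e[0] == 'car' else e[1][2]
    return None
```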

Finally (for now), the rule for let involves substituting the value v for the variable x in the body e₂:
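Substitution and the [let] contraction can be sketched as follows, again assuming our tagged-tuple encoding of terms (e.g. ('let', x, e1, e2)); note that an inner let binding the same variable shadows the outer one:

```python
def is_value(e):
    return e[0] in ('num', 'nil') or \
           (e[0] == 'cons' and is_value(e[1]) and is_value(e[2]))

def subst(e, x, v):
    # e[v/x]: replace free occurrences of x in e with the value v.
    tag = e[0]
    if tag == 'var':
        return v if e[1] == x else e
    if tag == 'let':
        y, e1, e2 = e[1], e[2], e[3]
        # the binder y shadows x in the body
        return ('let', y, subst(e1, x, v), e2 if y == x else subst(e2, x, v))
    if tag in ('num', 'nil'):
        return e
    return (tag,) + tuple(subst(s, x, v) for s in e[1:])

def step_let(e):
    # [let]: once the right-hand side is a value, substitute it into the body.
    if e[0] == 'let' and is_value(e[2]):
        return subst(e[3], e[1], e[2])
    return None
```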

In order to describe where evaluation can happen and when it is finished, we extend our syntax with values v and evaluation contexts E:

Evaluation contexts give a grammar for where evaluation can take place. To reduce a term, we decompose it into an evaluation context and a redex, so that the pair matches one of the reduction rules above. The rule converts the redex to its contractum, and plugging the contractum back into the evaluation context yields the next term. To keep computing, we decompose, convert, and plug again, until (we hope) a value emerges.
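To make the decomposition game concrete, here is a worked reduction with a term of our own choosing, (1 + 2) * (3 + 4); at each step we show the evaluation context and the redex in its hole:

```latex
\begin{align*}
  (1 + 2) * (3 + 4) \;&\longrightarrow\; 3 * (3 + 4)
      && E = [\,] * (3 + 4), \quad \text{redex } 1 + 2 \\
  3 * (3 + 4) \;&\longrightarrow\; 3 * 7
      && E = 3 * [\,], \quad \text{redex } 3 + 4 \\
  3 * 7 \;&\longrightarrow\; 21
      && E = [\,], \quad \text{redex } 3 * 7
\end{align*}
```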

We define →* to be the reflexive, transitive closure of →. That is, e →* e′ means that e reduces to e′ in zero or more steps.
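Putting the pieces together, a small interpreter can realize → as a step function and →* as a loop. This is a sketch assuming our own tagged-tuple encoding of terms (e.g. ('plus', e1, e2)); the recursive search for the leftmost non-value subterm plays the role of decomposing into E and a redex:

```python
def is_value(e):
    return e[0] in ('num', 'nil') or \
           (e[0] == 'cons' and is_value(e[1]) and is_value(e[2]))

def subst(e, x, v):
    tag = e[0]
    if tag == 'var':
        return v if e[1] == x else e
    if tag == 'let':
        y, e1, e2 = e[1], e[2], e[3]
        return ('let', y, subst(e1, x, v), e2 if y == x else subst(e2, x, v))
    if tag in ('num', 'nil'):
        return e
    return (tag,) + tuple(subst(s, x, v) for s in e[1:])

def step(e):
    """One reduction step, or None if e is a value or stuck."""
    tag = e[0]
    # First, reduce inside an evaluation context (leftmost non-value subterm).
    if tag in ('plus', 'times', 'cons'):
        for i in (1, 2):
            if not is_value(e[i]):
                e2 = step(e[i])
                return None if e2 is None else e[:i] + (e2,) + e[i+1:]
    if tag in ('car', 'cdr') and not is_value(e[1]):
        e2 = step(e[1])
        return None if e2 is None else (tag, e2)
    if tag == 'let' and not is_value(e[2]):
        e2 = step(e[2])
        return None if e2 is None else ('let', e[1], e2, e[3])
    # Then contract a redex at the top.
    if tag in ('plus', 'times') and e[1][0] == 'num' and e[2][0] == 'num':
        n1, n2 = e[1][1], e[2][1]
        return ('num', n1 + n2 if tag == 'plus' else n1 * n2)
    if tag in ('car', 'cdr') and e[1][0] == 'cons' and is_value(e[1]):
        return e[1][1] if tag == 'car' else e[1][2]
    if tag == 'let' and is_value(e[2]):
        return subst(e[3], e[1], e[2])
    return None  # a value, or stuck

def evaluate(e):
    """e →* v, realized as a loop; returns None if e gets stuck."""
    while not is_value(e):
        e2 = step(e)
        if e2 is None:
            return None  # stuck
        e = e2
    return e
```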

The dynamic semantics of let-zl is now given by the evaluation function eval, defined as:

eval(e) = v,  if e →* v

As we discuss below, eval is partial for let-zl because there are errors that cause reduction to get “stuck.”

Exercise 1. Extend the language with Booleans. Besides Boolean literals, what do you think are essential operations? Extend the dynamic semantics with the necessary reduction rule(s) and evaluation context(s).

Later we’re going to do induction on the size of terms rather than the structure of terms, and we’re going to use a particular size function, defined as:

Exercise 2. Prove that for all values v, size(v) = 0.
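One size function with the property Exercise 2 demands (values cost 0) charges one unit per arithmetic, car/cdr, or let node; this particular definition is our own guess at a suitable one, sketched here in a tagged-tuple encoding of terms (also our convention):

```python
def size(e):
    # Values and variables cost 0; each unit of pending work costs 1.
    tag = e[0]
    if tag in ('num', 'nil', 'var'):
        return 0
    if tag == 'cons':
        return size(e[1]) + size(e[2])
    if tag in ('car', 'cdr'):
        return size(e[1]) + 1
    # plus, times, and let each charge one unit of work
    subterms = e[2:] if tag == 'let' else e[1:]
    return 1 + sum(size(s) for s in subterms)
```

With this definition every value has size 0, and each reduction rule strictly decreases size, which is what the termination argument at the end of the chapter needs.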

1.2.1 Errors

Can let-zl programs experience errors? What does it mean for a reduction semantics to have an error? Right now, there are no explicit, checked errors, but there are programs that don’t make sense—for example, taking the car of a number. What do these nonsense terms do right now? They get stuck! That is, a term that has such a nonsense redex in the hole won’t reduce any further.

Indeed, there are several classes of terms that get stuck in our definition of let-zl thus far:
  • car nil and cdr nil.

  • car n and cdr n, where n is an integer.

  • v₁ + v₂ or v₁ * v₂, where v₁ or v₂ is not an integer.

  • Any open term, that is, a term with a variable that is not bound by a let.

What do these stuck states mean? They might correspond to a real language executing an invalid instruction or some other kind of undefined behavior. This is no good, but there are several ways we could solve the problem.

First, we could make such programs defined by adding transition rules. For example, we could add a rule that the car of a number is 0. Another way to make the programs defined, without sanctioning nonsense, is to add an error state. We do this by extending terms e to configurations C, which are either terms or the special configuration wrong:

Then we add transition rules that detect all bad states and transition them to wrong, thus flagging them as errors.

This approach is equivalent to adding errors or exceptions to our programming language.
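The bad-state detection can be sketched as a contraction function that either fires a reduction rule or steps to wrong, again assuming terms encoded as tagged Python tuples like ('plus', e1, e2) (a convention of ours):

```python
WRONG = ('wrong',)   # the error configuration

def contract_or_wrong(e):
    """Contract a redex whose operands are already values, or go wrong."""
    tag = e[0]
    if tag in ('plus', 'times'):
        if e[1][0] == 'num' and e[2][0] == 'num':
            n1, n2 = e[1][1], e[2][1]
            return ('num', n1 + n2 if tag == 'plus' else n1 * n2)
        return WRONG          # e.g. nil + 3
    if tag in ('car', 'cdr'):
        if e[1][0] == 'cons':
            return e[1][1] if tag == 'car' else e[1][2]
        return WRONG          # car nil, car n, cdr nil, cdr n
    if tag == 'var':
        return WRONG          # an open term
    return None               # not a redex handled here
```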

We now update our evaluation function eval to take these errors into account:

eval(e) = v,  if e →* v

eval(e) = wrong,  if e →* wrong

Alas, eval is still partial, because there are stuck states that we haven’t converted to wrong states. (The other reason that eval could be partial is non-termination, but as we will prove, we don’t have that.) A second way to rule out stuck states is to impose a type system, which rules out programs with some kinds of errors. We can then prove that no programs admitted by the type system get stuck, which will make eval total for this language.

1.3 Static semantics

With a type system, we assign types to (some) terms to classify them by what kind of value they compute. In our first, simple type system, we will have only two types, int and list:

To keep things simple, we will limit list to be lists of integers.

We then define a relation that assigns types to terms. For example, integer literals always have type int:

Similarly, the literal empty list nil has type list:

To type check an addition or multiplication, we check that the operands are both integers, and then the whole thing is an integer:

To type check a cons, we require that the first operand be an integer and the second be a list, and then the whole thing is a list:

To type check car and cdr, we require that the operand be a list; the result for car is an integer, and the result for cdr is another list:

But when we come to check a variable x, we get stuck. What’s the type of a variable? To type check variables, we introduce type environments Γ, which keep track of the type of each let-bound variable:

We then retrofit all our rules to carry the environment through. For example, the rule for addition becomes

and similarly for the other rules we’ve seen so far.

Then we can write the rules for variables and for let. To type check a variable, look it up in the environment:

If it isn’t found, then the term is open and does not type.

Finally, to type check let x = e₁ in e₂, we first type check e₁, yielding some type t₁. We then type check e₂ with an environment extended with x bound to t₁. The resulting type, t₂, is the type of the whole expression:
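Collecting the rules, a checker might look like the sketch below, assuming terms as tagged Python tuples like ('plus', e1, e2) and environments Γ as dicts (conventions of ours; returning None plays the role of "does not type"):

```python
def type_of(e, env=None):
    env = env or {}                       # the empty environment by default
    tag = e[0]
    if tag == 'num':
        return 'int'
    if tag == 'nil':
        return 'list'
    if tag in ('plus', 'times'):
        ok = type_of(e[1], env) == 'int' and type_of(e[2], env) == 'int'
        return 'int' if ok else None
    if tag == 'cons':
        ok = type_of(e[1], env) == 'int' and type_of(e[2], env) == 'list'
        return 'list' if ok else None
    if tag == 'car':
        return 'int' if type_of(e[1], env) == 'list' else None
    if tag == 'cdr':
        return 'list' if type_of(e[1], env) == 'list' else None
    if tag == 'var':
        return env.get(e[1])              # None: the term is open
    if tag == 'let':
        t1 = type_of(e[2], env)
        if t1 is None:
            return None
        return type_of(e[3], {**env, e[1]: t1})
```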

Exercise 3. Extend the type system to your language with Booleans.

Exercise 4 (Generic lists). Modify the type system as follows: instead of a single type for lists of ints, allow lists of any element type: lists of ints, lists of lists of ints, and so on. How do you have to change the syntax of types? The typing rules?

1.3.1 Type safety

The goal of our type system is to prevent undetected errors—that is, stuck terms—in our programs. To show that it does this, we will prove type safety: if a term e has a type t, then one of:
  • It will reduce in some number of steps to a value that also has type t.

  • It will reduce in some number of steps to wrong.

  • It will reduce forever.

The last case cannot happen with this language, but it will be possible with languages we study in the future.

It is conventional to prove this theorem in terms of two lemmas, progress and preservation:
  • Preservation: if e has type t and converts in one step to e′, then e′ also has type t.

  • Progress: if e has a type t, then either e takes a conversion step or e is a value.

1.3.1.1 Preservation

Before we start, we make an observation about how typing derivations must be formed.

Lemma (Inversion). If Γ ⊢ e : t, then:
  • If the term is a variable x, then Γ(x) = t.

  • If the term is an integer n, then t = int.

  • If the term is nil, then t = list.

  • If the term is e₁ + e₂ or e₁ * e₂, then t = int, and Γ ⊢ e₁ : int and Γ ⊢ e₂ : int.

  • If the term is cons(e₁, e₂), then t = list, and Γ ⊢ e₁ : int and Γ ⊢ e₂ : list.

  • If the term is car e′ or cdr e′, then Γ ⊢ e′ : list, with t = int for car and t = list for cdr.

  • If the term is let x = e₁ in e₂, then there is some type t₁ such that Γ ⊢ e₁ : t₁ and Γ, x:t₁ ⊢ e₂ : t.

Proof. By inspection of the typing rules.

We want to prove that if a term has a type and takes a step, the resulting term also has a type. We can do this by considering the cases of the reduction relation and showing that each preserves the type. Alas, each rule involves evaluation contexts standing in the way of the action. Consequently, we’ll have to prove a lemma about evaluation contexts.

Lemma (Replacement). If ⊢ E[e] : t, then there exists some type t′ such that ⊢ e : t′. Furthermore, for any other term e′ such that ⊢ e′ : t′, it is the case that ⊢ E[e′] : t.

Proof. By induction on the structure of E:

QED.

There’s one more standard lemma we need before we can prove preservation:

Lemma (Substitution). If Γ, x:t₁ ⊢ e : t and Γ ⊢ v : t₁, then Γ ⊢ e[v/x] : t.

Proof. By induction on the typing derivation for e; by cases on the conclusion:

QED.

Now we are ready to prove preservation:

Lemma (Preservation). If ⊢ e : t and e → e′, then ⊢ e′ : t.

Proof. By cases on the reduction relation:

QED.

1.3.1.2 Progress

Before we can prove progress, we need to classify values by their types.

Lemma (Canonical forms).

If v has type t, then:
  • If t is int, then v is an integer literal n.

  • If t is list, then either v = nil or v = cons(v₁, v₂), where v₁ has type int and v₂ has type list.

Proof. By induction on the typing derivation of v:

QED.

Lemma (Context replacement). If e → e′, then E[e] → E[e′]. If e → wrong, then E[e] → wrong.

Proof. If e → e′, then e must be some redex e₁ in a hole: e = E₁[e₁]. Furthermore, e₁ must take a step to some contractum e₁′ such that e′ = E₁[e₁′]. Then the same redex converts to the same contractum in any evaluation context, including E[E₁[·]], so E[e] → E[e′].

If e → wrong, then e must be some redex e₁ in a hole: e = E₁[e₁], where e₁ converts to wrong. Then that same redex converts to wrong in any evaluation context, including E[E₁[·]].

Lemma (Progress). If ⊢ e : t, then the term e either converts or is a value.

Proof. By induction on the typing derivation; by cases on the conclusion:

QED.

Exercise 5. Prove progress and preservation for your language extended with Booleans.

Exercise 6. Prove progress and preservation for your language extended with generic lists.

Exercise 7. Are the previous two exercises orthogonal? How do they interact or avoid interaction?

1.4 Termination

Now let’s prove a rather strong property about a rather weak language.

Theorem (Size is work). Suppose ⊢ e : t and size(e) = k. Then e either reduces to a value or goes wrong in k or fewer steps.

Proof. This proof uses induction, but it uses induction on the set ℕ × ℕ under a lexicographic ordering. That is, to each term e we assign the pair whose first component is size(e) and whose second component is the number of nodes in e (when viewed as a tree). The lexicographic order is well-founded, and so we can apply the induction hypothesis whenever we have a term whose size is strictly less than the given one, or whose size is the same as the given one but whose number of nodes is strictly smaller.

QED.