On this page:
6.1 STLC revisited
6.1.1 Dynamic semantics
6.1.2 Static semantics
6.1.3 Adding a base type
6.1.4 Introducing let polymorphism
6.2 Type schemes in λ-ml
6.3 Statics
6.3.1 The logical type system
6.3.2 The syntax-directed type system
6.4 Type inference algorithm
6.4.1 Unification
6.4.2 Algorithm W
6.5 Constraint-based type inference

6 ML type inference

6.1 STLC revisited

We revisit the simply-typed λ calculus, but with several twists:
  • We remove base types (for now) in favor of uninterpreted type variables.

  • We add let expressions.

  • Most importantly, we leave types implicit.

Here is the syntax of the resulting system:
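
For concreteness, here is one way the syntax might be rendered as an OCaml datatype. This is a rough sketch; the constructor names are mine, not the notes':

  type term =
    | Var of string                  (* x *)
    | Lam of string * term           (* λx. e *)
    | App of term * term             (* e1 e2 *)
    | Let of string * term * term    (* let x = e1 in e2 *)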

6.1.1 Dynamic semantics

The only values are λ abstractions:

The dynamic semantics of this language includes two reduction rules:

Note that this means that, dynamically, we can consider let x = e1 in e2 as syntactic sugar for (λx. e2) e1.
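
As a minimal sketch, the two reduction rules can be written as a top-level step function over the term type above, assuming call-by-value as the values-are-λs presentation suggests. The substitution here is naive (not capture-avoiding), just to keep the example short:

  let rec subst x v e =
    match e with
    | Var y -> if y = x then v else e
    | Lam (y, body) -> if y = x then e else Lam (y, subst x v body)
    | App (e1, e2) -> App (subst x v e1, subst x v e2)
    | Let (y, e1, e2) ->
        Let (y, subst x v e1, if y = x then e2 else subst x v e2)

  let is_value = function Lam _ -> true | _ -> false

  (* The two top-level reduction rules: β for application, and
     substitution of the bound value for a let. *)
  let step = function
    | App (Lam (x, body), v) when is_value v -> Some (subst x v body)
    | Let (x, v, body) when is_value v -> Some (subst x v body)
    | _ -> None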

6.1.2 Static semantics

The static semantics should be familiar from the simply-typed lambda calculus, since it’s the same but for one thing: the rule for λ abstractions has to “guess” the domain type.

Here’s how it works: To type a λ expression, choose any type—made out of type variables and arrows—for the parameter that lets you type the body. For example, these are all valid judgments for the identity function:
Whatever type it is given, it returns that same type. How a variable is used may constrain its type. For example, to type a term in which one variable is applied to another, we have to guess types for both such that the first can be applied to the second. Suppose we guess some type for the argument variable. Then we are faced with choosing a type for the function variable that can be applied to that, namely an arrow type with that domain, and the codomain we pick becomes the type of the whole term. Those aren’t the only types we could have chosen, however. For example, we could choose a different type for the argument variable; then the function variable could have any arrow type with that domain, for any result type.
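
To make the “guessing” concrete, here is a sketch of the monotypes of this system (type variables and arrows only, for now), together with two of the many types the identity function can be given. The constructor names are mine:

  type ty =
    | TVar of string        (* a, b, ... *)
    | TArrow of ty * ty     (* t1 -> t2 *)

  (* Two valid typings of λx. x: a -> a, and (a -> b) -> (a -> b). *)
  let id_ty1 = TArrow (TVar "a", TVar "a")
  let id_ty2 = TArrow (TArrow (TVar "a", TVar "b"),
                       TArrow (TVar "a", TVar "b"))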

Exercise 43. Find types for these terms:

Exercise 44. Find a closed term that has no type. What is the only cause of type errors in this system?

6.1.3 Adding a base type

Let’s make things a bit more interesting by introducing the potential for more type errors. We add Booleans to the language: the literals true and false, and an if expression for distinguishing between the two:

There are two new reduction rules, for reducing an if expression in the true case and in the false case:

The type rules assign type bool to both Boolean literals. An if expression types if the condition is a bool and the branches have the same type as each other:
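
In the OCaml sketch, the extension adds a base type and two term forms. These definitions replace the earlier sketch’s, and again the names are mine:

  type ty =
    | TVar of string
    | TArrow of ty * ty
    | TBool                          (* the new base type *)

  type term =
    | Var of string
    | Lam of string * term
    | App of term * term
    | Let of string * term * term
    | Bool of bool                   (* true, false *)
    | If of term * term * term       (* if e1 then e2 else e3 *)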

Exercise 45. Extend λ-ml with one of these features: products, sums, numbers, records, recursion, references.

Exercise 46. Show that the term has no type.

6.1.4 Introducing let polymorphism

But why not? The term reduces to a Boolean, so shouldn’t it have type bool? It doesn’t, because the let-bound variable is used in two different ways: one application requires it to have one arrow type, while the other requires a different, incompatible one, because the result of that second application is itself used at a particular type.

However, if we were to reduce the let, substituting the right-hand side for the bound variable, we would get a term that types fine.

So this suggests a different way to type let: copy the right-hand side into each occurrence of the bound variable in the body:

With this rule, the example from the exercise types correctly. However, other things that shouldn’t type also type. In particular, if the bound variable does not occur in the body, the right-hand side is never typechecked at all, so a let whose right-hand side has no type can still be given a type. To remedy this, we ensure that the right-hand side has some type, even though we don’t restrict its occurrences in the body to that particular type:

This works! But it has two drawbacks:
  • Now we are typechecking the right-hand side of the let multiple times, once for each occurrence of the bound variable in the body. We can actually construct a family of terms whose typechecking work grows exponentially as a result of this copying.

  • In a real programming system, we want to be able to give a type to a binding on its own, because such systems often allow bindings with open scope: the rest of the program may be written in the future. This only makes sense if we can say what type the bound variable has. This is essential for separate or incremental compilation.

6.2 Type schemes in λ-ml

In the exercise above, the let-bound identity function is used at two different types. In fact, if we consider it carefully, it’s safe to use it on an argument of any type whatsoever, and we get that same type back. So we could say that it has the type scheme ∀α. α → α: for all types α, it can be used at type α → α.

We will write type schemes with the universal quantifier ∀ to indicate which type variables are free to be instantiated in the scheme:

Note that types in λ-ml (and real ML) do not contain ∀ the way System F types do; ∀ appears only at the front of type schemes. (This is called a prenex type.) This is key to making type inference possible, since we cannot in general infer System F types.
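
A prenex type scheme is then just a list of bound type variables in front of a monotype. A sketch, reusing the ty type from above:

  type scheme = Forall of string list * ty

  (* The identity's scheme, ∀a. a -> a: *)
  let id_scheme = Forall (["a"], TArrow (TVar "a", TVar "a"))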

6.3 Statics

For λ-ml’s statics, we allow type environments to bind variables to type schemes:

6.3.1 The logical type system

We first give a logical type system, which says which terms have a type but is not very much help in finding the type. The four rules for the four expression forms in our core language are nearly the same as before; the only difference is that rule [let-poly] allows the bound variable to have a type scheme instead of a mere monomorphic type (“monotype”):

On the other hand, notice that the domain type inferred for λ is still required to be a monotype.

There are two additional rules, which are not syntax-directed, but which are used to instantiate type schemes to types and generalize types to type schemes. To instantiate a type scheme, we can replace its bound variable with any type whatsoever:

Finally, we can generalize any type variable that is not free in the environment:

This is because any type variable that is not mentioned in the environment is unconstrained, but type variables that are mentioned might have requirements imposed on them.
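
The side condition “not free in the environment” can be sketched with a free-type-variable computation over types, schemes, and environments (reusing the earlier sketches; the helper names are mine):

  module S = Set.Make (String)

  type env = (string * scheme) list

  let rec ftv_ty = function
    | TVar a -> S.singleton a
    | TArrow (t1, t2) -> S.union (ftv_ty t1) (ftv_ty t2)
    | TBool -> S.empty

  let ftv_scheme (Forall (vars, t)) = S.diff (ftv_ty t) (S.of_list vars)

  let ftv_env gamma =
    List.fold_left (fun acc (_, s) -> S.union acc (ftv_scheme s)) S.empty gamma

  (* A type variable a may be generalized only if
     not (S.mem a (ftv_env gamma)). *)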

Exercise 47. Derive a type for .

Exercise 48. What types can you derive for ? What do they have in common? What type scheme instantiates to all of them?

6.3.2 The syntax-directed type system

The system presented above allows generalization and instantiation anywhere, but in fact these rules are only useful in certain places, because we do not allow polymorphic type schemes as the domains of functions. The only place that generalization is useful is when binding the right-hand side of a let, and instantiation is only useful when we look up a variable bound to a type scheme and want to use it. It’s not necessary to apply the rules anywhere else, so we can combine the variable rule with rule [inst] into a new rule [var-inst]:

The rule uses a relation for instantiating a type scheme into an arbitrary monotype:

Similarly, we combine rule [let-poly] with rule [gen] to get rule [let-gen], which generalizes the right-hand side of the let:

The metafunction gen simply generalizes the type into a type scheme with the given bound variables:
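
A sketch of gen in terms of the free-variable helpers above: it closes over every type variable of the type that is not free in the environment.

  let gen (gamma : env) (t : ty) : scheme =
    let vars = S.elements (S.diff (ftv_ty t) (ftv_env gamma)) in
    Forall (vars, t)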

The syntax-directed type system presented in this section admits exactly the same programs as the logical type system from the previous section. Unlike the logical system, it tells us exactly when we need to apply instantiation and generalization. But it still does not tell us what types to instantiate type schemes to in rule [var-inst], and it does not tell us what type to use for the domain in the rule for λ abstractions. To actually type terms, we will need an algorithm.

Exercise 49. Extend the syntax-directed type system for your extended language.

6.4 Type inference algorithm

The type inference rules presented above yield many possible typings for terms. For example, the identity function might have type bool → bool, or (α → β) → (α → β), and so on. The most general type, however, is α → α, since every other type it can be given is an instance of that. The algorithm presented in this section always finds the most general type (if a typing exists).

6.4.1 Unification

To perform type inference, we need a concept of a type substitution, which substitutes some monotypes for type variables:
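
A sketch of substitutions as finite maps from type variables to monotypes, with application and composition (reusing the ty sketch from above; the names are mine):

  module M = Map.Make (String)

  type subst = ty M.t

  let rec apply (s : subst) = function
    | TVar a -> (match M.find_opt a s with Some t -> t | None -> TVar a)
    | TArrow (t1, t2) -> TArrow (apply s t1, apply s t2)
    | TBool -> TBool

  (* compose s2 s1 is the substitution that applies s1 first, then s2. *)
  let compose (s2 : subst) (s1 : subst) : subst =
    M.union (fun _ t _ -> Some t) (M.map (apply s2) s1) s2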

Exercise 50. Give a type substitution such that =

Type inference will hinge on the idea of unification: given two types, is there a substitution that makes them equal? We will use this, for example, when we want to apply a function of one type to an argument of another. Type variables represent the unknown parts of the types in question, and unification tells us whether the types can be made the same by filling in that missing information.

The unification procedure takes two types and either produces the unifying substitution, or fails. In particular, any variable unifies with itself, producing the empty substitution:

A variable unifies with any other type by extending the substitution to map the variable to that type, provided that the variable does not occur free in the type:

(If the variable does occur free in the type, then they won’t unify and we have a type error. This is the only kind of type error in a system without base types.)

If a variable appears on the right, we swap it to the left and unify:

The base type bool unifies with itself:

Finally, two arrow types unify if their domains unify and their codomains unify:

Note that after unifying the domains produces a substitution, we apply that substitution to both codomains before unifying them, in order to propagate the information that we’ve collected. Further, the result of unifying the arrow types is the composition of the two substitutions. In general, when we work with substitutions, we will see that we accumulate and compose them.

Unification has an interesting property: it finds the most general unifier for any pair of unifiable types. A substitution S is more general than a substitution S′ if there exists a substitution R such that S′ = R ∘ S; that is, if S′ does more substitution than S. So suppose that τ1 and τ2 are two types, and suppose that some substitution S′ unifies them, S′(τ1) = S′(τ2). Then the substitution given by unifying τ1 and τ2 will be more general than (or equal to) S′.
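
Putting the rules together, here is a sketch of the unification procedure in OCaml, using apply and compose from the substitution sketch above (occurs check included):

  exception Unify_error of string

  let rec occurs a = function
    | TVar b -> a = b
    | TArrow (t1, t2) -> occurs a t1 || occurs a t2
    | TBool -> false

  let rec unify (t1 : ty) (t2 : ty) : subst =
    match t1, t2 with
    | TVar a, TVar b when a = b -> M.empty
    | TVar a, t | t, TVar a ->
        if occurs a t then raise (Unify_error "occurs check failed")
        else M.singleton a t
    | TBool, TBool -> M.empty
    | TArrow (d1, c1), TArrow (d2, c2) ->
        let s1 = unify d1 d2 in
        let s2 = unify (apply s1 c1) (apply s1 c2) in
        compose s2 s1
    | _ -> raise (Unify_error "cannot unify")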

6.4.2 Algorithm W

Now we are prepared to give the actual inference algorithm. It uses one metafunction, inst, which takes a type scheme and instantiates its bound variables with fresh type variables:

The inst metafunction is given a list of type variables to avoid.
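
A sketch of inst, using a global counter as the fresh-variable supply; the counter stands in for the “variables to avoid” bookkeeping, and the helper names are mine:

  let counter = ref 0

  let fresh () : ty =
    incr counter;
    TVar (Printf.sprintf "t%d" !counter)

  let inst (Forall (vars, t)) : ty =
    let s = List.fold_left (fun m a -> M.add a (fresh ()) m) M.empty vars in
    apply s t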

Then we have the inference algorithm itself. The algorithm takes as parameters a type environment and a term to type; if it succeeds, it returns both a type for the term and a substitution making it so. Let’s start with the simplest rules.

To type check a Boolean literal, we return bool with the empty substitution:

To type check a variable, we look up the variable in the environment and instantiate the resulting type scheme with fresh type variables:

To type check a λ abstraction, we create a fresh type variable to use as its domain type, and we type check the body assuming that the formal parameter has that type:

Note that while we “guess” a type variable for the domain, it will be refined (via unification) based on how it’s used in the body.

Type checking an application is more involved than the other rules we have seen, but the key operation is unifying the domain type of the operator with the type of the operand. First, we infer types for the operator and the operand, applying the substitution produced by typing the operator when we type the operand. Then we get a fresh type variable to stand for the result type of the application. We unify the type of the operator with the type we need it to have, namely an arrow from the operand’s type to the fresh result variable, yielding a third substitution. The composition of the three substitutions, along with the result type, is our result:

Note again how the substitutions are threaded through: Substitutions must be applied to any types or environments that existed before that substitution was created.

For the let rule, we first infer a type for the right-hand side, and we generalize that type with respect to the (updated-by-substitution) type environment. Then we bind the resulting type scheme in the environment for type checking the body:

Finally, the rule for if works by first type checking its three subterms, threading the substitutions through. Then it needs to unify the type of the condition with bool, and it needs to unify the types of the two branches with each other. Either branch’s type (with the final substitution applied) is then the type of the result.
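
Assembled into OCaml, here is a compact sketch of Algorithm W over the datatypes above, reusing apply, compose, unify, gen, and inst from the earlier sketches. The helper apply_env pushes a substitution through the environment; all names are mine.

  let apply_env (s : subst) (gamma : env) : env =
    List.map
      (fun (x, Forall (vars, t)) ->
        let s' = List.fold_left (fun m v -> M.remove v m) s vars in
        (x, Forall (vars, apply s' t)))
      gamma

  let rec infer (gamma : env) (e : term) : subst * ty =
    match e with
    | Bool _ -> (M.empty, TBool)
    | Var x -> (M.empty, inst (List.assoc x gamma))
    | Lam (x, body) ->
        let a = fresh () in
        let s, t_body = infer ((x, Forall ([], a)) :: gamma) body in
        (s, TArrow (apply s a, t_body))
    | App (e1, e2) ->
        let s1, t1 = infer gamma e1 in
        let s2, t2 = infer (apply_env s1 gamma) e2 in
        let a = fresh () in
        let s3 = unify (apply s2 t1) (TArrow (t2, a)) in
        (compose s3 (compose s2 s1), apply s3 a)
    | Let (x, e1, e2) ->
        let s1, t1 = infer gamma e1 in
        let gamma1 = apply_env s1 gamma in
        let s2, t2 = infer ((x, gen gamma1 t1) :: gamma1) e2 in
        (compose s2 s1, t2)
    | If (e1, e2, e3) ->
        let s1, t1 = infer gamma e1 in
        let s2, t2 = infer (apply_env s1 gamma) e2 in
        let s3, t3 = infer (apply_env s2 (apply_env s1 gamma)) e3 in
        let s4 = unify (apply s3 (apply s2 t1)) TBool in
        let s5 = unify (apply s4 (apply s3 t2)) (apply s4 t3) in
        let s = compose s5 (compose s4 (compose s3 (compose s2 s1))) in
        (s, apply s5 (apply s4 t3))

  (* For example, infer [] (Lam ("x", Var "x")) returns a type of the
     form t1 -> t1, the most general type of the identity. *)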

Theorem (Soundness and completeness of W).
  • Soundness: If the algorithm, given Γ and e, succeeds with substitution S and type τ, then S(Γ) ⊢ e : τ.

  • Completeness: If Γ ⊢ e : τ, then the algorithm succeeds on Γ and e with some type τ′ of which τ is a substitution instance. (That is, there is some substitution R such that R(τ′) = τ.)

Exercise 51. Extend unification and Algorithm W for your extended language.

6.5 Constraint-based type inference

Algorithm W interleaves walking the term and unification. There’s another approach based on constraints, where we generate a constraint that tells us what has to be true for a term to type, and then we solve the constraint. This technique is important mostly because it allows us to extend our type system in particular ways by adding new kinds of constraints.

Our language of constraints has the trivially true constraint, the conjunction of two constraints, a constraint that two types be equal, and a constraint that introduces a fresh type variable for use in a subconstraint. Here is the syntax of constraints:
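
A rough OCaml rendering of this constraint grammar (constructor names are mine):

  type constr =
    | CTrue                          (* the trivially true constraint *)
    | CAnd of constr * constr        (* conjunction *)
    | CEq of ty * ty                 (* the two types must be equal *)
    | CExists of string * constr     (* introduce a fresh type variable *)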

Then we can write a judgment that takes a constraint and, if possible, solves it, producing a substitution:
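
A sketch of such a solver, reusing unify, apply, and compose from above; the existential case simply descends into the subconstraint, on the assumption that its variable was generated fresh:

  let rec apply_constr s = function
    | CTrue -> CTrue
    | CAnd (c1, c2) -> CAnd (apply_constr s c1, apply_constr s c2)
    | CEq (t1, t2) -> CEq (apply s t1, apply s t2)
    | CExists (a, c) -> CExists (a, apply_constr s c)

  let rec solve (c : constr) : subst =
    match c with
    | CTrue -> M.empty
    | CEq (t1, t2) -> unify t1 t2
    | CAnd (c1, c2) ->
        let s1 = solve c1 in
        let s2 = solve (apply_constr s1 c2) in
        compose s2 s1
    | CExists (_, c) -> solve c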

Now that we know how to solve constraints, it remains to generate them for a given term. We do that with a constraint-generation metafunction which, given an environment, a term, and a type, generates the constraint required for the typing judgment to hold:

How can we use this to type a term if we don’t know its type to begin with? Suppose we want to type a term in the empty environment. Then we choose a fresh type variable, generate the constraint that the term has that type, and solve the constraint, yielding a substitution:

Then we look up the fresh type variable in the resulting substitution to read off the term’s type.

Note that for let-free programs, constraint generation is completely separated from solving. However, when we encounter a let, we still interleave solving to get the generalized type of the let-bound variable.
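
To make the pipeline concrete, here is a sketch of constraint generation for the let-free fragment, plus the “pick a fresh variable, generate, solve, look it up” recipe described above (reusing the earlier sketches; for brevity, fresh variables come from the global supply rather than being bound with CExists):

  (* gen_constr gamma e t is a constraint sufficient for gamma ⊢ e : t. *)
  let rec gen_constr (gamma : env) (e : term) (t : ty) : constr =
    match e with
    | Bool _ -> CEq (t, TBool)
    | Var x -> CEq (t, inst (List.assoc x gamma))
    | Lam (x, body) ->
        let a = fresh () and b = fresh () in
        CAnd (CEq (t, TArrow (a, b)),
              gen_constr ((x, Forall ([], a)) :: gamma) body b)
    | App (e1, e2) ->
        let a = fresh () in
        CAnd (gen_constr gamma e1 (TArrow (a, t)),
              gen_constr gamma e2 a)
    | If (e1, e2, e3) ->
        CAnd (gen_constr gamma e1 TBool,
              CAnd (gen_constr gamma e2 t, gen_constr gamma e3 t))
    | Let _ -> failwith "let needs interleaved solving (see above)"

  (* To type a closed, let-free term e: pick a fresh variable a,
     solve the constraint that e has type a, and look a up in the result. *)
  let type_of e =
    let a = fresh () in
    let s = solve (gen_constr [] e a) in
    apply s a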

Exercise 52. Extend constraint generation for your extended language.