On this page:
6.1 STLC revisited
6.1.1 Dynamic semantics
6.1.2 Static semantics
6.1.3 Adding a base type
6.1.4 Introducing let polymorphism
6.2 Type schemes in λ-ml
6.3 Statics
6.3.1 The logical type system
6.3.2 The syntax-directed type system
6.4 Type inference algorithm
6.4.1 Unification
6.4.2 Algorithm W
6.5 Constraint-based type inference

6 ML type inference

6.1 STLC revisited

We revisit the simply-typed λ calculus, but with several twists:
  • We remove base types (for now) in favor of uninterpreted type variables.

  • We add let expressions.

  • Most importantly, we leave types implicit.

Here is the syntax of the resulting system:
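
For concreteness, here is one way the syntax might be rendered as an OCaml datatype. This is a rough sketch; the constructor names are mine, not the notes':

  type term =
    | Var of string                  (* x *)
    | Lam of string * term           (* λx. e *)
    | App of term * term             (* e1 e2 *)
    | Let of string * term * term    (* let x = e1 in e2 *)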

6.1.1 Dynamic semantics

The only values are λ abstractions:

The dynamic semantics of this language includes two reduction rules:

Note that this means that, dynamically, we can consider let x = e1 in e2 as syntactic sugar for (λx. e2) e1.
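
As a minimal sketch, the two reduction rules can be written as a top-level step function over the term type above, assuming call-by-value as the values-are-λs presentation suggests. The substitution here is naive (not capture-avoiding), just to keep the example short:

  let rec subst x v e =
    match e with
    | Var y -> if y = x then v else e
    | Lam (y, body) -> if y = x then e else Lam (y, subst x v body)
    | App (e1, e2) -> App (subst x v e1, subst x v e2)
    | Let (y, e1, e2) ->
        Let (y, subst x v e1, if y = x then e2 else subst x v e2)

  let is_value = function Lam _ -> true | _ -> false

  (* The two top-level reduction rules: β for application, and
     substitution of the bound value for a let. *)
  let step = function
    | App (Lam (x, body), v) when is_value v -> Some (subst x v body)
    | Let (x, v, body) when is_value v -> Some (subst x v body)
    | _ -> None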

6.1.2 Static semantics

The static semantics should be familiar from the simply-typed lambda calculus, since it’s the same but for one thing: the rule for λ abstractions has to “guess” the domain type.

Here’s how it works: To type a λ expression, choose any type—made out of type variables and arrows—for the parameter that lets you type the body. For example, these are all valid judgments for the identity function:
Whatever type it is given, it returns that same type. How a variable is used may constrain its type. For example, to type a term in which one variable is applied to another, we have to guess types for both such that the first can be applied to the second. Suppose we guess some type for the argument variable. Then we are faced with choosing a type for the function variable that can be applied to that, namely an arrow type with that domain, and the codomain we pick becomes the type of the whole term. Those aren’t the only types we could have chosen, however. For example, we could choose a different type for the argument variable; then the function variable could have any arrow type with that domain, for any result type.
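
To make the “guessing” concrete, here is a sketch of the monotypes of this system (type variables and arrows only, for now), together with two of the many types the identity function can be given. The constructor names are mine:

  type ty =
    | TVar of string        (* a, b, ... *)
    | TArrow of ty * ty     (* t1 -> t2 *)

  (* Two valid typings of λx. x: a -> a, and (a -> b) -> (a -> b). *)
  let id_ty1 = TArrow (TVar "a", TVar "a")
  let id_ty2 = TArrow (TArrow (TVar "a", TVar "b"),
                       TArrow (TVar "a", TVar "b"))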

Exercise 43. Find types for these terms:

Exercise 44. Find a closed term that has no type. What is the only cause of type errors in this system?

6.1.3 Adding a base type

Let’s make things a bit more interesting by introducing the potential for more type errors. We add Booleans to the language: the literals true and false, and an if expression for distinguishing between the two:

There are two new reduction rules, for reducing an if expression in the true case and in the false case:

The type rules assign type bool to both Boolean literals. An if expression types if the condition is a bool and the branches have the same type as each other:
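
In the OCaml sketch, the extension adds a base type and two term forms. These definitions replace the earlier sketch’s, and again the names are mine:

  type ty =
    | TVar of string
    | TArrow of ty * ty
    | TBool                          (* the new base type *)

  type term =
    | Var of string
    | Lam of string * term
    | App of term * term
    | Let of string * term * term
    | Bool of bool                   (* true, false *)
    | If of term * term * term       (* if e1 then e2 else e3 *)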

Exercise 45. Extend λ-ml with one of these features: products, sums, numbers, records, recursion, references.

Exercise 46. Show that the term has no type.

6.1.4 Introducing let polymorphism

But why not? The term reduces to a Boolean, so shouldn’t it have type bool? It doesn’t, because the let-bound variable is used in two different ways: one application requires it to have one arrow type, while the other requires a different, incompatible one, because the result of that second application is itself used at a particular type.

However, if we were to reduce the let, substituting the right-hand side for the bound variable, we would get a term that types fine.

So this suggests a different way to type let: copy the right-hand side into each occurrence of the bound variable in the body:

With this rule, the example from the exercise types correctly. However, other things that shouldn’t type also type. In particular, if the bound variable does not occur in the body, the right-hand side is never typechecked at all, so a let whose right-hand side has no type can still be given a type. To remedy this, we ensure that the right-hand side has some type, even though we don’t restrict its occurrences in the body to that particular type:

This works! But it has two drawbacks:
  • Now we are typechecking the right-hand side of the let multiple times, once for each occurrence of the bound variable in the body. We can actually construct a family of terms whose typechecking work grows exponentially as a result of this copying.

  • In a real programming system, we want to be able to give a type to a binding on its own, because such systems often allow bindings with open scope: the rest of the program may be written in the future. This only makes sense if we can say what type the bound variable has. This is essential for separate or incremental compilation.

6.2 Type schemes in λ-ml

In the exercise above, the let-bound identity function is used at two different types. In fact, if we consider it carefully, it’s safe to use it on an argument of any type whatsoever, and we get that same type back. So we could say that it has the type scheme ∀α. α → α: for all types α, it can be used at type α → α.

We will write type schemes with the universal quantifier ∀ to indicate which type variables are free to be instantiated in the scheme:

Note that types in λ-ml (and real ML) do not contain ∀ the way System F types do; ∀ appears only at the front of type schemes. (This is called a prenex type.) This is key to making type inference possible, since we cannot in general infer System F types.
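
A prenex type scheme is then just a list of bound type variables in front of a monotype. A sketch, reusing the ty type from above:

  type scheme = Forall of string list * ty

  (* The identity's scheme, ∀a. a -> a: *)
  let id_scheme = Forall (["a"], TArrow (TVar "a", TVar "a"))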

6.3 Statics

For λ-ml’s statics, we allow type environments to bind variables to type schemes:

6.3.1 The logical type system

We first give a logical type system, which says which terms have a type but is not very much help in finding the type. The four rules for the four expression forms in our core language are nearly the same as before; the only difference is that rule [let-poly] allows the bound variable to have a type scheme instead of a mere monomorphic type (“monotype”):

On the other hand, notice that the domain type inferred for λ is still required to be a monotype.

There are two additional rules, which are not syntax-directed, but which are used to instantiate type schemes to types and generalize types to type schemes. To instantiate a type scheme, we can replace its bound variable with any type whatsoever:

Finally, we can generalize any type variable that is not free in the environment:

This is because any type variable that is not mentioned in the environment is unconstrained, but type variables that are mentioned might have requirements imposed on them.
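
The side condition “not free in the environment” can be sketched with a free-type-variable computation over types, schemes, and environments (reusing the earlier sketches; the helper names are mine):

  module S = Set.Make (String)

  type env = (string * scheme) list

  let rec ftv_ty = function
    | TVar a -> S.singleton a
    | TArrow (t1, t2) -> S.union (ftv_ty t1) (ftv_ty t2)
    | TBool -> S.empty

  let ftv_scheme (Forall (vars, t)) = S.diff (ftv_ty t) (S.of_list vars)

  let ftv_env gamma =
    List.fold_left (fun acc (_, s) -> S.union acc (ftv_scheme s)) S.empty gamma

  (* A type variable a may be generalized only if
     not (S.mem a (ftv_env gamma)). *)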

Exercise 47. Derive a type for .

Exercise 48. What types can you derive for ? What do they have in common? What type scheme instantiates to all of them?

6.3.2 The syntax-directed type system

The system presented above allows generalization and instantiation anywhere, but in fact these rules are only useful in certain places, because we do not allow polymorphic type schemes as the domains of functions. The only place that generalization is useful is when binding the right-hand side of a let, and instantiation is only useful when we look up a variable bound to a type scheme and want to use it. It’s not necessary to apply the rules anywhere else, so we can combine the variable rule with rule [inst] into a new rule [var-inst]:

The rule uses a relation for instantiating a type scheme into an arbitrary monotype:

Similarly, we combine rule [let-poly] with rule [gen] to get rule [let-gen], which generalizes the right-hand side of the let:

The metafunction gen simply generalizes the type into a type scheme with the given bound variables:
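
A sketch of gen in terms of the free-variable helpers above: it closes over every type variable of the type that is not free in the environment.

  let gen (gamma : env) (t : ty) : scheme =
    let vars = S.elements (S.diff (ftv_ty t) (ftv_env gamma)) in
    Forall (vars, t)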

The syntax-directed type system presented in this section admits exactly the same programs as the logical type system from the previous section. Unlike the logical system, it tells us exactly when we need to apply instantiation and generalization. But it still does not tell us what types to instantiate type schemes to in rule [var-inst], and it does not tell us what type to use for the domain in the rule for λ abstractions. To actually type terms, we will need an algorithm.

Exercise 49. Extend the syntax-directed type system for your extended language.

6.4 Type inference algorithm

The type inference rules presented above yield many possible typings for terms. For example, the identity function might have type bool → bool, or (α → β) → (α → β), and so on. The most general type, however, is α → α, since every other type it can be given is an instance of that. The algorithm presented in this section always finds the most general type (if a typing exists).

6.4.1 Unification

To perform type inference, we need a concept of a type substitution, which substitutes some monotypes for type variables:
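
A sketch of substitutions as finite maps from type variables to monotypes, with application and composition (reusing the ty sketch from above; the names are mine):

  module M = Map.Make (String)

  type subst = ty M.t

  let rec apply (s : subst) = function
    | TVar a -> (match M.find_opt a s with Some t -> t | None -> TVar a)
    | TArrow (t1, t2) -> TArrow (apply s t1, apply s t2)
    | TBool -> TBool

  (* compose s2 s1 is the substitution that applies s1 first, then s2. *)
  let compose (s2 : subst) (s1 : subst) : subst =
    M.union (fun _ t _ -> Some t) (M.map (apply s2) s1) s2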

Exercise 50. Give a type substitution such that =

Type inference will hinge on the idea of unification: given two types, is there a substitution that makes them equal? We will use this, for example, when we want to apply a function of one type to an argument of another. Type variables represent the unknown parts of the types in question, and unification tells us whether the types can be made the same by filling in that missing information.

The unification procedure takes two types and either produces the unifying substitution, or fails. In particular, any variable unifies with itself, producing the empty substitution:

A variable unifies with any other type by extending the substitution to map the variable to that type, provided that the variable does not occur free in the type:

(If the variable does occur free in the type, then they won’t unify and we have a type error. This is the only kind of type error in a system without base types.)

If a variable appears on the right, we swap it to the left and unify:

The base type bool unifies with itself:

Finally, two arrow types unify if their domains unify and their codomains unify:

Note that after unifying the domains produces a substitution, we apply that substitution to both codomains before unifying them, in order to propagate the information that we’ve collected. Further, the result of unifying the arrow types is the composition of the two substitutions. In general, when we work with substitutions, we will see that we accumulate and compose them.

Unification has an interesting property: it finds the most general unifier for any pair of unifiable types. A substitution S is more general than a substitution S′ if there exists a substitution R such that S′ = R ∘ S; that is, if S′ does more substitution than S. So suppose that τ1 and τ2 are two types, and suppose that some substitution S′ unifies them, S′(τ1) = S′(τ2). Then the substitution given by unifying τ1 and τ2 will be more general than (or equal to) S′.
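
Putting the rules together, here is a sketch of the unification procedure in OCaml, using apply and compose from the substitution sketch above (occurs check included):

  exception Unify_error of string

  let rec occurs a = function
    | TVar b -> a = b
    | TArrow (t1, t2) -> occurs a t1 || occurs a t2
    | TBool -> false

  let rec unify (t1 : ty) (t2 : ty) : subst =
    match t1, t2 with
    | TVar a, TVar b when a = b -> M.empty
    | TVar a, t | t, TVar a ->
        if occurs a t then raise (Unify_error "occurs check failed")
        else M.singleton a t
    | TBool, TBool -> M.empty
    | TArrow (d1, c1), TArrow (d2, c2) ->
        let s1 = unify d1 d2 in
        let s2 = unify (apply s1 c1) (apply s1 c2) in
        compose s2 s1
    | _ -> raise (Unify_error "cannot unify")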

6.4.2 Algorithm W

Now we are prepared to give the actual inference algorithm. It uses one metafunction, inst, which takes a type scheme and instantiates its bound variables with fresh type variables:

The inst metafunction is given a list of type variables to avoid.
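
A sketch of inst, using a global counter as the fresh-variable supply; the counter stands in for the “variables to avoid” bookkeeping, and the helper names are mine:

  let counter = ref 0

  let fresh () : ty =
    incr counter;
    TVar (Printf.sprintf "t%d" !counter)

  let inst (Forall (vars, t)) : ty =
    let s = List.fold_left (fun m a -> M.add a (fresh ()) m) M.empty vars in
    apply s t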

Then we have the inference algorithm itself. The algorithm takes as parameters a type environment and a term to type; if it succeeds, it returns both a type for the term and a substitution making it so. Let’s start with the simplest rules.

To type check a Boolean literal, we return bool with the empty substitution:

To type check a variable, we look up the variable in the environment and instantiate the resulting type scheme with fresh type variables:

To type check a λ abstraction, we create a fresh type variable to use as its domain type, and we type check the body assuming that the formal parameter has that type:

Note that while we “guess” a type variable for the domain, it will be refined (via unification) based on how it’s used in the body.

Type checking an application is more involved than the other rules we have seen, but the key operation is unifying the domain type of the operator with the type of the operand. First, we infer types for the operator and the operand, applying the substitution produced by typing the operator when we type the operand. Then we get a fresh type variable to stand for the result type of the application. We unify the type of the operator with the type we need it to have, namely an arrow from the operand’s type to the fresh result variable, yielding a third substitution. The composition of the three substitutions, along with the result type, is our result:

Note again how the substitutions are threaded through: Substitutions must be applied to any types or environments that existed before that substitution was created.

For the let rule, we first infer a type for the right-hand side, and we generalize that type with respect to the (updated-by-substitution) type environment. Then we bind the resulting type scheme in the environment for type checking the body:

Finally, the rule for if works by first type checking its three subterms, threading the substitutions through. Then it needs to unify the type of the condition with bool, and it needs to unify the types of the two branches with each other. Either branch’s type (with the final substitution applied) is then the type of the result.
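
Assembled into OCaml, here is a compact sketch of Algorithm W over the datatypes above, reusing apply, compose, unify, gen, and inst from the earlier sketches. The helper apply_env pushes a substitution through the environment; all names are mine.

  let apply_env (s : subst) (gamma : env) : env =
    List.map
      (fun (x, Forall (vars, t)) ->
        let s' = List.fold_left (fun m v -> M.remove v m) s vars in
        (x, Forall (vars, apply s' t)))
      gamma

  let rec infer (gamma : env) (e : term) : subst * ty =
    match e with
    | Bool _ -> (M.empty, TBool)
    | Var x -> (M.empty, inst (List.assoc x gamma))
    | Lam (x, body) ->
        let a = fresh () in
        let s, t_body = infer ((x, Forall ([], a)) :: gamma) body in
        (s, TArrow (apply s a, t_body))
    | App (e1, e2) ->
        let s1, t1 = infer gamma e1 in
        let s2, t2 = infer (apply_env s1 gamma) e2 in
        let a = fresh () in
        let s3 = unify (apply s2 t1) (TArrow (t2, a)) in
        (compose s3 (compose s2 s1), apply s3 a)
    | Let (x, e1, e2) ->
        let s1, t1 = infer gamma e1 in
        let gamma1 = apply_env s1 gamma in
        let s2, t2 = infer ((x, gen gamma1 t1) :: gamma1) e2 in
        (compose s2 s1, t2)
    | If (e1, e2, e3) ->
        let s1, t1 = infer gamma e1 in
        let s2, t2 = infer (apply_env s1 gamma) e2 in
        let s3, t3 = infer (apply_env s2 (apply_env s1 gamma)) e3 in
        let s4 = unify (apply s3 (apply s2 t1)) TBool in
        let s5 = unify (apply s4 (apply s3 t2)) (apply s4 t3) in
        let s = compose s5 (compose s4 (compose s3 (compose s2 s1))) in
        (s, apply s5 (apply s4 t3))

  (* For example, infer [] (Lam ("x", Var "x")) returns a type of the
     form t1 -> t1, the most general type of the identity. *)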

Theorem (Soundness and completeness of W).
  • Soundness: If the algorithm, given Γ and e, succeeds with substitution S and type τ, then S(Γ) ⊢ e : τ.

  • Completeness: If Γ ⊢ e : τ, then the algorithm succeeds on Γ and e with some type τ′ of which τ is a substitution instance. (That is, there is some substitution R such that R(τ′) = τ.)

Exercise 51. Extend unification and Algorithm W for your extended language.

6.5 Constraint-based type inference

Algorithm W interleaves walking the term and unification. There’s another approach based on constraints, where we generate a constraint that tells us what has to be true for a term to type, and then we solve the constraint. This technique is important mostly because it allows us to extend our type system in particular ways by adding new kinds of constraints.

Our language of constraints has the trivially true constraint, the conjunction of two constraints, a constraint that two types be equal, and a constraint that introduces a fresh type variable for use in a subconstraint. Here is the syntax of constraints:
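
A rough OCaml rendering of this constraint grammar (constructor names are mine):

  type constr =
    | CTrue                          (* the trivially true constraint *)
    | CAnd of constr * constr        (* conjunction *)
    | CEq of ty * ty                 (* the two types must be equal *)
    | CExists of string * constr     (* introduce a fresh type variable *)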

Then we can write a judgment that takes a constraint and, if possible, solves it, producing a substitution:
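
A sketch of such a solver, reusing unify, apply, and compose from above; the existential case simply descends into the subconstraint, on the assumption that its variable was generated fresh:

  let rec apply_constr s = function
    | CTrue -> CTrue
    | CAnd (c1, c2) -> CAnd (apply_constr s c1, apply_constr s c2)
    | CEq (t1, t2) -> CEq (apply s t1, apply s t2)
    | CExists (a, c) -> CExists (a, apply_constr s c)

  let rec solve (c : constr) : subst =
    match c with
    | CTrue -> M.empty
    | CEq (t1, t2) -> unify t1 t2
    | CAnd (c1, c2) ->
        let s1 = solve c1 in
        let s2 = solve (apply_constr s1 c2) in
        compose s2 s1
    | CExists (_, c) -> solve c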

Now that we know how to solve constraints, it remains to generate them for a given term. We do that with a constraint-generation metafunction which, given an environment, a term, and a type, generates the constraint required for the typing judgment to hold:

How can we use this to type a term if we don’t know its type to begin with? Suppose we want to type a term in the empty environment. Then we choose a fresh type variable, generate the constraint that the term has that type, and solve the constraint, yielding a substitution:

Then we look up the fresh type variable in the resulting substitution to read off the term’s type.

Note that for let-free programs, constraint generation is completely separated from solving. However, when we encounter a let, we still interleave solving to get the generalized type of the let-bound variable.
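
To make the pipeline concrete, here is a sketch of constraint generation for the let-free fragment, plus the “pick a fresh variable, generate, solve, look it up” recipe described above (reusing the earlier sketches; for brevity, fresh variables come from the global supply rather than being bound with CExists):

  (* gen_constr gamma e t is a constraint sufficient for gamma ⊢ e : t. *)
  let rec gen_constr (gamma : env) (e : term) (t : ty) : constr =
    match e with
    | Bool _ -> CEq (t, TBool)
    | Var x -> CEq (t, inst (List.assoc x gamma))
    | Lam (x, body) ->
        let a = fresh () and b = fresh () in
        CAnd (CEq (t, TArrow (a, b)),
              gen_constr ((x, Forall ([], a)) :: gamma) body b)
    | App (e1, e2) ->
        let a = fresh () in
        CAnd (gen_constr gamma e1 (TArrow (a, t)),
              gen_constr gamma e2 a)
    | If (e1, e2, e3) ->
        CAnd (gen_constr gamma e1 TBool,
              CAnd (gen_constr gamma e2 t, gen_constr gamma e3 t))
    | Let _ -> failwith "let needs interleaved solving (see above)"

  (* To type a closed, let-free term e: pick a fresh variable a,
     solve the constraint that e has type a, and look a up in the result. *)
  let type_of e =
    let a = fresh () in
    let s = solve (gen_constr [] e a) in
    apply s a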

Exercise 52. Extend constraint generation for your extended language.