6 ML type inference
6.1 STLC revisited
We revisit the simply-typed λ calculus, but with several twists:
We remove base types (for now) in favor of
uninterpreted type variables $\alpha$.
We add let expressions.
Most importantly, we leave types implicit.
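Here is the syntax of the resulting system:
$$\begin{aligned} e &::= x \mid \lambda x.\, e \mid e\; e \mid \mathsf{let}\ x = e\ \mathsf{in}\ e \\ \tau &::= \alpha \mid \tau \to \tau \end{aligned}$$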
6.1.1 Dynamic semantics
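The only values are λ abstractions:
$$v ::= \lambda x.\, e$$
The dynamic semantics of this language includes two reduction rules:
$$(\lambda x.\, e)\; v \longrightarrow e[v/x] \qquad\qquad \mathsf{let}\ x = v\ \mathsf{in}\ e \longrightarrow e[v/x]$$
Note this means that dynamically we can consider $\mathsf{let}\ x = e_1\ \mathsf{in}\ e_2$ as syntactic sugar for $(\lambda x.\, e_2)\; e_1$.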
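Read operationally, the two reduction rules amount to a small evaluator. The following Python sketch assumes closed programs, call-by-value evaluation, and a hypothetical tuple encoding of terms (`("var", x)`, `("lam", x, e)`, `("app", e1, e2)`, `("let", x, e1, e2)`); none of these names come from the text. Because only closed values are ever substituted, no renaming of bound variables is needed.

```python
def subst(e, x, v):
    """Replace the free occurrences of variable x in e by the closed value v."""
    tag = e[0]
    if tag == "var":
        return v if e[1] == x else e
    if tag == "lam":
        _, y, body = e
        # A binder for the same name shadows x, so we stop there.
        return e if y == x else ("lam", y, subst(body, x, v))
    if tag == "app":
        return ("app", subst(e[1], x, v), subst(e[2], x, v))
    if tag == "let":
        _, y, e1, e2 = e
        return ("let", y, subst(e1, x, v), e2 if y == x else subst(e2, x, v))
    raise ValueError(f"unknown term {e!r}")


def evaluate(e):
    """Reduce a closed term to a value (a lambda abstraction)."""
    tag = e[0]
    if tag == "lam":
        return e
    if tag == "app":
        fun = evaluate(e[1])
        arg = evaluate(e[2])
        _, x, body = fun  # the only values are lambda abstractions
        return evaluate(subst(body, x, arg))
    if tag == "let":
        _, x, e1, e2 = e
        return evaluate(subst(e2, x, evaluate(e1)))
    raise ValueError(f"free variable or unknown term {e!r}")


identity = ("lam", "x", ("var", "x"))
body = ("app", ("var", "f"), ("var", "f"))
as_let = ("let", "f", identity, body)
as_app = ("app", ("lam", "f", body), identity)
```

Under these assumptions, `evaluate(as_let)` and `evaluate(as_app)` produce the same value, which is the sense in which let is sugar dynamically.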
6.1.2 Static semantics
The static semantics should be familiar from the simply-typed lambda calculus,
since it’s the same but for one thing: the rule for λ has to “guess” the
domain type.
Here’s how it works: To type a λ expression, choose any type—made out of
type variables and arrows—for the parameter that lets you type the body.
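For example, these are all valid judgments for the identity function:
$$\vdash \lambda x.\, x : \alpha \to \alpha \qquad \vdash \lambda x.\, x : \beta \to \beta \qquad \vdash \lambda x.\, x : (\alpha \to \beta) \to (\alpha \to \beta)$$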
Whatever type it’s given, it returns the same type. How a variable is used
may constrain its type. For example, to type $\lambda x.\, \lambda y.\, x\; y$ we
have to guess types for $x$ and $y$ such that $x$ can be applied to $y$.
Suppose we guess $\alpha$ for $y$. Then we are faced with choosing a type for
$x$ that can be applied to that, like say $\alpha \to \beta$. Then we get the
type $(\alpha \to \beta) \to \alpha \to \beta$ for the whole term. Those aren’t
the only types we could have chosen, however. For example, we could choose
$\alpha \to \alpha$ for $y$; then $x$ could be any arrow type
$(\alpha \to \alpha) \to \tau$ for any type $\tau$.
Exercise 43. Find types for these terms:
Exercise 44. Find a closed term that has no type. What is the only cause of type
errors in this system?
6.1.3 Adding a base type
Let’s make things a bit more interesting by introducing the potential for more
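type errors. We add Booleans to the language. We add true and false, and if
expressions for distinguishing between the two:
$$e ::= \cdots \mid \mathsf{true} \mid \mathsf{false} \mid \mathsf{if}\ e\ \mathsf{then}\ e\ \mathsf{else}\ e \qquad\qquad v ::= \cdots \mid \mathsf{true} \mid \mathsf{false}$$
There are two new reduction rules, for reducing if in the true case and in the
false case:
$$\mathsf{if}\ \mathsf{true}\ \mathsf{then}\ e_1\ \mathsf{else}\ e_2 \longrightarrow e_1 \qquad\qquad \mathsf{if}\ \mathsf{false}\ \mathsf{then}\ e_1\ \mathsf{else}\ e_2 \longrightarrow e_2$$
The type rules assign type bool to both Boolean expressions. An if
expression types if the condition is a bool and if the branches have the
same type as each other:
$$\Gamma \vdash \mathsf{true} : \mathsf{bool} \qquad \Gamma \vdash \mathsf{false} : \mathsf{bool} \qquad \frac{\Gamma \vdash e_1 : \mathsf{bool} \quad \Gamma \vdash e_2 : \tau \quad \Gamma \vdash e_3 : \tau}{\Gamma \vdash \mathsf{if}\ e_1\ \mathsf{then}\ e_2\ \mathsf{else}\ e_3 : \tau}$$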
Exercise 45. Extend λ-ml with one of these features: products, sums, numbers,
records, recursion, references.
Exercise 46. Show that the term
has no type.
6.1.4 Introducing let polymorphism
But why not? The term reduces to a Boolean, so shouldn’t it have type
$\mathsf{bool}$? It doesn’t, because the let-bound variable is used in two
different ways, and each use demands a different type: the type it needs in one
application is not the type it needs in the other, where the result of the
application is itself applied to a Boolean.
However, if we were to reduce the let, copying the bound expression into the
body, we would get a term which types fine.
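So this suggests a different way to type $\mathsf{let}\ x = e_1\ \mathsf{in}\ e_2$:
copy $e_1$ into each occurrence of $x$ in $e_2$:
$$\frac{\Gamma \vdash e_2[e_1/x] : \tau}{\Gamma \vdash \mathsf{let}\ x = e_1\ \mathsf{in}\ e_2 : \tau}$$
With this rule, the example from the exercise types correctly. However, other
things that shouldn’t type also type. In particular, when $x$ does not occur in
$e_2$, the whole term has a type even if the subterm $e_1$ has none. To remedy
this, we ensure that $e_1$ has a type, even though we don’t restrict it to have
that particular type in $e_2$:
$$\frac{\Gamma \vdash e_1 : \tau_1 \qquad \Gamma \vdash e_2[e_1/x] : \tau}{\Gamma \vdash \mathsf{let}\ x = e_1\ \mathsf{in}\ e_2 : \tau}$$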
This works! But it has two drawbacks:
6.2 Type schemes in λ-ml
In the exercise above, the let-bound variable is used at two different types.
In fact, if we consider it carefully, it’s safe to use it on an
argument of any type $\tau$, and we get that same $\tau$ back.
So we could say that it has the type scheme $\tau \to \tau$
for all types $\tau$.
We will write type schemes with the universal quantifier $\forall$ to
indicate which type variables are free to be instantiated in the scheme:
$$\sigma ::= \tau \mid \forall \alpha.\, \sigma$$
Note that types in λ-ml (and real ML) do not contain $\forall$ like System F
types do; $\forall$ appears only at the front of type schemes.
(This is called a prenex type.) This is key to making type inference possible,
since we cannot in general infer System F types.
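The prenex restriction can be made concrete in a data representation. The following is a minimal Python sketch, assuming a hypothetical encoding (`TVar`, `TArrow`, `Scheme`, and `ftv` are illustrative names, not from the text): quantified variables live only in `Scheme`, so a monotype can never contain a quantifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TVar:
    name: str


@dataclass(frozen=True)
class TArrow:
    dom: object  # a monotype
    cod: object  # a monotype


@dataclass(frozen=True)
class Scheme:
    bound: tuple  # names of the quantified variables, all at the front
    body: object  # a monotype: TVar or TArrow, never a Scheme


def ftv(t):
    """Free type variables of a monotype or a type scheme."""
    if isinstance(t, TVar):
        return {t.name}
    if isinstance(t, TArrow):
        return ftv(t.dom) | ftv(t.cod)
    if isinstance(t, Scheme):
        return ftv(t.body) - set(t.bound)
    raise TypeError(f"not a type: {t!r}")


# The scheme "for all a, a -> a" of the identity function.
identity_scheme = Scheme(("a",), TArrow(TVar("a"), TVar("a")))
```

In this encoding a System F type such as one with a quantifier to the left of an arrow simply cannot be written down, because `TArrow` only holds monotypes.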
6.3 Statics
For λ-ml’s statics, we allow type environments $\Gamma$ to bind variables
to type schemes:
$$\Gamma ::= \bullet \mid \Gamma, x : \sigma$$
6.3.1 The logical type system
We first give a logical type system, which says which terms have a type but
is not much help in finding the type. The four rules for the four
expression forms in our language are nearly the same as before; the only
difference is in rule [let-poly], which allows the bound variable to have a
type scheme instead of a mere monomorphic type (“monotype”):
On the other hand, notice that the domain type inferred for λ is still
required to be a monotype.
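There are two further rules, which are not syntax directed, and which are
used to instantiate type schemes to types and generalize types to type
schemes. To instantiate a type scheme, we can replace its bound variable
with any type whatsoever:
$$\frac{\Gamma \vdash e : \forall \alpha.\, \sigma}{\Gamma \vdash e : \sigma[\tau/\alpha]}\ [\text{inst}]$$
Finally, we can generalize any type variable that is not free in the
environment $\Gamma$:
$$\frac{\Gamma \vdash e : \sigma \qquad \alpha \notin \mathrm{ftv}(\Gamma)}{\Gamma \vdash e : \forall \alpha.\, \sigma}\ [\text{gen}]$$
This is because any type variable that is not mentioned in $\Gamma$ is
unconstrained, but type variables that are mentioned might have requirements
imposed on them.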
Exercise 47. Derive a type for
.
Exercise 48. What types can you derive for ?
What do they have in common? What type scheme instantiates to all of them?
6.3.2 The syntax-directed type system
The system presented above allows generalization and instantiation anywhere, but
in fact, these rules are only useful in certain places, because we do not allow
polymorphic type schemes as the domains of functions. The only
place that generalization is useful is when binding the right-hand side of a
let, and instantiation is only useful when we look up a variable with
a type scheme and want to use it. It’s not necessary to apply the rules anywhere
else, so we can combine rule [var] with rule [inst] into a new rule
[var-inst]:
The rule uses a relation for instantiating a type scheme into an
arbitrary monotype:
Similarly, we combine rule [let-poly] with rule [gen]
to get rule [let-gen], which generalizes the right-hand side
of the let:
The metafunction gen simply generalizes the type into a type scheme with
the given bound variables:
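As a rough illustration of generalization, here is a Python sketch. It assumes a hypothetical encoding (`TVar`, `TArrow`, `Scheme`, `ftv` are illustrative names), and, unlike the metafunction described in the text, this `gen` computes the bound variables itself as the variables of the type that are not free in the environment, rather than receiving them as an argument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TVar:
    name: str


@dataclass(frozen=True)
class TArrow:
    dom: object
    cod: object


@dataclass(frozen=True)
class Scheme:
    bound: tuple  # quantified variable names
    body: object  # a monotype


def ftv(t):
    """Free type variables of a monotype or a type scheme."""
    if isinstance(t, TVar):
        return {t.name}
    if isinstance(t, TArrow):
        return ftv(t.dom) | ftv(t.cod)
    if isinstance(t, Scheme):
        return ftv(t.body) - set(t.bound)
    raise TypeError(f"not a type: {t!r}")


def gen(env, ty):
    """Quantify every variable of `ty` that is not free in `env`.

    `env` is assumed to be a dict from variable names to Schemes.
    """
    env_ftv = set().union(*(ftv(s) for s in env.values()))
    return Scheme(tuple(sorted(ftv(ty) - env_ftv)), ty)


a, b = TVar("a"), TVar("b")
closed = gen({}, TArrow(a, a))                      # both occurrences of a are generalized
partial = gen({"x": Scheme((), a)}, TArrow(a, b))   # a is pinned by the environment
```

Here `closed` quantifies `a`, while `partial` quantifies only `b`, since `a` is mentioned in the environment and may still be constrained.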
The syntax-directed type system presented in this section admits exactly the
same programs as the logical type system from the previous section. Unlike the
logical system, it tells us exactly when we need to apply instantiation and
generalization. But it still does not tell us what types to instantiate type
schemes to in rule [var-inst], and it does not tell us what type
to use for the domain in the rule for λ. To actually type terms, we will
need an algorithm.
Exercise 49. Extend the syntax-directed type system for your extended language.
6.4 Type inference algorithm
The type inference rules presented above yield many possible typings for terms.
For example, the identity function might have type $\mathsf{bool} \to \mathsf{bool}$,
or $(\alpha \to \beta) \to (\alpha \to \beta)$, and so on.
The most general type, however, is $\alpha \to \alpha$, since all other types
are instances of that. The algorithm presented in this section always finds
the most general type (if a typing exists).
6.4.1 Unification
To perform type inference, we need a concept of a type substitution,
which substitutes some monotypes for type variables:
Exercise 50. Give a type substitution such that
=
Type inference will hinge on the idea of unification: Given two types
$\tau_1$ and $\tau_2$, is there a substitution $S$ that makes them
equal, $S\tau_1 = S\tau_2$? We will use
this, for example, if we want to apply a function with type $\tau_1 \to \tau_2$
to an argument of type $\tau_3$. Type variables represent unknown parts of the
types in question, and unification tells us whether the types can be made the
same by filling in the missing information.
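The unification procedure takes two types and either produces the unifying
substitution, or fails. In particular, any variable unifies with itself,
producing the empty substitution:
$$\mathrm{unify}(\alpha, \alpha) = \bullet$$
A variable $\alpha$ unifies with any other type $\tau$ by extending
the substitution to map $\alpha$ to $\tau$,
provided that $\alpha$ is not free in $\tau$:
$$\mathrm{unify}(\alpha, \tau) = [\alpha \mapsto \tau] \qquad \text{if } \alpha \notin \mathrm{ftv}(\tau)$$
(If $\alpha \in \mathrm{ftv}(\tau)$ then they won’t unify and we have a type error. This is
the only kind of type error in a system without base types.)
If a variable appears on the right, we swap it to the left and unify:
$$\mathrm{unify}(\tau, \alpha) = \mathrm{unify}(\alpha, \tau)$$
Type bool unifies with itself:
$$\mathrm{unify}(\mathsf{bool}, \mathsf{bool}) = \bullet$$
Finally, two arrow types unify if their domains unify and their codomains unify:
$$\mathrm{unify}(\tau_{11} \to \tau_{12},\ \tau_{21} \to \tau_{22}) = S_2 \circ S_1 \qquad \text{where } S_1 = \mathrm{unify}(\tau_{11}, \tau_{21}),\ S_2 = \mathrm{unify}(S_1\tau_{12}, S_1\tau_{22})$$
Note that after unifying the domains produces a substitution $S_1$,
we apply that substitution to the codomains $\tau_{12}$ and $\tau_{22}$ before
unifying them, in order to propagate the information that we’ve collected.
Further, the result of unifying the arrow types is the composition of the
substitutions $S_1$ and $S_2$. In general, when we work with substitutions, we
will see that we accumulate and compose them.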
Unification has an interesting property: It finds the most general
unifier for any pair of unifiable types. A substitution $S$ is more
general than a substitution $S'$ if there exists a substitution $S''$
such that $S' = S'' \circ S$. That is, if
$S'$ does more substitution than $S$. So suppose that
$\tau_1$ and $\tau_2$ are two types, and suppose that
$S'\tau_1 = S'\tau_2$. Then the $S$
given by unifying $\tau_1$ and $\tau_2$ will be more general than (or
equal to) $S'$.
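The unification cases described in this section can be sketched in Python as follows. This is an illustration under stated assumptions, not the text’s own definition: substitutions are plain dicts from variable names to types, failure is signalled with `TypeError`, and the names `TVar`, `TBool`, `TArrow`, `ftv`, `apply`, `compose`, and `unify` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TVar:
    name: str


@dataclass(frozen=True)
class TBool:
    pass


@dataclass(frozen=True)
class TArrow:
    dom: object
    cod: object


def ftv(t):
    """Free type variables of a monotype."""
    if isinstance(t, TVar):
        return {t.name}
    if isinstance(t, TArrow):
        return ftv(t.dom) | ftv(t.cod)
    return set()  # TBool


def apply(s, t):
    """Apply substitution s (a dict from names to types) to monotype t."""
    if isinstance(t, TVar):
        return s.get(t.name, t)
    if isinstance(t, TArrow):
        return TArrow(apply(s, t.dom), apply(s, t.cod))
    return t  # TBool


def compose(s2, s1):
    """The substitution that behaves like s1 followed by s2."""
    return {**s2, **{k: apply(s2, v) for k, v in s1.items()}}


def unify(t1, t2):
    if isinstance(t1, TVar) and t1 == t2:   # a variable unifies with itself
        return {}
    if isinstance(t1, TVar):                # bind the variable after the occurs check
        if t1.name in ftv(t2):
            raise TypeError(f"{t1.name} occurs in {t2}")
        return {t1.name: t2}
    if isinstance(t2, TVar):                # variable on the right: swap
        return unify(t2, t1)
    if isinstance(t1, TBool) and isinstance(t2, TBool):
        return {}
    if isinstance(t1, TArrow) and isinstance(t2, TArrow):
        s1 = unify(t1.dom, t2.dom)
        # Propagate what the domains taught us before unifying the codomains.
        s2 = unify(apply(s1, t1.cod), apply(s1, t2.cod))
        return compose(s2, s1)
    raise TypeError(f"cannot unify {t1} with {t2}")
```

For instance, unifying `a -> bool` with `bool -> b` in this sketch yields a substitution sending both `a` and `b` to `bool`, while unifying `a` with `a -> b` fails the occurs check.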
6.4.2 Algorithm W
Now we are prepared to give the actual inference algorithm. It uses one
metafunction, inst, which takes a type scheme and instantiates its
bound variables with fresh type variables:
The inst metafunction is given a list of type variables to avoid.
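A possible reading of inst, sketched in Python. The encoding is an assumption made for illustration: `TVar`, `TArrow`, `Scheme`, `apply`, and `fresh_names` are hypothetical names, and fresh variables are drawn from the sequence `t0`, `t1`, ... while skipping anything in the avoid list.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class TVar:
    name: str


@dataclass(frozen=True)
class TArrow:
    dom: object
    cod: object


@dataclass(frozen=True)
class Scheme:
    bound: tuple  # quantified variable names
    body: object  # a monotype


def apply(s, t):
    """Apply substitution s (a dict from names to types) to monotype t."""
    if isinstance(t, TVar):
        return s.get(t.name, t)
    if isinstance(t, TArrow):
        return TArrow(apply(s, t.dom), apply(s, t.cod))
    return t


def fresh_names(avoid):
    """Yield the names t0, t1, ... that do not occur in `avoid`."""
    for i in itertools.count():
        name = f"t{i}"
        if name not in avoid:
            yield name


def inst(avoid, scheme):
    """Replace each bound variable of `scheme` with a fresh type variable."""
    names = fresh_names(set(avoid))
    renaming = {a: TVar(next(names)) for a in scheme.bound}
    return apply(renaming, scheme.body)


identity_scheme = Scheme(("a",), TArrow(TVar("a"), TVar("a")))
instance = inst(["t0"], identity_scheme)  # t0 is taken, so t1 is used
```

The caller is responsible for passing every variable already in use (for example, those free in the environment) in `avoid`; the sketch does not check this.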
Then we have the inference algorithm itself. The algorithm takes as parameters
a type environment and a term to type; if it succeeds, it returns both a type
for the term and a substitution making it so. Let’s start with the simplest
rules.
To type check a Boolean, we return bool with the empty substitution:
To type check a variable, we look up the variable in the environment and
instantiate the resulting type scheme with fresh type variables:
To type check a λ abstraction, we create a fresh type variable $\alpha$ to
use as its domain type, and we type check the body assuming that the formal
parameter has that type $\alpha$:
Note that while we “guess” a type variable $\alpha$ for the domain, it will
be refined (via unification) based on how it’s used in the body.
Type checking an application is more involved than the other rules we have seen,
but the key operation is unifying the domain type of the operator with the type
of the operand. First, we infer types $\tau_1$ for $e_1$ and $\tau_2$ for $e_2$,
applying substitution $S_1$ (the result of typing $e_1$) to the environment when
typing $e_2$, which in turn yields $S_2$.
Then, we get a fresh type variable $\beta$ to stand for the result type of the
application. We unify the type of the operator, $S_2\tau_1$,
with the type we need it to have, $\tau_2 \to \beta$, yielding substitution
$S_3$. Then the composition of the three substitutions,
$S_3 \circ S_2 \circ S_1$, along with result type $S_3\beta$, is our result:
Note again how the substitutions are threaded through: Substitutions must be
applied to any types or environments that existed before that substitution
was created.
For the let rule, we first infer a type $\tau_1$ for $e_1$, and we
generalize that type with respect to the (updated-by-substitution) type
environment $S_1\Gamma$. Then we bind the resulting type
scheme in the environment for type checking $e_2$:
Finally, the rule for if works by first type checking its three
subterms, threading the substitutions through. Then it needs to unify the type
of $e_1$ with bool, and it needs to unify the types of $e_2$ and $e_3$
with each other. Either of those is then the type of the result.
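The rules for Booleans, variables, λ, application, and let can be collected into one function. The following Python sketch is one possible rendering, not a definitive one: it assumes substitutions are dicts, terms are tagged tuples such as `("app", e1, e2)`, fresh variables come from a global counter instead of an explicit avoid list, and all names (`infer`, `unify`, `apply`, `compose`, ...) are hypothetical. The if rule is left out to keep the sketch short.

```python
import itertools
from dataclasses import dataclass

@dataclass(frozen=True)
class TVar:
    name: str

@dataclass(frozen=True)
class TBool:
    pass

@dataclass(frozen=True)
class TArrow:
    dom: object
    cod: object

@dataclass(frozen=True)
class Scheme:
    bound: tuple  # quantified variable names
    body: object  # a monotype

_ids = itertools.count()

def fresh():
    return TVar(f"t{next(_ids)}")

def ftv(t):
    if isinstance(t, TVar):
        return {t.name}
    if isinstance(t, TArrow):
        return ftv(t.dom) | ftv(t.cod)
    if isinstance(t, Scheme):
        return ftv(t.body) - set(t.bound)
    return set()  # TBool

def apply(s, t):
    if isinstance(t, TVar):
        return s.get(t.name, t)
    if isinstance(t, TArrow):
        return TArrow(apply(s, t.dom), apply(s, t.cod))
    if isinstance(t, Scheme):  # never substitute for bound variables
        free = {k: v for k, v in s.items() if k not in t.bound}
        return Scheme(t.bound, apply(free, t.body))
    return t  # TBool

def apply_env(s, env):
    return {x: apply(s, sc) for x, sc in env.items()}

def compose(s2, s1):  # behaves like s1 followed by s2
    return {**s2, **{k: apply(s2, v) for k, v in s1.items()}}

def unify(t1, t2):
    if t1 == t2:
        return {}
    if isinstance(t1, TVar):
        if t1.name in ftv(t2):
            raise TypeError("occurs check failed")
        return {t1.name: t2}
    if isinstance(t2, TVar):
        return unify(t2, t1)
    if isinstance(t1, TArrow) and isinstance(t2, TArrow):
        s1 = unify(t1.dom, t2.dom)
        s2 = unify(apply(s1, t1.cod), apply(s1, t2.cod))
        return compose(s2, s1)
    raise TypeError(f"cannot unify {t1} with {t2}")

def infer(env, e):
    """Return (substitution, type) for term e under env, or raise TypeError."""
    tag = e[0]
    if tag == "bool":
        return {}, TBool()
    if tag == "var":  # instantiate the scheme with fresh variables
        sc = env[e[1]]
        return {}, apply({a: fresh() for a in sc.bound}, sc.body)
    if tag == "lam":  # guess a fresh variable for the domain
        _, x, body = e
        a = fresh()
        s, t = infer({**env, x: Scheme((), a)}, body)
        return s, TArrow(apply(s, a), t)
    if tag == "app":
        _, e1, e2 = e
        s1, t1 = infer(env, e1)
        s2, t2 = infer(apply_env(s1, env), e2)
        b = fresh()
        s3 = unify(apply(s2, t1), TArrow(t2, b))
        return compose(s3, compose(s2, s1)), apply(s3, b)
    if tag == "let":  # generalize over variables not free in the environment
        _, x, e1, e2 = e
        s1, t1 = infer(env, e1)
        env1 = apply_env(s1, env)
        env_ftv = set().union(*map(ftv, env1.values()))
        sc = Scheme(tuple(sorted(ftv(t1) - env_ftv)), t1)
        s2, t2 = infer({**env1, x: sc}, e2)
        return compose(s2, s1), t2
    raise ValueError(f"unknown term {e!r}")
```

In this sketch, `let f = λx. x in (f f) true` is assigned type bool because each occurrence of `f` is instantiated separately, whereas the same body with `f` bound by a λ is rejected by the occurs check.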
Theorem (Soundness and completeness of W).
Exercise 51. Extend unification and Algorithm W for your extended language.
6.5 Constraint-based type inference
Algorithm W interleaves walking the term and unification. There’s another
approach based on constraints, where we generate a constraint that
tells us what has to be true for a term to type, and then we solve the
constraint. This technique is important mostly because it allows us to
extend our type system in particular ways by adding new kinds of constraints.
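Our language of constraints $C$ has the trivial true constraint $\top$, the
conjunction of two constraints $C_1 \wedge C_2$, a constraint that two types be
equal $\tau_1 = \tau_2$, and a constraint that introduces a fresh type variable
for the subconstraint, $\exists \alpha.\, C$. Here is the syntax of constraints:
$$C ::= \top \mid C \wedge C \mid \tau = \tau \mid \exists \alpha.\, C$$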
Then we can write a judgment that takes a constraint and, if possible, solves
it, producing a substitution:
Now that we know how to solve constraints, it remains to generate them for a
given term. We do that with a metafunction which, given an environment
$\Gamma$, a term $e$, and a type $\tau$, generates the
constraints required for the typing judgment $\Gamma \vdash e : \tau$ to hold:
How can we use this to type a term if we don’t know its type to begin with?
Suppose we want to type a term $e$ in the empty environment. Then we choose
a fresh type variable $\alpha$, generate the constraint that $e$ has
type $\alpha$, and solve the constraint, yielding a substitution $S$.
Then we look up the type of $\alpha$ in the resulting substitution: $S\alpha$.
Note that for let-free programs, constraint generation is completely
separated from solving. However, when we encounter a let, we still
interleave solving to get the generalized type of the let-bound variable.
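For the let-free fragment, the two phases can be sketched separately in Python. This is an illustration under assumptions of its own: constraints are tagged tuples (`("true",)`, `("and", c1, c2)`, `("eq", t1, t2)`, `("exists", a, c)`), environments map variables to monotypes, existentially bound variables are made globally fresh at generation time so the solver can simply descend under them, and every name (`generate`, `solve`, `infer`, ...) is hypothetical.

```python
import itertools
from dataclasses import dataclass

@dataclass(frozen=True)
class TVar:
    name: str

@dataclass(frozen=True)
class TBool:
    pass

@dataclass(frozen=True)
class TArrow:
    dom: object
    cod: object

_ids = itertools.count()

def fresh():
    return TVar(f"t{next(_ids)}")

def ftv(t):
    if isinstance(t, TVar):
        return {t.name}
    if isinstance(t, TArrow):
        return ftv(t.dom) | ftv(t.cod)
    return set()  # TBool

def apply(s, t):
    if isinstance(t, TVar):
        return s.get(t.name, t)
    if isinstance(t, TArrow):
        return TArrow(apply(s, t.dom), apply(s, t.cod))
    return t  # TBool

def compose(s2, s1):  # behaves like s1 followed by s2
    return {**s2, **{k: apply(s2, v) for k, v in s1.items()}}

def unify(t1, t2):
    if t1 == t2:
        return {}
    if isinstance(t1, TVar):
        if t1.name in ftv(t2):
            raise TypeError("occurs check failed")
        return {t1.name: t2}
    if isinstance(t2, TVar):
        return unify(t2, t1)
    if isinstance(t1, TArrow) and isinstance(t2, TArrow):
        s1 = unify(t1.dom, t2.dom)
        s2 = unify(apply(s1, t1.cod), apply(s1, t2.cod))
        return compose(s2, s1)
    raise TypeError(f"cannot unify {t1} with {t2}")

def generate(env, e, t):
    """Constraint that must hold for term e to have type t under env."""
    tag = e[0]
    if tag == "bool":
        return ("eq", t, TBool())
    if tag == "var":
        return ("eq", t, env[e[1]])
    if tag == "lam":
        _, x, body = e
        a, b = fresh(), fresh()
        inner = ("and", ("eq", t, TArrow(a, b)), generate({**env, x: a}, body, b))
        return ("exists", a, ("exists", b, inner))
    if tag == "app":
        _, e1, e2 = e
        a = fresh()
        both = ("and", generate(env, e1, TArrow(a, t)), generate(env, e2, a))
        return ("exists", a, both)
    raise ValueError(f"unknown term {e!r}")

def solve(c, s):
    """Extend substitution s so that constraint c holds, or raise TypeError."""
    tag = c[0]
    if tag == "true":
        return s
    if tag == "and":
        return solve(c[2], solve(c[1], s))
    if tag == "eq":
        return compose(unify(apply(s, c[1]), apply(s, c[2])), s)
    if tag == "exists":  # the bound variable is already globally fresh
        return solve(c[2], s)
    raise ValueError(f"unknown constraint {c!r}")

def infer(e):
    a = fresh()
    return apply(solve(generate({}, e, a), {}), a)
```

Here `generate` never calls `unify`, and `solve` never looks at terms, which is the separation the let-free case allows.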
Exercise 52. Extend constraint generation for your extended language.