7 ML type inference
7.1 STLC revisited
We remove base types (for now) in favor of uninterpreted type variables.
We add let expressions.
Most importantly, we leave types implicit.
7.1.1 Dynamic semantics
7.1.2 Static semantics
Exercise 51. Find a closed term that has no type. What is the only cause of type errors in this system?
7.1.3 Adding a base type
Exercise 52. Extend λ-ml with one of these features: products, sums, numbers, records, recursion, references.
Exercise 53. Show that the term
has no type.
7.1.4 Introducing let polymorphism
But why not? The term reduces to a Boolean, so shouldn't it have a Boolean
type? It doesn't, because the same variable is used in two different ways.
Each of its two applications needs it to have a different type (the second
because the result of that application is itself applied to a further
argument), and without polymorphism a variable can have only one type.
However, if we were to reduce the let, we would get a term that types fine.
Now we are typechecking the let-bound term multiple times, once for each
occurrence of the bound variable in the body. We can actually construct a
family of terms that grow exponentially as a result of this copying.
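The blow-up from typing a let by copying can be made concrete with a small Python sketch. Everything here is an assumption made for illustration: the tuple encoding of terms, the helper names `subst`, `expand_lets`, `size`, and `doubling`, and the particular family of terms, which is one possible example and not necessarily the one the notes have in mind.

```python
# Terms: ("var", x) | ("lam", x, body) | ("app", f, a) | ("let", x, rhs, body)

def subst(term, name, value):
    """Replace free occurrences of `name` in `term` by `value`.
    Naive (not capture-avoiding); safe here because every `value` is closed."""
    tag = term[0]
    if tag == "var":
        return value if term[1] == name else term
    if tag == "lam":
        return term if term[1] == name else ("lam", term[1], subst(term[2], name, value))
    if tag == "app":
        return ("app", subst(term[1], name, value), subst(term[2], name, value))
    _, x, rhs, body = term  # let: the bound name scopes over the body only
    new_body = body if x == name else subst(body, name, value)
    return ("let", x, subst(rhs, name, value), new_body)

def expand_lets(term):
    """Eliminate every let by substituting its right-hand side into its body."""
    tag = term[0]
    if tag == "var":
        return term
    if tag == "lam":
        return ("lam", term[1], expand_lets(term[2]))
    if tag == "app":
        return ("app", expand_lets(term[1]), expand_lets(term[2]))
    _, x, rhs, body = term
    return expand_lets(subst(body, x, expand_lets(rhs)))

def size(term):
    """Number of syntax nodes in a term."""
    tag = term[0]
    if tag == "var":
        return 1
    if tag == "lam":
        return 1 + size(term[2])
    if tag == "app":
        return 1 + size(term[1]) + size(term[2])
    return 1 + size(term[2]) + size(term[3])

def doubling(n):
    """let x0 = \\y.y in let x1 = x0 x0 in ... let xn = x(n-1) x(n-1) in xn"""
    term = ("var", f"x{n}")
    for i in range(n, 0, -1):
        prev = ("var", f"x{i - 1}")
        term = ("let", f"x{i}", ("app", prev, prev), term)
    return ("let", "x0", ("lam", "y", ("var", "y")), term)

for n in (2, 5, 10):
    print(n, size(doubling(n)), size(expand_lets(doubling(n))))
```

In this family the source term grows linearly in n, while its let-free expansion roughly doubles with each extra binding, so a checker that types the expansion does exponentially more work.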
In a real programming system, we want to be able to give a type to the
let-bound variable itself, because such systems often allow bindings with
open scopes, where a definition is made available for the future. This only
makes sense if we can say what type the bound variable has. This is essential
for separate or incremental compilation.
7.2 Type schemes in λ-ml
In the exercise above, the same variable is used at two different types.
In fact, if we consider it carefully, it's safe to use it on an argument of
any type τ, and we get that same τ back. So we could say that it has the type
scheme τ → τ for all types τ.
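As a rough illustration of a type scheme being used at several types, the Python sketch below represents a scheme as a list of quantified variable names plus a type body, and instantiates it by substituting for the quantified variables. The tuple encoding, the names `fresh_var`, `substitute`, `instantiate`, and `ID_SCHEME`, and the base type `bool` are all assumptions made for this example.

```python
import itertools

# Types: ("var", name) | ("arrow", domain, codomain) | ("base", name)
_fresh = itertools.count()

def fresh_var():
    """Return a type variable not used before."""
    return ("var", f"t{next(_fresh)}")

def substitute(subst, ty):
    """Apply a substitution (dict: variable name -> type) to a type."""
    if ty[0] == "var":
        return subst.get(ty[1], ty)
    if ty[0] == "arrow":
        return ("arrow", substitute(subst, ty[1]), substitute(subst, ty[2]))
    return ty  # base types are unchanged

def instantiate(scheme):
    """Replace each quantified variable of a scheme by a fresh type variable."""
    quantified, body = scheme
    return substitute({name: fresh_var() for name in quantified}, body)

# The scheme "a -> a, for all a".
ID_SCHEME = (["a"], ("arrow", ("var", "a"), ("var", "a")))

BOOL = ("base", "bool")
at_bool = substitute({"a": BOOL}, ID_SCHEME[1])                  # bool -> bool
at_fun = substitute({"a": ("arrow", BOOL, BOOL)}, ID_SCHEME[1])  # (bool -> bool) -> (bool -> bool)
```

Each call to `instantiate(ID_SCHEME)` yields a separate copy of the type with its own variable, which is what lets two uses of the same variable be resolved to two different types.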
7.3 Statics
7.3.1 The logical type system
Exercise 54. Derive a type for
.
Exercise 55. What types can you derive for ?
What do they have in common? What type scheme instantiates to all of them?
7.3.2 The syntax-directed type system
The syntax-directed type system presented in this section admits exactly the
same programs as the logical type system from the previous section. Unlike the
logical system, it tells us exactly when we need to apply instantiation and
generalization. But it still does not tell us what types to instantiate type
schemes to in rule [var-inst], and it does not tell us what type to use for
the domain in the rule for λ abstractions. To actually type terms, we will
need an algorithm.
Exercise 56. Extend the syntax-directed type system for your extended language.
7.4 Type inference algorithm
The type inference rules presented above yield many possible typings for terms.
For example, the identity function has a type of the form τ → τ for every
type τ. The most general type, however, is α → α for a type variable α, since
all the other types are instances of it. The algorithm presented in this
section always finds the most general type (if a typing exists).
7.4.1 Unification
Exercise 57. Give a type substitution such that
=
Type inference will hinge on the idea of unification: Given two types τ₁ and
τ₂, is there a substitution σ that makes them equal, so that στ₁ = στ₂? We
will use this, for example, when we apply a function to an argument and need
the function's domain type to agree with the argument's type. Type variables
represent unknown parts of the types in question, and unification tells us
whether the types can be made the same by filling in the missing information.
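A minimal Python sketch of first-order unification over arrow types follows. It is one common formulation, not necessarily the definition the notes use. The tuple encoding of types, the dictionary representation of substitutions, the function names `apply`, `occurs`, and `unify`, and the base type `bool` are assumptions made for this example.

```python
# Types: ("var", name) | ("arrow", domain, codomain) | ("base", name)

def apply(subst, ty):
    """Apply a substitution (dict: variable name -> type), following chains."""
    if ty[0] == "var":
        return apply(subst, subst[ty[1]]) if ty[1] in subst else ty
    if ty[0] == "arrow":
        return ("arrow", apply(subst, ty[1]), apply(subst, ty[2]))
    return ty

def occurs(name, ty):
    """Does the type variable `name` occur in `ty`?"""
    if ty[0] == "var":
        return ty[1] == name
    if ty[0] == "arrow":
        return occurs(name, ty[1]) or occurs(name, ty[2])
    return False

def unify(t1, t2, subst=None):
    """Return a substitution making t1 and t2 equal, or raise TypeError."""
    subst = dict(subst or {})
    t1, t2 = apply(subst, t1), apply(subst, t2)
    if t1 == t2:
        return subst
    if t1[0] == "var":
        if occurs(t1[1], t2):
            raise TypeError("occurs check failed")  # e.g. a = a -> a
        subst[t1[1]] = t2
        return subst
    if t2[0] == "var":
        return unify(t2, t1, subst)
    if t1[0] == "arrow" and t2[0] == "arrow":
        subst = unify(t1[1], t2[1], subst)      # unify the domains first,
        return unify(t1[2], t2[2], subst)       # then the codomains
    raise TypeError(f"cannot unify {t1} with {t2}")

A, B, BOOL = ("var", "a"), ("var", "b"), ("base", "bool")
s = unify(("arrow", A, BOOL), ("arrow", BOOL, B))
print(apply(s, ("arrow", A, BOOL)))  # both sides become bool -> bool
```

The occurs check rejects equations such as a = a → a, which no finite type can satisfy.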
Unification has an interesting property: It finds the most general
unifier for any pair of unifiable types. A substitution σ is more
general than a substitution σ′ if there exists a substitution ρ
such that σ′ = ρ ∘ σ. That is, if σ′ does more substitution than σ.
So suppose that τ₁ and τ₂ are two types, and suppose that σ′τ₁ = σ′τ₂.
Then the σ given by unifying τ₁ and τ₂ will be more general than (or
equal to) σ′.
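The "more general than" relation can be checked on a small example. In the Python sketch below, the types a → b and b → a, the three substitutions, and the helper names `apply` and `compose` are all chosen for illustration and are not taken from the notes.

```python
# Types: ("var", name) | ("arrow", domain, codomain) | ("base", name)

def apply(subst, ty):
    """Apply a substitution (dict: variable name -> type) to a type."""
    if ty[0] == "var":
        return subst.get(ty[1], ty)
    if ty[0] == "arrow":
        return ("arrow", apply(subst, ty[1]), apply(subst, ty[2]))
    return ty

def compose(outer, inner):
    """The substitution that applies `inner` first and then `outer`."""
    result = {name: apply(outer, ty) for name, ty in inner.items()}
    for name, ty in outer.items():
        result.setdefault(name, ty)
    return result

A, B, BOOL = ("var", "a"), ("var", "b"), ("base", "bool")
t1, t2 = ("arrow", A, B), ("arrow", B, A)

sigma = {"a": B}                    # a general unifier: both sides become b -> b
sigma_prime = {"a": BOOL, "b": BOOL}  # a more specific unifier: bool -> bool
rho = {"b": BOOL}                   # witnesses that sigma is more general

assert apply(sigma, t1) == apply(sigma, t2)
assert apply(sigma_prime, t1) == apply(sigma_prime, t2)
assert compose(rho, sigma) == sigma_prime
```

Here `sigma_prime` commits both variables to `bool`, while `sigma` only identifies them, and `rho` supplies exactly the extra commitment.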
7.4.2 Algorithm W
Then we have the inference algorithm itself. The algorithm takes as parameters a type environment and a term to type; if it succeeds, it returns both a type for the term and a substitution making it so. Let’s start with the simplest rules.
Soundness: If W(Γ, e) = (σ, τ) then σΓ ⊢ e : τ.
Completeness: If Γ ⊢ e : τ then W(Γ, e) = (σ′, τ′) for some τ′ that τ
is a substitution instance of. (That is, there is some substitution ρ
such that τ = ρτ′.)
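The interface described above (a type environment and a term go in, a type and a substitution come out) can be sketched in Python as follows. This is a variant that threads one accumulated substitution through the traversal instead of composing substitutions, so it is a sketch in the spirit of Algorithm W, not a transcription of the notes' rules. The tuple encodings, all function names, and the `bool` constant are assumptions made for the example.

```python
import itertools

# Terms: ("var", x) | ("lam", x, body) | ("app", f, a) | ("let", x, rhs, body) | ("bool", b)
# Types: ("var", name) | ("arrow", dom, cod) | ("base", name); schemes: (quantified_names, type)
_counter = itertools.count()

def fresh():
    return ("var", f"t{next(_counter)}")

def apply(s, ty):
    """Apply substitution s (dict: name -> type), following chains."""
    if ty[0] == "var":
        return apply(s, s[ty[1]]) if ty[1] in s else ty
    if ty[0] == "arrow":
        return ("arrow", apply(s, ty[1]), apply(s, ty[2]))
    return ty

def ftv(ty):
    """Free type variables of a type."""
    if ty[0] == "var":
        return {ty[1]}
    if ty[0] == "arrow":
        return ftv(ty[1]) | ftv(ty[2])
    return set()

def unify(t1, t2, s):
    t1, t2 = apply(s, t1), apply(s, t2)
    if t1 == t2:
        return s
    if t1[0] == "var":
        if t1[1] in ftv(t2):
            raise TypeError("occurs check failed")
        return {**s, t1[1]: t2}
    if t2[0] == "var":
        return unify(t2, t1, s)
    if t1[0] == "arrow" and t2[0] == "arrow":
        return unify(t1[2], t2[2], unify(t1[1], t2[1], s))
    raise TypeError(f"cannot unify {t1} with {t2}")

def instantiate(scheme):
    quantified, body = scheme
    return apply({q: fresh() for q in quantified}, body)

def generalize(env, ty, s):
    """Quantify the variables of ty that are not free in the environment."""
    ty = apply(s, ty)
    env_vars = set()
    for quantified, body in env.values():
        env_vars |= ftv(apply(s, body)) - set(quantified)
    return (sorted(ftv(ty) - env_vars), ty)

def infer(env, term, s):
    """Return (type, substitution) for term under env, extending s."""
    tag = term[0]
    if tag == "var":
        if term[1] not in env:
            raise TypeError(f"unbound variable {term[1]}")
        return instantiate(env[term[1]]), s
    if tag == "lam":
        _, x, body = term
        param = fresh()  # the guessed domain type, refined by unification
        result, s = infer({**env, x: ([], param)}, body, s)
        return ("arrow", param, result), s
    if tag == "app":
        fn_ty, s = infer(env, term[1], s)
        arg_ty, s = infer(env, term[2], s)
        result = fresh()
        return result, unify(fn_ty, ("arrow", arg_ty, result), s)
    if tag == "let":
        _, x, rhs, body = term
        rhs_ty, s = infer(env, rhs, s)
        return infer({**env, x: generalize(env, rhs_ty, s)}, body, s)
    if tag == "bool":
        return ("base", "bool"), s
    raise ValueError(f"unknown term {term!r}")

def typeof(term):
    ty, s = infer({}, term, {})
    return apply(s, ty)

ID = ("lam", "x", ("var", "x"))
USE = ("app", ("app", ("var", "f"), ("var", "f")), ("bool", True))
print(typeof(("let", "f", ID, USE)))  # f is let-bound, so it may be used at two types
```

With these definitions, binding `f` by let succeeds, while passing the same function as a λ argument, as in `("app", ("lam", "f", USE), ID)`, is rejected by the occurs check, since a λ-bound variable gets a single monomorphic type.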
Exercise 58. Extend unification and Algorithm W for your extended language.
7.5 Constraint-based type inference
Algorithm W interleaves walking the term and unification. There’s another approach based on constraints, where we generate a constraint that tells us what has to be true for a term to type, and then we solve the constraint. This technique is important mostly because it allows us to extend our type system in particular ways by adding new kinds of constraints.
Then we look up the term's type in the resulting substitution.
Note that for let-free programs, constraint generation is completely
separated from solving. However, when we encounter a let, we still
interleave solving to get the generalized type of the let-bound variable.
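For the let-free fragment, the two phases can be sketched separately in Python: one pass that only collects equality constraints, and a second pass that solves them by unification. The encodings, the function names `generate` and `solve`, the use of a plain list of type pairs as the constraint, and the `bool` constant are assumptions made for this example. It deliberately omits let.

```python
import itertools

# Terms: ("var", x) | ("lam", x, body) | ("app", f, a) | ("bool", b)
# Types: ("var", name) | ("arrow", dom, cod) | ("base", name)
_counter = itertools.count()
BOOL = ("base", "bool")

def fresh():
    return ("var", f"t{next(_counter)}")

def apply(s, ty):
    """Apply substitution s (dict: name -> type), following chains."""
    if ty[0] == "var":
        return apply(s, s[ty[1]]) if ty[1] in s else ty
    if ty[0] == "arrow":
        return ("arrow", apply(s, ty[1]), apply(s, ty[2]))
    return ty

def occurs(name, ty):
    if ty[0] == "var":
        return ty[1] == name
    if ty[0] == "arrow":
        return occurs(name, ty[1]) or occurs(name, ty[2])
    return False

def unify(t1, t2, s):
    t1, t2 = apply(s, t1), apply(s, t2)
    if t1 == t2:
        return s
    if t1[0] == "var":
        if occurs(t1[1], t2):
            raise TypeError("occurs check failed")
        return {**s, t1[1]: t2}
    if t2[0] == "var":
        return unify(t2, t1, s)
    if t1[0] == "arrow" and t2[0] == "arrow":
        return unify(t1[2], t2[2], unify(t1[1], t2[1], s))
    raise TypeError(f"cannot unify {t1} with {t2}")

def generate(env, term, constraints):
    """Phase 1: return a type for term, appending equations to constraints."""
    tag = term[0]
    if tag == "var":
        return env[term[1]]
    if tag == "lam":
        _, x, body = term
        param = fresh()
        return ("arrow", param, generate({**env, x: param}, body, constraints))
    if tag == "app":
        fn_ty = generate(env, term[1], constraints)
        arg_ty = generate(env, term[2], constraints)
        result = fresh()
        constraints.append((fn_ty, ("arrow", arg_ty, result)))
        return result
    if tag == "bool":
        return BOOL
    raise ValueError(f"unknown term {term!r}")

def solve(constraints):
    """Phase 2: unify every equation, accumulating one substitution."""
    s = {}
    for left, right in constraints:
        s = unify(left, right, s)
    return s

constraints = []
ty = generate({}, ("app", ("lam", "x", ("var", "x")), ("bool", True)), constraints)
print(apply(solve(constraints), ty))  # the term's type, read off the solution
```

In this sketch an ill-typed term such as λx. x x still gets through `generate`; the error only surfaces in `solve`, which is the separation of phases the text describes.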
Exercise 59. Extend constraint generation for your extended language.