3 λ-sub: subtyping with records
3.1 Syntax
A record type lists field names with their types; assume the field names are not repeated within a record. A record expression lists field names with expressions whose values will fill the fields. A projection expression projects the value of the named field from a record.
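The three record forms can be sketched as plain data. This is a minimal sketch, not the notes' own definition: the class names (`NatType`, `RecordType`, `Zero`, `Record`, `Project`) and the helper `project` are hypothetical, and a base type of naturals is assumed only so that fields have something to hold.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class NatType:
    """Assumed base type, used here only as a field type."""


@dataclass
class RecordType:
    # A dict cannot repeat a key, matching "field names are not repeated".
    fields: Dict[str, "Type"]


Type = Union[NatType, RecordType]


@dataclass
class Zero:
    """Assumed base value, used here only as a field value."""


@dataclass
class Record:
    # Field names mapped to the expressions that will fill the fields.
    fields: Dict[str, "Expr"]


@dataclass
class Project:
    # Projects the field called `label` out of `record`.
    record: "Expr"
    label: str


Expr = Union[Zero, Record, Project]


def project(e: Project) -> Expr:
    """Return the named field of a record expression (hypothetical helper)."""
    if not isinstance(e.record, Record):
        raise TypeError("can only project from a record")
    if e.label not in e.record.fields:
        raise KeyError(e.label)
    return e.record.fields[e.label]


point_type = RecordType({"x": NatType(), "y": NatType()})
point = Record({"x": Zero(), "y": Zero()})
x_of_point = project(Project(point, "x"))
```

Using a dictionary for the fields is a design choice of this sketch; an ordered list of pairs would serve equally well if field order matters.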
3.2 Dynamic semantics
3.3 Static semantics
This works, but it’s not as expressive as we might like. Consider a function $\lambda x{:}\{l : \tau\}.\ x.l$. It takes a record of one field $l$ and projects out that field. But is there any reason we shouldn’t be able to use this function on a record with more fields than $l$? Subtyping captures that intuition, allowing us to formalize it and prove it sound.
3.3.1 Subtyping
To do this, we define the subtype relation $<:$, which relates pairs of types. Intuitively, $\tau_1 <: \tau_2$ means that a $\tau_1$ may be used wherever a $\tau_2$ is required.
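A subtype check of this kind can be sketched as a recursive function. The rules below are assumptions rather than the notes' own definition: a base type related only to itself, records related by width and depth (the subtype may have extra fields, and shared fields are compared recursively), and arrows contravariant in the argument and covariant in the result. All names (`Nat`, `Arrow`, `Rec`, `is_subtype`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Nat:
    """Assumed base type."""


@dataclass
class Arrow:
    param: object
    result: object


@dataclass
class Rec:
    fields: Dict[str, object]


def is_subtype(s, t) -> bool:
    """Return True if a value of type s may be used where a t is required."""
    if isinstance(s, Nat) and isinstance(t, Nat):
        return True
    if isinstance(s, Arrow) and isinstance(t, Arrow):
        # Contravariant in the parameter, covariant in the result.
        return is_subtype(t.param, s.param) and is_subtype(s.result, t.result)
    if isinstance(s, Rec) and isinstance(t, Rec):
        # Every field the supertype demands must be present in the subtype,
        # at a subtype of the demanded field type.
        return all(
            label in s.fields and is_subtype(s.fields[label], field_type)
            for label, field_type in t.fields.items()
        )
    # Types of different shapes are unrelated.
    return False


wide = Rec({"a": Nat(), "b": Nat()})
narrow = Rec({"a": Nat()})
wide_below_narrow = is_subtype(wide, narrow)
narrow_below_wide = is_subtype(narrow, wide)
```

Because the function recurs only on strictly smaller types, it terminates on any finite type.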
Exercise 23. Suppose that $\tau_1 <: \tau_2$. Consider the types $\tau_1 \to \tau_1$, $\tau_1 \to \tau_2$, $\tau_2 \to \tau_1$, and $\tau_2 \to \tau_2$. Which of these are subtypes of which others? Does this make sense?
Exercise 24. Prove that $<:$ is a preorder, that is, reflexive and transitive.
The idea of subtyping is that we can apply it everywhere. If we can conclude that $\Gamma \vdash e : \tau_1$ and $\tau_1 <: \tau_2$ then we should be able to conclude that $\Gamma \vdash e : \tau_2$. It’s possible to add such a rule, and it works fine theoretically, but because the rule is not syntax directed, it can be difficult to implement. In fact, the only place in our current language that we need subtyping is in the application rule, so we replace the STLC application rule with this:
$$\frac{\Gamma \vdash e_1 : \tau_1 \to \tau \qquad \Gamma \vdash e_2 : \tau_2 \qquad \tau_2 <: \tau_1}{\Gamma \vdash e_1\ e_2 : \tau}$$
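A syntax-directed checker that consults subtyping only at applications might look like the following sketch. The premises of the application case follow how the proofs in this section invert application typing: the operator has an arrow type, the operand has some type, and the operand's type is a subtype of the arrow's parameter type. Everything else is an assumption: the class names, the helper `type_of`, the subtype rules in `is_subtype`, and the base type `Nat`.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Nat: pass

@dataclass
class Arrow:
    param: object
    result: object

@dataclass
class Rec:
    fields: Dict[str, object]


def is_subtype(s, t) -> bool:
    """Assumed rules: width/depth records, contravariant/covariant arrows."""
    if isinstance(s, Nat) and isinstance(t, Nat):
        return True
    if isinstance(s, Arrow) and isinstance(t, Arrow):
        return is_subtype(t.param, s.param) and is_subtype(s.result, t.result)
    if isinstance(s, Rec) and isinstance(t, Rec):
        return all(l in s.fields and is_subtype(s.fields[l], ft)
                   for l, ft in t.fields.items())
    return False


@dataclass
class Zero: pass

@dataclass
class Var:
    name: str

@dataclass
class Lam:
    var: str
    var_type: object
    body: object

@dataclass
class App:
    fn: object
    arg: object

@dataclass
class Record:
    fields: Dict[str, object]

@dataclass
class Project:
    record: object
    label: str


def type_of(env: Dict[str, object], e):
    """Compute the type of e in env, or raise TypeError."""
    if isinstance(e, Zero):
        return Nat()
    if isinstance(e, Var):
        if e.name not in env:
            raise TypeError(f"unbound variable {e.name}")
        return env[e.name]
    if isinstance(e, Lam):
        return Arrow(e.var_type, type_of({**env, e.var: e.var_type}, e.body))
    if isinstance(e, Record):
        return Rec({l: type_of(env, f) for l, f in e.fields.items()})
    if isinstance(e, Project):
        t = type_of(env, e.record)
        if not isinstance(t, Rec) or e.label not in t.fields:
            raise TypeError(f"no field {e.label}")
        return t.fields[e.label]
    if isinstance(e, App):
        fn_t = type_of(env, e.fn)
        arg_t = type_of(env, e.arg)
        if not isinstance(fn_t, Arrow):
            raise TypeError("operator is not a function")
        # The only use of subtyping: the argument may be a subtype
        # of the parameter type.
        if not is_subtype(arg_t, fn_t.param):
            raise TypeError("argument type is not a subtype of parameter type")
        return fn_t.result
    raise TypeError("unknown expression")


# A one-field projection function applied to a two-field record.
get_a = Lam("r", Rec({"a": Nat()}), Project(Var("r"), "a"))
two_fields = Record({"a": Zero(), "b": Zero()})
result_type = type_of({}, App(get_a, two_fields))
```

Because each expression form has exactly one case, the checker never has to guess where to apply subtyping, which is the practical benefit of avoiding a free-standing subsumption rule.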
3.3.2 Type safety
Subtyping changes our preservation theorem somewhat, because reduction can cause type refinement. (That is, we learn more type information.) Here is the updated preservation theorem:
Theorem (Preservation). If $\vdash e_1 : \tau_1$ and $e_1 \longrightarrow e_2$ then there exists some $\tau_2$ such that $\vdash e_2 : \tau_2$ and $\tau_2 <: \tau_1$.
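A small worked instance of type refinement may help. It is a sketch under assumptions not stated in this text: that the empty record type $\{\}$ is a supertype of every record type, that $\mathsf{z}$ has type $\mathsf{nat}$, and that application is typed by checking the argument's type against the parameter type up to subtyping.

```latex
\begin{align*}
&\vdash (\lambda x{:}\{\}.\ x)\ \{l = \mathsf{z}\} : \{\}
  && \text{since } \{l : \mathsf{nat}\} <: \{\} \\
&(\lambda x{:}\{\}.\ x)\ \{l = \mathsf{z}\} \longrightarrow \{l = \mathsf{z}\} \\
&\vdash \{l = \mathsf{z}\} : \{l : \mathsf{nat}\}
  && \text{and } \{l : \mathsf{nat}\} <: \{\}
\end{align*}
```

Before the step the term has type $\{\}$; after the step it has the more informative type $\{l : \mathsf{nat}\}$, which is a subtype of the original.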
Before we can prove it, we update the replacement and substitution lemmas as follows:
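Lemma (Replacement). If $\vdash E[e] : \tau$, then $\vdash e : \tau_e$ for some type $\tau_e$. Furthermore, for any $e'$ such that $\vdash e' : \tau_e'$ for $\tau_e' <: \tau_e$, $\vdash E[e'] : \tau'$ for some $\tau'$ such that $\tau' <: \tau$.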
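Proof. By induction on $E$. The interesting cases are for application:
If $E$ is $E_1\ e_2$ then the whole term has a type $\tau$ only if there are some types $\tau_1$ and $\tau_2$ such that $\vdash E_1[e] : \tau_1 \to \tau$ and $\vdash e_2 : \tau_2$ where $\tau_2 <: \tau_1$. Then by induction, $e$ has a type, and if we replace $e$ with $e'$ having a subtype of that, then $\vdash E_1[e'] : \tau'$ for $\tau' <: \tau_1 \to \tau$. The subtyping relation relates arrows only to other arrows, so $\tau' = \tau_1' \to \tau''$ with $\tau_1 <: \tau_1'$ and $\tau'' <: \tau$. Then by transitivity, $\tau_2 <: \tau_1'$. This means that we can reform the application $E_1[e']\ e_2$, which has type $\tau''$, a subtype of $\tau$.
If $E$ is $v_1\ E_2$, then the whole term has a type $\tau$ only if there are some types $\tau_1$ and $\tau_2$ such that $\vdash v_1 : \tau_1 \to \tau$ and $\vdash E_2[e] : \tau_2$ where $\tau_2 <: \tau_1$. Then by induction, $e$ has a type, and if we replace $e$ with $e'$ having a subtype of that, then $\vdash E_2[e'] : \tau_2'$ where $\tau_2' <: \tau_2$. Then by transitivity, $\tau_2' <: \tau_1$, so we can reform the application having the same type $\tau$.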
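Lemma (Substitution). If $\Gamma, x{:}\tau_x \vdash e : \tau$ and $\Gamma \vdash v : \tau_v$ where $\tau_v <: \tau_x$ then $\Gamma \vdash e[x := v] : \tau'$ for $\tau' <: \tau$.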
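Proof. By induction on the derivation of the typing of $e$:
If $e$ is a variable $y$, there are two subcases. If $x = y$, then $\tau = \tau_x$. Then $y[x := v] = v$, which has type $\tau_v$. Let $\tau'$ be $\tau_v$. Then the subtyping holds. If $x \neq y$, then $\Gamma \vdash y : \tau$, as before the substitution.
If $e$ is $\mathsf{z}$, then substitution has no effect and it types in any environment.
If $e$ is $\mathsf{s}\ e_1$, then by induction $\Gamma \vdash e_1[x := v] : \tau_1'$ for some $\tau_1' <: \mathsf{nat}$, which relates only to $\mathsf{nat}$. Then reapply the typing rule for $\mathsf{s}$.
If $e$ is $\lambda y{:}\tau_1.\ e_1$ (with $y$ distinct from $x$), then by inversion we know that $\Gamma, x{:}\tau_x, y{:}\tau_1 \vdash e_1 : \tau_2$. Then by the induction hypothesis, $\Gamma, y{:}\tau_1 \vdash e_1[x := v] : \tau_2'$ for some $\tau_2' <: \tau_2$. Then by [abs], $\Gamma \vdash \lambda y{:}\tau_1.\ e_1[x := v] : \tau_1 \to \tau_2'$, which is a subtype of $\tau_1 \to \tau_2$.
If $e$ is $e_1\ e_2$, then by inversion we know that $\Gamma, x{:}\tau_x \vdash e_1 : \tau_1 \to \tau$ and $\Gamma, x{:}\tau_x \vdash e_2 : \tau_2$ where $\tau_2 <: \tau_1$. Then by induction (twice), we have that $\Gamma \vdash e_1[x := v] : \tau_1'$ where $\tau_1' <: \tau_1 \to \tau$ and that $\Gamma \vdash e_2[x := v] : \tau_2'$ where $\tau_2' <: \tau_2$. By inspection of the subtype relation, the only types related to arrow types are arrow types, so $\tau_1'$ must be an arrow type $\tau_a \to \tau_r$ where $\tau_1 <: \tau_a$ and $\tau_r <: \tau$. Then by transitivity (twice), $\tau_2' <: \tau_a$. This means we can apply $e_1[x := v]$ to $e_2[x := v]$, yielding type $\tau_r$, which is a subtype of $\tau$.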
The record construction and projection cases are straightforward.
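Proof (of preservation). By cases on the reduction relation. There are two cases:
If $E[(\lambda x{:}\tau_x.\ e_1)\ v] \longrightarrow E[e_1[x := v]]$, then by replacement, $(\lambda x{:}\tau_x.\ e_1)\ v$ has a type $\tau$, and it suffices to show that this type is preserved up to subtyping. Then by inversion (twice), we know that $x{:}\tau_x \vdash e_1 : \tau$ and $\vdash v : \tau_v$ where $\tau_v <: \tau_x$. Then by the substitution lemma, $\vdash e_1[x := v] : \tau'$ where $\tau' <: \tau$.
If $E[\{l_i = v_i\}.l_j] \longrightarrow E[v_j]$, this case is straightforward.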
QED.
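Lemma (Canonical forms). If $\vdash v : \tau$, then:
If $\tau$ is $\mathsf{nat}$, then $v$ is either $\mathsf{z}$ or $\mathsf{s}\ v'$.
If $\tau$ is $\tau_1 \to \tau_2$, then $v$ has the form $\lambda x{:}\tau_1.\ e$.
If $\tau$ is $\{l_i : \tau_i\}$, then $v$ is a record with at least the fields $l_i$.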
Proof. By induction on the structure of the typing derivation. Only four rules form values, and those rules correspond to the conditions of the lemma.
Lemma (Progress). If $\vdash e : \tau$ then either $e$ is a value or $e \longrightarrow e'$ for some term $e'$.
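Proof. By induction on the typing derivation:
The case for a variable $x$ is vacuous, since no variable has a type in the empty environment.
$\mathsf{z}$ is a value.
If $\vdash \mathsf{s}\ e_1 : \mathsf{nat}$ then by inversion, $\vdash e_1 : \mathsf{nat}$. Then by induction, $e_1$ either takes a step or is a value. If it’s a value, then $\mathsf{s}\ e_1$ is a value; if it takes a step to $e_1'$ then $\mathsf{s}\ e_1$ takes a step to $\mathsf{s}\ e_1'$.
If $\vdash e_1\ e_2 : \tau$ then by inversion, $\vdash e_1 : \tau_1 \to \tau$ and $\vdash e_2 : \tau_2$ for some types $\tau_1$ and $\tau_2$ such that $\tau_2 <: \tau_1$. Then by induction, each of $e_1$ and $e_2$ either is a value or takes a step. If $e_1$ takes a step to $e_1'$, then the whole term takes a step to $e_1'\ e_2$. If $e_1$ is a value $v_1$ and $e_2$ takes a step to $e_2'$, then the whole term takes a step to $v_1\ e_2'$. Otherwise, $e_2$ is a value $v_2$. By the canonical forms lemma, $v_1$ has the form $\lambda x{:}\tau_1.\ e_{11}$. Then the whole term takes a step to $e_{11}[x := v_2]$.
$\lambda x{:}\tau_1.\ e_1$ is a value.
If $\vdash \{l_i = e_i\} : \{l_i : \tau_i\}$ then by inversion, $\vdash e_i : \tau_i$ for all $i$. Then by induction, each of those takes a step or is a value. If any takes a step, then the whole term steps by the leftmost $e_i$ to take a step. Otherwise, they are all values, and the whole term is a value.
If $\vdash e_1.l : \tau$ then by inversion, $e_1$ has a record type with a field $l$ having type $\tau$. By induction, $e_1$ either takes a step or is a value $v_1$. If it takes a step then the whole term takes a step. If it’s a value, then by the canonical forms lemma, it’s a value $\{\ldots, l = v, \ldots\}$. Then the whole term takes a step to $v$.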
Theorem (Type safety). λ-sub is type safe.
Proof. By progress and preservation.
3.4 Compiling with coercions
To say that $\tau_1 <: \tau_2$ is to say that a $\tau_1$ can be used wherever a $\tau_2$ is expected, but do our run-time representations actually make that true? In some languages yes, but in many languages no. We might not want, for example, record operations to have to do a (linear) search of field names at run time, but instead to fix the offset at compile time. Such a representation choice is not incompatible with subtyping, if we are willing to interpret subtyping as a coercion between potentially different underlying representation types. For example, record type $\{a : \tau_a, b : \tau_b, c : \tau_c\}$ is a subtype of record type $\{a : \tau_a, c : \tau_c\}$. The former is represented by a 3-element vector containing the values of fields a, b, and c, whereas the latter is represented as a 2-element vector containing the values of fields a and c. We cannot use an instance of the former as the latter directly, but we can coerce it. The coercion between two types in the subtype relationship is witnessed by the function converting the subtype to the supertype.
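The coercion in the example above can be sketched as follows, assuming records are represented as Python lists laid out in the order of their declared field names. The function name `record_coercion` is hypothetical, and the sketch handles only dropped fields; if field types also differed by subtyping, each kept value would need its own coercion applied as well.

```python
from typing import Callable, List, Sequence


def record_coercion(
    sub_fields: Sequence[str], super_fields: Sequence[str]
) -> Callable[[Sequence[object]], List[object]]:
    """Build the function converting a subtype's vector to a supertype's.

    Offsets are computed once, when the coercion is built, so applying
    the coercion never searches field names.
    """
    # Raises ValueError if the supertype demands a field the subtype lacks.
    offsets = [list(sub_fields).index(label) for label in super_fields]

    def coerce(vector: Sequence[object]) -> List[object]:
        # Copy just the slots the supertype's layout expects, in its order.
        return [vector[i] for i in offsets]

    return coerce


# Coerce a 3-element {a, b, c} vector to a 2-element {a, c} vector.
abc_to_ac = record_coercion(["a", "b", "c"], ["a", "c"])
coerced = abc_to_ac([10, 20, 30])
```

After coercion, code compiled against the two-field layout can read field c at fixed offset 1, even though it sat at offset 2 in the original vector.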
Exercise 25. Define a Point as a record with fields x and y, which are integers. Define a ColorPoint as a Point with an additional field, the color, which is a string. Define a function that takes a Point. Show that your function can be used on a ColorPoint.