4 λ-sub: subtyping with records

4.1 Syntax

Extending STLC with records is straightforward. First, we extend the syntax of types and terms, using a dedicated metavariable for record field labels:

A record type lists field names with their types; assume the field names are not repeated within a record and they are always in a canonical order (imagine the parser sorts them so we do not need to consider out-of-order fields). A record expression lists field names with expressions whose values will fill the fields. A projection expression projects the value of the named field from a record.
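As a rough illustration, the extended syntax could be encoded as a small abstract syntax tree. This is only a sketch: the class names, the single placeholder `Base` type, and the `canonical` helper (which plays the role of the parser that sorts fields and rejects repeated labels) are assumptions made for this example and are not part of the notes.

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Base:
    """Hypothetical stand-in for the base type(s) inherited from STLC."""

@dataclass(frozen=True)
class Arrow:
    dom: "Type"
    cod: "Type"

@dataclass(frozen=True)
class RecordType:
    # (label, type) pairs, assumed to be in canonical (sorted) order
    fields: Tuple[Tuple[str, "Type"], ...]

Type = Union[Base, Arrow, RecordType]

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    param_type: Type
    body: "Term"

@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"

@dataclass(frozen=True)
class RecordExpr:
    # (label, term) pairs, assumed to be in canonical (sorted) order
    fields: Tuple[Tuple[str, "Term"], ...]

@dataclass(frozen=True)
class Proj:
    record: "Term"
    label: str

Term = Union[Var, Lam, App, RecordExpr, Proj]

def canonical(fields):
    """Sort (label, payload) pairs by label, rejecting repeated labels."""
    labels = [label for label, _ in fields]
    if len(set(labels)) != len(labels):
        raise ValueError("repeated field label")
    return tuple(sorted(fields, key=lambda pair: pair[0]))
```

Building every record through `canonical` means later code can compare records field by field without ever considering out-of-order labels.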

4.2 Dynamic semantics

The dynamics are straightforward. We extend values to include records where every field contains a value. We extend evaluation contexts to evaluate the fields of a record from left to right.

Then we add one reduction rule, for projecting the field from a record:
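The record dynamics can be sketched as a one-step reduction function over a tiny fragment of the language. The tuple encoding of terms and the use of Python integers as stand-in base values are assumptions of this sketch. It covers only records and projection, not functions.

```python
# Terms (hypothetical encoding):
#   an int                         a stand-in base value
#   ("rec", ((label, term), ...))  a record expression
#   ("proj", term, label)          a projection

def is_value(t):
    """Ints are values; a record is a value when every field is a value."""
    if isinstance(t, int):
        return True
    if t[0] == "rec":
        return all(is_value(e) for _, e in t[1])
    return False

def step(t):
    """Take one small step, or return None if t is a value or stuck."""
    if isinstance(t, int):
        return None
    if t[0] == "rec":
        fields = t[1]
        # Evaluate fields left to right: step the first non-value field.
        for i, (label, e) in enumerate(fields):
            if not is_value(e):
                e2 = step(e)
                if e2 is None:
                    return None
                return ("rec", fields[:i] + ((label, e2),) + fields[i + 1:])
        return None
    if t[0] == "proj":
        _, e, label = t
        if is_value(e):
            # Projection from a record value selects the named field.
            if isinstance(e, tuple) and e[0] == "rec":
                return dict(e[1]).get(label)
            return None
        e2 = step(e)
        return None if e2 is None else ("proj", e2, label)
    return None
```

Stepping a record whose fields are both projections reduces the left field first, which mirrors the left-to-right evaluation contexts.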

4.3 Static semantics

The simplest way to type records is to add one rule for each new expression form and keep the rest of the language the same:

This works, but it’s not as expressive as we might like. Consider a function that takes a record with a single field and projects out that field. Is there any reason we shouldn’t be able to apply this function to a record that has additional fields? Subtyping captures that intuition, allowing us to formalize it and prove it sound.

4.3.1 Subtyping

To do this, we define the subtype relation $\tau_1 <: \tau_2$, which relates pairs of types. Intuitively, $\tau_1 <: \tau_2$ means that a $\tau_1$ may be used wherever a $\tau_2$ is required.

First, the base type is a subtype of itself:

Second, function types are contravariant in the domain and covariant in the codomain:

Exercise 30. Suppose that (and and ). Consider the types , , , and . Which of these are subtypes of which others? Does this make sense?

Finally, records provide subtyping by allowing the forgetting of fields (this is called width subtyping) and by subtyping within individual fields (depth subtyping). We can express this with three rules:

Rule [rec-empty] says that the empty record is a subtype of itself; we need this as a base case. Rule [rec-width] says that subtype records may have fields that are missing from their supertypes. Rule [rec-depth] says that when two records have a field in common, the field’s type in the subtype record must be a subtype of its type in the supertype record.
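The subtype relation described above can be sketched as a recursive check. The tuple encoding of types, the `BASE` placeholder, and the helper names are assumptions of this sketch. The record case relies on the stated convention that fields are unique and in canonical order, which lets the three record rules be applied deterministically from left to right.

```python
BASE = ("base",)  # hypothetical stand-in for the base type

def arrow(dom, cod):
    return ("->", dom, cod)

def record(**fields):
    # Keyword arguments cannot repeat; sorting gives the canonical order.
    return ("rec", tuple(sorted(fields.items())))

def is_subtype(s, t):
    """Decide s <: t for base, arrow, and record types."""
    if s == BASE and t == BASE:
        return True
    if s[0] == "->" and t[0] == "->":
        # Contravariant in the domain, covariant in the codomain.
        return is_subtype(t[1], s[1]) and is_subtype(s[2], t[2])
    if s[0] == "rec" and t[0] == "rec":
        return fields_subtype(s[1], t[1])
    return False

def fields_subtype(fs, ft):
    if not fs:
        # rec-empty: only the empty record remains on both sides.
        return not ft
    (label, field_type), rest = fs[0], fs[1:]
    if ft and ft[0][0] == label:
        # rec-depth: a shared field must be related by subtyping.
        return is_subtype(field_type, ft[0][1]) and fields_subtype(rest, ft[1:])
    # rec-width: forget a field that the supertype does not mention.
    return fields_subtype(rest, ft)
```

For instance, with `abc = record(a=BASE, b=BASE, c=BASE)` and `ac = record(a=BASE, c=BASE)`, the check accepts `abc` as a subtype of `ac` but not the reverse, and it accepts `arrow(ac, BASE)` as a subtype of `arrow(abc, BASE)` because the domain is contravariant.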

Exercise 31. Prove that the subtype relation is a preorder, that is, reflexive and transitive.

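The idea of subtyping is that we can apply it everywhere. If we can conclude that $\Gamma \vdash e : \tau_1$ and $\tau_1 <: \tau_2$, then we should be able to conclude that $\Gamma \vdash e : \tau_2$. It’s possible to add such a rule, and it works fine theoretically, but because the rule is not syntax directed, it can be difficult to implement. In fact, the only place in our current language that we need subtyping is in the application rule, so we replace the STLC application rule with one that lets the argument’s type be any subtype of the function’s domain: if $\Gamma \vdash e_1 : \tau_1 \to \tau$, $\Gamma \vdash e_2 : \tau_2$, and $\tau_2 <: \tau_1$, then $\Gamma \vdash e_1\,e_2 : \tau$.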
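To illustrate the modified application rule, here is a minimal sketch of a syntax-directed type checker in which subtyping is consulted only at applications. The tuple encodings of types and terms, the `("lit",)` constant of base type, and the function names are assumptions of this sketch.

```python
BASE = ("base",)  # hypothetical stand-in for the base type

def is_subtype(s, t):
    if s == BASE and t == BASE:
        return True
    if s[0] == "->" and t[0] == "->":
        return is_subtype(t[1], s[1]) and is_subtype(s[2], t[2])
    if s[0] == "rec" and t[0] == "rec":
        return fields_subtype(s[1], t[1])
    return False

def fields_subtype(fs, ft):
    if not fs:
        return not ft
    if ft and ft[0][0] == fs[0][0]:
        return is_subtype(fs[0][1], ft[0][1]) and fields_subtype(fs[1:], ft[1:])
    return fields_subtype(fs[1:], ft)

def type_of(env, e):
    """Return the type of term e in environment env, or raise TypeError."""
    tag = e[0]
    if tag == "lit":                      # a constant of base type
        return BASE
    if tag == "var":
        return env[e[1]]
    if tag == "lam":                      # ("lam", x, type_of_x, body)
        _, x, t, body = e
        return ("->", t, type_of({**env, x: t}, body))
    if tag == "app":                      # the only rule that uses subtyping
        t_fun = type_of(env, e[1])
        t_arg = type_of(env, e[2])
        if t_fun[0] != "->":
            raise TypeError("applying a non-function")
        if not is_subtype(t_arg, t_fun[1]):
            raise TypeError("argument type is not a subtype of the domain")
        return t_fun[2]
    if tag == "rec":                      # fields assumed in canonical order
        return ("rec", tuple((l, type_of(env, ei)) for l, ei in e[1]))
    if tag == "proj":                     # ("proj", record_term, label)
        t_rec = type_of(env, e[1])
        if t_rec[0] != "rec" or e[2] not in dict(t_rec[1]):
            raise TypeError("no such field")
        return dict(t_rec[1])[e[2]]
    raise ValueError("unknown term")
```

With this checker, a function expecting a record with a single field `a` accepts an argument record with fields `a` and `b`, while every other term form is typed exactly as in the version without subtyping.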

4.3.2 Type safety

Subtyping changes our preservation theorem somewhat, because reduction can cause type refinement. (That is, we learn more type information.) Here is the updated preservation theorem:

Theorem (Preservation). If $\vdash e : \tau$ and $e \longrightarrow e'$, then there exists some $\tau'$ such that $\vdash e' : \tau'$ and $\tau' <: \tau$.

Before we can prove it, we update the replacement and substitution lemmas as follows:

Lemma (Replacement). If , then for some type . Furthermore, for any such that for , for some such that .

Proof. By induction on . The interesting cases are for application:

If is then the whole term has a type only if there are some types and such that and where . Then by induction, has a type, and if we replace with having a subtype of that, then for . The subtyping relation relates arrows only to other arrows, so = with and . Then by transitivity, . This means that we can reform the application , which has a subtype of .
If is , then the whole term has a type only if there are some types and such that and where . Then by induction, has a type, and if we replace with having a subtype of that, then where . Then by transitivity, , so we can reform the application having the same type .

Lemma (Substitution). If and where then for .

Proof. By induction on the derivation of the typing of :

.
If = , then = . Then = , which has type . Let be . Then the subtyping holds.
If ≠ , then , as before the substitution.
, then substitution has no effect and it types in any environment.
, then by induction , which relates only to . Then reapply .
, then by inversion we know that . Then by the induction hypothesis, for some . Then by [abs], , which is a subtype of .
, then by inversion we know that and where . Then by induction (twice), we have that where and that where . By inspection of the subtype relation, the only types related to arrow types are arrow types, so must be an arrow type where and . Then by transitivity (twice), . This means we can apply yielding type , which is a subtype of .
The record construction and projection cases are straightforward.

Proof (of preservation). By cases on the reduction relation. There are two cases:

If , then by replacement, has a type, and it suffices to show that this type is preserved. Then by inversion (twice), we know that and where . Then by the substitution lemma, where .
If , this case is straightforward.

QED.

Lemma (Canonical forms).

If , then:

If is , then is either or .
If is , then has the form .
If is , then is a record with at least the fields .

Proof. By induction on the structure of the typing derivation. Only four rules form values, and those rules correspond to the conditions of the lemma.

Lemma (Progress). If $\vdash e : \tau$, then either $e$ is a value or $e \longrightarrow e'$ for some term $e'$.

Proof. By induction on the typing derivation:

is vacuous.
is a value.
If then by inversion, . Then by induction, either takes a step or is a value. If it’s a value, then is a value; if it takes a step to then takes a step to .
If then by inversion, and for some types and such that . Then by induction, each of and either is a value or takes a step. If takes a step to , then the whole term takes a step to . If is a value and takes a step to , then the whole term takes a step to . Otherwise, is a value . By the canonical forms lemma, has the form . Then the whole term takes a step to .
is a value.
If then by inversion, for all . Then by induction, each of those takes a step or is a value. If any takes a step, then the whole term steps by the leftmost to take a step. Otherwise, they are all values, and the whole term is a value.
If then by inversion, has a record type with a field having type . By induction, either takes a step or is a value . If it takes a step then the whole term takes a step. If it’s a value, then by the canonical forms lemma, it’s a value . Then the whole term takes a step to .

Theorem (Type safety). λ-sub is type safe.

Proof. By progress and preservation.

4.4 Compiling with coercions

To say that $\tau_1 <: \tau_2$ is to say that a $\tau_1$ can be used wherever a $\tau_2$ is expected, but do our run-time representations actually make that true? In some languages yes, but in many languages no. We might not want record operations to do a (linear) search of field names at run time, for example, but instead to fix each field’s offset at compile time. Such a representation choice is not incompatible with subtyping, if we are willing to interpret subtyping as a coercion between potentially different underlying representation types. For example, a record type with fields a, b, and c is a subtype of a record type with only fields a and c (given the same field types). The former is represented by a 3-element vector containing the values of fields a, b, and c, whereas the latter is represented by a 2-element vector containing the values of fields a and c. We cannot use an instance of the former as the latter directly, but we can coerce it. The coercion between two types in the subtype relation is witnessed by a function converting the subtype to the supertype.

In particular, the witness to the fact that the base type is a subtype of itself is the identity function on that type:

To witness an arrow subtyping, we build a function that first coerces its argument with the witness for the domain, then applies the original function, and finally coerces the result with the witness for the codomain:

The empty record is a supertype of itself, by the identity coercion:

For width subtyping, the coercion skips the fields of the subtype that do not appear in the supertype:

The depth-subtyping record case is hairy. We convert a record by coercing one field, recursively converting the rest of the record, and then reassembling the desired result:

The typing rules now translate from a language with subtyping to a language that doesn’t use subtyping. All of the rules except [app] just translate each term by homomorphically translating the subterms:

The only interesting rule is [app], which includes subtyping. It generates the coercion for the particular subtyping used, and then applies that to coerce the argument to the function:
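The translation can be sketched as an elaborator that returns both a type and compiled code. In this sketch the target language is plain Python closures, records are represented as tuples in canonical field order, and projection compiles to a fixed index. These representation choices, the tuple encodings of types and terms, the integer literals, and all function names are assumptions of this example, not the notes’ definitions.

```python
BASE = ("base",)  # hypothetical stand-in for the base type

def coerce(s, t):
    """Return a function witnessing s <: t, or raise TypeError."""
    if s == BASE and t == BASE:
        return lambda v: v                       # identity on the base type
    if s[0] == "->" and t[0] == "->":
        dom = coerce(t[1], s[1])                 # contravariant domain
        cod = coerce(s[2], t[2])                 # covariant codomain
        return lambda f: lambda x: cod(f(dom(x)))
    if s[0] == "rec" and t[0] == "rec":
        return coerce_fields(s[1], t[1])
    raise TypeError("not a subtype")

def coerce_fields(fs, ft):
    if not fs:
        if ft:
            raise TypeError("missing field " + ft[0][0])
        return lambda v: ()                      # empty record
    if ft and ft[0][0] == fs[0][0]:              # depth: coerce, then rebuild
        head = coerce(fs[0][1], ft[0][1])
        tail = coerce_fields(fs[1:], ft[1:])
        return lambda v: (head(v[0]),) + tail(v[1:])
    tail = coerce_fields(fs[1:], ft)             # width: skip this field
    return lambda v: tail(v[1:])

def elaborate(env, e):
    """Return (type, code); code maps a run-time environment to a value."""
    tag = e[0]
    if tag == "lit":
        n = e[1]
        return BASE, lambda rho: n
    if tag == "var":
        x = e[1]
        return env[x], lambda rho: rho[x]
    if tag == "lam":
        _, x, t, body = e
        t_body, c_body = elaborate({**env, x: t}, body)
        return ("->", t, t_body), lambda rho: lambda v: c_body({**rho, x: v})
    if tag == "app":
        t_fun, c_fun = elaborate(env, e[1])
        t_arg, c_arg = elaborate(env, e[2])
        if t_fun[0] != "->":
            raise TypeError("applying a non-function")
        c = coerce(t_arg, t_fun[1])              # coercion for this subtyping
        return t_fun[2], lambda rho: c_fun(rho)(c(c_arg(rho)))
    if tag == "rec":                             # fields assumed canonical
        parts = [(l, elaborate(env, ei)) for l, ei in e[1]]
        codes = [ci for _, (_, ci) in parts]
        t = ("rec", tuple((l, ti) for l, (ti, _) in parts))
        return t, lambda rho: tuple(ci(rho) for ci in codes)
    if tag == "proj":
        t_rec, c_rec = elaborate(env, e[1])
        labels = [l for l, _ in t_rec[1]] if t_rec[0] == "rec" else []
        if e[2] not in labels:
            raise TypeError("no such field")
        i = labels.index(e[2])                   # offset fixed at compile time
        return t_rec[1][i][1], lambda rho: c_rec(rho)[i]
    raise ValueError("unknown term")
```

Applying a function that projects field `c` from a two-field record (offset 1) to a record with fields `a`, `b`, and `c` works only because the inserted coercion first rebuilds the argument as a 2-element tuple. Without it, offset 1 would select the value of `b`.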

Exercise 32. Define a Point as a record with fields x and y, which are integers. Define a ColorPoint as a Point with an additional field, the color, which is a string. Define a function that takes a Point. Show that your function can be used on a ColorPoint.
