3 λ-sub: subtyping with records
3.1 Syntax
A record type lists field names with their types; assume the field names are not repeated within a record. A record expression lists field names with expressions whose values will fill the fields. A projection expression projects the value of the named field from a record.
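As a rough illustration of the three record forms just described, here is a minimal Python sketch of an abstract syntax. The class names (`Nat`, `Arrow`, `RecordT`, `Zero`, `RecordE`, `Proj`) are hypothetical, and the base type `nat` with its constant `z` is an assumption drawn from the proofs later in this section, not from a grammar given here. The only invariant enforced is the one stated above: field names are not repeated within a record.

```python
from dataclasses import dataclass


def _check_distinct(fields):
    """Reject a field list that repeats a label."""
    labels = [label for label, _ in fields]
    if len(labels) != len(set(labels)):
        raise ValueError("repeated field name in record")


# --- Types (assumed: nat, arrows, and records) ---
@dataclass(frozen=True)
class Nat:
    pass


@dataclass(frozen=True)
class Arrow:
    dom: object
    cod: object


@dataclass(frozen=True)
class RecordT:
    fields: tuple  # ((label, type), ...)

    def __post_init__(self):
        _check_distinct(self.fields)


# --- Expressions (only the record-related forms, plus one constant) ---
@dataclass(frozen=True)
class Zero:
    pass


@dataclass(frozen=True)
class RecordE:
    fields: tuple  # ((label, expression), ...)

    def __post_init__(self):
        _check_distinct(self.fields)


@dataclass(frozen=True)
class Proj:
    record: object  # expression expected to evaluate to a record
    label: str


# The record {a = z, b = z} and the projection of its field a.
example_record = RecordE((("a", Zero()), ("b", Zero())))
example_proj = Proj(example_record, "a")
```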
3.2 Dynamic semantics
3.3 Static semantics
This works, but it’s not as expressive as we might like. Consider a function that takes a record of one field and projects out that field. Is there any reason we shouldn’t be able to use this function on a record with more fields than the one it requires? Subtyping captures that intuition, allowing us to formalize it and prove it sound.
3.3.1 Subtyping
To do this, we define the subtype relation <:, which relates pairs of types. Intuitively, τ₁ <: τ₂ means that a τ₁ may be used wherever a τ₂ is required.
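The subtype relation can be read as a recursive check on pairs of types. The sketch below is an illustration under assumptions, not a transcription of the rules: it assumes `nat` is related only to itself, that arrows are contravariant in the domain and covariant in the codomain (the proofs later in this section rely on arrows being related only to arrows), and that records support both width and depth subtyping. The names `Nat`, `Arrow`, `Record`, and `is_subtype` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nat:
    pass


@dataclass(frozen=True)
class Arrow:
    dom: object
    cod: object


@dataclass(frozen=True)
class Record:
    fields: tuple  # ((label, type), ...), labels assumed distinct


def is_subtype(s, t):
    """Return True if s <: t under the assumed rules."""
    if isinstance(s, Nat) and isinstance(t, Nat):
        return True
    if isinstance(s, Arrow) and isinstance(t, Arrow):
        # Contravariant in the domain, covariant in the codomain.
        return is_subtype(t.dom, s.dom) and is_subtype(s.cod, t.cod)
    if isinstance(s, Record) and isinstance(t, Record):
        have = dict(s.fields)
        # Width: s may have extra fields. Depth: shared fields may be subtypes.
        return all(
            label in have and is_subtype(have[label], field_type)
            for label, field_type in t.fields
        )
    return False


rec_a = Record((("a", Nat()),))
rec_ab = Record((("a", Nat()), ("b", Nat())))
```

With these definitions, `is_subtype(rec_ab, rec_a)` holds but the converse does not, and a function accepting `rec_a` is a subtype of a function accepting `rec_ab`.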
Exercise 23. Suppose that . Consider the types , , , and . Which of these are subtypes of which others? Does this make sense?
Exercise 24. Prove that <: is a preorder, that is, reflexive and transitive.
The idea of subtyping is that we can apply it everywhere. If we can conclude that Γ ⊢ e : τ and τ <: τ′ then we should be able to conclude that Γ ⊢ e : τ′. It’s possible to add such a rule, and it works fine theoretically, but because the rule is not syntax directed, it can be difficult to implement. In fact, the only place in our current language that we need subtyping is in the application rule, so we replace the STLC application rule with this:
If Γ ⊢ e₁ : τ₁ → τ and Γ ⊢ e₂ : τ₂ where τ₂ <: τ₁, then Γ ⊢ e₁ e₂ : τ.
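To see why confining subtyping to application keeps the system syntax directed, here is a minimal type-checker sketch in Python. It assumes the application rule accepts an argument whose type is a subtype of the function's domain, and it assumes the same subtyping rules as a standard width-and-depth record system with contravariant arrow domains. The tuple encoding of terms and types and the names `is_subtype` and `type_of` are hypothetical.

```python
# Types: "nat", ("->", dom, cod), ("rec", {label: type})
# Terms: ("var", x), ("z",), ("s", e), ("lam", x, type, body),
#        ("app", e1, e2), ("rec", {label: e}), ("proj", e, label)

def is_subtype(s, t):
    """Assumed subtype relation: nat <: nat, arrows, width/depth records."""
    if s == "nat" and t == "nat":
        return True
    if s[0] == "->" and t[0] == "->":
        return is_subtype(t[1], s[1]) and is_subtype(s[2], t[2])
    if s[0] == "rec" and t[0] == "rec":
        return all(l in s[1] and is_subtype(s[1][l], ft) for l, ft in t[1].items())
    return False


def type_of(env, e):
    """Compute the type of e in env, or raise TypeError. One case per term form."""
    tag = e[0]
    if tag == "var":
        if e[1] not in env:
            raise TypeError("unbound variable")
        return env[e[1]]
    if tag == "z":
        return "nat"
    if tag == "s":
        if type_of(env, e[1]) != "nat":
            raise TypeError("successor of a non-nat")
        return "nat"
    if tag == "lam":
        _, x, param_type, body = e
        return ("->", param_type, type_of({**env, x: param_type}, body))
    if tag == "app":
        fn_type = type_of(env, e[1])
        arg_type = type_of(env, e[2])
        if fn_type[0] != "->":
            raise TypeError("applying a non-function")
        # The only use of subtyping: the argument may be a subtype of the domain.
        if not is_subtype(arg_type, fn_type[1]):
            raise TypeError("argument type is not a subtype of the domain")
        return fn_type[2]
    if tag == "rec":
        return ("rec", {l: type_of(env, ei) for l, ei in e[1].items()})
    if tag == "proj":
        rec_type = type_of(env, e[1])
        if rec_type[0] != "rec" or e[2] not in rec_type[1]:
            raise TypeError("projection of a missing field")
        return rec_type[1][e[2]]
    raise ValueError("unknown term form")


# A one-field projection function applied to a two-field record.
get_a = ("lam", "x", ("rec", {"a": "nat"}), ("proj", ("var", "x"), "a"))
wide = ("rec", {"a": ("z",), "b": ("s", ("z",))})
```

Because every case is selected by the shape of the term, the checker never has to guess where to apply subtyping, which is the difficulty a free-standing subsumption rule would introduce.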
3.3.2 Type safety
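Subtyping changes our preservation theorem somewhat, because reduction can cause type refinement. (That is, we learn more type information.) For example, (λx:{a: nat}. x) {a = z, b = z} has type {a: nat}, but it steps to {a = z, b = z}, whose type {a: nat, b: nat} is a proper subtype. Here is the updated preservation theorem: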
Theorem (Preservation). If ⊢ e : τ and e ⟶ e′ then there exists some τ′ such that ⊢ e′ : τ′ and τ′ <: τ.
Before we can prove it, we update the replacement and substitution lemmas as follows:
Lemma (Replacement). If ⊢ E[e] : τ, then ⊢ e : τₑ for some type τₑ. Furthermore, for any e′ such that ⊢ e′ : τₑ′ for τₑ′ <: τₑ, ⊢ E[e′] : τ′ for some τ′ such that τ′ <: τ.
Proof. By induction on E. The interesting cases are for application:
If E is E₁ e₂ then the whole term has a type only if there are some types τ₁ and τ₂ such that ⊢ E₁[e] : τ₁ → τ and ⊢ e₂ : τ₂ where τ₂ <: τ₁. Then by induction, e has a type, and if we replace e with e′ having a subtype of that, then ⊢ E₁[e′] : τ† for τ† <: τ₁ → τ. The subtyping relation relates arrows only to other arrows, so τ† = τ₁′ → τ′ with τ₁ <: τ₁′ and τ′ <: τ. Then by transitivity, τ₂ <: τ₁′. This means that we can reform the application E₁[e′] e₂, which has type τ′, a subtype of τ.
If E is v₁ E₂, then the whole term has a type only if there are some types τ₁ and τ₂ such that ⊢ v₁ : τ₁ → τ and ⊢ E₂[e] : τ₂ where τ₂ <: τ₁. Then by induction, e has a type, and if we replace e with e′ having a subtype of that, then ⊢ E₂[e′] : τ₂′ where τ₂′ <: τ₂. Then by transitivity, τ₂′ <: τ₁, so we can reform the application v₁ E₂[e′] having the same type τ.
Lemma (Substitution). If Γ, x:τₓ ⊢ e : τ and ⊢ v : τᵥ where τᵥ <: τₓ then Γ ⊢ e[v/x] : τ′ for some τ′ <: τ.
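Proof. By induction on the derivation of the typing of e:
If e is a variable y, there are two cases.
If x = y, then τ = τₓ. Then e[v/x] = v, which has type τᵥ. Let τ′ be τᵥ. Then the subtyping holds.
If x ≠ y, then Γ ⊢ y : τ, as before the substitution.
If e = z, then substitution has no effect and it types in any environment.
If e = s e₁, then by induction Γ ⊢ e₁[v/x] : τ₁′ for some τ₁′ <: nat, and subtyping relates nat only to nat. Then reapply the successor rule.
If e = λy:τ₁. e₂ with τ = τ₁ → τ₂, then by inversion we know that Γ, x:τₓ, y:τ₁ ⊢ e₂ : τ₂. Then by the induction hypothesis, Γ, y:τ₁ ⊢ e₂[v/x] : τ₂′ for some τ₂′ <: τ₂. Then by [abs], Γ ⊢ λy:τ₁. e₂[v/x] : τ₁ → τ₂′, which is a subtype of τ₁ → τ₂.
If e = e₁ e₂, then by inversion we know that Γ, x:τₓ ⊢ e₁ : τ₁ → τ and Γ, x:τₓ ⊢ e₂ : τ₂ where τ₂ <: τ₁. Then by induction (twice), we have that Γ ⊢ e₁[v/x] : τ† where τ† <: τ₁ → τ and that Γ ⊢ e₂[v/x] : τ₂′ where τ₂′ <: τ₂. By inspection of the subtype relation, the only types related to arrow types are arrow types, so τ† must be an arrow type τ₁′ → τ′ where τ₁ <: τ₁′ and τ′ <: τ. Then by transitivity (twice), τ₂′ <: τ₁′. This means we can apply the application rule, yielding type τ′, which is a subtype of τ.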
The record construction and projection cases are straightforward.
Proof (of preservation). By cases on the reduction relation. There are two cases:
If E[(λx:τ₁. e₁) v₂] ⟶ E[e₁[v₂/x]], then by replacement, (λx:τ₁. e₁) v₂ has a type τ, and it suffices to show that this type is preserved up to subtyping. Then by inversion (twice), we know that x:τ₁ ⊢ e₁ : τ and ⊢ v₂ : τ₂ where τ₂ <: τ₁. Then by the substitution lemma, ⊢ e₁[v₂/x] : τ′ where τ′ <: τ.
If E[{…, l = v, …}.l] ⟶ E[v], this case is straightforward.
QED.
Lemma (Canonical forms).
If ⊢ v : τ, then:
If τ is nat, then v is either z or s v′.
If τ is τ₁ → τ₂, then v has the form λx:τ₁. e.
If τ is {lᵢ: τᵢ}, then v is a record with at least the fields lᵢ.
Proof. By induction on the structure of the typing derivation. Only four rules form values, and those rules correspond to the conditions of the lemma.
Lemma (Progress). If ⊢ e : τ then either e is a value or e ⟶ e′ for some term e′.
Proof. By induction on the typing derivation:
If e = x, the case is vacuous, because a variable has no type in the empty environment.
If e = z, it is a value.
If e = s e₁ then by inversion, ⊢ e₁ : nat. Then by induction, e₁ either takes a step or is a value. If it’s a value, then s e₁ is a value; if it takes a step to e₁′ then s e₁ takes a step to s e₁′.
If e = e₁ e₂ then by inversion, ⊢ e₁ : τ₁ → τ and ⊢ e₂ : τ₂ for some types τ₁ and τ₂ such that τ₂ <: τ₁. Then by induction, each of e₁ and e₂ either is a value or takes a step. If e₁ takes a step to e₁′, then the whole term takes a step to e₁′ e₂. If e₁ is a value v₁ and e₂ takes a step to e₂′, then the whole term takes a step to v₁ e₂′. Otherwise, e₂ is a value v₂. By the canonical forms lemma, v₁ has the form λx:τ₁. e₁₁. Then the whole term takes a step to e₁₁[v₂/x].
If e = λx:τ₁. e₁, it is a value.
If e = {lᵢ = eᵢ} then by inversion, ⊢ eᵢ : τᵢ for all i. Then by induction, each eᵢ takes a step or is a value. If any takes a step, then the whole term steps by the leftmost eᵢ to take a step. Otherwise, they are all values, and the whole term is a value.
If e = e₁.l then by inversion, e₁ has a record type with a field l having type τ. By induction, e₁ either takes a step or is a value v₁. If it takes a step then the whole term takes a step. If it’s a value, then by the canonical forms lemma, it’s a record value {…, l = v, …}. Then the whole term takes a step to v.
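The case analysis in the progress proof can be read as a one-step reduction function. The following Python sketch is one way to write it, assuming call-by-value, left-to-right evaluation and the term forms variable, `z`, `s`, λ, application, record, and projection. The tuple encoding and the names `is_value`, `subst`, and `step` are hypothetical, and substitution assumes the substituted value is closed, so variable capture cannot occur.

```python
# Terms: ("var", x), ("z",), ("s", e), ("lam", x, type, body),
#        ("app", e1, e2), ("rec", {label: e}), ("proj", e, label)

def is_value(e):
    tag = e[0]
    if tag in ("z", "lam"):
        return True
    if tag == "s":
        return is_value(e[1])
    if tag == "rec":
        return all(is_value(ei) for ei in e[1].values())
    return False


def subst(e, x, v):
    """Replace free occurrences of x in e by the closed value v."""
    tag = e[0]
    if tag == "var":
        return v if e[1] == x else e
    if tag == "z":
        return e
    if tag == "s":
        return ("s", subst(e[1], x, v))
    if tag == "lam":
        _, y, ty, body = e
        return e if y == x else ("lam", y, ty, subst(body, x, v))  # respect shadowing
    if tag == "app":
        return ("app", subst(e[1], x, v), subst(e[2], x, v))
    if tag == "rec":
        return ("rec", {l: subst(ei, x, v) for l, ei in e[1].items()})
    if tag == "proj":
        return ("proj", subst(e[1], x, v), e[2])
    raise ValueError("unknown term form")


def step(e):
    """Return the term after one reduction step, or None if e cannot step."""
    tag = e[0]
    if tag == "s":
        inner = step(e[1])
        return None if inner is None else ("s", inner)
    if tag == "app":
        fn, arg = e[1], e[2]
        if not is_value(fn):
            fn2 = step(fn)
            return None if fn2 is None else ("app", fn2, arg)
        if not is_value(arg):
            arg2 = step(arg)
            return None if arg2 is None else ("app", fn, arg2)
        return subst(fn[3], fn[1], arg) if fn[0] == "lam" else None
    if tag == "rec":
        for label, ei in e[1].items():  # dicts keep insertion order: leftmost first
            if not is_value(ei):
                ei2 = step(ei)
                return None if ei2 is None else ("rec", {**e[1], label: ei2})
        return None
    if tag == "proj":
        rec = e[1]
        if not is_value(rec):
            rec2 = step(rec)
            return None if rec2 is None else ("proj", rec2, e[2])
        return rec[1][e[2]] if rec[0] == "rec" and e[2] in rec[1] else None
    return None  # variables, z, and lambdas do not step


get_a = ("lam", "x", ("rec", {"a": "nat"}), ("proj", ("var", "x"), "a"))
wide = ("rec", {"a": ("z",), "b": ("s", ("z",))})
```

Progress says that for a well-typed closed term, `step` returns `None` only when `is_value` holds; the `None` results in the non-value branches correspond to stuck terms that the type system rules out.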
Theorem (Type safety). λ-sub is type safe.
Proof. By progress and preservation.
3.4 Compiling with coercions
To say that τ₁ <: τ₂ is to say that a τ₁ can be used wherever a τ₂ is expected, but do our run-time representations actually make that true? In some languages yes, but in many languages no. We might not want, for example, record operations to do a (linear) search of field names at run time; we would rather fix each field’s offset at compile time. Such a representation choice is not incompatible with subtyping, if we are willing to interpret subtyping as a coercion between potentially different underlying representation types. For example, a record type with fields a, b, and c is a subtype of the record type with only the fields a and c (at the same field types). The former is represented by a 3-element vector containing the values of fields a, b, and c, whereas the latter is represented as a 2-element vector containing the values of fields a and c. We cannot use an instance of the former as the latter directly, but we can coerce it. The coercion between two types in the subtype relationship is witnessed by the function converting the subtype to the supertype.
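The vector example can be made concrete with a small Python sketch. It models a record as a tuple whose layout is fixed by the order of labels in its type, and builds the coercion by resolving offsets once, ahead of time. The names `record_coercion`, `get_c`, and `abc_to_ac` are hypothetical, and the sketch covers only width subtyping; coercing field values (depth) and wrapping functions (arrows) are left out.

```python
def record_coercion(sub_labels, super_labels):
    """Return a function converting the subtype's vector to the supertype's.

    Both arguments list field labels in layout order. Offsets are resolved
    here, once, so the returned coercion does no name lookup when it runs.
    Raises ValueError if the supertype needs a field the subtype lacks.
    """
    offsets = [sub_labels.index(label) for label in super_labels]

    def coerce(vec):
        return tuple(vec[i] for i in offsets)

    return coerce


def get_c(vec_ac):
    """Code 'compiled' against the layout (a, c): field c is at offset 1."""
    return vec_ac[1]


abc_to_ac = record_coercion(["a", "b", "c"], ["a", "c"])
abc = (10, 20, 30)  # values of fields a, b, c, in that order
```

Passing `abc` to `get_c` directly reads offset 1 of the 3-element vector, which is field b rather than c. Passing `abc_to_ac(abc)` instead gives `get_c` the 2-element layout it was compiled for, which is why the compiler would insert the coercion wherever the typing derivation uses subtyping.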
Exercise 25. Define a Point as a record with fields x and y, which are integers. Define a ColorPoint as a Point with an additional field, the color, which is a string. Define a function that takes a Point. Show that your function can be used on a ColorPoint.