7 Qualified types
In the previous lecture, we saw how ML infers types for programs that
lack type annotations. In this lecture, we see how to extend ML with a
principled form of overloading, similar to how it appears in Haskell and
Rust. In particular, we will extend type schemes to a form
$\forall \overline{\alpha}.\, P \Rightarrow \tau$,
where $P$ is a logical formula over types that must be
satisfied to use a value having that type scheme.
7.1 Syntax
Our language includes the usual variables, lambda abstractions,
applications, and let from ML, as well as some constants, a conditional
form, and pairs:
The constants include integers, and functions for projecting from pairs,
subtraction, equality, and less-than:
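As a sketch consistent with the description above, the grammar of terms and constants can be written as follows. The metavariable names ($e$, $c$, $x$, $n$) and the spellings of the constants are assumptions chosen for illustration.

```latex
\begin{aligned}
e &::= x \mid \lambda x.\, e \mid e_1\, e_2
   \mid \mathsf{let}\ x = e_1\ \mathsf{in}\ e_2
   \mid c \mid \mathsf{if0}\ e_1\ e_2\ e_3 \mid (e_1, e_2) \\
c &::= n \mid \mathsf{fst} \mid \mathsf{snd} \mid {-} \mid {=} \mid {<}
\end{aligned}
```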
7.2 Dynamic semantics
Values include constants, lambdas, and pairs of values:
Evaluation contexts are standard, performing left-to-right evaluation
for applications and pairs:
We give a reduction relation that includes rules for application and
let, two rules for if0 (true and false), and delta, which handles
applications of constants by delegating to a metafunction:
The metafunction $\delta$ gives the results for applying constants
to values:
Note that the functions represented by constants are uncurried, taking
pairs of values—this simplifies our presentation somewhat.
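A minimal Haskell sketch of such a metafunction is shown below. The datatype and constructor names are hypothetical, lambda values are left out, and the encoding of truth values (0 for true, 1 for false, to match if0) is an assumption rather than something fixed by the text.

```haskell
-- Constants that denote functions (names are illustrative).
data Const = Fst | Snd | Sub | EqC | LtC
  deriving (Show, Eq)

-- Values; lambda values are omitted to keep the sketch short.
data Value
  = VInt Integer
  | VConst Const
  | VPair Value Value
  deriving (Show, Eq)

-- Assumed encoding of truth values, matching if0: 0 is true, 1 is false.
fromBool :: Bool -> Value
fromBool b = VInt (if b then 0 else 1)

-- delta c v: the result of applying constant c to value v,
-- or Nothing when the application is stuck.
delta :: Const -> Value -> Maybe Value
delta Fst (VPair v _)               = Just v
delta Snd (VPair _ v)               = Just v
delta Sub (VPair (VInt m) (VInt n)) = Just (VInt (m - n))
delta EqC (VPair v1 v2)             = Just (fromBool (v1 == v2))
delta LtC (VPair (VInt m) (VInt n)) = Just (fromBool (m < n))
delta _   _                         = Nothing
```

Every function constant takes a single pair argument, which is why each defining clause matches on VPair.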
7.3 Static semantics
As in ML, the static semantics assigns prenex type schemes to let-bound
values, but type schemes now have an additional component. The syntax of
types is as follows.
7.3.1 Syntax of types
Monotypes include type variables, the base type $\mathsf{Int}$, product
types, and function types:
To represent overloading, we define a fixed set of type classes $C$, which
are used to construct predicates on types $\pi$:
For any type $\tau$, the predicate $\mathsf{Eq}\ \tau$ means that type
$\tau$ supports equality, and the predicate $\mathsf{Ord}\ \tau$ means that
type $\tau$ supports less-than. In a real system, the set of type
classes (and thus the possible predicates) would be extensible by the
user.
A predicate context is a collection of predicates:
Then a qualified type is a monotype qualified by some predicate context:
Then a type scheme is a qualified type generalized over some quantified set
of type variables:
For example, type scheme
$\forall \alpha.\, (\mathsf{Eq}\ \alpha) \Rightarrow \alpha \to \alpha \to \mathsf{Int}$
describes a function that takes (curried) two arguments of any
type $\alpha$ supporting equality and returns an integer.
As in ML, typing environments map variable names to type schemes:
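The syntax of types described in this section could be represented in Haskell roughly as follows. This is only a sketch: all datatype and function names are hypothetical, and the environment is an association list for simplicity.

```haskell
import Data.List (nub)

type TyVar = String

-- Monotypes: variables, the base type, products, and functions.
data Type = TVar TyVar | TInt | TProd Type Type | TFun Type Type
  deriving (Show, Eq)

-- The fixed set of type classes, and predicates built from them.
data Class = ClsEq | ClsOrd deriving (Show, Eq)
data Pred  = Pred Class Type deriving (Show, Eq)

type Context = [Pred]                      -- predicate context
data Qual    = Qual Context Type           -- qualified type
  deriving (Show, Eq)
data Scheme  = Forall [TyVar] Qual         -- type scheme
  deriving (Show, Eq)
type Env     = [(String, Scheme)]          -- typing environment

-- forall a. (Eq a) => a -> a -> Int
exampleScheme :: Scheme
exampleScheme =
  Forall ["a"]
    (Qual [Pred ClsEq (TVar "a")]
          (TFun (TVar "a") (TFun (TVar "a") TInt)))

-- Free type variables of a monotype, without duplicates.
ftvType :: Type -> [TyVar]
ftvType (TVar a)      = [a]
ftvType TInt          = []
ftvType (TProd t1 t2) = nub (ftvType t1 ++ ftvType t2)
ftvType (TFun t1 t2)  = nub (ftvType t1 ++ ftvType t2)

-- Free type variables of a scheme: those not bound by the quantifier.
ftvScheme :: Scheme -> [TyVar]
ftvScheme (Forall vars (Qual ps t)) =
  filter (`notElem` vars)
         (nub (concat [ftvType u | Pred _ u <- ps] ++ ftvType t))
```

The example scheme is closed, so ftvScheme returns the empty list for it.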
7.3.2 The types of constants
We can now define a metafunction that gives type
schemes for the constants:
Note that equality and less-than are overloaded.
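One plausible assignment of type schemes to the constants is sketched below. It assumes that constants are uncurried (as in the dynamic semantics) and that comparisons return $\mathsf{Int}$, since that is the only base type; the exact schemes are an assumption, not a quotation.

```latex
\begin{aligned}
n &: \mathsf{Int} \\
\mathsf{fst} &: \forall \alpha\, \beta.\ \alpha \times \beta \to \alpha \\
\mathsf{snd} &: \forall \alpha\, \beta.\ \alpha \times \beta \to \beta \\
{-} &: \mathsf{Int} \times \mathsf{Int} \to \mathsf{Int} \\
{=} &: \forall \alpha.\ (\mathsf{Eq}\ \alpha) \Rightarrow \alpha \times \alpha \to \mathsf{Int} \\
{<} &: \forall \alpha.\ (\mathsf{Ord}\ \alpha) \Rightarrow \alpha \times \alpha \to \mathsf{Int}
\end{aligned}
```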
7.3.3 Instantiation and entailment
Before we can give our main typing relation, we need two auxiliary
judgments. The first, as in ML, relates a type scheme to its instantiations
as qualified types:
The second relation is entailment for predicate contexts. This is not
strictly necessary (and omitted from Jones’s paper), but allows us to
make predicate contexts smaller when they are redundant. The first two
rules say that a predicate context entails itself and that
entailment is transitive.
The next rule says that we can remove duplicate predicates from a context:
The next two rules say that integers support equality and ordering, and
that fact need not be recorded in the context to prove it:
Finally, equality works on pairs if it works on both components of the pair:
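Read as a simplification procedure, the entailment rules above suggest something like the following Haskell sketch. The names are hypothetical, and this is one possible reading of the rules (drop facts about the base type, split equality on pairs, remove duplicates), not a definitive algorithm.

```haskell
import Data.List (nub)

data Type = TVar String | TInt | TProd Type Type | TFun Type Type
  deriving (Show, Eq)
data Class = ClsEq | ClsOrd deriving (Show, Eq)
data Pred  = Pred Class Type deriving (Show, Eq)

-- Break one predicate into simpler predicates that entail it.
reduce :: Pred -> [Pred]
-- Integers support equality and ordering, so nothing needs recording.
reduce (Pred _ TInt) = []
-- Equality on a pair follows from equality on both components.
reduce (Pred ClsEq (TProd t1 t2)) =
  reduce (Pred ClsEq t1) ++ reduce (Pred ClsEq t2)
-- Anything else is kept as is.
reduce p = [p]

-- Simplify a whole context and remove duplicate predicates.
simplify :: [Pred] -> [Pred]
simplify = nub . concatMap reduce
```

For instance, a context containing Eq (a × Int), Eq a, and Ord Int simplifies to the single predicate Eq a.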
7.3.4 Syntax-directed typing
The typing judgment is of the form $P \mid \Gamma \vdash e : \tau$, where
the predicate context $P$ gives constraints on the types in the judgment.
Even though it appears on the left, $P$ should be thought of as an out-parameter.
The rule for typing a variable says to look up its type scheme in the
environment and then instantiate the bound variables of the type scheme.
The predicate context $P$ from the instantiated type scheme becomes
the predicate context for the judgment:
Typing a constant is substantially the same, except we get its type
scheme using the metafunction for constants:
The rules for lambda abstractions, applications, conditionals, and
pairs are the same as they would be in ML, except that we thread through
and combine the predicate contexts:
Finally, the let rule is where the action is:
First we type $e_1$, which produces a predicate context $P_1$.
Then we apply the entailment relation to reduce $P_1$ to a context
that entails it, $P_1'$. (This step can be omitted, but it reflects
the idea that we probably want to simplify predicate contexts before
including them in type schemes.) Then we build a type scheme by
generalizing all the type variables in $P_1'$ and $\tau_1$ that do
not appear in $\Gamma$, and bind that in the environment to
type $e_2$. Note that the resulting predicate context for the
judgment is only $P_2$, the constraints required by $e_2$,
since the constraints required by $e_1$ are carried by the
resulting type scheme.
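Alternatively, we could split the predicates of $P_1$ (or its simplification $P_1'$)
into those relevant to the generalized type variables $\overline{\alpha}$, which we
would package up in the type scheme, and those irrelevant to
$\overline{\alpha}$, which we would propagate upward.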
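Put together, the let rule described in this section might be written as the following inference rule. The notation is an assumption: $P \mid \Gamma \vdash e : \tau$ for the typing judgment, $P_1' \Vdash P_1$ for entailment, and $\mathrm{ftv}$ for free type variables.

```latex
\frac{P_1 \mid \Gamma \vdash e_1 : \tau_1
      \qquad P_1' \Vdash P_1
      \qquad \overline{\alpha} = \mathrm{ftv}(P_1', \tau_1) \setminus \mathrm{ftv}(\Gamma)
      \qquad P_2 \mid \Gamma,\, x : \forall \overline{\alpha}.\, P_1' \Rightarrow \tau_1 \vdash e_2 : \tau_2}
     {P_2 \mid \Gamma \vdash \mathsf{let}\ x = e_1\ \mathsf{in}\ e_2 : \tau_2}
```

Only $P_2$ appears in the conclusion, because $P_1'$ has been moved into the type scheme bound to $x$.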
Exercise 53. Use Haskell’s type classes to implement bijections
between the natural numbers and lists.
To get started, install ghc (and be sure that
QuickCheck is installed, perhaps by issuing the command
cabal install quickcheck).
Put your code in XEnum.hs and
use ghc -o XEnum XEnum.hs && ./XEnum to run your code.
Because Haskell is
whitespace-sensitive, copying code from webpages is fraught;
accordingly the declarations in the code below are all in
XEnum.hs.
Here are some declarations to get started, along
with an explanation of them.
{-# LANGUAGE ScopedTypeVariables #-}
import Test.QuickCheck
import Numeric.Natural

class XEnum a where
  into :: a -> Natural
  outof :: Natural -> a

instance XEnum Natural where
  into n = n
  outof n = n
The class declaration introduces a new predicate XEnum that supports two
operations, into and outof. These are two functions
that realize a bijection between the type a and the
natural numbers.
The instance declaration says that the type Natural supports
enumeration by giving the functions that translate from the naturals
to the naturals (i.e., the identity function).
For our first substantial instance, fill in the into and
outof functions to define a bijection between the natural
numbers and the integers:
instance XEnum Integer where
  into x = error "not implemented"
  outof n = error "not implemented"
There is more than one way to do this, but it is also easy to make
arithmetic errors when doing it, so we can use QuickCheck to help find those
errors. Add this declaration to the end of your program:
prop_inout :: (Eq a, XEnum a) => a -> Bool
prop_inout x = outof (into x) == x
main = quickCheck (prop_inout :: Integer -> Bool)
If you do not see output like +++ OK, passed 100 tests.,
then you have a bug in your bijections.
Once you have finished that, add support for (disjoint)
unions. To do that, we assume we have two enumerable
types and then add a bijection using
the Either type:
instance (XEnum a , XEnum b) => XEnum (Either a b) where
  into (Left x) = error "not implemented"
  into (Right x) = error "not implemented"
  outof n = error "not implemented"
The idea of this bijection is to use the odd numbers for either
Left or Right values, and use the even numbers for
the other. So we can embed two enumerable values into one.
Also test this one with QuickCheck,
using prop_inout :: (Either Integer Natural) -> Bool.
Next up, pairs.
instance (XEnum a , XEnum b) => XEnum (a , b) where
  into (a , b) = error "not implemented"
  outof n = error "not implemented"
Once you have that all working, define enumerations for lists.
instance XEnum a => XEnum [a] where
  into l = error "not implemented"
  outof n = error "not implemented"
Be aware that the formulas and the bijections you’ve built work
only for infinite sets (i.e., the naturals, the integers, pairs of them, etc.).
If you want to use these bijections on sets that are finite, you need
to add a size operation:
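data ENatural = Fin Natural | Inf

class XEnum a where
  into :: a -> Natural
  outof :: Natural -> a
  -- size does not mention a, so GHC needs AllowAmbiguousTypes to accept it
  -- (and TypeApplications to call it, e.g. size @Integer)
  size :: ENatural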
The size of an
Either is the sum of the sizes and the size
of a pair enumeration is the product of the sizes. Also note
that the corresponding formulas will need adjustment to handle the
case where one of the sides is finite. If you get stuck trying to figure
out the formulas, look in this paper:
https://www.eecs.northwestern.edu/~robby/pubs/papers/jfp2017-nfmf.pdf
7.4 Type inference algorithm
The above type system provides a satisfactory account of which terms
type and which do not, but it does not give us an algorithm that we can
actually run. In this section, we extend ML’s Algorithm W for qualified
types.
First, we give a helper metafunction for instantiating a type scheme
with fresh type variables:
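A sketch of such an instantiation helper in Haskell is given below. It threads an integer counter to generate fresh names; the names t0, t1, ... are assumed not to clash with any other type variable, and all identifiers are hypothetical.

```haskell
data Type = TVar String | TInt | TProd Type Type | TFun Type Type
  deriving (Show, Eq)
data Class  = ClsEq | ClsOrd deriving (Show, Eq)
data Pred   = Pred Class Type deriving (Show, Eq)
-- forall vars. preds => type
data Scheme = Forall [String] [Pred] Type deriving (Show, Eq)

-- Replace type variables according to an association list.
subst :: [(String, Type)] -> Type -> Type
subst s t@(TVar a)    = maybe t id (lookup a s)
subst _ TInt          = TInt
subst s (TProd t1 t2) = TProd (subst s t1) (subst s t2)
subst s (TFun t1 t2)  = TFun (subst s t1) (subst s t2)

-- instantiate n sigma: rename the bound variables of sigma to fresh
-- variables numbered from n. Returns the next unused counter, the
-- instantiated predicate context, and the instantiated monotype.
instantiate :: Int -> Scheme -> (Int, [Pred], Type)
instantiate n (Forall vars ps t) = (n + length vars, ps', subst s t)
  where
    s   = zip vars [TVar ("t" ++ show i) | i <- [n ..]]
    ps' = [Pred c (subst s u) | Pred c u <- ps]
```

The same substitution is applied to the predicates and to the type, so a constraint such as Eq a stays attached to the fresh variable that replaces a.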
Again we use unification. Because unification is applied to monotypes,
it is the same as in ML (except now we have to handle product types as
well):
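For concreteness, here is a minimal Haskell sketch of first-order unification over these monotypes, with product types handled the same way as function types. Substitutions are association lists, and every name here is hypothetical.

```haskell
data Type = TVar String | TInt | TProd Type Type | TFun Type Type
  deriving (Show, Eq)

type Subst = [(String, Type)]   -- maps type variables to types

apply :: Subst -> Type -> Type
apply s t@(TVar a)    = maybe t id (lookup a s)
apply _ TInt          = TInt
apply s (TProd t1 t2) = TProd (apply s t1) (apply s t2)
apply s (TFun t1 t2)  = TFun (apply s t1) (apply s t2)

-- compose s2 s1 behaves like applying s1 first and then s2.
compose :: Subst -> Subst -> Subst
compose s2 s1 = [(a, apply s2 t) | (a, t) <- s1] ++ s2

occurs :: String -> Type -> Bool
occurs a (TVar b)      = a == b
occurs _ TInt          = False
occurs a (TProd t1 t2) = occurs a t1 || occurs a t2
occurs a (TFun t1 t2)  = occurs a t1 || occurs a t2

-- Bind a variable to a type, failing the occurs check if needed.
bind :: String -> Type -> Maybe Subst
bind a t
  | t == TVar a = Just []
  | occurs a t  = Nothing
  | otherwise   = Just [(a, t)]

unify :: Type -> Type -> Maybe Subst
unify (TVar a) t                  = bind a t
unify t (TVar a)                  = bind a t
unify TInt TInt                   = Just []
unify (TProd a1 a2) (TProd b1 b2) = unifyTwo (a1, a2) (b1, b2)
unify (TFun a1 a2) (TFun b1 b2)   = unifyTwo (a1, a2) (b1, b2)
unify _ _                         = Nothing

-- Unify two pairs of components, left to right.
unifyTwo :: (Type, Type) -> (Type, Type) -> Maybe Subst
unifyTwo (a1, a2) (b1, b2) = do
  s1 <- unify a1 b1
  s2 <- unify (apply s1 a2) (apply s1 b2)
  Just (compose s2 s1)
```

Predicates play no role here: unification only ever sees monotypes, which is why it carries over from ML unchanged apart from the extra product case.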
Algorithm W for qualified types takes a type environment and a term, and
returns a substitution, a type, and a predicate context:
$W(\Gamma, e) = (S, \tau, P)$.
To infer the type of a variable or constant, we look up its type scheme
(in the environment or the metafunction for constants, respectively)
and instantiate it with fresh type variables, yielding a qualified type
$P \Rightarrow \tau$. The $\tau$ is the type of the variable or constant,
and $P$ is the predicate context that must be satisfied:
Lambda abstraction, application, pairing, and the conditional are as
before, merely propagating and combining predicate contexts:
Note how substitutions must be applied to predicate contexts, just as we
apply them to type environments and types.
Finally, the let rule follows the let rule from the previous section,
packaging up the predicate context generated for $e_1$ in the type
scheme assigned to $x$. We (optionally) assume a metafunction
that simplifies the predicate context before
constructing the type scheme.
7.5 Evidence translation
Exercise 54. What is the most general type scheme of the term
?
How would you implement such a function—in particular, how does it
figure out the equality for a generic/unknown type parameter? Well, our
operational semantics cheated by relying on Racket’s underlying
polymorphic equal? function. Racket’s equal? relies on
Racket’s object representations, which include tags that distinguish
numbers from Booleans from pairs, etc. But what about in a typed language
that does not use tags and thus cannot support polymorphic equality?
One solution is called evidence passing, wherein using a qualified type
requires passing evidence that it is inhabited, where this evidence
specifies some information about how to perform the associated
operations. In our type classes example, the evidence is the equality or
less-than function specialized to the required type. (In a real
evidence-passing implementation such as how Haskell is traditionally
implemented, the evidence is a dictionary of methods.)
We can translate implicitly-typed λ-qual programs like the above into
programs that pass evidence explicitly. We do this by typing them in an
evidence environment, which names the evidence for each predicate:
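We can use the evidence environment to summon or construct evidence if
it’s available. In particular, the judgment $\Delta \vdash e : \pi$
uses evidence environment $\Delta$ to construct $e$, which is
evidence of predicate $\pi$. In particular, if $\pi$ is $\mathsf{Eq}\ \tau$
then $e$ should be an equality function of type
$\tau \times \tau \to \mathsf{Int}$; if $\pi$ is $\mathsf{Ord}\ \tau$ then
$e$ should be a less-than function of type $\tau \times \tau \to \mathsf{Int}$.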
For base type $\mathsf{Int}$, the evidence is just a primitive function
performing the correct operation:
For a product type, we summon evidence for each component type, and then
construct the equality function for the product.
Other types are looked up in the evidence environment:
Note that if different evidence appears for the same repeated
predicate, then the behavior can be incoherent.
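The idea of summoning and passing evidence can be illustrated directly in Haskell by writing the evidence as ordinary function arguments instead of class constraints. This is a sketch under assumptions: evidence is modeled as a curried function returning Bool (rather than an uncurried function on pairs), and the names EqEv, eqInt, eqPair, and allSame are hypothetical.

```haskell
-- Evidence for "Eq t": an equality function specialized to t.
type EqEv a = a -> a -> Bool

-- Base type: the evidence is a primitive operation.
eqInt :: EqEv Integer
eqInt = (==)

-- Product type: build evidence from the evidence for each component.
eqPair :: EqEv a -> EqEv b -> EqEv (a, b)
eqPair eqA eqB (a1, b1) (a2, b2) = eqA a1 a2 && eqB b1 b2

-- With a class constraint this would be: Eq a => a -> a -> a -> Bool.
-- After translation the constraint becomes an explicit argument.
allSame :: EqEv a -> a -> a -> a -> Bool
allSame eq x y z = eq x y && eq y z

-- A use at type (Integer, Integer) constructs the evidence structurally.
example :: Bool
example = allSame (eqPair eqInt eqInt) (1, 2) (1, 2) (1, 2)
```

Because the evidence is an ordinary value, nothing stops a caller from passing some other function for the same type, which is the source of the incoherence mentioned above.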
The evidence translation uses two more auxiliary judgments. The first is
for applying a term that expects evidence to its expected evidence:
The second abstracts over the evidence expected by a term based on its
context:
Four rules of the typing judgment are unremarkable, simply passing the
evidence environment through and translating homomorphically:
The rules for variables and constants take a polymorphic value and apply
it to the required evidence for any predicates contained in its
qualified type, using the evidence application judgment:
The let form, as above, generalizes, by abstracting the right-hand side
over evidence corresponding to its inferred evidence context:
Exercise 55. Rust uses monomorphization to implement generics and traits.
It does this by duplicating polymorphic code, specializing it at each
required type. Write a relation that formalizes monomorphization for
λ-qual.