Gradual types and annotations

April 25, 2017

This week's post will be about adding gradual types and type annotatinos to the system.

Last week, we took advantage of the fact that Jeremy Siek, the inventor of gradual typing was in Paris to work on adding gradual types to the system.

The idea is to use this to type the parts that can't be typed using a traditional type-system.This has also been the occasion to explicit where type annotations would be needed.

(note that this is just a very quick introduction. If you want something more detailed, I enjoin you to read the original blogpost about it).

The jungle of programming languages is roughly split into two universes: the typed world and the untyped one. The goal of gradual typing is to bridge the gap in between. For this purpose, we add a new type (the "gradual type", often written `?`

), which is the type of expressions that we do not want (or can't) type.

The type `?`

can replace every type (for example, if a function expects an argument of type `Int`

, I can pass in a value of type `?`

instead), and every type can replace the type `?`

(so if a function expects an argument of type `?`

, I can pass an `Int`

or a `Bool`

or a function `String → (? → Bool)`

instead).

This tool is used in various languages such as Hack, Flow or Typescript.

So how can this be useful for nix?

First, let's see where we need to help the (classical) type checker in order to see where and why those types can help.

While a type-checker can be really smart, ~~the mind of a programmer is often way too weird for him~~ it may only infer a limited amount of stuff, and often needs some hints to understand what is going on.

In the case of nix, there are two places where you probably will need to annotate stuff: lambdas and (recursive) let-bindings. Those are the places where it would otherwise have to guess the type of variables,
^{2}
which is out of his reach.
^{3}

In order to type a function `λx.e`

, the type-checker will have to type the body `e`

without knowing anything about `x`

, and by the way deduce the possible types for `x`

. This is possible in ML thanks to unification, but unfortunately wouldn't work in our case.

As a consequence, the programmer will have to explicitly annotate the type of the argument, so that the type-checker can do its job. In fact, the programmer may even write several types for the argument, because function types aren't only arrows `τ → σ`

, but also intersections of arrows ⋀_{i}(*τ*_{i} → *σ*_{i}). So for each type *τ*_{i} of the argument, the type-checker will run to deduce a return type *σ*_{i}. Functions will then have a form like `λx:(τ1;...;τn).e`

Recursive declarations are similar to lambdas in the sense that we use a variable before it is fully defined − hence before it is typed. So for the very same reason, we must annotate the variables that we define recursively.

Those annotations seem like a reasonable trade-off, but it breaks compatibility with existing untyped code. The obvious solution is to add a default type annotation if none is given by the programmer, which must be the gradual type if we want the body to type and still stay sound.

So an unannotated function `λx.e`

is just syntactic sugar for `λx:?.e`

.

That's nice, but what do we do with it?

The essence of the gradual type is that a value of type `?`

can be used in place of anything, and any value can be used where the type `?`

is expected. So in this case, the function will work exactly as it would in the untyped world: the argument can be used as a value of any type inside the body, and any value can be passed as argument.

(The concept of gradual typing also includes adding explicit casts at the boundaries between the typed and untyped worlds so that runtime type errors appear reliably at the right place, but as we do not touch the nix interpreter itself, this is out of scope for this project).

The first use of gradual typing is a transitional one: we plan to use it to type legacy code, waiting for it to be rewritten in a typed way.

Another use, which is also (I hope) a transitional one is to mitigate the weaknesses of the type system (waiting for it to be improved of course). In particular, there is currently no polymorphism for now (because we don't know how to make polymorphism work with records and − ironically − gradual typing). This is rather annoying, mainly because some basic functions (head, tail, fix, …) **are** polymorphic (and typing them otherwise wouldn't make sense). There are people working on this, and hopefully a solution will land soon, but in the meantime, we have to deal with the absence of polymorphism.

For some really basic and important functions, we may hardcode some typing rules to implement in an ad-hoc way the missing polymorphism. But as we can't do this for every function, an acceptable solution is to type the other one gradually (so for example the `identity`

function could be typed by `? → ?`

or the `head`

function by `[? ?*] → ?`

. This would of course be rather imprecise in some cases, but nonetheless allows us to give a useful type to those polymorphic functions.

Let's now see some real nix code. This is an example from the first post of this blog

If we leave this as such, the type-checker will understand it as:

```
(x::?):
if isInteger x then
toString x
else if isBool x then
x
else
raise "Invalid argument"
```

In this case, it will try to type the body under the assumption that `x`

is of type `?`

, see that in this case the return type may be `String`

or `Bool`

, and thus type the whole function as `? → (String ∨ Bool)`

.

Now, suppose the programmer adds a type annotation to the argument:

```
(x::Int ∨ Bool):
if isInteger x then
toString x
else if isBool x then
x
else
raise "Invalid argument"
```

The type-checker will now type the body under the assumption that `x`

is of type `Int ∨ Bool`

. Like in previous case, he will find a type `String ∨ Bool`

, and deduce a type `(Int ∨ Bool) → (String ∨ Bool)`

for the body.

That's cool, but it could do better, this is not the most precise type for the function, we would like the type-system to infer the type `(Int → String) ∧ (Bool → Bool)`

instead.

Remember somewhere earlier, I told that you sometimes would need to annotate several types for the argument. This has been intended exactly for those situations. If we just change the type annotation to this:

```
(x::Int; Bool):
if isInteger x then
toString x
else if isBool x then
x
else
raise "Invalid argument"
```

the type-checker will try to infer the type of the body twice: a first with the assumption `x::Int`

(in which case he infers a return type `String`

), and a second time with the assumption `x::Bool`

(in which case he infers a return type `Bool`

). Then, he will take the intersection of those two results to deduce the final type of the function, which will here be `(Int → String) ∧ (Bool → Bool)`

, as expected.

All those reasonings also apply to recursive let-bindings (although the need of annotating several types in a let-binding is rather unlikely to happen).

For simplicity, I eluded here everything related to patterns by assuming that the functions were in the form `x:e`

. When the argument is matched against a pattern, it is of course possible to use this to get more informations about the domain of the function, even without annotation.

For simplicity again, I assumed that the language was extended with type annotations, which obviously are not present in nix. So to keep compatibility, those will probably have to be written in a way that makes the nix parser happy, most probably inside comments (or we can try to add them to the official parser, which would be even better).

Unlike nix, nix-light has two separate `let`

and `let rec`

constructs. This is nice because the non-recursive version is trivial to type, and doesn't require any type annotation. During the compilation from nix to nix-light, it is possible to translate let-bindings who do not use recursion into non-recursive one, so that some let-bindings in nix won't need to be annotated.