April 25, 2017

This week's post will be about adding gradual types and type annotatinos to the system.

Last week, we took advantage of the fact that Jeremy Siek, the inventor of gradual typing was in Paris to work on adding gradual types to the system.

The idea is to use this to type the parts that can't be typed using a traditional type-system. 1

This has also been the occasion to explicit where type annotations would be needed.

About gradual types

(note that this is just a very quick introduction. If you want something more detailed, I enjoin you to read the original blogpost about it).

The jungle of programming languages is roughly split into two universes: the typed world and the untyped one. The goal of gradual typing is to bridge the gap in between. For this purpose, we add a new type (the "gradual type", often written ?), which is the type of expressions that we do not want (or can't) type.

The type ? can replace every type (for example, if a function expects an argument of type Int, I can pass in a value of type ? instead), and every type can replace the type ? (so if a function expects an argument of type ?, I can pass an Int or a Bool or a function String → (? → Bool) instead).

This tool is used in various languages such as Hack, Flow or Typescript.

Back to nix

So how can this be useful for nix?

First, let's see where we need to help the (classical) type checker in order to see where and why those types can help.

Type annotations

While a type-checker can be really smart, the mind of a programmer is often way too weird for him it may only infer a limited amount of stuff, and often needs some hints to understand what is going on.

In the case of nix, there are two places where you probably will need to annotate stuff: lambdas and (recursive) let-bindings. Those are the places where it would otherwise have to guess the type of variables, 2 which is out of his reach. 3

Lambdas

In order to type a function λx.e, the type-checker will have to type the body e without knowing anything about x, and by the way deduce the possible types for x. This is possible in ML thanks to unification, but unfortunately wouldn't work in our case.

As a consequence, the programmer will have to explicitly annotate the type of the argument, so that the type-checker can do its job. In fact, the programmer may even write several types for the argument, because function types aren't only arrows τ → σ, but also intersections of arrows i(τi → σi). So for each type τi of the argument, the type-checker will run to deduce a return type σi. Functions will then have a form like λx:(τ1;...;τn).e

Recursive let-bindings

Recursive declarations are similar to lambdas in the sense that we use a variable before it is fully defined − hence before it is typed. So for the very same reason, we must annotate the variables that we define recursively.

Including gradual types into this

Those annotations seem like a reasonable trade-off, but it breaks compatibility with existing untyped code. The obvious solution is to add a default type annotation if none is given by the programmer, which must be the gradual type if we want the body to type and still stay sound.

So an unannotated function λx.e is just syntactic sugar for λx:?.e.

That's nice, but what do we do with it?

The essence of the gradual type is that a value of type ? can be used in place of anything, and any value can be used where the type ? is expected. So in this case, the function will work exactly as it would in the untyped world: the argument can be used as a value of any type inside the body, and any value can be passed as argument.

(The concept of gradual typing also includes adding explicit casts at the boundaries between the typed and untyped worlds so that runtime type errors appear reliably at the right place, but as we do not touch the nix interpreter itself, this is out of scope for this project).

Working around the lack of polymorphism

The first use of gradual typing is a transitional one: we plan to use it to type legacy code, waiting for it to be rewritten in a typed way.

Another use, which is also (I hope) a transitional one is to mitigate the weaknesses of the type system (waiting for it to be improved of course). In particular, there is currently no polymorphism for now (because we don't know how to make polymorphism work with records and − ironically − gradual typing). This is rather annoying, mainly because some basic functions (head, tail, fix, …) are polymorphic (and typing them otherwise wouldn't make sense). There are people working on this, and hopefully a solution will land soon, but in the meantime, we have to deal with the absence of polymorphism.

For some really basic and important functions, we may hardcode some typing rules to implement in an ad-hoc way the missing polymorphism. But as we can't do this for every function, an acceptable solution is to type the other one gradually (so for example the identity function could be typed by ? → ? or the head function by [? ?*] → ?. This would of course be rather imprecise in some cases, but nonetheless allows us to give a useful type to those polymorphic functions.

Some examples

Let's now see some real nix code. This is an example from the first post of this blog

Gradual typing of a function

x:
if isInteger x then
  toString x
else if isBool x then
  x
else
  raise "Invalid argument"

If we leave this as such, the type-checker will understand it as:

(x::?):
if isInteger x then
  toString x
else if isBool x then
  x
else
  raise "Invalid argument"

In this case, it will try to type the body under the assumption that x is of type ?, see that in this case the return type may be String or Bool, and thus type the whole function as ? → (String ∨ Bool).

With a type annotation

Now, suppose the programmer adds a type annotation to the argument:

(x::Int ∨ Bool):
if isInteger x then
  toString x
else if isBool x then
  x
else
  raise "Invalid argument"

The type-checker will now type the body under the assumption that x is of type Int ∨ Bool. Like in previous case, he will find a type String ∨ Bool, and deduce a type (Int ∨ Bool) → (String ∨ Bool) for the body.

That's cool, but it could do better, this is not the most precise type for the function, we would like the type-system to infer the type (Int → String) ∧ (Bool → Bool) instead.

With a types annotation

Remember somewhere earlier, I told that you sometimes would need to annotate several types for the argument. This has been intended exactly for those situations. If we just change the type annotation to this:

(x::Int; Bool):
if isInteger x then
  toString x
else if isBool x then
  x
else
  raise "Invalid argument"

the type-checker will try to infer the type of the body twice: a first with the assumption x::Int (in which case he infers a return type String), and a second time with the assumption x::Bool (in which case he infers a return type Bool). Then, he will take the intersection of those two results to deduce the final type of the function, which will here be (Int → String) ∧ (Bool → Bool), as expected.

All those reasonings also apply to recursive let-bindings (although the need of annotating several types in a let-binding is rather unlikely to happen).

Further thoughts

Patterns

For simplicity, I eluded here everything related to patterns by assuming that the functions were in the form x:e. When the argument is matched against a pattern, it is of course possible to use this to get more informations about the domain of the function, even without annotation.

Concrete sytax for type annotations

For simplicity again, I assumed that the language was extended with type annotations, which obviously are not present in nix. So to keep compatibility, those will probably have to be written in a way that makes the nix parser happy, most probably inside comments (or we can try to add them to the official parser, which would be even better).

Non-recursive let-bindings

Unlike nix, nix-light has two separate let and let rec constructs. This is nice because the non-recursive version is trivial to type, and doesn't require any type annotation. During the compilation from nix to nix-light, it is possible to translate let-bindings who do not use recursion into non-recursive one, so that some let-bindings in nix won't need to be annotated.


1 Waiting for them to be rewritten in a typeable way of course
2 In some type-systems such as the Hindley-Milner type system used in the ML languages, there is no need for such annotations because the type-system is simple enough to allow working around that impossible guess. Unfortunately, those workarounds do not scale with more complicated type systems − in particular with intersection types.
3 This doesn't mean that the type-checker won't be able to say anything about those, just that most of the time, the types it would infer would be too imprecise.

Commentaires