Friday, February 12, 2010

Temporarily Giving up on Polymorphic Variants

Polymorphic Variants finally reached the point where they were causing more problems than they were worth. So I've temporarily put them aside... leaving support in the compiler, but I'm adding normal datatype/sum/union declarations back in.

  1. The types are too expressive (see previous post)
  2. Since the types are so complex, the solver gets hit with constraints leading to tens of thousands of type variables. This is mostly triggered by large initialization expressions like parser tables. Waiting 20 seconds to type a 40K file is not cool.
  3. I need to get back to work, i.e. I need to make some progress.

So after plugging away for a week or so, I've now run into a new problem - the value restriction. Something I've put off dealing with for a long time. Turns out the problem is classically triggered by type constructors for things like empty lists. For example, this types without any problems:

(datatype list
(nil)
(cons 'a list)
)

(let ((l0 (list:nil)))
(set! l0 (list:cons 34 l0))
(list:cons #f l0)
)

[Note that the last line glues together a "list(bool)" onto a "list(int)"]

The solution is to restrict the polymorphism of assigned variables, hence the Value Restriction.

-Sam

Friday, February 5, 2010

restraining polymorphic variants

Since I started playing with polymorphic variants, I've repeatedly stumbled into a strange problem with them. They're too expressive!

Polymorphic variants are the flip side of polymorphic records. This is a major feature of my type system that allows you to use any label anywhere, in any way, without having to declare it first. I really like this ability, especially in the context of Irken, because it has a very Pythonic, dynamic-language feel to it.

As an example, here's some code that creates a lisp-like list:


(:cons 1 (:cons 2 (:cons 3 (:nil)))


That's it. Symbols starting with colons are constructors. No datatype declaration. Completely type safe. You just use constructors however you like. OCaml has the same feature:


# `Cons (1, `Cons (2, `Cons 3, `Nil));;
- : [> `Cons of int * [> `Cons of int * [> `Cons of int ] * [> `Nil ] ] ] =
`Cons (1, `Cons (2, `Cons 3, `Nil))


But if you look at the type that OCaml assigns to the result you see the seed of the problem. Watch this:


# `Cons (1, `Cons ("hello", `Nil));;
- : [> `Cons of int * [> `Cons of string * [> `Nil ] ] ] =
`Cons (1, `Cons ("hello", `Nil))


See? You could call that the datatype of lists that start with an int followed by a string.

With a complex data structure like a red-black tree, you get types like this:


rsum(rlabel('purple', pre(product(rsum(rlabel('empty', pre(kb), rlabel('purple',
pre(jw), rlabel('red', pre(gz), rdefault(abs()))))), rsum(rlabel('empty', pre(nq),
rlabel('purple', pre(nl), rlabel('red', pre(ko), rdefault(abs()))))), jc, jd)),
rlabel('empty', pre(product()), rlabel('red', pre(product(rsum(rlabel('purple',
pre(product(hj, hk, hl, hm)), bgc)), rsum(rlabel('purple', pre(product(hn, ho, hp,
hq)), bgp)), hh, hi)), rdefault(abs())))))


What the hell is that?? Well, it's a type built during the compilation of one of the tree-balancing functions; a reflection of the matching structure used to examine the tree a few levels deep. Think of it as a template that sits on top of a red-black tree.

Here, the compiler is building incredibly detailed, complex types and checking them for me. What I would prefer is for it to automatically recognize recursive datatypes when I use them - without have to declare them. Or, I would like the compiler to keep variants monomorphic within a single type.

This would involve a loss of some expressive power. But how often would you want to use variants polymorphically within the same object? And if you needed that, why wouldn't you use a different label?