Friday, February 5, 2010

restraining polymorphic variants

Since I started playing with polymorphic variants, I've repeatedly stumbled into a strange problem with them. They're too expressive!

Polymorphic variants are the flip side of polymorphic records. This is a major feature of my type system that allows you to use any label anywhere, in any way, without having to declare it first. I really like this ability, especially in the context of Irken, because it has a very Pythonic, dynamic-language feel to it.

As an example, here's some code that creates a lisp-like list:

(:cons 1 (:cons 2 (:cons 3 (:nil)))

That's it. Symbols starting with colons are constructors. No datatype declaration. Completely type safe. You just use constructors however you like. OCaml has the same feature:

# `Cons (1, `Cons (2, `Cons 3, `Nil));;
- : [> `Cons of int * [> `Cons of int * [> `Cons of int ] * [> `Nil ] ] ] =
`Cons (1, `Cons (2, `Cons 3, `Nil))

But if you look at the type that OCaml assigns to the result you see the seed of the problem. Watch this:

# `Cons (1, `Cons ("hello", `Nil));;
- : [> `Cons of int * [> `Cons of string * [> `Nil ] ] ] =
`Cons (1, `Cons ("hello", `Nil))

See? You could call that the datatype of lists that start with an int followed by a string.

With a complex data structure like a red-black tree, you get types like this:

rsum(rlabel('purple', pre(product(rsum(rlabel('empty', pre(kb), rlabel('purple',
pre(jw), rlabel('red', pre(gz), rdefault(abs()))))), rsum(rlabel('empty', pre(nq),
rlabel('purple', pre(nl), rlabel('red', pre(ko), rdefault(abs()))))), jc, jd)),
rlabel('empty', pre(product()), rlabel('red', pre(product(rsum(rlabel('purple',
pre(product(hj, hk, hl, hm)), bgc)), rsum(rlabel('purple', pre(product(hn, ho, hp,
hq)), bgp)), hh, hi)), rdefault(abs())))))

What the hell is that?? Well, it's a type built during the compilation of one of the tree-balancing functions; a reflection of the matching structure used to examine the tree a few levels deep. Think of it as a template that sits on top of a red-black tree.

Here, the compiler is building incredibly detailed, complex types and checking them for me. What I would prefer is for it to automatically recognize recursive datatypes when I use them - without have to declare them. Or, I would like the compiler to keep variants monomorphic within a single type.

This would involve a loss of some expressive power. But how often would you want to use variants polymorphically within the same object? And if you needed that, why wouldn't you use a different label?

1 comment:

  1. Jacques Garrigue gave me a cool example:

    [`Nil | `Cons of 'a * [`Cons of 'a * 'b]] as 'b

    ... for lists of even length. [note that it's still a subtype of normal lists].