Monday, May 23, 2011

git, github, clang

I've imported my svn history into git, and published the whole thing on github:

Still trying to grok git and github, there'll probably be a stall for a week or so.

In the meanwhile, I have interesting news:  I was able to take out all the 'nested functions' which relied on a gcc extension.  I resisted this for a long time because early testing had shown that declaring the 'registers' as local register variables in the main function was a huge win.

After a quick test the other day, I found that moving everything out to global variables has no visible impact on performance.  I don't know if this is because of improvements in gcc & llvm, or if I just wasn't trying a large enough program.

That means I'm now able to compile the output with clang, rather than relying on llvm+dragonegg, which can be much more annoying to build, install, and use.

The good news continues: although the performance of the generated code is identical (maybe a tiny bit faster), the compilation time has been cut quite significantly.  On my machine an LLVM -O2 compile dropped from 18m to 8m.  The -O0 compile time is even sweeter, about 3s, which is actually faster than Irken itself, which hovers right around 5s to self-compile.  So a roughly 10s edit-compile cycle will definitely speed the development of Irken up.

Now the bad news.  The no-nested-funs version causes 'as' on snow leopard to take a lunch break.  It actually nails the CPU for 10 or 15 minutes before finally emerging as if nothing was wrong.  No problems on FreeBSD 7.3 with gcc-4.5.3, though.  This could be a significant annoyance, since I expect many people trying out Irken would prefer to do so with a stock compiler.

More bad news:  clang-2.9 crashes when fed Irken's self-compiled output.  I was going to file a bug against it, but when I checked out the svn version and built it the problem was gone.  Sigh.

This is why I'm trying to figure out how to make branches on github, since it'd be nice to maintain both versions for a while until these issues are ironed out.  I have made a local branch called 'no-nested-funs', I just haven't figured out how to publish it on github.

Sunday, May 15, 2011

nested datatypes, 2-3 trees, numerical encoding

This paper by Hinze & Paterson discusses the use of 'nested datatypes' to implement 2-3 trees, but in a way that uses the type system to automatically handle the invariants.  A blog post from Mark Chu-Carroll is a bit easier to follow.  Here's his example of a sequence encoded as a 2-3 tree (in Haskell):

data Tree t = Zero t | Succ (Tree (Node t))
data Node s = Node2 s s | Node3 s s s

Each level of the tree contains sub-trees that are 'one less' via the numeric encoding scheme.

This sent me back to Okasaki's book, where I'm finally starting to grok those last two chapters.  This bizarre mingling of numerical encoding and data structures lights a fire in the parts of my brain dedicated to data structures.  If feel as if I could automatically see the relationship between red-black trees, 2-3 trees, and binary number representations.... if only I could double my IQ.

I have this cool puzzle called the 'Spin-Out', which is actually a physical demonstration of the properties of Gray Codes.  It seems to me that Gray Codes and balanced trees should be a natural fit.

Unfortunately, I can't experiment with any of these ideas in Irken (or OCaml, or SML) because the nested datatypes require polymorphic recursion.

Type inference with polymorphic recursion is undecidable.  Haskell allows it only if the user provides a manual type signature.  I'm guessing this is impossible (i.e., unsafe) with SML/OCaml because of the presence of assignment?  Could such a feature safely be used with functions that the compiler can verify are pure?

Wednesday, April 27, 2011

DOOM DOOM DOOM (a coro/thread kqueue-based scheduler)

I've created a new sub-project, code-named 'doom', for the coroutine-based cooperative threading system.  It uses kqueue(), of course.  Because it's better.  It wouldn't be too hard for someone from linux-land to make a version based on epoll.

In the doom directory, you'll find a simple http demo, and the obligatory echo server.

Exception handling certainly makes the code look a lot nicer.

I'm the best at space.

Friday, April 22, 2011

exception handling, take two

A bit tricky, but I found a way to implement try/except almost completely with defmacro.  I needed to add some extra typechecking around raise and try to ensure that all exception types are monomorphic.  I achieved this by having the compiler keep a global map of exception-name => tvar.   Seems to work...

While working on this I learned that the Scheme-style call/cc is not really type safe, and I need to use the SML version.  The main difference is the protocol - SML's callcc requires the use of a throw procedure since the continuation needs to carry type information:

(define (callcc p) : (((continuation 'a) -> 'a) -> 'a)
  (p (getcc)))

(define (throw k v)
  (putcc k v))

;; -*- Mode: Irken -*-

(include "lib/core.scm")
(include "lib/random.scm")

;; We use polymorphic variants for exceptions.
;; Since we're a whole-program compiler there's no need to declare
;; them - though I might could be convinced it's still a good idea.
;; Exception data must be monomorphic to preserve type safety.
;; <try> and <raise> are implemented as macros, with one extra hitch:
;;  two special compiler primitives are used to check that exception
;;  types are consistent: %exn-raise and %exn-handle

(define (base-exception-handler exn) : ((rsum 'a) -> 'b)
  (error1 "uncaught exception" exn))

(define *the-exception-handler* base-exception-handler)

(defmacro raise
  (raise exn) -> (*the-exception-handler* (%exn-raise #f exn))

(defmacro try
  ;; done accumulating body parts, finish up.
  (try (begin body0 ...) <except> exn-match ...)
  -> (callcc
      (lambda ($exn-k0)
        (let (($old-hand *the-exception-handler*))
           (lambda ($exn-val)
             (set! *the-exception-handler* $old-hand)
             (throw $exn-k0
              (%exn-handle #f $exn-val
               (match $exn-val with
                 exn-match ...
                 _ -> (raise $exn-val))))))
          (let (($result (begin body0 ...)))
            (set! *the-exception-handler* $old-hand)
  ;; accumulating body parts...
  (try (begin body0 ...) body1 body2 ...) -> (try (begin body0 ... body1) body2 ...)
  ;; begin to accumulate...
  (try body0 body1 ...)                   -> (try (begin body0) body1 ...)

(define (level0 x)
   (level1 x)
   (:Level0 x) -> (:pair "level 0" x)

(define (level1 x)
   (level2 x)
   (:Level1 x) -> (:pair "level 1" x)

(define (level2 x)
   (level3 x)
   (:Level2 x) -> (:pair "level 2" x)

(define (level3 x)
   (match x with
     0 -> (raise (:Level0 x))
     1 -> (raise (:Level1 x))
     2 -> (raise (:Level2 x))
     3 -> (raise (:Level3 x))
     4 -> (:pair "no exception!" 99)
     _ -> (raise (:Bottom x))
   (:Level3 x) -> (:pair "level 3" x)

(printn (level0 0))
(printn (level0 1))
(printn (level0 2))
(printn (level0 3))
(printn (level0 4))
(printn (level0 5))

Wednesday, April 6, 2011

exception handling with defmacro

I've started work on a coroutine/thread scheduler, code named "Doom".  Doing some socket I/O has finally  snapped my 'missing feature' threshold - I really need exception handling to do it right.

Here's my first attempt. The basic syntax is (try <body> except <exception-patterns>).

If this works, it'll be a great example of the huge advantages of the lisp syntax and macros - no changes made to the compiler to support it!

;; -*- Mode: Irken -*-

(include "lib/core.scm")
(include "lib/random.scm")

;; can we implement an exception system using the macro system?
;; XXX eventually we'll need to change call/cc to swap out exception
;;  handlers, which probably means making a 'modern' call/cc with
;;  dynamic-wind, etc.

(define (base-exception-handler exn)
  (error1 "uncaught exception" exn))

(define *the-exception-handler* base-exception-handler)

(define (raise exn)
  (*the-exception-handler* exn)

;; what kind of values do we want for exceptions, and how they're caught/compared?
;; python: started with strings, went to objects of the Exception class.
;; scheme: SRFI-12 includes something that looks like property lists?
;; sml: string refs? (in sml/nj according to CwC)
;; ocaml: declared exception datatypes.
;; [alternate for ocaml:
;;  sounds a little like my 'fourth idea']

;; my first temptation is to just use symbols, and stay simple

;; second idea is to use a record, which theoretically gives you the
;;   ability to attach other kinds of data, which could be very
;;   convenient. Example: (raise BadFD) => {exception='BadFD ...}

;; third idea: hardcore - define a global exception type, force the user to
;;   extend it.  [could be made easier by compiler hacks that allow you to
;;   extend the type anywhere in the source?].  Code would have to wildcard
;;   exceptions it doesn't know about?  [or is that the definition of handing
;;   it upstream?]

;; fourth idea: use records (or polymorphic variants) to define unique
;;   exception names *and* types.  For example: (raise (BadFD
;;   current-fd)) would type as {BadFD=int ...}  this would make it
;;   more like the SRFI-12 property-list thing (unless I misunderstand
;;   what SRFI-12 is all about).  Hmmm... maybe we could extend the
;;   exception handler by extending the record?

;; It'd be nice if the type system can tell us what exceptions are possible
;;   at any given point in the code... is this maybe impossible due to the
;;   dynamic (vs static/lexical) nature of exception handling?

;; the more I think about the 'fourth idea' the less reason I see to
;;  restrict the type of exceptions.  Syntax-wise, the following macro
;;  does not enforce any restrictions.  It's possible though that only
;;  the polymorphic-variant-scheme will survive type checking.

;; this is the first time I've tried to 'accumulate' pieces in a macro.
;; it's possible that there's a better idiom that I just haven't discovered yet.
;; should really look to see how the scheme folks do this (or do they just avoid
;; it all by wrapping everything in verbose s-expressions).

(defmacro try
  ;; done accumulating body parts, finish up.
  (try (begin body0 ...) <except> exn-match ...)
  -> (let (($old-hand *the-exception-handler*))
        (lambda ($exn)
          (set! *the-exception-handler* $old-hand)
          (match $exn with
            exn-match ...
            _ -> (raise $exn))))
       (let (($result (begin body0 ...)))
         (set! *the-exception-handler* $old-hand)
  ;; accumulating body parts...
  (try (begin body0 ...) body1 body2 ...) -> (try (begin body0 ... body1) body2 ...)
  ;; begin to accumulate...
  (try body0 body1 ...)                   -> (try (begin body0) body1 ...)

(define (random-barf)
  (if (= (logand (random) 1) 1)
      (raise (:OtherException 99))

(define (thing)
   (let loop ((n 0))
     (if (= n 100)
         (raise (:MyException 12))
           (loop (+ n 1)))))
   (:MyException value) -> value
   (:OtherException _)  -> 9


Here's what the macro expansion looks like for thing:

(define (thing)
  (let (($old-hand *the-exception-handler*))
       (lambda ($exn)
           (set! *the-exception-handler* $old-hand)
             (MyException OtherException)
             (1 1)
               ((m9 (%nvget (:MyException 0 1) $exn)))
               (let_subst (value m9) value))
             (%match-error #f))
            (raise $exn)))))
      (let (($result
                   (function loop (n) (if (= n 100) (raise (:MyException 12)) (loop (+ n 1))))))
               (loop 0))))
        (begin (set! *the-exception-handler* $old-hand) $result)))))

Wednesday, March 23, 2011

same-fringe demo

This demonstrates how to use generators to solve the same-fringe problem.

Also, I found a minimalist way of building binary-tree literals using defmacro.

;; -*- Mode: Irken -*-

(include "lib/core.scm")

(datatype btree
  (:node (btree 'a) (btree 'a))
  (:leaf 'a))

(defmacro btree/make
  (btree/make (l r)) -> (btree:node (btree/make l) (btree/make r))
  (btree/make x)     -> (btree:leaf x))

(define t0 (literal (btree/make ((0 ((1 (2 (3 4))) 5)) (((6 7) ((8 (9 10)) 11)) ((12 (((13 14) 15) (16 17))) (18 19)))))))
(define t1 (literal (btree/make (((0 ((1 2) 3)) (((4 5) (((6 7) 8) (9 10))) ((11 ((12 13) 14)) ((15 (16 17)) 18)))) 19))))
(define t2 (literal (btree/make (((0 ((1 2) 3)) (((4 5) (((6 7) 8) (9 10))) ((88 ((12 13) 14)) ((15 (16 17)) 18)))) 19))))

(define btree/inorder
  p (btree:leaf x)   -> (begin (p x) #u)
  p (btree:node l r) -> (begin (btree/inorder p l) (btree/inorder p r) #u))

(define (btree/make-generator t)
   (lambda (consumer)
      (lambda (x) (consumer (maybe:yes x)))
     (forever (consumer (maybe:no))))))

(define (same-fringe t0 t1 =)
  (let ((g0 (btree/make-generator t0))
        (g1 (btree/make-generator t1)))
    (let loop ((m0 (g0)) (m1 (g1)))
      (match m0 m1 with
        (maybe:yes v0) (maybe:yes v1)
        -> (if (= v0 v1)
               (loop (g0) (g1))
               (print-string "NOT equal\n"))
        (maybe:no) (maybe:no)
        -> (print-string "equal\n")
        _ _ -> (print-string "unequal size\n")))))

(same-fringe t0 t1 =)
(same-fringe t0 t2 =)

Sunday, March 20, 2011

win32 binary

I've tried to make the bootstrapping script work on win32, and I think for the most part I've succeeded.  There are issues with mac-specific flags making it into the bootstrap file that I still need to work out.

I used mingw, but I suspect other gcc binaries will work, too.  A 32-bit executable is now available, which may ease the bootstrapping process for some.

Thursday, March 17, 2011

description of the compiler and runtime

I've started on a 'HACKING.txt' document describing the compiler and runtime.
It should be enough to get folks started on understanding the source.
Feedback appreciated!


self-hosted distribution

I've finally put together a 'real' self-hosted distribution.

The tarball is a little bigger - even though I removed all the python code.  The reason is that I have to distribute a pre-compiled version of self/compile.c so that the compiler can be bootstrapped.

Enjoy, and feedback appreciated.

Here's the text of the README:

This is the initial release of the self-hosted compiler.

It's still in a very unpolished state, but you can use it to bootstrap itself.

Just run the script "util/":

$ python util/

Which does the following:
1) run gcc on the distributed version of self/compile.c
2) this binary will be used to recompile the compiler.
3) that binary will recompile the compiler again.
4) the output from steps 2 and 3 are compared, they should be identical.

If you're happy with the resulting compiler, you can compile an optimized
version of self/compile.c, but be warned - you'll need a lot of memory and
a lot of time.

I am using dragonegg for optimized builds, and that seems to take about a GB
of memory, and 18 minutes to build.  It's important to use '-O2', not '-O',
because '-O' takes 53GB of memory and hours to compile.

Very little documentation exists yet, try 'lang.html' for a brief tutorial.
The best way to get familiar with the language is to read the source code in
the 'self' directory, and browse over the files in "tests".


Wednesday, March 9, 2011

C compiler issues

As I came closer to completing the Irken implementation, I noticed that my edit-compile cycle was taking longer and longer.  And by that, I don't mean a linear change.  At some point a threshold was crossed, such that compiling an optimized binary could take nearly an hour with dragonegg, and much longer with gcc, consuming over 17G of memory while at it!

After doing some tests, I've identified at least one of the causes: my varref and varset functions.

A couple of years ago, the compiler output for a varref insn looked like this:

  r0 = lenv[1][1][1][0];

Where the variable we are referencing is 3 levels up and at the 0 index.  (i.e., a De Bruijn index of (3,0)).

I noticed that I could write an inline lexical function, varref(), that did this with a loop:

  r0 = varref (3, 0);

... which is much cleaner.  With -O, gcc, llvm, and dragonegg were all unrolling the constant loop and creating code that was identical to the first version.

I didn't notice the cost of this convenient feature until my program size got large enough... the compiler sources, when using the inline functions, take 5X as long to compile -O as the first version.

Also, a 'platform' note: I work on OS X, where the stock compiler is still /usr/bin/gcc.  I did some timings for a non-optimized build and discovered that the stock gcc is over twice as fast as either my hand-built gcc-4.5.0, or dragonegg.  So for quick edit-compile cycles, I switched back to the stock version.  Though it'd be nice to know why the version from Apple is so much faster...


[assumed skynet joke here]

A few days ago, after many weeks of frenetic work, the new self-hosted Irken compiled itself.  There's still a lot of work to do, but this major milestone has been reached.  The compiler passes the stage1==stage2 test, and I'm now concentrating on various features still missing compared to the python compiler.  Also, the edit-compile cycle is still pretty slow, because both the compiler and gcc are slower than need be.  Once I have these issues solved and some rudimentary error reporting (the current error reporting might be described as medieval) I'll make an official release.

I'm also thinking about how to best re-package the system, considering orphaning the python development branch, and starting a whole new repository for the self-hosted system.

In the meanwhile, if there's anyone out there that would just like to see it compile itself, I rolled a demo tarball up.  Enjoy, and feedback is much appreciated!

Monday, February 14, 2011

progress on self-hosting

Today the self-hosted Irken compiled its first complete program, all the way through substituting into the header template file, and compiling via gcc. A nice feeling.

;; -*- Mode: Irken -*-

(datatype list
  (:cons 'a (list 'a))

(defmacro list
  (list)         -> (list:nil)
  (list x y ...) -> (list:cons x (list y ...)))

(define (+ a b)
  (%%cexp (int int -> int) "%0+%1" a b))

(define length
  () -> 0
  (_ . tl) -> (+ 1 (length tl)))

(length (list 1 2 3))

I also finally got around to checking in the changes from the last couple of weeks, too much slacking off there.

Wednesday, February 2, 2011

Y combinator in Irken

The second version requires recursive types.

;; -*- Mode: Irken -*-

;; See

(include "lib/core.scm")

(define (Y f)
  (lambda (x)
    ((f (Y f)) x)))

(define (factabs fact)
  (lambda (x)
    (if (= x 0)
        (* x (fact (- x 1))))))

(printn ((Y factabs) 5))

;; and
;; let fix f' = let g f x = f' (f f) x in (g g);;
;; let fact = fix fact' in fact 5;;

(define (Y2 f1)
  ;; the '^' prefix tells the compiler not to inline this function
  ;;  which in this particular case would never terminate...
  (define (^g f)
    (lambda (x)
      (f1 (f f) x)))
  (^g ^g))

(define (fact1 fact x)
  (if (= x 0)
      (* x (fact (- x 1)))))

(printn ((Y2 fact1) 5))

The function ^g has type
μ(A, (A->b))->(c->d)

recursive types, unification, sharing

I'm having trouble with recursive types in my type reconstruction (type inference) algorithm, some kind of combinatorial explosion.  I think there are multiple triggers for this: at each call to unify(), I'm running a complete apply-subst(), rather than a level at a time.  For the massive types generated by my OO code, this means lots of wasted, repeated work.

But of course there's even more going on.  I think my code that detects recursive types is not working as well as it should, and I think the cause is the lack of 'sharing'.  Remy/Pottier use a different style of unification, that involves preserving shared elements of types, and I think this where I'm screwing up: because parts of types are being copied, I'm missing links between identical parts of the graph.

I found a nice presentation from Remy describing "Core ML", that explains this in (accented) English, rather than a relentless flurry of LaTeX: "Using, Understanding, and Unraveling The OCaml Language: From Practice to Theory and vice versa".

The constraint-oriented solver gave me recursive types for free, (because its output types are naturally graphs), but I'm really hoping to avoid going back to that code, because it's much slower and I frankly don't understand it as well as I'd like.  Maybe I can pull out just the multi-equation unification stuff, and get back recursive types.

Monday, January 31, 2011

simple approach to OO and exploding equirecursive types

While trying to port the graph algorithms (Kosaraju's algorithm / topological sort) I got to a point where I really wanted some imperative set/map objects, so I combined a 'class' using alists (for map) and lists (for set) with a few 'methods' put into records.

After playing with it for a while, I found a fairly neat way to do method calls and object layouts:

(define (set-class)

  (define (in self x)
    (let loop ((l self.list))
      (match l with
        () -> #f
        (hd . tl) -> (if (eq? hd x)
                         (loop tl)))))
  (define (add self x)
    (if (self::in x)
        (set! self.list (list:cons x self.list))))
  (define (get self)

  (let ((methods {in=in add=add get=get}))
    (define (new l)
      {o=methods list=l})

This leads to an efficient object layout very similar to Python's: one slot identifies the object type, the other slots hold instance variables.  To support this, I added a reader hack that translates

(var::method arg0)


(var.o.method var arg0)

The result is something that I'm pretty sure will act just like Python's duck typing... for example a generic output function may call the 'write' method on an object, and its type will reflect that need: for a write method of a particular signature.  Instead of 'o' I'd probably use a slot name like '__methods__'... this suggests standard repr/str interfaces, and the other duck-typed protocols that Python makes such good use of.

Pretty cool, huh?

Well, as soon as I tried actually using this code, the inferencer fell to its knees. I'm not kidding. It went from a 30 second compile to a 20 minute compile, eating up over a GB of memory.  Hmmmm... looks like my quick-and-easy implementation of μ-types has problems, surprise.  I think I was 'asking for it' by calling each method with a 'self' argument... each method call has to unify the object type twice.  And storing objects inside of other objects doesn't help, either.

Back to the textbook.  I'm hoping that reading this paper again will help.

Wednesday, January 26, 2011

self-hosting dilemma

I have nearly the whole compiler rewritten in itself.  I've saved one of the nastiest bits for last, the inliner.  I've been planning on a new design for the inliner - rather than just copying the Python code.  So of course, what happens in this late stage?  A bug in the inliner!

The program types correctly if inlining is turned off, but fails if it's turned on.  This is a great example of the gains to be had by typing both before and after phases of the compiler - but unfortunately in this case it means I have to fix a bug in some code that I intend to throw away as soon as it works!

Theoretically I could just turn off inlining until I'm done, and then write the new inliner in the self-hosted environment - but I can't be absolutely sure that it's not a more complex interaction between the typer and the inliner.

Thursday, January 20, 2011

bloated C compilers

I've been testing Irken out on my netbook (a Samsung NC10), and discovered a big problem with feeding Irken's output to gcc.

When I tried to compile the self-hosted Irken with -O, the netbook fell on its knees.  After about 15 minutes, I decided to look into it... why hadn't the compile finished yet?

The input Irken source was about 60kB, which exploded into a 920kB C file.  The C output is actually pretty low-level stuff, so it's probably comparable to a 300kB hand-written C file.  But let's just call it 1MB and move on...

The gcc (version 4.4) process had eaten 1.4GB of virtual memory, so it was paging on the netbook.

That's 1600X the size of the C input file.  It's nearly 25,000X the size of the Irken input.  Wow.

So as a test I installed dragonegg and gcc-4.5.  Quite a difference.  The dragonegg compile held at 143MB for most of the compile, and peaked at around 200MB.  Only 226X the size of the C input, and a mere 3400X the size of the Irken input.

But what's an order of magnitude between friends?

The datatype + record idiom

One of the difficult decisions when working on a compiler is how to represent nodes.  There's a strain between the pure-functional and object-oriented styles that can be really painful.  I'm certainly not happy with the way I did it in Irken - a combination of pure-functional class with a bunch of attribute hacks.

While working to solve the same problem in self-hosted Irken, I've come across a pretty good idiom that captures the best of both worlds; giving me exactly the right combination of strong typing, mutability, and flexibility that I need.

The trick is to use a record as the node representation, but inside it put a slot that holds all the strongly-typed metadata that you need.  Here's what it looks like so far:

;; node type holds metadata related to the node,
;;  but sub-nodes are held with the record.
(datatype node
  (:varref symbol)
  (:varset symbol)
  (:literal literal)
  (:cexp type string)
  (:nvcase symbol (list symbol))
  (:function symbol (list symbol)) ;; name formals

(define node-counter (make-counter 0))

;; given a list of nodes, add up their sizes (+1)
(define (sum-size l)
  (fold (lambda (n acc) (+ n.s acc)) 1 l))

(define (node t subs)
  {t=t subs=subs s=(+ (sum-size subs) 1) id=( type=(type:base '?)}

(define (node/varref name)
  (node (node:varref name) '()))

(define (node/varset name val)
  (node (node:varset name) (LIST val)))

(define (node/literal lit)
  (node (node:literal lit) '()))

(define (node/cexp type template args)
  (node (node:cexp type template) args))

(define (node/sequence subs)
  (node (node:sequence) subs))

Note how the node constructor initializes the other slots in the record.  The interesting one for me today is the type field.   This is later filled in by the typer, succinctly:

(define (type-of exp tenv)
  (let ((t (type-of* exp tenv)))
    (set! exp.type t)

(define (apply-subst-to-program n)
  (set! n.type (apply-subst-to-type n.type))
  (for-each apply-subst-to-program n.subs))

(define (type-program node)
  (let ((t (type-of node (alist/make))))
    (apply-subst-to-program node)

We've still preserved the strong typing of nodes, though: here's what a typical node-visiting function looks like:

(define (compile tail? exp lenv k)

  ;; override continuation when in tail position
  (if tail?
      (set! k (cont (k/free k) gen-return)))
  (match exp.t with
    (node:literal lit)           -> (c-literal lit k)
    (node:sequence)              -> (c-sequence tail? exp.subs lenv k)
    (node:if)                    -> (c-conditional tail? exp lenv k)
    (node:function name formals) -> (c-function name formals (car exp.subs) lenv k)
    (node:varref name)           -> (c-varref name lenv k)
    (node:varset name)           -> (c-varset name (car exp.subs) lenv k)
    (node:cexp sig template)     -> (c-cexp sig template exp.subs lenv k)
    (node:call)                  -> (c-call tail? exp lenv k)
    _ -> (begin (pp-node exp 0) (error1 "NYI" exp))

Note how metadata that's specific to each node type is stored in the datatype (exp.t), but generic info (like the list of sub-nodes, or the size) is stored in the record.

Tuesday, January 18, 2011

making the match compiler smarter

The match compiler developed a little problem.  It's amazing that it didn't show up earlier.  As described in chapter 5 of Peyton-Jones' book, "The Implementation of Functional Programming Languages", certain 'obvious' or 'optimized' pattern matches actually compile to worse code.

The following function is from the book (pg 88), and is similar to an implementation of map2.

(define demo
  () ys -> (:A ys)
  xs () -> (:B xs)
  (x . xs) (y . ys) -> (:C x xs y ys)

The second match clause will actually compile more efficiently if you use (x . xs) rather than a variable, because the match compiler knows in this case you're matching a cons cell. (it avoids the 'mixture rule' and uses only the 'constructor rule').

Regardless of this optimization, I ran into a more significant issue.  If I use xs, I get an 'incomplete match' error, because the code generated for this match has to examine the first list twice - in the first test it covers only the nil case.  In the second test, it covers the cons case, but has an 'else' clause that generates a match error.

Here's what the node tree looks like:

290 L    primapp [33] ('%%fatbar', None) ? 
262 L      nvcase [5] ('list', ['nil']) ? 
258 L        varref [1] m0_7_i0 ? 
260 L        primapp [2] ('%vcon/A/1', None) ? 
259 L          varref [1] m1_8_i1 ? 
261 L        primapp [1] ('%%fail', None) ? 
289 L      primapp [27] ('%%fatbar', None) ? 
267 L        nvcase [5] ('list', ['nil']) ? 
263 L          varref [1] m1_8_i1 ? 
265 L          primapp [2] ('%vcon/B/1', None) ? 
264 L            varref [1] m0_7_i0 ? 
266 L          primapp [1] ('%%fail', None) ? 
288 L        nvcase [21] ('list', ['cons']) ? 
268 L          varref [1] m0_7_i0 ? 
286 L          let_splat [18] [{m2_9_i0}, {m3_10_i0}] ? 
270 L            primapp [2] ('%nvget/list/cons/0', None) ? 
269 L              varref [1] m0_7_i0 ? 
272 L            primapp [2] ('%nvget/list/cons/1', None) ? 
271 L              varref [1] m0_7_i0 ? 
285 L            nvcase [13] ('list', ['cons']) ? 
273 L              varref [1] m1_8_i1 ? 
283 L              let_splat [10] [{m4_11_i0}, {m5_12_i0}] ? 
275 L                primapp [2] ('%nvget/list/cons/0', None) ? 
274 L                  varref [1] m1_8_i1 ? 
277 L                primapp [2] ('%nvget/list/cons/1', None) ? 
276 L                  varref [1] m1_8_i1 ? 
282 L                primapp [5] ('%vcon/C/4', None) ? 
278 L                  varref [1] m2_9_i0 ? 
279 L                  varref [1] m3_10_i0 ? 
280 L                  varref [1] m4_11_i0 ? 
281 L                  varref [1] m5_12_i0 ? 
284 L              primapp [1] ('%%match-error', None) ? 
287 L          primapp [1] ('%%match-error', None) ? 

...and in a pseudo-python:
if nil(m0):
elif nil(m1):
  if cons(m0):
    if cons(m1):
      raise Error
    raise Error

The problem is that the match is actually exhaustive, because the nil case was examined already.  %%match-error in Irken is actually a kind of sentinel, the back end doesn't know how to compile it, so you can't actually compile code that's not exhaustive.

To fix this, I added an extra pass to the analysis phase that builds environments of 'already tested alternatives' for each nvcase variable. This actually killed two birds with one stone. In many cases, the innermost nvcase test is completely unnecessary - for this example, we've already tested for nil, and actually testing for a cons is redundant. In this case we simply remove the nvcase test (and the erroneous match-error), and return the body.

The code then generated for map is closer to what a human might write:


if nil(m0):
elif nil(m1):


0 = constructed [] 1
   1 = constructed [] 0
   2 = fatbar 'fatbar_0' []
       2 = move [0, None] 'm0_7_i0'
       2 = nvcase 'list' [2]
           2 = move [1, None] 'm1_8_i1'
           2 = primop [2] ('%make-tuple', 'A', 0)
               return [2] None
               fail [] ('fatbar_0', 0)
           return [2] None
       2 = fatbar 'fatbar_1' []
           2 = move [1, None] 'm1_8_i1'
           2 = nvcase 'list' [2]
               2 = move [0, None] 'm0_7_i0'
               2 = primop [2] ('%make-tuple', 'B', 1)
                   return [2] None
                   fail [] ('fatbar_1', 0)
               return [2] None
           2 = move [0, None] 'm0_7_i0'
           2 = primop [2] ('%vget', '0')
           3 = move [0, None] 'm0_7_i0'
           3 = primop [3] ('%vget', '1')
           4 = move [1, None] 'm1_8_i1'
           4 = primop [4] ('%vget', '0')
           5 = move [1, None] 'm1_8_i1'
           5 = primop [5] ('%vget', '1')
           6 = move [2, None] 'm2_9_i0'
           7 = move [3, None] 'm3_10_i0'
           8 = move [4, None] 'm4_11_i0'
           9 = move [5, None] 'm5_12_i0'
           6 = primop [6, 7, 8, 9] ('%make-tuple', 'C', 2)
               return [6] None
           return [2] None
       return [2] None

Friday, January 14, 2011

progress on self-hosting

Making excellent progress on self-hosting.  Today the baby irken compiler spat up its first real code:

beast:irken rushing$ self/backend

-- reader --
(((lambda (x y) x) 3 4))

-- macros --
((function lambda (x y) x) 3 4)

-- node tree --

5 call
  2 function lambda (x y)
    1 varref x
  1 literal {u1 3}
  1 literal {u1 4}

-- RTL --

0 env 2 
1 lit {u1 3}
- stor 1 1 0 0 
1 lit {u1 4}
0 stor 1 1 0 1 
1 close lambda
  0 ref 0 0 
  - ret 0
0 tail 1 0 
- ret 0

-- C output --
  r0 = allocate (TC_TUPLE, 3);
  r1 = (object *) 7;
  r0[2] = r1;
  r1 = (object *) 9;
  r0[3] = r1;
  // def lambda
  goto L0;
    r0 = varref (0, 0);
  r1 = allocate (TC_CLOSURE, 2);
  r1[1] = &&FUN_lambda; r1 = lenv;
  r0[1] = r1[2]; lenv = r0; goto *r1;
{total ticks: 1870544 gc ticks: 0}

Thursday, January 13, 2011

format macro

How cool is this?

;; -*- Mode: Irken -*-

(include "lib/core.scm")
(include "lib/pair.scm")
(include "lib/string.scm")

(defmacro format
  (format)                       -> (list:nil)
  (format (<int> n) item ...)    -> (list:cons (int->string n) (format item ...))
  (format (<char> ch) item ...)  -> (list:cons (char->string ch) (format item ...))
  (format (<bool> b) item ...)   -> (list:cons (bool->string b) (format item ...))
  (format (<string> s) item ...) -> (list:cons s (format item ...))
  (format x item ...)            -> (list:cons x (format item ...))

(print-string (string-join (format "testing" (int 1) (bool #t) (char #\A) "\n") " "))

beast:irken rushing$ tests/t_format
testing 1 #t A 
{total ticks: 154666 gc ticks: 0}

Let me explain what's going on here.  First, this is a macro definition.  Macros are specified using a pattern language.  To the left of each arrow is an input pattern - on the right is how that input will be transformed; before the code gets to the compiler  (in other words, textual substitution).

Symbols surrounded by angle brackets are keywords - i.e., constant symbols. This is to distinguish them from variables. Every other name in the input pattern is bound to a variable, which can be used in the output form.

So the final line is transformed into this expression:

  (list:cons "testing"
             (list:cons (int->string 1)
                        (list:cons (bool->string #t)
                                   (list:cons (char->string #\A)
                                              (list:cons "\n" (list:nil))))))
  " "))

This is actually a very simple macro, the patterns could be much more sophisticated, with embedded list structures and other keywords. But it creates the illusion of a variable-arity, variably-typed format function in a language that is strictly, statically typed and does not support variable arity! And of course other macros may be invoked. I'm sure I'll be expanding it as I work more on Irken's self-hosted back end.

Usually string formatting is an ugly part of any language - one of the classic problems with C is that the format language is ill-specified and dangerous.  Only recently have compilers become smart enough to even do rudimentary checking...

I'm only just beginning to see what's possible with this macro system.  I think the pattern matching syntax combines very nicely with the general pattern matching in the language.