Wednesday, March 23, 2011

same-fringe demo

This demonstrates how to use generators to solve the same-fringe problem.

Also, I found a minimalist way of building binary-tree literals using defmacro.

;; -*- Mode: Irken -*-

(include "lib/core.scm")

(datatype btree
  (:node (btree 'a) (btree 'a))
  (:leaf 'a))

(defmacro btree/make
  (btree/make (l r)) -> (btree:node (btree/make l) (btree/make r))
  (btree/make x)     -> (btree:leaf x))

(define t0 (literal (btree/make ((0 ((1 (2 (3 4))) 5)) (((6 7) ((8 (9 10)) 11)) ((12 (((13 14) 15) (16 17))) (18 19)))))))
(define t1 (literal (btree/make (((0 ((1 2) 3)) (((4 5) (((6 7) 8) (9 10))) ((11 ((12 13) 14)) ((15 (16 17)) 18)))) 19))))
(define t2 (literal (btree/make (((0 ((1 2) 3)) (((4 5) (((6 7) 8) (9 10))) ((88 ((12 13) 14)) ((15 (16 17)) 18)))) 19))))

(define btree/inorder
  p (btree:leaf x)   -> (begin (p x) #u)
  p (btree:node l r) -> (begin (btree/inorder p l) (btree/inorder p r) #u))

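(define (btree/make-generator t)
  ;; make-generator is assumed to come from lib/core.scm
  (make-generator
   (lambda (consumer)
     ;; walk the tree, handing each leaf to the consumer
     (btree/inorder
      (lambda (x) (consumer (maybe:yes x)))
      t)
     ;; once the tree is exhausted, keep answering maybe:no
     (forever (consumer (maybe:no))))))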

(define (same-fringe t0 t1 =)
  (let ((g0 (btree/make-generator t0))
        (g1 (btree/make-generator t1)))
    (let loop ((m0 (g0)) (m1 (g1)))
      (match m0 m1 with
        (maybe:yes v0) (maybe:yes v1)
        -> (if (= v0 v1)
               (loop (g0) (g1))
               (print-string "NOT equal\n"))
        (maybe:no) (maybe:no)
        -> (print-string "equal\n")
        _ _ -> (print-string "unequal size\n")))))

(same-fringe t0 t1 =)
(same-fringe t0 t2 =)
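
For comparison, here is a rough sketch of the same idea in C, which has no generators. It swaps in a different technique: each tree gets an explicit-stack iterator that hands out one leaf per call, playing the role of g0 and g1 above. The names (btree, fringe_iter, iter_next, same_fringe), the int leaf type, and the fixed stack size are all assumptions made for illustration; none of this is part of Irken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tree layout: a leaf has l == r == NULL and carries a value. */
typedef struct btree {
    const struct btree *l, *r;
    int value;
} btree;

#define FRINGE_STACK_MAX 64  /* assumed bound, enough for tree height < 63 */

/* Explicit traversal state standing in for a suspended generator. */
typedef struct {
    const btree *stack[FRINGE_STACK_MAX];
    int top;
} fringe_iter;

static void iter_init(fringe_iter *it, const btree *t) {
    it->top = 0;
    it->stack[it->top++] = t;
}

/* Store the next leaf in *out and return true, or return false when done. */
static bool iter_next(fringe_iter *it, int *out) {
    while (it->top > 0) {
        const btree *n = it->stack[--it->top];
        if (n->l == NULL) {          /* leaf: yield it */
            *out = n->value;
            return true;
        }
        assert(it->top + 2 <= FRINGE_STACK_MAX);
        it->stack[it->top++] = n->r; /* right is visited after left */
        it->stack[it->top++] = n->l;
    }
    return false;
}

/* True when both trees have the same leaves in the same order. */
static bool same_fringe(const btree *a, const btree *b) {
    fringe_iter ia, ib;
    int va = 0, vb = 0;
    iter_init(&ia, a);
    iter_init(&ib, b);
    for (;;) {
        bool ha = iter_next(&ia, &va);
        bool hb = iter_next(&ib, &vb);
        if (!ha || !hb) return ha == hb;  /* equal only if both ran out */
        if (va != vb) return false;
    }
}
```

As in the Irken version, the comparison stops at the first mismatch, so neither tree is flattened into a list first.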

Sunday, March 20, 2011

win32 binary

I've tried to make the bootstrapping script work on win32, and I think for the most part I've succeeded.  There are issues with mac-specific flags making it into the bootstrap file that I still need to work out.

I used mingw, but I suspect other gcc binaries will work, too.  A 32-bit executable is now available, which may ease the bootstrapping process for some.

Thursday, March 17, 2011

description of the compiler and runtime

I've started on a 'HACKING.txt' document describing the compiler and runtime.
It should be enough to get folks started on understanding the source.
Feedback appreciated!


self-hosted distribution

I've finally put together a 'real' self-hosted distribution.

The tarball is a little bigger - even though I removed all the python code.  The reason is that I have to distribute a pre-compiled version of self/compile.c so that the compiler can be bootstrapped.

Enjoy, and feedback appreciated.

Here's the text of the README:

This is the initial release of the self-hosted compiler.

It's still in a very unpolished state, but you can use it to bootstrap itself.

Just run the script "util/":

$ python util/

Which does the following:
1) run gcc on the distributed version of self/compile.c
2) this binary will be used to recompile the compiler.
3) that binary will recompile the compiler again.
4) the outputs from steps 2 and 3 are compared; they should be identical.

If you're happy with the resulting compiler, you can compile an optimized
version of self/compile.c, but be warned - you'll need a lot of memory and
a lot of time.

I am using dragonegg for optimized builds, and that seems to take about a GB
of memory, and 18 minutes to build.  It's important to use '-O2', not '-O',
because '-O' takes 53GB of memory and hours to compile.

Very little documentation exists yet; try 'lang.html' for a brief tutorial.
The best way to get familiar with the language is to read the source code in
the 'self' directory, and browse over the files in "tests".


Wednesday, March 9, 2011

C compiler issues

As I came closer to completing the Irken implementation, I noticed that my edit-compile cycle was taking longer and longer.  And by that, I don't mean a linear change.  At some point a threshold was crossed, such that compiling an optimized binary could take nearly an hour with dragonegg, and much longer with gcc, consuming over 17G of memory while at it!

After doing some tests, I've identified at least one of the causes: my varref and varset functions.

A couple of years ago, the compiler output for a varref insn looked like this:

  r0 = lenv[1][1][1][0];

Where the variable we are referencing is 3 levels up and at the 0 index.  (i.e., a De Bruijn index of (3,0)).

I noticed that I could write an inline lexical function, varref(), that did this with a loop:

  r0 = varref (3, 0);

... which is much cleaner.  With -O, gcc, llvm, and dragonegg were all unrolling the constant loop and creating code that was identical to the first version.
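
A minimal sketch of what such a loop-based helper might look like. The frame layout here (a next pointer plus a small fixed array of long slots) is an assumption made for illustration and is not Irken's actual runtime representation; varset is shown as the obvious store counterpart.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical environment frame, NOT the real Irken layout. */
typedef struct frame {
    struct frame *next;  /* enclosing lexical environment */
    long slots[4];       /* variable slots of this frame  */
} frame;

/* Walk `depth` frames outward, then read slot `index`.
   With constant arguments an optimizer can unroll the loop
   into a straight chain of pointer loads. */
static inline long varref(const frame *lenv, int depth, int index) {
    for (int i = 0; i < depth; i++) {
        assert(lenv != NULL);
        lenv = lenv->next;
    }
    return lenv->slots[index];
}

/* Same walk, but store into the slot instead of reading it. */
static inline void varset(frame *lenv, int depth, int index, long value) {
    for (int i = 0; i < depth; i++) {
        assert(lenv != NULL);
        lenv = lenv->next;
    }
    lenv->slots[index] = value;
}
```

Under this layout, varref (lenv, 3, 0) follows three next pointers and reads slot 0, which is the work the hand-expanded chain of subscripts does explicitly.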

I didn't notice the cost of this convenient feature until my program size got large enough... the compiler sources, when using the inline functions, take 5X as long to compile with -O as they do with the first version.

Also, a 'platform' note: I work on OS X, where the stock compiler is still /usr/bin/gcc.  I did some timings for a non-optimized build and discovered that the stock gcc is over twice as fast as either my hand-built gcc-4.5.0, or dragonegg.  So for quick edit-compile cycles, I switched back to the stock version.  Though it'd be nice to know why the version from Apple is so much faster...


[assumed skynet joke here]

A few days ago, after many weeks of frenetic work, the new self-hosted Irken compiled itself.  There's still a lot of work to do, but this major milestone has been reached.  The compiler passes the stage1==stage2 test, and I'm now concentrating on various features still missing compared to the python compiler.  Also, the edit-compile cycle is still pretty slow, because both the compiler and gcc are slower than need be.  Once I have these issues solved and some rudimentary error reporting (the current error reporting might be described as medieval) I'll make an official release.

I'm also thinking about how to best re-package the system, considering orphaning the python development branch, and starting a whole new repository for the self-hosted system.

In the meantime, if there's anyone out there who would just like to see it compile itself, I rolled up a demo tarball.  Enjoy, and feedback is much appreciated!