Wednesday, March 9, 2011

C compiler issues

As I came closer to completing the Irken implementation, I noticed that my edit-compile cycle was taking longer and longer.  And by that, I don't mean a linear change.  At some point a threshold was crossed, such that compiling an optimized binary could take nearly an hour with dragonegg, and much longer with gcc, consuming over 17G of memory while at it!

After doing some tests, I've identified at least one of the causes: my varref and varset functions.

A couple of years ago, the compiler output for a varref insn looked like this:

  r0 = lenv[1][1][1][0];

Where the variable we are referencing is 3 levels up and at the 0 index.  (i.e., a De Bruijn index of (3,0)).

I noticed that I could write an inline lexical function, varref(), that did this with a loop:

  r0 = varref (3, 0);

... which is much cleaner.  With -O, gcc, llvm, and dragonegg were all unrolling the constant loop and creating code that was identical to the first version.

I didn't notice the cost of this convenient feature until my program size got large enough... the compiler sources, when using the inline functions, take 5X as long to compile -O as the first version.

Also, a 'platform' note: I work on OS X, where the stock compiler is still /usr/bin/gcc.  I did some timings for a non-optimized build and discovered that the stock gcc is over twice as fast as either my hand-built gcc-4.5.0, or dragonegg.  So for quick edit-compile cycles, I switched back to the stock version.  Though it'd be nice to know why the version from Apple is so much faster...

No comments:

Post a Comment