tag:blogger.com,1999:blog-51450157245556056512024-02-20T13:13:42.006-08:00The Alien TongueNews and Info for the Irken CommunitySam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.comBlogger68125tag:blogger.com,1999:blog-5145015724555605651.post-89517571901622347522013-03-12T20:41:00.000-07:002013-03-12T20:41:02.071-07:00C backend re-architected to SSA form.Well, one year later, I've managed to back into the CPS output via a completely different direction. The rewrite began as an experiment, triggered by the completely unpredictable behavior of new versions of clang when compiling Irken's output. [I also found <a href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56510">a gcc bug.</a>] I thought I'd try to generate 'real' CPS - i.e., separate C functions that only make tail calls. This turned out to be much easier than my attempts last year starting from the other end.<br />
<br />
After only about 3-4 days of work, we now have output like this:<br />
<br />
<br />
<pre style="background-color: #f8f8f8; line-height: 16px;"><span class="p">(</span><span class="k" style="color: green; font-weight: bold;">let </span><span class="nv" style="color: #19177c;">loop</span> <span class="p">((</span><span class="nf" style="color: blue;">n</span> <span class="mi" style="color: #666666;">1000000</span><span class="p">))</span>
<span class="p">(</span><span class="k" style="color: green; font-weight: bold;">if </span><span class="p">(</span><span class="nb" style="color: green;">zero? </span><span class="nv" style="color: #19177c;">n</span><span class="p">)</span>
<span class="s" style="color: #ba2121;">"done"</span>
<span class="p">(</span><span class="nf" style="color: blue;">loop</span> <span class="p">(</span><span class="nb" style="color: green;">- </span><span class="nv" style="color: #19177c;">n</span> <span class="mi" style="color: #666666;">1</span><span class="p">))))</span></pre>
<br />
<br />
<br />
<pre style="background-color: #f8f8f8; line-height: 16px;"><span class="k" style="color: green; font-weight: bold;">static</span> <span class="kt" style="color: #b00040;">void</span> <span class="nf" style="color: blue;">FUN_loop_17_0</span> <span class="p">(</span><span class="kt" style="color: #b00040;">void</span><span class="p">)</span> <span class="p">{</span>
<span class="n">O</span> <span class="n">r10</span> <span class="o" style="color: #666666;">=</span> <span class="p">((</span><span class="n">object</span><span class="o" style="color: #666666;">*</span><span class="p">)</span> <span class="n">lenv</span><span class="p">)</span> <span class="p">[</span><span class="mi" style="color: #666666;">2</span><span class="p">];</span>
<span class="k" style="color: green; font-weight: bold;">if</span> <span class="n">PXLL_IS_TRUE</span><span class="p">(</span><span class="n">PXLL_TEST</span><span class="p">(</span><span class="n">unbox</span><span class="p">(</span><span class="n">r10</span><span class="p">)</span><span class="o" style="color: #666666;">==</span><span class="mi" style="color: #666666;">0</span><span class="p">))</span> <span class="p">{</span>
<span class="n">O</span> <span class="n">r12</span> <span class="o" style="color: #666666;">=</span> <span class="p">(</span><span class="n">object</span><span class="o" style="color: #666666;">*</span><span class="p">)</span> <span class="o" style="color: #666666;">&</span><span class="n">constructed_0</span><span class="p">;</span>
<span class="n">PXLL_RETURN</span><span class="p">(</span><span class="mi" style="color: #666666;">12</span><span class="p">);</span>
<span class="p">}</span> <span class="k" style="color: green; font-weight: bold;">else</span> <span class="p">{</span>
<span class="n">O</span> <span class="n">r16</span> <span class="o" style="color: #666666;">=</span> <span class="p">(</span><span class="n">object</span> <span class="o" style="color: #666666;">*</span><span class="p">)</span> <span class="mi" style="color: #666666;">3</span><span class="p">;</span>
<span class="n">O</span> <span class="n">r17</span> <span class="o" style="color: #666666;">=</span> <span class="p">((</span><span class="n">object</span><span class="o" style="color: #666666;">*</span><span class="p">)</span> <span class="n">lenv</span><span class="p">)</span> <span class="p">[</span><span class="mi" style="color: #666666;">2</span><span class="p">];</span>
<span class="n">O</span> <span class="n">r15</span> <span class="o" style="color: #666666;">=</span> <span class="n">box</span><span class="p">((</span><span class="n">pxll_int</span><span class="p">)</span><span class="n">unbox</span><span class="p">(</span><span class="n">r17</span><span class="p">)</span><span class="o" style="color: #666666;">-</span><span class="n">unbox</span><span class="p">(</span><span class="n">r16</span><span class="p">));</span>
<span class="n">lenv</span><span class="p">[</span><span class="mi" style="color: #666666;">2</span><span class="p">]</span> <span class="o" style="color: #666666;">=</span> <span class="n">r15</span><span class="p">;</span>
<span class="n">FUN_loop_17_0</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span></pre>
A few things to note: the loop is achieved with a tail call in C. This requires that the C compiler perform tail call optimization or the stack will explode rather quickly.<br />
<br />
The 'registers' are now emitted as local variables, with a fresh name for every line. This output is very, very close to the SSA form needed for LLVM itself, and is practically begging for an LLVM backend.<br />
<br />
There is no longer any dependency on C extensions - the address-of-label extension was never a happy citizen in LLVM land, anyway - and the lexical functions extension (needed by Irken circa 3 years ago) is a distant memory.<br />
<br />
The output of the compiler runs about the same speed, but the compilation time itself is much more predictable. clang -O3 takes about 30 seconds, and gcc -O3 around a minute. gcc actually seems to do a better job. I need to play around with clang and gcc's options in order to figure out the best flags for fast edit-compile-link turnaround - I'm missing the 4-second compiles I used to have with clang and the old compiler.<br />
<br />
Other great possibilities include separate compilation, JIT, and (possibly) closure conversion / passing args in registers. An LLVM backend might open up the possibility of efficient floating-point support.<br />
<br />
However, the next step is probably to write a real garbage collector. I'm seriously considering using Irken for some real server work soon, and to survive in that world a two-space collector just isn't going to cut it. I've read over the "<a href="http://people.cs.umass.edu/~moss/papers/pldi-2002-beltway.pdf">Beltway</a>" paper several times and I think I grok it well enough to implement it.<br />
<br />
Sam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com1tag:blogger.com,1999:blog-5145015724555605651.post-87198116603439093412012-03-06T20:41:00.000-08:002012-03-06T20:41:31.909-08:00LLVM, CPSI've done quite a bit of work these past two months on writing an LLVM backend.<div>Rather than directly translate the current design (one giant function), I'm trying to map function to function, and use llvm's support for tail calls to get arguments-in-registers to work.</div><div><br />
</div><div>This requires a 'real' CPS transform, and one that's up near the front of the chain of passes, not the last thing before the backend.</div><div><br />
</div><div>The CPS transform introduces 'continuation functions' - these are similar to normal functions, except they take a single argument. They are targeted by SSA 'terminators' like branch, jump, etc...</div><div><br />
</div><div>A continuation function corresponds to a 'basic block' in SSA - think of the phi insn as if it took the single argument and moved it into the correct register.</div><div><br />
</div><div>If every function were a tail call, we could emit all the code as basic blocks with branches of various kinds. Sadly, some calls are not tail calls. This requires that we render the continuation function as a real function, and that all 'jumps' to it are tail calls.</div><div><br />
</div><div>Needless to say, this is a pretty major rewrite of Irken. I've already made three failed swipes at it, but I hope to find the time to get it working within the next few months.</div><div><br />
</div>Sam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com0tag:blogger.com,1999:blog-5145015724555605651.post-16888718931759301012012-02-13T17:05:00.000-08:002012-02-13T17:07:03.711-08:00pattern matching with regexpsThis is just a reminder to myself to think about extending pattern matching to regular expressions on strings. It'd be a great feature, though with the famous caveat about regular expressions. [I have a problem. I think I can solve it with regular expressions. Now I have two problems.]<br />
<br />
<br />
<br />
<span style="font-family: 'Courier New', Courier, monospace;">(define sample</span><br />
<span style="font-family: 'Courier New', Courier, monospace;"> "(?P<x><x>[0-9]+)+(?P<y><y>[0-9]+)" -> (+ (string->int x) (string->int y))</y></x></span><br />
<span style="font-family: 'Courier New', Courier, monospace;"> "[^x]+(?P<nx><nx>x+)[^x]+"<span class="Apple-tab-span" style="white-space: pre;"> </span>-> (string-length nx)</nx></span><br />
<span style="font-family: 'Courier New', Courier, monospace;"> )</span><br />
<span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"><br />
</span><br />
I still remember looking through the ejabberd source code, trying to find their xml parser. When it finally dawned on me that it was automagically hidden in the pattern matching, I was impressed.Sam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com1tag:blogger.com,1999:blog-5145015724555605651.post-27862481088089252292012-01-21T01:18:00.000-08:002012-01-21T01:18:11.041-08:00Some notes on continuation-passing styleI've been going over the CPS transform lately, and wrote some notes on how to use the transform to compile to C, stackless style.<br />
<br />
<a href="http://dark.nightmare.com/rushing/factcps/">dark.nightmare.com/rushing/factcps/</a>Sam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com0tag:blogger.com,1999:blog-5145015724555605651.post-48026413839249849442012-01-09T17:30:00.000-08:002012-01-09T17:30:17.976-08:00thinking about an llvm backend, againI've been very happy with my other project that uses LLVM as the backend for a compiler.<br />
I've also learned a lot about LLVM's IR design.<br />
<br />
I can't help but think that it might only take a week or two to write a back end for Irken.<br />
However, there are a few issues.<br />
<br />
<ol><li>My CPS output assumes that each insn has a result that is put into a register. This doesn't fit LLVM's model, which assumes that arguments to insns can be <i>either</i> an immediate <i>or</i> a register. In fact, they do not allow you to put an immediate into a register.</li>
<li>LLVM doesn't like gcc's goto-label feature. I think they implemented it reluctantly, because it doesn't fit the SSA model very well. The majority of funcalls by Irken are to known functions, which translate to a branch to a known label. However, whenever higher-level functions are called, this leads to something like "<span style="font-family: 'Courier New', Courier, monospace;">goto *r0;</span>" in C, which turns into an indirectbr insn in LLVM. It implements this using a gigantic phi/indirectbr table at the very end of the function. Maybe this isn't really a problem - it just looks bad.</li>
<li>The extremely useful <span style="font-family: 'Courier New', Courier, monospace;">%cexp</span> primitive would go away! There's something to be said for being able to refer to OS constants, OS functions, macros, etc... with such a simple mechanism. I'd probably just have to let it go, and force users to write stubs.</li>
</ol><div>I think #1 can be dealt with by adding an extra pass that translates between the two representations. (maybe I should revisit the CPS==SSA paper?)</div><div>Another approach might be to use an annoying alloca/store/load sequence and hope that mem2reg does a good job?</div><div><br />
</div>Sam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com0tag:blogger.com,1999:blog-5145015724555605651.post-10468498960036076182012-01-02T00:15:00.000-08:002012-01-02T00:15:26.126-08:00Irken status, LLVM bitcodeThis is just a quick note that the Irken project has <i>not</i> died, only gone into a temporary hibernation.<br />
When in the course of history it becomes necessary to earn a living...<br />
<br />
Anyway, I have some tangentially related comments about LLVM.<br />
My current employer is interested in maybe doing a little LLVM JIT work, wherein my experience with compilers may prove useful. A nice side effect for me is that I <i>finally</i> get to play around a bit with LLVM, something which I carefully avoided while working on Irken.<br />
<br />
My first reaction was to pick up llvm-py, which looked like a fairly complete interface to the various libraries that make up the llvm system.<br />
<br />
Big Mistake. It doesn't build any more. It's only a little over a year old. The last release of llvm-py was for LLVM 2.8rc2. I am now running 3.0.<br />
<br />
Here's the problem: the LLVM API's are <b>constantly</b><a href="http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-June/023190.html"> changing</a>. And that's a good thing, really. But since those C++ API's are the only well-supported interface to the system, it's a moving target.<br />
<br />
There are <a href="http://llvm.org/docs/FAQ.html#langirgen">three main options</a> for a compiler writer who wants to target LLVM. The first is to use the C++ API. (or you could use the completely undocumented and incomplete C interface).<br />
<br />
The second option is to write llvm assembly. This is not a bad option, but will be slower because of the need to parse the assembly.<br />
<br />
The third is to write llvm 'bitcode'. I thought this was probably the right approach. Bitcode is a binary representation of the LLVM IR language, it seems that it would be likely to change much more slowly than the library interfaces.<br />
<br />
The problem is that the person who designed the <a href="http://llvm.org/docs/BitCodeFormat.html">bitcode file format</a> was on meth at the time. This has got to be one of the most evil formats I've ever seen. I <b>think</b> they were trying to make the resulting file as small as possible, but at the expense of counting every bit. (perhaps an early LLVM project flew on a space probe?) Just to give you an idea, 3 and 4-bit integers (not aligned on any boundary) are a common field type. And not just that, but some of the fields are variably sized, starting at 2 bits and going up. Symbols are represented using 6-bit characters. Also, the concept of 'bitstream' is taken very literally, the bits in a file are viewed as if it were one large integer, starting from the LSB. This is actually an elegant approach, but it is <i>completely</i> different from every other file format or network protocol. I had to continually resist using common parsing patterns with it. There's also a compression-like method of 'abbreviating' common record types. And after all that work to save bits, at the end of every block the bit stream is aligned to 32 bits!<br />
<br />
Ok, got that off my chest.<br />
<br />
My sense is that nobody uses the bitcode format because of this (the only other implementation I could find was in Haskell, and was also impenetrable because Everything Is A Monad, You Know). Most people just succumb to the pressure to write against the C++ API, and will then spend the rest of their lives reading the llvm-dev list to keep track of the changes.<br />
<br />
I have <a href="http://dark.nightmare.com/rushing/python/bc.py">mostly decoded the format</a>, and I think I could actually output it for a compiler.<br />
But I think I'll split the difference and start out by outputting llvm assembly to start with. If the day comes when I need to shave some of the 10MB off the resulting .so file, and maybe speed things up, I'll revisit the bitcode issue.<br />
<br />
In the meanwhile I've had a whack at using <a href="http://dark.nightmare.com/rushing/misc/llvm.pyx.html">Cython's C++ support to make a minimal interface to LLVM</a> where I can feed either llvm assembly or bitcode to the JIT and run it.Sam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com1tag:blogger.com,1999:blog-5145015724555605651.post-45685326638178408132011-05-23T17:15:00.000-07:002011-05-23T17:15:47.584-07:00git, github, clangI've imported my svn history into git, and published the whole thing on github:<br />
<br />
<a href="https://github.com/samrushing/irken-compiler/">https://github.com/samrushing/irken-compiler/</a><br />
<br />
Still trying to grok git and github, there'll probably be a stall for a week or so.<br />
<br />
In the meanwhile, I have interesting news: I was able to take out all the 'nested functions' which relied on a gcc extension. I resisted this for a long time because early testing had shown that declaring the 'registers' as local register variables in the main function was a huge win.<br />
<br />
After a quick test the other day, I found that moving everything out to global variables has no visible impact on performance. I don't know if this is because of improvements in gcc & llvm, or if I just wasn't trying a large enough program.<br />
<br />
That means I'm now able to compile the output with clang, rather than relying on llvm+dragonegg, which can be much more annoying to build, install, and use.<br />
<br />
The good news continues: although the performance of the generated code is identical (maybe a tiny bit faster), the compilation time has been cut quite significantly. On my machine an LLVM -O2 compile dropped from 18m to 8m. The -O0 compile time is even sweeter, about 3s, which is actually faster than Irken itself, which hovers right around 5s to self-compile. So a roughly 10s edit-compile cycle will definitely speed the development of Irken up.<br />
<br />
Now the bad news. The no-nested-funs version causes 'as' on snow leopard to take a lunch break. It actually nails the CPU for 10 or 15 minutes before finally emerging as if nothing was wrong. No problems on FreeBSD 7.3 with gcc-4.5.3, though. This could be a significant annoyance, since I expect many people trying out Irken would prefer to do so with a stock compiler.<br />
<br />
More bad news: clang-2.9 crashes when fed Irken's self-compiled output. I was going to file a bug against it, but when I checked out the svn version and built it the problem was gone. Sigh.<br />
<br />
This is why I'm trying to figure out how to make branches on github, since it'd be nice to maintain both versions for a while until these issues are ironed out. I have made a local branch called 'no-nested-funs', I just haven't figured out how to publish it on github.Sam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com1tag:blogger.com,1999:blog-5145015724555605651.post-87304151273284118992011-05-15T15:59:00.000-07:002011-05-15T15:59:43.184-07:00nested datatypes, 2-3 trees, numerical encoding<span class="Apple-style-span" style="font-family: Times, 'Times New Roman', serif;">This <a href="http://www.soi.city.ac.uk/~ross/papers/FingerTree.pdf">paper by Hinze & Paterson</a> discusses the use of 'nested datatypes' to implement 2-3 trees, but in a way that uses the type system to automatically handle the invariants. A <a href="http://scienceblogs.com/goodmath/2010/04/finger_trees_done_right_i_hope.php">blog post from Mark Chu-Carroll</a> is a bit easier to follow. Here's his example of a sequence encoded as a 2-3 tree (in Haskell):</span><br />
<br />
<span class="Apple-style-span" style="font-family: 'Trebuchet MS', Arial, Verdana, Geneva, Helvetica, sans-serif; line-height: 16px;"></span><br />
<pre style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">data Tree t = Zero t | Succ (Tree (Node t))
data Node s = Node2 s s | Node3 s s s</pre><pre style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"></pre><pre style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"><span class="Apple-style-span" style="font-family: Times, 'Times New Roman', serif;">
</span></pre><pre style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"></pre><pre style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"><span class="Apple-style-span" style="font-family: Times, 'Times New Roman', serif; white-space: normal;">Each level of the tree contains sub-trees that are 'one less' via the numeric encoding scheme.</span></pre><pre style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"><span class="Apple-style-span" style="font-family: Times, 'Times New Roman', serif; white-space: normal;">
</span></pre><pre style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"><span class="Apple-style-span" style="font-family: Times, 'Times New Roman', serif; white-space: normal;">This sent me back to Okasaki's book, where I'm finally starting to grok those last two chapters. This <a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.46.571">bizarre mingling of numerical encoding and data structures</a> lights a fire in the parts of my brain dedicated to data structures. If feel as if I could automatically see the relationship between red-black trees, 2-3 trees, and binary number representations.... if only I could double my IQ.</span></pre><pre style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"><span class="Apple-style-span" style="font-family: Times, 'Times New Roman', serif; white-space: normal;">
</span></pre><pre style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"><span class="Apple-style-span" style="font-family: Times, 'Times New Roman', serif; white-space: normal;">
</span></pre><pre style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"><span class="Apple-style-span" style="font-family: Times, 'Times New Roman', serif; white-space: normal;">
</span></pre><pre style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"><span class="Apple-style-span" style="font-family: Times, 'Times New Roman', serif; white-space: normal;">I have this cool puzzle called the '<a href="http://www.jaapsch.net/puzzles/spinout.htm">Spin-Out</a>', which is actually a physical demonstration of the properties of <a href="http://en.wikipedia.org/wiki/Gray_code">Gray Codes</a>. It seems to me that Gray Codes and balanced trees should be a natural fit.</span></pre><pre style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"><span class="Apple-style-span" style="font-family: Times, 'Times New Roman', serif; white-space: normal;">
</span></pre><pre style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"><span class="Apple-style-span" style="font-family: Times, 'Times New Roman', serif; white-space: normal;">
</span></pre><pre style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"><span class="Apple-style-span" style="font-family: Times, 'Times New Roman', serif; white-space: normal;">Unfortunately, I can't experiment with any of these ideas in Irken (or OCaml, or SML) because the nested datatypes require polymorphic recursion.</span></pre><pre style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"><span class="Apple-style-span" style="font-family: Times, 'Times New Roman', serif; white-space: normal;">
</span></pre><pre style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"><span class="Apple-style-span" style="font-family: Times, 'Times New Roman', serif; white-space: normal;">Type inference with polymorphic recursion is undecidable. Haskell allows it only if the user provides a manual type signature. I'm guessing this is impossible (i.e., unsafe) with SML/OCaml because of the presence of assignment? Could such a feature safely be used with functions that the compiler can verify are pure?</span></pre><pre style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"><span class="Apple-style-span" style="font-family: Times, 'Times New Roman', serif; white-space: normal;">
</span></pre><pre style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"><span class="Apple-style-span" style="font-family: Times, 'Times New Roman', serif; white-space: normal;">
</span></pre><pre style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"><span class="Apple-style-span" style="font-family: Times, 'Times New Roman', serif; white-space: normal;">
</span></pre><pre style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"><span class="Apple-style-span" style="font-family: Times;"><span class="Apple-style-span" style="white-space: normal;">
</span></span></pre><pre></pre>Sam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com0tag:blogger.com,1999:blog-5145015724555605651.post-85492831943128406522011-04-27T00:36:00.000-07:002011-04-27T00:36:26.695-07:00DOOM DOOM DOOM (a coro/thread kqueue-based scheduler)I've created a new sub-project, code-named 'doom', for the coroutine-based cooperative threading system. It uses kqueue(), of course. Because it's better. It wouldn't be too hard for someone from linux-land to make a version based on epoll.<br />
<br />
In the <a href="http://dark.nightmare.com/rushing/irken/irken/doom/">doom directory</a>, you'll find a <a href="http://dark.nightmare.com/rushing/irken/irken/doom/fetch.scm.html">simple http demo</a>, and the <a href="http://dark.nightmare.com/rushing/irken/irken/doom/echo.scm.html">obligatory echo server</a>.<br />
<br />
Exception handling certainly makes the code look a lot nicer.<br />
<br />
I'm the best at space.Sam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com0tag:blogger.com,1999:blog-5145015724555605651.post-47222534155152457452011-04-22T15:55:00.000-07:002011-04-22T21:58:52.333-07:00exception handling, take twoA bit tricky, but I found a way to implement <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">try/except</span> almost completely with defmacro. I needed to add some extra typechecking around <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">raise</span> and <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">try</span> to ensure that all exception types are monomorphic. I achieved this by having the compiler keep a global map of exception-name => tvar. Seems to work...<br />
<br />
While working on this I learned that the Scheme-style <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">call/cc</span> is not really type safe, and I need to use the SML version. The main difference is the protocol - SML's <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">callcc</span> requires the use of a <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">throw</span> procedure since the continuation needs to carry type information:<br />
<br />
<pre>(define (callcc p) : (((continuation 'a) -> 'a) -> 'a)
(p (getcc)))
(define (throw k v)
(putcc k v))</pre><div><br />
</div><br />
<br />
<br />
<div class="highlight"><pre><span class="c1">;; -*- Mode: Irken -*-</span>
<span class="p">(</span><span class="nf">include</span> <span class="s">"lib/core.scm"</span><span class="p">)</span>
<span class="p">(</span><span class="nf">include</span> <span class="s">"lib/random.scm"</span><span class="p">)</span>
<span class="c1">;; We use polymorphic variants for exceptions.</span>
<span class="c1">;; Since we're a whole-program compiler there's no need to declare</span>
<span class="c1">;; them - though I might could be convinced it's still a good idea.</span>
<span class="c1">;;</span>
<span class="c1">;; Exception data must be monomorphic to preserve type safety.</span>
<span class="c1">;;</span>
<span class="c1">;; <try> and <raise> are implemented as macros, with one extra hitch:</span>
<span class="c1">;; two special compiler primitives are used to check that exception</span>
<span class="c1">;; types are consistent: %exn-raise and %exn-handle</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">base-exception-handler</span> <span class="nv">exn</span><span class="p">)</span> <span class="nv">:</span> <span class="p">((</span><span class="nf">rsum</span> <span class="ss">'a</span><span class="p">)</span> <span class="k">-> </span><span class="ss">'b</span><span class="p">)</span>
<span class="p">(</span><span class="nf">error1</span> <span class="s">"uncaught exception"</span> <span class="nv">exn</span><span class="p">))</span>
<span class="p">(</span><span class="k">define </span><span class="nv">*the-exception-handler*</span> <span class="nv">base-exception-handler</span><span class="p">)</span>
<span class="p">(</span><span class="nf">defmacro</span> <span class="nv">raise</span>
<span class="p">(</span><span class="nf">raise</span> <span class="nv">exn</span><span class="p">)</span> <span class="k">-> </span><span class="p">(</span><span class="nf">*the-exception-handler*</span> <span class="p">(</span><span class="nf">%exn-raise</span> <span class="no">#f</span> <span class="nv">exn</span><span class="p">))</span>
<span class="p">)</span>
<span class="p">(</span><span class="nf">defmacro</span> <span class="nv">try</span>
<span class="c1">;; done accumulating body parts, finish up.</span>
<span class="p">(</span><span class="nf">try</span> <span class="p">(</span><span class="k">begin </span><span class="nv">body0</span> <span class="o">...</span><span class="p">)</span> <span class="nv"><except></span> <span class="nv">exn-match</span> <span class="o">...</span><span class="p">)</span>
<span class="k">-> </span><span class="p">(</span><span class="nf">callcc</span>
<span class="p">(</span><span class="k">lambda </span><span class="p">(</span><span class="nf">$exn-k0</span><span class="p">)</span>
<span class="p">(</span><span class="k">let </span><span class="p">((</span><span class="nf">$old-hand</span> <span class="nv">*the-exception-handler*</span><span class="p">))</span>
<span class="p">(</span><span class="nf">set!</span>
<span class="nv">*the-exception-handler*</span>
<span class="p">(</span><span class="k">lambda </span><span class="p">(</span><span class="nf">$exn-val</span><span class="p">)</span>
<span class="p">(</span><span class="k">set! </span><span class="nv">*the-exception-handler*</span> <span class="nv">$old-hand</span><span class="p">)</span>
<span class="p">(</span><span class="nf">throw</span> <span class="nv">$exn-k0</span>
<span class="p">(</span><span class="nf">%exn-handle</span> <span class="no">#f</span> <span class="nv">$exn-val</span>
<span class="p">(</span><span class="nf">match</span> <span class="nv">$exn-val</span> <span class="nv">with</span>
<span class="nv">exn-match</span> <span class="o">...</span>
<span class="nv">_</span> <span class="k">-> </span><span class="p">(</span><span class="nf">raise</span> <span class="nv">$exn-val</span><span class="p">))))))</span>
<span class="p">(</span><span class="k">let </span><span class="p">((</span><span class="nf">$result</span> <span class="p">(</span><span class="k">begin </span><span class="nv">body0</span> <span class="o">...</span><span class="p">)))</span>
<span class="p">(</span><span class="k">set! </span><span class="nv">*the-exception-handler*</span> <span class="nv">$old-hand</span><span class="p">)</span>
<span class="nv">$result</span><span class="p">))))</span>
<span class="c1">;; accumulating body parts...</span>
<span class="p">(</span><span class="nf">try</span> <span class="p">(</span><span class="k">begin </span><span class="nv">body0</span> <span class="o">...</span><span class="p">)</span> <span class="nv">body1</span> <span class="nv">body2</span> <span class="o">...</span><span class="p">)</span> <span class="k">-> </span><span class="p">(</span><span class="nf">try</span> <span class="p">(</span><span class="k">begin </span><span class="nv">body0</span> <span class="o">...</span> <span class="nv">body1</span><span class="p">)</span> <span class="nv">body2</span> <span class="o">...</span><span class="p">)</span>
<span class="c1">;; begin to accumulate...</span>
<span class="p">(</span><span class="nf">try</span> <span class="nv">body0</span> <span class="nv">body1</span> <span class="o">...</span><span class="p">)</span> <span class="k">-> </span><span class="p">(</span><span class="nf">try</span> <span class="p">(</span><span class="k">begin </span><span class="nv">body0</span><span class="p">)</span> <span class="nv">body1</span> <span class="o">...</span><span class="p">)</span>
<span class="p">)</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">level0</span> <span class="nv">x</span><span class="p">)</span>
<span class="p">(</span><span class="nf">try</span>
<span class="p">(</span><span class="nf">level1</span> <span class="nv">x</span><span class="p">)</span>
<span class="nv">except</span>
<span class="p">(</span><span class="nf">:Level0</span> <span class="nv">x</span><span class="p">)</span> <span class="k">-> </span><span class="p">(</span><span class="nf">:pair</span> <span class="s">"level 0"</span> <span class="nv">x</span><span class="p">)</span>
<span class="p">))</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">level1</span> <span class="nv">x</span><span class="p">)</span>
<span class="p">(</span><span class="nf">try</span>
<span class="p">(</span><span class="nf">level2</span> <span class="nv">x</span><span class="p">)</span>
<span class="nv">except</span>
<span class="p">(</span><span class="nf">:Level1</span> <span class="nv">x</span><span class="p">)</span> <span class="k">-> </span><span class="p">(</span><span class="nf">:pair</span> <span class="s">"level 1"</span> <span class="nv">x</span><span class="p">)</span>
<span class="p">))</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">level2</span> <span class="nv">x</span><span class="p">)</span>
<span class="p">(</span><span class="nf">try</span>
<span class="p">(</span><span class="nf">level3</span> <span class="nv">x</span><span class="p">)</span>
<span class="nv">except</span>
<span class="p">(</span><span class="nf">:Level2</span> <span class="nv">x</span><span class="p">)</span> <span class="k">-> </span><span class="p">(</span><span class="nf">:pair</span> <span class="s">"level 2"</span> <span class="nv">x</span><span class="p">)</span>
<span class="p">))</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">level3</span> <span class="nv">x</span><span class="p">)</span>
<span class="p">(</span><span class="nf">try</span>
<span class="p">(</span><span class="nf">match</span> <span class="nv">x</span> <span class="nv">with</span>
<span class="mi">0</span> <span class="k">-> </span><span class="p">(</span><span class="nf">raise</span> <span class="p">(</span><span class="nf">:Level0</span> <span class="nv">x</span><span class="p">))</span>
<span class="mi">1</span> <span class="k">-> </span><span class="p">(</span><span class="nf">raise</span> <span class="p">(</span><span class="nf">:Level1</span> <span class="nv">x</span><span class="p">))</span>
<span class="mi">2</span> <span class="k">-> </span><span class="p">(</span><span class="nf">raise</span> <span class="p">(</span><span class="nf">:Level2</span> <span class="nv">x</span><span class="p">))</span>
<span class="mi">3</span> <span class="k">-> </span><span class="p">(</span><span class="nf">raise</span> <span class="p">(</span><span class="nf">:Level3</span> <span class="nv">x</span><span class="p">))</span>
<span class="mi">4</span> <span class="k">-> </span><span class="p">(</span><span class="nf">:pair</span> <span class="s">"no exception!"</span> <span class="mi">99</span><span class="p">)</span>
<span class="nv">_</span> <span class="k">-> </span><span class="p">(</span><span class="nf">raise</span> <span class="p">(</span><span class="nf">:Bottom</span> <span class="nv">x</span><span class="p">))</span>
<span class="p">)</span>
<span class="nv">except</span>
<span class="p">(</span><span class="nf">:Level3</span> <span class="nv">x</span><span class="p">)</span> <span class="k">-> </span><span class="p">(</span><span class="nf">:pair</span> <span class="s">"level 3"</span> <span class="nv">x</span><span class="p">)</span>
<span class="p">))</span>
<span class="p">(</span><span class="nf">printn</span> <span class="p">(</span><span class="nf">level0</span> <span class="mi">0</span><span class="p">))</span>
<span class="p">(</span><span class="nf">printn</span> <span class="p">(</span><span class="nf">level0</span> <span class="mi">1</span><span class="p">))</span>
<span class="p">(</span><span class="nf">printn</span> <span class="p">(</span><span class="nf">level0</span> <span class="mi">2</span><span class="p">))</span>
<span class="p">(</span><span class="nf">printn</span> <span class="p">(</span><span class="nf">level0</span> <span class="mi">3</span><span class="p">))</span>
<span class="p">(</span><span class="nf">printn</span> <span class="p">(</span><span class="nf">level0</span> <span class="mi">4</span><span class="p">))</span>
<span class="p">(</span><span class="nf">printn</span> <span class="p">(</span><span class="nf">level0</span> <span class="mi">5</span><span class="p">))</span>
</pre></div>Sam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com0tag:blogger.com,1999:blog-5145015724555605651.post-64032794541413804782011-04-06T17:15:00.000-07:002011-04-06T17:15:32.593-07:00exception handling with defmacroI've started work on a coroutine/thread scheduler, code named "Doom". Doing some socket I/O has finally snapped my 'missing feature' threshold - I really need exception handling to do it right.<br />
<br />
Here's my first attempt. The basic syntax is <code>(try <body> except <exception-patterns>)</code>.<br />
<br />
If this works, it'll be a great example of the huge advantages of the lisp syntax and macros - no changes made to the compiler to support it!<br />
<br />
<div class="highlight"><pre><span class="c1">;; -*- Mode: Irken -*-</span>
<span class="p">(</span><span class="nf">include</span> <span class="s">"lib/core.scm"</span><span class="p">)</span>
<span class="p">(</span><span class="nf">include</span> <span class="s">"lib/random.scm"</span><span class="p">)</span>
<span class="c1">;; can we implement an exception system using the macro system?</span>
<span class="c1">;; XXX eventually we'll need to change call/cc to swap out exception</span>
<span class="c1">;; handlers, which probably means making a 'modern' call/cc with</span>
<span class="c1">;; dynamic-wind, etc.</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">base-exception-handler</span> <span class="nv">exn</span><span class="p">)</span>
<span class="p">(</span><span class="nf">error1</span> <span class="s">"uncaught exception"</span> <span class="nv">exn</span><span class="p">))</span>
<span class="p">(</span><span class="k">define </span><span class="nv">*the-exception-handler*</span> <span class="nv">base-exception-handler</span><span class="p">)</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">raise</span> <span class="nv">exn</span><span class="p">)</span>
<span class="p">(</span><span class="nf">*the-exception-handler*</span> <span class="nv">exn</span><span class="p">)</span>
<span class="p">)</span>
<span class="c1">;; what kind of values do we want for exceptions, and how they're caught/compared?</span>
<span class="c1">;; python: started with strings, went to objects of the Exception class.</span>
<span class="c1">;; scheme: SRFI-12 includes something that looks like property lists?</span>
<span class="c1">;; sml: string refs? (in sml/nj according to CwC)</span>
<span class="c1">;; ocaml: declared exception datatypes.</span>
<span class="c1">;; [alternate for ocaml: http://dutherenverseauborddelatable.wordpress.com/downloads/exception-monads-for-ocaml/</span>
<span class="c1">;; sounds a little like my 'fourth idea']</span>
<span class="c1">;; my first temptation is to just use symbols, and stay simple</span>
<span class="c1">;; second idea is to use a record, which theoretically gives you the</span>
<span class="c1">;; ability to attach other kinds of data, which could be very</span>
<span class="c1">;; convenient. Example: (raise BadFD) => {exception='BadFD ...}</span>
<span class="c1">;; third idea: hardcore - define a global exception type, force the user to</span>
<span class="c1">;; extend it. [could be made easier by compiler hacks that allow you to</span>
<span class="c1">;; extend the type anywhere in the source?]. Code would have to wildcard</span>
<span class="c1">;; exceptions it doesn't know about? [or is that the definition of handing</span>
<span class="c1">;; it upstream?]</span>
<span class="c1">;; fourth idea: use records (or polymorphic variants) to define unique</span>
<span class="c1">;; exception names *and* types. For example: (raise (BadFD</span>
<span class="c1">;; current-fd)) would type as {BadFD=int ...} this would make it</span>
<span class="c1">;; more like the SRFI-12 property-list thing (unless I misunderstand</span>
<span class="c1">;; what SRFI-12 is all about). Hmmm... maybe we could extend the</span>
<span class="c1">;; exception handler by extending the record?</span>
<span class="c1">;; It'd be nice if the type system can tell us what exceptions are possible</span>
<span class="c1">;; at any given point in the code... is this maybe impossible due to the</span>
<span class="c1">;; dynamic (vs static/lexical) nature of exception handling?</span>
<span class="c1">;; the more I think about the 'fourth idea' the less reason I see to</span>
<span class="c1">;; restrict the type of exceptions. Syntax-wise, the following macro</span>
<span class="c1">;; does not enforce any restrictions. It's possible though that only</span>
<span class="c1">;; the polymorphic-variant-scheme will survive type checking.</span>
<span class="c1">;; this is the first time I've tried to 'accumulate' pieces in a macro.</span>
<span class="c1">;; it's possible that there's a better idiom that I just haven't discovered yet.</span>
<span class="c1">;; should really look to see how the scheme folks do this (or do they just avoid</span>
<span class="c1">;; it all by wrapping everything in verbose s-expressions).</span>
<span class="p">(</span><span class="nf">defmacro</span> <span class="nv">try</span>
<span class="c1">;; done accumulating body parts, finish up.</span>
<span class="p">(</span><span class="nf">try</span> <span class="p">(</span><span class="k">begin </span><span class="nv">body0</span> <span class="o">...</span><span class="p">)</span> <span class="nv"><except></span> <span class="nv">exn-match</span> <span class="o">...</span><span class="p">)</span>
<span class="k">-> </span><span class="p">(</span><span class="k">let </span><span class="p">((</span><span class="nf">$old-hand</span> <span class="nv">*the-exception-handler*</span><span class="p">))</span>
<span class="p">(</span><span class="nf">set!</span>
<span class="nv">*the-exception-handler*</span>
<span class="p">(</span><span class="k">lambda </span><span class="p">(</span><span class="nf">$exn</span><span class="p">)</span>
<span class="p">(</span><span class="k">set! </span><span class="nv">*the-exception-handler*</span> <span class="nv">$old-hand</span><span class="p">)</span>
<span class="p">(</span><span class="nf">match</span> <span class="nv">$exn</span> <span class="nv">with</span>
<span class="nv">exn-match</span> <span class="o">...</span>
<span class="nv">_</span> <span class="k">-> </span><span class="p">(</span><span class="nf">raise</span> <span class="nv">$exn</span><span class="p">))))</span>
<span class="p">(</span><span class="k">let </span><span class="p">((</span><span class="nf">$result</span> <span class="p">(</span><span class="k">begin </span><span class="nv">body0</span> <span class="o">...</span><span class="p">)))</span>
<span class="p">(</span><span class="k">set! </span><span class="nv">*the-exception-handler*</span> <span class="nv">$old-hand</span><span class="p">)</span>
<span class="nv">$result</span><span class="p">))</span>
<span class="c1">;; accumulating body parts...</span>
<span class="p">(</span><span class="nf">try</span> <span class="p">(</span><span class="k">begin </span><span class="nv">body0</span> <span class="o">...</span><span class="p">)</span> <span class="nv">body1</span> <span class="nv">body2</span> <span class="o">...</span><span class="p">)</span> <span class="k">-> </span><span class="p">(</span><span class="nf">try</span> <span class="p">(</span><span class="k">begin </span><span class="nv">body0</span> <span class="o">...</span> <span class="nv">body1</span><span class="p">)</span> <span class="nv">body2</span> <span class="o">...</span><span class="p">)</span>
<span class="c1">;; begin to accumulate...</span>
<span class="p">(</span><span class="nf">try</span> <span class="nv">body0</span> <span class="nv">body1</span> <span class="o">...</span><span class="p">)</span> <span class="k">-> </span><span class="p">(</span><span class="nf">try</span> <span class="p">(</span><span class="k">begin </span><span class="nv">body0</span><span class="p">)</span> <span class="nv">body1</span> <span class="o">...</span><span class="p">)</span>
<span class="p">)</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">random-barf</span><span class="p">)</span>
<span class="p">(</span><span class="k">if </span><span class="p">(</span><span class="nb">= </span><span class="p">(</span><span class="nf">logand</span> <span class="p">(</span><span class="nf">random</span><span class="p">)</span> <span class="mi">1</span><span class="p">)</span> <span class="mi">1</span><span class="p">)</span>
<span class="p">(</span><span class="nf">raise</span> <span class="p">(</span><span class="nf">:OtherException</span> <span class="mi">99</span><span class="p">))</span>
<span class="mi">7</span><span class="p">))</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">thing</span><span class="p">)</span>
<span class="p">(</span><span class="nf">try</span>
<span class="p">(</span><span class="k">let </span><span class="nv">loop</span> <span class="p">((</span><span class="nf">n</span> <span class="mi">0</span><span class="p">))</span>
<span class="p">(</span><span class="k">if </span><span class="p">(</span><span class="nb">= </span><span class="nv">n</span> <span class="mi">100</span><span class="p">)</span>
<span class="p">(</span><span class="nf">raise</span> <span class="p">(</span><span class="nf">:MyException</span> <span class="mi">12</span><span class="p">))</span>
<span class="p">(</span><span class="nf">begin</span>
<span class="p">(</span><span class="nf">random-barf</span><span class="p">)</span>
<span class="p">(</span><span class="nf">loop</span> <span class="p">(</span><span class="nb">+ </span><span class="nv">n</span> <span class="mi">1</span><span class="p">)))))</span>
<span class="nv">except</span>
<span class="p">(</span><span class="nf">:MyException</span> <span class="nv">value</span><span class="p">)</span> <span class="k">-> </span><span class="nv">value</span>
<span class="p">(</span><span class="nf">:OtherException</span> <span class="nv">_</span><span class="p">)</span> <span class="k">-> </span><span class="mi">9</span>
<span class="p">))</span>
<span class="p">(</span><span class="nf">thing</span><span class="p">)</span>
</pre><pre></pre><pre></pre></div><br />
Here's what the macro expansion looks like for <code>thing</code>:<br />
<br />
<pre>(define (thing)
(let (($old-hand *the-exception-handler*))
(begin
(set!
*the-exception-handler*
(lambda ($exn)
(begin
(set! *the-exception-handler* $old-hand)
(%fatbar
#f
(%nvcase
nil
$exn
(MyException OtherException)
(1 1)
((let-splat
((m9 (%nvget (:MyException 0 1) $exn)))
(let_subst (value m9) value))
9)
(%match-error #f))
(raise $exn)))))
(let (($result
(letrec
((loop
(function loop (n) (if (= n 100) (raise (:MyException 12)) (loop (+ n 1))))))
(loop 0))))
(begin (set! *the-exception-handler* $old-hand) $result)))))</pre>Sam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com5tag:blogger.com,1999:blog-5145015724555605651.post-81138940310662484012011-03-23T17:55:00.000-07:002011-03-23T17:55:46.701-07:00same-fringe demoThis demonstrates how to use generators to solve the same-fringe problem.<div><br />
</div><div>Also, I found a minimalist way of building binary-tree literals using defmacro.</div><div><br />
</div><div><br />
</div><div class="highlight"><pre><span class="c1">;; -*- Mode: Irken -*-</span>
<span class="p">(</span><span class="nf">include</span> <span class="s">"lib/core.scm"</span><span class="p">)</span>
<span class="p">(</span><span class="nf">datatype</span> <span class="nv">btree</span>
<span class="p">(</span><span class="nf">:node</span> <span class="p">(</span><span class="nf">btree</span> <span class="ss">'a</span><span class="p">)</span> <span class="p">(</span><span class="nf">btree</span> <span class="ss">'a</span><span class="p">))</span>
<span class="p">(</span><span class="nf">:leaf</span> <span class="ss">'a</span><span class="p">))</span>
<span class="p">(</span><span class="nf">defmacro</span> <span class="nv">btree/make</span>
<span class="p">(</span><span class="nf">btree/make</span> <span class="p">(</span><span class="nf">l</span> <span class="nv">r</span><span class="p">))</span> <span class="k">-> </span><span class="p">(</span><span class="nf">btree:node</span> <span class="p">(</span><span class="nf">btree/make</span> <span class="nv">l</span><span class="p">)</span> <span class="p">(</span><span class="nf">btree/make</span> <span class="nv">r</span><span class="p">))</span>
<span class="p">(</span><span class="nf">btree/make</span> <span class="nv">x</span><span class="p">)</span> <span class="k">-> </span><span class="p">(</span><span class="nf">btree:leaf</span> <span class="nv">x</span><span class="p">))</span>
<span class="p">(</span><span class="k">define </span><span class="nv">t0</span> <span class="p">(</span><span class="nf">literal</span> <span class="p">(</span><span class="nf">btree/make</span> <span class="p">((</span><span class="mi">0</span> <span class="p">((</span><span class="mi">1</span> <span class="p">(</span><span class="mi">2</span> <span class="p">(</span><span class="mi">3</span> <span class="mi">4</span><span class="p">)))</span> <span class="mi">5</span><span class="p">))</span> <span class="p">(((</span><span class="mi">6</span> <span class="mi">7</span><span class="p">)</span> <span class="p">((</span><span class="mi">8</span> <span class="p">(</span><span class="mi">9</span> <span class="mi">10</span><span class="p">))</span> <span class="mi">11</span><span class="p">))</span> <span class="p">((</span><span class="mi">12</span> <span class="p">(((</span><span class="mi">13</span> <span class="mi">14</span><span class="p">)</span> <span class="mi">15</span><span class="p">)</span> <span class="p">(</span><span class="mi">16</span> <span class="mi">17</span><span class="p">)))</span> <span class="p">(</span><span class="mi">18</span> <span class="mi">19</span><span class="p">)))))))</span>
<span class="p">(</span><span class="k">define </span><span class="nv">t1</span> <span class="p">(</span><span class="nf">literal</span> <span class="p">(</span><span class="nf">btree/make</span> <span class="p">(((</span><span class="mi">0</span> <span class="p">((</span><span class="mi">1</span> <span class="mi">2</span><span class="p">)</span> <span class="mi">3</span><span class="p">))</span> <span class="p">(((</span><span class="mi">4</span> <span class="mi">5</span><span class="p">)</span> <span class="p">(((</span><span class="mi">6</span> <span class="mi">7</span><span class="p">)</span> <span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="mi">9</span> <span class="mi">10</span><span class="p">)))</span> <span class="p">((</span><span class="mi">11</span> <span class="p">((</span><span class="mi">12</span> <span class="mi">13</span><span class="p">)</span> <span class="mi">14</span><span class="p">))</span> <span class="p">((</span><span class="mi">15</span> <span class="p">(</span><span class="mi">16</span> <span class="mi">17</span><span class="p">))</span> <span class="mi">18</span><span class="p">))))</span> <span class="mi">19</span><span class="p">))))</span>
<span class="p">(</span><span class="k">define </span><span class="nv">t2</span> <span class="p">(</span><span class="nf">literal</span> <span class="p">(</span><span class="nf">btree/make</span> <span class="p">(((</span><span class="mi">0</span> <span class="p">((</span><span class="mi">1</span> <span class="mi">2</span><span class="p">)</span> <span class="mi">3</span><span class="p">))</span> <span class="p">(((</span><span class="mi">4</span> <span class="mi">5</span><span class="p">)</span> <span class="p">(((</span><span class="mi">6</span> <span class="mi">7</span><span class="p">)</span> <span class="mi">8</span><span class="p">)</span> <span class="p">(</span><span class="mi">9</span> <span class="mi">10</span><span class="p">)))</span> <span class="p">((</span><span class="mi">88</span> <span class="p">((</span><span class="mi">12</span> <span class="mi">13</span><span class="p">)</span> <span class="mi">14</span><span class="p">))</span> <span class="p">((</span><span class="mi">15</span> <span class="p">(</span><span class="mi">16</span> <span class="mi">17</span><span class="p">))</span> <span class="mi">18</span><span class="p">))))</span> <span class="mi">19</span><span class="p">))))</span>
<span class="p">(</span><span class="k">define </span><span class="nv">btree/inorder</span>
<span class="nv">p</span> <span class="p">(</span><span class="nf">btree:leaf</span> <span class="nv">x</span><span class="p">)</span> <span class="k">-> </span><span class="p">(</span><span class="k">begin </span><span class="p">(</span><span class="nf">p</span> <span class="nv">x</span><span class="p">)</span> <span class="o">#</span><span class="nv">u</span><span class="p">)</span>
<span class="nv">p</span> <span class="p">(</span><span class="nf">btree:node</span> <span class="nv">l</span> <span class="nv">r</span><span class="p">)</span> <span class="k">-> </span><span class="p">(</span><span class="k">begin </span><span class="p">(</span><span class="nf">btree/inorder</span> <span class="nv">p</span> <span class="nv">l</span><span class="p">)</span> <span class="p">(</span><span class="nf">btree/inorder</span> <span class="nv">p</span> <span class="nv">r</span><span class="p">)</span> <span class="o">#</span><span class="nv">u</span><span class="p">))</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">btree/make-generator</span> <span class="nv">t</span><span class="p">)</span>
<span class="p">(</span><span class="nf">make-generator</span>
<span class="p">(</span><span class="k">lambda </span><span class="p">(</span><span class="nf">consumer</span><span class="p">)</span>
<span class="p">(</span><span class="nf">btree/inorder</span>
<span class="p">(</span><span class="k">lambda </span><span class="p">(</span><span class="nf">x</span><span class="p">)</span> <span class="p">(</span><span class="nf">consumer</span> <span class="p">(</span><span class="nf">maybe:yes</span> <span class="nv">x</span><span class="p">)))</span>
<span class="nv">t</span><span class="p">)</span>
<span class="p">(</span><span class="nf">forever</span> <span class="p">(</span><span class="nf">consumer</span> <span class="p">(</span><span class="nf">maybe:no</span><span class="p">))))))</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">same-fringe</span> <span class="nv">t0</span> <span class="nv">t1</span> <span class="nv">=</span><span class="p">)</span>
<span class="p">(</span><span class="k">let </span><span class="p">((</span><span class="nf">g0</span> <span class="p">(</span><span class="nf">btree/make-generator</span> <span class="nv">t0</span><span class="p">))</span>
<span class="p">(</span><span class="nf">g1</span> <span class="p">(</span><span class="nf">btree/make-generator</span> <span class="nv">t1</span><span class="p">)))</span>
<span class="p">(</span><span class="k">let </span><span class="nv">loop</span> <span class="p">((</span><span class="nf">m0</span> <span class="p">(</span><span class="nf">g0</span><span class="p">))</span> <span class="p">(</span><span class="nf">m1</span> <span class="p">(</span><span class="nf">g1</span><span class="p">)))</span>
<span class="p">(</span><span class="nf">match</span> <span class="nv">m0</span> <span class="nv">m1</span> <span class="nv">with</span>
<span class="p">(</span><span class="nf">maybe:yes</span> <span class="nv">v0</span><span class="p">)</span> <span class="p">(</span><span class="nf">maybe:yes</span> <span class="nv">v1</span><span class="p">)</span>
<span class="k">-> </span><span class="p">(</span><span class="k">if </span><span class="p">(</span><span class="nb">= </span><span class="nv">v0</span> <span class="nv">v1</span><span class="p">)</span>
<span class="p">(</span><span class="nf">loop</span> <span class="p">(</span><span class="nf">g0</span><span class="p">)</span> <span class="p">(</span><span class="nf">g1</span><span class="p">))</span>
<span class="p">(</span><span class="nf">print-string</span> <span class="s">"NOT equal\n"</span><span class="p">))</span>
<span class="p">(</span><span class="nf">maybe:no</span><span class="p">)</span> <span class="p">(</span><span class="nf">maybe:no</span><span class="p">)</span>
<span class="k">-> </span><span class="p">(</span><span class="nf">print-string</span> <span class="s">"equal\n"</span><span class="p">)</span>
<span class="nv">_</span> <span class="nv">_</span> <span class="k">-> </span><span class="p">(</span><span class="nf">print-string</span> <span class="s">"unequal size\n"</span><span class="p">)))))</span>
<span class="p">(</span><span class="nf">same-fringe</span> <span class="nv">t0</span> <span class="nv">t1</span> <span class="nv">=</span><span class="p">)</span>
<span class="p">(</span><span class="nf">same-fringe</span> <span class="nv">t0</span> <span class="nv">t2</span> <span class="nv">=</span><span class="p">)</span>
</pre></div>Sam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com0tag:blogger.com,1999:blog-5145015724555605651.post-36196058145315305952011-03-20T18:09:00.000-07:002011-03-20T18:09:36.180-07:00win32 binaryI've tried to make the bootstrapping script work on win32, and I think for the most part I've succeeded. There are issues with mac-specific flags making it into the bootstrap file that I still need to work out.<br />
<br />
I used mingw, but I suspect other gcc binaries will work, too. A <a href="http://dark.nightmare.com/rushing/irken/irken.110320.win32-exe.zip">32-bit executable is now available</a>, which may ease the bootstrapping process for some.Sam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com0tag:blogger.com,1999:blog-5145015724555605651.post-25265268353444397432011-03-17T18:58:00.000-07:002011-03-17T18:58:48.762-07:00description of the compiler and runtimeI've started on a <a href="http://dark.nightmare.com/rushing/irken/irken/self/HACKING.txt">'HACKING.txt'</a> document describing the compiler and runtime.<br />
It should be enough to get folks started on understanding the source.<br />
Feedback appreciated!<br />
<br />
-SamSam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com0tag:blogger.com,1999:blog-5145015724555605651.post-47952538765453429132011-03-17T16:15:00.000-07:002011-03-17T16:15:16.650-07:00self-hosted distributionI've finally put together a 'real' self-hosted <a href="http://dark.nightmare.com/rushing/irken/irken.110317.tar.gz">distribution</a>.<br />
<br />
The tarball is a little bigger - even though I removed all the python code. The reason is that I have to distribute a pre-compiled version of self/compile.c so that the compiler can be bootstrapped.<br />
<br />
Enjoy, and feedback appreciated.<br />
<br />
Here's the text of the README:<br />
<br />
<pre>This is the initial release of the self-hosted compiler.
It's still in a very unpolished state, but you can use it to bootstrap itself.
Just run the script "util/bootstrap.py":
$ python util/bootstrap.py
Which does the following:
1) run gcc on the distributed version of self/compile.c
2) this binary will be used to recompile the compiler.
3) that binary will recompile the compiler again.
4) the output from steps 2 and 3 are compared, they should be identical.
If you're happy with the resulting compiler, you can compile an optimized
version of self/compile.c, but be warned - you'll need a lot of memory and
a lot of time.
I am using dragonegg for optimized builds, and that seems to take about a GB
of memory, and 18 minutes to build. It's important to use '-O2', not '-O',
because '-O' takes 53GB of memory and hours to compile.
Very little documentation exists yet, try 'lang.html' for a brief tutorial.
The best way to get familiar with the language is to read the source code in
the 'self' directory, and browse over the files in "tests".
-Sam
</pre>Sam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com0tag:blogger.com,1999:blog-5145015724555605651.post-5536509998393822222011-03-09T15:03:00.000-08:002011-03-09T15:03:28.313-08:00C compiler issuesAs I came closer to completing the Irken implementation, I noticed that my edit-compile cycle was taking longer and longer. And by that, I don't mean a linear change. At some point a threshold was crossed, such that compiling an optimized binary could take nearly an hour with dragonegg, and much longer with gcc, consuming over 17G of memory while at it!<br />
<br />
After doing some tests, I've identified at least one of the causes: my <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">varref</span> and <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">varset</span> functions.<br />
<br />
A couple of years ago, the compiler output for a varref insn looked like this:<br />
<br />
<span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"> r0 = lenv[1][1][1][0];</span><br />
<br />
Where the variable we are referencing is 3 levels up and at the 0 index. (i.e., a De Bruijn index of (3,0)).<br />
<br />
I noticed that I could write an inline lexical function, varref(), that did this with a loop:<br />
<br />
<span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"> r0 = varref (3, 0);</span><br />
<br />
... which is much cleaner. With -O, gcc, llvm, and dragonegg were all unrolling the constant loop and creating code that was identical to the first version.<br />
<br />
I didn't notice the cost of this convenient feature until my program size got large enough... the compiler sources, when using the inline functions, take 5X as long to compile -O as the first version.<br />
<br />
Also, a 'platform' note: I work on OS X, where the stock compiler is still /usr/bin/gcc. I did some timings for a non-optimized build and discovered that the stock gcc is over twice as fast as either my hand-built gcc-4.5.0, or dragonegg. So for quick edit-compile cycles, I switched back to the stock version. Though it'd be nice to know why the version from Apple is so much faster...Sam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com1tag:blogger.com,1999:blog-5145015724555605651.post-62380571171367972022011-03-09T14:50:00.000-08:002011-03-09T14:50:41.716-08:00Self-hosted![assumed skynet joke here]<br />
<br />
A few days ago, after many weeks of frenetic work, the new self-hosted Irken compiled itself. There's still a lot of work to do, but this major milestone has been reached. The compiler passes the stage1==stage2 test, and I'm now concentrating on various features still missing compared to the python compiler. Also, the edit-compile cycle is still pretty slow, because both the compiler and gcc are slower than need be. Once I have these issues solved and some rudimentary error reporting (the current error reporting might be described as medieval) I'll make an official release.<br />
<br />
I'm also thinking about how to best re-package the system, considering orphaning the python development branch, and starting a whole new repository for the self-hosted system.<br />
<br />
In the meanwhile, if there's anyone out there that would just like to see it compile itself, I rolled a <a href="http://dark.nightmare.com/rushing/irken/irken-self.tar.gz">demo tarball</a> up. Enjoy, and feedback is much appreciated!Sam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com0tag:blogger.com,1999:blog-5145015724555605651.post-64877846820187040102011-02-14T17:20:00.000-08:002011-02-14T17:20:04.510-08:00progress on self-hostingToday the self-hosted Irken compiled its first complete program, all the way through substituting into the header template file, and compiling via gcc. A nice feeling.<br />
<br />
<div class="highlight"><pre><span class="c1">;; -*- Mode: Irken -*-</span>
<span class="p">(</span><span class="nf">datatype</span> <span class="nv">list</span>
<span class="p">(</span><span class="nf">:nil</span><span class="p">)</span>
<span class="p">(</span><span class="nf">:cons</span> <span class="ss">'a</span> <span class="p">(</span><span class="nb">list </span><span class="ss">'a</span><span class="p">))</span>
<span class="p">)</span>
<span class="p">(</span><span class="nf">defmacro</span> <span class="nv">list</span>
<span class="p">(</span><span class="nf">list</span><span class="p">)</span> <span class="nv">-></span> <span class="p">(</span><span class="nf">list:nil</span><span class="p">)</span>
<span class="p">(</span><span class="nb">list </span><span class="nv">x</span> <span class="nv">y</span> <span class="o">...</span><span class="p">)</span> <span class="nv">-></span> <span class="p">(</span><span class="nf">list:cons</span> <span class="nv">x</span> <span class="p">(</span><span class="nb">list </span><span class="nv">y</span> <span class="o">...</span><span class="p">)))</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nb">+ </span><span class="nv">a</span> <span class="nv">b</span><span class="p">)</span>
<span class="p">(</span><span class="nf">%%cexp</span> <span class="p">(</span><span class="nf">int</span> <span class="nv">int</span> <span class="nv">-></span> <span class="nv">int</span><span class="p">)</span> <span class="s">"%0+%1"</span> <span class="nv">a</span> <span class="nv">b</span><span class="p">))</span>
<span class="p">(</span><span class="k">define </span><span class="nv">length</span>
<span class="p">()</span> <span class="nv">-></span> <span class="mi">0</span>
<span class="p">(</span><span class="nf">_</span> <span class="o">.</span> <span class="nv">tl</span><span class="p">)</span> <span class="nv">-></span> <span class="p">(</span><span class="nb">+ </span><span class="mi">1</span> <span class="p">(</span><span class="nb">length </span><span class="nv">tl</span><span class="p">)))</span>
<span class="p">(</span><span class="nb">length </span><span class="p">(</span><span class="nb">list </span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">))</span>
</pre></div><br />
<br />
I also finally got around to checking in the changes from the last couple of weeks, too much slacking off there.Sam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com0tag:blogger.com,1999:blog-5145015724555605651.post-64058308195343003162011-02-02T21:22:00.000-08:002011-02-02T21:22:50.149-08:00Y combinator in IrkenThe second version requires recursive types.<br />
<br />
<br />
<div class="highlight"><pre><span class="c1">;; -*- Mode: Irken -*-</span>
<span class="c1">;; See http://en.wikipedia.org/wiki/Fixed_point_combinator</span>
<span class="p">(</span><span class="nf">include</span> <span class="s">"lib/core.scm"</span><span class="p">)</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">Y</span> <span class="nv">f</span><span class="p">)</span>
<span class="p">(</span><span class="k">lambda </span><span class="p">(</span><span class="nf">x</span><span class="p">)</span>
<span class="p">((</span><span class="nf">f</span> <span class="p">(</span><span class="nf">Y</span> <span class="nv">f</span><span class="p">))</span> <span class="nv">x</span><span class="p">)))</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">factabs</span> <span class="nv">fact</span><span class="p">)</span>
<span class="p">(</span><span class="k">lambda </span><span class="p">(</span><span class="nf">x</span><span class="p">)</span>
<span class="p">(</span><span class="k">if </span><span class="p">(</span><span class="nb">= </span><span class="nv">x</span> <span class="mi">0</span><span class="p">)</span>
<span class="mi">1</span>
<span class="p">(</span><span class="nb">* </span><span class="nv">x</span> <span class="p">(</span><span class="nf">fact</span> <span class="p">(</span><span class="nb">- </span><span class="nv">x</span> <span class="mi">1</span><span class="p">))))))</span>
<span class="p">(</span><span class="nf">printn</span> <span class="p">((</span><span class="nf">Y</span> <span class="nv">factabs</span><span class="p">)</span> <span class="mi">5</span><span class="p">))</span>
<span class="c1">;; and http://caml.inria.fr/pub/docs/u3-ocaml/ocaml-ml.html#toc5</span>
<span class="c1">;; let fix f' = let g f x = f' (f f) x in (g g);;</span>
<span class="c1">;; let fact = fix fact' in fact 5;;</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">Y2</span> <span class="nv">f1</span><span class="p">)</span>
<span class="c1">;; the '^' prefix tells the compiler not to inline this function</span>
<span class="c1">;; which in this particular case would never terminate...</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">^g</span> <span class="nv">f</span><span class="p">)</span>
<span class="p">(</span><span class="k">lambda </span><span class="p">(</span><span class="nf">x</span><span class="p">)</span>
<span class="p">(</span><span class="nf">f1</span> <span class="p">(</span><span class="nf">f</span> <span class="nv">f</span><span class="p">)</span> <span class="nv">x</span><span class="p">)))</span>
<span class="p">(</span><span class="nf">^g</span> <span class="nv">^g</span><span class="p">))</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">fact1</span> <span class="nv">fact</span> <span class="nv">x</span><span class="p">)</span>
<span class="p">(</span><span class="k">if </span><span class="p">(</span><span class="nb">= </span><span class="nv">x</span> <span class="mi">0</span><span class="p">)</span>
<span class="mi">1</span>
<span class="p">(</span><span class="nb">* </span><span class="nv">x</span> <span class="p">(</span><span class="nf">fact</span> <span class="p">(</span><span class="nb">- </span><span class="nv">x</span> <span class="mi">1</span><span class="p">)))))</span>
<span class="p">(</span><span class="nf">printn</span> <span class="p">((</span><span class="nf">Y2</span> <span class="nv">fact1</span><span class="p">)</span> <span class="mi">5</span><span class="p">))</span>
</pre></div><br />
<br />
The function ^g has type <pre>μ(A, (A->b))->(c->d)</pre>Sam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com0tag:blogger.com,1999:blog-5145015724555605651.post-40830675814799522682011-02-02T17:31:00.000-08:002011-02-02T17:31:00.406-08:00recursive types, unification, sharingI'm having trouble with recursive types in my type reconstruction (type inference) algorithm, some kind of combinatorial explosion. I think there are multiple triggers for this: at each call to unify(), I'm running a complete apply-subst(), rather than a level at a time. For the massive types generated by my OO code, this means lots of wasted, repeated work.<br />
<br />
But of course there's even more going on. I think my code that detects recursive types is not working as well as it should, and I think the cause is the lack of 'sharing'. Remy/Pottier use a different style of unification, that involves preserving shared elements of types, and I think this where I'm screwing up: because parts of types are being copied, I'm missing links between identical parts of the graph.<br />
<br />
I found a nice presentation from Remy describing "Core ML", that explains this in (accented) English, rather than a relentless flurry of LaTeX: <a href="http://caml.inria.fr/pub/docs/u3-ocaml/index.html">"Using, Understanding, and Unraveling The OCaml Language: From Practice to Theory and vice versa"</a>.<br />
<br />
The constraint-oriented solver gave me recursive types for free, (because its output types are naturally graphs), but I'm really hoping to avoid going back to that code, because it's much slower and I frankly don't understand it as well as I'd like. Maybe I can pull out just the multi-equation unification stuff, and get back recursive types.Sam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com3tag:blogger.com,1999:blog-5145015724555605651.post-37498053605567120622011-01-31T15:54:00.000-08:002011-01-31T15:57:26.308-08:00simple approach to OO and exploding equirecursive typesWhile trying to port the graph algorithms (Kosaraju's algorithm / topological sort) I got to a point where I really wanted some imperative set/map objects, so I combined a 'class' using alists (for map) and lists (for set) with a few 'methods' put into records.<br />
<br />
After playing with it for a while, I found a fairly neat way to do method calls and object layouts:<br />
<br />
<div class="highlight"><pre><span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">set-class</span><span class="p">)</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">in</span> <span class="nv">self</span> <span class="nv">x</span><span class="p">)</span>
<span class="p">(</span><span class="k">let </span><span class="nv">loop</span> <span class="p">((</span><span class="nf">l</span> <span class="nv">self</span><span class="o">.</span><span class="nv">list</span><span class="p">))</span>
<span class="p">(</span><span class="nf">match</span> <span class="nv">l</span> <span class="nv">with</span>
<span class="p">()</span> <span class="nv">-></span> <span class="no">#f</span>
<span class="p">(</span><span class="nf">hd</span> <span class="o">.</span> <span class="nv">tl</span><span class="p">)</span> <span class="nv">-></span> <span class="p">(</span><span class="k">if </span><span class="p">(</span><span class="nb">eq? </span><span class="nv">hd</span> <span class="nv">x</span><span class="p">)</span>
<span class="no">#t</span>
<span class="p">(</span><span class="nf">loop</span> <span class="nv">tl</span><span class="p">)))))</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">add</span> <span class="nv">self</span> <span class="nv">x</span><span class="p">)</span>
<span class="p">(</span><span class="k">if </span><span class="p">(</span><span class="nf">self::in</span> <span class="nv">x</span><span class="p">)</span>
<span class="o">#</span><span class="nv">u</span>
<span class="p">(</span><span class="k">set! </span><span class="nv">self</span><span class="o">.</span><span class="nv">list</span> <span class="p">(</span><span class="nf">list:cons</span> <span class="nv">x</span> <span class="nv">self</span><span class="o">.</span><span class="nv">list</span><span class="p">))))</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">get</span> <span class="nv">self</span><span class="p">)</span>
<span class="nv">self</span><span class="o">.</span><span class="nv">list</span><span class="p">)</span>
<span class="p">(</span><span class="k">let </span><span class="p">((</span><span class="nf">methods</span> <span class="p">{</span><span class="nv">in=in</span> <span class="nv">add=add</span> <span class="nv">get=get</span><span class="p">}))</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">new</span> <span class="nv">l</span><span class="p">)</span>
<span class="p">{</span><span class="nv">o=methods</span> <span class="nv">list=l</span><span class="p">})</span>
<span class="nv">new</span>
<span class="p">))</span>
</pre></div><br />
This leads to an efficient object layout very similar to Python's: one slot identifies the object type, the other slots hold instance variables. To support this, I added a reader hack that translates<br />
<br />
<br />
<pre>(var::method arg0)</pre><br />
into<br />
<br />
<pre>(var.o.method var arg0)</pre><br />
The result is something that I'm pretty sure will act just like Python's duck typing... for example a generic output function may call the 'write' method on an object, and its type will reflect that need: for a write method of a particular signature. Instead of 'o' I'd probably use a slot name like '__methods__'... this suggests standard repr/str interfaces, and the other duck-typed protocols that Python makes such good use of.<br />
<br />
Pretty cool, huh?<br />
<br />
Well, as soon as I tried actually using this code, the inferencer fell to its knees. I'm not kidding. It went from a 30 second compile to a 20 <i>minute </i>compile, eating up over a GB of memory. Hmmmm... looks like my quick-and-easy implementation of μ-types has problems, surprise. I think I was 'asking for it' by calling each method with a 'self' argument... each method call has to unify the object type twice. And storing objects inside of other objects doesn't help, either.<br />
<br />
Back to the textbook. I'm hoping that reading <a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.4.2276">this paper</a> again will help.Sam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com1tag:blogger.com,1999:blog-5145015724555605651.post-46284633931749369192011-01-26T19:08:00.000-08:002011-01-26T19:08:52.400-08:00self-hosting dilemmaI have nearly the whole compiler rewritten in itself. I've saved one of the nastiest bits for last, the inliner. I've been planning on a new design for the inliner - rather than just copying the Python code. So of course, what happens in this late stage? A bug in the inliner!<br />
<br />
The program types correctly if inlining is turned off, but fails if it's turned on. This is a great example of the gains to be had by typing both before and after phases of the compiler - but unfortunately in this case it means I have to fix a bug in some code that I intend to throw away as soon as it works!<br />
<br />
Theoretically I could just turn <i>off</i> inlining until I'm done, and then write the new inliner in the self-hosted environment - but I can't be absolutely sure that it's not a more complex interaction between the typer and the inliner.Sam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com0tag:blogger.com,1999:blog-5145015724555605651.post-31593964918658111202011-01-20T23:15:00.000-08:002011-01-20T23:15:13.725-08:00bloated C compilersI've been testing Irken out on my netbook (a Samsung NC10), and discovered a big problem with feeding Irken's output to gcc.<br />
<br />
When I tried to compile the self-hosted Irken with -O, the netbook fell on its knees. After about 15 minutes, I decided to look into it... why hadn't the compile finished yet?<br />
<br />
The input Irken source was about 60kB, which exploded into a 920kB C file. The C output is actually pretty low-level stuff, so it's probably comparable to a 300kB hand-written C file. But let's just call it 1MB and move on...<br />
<br />
The gcc (version 4.4) process had eaten 1.4GB of virtual memory, so it was paging on the netbook.<br />
<br />
That's 1600X the size of the C input file. It's nearly 25,000X the size of the Irken input. Wow.<br />
<br />
So as a test I installed dragonegg and gcc-4.5. Quite a difference. The dragonegg compile held at 143MB for most of the compile, and peaked at around 200MB. Only 226X the size of the C input, and a mere 3400X the size of the Irken input.<br />
<br />
But what's an order of magnitude between friends?Sam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com0tag:blogger.com,1999:blog-5145015724555605651.post-32188452903145405822011-01-20T15:27:00.000-08:002011-01-20T15:27:01.562-08:00The datatype + record idiomOne of the difficult decisions when working on a compiler is how to represent nodes. There's a strain between the pure-functional and object-oriented styles that can be really painful. I'm certainly not happy with the way I did it in Irken - a combination of pure-functional class with a bunch of attribute hacks.<br />
<div><br />
</div><div>While working to solve the same problem in self-hosted Irken, I've come across a pretty good idiom that captures the best of both worlds; giving me exactly the right combination of strong typing, mutability, and flexibility that I need.</div><div><br />
</div><div>The trick is to use a record as the node representation, but inside it put a slot that holds all the strongly-typed metadata that you need. Here's what it looks like so far:</div><div><br />
</div><div class="highlight"><pre><span class="c1">;; node type holds metadata related to the node,</span>
<span class="c1">;; but sub-nodes are held with the record.</span>
<span class="p">(</span><span class="nf">datatype</span> <span class="nv">node</span>
<span class="p">(</span><span class="nf">:varref</span> <span class="nv">symbol</span><span class="p">)</span>
<span class="p">(</span><span class="nf">:varset</span> <span class="nv">symbol</span><span class="p">)</span>
<span class="p">(</span><span class="nf">:literal</span> <span class="nv">literal</span><span class="p">)</span>
<span class="p">(</span><span class="nf">:cexp</span> <span class="nv">type</span> <span class="nv">string</span><span class="p">)</span>
<span class="p">(</span><span class="nf">:nvcase</span> <span class="nv">symbol</span> <span class="p">(</span><span class="nb">list </span><span class="nv">symbol</span><span class="p">))</span>
<span class="p">(</span><span class="nf">:sequence</span><span class="p">)</span>
<span class="p">(</span><span class="nf">:if</span><span class="p">)</span>
<span class="p">(</span><span class="nf">:function</span> <span class="nv">symbol</span> <span class="p">(</span><span class="nb">list </span><span class="nv">symbol</span><span class="p">))</span> <span class="c1">;; name formals</span>
<span class="p">(</span><span class="nf">:call</span><span class="p">)</span>
<span class="p">)</span>
<span class="p">(</span><span class="k">define </span><span class="nv">node-counter</span> <span class="p">(</span><span class="nf">make-counter</span> <span class="mi">0</span><span class="p">))</span>
<span class="c1">;; given a list of nodes, add up their sizes (+1)</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">sum-size</span> <span class="nv">l</span><span class="p">)</span>
<span class="p">(</span><span class="nf">fold</span> <span class="p">(</span><span class="k">lambda </span><span class="p">(</span><span class="nf">n</span> <span class="nv">acc</span><span class="p">)</span> <span class="p">(</span><span class="nb">+ </span><span class="nv">n</span><span class="o">.</span><span class="nv">s</span> <span class="nv">acc</span><span class="p">))</span> <span class="mi">1</span> <span class="nv">l</span><span class="p">))</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">node</span> <span class="nv">t</span> <span class="nv">subs</span><span class="p">)</span>
<span class="p">{</span><span class="nv">t=t</span> <span class="nv">subs=subs</span> <span class="nv">s=</span><span class="p">(</span><span class="nb">+ </span><span class="p">(</span><span class="nf">sum-size</span> <span class="nv">subs</span><span class="p">)</span> <span class="mi">1</span><span class="p">)</span> <span class="nv">id=</span><span class="p">(</span><span class="nf">node-counter</span><span class="o">.</span><span class="nv">inc</span><span class="p">)</span> <span class="nv">type=</span><span class="p">(</span><span class="nf">type:base</span> <span class="ss">'?</span><span class="p">)}</span>
<span class="p">)</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">node/varref</span> <span class="nv">name</span><span class="p">)</span>
<span class="p">(</span><span class="nf">node</span> <span class="p">(</span><span class="nf">node:varref</span> <span class="nv">name</span><span class="p">)</span> <span class="o">'</span><span class="p">()))</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">node/varset</span> <span class="nv">name</span> <span class="nv">val</span><span class="p">)</span>
<span class="p">(</span><span class="nf">node</span> <span class="p">(</span><span class="nf">node:varset</span> <span class="nv">name</span><span class="p">)</span> <span class="p">(</span><span class="nf">LIST</span> <span class="nv">val</span><span class="p">)))</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">node/literal</span> <span class="nv">lit</span><span class="p">)</span>
<span class="p">(</span><span class="nf">node</span> <span class="p">(</span><span class="nf">node:literal</span> <span class="nv">lit</span><span class="p">)</span> <span class="o">'</span><span class="p">()))</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">node/cexp</span> <span class="nv">type</span> <span class="nv">template</span> <span class="nv">args</span><span class="p">)</span>
<span class="p">(</span><span class="nf">node</span> <span class="p">(</span><span class="nf">node:cexp</span> <span class="nv">type</span> <span class="nv">template</span><span class="p">)</span> <span class="nv">args</span><span class="p">))</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">node/sequence</span> <span class="nv">subs</span><span class="p">)</span>
<span class="p">(</span><span class="nf">node</span> <span class="p">(</span><span class="nf">node:sequence</span><span class="p">)</span> <span class="nv">subs</span><span class="p">))</span>
</pre></div><br />
<div>Note how the node constructor initializes the other slots in the record. The interesting one for me today is the type field. This is later filled in by the typer, succinctly:</div><br />
<div class="highlight"><pre><span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">type-of</span> <span class="nv">exp</span> <span class="nv">tenv</span><span class="p">)</span>
<span class="p">(</span><span class="k">let </span><span class="p">((</span><span class="nf">t</span> <span class="p">(</span><span class="nf">type-of*</span> <span class="nv">exp</span> <span class="nv">tenv</span><span class="p">)))</span>
<span class="p">(</span><span class="k">set! </span><span class="nv">exp</span><span class="o">.</span><span class="nv">type</span> <span class="nv">t</span><span class="p">)</span>
<span class="nv">t</span><span class="p">))</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">apply-subst-to-program</span> <span class="nv">n</span><span class="p">)</span>
<span class="p">(</span><span class="k">set! </span><span class="nv">n</span><span class="o">.</span><span class="nv">type</span> <span class="p">(</span><span class="nf">apply-subst-to-type</span> <span class="nv">n</span><span class="o">.</span><span class="nv">type</span><span class="p">))</span>
<span class="p">(</span><span class="nb">for-each </span><span class="nv">apply-subst-to-program</span> <span class="nv">n</span><span class="o">.</span><span class="nv">subs</span><span class="p">))</span>
<span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">type-program</span> <span class="nv">node</span><span class="p">)</span>
<span class="p">(</span><span class="k">let </span><span class="p">((</span><span class="nf">t</span> <span class="p">(</span><span class="nf">type-of</span> <span class="nv">node</span> <span class="p">(</span><span class="nf">alist/make</span><span class="p">))))</span>
<span class="p">(</span><span class="nf">apply-subst-to-program</span> <span class="nv">node</span><span class="p">)</span>
<span class="nv">t</span><span class="p">))</span>
</pre></div><br />
We've still preserved the strong typing of nodes, though: here's what a typical node-visiting function looks like:<br />
<br />
<div class="highlight"><pre><span class="p">(</span><span class="k">define </span><span class="p">(</span><span class="nf">compile</span> <span class="nv">tail?</span> <span class="nv">exp</span> <span class="nv">lenv</span> <span class="nv">k</span><span class="p">)</span>
<span class="c1">;; override continuation when in tail position</span>
<span class="p">(</span><span class="k">if </span><span class="nv">tail?</span>
<span class="p">(</span><span class="k">set! </span><span class="nv">k</span> <span class="p">(</span><span class="nf">cont</span> <span class="p">(</span><span class="nf">k/free</span> <span class="nv">k</span><span class="p">)</span> <span class="nv">gen-return</span><span class="p">)))</span>
<span class="p">(</span><span class="nf">match</span> <span class="nv">exp</span><span class="o">.</span><span class="nv">t</span> <span class="nv">with</span>
<span class="p">(</span><span class="nf">node:literal</span> <span class="nv">lit</span><span class="p">)</span> <span class="nv">-></span> <span class="p">(</span><span class="nf">c-literal</span> <span class="nv">lit</span> <span class="nv">k</span><span class="p">)</span>
<span class="p">(</span><span class="nf">node:sequence</span><span class="p">)</span> <span class="nv">-></span> <span class="p">(</span><span class="nf">c-sequence</span> <span class="nv">tail?</span> <span class="nv">exp</span><span class="o">.</span><span class="nv">subs</span> <span class="nv">lenv</span> <span class="nv">k</span><span class="p">)</span>
<span class="p">(</span><span class="nf">node:if</span><span class="p">)</span> <span class="nv">-></span> <span class="p">(</span><span class="nf">c-conditional</span> <span class="nv">tail?</span> <span class="nv">exp</span> <span class="nv">lenv</span> <span class="nv">k</span><span class="p">)</span>
<span class="p">(</span><span class="nf">node:function</span> <span class="nv">name</span> <span class="nv">formals</span><span class="p">)</span> <span class="nv">-></span> <span class="p">(</span><span class="nf">c-function</span> <span class="nv">name</span> <span class="nv">formals</span> <span class="p">(</span><span class="nb">car </span><span class="nv">exp</span><span class="o">.</span><span class="nv">subs</span><span class="p">)</span> <span class="nv">lenv</span> <span class="nv">k</span><span class="p">)</span>
<span class="p">(</span><span class="nf">node:varref</span> <span class="nv">name</span><span class="p">)</span> <span class="nv">-></span> <span class="p">(</span><span class="nf">c-varref</span> <span class="nv">name</span> <span class="nv">lenv</span> <span class="nv">k</span><span class="p">)</span>
<span class="p">(</span><span class="nf">node:varset</span> <span class="nv">name</span><span class="p">)</span> <span class="nv">-></span> <span class="p">(</span><span class="nf">c-varset</span> <span class="nv">name</span> <span class="p">(</span><span class="nb">car </span><span class="nv">exp</span><span class="o">.</span><span class="nv">subs</span><span class="p">)</span> <span class="nv">lenv</span> <span class="nv">k</span><span class="p">)</span>
<span class="p">(</span><span class="nf">node:cexp</span> <span class="nv">sig</span> <span class="nv">template</span><span class="p">)</span> <span class="nv">-></span> <span class="p">(</span><span class="nf">c-cexp</span> <span class="nv">sig</span> <span class="nv">template</span> <span class="nv">exp</span><span class="o">.</span><span class="nv">subs</span> <span class="nv">lenv</span> <span class="nv">k</span><span class="p">)</span>
<span class="p">(</span><span class="nf">node:call</span><span class="p">)</span> <span class="nv">-></span> <span class="p">(</span><span class="nf">c-call</span> <span class="nv">tail?</span> <span class="nv">exp</span> <span class="nv">lenv</span> <span class="nv">k</span><span class="p">)</span>
<span class="nv">_</span> <span class="nv">-></span> <span class="p">(</span><span class="k">begin </span><span class="p">(</span><span class="nf">pp-node</span> <span class="nv">exp</span> <span class="mi">0</span><span class="p">)</span> <span class="p">(</span><span class="nf">error1</span> <span class="s">"NYI"</span> <span class="nv">exp</span><span class="p">))</span>
<span class="p">)</span>
<span class="p">)</span>
</pre></div><br />
Note how metadata that's specific to each node type is stored in the datatype (<span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">exp.t</span>), but generic info (like the list of sub-nodes, or the size) is stored in the record.Sam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com0tag:blogger.com,1999:blog-5145015724555605651.post-88856309373600876292011-01-18T17:34:00.001-08:002011-01-18T17:35:21.777-08:00making the match compiler smarterThe match compiler developed a little problem. It's amazing that it didn't show up earlier. As described in chapter 5 of Peyton-Jones' book, "<a href="http://research.microsoft.com/en-us/um/people/simonpj/papers/slpj-book-1987/">The Implementation of Functional Programming Languages"</a>, certain 'obvious' or 'optimized' pattern matches actually compile to worse code.<br />
<br />
The following function is from the book (pg 88), and is similar to an implementation of map2.<br />
<br />
<div class="highlight"><pre><span class="p">(</span><span class="k">define </span><span class="nv">demo</span>
<span class="p">()</span> <span class="nv">ys</span> <span class="nv">-></span> <span class="p">(</span><span class="nf">:A</span> <span class="nv">ys</span><span class="p">)</span>
<span class="nv">xs</span> <span class="p">()</span> <span class="nv">-></span> <span class="p">(</span><span class="nf">:B</span> <span class="nv">xs</span><span class="p">)</span>
<span class="p">(</span><span class="nf">x</span> <span class="o">.</span> <span class="nv">xs</span><span class="p">)</span> <span class="p">(</span><span class="nf">y</span> <span class="o">.</span> <span class="nv">ys</span><span class="p">)</span> <span class="nv">-></span> <span class="p">(</span><span class="nf">:C</span> <span class="nv">x</span> <span class="nv">xs</span> <span class="nv">y</span> <span class="nv">ys</span><span class="p">)</span>
<span class="p">)</span></pre></div><br />
The second match clause will actually compile more efficiently if you use <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">(x . xs)</span> rather than a variable, because the match compiler knows in this case you're matching a cons cell. (it avoids the 'mixture rule' and uses only the 'constructor rule').<br />
<br />
Regardless of this optimization, I ran into a more significant issue. If I use <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">xs</span>, I get an 'incomplete match' error, because the code generated for this match has to examine the first list twice - in the first test it covers only the <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">nil</span> case. In the second test, it covers the <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">cons</span> case, but has an 'else' clause that generates a match error.<br />
<br />
Here's what the node tree looks like:<br />
<br />
<pre>290 L primapp [33] ('%%fatbar', None) ?
262 L nvcase [5] ('list', ['nil']) ?
258 L varref [1] m0_7_i0 ?
260 L primapp [2] ('%vcon/A/1', None) ?
259 L varref [1] m1_8_i1 ?
261 L primapp [1] ('%%fail', None) ?
289 L primapp [27] ('%%fatbar', None) ?
267 L nvcase [5] ('list', ['nil']) ?
263 L varref [1] m1_8_i1 ?
265 L primapp [2] ('%vcon/B/1', None) ?
264 L varref [1] m0_7_i0 ?
266 L primapp [1] ('%%fail', None) ?
288 L nvcase [21] ('list', ['cons']) ?
268 L varref [1] m0_7_i0 ?
286 L let_splat [18] [{m2_9_i0}, {m3_10_i0}] ?
270 L primapp [2] ('%nvget/list/cons/0', None) ?
269 L varref [1] m0_7_i0 ?
272 L primapp [2] ('%nvget/list/cons/1', None) ?
271 L varref [1] m0_7_i0 ?
285 L nvcase [13] ('list', ['cons']) ?
273 L varref [1] m1_8_i1 ?
283 L let_splat [10] [{m4_11_i0}, {m5_12_i0}] ?
275 L primapp [2] ('%nvget/list/cons/0', None) ?
274 L varref [1] m1_8_i1 ?
277 L primapp [2] ('%nvget/list/cons/1', None) ?
276 L varref [1] m1_8_i1 ?
282 L primapp [5] ('%vcon/C/4', None) ?
278 L varref [1] m2_9_i0 ?
279 L varref [1] m3_10_i0 ?
280 L varref [1] m4_11_i0 ?
281 L varref [1] m5_12_i0 ?
284 L primapp [1] ('%%match-error', None) ?
287 L primapp [1] ('%%match-error', None) ?
</pre><br />
...and in a pseudo-python:<br />
<div class="highlight"><pre><span class="k">if</span> <span class="n">nil</span><span class="p">(</span><span class="n">m0</span><span class="p">):</span>
<span class="n">A</span><span class="p">()</span>
<span class="k">elif</span> <span class="n">nil</span><span class="p">(</span><span class="n">m1</span><span class="p">):</span>
<span class="n">B</span><span class="p">()</span>
<span class="k">else</span><span class="p">:</span>
<span class="k">if</span> <span class="n">cons</span><span class="p">(</span><span class="n">m0</span><span class="p">):</span>
<span class="k">if</span> <span class="n">cons</span><span class="p">(</span><span class="n">m1</span><span class="p">):</span>
<span class="n">C</span><span class="p">()</span>
<span class="k">else</span><span class="p">:</span>
<span class="k">raise</span> <span class="n">Error</span>
<span class="k">else</span><span class="p">:</span>
<span class="k">raise</span> <span class="n">Error</span>
</pre></div><br />
The problem is that the match is actually exhaustive, because the nil case was examined already. <span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">%%match-error</span> in Irken is actually a kind of sentinel, the back end doesn't know how to compile it, so you can't actually compile code that's not exhaustive.<br />
<br />
To fix this, I added an extra pass to the analysis phase that builds environments of 'already tested alternatives' for each nvcase variable. This actually killed two birds with one stone. In many cases, the innermost nvcase test is completely unnecessary - for this example, we've already tested for nil, and actually testing for a cons is redundant. In this case we simply remove the nvcase test (and the erroneous match-error), and return the body.<br />
<br />
The code then generated for map is closer to what a human might write:<br />
<br />
pseudo-python:<br />
<br />
<div class="highlight"><pre><span class="k">if</span> <span class="n">nil</span><span class="p">(</span><span class="n">m0</span><span class="p">):</span>
<span class="n">A</span><span class="p">()</span>
<span class="k">elif</span> <span class="n">nil</span><span class="p">(</span><span class="n">m1</span><span class="p">):</span>
<span class="n">B</span><span class="p">()</span>
<span class="k">else</span><span class="p">:</span>
<span class="n">C</span><span class="p">()</span>
</pre></div><br />
RTL:<br />
<br />
<pre>0 = constructed [] 1
1 = constructed [] 0
2 = fatbar 'fatbar_0' []
2 = move [0, None] 'm0_7_i0'
2 = nvcase 'list' [2]
2 = move [1, None] 'm1_8_i1'
2 = primop [2] ('%make-tuple', 'A', 0)
return [2] None
fail [] ('fatbar_0', 0)
return [2] None
2 = fatbar 'fatbar_1' []
2 = move [1, None] 'm1_8_i1'
2 = nvcase 'list' [2]
2 = move [0, None] 'm0_7_i0'
2 = primop [2] ('%make-tuple', 'B', 1)
return [2] None
fail [] ('fatbar_1', 0)
return [2] None
2 = move [0, None] 'm0_7_i0'
2 = primop [2] ('%vget', '0')
3 = move [0, None] 'm0_7_i0'
3 = primop [3] ('%vget', '1')
4 = move [1, None] 'm1_8_i1'
4 = primop [4] ('%vget', '0')
5 = move [1, None] 'm1_8_i1'
5 = primop [5] ('%vget', '1')
6 = move [2, None] 'm2_9_i0'
7 = move [3, None] 'm3_10_i0'
8 = move [4, None] 'm4_11_i0'
9 = move [5, None] 'm5_12_i0'
6 = primop [6, 7, 8, 9] ('%make-tuple', 'C', 2)
return [6] None
return [2] None
return [2] None
</pre>Sam Rushinghttp://www.blogger.com/profile/13115847299260965994noreply@blogger.com0