Monday, January 9, 2012

thinking about an llvm backend, again

I've been very happy with my other project that uses LLVM as the backend for a compiler.
I've also learned a lot about LLVM's IR design.

I can't help but think that it might only take a week or two to write an LLVM backend for Irken.
However, there are a few issues.

  1. My CPS output assumes that each insn has a result that is put into a register.  This doesn't fit LLVM's model, where an argument to an insn can be either an immediate or a register.  In fact, LLVM has no insn that simply puts an immediate into a register.
  2. LLVM doesn't like gcc's goto-label feature.  I think they implemented it reluctantly, because it doesn't fit the SSA model very well.  The majority of funcalls by Irken are to known functions, which translate to a branch to a known label.  However, whenever higher-order functions are used, the call to an unknown function leads to something like "goto *r0;" in C, which turns into an indirectbr insn in LLVM.  LLVM implements this using a gigantic phi/indirectbr table at the very end of the function.  Maybe this isn't really a problem - it just looks bad.
  3. The extremely useful %cexp primitive would go away!  There's something to be said for being able to refer to OS constants, OS functions, macros, etc. with such a simple mechanism.  I'd probably just have to let it go, and force users to write stubs.
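
For readers who haven't seen the goto-label feature from #2, here is a minimal sketch of it in C.  It is not Irken's actual output: the function name run, the funs table, and the two labels are made up for illustration.  It relies on the labels-as-values extension (&&label and goto *ptr), so it needs gcc or clang.

```c
#include <assert.h>

/* Hypothetical stand-in for generated code: one big C function in which
   every source-level function has become a label. */
int run(int which, int x)
{
    /* Label addresses via the && operator (a GCC extension, also in Clang). */
    static void *funs[] = { &&fun_double, &&fun_negate };
    void *r0;
    int result;

    assert(which >= 0 && which < 2);
    r0 = funs[which];   /* r0 now holds a "function value" */

    /* Call to an unknown function: jump through the register. */
    goto *r0;

fun_double:
    result = x * 2;
    goto done;          /* known target: a plain branch to a known label */

fun_negate:
    result = -x;
    goto done;

done:
    return result;
}
```

Calling run(0, 21) jumps through r0 to fun_double, and run(1, 5) to fun_negate.  Every label whose address is taken is a possible target of the "goto *r0;", and in LLVM the indirectbr insn has to list its possible destinations, which is presumably where the big table comes from.
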

I think #1 can be dealt with by adding an extra pass that translates between the two representations.  (Maybe I should revisit the CPS==SSA paper?)
Another approach might be to use an annoying alloca/store/load sequence for each register and hope that mem2reg does a good job.
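
To make the extra-pass idea for #1 concrete, here is a rough sketch in C.  Everything in it is hypothetical (the insn and operand structs, the two opcodes, MAX_REGS, fold_literals); it is not Irken's real CPS representation.  It also assumes straight-line code with no branches, which a real pass could not assume.  The idea is that an insn which only loads a literal into a register gets dropped, and later users of that register get the literal as an immediate operand instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_REGS 16

/* Hypothetical register-style IR: every insn writes its result to `dst`. */
typedef enum { OP_LIT, OP_ADD } opcode;

typedef struct {
    bool is_imm;  /* true: value is a literal; false: value is a register number */
    int value;
} operand;

typedef struct {
    opcode op;
    int dst;       /* target register */
    operand a, b;  /* OP_LIT uses only a, which is always an immediate */
    bool dead;     /* set once the pass has folded this insn away */
} insn;

/* Replace a register operand with its literal, if the register holds one. */
static void fold_operand(operand *o, const bool *known, const int *lit)
{
    if (o->is_imm)
        return;
    assert(o->value >= 0 && o->value < MAX_REGS);
    if (known[o->value]) {
        o->is_imm = true;
        o->value = lit[o->value];
    }
}

/* Fold literal loads into their users; returns the number of insns removed.
   Assumes straight-line code (no control flow). */
size_t fold_literals(insn *code, size_t n)
{
    bool known[MAX_REGS] = { false };
    int lit[MAX_REGS] = { 0 };
    size_t removed = 0;

    for (size_t i = 0; i < n; i++) {
        insn *in = &code[i];
        assert(in->dst >= 0 && in->dst < MAX_REGS);
        if (in->op == OP_LIT) {
            /* Remember the literal and drop the insn. */
            known[in->dst] = true;
            lit[in->dst] = in->a.value;
            in->dead = true;
            removed++;
        } else {
            fold_operand(&in->a, known, lit);
            fold_operand(&in->b, known, lit);
            /* dst now holds a computed value, not a literal. */
            known[in->dst] = false;
        }
    }
    return removed;
}
```

With this, a sequence like "r0 = lit 3; r1 = lit 4; r2 = add r0, r1" collapses to the single insn "r2 = add 3, 4", which is the operand shape LLVM expects.
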
