I've also learned a lot about LLVM's IR design.
I can't help but think that it might only take a week or two to write a back end for Irken.
However, there are a few issues.
- My CPS output assumes that each insn has a result that is put into a register. This doesn't fit LLVM's model, which assumes that arguments to insns can be either an immediate or a register. In fact, LLVM has no insn whose only job is to put an immediate into a register.
- LLVM doesn't like gcc's goto-label feature. I think they implemented it reluctantly, because it doesn't fit the SSA model very well. The majority of funcalls in Irken are to known functions, which translate to a branch to a known label. However, whenever a function is called through a variable (as happens with higher-order functions), this leads to something like "goto *r0;" in C, which turns into an indirectbr insn in LLVM. LLVM implements this using a gigantic phi/indirectbr table at the very end of the function. Maybe this isn't really a problem - it just looks bad.
- The extremely useful %cexp primitive would go away! There's something to be said for being able to refer to OS constants, OS functions, macros, etc... with such a simple mechanism. I'd probably just have to let it go, and force users to write stubs.
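To make the second issue concrete, here is a minimal sketch of the two kinds of call in C. It is not actual Irken output: the function `run`, the labels, and the `entry` table are all made up for illustration, and it needs gcc or clang because it relies on the labels-as-values extension.

```c
#include <assert.h>

/* Hypothetical miniature: several "functions" live as labels inside one
   big C function, and "registers" are plain locals. */
int run(int which, int x)
{
    /* Label addresses stored as data (gcc/clang extension). */
    static void *entry[] = { &&fun_inc, &&fun_dbl };
    void *r0;          /* register holding a callee not known at compile time */
    void *k;           /* continuation: where the callee jumps when it is done */
    int result = x;

    assert(which == 0 || which == 1);

    /* Call to a known function: a direct branch to a known label. */
    k = &&after_first;
    goto fun_inc;

after_first:
    /* Call through a variable: the target is only known at run time,
       so this is the case that becomes indirectbr in LLVM. */
    r0 = entry[which];
    k = &&done;
    goto *r0;

fun_inc:
    result += 1;
    goto *k;           /* returning is also an indirect jump */

fun_dbl:
    result *= 2;
    goto *k;

done:
    return result;
}
```

Here `run(0, 3)` increments twice, and `run(1, 3)` increments and then doubles. Every `goto *` in the function can in principle reach any label whose address was taken, which is why the indirect-branch table gets big.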
I think the first issue can be dealt with by adding an extra pass that translates between the two representations. (Maybe I should revisit the CPS==SSA paper?)
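A rough sketch of what such a pass might look like, in C. Everything here is hypothetical (the `insn` and `operand` types, the three opcodes, `fold_literals`), and it assumes a single straight-line block. Real CPS output has branches and reuses registers across them, so a real pass would need flow information that this one doesn't have.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_REGS 16

enum op   { OP_LIT, OP_ADD, OP_RET };   /* made-up opcodes */
enum kind { NONE, IMM, REG };           /* operand is unused, immediate, or register */

struct operand { enum kind kind; int value; };  /* value: literal or register number */

struct insn {
    enum op op;
    int dst;                 /* target register, or -1 if there is none */
    struct operand a, b;
    bool dead;               /* set when the insn is no longer needed */
};

/* If the operand names a register currently known to hold a literal,
   rewrite it into an immediate. */
static void fold_operand(struct operand *o, const bool *known, const int *lit)
{
    if (o->kind != REG) return;
    assert(o->value >= 0 && o->value < MAX_REGS);
    if (known[o->value]) {
        int reg = o->value;
        o->kind = IMM;
        o->value = lit[reg];
    }
}

/* Walk one straight-line block. Uses of "literal in a register" become
   immediates, and the OP_LIT insns are marked dead. Returns the number
   of insns marked dead. */
int fold_literals(struct insn *code, int n)
{
    bool known[MAX_REGS] = { false };
    int lit[MAX_REGS] = { 0 };
    int removed = 0;

    for (int i = 0; i < n; i++) {
        struct insn *in = &code[i];
        fold_operand(&in->a, known, lit);
        fold_operand(&in->b, known, lit);
        if (in->dst < 0) continue;
        assert(in->dst < MAX_REGS);
        if (in->op == OP_LIT) {
            /* Remember the literal; every later use in this block is folded. */
            known[in->dst] = true;
            lit[in->dst] = in->a.value;
            in->dead = true;
            removed++;
        } else {
            /* Register overwritten by a computed value: forget the literal. */
            known[in->dst] = false;
        }
    }
    return removed;
}
```

Given `r0 = lit 2; r1 = lit 3; r2 = add r0, r1; ret r2`, this marks both `lit` insns dead and leaves the add with two immediate operands, which is closer to the form LLVM wants.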
Another approach might be to use an annoying alloca/store/load sequence and hope that mem2reg does a good job.