Monday, November 16, 2009

The Resurrected Lexer

I have resurrected the lexer after reworking a lot of the supporting library code, including vectors, symbols, the symbol table, and lists.

The new variant datatypes seem very comfortable to me. If you'd like to see the lexer working, compile and execute tests/t20.scm:

[rushing@dark ~/src/irken]$ tests/t20
{u8 newline ""}
{u8 keyword "def"}
{u8 whitespace " "}
{u8 ident "thing"}
{u8 whitespace " "}
{u8 lparen ""}
{u8 ident "x"}
{u8 rparen ""}
{u8 colon ""}
{u8 newline ""}
{u8 whitespace " "}
{u8 ident "return"}
{u8 whitespace " "}
{u8 ident "x"}
{u8 whitespace " "}
{u8 addop "+"}
{u8 whitespace " "}
{u8 number "5"}
{u8 newline ""}
#t
{total ticks: 28843224 gc ticks: 0}


Note that "{u8 ...}" is how the printer outputs a value built with the (:token kind value) constructor. Each variant constructor is assigned a unique integer tag. You'll see the same (somewhat confusing) output when playing with lists. I need to either pre-allocate (and thus reserve) certain constructor names, or quit using the builtin printer code (which is in C). I'm not sure it's practical to teach the C printer how to print variants, since users can use them in different ways for different purposes.
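
To make the printing problem concrete, here is a minimal sketch in C of a printer that knows only a variant's integer tag and not its constructor name. This is not Irken's actual runtime code: the struct layout and the names `variant_obj` and `format_variant` are invented for illustration, the fields are stored as pre-rendered strings to keep it short, and reading "u8" as "tag number 8" is an assumption based on the output above.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical object layout, for illustration only (not Irken's real one).
   The constructor name (:token) is gone by run time; only the tag survives. */
typedef struct {
    int tag;              /* integer assigned to the variant constructor */
    size_t nfields;       /* number of fields carried by the constructor */
    const char **fields;  /* fields, already rendered as text */
} variant_obj;

/* Write "{u<tag> field1 field2 ...}" into out.
   Returns the number of characters written, or -1 if out is too small. */
int format_variant(char *out, size_t cap, const variant_obj *v)
{
    size_t used;
    int n = snprintf(out, cap, "{u%d", v->tag);
    if (n < 0 || (size_t)n >= cap) return -1;
    used = (size_t)n;

    for (size_t i = 0; i < v->nfields; i++) {
        n = snprintf(out + used, cap - used, " %s", v->fields[i]);
        if (n < 0 || (size_t)n >= cap - used) return -1;
        used += (size_t)n;
    }

    n = snprintf(out + used, cap - used, "}");
    if (n < 0 || (size_t)n >= cap - used) return -1;
    return (int)(used + (size_t)n);
}
```

Given a tag of 8 and the two fields `keyword` and `"def"`, this produces the same shape as the second line of the output above. A printer like this has no way to say ":token" unless the names are reserved ahead of time or a name table is passed to it, which is the trade-off described above.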
