My first sneaky plan was to generate a set of parsing tables, and then implement the parser engine in Irken. [This is similar to my approach with the lexer - the lexer tables are generated by Python code, and emit Irken code]. I like dparser, but after looking at the parsing engine, it looked a bit intense for my purposes. I then started playing with Earley parsers. You can write an Earley parser in an amazingly small amount of code.
But Earley parsers aren't really table-driven, and it sounds like performance will be an issue.
Then I discovered Jason Evans' "Parsing.py" module. It's perfect. It generates a bog standard set of LR(1) tables. However, there was still a big gap between Python/Grammar/Grammar and the input needed by Parsing.py (which consists of a class for each terminal and non-terminal, and a method for each reduction - annotated with docstring-based reduction rules).
To fill that gap, I needed two things:
- A parser for the python meta-grammar syntax.
- A translator from that grammar into a Parsing.py parser.
[Thanks to John Plevyak for assistance with dparser]
Keep up the good work!
ReplyDelete