-
Notifications
You must be signed in to change notification settings - Fork 0
Home
Zapp is a packrat parser generator, that is written in zig and generates parsers written in zig.
It generates byte level parsers (so no unicode), which do not accept the null byte as input (because the generated parser parsers uses null terminated strings).
A packrat parser is a top-down parser (like recursive descent). Unlike recursive descent it has unlimited lookahead, meaning it can accept more complicated grammars.
Packrat parsers solve the problem that a basic unlimited lookahead top-down parser would have through memoization (storing the result of already parsed rules).
NOTE: because it is a top down parser and no special mechanism is implemented in zapp, left recursions are not allowed in grammars.
This results in an
NOTE: in some case because of the way the grammar is structured memoization can be disabled (see parser interface)
For this reason packrat parsers do not work well for very large files of flat data. They start to make sense for complicated grammars of human generated code (because the files are generally smaller).
Additionally they do not need a tokenizer, which means that the Grammar can contain ambiguous tokens.
A parser Expression Grammar is the standard grammar format used for the generation of packrat parsers. They are characterized through an operation called ordered choice which removes ambiguity from rules (which is common in LR parsers). This can create a problem though as the order of sequences means that some rules cannot be matched if not enough attention is paid to the grammar.
The syntax of the grammars accepted by zapp is described here.