
pacc — a compiler-compiler

Release pacc-0.3

A new stable release is now available. Users of Fedora, CentOS, or RHEL can install this release from a Copr repository. The development source is still available from GitHub and GitLab.

pacc is a compiler-compiler, somewhat like yacc (or bison). Its input is a description of a grammar, and its output is a C function that recognizes strings of that grammar. The significant technical difference is this: yacc reads a context-free grammar (CFG), and writes a LALR(1) parser; pacc reads a parsing expression grammar (PEG), and writes a packrat parser.

PEGs and packrat parsing offer several advantages over CFGs.

  • There is no need for a two-level structure with a separate lexer (this is essentially a misfeature of CFGs: they are unable to express standard tokenization rules naturally).
  • PEGs can “look ahead” in the input as far as they need to.
  • Despite arbitrary look-ahead, packrat parsers are linear in time and space complexity: O(n) in the size of the input. (LALR(1) parsers also run in O(n) time, but with only a single token of look-ahead; general CFG parsing algorithms such as Earley's are O(n³) in the worst case.)
  • PEGs are easy to understand, and pleasant to work with.
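
To make the points above concrete, here is a minimal hand-written sketch of the packrat technique in C. It is not code generated by pacc and does not use pacc's grammar syntax; the grammar, the names (parse_expr, apply, struct cell), and the fixed MAXLEN limit are all assumptions chosen for illustration. It shows a PEG for arithmetic handled without a separate lexer, and a memo table with one cell per (rule, position), which is what keeps the backtracking linear.

```c
#include <assert.h>
#include <string.h>

/* Illustrative PEG (not pacc syntax):
 *   Additive  <- Multitive '+' Additive / Multitive
 *   Multitive <- Primary '*' Multitive / Primary
 *   Primary   <- '(' Additive ')' / [0-9]+
 */
enum { ADDITIVE, MULTITIVE, PRIMARY, NRULES };
enum { MAXLEN = 256 };          /* arbitrary limit for this sketch */

/* One memo cell per (rule, input position). */
struct cell {
    int done;    /* 0 until this rule has been evaluated at this position */
    int end;     /* position just after the match, or -1 on failure */
    long value;  /* semantic value, meaningful when end >= 0 */
};

static const char *input;
static struct cell memo[NRULES][MAXLEN + 1];

static struct cell *apply(int rule, int pos);

/* Shared shape of Additive and Multitive: sub op self / sub */
static void binary(struct cell *c, int pos, int self, int sub, char op) {
    struct cell *l = apply(sub, pos);
    if (l->end >= 0 && input[l->end] == op) {
        struct cell *r = apply(self, l->end + 1);
        if (r->end >= 0) {
            c->end = r->end;
            c->value = (op == '+') ? l->value + r->value
                                   : l->value * r->value;
            return;
        }
    }
    /* Second alternative: re-applying sub at pos is a memo hit, not a re-parse. */
    l = apply(sub, pos);
    c->end = l->end;
    c->value = l->value;
}

static void primary(struct cell *c, int pos) {
    if (input[pos] == '(') {
        struct cell *a = apply(ADDITIVE, pos + 1);
        if (a->end >= 0 && input[a->end] == ')') {
            c->end = a->end + 1;
            c->value = a->value;
            return;
        }
    }
    /* Digits are matched by the grammar itself: no separate lexer. */
    int p = pos;
    long v = 0;
    while (input[p] >= '0' && input[p] <= '9')
        v = v * 10 + (input[p++] - '0');
    c->end = (p > pos) ? p : -1;
    c->value = v;
}

/* Evaluate a rule at a position at most once; later calls reuse the cell. */
static struct cell *apply(int rule, int pos) {
    assert(pos >= 0 && pos <= MAXLEN);
    struct cell *c = &memo[rule][pos];
    if (!c->done) {
        switch (rule) {
        case ADDITIVE:  binary(c, pos, ADDITIVE, MULTITIVE, '+'); break;
        case MULTITIVE: binary(c, pos, MULTITIVE, PRIMARY, '*');  break;
        default:        primary(c, pos);                          break;
        }
        c->done = 1;
    }
    return c;
}

/* Returns 1 and stores the value if the whole string matches Additive, else 0. */
int parse_expr(const char *s, long *out) {
    size_t n = strlen(s);
    if (n > MAXLEN)
        return 0;
    input = s;
    memset(memo, 0, sizeof memo);
    struct cell *c = apply(ADDITIVE, 0);
    if (c->end < 0 || (size_t)c->end != n)
        return 0;
    *out = c->value;
    return 1;
}
```

With this sketch, parse_expr("2*(3+4)", &v) returns 1 and sets v to 14, while parse_expr("2+", &v) returns 0. Because each of the NRULES × (n + 1) cells is filled at most once, both time and memory grow linearly with the input, which is the trade a packrat parser makes: more memory than a plain backtracking parser in exchange for a linear time bound.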

The current stable release is pacc-0.3 (bugyō), released under the GPL. This is a beta release; see the TODO list for further information. The intention is that pacc will mature to be an industrial-strength parser-generator.

The name pacc is a recursive acronym: pacc: a compiler-compiler. Needless to say, pacc's own parser is written in pacc.

Last updated: 2016-08-03 21:31:34 UTC

News

Porting and packaging

One thing pacc needs is more users. And, perhaps, one way to get more users is to reduce the friction in getting started with pacc. An obvious lubricant is packaging. Read More...

Release relief

Looking at _pacc_coords(), I noticed that it seemed to have the same realloc() bug that I'd just fixed in _pacc_result(). However, the "list of arrays" trick really wasn't going to work here. Read More...
