Right Anchored

pacc grammars are now anchored on the right. That is, to be considered a successful parse, the start rule must match the entire input, and not just a prefix of it.

I contemplated this change a while ago, and decided against it on the grounds that other PEG parsers don't anchor on the right. However, we're not endeavouring to be compatible with any other PEG parser, so in this case I think it's more important to be Right than Popular. Why do I think right-anchoring is Right?

Symmetry
pacc — and every other PEG parser — anchors to the left (the start rule must start matching the first character of input), so why not to the right too?
Utility
Every practical pacc grammar, including the “trivial” evaluators after Bryan Ford, ends up explicitly anchoring to the right, with a !. rule. Since that's at least 50% of parsers, it should be the default.
Paltry
Whichever way the default goes, the other option is just two characters away: !. anchors, .* unanchors.
Pedagogy
This was the clincher for me. The first example language in the tutorial comprises just two utterances: yes and no. But then we have to explain why the parser also accepts yesterday, nothing, and an infinite variety of other non-utterances, and how to fix this problem.
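The yes/no problem can be sketched in a few lines of Python. This is a hand-rolled illustration, not pacc output: the function names (`match_start`, `parse_unanchored`, `parse_anchored`) are hypothetical, and the grammar is assumed to be nothing more than an ordered choice between the two literals. It only shows the difference between accepting a matching prefix and requiring the start rule to consume the entire input.

```python
# Illustrative sketch only: a hand-written matcher, not code generated by pacc.
# The rule modelled here is assumed to be:  S <- "yes" / "no"

def match_start(text):
    """Try the start rule at offset 0; return the end offset, or None on failure."""
    for word in ("yes", "no"):       # ordered choice: alternatives tried in order
        if text.startswith(word):
            return len(word)
    return None

def parse_unanchored(text):
    """Left-anchored only: succeed if the start rule matches a prefix of the input."""
    return match_start(text) is not None

def parse_anchored(text):
    """Left- and right-anchored: the match must also reach end of input,
    which is what an explicit trailing `!.` would enforce."""
    end = match_start(text)
    return end is not None and end == len(text)

if __name__ == "__main__":
    for s in ("yes", "no", "yesterday", "nothing", "maybe"):
        print(f"{s!r}: unanchored={parse_unanchored(s)} anchored={parse_anchored(s)}")
```

With the unanchored check, `yesterday` and `nothing` are accepted because they begin with `yes` and `no`; with the anchored check, only the two real utterances get through.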

So it's done. Unfortunately, this introduces 43 new FAILs: they're almost all of the “different error message” variety, but they'll have to be looked at... Actually, although some are to do with right-anchoring, most of the differences are caused by changing an occurrence of col to rule_col when an error is flagged for a rule that has made no progress. For example, the test bad/rep0.pacc consists of the “rule” S ← .**. This used to generate the error expected NameStart:c:0, which is fairly meaningless, desugared, and was one character past where it should have been to boot!

Now it points to the second * with the error expected Defns, or end-of-input, which is definitely an improvement. (This stuff still isn't right, though: there's a whole bunch of other things — any matcher, or for example — that could come after S ← .*. Still, we're making progress.)

All fixed up, back to the expected 4 FAILs.

Last updated: 2015-05-24 19:45:28 UTC
