
Default expressions

I finally got round to implementing a simple but highly useful feature in the language, which I'm calling default expressions. It's probably simplest to explain by showing the test case:

Answer <- Digit

Digit <- Four / f:Five -> f / Six

Four <- "four" -> 4
Five <- "five" -> 5
Six <- "six" -> 6

Neither Answer nor two of the three alternatives of Digit has a result expression. Previously this was never allowed in a non-void rule. With the new feature, if a sequence consists of a single call, a missing result expression is equivalent to a binding and an expression that returns the result of that call. So Answer <- Digit now means the same as Answer <- d:Digit -> d.

I did consider making the feature more general. For example, a rule like Number ← n:[0-9]+ → n could benefit from default expressions. However, it strikes me that a rule like this would be quite unusual. More likely would be Number ← n:[0-9]+ _ → n, following the usual convention that _ is a rule that matches whitespace. It is not clear that a generalized default expression feature would help in this case. Were it fully generalized, the meaning of Number ← [0-9]+ _ would most obviously be Number ← n:([0-9]+ _) → n. That's probably not what you wanted. (And, if it is, you'd probably be better off writing Number ← [0-9]+ _ → { ref() } which is longer but clearer.)

The default expressions feature is implemented in sugar.c using the standard walk() function. However, this is not a particularly good fit for the job: walk() is great at desugaring expressions that may occur anywhere in the AST, but the default expressions feature only applies to sequences that are either at the top level of a rule, or are children of a top-level alt. Something that simply looped over the rules would be simpler and clearer.
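To make the "loop over the rules" idea concrete, here is a minimal sketch in C. It is not pacc's code: the node kinds, struct node, mk(), lone_call(), add_default() and default_exprs() are all hypothetical names on a toy AST, and the generated binding name _d is just a placeholder. The sketch also ignores rule types, so it would wrongly add defaults to void rules.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node kinds for a toy AST; pacc's real AST differs. */
enum kind { RULE, ALT, SEQ, CALL, BIND, EXPR, LIT };

struct node {
    enum kind kind;
    const char *text;     /* rule name, callee, binding name or expression */
    struct node *first;   /* first child */
    struct node *next;    /* next sibling */
};

/* Allocate a node (hypothetical helper). */
static struct node *mk(enum kind k, const char *text,
                       struct node *first, struct node *next) {
    struct node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->kind = k;
    n->text = text;
    n->first = first;
    n->next = next;
    return n;
}

/* A sequence qualifies if it consists of exactly one unbound call. */
static int lone_call(const struct node *seq) {
    return seq->kind == SEQ && seq->first != NULL
        && seq->first->kind == CALL && seq->first->next == NULL;
}

/* Rewrite the sequence  Callee  as  _d:Callee -> _d. */
static void add_default(struct node *seq) {
    struct node *call = seq->first;
    struct node *expr = mk(EXPR, "_d", NULL, NULL);
    seq->first = mk(BIND, "_d", call, expr);
}

/* Visit only the top-level sequence of each rule, or the sequences
 * directly under a top-level alt. Returns the number of rewrites. */
static int default_exprs(struct node *rules) {
    int count = 0;
    for (struct node *r = rules; r != NULL; r = r->next) {
        struct node *body = r->first;
        if (body == NULL)
            continue;
        if (body->kind == SEQ) {
            if (lone_call(body)) { add_default(body); ++count; }
        } else if (body->kind == ALT) {
            for (struct node *s = body->first; s != NULL; s = s->next)
                if (lone_call(s)) { add_default(s); ++count; }
        }
    }
    return count;
}
```

Because the loop never descends below a top-level alt, there is no need to track where in the tree a sequence sits, which is the bookkeeping a generic walk would require.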

I came across this interesting 1000-word introduction to compilers the other day. Of course, I disagree with his assessment that parsing is “kinda stuffy” 🙂 But it did get me thinking more about simplifications / optimizations. These have received scant attention in pacc, simply stuffed into sugar.c, definitely one of the uglier and less polished parts of the compiler. Of course, pacc is a tiny language, and it is right that there is relatively little between the parser and the emitter. Still, it's a nice place to work.

There are a couple of snags with the implementation of default expressions as it stands. One is that we generate new expr nodes that don't contain coords nodes. There's currently no obvious place we could get the coordinates from.

Another snag is that it fails to check that the types line up.

Unfortunately, these two snags interact particularly badly. Suppose we change the last rule to be Six :: char * <- "six" -> { "string" }. Then the only hint that anything is wrong is a C compiler warning:

test/parse.c: In function ‘pacc_parse’:
test/parse.c:762:24: warning: assignment makes integer from pointer without a cast
           cur = _pacc_p;

The warning is accurate, but there's no way to see what error in the code produced it.

Obviously pacc has enough type information available to it to produce a proper error at this point. It's just a small matter of coding...
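As a rough illustration of the kind of check that is missing, here is a minimal sketch in C. It assumes each rule's declared type is available as a string and compares the strings textually, which is cruder than a real check (it would treat "char *" and "char*" as different). struct rule, find_rule() and check_default() are hypothetical names, not part of pacc, and the types in the usage note are assumed for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical summary of a rule: its name and declared type. */
struct rule {
    const char *name;
    const char *type;
};

/* Linear search for a rule by name; returns NULL if absent. */
static const struct rule *find_rule(const struct rule *rules, size_t n,
                                    const char *name) {
    for (size_t i = 0; i < n; ++i)
        if (strcmp(rules[i].name, name) == 0)
            return &rules[i];
    return NULL;
}

/* Check a default expression in rule `caller` that returns the result
 * of rule `callee`. Returns 0 if the types match, -1 otherwise, in
 * which case a diagnostic is written to msg. */
static int check_default(const struct rule *rules, size_t n,
                         const char *caller, const char *callee,
                         char *msg, size_t len) {
    const struct rule *r = find_rule(rules, n, caller);
    const struct rule *c = find_rule(rules, n, callee);
    if (r == NULL || c == NULL) {
        snprintf(msg, len, "unknown rule `%s'", r == NULL ? caller : callee);
        return -1;
    }
    if (strcmp(r->type, c->type) != 0) {
        snprintf(msg, len,
                 "default expression in `%s' (type %s) returns `%s' (type %s)",
                 r->name, r->type, c->name, c->type);
        return -1;
    }
    if (len > 0)
        msg[0] = '\0';
    return 0;
}
```

With Digit assumed to be int and Six declared as char *, check_default() fails for the Six alternative and names both rules in the message, which is the information the C compiler warning cannot give.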

Last updated: 2015-05-27 06:34:11 UTC

