Leaving aside problematic left association, let’s go back to our previous calculator. You’ll recall that this handles addition and multiplication, with arbitrary whitespace between numbers and symbols. See Handling whitespace.
Suppose we wanted to make an interactive calculator, with a read-eval-print loop. It should print a prompt, read a line from the user, evaluate it, print the result, and then print another prompt. That’s easy enough to achieve, as each line read from the user will be a complete utterance in our grammar.
That interface would doubtless be fine in practice for a simple calculator. But for a more complex interactive language, we might like a more sophisticated read-eval-print loop. To demonstrate this, let’s artificially complicate the calculator. Suppose we terminate each sum with an equals character, so a typical input would be ‘2+3=’. Now that we can recognise the end of an utterance, we can allow the user to enter an expression that spreads over several lines.
For a calculator, this is overkill, but it’s exactly how command-line interfaces such as the Unix shells, SQL front-ends, and Lisp-like languages typically work. Normally, a different prompt is printed when further input is required; the classic example would be the ‘PS1’ and ‘PS2’ prompts in the Bourne shell. Pacc has support for building interfaces like these.
First, we need to mark in our grammar the points where the input can be split across multiple lines. We do this with the symbol ‘$’. This needs to go in high-level rules, in all the places where it makes sense for the user to hit RET and keep typing, but not, for instance, in the middle of a number.
To strip the idea right down to its basics, consider this trivial grammar, which sums two digits, possibly on separate lines. Note that each digit is optionally followed by ‘\n’, so valid utterances include both unsplit inputs, such as ‘23\n’, and inputs split onto two lines, such as ‘2\n3\n’.
Sum :: int <- a:Digit $ b:Digit { a + b }
Digit <- [0-9] "\n"? { ref_0(ref()) - '0' }
There is only one possible place to break the input: between the two digits. We have marked that point in the grammar with the ‘$’. Now we can invoke pacc with the ‘-f’ flag, and it will produce two parsers. The first parser is a perfectly normal one; the ‘$’ has no effect at all.
For the second parser, pacc modifies the grammar to something like this:
Sum :: void <- Digit Digit?
Digit <- [0-9] "\n"?
The second digit (everything after the ‘$’) is now optional. Additionally, semantic expressions have been removed: this grammar only recognises input; it can’t evaluate anything.
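We can use the two parsers together to construct an interactive two-digit summer. After each line of input, we try the first parser on everything read so far. If it succeeds, we have a complete utterance and can print the result. If it fails, we try the second parser: success there means the input is valid so far but incomplete, so we prompt for more; failure means the input is in error.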
Here’s the complete grammar for the interactive calculator, icalc.pacc. As well as the ‘$’ markers, we have added the ‘Equals’ rule that indicates the end of an expression.
Expression <- _ s:Sum $ Equals -> s

Sum <- p:Product $ Plus $ s:Sum -> { p + s } / Product
Product <- t:Term $ Star $ p:Product -> { t * p } / Term
Term <- Decimal / lBrace $ s:Sum $ rBrace -> s

# lexical elements
Decimal <- [0-9]+ _ -> { atoi(ref_str()) }
Plus :: void <- "+" _
Star <- "*" _
Equals <- "=" _
lBrace <- "(" _
rBrace <- ")" _
_ <- [ \t\n]*
Here’s a wrapper program, main.c.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "parse.h"

#define LINE 80
#define P1 "$ "
#define P2 "> "

int main(void) {
    char *text = 0;
    char *prompt = P1;
    char line[LINE + 1];
    int len, parsed;
    struct pacc_parser *p0 = pacc_new();
    struct pacc_parser *p1 = pacc_feed_new();

    text = malloc(1);
    if (!text) exit(1);
    len = 0; *text = '\0'; prompt = P1;
    for (;;) {
        printf("%s", prompt);
        fflush(stdout);
        if (!fgets(line, LINE, stdin)) break;
        len += strlen(line);
        text = realloc(text, len + 1); /* + 1 for the terminating '\0' */
        if (!text) exit(1);
        strcat(text, line);

        /* first, try everything read so far as a complete utterance */
        pacc_input(p0, text, len);
        parsed = pacc_parse(p0);
        if (parsed) {
            printf("%d\n", pacc_result(p0));
            len = 0; *text = '\0'; prompt = P1;
        } else {
            /* otherwise, ask the second parser whether it is a valid prefix */
            pacc_feed_input(p1, text, len);
            parsed = pacc_feed_parse(p1);
            if (parsed) {
                prompt = P2;
            } else {
                char *e = pacc_feed_error(p1);
                fprintf(stderr, "%s\n", e);
                free(e);
                len = 0; *text = '\0'; prompt = P1;
            }
        }
    }
    if (len) fprintf(stderr, "unexpected end of input\n");
    else fprintf(stderr, "\n");
    return 0;
}
And, for completeness, here’s a Makefile. (This example can be found in the test/icalc/ directory of the pacc source distribution.)
CFLAGS = -std=c99
PACC = ../../pacc

icalc: parse1.o parse2.o main.o
	$(CC) -o $@ $^

main.o: main.c parse.h
parse1.o: parse1.c
parse2.o: parse2.c

parse.h parse1.c parse2.c: icalc.pacc
	$(PACC) -dparse.h -o parse1.c -f parse2.c $<

clean:
	rm -f icalc main.o parse1.o parse2.o parse.h parse1.c parse2.c
Last updated: 2016-08-03 21:39:50 UTC