Wirth BNF Grammars

Wirth uses his own metalanguage to define its own syntax, which also serves as an example of its use:

grammar    = { production }.
production = identifier "=" expression ".".
expression = term { "|" term }.
term       = factor { factor }.
factor     = identifier | literal | "(" expression ")" |
             "[" expression "]" | "{" expression "}".
literal    = """" character { character } """".

The word identifier is used to denote a nonterminal symbol, and literal denotes a terminal symbol. For brevity, identifier and character are not further defined.
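Because each production maps directly onto one procedure, the grammar above can be read by a small recursive-descent parser. The following Python code is only a sketch under stated assumptions: the identifier shape (a letter followed by letters, digits, or underscores), the tuple-based tree, and the names tokenize, Parser, and BRACKETS are all hypothetical choices, not part of Wirth's notation.

```python
import re

# A quoted literal (a doubled quote mark stands for one quote mark), an
# identifier (shape assumed, since the text leaves it undefined), or a
# single metasymbol.
TOKEN = re.compile(r'\s*("(?:[^"]|"")+"|[A-Za-z][A-Za-z0-9_]*|[=.|()\[\]{}])')
BRACKETS = {"(": (")", "group"), "[": ("]", "opt"), "{": ("}", "rep")}

def tokenize(text):
    text, pos, tokens = text.rstrip(), 0, []
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expect(self, symbol):
        if self.peek() != symbol:
            raise SyntaxError(f"expected {symbol!r}, found {self.peek()!r}")
        self.pos += 1

    def starts_factor(self):
        token = self.peek()
        return token is not None and (
            token in BRACKETS or token[0] == '"' or token[0].isalpha())

    def grammar(self):       # grammar = { production }.
        rules = {}
        while self.peek() is not None:
            name, body = self.production()
            rules[name] = body
        return rules

    def production(self):    # production = identifier "=" expression ".".
        name = self.take()
        if not name[0].isalpha():
            raise SyntaxError(f"expected identifier, found {name!r}")
        self.expect("=")
        body = self.expression()
        self.expect(".")
        return name, body

    def expression(self):    # expression = term { "|" term }.
        terms = [self.term()]
        while self.peek() == "|":
            self.take()
            terms.append(self.term())
        return ("alt", terms)

    def term(self):          # term = factor { factor }.
        factors = [self.factor()]
        while self.starts_factor():
            factors.append(self.factor())
        return ("seq", factors)

    def factor(self):        # identifier, literal, or a bracketed expression
        if not self.starts_factor():
            raise SyntaxError(f"expected a factor, found {self.peek()!r}")
        token = self.take()
        if token in BRACKETS:
            closer, kind = BRACKETS[token]
            inner = self.expression()
            self.expect(closer)
            return (kind, inner)
        if token[0] == '"':
            return ("lit", token[1:-1].replace('""', '"'))
        return ("id", token)

WIRTH = r'''
grammar    = { production }.
production = identifier "=" expression ".".
expression = term { "|" term }.
term       = factor { factor }.
factor     = identifier | literal | "(" expression ")" |
             "[" expression "]" | "{" expression "}".
literal    = """" character { character } """".
'''

rules = Parser(tokenize(WIRTH)).grammar()
print(list(rules))
```

Literal tokens keep their surrounding quote marks until factor decodes them, which is how the sketch tells the literal "=" apart from the metasymbol =.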

Repetition is denoted by curly braces, i.e., { a } denotes: empty, a, aa, ... . Optionality is expressed by square brackets, i.e., [ a ] denotes a or empty. Parentheses merely serve for grouping, i.e., ( a | b ) c stands for: a c | b c.
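The three bracket forms can be checked mechanically by listing the strings an expression denotes. The Python sketch below is illustrative only: the (kind, body) tuple representation and the name expand are hypothetical, symbols are concatenated without spaces, and each repetition is unrolled at most max_reps times, since { a } denotes infinitely many strings.

```python
def expand(node, max_reps=2):
    """Return the strings denoted by node, unrolling each { } at most max_reps times."""
    kind, body = node
    if kind == "lit":                        # a single terminal symbol
        return {body}
    if kind == "group":                      # ( e ) merely groups
        return expand(body, max_reps)
    if kind == "opt":                        # [ e ] denotes e or empty
        return {""} | expand(body, max_reps)
    if kind == "alt":                        # e1 | e2 | ...
        return set().union(*(expand(n, max_reps) for n in body))
    if kind == "seq":                        # e1 e2 ...
        results = {""}
        for n in body:
            tails = expand(n, max_reps)
            results = {head + tail for head in results for tail in tails}
        return results
    if kind == "rep":                        # { e } denotes empty, e, ee, ...
        inner = expand(body, max_reps)
        results, layer = {""}, {""}
        for _ in range(max_reps):
            layer = {head + tail for head in layer for tail in inner}
            results |= layer
        return results
    raise ValueError(f"unknown node kind: {kind!r}")

a, b, c = ("lit", "a"), ("lit", "b"), ("lit", "c")
grouped = ("seq", [("group", ("alt", [a, b])), c])      # ( a | b ) c
flat = ("alt", [("seq", [a, c]), ("seq", [b, c])])      # a c | b c

print(sorted(expand(("rep", a))))            # ['', 'a', 'aa']
print(sorted(expand(("opt", a))))            # ['', 'a']
print(expand(grouped) == expand(flat))       # True
```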

Terminal symbols are either literals, i.e., enclosed in quote marks, or identifiers that do not appear on the left-hand side of the metasymbol =. If a quote mark appears as a literal itself, then it is written twice (as is common in many programming languages).
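Both rules in the paragraph above can be applied to a grammar text with a few lines of Python. This is a rough sketch, not a full parser: the identifier shape is an assumption, and the names TOKEN, decode_literal, and terminal_symbols are hypothetical. An identifier counts as defined when the very next token is the metasymbol =.

```python
import re

# Quoted literal (with doubled quote marks inside), identifier, or metasymbol.
TOKEN = re.compile(r'"(?:[^"]|"")+"|[A-Za-z][A-Za-z0-9_]*|[=.|()\[\]{}]')

def decode_literal(token):
    """Strip the enclosing quote marks and collapse each doubled quote mark."""
    return token[1:-1].replace('""', '"')

def terminal_symbols(grammar_text):
    """Return (literal terminals, identifiers never defined by a production)."""
    tokens = TOKEN.findall(grammar_text)
    # A literal "=" keeps its quote marks as a token, so it never equals "=".
    defined = {tok for tok, nxt in zip(tokens, tokens[1:])
               if nxt == "=" and tok[0].isalpha()}
    literals = {decode_literal(tok) for tok in tokens if tok[0] == '"'}
    undefined = {tok for tok in tokens
                 if tok[0].isalpha() and tok not in defined}
    return literals, undefined

sample = 'literal = """" character { character } """".'
literals, undefined = terminal_symbols(sample)
print(literals)     # {'"'}
print(undefined)    # {'character'}
```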

To give Wirth BNF grammars a machine-readable form, I have added the following properties:

Note that the spacing permits convenient processing by simple awk scripts.


Niklaus Wirth, "What can we do about the unnecessary diversity of notation for syntactic descriptions?", CACM, 20 (November 1977), pp. 822-823.

Robert Noonan, noonan@cs.wm.edu
Oct 21 97