Derive

Due: 11:59 PM, Feb 5
Submit Project: derive 
Source Files: derive.py 
Test Files: Binary.txt Ambig.txt

You are to read in a grammar and generate all terminal strings up to a specified length. The grammar rules will appear one rule per line, with each symbol separated from the next symbol by white space.

N = D
N = N D
D = 0
D = 1

With a specified length of 2, your program should generate:

0
1
0 0
0 1
1 0
1 1


Requirements

  1. Output:
    1. The order of the lines shown is irrelevant.
    2. There should be no other output, especially on the lines with the digit sequences on them.
    3. In printing a digit sequence, there should be a space between successive digits.
  2. Input:
    1. Productions are required to appear one per line.
    2. All the productions for a given nonterminal will be grouped together.
    3. The start symbol is the lhs of the first production, in this case N.
    4. The meta symbol for "derives", or "can be replaced by" is always the second symbol and is the same for all productions (but not all grammars).
    5. All symbols, including metasymbols (e.g. =), nonterminals (e.g. N, D), and terminals (e.g. 9), must be separated from each other by one or more spaces.
    6. The set of Nonterminal symbols is precisely the set of symbols that appear at least once as a lhs symbol. All the remaining symbols are terminals.
  3. Hints:
    1. Store the productions as a dictionary with the key being the lhs nonterminal and the value being a list of strings.
    2. Store the candidate strings (the worklist) as a simple Python list.
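
As an illustration of the input rules and the first hint, here is a minimal sketch of reading productions into a dictionary. The function name parse_grammar and the choice to skip blank lines are assumptions made for this example, not part of the assignment.

```python
def parse_grammar(lines):
    """Return (start_symbol, productions) from an iterable of rule lines."""
    productions = {}   # lhs nonterminal -> list of rhs strings
    start = None
    for line in lines:
        symbols = line.split()       # split on any run of white space
        if not symbols:
            continue                 # assumption: blank lines are ignored
        lhs = symbols[0]
        rhs = symbols[2:]            # symbols[1] is the metasymbol, whatever it is
        if start is None:
            start = lhs              # lhs of the first production
        productions.setdefault(lhs, []).append(" ".join(rhs))
    return start, productions


start, productions = parse_grammar(["N = D", "N = N D", "D = 0", "D = 1"])
print(start)         # N
print(productions)   # {'N': ['D', 'N D'], 'D': ['0', '1']}
```

Because the metasymbol is skipped by position rather than compared against "=", the sketch does not depend on what the metasymbol is. A symbol is a nonterminal exactly when it is a key of the dictionary.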


Algorithm


Read the length N from the command argument (or from input).
Read and store all the productions.
Push the start symbol onto the worklist.
While the worklist is not empty:
	Get and delete one potential sentence s from the worklist.
	If |s| > N (s contains more than N symbols), continue.
	If s has no nonterminals, print s and continue.
	Choose the leftmost nonterminal NT.
	For all productions NT -> rhs:
		Replace NT in s with rhs; call it tmp.
		Store tmp on worklist.
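
The worklist loop above could be sketched as follows. This is one possible rendering, not a reference solution: the function name derive is hypothetical, each sentence is kept as a list of symbols so that multi-character symbols work, and results are returned rather than printed so the example is easy to check.

```python
def derive(start, productions, max_len):
    """Return every terminal string of at most max_len symbols."""
    results = []
    worklist = [[start]]                 # each sentence is a list of symbols
    while worklist:
        s = worklist.pop()               # get and delete one sentence
        if len(s) > max_len:
            continue
        # Position of the leftmost nonterminal, or None if s is all terminals.
        idx = next((i for i, sym in enumerate(s) if sym in productions), None)
        if idx is None:
            results.append(" ".join(s))  # one space between successive symbols
            continue
        for rhs in productions[s[idx]]:
            # Replace the nonterminal with the symbols of this rhs.
            worklist.append(s[:idx] + rhs.split() + s[idx + 1:])
    return results


productions = {"N": ["D", "N D"], "D": ["0", "1"]}
for line in sorted(derive("N", productions, 2)):
    print(line)
# 0
# 0 0
# 0 1
# 1
# 1 0
# 1 1
```

The output is sorted here only to make the example deterministic; the order of the lines is irrelevant. With an ambiguous grammar, the same string can be reached by more than one leftmost derivation, so this sketch may return it more than once.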


Bad Assumptions

  1. The metasymbol for "derives" is "=" or one character in length.
  2. Any symbol is restricted to being one character.
  3. There is an error in a grammar test file.
  4. A grammar will cause the above algorithm to cycle.


Python Idioms


Invocation

python  derive.py  [-l length]  grammarfile

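where length is the maximum length of the generated strings (the N of the algorithm above) and grammarfile is the file containing the productions.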
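
One way to handle this command line is the standard argparse module. The sketch below is only an illustration: the helper name parse_args is hypothetical, and leaving the length as None when -l is omitted is an assumption, since the default is not stated here.

```python
import argparse


def parse_args(argv=None):
    """Parse: derive.py [-l length] grammarfile"""
    parser = argparse.ArgumentParser(prog="derive.py")
    # Assumption: no default is specified, so length stays None when -l is
    # omitted and the caller decides what to do (e.g. read it from input).
    parser.add_argument("-l", dest="length", type=int, default=None,
                        help="maximum length of the generated strings")
    parser.add_argument("grammarfile", help="file containing the productions")
    return parser.parse_args(argv)


args = parse_args(["-l", "2", "Binary.txt"])
print(args.length, args.grammarfile)   # 2 Binary.txt
```

Calling parse_args() with no argument reads the real command line from sys.argv.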


Python3

If you are using Python3, then the first line of each Python file must contain:
#! /usr/bin/python3


Submission

This project should be submitted through the assignment link in Blackboard. If you have more than one file to submit, you are required to put all files into a single tar file for the submission.