In the first phase, you
need to instrument the IR and insert the following function calls
into the program:
- Insert Init();
at the beginning of the program;
- Insert RecordInst();
for each statement (except return statements) in the program.
For example, the following code fragment
...
a = a + 1;
foo(a);
...
should be transformed to
...
RecordInst();
a = a + 1;
RecordInst();
foo(a);
...
- Insert Report();
at each exit of the program.
You also need to implement
functions Init(), RecordInst(), and Report() and make
them part of the program. They maintain a global counter that
records the number of dynamic instructions (statements in case of
PHASE I) in the program. Init() sets the counter to zero.
Each time RecordInst() is called, the counter is incremented
by one. Finally, Report() prints the total number of
instructions (statements) executed to the standard output. An
example of instrumented programs (without the implementation of the
functions) can be downloaded from
here. Except for recording and reporting number of instructions
executed, the instrumented program should behave exactly the same as
the original program. Make sure that there is no possibility of name
conflicts among variables. However, I would not mind if you do not
ensure their uniqueness in the C code, since the names were given by
me. However, other than these three functions you need to make sure
that none of the entities that you add to the code conflicts with
any other variable or function name in the code.
Here are my test programs after code generation (PHASE II) and
instrumentation. At this stage, you may ignore the extra statements
and variables generated in them. You need to learn how to use
the provided front-end and back-end tools, how to traverse and
manipulate the AST tree before start working on the instrumentation.
This will also prepare you for the later phases.
Hint: You may want to
add function definitions not by modifying AST, but by adding them
directly to the output file. Do not forget #include lines.
Phase II: Code
Generation
In
this phase, you need to transform the program into an
assembly-like format. In doing so, you are still working on the
AST. What is changed is only its internal structure. After the
transformation, you may call method GenCode() to generate
the output C file.
Expressions
in the transformed program should satisfy the following
requirements:
- Left hand side of
an assignment statement should be a variable or an array
access.
- Right hand side
of an assignment statement should be either a variable, a
constant, an array access, a function call or an expression
with no more than two operands (or one operator).
- Any
operand of an expression can only be a variable or a
constant.
-
The predicate of a conditional
jump can be one comparison operation whose two operands are
either a variable or a constant.
- Parameters to a
function call can only be a variable or a constant.
- Index of an array
can only be a variable or a constant.
- Return parameter
can only be a variable or a constant.
After the program transformation, new temporary variables could be
introduced.
. An example program segment
a =
b + c[e] + d;
if (a+d<b) goto L1;
foo (m[a+d]);
m[e+f] = m[a] + m[d];
L1:
needs to be converted to something
like the following:
t0 =
c[e];
t1 = b + t0;
a = t1 + d;
t2 = a + d;
if (t2 < b) goto L1;
t3 = a + d;
t4 = m[t3];
foo(t4);
t5 = m[a];
t6 = m[d];
t7 = e + f;
m[t7] = t5 + t6;
L1:
Some notices:
You should take
scanf and printf as special instructions that need
not conform to above specifications. You should not make any
temporary variable to take value of the pointer passed to
scanf. Remember that our language does not have pointers, so
we do not deal with them. For printf, you still need to
make all the arguments in variable or constant form except for
the format string.
Here are my test programs after code generation and
instrumentation. Your output should match with them. Remember
that you have to take care of types. Our language has integers
and floating points only.
Phase III: Control
Flow
In this phase, you
need to construct a control flow graph (CFG) for each function
definition. You should identify all the basic blocks (BB) in
a function and figure out predecessors and successors
of each BB. You may assign each BB a unique ID with naming
scope either within a function or throughout the whole program.
As a debugging support, your compiler
should provide a means of dumping the CFG's into an output stream. In this
assignment, you are required to dump the CFG's of each program into
a text file with extension .cfg (for instance, automaton.cfg for
automaton.new.c)
in the output program directory.
A sample control flow graph is shown as following
int a, b;
//Block 1:
//
Predecessors: Entry
//
Successors: Block 2, Block 3
a = 1;
b = 2;
if ((a > b)) goto L6;
//Block 2:
//
Predecessors: Block 1
//
Successors: Block 3, Block 4
a = a - b;
b = b - a;
if ((a <= b)) goto L4;
//Block 3:
//
Predecessors: Block 1, Block 2
//
Successors: Exit
L6:
return 0;
//Block 4:
//
Predecessors: Block 2
//
Successors: Block 6
L4:
goto L3;
//Block 5:
//
Predecessors:
//
Successors: Exit
L7:
return 1;
//Block 6:
//
Predecessors: Block 4
//
Successors: Exit
L3:
Note: your dumped CFG's should not omit labels if they mark
the beginning of a basic block. Function calls are not necessarily
conditions of starting a new BB.
 |