COMPILER CONSTRUCTION Principles and Practice Kenneth C. Louden
8.  Code Generation
Contents
Part One
8.1 Intermediate Code and Data Structures for Code Generation
8.2 Basic Code Generation Techniques
Part Two
8.3 Code Generation of Data Structure References
8.4 Code Generation of Control Statements and Logical Expressions
8.5 Code Generation of Procedure and Function Calls
Part Three
8.6 Code Generation in Commercial Compilers: Two Case Studies
8.7 TM: A Simple Target Machine
8.8 A Code Generator for the TINY Language
8.9 A Survey of Code Optimization Techniques
8.10 Simple Optimizations for the TINY Code Generator
8.6 Code Generation in Commercial Compilers: Two Case Studies
Borland's C Compiler for the 80x86
Sun's C Compiler for SPARCstations
For example, consider the C procedure

void f(int x, char c)
{ int a[10];
  double y;
  ...
}
The activation record for a call to f would appear as follows (higher addresses at the top; fp points at the control link):

x
c
control link       ← fp
return address
a[0] a[1] ... a[9]
y
Assuming two bytes for integers, four bytes for addresses, one byte for characters, and eight bytes for double-precision floating point, we would have the following offset values:

Name   Offset
x      +5
c      +4
a      -24
y      -32

Now an access of a[i] requires the computation of the address (-24 + 2*i)(fp); for instance, a[3] is at (-24 + 2*3)(fp) = -18(fp).
For the expression (x = x + 3) + 4, the p-code and three-address code are:

p-code:
lda x
lod x
ldc 3
adi
stn
ldc 4
adi

Three-address code:
t1 = x + 3
x = t1
t2 = t1 + 4
8.6.1 The Borland 3.0 C Compiler for the 80x86
Consider the output of this compiler for the expression (x = x + 3) + 4. The assembly code for this expression as produced by the Borland 3.0 compiler for the Intel 80x86 is as follows:

mov ax, word ptr [bp-2]
add ax, 3
mov word ptr [bp-2], ax
add ax, 4

Notes:
The bp register is used as the frame pointer.
The static simulation method is used to convert the intermediate code into the target code.
For comparison, the p-code and three-address code for (x = x + 3) + 4 again:

p-code:
lda x
lod x
ldc 3
adi
stn
ldc 4
adi

Three-address code:
t1 = x + 3
x = t1
t2 = t1 + 4
1) Array Reference
An example: (a[i + 1] = 2) + a[j]
Assume that i, j, and a are local variables declared as

int i, j;
int a[10];

The Borland C compiler generates the following assembly code for the above expression (next page).
Expression: (a[i + 1] = 2) + a[j]

(1)  mov bx, word ptr [bp-2]
(2)  shl bx, 1
(3)  lea ax, word ptr [bp-22]
(4)  add bx, ax
(5)  mov ax, 2
(6)  mov word ptr [bx], ax
(7)  mov bx, word ptr [bp-4]
(8)  shl bx, 1
(9)  lea dx, word ptr [bp-24]
(10) add bx, dx
(11) add ax, word ptr [bx]

The compiler has applied the following algebraic identity to compute the address:
address(a[i + 1]) = base_address(a) + (i + 1) * elem_size(a)
                  = (base_address(a) + elem_size(a)) + i * elem_size(a)
For comparison, the p-code for the array reference (a[i + 1] = 2) + a[j] as generated by a code generation procedure:

lda a
lod i
ldc 1
adi
ixa elem_size(a)
ldc 2
stn
lda a
lod j
ixa elem_size(a)
ind 0
adi
2) Pointer and Field References
Assume the declarations of the previous examples:

typedef struct rec
{ int i;
  char c;
  int j;
} Rec;

typedef struct treeNode
{ int val;
  struct treeNode *lchild, *rchild;
} TreeNode;
...
Rec x;
TreeNode *p;
Assume that x and p are declared as local variables and that appropriate allocation of pointers has been done. Consider, first, the sizes of the data types involved:
an integer variable has size 2 bytes;
a character variable has size 1 byte;
a pointer has size 2 bytes.
The code generated for the statement x.j = x.i; is

mov ax, word ptr [bp-6]
mov word ptr [bp-3], ax

Notes:
Local variables are allocated only on even-byte boundaries;
The offset computation for j (-6 + 3 = -3) is performed statically by the compiler.
The code generated for the statement p->lchild = p; is

mov word ptr [si+2], si

and for the statement p = p->rchild;

mov si, word ptr [si+4]

(Here p has been allocated to register si.)
3) If- and While-Statements
The statements we use are
if (x>y) y++; else x--;
and
while (x<y) y -= x;
The Borland compiler generates the following 80x86 code for the given if-statement (x in bx, y in dx):

cmp bx, dx
jle short @1@86
inc dx
jmp short @1@114
@1@86:
dec bx
@1@114:
For the given while-statement:

jmp short @1@170
@1@142:
sub dx, bx
@1@170:
cmp bx, dx
jl short @1@142
4) Function Definition and Call
The examples are the C function definition

int f(int x, int y)
{ return x + y + 1; }

and a corresponding call
f(2+3, 4)
The Borland compiler generates the following code for the call f(2+3, 4):

mov ax, 4
push ax
mov ax, 5
push ax
call near ptr _f
pop cx
pop cx

Notes:
The constant expression 2+3 has been folded to 5;
The arguments are pushed on the stack in reverse order;
The caller is responsible for removing the arguments from the stack after the call (the two pop cx instructions);
The call instruction on the 80x86 automatically pushes the return address onto the stack.
Now, consider the code generated by the Borland compiler for the definition of f:

_f proc near
   push bp
   mov bp, sp
   mov ax, word ptr [bp+4]
   add ax, word ptr [bp+6]
   inc ax
   jmp short @1@58
@1@58:
   pop bp
   ret
_f endp
After these operations (push bp; mov bp, sp), the stack holds the saved bp (pointed to by bp), the return address above it, and the arguments at bp+4 and bp+6. The body of f then corresponds to the code that comes next:

mov ax, word ptr [bp+4]
add ax, word ptr [bp+6]
inc ax

Finally, the code executes a jump, restores the old bp from the stack, and returns to the caller.
8.6.2 The Sun 2.0 C Compiler for Sun SPARCstation
Consider again the expression (x = x + 3) + 4. The Sun C compiler produces the assembly code:

ld  [%fp + -0x4], %o1
add %o1, 0x3, %o1
st  %o1, [%fp + -0x4]
ld  [%fp + -0x4], %o2
add %o2, 0x4, %o3
1) Array Reference
(a[i + 1] = 2) + a[j] is translated to the following assembly code by the Sun compiler:

(1)  add %fp, -0x2c, %o1    /* fp-44 */
(2)  ld  [%fp + -0x4], %o2
(3)  sll %o2, 0x2, %o3
(4)  mov 0x2, %o4
(5)  st  %o4, [%o1 + %o3]
(6)  add %fp, -0x30, %o5    /* fp-48 */
(7)  ld  [%fp + -0x8], %o7
(8)  sll %o7, 0x2, %l0
(9)  ld  [%o5 + %l0], %l1
(10) mov 0x2, %l2
(11) add %l2, %l1, %l3
2) Pointer and Field References

typedef struct rec
{ int i;
  char c;
  int j;
} Rec;

typedef struct treeNode
{ int val;
  struct treeNode *lchild, *rchild;
} TreeNode;
...
Rec x;       /* allocated only on 4-byte boundaries */
TreeNode *p; /* pointers have size 4 bytes */
The code generated for the assignment x.j = x.i; is

ld [%fp + -0xc], %o1
st %o1, [%fp + -0x4]

The pointer assignment p = p->rchild; results in the target code:

ld [%fp + -0x10], %o4
ld [%o4 + 0x8], %o5
st %o5, [%fp + -0x10]
3) If- and While-Statements
if (x>y) y++; else x--;
The Sun SPARCstation compiler generates the following code:

ld  [%fp + -0x4], %o2
ld  [%fp + -0x8], %o3
cmp %o2, %o3
bg  L16
nop
b   L15
nop
L16:
ld  [%fp + -0x8], %o4
add %o4, 0x1, %o4
st  %o4, [%fp + -0x8]
b   L17
nop
L15:
ld  [%fp + -0x4], %o5
sub %o5, 0x1, %o5
st  %o5, [%fp + -0x4]
L17:
and while (x<y) y -= x;
The code generated by the Sun compiler for the while-loop is

ld  [%fp + -0x4], %o7
ld  [%fp + -0x8], %l0
cmp %o7, %l0
bl  L21
nop
b   L20
nop
L21:
L18:
ld  [%fp + -0x4], %l1
ld  [%fp + -0x8], %l2
sub %l2, %l1, %l2
st  %l2, [%fp + -0x8]
ld  [%fp + -0x4], %l3
ld  [%fp + -0x8], %l4
cmp %l3, %l4
bl  L18
nop
b   L22
nop
L22:
L20:
4) Function Definition and Call
We use the same C function definition as before,

int f(int x, int y)
{ return x + y + 1; }

and a corresponding call f(2+3, 4). The Sun compiler generates the following code for the call:

mov 0x5, %o0
mov 0x4, %o1
call _f, 2
And the code generated for the definition of f is

_f:
!#PROLOGUE# 0
sethi %hi(LF62), %g1
add   %g1, %lo(LF62), %g1
save  %sp, %g1, %sp
!#PROLOGUE# 1
st %i0, [%fp + 0x44]
st %i1, [%fp + 0x48]
L64:
.seg "text"
ld [%fp + 0x44], %o0
ld [%fp + 0x48], %o1
add %o0, %o1, %o0
add %o0, 0x1, %o0
b   LE62
nop
LE62:
mov %o0, %i0
ret
restore

Notes:
The call passes the arguments in registers %o0 and %o1, rather than on the stack;
The call indicates with the number 2 how many registers are used for this purpose;
The caller's "o" registers become the callee's "i" registers after the register window shift performed by the save instruction.
8.7 TM: A Simple Target Machine
The section following this one presents a code generator for the TINY language. It generates target code directly for a very simple machine that can be easily simulated. This machine is called TM (for Tiny Machine).
8.7.1 Basic Architecture of the Tiny Machine
TM consists of a read-only instruction memory, a data memory, and a set of eight general-purpose registers. These all use nonnegative integer addresses beginning at 0. Register 7 is the program counter and is the only special register, as described below. The C declarations:

#define IADDR_SIZE ...  /* size of instruction memory */
#define DADDR_SIZE ...  /* size of data memory */
#define NO_REGS 8       /* number of registers */
#define PC_REG 7

Instruction iMem[IADDR_SIZE];
int dMem[DADDR_SIZE];
int reg[NO_REGS];
TM performs a conventional fetch-execute cycle:

do
  /* fetch */
  currentInstruction = iMem[reg[pcRegNo]++];
  /* execute current instruction */
  ...
while (!(halt || error));

There are two basic instruction formats: register-only (RO) instructions and register-memory (RM) instructions. A register-only instruction has the format
opcode r, s, t
The complete instruction set of the Tiny Machine is listed on the next page.
RO Instructions
Format: opcode r, s, t

Opcode  Effect
HALT    stop execution (operands ignored)
IN      reg[r] ← integer value read from the standard input (s and t ignored)
OUT     reg[r] → the standard output (s and t ignored)
ADD     reg[r] = reg[s] + reg[t]
SUB     reg[r] = reg[s] - reg[t]
MUL     reg[r] = reg[s] * reg[t]
DIV     reg[r] = reg[s] / reg[t]  (may generate ZERO_DIV)
RM Instructions
Format: opcode r, d(s)
(a = d + reg[s]; any reference to dMem[a] generates DMEM_ERR if a < 0 or a ≥ DADDR_SIZE)

Opcode  Effect
LD      reg[r] = dMem[a]  (load r with memory value at a)
LDA     reg[r] = a        (load address a directly into r)
LDC     reg[r] = d        (load constant d directly into r; s is ignored)
ST      dMem[a] = reg[r]  (store value in r to memory location a)
JLT     if (reg[r] < 0) reg[PC_REG] = a  (jump to instruction a if r is negative; similarly for the following)
JLE     if (reg[r] <= 0) reg[PC_REG] = a
JGE     if (reg[r] >= 0) reg[PC_REG] = a
JGT     if (reg[r] > 0) reg[PC_REG] = a
JEQ     if (reg[r] == 0) reg[PC_REG] = a
JNE     if (reg[r] != 0) reg[PC_REG] = a
A register-memory instruction has the format
opcode r, d(s)
RM instructions include three different load instructions corresponding to the three addressing modes: "load constant" (LDC), "load address" (LDA), and "load memory" (LD). Since the instruction set is minimal, some comments are in order about how it can be used to achieve almost all standard programming language operations (more detail on pages 456-457 of the text):
(1) The target register in the arithmetic, IN, and load operations comes first; the source register(s) come second.
(2) All arithmetic operations are restricted to registers.
(3) There are no floating-point operations or floating-point registers.
(4) There are no addressing modes specifiable in the operands as in some assembly code.
(5) There is no restriction on the use of the pc in any of the instructions; thus LDA 7, d(s) is an unconditional jump.
(6) There is also no indirect jump instruction, but one can be imitated: LD 7, 0(1) jumps to the instruction whose address is stored in memory at the location pointed to by register 1.
(7) The conditional jump instructions (JLT, etc.) can be made relative to the current position in the program by using the pc as the second register; for example, JEQ 0, 4(7) jumps four instructions forward if register 0 is 0.
(8) There is no procedure call or JSUB instruction; a jump such as LD 7, d(s) serves instead, provided the return address has been saved appropriately.
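To make the instruction semantics concrete, here is a minimal C sketch of the simulator's execute step. It is a sketch only: the Instruction field names, the opcode enum, and the step/main helpers are hypothetical, and the DMEM_ERR bounds check is omitted.

#include <stdio.h>

#define DADDR_SIZE 1024
#define NO_REGS 8
#define PC_REG 7

typedef enum { opHALT, opADD, opSUB, opLD, opLDA, opLDC, opST, opJLT } OpCode;
typedef struct { OpCode op; int r, s, t, d; } Instruction;

static int dMem[DADDR_SIZE];
static int reg[NO_REGS];

/* Execute one already-fetched instruction; returns 0 on HALT, 1 otherwise. */
static int step(Instruction in)
{ int a = in.d + reg[in.s];   /* effective address for RM instructions */
  switch (in.op)
  { case opHALT: return 0;
    case opADD:  reg[in.r] = reg[in.s] + reg[in.t]; break;
    case opSUB:  reg[in.r] = reg[in.s] - reg[in.t]; break;
    case opLD:   reg[in.r] = dMem[a]; break;   /* bounds check omitted */
    case opLDA:  reg[in.r] = a; break;         /* LDA 7,d(s) acts as a jump */
    case opLDC:  reg[in.r] = in.d; break;      /* s is ignored */
    case opST:   dMem[a] = reg[in.r]; break;
    case opJLT:  if (reg[in.r] < 0) reg[PC_REG] = a; break;
    /* the remaining jumps (JLE, JGE, JGT, JEQ, JNE) are analogous */
  }
  return 1;
}

int main(void)
{ Instruction i = { opLDC, 0, 0, 0, 42 };  /* LDC 0,42(0): reg[0] = 42 */
  step(i);
  printf("reg[0] = %d\n", reg[0]);         /* prints reg[0] = 42 */
  return 0;
}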
8.7.2 The TM Simulator
The machine simulator accepts text files containing TM instructions as described above, with the following conventions:
An entirely blank line is ignored.
A line beginning with an asterisk is considered a comment and is ignored.
Any other line must contain an integer instruction location, followed by a colon, followed by a legal instruction. Any text occurring after the instruction is considered a comment and is ignored.
Figure 8.16 (next page) shows a TM program illustrating these format conventions.
* This program inputs an integer, computes
* its factorial if it is positive,
* and prints the result
0: IN   0,0,0    r0 = read
1: JLE  0,6(7)   if 0 < r0 then
2: LDC  1,1,0    r1 = 1
3: LDC  2,1,0    r2 = 1
*                repeat
4: MUL  1,1,0    r1 = r1 * r0
5: SUB  0,0,2    r0 = r0 - r2
6: JNE  0,-3(7)  until r0 == 0
7: OUT  1,0,0    write r1
8: HALT 0,0,0    halt
* end of program
Note: there is no need for locations to appear in ascending sequence as they do above.
For example, a code generator is likely to generate the code of Figure 8.16 in the following sequence:

0: IN   0,0,0
2: LDC  1,1,0
3: LDC  2,1,0
4: MUL  1,1,0
5: SUB  0,0,2
6: JNE  0,-3(7)
7: OUT  1,0,0
1: JLE  0,6(7)
8: HALT 0,0,0

The forward jump at location 1 is emitted last, once its target is known (backpatching).
8.8 Code Generation for the TINY Language
8.8.1 The TM Interface of the TINY Code Generator
Encapsulate some of the information the code generator needs to know about the TM in the files code.h and code.c, which are listed in Appendix B. We review here some of the features of the constant and function definitions in the code.h file. If a program has two variables x and y, and there are two temporary values currently stored in memory, then dMem would look as shown on the following page.
There are seven code-emitting functions: emitComment, emitRO, emitRM, emitSkip, emitBackup, emitRestore, and emitRM_Abs. (TraceCode is a flag controlling whether descriptive comments are emitted along with the code, not an emitting function.)
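The skip/backup functions exist to support forward jumps whose targets are not yet known. A sketch of the pattern, modeled on the if-statement case of genStmt in cgen.c (simplified: no else-part; it assumes cgen.c's surrounding declarations such as TreeNode, cGen, and the accumulator register ac, plus the code.h prototypes):

/* Reserve a slot for a forward jump, generate the then-part,
   then back up and fill in the jump (backpatching). */
static void genIf(TreeNode *tree)
{ int savedLoc, currentLoc;
  cGen(tree->child[0]);            /* test expression; value left in ac */
  savedLoc = emitSkip(1);          /* reserve one instruction slot */
  cGen(tree->child[1]);            /* then-part */
  currentLoc = emitSkip(0);        /* where execution continues after the if */
  emitBackup(savedLoc);            /* back up to the reserved slot */
  emitRM_Abs("JEQ", ac, currentLoc, "if: jmp to end");  /* fill in the jump */
  emitRestore();                   /* resume emitting at the high-water mark */
}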
8.8.2 The TINY Code Generator
The TINY code generator is contained in the file cgen.c; its only interface to the TINY compiler is the function codeGen, with prototype
void codeGen(void);
The codeGen function does the following:
It generates a few comments and instructions that set up the runtime environment on startup;
Then it calls the cGen function on the syntax tree;
Finally, it generates a HALT instruction to end the program.
A TINY syntax tree has the form given by the declarations

typedef enum { StmtK, ExpK } NodeKind;
typedef enum { IfK, RepeatK, AssignK, ReadK, WriteK } StmtKind;
typedef enum { OpK, ConstK, IdK } ExpKind;

#define MAXCHILDREN 3

typedef struct treeNode
{ struct treeNode *child[MAXCHILDREN];
  struct treeNode *sibling;
  int lineno;
  NodeKind nodekind;
  union { StmtKind stmt; ExpKind exp; } kind;
  union { TokenType op; int val; char *name; } attr;
  ExpType type;
} TreeNode;
The cGen function tests only whether a node is a statement or expression node (or null), calls the appropriate function genStmt or genExp, and then calls itself recursively on siblings, as in the sketch below.
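A sketch of this dispatch (the actual cgen.c may differ in details such as tracing comments):

static void cGen(TreeNode *tree)
{ if (tree != NULL)
  { switch (tree->nodekind)
    { case StmtK: genStmt(tree); break;   /* statement nodes */
      case ExpK:  genExp(tree);  break;   /* expression nodes */
      default: break;
    }
    cGen(tree->sibling);                  /* then the next node in sequence */
  }
}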
8.8.3 Generating and Using TM Code Files with the TINY Compiler
8.8.4 A Sample TM Code File Generated by the TINY Compiler
8.9 A Survey of Code Optimization Techniques
8.9.1 Principal Sources of Code Optimizations
(1) Register Allocation. Good use of registers is the most important feature of efficient code.
(2) Unnecessary Operations. The second major source of code improvement is to avoid generating code for operations that are redundant or unnecessary.
(3) Costly Operations. A code generator should not only look for unnecessary operations, but should also take advantage of opportunities to reduce the cost of necessary operations that can be implemented more cheaply than the source code or a simple implementation might indicate.
(4) Predicting Program Behavior. To perform some of the previously described optimizations, a compiler must collect information about the uses of variables, values, and procedures in programs: whether expressions are reused, whether or when variables change their values or remain constant, and whether procedures are called or not. A different approach is taken by some compilers: statistics about a program's behavior are gathered from actual executions and then used to predict which paths are most likely to be taken, which procedures are most likely to be called often, and which sections of code are likely to be executed the most frequently.
8.9.2 Classification of Optimizations
Two useful classifications are the time during the compilation process when an optimization can be applied and the area of the program over which the optimization applies.
The time of application during compilation: optimizations can be performed at practically every stage of compilation. For example, constant folding can be performed as early as parsing. Some optimizations can be delayed until after target code has been generated: the target code is examined and rewritten to reflect the optimization. For example, jump optimizations can be performed this way.
The majority of optimizations are performed either during intermediate code generation, just after intermediate code generation, or during target code generation. To the extent that an optimization does not depend on the characteristics of the target machine (a source-level optimization), it can be performed earlier than those that do depend on the target architecture (target-level optimizations). Some optimizations have both a source-level and a target-level component.
Consider also the effect that one optimization may have on another: for instance, it makes sense to propagate constants before performing unreachable code elimination. Occasionally a phase problem may arise, in that each of two optimizations may uncover further opportunities for the other. For example, consider the code

x = 1;
...
y = 0;
...
if (y) x = 0;
...
if (x) y = 1;
A first pass at constant propagation might result in the code

x = 1;
...
y = 0;
...
if (0) x = 0;
...
if (x) y = 1;

Now the body of the first if is unreachable code; eliminating it yields:

x = 1;
...
y = 0;
...
if (x) y = 1;

Constant propagation now applies again (x is still 1 at the final test), so the two passes must be repeated to catch everything; a schematic driver for this follows.
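One common way to resolve such phase interactions is to iterate the passes to a fixed point, repeating until a full round changes nothing. A sketch (the pass names and interfaces are hypothetical):

#include <stdbool.h>

/* Each pass returns true if it changed the intermediate code. */
extern bool propagateConstants(void);
extern bool eliminateUnreachableCode(void);

/* Repeat both passes until neither finds anything new. */
void optimizeToFixedPoint(void)
{ bool changed;
  do
  { changed = false;
    if (propagateConstants())       changed = true;
    if (eliminateUnreachableCode()) changed = true;
  } while (changed);
}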
The second classification scheme for optimizations that we consider is by the area of the program over which the optimization applies. The categories for this classification are called local, global, and inter-procedural optimizations:
(1) Local optimizations: applied to straight-line segments of code, or basic blocks.
(2) Global optimizations: applied to an individual procedure.
(3) Inter-procedural optimizations: applied beyond the boundaries of procedures, to the entire program.
8.9.3 Data Structures and Implementation Techniques for Optimizations
Some optimizations can be made by transformations on the syntax tree itself, including constant folding and unreachable code elimination. However, the syntax tree is an unwieldy or unsuitable structure for collecting information and performing optimizations. Instead, an optimizer that performs global optimizations will construct, from the intermediate code of each procedure, a graphical representation of the code called a flow graph. The nodes of a flow graph are the basic blocks, and the edges are formed from the conditional and unconditional jumps. Each basic block node contains the sequence of intermediate code instructions of the block.
A single pass over the intermediate code can construct a flow graph, together with each of its basic blocks. Each new basic block is identified as follows:
The first instruction begins a new basic block;
Each label that is the target of a jump begins a new basic block;
Each instruction that follows a jump begins a new basic block.
A sketch of this leader-marking pass appears below.
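A C sketch of the leader-marking rules, over a hypothetical three-address instruction representation (the Instr fields are assumptions for illustration):

#include <stdbool.h>

typedef struct
{ bool isJump;   /* conditional or unconditional jump */
  int  target;   /* index of the jump target, valid when isJump */
  bool leader;   /* set below: this instruction starts a basic block */
} Instr;

void markLeaders(Instr *code, int n)
{ for (int i = 0; i < n; i++) code[i].leader = false;
  if (n > 0) code[0].leader = true;              /* rule 1: first instruction */
  for (int i = 0; i < n; i++)
    if (code[i].isJump)
    { code[code[i].target].leader = true;        /* rule 2: jump target */
      if (i + 1 < n) code[i + 1].leader = true;  /* rule 3: after a jump */
    }
}

/* Each basic block then runs from one leader up to (not including) the next. */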
A standard data flow analysis problem is to compute, for each variable, the set of so-called reaching definitions of that variable at the beginning of each basic block. Here a definition is an intermediate code instruction that can set the value of the variable, such as an assignment or a read.
Another data structure frequently constructed for each block is called the DAG of a basic block. A DAG traces the computation and reassignment of values and variables in a basic block as follows. Values that are used in the block but come from elsewhere are represented as leaf nodes.
Operations on those and other values are represented by interior nodes. Assignment of a new value is represented by attaching the name of the target variable or temporary to the node representing the value assigned. For example, Figure 8.19 in the text shows such a DAG; it is used again below.
Repeated use of the same value is also represented in the DAG structure. For example, the C assignment x = (x+1)*(x+1) translates into the three-address instructions:

t1 = x + 1
t2 = x + 1
t3 = t1 * t2
x = t3

The DAG for this sequence of instructions shows the repeated use of the expression x+1: both operand edges of the product point to the same node.
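One possible rendering of that DAG (a sketch standing in for the book's figure):

     *    ← x, t3
    / \
    \ /
     +    ← t1, t2
    / \
   x   1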
The DAG of a basic block can be constructed by maintaining two dictionaries:
A table containing variable names and constants, with a lookup operation that returns the DAG node to which a variable name is currently assigned;
A table of DAG nodes, with a lookup operation that, given an operation and child nodes, returns the node with that operation and those children, if one exists.
Target code, or a revised version of the intermediate code, can then be generated from a DAG by a traversal according to any of the possible topological sorts of the nonleaf nodes. A sketch of the construction loop follows.
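A C sketch of the construction step for one instruction of the form a = b op c, using the two dictionaries just described (all names here are hypothetical):

typedef struct DagNode DagNode;

extern DagNode *varLookup(const char *name);             /* node a name refers to, or NULL */
extern void     varAssign(const char *name, DagNode *n); /* (re)attach a name to a node */
extern DagNode *leafFor(const char *name);               /* new leaf for an incoming value */
extern DagNode *nodeLookup(int op, DagNode *l, DagNode *r); /* existing node, or NULL */
extern DagNode *nodeCreate(int op, DagNode *l, DagNode *r);

void dagProcess(const char *a, const char *b, int op, const char *c)
{ DagNode *lhs = varLookup(b);
  DagNode *rhs = varLookup(c);
  if (lhs == NULL) { lhs = leafFor(b); varAssign(b, lhs); } /* value from outside the block */
  if (rhs == NULL) { rhs = leafFor(c); varAssign(c, rhs); }
  DagNode *n = nodeLookup(op, lhs, rhs);  /* a hit here is a common subexpression */
  if (n == NULL) n = nodeCreate(op, lhs, rhs);
  varAssign(a, n);                        /* a now names the computed value */
}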
One possible topological traversal of the DAG of Figure 8.19 yields the three-address code

t3 = x - 1
t2 = fact * x
x = t3
t4 = x == 0
fact = t2

Of course, we wish to avoid the unnecessary use of temporaries, and so would want to generate the following equivalent three-address code, whose order must remain fixed:

fact = fact * x
x = x - 1
t4 = x == 0
A similar traversal of the DAG of the x = (x+1)*(x+1) example results in the following revised three-address code:

t1 = x + 1
x = t1 * t1

By using a DAG to generate target code for a basic block, we automatically get local common subexpression elimination. The DAG representation also makes it possible to eliminate redundant stores and tells us how many references to each value there are.
A final method, often used to assist register allocation as code generation proceeds, involves the maintenance of data called register descriptors and address descriptors:
Register descriptors associate with each register a list of the variable names whose value is currently in that register;
Address descriptors associate with each variable name the locations where its value is currently to be found.
For example, take the basic block DAG of Figure 8.19 and consider the generation of TM code according to a left-to-right traversal of the interior nodes, using the three registers 0, 1, and 2. Assume that there are four kinds of address descriptor: inReg(reg_no), isGlobal(global_offset), isTemp(temp_offset), and isConst(value).
Assume further that x is in global location 0, that fact is in global location 1, that global locations are accessed via the gp register, and that temporary locations are accessed via the mp register. Finally, assume also that none of the registers begin with any values in them. Then, before code generation for the basic block begins, the address descriptors for the variables and constants would be as follows:

Variable/Constant   Address Descriptors
x                   isGlobal(0)
fact                isGlobal(1)
1                   isConst(1)
0                   isConst(0)
Now assume that the following code is generated:

LD  0,1(gp)   load fact into reg 0
LD  1,0(gp)   load x into reg 1
MUL 0,0,1     reg 0 = fact * x

The address descriptors would now be:

Variable/Constant   Address Descriptors
x                   isGlobal(0), inReg(1)
fact                inReg(0)
1                   isConst(1)
0                   isConst(0)

(The new value of fact exists only in register 0, since it has not yet been stored.)
And the register descriptors would be:

Register   Variables Contained
0          fact
1          x
2          (none)
Now, given the subsequent code

LDC 2,1(0)   load constant 1 into reg 2
SUB 1,1,2    reg 1 = x - 1

the address descriptors would become:

Variable/Constant   Address Descriptors
x                   inReg(1)
fact                inReg(0)
1                   isConst(1), inReg(2)
0                   isConst(0)

and the register descriptors would become:

Register   Variables Contained
0          fact
1          x
2          1
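The bookkeeping behind such tables might be represented as follows; this is a sketch only, with hypothetical names and a fixed-size layout:

#define NO_REGS  8
#define MAX_LOCS 4

typedef enum { inReg, isGlobal, isTemp, isConst } DescKind;
typedef struct { DescKind kind; int value; } Location;  /* one place a value lives */

typedef struct
{ const char *name;          /* variable or constant */
  Location locs[MAX_LOCS];   /* everywhere its current value can be found */
  int nLocs;
} AddrDesc;

typedef struct
{ const char *names[MAX_LOCS]; /* names whose current value is in this register */
  int nNames;
} RegDesc;

RegDesc regDesc[NO_REGS];

/* After emitting LD r,off(gp): register r now holds v's value alone,
   and v's value can be found in register r as well as in memory. */
void noteLoad(int r, AddrDesc *v)
{ regDesc[r].nNames = 0;                     /* r's previous contents are gone */
  regDesc[r].names[regDesc[r].nNames++] = v->name;
  v->locs[v->nLocs].kind  = inReg;           /* append inReg(r) to v's locations */
  v->locs[v->nLocs].value = r;
  v->nLocs++;
}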
8.10 Simple Optimizations for the TINY Code Generator
Primarily, the inefficiencies in the generated code are due to two sources:
The TINY code generator makes very poor use of the registers of the TM machine;
The TINY code generator unnecessarily generates the logical values 0 and 1 for tests, even though these tests appear only in if-statements and while-statements, where simpler code will do.
In this section we indicate how even relatively crude techniques can substantially improve the code generated by the TINY compiler.
8.10.1 Keeping Temporaries in Registers
The first optimization is an easy method for keeping temporaries in registers rather than constantly storing and reloading them from memory. In the TINY code generator, temporaries were always stored at the location tmpOffset(mp), where tmpOffset is a static variable initialized to 0, decremented each time a temporary is stored, and incremented each time it is reloaded. A simple way to use registers as temporary locations is to interpret tmpOffset as initially referring to registers, and only after the available registers are exhausted to use it as an actual offset into memory. A sketch of this scheme follows.
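A C sketch of the modified store/reload logic. The helper names and register numbering are assumptions (here registers 1 and 2, beyond the accumulator in register 0, hold the first two temporaries); emitRM, ac, ac1, and mp are as in cgen.c, and LDA r,0(s) serves as a register-to-register move on TM:

extern void emitRM(char *op, int r, int d, int s, char *c);  /* from code.h */
#define ac  0             /* accumulator, as in cgen.c */
#define ac1 1             /* second accumulator */
#define mp  6             /* memory pointer */

#define FIRST_TMP_REG 1   /* assumption: registers 1..2 hold temporaries */
#define NUM_TMP_REGS  2
static int tmpOffset = 0; /* <= 0; -tmpOffset counts live temporaries */

static void storeTmp(void)         /* where cgen.c emitted ST ac,tmpOffset--(mp) */
{ if (-tmpOffset < NUM_TMP_REGS)   /* a temporary register is still free */
    emitRM("LDA", FIRST_TMP_REG - tmpOffset, 0, ac, "op: move temp to reg");
  else                             /* registers exhausted: spill to memory */
    emitRM("ST", ac, tmpOffset + NUM_TMP_REGS, mp, "op: push temp");
  tmpOffset--;
}

static void loadTmp(void)          /* where cgen.c emitted LD ac1,++tmpOffset(mp) */
{ tmpOffset++;
  if (-tmpOffset < NUM_TMP_REGS)
    emitRM("LDA", ac1, 0, FIRST_TMP_REG - tmpOffset, "op: move temp from reg");
  else
    emitRM("LD", ac1, tmpOffset + NUM_TMP_REGS, mp, "op: load temp");
}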
With this improvement, the TINY code generator now generates the TM code sequence given in Figure 8.21.
8.10.2 Keeping Variables in Registers
A further improvement can be made in the use of the TM registers by reserving some of them for use as variable locations. A basic scheme is simply to pick a few registers and allocate them as the locations for the most-used variables of the program. With these modifications, the code for the sample program might now use register 3 for the variable x and register 4 for the variable fact, assuming that registers 0 through 2 are still reserved for temporaries.
8.10.3 Optimizing Test Expressions
The final optimization is to simplify the code generated for tests in if-statements and while-statements. This improvement depends on the fact that a comparison operator must appear as the root node of the test expression. The genExp code for this operator simply generates code to subtract the value of the right-hand operand from the left-hand operand, leaving the result in register 0. The code for the if-statement or while-statement then tests which comparison operator was applied and generates the appropriate conditional jump. For example, the test in
if 0 < x then ...
corresponds to the unoptimized TM code

4: SUB 0,0,3
5: JLT 0,2(7)
6: LDC 0,0(0)
7: LDA 7,1(7)
8: LDC 0,1(0)
9: JEQ 0,15(7)

The optimized compiler will generate instead the simpler TM code

4: SUB 0,0,3
5: JGE 0,10(7)

A sketch of the operator-to-jump mapping follows.
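A C sketch of how the if/while code might pick the jump taken when the test is false; TINY's comparison tokens are LT and EQ (from globals.h), and after the subtraction the test value sits in register 0 (ac):

/* Map the comparison at the root of a test to the TM jump that is taken
   when the test FAILS; the helper name is hypothetical. */
static const char *falseJump(TokenType op)
{ switch (op)
  { case LT: return "JGE";  /* a <  b is false exactly when a - b >= 0 */
    case EQ: return "JNE";  /* a == b is false exactly when a - b != 0 */
    default: return "JEQ";  /* non-comparison test: jump if the value is 0 */
  }
}

The statement code can then emit a single emitRM_Abs(falseJump(op), ac, targetLoc, ...) in place of the six-instruction sequence above.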
With this optimization, the code generated for the test program becomes that given in Figure 8.23.
End of Part Three THANKS