Intermediate code generation
Intermediate Code Generation
• The final phase of the compiler front-end
• Goal: translate the program into a format
expected by the compiler back-end
• In typical compilers: followed by intermediate
code optimization and machine code
generation
Why use an intermediate representation?
• It makes optimization easier: write optimization methods only
for the intermediate representation
• The intermediate representation can be directly interpreted
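• It makes retargeting easier: one intermediate representation can feed back ends for
different machines, e.g. SPARC (Scalable Processor Architecture) and MIPS
(Microprocessor without Interlocked Pipelined Stages)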
Why Intermediate Representation?
How to choose the intermediate representation?
• It should be easy to translate the source language
to the intermediate representation
• It should be easy to translate the intermediate
representation to the machine code
• The intermediate representation should be
suitable for optimization
• It should be neither too high level nor too low
level
• A single compiler can have more than one
intermediate representation
Common Intermediate representations
• General forms of intermediate representations (IR):
– Graphical IR (e.g. parse trees, abstract syntax trees, DAGs)
– Linear IR (i.e. non-graphical)
– Three Address Code (TAC): instructions of the form “result
= op1 operator op2”
– Static single assignment (SSA) form: each variable is
assigned exactly once.
Original code    SSA form
Y = 1            Y1 = 1
Y = 2            Y2 = 2
X = Y            X1 = Y2
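The renaming in the example above can be sketched in Python. This is a minimal sketch for straight-line code only: real SSA construction also needs φ-functions where control-flow paths join, which are omitted here. The function name `to_ssa` and the tuple encoding of statements are assumptions made for illustration.

```python
def to_ssa(stmts):
    """Rename straight-line assignments so each variable is assigned once.

    stmts is a list of (target, operands) pairs; operands are variable
    names or integer constants. Branches are not handled.
    """
    version = {}  # variable -> latest version number
    out = []
    for target, operands in stmts:
        # A use refers to the latest version defined so far.
        renamed = [f"{o}{version[o]}" if o in version else o for o in operands]
        # A definition creates a fresh version of the target.
        version[target] = version.get(target, 0) + 1
        out.append((f"{target}{version[target]}", renamed))
    return out


program = [("Y", [1]), ("Y", [2]), ("X", ["Y"])]
ssa = to_ssa(program)
for target, operands in ssa:
    print(target, "=", *operands)
```

Because the use of Y in the last statement is renamed to the latest version, the sketch makes explicit that X depends on the second assignment to Y and not the first.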
Example IR in programming languages
• Java bytecode (executed on the Java Virtual Machine)
• C is used by several compilers as an intermediate
representation (e.g. compilers for Lisp, Haskell, Cython, ...)
• Microsoft’s Common Intermediate Language (CIL)
• GNU Compiler Collection (GCC) uses abstract syntax trees
Position of Intermediate code generator
• Intermediate code is the interface between front end
and back end in a compiler
Parser -> Static Checker -> Intermediate Code Generator -> Code Generator
Front end: Parser, Static Checker, Intermediate Code Generator
Back end: Code Generator
Abstract Syntax Tree vs. Concrete Syntax (Parse) Tree
Variants of syntax trees - DAG
• For expressions, a DAG can be built instead of a syntax tree.
• A directed acyclic graph (DAG) is an AST with a unique node for each value.
• It exposes common sub-expressions, and that knowledge can be used during code
generation.
• The node for a common sub-expression has more than one parent, e.g. a and b-c below.
• Example: a+a*(b-c)+(b-c)*d
• The nodes for a and (b-c) each appear once, although their values are used in two different contexts.
DAG for the example (children listed left to right):
+ (root): +, *
+ (left child of the root): a, *
* (child of the left +): a, -
* (right child of the root): -, d
- (shared by both * nodes): b, c
DAG’s – using Array
• Algorithm
– Search the array for a node m with label op, left child l and right child r
– If there is such a node, return the value number m
– If not, create in the array a new node n with label op, left child l, and right child
r, and return its value number n.
– The search for m can be made more efficient by using k lists and using a hash
function to determine which lists to check.
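Example: nodes of the DAG for i := i + 10, stored in an array of records
index   op    left          right
0       id    entry for i
1       num   10
2       +     0             1
3       =     0             2
Value-number method for constructing a node in a DAG
Input: label op, left child l, right child r
Output: the value number of a node in the array with signature (op, l, r)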
Data structure - Array
Data structure – Hash table
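The value-number method with a hash table can be sketched as follows. This is an illustrative sketch, not the slides' own implementation: the class name `ValueNumberTable` is hypothetical, and a Python dictionary stands in for the hash function and its lists.

```python
class ValueNumberTable:
    """Array of records plus a hash index keyed by the signature (op, l, r)."""

    def __init__(self):
        self.records = []  # records[n] is the node with value number n
        self.index = {}    # signature -> value number

    def node(self, op, left=None, right=None):
        signature = (op, left, right)
        if signature in self.index:      # such a node exists: reuse it
            return self.index[signature]
        self.records.append(signature)   # otherwise append a new record
        number = len(self.records) - 1
        self.index[signature] = number
        return number


table = ValueNumberTable()
i = table.node("id", "i")          # leaf for the identifier i
ten = table.node("num", 10)        # leaf for the constant 10
plus = table.node("+", i, ten)     # i + 10
assign = table.node("=", i, plus)  # i := i + 10
again = table.node("+", i, ten)    # same signature, so same value number
```

The dictionary lookup replaces the linear search of the array, so finding an existing node takes expected constant time.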
SDD for creating DAG’s
Grammar Productions    Semantic Rules
1) E -> E1+T           {E.node = mknode(‘+’, E1.node, T.node)}
2) E -> E1-T           {E.node = mknode(‘-’, E1.node, T.node)}
3) E -> T              {E.node = T.node}
4) T -> (E)            {T.node = E.node}
5) T -> id             {T.node = mkleaf(id, id.entry)}
6) T -> num            {T.node = mkleaf(num, num.val)}
Example: steps for constructing the DAG for a+a*(b-c)+(b-c)*d
1) p1=mkleaf(id, entry-a)
2) p2=mkleaf(id, entry-a)=p1
3) p3=mkleaf(id, entry-b)
4) p4=mkleaf(id, entry-c)
5) p5=mknode(‘-’,p3,p4)
6) p6=mknode(‘*’,p1,p5)
7) p7=mknode(‘+’,p1,p6)
8) p8=mkleaf(id,entry-b)=p3
9) p9=mkleaf(id,entry-c)=p4
10) p10=mknode(‘-’,p3,p4)=p5
11) p11=mkleaf(id,entry-d)
12) p12=mknode(‘*’,p5,p11)
13) p13=mknode(‘+’,p7,p12)
Example
• To work out a+(b-c)+(b-c)
– Construct Syntax tree, DAG
Exercise
• Construct the syntax tree, DAG, and array of records for the
expression a := b*(-c)+b*(-c)
• Use the SDD that produces syntax trees for assignment statements
Three address code: Addresses
• Three-address code is built from two concepts:
– addresses and instructions.
• An instruction is a statement or operation
– At most one operator on the right side of an instruction.
– 3-address code form:
x = y op z
• An address can be
– Name: a source-program name, represented by a pointer to its symbol-table entry.
– Constant: a constant in the program.
– Temporary: a compiler-generated name such as t1.
Forms or types of three address instructions
• Binary assignment ----- x := y op z
• Unary assignment ----- x := op y
• Copy statements ----- x := y
• Unconditional jump ----- goto L
• Conditional jump ----- if x relop y goto L [relop is <, =, >=, etc.]
• Procedure calls: for the call y = p(x1, x2, …, xn), the generated three-address code is
param x1
param x2
…
param xn
y = call p, n
• Indexed assignments ----- x := y[i] and x[i] := y
• Address and pointer assignments ----- x := &y and x := *y and *x := y
Ex: Write Three Address Code for the block of statements
int a;
int b;
a = 5 + 2 * b;
Solution:
t0 = 5;
t1 = 2 * b;
a = t0 + t1;
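A translation of this kind can be sketched in Python. The sketch borrows Python's own `ast` module as a stand-in for the compiler front end, which is an assumption made purely for illustration, and the function name `gen_tac` is hypothetical. It creates one temporary per operator, so its output is equivalent to the solution above but does not copy the constant 5 into a temporary first.

```python
import ast
import itertools


def gen_tac(source):
    """Emit three-address code for one assignment such as 'a = 5 + 2 * b'."""
    ops = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}
    code = []
    temps = itertools.count()

    def walk(node):
        """Return the address (name, constant, or temporary) holding node."""
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.BinOp):
            left = walk(node.left)
            right = walk(node.right)
            temp = f"t{next(temps)}"  # one new temporary per operator
            code.append(f"{temp} = {left} {ops[type(node.op)]} {right}")
            return temp
        raise NotImplementedError(type(node).__name__)

    stmt = ast.parse(source).body[0]  # expects a single assignment
    result = walk(stmt.value)
    code.append(f"{stmt.targets[0].id} = {result}")
    return code


tac = gen_tac("a = 5 + 2 * b")
print("\n".join(tac))
```

The post-order walk guarantees that every operand has an address before the instruction that uses it is emitted.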
Ex: Write Three Address Code for the if-else
if (A < B)
{
t = 1
}
else
{
t = 0
}
Solution-
(1) if (A < B) goto (4)
(2) t=0
(3) goto (5)
(4) t = 1
(5)
Ex: Write Three Address Code for the if-else
if (A < B) && (C < D)
{
t = 1
}
else
{
t = 0
}
Solution-
(1) if (A < B) goto (3)
(2) goto (4)
(3) if (C < D) goto (6)
(4) t = 0
(5) goto (7)
(6) t = 1
(7)
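The short-circuit translation used in this solution can be sketched as a small generator of jumping code. It is a sketch under assumptions: conditions are encoded as hypothetical tuples such as ("and", c1, c2), symbolic labels are used instead of the position numbers above, and redundant jumps are not removed.

```python
import itertools

labels = itertools.count(1)


def new_label():
    return f"L{next(labels)}"


def jumping_code(cond, true_label, false_label, out):
    """Append code that jumps to true_label or false_label for cond."""
    kind = cond[0]
    if kind == "rel":    # ("rel", "A < B")
        out.append(f"if {cond[1]} goto {true_label}")
        out.append(f"goto {false_label}")
    elif kind == "and":  # the right operand runs only if the left is true
        mid = new_label()
        jumping_code(cond[1], mid, false_label, out)
        out.append(f"{mid}:")
        jumping_code(cond[2], true_label, false_label, out)
    elif kind == "or":   # the right operand runs only if the left is false
        mid = new_label()
        jumping_code(cond[1], true_label, mid, out)
        out.append(f"{mid}:")
        jumping_code(cond[2], true_label, false_label, out)
    else:
        raise ValueError(kind)


def if_else(cond, then_code, else_code):
    """Translate if (cond) then_code else else_code into jumping code."""
    out = []
    l_true, l_false, l_end = new_label(), new_label(), new_label()
    jumping_code(cond, l_true, l_false, out)
    out += [f"{l_true}:"] + then_code + [f"goto {l_end}"]
    out += [f"{l_false}:"] + else_code + [f"{l_end}:"]
    return out


code = if_else(("and", ("rel", "A < B"), ("rel", "C < D")), ["t = 1"], ["t = 0"])
print("\n".join(code))
```

As in the hand-written solution, C < D is tested only when A < B holds; otherwise control goes straight to the else branch.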
Ex: Write Three Address Code for the while statements
a=3; b=4; i=0;
while(i<n){
a= b+1;
a=a*a;
i++;
}
c=a;
Solution:
a=3;
b=4;
i=0;
L1:
T1=i<n;
if T1 goto L2;
goto L3;
L2:
T2=b+1;
a=T2;
T3=a*a;
a=T3;
T4=i+1;
i=T4;
goto L1;
L3:
c=a;
Ex: Write Three Address Code for the switch case
switch (ch)
{
case 1 : c = a + b;
break;
case 2 : c = a - b;
break;
}
Solution-
if ch = 1 goto L1
if ch = 2 goto L2
goto Last
L1:
T1 = a + b
c = T1
goto Last
L2:
T2 = a - b
c = T2
goto Last
Last:
Instructions in Three Address Code
Choice of allowable operators
• It is an important issue in the design of an intermediate form
• A small operator set is easier to implement
• A restricted instruction set may force the front end to generate long
sequences of statements for some source language operations
• The optimizer and code generator may then have to work harder if
good code is to be generated
• The operator set should be rich enough to implement the source language
operations, yet close enough to machine instructions to simplify code
generation
Example
do
i = i+1;
while (a[i*8] < v);
Symbolic labels:
L: t1 = i + 1
i = t1
t2 = i * 8
t3 = a[t2]
if t3 < v goto L
Position numbers:
100: t1 = i + 1
101: i = t1
102: t2 = i * 8
103: t3 = a[t2]
104: if t3 < v goto 100
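Converting symbolic labels into position numbers can be sketched as a two-pass procedure. The function name `resolve_labels` is hypothetical, and the sketch assumes that every label shares a line with an instruction, that instructions use "=" and not ":=", and that a jump target is the last word of its instruction.

```python
def resolve_labels(lines, start=100):
    """Replace symbolic labels by instruction positions, numbering from start."""
    positions = {}
    instructions = []
    # First pass: record the position of each labelled instruction.
    for line in lines:
        label, sep, rest = line.partition(":")
        if sep:
            positions[label.strip()] = start + len(instructions)
            line = rest
        instructions.append(line.strip())
    # Second pass: rewrite the targets of jumps.
    resolved = []
    for offset, instr in enumerate(instructions):
        words = instr.split()
        if "goto" in words:
            words[-1] = str(positions[words[-1]])
        resolved.append(f"{start + offset}: {' '.join(words)}")
    return resolved


symbolic = [
    "L: t1 = i + 1",
    "i = t1",
    "t2 = i * 8",
    "t3 = a[t2]",
    "if t3 < v goto L",
]
numbered = resolve_labels(symbolic)
print("\n".join(numbered))
```

Two passes are needed in general because a jump may refer to a label that appears later in the code.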
Syntax tree vs. DAG vs. Three address code
• AST is the procedure’s parse tree with the nodes for most non-terminal symbols
removed.
• Directed Acyclic Graph is an AST with a unique node for each value.
• Three address code is a sequence of statements of the general form x := y op z
• In a TAC there is at most one operator at the right side of an instruction.
• Example: a+a*(b-c)+(b-c)*d
t1 = b - c
t2 = a * t1
t3 = a + t2
t4 = t1 * d
t5 = t3 + t4
[Figure: AST, DAG, and TAC for a+a*(b-c)+(b-c)*d]
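The relation between the DAG and the three-address code in this example can be sketched as follows: each interior node of the DAG receives one temporary, and a repeated sub-expression reuses the temporary of its existing node. The class `Dag` and its methods are hypothetical names chosen for illustration.

```python
class Dag:
    """Shared expression nodes; each interior node gets one temporary."""

    def __init__(self):
        self.nodes = {}  # (operator, left, right) -> address holding the value
        self.code = []   # emitted three-address instructions

    def leaf(self, name):
        return name      # the address of a leaf is simply its name

    def op(self, operator, left, right):
        key = (operator, left, right)
        if key not in self.nodes:  # first occurrence: emit an instruction
            temp = f"t{len(self.nodes) + 1}"
            self.code.append(f"{temp} = {left} {operator} {right}")
            self.nodes[key] = temp
        return self.nodes[key]     # repeated sub-expression: reuse the node


dag = Dag()
a, b, c, d = (dag.leaf(x) for x in "abcd")
# a + a*(b-c) + (b-c)*d, built bottom-up in evaluation order
root = dag.op("+",
              dag.op("+", a, dag.op("*", a, dag.op("-", b, c))),
              dag.op("*", dag.op("-", b, c), d))
print("\n".join(dag.code))
```

The second request for b - c finds the existing node, so only five instructions are emitted, where a plain syntax tree would yield six.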
Data structures for three address codes
• Implementations of Three-Address statements
– Quadruples
• Has four fields: op, arg1, arg2 and result
– Triples
• Temporaries are not used and instead references to
instructions are made
– Indirect triples
• In addition to triples we use a list of pointers to triples
Example
• a = b * uminus c + b * uminus c
(or)
a = b * (-c) + b * (-c)
t1 = uminus c
t2 = b * t1
t3 = uminus c
t4 = b * t3
t5 = t2 + t4
a = t5
Three address code
Quadruples
      op       arg1   arg2   result
(0)   uminus   c             t1
(1)   *        b      t1     t2
(2)   uminus   c             t3
(3)   *        b      t3     t4
(4)   +        t2     t4     t5
(5)   =        t5            a
Triples
      op       arg1   arg2
(0)   uminus   c
(1)   *        b      (0)
(2)   uminus   c
(3)   *        b      (2)
(4)   +        (1)    (3)
(5)   =        a      (4)
Indirect Triples
The triple table is the same as above; a separate instruction list points into it.
instruction   triple
35            (0)
36            (1)
37            (2)
38            (3)
39            (4)
40            (5)
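The step from quadruples to triples and indirect triples can be sketched in Python. This is an illustrative sketch: the function name `to_triples` is hypothetical, each temporary is assumed to be assigned exactly once, and a reference to an earlier triple is represented by an integer position while names remain strings.

```python
quads = [
    ("uminus", "c", None, "t1"),
    ("*", "b", "t1", "t2"),
    ("uminus", "c", None, "t3"),
    ("*", "b", "t3", "t4"),
    ("+", "t2", "t4", "t5"),
    ("=", "t5", None, "a"),
]


def to_triples(quads):
    """Replace each temporary by the position of the triple computing it."""
    defined_at = {}  # temporary name -> index of the defining triple
    triples = []
    for op, arg1, arg2, result in quads:
        arg1 = defined_at.get(arg1, arg1)
        arg2 = defined_at.get(arg2, arg2)
        if op == "=":  # copy: (=, target, source)
            triples.append((op, result, arg1))
        else:
            defined_at[result] = len(triples)
            triples.append((op, arg1, arg2))
    return triples


triples = to_triples(quads)
# Indirect triples: a separate instruction list pointing into the triple table.
instruction_list = list(range(len(triples)))
for position, triple in enumerate(triples):
    print(position, triple)
```

With indirect triples, an optimizer can reorder `instruction_list` without touching `triples`, so the positions stored inside the triples stay valid.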
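Compare Quadruples, Triples & Indirect Triples
• When instructions are moved around during optimization,
quadruples are better than triples.
– Quadruples name results with temporary variables, so moving an instruction does not change how other instructions refer to its result.
– Triples refer to results by position, so moving an instruction means updating every reference to it.
• Indirect triples solve this problem: the optimizer reorders the list of pointers and leaves the triples themselves unchanged.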
Syntax tree vs. Triples vs. 3AC
More triple representation
• x[i] := y
      op       arg1   arg2
(0)   []=      x      i
(1)   assign   (0)    y
• x := y[i] //Exercise
      op       arg1   arg2
(0)   =[]      y      i
(1)   assign   x      (0)
Exercise
• Give the three-address code representations for
a+a*(b-c)+(b-c)*d
– Quadruples?
– Triples?
– Indirect Triples?
Types and Declarations
• Type checking: ensures that the types of the operands match those expected
by the context (operator).
– E.g. the mod operation needs integer operands
• Determine the storage needed
• Calculate the address of an array reference
• Insert explicit type conversions
• When declarations are together, a single offset on the stack pointer
suffices.
• int x, y, z; fun1(); fun2();
• Otherwise, the translator needs to keep track of the current offset.
• int x; fun1(); int y, z; fun2();
Storage layout
• From the type of a name, we can determine the amount of
storage it needs at run time.
• At compile time, we use this amount to
assign the name a relative address.
• The type and relative address are saved in the
symbol table entry of the name.
• For data whose length is determined only at run time,
a pointer to the data is stored instead.
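Assigning relative addresses from types can be sketched as follows. The widths in `WIDTH` are assumed values, not taken from the slides, alignment is ignored, and the tuple form ("array", n, element_type) is a hypothetical encoding of array types.

```python
WIDTH = {"int": 4, "float": 8, "char": 1}  # assumed widths in bytes


def width(type_expr):
    """Width of a basic type or of ("array", n, element_type)."""
    if isinstance(type_expr, str):
        return WIDTH[type_expr]
    constructor, count, element = type_expr
    if constructor != "array":
        raise ValueError(constructor)
    return count * width(element)


def lay_out(declarations):
    """Assign consecutive relative addresses; return the table and total size."""
    symbol_table = {}
    offset = 0
    for name, type_expr in declarations:
        symbol_table[name] = (type_expr, offset)  # type and relative address
        offset += width(type_expr)
    return symbol_table, offset


table, total = lay_out([
    ("x", "int"),
    ("y", "float"),
    ("a", ("array", 2, ("array", 3, "int"))),
])
for name, (type_expr, address) in table.items():
    print(name, address, type_expr)
```

Each name receives the running offset as its relative address, and the offset then advances by the width of the name's type.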
Type Systems Design
• The design is based on the syntactic constructs of the
language, the notion of types, and the rules for
assigning types to language constructs.
• E.g.
– In arithmetic operations such as addition,
subtraction, multiplication and division, if both
operands are integers then the result is also an integer.
Type Expressions
• The type of a language construct is denoted by a type expression.
• A type expression is either a basic type or is formed by applying an operator (type
constructor) to other type expressions.
Equivalence of Type Expression
• Two types are structurally equivalent iff one of the following
conditions is true.
• They are the same basic type.
• They are formed by applying the same constructor to structurally
equivalent types.
• One is a type name that denotes the other.
• int a[2][3] is not equivalent to int b[3][2];
• int a is not equivalent to char b[4];
• struct {int, char} is not equivalent to struct {char, int};
• int * is not equivalent to void *.
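The three conditions can be sketched as a recursive test. The encoding is an assumption made for illustration: basic types are strings, constructed types are tuples such as ("array", 2, "int"), and `names` maps a type name to the type expression it denotes (assumed to be free of cycles).

```python
def equivalent(s, t, names=None):
    """Structural equivalence of two type expressions."""
    names = names or {}
    # A type name is replaced by the type expression it denotes.
    while isinstance(s, str) and s in names:
        s = names[s]
    while isinstance(t, str) and t in names:
        t = names[t]
    if isinstance(s, tuple) and isinstance(t, tuple):
        # Same constructor applied to structurally equivalent components.
        return (len(s) == len(t) and s[0] == t[0] and
                all(equivalent(x, y, names) for x, y in zip(s[1:], t[1:])))
    return s == t  # same basic type, or equal array bounds


a = ("array", 2, ("array", 3, "int"))  # int a[2][3]
b = ("array", 3, ("array", 2, "int"))  # int b[3][2]
row = ("record", "int", "char")
print(equivalent(a, b))
print(equivalent("row", row, {"row": row}))
```

The checks at the end mirror the examples above: the two array types differ in their bounds, while a type name is equivalent to the expression it denotes.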
Type checking rules for expressions
• Basic type expressions: Boolean, char, integer, float
• Eliteral {E.type = char }
• E num {E.type = integer}
• E id {E.type = lookup(id.entry) }
Type checking Expressions (cont..d)
• Special basic type expressions: type_error (raised when an error is found
during type checking) and void.
E -> E1 mod E2 {E.type = if E1.type == integer and
                            E2.type == integer then
                            integer
                          else
                            type_error}
• A type name is a type expression
E -> a {E.name = a}
Type checking Expressions (cont..d)
• A type expression can be formed by applying the array type
constructor to a number and a type expression
Constructors include:
• Arrays: If T is a type expression, then array(I, T) is a type expression that
denotes an array with element type T and index set I.
– int a[2][3] is an array of 2 arrays of 3 integers.
– In functional style: array(2, array(3, int))
E -> E1[E2] {E.type = if E2.type == integer and
                         E1.type == array(I, T) then
                         T
                       else
                         type_error}
Type checking Expressions (cont..d)
• Product: If s and t are type expressions, then their Cartesian product s x t
is a type expression
• Records: A record is a data structure with named fields. The record type
constructor is applied to a tuple formed from field names and field types.
• E.g.
type row = record
{
address: integer;
lexeme : array[1..15] of char
}
var table: array[1..101] of row;
This declares that the type name row denotes the type expression record ( (address
x integer) x (lexeme x array(1..15, char)) )
The variable table is an array of records of this type.
Type checking Expressions (cont..d)
• Assignment Statements: E may be an arithmetic, logical, or relational
expression
S -> id = E {S.type = if id.type == E.type then
                         void
                       else
                         type_error}
• If Statements:
S -> if E then S1 {S.type = if E.type == Boolean then
                               S1.type
                             else
                               type_error}
Type checking Expressions (cont..d)
• While Statements:
S -> while E do S1 {S.type = if E.type == Boolean then
                                S1.type
                              else
                                type_error}
• Pointers: If T is a type expression, then pointer(T) is a type expression
(i.e. pointer to an object of type T).
E -> *E1 {E.type = if E1.type == pointer(T) then
                      T
                    else
                      type_error}
Type checking Expressions (cont..d)
• Functions: A function maps a domain type D to a range type R. The type of such
a function is denoted by the type expression D -> R.
Mapping: T -> D ‘->’ R {T.type = D.type -> R.type}
Function call: E -> E1(E2) {E.type = if E2.type == T1 and
                                        E1.type == T1 -> T2 then
                                        T2
                                      else
                                        type_error}
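The rules for mod, array indexing, pointer dereference, and function call can be sketched as one recursive checker. The tuple encodings of expressions and types are hypothetical, and treating an undeclared identifier as type_error is an assumption, since the slides only say lookup(id.entry).

```python
TYPE_ERROR = "type_error"


def check(expr, env):
    """Return the type of expr, or TYPE_ERROR; env maps identifiers to types."""
    kind = expr[0]
    if kind == "num":
        return "integer"
    if kind == "id":
        return env.get(expr[1], TYPE_ERROR)
    if kind == "mod":    # E -> E1 mod E2
        t1, t2 = check(expr[1], env), check(expr[2], env)
        return "integer" if t1 == t2 == "integer" else TYPE_ERROR
    if kind == "index":  # E -> E1[E2]
        t1, t2 = check(expr[1], env), check(expr[2], env)
        if t2 == "integer" and isinstance(t1, tuple) and t1[0] == "array":
            return t1[2]  # the element type T of array(I, T)
        return TYPE_ERROR
    if kind == "deref":  # E -> *E1
        t1 = check(expr[1], env)
        if isinstance(t1, tuple) and t1[0] == "pointer":
            return t1[1]
        return TYPE_ERROR
    if kind == "call":   # E -> E1(E2)
        t1, t2 = check(expr[1], env), check(expr[2], env)
        if isinstance(t1, tuple) and t1[0] == "->" and t1[1] == t2:
            return t1[2]  # the range type of the function
        return TYPE_ERROR
    return TYPE_ERROR


env = {
    "a": ("array", 10, "integer"),
    "i": "integer",
    "p": ("pointer", "char"),
    "f": ("->", "integer", "float"),
}
print(check(("index", ("id", "a"), ("id", "i")), env))
print(check(("mod", ("id", "p"), ("num", 3)), env))
```

Each branch computes the types of the sub-expressions first and then applies the corresponding semantic rule, so an error in a sub-expression propagates upward as type_error.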
Type checking rules for coercions
• Implicit type conversions (by the compiler) and explicit type
conversions (by the programmer)
E -> E1 op E2 {E.type = if E1.type == integer and
                           E2.type == integer then integer
                         else if E1.type == integer and
                           E2.type == float then float
                         else if E1.type == float and
                           E2.type == integer then float
                         else if E1.type == float and
                           E2.type == float then float
                         else type_error}
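The four cases of the rule collapse into "take the wider of the two types", which can be sketched as follows. The names `RANK`, `arithmetic_type`, and `widen` are hypothetical, and the "(float)" conversion instruction is an assumed notation for the implicit conversion the compiler inserts.

```python
RANK = {"integer": 0, "float": 1}  # assumed widening order: integer < float


def arithmetic_type(t1, t2):
    """Result type of E1 op E2 under the four rules above."""
    if t1 in RANK and t2 in RANK:
        return max(t1, t2, key=RANK.get)
    return "type_error"


def widen(address, from_type, to_type, code):
    """Emit an implicit conversion if needed; return the address to use."""
    if from_type == to_type:
        return address
    temp = f"t{len(code) + 1}"
    code.append(f"{temp} = (float) {address}")  # integer widened to float
    return temp


# Translate i + f, where i is an integer and f is a float.
code = []
result_type = arithmetic_type("integer", "float")
left = widen("i", "integer", result_type, code)
right = widen("f", "float", result_type, code)
code.append(f"t{len(code) + 1} = {left} + {right}")
print(result_type)
print("\n".join(code))
```

Only the integer operand is converted; the float operand is used as it is, so the addition operates on two values of the same type.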
Exercise
• Write productions and semantic rules for
computing types and finding their widths in
bytes.
Apply SDD – To find size or width of an array
Intermediate Representation for Array Expression
Type Checking
• Type expressions are checked for
– Correct code
– Security aspects
– Efficient code generation
Reference
• A.V. Aho, M.S. Lam, R. Sethi, J. D. Ullman,
Compilers Principles, Techniques and Tools,
Pearson Edition, 2013.
P. Kuppusamy - Lexical Analyzer
