SlideShare a Scribd company logo
Intermediate-Code Generation
Intermediate-Code Generation
Three-Address Code
• In three-address code, there is at most one operator on the right
side of an instruction
• Thus a source-language expression like x+y*z might be translated
into the sequence of three-address instructions
Common three-address instructions
Automata compiler design ppt for btech students
Automata compiler design ppt for btech students
Common three-address instructions
Data structures for TAC
Quadruples
Data structures for TAC
Quadruples
Data structures for TAC
Quadruples
Data structures for TAC
Triples
• A triple has only three fields, which we call op, arg1, and arg2
• Note that the result field in Quad is used primarily for
temporary names.
• Using triples, we refer to the result of an operation x op y by its
position, rather than by an explicit temporary name.
• Thus, instead of the temporary t, a triple representation would
refer to position (0).
• Parenthesized numbers represent pointers into the triple
structure itself.
Data structures for TAC
Triples
• A benefit of quadruples over triples can be seen in an optimizing
compiler, where instructions are often moved around.
• With quadruples, if we move an instruction that
computes a temporary t, then the instructions that use t
require no change.
• With triples, the result of an operation is referred to by its
position
• So moving an instruction may require us to change all
references
to that result.
Benefit of Quad over Triples
Indirect triples
• Consist of a listing of pointers to triples,
• Rather than a listing of triples themselves.
• For example, use an array instruction to list pointers to triples in
the desired order.
With indirect triples, an optimizing compiler can move an
instruction by
reordering the instruction list, without affecting the triples
themselves.
Static Single-Assignment Form
Two distinctive aspects distinguish SSA from three-address code.
(a) The first is that all assignments in SSA are to variables
with distinct names; hence the term static single-assignment.
Static Single-Assignment Form
Two distinctive aspects distinguish SSA from three-address code.
(a) The first is that all assignments in SSA are to variables
with distinct names; hence the term static single-assignment.
(b)
Static Single-Assignment Form
Major translation classes of Three address
code generation
(a) Declaration statements (+ handling data
type and storage)
(b) Expressions and statements
(a) Control flow statements
Declaration statement
Representing data types: Type Expressions
Types have structure, which we shall represent using type
expressions.
• A type expression is either a basic type (boolean, char,
integer,
float, and void)
or
• is formed by
applying an
operator called
a type
constructor to
a
type
Declaration statement
• The array type int [2] [3] can be read as "array of 2
arrays of 3 integers each"
• Can be represented as a type expression array(2,
array(3, integer)).
• This type is represented by the tree.
• The operator array takes two parameters, a number
and a type.
• Here the type expression can be formed by
applying the array type constructor to a number
and a type expression.
Declaration statement
Example SDD
Type Expressions
• Nonterminal T generates either a basic type or an array type.
• Nonterminal B generates one of the basic types int and float.
• T generates a basic type when C derives €.
• Otherwise, C generates array components consisting of a sequence of
integers, each integer surrounded by brackets.
Declaration statement
Example SDD
Type Expressions
• The nonterminals B and T have a synthesized attribute t
representing a type.
• The nonterminal C has two attributes: an inherited attribute b
and a synthesized attribute t.
Declaration statement
Example SDD
input string int [2][3]
• The nonterminal C has two attributes: an inherited attribute b
and a synthesized attribute t.
• The inherited b attributes pass a basic type down the tree, and
the synthesized t attributes accumulate the result.
Declaration statement: Example SDD
Symbol table
Find the storage for each variable
More on Declaration statement
Data type + Storage layout
• Simplified grammar that
declares just one name at
a time;
• We already explored the
declarations with lists of
names
Storage layout:
• Relative address of all the variables
• From the type of a variable, we can determine the amount of storage that
will be needed for the variable at run time.
• At compile time, we can use these amounts to assign each variable a
relative address.
• The type and relative address are saved in the symbol-table entry for the
variable name.
More on Declaration statement
Data type + Storage layout
• The width of a type is the number of storage units needed for
objects of that type.
• A basic type, such as a character, integer, or float, requires an
integral number of bytes.
• Arrays allocated in one contiguous block of bytes
Computes data types and
their widths for basic
and array types
Type Expressions
The width of an array is obtained by multiplying the
width of an element by the number of elements in the
array.
More on Declaration statement
Data type + Storage layout
Relative address
Name Data type Relative address
Sequences of Declarations
• Relative address: offset
• Keeps track of the next available relative address
• The translation scheme deals with a sequence of declarations of
the form T id, where T generates a data type
• Before the first declaration is considered, offset is set to 0.
• As each new name x is seen, x is entered into the symbol table
with its relative address = current value of offset,
• which is then incremented by the width of the type of x.
int x;
float y;
Sequences of Declarations
• Relative address: offset
• Keeps track of the next available relative address
Record or Structure data type
Record or Structure data type
Record or Structure data type
Homework
int x,
y;
float
p, q
record
{ int x;
float
q;
} a;
char
Write the grammar, SDD
and populate the symbol
table by executing those
rules
Major translation classes of Three address
code generation
(a) Declaration statements (+ handling data
type and storage)
(b) Expressions and statements
(a) Control flow statements
Translation of Expressions
Three-address code for an assignment statement
Attribute code for S
attributes addr and code for an expression E.
Attributes S.code and E.code denote the three-address code for S and E, respectively.
Attribute E.addr denotes the address that will hold the value of E.
Translation of Expressions
Three-address code for an assignment statement
When an expression is a single identifier, say x,
then x itself holds the value of the expression.
The semantic rules for this production define
E.addr to point to the symbol-table entry
Address of the variable
Translation of Expressions
Three-address code for an assignment statement
Translation of Expressions
Three-address code for an assignment statement
Translation of Expressions
Three-address code for an assignment statement
E.code=‘ ’
E.code=‘ ’
E.code=‘ ’
E.code=‘ ’
E.code=
E.code=
S.code=
E.code=‘ ’
Translation of Expressions
Three-address code for an assignment statement
Incremental Translation
• So far, E.Code attributes were long strings
• Generated incrementally
• In incremental translation, generate only the new three-
address instructions
• Past sequence may either be retained in memory for
further processing, or it may be output incrementally.
• In the incremental approach, gen() not only constructs a
three-address instruction,
• it appends the instruction to the sequence of
instructions generated so far.
Incremental Translation
• This translation scheme generates the same code as the
previous syntax directed definition.
• With the incremental approach, the E.code attribute is
not used,
• Since there is a single sequence of instructions
that is
Translation of Array Expressions
Translation of Array Expressions
Let w1 be the length of a row (# of columns) and w2 be the size of
an element in a row.
The relative address of A[i1][i2] can be calculated by the formula
Translate 2D index to 1D index
Translation of Array Expressions
Objective a is a 2×3 array
Translation of Array Expressions
L.addr denotes a temporary
variable containing the offset
of the array elements (array
index)
Translation of Array Expressions
• L.array is a pointer to the
symbol-table entry for the array
name.
• L.array.base indicates base
address of the array -- array
name
• L.type is the type t of the
subarray generated by L.
• For array type t, we assume that
its width is given by t.width.
• For any array type t, t.elem
gives the array element.
Translation of Array Expressions
Standard
Standard
Translation of Array Expressions
Translation of Array Expressions
Parse tree
Translation of Array Expressions
Translation of Array Expressions
a[i]
Translation of Array Expressions
a[i][j]
Translation of Array Expressions
Translation of Array Expressions
Control Flow
Translation of statements such as if-else-statements and while-
statements
Key step: Translation of boolean expressions
Boolean expressions are used as
(ii)
(i) Conditional expressions in statements that alter the flow of
control
A boolean expression can evaluate true Or false as values. Such
boolean expressions can be evaluated in analogy to arithmetic
expressions using three-address instructions with logical
operators
• The intended use of boolean expressions is determined by its
syntactic context.
• We concentrate on the use of boolean expressions to alter the
flow of control.
• For clarity, we introduce a new nonterminal B
Control Flow: Three address codes
Boolean Expressions
Short-Circuit Code
Control Flow
Translation of Flow-of-Control Statements
into three-address code
• Nonterminal B represents a boolean expression and nonterminal S
represents a statement.
• Both B and S have a synthesized attribute code, which gives the
translation into three-address instructions.
• Inherited attributes B.true, B.false, S.next generate labels for
control
flow
• B.true the label to which control flows if B is true,
• B.false, the label to which control flows if B is false.
• With a statement S, we associate an inherited attribute S.next
denoting a label for the instruction immediately after the code
• Nonterminal B represents a boolean expression and nonterminal S
represents a statement.
• Both B and S have a synthesized attribute code, which gives the
translation into three-address instructions.
• Inherited attributes B.true, B.false, S.next generate labels for control
flow
• B.true the label to which control flows if B is true,
• B.false, the label to which control flows if B is false.
• With a statement S, we associate an inherited attribute S.next
denoting a label for the instruction immediately after the code for S.
• newlabel() creates a new label each time it is called,
• label(L) attaches label L to the next three-address instruction to be
generated
• A program consists of a statement generated by P -> S.
• The semantic rules associated with this production initialize S.next to a
new label.
• P.code consists of S.code followed by the new label S.next.
assign in the production S —> assign is a placeholder for
assignment statements.
• In translating S -> if (B) S1, the semantic rules create a new label B.true
and attach it to the first three-address instruction generated for the
statement S1.
• Thus, jumps to B.true within the code for B will go to the code S1.
• By setting B.false to S.next, we ensure that control will skip the code
for S1 if B evaluates to false.
standard
S id=E;
• In translating S -> if (B) S1, the semantic rules create a new label B.true
and attach it to the first three-address instruction generated for the
statement S1.
• Thus, jumps to B.true within the code for B will go to the code S1.
• By setting B.false to S.next, we ensure that control will skip the code
for S1 if B evaluates to false.
Translation of Boolean Expressions
if a>b
x=0 S id=E;
S.next=L1 P.Code
:
S.code
L1: if a>b
x=0
S id=E;
S.next=L1
B.true= L2
B. false=L1
L1:
P.Code
:
B.code
L2:
S1.cod
e
if a>b
x=0
S.next=L1
B.true= L2
B. false=L1
P.Code:
if a>b goto L2
goto L1
L2: S1
L1:…… if a>b
x=0
Translation of Boolean Expressions
B1
B2
B
If B1 is true, then we immediately know that B itself
is true, so B1.true is the same as B.true.
If B1 is false, then B2 must be evaluated, so B1.false
gets the label of the first instruction in the code for
B2
L1
The true and false exits of B2 are the same as the true and false
exits of B, respectively.
L2
Translation of Boolean Expressions
Automata compiler design ppt for btech students
Automata compiler design ppt for btech students
Translation of if-else statement
• In translating the if-else-statement, if B is true , the code for the
boolean expression B has jumps to the first instruction of the code
for S1,
• if B is false, control jumps to the first instruction of the code for S2.
• Further, control flows from both S1 and S2 to the three-address
instruction immediately following the code for S — its label is
given by the inherited attribute S.next.
• An explicit goto S.next appears after the code for S1 to skip over
the code for S2 . No goto is needed after S2 , since S2.next is the
same as S.next.
if a>b
x=0;
else
y=0; L1:
S.next=L1
S id=E;
if a>b
x=0;
else
y=0;
L1:
S.next=L1
S id=E;
B.code=
if a>b goto L2
goto L3
if a>b
x=0;
else
y=0;
S.code=
if a>b goto L2
goto L3
L2: x=0
goto L1
L3: y=0
L1:
S.next=L1
S id=E;
Translation of while statement
• We use a local variable begin to hold a new label attached to the first
instruction for this while-statement,
• which is also the first instruction for B.
• Variable begin is local to the semantic rules for this production.
• The inherited label S.next marks the instruction that control must flow to if B
is false; hence, B.false is set to be S.next.
• A new label B.true is attached to the first instruction for S1;
• the code for B generates a jump to this label if B is true.
• After the code for S1 we place the instruction goto begin, which causes a jump
back to the beginning of the code for the boolean expression.
• Note that S1.next is set to this label begin
Avoiding Redundant Gotos
Free fall to L4 when x>200
Jumps to L1 only when x<=200
The code for statement S1
immediately follows the code for the
boolean expression B
B
L1
L2
S1
L1
if
Jump when false
Avoiding Redundant Gotos
Jump when true
Jump when false
Avoiding Redundant Gotos
B1
B2
L1
L2
• Meaning of label fall for B is different from its
meaning for B1.
• Suppose B.true is fall; i.e, control falls through B, if B
evaluates to true.
• Although B evaluates to true if B1 does, B1.true must
ensure that control jumps over the code for B2 to get
to the next instruction after B.
B
Avoiding Redundant Gotos
Avoiding Redundant Gotos
change
change
Automata compiler design ppt for btech students
S.next=L1
B.true= L2
B. false=L1
P.Code:
if a>b goto L2
goto L1
L2: S1
L1:…… if a>b
x=0
Backpatching: Problem
Backpatching
• Key problem when generating code for boolean expressions and flow-of-
control statements is that
• Matching a jump instruction with the target of the jump.
• For example, the translation of the boolean expression B in if ( B ) S contains
a jump, for when B is false, to the instruction following the code for S.
• In a one-pass translation, B must be translated before S is examined.
• What then is the target of the goto that jumps over the code for S?
B
L1
L2
S
if
Backpatching
• Lists of jumps are passed as synthesized attributes.
• When a jump is generated, the target of the jump is temporarily left
unspecified.
• Each such jump is put on a list of jumps whose labels are to be filled in
when the proper label can be determined.
• All of the jumps on a list have the same target label.
• Synthesized attributes truelist and falselist of nonterminal B are used to
manage labels.
• B.truelist will be a list of jump or conditional jump instructions into
which, we must insert the label to which control goes if B is true.
• B.falselist likewise is the list of instructions that eventually get the label
to which control goes when B is false.
• Code is generated for B, jumps to the true and false exits are left
incomplete, with the label field unfilled.
• These incomplete jumps are placed on lists pointed to by B.truelist and
B.falselist, as appropriate.
• We generate instructions into an instruction array,
• Labels will be indices into this array.
• To manipulate lists of jumps, we use three functions:
1.makelist(i) creates a new list containing an index i into the array of
instructions; makelist returns a pointer to the newly created list.
2. merge(p1,p2) concatenates the lists pointed to by p1 and p2, and
returns
a pointer to the concatenated list.
3. backpatch(p,i) inserts i as the target label for each of the
instructions on
the list pointed to by p.
Backpatching
Backpatching for Boolean Expressions
• We now construct a translation scheme suitable for generating code for
Boolean expressions during bottom-up parsing.
Semantic actions generates two instructions, a conditional goto and an
unconditional goto.
Neither has its target filled in.
These instructions are put on new lists, pointed to by B. truelist and B .falselist
respectively
Backpatching for Boolean Expressions
• We now construct a translation scheme suitable for generating code for
Boolean expressions during bottom-up parsing.
• A marker nonterminal M causes a semantic action to pick up, at
appropriate times, the index of the next instruction to be generated.
• If B1 is true, then B is also true, so the jumps on B1.truelist become part of
B.truelist.
• If B1 is false, however, we must next test B2,
• So the target for the jumps B1.falselist must be the beginning of the code
generated for B2.
• This target is obtained using the marker nonterminal M.
• That nonterminal M produces, as a synthesized attribute M.instr, the index
of the next instruction, just before B2 code starts being generated.
Backpatching for Boolean Expressions
Automata compiler design ppt for btech students
Automata compiler design ppt for btech students
Automata compiler design ppt for btech students
Flow-of-Control Statements
Boolean expressions generated by nonterminal B have two lists of
jumps, B.truelist and B.falselist, corresponding to the true and false
exits from the code for B
Statements generated by nonterminals S and L have a list of unfilled
jumps
S.nextlist is a list of all conditional and unconditional jumps to the
instruction following the code for statement S in execution order.
L.nextlist is defined similarly.
Flow-of-Control Statements
• The two occurrences of the marker nonterminal M record the
instruction numbers of the
• M1--beginning of the code for B (begin) and the
• M2--beginning of the code for S1 (B.true).
Flow-of-Control Statements (while)
• Attribute M.instr points to the number of the next instruction.
• After the body S1 of the while-statement is executed,
• Control flows to the beginning.
• We backpatch S1.nextlist to make all targets on that list be
M1.instr.
• An explicit jump to the beginning of the code for B is appended after
the code for S1
• B. truelist is backpatched to go to the beginning of S1 ‘
• by making jumps on B.truelist go to M2.instr
Flow-of-Control Statements (while)
Flow-of-Control Statements (if-else)
• S1 is an assignment statement, we must
include at the end of the code for S1 a jump
over the code for S2
• N has attribute N.nextlist, which will be a list
consisting of the instruction number of the
jump goto _
Flow-of-Control Statements (if-else)
Flow-of-Control Statements (if-else)
Translation of if-else statement
Intermediate Code for Procedures
• Nonterminals D and T generate function definition and types,
respectively
• A function definition generated by D consists of keyword define, a return
type, the function name, formal parameters in parentheses and a function
body consisting of a statement.
• Nonterminal F generates zero or more formal parameters, where a
formal parameter consists of a type followed by an identifier.
• Nonterminals S and E generate statements and expressions,
respectively. The production for S adds a statement that returns the
value of an expression.
• The production for E adds function calls, with actual parameters
generated by A. An actual parameter is an expression.
Intermediate Code for Procedures
Function calls.
• When generating three-address instructions for a function call id(E, E,... , E),
• It is sufficient to generate the three-address instructions for evaluating or
reducing the parameters E to addresses (E.addr),
• Followed by a param instruction for each parameter.
Intermediate Code for Procedures
Symbol tables.
• Let s be the top symbol table when the function definition is reached.
• The function name is entered into symbol table s for use in the rest of the
program.
• Data type function (return type, parameter type) inserted in symbol table
• The formal parameters of a function can be handled in analogy with field names
in a record
• In the production for D, after seeing define and the function name, we push s
and set up a new symbol table
• Env.push(top)]
• t = new Env();
• The new symbol table, t.
• The new table t is used to translate the function body.
• We revert to the previous symbol table s after the function body is translated.
Record or Structure data type
Type Conversions
• Consider expressions like x + i, where x is of type float and i is of type integer
• Since the representation of integers and floating-point numbers is different
within a computer
• Compiler may need to convert one of the operands of + to ensure that both
operands are of the same type when the addition occurs.
Type Conversions
Type conversion rules
widening conversions, which are intended to preserve information,
• Conversion from one type to another is said to be
implicit if it is done automatically by the compiler.
• Conversion is said to be explicit if the programmer must
write something to cause the conversion.
• Explicit conversions are also called type casts
Automata compiler design ppt for btech students
New semantic action illustrates how type conversions can be added to the
SDT for translating expressions
New semantic action illustrates how type conversions can be added to the
SDT for translating expressions

More Related Content

PPTX
Syntax directed definition and intermediate code generation
PPT
Chapter 6 Intermediate Code Generation
PDF
INTERMEDIATE CODE GENERTION-CD UNIT-3.pdf
PPTX
Unit 3 Compiler Design Regulation 2021.pptx
PPT
Chapter 6 intermediate code generation
PDF
C++ PPT IN NUMERICAL METHOD FOR M.Sc PHYSICS
PPT
458237.-Compiler-Design-Intermediate-code-generation.ppt
PPTX
c unit programming arrays in detail 3.pptx
Syntax directed definition and intermediate code generation
Chapter 6 Intermediate Code Generation
INTERMEDIATE CODE GENERTION-CD UNIT-3.pdf
Unit 3 Compiler Design Regulation 2021.pptx
Chapter 6 intermediate code generation
C++ PPT IN NUMERICAL METHOD FOR M.Sc PHYSICS
458237.-Compiler-Design-Intermediate-code-generation.ppt
c unit programming arrays in detail 3.pptx

Similar to Automata compiler design ppt for btech students (20)

PPT
Data Handling
PDF
THE C++ LECTURE 2 ON DATA STRUCTURES OF C++
PPTX
Lesson 7-computer programming case study-FINAL.pptx
PPS
C programming session 04
PPTX
C tokens.pptx
PPTX
arraytypes of array and pointer string in c++.pptx
PPTX
UNIT - III Compiler.pptx power point presentation
PDF
Intermediate code generation in Compiler Design
PPTX
Learn C LANGUAGE at ASIT
PPTX
Module 3 - Semantic Analysis(Compiler Design).pptx
PPT
C the basic concepts
PPTX
Basic Concepts of C Language.pptx
PDF
Assignment12
PPTX
Three address code In Compiler Design
DOC
Compiler notes--unit-iii
PPTX
C programming basic concepts of mahi.pptx
PPTX
Co&amp;al lecture-08
PPT
Data structure
PPTX
data types in C programming
PDF
cassignmentii-170424105623.pdf
Data Handling
THE C++ LECTURE 2 ON DATA STRUCTURES OF C++
Lesson 7-computer programming case study-FINAL.pptx
C programming session 04
C tokens.pptx
arraytypes of array and pointer string in c++.pptx
UNIT - III Compiler.pptx power point presentation
Intermediate code generation in Compiler Design
Learn C LANGUAGE at ASIT
Module 3 - Semantic Analysis(Compiler Design).pptx
C the basic concepts
Basic Concepts of C Language.pptx
Assignment12
Three address code In Compiler Design
Compiler notes--unit-iii
C programming basic concepts of mahi.pptx
Co&amp;al lecture-08
Data structure
data types in C programming
cassignmentii-170424105623.pdf
Ad

Recently uploaded (20)

PPTX
Learn about numerology and do tarot reading
PPTX
Chapter-7-The-Spiritual-Self-.pptx-First
PDF
My 'novel' Account of Human Possibility pdf.pdf
PPTX
How to Deal with Imposter Syndrome for Personality Development?
PDF
The Power of Pausing Before You React by Meenakshi Khakat
PPTX
Pradeep Kumar Roll no.30 Paper I.pptx....
PPTX
cấu trúc sử dụng mẫu Cause - Effects.pptx
PPT
proper hygiene for teenagers for secondary students .ppt
PDF
Attachment Theory What Childhood Says About Your Relationships.pdf
PDF
The Spotlight Effect No One Is Thinking About You as Much as You Think - by M...
PPTX
diasspresentationndkcnskndncelklkfndc.pptx
PPTX
Personal Development - By Knowing Oneself?
PPTX
Understanding the Self power point presentation
PPT
cypt-cht-healthy-relationships-part1-presentation-v1.1en.ppt
PPTX
Attitudes presentation for psychology.pptx
PDF
Top 10 Visionary Entrepreneurs to Watch in 2025
PPTX
Learn how to prevent Workplace Incidents?
PPTX
SELF ASSESSMENT -SNAPSHOT.pptx an index of yourself by Dr NIKITA SHARMA
PPTX
Learn how to use Portable Grinders Safely
PPTX
PERDEV-LESSON-3 DEVELOPMENTMENTAL STAGES.pptx
Learn about numerology and do tarot reading
Chapter-7-The-Spiritual-Self-.pptx-First
My 'novel' Account of Human Possibility pdf.pdf
How to Deal with Imposter Syndrome for Personality Development?
The Power of Pausing Before You React by Meenakshi Khakat
Pradeep Kumar Roll no.30 Paper I.pptx....
cấu trúc sử dụng mẫu Cause - Effects.pptx
proper hygiene for teenagers for secondary students .ppt
Attachment Theory What Childhood Says About Your Relationships.pdf
The Spotlight Effect No One Is Thinking About You as Much as You Think - by M...
diasspresentationndkcnskndncelklkfndc.pptx
Personal Development - By Knowing Oneself?
Understanding the Self power point presentation
cypt-cht-healthy-relationships-part1-presentation-v1.1en.ppt
Attitudes presentation for psychology.pptx
Top 10 Visionary Entrepreneurs to Watch in 2025
Learn how to prevent Workplace Incidents?
SELF ASSESSMENT -SNAPSHOT.pptx an index of yourself by Dr NIKITA SHARMA
Learn how to use Portable Grinders Safely
PERDEV-LESSON-3 DEVELOPMENTMENTAL STAGES.pptx
Ad

Automata compiler design ppt for btech students

  • 2. Intermediate-Code Generation Three-Address Code • In three-address code, there is at most one operator on the right side of an instruction • Thus a source-language expression like x+y*z might be translated into the sequence of three-address instructions
  • 7. Data structures for TAC Quadruples
  • 8. Data structures for TAC Quadruples
  • 9. Data structures for TAC Quadruples
  • 10. Data structures for TAC Triples • A triple has only three fields, which we call op, arg1, and arg2 • Note that the result field in Quad is used primarily for temporary names. • Using triples, we refer to the result of an operation x op y by its position, rather than by an explicit temporary name. • Thus, instead of the temporary t, a triple representation would refer to position (0). • Parenthesized numbers represent pointers into the triple structure itself.
  • 11. Data structures for TAC Triples
  • 12. • A benefit of quadruples over triples can be seen in an optimizing compiler, where instructions are often moved around. • With quadruples, if we move an instruction that computes a temporary t, then the instructions that use t require no change. • With triples, the result of an operation is referred to by its position • So moving an instruction may require us to change all references to that result. Benefit of Quad over Triples
  • 13. Indirect triples • Consist of a listing of pointers to triples, • Rather than a listing of triples themselves. • For example, use an array instruction to list pointers to triples in the desired order. With indirect triples, an optimizing compiler can move an instruction by reordering the instruction list, without affecting the triples themselves.
  • 14. Static Single-Assignment Form Two distinctive aspects distinguish SSA from three-address code. (a) The first is that all assignments in SSA are to variables with distinct names; hence the term static single-assignment.
  • 15. Static Single-Assignment Form Two distinctive aspects distinguish SSA from three-address code. (a) The first is that all assignments in SSA are to variables with distinct names; hence the term static single-assignment. (b)
  • 17. Major translation classes of Three address code generation (a) Declaration statements (+ handling data type and storage) (b) Expressions and statements (a) Control flow statements
  • 18. Declaration statement Representing data types: Type Expressions Types have structure, which we shall represent using type expressions. • A type expression is either a basic type (boolean, char, integer, float, and void) or • is formed by applying an operator called a type constructor to a type
  • 19. Declaration statement • The array type int [2] [3] can be read as "array of 2 arrays of 3 integers each" • Can be represented as a type expression array(2, array(3, integer)). • This type is represented by the tree. • The operator array takes two parameters, a number and a type. • Here the type expression can be formed by applying the array type constructor to a number and a type expression.
  • 20. Declaration statement Example SDD Type Expressions • Nonterminal T generates either a basic type or an array type. • Nonterminal B generates one of the basic types int and float. • T generates a basic type when C derives €. • Otherwise, C generates array components consisting of a sequence of integers, each integer surrounded by brackets.
  • 21. Declaration statement Example SDD Type Expressions • The nonterminals B and T have a synthesized attribute t representing a type. • The nonterminal C has two attributes: an inherited attribute b and a synthesized attribute t.
  • 22. Declaration statement Example SDD input string int [2][3] • The nonterminal C has two attributes: an inherited attribute b and a synthesized attribute t. • The inherited b attributes pass a basic type down the tree, and the synthesized t attributes accumulate the result.
  • 24. Symbol table Find the storage for each variable
  • 25. More on Declaration statement Data type + Storage layout • Simplified grammar that declares just one name at a time; • We already explored the declarations with lists of names Storage layout: • Relative address of all the variables • From the type of a variable, we can determine the amount of storage that will be needed for the variable at run time. • At compile time, we can use these amounts to assign each variable a relative address. • The type and relative address are saved in the symbol-table entry for the variable name.
  • 26. More on Declaration statement Data type + Storage layout • The width of a type is the number of storage units needed for objects of that type. • A basic type, such as a character, integer, or float, requires an integral number of bytes. • Arrays allocated in one contiguous block of bytes Computes data types and their widths for basic and array types Type Expressions The width of an array is obtained by multiplying the width of an element by the number of elements in the array.
  • 27. More on Declaration statement Data type + Storage layout
  • 28. Relative address Name Data type Relative address
  • 29. Sequences of Declarations • Relative address: offset • Keeps track of the next available relative address • The translation scheme deals with a sequence of declarations of the form T id, where T generates a data type • Before the first declaration is considered, offset is set to 0. • As each new name x is seen, x is entered into the symbol table with its relative address = current value of offset, • which is then incremented by the width of the type of x. int x; float y;
  • 30. Sequences of Declarations • Relative address: offset • Keeps track of the next available relative address
  • 31. Record or Structure data type
  • 32. Record or Structure data type
  • 33. Record or Structure data type
  • 34. Homework int x, y; float p, q record { int x; float q; } a; char Write the grammar, SDD and populate the symbol table by executing those rules
  • 35. Major translation classes of Three address code generation (a) Declaration statements (+ handling data type and storage) (b) Expressions and statements (a) Control flow statements
  • 36. Translation of Expressions Three-address code for an assignment statement Attribute code for S attributes addr and code for an expression E. Attributes S.code and E.code denote the three-address code for S and E, respectively. Attribute E.addr denotes the address that will hold the value of E.
  • 37. Translation of Expressions Three-address code for an assignment statement When an expression is a single identifier, say x, then x itself holds the value of the expression. The semantic rules for this production define E.addr to point to the symbol-table entry Address of the variable
  • 38. Translation of Expressions Three-address code for an assignment statement
  • 39. Translation of Expressions Three-address code for an assignment statement
  • 40. Translation of Expressions Three-address code for an assignment statement
  • 45. Translation of Expressions Three-address code for an assignment statement
  • 46. Incremental Translation • So far, E.Code attributes were long strings • Generated incrementally • In incremental translation, generate only the new three- address instructions • Past sequence may either be retained in memory for further processing, or it may be output incrementally. • In the incremental approach, gen() not only constructs a three-address instruction, • it appends the instruction to the sequence of instructions generated so far.
  • 47. Incremental Translation • This translation scheme generates the same code as the previous syntax directed definition. • With the incremental approach, the E.code attribute is not used, • Since there is a single sequence of instructions that is
  • 48. Translation of Array Expressions
  • 49. Translation of Array Expressions Let w1 be the length of a row (# of columns) and w2 be the size of an element in a row. The relative address of A[i1][i2] can be calculated by the formula Translate 2D index to 1D index
  • 50. Translation of Array Expressions Objective a is a 2×3 array
  • 51. Translation of Array Expressions L.addr denotes a temporary variable containing the offset of the array elements (array index)
  • 52. Translation of Array Expressions • L.array is a pointer to the symbol-table entry for the array name. • L.array.base indicates base address of the array -- array name • L.type is the type t of the subarray generated by L. • For array type t, we assume that its width is given by t.width. • For any array type t, t.elem gives the array element.
  • 53. Translation of Array Expressions Standard Standard
  • 54. Translation of Array Expressions
  • 55. Translation of Array Expressions Parse tree
  • 56. Translation of Array Expressions
  • 57. Translation of Array Expressions a[i]
  • 58. Translation of Array Expressions a[i][j]
  • 59. Translation of Array Expressions
  • 60. Translation of Array Expressions
  • 61. Control Flow Translation of statements such as if-else-statements and while- statements Key step: Translation of boolean expressions Boolean expressions are used as (ii) (i) Conditional expressions in statements that alter the flow of control A boolean expression can evaluate true Or false as values. Such boolean expressions can be evaluated in analogy to arithmetic expressions using three-address instructions with logical operators • The intended use of boolean expressions is determined by its syntactic context. • We concentrate on the use of boolean expressions to alter the flow of control. • For clarity, we introduce a new nonterminal B
  • 62. Control Flow: Three address codes
  • 65. Translation of Flow-of-Control Statements into three-address code • Nonterminal B represents a boolean expression and nonterminal S represents a statement. • Both B and S have a synthesized attribute code, which gives the translation into three-address instructions. • Inherited attributes B.true, B.false, S.next generate labels for control flow • B.true the label to which control flows if B is true, • B.false, the label to which control flows if B is false. • With a statement S, we associate an inherited attribute S.next denoting a label for the instruction immediately after the code
  • 66. • Nonterminal B represents a boolean expression and nonterminal S represents a statement. • Both B and S have a synthesized attribute code, which gives the translation into three-address instructions. • Inherited attributes B.true, B.false, S.next generate labels for control flow • B.true the label to which control flows if B is true, • B.false, the label to which control flows if B is false. • With a statement S, we associate an inherited attribute S.next denoting a label for the instruction immediately after the code for S.
  • 67. • newlabel() creates a new label each time it is called, • label(L) attaches label L to the next three-address instruction to be generated • A program consists of a statement generated by P -> S. • The semantic rules associated with this production initialize S.next to a new label. • P.code consists of S.code followed by the new label S.next.
  • 68. assign in the production S —> assign is a placeholder for assignment statements. • In translating S -> if (B) S1, the semantic rules create a new label B.true and attach it to the first three-address instruction generated for the statement S1. • Thus, jumps to B.true within the code for B will go to the code S1. • By setting B.false to S.next, we ensure that control will skip the code for S1 if B evaluates to false. standard S id=E;
  • 69. • In translating S -> if (B) S1, the semantic rules create a new label B.true and attach it to the first three-address instruction generated for the statement S1. • Thus, jumps to B.true within the code for B will go to the code S1. • By setting B.false to S.next, we ensure that control will skip the code for S1 if B evaluates to false.
  • 70. Translation of Boolean Expressions if a>b x=0 S id=E;
  • 71. S.next=L1 P.Code : S.code L1: if a>b x=0 S id=E;
  • 73. S.next=L1 B.true= L2 B. false=L1 P.Code: if a>b goto L2 goto L1 L2: S1 L1:…… if a>b x=0
  • 74. Translation of Boolean Expressions B1 B2 B If B1 is true, then we immediately know that B itself is true, so B1.true is the same as B.true. If B1 is false, then B2 must be evaluated, so B1.false gets the label of the first instruction in the code for B2 L1 The true and false exits of B2 are the same as the true and false exits of B, respectively. L2
  • 75. Translation of Boolean Expressions
  • 78. Translation of if-else statement • In translating the if-else-statement, if B is true , the code for the boolean expression B has jumps to the first instruction of the code for S1, • if B is false, control jumps to the first instruction of the code for S2. • Further, control flows from both S1 and S2 to the three-address instruction immediately following the code for S — its label is given by the inherited attribute S.next. • An explicit goto S.next appears after the code for S1 to skip over the code for S2 . No goto is needed after S2 , since S2.next is the same as S.next.
  • 81. if a>b x=0; else y=0; S.code= if a>b goto L2 goto L3 L2: x=0 goto L1 L3: y=0 L1: S.next=L1 S id=E;
  • 82. Translation of while statement • We use a local variable begin to hold a new label attached to the first instruction for this while-statement, • which is also the first instruction for B. • Variable begin is local to the semantic rules for this production. • The inherited label S.next marks the instruction that control must flow to if B is false; hence, B.false is set to be S.next. • A new label B.true is attached to the first instruction for S1; • the code for B generates a jump to this label if B is true. • After the code for S1 we place the instruction goto begin, which causes a jump back to the beginning of the code for the boolean expression. • Note that S1.next is set to this label begin
  • 83. Avoiding Redundant Gotos Free fall to L4 when x>200 Jumps to L1 only when x<=200 The code for statement S1 immediately follows the code for the boolean expression B B L1 L2 S1 L1 if Jump when false
  • 84. Avoiding Redundant Gotos Jump when true Jump when false
  • 85. Avoiding Redundant Gotos B1 B2 L1 L2 • Meaning of label fall for B is different from its meaning for B1. • Suppose B.true is fall; i.e, control falls through B, if B evaluates to true. • Although B evaluates to true if B1 does, B1.true must ensure that control jumps over the code for B2 to get to the next instruction after B. B
  • 89. S.next=L1 B.true= L2 B. false=L1 P.Code: if a>b goto L2 goto L1 L2: S1 L1:…… if a>b x=0 Backpatching: Problem
  • 90. Backpatching • Key problem when generating code for boolean expressions and flow-of- control statements is that • Matching a jump instruction with the target of the jump. • For example, the translation of the boolean expression B in if ( B ) S contains a jump, for when B is false, to the instruction following the code for S. • In a one-pass translation, B must be translated before S is examined. • What then is the target of the goto that jumps over the code for S? B L1 L2 S if
  • 91. Backpatching • Lists of jumps are passed as synthesized attributes. • When a jump is generated, the target of the jump is temporarily left unspecified. • Each such jump is put on a list of jumps whose labels are to be filled in when the proper label can be determined. • All of the jumps on a list have the same target label. • Synthesized attributes truelist and falselist of nonterminal B are used to manage labels. • B.truelist will be a list of jump or conditional jump instructions into which, we must insert the label to which control goes if B is true. • B.falselist likewise is the list of instructions that eventually get the label to which control goes when B is false. • Code is generated for B, jumps to the true and false exits are left incomplete, with the label field unfilled. • These incomplete jumps are placed on lists pointed to by B.truelist and B.falselist, as appropriate.
  • 92. • We generate instructions into an instruction array, • Labels will be indices into this array. • To manipulate lists of jumps, we use three functions: 1.makelist(i) creates a new list containing an index i into the array of instructions; makelist returns a pointer to the newly created list. 2. merge(p1,p2) concatenates the lists pointed to by p1 and p2, and returns a pointer to the concatenated list. 3. backpatch(p,i) inserts i as the target label for each of the instructions on the list pointed to by p. Backpatching
  • 93. Backpatching for Boolean Expressions • We now construct a translation scheme suitable for generating code for Boolean expressions during bottom-up parsing. Semantic actions generates two instructions, a conditional goto and an unconditional goto. Neither has its target filled in. These instructions are put on new lists, pointed to by B. truelist and B .falselist respectively
  • 94. Backpatching for Boolean Expressions • We now construct a translation scheme suitable for generating code for Boolean expressions during bottom-up parsing. • A marker nonterminal M causes a semantic action to pick up, at appropriate times, the index of the next instruction to be generated. • If B1 is true, then B is also true, so the jumps on B1.truelist become part of B.truelist. • If B1 is false, however, we must next test B2, • So the target for the jumps B1.falselist must be the beginning of the code generated for B2. • This target is obtained using the marker nonterminal M. • That nonterminal M produces, as a synthesized attribute M.instr, the index of the next instruction, just before B2 code starts being generated.
  • 99. Flow-of-Control Statements Boolean expressions generated by nonterminal B have two lists of jumps, B.truelist and B.falselist, corresponding to the true and false exits from the code for B Statements generated by nonterminals S and L have a list of unfilled jumps S.nextlist is a list of all conditional and unconditional jumps to the instruction following the code for statement S in execution order. L.nextlist is defined similarly.
  • 101. • The two occurrences of the marker nonterminal M record the instruction numbers of the • M1--beginning of the code for B (begin) and the • M2--beginning of the code for S1 (B.true). Flow-of-Control Statements (while)
  • 102. • Attribute M.instr points to the number of the next instruction. • After the body S1 of the while-statement is executed, • Control flows to the beginning. • We backpatch S1.nextlist to make all targets on that list be M1.instr. • An explicit jump to the beginning of the code for B is appended after the code for S1 • B. truelist is backpatched to go to the beginning of S1 ‘ • by making jumps on B.truelist go to M2.instr Flow-of-Control Statements (while)
  • 104. • S1 is an assignment statement, we must include at the end of the code for S1 a jump over the code for S2 • N has attribute N.nextlist, which will be a list consisting of the instruction number of the jump goto _ Flow-of-Control Statements (if-else)
  • 107. Intermediate Code for Procedures • Nonterminals D and T generate function definition and types, respectively • A function definition generated by D consists of keyword define, a return type, the function name, formal parameters in parentheses and a function body consisting of a statement. • Nonterminal F generates zero or more formal parameters, where a formal parameter consists of a type followed by an identifier. • Nonterminals S and E generate statements and expressions, respectively. The production for S adds a statement that returns the value of an expression. • The production for E adds function calls, with actual parameters generated by A. An actual parameter is an expression.
  • 108. Intermediate Code for Procedures Function calls. • When generating three-address instructions for a function call id(E, E,... , E), • It is sufficient to generate the three-address instructions for evaluating or reducing the parameters E to addresses (E.addr), • Followed by a param instruction for each parameter.
  • 109. Intermediate Code for Procedures Symbol tables. • Let s be the top symbol table when the function definition is reached. • The function name is entered into symbol table s for use in the rest of the program. • Data type function (return type, parameter type) inserted in symbol table • The formal parameters of a function can be handled in analogy with field names in a record • In the production for D, after seeing define and the function name, we push s and set up a new symbol table • Env.push(top)] • t = new Env(); • The new symbol table, t. • The new table t is used to translate the function body. • We revert to the previous symbol table s after the function body is translated.
  • 110. Record or Structure data type
  • 111. Type Conversions • Consider expressions like x + i, where x is of type float and i is of type integer • Since the representation of integers and floating-point numbers is different within a computer • Compiler may need to convert one of the operands of + to ensure that both operands are of the same type when the addition occurs.
  • 112. Type Conversions Type conversion rules widening conversions, which are intended to preserve information, • Conversion from one type to another is said to be implicit if it is done automatically by the compiler. • Conversion is said to be explicit if the programmer must write something to cause the conversion. • Explicit conversions are also called type casts
  • 114. New semantic action illustrates how type conversions can be added to the SDT for translating expressions
  • 115. New semantic action illustrates how type conversions can be added to the SDT for translating expressions