Outline
 Code Generation Issues
 Target language Issues
 Addresses in Target Code
 Basic Blocks and Flow Graphs
 Optimizations of Basic Blocks
 A Simple Code Generator
 Peephole optimization
 Register allocation and assignment
 Instruction selection by tree rewriting
Introduction
 The final phase of a compiler is the code generator
 It receives an intermediate representation (IR) along with
supplementary information in the symbol table
 Produces a semantically equivalent target program
 Code generator main tasks:
 Instruction selection
 Register allocation and assignment
 Instruction ordering
Front end → Code optimizer → Code generator
The complexity of the mapping depends on:
 the level of the IR
 the nature of the instruction-set architecture
 the desired quality of the generated code.
x=y+z
LD R0, y
ADD R0, z
ST x, R0
a=b+c
d=a+e
LD R0, b
ADD R0, c
ST a, R0
LD R0, a
ADD R0, e
ST d, R0
Register allocation
 Two subproblems
 Register allocation: selecting the set of variables that will
reside in registers at each point in the program
 Register assignment: selecting the specific register that a
variable will reside in
 Complications imposed by the hardware architecture
 Example: register pairs for multiplication and division
t=a+b
t=t*c
t=t/d
t=a+b
t=t*c
t=t/d
L R1, a
A R1, b
M R0, c
D R0, d
ST R1, t
L R0, a
A R0, b
M R0, c
SRDA R0, 32
D R0, d
ST R1, t
A simple target machine model
 Load operations: LD r,x and LD r1, r2
 Store operations: ST x,r
 Computation operations: OP dst, src1, src2
 Unconditional jumps: BR L
 Conditional jumps: Bcond r, L like BLTZ r, L
Addressing Modes
 variable name: x
 indexed address: a(r) like LD R1, a(R2) means
R1=contents(a+contents(R2))
 integer indexed by a register: like LD R1, 100(R2)
 Indirect addressing mode: *r and *100(r)
 immediate constant addressing mode: like LD R1,
#100
Basic blocks and flow graphs
 Partition the intermediate code into basic blocks
 The flow of control can only enter the basic block
through the first instruction in the block. That is,
there are no jumps into the middle of the block.
 Control will leave the block without halting or
branching, except possibly at the last instruction
in the block.
 The basic blocks become the nodes of a flow
graph
rules for finding leaders
 The first three-address instruction in the
intermediate code is a leader.
 Any instruction that is the target of a conditional
or unconditional jump is a leader.
 Any instruction that immediately follows a
conditional or unconditional jump is a leader.
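As a minimal sketch (assuming each three-address instruction is an object with is_jump and target fields, names not from the slides), the partitioning into basic blocks can be coded as:

def find_leaders(instrs):
    # instrs: list of instruction objects with .is_jump (conditional or unconditional)
    # and .target (index of the jump target) -- a hypothetical IR layout
    leaders = set()
    if instrs:
        leaders.add(0)                       # rule 1: the first instruction
    for i, ins in enumerate(instrs):
        if ins.is_jump:
            leaders.add(ins.target)          # rule 2: the target of a jump
            if i + 1 < len(instrs):
                leaders.add(i + 1)           # rule 3: the instruction after a jump
    return sorted(leaders)

def basic_blocks(instrs):
    leaders = find_leaders(instrs)
    blocks = []
    for j, start in enumerate(leaders):
        end = leaders[j + 1] if j + 1 < len(leaders) else len(instrs)
        blocks.append(instrs[start:end])     # a block runs from a leader up to the next leader
    return blocks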
Intermediate code to set a 10*10 matrix
to an identity matrix
Flow graph based on Basic Blocks
liveness and next-use information
 We wish to determine for each three-address
statement x=y+z what the next uses of x, y and z are.
 Algorithm:
 Attach to statement i the information currently found in
the symbol table regarding the next use and liveness of
x, y, and z.
 In the symbol table, set x to "not live" and "no next use."
 In the symbol table, set y and z to "live" and the next
uses of y and z to i.
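A minimal Python sketch of this backward scan, assuming each statement is a simple (x, y, z) operand triple (an assumed IR shape, not from the slides):

def next_use_info(block, live_on_exit):
    # block: list of (x, y, z) triples for statements x = y op z
    # live_on_exit: variables assumed live when the block ends
    table = {v: (True, None) for v in live_on_exit}   # variable -> (live?, next-use index)
    info = [None] * len(block)
    for i in range(len(block) - 1, -1, -1):           # scan the block backwards
        x, y, z = block[i]
        # attach to statement i the information currently in the table for x, y and z
        info[i] = {v: table.get(v, (False, None)) for v in (x, y, z)}
        table[x] = (False, None)                      # x: not live, no next use before this point
        table[y] = (True, i)                          # y and z: live, next use is statement i
        table[z] = (True, i)
    return info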
DAG representation of basic
blocks
 There is a node in the DAG for each of the initial
values of the variables appearing in the basic block.
 There is a node N associated with each statement s
within the block. The children of N are those nodes
corresponding to statements that are the last
definitions, prior to s, of the operands used by s.
 Node N is labeled by the operator applied at s, and
also attached to N is the list of variables for which it
is the last definition within the block.
 Certain nodes are designated output nodes. These are
the nodes whose variables are live on exit from the
block.
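A minimal sketch of DAG construction along these lines, assuming (x, op, y, z) quadruples; it shares common subexpressions but omits removal of stale labels on redefinition and the array/pointer kill rules discussed later:

def build_dag(block):
    # block: list of (x, op, y, z) quadruples -- an assumed IR shape
    nodes = {}      # (op, left, right) or ('leaf', v) -> node id, shared for common subexpressions
    current = {}    # variable -> id of the node holding its last definition in the block
    labels = {}     # node id -> variables attached to that node
    def node_for(v):
        if v in current:
            return current[v]
        return nodes.setdefault(('leaf', v), len(nodes))   # node for the initial value of v
    for x, op, y, z in block:
        key = (op, node_for(y), node_for(z))
        n = nodes.setdefault(key, len(nodes))              # reuse an existing node if one matches
        labels.setdefault(n, []).append(x)                 # x becomes a label of this node
        current[x] = n                                     # x's last definition is now node n
    return nodes, current, labels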
Code improving transformations
 We can eliminate local common subexpressions, that
is, instructions that compute a value that has
already been computed.
 We can eliminate dead code, that is, instructions that
compute a value that is never used.
 We can reorder statements that do not depend on
one another; such reordering may reduce the time a
temporary value needs to be preserved in a register.
 We can apply algebraic laws to reorder operands of
three-address instructions, and sometimes thereby
simplify the computation.
DAG for basic block
DAG for basic block
array accesses in a DAG
 An assignment from an array, like x = a[i], is represented
by creating a node with operator =[] and two children
representing the initial value of the array, a0 in this case,
and the index i. Variable x becomes a label of this new
node.
 An assignment to an array, like a[j] = y, is represented by a
new node with operator []= and three children
representing a0, j and y. There is no variable labeling this
node. What is different is that the creation of this node kills
all currently constructed nodes whose value depends on
a0. A node that has been killed cannot receive any more
labels; that is, it cannot become a common subexpression.
DAG for a sequence of array assignments
Rules for reconstructing the basic block
from a DAG
 The order of instructions must respect the order of nodes in the DAG.
That is, we cannot compute a node's value until we have computed a
value for each of its children.
 Assignments to an array must follow all previous assignments to, or
evaluations from, the same array, according to the order of these
instructions in the original basic block.
 Evaluations of array elements must follow any previous (according to
the original block) assignments to the same array. The only
permutation allowed is that two evaluations from the same array may
be done in either order, as long as neither crosses over an assignment
to that array.
 Any use of a variable must follow all previous (according to the original
block) procedure calls or indirect assignments through a pointer.
 Any procedure call or indirect assignment through a pointer must
follow all previous (according to the original block) evaluations of any
variable.
principal uses of registers
 In most machine architectures, some or all of the
operands of an operation must be in registers in
order to perform the operation.
 Registers make good temporaries - places to hold the
result of a subexpression while a larger expression is
being evaluated, or more generally, a place to hold a
variable that is used only within a single basic block.
 Registers are often used to help with run-time
storage management, for example, to manage the
run-time stack, including the maintenance of stack
pointers and possibly the top elements of the stack
itself.
Descriptor data structures
 For each available register, a register descriptor keeps track
of the variable names whose current value is in that
register. Since we shall use only those registers that are
available for local use within a basic block, we assume that
initially, all register descriptors are empty. As the code
generation progresses, each register will hold the value of
zero or more names.
 For each program variable, an address descriptor keeps
track of the location or locations where the current value of
that variable can be found. The location might be a register,
a memory address, a stack location, or some set of more
than one of these. The information can be stored in the
symbol-table entry for that variable name.
Machine Instructions for Operations
 Use getReg(x = y + z) to select registers for x, y,
and z. Call these Rx, Ry and Rz.
 If y is not in Ry (according to the register
descriptor for Ry), then issue an instruction LD
Ry, y', where y' is one of the memory locations for
y (according to the address descriptor for y).
 Similarly, if z is not in Rz, issue an instruction
LD Rz, z', where z' is a location for z.
 Issue the instruction ADD Rx, Ry, Rz.
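A minimal sketch of these steps, assuming the getReg routine and the register/address descriptors described above; the descriptor updates on the next slide would be applied after each emitted instruction:

def gen_add(x, y, z, getReg, reg_desc, addr_desc, emit):
    # reg_desc: register name -> set of variables whose current value it holds
    # addr_desc: variable -> set of locations (registers / 'mem:<name>') holding its value
    Rx, Ry, Rz = getReg(x, y, z)                 # assumed register-choosing routine
    if y not in reg_desc[Ry]:                    # load y only if Ry does not already hold it
        emit('LD %s, %s' % (Ry, y))              # y taken from one of its memory locations
    if z not in reg_desc[Rz]:
        emit('LD %s, %s' % (Rz, z))
    emit('ADD %s, %s, %s' % (Rx, Ry, Rz))        # descriptor updates follow (next sketch)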
Rules for updating the register and address descriptors
 For the instruction LD R, x
 Change the register descriptor for register R so it holds only x.
 Change the address descriptor for x by adding register R as an
additional location.
 For the instruction ST x, R, change the address descriptor for x to
include its own memory location.
 For an operation such as ADD Rx, Ry, Rz implementing a three-address
instruction x = y + z
 Change the register descriptor for Rx so that it holds only x.
 Change the address descriptor for x so that its only location is Rx. Note
that the memory location for x is not now in the address descriptor
for x.
 Remove Rx from the address descriptor of any variable other than x.
 When we process a copy statement x = y, after generating the load for
y into register Ry, if needed, and after managing the descriptors as for all
load statements (per the load rule above):
 Add x to the register descriptor for Ry.
 Change the address descriptor for x so that its only location is Ry .
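The same rules as a sketch, with descriptors kept as Python sets and 'mem:<name>' standing for a variable's own memory location (an assumed encoding):

def update_for_load(R, x, reg_desc, addr_desc):
    reg_desc[R] = {x}                       # LD R, x: R now holds only x
    addr_desc[x].add(R)                     # R becomes an additional location for x

def update_for_store(x, R, addr_desc):
    addr_desc[x].add('mem:' + x)            # ST x, R: x's own memory location is current again

def update_for_op(Rx, x, reg_desc, addr_desc):
    # after e.g. ADD Rx, Ry, Rz implementing x = y + z
    reg_desc[Rx] = {x}                      # Rx holds only x
    addr_desc[x] = {Rx}                     # x's only location is Rx (its memory copy is stale)
    for v, locs in addr_desc.items():
        if v != x:
            locs.discard(Rx)                # Rx no longer holds any variable other than x

def update_for_copy(x, Ry, reg_desc, addr_desc):
    # x = y, after any load of y into Ry and the load-rule updates above
    reg_desc[Ry].add(x)                     # Ry now also holds x
    addr_desc[x] = {Ry}                     # x's only location is Ry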
Instructions generated and the changes in the
register and address descriptors
Rules for picking register Ry for y
 If y is currently in a register, pick a register already
containing y as Ry. Do not issue a machine
instruction to load this register, as none is needed.
 If y is not in a register, but there is a register that is
currently empty, pick one such register as Ry.
 The difficult case occurs when y is not in a register,
and there is no register that is currently empty. We
need to pick one of the allowable registers anyway,
and we need to make it safe to reuse.
Possibilities for value of R
 If the address descriptor for v says that v is somewhere besides
R, then we are OK.
 If v is x, the value being computed by instruction I, and x is not
also one of the other operands of instruction I (z in this
example), then we are OK. The reason is that in this case, we
know this value of x is never again going to be used, so we are
free to ignore it.
 Otherwise, if v is not used later (that is, after the instruction I,
there are no further uses of v, and if v is live on exit from the
block, then v is recomputed within the block), then we are OK.
 If we are not OK by one of the first three cases, then we need to
generate the store instruction ST v, R to place a copy of v in its
own memory location. This operation is called a spill.
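A sketch of this check, simplified so that the third case only covers a v that is plainly dead (the "recomputed later" refinement is omitted); has_later_use and live_on_exit are assumed liveness queries, not names from the slides:

def ensure_safe_to_reuse(R, v, x, operands, addr_desc, has_later_use, live_on_exit, emit):
    # R currently holds v; we want to reuse R for the instruction I: x = y + z
    if len(addr_desc[v]) > 1:                         # v is also available somewhere besides R
        return
    if v == x and v not in operands:                  # the old value of x is about to be overwritten
        return
    if not has_later_use(v) and not live_on_exit(v):  # simplification: v is simply dead from here on
        return
    emit('ST %s, %s' % (v, R))                        # otherwise spill: store v to its own memory location
    addr_desc[v].add('mem:' + v)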
Selection of the register Rx
1. Since a new value of x is being computed, a
register that holds only x is always an
acceptable choice for Rx.
2. If y is not used after instruction I, and Ry holds
only y after being loaded, Ry can also be used
as Rx. A similar option holds regarding z and
Rx.
Possibilities for value of R
 If the address descriptor for v says that v is somewhere besides
R, then we are OK.
 If v is x, the value being computed by instruction I, and x is not
also one of the other operands of instruction I (z in this
example), then we are OK. The reason is that in this case, we
know this value of x is never again going to be used, so we are
free to ignore it.
 Otherwise, if v is not used later (that is, after the instruction I,
there are no further uses of v, and if v is live on exit from the
block, then v is recomputed within the block), then we are OK.
 If we are not OK by one of the first two cases, then we need to
generate the store instruction ST v, R to place a copy of v in its
own memory location. This operation is called a spill.
Register Allocation and Assignment
 Global Register Allocation
 Usage Counts
 Register Assignment for Outer Loops
 Register Allocation by Graph Coloring
Global register allocation
 Previously explained algorithm does local (block
based) register allocation
 This requires that all live variables be stored at the
end of each block
 To save some of these stores and their corresponding
loads, we might arrange to assign registers to
frequently used variables and keep these registers
consistent across block boundaries (globally)
 Some options are:
 Keep values of variables used in loops inside registers
 Use a graph-coloring approach for more global
allocation
Usage counts
 For loops, we can approximate the savings from keeping
a variable x in a register as a sum over all blocks (B) in
the loop (L):
 For each use of x in B before any definition of x in the
block, we add one unit of saving
 If x is live on exit from B and is assigned a value in
B, then we add 2 units of saving (see the formula below)
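Written as a formula (a sketch of the usual approximation, where use(x, B) counts uses of x in B before any definition of x, and live(x, B) is 1 if x is assigned in B and live on exit from B, 0 otherwise), the estimated saving from keeping x in a register throughout L is:

\sum_{B \in L} \left( \mathrm{use}(x, B) + 2 \cdot \mathrm{live}(x, B) \right)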
Flow graph of an inner loop
Code sequence using global register
assignment
Register allocation by Graph
coloring
 Two passes are used
 Target-machine instructions are selected as though
there are an infinite number of symbolic registers
 Assign physical registers to symbolic ones
 Create a register-interference graph
 Nodes are symbolic registers and edges connect two
nodes if one is live at a point where the other is defined.
 For example in the previous example an edge connects
a and d in the graph
 Use a graph coloring algorithm to assign registers.
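A minimal sketch of the coloring pass, using a simple greedy heuristic rather than Chaitin's full simplify/spill scheme (the interference graph is assumed to be given as adjacency sets):

def color_registers(interference, k):
    # interference: symbolic register -> set of symbolic registers it interferes with
    # k: number of physical registers; returns (assignment, spilled)
    assignment, spilled = {}, []
    # color nodes in decreasing-degree order -- a heuristic, not Chaitin's algorithm
    for node in sorted(interference, key=lambda n: -len(interference[n])):
        taken = {assignment[m] for m in interference[node] if m in assignment}
        free = [c for c in range(k) if c not in taken]
        if free:
            assignment[node] = free[0]        # lowest-numbered free physical register
        else:
            spilled.append(node)              # no color available: mark for spilling
    return assignment, spilled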
Intermediate-code tree for a[i]=b+1
Tree-rewriting rules
Syntax-directed translation scheme
An instruction set for tree matching
Ershov Numbers
 Label any leaf 1.
 The label of an interior node with one child is the
label of its child.
 The label of an interior node with two children is
 The larger of the labels of its children, if those
labels are different.
 One plus the label of its children if the labels are
the same.
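A minimal sketch of the labeling, assuming tree nodes with left and right children (None when absent):

def ershov(node):
    # node: expression-tree node with .left and .right -- an assumed shape
    if node.left is None and node.right is None:
        return 1                                       # any leaf is labeled 1
    if node.left is None or node.right is None:
        return ershov(node.left or node.right)         # one child: the node gets its child's label
    l, r = ershov(node.left), ershov(node.right)
    return max(l, r) if l != r else l + 1              # two children: the larger label, or +1 on a tie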
A tree labeled with Ershov numbers
Generating code from a labeled expression tree
 To generate machine code for an interior node with label k and two
children with equal labels (which must be k-1) do the following:
 Recursively generate code for the right child, using base b+1. The result of
the right child appears in register Rb+k-1.
 Recursively generate code for the left child, using base b; the result appears
in Rb+k-2.
 Generate the instruction OP Rb+k-1, Rb+k-2, Rb+k-1, where OP is the appropriate
operation for the interior node in question.
 Suppose we have an interior node with label k and children with unequal
labels. Then one of the children, which we'll call the "big" child, has label k,
and the other child, the "little" child, has some label m < k. Do the following
to generate code for this interior node, using base b:
 Recursively generate code for the big child, using base b; the result appears
in register Rb+k-1.
 Recursively generate code for the little child, using base b; the result
appears in register Rb+m-1. Note that since m < k, neither Rb+k-1 nor any higher-
numbered register is used.
 Generate the instruction OP Rb+k-1, Rb+m-1, Rb+k-1 or the instruction OP Rb+k-1, Rb+k-1,
Rb+m-1, depending on whether the big child is the right or left child,
respectively.
 For a leaf representing operand x, if the base is b, generate the instruction
LD Rb, x.
With this convention, the result of a subtree with label k evaluated with
base b always ends up in register Rb+k-1.
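A minimal recursive sketch of this algorithm, assuming nodes carry op, left, right, name (for leaves) and their Ershov label (field names are assumptions, not from the slides); emit is any function that collects instruction strings:

def gen(node, b, emit):
    # the result of a subtree with label k evaluated with base b lands in R(b+k-1)
    if node.left is None and node.right is None:
        emit('LD R%d, %s' % (b, node.name))            # leaf: load the operand into Rb
        return b
    if node.left.label == node.right.label:            # equal labels, both k-1
        rr = gen(node.right, b + 1, emit)              # right child, base b+1 -> R(b+k-1)
        rl = gen(node.left, b, emit)                   # left child, base b    -> R(b+k-2)
        emit('%s R%d, R%d, R%d' % (node.op, rr, rl, rr))
        return rr
    if node.left.label > node.right.label:             # unequal labels: do the big child first
        big, little = node.left, node.right
    else:
        big, little = node.right, node.left
    rb = gen(big, b, emit)                             # result in R(b+k-1)
    rs = gen(little, b, emit)                          # result in R(b+m-1), leaving R(b+k-1) untouched
    if big is node.right:
        emit('%s R%d, R%d, R%d' % (node.op, rb, rs, rb))
    else:
        emit('%s R%d, R%d, R%d' % (node.op, rb, rb, rs))
    return rb

Calling gen(root, 1, print) with base 1 on a labeled tree of this kind emits code like the optimal three-register sequence shown next.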
Optimal three-register code
Evaluating Expressions with an
Insufficient Supply of Registers
 Node N has at least one child with label r or greater. Pick the larger
child (or either if their labels are the same) to be the "big" child and
let the other child be the "little" child.
 Recursively generate code for the big child, using base b = 1. The
result of this evaluation will appear in register Rr.
 Generate the machine instruction ST tk, Rr, where tk is a temporary
variable used for temporary results used to help evaluate nodes with
label k.
 Generate code for the little child as follows. If the little child has label
r or greater, pick base b=1. If the label of the little child is j<r, then
pick b=r-j. Then recursively apply this algorithm to the little child;
the result appears in Rr.
 Generate the instruction LD Rr-1, tk.
 If the big child is the right child of N, then generate the instruction
OP Rr, Rr, Rr-1. If the big child is the left child, generate OP Rr, Rr-1, Rr.
Optimal three-register code
using only two registers
Dynamic Programming Algorithm
 Compute bottom-up for each node n of the expression tree
T an array C of costs, in which the ith component C[i] is
the optimal cost of computing the subtree S rooted at n
into a register, assuming i registers are available for the
computation, for 1 ≤ i ≤ r.
 Traverse T, using the cost vectors to determine which
subtrees of T must be computed into memory.
 Traverse each tree using the cost vectors and associated
instructions to generate the final target code. The code for
the subtrees computed into memory locations is
generated first.
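A sketch of the first pass only (computing the cost vectors), assuming unit cost per instruction and a binary tree node shape with .left and .right (assumptions; the full algorithm allows arbitrary instruction costs):

import math

def cost_vector(node, r):
    # Fills node.C, where node.C[i] (1 <= i <= r) is the least cost of computing the node
    # into a register with i registers available, and node.mem, the least cost of
    # computing it into memory. Every instruction is assumed to cost one unit.
    if node.left is None:                                  # leaf: an operand already in memory
        node.C = [math.inf] + [1] * r                      # one load, however many registers we have
        node.mem = 0
        return
    cost_vector(node.left, r)
    cost_vector(node.right, r)
    L, R = node.left, node.right
    node.C = [math.inf] * (r + 1)
    for i in range(1, r + 1):
        # both operands in registers; C[0] is infinite, so this option is unusable when i == 1
        in_regs = min(L.C[i] + R.C[i - 1], R.C[i] + L.C[i - 1]) + 1
        # one operand taken from memory (assuming the op can take a memory operand on either side)
        from_mem = min(R.mem + L.C[i], L.mem + R.C[i]) + 1
        node.C[i] = min(in_regs, from_mem)
    node.mem = node.C[r] + 1                               # compute into a register, then store it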
Syntax tree for (a-b)+c*(d/e) with
cost vector at each node
minimum cost of evaluating the
root with two registers available
 Compute the left subtree with two registers available into
register R0, compute the right subtree with one register
available into register R1, and use the instruction ADD R0,
R0, R1 to compute the root. This sequence has cost 2+5+1=8.
 Compute the right subtree with two registers available into
R1, compute the left subtree with one register available
into R0, and use the instruction ADD R0, R0, R1. This
sequence has cost 4+2+1=7.
 Compute the right subtree into memory location M,
compute the left subtree with two registers available into
register R0, and use the instruction ADD R0, R0, M. This
sequence has cost 5+2+1=8.