DEFINITION OF PARSING
A parser is a compiler or interpreter component that breaks
data into smaller elements for easy translation into another
language.
A parser takes input in the form of a sequence of tokens or
program instructions and usually builds a data structure in
the form of a parse tree or an abstract syntax tree.
ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 1
ROLE OF PARSER
• In the compiler model, the parser obtains a string of tokens from the
lexical analyzer,
• and verifies that the string can be generated by the grammar for the
source language.
• The parser reports any syntax errors in the source language.
• It collects a sufficient number of tokens and builds a parse tree.
• There are basically two types of parser:
• Top-down parser:
• starts at the root of the derivation tree and fills in
• picks a production and tries to match the input
• may require backtracking
• some grammars are backtrack-free (predictive)
• Bottom-up parser:
• starts at the leaves and fills in
• starts in a state valid for legal first tokens
• uses a stack to store both state and sentential forms
TOP DOWN PARSING
• A top-down parser starts with the root of the parse tree, labeled with
the start or goal symbol of the grammar.
• To build a parse, it repeats the following steps until the fringe of the
parse tree matches the input string:
• STEP1: At a node labeled A, select a production A → α and construct the appropriate child for each
symbol of α
• STEP2: When a terminal is added to the fringe that doesn’t match the input string, backtrack
• STEP3: Find the next node to be expanded.
• The key is selecting the right production in step 1
EXAMPLE FOR TOP DOWN PARSING
• Suppose the given production rules are as follows:
• S → aAd | aB
• A → b | c
• B → ccd
• (For an input string such as accd, the parser first tries S → aAd,
backtracks on the mismatch, and then succeeds with S → aB.)
PROBLEMS WITH TOPDOWN PARSING
1) BACKTRACKING
 Backtracking is a technique in which, to expand a non-terminal
symbol, we choose one alternative, and if some mismatch occurs then
we try another alternative, if any.
If a non-terminal has multiple production rules beginning
with the same input symbol, then to get the correct derivation we
need to try all of these alternatives.
EXAMPLE OF BACKTRACKING
• Suppose the given production rules are as follows:
• S → cAd
• A → a | ab
2) LEFT RECURSION
Left recursion is the case in which the left-most symbol in the body of a
production for a non-terminal is that non-terminal itself (direct left
recursion), or the non-terminal rewrites to itself again through some
other non-terminal definitions (indirect left recursion). Consider
these examples -
(1) A -> Aq (direct)
(2) A -> Bq
B -> Ar (indirect)
Left recursion has to be removed if the parser performs top-down
parsing
REMOVING LEFT RECURSION
• To eliminate left recursion we need to modify the grammar. Let G be
a grammar having a production rule with left recursion:
• A → Aa
• A → B
• Thus, we eliminate left recursion by rewriting the production rules as:
• A → BA'
• A' → aA'
• A' → ε
3) LEFT FACTORING
Left factoring is removing the common left factor that appears in
two productions of the same non-terminal. It is done to avoid
backtracking by the parser. Suppose the parser has a one-symbol
look-ahead; consider this example -
A → qB | qC
where A, B, C are non-terminals and q is a common prefix of grammar
symbols. In this case, the parser cannot tell which of the two
productions to choose and it might have to backtrack. After left
factoring, the grammar is converted to -
A → qD
D → B | C
RECURSIVE DESCENT PARSING
• A recursive descent parser is a kind of top-down parser built from a
set of mutually recursive procedures (or a non-recursive equivalent)
where each such procedure usually implements one of the
productions of the grammar.
EXAMPLE OF RECURSIVE DESCENT PARSING
Suppose the grammar given is as follows:
E → iE'
E' → +iE' | ε
Program (E' is renamed Eprime to keep the identifier valid C; the
global l holds the current look-ahead character and is initialized
with l = getchar(); before parsing begins):
E()
{
if(l=='i')
{
match('i');
Eprime();
}
}
Eprime()
{
if(l=='+')
{
match('+');
match('i');
Eprime();
}
else
return;
}
match(char t)
{
if(l==t)
l=getchar();
else
printf("Error");
}
main()
{
l = getchar();
E();
if(l=='$')
{
printf("parsing successful");
}
}
PREDICTIVE LL(1) PARSING
• The first “L” in LL(1) refers to the fact that the input is processed from
left to right.
• The second “L” refers to the fact that LL(1) parsing determines a
leftmost derivation for the input string.
• The “1” in parentheses implies that LL(1) parsing uses only one
symbol of input to predict the next grammar rule that should be used.
• The data structures used by an LL(1) parser are: 1. Input buffer 2. Stack
3. Parsing table
• The construction of predictive LL(1) parser is based on two
very important functions and those are First and Follow.
• For construction of predictive LL(1) parser we have to follow
the following steps:
• STEP1: Compute the FIRST and FOLLOW functions.
• STEP2: Construct the predictive parsing table using the FIRST and FOLLOW functions.
• STEP3: Parse the input string with the help of the predictive parsing table.
FIRST
If X is a terminal then FIRST(X) is just {X}.
If there is a production X → ε then add ε to FIRST(X).
If there is a production X → Y1Y2..Yk then add FIRST(Y1Y2..Yk)
to FIRST(X).
FIRST(Y1Y2..Yk) is either:
FIRST(Y1), if FIRST(Y1) doesn't contain ε;
OR, if FIRST(Y1) does contain ε, then FIRST(Y1Y2..Yk) is everything in FIRST(Y1)
except for ε, as well as everything in FIRST(Y2..Yk).
If FIRST(Y1), FIRST(Y2), .., FIRST(Yk) all contain ε then add ε to FIRST(Y1Y2..Yk) as well.
FOLLOW
• First put $ (the end-of-input marker) in FOLLOW(S) (S is the
start symbol).
• If there is a production A → aBb (where a and b can be whole
strings), then everything in FIRST(b) except for ε is placed in
FOLLOW(B).
• If there is a production A → aB, then everything in
FOLLOW(A) is in FOLLOW(B).
• If there is a production A → aBb, where FIRST(b) contains
ε, then everything in FOLLOW(A) is in FOLLOW(B).
EXAMPLE OF FIRST AND FOLLOW
The Grammar
E → TE'
E' → +TE'
E' → ε
T → FT'
T' → *FT'
T' → ε
F → (E)
F → id
PROPERTIES OF LL(1) GRAMMARS
1. No left-recursive grammar is LL(1)
2. No ambiguous grammar is LL(1)
3. Some languages have no LL(1) grammar
4. A ε–free grammar where each alternative expansion for A begins with a
distinct terminal is a simple LL(1) grammar.
Example:
S  aS  a
is not LL(1) because FIRST(aS) = FIRST(a) = { a }
S  aS´
S´  aS  ε
accepts the same language and is LL(1)
ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 22
PREDICTIVE PARSING TABLE
Method:
1. For each production A → α:
a) For each terminal a ∈ FIRST(α), add A → α to M[A,a]
b) If ε ∈ FIRST(α):
I. For each b ∈ FOLLOW(A), add A → α to M[A,b]
II. If $ ∈ FOLLOW(A), add A → α to M[A,$]
2. Set each undefined entry of M to error.
If any M[A,a] has multiple entries then G is not LL(1).
EXAMPLE OF PREDICTIVE PARSING LL(1)
TABLE
The given grammar is as follows
S  E
E  TE´
E´  +E  —E  ε
T  FT´
T´  * T  / T  ε
F  num  id
ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 24
BOTTOM UP PARSING
Bottom-up parsing starts from the leaf nodes of a tree and
works upward until it reaches the root node.
We start from a sentence and then apply production rules in
reverse in order to reach the start symbol.
Here, the parser tries to identify the R.H.S. of a production rule and
replace it with the corresponding L.H.S. This activity is known as
reduction.
Such a parser is also known as an LR parser, where L means tokens
are read from left to right and R means that it constructs a
rightmost derivation (in reverse).
EXAMPLE OF BOTTOM-UP PARSER
E → T + E | T
T → int * T | int | (E)
 Consider the string: int * int + int
int * int + int T → int
int * T + int T → int * T
T + int T → int
T + T E → T
T + E E → T + E
E
SHIFT REDUCE PARSING
• Bottom-up parsing uses two kinds of actions: 1.Shift 2.Reduce
• Shift: Move | one place to the right; this shifts a terminal onto the left string
ABC|xyz ⇒ ABCx|yz
• Reduce: Apply an inverse production at the right end of the left string
If A → xy is a production, then Cbxy|ijk ⇒ CbA|ijk
EXAMPLE OF SHIFT REDUCE PARSING
|int * int + int shift
int | * int + int shift
int * | int + int shift
int * int | + int reduce T → int
int * T | + int reduce T → int * T
T | + int shift
T + | int shift
T + int | reduce T → int
T + T | reduce E → T
T + E | reduce E → T + E
E |
OPERATOR PRECEDENCE PARSING
Operator grammars have the property that no production right side
is empty or has two adjacent nonterminals.
 This property enables the implementation of efficient operator-precedence
parsers.
These parsers rely on the following three precedence relations:
Relation Meaning
a <· b a yields precedence to b
a =· b a has the same precedence as b
a ·> b a takes precedence over b
• These operator precedence relations allow us to delimit the handles in
the right sentential forms: <· marks the left end, =· appears in
the interior of the handle, and ·> marks the right end.
• Suppose that $ marks each end of the string. Then for all terminals b
we can write: $ <· b and b ·> $
• If we remove all nonterminals and place the correct precedence
relation (<·, =·, ·>) between the remaining terminals, there remain
strings that can be analyzed by an easily developed parser.
EXAMPLE OF OPERATOR PRECEDENCE
PARSING
For example, the following operator precedence relations can
be introduced for simple expressions:

      id    +     *     $
id          ·>    ·>    ·>
+     <·    ·>    <·    ·>
*     <·    ·>    ·>    ·>
$     <·    <·    <·

Example: The input string:
id1 + id2 * id3
after inserting precedence relations becomes
$ <· id1 ·> + <· id2 ·> * <· id3 ·> $
UNIT-III
Syntax Directed Translations
Production: E → E1 + T
Semantic Rule: E.code = E1.code || T.code || '+'
• We may alternatively insert the semantic actions inside the grammar:
E → E1 + T {print '+'}
• We can associate information with a language construct by attaching
attributes to the grammar symbols.
• A syntax directed definition specifies the values of attributes by
associating semantic rules with the grammar productions.
Syntax Directed Definitions
1. We associate information with the programming language
constructs by attaching attributes to grammar symbols.
2. Values of these attributes are evaluated by the semantic rules
associated with the production rules.
3. Evaluation of these semantic rules:
• may generate intermediate codes
• may put information into the symbol table
• may perform type checking, may issue error messages
• may perform some other activities
• in fact, they may perform almost any activities.
4. An attribute may hold almost anything:
• a string, a number, a memory location, a complex record.
Syntax-Directed Definitions and Translation Schemes
1. When we associate semantic rules with productions, we
use two notations:
• Syntax-Directed Definitions
• Translation Schemes
A. Syntax-Directed Definitions:
• give high-level specifications for translations
• hide many implementation details such as order of evaluation of semantic actions.
• We associate a production rule with a set of semantic actions, and we do not say
when they will be evaluated.
B. Translation Schemes:
• indicate the order of evaluation of semantic actions associated with a production
rule.
• In other words, translation schemes give some information about
implementation details.
Syntax-Directed Translation
• Conceptually, with both syntax-directed definitions and
translation schemes we:
• Parse the input token stream
• Build the parse tree
• Traverse the tree to evaluate the semantic rules at the parse tree nodes.
Input string → parse tree → dependency graph → evaluation
order for semantic rules
Conceptual view of syntax-directed translation
Syntax-Directed Definitions
1. A syntax-directed definition is a generalization of a context-free
grammar in which:
• Each grammar symbol is associated with a set of attributes.
• This set of attributes for a grammar symbol is partitioned into two subsets called
• synthesized and
• inherited attributes of that grammar symbol.
2. The value of an attribute at a parse tree node is defined by the semantic rule associated with
a production at that node.
3. The value of a synthesized attribute at a node is computed from the values of attributes at
the children of that node in the parse tree.
4. The value of an inherited attribute at a node is computed from the values of attributes at
the siblings and parent of that node in the parse tree.
Syntax-Directed Definitions
Examples:
Synthesized attribute : E→E1+E2 { E.val =E1.val + E2.val}
Inherited attribute :A→XYZ {Y.val = 2 * A.val}
1. Semantic rules set up dependencies between attributes which can
be represented by a dependency graph.
2. This dependency graph determines the evaluation order of these
semantic rules.
3. Evaluation of a semantic rule defines the value of an attribute. But a
semantic rule may also have some side effects such as printing a
value.
Syntax Trees
Syntax-Tree
• an intermediate representation of the compiler’s input.
• A condensed form of the parse tree.
• Syntax tree shows the syntactic structure of the program while omitting
irrelevant details.
• Operators and keywords are associated with the interior nodes.
• Chains of simple productions are collapsed.
Syntax directed translation can be based on syntax tree as well as
parse tree.
Syntax Tree-Examples
Expression: 5 + 3 * 4 (a + node with children 5 and *, where * has
children 3 and 4)
• Leaves: identifiers or constants
• Internal nodes: labelled with operations
• Children of a node are its operands
Statement: if B then S1 else S2 (an if-then-else node with children
B, S1 and S2)
Node's label indicates what kind of a statement it is
Children of a node correspond to the components of the statement
Intermediate representation and code generation
Two possibilities:
1. ..... → semantic routines → code generation → Machine code
(+) no extra pass for code generation
(+) allows simple 1-pass compilation
2. ..... → semantic routines → IR → code generation → Machine code
(+) allows higher-level operations e.g. open block, call procedures.
(+) better optimization because IR is at a higher level.
(+) machine dependence is isolated in code generation.
Three address code
• In three-address code there is at most one operator on the
right side of an instruction.
• Example: for the expression a + a * (b – c) + (b – c) * d, the
syntax tree yields the three-address code:
t1 = b – c
t2 = a * t1
t3 = a + t2
t4 = t1 * d
t5 = t3 + t4
Forms of three address instructions
• x = y op z
• x = op y
• x = y
• goto L
• if x goto L and ifFalse x goto L
• if x relop y goto L
• Procedure calls using:
• param x
• call p,n
• y = call p,n
• x = y[i] and x[i] = y
• x = &y and x = *y and *x =y
Example
• do i = i+1; while (a[i] < v);
With symbolic labels:
L: t1 = i + 1
i = t1
t2 = i * 8
t3 = a[t2]
if t3 < v goto L
With position numbers:
100: t1 = i + 1
101: i = t1
102: t2 = i * 8
103: t3 = a[t2]
104: if t3 < v goto 100
Data structures for three address codes
• Quadruples
• Has four fields: op, arg1, arg2 and result
• Triples
• Temporaries are not used and instead references to instructions are made
• Indirect triples
• In addition to triples we use a list of pointers to triples
Example
• a = b * minus c + b * minus c

Three-address code:
t1 = minus c
t2 = b * t1
t3 = minus c
t4 = b * t3
t5 = t2 + t4
a = t5

Quadruples:
      Op     Arg1  Arg2  Result
(0)   minus  c           t1
(1)   *      b     t1    t2
(2)   minus  c           t3
(3)   *      b     t3    t4
(4)   +      t2    t4    t5
(5)   =      t5          a

Triples:
      Op     Arg1  Arg2
(0)   minus  c
(1)   *      b     (0)
(2)   minus  c
(3)   *      b     (2)
(4)   +      (1)   (3)
(5)   =      a     (4)

Indirect triples (a list of pointers into the triple table):
35    (0)
36    (1)
37    (2)
38    (3)
39    (4)
40    (5)
Intermediate representation and code generation
IR: good for optimization and portability
Machine Code: simple
Intermediate code
1. postfix form
Example
a+b → ab+
(a+b)*c → ab+c*
a+b*c → abc*+
a:=b*c+b*d → abc*bd*+:=
(+) simple and concise
(+) good for driving an interpreter
(- ) Not good for optimization or code generation
INTERMEDIATE CODE
2. 3-addr code
 Triple: op arg1 arg2
 Quadruple: op arg1 arg2 result
Triples are more concise, but what if instructions are deleted,
moved or added during optimization? (Indirect triples address this.)
 Triples and quadruples are more similar to machine code.
INTERMEDIATE CODE
More detailed 3-addr code
 Add type information
Example: a := b*c + b*d
Suppose b, c are integer type and d is float type.
(1) ( I* b c ) → (I* b c t1)
(2) (FLOAT b _ ) → (FLOAT b t2)
(3) ( F* (2) d ) → (F* t2 d t3)
(4) (FLOAT (1) _ ) → (FLOAT t1 t4)
(5) ( F+ (4) (3) ) → (F+ t4 t3 t5)
(6) ( := (5) a ) → (:= t5 a)
PARSE TREES
Parsing:
build the parse tree.
Non-terminals for operator precedence
and associativity are included.
Parse tree (for an assignment of the form id := id + id * const):
<target> := <exp>
<exp> ⇒ <exp> + <term>
<term> ⇒ <term> * <factor>
<factor> ⇒ id | const
PARSE TREE
Source program → Lexical Analyzer → (token / getNextToken) →
Parser → Parse tree → Rest of Front End → Intermediate
representation
Both the lexical analyzer and the parser consult the symbol table.
BOOLEAN EXPRESSIONS
• Control flow translation of boolean expressions:
• Basic idea: generate the jumping code without evaluating the whole
boolean expression.
• Example:
Let E = a < b. We will generate the code as:
(1) if a < b then goto E.true
(2) goto E.false
Grammar:
E → E or E | E and E | not E | (E) | id relop id | true | false
E -> E1 or E2 { E1.true = E.true; E1.false = newlabel; E2.true = E.true; E2.false =
E.false;
E.code = E1.code || gen(E1.false ‘:’) || E2.code}
E->E1 and E2 {E1.true = newlabel; E1.false = E.false;
E2.true = E.true; E2.false = E.false;
E.code = E1.code || gen(E1.true ‘:’) || E2.code}
E->not E1 {E1.true = E.false; E1.false = E.true; E.code = E1.code}
E->(E1) {E1.true = E.true; E1.false = E.false; E.code = E1.code;}
E->id1 relop id2 {E.code = gen(‘if’ id1.place relop.op id2.place ‘goto’ E.true); gen
(‘goto’ E.false);}
E->true {gen(‘goto’ E.true);}
E->false{gen(‘goto’ E.false);}
Example
Example: a < b or (c < d and e < f)
Example: while a< b do
if c < d then
x := y + z;
else
x: = y – z;
Statements that alter the flow of control
Fig. The Flowchart of the flow of control
Translation of Control flow statements
• Most of the programming languages have a common set of
statements that define the control flow of a program.
• These control statements are:
Assignment statement: It has a single statement assigning some
expression to a variable.
if-then-else statement: It has a condition associated with it.
The control flows either to the then-part or to the else-part.
while-do-loop
The control remains within the loop until a specified condition
becomes false.
Block of statements
It is a group of statements put within begin-end block markers.
Translation of Case Statements
• It is unique because the structure contains an expression.
• Control jumps to one of the many alternatives.
• Syntax:
switch (E) {
case c1: ……
.
.
case cn: ……
default : ……
}
Postfix translation:
• The postfix notation for an expression E can be defined:
1. If E is a variable or constant, then the postfix notation for E is
E itself.
2. If E is an expression of the form E1 op E2, where op is any
binary operator, then the postfix notation for E is E’1 E’2 op,
where E’1 and E’2 are the postfix notations for E1 and E2,
respectively.
3. If E is a parenthesized expression of the form (E1), then the
postfix notation for E is the same as the postfix notation for
E1.
Postfix notation
• Postfix notation is a linearized representation of a syntax
tree.
• It is a list of the nodes of the tree in which a node appears
immediately after its children.
• For example, for the assignment x := a*(−b) + a*(−b), whose syntax
tree has an assign node at the root (with uminus nodes for the
negations), the postfix notation is:
x a b uminus * a b uminus * + assign
Translation with a top down parser
• A top-down parser builds parse trees from the top (root) to the
bottom (leaves).
Fig. The procedures of a top-down parser.
Recursive Descent Parsing
• Recursive descent is a top-down parsing technique that constructs
the parse tree from the top and the input is read from left to
right.
• It uses procedures for every terminal and non-terminal entity.
• A form of recursive-descent parsing that does not require any
back-tracking is known as predictive parsing.
• This parsing technique is regarded as recursive as it uses a
context-free grammar which is recursive in nature.
Back-tracking
• Top- down parsers start from the root node (start symbol)
and match the input string against the production rules to
replace them (if matched).
• To understand this, take the following example of CFG:
S → rXd | rZd
X → oa | ea
Z → ai
Back tracking
• Now the parser matches the input letters in an ordered manner.
For instance, on an input such as read, it matches r, tries
X → oa, fails at e, backtracks, tries X → ea, which matches,
and finally matches d.
• The string is accepted.
Predictive Parser
• Predictive parser is a recursive descent parser.
• It has the capability to predict which production is to be
used to replace the input string.
• The predictive parser does not suffer from backtracking.
• The predictive parser puts some constraints on the
grammar and accepts only a class of grammar known as
LL(k) grammar.
PREDICTIVE PARSER
• The parser refers to the parsing table to take any decision
on the input and stack element combination.
LL Parser
• An LL Parser accepts LL grammar.
• LL grammar is a subset of context-free grammar but with
some restrictions to get the simplified version.
• LL grammar can be implemented by means of both
algorithms namely, recursive-descent or table-driven.
• LL parser is denoted as LL(k).
Array references in arithmetic expressions:
• Array elements can be accessed quickly if they are stored
in a block of consecutive locations.
• Elements are numbered 0, 1,…..,n-1, for an array with n
elements.
• If the width of each array element is w, then the ith
element of array A begins in location
base + i * w
where base is the relative address of the storage allocated
for the array, i.e., base is the relative address of A[0].
Layouts for a 2D Array
(a) Row Major: A[1,1], A[1,2], A[1,3] (first row), then
A[2,1], A[2,2], A[2,3] (second row)
(b) Column Major: A[1,1], A[2,1] (first column), then
A[1,2], A[2,2] (second column), then A[1,3], A[2,3] (third column)
Procedures call
• It is imperative for a compiler to generate good code for
procedure calls and returns.
• The run-time routines that handle procedure argument
passing, calls and returns are part of the run-time support
package.
• Let us consider a grammar for a simple procedure call
statement:
• (1) S → call id ( Elist )
• (2) Elist → Elist , E
• (3) Elist → E
Declarations and case statements.
• Declarations
Declarations with lists of names can be handled as follows:
D → T id ; D | ε
T → B C | record ’{’ D ’}’
B → int | float
C → ε | [ num ] C
Nonterminal D generates a sequence of declarations.
Nonterminal T generates basic, array, or record types.
Nonterminal B generates one of the basic types int or float.
Nonterminal C, for “component,” generates a string of zero or
more integers, each surrounded by brackets.
Case Statements
• The “switch” or “case” statement is available in a variety of
languages. The switch-statement
• The syntax is as shown below:
switch expression
begin
case value : statement
case value : statement
. . .
case value : statement
default : statement
end
ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 72

More Related Content

PDF
Syntax Directed Definition and its applications
PPTX
Type checking compiler construction Chapter #6
PPTX
Context free grammar
PDF
LR Parsing
PDF
Symbol table in compiler Design
PPTX
Compiler design syntax analysis
PPTX
serializability in dbms
PPTX
Theory of automata and formal language
Syntax Directed Definition and its applications
Type checking compiler construction Chapter #6
Context free grammar
LR Parsing
Symbol table in compiler Design
Compiler design syntax analysis
serializability in dbms
Theory of automata and formal language

What's hot (20)

PPT
Ll(1) Parser in Compilers
PDF
Operator precedence
PDF
Syntax directed translation
PPTX
Types of Parser
PPTX
Finite Automata in compiler design
PPTX
COMPILER DESIGN
PPTX
Top down parsing
PDF
Parse Tree
PPTX
Lexical Analysis - Compiler Design
PPT
Top down parsing
PPTX
Compiler Design
PDF
Bottom up parser
PDF
Deterministic Finite Automata (DFA)
PDF
COMPILER DESIGN- Syntax Directed Translation
PDF
Lecture 01 introduction to compiler
PPTX
Principle source of optimazation
PPTX
NFA & DFA
PPTX
Syntax Analysis in Compiler Design
PPTX
Finite automata-for-lexical-analysis
PPTX
Shift reduce parser
Ll(1) Parser in Compilers
Operator precedence
Syntax directed translation
Types of Parser
Finite Automata in compiler design
COMPILER DESIGN
Top down parsing
Parse Tree
Lexical Analysis - Compiler Design
Top down parsing
Compiler Design
Bottom up parser
Deterministic Finite Automata (DFA)
COMPILER DESIGN- Syntax Directed Translation
Lecture 01 introduction to compiler
Principle source of optimazation
NFA & DFA
Syntax Analysis in Compiler Design
Finite automata-for-lexical-analysis
Shift reduce parser
Ad

Similar to Compiler unit 2&3 (20)

PPT
Parsing
PPT
Cd2 [autosaved]
PPT
Lecture 05 syntax analysis 2
PPT
PARSING.ppt
PPT
Ch4_topdownparser_ngfjgh_ngjfhgfffdddf.PPT
PDF
Syntax Analysis PPTs for Third Year Computer Sc. and Engineering
PPTX
Top Down Parsing, Predictive Parsing
PPTX
3. Syntax Analyzer.pptx
PPT
Top_down_Parsing_ full_detail_explanation
PPTX
Syntax Analysis.pptx
PPTX
Syntactic Analysis in Compiler Construction
PPTX
Compiler Design_Syntax Analyzer_Top Down Parsers.pptx
PDF
LL(1) and the LR family of parsers used in compilers
PPT
51114.-Compiler-Design-Syntax-Analysis-Top-down.ppt
PPT
51114.-Compiler-Design-Syntax-Analysis-Top-down.ppt
PPTX
Chapter3pptx__2021_12_23_22_52_54.pptx
PPTX
LL(1) parsing
PDF
Left factor put
PDF
CS17604_TOP Parser Compiler Design Techniques
Parsing
Cd2 [autosaved]
Lecture 05 syntax analysis 2
PARSING.ppt
Ch4_topdownparser_ngfjgh_ngjfhgfffdddf.PPT
Syntax Analysis PPTs for Third Year Computer Sc. and Engineering
Top Down Parsing, Predictive Parsing
3. Syntax Analyzer.pptx
Top_down_Parsing_ full_detail_explanation
Syntax Analysis.pptx
Syntactic Analysis in Compiler Construction
Compiler Design_Syntax Analyzer_Top Down Parsers.pptx
LL(1) and the LR family of parsers used in compilers
51114.-Compiler-Design-Syntax-Analysis-Top-down.ppt
51114.-Compiler-Design-Syntax-Analysis-Top-down.ppt
Chapter3pptx__2021_12_23_22_52_54.pptx
LL(1) parsing
Left factor put
CS17604_TOP Parser Compiler Design Techniques
Ad

More from BBDITM LUCKNOW (19)

PPT
Unit 5 cspc
PPT
Unit 4 cspc
PPT
Unit3 cspc
PPT
Cse ppt 2018
PPT
Binary system ppt
PPT
Unit 4 ca-input-output
PPTX
Unit 3 ca-memory
PPT
Unit 2 ca- control unit
PPTX
Unit 1 ca-introduction
PPTX
PPT
Bnf and ambiquity
PPT
Minimization of dfa
PPTX
Passescd
PDF
Compiler unit 4
PDF
Compiler unit 1
PDF
Compiler unit 5
PDF
Cspc final
PPTX
Validation based protocol
Unit 5 cspc
Unit 4 cspc
Unit3 cspc
Cse ppt 2018
Binary system ppt
Unit 4 ca-input-output
Unit 3 ca-memory
Unit 2 ca- control unit
Unit 1 ca-introduction
Bnf and ambiquity
Minimization of dfa
Passescd
Compiler unit 4
Compiler unit 1
Compiler unit 5
Cspc final
Validation based protocol

Recently uploaded (20)

PDF
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
PPTX
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
PPTX
Construction Project Organization Group 2.pptx
PDF
Enhancing Cyber Defense Against Zero-Day Attacks using Ensemble Neural Networks
PDF
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
PPTX
Engineering Ethics, Safety and Environment [Autosaved] (1).pptx
DOCX
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
PPTX
additive manufacturing of ss316l using mig welding
PPTX
Welding lecture in detail for understanding
PPTX
IOT PPTs Week 10 Lecture Material.pptx of NPTEL Smart Cities contd
PPTX
CYBER-CRIMES AND SECURITY A guide to understanding
PPTX
Recipes for Real Time Voice AI WebRTC, SLMs and Open Source Software.pptx
PPTX
KTU 2019 -S7-MCN 401 MODULE 2-VINAY.pptx
PPTX
Lecture Notes Electrical Wiring System Components
PDF
TFEC-4-2020-Design-Guide-for-Timber-Roof-Trusses.pdf
PDF
PPT on Performance Review to get promotions
PPTX
Geodesy 1.pptx...............................................
PDF
Automation-in-Manufacturing-Chapter-Introduction.pdf
PDF
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...
PDF
Model Code of Practice - Construction Work - 21102022 .pdf
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
Construction Project Organization Group 2.pptx
Enhancing Cyber Defense Against Zero-Day Attacks using Ensemble Neural Networks
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
Engineering Ethics, Safety and Environment [Autosaved] (1).pptx
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
additive manufacturing of ss316l using mig welding
Welding lecture in detail for understanding
IOT PPTs Week 10 Lecture Material.pptx of NPTEL Smart Cities contd
CYBER-CRIMES AND SECURITY A guide to understanding
Recipes for Real Time Voice AI WebRTC, SLMs and Open Source Software.pptx
KTU 2019 -S7-MCN 401 MODULE 2-VINAY.pptx
Lecture Notes Electrical Wiring System Components
TFEC-4-2020-Design-Guide-for-Timber-Roof-Trusses.pdf
PPT on Performance Review to get promotions
Geodesy 1.pptx...............................................
Automation-in-Manufacturing-Chapter-Introduction.pdf
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...
Model Code of Practice - Construction Work - 21102022 .pdf

Compiler unit 2&3

  • 1. DEFINITION OF PARSING A parser is a compiler or interpreter component that breaks data into smaller elements for easy translation into another language. A parsertakes input in the form of a sequence of tokens or program instructions and usually builds a data structure in the form of a parse tree or an abstract syntax tree. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 1
  • 2. ROLE OF PARSER ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 2
  • 3. • In the compiler model, the parser obtains a string of tokens from the lexical analyzer, • and verifies that the string can be generated by the grammar for the source language. • The parser reports any syntax errors in the source program. • It collects a sufficient number of tokens and builds a parse tree. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 3
  • 4. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 4
  • 5. • There are basically two types of parser: • Top-down parser: • starts at the root of derivation tree and fills in • picks a production and tries to match the input • may require backtracking • some grammars are backtrack-free (predictive) • Bottom-up parser: • starts at the leaves and fills in • starts in a state valid for legal first tokens • uses a stack to store both state and sentential forms ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 5
  • 6. TOP DOWN PARSING • A top-down parser starts with the root of the parse tree, labeled with the start or goal symbol of the grammar. • To build a parse, it repeats the following steps until the fringe of the parse tree matches the input string. • STEP 1: At a node labeled A, select a production A → α and construct the appropriate child for each symbol of α. • STEP 2: When a terminal is added to the fringe that doesn’t match the input string, backtrack. • STEP 3: Find the next node to be expanded. • The key is selecting the right production in step 1. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 6
  • 7. EXAMPLE FOR TOP DOWN PARSING • Suppose the given production rules are as follows: • S -> aAd | aB • A -> b | c • B -> ccd ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 7
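As a sketch of how a backtracking top-down parser might realize this grammar (the function and variable names here are illustrative, not from the slides), the following C code tries S -> aAd first and falls back to S -> aB on a mismatch:

```c
#include <stdbool.h>
#include <string.h>

/* Illustrative backtracking parser for S -> aAd | aB, A -> b | c, B -> ccd. */
static const char *inp;                 /* current input position */

static bool A(void) {                   /* A -> b | c */
    if (*inp == 'b' || *inp == 'c') { inp++; return true; }
    return false;
}
static bool B(void) {                   /* B -> ccd */
    if (strncmp(inp, "ccd", 3) == 0) { inp += 3; return true; }
    return false;
}
static bool S(void) {
    const char *save = inp;             /* remember position for backtracking */
    if (*inp == 'a') {                  /* try S -> aAd */
        inp++;
        if (A() && *inp == 'd') { inp++; return true; }
    }
    inp = save;                         /* mismatch: backtrack, try S -> aB */
    if (*inp == 'a') { inp++; if (B()) return true; }
    inp = save;
    return false;
}
bool parse_s(const char *s) { inp = s; return S() && *inp == '\0'; }
```

On input accd the first alternative consumes a and c, fails on the expected d, and the parser backtracks to try aB, which succeeds.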
  • 8. PROBLEMS WITH TOP-DOWN PARSING 1) BACKTRACKING  Backtracking is a technique in which, for expansion of a non-terminal symbol, we choose one alternative and, if some mismatch occurs, we try another alternative, if any. If for a non-terminal there are multiple production rules beginning with the same input symbol, then to get the correct derivation we need to try all these alternatives. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 8
  • 9. EXAMPLE OF BACKTRACKING • Suppose the given production rules are as follows: • S->cAd • A->a|ab ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 9
  • 10. 2) LEFT RECURSION Left recursion is a case when the left-most non-terminal in a production of a non-terminal is the non-terminal itself (direct left recursion) or, through some other non-terminal definitions, rewrites to the non-terminal again (indirect left recursion). Consider these examples - (1) A -> Aq (direct) (2) A -> Bq B -> Ar (indirect) Left recursion has to be removed if the parser performs top-down parsing ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 10
  • 11. REMOVING LEFT RECURSION • To eliminate left recursion we need to modify the grammar. Let G be a grammar having a production rule with left recursion • A -> Aa • A -> B • Then we eliminate the left recursion by rewriting the production rules as: • A -> BA’ • A’ -> aA’ • A’ -> ε ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 11
  • 12. 3) LEFT FACTORING Left factoring is removing the common left factor that appears in two productions of the same non-terminal. It is done to avoid backtracking by the parser. Suppose the parser has a look-ahead; consider this example- A -> qB | qC where A, B, C are non-terminals and q is a terminal string. In this case, the parser will be confused as to which of the two productions to choose and it might have to backtrack. After left factoring, the grammar is converted to- A -> qD D -> B | C ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 12
  • 13. RECURSIVE DESCENT PARSING • A recursive descent parser is a kind of top-down parser built from a set of mutually recursive procedures (or a non-recursive equivalent) where each such procedure usually implements one of the productions of the grammar. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 13
  • 14. EXAMPLE OF RECURSIVE DESCENT PARSING Suppose the grammar given is as follows: E -> iE' E' -> +iE' | ε Program: l = getchar(); E() { if (l == 'i') { match('i'); E'(); } } ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 14
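The slide's fragment can be completed into a runnable sketch. Here `Ep` stands in for E' (an apostrophe is not legal in a C identifier) and `parse_e` is an added driver; both names are illustrative:

```c
#include <stdbool.h>

/* Runnable sketch of the slide's parser for E -> iE', E' -> +iE' | epsilon.
   Ep stands in for E', since ' is not legal in a C identifier. */
static const char *l;                        /* lookahead pointer */

static bool match(char c) {
    if (*l == c) { l++; return true; }
    return false;
}
static bool Ep(void) {
    if (*l == '+')                           /* E' -> + i E' */
        return match('+') && match('i') && Ep();
    return true;                             /* E' -> epsilon */
}
static bool E(void) { return match('i') && Ep(); }  /* E -> i E' */

bool parse_e(const char *s) { l = s; return E() && *l == '\0'; }
```

Because each alternative of E' starts with a distinct terminal, no backtracking is ever needed: this is exactly the predictive case.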
  • 17. PREDICTIVE LL(1) PARSING • The first “L” in LL(1) refers to the fact that the input is processed from left to right. • The second “L” refers to the fact that LL(1) parsing determines a leftmost derivation for the input string. • The “1” in parentheses implies that LL(1) parsing uses only one symbol of input to predict the next grammar rule that should be used. • The data structures used by LL(1) are 1. Input buffer 2. Stack 3. Parsing table ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 17
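As an illustration of how the three data structures cooperate, here is a minimal table-driven LL(1) driver in C for the toy grammar E -> iE', E' -> +iE' | ε. The nonterminals are encoded as the characters 'E' and 'P', and the `rhs` function is a hand-coded stand-in for a real parsing table; all names are illustrative:

```c
#include <stdbool.h>
#include <string.h>

/* Illustrative LL(1) driver for the toy grammar E -> iE', E' -> +iE' | epsilon.
   'E' and 'P' (for E') are the nonterminals; rhs() is a hand-coded stand-in
   for the parsing table M[nonterminal, lookahead]. */
static const char *rhs(char X, char a) {
    if (X == 'E' && a == 'i') return "iP";  /* E  -> i E'          */
    if (X == 'P' && a == '+') return "+iP"; /* E' -> + i E'        */
    if (X == 'P' && a == '$') return "";    /* E' -> epsilon on $  */
    return NULL;                            /* error entry         */
}

bool ll1_parse(const char *input) {
    char stack[64];
    int top = 0;
    stack[top++] = '$';                 /* bottom-of-stack marker */
    stack[top++] = 'E';                 /* start symbol           */
    const char *ip = input;
    while (top > 0) {
        char x = stack[--top];
        char a = *ip ? *ip : '$';       /* '$' models end of input */
        if (x == '$') return a == '$';
        if (x == 'i' || x == '+') {     /* terminal: must match    */
            if (x != a) return false;
            ip++;
        } else {
            const char *r = rhs(x, a);  /* consult the table       */
            if (!r) return false;
            for (int k = (int)strlen(r) - 1; k >= 0; k--)
                stack[top++] = r[k];    /* push RHS reversed       */
        }
    }
    return false;
}
```

The loop pops one symbol at a time: terminals are matched against the input buffer, nonterminals are replaced by the table entry's right-hand side (pushed in reverse so the leftmost symbol is expanded first).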
  • 18. • The construction of a predictive LL(1) parser is based on two very important functions, First and Follow. • For construction of a predictive LL(1) parser we have to follow these steps: • STEP 1: compute the FIRST and FOLLOW functions. • STEP 2: construct the predictive parsing table using the FIRST and FOLLOW functions. • STEP 3: parse the input string with the help of the predictive parsing table. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 18
  • 19. FIRST If X is a terminal then First(X) is just X! If there is a production X → ε then add ε to First(X). If there is a production X → Y1Y2..Yk then add First(Y1Y2..Yk) to First(X). First(Y1Y2..Yk) is either First(Y1) (if First(Y1) doesn't contain ε) OR (if First(Y1) does contain ε) everything in First(Y1) except for ε, as well as everything in First(Y2..Yk). If First(Y1), First(Y2), .., First(Yk) all contain ε then add ε to First(Y1Y2..Yk) as well. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 19
  • 20. FOLLOW • First put $ (the end of input marker) in Follow(S) (S is the start symbol) • If there is a production A → aBb, (where a can be a whole string) then everything in FIRST(b) except for ε is placed in FOLLOW(B). • If there is a production A → aB, then everything in FOLLOW(A) is in FOLLOW(B) • If there is a production A → aBb, where FIRST(b) contains ε, then everything in FOLLOW(A) is in FOLLOW(B) ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 20
  • 21. EXAMPLE OF FIRST AND FOLLOW The Grammar E → TE' E' → +TE' E' → ε T → FT' T' → *FT' T' → ε F → (E) F → id For this grammar: FIRST(E) = FIRST(T) = FIRST(F) = { (, id }, FIRST(E') = { +, ε }, FIRST(T') = { *, ε }; FOLLOW(E) = FOLLOW(E') = { ), $ }, FOLLOW(T) = FOLLOW(T') = { +, ), $ }, FOLLOW(F) = { +, *, ), $ }. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 21
  • 22. PROPERTIES OF LL(1) GRAMMARS 1. No left-recursive grammar is LL(1) 2. No ambiguous grammar is LL(1) 3. Some languages have no LL(1) grammar 4. A ε–free grammar where each alternative expansion for A begins with a distinct terminal is a simple LL(1) grammar. Example: S → aS | a is not LL(1) because FIRST(aS) = FIRST(a) = { a } S → aS´ S´ → aS´ | ε accepts the same language and is LL(1) ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 22
  • 23. PREDICTIVE PARSING TABLE Method: 1. For each production A → α: a) For each a ∈ FIRST(α), add A → α to M[A,a] b) If ε ∈ FIRST(α): I. For each b ∈ FOLLOW(A), add A → α to M[A,b] II. If $ ∈ FOLLOW(A), add A → α to M[A,$] 2. Set each undefined entry of M to error. If any M[A,a] has multiple entries then G is not LL(1). ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 23
  • 24. EXAMPLE OF PREDICTIVE PARSING LL(1) TABLE The given grammar is as follows S → E E → TE´ E´ → +E | -E | ε T → FT´ T´ → *T | /T | ε F → num | id ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 24
  • 25. BOTTOM UP PARSING Bottom-up parsing starts from the leaf nodes of a tree and works in upward direction till it reaches the root node. We start from a sentence and then apply production rules in reverse manner in order to reach the start symbol. Here, the parser tries to identify the R.H.S of a production rule and replace it by the corresponding L.H.S. This activity is known as reduction. Also known as LR parsing, where L means tokens are read from left to right and R means that it constructs a rightmost derivation in reverse. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 25
  • 26. EXAMPLE OF BOTTOM-UP PARSER E → T + E | T T → int * T | int | (E)  Consider the string: int * int + int. The reductions are: int * int + int ⇒ int * T + int (T → int) ⇒ T + int (T → int * T) ⇒ T + T (T → int) ⇒ T + E (E → T) ⇒ E (E → T + E) ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 26
  • 27. SHIFT REDUCE PARSING • Bottom-up parsing uses two kinds of actions: 1. Shift 2. Reduce • Shift: Move | one place to the right; shifts a terminal to the left string ABC|xyz ⇒ ABCx|yz • Reduce: Apply an inverse production at the right end of the left string. If A → xy is a production, then Cbxy|ijk ⇒ CbA|ijk ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 27
  • 28. EXAMPLE OF SHIFT REDUCE PARSING | int * int + int (shift); int | * int + int (shift); int * | int + int (shift); int * int | + int (reduce T → int); int * T | + int (reduce T → int * T); T | + int (shift); T + | int (shift); T + int | (reduce T → int); T + T | (reduce E → T); T + E | (reduce E → T + E); E | (accept) ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 28
  • 29. OPERATOR PRECEDENCE PARSING Operator grammars have the property that no production right side is empty or has two adjacent nonterminals.  This property enables the implementation of efficient operator-precedence parsers. These parsers rely on the following three precedence relations: Relation Meaning a <· b a yields precedence to b a =· b a has the same precedence as b a ·> b a takes precedence over b ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 29
  • 30. • These operator precedence relations allow us to delimit the handles in the right sentential forms: <· marks the left end, =· appears in the interior of the handle, and ·> marks the right end. • Suppose that $ is the end of the string. Then for all terminals b we can write: $ <· b and b ·> $ • If we remove all nonterminals and place the correct precedence relation <·, =·, ·> between the remaining terminals, there remain strings that can be analyzed by an easily developed parser. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 30
  • 31. EXAMPLE OF OPERATOR PRECEDENCE PARSING For example, the following operator precedence relations can be introduced for simple expressions (row = left terminal, column = right terminal; blank entries are errors):
         id   +    *    $
    id        ·>   ·>   ·>
    +    <·   ·>   <·   ·>
    *    <·   ·>   ·>   ·>
    $    <·   <·   <·
Example: The input string id1 + id2 * id3 after inserting precedence relations becomes $ <· id1 ·> + <· id2 ·> * <· id3 ·> $ ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 31
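The relation table can be encoded directly as a two-dimensional array. This C sketch (the function names and the 'i'-for-id encoding are illustrative) returns the relation between two terminals, with 0 for the blank error entries:

```c
/* Illustrative encoding of the operator-precedence table for {id, +, *, $}.
   'i' stands for id and '$' for the end marker.
   Returns '<', '>', or 0 (error entry) for the relation between a and b. */
static int idx(char t) { return t == 'i' ? 0 : t == '+' ? 1 : t == '*' ? 2 : 3; }

char prec(char a, char b) {
    static const char M[4][4] = {
        /*        id   +    *    $   */
        /* id */ { 0, '>', '>', '>' },
        /* +  */ {'<', '>', '<', '>' },
        /* *  */ {'<', '>', '>', '>' },
        /* $  */ {'<', '<', '<',  0  },
    };
    return M[idx(a)][idx(b)];
}
```

A parser driver would shift while the relation is <· and reduce while it is ·>, which is how the handles delimited above are found.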
  • 32. UNIT-III Syntax Directed Translations Production: E -> E1 + T Semantic Rule: E.code = E1.code || T.code || ’+’ • We may alternatively insert the semantic actions inside the grammar: E -> E1 + T {print ‘+’} ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 32
  • 33. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 33 • We can associate information with a language construct by attaching attributes to the grammar symbols. • A syntax directed definition specifies the values of attributes by associating semantic rules with the grammar productions.
  • 34. Syntax Directed Definitions 1. We associate information with the programming language constructs by attaching attributes to grammar symbols. 2. Values of these attributes are evaluated by the semantic rules associated with the production rules. 3. Evaluation of these semantic rules: • may generate intermediate codes • may put information into the symbol table • may perform type checking, may issue error messages • may perform some other activities • in fact, they may perform almost any activities. 4. An attribute may hold almost anything: • a string, a number, a memory location, a complex record. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 34
  • 35. Syntax-Directed Definitions and Translation Schemes 1. When we associate semantic rules with productions, we use two notations: • Syntax-Directed Definitions • Translation Schemes A. Syntax-Directed Definitions: • give high-level specifications for translations • hide many implementation details such as order of evaluation of semantic actions. • We associate a production rule with a set of semantic actions, and we do not say when they will be evaluated. B. Translation Schemes: • indicate the order of evaluation of semantic actions associated with a production rule. • In other words, translation schemes give a little bit information about implementation details. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 35
  • 36. Syntax-Directed Translation • Conceptually, with both syntax-directed definitions and translation schemes we • Parse the input token stream • Build the parse tree • Traverse the tree to evaluate the semantic rules at the parse tree nodes. Input string → parse tree → dependency graph → evaluation order for semantic rules (conceptual view of syntax directed translation) ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 36
  • 37. Syntax-Directed Definitions 1. A syntax-directed definition is a generalization of a context-free grammar in which: • Each grammar symbol is associated with a set of attributes. • This set of attributes for a grammar symbol is partitioned into two subsets called • synthesized and • inherited attributes of that grammar symbol. 2. The value of an attribute at a parse tree node is defined by the semantic rule associated with a production at that node. 3. The value of a synthesized attribute at a node is computed from the values of attributes at the children in that node of the parse tree. 4. The value of an inherited attribute at a node is computed from the values of attributes at the siblings and parent of that node of the parse tree. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 37
  • 38. Syntax-Directed Definitions Examples: Synthesized attribute : E→E1+E2 { E.val =E1.val + E2.val} Inherited attribute :A→XYZ {Y.val = 2 * A.val} 1. Semantic rules set up dependencies between attributes which can be represented by a dependency graph. 2. This dependency graph determines the evaluation order of these semantic rules. 3. Evaluation of a semantic rule defines the value of an attribute. But a semantic rule may also have some side effects such as printing a value. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 38
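A synthesized attribute such as E.val in the rule E → E1 + E2 { E.val = E1.val + E2.val } can be evaluated bottom-up over a tree. This C sketch (the node layout and helper names `leaf`, `bin`, `eval` are illustrative) computes each node's val from its children's values:

```c
#include <stdlib.h>

/* Illustrative bottom-up evaluation of a synthesized attribute `val`. */
struct node { char op; int val; struct node *l, *r; };

struct node *leaf(int v) {                         /* constant leaf */
    struct node *n = malloc(sizeof *n);
    n->op = 0; n->val = v; n->l = n->r = NULL;
    return n;
}
struct node *bin(char op, struct node *l, struct node *r) {  /* interior op node */
    struct node *n = malloc(sizeof *n);
    n->op = op; n->val = 0; n->l = l; n->r = r;
    return n;
}
int eval(struct node *n) {
    if (!n->l) return n->val;                      /* leaf: attribute is given */
    int a = eval(n->l), b = eval(n->r);            /* children's synthesized vals */
    n->val = (n->op == '+') ? a + b : a * b;       /* semantic rule at this node */
    return n->val;
}
```

The recursion mirrors the definition: a node's attribute is computed only after the attributes of its children, exactly the dependency order the semantic rules induce.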
  • 39. Syntax Trees Syntax-Tree • an intermediate representation of the compiler’s input. • A condensed form of the parse tree. • Syntax tree shows the syntactic structure of the program while omitting irrelevant details. • Operators and keywords are associated with the interior nodes. • Chains of simple productions are collapsed. Syntax directed translation can be based on syntax tree as well as parse tree. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 39
  • 40. Syntax Tree-Examples Expression: 5 + 3 * 4 (root +, with children 5 and a * node whose children are 3 and 4) • Leaves: identifiers or constants • Internal nodes: labelled with operations • Children: of a node are its operands Statement: if B then S1 else S2 (root if-then-else, with children B, S1, S2) Node’s label indicates what kind of a statement it is. Children of a node correspond to the components of the statement. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 40
  • 41. Intermediate representation and code generation Two possibilities: 1. semantic routines → code generation → machine code (+) no extra pass for code generation (+) allows simple 1-pass compilation 2. semantic routines → IR → code generation → machine code (+) allows higher-level operations e.g. open block, call procedures. (+) better optimization because IR is at a higher level. (+) machine dependence is isolated in code generation. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 41
  • 42. Three address code • In a three address code there is at most one operator at the right side of an instruction • Example: for the expression a + a * (b – c) + (b – c) * d the code is: t1 = b – c t2 = a * t1 t3 = a + t2 t4 = t1 * d t5 = t3 + t4 ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 42
  • 43. Forms of three address instructions • x = y op z • x = op y • x = y • goto L • if x goto L and ifFalse x goto L • if x relop y goto L • Procedure calls using: • param x • call p,n • y = call p,n • x = y[i] and x[i] = y • x = &y and x = *y and *x =y ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 43
  • 44. Example • do i = i+1; while (a[i] < v); With symbolic labels: L: t1 = i + 1 i = t1 t2 = i * 8 t3 = a[t2] if t3 < v goto L With position numbers: 100: t1 = i + 1 101: i = t1 102: t2 = i * 8 103: t3 = a[t2] 104: if t3 < v goto 100 ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 44
  • 45. Data structures for three address codes • Quadruples • Has four fields: op, arg1, arg2 and result • Triples • Temporaries are not used and instead references to instructions are made • Indirect triples • In addition to triples we use a list of pointers to triples ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 45
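A quadruple is naturally a four-field record. A minimal C encoding (the struct and field names are illustrative) of the first two instructions t1 = b – c and t2 = a * t1 might look like:

```c
#include <string.h>

/* Illustrative quadruple representation: one record per 3-address instruction. */
struct quad { const char *op, *arg1, *arg2, *result; };

/* t1 = b - c ; t2 = a * t1 */
static const struct quad code[] = {
    { "-", "b", "c",  "t1" },
    { "*", "a", "t1", "t2" },
};
```

In the triple representation the `result` field disappears and `arg1`/`arg2` may instead hold instruction indices, which is more compact but makes reordering instructions during optimization harder; indirect triples restore that flexibility with a separate pointer list.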
  • 46. Example • a = b * minus c + b * minus c Three address code: t1 = minus c t2 = b * t1 t3 = minus c t4 = b * t3 t5 = t2 + t4 a = t5
Quadruples (op, arg1, arg2, result):
    (0) minus  c        t1
    (1) *      b   t1   t2
    (2) minus  c        t3
    (3) *      b   t3   t4
    (4) +      t2  t4   t5
    (5) =      t5       a
Triples (op, arg1, arg2; temporaries replaced by instruction references):
    (0) minus  c
    (1) *      b    (0)
    (2) minus  c
    (3) *      b    (2)
    (4) +      (1)  (3)
    (5) =      a    (4)
Indirect triples: the same triples plus a separate list of pointers to them, e.g. 35 → (0), 36 → (1), 37 → (2), 38 → (3), 39 → (4), 40 → (5). ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 46
  • 47. Intermediate representation and code generation IR: good for optimization and portability. Machine code: simple. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 47
  • 48. Intermediate code 1. postfix form Example: a+b → ab+ (a+b)*c → ab+c* a+b*c → abc*+ a:=b*c+b*d → abc*bd*+:= (+) simple and concise (+) good for driving an interpreter (-) not good for optimization or code generation ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 48
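Postfix form drives an interpreter with nothing more than an operand stack. This C sketch (limited to single-digit operands and the operators +, -, * for brevity; the function name is illustrative) evaluates a postfix string:

```c
/* Illustrative postfix evaluator: single-digit operands, operators +, -, *.
   Operands are pushed; each operator pops two values and pushes the result. */
int eval_postfix(const char *p) {
    int st[64], top = 0;                 /* operand stack */
    for (; *p; p++) {
        if (*p >= '0' && *p <= '9') {
            st[top++] = *p - '0';        /* push operand */
        } else {
            int b = st[--top], a = st[--top];
            if (*p == '+')      st[top++] = a + b;
            else if (*p == '-') st[top++] = a - b;
            else                st[top++] = a * b;   /* '*' */
        }
    }
    return st[top - 1];                  /* final result on top of stack */
}
```

For example, the postfix string 234*+ corresponds to 2 + 3 * 4 and evaluates left to right without any parentheses or precedence rules, which is why the form is so convenient for interpretation.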
  • 49. INTERMEDIATE CODE 2. 3-addr code  Triple: op arg1 arg2  Quadruple: op arg1 arg2 result Triple: more concise. But what if instructions are deleted, moved or added during optimization?  Triples and quadruples are more similar to machine code. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 49
  • 50. INTERMEDIATE CODE More detailed 3-addr code  Add type information Example a := b*c + b*d Suppose b, c are integer type, d is float type. (1) ( I* b c ) (I* b c t1) (2) (FLOAT b _ ) (FLOAT b t2 _) (3) ( F* (2) d ) (F* t2 d t3) (4) (FLOAT (1) _ ) (FLOAT t1 t4 _) (5) ( F+ (4) (3) ) ( F+ t4 t3 t5) (6) ( := (5) a ) ( := t5 a _) ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 50
  • 51. PARSE TREES Parsing: build the parse tree. Non-terminals for operator precedence and associativity are included. For example, the parse tree for an assignment of the form id := id + id * const has <target> := <exp> at the root, with <exp> → <exp> + <term>, <term> → <term> * <factor>, and chains <exp> → <term> → <factor> collapsing to id or const at the leaves. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 51
  • 52. PARSE TREE Source program → Lexical Analyzer → (token / getNextToken) → Parser → parse tree → Rest of Front End → intermediate representation; both the lexical analyzer and the parser consult the symbol table. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 52
  • 53. BOOLEAN EXPRESSIONS • Control flow translation of boolean expressions: • Basic idea: generate the jumping code without evaluating the whole boolean expression. • Example: Let E = a < b. We will generate the code as (1) if a < b then goto E.true (2) goto E.false Grammar: E -> E or E | E and E | not E | (E) | id relop id | true | false. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 53
  • 54. E -> E1 or E2 { E1.true = E.true; E1.false = newlabel; E2.true = E.true; E2.false = E.false; E.code = E1.code || gen(E1.false ‘:’) || E2.code} E -> E1 and E2 {E1.true = newlabel; E1.false = E.false; E2.true = E.true; E2.false = E.false; E.code = E1.code || gen(E1.true ‘:’) || E2.code} E -> not E1 {E1.true = E.false; E1.false = E.true; E.code = E1.code} E -> (E1) {E1.true = E.true; E1.false = E.false; E.code = E1.code;} E -> id1 relop id2 {E.code = gen(‘if’ id1.place relop.op id2.place ‘goto’ E.true) || gen(‘goto’ E.false);} E -> true {gen(‘goto’ E.true);} E -> false {gen(‘goto’ E.false);} ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 54
  • 55. Example Example: a < b or (c < d and e < f) Example: while a < b do if c < d then x := y + z; else x := y – z; ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 55
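Applying the translation scheme above to the first example, with Ltrue and Lfalse standing for E.true and E.false of the whole expression and L1, L2 the fresh labels from newlabel, the generated jumping code would be roughly:

```
    if a < b goto Ltrue
    goto L1
L1: if c < d goto L2
    goto Lfalse
L2: if e < f goto Ltrue
    goto Lfalse
```

Note that e < f is never tested unless a < b fails and c < d holds: the short-circuit behaviour falls out of the label wiring, not from evaluating the whole boolean expression.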
  • 56. Statements that alter the flow of control Fig. The Flowchart of the flow of control ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 56
  • 57. Translation of Control flow statements • Most of the programming languages have a common set of statements that define the control flow of a program. • These control statements are: Assignment statement: It has a single statement assigning some expression to a variable. if-then-else statement: It has a condition associated with it. The control flows either to the then-part or to the else-part. while-do-loop The control remains within the loop until a specified condition becomes false. Block of statements It is group of statements put within a begin-end block marker. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 57
  • 58. Translation of Case Statements • It is unique because the structure contains an expression. • Control jumps to one of the many alternatives. • Syntax: switch (E) { case c1: …… . . case cn: …… default : …… } ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 58
  • 59. Postfix translation: • The postfix notation for an expression E can be defined: 1. If E is a variable or constant, then the postfix notation for E is E itself. 2. If E is an expression of the form E1 op E2, where op is any binary operator, then the postfix notation for E is E’1 E’2 op, where E’1 and E’2 are the postfix notations for E1 and E2, respectively. 3. If E is a parenthesized expression of the form (E1), then the postfix notation for E is the same as the postfix notation for E1. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 59
  • 60. Postfix notation • Postfix notation is a linearized representation of a syntax tree. • It is a list of the nodes of the tree in which a node appears immediately after its children. • For example, for the syntax tree of x := a*(-b) + a*(-b), where the unary minus appears as a uminus node under an assign root, the postfix notation is x a b uminus * a b uminus * + assign ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 60
  • 61. Translation with a top down parser • which build parse trees from top(root) to bottom(leaves). Fig. The procedures of a top down parser. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 61
  • 62. Recursive Descent Parsing • Recursive descent is a top-down parsing technique that constructs the parse tree from the top and the input is read from left to right. • It uses procedures for every terminal and non-terminal entity. • A form of recursive-descent parsing that does not require any back-tracking is known as predictive parsing. • This parsing technique is regarded as recursive as it uses a context-free grammar, which is recursive in nature. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 62
  • 63. Back-tracking • Top- down parsers start from the root node (start symbol) and match the input string against the production rules to replace them (if matched). • To understand this, take the following example of CFG: S → rXd | rZd X → oa | ea Z → ai ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 63
  • 64. Back tracking • Now the parser matches all the input letters in an ordered manner. • The string is accepted. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 64
  • 65. Predictive Parser • Predictive parser is a recursive descent parser. • It has the capability to predict which production is to be used to replace the input string. • The predictive parser does not suffer from backtracking. • The predictive parser puts some constraints on the grammar and accepts only a class of grammar known as LL(k) grammar. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 65
  • 66. PREDICTIVE PARSER • The parser refers to the parsing table to take any decision on the input and stack element combination. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 66
  • 67. LL Parser • An LL Parser accepts LL grammar. • LL grammar is a subset of context-free grammar but with some restrictions to get the simplified version. • LL grammar can be implemented by means of both algorithms namely, recursive-descent or table-driven. • LL parser is denoted as LL(k). ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 67
  • 68. Array references in arithmetic expressions: • Array elements can be accessed quickly if they are stored in a block of consecutive locations. • Elements are numbered 0, 1,…..,n-1, for an array with n elements. • If the width of each array element is w, then the ith element of array A begins in location base + i * w where base is the relative address of the storage allocated for the array, i.e., base is the relative address of A[0]. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 68
  • 69. Layouts for a 2D Array A[1, 1] A[1, 2] A[1, 3] A[2, 1] A[2, 2] A[2, 3] First row Second row First column Third Column Second Column A[1, 1] A[2, 1] A[1, 2] A[2, 2] A[1, 3] A[2, 3] (a) Row Major (b) Column Major ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 69
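For the row-major layout, the address of A[i, j] generalizes the one-dimensional formula to base + ((i - low1) * n2 + (j - low2)) * w, where low1 and low2 are the lower bounds of the two dimensions and n2 is the number of columns. A C sketch (the function and parameter names are illustrative):

```c
/* Illustrative row-major address computation for a 2-D array A[i, j]:
   low1, low2 are the lower index bounds, n2 the number of columns,
   w the element width in bytes. */
unsigned addr2d(unsigned base, int i, int j,
                int low1, int low2, int n2, int w) {
    return base + (unsigned)(((i - low1) * n2 + (j - low2)) * w);
}
```

For the 2x3 array of the slide with bounds starting at 1 and 4-byte elements, A[2, 3] lands 20 bytes past the base: one full row of 3 elements plus 2 more elements. The column-major formula simply swaps the roles of the two subscripts.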
  • 70. Procedures call • It is imperative for a compiler to generate good code for procedure calls and returns. • The run-time routines that handle procedure argument passing, calls and returns are part of the run-time support package. • Let us consider a grammar for a simple procedure call statement: • (1) S → call id ( Elist ) • (2) Elist → Elist , E • (3) Elist → E ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 70
  • 71. Declarations and case statements. • Declarations Declarations with lists of names can be handled as follows: D → T id ; D | ε T → B C | record ’{’ D ’}’ B → int | float C → ε | [ num ] C Nonterminal D generates a sequence of declarations. Nonterminal T generates basic, array, or record types. Nonterminal B generates one of the basic types int or float. Nonterminal C, for “component,” generates a string of zero or more integers, each surrounded by brackets. ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 71
  • 72. Case Statements • The “switch” or “case” statement is available in a variety of languages. The switch-statement • Syntax is as shown below : Switch expression begin case value : statement case value : statement . . . case value : statement default : statement end ANKUR SRIVASTAVA ASSISTANT PROFESSOR JIT 72