Lecture 4: Parsing
CS4200 Compiler Construction
Eelco Visser
TU Delft
September 2018
This Lecture
!2
Compiler pipeline: Java -> Parse -> Type Check -> Optimize -> CodeGen -> JVM bytecode
Turning syntax definitions into parsers
Reading Material
3
The perspective of this lecture on declarative syntax definition is
explained more elaborately in this Onward! 2010 essay. It uses an
older version of SDF than used in these slides. Production rules have
the form 

X1 … Xn -> N {cons(“C”)} 

instead of 

N.C = X1 … Xn
http://swerl.tudelft.nl/twiki/pub/Main/TechnicalReports/TUD-SERG-2010-019.pdf
https://doi.org/10.1145/1932682.1869535
!5
Compilers: Principles, Techniques, and Tools, 2nd Edition
Alfred V. Aho, Columbia University

Monica S. Lam, Stanford University

Ravi Sethi, Avaya Labs

Jeffrey D. Ullman, Stanford University

2007 | Pearson
Classical compiler textbook

Chapter 4: Syntax Analysis

Read Sections 4.1, 4.2, 4.3, 4.5, 4.6
Pictures in these slides are copies from the book
!6
Sikkel, N. (1993). Parsing Schemata.

PhD thesis. Enschede: Universiteit Twente
This PhD thesis presents a uniform framework for describing
a wide range of parsing algorithms.
https://research.utwente.nl/en/publications/parsing-schemata
“Parsing schemata provide a general framework for
specification, analysis and comparison of (sequential and/or
parallel) parsing algorithms. A grammar specifies implicitly what
the valid parses of a sentence are; a parsing algorithm specifies
explicitly how to compute these. Parsing schemata form a
well-defined level of abstraction in between grammars and parsing
algorithms. A parsing schema specifies the types of
intermediate results that can be computed by a parser, and the
rules that allow to expand a given set of such results with new
results. A parsing schema does not specify the data structures,
control structures, and (in case of parallel processing)
communication structures that are to be used by a parser.”
For the interested
!7
https://ivi.fnwi.uva.nl/tcs/pub/reports/1995/P9507.ps.Z
This paper applies parsing schemata to disambiguation filters
for priority conflicts.
For the interested
Parser Architecture
8
Traditional Parser Architecture
!9
Source: Compilers Principles, Techniques & Tools
Context-Free Grammars
10
Context-Free Grammars
!11
Terminals
- Basic symbols from which strings are formed

Nonterminals
- Syntactic variables that denote sets of strings

Start Symbol
- Denotes the nonterminal that generates the strings of the language

Productions
- A = X1 … Xn

- Head/left side (A) is a nonterminal

- Body/right side (X1 … Xn) is a sequence of zero or more terminals and nonterminals
Example Context-Free Grammar
!12
grammar
start S
non-terminals E T F
terminals "+" "*" "(" ")" ID
productions
S = E
E = E "+" T
E = T
T = T "*" F
T = F
F = "(" E ")"
F = ID
Abbreviated Grammar
!13
grammar
productions
S = E
E = E "+" T
E = T
T = T "*" F
T = F
F = "(" E ")"
F = ID
Nonterminals, terminals can be derived from productions
First production defines start symbol
grammar
start S
non-terminals E T F
terminals "+" "*" "(" ")" ID
productions
S = E
E = E "+" T
E = T
T = T "*" F
T = F
F = "(" E ")"
F = ID
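The two observations on this slide can be sketched in Python. This is an illustration only, not part of the original slides: representing a production as a `(head, body)` pair and the helper name `symbols_of` are assumptions made for the example.

```python
# Productions as (head, body) pairs; this representation is assumed for illustration.
PRODUCTIONS = [
    ("S", ["E"]),
    ("E", ["E", '"+"', "T"]),
    ("E", ["T"]),
    ("T", ["T", '"*"', "F"]),
    ("T", ["F"]),
    ("F", ['"("', "E", '")"']),
    ("F", ["ID"]),
]


def symbols_of(productions):
    """Return (start, nonterminals, terminals) derived from the productions."""
    # Every symbol that occurs as the head of a production is a nonterminal.
    nonterminals = {head for head, _ in productions}
    # Every body symbol that never occurs as a head is a terminal.
    terminals = {sym for _, body in productions for sym in body} - nonterminals
    # The head of the first production is taken as the start symbol.
    start = productions[0][0]
    return start, nonterminals, terminals


start, nonterminals, terminals = symbols_of(PRODUCTIONS)
```

Note that this derivation counts `S` among the nonterminals, whereas the explicit declaration on the slide lists only `E T F` next to `start S`.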
Notation
!14
A, B, C: non-terminals
l: terminals
a, b, c: strings of non-terminals and terminals
(alpha, beta, gamma in math)
w, v: strings of terminal symbols
Meta: Syntax of Grammars
!15
context-free syntax
Production.Prod = <
<Symbol><Constructor?> = <Symbol*>
>
Symbol.NT = <<ID>>
Symbol.T = <<STRING>>
Symbol.L = <<LCID>>
Constructor.Con = <.<ID>>
context-free syntax // grammars
Grammar.Grammar = <
grammar
<Start?>
<Sorts?>
<Terminals?>
<Productions>
>
context-free syntax
Start.Start = <
start <ID>
>
Sorts.Sorts = <
sorts <ID*>
>
Sorts.NonTerminals = <
non-terminals <ID*>
>
Terminals.Terminals = <
terminals <Symbol*>
>
Productions.Productions = <
productions
<Production*>
>
Derivations: Generating Sentences from Symbols
16
Derivations
!17
grammar
productions
E = E "+" E
E = E "*" E
E = "-" E
E = "(" E ")"
E = ID
// derivation step: replace a non-terminal by the rhs of one of its productions
// e.g. with E = E "+" E,
// replace E by E "+" E
//
// derivation:
// repeatedly apply derivation steps
derivation
E
=> "-" E
=> "-" "(" E ")"
=> "-" "(" ID ")"
derivation // derives in zero or more steps
E =>* "-" "(" ID "+" ID ")"
Meta: Syntax of Derivations
!18
context-free syntax // derivations
Derivation.Derivation = <
derivation
<Symbol> <Step*>
>
Step.Step = [=> [Symbol*]]
Step.Steps = [=>* [Symbol*]]
Step.Steps1 = [=>+ [Symbol*]]
Left-Most Derivation
!19
grammar
productions
E = E "+" E
E = E "*" E
E = "-" E
E = "(" E ")"
E = ID
derivation // left-most derivation
E
=> "-" E
=> "-" "(" E ")"
=> "-" "(" E "+" E ")"
=> "-" "(" ID "+" E ")"
=> "-" "(" ID "+" ID ")"
Left-most derivation: Expand left-most non-terminal at each step
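As a rough sketch, a derivation can be replayed mechanically once the order of expansion is fixed. The helper below is hypothetical (the name `derive`, the dictionary representation of the grammar, and addressing productions by index are all assumptions for this example); a flag selects whether the left-most or the right-most non-terminal is expanded.

```python
# Grammar from the slide; the dictionary representation is assumed for illustration.
GRAMMAR = {
    "E": [
        ["E", '"+"', "E"],
        ["E", '"*"', "E"],
        ['"-"', "E"],
        ['"("', "E", '")"'],
        ["ID"],
    ]
}


def derive(start, choices, leftmost=True):
    """Apply productions (given by index) to the left-most or right-most non-terminal."""
    form = [start]
    steps = [" ".join(form)]
    for head, index in choices:
        # Positions of all non-terminals in the current sentential form.
        positions = [i for i, sym in enumerate(form) if sym in GRAMMAR]
        pos = positions[0] if leftmost else positions[-1]
        assert form[pos] == head, "choice does not match the selected non-terminal"
        # One derivation step: replace the non-terminal by the production body.
        form[pos:pos + 1] = GRAMMAR[head][index]
        steps.append(" ".join(form))
    return steps


# Indices: 2 is E = "-" E, 3 is E = "(" E ")", 0 is E = E "+" E, 4 is E = ID.
choices = [("E", 2), ("E", 3), ("E", 0), ("E", 4), ("E", 4)]
left = derive("E", choices, leftmost=True)
right = derive("E", choices, leftmost=False)
```

Both runs end in the same sentence; they differ only in the intermediate sentential forms, which is why a parse tree can abstract from derivation order.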
Right-Most Derivation
!20
grammar
productions
E = E "+" E
E = E "*" E
E = "-" E
E = "(" E ")"
E = ID
derivation // left-most derivation
E
=> "-" E
=> "-" "(" E ")"
=> "-" "(" E "+" E ")"
=> "-" "(" ID "+" E ")"
=> "-" "(" ID "+" ID ")"
Right-most derivation: Expand right-most non-terminal at each step
derivation // right-most derivation
E
=> "-" E
=> "-" "(" E ")"
=> "-" "(" E "+" E ")"
=> "-" "(" E "+" ID ")"
=> "-" "(" ID "+" ID ")"
Meta: Tree Derivations
!21
context-free syntax // tree derivations
Derivation.TreeDerivation = <
tree derivation
<Symbol> <PStep*>
>
PStep.Step = [=> [PT*]]
PStep.Steps = [=>* [PT*]]
PStep.Steps1 = [=>+ [PT*]]
PT.App = <<Symbol>[<PT*>]>
PT.Str = <<STRING>>
PT.Sym = <<Symbol>>
Left-Most Tree Derivation
!22
grammar
productions
E.A = E "+" E
E.T = E "*" E
E.N = "-" E
E.P = "(" E ")"
E.V = ID
derivation // left-most derivation
E
=> "-" E
=> "-" "(" E ")"
=> "-" "(" E "+" E ")"
=> "-" "(" ID "+" E ")"
=> "-" "(" ID "+" ID ")"
tree derivation // left-most
E
=> E["-" E]
=> E["-" E["(" E ")"]]
=> E["-" E["(" E[E "+" E] ")"]]
=> E["-" E["(" E[E[ID] "+" E] ")"]]
=> E["-" E["(" E[E[ID] "+" E[ID]] ")"]]
Left-Most Tree Derivation
!23
tree derivation // left-most
E
=> E["-" E]
=> E["-" E["(" E ")"]]
=> E["-" E["(" E[E "+" E] ")"]]
=> E["-" E["(" E[E[ID] "+" E] ")"]]
=> E["-" E["(" E[E[ID] "+" E[ID]] ")"]]
Ambiguity: Deriving Multiple Parse Trees
!24
grammar
productions
E.A = E "+" E
E.T = E "*" E
E.N = "-" E
E.P = "(" E ")"
E.V = ID
derivation
E =>* ID "+" ID "*" ID
Ambiguous grammar: produces more than one parse tree for some sentence
derivation
E
=> E "+" E
=> ID "+" E
=> ID "+" E "*" E
=> ID "+" ID "*" E
=> ID "+" ID "*" ID
tree derivation
E =>* E[E[ID] "+" E[E[ID] "*" E[ID]]]
derivation
E
=> E "*" E
=> E "+" E "*" E
=> ID "+" E "*" E
=> ID "+" ID "*" E
=> ID "+" ID "*" ID
tree derivation
E =>* E[E[E[ID] "+" E[ID]] "*" E[ID]]
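One way to make the ambiguity concrete is to count parse trees. The sketch below is an assumption-laden illustration, not a general parser: it hard-codes the five productions of this expression grammar, takes tokens without quotes, and the function name `count_parse_trees` is made up for the example.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count the parse trees that derive the token sequence from E."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        span = tokens[i:j]
        total = 0
        if span == ("ID",):                                        # E = ID
            total += 1
        if len(span) >= 2 and span[0] == "-":                      # E = "-" E
            total += count(i + 1, j)
        if len(span) >= 3 and span[0] == "(" and span[-1] == ")":  # E = "(" E ")"
            total += count(i + 1, j - 1)
        for k in range(i + 1, j - 1):                              # E = E "+" E, E = E "*" E
            if tokens[k] in ("+", "*"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


ambiguous = count_parse_trees(["ID", "+", "ID", "*", "ID"])
```

For `ID "+" ID "*" ID` the count is 2, corresponding to the two tree derivations shown on the slide.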
Meta: Term Derivations
!25
context-free syntax // term derivations
Derivation.TermDerivation = <
term derivation
<Symbol> <TStep*>
>
TStep.Step = [=> [Term*]]
TStep.Steps = [=>* [Term*]]
TStep.Steps1 = [=>+ [Term*]]
Term.App = <<ID>(<{Term ","}*>)>
Term.Str = <<STRING>>
Term.Sym = <<Symbol>>
Ambiguity: Deriving Abstract Syntax Terms
!26
grammar
productions
E.A = E "+" E
E.T = E "*" E
E.N = "-" E
E.P = "(" E ")"
E.V = ID
derivation
E =>* ID "+" ID "*" ID
derivation
E
=> E "+" E
=> ID "+" E
=> ID "+" E "*" E
=> ID "+" ID "*" E
=> ID "+" ID "*" ID
derivation
E
=> E "*" E
=> E "+" E "*" E
=> ID "+" E "*" E
=> ID "+" ID "*" E
=> ID "+" ID "*" ID
term derivation
E
=> A(E, E)
=> A(V(ID), E)
=> A(V(ID), T(E, E))
=> A(V(ID), T(V(ID), E))
=> A(V(ID), T(V(ID), V(ID)))
term derivation
E
=> T(E, E)
=> T(A(E, E), E)
=> T(A(V(ID), E), E)
=> T(A(V(ID), V(ID)), E)
=> T(A(V(ID), V(ID)), V(ID))
Grammar Transformations
27
Grammar Transformations
!28
Why?
- Disambiguation

- For use by a particular parsing algorithm

Transformations
- Eliminating ambiguities

- Eliminating left recursion

- Left factoring

Properties
- Does the transformation preserve the language (set of strings, trees)?

- Does the transformation preserve the structure of trees?
Ambiguous Expression Grammar
!29
derivation
E =>* ID "*" ID "+" ID
term derivation
E
=> A(E, E)
=> A(T(E, E), E)
=> A(T(E, E), E)
=> A(T(V(ID), E), E)
=> A(T(V(ID), V(ID)), E)
=> A(T(V(ID), V(ID)), V(ID))
term derivation
E
=> T(E, E)
=> T(E, E)
=> T(V(ID), E)
=> T(V(ID), A(E, E))
=> T(V(ID), A(V(ID), E))
=> T(V(ID), A(V(ID), V(ID)))
grammar
productions
E.A = E "+" E
E.T = E "*" E
E.M = "-" E
E.B = "(" E ")"
E.V = ID
Associativity and Priority Filter Ambiguities
!30
grammar
productions
E.A = E "+" E
E.T = E "*" E
E.M = "-" E
E.B = "(" E ")"
E.V = ID
derivation
E =>* ID "*" ID "+" ID
term derivation
E
=> A(E, E)
=> A(T(E, E), E)
=> A(T(E, E), E)
=> A(T(V(ID), E), E)
=> A(T(V(ID), V(ID)), E)
=> A(T(V(ID), V(ID)), V(ID))
term derivation
E
=> T(E, E)
=> T(E, E)
=> T(V(ID), E)
=> T(V(ID), A(E, E))
=> T(V(ID), A(V(ID), E))
=> T(V(ID), A(V(ID), V(ID)))
grammar
productions
E.A = E "+" E {left}
E.T = E "*" E {left}
E.M = "-" E
E.B = "(" E ")"
E.V = ID
priorities
E.M > E.T > E.A
Define Associativity and Priority by Transformation
!31
grammar
productions
E.A = E "+" T
E = T
T.T = T "*" F
T = F
F.V = ID
F.B = "(" E ")"
derivation
E =>* ID "*" ID "+" ID
term derivation
E
=> A(E, E)
=> A(T(E, E), E)
=> A(T(E, E), E)
=> A(T(V(ID), E), E)
=> A(T(V(ID), V(ID)), E)
=> A(T(V(ID), V(ID)), V(ID))
term derivation
E
=> T(E, E)
=> T(E, E)
=> T(V(ID), E)
=> T(V(ID), A(E, E))
=> T(V(ID), A(V(ID), E))
=> T(V(ID), A(V(ID), V(ID)))
grammar
productions
E.A = E "+" E {left}
E.T = E "*" E {left}
E.M = "-" E
E.B = "(" E ")"
E.V = ID
priorities
E.M > E.T > E.A
Define Associativity and Priority by Transformation
!32
grammar
productions
E.A = E "+" T
E = T
T.T = T "*" F
T = F
F.V = ID
F.B = "(" E ")"
grammar
productions
E.A = E "+" E {left}
E.T = E "*" E {left}
E.M = "-" E
E.B = "(" E ")"
E.V = ID
priorities
E.M > E.T > E.A
Define a new non-terminal for each priority level:

E, T, F
Add ‘injection’ productions to include priority level n+1 in n:

E = T

T = F
Transform productions

Left: E = E "+" T

Right: E = T "+" E
Change head of production to reflect priority level

T = T "*" F
Dangling Else Grammar
!33
grammar
sorts S E
productions
S.If = if E then S
S.IfE = if E then S else S
S = other
derivation
S =>* if E1 then S1 else if E2 then S2 else S3
term derivation
S =>* IfE(E1, S1, IfE(E2, S2, S3))
term derivation
S
=> IfE(E1, S1, S)
=> IfE(E1, S1, IfE(E2, S2, S3))
Dangling Else Grammar is Ambiguous
!34
grammar
sorts S E
productions
S.If = if E then S
S.IfE = if E then S else S
S = other
term derivation
S
=> If(E1, S)
=> If(E1, IfE(E2, S1, S2))
derivation
S
=> if E1 then S
=> if E1 then if E2 then S1 else S2
derivation
S =>* if E1 then if E2 then S1 else S2
term derivation
S
=> IfE(E1, S, S2)
=> IfE(E1, If(E2, S1), S2)
derivation
S
=> if E1 then S else S2
=> if E1 then if E2 then S1 else S2
Eliminating Dangling Else Ambiguity
!35
grammar
sorts S E
productions
S.If = if E then S
S.IfE = if E then S else S
S = other
grammar
productions
S = MS
S = OS
MS = if E then MS else MS
MS = other
OS = if E then S
OS = if E then MS else OS
Generalization of this transformation: contextual grammars
!36
This paper defines a declarative semantics for associativity
and priority declarations for disambiguation.

The paper provides a safe semantics and extends it to
deep priority conflicts.

The result of disambiguation is a contextual grammar,
which generalises the disambiguation for the dangling-else
grammar.
The paper is still in production. Ask us for a copy of the draft.
Eliminating Left Recursion
!37
grammar
productions
E = E "+" T
E = T
T = T "*" F
T = F
F = "(" E ")"
F = ID
grammar
productions
A = A a
A = b
// b followed by a list of as
grammar
productions
A = b A'
A' = a A'
A' = // empty
grammar
productions
E = T E'
E' = "+" T E'
E' = // empty
T = F T'
T' = "*" F T'
T' = // empty
F = "(" E ")"
F = ID
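The scheme `A = A a | b` to `A = b A'`, `A' = a A' | (empty)` can be sketched as a small function. This is a minimal illustration that assumes only immediate left recursion and assumes that the primed name is not already used in the grammar; the function name `eliminate_left_recursion` and the list-of-lists representation are hypothetical.

```python
def eliminate_left_recursion(head, bodies):
    """Rewrite A = A a | b into A = b A' and A' = a A' | (empty)."""
    # Tails "a" of the left-recursive alternatives A = A a.
    recursive = [body[1:] for body in bodies if body and body[0] == head]
    # Alternatives "b" that do not start with A.
    others = [body for body in bodies if not body or body[0] != head]
    if not recursive:
        return {head: bodies}
    fresh = head + "'"  # assumed to be unused elsewhere in the grammar
    return {
        head: [body + [fresh] for body in others],
        fresh: [tail + [fresh] for tail in recursive] + [[]],
    }


result = eliminate_left_recursion("E", [["E", '"+"', "T"], ["T"]])
```

Applied to `E = E "+" T` and `E = T`, this yields `E = T E'`, `E' = "+" T E'` and an empty alternative for `E'`, as in the transformed grammar on the slide. The transformation preserves the set of sentences but not the structure of the trees.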
Left Factoring
!38
grammar
productions
S.If = if E then S
S.IfE = if E then S else S
S = other
grammar
productions
A = a b1
A = a b2
A = c
grammar
productions
A = a A'
A' = b1
A' = b2
A = c
grammar
sorts S E
productions
S.If = if E then S S'
S'.Else = else S
S'.NoElse = // empty
S = other
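Left factoring can be sketched along the same lines. The code below is a single-pass illustration under stated assumptions: alternatives are grouped by their first symbol only, the longest common prefix of each group is factored out once, and the helper names `common_prefix` and `left_factor` as well as the primed fresh names are made up for this example.

```python
def common_prefix(bodies):
    """Longest sequence of symbols shared at the start of all bodies."""
    prefix = []
    for symbols in zip(*bodies):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix


def left_factor(head, bodies):
    """Factor A = a b1 | a b2 | c into A = a A' | c and A' = b1 | b2."""
    groups = {}
    for body in bodies:
        groups.setdefault(body[0] if body else None, []).append(body)
    result = {head: []}
    fresh = head
    for first, group in groups.items():
        if first is None or len(group) == 1:
            result[head].extend(group)      # nothing to factor
            continue
        prefix = common_prefix(group)
        fresh += "'"                         # assumed to be an unused name
        result[head].append(prefix + [fresh])
        result[fresh] = [body[len(prefix):] for body in group]
    return result


factored = left_factor("S", [
    ["if", "E", "then", "S"],
    ["if", "E", "then", "S", "else", "S"],
    ["other"],
])
```

For the dangling-else productions this gives `S = if E then S S'` with `S'` either empty or `else S`, matching the factored grammar above.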
Properties of Grammar Transformations
!39
Preservation
- Preserves set of sentences

- Preserves set of trees

- Preserves tree structure

Systematic
- Algorithmic

- Heuristic
Top-Down Parsing
40
Top-Down Parse
!41
grammar
sorts E T E' F T'
productions
E = T E'
E' = "+" T E'
E' =
T = F T'
T' = "*" F T'
T' =
F = "(" E ")"
F = ID
tree derivation // top-down parse
E
=> E[T E']
=> E[T[F T'] E']
=> E[T[F[ID] T'] E']
=> E[T[F[ID] T'[]] E']
=> E[T[F[ID] T'[]] E'["+" T E']]
=> E[T[F[ID] T'[]] E'["+" T[F T'] E']]
=> E[T[F[ID] T'[]] E'["+" T[F[ID] T'] E']]
=> E[T[F[ID] T'[]] E'["+" T[F[ID] T'["*" F T']] E']]
=> E[T[F[ID] T'[]] E'["+" T[F[ID] T'["*" F[ID] T']] E']]
=> E[T[F[ID] T'[]] E'["+" T[F[ID] T'["*" F[ID] T'[]]] E']]
=> E[T[F[ID] T'[]] E'["+" T[F[ID] T'["*" F[ID] T'[]]] E'[]]]
derivation
E =>* ID "+" ID "*" ID
Top-Down Parse
!42
grammar
sorts E T E' F T'
productions
E = T E'
E' = "+" T E'
E' =
T = F T'
T' = "*" F T'
T' =
F = "(" E ")"
F = ID
Non-Deterministic Recursive Descent Parsing
!43
LL Parsing
!44
Use LL(1) grammar
- Not left recursive

- Left factored

Top-down back-track parsing
- Predict symbol

- If terminal: does it correspond to the next input symbol?

- If non-terminal: try its productions in turn

Predictive parsing
- Predict symbol to parse

- Use lookahead to deterministically choose a production for the non-terminal

Variants
- Parser combinators, PEG, packrat, …
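Predictive parsing for the left-factored, non-left-recursive expression grammar (`E = T E'`, and so on) can be sketched as a recursive descent parser with one method per non-terminal and a single token of lookahead. The class below is an illustration, not the course's implementation: the names `Parser` and `ParseError`, the tuple encoding of parse trees, and the unquoted token strings are assumptions.

```python
class ParseError(Exception):
    pass


class Parser:
    """Predictive recursive descent parser; one method per non-terminal."""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # "$" marks the end of the input
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def expect(self, terminal):
        if self.peek() != terminal:
            raise ParseError(f"expected {terminal}, found {self.peek()}")
        self.pos += 1

    def parse(self):
        tree = self.E()
        self.expect("$")
        return tree

    def E(self):                      # E = T E'
        return ("E", self.T(), self.E_prime())

    def E_prime(self):
        if self.peek() == "+":        # E' = "+" T E'
            self.expect("+")
            return ("E'", "+", self.T(), self.E_prime())
        return ("E'",)                # E' = (empty)

    def T(self):                      # T = F T'
        return ("T", self.F(), self.T_prime())

    def T_prime(self):
        if self.peek() == "*":        # T' = "*" F T'
            self.expect("*")
            return ("T'", "*", self.F(), self.T_prime())
        return ("T'",)                # T' = (empty)

    def F(self):
        if self.peek() == "(":        # F = "(" E ")"
            self.expect("(")
            tree = self.E()
            self.expect(")")
            return ("F", "(", tree, ")")
        self.expect("ID")             # F = ID
        return ("F", "ID")


tree = Parser(["ID", "+", "ID", "*", "ID"]).parse()
```

The resulting tuple mirrors the last line of the top-down tree derivation for `ID "+" ID "*" ID`. The lookahead test in `E_prime` and `T_prime` is what makes the choice of production deterministic, so no backtracking is needed.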
Reducing Sentences to Symbols
45
Meta: Reductions
!46
context-free syntax
Reduction.Reduction = <
reduction
<Symbol*> <RStep*>
>
RStep.Step = [<= [Symbol*]]
RStep.Steps = [<=* [Symbol*]]
RStep.Steps1 = [<=+ [Symbol*]]
context-free syntax
Reduction.TreeReduction = <
tree reduction
<PT*> <PRStep*>
>
PRStep.Step = [<= [PT*]]
PRStep.Steps = [<=* [PT*]]
PRStep.Steps1 = [<=+ [PT*]]
context-free syntax
Reduction.TermReduction = <
term reduction
<Term*> <TRStep*>
>
TRStep.Step = [<= [Term*]]
TRStep.Steps = [<=* [Term*]]
TRStep.Steps1 = [<=+ [Term*]]
A Reduction is an Inverse Derivation
!47
grammar
sorts A
productions
A = b
reduction
a b c <= a A c
Reducing to Symbols
!48
grammar
sorts E T F ID
productions
E.P = E "+" T
E.E = T
T.M = T "*" F
T.T = F
F.B = "(" E ")"
F.V = ID
reduction
ID "*" ID
<= F "*" ID
<= T "*" ID
<= T "*" F
<= T
<= E
Reducing to Parse Trees
!49
grammar
sorts E T F ID
productions
E.P = E "+" T
E.E = T
T.M = T "*" F
T.T = F
F.B = "(" E ")"
F.V = ID
reduction
ID "*" ID
<= F "*" ID
<= T "*" ID
<= T "*" F
<= T
<= E
tree reduction
ID "*" ID
<= F[ID] "*" ID
<= T[F[ID]] "*" ID
<= T[F[ID]] "*" F[ID]
<= T[T[F[ID]] "*" F[ID]]
<= E[T[T[F[ID]] "*" F[ID]]]
Reducing to Abstract Syntax Terms
!50
grammar
sorts E T F ID
productions
E.P = E "+" T
E.E = T
T.M = T "*" F
T.T = F
F.B = "(" E ")"
F.V = ID
reduction
ID "*" ID
<= F "*" ID
<= T "*" ID
<= T "*" F
<= T
<= E
tree reduction
ID "*" ID
<= F[ID] "*" ID
<= T[F[ID]] "*" ID
<= T[F[ID]] "*" F[ID]
<= T[T[F[ID]] "*" F[ID]]
<= E[T[T[F[ID]] "*" F[ID]]]
term reduction
ID "*" ID
<= V(ID) "*" ID
<= T(V(ID)) "*" ID
<= T(V(ID)) "*" V(ID)
<= M(T(V(ID)), V(ID))
<= E(M(T(V(ID)), V(ID)))
Handles
!51
grammar
productions
A = b
derivation // right-most derivation
S =>* a A w => a b w
// Handle
// sentential form: string of non-terminal and terminal symbols
// that can be derived from the start symbol
// S =>* a
// right sentential form: a sentential form derived by a right-most derivation
// handle: the part of a right sentential form that, if reduced,
// would produce the previous right sentential form in a right-most derivation
reduction
a b w <= a A w <=* S
Shift Reduce Parsing
52
Shift-Reduce Parsing Machine
!53
$ S | $ accept
// reduced to start symbol; accept
$ a | w $ error
// no action possible in this state
shift-reduce parse
$ stack | input $ action
$ a | l w $ shift
=> $ a l | w $
// shift input symbol on the stack
$ a b | w $ reduce by A = b
=> $ a A | w $
// reduce n symbols of the stack to symbol A
Shift-Reduce Parsing
!54
grammar
productions
E.P = E "+" T
E.E = T
T.M = T "*" F
T.T = F
F.B = "(" E ")"
F.V = ID
shift-reduce parse
$ | ID "*" ID $ shift
=> $ ID | "*" ID $ reduce by F = ID
=> $ F | "*" ID $ reduce by T = F
=> $ T | "*" ID $ shift
=> $ T "*" | ID $ shift
=> $ T "*" ID | $ reduce by F = ID
=> $ T "*" F | $ reduce by T = T "*" F
=> $ T | $ reduce by E = T
=> $ E | $ accept
Shift-reduce parsing constructs a right-most derivation in reverse
Shift-Reduce Conflicts
!55
grammar
sorts S E
productions
S.If = if E then S
S.IfE = if E then S else S
S = other
shift-reduce parse
$ | if E then S else S $ shift
=> $ if | E then S else S $ shift
=>* $ if E then S | else S $ shift
=> $ if E then S else | S $ shift
=> $ if E then S else S | $ reduce by S = if E then S else S
=> $ S | $ accept
shift-reduce parse
$ | if E then S else S $ shift
=> $ if | E then S else S $ shift
=>* $ if E then S | else S $ reduce by S = if E then S
=> $ S | else S $ error
Shift-Reduce Conflicts
!56
grammar
sorts S E
productions
S.If = if E then S
S.IfE = if E then S else S
S = other
shift-reduce parse
$ | if E then if E then S else S $ ...
=>* $ if E then if E then S | else S $ shift
=> $ if E then if E then S else | S $ shift
=> $ if E then if E then S else S | $ reduce by S = if E then S else S
=> $ if E then S | $ reduce by S = if E then S
=> $ S | $ accept
shift-reduce parse
$ | if E then if E then S else S $ ...
=>* $ if E then if E then S | else S $ reduce by S = if E then S
=> $ if E then S | else S $ shift
=> $ if E then S else | S $ shift
=> $ if E then S else S | $ reduce by S = if E then S else S
=> $ S | $ accept
Simple LR Parsing
57
How can we make shift-reduce parsing deterministic?
!58
grammar
productions
E.P = E "+" T
E.E = T
T.M = T "*" F
T.T = F
F.B = "(" E ")"
F.V = ID
shift-reduce parse
$ | ID "*" ID $ shift
=> $ ID | "*" ID $ reduce by F = ID
=> $ F | "*" ID $ reduce by T = F
=> $ T | "*" ID $ shift
=> $ T "*" | ID $ shift
=> $ T "*" ID | $ reduce by F = ID
=> $ T "*" F | $ reduce by T = T "*" F
=> $ T | $ reduce by E = T
=> $ E | $ accept
Is there a production in the grammar that matches the top of the stack?
How to choose between possible shift and reduce actions?
LR Parsing
!59
LR(k) Parsing
- L: left-to-right scanning

- R: constructing a right-most derivation in reverse

- k: the number of lookahead symbols

Motivation for LR
- LR grammars exist for many programming languages

- Most general non-backtracking shift-reduce parsing method

- Early detection of parse errors

- The class of LR grammars is a superset of the grammars handled by predictive/LL methods

SLR: Simple LR
Items
!60
[E = . E "+" T]
[E = E . "+" T]
[E = E "+" . T]
[E = E "+" T .]
E = E "+" T
Production Items
Item indicates how far we have progressed in parsing a production
We expect to see an expression
We have seen an expression and may see a "+"
Item Set
!61
{
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
Item set used to keep track where we are in a parse
grammar
productions
S = E
E = E "+" T
E = T
T = T "*" F
T = F
F = "(" E ")"
F = ID
Closure of an Item Set
!62
SetOfItems Closure(I) {
J := I;
repeat
for(each [A = a . B b] in J)
for(each [B = c] in G)
if([B = . c] is not in J)
J := Add([B = . c], J);
until Not(Changed(J));
return J;
}
{
[F = "(" . E ")"]
}
{
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
grammar
productions
S = E
E = E "+" T
E = T
T = T "*" F
T = F
F = "(" E ")"
F = ID
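The closure computation can be sketched in Python as follows. This is an illustration under assumptions: items are encoded as `(head, body, dot)` triples with `dot` an index into the body, terminals are written without quotes, and a worklist replaces the repeat-until loop of the pseudocode (the computed set is the same).

```python
# Grammar from the slide; the (head, body) tuple encoding is assumed for illustration.
GRAMMAR = [
    ("S", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("ID",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}


def closure(items):
    """Add [B = . c] for every item [A = a . B b] until nothing new appears."""
    result = set(items)
    worklist = list(items)
    while worklist:
        head, body, dot = worklist.pop()
        # Only items with the dot in front of a non-terminal contribute.
        if dot < len(body) and body[dot] in NONTERMINALS:
            for prod_head, prod_body in GRAMMAR:
                item = (prod_head, prod_body, 0)
                if prod_head == body[dot] and item not in result:
                    result.add(item)
                    worklist.append(item)
    return result


kernel = {("F", ("(", "E", ")"), 1)}   # [F = "(" . E ")"]
state = closure(kernel)
```

Starting from the single item `[F = "(" . E ")"]`, the result contains the seven items of the item set shown on the slide.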
Goto
!63
SetOfItems Goto(I, X) {
J := {};
for(each [A = a . X b] in I)
J := Add([A = a X . b], J);
return Closure(J);
}
{
[F = "(" . E ")"]
}
{
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
grammar
productions
S = E
E = E "+" T
E = T
T = T "*" F
T = F
F = "(" E ")"
F = ID
{
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
Computing LR(0) Automaton
!64
Items(G) {
C := {Closure({[Sp = . S]})};
repeat
for(each I in C)
for(each grammar symbol X in G) {
J := Goto(I, X);
if(NotEmpty(J) and Not(In(J, C)))
C := Add(J, C);
}
until Not(Changed(C));
return C;
}
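Putting Closure, Goto and Items together, the construction of the LR(0) automaton might look like the sketch below. It is an illustration under assumptions: items are `(head, body, dot)` triples, terminals are unquoted, the slide's production `S = E` plays the role of the augmented start production, and states are numbered in discovery order, which need not match the numbering on the slides.

```python
GRAMMAR = [
    ("S", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("ID",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}
SYMBOLS = NONTERMINALS | {sym for _, body in GRAMMAR for sym in body}


def closure(items):
    result = set(items)
    changed = True
    while changed:
        changed = False
        for head, body, dot in list(result):
            if dot < len(body) and body[dot] in NONTERMINALS:
                for h, b in GRAMMAR:
                    if h == body[dot] and (h, b, 0) not in result:
                        result.add((h, b, 0))
                        changed = True
    return frozenset(result)


def goto(state, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(h, b, d + 1) for h, b, d in state if d < len(b) and b[d] == symbol}
    return closure(moved)


def items(start_item):
    """Return all item sets reachable from the start item, plus the transitions."""
    states = [closure({start_item})]
    transitions = {}
    for state in states:                 # the list grows while it is traversed
        for symbol in sorted(SYMBOLS):
            target = goto(state, symbol)
            if not target:
                continue
            if target not in states:
                states.append(target)
            transitions[(states.index(state), symbol)] = states.index(target)
    return states, transitions


states, transitions = items(("S", ("E",), 0))
```

For this grammar the construction yields 12 item sets, corresponding to states 0 through 11 of the automaton built up step by step on the following slides.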
Computing LR(0) Automaton
!65
grammar
productions
S = E
E = E "+" T
E = T
T = T "*" F
T = F
F = "(" E ")"
F = ID
state 0 {
[S = . E]
}
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
!67
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
!68
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
!69
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
state 3 {
[T = F . ]
}
reduce T = F
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
!70
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
shift "(" to 4
state 4 {
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
state 3 {
[T = F . ]
}
reduce T = F
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
!71
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
shift "(" to 4
state 4 {
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift T to 2
shift F to 3
shift "(" to 4
state 3 {
[T = F . ]
}
reduce T = F
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
!72
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
shift "+" to 6
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 4 {
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 5 {
[F = ID .]
}
reduce F = ID
state 3 {
[T = F . ]
}
reduce T = F
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
shift "*" to 7
!73
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
shift "+" to 6
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 4 {
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 5 {
[F = ID .]
}
reduce F = ID
state 6 {
[E = E "+" . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
state 3 {
[T = F . ]
}
reduce T = F
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
shift "*" to 7
!74
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
shift "+" to 6
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 4 {
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 5 {
[F = ID .]
}
reduce F = ID
state 6 {
[E = E "+" . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift F to 3
shift "(" to 4
shift ID to 5
state 3 {
[T = F . ]
}
reduce T = F
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
shift "*" to 7
!75
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
shift "+" to 6
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 4 {
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 5 {
[F = ID .]
}
reduce F = ID
state 6 {
[E = E "+" . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift F to 3
shift "(" to 4
shift ID to 5
state 7 {
[T = T "*" . F]
[F = . "(" E ")"]
[F = . ID]
}
shift "(" to 4
shift ID to 5
state 3 {
[T = F . ]
}
reduce T = F
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
shift "*" to 7
!76
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
shift "+" to 6
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 4 {
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 8
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 5 {
[F = ID .]
}
reduce F = ID
state 6 {
[E = E "+" . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift F to 3
shift "(" to 4
shift ID to 5
state 7 {
[T = T "*" . F]
[F = . "(" E ")"]
[F = . ID]
}
shift "(" to 4
shift ID to 5
state 8 {
[F = "(" E . ")"]
[E = E . "+" T]
}
shift "+" to 6
state 3 {
[T = F . ]
}
reduce T = F
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
shift "*" to 7
!77
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
shift "+" to 6
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 4 {
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 8
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 5 {
[F = ID .]
}
reduce F = ID
state 6 {
[E = E "+" . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift T to 9
shift F to 3
shift "(" to 4
shift ID to 5
state 7 {
[T = T "*" . F]
[F = . "(" E ")"]
[F = . ID]
}
shift "(" to 4
shift ID to 5
state 8 {
[F = "(" E . ")"]
[E = E . "+" T]
}
shift "+" to 6
state 9 {
[E = E "+" T .]
[T = T . "*" F]
}
reduce E = E "+" T
shift "*" to 7
state 3 {
[T = F . ]
}
reduce T = F
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
shift "*" to 7
!78
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
shift "+" to 6
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 4 {
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 8
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 5 {
[F = ID .]
}
reduce F = ID
state 6 {
[E = E "+" . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift T to 9
shift F to 3
shift "(" to 4
shift ID to 5
state 7 {
[T = T "*" . F]
[F = . "(" E ")"]
[F = . ID]
}
shift F to 10
shift "(" to 4
shift ID to 5
state 8 {
[F = "(" E . ")"]
[E = E . "+" T]
}
shift "+" to 6
shift ")" to 11
state 9 {
[E = E "+" T .]
[T = T . "*" F]
}
reduce E = E "+" T
shift "*" to 7
state 10 {
[T = T "*" F .]
}
reduce T = T "*" F
state 3 {
[T = F . ]
}
reduce T = F
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
shift "*" to 7
!79
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
shift "+" to 6
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 4 {
[F = "(" . E ")"]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 8
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 5 {
[F = ID .]
}
reduce F = ID
state 6 {
[E = E "+" . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift T to 9
shift F to 3
shift "(" to 4
shift ID to 5
state 7 {
[T = T "*" . F]
[F = . "(" E ")"]
[F = . ID]
}
shift F to 10
shift "(" to 4
shift ID to 5
state 8 {
[F = "(" E . ")"]
[E = E . "+" T]
}
shift "+" to 6
shift ")" to 11
state 9 {
[E = E "+" T .]
[T = T . "*" F]
}
reduce E = E "+" T
shift "*" to 7
state 10 {
[T = T "*" F .]
}
reduce T = T "*" F
state 11 {
[F = "(" E ")" .]
}
reduce F = "(" E ")"
state 3 {
[T = F . ]
}
reduce T = F
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
shift "*" to 7
SLR Parse
!80
$ | 0 | ID "*" ID $ shift ID to 5
=> $ ID | 0 5 | "*" ID $ reduce F = ID
=> $ | 0 | F "*" ID $ shift F to 3
=> $ F | 0 3 | "*" ID $ reduce T = F
=> $ | 0 | T "*" ID $ shift T to 2
=> $ T | 0 2 | "*" ID $ shift "*" to 7
=> $ T "*" | 0 2 7 | ID $ shift ID to 5
=> $ T "*" ID | 0 2 7 5 | $ reduce F = ID
=> $ T "*" | 0 2 7 | F $ shift F to 10
=> $ T "*" F | 0 2 7 10 | $ reduce T = T "*" F
=> $ | 0 | T $ shift T to 2
=> $ T | 0 2 | $ reduce E = T
=> $ | 0 | E $ shift E to 1
=> $ E | 0 1 | $ accept
grammar
productions
S = E
E = E "+" T
E = T
T = T "*" F
T = F
F = "(" E ")"
F = ID
!81
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 3 {
[T = F . ]
}
reduce T = F
state 5 {
[F = ID .]
}
reduce F = ID
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
shift "*" to 7
state 7 {
[T = T "*" . F]
[F = . "(" E ")"]
[F = . ID]
}
shift F to 10
shift "(" to 4
shift ID to 5
state 10 {
[T = T "*" F .]
}
reduce T = T "*" F
state 2 {
[E = T . ]
[T = T . "*" F]
}
reduce E = T
shift "*" to 7
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
shift "+" to 6
state 5 {
[F = ID .]
}
reduce F = ID
$ | 0 | ID "*" ID $ shift ID to 5
=> $ ID | 0 5 | "*" ID $ reduce F = ID
=> $ | 0 | F "*" ID $ shift F to 3
=> $ F | 0 3 | "*" ID $ reduce T = F
=> $ | 0 | T "*" ID $ shift T to 2
=> $ T | 0 2 | "*" ID $ shift "*" to 7
=> $ T "*" | 0 2 7 | ID $ shift ID to 5
=> $ T "*" ID | 0 2 7 5 | $ reduce F = ID
=> $ T "*" | 0 2 7 | F $ shift F to 10
=> $ T "*" F | 0 2 7 10 | $ reduce T = T "*" F
=> $ | 0 | T $ shift T to 2
=> $ T | 0 2 | $ reduce E = T
=> $ | 0 | E $ shift E to 1
=> $ E | 0 1 | $ accept
!82
state 0 {
[S = . E]
[E = . E "+" T]
[E = . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift E to 1
shift T to 2
shift F to 3
shift "(" to 4
shift ID to 5
state 3 {
[T = F . ]
}
reduce T = F
state 5 {
[F = ID .]
}
reduce F = ID
state 7 {
[T = T "*" . F]
[F = . "(" E ")"]
[F = . ID]
}
shift F to 10
shift "(" to 4
shift ID to 5
state 10 {
[T = T "*" F .]
}
reduce T = T "*" F
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
shift "+" to 6
state 5 {
[F = ID .]
}
reduce F = ID
Parsing: ID "+" ID "*" ID .
state 1 {
[S = E . ]
[E = E . "+" T]
}
accept
shift "+" to 6
state 6 {
[E = E "+" . T]
[T = . T "*" F]
[T = . F]
[F = . "(" E ")"]
[F = . ID]
}
shift T to 9
shift F to 3
shift "(" to 4
shift ID to 5
state 9 {
[E = E "+" T .]
[T = T . "*" F]
}
reduce E = E "+" T
shift "*" to 7
state 9 {
[E = E "+" T .]
[T = T . "*" F]
}
reduce E = E "+" T
shift "*" to 7
LR(0) Parse Table
!83
See book
Solving Shift/Reduce Conflicts
84
Solving Shift/Reduce Conflicts
!85
First and Follow: see book
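As a rough sketch of how First and Follow sets are computed and used, consider the expression grammar of the LR(0) automaton. The code is a simplification under a loudly stated assumption: no production in this grammar has an empty body, so FIRST of a body is just FIRST of its first symbol. The general algorithm in the book also handles empty productions. The function names and the unquoted terminals are assumptions for the example.

```python
GRAMMAR = [
    ("S", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("ID",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}


def first_sets():
    """FIRST sets, assuming that no production body is empty."""
    first = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            sym = body[0]
            new = first[sym] if sym in NONTERMINALS else {sym}
            if not new <= first[head]:
                first[head] |= new
                changed = True
    return first


def follow_sets(first, start="S"):
    """FOLLOW sets, under the same no-empty-productions assumption."""
    follow = {n: set() for n in NONTERMINALS}
    follow[start].add("$")                  # end-of-input follows the start symbol
    changed = True
    while changed:
        changed = False
        for head, body in GRAMMAR:
            for i, sym in enumerate(body):
                if sym not in NONTERMINALS:
                    continue
                if i + 1 < len(body):       # something follows sym in the body
                    nxt = body[i + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:                       # sym ends the body: inherit FOLLOW(head)
                    new = follow[head]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


first = first_sets()
follow = follow_sets(first)
```

In the state with items `[E = T .]` and `[T = T . "*" F]`, SLR reduces by `E = T` only when the next input symbol is in FOLLOW(E). Since `*` is not in FOLLOW(E), the parser shifts on `*` and the shift/reduce conflict disappears.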
Parsing: Summary
86
Parsing
!87
Context-free grammars
- Productions define how to generate sentences of language

- Derivation: generate sentence from (start) symbol

- Reduction: reduce sentence to (start) symbol

Parse tree
- Represents structure of derivation

- Abstracts from derivation order

Parser
- Algorithm to reconstruct derivation
More Topics in Syntax and Parsing
!88
First/Follow
- Selecting between actions in LR parse table

Other algorithms
- Top-down: LL(k) table

- Generalized parsing: Earley, Generalized-LR

- Scannerless parsing: characters as tokens

Disambiguation
- Semantics of declarative disambiguation

- Deep priority conflicts
Next: Transformation
89
Except where otherwise noted, this work is licensed under

Compiler Construction | Lecture 4 | Parsing

  • 1. Lecture 4: Parsing CS4200 Compiler Construction Eelco Visser TU Delft September 2018
  • 2. This Lecture !2 Java Type Check JVM bytecode Parse CodeGenOptimize Turning syntax definitions into parsers
  • 4. The perspective of this lecture on declarative syntax definition is Elained more elaborately in this Onward! 2010 essay. It uses an on older version of SDF than used in these slides. Production rules have the form X1 … Xn -> N {cons(“C”)} instead of N.C = X1 … Xn http://swerl.tudelft.nl/twiki/pub/Main/TechnicalReports/TUD-SERG-2010-019.pdf https://doi.org/10.1145/1932682.1869535
  • 5. !5 Compilers: Principles, Techniques, and Tools, 2nd Edition Alfred V. Aho, Columbia University Monica S. Lam, Stanford University Ravi Sethi, Avaya Labs Jeffrey D. Ullman, Stanford University 2007 | Pearson Classical compiler textbook Chapter 4: Syntax Analysis Read Sections 4.1, 4.2, 4.3, 4.5, 4.6 Pictures in these slides are copies from the book
  • 6. !6 Sikkel, N. (1993). Parsing Schemata. PhD thesis. Enschede: Universiteit Twente This PhD thesis presents a uniform framework for describing a wide range of parsing algorithms. https://research.utwente.nl/en/publications/parsing-schemata “Parsing schemata provide a general framework for specication, analysis and comparison of (sequential and/or parallel) parsing algorithms. A grammar specifies implicitly what the valid parses of a sentence are; a parsing algorithm specifies Elicitly how to compute these. Parsing schemata form a well- defined level of abstraction in between grammars and parsing algorithms. A parsing schema specifies the types of intermediate results that can be computed by a parser, and the rules that allow to Eand a given set of such results with new results. A parsing schema does not specify the data structures, control structures, and (in case of parallel processing) communication structures that are to be used by a parser.” For the interested
  • 7. !7 https://ivi.fnwi.uva.nl/tcs/pub/reports/1995/P9507.ps.Z This paper applies parsing schemata to disambiguation filters for priority conflicts. For the interested
  • 9. Traditional Parser Architecture !9 Source: Compilers Principles, Techniques & Tools
  • 11. Terminals - Basic symbols from which strings are formed Nonterminals - Syntactic variables that denote sets of strings Start Symbol - Denotes the nonterminal that generates strings of the languages Productions - A = X … X - Head/left side (A) is a nonterminal - Body/right side (X … X) zero or more terminals and nonterminals !11 Context-Free Grammars
  • 12. Example Context-Free Grammar !12 grammar start S non-terminals E T F terminals "+" "*" "(" ")" ID productions S = E E = E "+" T E = T T = T "*" F T = F F = "(" E ")" F = ID
  • 13. Abbreviated Grammar !13 grammar productions S = E E = E "+" T E = T T = T "*" F T = F F = "(" E ")" F = ID Nonterminals, terminals can be derived from productions First production defines start symbol grammar start S non-terminals E T F terminals "+" "*" "(" ")" ID productions S = E E = E "+" T E = T T = T "*" F T = F F = "(" E ")" F = ID
  • 14. Notation !14 A, B, C: non-terminals l: terminals a, b, c: strings of non-terminals and terminals (alpha, beta, gamma in math) w, v: strings of terminal symbols
  • 15. Meta: Syntax of Grammars !15 context-free syntax Production.Prod = < <Symbol><Constructor?> = <Symbol*> > Symbol.NT = <<ID>> Symbol.T = <<STRING>> Symbol.L = <<LCID>> Constructor.Con = <.<ID>> context-free syntax // grammars Grammar.Grammar = < grammar <Start?> <Sorts?> <Terminals?> <Productions> > context-free syntax Start.Start = < start <ID> > Sorts.Sorts = < sorts <ID*> > Sorts.NonTerminals = < non-terminals <ID*> > Terminals.Terminals = < terminals <Symbol*> > Productions.Productions = < productions <Production*> >
  • 17. Derivations !17 grammar productions E = E "+" E E = E "*" E E = "-" E E = "(" E ")" E = ID // derivation step: replace symbol by rhs of production // E = E "+" E // replace E by E "+" E // // derivation: // repeatedly apply derivations derivation E => "-" E => "-" "(" E ")" => "-" "(" ID ")" derivation // derives in zero or more steps E =>* "-" "(" ID "+" ID ")"
  • 18. Meta: Syntax of Derivations !18 context-free syntax // derivations Derivation.Derivation = < derivation <Symbol> <Step*> > Step.Step = [=> [Symbol*]] Step.Steps = [=>* [Symbol*]] Step.Steps1 = [=>+ [Symbol*]]
  • 19. Left-Most Derivation !19 grammar productions E = E "+" E E = E "*" E E = "-" E E = "(" E ")" E = ID derivation // left-most derivation E => "-" E => "-" "(" E ")" => "-" "(" E "+" E ")" => "-" "(" ID "+" E ")" => "-" "(" ID "+" ID ")" Left-most derivation: Expand left-most non-terminal at each step
  • 20. Right-Most Derivation !20 grammar productions E = E "+" E E = E "*" E E = "-" E E = "(" E ")" E = ID derivation // left-most derivation E => "-" E => "-" "(" E ")" => "-" "(" E "+" E ")" => "-" "(" ID "+" E ")" => "-" "(" ID "+" ID ")" Right-most derivation: Expand right-most non-terminal at each step derivation // right-most derivation E => "-" E => "-" "(" E ")" => "-" "(" E "+" E ")" => "-" "(" E "+" ID ")" => "-" "(" ID "+" ID ")"
  • 21. Meta: Tree Derivations !21 context-free syntax // tree derivations Derivation.TreeDerivation = < tree derivation <Symbol> <PStep*> > PStep.Step = [=> [PT*]] PStep.Steps = [=>* [PT*]] PStep.Steps1 = [=>+ [PT*]] PT.App = <<Symbol>[<PT*>]> PT.Str = <<STRING>> PT.Sym = <<Symbol>>
  • 22. Left-Most Tree Derivation !22 grammar productions E.A = E "+" E E.T = E "*" E E.N = "-" E E.P = "(" E ")" E.V = ID derivation // left-most derivation E => "-" E => "-" "(" E ")" => "-" "(" E "+" E ")" => "-" "(" ID "+" E ")" => "-" "(" ID "+" ID ")" tree derivation // left-most E => E["-" E] => E["-" E["(" E ")"]] => E["-" E["(" E[E "+" E] ")"]] => E["-" E["(" E[E[ID] "+" E] ")"]] => E["-" E["(" E[E[ID] "+" E[ID]] ")"]]
  • 23. Left-Most Tree Derivation !23 tree derivation // left-most E => E["-" E] => E["-" E["(" E ")"]] => E["-" E["(" E[E "+" E] ")"]] => E["-" E["(" E[E[ID] "+" E] ")"]] => E["-" E["(" E[E[ID] "+" E[ID]] ")"]]
  • 24. Ambiguity: Deriving Multiple Parse Trees !24 grammar productions E.A = E "+" E E.T = E "*" E E.N = "-" E E.P = "(" E ")" E.V = ID derivation E =>* ID "+" ID "*" ID Ambiguous grammar: produces >1 parse tree for a sentence derivation E => E "+" E => ID "+" E => ID "+" E "*" E => ID "+" ID "*" E => ID "+" ID "*" ID tree derivation E =>* E[E[ID] "+" E[E[ID] "*" E[ID]]] derivation E => E "*" E => E "+" E "*" E => ID "+" E "*" E => ID "+" ID "*" E => ID "+" ID "*" ID tree derivation E =>* E[E[E[ID] "+" E[ID]] "*" E[ID]]
  • 25. Meta: Term Derivations !25 context-free syntax // term derivations Derivation.TermDerivation = < term derivation <Symbol> <TStep*> > TStep.Step = [=> [Term*]] TStep.Steps = [=>* [Term*]] TStep.Steps1 = [=>+ [Term*]] Term.App = <<ID>(<{Term ","}*>)> Term.Str = <<STRING>> Term.Sym = <<Symbol>>
  • 26. Ambiguity: Deriving Abstract Syntax Terms !26 grammar productions E.A = E "+" E E.T = E "*" E E.N = "-" E E.P = "(" E ")" E.V = ID derivation E =>* ID "+" ID "*" ID derivation E => E "+" E => ID "+" E => ID "+" E "*" E => ID "+" ID "*" E => ID "+" ID "*" ID derivation E => E "*" E => E "+" E "*" E => ID "+" E "*" E => ID "+" ID "*" E => ID "+" ID "*" ID term derivation E => A(E, E) => A(V(ID), E) => A(V(ID), T(E, E)) => A(V(ID), T(V(ID), E)) => A(V(ID), T(V(ID), V(ID))) term derivation E => T(E, E) => T(A(E, E), E) => T(A(V(ID), E), E) => T(A(V(ID), V(ID)), E) => T(A(V(ID), V(ID)), V(ID))
  • 28. Why? - Disambiguation - For use by a particular parsing algorithm Transformations - Eliminating ambiguities - Eliminating left recursion - Left factoring Properties - Does transformation preserve the language (set of strings, trees)? - Does transformation preserve the structure of trees? !28 Grammar Transformations
  • 29. Ambiguous Expression Grammar !29 derivation E =>* ID "*" ID "+" ID term derivation E => A(E, E) => A(T(E, E), E) => A(T(E, E), E) => A(T(V(ID), E), E) => A(T(V(ID), V(ID)), E) => A(T(V(ID), V(ID)), V(ID)) term derivation E => T(E, E) => T(E, E) => T(V(ID), E) => T(V(ID), A(E, E)) => T(V(ID), A(V(ID), E)) => T(V(ID), A(V(ID), V(ID))) grammar productions E.A = E "+" E E.T = E "*" E E.M = "-" E E.B = "(" E ")" E.V = ID
  • 30. Associativity and Priority Filter Ambiguities !30 grammar productions E.A = E "+" E E.T = E "*" E E.M = "-" E E.B = "(" E ")" E.V = ID derivation E =>* ID "*" ID "+" ID term derivation E => A(E, E) => A(T(E, E), E) => A(T(E, E), E) => A(T(V(ID), E), E) => A(T(V(ID), V(ID)), E) => A(T(V(ID), V(ID)), V(ID)) term derivation E => T(E, E) => T(E, E) => T(V(ID), E) => T(V(ID), A(E, E)) => T(V(ID), A(V(ID), E)) => T(V(ID), A(V(ID), V(ID))) grammar productions E.A = E "+" E {left} E.T = E "*" E {left} E.M = "-" E E.B = "(" E ")" E.V = ID priorities E.M > E. T > E.A
  • 31. Define Associativity and Priority by Transformation !31 grammar productions E.A = E "+" T E = T T.T = T "*" F T = F F.V = ID F.B = "(" E ")" derivation E =>* ID "*" ID "+" ID term derivation E => A(E, E) => A(T(E, E), E) => A(T(E, E), E) => A(T(V(ID), E), E) => A(T(V(ID), V(ID)), E) => A(T(V(ID), V(ID)), V(ID)) term derivation E => T(E, E) => T(E, E) => T(V(ID), E) => T(V(ID), A(E, E)) => T(V(ID), A(V(ID), E)) => T(V(ID), A(V(ID), V(ID))) grammar productions E.A = E "+" E {left} E.T = E "*" E {left} E.M = "-" E E.B = "(" E ")" E.V = ID priorities E.M > E. T > E.A
  • 32. Define Associativity and Priority by Transformation !32 grammar productions E.A = E "+" T E = T T.T = T "*" F T = F F.V = ID F.B = "(" E ")" grammar productions E.A = E "+" E {left} E.T = E "*" E {left} E.M = "-" E E.B = "(" E ")" E.V = ID priorities E.M > E. T > E.A Define new non-terminal for each priority level: E, T, F Add ‘injection’ productions to include priority level n+1 in n: E = T T = F Transform productions Left: E = E “+” T Right: E = T “+” E Change head of production to reflect priority level T = T “*” F
  • 33. Dangling Else Grammar !33 grammar sorts S E productions S.If = if E then S S.IfE = if E then S else S S = other derivation S =>* if E1 then S1 else if E2 then S2 else S3 term derivation S =>* IfE(E1, S1, IfE(E2, S2, S3)) term derivation S => IfE(E1, S1, S) => IfE(E1, S1, IfE(E2, S2, S3))
  • 34. Dangling Else Grammar is Ambiguous !34 grammar sorts S E productions S.If = if E then S S.IfE = if E then S else S S = other term derivation S => If(E1, S) => If(E1, IfE(E2, S1, S2)) derivation S => if E1 then S => if E1 then if E2 then S1 else S2 derivation S =>* if E1 then if E2 then S1 else S2 term derivation S => IfE(E1, S, S2) => IfE(E1, If(E2, S1), S2) derivation S => if E1 then S else S2 => if E1 then if E2 then S1 else S2
  • 35. Eliminating Dangling Else Ambiguity !35 grammar sorts S E productions S.If = if E then S S.IfE = if E then S else S S = other grammar productions S = MS S = OS MS = if E then MS else MS MS = other OS = if E then S OS = if E then MS else OS Generalization of this transformation: contextual grammars
  • 36. !36 This paper defines a declarative semantics for associativity and priority declarations for disambiguation. The paper provides a safe semantics and extends it to deep priority conflicts. The result of disambiguation is a contextual grammar, which generalises the disambiguation for the dangling-else grammar. The paper is still in production. Ask us for a copy of the draft.
  • 37. Eliminating Left Recursion !37 grammar productions E = E "+" T E = T T = T "*" F T = F F = "(" E ")" F = ID grammar productions A = A a A = b grammar productions A = b A' A' = a A' A' = // empty grammar productions E = T E' E' = "+" T E' E' = T = F T' T' = "*" F T' T' = F = "(" E ")" F = ID // b followed by a list of as
  • 38. Left Factoring !38 grammar productions S.If = if E then S S.IfE = if E then S else S S = other grammar productions A = a b1 A = a b2 A = c grammar productions A = a A' A' = b1 A' = b2 A = c grammar sorts S E productions S.If = if E then S S' S'.Else = else S S'.NoElse = // empty S = other
  • 39. Preservation - Preserves set of sentences - Preserves set of trees - Preserves tree structure Systematic - Algorithmic - Heuristic !39 Properties of Grammar Transformations
  • 41. Top-Down Parse !41 grammar sorts E T E' F T' productions E = T E' E' = "+" T E' E' = T = F T' T' = "*" F T' T' = F = "(" E ")" F = ID tree derivation // top-down parse E => E[T E'] => E[T[F T'] E'] => E[T[F[ID] T'] E'] => E[T[F[ID] T'[]] E'] => E[T[F[ID] T'[]] E'["+" T E']] => E[T[F[ID] T'[]] E'["+" T[F T'] E']] => E[T[F[ID] T'[]] E'["+" T[F[ID] T'] E']] => E[T[F[ID] T'[]] E'["+" T[F[ID] T'["*" F T']] E']] => E[T[F[ID] T'[]] E'["+" T[F[ID] T'["*" F[ID] T']] E']] => E[T[F[ID] T'[]] E'["+" T[F[ID] T'["*" F[ID] T'[]]] E']] => E[T[F[ID] T'[]] E'["+" T[F[ID] T'["*" F[ID] T'[]]] E'[]]] derivation E =>* ID "+" ID "*" ID
  • 42. Top-Down Parse !42 grammar sorts E T E' F T' productions E = T E' E' = "+" T E' E' = T = F T' T' = "*" F T' T' = F = "(" E ")" F = ID
  • 44. Use LL(1) grammar - Not left recursive - Left factored Top-down back-track parsing - Predict symbol - If terminal: corresponds to next input symbol? - Try productions for non-terminal in turn Predictive parsing - Predict symbol to parse - Use lookahead to deterministically chose production for non-terminal Variants - Parser combinators, PEG, packrat, … !44 LL Parsing
  • 46. Meta: Reductions !46 context-free syntax Reduction.Reduction = < reduction <Symbol*> <RStep*> > RStep.Step = [<= [Symbol*]] RStep.Steps = [<=* [Symbol*]] RStep.Steps1 = [<=+ [Symbol*]] context-free syntax Reduction.TreeReduction = < tree reduction <PT*> <PRStep*> > PRStep.Step = [<= [PT*]] PRStep.Steps = [<=* [PT*]] PRStep.Steps1 = [<=+ [PT*]] context-free syntax Reduction.TermReduction = < term reduction <Term*> <TRStep*> > TRStep.Step = [<= [Term*]] TRStep.Steps = [<=* [Term*]] TRStep.Steps1 = [<=+ [Term*]]
  • 47. A Reduction is an Inverse Derivation !47 grammar sorts A productions A = b reduction a b c <= a A c
  • 48. Reducing to Symbols !48 grammar sorts E T F ID productions E.P = E "+" T E.E = T T.M = T "*" F T.T = F F.B = "(" E ")" F.V = ID reduction ID "*" ID <= F "*" ID <= T "*" ID <= T "*" F <= T <= E
  • 49. Reducing to Parse Trees !49 grammar sorts E T F ID productions E.P = E "+" T E.E = T T.M = T "*" F T.T = F F.B = "(" E ")" F.V = ID reduction ID "*" ID <= F "*" ID <= T "*" ID <= T "*" F <= T <= E tree reduction ID "*" ID <= F[ID] "*" ID <= T[F[ID]] "*" ID <= T[F[ID]] "*" F[ID] <= T[T[F[ID]] "*" F[ID]] <= E[T[T[F[ID]] "*" F[ID]]]
  • 50. Reducing to Abstract Syntax Terms !50 grammar sorts E T F ID productions E.P = E "+" T E.E = T T.M = T "*" F T.T = F F.B = "(" E ")" F.V = ID reduction ID "*" ID <= F "*" ID <= T "*" ID <= T "*" F <= T <= E tree reduction ID "*" ID <= F[ID] "*" ID <= T[F[ID]] "*" ID <= T[F[ID]] "*" F[ID] <= T[T[F[ID]] "*" F[ID]] <= E[T[T[F[ID]] "*" F[ID]]] term reduction ID "*" ID <= V(ID) "*" ID <= T(V(ID)) "*" ID <= T(V(ID)) "*" V(ID) <= M(T(V(ID)), V(ID)) <= E(M(T(V(ID)), V(ID)))
  • 51. Handles !51 grammar productions A = b derivation // right-most derivation S =>* a A w => a b w // Handle // sentential form: string of non-terminal and terminal symbols // that can be derived from start symbol // S =>* a // right sentential form: a sentential form derived by a right-most derivation // handle: the part of a right sentential form that if reduced // would produced the previous right-sential form in a right-most derivation reduction a b w <= a A w <=* S
  • 53. Shift-Reduce Parsing Machine !53 $ S | $ accept // reduced to start symbol; accept $ a | w $ error // no action possible in this state shift-reduce parse $ stack | input $ action $ a | l w $ shift => $ a l | w $ // shift input symbol on the stack $ a b | w $ reduce by A = b => $ a A | w $ // reduce n symbols of the stack to symbol A
  • 54. Shift-Reduce Parsing !54 grammar productions E.P = E "+" T E.E = T T.M = T "*" F T.T = F F.B = "(" E ")" F.V = ID shift-reduce parse $ | ID "*" ID $ shift => $ ID | "*" ID $ reduce by F = ID => $ F | "*" ID $ reduce by T = F => $ T | "*" ID $ shift => $ T "*" | ID $ shift => $ T "*" ID | $ reduce by F = ID => $ T "*" F | $ reduce by T = T "*" F => $ T | $ reduce by E = T => $ E | $ accept Shift-reduce parsing constructs a right-most derivation
  • 55. Shift-Reduce Conflicts !55 grammar sorts S E productions S.If = if E then S S.IfE = if E then S else S S = other shift-reduce parse $ | if E then S else S $ shift => $ if | E then S else S $ shift =>* $ if E then S | else S $ shift => $ if E then S else | S $ shift => $ if E then S else S | $ reduce by S = if E then S else S => $ S | $ accept shift-reduce parse $ | if E then S else S $ shift => $ if | E then S else S $ shift =>* $ if E then S | else S $ reduce by S = if E then S => $ S | else S $ error
  • 56. Shift-Reduce Conflicts !56 grammar sorts S E productions S.If = if E then S S.IfE = if E then S else S S = other shift-reduce parse $ | if E then if E then S else S $ ... =>* $ if E then if E then S | else S $ shift => $ if E then if E then S else | S $ shift => $ if E then if E then S else S | $ reduce by S = if E then S else S => $ if E then S | $ reduce by S = if E then S => $ S | $ accept shift-reduce parse $ | if E then if E then S else S $ ... =>* $ if E then if E then S | else S $ reduce by S = if E then S => $ if E then S | else S $ shift => $ if E then S else | S $ shift => $ if E then S else S | $ reduce by S = if E then S else S => $ S
  • 58. How can we make shift-reduce parsing deterministic? !58 grammar productions E.P = E "+" T E.E = T T.M = T "*" F T.T = F F.B = "(" E ")" F.V = ID shift-reduce parse $ | ID "*" ID $ shift => $ ID | "*" ID $ reduce by F = ID => $ F | "*" ID $ reduce by T = F => $ T | "*" ID $ shift => $ T "*" | ID $ shift => $ T "*" ID | $ reduce by F = ID => $ T "*" F | $ reduce by T = T "*" F => $ T | $ reduce by E = T => $ E | $ accept Is there a production in the grammar that matches the top of the stack? How to chose between possible shift and reduce actions?
  • 59. LR(k) Parsing - L: left-to-right scanning - R: constructing a right-most derivation - k: the number of lookahead symbols Motivation for LR - LR grammars for many programming languages - Most general non-backtracking shift-reduce parsing method - Early detection of parse errors - Class of grammars superset of predictive/LL methods SLR: Simple LR !59 LR Parsing
  • 60. Items !60 [E = . E "+" T] [E = E . "+" T] [E = E "+" . T] [E = E "+" T .] E = E "+" T Production Items Item indicates how far we have progressed in parsing a production We expect to see an expression We have seen an expression and may see a “+”
  • 61. Item Set !61 { [F = "(" . E ")"] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } Item set used to keep track where we are in a parse grammar productions S = E E = E "+" T E = T T = T "*" F T = F F = "(" E ")" F = ID
  • 62. Closure of an Item Set !62 SetOfItems Closure(I) { J := I; repeat for(each [A = a . B b] in J) for(each [B = c] in G) if([B = . c] is not in J) J := Add([B = .c], J); until Not(Changed(J)); return J; } { [F = "(" . E ")"] } { [F = "(" . E ")"] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } grammar productions S = E E = E "+" T E = T T = T "*" F T = F F = "(" E ")" F = ID
  • 63. Goto !63 SetOfItems Goto(I, X) { J := {}; for(each [A = a . X b] in I) J := Add([A = a X . b], J); return Closure(J); } { [F = "(" . E ")"] } { [S = . E] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } grammar productions S = E E = E "+" T E = T T = T "*" F T = F F = "(" E ")" F = ID { [F = "(" . E ")"] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] }
  • 64. Computing LR(0) Automaton !64 Items(G) { C := {Closure({[Sp = . S]})}; repeat for(each I in C) for(each X in G) { if(NotEmpty(Goto(I, X)) and Not(In(J, C))) C := Add(Goto(I, X), C); } until Not(Changed(C)); }
  • 65. Computing LR(0) Automaton !65 grammar productions S = E E = E "+" T E = T T = T "*" F T = F F = "(" E ")" F = ID state 0 { [S = . E] } state 0 { [S = . E] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] }
  • 66. !66 state 0 { [S = . E] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] }
  • 67. !67 state 1 { [S = E . ] [E = E . "+" T] } accept state 0 { [S = . E] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift E to 1
  • 68. !68 state 1 { [S = E . ] [E = E . "+" T] } accept state 0 { [S = . E] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift E to 1 shift T to 2 state 2 { [E = T . ] [T = T . "*" F] } reduce E = T
  • 69. !69 state 1 { [S = E . ] [E = E . "+" T] } accept state 0 { [S = . E] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift E to 1 shift T to 2 shift F to 3 state 3 { [T = F . ] } reduce T = F state 2 { [E = T . ] [T = T . "*" F] } reduce E = T
  • 70. !70 state 1 { [S = E . ] [E = E . "+" T] } accept state 0 { [S = . E] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift E to 1 shift T to 2 shift F to 3 shift "(" to 4 state 4 { [F = "(" . E ")"] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } state 3 { [T = F . ] } reduce T = F state 2 { [E = T . ] [T = T . "*" F] } reduce E = T
  • 71. !71 state 1 { [S = E . ] [E = E . "+" T] } accept state 0 { [S = . E] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift E to 1 shift T to 2 shift F to 3 shift "(" to 4 state 4 { [F = "(" . E ")"] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift T to 2 shift F to 3 shift "(" to 4 state 3 { [T = F . ] } reduce T = F state 2 { [E = T . ] [T = T . "*" F] } reduce E = T
  • 72. !72 state 1 { [S = E . ] [E = E . "+" T] } accept shift "+" to 6 state 0 { [S = . E] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift E to 1 shift T to 2 shift F to 3 shift "(" to 4 shift ID to 5 state 4 { [F = "(" . E ")"] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift T to 2 shift F to 3 shift "(" to 4 shift ID to 5 state 5 { [F = ID .] } reduce F = ID state 3 { [T = F . ] } reduce T = F state 2 { [E = T . ] [T = T . "*" F] } reduce E = T shift "*" to 7
  • 73. !73 state 1 { [S = E . ] [E = E . "+" T] } accept shift "+" to 6 state 0 { [S = . E] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift E to 1 shift T to 2 shift F to 3 shift "(" to 4 shift ID to 5 state 4 { [F = "(" . E ")"] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift T to 2 shift F to 3 shift "(" to 4 shift ID to 5 state 5 { [F = ID .] } reduce F = ID state 6 { [E = E "+" . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } state 3 { [T = F . ] } reduce T = F state 2 { [E = T . ] [T = T . "*" F] } reduce E = T shift "*" to 7
  • 74. !74 state 1 { [S = E . ] [E = E . "+" T] } accept shift "+" to 6 state 0 { [S = . E] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift E to 1 shift T to 2 shift F to 3 shift "(" to 4 shift ID to 5 state 4 { [F = "(" . E ")"] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift T to 2 shift F to 3 shift "(" to 4 shift ID to 5 state 5 { [F = ID .] } reduce F = ID state 6 { [E = E "+" . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift F to 3 shift "(" to 4 shift ID to 5 state 3 { [T = F . ] } reduce T = F state 2 { [E = T . ] [T = T . "*" F] } reduce E = T shift "*" to 7
  • 75. !75 state 1 { [S = E . ] [E = E . "+" T] } accept shift "+" to 6 state 0 { [S = . E] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift E to 1 shift T to 2 shift F to 3 shift "(" to 4 shift ID to 5 state 4 { [F = "(" . E ")"] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift T to 2 shift F to 3 shift "(" to 4 shift ID to 5 state 5 { [F = ID .] } reduce F = ID state 6 { [E = E "+" . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift F to 3 shift "(" to 4 shift ID to 5 state 7 { [T = T "*" . F] [F = . ID] } shift ID to 5state 3 { [T = F . ] } reduce T = F state 2 { [E = T . ] [T = T . "*" F] } reduce E = T shift "*" to 7
  • 76. !76 state 1 { [S = E . ] [E = E . "+" T] } accept shift "+" to 6 state 0 { [S = . E] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift E to 1 shift T to 2 shift F to 3 shift "(" to 4 shift ID to 5 state 4 { [F = "(" . E ")"] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift E to 8 shift T to 2 shift F to 3 shift "(" to 4 shift ID to 5 state 5 { [F = ID .] } reduce F = ID state 6 { [E = E "+" . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift F to 3 shift "(" to 4 shift ID to 5 state 7 { [T = T "*" . F] [F = . ID] } shift ID to 5 state 8 { [F = "(" E . ")"] } state 3 { [T = F . ] } reduce T = F state 2 { [E = T . ] [T = T . "*" F] } reduce E = T shift "*" to 7
  • 77. !77 state 1 { [S = E . ] [E = E . "+" T] } accept shift "+" to 6 state 0 { [S = . E] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift E to 1 shift T to 2 shift F to 3 shift "(" to 4 shift ID to 5 state 4 { [F = "(" . E ")"] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift E to 8 shift T to 2 shift F to 3 shift "(" to 4 shift ID to 5 state 5 { [F = ID .] } reduce F = ID state 6 { [E = E "+" . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift T to 9 shift F to 3 shift "(" to 4 shift ID to 5 state 7 { [T = T "*" . F] [F = . "(" E ")"] [F = . ID] } shift "(" to 4 shift ID to 5 state 8 { [F = "(" E . ")"] [E = E . "+" T] } shift "+" to 6 state 9 { [E = E "+" T .] [T = T . "*" F] } reduce E = E "+" T shift "*" to 7 state 3 { [T = F . ] } reduce T = F state 2 { [E = T . ] [T = T . "*" F] } reduce E = T shift "*" to 7
  • 78. !78 state 1 { [S = E . ] [E = E . "+" T] } accept shift "+" to 6 state 0 { [S = . E] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift E to 1 shift T to 2 shift F to 3 shift "(" to 4 shift ID to 5 state 4 { [F = "(" . E ")"] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift E to 8 shift T to 2 shift F to 3 shift "(" to 4 shift ID to 5 state 5 { [F = ID .] } reduce F = ID state 6 { [E = E "+" . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift T to 9 shift F to 3 shift "(" to 4 shift ID to 5 state 7 { [T = T "*" . F] [F = . "(" E ")"] [F = . ID] } shift F to 10 shift "(" to 4 shift ID to 5 state 8 { [F = "(" E . ")"] [E = E . "+" T] } shift ")" to 11 shift "+" to 6 state 9 { [E = E "+" T .] [T = T . "*" F] } reduce E = E "+" T shift "*" to 7 state 10 { [T = T "*" F .] } reduce T = T "*" F state 3 { [T = F . ] } reduce T = F state 2 { [E = T . ] [T = T . "*" F] } reduce E = T shift "*" to 7
  • 79. !79 state 1 { [S = E . ] [E = E . "+" T] } accept shift "+" to 6 state 0 { [S = . E] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift E to 1 shift T to 2 shift F to 3 shift "(" to 4 shift ID to 5 state 4 { [F = "(" . E ")"] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift E to 8 shift T to 2 shift F to 3 shift "(" to 4 shift ID to 5 state 5 { [F = ID .] } reduce F = ID state 6 { [E = E "+" . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift T to 9 shift F to 3 shift "(" to 4 shift ID to 5 state 7 { [T = T "*" . F] [F = . "(" E ")"] [F = . ID] } shift F to 10 shift "(" to 4 shift ID to 5 state 8 { [F = "(" E . ")"] [E = E . "+" T] } shift ")" to 11 shift "+" to 6 state 9 { [E = E "+" T .] [T = T . "*" F] } reduce E = E "+" T shift "*" to 7 state 10 { [T = T "*" F .] } reduce T = T "*" F state 11 { [F = "(" E ")" .] } reduce F = "(" E ")" state 3 { [T = F . ] } reduce T = F state 2 { [E = T . ] [T = T . "*" F] } reduce E = T shift "*" to 7
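The item sets and transitions on the preceding slides can be reproduced mechanically with the usual closure and goto operations. The following is a minimal Python sketch under stated assumptions, not the tool used to produce the slides: items are encoded as `(production index, dot position)` pairs, terminals are written without quotes, and the names `GRAMMAR`, `closure`, `goto`, and `build_automaton` are hypothetical. States are numbered in discovery order, so the numbering will generally differ from the one on the slides.

```python
from collections import deque

# Expression grammar from the slides as (head, body) pairs; production 0 is S = E.
GRAMMAR = [
    ("S", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("ID",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}


def closure(items):
    """Add [N = . body] for every nonterminal N that occurs directly after a dot."""
    result = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        body = GRAMMAR[prod][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            for i, (head, _) in enumerate(GRAMMAR):
                if head == body[dot] and (i, 0) not in result:
                    result.add((i, 0))
                    work.append((i, 0))
    return frozenset(result)


def goto(state, symbol):
    """Move the dot over `symbol` in every item that allows it, then close the result."""
    moved = {(p, d + 1) for p, d in state
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved) if moved else None


def build_automaton():
    """Breadth-first construction of all LR(0) item sets and their transitions."""
    start = closure({(0, 0)})
    states, index, transitions = [start], {start: 0}, {}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        # Every symbol that follows a dot in this state labels one outgoing edge.
        symbols = {GRAMMAR[p][1][d] for p, d in state if d < len(GRAMMAR[p][1])}
        for symbol in sorted(symbols):
            target = goto(state, symbol)
            if target not in index:
                index[target] = len(states)
                states.append(target)
                queue.append(target)
            transitions[(index[state], symbol)] = index[target]
    return states, transitions


states, transitions = build_automaton()
```

An item whose dot is at the end of its body, such as `(6, 1)` for `[F = ID .]`, corresponds to a reduce action; every entry in `transitions` corresponds to one of the `shift X to n` labels in the diagram.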
  • 80. SLR Parse
      Grammar productions:
        S = E
        E = E "+" T
        E = T
        T = T "*" F
        T = F
        F = "(" E ")"
        F = ID
      Parse of ID "*" ID (each configuration is symbols | states | input, followed by the action taken):
        $ | 0 | ID "*" ID $              shift ID to 5
        => $ ID | 0 5 | "*" ID $         reduce F = ID
        => $ | 0 | F "*" ID $            shift F to 3
        => $ F | 0 3 | "*" ID $          reduce T = F
        => $ | 0 | T "*" ID $            shift T to 2
        => $ T | 0 2 | "*" ID $          shift "*" to 7
        => $ T "*" | 0 2 7 | ID $        shift ID to 5
        => $ T "*" ID | 0 2 7 5 | $      reduce F = ID
        => $ T "*" | 0 2 7 | F $         shift F to 10
        => $ T "*" F | 0 2 7 10 | $      reduce T = T "*" F
        => $ | 0 | T $                   shift T to 2
        => $ T | 0 2 | $                 reduce E = T
        => $ | 0 | E $                   shift E to 1
        => $ E | 0 1 | $                 accept
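The trace above can be replayed with a small table-driven driver. This is a sketch under assumptions rather than the lecture's implementation: the tables `TRANSITIONS`, `REDUCE`, and `FOLLOW` and the function `slr_parse` are hypothetical names, the state numbers are transcribed from the automaton slides, and the Follow sets were worked out by hand for this grammar. The table includes `shift "(" to 4` in state 7 and `shift "+" to 6` in state 8, which the full closure of those states requires. Unlike the trace, which puts a reduced nonterminal back in front of the input and then shifts it, the driver follows the nonterminal transition immediately; the sequence of reductions is the same.

```python
# Transitions of the automaton, on terminals and nonterminals alike.
TRANSITIONS = {
    0: {"E": 1, "T": 2, "F": 3, "(": 4, "ID": 5},
    1: {"+": 6},
    2: {"*": 7},
    4: {"E": 8, "T": 2, "F": 3, "(": 4, "ID": 5},
    6: {"T": 9, "F": 3, "(": 4, "ID": 5},
    7: {"F": 10, "(": 4, "ID": 5},
    8: {")": 11, "+": 6},
    9: {"*": 7},
}
# States that contain a completed item, with the production to reduce by.
REDUCE = {
    2: ("E", ("T",)),
    3: ("T", ("F",)),
    5: ("F", ("ID",)),
    9: ("E", ("E", "+", "T")),
    10: ("T", ("T", "*", "F")),
    11: ("F", ("(", "E", ")")),
}
# Follow sets (assumed, derived by hand): SLR reduces only on these lookaheads.
FOLLOW = {
    "E": {"+", ")", "$"},
    "T": {"+", "*", ")", "$"},
    "F": {"+", "*", ")", "$"},
}
ACCEPT_STATE = 1


def slr_parse(tokens):
    """Return the reductions performed, in order, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]
    stack = [0]          # stack of automaton states
    pos = 0
    reductions = []
    while True:
        state, look = stack[-1], tokens[pos]
        if state == ACCEPT_STATE and look == "$":
            return reductions
        if look in TRANSITIONS.get(state, {}):
            # Shift: consume the token and enter the target state.
            stack.append(TRANSITIONS[state][look])
            pos += 1
        elif state in REDUCE and look in FOLLOW[REDUCE[state][0]]:
            # Reduce: pop one state per body symbol, then follow the head's transition.
            head, body = REDUCE[state]
            del stack[-len(body):]
            stack.append(TRANSITIONS[stack[-1]][head])
            reductions.append(f"{head} = {' '.join(body)}")
        else:
            raise SyntaxError(f"unexpected {look!r} at position {pos}")


print(slr_parse(["ID", "*", "ID"]))
```

In state 2 the lookahead decides between the two actions: `"*"` is shifted because it is not in Follow(E), while `"+"`, `")"`, and `$` trigger the reduction `E = T`.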
  • 81. States visited while parsing ID "*" ID:
      state 0 { [S = . E] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift E to 1 shift T to 2 shift F to 3 shift "(" to 4 shift ID to 5
      state 3 { [T = F . ] } reduce T = F
      state 5 { [F = ID .] } reduce F = ID
      state 2 { [E = T . ] [T = T . "*" F] } reduce E = T shift "*" to 7
      state 7 { [T = T "*" . F] [F = . "(" E ")"] [F = . ID] } shift F to 10 shift "(" to 4 shift ID to 5
      state 10 { [T = T "*" F .] } reduce T = T "*" F
      state 1 { [S = E . ] [E = E . "+" T] } accept shift "+" to 6
      Trace:
        $ | 0 | ID "*" ID $              shift ID to 5
        => $ ID | 0 5 | "*" ID $         reduce F = ID
        => $ | 0 | F "*" ID $            shift F to 3
        => $ F | 0 3 | "*" ID $          reduce T = F
        => $ | 0 | T "*" ID $            shift T to 2
        => $ T | 0 2 | "*" ID $          shift "*" to 7
        => $ T "*" | 0 2 7 | ID $        shift ID to 5
        => $ T "*" ID | 0 2 7 5 | $      reduce F = ID
        => $ T "*" | 0 2 7 | F $         shift F to 10
        => $ T "*" F | 0 2 7 10 | $      reduce T = T "*" F
        => $ | 0 | T $                   shift T to 2
        => $ T | 0 2 | $                 reduce E = T
        => $ | 0 | E $                   shift E to 1
        => $ E | 0 1 | $                 accept
  • 82. Parsing: ID "+" ID "*" ID .
      state 0 { [S = . E] [E = . E "+" T] [E = . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift E to 1 shift T to 2 shift F to 3 shift "(" to 4 shift ID to 5
      state 3 { [T = F . ] } reduce T = F
      state 5 { [F = ID .] } reduce F = ID
      state 7 { [T = T "*" . F] [F = . "(" E ")"] [F = . ID] } shift F to 10 shift "(" to 4 shift ID to 5
      state 10 { [T = T "*" F .] } reduce T = T "*" F
      state 1 { [S = E . ] [E = E . "+" T] } accept shift "+" to 6
      state 6 { [E = E "+" . T] [T = . T "*" F] [T = . F] [F = . "(" E ")"] [F = . ID] } shift T to 9 shift F to 3 shift "(" to 4 shift ID to 5
      state 9 { [E = E "+" T .] [T = T . "*" F] } reduce E = E "+" T shift "*" to 7
      [Figure: parse tree under construction for the input, with E, T, and F nodes above the tokens ID "+" ID "*" ID]
  • 87. Parsing
      Context-free grammars
        - Productions define how to generate the sentences of a language
        - Derivation: generate a sentence from the (start) symbol
        - Reduction: reduce a sentence to the (start) symbol
      Parse tree
        - Represents the structure of a derivation
        - Abstracts from derivation order
      Parser
        - Algorithm to reconstruct a derivation
  • 88. More Topics in Syntax and Parsing
      First/Follow
        - Selecting between actions in LR parse table
      Other algorithms
        - Top-down: LL(k) table
        - Generalized parsing: Earley, Generalized-LR
        - Scannerless parsing: characters as tokens
      Disambiguation
        - Semantics of declarative disambiguation
        - Deep priority conflicts
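As a rough illustration of the First/Follow topic, the sets used to select reduce actions in an SLR table can be computed by fixpoint iteration over the productions. The sketch below is an assumption-laden example in Python, not material from the lecture: `first_follow` and `GRAMMAR` are hypothetical names, terminals are written without quotes, `$` stands for end of input, and an empty body would denote an empty production (the expression grammar has none).

```python
# Expression grammar from the slides as (head, body) pairs.
GRAMMAR = [
    ("S", ("E",)),
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("ID",)),
]


def first_follow(grammar, start):
    """Compute First and Follow sets for all nonterminals by iterating to a fixpoint."""
    nonterminals = {head for head, _ in grammar}
    first = {n: set() for n in nonterminals}
    follow = {n: set() for n in nonterminals}
    nullable = set()
    follow[start].add("$")

    def first_of(seq):
        """First set of a symbol sequence, and whether the sequence can derive empty."""
        out = set()
        for sym in seq:
            if sym not in nonterminals:
                out.add(sym)
                return out, False
            out |= first[sym]
            if sym not in nullable:
                return out, False
        return out, True

    changed = True
    while changed:
        changed = False
        for head, body in grammar:
            starts, empty = first_of(body)
            if not starts <= first[head]:
                first[head] |= starts
                changed = True
            if empty and head not in nullable:
                nullable.add(head)
                changed = True
            for i, sym in enumerate(body):
                if sym in nonterminals:
                    # Whatever can start the rest of the body can follow sym;
                    # if the rest can vanish, so can whatever follows the head.
                    rest, rest_empty = first_of(body[i + 1:])
                    new = rest | (follow[head] if rest_empty else set())
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return first, follow


first, follow = first_follow(GRAMMAR, "S")
```

For this grammar the result puts `"*"` in Follow(T) but not in Follow(E), which is what lets an SLR parser shift `"*"` rather than reduce by `E = T` when both a completed `E = T` item and `T = T . "*" F` are in the same state.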
  • 90. Except where otherwise noted, this work is licensed under