ALF
Top-Down Parser
Bibliography for today
Keith Cooper, Linda Torczon, Engineering a Compiler
– Chapter 3
• 3.3
Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman, Compilers: Principles, Techniques, and Tools (2nd Edition)
– Chapter 4
• 4.4
Contents
• Parsers
• LL
Alexander Aiken
• American
• Stanford
• LL(*)
• MOSS
• ANTLR
Slides
Some of the slides were written by Bogdan Nitulescu
BNF notation
RFC 2616 HTTP/1.1 June 1999
HTTP-date    = rfc1123-date | rfc850-date | asctime-date
rfc1123-date = wkday "," SP date1 SP time SP "GMT"
rfc850-date  = weekday "," SP date2 SP time SP "GMT"
asctime-date = wkday SP date3 SP time SP 4DIGIT
date1        = 2DIGIT SP month SP 4DIGIT
               ; day month year (e.g., 02 Jun 1982)
date2        = 2DIGIT "-" month "-" 2DIGIT
               ; day-month-year (e.g., 02-Jun-82)
date3        = month SP ( 2DIGIT | ( SP 1DIGIT ))
               ; month day (e.g., Jun 2)
time         = 2DIGIT ":" 2DIGIT ":" 2DIGIT
               ; 00:00:00 - 23:59:59
wkday        = "Mon" | "Tue" | "Wed"
             | "Thu" | "Fri" | "Sat" | "Sun"
weekday      = "Monday" | "Tuesday" | "Wednesday"
             | "Thursday" | "Friday" | "Saturday" | "Sunday"
month        = "Jan" | "Feb" | "Mar" | "Apr"
             | "May" | "Jun" | "Jul" | "Aug"
             | "Sep" | "Oct" | "Nov" | "Dec"
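Derivation tree / syntax tree
E → E + E
E → E * E
E → n
n : [0-9]+
[Figure: the lexer turns the input 2 * 3 + 4 * 5 into the tokens n * n + n * n with the values 2 3 4 5; the parser builds the derivation tree (E → E + E at the root, each operand derived with E → E * E) and the syntax tree (+ at the root, two * nodes below it).]
• Tokens
• Values
• Grammar
• Derivation tree
• Syntax tree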
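The lexer step of this slide can be sketched in a few lines of Python. This is only an illustration, assuming tokens are represented as (kind, value) pairs; the function name `tokenize` and that representation are not from the slides.

```python
import re

def tokenize(text):
    """Split the input into (kind, value) pairs for the rule n : [0-9]+
    and the operators + and *. Whitespace is skipped."""
    tokens = []
    for lexeme in re.findall(r"[0-9]+|\S", text):
        if "0" <= lexeme[0] <= "9":
            tokens.append(("n", int(lexeme)))   # token n, value = the number
        elif lexeme in ("+", "*"):
            tokens.append((lexeme, None))       # operators carry no value
        else:
            raise SyntaxError(f"unexpected character {lexeme!r}")
    return tokens

tokens = tokenize("2 * 3 + 4 * 5")
kinds = [kind for kind, _ in tokens]
values = [value for kind, value in tokens if kind == "n"]
```

Here `kinds` is the token stream the parser sees (n * n + n * n) and `values` holds the attached numbers (2 3 4 5).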
Types of syntax analysis
• Top-down
– With backtracking
– Predictive
– Recursive descent, table-driven LL
• Bottom-up
– With backtracking
– Shift-reduce
– LR(0), SLR, LALR, canonical LR
Leftmost derivation, top-down
Input: id = ( id + id ) ;
– Instr
– id = Expr ;
– id = ( Expr ) ;
– id = ( Expr + Expr ) ;
– id = ( id + Expr ) ;
– id = ( id + id ) ;
• LL: the token string is read starting from the left (L)
• The leftmost non-terminal is derived first (L)
Leftmost derivation, top-down
Input: id = ( id ) + ( id ) ;
– Instr
– id = Expr ;
– id = Expr + Expr ;
– id = ( Expr ) + Expr ;
– id = ( id ) + Expr ;
– id = ( id ) + ( Expr ) ;
– id = ( id ) + ( id ) ;
• How do we choose which production to use at each derivation step?
• Backtracking?
LL and LR parsers
• We should avoid backtracking
• This requires a grammar that allows deterministic parsing
• LL(k) reads left to right and builds a leftmost derivation
• LR(k) reads left to right and builds a rightmost derivation (in reverse)
• k – lookahead (how many tokens are read ahead)
• LL(k) < LR(k): every LL(k) grammar is also LR(k), but not the other way around
• The algorithm is independent of the language; the grammar depends on the language
Recursive-descent parsing
• Non-terminal -> function
• If the symbol appears on the right-hand side of a production -> call its function
• If the symbol appears on the left-hand side of several productions, the production is chosen based on the next tokens (lookahead)
Recursive-descent parsing
MatchToken(token) {
    // consume the expected token or report a syntax error
    if (lookahead != token) throw error();
    lookahead = lexer.getNextToken();
}
rfc850-date = weekday "," SP date2 SP time SP "GMT"
Function that parses the non-terminal rfc850-date:
ParseRFC850Date() {
    ParseWeekDay();
    MatchToken(",");
    MatchToken(SP);
    ParseDate2();
    MatchToken(SP);
    ParseTime();
    MatchToken(SP);
    MatchToken("GMT");
}
Left recursion
With the grammar
E → E + T | T
T → T * F | F
F → ( E ) | id
a top-down parser enters an infinite loop when it tries to parse the input.
[Figure: the derivation tree keeps growing down its left branch: E, then E + T, then E + T + T, and so on, without ever consuming a token.]
(Aho, Sethi, Ullman, p. 176)
Left recursion
The expression grammar
E → E + T | T
T → T * F | F
F → ( E ) | id
can be rewritten without left recursion:
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id
ε – the empty string
(Aho, Sethi, Ullman, p. 176)
Example of a recursive-descent parser
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id
ParseE() {
    ParseT(); ParseE1();
}
ParseE1() {
    if (lookahead == "+") {
        MatchToken("+");
        ParseT();
        ParseE1();
    }
}
ParseT() {
    ParseF(); ParseT1();
}
ParseT1() {
    if (lookahead == "*") {
        MatchToken("*");
        ParseF();
        ParseT1();
    }
}
ParseF() {
    if (lookahead == "(") {
        MatchToken("("); ParseE(); MatchToken(")");
    }
    else
        MatchToken(T_ID);
}
Recursive-descent parsing
How do we choose between two productions?
How do we know which conditions to put in the if?
When do we report errors?
F → ( E )
F → id
ParseF() {
    if (lookahead == "(") {
        MatchToken("(");
        ParseE();
        MatchToken(")");
    }
    else if (lookahead == T_ID) {
        MatchToken(T_ID);
    }
    else throw error();
}
T' → * F T'
T' → ε
ParseT1() {
    if (lookahead == "*") {
        MatchToken("*");
        ParseF();
        ParseT1();
    }
    else if (lookahead == "+") { }      // T' → ε
    else if (lookahead == ")") { }      // T' → ε
    else if (lookahead == T_EOF) { }    // T' → ε
    else throw error();
}
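The pseudocode above can be turned into a small runnable recognizer. The following Python sketch is only an illustration: the class name `Parser`, the helper `accepts`, and the use of plain strings such as "id" and "$" as tokens are assumptions, not part of the slides. Each ε alternative is taken only when the lookahead is one of the tokens listed in the slide's ParseT1 (and the analogous ones for E').

```python
class Parser:
    """Recursive-descent recognizer for
    E -> T E',  E' -> + T E' | eps,  T -> F T',  T' -> * F T' | eps,
    F -> ( E ) | id"""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # "$" marks the end of input
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, token):
        if self.lookahead != token:
            raise SyntaxError(f"expected {token!r}, found {self.lookahead!r}")
        self.pos += 1

    def parse_E(self):
        self.parse_T()
        self.parse_E1()

    def parse_E1(self):
        if self.lookahead == "+":              # E' -> + T E'
            self.match("+")
            self.parse_T()
            self.parse_E1()
        elif self.lookahead in (")", "$"):     # E' -> eps
            pass
        else:
            raise SyntaxError(f"unexpected {self.lookahead!r}")

    def parse_T(self):
        self.parse_F()
        self.parse_T1()

    def parse_T1(self):
        if self.lookahead == "*":              # T' -> * F T'
            self.match("*")
            self.parse_F()
            self.parse_T1()
        elif self.lookahead in ("+", ")", "$"):  # T' -> eps
            pass
        else:
            raise SyntaxError(f"unexpected {self.lookahead!r}")

    def parse_F(self):
        if self.lookahead == "(":              # F -> ( E )
            self.match("(")
            self.parse_E()
            self.match(")")
        elif self.lookahead == "id":           # F -> id
            self.match("id")
        else:
            raise SyntaxError(f"unexpected {self.lookahead!r}")


def accepts(tokens):
    """Return True if the token list is a sentence of the grammar."""
    parser = Parser(tokens)
    try:
        parser.parse_E()
        parser.match("$")
        return True
    except SyntaxError:
        return False
```

For example, `accepts(["id", "+", "id", "*", "id"])` succeeds, while `accepts(["id", "id"])` fails inside `parse_T1` because "id" is not a token that may follow T'.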
The conditions for the if
• FIRST
– The set of terminals that can begin a string derived from the non-terminal
• FOLLOW
– The set of terminals that can appear immediately after the non-terminal
• NULLABLE
– The set of non-terminals that can derive ε
FIRST
GRAMMAR:
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id
FIRST – rules:
1. If X is a terminal, FIRST(X) = {X}
2. If X → ε, then ε ∈ FIRST(X)
3. If X → Y1 Y2 ••• Yk
   and a ∈ FIRST(Y1)
   then a ∈ FIRST(X)
4. If X → Y1 Y2 ••• Yk
   and Y1 ••• Yi-1 ⇒* ε
   and a ∈ FIRST(Yi)
   then a ∈ FIRST(X)
SETS:
FIRST(id) = {id}
FIRST(*) = {*}
FIRST(+) = {+}
FIRST(() = {(}
FIRST()) = {)}
FIRST(F) = {(, id}
FIRST(T) = FIRST(F) = {(, id}
FIRST(E) = FIRST(T) = {(, id}
FIRST(E') = {+, ε}
FIRST(T') = {*, ε}
(Aho, Sethi, Ullman, p. 189)
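The FIRST rules can be applied mechanically until no set changes any more. The Python sketch below is one possible way to do it, assuming the grammar is stored as a dictionary from each non-terminal to its list of production bodies; the names `GRAMMAR`, `first_of_sequence` and `compute_first` are illustrative choices, not from the slides.

```python
EPS = "ε"

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}

def first_of_sequence(symbols, first):
    """FIRST of a string of grammar symbols, using the FIRST sets known so far."""
    result = set()
    for sym in symbols:
        if sym == EPS:
            continue
        # A terminal is its own FIRST set (rule 1).
        sym_first = first[sym] if sym in first else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result          # this symbol cannot vanish: stop here
    result.add(EPS)                # every symbol can derive the empty string
    return result

def compute_first(grammar):
    """Repeat the rules until a full pass adds nothing (fixed point)."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, productions in grammar.items():
            for body in productions:
                new = first_of_sequence(body, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

FIRST = compute_first(GRAMMAR)
```

Run on the slide's grammar, this reproduces the sets listed above, for instance FIRST(E') = {+, ε}.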
FOLLOW
GRAMMAR:
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id
FIRST SETS:
FIRST(F) = {(, id}
FIRST(T) = {(, id}
FIRST(E) = {(, id}
FIRST(E') = {+, ε}
FIRST(T') = {*, ε}
FOLLOW – rules:
1. If S is the start symbol, then $ ∈ FOLLOW(S)
2. If A → α B β,
   and a ∈ FIRST(β)
   and a ≠ ε
   then a ∈ FOLLOW(B)
3. If A → α B
   and a ∈ FOLLOW(A)
   then a ∈ FOLLOW(B)
3a. If A → α B β
   and β ⇒* ε
   and a ∈ FOLLOW(A)
   then a ∈ FOLLOW(B)
α and β – strings of terminals and non-terminals
A and B – non-terminals
$ – end of input
SETS (built step by step):
FOLLOW(E) = {$}, then {), $}  (rule 1, then rule 2 on F → ( E ))
FOLLOW(E') = {), $}  (rule 3 on E → T E')
FOLLOW(T) = {), $}, then {+, ), $}  (rule 3a on E → T E', then rule 2 with FIRST(E'))
FOLLOW(T') = {+, ), $}  (rule 3 on T → F T')
FOLLOW(F) = {+, ), $}, then {+, *, ), $}  (rule 3a on T → F T', then rule 2 with FIRST(T'))
(Aho, Sethi, Ullman, p. 189)
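The FOLLOW rules can be iterated to a fixed point in the same way as FIRST. The Python sketch below is an illustration only: it takes the FIRST sets from the slides as given, and the names `GRAMMAR`, `FIRST`, `first_of_sequence` and `compute_follow` are assumptions made for the example.

```python
EPS, END = "ε", "$"

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}

# FIRST sets of the non-terminals, copied from the slides.
FIRST = {
    "E": {"(", "id"}, "E'": {"+", EPS},
    "T": {"(", "id"}, "T'": {"*", EPS},
    "F": {"(", "id"},
}

def first_of_sequence(symbols):
    """FIRST of a string of symbols; the empty string yields {ε}."""
    result = set()
    for sym in symbols:
        if sym == EPS:
            continue
        sym_first = FIRST.get(sym, {sym})   # terminal: FIRST(a) = {a}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def compute_follow(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                       # rule 1
    changed = True
    while changed:
        changed = False
        for head, productions in grammar.items():
            for body in productions:
                for i, sym in enumerate(body):
                    if sym not in grammar:       # only non-terminals have FOLLOW
                        continue
                    rest = first_of_sequence(body[i + 1:])
                    new = rest - {EPS}           # rule 2
                    if EPS in rest:              # rules 3 and 3a
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

FOLLOW = compute_follow(GRAMMAR, "E")
```

The result matches the final sets on the slide, including FOLLOW(F) = {+, *, ), $}.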
The generic recursive LL(1) algorithm
A → a B … x
A → C D … y
…
ParseA() {
    if (lookahead in FIRST(a B … x FOLLOW(A))) {
        MatchToken(a); ParseB(); … MatchToken(x);
    }
    else if (lookahead in FIRST(C D … y FOLLOW(A))) {
        ParseC(); ParseD(); … MatchToken(y);
    }
    …
    else throw error();
}
• For each non-terminal, create a parsing function.
• For each rule A → α, add a test
  if (lookahead in FIRST(α FOLLOW(A)))
  where FIRST(α FOLLOW(A)) is FIRST(α), extended with FOLLOW(A) when α can derive ε.
• For each non-terminal in α, call its parsing function.
• For each terminal in α, check the lookahead (match).
Left recursion
When a grammar has at least one production of the form
A → A α
we say that it is a left-recursive grammar.
Top-down parsers do not work (without backtracking) on left-recursive grammars.
(Aho, Sethi, Ullman, p. 176)
The recursion may be indirect:
A → B α
B → A β
Eliminating left recursion
• This is done by rewriting the grammar
E → E + T | T
T → T * F | F
F → ( E ) | id
becomes
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id
(Aho, Sethi, Ullman, p. 176)
List → List Item | Item
becomes
List → Item List'
List' → Item List' | ε
General case (immediate recursion):
A → A β1 | A β2 | ... | A βm | α1 | α2 | ... | αn
becomes
A → α1 A' | α2 A' | ... | αn A'
A' → β1 A' | β2 A' | ... | βm A' | ε
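The general case for immediate left recursion translates almost directly into code. The sketch below is illustrative: the function name and the list-of-lists representation are assumptions, and it assumes that no alternative is A → A alone and that none of the non-recursive alternatives is ε.

```python
EPS = "ε"

def eliminate_immediate_left_recursion(head, productions):
    """Rewrite  A -> A b1 | ... | A bm | a1 | ... | an  as
         A  -> a1 A' | ... | an A'
         A' -> b1 A' | ... | bm A' | ε
    Returns a dict mapping each (new) non-terminal to its productions."""
    # Tails b_i of the left-recursive alternatives A -> A b_i.
    recursive = [body[1:] for body in productions if body and body[0] == head]
    # Alternatives a_j that do not start with A.
    others = [body for body in productions if not body or body[0] != head]
    if not recursive:
        return {head: productions}          # nothing to do
    new_head = head + "'"
    return {
        head: [body + [new_head] for body in others],
        new_head: [tail + [new_head] for tail in recursive] + [[EPS]],
    }

expr = eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]])
items = eliminate_immediate_left_recursion("List", [["List", "Item"], ["Item"]])
```

Applied to E → E + T | T this gives E → T E' and E' → + T E' | ε, and applied to the List grammar it gives List → Item List' and List' → Item List' | ε, as on the slide.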
Left factoring
• For an if statement: [grammar shown on the slide]
• To be parsed with LL, it must be left-factored: [factored grammar shown on the slide]
Left factoring
• General case:
A → α β1 | α β2 | ... | α βn | δ
• Factored:
A → α A' | δ
A' → β1 | β2 | ... | βn
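One step of the general left-factoring rule can be sketched as follows. Since the if-statement grammar of the slide is not preserved in this transcript, the example uses the usual textbook form (Stmt → if Expr then Stmt else Stmt | if Expr then Stmt | other) as an assumption; the function names are also illustrative, and every alternative is assumed to be non-empty.

```python
EPS = "ε"

def longest_common_prefix(bodies):
    """Longest sequence of symbols shared at the start of all bodies."""
    prefix = []
    for symbols in zip(*bodies):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix

def left_factor(head, productions):
    """One left-factoring step:
    A -> α β1 | ... | α βn | δ   becomes   A -> α A' | δ,  A' -> β1 | ... | βn."""
    groups = {}
    for body in productions:                 # group alternatives by first symbol
        groups.setdefault(body[0], []).append(body)
    result = {head: []}
    new_head = head + "'"
    for bodies in groups.values():
        if len(bodies) == 1:                 # nothing to factor (the δ case)
            result[head].append(bodies[0])
            continue
        prefix = longest_common_prefix(bodies)
        result[head].append(prefix + [new_head])
        # An alternative equal to the prefix leaves the empty string behind.
        result[new_head] = [body[len(prefix):] or [EPS] for body in bodies]
        new_head += "'"                      # fresh name for the next group
    return result

factored = left_factor("Stmt", [
    ["if", "Expr", "then", "Stmt", "else", "Stmt"],
    ["if", "Expr", "then", "Stmt"],
    ["other"],
])
```

The two if alternatives share the prefix if Expr then Stmt, so the result is Stmt → if Expr then Stmt Stmt' | other and Stmt' → else Stmt | ε.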
Eliminating ambiguity
• Ambiguous: E → E + E | E * E | a | ( E )
1. E → E + T | T          (left-associative operators)
   T → T * F | F
   F → a | ( E )
2. E → T + E | T          (right-associative operators)
   T → F * T | F
   F → a | ( E )
• Operator precedence
• Left or right associativity
Eliminating ambiguity
• Productions that can cause ambiguity:
X → a A b A c
• General case:
A → A B A | α1 | α2 | ... | αn
• Disambiguated:
A → A' B A | A'
A' → α1 | α2 | ... | αn
Automatic parser
• Push-down automaton
• The parser is built from an automaton and a table
• The grammar is LL(1) if the table has no conflicts
Grammar:
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id
Parsing table (rows: non-terminals, columns: input symbols):
NON-TERMINAL | id        | +           | *           | (         | )       | $
E            | E → T E'  |             |             | E → T E'  |         |
E'           |           | E' → + T E' |             |           | E' → ε  | E' → ε
T            | T → F T'  |             |             | T → F T'  |         |
T'           |           | T' → ε      | T' → * F T' |           | T' → ε  | T' → ε
F            | F → id    |             |             | F → ( E ) |         |
(Aho, Sethi, Ullman, p. 188)
Example of an LL parser
INPUT: id + id * id $
The predictive parsing program keeps a STACK (top on the left), reads the INPUT and consults the PARSING TABLE; the OUTPUT is the production applied at each step.
STACK        | INPUT           | OUTPUT
E $          | id + id * id $  | E → T E'
T E' $       | id + id * id $  | T → F T'
F T' E' $    | id + id * id $  | F → id
id T' E' $   | id + id * id $  | (match id)
T' E' $      | + id * id $     | T' → ε
E' $         | + id * id $     | E' → + T E'
When Top(Stack) = input ≠ $: pop the stack and advance the input.
The parser continues in the same way and builds the derivation tree with the productions:
E' → + T E'
T → F T'
F → id
T' → * F T'
F → id
T' → ε
E' → ε
When Top(Stack) = input = $, the parser stops and accepts the input.
(Aho, Sethi, Ullman, pp. 186-188)
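The predictive parsing program from this example can be sketched as a short loop over a stack. The Python code below is an illustration only: the table is copied from the slide, while the dictionary layout, the function name `ll1_parse` and the use of strings as tokens are assumptions.

```python
EPS, END = "ε", "$"

# Parsing table M[non-terminal][lookahead] -> production body (from the slide).
TABLE = {
    "E":  {"id": ["T", "E'"], "(": ["T", "E'"]},
    "E'": {"+": ["+", "T", "E'"], ")": [EPS], END: [EPS]},
    "T":  {"id": ["F", "T'"], "(": ["F", "T'"]},
    "T'": {"+": [EPS], "*": ["*", "F", "T'"], ")": [EPS], END: [EPS]},
    "F":  {"id": ["id"], "(": ["(", "E", ")"]},
}

def ll1_parse(tokens, start="E"):
    """Return the productions applied, in order (a leftmost derivation)."""
    tokens = list(tokens) + [END]
    stack = [END, start]              # the top of the stack is the list's end
    pos = 0
    output = []
    while stack:
        top = stack.pop()
        lookahead = tokens[pos]
        if top in TABLE:              # non-terminal: expand using the table
            body = TABLE[top].get(lookahead)
            if body is None:
                raise SyntaxError(f"no rule for {top} on {lookahead!r}")
            output.append(f"{top} -> {' '.join(body)}")
            # Push the body right to left so its first symbol ends up on top.
            stack.extend(sym for sym in reversed(body) if sym != EPS)
        elif top == lookahead:        # terminal or $: pop and advance
            pos += 1
        else:
            raise SyntaxError(f"expected {top!r}, found {lookahead!r}")
    return output

steps = ll1_parse(["id", "+", "id", "*", "id"])
```

For the input id + id * id, `steps` lists the same productions as the trace above, starting with E → T E' and ending with E' → ε. An empty table cell is reported as a syntax error.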
Filling in the table
• FIRST
– The set of terminals that can begin a string derived from the non-terminal
• FOLLOW
– The set of terminals that can appear immediately after the non-terminal
• NULLABLE
– The set of non-terminals that can derive ε
Rules for building the parsing table
GRAMMAR:
E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E ) | id
FIRST SETS:
FIRST(F) = {(, id}
FIRST(T) = {(, id}
FIRST(E) = {(, id}
FIRST(E') = {+, ε}
FIRST(T') = {*, ε}
FOLLOW SETS:
FOLLOW(E) = {), $}
FOLLOW(E') = {), $}
FOLLOW(T) = {+, ), $}
FOLLOW(T') = {+, ), $}
FOLLOW(F) = {+, *, ), $}
RULES:
1. If A → α:
   if a ∈ FIRST(α), add A → α to M[A, a]
2. If A → α:
   if ε ∈ FIRST(α), add A → α to M[A, b]
   for each terminal b ∈ FOLLOW(A)
3. If A → α:
   if ε ∈ FIRST(α) and $ ∈ FOLLOW(A),
   add A → α to M[A, $]
PARSING TABLE:
NON-TERMINAL | id        | +           | *           | (         | )       | $
E            | E → T E'  |             |             | E → T E'  |         |
E'           |           | E' → + T E' |             |           | E' → ε  | E' → ε
T            | T → F T'  |             |             | T → F T'  |         |
T'           |           | T' → ε      | T' → * F T' |           | T' → ε  | T' → ε
F            | F → id    |             |             | F → ( E ) |         |
(Aho, Sethi, Ullman, p. 190)
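The three table-construction rules can be sketched in Python as follows. This is an illustration under assumptions: the FIRST and FOLLOW sets are copied from the slides rather than computed, and the names `GRAMMAR`, `FIRST`, `FOLLOW`, `first_of_sequence` and `build_table` are chosen for the example.

```python
EPS, END = "ε", "$"

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], [EPS]],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], [EPS]],
    "F":  [["(", "E", ")"], ["id"]],
}
FIRST = {
    "E": {"(", "id"}, "E'": {"+", EPS},
    "T": {"(", "id"}, "T'": {"*", EPS},
    "F": {"(", "id"},
}
FOLLOW = {
    "E": {")", END}, "E'": {")", END},
    "T": {"+", ")", END}, "T'": {"+", ")", END},
    "F": {"+", "*", ")", END},
}

def first_of_sequence(symbols):
    """FIRST of a production body α."""
    result = set()
    for sym in symbols:
        if sym == EPS:
            continue
        sym_first = FIRST.get(sym, {sym})   # terminal: FIRST(a) = {a}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def build_table(grammar):
    """Fill M[A][a] with the body of the production to apply."""
    table = {nt: {} for nt in grammar}
    for head, productions in grammar.items():
        for body in productions:
            body_first = first_of_sequence(body)
            lookaheads = body_first - {EPS}      # rule 1
            if EPS in body_first:                # rules 2 and 3 ($ is in FOLLOW)
                lookaheads |= FOLLOW[head]
            for terminal in lookaheads:
                if terminal in table[head]:      # two productions in one cell
                    raise ValueError(f"LL(1) conflict at M[{head}, {terminal}]")
                table[head][terminal] = body
    return table

TABLE = build_table(GRAMMAR)
```

Cells that stay empty correspond to syntax errors; a cell that would receive two productions means the grammar is not LL(1), which this sketch reports as an exception.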
Using an LL(1) parser
• Grammars must be
– Unambiguous
– Left-factored
– Not left-recursive
• It can be shown that a grammar G is LL(1) if and only if, for any two productions of the form A → α, A → β with α ≠ β, the following conditions hold:
– FIRST(α) ∩ FIRST(β) = ∅
– If α ⇒* ε then FIRST(β) ∩ FOLLOW(A) = ∅, and if β ⇒* ε then FIRST(α) ∩ FOLLOW(A) = ∅.
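The LL(1) condition can be checked pairwise on the alternatives of one non-terminal. The Python sketch below assumes the FIRST set of each alternative and FOLLOW(A) are already known; the function name `alternatives_are_ll1` is hypothetical.

```python
from itertools import combinations

EPS = "ε"

def alternatives_are_ll1(first_sets, follow):
    """first_sets: FIRST(α) for each alternative A -> α of one non-terminal A.
    follow: FOLLOW(A). Returns True if every pair satisfies the condition."""
    for first_a, first_b in combinations(first_sets, 2):
        if first_a & first_b:                     # FIRST(α) ∩ FIRST(β) ≠ ∅
            return False
        if EPS in first_b and first_a & follow:   # β ⇒* ε, FIRST(α) meets FOLLOW(A)
            return False
        if EPS in first_a and first_b & follow:   # α ⇒* ε, FIRST(β) meets FOLLOW(A)
            return False
    return True

# E' -> + T E' | ε  with FOLLOW(E') = {), $}: the condition holds.
ok = alternatives_are_ll1([{"+"}, {EPS}], {")", "$"})

# E -> E + T | T (left-recursive): both alternatives start with ( or id.
bad = alternatives_are_ll1([{"(", "id"}, {"(", "id"}], {"+", ")", "$"})
```

The second call illustrates why the left-recursive expression grammar has to be transformed before it can be parsed with LL(1).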
Advantages / disadvantages of LL(1)
• Easy to write by hand
• Fast, easy to understand
• The grammar must be transformed
• The derivation tree differs from the semantic tree
[Figure: the LL(1) derivation tree of the example input, with its E', T' and ε nodes, next to the more compact tree built only from E, T, F, id and +.]
LL parsers
• ANTLR
– Java
– LL(*)
– Factoring
EBNF rules
Something?
Something*
Something+
SomethingQ -> ε
            | Something
SomethingStar -> ε
               | Something SomethingStar
SomethingPlus -> Something SomethingStar
Topics
• Parsers
• LL
– Avoiding ambiguity
– Factoring
– Avoiding left recursion
• The general recursive LL algorithm
Questions

More Related Content

PPTX
ALF 5 - Parser Top-Down (2018)
PDF
A brief introduction to functional programming
PDF
Declarative Type System Specification with Statix
PDF
Perl6 signatures
PDF
Pratt Parser in Python
PDF
JavaOne 2016 - Learn Lambda and functional programming
PPTX
Go Java, Go!
PDF
Term Rewriting
ALF 5 - Parser Top-Down (2018)
A brief introduction to functional programming
Declarative Type System Specification with Statix
Perl6 signatures
Pratt Parser in Python
JavaOne 2016 - Learn Lambda and functional programming
Go Java, Go!
Term Rewriting

What's hot (20)

PDF
Go Java, Go!
PPTX
Java 8, lambdas, generics: How to survive? - NYC Java Meetup Group
PDF
Procedural Programming: It’s Back? It Never Went Away
PPTX
Generics and Lambdas cocktail explained - Montreal JUG
PPTX
Lambdas and Generics (long version) - Bordeaux/Toulouse JUG
PDF
Declarative Semantics Definition - Term Rewriting
PDF
Hello, Type Systems! - Introduction to Featherweight Java
PDF
T3chFest 2016 - The polyglot programmer
PDF
GUL UC3M - Introduction to functional programming
PDF
Declarative Thinking, Declarative Practice
PDF
Go Java, Go!
PDF
Dynamic Semantics Specification and Interpreter Generation
PDF
Swift rocks! #1
PDF
Data structures stacks
PDF
The Ring programming language version 1.9 book - Part 31 of 210
PDF
The Next Great Functional Programming Language
PDF
Hammurabi
PDF
Introduction to Functional Programming with Scala
PDF
Why The Free Monad isn't Free
PDF
Core csharp and net quick reference
Go Java, Go!
Java 8, lambdas, generics: How to survive? - NYC Java Meetup Group
Procedural Programming: It’s Back? It Never Went Away
Generics and Lambdas cocktail explained - Montreal JUG
Lambdas and Generics (long version) - Bordeaux/Toulouse JUG
Declarative Semantics Definition - Term Rewriting
Hello, Type Systems! - Introduction to Featherweight Java
T3chFest 2016 - The polyglot programmer
GUL UC3M - Introduction to functional programming
Declarative Thinking, Declarative Practice
Go Java, Go!
Dynamic Semantics Specification and Interpreter Generation
Swift rocks! #1
Data structures stacks
The Ring programming language version 1.9 book - Part 31 of 210
The Next Great Functional Programming Language
Hammurabi
Introduction to Functional Programming with Scala
Why The Free Monad isn't Free
Core csharp and net quick reference
Ad

Similar to ALF 5 - Parser Top-Down (20)

PPTX
Top down and botttom up 2 LATEST.
PPTX
Top down and botttom up Parsing
PPT
compiler-lecture-6nn-14112022-110738am.ppt
PPTX
Top down parsing(sid) (1)
PDF
CS17604_TOP Parser Compiler Design Techniques
PDF
Left factor put
PPTX
LL(1) parsing
PPT
Ch4_topdownparser_ngfjgh_ngjfhgfffdddf.PPT
PPTX
6-Practice Problems - LL(1) parser-16-05-2023.pptx
PPT
Integrated Fundamental and Technical Analysis of Select Public Sector Oil Com...
PPTX
Compiler Design_Syntax Analyzer_Top Down Parsers.pptx
PPT
7-parsing-error and Jake were centered at Gondar began
PPT
Predicting Stock Market Trends Using Machine Learning and Deep Learning Algor...
PDF
PDF
PDF
PPTX
Parsing in Compiler Design
Top down and botttom up 2 LATEST.
Top down and botttom up Parsing
compiler-lecture-6nn-14112022-110738am.ppt
Top down parsing(sid) (1)
CS17604_TOP Parser Compiler Design Techniques
Left factor put
LL(1) parsing
Ch4_topdownparser_ngfjgh_ngjfhgfffdddf.PPT
6-Practice Problems - LL(1) parser-16-05-2023.pptx
Integrated Fundamental and Technical Analysis of Select Public Sector Oil Com...
Compiler Design_Syntax Analyzer_Top Down Parsers.pptx
7-parsing-error and Jake were centered at Gondar began
Predicting Stock Market Trends Using Machine Learning and Deep Learning Algor...
Parsing in Compiler Design
Ad

More from Alexandru Radovici (20)

PPTX
SdE2 - Pilot Tock
PPTX
SdE2 - Systèmes embarquées
PPTX
SdE2 - Planification, IPC
PPTX
ALF1 - Introduction
PPTX
SdE2 - Introduction
PPTX
MDAD 6 - AIDL and Services
PPTX
MDAD 5 - Threads
PPTX
MDAD 4 - Lists, adapters and recycling
PPTX
MDAD 3 - Basics of UI Applications
PPTX
MDAD 2 - Introduction to the Android Framework
PPTX
MDAD 1 - Hardware
PPTX
MDAD 0 - Introduction
PPTX
SdE 11 - Reseau
PPTX
SdE 10 - Threads
PPTX
SdE 8 - Synchronisation de execution
PPTX
SdE 8 - Memoire Virtuelle
PPTX
SdE 7 - Gestion de la Mémoire
PPTX
SdE 6 - Planification
PPTX
SdE 5 - Planification
PPTX
ALF 6 - Parser
SdE2 - Pilot Tock
SdE2 - Systèmes embarquées
SdE2 - Planification, IPC
ALF1 - Introduction
SdE2 - Introduction
MDAD 6 - AIDL and Services
MDAD 5 - Threads
MDAD 4 - Lists, adapters and recycling
MDAD 3 - Basics of UI Applications
MDAD 2 - Introduction to the Android Framework
MDAD 1 - Hardware
MDAD 0 - Introduction
SdE 11 - Reseau
SdE 10 - Threads
SdE 8 - Synchronisation de execution
SdE 8 - Memoire Virtuelle
SdE 7 - Gestion de la Mémoire
SdE 6 - Planification
SdE 5 - Planification
ALF 6 - Parser

Recently uploaded (20)

PPTX
Introduction to Building Materials
PDF
Vision Prelims GS PYQ Analysis 2011-2022 www.upscpdf.com.pdf
PDF
Practical Manual AGRO-233 Principles and Practices of Natural Farming
PPTX
Unit 4 Computer Architecture Multicore Processor.pptx
PPTX
202450812 BayCHI UCSC-SV 20250812 v17.pptx
PDF
1.3 FINAL REVISED K-10 PE and Health CG 2023 Grades 4-10 (1).pdf
PDF
HVAC Specification 2024 according to central public works department
PDF
OBE - B.A.(HON'S) IN INTERIOR ARCHITECTURE -Ar.MOHIUDDIN.pdf
PDF
advance database management system book.pdf
PPTX
B.Sc. DS Unit 2 Software Engineering.pptx
PPTX
TNA_Presentation-1-Final(SAVE)) (1).pptx
PPTX
ELIAS-SEZIURE AND EPilepsy semmioan session.pptx
PDF
Weekly quiz Compilation Jan -July 25.pdf
PDF
BP 704 T. NOVEL DRUG DELIVERY SYSTEMS (UNIT 1)
PDF
Black Hat USA 2025 - Micro ICS Summit - ICS/OT Threat Landscape
PPTX
History, Philosophy and sociology of education (1).pptx
PPTX
CHAPTER IV. MAN AND BIOSPHERE AND ITS TOTALITY.pptx
PPTX
Computer Architecture Input Output Memory.pptx
PDF
LDMMIA Reiki Yoga Finals Review Spring Summer
PDF
1_English_Language_Set_2.pdf probationary
Introduction to Building Materials
Vision Prelims GS PYQ Analysis 2011-2022 www.upscpdf.com.pdf
Practical Manual AGRO-233 Principles and Practices of Natural Farming
Unit 4 Computer Architecture Multicore Processor.pptx
202450812 BayCHI UCSC-SV 20250812 v17.pptx
1.3 FINAL REVISED K-10 PE and Health CG 2023 Grades 4-10 (1).pdf
HVAC Specification 2024 according to central public works department
OBE - B.A.(HON'S) IN INTERIOR ARCHITECTURE -Ar.MOHIUDDIN.pdf
advance database management system book.pdf
B.Sc. DS Unit 2 Software Engineering.pptx
TNA_Presentation-1-Final(SAVE)) (1).pptx
ELIAS-SEZIURE AND EPilepsy semmioan session.pptx
Weekly quiz Compilation Jan -July 25.pdf
BP 704 T. NOVEL DRUG DELIVERY SYSTEMS (UNIT 1)
Black Hat USA 2025 - Micro ICS Summit - ICS/OT Threat Landscape
History, Philosophy and sociology of education (1).pptx
CHAPTER IV. MAN AND BIOSPHERE AND ITS TOTALITY.pptx
Computer Architecture Input Output Memory.pptx
LDMMIA Reiki Yoga Finals Review Spring Summer
1_English_Language_Set_2.pdf probationary

ALF 5 - Parser Top-Down

  • 2. Bibliographie pour aujourd'hui Keith Cooper, Linda Torczon, Engineering a Compiler – Chapitre 3 • 3.3 Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman, Compilers: Principles, Techniques, and Tools (2nd Edition) – Chapitre 4 • 4.4
  • 4. Alexander Aiken • Américain • Stanford • LL(*) • MOSS • ANTLR
  • 5. Slides Partie de slides sont écrie par Bogdan Nitulescu
  • 6. Notation BNF RFC 2616 HTTP/1.1 June 1999 HTTP-date = rfc1123-date | rfc850-date | asctime-date rfc1123-date = wkday "," SP date1 SP time SP "GMT“ rfc850-date = weekday "," SP date2 SP time SP "GMT“ asctime-date = wkday SP date3 SP time SP 4DIGIT date1 = 2DIGIT SP month SP 4DIGIT ; day month year (e.g., 02 Jun 1982) date2 = 2DIGIT "-" month "-" 2DIGIT ; day-month-year (e.g., 02-Jun-82) date3 = month SP ( 2DIGIT | ( SP 1DIGIT )) ; month day (e.g., Jun 2) time = 2DIGIT ":" 2DIGIT ":" 2DIGIT ; 00:00:00 - 23:59:59 wkday = "Mon" | "Tue" | "Wed“ | "Thu" | "Fri" | "Sat" | "Sun“ weekday = "Monday" | "Tuesday" | "Wednesday“ | "Thursday" | "Friday" | "Saturday" | "Sunday“ month = "Jan" | "Feb" | "Mar" | "Apr“ | "May" | "Jun" | "Jul" | "Aug" | "Sep" | "Oct" | "Nov" | "Dec"
  • 7. Arbre de dérivation / syntactique E  E + E E  E * E E  n n : [0-9]+ Lexer 2 * 3 + 4 * 5 n * n + n * n 2 3 4 5 parser n n n n 2 3 4 5 * * + n n n n 2 3 4 5 E * E E * E E + E E • jetons (tokens) • Valeurs • Grammaire • Arbre de dérivation • Arbre syntactique
  • 8. Types d’analyse syntactique  Descendent (top-down)  Avec backtracking  Prédictive  Descendent récursive, LL avec un tableau  Ascendant (bottom-up)  Avec backtracking  Shift-reduce  LR(0),SLR,LALR, LR canonique
  • 9. –Instr –id = Expr ; –id = ( Expr ) ; –id = ( Expr + Expr ) ; –id = ( id + Expr ) ; –id = ( id + id ) ; id = ( id + id ) ; id = ( id + id ) ; id = ( id + id ) ; id = ( id + id ) ; id = ( id + id ) ; id = ( id + id ) ; • LL: La chaîne de jetons est itérée à partir du côté gauche (L) • Le non-terminal le plus à gauche est dérivé (L) Dérivation gauche, top down
  • 10. –Instr –id = Expr ; –id = Expr + Expr ; –id = ( Expr ) + Expr ; –id = ( id ) + Expr ; –id = ( id ) + ( Expr ) ; –id = ( id + id ) ; id = ( id ) + ( id ) ; id = ( id ) + ( id ) ; id = ( id ) + ( id ) ; id = ( id ) + ( id ) ; id = ( id ) + ( id ) ; id = ( id ) + ( id ) ; id = ( id ) + ( id ) ; • Comment choisir la production utilisée pour la dérivation? • Backtracking? Dérivation gauche, top down
  • 11. Parser LL, LR  Nous devrions éviter backtracking  Une grammaire qui permet le parser déterministe  LL(k) lit left-to-right, dérivation left  LR(k) lit left-to-right, dérivation right  K – lookahead (combien de tokens sont lus)  LL(k) < LR(k)  L'algorithme est indépendant du langage, la grammaire dépend du langage
  • 12. Analyse descendent récursive  Non-terminal -> fonction  Si le symbole apparaît dans la partie droite de production -> appel la fonction  Si le symbole apparaît dans la partie gauche de production – la production est choisi en fonction des jetons (tokens) suivants (lookahead)
  • 13. MatchToken (token) { if (lookahead != token) throw error(); lookahead = lexer.getNextToken(); } rfc850-date = weekday "," SP date2 SP time SP "GMT“ ParseRFC850Date() { ParseWeekDay(); MatchToken(","); MatchToken(SP); ParseDate2(); MatchToken(SP); ParseTime(); MatchToken(SP); MatchToken("GMT“); } Fonction pour parser le non- terminal rfc850-date Analyse descendent récursive
  • 14. Avec la grammaire E  E + T | T T  T  F | F F  ( E ) | id Un parser descendant entre dans une boucle infinie lorsque vous essayez de parser cette grammaire E E +E T E +E T +E T E +E T +E T +E T (Aho,Sethi,Ullman, pp. 176) Récursivité gauche
  • 15. Grammaire des expression E  E + T | T T  T  F | F F  ( E ) | id Peut être écrive sans la récursivité gauche E  TE’ E’  +TE’ |  T  FT’ T’  FT’ |  F  ( E ) | id (Aho,Sethi,Ullman, pp. 176) ε – string vide Récursivité gauche
  • 16. Exemple de parser récursive ParseE() { ParseT(); ParseE1(); } ParseE1() { if (lookahead==“+”) { MatchToken(“+”); ParseT(); ParseE1(); } } ParseT() { ParseF(); ParseT1(); } ParseT1() { if (lookahead==“*”) { MatchToken(“*”); ParseF(); ParseT1(); } } E  TE’ E’  +TE’ |  T  FT’ T’  FT’ |  F  ( E ) | id ParseF() { if (lookahead == “(“) { MatchToken(“(“); ParseE(); MatchToken(“)”); } else MatchToken(T_ID); }
  • 17. Comment choisir entre deux productions? Comment pouvons-nous savoir quelles conditions de poser a if? Lorsque nous émettons des erreurs? ParseF() { if (lookahead == “(“) { MatchToken(“(“); ParseE(); MatchToken(“)”); } else if (lookahead == T_ID) { MatchToken(T_ID); } else throw error(); } F  ( E ) F  id T’  *FT’ T’  ε ParseT1() { if (lookahead==“*”) { MatchToken(“*”); ParseF(); ParseT1(); } else if (lookahead == “+”) { } else if (lookahead == “)”) { } else if (lookahead == T_EOF) { } else throw error(); } Analyse descendent récursive
  • 18. Les conditions pour if • FIRST – Ensemble de terminaux-préfixées pour le non-terminal • FOLLOW – Ensemble de terminaux suivantes pour le non-terminal • NULLABLE – Ensemble de non-terminaux qui peut etre derive en ε
  • 19. FIRST E  TE’ E’  +TE’ |  T  FT’ T’  FT’ |  F  ( E ) | id GRAMMAIRE: 1. If X is a terminal, FIRST(X) = {X} FIRST(id) = {id} FIRST() = {} FIRST(+) = {+} ENSEBLES: 2. If X   , then   FIRST(X) 4. If X  Y1 Y2 ••• Yk FIRST(() = {(} FIRST()) = {)} FIRST (pseudocode): and a FIRST(Yi) then a  FIRST(X) FIRST(F) = {(, id} FIRST(T) = FIRST(F) = {(, id} FIRST(E) = FIRST(T) = {(, id} FIRST(E’) = {} {+, } FIRST(T’) = {} {, } (Aho,Sethi,Ullman, pp. 189) * 3. If X  Y1 Y2 ••• Yk and Y1••• Yi-1  and a FIRST(Y1) then a  FIRST(X)
  • 20. E  TE’ E’  +TE’ |  T  FT’ T’  FT’ |  F  ( E ) | id GRAMMAIRE: 1. If S is the start symbol, then $  FOLLOW(S) FOLLOW(E) = {$} FOLLOW(E’) = { ), $} ENSEBLES: 2. If A  B, and a  FIRST() and a   then a  FOLLOW(B) 3. If A  B and a  FOLLOW(A) then a  FOLLOW(B) FOLLOW – pseudocode: { ), $} 3a. If A  B and and a  FOLLOW(A) then a  FOLLOW(B) *   FOLLOW(T) = { ), $} FIRST(F) = {(, id} FIRST(T) = {(, id} FIRST(E) = {(, id} FIRST(E’) = {+, } FIRST(T’) = { , }  et  - string de terminaux et non- terminaux A et B – non-terminaux, $ - fin du text (Aho,Sethi,Ullman, pp. 189) FOLLOW
  • 21. E  TE’ E’  +TE’ |  T  FT’ T’  FT’ |  F  ( E ) | id 1. If S is the start symbol, then $  FOLLOW(S) FOLLOW(E) = {), $} FOLLOW(E’) = { ), $} 3. If A  B and a  FOLLOW(A) then a  FOLLOW(B) 3a. If A  B and and a  FOLLOW(A) then a  FOLLOW(B) *   FOLLOW(T) = { ), $} FIRST(F) = {(, id} FIRST(T) = {(, id} FIRST(E) = {(, id} FIRST(E’) = {+, } FIRST(T’) = { , } 2. If A  B, and a  FIRST() and a   then a  FOLLOW(B) {+, ), $} (Aho,Sethi,Ullman, pp. 189) GRAMMAIRE: ENSEBLES: FOLLOW – règles: FOLLOW
  • 22. FOLLOW, continued (grammar, FIRST sets and rules 1 to 3a as on slide 20)
      FOLLOW(E) = {), $}    FOLLOW(E') = {), $}    FOLLOW(T) = {+, ), $}
      FOLLOW(T'): rule 3 on T → FT' gives FOLLOW(T') = FOLLOW(T) = {+, ), $}.
      (Aho, Sethi, Ullman, p. 189)
  • 23. FOLLOW, continued (grammar, FIRST sets and rules 1 to 3a as on slide 20)
      FOLLOW(E) = {), $}    FOLLOW(E') = {), $}    FOLLOW(T) = {+, ), $}    FOLLOW(T') = {+, ), $}
      FOLLOW(F): rule 3a on T → FT' (T' ⇒* ε) gives {+, ), $} so far.
      (Aho, Sethi, Ullman, p. 189)
  • 24. FOLLOW, final result (grammar, FIRST sets and rules 1 to 3a as on slide 20)
      FOLLOW(E) = {), $}    FOLLOW(E') = {), $}    FOLLOW(T) = {+, ), $}    FOLLOW(T') = {+, ), $}
      FOLLOW(F): rule 2 on T → FT' adds FIRST(T') without ε, that is *. So FOLLOW(F) = {+, *, ), $}.
      (Aho, Sethi, Ullman, p. 189)
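The step-by-step FOLLOW computation on these slides can also be written as a fixed-point loop. This is a sketch under stated assumptions: the grammar encoding, the hard-coded `FIRST` table (copied from the slides) and the function names are illustrative choices, not part of the original slides.

```python
EPS, END = "ε", "$"

# Assumed encoding: non-terminal -> list of production bodies; [] is ε.
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}
# FIRST sets as given on the slides.
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
}

def first_of_string(symbols):
    """FIRST of a sequence of symbols; a terminal is its own FIRST set."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def compute_follow(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)                          # rule 1
    changed = True
    while changed:
        changed = False
        for head, productions in grammar.items():
            for body in productions:
                for i, sym in enumerate(body):
                    if sym not in grammar:
                        continue                    # FOLLOW is only defined for non-terminals
                    beta_first = first_of_string(body[i + 1:])
                    new = beta_first - {EPS}        # rule 2
                    if EPS in beta_first:           # rules 3 and 3a (β empty or nullable)
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

if __name__ == "__main__":
    for nt, terminals in compute_follow(GRAMMAR, "E").items():
        print(nt, sorted(terminals))
```

Treating an empty β as nullable lets one branch cover both rule 3 and rule 3a.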
  • 25. The generic recursive LL(1) algorithm
      Productions: A → a B … x    A → C D … y    …
      ParseA() {
        if (lookahead in FIRST(a B … x FOLLOW(A))) { MatchToken(a); ParseB(); … MatchToken(x); }
        else if (lookahead in FIRST(C D … y FOLLOW(A))) { ParseC(); ParseD(); … MatchToken(y); }
        …
        else throw error();
      }
      Here FIRST(α FOLLOW(A)) stands for FIRST(α), plus FOLLOW(A) when α can derive ε.
      • For each non-terminal, create a parsing function.
      • For each rule A → α, add a test: if (lookahead in FIRST(α FOLLOW(A)))
      • For each non-terminal in α, call its parsing function.
      • For each terminal in α, check it against the lookahead (match).
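Applied to the expression grammar E → TE', E' → +TE' | ε, T → FT', T' → *FT' | ε, F → ( E ) | id, the recipe above gives one function per non-terminal. The following Python recognizer is a sketch under assumptions: tokens are plain strings, `"$"` marks the end of input, and the class and function names (`Parser`, `ParseError`, `accepts`) are hypothetical.

```python
class ParseError(Exception):
    pass

class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # "$" marks the end of the input
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, expected):
        if self.lookahead != expected:
            raise ParseError(f"expected {expected!r}, got {self.lookahead!r}")
        self.pos += 1

    def fail(self, rule):
        raise ParseError(f"unexpected {self.lookahead!r} while parsing {rule}")

    def parse_E(self):                        # E -> T E'
        if self.lookahead in ("(", "id"):     # FIRST(T E')
            self.parse_T(); self.parse_E1()
        else:
            self.fail("E")

    def parse_E1(self):                       # E' -> + T E' | ε
        if self.lookahead == "+":
            self.match("+"); self.parse_T(); self.parse_E1()
        elif self.lookahead in (")", "$"):    # FOLLOW(E'): choose ε
            pass
        else:
            self.fail("E'")

    def parse_T(self):                        # T -> F T'
        if self.lookahead in ("(", "id"):     # FIRST(F T')
            self.parse_F(); self.parse_T1()
        else:
            self.fail("T")

    def parse_T1(self):                       # T' -> * F T' | ε
        if self.lookahead == "*":
            self.match("*"); self.parse_F(); self.parse_T1()
        elif self.lookahead in ("+", ")", "$"):   # FOLLOW(T'): choose ε
            pass
        else:
            self.fail("T'")

    def parse_F(self):                        # F -> ( E ) | id
        if self.lookahead == "(":
            self.match("("); self.parse_E(); self.match(")")
        elif self.lookahead == "id":
            self.match("id")
        else:
            self.fail("F")

def accepts(tokens):
    """True if the token list is a sentence of the grammar."""
    parser = Parser(tokens)
    try:
        parser.parse_E()
        parser.match("$")
        return True
    except ParseError:
        return False
```

For example, `accepts(["id", "+", "id", "*", "id"])` succeeds, while `accepts(["id", "+"])` fails because F cannot start with the end marker.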
  • 26. Left recursion. A grammar with at least one production of the form A → Aα is called left-recursive. Top-down parsers do not work (without backtracking) on left-recursive grammars. (Aho, Sethi, Ullman, p. 176) The recursion need not be immediate: A → Bα, B → Aβ.
  • 27. Eliminating left recursion
      • This is done by rewriting the grammar.
      E → E + T | T    T → T * F | F    F → ( E ) | id
      becomes
      E → TE'    E' → +TE' | ε    T → FT'    T' → *FT' | ε    F → ( E ) | id
      (Aho, Sethi, Ullman, p. 176)
      List → List Item | Item
      becomes
      List → Item List'    List' → Item List' | ε
  • 28. Eliminating left recursion, general case (immediate recursion):
      A → Aβ1 | Aβ2 | ... | Aβm | α1 | α2 | ... | αn
      becomes
      A → α1A' | α2A' | ... | αnA'
      A' → β1A' | β2A' | ... | βmA' | ε
      Example: E → E + T | T    T → T * F | F    F → ( E ) | id
      becomes E → TE'    E' → +TE' | ε    T → FT'    T' → *FT' | ε    F → ( E ) | id
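The general rewriting rule for immediate left recursion is mechanical enough to sketch in a few lines of Python. The function name and the list-of-symbols encoding are assumptions for illustration, and the sketch assumes the primed name (for example `E'`) is not already used in the grammar.

```python
def eliminate_immediate_left_recursion(head, productions):
    """Rewrite A -> A β1 | ... | A βm | α1 | ... | αn as
    A -> α1 A' | ... | αn A'  and  A' -> β1 A' | ... | βm A' | ε.
    Bodies are lists of symbols; [] stands for ε."""
    new_head = head + "'"
    # The βi: what follows the leading A in each left-recursive body.
    betas = [body[1:] for body in productions if body and body[0] == head]
    # The αj: bodies that do not start with A.
    alphas = [body for body in productions if not body or body[0] != head]
    if not betas:
        return {head: productions}           # nothing to rewrite
    return {
        head: [alpha + [new_head] for alpha in alphas],
        new_head: [beta + [new_head] for beta in betas] + [[]],
    }

if __name__ == "__main__":
    print(eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]]))
```

This handles only immediate recursion; indirect recursion such as A → Bα, B → Aβ needs the productions to be substituted into each other first.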
  • 29. Left factoring • For an if statement: • To parse it with LL, the grammar must be left-factored:
  • 30. Left factoring
      General case: A → αβ1 | αβ2 | ... | αβn | δ
      Factored: A → αA' | δ    A' → β1 | β2 | ... | βn
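The factoring rule can be sketched as a single pass that groups the alternatives of one non-terminal by their first symbol and pulls out each group's longest common prefix. The function names and the encoding are assumptions, and the if/else grammar in the usage example is the usual textbook illustration, not necessarily the one shown on the slide.

```python
def common_prefix(bodies):
    """Longest common prefix (as a list of symbols) of several bodies."""
    prefix = []
    for symbols in zip(*bodies):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix

def left_factor(head, productions):
    """One pass of left factoring for a single non-terminal.
    Bodies are lists of symbols; [] stands for ε."""
    groups = {}
    for body in productions:
        key = body[0] if body else None
        groups.setdefault(key, []).append(body)
    result = {head: []}
    count = 0
    for key, bodies in groups.items():
        if key is None or len(bodies) == 1:
            result[head].extend(bodies)          # the δ alternatives stay as they are
            continue
        count += 1
        new_head = head + "'" * count            # A', A'', ... (assumed to be unused names)
        alpha = common_prefix(bodies)
        result[head].append(alpha + [new_head])  # A -> α A'
        result[new_head] = [body[len(alpha):] for body in bodies]   # A' -> β1 | ... | βn
    return result

if __name__ == "__main__":
    rules = [["if", "E", "then", "S", "else", "S"],
             ["if", "E", "then", "S"],
             ["other"]]
    print(left_factor("S", rules))
```

The new alternatives may themselves share a prefix, so a full implementation would repeat the pass on the generated non-terminals until nothing changes.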
  • 31. Eliminating ambiguity
      Ambiguous: E → E + E | E * E | a | ( E )
      1. E → E + T | T    T → T * F | F    F → a | ( E )    (left-associative operators)
      2. E → T + E | T    T → F * T | F    F → a | ( E )    (right-associative operators)
      • Operator precedence
      • Left or right associativity
  • 32. Eliminating ambiguity
      Productions that can cause ambiguity: X → aAbAc
      General case: A → A B A | α1 | α2 | ... | αn
      Disambiguated: A → A' B A | A'    A' → α1 | α2 | ... | αn
  • 33. Automatic parser • Push-down automaton • The parser is built from an automaton and a table • The grammar is LL(1) if the table contains no conflicts
  • 34. Example LL parser
      GRAMMAR: E → TE'    E' → +TE' | ε    T → FT'    T' → *FT' | ε    F → ( E ) | id
      PARSING TABLE (rows: non-terminals; columns: input symbols):
      NON-TERMINAL | id      | +         | *         | (         | )      | $
      E            | E → TE' |           |           | E → TE'   |        |
      E'           |         | E' → +TE' |           |           | E' → ε | E' → ε
      T            | T → FT' |           |           | T → FT'   |        |
      T'           |         | T' → ε    | T' → *FT' |           | T' → ε | T' → ε
      F            | F → id  |           |           | F → ( E ) |        |
      (Aho, Sethi, Ullman, p. 188)
  • 35. Example LL parser (predictive parsing program; parsing table as on slide 34)
      INPUT: id + id * id $
      STACK: E $  becomes  T E' $
      ACTION: top of stack E, lookahead id: apply M[E, id] = E → TE'
      OUTPUT (tree so far): E with children T and E'
  • 36. Example LL parser (parsing table as on slide 34)
      INPUT: id + id * id $
      STACK: T E' $  becomes  F T' E' $
      ACTION: top of stack T, lookahead id: apply M[T, id] = T → FT'
      OUTPUT (tree so far): E → T E', T → F T'
      (Aho, Sethi, Ullman, p. 186)
  • 37. Example LL parser (parsing table as on slide 34)
      INPUT: id + id * id $
      STACK: F T' E' $  becomes  id T' E' $
      ACTION: top of stack F, lookahead id: apply M[F, id] = F → id
      OUTPUT (tree so far): E → T E', T → F T', F → id
      (Aho, Sethi, Ullman, p. 188)
  • 38. Example LL parser (parsing table as on slide 34)
      INPUT: id + id * id $
      STACK: id T' E' $  becomes  T' E' $
      ACTION: when Top(Stack) = input ≠ $, pop the stack and advance the input tape. Here id is matched and the lookahead becomes +.
      OUTPUT (tree so far): E → T E', T → F T', F → id
      (Aho, Sethi, Ullman, p. 188)
  • 39. Example LL parser (parsing table as on slide 34)
      INPUT: + id * id $ (remaining)
      STACK: T' E' $  becomes  E' $
      ACTION: top of stack T', lookahead +: apply M[T', +] = T' → ε
      OUTPUT (tree so far): E → T E', T → F T', F → id, T' → ε
      (Aho, Sethi, Ullman, p. 188)
  • 40. Example LL parser
      The parser continues in the same way and builds the derivation tree. After T' → ε it applies, in order:
      E' → +TE'    T → FT'    F → id    T' → *FT'    F → id    T' → ε    E' → ε
      When Top(Stack) = input = $, the parser stops and accepts the input.
      (Aho, Sethi, Ullman, p. 188)
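The trace on these slides corresponds to a small loop over an explicit stack. The following Python is a sketch of such a table-driven predictive parser; the dictionary encoding of the parsing table and the names `TABLE`, `NONTERMINALS` and `ll1_parse` are assumptions for illustration.

```python
# Parsing table M[non-terminal, lookahead] -> production body ([] is ε).
TABLE = {
    ("E", "id"): ["T", "E'"],        ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"],   ("E'", ")"): [],   ("E'", "$"): [],
    ("T", "id"): ["F", "T'"],        ("T", "("): ["F", "T'"],
    ("T'", "+"): [],                 ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [],                 ("T'", "$"): [],
    ("F", "id"): ["id"],             ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

def ll1_parse(tokens, start="E"):
    """Return the productions applied, in order (a leftmost derivation).
    Raises SyntaxError when the input is rejected."""
    tokens = list(tokens) + ["$"]
    stack = ["$", start]              # the top of the stack is the end of the list
    pos = 0
    output = []
    while stack:
        top = stack.pop()
        lookahead = tokens[pos]
        if top in NONTERMINALS:
            body = TABLE.get((top, lookahead))
            if body is None:          # empty table cell: syntax error
                raise SyntaxError(f"no entry M[{top}, {lookahead}]")
            output.append((top, body))
            stack.extend(reversed(body))   # leftmost body symbol ends up on top
        elif top == lookahead:
            pos += 1                  # match a terminal (or the final $)
        else:
            raise SyntaxError(f"expected {top!r}, got {lookahead!r}")
    return output

if __name__ == "__main__":
    for head, body in ll1_parse(["id", "+", "id", "*", "id"]):
        print(head, "->", " ".join(body) or "ε")
```

For the input id + id * id the printed productions follow the order of the trace: E → TE', T → FT', F → id, T' → ε, then the seven productions listed on the last slide of the example.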
  • 41. Filling in the table
      • FIRST: the set of terminals that can begin a string derived from the non-terminal
      • FOLLOW: the set of terminals that can appear immediately after the non-terminal
      • NULLABLE: the set of non-terminals that can derive ε
  • 42. Rules for building the parsing table
      GRAMMAR: E → TE'    E' → +TE' | ε    T → FT'    T' → *FT' | ε    F → ( E ) | id
      FIRST SETS: FIRST(F) = {(, id}    FIRST(T) = {(, id}    FIRST(E) = {(, id}    FIRST(E') = {+, ε}    FIRST(T') = {*, ε}
      FOLLOW SETS: FOLLOW(E) = {), $}    FOLLOW(E') = {), $}    FOLLOW(T) = {+, ), $}    FOLLOW(T') = {+, ), $}    FOLLOW(F) = {+, *, ), $}
      1. If A → α: if a ∈ FIRST(α), add A → α to M[A, a]
      PARSING TABLE:
      NON-TERMINAL | id      | +         | *         | (         | )      | $
      E            | E → TE' |           |           | E → TE'   |        |
      E'           |         | E' → +TE' |           |           | E' → ε | E' → ε
      T            | T → FT' |           |           | T → FT'   |        |
      T'           |         | T' → ε    | T' → *FT' |           | T' → ε | T' → ε
      F            | F → id  |           |           | F → ( E ) |        |
      (Aho, Sethi, Ullman, p. 190)
  • 47. Rules for building the parsing table (grammar, FIRST and FOLLOW sets and table as on slide 42)
      1. If A → α: if a ∈ FIRST(α), add A → α to M[A, a]
      2. If A → α: if ε ∈ FIRST(α), add A → α to M[A, b] for each terminal b ∈ FOLLOW(A)
      Rule 2 fills the ε entries: E' → ε under ) and $, and T' → ε under +, ) and $.
      (Aho, Sethi, Ullman, p. 190)
  • 49. Rules for building the parsing table (grammar, FIRST and FOLLOW sets and table as on slide 42)
      1. If A → α: if a ∈ FIRST(α), add A → α to M[A, a]
      2. If A → α: if ε ∈ FIRST(α), add A → α to M[A, b] for each terminal b ∈ FOLLOW(A)
      3. If A → α: if ε ∈ FIRST(α) and $ ∈ FOLLOW(A), add A → α to M[A, $]
      (Aho, Sethi, Ullman, p. 190)
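The three table-construction rules translate directly into a loop over the productions. The sketch below hard-codes the FIRST and FOLLOW sets from the slides; the encoding and the names `build_table` and `first_of_string` are assumptions. Because `$` is stored inside the FOLLOW sets here, rules 2 and 3 collapse into one step.

```python
EPS = "ε"

# Assumed encoding: non-terminal -> list of production bodies; [] is ε.
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}
# FIRST and FOLLOW sets as given on the slides.
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
}
FOLLOW = {
    "E": {")", "$"}, "E'": {")", "$"},
    "T": {"+", ")", "$"}, "T'": {"+", ")", "$"},
    "F": {"+", "*", ")", "$"},
}

def first_of_string(symbols):
    """FIRST of a production body; a terminal is its own FIRST set."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def build_table(grammar):
    """Return (table, conflicts); the grammar is LL(1) when conflicts is empty."""
    table, conflicts = {}, []
    for head, productions in grammar.items():
        for body in productions:
            body_first = first_of_string(body)
            targets = body_first - {EPS}          # rule 1
            if EPS in body_first:                 # rules 2 and 3
                targets |= FOLLOW[head]
            for terminal in targets:
                if (head, terminal) in table:     # two productions in one cell
                    conflicts.append((head, terminal))
                table[(head, terminal)] = body
    return table, conflicts

if __name__ == "__main__":
    table, conflicts = build_table(GRAMMAR)
    print(len(table), "entries,", len(conflicts), "conflicts")
```

The conflict list is what makes the "no conflicts in the table" criterion checkable: a left-recursive or unfactored grammar would put two productions into the same cell.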
  • 50. Using an LL(1) parser
      The grammar must be:
      • unambiguous
      • left-factored
      • not left-recursive
      It can be shown that a grammar G is LL(1) if and only if, for every two productions A → α and A → β with α ≠ β, the following conditions hold:
      • FIRST(α) ∩ FIRST(β) = ∅
      • if α ⇒* ε then FIRST(β) ∩ FOLLOW(A) = ∅, and if β ⇒* ε then FIRST(α) ∩ FOLLOW(A) = ∅
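The two conditions can be checked for one pair of alternatives once their FIRST sets and FOLLOW(A) are known. This Python sketch takes the three sets as plain arguments; the function name `ll1_compatible` and the use of the string `"ε"` as a set member are assumptions for illustration.

```python
EPS = "ε"

def ll1_compatible(first_alpha, first_beta, follow_a):
    """Check the LL(1) conditions for two alternatives A -> α and A -> β,
    given FIRST(α), FIRST(β) and FOLLOW(A) as Python sets."""
    if first_alpha & first_beta:
        return False      # a shared terminal, or both alternatives derive ε
    if EPS in first_alpha and first_beta & follow_a:
        return False      # α can vanish and β starts with a follower of A
    if EPS in first_beta and first_alpha & follow_a:
        return False      # β can vanish and α starts with a follower of A
    return True

if __name__ == "__main__":
    # E' -> + T E' | ε  with FOLLOW(E') = {), $}
    print(ll1_compatible({"+"}, {EPS}, {")", "$"}))
    # E -> E + T | T (left-recursive): both alternatives start with ( or id
    print(ll1_compatible({"(", "id"}, {"(", "id"}, {")", "$", "+"}))
```

The first call corresponds to the two alternatives of E' in the transformed grammar; the second shows why the original left-recursive rule for E fails the first condition.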
  • 51. Advantages and disadvantages of LL(1)
      • Easy to write by hand
      • Fast, easy to understand
      • The grammar must be transformed
      • The derivation tree differs from the semantic tree
      [Figure: the derivation tree produced by the LL(1) grammar (with E', T' and ε nodes) next to the smaller tree that uses only E, T and F]
  • 52. LL parser • ANTLR – Java – LL(*) – Factoring
  • 53. EBNF rules
      Something?    Something*    Something+
      SomethingQ → ε | Something
      SomethingStar → ε | Something SomethingStar
      SomethingPlus → Something SomethingStar
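The three EBNF operators can be expanded into plain productions mechanically. The sketch below follows the naming used on the slide (`Q`, `Star`, `Plus` suffixes); the function name `desugar` and the list encoding (with `[]` for ε) are assumptions for illustration.

```python
def desugar(name, operator):
    """Expand name?, name* or name+ into plain BNF productions.
    Returns a dict: new non-terminal -> list of bodies ([] stands for ε)."""
    star = name + "Star"
    star_rules = {star: [[], [name, star]]}          # Star -> ε | name Star
    if operator == "?":
        return {name + "Q": [[], [name]]}            # Q -> ε | name
    if operator == "*":
        return star_rules
    if operator == "+":
        # Plus -> name Star, and Star is needed as a helper.
        return {name + "Plus": [[name, star]], **star_rules}
    raise ValueError(f"unknown EBNF operator {operator!r}")

if __name__ == "__main__":
    print(desugar("Something", "+"))
```

The generated Star rule is right-recursive, so it stays usable by an LL parser without further rewriting.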
  • 54. Topics • Parsers • LL – Avoiding ambiguity – Factoring – Avoiding left recursion • General recursive LL algorithm