SlideShare a Scribd company logo
6
Most read
7
Most read
9
Most read
UNIT – III
Context Free Grammars
• Formal Languages, Grammars,
• Classification of Grammars,
• Chomsky Hierarchy Theorem,
• Context Free Grammar,
• Leftmost and Rightmost Derivations,
• Parse Trees,
• Ambiguous Grammars,
• Simplification of Context Free Grammars-Elimination of Useless Symbols,
• E- Productions and Unit Productions,
• Normal Forms for Context Free Grammars:
▪ Chomsky Normal Form
▪ Greibach Normal Form,
• Pumping Lemma, Closure Properties,
• Applications of Context Free Grammars.
Grammar: Grammar is a set of rules used to define a valid sentence in any language.
Mathematical representation of grammar: Grammar is defined as a 4-tuple G={V,T,P,S}
where
V - finite set of non-terminals
T - finite set of terminals
P – finite set of production rules
S – Start symbol
“Nonterminals”arerepresentedbycapitalletters A,B,C,…soon.
“Terminals” are represented by small letters a, b, c,… so on.
“Production rules” are of the form α  β, where
α contains non-terminal symbols or may also contain both terminals
and non-terminals with atleast one non-terminal.
β contains terminal, non-terminal symbols or any combination of both
terminals and non-terminals.
Formal Grammar: Grammar is an useful model when designing software that processes
data with recursive structure. i.e., It denotes the structure of data.
Formal Language: Formal language is an abstraction for general characteristics of
programming languages.
Classification of Grammars:
Chomsky Hierarchy: Noam Chomsky classified grammar into 4 types depending on
production rules (.
1. Type 0 – Unrestricted Grammar
2. Type 1 – Context Sensitive Grammar (CSG)
3. Type 2 – Context Free Grammar (CFG)
4. Type 3 – Regular Grammar
Type0 Grammar: It is also known as “unrestricted grammar” or “phase structured
grammar”. An unrestricted grammar is a formal grammar in which no restrictions are made
on LHS and RHS of production.
Mathematical representation of Type 0 grammar: Type 0 Grammar is defined as a 4-tuple
G={V,T,P,S} where
V - finite set of non-terminals
T - finite set of terminals
P – finite set of production rules of the form αAβ → αγβ
Where α – left context, β – right context,
α, β ∈ (V∪T)+
and γ ∈ (V∪T)*,
i.e., no epsilon (ɛ) productions are allowed on LHS but epsilon
(ɛ) productions are allowed on RHS.
S – Start symbol
The language that can be generated by making use of type 0 grammar is “recursive
enumerable language”. The machine format that is used to recognize strings of that particular
grammar is “turing machine”.
E.g: S→aSBA | abA
AB→BA
bB→bb
bA→ba
aA→aa
Type 1 Grammar: It is also known as “Context Sensitive Grammar (CSG)”. Context
Sensitive Grammar (CSG) is a formal grammar in which LHS and RHS of any production
rule may be surrounded by a context of terminals and non-terminals.
Mathematical representation of Type 1 grammar: Type 1 Grammar is defined as a 4-tuple
G={V,T,P,S} where
V - finite set of non-terminals
T - finite set of terminals
P – finite set of production rules of the form αAβ → αγβ
Where α – left context, β – right context,
α, β ∈ (V∪T)+
and γ ∈ (V∪T)+
i.e., no epsilon (ɛ) productions are allowed on LHS and RHS.
S – Start symbol
The language that can be generated by making use of type 1 grammar is “Context Sensitive
language (CSL)”. The machine format that is used to recognize strings of that particular
grammar is “Linear Bounded Automata (LBA)”.
E.g: S→abc | aSBC
CB→BC
bB→bb
Type 2 Grammar: It is also known as “Context Free Grammar (CFG)”. Context Free
Grammar (CFG) is a formal grammar in which LHS and RHS of any production rule is free
from a context of terminals and non-terminals.
Mathematical representation of Type 2 grammar: Type 2 Grammar is defined as a 4-tuple
G={V,T,P,S} where
V - finite set of non-terminals
T - finite set of terminals
P – finite set of production rules of the form A → γ
Whereandγ∈ (
V
∪T
)
*,
i.e., A (V∪T)*
,
i.e., LHS of any production rule contains only a single non-
terminal and epsilon (ɛ) productions are allowed on RHS.
S – Start symbol
The language that can be generated by making use of type 2 grammar is “Context Free
language (CFL)”. The machine format that is used to recognize strings of that particular
grammar is “Push Down Automata (PDA)”.
E.g: S→aSb
S→ab
Type 3 Grammar: It is also known as “Regular Grammar (RG)”. Regular Grammar (RG) is
a formal grammar with restrictions on both LHS and RHS. i.e., LHS contains only a single
non-terminal and RHS may contain single terminal (or) single terminal followed by non
terminal (or) nothing.
Mathematical representation of Type 3 grammar: Type 3 Grammar is defined as a 4-tuple
G={V,T,P,S} where
V - finite set of non-terminals
T - finite set of terminals
P – finite set of production rules of the form A t | t
B
|
ɛ
,
Where t – terminals
A,B – non terminals
S – Start symbol
The language that can be generated by making use of type 3 grammar is “Regular language
(RL)”. The machine format that is used to recognize strings of that particular grammar is
“Finite Automata (FA)”.
E.g: S→aS
S→a
Types of formal languages: Based on properties of Chomsky hierarchy of grammars, formal
languages are classified into 4 types.
1. Regular Language
2. Context Free Language
3. Context Sensitive Language
4. Recursive Language and Recursive Enumerable Language
Regular Language: It is a formal language generated by regular grammar.
Properties of Regular Language:
1. It is accepted by DFA, NFA and read only turing machine.
2. It is described by regular expressions.
3. It is generated by regular grammar.
Closure Properties of Regular Language:
Two regular languages are closed under the following.
1. Union
2. Concatenation
3. Complementation
4. Intersection
5. Kleene closure
6. Transpose
7. Substitution
8. Homomorphism
9. Inverse homomorphism
Methods used to prove the given language is not regular – pumping lemma.
E.g: L={an
| n≥1} is a regular language generated by regular grammar
S→aS
S→a
Context Free Language: It is a formal language generated by context free grammar.
Properties of Context Free Language:
1. It is accepted by pushdown automata.
2. It is generated by context free grammar.
3. It doesn’t satisfy operations such as intersection, complement, difference.
Closure Properties of Context Free Language:
Two context free languages are closed under the following.
1. Union
2. Concatenation
3. Kleene closure
4. Transpose
5. Substitution
6. Homomorphism
7. Inverse homomorphism
Methods used to prove the given language is not context free – pumping lemma.
E.g: L={an
bn
| n≥1} is a context free language generated by context free grammar
S→aSb
S→ab
Context Sensitive Language: It is a formal language generated by context sensitive
grammar.
Properties of Context Sensitive Language:
1. It is accepted by pushdown automata.
2. It is generated by context sensitive grammar.
3. It doesn’t satisfy operations such as complementation, difference.
Closure Properties of Context Sensitive Language:
Two context sensitive languages are closed under the following.
1. Union
2. Concatenation
3. Kleene closure
4. Intersection
5. Transpose
Methods used to prove the given language is not context sensitive – pumping lemma.
E.g: L={an
bn
cn
| n≥1} is a regular language generated by context sensitive grammar
S→abc | aSBC
CB→BC
bB→bb
Recursive Language: Any language L over alphabet ∑ is said to be recursive, if there is a
turing machine that accepts every word present in the language L and rejects every word
which is not in L.
Properties of Recursive Language:
1. It is accepted by turing machine.
2. It is generated by unrestricted grammar.
Closure Properties of Recursive Language:
Two recursive languages are closed under the following.
1. Union
2. Concatenation
3. Complementation
4. Intersection
5. Kleene closure
Recursive Enumerable Language: A recursive enumerable language L over alphabet ∑
accepts every word in L of turing machine M (or) either rejects or loops every word which is
not in L.
Accept=L
Reject + Loop = L̄
i.e., a turing machine on same input can either accept a language or else, will run in loop
forever.
Properties of Recursive Enumerable Language:
1. It is accepted by turing machine.
2. It is generated by unrestricted grammar.
Closure Properties of Recursive Enumerable Language:
Two recursive enumerable languages are closed under the following.
1. Union
2. Concatenation
3. Intersection
4. Kleene closure
E.g: L={an
bn
an
| n≥0} is a recursive enumerable language generated by unrestricted
grammar S→aSBA | abA
AB→BA
bB→bb
bA→ba
aA→aa
Context Free Grammar:
A grammar G=(V,T,P,S) is said to be context free grammar if it is recursive and free from
context. i.e., productions are in the form of A (V∪T)*
in which LHS of any productionr
u
l
e
contains only a single non-terminal and epsilon (ɛ) productions are allowed on RHS.
CFG is formally defined as G=(V,T,P,S) where
V - finite set of non-terminals
T - finite set of terminals
P – finite set of production rules of the form A (
V
∪
T
)
*
,
S – Start symbol
Backus naur Form:
It was proposed by John Backus and Peter Naur. If there are many productions with same
LHS variable, then they can be combined into single production LHS → RHS, where
LHS is common variable of LHS of productions
LHS is RHS productions separated by vertical bar ( | )
E.g: S→aS
S→ɛ
ɛ should be written at the end. Therefore, S→aS | ɛ
Derivation:
The process of generating a string from the production rule of a grammar is called derivation.
Types of Derivation:
1. Leftmost derivation
2. Rightmost derivation
Leftmost derivation:
A derivation is called leftmost derivation, if leftmost variable is replaced with some
production rule of a grammar.
E.g: S→aSbS | ab
Let the string to be derived is aabbab.
LMD: S→aSbS
→aabbS
→aabbab
Rightmost derivation:
A derivation is called rightmost derivation, if rightmost variable is replaced with some
production rule of a grammar.
E.g: S→aSbS | ab
Let the string to be derived is aabbab.
RMD: S→aSbS
→aSbab
→aabbab
Derivation tree:
The diagrammatical representation of the derivation process by a tree structure is known as
“derivation tree”.
It is also known as “parse tree” because it is created by parser (second phase of compiler) to
check syntax of the statement.
It is also defined as an ordered tree in which LHS of production rule represents parent node
and RHS of production rule represents children nodes.
Properties of Derivation tree:
In derivation tree of CFG,
1. Root node is represented starting variable.
2. Leaf nodes are represented by terminals or epsilon (ɛ).
3. Internal nodes are represented by non- terminals.
4. If A→α1 α2 α3... αn then A is parent of nodes α1 α2 α3... αn
Ambiguous Grammars:
If more than one parse tree (or) more than one LMD (or) more than one RMD is possible to
derive the same string, then that grammar is said to be “Ambiguous grammar”, Otherwise it
is known as unambiguous grammar.
Yield of a derivation tree:
The string obtained by reading only the leaves of derivation tree from left to right (ɛ-symbols
are ignored) is known as “yield” of a derivation tree.
Simplification of Context Free Grammars:
Purpose: CFG may contain different types of useless symbols, unit productions, null
productions. These may increase number of steps in generating a language from CFG.
Therefore simplification of CFG will reduce variables and productions, so that time
complexity for generating language become less.
Procedure to simplify CFG:
CFG can be simplified by performing the following steps:
1. Elimination of ɛ-Productions
2. Elimination of Unit Productions
3. Elimination of Useless Productions
Elimination of ɛ-Productions:
ɛ-Productions (null production): A production in the form of “A→ɛ” is called as null
production.
Procedure to eliminate ɛ-Productions:
1. Identify ɛ-Productions in given CFG.(in the form ofA→ɛ)
2. To eliminate A→ɛ, then write all the productions from CFG whose right side contains
A.
3. Replace each occurrence of A with ‘ɛ’ to get production without ‘ɛ’and eliminate
A→ɛ.
4. The obtained productions in step3 must be added to the grammar.
5. A→ɛ is allowed only if A is not used on RHS of anyproduction.
Elimination of unit-Productions:
Unit-Productions: A production in the form of “A→B” is called as unit production.
Procedure to eliminate Unit-Productions:
1. Identify unit-Productions in given CFG.(in the form of A→B)
2. To eliminate A→B, then write all the productions from CFG whose left side contains A.
3. Replace B with ‘A’and eliminate A→B.
4. The obtained productions in step3 must be added to the grammar.
Elimination of useless-Productions:
Useless-Productions: Useless productions are of 2 types.
1. Non-generating productions
2. Non-reachable productions
Non-generating productions: Non-generating productions are the productions which do not
produce terminal string. Non-generating symbols are the symbols because of which productions
cannot produce terminal string.
Non-reachable productions: Non-reachable productions are the productions which cannot be
reached from starting symbol.
Procedure to eliminate useless-Productions:
1. Identify non generating symbols in given CFG.
2. To eliminate non generating symbol A, then remove all the productions from CFG
containing A on LHS or RHS.
3. Identify non reachable productions and remove them.
4. If start symbol is non generating symbol, it cannot be removed.
5. If symbol is non reachable but generating symbol, it cannot be removed.
Normalization of CFG:
The productions in CFG are of the form A→ (V∪T)*
. i.e., there are no restrictions on RHS o
f
CFG. There may be any number of terminals and non terminals in any combination.
If we made restrictions on RHS of CFG specifying that there should be fixed number of terminals
and non terminals, it results in normal form.
Normal form:
If every production of CFG is converted into some standard form, grammar is said to be in
normal form.
Use of normalization:
If some standard form is specified, then complexity of automata design will be less and easy to
implement the automata.
Types of normal forms:
Normal forms are of 2 types. They are
1. Chomsky Normal Form (CNF)
2. Greibach Normal Form (GNF)
Chomsky Normal Form (CNF):
Let G be the CFG. The grammar G=(V,T,P,S) is said to be in CNF, if all productions are of the
form A → BC (or) A → a, where A, B, C ∈ V(non terminals) and a ∈ T (terminals).
i.e., if grammar is in CNF, RHS should contain only one terminal or two non terminals.
Conversion from CFG to CNF:
1. Simplify the given CFG – Eliminate ɛ-Productions, Unit Productions, and Useless
Productions.
2. For the productions which are not in CNF, (having non terminals and terminals on RHS),
replace each terminal with new non terminal.
3. For the productions which are not in CNF, (having more than 2 non terminals on RHS),
replace 2 non terminals with new non terminal.
4. Repeat 2,3 steps until we get CNF grammar.
Greibach Normal Form (GNF):
1. Let G be the CFG. The grammar G=(V,T,P,S) is said to be in GNF, if all productions are
of the form A → aB*,
2. where A, B ∈ V(non terminals) and a ∈ T (terminals).
3. i.e., if grammar is in GNF, RHS should contain single terminal followed by any number
of non terminals.
Left Recursion:
A grammar is said to be left recursive, if it contains production of the form
A → Aα1| Aα2| Aα3|…| Aαn| β1| β2| β3|…| βn
where RHS contains αi starting with A and βi which does not starts with A
Elimination of Left Recursion:
Let A → Aα | β is the production having left recursion. To eliminate left recursion, introduce a
new variable A’ such that
A → βA’ A →
αA’|ɛ
Conversion from CFG to GNF:
1. Convert the given CFG into CNF.
2. Replace the names of variables with A1,A2, A3,… An (indexedform).
3. Convert it into GNF using the following rules
i. If production is of the form Ai→Aj Ak , i>j, then convert it into Ai→Aj Ak , i<j by
substituting Aj with itsRHS.
ii. If production is of the form Ai→Aj Ak , i=j, then eliminate left recursion and
eliminate ɛ productions .
iii. If production is of the form Ai→Aj Ak , i<j, then convert substitute Aj with its
RHS to obtain GNF grammar.
Pumping Lemma for context free languages:
Every language is not context free. To show that L is not context free, pumping lemma is used.
Pumping means “generating”. Pumping lemma means “it is a method of generating many input
strings from a given string.” i.e., it is used to break a given long input string into several
substrings.”
Statement of pumping lemma:
1. Assume that the given language L is context free accepted by pushdown automata M.
2. Choose a string z c L such that |z| ≥ n for some n.
3. Break the string into 5 strings u,v,w,x,y i.e., z=uvwxy such that
i. |vx| ≥ 1
ii. |vwx| ≤ n
4. By using pumping lemma, pump v, x to the suitable integer ‘i’ such that uvi
wxi
y ∉ L for
i≥0. This contradicts our assumption.
5. Therefore L is not context free.
Closure Properties of Context Free Languages:
Closure property: A set is closed under an operation if applying that operation to elements of the
set results in an element of the set.
For example, the set of natural numbers is closed under addition and multiplication but not under
subtraction or division because 2-3 is negative and 1/2 is not an integer. The set of integers is
closed under addition, subtraction and multiplication but not under division.
Context free languages are closed under the following:
1. Union
2. Concatenation
3. Kleene closure
4. Transpose
Context free languages are not closed under the following:
1. Complementation
2. Intersection
Problems : Convert the given CFG to CNF.
Example: Consider the given grammar G: S → a | aA | B
A → aBB | ε
B → Aa | b
Solution:
Step 1: We will create a new production S1 → S, as the start symbol S appears on the RHS.
The grammar will be:
S1 → S
S → a | aA | B
A → aBB | ε
B → Aa | b
Step 2: As grammar G contains A → ε null production, its removal from the grammar yields:
S1 → S
S → a | aA | B
A → aBB
B → Aa | b | a
Now, as grammar G contains Unit production S → B, its removal yield:
S1 → S
S → a | aA | Aa | b
A → aBB
B → Aa | b | a
Also remove the unit production S1 → S, its removal from the grammar yields:
S0 → a | aA | Aa | b
S → a | aA | Aa | b
A → aBB
B → Aa | b | a
Step 3: In the production rule S0 → aA | Aa, S → aA | Aa, A → aBB and B → Aa, terminal a exists
on RHS with non-terminals. So we will replace terminal a with X:
S0 → a | XA | AX | b
S → a | XA | AX | b
A → XBB
B → AX | b | a
X → a
Step 4: In the production rule A → XBB, RHS has more than two symbols, removing it from
grammar yield:
S0 → a | XA | AX | b
S → a | XA | AX | b
A → RB
B → AX | b | a
X → a
R → XB
Hence, for the given grammar, this is the required CNF
Problems: convert the given CFG to GNF
Example: Consider the given grammar G: S → XB | AA
A → a | SA
B → b
X → a
Solution:
As the given grammar G is already in CNF and there is no left recursion, so we can skip step 1 and
step 2 then directly go to step 3.
The production rule A → SA is not in GNF, so we substitute S → XB | AA in the production rule A
→ SA as: S → XB | AA
A → a | XBA | AAA
B → b
X → a
The production rule S → XB and B → XBA is not in GNF, so we substitute X → a in the
production rule S → XB and B → XBA as:
S → aB | AA
A → a | aBA | AAA
B → b
X → a
Now we will remove left recursion (A → AAA), we get:
S → aB | AA
A → aC | aBAC
C → AAC | ε
B → b
X → a
Now we will remove null production C → ε, we get:
S → aB | AA
A → aC | aBAC | a | aBA
C → AAC | AA
B → b
X → a
The production rule S → AA is not in GNF, so we substitute A → aC | aBAC | a | aBA in
production rule S → AA as:
S → aB | aCA | aBACA | aA | aBAA
A → aC | aBAC | a | aBA
C → AAC
C → aCA | aBACA | aA | aBAA
B → b
X → a
The production rule C → AAC is not in GNF, so we substitute A → aC | aBAC | a | aBA in
production rule C → AAC as:
S → aB | aCA | aBACA | aA | aBAA
A → aC | aBAC | a | aBA
C → aCAC | aBACAC | aAC | aBAAC
C → aCA | aBACA | aA | aBAA
B → b
X → a
Hence, this is the GNF form for the grammar G
Derivations:
Ex 1: Derive the string "abb" for leftmost derivation and rightmost derivation using a CFG
given by, S → AB | ε
A → aB
B → Sb
Solution:
Leftmost derivation: abb Rightmost derivation:
Ex 2: Derive the string "aabbabba" for leftmost derivation and rightmost derivation using a
CFG given by, S → aB | bA
A→ a | aS | bAA
B → b | aS | aBB
Solution:
Leftmost derivation:
S
aB S → aB
aaBB B → aBB
aabB B → b
aabbS B → bS
aabbaB S → aB
aabbabS B → bS
aabbabbA S → bA
aabbabba A → a
Rightmost derivation:
S
aB S → aB
aaBB B → aBB
aaBbS B → bS
aaBbbA S → bA
aaBbba A → a
aabSbba B → bS
aabbAbba S → bA
aabbabba A → a
Ex 3: Derive the string "00101" for leftmost derivation and rightmost derivation using a
CFG given by, S → A1B
A → 0A | ε
B → 0B | 1B | ε
Solution:
Leftmost derivation:
S
A1B
0A1B
00A1B
001B
0010B
00101B
00101
Rightmost derivation: S
A1B
A10B
A101B
A101
0A101
00A101
00101
Ex 4: Construct a derivation tree for the string aabbabba for the CFG given by,
S → aB | bA
A → a | aS | bAA
B → b | bS | aBB
Solution: To draw a tree, we will first try to obtain derivation for the string aabbabba
Now, the derivation tree is as follows:
Ex 5: Show the derivation tree for string "aabbbb" with the following grammar.
S → AB | ε
A → aB
B → Sb
Solution: To draw a tree we will first try to obtain derivation for the string aabbbb
Now, the derivation tree for the string "aabbbb" is above.
Ex 6: Let us consider a grammar G with the production rule E → I
E → E + E
E → E * E
E → (E)
I → ε | 0 | 1 | 2 | ... | 9
Solution: For the string "3 * 2 + 5", the above grammar can generate two parse trees by leftmost
derivation:
Since there are two parse trees for a single string "3 * 2 + 5", the grammar G is ambiguous
Ex 1 : Check whether the given grammar G is ambiguous or not. S → aSb | SS , S → ε
Solution: For the string "aabb" the above grammar can generate two parse trees
Since there are two parse trees for a single string "aabb", the grammar G is ambiguous.
Simplification of CFG :-
Removal of Useless Symbols :- A symbol can be useless if it does not appear on the right-
hand side of the production rule and does not take part in the derivation of any string. That symbol is
known as a useless symbol.
Similarly, a variable can be useless if it does not take part in the derivation of any string. That
variable is known as a useless variable.
For Example: T → aaB | abA | aaT
A → aA
B → ab | b
C → ad
In the above example, the variable 'C' will never occur in the derivation of any string, so the
production C → ad is useless. So we will eliminate it, and the other productions are written in such a
way that variable C can never reach from the starting variable 'T'.
Production A → aA is also useless because there is no way to terminate it. If it never terminates,
then it can never produce a string. Hence this production can never take part in any derivation.
To remove this useless production A → aA, we will first find all the variables which will never lead
to a terminal string such as variable 'A'. Then we will remove all the productions in which the
variable 'B' occurs.
Elimination of ε - Production:-
The productions of type S → ε are called ε productions.
These type of productions can only be removed from those grammars that do not generate ε.
Step 1: First find out all nullable non-terminal variable which derives ε.
Step 2: For each production A → a, construct all production A → x, where x is obtained from a by
removing one or more non-terminal from step 1.
Step 3: Now combine the result of step 2 with the original production and remove ε productions.
Ex: Remove the production from the following CFG by preserving the meaning of it.
S → XYX
X → 0X | ε
Y → 1Y | ε
Solution: Now, while removing ε production, we are deleting the rule X → ε and Y → ε. To
preserve the meaning of CFG we are actually placing ε at the right-hand side whenever X and Y
have appeared. Let us take
S → XYX If the first X at right-hand side is ε. Then
S → YX Similarly if the last X in R.H.S. = ε. Then
S → XY If Y = ε then
S → XX If Y and X are ε then,
S → X If both X are replaced by ε
S → Y Now,
S → XY | YX | XX | X | Y Now let us consider
X → 0X If we place ε at right-hand side for X then,
X → 0
X → 0X | 0 Similarly Y → 1Y | 1
Collectively we can rewrite the CFG with removed ε production as
S → XY | YX | XX | X | Y
X → 0X | 0
Y → 1Y | 1
Removing Unit Productions :-
The unit productions are the productions in which one non-terminal gives another non-terminal. Use
the following steps to remove unit production:
Step 1: To remove X → Y, add production X → a to the grammar rule whenever Y → a occurs in
the grammar.
Step 2: Now delete X → Y from the grammar.
Step 3: Repeat step 1 and step 2 until all unit productions are removed.
For example: S → 0A | 1B | C
A → 0S | 00
B → 1 | A
C → 01
Solution: S → C is a unit production. But while removing S → C we have to consider what C gives.
So, we can add a rule to S.
S → 0A | 1B | 01
Similarly, B → A is also a unit production so we can modify it as
B → 1 | 0S | 00
Thus finally we can write CFG without unit production as
S → 0A | 1B | 01
A → 0S | 00
B → 1 | 0S | 00
C → 01

More Related Content

PDF
Flat unit 2
PPTX
Types of grammer - TOC
PPT
context free language
PDF
Flat unit 1
PPTX
Pumping lemma
PPT
Regular Languages
PPTX
Undecidability.pptx
PPTX
CONTEXT FREE GRAMMAR
Flat unit 2
Types of grammer - TOC
context free language
Flat unit 1
Pumping lemma
Regular Languages
Undecidability.pptx
CONTEXT FREE GRAMMAR

What's hot (20)

PPTX
Regular expressions
PDF
Deterministic Finite Automata (DFA)
PPT
Turing Machine
PPT
Finite automata
PPTX
CLR AND LALR PARSER
PDF
Formal Languages and Automata Theory unit 5
PPTX
Types of Parser
PDF
Syntax Directed Definition and its applications
PPT
NFA or Non deterministic finite automata
PPTX
Role-of-lexical-analysis
PDF
NFA to DFA
PPTX
Simplification of cfg ppt
PPT
Regular expressions-Theory of computation
PPTX
Context free grammar
PPTX
Turing machine
PDF
Lecture: Regular Expressions and Regular Languages
PDF
TOC 1 | Introduction to Theory of Computation
PPT
Regular expression with DFA
PPT
Regular Grammar
PPTX
Turing machine by_deep
Regular expressions
Deterministic Finite Automata (DFA)
Turing Machine
Finite automata
CLR AND LALR PARSER
Formal Languages and Automata Theory unit 5
Types of Parser
Syntax Directed Definition and its applications
NFA or Non deterministic finite automata
Role-of-lexical-analysis
NFA to DFA
Simplification of cfg ppt
Regular expressions-Theory of computation
Context free grammar
Turing machine
Lecture: Regular Expressions and Regular Languages
TOC 1 | Introduction to Theory of Computation
Regular expression with DFA
Regular Grammar
Turing machine by_deep
Ad

Similar to Flat unit 3 (20)

PDF
Syntax Analyzer.pdf
PPT
Normal-forms-for-Context-Free-Grammars.ppt
PPT
UNIT III REGULAR GRAMMAR. in automatappt
PPTX
RegularLanguageProperties [Autosaved].pptx
PPTX
Automata theory - CFG and normal forms
PPTX
Automata and Language Theory-Context Free Grammars
PDF
Regular Language and Regular Grammar Lecture
PPT
Formal Languages and regular langugaes IN FLAG
PPT
Theory of computing
PPTX
Types of Language in Theory of Computation
PDF
Theory of Computation Grammar Concepts and Problems
PDF
Context free langauges
PPTX
Module 1 TOC.pptx
PPTX
slides for Automata theory chapter no 11
PDF
05 8640 (update email) multiset cs closure propertie (edit lafi)2
PDF
201506 CSE340 Lecture 16
PPTX
CS911-Lecture-21_43709.pptx
PPTX
Theory of Automata and CGFG for learning
PPTX
2.1 & 2.2 grammar introduction – types of grammar
PPTX
Introduction to Chomsky Hierarchy in TOC.pptx
Syntax Analyzer.pdf
Normal-forms-for-Context-Free-Grammars.ppt
UNIT III REGULAR GRAMMAR. in automatappt
RegularLanguageProperties [Autosaved].pptx
Automata theory - CFG and normal forms
Automata and Language Theory-Context Free Grammars
Regular Language and Regular Grammar Lecture
Formal Languages and regular langugaes IN FLAG
Theory of computing
Types of Language in Theory of Computation
Theory of Computation Grammar Concepts and Problems
Context free langauges
Module 1 TOC.pptx
slides for Automata theory chapter no 11
05 8640 (update email) multiset cs closure propertie (edit lafi)2
201506 CSE340 Lecture 16
CS911-Lecture-21_43709.pptx
Theory of Automata and CGFG for learning
2.1 & 2.2 grammar introduction – types of grammar
Introduction to Chomsky Hierarchy in TOC.pptx
Ad

Recently uploaded (20)

PDF
ANTIBIOTICS.pptx.pdf………………… xxxxxxxxxxxxx
PDF
O5-L3 Freight Transport Ops (International) V1.pdf
PDF
Module 4: Burden of Disease Tutorial Slides S2 2025
PPTX
Introduction_to_Human_Anatomy_and_Physiology_for_B.Pharm.pptx
PPTX
IMMUNITY IMMUNITY refers to protection against infection, and the immune syst...
PPTX
PPH.pptx obstetrics and gynecology in nursing
PDF
Basic Mud Logging Guide for educational purpose
PDF
Complications of Minimal Access Surgery at WLH
PDF
Supply Chain Operations Speaking Notes -ICLT Program
PPTX
Cell Types and Its function , kingdom of life
PDF
3rd Neelam Sanjeevareddy Memorial Lecture.pdf
PPTX
BOWEL ELIMINATION FACTORS AFFECTING AND TYPES
PPTX
human mycosis Human fungal infections are called human mycosis..pptx
PPTX
Pharma ospi slides which help in ospi learning
PPTX
The Healthy Child – Unit II | Child Health Nursing I | B.Sc Nursing 5th Semester
PPTX
Week 4 Term 3 Study Techniques revisited.pptx
PPTX
Final Presentation General Medicine 03-08-2024.pptx
PPTX
Microbial diseases, their pathogenesis and prophylaxis
PDF
FourierSeries-QuestionsWithAnswers(Part-A).pdf
PDF
RMMM.pdf make it easy to upload and study
ANTIBIOTICS.pptx.pdf………………… xxxxxxxxxxxxx
O5-L3 Freight Transport Ops (International) V1.pdf
Module 4: Burden of Disease Tutorial Slides S2 2025
Introduction_to_Human_Anatomy_and_Physiology_for_B.Pharm.pptx
IMMUNITY IMMUNITY refers to protection against infection, and the immune syst...
PPH.pptx obstetrics and gynecology in nursing
Basic Mud Logging Guide for educational purpose
Complications of Minimal Access Surgery at WLH
Supply Chain Operations Speaking Notes -ICLT Program
Cell Types and Its function , kingdom of life
3rd Neelam Sanjeevareddy Memorial Lecture.pdf
BOWEL ELIMINATION FACTORS AFFECTING AND TYPES
human mycosis Human fungal infections are called human mycosis..pptx
Pharma ospi slides which help in ospi learning
The Healthy Child – Unit II | Child Health Nursing I | B.Sc Nursing 5th Semester
Week 4 Term 3 Study Techniques revisited.pptx
Final Presentation General Medicine 03-08-2024.pptx
Microbial diseases, their pathogenesis and prophylaxis
FourierSeries-QuestionsWithAnswers(Part-A).pdf
RMMM.pdf make it easy to upload and study

Flat unit 3

  • 1. UNIT – III Context Free Grammars • Formal Languages, Grammars, • Classification of Grammars, • Chomsky Hierarchy Theorem, • Context Free Grammar, • Leftmost and Rightmost Derivations, • Parse Trees, • Ambiguous Grammars, • Simplification of Context Free Grammars-Elimination of Useless Symbols, • E- Productions and Unit Productions, • Normal Forms for Context Free Grammars: ▪ Chomsky Normal Form ▪ Greibach Normal Form, • Pumping Lemma, Closure Properties, • Applications of Context Free Grammars.
  • 2. Grammar: Grammar is a set of rules used to define a valid sentence in any language. Mathematical representation of grammar: Grammar is defined as a 4-tuple G={V,T,P,S} where V - finite set of non-terminals T - finite set of terminals P – finite set of production rules S – Start symbol “Nonterminals”arerepresentedbycapitalletters A,B,C,…soon. “Terminals” are represented by small letters a, b, c,… so on. “Production rules” are of the form α  β, where α contains non-terminal symbols or may also contain both terminals and non-terminals with atleast one non-terminal. β contains terminal, non-terminal symbols or any combination of both terminals and non-terminals. Formal Grammar: Grammar is an useful model when designing software that processes data with recursive structure. i.e., It denotes the structure of data. Formal Language: Formal language is an abstraction for general characteristics of programming languages. Classification of Grammars: Chomsky Hierarchy: Noam Chomsky classified grammar into 4 types depending on production rules (. 1. Type 0 – Unrestricted Grammar 2. Type 1 – Context Sensitive Grammar (CSG) 3. Type 2 – Context Free Grammar (CFG) 4. Type 3 – Regular Grammar Type0 Grammar: It is also known as “unrestricted grammar” or “phase structured grammar”. An unrestricted grammar is a formal grammar in which no restrictions are made on LHS and RHS of production. Mathematical representation of Type 0 grammar: Type 0 Grammar is defined as a 4-tuple G={V,T,P,S} where V - finite set of non-terminals T - finite set of terminals P – finite set of production rules of the form αAβ → αγβ Where α – left context, β – right context, α, β ∈ (V∪T)+ and γ ∈ (V∪T)*, i.e., no epsilon (ɛ) productions are allowed on LHS but epsilon (ɛ) productions are allowed on RHS. S – Start symbol The language that can be generated by making use of type 0 grammar is “recursive enumerable language”. The machine format that is used to recognize strings of that particular grammar is “turing machine”. E.g: S→aSBA | abA AB→BA bB→bb bA→ba aA→aa
  • 3. Type 1 Grammar: It is also known as “Context Sensitive Grammar (CSG)”. Context Sensitive Grammar (CSG) is a formal grammar in which LHS and RHS of any production rule may be surrounded by a context of terminals and non-terminals. Mathematical representation of Type 1 grammar: Type 1 Grammar is defined as a 4-tuple G={V,T,P,S} where V - finite set of non-terminals T - finite set of terminals P – finite set of production rules of the form αAβ → αγβ Where α – left context, β – right context, α, β ∈ (V∪T)+ and γ ∈ (V∪T)+ i.e., no epsilon (ɛ) productions are allowed on LHS and RHS. S – Start symbol The language that can be generated by making use of type 1 grammar is “Context Sensitive language (CSL)”. The machine format that is used to recognize strings of that particular grammar is “Linear Bounded Automata (LBA)”. E.g: S→abc | aSBC CB→BC bB→bb Type 2 Grammar: It is also known as “Context Free Grammar (CFG)”. Context Free Grammar (CFG) is a formal grammar in which LHS and RHS of any production rule is free from a context of terminals and non-terminals. Mathematical representation of Type 2 grammar: Type 2 Grammar is defined as a 4-tuple G={V,T,P,S} where V - finite set of non-terminals T - finite set of terminals P – finite set of production rules of the form A → γ Whereandγ∈ ( V ∪T ) *, i.e., A (V∪T)* , i.e., LHS of any production rule contains only a single non- terminal and epsilon (ɛ) productions are allowed on RHS. S – Start symbol The language that can be generated by making use of type 2 grammar is “Context Free language (CFL)”. The machine format that is used to recognize strings of that particular grammar is “Push Down Automata (PDA)”. E.g: S→aSb S→ab Type 3 Grammar: It is also known as “Regular Grammar (RG)”. Regular Grammar (RG) is a formal grammar with restrictions on both LHS and RHS. i.e., LHS contains only a single non-terminal and RHS may contain single terminal (or) single terminal followed by non terminal (or) nothing. Mathematical representation of Type 3 grammar: Type 3 Grammar is defined as a 4-tuple G={V,T,P,S} where V - finite set of non-terminals T - finite set of terminals P – finite set of production rules of the form A t | t B | ɛ , Where t – terminals A,B – non terminals S – Start symbol
  • 4. The language that can be generated by making use of type 3 grammar is “Regular language (RL)”. The machine format that is used to recognize strings of that particular grammar is “Finite Automata (FA)”. E.g: S→aS S→a Types of formal languages: Based on properties of Chomsky hierarchy of grammars, formal languages are classified into 4 types. 1. Regular Language 2. Context Free Language 3. Context Sensitive Language 4. Recursive Language and Recursive Enumerable Language Regular Language: It is a formal language generated by regular grammar. Properties of Regular Language: 1. It is accepted by DFA, NFA and read only turing machine. 2. It is described by regular expressions. 3. It is generated by regular grammar. Closure Properties of Regular Language: Two regular languages are closed under the following. 1. Union 2. Concatenation 3. Complementation 4. Intersection 5. Kleene closure 6. Transpose 7. Substitution 8. Homomorphism 9. Inverse homomorphism Methods used to prove the given language is not regular – pumping lemma. E.g: L={an | n≥1} is a regular language generated by regular grammar S→aS S→a Context Free Language: It is a formal language generated by context free grammar. Properties of Context Free Language: 1. It is accepted by pushdown automata. 2. It is generated by context free grammar. 3. It doesn’t satisfy operations such as intersection, complement, difference. Closure Properties of Context Free Language: Two context free languages are closed under the following. 1. Union 2. Concatenation 3. Kleene closure 4. Transpose 5. Substitution 6. Homomorphism 7. Inverse homomorphism Methods used to prove the given language is not context free – pumping lemma. E.g: L={an bn | n≥1} is a context free language generated by context free grammar S→aSb S→ab
  • 5. Context Sensitive Language: It is a formal language generated by context sensitive grammar. Properties of Context Sensitive Language: 1. It is accepted by pushdown automata. 2. It is generated by context sensitive grammar. 3. It doesn’t satisfy operations such as complementation, difference. Closure Properties of Context Sensitive Language: Two context sensitive languages are closed under the following. 1. Union 2. Concatenation 3. Kleene closure 4. Intersection 5. Transpose Methods used to prove the given language is not context sensitive – pumping lemma. E.g: L={an bn cn | n≥1} is a regular language generated by context sensitive grammar S→abc | aSBC CB→BC bB→bb Recursive Language: Any language L over alphabet ∑ is said to be recursive, if there is a turing machine that accepts every word present in the language L and rejects every word which is not in L. Properties of Recursive Language: 1. It is accepted by turing machine. 2. It is generated by unrestricted grammar. Closure Properties of Recursive Language: Two recursive languages are closed under the following. 1. Union 2. Concatenation 3. Complementation 4. Intersection 5. Kleene closure Recursive Enumerable Language: A recursive enumerable language L over alphabet ∑ accepts every word in L of turing machine M (or) either rejects or loops every word which is not in L. Accept=L Reject + Loop = L̄ i.e., a turing machine on same input can either accept a language or else, will run in loop forever. Properties of Recursive Enumerable Language: 1. It is accepted by turing machine. 2. It is generated by unrestricted grammar. Closure Properties of Recursive Enumerable Language: Two recursive enumerable languages are closed under the following. 1. Union 2. Concatenation 3. Intersection 4. Kleene closure E.g: L={an bn an | n≥0} is a recursive enumerable language generated by unrestricted grammar S→aSBA | abA AB→BA bB→bb bA→ba aA→aa
  • 6. Context Free Grammar: A grammar G=(V,T,P,S) is said to be context free grammar if it is recursive and free from context. i.e., productions are in the form of A (V∪T)* in which LHS of any productionr u l e contains only a single non-terminal and epsilon (ɛ) productions are allowed on RHS. CFG is formally defined as G=(V,T,P,S) where V - finite set of non-terminals T - finite set of terminals P – finite set of production rules of the form A ( V ∪ T ) * , S – Start symbol Backus naur Form: It was proposed by John Backus and Peter Naur. If there are many productions with same LHS variable, then they can be combined into single production LHS → RHS, where LHS is common variable of LHS of productions LHS is RHS productions separated by vertical bar ( | ) E.g: S→aS S→ɛ ɛ should be written at the end. Therefore, S→aS | ɛ Derivation: The process of generating a string from the production rule of a grammar is called derivation. Types of Derivation: 1. Leftmost derivation 2. Rightmost derivation Leftmost derivation: A derivation is called leftmost derivation, if leftmost variable is replaced with some production rule of a grammar. E.g: S→aSbS | ab Let the string to be derived is aabbab. LMD: S→aSbS →aabbS →aabbab Rightmost derivation: A derivation is called rightmost derivation, if rightmost variable is replaced with some production rule of a grammar. E.g: S→aSbS | ab Let the string to be derived is aabbab. RMD: S→aSbS →aSbab →aabbab
  • 7. Derivation tree: The diagrammatical representation of the derivation process by a tree structure is known as “derivation tree”. It is also known as “parse tree” because it is created by parser (second phase of compiler) to check syntax of the statement. It is also defined as an ordered tree in which LHS of production rule represents parent node and RHS of production rule represents children nodes. Properties of Derivation tree: In derivation tree of CFG, 1. Root node is represented starting variable. 2. Leaf nodes are represented by terminals or epsilon (ɛ). 3. Internal nodes are represented by non- terminals. 4. If A→α1 α2 α3... αn then A is parent of nodes α1 α2 α3... αn Ambiguous Grammars: If more than one parse tree (or) more than one LMD (or) more than one RMD is possible to derive the same string, then that grammar is said to be “Ambiguous grammar”, Otherwise it is known as unambiguous grammar. Yield of a derivation tree: The string obtained by reading only the leaves of derivation tree from left to right (ɛ-symbols are ignored) is known as “yield” of a derivation tree. Simplification of Context Free Grammars: Purpose: CFG may contain different types of useless symbols, unit productions, null productions. These may increase number of steps in generating a language from CFG. Therefore simplification of CFG will reduce variables and productions, so that time complexity for generating language become less. Procedure to simplify CFG: CFG can be simplified by performing the following steps: 1. Elimination of ɛ-Productions 2. Elimination of Unit Productions 3. Elimination of Useless Productions Elimination of ɛ-Productions: ɛ-Productions (null production): A production in the form of “A→ɛ” is called as null production. Procedure to eliminate ɛ-Productions: 1. Identify ɛ-Productions in given CFG.(in the form ofA→ɛ) 2. To eliminate A→ɛ, then write all the productions from CFG whose right side contains A. 3. Replace each occurrence of A with ‘ɛ’ to get production without ‘ɛ’and eliminate A→ɛ. 4. The obtained productions in step3 must be added to the grammar. 5. A→ɛ is allowed only if A is not used on RHS of anyproduction.
  • 8. Elimination of unit-Productions: Unit-Productions: A production in the form of “A→B” is called as unit production. Procedure to eliminate Unit-Productions: 1. Identify unit-Productions in given CFG.(in the form of A→B) 2. To eliminate A→B, then write all the productions from CFG whose left side contains A. 3. Replace B with ‘A’and eliminate A→B. 4. The obtained productions in step3 must be added to the grammar. Elimination of useless-Productions: Useless-Productions: Useless productions are of 2 types. 1. Non-generating productions 2. Non-reachable productions Non-generating productions: Non-generating productions are the productions which do not produce terminal string. Non-generating symbols are the symbols because of which productions cannot produce terminal string. Non-reachable productions: Non-reachable productions are the productions which cannot be reached from starting symbol. Procedure to eliminate useless-Productions: 1. Identify non generating symbols in given CFG. 2. To eliminate non generating symbol A, then remove all the productions from CFG containing A on LHS or RHS. 3. Identify non reachable productions and remove them. 4. If start symbol is non generating symbol, it cannot be removed. 5. If symbol is non reachable but generating symbol, it cannot be removed. Normalization of CFG: The productions in CFG are of the form A→ (V∪T)* . i.e., there are no restrictions on RHS o f CFG. There may be any number of terminals and non terminals in any combination. If we made restrictions on RHS of CFG specifying that there should be fixed number of terminals and non terminals, it results in normal form. Normal form: If every production of CFG is converted into some standard form, grammar is said to be in normal form. Use of normalization: If some standard form is specified, then complexity of automata design will be less and easy to implement the automata. Types of normal forms: Normal forms are of 2 types. They are 1. Chomsky Normal Form (CNF) 2. Greibach Normal Form (GNF)
  • 9. Chomsky Normal Form (CNF): Let G be the CFG. The grammar G=(V,T,P,S) is said to be in CNF, if all productions are of the form A → BC (or) A → a, where A, B, C ∈ V(non terminals) and a ∈ T (terminals). i.e., if grammar is in CNF, RHS should contain only one terminal or two non terminals. Conversion from CFG to CNF: 1. Simplify the given CFG – Eliminate ɛ-Productions, Unit Productions, and Useless Productions. 2. For the productions which are not in CNF, (having non terminals and terminals on RHS), replace each terminal with new non terminal. 3. For the productions which are not in CNF, (having more than 2 non terminals on RHS), replace 2 non terminals with new non terminal. 4. Repeat 2,3 steps until we get CNF grammar. Greibach Normal Form (GNF): 1. Let G be the CFG. The grammar G=(V,T,P,S) is said to be in GNF, if all productions are of the form A → aB*, 2. where A, B ∈ V(non terminals) and a ∈ T (terminals). 3. i.e., if grammar is in GNF, RHS should contain single terminal followed by any number of non terminals. Left Recursion: A grammar is said to be left recursive, if it contains production of the form A → Aα1| Aα2| Aα3|…| Aαn| β1| β2| β3|…| βn where RHS contains αi starting with A and βi which does not starts with A Elimination of Left Recursion: Let A → Aα | β is the production having left recursion. To eliminate left recursion, introduce a new variable A’ such that A → βA’ A → αA’|ɛ Conversion from CFG to GNF: 1. Convert the given CFG into CNF. 2. Replace the names of variables with A1,A2, A3,… An (indexedform). 3. Convert it into GNF using the following rules i. If production is of the form Ai→Aj Ak , i>j, then convert it into Ai→Aj Ak , i<j by substituting Aj with itsRHS. ii. If production is of the form Ai→Aj Ak , i=j, then eliminate left recursion and eliminate ɛ productions . iii. If production is of the form Ai→Aj Ak , i<j, then convert substitute Aj with its RHS to obtain GNF grammar.
  • 10. Pumping Lemma for context free languages: Every language is not context free. To show that L is not context free, pumping lemma is used. Pumping means “generating”. Pumping lemma means “it is a method of generating many input strings from a given string.” i.e., it is used to break a given long input string into several substrings.” Statement of pumping lemma: 1. Assume that the given language L is context free accepted by pushdown automata M. 2. Choose a string z c L such that |z| ≥ n for some n. 3. Break the string into 5 strings u,v,w,x,y i.e., z=uvwxy such that i. |vx| ≥ 1 ii. |vwx| ≤ n 4. By using pumping lemma, pump v, x to the suitable integer ‘i’ such that uvi wxi y ∉ L for i≥0. This contradicts our assumption. 5. Therefore L is not context free. Closure Properties of Context Free Languages: Closure property: A set is closed under an operation if applying that operation to elements of the set results in an element of the set. For example, the set of natural numbers is closed under addition and multiplication but not under subtraction or division because 2-3 is negative and 1/2 is not an integer. The set of integers is closed under addition, subtraction and multiplication but not under division. Context free languages are closed under the following: 1. Union 2. Concatenation 3. Kleene closure 4. Transpose Context free languages are not closed under the following: 1. Complementation 2. Intersection Problems : Convert the given CFG to CNF. Example: Consider the given grammar G: S → a | aA | B A → aBB | ε B → Aa | b Solution: Step 1: We will create a new production S1 → S, as the start symbol S appears on the RHS. The grammar will be: S1 → S S → a | aA | B A → aBB | ε B → Aa | b Step 2: As grammar G contains A → ε null production, its removal from the grammar yields: S1 → S S → a | aA | B A → aBB B → Aa | b | a
  • 11. Now, as grammar G contains Unit production S → B, its removal yield: S1 → S S → a | aA | Aa | b A → aBB B → Aa | b | a Also remove the unit production S1 → S, its removal from the grammar yields: S0 → a | aA | Aa | b S → a | aA | Aa | b A → aBB B → Aa | b | a Step 3: In the production rule S0 → aA | Aa, S → aA | Aa, A → aBB and B → Aa, terminal a exists on RHS with non-terminals. So we will replace terminal a with X: S0 → a | XA | AX | b S → a | XA | AX | b A → XBB B → AX | b | a X → a Step 4: In the production rule A → XBB, RHS has more than two symbols, removing it from grammar yield: S0 → a | XA | AX | b S → a | XA | AX | b A → RB B → AX | b | a X → a R → XB Hence, for the given grammar, this is the required CNF Problems: convert the given CFG to GNF Example: Consider the given grammar G: S → XB | AA A → a | SA B → b X → a Solution: As the given grammar G is already in CNF and there is no left recursion, so we can skip step 1 and step 2 then directly go to step 3. The production rule A → SA is not in GNF, so we substitute S → XB | AA in the production rule A → SA as: S → XB | AA A → a | XBA | AAA B → b X → a The production rule S → XB and B → XBA is not in GNF, so we substitute X → a in the production rule S → XB and B → XBA as: S → aB | AA A → a | aBA | AAA B → b X → a Now we will remove left recursion (A → AAA), we get: S → aB | AA A → aC | aBAC C → AAC | ε B → b X → a
  • 12. Now we will remove null production C → ε, we get: S → aB | AA A → aC | aBAC | a | aBA C → AAC | AA B → b X → a The production rule S → AA is not in GNF, so we substitute A → aC | aBAC | a | aBA in production rule S → AA as: S → aB | aCA | aBACA | aA | aBAA A → aC | aBAC | a | aBA C → AAC C → aCA | aBACA | aA | aBAA B → b X → a The production rule C → AAC is not in GNF, so we substitute A → aC | aBAC | a | aBA in production rule C → AAC as: S → aB | aCA | aBACA | aA | aBAA A → aC | aBAC | a | aBA C → aCAC | aBACAC | aAC | aBAAC C → aCA | aBACA | aA | aBAA B → b X → a Hence, this is the GNF form for the grammar G Derivations: Ex 1: Derive the string "abb" for leftmost derivation and rightmost derivation using a CFG given by, S → AB | ε A → aB B → Sb Solution: Leftmost derivation: abb Rightmost derivation:
  • 13. Ex 2: Derive the string "aabbabba" for leftmost derivation and rightmost derivation using a CFG given by, S → aB | bA A→ a | aS | bAA B → b | aS | aBB Solution: Leftmost derivation: S aB S → aB aaBB B → aBB aabB B → b aabbS B → bS aabbaB S → aB aabbabS B → bS aabbabbA S → bA aabbabba A → a Rightmost derivation: S aB S → aB aaBB B → aBB aaBbS B → bS aaBbbA S → bA aaBbba A → a aabSbba B → bS aabbAbba S → bA aabbabba A → a Ex 3: Derive the string "00101" for leftmost derivation and rightmost derivation using a CFG given by, S → A1B A → 0A | ε B → 0B | 1B | ε Solution: Leftmost derivation: S A1B 0A1B 00A1B 001B 0010B 00101B 00101 Rightmost derivation: S A1B A10B A101B A101 0A101 00A101 00101
  • 14. Ex 4: Construct a derivation tree for the string aabbabba for the CFG given by, S → aB | bA A → a | aS | bAA B → b | bS | aBB Solution: To draw a tree, we will first try to obtain derivation for the string aabbabba Now, the derivation tree is as follows: Ex 5: Show the derivation tree for string "aabbbb" with the following grammar. S → AB | ε A → aB B → Sb Solution: To draw a tree we will first try to obtain derivation for the string aabbbb Now, the derivation tree for the string "aabbbb" is above.
  • 15. Ex 6: Let us consider a grammar G with the production rule E → I E → E + E E → E * E E → (E) I → ε | 0 | 1 | 2 | ... | 9 Solution: For the string "3 * 2 + 5", the above grammar can generate two parse trees by leftmost derivation: Since there are two parse trees for a single string "3 * 2 + 5", the grammar G is ambiguous Ex 1 : Check whether the given grammar G is ambiguous or not. S → aSb | SS , S → ε Solution: For the string "aabb" the above grammar can generate two parse trees Since there are two parse trees for a single string "aabb", the grammar G is ambiguous. Simplification of CFG :-
  • 16. Removal of Useless Symbols :- A symbol can be useless if it does not appear on the right- hand side of the production rule and does not take part in the derivation of any string. That symbol is known as a useless symbol. Similarly, a variable can be useless if it does not take part in the derivation of any string. That variable is known as a useless variable. For Example: T → aaB | abA | aaT A → aA B → ab | b C → ad In the above example, the variable 'C' will never occur in the derivation of any string, so the production C → ad is useless. So we will eliminate it, and the other productions are written in such a way that variable C can never reach from the starting variable 'T'. Production A → aA is also useless because there is no way to terminate it. If it never terminates, then it can never produce a string. Hence this production can never take part in any derivation. To remove this useless production A → aA, we will first find all the variables which will never lead to a terminal string such as variable 'A'. Then we will remove all the productions in which the variable 'B' occurs. Elimination of ε - Production:- The productions of type S → ε are called ε productions. These type of productions can only be removed from those grammars that do not generate ε. Step 1: First find out all nullable non-terminal variable which derives ε. Step 2: For each production A → a, construct all production A → x, where x is obtained from a by removing one or more non-terminal from step 1. Step 3: Now combine the result of step 2 with the original production and remove ε productions. Ex: Remove the production from the following CFG by preserving the meaning of it. S → XYX X → 0X | ε Y → 1Y | ε Solution: Now, while removing ε production, we are deleting the rule X → ε and Y → ε. To preserve the meaning of CFG we are actually placing ε at the right-hand side whenever X and Y have appeared. Let us take S → XYX If the first X at right-hand side is ε. Then S → YX Similarly if the last X in R.H.S. = ε. Then S → XY If Y = ε then S → XX If Y and X are ε then, S → X If both X are replaced by ε S → Y Now, S → XY | YX | XX | X | Y Now let us consider X → 0X If we place ε at right-hand side for X then, X → 0 X → 0X | 0 Similarly Y → 1Y | 1 Collectively we can rewrite the CFG with removed ε production as S → XY | YX | XX | X | Y X → 0X | 0 Y → 1Y | 1
  • 17. Removing Unit Productions :- The unit productions are the productions in which one non-terminal gives another non-terminal. Use the following steps to remove unit production: Step 1: To remove X → Y, add production X → a to the grammar rule whenever Y → a occurs in the grammar. Step 2: Now delete X → Y from the grammar. Step 3: Repeat step 1 and step 2 until all unit productions are removed. For example: S → 0A | 1B | C A → 0S | 00 B → 1 | A C → 01 Solution: S → C is a unit production. But while removing S → C we have to consider what C gives. So, we can add a rule to S. S → 0A | 1B | 01 Similarly, B → A is also a unit production so we can modify it as B → 1 | 0S | 00 Thus finally we can write CFG without unit production as S → 0A | 1B | 01 A → 0S | 00 B → 1 | 0S | 00 C → 01