Regular Expression and
Regular languages
Chapter 2
Outline
2.1. Regular expressions
2.2. Connection between regular expression
and regular languages
2.3. Regular grammar
2.4. Pumping lemma
2.1. Regular expressions
Introduction
• A Regular Expression (RE) is a symbolic method to
describe patterns of strings in a language.
• It represents a regular language, which is a language
accepted by finite automata (either DFA or NFA).
• Regular expressions are essential in:
• Lexical analysis in compilers
• Pattern matching (e.g., grep, regex in Python)
• Describing tokens in programming languages
Formal Definition of Regular
Expressions
I. Base Cases: the simplest regular expressions, the building blocks:
1. ϕ (phi): represents the empty language:
L(ϕ) = ∅
This language contains no strings at all.
2. ε (epsilon): represents the language containing only the
empty string:
L(ε) = {ε}
The empty string is a string of length 0.
3. a, for any symbol a ∈ Σ: represents the language containing
just the string "a":
L(a) = {a}
Cont …
II. Recursive Rules (Building Larger Expressions)
• If r and s are regular expressions representing languages L(r) and L(s),
• Then the following are also regular expressions:
1. Union: r ∪ s
The expression r ∪ s denotes the language:
L(r ∪ s) = L(r) ∪ L(s)
That is, any string in either L(r) or L(s).
2. Concatenation: rs
The expression rs denotes the concatenation of L(r) and L(s):
L(rs) = { xy | x ∈ L(r), y ∈ L(s) }
3. Kleene Star: r*
The expression r* denotes the language containing zero or more
concatenations of strings from L(r):
L(r*) = {ε, w₁, w₁w₂, w₁w₂w₃, ... | each wᵢ ∈ L(r)}
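The three recursive rules can be tried out on small finite languages. The sketch below is only an illustration: the helper names (union, concat, star) and the sample languages are assumptions, and because L(r*) is infinite whenever L(r) contains a non-empty string, star is cut off after a fixed number of concatenations.

```python
from itertools import product

def union(l1, l2):
    """L(r ∪ s): every string that is in either language."""
    return l1 | l2

def concat(l1, l2):
    """L(rs): every string xy with x from the first and y from the second language."""
    return {x + y for x, y in product(l1, l2)}

def star(lang, max_repeats):
    """Approximation of L(r*): at most max_repeats concatenations (illustrative cut-off)."""
    result = {""}   # zero concatenations gives the empty string ε
    layer = {""}
    for _ in range(max_repeats):
        layer = concat(layer, lang)
        result |= layer
    return result

# Hypothetical sample languages, chosen only for the demonstration.
L_r = {"a"}
L_s = {"b", "bb"}

print(sorted(union(L_r, L_s)))    # ['a', 'b', 'bb']
print(sorted(concat(L_r, L_s)))   # ['ab', 'abb']
print(sorted(star(L_r, 3)))       # ['', 'a', 'aa', 'aaa']
```

The empty string "" plays the role of ε here, and the empty set set() would play the role of the empty language.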
Precedence of Operators
• To interpret expressions
correctly without excessive
parentheses, operator
precedence is used:
• Kleene Star * – highest
precedence
• Concatenation
• Union | or ∪ – lowest
precedence
• Example:
• a|bc* is interpreted as a ∪ (b(c*))
• Use parentheses to control grouping:
• (a|b)*abb means: any sequence of a’s and b’s
(including the empty one) followed by a, then b, then b.
Cont …
• To interpret regular expressions without excessive
parentheses, we rely on a standard precedence of operators:
1.Kleene Star (*) – Highest Precedence
•Applies to the symbol or group directly before it.
•Example: a* means zero or more occurrences of a.
2.Concatenation – Medium Precedence
•Joins two patterns end to end.
•Example: ab* is interpreted as a(b*), not (ab)*.
3.Union (| or ∪) – Lowest Precedence
•Represents choice between alternatives.
•Example: a|b* is interpreted as a | (b*), not (a | b)*.
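The precedence rules can be checked with Python's re module, which writes union as | and uses the same precedence for these three operators. This is a small sketch; the helper name accepts is an assumption, and fullmatch is used so that the whole string must match, as in the formal definition.

```python
import re

def accepts(pattern, word):
    """True if the whole word matches the pattern."""
    return re.fullmatch(pattern, word) is not None

# ab* groups as a(b*): one a followed by any number of b's.
print(accepts("ab*", "abbb"))    # True
print(accepts("ab*", "abab"))    # False
print(accepts("(ab)*", "abab"))  # True

# a|b* groups as a|(b*): either a single a, or any number of b's.
print(accepts("a|b*", "bbb"))    # True
print(accepts("a|b*", "ab"))     # False
print(accepts("(a|b)*", "ab"))   # True

# a|bc* groups as a|(b(c*)).
print(accepts("a|bc*", "bccc"))  # True
print(accepts("a|bc*", "ac"))    # False
```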
Algebraic Laws of Regular
Expressions
• Regular expressions follow certain algebraic
identities, helping in simplification:
• Union is commutative: r ∪ s = s ∪ r
• Concatenation is associative: (rs)t = r(st)
• Distributive property: r(s ∪ t) = rs ∪ rt
• Identity element for union: r ∪ ϕ = r
• Identity for concatenation: rε = εr = r
• Kleene star properties:
• r* = ε ∪ r ∪ rr ∪ rrr ∪ ...
• (r*)* = r*
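A few of these identities can be spot-checked by brute force. The sketch below compares the sets of strings that two patterns accept, over all strings up to a small length. It is a finite check, not a proof, and the helper name language, the alphabet and the length bound are assumptions chosen for the illustration.

```python
import re
from itertools import product

def language(pattern, alphabet="abc", max_len=4):
    """All strings over the alphabet, up to max_len symbols, that the pattern accepts."""
    words = ("".join(t) for n in range(max_len + 1)
             for t in product(alphabet, repeat=n))
    return {w for w in words if re.fullmatch(pattern, w)}

# (left side, right side) of each law, written in Python regex syntax.
LAWS = {
    "union is commutative": ("a|b", "b|a"),
    "concatenation is associative": ("(ab)c", "a(bc)"),
    "distributive property": ("a(b|c)", "ab|ac"),
    "star of star": ("(a*)*", "a*"),
}

for name, (left, right) in LAWS.items():
    print(name, language(left) == language(right))
```

Each line should print True. A pair that is not an identity, such as a|b* and (a|b)*, would give False.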
Exercise
• Write a regular expression for the set
of all strings over the alphabet {a, b}
that start with 'a'.
• Give a brief explanation of how your
regular expression works.
• Answer: Regular Expression: a(a|b)*
• Explanation:
• The string must start with an a.
• The expression (a|b)* means zero or
more occurrences of either a or b.
• So the string starts with a and is
followed by any sequence (including
none) of a’s and b’s.
• Valid examples: a, ab, aabbb, abab, aaaa.
• Write a regular expression for all
strings over {a, b} that end with 'b'.
• Explain how the expression ensures
the last character is 'b'.
• Answer: Regular Expression: (a|b)*b
• Explanation:
• (a|b)* matches any sequence of a's
and b’s.
• The final b ensures that the string
ends with b.
• Valid examples: b, ab, aab, bbab,
bbbb.
Cont …
• Write a regular expression for strings
over {a, b} that contain the substring "ab".
• Explain how your regex ensures "ab"
appears.
• Answer: Regular Expression: (a|b)*ab(a|b)*
• Explanation:
• (a|b)* before and after means any
characters can come before and
after ab.
• The required ab substring must
appear at least once in the string.
• Examples: ab, aab, babab, aaabba,
bbaab.
• Write a regular expression for
strings over {a, b} that contain
exactly two a’s.
• Describe how this restricts the
count of 'a'.
• Answer: Regular Expression:
(b*)a(b*)a(b*)
• Explanation:
• a appears exactly twice.
• Between and around the a’s, there
can be any number of b’s (even zero).
• No more than two a’s are allowed.
• Examples: aab, baab, babab, bbaab.
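The four answers above can be checked mechanically. The sketch below compares each regular expression with a direct Python test of the stated property, over every string of a's and b's up to a small length. The names CHECKS, all_words and agrees are assumptions made for the illustration, and the check is only as strong as the length bound.

```python
import re
from itertools import product

# (regular expression from the exercise, direct test of the property)
CHECKS = {
    "starts with a": (r"a(a|b)*", lambda w: w.startswith("a")),
    "ends with b": (r"(a|b)*b", lambda w: w.endswith("b")),
    "contains ab": (r"(a|b)*ab(a|b)*", lambda w: "ab" in w),
    "exactly two a's": (r"(b*)a(b*)a(b*)", lambda w: w.count("a") == 2),
}

def all_words(max_len, alphabet="ab"):
    """Every string over the alphabet with at most max_len symbols."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def agrees(pattern, predicate, max_len=6):
    """True if regex and predicate give the same answer on every short word."""
    return all((re.fullmatch(pattern, w) is not None) == predicate(w)
               for w in all_words(max_len))

for name, (pattern, predicate) in CHECKS.items():
    print(name, agrees(pattern, predicate))
```

Every line should print True. Swapping in a wrong answer, for example a* for "starts with a", makes the corresponding line print False.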
2.2. Connection between
regular expression &
regular languages
Introduction
Regular Expressions (RE):
• Symbolic notation to
describe patterns in
strings.
• Built from basic symbols
using operations: union
(∪), concatenation, and
Kleene star (*).
• Example: (a|b)*abb
Regular Languages (RL):
• A class of languages that
can be recognized by
finite automata
(DFA/NFA).
• These are exactly the
languages that can be
described by regular
expressions.
From Regular Languages to Regular
Expressions
• Every regular language can also be represented by a
regular expression.
• Why?
• Regular languages are recognized by DFA/NFA.
• Kleene's Theorem: If a language is recognized by
a finite automaton, then there exists a regular
expression that generates it.
• Conversion: NFA → Regular Expression
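The link between the two descriptions can be illustrated on the example (a|b)*abb. The sketch below simulates a hand-built four-state DFA for this language and checks that it agrees with the regular expression on all short strings. The transition table is an assumption constructed for this illustration (it does not come from the slides), and the comparison is a finite check, not a proof of equivalence.

```python
import re
from itertools import product

# Hand-built DFA: state k means "the last symbols read match the first k symbols of abb".
TRANSITIONS = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 3,
    (3, "a"): 1, (3, "b"): 0,
}
START, ACCEPTING = 0, {3}

def dfa_accepts(word):
    """Run the DFA on the word and report whether it ends in an accepting state."""
    state = START
    for symbol in word:
        state = TRANSITIONS[(state, symbol)]
    return state in ACCEPTING

def regex_accepts(word):
    return re.fullmatch(r"(a|b)*abb", word) is not None

def agree_up_to(max_len):
    """Compare DFA and regular expression on every string of length <= max_len."""
    return all(dfa_accepts("".join(t)) == regex_accepts("".join(t))
               for n in range(max_len + 1)
               for t in product("ab", repeat=n))

print(agree_up_to(8))
```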
2.3. Regular grammar
Regular Grammar
• Regular grammar is a formal grammar used to
describe regular languages, which are the languages
that can be recognized by finite automata.
• There are two standard forms of regular grammar:
• Right-Linear Grammar
• Left-Linear Grammar
Cont …
• A regular grammar is a restricted type of context-free
grammar (CFG) where all production rules follow
specific patterns.
• Formal Definition
• A regular grammar G is a 4-tuple (V,Σ,P,S):
• V: Finite set of non-terminal symbols (e.g., S,A,B)
• Σ: Finite set of terminal symbols (e.g., a,b)
• P: Production rules of specific forms
• S: Start symbol (S ∈ V)
Types of Regular Grammars
I. Right-Linear Grammars
• Rule Forms:
• A → aB (non-terminal → terminal + non-terminal)
• A → a (non-terminal → terminal)
• A → ε (only if A is the start symbol)
II. Left-Linear Grammars
• Rule Forms:
• A → Ba (non-terminal → non-terminal + terminal)
• A → a (non-terminal → terminal)
• A → ε (only if A is the start symbol)
• A language is regular if and only if it can be generated by
a right-linear or left-linear grammar.
I. Right Linear Grammar
• Right-linear grammars are a special type of CFG in which each
production rule has at most one variable on the right-hand side, and
that variable is in the rightmost position.
• A → xB
• A → x
• where A, B ∈ V and x ∈ T* (T is the set of terminal symbols, written Σ in the formal definition)
• Example 1:
• S → aA | B
• A → aaB
• B → bB | a
• Grammar G is right-linear
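To see which strings the grammar of Example 1 generates, its derivations can be enumerated. The sketch below is an illustration under stated assumptions: the dictionary encoding of the rules and the function name generate are made up for this example, and the regular expression (aaa)?b*a is worked out by hand from the rules (S → B gives b*a, S → aA → aaaB gives aaab*a).

```python
import re
from collections import deque

# Each rule body is (terminal string x, next non-terminal or None), i.e. A → xB or A → x.
GRAMMAR = {
    "S": [("a", "A"), ("", "B")],
    "A": [("aa", "B")],
    "B": [("b", "B"), ("a", None)],
}

def generate(grammar, start, max_len):
    """All terminal strings of length <= max_len derivable from the start symbol."""
    results = set()
    seen = {("", start)}        # sentential forms: (terminals so far, pending non-terminal)
    queue = deque(seen)
    while queue:
        prefix, nonterminal = queue.popleft()
        for terminals, nxt in grammar[nonterminal]:
            word = prefix + terminals
            if len(word) > max_len:
                continue
            if nxt is None:
                results.add(word)               # derivation finished
            elif (word, nxt) not in seen:
                seen.add((word, nxt))
                queue.append((word, nxt))
    return results

words = generate(GRAMMAR, "S", 6)
print(sorted(words, key=lambda w: (len(w), w)))
print(all(re.fullmatch(r"(aaa)?b*a", w) for w in words))
```

Because the single variable always sits at the right end, a sentential form is fully described by the terminals produced so far plus one pending non-terminal, which is what makes this enumeration so simple.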
Cont …
• Example 2: FA accepting
strings that start with b
• Σ = {a, b}
• Initial state (q0) = A
• Final state (F) = B
• The RLG corresponding to the FA is
• A → bB
• B → ε | aB | bB
• The above grammar is an RLG,
which can be written directly
from the FA.
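Reading the grammar off the automaton follows a simple recipe: each transition from state q to state p on symbol a becomes a rule q → ap, and each final state q gets the rule q → ε. The sketch below applies this recipe. The transition table is an assumption reconstructed from the grammar above (the slide's diagram is not part of the text), and the function name is hypothetical.

```python
def dfa_to_right_linear(transitions, finals):
    """Build right-linear rules from FA transitions {(state, symbol): target}."""
    rules = {}
    for (state, symbol), target in sorted(transitions.items()):
        rules.setdefault(state, []).append(symbol + target)   # q → a p
    for state in sorted(finals):
        rules.setdefault(state, []).append("ε")               # final state: q → ε
    return rules

# Assumed transitions of the FA for "strings that start with b".
TRANSITIONS = {
    ("A", "b"): "B",
    ("B", "a"): "B",
    ("B", "b"): "B",
}

rules = dfa_to_right_linear(TRANSITIONS, finals={"B"})
for head, bodies in rules.items():
    print(head, "→", " | ".join(bodies))
```

The output lists A → bB and B → aB | bB | ε, the same rules as in the example up to the order of the alternatives.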
II. Left Linear Grammar
• Left-linear grammars are a special type of CFG in which each
production rule has at most one variable on the right-hand side, and
that variable is in the leftmost position.
• A → Bx
• A → x
• where A, B ∈ V and x ∈ T* (T is the set of terminal symbols, written Σ in the formal definition)
• Example:
• A → Da | Bc | b
• B → Bf | Ca | a
• C → Ca | D
• D → ε
2.4. Pumping lemma and
non-regular language
grammars
Introduction
• The pumping lemma is used to prove that a language is not regular.
• What are regular languages and what are non-regular languages?
• In previous lessons, we learned that regular languages are
exactly those that can be described by regular expressions or
recognized by finite automata (DFA/NFA).
• However, not all languages are regular.
• To prove that a language is not regular, we use a powerful tool
called the Pumping Lemma.
What is the Pumping Lemma?
• If a language A is regular, then there exists a
pumping length p ≥ 1 such that any string s ∈ A, where
|s| ≥ p, can be split into three parts:
• s = xyz
• such that the following three conditions are true:
• xyⁱz ∈ A for all i ≥ 0
• |y| ≥ 1
• |xy| ≤ p
Cont …
• Let’s break them down:
• xyz is the original string.
• y is the part that can be repeated (pumped).
• Condition 1 ensures that no matter how many times we
repeat y (including zero times), the new string stays in the
language.
• Condition 2 ensures that we are actually repeating
something (not the empty string).
• Condition 3 limits the position of the pumpable section y
— it must be within the first p characters.
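The three conditions can be seen at work on a language that is regular, for instance (a|b)*abb. The sketch below searches for a split s = xyz that satisfies conditions 2 and 3 and keeps the pumped strings in the language. It rests on assumptions: the pumping length p = 4 is taken from the number of states of a four-state DFA for this language, the function names are hypothetical, and condition 1 is tested only for i = 0..max_i, so this is an illustration rather than a proof.

```python
import re

LANGUAGE = re.compile(r"(a|b)*abb")   # a regular language used as the test case

def in_language(word):
    return LANGUAGE.fullmatch(word) is not None

def find_pumpable_split(s, p, max_i=5):
    """First split (x, y, z) with |y| >= 1 and |xy| <= p whose pumped strings
    x + y*i + z stay in the language for i = 0..max_i; None if there is none."""
    for end_xy in range(1, min(p, len(s)) + 1):   # condition 3: |xy| <= p
        for start_y in range(end_xy):             # condition 2: |y| >= 1
            x, y, z = s[:start_y], s[start_y:end_xy], s[end_xy:]
            if all(in_language(x + y * i + z) for i in range(max_i + 1)):
                return x, y, z
    return None

print(find_pumpable_split("babb", p=4))   # ('', 'b', 'abb')
```

Here x is empty, y = b and z = abb, so the pumped strings are abb, babb, bbabb, and so on, all of which end in abb.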
Proof by contradiction
• How do we use the Pumping Lemma to prove that a language is not
regular?
• We use a proof by contradiction:
• Assume the language A is regular.
• Then there must exist a pumping length p.
• Choose a string s ∈ A such that |s| ≥ p.
• Consider every split s = xyz with |y| ≥ 1 and |xy| ≤ p.
• For each such split, find a value of i such that the pumped string xyⁱz ∉ A.
• This contradicts the lemma, so the language cannot be regular.
Summary Table
Concept | Meaning
Pumping Lemma | A property that all regular languages must satisfy
Pumping Length (p) | The length beyond which strings can be pumped
Goal | Show that no matter how a long string is split, the conditions will fail
Outcome | If conditions fail → contradiction → language is not regular
Example
• Example 1: Language L₁ = {aⁿbⁿ | n ≥ 0}. This
language contains strings with some number of a’s followed
by an equal number of b’s
• (e.g., ab, aabb, aaabbb, etc.).
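The proof steps can be traced for L₁ with the string s = aᵖbᵖ. Since |xy| ≤ p, the part y lies entirely inside the leading a's, so pumping with i = 2 adds a's without adding b's and the result leaves L₁. The sketch below checks this for every allowed split and for a few sample values of p. The function names are assumptions, and trying a handful of values of p illustrates the argument without replacing the general proof.

```python
def in_anbn(word):
    """Membership test for L1 = { a^n b^n | n >= 0 }."""
    n = len(word) // 2
    return word == "a" * n + "b" * n

def every_split_fails(p, i=2):
    """True if, for s = a^p b^p, every split with |y| >= 1 and |xy| <= p
    gives a pumped string x + y*i + z that is outside L1."""
    s = "a" * p + "b" * p
    for end_xy in range(1, p + 1):        # |xy| <= p
        for start_y in range(end_xy):     # |y| >= 1
            x, y, z = s[:start_y], s[start_y:end_xy], s[end_xy:]
            if in_anbn(x + y * i + z):
                return False              # this split survives pumping
    return True

print(all(every_split_fails(p) for p in range(1, 11)))
```

The same check with i = 0 also succeeds, because removing y leaves fewer a's than b's.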
End Of Chapter
Instructor: Biniyam E.
