Complexity and Computability Regular Expressions
Regular Expressions
Regular expressions (REs) describe the structure of data, especially text
strings
They describe exactly the languages that finite automata accept
They define all and only the regular languages, giving an algebraic
description in comparison to the machine-like description of automata
They give a declarative way to express sets of strings, and therefore serve
as the input language for many string-processing systems
Safari - Yonasi (Makerere University) CSC 2210 2012/2013 1 / 29
Constructing a RE
General rule: To construct a RE for the language consisting of only the
string w, use w itself as the RE
Example: Write a regular expression for the set of strings consisting of
alternating 0’s and 1’s.
First develop a regular expression for the language consisting of the single
string 01, then use the star operator to get an expression for all strings of
the form 0101...01
0 and 1 are regular expressions for the languages {0} and {1}
Concatenating the expressions gives a regular expression for the
language {01}, RE = 01
The RE 01 will be used for the construction
Therefore, the strings consisting of zero or more occurrences of 01 are
described by (01)∗. Note that (01)∗ is not the same as 01∗
This gives the language L((01)∗), which does not entirely satisfy what
we want, because it only contains strings beginning with 0 and
ending with 1.
Consider also the possibility of having a 1 at the beginning and/or a 0 at
the end
For the three other possibilities, construct a RE for each:
0(10)∗ for strings that begin and end with 0,
1(01)∗ for those that begin and end with 1,
(10)∗ for those that begin with 1 and end with 0
This gives:
T(G) = (01)∗ + (10)∗ + 0(10)∗ + 1(01)∗
Note: The union operator collects all four cases, so the expression describes
every string of alternating 0’s and 1’s.
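The four-way union can be checked mechanically. The sketch below is only an illustration, not part of the lecture: it assumes Python's `re` module, where union is written `|` rather than `+`, and it compares the expression against a direct test for "no two adjacent symbols are equal" on every binary string up to length 8. The names `ALTERNATING`, `is_alternating` and `matches` are hypothetical.

```python
import re
from itertools import product

# Assumed translation of (01)* + (10)* + 0(10)* + 1(01)* into Python syntax,
# where union is written "|" instead of "+".
ALTERNATING = re.compile(r"(01)*|(10)*|0(10)*|1(01)*")


def is_alternating(w: str) -> bool:
    """Direct definition: no two adjacent symbols are equal."""
    return all(a != b for a, b in zip(w, w[1:]))


def matches(w: str) -> bool:
    """True when the whole string w belongs to the language of the RE."""
    return ALTERNATING.fullmatch(w) is not None


# Collect every string of length 0..8 on which the two tests disagree.
disagreements = [
    "".join(bits)
    for n in range(9)
    for bits in product("01", repeat=n)
    if matches("".join(bits)) != is_alternating("".join(bits))
]
print(disagreements)
```

An empty list of disagreements supports the construction for short strings; it is a bounded check, not a proof.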
Regular Sets
Definition
Let Σ be an alphabet. Regular expressions over Σ, and the regular sets they
denote, are defined as follows:
Basis:
The constant ∅ is a regular expression denoting the empty language,
L(∅) = ∅, which is a regular set over Σ
The constant ϵ is a regular expression denoting the language that contains
only the empty word, L(ϵ) = {ϵ}, which is a regular set over Σ
For every a ∈ Σ, the symbol a is a regular expression denoting the language
L(a) = {a}, which is a regular set over Σ
A variable such as L represents any language
Induction:
If P and Q are regular expressions over Σ, then:
P + Q is a regular expression denoting the union of L(P) and L(Q),
that is, L(P + Q) = L(P) ∪ L(Q)
PQ is a regular expression denoting the concatenation of the languages,
L(PQ) = L(P)L(Q)
P∗ is a regular expression denoting the closure of L(P), L(P∗) = (L(P))∗
(P) is a regular expression denoting the same language as P,
L((P)) = L(P)
Nothing else is a regular set
Operations on Regular Sets
Therefore, a subset of Σ∗ is regular iff it is one of the basis sets above,
or can be obtained from them by a finite number of applications of the
operations of union, concatenation and closure.
Example: The RE 01∗ + 10∗ represents the language whose strings
are either a single 0 followed by any number of 1’s or a single 1 followed
by any number of 0’s
Operator Precedence
Definition
The order of precedence for operators, from highest to lowest, is:
∗ - closure/star operator, applies to the smallest well-formed RE to its left
• - concatenation operator, grouped from the left when several occur in a row
Note: RS ≠ SR in general (concatenation is order sensitive)
+ - union operator
( ) - parentheses can be used to group operands and to override the
precedence order.
Example
Consider the arithmetic expressions xy + z and x − y − z. How
would you group the operands?
Group the RE 01∗ + 1:
(1∗), then (0(1∗)), then (0(1∗)) + 1, the language consisting of the
string 1 and all strings consisting of a 0 followed by any number of 1’s
(possibly none)
If grouped as (01)∗ + 1 (concatenation before the star), it represents the
language containing the string 1 and the strings that repeat 01 zero or more
times
If grouped as 0(1∗ + 1) (the union first), it represents the language of
strings beginning with 0, followed by any number of 1’s.
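The three groupings can be compared on a handful of strings. This is a small sketch under the assumption that Python's `re` module is an acceptable stand-in for the slide notation (union is `|` there, and its precedence order of star, then concatenation, then union is the same). The helper `language` and the list `candidates` are hypothetical names.

```python
import re


def language(pattern, candidates):
    """Return the candidate strings that the pattern matches in full."""
    return [w for w in candidates if re.fullmatch(pattern, w)]


candidates = ["", "0", "1", "01", "011", "0101", "00"]

# Default precedence: star first, then concatenation, then union.
default_grouping = language(r"01*|1", candidates)     # (0(1*)) + 1
# Parentheses force the concatenation 01 to happen before the star.
star_after_concat = language(r"(01)*|1", candidates)  # (01)* + 1
# Parentheses force the union to happen before the concatenation with 0.
union_first = language(r"0(1*|1)", candidates)        # 0(1* + 1)

print(default_grouping)
print(star_after_concat)
print(union_first)
```

Each grouping selects a different subset of the candidates, which is why parentheses are needed whenever the default order is not the intended one.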
Applications
Regular expressions are used in important applications such as:
Text search, which is accomplished by converting the RE into a DFA
or NFA,
Building compiler components such as the lexical analyzer, whose tokens
are described by REs
Properties of Operations on REs
For the regular sets R, S and T formed from Σ∗,
R + R = R - the idempotent law for union
R + ∅ = R - the identity for union
R + S = S + R - the commutative law for union
(R + S) + T = R + (S + T) - associative law for union
(RS)T = R(ST) = RST - associative law for concatenation
Note that there is no commutative law for concatenation
Rϵ = ϵR = R - identity for concatenation
(R + S)T = RT + ST - right distributive law of concatenation over
union
T(R + S) = TR + TS - left distributive law of concatenation over
union
Simplifications for the Operators
For a RE converted to an ϵ−NFA, some simplifications exist for the
operator constructs:
For the union operator, instead of creating new start and accepting
states, merge two start states into one with all the transitions of both
start states. Similarly, merge the two accepting states
For the concatenation operator, merge the accepting state of the first
automaton with the start of the second
For the closure, add ϵ− transitions from the accepting state to the
start state and vice-versa
Transition Graphs from REs
Basis:
Start by constructing transition diagrams (small automata) for the basic
expressions
ϵ,
∅, and
a, for a symbol a
Induction:
Then combine these automata inductively to form larger automata
that accept the
union,
concatenation and
closure
of the languages accepted by the smaller automata
Testing RE Properties
To test whether a law R = S holds, where R and S are REs over the same set of
variables:
Convert R and S to concrete REs C and D, respectively, by replacing
each variable by a distinct concrete symbol
Test whether L(C) = L(D)
If so, R = S is a true law, else it is false
Note: If the languages are not the same, it is sufficient to provide a single
string that is in one language but not the other
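The test can be approximated by brute force. The sketch below is an illustration under stated assumptions: the variables R and S are replaced by the symbols a and b, the concrete REs are written in Python `re` syntax (`|` for union), and the two languages are compared only on strings up to a fixed length. Agreement up to that bound is evidence rather than a proof, but any string it returns is a genuine counterexample. The names `language_up_to` and `counterexample` are hypothetical.

```python
import re
from itertools import product


def language_up_to(pattern, alphabet="ab", max_len=6):
    """All strings over the alphabet, up to max_len, that the pattern matches."""
    words = (
        "".join(t)
        for n in range(max_len + 1)
        for t in product(alphabet, repeat=n)
    )
    return {w for w in words if re.fullmatch(pattern, w)}


def counterexample(c, d, alphabet="ab", max_len=6):
    """Shortest string in exactly one of L(c), L(d), or None if none is found."""
    difference = language_up_to(c, alphabet, max_len) ^ language_up_to(d, alphabet, max_len)
    if not difference:
        return None
    return min(difference, key=lambda w: (len(w), w))


# R + S = S + R with R = a, S = b: no disagreement within the bound.
print(counterexample("a|b", "b|a"))
# RS = SR with R = a, S = b: a single witness string refutes the law.
print(counterexample("ab", "ba"))
```

The second call illustrates the note above: one string that lies in only one of the two languages is enough to show that a proposed law is false.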
Operations on REs
Given two sets of words (languages) R and S over Σ,
The union of R and S, denoted R ∪ S (written R + S in REs), is the set of
words that are in R or in S or in both:
R + S = {x : x ∈ R or x ∈ S} - UNION SET
The concatenation of languages R and S is the set of strings formed
by taking any string in R and concatenating it with any string in S:
R • S = {xy : x ∈ R, y ∈ S} - CONCATENATION SET
The closure (star or Kleene closure) of a language R, denoted R∗, is the
set of words that can be formed by taking any number of strings
from R (the same string may be repeated) and concatenating them
R∗ = {ϵ} + {x : x is obtained by concatenating a finite number of
words of R}
= {ϵ} + R + R^2 + ... + R^i + ..., where R^0 = {ϵ} is the zeroth power of R
1 R^i = (...(RR)R...)R, the concatenation of i copies of R
2 Note that for R = {0, 1}, R^i has 2^i members; in general R^i has at most
|R|^i members
3 R∗ = ∪_{i≥0} R^i
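For the empty language ∅,
∅^0 = {ϵ} and
∅^i is empty for all i ≥ 1 (no strings can be selected from an empty set)
∅∗ = {ϵ}
For a finite R, each R^i is a finite set, but R∗ is usually an infinite set
Exceptions
1 The closure of the empty language, ∅∗ = {ϵ}, is not an infinite set;
the same holds for {ϵ}∗ = {ϵ}
2 If R is the set of all strings of 0’s (including ϵ), then R^0 = {ϵ},
R^1 = R, R^2 = R, and so R∗ = R
In this case R∗ is infinite, but taking the closure adds no new strings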
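The powers of a language and a finite part of its closure can be computed directly for small examples. This is a minimal sketch, assuming languages are represented as Python sets of strings and the empty string `""` stands for ϵ; since R∗ is usually infinite, `closure_up_to` only collects the powers R^0 through R^k. The function names are hypothetical.

```python
def concat(r, s):
    """Concatenation of two languages: every x from r followed by every y from s."""
    return {x + y for x in r for y in s}


def power(r, i):
    """R^i, starting from R^0 = {ϵ} (the empty string stands for ϵ)."""
    result = {""}
    for _ in range(i):
        result = concat(result, r)
    return result


def closure_up_to(r, k):
    """Union of R^0 .. R^k, a finite approximation of R*."""
    words = set()
    for i in range(k + 1):
        words |= power(r, i)
    return words


binary = {"0", "1"}
print(len(power(binary, 3)))                  # number of strings in {0,1}^3
print(power(set(), 0), power(set(), 2))       # powers of the empty language
print(closure_up_to(set(), 5))                # finite part of the closure of ∅
print(sorted(closure_up_to({"0"}, 3)))        # first few strings of {0}*
```

For the empty language the loop never adds anything beyond ϵ, which matches ∅∗ = {ϵ}.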
Union Operator
Consider the RE given by R + S
Starting at the new start state, the automaton can move along an ϵ arc to the
start state of the automaton for either R or S
The accepting state of that automaton is reached by following a
path labelled by some string in L(R) or L(S)
Then one of the ϵ arcs is followed to the accepting state of the new
automaton, giving the language of the automaton as L(R) ∪ L(S)
Concatenation
For RS,
The start state of the first automaton becomes the start state of the
whole automaton,
The accepting state of the second automaton becomes the accepting
state of the whole
An accepting path goes first through the automaton for R (along a string in
L(R)), then through the automaton for S (along a string in L(S)), giving the
language L(R)L(S)
Closure
When drawing transition graphs for regular expressions of the form R∗:
Draw the graph for R,
To obtain R∗, add two nodes to serve as the new start and final nodes,
and let the other nodes be intermediate nodes. Join the new nodes to the
graph of R with arcs labeled ϵ
Add an arc labeled ϵ from the new start node to the new final node,
irrespective of what R may be (remember that for R∗, ϵ
must be an accepted string)
Add another arc labeled ϵ from the final node of R back to the initial node
of R. This allows a path to go through the automaton for R
one or more times before reaching the accepting state.
Note:
From the new start state, there is a path directly to the accepting state
along an arc labelled ϵ
Also, there is a path from the new start state to the start state of R,
through the automaton for R one or more times, to the accepting state
(R)
The expression (R) is the same as R, since parentheses do not change
the language defined by the expression. The representation for (R) is
therefore the same as that for R.
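The constructions for union, concatenation and closure can be sketched in code. This is an illustrative sketch rather than the lecture's own implementation: the class `Builder`, its method names, and the representation of a fragment as a `(start, accept)` pair are all assumptions, and the empty string is used as the label of ϵ arcs. Concatenation is joined with an ϵ arc here, without the merging simplification.

```python
from collections import defaultdict
from itertools import count

EPS = ""  # label used for ϵ arcs (an assumption of this sketch)


class Builder:
    """Builds ϵ-NFA fragments; each fragment is a (start, accept) pair."""

    def __init__(self):
        self.arcs = defaultdict(set)  # (state, label) -> set of next states
        self._ids = count()

    def _new(self):
        return next(self._ids)

    def _arc(self, src, label, dst):
        self.arcs[(src, label)].add(dst)

    def symbol(self, a):
        """Basis: two states joined by one arc labelled a."""
        start, accept = self._new(), self._new()
        self._arc(start, a, accept)
        return start, accept

    def union(self, r, s):
        """New start and accept states, joined to both fragments by ϵ arcs."""
        start, accept = self._new(), self._new()
        for frag in (r, s):
            self._arc(start, EPS, frag[0])
            self._arc(frag[1], EPS, accept)
        return start, accept

    def concat(self, r, s):
        """Accepting state of r is joined to the start state of s by an ϵ arc."""
        self._arc(r[1], EPS, s[0])
        return r[0], s[1]

    def star(self, r):
        """New start and accept states, a bypass arc for ϵ, and a loop back."""
        start, accept = self._new(), self._new()
        self._arc(start, EPS, r[0])
        self._arc(r[1], EPS, accept)
        self._arc(start, EPS, accept)  # ϵ must be accepted
        self._arc(r[1], EPS, r[0])     # go through r again
        return start, accept

    def _closure(self, states):
        """All states reachable from the given states along ϵ arcs only."""
        seen, stack = set(states), list(states)
        while stack:
            q = stack.pop()
            for nxt in self.arcs.get((q, EPS), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def accepts(self, frag, w):
        """Simulate the fragment on w by tracking the set of reachable states."""
        current = self._closure({frag[0]})
        for ch in w:
            moved = set()
            for q in current:
                moved |= self.arcs.get((q, ch), set())
            current = self._closure(moved)
        return frag[1] in current


# Usage: build an automaton for the expression 0 + 11*.
b = Builder()
g1 = b.symbol("0")
g2 = b.concat(b.symbol("1"), b.star(b.symbol("1")))
g = b.union(g1, g2)
accepted = [w for w in ["", "0", "1", "11", "111", "01", "10"] if b.accepts(g, w)]
print(accepted)
```

Every fragment built this way keeps a single start state and a single accepting state, which is what lets the three operations be applied repeatedly.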
Construction of Transition Graphs from REs
Just as was done for the finite state system, transition graphs can
also be constructed for REs
This is achieved by drawing the equivalent ϵ−NFAs for the RE
The ϵ−NFAs have a single accepting state
Generally, there are no arcs into the initial state or out of the
accepting state
Example 1
Consider the RE represented by T(G) = 0 + 11∗.
To draw a transition graph (ϵ−NFA) for this RE:
Consider the sub-expressions of the RE, and for each, assume that its
language is also that of an ϵ−NFA with one
accepting state:
Let them be T(G1) = 0 and T(G2) = 11∗
Construct transition graphs for each of the sub-expressions
For T(G1): [figure: transition graph for 0, not reproduced]
For T(G2): [figure: transition graph for 11∗, not reproduced]
Combine the two transition graphs using the union operator to obtain
the transition graph for T(G)
Introduce two new nodes to combine the two transition graphs, and connect
them to the graphs with arcs labeled ϵ, the empty string
Note that ϵ labels disappear when the labels along a path are concatenated,
so the added arcs do not change the strings that are accepted
Let one of the nodes introduced be an initial node and the other a
final node
This is a viable transition graph for T (G )
However, it is inefficient to have so many ϵ’s
Reducing the ϵ’s (bearing in mind that ϵ’s disappear in
concatenation) gives the resulting transition graph
for T(G) [figure not reproduced]
Note that for the union operator, the strings of both 0 and 11∗ are accepted
by the combined transition graph.
Example 2
To concatenate T(G1) = 0 and T(G2) = 11∗ for the expression
T(G) = 011∗:
For concatenation, the final state of the transition graph of the first
sub-expression is joined to the initial state of the second
For the example under consideration, the graph of T(G1) is joined to that
of T(G2) [figure not reproduced]
The extra transition that joins the two graphs is
labeled with ϵ
Reducing the graph gives the resulting graph for the RE of T(G)
[figure not reproduced]
Example 3
For the closure operator, consider the RE represented by T(G) = (01∗)∗
Draw T(G1) = 01∗
Then construct T(G) = (01∗)∗
Characteristics of the ϵ− NFA
They have exactly one accepting state
There are no arcs into the initial state
There are no arcs out of the accepting state
Exercise
Draw transition graphs to represent the following expressions:
1 01∗
2 (0 + 1)01
3 (10 + 0∗1)∗1
Example
Let Σ = {a, b}, R = {aa, ab} = {a^2, ab} and S = {aba, ab, ba}, then:
R + S = {a^2, ab, aba, ba}
RS = {a^2, ab} • {aba, ab, ba} =
{a^2aba, a^2ab, a^2ba, ababa, abab, abba}
R∗ = {ϵ} + {a^2, ab} + {a^2a^2, a^2ab, aba^2, abab} + ...
= {ϵ, a^2, ab, a^2a^2, a^2ab, aba^2, abab, ...}
= L((a^2 + ab)∗), every string formed by concatenating zero or more copies
of a^2 and ab in any order
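The finite parts of this example can be reproduced with ordinary set operations. The sketch below assumes the languages are held as Python sets of strings; the variable names are hypothetical, and only R + S, RS and the second power of R are computed, since R∗ itself is infinite.

```python
R = {"aa", "ab"}
S = {"aba", "ab", "ba"}

# Union: words that are in R or in S or in both.
union = R | S

# Concatenation: any word of R followed by any word of S.
concatenation = {x + y for x in R for y in S}

# Second power of R: any word of R followed by any word of R.
r_squared = {x + y for x in R for y in R}

print(sorted(union))
print(sorted(concatenation))
print(sorted(r_squared))
```

The word ab occurs in both R and S but appears only once in the union, because languages are sets.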
Exercise
Given Σ = {0, 1}, L = {001, 10, 111} and M = {ϵ, 001}, find:
1 L ∪ M
2 L • M
3 L∗
4 M∗
Example 1
Consider the RE 0 + 01∗ and apply some of the laws/properties of REs:
0 can be factored out of the union, but first the lone RE 0 has to be
rewritten as a concatenation
Using the identity for concatenation, 0 = 0ϵ, giving the RE 0ϵ + 01∗
Applying the left distributive law gives 0(ϵ + 1∗)
However, ϵ ∈ L(1∗), so ϵ + 1∗ = 1∗, giving:
0(ϵ + 1∗) = 01∗
Assignment: Define the language of this RE
Example 2
Prove the law: (R + S)∗ = (R∗S∗)∗
Let R = a, S = b ⇒ (a + b)∗ = (a∗b∗)∗
LHS: (a + b)∗ = {ϵ, a, b, aa, ab, ba, bb, ...}, all strings of a’s and b’s mixed
RHS: (a∗b∗)∗:
a∗ = {ϵ, a, aa, aaa, ...}
b∗ = {ϵ, b, bb, bbb, ...}
(a∗b∗) = {ϵ, a, aa, aaa, ...}{ϵ, b, bb, bbb, ...} = {ϵ, a, b, aa, ab, bb,
abb, bbb, ...}
(a∗b∗)∗ = {ϵ, a, b, aa, ab, bb, abb, bbb, ...}∗ = {ϵ, a, b, aa, ab, ba, bb, ...},
also all strings of a’s and b’s mixed: a and b are both in L(a∗b∗), so every
string of a’s and b’s is a concatenation of strings from L(a∗b∗)
∴ LHS = RHS, (R + S)∗ = (R∗S∗)∗
Exercise
Prove or disprove the following statements on REs:
R∗R∗ = R∗
(R + S )∗ = R∗ + S∗
(RS + S )∗RS = (RR∗S )∗
Transition Graphs for REs
Let Σ = {0, 1} be a two-letter alphabet. The transition graph, G , over Σ
consists of:
A finite set of nodes, with at least one labelled as the initial and some
(may be more than one) labelled as final states.
Oriented branches (which may be represented as ordered pairs of
nodes, arrows, arcs or loops)
Every arrow is labelled with a 0, 1, or ϵ
A word, w , is accepted by a transition graph if there exists a path
from an initial node to a final node such that the labels of the arrows
of the path form the word w , after the ϵ’s are deleted (ϵ’s disappear
in concatenation)
The empty string, ϵ, is accepted if one node is both a start and final
node or if there exists a path from an initial to final node whose
arrows are all labelled with ϵ’s.
The set of words accepted by a transition graph is denoted by T(G)
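The acceptance condition can be sketched as a small simulation over an explicit list of arcs. Everything concrete here is an assumption made for illustration: the graph is a hypothetical one for 11∗ with nodes q0, q1 and q2, the empty string is used as the label of ϵ arrows, and the function names are invented.

```python
def eps_closure(nodes, arcs):
    """All nodes reachable from the given nodes along arrows labelled ϵ."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        q = stack.pop()
        for src, label, dst in arcs:
            if src == q and label == "" and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def accepted(word, arcs, initial, final):
    """True if some path from an initial to a final node spells the word,
    once the ϵ labels along the path are deleted."""
    current = eps_closure(initial, arcs)
    for ch in word:
        step = {dst for src, label, dst in arcs if src in current and label == ch}
        current = eps_closure(step, arcs)
    return bool(current & final)


# Hypothetical transition graph for 11*: q0 -1-> q1 -ϵ-> q2, with a 1-loop on q2.
arcs = [("q0", "1", "q1"), ("q1", "", "q2"), ("q2", "1", "q2")]
initial, final = {"q0"}, {"q2"}

for w in ["", "1", "111", "10"]:
    print(repr(w), accepted(w, arcs, initial, final))
```

Because `initial` and `final` are sets, the same function covers graphs with several initial or several final nodes, as the definition allows.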
Examples
T(G) = {1}, accepts only 1
T(G) = L(1∗) = {ϵ, 1, 11, 111, ...}
T(G) = ∅, does not accept anything
T(G) = {ϵ}
T(G) = L(11∗) = {1, 11, 111, ...}, the empty word ϵ is not accepted
Exercise
What sets of words are accepted by the following transition graphs?
Assignment
1 Write a regular expression for the language consisting of strings of 0’s
and 1’s such that every pair of adjacent 0’s appears before any pair of
adjacent 1’s.
2 Give the English description of the language of the RE
(1 + ϵ)(00∗1)∗0∗

  • 1. Complexity and Computability Regular Expressions Regular Expressions Regular Expressions(RE) denote structure of data, especially text strings They describe the same strings as those defined by finite automata Define all and only regular languages (algebraic description of languages) in comparision to the machine-like descriptions Give a declarative way to express strings, therefore serve as input language for many string-processing systems Safari - Yonasi (Makerere University) CSC 2210 2012/2013 1 / 29
  • 2. Complexity and Computability Regular Expressions Constructing a RE General rule: To construct a RE for the language consisting of only the string w , use w itself as a RE Example: Write a regular expression for the set of strings consisting of alternating 0’s and 1’s. First develop a regular expression for the language consisting of single string 01, then use star operator to get expression for all strings of form 0guatda.com/cmx.p101...01 0 and 1 are regular expressions of the languages {0} and {1} Concatenating the expressions gives a regular expression for the language {01}, RE = 01 The RE 01 will be used for the construction Safari - Yonasi (Makerere University) CSC 2210 2012/2013 26 / 29
  • 3. Complexity and Computability Regular Expressions Constructing a RE Therefore, strings consisting of zero or more occurrences of 01 will be (01)∗. Note that (01)∗ is not the same as 01∗ This gives the language L((01)∗) which does not entirely satisfy what we want, because it only considers strings beginning with 0 and ending with 1. Consider also possibility of having a 1 at the beginning and or a 0 at the end Safari - Yonasi (Makerere University) CSC 2210 2012/2013 27 / 29
  • 4. Complexity and Computability Regular Expressions Constructing a RE For the three other possibilities, construct a RE for each, 0(10)∗ for strings that begin and end with 0, 1(01)∗ for those that begin and end with 1, (10)∗ for those that begin with 1 and end with 0 This gives: T (G ) = (01)∗ + (10)∗ + 0(10)∗ + 1(01)∗ Note: Union operator gives collective set of all possibilities of strings with alternating 0s, 1s (it combines the whole set of possible strings). Safari - Yonasi (Makerere University) CSC 2210 2012/2013 28 / 29
  • 5. Complexity and Computability Regular Expressions Regular Sets Definition Let Σ be an alphabet. A regular set over Σ is defined as: Basis: The constant, ∅, is a regular expression denoting the empty language (L(∅) = ∅), which is a regular set of Σ The constant, ϵ, is a regular expression denoting the empty word (ϵ), and the language L(ϵ) = {ϵ} is a regular set of Σ The symbol a is a regular expression, a, denoting the language {a}. L(a) = {a} is a regular set of Σ ∀ a ∈ Σ The variable L represents any language Safari - Yonasi (Makerere University) CSC 2210 2012/2013 8 / 29
  • 6. Complexity and Computability Regular Expressions Regular Sets Induction: If P and Q are regular sets/regular expressions over Σ, then: P + Q denotes a regular expression of the union of L(P) and L(Q), that is, L(P + Q) = L(P) ∪ L(Q) PQ is a regular expression denoting concatenation of the languages, L(PQ) = L(P)L(Q) P∗ is a RE denoting closure of L(P), L(P∗) = (L(P))∗ (P) is a regular expression denoting the same language as P, L((P)) = L(P) Nothing else is a regular set Safari - Yonasi (Makerere University) CSC 2210 2012/2013 9 / 29
  • 7. Complexity and Computability Regular Expressions Operations on Regular Sets Therefore, a set of Σ∗ is regular iff it falls in any of the above conditions, or can be obtained from them by a finite number of applications of the operations of union, concatenation and closure. Example: The RE 01∗ + 10∗ represents the language having strings that are either a single zero followed by any number of 1’s or a single 1 followed by any number of 0’s Safari - Yonasi (Makerere University) CSC 2210 2012/2013 10 / 29
  • 8. Complexity and Computability Regular Expressions Operator Precedence Definition The order of precedence for operators in decreasing order is: ∗ - closure/star operator, applies to a well-formed RE to its left • - concatenation operator, from left if similar Note: RS = SR (order sensitive) + - union operator ( ) - parentheses can be used to group operands, and help to override the precedence order. Safari - Yonasi (Makerere University) CSC 2210 2012/2013 11 / 29
  • 9. Complexity and Computability Regular Expressions Example Consider the mathematical expression xy + z or x − y − z. How would you group the operands? Group the RE 01∗ + 1: (1∗), then (0(1∗)), then (0(1∗)) + 1, the language consisting of the string 1 and all strings consisting of a 0 followed by any number of 1’s (which may also be none) If grouped as (01)∗ + 1 (the dot before the star), it represents the language having the string 1 and the strings repeating 01 (zero or more times) If grouped as 0(1∗ + 1), (the union first), it represents the language of strings beginning with 0, followed by any number of 1’s. Safari - Yonasi (Makerere University) CSC 2210 2012/2013 12 / 29
  • 10. Complexity and Computability Regular Expressions Applications They are used to express important applications such as: Text search, which is accomplished by converting the RE into a DFA or NFA, Building compiler components by describing their software components, such as the lexical analyzer Safari - Yonasi (Makerere University) CSC 2210 2012/2013 2 / 29
  • 11. Complexity and Computability Regular Expressions Properties of Operations on REs For the regular sets R, S and T formed from Σ∗, R + R = R (the indempotent law for union), R + ∅ = R (the identity for union) R + S = S + R, - the commutative law for union (R + S ) +T = R+ (S + T ), - associative law for union (RS )T = R(ST ) = RST , - associative law for concatenation Note that there is no commutative law for concatenation Rϵ = ϵR = R - identity for concatenation (R + S )T = RT + ST - right distributive law of concatenation over union T (R + S ) = TR + TS - left distributive law of concatenation over union Safari - Yonasi (Makerere University) CSC 2210 2012/2013 14 / 29
  • 12. Simplifications for the Operators For a RE converted to an ϵ−NFA, some simplifications exist for the operator constructs: For the union operator, instead of creating new start and accepting states, merge two start states into one with all the transitions of both start states. Similarly, merge the two accepting states For the concatenation operator, merge the accepting state of the first automaton with the start of the second For the closure, add ϵ− transitions from the accepting state to the start state and vice-versa Safari-Yonasi (Makerere University) CSC 2210 2012/2013 37 / 47
  • 13. Transition Graphs from REs Basis: Start by constucting transition diagrams for basic expressions for smaller automata, ϵ, ∅, and RE a Induction: Then combining these automata inductively, to form larger automata that accept the different operations of union, concatenation and closure of the languages accepted by smaller automata Safari-Yonasi (Makerere University) CSC 2210 2012/2013 31 / 47
  • 14. Complexity and Computability Regular Expressions Testing RE Properties To test whether two REs R = S, where R and S have the same set of variables: Convert R and S to concrete RE C and D, respectively by replacing each variable by a concrete symbol Test whether L(C) = L(D) If so, R = S is a true law, else false Note: If the languages are not the same, it is sufficient to provide a single string that is in one language but not the other Safari - Yonasi (Makerere University) CSC 2210 2012/2013 17 / 29
  • 15. Complexity and Computability Regular Expressions Operations on REs Given two sets of words R and S (languages) from Σ∗, The union of R and S denoted R ∪ S is the set of words that are either in R or S or both: R + S = {x : x ∈ R or x ∈ S} - UNION SET The concatenation of languages R and S is the set of strings formed by taking any string in R and concatenating it with any string in S : R • S = {xy : x ∈ R,y ∈ S} - CONCATENATION SET Safari - Yonasi (Makerere University) CSC 2210 2012/2013 3 / 29
  • 16. Complexity and Computability Regular Expressions Operations on REs The closure (star or Kleene closure) of language R denoted R∗ is the set of the words that can be formed by taking any number of strings from R (same string may be repeated) and concatenating the words R∗ = ϵ + {x : x is obtained by concatenating a finite number of words of R} = ϵ + R + R2 + + Ri + ..., where R0 = ϵ, the zeroth power of R 1 Ri = (...(RR)R...)R i times 2 Note that Ri has 2i members 3 R∗ = ∪i≥0 Ri Safari - Yonasi (Makerere University) CSC 2210 2012/2013 4 / 29
  • 17. Complexity and Computability Regular Expressions Operations on REs For the empty language ∅, ∅0 = {ϵ} and ∅i∀ i ≥ 1 is empty (no strings can be selected from an empty set) ∅∗ = {ϵ} Ri is a finite set but R∗ may be an infinite set Exceptions 1 2 The closure of the language ∅∗, is not an infinite set If R is a string of 0’s, R0 ={ϵ}, R1 = R, R2 = R ⇔ R∗ = R In this case, R∗ is not infinite Safari - Yonasi (Makerere University) CSC 2210 2012/2013 5 / 29
  • 18. Union Operator Consider the RE given by R + S Starting at the new start state, the automaton can transition to the start state of either R or S The accepting state of one of the automata is reached by following path labelled by some string in L(R) or L(S ) Then one of the ϵ arcs is followed to the accepting state of the new automaton, giving the language of the automaton as L(R) ∪ L(S ) Safari-Yonasi (Makerere University) CSC 2210 2012/2013 32 / 47
  • 19. Concatenation For RS, The start state of the first automaton becomes the start state of the whole RE, The accepting state of the second automaton becomes the accepting state of the whole Paths for acceptance go through R (by string in L(R)), then S (L(S )), giving the language L(R)L(S ) Safari-Yonasi (Makerere University) CSC 2210 2012/2013 33 / 47
  • 20. Closure When drawing transition graphs for regular expressions of the form R∗: Draw the graph for R, To include R∗, add two nodes to represent the start and final nodes, let the other nodes be intermediate nodes. Join the new nodes to the graph of R with arcs labeled ϵ Add an arc from the start state of the new graph to the final node, labeled ϵ, irrespective of what R may be (remember that for R∗, ϵ must be an accepted string) Add another arc from the final state of R to the initial node of R labeled ϵ. This allows motion to start for R, through the automaton one or more times, to the accepting state. Safari-Yonasi (Makerere University) CSC 2210 2012/2013 34 / 47
  • 21. Closure, note: From the new start state there is a path directly to the accepting state along the arc labelled ϵ. There is also a path from the new start state to the start state of R, through the automaton for R one or more times, and then to the accepting state.
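The union, concatenation and closure constructions described above can be sketched as a small Python program that builds ϵ-NFAs with one start and one accepting state. This is a minimal sketch under assumptions: the class `NFA`, the helper names, the integer state numbers and the use of `None` as the ϵ label are all choices made for this illustration, not part of the slides.

```python
from itertools import count

_ids = count()  # supplies fresh state numbers

class NFA:
    """An epsilon-NFA fragment with a single start and a single accepting state."""
    def __init__(self, start, accept, edges):
        self.start, self.accept = start, accept
        self.edges = edges  # list of (source, label, target); label None means epsilon

def symbol(a):
    s, f = next(_ids), next(_ids)
    return NFA(s, f, [(s, a, f)])

def union(r, s):
    # New start with epsilon arcs into r and s; epsilon arcs from both into a new accept.
    st, f = next(_ids), next(_ids)
    extra = [(st, None, r.start), (st, None, s.start),
             (r.accept, None, f), (s.accept, None, f)]
    return NFA(st, f, r.edges + s.edges + extra)

def concat(r, s):
    # Start of r is the start of the whole; accept of s is the accept of the whole.
    return NFA(r.start, s.accept, r.edges + s.edges + [(r.accept, None, s.start)])

def star(r):
    # Two new nodes, an epsilon bypass for the empty word, and an epsilon loop back into r.
    st, f = next(_ids), next(_ids)
    extra = [(st, None, r.start), (r.accept, None, f),
             (st, None, f), (r.accept, None, r.start)]
    return NFA(st, f, r.edges + extra)

def eps_closure(nfa, states):
    seen, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for src, label, dst in nfa.edges:
            if src == q and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(nfa, word):
    current = eps_closure(nfa, {nfa.start})
    for ch in word:
        moved = {dst for src, label, dst in nfa.edges if src in current and label == ch}
        current = eps_closure(nfa, moved)
    return nfa.accept in current

# 0 + 11*, built from its sub-expressions
g = union(symbol("0"), concat(symbol("1"), star(symbol("1"))))
print(accepts(g, "0"), accepts(g, "111"), accepts(g, ""))  # True True False
```

The graphs built this way keep every ϵ arc; they are not reduced, so they are larger than a hand-drawn graph would be.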
  • 22. (R): The expression (R) is the same as R, since parentheses do not change the language defined by the expression. The representation for (R) is therefore the same as that for R.
  • 23. Construction of Transition Graphs from REs: Just as was done for the finite state system, transition graphs can also be constructed for REs. This is achieved by drawing the equivalent ϵ-NFA for the RE. Each such ϵ-NFA has a single accepting state. Generally, there are no arcs into the initial state or out of the accepting state.
  • 24. Example 1: Consider the RE T(G) = 0 + 11∗. To draw a transition graph (ϵ-NFA) for this RE, consider its sub-expressions and assume that the language of each sub-expression is also the language of an ϵ-NFA with one accepting state. Let the sub-expressions be T(G1) = 0 and T(G2) = 11∗, and construct a transition graph for each of them.
  • 25. Example 1: Draw the graph for T(G1) and the graph for T(G2). Combine the two transition graphs using the union operator to obtain the transition graph for T(G).
  • 26. Example 1: Introduce nodes to combine the two transition graphs and connect the nodes with ϵ, the empty-string transition. Note that ϵ transitions disappear in concatenation, so in effect they do not change the words accepted by the graph.
  • 27. Example 1: Let one of the nodes introduced be an initial node and the other a final node. This is a viable transition graph for T(G).
  • 28. Example 1 cont'd: However, it is inefficient to have so many ϵ's. Reducing the ϵ's (bearing in mind that ϵ's disappear in concatenation) gives the resulting transition graph for T(G). Note that for the union operator the transition graphs of the two sub-expressions are combined, so strings of both 0 and 11∗ are accepted by the resulting graph.
  • 29. Example 2: Concatenate T(G1) = 0 and T(G2) = 11∗ to obtain the expression T(G) = 011∗. For concatenation, the final state of the transition graph of one sub-expression is joined to the initial state of the other. In this example, the graph of T(G1) is joined to that of T(G2). The extra transition representing the combination of the two graphs is labelled ϵ.
  • 30. Example 2: Reducing the graph by removing the ϵ transition gives the resulting graph for the RE of T(G).
  • 31. Example 3: For the closure operator, consider the RE T(G) = (01∗)∗. First draw T(G1) = 01∗, then construct T(G) = (01∗)∗ from it.
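One way to check a hand-drawn transition graph for the three examples is to compare it against a regex engine on a few test words. The sketch below uses Python's standard `re` module, which writes union as `|` where the slides write `+`; the word list and the helper name `in_language` are arbitrary choices for this illustration.

```python
import re

# Slide notation on the left, the equivalent Python pattern on the right.
examples = {
    "0 + 11*": r"0|11*",
    "011*": r"011*",
    "(01*)*": r"(01*)*",
}

def in_language(pattern, word):
    """True when the whole word matches the pattern, not just a prefix of it."""
    return re.fullmatch(pattern, word) is not None

words = ["", "0", "1", "11", "01", "011", "0110"]
for name, pattern in examples.items():
    accepted = [w for w in words if in_language(pattern, w)]
    print(name, "accepts", accepted)
```

A correct transition graph for each expression should accept exactly the words listed for it and reject the others.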
  • 32. Characteristics of the ϵ-NFA:
    1. It has exactly one accepting state.
    2. There are no arcs into the initial state.
    3. There are no arcs out of the accepting state.
  • 33. Exercise: Draw transition graphs to represent the following strings/expressions:
    1. 01∗
    2. (0 + 1)01
    3. (10 + 0∗1)∗1
  • 34. Example: Let Σ = {a, b}, R = {aa, ab} = {a^2, ab} and S = {aba, ab, ba}. Then:
    R + S = {a^2, ab, aba, ba}
    RS = {a^2, ab} • {aba, ab, ba} = {a^2aba, a^2ab, a^2ba, ababa, abab, abba}
    R∗ = ϵ + {a^2, ab} + {a^2a^2, a^2ab, aba^2, abab} + ... = ϵ + {a^2, ab, a^2a^2, a^2ab, aba^2, abab} + ... = (a^2 + ab)∗
    Note that R∗ is not (a^2)^m (ab)^n with m, n ≥ 0: the word aba^2 is in R^2, yet it has ab before a^2.
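The three operations in this example can be recomputed with Python sets. This is a sketch only: `star_up_to` is a hypothetical helper that cuts R∗ off after k concatenations, and the comparison against Python's `re` patterns is a finite spot check rather than a proof.

```python
import re
from itertools import product

R = {"aa", "ab"}
S = {"aba", "ab", "ba"}

union = R | S
concat = {x + y for x, y in product(R, S)}

def star_up_to(L, k):
    """Union of L^0, ..., L^k as a finite stand-in for L*."""
    words, layer = {""}, {""}
    for _ in range(k):
        layer = {x + y for x, y in product(layer, L)}
        words |= layer
    return words

approx = star_up_to(R, 3)
print(sorted(union))   # ['aa', 'ab', 'aba', 'ba']
print(sorted(concat))  # ['aaab', 'aaaba', 'aaba', 'abab', 'ababa', 'abba']

# Every generated word matches (aa + ab)*, written (aa|ab)* in Python.
print(all(re.fullmatch(r"(aa|ab)*", w) for w in approx))  # True

# abaa is ab followed by aa, so it is in R*, but it does not fit (aa)*(ab)*.
print("abaa" in approx, re.fullmatch(r"(aa)*(ab)*", "abaa") is not None)  # True False
```

The last check is why R∗ is written as the closure of the union a^2 + ab: words of R may be taken in any order.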
  • 35. Exercise: Given Σ = {0, 1}, L = {001, 10, 111} and M = {ϵ, 001}, find:
    1. L ∪ M
    2. L • M
    3. L∗
    4. M∗
  • 36. Example 1: Consider the RE 0 + 01∗. Applying some of the laws/properties of REs, 0 can be factored out of the union, but the first term 0 would first have to be rewritten as a concatenation.
    Using the identity for concatenation, 0 = 0ϵ, giving the RE 0ϵ + 01∗.
    Applying the left distributive law gives 0(ϵ + 1∗).
    However, ϵ ∈ L(1∗), so ϵ + 1∗ = 1∗, giving 0(ϵ + 1∗) = 01∗.
    Assignment: Define the language of this RE.
  • 37. Example 2: Prove the law (R + S)∗ = (R∗S∗)∗.
    Let R = a and S = b, so the law becomes (a + b)∗ = (a∗b∗)∗.
    LHS: (a + b)∗ = {ϵ, a, b, aa, ab, ba, bb, ...}, all strings of a's and b's mixed.
    RHS: a∗ = {ϵ, a, aa, aaa, ...} and b∗ = {ϵ, b, bb, bbb, ...}, so
    a∗b∗ = {ϵ, a, aa, aaa, ...}{ϵ, b, bb, bbb, ...} = {ϵ, a, b, aa, ab, bb, aab, abb, bbb, ...}
    (a∗b∗)∗ = {ϵ, a, b, aa, ab, bb, aab, abb, bbb, ...}∗ = {ϵ, a, b, aa, ab, ba, bb, ...}, also all strings of a's and b's mixed.
    Both a and b are themselves in a∗b∗, so every string of a's and b's is a concatenation of words of a∗b∗; conversely, every such concatenation is a string of a's and b's.
    ∴ LHS = RHS, (R + S)∗ = (R∗S∗)∗
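Laws such as the ones in Examples 1 and 2 can be sanity-checked by listing every string up to some length and comparing which ones each expression accepts. The sketch below does this with Python's `re` module (union written as `|`); the helper names `language_up_to` and `agree_up_to` and the length bounds are assumptions of this illustration. Agreement up to a bound supports a law but does not prove it, whereas a single disagreement disproves it.

```python
import re
from itertools import product

def language_up_to(pattern, alphabet, n):
    """All strings over `alphabet` of length <= n that fully match `pattern`."""
    compiled = re.compile(pattern)
    words = set()
    for k in range(n + 1):
        for letters in product(alphabet, repeat=k):
            w = "".join(letters)
            if compiled.fullmatch(w):
                words.add(w)
    return words

def agree_up_to(p, q, alphabet, n):
    """True when p and q accept exactly the same strings of length <= n."""
    return language_up_to(p, alphabet, n) == language_up_to(q, alphabet, n)

print(agree_up_to(r"0|01*", r"01*", "01", 8))        # True: 0 + 01* = 01*
print(agree_up_to(r"(a|b)*", r"(a*b*)*", "ab", 8))   # True: (a + b)* = (a*b*)*
print(agree_up_to(r"(a|b)*", r"a*b*", "ab", 3))      # False: ba is not in a*b*
```

The third call shows the role of the outer star: a∗b∗ on its own forces all a's before all b's.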
  • 38. Exercise: Prove or disprove the following statements on REs:
    1. R∗R∗ = R∗
    2. (R + S)∗ = R∗ + S∗
    3. (RS + S)∗RS = (RR∗S)∗
  • 39. Transition Graphs for REs: Let Σ = {0, 1} be a two-letter alphabet. A transition graph, G, over Σ consists of a finite set of nodes, with at least one labelled as initial and some (possibly more than one) labelled as final,
  • 40. Transition Graphs for REs: together with oriented branches (which may be represented as ordered pairs of nodes, arrows, arcs or loops). Every arrow is labelled with 0, 1 or ϵ. A word w is accepted by a transition graph if there exists a path from an initial node to a final node such that the labels of the arrows on the path form the word w after the ϵ's are deleted (ϵ's disappear in concatenation). The empty string ϵ is accepted if some node is both an initial and a final node, or if there exists a path from an initial node to a final node whose arrows are all labelled ϵ. The set of words accepted by a transition graph G is denoted T(G).
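The acceptance rule above can be sketched as a short simulation that tracks the set of nodes reachable so far. In this illustration a graph is given as a list of (source, label, target) arrows plus sets of initial and final nodes; the node names, the function names and the use of `None` for an ϵ arrow are assumptions made here.

```python
EPS = None  # label used in this sketch for an epsilon arrow

def eps_closure(edges, nodes):
    """All nodes reachable from `nodes` along epsilon arrows only."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        q = stack.pop()
        for src, label, dst in edges:
            if src == q and label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(initial, final, edges, word):
    """True if some path from an initial to a final node spells `word` once epsilons are dropped."""
    current = eps_closure(edges, initial)
    for symbol in word:
        stepped = {dst for src, label, dst in edges
                   if src in current and label == symbol}
        current = eps_closure(edges, stepped)
    return bool(current & set(final))

# A hypothetical graph for 11*: A --1--> B with a loop B --1--> B.
edges = [("A", "1", "B"), ("B", "1", "B")]
print(accepts({"A"}, {"B"}, edges, "111"))  # True
print(accepts({"A"}, {"B"}, edges, ""))     # False

# A path made only of epsilon arrows accepts the empty word.
print(accepts({"P"}, {"Q"}, [("P", EPS, "Q")], ""))  # True
```

Because `initial` and `final` are sets, the sketch also covers graphs with several initial or several final nodes, as the definition allows.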
  • 41. Examples: T(G) = {1}, accepts only 1. T(G) = 1∗, accepts ϵ, 1, 11, 111, ... T(G) = ∅, does not accept anything.
  • 42. Examples: T(G) = {ϵ}, accepts only the empty word. T(G) = 11∗, the empty word ϵ is not accepted.
  • 43. Exercise: What sets of words are accepted by the following transition graphs?
  • 44. Assignment:
    1. Write a regular expression for the language consisting of strings of 0's and 1's such that every pair of adjacent 0's appears before any pair of adjacent 1's.
    2. Give the English description of the language of the RE (1 + ϵ)(00∗1)∗0∗