1. Lecture 2: Regular Language and Finite Automata
CSM 3125: Theory of Computation
Level 3 - Semester 1, 2024
Bioinformatics Engineering
Md. Saif Uddin
Lecturer
Department of Computer Science and Mathematics
Bangladesh Agricultural University
saifuddin.csm@bau.edu.bd
Theory of Computation 1/18
Lecture 1: Introduction
2. Regular Language (1/2)
Md. Saif Uddin 2/8
A regular language is a formal language that can be:
• Recognized by a finite automaton (finite-state machine), either deterministic (DFA) or nondeterministic (NFA).
• Described by a regular expression.
• Generated by a regular grammar.
Regular languages are the simplest class of languages in the Chomsky hierarchy.
They are widely used for pattern matching and for lexical analysis in compilers.
A regular language can be recognized with a fixed, finite amount of memory (the current state of the automaton); no stack or other unbounded storage is needed.
Identification of Human Value within Arguments
Md. Saif Uddin Semi-Supervised Learning on AAV Data 2/22
Theory of Computation 2/9
3. Regular Language (2/2)
Examples of Regular Languages
• L = {w ∈ {a,b}* | w has an even number of a’s}
• L = {w ∈ {0,1}* | w does not contain "11"}
• L = {ε}: the language containing only the empty string
• L = {aⁿ | n ≥ 0}: all strings of a’s
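As a quick sanity check, each of the four example languages can be written as a regular expression. The patterns below are one possible choice (they are assumptions for illustration, not taken from the lecture), tested with Python's standard `re` module.

```python
import re

# One possible regular expression per example language (assumed; not unique).
EVEN_AS = re.compile(r"(b*ab*a)*b*")   # even number of a's over {a, b}
NO_11 = re.compile(r"(0|10)*1?")       # no substring "11" over {0, 1}
ONLY_EMPTY = re.compile(r"")           # the language {ε}
ALL_AS = re.compile(r"a*")             # {a^n | n >= 0}


def in_language(pattern: re.Pattern, w: str) -> bool:
    """Return True if the whole string w matches the pattern."""
    return pattern.fullmatch(w) is not None


print(in_language(EVEN_AS, "abab"))   # True: two a's
print(in_language(NO_11, "0110"))     # False: contains "11"
```

`fullmatch` is used rather than `search` because membership in a language is a statement about the entire string, not about some substring of it.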
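Non-Regular Language
Some languages are not regular. Recognizing them requires unbounded memory, which finite automata lack.
Example:
• L = {aⁿbⁿ | n ≥ 0} is not regular,
because we need to "count" the a’s and check that exactly the same number of b’s follows. The count can be arbitrarily large, so no fixed number of states is enough; a stack (pushdown automaton) is required, and the language is context-free.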
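To make the "counting" argument concrete, here is a minimal sketch of a recognizer for aⁿbⁿ. The function name `is_anbn` is hypothetical. The point to notice is the integer `count`: it can grow without bound, which is exactly what a machine with a fixed, finite set of states cannot imitate.

```python
def is_anbn(w: str) -> bool:
    """Recognize {a^n b^n | n >= 0} using an unbounded counter."""
    count = 0
    i = 0
    # Phase 1: read the leading a's, counting them.
    while i < len(w) and w[i] == "a":
        count += 1
        i += 1
    # Phase 2: read the b's that follow, cancelling one a per b.
    while i < len(w) and w[i] == "b":
        count -= 1
        i += 1
    # Accept only if the whole input was consumed and the counts match.
    return i == len(w) and count == 0


print(is_anbn("aaabbb"))  # True
print(is_anbn("aab"))     # False
```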
4. Finite Automata
• A finite automaton is a recognizer: it decides whether an input string belongs to a given regular language.
• It consists of a finite number of states and transitions between those states based on input symbols.
• Finite automata have very limited memory: the only thing remembered is the current state, so they cannot store the input or count without bound.
• It accepts or rejects a string based on whether the sequence of transitions leads to an accepting (final) state.
Types of Finite Automata
• Deterministic Finite Automaton (DFA)
• Nondeterministic Finite Automaton (NFA)
5. Deterministic Finite Automata (DFA)
• A DFA is a type of finite automaton that has exactly one transition for each symbol from each state.
• The transition function is defined for every state and every input symbol.
• Null (ε) moves are not allowed, i.e. the machine cannot change state without reading an input symbol.
• Easy to implement and simulate.
Formal Definition: A DFA is defined as a 5-tuple:
M = (Q, Σ, δ, q0, F)
Where:
• Q is a finite set called the states,
• Σ is a finite set called the alphabet,
• δ : Q × Σ → Q is the transition function,
• q0 ∈ Q is the start state, and
• F ⊆ Q is the set of accept (final) states.
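The 5-tuple maps directly onto a small data structure. The sketch below is one way to do it, not a standard API: the class name `DFA`, its field names, and the sample machine `even_as` (even number of a's) are all assumptions made for illustration. The constructor checks that δ is total, i.e. defined for every (state, symbol) pair.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DFA:
    states: frozenset      # Q
    alphabet: frozenset    # Σ
    delta: dict            # δ as a dict: (state, symbol) -> state
    start: str             # q0
    accept: frozenset      # F

    def __post_init__(self):
        # A DFA needs exactly one transition per (state, symbol) pair.
        expected = {(q, s) for q in self.states for s in self.alphabet}
        if set(self.delta) != expected:
            raise ValueError("delta must be defined for every state and symbol")

    def accepts(self, w: str) -> bool:
        state = self.start
        for symbol in w:
            state = self.delta[(state, symbol)]  # KeyError if symbol is not in Σ
        return state in self.accept


# Hypothetical sample machine: strings over {a, b} with an even number of a's.
even_as = DFA(
    states=frozenset({"even", "odd"}),
    alphabet=frozenset({"a", "b"}),
    delta={
        ("even", "a"): "odd", ("even", "b"): "even",
        ("odd", "a"): "even", ("odd", "b"): "odd",
    },
    start="even",
    accept=frozenset({"even"}),
)

print(even_as.accepts("abab"))  # True
print(even_as.accepts("ab"))    # False
```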
6. DFA Examples (1/3)
Example 1: Construct a DFA for the following language. Give the transition diagram, the transition table and the formal definition.
L = {set of all strings over Σ = {a, b} that start with a}
  = {a, aa, ab, aab, aba, abb, aaab, aaba, ...}
Soln:
Transition diagram (q0 is the start state, q1 is the accept state, q2 is a dead state):
q0 --a--> q1
q0 --b--> q2
q1 --a, b--> q1
q2 --a, b--> q2
We can describe the DFA formally by writing
M = (Q, Σ, δ, q0, F), where
Q = {q0, q1, q2},
Σ = {a, b},
δ is described by the transition table
      a     b
q0    q1    q2
q1    q1    q1
q2    q2    q2
q0 is the start state, and
F = {q1}.
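One way to check the transition table of Example 1 is to run it on every short string and compare the result with the plain-language description "starts with a". The sketch below encodes the table as a nested dictionary; the names `DELTA`, `START`, `ACCEPT` and `accepts` are illustrative.

```python
from itertools import product

# Transition table of Example 1: DELTA[state][symbol] -> next state.
DELTA = {
    "q0": {"a": "q1", "b": "q2"},
    "q1": {"a": "q1", "b": "q1"},
    "q2": {"a": "q2", "b": "q2"},  # dead state
}
START, ACCEPT = "q0", {"q1"}


def accepts(w: str) -> bool:
    state = START
    for ch in w:
        state = DELTA[state][ch]
    return state in ACCEPT


# Compare the DFA with the description on all strings of length 0..4.
mismatches = [
    "".join(t)
    for n in range(5)
    for t in product("ab", repeat=n)
    if accepts("".join(t)) != "".join(t).startswith("a")
]
print(mismatches)  # [] means the table agrees with the description
```

An exhaustive check up to a small length is not a proof, but it catches most slips in a hand-written table.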
7. DFA Examples (2/3)
Example 2:
L = {w | w ends in a 1}.
Example 3:
L = {w | w is the empty string ε or ends in a 0}.
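A possible solution to Examples 2 and 3, sketched under the assumption that the alphabet is Σ = {0, 1} (the slide does not state it). Both languages can use the same two states, where q0 means "empty so far or last symbol was 0" and q1 means "last symbol was 1"; only the accept set differs. The names `DELTA`, `run`, `ends_in_1` and `empty_or_ends_in_0` are hypothetical.

```python
# Shared transition function: the state remembers only the last symbol read.
DELTA = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q0", ("q1", "1"): "q1",
}


def run(w: str, accept: set, start: str = "q0") -> bool:
    state = start
    for bit in w:
        state = DELTA[(state, bit)]
    return state in accept


def ends_in_1(w: str) -> bool:
    """Example 2: accept state is q1."""
    return run(w, {"q1"})


def empty_or_ends_in_0(w: str) -> bool:
    """Example 3: accept state is q0 (which is also the start state)."""
    return run(w, {"q0"})


print(ends_in_1("0101"))          # True
print(empty_or_ends_in_0(""))     # True
```

Because the two accept sets are complements of each other over the same states, the two languages are complements of each other as well.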
8. DFA Examples (3/3)
Example 4:
L = {w | w is a string that starts and ends with the same symbol}.
9. DFA Exercises (1/3)
• Exercise 1: Construct a DFA that accepts the language of all strings over Σ = {a, b} starting with ‘a’ and ending with ‘b’.
• Exercise 2: Construct a DFA that accepts the language of all strings over Σ = {a, b} not starting with ‘a’ or not ending with ‘b’.
• Exercise 3: Construct a DFA accepting the strings over the alphabet {0, 1, 2} that begin with 0, end with 2, and contain 11 as a substring.
• Exercise 4: Construct a DFA accepting the strings over the alphabet {a, b, c} such that if a string starts with ‘a’ then it contains the substring “abc”, and if it starts with ‘b’ then it ends with ‘c’.
• Exercise 5: Construct a DFA for the language of strings over Σ = {0, 1} that start with ‘1’ and end with “00”. Give the transition diagram, the transition table and the formal definition.
10. DFA Exercises (2/3)
• Exercise 6: Design a DFA that accepts all strings over {0, 1} in which the second symbol is ‘1’ and the fourth symbol is ‘0’.
• Exercise 7: Construct a DFA that accepts the language of all binary strings over Σ = {0, 1} whose value is divisible by 3.
• Exercise 8: Construct a DFA accepting the strings over the alphabet {a, b} that have an even number of a’s and an even number of b’s.
• Exercise 9: Construct a DFA accepting the strings over the alphabet {a, b} that have an odd number of a’s and an even number of b’s.
• Exercise 10: Design a DFA over {a, b} such that every accepted string ends with the substring “bab”.
11. DFA Exercises (3/3)
• Exercise 11: Design a DFA over {a, b} such that every accepted string starts with the substring aa or bb.
• Exercise 12: Design a DFA over {a, b} such that every accepted string contains the substring aa or bb.
• Exercise 13: Design a DFA over {a, b} such that every accepted string ends with the substring aa or bb.
12. Non-deterministic Finite Automata (NFA)
An NFA is a finite automaton for which:
• For a given state and input symbol, there can be several possible next states. This choice is what makes the machine nondeterministic.
• ε-transitions are allowed (i.e., transitions taken without reading any input).
• Dead configurations are allowed (a state may have no transition for some input symbol).
• Usually easier to construct than a DFA.
• An NFA is a conceptual model; real computers are deterministic.
• Every NFA has an equivalent DFA (but the DFA may have exponentially more states).
Formal Definition: An NFA is defined as a 5-tuple:
M = (Q, Σ, δ, q0, F)
where everything is the same as for a DFA except the transition function:
δ : Q × Σ → 2^Q (the power set of Q); when ε-transitions are used, δ : Q × (Σ ∪ {ε}) → 2^Q.
13. DFA vs NFA
DFA
• Exactly one choice for each input symbol in each state.
• Dead configuration is not allowed.
• ε is not allowed.
• A dead state may be required.
• Transition function: δ : Q × Σ → Q
• Often harder to construct manually.
• May require more states than an NFA.
• Digital computers are deterministic.
NFA
• Multiple choices are possible for each input symbol in each state.
• Dead configuration is allowed.
• ε is allowed.
• A dead state is not required.
• Transition function: δ : Q × Σ → 2^Q
• Easier and more flexible to design.
• May require fewer states.
• Nondeterminism does not correspond to digital computers; NFAs are used for design purposes.
14. NFA Examples
Example: Construct an NFA for all binary strings in which the 2nd last bit is 1.
L = {10, 11, 010, 011, 110, 111, 0010, 0011, 0110, 0111, ...}
Soln:
Transition Diagram (q0 is the start state, q2 is the accept state):
q0 --0, 1--> q0
q0 --1--> q1
q1 --0, 1--> q2
Transition Table:
      0       1
q0    {q0}    {q0, q1}
q1    {q2}    {q2}
q2    ∅       ∅
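An NFA can be simulated by tracking the set of all states it could currently be in. The sketch below does this for the transition table above; missing dictionary entries play the role of the ∅ cells. The names `NFA_DELTA` and `nfa_accepts` are illustrative.

```python
# Transition table of the example NFA: (state, symbol) -> set of next states.
NFA_DELTA = {
    ("q0", "0"): {"q0"},
    ("q0", "1"): {"q0", "q1"},
    ("q1", "0"): {"q2"},
    ("q1", "1"): {"q2"},
    # q2 has no outgoing transitions (the ∅ entries of the table).
}
START, ACCEPT = "q0", {"q2"}


def nfa_accepts(w: str) -> bool:
    current = {START}                      # all states reachable so far
    for bit in w:
        current = set().union(
            *(NFA_DELTA.get((q, bit), set()) for q in current)
        )
    return bool(current & ACCEPT)          # accept if any branch accepts


print(nfa_accepts("0110"))  # True: the 2nd last bit is 1
print(nfa_accepts("0101"))  # False: the 2nd last bit is 0
```

The input is accepted if at least one possible run ends in an accept state, which is why the final test is a set intersection rather than a single-state lookup.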
15. NFA Exercises (1/3)
Formal definition of the NFA from the previous example: let M = (Q, Σ, δ, q0, F) be the required NFA, where
Q = {q0, q1, q2}
Σ = {0, 1}
δ is defined by the transition table
q0 is the start state
F = {q2}
• Exercise 1: Construct an NFA that accepts the set of strings in (0+1)* such that the 3rd symbol from the right is 1.
• Exercise 2: Construct an NFA for the following language:
L = {set of all strings that start with ‘0’}
16. NFA Exercises (2/3)
• Exercise 3: Construct an NFA for the following language:
L = {set of all strings that end with ‘0’}
• Exercise 4: Construct an NFA for the following language:
L = {set of all strings that contain ‘1’}
• Exercise 5: Construct an NFA that accepts the set of all strings over {0, 1} of length 2.
• Exercise 6: Construct an NFA for the following language:
L = {set of all strings that start with ‘01’}
• Exercise 7: Construct an NFA for the following language:
L = {set of all strings that contain ‘10’}
17. NFA Exercises (3/3)
• Exercise 8: Design an NFA over {a, b} that accepts every string starting and ending with ‘a’.
• Exercise 9: Design an NFA over {a, b} that accepts every string starting and ending with the same symbol.
• Exercise 10: Design an NFA over {a, b} that accepts every string starting and ending with different symbols.
18. Conversion of NFA to DFA
• Convert the following NFA, which accepts all binary strings in which the 2nd last bit is 1, into an equivalent DFA.
Transition Diagram (q0 is the start state, q2 is the accept state):
q0 --0, 1--> q0
q0 --1--> q1
q1 --0, 1--> q2
Transition Table:
      0       1
q0    {q0}    {q0, q1}
q1    {q2}    {q2}
q2    ∅       ∅
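The usual way to carry out this conversion is the subset (power set) construction: each DFA state is a set of NFA states, and only the subsets reachable from {q0} are built. The slide does not name a method, so treat the following as a sketch of that standard technique; `subset_construction` and `dfa_accepts` are hypothetical names.

```python
from collections import deque

NFA_DELTA = {
    ("q0", "0"): {"q0"},
    ("q0", "1"): {"q0", "q1"},
    ("q1", "0"): {"q2"},
    ("q1", "1"): {"q2"},
}
ALPHABET = ("0", "1")
NFA_START, NFA_ACCEPT = "q0", {"q2"}


def subset_construction():
    """Build the reachable part of the equivalent DFA."""
    start = frozenset({NFA_START})
    dfa_delta = {}
    seen = {start}
    queue = deque([start])
    while queue:
        subset = queue.popleft()
        for symbol in ALPHABET:
            # Union of the NFA moves from every state in the subset.
            target = frozenset(
                r for q in subset for r in NFA_DELTA.get((q, symbol), ())
            )
            dfa_delta[(subset, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A subset is accepting if it contains at least one NFA accept state.
    accept = {s for s in seen if s & NFA_ACCEPT}
    return start, dfa_delta, accept, seen


def dfa_accepts(w: str) -> bool:
    start, dfa_delta, accept, _ = subset_construction()
    state = start
    for bit in w:
        state = dfa_delta[(state, bit)]
    return state in accept


start, dfa_delta, accept, states = subset_construction()
for (subset, symbol), target in sorted(
    dfa_delta.items(), key=lambda kv: (sorted(kv[0][0]), kv[0][1])
):
    print(sorted(subset), symbol, "->", sorted(target))
```

For this NFA the construction reaches four subsets: {q0}, {q0, q1}, {q0, q2} and {q0, q1, q2}; the last two contain q2 and are therefore the accept states of the DFA.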
19. Regular Operations
• In the theory of computation, the objects are languages and the tools are operations on these languages, called the regular operations.
• Definition:
Let A and B be languages. We define the regular operations as follows:
Union: A ∪ B = {x | x ∈ A or x ∈ B}.
Concatenation: A ◦ B = {xy | x ∈ A and y ∈ B}.
Star: A* = {x1x2 ... xk | k ≥ 0 and each xi ∈ A}.
• Example:
Let the alphabet Σ be the standard 26 letters {a, b, ..., z}. If A = {good, bad} and B = {boy, girl}, then
A ∪ B = {good, bad, boy, girl},
A ◦ B = {goodboy, goodgirl, badboy, badgirl}, and
A* = {ε, good, bad, goodgood, goodbad, badgood, badbad, ...}
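For finite languages the three regular operations can be tried out directly with Python sets. The sketch below uses the languages A = {good, bad} and B = {boy, girl}. Since A* is infinite, the helper `star` (a hypothetical name, like `union` and `concat`) only builds concatenations of at most `max_k` strings.

```python
def union(A: set, B: set) -> set:
    return A | B


def concat(A: set, B: set) -> set:
    return {x + y for x in A for y in B}


def star(A: set, max_k: int) -> set:
    """Concatenations of at most max_k strings from A (a finite slice of A*)."""
    result = {""}          # k = 0 contributes the empty string ε
    layer = {""}
    for _ in range(max_k):
        layer = concat(layer, A)   # strings built from exactly one more factor
        result |= layer
    return result


A = {"good", "bad"}
B = {"boy", "girl"}

print(sorted(union(A, B)))    # ['bad', 'boy', 'girl', 'good']
print(sorted(concat(A, B)))   # ['badboy', 'badgirl', 'goodboy', 'goodgirl']
print(sorted(star(A, 2)))     # ε plus all products of one or two words
```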
20. Closure Properties of Regular Operations
• The class of regular languages is closed under the regular operations: if A and B are regular languages, then A ∪ B, A ◦ B and A* are also regular.
22. Why is Theory of Computation Important?
• It builds the foundation of computer science, just like physics
underlies engineering.
• It helps developers and researchers understand:
• Which problems are inherently unsolvable by any
computer (e.g., halting problem).
• Which problems are solvable but require huge
time/memory (e.g., cryptography).
• How to design efficient algorithms within the limits of
what’s computable.
• Finding patterns in languages using automata.
• Building compilers and interpreters.
24. 1. Automata Theory
• Studies abstract models of computation (automata) and the
languages they recognize.
Common Models:
• Finite Automata (DFA/NFA):
• Very limited memory, used for regular languages.
• Lexical analysis in compilers (e.g., identifying identifiers,
keywords).
• Pattern matching (e.g., grep, regex, spam filters).
• Pushdown Automata (PDA):
• Adds stack memory, used for context-free languages.
• Parsing expressions in programming languages
• Turing Machines:
• Fully powerful, can simulate any algorithm.
• Conceptual model behind all modern computers.
25. 2. Computability Theory (Decidability Theory)
• Studies which problems can be solved by a machine, no matter
how powerful or slow it is. It focuses on decidability – whether a
problem can be solved with an algorithm.
• Through computability, ToC distinguishes between:
1. Decidable problems – Solvable with an algorithm (e.g.,
checking if a number is prime).
2. Undecidable problems – No algorithm exists to solve them
(e.g., the Halting Problem).
• Turing machines are used to define computability.
26. 3. Complexity Theory
• Studies how efficiently problems can be solved in terms of
time and space.
• It classifies problems based on resource usage.
• Big-O notation: Classifies algorithm performance.
• Complexity classes:
• P: Problems solvable in polynomial time.
• Sorting a list (Can be solved efficiently e.g., quicksort,
mergesort)
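• NP: Problems whose solutions can be verified in polynomial time.
• Sudoku and problems underlying cryptography (a proposed solution is easy to check, but finding one appears to be hard)
• NP-complete: The hardest problems in NP; if any one of them can be solved in polynomial time, then every problem in NP can.
• Travelling Salesman Problem (decision version)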
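The gap between "easy to check" and "hard to find" can be illustrated with the subset sum problem, a standard NP-complete problem. In the sketch below, `verify` checks a proposed certificate (a list of indices) in time linear in its length, while `solve_brute_force` may try all 2ⁿ subsets. Both function names and the sample numbers are made up for illustration.

```python
from itertools import combinations


def verify(numbers, target, certificate):
    """Polynomial-time check: certificate is a list of distinct indices."""
    return (
        len(set(certificate)) == len(certificate)
        and all(0 <= i < len(numbers) for i in certificate)
        and sum(numbers[i] for i in certificate) == target
    )


def solve_brute_force(numbers, target):
    """Exponential-time search: tries every subset of indices."""
    for k in range(len(numbers) + 1):
        for idx in combinations(range(len(numbers)), k):
            if verify(numbers, target, list(idx)):
                return list(idx)
    return None


numbers = [3, 34, 4, 12, 5, 2]
solution = solve_brute_force(numbers, 9)
print(solution, verify(numbers, 9, solution))
```

No polynomial-time algorithm is known for the search step; whether one exists is the P versus NP question.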
27. Syllabus Mapping (1/2)
Branch: Automata Theory
Regular languages:
Regular Languages, finite automaton, Examples of finite automata, Designing finite automata, Equivalence of NFAs and DFAs, The regular operations - Closure under the regular operations. Regular Expressions. Equivalence with finite automata. Non-regular Languages - The pumping lemma for regular languages.
Context-Free Languages:
Formal definition of a context-free grammar - Examples of context-free grammars. Ambiguity - Chomsky normal form. Pushdown Automata, Formal definition of a pushdown automaton - Examples of pushdown automata, Equivalence with context-free grammars.
Branch: Computability Theory
Computability Theory:
The Church-Turing Thesis. Turing machine, Nondeterministic Turing machines, Hilbert's problems.
Decidability:
Decidable languages, The halting problem – the diagonalization method.
28. Syllabus Mapping (2/2)
Branch: Complexity Theory
Complexity Theory:
The Classes P, NP, Examples of problems in these classes. The P versus NP question. NP-Completeness, Polynomial time reducibility, The Cook-Levin Theorem. Examples of NP-Complete Problems: The vertex cover problem - The Hamiltonian path problem - The subset sum problem. Approximation algorithm, Probabilistic Algorithms.
30. i. Symbols
• A symbol (often also called a character) is the smallest building
block/unit in a language.
• It is an atomic element with no meaning by itself.
• These can represent characters, digits, or operations
depending on the context.
• Examples: a, b, 1, 0, +, *, ...
31. ii. Alphabet (Σ)
• An alphabet is a finite, non-empty set of symbols.
• It is denoted by Σ (sigma).
• All strings in a language are formed using the symbols from this
alphabet.
• Examples:
Binary Alphabet: Σ = {0, 1}
Decimal Digit Alphabet: Σ = {0, 1, 2, ......., 9}
DNA Alphabet: Σ = {A, C, G, T}
Lowercase English Alphabet: Σ = {a, b, c, ..., z}
32. iii. String
• A string is a finite sequence of symbols taken from an
alphabet, denoted as w.
• The length of a string, denoted as |w| is the number of
symbols in it.
• The empty string (length 0) is denoted by ε.
• Σ* (Sigma star) is the set of all possible strings (including ε)
over the alphabet Σ.
• Example: the strings of length 2 that can be generated over the alphabet Σ = {a, b} are
aa, ab, ba, bb
Length of each string: |w| = 2
Number of strings = 2² = 4
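The count above generalizes: over an alphabet with |Σ| symbols there are |Σ|ⁿ strings of length n. A short sketch with `itertools.product` enumerates them (the helper name `strings_of_length` is illustrative):

```python
from itertools import product


def strings_of_length(alphabet: str, n: int) -> list:
    """All strings of length n over the given alphabet, in lexicographic order."""
    return ["".join(t) for t in product(alphabet, repeat=n)]


print(strings_of_length("ab", 2))        # ['aa', 'ab', 'ba', 'bb']
print(len(strings_of_length("ab", 3)))   # 8, i.e. 2**3
print(strings_of_length("ab", 0))        # [''] : only the empty string ε
```

Collecting these lists for n = 0, 1, 2, ... enumerates Σ*, which is why Σ* is infinite even though every individual string is finite.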
33. iv. Language (1/2)
• A language is a set of strings formed from an alphabet, according to certain rules.
• A language can be:
• Finite Language:
L = {set of all strings of length 2 over {x, y}}
L = {xy, yx, xx, yy}
• Infinite Language:
L = {set of all strings over {a, b} that start with 'b'}
L = {ba, baa, bbb, babb, baab, ...}
• If Σ = {a, b}, then L ⊆ Σ* (a language is a subset of all possible strings over Σ)
• Examples:
L1 = {w ∈ {0, 1}* | w contains an even number of 0s}
L2 = {w ∈ {a, b}* | w starts and ends with the same symbol}
L3 = {aⁿbⁿ | n ≥ 0} = {ε, "ab", "aabb", "aaabbb", ...}
34. iv. Language (2/2)
• Formal languages can be classified into four types:
• Regular Language
• Context-free Language
• Context-sensitive Language and
• Recursively Enumerable Languages.
35. v. Grammar (1/3)
• In automata theory, grammars are formal systems for describing the structure of languages.
• A grammar is a set of rules for generating the valid strings of a language.
• Formally, a grammar is a tuple G = (V, T, P, S) where:
• V is a finite set of variables (non-terminal symbols)
• T is a finite set of terminal symbols (the alphabet)
• P is a finite set of production rules
• S is the start symbol (S ∈ V)
• Grammars are used to generate all valid strings in a language.
• A grammar also provides a structural description of the language and serves as a basis for parsing and syntax analysis.
36. v. Grammar (2/3)
• Different components of a grammar:
Component          Description                        Example
Variables          Non-terminal symbols               A, B, C
Terminals          Symbols in the alphabet            a, b, c, 0, 1
Production rules   Rules for string generation        A → aB, B → bC
Start symbol       Initial variable for derivations   S
• Example:
• G = (V, T, P, S) where V = {S}, T = {a, b}, the start symbol is S, and P is defined as follows:
S → SS
S → aSb | bSa
S → ε
• Language L = {w ∈ {a, b}* | na(w) = nb(w)} = {ε, ab, ba, abab, abba, ...}, where na(w) and nb(w) are the numbers of a’s and b’s in w.
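The claim that this grammar generates exactly the strings with equally many a's and b's can be checked experimentally. The sketch below decides S ⇒* w by trying each production in reverse; the function name `derives` is hypothetical, and the input is assumed to contain only the letters a and b. The result is then compared with a direct count on all strings up to length 8.

```python
from functools import lru_cache
from itertools import product


@lru_cache(maxsize=None)
def derives(w: str) -> bool:
    """True if S =>* w using S -> SS | aSb | bSa | ε (w over {a, b} only)."""
    if w == "":
        return True                                   # S -> ε
    if len(w) >= 2 and w[0] != w[-1] and derives(w[1:-1]):
        return True                                   # S -> aSb or S -> bSa
    # S -> SS, split into two non-empty parts that are both derivable.
    return any(derives(w[:i]) and derives(w[i:]) for i in range(1, len(w)))


# Compare with the description n_a(w) = n_b(w) on all strings of length 0..8.
counterexamples = [
    "".join(t)
    for n in range(9)
    for t in product("ab", repeat=n)
    if derives("".join(t)) != ("".join(t).count("a") == "".join(t).count("b"))
]
print(counterexamples)  # [] means grammar and description agree up to length 8
```

Restricting the S → SS case to non-empty parts avoids infinite recursion: a split with an empty part would only reduce the question to itself.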
37. v. Grammar (3/3)
• Chomsky Hierarchy of Grammars:
Type   Grammar Type           Language Class           Automaton Model
0      Unrestricted Grammar   Recursively enumerable   Turing Machine
1      Context-sensitive      Context-sensitive        Linear Bounded Automaton
2      Context-free (CFG)     Context-free             Pushdown Automaton
3      Regular                Regular                  Finite Automaton