Automata and Computability
Solutions to Exercises
Fall 2016
Alexis Maciel
Department of Computer Science
Clarkson University
Copyright © 2016 Alexis Maciel
Preface
This document contains solutions to the exercises of the course notes Automata
and Computability. These notes were written for the course CS345 Automata
Theory and Formal Languages taught at Clarkson University. The course is also
listed as MA345 and CS541. The solutions are organized according to the same
chapters and sections as the notes.
Here’s some advice. Whether you are studying these notes as a student in a
course or in self-directed study, your goal should be to understand the material
well enough that you can do the exercises on your own. Simply studying the
solutions is not the best way to achieve this. It is much better to spend a reasonable
amount of time and effort trying to do the exercises yourself before looking
at the solutions.
If you can’t do an exercise on your own, you should study the notes some
more. If that doesn’t work, seek help from another student or from your instructor.
Look at the solutions only to check your answer once you think you know
how to do an exercise.
If you needed help doing an exercise, try redoing the same exercise later on
your own. And do additional exercises.
If your solution to an exercise is different from the official solution, take the
time to figure out why. Did you make a mistake? Did you forget something?
Did you discover another correct solution? If you’re not sure, ask for help from
another student or the instructor. If your solution turns out to be incorrect, fix
it, after maybe getting some help, then try redoing the same exercise later on
your own and do additional exercises.
Feedback on the notes and solutions is welcome. Please send comments to
alexis@clarkson.edu.
Chapter 2
Finite Automata
2.1 Turing Machines
There are no exercises in this section.
2.2 Introduction to Finite Automata
2.2.3.
[Transition diagram]
    q0: - → q1;  d → q2
    q1: d → q2
    q2: d → q2
Missing edges go to a garbage state. In other words, the full DFA looks
like this:
[Transition diagram]
    q0: - → q1;  d → q2;  other → q3
    q1: d → q2;  -, other → q3
    q2: d → q2;  -, other → q3
    q3: any → q3
The transition label other means any character that’s not a dash or a digit.
2.2.4.
[Transition diagram]
    q0: - → q1;  d → q2;  . → q3
    q1: d → q2;  . → q3
    q2: d → q2;  . → q3
    q3: d → q4
    q4: d → q4
Missing edges go to a garbage state.
2.2.5.
[Transition diagram: states q0, q1, q2, q3; edges labeled with the character classes letter, digit, underscore, other and any.]
2.2.6.
[Transition diagram: states q0 to q10, q13 to q17 and q26 to q30; every edge is labeled d or -.]
2.2.7.
starting_state() { return q0 }
is_accepting(q) { return true iff q is q1 }
next_state(q, c) {
    if (q is q0)
        if (c is underscore or letter)
            return q1
        else
            return q2
    else if (q is q1)
        if (c is underscore, letter or digit)
            return q1
        else
            return q2
    else // q is q2
        return q2
}
2.2.8. The following assumes that the garbage state is labeled q9. In the pseudocode
algorithm, states are stored as integers. This is more convenient
here.
starting_state() { return 0 }
is_accepting(q) { return true iff q is 8 }
next_state(q, c) {
    if (q in {0, 1, 2} or q in {4, 5, 6, 7})
        if (c is digit)
            return q + 1
        else
            return 9
    else if (q is 3)
        if (c is digit)
            return 5
        else if (c is dash)
            return 4
        else
            return 9
    else if (q is 8 or 9)
        return 9
}
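The pseudocode above translates almost line for line into a runnable program. Here is a minimal Python sketch of that DFA. The `accepts` driver and the names `DIGITS` and `GARBAGE` are additions for illustration and are not part of the original solution; the sketch also assumes that a dash is the character `-` and that a digit is one of 0 through 9.

```python
DIGITS = "0123456789"   # assumed meaning of "digit"
GARBAGE = 9             # the garbage state q9


def starting_state():
    return 0


def is_accepting(q):
    return q == 8


def next_state(q, c):
    # States 0-2 and 4-7 simply advance on a digit.
    if q in (0, 1, 2, 4, 5, 6, 7):
        return q + 1 if c in DIGITS else GARBAGE
    # State 3 is where the optional dash may appear.
    if q == 3:
        if c in DIGITS:
            return 5
        if c == "-":
            return 4
        return GARBAGE
    # q is 8 or 9: any further character leads to the garbage state.
    return GARBAGE


def accepts(w):
    # Hypothetical driver: run the DFA on every character of w.
    q = starting_state()
    for c in w:
        q = next_state(q, c)
    return is_accepting(q)
```

For example, `accepts("555-1234")` and `accepts("5551234")` both return True, while `accepts("555-123")` returns False.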
2.3 Formal Definition
2.3.9. The DFA is ({q0, q1, q2, ..., q9}, Σ, δ, q0, {q8}) where Σ is the set of all
characters that appear on a standard keyboard and δ is defined as follows:
    δ(qi, c) = q_{i+1}   if i ∉ {3,8,9} and c is a digit
    δ(qi, c) = q9        if i ∉ {3,8,9} and c is not a digit
    δ(q3, c) = q4        if c is a dash
    δ(q3, c) = q5        if c is a digit
    δ(q3, c) = q9        otherwise
    δ(q8, c) = q9        for every c
    δ(q9, c) = q9        for every c
2.3.10. The DFA is ({q0, q1, q2, q3}, Σ, δ, q0, {q2}) where Σ is the set of all
characters that appear on a standard keyboard and δ is defined as follows:
    δ(q0, c) = q1   if c is a dash
    δ(q0, c) = q2   if c is a digit
    δ(q0, c) = q3   otherwise
    δ(qi, c) = q2   if i ∈ {1,2} and c is a digit
    δ(qi, c) = q3   if i ∈ {1,2} and c is not a digit
    δ(q3, c) = q3   for every c
2.3.11. The DFA is ({q0, q1, q2, ..., q5}, Σ, δ, q0, {q2, q4}) where Σ is the set of all
characters that appear on a standard keyboard and δ is defined as follows:
    δ(q0, c) = q1   if c is a dash
    δ(q0, c) = q2   if c is a digit
    δ(q0, c) = q3   if c is a decimal point
    δ(q0, c) = q5   otherwise
    δ(qi, c) = q2   if i ∈ {1,2} and c is a digit
    δ(qi, c) = q3   if i ∈ {1,2} and c is a decimal point
    δ(qi, c) = q5   if i ∈ {1,2} and c is neither a digit nor a decimal point
    δ(qi, c) = q4   if i ∈ {3,4} and c is a digit
    δ(qi, c) = q5   if i ∈ {3,4} and c is not a digit
    δ(q5, c) = q5   for every c
2.4 More Examples
2.4.1.
[Transition diagram: states q0, q1, q2, q3, q4; every edge is labeled 0 or 1.]
2.4.2. In all cases, missing edges go to a garbage state.
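a) [Transition diagram: states q0, q1, q2; edge labels 0, 0, 1, 1, 0.]
b) [Transition diagram: states q0, q1, q2; edge labels 0,1 then 1, and a further edge labeled 0,1.]
c) [Transition diagram: states q0, q1, q2, ..., q_{k−1}, qk; edge labels 0,1 (repeated), then 1, and a further edge labeled 0,1.]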
2.4.3. In all cases, missing edges go to a garbage state.
a)
[Transition diagram: state names not recoverable; edge labels 0, 0, 0,1, 1, 1.]
b) The idea is for the DFA to remember the last two symbols it has seen.
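[Transition diagram: states qε, q0, q1, q00, q01, q10, q11. Each state records the last two symbols read:]
    qε:  0 → q0;   1 → q1
    q0:  0 → q00;  1 → q01
    q1:  0 → q10;  1 → q11
    q_{ab}:  c → q_{bc}, for all a, b, c ∈ {0,1}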
c) Again, the idea is for the DFA to remember the last two symbols it has
seen. We could simply change the accepting states of the previous
DFA to {q10,q11}. But we can also simplify this DFA by assuming that
strings of length less than two are preceded by 00.
[Transition diagram: states q00, q01, q10, q11, with transitions q_{ab}: c → q_{bc} for all a, b, c ∈ {0,1}.]
d) The idea is for the DFA to remember the last k symbols it has seen.
But this is too difficult to draw clearly, so here’s a formal description
of the DFA: (Q, {0,1}, δ, q0, F) where
    Q = {q_w | w ∈ {0,1}* and w has length k}
    q0 = q_{w0} where w0 = 0^k (that is, a string of k 0’s)
    F = {q_w ∈ Q | w starts with a 1}
and δ is defined as follows:
    δ(q_{au}, b) = q_{ub}
where a ∈ Σ, u is a string of length k − 1 and b ∈ Σ.
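The formal description above can be exercised directly. Below is a minimal Python sketch in which a state q_w is represented by the string w itself; the function names `make_dfa` and `accepts` are hypothetical, and the sketch assumes k ≥ 1. Under this description, a string is accepted exactly when its k-th symbol from the end is a 1.

```python
def make_dfa(k):
    # The start state is q_w0 with w0 = 0^k.
    start = "0" * k

    def delta(state, b):
        # delta(q_au, b) = q_ub: drop the oldest symbol, append the new one.
        return state[1:] + b

    def accepting(state):
        # F = states q_w where w starts with a 1.
        return state[0] == "1"

    return start, delta, accepting


def accepts(k, w):
    state, delta, accepting = make_dfa(k)
    for b in w:
        state = delta(state, b)
    return accepting(state)
```

With k = 3, `accepts(3, "0100")` is True because the third symbol from the end is 1, while `accepts(3, "1000")` is False.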
2.4.4. In all cases, missing edges go to a garbage state.
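a) [Transition diagram: states q0, q1; edge labels 0, 1, 0,1.]
b) [Transition diagram: states q0, q1, q2; edge labels 0, 1, 0, 1, 0,1.]
c) [Transition diagram: states q0, q1, q2; edge labels 0, 1, 0, 1, 0,1.]
d) [Transition diagram: states q0, q1, q2; edge labels 0, 1, 0, 1, 0,1.]
e) [Transition diagram: states q0, q1, q2, ..., qk; edges labeled 0 or 1, and one edge labeled 0,1.]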
2.4.5.
a) The idea is for the DFA to store the value, modulo 3, of the portion
of the number it has seen so far, and then update that value for every
additional digit that is read. To update the value, the current value
is multiplied by 10, the new digit is added and the result is reduced
modulo 3.
[Transition diagram, given here as a table: rows are states, columns are the digit read.]
          0,3,6,9   1,4,7   2,5,8
    q0    q0        q1      q2
    q1    q1        q2      q0
    q2    q2        q0      q1
(Note that this is exactly the same DFA we designed in an example of
this section for the language of strings that have the property that the
sum of their digits is a multiple of 3. This is because 10 mod 3 = 1 so
that when we multiply the current value by 10 and reduce modulo 3,
we are really just multiplying by 1. Which implies that the strategy
we described above is equivalent to simply adding the digits of the
number, modulo 3.)
b) We use the same strategy that was described in the first part, but this
time, we reduce modulo k. Here’s a formal description of the DFA:
(Q, Σ, δ, q0, F) where
    Q = {q0, q1, q2, ..., q_{k−1}}
    Σ = {0, 1, 2, ..., 9}
    F = {q0}
and δ is defined as follows: for every qi ∈ Q and c ∈ Σ,
    δ(qi, c) = qj where j = (i · 10 + c) mod k.
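The transition rule above is short enough to run as is. Here is a minimal Python sketch in which the state qi is stored as the integer i; the function name `accepts` is hypothetical, and the input is assumed to be a string of decimal digits.

```python
def accepts(k, digits):
    i = 0  # start state q0
    for ch in digits:
        # delta(q_i, c) = q_j where j = (i * 10 + c) mod k
        i = (i * 10 + int(ch)) % k
    return i == 0  # F = {q0}
```

For instance, `accepts(7, "343")` is True since 343 = 7 · 49, and `accepts(7, "344")` is False. Because q0 is both the start state and the accepting state, the empty string is also accepted.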
2.4.6.
a) The idea is for the DFA to verify, for each input symbol, that the third
digit is the sum of the first two plus any carry that was previously
generated, as well as determine if a carry is generated. All that the
DFA needs to remember is the value of the carry (0 or 1). The DFA accepts
if no carry is generated when processing the last input symbol.
Here’s a formal description of the DFA, where state q2 is a garbage
state: (Q, Σ, δ, q0, F) where
    Q = {q0, q1, q2}
    Σ = {[abc] | a, b, c ∈ {0, 1, 2, ..., 9}}
    F = {q0}
and δ is defined as follows:
    δ(qd, [abc]) = q0   if d ∈ {0,1} and d + a + b = c
    δ(qd, [abc]) = q1   if d ∈ {0,1}, d + a + b ≥ 10 and (d + a + b) mod 10 = c
    δ(qd, [abc]) = q2   otherwise
Here’s a transition diagram of the DFA that shows only one of the
1,000 transitions that come out of each state.
[Transition diagram]
    q0: [123] → q0;  [561] → q1
    q1: [786] → q1;  [124] → q0
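The carry-tracking DFA of part (a) can be sketched in Python as follows. States are stored as the integers 0, 1 and 2 (with 2 the garbage state), and each input symbol [abc] is represented as a tuple (a, b, c). The names `next_state` and `accepts` are hypothetical, and the sketch assumes the columns are supplied in the order the DFA reads them, that is, starting with the least significant digits.

```python
def next_state(d, a, b, c):
    # d is the carry remembered so far (0 or 1); 2 is the garbage state.
    if d in (0, 1):
        s = d + a + b
        if s == c:
            return 0          # column is correct, no carry generated
        if s >= 10 and s % 10 == c:
            return 1          # column is correct, carry generated
    return 2                  # wrong digit, or already in the garbage state


def accepts(columns):
    d = 0  # start state q0: no carry
    for a, b, c in columns:
        d = next_state(d, a, b, c)
    return d == 0  # accept only if no carry is left over
```

The four transitions shown in the diagram correspond to `next_state(0, 1, 2, 3) == 0`, `next_state(0, 5, 6, 1) == 1`, `next_state(1, 7, 8, 6) == 1` and `next_state(1, 1, 2, 4) == 0`.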
b) Since the DFA is now reading the numbers from left to right, it can’t
compute the carries as it reads the numbers. So it will do the opposite:
for each input symbol, the DFA will figure out what carry it
needs from the rest of the numbers. For example, if the first symbol
that the DFA sees is [123], the DFA will know that there should be
no carry generated from the rest of the numbers. But if the symbol
is [124], the DFA needs the rest of the number to generate a carry.
And if a carry needs to be generated, the next symbol will have to be
something like [561] but not [358]. The states of the DFA will be
used to remember the carry that is needed from the rest of the numbers.
The DFA will accept if no carry is needed for the first position of
the numbers (which is given by the last symbol of the input string).
Here’s a formal description of the DFA, where state q2 is a garbage
state: (Q, Σ, δ, q0, F) where
    Q = {q0, q1, q2}
    Σ = {[abc] | a, b, c ∈ {0, 1, 2, ..., 9}}
    F = {q0}
and δ is defined as follows:
    δ(q0, [abc]) = qd   if d ∈ {0,1} and d + a + b = c
    δ(q0, [abc]) = q2   otherwise
    δ(q1, [abc]) = qd   if d ∈ {0,1}, d + a + b ≥ 10 and (d + a + b) mod 10 = c
    δ(q1, [abc]) = q2   otherwise
    δ(q2, [abc]) = q2,  for every [abc] ∈ Σ
Here’s a transition diagram of the DFA that shows only one of the
1,000 transitions that come out of each state.
[Transition diagram]
    q0: [123] → q0;  [124] → q1
    q1: [786] → q1;  [561] → q0
2.5 Closure Properties
2.5.3. In each case, all we have to do is switch the acceptance status of each
state. But we need to remember to do it for the garbage states too.
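a) [Transition diagram: states q0, q1, q2 and the garbage state; edge labels 0, 1, 0, 1, 1, 0, 0,1.]
b) [Transition diagram: states q0, q1, q2 and the garbage state; edge labels 0,1, 1, 0, 0,1, 0,1.]
c) [Transition diagram: states q0, q1, q2, ..., q_{k−1}, qk and the garbage state; edge labels 0,1 (repeated), 1, 0, 0,1, 0,1.]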
2.5.4. It is important to include in the pair construction the garbage states of
the DFA’s for the simpler languages. (This is actually not needed for intersections
but it is critical for unions.) In each case, we give the DFA’s
for the two simpler languages followed by the DFA obtained by the pair
construction.
2.5.5. In both cases, missing edges go to a garbage state.
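a) [Transition diagram: states q0, q1, q2, q3, q4; edge labels 1, 1, 1, 0, #, 0.]
b) The dashed state and edge could be deleted.
[Transition diagram: states q0, q1, q2, q3, q4; edge labels 1, 1, 1, 0, 1, 0.]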
Chapter 3
Nondeterministic Finite Automata
3.2 Formal Definition
3.2.1. The NFA is (Q,{0,1},δ,q0, F) where
Q = {q0,q1,q2,q3}
F = {q3}
and δ is defined by the following table:
    δ     0     1        ε
    q0    q0    q0,q1    −
    q1    q2    q2       −
    q2    q3    q3       −
    q3    −     −        −
3.2.2. The NFA is (Q,{0,1},δ,q0, F) where
Q = {q0,q1,q2,q3}
F = {q3}
and δ is defined by the following table:
    δ     0     1     ε
    q0    q1    q0    −
    q1    q2    −     q0
    q2    −     q3    q1
    q3    q3    q3    −
3.2.3.
Input symbols:       0    0    1    0    0    1
    q0   q0   q0   q0   q0   q0   q0
    q0   q0   q0   q0   q1   q2   q3
    q0   q1   q2   q3   q3   q3   q3
The NFA accepts because the last two sequences end in the accepting state.
3.3.3.
a)
    δ′         0       1
    {q0}       {q1}    −
    {q1}       {q1}    {q1,q2}
    {q2}       −       −
    {q1,q2}    {q1}    {q1,q2}
The start state is {q0}. The accepting state is {q1,q2}. (State {q2}
is unreachable from the start state.) Missing transitions go to the
garbage state (−).
b)
    δ′            0             1
    {q0}          {q0,q1}       {q0,q2}
    {q1}          {q3}          −
    {q2}          −             {q3}
    {q3}          −             −
    {q0,q1}       {q0,q1,q3}    {q0,q2}
    {q0,q2}       {q0,q1}       {q0,q2,q3}
    {q0,q1,q3}    {q0,q1,q3}    {q0,q2}
    {q0,q2,q3}    {q0,q1}       {q0,q2,q3}
The start state is {q0}. The accepting states are {q0,q1,q3} and
{q0,q2,q3}. (States {q1}, {q2} and {q3} are unreachable from the start
state.)
c)
    δ′            0          1
    {q0}          {q0}       {q0,q1}
    {q1}          {q2}       {q2}
    {q2}          −          −
    {q0,q1}       {q0,q2}    {q0,q1,q2}
    {q0,q2}       {q0}       {q0,q1}
    {q0,q1,q2}    {q0,q2}    {q0,q1,q2}
The start state is {q0}. The accepting states are {q0,q2} and
{q0,q1,q2}. (States {q1} and {q2} are unreachable from the start
state.)
d)
    δ′      0       1
    {q0}    {q0}    {q1}
    {q1}    {q1}    −
The start state is {q0}. The accepting state is {q1}. Missing transitions
go to the garbage state (−). (The given NFA was almost a DFA. All
that was missing was a garbage state and that’s precisely what the
algorithm added.)
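The tables in this exercise can be reproduced mechanically. The following Python sketch implements the subset construction for an NFA without ε transitions and applies it to the NFA of part (b), read off from the single-state rows of that table. The function name `subset_construction` and the dictionary encoding of the NFA are choices made for this illustration; unlike the tables above, the sketch only builds the states that are reachable from the start state.

```python
def subset_construction(delta, start, alphabet):
    # delta maps (state, symbol) to a set of states; missing keys mean "no move".
    start_set = frozenset([start])
    table = {}
    todo = [start_set]
    while todo:
        S = todo.pop()
        if S in table:
            continue
        table[S] = {}
        for a in alphabet:
            # The new state is the union of the moves of every member of S.
            T = frozenset(r for q in S for r in delta.get((q, a), ()))
            table[S][a] = T
            todo.append(T)
    return table


# NFA of part (b): rows q0 to q3 of the table.
nfa = {
    ("q0", "0"): {"q0", "q1"},
    ("q0", "1"): {"q0", "q2"},
    ("q1", "0"): {"q3"},
    ("q2", "1"): {"q3"},
}
dfa = subset_construction(nfa, "q0", "01")
accepting = [S for S in dfa if "q3" in S]
```

The resulting `dfa` contains the five reachable states {q0}, {q0,q1}, {q0,q2}, {q0,q1,q3} and {q0,q2,q3}, and the two states containing q3 are the accepting ones.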
3.3.4.
a)
    δ′         0       1
    {q0}       {q1}    −
    {q1}       {q1}    {q1,q2}
    {q2}       −       −
    {q1,q2}    {q1}    {q1,q2}
The start state is E({q0}) = {q0}. The accepting state is {q1,q2}.
(State {q2} is unreachable from the start state.) Missing transitions
go to the garbage state (−).
b)
    δ′               0                1
    {q0}             {q0,q1,q2}       {q0,q1,q2}
    {q1}             {q3}             −
    {q2}             −                {q3}
    {q3}             −                −
    {q0,q1,q2}       {q0,q1,q2,q3}    {q0,q1,q2,q3}
    {q0,q1,q2,q3}    {q0,q1,q2,q3}    {q0,q1,q2,q3}
The start state is E({q0}) = {q0,q1,q2}. The accepting state is
{q0,q1,q2,q3}. (States {q0}, {q1}, {q2} and {q3} are unreachable from
the start state.)
3.4 Closure Properties
3.4.2. Suppose that Mi = (Qi, Σ, δi, qi, Fi), for i = 1, 2. Without loss of generality,
assume that Q1 and Q2 are disjoint. Then N = (Q, Σ, δ, q0, F) where
    Q = Q1 ∪ Q2
    q0 = q1
    F = F2
and δ is defined as follows:
    δ(q, ε) = {q2}   if q ∈ F1
    δ(q, ε) = ∅      otherwise
    δ(q, a) = {δi(q, a)},  if q ∈ Qi and a ∈ Σ.
3.4.3. Suppose that M = (Q1,Σ,δ1,q1, F1). Let q0 be a state not in Q1. Then
N = (Q,Σ,δ,q0, F) where
Q = Q1 ∪ {q0}
F = F1 ∪ {q0}
and δ is defined as follows:
    δ(q, ε) = {q1}   if q ∈ F1 ∪ {q0}
    δ(q, ε) = ∅      otherwise
    δ(q, a) = {δ1(q, a)},  if q ≠ q0 and a ∈ Σ.
3.4.4.
a) In the second to last paragraph of the proof, if k = 0, then it is claimed
that w = x1 with x1 ∈ A. We know x1 is accepted by N, but there is
no reason why that can’t be because x1 leads back to the start state
instead of leading to one of the original accepting states of M.
b) Consider the following DFA for the language of strings that contain
at least one 1:
[Transition diagram of the DFA M; edge labels 0, 1, 0, 1 and 0,1.]
If we used this idea, we would get the following NFA:
[Transition diagram of the NFA N: the same states and edges as M, plus an ε transition.]
This NFA accepts strings that contain only 0’s. These strings are not
in the language L(M)*. Therefore, L(N) ≠ L(M)*.
3.4.5. This can be shown by modifying the construction that was used for the
star operation. The only change is that a new start state should not be
added. The argument that this construction works is almost the same as
before. If w ∈ A^+, then w = x1 ··· xk with k ≥ 1 and each xi ∈ A. This
implies that N can accept w by going through M k times, each time reading
one xi and then returning to the start state of M by using one of the new
ε transitions (except after xk).
Conversely, if w is accepted by N, then it must be that N uses the new ε
“looping back” transitions k times, for some number k ≥ 0, breaking w up
into x1 ··· x_{k+1}, with each xi ∈ A. This implies that w ∈ A^+. Therefore,
L(N) = A^+.
3.4.6. Suppose that L is regular and that it is recognized by a DFA M that looks
like this:
[Diagram: schematic of the DFA M.]
This DFA can be turned into an equivalent NFA N with a single accepting
state as follows:
[Diagram: schematic of the NFA N, with ε transitions leading to a new accepting state.]
That is, we add a new accepting state, an ε transition from each of the
old accepting states to the new one, and we make the old accepting states
non-accepting.
We can show that L(N) = L(M) as follows. If w is accepted by M, then w
leads to an old accepting state, which implies that N can accept w by using one
of the new transitions. If w is accepted by N, then the reading of w must
finish with one of the new transitions. This implies that in M, w leads to
one of the old accepting states, so w is accepted by M.
3.4.7. Suppose that L is recognized by a DFA M. Transform M into an equivalent
NFA N with a single accepting state. (The previous exercise says that this can
be done.) Now reverse every transition in N: if a transition labeled a goes
from q1 to q2, make it go from q2 to q1. In addition, make the accepting
state become the start state, and switch the accepting status of the new
and old start states. Call the result N′.
We claim that N′ recognizes L^R. If w = w1 ··· wn is accepted by N′, it must
be that there is a path through N′ labeled w. But then, this means that
there was a path labeled wn ··· w1 through N. Therefore, w is the reverse
of a string in L, which means that w ∈ L^R. It is easy to see that the reverse
is also true.
Chapter 4
Regular Expressions
4.1 Introduction
4.1.5.
a) (− ∪ ε)DD*.
b) (− ∪ ε)DD* ∪ (− ∪ ε)D*.DD*.
c) _(_ ∪ L ∪ D)*(L ∪ D)(_ ∪ L ∪ D)* ∪ L(_ ∪ L ∪ D)*.
d) D^7 ∪ D^10 ∪ D^3−D^4 ∪ D^3−D^3−D^4.
4.2 Formal Definition
There are no exercises in this section.
4.3 More Examples
4.3.1. 0 ∪ 1 ∪ 0Σ*0 ∪ 1Σ*1.
4.3.2.
a) 0Σ*1.
b) Σ1Σ*.
c) Σ^{k−1}1Σ*.
4.3.3.
a) (00 ∪ 11)Σ*.
b) Σ*(00 ∪ 11).
c) Σ*1Σ.
d) Σ*1Σ^{k−1}.
4.3.4.
a) Σ*1Σ*.
b) 0*10*.
c) Σ*1Σ*1Σ*.
d) 0* ∪ 0*10*.
e) (Σ*1)^k Σ*.
4.3.5.
a) ε ∪ 1Σ* ∪ Σ*0.
b) ε ∪ Σ ∪ Σ0Σ*.
c) (ε ∪ Σ)^{k−1} ∪ Σ^{k−1}0Σ*. Another solution: (∪_{i=0}^{k−1} Σ^i) ∪ Σ^{k−1}0Σ*.
4.3.6.
a) 01Σ* ∪ 11Σ*0Σ*.
b) Σ*1Σ*1Σ* ∪ Σ*0Σ*0Σ*.
c) One way to go about this is to focus on the first two 1’s that occur
in the string and then list the ways in which the 0 in the string can
relate to those two 1’s. Here’s what you get:
11^+ ∪ 011^+ ∪ 101^+ ∪ 11^+01*.
d) Let E0 = (1*01*0)*1* and D0 = (1*01*0)*1*01*. The regular expression
E0 describes the language of strings with an even number of
0’s while D0 describes the language of strings with an odd number of
0’s. Then the language of strings that contain at least two 1’s and an
even number of 0’s can be described as follows:
E0 1 E0 1 E0 ∪ E0 1 D0 1 D0 ∪ D0 1 E0 1 D0 ∪ D0 1 D0 1 E0.
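The expression in part (d) can be sanity-checked by brute force. The Python sketch below writes E0 and D0 in the syntax of the standard `re` module (union becomes `|`) and compares the expression with a direct count of 1’s and 0’s on every binary string up to a given length. The helper names `in_language` and `mismatches` are hypothetical, and a finite check like this is only evidence that the expression is right, not a proof.

```python
import re
from itertools import product

E0 = "(?:1*01*0)*1*"      # even number of 0's
D0 = "(?:1*01*0)*1*01*"   # odd number of 0's
PATTERN = re.compile("|".join([
    E0 + "1" + E0 + "1" + E0,
    E0 + "1" + D0 + "1" + D0,
    D0 + "1" + E0 + "1" + D0,
    D0 + "1" + D0 + "1" + E0,
]))


def in_language(w):
    # Direct definition: at least two 1's and an even number of 0's.
    return w.count("1") >= 2 and w.count("0") % 2 == 0


def mismatches(max_len):
    # Collect every string on which the expression and the definition disagree.
    bad = []
    for n in range(max_len + 1):
        for t in product("01", repeat=n):
            w = "".join(t)
            if (PATTERN.fullmatch(w) is not None) != in_language(w):
                bad.append(w)
    return bad
```

Calling `mismatches(8)` checks all 511 binary strings of length at most 8.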
4.3.7.
a) (00)*#11^+.
b) (00)*11^+.
4.5 Converting DFA’s into Regular Expressions
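4.5.1. a) We first add a new accepting state:
[Transition diagram: states q0, q1, q2 and the new accepting state; edge labels 0, ε, 1, 1, 0, 1, 0, ε.]
We then remove state q1:
[Transition diagram: states q0, q2 and the new accepting state; edge labels 01*1, ε, 0 ∪ 01*1, 1, ε.]
We remove state q2:
[Transition diagram: state q0 with a loop labeled 01*1(0 ∪ 01*1)*1 and an edge labeled ε ∪ 01*1(0 ∪ 01*1)* to the new accepting state.]
The final regular expression is
(01*1(0 ∪ 01*1)*1)*(ε ∪ 01*1(0 ∪ 01*1)*)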
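b) First, we add a new accepting state:
[Transition diagram: states q0, q1, q2 and the new accepting state; edge labels 0, 0, 1, 0, 0, 1, ε, 0 ∪ 1.]
Then, we notice that state q2 cannot be used to travel between the
other two states. So we can just remove it:
[Transition diagram: states q0, q1 and the new accepting state; edge labels 0, 0, 0, 1, ε.]
We remove state q1:
[Transition diagram: state q0 with a loop labeled 0 ∪ 0^+1 and an edge labeled 0^+ to the new accepting state.]
The final regular expression is (0 ∪ 0^+1)*0^+.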
4.6 Precise Description of the Algorithm
4.6.1. Label the new accepting state q4. Then the GNFA is (Q,{0,1},δ,q0, F)
where
Q = {q0,q2,q3,q4}
F = {q4}
and δ is defined by the following table:
    δ     q0       q2    q3       q4
    q0    0 ∪ 1    00    ∅        ∅
    q2    ∅        ∅     1        ∅
    q3    ∅        ∅     0 ∪ 1    ε
    q4    ∅        ∅     ∅        ∅
4.6.2. The GNFA is (Q,{0,1,2},δ,q0, F) where
Q = {q0,q2,q3}
F = {q3}
and δ is defined by the following table:
    δ     q0          q2          q3
    q0    0 ∪ 10*2    2 ∪ 10*1    ε ∪ 10*
    q2    1 ∪ 20*2    0 ∪ 20*1    20*
    q3    ∅           ∅           ∅
Chapter 5
Nonregular Languages
5.1 Some Examples
5.1.1. Suppose that L = {0^n 1^{2n} | n ≥ 0} is regular. Let M be a DFA that recognizes
L and let n be the number of states of M.
Consider the string w = 0^n 1^{2n}. As M reads the 0’s in w, M goes through a
sequence of states r0, r1, r2, ..., rn. Because this sequence is of length n + 1,
there must be a repetition in the sequence.
Suppose that ri = rj with i < j. Then the computation of M on w looks
like this:
[Diagram: M goes from r0 to ri on 0^i, loops from ri back to ri on 0^{j−i}, and then reads 0^{n−j} 1^{2n}.]
This implies that the string 0^i 0^{n−j} 1^{2n} = 0^{n−(j−i)} 1^{2n} is also accepted. But
since this string no longer has exactly n 0’s, it cannot belong to L. This
contradicts the fact that M recognizes L. Therefore, M cannot exist and L
is not regular.
5.1.2. Suppose that L = {0^i 1^j | 0 ≤ i ≤ 2j} is regular. Let M be a DFA that
recognizes L and let n be the number of states of M.
Consider the string w = 0^{2n} 1^n. As M reads the first n 0’s in w, M goes
through a sequence of states r0, r1, r2, ..., rn. Because this sequence is of
length n + 1, there must be a repetition in the sequence.
Suppose that ri = rj with i < j. Then the computation of M on w looks
like this:
[Diagram: M goes from r0 to ri on 0^i, loops from ri back to ri on 0^{j−i}, and then reads 0^{2n−j} 1^n.]
Now consider going twice around the loop. This implies that the string
0^i 0^{2(j−i)} 0^{2n−j} 1^n = 0^{2n+(j−i)} 1^n is also accepted. But since this string has
more than 2n 0’s, it does not belong to L. This contradicts the fact that M
recognizes L. Therefore, M cannot exist and L is not regular.
5.1.3. Suppose that L = {ww^R | w ∈ {0,1}*} is regular. Let M be a DFA that
recognizes L and let n be the number of states of M.
Consider the string w = 0^n 110^n. As M reads the first n 0’s of w, M goes
through a sequence of states r0, r1, r2, ..., rn. Because this sequence is of
length n + 1, there must be a repetition in the sequence.
Suppose that ri = rj with i < j. Then the computation of M on w looks
like this:
[Diagram: M goes from r0 to ri on 0^i, loops from ri back to ri on 0^{j−i}, and then reads 0^{n−j} 110^n.]
This implies that the string 0^i 0^{n−j} 110^n = 0^{n−(j−i)} 110^n is also accepted.
But this string does not belong to L. This contradicts the fact that M recognizes
L. Therefore, M cannot exist and L is not regular.
5.2 The Pumping Lemma
5.2.1. Let L = {0^i 1^j | i ≤ j}. Suppose that L is regular. Let p be the pumping
length. Consider the string w = 0^p 1^p. Clearly, w ∈ L and |w| ≥ p.
Therefore, according to the Pumping Lemma, w can be written as xyz
where
1. |xy| ≤ p.
2. y ≠ ε.
3. xy^k z ∈ L, for every k ≥ 0.
Condition (1) implies that y contains only 0’s. Condition (2) implies that
y contains at least one 0. Therefore, the string xy^2 z does not belong to L
because it contains more 0’s than 1’s. This contradicts Condition (3) and
implies that L is not regular.
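For a concrete value of p, the argument above can be checked exhaustively. The Python sketch below enumerates every way of writing w = 0^p 1^p as xyz with |xy| ≤ p and y ≠ ε, and confirms that pumping y once more always leaves L. The names `in_L` and `every_split_fails` are hypothetical, and the check only illustrates the proof for small p; it does not replace it.

```python
def in_L(s):
    # L = {0^i 1^j | i <= j}
    i = len(s) - len(s.lstrip("0"))   # number of leading 0's
    rest = s[i:]
    return set(rest) <= {"1"} and i <= len(rest)


def every_split_fails(p):
    w = "0" * p + "1" * p
    for xy_len in range(1, p + 1):        # condition 1: |xy| <= p
        for x_len in range(xy_len):       # condition 2: y is not empty
            x, y, z = w[:x_len], w[x_len:xy_len], w[xy_len:]
            if in_L(x + y * 2 + z):       # x y^2 z should never be in L
                return False
    return True
```

For example, `every_split_fails(5)` returns True: no admissible split of 0^5 1^5 survives pumping.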
5.2.2. Let L = {1^i#1^j#1^{i+j}}. Suppose that L is regular. Let p be the pumping
length. Consider the string w = 1^p#1^p#1^{2p}. Clearly, w ∈ L and |w| ≥ p.
Therefore, according to the Pumping Lemma, w can be written as xyz
where
1. |xy| ≤ p.
2. y ≠ ε.
3. xy^k z ∈ L, for every k ≥ 0.
Since |xy| ≤ p, we have that y contains only 1’s from the first part of
the string. Therefore, xy^2 z = 1^{p+|y|}#1^p#1^{2p}. Because |y| ≥ 1, this string
cannot belong to L. This contradicts the Pumping Lemma and shows that
L is not regular.
5.2.3. Let L be the language described in the exercise. Suppose that L is regular.
Let p be the pumping length. Consider the string w = 1^p 0#1#1^{p+1}.
Clearly, w ∈ L and |w| ≥ p. Therefore, according to the Pumping Lemma,
w can be written as xyz where
1. |xy| ≤ p.
2. y ≠ ε.
3. xy^k z ∈ L, for every k ≥ 0.
Since |xy| ≤ p, we have that y contains only 1’s from the first part of the
string. Therefore, xy^2 z = 1^{p+|y|} 0#1#1^{p+1}. Since |y| ≥ 1, this string does
not belong to L because the sum of the first two numbers no longer equals
the third. This contradicts the Pumping Lemma and shows that L is not
regular.
5.2.4. What is wrong with this proof is that we cannot assume that p = 1. All
that the Pumping Lemma says is that p is positive. We cannot assume
anything else about p. For example, if we get a contradiction for the case
p = 1, then we haven’t really contradicted the Pumping Lemma because
it may be that p has another value.
Chapter 6
Context-Free Languages
b)
R → SN1 | SN0.N1
S → − | ε
N0 → DN0 | ε
N1 → DN0
D → 0 | ··· | 9
c)
I → _R1 | LR0
R0 → _R0 | LR0 | DR0 | ε
R1 → R0LR0 | R0DR0
L → a | ··· | z | A | ··· | Z
D → 0 | ··· | 9
6.2 Formal Definition of CFG’s
There are no exercises in this section.
6.3 More Examples
6.3.1.
a)
S → 0S0 | 1
b)
S → 0S0 | 1S1 | ε
c)
S → 0S11 | ε
d) Here’s one solution:
S → ZS1 | ε
Z → 0 | ε
Here’s another one:
S → 0S1 | T
T → T1 | ε
6.3.2.
S → 1S1 | #T
T → 1T1 | #
6.3.3.
S → (S)S | [S]S | {S}S | ε
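One way to get a feel for this grammar is to write a recognizer that follows its rules directly. The Python sketch below is a small recursive-descent recognizer: on an opening bracket it applies the matching bracket rule, and otherwise it applies S → ε. The names `PAIRS`, `parse_S` and `generated` are hypothetical and are not part of the original solution.

```python
PAIRS = {"(": ")", "[": "]", "{": "}"}


def parse_S(w, i):
    # Try to derive a prefix of w[i:] from S. Returns the index just past
    # that prefix, or None if a bracket rule was started but cannot finish.
    if i < len(w) and w[i] in PAIRS:
        j = parse_S(w, i + 1)                     # the inner S
        if j is not None and j < len(w) and w[j] == PAIRS[w[i]]:
            return parse_S(w, j + 1)              # the trailing S
        return None
    return i                                      # S -> epsilon


def generated(w):
    # w is generated by the grammar iff S derives all of w.
    return parse_S(w, 0) == len(w)
```

For example, `generated("([]){}")` is True, while `generated("(]")` and `generated("((")` are False.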
6.3.4. A string of properly nested parentheses is either () or a string of the
form (u)v where u and v are either empty or strings of properly nested
parentheses.
S → (S)S | ()S | (S) | ()
6.4 Ambiguity and Parse Trees
6.4.4. Two parse trees in the first grammar, written in bracket notation:
    [E [E [E a] ∗ [E a]] + [E a]]
    [E [E a] ∗ [E [E a] + [E a]]]
The unique parse tree in the second grammar:
    [E [E [T [T [F a]] ∗ [F a]]] + [T [F a]]]
Chapter 7
Non Context-Free Languages
7.1 The Basic Idea
7.1.1. Let L denote that language and suppose that L is context-free. Let G be
a CFG that generates L. Let w = a^n b^n c^n where n = b^{|V|} + 1. By the
argument developed in this section, any derivation of w in G contains a
repeated variable. Assume that the repetition is of the nested type. Then
uv^k xy^k z ∈ L for every k ≥ 0. And, as shown at the end of the section, we
can ensure that v and y are not both empty. There are now three cases to
consider.
First, suppose that either v or y contains more than one type of symbol.
Then uv^2 xy^2 z ∉ L because that string is not even in a*b*c*.
Second, suppose that v and y each contain only one type of symbol but
no c’s. Then uv^2 xy^2 z contains more a’s or b’s than c’s. Therefore,
uv^2 xy^2 z ∉ L.
Third, suppose that v and y each contain only one type of symbol, including
some c’s. Then v and y cannot contain both a’s and b’s. This implies
that uv^0 xy^0 z contains fewer c’s than a’s or fewer c’s than b’s. Therefore,
uv^0 xy^0 z ∉ L.
In all three cases, we have a contradiction. This proves that L is not
context-free.
7.2 A Pumping Lemma
7.2.1. Let L denote that language and suppose that L is context-free. Let p be
the pumping length. Consider the string w = 0^p 1^p 0^p. Clearly, w ∈ L and
|w| ≥ p. Therefore, according to the Pumping Lemma, w can be written
as uvxyz where
1. vy ≠ ε.
2. uv^k xy^k z ∈ L, for every k ≥ 0.
There are two cases to consider. First, suppose that either v or y contains
more than one type of symbol. Then uv^2 xy^2 z ∉ L because that string is
not even in 0*1*0*.
Second, suppose that v and y each contain only one type of symbol. The
string w consists of three blocks of p symbols and v and y can touch at
most two of those blocks. Therefore, uv^2 xy^2 z = 0^{p+i} 1^{p+j} 0^{p+k} where at
least one of i, j, k is greater than 0 and at least one of i, j, k is equal to 0.
This implies that uv^2 xy^2 z ∉ L.
In both cases, we have that uv^2 xy^2 z ∉ L. This is a contradiction and
proves that L is not context-free.
7.2.2. Let L denote that language and suppose that L is context-free. Let p be
the pumping length. Consider the string w = a^p b^p c^p. Clearly, w ∈ L and
|w| ≥ p. Therefore, according to the Pumping Lemma, w can be written
as uvxyz where
1. vy ≠ ε.
2. uv^k xy^k z ∈ L, for every k ≥ 0.
There are three cases to consider. First, suppose that either v or y contains
more than one type of symbol. Then uv^2 xy^2 z ∉ L because that string is
not even in a*b*c*.
In the other two cases, v and y each contain only one type of symbol. The
second case is when v consists of a’s. Then, since y cannot contain both
b’s and c’s, uv^2 xy^2 z contains more a’s than b’s or more a’s than c’s. This
implies that uv^2 xy^2 z ∉ L.
The third case is when v does not contain any a’s. Then y can’t either.
This implies that uv^0 xy^0 z contains fewer b’s than a’s or fewer c’s than a’s.
Therefore, uv^0 xy^0 z ∉ L.
In all cases, we have that uv^k xy^k z ∉ L for some k ≥ 0. This is a contradiction
and proves that L is not context-free.
7.2.3. Let L denote that language and suppose that L is context-free. Let p be the
pumping length. Consider the string w = 1^p#1^p#1^{2p}. Clearly, w ∈ L and
|w| ≥ p. Therefore, according to the Pumping Lemma, w can be written
as uvxyz where
1. vy ≠ ε.
2. uv^k xy^k z ∈ L, for every k ≥ 0.
There are several cases to consider. First, suppose that either v or y contains
a #. Then uv^2 xy^2 z ∉ L because it contains too many #’s.
For the remaining cases, assume that neither v nor y contains a #. Note
that w consists of three blocks of 1’s separated by #’s. This implies that v
and y are each completely contained within one block and that v and y
cannot contain 1’s from all three blocks.
The second case is when v and y don’t contain any 1’s from the third block.
Then uv^2 xy^2 z = 1^{p+i}#1^{p+j}#1^{2p} where at least one of i, j is greater than
0. This implies that uv^2 xy^2 z ∉ L.
The third case is when v and y don’t contain any 1’s from the first two
blocks. Then uv^2 xy^2 z = 1^p#1^p#1^{2p+i} where i > 0. This implies that
uv^2 xy^2 z ∉ L.
The fourth case is when v consists of 1’s from the first block and y consists of
1’s from the third block. Then uv^2 xy^2 z = 1^{p+i}#1^p#1^{2p+j} where both i, j are
greater than 0. This implies that uv^2 xy^2 z ∉ L because the first block is
larger than the second block.
The fifth and final case is when v consists of 1’s from the second block and y
consists of 1’s from the third block. Then uv^0 xy^0 z = 1^p#1^{p−i}#1^{2p−j} where
both i, j are greater than 0. This implies that uv^0 xy^0 z ∉ L because the
second block is smaller than the first block.
In all cases, we have a contradiction. This proves that L is not context-free.
7.3 A Stronger Pumping Lemma
7.3.1. Let L denote that language and suppose that L is context-free. Let p be
the pumping length. Consider the string w = 0^p 1^{2p} 0^p. Clearly, w ∈ L and
|w| ≥ p. Therefore, according to the Pumping Lemma, w can be written
as uvxyz where
1. |vxy| ≤ p.
2. vy ≠ ε.
3. uv^k xy^k z ∈ L, for every k ≥ 0.
The string w consists of three blocks of symbols. Since |vxy| ≤ p, v and y
are completely contained within two consecutive blocks. Suppose that v
and y are both contained within a single block. Then uv^2 xy^2 z has additional
symbols of one type but not the other. Therefore, this string is not
in L.
Now suppose that v and y touch two consecutive blocks, the first two, for
example. Then uv^0 xy^0 z = 0^i 1^j 0^p 1^p where 0 < i, j < p. This string is
clearly not in L. The same is true for the other blocks.
Therefore, in all cases, we have that w cannot be pumped. This contradicts
the Pumping Lemma and proves that L is not context-free.
7.3.2. Let L denote that language and suppose that L is context-free. Let p be the
pumping length. Consider the string w = 1^p#1^p#1^{p^2}. Clearly, w ∈ L and
|w| ≥ p. Therefore, according to the Pumping Lemma, w can be written
as uvxyz where
1. |vxy| ≤ p.
2. vy ≠ ε.
3. uv^k xy^k z ∈ L, for every k ≥ 0.
There are several cases to consider. First, suppose that either v or y contains
a #. Then uv^2 xy^2 z ∉ L because it contains too many #’s.
For the remaining cases, assume that neither v nor y contains a #. Note
that w consists of three blocks of 1’s separated by #’s. This implies that v
and y are each completely contained within one block and that v and y
cannot touch all three blocks.
The second case is when v and y are contained within the first two blocks.
Then uv^2 xy^2 z = 1^{p+i}#1^{p+j}#1^{p^2} where at least one of i, j is greater than
0. This implies that uv^2 xy^2 z ∉ L.
The third case is when v and y are both within the third block. Then
uv^2 xy^2 z = 1^p#1^p#1^{p^2+i} where i > 0. This implies that uv^2 xy^2 z ∉ L.
The fourth case is when v consists of 1’s from the first block and y consists
of 1’s from the third block. This case cannot occur since |vxy| ≤ p.
The fifth and final case is when v consists of 1’s from the second block and
y consists of 1’s from the third block. Then uv^2 xy^2 z = 1^p#1^{p+i}#1^{p^2+j}
where both i, j are greater than 0. Now, p(p + i) ≥ p(p + 1) = p^2 + p.
On the other hand, p^2 + j < p^2 + p since j = |y| < |vxy| ≤ p. Therefore,
p(p + i) ≠ p^2 + j. This implies that uv^2 xy^2 z ∉ L.
In all cases, we have a contradiction. This proves that L is not context-free.
Chapter 8
More on Context-Free Languages
8.1 Closure Properties
8.1.2. Here’s a CFG for the language {ai
bj
ck
| i 6= j or j 6= k}:
S → TC0 | A0U
T → aTb | A1 | B1 (ai
bj
, i 6= j)
U → bUc | B1 | C1 (bj
ck
, j 6= k)
A0 → aA0 | ǫ (a∗
)
C0 → cC0 | ǫ (c∗
)
A1 → aA1 | a (a+
)
B1 → bB1 | b (b+
)
C1 → cC1 | c (c+
)
76. 68 CHAPTER 8. MORE ON CONTEXT-FREE LANGUAGES
Now, the complement of $\{a^n b^n c^n \mid n \ge 0\}$ is
$\overline{a^* b^* c^*} \cup \{a^i b^j c^k \mid i \ne j \text{ or } j \ne k\}$.
The language on the left is the complement of a regular language, so it is
regular and, therefore, context-free. We have just shown that the language
on the right is context-free. Therefore, the complement of
$\{a^n b^n c^n \mid n \ge 0\}$ is context-free because the union of two
CFL’s is always context-free.
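The decomposition of the complement can be checked exhaustively on short strings. The following is a small sketch, with hypothetical helper names, that tests for every string over {a, b, c} up to a given length that it lies outside {a^n b^n c^n} exactly when it is either not of the form a*b*c* or has unequal block lengths.

```python
import re
from itertools import product


def in_anbncn(s: str) -> bool:
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n


def in_abc_shape(s: str) -> bool:
    return re.fullmatch(r"a*b*c*", s) is not None


def in_unequal(s: str) -> bool:
    """Membership in {a^i b^j c^k : i != j or j != k}."""
    if not in_abc_shape(s):
        return False
    i, j, k = s.count("a"), s.count("b"), s.count("c")
    return i != j or j != k


def identity_holds(max_len: int) -> bool:
    """Compare both sides of the identity on all strings of length <= max_len."""
    for n in range(max_len + 1):
        for letters in product("abc", repeat=n):
            s = "".join(letters)
            left = not in_anbncn(s)
            right = (not in_abc_shape(s)) or in_unequal(s)
            if left != right:
                return False
    return True
```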
8.1.3. Suppose that w = xy where |x| = |y| but x ≠ y. Focus on one of the
positions where x and y differ. It must be the case that $x = u_1 a u_2$ and
$y = v_1 b v_2$, where $|u_1| = |v_1|$, $|u_2| = |v_2|$, $a, b \in \{0,1\}$ and $a \ne b$. This
implies that $w = u_1 a u_2 v_1 b v_2$. Now, notice that $|u_2 v_1| = |v_2| + |u_1|$. We can
then split $u_2 v_1$ differently, as $s_1 s_2$ where $|s_1| = |u_1|$ and $|s_2| = |v_2|$. This
implies that $w = u_1 a s_1 s_2 b v_2$ where $|u_1| = |s_1|$ and $|s_2| = |v_2|$. The idea
behind a CFG that derives w is to generate $u_1 a s_1$ followed by $s_2 b v_2$. Here’s
the result:
S → T_0 T_1 | T_1 T_0
T_0 → A T_0 A | 0      (u0s, |u| = |s|)
T_1 → A T_1 A | 1      (u1s, |u| = |s|)
A → 0 | 1
Now, the complement of $\{ww \mid w \in \{0,1\}^*\}$ is
$\{w \in \{0,1\}^* \mid |w| \text{ is odd}\} \cup \{xy \mid x, y \in \{0,1\}^*, |x| = |y| \text{ but } x \ne y\}$
The language on the left is regular and, therefore, context-free. We have
just shown that the language on the right is context-free. Therefore, the
complement of $\{ww \mid w \in \{0,1\}^*\}$ is context-free because the union of
two CFL’s is always context-free.
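As a sanity check on the grammar, note that T0 derives exactly the odd-length strings whose middle symbol is 0, and T1 those whose middle symbol is 1. The sketch below (function names are illustrative) tests membership by trying every split of w into two odd-length parts with different middle symbols, and compares the result with the direct definition on all short binary strings.

```python
from itertools import product


def derives(w: str) -> bool:
    """True if S =>* w for S -> T0 T1 | T1 T0."""
    for m in range(1, len(w), 2):                 # odd-length first part w[:m]
        left, right = w[:m], w[m:]
        if len(right) % 2 == 1 and left[m // 2] != right[len(right) // 2]:
            return True
    return False


def is_square(w: str) -> bool:
    """True if w = xx for some string x."""
    half = len(w) // 2
    return len(w) % 2 == 0 and w[:half] == w[half:]


def grammar_matches_definition(max_len: int) -> bool:
    """Check: derives(w) iff |w| is even and w is not of the form xx."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            w = "".join(bits)
            if derives(w) != (n % 2 == 0 and not is_square(w)):
                return False
    return True
```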
8.2 Pushdown Automata
8.2.2. One possible solution is to start with a CFG for this language and then
simulate this CFG with a stack algorithm. Here’s a CFG for this language:
S → 0S1
S → ε
Now, here’s a single-scan stack algorithm that simulates this CFG:
push S on the stack
while (stack not empty)
    if (top of stack is S)
        nondeterministically choose to replace S
        by 0S1 (with 0 at the top of the
        stack) or to delete S
    else // top of stack is 0 or 1
        if (end of input) reject
        read next input symbol c
        if (c equals top of stack)
            pop stack
        else
            reject
if (end of input)
    accept
else
    reject
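One way to experiment with this nondeterministic algorithm is to explore both choices by backtracking. The sketch below is an illustration only, not part of the original solution: the stack is a Python string with its top at index 0, and a pruning test (an addition made here to guarantee termination) refuses to expand S when the stack would then hold more terminals than there are input symbols left.

```python
def accepts_nd(w: str) -> bool:
    """Backtracking simulation of the stack algorithm for S -> 0S1 | epsilon."""

    def run(pos: int, stack: str) -> bool:
        if not stack:
            return pos == len(w)          # accept only at end of input
        top, rest = stack[0], stack[1:]
        if top == "S":
            # Choice 1: replace S by 0S1 (0 on top). Pruned if the stack
            # would need more input symbols than remain.
            fits = len(rest) + 2 <= len(w) - pos
            if fits and run(pos, "0S1" + rest):
                return True
            # Choice 2: delete S.
            return run(pos, rest)
        # Top of stack is 0 or 1: it must match the next input symbol.
        return pos < len(w) and w[pos] == top and run(pos + 1, rest)

    return run(0, "S")
```

The input is accepted if at least one sequence of choices leads to acceptance, which is how nondeterministic acceptance is defined.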
Another solution is a more direct algorithm:
if (end of input) accept
initialize stack to empty
read next char c
while (c is 0)
    push 0 on the stack
    if (end of input) // some 0’s but no 1’s
        reject
    read next char c
while (c is 1)
    if (stack empty) reject // too many 1’s
    pop stack
    if (end of input)
        if (stack empty)
            accept
        else
            reject // too many 0’s
    read next char c
reject // 0’s after 1’s
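The direct algorithm translates almost line by line into a runnable sketch. Here the input is a Python string, `pos` plays the role of the input head, and a list serves as the stack; these representation choices are assumptions of the example.

```python
def accepts_direct(w: str) -> bool:
    """Single-scan stack algorithm for {0^n 1^n : n >= 0}."""
    pos = 0
    if pos == len(w):
        return True
    stack = []
    c = w[pos]
    pos += 1
    while c == "0":
        stack.append("0")
        if pos == len(w):
            return False              # some 0's but no 1's
        c = w[pos]
        pos += 1
    while c == "1":
        if not stack:
            return False              # too many 1's
        stack.pop()
        if pos == len(w):
            return not stack          # leftover 0's mean too many 0's
        c = w[pos]
        pos += 1
    return False                      # 0's after 1's (or an unexpected symbol)
```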
8.3 Deterministic Algorithms for CFL’s
8.3.3. Let L be the language of strings of the form ww. We know that L is not
context-free. If $\overline{L}$ was a DCFL, then L would also be a DCFL because
that class is closed under complementation. This would contradict the fact
that L is not even context-free.
Chapter 9
Turing Machines
9.1 Introduction
9.1.1. The idea is to repeatedly cross off one a, one b and one c.
1. If the input is empty, accept.
2. Scan the input to verify that it is of the form a*b*c*. If not, reject.
3. Return the head to the beginning of the memory.
4. Cross off the first a.
5. Move right to the first b and cross it off. If no b can be found, reject.
6. Move right to the first c and cross it off. If no c can be found, reject.
7. Repeat Steps 3 to 6 until all the a’s have been crossed off. When that
happens, scan right to verify that all other symbols have been crossed
off. If so, accept. Otherwise, reject.
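The crossing-off idea can be tried out with a list standing in for the memory. This is a rough sketch rather than a Turing machine: the crossed-off symbol is written as "x" and the head movements are replaced by list searches, both of which are choices made for this example.

```python
import re


def accepts_anbncn(w: str) -> bool:
    if w == "":
        return True                               # Step 1
    if re.fullmatch(r"a*b*c*", w) is None:
        return False                              # Step 2
    tape = list(w)
    while "a" in tape:
        pos = tape.index("a")                     # Steps 3-4: first remaining a
        tape[pos] = "x"
        for symbol in "bc":                       # Steps 5-6: next b, then next c
            try:
                pos = tape.index(symbol, pos)
            except ValueError:
                return False                      # no b (or no c) left
            tape[pos] = "x"
    return all(cell == "x" for cell in tape)      # Step 7: nothing left over
```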
9.1.2. The idea is to first find the middle of the string and then proceed as we
did in this section for the language {w#w}. In what follows, when we
mark a symbol with an L, we change it into either $0^L$ or $1^L$. Similarly for
marking with an R.
1. If the input is empty, accept.
2. Mark the first unmarked symbol with an L.
3. Move right to the last unmarked symbol. If none can be found, reject
(because the input is of odd length). Otherwise, mark it with an R
and move left.
4. Repeat Steps 2 and 3 until all the symbols have been marked. The
input is now of the form uv where |u| = |v|, all the symbols of u are
marked with an L and all the symbols of v are marked with an R.
5. Verify that u = v by following Steps 2 to 5 of the TM for the language
{w#w}.
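A short sketch of the marking phase, with a list of marks kept alongside the input instead of modified tape symbols (an assumption of this example). The final comparison of u and v is done directly rather than by zigzagging as the TM for {w#w} would.

```python
def accepts_ww(w: str) -> bool:
    if w == "":
        return True                               # Step 1
    marks = [None] * len(w)
    left, right = 0, len(w) - 1
    while left < len(w) and marks[left] is None:
        marks[left] = "L"                         # Step 2: first unmarked symbol
        if marks[right] is not None:
            return False                          # Step 3: odd length
        marks[right] = "R"                        # Step 3: last unmarked symbol
        left, right = left + 1, right - 1
    # Step 4 is complete: w = uv with u marked L and v marked R.
    u = [c for c, m in zip(w, marks) if m == "L"]
    v = [c for c, m in zip(w, marks) if m == "R"]
    return u == v                                 # Step 5
```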
9.2 Formal Definition
9.2.2. We will cross off an a by replacing it with an x. For b and c, we will use
y and z, respectively. Missing transitions go to the rejecting state.
[State diagram: states q0, q1, q2, q3, q4 and qaccept. Transition labels: a → x,R; a,y → R; b → y,R; b,z → R; c → z,L; z,b,y,a → L; x → R; y → R; y,z → R; ␣ → R; ␣ → R.]
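Since only the transition labels are listed here, the sketch below fills in the edges with one assignment that is consistent with every label and with the crossing-off strategy (x for a, y for b, z for c). The edge assignment, the `BLANK` symbol "_" and the step limit are assumptions of this example and are not taken from the diagram itself.

```python
BLANK = "_"

# (state, symbol read) -> (next state, symbol written, head move)
DELTA = {
    ("q0", "a"): ("q1", "x", +1),          # a -> x,R
    ("q0", "y"): ("q4", "y", +1),          # y -> R (no a's left)
    ("q0", BLANK): ("qaccept", BLANK, +1), # empty input
    ("q1", "a"): ("q1", "a", +1),          # a,y -> R
    ("q1", "y"): ("q1", "y", +1),
    ("q1", "b"): ("q2", "y", +1),          # b -> y,R
    ("q2", "b"): ("q2", "b", +1),          # b,z -> R
    ("q2", "z"): ("q2", "z", +1),
    ("q2", "c"): ("q3", "z", -1),          # c -> z,L
    ("q3", "x"): ("q0", "x", +1),          # x -> R
    ("q4", "y"): ("q4", "y", +1),          # y,z -> R
    ("q4", "z"): ("q4", "z", +1),
    ("q4", BLANK): ("qaccept", BLANK, +1),
}
for symbol in "zbya":                       # z,b,y,a -> L
    DELTA[("q3", symbol)] = ("q3", symbol, -1)


def run(w: str, max_steps: int = 10_000) -> bool:
    """Simulate the machine; missing transitions go to the rejecting state."""
    tape = dict(enumerate(w))
    state, head = "q0", 0
    for _ in range(max_steps):
        if state == "qaccept":
            return True
        key = (state, tape.get(head, BLANK))
        if key not in DELTA:
            return False
        state, tape[head], move = DELTA[key]
        head = max(head + move, 0)
    raise RuntimeError("step limit reached")
```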
9.3 Variations on the Basic Turing Machine
9.3.2. Suppose that M is a Turing machine with doubly infinite memory. We
construct a basic Turing machine M′ that simulates M as follows.
1. Let w be the input string. Shift w one position to the right. Place a #
before and after w so the tape contains #w#.
2. Move the head to the first symbol of w and run M.
3. Whenever M moves to the rightmost #, replace it with a blank and
write a # in the next position. Return to the blank and continue
running M.
4. Whenever M moves to the leftmost #, shift the entire contents of the
memory (up to the rightmost #) one position to the right. Write a #
and a blank in the first two positions, put the head on that blank and
continue running M.
5. Repeat Steps 2 to 4 until M halts. Accept if M accepts. Otherwise,
reject.
9.3.3. Suppose that L1 and L2 are decidable languages. Let M1 and M2 be TM’s
that decide these languages. Here’s a TM that decides $\overline{L_1}$:
1. Run M1 on the input.
2. If M1 accepts, reject. If M1 rejects, accept.
Here’s a TM that decides L1 ∪ L2:
1. Copy the input to a second tape.
2. Run M1 on the first tape.
3. If M1 accepts, accept.
4. Otherwise, run M2 on the second tape.
5. If M2 accepts, accept. Otherwise, reject.
Here’s a TM that decides L1 ∩ L2:
1. Copy the input to a second tape.
2. Run M1 on the first tape.
3. If M1 rejects, reject.
4. Otherwise, run M2 on the second tape.
5. If M2 accepts, accept. Otherwise, reject.
Here’s a TM that decides L1 L2:
1. If the input is empty, run M1 on the first tape and M2 on a blank
second tape. If both accept, accept. Otherwise, reject.
2. Mark the first symbol of the input. (With an underline, for example.)
3. Copy the beginning of the input, up to but not including the marked
symbol, to a second tape. Copy the rest of the input to a third tape.
4. Run M1 on the second tape and M2 on the third tape.
5. If both accept, accept.
6. Otherwise, move the mark to the next symbol of the input.
7. While the mark has not reached a blank space, repeat Steps 3 to 6.
8. Delete the mark from the first tape. Run M1 on the first tape and M2
on a blank second tape. If both accept, accept. Otherwise, reject.
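The four constructions can be mirrored with ordinary functions if a decider is modelled as a Python predicate that always returns. This is only an analogy under that assumption: tapes and marks disappear, and the names below are invented for the example.

```python
from typing import Callable

Decider = Callable[[str], bool]   # assumed to halt on every input


def complement(m1: Decider) -> Decider:
    return lambda w: not m1(w)


def union(m1: Decider, m2: Decider) -> Decider:
    return lambda w: m1(w) or m2(w)


def intersection(m1: Decider, m2: Decider) -> Decider:
    return lambda w: m1(w) and m2(w)


def concatenation(m1: Decider, m2: Decider) -> Decider:
    # Try every split w = xy, including x empty and y empty.
    return lambda w: any(m1(w[:k]) and m2(w[k:]) for k in range(len(w) + 1))


# Hypothetical example deciders.
def even_length(w: str) -> bool:
    return len(w) % 2 == 0


def ends_in_1(w: str) -> bool:
    return w.endswith("1")
```

The concatenation decider terminates because there are only |w| + 1 splits to try and each run of M1 and M2 halts.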
9.3.4.
1. Verify that the input is of the form x#y#z where x, y and z are strings
of digits of the same length. If not, reject.
2. Write a # in the first position of tapes 2, 3 and 4.
3. Copy x, y and z to tapes 2, 3 and 4, respectively.
4. Set the carry to 0. (Remember the carry with the states of the TM.)
5. Scan those numbers simultaneously from right to left, using the initial
# to know when to stop. For each position, compute the sum n of the
carry and the digits of x and y (using the transition function). If
n mod 10 is not equal to the digit of z, reject. Set the carry to ⌊n/10⌋.
6. If the carry is 0, accept. Otherwise, reject.
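The digit-by-digit check in Steps 4 to 6 can be written out directly. In this sketch the three tapes are just three Python strings and the carry is a local variable; the function name is made up for the example.

```python
def checks_sum(s: str) -> bool:
    """Accept x#y#z when x, y, z are equal-length digit strings with x + y = z."""
    parts = s.split("#")
    if len(parts) != 3:
        return False
    x, y, z = parts
    if not (len(x) == len(y) == len(z)):
        return False
    if any(ch not in "0123456789" for ch in x + y + z):
        return False
    carry = 0
    # Scan the three numbers simultaneously from right to left.
    for dx, dy, dz in zip(reversed(x), reversed(y), reversed(z)):
        n = carry + int(dx) + int(dy)
        if n % 10 != int(dz):
            return False
        carry = n // 10
    return carry == 0
```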
9.4 Equivalence with Programs
9.4.1. Here’s a TM for the copy instruction:
1. Move the memory head to location i.
2. Copy 32 bits starting at that memory location to an extra tape.
3. Move the memory head to location j.
4. Copy the 32 bits from the extra tape to the 32 bits that start at the
current memory location.
Here’s a TM for the add instruction:
1. Move the memory head to location i.
2. Copy 32 bits starting at that memory location to a second extra tape.
3. Move the memory head to location j.
4. Add the 32 bits from the second extra tape to the 32 bits that start at
the current memory location. (This can be done by adapting the solution
to an exercise from the previous section.) Discard any leftover
carry.
Here’s a TM for the jump-if instruction:
1. Move the memory head to location i.
2. Scan the 32 bits that start at that memory location. If they’re all 0,
transition to the first state of the group of states that implement the
other instruction. Otherwise, continue to the next instruction.
9.4.2. Suppose that L is a decidable language. Let M be a TM that decides this
language. Here’s a TM that decides L*. (Note that this is a high-level
description.)
Let w be the input and let n be the length of w
If w is empty, accept
For each k in {1, 2, ..., n}
    For every partition of w into k substrings
    s[1], ..., s[k]
        Run M on each s[i]
        If M accepts all of them, accept
Reject
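The high-level description can be made concrete by enumerating the partitions as choices of cut points. In this sketch the decider M is a Python predicate that always returns (an assumption), and `decides_star` is a name chosen for the example.

```python
from itertools import combinations
from typing import Callable


def decides_star(m: Callable[[str], bool], w: str) -> bool:
    n = len(w)
    if n == 0:
        return True
    for k in range(1, n + 1):
        # A partition into k nonempty substrings is a choice of k - 1 cut points.
        for cuts in combinations(range(1, n), k - 1):
            bounds = (0, *cuts, n)
            pieces = [w[i:j] for i, j in zip(bounds, bounds[1:])]
            if all(m(s) for s in pieces):
                return True
    return False


# Hypothetical decider for the finite language {ab, c}.
def in_example_language(s: str) -> bool:
    return s in {"ab", "c"}
```

There are 2^(n-1) partitions in total, so this is exponential in n, but it always halts, which is all that decidability requires.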
Chapter 10
Problems Concerning Formal Languages
10.1 Regular Languages
10.1.1. If M is a DFA with input alphabet Σ, then L(M) = Σ* if and only if
$\overline{L(M)} = \emptyset$. This leads to the following algorithm for ALLDFA:
1. Verify that the input string is of the form 〈M〉 where M is a DFA. If
not, reject.
2. Construct a DFA M′ for the complement of L(M). (This can be done
by simply switching the acceptance status of every state in M.)
3. Test if L(M′) = ∅ by using the emptiness algorithm.
4. Accept if that algorithm accepts. Otherwise, reject.
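Steps 2 and 3 can be sketched as follows, assuming a DFA is given as a tuple (states, alphabet, delta, start, accepting) with `delta` a dictionary from (state, symbol) to state. The emptiness test used here is a reachability search, which is one common way to implement it; the representation and names are assumptions of the example.

```python
def reachable_states(delta, start, alphabet):
    """States reachable from `start` by following transitions."""
    seen, todo = {start}, [start]
    while todo:
        q = todo.pop()
        for a in alphabet:
            r = delta[(q, a)]
            if r not in seen:
                seen.add(r)
                todo.append(r)
    return seen


def accepts_everything(states, alphabet, delta, start, accepting):
    # Step 2: complement M by switching the acceptance status of every state.
    co_accepting = set(states) - set(accepting)
    # Step 3: L(M') is empty iff no accepting state of M' is reachable.
    return not (reachable_states(delta, start, alphabet) & co_accepting)


# Hypothetical example DFAs over {0, 1}.
ALPHABET = "01"
ALL = ({"s"}, ALPHABET, {("s", "0"): "s", ("s", "1"): "s"}, "s", {"s"})
EVEN_ONES = (
    {"e", "o"}, ALPHABET,
    {("e", "0"): "e", ("e", "1"): "o", ("o", "0"): "o", ("o", "1"): "e"},
    "e", {"e"},
)
```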
10.1.2. The key observation here is that L(R1) ⊆ L(R2) if and only if L(R1) −
L(R2) = ∅. Since $L(R_1) - L(R_2) = L(R_1) \cap \overline{L(R_2)}$, this language is regular.
This leads to the following algorithm for SUBSETREX:
1. Verify that the input string is of the form 〈R1,R2〉 where R1 and R2
are regular expressions. If not, reject.
2. Construct a DFA M for the language L(R1) − L(R2). (This can be
done by converting R1 and R2 to DFA’s and then combining these
DFA’s using the constructions for closure under complementation and
intersection.)
3. Test if L(M) = ∅ by using the emptiness algorithm.
4. Accept if that algorithm accepts. Otherwise, reject.
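Assuming the two regular expressions have already been converted to DFA's (that conversion is not shown), Steps 2 and 3 amount to searching the pair construction for a reachable state that is accepting in the first DFA and non-accepting in the second. In this sketch each DFA is a triple (delta, start, accepting); the representation and the example automata are hypothetical.

```python
def is_subset(d1, d2, alphabet) -> bool:
    """True iff L(d1) is a subset of L(d2), for DFAs over the same alphabet."""
    (t1, s1, f1), (t2, s2, f2) = d1, d2
    seen, todo = {(s1, s2)}, [(s1, s2)]
    while todo:
        p, q = todo.pop()
        if p in f1 and q not in f2:
            return False          # some string is in L(d1) but not in L(d2)
        for a in alphabet:
            nxt = (t1[(p, a)], t2[(q, a)])
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True


# Hypothetical example DFAs over {0, 1}.
ENDS_IN_1 = (
    {("n", "0"): "n", ("n", "1"): "y", ("y", "0"): "n", ("y", "1"): "y"},
    "n", {"y"},
)
HAS_A_1 = (
    {("n", "0"): "n", ("n", "1"): "y", ("y", "0"): "y", ("y", "1"): "y"},
    "n", {"y"},
)
```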
10.1.3. Let LODD denote the language of strings of odd length. If M is a DFA, then
M accepts at least one string of odd length if and only if L(M) ∩ LODD ≠ ∅.
Since LODD is regular, this leads to the following algorithm:
1. Verify that the input string is of the form 〈M〉 where M is a DFA. If
not, reject.
2. Construct a DFA M′ for the language L(M) ∩ LODD. (This can be done
by combining M with a DFA for LODD using the construction for closure
under intersection.)
3. Test if L(M′) = ∅ by using the emptiness algorithm.
4. Reject if that algorithm accepts. Otherwise, accept.
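A two-state DFA for LODD only needs to remember the parity of the number of symbols read, so the intersection can be explored by pairing each state of M with a parity bit. The sketch below assumes `delta` is a dictionary from (state, symbol) to state; the example automata are hypothetical.

```python
def accepts_some_odd_string(delta, start, accepting, alphabet) -> bool:
    """True iff the DFA accepts at least one string of odd length."""
    seen, todo = {(start, 0)}, [(start, 0)]
    while todo:
        q, parity = todo.pop()
        if parity == 1 and q in accepting:
            return True           # accepting state of the intersection DFA
        for a in alphabet:
            nxt = (delta[(q, a)], 1 - parity)
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return False


# Hypothetical example DFAs over {0, 1}.
EVEN_LENGTH = {("e", "0"): "o", ("e", "1"): "o", ("o", "0"): "e", ("o", "1"): "e"}
ENDS_IN_1 = {("n", "0"): "n", ("n", "1"): "y", ("y", "0"): "n", ("y", "1"): "y"}
```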
10.2 CFL’s
10.2.1. Here’s an algorithm:
1. Verify that the input string is of the form 〈G〉 where G is a CFG. If not,
reject.
2. Determine if G derives ε by using an algorithm for ACFG.
3. Accept if that algorithm accepts. Otherwise, reject.
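For this particular question there is also a direct method that does not go through the general algorithm for ACFG: compute the set of nullable variables by a fixed-point iteration. The sketch below uses that swapped-in technique. A grammar is assumed to be a dictionary from each variable to a list of rule bodies, each body a tuple of symbols; this representation is an assumption of the example.

```python
def derives_empty(rules: dict, start: str) -> bool:
    """True iff the start variable derives the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for var, bodies in rules.items():
            if var in nullable:
                continue
            # A variable is nullable if some body consists only of nullable
            # variables (the empty body qualifies trivially).
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(var)
                changed = True
    return start in nullable


# Hypothetical example grammars.
G1 = {"S": [("0", "S", "1"), ()]}                          # S -> 0S1 | epsilon
G2 = {"S": [("A", "B")], "A": [("0", "A"), ()], "B": [("1",)]}
```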
10.2.2. Let LODD denote the language of strings of odd length. If G is a CFG, then
G derives at least one string of odd length if and only if L(G) ∩ LODD ≠ ∅.
This leads to the following algorithm:
1. Verify that the input string is of the form 〈G〉 where G is a CFG. If not,
reject.
2. Construct a CFG G′ for the language L(G) ∩ LODD. (This can be done
by converting G into a PDA, combining it with a DFA for LODD as
outlined in Section 8.2, and converting the resulting PDA into a CFG.)
3. Test if L(G′) = ∅ by using the emptiness algorithm.
4. Reject if that algorithm accepts. Otherwise, accept.
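The PDA-based construction is awkward to run by hand. As an alternative illustration (a different technique from the one in the solution), the sketch below computes, for every variable, the set of length parities of the terminal strings it derives, by fixed-point iteration. The grammar representation (a dictionary from variables to lists of tuple bodies) is an assumption of the example.

```python
def derives_odd(rules: dict, start: str) -> bool:
    """True iff the start variable derives some terminal string of odd length."""
    parities = {var: set() for var in rules}       # lengths mod 2 found so far
    changed = True
    while changed:
        changed = False
        for var, bodies in rules.items():
            for body in bodies:
                options = {0}                      # parity of the empty prefix
                for sym in body:
                    sym_parities = parities[sym] if sym in rules else {1}
                    options = {(p + q) % 2 for p in options for q in sym_parities}
                if not options <= parities[var]:
                    parities[var] |= options
                    changed = True
    return 1 in parities[start]


# Hypothetical example grammars.
G_EVEN = {"S": [("0", "S", "1"), ()]}                               # 0^n 1^n
G_PAL = {"S": [("0", "S", "0"), ("1", "S", "1"), ("0",), ("1",), ()]}
G_NONE = {"S": [("0", "S")]}                                        # derives nothing
```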
Chapter 11
Undecidability
11.1 An Unrecognizable Language
11.1.1. Suppose, by contradiction, that L is recognized by some Turing machine
M. In other words, for every string w, M accepts w if and only if w ∈ L.
In particular,
M accepts 〈M〉 if and only if 〈M〉 ∈ L
But the definition of L tells us that
〈M〉 ∈ L if and only if M does not accept 〈M〉
This is a contradiction. Therefore, M cannot exist and L is not recognizable.
11.2 Natural Undecidable Languages
11.2.1. Here’s a Turing machine that recognizes $\overline{D}$:
1. Let w be the input string. Find i such that w = si.
2. Generate the encoding of machine Mi.
3. Simulate Mi on si.
4. If Mi accepts, accept. Otherwise, reject.
11.3 Reducibility and Additional Examples
11.3.1. Suppose that algorithm R decides BUMPS_OFF_LEFTTM. We use this
algorithm to design an algorithm S for the acceptance problem:
1. Verify that the input string is of the form 〈M, w〉 where M is a Turing
machine and w is a string over the input alphabet of M. If not, reject.
2. Without loss of generality, suppose that # is a symbol not in the tape
alphabet of M. (Otherwise, pick some other symbol.) Construct the
following Turing machine M′:
(a) Let x be the input string. Shift x one position to the right. Place
a # before x so the tape contains #x.
(b) Move the head to the first symbol of x and run M.
(c) Whenever M moves the head to the #, move the head back one
position to the right.
(d) If M accepts, move the head to the # and move left again.
3. Run R on 〈M′, w〉.
4. If R accepts, accept. Otherwise, reject.
To prove that S decides ATM, first suppose that M accepts w. Then when
M′ runs on w, it attempts to move left from the first position of its tape
(where the # is). This implies that R accepts 〈M′, w〉 and that S accepts
〈M, w〉, which is what we want.
Second, suppose that M does not accept w. Then when M′ runs on w, it
never attempts to move left from the first position of its tape. This implies
that R rejects 〈M′, w〉 and that S rejects 〈M, w〉. Therefore, S decides ATM.
Since ATM is undecidable, this is a contradiction. Therefore, R does not
exist and BUMPS_OFF_LEFTTM is undecidable.
11.3.2. Suppose that algorithm R decides ENTERS_STATETM. We use this
algorithm to design an algorithm S for the acceptance problem:
1. Verify that the input string is of the form 〈M, w〉 where M is a Turing
machine and w is a string over the input alphabet of M. If not, reject.
2. Run R on 〈M, w,qaccept〉, where qaccept is the accepting state of M.
3. If R accepts, accept. Otherwise, reject.
It’s easy to see that S decides ATM because M accepts w if and only if M
enters its accepting state while running on w.
Since ATM is undecidable, this is a contradiction. Therefore, R does not
exist and ENTERS_STATETM is undecidable.
11.3.4. Suppose that algorithm R decides ACCEPTSεTM. We use this algorithm
to design an algorithm S for the acceptance problem:
1. Verify that the input string is of the form 〈M, w〉 where M is a Turing
machine and w is a string over the input alphabet of M. If not, reject.
2. Construct the following Turing machine M′:
(a) Let x be the input string. If x ≠ ε, reject.
(b) Run M on w.
(c) If M accepts, accept. Otherwise, reject.
3. Run R on 〈M′〉.
4. If R accepts, accept. Otherwise, reject.
To prove that S decides ATM, first suppose that M accepts w. Then L(M′) =
{ε}, which implies that R accepts 〈M′〉 and that S accepts 〈M, w〉, which
is what we want. On the other hand, suppose that M does not accept
w. Then L(M′) = ∅, which implies that R rejects 〈M′〉 and that S rejects
〈M, w〉, which is again what we want.
Since ATM is undecidable, this is a contradiction. Therefore, R does not
exist and ACCEPTSεTM is undecidable.
11.3.5. Suppose that algorithm R decides HALTS_ON_ALLTM. We use this
algorithm to design an algorithm S for HALTTM:
1. Verify that the input string is of the form 〈M, w〉 where M is a Turing
machine and w is a string over the input alphabet of M. If not, reject.
2. Construct the following Turing machine M′:
(a) Run M on w.
(b) If M accepts, accept. Otherwise, reject.
3. Run R on 〈M′〉.
4. If R accepts, accept. Otherwise, reject.
Note that M′ ignores its input and always runs M on w. This implies that
if M halts on w, then M′ halts on every input, R accepts 〈M′〉 and S accepts
〈M, w〉. On the other hand, if M doesn’t halt on w, then M′ doesn’t halt
on any input, R rejects 〈M′〉 and S rejects 〈M, w〉. This shows that S decides
HALTTM.
However, we know that HALTTM is undecidable. So this is a contradiction,
which implies that R does not exist and HALTS_ON_ALLTM is undecidable.
11.3.6. Suppose that algorithm R decides EQCFG. We use this algorithm to design
an algorithm S for ALLCFG:
1. Verify that the input string is of the form 〈G〉 where G is a CFG. If not,
reject.
2. Let Σ be the alphabet of G and construct a CFG G′ that generates Σ*.
3. Run R on 〈G, G′〉.
4. If R accepts, accept. Otherwise, reject.
This algorithm accepts 〈G〉 if and only if L(G) = L(G′) = Σ*. Therefore, S
decides ALLCFG, which contradicts the fact that ALLCFG is undecidable.
Therefore, R does not exist and EQCFG is undecidable.
11.3.7. Yes, the proof of the theorem still works, as long as we make two
changes.
First, Step 4 in the description of S should be changed to the following:
If R accepts, reject. Otherwise, accept.
Second, the paragraph that follows the description of S should be changed
as follows:
Suppose that M accepts w. Then L(M′) = $\{0^n 1^n \mid n \ge 0\}$,
which implies that R rejects 〈M′〉 and that S accepts 〈M, w〉.
On the other hand, suppose that M does not accept w. Then
L(M′) = ∅, which implies that R accepts 〈M′〉 and that S rejects
〈M, w〉. Therefore, S decides ATM.
11.3.8. Suppose that algorithm R decides INFINITETM. We use this algorithm to
design an algorithm S for ATM:
1. Verify that the input string is of the form 〈M, w〉 where M is a Turing
machine and w is a string over the input alphabet of M. If not, reject.
2. Construct the following Turing machine M′:
(a) Run M on w.
(b) If M accepts, accept. Otherwise, reject.
3. Run R on 〈M′〉.
4. If R accepts, accept. Otherwise, reject.
To prove that S decides ATM, first suppose that M accepts w. Then M′
accepts every input string, which implies that L(M′) is infinite, R accepts
〈M′〉 and S accepts 〈M, w〉. On the other hand, if M doesn’t accept w, then
L(M′) = ∅, R rejects 〈M′〉 and S rejects 〈M, w〉.
Since ATM is undecidable, this is a contradiction. Therefore, R does not
exist and INFINITETM is undecidable.
11.3.9. Suppose that algorithm R decides DECIDABLETM. We use this algorithm
to design an algorithm S for the acceptance problem. Given a Turing
machine M and a string w, S will construct a new Turing machine M′ whose
language is decidable if and only if M accepts w.
Let MD be a Turing machine that recognizes $\overline{D}$, the complement of the
language D defined in Section 11.1. Exercise 11.2.1 asked you to show that
MD exists. Let Σ be the alphabet of D. (D can be defined over any alphabet.)
1. Verify that the input string is of the form 〈M, w〉 where M is a Turing
machine and w is a string over the input alphabet of M. If not, reject.
2. Construct the following Turing machine M′ with input alphabet Σ:
(a) Let x be the input string.
(b) Run MD on x and M on w at the same time (by alternating
between the two, one step at a time).
(c) As soon as either one accepts, accept. If both reject, reject.
3. Run R on 〈M′〉.
4. If R accepts, accept. Otherwise, reject.
To show that S decides ATM, first suppose that M accepts w. Then L(M′) =
Σ*, which is a decidable language. This implies that R accepts 〈M′〉 and
that S accepts 〈M, w〉. On the other hand, suppose that M does not accept
w. Then L(M′) = $\overline{D}$, which is undecidable because D is not recognizable.
This implies that R rejects 〈M′〉 and that S rejects 〈M, w〉.
Since ATM is undecidable, this is a contradiction. Therefore, R does not
exist and DECIDABLETM is undecidable.
11.4 Rice’s Theorem
11.4.1. Yes, because it is still true that L(M′) = L when M accepts w, and that
L(M′) = ∅ when M does not accept w.
11.5 Natural Unrecognizable Languages
11.5.1.
a) Suppose that L is recognizable and that B is decidable. Let ML be a
Turing machine that recognizes L and let MB be a Turing machine that
decides B. We use ML and MB to design a Turing machine M that
recognizes L ∪ B:
1. Let w be the input string. Run MB on w.
2. If MB accepts, accept.
3. Otherwise (if MB rejects), run ML on w.
4. If ML accepts, accept. Otherwise, reject.
(Note that the order in which M runs MB and ML is important. If the
order was reversed, M might get stuck running ML forever without
getting a chance to run MB. That would be a problem for strings in
B − L.)
b) Suppose that L is recognizable and that B is decidable. Let ML be a
Turing machine that recognizes L and let MB be a Turing machine that
decides B. We use ML and MB to design a Turing machine M that
recognizes L ∩ B:
1. Let w be the input string. Run MB on w.
2. If MB rejects, reject.
3. Otherwise (if MB accepts), run ML on w.
4. If ML accepts, accept. Otherwise, reject.
(The order in which M runs MB and ML is again important.)
11.5.2. From an exercise in the previous section, we know that ACCEPTSεTM
is undecidable. The following Turing machine shows that ACCEPTSεTM is
recognizable:
1. Verify that the input string is of the form 〈M〉 where M is a Turing
machine. If not, reject.
2. Simulate M on ε.
3. If M accepts, accept. Otherwise, reject.
Therefore, by the results of this section, the complement of ACCEPTSεTM
is not recognizable.
Now, the complement of ACCEPTSεTM is the union of NACCEPTSεTM and
the set of strings that are not of the form 〈M〉 where M is a Turing machine.
That second set is decidable, so if NACCEPTSεTM was recognizable, then
the complement of ACCEPTSεTM would be too (by part (a) of the previous
exercise). Therefore, NACCEPTSεTM is not recognizable.
11.5.3.
a) Suppose that L1 and L2 are recognizable languages. Let M1 and M2
be Turing machines that recognize L1 and L2, respectively. We use
M1 and M2 to design a Turing machine M that recognizes L1 ∪ L2:
1. Let w be the input string. Run M1 and M2 at the same time on w
(by alternating between M1 and M2, one step at a time).
2. As soon as either machine accepts, accept. If they both reject,
reject.
If w ∈ L1 ∪ L2, then either M1 or M2 accepts, causing M to accept. On
the other hand, if M accepts w, then it must be that either M1 or M2
accepts, implying that w ∈ L1 ∪ L2. Therefore, L(M) = L1 ∪ L2.
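The step-by-step alternation can be imitated with Python generators. In this sketch, which is an analogy and not a Turing machine, a recognizer is a generator function that yields once per simulated step and finishes by returning True (accept) or False (reject); a recognizer that never halts simply yields forever. All names are invented for the example.

```python
def dovetail_union(m1, m2, w: str) -> bool:
    """Alternate between m1 and m2 on w, one step at a time."""
    runs = [m1(w), m2(w)]
    rejected = [False, False]
    while not all(rejected):
        for i, run in enumerate(runs):
            if rejected[i]:
                continue
            try:
                next(run)                     # one step of machine i
            except StopIteration as halt:
                if halt.value:
                    return True               # as soon as either machine accepts
                rejected[i] = True
    return False                              # both machines rejected


# Hypothetical recognizers.
def has_a_1_or_loops(w):
    """Accepts strings containing a 1; never halts on other strings."""
    for c in w:
        yield
        if c == "1":
            return True
    while True:
        yield


def even_length(w):
    yield
    return len(w) % 2 == 0
```

On the input "00", the first machine never halts, yet the combined machine still accepts because the second machine gets its turn. Running the two machines one after the other would not have this property.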
b) One solution is to adapt the proof of part (a) for union. All that’s
needed is to change Step 2 of M:
If both machines accept, accept. As soon as either one re-
jects, reject.
If w ∈ L1 ∩ L2, then both M1 and M2 accept, causing M to accept.
On the other hand, if M accepts w, then it must be that both M1 and
M2 accept, implying that w ∈ L1 ∩ L2. Therefore, L(M) = L1 ∩ L2.
Another solution is to design a different, and somewhat simpler, M:
1. Let w be the input string. Run M1 on w.
2. If M1 rejects, reject.
3. Otherwise (if M1 accepts), run M2 on w.
4. If M2 rejects, reject. Otherwise, accept.
We can show that this M is correct by using the same argument we
used for the previous M.
c) Suppose that L1 and L2 are recognizable languages. Let M1 and M2
be Turing machines that recognize L1 and L2, respectively. We use M1
and M2 to design a Turing machine M that recognizes L1 L2. Here’s a
description of M in pseudocode:
Let w be the input and let n be the length of w
For each k in {0, 1, 2, ..., n}, in parallel
    Let x be the string that consists of the
    first k symbols of w
    Let y be the string that consists of the
    remaining symbols of w
    In parallel, run M1 on x and M2 on y
    If, at any point, both machines have accepted
        Accept
Reject
If w ∈ L1L2, then w = xy with x ∈ L1 and y ∈ L2. This implies that
for one of the partitions, M will find that both M1 and M2 accept and
M will accept. (Note that the empty input needs no special treatment:
when n = 0, the only partition is x = y = ε.)
It’s easy to see that M accepts w only if w is of the form xy with
x ∈ L1 and y ∈ L2.
(Note that the last step of M, the reject instruction, will be reached
only if M1 and M2 halt on every possible partition of w.)
d) Suppose that L is a recognizable language. Let M be a TM that
recognizes this language. Here’s a TM M′ that recognizes L*:
Let w be the input and let n be the length of w
If w is empty, accept
For each k in {1, 2, ..., n}, in parallel
    For every partition of w into k substrings
    s[1], ..., s[k], in parallel
        Run M on each s[i]
        If, at any point, M has accepted all the
        substrings
            Accept
Reject
If w ∈ L*, then either w = ε or w = s1···sk for some k ≥ 1 with
every si ∈ L. If w = ε, then M′ accepts. If it’s the other case, then in
the branch of the parallel computation that corresponds to substrings
s1,...,sk, M′ will find that M accepts every si and M′ will accept.
On the other hand, it’s easy to see that M′ accepts w only if w ∈ L*.
(Note that the last step of M′, the reject instruction, will be reached
only if M halts on every possible partition of w.)