1. Chapter # 1
Chapter # 1
Languages
Languages
.
.
Dr. Shaukat Ali
Dr. Shaukat Ali
2. Languages.
• Two types of languages:
– Natural Languages .i.e English, Urdu etc.
– Computer Languages / Artificial Languages .i.e. C++, Java etc.
• Natural Languages:
– Three basic entities: letters, words and sentences.
– Groups of letters make up words and groups of words make up
sentences and so on.
– But not all collection of letters form valid words and not all
collection of words form valid sentences.
• Computer Languages:
– Certain character strings are recognizable words .i.e GOTO, END
etc.
– Certain strings of words are recognizable commands.
– Certain set of commands become a program.
3. Languages
.
• It is very hard to state all the rules for a natural
language.
– Many incoherent strings of words are actually
understandable sentences.
– Humans are capable of understanding error prone
sentences that we hear.
• But for computer languages it is necessary to
describe all of the rules explicitly.
• Languages for which all rules cannot be explicitly
stated are called informal languages.
• Languages for which all rules are to be explicitly
stated are called formal language.
4. Theory of formal languages
.
• Formal refers to:
– To the fact that all the rules for the language are
explicitly stated.
• What sequences of symbols can occur ?
• No liberties are tolerated.
– Simply we can say that:
• Formal languages are the game of symbols with formal rules.
• Here we are concerned with the form of the string of symbols,
we are interested in, not the meaning.
5. Theory of formal languages
.
• Structure:
– One finite set of fundamental units , called “alphabet”,
denoted Σ.
– An element of alphabet is called “character”.
– A certain specified set of strings of characters from
the alphabet will be called “language” denoted L.
– Those strings that are permissible in the language we
call “words”.
– The string without letter is called “empty string” or
“null string”, denoted by Λ.
– The language that has no word is denoted by (null
∅
set symbol).
6. Theory of formal languages
.
• Symbols:
–Union operation + (Plus)
–Different operation − (Minus)
–Alphabet Σ (Sigma)
–Empty string Λ (Lambda)
ε (Epsilon)
–Language L (English L)
Γ (Gemma)
–Empty language (Phi)
∅
7. Theory of formal languages
.
• Language Defining. (English Language).
– Given an alphabet Σ = { a,b,c,…,z, ‘,- }.
• Characters of alphabet can often be separated by spaces or commas.
– We can now specify Language (L) as
{ all words in a standard dictionary }.
– If we call this language ENGLISH-WORDS, we may write:
ENGLISH-WORDS = { all words in a standard dictionary }.
– But this language does not have any grammar. To make a
formal definition of the language of the sentence in
ENGLISH.
ENGLISH-SENTENCE = { all words in a standard
dictionary, blank space, the usually punctuation
marks }
8. Theory of formal languages
.
• Language Defining (Example).
– Let Σ = {x} be an alphabet.
– We can define the language by saying that any
nonempty string of alphabet characters is a word.
L = { x xx xxx xxxx … }
– Or to write it in an alternative form.
L = { xn
for n = 1 2 3 … }.
– Similarly a languge containing words of odd number
of characters is.
L2 = { x xxx xxxxx xxxxxxx …}
L2 = { xodd
}
L2 = { x2n-1
for n = 1 2 3 … }.
9. Theory of formal languages
.
• String Concatenation.
– Two strings are written side by side to form a new
longer string.
– For example, if we concatenate the word xxx with the
word xxxx, we obtain the word xxxxxxx.
– Mathematically it can be shown as:
Xm
concatenated with Xn
is the word Xn+m
– If a = xxx and b = xx then to denote the word formed
by concatenating a and b is:
ab = xxxxx
– In this simple example ab = ba. But this relationship
does not hold for all languages.
• For example when we concatenate “house” and “boat”. We
will get different strings.
10. Some Definitions
.
• The function “Length” of a string defines the
number of letters in the string.
– For example, if a word a = xxxx in L, then Length(a)=4.
– In any language that includes Λ, we have Length(Λ)=0.
• The function reverse is defined by if a is a word in
L, then reverse(a) is the same string of letters
spelled backward, called the reverse of a.
– For example, reverse(123)=321.
– Remark: The reverse(a) is not necessary in the language
of a.
11. Some Definitions
.
• We define the function na(w) of a w to be the number
of letter a in the string w.
– For example, if a word w = aabbac in L, then na(w)=3.
• We define a new Language called PALINDROME
over the alphabet if
Language = { Λ and all strings x such that reserve(x)=x }.
– For example, let Σ={ a, b }, and
PALINDROME={ Λ a b aa bb aaa aba bab bbb …}.
– We usually put words in size order and then listed all the
words of the same length alphabetically. This order is
called lexicographic order.
12. Some Definitions
.
• Kleene Closure.
– Given an alphabet Σ, the language that any string of characters
from Σ is a word, even the null string.
– This language is called the closure of the alphabet. It is denoted
by
Σ*
.
– This notation is sometimes known as the Kleene star.
– It is like a loop on the alphabets with zero or many iterations.
– Kleene star can be considered as an operation that makes an
infinite language.
– When we say “infinite language”, we mean infinitely many words
in the language, each of finite length.
– If Σ = { x }, then
Σ*
= {Λ x xx xxx xxxx xxxxx ---- }
– Similarly if Σ = { 0, 1 }, then
Σ*
= {Λ 0 1 00 11 01 10 001 010 011 ----- }
13. Some Definitions
.
• if S is a set of words, then by S*
we mean the set of
all finite strings formed by concatenating words
from S including the null string.
– Example: If S = { aa, b }then
S*
= { Λ and any word composed of factors of aa and b }.
= { Λ and all strings of a’s and b’s in which the a’s
occurs in even clumps }.
= {Λ, b, aa, bb, aab, baa, bbb, aaaa, aabb, baab, bbaa,
bbbb, aaaab, aabaa, aabbb, baabb,----}
– The string aabaaab is not in S* since it has a clump of
a’s of length 3.
14. Some Definitions
.
• Positive Closure.
– Given an alphabet Σ, the language that any string (not
zero) of characters in Σ are in this language is called
the positive closure of the alphabet. It is denoted by
Σ+
.
– Example if Σ = { a, b }, then
Σ+
= { a, b, aa, ab, bb, ba, aaa, aba, aab, abb,----}
– It is loop of one or more iterations.
– Positive Closure is the same a Kleene Closure except
for the null string Λ.
– Similarly S+
is the same as S*
except for the null string.
– Example: Let L={ ab }, Then
L+
= { ab abab ababab …}.
15. Theorem 1
.
• For any set S of strings we have S* = S**
• Lets first show that what this theorem means.
– Given an alphabet S={ aa bbb }. Then
– S* is the set of all strings where a’s occur in even
clumps and b’s in groups of 3, 6, 9….
– Some words in S* are bbb aabbbaaaa bbbaa.
– Now if we take (S*)*, we will get one big string by
concatenating all the words in S*. Such as:
(S*)* = {---, bbbaabbbaaaabbbaa, ----}.
– But this big string is also present in S*. Therefore:
bbbaabbbaaaabbbaa = (bbb)(aa)(bbb)(aa)(aa)(bbb)(aa)
– Therefore any string that is contained in S* is also
present in S**.
– Note (S*)* = S**
16. Theorem 1
.
• Proof.
– Every words in S** is made up of factors from S*.
– Every words in S* is made up of factors from S.
– Therefore every words in S** is made up of factors
from S.
– Similarly every word in S** is also a word in S*.
Therefore we can say that S** is contained or equal to
S*.
– Similarly every word in S* is contained or equal to
S**.
– Therefore S* = S**
17. Problems in class room
.
• Consider the language S* where S = { a, b}. How many
words does this language have of length 2 and length 3.
• Consider the language S* where S = { aa, b}. How many
words does this language have of length 4, length 5 and
length 6.
• Consider the language S* where S = { ab, ba}. Can any
word in this language contain a substring aaa or bbb.
• Consider the language S* where S = { aa, aba, baa}. Show
that the words aabaa, baaabaaa and baaaaababaaaa are all
in this language.
• Let S = { ab, bb } and let T = { ab, bb, bbbb }. Show that
S* = T*
18. Recursive Definition
.
• A mathematical method for defining a set of new
language.
• A recursive definition is normally a three-step
process.
– First, we specify some basic objects in the set.
– Second, we give rules for constructing more object in
the set from the ones we already know.
– Third, we declare that no object except those
constructed in this way (by First and Second) are
allowed in the set.
• This is called recursive because rules for defining
objects calls themselves again and again.
19. Example
.
• To define the set of positive EVEN integers.
– One standard way of defining this set is:
EVEN is the set of all positive whole numbers divisible by 2.
– Another way of defining this set is:
EVEN is the set of all 2n where n = 0, 1, 2, 3, ----
– By using recursive definition.
• The set EVEN is defined by these three rules.
– Rule 1: 0, 2 are in EVEN. (Defining basic object in the set.)
– Rule 2: if x is in EVEN, then so is x+2.(More objects.)
– Rule 3: The only elements in the set EVEN are those that
can be produced from the two rules above.
• The last rule above is completely redundant.
– There is no need of it, because the result can be obtained from
the above two rules.
• Here we define EVEN in terms of previously known elements of
EVEN.
20. Proof
.
• Suppose that we want to prove that 14 is in the set EVEN.
– By using first definition, we divide 14 by 2 and find that there is no
remainder, therefore it is in EVEN set.
– By using second definition, we have to come up with the number .i.e. 7
and then, since 14 = (2)(7), therefore it is in EVEN set.
– By using recursive definition is a lengthier process.
• By Rule1, we know that 2 is in EVEN.
• By Rule2, we know that 2+2=4 is in EVEN.
• By Rule2, we know that 4+2=6 is in EVEN. (4 has been shown in EVEN).
• By Rule2, we know that 6+2=8 is in EVEN. (6 has been shown in EVEN).
• By Rule2, we know that 8+2=10 is in EVEN. (8 has been shown in EVEN).
• By Rule2, we know that 10+2=12 is in EVEN. (10 has been shown in EVEN).
• By Rule2, we know that 12+2=14 is in EVEN. (12 has been shown in EVEN).
– This process is pretty horrible, it takes a lengthy time (greater number of
steps) to find an object belongs to or not.
21. Another Definition
.
• The set EVEN is also defined by these three rules.
– Rule1: 0, 2 are in EVEN.
– Rule2: if x and y are in EVEN, then so is x+y.
– Rule3: The only elements in the set EVEN are those
that can be produced from the two rules above.
– Now to prove that 14 is in the EVEN.
• By Rule1, we know that 2 is in EVEN.
• By Rule2, x = 2, y = 2 → 4 is in EVEN.
• By Rule2, x = 2, y = 4 → 6 is in EVEN.
• By Rule2, x = 4, y = 4 → 8 is in EVEN.
• By Rule2, x = 6, y = 8 → 14 is in EVEN.
– Rule 2 also applies to the case where x and y stand for the same
number.
– This method requires the fewer number of steps to prove. Therefore
this definition is better than the first one.
– This definition is still harder to use than the two non-recursive
definition but it has a certain advantage.
• It gives us a rule that the sum of two numbers in EVEN is also a
number in EVEN.
22. Example
.
• To define recursive definition for positive integers.
– Rule 1: 1 is in INTEGER.
– Rule 2: If x is in INTEGER, then so is x + 1.
– Rule3: The only elements in the set POSITIVE are those that can
be produced from the two rules above.
• This definition only applies to positive integer in the
INTEGER set.
• To extend this definition to include both the positive and
negative integers, we should use the following recursive
definition.
– Rule 1: 1 is in INTEGER.
– Rule 2: If x and y are in INTEGER, then so are x + y, x – y and
xy.
– Rule3: The only elements in the set INTEGER are those that can
be produced from the two rules above.
23. Example
.
• Recursive definition for polynomials.
– A polynomial is a finite set of terms, each of which is
of the form a real number times a power of x ( that
may be x0
= 1).
– The set polynomial is defined by the following rules.
• Rule 1: Any number is in POLYNOMIAL
• Rule 2: Any variable x is in POLYNOMIAL.
• Rule 3: if x and y are in POLYNOMIAL, then so is x+y, x-y,
x×y and (x).
• Rule 4: The only elements in the set POLYNOMIAL are
those that can be produced from
the three rules above.
24. Proof
.
• Show that 3x2
+7x-9 is in POLYNOMIAL.
– By Rule 1: 3 is in polynomial.
– By Rule 2: x is in polynomial.
– By Rule 3: (3)(x) is in polynomial, call it 3x.
– By Rule 3: (3x)(x) is in polynomial, call it 3x2
.
– By Rule 1: 7 is in polynomial.
– By Rule 3: (7)(x) is in polynomial, call it 7x.
– By Rule 3: 3x2
+ 7x is in polynomial.
– By Rule 1: -9 is in polynomial.
– By Rule 3: 3x2
+ 7x + (-9) is in polynomial, call it
3x2
+ 7x – 9.
25. More Examples.
• Write recursive definition for the followings.
1. L = { x, xx, xxx, xxxx, ---- }
1. Rule 1: x is in L.
2. Rule 2: If w is any word in L, the xw is also in L.
2. L = { xodd
} = { x, xxx, xxxxx, xxxxxxx, ---- }
1. Rule 1: x is in L.
2. Rule 2: If w is any word in L, the xxw is also in L.
3. L = {Λ, x, xx, xxx, xxxx, ---- }
1. Rule 1: Λ and x are in L.
2. Rule 2: If w is any word in L, the xw is also in L.
4. L = {1, 2, 3, 4, 5, ---- }
1. Rule 1: 1, 2, 3, 4, 5, 6, 7, 8, 9 are in L.
2. Rule 2: If w is any word in L, the w0, w1, w2, w3, w4, w5, w6, w7,
w8, w9 are also in L.
5. Definition for Kleene Closure.
1. If S is a language, then all words of S are in S*.
2. Λ is in S*.
3. If x and y are in S*, then so is their concatenation xy.