2 classical cryptosystems

Lecture 2
Classical Cryptosystems
Shift cipher
Substitution cipher
Vigenère cipher
Hill cipher

1

Shift Cipher
• A Substitution Cipher
• The Key Space:
– [0 … 25]
• Encryption given a key K:
– each letter in the plaintext P is replaced with the K’th
letter following it in the alphabet, wrapping around
from Z to A (shift right)
• Decryption given K:
– shift left
• History: K = 3, Caesar’s cipher

2

Shift Cipher
• Formally:
• Let P = C = K = Z26. For 0 ≤ K ≤ 25, define
eK(x) = (x + K) mod 26
and
dK(y) = (y − K) mod 26

(x, y ∈ Z26)

3

Shift Cipher: An Example
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

• P = CRYPTOGRAPHYISFUN (note that punctuation is often
eliminated)
• K = 11
• C = NCJAEZRCLASJTDQFY
• C → 2; 2+11 mod 26 = 13 → N
• R → 17; 17+11 mod 26 = 2 → C
• …
• N → 13; 13+11 mod 26 = 24 → Y
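
The letter-by-letter computation above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original lecture material: the function names `shift_encrypt` and `shift_decrypt` are hypothetical, and the input is assumed to contain only the uppercase letters A to Z.

```python
def shift_encrypt(plaintext: str, key: int) -> str:
    """Apply e_K(x) = (x + K) mod 26 to every letter (A=0, ..., Z=25)."""
    return "".join(chr((ord(ch) - 65 + key) % 26 + 65) for ch in plaintext)


def shift_decrypt(ciphertext: str, key: int) -> str:
    """Apply d_K(y) = (y - K) mod 26, i.e. shift left by K."""
    return shift_encrypt(ciphertext, -key)


cipher = shift_encrypt("CRYPTOGRAPHYISFUN", 11)
print(cipher)
print(shift_decrypt(cipher, 11))
```

Python's `%` operator always returns a non-negative result for a positive modulus, so decryption can reuse the encryption routine with a negated key.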

4

Shift Cipher: Cryptanalysis
• Can an attacker find K?
– YES: exhaustive search, key space is small (<= 26
possible keys).
– Once K is found, very easy to decrypt

Exercise 1: decrypt the following ciphertext
hphtwwxppelextoytrse

Exercise 2: decrypt the following ciphertext
jbcrclqrwcrvnbjenbwrwn
VERY useful MATLAB functions can be found here:
http://www2.math.umd.edu/~lcw/MatlabCode/
5
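
Exhaustive search over the 26 keys can be sketched as follows. This is an illustrative Python sketch (the lecture itself points to MATLAB helpers); `brute_force` is a hypothetical name, and recognizing the meaningful candidate is left to the reader. The ciphertext used here is the K = 11 encryption of CRYPTOGRAPHYISFUN, so the exercises stay unsolved.

```python
def shift_decrypt(ciphertext: str, key: int) -> str:
    """Shift every letter left by `key` positions (case is folded to upper)."""
    return "".join(chr((ord(ch) - 65 - key) % 26 + 65) for ch in ciphertext.upper())


def brute_force(ciphertext: str):
    """Return all 26 candidate decryptions as (key, plaintext) pairs."""
    return [(key, shift_decrypt(ciphertext, key)) for key in range(26)]


for key, candidate in brute_force("NCJAEZRCLASJTDQFY"):
    print(key, candidate)
```

Only one of the 26 printed lines reads as English, which is what makes the small key space fatal.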

General Monoalphabetic
Substitution Cipher
• The key space: all possible permutations of
Σ = {A, B, C, …, Z}
• Encryption, given a key (permutation) π:
– each letter X in the plaintext P is replaced with π(X)
• Decryption, given a key π:
– each letter Y in the ciphertext C is replaced with π⁻¹(Y)
• Example
  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
π B A D C Z H W Y G O Q X L V T R N M S K J I P E F U

• BECAUSE → AZDBJSZ
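
A permutation key can be applied with Python's built-in translation tables. The sketch below is an illustration only: the key string `PI` is the example row with M mapped to L (an assumption, made so that every letter appears exactly once and decryption is well defined).

```python
import string

ALPHABET = string.ascii_uppercase
# Example permutation; M -> L is assumed so that the row is a true permutation.
PI = "BADCZHWYGOQXLVTRNMSKJIPEFU"
assert sorted(PI) == list(ALPHABET)  # sanity check: PI is a permutation

ENCRYPT = str.maketrans(ALPHABET, PI)   # X -> pi(X)
DECRYPT = str.maketrans(PI, ALPHABET)   # Y -> pi^{-1}(Y)

print("BECAUSE".translate(ENCRYPT))
print("AZDBJSZ".translate(DECRYPT))
```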
6

Strength of the General
Substitution Cipher
• Exhaustive search is now infeasible
– key space size is 26! ≈ 4 × 10^26
• Dominated the art of secret writing
throughout the first millennium A.D.
• Thought to be unbreakable by many back then

7

Affine Cipher
• The Shift cipher is a special case of the
Substitution cipher where only 26 of the 26!
possible permutations are used
• Another special case of the substitution cipher is
the Affine cipher, where the encryption function
has the form
e(x) = (ax + b) mod 26 (a, b ∈ Z26)

• Note that with a=1 we have a Shift cipher.
• When is decryption possible?

8

Affine Cipher
• Decryption is possible if the affine function is injective
• In other words, for any y in Z26 we want the
congruence
ax + b ≡ y (mod 26)
to have a unique solution for x.
• This congruence is equivalent to
ax ≡ y − b (mod 26)
• Now, as y varies over Z26, so too does y − b.
Hence it suffices to study the congruence
ax ≡ y (mod 26)

9

Affine Cipher
ax ≡ y (mod 26)
• This congruence has a unique solution for every y if and
only if gcd(a,26)=1 (i.e., a and 26 are relatively prime)
• gcd = greatest common divisor

• Suppose that gcd(a,26)=d>1,
for example gcd(4,26)=2
• Then e(x) = 4x+7 mod 26 is NOT a valid encryption function
• For example, both ‘a’ and ‘n’ encrypt to H
(more generally, x and x+13 encrypt to the same
value, since 4·13 = 52 ≡ 0 mod 26)

affinecrypt('a',4,7) = affinecrypt('n',4,7) = 'h'
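
The collision and the role of gcd(a,26) can be checked with a short Python sketch. The names `affine_encrypt` and `affine_decrypt` are hypothetical stand-ins for the MATLAB `affinecrypt` call above; lowercase a to z input is assumed, and `pow(a, -1, 26)` (modular inverse) requires Python 3.8 or later.

```python
from math import gcd


def affine_encrypt(text: str, a: int, b: int) -> str:
    """e(x) = (a*x + b) mod 26 on lowercase letters (a=0, ..., z=25)."""
    return "".join(chr((a * (ord(ch) - 97) + b) % 26 + 97) for ch in text)


def affine_decrypt(text: str, a: int, b: int) -> str:
    """d(y) = a^{-1} * (y - b) mod 26; only defined when gcd(a, 26) = 1."""
    if gcd(a, 26) != 1:
        raise ValueError("a must be relatively prime to 26")
    a_inv = pow(a, -1, 26)  # modular inverse of a
    return "".join(chr((a_inv * (ord(ch) - 97 - b)) % 26 + 97) for ch in text)


# a = 4 is not invertible mod 26: two plaintext letters collide.
print(affine_encrypt("a", 4, 7), affine_encrypt("n", 4, 7))
# a = 7 is invertible, so the round trip recovers the plaintext.
print(affine_decrypt(affine_encrypt("hot", 7, 3), 7, 3))
```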
10

Cryptanalysis of Substitution Ciphers:
Frequency Analysis
• Basic ideas:
– Each language has certain features: frequency of
letters, or of groups of two or more letters.
– Substitution ciphers preserve the language
features.
– Substitution ciphers are vulnerable to frequency
analysis attacks.

11

Frequency of Letters in English

12

Frequency of Letters in French

13

Other Frequency Features of English
• Vowels, which constitute 40 % of plaintext, are
often separated by consonants.
• Letter “A” is often found in the beginning of a
word or second from last.
• Letter “I” is often third from the end of a
word.
• Letter “Q” is followed only by “U”
• And more …
14

Substitution Ciphers: Cryptanalysis
• The occurrences of each ciphertext character
or combination of characters are counted to determine the
frequency of usage.
• The ciphertext is examined for patterns,
repeated series, and common combinations.
• Ciphertext characters are replaced with possible
plaintext equivalents using known language
characteristics.
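
The counting step can be sketched with Python's `collections.Counter`. This is a minimal illustration with hypothetical helper names; matching the resulting ranking against a language's frequency table still has to be done by the analyst.

```python
from collections import Counter


def letter_frequencies(text: str):
    """Return (letter, relative frequency) pairs, most frequent first."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    counts = Counter(letters)
    return [(ch, n / len(letters)) for ch, n in counts.most_common()]


def bigram_counts(text: str):
    """Count pairs of adjacent letters (non-letters are dropped first)."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    return Counter(a + b for a, b in zip(letters, letters[1:]))


sample = "NCJAEZRCLASJTDQFY"
print(letter_frequencies(sample)[:3])
print(bigram_counts(sample).most_common(3))
```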

15

Frequency Analysis History
• Earliest known description of frequency
analysis is in a book by the ninth-century
scientist al-Kindi
• Rediscovered or introduced in Europe during
the Renaissance
• Frequency analysis made the substitution cipher
insecure

16

Improve the Security of the
Substitution Cipher
• Using nulls
– e.g., using numbers from 1 to 99 as the ciphertext
alphabet, some numbers representing nothing are
inserted randomly
• Deliberately misspell words
– e.g., “Thys haz thi ifekkt off diztaughting thi ballans off
frikwenseas”
• Homophonic substitution cipher
– each letter is replaced by a variety of substitutes
• These make frequency analysis more difficult, but
not impossible

17

Summary
• Shift ciphers are easy to break using brute
force attacks because they have a small key space.
• Substitution ciphers preserve language
features and are vulnerable to frequency
analysis attacks.

18

Towards the Polyalphabetic
Substitution Ciphers
• Main weakness of monoalphabetic
substitution ciphers
– each letter in the ciphertext corresponds to only
one letter in the plaintext
• Idea for a stronger cipher (1460s, by Alberti)
– use more than one cipher alphabet, and switch
between them when encrypting different
letters
• Developed into a practical cipher by Vigenère
(published in 1586)
19

The Vigenère Cipher
• Definition:
Given m, a positive integer, P = C = (Z26)^m, and K = (k1,
k2, … , km) a key, we define:
• Encryption:
eK(p1, p2, …, pm) = (p1+k1, p2+k2, …, pm+km) (mod 26)
• Decryption:
dK(c1, c2, …, cm) = (c1−k1, c2−k2, …, cm−km) (mod 26)
• Example:
Plaintext: CRYPTOGRAPHY
Key: LUCKLUCKLUCK
Ciphertext: N L A Z E I I B L J J I
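
The definition above translates directly into code. The following Python sketch is illustrative only: `vigenere` is a hypothetical name, uppercase A to Z input is assumed, and the keyword is repeated with `itertools.cycle` to cover messages longer than m.

```python
from itertools import cycle


def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Add (or subtract, when decrypting) the repeating key letters mod 26."""
    sign = -1 if decrypt else 1
    out = []
    for ch, k in zip(text, cycle(key)):
        shift = sign * (ord(k) - 65)
        out.append(chr((ord(ch) - 65 + shift) % 26 + 65))
    return "".join(out)


cipher = vigenere("CRYPTOGRAPHY", "LUCK")
print(cipher)                                  # NLAZEIIBLJJI
print(vigenere(cipher, "LUCK", decrypt=True))  # CRYPTOGRAPHY
```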
20

Vigenère Square

Plaintext:
CRYPTOGRAPHY
Key:
LUCKLUCKLUCK
Ciphertext:
NLAZEIIBLJJI

21

Security of Vigenère Cipher
• Vigenère masks the frequency with which a
character appears in a language: one letter in
the ciphertext corresponds to multiple letters
in the plaintext. This makes the use of frequency
analysis more difficult.
• Any message encrypted by a Vigenère cipher
is a collection of as many shift ciphers as there
are letters in the key.

22

Vigenère Cipher: Cryptanalysis
• Find the length of the key.
• Divide the message into that many shift cipher
encryptions.
• Use frequency analysis to solve the resulting
shift ciphers.
– how?

23

How to Find the Key Length?
• For Vigenère, as the length of the keyword
increases, the letter frequency shows less
English (or French)-like characteristics and
becomes more random (as the key length tends to
infinity, see the One-Time Pad).
• Two methods to find the key length:
– Kasiski test
– Index of coincidence (Friedman)

24

Kasiski Test
• (First described in 1863 by Friedrich Kasiski)
• Note: two identical segments of plaintext will be
encrypted to the same ciphertext if they occur in
the text at a distance Δ with Δ ≡ 0 (mod m), where m is the
key length.
• Algorithm:
– Search for pairs of identical ciphertext segments of length at
least 3
– Record the distances between the two segments: Δ1, Δ2,
…
– m (probably) divides gcd(Δ1, Δ2, …)

25

Example of the Kasiski Test
• Key:
KINGKINGKINGKINGKINGKING
• Plaintext:
thesunandthemaninthemoon
• Ciphertext:
DPRYEVNTNBUKWIAOXBUKWWBT

The repeated segment BUKW occurs twice, 8 positions apart.
The length of the keyword probably divides 8 evenly
(e.g., it may be 2, 4 or 8)
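
The search for repeated segments can be automated. The Python sketch below is one possible reading of the algorithm, under stated assumptions: it looks at segments of exactly 3 letters and records distances between consecutive occurrences; `kasiski_distances` is a hypothetical name.

```python
from collections import defaultdict
from functools import reduce
from math import gcd


def kasiski_distances(ciphertext: str, length: int = 3):
    """Distances between consecutive occurrences of each repeated segment."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - length + 1):
        positions[ciphertext[i:i + length]].append(i)
    distances = []
    for starts in positions.values():
        distances.extend(b - a for a, b in zip(starts, starts[1:]))
    return distances


distances = kasiski_distances("DPRYEVNTNBUKWIAOXBUKWWBT")
print(distances)               # [8, 8]  (from BUK and UKW)
print(reduce(gcd, distances))  # 8
```

On real ciphertexts some repetitions are accidental, so the gcd of all distances can collapse to 1; in that case it helps to look at the most common factors of the distances instead.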
26

Index of Coincidence (Friedman)
• Informally: Measures the probability that two
randomly chosen elements of the n-letter string x are
identical.
• Definition:
Suppose x = x1x2…xn is a string of n alphabetic
characters. Then, the index of coincidence of
x, denoted Ic(x), is defined to be the
probability that two random elements of x are
identical.

27

Index of Coincidence (cont.)
• Reminder: binomial coefficient
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$
• It denotes the number of ways of choosing a subset of k
objects from a set of n objects.
• Suppose we denote the frequencies of A, B, C … Z in x by
f0, f1, … f25 (respectively).

• We want to compute Ic(x)

28

Elements of Probability Theory
• A random experiment has an unpredictable
outcome.
• Definition
The sample space (S) of a random
phenomenon is the set of all outcomes for a
given experiment.
• Definition
An event (E) is a subset of the sample space, i.e.,
any collection of outcomes.
30

Basic Axioms of Probability
• If E is an event and Pr(E) is the probability that
event E occurs, then
– 0 ≤ Pr(E) ≤ 1 for any event E in S.
– Pr(S) = 1, where S is the sample space.
– If E1, E2, … En is a sequence of mutually exclusive
events, that is Ei ∩ Ej = ∅ for all i ≠ j, then:
$$\Pr(E_1 \cup E_2 \cup \dots \cup E_n) = \sum_{i=1}^{n} \Pr(E_i)$$

31

Probability: More Properties
• If E is an event and Pr(E) is the probability that
the event E occurs then
– Pr(Ê) = 1 - Pr(E), where Ê is the complementary
event of E
– If the outcomes in S are equally likely, then
Pr(E) = |E| / |S|
(where |S| denotes the cardinality of the set S)

32

Example

• Random throw of a pair of dice.
• What is the probability that the sum is 3?
Solution: Each die can take six different values
{1,2,3,4,5,6}. The number of possible outcomes
(values of the pair of dice) is 36, therefore each
outcome occurs with probability 1/36.
Examine the sum: 3 = 1+2 = 2+1
The probability that the sum is 3 is 2/36 = 1/18.

• What is the probability that the sum is 11?
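
The rule Pr(E) = |E| / |S| can be checked by enumerating the sample space. A small Python sketch (the function name `probability_of_sum` is hypothetical):

```python
from fractions import Fraction
from itertools import product


def probability_of_sum(target: int) -> Fraction:
    """Pr(sum of two fair dice == target) = |E| / |S| with |S| = 36."""
    outcomes = list(product(range(1, 7), repeat=2))        # sample space S
    favourable = [o for o in outcomes if sum(o) == target]  # event E
    return Fraction(len(favourable), len(outcomes))


print(probability_of_sum(3))  # 1/18
```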
33

• Reminder: binomial coefficient
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$
• It denotes the number of ways of choosing a subset of k
objects from a set of n objects.

• Suppose we denote the frequencies of A, B, C … Z in x by
f0, f1, … f25 (respectively).
• We want to compute Ic(x)

35

• We can choose two elements of x (whose size is n) in $\binom{n}{2}$ ways.
Example: if n=3, nchoosek(3,2)=3; there are 3 ways of
choosing couples of n=3 items (example: the string ABC).
• For each i in [0…25], there are $\binom{f_i}{2}$ ways of choosing
both elements to be i. Hence we have the formula
$$I_c(x) = \frac{\sum_{i=0}^{25} \binom{f_i}{2}}{\binom{n}{2}} = \frac{\sum_{i=0}^{25} f_i (f_i - 1)}{n(n-1)}$$
36

Example: IC of a String
• Consider the text
x = “THEINDEXOFCOINCIDENCE”
• There are 21 characters, with frequencies
C 3, D 2, E 4, F 1, H 1, I 3, N 3, O 2, T 1, X 1
• Applying
$$I_c(x) = \frac{\sum_{i=0}^{25} f_i (f_i - 1)}{n(n-1)}$$
gives
• Ic = (3*2 + 2*1 + 4*3 + 1*0 + 1*0 + 3*2 + 3*2 +
2*1 + 1*0 + 1*0) / (21*20) = 34/420 ≈ 0.0810
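
The same computation can be done mechanically. A minimal Python sketch of the formula, assuming the string contains only the letters to be counted (the name `index_of_coincidence` is hypothetical):

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """I_c(x) = sum_i f_i (f_i - 1) / (n (n - 1))."""
    n = len(text)
    counts = Counter(text)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))


print(index_of_coincidence("THEINDEXOFCOINCIDENCE"))  # equals 34/420
```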

37

• Now, if we suppose that n is very big (e.g., we take all
words in the English dictionary), then we can further
approximate the formula:
$$I_c(x) = \frac{\sum_{i=0}^{25} \binom{f_i}{2}}{\binom{n}{2}} = \frac{\sum_{i=0}^{25} f_i (f_i - 1)}{n(n-1)} \approx \frac{\sum_{i=0}^{25} f_i^2}{n^2} = \sum_{i=0}^{25} p_i^2$$
where the p_i = f_i / n are the real frequencies of letters
in English (see Table).
THIS IS AN APPROXIMATION IF n is VERY BIG

38

Example: IC of a Language
• For English, pi can be estimated as follows
Letter  pi      Letter  pi      Letter  pi      Letter  pi
A       0.082   H       0.061   O       0.075   V       0.010
B       0.015   I       0.070   P       0.019   W       0.023
C       0.028   J       0.002   Q       0.001   X       0.001
D       0.043   K       0.008   R       0.060   Y       0.020
E       0.127   L       0.040   S       0.063   Z       0.001
F       0.022   M       0.024   T       0.091
G       0.020   N       0.067   U       0.028

$$I_c(x) \approx \sum_{i=0}^{25} p_i^2 = 0.065$$
39
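
The value 0.065 can be reproduced from the table. The Python sketch below copies the table values, taking G as 0.020 (an assumption, chosen so that the probabilities sum to roughly 1), and compares the result with the value 1/26 for uniformly random letters.

```python
# Estimated English letter probabilities p_0..p_25 for A..Z (G assumed 0.020).
ENGLISH_FREQ = {
    "A": 0.082, "B": 0.015, "C": 0.028, "D": 0.043, "E": 0.127, "F": 0.022,
    "G": 0.020, "H": 0.061, "I": 0.070, "J": 0.002, "K": 0.008, "L": 0.040,
    "M": 0.024, "N": 0.067, "O": 0.075, "P": 0.019, "Q": 0.001, "R": 0.060,
    "S": 0.063, "T": 0.091, "U": 0.028, "V": 0.010, "W": 0.023, "X": 0.001,
    "Y": 0.020, "Z": 0.001,
}

ic_english = sum(p * p for p in ENGLISH_FREQ.values())  # sum of p_i^2
ic_random = 26 * (1 / 26) ** 2                          # every p_i = 1/26

print(round(ic_english, 3), round(ic_random, 4))
```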

IC of a ciphertext
• Now, the same reasoning applies if x is a
ciphertext obtained by means of any
monoalphabetic cipher. In this case, the
individual probabilities will be permuted, BUT
the quantity
$$I_c(x) \approx \sum_{i=0}^{25} p_i^2 = 0.065$$

will be unchanged!
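
This invariance is easy to check numerically: a monoalphabetic cipher only relabels the letters, so the multiset of frequencies is the same. A small Python sketch using a shift cipher as the monoalphabetic example (helper names are hypothetical):

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """I_c(x) = sum_i f_i (f_i - 1) / (n (n - 1))."""
    n = len(text)
    return sum(f * (f - 1) for f in Counter(text).values()) / (n * (n - 1))


def shift_encrypt(text: str, key: int) -> str:
    """Shift cipher on uppercase letters, used here as a monoalphabetic cipher."""
    return "".join(chr((ord(ch) - 65 + key) % 26 + 65) for ch in text)


plain = "THEINDEXOFCOINCIDENCE"
cipher = shift_encrypt(plain, 11)
print(index_of_coincidence(plain) == index_of_coincidence(cipher))  # True
```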

40

Find the Key Length
• For Vigenère, as the length of the keyword
increases, the letter frequency shows less
English-like characteristics and becomes more
random.
• Two methods to find the key length:
– Kasiski test
– Index of coincidence (Friedman)

41

Finding the Key Length
• Suppose we start with a ciphertext q = q1q2…qn
• Define m substrings y1… ym as follows
$$
\begin{aligned}
y_1 &= q_1\, q_{m+1}\, \cdots\, q_{n-m+1} \\
y_2 &= q_2\, q_{m+2}\, \cdots\, q_{n-m+2} \\
    &\;\;\vdots \\
y_m &= q_m\, q_{2m}\, \cdots\, q_n
\end{aligned}
$$

42

Finding the Key Length
• In our previous example (ciphertext NLAZEIIBLJJI), supposing we already
guessed m=4, with n=12:
y1 = NEL
y2 = LIJ
y3 = AIJ
y4 = ZBI
(the corresponding plaintext substrings are CTA, ROP, YGH, PRY)
43

Guessing the Key Length
• If this is done, and m is indeed the key length,
then each Ic(yi) should be roughly equal to 0.065
(i.e., it will “look like” English text)
$$I_c(y_i) \approx \sum_{j=0}^{25} p_j^2 = 0.065 \qquad \forall\, 1 \le i \le m$$
• If m is not the key length, the text will look
much more random, since it is obtained by shift
encryption with different keys. Observe that a
completely random string will have:
$$I_c(x) \approx \sum_{i=0}^{25} \left(\frac{1}{26}\right)^2 = 26 \cdot \frac{1}{26^2} = \frac{1}{26} = 0.0385$$
44
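
The test can be sketched in Python as follows. This is an illustration under stated assumptions: the helper names are hypothetical, the ciphertext contains only uppercase letters, and the decision rule (pick the m whose average index is closest to 0.065) is one simple way to apply the criterion, not the only one.

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """I_c of a string; defined as 0.0 for strings shorter than 2 letters."""
    n = len(text)
    if n < 2:
        return 0.0
    return sum(f * (f - 1) for f in Counter(text).values()) / (n * (n - 1))


def split_columns(ciphertext: str, m: int):
    """y_i collects every m-th letter starting at position i."""
    return [ciphertext[i::m] for i in range(m)]


def average_ic(ciphertext: str, m: int) -> float:
    """Mean of I_c(y_1), ..., I_c(y_m) for a guessed key length m."""
    columns = split_columns(ciphertext, m)
    return sum(index_of_coincidence(c) for c in columns) / m


print(split_columns("NLAZEIIBLJJI", 4))  # ['NEL', 'LIJ', 'AIJ', 'ZBI']
```

For a long ciphertext, one would evaluate `average_ic(ciphertext, m)` for m = 1, 2, 3, … and look for values near 0.065 rather than near 0.0385; a 12-letter example is far too short for the statistic to be meaningful.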

Guessing the Key Length
• For French language, the index of coincidence
is approximately 0.0778
• The values 0.065 (or 0.0778 for French) and
0.0385 are sufficiently far apart that we will
often be able to determine the correct
keyword length (or confirm a guess that has
already been made using the Kasiski test)

45

Finding the Key, if Key Length Known
• Consider the substrings yi, and look for the most
frequent letter in each
• Check that mapping that letter to e does not result
in unlikely mappings for the other letters
• Use mutual index of coincidence between two
strings
– To determine relative shifts, and hence the key
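
One way to automate the per-substring frequency check is sketched below in Python. Note the swap: instead of the mutual index of coincidence between two strings, this sketch scores each candidate shift g of a substring by Σ p_i · f_{i+g} against the English table and keeps the best g. The names `best_shift` and `recover_key` are hypothetical, and the table uses G = 0.020 (an assumption).

```python
from collections import Counter

# English letter probabilities p_0..p_25 for A..Z (G assumed to be 0.020).
ENGLISH_FREQ = [0.082, 0.015, 0.028, 0.043, 0.127, 0.022, 0.020, 0.061, 0.070,
                0.002, 0.008, 0.040, 0.024, 0.067, 0.075, 0.019, 0.001, 0.060,
                0.063, 0.091, 0.028, 0.010, 0.023, 0.001, 0.020, 0.001]


def best_shift(column: str) -> int:
    """Shift g maximizing sum_i p_i * f_{(i+g) mod 26} for one substring y_i."""
    counts = Counter(ord(ch) - 65 for ch in column)

    def score(g: int) -> float:
        return sum(ENGLISH_FREQ[i] * counts[(i + g) % 26] for i in range(26))

    return max(range(26), key=score)


def recover_key(ciphertext: str, m: int) -> str:
    """Guess one key letter per substring, given the key length m."""
    return "".join(chr(best_shift(ciphertext[i::m]) + 65) for i in range(m))
```

The guess is only reliable when each substring is long enough for its letter counts to resemble the English table; on short texts the result should be checked by decrypting and reading the output.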

46

Summary
• Vigenère cipher is vulnerable:
once the key length is found, a cryptanalyst
can apply frequency analysis.

47

The Hill Cipher
• Use linear equations
– each output bit (ciphertext, C) is a linear
combination of the input bits (plaintext message, M)
– the key k is a matrix
• C = kM
• M = k⁻¹C
– known as the Hill cipher
– easily breakable by known-plaintext attack

48

The Hill Cipher
• It’s another polyalphabetic cryptosystem,
invented in 1929 by Lester S. Hill.
• Let m be a positive integer (we will see an
example with m=2), and define P = C = (Z26)^m

• The idea is to take m linear combinations of
the m alphabetic characters in one plaintext
element, thus producing the m alphabetic
characters in one ciphertext element.

49

The Hill Cipher
• Example with m=2
• We can write a plaintext element as x=(x1,x2)
and a ciphertext element as y=(y1,y2).
• Here y1 would be a linear combination of x1
and x2, as would be y2
• We might take
y1 = 11x1 + 3x2
y2 = 8x1 + 7x2
(all computed mod 26)

50

The Hill Cipher
• We might take
y1 = 11x1 + 3x2
y2 = 8x1 + 7x2
• Of course, this can be written more succinctly in
matrix notation as follows:
$$(y_1, y_2) = (x_1, x_2) \begin{pmatrix} 11 & 8 \\ 3 & 7 \end{pmatrix}$$
• In general, we will take an m x m matrix K as our
key. We will write y=xK
• The ciphertext is obtained from the plaintext by
means of a linear transformation.

51

The Hill Cipher (Decryption)
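• To decrypt, we multiply both sides by the
inverse of K, K⁻¹:
– yK⁻¹ = xKK⁻¹
– hence x = yK⁻¹
• Does K⁻¹ always exist? Of course not!
• By definition, the inverse matrix to an m x m matrix K
(if it exists) is the matrix K⁻¹ such that K K⁻¹ = Im,
the m x m identity matrix
• For example:
$$I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
• We can verify that the encryption matrix above has an
inverse modulo 26:
$$\begin{pmatrix} 11 & 8 \\ 3 & 7 \end{pmatrix}^{-1} = \begin{pmatrix} 7 & 18 \\ 23 & 11 \end{pmatrix}$$
$$\begin{pmatrix} 11 & 8 \\ 3 & 7 \end{pmatrix} \begin{pmatrix} 7 & 18 \\ 23 & 11 \end{pmatrix} = \begin{pmatrix} 261 & 286 \\ 182 & 131 \end{pmatrix} \equiv \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \pmod{26}$$
52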

The Hill Cipher (example)
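• The key is $K = \begin{pmatrix} 11 & 8 \\ 3 & 7 \end{pmatrix}$
• From the computation above, $K^{-1} = \begin{pmatrix} 7 & 18 \\ 23 & 11 \end{pmatrix}$
• We want to encrypt the plaintext july
– Hence we have two elements of plaintext to
encrypt: (9,20), corresponding to ju, and (11,24),
corresponding to ly
• We compute as follows:
$$(9, 20) \begin{pmatrix} 11 & 8 \\ 3 & 7 \end{pmatrix} = (99 + 60,\ 72 + 140) \equiv (3, 4) \pmod{26} \quad\Rightarrow\quad \text{DE}$$
$$(11, 24) \begin{pmatrix} 11 & 8 \\ 3 & 7 \end{pmatrix} = (121 + 72,\ 88 + 168) \equiv (11, 22) \pmod{26} \quad\Rightarrow\quad \text{LW}$$
• The ciphertext is DELW
53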

The Hill Cipher (example)
• Verify that DELW decrypts to july using the
matrix $K^{-1} = \begin{pmatrix} 7 & 18 \\ 23 & 11 \end{pmatrix}$
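
The verification can be done by hand or with a short Python sketch like the one below. It is an illustration for m = 2 only; `hill_apply` is a hypothetical name, lowercase input of even length is assumed, and the row-vector convention y = xK is used. Encryption and decryption are the same routine applied with K and K⁻¹ respectively.

```python
def hill_apply(text: str, matrix) -> str:
    """Multiply each letter pair (x1, x2), as a row vector, by a 2x2 matrix mod 26."""
    (a, b), (c, d) = matrix
    out = []
    for i in range(0, len(text), 2):
        x1, x2 = ord(text[i]) - 97, ord(text[i + 1]) - 97
        out.append(chr((x1 * a + x2 * c) % 26 + 97))  # y1 = a*x1 + c*x2
        out.append(chr((x1 * b + x2 * d) % 26 + 97))  # y2 = b*x1 + d*x2
    return "".join(out)


K = ((11, 8), (3, 7))
K_INV = ((7, 18), (23, 11))

print(hill_apply("july", K))      # delw
print(hill_apply("delw", K_INV))  # july
```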

54

The Hill Cipher
• Now the question is: when is K invertible?
• The invertibility of a matrix depends on the value
of its determinant (for m=2, det K = k11k22 − k12k21)
• We know that a real matrix K has an inverse if
and only if its determinant is non-zero
• However, it is important to remember that we are
working over Z26
• The relevant result for our purposes is that a
matrix K has an inverse modulo 26 if and only if
gcd(det K, 26)=1
– In our example, det K = 11·7 − 8·3 = 53 ≡ 1 (mod 26) and gcd(1,26)=1

55

The Hill Cipher
• How to compute K-1 (when it exists, of
course)?
• Recall that det K =k11k22-k12k21
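• It can be shown that:
$$K^{-1} = (\det K)^{-1} \begin{pmatrix} k_{22} & -k_{12} \\ -k_{21} & k_{11} \end{pmatrix}$$

• In our example
– det K = 53 mod 26 = 1
– Now, 1⁻¹ mod 26 = 1
– Hence
$$K^{-1} = 1 \cdot \begin{pmatrix} 7 & -8 \\ -3 & 11 \end{pmatrix} \bmod 26 = \begin{pmatrix} 7 & 18 \\ 23 & 11 \end{pmatrix}$$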
56
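
The 2 x 2 inverse formula can be sketched in Python as follows. The function name `hill_inverse_2x2` is hypothetical; `pow(det, -1, modulus)` computes the modular inverse and needs Python 3.8 or later.

```python
from math import gcd


def hill_inverse_2x2(matrix, modulus: int = 26):
    """K^{-1} = (det K)^{-1} * [[k22, -k12], [-k21, k11]] mod modulus."""
    (k11, k12), (k21, k22) = matrix
    det = (k11 * k22 - k12 * k21) % modulus
    if gcd(det, modulus) != 1:
        raise ValueError("matrix is not invertible modulo %d" % modulus)
    det_inv = pow(det, -1, modulus)  # (det K)^{-1} mod modulus
    return (
        ((det_inv * k22) % modulus, (-det_inv * k12) % modulus),
        ((-det_inv * k21) % modulus, (det_inv * k11) % modulus),
    )


print(hill_inverse_2x2(((11, 8), (3, 7))))  # ((7, 18), (23, 11))
```

A matrix whose determinant shares a factor with 26 (for example, an even determinant) is rejected, matching the condition gcd(det K, 26) = 1.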
