# CS387 - Unit 1: Symmetric Cryptography and Applications

Contents

## Introduction

Cryptography is a branch of mathematics and computer science that uses secrets to solve problems! In this class you will learn the foundations of cryptography as well as how to use cryptography to solve problems in computing; how to send messages securely, how to manage accounts on websites as well as how to do things like perform computation where you can keep your data secret while still getting the results of a function that depends on your data and some other data.

The word "cryptography" has two parts: Crypto, comes from the Greek word "secret," - to hide and the second part "graphy" means "writing" - the same "graphy" that appears in the words photography or telegraphy.

In this course you will be studying, not just secret writing, but everything about secrets. Another name for this course could be cryptology - the science of secrets!

### Quiz: Introduction

Which of these involve cryptology?

• a. Opening a door
• b. Playing poker
• c. Logging into your udacity.com account
• d. Doing a search using google.com

## Course Overview

Symmetric cryptography means that the two parties have the same key. For example, given two individuals, Alice and Bob, who want to know that nobody is eavesdropping. In symmetric cryptography, you assume that they both start with the same key that they can use for both encryption and decryption.

In future units you will learn about asymmetric cryptography and its applications. Asymmetric cryptography is also known as "public key cryptography." The difference between symmetric and asymmetric cryptography is that in asymmetric cryptography the key used to encrypt is different from the key used to decrypt. If the keys are different and unrelated, that means you can reveal one of the keys without revealing the other key.

In the final units of this course you will see how you can use both symmetric and asymmetric cryptography to solve problems. Most protocols involve symmetric and asymmetric cryptography because asymmetric is very expensive, it requires lots of keys, mathematics and computation, where symmetric cryptography can be very efficient.

## Don't Implement your own Crypto

Implementing your own cryptography is very challenging, making it very easy to do incorrectly.

Computer programs can be thought of as black boxes, with inputs and outputs. Encryption functions will usually take a key and a message to produce some ciphertext. The implementations that you will see in this class will be functionally correct, they will produce the correct outputs. They will not be secure because programs are not really black boxes when you use them. There are a lot of other things that someone could observe about this; for example, timing side channel - how long it takes.

When you can observe properties of a function, other than their functional properties - their inputs and outputs, it is called a '''side-channel'''. Timing is an example of a side-channel. Timing is the time it takes to run a function, which could vary, it could depend on the message or the key.

Most of the code that you will write in this class will have that property. You will not be adding complexity to the code to worry about things like side-channels, but in terms of implementing things correctly, side-channels are really important.

There are lots of ways that code could have side-channels. Another example, is that side-channels could effect what is in the cache of the processor and that could be visible in some way. A side-channel could also affect how much power your processor uses. People have shown ways to break smart cards by measuring the power consumption as they do encryption and that gives you some insight into what the key is.

These are the kinds of things that make implementing your own cryptography difficult. If you were building cryptography for any important use you would have to be mindful of these types of challenges, but for this class you will not need to worry about it for most of your code. Therefore, you should not view your implementations as secure implementations. The only reason to implement your own crypto, is for fun and learning.

If you actually want to use cryptography to protect something important, you should use a library implementation that has been carefully vetted and written in a way that takes into consideration the issues discussed earlier. The library implementation you use should also have been looked over by others to ensure a high level of confidence that it is correct and secure.

### Quiz: Don't Implement your own Crypto

Should you use any code from this course to protect nuclear launch codes?

## Symmetric Cryptosystems

Symmetric means both encryption and decryption are done with the same key. Here are some terms that you will see throughout the course:

1. Encryption - think of encryption as a function that takes plaintext (the unencrypted message) as input and it outputs a ciphertext. The goal is to be able to send that ciphertext over an insecure channel, which may be a wireless channel, the internet or a courier - any channel you cannot trust to be secure. Hopefully, what comes out of the channel is the same ciphertext that was put into it, that goes into a decryption function.

2. Decryption - think of decryption as a function that takes the ciphertext, and what comes out, is the message, in plaintext, that you sent in.

The cast of characters who play out this process are Alice, who sends the message, Bob who receives it, and possibly a malicious character, Eve, who might be listening in on the channel. Eve, the eavesdropper can only hear what is sent over the insecure channel, she does not have access to the plaintext that goes into the encryption function, or the plaintext that comes out of the decryption function.

plaintext is some message that is selected from a set of messages, m\in M

M is the set of all possible messages, which could be finite or infinite. For a given length, M is finite. m is some message selected from the set.

ciphertext is ciphertext, c selected from all possible ciphertexts, C: c\in C

Encryption, E goes from an element of M to an element of C: E:M \to C

Decryption, D goes from an element of C to an element of M: D: C \to M

In order for Bob to receive the same message as the one Alice encrypted, you need the property that the D function returns the input of the E function:

\forall m \in M: D(E(m))=m

## Keys and Kerckhoff's Principle

One very important principle in cryptography is Kerckhoff's Principle, which was documented in his book in (1883). Kerckhoff stated,

Faut qu'il n'exige pas le secret, et qu'il puisse sans inconvenient tomber entre les mains de l'ennemi.

This loosely translates to, "the cipher must not depend on the secrecy of the mechanism, it must not matter if it falls into the hands of the enemy."

Kerckhoff's statement focuses on the mechanism, because coming up with new functions, or mechanisms, in Kerckhoff's day was a lot of work - you do not want to have to keep inventing new encryption and decryption functions. Rather, you want your cipher to still be secure even if your encryption and decryption functions are known. The way to achieve this is by having a key. Therefore, instead of just having the message as the input to the encryption function, you also input a key.

If it is a symmetric system, that key is the same key that is needed in the decryption function as well. If the security relies only on keeping the key secret, then you can make your encryption and decryption functions public, analyze them, put a lot of work into developing good encryption and decryption functions. If you think your key has been exposed, then you just need to come up with a new key and keep using the same functions. It is more work to re-create the encryption and decryption functions than it is to change the key. If the function turns out to have a weakness, that is a much more serious problem because then you would have to legitimize new encryption and decryption functions.

### Quiz: What parts of a cryptosystem must be kept secret?

• Alice
• Encryption Algorithm
• Decryption Algorithm
• Keys
• Ciphertext

## Correctness and Security

With the addition of the key, you are now working with three main elements m \in M, c \in C, k \in K

You want your encryption function to take the message and a key and map that to a cipher text. Your decryption function will take a ciphertext and a key and map that to a message:

E: M \times K \to C \\ D: C \times K \to M

Correctness property: To be correct you need to obtain the same message after decryption. You need to know that for all messages and keys, you have the property that decrypting, using that key (the subscript indicates that there are two inputs for the decryption function; one is the key and one is the input ciphertext). The ciphertext is the result of encrypting using the key and the message. This can be written as:

• \forall m\in M, \forall k\in K: D_k(E_k(m)) = m

You know the result because you want the result to be the same message that went into the encryption function to come out of the decryption function.

In addition to obtaining a correct message result, you also want your encryption and decryption functions to be secure.

Security property: The ideal condition for secure encryption and decryption functions is one where the ciphertext reveals nothing about the key or the message.

### Quiz: Correctness and Security

\begin{aligned} E_k(m) &= m+k & D_k(c) &= c-k \\ E_k(m) &= m & D_k(c) &= c \\ E_k(m) &= m\%k & D_k(c) &= c*k \end{aligned}

Which of these functions satisfy the correctness property for a symmetric cipher? Each choice is a pair of functions and the message and the keys are natural numbers:

• M = {1, 2, 3, ...}
• K = {1, 2, 3, ...}

## XOR Function

The One-time Pad (OTP) is the only perfect cipher because it reveals no information about the message or the key. It can be impractical because it can require very large keys, depending on the size of the message.

In order to understand the OTP it is necessary to understand the Exclusive-Or function, also referred to as XOR, or symbolically by \oplus.

The truth table for the XOR function is as follows:

 A B A\oplusB 0 0 0 1 0 1 0 1 1 1 1 0

### Quiz: What do you think the value of X \oplus Y \oplus X is?

• 0
• X
• Y
• It depends on X

Note that X \oplus X = 0 and that XOR is both associative and commutative (i.e. X \oplus Y \oplus X = X \oplus X \oplus Y = Y).

The functionality of the OTP follows immediately from the definition of XOR:

Where C = Cipher text, K = Key and M = Message

C = K \oplus M (encryption) and C \oplus K = M (decryption)

The set of all possible messages with length n\in\mathbb{N} is defined as M:=\{0,1\}^n Therefore a message m\in M is represented as a string of 0's and 1's of length n. The set of all possible keys with length n\in\mathbb{N} is defines as K:=\{0,1\}^n Therefore a key k\in K is represented as a string of 0's and 1's of length n. To do encryption, the encryption function is defined as follows: E:M\times K\to C:m\mapsto E_{k}(m)=c where

• m=m_{0}m_{1}m_{2}\ldots m_{n-1} is a sequence of bits of length n,
• k=k_{0}k_{1}k_{2}\ldots k_{n-1} a sequence of bits of length n and
• c=c_{0}c_{1}c_{2}\ldots c_{n-1} is the ciphertext(encryption of m) of length n

The value of each chipertext bit is defined \forall i\in\{0,1,2,\ldots,n-1\}: c_{i}=m_{i}\oplus k_{i}

Example:

Let be m='CS'. Using the following two procedures to get the desired binary message (only 0's and 1's and every character has a length of 7):

def convert_to_bits(n,pad):

result = [ ]
while n > 0:
if n % 2 == 0:
result = [0] + result
else:
result = [1] + result
n = n // 2

result = [0] + result
return result


which turns a single character into a sequence of bits of length pad and

def string_to_bits(s):
result = [ ]
for c in s:
result = result + convert_to_bits(ord(c),7)
return result


which turns a string into a sequence of bits (every character is represented as a sequence of 7 bits).

Therefore our message has a length of 14 and is:

string_to_bits('CS') \to [1,0,1,0,0,1,1,1,0,0,0,0,1,1]

Let the random choosen key be k=11001000100110. (The key must have the same length of the message)

To get the ciphertext simply xor'ing the corresponding entries from m and k:

• c_{0}=m_{0}\oplus k_{0}=1\oplus 1 = 0
• c_{1}=m_{1}\oplus k_{1}=0 \oplus 1 = 1
• \vdots
• c_{n-1}=c_{13}=m_{13}\oplus k_{13}=1\oplus 0 = 1

Doing this we get: c=01101111100101

### Quiz: What guessed k value would make this ciphertext decrypt to 'BS'

To make this ciphertext do decrypt as 'BS', we have to change the first letter 'C' to one bit below that means:

• m='BS'=10100101000011

That means we need to flip on bit in the key (as the ciphertext has to be the same):

• k=11001010100110

The rest(ciphertext) stays the same. So if you guessed this key instead of the correct key, we get what looks like a fairly reasonable message ought but it would be one off from the one that was there.

Why the One-Time Pad provides perfect secrecy:

In fact, we could get any possible message we want by guessing different keys. That means also for any given ciphertext we can produce any message we want by picking different keys. That means that if we just have the ciphertext, we haven't learned anything at all about the message.

some historical infomations followed

## Proving Security

### Quiz: Order these arguments by how effective they are in convincing someone a cipher is ''secure''

1. Here's a mathematical proof, accepted by experts, that shows the cipher is secure
2. Here's a strong argument why breaking the cipher is at least as hard as some problems we believe is hard
3. Many very smart, highly motivated people tried to break it but couldn't
4. There are 834 quadrillion possible keys, so it must be secure.

## Probability Review pt. 1

Let \Omega be the set of all possible outcomes (called the probability space).

For example, in the case of flipping a coin, \Omega=\{H,T\}.  If our probability space has a uniform distribution that means each outcome is equally likely.

We can define a function P:\mathrm{outcome} \mapsto [0,1] for any outcome in the probability space, where 0 means the outcome never happens, and 1 means it always happens.  When flipping a fair coin, we have a uniform distribution so P(H) = \frac{1}{2}, P(T) = \frac{1}{2}. Notice that the sum of P values for all outcomes must equal 1, i.e. \sum \limits_{w \in \Omega} P(w) = 1.

As another example consider a coin that can land on its edge.  Then \Omega=\{H,T,E\}.  Assume that P(H) = P(T) = 0.49999

### Quiz: What is the probability of E?

Answer: The probabilities of all outcomes must equal 1, so the answer is 1 - 0.49999 - 0.49999 = 0.00002.

## Probability Review pt. 2

An event is a subset of outcomes from a probability space.

For example, the event \mathrm{Heads} = \{H\}, and the event \mathrm{Valid} = \{H,T\}.

The probability for an event is just the sum of the probabilities of its outcomes, that is P(\mathcal{A}) = \sum\limits_{w \in \mathcal{A}} P(w)

### Quiz: What is P(\mathrm{Valid})?

Assume P(H) = P(T) = 0.49999.

Answer: 0.99998 (=P(H) + P(T) )

Complementary Event Probability:

\forall \mathcal{A}: P(\mathcal{A}) + P(\bar{\mathcal{A}}) = 1 where \mathcal{A} is an event, and \bar{\mathcal{A}} is the complement event, i.e. \bar{\mathcal{A}}=\Omega \setminus \mathcal{A}.

## Probability Review pt. 3

Conditional Probability

Definition: Given two events, \mathcal{A} and \mathcal{B}, in the same probability space, the conditional probability of \mathcal{B} given that \mathcal{A} occurred is P(\mathcal{B}|\mathcal{A}) = \frac{P(\mathcal{A} \cap \mathcal{B})}{P(\mathcal{A})}

### Quiz: Given that a coin toss is Valid, what is the probability that it is Heads?

Assume P(H) = P(T) = 0.49999, P(E)=0.00002, and \mathrm{Valid} = \{H,T\}.

Answer: \begin{eqnarray*} P(\mathrm{Heads}|\mathrm{Valid}) &=& \frac{P(\mathrm{Heads} \cap \mathrm{Valid})}{P(\mathrm{Valid})} \\ &=& \frac{P(\{H\} \cap \{H,T\})}{P(\{H,T\})} \\ &=& \frac{P(\{H\})}{P(\{H,T\})} \\ &=& \frac{0.49999}{0.49999+0.49999} \\ &=& 0.5 \\ \end{eqnarray*}

## Perfect Cipher

For a perfect cipher, the ciphertext provides an attacker with no additional information about the plaintext.

### Quiz: which of these is the property we want for a perfect cipher?

Let m, m^* \in \mathcal{M}, k \in \mathcal{K} where \mathcal{M} is the set of all messages, \mathcal{K} is the set of all keys, and let c be the ciphertext.

• P(m=m^* | E_k(m)=c) = \frac{1}{|\mathcal{M}|}
• P(m=m^* | E_k(m)=c) = P(m=m^*)
• P(m=m^*) = P(E_k(m)=E_k(m^*))
• P(m=m^* | E_k(m)=c) = \frac{1}{|\mathcal{K}|}

Answer: P(m=m^* | E_k(m)=c) = P(m=m^*).

## OTP is a Perfect Cipher pt. 1

For a perfect cipher:

P(m=m^* | E_k(m)=c) = P(m=m^*)

### Quiz: Given any message m and ciphertext c, how many one-time pad keys k encrypt E_k(m)=c?

Answer: 1.  For any message - ciphertext pair, there's always one key that maps that message to the ciphertext.

So \begin{eqnarray*} P(E_k(m)=c) &= &\sum\limits_{m_i \in \mathcal{M}} \sum\limits_{k_i \in \mathcal{K}} \frac{P(E_{k_i}(m_i)=c)}{|\mathcal{M}| \cdot |\mathcal{K}|} \\ &=&\frac{|\mathcal{M}|}{|\mathcal{M}| \cdot |\mathcal{K}|} \\ &=&\frac{1}{|\mathcal{K}|} \end{eqnarray*}

We know \sum\limits_{k_i \in \mathcal{K}} P(E_{k_i}(m_i)=c) = 1 in the double sum above because only one key maps the message to the ciphertext.

## OTP is a Perfect Cipher pt. 2

### Quiz: What is P(m=m^* \cap E_k(m)=c)?

The probability space consists of pairs (m, k), i.e. of messages and keys.  For some of those pairs E_k(m)=c (for each message there is only one key such that this is true).

We don't make any assumptions about the probability distribution of the messages, but we do assume the keys are chosen uniformly randomly.

## OTP is a Perfect Cipher pt. 3

P(m=m^* \cap E_k(m)=c) = \frac{P(m=m^*)}{|\mathcal{K}|}

So \begin{eqnarray*} P(m=m^* | E_k(m)=c) &=& \frac{P(m=m^*)}{|\mathcal{K}|} / \frac{1}{|\mathcal{K}|} \\ &=&P(m=m^*)\\ \end{eqnarray*}

Hence OTP is a perfect cipher.

Why are we not done?

• OTP is malleable (an active attacker could change the ciphertext in transit so that Bob would decrypt the wrong message).
• OTP is impractical.  The keys have to be as long as the messages, and we can never reuse the key, so |\mathcal{K}| = |\mathcal{M}|.

## Perfect Cipher is Impractical

Shannon's Theorem: If a cipher is perfect, it must be impractical: |\mathcal{K}| \ge |\mathcal{M}|

Suppose E is a perfect cipher where |\mathcal{M}| \gt |\mathcal{K}|.  Let c_0 \in \mathcal{C} with P(E_k(m)=c_0) \gt 0 .  Then we can decrypt c_0 with all keys k \in \mathcal{K} using a decryption function D so that D_k(E_k(m)) = m. Then let \mathcal{M}_0 = \bigcup \limits_{k \in \mathcal{K}} D_k(c_0).

### Quiz: Which of these must be true?

1. |\mathcal{M}_0| \lt |\mathcal{M}|
2. There exists some m^* \in \mathcal{M} where m^* \notin \mathcal{M}_0.

From the assumption that |\mathcal{M}| \gt |\mathcal{K}|, we can infer that |\mathcal{M}_0| \le |\mathcal{K}|.  Remember that for a perfect cipher P(m=m^* | E_k(m)=c) = P(m=m^*) and we have E_k(m) = c_0, and because there exists some m^* \in \mathcal{M} where m^* \notin \mathcal{M}_0, P(m=m^*)=0 \neq P(m=m^*) \gt 0 which is a contradiction.  Thus, there exists no perfect cipher where |\mathcal{M}| \gt |\mathcal{K}|.