ORAM: A Brief Overview

+
Oblivious RAM
ORAM
A Brief Overview
Dibyendu Nath, UC Santa Barbara

+
Outline
 What’s ORAM ?
 Motivation
 History
 More on ORAM
 Goldreich’s ORAM
 Ostrovsky’s ORAM
 Recent Developments
 Conclusion

+
What is ORAM ?
 Oblivious Random Access Machine [ORAM]
 A machine is oblivious if the sequence in which it accesses memory locations is
equivalent for any two inputs with the same running time.
 E.g. For a client-server model with outsourced data storage in server, the server
cannot gain info about actual memory access pattern from client requests
 Main Idea:
Memory Access Pattern is hidden from adversary.
Caveat: Assume data content is not leaked as it is protected by traditional
cryptographic methods (encryption)
[Figure: Client sends read(i) and write(i, d) requests to the Server]

+
Motivation
 Rise of Cloud Computing
 Success of Pay-as-you-Go model
for Public Clouds
 Affordability, Elasticity,
SLA Guarantees
 Lots of “Big” Data
 Solution: Outsourced Data Storage
 Accessible from many devices (desktop,
laptop, phone, car, ...).
 Greater reliability.
 May be cheaper (economy of scale).
 Security Concerns behind
outsourcing data to Public Clouds

+
Motivation
 Possible Solutions
 Duh! Use encryption
 Use “trusted” public cloud services
 C’mon, Google is “not evil”
 Use private cloud infrastructure
 E.g. Walmart uses OpenStack
 Is there a way to hide how we access our data but still use
public cloud infrastructure ?
 Answer: ORAM

+
Motivation: Example Attack
[Figure: an App sends read(i) and write(i, d) requests to Cloud Storage with encrypted content; labels: Genuine Client Request, Simulated Client Request from “trusted” Cloud provider]

+
Motivation: Goal
 “Untrusted” means:
 It may not implement Write/Read properly
 It will try to learn about data
Goal: Securely Outsourcing Data - Store,
access, and update data on an untrusted
server.
[Figure: Client sends read(i) and write(i, d) requests to a Remote Server storing M[0]=d1, M[1]=d2, M[2]=d3, ..., M[n]=dn]
[Islam et al, ‘12]

+
Motivation: Solution
 An ORAM emulator is an intermediate layer
that protects any client (i.e. program).
 ORAM will issue operations that deviate from
actual client requests.
 Correctness:
 If the server is honest, then the input/output behavior is the same for the client.
 Security:
 Server cannot distinguish between two clients with same running time.

+
History of ORAM
 Pippenger and Fischer showed “oblivious Turing
machines” could simulate general Turing machines
 Goldreich introduced analogous notion of ORAMs in ’87
and gave first interesting construction
 Ostrovsky gave a more efficient construction in ’90
 ... 20 years pass, time sharing systems become “clouds”
...
 Then a flurry of papers improving efficiency: ~10 since
2010

+
ORAM Assumptions
 Assumption #1: Server does not see data.
 Store an encryption key on emulator and re-encrypt on every
read/write.
 Assumption #2: Server does not see op (read vs write).
 Every op is replaced with both a read and a write.
 Assumption #3: Server is honest-but-curious.
 Store a MAC key on emulator and sign (address, time, data) on
each operation.
 Not malicious: the server follows the protocol but tries to learn from what it observes.

+
ORAM Security after Simplification
 What’s left to protect is the “access pattern” of the program.
 Definition: The access pattern generated by a sequence (i_1,
..., i_n) with the ORAM emulator is the random variable (j_1, ... ,
j_T) sampled while running with an honest server.
 Definition: An ORAM emulator is secure if for every pair of
sequences of the same length, their access patterns are
indistinguishable.

+
Trivial ORAMs
 Example #1: Store everything in ORAM simulator cache
and simulate with no calls to server.
 Client storage = N.
 Example #2: Store memory on server, but scan entire
memory on every operation.
 Amortized and worst-case communication overhead = N.
 Example #3: Assume client accesses each memory slot
at most once, and then permute addresses using a PRP.
 Essentially optimal, but assumption does not hold in practice.
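
Example #2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a secure implementation: the class name `LinearScanORAM` and the `trace` list are made up for the example, and the encryption of slot contents is left out.

```python
class LinearScanORAM:
    """Trivial ORAM (Example #2): touch every server slot on every operation."""

    def __init__(self, n):
        self.server = [None] * n   # stands in for the remote memory
        self.trace = []            # what the server observes: (op, index)

    def access(self, op, addr, data=None):
        """Perform op ("read" or "write") and return the value held before it."""
        result = None
        for j in range(len(self.server)):
            value = self.server[j]              # read slot j
            self.trace.append(("read", j))
            if j == addr:
                result = value
                if op == "write":
                    value = data
            self.server[j] = value              # always write slot j back
            self.trace.append(("write", j))
        return result
```

Every operation produces the same 2N-entry trace regardless of the address or the operation type, which is why the overhead is N.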

+
ORAM Lower Bounds
 Theorem (GO’90):
Any ORAM emulator must perform Ω(t log t) operations to
simulate t operations.
 Proved via a combinatorial argument.
 Theorem (BM’10):
Any ORAM emulator must either perform Ω(t log t log log t)
operations to simulate t operations or use storage
Ω(N^(2-o(1))) (on the server).

+
ORAM Efficiency Goals
 In order to be interesting, an ORAM must
simultaneously provide
 o(N) client storage
 o(N) amortized overhead
 Handling of repeated access to addresses.
 Desirable features for an “optimal ORAM”:
 O(log N) worst-case overhead
 O(1) client storage between operations
 O(1) client memory usage during operations
 “Stateless client”: Allows several clients who share a short key
to obliviously access data w/o communicating amongst
themselves between queries. Requires op counters.

+
Basic Tools: Shuffling
Claim: Given any permutation π on {1 , ... , N}, we can permute the
data according to π with a sequence of ops that does not depend on
the data or π.
 This means we move data at address i to address π(i).
 Proof idea:
Use an oblivious sorting algorithm.
For each comparison in the sort, read both positions and rewrite
them, either swapping the data or not (depending on whether π(i) >
π(j)).
Key Idea: Use sorting networks like Batcher or AKS, where
memory accesses are independent of input
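
The proof idea can be sketched as follows. For brevity this sketch swaps in odd-even transposition sort, a simpler (and slower, O(N^2)) sorting network than Batcher or AKS. The function names, 0-based addresses, and the `trace` argument are assumptions made for the example.

```python
def transposition_network(n):
    """Comparators of odd-even transposition sort: n rounds, fixed in advance."""
    return [(i, i + 1) for r in range(n) for i in range(r % 2, n - 1, 2)]


def oblivious_permute(memory, pi, trace):
    """Move the item at address i to address pi[i] (pi is a permutation of 0..n-1).

    Each item is tagged with its target address, then the tags are sorted with a
    fixed comparator sequence, so the positions touched do not depend on the
    data or on pi. `trace` collects the touched pairs for inspection.
    """
    n = len(memory)
    tagged = [(pi[i], memory[i]) for i in range(n)]
    for a, b in transposition_network(n):
        trace.append((a, b))
        x, y = tagged[a], tagged[b]        # read both positions
        if x[0] > y[0]:                    # swap or not, depending on the tags
            x, y = y, x
        tagged[a], tagged[b] = x, y        # rewrite both positions either way
    return [item for _, item in tagged]
```

Running it with two different permutations yields identical traces, which is the property the claim asks for.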

+
Oblivious Sort: Sorting Network
 Sorts a fixed number of values using a fixed sequence of
comparisons.
 Perfect for oblivious shuffling
 Can be represented as a network of wires and comparator
modules
 Values (of any ordered type) flow across the wires.
 Each comparator connects two wires
[Figure: Batcher sorting network for n = 4]
Time Complexity = O(N log^2 N)
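
As an illustration, the comparator list of Batcher's odd-even mergesort network can be generated recursively. This is a sketch that assumes the input size is a power of two; the function names are chosen for the example.

```python
def oddeven_merge(lo, hi, r):
    """Comparators that merge two sorted halves of positions lo..hi (inclusive)."""
    step = r * 2
    if step < hi - lo:
        yield from oddeven_merge(lo, hi, step)
        yield from oddeven_merge(lo + r, hi, step)
        yield from ((i, i + r) for i in range(lo + r, hi - r, step))
    else:
        yield (lo, lo + r)


def oddeven_merge_sort(lo, hi):
    """Comparators that sort positions lo..hi (inclusive)."""
    if hi - lo >= 1:
        mid = lo + (hi - lo) // 2
        yield from oddeven_merge_sort(lo, mid)
        yield from oddeven_merge_sort(mid + 1, hi)
        yield from oddeven_merge(lo, hi, 1)


def batcher_network(n):
    """Fixed comparator sequence for n inputs (n must be a power of two)."""
    if n < 1 or (n & (n - 1)) != 0:
        raise ValueError("n must be a power of two")
    return list(oddeven_merge_sort(0, n - 1))


def apply_network(values, network):
    """Run the compare-exchange sequence; the positions touched never depend on the values."""
    values = list(values)
    for a, b in network:
        if values[a] > values[b]:
            values[a], values[b] = values[b], values[a]
    return values
```

For n = 4 the generator yields the comparators (0,1), (2,3), (0,2), (1,3), (1,2).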

+
Batcher Sorting Network
[Figure: Batcher sorting network algorithm for N = 8]

+
Goldreich’s “Square Root” ORAM
Initialization: Pick PRP key K. Use it to obliviously shuffle
N data slots together with C “dummy” slots. Empty the cache.
Read/Write(addr):
 Scan the cache on the server
 If data is not in the server cache, read it from main
memory
 If data is in the server cache, read the next dummy slot
 Write data into the server cache
 Reshuffle with a new K and flush the cache after
every C reads.
[Figure: server memory layout: N data slots, C dummy slots, C cache slots]
Server Storage: N + 2C slots [General Case]
Client Storage: O(1)
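
The steps above can be sketched as a toy Python class. Several simplifications are assumptions of this sketch and not part of the scheme: a random permutation table stands in for the PRP (so the client here is not O(1)), a plain loop stands in for the oblivious sort, encryption is omitted, and the names `SquareRootORAM`, `access`, and `trace` are hypothetical.

```python
import math
import random


class SquareRootORAM:
    """Toy square-root ORAM; `trace` records the physical accesses the server sees."""

    def __init__(self, data, seed=None):
        self.n = len(data)
        self.c = max(1, math.isqrt(self.n))      # C = sqrt(N)
        self.rng = random.Random(seed)
        self.trace = []
        self._shuffle(list(data))

    def _shuffle(self, logical):
        # ASSUMPTION: permutation table instead of a PRP, plain loop instead
        # of an oblivious sort. A real emulator would use both.
        self.perm = list(range(self.n + self.c))
        self.rng.shuffle(self.perm)
        self.main = [None] * (self.n + self.c)   # N data slots + C dummy slots
        for addr, value in enumerate(logical):
            self.main[self.perm[addr]] = value
        self.cache = []                          # up to C (addr, value) pairs
        self.count = 0                           # operations since last shuffle

    def access(self, addr, new_value=None):
        """Read addr, or write new_value to it when new_value is not None."""
        hit = None
        for slot in range(self.c):               # always scan the whole cache
            self.trace.append(("cache", slot))
            if slot < len(self.cache) and self.cache[slot][0] == addr:
                hit = slot
        if hit is None:                          # not cached: read the real slot
            pos = self.perm[addr]
            value = self.main[pos]
        else:                                    # cached: read the next dummy slot
            pos = self.perm[self.n + self.count]
            value = self.cache[hit][1]
        self.trace.append(("main", pos))
        if new_value is not None:
            value = new_value
        if hit is None:
            self.cache.append((addr, value))
        else:
            self.cache[hit] = (addr, value)
        self.count += 1
        if self.count == self.c:                 # flush cache, reshuffle with new key
            logical = [self.main[self.perm[a]] for a in range(self.n)]
            for a, v in self.cache:
                logical[a] = v
            self._shuffle(logical)
        return value
```

Between two reshuffles, every main-memory position is touched at most once, so repeated accesses to the same address are hidden behind dummy reads.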

+
Goldreich’s ORAM : Efficiency
Taking C = N^(1/2)
Batcher sort ⇒ extra log N factor in costs
Security: Relatively easy to prove.
Server sees an oblivious sort and then C unique, random looking
read/writes before reinitializing.
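
A rough cost accounting for the choice of C, assuming the O(N log^2 N) cost of a Batcher sort and ignoring constant factors (a back-of-the-envelope sketch, not a figure taken from the slides):

```latex
\[
\underbrace{O(C)}_{\text{cache scan per op}}
\;+\;
\underbrace{\frac{O\!\left(N \log^{2} N\right)}{C}}_{\text{reshuffle, amortized over } C \text{ ops}}
\;=\;
O\!\left(\sqrt{N}\,\log^{2} N\right)
\quad \text{for } C = \sqrt{N}.
\]
```

The two terms balance, up to the logarithmic factor, when C is about the square root of N. The worst case is the operation that triggers the reshuffle, which pays the full sorting cost.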

+
Ostrovsky’s “Hierarchical” ORAM
[Ostrovsky ’96]
 Aim: Minimize Amortized cost of Random Shuffling in
“Square Root” Approach
 Store data (including dummies) in Random-Hash Table,
rather than Randomly Sorted Array and recurse carefully
Main Idea:
 Use Hierarchy of Buffers (hash tables) of different sizes
 Shuffle buffers with frequency inversely proportional to their
sizes
Much more complicated technique for hiding repeated access to
same slots.

+
Ostrovsky’s “Hierarchical” ORAM: Design
Server Storage
• log N “levels” for N items
• Level i contains 2^i buckets
• Each bucket contains log N slots
• Each slot contains a ciphertext
encrypting data or dummy.
Client Storage
• PRP key Ki for each level
[Figure: levels 1 to 4 with PRP keys K1 to K4; shaded slots = data]
• Data starts on lowest level, hashed into buckets; overflow happens
with negligible prob.
• When accessed, data gets moved to level 1
• Eventually, data gets shuffled to lower levels
• Invariant: ≤ 2^i data slots used in level i
• (i.e. ≤ 1 per bucket on average)

+
Ostrovsky’s ORAM: Read/Write
Read/Write(addr) *
 Scan both top buckets for data
 At each lower level, scan exactly one bucket
 Until found, scan bucket at F(Ki, addr) on that level
 After found, scan a random bucket on that level
 Write data into bucket F(K1, addr) on level 1
 Perform a “shuffling procedure” to maintain invariant
* Server is blind to ops i.e. every op is replaced with both a read
and a write (which might or might not modify the data).
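
The lookup part of Read/Write can be sketched as below. This is only the scan pattern: the write into level 1, the shuffling procedure, and encryption are left out. HMAC-SHA256 stands in for F, and the data layout (`levels[i]` is a list of buckets, each a list of `(addr, data)` pairs) is an assumption of this sketch.

```python
import hashlib
import hmac
import secrets


def F(key: bytes, addr: int, n_buckets: int) -> int:
    """Keyed pseudorandom bucket index (HMAC-SHA256 is an assumed stand-in)."""
    digest = hmac.new(key, addr.to_bytes(8, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % n_buckets


def hierarchical_read(levels, keys, addr):
    """Return (data, scanned), where scanned lists the (level, bucket) pairs touched.

    levels[i] holds level i+1 and has 2**(i+1) buckets; keys[i] is its key.
    """
    hit, found, scanned = False, None, []
    for i, buckets in enumerate(levels):
        if i == 0:
            targets = range(len(buckets))                  # scan both top buckets
        elif not hit:
            targets = [F(keys[i], addr, len(buckets))]     # until found: F(Ki, addr)
        else:
            targets = [secrets.randbelow(len(buckets))]    # after found: random bucket
        for b in targets:
            scanned.append((i + 1, b))
            for slot_addr, data in buckets[b]:             # scan every slot in the bucket
                if slot_addr == addr and not hit:
                    hit, found = True, data
    return found, scanned
```

Whatever level holds the data, the server sees both top buckets scanned and then exactly one bucket per lower level.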

+
Ostrovsky’s ORAM in Action
[Figure: levels 1 to 4 with PRF keys K1 to K4; shaded slots = data]
Read/Write(blue address):
1. Scan both buckets at level 1
2. Scan bucket F(K2, addr) = 4 in level 2
3. Scan bucket F(K3, addr) = 3 in level 3 (finding data)
4. Scan a random bucket in level 4
5. Move found data to level 1

+
Shuffling Procedure
 We “merge levels” so that each level has ≤ 1 slot per
bucket on average
 After T operations:
 Let D = max{ x : 2^x divides T }
 For i = 1 to D
 Pick new PRP key for level i+1
 Shuffle data in levels i and i+1 together into level i+1 using new
key
 Level i is shuffled after every 2^i ops.
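
The schedule is easy to compute. The sketch below assumes operations are counted from T = 1 and ignores the cap imposed by the finite number of levels; the function names are chosen for the example.

```python
def merge_depth(t):
    """D = largest x such that 2**x divides t (for t >= 1)."""
    return (t & -t).bit_length() - 1


def merges_after(t):
    """Level merges (i, i+1) performed after the t-th operation, for i = 1..D."""
    return [(i, i + 1) for i in range(1, merge_depth(t) + 1)]
```

For example, `merges_after(4)` gives `[(1, 2), (2, 3)]`, while odd values of T trigger no merge at all.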

+
Shuffling: Oblivious Hashing
 12-step shuffling process with multiple calls to 2 primitives for
merging levels i and i+1 into level i+1
 Primitives Used:
 Scanning: Reading all words in a memory array and possibly
modifying them
 Oblivious in the sense that the order of access is predetermined and
the same content may be written back
 7 Calls
 Oblivious Sorting:
Sorting a memory array by sort keys using a sorting network such that
the sequence of memory accesses is fixed and independent of the input.
 4 Calls
Examples of sorting networks: Batcher, AKS

+
[Figure: levels 1 to 4 with PRF keys K1 to K4; shaded slots = data]
1. Read a Slot
2. Read another Slot
 Level 1 shuffled after 2^1 = 2 operations
3. 2 more Reads
 Level 1 shuffled again after 2 · 2^1 = 4 operations
 Level 2 shuffled after 2^2 = 4 operations [stops here]

+
Ostrovsky’s ORAM: Security
 Security proof is more delicate than the first one.
 Key observation: This scheme never evaluates F(Ki,
addr) on the same (key, address) pair twice.
 Why? Suppose the client touches the same address twice.
 After the first read, data is promoted to level 1.
 During the next read:
 If it is still on level 1, then we don’t evaluate F at all.
 If it has been moved, a new key must have been chosen for
that level since the last read due to shuffling.
 Using the key observation, all reads look like random bucket scans.

+
Ostrovsky’s ORAM: Efficiency

+
Recent Developments
 Cuckoo Hashing ORAMs
 Replace bucket-lists with more efficient hash table
 Path ORAM Protocol
 Binary Tree based ORAM Framework with client stash
 Each block is mapped to a uniformly random leaf
bucket in the tree.
 Unstashed blocks are always placed in some bucket
along the path to the mapped leaf.
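
To make the path invariant concrete, here is a small sketch that lists the buckets on the path to a leaf. It assumes the tree is stored heap-style in an array (root at index 0, children of node k at 2k+1 and 2k+2); that layout and the function name are assumptions for illustration, not details from the slides.

```python
def path_to_leaf(leaf, height):
    """Bucket indices from the root down to the given leaf.

    The tree has 2**height leaves, numbered 0 .. 2**height - 1 from left to right.
    """
    node = (1 << height) - 1 + leaf    # array index of the leaf bucket
    path = [node]
    while node > 0:
        node = (node - 1) // 2         # step up to the parent
        path.append(node)
    return path[::-1]
```

A block mapped to leaf 3 in a tree of height 2 may only live in buckets 0, 2, or 6, or in the client stash.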

+
Conclusion
Main Takeaway
 Use sorting networks with PRP keys to simulate oblivious
shuffling after a certain number of memory accesses
 Rinse and repeat
Improvement
 Use Hierarchy of Hash table Buffers of different sizes
 Shuffle buffers with frequency inversely proportional to their sizes

+
References
 Goldreich, Oded, and Rafail Ostrovsky. "Software protection
and simulation on oblivious RAMs." Journal of the ACM (JACM)
43.3 (1996): 431-473.
 Islam, Mohammad Saiful, Mehmet Kuzu, and Murat
Kantarcioglu. "Access Pattern disclosure on Searchable
Encryption: Ramification, Attack and Mitigation." NDSS. Vol.
20. 2012.
 http://cseweb.ucsd.edu/~cdcash/oram-slides.pdf
