Power Analysis Attacks
Lee Stewart
May 13, 2015
Table of Contents
Introduction
Hamming Weight and Hamming Distance
Simple Power Analysis Attacks
Differential Power Analysis Attacks
  Differential Power Analysis Trace Equation
Stages of Differential Power Analysis Attack
  Set-up
  Measurement
  Signal Processing
  Selection Function Generation
  Averaging
  Evaluation
Variants of Differential Power Analysis
  Correlation Power Analysis
  Probability Distribution Analysis
  High-Order Differential Power Analysis
  Template Attack
Countermeasures
  Leakage Reduction
  Balancing
  Amplitude and Temporal Noise
  Protocol-Level Countermeasures
  Masking
  Randomized Instruction Injection Method
  Randomized Execution Algorithms
Conclusion
References
Table of Figures
Figure 1: Trace from a Smart Card (Kocher et. al., 2011, p. 12)
Figure 2: RSA Trace (Kocher et. al., 2011, p. 12)
Figure 3: RSA Square and Multiply Algorithm (Lomne et. al., 2011, p. 60)
Figure 4: Trace of ECC Key (Li et. al., 2011, p. 71)
Figure 5: ECC Double and Add Algorithm (Li et. al., 2011, p. 70)
Figure 6: Distribution of Traces for LSB of 1st AES S-box (Kocher, 2011, p. 7)
Figure 7: DPA Tests for K = 101 to 105 (Kocher et. al, 2011, p. 10)
Figure 8: Example RIJIDindex Calculation (Ambrose et. al., 2012, p. 17)
Table 1: RIJIDindex for softRIJID (Ambrose et. al., 2012, p. 21)
Table 2: RIJIDindex for AutoRIJID (Ambrose et. al., 2012, p. 24)
Table 3: Comparison of Overhead Time for DIRI, REIDI, AREIDI, REBIDI (Zhang et. al., 2012, p. 435)
Table 4: Comparison of Unbiased Variance of DIRI, REIDI, AREIDI, and REBIDI (Zhang et. al., 2012, p. 435)
Introduction
Power analysis attacks are a subset of side channel attacks (SCA). Power analysis attacks
are based on the power consumption of integrated circuits (ICs). The other two types of SCA are
timing and electromagnetic emission (Regazzoni, Wang, & Standaert, 2011, p. 56). As hardware
devices such as smart cards, application specific integrated circuits (ASIC), and field
programmable gate arrays (FPGA) perform encryption/decryption their transistors consume
power. In the literature, this power consumption is known as a leakage. By capturing and
analyzing these leakages, an attacker can obtain the secret key. The captured power
measurements from these devices are called traces. There are two types of power analysis
attacks: simple power analysis (SPA) and differential power analysis (DPA) (Kocher, Jaffe, Jun,
& Rohatgi, 2011, p. 4-8). This paper will discuss these types of power analysis attacks and
various countermeasures used to thwart them. Before delving into these topics, further
background on leakage models is needed.
Hamming Weight and Hamming Distance
The two most popular leakage models are Hamming weight and Hamming distance. In
the Hamming weight model, a bit value of 1 consumes a significant amount of power whereas a
0 consumes minimal power. Therefore,
• Transitions 0 → 0 and 1 → 0 do not lead to significant power utilization
• Transitions 0 → 1 and 1 → 1 lead to significant power utilization
The Hamming distance model considers that only switching leads to power consumption.
Therefore,
• Transitions 0 → 0 and 1 → 1 do not lead to significant power utilization
• Transitions 0 → 1 and 1 → 0 lead to significant power utilization (Lomne,
Dehaboui, Maurine, Torres, & Robert, 2011, p. 58).
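The two models can be illustrated with a minimal Python sketch. The function names and the example register values are illustrative assumptions and do not come from the cited sources.

```python
def hamming_weight(value: int) -> int:
    """Number of bits set to 1 in a non-negative integer."""
    return bin(value).count("1")


def hamming_distance(before: int, after: int) -> int:
    """Number of bit positions that differ between two register states."""
    return hamming_weight(before ^ after)


# Hypothetical register update: 0b1111 is overwritten with 0b1110.
before, after = 0b1111, 0b1110
hw_leak = hamming_weight(after)            # Hamming weight model: bits that end up as 1
hd_leak = hamming_distance(before, after)  # Hamming distance model: bits that switched
print(hw_leak, hd_leak)
```

In this example the Hamming weight model predicts a comparatively large leakage because three bits end up as 1, while the Hamming distance model predicts a small one because only a single bit switches.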
Simple Power Analysis Attacks
Simple power analysis involves visual inspection of a trace to obtain cryptographic
secrets (Moradi, Barenghi, Kasper, & Paar, 2011, p. 114). SPA can also be used as a preliminary
step before differential power analysis. Figure 1 shows a trace from a smart card. The trace
identifies when the triple DES operation occurs. Once obtained, the attacker can zoom in on an
area of interest to gain more information.
Figure 1: Trace from a Smart Card (Kocher et. al., 2011, p. 12)
Figure 2 shows a subset of a trace used to obtain the binary representation of the
exponent b in the RSA function $a^b \bmod n$. In the RSA square and multiply algorithm,
multiplication consumes more power and occurs when there is a 1 in the exponent. A short peak
followed by a taller one represents a 1. Two short peaks indicate a 0. Figure 3 contains the
square and multiply algorithm (Kocher et. al., 2011, p. 12).
Figure 2: RSA Trace (Kocher et. al., 2011, p. 12)
Figure 3: RSA Square and Multiply Algorithm (Lomne et. al., 2011, p. 60)
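To make the leak concrete, the following Python sketch logs the sequence of squarings (S) and multiplications (M) and then reads the exponent back from that sequence, much as an attacker would read short and tall peaks from a trace. It assumes a common left-to-right variant of square and multiply, which may differ in detail from the algorithm in Figure 3; the function names and the numeric values are illustrative.

```python
def square_and_multiply(a, b, n):
    """Left-to-right square and multiply; also returns the operation log."""
    result = 1
    ops = []
    for bit in bin(b)[2:]:              # most significant bit first
        result = (result * result) % n
        ops.append("S")                 # every exponent bit causes a squaring
        if bit == "1":
            result = (result * a) % n
            ops.append("M")             # only a 1 bit causes a multiplication
    return result, ops


def exponent_from_ops(ops):
    """Recover the exponent from the S/M sequence visible in an SPA trace."""
    bits = []
    for op in ops:
        if op == "S":
            bits.append("0")            # assume 0 until a multiplication follows
        else:
            bits[-1] = "1"              # S followed by M marks a 1 bit
    return int("".join(bits), 2)


value, ops = square_and_multiply(7, 0b101101, 1009)
print("".join(ops), exponent_from_ops(ops))
```

The operation log depends only on the exponent, which is why a single trace can be enough for this kind of attack when the two operations are distinguishable.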
Figure 4 shows the pattern of the elliptic curve cryptography (ECC) add-and-double
algorithm. There are three operations in the algorithm: (A) addition, (D1) doubling after addition,
and (D2) doubling after doubling. D2 consumes more power than D1 or A. D2 represents a 0 and
A followed by D1 indicates a 1. Figure 5 shows the add-and-double algorithm. More information
about ECC and the algorithm can be found in the reference (Li, Wu, Xu, Yuan, & Luo, 2011, p.
71).
Figure 4: Trace of ECC Key (Li et. al., 2011, p. 71)
Figure 5: ECC Double and Add Algorithm (Li et. al., 2011, p. 70)
Differential Power Analysis Attacks
Differential power analysis is a more powerful technique than SPA. In this method, a
cryptanalyst uses statistics to analyze the correlation between data and traces (Zhang, Liao, Qiu,
Hu & Sha, 2012, p. 426). DPA can be mounted as a known-plaintext or known-ciphertext attack
(Lomne et. al., 2011, p. 60). DPA also uses a divide-and-conquer strategy to recover different
parts of the key (Mangard, Oswald, & Standaert, 2011, p. 101). One caveat of DPA is that the
traces have to be aligned. If they are not aligned because of countermeasures (to be discussed
later), they must be re-synchronized (Lomne et. al., 2011, p. 61).
For example, in AES, an attacker normally targets the output of AddRoundKey or
SubBytes, known as intermediaries. Figure 6 shows the probability distribution of traces of the
least significant bit (LSB) of the first S-box in the AES scheme. The left distribution represents 1
and the right distribution represents 0. From the figure, it can be seen that the distributions are
approximately Gaussian and the mean of the 0 bit is greater than that of the 1 bit. The
distributions overlap significantly, so a large number of measurements are needed in order to
discriminate them.
Figure 6: Distribution of Traces for LSB of 1st AES S-box (Kocher, 2011, p. 7)
Figure 7 shows the DPA results of the first eight bits of a key. The number of values that
can be tested per S-box ranges from 0 to 255. The five traces represent values of 101 to 105
respectively. The third trace corresponding to key 103 has the largest spikes and is the correct
key. The same traces can be reused in finding the remaining parts of the key (Kocher, 2011, p. 9-10).
Figure 7: DPA Tests for K = 101 to 105 (Kocher et. al, 2011, p. 10)
An illustration of the power of a DPA attack is provided by Moradi, Barenghi, Kasper,
and Paar (2011, p. 120) on their attack of a triple DES implementation on a Xilinx Virtex II
FPGA. They used 50,000 traces in their analysis. With these traces they were able to recover a 6-
bit DES subkey in less than 4 seconds using about 20 megabytes of memory on a PC. The whole
112-bit key of a 2-key triple DES was obtained in roughly two minutes and the 168-bit key of a
3-key triple DES was recovered in about three minutes. However, they concluded the attack
could be done with as few as 25,000 traces.
Differential Power Analysis Trace Equation
The following equation defines a DPA test.
$$\Delta_D[j] = \frac{\sum_{i=1}^{m} D(C_i, K_n)\, T_i[j]}{\sum_{i=1}^{m} D(C_i, K_n)} - \frac{\sum_{i=1}^{m} \left(1 - D(C_i, K_n)\right) T_i[j]}{\sum_{i=1}^{m} \left(1 - D(C_i, K_n)\right)}$$
Where:
$\Delta_D[j]$ is the differential trace at the jth time offset
$T_i[j]$ is the power measurement at the jth time offset within the trace $T_i$
$C_i$ is the set of known inputs or outputs for the ith trace
$K_n$ is a guess of part of the key
$D(C_i, K_n)$ is a binary valued selection function with input $C_i$ and $K_n$
The value of Kn that produces the largest spikes in the differential trace is considered to
be the most likely candidate for the correct key (Kocher et. al., 2011, p. 10).
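The DPA test can be sketched in Python on simulated data. Everything about the simulation is an assumption made for illustration: the traces are five samples of Gaussian noise, the device is assumed to leak the least significant bit of the first-round AES S-box output at one time offset, and the key byte 103 is chosen only to echo the example in Figure 7. The selection function and the difference of means follow the equation above.

```python
import random


def aes_sbox():
    """Generate the AES S-box (field inverse followed by the affine map)."""
    def rotl8(x, s):
        return ((x << s) | (x >> (8 - s))) & 0xFF

    sbox = [0] * 256
    p = q = 1
    while True:
        # p walks through all non-zero field elements (multiply by 3).
        p = (p ^ (p << 1) ^ (0x1B if p & 0x80 else 0)) & 0xFF
        # q tracks the multiplicative inverse of p (divide by 3).
        q ^= (q << 1) & 0xFF
        q ^= (q << 2) & 0xFF
        q ^= (q << 4) & 0xFF
        if q & 0x80:
            q ^= 0x09
        sbox[p] = q ^ rotl8(q, 1) ^ rotl8(q, 2) ^ rotl8(q, 3) ^ rotl8(q, 4) ^ 0x63
        if p == 1:
            break
    sbox[0] = 0x63
    return sbox


SBOX = aes_sbox()


def selection(c, key_guess):
    """D(C_i, K_n): least significant bit of the S-box output."""
    return SBOX[c ^ key_guess] & 1


def simulate_traces(key, m, length=5, leak_at=2, noise=0.5, seed=1):
    """Toy traces (assumption): noise plus the target bit at one offset."""
    rng = random.Random(seed)
    inputs = [rng.randrange(256) for _ in range(m)]
    traces = []
    for c in inputs:
        t = [rng.gauss(0.0, noise) for _ in range(length)]
        t[leak_at] += selection(c, key)
        traces.append(t)
    return inputs, traces


def differential_trace(inputs, traces, key_guess):
    """Delta_D[j]: mean of traces with D = 1 minus mean of traces with D = 0."""
    length = len(traces[0])
    sum1, sum0 = [0.0] * length, [0.0] * length
    n1 = n0 = 0
    for c, t in zip(inputs, traces):
        if selection(c, key_guess):
            n1 += 1
            sum1 = [s + x for s, x in zip(sum1, t)]
        else:
            n0 += 1
            sum0 = [s + x for s, x in zip(sum0, t)]
    return [a / n1 - b / n0 for a, b in zip(sum1, sum0)]


def recover_key_byte(inputs, traces):
    """Return the key guess whose differential trace has the largest spike."""
    def peak(k):
        return max(abs(d) for d in differential_trace(inputs, traces, k))
    return max(range(256), key=peak)


inputs, traces = simulate_traces(key=103, m=1000)
recovered = recover_key_byte(inputs, traces)
print(recovered)
```

Real measurements are far noisier than this toy model and the leakage is rarely a single clean bit, which is why practical attacks need the signal processing and trace counts discussed in the following sections.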
Stages of Differential Power Analysis Attack
There are six steps in a DPA attack: set-up, measurement, signal processing, prediction
and selection function generation, averaging, and evaluation. These are discussed below.
Set-up
In the set-up stage, the equipment needed to get information from a smart card, ASIC, or
FPGA is set up. The auxiliary equipment consists of a resistor or current probe, oscilloscope, and
personal computer (PC). For a smart card, the resistor or current probe is connected in series with
the ground line. For an FPGA, the resistor or current probe is connected in series with the power
input. If the device has an internal battery, a resistor is not needed.
The following tips will ensure better data collection. First, taking measurements near the
IC improves the quality of the traces. Second, removing the decoupling capacitors on the board
or using a power supply will reduce noise. Third, operating the device near its peak voltage or
clock rate can reduce the effect of countermeasures. Fourth, increasing the input message length
will drive up the number of operations per trace and speed up data collection (Kocher et. al.,
2011, p. 14-15).
Measurement
In the measurement stage, traces are collected by an oscilloscope and fed to a PC for
statistical processing along with the associated plaintext or cipher text. The biggest issue
affecting measurement is the signal-to-noise ratio (SNR). As the SNR decreases, the number of
traces needed increases. This will be discussed later in the Countermeasures section. Sampling
error is a concern but less pronounced. Quality can be improved by adding analog filters or
adjusting bandwidth or sampling rates. The quantity of traces needed can be reduced by first
inspecting SPA traces in order to remove irrelevant regions or using a preliminary DPA test
using known input and output bits. However, the preliminary test only works if the key is known.
The best scope to use for data collection would be one with deep memory for capturing
longer traces, trigger flexibility to start capturing the trace at the right time, and rapid trigger
re-arming time which can help speed up data collection (Kocher et. al., 2011, p. 15).
Signal Processing
Signal processing is used to remove alignment errors, isolate and highlight areas of
interest, and reduce noise. In most cases, only time alignment is needed. Traces with good
alignment reduce the number of traces needed for analysis. Alternatively, the traces can be
converted to the frequency domain using a Fourier transform before analysis. Another processing
technique is compression, which will reduce the number of traces needed, reduce noise, and
amplify signal resolution (Kocher et. al., 2011, p. 15-16).
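Time alignment can be sketched as a search for the shift that maximizes the correlation between a trace and a reference trace. The Python below is a simplified illustration under stated assumptions: it uses an unnormalized dot product, a small search window, and made-up sample values; the function names are hypothetical.

```python
def best_shift(reference, trace, max_shift):
    """Return the shift (in samples) that best lines trace up with reference."""
    def score(shift):
        # Dot product of reference[i] with trace[i + shift] over the overlap.
        total = 0.0
        for i, r in enumerate(reference):
            j = i + shift
            if 0 <= j < len(trace):
                total += r * trace[j]
        return total
    return max(range(-max_shift, max_shift + 1), key=score)


def align(reference, trace, max_shift):
    """Shift trace so its features match reference, padding with zeros."""
    shift = best_shift(reference, trace, max_shift)
    return [trace[i + shift] if 0 <= i + shift < len(trace) else 0.0
            for i in range(len(reference))]


reference = [0, 0, 1, 5, 1, 0, 0, 0]
delayed = [0, 0, 0, 0, 1, 5, 1, 0]    # the same pulse, two samples late
print(best_shift(reference, delayed, 3), align(reference, delayed, 3))
```

With real traces one would typically subtract the mean and normalize before correlating, and restrict the comparison to a distinctive region of the trace.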
Selection Function Generation
A selection function is used to assign traces to subsets. Selection functions are based on
educated guesses as to the possible value of one or more intermediaries in a cryptographic
calculation. Intermediaries,
as previously discussed, would include interim results from a cryptographic operation such as
output from the AddRoundKey stage in AES. Usually selection functions are single bit (0 or 1)
but can be multi-bit (Kocher et. al., 2011, p. 16).
Averaging
In the averaging stage, each trace subset created in the selection function stage is
averaged. This is the most computationally intensive step. Offsets are normally used during the
averaging process to maintain independence from the underlying device.
Two constraints affecting the averaging performance are processing power and storage
throughput. The complexity of averaging is $O(N \cdot M \cdot L)$ where N is the number of traces, L is
the length of traces, and M is the number of selection functions. The memory use is $O(M \cdot L)$.
To improve this stage, two optimizations are available. The first is to calculate the
average of all traces (Atotal) and the average of the subset (A0 or A1) where the selection function
is either 0 or 1. The average of the other subset can then be found by subtracting A0 or A1 from
Atotal. While the complexity is the same, the memory use is cut in half.
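A minimal Python sketch of the first optimization follows. It works with sums rather than averages, so the second subset is obtained by subtracting the stored subset sum from the total sum and dividing by the remaining trace count. The function name and the sample data are illustrative assumptions.

```python
def subset_means(traces, selections):
    """Means of the D = 1 and D = 0 subsets, accumulating only one subset."""
    m, length = len(traces), len(traces[0])
    # Sum over all traces; in an attack this would be computed once and
    # shared by every selection function.
    total = [sum(t[j] for t in traces) for j in range(length)]
    sum1 = [0.0] * length
    n1 = 0
    for t, d in zip(traces, selections):
        if d:
            n1 += 1
            for j in range(length):
                sum1[j] += t[j]
    n0 = m - n1
    mean1 = [s / n1 for s in sum1]
    # The D = 0 subset is derived from the total and never accumulated.
    mean0 = [(tot - s) / n0 for tot, s in zip(total, sum1)]
    return mean1, mean0


traces = [[1, 2], [3, 4], [5, 6], [7, 8]]
selections = [1, 0, 1, 0]
print(subset_means(traces, selections))
```

Only one running sum per selection function has to be kept in memory, which is where the halving of memory use comes from.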
The second optimization is to use a cache during calculations. For instance, a cache size
of $2^8 - 1$ can handle eight traces at a time and compute the sums of 255 traces. Then, the
preprocessed traces can be added to the previous averages in bulk. A cache size of c requires
$O(c \cdot L)$ memory and takes $O(2^c \cdot L)$ operations to set up. Performance using a cache is
improved to $O\left(\frac{N}{c} \cdot M \cdot L + 2^c\right)$.
Furthermore, decreasing the trace size can speed up averaging time. As mentioned
earlier, signal processing techniques such as removing irrelevant data or compression can help in
this regard. Also, the averaging task can be run in parallel over multiple drives, threads, or
machines. (Kocher et. al., 2011, p. 17).
Evaluation
In the evaluation stage, the mean difference between the sets of traces in the averaging
stage is calculated. Any significant deviations will appear as spikes in the output which signify
that the selection function is correct and hence the key, sub-key, or intermediate value is correct.
For regions of the trace affected by significant noise, spurious spikes can appear in a differential
trace. These are known as harmonics. To compensate for these, the trace can be normalized by
dividing each point in the trace by the standard deviation (Kocher et. al., 2011, p. 17-18). For
students of statistics, this is analogous to calculating z-values.
Variants of Differential Power Analysis
Correlation Power Analysis
Correlation power analysis (CPA) is not as powerful as DPA. It is based on the
correlation between power consumption and model of energy consumption of a device. As
already mentioned, Hamming weight and Hamming distance are the most prevalent models. The
following equation is the correlation function between power consumption and Hamming weight
(or Hamming distance). It is proportional to the correlation between power and energy.
$$\rho_{WH,k}(B) = \frac{E(W H_k) - E(W)\,E(H_k)}{\sigma_W \, \sigma_{H_k}}$$
Where:
W is the power consumption
$H_k$ is the Hamming weight or Hamming distance of the kth key
$E(W H_k)$, $E(W)$, and $E(H_k)$ are the expected values of $W H_k$, W, and $H_k$
$\sigma_W$ and $\sigma_{H_k}$ are the standard deviations of W and $H_k$
The number of traces to use depends on factors such as clock frequency, resolution, and
acquisition time (Menicocci, Trifiletti, & Trotta, 2013, p. 146).
Correlation power analysis is most effective in white-box analysis where the device
leakage model is known. However, it can also be used for black-box attacks as long as there is
some correlation between the leakage of the device and its leakage model. CPA is ideal when the
number of traces is limited (Kocher et. al., 2011, p. 19).
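The correlation function can be sketched in Python as follows. The leakage model is a deliberately idealized assumption: the simulated power is a noise-free linear function of the Hamming weight of the plaintext byte XORed with the key byte, as after AddRoundKey, and the key value 0x3C is hypothetical. The attack keeps the key guess with the highest correlation.

```python
from math import sqrt


def hamming_weight(value):
    return bin(value).count("1")


def correlation(w, h):
    """rho = (E(WH) - E(W)E(H)) / (sigma_W * sigma_H)."""
    n = len(w)
    e_w = sum(w) / n
    e_h = sum(h) / n
    e_wh = sum(a * b for a, b in zip(w, h)) / n
    sigma_w = sqrt(sum((a - e_w) ** 2 for a in w) / n)
    sigma_h = sqrt(sum((b - e_h) ** 2 for b in h) / n)
    return (e_wh - e_w * e_h) / (sigma_w * sigma_h)


def cpa_best_guess(inputs, power):
    """Return the key guess whose Hamming weight model correlates best."""
    def rho(k):
        return correlation(power, [hamming_weight(c ^ k) for c in inputs])
    return max(range(256), key=rho)


SECRET = 0x3C                                   # hypothetical key byte
inputs = list(range(256))                       # every plaintext byte once
power = [2.0 * hamming_weight(c ^ SECRET) + 1.0 for c in inputs]
print(hex(cpa_best_guess(inputs, power)))
```

Because this toy intermediate is linear in the key, wrong guesses that differ in only one bit still correlate strongly; attacks on real ciphers usually target a nonlinear intermediate such as the S-box output to separate the candidates more clearly.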
Probability Distribution Analysis
Probability distribution analysis (PDA) is a form of DPA in which traces are
preprocessed before being analyzed. First, the average of all traces is computed. This value is
subtracted from each trace. The remaining data points are squared. The analysis performed on
the preprocessed traces is equivalent to comparing the variances of the data at each point, rather
than their means. An attacker would use PDA in a situation where a countermeasure has been
used that causes the differential trace to be flat (Kocher et. al., 2011, p. 19-20).
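The preprocessing step can be sketched in a few lines of Python. The sample values are invented to show two subsets with the same mean but different spread; the function name is an illustrative assumption.

```python
def pda_preprocess(traces):
    """Subtract the mean trace from every trace, then square each point."""
    m, length = len(traces), len(traces[0])
    mean = [sum(t[j] for t in traces) / m for j in range(length)]
    return [[(t[j] - mean[j]) ** 2 for j in range(length)] for t in traces]


# Two subsets with the same mean (5.0) but a different spread at one point.
subset_a = [[4.0], [6.0]]
subset_b = [[2.0], [8.0]]
processed = pda_preprocess(subset_a + subset_b)
print(processed)
```

A difference of means on the raw values is zero here, whereas the same test on the preprocessed values is not, which is the situation PDA is meant to exploit.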
High-Order Differential Power Analysis
High-Order DPA is a method that analyzes the relationship between multiple parameters.
The number of parameters used indicates the order of the attack. High-order DPA can help
analyze relationships such as:
Similarity/difference: The calculations at different points in one or more traces may use
the same parameter. High-order functions that measure correlation or covariance can be used to
detect these relationships. High-order DPA has been used successfully in attacks on RSA and
ECC.
Masked shares of a secret: This is a second order DPA attack used to attack the masking
countermeasure (discussed in the Countermeasures section). An example would be a mask that
manipulates an intermediate such as SubBytes in AES as two parts, say A and B. A and B are
random, but 𝐴⨁𝐵 would reveal the intermediate value. A high-order function can combine a
measurement correlated to the first part with a measurement correlated to the second part, so that
the combination is correlated to the sensitive variable. (Kocher et. al., 2011, p. 20).
Template Attack
In a template attack, an analyst builds an IC identical to a target IC then compares traces
between them (Lomne et. al., 2011, p. 66). The advantage of template attacks is that they can be
used when the number of traces is small (Kocher et. al., 2011, p. 21). The attack occurs in two
stages: a template building stage and a template matching stage. In the building stage, template ICs
for pairs of data and keys are built. The traces are modelled as a multivariate normal distribution.
For more information about the underlying theory see the reference. In the matching stage, the
probability density function of the multivariate normal distribution of the template is compared
to the target device (Lomne et. al., 2011, p. 66-67). Smartphones are a popular target of these
attacks (Kocher et. al., 2011, p. 21).
Countermeasures
There are four ways to mitigate power analysis attacks: decrease the signal-to-noise
ratio, balance the power used by ICs, manipulate the data before encryption, and randomize
operations. These countermeasures can be implemented in hardware or software. The following
subsections will discuss some of these methods.
Leakage Reduction
In the leakage reduction countermeasure, the signal-to-noise ratio is reduced by either
decreasing the signal or increasing the noise. In terms of effectiveness, a factor k reduction in the
SNR means that $k^2$ times more traces are needed for an attack (Kocher et. al., 2011, p. 21).
Balancing
The goal of balancing is to even out the power used by ICs while performing cryptographic
operations. Some techniques include using multi-bit data representations and balanced
transitions, dual-rail precharge logic, current mode, and asynchronous logic styles. These are
ongoing areas of research (Kocher et. al., 2011, p. 21). The dual-rail precharge method has two
stages. The precharge stage resets all signals to a known state. The evaluation stage does
computations with a fixed number of transitions. The trade-off of this scheme is that it doubles
the complexity of implementation (Danger, Guille, Barthe, Benoit, 2011, p. 86-87). Balancing
has the same $k^2$ effectiveness mentioned above (Kocher et. al., 2011, p. 21).
Amplitude and Temporal Noise
Both amplitude and temporal noise alter the SNR by increasing the noise component.
Amplitude noise is added by using ICs that consume variable amounts of power or that perform
calculations that are uncorrelated to intermediates such as SubBytes or AddRoundKey in AES.
Temporal noise is added by varying timing and execution order of instructions. Methods include
varying clock speed, adding random wait states, random execution, use of dummy operations,
and random branching. These countermeasures can be defeated by using filters and signal
processing techniques discussed earlier. They also have the same $k^2$ effectiveness relationship
mentioned in the leakage reduction section (Kocher et. al., 2011, p. 22).
Protocol-Level Countermeasures
Protocol-level countermeasures involve designing cryptographic protocols that can
survive leakage. Of course, it is impossible to totally prevent leakage. The power of these
countermeasures is that they can compensate for less-than-perfect hardware. To illustrate, an IC
using a regular protocol can be broken with a leakage rate as low as $10^{-9}$ bits per operation.
Protocol-level countermeasures can survive leakage of more than 10 bits per operation, a
staggering difference.
One protocol-level countermeasure is to limit the number of transactions that can be
performed with any given key. For example, a policy could be implemented which destroys the
key after 20 transactions. This is analogous to bank PIN codes which lock a user's account after
three incorrect guesses.
Another method is key update procedures which update keys periodically. The key
update process changes the key state so that the new key cannot be correlated to the old key. Key
updates may be structured in a key-tree. A key-tree is a tree structure defined from a root key and
a set of key update transforms. Counters or other protocol constructions limit the number of
times any given node is used to form transaction keys (Kocher et. al., 2011, p. 23-24).
Masking
The masking countermeasure conceals plaintext or an intermediate value x with a mask m
which takes random values. There are two types of masks: Boolean and arithmetic. Boolean
masking, shown below, uses bitwise exclusive-or.
$$x_m = x \oplus m$$
Arithmetic masking, shown below, uses modular addition or multiplication on a finite
field.
$$x_m = x + m \pmod{n}$$ or
$$x_m = x \cdot m \pmod{n}$$
where $n = 2^{|x|} = 2^{|m|}$
The drawback of masking is that it decreases throughput by one-half (Danger et. al.,
2011, p. 76). Data masking techniques are vulnerable to second-order DPA attacks (Ambrose,
Ragel, & Parameswaran, 2012, p. 5).
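The three masking formulas can be sketched in Python for 8-bit values. The word size, function names, and example value are assumptions made for illustration. Note that multiplication modulo a power of two can only be undone when the mask is odd, so the sketch forces the multiplicative mask to be odd; the modular inverse via pow requires Python 3.8 or later.

```python
import secrets

BITS = 8
N = 1 << BITS                      # n = 2^|x| = 2^|m|


def boolean_mask(x, m):
    return x ^ m


def boolean_unmask(xm, m):
    return xm ^ m


def additive_mask(x, m):
    return (x + m) % N


def additive_unmask(xm, m):
    return (xm - m) % N


def multiplicative_mask(x, m):
    return (x * m) % N


def multiplicative_unmask(xm, m):
    # Only valid for odd m, which is invertible modulo a power of two.
    return (xm * pow(m, -1, N)) % N


x = 0xA7                           # hypothetical intermediate value
m = secrets.randbelow(N)           # fresh random mask for each execution
m_odd = m | 1                      # odd mask for the multiplicative variant
print(boolean_mask(x, m), additive_mask(x, m), multiplicative_mask(x, m_odd))
```

The device then computes on the masked value and the mask separately, so neither value on its own is correlated with x.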
Randomized Instruction Injection Method
Ambrose et. al. (2012) claim that the countermeasures mentioned above are vulnerable to
multi-order DPA or have high area, run time, and energy costs (p. 1). For example, the balancing
countermeasure increases execution time by 75 percent (p. 6). To overcome these problems, they
developed a randomized instruction injection technique (RIJID). RIJID scrambles the trace by
injecting random instructions at random points in an algorithm (p. 1).
There are two ways to trigger the instructions:
• softRIJID: a hardware/software approach, where code injection is triggered by
special instructions at runtime; and
• autoRIJID: a hardware approach, where code injection is triggered by the
processor at runtime.
RIJID differs from other approaches in that it injects real instructions such as AND and
OR randomly instead of dummy instructions such as NO-OP at fixed points (p. 2). The issue
with dummy instructions is that they have their own trace profile and can be synchronized.
Dummy instructions are also vulnerable to sliding window differential power analysis
(SW-DPA) (p. 5).
Using softRIJID, the area of a RISC processor increased by 1.98 percent with an increase
of 29.8 percent in runtime and 27.1 percent in energy. Using autoRIJID, the area of a RISC
processor increased by 1.20 percent with an increase of 25.0 percent in runtime and 28.5 percent
in energy (p. 6).
Some of the limitations of RIJID are:
• It is a design-time approach and needs hardware changes
• softRIJID requires compiler support
• RIJIDindex can only compare different injection methods. It cannot compare
completely different methods such as hardware balancing against code injection
• The design does not support processors that have advanced scalar features (p. 7)
To measure the amount of scrambling, the authors developed a metric called RIJIDindex,
which is based on cross-correlation. The higher the index, the greater the amount of scrambling
present. RIJIDindex can be used to gauge the vulnerability of a cryptosystem instead of DPA.
The RIJID framework consists of:
• Analyzing the power waveform and extracting a template. In this context, a
template is an expected trace pattern (not a real trace) that would be observed
based on a cryptographic algorithm such as AES. The paper has an example of a
template for triple DES.
• Applying RIJID to the system and measuring the scrambled trace
• Using RIJID to calculate the level of scrambling (p. 17)
Figure 8: Example RIJIDindex Calculation (Ambrose et. al., 2012, p. 17)
Figure 8 shows an example of a RIJIDindex calculation. (a) shows an original trace. (b)
shows a repeating template. (c) shows the RIJID scrambled trace. (d) shows a random trace that
does not have a template. (e) shows the cross-correlation between the original trace and the
repeating template. As can be seen in the figure, there are significant peaks at points where the
template and original match. (f) shows the cross-correlation between the template and the
scrambled trace. (g) shows the cross-correlation between the template and the random trace. As
can be seen from the figure, both cross-correlations (f) and (g) do not show any significant peaks
(p. 18).
The formula for the RIJIDindex is:
$$RIJIDindex = \frac{\Delta_o - \Delta_z}{\Delta_o - \Delta_r}$$
The numerator is the difference between the means of the original ($\Delta_o$) and RIJID ($\Delta_z$)
traces. The denominator is the difference between the means of the original and random ($\Delta_r$)
traces (p. 18). For softRIJID, Table 1 shows the RIJIDindex for various injection pairs and
cryptographic algorithms. As can be seen in the table, softRIJID has the best overall performance
for RSA. The ideal injection pair for RSA is (3, 3) (p. 21).
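The behaviour of the formula at its two extremes can be checked with a short Python sketch. The three delta values are treated as given inputs, and the numbers used are hypothetical rather than taken from the paper.

```python
def rijid_index(delta_o, delta_z, delta_r):
    """RIJIDindex = (delta_o - delta_z) / (delta_o - delta_r)."""
    return (delta_o - delta_z) / (delta_o - delta_r)


# No scrambling: the RIJID trace behaves like the original trace.
unscrambled = rijid_index(0.8, 0.8, 0.1)
# Ideal scrambling: the RIJID trace behaves like a random trace.
ideal = rijid_index(0.8, 0.1, 0.1)
print(unscrambled, ideal)
```

Under these assumptions the index ranges from 0 (no scrambling) to 1 (indistinguishable from a random trace), which is consistent with the values close to 1 reported in Tables 1 and 2.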
Table 1: RIJIDindex for softRIJID (Ambrose et. al., 2012, p. 21)
N D RSA IDEA RC4
3 3 0.9998 0.7204 0.9908
4 4 0.9896 0.7475 0.9523
5 5 0.9511 0.9738 0.9998
6 6 0.9607 0.7571 0.9364
For autoRIJID, Table 2 shows the RIJIDindex for various cryptographic algorithms using
an injection pair of (5, 5). As can be seen in the table, autoRIJID has the best performance for
RSA. (p. 24)
Table 2: RIJIDindex for AutoRIJID (Ambrose et. al., 2012, p. 24)
Algorithm RIJIDindex
TripleDES 0.7040
Blowfish 0.9622
Rijndael 0.9495
SHA 0.7096
RSA 0.9980
Randomized Execution Algorithms
Zhang et. al. (2012) developed four algorithms to resist power analysis attacks. They are:
• Dummy Instructions Random Insertion (DIRI)
• Randomized Execution with Independent Dummy Instructions (REIDI)
• Advanced Randomized Execution with Independent Dummy Instructions
(AREIDI)
• Randomized Execution with Binding and Independent Dummy Instructions
(REBIDI)
The underlying details of the algorithms are complex, so only their experimental results
will be presented. The researchers used MiBench to obtain the results below. MiBench is a free
benchmark suite. Table 3 is adapted from a larger table published in the paper. The table only
lists the extra overhead used by the algorithms relative to the benchmarks as a percentage.
Observe that all the candidate algorithms use more overhead than the respective benchmarks;
however, DIRI performs better than others with an average overhead of 6.03 percent in excess of
the benchmarks. REBIDI is the worst performer. It has an average overhead approximately
one-third more than the benchmarks. REIDI and AREIDI are approximately the same at about 24.5
percent.
Table 3: Comparison of Overhead Time for DIRI, REIDI, AREIDI, REBIDI (Zhang et. al., 2012, p. 435)
Overhead (%)
Benchmarks       DIRI    REIDI   AREIDI  REBIDI
SHA              5.88    27.01   23.52   29.55
Rijndael         5.98    25.20   18.86   27.88
Blowfish         6.44    26.71   19.37   33.70
FFT              4.91    26.73   21.68   39.60
CRC32            7.57    21.99   34.28   39.95
Adpcm            5.12    23.40   23.56   30.36
Dijkstra         6.25    24.50   28.38   32.83
Qsort (small)    7.73    25.68   27.05   39.55
Qsort (large)    5.24    23.08   25.64   35.12
stringsearch     5.22    20.40   22.77   30.83
Average          6.03    24.47   24.51   33.94
Likewise, Table 4 is adapted from a larger table in the paper. It shows the improvement
of each candidate algorithm in terms of security as measured by the unbiased variance of the
power. Observe that REBIDI has the most improvement over the benchmarks with an average of
76.56 percent. DIRI has the lowest average improvement with an average of 40.22 percent.
Table 4: Comparison of Unbiased Variance of DIRI, REIDI, AREIDI, and REBIDI (Zhang et. al., 2012, p. 435)
Improvement (%)
Benchmarks       DIRI    REIDI   AREIDI  REBIDI
SHA              35.81   77.99   79.19   84.43
Rijndael         40.26   79.43   80.20   83.13
Blowfish         50.70   69.27   69.81   71.03
FFT              47.06   73.97   76.79   77.97
CRC32            35.63   72.64   72.57   73.69
Adpcm            34.36   73.59   70.55   74.09
Dijkstra         46.11   72.55   75.88   76.54
Qsort (small)    35.39   68.18   70.46   75.43
Qsort (large)    42.31   68.26   67.24   74.42
stringsearch     34.56   73.90   72.25   75.21
Average          40.22   72.98   73.49   76.56
Comparing both tables, a trade-off can be seen. An increase in security comes at the
cost of increased overhead on the IC. For example, DIRI uses an average 6.03 percent more
overhead but has the lowest average rate of improvement at 40.22 percent. Looking at it from a
different perspective though, DIRI has a better ratio. For a modest increase in average overhead,
the gain is approximately sevenfold.
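The ratio can be reproduced from the Average rows of Tables 3 and 4. The short Python sketch below divides the average improvement by the average overhead for each algorithm; the variable names are illustrative, and the ratio is only a rough figure of merit rather than a metric used by Zhang et. al.

```python
# Average rows of Table 3 (overhead, %) and Table 4 (improvement, %).
overhead = {"DIRI": 6.03, "REIDI": 24.47, "AREIDI": 24.51, "REBIDI": 33.94}
improvement = {"DIRI": 40.22, "REIDI": 72.98, "AREIDI": 73.49, "REBIDI": 76.56}

# Percentage points of improvement gained per percentage point of overhead.
ratio = {name: improvement[name] / overhead[name] for name in overhead}

for name, value in sorted(ratio.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {value:.2f}")
```

DIRI comes out at roughly 6.7, which the text rounds to sevenfold, while the other three algorithms gain between about 2 and 3 points of improvement per point of overhead.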
Conclusion
Power analysis attacks exploit the power consumption of ICs to recover secret keys from
devices such as smart cards and FPGAs. This power consumption is known as leakage. The
prevalent leakage models are Hamming weight and Hamming distance. In the Hamming weight
model, significant power is consumed when a bit ends up as 1. In the Hamming distance model,
significant power is consumed only when a bit switches, from 0 to 1 or from 1 to 0.
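As a minimal sketch of the two leakage models, the Python below computes both quantities for a register update. The register values are hypothetical and chosen only for illustration; the function names are not taken from any of the cited papers.

```python
def hamming_weight(value: int) -> int:
    """Number of bits set to 1 in a non-negative integer."""
    return bin(value).count("1")


def hamming_distance(before: int, after: int) -> int:
    """Number of bit positions that switch between two register values."""
    return hamming_weight(before ^ after)


# Hypothetical 8-bit register update, chosen only for illustration.
old_value = 0b10110010
new_value = 0b10011010

print(hamming_weight(new_value))               # bits that end up as 1
print(hamming_distance(old_value, new_value))  # bits that toggled
```

For these example values the Hamming weight of the new value is 4 and the Hamming distance between the two values is 2, so the two models predict different amounts of leakage for the same operation.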
Simple power analysis (SPA) and differential power analysis (DPA) attacks are types of
power analysis attacks, with DPA being the more powerful approach. SPA involves visual
inspection of a power trace, whereas DPA uses statistics to analyze the correlation between
data and traces. For SPA, we looked at examples of finding the exponent in RSA and the key in
elliptic curve cryptography (ECC). For DPA, we looked at an example of how to find a partial
AES key for one S-box and learned how a group of researchers recovered a 168-bit triple DES
key in about three minutes. The DPA trace equation was then presented.
Following the DPA theory, the six stages of a DPA attack were discussed: set-up,
measurement, signal processing, selection function generation, averaging, and evaluation.
Set-up involves connecting a resistor or current probe in series with the device and using an
oscilloscope to capture voltage readings as the device performs cryptographic operations. A PC
is used to analyze the data. Measurement consists of capturing traces. Signal processing cleans
up the signal before analysis. Selection functions are educated guesses about the value of
cryptographic data, whether part of the key or an intermediate result. They are usually single
bit (0 or 1) and are used to separate the traces into subsets. Averaging is performed on each
subset of traces assigned in the selection function stage. Evaluation involves computing the
DPA tests to see whether the guess behind the selection function is correct. Tests with large
spikes indicate a highly probable correct value.
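The selection, averaging, and evaluation stages summarized above can be sketched on simulated data. The Python below is an illustration under strong simplifying assumptions, not the procedure of any cited paper: the substitution table is a toy 4-bit table rather than the AES S-box, the sub-key is 4 bits, and the simulated device leaks only the targeted bit at one fixed sample on top of Gaussian noise. All names and constants are hypothetical.

```python
import random

# Toy 4-bit substitution table, used only so the example stays short.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

TRACE_LEN = 20      # samples per simulated trace
LEAK_AT = 7         # sample index where the intermediate value leaks
SECRET_KEY = 0xA    # hypothetical 4-bit sub-key to recover


def selection(data: int, key_guess: int) -> int:
    """Single-bit selection function: top bit of the table output."""
    return (SBOX[data ^ key_guess] >> 3) & 1


def simulate_trace(data: int, rng: random.Random) -> list:
    """Gaussian noise everywhere, plus the targeted bit at LEAK_AT."""
    trace = [rng.gauss(0.0, 1.0) for _ in range(TRACE_LEN)]
    trace[LEAK_AT] += selection(data, SECRET_KEY)
    return trace


def differential_trace(inputs, traces, key_guess):
    """Average each subset, then subtract the averages point by point."""
    sums = [[0.0] * TRACE_LEN, [0.0] * TRACE_LEN]
    counts = [0, 0]
    for data, trace in zip(inputs, traces):
        bit = selection(data, key_guess)
        counts[bit] += 1
        for j, sample in enumerate(trace):
            sums[bit][j] += sample
    return [sums[1][j] / counts[1] - sums[0][j] / counts[0]
            for j in range(TRACE_LEN)]


rng = random.Random(2015)
inputs = [rng.randrange(16) for _ in range(2000)]
traces = [simulate_trace(d, rng) for d in inputs]

# Evaluation: the guess with the largest spike is the most likely sub-key.
peaks = {k: max(abs(v) for v in differential_trace(inputs, traces, k))
         for k in range(16)}
best_guess = max(peaks, key=peaks.get)
print(f"recovered sub-key: {best_guess:#x}")
```

In this simulation the spike for the correct guess appears at the sample where the leak was planted. With real measurements the leak position is unknown, the leakage is noisier, and the traces usually need alignment first.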
Next, several variants of DPA were presented. These include correlation power analysis,
probability distribution analysis, high-order DPA, and template attacks.
Finally, several countermeasures against power analysis attacks were discussed. These
include classical approaches such as leakage reduction (i.e., reducing the signal-to-noise
ratio), balancing (i.e., evening out the power consumption of IC operations), and amplitude and
temporal noise (i.e., reducing the SNR by adding noise); protocol-level countermeasures such as
limits on key use and key update procedures; masking (i.e., Boolean and arithmetic); and
several randomized algorithms.
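As a brief illustration of the two masking styles named above, the Python below applies a Boolean mask and an additive arithmetic mask to one value and then removes them. It is a sketch of the masking arithmetic only, not a masked cipher. The 8-bit width, the example value, and the function names are assumptions made for the example.

```python
import secrets

N_BITS = 8              # hypothetical width of the intermediate value
MODULUS = 1 << N_BITS


def boolean_mask(x: int, m: int) -> int:
    """Boolean masking: x_m = x XOR m."""
    return x ^ m


def arithmetic_mask(x: int, m: int) -> int:
    """Additive arithmetic masking: x_m = (x + m) mod 2^n."""
    return (x + m) % MODULUS


x = 0x3C                          # illustrative intermediate value
m = secrets.randbelow(MODULUS)    # fresh random mask for each execution

xb = boolean_mask(x, m)
xa = arithmetic_mask(x, m)

# Removing the mask recovers the original value in both schemes.
assert xb ^ m == x
assert (xa - m) % MODULUS == x
```

Because the mask changes on every execution, the value the device actually processes is decorrelated from x, which is why a first-order DPA test on a single point of the trace no longer reveals it.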
References
Ambrose, J. A., Ragel, R. G., & Parameswaran, S. (2012). Randomized instruction injection to
counter power analysis attacks. ACM Transactions on Embedded Computing Systems
(TECS), 11(3), 69.
Danger, J. L., Guilley, S., Barthe, L., & Benoit, P. (2011). Countermeasures against physical
attacks in FPGAs. Security trends for FPGAS (pp. 47-72) Springer.
Kocher, P., Jaffe, J., Jun, B., & Rohatgi, P. (2011). Introduction to differential power analysis.
Journal of Cryptographic Engineering, 1(1), 5-27.
Li, H., Wu, K., Xu, G., Yuan, H., & Luo, P. (2011). Simple power analysis attacks using chosen
message against ECC hardware implementations. Paper presented at the Internet Security
(WorldCIS), 2011 World Congress on, 68-72.
Lomne, V., Dehaboui, A., Maurine, P., Torres, L., & Robert, M. (2011). Side channel attacks.
Security trends for FPGAS (pp. 47-72) Springer.
Mangard, S., Oswald, E., & Standaert, F. (2011). One for all–all for one: Unifying standard
differential power analysis attacks. IET Information Security, 5(2), 100-110.
Menicocci, R., Trifiletti, A., & Trotta, F. (2013). Random interleaved pipeline countermeasure
against power analysis attacks. Paper presented at the Ph. D. Research in Microelectronics
and Electronics (PRIME), 2013 9th Conference on, 145-148.
Moradi, A., Barenghi, A., Kasper, T., & Paar, C. (2011). On the vulnerability of FPGA bitstream
encryption against power analysis attacks: Extracting keys from xilinx virtex-II FPGAs.
Paper presented at the Proceedings of the 18th ACM Conference on Computer and
Communications Security, 111-124.
Regazzoni, F., Wang, Y., & Standaert, F. (2011). FPGA implementations of the AES masked
against power analysis attacks. Paper presented at the Second International Workshop on
Constructive Side-Channel Analysis and Secure Design (COSADE 2011), 56-66.
Zhang, D., Liao, X., Qiu, M., Hu, J., & Sha, E. H.-M. (2012). Randomized execution algorithms
for smart cards to resist power analysis attacks. Journal of Systems Architecture, 58(10),
426-438. doi:10.1016/j.sysarc.2012.08.004

  • 4. 1 Introduction Power analysis attacks are a subset of side channel attacks (SCA). Power analysis attacks are based on the power consumption of integrated circuits (ICs). The other two types of SCA are timing and electromagnetic emission (Regazzoni, Wang, & Standaert, 2011, p. 56). As hardware devices such as smart cards, application specific integrated circuits (ASIC), and field programmable gate arrays (FPGA) perform encryption/decryption their transistors consume power. In the literature, this power consumption is known as a leakage. By capturing and analyzing these leakages, an attacker can obtain the secret key. The captured power measurements from these devices are called traces. There are two types of power analysis attacks: simple power analysis (SPA) and differential power analysis (DPA) (Kocher, Jaffe, Jun, & Rohatgi, 2011, p. 4-8). This paper will discuss these types of power analysis attacks and various countermeasures used to thwart them. Before delving into these topics further background is needed with respect to leakage models. Hamming Weight and Hamming Distance The two most popular leakage models are Hamming weight and Hamming distance. In the Hamming weight model, a bit value of 1 consumes a significant amount of power whereas a 0 consumes minimal power. Therefore,  Transitions 0 → 0 and 1 → 0 do not lead to significant power utilization  Transitions 0 → 1 and 1 → 1 lead to significant power utilization The Hamming distance model considers that only switching leads to power consumption. Therefore,  Transitions 0 → 0 and 1 → 1 do not lead to significant power utilization
  • 5. 2  Transitions 0 → 1 and 1 → 0 lead to significant power utilization (Lomne, Dehaboui, Maurine, Torres, & Robert, 2011, p. 58). Simple Power Analysis Attacks Simple power analysis involves visual inspection of a trace to obtain cryptographic secrets (Moradi, Barenghi, Kasper, & Paar, 2011, p. 114). SPA can also be used as a preliminary step before differential power analysis. Figure 1 shows a trace from a smart card. The trace identifies when the triple DES operation occurs. Once obtained, the attacker can zoom in on an area of interest to gain more information. Figure 1 Trace from a Smart Card (Kocher et. al., 2011, p. 12) Figure 2 shows a subset of a trace used to obtain the binary representation of the exponent b in the RSA function 𝑎 𝑏 𝑚𝑜𝑑 𝑛. In the RSA square and multiply algorithm, multiplication consumes more power and occurs when there is a 1 in the exponent. A short peak followed by a taller one represents a 1. Two short peaks indicate a 0. (Kocher et. al., 2011, Figure 3 contains the square and multiply algorithm (Kocher et. al., 2011, p. 12).
  • 6. 3 Figure 2: RSA Trace (Kocher et. al., 2011, p. 12) Figure 3: RSA Square and Multiply Algorithm (Lomne et. al., 2011, p. 60) Figure 4 shows the pattern of the elliptic curve cryptography (ECC) add-and-double algorithm. There are three operations in the algorithm: (A) addition, (D1) doubling after addition, and (D2) doubling after doubling. D2 consumes more power than D1 or A. D2 represents a 0 and A followed by D1 indicates a 1. Figure 5 shows the add-and-double algorithm. More information about ECC and the algorithm can be found in the reference. (Li, Wu, Xu, Yuan, & Luo, 2011, p. 71).
  • 7. 4 Figure 4: Trace of ECC Key (Li et. al., 2011, p. 71) Figure 5: ECC Double and Add Algorithm (Li et. al., 2011, p. 70) Differential Power Analysis Attacks Differential power analysis is a more powerful technique than SPA. In this method, a cryptanalyst uses statistics to analyze the correlation between data and traces (Zhang, Liao, Qiu, Hu & Sha, 2012, p. 426). DPA can use a known plaintext or known cipher text attack (Lomne et. al., 2011, p. 60). DPA also uses a divide-and-conquer strategy to recover different parts of the key. (Mangard, Oswald, & Standaert, 2011, p. 101). One caveat of DPA is that the traces have to be aligned. If they are not aligned because of countermeasures (to be discussed later) they must be re-synchronized. (Lomne et. al., 2011, p. 61) For example in AES, an attacker normally targets the output of AddRoundKey or SubBytes, known as intermediaries. Figure 6 shows the probability distribution of traces of the least significant bit (LSB) of the first S-box in the AES scheme. The left distribution represents 1
  • 8. 5 and the right distribution represents 0. From the figure, it can be seen that the distributions are approximately Gaussian and the mean of the 0 bit is greater than that of the 1 bit. The distributions overlap significantly, so a large number of measurements are needed in order to discriminate them. Figure 6: Distribution of Traces for LSB of 1st AES S-box (Kocher, 2011, p. 7) Figure 7 shows the DPA results of the first eight bits of a key. The number of values that can be tested per S-box ranges from 0 to 255. The five traces represent values of 101 to 105 respectively. The third trace corresponding to key 103 has the largest spikes and is the correct key. The same traces can be reused in finding the remaining parts of the key (Kocher, 2011, p. 9- 10). Figure 7: DPA Tests for K = 101 to 105 (Kocher et. al, 2011, p. 10)
  • 9. 6 An illustration of the power of a DPA attack is provided by Moradi, Barenghi, Kasper, and Paar (2011, p. 120) on their attack of a triple DES implementation on a Xilinx Virtex II FPGA. They used 50,000 traces in their analysis. With these traces they were able to recover a 6- bit DES subkey in less than 4 seconds using about 20 megabytes of memory on a PC. The whole 112-bit key of a 2-key triple DES was obtained in roughly two minutes and the 168-bit key of a 3-key triple DES was recovered in about three minutes. However, they concluded the attack could be done with as little as 25,000 traces. Differential Power Analysis Trace Equation A following equation defines a DPA test. ∆ 𝐷[𝑗] = ∑ 𝐷(𝐶𝑖, 𝐾 𝑛)𝑇𝑖[𝑗]𝑚 𝑖=1 ∑ 𝐷(𝐶𝑖, 𝐾 𝑛)𝑚 𝑖=1 − ∑ (1 − 𝐷(𝐶𝑖, 𝐾 𝑛))𝑇𝑖[𝑗]𝑚 𝑖=1 ∑ (1 − 𝐷(𝐶𝑖, 𝐾 𝑛))𝑚 𝑖=1 Where: ∆ 𝐷[𝑗] is the differential trace at the jth time offset Ti[j] is the power measurement at the jth time offset within the trace Ti Ci is the set of known inputs or outputs for the ith trace Kn is a guess of part of the key D(Ci, Kn) is a binary valued selection function with input Ci and Kn The value of Kn that produces the largest spikes in the differential trace is considered to be the most likely candidate for the correct key (Kocher et. al., 2011, p. 10). Stages of Differential Power Analysis Attack There are six steps in a DPA attack: set-up, measurement, signal processing, prediction and selection function generation, averaging, and evaluation. These are discussed below.
  • 10. 7 Set-up In the set-up stage, the equipment needed to get information from a smart card, ASIC, or FPGA is set up. The auxiliary equipment consists of a resistor or current probe, oscilloscope, and personal computer (PC). For a smart card, the resistor or current probe is connected in series with the ground line. For a FPGA, the resistor or current probe is connected in series with the power input. If the device has an internal battery, a resistor is not needed. The following tips will ensure better data collection. First, taking measurements near the IC improves the quality of the traces. Second, removing the decoupling capacitors on the board or using a power supply will reduce noise. Third, operating the device near its peak voltage or clock rate can reduce the effect of countermeasures. Fourth, increasing the input message length will drive up the number of operations per trace and speed up data collection (Kocher et. al., 2011, p. 14-15). Measurement In the measurement stage traces are collected by an oscilloscope and fed to a PC for statistical processing along with the associated plaintext or cipher text. The biggest issue affecting measurement is the signal-to-noise ratio (SNR). As the SNR decreases, the number of traces needed increases. This will be discussed later in the Countermeasures section. Sampling error is a concern but less pronounced. Quality can be improved by adding analog filters or adjusting bandwidth or sampling rates. The quantity of traces needed can be reduced by first inspecting SPA traces in order to remove irrelevant regions or using a preliminary DPA test using known input and output bits. However, the preliminary test only works if the key is known.
  • 11. 8 The best scope to use for data collection would be one with deep memory for capturing longer traces, trigger flexibility to start capturing the trace at the right time, and rapid trigger re- arming time which can help speed up data collection (Kocher et. al., 2011, p. 15). Signal Processing Signal processing is used to remove alignment errors, isolate and highlight areas of interest, and reduce noise. In most cases, only time alignment is needed. Traces with good alignment reduce the amount of traces needed for analysis. Alternately, DPA could be converted to the frequency domain using Fourier transform before analysis. Another processing technique is compression which will reduce the number of traces needed, reduce noise, and amplify signal resolution. (Kocher et. al., 2011, p. 15-16). Selection Function Generation A selection function is used to assign traces to subsets. They are educated guesses as to the possible value in one or more intermediaries in a cryptographic calculation. Intermediaries, as previously discussed, would include interim results from a cryptographic operation such as output from the AddRoundKey stage in AES. Usually selection functions are single bit (0 or 1) but can be multi-bit (Kocher et. al., 2011, p. 16). Averaging In the averaging stage, each trace subset created in the selection function stage is averaged. This is the most computationally intensive step. Offsets are normally used during the averaging process to maintain independence from the underlying device. Two constrains affecting the averaging performance are processing power and storage throughput. The complexity of averaging is 𝑂(𝑁 ∙ 𝑀 ∙ 𝐿) where N is the number of traces, L is the length of traces, and M is the number of selection functions. The memory use is 𝑂(𝑀 ∙ 𝐿).
  • 12. 9 To improve this stage, two optimizations are available. The first is to calculate the average of all traces (Atotal) and the average of the subset (A0 or A1) where the selection function is either 0 or 1. The average of the other subset can then be found by subtracting A0 or A1 from Atotal. While the complexity is the same, the memory use is cut in half. The second optimization is to use a cache during calculations. For instance, a cache size of 28 – 1 can handle eight traces at a time and compute the sums of 255 traces. Then, the preprocessed traces can be added to the previous averages in bulk. A cache size of c requires 𝑂(𝑐 ∙ 𝐿) memory and takes 𝑂(2 𝑐 ∙ 𝐿) operations to set up. Performance using a cache is improved to 𝑂 ( 𝑁 𝑐 ∙ 𝑀 ∙ 𝐿 + 2 𝑐 ) Furthermore, decreasing the trace size can speed up averaging time. As mentioned earlier, signal processing techniques such as removing irrelevant data or compression can help in this regard. Also, the averaging task can be run in parallel over multiple drives, threads, or machines. (Kocher et. al., 2011, p. 17). Evaluation In the evaluation stage, the mean difference between the sets of traces in the averaging stage is calculated. Any significant deviations will appear as spikes in the output which signify that the selection function is correct and hence the key, sub-key, or intermediate value is correct. For regions of the trace affected by significant noise, spurious spikes can appear in a differential trace. These are known as harmonics. To compensate for these, the trace can be normalized by dividing each point in the trace by the standard deviation (Kocher et. al., 2011, p. 17-18). This is analagous to calculating z-values for students of statistics
  • 13. 10 Variants of Differential Power Analysis Correlation Power Analysis Correlation power analysis (CPA) is not as powerful as DPA. It is based on the correlation between power consumption and model of energy consumption of a device. As already mentioned, Hamming weight and Hamming distance are the most prevalent models. The following equation is the correlation function between power consumption and Hamming weight (or Hamming distance). It is proportional to the correlation between power and energy. 𝜌 𝑊𝐻,𝑘(𝐵) = 𝐸(𝑊𝐻 𝑘) − 𝐸(𝑊)𝐸(𝐻 𝑘) 𝜎 𝑤 𝜎 𝐻 𝑘 Where: W is the power consumption Hk is the Hamming weight or Hamming distance of the kth key E(WHk), E(W), and E(Hk) are the expected value of WHk, W, and Hk 𝜎 𝑤, 𝜎 𝐻 𝑘 are the variances of W and Hk The number of traces to use depends on factors such as clock frequency, resolution, and acquisition time (Menicocci, Trifiletti, & Trotta, 2013, p. 146). Correlation power analysis is most effective in white-box analysis where the device leakage model is known. However, it can also be used for black-box attacks as long as there is some correlation between the leakage of the device and its leakage model. CPA is ideal when the number of traces is limited (Kocher et. al., 2011, p. 19). Probability Distribution Analysis Probability distribution analysis (PDA) is a form of DPA in which traces are preprocessed before being analyzed. First, the average of all traces is computed. This value is
  • 14. 11 subtracted from each trace. The remaining data points are squared. The analysis performed on the preprocessed traces is equivalent to comparing the variances of the data at each point, rather than their means. An attacker would use PDA in a situation where a countermeasure has been used that causes the differential trace to be flat (Kocher et. al., 2011, p. 19-20). High-Order Differential Power Analysis High-Order DPA is a method that analyzes the relationship between multiple parameters. The number of parameters used indicates the order of the attack. High-order DPA can help analyze relationships such as: Similarity/difference: The calculations at different points in one or more traces may use the same parameter. High-order functions that measure correlation or covariance can be used to detect these relationships. High-order DPA has been used successfully in attacks on RSA and ECC. Masked shares of a secret: This is a second order DPA attack used to attack the masking countermeasure (discussed in the Countermeasures section). An example would be a mask that manipulates an intermediate such as SubBytes in AES as two parts, say A and B. A and B are random, but 𝐴⨁𝐵 would reveal the intermediate value. A high-order function can combine a measurement correlated to the first part with a measurement correlated to the second part, so that the combination is correlated to the sensitive variable. (Kocher et. al., 2011, p. 20). Template Attack In a template attack, an analyst builds an IC identical to a target IC then compares traces between them (Lomne et. al., 2011, p. 66). The advantage of template attacks is that it can be used when the number of traces is small (Kocher et. al., 2011, p. 21). The attack occurs in two stages: template building stage and template matching stage. In the building stage, template ICs
  • 15. 12 for pairs of data and keys are built. The traces are modelled as a multivariate normal distribution. For more information about the underlying theory see the reference. In the matching stage, the probability density function of the multivariate normal distribution of the template is compared to the target device. (Lomne et. al., 2011, p. 66-67) Smartphones are a popular target of these attacks (Kocher et. al., 2011, p. 21). Countermeasures There are three ways to mitigate power analysis attacks: decrease the signal-to-noise ratio, balance the power used by ICs, manipulate the data before encryption, and randomize operations. These counter measures can be implemented in hardware or software. The following subsections will discuss some of these methods. Leakage Reduction In the leakage reduction countermeasure, the signal-to-noise ratio is reduced by either decreasing the signal or increasing the noise. In terms of effectiveness, a factor k reduction in the SNR means that k2 more traces are needed for an attack (Kocher et. al., 2011, p. 21). Balancing The goal of balancing is to even the power used by ICs while performing cryptographic operations. Some techniques include using multi-bit data representations and balanced transistions, dual-rail precharge logic, current mode, and asynchronous logic styles. These are ongoing areas of research (Kocher et. al., 2011, p. 21). The dual-rail precharge method has two stages. The precharge stage resets all signals to a known state. The evaluation stage does computations with a fixed number of transitions. The trade-off of this scheme is that it doubles the complexity of implementation. (Danger, Guille, Barthe, Benoit, 2011, p. 86-87). Balancing has the same k2 effectiveness mentioned above. (Kocher et. al., 2011, p. 21)
  • 16. 13 Amplitude and Temporal Noise Both amplitude and temporal noise alter the SNR by increasing the noise component. Amplitude noise is added by using ICs that consume variable amounts of power or that perform calculations that are uncorrelated to intermediates such as SubBytes or AddRoundKey in AES. Temporal noise is added by varying timing and execution order of instructions. Methods include varying clock speed, adding random wait states, random execution, use of dummy operations, and random branching. These countermeasures can be defeated by using filters and signal processing techniques discussed earlier. It also has the same k2 effectiveness relationship mentioned in the leakage reduction section. (Kocher et. al., 2011, p. 22). Protocol-Level Countermeasures Protocol-level countermeasures involve designing cryptographic protocols that can survive leakage. Of course, it is impossible to totally prevent leakage. The power of these countermeasures is that they can compensate for less-than-perfect hardware. To illustrate, an IC using a regular protocol can be broken with a leakage rate as low as 10-9 bits per operation. Protocol-level countermeasures can survive leakage of more than 10 bits per operation–a staggering difference. One protocol-level countermeasure is to limit the number of transactions that can be performed with any given key. For example, a policy could be implemented which destroys the key after 20 transactions. This is analogous to bank pin codes which lock a user`s account after three incorrect guesses. Another method is key update procedures which update keys periodically. The key update process changes the key state so that the new key cannot be correlated to the old key. Key updates may be structured in a key-tree. A key-tree is a tree structure defined from a root key and
  • 17. 14 a set of key update transforms. Counters or other protocol constructions limit the number of times any given node is used to form transaction keys (Kocher et. al., 2011, p. 23-24). Masking The masking countermeasure conceals plaintext or an intermediate value x with a mask m which takes random values. There are two types of masks: Boolean and arithmetic. Boolean masking, shown below, uses bit wise exclusive-or. 𝑥 𝑚 = 𝑥⨁𝑚 Arithmetic masking, shown below, uses modular addition or multiplication on a finite field. 𝑥 𝑚 = 𝑥 + 𝑚 (𝑚𝑜𝑑 𝑛) or 𝑥 𝑚 = 𝑥 ∗ 𝑚 (𝑚𝑜𝑑 𝑛) where 𝑛 = 2|𝑥| = 2|𝑚| The drawback of masking is that it decreases throughput by one-half (Danger et. al., 2011, p. 76). Data masking techniques are vulnerable to second-order DPA attacks (Ambrose, Ragel, & Parameswaran, 2012, p. 5). Randomized Instruction Injection Method Ambrose et. al. (2012) claim that the countermeasures mentioned above are vulnerable to multi-order DPA or have high area, run time, and energy costs (p. 1). For example, the balancing countermeasure increases execution time by 75 percent (p. 6). To overcome these problems, they developed a randomized instruction technique (RIJID). RIJID scrambles the trace by injecting random instructions at random points in an algorithm (p. 1). There are two ways to trigger the instructions:
  • 18.
  - softRIJID: a hardware/software approach, where code injection is triggered by special instructions at runtime; and
  - autoRIJID: a hardware approach, where code injection is triggered by the processor at runtime.

RIJID differs from other approaches in that it injects real instructions such as AND and OR at random points instead of dummy instructions such as NO-OP at fixed points (p. 2). The issue with dummy instructions is that they have their own trace profile, so the traces can be resynchronized. Dummy instructions are also vulnerable to sliding window differential power analysis (SW-DPA) (p. 5).

Using softRIJID, the area of a RISC processor increased by 1.98 percent, with an increase of 29.8 percent in runtime and 27.1 percent in energy. Using autoRIJID, the area of a RISC processor increased by 1.20 percent, with an increase of 25.0 percent in runtime and 28.5 percent in energy (p. 6).

Some of the limitations of RIJID are:
  - It is a design-time approach and needs hardware changes
  - softRIJID requires compiler support
  - RIJIDindex can only compare different injection methods. It cannot compare completely different methods, such as hardware balancing against code injection
  - The design does not support processors that have advanced scalar features (p. 7)

To measure the amount of scrambling, the authors developed a metric called RIJIDindex, which is based on cross-correlation. The higher the index, the greater the amount of scrambling present. RIJIDindex can be used instead of DPA to gauge the vulnerability of a cryptosystem. The RIJID framework consists of:
  • 19.
  - Analyzing the power waveform and extracting a template. In this context, a template is an expected trace pattern (not a real trace) that would be observed based on a cryptographic algorithm such as AES. The paper has an example of a template for triple DES.
  - Applying RIJID to the system and measuring the scrambled trace
  - Using RIJIDindex to calculate the level of scrambling (p. 17)

Figure 8: Example RIJIDindex Calculation (Ambrose et al., 2012, p. 17)

Figure 8 shows an example of a RIJIDindex calculation. (a) shows an original trace. (b) shows a repeating template. (c) shows the RIJID scrambled trace. (d) shows a random trace that
  • 20. does not have a template. (e) shows the cross-correlation between the original trace and the repeating template. As can be seen in the figure, there are significant peaks at points where the template and the original match. (f) shows the cross-correlation between the template and the scrambled trace. (g) shows the cross-correlation between the template and the random trace. As can be seen from the figure, neither cross-correlation (f) nor (g) shows any significant peaks (p. 18).

The formula for the RIJIDindex is:

$\mathrm{RIJIDindex} = \dfrac{\Delta_o - \Delta_z}{\Delta_o - \Delta_r}$

The numerator is the difference between the means of the original ($\Delta_o$) and RIJID ($\Delta_z$) traces. The denominator is the difference between the means of the original and random ($\Delta_r$) traces (p. 18).

For softRIJID, Table 1 shows the RIJIDindex for various injection pairs (N, D) and cryptographic algorithms. As can be seen in the table, softRIJID has the best overall performance for RSA. The ideal injection pair for RSA is (3, 3) (p. 21).

Table 1: RIJIDindex for softRIJID (Ambrose et al., 2012, p. 21)

N   D   RSA      IDEA     RC4
3   3   0.9998   0.7204   0.9908
4   4   0.9896   0.7475   0.9523
5   5   0.9511   0.9738   0.9998
6   6   0.9607   0.7571   0.9364

For autoRIJID, Table 2 shows the RIJIDindex for various cryptographic algorithms using an injection pair of (5, 5). As can be seen in the table, autoRIJID has the best performance for RSA (p. 24).

Table 2: RIJIDindex for autoRIJID (Ambrose et al., 2012, p. 24)

Algorithm    RIJIDindex
TripleDES    0.7040
Blowfish     0.9622
  • 21.
Algorithm    RIJIDindex
Rijndael     0.9495
SHA          0.7096
RSA          0.9980

Randomized Execution Algorithms

Zhang et al. (2012) developed four algorithms to resist power analysis attacks. They are:
  - Dummy Instructions Random Insertion (DIRI)
  - Randomized Execution with Independent Dummy Instructions (REIDI)
  - Advanced Randomized Execution with Independent Dummy Instructions (AREIDI)
  - Randomized Execution with Binding and Independent Dummy Instructions (REBIDI)

The underlying details of the algorithms are complex, so only their experimental results will be presented. The researchers used MiBench, a free benchmark suite, to obtain the results below.

Table 3 is adapted from a larger table published in the paper. The table lists only the extra time overhead of each algorithm relative to the unprotected benchmarks, as a percentage. Observe that all the candidate algorithms add overhead to the respective benchmarks; however, DIRI performs better than the others, with an average overhead of 6.03 percent in excess of the benchmarks. REBIDI is the worst performer. It has an average overhead approximately one-third more than the benchmarks. REIDI and AREIDI are approximately the same at about 24.5 percent.

Table 3: Comparison of Overhead Time for DIRI, REIDI, AREIDI, and REBIDI (Zhang et al., 2012, p. 435)

                 Overhead (%)
Benchmarks       DIRI    REIDI   AREIDI  REBIDI
SHA              5.88    27.01   23.52   29.55
Rijndael         5.98    25.20   18.86   27.88
  • 22.
Blowfish         6.44    26.71   19.37   33.70
FFT              4.91    26.73   21.68   39.60
CRC32            7.57    21.99   34.28   39.95
Adpcm            5.12    23.40   23.56   30.36
Dijkstra         6.25    24.50   28.38   32.83
Qsort (small)    7.73    25.68   27.05   39.55
Qsort (large)    5.24    23.08   25.64   35.12
stringsearch     5.22    20.40   22.77   30.83
Average          6.03    24.47   24.51   33.94

Likewise, Table 4 is adapted from a larger table in the paper. It shows the improvement of each candidate algorithm in terms of security, as measured by the unbiased variance of the power. Observe that REBIDI has the most improvement over the benchmarks, with an average of 76.56 percent. DIRI has the lowest average improvement at 40.22 percent.

Table 4: Comparison of Unbiased Variance of DIRI, REIDI, AREIDI, and REBIDI (Zhang et al., 2012, p. 435)

                 Improvement (%)
Benchmarks       DIRI    REIDI   AREIDI  REBIDI
SHA              35.81   77.99   79.19   84.43
Rijndael         40.26   79.43   80.20   83.13
Blowfish         50.70   69.27   69.81   71.03
FFT              47.06   73.97   76.79   77.97
CRC32            35.63   72.64   72.57   73.69
Adpcm            34.36   73.59   70.55   74.09
Dijkstra         46.11   72.55   75.88   76.54
Qsort (small)    35.39   68.18   70.46   75.43
Qsort (large)    42.31   68.26   67.24   74.42
stringsearch     34.56   73.90   72.25   75.21
Average          40.22   72.98   73.49   76.56

Comparing both tables, a trade-off can be seen. An increase in security comes at the cost of increased overhead on the IC. For example, DIRI adds an average of only 6.03 percent overhead but has the lowest average improvement at 40.22 percent. Looking at it from a different perspective, though, DIRI has the best ratio of improvement to overhead: 40.22 / 6.03 is approximately 6.7, so for a modest increase in average overhead, the gain is approximately sevenfold.
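The trade-off can be made concrete by dividing each algorithm's average improvement (Table 4) by its average overhead (Table 3). The short Python sketch below does this with the averages from the two tables. The ratio is used here only as an illustrative figure of merit, not as a metric defined by Zhang et al., and the variable and function names are hypothetical.

```python
# Average time overhead (Table 3) and average security improvement (Table 4),
# both in percent, copied from the "Average" rows of the two tables.
avg_overhead = {"DIRI": 6.03, "REIDI": 24.47, "AREIDI": 24.51, "REBIDI": 33.94}
avg_improvement = {"DIRI": 40.22, "REIDI": 72.98, "AREIDI": 73.49, "REBIDI": 76.56}


def gain_ratio(name):
    """Improvement obtained per percent of overhead (illustrative metric only)."""
    return avg_improvement[name] / avg_overhead[name]


# Ratio for every algorithm, rounded to two decimals for display.
ratios = {name: round(gain_ratio(name), 2) for name in avg_overhead}

# Algorithm with the highest improvement-to-overhead ratio.
best = max(ratios, key=ratios.get)

for name, ratio in ratios.items():
    print(f"{name:7s} {ratio:5.2f}")
print("Best ratio:", best)
```

Under this measure DIRI comes out ahead even though it offers the smallest absolute improvement, while REBIDI, the strongest countermeasure in absolute terms, has the lowest ratio.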
  • 23. Conclusion

Power analysis attacks use the power consumption of ICs to steal secret keys from devices such as smart cards and FPGAs. The information revealed by this power consumption is known as leakage. The prevalent leakage models are Hamming weight and Hamming distance. The Hamming weight model assumes maximum power usage for an operation in which a bit ends up as 1. The Hamming distance model assumes maximum power usage for a switching operation in which a bit changes from 0 to 1 or from 1 to 0.

Simple power analysis (SPA) and differential power analysis (DPA) are two types of power analysis attack, with DPA being the more powerful approach. SPA involves visual inspection of a power trace, whereas DPA uses correlation between data and traces. For SPA, we looked at examples of finding the exponent in RSA and the key in elliptic curve cryptography (ECC). For DPA, we looked at an example of how to find a partial AES key for one S-box and learned how a group of researchers cracked a triple DES key in three minutes. Next, the DPA trace equation was presented.

Following the DPA theory, the stages of a DPA attack were discussed. The stages are set-up, measurement, signal processing, selection function generation, averaging, and evaluation. Set-up involves connecting a resistor in series with the device and using an oscilloscope to capture voltage readings as the device performs cryptographic operations. A PC is used to analyze the information. Measurement consists of capturing traces. Signal processing is used to clean up the signal before analysis. Selection functions are educated guesses about the cryptographic data, whether part of the key or an intermediate result. Selection functions are usually single bit (0 or 1) and are used to separate the traces into sets. Averaging is performed on each subset of traces assigned in the selection function step. Evaluation involves computing DPA
  • 24. tests to see if the selection function is correct. Tests with large spikes indicate a highly probable correct value.

Next, several variants of DPA were presented. These include correlation power analysis, probability distribution analysis, high-order DPA, and the template attack.

Finally, several countermeasures to prevent power analysis attacks were discussed. These include classical approaches such as leakage reduction (i.e., reducing the signal-to-noise ratio), balancing (i.e., evening out the power consumption of IC operations), and amplitude and temporal noise (i.e., reducing the SNR); protocol-level countermeasures such as setting key usage limits and key update procedures; masking (i.e., Boolean and arithmetic); and several randomized algorithms.

References

Ambrose, J. A., Ragel, R. G., & Parameswaran, S. (2012). Randomized instruction injection to counter power analysis attacks. ACM Transactions on Embedded Computing Systems (TECS), 11(3), 69.
Danger, J. L., Guilley, S., Barthe, L., & Benoit, P. (2011). Countermeasures against physical attacks in FPGAs. Security trends for FPGAS (pp. 47-72). Springer.
Kocher, P., Jaffe, J., Jun, B., & Rohatgi, P. (2011). Introduction to differential power analysis. Journal of Cryptographic Engineering, 1(1), 5-27.
Li, H., Wu, K., Xu, G., Yuan, H., & Luo, P. (2011). Simple power analysis attacks using chosen message against ECC hardware implementations. Paper presented at the Internet Security (WorldCIS), 2011 World Congress on, 68-72.
Lomne, V., Dehaboui, A., Maurine, P., Torres, L., & Robert, M. (2011). Side channel attacks. Security trends for FPGAS (pp. 47-72). Springer.
  • 25.
Mangard, S., Oswald, E., & Standaert, F. (2011). One for all–all for one: Unifying standard differential power analysis attacks. IET Information Security, 5(2), 100-110.
Menicocci, R., Trifiletti, A., & Trotta, F. (2013). Random interleaved pipeline countermeasure against power analysis attacks. Paper presented at the Ph. D. Research in Microelectronics and Electronics (PRIME), 2013 9th Conference on, 145-148.
Moradi, A., Barenghi, A., Kasper, T., & Paar, C. (2011). On the vulnerability of FPGA bitstream encryption against power analysis attacks: Extracting keys from Xilinx Virtex-II FPGAs. Paper presented at the Proceedings of the 18th ACM Conference on Computer and Communications Security, 111-124.
Regazzoni, F., Wang, Y., & Standaert, F. (2011). FPGA implementations of the AES masked against power analysis attacks. Paper presented at the Second International Workshop on Constructive Side-Channel Analysis and Secure Design (COSADE 2011), 56-66.
Zhang, D., Liao, X., Qiu, M., Hu, J., & Sha, E. H. -. (2012). Randomized execution algorithms for smart cards to resist power analysis attacks. Journal of Systems Architecture, 58(10), 426-438. doi:10.1016/j.sysarc.2012.08.004