Generalized Division-Free Architecture and Compact Memory Structure for Resampling in Particle Filters
Syed Asad Alam and Oscar Gustafsson
{syed.asad.alam, oscar.gustafsson}@liu.se
Department of Electrical Engineering, Linköping University, Sweden
Aims and Objectives
• The most basic form of resampling in a particle filter has a high hardware cost
• Requires a normalized and ordered data set for implementation → multinomial resampling
• Alternative algorithms are used to avoid multinomial resampling → stratified, systematic resampling
• Aim
– Architecture for multinomial resampling free
from the need of ordering and normalization
– Memory optimization for the weights and
random function
Particle Filters
• Model-based filtering
– State-transition and observation models may be non-linear and the noise non-Gaussian
• Purpose → Estimation of a state from a set of
observations corrupted by noise
• Applications → Target tracking, computer vision,
channel estimation . . .
• Steps → Time-update, weight computation and
resampling
Figure 1: Overall structure of the particle filter — input observations yn drive the time update (xn), weight computation (wn) and resampling (˜xn, ˜wn) stages, which produce the output estimate.
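The three steps above can be sketched as one iteration of a bootstrap particle filter. This is an illustrative Python sketch only: the state-transition function f, observation function h, the noise levels, and the function name are hypothetical stand-ins, and the resampling shown is plain multinomial.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(x, w, y, f, h, process_noise_std, meas_noise_std):
    """One iteration of a bootstrap particle filter (illustrative sketch).

    x : (M,) particle states, w : (M,) weights, y : scalar observation.
    f, h : hypothetical state-transition and observation functions.
    """
    M = len(x)
    # (1) Time update: propagate every particle through the state model.
    x = f(x) + rng.normal(0.0, process_noise_std, M)
    # (2) Weight computation: scale weights by the observation likelihood.
    w = w * np.exp(-0.5 * ((y - h(x)) / meas_noise_std) ** 2)
    # (3) Resampling: replicate high-weight, discard low-weight particles;
    #     the particle count M stays the same and weights reset to 1/M.
    idx = rng.choice(M, size=M, p=w / w.sum())
    x, w = x[idx], np.full(M, 1.0 / M)
    estimate = x.mean()
    return x, w, estimate
```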
Figure 2: Basic architecture of the particle filter: (i) time update (particle memory, time update and sample generation units); (ii) weight memory and random number generation (normalization and cumulative sum unit, random number generation unit); (iii) resampling (comparator, replicated/discarded-particle memory and replication factors), all coordinated by a control unit.
Resampling
Resampling prevents degeneration of the particle set and improves estimation by discarding particles with low weights and replicating particles with high weights. The total number of particles, M, remains the same.
Figure 3: Uniformly distributed samples on [0, 1) for M = 10 under systematic, stratified and multinomial resampling.
Standard resampling algorithms in the literature:
• Multinomial → M independent uniform random numbers on U[0, 1)
• Stratified → Partition U[0, 1) into M regions; one sample from each interval with a random offset
• Systematic → Similar to stratified resampling, but the offset is fixed across all intervals
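The three sample-generation schemes can be sketched as follows (a minimal Python illustration; the function name and interface are hypothetical). Note how stratified and systematic points come out inherently ordered, while multinomial draws do not:

```python
import numpy as np

def resampling_points(M, scheme, rng=None):
    """Generate the M sample points in [0, 1) used by each resampling scheme."""
    if rng is None:
        rng = np.random.default_rng()
    if scheme == "multinomial":
        # M independent U[0, 1) draws -- unordered; a sequential comparison
        # pass needs them sorted (or generated in order).
        return rng.uniform(0.0, 1.0, M)
    if scheme == "stratified":
        # Partition [0, 1) into M strata; one random offset per stratum,
        # so the points come out inherently ordered.
        return (np.arange(M) + rng.uniform(0.0, 1.0, M)) / M
    if scheme == "systematic":
        # Same partition, but a single shared random offset.
        return (np.arange(M) + rng.uniform()) / M
    raise ValueError(f"unknown scheme: {scheme}")
```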
Proposed Idea
Background
• The complexity of multinomial resampling can be reduced from M² to approximately 2M by generating ordered random numbers → high hardware cost
• Accumulation and normalization provide an intrinsic ordering, reducing the hardware cost
• The comparison needed to replicate and discard particles can be formulated as
\[
\frac{W_K}{W_M} \lessgtr \frac{U_K}{U_M} \qquad (1)
\]
where $W_K = \sum_{j=0}^{K} w_j$ and $U_K = \sum_{j=0}^{K} u_j$ are the running sums, and $W_M$ and $U_M$ are the total cumulative sums.
Division-Free Architecture
Reformulation of (1) gives:
\[
\underbrace{W_K \times U_M}_{W'_K} \;\lessgtr\; \underbrace{U_K \times W_M}_{U'_K} \qquad (2)
\]
• No normalization required
• Equally efficient for non-powers-of-two M
• Independent of generating ordered random numbers
• Can be used for stratified and systematic resampling
with appropriate random number generators
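A software sketch of the resulting comparison loop (illustrative only, not the hardware datapath; the function name is hypothetical): each output slot advances a weight pointer until the cross-multiplied test (2) holds, so neither normalization nor division is ever needed.

```python
def division_free_resample(w, u):
    """Division-free resampling via the cross-multiplied comparison (2).

    w : unnormalized particle weights; u : positive random increments whose
    running sum plays the role of the inherently ordered sample points.
    Returns, for each of the M output slots, the index of the kept particle.
    """
    M = len(w)
    WM, UM = sum(w), sum(u)           # total cumulative sums
    WK, UK = w[0], 0.0                # running sums
    k, out = 0, []
    for i in range(M):
        UK += u[i]                    # next inherently ordered sample point
        # Advance the pointer while WK/WM < UK/UM, evaluated without
        # division as W'_K = WK*UM < U'_K = UK*WM.
        while WK * UM < UK * WM and k < M - 1:
            k += 1
            WK += w[k]
        out.append(k)                 # particle k is replicated
    return out
```

With equal increments u, the loop reduces to systematic resampling; with random positive increments it behaves like multinomial resampling on ordered samples, matching the bullet above.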
Figure 4: Memory and data generation for resampling with stored cumulative sum: accumulators ahead of the weight and random-value memories store the running sums, requiring word lengths of Bw + log2(M) and Br + log2(M) bits, from which W'K and U'K are formed using the totals WM and UM under control-unit sequencing.
Memory Optimization
• Storing the cumulative sums increases the word-length requirement of the two memories
• The word lengths can be reduced from Bw + log2(M) and Br + log2(M) to Bw and Br, respectively, by on-the-fly accumulation
• Accumulators are placed after each memory
• Extra hardware cost: a multiplexer and associated control logic
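The effect of moving the accumulator behind the memory can be sketched in a few lines (illustrative Python with hypothetical names, not the hardware): both variants deliver identical running sums at the read port, but the online variant only ever stores the raw Bw-bit weights.

```python
from itertools import accumulate

def stored_sum_reads(w):
    """Wide memory: each word already holds the cumulative sum,
    i.e. Bw + log2(M) bits per word in hardware."""
    mem = list(accumulate(w))         # cumulative sums written to memory
    return [mem[k] for k in range(len(w))]

def online_sum_reads(w):
    """Narrow memory: raw Bw-bit weights; an accumulator after the
    read port rebuilds the running sum on the fly."""
    acc, out = 0, []
    for wk in w:
        acc += wk                     # on-the-fly accumulation
        out.append(acc)
    return out
```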
Figure 5: Memory and data generation for resampling with on-the-fly cumulative sum: the memories hold the raw weights and random values (Bw and Br bits), and accumulators after each memory rebuild the running sums used to form W'K and U'K.
Results
Complexity – Standard Cells
Table 1: Complexity, in terms of area (mm²), of architectures based on stored and online sum.

Particle count  Bit growth  Stored sum  Online sum  Savings (%)
            10           4       0.022       0.014        36.36
            20           5       0.035       0.019        45.71
           100           7       0.112       0.088        21.43
           128           7       0.114       0.088        22.81
           200           8       0.220       0.153        30.45
           256           8       0.220       0.154        30.00
           512           9       0.441       0.291        34.01
          1000          10       0.833       0.550        33.97
          1024          10       0.857       0.555        35.24
          2000          11       1.703       1.103        35.23
          2048          11       1.731       1.105        36.16
Complexity – FPGA
Figure 6: Number of look-up tables (LUTs) used by the architectures based on stored and online sum, for particle counts M from 512 to 20k.
Figure 7: Number of 36 kb BRAMs used by the architectures based on stored and online sum, for particle counts M from 512 to 20k.
Summary
• Proposed a generalized division-free architecture for the resampling stage
• Works for non-power-of-two numbers of particles and requires neither normalization nor generation of ordered random numbers
• Achieved by using a pair of multipliers and accumulators
• The memory optimization reduces area and memory usage by up to 45% and 50%, respectively
• Achieved by on-the-fly accumulation of the particle weights and random numbers
• Each memory holds the original particle weights and random numbers, reducing the word length required for each memory