Memory access control in multiprocessor for real-time system with mixed criticality

Memory Access Control in
Multiprocessor for Real-time Systems
with Mixed Criticality

Heechul Yun+, Gang Yao+, Rodolfo Pellizzoni*,
Marco Caccamo+, Lui Sha+
University of Illinois at Urbana-Champaign+
University of Waterloo*

Multi-core Systems
• Mainstream in smartphones
– Dual/quad-core smartphones
– More performance with less power
– Example: Tegra 3 (4 cores)

• Traditional embedded/real-time domains
– Avionics companies are investigating multi-core processors [Nowotsch12]
• e.g., the 8-core P4080 processor from Freescale

[Nowotsch12] “Leveraging Multi-Core Computing Architectures in Avionics”, EDCC, 2012

Challenge

[Figure: Core1 to Core4 connected through a shared system bus to DRAM]

• Timing isolation is hard to achieve

Challenge
[Figure: App 1 to App 4 running on the four cores, which share the system bus and DRAM]

• Cores compete for shared HW resources
– System bus, DRAM controller, shared cache, …

Effect of Memory Contention
[Figure: bar chart of run-time increase (%), on a 0% to 70% axis, for 429.mcf, 471.omnetpp, 473.astar, 433.milc, and 470.lbm; each benchmark runs on one core while membomb runs on another core, and both share the system bus and memory]

• A 60% WCET increase is unacceptable
• Run-time increase due to contention
– Five SPEC2006 benchmarks
– Compared to solo execution

Goal
• Mechanism to control memory contention
– Software-based controller for COTS multi-core processors

• Response time analysis that accounts for memory contention effects
– Based on the proposed software-based controller


Outline
• Motivation
• Memory Access Control System
• Response Time Analysis
• Evaluation
• Conclusion


System Architecture

[Figure: per-core memory bandwidth controllers (part of the OS) with example shares of 20%, 30%, 10%, and 40%, above the system bus and DRAM]

• Assign memory bandwidth to each core using a per-core memory bandwidth controller

Memory Bandwidth Controller
• Periodic server for the memory resource

• Periodically monitors the memory accesses of the core and enforces the user-specified bandwidth using the OS scheduler
– Monitoring can be done efficiently using the per-core hardware performance counter
– Bandwidth = # memory accesses × avg. access time
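
As a rough illustration of the relation above, the following sketch converts a bandwidth share into a per-period access budget. It assumes the share is given as a percentage of the regulation period and that every access costs a fixed amount of time. The function name and its parameters are hypothetical and are not part of the kernel implementation described in these slides.

```python
def access_budget(share_percent, period, access_time):
    """Return how many memory accesses fit in `share_percent` of one period.

    Assumption of this sketch: every access takes `access_time` time units.
    Integer arithmetic keeps the result exact.
    """
    if not 0 <= share_percent <= 100:
        raise ValueError("share_percent must be between 0 and 100")
    if period <= 0 or access_time <= 0:
        raise ValueError("period and access_time must be positive")
    # Time reserved per period is share_percent/100 * period; dividing by the
    # per-access cost and rounding down gives whole accesses.
    return (share_percent * period) // (100 * access_time)


# A 20% share of a 10-unit period with 1-unit accesses.
print(access_budget(20, 10, 1))
```

Rounding down is a deliberate choice here: it keeps the core at or below its assigned share rather than slightly above it.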


Memory Bandwidth Controller
• Period: 10 time units, Budget: 2 memory accesses
– Each memory access takes 1 time unit

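[Figure: timeline from 0 to 20 with period boundaries at 10 and 20, showing the remaining budget (2, 1), the task's computation and memory fetch segments, and the points labeled "Enqueue tasks" and "Dequeue tasks"]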
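
The behavior in this example (period 10, budget 2, one time unit per access) can be approximated with a small simulation. This is a minimal sketch, not the kernel implementation: the trace encoding ('c' for a unit of computation, 'm' for a memory fetch) and the function name are assumptions made for illustration. It also assumes the task is dequeued as soon as the budget reaches zero and enqueued again at the next period boundary.

```python
def simulate_regulated_core(trace, period, budget):
    """Simulate a core throttled by a periodic memory-access budget.

    trace  : string of work units, 'c' = computation, 'm' = memory fetch;
             each unit takes one time unit (hypothetical encoding).
    period : replenishment period in time units.
    budget : memory fetches allowed per period.
    Returns a timeline string with one character per time unit, where '.'
    marks a unit in which the task is dequeued (throttled).
    """
    if period <= 0 or budget <= 0:
        raise ValueError("period and budget must be positive")
    timeline = []
    remaining = budget
    pos = 0  # next unit of the trace to execute
    t = 0    # current time
    while pos < len(trace):
        if t % period == 0:
            remaining = budget    # new period: replenish, task is enqueued again
        if remaining == 0:
            timeline.append(".")  # budget exhausted: task stays dequeued
        else:
            op = trace[pos]
            timeline.append(op)
            pos += 1
            if op == "m":
                remaining -= 1    # each fetch consumes one unit of budget
        t += 1
    return "".join(timeline)


print(simulate_regulated_core("cmcmccmcm", period=10, budget=2))
# prints: cmcm......ccmcm
```

In the printed timeline the task uses both fetches by time 4, idles until the period boundary at time 10, and then resumes with a fresh budget.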

Outline
• Motivation
• Memory Bandwidth Control System
• Evaluation
• Conclusion


System Model
[Figure: one critical core and several interfering cores; the interfering cores are regulated by the memory bandwidth controller; all cores share the system bus and DRAM]

• Cores are partitioned based on criticality
• The critical core runs periodic real-time tasks under a fixed-priority scheduling algorithm
• Interfering cores run non-critical workloads and are regulated by the proposed memory access controller

Assumptions
[Figure: critical core and interfering cores, with the memory bandwidth controller on the interfering cores, sharing the system bus and DRAM]

• Private or partitioned last level cache (LLC)
• Round-robin bus arbitration policy
• Memory access latency is constant
• 1 LLC miss = 1 DRAM access

Simple Case: One Interfering Core
[Figure: a critical core and a single interfering core; the interfering core is regulated by a memory bandwidth controller; both share the system bus and DRAM]

• Critical core: the core under analysis
• Interfering core: the core generating memory interference

Problem Formulation
• Given a periodic real-time task set $T = \{\tau_1, \tau_2, \ldots, \tau_n\}$ on a critical core
• Problem:
– Determine whether $T$ is schedulable on the critical core, given the memory access control budget $Q$ and period $P$ on the interfering core

Task Model

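[Figure: execution timeline of one task with total length C, alternating computation and memory fetch (cache stall) segments; CM counts the memory fetches]

– $C$: WCET of a task on an isolated core (no interference)
– $CM$: number of last-level cache misses (DRAM accesses)
– $L$: stall time of a single cache miss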

Memory Interference Model

– $P$: memory access controller period
– $Q$: memory access time budget
– $\alpha^u(t)$: linearized upper bound on the interfering memory traffic

Background [Pellizzoni07]
• Accounting for memory interference
– Cache bound: maximum interference time ≤ maximum number of cache misses ($CM$) of the task under analysis × $L$
– Traffic bound: maximum interference time ≤ maximum bus time requested by the interfering core

[Figure: two timelines illustrating the cache-bound case and the traffic-bound case]
– $C$ (in the figure): WCET accounting for memory stall delay
– $L$: stall time of a single cache miss
[Pellizzoni07] “Toward the predictable integration of real-time cots based systems,” RTSS, 2007

Classic Response Time Analysis

$$R_i^{k+1} = C_i + \sum_{j<i} \left\lceil \frac{R_i^k}{T_j} \right\rceil C_j$$

– Tasks are sorted in priority order
• low index = high priority task
– $C_i$: WCET of task $i$ (in isolation, without memory interference)
– $R_i$: response time of task $i$
– $T_j$: period of task $j$
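
The recurrence is usually solved by fixed-point iteration starting from $R_i^0 = C_i$. The sketch below is one way to do that, assuming integer time units and zero-based task indices in priority order. The function name and the `limit` parameter (for example, the task's deadline) are hypothetical additions for illustration.

```python
import math


def classic_response_time(i, C, T, limit):
    """Iterate R = C[i] + sum_{j<i} ceil(R / T[j]) * C[j] to a fixed point.

    C, T  : lists of WCETs and periods, index 0 = highest priority.
    limit : give up once the response time exceeds this value.
    Returns the response time, or None if it exceeds `limit`.
    """
    r = C[i]
    while r <= limit:
        # Preemption from every higher-priority task released within r.
        nxt = C[i] + sum(math.ceil(r / T[j]) * C[j] for j in range(i))
        if nxt == r:
            return r  # fixed point reached
        r = nxt
    return None


C = [1, 2, 3]
T = [4, 6, 12]
print([classic_response_time(i, C, T, limit=T[i]) for i in range(len(C))])
# prints: [1, 3, 10]
```

Because the right-hand side never decreases as `r` grows, the loop either reaches a fixed point or passes `limit`, so it always terminates.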


Extended Response Time Analysis

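$$R_i^{k+1} = C_i + \sum_{j<i} \left\lceil \frac{R_i^k}{T_j} \right\rceil C_j + \min\left\{ N(R_i^k) \cdot L,\; \alpha^u(R_i^k) \right\}$$

where

$$N(t) = \sum_{j \le i} \left\lceil \frac{t}{T_j} \right\rceil CM_j, \qquad \alpha^u(t) = \frac{Q}{P}\, t + \frac{2Q(P-Q)}{P}$$

– $N(t)$: aggregated cache misses over time $t$
– $\alpha^u(t)$: interfering memory traffic over time $t$
– $P$: memory access control period
– $Q$: memory access time budget

• The proposed method achieves a tighter response time than using the WCET that accounts for memory stall delay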
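
The extended recurrence can be iterated in the same way as the classic one. The sketch below assumes zero-based task indices in priority order and uses floating-point arithmetic, so it stops when two successive iterates differ by less than a small tolerance. The function names, the `limit` parameter, the tolerance, and the example task set are all hypothetical.

```python
import math


def alpha_u(t, Q, P):
    """Linearized upper bound on interfering memory traffic over an interval t."""
    return (Q / P) * t + 2 * Q * (P - Q) / P


def extended_response_time(i, C, T, CM, L, Q, P, limit, eps=1e-9):
    """Iterate the extended recurrence for task i.

    C, T, CM : WCETs, periods, and cache-miss counts, index 0 = highest priority.
    L        : stall time of a single cache miss.
    Q, P     : budget and period of the interfering core's controller.
    limit    : give up once the response time exceeds this value.
    Returns the response time, or None if it exceeds `limit`.
    """
    r = C[i]
    while r <= limit:
        preemption = sum(math.ceil(r / T[j]) * C[j] for j in range(i))
        # Cache misses of task i and all higher-priority tasks within r.
        n_misses = sum(math.ceil(r / T[j]) * CM[j] for j in range(i + 1))
        # Interference is limited by both the cache bound and the traffic bound.
        interference = min(n_misses * L, alpha_u(r, Q, P))
        nxt = C[i] + preemption + interference
        if abs(nxt - r) < eps:
            return nxt  # converged within tolerance
        r = nxt
    return None


# Hypothetical task set: two tasks, interfering core with Q = 2 and P = 10.
C, T, CM = [1, 2], [10, 20], [2, 3]
for i in range(len(C)):
    print(i, extended_response_time(i, C, T, CM, L=1, Q=2, P=10, limit=T[i]))
```

In this example the cache bound is the smaller term for task 0 (response time 3), while the traffic bound is the smaller term for task 1, whose iteration converges toward 7.75.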


Outline
• Motivation
• Memory Bandwidth Control System
• Evaluation
• Conclusion


Linux Kernel Implementation
• Extends the CPU bandwidth reservation feature of the group scheduler
– Specify core and bandwidth (memory budget, period)
• mkdir /sys/fs/cgroup/core3; cd /sys/fs/cgroup/core3
• echo 3 > cpuset.cpus  # core 3
• echo 10000 > cpu.cfs_period_us  # period
• echo 500000 > cpu.cfs_quota_event  # cache-misses budget (added feature)

– Monitor memory usage at every scheduler tick and context switch

Experimental Platform
Intel Core2Quad

[Figure: Core 0 to Core 3, each with private L1-I and L1-D caches; two L2 caches; system bus; DRAM]

• Cores 0 and 2 were disabled to emulate a private LLC system
• Running a modified Linux 3.2 kernel
– https://github.com/heechul/linux-sched-coreidle/tree/sched-3.2-throttle-v2

Synthetic Task

[Figure: the application under analysis on Core1 and membomb on Core3, sharing the system bus and memory]

• The core under analysis runs a synthetic task with 50% memory bandwidth
• The throttling budget of the interfering core is varied from 0 to 100%
• Two findings: (1) interference can be controlled, (2) the analysis provides an upper bound (albeit still a pessimistic one)

H.264 Movie Playback

• Cache-miss counts sampled every 100 ms
• Some inaccuracy in regulation due to an implementation limitation
– The current version is more accurate because it uses a hardware overflow interrupt

Conclusion
• Shared hardware resources in multi-core systems pose major challenges for designing real-time systems

• We proposed and implemented a mechanism that provides memory bandwidth reservation capability on COTS multi-core processors

• We developed a response time analysis method that uses the proposed memory access control mechanism
