From L3 to seL4: 
What Have We Learnt in 
20 Years of L4 Microkernels? 
Kevin Elphinstone & @GernotHeiser 
NICTA and UNSW Australia 
SOSP'13
Copyright Notice 
These slides are distributed under the Creative Commons 
Attribution 3.0 License 
• You are free: 
– to share—to copy, distribute and transmit the work 
– to remix—to adapt the work 
• under the following conditions: 
– Attribution: You must attribute the work (but not in any way that 
suggests that the author endorses you or your use of the work) as 
follows: 
• “Courtesy of Gernot Heiser, [Institution]”, where [Institution] is one of 
“UNSW” or “NICTA” 
The complete license text can be found at 
http://creativecommons.org/licenses/by/3.0/legalcode
©2013 Gernot Heiser, NICTA 2 
COMP9242 S2/2014 W01
1993: Improving IPC by Kernel Design [SOSP]
1993 IPC Performance
Figure: IPC time [μs] (0 to 400) against message length [B] (0 to 6000) on an i486 @ 50 MHz. Curves: Mach, L4, raw copy.
• Mach: 115 μs
• L4: 5 μs
Culprit: Cache footprint [SOSP’95]
IPC Performance over 20 Years
Name       Year  Processor                 MHz  Cycles    μs
Original   1993  i486                       50     250  5.00
Original   1997  Pentium                   160     121  0.75
L4/MIPS    1997  R4700                     100      86  0.86
L4/Alpha   1997  21064                     433      45  0.10
Hazelnut   2002  Pentium 4               1,400   2,000  1.38
Pistachio  2005  Itanium                 1,500      36  0.02
OKL4       2007  XScale 255                400     151  0.64
NOVA       2010  i7 Bloomfield (32-bit)  2,660     288  0.11
seL4       2013  i7 Haswell (32-bit)     3,400     301  0.09
seL4       2013  ARM11                     532     188  0.35
seL4       2013  Cortex A9               1,000     316  0.32
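The Cycles and μs columns are related through the clock rate: a processor running at f MHz executes f cycles per microsecond. A minimal sketch of that conversion follows; the function name ipc_cost_us is made up for illustration and does not come from the slides.

```c
#include <assert.h>

/* Convert an IPC cost in cycles to microseconds.
 * A clock of f MHz runs f cycles per microsecond, so time = cycles / f.
 * ipc_cost_us is a hypothetical helper name, not taken from the slides. */
double ipc_cost_us(double cycles, double clock_mhz)
{
    return cycles / clock_mhz;
}
```

For the first row, ipc_cost_us(250, 50) reproduces the 5.00 μs of the original i486 kernel.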
Core Microkernel Principle: Minimality 
A concept is tolerated inside the 
microkernel only if moving it outside 
the kernel, i.e. permitting competing 
implementations, would prevent the 
implementation of the system’s 
required functionality. [SOSP’95] 
Minimality: Source Code Size (kSLOC)
Name         Architecture  C/C++   asm  Total
Original     i486              0   6.4    6.4
L4/Alpha     Alpha             0  14.2   14.2
L4/MIPS      MIPS64          6.0   4.5   10.5
Hazelnut     x86            10.0   0.8   10.8
Pistachio    x86            22.4   1.4   23.0
L4-embedded  ARMv5           7.6   1.4    9.0
OKL4 3.0     ARMv6          15.0   0.0   15.0
Fiasco.OC    x86            36.2   1.1   37.6
seL4         ARMv6           9.7   0.5   10.2
L4 Family Tree
Figure: timeline from 1993 to 2013 showing API inheritance and code inheritance between kernels.
• Kernels: L3 → L4, “X”, Hazelnut, Pistachio, L4/MIPS, L4/Alpha, L4-embedded, OKL4 μKernel, OKL4 Microvisor, seL4, Fiasco, Fiasco.OC, NOVA, Codezero, P4 → PikeOS
• Groups: GMD/IBM/Karlsruhe, UNSW/NICTA, Dresden, OK Labs, Commercial Clone
L4 Deployments – in the Billions 
SiMKo 3 “Merkelphone”
seL4: Unprecedented Dependability
Figure: proof chain from the abstract model through the C implementation to the binary code, with the security properties (confidentiality, integrity, availability) proved on top.
• Non-interference [S&P’13]
• Integrity [ITP’11]
• Functional correctness [SOSP’09]
• Translation correctness [PLDI’13]
• Timeliness [RTSS’11, EuroSys’12]
• First & only OS kernel with security proofs to binary code
• First & only protected-mode OS kernel with sound timeliness analysis
L4 Design and Implementation
Implementation Tricks [SOSP’93]
• Process kernel
• Virtual TCB array
• Lazy scheduling
• Direct process switch
• Non-preemptible
• Non-portable
• Non-standard calling convention
• Assembler
Design Decisions [SOSP’95]
• Synchronous IPC
• Rich message structure, arbitrary out-of-line messages
• Zero-copy register messages
• User-mode page-fault handlers
• Threads as IPC destinations
• IPC timeouts
• Hierarchical IPC control
• User-mode device drivers
• Process hierarchy
• Recursive address-space construction
Objective: Minimise cache footprint and TLB misses
DESIGN 
L4 Synchronous IPC
Figure: rendezvous model. Thread1 (Running, then Blocked) issues Send (dest, msg); Thread2 (Blocked, then Running) issues Wait (src, msg); the kernel copies the message between them.
Kernel executes in sender’s context 
• copies memory data directly to 
receiver (single-copy) 
• leaves message registers unchanged 
during context switch (zero copy)
“Long” IPC
Figure: kernel copy from the sender address space to the receiver address space, interrupted by a page fault.
Why have long IPC?
• POSIX-style APIs, e.g. write (fd, buf, nbytes)
• Usually prefer shared buffers
• IPC page faults are nested exceptions ⇒ in-kernel concurrency
– L4 executes with interrupts disabled for performance, no concurrency
• Must invoke untrusted user-mode page-fault handlers
– potential for DoSing the other thread
• Timeouts to avoid DoS attacks
– complexity
LONG IPC ABANDONED
Timeouts
Figure: Thread1 (Running, then Blocked) issues Send (dest, msg); Thread2 (Blocked, then Running) issues Wait (src, msg); kernel copy. Timeouts limit IPC blocking time.
Figure: timed wait. Thread1 (Running, then Blocked) issues Rcv(NIL_THRD, delay).
• No theory/heuristics for determining timeouts
• Typical use: server replies with zero timeout, everything else with ∞
• Can do timed wait with timer syscall
IPC Timeouts ABANDONED in seL4, OKL4
Synchronous IPC Issues
Figure: a single thread (Thread1) issuing Initiate_IO(…,…) and later IO_Wait(…,…): not generally possible.
Figure: two threads instead, Worker_Th and IO_Th. Calls shown: Unblock (IO_Th), Call (IO,msg), Sync(Worker_Th), Sync(IO_Th).
• Sync IPC forces multi-threaded code! 
• Also poor choice for multi-core
Asynchronous Notifications
Figure: Thread1 issues Send (Thr_2, …) twice; Thread2 collects the result with w = Poll (…) or w = Wait (…).
• Delivers few bits (destructively)
• Kernel only buffers single word
• Maps well to interrupts, exceptions
Figure: a Server receiving Sync messages from a Client and Async messages from a Driver.
• Thread can wait for synchronous and asynchronous messages concurrently
Sync IPC complemented with async
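The notification semantics listed on this slide (a few bits, a single buffered word, destructive delivery) can be sketched in a few lines of C. This is an illustration only: the type and function names are hypothetical, and merging bits with a bitwise OR is an assumption about how the single word accumulates.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical notification object: the kernel buffers one word only. */
typedef struct {
    uint32_t word;              /* accumulated, not yet delivered bits */
} notification_t;

/* Non-blocking send: merge the sender's bits into the buffered word.
 * Assumption: bits are combined with OR, so repeated sends of the
 * same bit are indistinguishable from a single send. */
void notification_send(notification_t *n, uint32_t bits)
{
    n->word |= bits;
}

/* Poll: return whatever has accumulated and clear it (destructive read).
 * Returns 0 when nothing is pending; a Wait would block instead. */
uint32_t notification_poll(notification_t *n)
{
    uint32_t w = n->word;
    n->word = 0;
    return w;
}
```

Because only one word is buffered, a sender never blocks and the kernel needs no per-message storage, which is why the scheme maps well to interrupts.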
IPC Destination Naming
Original L4 addressed IPC to threads
Figure: Client to Server IPC, directly and through a load balancer in front of workers.
• Load balancer in front of workers: all IPCs duplicated!
• Client must do load balancing?
• RPC reply from wrong thread!
• Inefficient designs
• Poor information hiding
• Covert channels [Shapiro ‘02]
Thread IDs replaced by IPC “endpoints” (ports)
Endpoints
Figure: clients Send to an endpoint and servers Rcv from it.
Sync EP
• Sync EP queues senders/receivers
• Does not buffer messages
Async EP
Figure: the values 0x01, 0x10 and 0x30 signalled to an async EP, which shows the accumulated word.
• Async EP accumulates bits
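A synchronous endpoint that queues waiting threads but never stores a message could look roughly like the sketch below. All names, the fixed queue bound and the single-int "message" are assumptions made for illustration; they are not taken from any L4 kernel.

```c
#include <assert.h>
#include <stddef.h>

#define EP_QUEUE_MAX 8          /* illustrative bound, not from the slides */

typedef enum { EP_IDLE, EP_SENDERS, EP_RECEIVERS } ep_state_t;

typedef struct {
    int tid;        /* hypothetical thread identifier */
    int msg;        /* stands in for the thread's message registers */
    int blocked;    /* 1 while the thread waits on an endpoint */
} thread_t;

typedef struct {
    ep_state_t state;                /* which kind of thread is queued */
    thread_t  *queue[EP_QUEUE_MAX];  /* FIFO of waiting threads */
    size_t     head, count;
} sync_ep_t;

void ep_enqueue(sync_ep_t *ep, thread_t *t)
{
    assert(ep->count < EP_QUEUE_MAX);
    ep->queue[(ep->head + ep->count) % EP_QUEUE_MAX] = t;
    ep->count++;
}

thread_t *ep_dequeue(sync_ep_t *ep)
{
    thread_t *t = ep->queue[ep->head];
    ep->head = (ep->head + 1) % EP_QUEUE_MAX;
    if (--ep->count == 0)
        ep->state = EP_IDLE;
    return t;
}

/* Send: hand the message to a waiting receiver, or block the sender. */
void ep_send(sync_ep_t *ep, thread_t *sender)
{
    if (ep->state == EP_RECEIVERS) {
        thread_t *rcv = ep_dequeue(ep);
        rcv->msg = sender->msg;     /* direct transfer, nothing kept in the EP */
        rcv->blocked = 0;
    } else {
        ep_enqueue(ep, sender);
        ep->state = EP_SENDERS;
        sender->blocked = 1;
    }
}

/* Receive: take the message of a waiting sender, or block the receiver. */
void ep_recv(sync_ep_t *ep, thread_t *receiver)
{
    if (ep->state == EP_SENDERS) {
        thread_t *snd = ep_dequeue(ep);
        receiver->msg = snd->msg;
        snd->blocked = 0;
    } else {
        ep_enqueue(ep, receiver);
        ep->state = EP_RECEIVERS;
        receiver->blocked = 1;
    }
}
```

The endpoint only ever holds pointers to blocked threads. The message itself stays in the sender until a receiver arrives, so any server thread waiting on the endpoint can pick up the next request and clients need not know which one did.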
Other Design Issues
IPC Control: “Clans & Chiefs”
Figure: a chief and its clan. IPC outside clan re-directs to chief.
Process Hierarchy
Figure: processes created in a tree (Create). Hierarchical resource management.
Inflexible, clumsy, inefficient hierarchies!
Hierarchies replaced by delegatable cap-based access control
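What "delegatable" means for capability-based access control can be shown with a toy model: a holder hands on a capability to the same object with rights that are at most its own. The rights names, the cap_t layout and the rule that delegation requires a grant right are assumptions of this sketch, not a description of any particular kernel's API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical rights bits. */
enum { CAP_SEND = 1 << 0, CAP_RECV = 1 << 1, CAP_GRANT = 1 << 2 };

typedef struct {
    int      object;   /* hypothetical id of the kernel object referred to */
    uint32_t rights;   /* what the holder may do with that object */
} cap_t;

/* Delegate: derive a capability to the same object whose rights are a
 * subset of the parent's. Returns 0 on success, -1 if the parent lacks
 * the (assumed) grant right and so may not delegate at all. */
int cap_delegate(const cap_t *parent, uint32_t wanted, cap_t *child)
{
    if (!(parent->rights & CAP_GRANT))
        return -1;
    child->object = parent->object;
    child->rights = parent->rights & wanted;   /* never more than the parent */
    return 0;
}
```

Unlike a fixed process hierarchy, authority here follows whoever holds a capability, and each holder decides how much of it to pass on.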
IMPLEMENTATION 
Virtual TCB Array
Figure: the thread ID indexes an array of TCBs in virtual memory (VM); each slot holds the TCB proper and the kernel stack.
• Fast TCB & stack lookup
• Get own TCB base by masking stack pointer
• Trades cache for TLB footprint and virtual address space
• Trades TLB footprint for cache and kernel memory
• Not worthwhile on modern processors!
• Stacks can dominate kernel memory use!
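Both lookups on this slide reduce to simple address arithmetic, sketched below. The slot size and the base address of the TCB area are made-up values; the masking trick only works if the slot size is a power of two and every slot is aligned to it.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, for illustration only. */
#define TCB_SIZE      0x1000u       /* one slot: TCB proper + kernel stack */
#define TCB_AREA_BASE 0xE0000000u   /* virtual base of the TCB array */

/* Thread ID -> TCB address: plain array indexing in virtual memory. */
uintptr_t tcb_from_tid(uint32_t tid)
{
    return (uintptr_t)TCB_AREA_BASE + (uintptr_t)tid * TCB_SIZE;
}

/* Own TCB base: the kernel stack lives inside the thread's slot, so
 * clearing the low bits of the stack pointer yields the slot's base. */
uintptr_t tcb_from_sp(uintptr_t sp)
{
    return sp & ~((uintptr_t)TCB_SIZE - 1);
}
```

Neither lookup touches memory, which saves cache footprint, but every TCB access goes through the virtual mapping and so costs TLB entries and a large range of virtual address space.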
“Lazy” Scheduling
• In IPC-based systems, threads block and unblock frequently
• Many ready queue manipulations
Idea: leave blocked threads in ready queue, scheduler cleans up

    thread_t schedule() {
        foreach (prio in priorities) {
            foreach (thread in runQueue[prio]) {
                if (isRunnable(thread))
                    return thread;
                else
                    schedDequeue(thread);   // lazily drop a blocked thread
            }
        }
        return idleThread;                  // nothing runnable was found
    }

Scheduler execution time is unbounded!
“Benno scheduling”:
• All threads on ready queue are runnable
• All runnable threads in ready queue except the running one
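The two Benno-scheduling invariants can be made concrete with a small sketch in C. The data structures and function names are hypothetical; the point is that the queues hold only runnable threads, so the scheduler never has to skip or clean up entries.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_PRIOS 4     /* illustrative; 0 is the lowest priority */

typedef struct thread {
    int            prio;
    struct thread *next;    /* link in the ready queue of its priority */
} thread_t;

typedef struct {
    thread_t *head[NUM_PRIOS];
    thread_t *tail[NUM_PRIOS];
    thread_t *current;      /* running thread, never on a ready queue */
} sched_t;

/* Append a runnable, non-running thread to the queue of its priority. */
void sched_enqueue(sched_t *s, thread_t *t)
{
    t->next = NULL;
    if (s->tail[t->prio])
        s->tail[t->prio]->next = t;
    else
        s->head[t->prio] = t;
    s->tail[t->prio] = t;
}

/* Pick the next thread. Every queued thread is runnable, so the first
 * head found is the answer and the work is bounded by NUM_PRIOS.
 * Returns NULL if nothing is ready (the caller would run the idle thread). */
thread_t *sched_pick(sched_t *s)
{
    for (int p = NUM_PRIOS - 1; p >= 0; p--) {
        thread_t *t = s->head[p];
        if (t) {
            s->head[p] = t->next;
            if (!s->head[p])
                s->tail[p] = NULL;
            s->current = t;     /* the running thread leaves the queue */
            return t;
        }
    }
    s->current = NULL;
    return NULL;
}

/* Preemption: the still-runnable current thread goes back on its queue. */
void sched_preempt(sched_t *s)
{
    if (s->current) {
        sched_enqueue(s, s->current);
        s->current = NULL;
    }
}

/* Blocking: the current thread is on no queue, so nothing is removed. */
void sched_block_current(sched_t *s)
{
    s->current = NULL;
}
```

A thread that blocks during IPC costs no queue operation at all, which keeps the benefit of lazy scheduling while giving the scheduler a bounded execution time.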
L4 Design and Implementation
Implementation Tricks [SOSP’93]
• Process kernel
• Virtual TCB array
• Lazy scheduling
• Direct process switch
• Non-preemptible
• Non-portable
• Non-standard calling convention
• Assembler
Design Decisions [SOSP’95]
• Synchronous IPC
• Rich message structure, arbitrary out-of-line messages
• Zero-copy register messages
• User-mode page-fault handlers
• Threads as IPC destinations
• IPC timeouts
• Hierarchical IPC control
• User-mode device drivers
• Process hierarchy
• Recursive address-space construction
What are the Principles? 
• Minimality is an excellent driver of design decisions
– L4 kernels have become simpler over time 
– Policy-mechanism separation (user-mode page-fault handlers) 
– Device drivers really belong to user level 
– Minimality is key enabler for formal verification! 
• IPC speed still matters 
– But not everywhere, premature optimisation is wasteful
– Compilers have got so much better 
– Verification does not compromise performance 
– Verification invariants can help improve speed! [Shi, OOPSLA’13] 
• Also found that capabilities are the way to go 
– Shapiro (EROS) was right 
• However, a clean abstraction of time still elusive 
Conclusions
• Details changed, but principles remained
• Microkernels rock! (If done right!)
Where to find more:
• UNSW Advanced Operating Systems Course
http://www.cse.unsw.edu.au/~cs9242
• NICTA Trustworthy Systems research
http://trustworthy.systems
• seL4 open-source portal
http://sel4.systems
• L4 Microkernel Headquarters
http://l4hq.org
• Gernot’s blog:
http://microkerneldude.wordpress.com/
• Gernot’s research home page:
http://ssrg.nicta.com.au/people/?cn=Gernot+Heiser
