© 2018 NETRONOME SYSTEMS, INC. 1
Verifier optimization work
Jakub Kicinski <kuba@kernel.org>
LSFMM
BPF Microconference
San Juan, 2 May 2019
Recent optimizations from Alexei
● rare explored state removal
  most explored states never prune any later walks; remove a state once:
  miss_cnt > 3 + hit_cnt * 3
● read marking backpropagation pruning
  read marks are propagated to source states; once a state with the read mark
  already set is reached, propagation can stop
● big verifier lock removal
  already covered
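The first two bullets can be sketched in plain C. This is a minimal illustration, not the kernel's code: the struct layouts and the names `explored_state_stats`, `state_node`, `should_evict_state` and `propagate_read` are hypothetical, and only the `miss_cnt > 3 + hit_cnt * 3` condition comes from the slide. The propagation sketch also ignores write marks, which a real verifier would have to account for.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-state counters; field names follow the slide's formula. */
struct explored_state_stats {
    uint32_t hit_cnt;   /* times this state pruned a later walk */
    uint32_t miss_cnt;  /* times it was compared without pruning */
};

/* True once the state has missed often enough to be dropped (slide formula). */
static bool should_evict_state(const struct explored_state_stats *s)
{
    return s->miss_cnt > 3 + s->hit_cnt * 3;
}

/* Hypothetical state with a parent link and a bitmask of registers read. */
struct state_node {
    struct state_node *parent;
    uint32_t live_read;  /* bit i set: register i was read here or below */
};

/*
 * Propagate a read of register regno (must be < 32) up the parent chain.
 * Stops at the first state that already carries the mark, on the assumption
 * that an earlier walk marked everything above it.
 * Returns the number of states newly marked.
 */
static int propagate_read(struct state_node *st, unsigned int regno)
{
    uint32_t bit = UINT32_C(1) << regno;
    int marked = 0;

    for (; st; st = st->parent) {
        if (st->live_read & bit)
            break;
        st->live_read |= bit;
        marked++;
    }
    return marked;
}
```

With these definitions, a state with no hits is dropped on its fourth miss, and a second walk that joins an already-marked chain only touches the states below the join point.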
Cycles spent*
* sum over Cilium test programs
Function              cycles   % do_check   % insn prog   % insn walk
Total (do_check)        2613      100.00%
copy_verifier_state      558       21.35%
regsafe                  368       14.08%
free_verifier_state      167        6.39%
check_cond_jmp_op        252        9.64%        10.13%        10.15%
check_alu_op             100        3.83%        59.13%        57.02%
check_mem_access          89        3.41%        23.53%        26.28%
check_helper_call         80        3.06%         5.65%         4.62%
mark_reg_read            229        8.76%
mark_reg_unknown          71        2.72%
mark_reg_known            15        0.57%
Trivial micro optimization - avoid the use of zalloc+memcpy
19.41%
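A userspace sketch of the zalloc+memcpy pattern and its cheaper replacement follows. The struct `state_blob` and both function names are hypothetical stand-ins; the slide does not say which allocation was changed, so treating it as a state copy path is an assumption. The point is only that zeroing a buffer is wasted work when `memcpy` overwrites every byte right afterwards.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a verifier state; the real struct is larger. */
struct state_blob {
    unsigned long regs[11];
    unsigned int flags;
};

/* Wasteful variant: zeroes the allocation, then overwrites all of it. */
static struct state_blob *copy_state_zeroed(const struct state_blob *src)
{
    struct state_blob *dst = calloc(1, sizeof(*dst));

    if (!dst)
        return NULL;
    memcpy(dst, src, sizeof(*dst));
    return dst;
}

/* Cheaper variant: no zeroing, memcpy initializes the whole object. */
static struct state_blob *copy_state(const struct state_blob *src)
{
    struct state_blob *dst = malloc(sizeof(*dst));

    if (!dst)
        return NULL;
    memcpy(dst, src, sizeof(*dst));
    return dst;
}
```

In kernel code the analogous change would presumably be `kzalloc()` followed by `memcpy()` becoming `kmalloc()` followed by `memcpy()`, or a single `kmemdup()`.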
Pruning point analysis
n prunes   sum(points)
       0          5137
       1           615
       2           242
       3           167
       4            51
       5            39
       6            45
       7            19
       8            24
       9            17
      10            11
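As a quick check on the table, the share of pruning points that never pruned anything can be computed from the rows shown. This sketch assumes the table is complete as transcribed (the slide may have cut off rows beyond 10 prunes); the array and function names are made up for the illustration.

```c
#include <stddef.h>

/* Rows from the slide: index = number of prunes, value = pruning points. */
static const unsigned int points_by_prunes[] = {
    5137, 615, 242, 167, 51, 39, 45, 19, 24, 17, 11,
};

/* Sum of pruning points over all rows shown. */
static unsigned int total_points(void)
{
    unsigned int sum = 0;
    size_t n = sizeof(points_by_prunes) / sizeof(points_by_prunes[0]);

    for (size_t i = 0; i < n; i++)
        sum += points_by_prunes[i];
    return sum;
}

/* Percentage of pruning points with zero prunes. */
static double zero_prune_share(void)
{
    return 100.0 * points_by_prunes[0] / total_points();
}
```

Over these rows the total is 6367 points, of which 5137 (about 80.7%) never pruned a walk.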
Pruning point elimination
● pruning points are too dense - one every 3.8 instructions in Cilium progs
● 80% of conditional branch pruning points have 0 hits
● replacing the pruning heuristic with marking every 10th instruction gives
  a 4-20% do_check speedup for Cilium progs
● 33% more instructions walked
● no good heuristic apparent, yet
● pruning on fall through insn, rather than jmp - 4%
● in-place branch pruning
Branch     9279   27.55%
Shallow    4641   13.78%
Pruning   24397   72.45%
Total     33676
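The "every 10th instruction" experiment can be pictured with a small sketch. It is an assumption-laden illustration: the function name, the `bool` array and the `stride` parameter are hypothetical, and the slide does not say how the experiment treated jump targets or other points that the real verifier marks.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Mark every stride-th instruction (0, stride, 2*stride, ...) as a pruning
 * point and clear all others. A stride of 0 marks nothing.
 * Returns the number of pruning points marked.
 */
static size_t mark_prune_points(bool *is_prune_point, size_t insn_cnt,
                                size_t stride)
{
    size_t marked = 0;

    for (size_t i = 0; i < insn_cnt; i++) {
        is_prune_point[i] = stride && (i % stride == 0);
        if (is_prune_point[i])
            marked++;
    }
    return marked;
}
```

For a 25-instruction program and a stride of 10, this marks instructions 0, 10 and 20, compared with roughly 6 or 7 points at the observed density of one per 3.8 instructions.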
Other ideas
● tail elimination:
  r0 = const
  exit
  covered by the shallow branch optimization
● pure function detection/pruning (callsite independent)
  real-life benefit unclear due to the small number of no-inline samples
● “fudge” builtin:
  var = __builtin_constant_relaxed(5, 0xff)
  hints that the verifier should loosen the info it tracks about the constant
1M instruction challenges
● jump offset (16 bit)
● instruction patching is quadratic
● pruning state grows as O(stack frames x prog len)
● execution time estimation?
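The jump offset limit in the first bullet comes from the signed 16-bit `off` field of an eBPF instruction, which is relative to the instruction following the jump. A minimal check of whether a jump between two instruction indices is encodable might look like this; the function name and the use of plain indices are choices made for the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if a jump at instruction index from_idx can reach to_idx with a
 * signed 16-bit offset. The offset is counted from the instruction after
 * the jump, so target = from_idx + 1 + off.
 */
static bool jump_fits_s16(int64_t from_idx, int64_t to_idx)
{
    int64_t off = to_idx - (from_idx + 1);

    return off >= INT16_MIN && off <= INT16_MAX;
}
```

A forward jump from instruction 0 can reach at most instruction 32768, so in a program approaching 1M instructions most instruction pairs are out of range of a single jump.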

LSFMM Verifier Optimizations and 1M Instructions