A whirlwind tour of the LLVM optimizer
Nikita Popov @ EuroLLVM 2023
Agenda
● High-level overview of the middle-end optimization pipeline
● Brief description of important optimization passes
○ Get a basic idea of pass responsibilities
○ Learn about key restrictions/constraints
About Me
● Software Engineer on Platform Tools team at Red Hat
○ Packaging of LLVM for Fedora, CentOS and RHEL
○ Upstream work on LLVM and Clang
● I work on:
○ The LLVM middle-end
○ LLVM / Rust integration
○ Compilation time improvements (LLVM Compile-Time Tracker)
...ends
[Diagram: Frontend -> Middle-end -> Backend]
● Frontends: Clang, Rust, Swift, Julia, ...
● Backends: X86, AArch64, ARM, RISCV, ...
Default (non-LTO) pipeline
[Diagram: each module is optimized separately: Module 1 -> Optimize -> Module 1', likewise for Module 2 and Module 3]
Full LTO pipeline
[Diagram: Module 1, 2, 3 -> Pre-link optimize -> Module 1', 2', 3' -> Merge -> Module M -> Post-link optimize -> Module M']
Thin LTO pipeline
[Diagram: Module 1, 2, 3 -> Pre-link optimize -> Module 1', 2', 3' -> Cross import -> Module 1'', 2'', 3'' -> Post-link optimize -> Module 1''', 2''', 3''']
Default pipeline
● Module Simplification: makes the IR more canonical
○ Inlining, Mem2Reg, LICM (Loop Invariant Code Motion), ...
○ Make further opts easier
● Module Optimization: makes the IR less canonical
○ Vectorization, Runtime unrolling, ...
○ Make further opts harder
● Backend
○ Target-specific optimization
○ Lowering to machine code
ThinLTO pipeline
● Pre-link: Module N Simplification (for Module 1 and Module 2)
● Cross import
● Post-link: Module N' Simplification (second round of inlining), then Module N' Optimization
● Don't run decanonicalizing transforms pre-link
Module Simplification
[Diagram: Early Cleanup -> CGSCC Pipeline (Inlining, Function Simplification) -> Late Cleanup]
CGSCC Pipeline
[Diagram: call graph of the functions f, g, h, i, with g and h merged into one node "g,h". Steps annotated across the slide builds: simplify, try inline, simplify, try inline, simplify]
● Inlining sees already simplified functions!
Call-Graph Strongly Connected Components
[Diagram: call graph of g, h, i, f partitioned into SCC 1, SCC 2, SCC 3]
No well-defined order within SCC
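The bottom-up visitation order can be illustrated with Tarjan's algorithm, which emits SCCs callees-first. This is only a sketch: the call edges below (f calls g, g and h call each other, h calls i) are an assumption made for the example, since the slide text does not preserve the edges, and LLVM's own CGSCC pass manager is considerably more involved.

```c
#include <stdbool.h>

enum { FN_F, FN_G, FN_H, FN_I, NUM_FUNCS };

/* Hypothetical call edges (not taken from the slides):
 * f -> g, g -> h, h -> g, h -> i. */
static const bool calls[NUM_FUNCS][NUM_FUNCS] = {
    [FN_F] = { [FN_G] = true },
    [FN_G] = { [FN_H] = true },
    [FN_H] = { [FN_G] = true, [FN_I] = true },
};

static int dfs_index[NUM_FUNCS], low_link[NUM_FUNCS];
static bool on_stack[NUM_FUNCS];
static int stack[NUM_FUNCS], stack_size, next_index;

int scc_of[NUM_FUNCS]; /* scc_of[fn] = position of fn's SCC in visitation order */
int num_sccs;

static void visit(int fn) {
    dfs_index[fn] = low_link[fn] = next_index++;
    stack[stack_size++] = fn;
    on_stack[fn] = true;
    for (int callee = 0; callee < NUM_FUNCS; callee++) {
        if (!calls[fn][callee])
            continue;
        if (dfs_index[callee] < 0) {
            /* Unvisited callee: recurse, then inherit its low link. */
            visit(callee);
            if (low_link[callee] < low_link[fn])
                low_link[fn] = low_link[callee];
        } else if (on_stack[callee] && dfs_index[callee] < low_link[fn]) {
            /* Callee is on the stack, so it belongs to the current SCC. */
            low_link[fn] = dfs_index[callee];
        }
    }
    if (low_link[fn] == dfs_index[fn]) {
        /* fn is the root of an SCC: pop all of its members. */
        int member;
        do {
            member = stack[--stack_size];
            on_stack[member] = false;
            scc_of[member] = num_sccs;
        } while (member != fn);
        num_sccs++;
    }
}

/* Numbers the SCCs so that callees get smaller numbers than callers. */
void compute_sccs(void) {
    stack_size = next_index = num_sccs = 0;
    for (int fn = 0; fn < NUM_FUNCS; fn++) {
        dfs_index[fn] = -1;
        on_stack[fn] = false;
    }
    for (int fn = 0; fn < NUM_FUNCS; fn++)
        if (dfs_index[fn] < 0)
            visit(fn);
}
```

With these assumed edges, i forms the first SCC, g and h share the second, and f forms the last one, so each SCC is processed after everything it calls. The relative order of g and h inside their SCC is arbitrary.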
Running pipelines
● opt -passes='default<O3>' == opt -O3
● opt -passes='thinlto-pre-link<O3>'
● opt -passes='thinlto<O3>'
● opt -passes='lto-pre-link<O3>'
● opt -passes='lto<O3>'
opt -passes='default<O3>' -print-pipeline-passes
annotation2metadata,forceattrs,inferattrs,coro-early,function<eager-inv>(lower-expect,simplifycfg<bonus-inst-threshold=1;no-forw
ard-switch-cond;no-switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-sink-common-insts>,sroa<modify-c
fg>,early-cse<>,callsite-splitting),openmp-opt,ipsccp,called-value-propagation,globalopt,function<eager-inv>(mem2reg,instcombine
<max-iterations=1000;no-use-loop-info>,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-
to-lookup;keep-loops;no-hoist-common-insts;no-sink-common-insts>),require<globals-aa>,function(invalidate<aa>),require<profile-s
ummary>,cgscc(devirt<4>(inline<only-mandatory>,inline,function-attrs<skip-non-recursive>,argpromotion,openmp-opt-cgscc,function<
eager-inv;no-rerun>(sroa<modify-cfg>,early-cse<memssa>,speculative-execution,jump-threading,correlated-propagation,simplifycfg<b
onus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-sink-c
ommon-insts>,instcombine<max-iterations=1000;no-use-loop-info>,aggressive-instcombine,constraint-elimination,libcalls-shrinkwrap
,tailcallelim,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-h
oist-common-insts;no-sink-common-insts>,reassociate,loop-mssa(loop-instsimplify,loop-simplifycfg,licm<no-allowspeculation>,loop-
rotate,licm<allowspeculation>,simple-loop-unswitch<nontrivial;trivial>),simplifycfg<bonus-inst-threshold=1;no-forward-switch-con
d;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-sink-common-insts>,instcombine<max-iterations=100
0;no-use-loop-info>,loop(loop-idiom,indvars,loop-deletion,loop-unroll-full),sroa<modify-cfg>,vector-combine,mldst-motion<no-spli
t-footer-bb>,gvn<>,sccp,bdce,instcombine<max-iterations=1000;no-use-loop-info>,jump-threading,correlated-propagation,adce,memcpy
opt,dse,move-auto-init,loop-mssa(licm<allowspeculation>),coro-elide,simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;sw
itch-range-to-icmp;no-switch-to-lookup;keep-loops;hoist-common-insts;sink-common-insts>,instcombine<max-iterations=1000;no-use-l
oop-info>),function-attrs,function(require<should-not-run-function-passes>),coro-split)),deadargelim,coro-cleanup,globalopt,glob
aldce,elim-avail-extern,rpo-function-attrs,recompute-globalsaa,function<eager-inv>(float2int,lower-constant-intrinsics,chr,loop(
loop-rotate,loop-deletion),loop-distribute,inject-tli-mappings,loop-vectorize<no-interleave-forced-only;no-vectorize-forced-only
;>,loop-load-elim,instcombine<max-iterations=1000;no-use-loop-info>,simplifycfg<bonus-inst-threshold=1;forward-switch-cond;switc
h-range-to-icmp;switch-to-lookup;no-keep-loops;hoist-common-insts;sink-common-insts>,slp-vectorizer,vector-combine,instcombine<m
ax-iterations=1000;no-use-loop-info>,loop-unroll<O3>,transform-warning,sroa<preserve-cfg>,instcombine<max-iterations=1000;no-use
-loop-info>,loop-mssa(licm<allowspeculation>),alignment-from-assumptions,loop-sink,instsimplify,div-rem-pairs,tailcallelim,simpl
ifycfg<bonus-inst-threshold=1;no-forward-switch-cond;switch-range-to-icmp;no-switch-to-lookup;keep-loops;no-hoist-common-insts;n
o-sink-common-insts>),globaldce,constmerge,cg-profile,rel-lookup-table-converter,function(annotation-remarks),verify,print
Defined in PassBuilderPipelines.cpp
godbolt.org – LLVM Opt Pipeline
Or run opt -print-after-all locally
SSA Construction
Mem2Reg
int test(int x, int y) {
  return x + y;
}
IR before Mem2Reg:
define i32 @test(i32 %x, i32 %y) {
entry:
  %x.addr = alloca i32
  %y.addr = alloca i32
  store i32 %x, ptr %x.addr
  store i32 %y, ptr %y.addr
  %0 = load i32, ptr %x.addr
  %1 = load i32, ptr %y.addr
  %add = add nsw i32 %0, %1
  ret i32 %add
}
IR after Mem2Reg:
define i32 @test(i32 %x, i32 %y) {
entry:
  %add = add nsw i32 %x, %y
  ret i32 %add
}
SROA: Scalar Replacement of Aggregates
● Break up allocas into smaller allocas based on access pattern
○ %vec = alloca { ptr, i64, i64 }
○ -> %vec.ptr = alloca ptr
○ -> %vec.size = alloca i64
○ -> %vec.capacity = alloca i64
● Then run Mem2Reg to convert alloca/load/store to SSA values
● Knows many tricks for overlapping accesses
○ For example inserting/extracting bits of a larger integer
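A C-level analogy of the two SROA bullets above, written by hand. The struct and function names are hypothetical, and the second pair only approximates the kind of bit insertion SROA performs on the IR; it is not a description of the exact instructions the pass emits.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical vector header, mirroring { ptr, i64, i64 } from the slide. */
struct Vec {
    int *ptr;
    size_t size;
    size_t capacity;
};

/* Before: one aggregate local that is only ever accessed field by field. */
size_t spare_capacity_aggregate(int *buf, size_t size, size_t capacity) {
    struct Vec vec;
    vec.ptr = buf;
    vec.size = size;
    vec.capacity = capacity;
    return vec.capacity - vec.size;
}

/* After (by hand): each field is its own scalar, ready for promotion to SSA. */
size_t spare_capacity_scalars(int *buf, size_t size, size_t capacity) {
    int *vec_ptr = buf;
    size_t vec_size = size;
    size_t vec_capacity = capacity;
    (void)vec_ptr; /* never read, so it could simply be deleted */
    return vec_capacity - vec_size;
}

/* Overlapping access: two 32-bit stores read back as one 64-bit load. */
uint64_t combine_through_memory(uint32_t first, uint32_t second) {
    uint32_t halves[2] = { first, second };
    uint64_t whole;
    memcpy(&whole, halves, sizeof whole);
    return whole;
}

/* The same value built by inserting bits into a larger integer. */
uint64_t combine_with_bit_ops(uint32_t first, uint32_t second) {
    const uint16_t probe = 1;
    int little_endian = *(const unsigned char *)&probe == 1;
    /* The element at the lower address holds the low bits on little-endian
     * targets and the high bits on big-endian targets. */
    uint64_t low = little_endian ? first : second;
    uint64_t high = little_endian ? second : first;
    return (high << 32) | low;
}
```

In both pairs the second function no longer needs a stack slot for the aggregate, which is what lets Mem2Reg-style promotion finish the job.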
Control-Flow Optimization
SimplifyCFG
● The kitchen sink of control-flow transforms
○ If it fits nowhere else, put it here!
SimplifyCFG: Hoist
if (cond) {
  foo();
  a();
} else {
  foo();
  b();
}
=>
foo();
if (cond) {
  a();
} else {
  b();
}
SimplifyCFG: Speculate
if (cond) {
  x = foo();
} else {
  x = 0;
}
=>
// Only valid if foo() is cheap and safe to execute unconditionally
tmp = foo();
x = cond ? tmp : 0;
SimplifyCFG: Switch to lookup table
switch (x) {
case 0:
  return 10;
case 1:
  return 42;
case 2:
  return 123;
case 3:
  return 7;
default:
  return 13;
}
=>
int table[] = {10, 42, 123, 7};
// Unsigned comparison, so that negative x takes the default path
if ((unsigned)x < 4) {
  return table[x];
} else {
  return 13;
}
SimplifyCFG
● Invoked with many different options at different pipeline positions
○ Some transforms only run late in the pipeline
● Can use target-dependent cost model (via TargetTransformInfo)
Instruction Combining (Peephole Optimization)
InstCombine
● The kitchen sink of non-CFG transforms
○ If it fits nowhere else, put it here!
InstCombine: Analysis helpers
[Diagram: InstCombine, InstSimplify, ConstantFolding]
● ConstantFolding
○ Folds instructions with constant operands to constants
○ 1 + 2 => 3
● InstSimplify
○ Folds instructions to existing values or constants
○ x + 0 => x
○ x - x => 0
● InstCombine
○ Tries constant folding and instruction simplification first
○ Performs folds that create or modify instructions
○ x * 4 => x << 2
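The example folds from the three layers above can be spot-checked in C. This is a minimal sketch with hypothetical function names; it uses unsigned arithmetic so that wraparound is well defined, and it checks a handful of sample inputs rather than proving the identities.

```c
#include <stdint.h>

/* ConstantFolding: every operand is a constant. */
uint32_t constant_original(void) { return 1u + 2u; }
uint32_t constant_folded(void)   { return 3u; }

/* InstSimplify: the result is an existing value or a constant. */
uint32_t add_zero_original(uint32_t x) { return x + 0u; }
uint32_t add_zero_folded(uint32_t x)   { return x; }
uint32_t sub_self_original(uint32_t x) { return x - x; }
uint32_t sub_self_folded(uint32_t x)   { (void)x; return 0u; }

/* InstCombine: the fold creates a new instruction (a shift). */
uint32_t mul4_original(uint32_t x) { return x * 4u; }
uint32_t mul4_folded(uint32_t x)   { return x << 2; }

/* Returns 1 if every folded form matches its original on the samples. */
int folds_agree_on_samples(void) {
    static const uint32_t samples[] = {
        0u, 1u, 2u, 0x3FFFFFFFu, 0x40000000u, 0x80000000u, 0xFFFFFFFFu
    };
    if (constant_original() != constant_folded())
        return 0;
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        uint32_t x = samples[i];
        if (add_zero_original(x) != add_zero_folded(x)) return 0;
        if (sub_self_original(x) != sub_self_folded(x)) return 0;
        if (mul4_original(x) != mul4_folded(x)) return 0;
    }
    return 1;
}
```

Only the last pair needs to build a new instruction, which is why it belongs to InstCombine rather than to the two analysis helpers.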
InstCombine
● The kitchen sink of non-CFG transforms
○ If it fits nowhere else, put it here!
○ Use InstSimplify / ConstantFolding for transforms that don't create/modify instructions.
● Also used to paper over phase ordering issues
○ InstCombine re-implements weak versions of transforms from other passes
○ For example: Basic store-to-load forwarding (usually done by EarlyCSE/GVN)
…Combine
● InstCombine
○ Canonicalization pass: Cannot be target-dependent
○ Backend implements reverse/undo transform if necessary
● AggressiveInstCombine
○ For expensive transforms, only runs once in pipeline
○ Target-dependence discouraged but sometimes allowed
● VectorCombine
○ For target-dependent, cost-model driven vector transforms
CVP: CorrelatedValuePropagation
● Optimizations based on value range information (from LazyValueInfo)
● Important for bounds check elimination
○ icmp ult i32 %x, 10 => i1 true if %x in [0, 10)
● Other range based optimizations
○ sdiv i32 %x, %y => udiv i32 %x, %y if %x, %y non-negative
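Both range-based folds can be mirrored in C. The sketch below is hand-written, with hypothetical names and table contents; it shows the before and after shapes side by side rather than anything the pass literally produces.

```c
#include <stdint.h>

static const int32_t squares[10] = { 0, 1, 4, 9, 16, 25, 36, 49, 64, 81 };

/* Before: the second comparison repeats what the first already established. */
int32_t lookup_checked(uint32_t x) {
    if (x >= 10)
        return -1;
    /* Here x is known to be in [0, 10), so "x < 10" is always true. */
    if (x < 10)
        return squares[x];
    return -2; /* unreachable */
}

/* After: the redundant bounds check folds to true and disappears. */
int32_t lookup_simplified(uint32_t x) {
    if (x >= 10)
        return -1;
    return squares[x];
}

/* Before: signed division guarded so that both operands are non-negative. */
int32_t div_signed(int32_t x, int32_t y) {
    if (x < 0 || y <= 0)
        return 0;
    return x / y;
}

/* After: with both operands non-negative, unsigned division gives the same result. */
int32_t div_unsigned(int32_t x, int32_t y) {
    if (x < 0 || y <= 0)
        return 0;
    return (int32_t)((uint32_t)x / (uint32_t)y);
}
```

The guards play the role of the dominating branch conditions from which the value ranges are derived.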
Same transform, different analysis
● Some folds (e.g. sdiv -> udiv) are implemented in multiple passes
○ Folds are driven by different analyses, which are good at different things
● Pass: underlying analysis
○ InstCombine: ValueTracking (KnownBits)
○ CorrelatedValuePropagation: LazyValueInfo
○ IndVarSimplify: ScalarEvolution
○ IPSCCP: ValueLattice + PredicateInfo
Redundancy Elimination
EarlyCSE: Common Subexpression Elimination
add1 = x + y;
// ...
add2 = x + y;
use(add1);
use(add2);
=>
add1 = x + y;
// ...
use(add1);
use(add1);
● Basic CSE based on scoped hash table
● Load CSE and store-to-load forwarding using MemorySSA
EarlyCSE: Store to load forwarding
*p = v1;
// p not written here
v2 = *p;
use(v1);
use(v2);
=>
*p = v1;
use(v1);
use(v1);
GVN: Global Value Numbering
● More general (and much more expensive!) than EarlyCSE
● Uses MemoryDependenceAnalysis
● Non-local load CSE
● Partial redundancy elimination (PRE)
GVN: Non-local load CSE
if (...) {
  v1 = *p;
} else {
  *p = v2;
}
v3 = *p;
use(v3);
=>
if (...) {
  v1 = *p;
} else {
  *p = v2;
}
v3 = phi(v1, v2);
use(v3);
GVN: Load PRE
if (...) {
} else {
  *p = v1;
}
v2 = *p;
use(v2);
=>
if (...) {
  v2_pre = *p;
} else {
  *p = v1;
}
v2 = phi(v2_pre, v1);
use(v2);
Memory Optimizations
MemCpyOpt
● Optimize memcpy and memset using MemorySSA
MemCpyOpt: Memcpy forwarding
memcpy(y, x, 16);
// y not written here
memcpy(z, y, 16);
=>
memcpy(y, x, 16);
// y not written here
memcpy(z, x, 16);
MemCpyOpt: Call Slot Optimization
Ty tmp;
foo(tmp);
memcpy(dst, tmp, sizeof(Ty));
=>
foo(dst);
DSE: Dead Store Elimination
● Remove dead stores using MemorySSA
*p = v1;
// p not read here
*p = v2;
=>
// p not read here
*p = v2;
DSE: Dead before return
  %p = alloca i32
  ; ...
  store i32 %v, ptr %p
  ; %p not read here
  ret void
=>
  %p = alloca i32
  ; ...
  ; %p not read here
  ret void
Loop Optimization
Loop pass manager
● Visit child loops first, then parent loops
● Constructs LoopSimplify and LCSSA (Loop-Closed SSA) form before running
LICM: Hoist
Loop:
  x = foo();
  use(x);
  y = bar();
Exit:
  use(y);
=>
Preheader:
  x = foo();
Loop:
  use(x);
  y = bar();
Exit:
  use(y);
LICM: Sink
Preheader:
  x = foo();
Loop:
  use(x);
  y = bar();
Exit:
  use(y);
=>
Preheader:
  x = foo();
Loop:
  use(x);
Exit:
  y = bar();
  use(y);
LICM: Promote
Loop:
  v = *p;
  vn = v + 1;
  *p = vn;
=>
Preheader:
  v0 = *p;
Loop:
  v = phi(v0, vn);
  vn = v + 1;
Exit:
  *p = vn;
LICM: Loop Invariant Code Motion
● Transforms:
○ Hoist instructions into preheader
○ Sink instructions into exits
○ Promote scalars
● Uses MemorySSA
● Canonicalization pass: Cannot be target or PGO dependent
○ May be undone by LoopSink or MachineSink
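Scalar promotion can be mimicked by hand in C. The sketch below uses hypothetical function names and simply assumes that storing to *p after the loop is safe (no other thread or alias observes the location); the real pass has to prove conditions like that before promoting.

```c
/* Before: every iteration loads from and stores to *p. */
void increment_in_memory(int *p, int n) {
    for (int i = 0; i < n; i++) {
        int v = *p;
        int vn = v + 1;
        *p = vn;
    }
}

/* After (by hand): the value lives in a local for the duration of the loop. */
void increment_promoted(int *p, int n) {
    int v = *p;                  /* preheader: v0 = *p */
    for (int i = 0; i < n; i++) {
        v = v + 1;               /* loop: v = phi(v0, vn); vn = v + 1 */
    }
    *p = v;                      /* exit: store the final value once */
}

/* Small wrappers so that both versions can be compared by value. */
int result_in_memory(int start, int n) {
    int cell = start;
    increment_in_memory(&cell, n);
    return cell;
}

int result_promoted(int start, int n) {
    int cell = start;
    increment_promoted(&cell, n);
    return cell;
}
```

Note that the promoted version stores to *p even when the loop runs zero times, which is one reason the transform needs a safety argument.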
IndVarSimplify
● Uses ScalarEvolution analysis
● Simplify induction variables (IVs) and their uses
● Simplify loop exit conditions
IndVarSimplify: Loop exit value replacement
unsigned test(unsigned n) {
  unsigned sum = 0;
  for (unsigned i = 0; i <= n; i++) {
    sum += i;
  }
  return sum;
}
=>
unsigned test(unsigned n) {
  for (unsigned i = 0; i <= n; i++) {}
  return (n * (n - 1))/2 + n;
}
Later removed by LoopDeletion
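The closed form is the sum 0 + 1 + ... + n = n(n+1)/2, which equals n(n-1)/2 + n. A plain 32-bit evaluation of the product would wrap for large n before the division, so the sketch below widens to 64 bits first. That widening is a choice made for this example, not a claim about the exact IR LLVM emits, and the function names are hypothetical.

```c
#include <stdint.h>

/* The original loop. Do not call with n == UINT32_MAX: "i <= n" would never fail. */
uint32_t sum_loop(uint32_t n) {
    uint32_t sum = 0;
    for (uint32_t i = 0; i <= n; i++) {
        sum += i;
    }
    return sum;
}

/* Exit value computed directly, without running the loop. */
uint32_t sum_closed_form(uint32_t n) {
    /* n * (n + 1) is below 2^64, so the product cannot wrap before halving. */
    uint64_t wide = (uint64_t)n * ((uint64_t)n + 1) / 2;
    /* Truncation gives the same value modulo 2^32 as the wrapping loop. */
    return (uint32_t)wide;
}
```

For example, both functions return 5050 for n = 100, and they still agree for inputs such as n = 100000 where the true sum no longer fits in 32 bits.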
LoopUnroll: Full unrolling
Loop over iterations #1-4 => Iteration #1, Iteration #2, Iteration #3, Iteration #4 as straight-line code
LoopUnroll: Loop peeling
Loop over iterations #1-N => Iteration #1, then a loop over iterations #2-N
LoopUnroll: Partial unrolling
Loop over iterations #1-400 => loop whose body is iterations #(4i+1), #(4i+2), #(4i+3), #(4i+4)
LoopUnroll: Runtime unrolling
Loop over iterations #1-N => loop whose body is iterations #(4i+1), #(4i+2), #(4i+3), #(4i+4), plus tail iterations
LoopUnroll
● Simplification:
○ Full unrolling (requires known constant trip count)
○ Loop peeling
● Optimization:
○ Partial unrolling (requires known constant trip count/multiple)
○ Runtime unrolling
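Runtime unrolling, where the trip count is only known at run time, can be written out by hand as follows. The unroll factor of 4 matches the slide; the function names and sample data are made up for the sketch.

```c
#include <stddef.h>

/* Hypothetical sample data for trying out both versions. */
const int sample_data[10] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

/* Original loop: one element per iteration. */
long sum_simple(const int *data, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += data[i];
    return total;
}

/* Unrolled by 4, with a tail loop for the remaining n % 4 iterations. */
long sum_runtime_unrolled(const int *data, size_t n) {
    long total = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        total += data[i];      /* iteration 4i+1 */
        total += data[i + 1];  /* iteration 4i+2 */
        total += data[i + 2];  /* iteration 4i+3 */
        total += data[i + 3];  /* iteration 4i+4 */
    }
    for (; i < n; i++)         /* tail iterations */
        total += data[i];
    return total;
}
```

The extra tail loop is why runtime unrolling increases code size and counts as an optimization rather than a simplification.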
Vectorization
LoopVectorize
● VPlan to model vectorization without IR changes
● LoopAccessAnalysis to ensure memory dependences are safe
● May require inserting runtime checks and LoopVersioning
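The idea behind runtime checks plus loop versioning can be sketched in C: test at run time whether the two arrays overlap, then pick either a fast version that processes four elements per step or the original scalar loop. Everything here is hand-written and hypothetical, including the names and the width of 4, and the "vector" path uses ordinary scalars in place of real vector instructions.

```c
#include <stddef.h>
#include <stdint.h>

/* Reference semantics: the original scalar loop. */
static void add_one_scalar(int *dst, const int *src, size_t n) {
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] + 1;
}

/* Runtime check: do the two n-element ranges share any bytes? */
static int ranges_overlap(const int *a, const int *b, size_t n) {
    uintptr_t a_start = (uintptr_t)a, b_start = (uintptr_t)b;
    uintptr_t bytes = n * sizeof(int);
    return a_start < b_start + bytes && b_start < a_start + bytes;
}

/* Versioned loop: fast path when disjoint, original loop otherwise. */
void add_one_versioned(int *dst, const int *src, size_t n) {
    if (ranges_overlap(dst, src, n)) {
        add_one_scalar(dst, src, n);
        return;
    }
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        /* Load four lanes first, then store them, as a vector op would. */
        int lane0 = src[i] + 1, lane1 = src[i + 1] + 1;
        int lane2 = src[i + 2] + 1, lane3 = src[i + 3] + 1;
        dst[i] = lane0; dst[i + 1] = lane1;
        dst[i + 2] = lane2; dst[i + 3] = lane3;
    }
    for (; i < n; i++)
        dst[i] = src[i] + 1;
}

/* Returns 1 if the versioned loop matches the scalar loop in both cases. */
int versioned_matches_scalar(void) {
    int expected[9] = { 0, 1, 2, 3, 4, 5, 6, 7, 8 };
    int actual[9] = { 0, 1, 2, 3, 4, 5, 6, 7, 8 };
    /* Overlapping case: dst starts one element after src. */
    add_one_scalar(expected + 1, expected, 8);
    add_one_versioned(actual + 1, actual, 8);
    for (int i = 0; i < 9; i++)
        if (expected[i] != actual[i]) return 0;
    /* Disjoint case: separate input and output buffers. */
    int input[6] = { 10, 20, 30, 40, 50, 60 }, out_a[6], out_b[6];
    add_one_scalar(out_a, input, 6);
    add_one_versioned(out_b, input, 6);
    for (int i = 0; i < 6; i++)
        if (out_a[i] != out_b[i]) return 0;
    return 1;
}
```

In the overlapping case each store feeds the next load, so loading four lanes up front would change the result; the runtime check is what keeps the fast path safe.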
SLPVectorize
● SLP = Superword-Level Parallelism
● Vectorizes straight-line code
Inter-Procedural Optimization (IPO)
FunctionAttrs
● Infer attributes on function, arguments and return values
○ nounwind, readonly, nonnull, etc.
● General approach:
○ Optimistically assume all functions in the SCC are nounwind
○ Check whether there are any non-nounwind instructions
○ If not, mark all functions in the SCC nounwind
● New "Attributor" implements much stronger version of this, but not enabled by
default (too slow)
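The optimistic nounwind inference from the general approach above can be sketched as a small function over one SCC. The summary struct and its field names are hypothetical and stand in for scanning the actual instructions; this is not the data structure LLVM uses.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-function summary; the field names are illustrative only. */
struct FuncSummary {
    bool has_unwinding_inst;       /* contains an instruction that may unwind */
    bool calls_may_unwind_outside; /* calls a function outside the SCC that is not nounwind */
};

/*
 * Calls between members of the SCC are deliberately not recorded: under the
 * optimistic assumption they are nounwind, so only local instructions and
 * calls leaving the SCC can break the assumption.
 */
bool infer_scc_nounwind(const struct FuncSummary *scc, size_t count) {
    for (size_t i = 0; i < count; i++) {
        if (scc[i].has_unwinding_inst || scc[i].calls_may_unwind_outside)
            return false; /* assumption refuted for the whole SCC */
    }
    return true; /* nothing refuted it: mark every member nounwind */
}

/* Two mutually recursive functions with nothing that may unwind. */
const struct FuncSummary mutually_recursive_clean[2] = {
    { false, false }, { false, false }
};

/* The same shape, but one member contains an unwinding instruction. */
const struct FuncSummary one_member_throws[2] = {
    { false, false }, { true, false }
};
```

A pessimistic analysis could never mark the mutually recursive pair, because each function calls one that is not yet known to be nounwind; assuming the property first and then looking for a counterexample avoids that cycle.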
IPSCCP: Inter-Procedural Sparse Conditional Constant Propagation
● Propagates constants and constant ranges across functions
● Uses PredicateInfo to take branch conditions into account
● Runs very early, before most simplification (which may lose information)
● Also does function specialization (since recently)
Thank You!
Questions?
The End
● Blog: https://www.npopov.com/
● Reach me at:
○ npopov@redhat.com
○ https://twitter.com/nikita_ppv
Bonus Slides
JumpThreading
if (x > 10) {
  greater10();
}
always();
if (x > 0) {
  greater0();
}
=>
if (x > 10) {
  greater10();
  always();
  greater0();
} else {
  always();
  if (x > 0) {
    greater0();
  }
}
● Optimizes conditional branches where one condition implies another
● Uses LazyValueInfo analysis, which provides value range information
Loop
[Diagram: loop with Preheader, Header, Latch, Backedge, Exit 1 and Exit 2]
LoopSimplify Form
[Diagram: the same loop, with Preheader, Header, Latch, Backedge, Exit 1 and Exit 2, in LoopSimplify form]
SimpleLoopUnswitch
while (...) {
  if (c) {
    foo();
  } else {
    bar();
  }
}
=>
if (c) {
  while (...) {
    foo();
  }
} else {
  while (...) {
    bar();
  }
}
