Understanding your memory usage
Marian Marinov <mm@yuhu.biz>
OpenFest 2024
Who am I?
➢ Sysadmin with more than 25 years of experience
➢ Director of Engineering at Web Hosting Canada
➢ a FOSS dude :)
When does OOM occur?
➢ What do you think?
➢ When there is not enough memory for the applications?
➢ When there is not enough memory for the kernel?
➢ But is this really the case?
Overcommit
But wait... there is a bit more before we continue.
➢ Not all allocated memory is used
➢ buffers
➢ caches
➢ mmap-ed files
https://www.kernel.org/doc/html/latest/admin-guide/sysctl/vm.html
Overcommit
vm.overcommit_memory
vm.overcommit_kbytes
vm.overcommit_ratio
vm.overcommit_memory = 0 - heuristic (default): refuses only allocations that obviously exceed total memory + swap
vm.overcommit_memory = 1 - allow any allocation
vm.overcommit_memory = 2 - strict: enforces the limit set by vm.overcommit_kbytes or vm.overcommit_ratio
In mode 2 the commit limit is swap + a portion of the physical memory:
vm.overcommit_kbytes - that portion as a specific size
vm.overcommit_ratio - that portion as a percentage of the physical memory
https://www.kernel.org/doc/html/latest/admin-guide/sysctl/vm.html
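A minimal sketch of how the commit limit could be computed when vm.overcommit_memory = 2. The function name and the example machine are made up for illustration, and the kernel's own calculation also subtracts huge pages from the RAM figure, which is ignored here.

```python
def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio=50, overcommit_kbytes=0):
    """Approximate CommitLimit (in kB) for vm.overcommit_memory = 2.

    Simplified: ignores huge pages, which the kernel subtracts from RAM first.
    """
    if overcommit_kbytes > 0:
        # An explicit size takes precedence over the ratio.
        allowed_ram_kb = overcommit_kbytes
    else:
        allowed_ram_kb = ram_kb * overcommit_ratio // 100
    return swap_kb + allowed_ram_kb


# Hypothetical machine: 16 GiB of RAM and 4 GiB of swap.
ram_kb = 16 * 1024 * 1024
swap_kb = 4 * 1024 * 1024
print(commit_limit_kb(ram_kb, swap_kb))                        # ratio of 50
print(commit_limit_kb(ram_kb, swap_kb, overcommit_ratio=100))  # all RAM + swap
```

The result can be compared with the CommitLimit line in /proc/meminfo; Committed_AS in the same file shows how much is currently committed.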
What is OOM?
OOM - Out of Memory
OOM - Killer
What is OOM?
php-fpm invoked oom-killer: gfp_mask=0x6000c0(GFP_KERNEL), order=0, oom_score_adj=0
CPU: 38 PID: 213145 Comm: php-fpm Kdump: loaded Tainted: G OE -------- - - 5.4.61
Hardware name: Dell Inc. PowerEdge R630, BIOS 2.16.2 02/19/2023
Call Trace:
dump_stack+0x41/0x60
dump_header+0x4a/0x1df
oom_kill_process.cold.33+0xb/0x10
out_of_memory+0x1bd/0x4e0
mem_cgroup_out_of_memory+0xec/0x100
try_charge_memcg+0x61a/0x690
? __alloc_pages_nodemask+0x166/0x330
__mem_cgroup_charge+0x40/0xa0
mem_cgroup_charge+0x2f/0x80
handle_pte_fault+0x372/0x880
__handle_mm_fault+0x552/0x6d0
? filemap_fdatawait_keep_errors+0x50/0x50
handle_mm_fault+0xca/0x2a0
__do_page_fault+0x1e4/0x440
do_page_fault+0x37/0x12d
? page_fault+0x8/0x30
page_fault+0x1e/0x30
What is OOM?
RIP: 0033:0x6718a4
Code: 8b 3c 9a 48 8d 15 fc 6f 19 00 8b 34 9a 48 8d 14 38 83 ee 01 48 89 54 dd 20 0f
af f7 48 01 c6 0f 1f 80 00 00 00 00 48 8d 0c 3a <48> 89 0a 48 89 ca 48 39 f1 75 f1
48 c7 01 00 00 00 00 48 83 c4 08
RSP: 002b:00007ffdffde2b60 EFLAGS: 00010206
RAX: 00007f006b4c2000 RBX: 000000000000001d RCX: 00007f006b4c3800
RDX: 00007f006b4c2c00 RSI: 00007f006b4c4400 RDI: 0000000000000c00
RBP: 00007f008c800040 R08: 00000000000000c4 R09: 00007f006b400000
R10: 00000000000c2000 R11: 00000000000000c2 R12: 0000000000000003
R13: 000000000000001d R14: 00000000036d6c98 R15: 00007f006b4a62e8
memory: usage 1048576kB, limit 1048576kB, failcnt 0
memory+swap: usage 1048576kB, limit 1048576kB, failcnt 5264
kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Memory cgroup stats for /stats:
anon 1073647616#012file 94208#012kernel_stack 0#012pagetables 0#012percpu 0#012sock
0#012shmem 49152#012file_mapped 49152#012file_dirty 0#012file_writeback
0#012swapcached 0#012anon_thp 0#012file_thp 0#012shmem_thp 0#012inactive_anon
1073627136#012active_anon 69632#012inactive_file 0#012active_file 0#012unevictable
0#012slab_reclaimable 0#012slab_unreclaimable 0#012slab 0#012workingset_refault_anon
0#012workingset_refault_file 1532#012workingset_activate_anon 0#012workingset_activate_file
661#012workingset_restore_anon 0#012workingset_restore_file 619#012workingset_nodereclaim
0#012pgscan 17515#012pgsteal 15856#012pgscan_kswapd 1617#012pgscan_direct
15898#012pgsteal_kswapd 1617#012pgsteal_direct 14239#012pgfault 3216718#012pgmajfault
3#012pgrefill 31284#012pgactivate 28060#012pgdeactivate 31283#012pglazyfree
0#012pglazyfreed 0#012thp_fault_alloc 0#012thp_collapse_alloc 0
cGroup / - system-wide OOM
cGroup /name - local OOM (here: /stats)
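The "Memory cgroup stats" block is hard to read because the syslog daemon has escaped every newline as #012 (octal 012). A small sketch of how it could be turned back into name/value pairs; the function name is made up, and the sample is a shortened excerpt of the log above.

```python
def parse_memcg_stats(logged):
    """Turn 'anon 123#012file 456...' into {'anon': 123, 'file': 456}."""
    stats = {}
    # "#012" is the escaped newline that separated the entries originally.
    for entry in logged.replace("#012", "\n").splitlines():
        fields = entry.split()
        if len(fields) == 2:
            stats[fields[0]] = int(fields[1])
    return stats


# Shortened excerpt of the stats line from the log above.
sample = "anon 1073647616#012file 94208#012kernel_stack 0#012shmem 49152"
stats = parse_memcg_stats(sample)
print(stats["anon"] // 1024, "kB anonymous memory")
print((stats["anon"] + stats["file"]) // 1024, "kB anon + file")
```

In this log, anon plus file adds up to exactly the 1048576kB limit of the cgroup, and almost all of it is anonymous memory that cannot simply be dropped.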
OOM Controls
➢ What controls do we have over the OOM Killer?
OOM Controls - system wide
➢ vm.oom_dump_tasks = 1
OOM Controls - system wide
Tasks state (memory values in pages):
[ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[ 212920] 1480 212920 136532 7756 634880 0 0 php-fpm
[ 212923] 1480 212923 177099 28232 819200 0 0 php-fpm
[ 212962] 1480 212962 176497 27616 815104 0 0 php-fpm
[ 212986] 1480 212986 157912 26572 798720 0 0 php-fpm
[ 213070] 1480 213070 150661 19036 729088 0 0 php-fpm
[ 213120] 1480 213120 153412 21927 757760 0 0 php-fpm
[ 213140] 1480 213140 152899 21203 749568 0 0 php-fpm
[ 213141] 1480 213141 153923 22203 757760 0 0 php-fpm
[ 213142] 1480 213142 153411 21805 753664 0 0 php-fpm
[ 213143] 1480 213143 152387 20753 745472 0 0 php-fpm
[ 213144] 1480 213144 150304 18435 724992 0 0 php-fpm
[ 213145] 1480 213145 151875 20093 741376 0 0 php-fpm
[ 213147] 1480 213147 152899 21543 749568 0 0 php-fpm
[ 213149] 1480 213149 152900 21469 749568 0 0 php-fpm
[ 213150] 1480 213150 153411 21796 753664 0 0 php-fpm
[ 213151] 1480 213151 151875 20090 741376 0 0 php-fpm
[ 213152] 1480 213152 150304 18771 724992 0 0 php-fpm
oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0-1,
oom_memcg=/stats,task_memcg=/stats,task=php-fpm,pid=212923,uid=1480
Memory cgroup out of memory:
Killed process 212923 (php-fpm)
total-vm:708396kB, anon-rss:89532kB, file-rss:23388kB, shmem-rss:8kB, UID:1480
pgtables:800kB oom_score_adj:0
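The task dump reports total_vm and rss in pages, while the "Killed process" line reports kB. A short sketch of the conversion, assuming 4 KiB pages and process names without spaces; the helper name is made up, and the three rows are copied from the dump above.

```python
PAGE_KB = 4  # assumption: 4 KiB pages, as on x86-64


def parse_task_line(line):
    """Parse one '[ pid ] uid tgid total_vm rss ...' row of the task dump."""
    inside, rest = line.split("]", 1)
    pid = int(inside.strip("[ "))
    uid, tgid, total_vm, rss, pgtables_bytes, swapents, adj, name = rest.split()
    return {"pid": pid, "rss_kb": int(rss) * PAGE_KB,
            "total_vm_kb": int(total_vm) * PAGE_KB, "name": name}


dump = """\
[ 212920] 1480 212920 136532 7756 634880 0 0 php-fpm
[ 212923] 1480 212923 177099 28232 819200 0 0 php-fpm
[ 212962] 1480 212962 176497 27616 815104 0 0 php-fpm
"""
tasks = [parse_task_line(row) for row in dump.splitlines()]
biggest = max(tasks, key=lambda t: t["rss_kb"])
print(biggest["pid"], biggest["rss_kb"], "kB resident")
```

For PID 212923 this gives 112928 kB resident, which is the sum of anon-rss, file-rss and shmem-rss in the "Killed process" line, and 708396 kB of total_vm, matching total-vm there.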
OOM Controls - system wide
➢ vm.oom_dump_tasks = 1
➢ vm.panic_on_oom = 0
➢ vm.oom_kill_allocating_task = 0
➢ if set to 1, killing the allocating task may not release enough resources
➢ and may trigger another OOM-Killer run
# sysctl vm.oom_dump_tasks=0
OOM control - per-process
➢ oom_score_adj (formerly oom_adj, now deprecated) - adjusts the OOM score
➢ oom_score - displays the current OOM score
/proc/PID/oom_score_adj
/proc/PID/oom_score
https://www.man7.org/linux/man-pages/man5/proc_pid_oom_adj.5.html
OOM control - per-process
➢ oom_score_adj - between -1000 and 1000
-1000 - never kill
0 - no adjustment (default)
1000 - always kill first
each unit is 0.1% of the memory + swap available to the process
➢ Inheritance: child processes inherit the value
https://www.man7.org/linux/man-pages/man5/proc_pid_oom_adj.5.html
OOM control - per-process
Not many people know this:
choom [options] -p pid
choom [options] -n number -p pid
choom [options] -n number [--] command [args...]
https://www.man7.org/linux/man-pages/man5/proc_pid_oom_adj.5.html
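What choom does can also be done by writing to /proc/PID/oom_score_adj directly. A minimal sketch, with made-up helper names; the demonstration writes to a scratch file standing in for the /proc file so that it runs anywhere and leaves the current process untouched.

```python
import os
import tempfile


def set_oom_score_adj(value, path="/proc/self/oom_score_adj"):
    """Write an OOM score adjustment, rejecting out-of-range values."""
    if not -1000 <= value <= 1000:
        raise ValueError("oom_score_adj must be between -1000 and 1000")
    with open(path, "w") as f:
        f.write(str(value))


def get_oom_score_adj(path="/proc/self/oom_score_adj"):
    """Read the current adjustment back as an integer."""
    with open(path) as f:
        return int(f.read().strip())


# Scratch file standing in for /proc/PID/oom_score_adj.
with tempfile.TemporaryDirectory() as scratch:
    fake = os.path.join(scratch, "oom_score_adj")
    set_oom_score_adj(500, fake)
    print(get_oom_score_adj(fake))
```

On a real system a process may raise its own value freely, but lowering it requires the CAP_SYS_RESOURCE capability.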
OOM context
➢ cpusets (cpu and mems)
➢ mempolicy - NUMA policy
➢ memory limit - ulimits/cgroup limit
➢ the entire system
Two types of OOMs
Local vs. system-wide OOM
cGroup (local) OOM
memory.oom.group - should we kill all processes in the group?
What happens when the OOM triggers
Or why is my machine stuck?
What happens when the OOM triggers
➢ Is it local or system?
* If local, OOM-Killer is triggered for the cGroup
* If system, OOM-Killer is triggered for the whole system
What happens when the OOM triggers
➢ If local:
* How many processes are in the cGroup?
* How many of them are excluded?
* How much memory in total is used by the cGroup?
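Some of these questions can be answered from the cgroup's own files. A sketch under the assumption of a cgroup v2 (unified) hierarchy, where a group exposes cgroup.procs, memory.current, memory.max and memory.oom.group; the function name is made up, and the demonstration uses a scratch directory with invented values instead of a real /sys/fs/cgroup/<name>. It does not check which processes are excluded from OOM killing.

```python
import os
import tempfile


def cgroup_summary(cgroup_dir):
    """Summarize process count, memory usage and limit of one cgroup v2 dir."""
    def read(name):
        with open(os.path.join(cgroup_dir, name)) as f:
            return f.read().strip()

    pids = [int(p) for p in read("cgroup.procs").split()]
    limit = read("memory.max")  # the literal string "max" means "no limit"
    return {
        "processes": len(pids),
        "used_bytes": int(read("memory.current")),
        "limit_bytes": None if limit == "max" else int(limit),
        "kill_whole_group": read("memory.oom.group") == "1",
    }


# Scratch directory imitating a cgroup; all values are made up.
with tempfile.TemporaryDirectory() as fake:
    for name, content in {"cgroup.procs": "101\n102\n103\n",
                          "memory.current": "1073741824\n",
                          "memory.max": "1073741824\n",
                          "memory.oom.group": "0\n"}.items():
        with open(os.path.join(fake, name), "w") as f:
            f.write(content)
    summary = cgroup_summary(fake)
    print(summary)
```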
What happens when the OOM triggers
➢ If system wide:
* Is kswapdX still running reclaim?
* How much RAM do you have?
* How many memory nodes?
* How many NUMA nodes?
* How many processes?
Memory reclaiming
/proc/zoneinfo
Node 0, zone Normal
pages free 74935
min 11778 <- below this: direct reclaiming
low 36539 <- below this: triggers memory reclaiming
high 61300 <- above this: stops the memory reclaiming
spanned 25165824
present 25165824
managed 24763136
Node 1, zone Normal
pages free 2182569
min 11032
low 35279
high 59526
spanned 24641536
present 24641536
managed 24247484
The memory reclaim process is executed by kswapd, one instance per NUMA node.
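The role of the three watermarks can be sketched as a simple comparison against the free page count. This is a simplified illustration with a made-up function name, not the kernel's actual decision logic; the numbers are the ones from the /proc/zoneinfo output above.

```python
def zone_state(free, wmark_min, wmark_low, wmark_high):
    """Describe a zone by comparing free pages with its watermarks."""
    if free < wmark_min:
        return "below min: allocating tasks do direct reclaim"
    if free < wmark_low:
        return "below low: kswapd is woken up to reclaim"
    if free < wmark_high:
        return "between low and high: kswapd keeps going if already running"
    return "above high: no reclaim needed"


# Values from the /proc/zoneinfo output above.
print("Node 0:", zone_state(74935, 11778, 36539, 61300))
print("Node 1:", zone_state(2182569, 11032, 35279, 59526))
```

Both nodes in the example are above their high watermark, although Node 0 has far less headroom than Node 1.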
Memory reclaim process
➢ Flush dirty pages
➢ Flush reclaimable pages to swap
➢ HDD/SSD/NVMe - between 40 and 400MB/s
➢ vm.swappiness and vm.dirty_*
The OOM-Killer process
➢ Read-locks the process list
➢ Collects current memory allocation sizes and OOM scores
➢ Kills the task with the highest badness score, i.e. roughly the highest memory usage.
➢ ( RSS + SWAP + PAGETABLE_BYTES / PAGE_SIZE ) + oom_score_adj * ( totalpages / 1000 )
➢ for more information see oom_badness() in mm/oom_kill.c
➢ Prints the list of tasks
➢ if vm.oom_dump_tasks = 1
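A simplified model of the badness calculation, written from my reading of oom_badness() in mm/oom_kill.c: the memory footprint in pages plus oom_score_adj scaled by one thousandth of the allowed memory. The function name and the 4 KiB page size are assumptions, and the real function also skips tasks that cannot be killed for other reasons.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def badness(rss_pages, swap_pages, pgtables_bytes, oom_score_adj, totalpages):
    """Simplified model of oom_badness(); returns None for excluded tasks."""
    if oom_score_adj == -1000:
        return None  # the task is excluded from OOM killing
    points = rss_pages + swap_pages + pgtables_bytes // PAGE_SIZE
    # oom_score_adj is normalized: one unit is 1/1000 of the allowed memory.
    return points + oom_score_adj * (totalpages // 1000)


# php-fpm PID 212923 from the task dump shown earlier; the cgroup limit of
# 1048576 kB corresponds to 262144 pages.
totalpages = 1048576 * 1024 // PAGE_SIZE
print(badness(28232, 0, 819200, 0, totalpages))
print(badness(28232, 0, 819200, 500, totalpages))
```

With every task in that dump at oom_score_adj 0, the score is driven by rss and page tables alone, and PID 212923 has the largest of both, which is consistent with it being the process that was killed.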
Is this all?
➢ Actually NO :)
➢ Contiguous memory allocations
➢ generally userspace does not request contiguous memory...
➢ but when it does... :)
/proc/buddyinfo
➢ Buddy system memory allocation
➢ Smallest is one page size: 4KB
➢ Largest is 4MB
➢ 11 block sizes (orders 0-10), each a power of 2 pages
/proc/buddyinfo
➢ Zones:
* DMA - first 16MB
* DMA32 - first 4GB
* Normal - everything else
/proc/buddyinfo
Templar:
Node 1, zone DMA 0 0 1 0 2 1 1 0 1 1 3
Node 1, zone DMA32 4 4 4 5 6 2 2 1 1 1 488
Node 1, zone Normal 1437 2734 1350 1365 1084 407 192 130 99 3 2360
Node 0, zone Normal 7183 3166 2094 1081 729 191 25 9 23 18 1
Hawk:
Node 0, zone DMA 0 0 0 0 0 0 0 0 1 1 2
Node 0, zone DMA32 11890 10137 5940 740 654 374 164 87 41 14 20
Node 0, zone Normal 55276 17780 11845 5488 2360 2687 445 981 620 175 88
Node 1, zone Normal 49715 15934 30727 8659 3264 1773 669 155 109 47 163
/proc/buddyinfo
Templar:
Node: 0
Zone: Normal
Free KiB in zone: 251292.00
Fragment size Free fragments Total available KiB
4.00 KiB 2869 11476.0
8.00 KiB 2523 20184.0
16.00 KiB 2387 38192.0
32.00 KiB 1244 39808.0
64.00 KiB 709 45376.0
128.00 KiB 210 26880.0
256.00 KiB 29 7424.0
512.00 KiB 9 4608.0
1.00 MiB 22 22528.0
2.00 MiB 15 30720.0
4.00 MiB 1 4096.0
Node: 1
Zone: Normal
Free KiB in zone: 10225244.00
Fragment size Free fragments Total available KiB
4.00 KiB 20017 80068.0
8.00 KiB 4981 39848.0
16.00 KiB 1745 27920.0
32.00 KiB 1585 50720.0
64.00 KiB 1125 72000.0
128.00 KiB 415 53120.0
256.00 KiB 198 50688.0
512.00 KiB 136 69632.0
1.00 MiB 104 106496.0
2.00 MiB 10 20480.0
4.00 MiB 2357 9654272.0
/proc/buddyinfo
Hawk:
Node: 0
Zone: Normal
Free KiB in zone: 1968572.00
Fragment size (bytes) Free fragments Total available KiB
4096 65061 260244.0
8192 11233 89864.0
16384 8884 142144.0
32768 3691 118112.0
65536 2424 155136.0
131072 1569 200832.0
262144 1027 262912.0
524288 650 332800.0
1048576 341 349184.0
2097152 20 40960.0
4194304 4 16384.0
Node: 1
Zone: Normal
Free KiB in zone: 2951940.00
Fragment size (bytes) Free fragments Total available KiB
4096 222865 891460.0
8192 80496 643968.0
16384 16054 256864.0
32768 6507 208224.0
65536 5168 330752.0
131072 203 25984.0
262144 21 5376.0
524288 157 80384.0
1048576 59 60416.0
2097152 35 71680.0
4194304 92 376832.0
/proc/buddyinfo
Gandalf:
Node: 0
Zone: Normal
Free KiB in zone: 5502176.00
Fragment size Free fragments Total available KiB
4.00 KiB 218652 874608.0
8.00 KiB 361548 2892384.0
16.00 KiB 106889 1710224.0
32.00 KiB 726 23232.0
64.00 KiB 5 320.0
128.00 KiB 5 640.0
256.00 KiB 1 256.0
512.00 KiB 1 512.0
1.00 MiB 0 0.0
2.00 MiB 0 0.0
4.00 MiB 0 0.0
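Tables like the ones above can be derived from the raw /proc/buddyinfo lines: column n holds the number of free blocks of 2^n contiguous pages. A sketch assuming 4 KiB pages, with a made-up function name; the Gandalf line is reconstructed from the "Free fragments" column of the table above rather than taken from a raw capture.

```python
PAGE_KIB = 4  # assumption: 4 KiB pages


def parse_buddyinfo_line(line):
    """Parse 'Node 0, zone Normal 7183 3166 ...' into per-order free KiB."""
    head, _, tail = line.partition("zone")
    node = int(head.replace("Node", "").replace(",", "").strip())
    fields = tail.split()
    zone, counts = fields[0], [int(c) for c in fields[1:]]
    # A block of order n is 2**n contiguous pages.
    free_kib = [count * PAGE_KIB * 2 ** order
                for order, count in enumerate(counts)]
    nonempty = [order for order, count in enumerate(counts) if count]
    return {"node": node, "zone": zone, "free_kib": free_kib,
            "total_kib": sum(free_kib),
            "largest_order": max(nonempty) if nonempty else None}


# Reconstructed from the Gandalf table above.
gandalf = parse_buddyinfo_line(
    "Node 0, zone Normal 218652 361548 106889 726 5 5 1 1 0 0 0")
print(gandalf["total_kib"], "KiB free, largest free order:",
      gandalf["largest_order"])
```

Gandalf illustrates fragmentation: more than 5 GiB is free in the zone, yet the largest free block is order 7 (512 KiB), so a larger contiguous allocation cannot be satisfied without compaction or reclaim.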
Where is my memory going?
* /proc/slabinfo
- task_struct - process/thread information
- TCP/UDP
- iptables/conntrack
- VFS
Where is my memory going?
* kernel objects
root@templar:/proc# ps axf|wc -l
1093
root@templar:/proc# grep task_s /proc/slabinfo
task_struct 48932 49810 3200 10 8 ...
root@templar:/proc#
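To see how much memory such a slab cache holds, the object count can be multiplied by the object size. A rough sketch with a made-up function name; it reads only the leading columns of a /proc/slabinfo line (name, active_objs, num_objs, objsize, objperslab, pagesperslab) and ignores per-slab overhead, since the remaining columns are elided in the output above.

```python
def slab_usage_bytes(line):
    """Estimate memory held by one slab cache from its /proc/slabinfo line."""
    fields = line.split()
    # Columns: name, active_objs, num_objs, objsize, objperslab, pagesperslab
    name, num_objs, objsize = fields[0], int(fields[2]), int(fields[3])
    return name, num_objs * objsize


name, used = slab_usage_bytes("task_struct 48932 49810 3200 10 8 ...")
print(f"{name}: about {used / 2**20:.0f} MiB")
```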
Questions?
Marian Marinov <mm@yuhu.biz>

Understanding your memory usage under Linux

  • 1. Understanding your memory usage Marian Marinov <mm@yuhu.biz> Marian Marinov <mm@yuhu.biz> OpenFest 2024 OpenFest 2024
  • 2. Who am I? Who am I? ➢ Sysadmin with more then 25y of experience ➢ Director of Engineering at Web Hosting Canada ➢ a FOSS dude :)
  • 3. When does OOM occurs? When does OOM occurs? ➢ What do you think?
  • 4. When does OOM occurs? When does OOM occurs? ➢ What do you think? ➢ When there is not enough memory for the applications?
  • 5. When does OOM occurs? When does OOM occurs? ➢ What do you think? ➢ When there is not enough memory for the applications? ➢ When there is not enough memory for the kernel?
  • 6. When does OOM occurs? When does OOM occurs? ➢ What do you think? ➢ When there is not enough memory for the applications? ➢ When there is not enough memory for the kernel? ➢ But is this really the case?
  • 7. Overcommit Overcommit But wait... there is a bit more, before we continue. https://www.kernel.org/doc/html/latest/admin-guide/sysctl/vm.html
  • 8. Overcommit Overcommit But wait... there is a bit more, before we continue. ➢Not all allocated memory is used https://www.kernel.org/doc/html/latest/admin-guide/sysctl/vm.html
  • 9. Overcommit Overcommit But wait... there is a bit more, before we continue. ➢Not all allocated memory is used ➢ buffers ➢ caches ➢ mmap-ed files https://www.kernel.org/doc/html/latest/admin-guide/sysctl/vm.html
  • 10. Overcommit Overcommit But wait... there is a bit more, before we continue. vm.overcommit_memory vm.overcommit_kbytes vm.overcommit_ratio https://www.kernel.org/doc/html/latest/admin-guide/sysctl/vm.html
  • 11. Overcommit Overcommit But wait... there is a bit more, before we continue. _memory is 0 - total memory + swap https://www.kernel.org/doc/html/latest/admin-guide/sysctl/vm.html
  • 12. Overcommit Overcommit But wait... there is a bit more, before we continue. _memory is 0 - total memory + swap _memory is 1 - allow any allocations https://www.kernel.org/doc/html/latest/admin-guide/sysctl/vm.html
  • 13. Overcommit Overcommit But wait... there is a bit more, before we continue. _memory is 0 - total memory + swap _memory is 1 - allow any allocations _memory is 2 - checks the _kbytes or _ratio https://www.kernel.org/doc/html/latest/admin-guide/sysctl/vm.html
  • 14. Overcommit Overcommit But wait... there is a bit more, before we continue. _kbytes and _ratio are for physical memory + swap vm.overcommit_kbytes - specific size vm.overcommit_ratio - percentage https://www.kernel.org/doc/html/latest/admin-guide/sysctl/vm.html
  • 15. What is OOM? What is OOM? OOM - Out of Memory OOM - Out of Memory
  • 16. What is OOM? What is OOM? OOM - Out of Memory OOM - Out of Memory OOM - Killer OOM - Killer
  • 17. What is OOM? What is OOM? php-fpm invoked oom-killer: gfp_mask=0x6000c0(GFP_KERNEL), order=0, oom_score_adj=0 CPU: 38 PID: 213145 Comm: php-fpm Kdump: loaded Tainted: G OE -------- - - 5.4.61 Hardware name: Dell Inc. PowerEdge R630, BIOS 2.16.2 02/19/2023 Call Trace: dump_stack+0x41/0x60 dump_header+0x4a/0x1df oom_kill_process.cold.33+0xb/0x10 out_of_memory+0x1bd/0x4e0 mem_cgroup_out_of_memory+0xec/0x100 try_charge_memcg+0x61a/0x690 ? __alloc_pages_nodemask+0x166/0x330 __mem_cgroup_charge+0x40/0xa0 mem_cgroup_charge+0x2f/0x80 handle_pte_fault+0x372/0x880 __handle_mm_fault+0x552/0x6d0 ? filemap_fdatawait_keep_errors+0x50/0x50 handle_mm_fault+0xca/0x2a0 __do_page_fault+0x1e4/0x440 do_page_fault+0x37/0x12d ? page_fault+0x8/0x30 page_fault+0x1e/0x30
  • 18. What is OOM? What is OOM? php-fpm invoked oom-killer: gfp_mask=0x6000c0(GFP_KERNEL), order=0, oom_score_adj=0 CPU: 38 PID: 213145 Comm: php-fpm Kdump: loaded Tainted: G OE -------- - - 5.4.61 Hardware name: Dell Inc. PowerEdge R630, BIOS 2.16.2 02/19/2023 Call Trace: dump_stack+0x41/0x60 dump_header+0x4a/0x1df oom_kill_process.cold.33+0xb/0x10 out_of_memory+0x1bd/0x4e0 mem_cgroup_out_of_memory+0xec/0x100 try_charge_memcg+0x61a/0x690 ? __alloc_pages_nodemask+0x166/0x330 __mem_cgroup_charge+0x40/0xa0 mem_cgroup_charge+0x2f/0x80 handle_pte_fault+0x372/0x880 __handle_mm_fault+0x552/0x6d0 ? filemap_fdatawait_keep_errors+0x50/0x50 handle_mm_fault+0xca/0x2a0 __do_page_fault+0x1e4/0x440 do_page_fault+0x37/0x12d ? page_fault+0x8/0x30 page_fault+0x1e/0x30
  • 19. What is OOM? What is OOM? php-fpm invoked oom-killer: gfp_mask=0x6000c0(GFP_KERNEL), order=0, oom_score_adj=0 CPU: 38 PID: 213145 Comm: php-fpm Kdump: loaded Tainted: G OE -------- - - 5.4.61 Hardware name: Dell Inc. PowerEdge R630, BIOS 2.16.2 02/19/2023 Call Trace: dump_stack+0x41/0x60 dump_header+0x4a/0x1df oom_kill_process.cold.33+0xb/0x10 out_of_memory+0x1bd/0x4e0 mem_cgroup_out_of_memory+0xec/0x100 try_charge_memcg+0x61a/0x690 ? __alloc_pages_nodemask+0x166/0x330 __mem_cgroup_charge+0x40/0xa0 mem_cgroup_charge+0x2f/0x80 handle_pte_fault+0x372/0x880 __handle_mm_fault+0x552/0x6d0 ? filemap_fdatawait_keep_errors+0x50/0x50 handle_mm_fault+0xca/0x2a0 __do_page_fault+0x1e4/0x440 do_page_fault+0x37/0x12d ? page_fault+0x8/0x30 page_fault+0x1e/0x30
  • 20. What is OOM? What is OOM? php-fpm invoked oom-killer: gfp_mask=0x6000c0(GFP_KERNEL), order=0, oom_score_adj=0 CPU: 38 PID: 213145 Comm: php-fpm Kdump: loaded Tainted: G OE -------- - - 5.4.61 Hardware name: Dell Inc. PowerEdge R630, BIOS 2.16.2 02/19/2023 Call Trace: dump_stack+0x41/0x60 dump_header+0x4a/0x1df oom_kill_process.cold.33+0xb/0x10 out_of_memory+0x1bd/0x4e0 mem_cgroup_out_of_memory+0xec/0x100 try_charge_memcg+0x61a/0x690 ? __alloc_pages_nodemask+0x166/0x330 __mem_cgroup_charge+0x40/0xa0 mem_cgroup_charge+0x2f/0x80 handle_pte_fault+0x372/0x880 __handle_mm_fault+0x552/0x6d0 ? filemap_fdatawait_keep_errors+0x50/0x50 handle_mm_fault+0xca/0x2a0 __do_page_fault+0x1e4/0x440 do_page_fault+0x37/0x12d ? page_fault+0x8/0x30 page_fault+0x1e/0x30
  • 21. What is OOM? What is OOM? php-fpm invoked oom-killer: gfp_mask=0x6000c0(GFP_KERNEL), order=0, oom_score_adj=0 CPU: 38 PID: 213145 Comm: php-fpm Kdump: loaded Tainted: G OE -------- - - 5.4.61 Hardware name: Dell Inc. PowerEdge R630, BIOS 2.16.2 02/19/2023 Call Trace: dump_stack+0x41/0x60 dump_header+0x4a/0x1df oom_kill_process.cold.33+0xb/0x10 out_of_memory+0x1bd/0x4e0 mem_cgroup_out_of_memory+0xec/0x100 try_charge_memcg+0x61a/0x690 ? __alloc_pages_nodemask+0x166/0x330 __mem_cgroup_charge+0x40/0xa0 mem_cgroup_charge+0x2f/0x80 handle_pte_fault+0x372/0x880 __handle_mm_fault+0x552/0x6d0 ? filemap_fdatawait_keep_errors+0x50/0x50 handle_mm_fault+0xca/0x2a0 __do_page_fault+0x1e4/0x440 do_page_fault+0x37/0x12d ? page_fault+0x8/0x30 page_fault+0x1e/0x30
  • 22. What is OOM? What is OOM? RIP: 0033:0x6718a4 Code: 8b 3c 9a 48 8d 15 fc 6f 19 00 8b 34 9a 48 8d 14 38 83 ee 01 48 89 54 dd 20 0f af f7 48 01 c6 0f 1f 80 00 00 00 00 48 8d 0c 3a <48> 89 0a 48 89 ca 48 39 f1 75 f1 48 c7 01 00 00 00 00 48 83 c4 08 RSP: 002b:00007ffdffde2b60 EFLAGS: 00010206 RAX: 00007f006b4c2000 RBX: 000000000000001d RCX: 00007f006b4c3800 RDX: 00007f006b4c2c00 RSI: 00007f006b4c4400 RDI: 0000000000000c00 RBP: 00007f008c800040 R08: 00000000000000c4 R09: 00007f006b400000 R10: 00000000000c2000 R11: 00000000000000c2 R12: 0000000000000003 R13: 000000000000001d R14: 00000000036d6c98 R15: 00007f006b4a62e8 memory: usage 1048576kB, limit 1048576kB, failcnt 0 memory+swap: usage 1048576kB, limit 1048576kB, failcnt 5264 kmem: usage 0kB, limit 9007199254740988kB, failcnt 0 Memory cgroup stats for /stats: anon 1073647616#012file 94208#012kernel_stack 0#012pagetables 0#012percpu 0#012sock 0#012shmem 49152#012file_mapped 49152#012file_dirty 0#012file_writeback 0#012swapcached 0#012anon_thp 0#012file_thp 0#012shmem_thp 0#012inactive_anon 1073627136#012active_anon 69632#012inactive_file 0#012active_file 0#012unevictable 0#012slab_reclaimable 0#012slab_unreclaimable 0#012slab 0#012workingset_refault_anon 0#012workingset_refault_file 1532#012workingset_activate_anon 0#012workingset_activate_file 661#012workingset_restore_anon 0#012workingset_restore_file 619#012workingset_nodereclaim 0#012pgscan 17515#012pgsteal 15856#012pgscan_kswapd 1617#012pgscan_direct 15898#012pgsteal_kswapd 1617#012pgsteal_direct 14239#012pgfault 3216718#012pgmajfault 3#012pgrefill 31284#012pgactivate 28060#012pgdeactivate 31283#012pglazyfree 0#012pglazyfreed 0#012thp_fault_alloc 0#012thp_collapse_alloc 0
  • 23. What is OOM? What is OOM? RIP: 0033:0x6718a4 Code: 8b 3c 9a 48 8d 15 fc 6f 19 00 8b 34 9a 48 8d 14 38 83 ee 01 48 89 54 dd 20 0f af f7 48 01 c6 0f 1f 80 00 00 00 00 48 8d 0c 3a <48> 89 0a 48 89 ca 48 39 f1 75 f1 48 c7 01 00 00 00 00 48 83 c4 08 RSP: 002b:00007ffdffde2b60 EFLAGS: 00010206 RAX: 00007f006b4c2000 RBX: 000000000000001d RCX: 00007f006b4c3800 RDX: 00007f006b4c2c00 RSI: 00007f006b4c4400 RDI: 0000000000000c00 RBP: 00007f008c800040 R08: 00000000000000c4 R09: 00007f006b400000 R10: 00000000000c2000 R11: 00000000000000c2 R12: 0000000000000003 R13: 000000000000001d R14: 00000000036d6c98 R15: 00007f006b4a62e8 memory: usage 1048576kB, limit 1048576kB, failcnt 0 memory+swap: usage 1048576kB, limit 1048576kB, failcnt 5264 kmem: usage 0kB, limit 9007199254740988kB, failcnt 0 Memory cgroup stats for /stats: anon 1073647616#012file 94208#012kernel_stack 0#012pagetables 0#012percpu 0#012sock 0#012shmem 49152#012file_mapped 49152#012file_dirty 0#012file_writeback 0#012swapcached 0#012anon_thp 0#012file_thp 0#012shmem_thp 0#012inactive_anon 1073627136#012active_anon 69632#012inactive_file 0#012active_file 0#012unevictable 0#012slab_reclaimable 0#012slab_unreclaimable 0#012slab 0#012workingset_refault_anon 0#012workingset_refault_file 1532#012workingset_activate_anon 0#012workingset_activate_file 661#012workingset_restore_anon 0#012workingset_restore_file 619#012workingset_nodereclaim 0#012pgscan 17515#012pgsteal 15856#012pgscan_kswapd 1617#012pgscan_direct 15898#012pgsteal_kswapd 1617#012pgsteal_direct 14239#012pgfault 3216718#012pgmajfault 3#012pgrefill 31284#012pgactivate 28060#012pgdeactivate 31283#012pglazyfree 0#012pglazyfreed 0#012thp_fault_alloc 0#012thp_collapse_alloc 0
  • 24. What is OOM? What is OOM? RIP: 0033:0x6718a4 Code: 8b 3c 9a 48 8d 15 fc 6f 19 00 8b 34 9a 48 8d 14 38 83 ee 01 48 89 54 dd 20 0f af f7 48 01 c6 0f 1f 80 00 00 00 00 48 8d 0c 3a <48> 89 0a 48 89 ca 48 39 f1 75 f1 48 c7 01 00 00 00 00 48 83 c4 08 RSP: 002b:00007ffdffde2b60 EFLAGS: 00010206 RAX: 00007f006b4c2000 RBX: 000000000000001d RCX: 00007f006b4c3800 RDX: 00007f006b4c2c00 RSI: 00007f006b4c4400 RDI: 0000000000000c00 RBP: 00007f008c800040 R08: 00000000000000c4 R09: 00007f006b400000 R10: 00000000000c2000 R11: 00000000000000c2 R12: 0000000000000003 R13: 000000000000001d R14: 00000000036d6c98 R15: 00007f006b4a62e8 memory: usage 1048576kB, limit 1048576kB, failcnt 0 memory+swap: usage 1048576kB, limit 1048576kB, failcnt 5264 kmem: usage 0kB, limit 9007199254740988kB, failcnt 0 Memory cgroup stats for /stats: anon 1073647616#012file 94208#012kernel_stack 0#012pagetables 0#012percpu 0#012sock 0#012shmem 49152#012file_mapped 49152#012file_dirty 0#012file_writeback 0#012swapcached 0#012anon_thp 0#012file_thp 0#012shmem_thp 0#012inactive_anon 1073627136#012active_anon 69632#012inactive_file 0#012active_file 0#012unevictable 0#012slab_reclaimable 0#012slab_unreclaimable 0#012slab 0#012workingset_refault_anon 0#012workingset_refault_file 1532#012workingset_activate_anon 0#012workingset_activate_file 661#012workingset_restore_anon 0#012workingset_restore_file 619#012workingset_nodereclaim 0#012pgscan 17515#012pgsteal 15856#012pgscan_kswapd 1617#012pgscan_direct 15898#012pgsteal_kswapd 1617#012pgsteal_direct 14239#012pgfault 3216718#012pgmajfault 3#012pgrefill 31284#012pgactivate 28060#012pgdeactivate 31283#012pglazyfree 0#012pglazyfreed 0#012thp_fault_alloc 0#012thp_collapse_alloc 0
  • 25. What is OOM? What is OOM? RIP: 0033:0x6718a4 Code: 8b 3c 9a 48 8d 15 fc 6f 19 00 8b 34 9a 48 8d 14 38 83 ee 01 48 89 54 dd 20 0f af f7 48 01 c6 0f 1f 80 00 00 00 00 48 8d 0c 3a <48> 89 0a 48 89 ca 48 39 f1 75 f1 48 c7 01 00 00 00 00 48 83 c4 08 RSP: 002b:00007ffdffde2b60 EFLAGS: 00010206 RAX: 00007f006b4c2000 RBX: 000000000000001d RCX: 00007f006b4c3800 RDX: 00007f006b4c2c00 RSI: 00007f006b4c4400 RDI: 0000000000000c00 RBP: 00007f008c800040 R08: 00000000000000c4 R09: 00007f006b400000 R10: 00000000000c2000 R11: 00000000000000c2 R12: 0000000000000003 R13: 000000000000001d R14: 00000000036d6c98 R15: 00007f006b4a62e8 memory: usage 1048576kB, limit 1048576kB, failcnt 0 memory+swap: usage 1048576kB, limit 1048576kB, failcnt 5264 kmem: usage 0kB, limit 9007199254740988kB, failcnt 0 Memory cgroup stats for /stats: anon 1073647616#012file 94208#012kernel_stack 0#012pagetables 0#012percpu 0#012sock 0#012shmem 49152#012file_mapped 49152#012file_dirty 0#012file_writeback 0#012swapcached 0#012anon_thp 0#012file_thp 0#012shmem_thp 0#012inactive_anon 1073627136#012active_anon 69632#012inactive_file 0#012active_file 0#012unevictable 0#012slab_reclaimable 0#012slab_unreclaimable 0#012slab 0#012workingset_refault_anon 0#012workingset_refault_file 1532#012workingset_activate_anon 0#012workingset_activate_file 661#012workingset_restore_anon 0#012workingset_restore_file 619#012workingset_nodereclaim 0#012pgscan 17515#012pgsteal 15856#012pgscan_kswapd 1617#012pgscan_direct 15898#012pgsteal_kswapd 1617#012pgsteal_direct 14239#012pgfault 3216718#012pgmajfault 3#012pgrefill 31284#012pgactivate 28060#012pgdeactivate 31283#012pglazyfree 0#012pglazyfreed 0#012thp_fault_alloc 0#012thp_collapse_alloc 0
  • 26. What is OOM? What is OOM? RIP: 0033:0x6718a4 Code: 8b 3c 9a 48 8d 15 fc 6f 19 00 8b 34 9a 48 8d 14 38 83 ee 01 48 89 54 dd 20 0f af f7 48 01 c6 0f 1f 80 00 00 00 00 48 8d 0c 3a <48> 89 0a 48 89 ca 48 39 f1 75 f1 48 c7 01 00 00 00 00 48 83 c4 08 RSP: 002b:00007ffdffde2b60 EFLAGS: 00010206 RAX: 00007f006b4c2000 RBX: 000000000000001d RCX: 00007f006b4c3800 RDX: 00007f006b4c2c00 RSI: 00007f006b4c4400 RDI: 0000000000000c00 RBP: 00007f008c800040 R08: 00000000000000c4 R09: 00007f006b400000 R10: 00000000000c2000 R11: 00000000000000c2 R12: 0000000000000003 R13: 000000000000001d R14: 00000000036d6c98 R15: 00007f006b4a62e8 memory: usage 1048576kB, limit 1048576kB, failcnt 0 memory+swap: usage 1048576kB, limit 1048576kB, failcnt 5264 kmem: usage 0kB, limit 9007199254740988kB, failcnt 0 Memory cgroup stats for /stats: anon 1073647616#012file 94208#012kernel_stack 0#012pagetables 0#012percpu 0#012sock 0#012shmem 49152#012file_mapped 49152#012file_dirty 0#012file_writeback 0#012swapcached 0#012anon_thp 0#012file_thp 0#012shmem_thp 0#012inactive_anon 1073627136#012active_anon 69632#012inactive_file 0#012active_file 0#012unevictable 0#012slab_reclaimable 0#012slab_unreclaimable 0#012slab 0#012workingset_refault_anon 0#012workingset_refault_file 1532#012workingset_activate_anon 0#012workingset_activate_file 661#012workingset_restore_anon 0#012workingset_restore_file 619#012workingset_nodereclaim 0#012pgscan 17515#012pgsteal 15856#012pgscan_kswapd 1617#012pgscan_direct 15898#012pgsteal_kswapd 1617#012pgsteal_direct 14239#012pgfault 3216718#012pgmajfault 3#012pgrefill 31284#012pgactivate 28060#012pgdeactivate 31283#012pglazyfree 0#012pglazyfreed 0#012thp_fault_alloc 0#012thp_collapse_alloc 0 cGroup / - system wide cGroup /name - local
  • 27. OOM Controls OOM Controls ➢ What controls do we have over the OOM What controls do we have over the OOM Killer? Killer?
  • 28. OOM Controls - system wide OOM Controls - system wide ➢ vm.oom_dump_tasks = 1 vm.oom_dump_tasks = 1
  • 29. OOM Controls - system wide OOM Controls - system wide Tasks state (memory values in pages): [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name [ 212920] 1480 212920 136532 7756 634880 0 0 php-fpm [ 212923] 1480 212923 177099 28232 819200 0 0 php-fpm [ 212962] 1480 212962 176497 27616 815104 0 0 php-fpm [ 212986] 1480 212986 157912 26572 798720 0 0 php-fpm [ 213070] 1480 213070 150661 19036 729088 0 0 php-fpm [ 213120] 1480 213120 153412 21927 757760 0 0 php-fpm [ 213140] 1480 213140 152899 21203 749568 0 0 php-fpm [ 213141] 1480 213141 153923 22203 757760 0 0 php-fpm [ 213142] 1480 213142 153411 21805 753664 0 0 php-fpm [ 213143] 1480 213143 152387 20753 745472 0 0 php-fpm [ 213144] 1480 213144 150304 18435 724992 0 0 php-fpm [ 213145] 1480 213145 151875 20093 741376 0 0 php-fpm [ 213147] 1480 213147 152899 21543 749568 0 0 php-fpm [ 213149] 1480 213149 152900 21469 749568 0 0 php-fpm [ 213150] 1480 213150 153411 21796 753664 0 0 php-fpm [ 213151] 1480 213151 151875 20090 741376 0 0 php-fpm [ 213152] 1480 213152 150304 18771 724992 0 0 php-fpm oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0-1, oom_memcg=/stats,task_memcg=/stats,task=php-fpm,pid=212923,uid=1480 Memory cgroup out of memory: Killed process 212923 (php-fpm) total-vm:708396kB, anon-rss:89532kB, file-rss:23388kB, shmem-rss:8kB, UID:1480 pgtables:800kB oom_score_adj:0
  • 30. OOM Controls - system wide OOM Controls - system wide ➢ vm.oom_dump_tasks = 1 vm.oom_dump_tasks = 1 ➢ vm.panic_on_oom = 0 vm.panic_on_oom = 0
  • 31. OOM Controls - system wide OOM Controls - system wide ➢ vm.oom_dump_tasks = 1 vm.oom_dump_tasks = 1 ➢ vm.panic_on_oom = 0 vm.panic_on_oom = 0 ➢ vm.oom_kill_allocating_task = 0 vm.oom_kill_allocating_task = 0
  • 32. OOM Controls - system wide OOM Controls - system wide ➢ vm.oom_dump_tasks = 1 vm.oom_dump_tasks = 1 ➢ vm.panic_on_oom = 0 vm.panic_on_oom = 0 ➢ vm.oom_kill_allocating_task = 0 vm.oom_kill_allocating_task = 0 ➢ may not release enough resources
  • 33. OOM Controls - system wide OOM Controls - system wide ➢ vm.oom_dump_tasks = 1 vm.oom_dump_tasks = 1 ➢ vm.panic_on_oom = 0 vm.panic_on_oom = 0 ➢ vm.oom_kill_allocating_task = 0 vm.oom_kill_allocating_task = 0 ➢ may not release enough resources ➢ may trigger another OOM-Killer # sysctl vm.oom_dump_tasks=0 # sysctl vm.oom_dump_tasks=0
  • 34. OOM control - per-process OOM control - per-process ➢ oom_score_adj = oom_adj - Adjust the oom_score_adj = oom_adj - Adjust the OOM score OOM score https://www.man7.org/linux/man-pages/man5/proc_pid_oom_adj.5.html
  • 35. OOM control - per-process OOM control - per-process ➢ oom_score_adj = oom_adj - Adjust the oom_score_adj = oom_adj - Adjust the OOM score OOM score ➢ oom_score - display the current OOM oom_score - display the current OOM score score https://www.man7.org/linux/man-pages/man5/proc_pid_oom_adj.5.html
  • 36. OOM control - per-process OOM control - per-process ➢ oom_score_adj = oom_adj - Adjust the oom_score_adj = oom_adj - Adjust the OOM score OOM score ➢ oom_score - display the current OOM oom_score - display the current OOM score score /proc/PID/oom_score_adj /proc/PID/oom_score_adj /proc/PID/oom_score /proc/PID/oom_score https://www.man7.org/linux/man-pages/man5/proc_pid_oom_adj.5.html
  • 37. OOM control - per-process OOM control - per-process ➢ oom_score_adj - between 0-1000 oom_score_adj - between 0-1000 0 - never kill 0 - never kill 1000 - always kill 1000 - always kill https://www.man7.org/linux/man-pages/man5/proc_pid_oom_adj.5.html
  • 38. OOM control - per-process OOM control - per-process ➢ oom_score_adj - between 0-1000 oom_score_adj - between 0-1000 0 - never kill 0 - never kill 1000 - always kill 1000 - always kill % of the process memory + swap % of the process memory + swap https://www.man7.org/linux/man-pages/man5/proc_pid_oom_adj.5.html
  • 39. OOM control - per-process OOM control - per-process ➢ oom_score_adj - between 0-1000 oom_score_adj - between 0-1000 0 - never kill 0 - never kill 1000 - always kill 1000 - always kill % of the process memory + swap % of the process memory + swap ➢ Inheritance Inheritance https://www.man7.org/linux/man-pages/man5/proc_pid_oom_adj.5.html
  • 40. OOM control - per-process OOM control - per-process Not many people know this: Not many people know this: choom [options] -p pid choom [options] -p pid choom [options] -n number -p pid choom [options] -n number -p pid choom [options] -n number [--] command choom [options] -n number [--] command [args...] [args...] https://www.man7.org/linux/man-pages/man5/proc_pid_oom_adj.5.html
  • 41. OOM context OOM context ➢ cpusets (cpu and mems) cpusets (cpu and mems)
  • 42. OOM context OOM context ➢ cpusets (cpu and mems) cpusets (cpu and mems) ➢ mempolicy - NUMA policy mempolicy - NUMA policy
  • 43. OOM context OOM context ➢ cpusets (cpu and mems) cpusets (cpu and mems) ➢ mempolicy - NUMA policy mempolicy - NUMA policy ➢ memory limit - ulimits/cgroup limit memory limit - ulimits/cgroup limit
  • 44. OOM context OOM context ➢ cpusets (cpu and mems) cpusets (cpu and mems) ➢ mempolicy - NUMA policy mempolicy - NUMA policy ➢ memory limit - ulimits/cgroup limit memory limit - ulimits/cgroup limit ➢ the entire system the entire system
  • 45. Two types of OOMs Two types of OOMs Local vs. System wide OOM Local vs. System wide OOM
  • 47. cGroup(local) OOM cGroup(local) OOM memory.oom.group memory.oom.group - should we kill all processes in the - should we kill all processes in the group group
  • 48. What happens when What happens when the OOM triggers the OOM triggers Or why is my machine stuck? Or why is my machine stuck?
  • 49. What happens when What happens when the OOM triggers the OOM triggers ➢Is it local or system? Is it local or system?
  • 50. What happens when What happens when the OOM triggers the OOM triggers ➢Is it local or system? Is it local or system? * If local, OOM-Killer is triggered for * If local, OOM-Killer is triggered for the cGroup the cGroup
  • 51. What happens when What happens when the OOM triggers the OOM triggers ➢Is it local or system? Is it local or system? * If local, OOM-Killer is triggered for * If local, OOM-Killer is triggered for the cGroup the cGroup * If system, OOM-Killer is triggered for * If system, OOM-Killer is triggered for the whole system the whole system
  • 52. What happens when What happens when the OOM triggers the OOM triggers ➢If local: If local: * How many processes are in the cGroup? * How many processes are in the cGroup?
  • 53. What happens when What happens when the OOM triggers the OOM triggers ➢If local: If local: * How many processes are in the cGroup? * How many processes are in the cGroup? * How many of them are excluded? * How many of them are excluded?
  • 54. What happens when What happens when the OOM triggers the OOM triggers ➢If local: If local: * * How many processes are in the cGroup? How many processes are in the cGroup? * How many of them are excluded? * How many of them are excluded? * How much memory in total is used by * How much memory in total is used by the cGroup? the cGroup?
  • 55. What happens when What happens when the OOM triggers the OOM triggers ➢ If system wide: If system wide: * Is kswapdX still running reclaim? * Is kswapdX still running reclaim?
  • 56. What happens when What happens when the OOM triggers the OOM triggers ➢ If system wide: If system wide: * Is kswapdX still running reclaim? * Is kswapdX still running reclaim? * How much RAM do you have? * How much RAM do you have?
  • 57. What happens when What happens when the OOM triggers the OOM triggers ➢ If system wide: If system wide: * Is kswapdX still running reclaim? * Is kswapdX still running reclaim? * How much RAM do you have? * How much RAM do you have? * How many memory nodes? * How many memory nodes?
  • 58. What happens when What happens when the OOM triggers the OOM triggers ➢ If system wide: If system wide: * Is kswapdX still running reclaim? * Is kswapdX still running reclaim? * How much RAM do you have? * How much RAM do you have? * How many memory nodes? * How many memory nodes? * How many NUMA nodes? * How many NUMA nodes?
  • 59. What happens when What happens when the OOM triggers the OOM triggers ➢ If system wide: If system wide: * Is kswapdX still running reclaim? * Is kswapdX still running reclaim? * How much RAM do you have? * How much RAM do you have? * How many memory nodes? * How many memory nodes? * How many NUMA nodes? * How many NUMA nodes? * How many processes? * How many processes?
  • 60. Memory reclaiming Memory reclaiming /proc/zoneinfo /proc/zoneinfo Node 0, zone Normal pages free 74935 min 11778 low 36539 high 61300 spanned 25165824 present 25165824 managed 24763136 Node 1, zone Normal pages free 2182569 min 11032 low 35279 high 59526 spanned 24641536 present 24641536 managed 24247484
  • 61. Memory reclaiming Memory reclaiming /proc/zoneinfo /proc/zoneinfo Node 0, zone Normal pages free 74935 min 11778 low 36539 triggers memory reclaiming high 61300 spanned 25165824 present 25165824 managed 24763136 Node 1, zone Normal pages free 2182569 min 11032 low 35279 high 59526 spanned 24641536 present 24641536 managed 24247484
  • 62. Memory reclaiming Memory reclaiming /proc/zoneinfo /proc/zoneinfo Node 0, zone Normal pages free 74935 min 11778 low 36539 high 61300 stops the memory reclaiming spanned 25165824 present 25165824 managed 24763136 Node 1, zone Normal pages free 2182569 min 11032 low 35279 high 59526 spanned 24641536 present 24641536 managed 24247484
  • 63. Memory reclaiming Memory reclaiming /proc/zoneinfo /proc/zoneinfo Node 0, zone Normal pages free 74935 min 11778 direct reclaiming low 36539 high 61300 spanned 25165824 present 25165824 managed 24763136 Node 1, zone Normal pages free 2182569 min 11032 low 35279 high 59526 spanned 24641536 present 24641536 managed 24247484
  • 64. Memory reclaiming Memory reclaiming /proc/zoneinfo /proc/zoneinfo Node 0, zone Normal pages free 74935 min 11778 low 36539 high 61300 spanned 25165824 present 25165824 managed 24763136 Node 1, zone Normal pages free 2182569 min 11032 low 35279 high 59526 spanned 24641536 present 24641536 managed 24247484 Memory reclaim process is executed by kswapd per NUMA node.
  • 65. Memory reclaim process Memory reclaim process ➢ Flush dirty pages Flush dirty pages
  • 66. Memory reclaim process Memory reclaim process ➢ Flush dirty pages Flush dirty pages ➢ Flush reclaimable pages from swap Flush reclaimable pages from swap
  • 67. Memory reclaim process Memory reclaim process ➢ Flush dirty pages Flush dirty pages ➢ Flush reclaimable pages from swap Flush reclaimable pages from swap ➢ HDD/SSD/NVMe - between 40 and 400MB/s HDD/SSD/NVMe - between 40 and 400MB/s
  • 68. Memory reclaim process Memory reclaim process ➢ Flush dirty pages Flush dirty pages ➢ Flush reclaimable pages from swap Flush reclaimable pages from swap ➢ HDD/SSD/NVMe - between 40 and 400MB/s HDD/SSD/NVMe - between 40 and 400MB/s ➢ vm.swapiness and vm.dirty_ vm.swapiness and vm.dirty_
  • 69. The OOM-Killer process
      ➢ Read-locks the process list
  • 70. The OOM-Killer process
      ➢ Read-locks the process list
      ➢ Collects current memory allocation sizes and oom scores
  • 71. The OOM-Killer process
      ➢ Read-locks the process list
      ➢ Collects current memory allocation sizes and oom scores
      ➢ Kills the task with the highest memory usage and the highest badness score.
        ➢ ( RSS + SWAP + PAGETABLE / PAGE_SIZE ) + oom_score_adj * ( totalpages / 1000 )
        ➢ for more information see oom_badness() in mm/oom_kill.c
  • 72. The OOM-Killer process
      ➢ Read-locks the process list
      ➢ Collects current memory allocation sizes and oom scores
      ➢ Kills the task with the highest memory usage and the highest badness score.
        ➢ ( RSS + SWAP + PAGETABLE / PAGE_SIZE ) + oom_score_adj * ( totalpages / 1000 )
        ➢ for more information see oom_badness() in mm/oom_kill.c
      ➢ Prints the list of tasks
        ➢ if vm.oom_dump_tasks=1
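The badness score can be sketched as a small Python function. This is a simplified reading of oom_badness() in mm/oom_kill.c, not the kernel code itself: in recent kernels the `totalpages / 1000` term is multiplied by the task's oom_score_adj, and a task with oom_score_adj of -1000 is never selected. The parameter names are chosen for this example.

```python
OOM_SCORE_ADJ_MIN = -1000  # tasks with this adjustment are exempt from the OOM killer


def oom_badness(rss_pages, swap_pages, pgtable_bytes, oom_score_adj,
                totalpages, page_size=4096):
    """Return the badness points of a task, or None if it must not be killed."""
    if oom_score_adj == OOM_SCORE_ADJ_MIN:
        return None
    # Memory footprint in pages: resident pages, swapped-out pages, page tables.
    points = rss_pages + swap_pages + pgtable_bytes // page_size
    # Each oom_score_adj unit is worth one thousandth of the allowed memory.
    points += oom_score_adj * (totalpages // 1000)
    return points


def pick_victim(tasks, totalpages):
    """tasks: iterable of (name, rss_pages, swap_pages, pgtable_bytes, adj)."""
    scored = [(oom_badness(r, s, p, adj, totalpages), name)
              for name, r, s, p, adj in tasks]
    scored = [item for item in scored if item[0] is not None]
    return max(scored)[1] if scored else None
```

A positive oom_score_adj can make a small process the preferred victim, while -1000 protects it entirely; that is the knob to use for daemons that must survive an OOM event.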
  • 73. Is this all?
      ➢ Actually NO :)
  • 74. Is this all?
      ➢ Actually NO :)
      ➢ Contiguous memory allocations
  • 75. Is this all?
      ➢ Actually NO :)
      ➢ Contiguous memory allocations
        ➢ generally userspace does not request contiguous memory...
        ➢ but when it does... :)
  • 76. /proc/buddyinfo
      ➢ Buddy system memory allocation
      ➢ Smallest block is one page: 4KB
      ➢ Largest block is 4MB
      ➢ 11 orders, each block size a power of 2 pages
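A short sketch of the arithmetic behind these numbers, assuming 4 KB pages and 11 orders as stated on the slide. The buddy of a block is found by flipping one bit of its page frame number, which is what makes splitting and merging cheap. The names below are illustrative only.

```python
PAGE_SIZE_KIB = 4   # assumed page size
NUM_ORDERS = 11     # orders 0..10


def block_sizes_kib(page_size_kib=PAGE_SIZE_KIB, orders=NUM_ORDERS):
    """Block size in KiB for every order: one page times 2**order."""
    return [page_size_kib << order for order in range(orders)]


def buddy_pfn(pfn, order):
    """Page frame number of the buddy of the block starting at pfn."""
    return pfn ^ (1 << order)


sizes = block_sizes_kib()
# sizes runs from 4 KiB (order 0) up to 4096 KiB = 4 MiB (order 10)
```

Two free buddies of order n merge into one free block of order n+1; a fragmented system is one where that merge rarely succeeds, so the high orders stay empty.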
  • 77. /proc/buddyinfo
      ➢ Zones:
        * DMA - first 16MB (on x86)
        * DMA32 - first 4GB
        * Normal - everything else
  • 78. /proc/buddyinfo
      Templar:
      Node 1, zone    DMA      0      0      1      0      2      1      1      0      1      1      3
      Node 1, zone  DMA32      4      4      4      5      6      2      2      1      1      1    488
      Node 1, zone Normal   1437   2734   1350   1365   1084    407    192    130     99      3   2360
      Node 0, zone Normal   7183   3166   2094   1081    729    191     25      9     23     18      1
      Hawk:
      Node 0, zone    DMA      0      0      0      0      0      0      0      0      1      1      2
      Node 0, zone  DMA32  11890  10137   5940    740    654    374    164     87     41     14     20
      Node 0, zone Normal  55276  17780  11845   5488   2360   2687    445    981    620    175     88
      Node 1, zone Normal  49715  15934  30727   8659   3264   1773    669    155    109     47    163
  • 79. /proc/buddyinfo
      Templar:
      Node: 0  Zone: Normal  Free KiB in zone: 251292.00
        Fragment size   Free fragments   Total available KiB
        4.00 KiB                  2869               11476.0
        8.00 KiB                  2523               20184.0
        16.00 KiB                 2387               38192.0
        32.00 KiB                 1244               39808.0
        64.00 KiB                  709               45376.0
        128.00 KiB                 210               26880.0
        256.00 KiB                  29                7424.0
        512.00 KiB                   9                4608.0
        1.00 MiB                    22               22528.0
        2.00 MiB                    15               30720.0
        4.00 MiB                     1                4096.0
      Node: 1  Zone: Normal  Free KiB in zone: 10225244.00
        Fragment size   Free fragments   Total available KiB
        4.00 KiB                 20017               80068.0
        8.00 KiB                  4981               39848.0
        16.00 KiB                 1745               27920.0
        32.00 KiB                 1585               50720.0
        64.00 KiB                 1125               72000.0
        128.00 KiB                 415               53120.0
        256.00 KiB                 198               50688.0
        512.00 KiB                 136               69632.0
        1.00 MiB                   104              106496.0
        2.00 MiB                    10               20480.0
        4.00 MiB                  2357             9654272.0
  • 80. /proc/buddyinfo
      Hawk:
      Node: 0  Zone: Normal  Free KiB in zone: 1968572.00
        Fragment size (bytes)   Free fragments   Total available KiB
        4096                             65061              260244.0
        8192                             11233               89864.0
        16384                             8884              142144.0
        32768                             3691              118112.0
        65536                             2424              155136.0
        131072                            1569              200832.0
        262144                            1027              262912.0
        524288                             650              332800.0
        1048576                            341              349184.0
        2097152                             20               40960.0
        4194304                              4               16384.0
      Node: 1  Zone: Normal  Free KiB in zone: 2951940.00
        Fragment size (bytes)   Free fragments   Total available KiB
        4096                            222865              891460.0
        8192                             80496              643968.0
        16384                            16054              256864.0
        32768                             6507              208224.0
        65536                             5168              330752.0
        131072                             203               25984.0
        262144                              21                5376.0
        524288                             157               80384.0
        1048576                             59               60416.0
        2097152                             35               71680.0
        4194304                             92              376832.0
  • 81. /proc/buddyinfo
      Gandalf:
      Node: 0  Zone: Normal  Free KiB in zone: 5502176.00
        Fragment size   Free fragments   Total available KiB
        4.00 KiB                218652              874608.0
        8.00 KiB                361548             2892384.0
        16.00 KiB               106889             1710224.0
        32.00 KiB                  726               23232.0
        64.00 KiB                    5                 320.0
        128.00 KiB                   5                 640.0
        256.00 KiB                   1                 256.0
        512.00 KiB                   1                 512.0
        1.00 MiB                     0                   0.0
        2.00 MiB                     0                   0.0
        4.00 MiB                     0                   0.0
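The per-order tables on slides 79 to 81 can be reproduced from a raw /proc/buddyinfo line with a small parser. The sketch below is not the tool used for the slides; it assumes 4 KiB pages and the helper names are invented for this example.

```python
def parse_buddyinfo_line(line, page_size_kib=4):
    """Split one /proc/buddyinfo line into node, zone and per-order rows.

    Each row is (block_size_kib, free_blocks, total_kib).
    """
    fields = line.split()
    node = int(fields[1].rstrip(","))
    zone = fields[3]
    counts = [int(value) for value in fields[4:]]
    rows = [(page_size_kib << order, count, count * (page_size_kib << order))
            for order, count in enumerate(counts)]
    return node, zone, rows


def free_kib(rows):
    """Total free memory in the zone, in KiB."""
    return sum(total for _, _, total in rows)


def largest_free_order(rows):
    """Highest order that still has at least one free block, or None."""
    orders = [order for order, (_, count, _) in enumerate(rows) if count > 0]
    return max(orders) if orders else None


# Free-fragment counts taken from the Gandalf slide.
gandalf = "Node 0, zone Normal 218652 361548 106889 726 5 5 1 1 0 0 0"
node, zone, rows = parse_buddyinfo_line(gandalf)
```

For Gandalf this gives 5502176 KiB free, matching the slide, yet the largest available block is order 7 (512 KiB). More than 5 GB is free, but an allocation that needs a contiguous 1 MiB or larger block cannot be satisfied without compaction or reclaim.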
  • 82. Where is my memory going?
      * /proc/slabinfo
        - task_struct (process/thread information)
        - TCP/UDP
        - iptables/conntrack
        - VFS
  • 83. Where is my memory going?
      * kernel objects
        root@templar:/proc# ps axf|wc -l
        1093
        root@templar:/proc# grep task_s /proc/slabinfo
        task_struct 48932 49810 3200 10 8 ...
        root@templar:/proc#
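To put a number on the task_struct line, here is a minimal sketch that turns the leading /proc/slabinfo columns into bytes. It assumes the standard column order (name, active_objs, num_objs, objsize, objperslab, pagesperslab) and 4 KiB pages; the function name is hypothetical, and the truncated part of the line is simply ignored.

```python
def slab_usage(line, page_size=4096):
    """Estimate the memory held by one slab cache from its slabinfo line."""
    fields = line.split()
    name = fields[0]
    active_objs, num_objs, objsize, objperslab, pagesperslab = (
        int(value) for value in fields[1:6])
    # Number of slabs needed to hold num_objs objects (rounded up).
    num_slabs = -(-num_objs // objperslab)
    return {
        "name": name,
        "active_objs": active_objs,
        "object_bytes": num_objs * objsize,
        "slab_bytes": num_slabs * pagesperslab * page_size,
    }


usage = slab_usage("task_struct 48932 49810 3200 10 8 ...")
```

With the values from the slide, the task_struct cache alone holds about 159 MB of objects (about 163 MB counting whole slabs). Note that `ps axf` counts 1093 processes while the cache has 48932 active objects, since every thread has its own task_struct.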