Linux Kernel Project:
Exploring sched_ext and Machine Learning
Presenters: EricccTaiwan, charliechiou
2025/06/28
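周呈陽, second-year master's student, Institute of Computer and Communication Engineering, National Cheng Kung University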
GitHub: EricccTaiwan
LinkedIn: Eric Chou
MS @ NCKUCCE
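邱柏穎, first-year master's student, Institute of Computer and Communication Engineering, National Cheng Kung University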
GitHub: charliechiou
LinkedIn: Po-Ying Chiu
MS @ NCKUCCE
Outline
● A review of CFS and EEVDF
● Innovations and mechanisms behind sched_ext
● From custom FCFS/RR scheduler to introducing Machine Learning
● What’s Next ?
CFS[1] and EEVDF[2]
● Both CFS (since v2.6.23) and EEVDF (since v6.6) are general-purpose schedulers in the Linux kernel.
● CFS: selects the task with the smallest vruntime[3], aiming to ensure fairness.
● EEVDF: among eligible tasks (those whose vruntime does not exceed the average vruntime), chooses the one with the earliest virtual deadline.
Ref : 《Demystifying the Linux CPU Scheduler》 - by Ching-Chun (“Jserv”) Huang
[1] Completely Fair Scheduler
[2] Earliest Eligible Virtual Deadline First
[3] virtual runtime
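The two selection rules above can be contrasted with a small user-space sketch. This is not kernel code: the `Task` struct, its field names, and the helper `sample_tasks` are hypothetical, and eligibility is approximated as "vruntime not above the weight-averaged vruntime".

```rust
/// Toy runnable task. Field names are illustrative, not the kernel's.
#[derive(Debug, Clone, PartialEq)]
pub struct Task {
    pub name: &'static str,
    pub weight: u64,   // load weight (derived from the nice value in a real kernel)
    pub vruntime: u64, // virtual runtime consumed so far
    pub deadline: u64, // virtual deadline (roughly vruntime plus a weight-scaled slice)
}

/// CFS-style pick: the task with the smallest vruntime.
pub fn pick_cfs(tasks: &[Task]) -> Option<&Task> {
    tasks.iter().min_by_key(|t| t.vruntime)
}

/// EEVDF-style pick: among eligible tasks, the earliest virtual deadline.
/// Eligible means vruntime <= weighted average V. To avoid integer division
/// this is checked as vruntime * sum(w) <= sum(w * vruntime).
pub fn pick_eevdf(tasks: &[Task]) -> Option<&Task> {
    let total_w: u128 = tasks.iter().map(|t| t.weight as u128).sum();
    let weighted: u128 = tasks
        .iter()
        .map(|t| t.weight as u128 * t.vruntime as u128)
        .sum();
    tasks
        .iter()
        .filter(|t| t.vruntime as u128 * total_w <= weighted)
        .min_by_key(|t| t.deadline)
}

/// Three equal-weight tasks; the average vruntime is 300.
pub fn sample_tasks() -> Vec<Task> {
    vec![
        Task { name: "A", weight: 1024, vruntime: 100, deadline: 400 },
        Task { name: "B", weight: 1024, vruntime: 200, deadline: 250 },
        Task { name: "C", weight: 1024, vruntime: 600, deadline: 610 },
    ]
}
```

With `sample_tasks()`, the CFS-style rule picks A (smallest vruntime), while the EEVDF-style rule picks B: A and B are both eligible, and B has the earlier virtual deadline. C is never considered because its vruntime is above the average.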
Outline
● A review of CFS and EEVDF
● Innovations and mechanisms behind sched_ext
○ What is sched_ext ?
○ Why sched_ext ?
○ How sched_ext ?
○ sched_ext keynote
● From custom FCFS/RR scheduler to introducing Machine Learning
● What’s Next ?
What is sched_ext[1] ?
“sched_ext (scx) is a Linux kernel feature which enables implementing kernel thread schedulers in eBPF and dynamically loading them.”
[1] sched_ext: scheduler extension
● sched_ext is an extensible scheduler class that allows scheduling policies to be built with eBPF.
● Scheduling policies are implemented as loadable eBPF programs that run in kernel context yet can be swapped at run time.
● sched_ext was merged in September 2024 for Linux v6.12, which is the minimum required mainline kernel.
○ Re: [PATCHSET v6] sched: Implement BPF extensible scheduler class
● Safety: the eBPF verifier rejects unsafe programs at load time, and if a loaded scheduler hits a runtime error or stalls runnable tasks, the kernel automatically falls back to the default CFS/EEVDF scheduler.
Ref : 《Demystifying the Linux CPU Scheduler》 - by Ching-Chun (“Jserv”) Huang
Why sched_ext ?
● The default schedulers are tuned for best-effort throughput.
● Custom eBPF schedulers can be workload-aware.
○ e.g., scx_lavd for games (latency-critical tasks).
● The steps of customizing a CPU scheduler without sched_ext - by David Vernet [LKML, May 14, 2024] :
1. Tweak and recompile the kernel
2. Reinstall the kernel on the Steam Deck
3. Reboot the Steam Deck
4. Reload a game and let caches rewarm
5. Measure FPS
● With sched_ext, it’s just a “5 second compile job + 1 second to reload a safe BPF scheduler.”
$ meson compile -C
$ meson install -C build
How sched_ext ?
Interface 1: core kernel scheduler ⇒ scheduler class
The core scheduler hands all SCHED_NORMAL events to the sched_ext class.
Interface 2: sched_ext framework ⇒ eBPF scheduler
The sched_ext framework forwards those events to the eBPF scheduler’s callback functions.
Interface 3: eBPF scheduler ⇒ sched_ext framework
The eBPF scheduler uses helpers to manage DiSpatch Queues (DSQs), enqueue or dequeue tasks, and kick CPUs.
Interface 4: eBPF scheduler ⟺ user-space counterpart
A user-space program exchanges metrics and settings with the scheduler via eBPF maps and ring buffers.
Ref : sched_ext: scheduler architecture and interfaces (Part 2) by Changwoo Min
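To make the four interfaces more concrete, here is a toy user-space model of the flow, not the real sched_ext API. The names `Pid`, `Dsq`, `SchedOps`, `GlobalFifo`, and `run_framework` are all hypothetical; the `enqueue` and `dispatch` methods only loosely mirror the callbacks of the same name that a sched_ext scheduler provides.

```rust
use std::collections::VecDeque;

/// Stand-in for a kernel task; only a pid is modeled.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Pid(pub u32);

/// Stand-in for a DiSpatch Queue (DSQ): a FIFO of runnable tasks.
#[derive(Default)]
pub struct Dsq {
    queue: VecDeque<Pid>,
}

impl Dsq {
    /// Helper the scheduler calls to queue a task (Interface 3).
    pub fn insert(&mut self, pid: Pid) {
        self.queue.push_back(pid);
    }
    /// Helper the scheduler calls to take the next task (Interface 3).
    pub fn consume(&mut self) -> Option<Pid> {
        self.queue.pop_front()
    }
}

/// Callbacks the framework invokes on the scheduler (Interface 2).
pub trait SchedOps {
    fn enqueue(&mut self, dsq: &mut Dsq, pid: Pid);
    fn dispatch(&mut self, dsq: &mut Dsq) -> Option<Pid>;
}

/// A global-FIFO policy. `nr_enqueued` plays the role of a counter that a
/// user-space counterpart would read through a map (Interface 4).
pub struct GlobalFifo {
    pub nr_enqueued: u64,
}

impl SchedOps for GlobalFifo {
    fn enqueue(&mut self, dsq: &mut Dsq, pid: Pid) {
        self.nr_enqueued += 1;
        dsq.insert(pid);
    }
    fn dispatch(&mut self, dsq: &mut Dsq) -> Option<Pid> {
        dsq.consume()
    }
}

/// The "framework": receives wakeups from the core scheduler (Interface 1),
/// forwards them to the policy, then asks the policy what to run next.
pub fn run_framework<S: SchedOps>(sched: &mut S, wakeups: &[Pid]) -> Vec<Pid> {
    let mut dsq = Dsq::default();
    for &pid in wakeups {
        sched.enqueue(&mut dsq, pid);
    }
    let mut order = Vec::new();
    while let Some(pid) = sched.dispatch(&mut dsq) {
        order.push(pid);
    }
    order
}
```

Swapping `GlobalFifo` for another `SchedOps` implementation changes the policy without touching `run_framework`, which is the separation the four interfaces are meant to provide.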
sched_ext keynote
“I'm also not a believer in the argument that has been used (multiple times) that the BPF scheduler would keep people from participating in scheduler development. I personally think the main thing that keeps people from participating is too high barriers to participation.”
- Linus Torvalds [LKML, June 24, 2024]
Benefits:
● Flexibility:
○ Customizable schedulers in user space.
● Agility:
○ Faster iteration and development.
● Accessibility:
○ Easier participation in scheduler innovation.
Outline
● A review of CFS and EEVDF
● Innovations and mechanisms behind sched_ext
● From custom FCFS/RR scheduler to introducing Machine Learning
○ FCFS/RR scheduler
○ ML-based load prediction and adaptation
● What’s Next ?
FCFS/RR scheduler
● With sched_ext, we can assign the time slice in user space.
[Figure: Perfetto traces of sched_ext scheduling, labeled 10_000_000, 20_000_000, and 30_000_000]
Perfetto : https://ui.perfetto.dev/
dispatched_task.slice_ns = u64::MAX;
● Starvation !
● FCFS does not imply an infinite time slice.
Ref : scx_rlfifo: Clarify Round-Robin scheduling #1774
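The starvation effect of an unbounded slice can be reproduced with a small single-CPU simulation. This is a sketch under simplifying assumptions (no preemption, no blocking, all tasks ready at time 0), and `simulate_fifo` is a hypothetical helper, not part of scx_rlfifo.

```rust
use std::collections::VecDeque;

/// Simulates one CPU draining a FIFO run queue.
/// Each entry is (name, remaining work in ns). A dispatched task runs for at
/// most `slice_ns`, then goes back to the tail of the queue if unfinished.
/// Returns (name, completion time in ns) in completion order.
pub fn simulate_fifo(
    tasks: &[(&'static str, u64)],
    slice_ns: u64,
) -> Vec<(&'static str, u64)> {
    assert!(slice_ns > 0, "a zero slice would never make progress");
    let mut queue: VecDeque<(&'static str, u64)> = tasks.iter().copied().collect();
    let mut now: u64 = 0;
    let mut finished = Vec::new();
    while let Some((name, remaining)) = queue.pop_front() {
        // Run until the slice expires or the task completes, whichever is first.
        let ran = remaining.min(slice_ns);
        now += ran;
        if remaining > ran {
            queue.push_back((name, remaining - ran));
        } else {
            finished.push((name, now));
        }
    }
    finished
}
```

For a 100 ms CPU hog queued ahead of a 5 ms task, a 10 ms slice lets the short task finish at 15 ms, while `u64::MAX` as the slice makes it wait until 105 ms. If the hog never finished, the short task would never run, which is the starvation the slide warns about.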
ML-based load prediction and adaptation
● Main problems:
○ Scheduler ? ➡ scx_rusty
○ ML Framework ? ➡ Candle
○ Topology ? ➡ Level 2 Cache
[Figure: "Are we learning yet ?" with the candidates compared: *scx_rusty (Rust) or scx_lavd (eBPF), *Candle or Burn; * marks the choice]
Env. 1 : Ubuntu 25.04 (GNU/Linux 6.14.0-22-generic, x86_64)
● The original scx_rusty balances load across NUMA nodes and Last Level Cache (LLC) domains.
● Whether it would be useful to define domains in other terms is another issue.
[Figure annotation: Early return :(]
Ref : scx_rusty: Domain detect didn't work #2214
Env. 2 : Ubuntu 25.04 (GNU/Linux 6.14.0-15-generic, arm)
[Figure: Last Level Cache-based balancing vs. Level 2 Cache-based balancing, annotated "Lower !" and "Higher !"]
● Load balancing performs poorly on domains closer to the core.
[Figure: task selection pipeline: Filter, Pick, tests, Check ("Sufficient ?"), annotated "Collect data here!" and "Apply ML here!"]
$ stress-ng --cpu 30 -l 100 --timeout 120s --cpu-method matrixprod
● Task selection: ML (training and inferencing) filters out candidates.
● ML prevents selecting suboptimal tasks.
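The filter-then-pick idea can be sketched as follows. Everything here is an assumption for illustration: the slides do not state the feature set, the model architecture, or the threshold, and the actual work uses Candle inside scx_rusty rather than a hand-written logistic model. `Candidate`, `Model`, and `select_task` are hypothetical names.

```rust
/// Hypothetical per-task features; the real feature set is not given in the slides.
#[derive(Debug, Clone, Copy)]
pub struct Candidate {
    pub pid: u32,
    pub load: f64,      // task load as a fraction of one CPU, 0.0..=1.0
    pub cache_hot: f64, // 1.0 if the task ran recently in its current domain, else 0.0
}

/// Tiny logistic model standing in for a trained network:
/// score = sigmoid(w_load * load + w_hot * cache_hot + bias).
pub struct Model {
    pub w_load: f64,
    pub w_hot: f64,
    pub bias: f64,
}

impl Model {
    /// Predicted probability that migrating this task is worthwhile.
    pub fn score(&self, c: &Candidate) -> f64 {
        let z = self.w_load * c.load + self.w_hot * c.cache_hot + self.bias;
        1.0 / (1.0 + (-z).exp())
    }
}

/// Filter step: drop candidates the model scores below `threshold`.
/// Pick step: among the survivors, choose the one with the largest load.
pub fn select_task(cands: &[Candidate], model: &Model, threshold: f64) -> Option<u32> {
    cands
        .iter()
        .filter(|c| model.score(c) >= threshold)
        .max_by(|a, b| a.load.total_cmp(&b.load))
        .map(|c| c.pid)
}
```

With weights that penalize cache-hot tasks, the heaviest task is skipped when it is cache-hot and the next-heaviest cold task is migrated instead. A threshold of 0.0 disables the filter and recovers a plain "pick the heaviest task" rule.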
Kernel compilation (migrations counted with: $ sudo perf stat -e sched:sched_migrate_task)

                   EEVDF      scx_rusty       scx_rusty       scx_rusty
                              L3 cache bal*   L2 cache bal*   L2 cache ML bal* (Ours)
CPU                1417 %     1315 %          1070 %          1400 %
time               1:36.24    1:42.99         2:01.38         1:37.07
Migrate times      55,873     217,361         428,263         457,935

$ schbench -m 4 -t 4 -r 10

                   EEVDF      scx_rusty       scx_rusty       scx_rusty
                              L3 cache bal*   L2 cache bal*   L2 cache ML bal* (Ours)
wake lat 99.0th    2,988      1,670           1,086           2,420
request lat 99.0th 14,032     10,736          12,432          9,232
RPS 50.0th         1,870      3,644           2,828           3,684

● Migrate times increased from EEVDF (55,873) to scx_rusty (217,361).
● Migrate times increased from L3 cache balancing (217,361) to L2 cache balancing (428,263).
● ML improves performance under L2 cache-based balancing.
● Request latency improved significantly (9,232 with ML vs. 12,432 without ML under L2 cache balancing).
*bal: balance.
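The size of these changes can be read off the table with a one-line relative-change helper. `pct_change` is a hypothetical name; the inputs are the figures reported above.

```rust
/// Relative change from `base` to `new`, in percent (negative means a decrease).
pub fn pct_change(base: f64, new: f64) -> f64 {
    (new - base) / base * 100.0
}
```

Applied to the reported numbers, 99th-percentile request latency drops by about 34% from EEVDF (14,032) to the ML variant (9,232), and by about 26% relative to L2 cache balancing without ML (12,432). Task migrations grow by about 289% from EEVDF (55,873) to default scx_rusty (217,361).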
Outline
● A review of CFS and EEVDF
● Innovations and mechanisms behind sched_ext
● From custom FCFS/RR scheduler to introducing Machine Learning
● What’s Next ?
What’s Next ?
Contributing:
● Keep contributing !
○ PR: EricccTaiwan , charliechiou
● Get our scheduler merged upstream !
● COSCUP / (maybe OSSummit)
Improvement:
● Collecting data from other schedulers.
○ e.g., scx_lavd, scx_bpfland …
● Changing the workload.
○ e.g., stress-ng, compiling kernel, gaming …
● Modifying the load balancing mechanism.
Special Thanks:
● Ching-Chun (“Jserv”) Huang : Leading us into the world of Linux.
● Sched_ext community : Kindly answering our questions and reviewing our PRs.
● Chia-Ping Tsai : Introducing us to the “opensource4you” community.
Thanks For Listening !
Appendix
● Why collect data using stress-ng instead of kernel compilation ?
● Why did request latency improve while wake-up latency degraded ?
