Process Scheduling
Darren Huang#791
Multitasking
• Multitasking operating systems come in two flavors: cooperative
multitasking and preemptive multitasking
• Linux implements preemptive multitasking
• The scheduler decides when a process is to cease running and a new process
is to begin running
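For contrast with preemption, here is a minimal sketch of cooperative multitasking, where a task keeps the processor until it yields voluntarily. The names `task` and `run_cooperative` are hypothetical and only illustrate the idea; Linux itself does not schedule this way.

```python
from collections import deque

def task(name, steps, log):
    """A cooperative task: does one unit of work, then yields voluntarily."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # nothing forces the task to reach this point

def run_cooperative(tasks):
    """Round-robin over tasks, resuming each until it next yields or finishes."""
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)
            queue.append(current)  # the task yielded, so it goes to the back
        except StopIteration:
            pass                   # the task finished

log = []
run_cooperative([task("A", 2, log), task("B", 3, log)])
print(log)  # ['A:0', 'B:0', 'A:1', 'B:1', 'B:2']
```

A task that never reaches its `yield` would stall every other task, which is the weakness a preemptive scheduler removes.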
Process Scheduler
• The Linux kernel introduced the Completely Fair Scheduler (CFS) in version 2.6.23
• CFS was modified further in 2.6.24
• Comparison of schedulers by operating system
• Linux pre-2.6: multilevel feedback queue
• Linux 2.6.0 to 2.6.22: O(1) scheduler
• Linux 2.6.23 and later: Completely Fair Scheduler
• FreeBSD: multilevel feedback queue
• Mac OS X: multilevel feedback queue
• Windows NT: multilevel feedback queue
• Brain Fuck Scheduler (BFS): an out-of-tree alternative scheduler for Linux
Policy
• I/O-bound processes
• spend much of their time waiting on I/O, so they run often but only for short durations
• Processor-bound processes
• spend much of their time executing code, so the scheduler tends to run them less frequently but for longer durations
• Scheduling policy in Unix systems tends to explicitly favor I/O-bound processes,
thus providing good process response time
• Linux likewise favors I/O-bound processes over processor-bound
processes
Process Priority
• The Linux kernel implements two separate priority ranges
• Nice value
• A number from -20 to +19 with a default of 0
• Real-time priority
• Configurable range, 0 to 99 by default
• Real-time priority and nice value are in disjoint value spaces
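Both ranges can be inspected from user space. The sketch below assumes a Linux system and uses Python's `os` wrappers around the corresponding system calls; the helper names `current_nice` and `realtime_priority_range` are hypothetical. The real-time range printed is whatever the running kernel reports for the user-space API, which need not match the kernel-internal range quoted above.

```python
import os

def current_nice():
    """Return the nice value of the calling process."""
    return os.getpriority(os.PRIO_PROCESS, 0)

def realtime_priority_range(policy=os.SCHED_FIFO):
    """Return (min, max) priority the kernel accepts for a real-time policy."""
    return (os.sched_get_priority_min(policy),
            os.sched_get_priority_max(policy))

if __name__ == "__main__":
    print("nice value:", current_nice())
    print("SCHED_FIFO priority range:", realtime_priority_range())
```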
Timeslice
• The timeslice is the numeric value that represents how long a task can
run until it is preempted
• Linux’s CFS scheduler does NOT directly assign timeslices to processes
• Instead, CFS assigns each process a proportion of the processor
The Scheduling Policy in Action
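• A text editor (I/O-bound, mostly waiting for key presses) vs. a video encoder (processor-bound)
• When the editor wakes up it has used far less than its share of the processor, so the scheduler lets it preempt the encoder right away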
Scheduling Algorithm
• Traditional Unix systems schedule processes by mapping each nice value onto an
absolute timeslice, which causes some drawbacks
• Process A: nice value = 0, timeslice of 100 milliseconds
Process B: nice value = 19, timeslice of 5 milliseconds
• A receives 20/21 of the processor and B receives 1/21
• Process A: nice value = 19, timeslice of 5 milliseconds
Process B: nice value = 19, timeslice of 5 milliseconds
• each receives half of the processor, but the kernel context-switches every 5 milliseconds
• Process A: nice value = 0, timeslice of 100 milliseconds
Process B: nice value = 0, timeslice of 100 milliseconds
• each again receives half of the processor, but switches happen only every 100 milliseconds
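The three scenarios above can be checked with a little arithmetic. This is only a sketch: it assumes both tasks are always runnable and take strict turns, and the function names are hypothetical.

```python
def processor_shares(timeslices_ms):
    """Fraction of the processor each task gets when tasks take turns
    with the given fixed timeslices."""
    total = sum(timeslices_ms)
    return [t / total for t in timeslices_ms]

def switches_per_second(timeslices_ms):
    """Context switches per second: one switch per task per full cycle."""
    return len(timeslices_ms) * 1000 / sum(timeslices_ms)

for slices in ([100, 5], [5, 5], [100, 100]):
    shares = [round(s, 3) for s in processor_shares(slices)]
    print(slices, shares, round(switches_per_second(slices), 1))
```

The two equal-priority pairs get the same 50/50 split, yet the low-priority pair switches twenty times as often, which is the opposite of what background work wants.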
Scheduling Algorithm
• Process A: nice value = 0, timeslice of 100 milliseconds
Process B: nice value = 1, timeslice of 95 milliseconds
• the timeslices are nearly identical
• Process A: nice value = 18, timeslice of 10 milliseconds
Process B: nice value = 19, timeslice of 5 milliseconds
• A receives twice the processor time of B, although the nice values again differ by only one
• A nice-value-to-timeslice mapping needs the ability to assign an absolute
timeslice (e.g., an integer multiple of the timer tick), so timeslices change
with the timer tick rate
• Optimizing for interactive tasks by boosting freshly woken tasks can let one process gain an
unfair amount of processor
time
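The relative-nice-value problem is easy to see numerically. The linear mapping below is an assumption chosen only because it reproduces the four timeslices quoted on this slide; it is not taken from any particular kernel.

```python
def linear_timeslice_ms(nice):
    """Hypothetical linear mapping matching the slide's numbers:
    nice 0 -> 100 ms, nice 1 -> 95 ms, nice 18 -> 10 ms, nice 19 -> 5 ms."""
    return 100 - 5 * nice

def timeslice_ratio(nice_a, nice_b):
    """How many times more processor time nice_a gets than nice_b."""
    return linear_timeslice_ms(nice_a) / linear_timeslice_ms(nice_b)

print(round(timeslice_ratio(0, 1), 3))  # 1.053
print(timeslice_ratio(18, 19))          # 2.0
```

The same one-step difference in nice value yields about 5% more processor time at one end of the range and 100% more at the other.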
Scheduling Algorithm
• The Linux scheduler is modular; the modules are called scheduler
classes
• The base scheduler code is defined in kernel/sched.c
• CFS is defined in kernel/sched_fair.c
• CFS basically models an “ideal, precise multi-tasking CPU” on real
hardware
• It does away with timeslices completely and assigns each process a
PROPORTION of the processor
Ideal, Precise, Multitasking CPU
Actual Hardware CPU
Fair Scheduling
• CFS is called a fair scheduler because it gives each process a fair
share (a proportion) of the processor’s time
• The timeslice allotted to any nice value is NOT an absolute
number, but a given proportion of the processor
• CFS is NOT perfectly fair, because it only approximates perfect
multitasking
• But it can bound the unfairness by placing a lower bound on latency of n
for n runnable processes
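A rough sketch of proportional sharing follows. It assumes each nice value maps to a weight and that every nice step scales the weight by about 1.25, with nice 0 at 1024; the kernel uses a precomputed table, so the formula here is an approximation and the function names are hypothetical.

```python
NICE_0_WEIGHT = 1024  # weight assumed for a nice-0 task

def approx_weight(nice):
    """Approximate nice-to-weight mapping (assumption: ~1.25x per nice step)."""
    return NICE_0_WEIGHT / (1.25 ** nice)

def cpu_shares(nice_values):
    """Proportion of the processor each runnable task receives."""
    weights = [approx_weight(n) for n in nice_values]
    total = sum(weights)
    return [w / total for w in weights]

print([round(s, 3) for s in cpu_shares([0, 1])])    # [0.556, 0.444]
print([round(s, 3) for s in cpu_shares([18, 19])])  # [0.556, 0.444]
```

Because shares depend only on the ratio of weights, a one-step difference in nice value gives the same split anywhere in the range, unlike the absolute-timeslice mapping.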
The Linux Scheduling Implementation
• We discuss four components of CFS
• Time Accounting
• Process Selection
• The Scheduler Entry Point
• Sleeping and Waking Up
Time Accounting
• CFS does NOT have the notion of a timeslice, but it must still
account for the time that each process runs
• CFS uses the scheduler entity structure, struct sched_entity,
defined in <linux/sched.h>, to keep track of process accounting
• The scheduler entity structure is embedded in the process descriptor,
struct task_struct, as a member variable named se
Time Accounting: Virtual Runtime
• The virtual runtime is used to help us approximate the “ideal
multitasking processor” that CFS is modeling
• CFS uses vruntime to account for how long a process has run and
thus how much longer it ought to run
• The vruntime variable stores the virtual runtime of a process, which is
the actual runtime normalized by the number of runnable processes
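One way to picture the bookkeeping is the sketch below. It assumes the normalization is a scaling of real runtime by the task's weight relative to a nice-0 task; the function name and the weight values are illustrative assumptions, not kernel code.

```python
NICE_0_WEIGHT = 1024  # weight assumed for a nice-0 task

def advance_vruntime(vruntime_ns, delta_exec_ns, weight):
    """Charge delta_exec_ns of real runtime, scaled inversely by weight."""
    return vruntime_ns + delta_exec_ns * NICE_0_WEIGHT // weight

# A nice-0 task is charged its real runtime.
print(advance_vruntime(0, 1_000_000, 1024))  # 1000000
# A task with twice the weight is charged half as much, so it runs longer
# before its vruntime catches up with the others.
print(advance_vruntime(0, 1_000_000, 2048))  # 500000
```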
Process Selection
• CFS uses a red-black tree to manage the list of runnable processes
and efficiently find the process with the smallest vruntime
• Picking the next task
• run the process represented by the leftmost node in the rbtree
• __pick_next_entity()
• Adding processes to the tree
• enqueue_entity()
• Removing processes from the tree
• dequeue_entity()
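The selection loop can be sketched in a few lines. Python's standard library has no red-black tree, so this toy uses a binary heap (`heapq`) as a stand-in; both structures give quick access to the smallest key. The class and function names are hypothetical, and all tasks are assumed to have equal weight so that vruntime equals real runtime.

```python
import heapq
import itertools

class RunQueue:
    """Toy run queue ordered by vruntime (heap standing in for the rbtree)."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal vruntimes

    def enqueue(self, name, vruntime):
        heapq.heappush(self._heap, (vruntime, next(self._counter), name))

    def pick_next(self):
        """Remove and return (name, vruntime) of the smallest-vruntime task."""
        vruntime, _, name = heapq.heappop(self._heap)
        return name, vruntime

def simulate(names, slice_ns, rounds):
    """Repeatedly run the task with the smallest vruntime for slice_ns."""
    rq = RunQueue()
    for name in names:
        rq.enqueue(name, 0)
    runtime = {name: 0 for name in names}
    for _ in range(rounds):
        name, vruntime = rq.pick_next()
        runtime[name] += slice_ns
        rq.enqueue(name, vruntime + slice_ns)  # equal weights assumed
    return runtime

print(simulate(["editor", "encoder", "shell"], slice_ns=1_000_000, rounds=9))
# {'editor': 3000000, 'encoder': 3000000, 'shell': 3000000}
```

Always running the task that has received the least virtual runtime is what keeps the accumulated runtimes level.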
The Scheduler Entry Point
• The main entry point into the process scheduler is the function
schedule(), defined in kernel/sched.c
Sleeping and Waking Up
• Tasks that are sleeping (blocked) are in a special non-runnable state
• Without this special state, the scheduler would select tasks that did
not want to run
• Sleeping is handled via wait queues
• A wait queue is a simple list of processes waiting for an event to occur
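A toy model of the idea, not the kernel's data structure: tasks leave the runnable set when they sleep on a queue and return to it when the event fires. The `WaitQueue` class and its method names are hypothetical.

```python
from collections import deque

class WaitQueue:
    """Toy wait queue: tasks sleeping until one particular event occurs."""

    def __init__(self):
        self._sleepers = deque()

    def sleep_on(self, task, runnable):
        """Mark the task non-runnable and park it on this queue."""
        runnable.discard(task)
        self._sleepers.append(task)

    def wake_up(self, runnable):
        """The event occurred: make every waiting task runnable again."""
        woken = list(self._sleepers)
        self._sleepers.clear()
        runnable.update(woken)
        return woken

runnable = {"editor", "encoder"}
keyboard = WaitQueue()
keyboard.sleep_on("editor", runnable)
print(sorted(runnable))  # ['encoder']
keyboard.wake_up(runnable)
print(sorted(runnable))  # ['editor', 'encoder']
```

While the editor is parked, the scheduler only ever sees the encoder, so it cannot pick a task that has nothing to do.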
Preemption and Context Switching
• Context switching is handled by the context_switch() function
defined in kernel/sched.c
• It is called by schedule() when a new process has been selected to
run, and it does two basic jobs
• Calls switch_mm() to switch the virtual memory mapping from the previous
process’s to that of the new process
• Calls switch_to() to switch the processor state from the previous process’s to
the current’s
• The kernel provides the need_resched flag to signify whether a
reschedule should be performed
Preemption and Context Switching (cont.)
• Upon returning to user-space or returning from an interrupt, the
need_resched flag is checked
• If it is set, the kernel invokes the scheduler before continuing
• In 2.6, the need_resched flag was moved into a single bit of a special
flag variable inside the thread_info structure
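The flag check can be modeled as a single bit in a flags word. In this sketch the bit position, the `ThreadInfo` class, and the helper names are all illustrative assumptions; the real bit is architecture-specific and lives in the kernel's C structures.

```python
NEED_RESCHED_BIT = 1 << 3  # hypothetical bit position

class ThreadInfo:
    """Toy stand-in for the per-thread structure holding the flags word."""
    def __init__(self):
        self.flags = 0

def set_need_resched(ti):
    ti.flags |= NEED_RESCHED_BIT

def need_resched(ti):
    return bool(ti.flags & NEED_RESCHED_BIT)

def clear_need_resched(ti):
    ti.flags &= ~NEED_RESCHED_BIT

def return_to_user_space(ti, schedule):
    """Check the flag on the way out; reschedule first if it is set."""
    if need_resched(ti):
        clear_need_resched(ti)
        schedule()

ti = ThreadInfo()
set_need_resched(ti)
return_to_user_space(ti, lambda: print("scheduler invoked"))
```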
User Preemption
• User preemption can occur
• When returning to user-space from a system call
• When returning to user-space from an interrupt handler
Kernel Preemption
• Kernel preemption can occur
• When an interrupt handler exits, before returning to kernel-space
• When kernel code becomes preemptible again
• If a task in the kernel explicitly calls schedule()
• If a task in the kernel blocks (which results in a call to schedule())
Real-Time Scheduling Policies
• Linux provides two real-time scheduling policies, SCHED_FIFO and
SCHED_RR
• The normal, non-real-time scheduling policy is SCHED_NORMAL
• Real-time policies are managed not by CFS, but by a special
real-time scheduler, defined in kernel/sched_rt.c
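A process can ask which policy it runs under. The sketch below assumes Linux and uses Python's `os.sched_getscheduler`; user space calls the normal policy SCHED_OTHER, and the helper names here are hypothetical.

```python
import os

POLICY_NAMES = {
    os.SCHED_OTHER: "SCHED_OTHER (SCHED_NORMAL in the kernel)",
    os.SCHED_FIFO: "SCHED_FIFO",
    os.SCHED_RR: "SCHED_RR",
}

def policy_name(policy):
    """Translate a policy constant into a readable name."""
    return POLICY_NAMES.get(policy, f"unknown policy {policy}")

def current_policy():
    """Name of the scheduling policy of the calling process (pid 0 = self)."""
    return policy_name(os.sched_getscheduler(0))

if __name__ == "__main__":
    print(current_policy())
```

An ordinary shell process will normally report the non-real-time policy; switching to SCHED_FIFO or SCHED_RR usually requires elevated privileges.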
Scheduler-Related System Calls
Group Scheduling Enhancements in 2.6.24

Linux kernel development ch4
