Putting a Fork in Fork (Linux Process and Memory Management)
Updates
Progress updates and scheduling design reviews will be due Sunday 11:59pm

Tonight on Colbert Report!

Tuesday’s Class: Yuchen Zhou on Authentication using Single Sign-On

12 November 2013

University of Virginia cs4414

1
Recap: Last Class
[Diagram: address translation]
Logical Address → Segmentation Unit (GDTR → Global Descriptor Table) → Linear Address (Dir | Page | Offset)
Linear Address → Paging Unit (CR3 → Page Directory → Page Table) → Physical Address → Physical Memory
Translation Lookaside Buffer (Cache)
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    char *s = (char *) malloc (1);
    int i = 0;
    while (1) {
        printf("%d: %x\n", i, s[i]);
        i += 4;
    }
}

What will this program do?

> ./a.out
0: 0
4: 0
8: 0
12: 0
…
1033872: 0
1033876: 0
1033880: 0
1033884: 0
Segmentation fault: 11
> clang segv.c
segv.c:22:8: warning: expression result unused [-Wunused-value]
s[i];
~ ~^
1 warning generated.
> ./a.out
^C

$ ./a.out
Caught segv: 11
i = 1033888
Caught segv: 11
i = 1033888
Caught segv: 11
i = 1033888
Caught segv: 11
i = 1033888
Caught segv: 11
i = 1033888
Caught segv: 11
i = 1033888
Caught segv: 11
i = 1033888
…
> ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
file size               (blocks, -f) unlimited
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 256
pipe size            (512 bytes, -p) 1
stack size              (kbytes, -s) 8515
cpu time               (seconds, -t) unlimited
max user processes              (-u) 709
virtual memory          (kbytes, -v) unlimited
USENIX Security 2007

Forking Fork
Rust Runtime: run::Process::new(program, argv, options)
    → spawn_process_os(prog, args, env, dir, in_fd, …)
    → fork()
libc: fork()
    → int 0x80 (jumps into kernel code, sets supervisor mode)
linux kernel: fork syscall
/*
 * linux/kernel/fork.c
 *
 * Copyright (C) 1991, 1992 Linus Torvalds
 */

/*
 * 'fork.c' contains the help-routines for the 'fork' system call
 * (see also entry.S and others).
 * Fork is rather simple, once you get the hang of it, but the memory
 * management can be a bitch. See 'mm/memory.c': 'copy_page_range()'
 */
#include <linux/slab.h>
#include <linux/init.h>
#include <linux/unistd.h>
#include <linux/module.h>
#include <linux/vmalloc.h>
#include <linux/completion.h>
…

1935 total lines
/*
 * Ok, this is the main fork-routine.
 *
 * It copies the process, and if successful kick-starts
 * it and waits for it to finish using the VM if required.
 */
long do_fork(unsigned long clone_flags,
             unsigned long stack_start,
             unsigned long stack_size,
             int __user *parent_tidptr,
             int __user *child_tidptr)
{
    struct task_struct *p;
    int trace = 0;
    long nr;
    /*
     * Determine whether and which event to report to ptracer. When
     * called from kernel_thread or CLONE_UNTRACED is explicitly
     * requested, no event is reported; otherwise, report if the event
     * for the type of forking is enabled.
     */
    if (!(clone_flags & CLONE_UNTRACED)) { … }
long do_fork(unsigned long clone_flags,
             unsigned long stack_start,
             unsigned long stack_size,
             int __user *parent_tidptr,
             int __user *child_tidptr)
{
    struct task_struct *p;
    int trace = 0;
    long nr;
    /* Determine whether and which event to report to ptracer... */
    p = copy_process(clone_flags, stack_start, stack_size,
                     child_tidptr, NULL, trace);
    /*
     * Do this prior (to) waking up the new thread - the thread pointer
     * might get invalid after that point, if the thread exits quickly.
     */
    if (!IS_ERR(p)) {
        ...

/*
 * This creates a new process as a copy of the old one, but does not actually start it yet. It copies
 * the registers, and all the appropriate parts of the process environment (as per the clone flags).
 * The actual kick-off is left to the caller.
 */
static struct task_struct *copy_process(unsigned long clone_flags,
                                        unsigned long stack_start,
                                        unsigned long stack_size,
                                        int __user *child_tidptr,
                                        struct pid *pid,
                                        int trace)
{
    int retval;
    struct task_struct *p;
    if ((clone_flags & (CLONE_NEWNS|CLONE_FS)) == (CLONE_NEWNS|CLONE_FS))
        return ERR_PTR(-EINVAL);
    ... // lots more error cases based on flags
    retval = security_task_create(clone_flags);
    if (retval)
        goto fork_out;
    ... // this is the interesting part we will look at next
fork_out:
    return ERR_PTR(retval);
}
What should be in a task_struct?

“task” here means process (it’s what copy_process returns), not to be
confused with a Rust task
include/linux/sched.h

Definition of task_struct is over 400 lines!

Memory Management

mm_struct is another huge data structure… we’ll look at it later.

Stack Canary
arch/x86/include/asm/stackprotector.h

Protecting Stack Frames
Unprotected frame: Saved Registers | Parameters | Return Address | Local Variables
gcc -fstack-protector -Wstack-protector: Saved Registers | Parameters | Return Address | Canary | Local Variables

Why does the kernel need code to support this?
Other things in struct task:

static struct task_struct *copy_process(unsigned long clone_flags,
                                        unsigned long stack_start,
                                        unsigned long stack_size,
                                        int __user *child_tidptr,
                                        struct pid *pid,
                                        int trace)
{
    int retval;
    struct task_struct *p;
    ... // lots more error cases based on flags
    retval = security_task_create(clone_flags);
    if (retval)
        goto fork_out;
    retval = -ENOMEM;
    p = dup_task_struct(current);
    if (!p)
        goto fork_out;
    ...
fork_out:
    return ERR_PTR(retval);
}

What is current?

/linux-2.6.32-rc3/arch/x86/include/asm/current.h

#ifndef _ASM_X86_CURRENT_H
#define _ASM_X86_CURRENT_H
#include <linux/compiler.h>
#include <asm/percpu.h>
#ifndef __ASSEMBLY__
struct task_struct;
DECLARE_PER_CPU(struct task_struct *, current_task);
static __always_inline
struct task_struct *get_current(void)
{
    return percpu_read_stable(current_task);
}
#define current get_current()
#endif /* __ASSEMBLY__ */
#endif /* _ASM_X86_CURRENT_H */
static struct task_struct *dup_task_struct(struct task_struct *orig)
{
    struct task_struct *tsk;
    struct thread_info *ti;
    unsigned long *stackend;
    int node = tsk_fork_get_node(orig);
    int err;
    tsk = alloc_task_struct_node(node);
    if (!tsk)
        return NULL;
    ti = alloc_thread_info_node(tsk, node);
    if (!ti)
        goto free_tsk;
    err = arch_dup_task_struct(tsk, orig);
    if (err)
        goto free_ti;
    tsk->stack = ti;
    setup_thread_stack(tsk, orig);
    clear_user_return_notifier(tsk);
    clear_tsk_need_resched(tsk);
    stackend = end_of_stack(tsk);
    *stackend = STACK_END_MAGIC; /* for overflow detection */

#ifdef CONFIG_CC_STACKPROTECTOR
    tsk->stack_canary = get_random_int();
#endif
    ...
Linux/include/linux/sched.h
...
#define task_thread_info(task) ((struct thread_info *)(task)->stack)
#define task_stack_page(task) ((task)->stack)

static inline void setup_thread_stack(struct task_struct *p,
                                      struct task_struct *org)
{
    *task_thread_info(p) = *task_thread_info(org);
    task_thread_info(p)->task = p;
}

static inline unsigned long *end_of_stack(struct task_struct *p)
{
    return (unsigned long *)(task_thread_info(p) + 1);
}
...
https://github.com/torvalds/linux/search?q=STACK_END_MAGIC&ref=cmdform

In no_context, called by mm_fault_error

Does this help defend against a stack-smashing buffer overflow attack?
...
tsk->stack_canary = get_random_int();
...

static struct task_struct *dup_task_struct(struct task_struct *orig)
{
    ...
    clear_tsk_need_resched(tsk);
    stackend = end_of_stack(tsk);
    *stackend = STACK_END_MAGIC; /* for overflow detection */
#ifdef CONFIG_CC_STACKPROTECTOR
    tsk->stack_canary = get_random_int();
#endif

    /*
     * One for us, one for whoever does the "release_task()" (usually
     * parent)
     */
    atomic_set(&tsk->usage, 2);
#ifdef CONFIG_BLK_DEV_IO_TRACE
    tsk->btrace_seq = 0;
#endif
    tsk->splice_pipe = NULL;
    tsk->task_frag.page = NULL;
    account_kernel_stack(ti, 1);
    return tsk;
free_ti:
    free_thread_info(ti);
free_tsk:
    free_task_struct(tsk);
    return NULL;
}
static struct task_struct *copy_process(...)
{
    ...
    p = dup_task_struct(current);
    ...
    /* Perform scheduler related setup. Assign this task to a CPU. */
    sched_fork(p);
    ...
}
kernel/sched/core.c

include/linux/smp.h

http://lxr.free-electrons.com/ident?i=preempt_disable

static struct task_struct *copy_process(...)
{
    ...
    p = dup_task_struct(current);
    ...
    /* Perform scheduler related setup. Assign this task to a CPU. */
    sched_fork(p);
    ...
    retval = copy_mm(clone_flags, p);
    ...
}

static int copy_mm(unsigned long clone_flags, struct task_struct *tsk)
{
    struct mm_struct *mm, *oldmm;
    int retval;
    ...
    mm = dup_mm(tsk);
    if (!mm)
        goto fail_nomem;
good_mm:
    tsk->mm = mm;
    tsk->active_mm = mm;
    return 0;
    …
/*
 * Allocate a new mm structure and copy contents from the
 * mm structure of the passed in task structure.
 */
struct mm_struct *dup_mm(struct task_struct *tsk)
{
    struct mm_struct *mm, *oldmm = current->mm;
    int err;
    if (!oldmm)
        return NULL;
    mm = allocate_mm();
    if (!mm)
        goto fail_nomem;
    memcpy(mm, oldmm, sizeof(*mm));
    ...

#define allocate_mm() (kmem_cache_alloc(mm_cachep, GFP_KERNEL))
#define free_mm(mm) (kmem_cache_free(mm_cachep, (mm)))

Three Linux memory allocators:
SLOB = “Simple List of Blocks”
SLAB = allocation with less fragmentation
SLUB = less fragmentation, better reuse (Default)
include/linux/gfp.h

mm/page_alloc.c

Page Table
32-bit linear address: Dir, 10 bits (1K tables) | Page, 10 bits (1K entries) | Offset, 12 bits (4K pages)
CR3 → Page Directory (entry at CR3+Dir) → Page Table (Page Entry) → Physical Memory (Page + Offset)

arch/x86/include/asm/pgtable.h

TLB
Logical Address → Segmentation Unit → Linear Address → Paging Unit (with TLB) → Physical Address → Memory
32-bit linear address: Dir, 10 bits (1K tables) | Page, 10 bits (1K entries) | Offset, 12 bits (4K pages)
CR3 → Page Directory (entry at CR3+Dir) → Page Table (Page Entry)

What does the kernel need to do to flush the TLB?

arch/x86/include/asm/tlbflush.h

arch/x86/include/asm/special_insns.h
Charge
Progress updates and scheduling design reviews will be due Sunday 11:59pm

Tuesday’s Class: Yuchen Zhou on Authentication using Single Sign-On