Network & Filesystem:
Doing less cross rings memory copy
Louis Solofrizzo
System & Kernel, Storage Team
2
Optimization: Zero-Copy in the Linux Kernel
Quick history of the context-switch
The cost of a ring-copy
Normal copies in the Linux Kernel and glibc
Zero-copy API in the Linux Kernel
Going further: Avoiding the context-switch
3
Abbreviations
Context Switch: Saving the state of a running process so that another can run, then restoring it later
Syscall: System Call
MMU: Memory Management Unit
Kernel / User mode: System Privilege
IRQ: InteRrupt Request
4
Quick history of the context-switch
The User - Kernel Space system
One API to rule them all
Syscalls are born! (yay!)
Multitasking
CPU context-switch, with privileges
5
The cost of a ring copy
Copying memory between kernel and user space is expensive
The address translation cache (TLB) may be flushed
+ The cost of the context-switch itself
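One rough way to see the fixed cost of crossing into the kernel is to time a syscall that does almost no work. The sketch below is an illustration only: the function name `average_syscall_ns` is hypothetical, and it measures the user/kernel round trip of a single process, not a full switch between two processes. Results vary widely with CPU, kernel version, and mitigations.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Average wall-clock cost, in nanoseconds, of one trivial syscall.
 * syscall(SYS_getppid) is invoked directly so that no libc caching or
 * vDSO shortcut hides the user -> kernel -> user round trip.
 * (Hypothetical helper, not part of the original slides.) */
double average_syscall_ns(long iterations)
{
    struct timespec start, end;

    assert(iterations > 0);
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; i++)
        syscall(SYS_getppid);
    clock_gettime(CLOCK_MONOTONIC, &end);

    double elapsed_ns = (double)(end.tv_sec - start.tv_sec) * 1e9
                      + (double)(end.tv_nsec - start.tv_nsec);
    return elapsed_ns / (double)iterations;
}
```

Calling `average_syscall_ns(1000000)` and printing the result gives a per-call figure for the machine at hand; any data copy performed by a real syscall comes on top of it.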
6
Syscall breakdown
7
Syscall timing
8
Copies in the Linux Kernel and libc
ssize_t read(int, void *, size_t);
ssize_t write(int, const void *, size_t);
ssize_t ksys_write(unsigned int, const char *, size_t);
ssize_t ksys_read(unsigned int, char *, size_t);
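As a baseline for the zero-copy calls, a classic copy loop built on `read()` and `write()` might look like the sketch below. The function name `copy_with_read_write` and the 64 KiB buffer size are assumptions made for illustration. Every chunk is copied from the kernel into the user buffer and then back into the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Copy everything from in_fd to out_fd through a user-space buffer.
 * Each chunk crosses the user/kernel boundary twice: read() copies it
 * out of the kernel, write() copies it back in.
 * Returns the number of bytes copied, or -1 on error (errno is set).
 * (Hypothetical helper, not part of the original slides.) */
ssize_t copy_with_read_write(int in_fd, int out_fd)
{
    char buf[64 * 1024];
    ssize_t total = 0;

    assert(in_fd >= 0 && out_fd >= 0);
    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);
        if (n == 0)
            return total;          /* end of input */
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted, retry the read */
            return -1;
        }
        /* write() may accept fewer bytes than requested; loop until done */
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;      /* interrupted, retry the write */
                return -1;
            }
            off += w;
        }
        total += n;
    }
}
```

With two syscalls and two copies per chunk, this is the pattern the zero-copy APIs try to avoid.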
9
Copies in the Linux Kernel and libc
ssize_t readv(int, const struct iovec *, int);
ssize_t writev(int, const struct iovec *, int);
long sys_preadv(unsigned long, const struct iovec *,
unsigned long, unsigned long, unsigned long);
long sys_pwritev(unsigned long, const struct iovec *,
unsigned long, unsigned long, unsigned long);
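The vectored calls do not remove the user/kernel copy, but they let one syscall handle several buffers. A minimal sketch with `writev()`, assuming a hypothetical helper named `send_header_and_body` that writes two C strings in one call:

```c
#include <assert.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send a header and a body with a single writev() call.
 * The kernel gathers both buffers itself, so the caller neither
 * concatenates them in user space nor issues two write() syscalls.
 * The data is still copied from user space into the kernel.
 * Returns the number of bytes written, or -1 on error.
 * (Hypothetical helper, not part of the original slides.) */
ssize_t send_header_and_body(int fd, const char *header, const char *body)
{
    struct iovec iov[2];

    assert(header != NULL && body != NULL);
    /* iov_base is not const-qualified in struct iovec, hence the casts;
     * writev() only reads from the buffers. */
    iov[0].iov_base = (void *)header;
    iov[0].iov_len  = strlen(header);
    iov[1].iov_base = (void *)body;
    iov[1].iov_len  = strlen(body);

    return writev(fd, iov, 2);
}
```

As with `write()`, a short write is possible on sockets and large payloads, so production code would check the return value against the total length.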
10
Zero-copy API in the Linux Kernel
ssize_t splice(int, loff_t *, int, loff_t *,
size_t, unsigned int);
splice() moves data between two
file descriptors without copying
between kernel address space and
user address space.
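Because `splice()` needs a pipe on one side of every call, a common pattern is to stage the data in a pipe: source to pipe, then pipe to destination. The sketch below follows that pattern. The function name `splice_copy` is hypothetical, and this is not necessarily how the upload path in the talk was implemented.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Move up to `len` bytes from in_fd to out_fd without passing them
 * through a user-space buffer. splice() requires one side of each call
 * to be a pipe, so a pipe serves as the in-kernel staging area.
 * Returns the number of bytes moved, or -1 on error.
 * (Hypothetical helper, not part of the original slides.) */
ssize_t splice_copy(int in_fd, int out_fd, size_t len)
{
    int pipefd[2];
    ssize_t total = 0;

    assert(in_fd >= 0 && out_fd >= 0);
    if (pipe(pipefd) < 0)
        return -1;

    while (len > 0) {
        /* Step 1: in_fd -> pipe */
        ssize_t n = splice(in_fd, NULL, pipefd[1], NULL, len, SPLICE_F_MOVE);
        if (n <= 0) {
            if (n < 0)
                total = -1;
            break;              /* n == 0 means end of input */
        }
        len -= (size_t)n;

        /* Step 2: pipe -> out_fd, draining everything just queued */
        for (ssize_t left = n; left > 0; ) {
            ssize_t w = splice(pipefd[0], NULL, out_fd, NULL,
                               (size_t)left, SPLICE_F_MOVE);
            if (w <= 0) {
                total = -1;
                goto out;
            }
            left -= w;
            total += w;
        }
    }
out:
    close(pipefd[0]);
    close(pipefd[1]);
    return total;
}
```

In an upload server, `in_fd` would typically be the client socket and `out_fd` the destination file, so the payload never enters the process's address space.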
11
Upload architecture example
12
Zero-copy upload
13
Zero-Copy upload architecture example
14
Benchmark
Upload Normal: ~940 Mbps
Upload Zero-copy: ~3 Gbps
15
Other Zero-copy API
ssize_t sendfile(int, int, off_t *, size_t);
ssize_t copy_file_range(int, loff_t *, int, loff_t *,
size_t, unsigned int);
sendfile() copies data between one
file descriptor and another.
The copy_file_range() system call
performs an in-kernel copy between
two file descriptors without the
additional cost of transferring data
from the kernel to user space and
then back into the kernel.
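For the common "serve a file" case, `sendfile()` needs no intermediate pipe. The sketch below sends a whole file to an output descriptor. The function name `send_whole_file` is hypothetical. `copy_file_range()` follows a similar loop for file-to-file copies, but is left out here to keep the example short.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/sendfile.h>
#include <sys/stat.h>
#include <unistd.h>

/* Send the whole file at `path` to out_fd (typically a socket) with
 * sendfile(). The data goes from the file to out_fd inside the kernel;
 * no user-space buffer is involved.
 * Returns the number of bytes sent, or -1 on error.
 * (Hypothetical helper, not part of the original slides.) */
ssize_t send_whole_file(const char *path, int out_fd)
{
    struct stat st;
    off_t offset = 0;
    int in_fd;

    assert(path != NULL);
    in_fd = open(path, O_RDONLY);
    if (in_fd < 0)
        return -1;
    if (fstat(in_fd, &st) < 0) {
        close(in_fd);
        return -1;
    }

    /* sendfile() may send fewer bytes than requested; it advances
     * `offset` itself, so loop until the whole file has gone out. */
    while (offset < st.st_size) {
        ssize_t n = sendfile(out_fd, in_fd, &offset,
                             (size_t)(st.st_size - offset));
        if (n <= 0) {           /* error, or file shrank underneath us */
            close(in_fd);
            return -1;
        }
    }
    close(in_fd);
    return (ssize_t)offset;
}
```

This is the same kind of call that the Nginx `sendfile` directive enables for static files.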
16
Zero Copy in Linux common applications
Nginx sendfile()
HAProxy
Direct I/O, O_DIRECT (ZFS, ext4)
17
Going further: Avoiding the context-switch
Unikernels
Microkernels
18
Thanks!
Follow our news, tutorials, and cloud information
on Twitter and LinkedIn: @Scaleway
19