CRIU - Checkpoint/Restore in User-space




           Andrey Vagin <avagin@openvz.org>
What is C/R and how can it be used?

  C/R is the ability to save the state of processes and restore it later.

  Usage scenarios:
   –   Failure recovery
   –   Live migration
   –   Reboot-less upgrade
   –   Speed up of slow-boot services
   –   HPC issues




                                        2
History

     BLCR          DMTCP            OpenVZ             Linux C/R   CRIU
     2003          2007             2005               2008        2011


●   Berkeley Lab Checkpoint/Restart (BLCR) (2003)
    –   Load a kernel module and link with a library
●   DMTCP: Distributed MultiThreaded CheckPointing (2004-2006)
    –   Preload a library
●   OpenVZ (2005)
    –   OpenVZ kernel
●   Linux Checkpoint/Restart by Oren Laadan (2008)
    –   A non-mainline kernel
●   CRIU (2011)


                                                   3
How does this work?

[Diagram: crtools sits between the process tree with its kernel objects (files, sockets, pipes, name-spaces) and the image files]



                                  4
Kernel interfaces

[Diagram: kernel interfaces used by Dump and Restore: /proc/, ptrace, syscalls, netlink]




                    5
Dump
●   Parasite code
    –   Receive file descriptors
    –   Dump memory content
    –   prctl(), sigaction, pending signals, timers, etc.
●   Ptrace
    –   Freeze processes
    –   Inject parasite code
●   Netlink
    –   Get information about sockets, netns
●   Procfs
    –   /proc/PID/maps, /proc/PID/map_files/,
        /proc/PID/status, /proc/PID/mountinfo
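
The procfs part of the dump can be illustrated with a minimal sketch that parses the text of /proc/PID/maps into records. This is not crtools code: the names `Mapping`, `parse_maps` and `read_maps` are hypothetical and chosen only for this example, and the field layout is the one documented for the maps file.

```python
from typing import List, NamedTuple


class Mapping(NamedTuple):
    """One line of /proc/PID/maps (hypothetical record type for this sketch)."""
    start: int
    end: int
    perms: str
    offset: int
    dev: str
    inode: int
    path: str


def parse_maps(text: str) -> List[Mapping]:
    """Parse the text of a /proc/PID/maps file into Mapping records."""
    mappings = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # Fields: address perms offset dev inode [pathname].
        # The pathname may be absent or contain spaces, so split at most 5 times.
        fields = line.split(None, 5)
        addr, perms, offset, dev, inode = fields[:5]
        path = fields[5].rstrip() if len(fields) > 5 else ""
        start, end = (int(part, 16) for part in addr.split("-"))
        mappings.append(
            Mapping(start, end, perms, int(offset, 16), dev, int(inode), path)
        )
    return mappings


def read_maps(pid="self") -> List[Mapping]:
    """Read and parse the maps file of a process (Linux only)."""
    with open(f"/proc/{pid}/maps") as f:
        return parse_maps(f.read())
```

On a Linux system, `read_maps()` returns the mappings of the calling process, and `read_maps(1234)` those of another process, subject to the usual procfs permissions.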



                                                6
Restore
●   Collect shared objects
●   Restore name-spaces
●   Create a process tree
     –   Restore SID, PGID
     –   Restore objects that should be inherited
●   Files, sockets, pipes, ...
●   Restore per-task properties
●   Restore memory
●   Call sigreturn
●   Awesome


                                              7
Interesting moments

●   How to restore shared objects?
    –   Send file descriptors via unix sockets
    –   Map files from /proc/self/map_files/ to restore anonymous shared mappings
●   How to restore memory mappings at the correct addresses?
    –   Map a new code block and a stack
    –   Unmap crtools' mappings
    –   Remap the task's mappings to the correct addresses
●   How to resume a process?
    –   Create a signal frame
    –   Call sigreturn()
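
The first technique above, sending file descriptors via unix sockets, can be sketched with the standard SCM_RIGHTS ancillary message. This is a minimal illustration, not crtools code: the helper names `send_fd`, `recv_fd` and `demo` are hypothetical, and both ends live in one process only to keep the example short.

```python
import array
import os
import socket


def send_fd(sock, fd):
    """Send one file descriptor over a Unix socket as SCM_RIGHTS data."""
    fds = array.array("i", [fd])
    # At least one byte of ordinary data must accompany the ancillary message.
    sock.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fd(sock):
    """Receive one file descriptor sent with send_fd()."""
    fds = array.array("i")
    _msg, ancdata, _flags, _addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any truncated trailing bytes.
            fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
            return fds[0]
    raise RuntimeError("no file descriptor received")


def demo():
    """Pass the read end of a pipe through a socket pair and read from the copy."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    r, w = os.pipe()
    received = None
    try:
        send_fd(left, r)
        received = recv_fd(right)
        os.write(w, b"hello")
        # The received descriptor refers to the same pipe as r.
        return received != r, os.read(received, 5)
    finally:
        for fd in (r, w, received):
            if fd is not None:
                os.close(fd)
        left.close()
        right.close()
```

The receiver gets a new descriptor number that refers to the same open file as the sender's descriptor, which is what makes this mechanism suitable for handing a shared object to several restored tasks.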




                                                 8
Kernel impact

    ~120 patches merged             ~20 patches in flight




    ~10 new features appeared       ~2 new features to come




                                9
New features in the kernel

●   Parasite code injection (by Tejun Heo)
     –   Read task state that a task can otherwise retrieve only about itself
●   The kcmp() system call
     –   Helps check which kernel objects are shared between processes
●   Proc map_files directory
     –   Find out exactly which file is mapped
     –   Mapping sharing info
●   A bunch of prctl extensions
     –   Set various private fields on task/mm objects (C/R-only feature)
●   Last-pid sysctl
     –   Restore a task with the desired PID value
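
As an illustration of the kcmp() item, the sketch below asks the kernel whether two file descriptors refer to the same open file. It is a minimal example under several assumptions: there is no Python wrapper for kcmp(), so the raw syscall is invoked through ctypes; the syscall numbers are hard-coded for x86_64 and aarch64 only; and the helper name `same_open_file` is hypothetical. The call raises OSError where kcmp is unavailable or not permitted.

```python
import ctypes
import errno
import os
import platform
import sys

# Syscall numbers for kcmp(2); only these two architectures are covered here.
_KCMP_NR = {"x86_64": 312, "aarch64": 272}
KCMP_FILE = 0  # compare two file descriptors (value from <linux/kcmp.h>)


def same_open_file(pid_a, fd_a, pid_b, fd_b):
    """Return True if fd_a in pid_a and fd_b in pid_b share one open file."""
    nr = _KCMP_NR.get(platform.machine()) if sys.platform.startswith("linux") else None
    if nr is None:
        raise OSError(errno.ENOSYS, "kcmp is not available in this sketch")
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    res = libc.syscall(
        ctypes.c_long(nr),
        ctypes.c_long(pid_a),
        ctypes.c_long(pid_b),
        ctypes.c_long(KCMP_FILE),
        ctypes.c_ulong(fd_a),
        ctypes.c_ulong(fd_b),
    )
    if res < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    # 0 means "same object"; other non-negative values mean "different".
    return res == 0
```

A descriptor and its `os.dup()` copy share one open file, while opening the same path a second time creates a separate one; a dumper needs exactly this distinction to decide which objects to restore once and then share.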


                                               10
New features in the kernel (net)

●   TCP repair mode
    –   Reads the internal state of a TCP connection
        and reconstructs it from scratch on a freshly created socket
●   Socket information dumping via netlink (sock_diag)
    –   Extensible engine for retrieving socket state
●   Virtual net device indexes
    –   Allow restoring network devices in a namespace
●   Socket peeking offset
    –   Allows peeking at socket queues (reading without removing data from the queue)
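
The idea behind the last item, reading queued data without removing it, can be seen with the long-standing MSG_PEEK flag. The sketch below is only an illustration: it uses plain MSG_PEEK on a local socket pair and does not exercise the peek-offset socket option itself, and the function name `peek_then_read` is hypothetical.

```python
import socket


def peek_then_read(payload: bytes):
    """Peek at queued data, then read it for real; return both results."""
    sender, receiver = socket.socketpair()
    try:
        sender.sendall(payload)
        # MSG_PEEK returns the data but leaves it in the receive queue.
        peeked = receiver.recv(len(payload), socket.MSG_PEEK)
        # A normal recv() then consumes the same bytes.
        consumed = receiver.recv(len(payload))
        return peeked, consumed
    finally:
        sender.close()
        receiver.close()
```

Because the peek leaves the queue intact, a checkpointing tool can save the queue contents while the application still finds its data there afterwards.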




                                             11
What is already supported?

●   Linux 3.8
    –   x86_64 architecture
    –   Process tree linkage
    –   Multi-threaded apps
    –   All kinds of memory mappings
    –   Terminals, groups, sessions
    –   Open files (shared and unlinked)
    –   Established TCP connections
    –   Unix sockets, Packet sockets
    –   Name-spaces (net, mount, ipc)
    –   Non-posix files (epoll, inotify)
    –   Pipes, Fifo-s, IPC, ...
●   In flight
    –   ARM architecture
    –   Pending signals
    –   TCP time-stamps


                                           12
How is CRIU tested?

●   ZDTM – a set of unit-tests
●   Real-life applications
     –   Apache, Nginx
     –   MySQL, MongoDB, Oracle
     –   Make && gcc
     –   Tar & gzip
     –   Screen
     –   Java
     –   LXC
     –   VNC server + GUI applications




                                         13
Future plans

●   Support all kinds of kernel objects
●   Merge all in-flight patches into the mainline kernel
●   Integrate CRIU with OpenVZ and LXC utilities
●   Iterative migration
     –   Migrate memory content before freezing applications
●   Integration into distributions
     –   CRIU was accepted into Fedora 19




                                            14
How to use

●   ./crtools dump -t pid [<options>]
     –   checkpoint a process/tree identified by pid
●   ./crtools restore -t pid [<options>]
     –   restore a process/tree identified by pid
●   ./crtools show (-D dir)|(-f file) [<options>]
     –   show the contents of dump file(s)
●   ./crtools check
     –   check whether the kernel support is up-to-date
●   ./crtools exec -t pid <syscall-string>
     –   execute a system call in another task
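
The commands above can be chained from a script. The sketch below builds the argument lists exactly as written on the slide and runs check, dump and restore in that order. The helper names are hypothetical, the `./crtools` path is taken from the slide, and any extra options are passed through untouched because the slide does not list them.

```python
import subprocess

CRTOOLS = "./crtools"  # path as written on the slide


def check_cmd():
    """Command that verifies kernel support."""
    return [CRTOOLS, "check"]


def dump_cmd(pid, *options):
    """Command that checkpoints the process/tree identified by pid."""
    return [CRTOOLS, "dump", "-t", str(pid), *options]


def restore_cmd(pid, *options):
    """Command that restores the process/tree identified by pid."""
    return [CRTOOLS, "restore", "-t", str(pid), *options]


def checkpoint_and_restore(pid, run=subprocess.check_call):
    """Run check, dump and restore in sequence; stop at the first failure."""
    for cmd in (check_cmd(), dump_cmd(pid), restore_cmd(pid)):
        run(cmd)
```

The `run` parameter defaults to `subprocess.check_call`, which raises on a non-zero exit status; passing a different callable makes it possible to log or dry-run the commands instead.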




                                                15
Checkpoint/restore of a VNC server.




                         16
Questions?

http://criu.org

Editor's Notes

  • #4: BLCR uses a kernel module and doesn't checkpoint sockets, SysV IPC, zombies, etc. Applications must be linked with a library and executed via a helper. DMTCP uses a helper too, but doesn't require a kernel module. C/R in OpenVZ is used to checkpoint, restore and migrate OpenVZ containers. It requires the OpenVZ kernel. Linux C/R is very similar to OpenVZ C/R. It is used for checkpoint/restore of LXC. CRIU combines all these projects. It will work on the pure upstream kernel. It's able to dump a task without any preparation.