Ftrace
Debugger, performance measurements, kernel teacher




        Frédéric Weisbecker <fweisbec@gmail.com>
Introduction

   Originates from the PREEMPT_RT patch.

   Self-contained kernel tracing tool/framework

   Set of tracers

   Set of user-togglable/tunable tracepoints
The Ring Buffer

   Generic ring buffer for the whole kernel
   Per-CPU write and read
   Lockless write and read
   Read through the ftrace layer or directly with splice
Ring Buffer operations

   Write side
       Overwrite mode, or stop before the head when full
       Before: Lock and reserve
       After:
            Unlock and commit
            Unlock and discard
   Read side
       Iterator (local reader)
       Read (global consumer)
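
The write and read operations listed above can be sketched as a small user-space model. This is only an illustration under stated assumptions: every name here (rb_reserve, rb_commit, rb_discard, rb_iter_peek, rb_consume, RB_SLOTS, RB_PAYLOAD) is hypothetical, events have a fixed size, and the model is single-threaded, so it shows the reserve/commit/discard protocol and the two full-buffer policies but none of the per-CPU or lockless machinery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define RB_SLOTS   4   /* hypothetical capacity, in fixed-size events */
#define RB_PAYLOAD 32  /* hypothetical maximum payload per event */

struct rb_event {
    size_t len;
    char data[RB_PAYLOAD];
};

struct ring_buffer {
    struct rb_event slots[RB_SLOTS];
    size_t head;            /* index of the oldest committed event */
    size_t count;           /* number of committed events */
    bool overwrite;         /* true: recycle the oldest event when full */
    bool reserved;          /* a slot is reserved but not yet committed */
    unsigned long dropped;  /* events lost to either policy */
};

/* Write side, step 1: reserve a slot. Returns NULL when full in stop mode. */
static struct rb_event *rb_reserve(struct ring_buffer *rb, size_t len)
{
    assert(len <= RB_PAYLOAD && !rb->reserved);
    if (rb->count == RB_SLOTS) {
        rb->dropped++;
        if (!rb->overwrite)
            return NULL;                      /* stop before the head */
        rb->head = (rb->head + 1) % RB_SLOTS; /* recycle the oldest event */
        rb->count--;
    }
    struct rb_event *ev = &rb->slots[(rb->head + rb->count) % RB_SLOTS];
    ev->len = len;
    rb->reserved = true;
    return ev;
}

/* Write side, step 2a: publish the reserved event to readers. */
static void rb_commit(struct ring_buffer *rb)
{
    assert(rb->reserved);
    rb->count++;
    rb->reserved = false;
}

/* Write side, step 2b: give the reserved slot back without publishing it. */
static void rb_discard(struct ring_buffer *rb)
{
    assert(rb->reserved);
    rb->reserved = false;
}

/* Convenience writer: reserve, fill, commit. */
static bool rb_write(struct ring_buffer *rb, const void *data, size_t len)
{
    struct rb_event *ev = rb_reserve(rb, len);
    if (!ev)
        return false;
    memcpy(ev->data, data, len);
    rb_commit(rb);
    return true;
}

/* Read side, iterator: look at the pos-th oldest event without consuming it. */
static const struct rb_event *rb_iter_peek(const struct ring_buffer *rb, size_t pos)
{
    return pos < rb->count ? &rb->slots[(rb->head + pos) % RB_SLOTS] : NULL;
}

/* Read side, consumer: copy out the oldest event and remove it. */
static bool rb_consume(struct ring_buffer *rb, struct rb_event *out)
{
    if (rb->count == 0)
        return false;
    *out = rb->slots[rb->head];
    rb->head = (rb->head + 1) % RB_SLOTS;
    rb->count--;
    return true;
}
```

With RB_SLOTS set to 4, writing six events in overwrite mode leaves the four newest in the buffer, while the same writes in stop mode keep the four oldest and refuse the last two. In this simplified model an overwrite-mode reserve recycles the oldest event immediately, so a later discard does not bring it back.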
Tracers

   Most basic tracing unit
   Callbacks:
       Higher level tracing framework operations
       Lower level fs operations
   Use of tracepoints or ad hoc captures
   Insertion to the ring buffer
   Reserved for tracing that requires low-level
    operations.
Function tracer

   Use of a gcc trick (-pg option)
       Static calls to an mcount function
       Probing on entry
       Careful choice of untraced functions

   Different modes:
       Static mcount() calls
       Dynamic patching
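
The difference between the two modes can be illustrated with a user-space model. This is a sketch under loud assumptions: function pointers stand in for the mcount call sites, "patching" a site means swapping its pointer between a no-op and a tracing hook, and all names (call_site, hook_nop, hook_trace, set_traced, trace_log) are hypothetical. The three toy functions borrow names from the trace shown on the next slide and do no real work.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*trace_hook_t)(const char *func, const char *parent);

/* One record per instrumented function entry (stands in for an mcount call site). */
struct call_site {
    const char *func;
    trace_hook_t hook;
};

#define TRACE_LOG_MAX 16
struct trace_record {
    const char *func;
    const char *parent;
};
static struct trace_record trace_log[TRACE_LOG_MAX];
static size_t trace_len;

/* Disabled site: costs an indirect call here, a nop in a really patched kernel. */
static void hook_nop(const char *func, const char *parent)
{
    (void)func;
    (void)parent;
}

/* Enabled site: record "func <-parent", as in the function tracer output. */
static void hook_trace(const char *func, const char *parent)
{
    if (trace_len < TRACE_LOG_MAX) {
        trace_log[trace_len].func = func;
        trace_log[trace_len].parent = parent;
        trace_len++;
    }
}

/* Every site starts disabled, as it would with dynamic patching. */
static struct call_site sites[] = {
    { "rcu_pending", hook_nop },
    { "printk_tick", hook_nop },
};

/* "Patch" all sites of one function on or off; returns how many were patched. */
static size_t set_traced(const char *func, int enable)
{
    size_t patched = 0;

    assert(func != NULL);
    for (size_t i = 0; i < sizeof(sites) / sizeof(sites[0]); i++) {
        if (strcmp(sites[i].func, func) == 0) {
            sites[i].hook = enable ? hook_trace : hook_nop;
            patched++;
        }
    }
    return patched;
}

/* Toy instrumented functions: the first statement is the entry probe. */
static void rcu_pending(const char *parent)
{
    sites[0].hook("rcu_pending", parent);
}

static void printk_tick(const char *parent)
{
    sites[1].hook("printk_tick", parent);
}

static void update_process_times(void)
{
    rcu_pending("update_process_times");
    printk_tick("update_process_times");
}
```

Calling update_process_times() before any set_traced() call records nothing. After set_traced("rcu_pending", 1), only the rcu_pending entry is logged, which is the kind of per-function selection that dynamic patching makes cheap.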
Function trace
   # tracer: function
   #
   #       TASK-PID     CPU#    TIMESTAMP FUNCTION
   #         ||     |     |     |                  |
       soffice.bin-5363 [001] 2744.270302: raise_softirq <-run_local_timers
       soffice.bin-5363 [001] 2744.270303: rcu_pending <-update_process_times
       soffice.bin-5363 [001] 2744.270303: __rcu_pending <-rcu_pending
       soffice.bin-5363 [001] 2744.270304: __rcu_pending <-rcu_pending
       soffice.bin-5363 [001] 2744.270304: printk_tick <-update_process_times
Function graph tracer

   Extends the function tracer by also hooking on
    return:
       Live hooking
       Each task has its private stack of function calls

   New facilities:
       Draw a call graph
       Measure execution time of functions
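
The per-task stack of function calls can be sketched as follows. This is a minimal user-space model, not the kernel implementation: the names (task_ret_stack, graph_entry, graph_return, RET_STACK_DEPTH) are hypothetical, and timestamps are passed in by the caller so that the example stays deterministic.

```c
#include <assert.h>
#include <stddef.h>

#define RET_STACK_DEPTH 8  /* hypothetical per-task nesting limit */

struct ret_entry {
    const char *func;
    unsigned long long entry_ns;  /* timestamp taken by the entry hook */
};

/* One of these per task: the private stack of calls still in flight. */
struct task_ret_stack {
    struct ret_entry entries[RET_STACK_DEPTH];
    int depth;
};

/*
 * Entry hook: push the function and its entry time.
 * Returns the nesting depth (useful for indenting the call graph),
 * or -1 when the stack is full and the call goes untraced.
 */
static int graph_entry(struct task_ret_stack *t, const char *func,
                       unsigned long long now_ns)
{
    if (t->depth == RET_STACK_DEPTH)
        return -1;
    t->entries[t->depth].func = func;
    t->entries[t->depth].entry_ns = now_ns;
    return t->depth++;
}

/*
 * Return hook: pop the innermost call and report how long it ran.
 * If func is non-NULL it receives the name of the returning function.
 */
static unsigned long long graph_return(struct task_ret_stack *t,
                                       unsigned long long now_ns,
                                       const char **func)
{
    struct ret_entry *e;

    assert(t->depth > 0);
    e = &t->entries[--t->depth];
    if (func)
        *func = e->func;
    return now_ns - e->entry_ns;
}
```

The depth returned on entry gives the indentation of the call graph, and the value returned by the return hook is the duration printed next to the closing brace. The duration includes time spent in nested calls.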
Function graph trace
   # tracer: function_graph
    #
    # CPU DURATION           FUNCTION CALLS
    #|          | |             | | | |

    0)   0.931 us   |   _spin_lock();
    0)              |   page_add_new_anon_rmap() {
    0)              |     __inc_zone_page_state() {
    0)   0.615 us   |       __inc_zone_state();
    0)   1.848 us   |     }
    0)   0.751 us   |     page_evictable();
    0)              |     lru_cache_add_lru() {
    0)   0.691 us   |       __lru_cache_add();
    0)   1.990 us   |     }
    0)   7.231 us   |   }
    0)   0.766 us   |   _spin_unlock();
Graph tracer enhancement

   Clients of entry/return hooks: save custom
    data in the task call graph stack
   Print return values (size? format?)
   Print parameter values (using DWARF info)
   Filter by duration (manage a stack to filter?
    Userland post-processing?)
Syscalls tracer

   Use existing syscall definition CPP wrapper
       Build a syscall metadata table
       Link syscall metadata table to syscall table

   Fast retrieval of the number of parameters on the fast
    path
       One-shot register saving (struct pt_regs)
   Fast retrieval of metadata on the slow path
       Retrieve parameter types and names, link them to their
        values (pretty-printing)
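
The fast path / slow path split can be sketched in user space as below. Everything here is an assumption made for illustration: the struct layout, the DEMO_NR_* syscall numbers, and the function names are hypothetical, and a plain array of unsigned long stands in for the argument registers of struct pt_regs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SYSCALL_MAX_ARGS 6

/* Per-syscall metadata, indexed by syscall number. */
struct syscall_metadata {
    const char *name;
    int nb_args;
    const char *types[SYSCALL_MAX_ARGS];
    const char *args[SYSCALL_MAX_ARGS];
};

/* Hypothetical syscall numbers, local to this sketch. */
enum { DEMO_NR_close, DEMO_NR_dup2, DEMO_NR_fcntl, DEMO_NR_MAX };

static const struct syscall_metadata syscalls_metadata[DEMO_NR_MAX] = {
    [DEMO_NR_close] = { "sys_close", 1, { "unsigned int" }, { "fd" } },
    [DEMO_NR_dup2]  = { "sys_dup2", 2,
                        { "unsigned int", "unsigned int" },
                        { "oldfd", "newfd" } },
    [DEMO_NR_fcntl] = { "sys_fcntl", 3,
                        { "unsigned int", "unsigned int", "unsigned long" },
                        { "fd", "cmd", "arg" } },
};

/* What the fast path stores in the trace buffer. */
struct syscall_record {
    int nr;
    unsigned long args[SYSCALL_MAX_ARGS];
};

/* Fast path: only nb_args is needed, then the arguments are copied in one shot. */
static int record_syscall_enter(struct syscall_record *rec, int nr,
                                const unsigned long *regs)
{
    if (nr < 0 || nr >= DEMO_NR_MAX)
        return -1;
    rec->nr = nr;
    memcpy(rec->args, regs,
           (size_t)syscalls_metadata[nr].nb_args * sizeof(regs[0]));
    return 0;
}

/* Slow path: look up the parameter names and pretty-print the record. */
static int print_syscall_enter(const struct syscall_record *rec,
                               char *buf, size_t size)
{
    const struct syscall_metadata *m = &syscalls_metadata[rec->nr];
    int len;

    assert(buf != NULL && size > 0);
    len = snprintf(buf, size, "%s(", m->name);
    for (int i = 0; i < m->nb_args && len >= 0 && (size_t)len < size; i++)
        len += snprintf(buf + len, size - (size_t)len, "%s%s: %lx",
                        i ? ", " : "", m->args[i], rec->args[i]);
    if (len >= 0 && (size_t)len < size)
        len += snprintf(buf + len, size - (size_t)len, ")");
    return len;
}
```

Recording DEMO_NR_dup2 with arguments 0xa and 0x1 and then printing the record yields sys_dup2(oldfd: a, newfd: 1). The types array is carried along but not printed, since the entry lines only show names and values.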
Syscall trace
   # tracer: syscall
    #
    #        TASK-PID CPU# TIMESTAMP FUNCTION
    #          |      |    |     |                  |
            bash-5606 [000] 2404.628180: sys_dup2(oldfd: a, newfd: 1)
            bash-5606 [000] 2404.628261: sys_dup2 -> 0x1
            bash-5606 [000] 2404.628264: sys_fcntl(fd: a, cmd: 1, arg: 0)
            bash-5606 [000] 2404.628267: sys_fcntl -> 0x1
            bash-5606 [000] 2404.628270: sys_close(fd: a)
            bash-5606 [000] 2404.628273: sys_close -> 0x0
            bash-5606 [000] 2404.628290: sys_rt_sigprocmask(how: 0, set: 0, oset: 6cf808, sigsetsize: 8)
            bash-5606 [000] 2404.628294: sys_rt_sigprocmask -> 0x0
Syscall tracing enhancements

   Build one ftrace event per syscall (ready)
       Provides filters and toggling, no need for a tracer
   Build a hashlist of complex types:
       Pointers to a structure: size?
       Format
       Link syscall metadata to this hashlist of complex
        types. For fast path, have two new fields in the
        syscall metadata:
            Bitmap of complex types for this syscall
            Size of parameter to save from the user pointer (or
             callback to save in case of very complex parameters).
Some other tracers

   Latency tracing (irqsoff, preemptoff,
    preemptirqsoff) requires snapshot mode
   Tracers awaiting conversion to ftrace events
       Kmemtrace
       Blktrace
       Boot tracer
   Tracers in an intermediate stage
       Power, sched, etc...
   Exceptions: mmiotrace...
Ftrace events

   Upper layer of tracepoints
   User-side togglable: the enable/set_event files
       By event
       By subsystem
       All
   Can be filtered using tunable rules
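
The three toggling granularities can be modelled with a small matcher for set_event-style strings. This is a simplified sketch: it assumes the forms "subsystem:event", "subsystem:*", "*:*", a bare event name, and a leading "!" to disable, and it only treats a lone "*" as a wildcard. The event table and the function names are hypothetical; the kernel's own parser handles more cases than this model does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct demo_event {
    const char *subsystem;
    const char *name;
    bool enabled;
};

/* A tiny stand-in for the list of registered events. */
static struct demo_event events[] = {
    { "sched", "sched_switch",      false },
    { "sched", "sched_wakeup",      false },
    { "irq",   "irq_handler_entry", false },
};

/* Compare the first len bytes of pattern with value; a lone "*" matches anything. */
static bool field_matches(const char *pattern, size_t len, const char *value)
{
    if (len == 1 && pattern[0] == '*')
        return true;
    return strlen(value) == len && strncmp(pattern, value, len) == 0;
}

/* "event" matches by name in any subsystem; "sub:event" matches both parts. */
static bool spec_matches(const char *spec, const struct demo_event *ev)
{
    const char *colon = strchr(spec, ':');

    if (!colon)
        return field_matches(spec, strlen(spec), ev->name);
    return field_matches(spec, (size_t)(colon - spec), ev->subsystem) &&
           field_matches(colon + 1, strlen(colon + 1), ev->name);
}

/* Enable (or, with a leading '!', disable) every event matching spec. */
static size_t apply_set_event(const char *spec)
{
    bool enable = true;
    size_t matched = 0;

    assert(spec != NULL);
    if (spec[0] == '!') {
        enable = false;
        spec++;
    }
    for (size_t i = 0; i < sizeof(events) / sizeof(events[0]); i++) {
        if (spec_matches(spec, &events[i])) {
            events[i].enabled = enable;
            matched++;
        }
    }
    return matched;
}
```

In this model apply_set_event("sched:sched_switch") toggles one event, apply_set_event("sched:*") a whole subsystem, and apply_set_event("*:*") everything, mirroring the by-event, by-subsystem, and all levels listed above.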
Defining an event

   TRACE_EVENT(name,
       TP_PROTO(proto),
       TP_ARGS(args),
       TP_STRUCT__entry(define fields),
       TP_fast_assign(assign_fields),
       TP_printk("fmt", fields)
    );
   Various kinds of fields
       Static: __field, __array
       Dynamic: __dynamic_array, __string
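
To make the shape of a definition concrete, here is a user-space imitation of the macro. It is emphatically not the kernel's TRACE_EVENT: the macro bodies below are stand-ins written for this sketch, only the static field kinds (__field, __array) are modelled, the event name sched_wakeup_demo is hypothetical, and the generated trace_<name>() function formats into a caller-supplied buffer instead of writing to the ring buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* User-space stand-ins for the kernel macros; NOT the kernel definitions. */
#define TP_PROTO(...)             __VA_ARGS__
#define TP_ARGS(...)              __VA_ARGS__
#define TP_STRUCT__entry(...)     __VA_ARGS__
#define TP_fast_assign(...)       __VA_ARGS__
#define TP_printk(...)            __VA_ARGS__
#define __field(type, item)       type item;
#define __array(type, item, len)  type item[len];

/*
 * Expands one definition into: the entry layout, a function that fills it,
 * a function that prints it, and a trace_<name>() wrapper doing both.
 * The final struct declaration lets the invocation end with a semicolon.
 */
#define TRACE_EVENT(name, proto, args, tstruct, assign, print)              \
    struct trace_entry_##name { tstruct };                                   \
    static void trace_assign_##name(struct trace_entry_##name *__entry,      \
                                    proto)                                   \
    { assign }                                                               \
    static int trace_print_##name(char *buf, size_t size,                    \
                                  const struct trace_entry_##name *__entry)  \
    { return snprintf(buf, size, print); }                                   \
    static int trace_##name(char *buf, size_t size, proto)                   \
    {                                                                        \
        struct trace_entry_##name entry;                                     \
        assert(buf != NULL && size > 0);                                     \
        trace_assign_##name(&entry, args);                                   \
        return trace_print_##name(buf, size, &entry);                        \
    }                                                                        \
    struct trace_entry_##name

/* A hypothetical event with one array field and two scalar fields. */
TRACE_EVENT(sched_wakeup_demo,
    TP_PROTO(const char *comm, int pid, int prio),
    TP_ARGS(comm, pid, prio),
    TP_STRUCT__entry(
        __array(char, comm, 16)
        __field(int, pid)
        __field(int, prio)
    ),
    TP_fast_assign(
        size_t n = strlen(comm);
        if (n >= sizeof(__entry->comm))
            n = sizeof(__entry->comm) - 1;
        memcpy(__entry->comm, comm, n);
        __entry->comm[n] = '\0';
        __entry->pid = pid;
        __entry->prio = prio;
    ),
    TP_printk("comm=%s pid=%d prio=%d",
              __entry->comm, __entry->pid, __entry->prio)
);
```

Calling trace_sched_wakeup_demo(buf, sizeof buf, "bash", 5606, 120) fills buf with comm=bash pid=5606 prio=120. The point of the single definition is that the field layout, the assignment code, and the print format all live in one place.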
Drawbacks of ftrace events

   CPP is somewhat limited
   Need for a specific tracer or dedicated code for
    (rare) low-level or ad hoc needs.
   No histogram / statistical tracing
Ideas for the future

   Ftrace is bad at stat/histogram tracing
   Use perfcounter as a powerful bridge and user
    interface
   Your ideas!
