Understanding the Linux Interrupt Subsystem

Interrupts are a fundamental mechanism in operating systems that allow hardware devices to signal the CPU when they need attention. The Linux kernel's interrupt subsystem provides a framework for handling these asynchronous events efficiently. This article explores the architecture and operation of the Linux interrupt subsystem.

What are Interrupts?

┌─────────────────────────────────────────────────────────┐
│                                                         │
│                        CPU                              │
│                                                         │
│  ┌───────────────┐     ┌───────────────┐                │
│  │ Current Task  │     │ Interrupt     │                │
│  │ Execution     │────▶│ Handler       │                │
│  └───────────────┘     └───────▲───────┘                │
│                                │                        │
└────────────────────────────────┼────────────────────────┘
                                 │
                                 │ Interrupt Signal
                                 │
┌────────────────────────────────┬────────────────────────┐
│                                                         │
│                    Hardware Devices                     │
│                                                         │
│  ┌───────────┐  ┌───────────┐  ┌───────────┐            │
│  │ Network   │  │ Disk      │  │ Keyboard  │            │
│  │ Card      │  │ Controller│  │           │            │
│  └───────────┘  └───────────┘  └───────────┘            │
│                                                         │
└─────────────────────────────────────────────────────────┘
        

Interrupts allow hardware devices to signal the CPU when they need attention, such as when:

  • A network card receives a packet
  • A disk completes a read operation
  • A keyboard key is pressed
  • A timer expires

When an interrupt occurs, the CPU temporarily suspends its current execution, saves its state, and jumps to a specific interrupt handler routine to service the interrupt.

Linux Interrupt Subsystem Architecture

┌───────────────────────────────────────────────────────────┐
│                                                           │
│                    User Applications                      │
│                                                           │
└───────────────────────────┬───────────────────────────────┘
                            │
                            │ System Calls
                            │
┌───────────────────────────▼───────────────────────────────┐
│                                                           │
│                      Linux Kernel                         │
│                                                           │
│  ┌─────────────────┐      ┌─────────────────────────┐     │
│  │                 │      │                         │     │
│  │  Device Drivers │◄────►│    Interrupt Subsystem  │     │
│  │                 │      │                         │     │
│  └────────┬────────┘      └─────────────┬───────────┘     │
│           │                             │                 │
│           │                             │                 │
│  ┌────────▼────────┐      ┌─────────────▼───────────┐     │
│  │                 │      │                         │     │
│  │  Kernel Core    │◄────►│     Hardware Layer      │     │
│  │                 │      │                         │     │
│  └─────────────────┘      └─────────────────────────┘     │
│                                                           │
└───────────────────────────────────────────────────────────┘
        

The Linux interrupt subsystem consists of several key components:

  1. Interrupt Controllers: Hardware components that manage interrupt signals from devices
  2. Interrupt Handlers: Kernel functions that process specific interrupts
  3. Interrupt Descriptor Table (IDT): On x86, maps interrupt vectors to entry routines; other architectures use their own vector tables
  4. Softirqs and Tasklets: Deferred interrupt processing mechanisms
  5. Threaded Interrupts: Interrupt handlers that run in their own kernel threads

Interrupt Flow in Linux

┌───────────────────────────────────────────────────────────┐
│                                                           │
│                  Interrupt Flow                           │
│                                                           │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    │
│  │ Hardware    │    │ Interrupt   │    │ Top Half    │    │
│  │ Interrupt   │───►│ Controller  │───►│ (Handler)   │    │
│  └─────────────┘    └─────────────┘    └──────┬──────┘    │
│                                               │           │
│                                               ▼           │
│                                        ┌─────────────┐    │
│                                        │ Bottom Half │    │
│                                        │ Processing  │    │
│                                        └─────────────┘    │
│                                                           │
└───────────────────────────────────────────────────────────┘
        

When a hardware interrupt occurs:

  1. The device asserts an interrupt line
  2. The interrupt controller identifies the interrupt and signals the CPU
  3. The CPU acknowledges the interrupt and looks up the entry point for its vector (in the IDT on x86)
  4. The kernel executes the interrupt handler (top half)
  5. The handler schedules deferred work (bottom half) if needed
  6. The CPU returns to its previous task
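
The lookup-and-dispatch steps above can be sketched as a small user-space model. This is not kernel code: `NR_IRQS`, `irq_table`, `register_handler`, `dispatch_irq`, and `demo_top_half` are hypothetical names chosen for illustration, and the table stands in for the hardware vector lookup and the kernel's per-IRQ descriptors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_IRQS 16  /* size of this toy table; an assumption, not a kernel constant */

typedef void (*top_half_fn)(unsigned int irq);

static top_half_fn irq_table[NR_IRQS];      /* stand-in for the vector lookup */
static bool bottom_half_pending[NR_IRQS];   /* deferred work requested per IRQ */

/* Step 5: the top half only records that deferred work is needed. */
static void demo_top_half(unsigned int irq)
{
    bottom_half_pending[irq] = true;
}

/* Install a handler for one IRQ number. */
static void register_handler(unsigned int irq, top_half_fn fn)
{
    assert(irq < NR_IRQS);
    irq_table[irq] = fn;
}

/* Steps 3-4: look up the handler for the signalled IRQ and run it.
 * Returns false for a spurious interrupt (no handler installed). */
static bool dispatch_irq(unsigned int irq)
{
    if (irq >= NR_IRQS || irq_table[irq] == NULL)
        return false;
    irq_table[irq](irq);
    return true;
}
```

Calling `register_handler(1, demo_top_half)` followed by `dispatch_irq(1)` runs the handler and leaves `bottom_half_pending[1]` set, while dispatching an unregistered number is reported as spurious.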

Top Half vs. Bottom Half Processing

┌───────────────────────────────────────────────────────────┐
│                                                           │
│             Top Half vs. Bottom Half                      │
│                                                           │
│  ┌─────────────────────┐    ┌───────────────────────┐     │
│  │ Top Half            │    │ Bottom Half           │     │
│  │ (Interrupt Context) │    │ (Deferred Work)       │     │
│  │                     │    │                       │     │
│  │ - Fast execution    │    │ - Longer processing   │     │
│  │ - Interrupts off    │    │ - Interrupts on       │     │
│  │ - Cannot sleep      │    │ - Some can sleep      │     │
│  │ - Minimal work      │    │ - Complex processing  │     │
│  └─────────────────────┘    └───────────────────────┘     │
│                                                           │
└───────────────────────────────────────────────────────────┘
        

Linux divides interrupt handling into two parts:

1. Top Half (Interrupt Handler):

- Runs with interrupts disabled on the local CPU

- Must execute quickly

- Cannot sleep or block

- Acknowledges the interrupt and saves essential data

- Schedules the bottom half for later execution

2. Bottom Half (Deferred Work):

- Runs with interrupts enabled

- Can take more time to execute

- Can sleep only when it runs in process context (work queues, threaded IRQs); softirqs and tasklets cannot sleep

- Processes the data collected by the top half

- Implemented as softirqs, tasklets, work queues, or threaded IRQ handlers
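
The split between the two halves can be illustrated with a minimal user-space sketch: the top half stores a datum in a ring buffer and returns, and the bottom half drains the buffer later. The names `event_ring`, `top_half_push`, and `bottom_half_drain` are hypothetical, and the model is single-threaded; a real driver would also need locking or memory barriers between the two sides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 8  /* power of two so the index mask below works */

struct event_ring {
    int data[RING_SIZE];
    size_t head;  /* next slot the top half writes */
    size_t tail;  /* next slot the bottom half reads */
};

/* Top half: store one datum and return immediately.
 * Returns false (datum dropped) if the ring is full. */
static bool top_half_push(struct event_ring *r, int value)
{
    if (r->head - r->tail == RING_SIZE)
        return false;
    r->data[r->head & (RING_SIZE - 1)] = value;
    r->head++;
    return true;
}

/* Bottom half: drain everything collected so far and do the slower work
 * (here, just summing the values). Returns the number of items handled. */
static size_t bottom_half_drain(struct event_ring *r, long *sum)
{
    size_t handled = 0;

    assert(sum != NULL);
    while (r->tail != r->head) {
        *sum += r->data[r->tail & (RING_SIZE - 1)];
        r->tail++;
        handled++;
    }
    return handled;
}
```

The point of the design is that `top_half_push` does a bounded, constant amount of work, so the time spent with interrupts disabled stays short no matter how expensive the later processing is.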

Interrupt Types in Linux

┌───────────────────────────────────────────────────────────┐
│                                                           │
│                  Interrupt Types                          │
│                                                           │
│  ┌─────────────────────┐    ┌───────────────────────┐     │
│  │ Hardware Interrupts │    │ Software Interrupts   │     │
│  │                     │    │                       │     │
│  │ - Device-generated  │    │ - Kernel-generated    │     │
│  │ - Asynchronous      │    │ - Synchronous         │     │
│  │ - External events   │    │ - Internal events     │     │
│  └─────────────────────┘    └───────────────────────┘     │
│                                                           │
│  ┌─────────────────────┐    ┌───────────────────────┐     │
│  │ Maskable Interrupts │    │ Non-maskable Interrupts│    │
│  │                     │    │                       │     │
│  │ - Can be disabled   │    │ - Cannot be disabled  │     │
│  │ - Normal priority   │    │ - Highest priority    │     │
│  │ - Most devices      │    │ - Critical hardware   │     │
│  └─────────────────────┘    └───────────────────────┘     │
│                                                           │
└───────────────────────────────────────────────────────────┘
        

Linux handles several types of interrupts:

  1. Hardware Interrupts: Generated by physical devices
  2. Software Interrupts: Generated by the kernel itself
  3. Maskable Interrupts: Can be temporarily disabled
  4. Non-maskable Interrupts (NMIs): Cannot be disabled, used for critical events

Interrupt Request (IRQ) Numbers

┌───────────────────────────────────────────────────────────┐
│                                                           │
│                    IRQ Allocation                         │
│                                                           │
│  ┌─────────────────────┐    ┌───────────────────────┐     │
│  │ Static IRQs (0-15)  │    │ Dynamic IRQs (16+)    │     │
│  │                     │    │                       │     │
│  │ - Legacy devices    │    │ - Modern devices      │     │
│  │ - Fixed assignments │    │ - Allocated at boot   │     │
│  │ - Historical        │    │ - PCI, USB, etc.      │     │
│  └─────────────────────┘    └───────────────────────┘     │
│                                                           │
│  ┌─────────────────────────────────────────────────┐      │
│  │ Example Static IRQ Assignments                  │      │
│  │                                                 │      │
│  │ IRQ 0: System Timer                             │      │
│  │ IRQ 1: Keyboard                                 │      │
│  │ IRQ 2: Cascade for IRQs 8-15                    │      │
│  │ IRQ 3: COM2/COM4                                │      │
│  │ IRQ 4: COM1/COM3                                │      │
│  │ IRQ 8: Real-time Clock                          │      │
│  │ IRQ 14: Primary IDE                             │      │
│  └─────────────────────────────────────────────────┘      │
│                                                           │
└───────────────────────────────────────────────────────────┘
        

Each interrupt source is assigned an IRQ (Interrupt Request) number:

  • Static IRQs (0-15): Historically assigned to specific legacy devices
  • Dynamic IRQs (16 and above): Allocated dynamically to modern devices

Interrupt Controllers

┌───────────────────────────────────────────────────────────┐
│                                                           │
│                 Interrupt Controllers                     │
│                                                           │
│  ┌─────────────────────┐    ┌───────────────────────┐     │
│  │ Legacy PIC          │    │ Advanced APIC         │     │
│  │ (8259A)             │    │                       │     │
│  │                     │    │ - Multiprocessor      │     │
│  │ - 8+8 IRQs          │    │ - 256 vectors         │     │
│  │ - Single CPU        │    │ - Per-CPU local APIC  │     │
│  │ - Limited features  │    │ - I/O APIC            │     │
│  └─────────────────────┘    └───────────────────────┘     │
│                                                           │
│  ┌─────────────────────┐    ┌───────────────────────┐     │
│  │ GIC                 │    │ Platform-specific     │     │
│  │ (ARM)               │    │ Controllers           │     │
│  │                     │    │                       │     │
│  │ - ARM architecture  │    │ - Custom hardware     │     │
│  │ - Multiprocessor    │    │ - Specialized         │     │
│  │ - SMP support       │    │ - Embedded systems    │     │
│  └─────────────────────┘    └───────────────────────┘     │
│                                                           │
└───────────────────────────────────────────────────────────┘
        

Linux supports various interrupt controllers:

  1. Programmable Interrupt Controller (PIC): Legacy 8259A controller
  2. Advanced Programmable Interrupt Controller (APIC): Modern x86 systems
  3. Generic Interrupt Controller (GIC): ARM-based systems
  4. Platform-specific controllers: Custom controllers for specific hardware

Registering Interrupt Handlers

Device drivers register interrupt handlers using the request_irq() function:

int request_irq(
    unsigned int irq, 
    irq_handler_t handler, 
    unsigned long flags,
    const char *name, 
    void *dev
);
        

Where:

  • irq: The IRQ number to request
  • handler: Pointer to the interrupt handler function
  • flags: Interrupt flags (IRQF_SHARED, IRQF_ONESHOT, etc.)
  • name: Name for /proc/interrupts
  • dev: Cookie passed back to the handler and to free_irq(); must be non-NULL and unique for shared interrupts
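
The role of the flags and dev arguments on a shared line can be sketched with a user-space model. Everything here is hypothetical (`demo_request_irq`, `demo_fire`, `DEMO_IRQF_SHARED`, and the structs); it only mirrors the argument order of request_irq() and the convention that each handler reports whether its own device raised the interrupt.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_IRQF_SHARED 0x1  /* hypothetical flag value for this model */
#define MAX_ACTIONS 4

enum demo_irqreturn { DEMO_IRQ_NONE, DEMO_IRQ_HANDLED };
typedef enum demo_irqreturn (*demo_handler_t)(int irq, void *dev);

struct demo_action {
    demo_handler_t handler;
    unsigned long flags;
    const char *name;
    void *dev;
};

struct demo_irq_line {
    struct demo_action actions[MAX_ACTIONS];
    size_t count;
};

/* Register a handler on one line; returns 0 or a negative errno value. */
static int demo_request_irq(struct demo_irq_line *line, demo_handler_t handler,
                            unsigned long flags, const char *name, void *dev)
{
    assert(line != NULL);
    if (handler == NULL)
        return -EINVAL;
    /* A shared handler needs a dev cookie so it can be told apart later. */
    if ((flags & DEMO_IRQF_SHARED) && dev == NULL)
        return -EINVAL;
    /* An occupied line can be joined only if both parties agreed to share. */
    if (line->count > 0 &&
        !((flags & DEMO_IRQF_SHARED) &&
          (line->actions[0].flags & DEMO_IRQF_SHARED)))
        return -EBUSY;
    if (line->count == MAX_ACTIONS)
        return -ENOSPC;
    line->actions[line->count++] =
        (struct demo_action){ handler, flags, name, dev };
    return 0;
}

/* Example handler: dev points at an int that is nonzero while the
 * (imaginary) device has an interrupt pending. */
static enum demo_irqreturn demo_handler(int irq, void *dev)
{
    int *device_raised = dev;

    (void)irq;
    if (!*device_raised)
        return DEMO_IRQ_NONE;      /* not ours: let the next handler look */
    *device_raised = 0;            /* acknowledge the device */
    return DEMO_IRQ_HANDLED;
}

/* Offer the interrupt to every registered handler; count those that claimed it. */
static int demo_fire(struct demo_irq_line *line, int irq)
{
    int handled = 0;

    for (size_t i = 0; i < line->count; i++)
        if (line->actions[i].handler(irq, line->actions[i].dev) == DEMO_IRQ_HANDLED)
            handled++;
    return handled;
}
```

Registering two devices with `DEMO_IRQF_SHARED` succeeds, while a third request without the flag is rejected with `-EBUSY`, which is the situation a driver hits when it asks for exclusive use of a line that is already shared.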

Simplified example from a network driver:

static int rtl8169_open(struct net_device *dev)
{
    struct rtl8169_private *tp = netdev_priv(dev);
    int retval;

    // ... existing code ...

    /* Register the handler; dev is passed back to it as the dev cookie. */
    retval = request_irq(tp->irq, rtl8169_interrupt, IRQF_SHARED,
                         dev->name, dev);
    if (retval < 0)
        goto err_out;   /* error path is part of the elided code */

    // ... existing code ...
}
        

Deferred Interrupt Processing

┌───────────────────────────────────────────────────────────┐
│                                                           │
│              Deferred Processing Mechanisms               │
│                                                           │
│  ┌─────────────────────┐    ┌───────────────────────┐     │
│  │ Softirqs            │    │ Tasklets              │     │
│  │                     │    │                       │     │
│  │ - Static allocation │    │ - Dynamic allocation  │     │
│  │ - Parallel execution│    │ - Serial execution    │     │
│  │ - System-defined    │    │ - Built on softirqs   │     │
│  │ - Low-level         │    │ - Simpler interface   │     │
│  └─────────────────────┘    └───────────────────────┘     │
│                                                           │
│  ┌─────────────────────┐    ┌───────────────────────┐     │
│  │ Work Queues         │    │ Threaded IRQs         │     │
│  │                     │    │                       │     │
│  │ - Kernel threads    │    │ - Kernel thread       │     │
│  │ - Can sleep         │    │ - Can sleep           │     │
│  │ - Flexible          │    │ - Replaces bottom half│     │
│  │ - General purpose   │    │ - Modern approach     │     │
│  └─────────────────────┘    └───────────────────────┘     │
│                                                           │
└───────────────────────────────────────────────────────────┘
        

Linux provides several mechanisms for deferred interrupt processing:

1. Softirqs:

- Limited number of statically defined handlers

- Can run in parallel on multiple CPUs

- Used for high-frequency, performance-critical tasks

- Examples: network RX/TX, timers, scheduling

2. Tasklets:

- Built on top of softirqs

- Dynamically allocated

- Run serially (the same tasklet cannot run on multiple CPUs simultaneously)

- Simpler interface than softirqs

3. Work Queues:

- Run in the context of kernel worker threads

- Can sleep and block

- Used for tasks that may need to wait for resources

4. Threaded IRQs:

- Run the handler in its own kernel thread

- Can sleep and block

- Modern approach for complex device drivers

Softirq Implementation

┌───────────────────────────────────────────────────────────┐
│                                                           │
│                    Softirq Types                          │
│                                                           │
│  ┌─────────────────────────────────────────────────┐      │
│  │ HI_SOFTIRQ       - High priority tasklets       │      │
│  │ TIMER_SOFTIRQ    - Timer processing             │      │
│  │ NET_TX_SOFTIRQ   - Network transmit             │      │
│  │ NET_RX_SOFTIRQ   - Network receive              │      │
│  │ BLOCK_SOFTIRQ    - Block device operations      │      │
│  │ IRQ_POLL_SOFTIRQ - IRQ polling                  │      │
│  │ TASKLET_SOFTIRQ  - Regular tasklets             │      │
│  │ SCHED_SOFTIRQ    - Scheduler operations         │      │
│  │ HRTIMER_SOFTIRQ  - High-resolution timers       │      │
│  │ RCU_SOFTIRQ      - RCU processing               │      │
│  └─────────────────────────────────────────────────┘      │
│                                                           │
└───────────────────────────────────────────────────────────┘
        

Softirqs are processed at specific points in the kernel:

  • After returning from a hardware interrupt
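  • When kernel code invokes do_softirq() explicitly or re-enables bottom halves with local_bh_enable()
  • In the per-CPU ksoftirqd kernel threads, which take over when softirqs keep being re-raised or processing runs for too long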
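
Softirq bookkeeping is essentially a pending bitmask that is raised in one place and serviced in another. The sketch below is a user-space model under that assumption; the `demo_` names are hypothetical, the enum simply follows the order of the list in the diagram above, and the "service" step only records which entries would have run.

```c
#include <assert.h>
#include <stdint.h>

/* Same order as the softirq list above; a lower index is serviced first. */
enum demo_softirq {
    DEMO_HI, DEMO_TIMER, DEMO_NET_TX, DEMO_NET_RX, DEMO_BLOCK,
    DEMO_IRQ_POLL, DEMO_TASKLET, DEMO_SCHED, DEMO_HRTIMER, DEMO_RCU,
    NR_DEMO_SOFTIRQS
};

static uint32_t pending;                  /* one bit per softirq type */
static int run_order[NR_DEMO_SOFTIRQS];   /* records the service order */
static int run_count;

/* Called from a top half: just mark the softirq as pending. */
static void demo_raise_softirq(enum demo_softirq nr)
{
    assert(nr < NR_DEMO_SOFTIRQS);
    pending |= UINT32_C(1) << nr;
}

/* Snapshot the pending mask, clear it, and service the set bits
 * from the lowest index to the highest. */
static void demo_do_softirq(void)
{
    uint32_t snapshot = pending;

    pending = 0;
    run_count = 0;
    for (int nr = 0; nr < NR_DEMO_SOFTIRQS; nr++)
        if (snapshot & (UINT32_C(1) << nr))
            run_order[run_count++] = nr;
}
```

Raising `DEMO_NET_RX` and then `DEMO_TIMER` still services the timer entry first, because the order depends on the bit position and not on the order in which the softirqs were raised.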

Interrupt Context vs. Process Context

┌───────────────────────────────────────────────────────────┐
│                                                           │
│          Interrupt Context vs. Process Context            │
│                                                           │
│  ┌─────────────────────┐    ┌───────────────────────┐     │
│  │ Interrupt Context   │    │ Process Context       │     │
│  │                     │    │                       │     │
│  │ - Cannot sleep      │    │ - Can sleep           │     │
│  │ - Cannot access     │    │ - Can access          │     │
│  │   user space        │    │   user space          │     │
│  │ - Limited stack     │    │ - Full kernel stack   │     │
│  │ - Preemption off    │    │ - Preemptible         │     │
│  │ - Time-critical     │    │ - Not time-critical   │     │
│  └─────────────────────┘    └───────────────────────┘     │
│                                                           │
└───────────────────────────────────────────────────────────┘
        

Understanding the difference between interrupt context and process context is crucial:

1. Interrupt Context:

- Top half handlers and softirqs run in interrupt context

- Cannot sleep or block

- Cannot access user space memory

- Limited stack space

- Preemption is disabled

2. Process Context:

- Threaded IRQs and work queues run in process context

- Can sleep and block

- Can access user space memory (with proper checks)

- Normal kernel stack

- Can be preempted
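
One way to picture the distinction is a per-CPU counter with separate fields for hard and soft interrupt nesting, loosely inspired by how the kernel tracks its current context. The sketch below is a user-space model: the field layout, the `demo_` names, and the single global counter are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Field layout chosen for this model (an assumption, not copied from the kernel). */
#define DEMO_SOFTIRQ_SHIFT 8
#define DEMO_HARDIRQ_SHIFT 16
#define DEMO_SOFTIRQ_MASK  (0xffu << DEMO_SOFTIRQ_SHIFT)
#define DEMO_HARDIRQ_MASK  (0x0fu << DEMO_HARDIRQ_SHIFT)

static unsigned int demo_count;  /* per-CPU in a real system; global here */

/* Entering and leaving a top half (hard interrupt). */
static void demo_irq_enter(void)
{
    demo_count += 1u << DEMO_HARDIRQ_SHIFT;
}

static void demo_irq_exit(void)
{
    assert(demo_count & DEMO_HARDIRQ_MASK);  /* must pair with demo_irq_enter() */
    demo_count -= 1u << DEMO_HARDIRQ_SHIFT;
}

/* Entering and leaving softirq processing. */
static void demo_softirq_enter(void)
{
    demo_count += 1u << DEMO_SOFTIRQ_SHIFT;
}

static void demo_softirq_exit(void)
{
    assert(demo_count & DEMO_SOFTIRQ_MASK);  /* must pair with demo_softirq_enter() */
    demo_count -= 1u << DEMO_SOFTIRQ_SHIFT;
}

/* True while any hard or soft interrupt is being serviced. */
static bool demo_in_interrupt(void)
{
    return (demo_count & (DEMO_HARDIRQ_MASK | DEMO_SOFTIRQ_MASK)) != 0;
}

/* Sleeping is only legal in process context. */
static bool demo_may_sleep(void)
{
    return !demo_in_interrupt();
}
```

Code that might block can check `demo_may_sleep()` first; in this model it returns false from the moment a top half or softirq is entered until the matching exit call.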

Interrupt Statistics

Linux provides interrupt statistics through the /proc/interrupts file:

           CPU0       CPU1       
  0:         84          0   IO-APIC   2-edge      timer
  1:          9          0   IO-APIC   1-edge      i8042
  8:          0          1   IO-APIC   8-edge      rtc0
  9:          0          0   IO-APIC   9-fasteoi   acpi
 12:         15          0   IO-APIC  12-edge      i8042
 16:         31          0   IO-APIC  16-fasteoi   ehci_hcd:usb1
 23:        158          0   IO-APIC  23-fasteoi   ehci_hcd:usb2
 40:          0          0   PCI-MSI 458752-edge   PCIe PME
 41:     123747          0   PCI-MSI 512000-edge   eth0
 42:     115032          0   PCI-MSI 524288-edge   snd_hda_intel:card0
 43:          0          0   PCI-MSI 32768-edge    mei_me
 44:        536          0   PCI-MSI 360448-edge   nvme0q0
 45:        944          0   PCI-MSI 360449-edge   nvme0q1
        

This file shows:

  • IRQ number
  • Count of interrupts per CPU
  • Interrupt controller (chip), hardware IRQ number, and trigger type
  • Device name
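
A row in this format is easy to parse from user space. The following sketch extracts the IRQ number and sums the per-CPU counters of one numbered row; `parse_irq_line` is a hypothetical helper, and the caller is assumed to know the number of CPU columns (for example, from the header line).

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

/* Parse one numbered row such as
 *   " 41:     123747          0   PCI-MSI 512000-edge   eth0"
 * Stores the IRQ number and the sum of the first ncpus per-CPU counters.
 * Returns false for rows that do not start with "<number>:" (e.g. "NMI:")
 * or that contain fewer counters than expected. */
static bool parse_irq_line(const char *line, int ncpus,
                           unsigned int *irq, unsigned long long *total)
{
    char *end;
    unsigned long n;

    assert(line && irq && total && ncpus > 0);
    while (isspace((unsigned char)*line))
        line++;
    if (!isdigit((unsigned char)*line))
        return false;
    n = strtoul(line, &end, 10);
    if (*end != ':')
        return false;
    *irq = (unsigned int)n;
    *total = 0;
    line = end + 1;
    for (int cpu = 0; cpu < ncpus; cpu++) {
        while (isspace((unsigned char)*line))
            line++;
        if (!isdigit((unsigned char)*line))
            return false;
        *total += strtoull(line, &end, 10);
        line = end;
    }
    return true;
}
```

Feeding each line read from /proc/interrupts to this function and comparing the totals between two readings gives a per-IRQ interrupt count for the interval.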

Interrupt Handling in SMP Systems

┌───────────────────────────────────────────────────────────┐
│                                                           │
│              SMP Interrupt Handling                       │
│                                                           │
│  ┌─────────────────────┐    ┌───────────────────────┐     │
│  │ IRQ Balancing       │    │ Per-CPU Interrupts    │     │
│  │                     │    │                       │     │
│  │ - Distribute load   │    │ - Dedicated to one CPU│     │
│  │ - Dynamic routing   │    │ - No contention       │     │
│  │ - irqbalance daemon │    │ - Cache-friendly      │     │
│  └─────────────────────┘    └───────────────────────┘     │
│                                                           │
│  ┌─────────────────────┐    ┌───────────────────────┐     │
│  │ CPU Affinity        │    │ NUMA Considerations   │     │
│  │                     │    │                       │     │
│  │ - Manual assignment │    │ - Local interrupts    │     │
│  │ - Performance tuning│    │ - Memory locality     │     │
│  │ - /proc/irq/N/smp_  │    │ - Node awareness      │     │
│  │   affinity          │    │                       │     │
│  └─────────────────────┘    └───────────────────────┘     │
│                                                           │
└───────────────────────────────────────────────────────────┘
        

In multi-processor systems, interrupt handling becomes more complex:

1. IRQ Balancing:

- Distributes interrupts across CPUs

- Performed in user space by the irqbalance daemon

- Aims to balance interrupt load while maintaining cache locality

2. Per-CPU Interrupts:

- Some interrupts can be dedicated to specific CPUs

- Reduces cache thrashing and lock contention

- Improves performance for high-frequency interrupts

3. CPU Affinity:

- Manually assign interrupts to specific CPUs

- Controlled via /proc/irq/N/smp_affinity

- Useful for performance tuning
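
The smp_affinity file takes a hexadecimal CPU bitmask in which bit N stands for CPU N. The helper below builds that string from a list of CPU numbers; `affinity_mask_hex` is a hypothetical name, and this sketch only handles CPUs 0 to 31 (larger systems use a longer, comma-separated mask).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Build the hexadecimal bitmask for smp_affinity: bit N set means CPU N
 * may service the interrupt. Returns the snprintf() result. */
static int affinity_mask_hex(const unsigned int *cpus, size_t ncpus,
                             char *buf, size_t buflen)
{
    uint32_t mask = 0;

    assert(cpus != NULL && buf != NULL);
    for (size_t i = 0; i < ncpus; i++) {
        assert(cpus[i] < 32);   /* limit of this sketch */
        mask |= UINT32_C(1) << cpus[i];
    }
    return snprintf(buf, buflen, "%x", (unsigned int)mask);
}
```

For CPUs 0 and 2 the helper produces the string `5`, which a privileged process could then write to the smp_affinity file of the chosen IRQ.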

Real-Time Considerations

┌───────────────────────────────────────────────────────────┐
│                                                           │
│              Real-Time Interrupt Handling                 │
│                                                           │
│  ┌─────────────────────┐    ┌───────────────────────┐     │
│  │ Threaded IRQs       │    │ IRQ Time Limits       │     │
│  │                     │    │                       │     │
│  │ - Preemptible       │    │ - Detect long handlers│     │
│  │ - Prioritized       │    │ - Debug facilities    │     │
│  │ - Reduced latency   │    │ - Latency tracking    │     │
│  └─────────────────────┘    └───────────────────────┘     │
│                                                           │
│  ┌─────────────────────┐    ┌───────────────────────┐     │
│  │ RT Priority         │    │ Interrupt Off Time    │     │
│  │                     │    │                       │     │
│  │ - SCHED_FIFO        │    │ - Minimize time with  │     │
│  │ - Configurable      │    │   interrupts disabled │     │
│  │ - Deterministic     │    │ - Critical for RT     │     │
│  └─────────────────────┘    └───────────────────────┘     │
│                                                           │
└───────────────────────────────────────────────────────────┘
        

For real-time systems, interrupt handling requires special considerations:

1. Threaded IRQs:

- Move most interrupt processing to preemptible kernel threads

- Allow higher-priority tasks to preempt interrupt handling

- Reduce the latency that interrupt handling imposes on high-priority tasks

2. Priority-Based Handling:

- Assign priorities to interrupt threads

- Process critical interrupts before less important ones

- Provide deterministic response times

3. Minimizing Interrupt-Off Time:

- Reduce the time spent with interrupts disabled

- Keep top half handlers as short as possible

- Move processing to bottom halves

Interrupt-Driven I/O Example

Let's look at a simplified example of interrupt-driven I/O in a network driver:

┌───────────────────────────────────────────────────────────┐
│                                                           │
│              Network Driver Interrupt Flow                │
│                                                           │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    │
│  │ Packet      │    │ Hardware    │    │ Top Half    │    │
│  │ Arrives     │───►│ Interrupt   │───►│ Handler     │    │
│  └─────────────┘    └─────────────┘    └──────┬──────┘    │
│                                               │           │
│                                               ▼           │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    │
│  │ Protocol    │    │ NAPI        │    │ NET_RX      │    │
│  │ Stack       │◄───│ Poll        │◄───│ Softirq     │    │
│  └─────────────┘    └─────────────┘    └─────────────┘    │
│                                                           │
└───────────────────────────────────────────────────────────┘
        

1. Packet Arrival:

- Network card receives a packet

- Card generates an interrupt

2. Top Half Handler:

- Acknowledges the interrupt

- Disables further interrupts from the device

- Schedules the NET_RX_SOFTIRQ softirq

- Returns quickly

3. Bottom Half Processing:

- NET_RX_SOFTIRQ runs the NAPI polling function

- Driver processes received packets in batches

- Packets are passed to the network stack

- Re-enables interrupts when queue is empty

This approach balances responsiveness with efficiency by:

  • Responding quickly to the initial interrupt
  • Processing multiple packets in a single poll cycle
  • Avoiding interrupt storms during high traffic
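
The interplay between the top half and the budgeted poll can be sketched as a user-space model. The struct and function names (`demo_nic`, `demo_nic_interrupt`, `demo_nic_poll`) are hypothetical; the model only captures the rule that polling continues while a full budget of packets was consumed and stops, re-enabling the device interrupt, once the queue runs dry.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_nic {
    int rx_queue_len;     /* packets waiting in the receive ring */
    bool irq_enabled;     /* is the device interrupt unmasked? */
    bool poll_scheduled;  /* is the poll function queued for the softirq? */
};

/* Top half: mask the device interrupt and schedule polling. */
static void demo_nic_interrupt(struct demo_nic *nic)
{
    nic->irq_enabled = false;
    nic->poll_scheduled = true;
}

/* Bottom half: handle at most `budget` packets. If the queue was emptied
 * before the budget ran out, stop polling and unmask the interrupt;
 * otherwise stay scheduled so the softirq calls the poll function again. */
static int demo_nic_poll(struct demo_nic *nic, int budget)
{
    int done = 0;

    assert(budget > 0);
    while (done < budget && nic->rx_queue_len > 0) {
        nic->rx_queue_len--;   /* "deliver" one packet to the stack */
        done++;
    }
    if (done < budget) {
        nic->poll_scheduled = false;
        nic->irq_enabled = true;
    }
    return done;
}
```

With 100 queued packets and a budget of 64, the first poll handles 64 packets and stays scheduled, and the second handles the remaining 36 and switches the device back to interrupt mode. In a real driver the corresponding steps would go through the NAPI helpers such as napi_schedule() and napi_complete_done().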

Interrupt Mitigation Techniques

┌───────────────────────────────────────────────────────────┐
│                                                           │
│              Interrupt Mitigation                         │
│                                                           │
│  ┌─────────────────────┐    ┌───────────────────────┐     │
│  │ NAPI                │    │ Interrupt Coalescing  │     │
│  │ (New API)           │    │                       │     │
│  │                     │    │ - Hardware bundles    │     │
│  │ - Polling + IRQs    │    │   multiple events     │     │
│  │ - Adaptive          │    │ - Single interrupt for│     │
│  │ - High throughput   │    │   multiple packets    │     │
│  └─────────────────────┘    └───────────────────────┘     │
│                                                           │
│  ┌─────────────────────┐    ┌───────────────────────┐     │
│  │ Throttling          │    │ Dynamic Interrupt     │     │
│  │                     │    │ Moderation            │     │
│  │ - Limit IRQ rate    │    │                       │     │
│  │ - Configurable      │    │ - Adaptive algorithms │     │
│  │ - Device/driver     │    │ - Load-based          │     │
│  │   setting           │    │ - Self-tuning         │     │
│  └─────────────────────┘    └───────────────────────┘     │
│                                                           │
└───────────────────────────────────────────────────────────┘
        

To prevent interrupt overload, Linux employs several mitigation techniques:

1. NAPI (New API):

- Hybrid approach combining interrupts and polling

- Uses interrupts during low traffic

- Switches to polling during high traffic

- Reduces interrupt overhead while maintaining responsiveness

2. Interrupt Coalescing:

- Hardware bundles multiple events into a single interrupt

- Configurable via ethtool for network devices

- Balances latency and throughput

3. Interrupt Throttling:

- Limits the rate of interrupts

- Prevents a single device from monopolizing CPU time

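- Configured in the device or its driver (for example, the InterruptThrottleRate module parameter of some Intel NIC drivers); the kernel has no generic /proc/irq/N/throttle file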

Debugging Interrupt Issues

Linux provides several tools for debugging interrupt-related issues:

  1. /proc/interrupts: Shows interrupt counts per CPU
  2. /proc/stat: Includes interrupt statistics
  3. ftrace: Kernel tracing facility for interrupt events
  4. perf: Performance analysis tool with interrupt tracing
  5. kernelshark: Graphical viewer for ftrace data

Common interrupt-related issues include:

  • Interrupt storms (excessive interrupts from a device)
  • Interrupt latency (delayed response to interrupts)
  • Interrupt conflicts (devices on the same IRQ whose drivers do not support sharing)
  • Missing interrupts (device not generating expected interrupts)
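
A first check for an interrupt storm is to compare two readings of the same counter from /proc/interrupts. The helpers below are a minimal sketch with hypothetical names, and the threshold is something the caller has to choose for the device in question; no particular value is implied here.

```c
#include <assert.h>
#include <stdbool.h>

/* Interrupts per second between two readings of the same counter. */
static double irq_rate(unsigned long long before, unsigned long long after,
                       double interval_sec)
{
    assert(interval_sec > 0.0 && after >= before);
    return (double)(after - before) / interval_sec;
}

/* Flag a possible storm when the rate exceeds a caller-chosen threshold. */
static bool looks_like_storm(double rate, double threshold_per_sec)
{
    return rate > threshold_per_sec;
}
```

A counter that advances by 10000 over two seconds gives a rate of 5000 interrupts per second; whether that is normal depends entirely on the device, so the result is a hint for further tracing and not a diagnosis.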

Conclusion

The Linux interrupt subsystem provides a sophisticated framework for handling asynchronous hardware events efficiently. By dividing interrupt handling into top and bottom halves, employing various deferred processing mechanisms, and implementing mitigation techniques, Linux achieves a balance between responsiveness and throughput.

Understanding how interrupts work in Linux is essential for kernel developers, device driver authors, and system administrators who need to diagnose and optimize system performance. The interrupt subsystem continues to evolve, with ongoing improvements for real-time performance, scalability on many-core systems, and support for new hardware architectures.
