Introduction to Linux Kernel
   TCP/IP protocol stack
                雕梁
      Core System Server Platform Group
      diaoliang@taobao.com
   simohayha.bobo@gmail.com
     http://www.pagefault.info
             2011/01/15
Agenda
Introduction

Networking code in the Linux kernel tree

L2 (Link Layer)

L3 (Network Layer)

L4 (Transport Layer)

Config and benchmark tools

Resource
Introduction

   Source
      http://git.kernel.org/
      net-next-2.6 and net-2.6
   Developer
      Alan Cox, David Miller, Eric Dumazet, Patrick McHardy,
      etc.
   Traffic directions
      input, forward, and output
   Layer
      L2(Link Layer)/L3(Network Layer)/L4(Transport Layer)
   Device interface
      PCI/PCI-E
Networking code in the Linux kernel tree



[Figure: Net-Kernel source tree]
[Figure: Big picture]
Link layer
  Frame type
     802.3/802.2/802.2-SNAP/Ethernet
  Input
     Driver
        NAPI
             Poll + Interrupt
     Soft interrupt
        GRO
             feed packet to network stack
        RPS/RFS
             steer packets across CPUs on SMP
     Protocol handler
        use eth_type_trans
        packet_type list
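
The protocol-handler step can be sketched in user space. This is a minimal illustration, not kernel code: `classify_frame` follows the rule eth_type_trans applies to the type/length field, and the `handlers` dictionary with its `register_handler` and `deliver` helpers is a hypothetical stand-in for the packet_type list.

```python
import struct

ETH_HLEN = 14
ETH_P_802_3_MIN = 0x0600   # values at or above this are EtherTypes
ETH_P_IP = 0x0800
ETH_P_802_3 = 0x0001       # pseudo-type: raw 802.3 frame without LLC
ETH_P_802_2 = 0x0004       # pseudo-type: 802.2 LLC frame


def classify_frame(frame: bytes) -> int:
    """Return the protocol ID of a frame from its type/length field."""
    if len(frame) < ETH_HLEN:
        raise ValueError("frame shorter than an Ethernet header")
    (type_or_len,) = struct.unpack_from("!H", frame, 12)
    if type_or_len >= ETH_P_802_3_MIN:
        return type_or_len            # Ethernet II: the field is an EtherType
    # Otherwise the field is a length (802.3). A payload that starts with
    # 0xFFFF is treated as raw 802.3, anything else as 802.2 LLC.
    if frame[ETH_HLEN:ETH_HLEN + 2] == b"\xff\xff":
        return ETH_P_802_3
    return ETH_P_802_2


# Hypothetical stand-in for the packet_type list: protocol ID -> handlers.
handlers = {}


def register_handler(proto, func):
    handlers.setdefault(proto, []).append(func)


def deliver(frame: bytes):
    """Pass the payload to every handler registered for the frame's protocol."""
    proto = classify_frame(frame)
    return [func(frame[ETH_HLEN:]) for func in handlers.get(proto, [])]
```

A handler registered for ETH_P_IP sees only frames whose type field is 0x0800; frames with no registered handler are simply dropped by this sketch.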
Link layer
  Output
      Traffic Control
      Soft interrupt
          Transmit SKB
              Scatter/Gather DMA
          Free skb
          XPS
              multiqueue
              avoid cache line bouncing
              improve locality
  Bridge
      Virtual device; must be bound to one or more real devices
      Spanning Tree Protocol
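
XPS is configured per transmit queue by writing a CPU bitmap to sysfs. The sketch below only builds the bitmap string and the path; it assumes the usual `/sys/class/net/<dev>/queues/tx-<n>/xps_cpus` layout, and the helper names `cpu_mask` and `xps_path` are made up for illustration.

```python
def cpu_mask(cpus):
    """Build a hex CPU bitmap string (bit N set means CPU N is allowed)."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        mask |= 1 << cpu
    # Bitmaps wider than 32 bits are written as comma-separated 32-bit
    # words, most significant word first.
    words = []
    while mask or not words:
        words.append(mask & 0xFFFFFFFF)
        mask >>= 32
    head, *rest = reversed(words)
    return ",".join([f"{head:x}"] + [f"{word:08x}" for word in rest])


def xps_path(dev, txq):
    """Path of the XPS CPU map for one transmit queue (assumed sysfs layout)."""
    return f"/sys/class/net/{dev}/queues/tx-{txq}/xps_cpus"
```

Writing `cpu_mask([0, 1])` to `xps_path("eth0", 0)` would restrict that queue to CPUs 0 and 1; the write needs root and is left out of the sketch.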
[Figure: Link layer big map]
Network Layer(IP)
 Input
    Protocol handler
        net_protocol array
    defragment
        Hashtable
            Each IP packet being reassembled is kept in its own list
        Fragments stay in kernel memory until the packet is fully
        processed
 Output
    fragment
        MTU
        Scatter/Gather IO
        udp
    neighboring
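
Fragmentation against the MTU and hash-table reassembly can be sketched together. This is a simplified model, assuming a 20-byte IPv4 header without options and no duplicate or overlapping fragments; `fragment` and `Reassembler` are illustrative names, not kernel functions.

```python
from collections import defaultdict

IP_HEADER_LEN = 20  # assumption: IPv4 header without options


def fragment(payload: bytes, mtu: int):
    """Split payload into (offset_in_8_byte_units, more_fragments, data)."""
    # Every fragment except the last must carry a multiple of 8 bytes.
    max_data = (mtu - IP_HEADER_LEN) & ~7
    if max_data <= 0:
        raise ValueError("MTU too small")
    frags = []
    for start in range(0, len(payload), max_data):
        chunk = payload[start:start + max_data]
        more = start + len(chunk) < len(payload)
        frags.append((start // 8, more, chunk))
    return frags


class Reassembler:
    """Hash table keyed per packet; each entry holds that packet's fragments."""

    def __init__(self):
        self.queues = defaultdict(list)

    def add(self, key, offset, more, data):
        """Store one fragment; return the payload once the packet is complete."""
        queue = self.queues[key]
        queue.append((offset * 8, more, data))
        queue.sort(key=lambda frag: frag[0])
        if queue[-1][1]:
            return None            # final fragment (MF clear) not seen yet
        expected = 0
        for start, _, chunk in queue:
            if start != expected:
                return None        # a hole remains
            expected += len(chunk)
        del self.queues[key]       # release the entry after reassembly
        return b"".join(chunk for _, _, chunk in queue)
```

With a 1500-byte MTU each full fragment carries 1480 bytes, so a 4000-byte payload splits into 1480 + 1480 + 1040 bytes. The key would typically be (source, destination, IP ID, protocol).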
Network Layer(IP)
 Forward
    process IP options
    ignore defragmentation
         Router Alert option
 Route
    Forwarding Information Base(routing table)
    cache
 Netfilter
    Hook points
         NF_IP_LOCAL_OUT, NF_IP_LOCAL_IN, etc.
 Management
    Long-living IP peer information
          AVL tree
    IP statistics
          per-CPU data (ipstats_mib)
         /proc/net/snmp
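
The IP statistics exported in /proc/net/snmp come as pairs of lines: one with counter names, one with values. A minimal parser sketch, assuming that two-line layout holds for every protocol block (`parse_snmp` and `read_snmp` are illustrative names):

```python
def parse_snmp(text: str):
    """Parse /proc/net/snmp-style text into {protocol: {counter: value}}."""
    stats = {}
    lines = [line for line in text.splitlines() if line.strip()]
    # Lines come in pairs: a header line of names, then a line of values.
    for header, values in zip(lines[0::2], lines[1::2]):
        proto, names = header.split(":", 1)
        value_proto, numbers = values.split(":", 1)
        if proto != value_proto:
            raise ValueError(f"mismatched lines for {proto!r} and {value_proto!r}")
        stats[proto] = dict(zip(names.split(), map(int, numbers.split())))
    return stats


def read_snmp(path="/proc/net/snmp"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as f:
        return parse_snmp(f.read())
```

On a Linux host, `read_snmp()["Ip"]` returns the IP counters as a dictionary keyed by the names in the header line.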
[Figure: Network layer big map]
Transport Layer (tcp)
  Init
    bind callback (sock_create)
    Three-way handshake
        accept queue
        SYN table
        create new socket fd and change state
  Manage socket
    inet_ehash_bucket
        TCP_ESTABLISHED <= sk->sk_state < TCP_CLOSE
    inet_bind_hashbucket
        local binding port info
    listening_hash
        sockets in TCP_LISTEN state
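
How the three tables cooperate on an incoming segment can be sketched with plain dictionaries. This is a toy model: the class `SocketTables` and its methods are hypothetical, and the real tables hash into buckets rather than using exact-match dictionaries.

```python
class SocketTables:
    """Toy versions of the established, bind, and listening hash tables."""

    def __init__(self):
        self.ehash = {}           # (laddr, lport, raddr, rport) -> socket
        self.bhash = {}           # local port -> sockets bound to it
        self.listening_hash = {}  # local port -> listening socket

    def bind(self, sock, lport):
        self.bhash.setdefault(lport, []).append(sock)

    def listen(self, sock, lport):
        self.listening_hash[lport] = sock

    def establish(self, sock, laddr, lport, raddr, rport):
        self.ehash[(laddr, lport, raddr, rport)] = sock

    def lookup(self, laddr, lport, raddr, rport):
        """Find the socket for a segment: exact 4-tuple first, then listener."""
        sock = self.ehash.get((laddr, lport, raddr, rport))
        if sock is not None:
            return sock
        return self.listening_hash.get(lport)
```

A segment for an existing connection matches the established table; a SYN from a new peer falls through to the listening socket on the same port.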
Transport Layer (tcp)
 Output
    TCP push
    Congestion control
        state transition
        congestion window
        packet count
 Input
    fast path and slow path
    Interrupt context/ Process context
    sk_backlog/receive_queue/prequeue
 TCP state transition
    Kernel control
 Timer
    Retransmit/keep-alive/time-wait, etc.
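
The packet-counted congestion window can be illustrated with a simplified Reno-style model. This is a sketch under stated assumptions: it covers only slow start, congestion avoidance, and a halving on loss, and leaves out the congestion-state machine; `RenoWindow` is an illustrative name.

```python
class RenoWindow:
    """Congestion window counted in packets, Reno-style (simplified)."""

    def __init__(self, cwnd=2, ssthresh=64):
        self.cwnd = cwnd          # congestion window, in packets
        self.ssthresh = ssthresh  # slow-start threshold, in packets
        self.cwnd_cnt = 0         # ACKs counted toward the next linear step

    def on_ack(self):
        if self.cwnd < self.ssthresh:
            self.cwnd += 1                  # slow start: +1 packet per ACK
            return
        self.cwnd_cnt += 1                  # congestion avoidance:
        if self.cwnd_cnt >= self.cwnd:      # +1 packet per window of ACKs
            self.cwnd += 1
            self.cwnd_cnt = 0

    def on_loss(self):
        # Simplification: halve the window and continue from the new threshold.
        self.ssthresh = max(self.cwnd // 2, 2)
        self.cwnd = self.ssthresh
        self.cwnd_cnt = 0
```

Below ssthresh the window grows by one packet per ACK; above it, one packet per full window of ACKs.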
[Figure: TCP big map]
Config and Benchmark Tools

 Ethtool
    offload features
 Benchmark and test tools
    Netperf/pktgen
    Mpstat/tcpstat

 Proc FileSystem
    /proc/net
    /proc/sys/net
        ipv4
        core
 Sys FileSystem
    /sys/class/net/ethx
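
Tunables under /proc/sys/net map directly onto dotted sysctl names. A minimal reader sketch, assuming no name component contains a dot itself (VLAN interface names such as eth0.100 would break this mapping); `sysctl_path` and `read_sysctl` are illustrative helpers.

```python
from pathlib import Path


def sysctl_path(name: str, root: str = "/proc/sys") -> Path:
    """Map a dotted name such as net.ipv4.tcp_rmem to its /proc/sys path."""
    return Path(root).joinpath(*name.split("."))


def read_sysctl(name: str, root: str = "/proc/sys"):
    """Return the whitespace-separated fields of a sysctl value."""
    return sysctl_path(name, root).read_text().split()
```

For example, `read_sysctl("net.ipv4.tcp_rmem")` returns the three receive-buffer sizes as strings, and `read_sysctl("net.core.somaxconn")` returns a single-element list.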
Resource

 http://kernelnewbies.org

 http://kernel.org

 http://www.kernelplanet.org

 https://lkml.org

 http://vger.kernel.org/vger-lists.html

 http://www.pagefault.info/?tag=kernel
