Address Resolution Protocol (ARP)
Understanding the Bridge Between Layer 2 and Layer 3
Introduction to ARP
The Address Resolution Protocol (ARP) is a crucial network protocol used to map IP addresses to MAC (Media Access Control) addresses in local area networks. When devices need to communicate on the same network segment, they must know each other's MAC addresses to properly frame and deliver data at the data link layer.
Key Point: ARP operates at the boundary between Layer 2 (Data Link) and Layer 3 (Network) of the OSI model, translating logical IP addresses into physical MAC addresses.
Why ARP is Necessary
In TCP/IP networking, devices use IP addresses for logical addressing and routing, but at the data link layer, network interfaces deliver frames using MAC addresses. ARP solves the problem of discovering the MAC address associated with a known IP address.
Network Communication Layers
ARP Packet Structure
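ARP messages are carried directly inside Ethernet frames (EtherType 0x0806) and have a fixed structure. For IPv4 over Ethernet the message is 28 bytes long: hardware type, protocol type, hardware address length, protocol address length, operation code (1 = request, 2 = reply), then the sender MAC, sender IP, target MAC, and target IP addresses.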
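To make the layout concrete, the following sketch packs and parses an ARP message for IPv4 over Ethernet using the field order from RFC 826. The helper names (build_arp, parse_arp, mac_to_bytes) and the MAC addresses are illustrative assumptions, not part of any library.

```python
import socket
import struct

# ARP message for IPv4 over Ethernet (RFC 826), 28 bytes in network byte order:
# htype, ptype, hlen, plen, oper, sender MAC, sender IP, target MAC, target IP
ARP_FORMAT = "!HHBBH6s4s6s4s"
ARP_REQUEST, ARP_REPLY = 1, 2


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_arp(oper, sender_mac, sender_ip, target_mac, target_ip) -> bytes:
    return struct.pack(
        ARP_FORMAT,
        1,        # hardware type: Ethernet
        0x0800,   # protocol type: IPv4
        6,        # hardware address length in bytes
        4,        # protocol address length in bytes
        oper,
        mac_to_bytes(sender_mac),
        socket.inet_aton(sender_ip),
        mac_to_bytes(target_mac),
        socket.inet_aton(target_ip),
    )


def parse_arp(packet: bytes) -> dict:
    htype, ptype, hlen, plen, oper, sha, spa, tha, tpa = struct.unpack(
        ARP_FORMAT, packet[:28]
    )
    return {
        "htype": htype, "ptype": ptype, "hlen": hlen, "plen": plen,
        "oper": oper,
        "sender_mac": sha.hex(":"),          # bytes.hex(sep) needs Python 3.8+
        "sender_ip": socket.inet_ntoa(spa),
        "target_mac": tha.hex(":"),
        "target_ip": socket.inet_ntoa(tpa),
    }


# In a request the target MAC is unknown, so it is left as all zeros.
request = build_arp(ARP_REQUEST, "02:00:00:00:00:0a", "192.168.1.10",
                    "00:00:00:00:00:00", "192.168.1.100")
print(len(request), parse_arp(request)["target_ip"])
```

The sketch only builds the ARP payload; sending it on a real network would additionally require an Ethernet header and a raw socket.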
ARP Process Flow
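The ARP resolution process follows these steps: the sender checks its ARP cache, broadcasts an ARP request if no entry exists, receives a unicast reply from the device that owns the address, and stores the mapping in its cache for later use.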
Detailed ARP Operation
Step 1: ARP Request (Broadcast)
When a device needs to communicate with another device on the same network segment, it first checks its ARP cache. If no entry exists, it broadcasts an ARP request.
ARP Request Broadcast
ARP Request: "Who has IP 192.168.1.100? Tell 192.168.1.10"
Step 2: ARP Reply (Unicast)
The device with the requested IP address responds directly to the requesting device with its MAC address.
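The request and reply exchange can be illustrated with a small in-memory simulation of one network segment. This is a teaching sketch, not a network implementation: the Host class, the resolve function, and all addresses are assumptions made for the example.

```python
class Host:
    def __init__(self, ip, mac):
        self.ip, self.mac = ip, mac
        self.arp_cache = {}                      # ip -> mac

    def handle_request(self, sender_ip, sender_mac, target_ip):
        """Return a unicast reply if this host owns target_ip, else None."""
        if target_ip != self.ip:
            return None                          # not ours: ignore the broadcast
        self.arp_cache[sender_ip] = sender_mac   # learn the requester's mapping
        return {"dst": sender_mac, "src": self.mac, "ip": self.ip}


def resolve(sender, target_ip, segment):
    """Resolve target_ip from the sender's point of view on one segment."""
    if target_ip in sender.arp_cache:            # cache hit: no traffic needed
        return sender.arp_cache[target_ip]
    # Step 1: broadcast. Every other host on the segment sees the request.
    for host in segment:
        if host is sender:
            continue
        reply = host.handle_request(sender.ip, sender.mac, target_ip)
        # Step 2: only the owner of the address answers, and it answers
        # directly to the requester.
        if reply is not None:
            sender.arp_cache[target_ip] = reply["src"]
            return reply["src"]
    return None                                  # nobody answered


a = Host("192.168.1.10", "02:00:00:00:00:0a")
b = Host("192.168.1.100", "02:00:00:00:00:64")
c = Host("192.168.1.50", "02:00:00:00:00:32")
segment = [a, b, c]
print(resolve(a, "192.168.1.100", segment))
```

Host c sees the broadcast but neither replies nor caches anything, which mirrors how uninvolved devices treat a request that is not addressed to them.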
Linux Kernel ARP Implementation
The Linux kernel implements ARP on top of its generic neighbor subsystem, the part of the networking stack that also serves IPv6 Neighbor Discovery.
Overview: ARP is implemented as part of the neighbor discovery subsystem in Linux, handling the mapping between Layer 3 (IP) and Layer 2 (MAC) addresses. The implementation is primarily found in net/ipv4/arp.c and the generic neighbor code in net/core/neighbour.c.
Key Data Structures
1. struct neighbour
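The core data structure representing an ARP entry. Among other fields, it holds the resolved hardware address, the current NUD state, a timer, and a queue of packets waiting for resolution.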
2. struct neigh_table
Manages all neighbor entries for one protocol. The instance used for IPv4 ARP is arp_tbl, defined in net/ipv4/arp.c.
Linux Kernel ARP Processing Flow
ARP Processing in the Kernel
Sending ARP Requests
1. Packet Transmission Trigger: Every outgoing IPv4 packet passes through ip_finish_output2(), which needs the next hop's MAC address before the frame can be sent
2. Neighbor Lookup: The function looks up the neighbor entry for the next hop and calls __neigh_create() if none exists yet
3. ARP Request Generation: If the entry has no valid MAC address yet, its output function neigh_resolve_output() moves it to the NUD_INCOMPLETE state and:
Queues the original packet
Sends a broadcast ARP request through arp_solicit() and arp_send_dst()
Starts a timer for retransmission
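The three actions in step 3 (queue, request, retransmit) can be sketched as a toy model of an unresolved neighbor entry. This is not kernel code: the class, its method names, and the callback parameters are assumptions for illustration, and MAX_PROBES simply reuses the three-attempt default mentioned in this article.

```python
from collections import deque

MAX_PROBES = 3   # assumed retry limit, taken from the defaults in this article


class PendingNeighbour:
    """Toy model of an unresolved neighbor entry; not kernel code."""

    def __init__(self, ip):
        self.ip = ip
        self.mac = None
        self.queue = deque()   # packets waiting for the MAC address
        self.probes = 0
        self.failed = False

    def output(self, packet, send_request, transmit):
        if self.mac is not None:
            transmit(self.mac, packet)       # resolved: send immediately
            return
        self.queue.append(packet)            # queue the original packet
        if self.probes == 0:
            self.on_timer(send_request)      # first ARP request

    def on_timer(self, send_request):
        """Retransmission timer fired: retry, or give up after MAX_PROBES."""
        if self.mac is not None or self.failed:
            return
        if self.probes >= MAX_PROBES:
            self.failed = True
            self.queue.clear()               # drop packets that cannot be sent
            return
        self.probes += 1
        send_request(self.ip)

    def on_reply(self, mac, transmit):
        self.mac = mac
        while self.queue:                    # flush everything that was waiting
            transmit(mac, self.queue.popleft())
```

A caller would pass send_request and transmit as functions that put frames on the wire; in a test they can simply append to lists so the ordering of events is visible.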
Receiving ARP Packets
1. Packet Reception: Incoming ARP packets are handed to arp_rcv(), the handler the kernel registers for EtherType 0x0806 (ETH_P_ARP)
2. Validation: The kernel validates:
Hardware/protocol types
Address lengths
Network device flags
3. Processing:
ARP Cache State Machine
Neighbor entries in the Linux kernel go through several states:
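As a rough guide to those states, the sketch below lists the NUD_* values from include/uapi/linux/neighbour.h and models the most common transitions. The transition table is a deliberate simplification and the event names are invented for this example; the kernel's actual logic in net/core/neighbour.c handles many more cases.

```python
from enum import IntFlag


class NUD(IntFlag):
    # Numeric values as defined in include/uapi/linux/neighbour.h
    NONE = 0x00
    INCOMPLETE = 0x01   # request sent, no reply yet
    REACHABLE = 0x02    # recently confirmed
    STALE = 0x04        # usable, but not confirmed recently
    DELAY = 0x08        # waiting briefly for confirmation before probing
    PROBE = 0x10        # sending unicast probes
    FAILED = 0x20       # resolution failed
    NOARP = 0x40        # device does not need ARP
    PERMANENT = 0x80    # static entry


# States in which the stored MAC address may be used (kernel macro NUD_VALID).
NUD_VALID = (NUD.PERMANENT | NUD.NOARP | NUD.REACHABLE
             | NUD.PROBE | NUD.STALE | NUD.DELAY)

# Simplified transitions; event names are hypothetical labels for this sketch.
TRANSITIONS = {
    (NUD.NONE, "send"): NUD.INCOMPLETE,
    (NUD.INCOMPLETE, "reply"): NUD.REACHABLE,
    (NUD.INCOMPLETE, "retries_exhausted"): NUD.FAILED,
    (NUD.REACHABLE, "reachable_timeout"): NUD.STALE,
    (NUD.STALE, "send"): NUD.DELAY,
    (NUD.DELAY, "confirmed"): NUD.REACHABLE,
    (NUD.DELAY, "delay_timeout"): NUD.PROBE,
    (NUD.PROBE, "reply"): NUD.REACHABLE,
    (NUD.PROBE, "retries_exhausted"): NUD.FAILED,
}


def step(state, event):
    """Return the next state, or the same state if the event does not apply."""
    return TRANSITIONS.get((state, event), state)


state = NUD.NONE
for event in ["send", "reply", "reachable_timeout", "send", "delay_timeout", "reply"]:
    state = step(state, event)
    print(event, "->", state.name)
```

Static entries (PERMANENT) and devices that do not use ARP (NOARP) sit outside this cycle, which is why they have no transitions in the table.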
Key Kernel Functions
Core ARP Functions
Neighbor Subsystem Functions
Kernel Timers and Configuration
Default Timer Values:
Base reachable time: 30 seconds
Retransmission interval: 1 second
Number of retries: 3 attempts
Garbage collection interval: 30 seconds
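These defaults combine in two ways worth spelling out. To my understanding, the kernel does not use the base reachable time directly but picks a randomized value between half and one and a half times the base, and an unanswered resolution gives up after the retry count times the retransmission interval. The function names below are assumptions made for this sketch.

```python
import random

BASE_REACHABLE_TIME = 30.0   # seconds, default from the list above
RETRANS_TIME = 1.0           # seconds between retransmissions
MAX_PROBES = 3               # number of attempts


def random_reachable_time(base=BASE_REACHABLE_TIME, rng=random):
    """Pick a reachable time uniformly between 0.5 * base and 1.5 * base."""
    return rng.uniform(0.5 * base, 1.5 * base)


def resolution_timeout(retrans=RETRANS_TIME, probes=MAX_PROBES):
    """Approximate time before an unanswered resolution is declared failed."""
    return retrans * probes


print(round(random_reachable_time(), 1), "s reachable time for this entry")
print(resolution_timeout(), "s until resolution fails")
```

The randomization keeps entries created at the same moment from all expiring together.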
Kernel Configuration Parameters
Key /proc/sys/net/ipv4/neigh/ parameters:
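On a Linux host these values can be read straight from the filesystem. The sketch below reads a handful of commonly available files under the default/ directory; the function name and the selection of parameters are choices made for this example, and any file that is missing or unreadable on a given kernel is simply skipped.

```python
from pathlib import Path

# A selection of neighbor parameters; availability can vary by kernel version.
NEIGH_PARAMS = (
    "base_reachable_time_ms",
    "retrans_time_ms",
    "gc_stale_time",
    "gc_interval",
    "gc_thresh1",
    "gc_thresh2",
    "gc_thresh3",
)


def read_neigh_params(scope="default", root="/proc/sys/net/ipv4/neigh"):
    """Return {parameter: integer value} for every file that can be read.

    scope is "default" or an interface name such as "eth0".
    """
    values = {}
    for name in NEIGH_PARAMS:
        path = Path(root) / scope / name
        try:
            values[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            continue   # missing, unreadable, or not a single integer
    return values


if __name__ == "__main__":
    for name, value in read_neigh_params().items():
        print(f"{name} = {value}")
```

Passing an interface name as scope reads that interface's overrides; the garbage collection thresholds are table-wide and are only expected under default/.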
Modern Kernel Enhancements
Recent Linux kernels include several optimizations and features:
RCU (Read-Copy-Update): Lockless lookups for better performance in multi-core systems
Per-CPU Statistics: Reduced cache line bouncing for network statistics
ARP Offloading: Hardware acceleration support for high-performance NICs
Netlink Notifications: Real-time neighbor state change events via NETLINK_ROUTE
Gratuitous ARP: Automatic announcements for faster network convergence
ARP Filtering: Per-interface response policies via arp_filter and arp_ignore
ARP Cache/Table
Each device maintains an ARP cache (also called ARP table) to store recent IP-to-MAC address mappings, reducing the need for repeated ARP requests.
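On Linux the IPv4 cache is exposed as text in /proc/net/arp, which makes it easy to inspect programmatically. The parser below is a minimal sketch; the function name is an assumption, and the sample text uses made-up addresses in the file's six-column layout.

```python
ATF_COM = 0x2   # flag bit: entry is complete (a MAC address is known)


def parse_proc_net_arp(text):
    """Parse the contents of /proc/net/arp into a list of dictionaries."""
    entries = []
    for line in text.splitlines()[1:]:       # first line is the column header
        fields = line.split()
        if len(fields) != 6:
            continue                         # skip anything unexpected
        ip, _hw_type, flags, mac, _mask, device = fields
        entries.append({
            "ip": ip,
            "mac": mac,
            "device": device,
            "complete": bool(int(flags, 16) & ATF_COM),
        })
    return entries


SAMPLE = """\
IP address       HW type     Flags       HW address            Mask     Device
192.168.1.1      0x1         0x2         02:00:00:00:00:01     *        eth0
192.168.1.77     0x1         0x0         00:00:00:00:00:00     *        eth0
"""

for entry in parse_proc_net_arp(SAMPLE):
    print(entry)
```

On a real system the same function can be fed the result of reading /proc/net/arp; an entry with complete set to False is one whose resolution has not succeeded.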
Types of ARP
1. Standard ARP
Normal ARP operation for resolving IP addresses to MAC addresses within the same network segment.
2. Proxy ARP
A router or gateway responds to ARP requests on behalf of devices in other network segments.
3. Gratuitous ARP
A device sends an ARP request for its own IP address to detect conflicts and update other devices' ARP caches.
4. Reverse ARP (RARP)
Used to obtain an IP address when only the MAC address is known (largely obsoleted by DHCP).
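Of these variants, gratuitous ARP is the easiest to recognize in a packet: the sender IP and the target IP are the same. The sketch below shows how a receiving host might react to one, either refreshing its cache or noticing that another device claims its own address. The function name, return labels, and addresses are assumptions for illustration.

```python
def handle_announcement(my_ip, my_mac, cache, sender_ip, sender_mac, target_ip):
    """Classify an incoming ARP message and update the cache where sensible.

    cache maps IP address -> MAC address and is modified in place.
    """
    if sender_ip == my_ip and sender_mac != my_mac:
        return "conflict"            # someone else is using our IP address
    if sender_ip == target_ip:
        if sender_ip in cache:
            cache[sender_ip] = sender_mac    # refresh an existing mapping
        return "gratuitous"
    return "normal"


cache = {"192.168.1.20": "02:00:00:00:00:14"}

# The device at 192.168.1.20 announces a new MAC address (e.g. after failover).
print(handle_announcement("192.168.1.10", "02:00:00:00:00:0a", cache,
                          "192.168.1.20", "02:00:00:00:00:99", "192.168.1.20"))
print(cache)
```

This sketch only updates entries that already exist; whether a host also creates new entries from gratuitous ARP is a policy decision that differs between systems.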
ARP Security Considerations
Security Warning: ARP has no authentication. Hosts accept replies without verifying who sent them, which makes the protocol vulnerable to various attacks.
Common ARP Attacks:
ARP Spoofing/Poisoning: Attackers send fake ARP replies to associate their MAC address with another device's IP
Man-in-the-Middle: Intercepting traffic by poisoning ARP caches of both communication endpoints
ARP Flooding: Overwhelming switches with fake ARP requests to cause denial of service
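Spoofing and poisoning share one observable symptom: the MAC address associated with an IP address changes unexpectedly. A passive monitor can flag that, as in the sketch below. The ArpWatch class is hypothetical and written for this example; how ARP traffic is captured and fed to observe() is left out.

```python
class ArpWatch:
    """Remember the last MAC address seen for each IP and report changes."""

    def __init__(self):
        self.seen = {}   # ip -> mac

    def observe(self, ip, mac):
        """Record a sender (ip, mac) pair; return a warning string on change."""
        previous = self.seen.get(ip)
        self.seen[ip] = mac
        if previous is not None and previous != mac:
            return f"{ip} changed from {previous} to {mac}"
        return None


watch = ArpWatch()
observations = [
    ("192.168.1.1", "02:00:00:00:00:01"),
    ("192.168.1.1", "02:00:00:00:00:01"),   # same mapping: nothing to report
    ("192.168.1.1", "02:00:00:00:00:66"),   # gateway IP now claims a new MAC
]
for ip, mac in observations:
    warning = watch.observe(ip, mac)
    if warning:
        print("WARNING:", warning)
```

A change is not proof of an attack, since replacing a network card or a failover event produces the same pattern, so such warnings are best treated as prompts for investigation.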
Mitigation Strategies:
Linux-Specific Security Features
Troubleshooting ARP Issues
Common ARP Problems:
Duplicate IP addresses causing ARP conflicts
Stale ARP cache entries
ARP table overflow
Network loops causing ARP storms
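Several of these problems show up in the neighbor table itself. The sketch below parses text in the style printed by `ip neigh show` and picks out entries that are failed, incomplete, or stale. The function names and the sample lines are assumptions for this example, and the parser only relies on the address being the first token and the state being the last.

```python
def parse_ip_neigh(text):
    """Parse `ip neigh show` style lines into dictionaries."""
    entries = []
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 2:
            continue
        entry = {"ip": tokens[0], "state": tokens[-1], "dev": None, "mac": None}
        # "dev" and "lladdr" are each followed by their value when present.
        for key, field in (("dev", "dev"), ("lladdr", "mac")):
            if key in tokens[:-1]:
                entry[field] = tokens[tokens.index(key) + 1]
        entries.append(entry)
    return entries


def suspicious(entries, states=("FAILED", "INCOMPLETE", "STALE")):
    """Return entries whose state usually deserves a closer look."""
    return [e for e in entries if e["state"] in states]


SAMPLE = """\
192.168.1.1 dev eth0 lladdr 02:00:00:00:00:01 REACHABLE
192.168.1.50 dev eth0 FAILED
192.168.1.60 dev eth0 lladdr 02:00:00:00:00:3c STALE
"""

for entry in suspicious(parse_ip_neigh(SAMPLE)):
    print(entry["ip"], entry["state"])
```

A stale entry is normal on an idle host, so it only matters when traffic to that address is failing; failed and incomplete entries point to a target that is not answering.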
Conclusion
ARP is a fundamental protocol that enables communication between devices on local networks by mapping IP addresses to MAC addresses. While simple in concept, understanding ARP is crucial for network troubleshooting, security, and optimization. The Linux kernel implementation provides a robust and highly configurable ARP subsystem through the neighbor discovery framework, offering advanced features like RCU optimization, hardware offloading, and detailed state management.
Remember: ARP only works within the same broadcast domain. To reach a host on another network, the sender resolves the MAC address of its gateway instead; the destination's own MAC address is resolved only by the last router, once the packet reaches the destination network. Understanding both the protocol fundamentals and the kernel implementation details is essential for effective network administration and troubleshooting.