VPP Linux CP - Part7

About this series

Ever since I first saw VPP - the Vector Packet Processor - I have been deeply impressed with its performance and versatility. For those of us who have used Cisco IOS/XR devices, like the classic ASR (aggregation services router), VPP will look and feel quite familiar as many of the approaches are shared between the two. One thing notably missing is the higher-level control plane, that is to say: there is no OSPF or ISIS, BGP, LDP and the like. This series of posts details my work on a VPP plugin which is called the Linux Control Plane, or LCP for short, which creates Linux network devices that mirror their VPP dataplane counterparts. IPv4 and IPv6 traffic, and associated protocols like ARP and IPv6 Neighbor Discovery, can now be handled by Linux, while the heavy lifting of packet forwarding is done by the VPP dataplane. Or, said another way: this plugin will allow Linux to use VPP as a software ASIC for fast forwarding, filtering, NAT, and so on, while keeping control of the interface state (links, addresses and routes) itself. When the plugin is completed, running software like FRR or Bird on top of VPP and achieving >100Mpps and >100Gbps forwarding rates will be well in reach!

Running in Production

In the first articles from this series, I showed the code that needed to be written to implement the Control Plane and Netlink Listener plugins. In the penultimate post, I wrote an SNMP AgentX that exposes the VPP interface data to, say, LibreNMS.

But what are the things one might do to deploy a router end-to-end? That is the topic of this post.

A note on hardware

Before I get into the details, here are some specifications of the router hardware that I use at IPng Networks (AS50869). See more about our network here.

The chassis is a Supermicro SYS-5018D-FN8T, which includes:

  • Full IPMI support (power, serial-over-lan and kvm-over-ip with HTML5), on a dedicated network port.

  • A 4-core, 8-thread Xeon D1518 CPU which runs at 35W

  • Two independent Intel i210 NICs (Gigabit)

  • A Quad Intel i350 NIC (Gigabit)

  • Two Intel X552 (TenGig)

  • (optional) One Intel X710 Quad-TenGig NIC in the expansion bus

  • m.SATA 120G boot SSD

  • 2x16GB of ECC RAM

The only downside of this machine is that it has only one power supply, so datacenters which do periodic feed maintenance (as Interxion is known to do) are likely to reboot the machine from time to time. However, the machine is very well spec’d for VPP in “low” performance scenarios. A machine like this is very affordable (I bought the chassis for about USD 800,- apiece), but its CPU/Memory/PCIe construction is enough to provide forwarding at approximately 35Mpps.

Doing a lazy 1Mpps on this machine’s Xeon D1518, VPP comes in at ~660 clocks per packet with a vector length of ~3.49. This means that if I dedicate 3 cores running at 2200MHz to VPP (leaving 1C2T for the controlplane), this machine has a forwarding capacity of ~34.7Mpps (as load goes up, the vectors fill and the per-packet clock count drops well below that lazy-load figure), which fits really well with the Intel X710 NICs (which are limited to 40Mpps [ref]).

A reasonable step-up from here would be Supermicro’s SIS810 with a Xeon E-2288G (8 cores / 16 threads), which carries dual PSUs, up to 8x Intel i210 NICs and 2x Intel X710 Quad-TenGig NICs, but it’s quite a bit more expensive. I commit to doing that the day AS50869 is forwarding 10Mpps in practice :-)

Install HOWTO

First, I install the “canonical” (pun intended) operating system that VPP is most comfortable running on: Ubuntu 20.04.3. Nothing special is selected during the install, and once it’s done, I make sure that GRUB uses the IPMI serial port by adding the following to /etc/default/grub:
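Roughly speaking, the relevant lines look like this (the serial unit and speed here are what works for this Supermicro’s SOL port, so treat the block as a sketch rather than gospel), followed by a run of update-grub and a reboot:

```
# /etc/default/grub (excerpt)
GRUB_CMDLINE_LINUX_DEFAULT="console=tty0 console=ttyS1,115200n8 isolcpus=1,2,3,5,6,7"
GRUB_TERMINAL="console serial"
GRUB_SERIAL_COMMAND="serial --speed=115200 --unit=1 --word=8 --parity=no --stop=1"
```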

Note that isolcpus is a neat trick that tells the Linux task scheduler to avoid scheduling any workloads on those CPUs. Because the Xeon-D1518 has 4 cores (0,1,2,3) and 4 additional hyperthreads (4,5,6,7), this stanza effectively makes cores 1,2,3 unavailable to Linux, leaving only core 0 and its hyperthread 4 available. This means that our controlplane will have 2 CPUs available to run things like Bird, SNMP, SSH etc, while hyperthreading is essentially turned off on CPUs 1,2,3, giving those cores entirely to VPP.

In case you were wondering why I would turn off hyperthreading in this way: hyperthreads share CPU instruction and data cache. The premise of VPP is that a vector (a list) of packets will go through the same routines (like ethernet-input or ip4-lookup) all at once. In such a computational model, VPP leverages the i-cache and d-cache to have subsequent packets make use of the warmed up cache from their predecessor, without having to use the (much slower, relatively speaking) main memory.

The last thing you’d want, is for the hyperthread to come along and replace the cache contents with what-ever it’s doing (be it Linux tasks, or another VPP thread).

So: disallowing scheduling on cores 1,2,3 and their counterpart hyperthreads 5,6,7, AND constraining VPP to run only on lcores 1,2,3, will essentially maximize the CPU cache hit rate for VPP, greatly improving performance.

Network Namespace

It’s a good idea, originally proposed by TNSR (Netgate’s commercial productization of VPP), to run VPP and its controlplane in a separate Linux network namespace. A network namespace is logically another copy of the network stack, with its own routes, firewall rules, and network devices.

Creating a namespace looks as follows, on a machine running systemd, like Ubuntu or Debian:
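One way to do that, as a sketch, is with a small oneshot unit; the name netns-dataplane.service is simply what I’ll refer to it as, so adjust to taste:

```
# /etc/systemd/system/netns-dataplane.service
[Unit]
Description=Create the 'dataplane' network namespace
Before=network-pre.target
Wants=network-pre.target

[Service]
Type=oneshot
RemainAfterExit=yes
# Create the namespace at boot, remove it again on stop
ExecStart=/bin/ip netns add dataplane
ExecStop=/bin/ip netns delete dataplane

[Install]
WantedBy=multi-user.target
```

Enable it with sudo systemctl enable --now netns-dataplane.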

Now, every time we reboot the system, a new network namespace will exist with the name dataplane. That’s where you’ve seen me create interfaces in my previous posts, and that’s where our life-as-a-VPP-router will be born.

Preparing the machine

After creating the namespace, I’ll install a bunch of useful packages to further prepare the machine, and also remove a few packages that come installed out of the box:
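Something along these lines; the exact package list is a matter of taste, so consider this a sketch:

```
# Useful tools for a router
sudo apt install traceroute mtr-tiny tcpdump ethtool bridge-utils iptables \
     snmp snmpd rsync build-essential git

# Things a dedicated router does not need
sudo apt purge cloud-init snapd

# snmpd will later run in the 'dataplane' namespace instead (see below)
sudo systemctl stop snmpd
sudo systemctl disable snmpd
```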

Installing VPP

After building the code, specifically after issuing a successful make pkg-deb, a set of Debian packages will be in the build-root sub-directory. Take these and install them like so:
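Assuming the source tree lives in ~/src/vpp (an assumption, adjust the path to your checkout), that boils down to something like:

```
cd ~/src/vpp/build-root
sudo dpkg -i *.deb

# Keep VPP stopped until it has a sensible startup configuration
sudo systemctl stop vpp
```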

Next up, I make a backup of the original, and then create a reasonable startup configuration for VPP:
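As a sketch, the resulting /etc/vpp/startup.conf looks something like the following; the exact statseg and buffers sizes, and the lcpng plugin names from this series, are assumptions to be tuned for your own build and hardware:

```
unix {
  nodaemon
  log /var/log/vpp/vpp.log
  full-coredump
  cli-listen /run/vpp/cli.sock
  gid vpp
  exec /etc/vpp/bootstrap.vpp
}

api-segment { gid vpp }

cpu {
  # main/controlplane thread on CPU 0, workers on the isolated cores 1,2,3
  main-core 0
  corelist-workers 1-3
}

buffers {
  # plenty of buffers for several 10G ports
  buffers-per-numa 128000
}

statseg {
  # room for per-worker counters with a full BGP table
  size 1G
}

plugins {
  plugin default { enable }
  # the Linux Control Plane + Netlink plugins from this series
  plugin lcpng_if_plugin.so { enable }
  plugin lcpng_nl_plugin.so { enable }
}
```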

A few notes specific to my hardware configuration:

  • the cpu stanza says to run the main thread on CPU 0, and then run three workers (on CPU 1,2,3; the ones for which I disabled the Linux scheduler by means of isolcpus). So CPU 0 and its hyperthread CPU 4 are available for Linux to schedule on, while there are three full cores dedicated to forwarding. This will ensure very low latency/jitter and predictably high throughput!

  • HugePages are a memory optimization mechanism in Linux. In virtual memory management, the kernel maintains a table that maps virtual memory addresses to physical addresses. For every memory access, the kernel needs to look up the corresponding mapping. With small pages, many more pages (and therefore many more mapping entries) are needed, which decreases performance. I set these to a larger size of 2MB (the default is 4KB), reducing the mapping overhead and thereby considerably improving performance.

  • I need to ensure there’s enough Stats Segment memory available - each worker thread keeps counters of each prefix, and with a full BGP table (weighing in at 1M prefixes in Q3’21), the amount of memory needed is substantial. Similarly, I need to ensure there are sufficient Buffers available.

Finally, observe the stanza unix { exec /etc/vpp/bootstrap.vpp }: this is a way for me to tell VPP to run a bunch of CLI commands as soon as it starts. This ensures that if VPP were to crash, or the machine were to reboot (more likely :-), VPP will start up with a working interface and IP address configuration, and any other things I might want VPP to do (like bridge-domains).

A note on VPP’s binding of interfaces: by default, VPP’s dpdk driver will acquire any interface from Linux that is not in use (which means: any interface that is admin-down/unconfigured). To make sure that VPP gets all interfaces, I will remove /etc/netplan/* (or, in Debian’s case, /etc/network/interfaces). This is why Supermicro’s KVM and serial-over-lan are so valuable, as they allow me to log in and deconfigure the entire machine in order to yield all interfaces to VPP. They also allow me to reinstall and switch from DANOS to Ubuntu+VPP on a server that’s 700km away.

Anyway, I can start VPP simply like so:
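In other words, something like this (removing the netplan config is destructive, so do it over IPMI/console):

```
# Yield all NICs to VPP: no Linux network configuration remains
sudo rm -f /etc/netplan/*.yaml

sudo systemctl restart vpp
sudo vppctl show interface
```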

See all interfaces? Great. Moving on :)

Configuring VPP

Using the exec stanza described above, I set a VPP interface configuration which VPP will read and apply any time it starts or restarts, thereby making the configuration persistent across crashes and reboots. Taking as an example our first router in Lille, France [details], the contents become:
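A condensed sketch of such a bootstrap.vpp; the interface names, host-if names and addresses below are placeholders, not the router’s real ones:

```
comment { Linux Control Plane defaults }
lcp default netns dataplane
lcp lcp-sync on
lcp lcp-auto-subint on

comment { Loopback, exposed to Linux as loop0 }
create loopback interface instance 0
lcp create loop0 host-if loop0
set interface state loop0 up
set interface ip address loop0 192.0.2.1/32
set interface ip address loop0 2001:db8::1/128

comment { One LIP per physical interface, for example: }
lcp create GigabitEthernet4/0/0 host-if e0-0
set interface state GigabitEthernet4/0/0 up
set interface ip address GigabitEthernet4/0/0 192.0.2.10/31

lcp create TenGigabitEthernet6/0/2 host-if xe1-2
set interface state TenGigabitEthernet6/0/2 up
comment { ... and so on for the other Gigabit and TenGig ports }
```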

This base-line configuration will:

  • Ensure all host interfaces are created in namespace dataplane which we created earlier

  • Turn on lcp-sync, which copies forward any configuration from VPP into Linux (see VPP Part 2)

  • Turn on lcp-auto-subint, which automatically creates LIPs (Linux interface pairs) for all sub-interfaces (see VPP Part 3)

  • Create a loopback interface, give it IPv4/IPv6 addresses, and expose it to Linux

  • Create one LIP interface for four of the Gigabit and all 6x TenGigabit interfaces

  • Leave 2 interfaces (GigabitEthernet7/0/0 and GigabitEthernet8/0/0) for later

Further, sub-interfaces and bridge-groups might be configured as such:
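For example, again with placeholder interface names, VLAN IDs and addresses:

```
comment { A dot1q sub-interface; lcp-auto-subint creates xe1-2.100 in Linux }
create sub-interfaces TenGigabitEthernet6/0/2 100
set interface state TenGigabitEthernet6/0/2.100 up
set interface ip address TenGigabitEthernet6/0/2.100 192.0.2.16/31

comment { A bridge-domain with two physical ports and a BVI }
create bridge-domain 1
bvi create instance 1
lcp create bvi1 host-if bvi1
set interface state bvi1 up
set interface ip address bvi1 192.0.2.33/27
set interface ip address bvi1 2001:db8:1::1/64
set interface l2 bridge bvi1 1 bvi
set interface state GigabitEthernet7/0/0 up
set interface l2 bridge GigabitEthernet7/0/0 1
set interface state GigabitEthernet8/0/0 up
set interface l2 bridge GigabitEthernet8/0/0 1
```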

Particularly the last stanza, creating a bridge-domain, will remind Cisco operators of the same semantics on the ASR9k and IOS/XR operating system. What it does is create a bridge with two physical interfaces, and one so-called bridge virtual interface which I expose to Linux as bvi1, with an IPv4 and IPv6 address. Beautiful!

Configuring Bird

Now that VPP’s interfaces are up, which I can validate with both vppctl show int addr and sudo ip netns exec dataplane ip addr, I am ready to configure Bird and put the router in the default-free zone (ie. run BGP on it):
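A trimmed-down sketch of the relevant bird.conf pieces (Bird 2 syntax; the BGP sessions themselves are left out and the router id is a placeholder):

```
router id 192.0.2.1;

protocol device { scan time 10; }

# Connected routes are already present in Linux, so don't generate them here
protocol direct { disabled; ipv4; ipv6; }

protocol kernel kernel4 {
  ipv4 {
    import none;
    # Offer everything except connected routes, which Linux already has
    export where source != RTS_DEVICE;
  };
  learn;
}

protocol kernel kernel6 {
  ipv6 {
    import none;
    export where source != RTS_DEVICE;
  };
  learn;
}

# ... plus the usual static, OSPF and BGP protocol blocks
```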

The most important thing to note in the configuration is that Bird tends to add a route for all of the connected interfaces, while Linux has already added those. Therefore, I avoid the source RTS_DEVICE, which means “connected routes”, but otherwise offer all routes to the kernel, which in turn propagates these as Netlink messages which are consumed by VPP. A detailed discussion of Bird’s configuration semantics is in my VPP Part 5 post.

Configuring SSH

While Ubuntu (or Debian) will start an SSH daemon upon startup, they will do this in the default namespace. However, our interfaces (like loop0 or xe1-2.100 above) are configured to be present in the dataplane namespace. Therefore, I’ll add a second SSH daemon that runs specifically in the alternate namespace, like so:
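One way to do this is a second systemd unit which launches sshd inside the namespace; the unit and pidfile names here are of my own choosing:

```
# /etc/systemd/system/ssh-dataplane.service
[Unit]
Description=OpenSSH server in the 'dataplane' namespace
Requires=netns-dataplane.service
After=netns-dataplane.service

[Service]
Type=simple
# Run a second sshd inside the dataplane namespace, with its own pidfile
ExecStart=/bin/ip netns exec dataplane /usr/sbin/sshd -D -o PidFile=/run/sshd-dataplane.pid
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Then enable it with sudo systemctl enable --now ssh-dataplane.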

And with that, our loopback address, and indeed any other interface created in the dataplane namespace, will accept SSH connections. Yaay!

Configuring SNMPd

At IPng Networks, we use LibreNMS to monitor our machines and routers in production. Similar to SSH, I want the snmpd (which we disabled all the way at the top of this article) to be exposed in the dataplane namespace. However, that namespace will have interfaces like xe0-0 or loop0 or bvi1 configured, and it’s important to note that Linux will only see those packets that were punted by VPP, that is to say, those packets which were destined to any IP address configured on the control plane. Any traffic going through VPP will never be seen by Linux! So, I’ll have to be clever and count this traffic by polling VPP instead. This was the topic of my previous VPP Part 6 about the SNMP Agent. All of that code was released to Github, notably there’s a hint there for an snmpd-dataplane.service and a vpp-snmp-agent.service, including the compiled binary that reads from VPP and feeds this to SNMP.

Then, for the SNMP daemon itself, assuming net-snmp (the default for Ubuntu and Debian) which was installed in the very first step above, I use the following simple configuration file:
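As a sketch (community, location and contact are placeholders):

```
# /etc/snmp/snmpd.conf -- run by snmpd-dataplane.service in the 'dataplane' namespace

# Accept AgentX registrations from the vpp-snmp-agent
master          agentx
agentaddress    udp:161,udp6:161

rocommunity     <secret> default
rocommunity6    <secret> default

sysLocation     <location>
sysContact      noc@example.com
sysServices     72

# Expose the chassis serial, generated at boot into /var/run/snmpd.serial (see below)
extend hardware /bin/cat /var/run/snmpd.serial
```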

This config assumes that /var/run/snmpd.serial exists as a regular file rather than a /sys entry. That’s because while the sys_vendor and product_name fields are easily retrievable as a regular user from the /sys filesystem, for some reason board_serial and product_serial are only readable by root, and our SNMPd runs as user Debian-snmp. So, I’ll just generate this at boot-time in /etc/rc.local, like so:
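A sketch of that snippet; remember that /etc/rc.local must be executable for systemd’s rc-local service to run it:

```
#!/bin/sh
# /etc/rc.local: assemble a chassis description readable by the Debian-snmp user,
# because board_serial and product_serial under /sys/class/dmi/id/ are root-only.
DMI=/sys/class/dmi/id
echo "$(cat $DMI/sys_vendor) $(cat $DMI/product_name)" \
     "serial $(cat $DMI/product_serial) board $(cat $DMI/board_serial)" \
     > /var/run/snmpd.serial
chmod 644 /var/run/snmpd.serial
exit 0
```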

Results

With all of this, I’m ready to pick up the machine in LibreNMS, which looks a bit like this:

Or a specific traffic pattern looking at interfaces:

Clearly, looking at the 17d of ~18Gbit of traffic going through this particular router, with zero crashes and zero SNMPd / Agent restarts, this thing is a winner.

Thanks for reading this far :-)
