Challenges and Strategies in eBPF Uprobe Development

Teodor Podobnik

SRE @ Prewave | Master of Science in Computer Science

Published Aug 12, 2024

When it comes to eBPF development, the hardest part is moving forward from the code examples and tutorials you find on the web and designing something on your own.

In this newsletter, I want to help you up your game and become a well-rounded eBPF developer.

Today’s topic is uprobes (user space probes).

The focus will not be on understanding the code but rather on where and how to discover and attach uprobes in the application stack, which is tricky — believe me.

Let me show you.

Suppose we want to extract SSH usernames and passwords on each user login. This is potentially malicious, but it makes a good exercise for this demonstration.

The SSH daemon (sshd) typically authenticates users through Linux PAM (Pluggable Authentication Modules), a suite of libraries that handle user authentication, so this should be our starting point.

We start by looking through its source code on GitHub as we're interested in finding a hook point or, I should say, a function that either handles the authentication or has access to the authentication parameters so we can extract them.

Traversing through the code and searching for specific symbols, one can find a pam_get_authtok function. This should trigger some interest.

We look further and find that the first argument of the function is a pam_handle_t struct that holds the user (username) and the authtok (password) parameters.

At this stage, we can only speculate that this function is called on user login; confirming it from the source alone would mean spending more time in the code.

What we can verify quickly is that the function is available to our sshd, by inspecting the libpam library it uses and searching for the function symbol. We start by finding the sshd PID (process ID):

ps faux | grep sshd
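If you would rather script this lookup than read the output by eye, here is a minimal Go sketch. It assumes the usual ps faux column layout (ten fixed columns followed by the command); the function name pidsFromPs and the sample rows are made up for illustration, and only the PID 597234 comes from this walkthrough.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// pidsFromPs scans `ps faux`-style output and returns the PID (second
// column) of every row whose command contains name. Rows that belong to a
// grep invocation are skipped.
func pidsFromPs(psOutput, name string) []int {
	var pids []int
	for _, line := range strings.Split(psOutput, "\n") {
		fields := strings.Fields(line)
		// Assumed layout: 10 fixed columns (USER PID ... TIME), then the command.
		if len(fields) < 11 {
			continue
		}
		cmd := strings.Join(fields[10:], " ")
		if !strings.Contains(cmd, name) || strings.Contains(cmd, "grep") {
			continue
		}
		pid, err := strconv.Atoi(fields[1])
		if err != nil {
			continue // not a process row
		}
		pids = append(pids, pid)
	}
	return pids
}

func main() {
	// Made-up sample rows for illustration.
	sample := `USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      597234  0.0  0.1  15432  9216 ?        Ss   Aug10   0:00 sshd: /usr/sbin/sshd -D
user      601122  0.0  0.0   6480  2304 pts/0    S+   10:22   0:00  \_ grep sshd`
	fmt.Println(pidsFromPs(sample, "sshd"))
}
```

In a real tool you would feed it the output of the ps command instead of the hard-coded sample.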

Using this information, we can list the symbol table of the library using the following command:

sudo readelf -s --wide /proc/<PID>/root/usr/lib/x86_64-linux-gnu/libpam.so.0 | grep pam_get_authtok

💡 Note: Replace the <PID> placeholder with the sshd PID (the second column of the ps faux output; in our case, 597234).

We can see that the library does indeed have the pam_get_authtok function symbol. This is great so far — let's try to hook onto it.
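The same check can be done programmatically. The following is a minimal sketch using Go's standard debug/elf package; the helper names dynamicFuncNames and hasSymbol are my own, and the default path is simply the one from the readelf command without the /proc/<PID>/root prefix, so adjust it for your system.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// dynamicFuncNames returns the names of the function symbols that the ELF
// file at path defines in its dynamic symbol table.
func dynamicFuncNames(path string) ([]string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	syms, err := f.DynamicSymbols()
	if err != nil {
		return nil, err
	}
	var names []string
	for _, s := range syms {
		// Keep functions defined in this file, not ones it merely imports.
		if elf.ST_TYPE(s.Info) == elf.STT_FUNC && s.Section != elf.SHN_UNDEF {
			names = append(names, s.Name)
		}
	}
	return names, nil
}

// hasSymbol reports whether want appears in names.
func hasSymbol(names []string, want string) bool {
	for _, n := range names {
		if n == want {
			return true
		}
	}
	return false
}

func main() {
	// Assumed default location; pass another path as the first argument.
	path := "/usr/lib/x86_64-linux-gnu/libpam.so.0"
	if len(os.Args) > 1 {
		path = os.Args[1]
	}
	names, err := dynamicFuncNames(path)
	if err != nil {
		fmt.Println("reading symbols:", err)
		return
	}
	fmt.Println("pam_get_authtok present:", hasSymbol(names, "pam_get_authtok"))
}
```

Like readelf, this only tells us that the library exports the symbol, not that sshd calls it on login.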

First, as a sanity check, we will print some dummy data to confirm that the function is actually triggered.

Here's an MVP, starting with the kernel-space program:

//go:build ignore
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>

char __license[] SEC("license") = "GPL";

// Runs every time pam_get_authtok returns.
SEC("uretprobe/pam_get_authtok")
int trace_pam_get_authtok(struct pt_regs *ctx) {
    // The message ends up in the kernel trace log.
    bpf_printk("It's triggered - jeii!");
    return 0;
}

💡 Note: A uretprobe is a probe that is triggered on the exit of a function, rather than on the entry of it like a uprobe.

And the user-space program:

package main

//go:generate go run github.com/cilium/ebpf/cmd/bpf2go -target amd64 guard guard.c

import (
	"bufio"
	"bytes"
	"fmt"
	"log"
	"os/exec"
	"strings"
	"time"

	"github.com/cilium/ebpf/link"
	"github.com/cilium/ebpf/rlimit"
)

const (
	defaultBinPath = "libpam.so.0"
	defaultSymbol  = "pam_get_authtok"
)

func findLibraryPath(libname string) (string, error) {
	cmd := exec.Command("sh", "-c", fmt.Sprintf("ldconfig -p | grep %s", libname))
	// Run the command and get the output
	var out bytes.Buffer
	cmd.Stdout = &out
	err := cmd.Run()
	if err != nil {
		return "", fmt.Errorf("failed to run ldconfig: %w", err)
	}

	// Read the first line of output which should have the library path
	scanner := bufio.NewScanner(&out)
	if scanner.Scan() {
		line := scanner.Text()
		// Extract the path from the ldconfig output
		if start := strings.LastIndex(line, ">"); start != -1 {
			path := strings.TrimSpace(line[start+1:])
			return path, nil
		}
	}
	return "", fmt.Errorf("library not found")
}

func main() {
	// Allow the current process to lock memory for eBPF resources.
	if err := rlimit.RemoveMemlock(); err != nil {
		log.Fatal(err)
	}
	// Load pre-compiled programs and maps into the kernel.
	objs := guardObjects{}
	if err := loadGuardObjects(&objs, nil); err != nil {
		log.Fatalf("loading objects: %s", err)
	}
	defer objs.Close()
	pamPath, err := findLibraryPath(defaultBinPath)
	if err != nil {
		log.Fatal(err)
	}
	// Log the resolved path, not just the library name.
	log.Printf("LibPAM path: %s\n", pamPath)
	// Open an ELF binary and read its symbols.
	ex, err := link.OpenExecutable(pamPath)
	if err != nil {
		log.Fatalf("opening executable: %s", err)
	}
	// Set up uretprobes
	uretprobe_pam, err := ex.Uretprobe(defaultSymbol, objs.TracePamGetAuthtok, nil)
	if err != nil {
		log.Fatalf("creating uretprobe - %s: %s", defaultSymbol, err)
	}
	defer uretprobe_pam.Close()
	for {
		time.Sleep(1 * time.Second)
	}
}

The kernel program just prints a message; the slightly trickier part is attaching the uretprobe to the libpam binary. We first need to find the path to the library itself, and then use the link package from cilium/ebpf to attach our program to it.
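As a side note, the library lookup can also be done without going through sh and grep. Below is a rough sketch that runs ldconfig -p directly and compares library names exactly; the helper name parseLdconfig is hypothetical, and the entry format it expects ("name (flags) => path") is an assumption based on typical glibc ldconfig output.

```go
package main

import (
	"bufio"
	"fmt"
	"os/exec"
	"strings"
)

// parseLdconfig scans `ldconfig -p` output and returns the path of the first
// entry whose library name is exactly libname.
func parseLdconfig(output, libname string) (string, bool) {
	scanner := bufio.NewScanner(strings.NewReader(output))
	for scanner.Scan() {
		// Assumed entry format:
		//   libpam.so.0 (libc6,x86-64) => /lib/x86_64-linux-gnu/libpam.so.0
		name, rest, ok := strings.Cut(strings.TrimSpace(scanner.Text()), " ")
		if !ok || name != libname {
			continue
		}
		if _, path, ok := strings.Cut(rest, "=> "); ok {
			return strings.TrimSpace(path), true
		}
	}
	return "", false
}

func main() {
	out, err := exec.Command("ldconfig", "-p").Output()
	if err != nil {
		fmt.Println("running ldconfig:", err)
		return
	}
	if path, ok := parseLdconfig(string(out), "libpam.so.0"); ok {
		fmt.Println(path)
	} else {
		fmt.Println("libpam.so.0 not found in the ldconfig cache")
	}
}
```

Keeping the parsing in a separate function makes it easy to test against canned output, and passing the arguments to exec.Command directly means the library name is never interpreted by a shell.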

With this program running, we can confirm that our probe fires by printing the eBPF trace log:

sudo bpftool prog tracelog

This is good — we are on the right path. Now it's just a matter of checking which parameters we need to read from the registers to get the username and password parameters, which we already discussed before.

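In the pam_get_authtok function, we need to read the pam_handle_t struct, which we can copy and paste from the source into our code, and then use the bpf_probe_read helper function to extract the fields we need, namely authtok and user. The modified kernel code looks like this:

//go:build ignore
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h> // provides PT_REGS_PARM1

char __license[] SEC("license") = "GPL";

// The pam_handle_t struct definition, copied from the Linux-PAM source,
// goes here.

SEC("uretprobe/pam_get_authtok")
int trace_pam_get_authtok(struct pt_regs *ctx) {
    bpf_printk("It's triggered - jeii!");

    // First argument of pam_get_authtok: the PAM handle.
    // Caveat: this is a return probe, so we rely on the register still
    // holding the argument when the function exits, which is not guaranteed.
    if (!PT_REGS_PARM1(ctx)) {
         return 0;
    }

    pam_handle_t* phandle = (pam_handle_t*)PT_REGS_PARM1(ctx);
    // Process ID (upper 32 bits of the combined value); not used yet.
    u32 pid = bpf_get_current_pid_tgid() >> 32;

    // Read the pointers to the password and username strings.
    // On newer kernels, bpf_probe_read_user is the preferred helper for
    // user-space memory.
    u64 password_addr = 0;
    bpf_probe_read(&password_addr, sizeof(password_addr), &phandle->authtok);

    u64 username_addr = 0;
    bpf_probe_read(&username_addr, sizeof(username_addr), &phandle->user);

    return 0;
}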

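At this point we hold the addresses of the username and password strings; reading the strings themselves takes one more probe read from each address. With that, we technically have what we wanted.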

💡 Note: This example only covers username and password authentication, but we could technically expand this to also cover authentication using SSH public and private keys.

To wrap this up, in the final code, I’ve also added the eBPF Ring Buffer Map to forward the extracted data to user space. You can find the complete code on my GitHub.
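To give an idea of what the user-space side of that forwarding might look like, here is a small sketch that decodes one raw ring buffer sample in Go. The event layout (a 32-bit PID followed by two 64-byte, NUL-terminated strings) and the names authEvent, decodeEvent, and cString are assumptions made for illustration; they are not taken from the final code.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// authEvent mirrors a hypothetical C struct written by the eBPF program:
//
//	struct event { u32 pid; char username[64]; char password[64]; };
type authEvent struct {
	PID      uint32
	Username [64]byte
	Password [64]byte
}

// decodeEvent parses one raw sample into an authEvent, assuming
// little-endian byte order (as on x86-64).
func decodeEvent(raw []byte) (authEvent, error) {
	var ev authEvent
	err := binary.Read(bytes.NewReader(raw), binary.LittleEndian, &ev)
	return ev, err
}

// cString converts a NUL-terminated byte buffer into a Go string.
func cString(b []byte) string {
	if i := bytes.IndexByte(b, 0); i >= 0 {
		return string(b[:i])
	}
	return string(b)
}

func main() {
	// Build a fake sample by hand; the values are made up for the demo.
	raw := make([]byte, 4+64+64)
	binary.LittleEndian.PutUint32(raw[0:4], 597234)
	copy(raw[4:], "alice")
	copy(raw[68:], "s3cret")

	ev, err := decodeEvent(raw)
	if err != nil {
		fmt.Println("decoding event:", err)
		return
	}
	fmt.Printf("pid=%d user=%s password=%s\n",
		ev.PID, cString(ev.Username[:]), cString(ev.Password[:]))
}
```

With cilium/ebpf, the raw bytes would typically come from the RawSample field of a record returned by a ringbuf reader rather than from a hand-built slice.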

We have come to the end of this week's newsletter. To be honest, the hardest part of developing uprobe-based eBPF programs is finding the right hook points and understanding the library source code. It is also important to verify, using ps and readelf, that the library our application uses actually exposes the function we plan to trace. If you are looking for a hook point inside your own application or library, all of this is more straightforward.

I hope you find this resource as enlightening as I did. Stay tuned for more exciting developments and updates in the world of eBPF in next week's newsletter.

Until then, keep 🐝-ing!

Warm regards, Teodor

Jameel Kaisar

SDE @Amazon | Ex BrowserStack, Zeta, Wikipedia, BARC | Open Source Contributor

11mo

Very tricky indeed! Identifying the right function and structs requires a deep dive into the codebase. eBPF uprobes can be quite challenging, especially when some function parameters are passed via stack rather than the registers.
