What is eBPF

Introduction

Extended Berkeley Packet Filter (eBPF) is a Linux kernel technology that enables engineers to build programs that run securely in kernel space. By providing safe access to the innermost workings of the operating system, eBPF allows developers to tackle a wide range of challenges related to networking, observability, and security.

Kernel space is the protected part of the operating system that has unrestricted access to memory and hardware. Because a bug in a kernel program can lead to crashes, data corruption, or other unexpected behavior, access to kernel space is typically restricted to the operating system and specialized processes. User applications are blocked from directly accessing kernel space to protect the system and hardware from potential harm.

Unfortunately, this tight security also prevents user applications from detecting many low-level networking and security events that are visible only at the kernel level, potentially leaving teams without tools to detect or take action on these events.

eBPF resolves this dilemma. It provides deeper access to kernel operations without increasing the risk of harm to the system, because it lets users run programs in a protected environment within kernel space. Using a “sandbox” approach, the kernel verifies that the code is safe before allowing it to execute. As a result, eBPF programs are often used for tasks such as detecting malware, debugging applications, and inspecting traffic at a finer granularity than traditional user-space applications can achieve.

What problems does eBPF solve?

To better answer this question, it helps to understand the origins of eBPF and how it developed from the original Berkeley Packet Filter (BPF).

BPF was introduced in 1993 as a way to control and filter traffic. Before BPF, packet filtering tools were limited to user space, which made these applications CPU-intensive and restricted their capabilities. BPF offered a unique alternative by operating as an in-kernel virtual machine (VM) that allowed packet filtering within normally isolated kernel space.
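
As a rough illustration of the kind of decision a packet filter makes, the sketch below accepts only Ethernet frames that carry IPv4 TCP packets. It is ordinary user-space Python rather than BPF bytecode. The names `accept_ipv4_tcp` and `make_frame` are made up for this example, and the code assumes untagged Ethernet frames (no VLAN header).

```python
import struct

ETHERTYPE_IPV4 = 0x0800
IPPROTO_TCP = 6


def accept_ipv4_tcp(frame: bytes) -> bool:
    """Return True if an Ethernet frame carries an IPv4 TCP packet."""
    if len(frame) < 24:
        return False  # too short to contain the fields read below
    # The EtherType occupies bytes 12-13 of an untagged Ethernet header.
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        return False
    # The IPv4 protocol field is byte 9 of the IP header: 14 + 9 = 23.
    return frame[23] == IPPROTO_TCP


def make_frame(ethertype: int, protocol: int) -> bytes:
    # Build a minimal fake frame: zeroed MAC addresses and a 20-byte IPv4 header.
    ip_header = bytearray(20)
    ip_header[9] = protocol
    return bytes(12) + struct.pack("!H", ethertype) + bytes(ip_header)


tcp_frame = make_frame(ETHERTYPE_IPV4, IPPROTO_TCP)
udp_frame = make_frame(ETHERTYPE_IPV4, 17)
print(accept_ipv4_tcp(tcp_frame), accept_ipv4_tcp(udp_frame))
```

Running a check like this inside the kernel means that rejected packets never have to be copied to a user-space process, which is where the efficiency gain comes from.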

Despite its innovations, the original BPF was primarily adopted as the underlying technology for a popular utility called tcpdump. However, the BPF project gained new life when it needed to be updated for modern 64-bit processors. As part of this update, important limitations to its original functionality were removed, enabling it to solve new problems outside of packet filtering that engineers were beginning to face.

Security challenges

On the security side, organizations needed more effective strategies to combat threats that were increasingly evading anti-malware software and firewalls. Many organizations had once considered security strategy mainly in terms of a strong perimeter defense around the corporate network. However, the new reality of mixed cloud environments showed that there was no longer a well-defined perimeter to defend. Alternative security mechanisms were needed that were powerful but efficient enough to run anywhere and everywhere, not just at the network perimeter.

Accelerating application development

Alongside security concerns, other challenges emerged. An increasingly competitive environment was shortening development cycles and putting pressure on engineering teams to find and fix bugs faster. To meet this challenge, more granular application tracing and faster debugging were needed. However, Linux lacked a global mechanism for end-to-end tracing of running applications.

Directly updating the Linux kernel with more powerful features could address these challenges, but doing so was (and still is) difficult in practice. Waiting for the Linux community to build and approve new kernel-level functionality can take years. Alternatively, modifying the kernel source code in one’s own Linux fork is complicated and risky because kernel modifications can destabilize a system. Writing loadable kernel modules (LKMs) is another way to provide the desired functionality, but LKMs are complex and risky to develop, and they typically require large teams. LKMs also require constant maintenance because upgrading to a new Linux kernel version can inadvertently break the module.

How eBPF solves these problems

In 2014, the capabilities of BPF were updated and expanded to address these issues.

The resulting expansion, called extended BPF or eBPF, provides programs with broad access to kernel functions and system memory in a protected way. eBPF enables you to obtain detailed information about low-level networking, security, and other system-level activities within the kernel. Moreover, it operates without requiring direct modifications to kernel code.

Compared to programs that run in user space, eBPF programs can be more efficient and more powerful because they can observe and respond to almost all operations performed by the operating system. For application tracing, eBPF programs do not require any changes to the application's code, and they use CPU cycles only when the events they are attached to actually occur.

In summary, eBPF programs allow safe and efficient access to kernel operations by:

Providing built-in hooks for programs based on system calls, kernel functions, network events, and other triggers.
Providing a mechanism for compiling and verifying code prior to running, which helps ensure security and system stability.
Offering a more straightforward method for enhancing kernel functionality than is possible through LKMs, thereby allowing even small teams to efficiently develop safe programs that run in kernel space.
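
The hook-based, event-driven model described in the list above can be illustrated with a toy user-space simulation. This is a minimal sketch in plain Python, not real eBPF: `HookRegistry`, `attach`, and `fire` are hypothetical names invented here, and the hook names are only illustrative labels. It shows that an attached program does work only when its hook fires.

```python
from collections import defaultdict


class HookRegistry:
    """Toy stand-in for kernel hook points (hypothetical, not a kernel API)."""

    def __init__(self):
        self._programs = defaultdict(list)

    def attach(self, hook, program):
        # Register a callable to run whenever the named hook fires.
        self._programs[hook].append(program)

    def fire(self, hook, event):
        # Run every program attached to this hook; other hooks stay idle.
        return [program(event) for program in self._programs.get(hook, [])]


seen = []


def log_exec(event):
    # Example "program": record which binary a process tried to execute.
    seen.append(event["filename"])
    return 0


registry = HookRegistry()
registry.attach("syscall:execve", log_exec)

registry.fire("syscall:execve", {"pid": 1234, "filename": "/bin/ls"})
registry.fire("net:packet_in", {"len": 60})  # nothing attached, nothing runs
print(seen)
```

In a real system the kernel plays the role of the registry, and the attached program is verified bytecode rather than a Python function.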

Where is eBPF used?

eBPF is particularly useful in situations where traditional tools are not efficient enough or cannot detect events with enough granularity. This is often the case in networking, security, or observability challenges. eBPF programs are event-driven, allowing for efficient processing of network traffic and the creation of detailed, yet lightweight, security and observability features. For example, eBPF has been used to develop tools that enable access to low-level system events for security or forensic purposes, perform continuous application profiling, or run tracing and profiling scripts on a Kubernetes cluster—all in an extremely resource-efficient manner.

How does eBPF work?

eBPF allows you to run programs in an isolated, in-kernel execution environment called the eBPF virtual machine (eBPF VM). To prepare an eBPF program for this environment, a developer writes the program, compiles the code into intermediary bytecode using a compiler suite (such as LLVM), identifies a system event (called a “hook”) to attach the program to, and then loads the program into the Linux kernel using one of the available eBPF libraries.

As the program is loaded into the kernel, it is automatically checked by the verification engine, and its bytecode is then compiled by a just-in-time (JIT) compiler into a machine-specific instruction set. This latter step allows the eBPF program to run with close to the efficiency of natively compiled code.
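
To give a feel for the "verify first, then run" idea, here is a deliberately tiny sketch in Python. The instruction format, the `verify` and `run` functions, and the size limit are all assumptions made up for this illustration. The real kernel verifier performs far more analysis (for example, tracking memory accesses and value ranges). The toy only checks that opcodes and registers are valid, that jumps go forward and stay in bounds, and that the program ends with an exit, which is enough to guarantee that it terminates.

```python
NUM_REGS = 11      # eBPF defines registers r0..r10
MAX_INSNS = 4096   # illustrative size limit for this toy


class VerifierError(Exception):
    pass


def verify(program):
    """Reject programs that could run forever or use invalid registers."""
    if not program or len(program) > MAX_INSNS:
        raise VerifierError("program is empty or too long")
    if program[-1][0] != "exit":
        raise VerifierError("last instruction must be exit")
    for pc, insn in enumerate(program):
        op = insn[0]
        if op == "mov":        # ("mov", dst, immediate)
            regs = [insn[1]]
        elif op == "add":      # ("add", dst, src)
            regs = [insn[1], insn[2]]
        elif op == "jeq":      # ("jeq", reg, immediate, offset)
            regs = [insn[1]]
            target = pc + 1 + insn[3]
            # Forward-only jumps guarantee termination in this toy model.
            if insn[3] < 0 or target >= len(program):
                raise VerifierError(f"bad jump at instruction {pc}")
        elif op == "exit":
            regs = []
        else:
            raise VerifierError(f"unknown opcode {op!r} at instruction {pc}")
        if any(not 0 <= r < NUM_REGS for r in regs):
            raise VerifierError(f"invalid register at instruction {pc}")


def run(program):
    """Execute a program, but only after it passes verification."""
    verify(program)
    regs = [0] * NUM_REGS
    pc = 0
    while True:
        insn = program[pc]
        pc += 1
        if insn[0] == "mov":
            regs[insn[1]] = insn[2]
        elif insn[0] == "add":
            regs[insn[1]] += regs[insn[2]]
        elif insn[0] == "jeq" and regs[insn[1]] == insn[2]:
            pc += insn[3]
        elif insn[0] == "exit":
            return regs[0]  # r0 carries the return value


good = [("mov", 0, 1), ("mov", 1, 41), ("add", 0, 1), ("exit",)]
print(run(good))  # 42

loop = [("mov", 0, 0), ("jeq", 0, 0, -2), ("exit",)]  # jumps backward
try:
    run(loop)
except VerifierError as err:
    print("rejected:", err)
```

The second program would spin forever if it were executed, so the toy verifier refuses it before a single instruction runs.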

At this stage, the eBPF code is ready to be invoked by the pre-specified hook, such as a system call or network event. After the eBPF code is triggered, it can call special functions called “helpers” that can perform a wide range of tasks, including searching and updating key-value pairs in tables (known as eBPF maps), generating random numbers, redirecting network packets, and more. For security and stability reasons, these helper functions must be predefined by the kernel, but the list of helper calls available to eBPF is quite large. As a result, developers can create projects covering a wide array of use cases without having to modify kernel source code, thus avoiding the risk of compromising the security or reliability of the kernel.
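
The relationship between a program, its helpers, and a key-value table can be sketched as follows. This is a plain-Python analogy under stated assumptions: `ToyMap`, `HELPERS`, `call_helper`, and `count_event` are hypothetical names, and the helper signatures are simplified stand-ins that only echo the names of real kernel helpers. The sketch shows a program that may call only functions from a predefined table, and that uses them to count events per process.

```python
import random


class ToyMap:
    """Stand-in for a key-value table shared between a program and user space."""

    def __init__(self):
        self._data = {}

    def lookup(self, key):
        return self._data.get(key)

    def update(self, key, value):
        self._data[key] = value


# The only functions a toy program may call. The names echo real kernel
# helpers, but these Python signatures are simplified assumptions.
HELPERS = {
    "map_lookup_elem": lambda m, key: m.lookup(key),
    "map_update_elem": lambda m, key, value: m.update(key, value),
    "get_prandom_u32": lambda: random.getrandbits(32),
}


def call_helper(name, *args):
    # Anything outside the predefined table is refused.
    if name not in HELPERS:
        raise PermissionError(f"helper {name!r} is not available")
    return HELPERS[name](*args)


def count_event(counts, pid):
    """Toy program body: count how many events each process generated."""
    current = call_helper("map_lookup_elem", counts, pid) or 0
    call_helper("map_update_elem", counts, pid, current + 1)
    return 0


counts = ToyMap()
for pid in (100, 200, 100):
    count_event(counts, pid)

print(counts.lookup(100), counts.lookup(200))
```

Because the table outlives any single invocation, a user-space tool could read the accumulated counts later, which is how many observability tools get data out of their in-kernel programs.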

In short, eBPF source code is first compiled to bytecode, which the kernel then checks with its verification engine and translates with a JIT compiler before the program runs.
