Programs generation¶
Bytecode generation¶
bf_program is used to represent a BPF program. It contains the BPF bytecode, as well as the required maps and metadata.
Workflow
The program is composed of different steps:
Initialize the generic context
Preprocess the packet’s headers: gather information about the packet’s size, the protocols available, the input interface…
Execute the filtering rules: execute all the rules defined in the program sequentially. If a rule matches the packet, apply its verdict and return.
Apply the policy if no rule matched: if no rule matched the packet, return the chain’s policy (default action).
Memory layout
The program will use the BPF registers to following way:
r0
: return valuer1
tor5
(included): general purpose registersr6
: address of the header currently filtered onr7
: L3 protocol IDr8
: L4 protocol IDr9
: unusedr10
: frame pointer
This convention is followed throughout the project and must be followed all the time to prevent incompatibilities. Debugging this kind of issues is not fun, so stick to it.
bf_runtime
is used to represent the layout of the first stack frame in the program. It is filled during preprocessing and contains data required for packet filtering.
About preprocessing
The packets are preprocessed according to the program type (i.e. BPF flavor). Each flavor needs to perform the following steps during preprocessing:
Store the packet size and the input interface index into the runtime context
Create a BPF dynamic pointer for the packet
Preprocess the L2, L3, and L4 headers
The header’s preprocessing is required to discover the protocols used in the packet: processing L2 will provide us with information about L3, and so on. The logic used to process layer X is responsible for discovering layer X+1: the L2 header preprocessing logic will discover the L3 protocol ID. When processing layer X, if the protocol is not supported, the protocol ID is reset to 0 (so we won’t execute the rules for this layer) and subsequent layers are not processed (because we can’t discover their protocol).
For example, assuming IPv6 and TCP are the only supported protocols:
L2 processing: discover the packet’s ethertype (IPv6), and store it into
r7
.L3 processing: the protocol ID in
r7
is supported (IPv6), so a slice is created, and the L4 protocol ID is read from the IPV6 header intor8
.L4 processing: the protocol ID in
r8
is supported (TCP), so a slice is created.The program can now start executing the rules.
However, assuming only IPv6 and UDP are supported:
L2 processing: discover the packet’s ethertype (IPv6), and store it into
r7
.L3 processing: the protocol ID in
r7
is supported (IPv6), so a slice is created, and the L4 protocol ID is read from the IPV6 header intor8
.L4 processing: the protocol ID in
r8
is no supported (TCP),r8
is set to 0 and we stop processing this layer.The program can now start executing the rules. No layer 4 rule will be executed as
r8
won’t match any protocol ID.
Warning
L3 and L4 protocol IDs must be stored in registers, no on the stack, as older verifier aren’t able to keep track of scalar values located on the stack. This means the verification will fail because the verifier can’t verify branches properly.
Defines
-
EMIT(program, x)¶
-
EMIT_KFUNC_CALL(program, function)¶
-
EMIT_FIXUP(program, type, insn)¶
-
EMIT_FIXUP_ELFSTUB(program, elfstub_id)¶
-
EMIT_FIXUP_JMP_NEXT_RULE(program, insn)¶
-
EMIT_LOAD_COUNTERS_FD_FIXUP(program, reg)¶
-
EMIT_LOAD_SET_FD_FIXUP(program, reg, index)¶
Load a specific set’s file descriptor.
Note
Similarly to every
EMIT_*
macro, it must be called from a function returning anint
, if the call fails, the macro will return a negative errno value.- Parameters:
program – Program to generate the bytecode for. Can’t be NULL.
reg – Register to store the set file descriptor in.
index – Index of the set in the program.
-
_free_bf_program_¶
Functions
-
int bf_program_new(struct bf_program **program, const struct bf_chain *chain)¶
-
void bf_program_free(struct bf_program **program)¶
-
int bf_program_marsh(const struct bf_program *program, struct bf_marsh **marsh)¶
-
int bf_program_unmarsh(const struct bf_marsh *marsh, struct bf_program **program, const struct bf_chain *chain, int dir_fd)¶
-
void bf_program_dump(const struct bf_program *program, prefix_t *prefix)¶
-
int bf_program_grow_img(struct bf_program *program)¶
-
int bf_program_emit(struct bf_program *program, struct bpf_insn insn)¶
-
int bf_program_emit_kfunc_call(struct bf_program *program, const char *name)¶
-
int bf_program_emit_fixup(struct bf_program *program, enum bf_fixup_type type, struct bpf_insn insn, const union bf_fixup_attr *attr)¶
-
int bf_program_emit_fixup_elfstub(struct bf_program *program, enum bf_elfstub_id id)¶
-
int bf_program_generate(struct bf_program *program)¶
-
int bf_program_load(struct bf_program *prog)¶
Load the BPF program into the kernel.
Prior to loading the BPF program, multiple BPF maps are created to store the counters, the debug strings, and the sets. If the program can’t be loaded, all the maps are destroyed.
Once the loading succeeds, the program and the maps are pinned to the filesystem, unless the daemon is in transient mode. If the BPF objects can’t be pinned, the program is unloaded and the maps destroyed.
- Parameters:
prog – Program to load into the kernel. Can’t be NULL and must contain instructions.
- Returns:
0 on success, or negative errno value on failure.
-
int bf_program_attach(struct bf_program *prog, struct bf_hookopts **hookopts)¶
Attach a loaded program to a hook.
The program is attached to a hook using a
bf_link
object. In persistent mode, the link will be pinned to the filesystem. If the link can’t be pinned, the program will be detached from the hook.Warning
If the program hasn’t been loaded (using
bf_program_load
),bf_program_attach
will fail.- Parameters:
prog – Program to attach. Can’t be NULL.
hookopts – Hook-specific options to attach the program to the hook. Can’t be NULL.
- Returns:
0 on success, or negative errno value on failure.
-
int bf_program_pin(struct bf_program *prog, int dir_fd)¶
Pin the BPF program.
The program and all the BPF objects it uses will be pinned into
dir_fd
. The BPF link is only pinned if the program is attached to a hook.- Parameters:
prog – Program to pin. Can’t be NULL.
dir_fd – File descriptor of the directory to pin the program and its BPF objects into.
- Returns:
0 on success, or a negative errno value on error.
-
void bf_program_unpin(struct bf_program *prog, int dir_fd)¶
Unpin the BPF program.
This function never fails. If the program is not pinned, no file will be removed.
- Parameters:
prog – Program to unpin. Can’t be NULL.
dir_fd – File descriptor of the directory containing the pinned objects.
-
void bf_program_detach(struct bf_program *prog)¶
Detach the program from the kernel.
The program is detached but not unloaded.
- Parameters:
prog – Program to detach. Can’t be NULL.
-
void bf_program_unload(struct bf_program *prog)¶
Unload the program.
- Parameters:
prog – Program to unload. Must not be attached. Can’t be NULL.
-
int bf_program_get_counter(const struct bf_program *program, uint32_t counter_idx, struct bf_counter *counter)¶
-
int bf_program_set_counters(struct bf_program *program, const struct bf_counter *counters)¶
-
size_t bf_program_chain_counter_idx(const struct bf_program *program)¶
-
size_t bf_program_error_counter_idx(const struct bf_program *program)¶
-
struct bf_program¶
- #include <bpfilter/cgen/program.h>
Public Members
-
char prog_name[16U]¶
-
enum bf_flavor flavor¶
-
struct bf_printer *printer¶
Log messages printer.
-
struct bf_map *cmap¶
Counters map.
-
struct bf_map *pmap¶
Printer map.
-
struct bf_link *link¶
Link objects attaching the program to a hook.
-
uint32_t elfstubs_location[_BF_ELFSTUB_MAX]¶
-
struct bpf_insn *img¶
-
size_t img_size¶
-
size_t img_cap¶
-
int prog_fd¶
File descriptor of the program.
-
int pindir_fd¶
File descriptor of the directory to pin the program into. Unused in transient mode.
-
const struct bf_flavor_ops *ops¶
Hook-specific ops to use to generate the program.
-
const struct bf_chain *chain¶
Chain the program is generated from. This is a non-owning pointer: the bf_program doesn’t have to manage its lifetime.
-
struct bf_program runtime¶
Runtime data used to interact with the program and cache information. This data is not serialized.
-
char prog_name[16U]¶
ELF stubs¶
ELF stubs are a mechanism to integrate clang-compiled BPF bytecode into bpfilter-generated BPF programs. Complex logic is more easily implemented in C and integrated into the final program that developed in BPF bytecode directly.
ELF stubs source code is part of bpfilter’s sources, they are compiled using clang, the ELF file is stored in a C array and accessible to the daemon at runtime.
Creating a new ELF stub
Add a new source file for the BPF program in the daemon’s codebase (in the
bpf
folder, as$NAME.bpf.c
).Declare the ELF stub in the daemon’s CMakeLists.txt (in
bf_target_add_elfstubs()
).Add a new ID for this stub in
bf_elfstub_id
.Write the BPF C code: define a single function (additional inline functions and macros are allowed).
Technicalities
At build time, ELF stubs are compiled by clang as BPF program. As such, they are bound to the same limitations as any other BPF program. xxd
is used to convert the ELF file into a C array (with size), which is included in a generated rawstubs.h
header file.
rawstubs.h
defines an array of bf_rawstub
structures containing:
const void *elf
: pointer to the ELF data.size_t len
: size of the ELF file.
Each ELF stub can be manipulated through an instance of bf_rawstub
, all the instances are stored in an array, which is as big as _BF_ELFSTUB_MAX
.
However, the ELF stubs are not accessed directly through the bf_rawstub
structure, as it’s not usable as-is. Instead, the bf_context
will extract the actual bytecode from the ELF file, and relocate the kfunc calls. During generation, bf_ctx_get_elfstub
is used to retrieve a pointer to bf_elfstub
containing the BPF instructions to be copied in the program.
Limitations
Not any BPF program can be integrated into a bpfilter program, this section lists the current set of limitations:
Maps are not supported: BPF maps can be defined, but they are not integrated by bpfilter into the program, so the final BPF program won’t be verifiable.
Function are not supported: each ELF stub source code should contain a single function to be integrated. Inline functions are allowed, as they’re not real functions in the final ELF file.
Those limitations might evolve, as new BPF features are developed and the ELF stub implementation is improved.
While map are not supported, bpf_printk()
can be used, as bpfilter is able to add the strings to its own map.
Defines
-
_free_bf_elfstub_¶
Enums
-
enum bf_elfstub_id¶
Identifiers for the ELF stubs.
Each identifier represents a valid ELF stub. If an ELF stub doesn’t have its identifier, it doesn’t exist from bpfilter’s standpoint.
Values:
-
enumerator BF_ELFSTUB_PARSE_IPV6_EH¶
Parse IPv6 extension headers.
__u8 bf_parse_ipv6(struct bf_runtime *ctx)
Parameters
ctx
: address of thebf_runtime
context of the program.
Return The L4 protocol on success, or 0 if the program fails creating a dynamic pointer slice.
-
enumerator BF_ELFSTUB_UPDATE_COUNTERS¶
Update the counters for a given rule.
__u8 bf_update_counters(struct bf_runtime *ctx, void *map, __u64 key)
Parameters
ctx
: address of thebf_runtime
context of the program.map
: address of the counters map.key
: key of the map to update.
Return 0 on success, or 1 on error.
-
enumerator _BF_ELFSTUB_MAX¶
-
enumerator BF_ELFSTUB_PARSE_IPV6_EH¶
Functions
-
int bf_elfstub_new(struct bf_elfstub **stub, enum bf_elfstub_id id)¶
Allocate and initialize a new ELF stub.
The corresponding raw ELF stub will be read, its text section will be copied into the final ELF, and the calls to kfuncs will be relocated according to the host’s kernel.
- Parameters:
stub – ELF stub to allocate and initialize. Can’t be NULL.
id – Identifier of the raw ELF stub to initialize.
- Returns:
0 on success, or negative errno value on failure.
-
void bf_elfstub_free(struct bf_elfstub **stub)¶
Deinitialise and deallocate an ELF stub.
- Parameters:
stub – ELF stub. Can’t be NULL.
-
struct bf_printk_str¶
- #include <bpfilter/cgen/elfstub.h>
-
struct bf_elfstub¶
- #include <bpfilter/cgen/elfstub.h>
Processed ELF stub to be integrated into a BPF program.
Switch-cases¶
bf_swich is used to generate a switch-case logic in BPF bytecode, the logic is the following:
Create a new bf_swich object and initialize it. Use bf_swich_get to simplify this step. A bf_swich object contains a pointer to the generated program, and the register to perform the switch comparison against.
Call EMIT_SWICH_OPTION to define the various cases for the switch, and the associated BPF bytecode to run.
Call EMIT_SWICH_DEFAULT to define the default case of the switch, this is optional.
Call bf_swich_generate to generate the BPF bytecode for the switch.
Once bf_swich_generate has been called, this is what the switch structure will look like in BPF bytecode:
if case 1 matches REG, jump to case 1 code
if case 2 matches REG, jump to case 2 code
else jump to default code
case 1 code
jump after the switch
case 2 code
jump after the switch
default code
Note
I am fully aware it’s supposed to be spelled switch
and not swich
, but both switch
and case
are reserved keywords in C, so I had to come up with a solution to avoid clashes, and swich
could be pronounced similarly to switch
, at least to my non-native speak ear.
Defines
-
bf_swich_get(program, reg)¶
Create, initialize, and return a new bf_swich object.
- Parameters:
program – bf_program object to create the switch in.
reg – Register to use to compare the cases values to.
- Returns:
A new bf_swich object.
-
EMIT_SWICH_OPTION(swich, imm, ...)¶
Add a case to the bf_swich
- Parameters:
swich – Pointer to a valid bf_swich .
imm – Immediate value to compare against the switch’s register.
... – BPF instructions to execute if the case matches.
-
EMIT_SWICH_DEFAULT(swich, ...)¶
Set the default instruction if no cases of the switch matches the register.
Defining a default option to a bf_swich is optional. If this macro is called twice, the existing default options will be replaced by the new ones.
- Parameters:
swich – Pointer to a valid bf_swich .
... – BPF instructions to execute if no case matches.
Functions
-
int bf_swich_init(struct bf_swich *swich, struct bf_program *program, int reg)¶
Initialise a bf_swich object.
- Parameters:
swich – bf_swich object to initialize, can’t be NULL.
program – bf_program object to generate the switch-case for. Can’t be NULL.
reg – Register to compare to the cases of the switch.
- Returns:
0 on success, or negative errno value on failure.
-
void bf_swich_cleanup(struct bf_swich *swich)¶
Cleanup a bf_swich object.
Once this function returns, the
swich
object can be reused by calling bf_swich_init .- Parameters:
swich – The bf_swich object to clean up.
-
int bf_swich_add_option(struct bf_swich *swich, uint32_t imm, const struct bpf_insn *insns, size_t insns_len)¶
Add an option (case) to the switch object.
- Parameters:
swich – bf_swich object to add the option to. Can’t be NULL.
imm – Immediate value to compare the switch’s register to. If the values are equal, the option’s instructions are executed.
insns – Array of BPF instructions to execute if the case matches.
insns_len – Number of instructions in
insns
.
- Returns:
0 on success, or negative errno value on failure.
-
int bf_swich_set_default(struct bf_swich *swich, const struct bpf_insn *insns, size_t insns_len)¶
Set the switch’s default actions if no case matches.
- Parameters:
swich – bf_swich object to set the default action for. Can’t be NULL.
insns – Array of BPF instructions to execute.
insns_len – Number of instructions in
insns
.
- Returns:
0 on success, or negative errno value on failure.
-
int bf_swich_generate(struct bf_swich *swich)¶
Generate the bytecode for the switch.
The BPF program doesn’t contain any of the instructions of the bf_swich until this function is called.
- Parameters:
swich – bf_swich object to generate the bytecode for. Can’t be NULL.
- Returns:
0 on success, or negative errno value on failure.
-
struct bf_swich¶
- #include <bpfilter/cgen/swich.h>
Context used to define a switch-case structure in BPF bytecode.
Error handling¶
bf_jmpctx is a helper structure to manage jump instructions in the program. A bf_jmpctx will insert a new jump instruction in the BPF program and update its jump offset when the bf_jmpctx is deleted.
Example:
// Within a function body
{
_clean_bf_jmpctx_ struct bf_jmpctx ctx =
bf_jmpctx_get(program, BPF_JMP_IMM(BPF_JEQ, BPF_REG_2, 0, 0));
EMIT(program,
BPF_MOV64_IMM(BPF_REG_0, program->runtime.ops->get_verdict(
BF_VERDICT_ACCEPT)));
EMIT(program, BPF_EXIT_INSN());
}
ctx
is a variable local to the scope, marked with _clean_bf_jmpctx_
. The second argument to bf_jmpctx_get
is the jump instruction to emit, with the correct condition. When the scope is exited, the jump instruction is automatically updated to point to the first instruction outside of the scope.
Hence, all the instructions emitted within the scope will be executed if the condition is not met. If the condition is met, then the program execution will skip the instructions defined in the scope and continue.
Defines
-
bf_jmpctx_default()¶
Functions
-
struct bf_jmpctx¶
- #include <bpfilter/cgen/jmp.h>
Public Members
-
struct bf_program *program¶
A helper structure to manage jump instructions in the program.
The program to emit the jump instruction to.
-
size_t insn_idx¶
The index of the jump instruction in the program’s image.
-
struct bf_program *program¶
Printing debug messages¶
bpfilter
defines a way for generated BPF programs to print log messages through bpf_trace_printk
. This requires:
A set of
bf_printer_*
primitives to manipulate the printer context during the bytecode generation.A
EMIT_PRINT
macro to insert BPF instructions to print a given string.A BPF map, created by
bpfilter
before the BPF programs are attached to the kernel.
The printer context bf_printer
stores all the log messages to be printed by the generated BPF programs. Log messages are deduplicated to limit memory usage.
During the BPF programs generation, EMIT_PRINT
is used to print a given log message from a BPF program. Under the hood, this macro will insert the log message into the global printer context, so it can be used by the BPF programs at runtime.
Before the BPF programs are attached to their hook in the kernel, bpfilter
will create a BPF map to contain a unique string, which is the concatenation of all the log messages defined during the generation step. The various BPF programs will be updated to request their log messages from this map directly.
Note
All the message strings are stored in a single BPF map entry in order to benefit from BPF_PSEUDO_MAP_VALUE
which allows lookup free direct value access for maps. Hence, using a unique instruction, bpfilter
can load the map’s file descriptor and get the address of a message in the buffer. See https://lore.kernel.org/bpf/20190409210910.32048-2-daniel@iogearbox.net.
Defines
-
_free_bf_printer_¶
-
EMIT_PRINT(program, msg)¶
Emit BPF instructions to print a log message.
This function will insert mulitple instruction into the BPF program to load a given log message from a BPF map into a register, store its size, and call
bpf_trace_printk()
to print the message.Warning
As every
EMIT_*
macro,EMIT_PRINT()
will callreturn
if an error occurs. Hence, it must be used within a function that returns an integer.- Parameters:
program – Program to emit the instructions to. Must not be NULL.
msg – Log message to print.
Functions
-
int bf_printer_new(struct bf_printer **printer)¶
Allocate and initialise a new printer context.
- Parameters:
printer – On success, contains a valid printer context.
- Returns:
0 on success, or negative errno value on failure.
-
int bf_printer_new_from_marsh(struct bf_printer **printer, const struct bf_marsh *marsh)¶
Allocate a new printer context and intialise it from serialised data.
- Parameters:
printer – On success, points to the newly allocated and initialised printer context. Can’t be NULL.
marsh – Serialised data to use to initialise the printer message.
- Returns:
0 on success, or negative errno value on error.
-
void bf_printer_free(struct bf_printer **printer)¶
Deinitialise and deallocate a printer context.
- Parameters:
printer – Printer context. Can’t be NULL.
-
int bf_printer_marsh(const struct bf_printer *printer, struct bf_marsh **marsh)¶
Serialise a printer context.
- Parameters:
printer – Printer context to serialise. Can’t be NULL.
marsh – On success, contains the serialised printer context. Can’t be NULL.
- Returns:
0 on success, or negative errno value on failure.
-
void bf_printer_dump(const struct bf_printer *printer, prefix_t *prefix)¶
Dump the content of the printer structure.
- Parameters:
printer – Printer object to dump. Can’t be NULL.
prefix – Prefix to use for the dump. Can be NULL.
-
size_t bf_printer_msg_offset(const struct bf_printer_msg *msg)¶
Return the offset of a specific printer message.
- Parameters:
msg – Printer message. Can’t be NULL.
- Returns:
Offset of
msg
in the concatenated messages buffer.
-
size_t bf_printer_msg_len(const struct bf_printer_msg *msg)¶
Return the length of a specific printer message.
- Parameters:
msg – Printer message. Can’t be NULL.
- Returns:
Length of
msg
, including the trailing nul termination character.
-
const struct bf_printer_msg *bf_printer_add_msg(struct bf_printer *printer, const char *str)¶
Add a new message to the printer.
- Parameters:
printer – Printer context. Can’t be NULL.
str – Message to add to the context. A copy of the buffer is made.
- Returns:
The printer message if it was successfuly added to the context, NULL otherwise.
-
int bf_printer_assemble(const struct bf_printer *printer, void **str, size_t *str_len)¶
Assemble the messages defined inside the printer into a single nul-separated string.
- Parameters:
printer – Printer containing the messages to assemble. Can’t be NULL.
str – On success, contains the pointer to the result string. Can’t be NULL.
str_len – On success, contains the length of the result string, including the nul termination character. Can’t be NULL.
- Returns:
0 on success, or negative errno value on failure.