Programs generation¶
Bytecode generation¶
bf_program is used to represent a BPF program. It contains the BPF bytecode, as well as the required maps and metadata.
Workflow
The program is composed of different steps:
Initialize the generic context
Preprocess the packet’s headers: gather information about the packet’s size, the protocols available, the input interface…
Execute the filtering rules: execute all the rules defined in the program sequentially. If a rule matches the packet, apply its verdict and return.
Apply the policy if no rule matched: if no rule matched the packet, return the chain’s policy (default action).
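The last two steps above can be modelled in plain C. This is only a sketch of the control flow: the `sketch_*` types and the single-field match are hypothetical and are not part of bpfilter, which emits the equivalent logic as BPF bytecode.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical verdicts and types, for illustration only. */
enum sketch_verdict { SKETCH_ACCEPT, SKETCH_DROP };

struct sketch_packet {
    uint16_t l3_proto; /* e.g. 0x86dd for IPv6 */
    uint8_t l4_proto;  /* e.g. 6 for TCP */
};

struct sketch_rule {
    uint8_t l4_proto;            /* the rule matches this L4 protocol */
    enum sketch_verdict verdict; /* verdict applied when the rule matches */
};

/* Evaluate the rules in order: the first match decides, otherwise the
 * chain's policy is returned. */
static enum sketch_verdict sketch_run(const struct sketch_rule *rules,
                                      size_t n_rules,
                                      enum sketch_verdict policy,
                                      const struct sketch_packet *pkt)
{
    for (size_t i = 0; i < n_rules; ++i) {
        if (rules[i].l4_proto == pkt->l4_proto)
            return rules[i].verdict;
    }

    return policy;
}
```

With a rule dropping TCP and an accept policy, a TCP packet is dropped by the first rule while an ICMP packet falls through to the policy.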
Memory layout
The program uses the BPF registers in the following way:
r0: return value
r1 to r5 (included): general purpose registers
r6: address of the header currently filtered on
r7: L3 protocol ID
r8: L4 protocol ID
r9: unused
r10: frame pointer
This convention is followed throughout the project and must always be respected to prevent incompatibilities. Debugging this kind of issue is not fun, so stick to it.
bf_program_context is used to represent the layout of the first stack frame in the program. It is filled during preprocessing and contains data required for packet filtering.
About preprocessing
The packets are preprocessed according to the program type (i.e. BPF flavor). Each flavor needs to perform the following steps during preprocessing:
Store the packet size and the input interface index into the runtime context
Create a BPF dynamic pointer for the packet
Preprocess the L2, L3, and L4 headers
Header preprocessing is required to discover the protocols used in the packet: processing L2 will provide us with information about L3, and so on. The logic used to process layer X is responsible for discovering layer X+1: the L2 header preprocessing logic will discover the L3 protocol ID. When processing layer X, if the protocol is not supported, the protocol ID is reset to 0 (so we won’t execute the rules for this layer) and subsequent layers are not processed (because we can’t discover their protocol).
For example, assuming IPv6 and TCP are the only supported protocols:
L2 processing: discover the packet’s ethertype (IPv6), and store it into r7.
L3 processing: the protocol ID in r7 is supported (IPv6), so a slice is created, and the L4 protocol ID is read from the IPv6 header into r8.
L4 processing: the protocol ID in r8 is supported (TCP), so a slice is created.
The program can now start executing the rules.
However, assuming only IPv6 and UDP are supported:
L2 processing: discover the packet’s ethertype (IPv6), and store it into r7.
L3 processing: the protocol ID in r7 is supported (IPv6), so a slice is created, and the L4 protocol ID is read from the IPv6 header into r8.
L4 processing: the protocol ID in r8 is not supported (TCP), so r8 is set to 0 and we stop processing this layer.
The program can now start executing the rules. No layer 4 rule will be executed as r8 won’t match any protocol ID.
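The two scenarios above can be reproduced with a small model of the discovery logic. This is a sketch under assumptions: the `sk_*` names are hypothetical, only IPv6, TCP, and UDP are modelled, and plain struct fields stand in for the r7 and r8 registers.

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_ETH_P_IPV6 0x86dd /* IPv6 ethertype */
#define SK_IPPROTO_TCP 6
#define SK_IPPROTO_UDP 17

/* Stand-ins for the registers holding the protocol IDs. */
struct sk_regs {
    uint32_t r7; /* L3 protocol ID, 0 if unsupported */
    uint32_t r8; /* L4 protocol ID, 0 if unsupported */
};

/* Protocols supported in this (hypothetical) build. */
struct sk_support {
    bool ipv6;
    bool tcp;
    bool udp;
};

static struct sk_regs sk_preprocess(uint16_t ethertype, uint8_t next_header,
                                    struct sk_support sup)
{
    struct sk_regs regs = {0, 0};

    /* L2 processing discovers the L3 protocol ID. */
    regs.r7 = ethertype;

    /* L3 processing: an unsupported protocol resets r7, and the
     * following layers are not processed. */
    if (!(regs.r7 == SK_ETH_P_IPV6 && sup.ipv6)) {
        regs.r7 = 0;
        return regs;
    }

    /* L3 is supported: read the L4 protocol ID from the L3 header. */
    regs.r8 = next_header;

    /* L4 processing: an unsupported protocol resets r8. */
    if (!((regs.r8 == SK_IPPROTO_TCP && sup.tcp) ||
          (regs.r8 == SK_IPPROTO_UDP && sup.udp)))
        regs.r8 = 0;

    return regs;
}
```

For an IPv6/TCP packet, the model leaves 6 in `r8` when TCP is supported, and 0 when only UDP is supported.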
Warning
L3 and L4 protocol IDs must be stored in registers, not on the stack, as older verifiers aren’t able to keep track of scalar values located on the stack. Storing them on the stack would make the verification fail, because the verifier couldn’t verify branches properly.
Defines
-
PIN_PATH_LEN¶
-
BF_PROG_ID_LEN¶
-
BF_PROG_CTX_OFF(field)¶
Convenience macro to get the offset of a field in bf_program_context based on the frame pointer in BPF_REG_10.
-
BF_PROG_SCR_OFF(offset)¶
Convenience macro to get an address in the scratch area of bf_program_context .
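One plausible way to build such macros is shown below. This is a sketch, not the definition used by bpfilter: `struct sk_ctx` is a simplified stand-in for bf_program_context, and `SK_CTX_OFF` and `SK_SCR_OFF` are hypothetical names. It assumes the context occupies the top of the first stack frame, so every offset from the frame pointer is negative.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for bf_program_context; the real layout differs. */
struct sk_ctx {
    void *arg;
    uint64_t pkt_size;
    uint32_t l3_offset;
    uint32_t l4_offset;
    uint8_t scratch[64];
};

/* Offset of a field relative to the frame pointer (r10). The stack grows
 * downward, so the context starts sizeof(struct sk_ctx) bytes below r10. */
#define SK_CTX_OFF(field)                                                      \
    ((int)offsetof(struct sk_ctx, field) - (int)sizeof(struct sk_ctx))

/* Offset of a byte of the scratch area, relative to the frame pointer. */
#define SK_SCR_OFF(offset) (SK_CTX_OFF(scratch) + (int)(offset))
```

A load from `r10 + SK_CTX_OFF(pkt_size)` would then read the packet size stored by the preprocessing step.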
-
EMIT(program, x)¶
-
EMIT_KFUNC_CALL(program, function)¶
-
EMIT_FIXUP(program, type, insn)¶
-
EMIT_FIXUP_CALL(program, function)¶
-
EMIT_FIXUP_JMP_NEXT_RULE(program, insn)¶
-
EMIT_LOAD_COUNTERS_FD_FIXUP(program, reg)¶
-
EMIT_LOAD_SET_FD_FIXUP(program, reg, index)¶
Load a specific set’s file descriptor.
Note
Similarly to every EMIT_* macro, it must be called from a function returning an int: if the call fails, the macro will return a negative errno value.
- Parameters:
program – Program to generate the bytecode for. Can’t be NULL.
reg – Register to store the set file descriptor in.
index – Index of the set in the program.
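The "return on failure" behaviour shared by the EMIT_* macros can be illustrated with a toy program image. This is a sketch under assumptions: `sk_insn`, `sk_program`, `sk_emit`, and `SK_EMIT` are hypothetical names, and the instruction structure is a reduced stand-in for struct bpf_insn.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Reduced stand-in for struct bpf_insn (registers omitted). */
struct sk_insn {
    uint8_t code;
    int16_t off;
    int32_t imm;
};

struct sk_program {
    struct sk_insn *img;
    size_t img_size; /* number of instructions in use */
    size_t img_cap;  /* number of instructions allocated */
};

/* Append an instruction, growing the image when it is full. */
static int sk_emit(struct sk_program *p, struct sk_insn insn)
{
    if (p->img_size == p->img_cap) {
        size_t cap = p->img_cap ? p->img_cap * 2 : 8;
        struct sk_insn *img = realloc(p->img, cap * sizeof(*img));

        if (!img)
            return -ENOMEM;

        p->img = img;
        p->img_cap = cap;
    }

    p->img[p->img_size++] = insn;

    return 0;
}

/* Return from the calling function with a negative errno value on failure,
 * which is why the caller must return an int. */
#define SK_EMIT(program, insn)                                                 \
    do {                                                                       \
        int sk_r_ = sk_emit((program), (insn));                                \
        if (sk_r_ < 0)                                                         \
            return sk_r_;                                                      \
    } while (0)

static int sk_emit_return_zero(struct sk_program *p)
{
    /* 0xb7: BPF_ALU64 | BPF_MOV | BPF_K, i.e. r0 = imm */
    SK_EMIT(p, ((struct sk_insn) {.code = 0xb7, .imm = 0}));
    /* 0x95: BPF_JMP | BPF_EXIT */
    SK_EMIT(p, ((struct sk_insn) {.code = 0x95}));

    return 0;
}
```

Hiding the error check in the macro keeps generation functions readable, at the cost of a hidden `return` that callers need to be aware of.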
-
_cleanup_bf_program_¶
Functions
-
int bf_program_new(struct bf_program **program, enum bf_hook hook, enum bf_front front, const struct bf_chain *chain)¶
-
void bf_program_free(struct bf_program **program)¶
-
int bf_program_marsh(const struct bf_program *program, struct bf_marsh **marsh)¶
-
int bf_program_unmarsh(const struct bf_marsh *marsh, struct bf_program **program, const struct bf_chain *chain)¶
-
void bf_program_dump(const struct bf_program *program, prefix_t *prefix)¶
-
int bf_program_grow_img(struct bf_program *program)¶
-
int bf_program_emit(struct bf_program *program, struct bpf_insn insn)¶
-
int bf_program_emit_kfunc_call(struct bf_program *program, const char *name)¶
-
int bf_program_emit_fixup(struct bf_program *program, enum bf_fixup_type type, struct bpf_insn insn, const union bf_fixup_attr *attr)¶
-
int bf_program_emit_fixup_call(struct bf_program *program, enum bf_fixup_func function)¶
-
int bf_program_generate(struct bf_program *program)¶
-
int bf_program_load(struct bf_program *new_prog, struct bf_program *old_prog)¶
Load and attach the program to the kernel.
Perform the loading and attaching of the program to the kernel in one step. If a similar program already exists, old_prog should be a pointer to it, and will be replaced.
- Parameters:
new_prog – New program to load and attach to the kernel. Can’t be NULL.
old_prog – Existing program to replace.
- Returns:
0 on success, or negative errno value on failure.
-
int bf_program_unload(struct bf_program *program)¶
-
int bf_program_get_counter(const struct bf_program *program, uint32_t counter_idx, struct bf_counter *counter)¶
-
int bf_program_set_counters(struct bf_program *program, const struct bf_counter *counters)¶
-
struct bf_program_context¶
- #include <bpfilter/cgen/program.h>
BPF program runtime context.
This structure is used to easily read and write data from the program’s stack. At runtime, the first stack frame of each generated program will contain data according to bf_program_context .
The generated programs use BPF dynamic pointer slices to safely access the packet’s data. bpf_dynptr_slice requires a user-provided buffer into which it might copy the requested data, depending on the BPF program type: that is the purpose of the anonymous unions, big enough to store the supported protocol headers. bpf_dynptr_slice returns the address of the requested data, which is either the address of the user buffer, or the address of the data in the packet (if the data hasn’t been copied). The program will store this address into the runtime context (i.e. l2, l3, and l4), and it will be used to access the packet’s data.
While earlier versions of this structure contained the L3 and L4 protocol IDs, they have been moved to registers instead, as old versions of the verifier can’t keep track of scalar values in the stack, leading to verification failures.
Warning
Not all BPF verifier versions are born equal: older ones might require stack accesses to be 8-byte aligned to work properly.
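The "pointer to the packet or pointer to the buffer" behaviour of the slices described above can be modelled in user space. This is a deliberately simplified sketch: `sk_pkt` and `sk_slice` are hypothetical, and the `direct` flag stands in for the kernel's own decision to copy or not, which depends on the program type and on the packet's memory layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packet model. */
struct sk_pkt {
    const uint8_t *data;
    size_t len;
    bool direct; /* true if the data can be read in place */
};

/* Return the address of buf_len bytes of packet data starting at off: either
 * a pointer into the packet, or buf once the data has been copied into it.
 * Return NULL if the requested range is out of bounds. */
static const void *sk_slice(const struct sk_pkt *pkt, size_t off, void *buf,
                            size_t buf_len)
{
    if (off > pkt->len || buf_len > pkt->len - off)
        return NULL;

    if (pkt->direct)
        return pkt->data + off;

    memcpy(buf, pkt->data + off, buf_len);

    return buf;
}
```

Whatever path is taken, the caller only ever uses the returned address, which is why the generated programs store it in the runtime context instead of reading from the buffer directly.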
Public Members
-
void *arg¶
Argument passed to the BPF program, its content depends on the BPF program type.
-
struct bpf_dynptr dynptr¶
BPF dynamic pointer representing the packet data. Dynamic pointers are used with every program type.
-
uint64_t pkt_size¶
Total size of the packet.
-
uint32_t l3_offset¶
Offset of the layer 3 protocol.
-
uint32_t l4_offset¶
Offset of the layer 4 protocol.
-
uint32_t ifindex¶
On ingress, index of the input interface. On egress, index of the output interface.
-
void *l2_hdr¶
Pointer to the L2 protocol header.
-
void *l3_hdr¶
Pointer to the L3 protocol header.
-
void *l4_hdr¶
Pointer to the L4 protocol header.
-
union bf_program_context._bf_l2 l2¶
-
union bf_program_context._bf_l3 l3¶
-
union bf_program_context._bf_l4 l4¶
-
uint8_t scratch[64]¶
-
union _bf_l3¶
- #include <bpfilter/cgen/program.h>
Layer 3 header.
-
union _bf_l4¶
- #include <bpfilter/cgen/program.h>
Layer 4 header.
-
struct bf_program¶
- #include <bpfilter/cgen/program.h>
Public Members
-
char id[(BPF_OBJ_NAME_LEN - 4)]¶
-
enum bf_hook hook¶
-
enum bf_front front¶
-
char prog_name[BPF_OBJ_NAME_LEN]¶
-
struct bf_printer *printer¶
Log messages printer.
-
struct bf_map *cmap¶
Counters map.
-
struct bf_map *pmap¶
Printer map.
-
bf_list sets¶
List of set maps.
-
bf_list links¶
Link objects attaching the program to a hook.
-
size_t num_counters¶
Number of counters in the counters map. Not all of them are used by the program, but this value is common for all the programs of a given codegen.
-
uint32_t functions_location[_BF_FIXUP_FUNC_MAX]¶
-
struct bpf_insn *img¶
-
size_t img_size¶
-
size_t img_cap¶
-
bf_list fixups¶
-
int prog_fd¶
File descriptor of the program.
-
const struct bf_flavor_ops *ops¶
Hook-specific ops to use to generate the program.
-
const struct bf_chain *chain¶
Chain the program is generated from. This is a non-owning pointer: the bf_program doesn’t have to manage its lifetime.
-
struct bf_program runtime¶
Runtime data used to interact with the program and cache information. This data is not serialized.
Switch-cases¶
bf_swich is used to generate switch-case logic in BPF bytecode. The workflow is the following:
Create a new bf_swich object and initialize it. Use bf_swich_get to simplify this step. A bf_swich object contains a pointer to the generated program, and the register to perform the switch comparison against.
Call EMIT_SWICH_OPTION to define the various cases for the switch, and the associated BPF bytecode to run.
Call EMIT_SWICH_DEFAULT to define the default case of the switch (this step is optional).
Call bf_swich_generate to generate the BPF bytecode for the switch.
Once bf_swich_generate has been called, this is what the switch structure will look like in BPF bytecode:
if case 1 matches REG, jump to case 1 code
if case 2 matches REG, jump to case 2 code
else jump to default code
case 1 code
jump after the switch
case 2 code
jump after the switch
default code
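The jump offsets implied by this layout can be computed with a few lines of C. This is a sketch under assumptions, not the bpfilter implementation: `sk_swich_layout` is a hypothetical helper, and it assumes one comparison instruction per case, one unconditional jump to the default code, and one "jump after the switch" instruction at the end of each case body.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Given the number of instructions in each case body (lens) and in the
 * default body (default_len), compute the offset of every jump of the switch
 * header. Return the total number of instructions in the switch.
 */
static size_t sk_swich_layout(const size_t *lens, size_t n,
                              size_t default_len, int16_t *case_off,
                              int16_t *default_off)
{
    size_t body = n + 1; /* index of the first case body */

    for (size_t i = 0; i < n; ++i) {
        /* The comparison for case i is at index i. BPF jump offsets are
         * relative to the instruction following the jump. */
        case_off[i] = (int16_t)(body - (i + 1));

        /* Skip the case body and its trailing jump. */
        body += lens[i] + 1;
    }

    /* The jump to the default code is at index n, after the comparisons. */
    *default_off = (int16_t)(body - (n + 1));

    return body + default_len;
}
```

For two cases of 2 and 3 instructions and a 1-instruction default, the comparisons jump 2 and 4 instructions forward, the default jump skips 7 instructions, and the whole switch is 11 instructions long.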
Note
I am fully aware it’s supposed to be spelled switch and not swich, but both switch and case are reserved keywords in C, so I had to come up with a solution to avoid clashes, and swich could be pronounced similarly to switch, at least to my non-native speaker ear.
Defines
-
bf_swich_get(program, reg)¶
Create, initialize, and return a new bf_swich object.
- Parameters:
program – bf_program object to create the switch in.
reg – Register to use to compare the cases values to.
- Returns:
A new bf_swich object.
-
EMIT_SWICH_OPTION(swich, imm, ...)¶
Add a case to the bf_swich.
- Parameters:
swich – Pointer to a valid bf_swich .
imm – Immediate value to compare against the switch’s register.
... – BPF instructions to execute if the case matches.
-
EMIT_SWICH_DEFAULT(swich, ...)¶
Set the default instructions to execute if no case of the switch matches the register.
Defining a default option to a bf_swich is optional. If this macro is called twice, the existing default options will be replaced by the new ones.
- Parameters:
swich – Pointer to a valid bf_swich .
... – BPF instructions to execute if no case matches.
Functions
-
int bf_swich_init(struct bf_swich *swich, struct bf_program *program, int reg)¶
Initialise a bf_swich object.
- Parameters:
swich – bf_swich object to initialize, can’t be NULL.
program – bf_program object to generate the switch-case for. Can’t be NULL.
reg – Register to compare to the cases of the switch.
- Returns:
0 on success, or negative errno value on failure.
-
void bf_swich_cleanup(struct bf_swich *swich)¶
Cleanup a bf_swich object.
Once this function returns, the swich object can be reused by calling bf_swich_init.
- Parameters:
swich – The bf_swich object to clean up.
-
int bf_swich_add_option(struct bf_swich *swich, uint32_t imm, const struct bpf_insn *insns, size_t insns_len)¶
Add an option (case) to the switch object.
- Parameters:
swich – bf_swich object to add the option to. Can’t be NULL.
imm – Immediate value to compare the switch’s register to. If the values are equal, the option’s instructions are executed.
insns – Array of BPF instructions to execute if the case matches.
insns_len – Number of instructions in insns.
- Returns:
0 on success, or negative errno value on failure.
-
int bf_swich_set_default(struct bf_swich *swich, const struct bpf_insn *insns, size_t insns_len)¶
Set the switch’s default actions if no case matches.
- Parameters:
swich – bf_swich object to set the default action for. Can’t be NULL.
insns – Array of BPF instructions to execute.
insns_len – Number of instructions in insns.
- Returns:
0 on success, or negative errno value on failure.
-
int bf_swich_generate(struct bf_swich *swich)¶
Generate the bytecode for the switch.
The BPF program doesn’t contain any of the instructions of the bf_swich until this function is called.
- Parameters:
swich – bf_swich object to generate the bytecode for. Can’t be NULL.
- Returns:
0 on success, or negative errno value on failure.
-
struct bf_swich¶
- #include <bpfilter/cgen/swich.h>
Context used to define a switch-case structure in BPF bytecode.
Public Members
-
struct bf_program *program¶
Program to generate the switch-case in.
-
int reg¶
Register to compare to the various cases of the switch.
-
bf_list options¶
List of options (cases) for the switch.
-
struct bf_swich_option *default_opt¶
Default option, if no case matches the switch’s register.
Error handling¶
bf_jmpctx is a helper structure to manage jump instructions in the program. A bf_jmpctx will insert a new jump instruction in the BPF program and update its jump offset when the bf_jmpctx is deleted.
Example:
// Within a function body
{
    _cleanup_bf_jmpctx_ struct bf_jmpctx ctx =
        bf_jmpctx_get(program, BPF_JMP_IMM(BPF_JEQ, BPF_REG_2, 0, 0));
    EMIT(program,
         BPF_MOV64_IMM(BPF_REG_0, program->runtime.ops->get_verdict(
                                      BF_VERDICT_ACCEPT)));
    EMIT(program, BPF_EXIT_INSN());
}
ctx is a variable local to the scope, marked with _cleanup_bf_jmpctx_. The second argument to bf_jmpctx_get is the jump instruction to emit, with the correct condition. When the scope is exited, the jump instruction is automatically updated to point to the first instruction outside of the scope.
Hence, all the instructions emitted within the scope will be executed if the condition is not met. If the condition is met, then the program execution will skip the instructions defined in the scope and continue.
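The patching mechanism can be reproduced in a self-contained toy. This is a sketch, not the bpfilter code: every `sk_*` name is hypothetical, the instruction structure is a reduced stand-in for struct bpf_insn, and the only thing borrowed from the text above is the idea of fixing the jump offset from a cleanup handler (the `cleanup` variable attribute supported by GCC and Clang).

```c
#include <stddef.h>
#include <stdint.h>

#define SK_MAX_INSNS 32

/* Reduced stand-in for struct bpf_insn (registers omitted). */
struct sk_insn {
    uint8_t code;
    int16_t off;
    int32_t imm;
};

struct sk_prog {
    struct sk_insn img[SK_MAX_INSNS];
    size_t img_size;
};

struct sk_jmpctx {
    struct sk_prog *program;
    size_t insn_idx; /* index of the jump instruction to patch */
};

static void sk_emit(struct sk_prog *p, struct sk_insn insn)
{
    if (p->img_size < SK_MAX_INSNS)
        p->img[p->img_size++] = insn;
}

/* Emit the jump with a placeholder offset and remember where it is. */
static struct sk_jmpctx sk_jmpctx_get(struct sk_prog *p, struct sk_insn jmp)
{
    struct sk_jmpctx ctx = {p, p->img_size};

    sk_emit(p, jmp);

    return ctx;
}

/* Called when the context goes out of scope: make the jump land on the next
 * instruction to be emitted. BPF jump offsets are relative to the
 * instruction following the jump. */
static void sk_jmpctx_cleanup(struct sk_jmpctx *ctx)
{
    struct sk_prog *p = ctx->program;

    if (ctx->insn_idx < p->img_size)
        p->img[ctx->insn_idx].off = (int16_t)(p->img_size - ctx->insn_idx - 1);
}

#define sk_cleanup_jmpctx __attribute__((cleanup(sk_jmpctx_cleanup)))

/* Emit a guarded block and return the patched offset of its jump. */
static int16_t sk_emit_guarded(struct sk_prog *p)
{
    size_t idx = p->img_size;

    {
        /* 0x15: BPF_JMP | BPF_JEQ | BPF_K */
        sk_cleanup_jmpctx struct sk_jmpctx ctx =
            sk_jmpctx_get(p, (struct sk_insn) {.code = 0x15});
        (void)ctx;

        sk_emit(p, (struct sk_insn) {.code = 0xb7, .imm = 1}); /* r0 = 1 */
        sk_emit(p, (struct sk_insn) {.code = 0x95});           /* exit */
    }

    return p->img[idx].off;
}
```

Two instructions are emitted inside the scope, so the jump is patched to skip exactly two instructions when its condition is met.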
Defines
Functions
-
struct bf_jmpctx¶
- #include <bpfilter/cgen/jmp.h>
A helper structure to manage jump instructions in the program.
Public Members
-
struct bf_program *program¶
The program to emit the jump instruction to.
-
size_t insn_idx¶
The index of the jump instruction in the program’s image.
Printing debug messages¶
bpfilter defines a way for generated BPF programs to print log messages through bpf_trace_printk. This requires:
A set of bf_printer_* primitives to manipulate the printer context during the bytecode generation.
An EMIT_PRINT macro to insert BPF instructions to print a given string.
A BPF map, created by bpfilter before the BPF programs are attached to the kernel.
The printer context bf_printer stores all the log messages to be printed by the generated BPF programs. Log messages are deduplicated to limit memory usage.
During the generation of the BPF programs, EMIT_PRINT is used to print a given log message from a BPF program. Under the hood, this macro will insert the log message into the global printer context, so it can be used by the BPF programs at runtime.
Before the BPF programs are attached to their hook in the kernel, bpfilter will create a BPF map to contain a single string, which is the concatenation of all the log messages defined during the generation step. The various BPF programs will be updated to request their log messages from this map directly.
Note
All the message strings are stored in a single BPF map entry in order to benefit from BPF_PSEUDO_MAP_VALUE, which allows lookup-free direct value access for maps. Hence, using a single instruction, bpfilter can load the map’s file descriptor and get the address of a message in the buffer. See https://lore.kernel.org/bpf/20190409210910.32048-2-daniel@iogearbox.net.
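The deduplication and the single concatenated buffer can be sketched as follows. This is a toy model under assumptions: `sk_printer` and `sk_printer_add_msg` are hypothetical, and the messages are stored directly in a fixed-size nul-separated buffer, whereas the real bf_printer keeps its own message objects and only assembles the buffer later.

```c
#include <stddef.h>
#include <string.h>

#define SK_BUF_LEN 256

/* Toy printer context: nul-separated messages in a single buffer. */
struct sk_printer {
    char buf[SK_BUF_LEN];
    size_t len; /* bytes used, including the nul terminators */
};

/* Return the offset of str in the buffer, adding it only if it is not
 * already there. Return -1 if the buffer is full. */
static long sk_printer_add_msg(struct sk_printer *printer, const char *str)
{
    size_t off = 0;
    size_t n = strlen(str) + 1; /* include the nul terminator */

    /* Walk the existing messages to find an identical one. */
    while (off < printer->len) {
        if (strcmp(printer->buf + off, str) == 0)
            return (long)off;

        off += strlen(printer->buf + off) + 1;
    }

    if (n > SK_BUF_LEN - printer->len)
        return -1;

    memcpy(printer->buf + printer->len, str, n);
    off = printer->len;
    printer->len += n;

    return (long)off;
}
```

The returned offset is what a generated program would need, together with the map, to locate its message inside the single map entry.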
Defines
-
_cleanup_bf_printer_¶
-
EMIT_PRINT(program, msg)¶
Emit BPF instructions to print a log message.
This macro will insert multiple instructions into the BPF program to load a given log message from a BPF map into a register, store its size, and call bpf_trace_printk() to print the message.
Warning
Like every EMIT_* macro, EMIT_PRINT() will call return if an error occurs. Hence, it must be used within a function that returns an integer.
- Parameters:
program – Program to emit the instructions to. Must not be NULL.
msg – Log message to print.
Functions
-
int bf_printer_new(struct bf_printer **printer)¶
Allocate and initialise a new printer context.
- Parameters:
printer – On success, contains a valid printer context.
- Returns:
0 on success, or negative errno value on failure.
-
int bf_printer_new_from_marsh(struct bf_printer **printer, const struct bf_marsh *marsh)¶
Allocate a new printer context and initialise it from serialised data.
- Parameters:
printer – On success, points to the newly allocated and initialised printer context. Can’t be NULL.
marsh – Serialised data to use to initialise the printer message.
- Returns:
0 on success, or negative errno value on error.
-
void bf_printer_free(struct bf_printer **printer)¶
Deinitialise and deallocate a printer context.
- Parameters:
printer – Printer context. Can’t be NULL.
-
int bf_printer_marsh(const struct bf_printer *printer, struct bf_marsh **marsh)¶
Serialise a printer context.
- Parameters:
printer – Printer context to serialise. Can’t be NULL.
marsh – On success, contains the serialised printer context. Can’t be NULL.
- Returns:
0 on success, or negative errno value on failure.
-
void bf_printer_dump(const struct bf_printer *printer, prefix_t *prefix)¶
Dump the content of the printer structure.
- Parameters:
printer – Printer object to dump. Can’t be NULL.
prefix – Prefix to use for the dump. Can be NULL.
-
size_t bf_printer_msg_offset(const struct bf_printer_msg *msg)¶
Return the offset of a specific printer message.
- Parameters:
msg – Printer message. Can’t be NULL.
- Returns:
Offset of msg in the concatenated messages buffer.
-
size_t bf_printer_msg_len(const struct bf_printer_msg *msg)¶
Return the length of a specific printer message.
- Parameters:
msg – Printer message. Can’t be NULL.
- Returns:
Length of msg, including the trailing nul termination character.
-
const struct bf_printer_msg *bf_printer_add_msg(struct bf_printer *printer, const char *str)¶
Add a new message to the printer.
- Parameters:
printer – Printer context. Can’t be NULL.
str – Message to add to the context. A copy of the buffer is made.
- Returns:
The printer message if it was successfully added to the context, NULL otherwise.
-
int bf_printer_assemble(const struct bf_printer *printer, void **str, size_t *str_len)¶
Assemble the messages defined inside the printer into a single nul-separated string.
- Parameters:
printer – Printer containing the messages to assemble. Can’t be NULL.
str – On success, contains the pointer to the result string. Can’t be NULL.
str_len – On success, contains the length of the result string, including the nul termination character. Can’t be NULL.
- Returns:
0 on success, or negative errno value on failure.