docs/bpf: Add documentation for new instructions

Add documentation in instruction-set.rst for the new instruction encodings
and their corresponding operations. Also remove the question
related to 'no BPF_SDIV' from bpf_design_QA.rst, since a
BPF_SDIV insn now exists.

Cc: bpf@ietf.org
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20230728011342.3724411-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Yonghong Song 2023-07-27 18:13:42 -07:00 committed by Alexei Starovoitov
parent 0c606571ae
commit 245d4c40c0
2 changed files with 79 additions and 41 deletions

bpf_design_QA.rst

@@ -140,11 +140,6 @@ A: Because if we picked one-to-one relationship to x64 it would have made
 it more complicated to support on arm64 and other archs. Also it
 needs div-by-zero runtime check.
 
-Q: Why there is no BPF_SDIV for signed divide operation?
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-A: Because it would be rarely used. llvm errors in such case and
-prints a suggestion to use unsigned divide instead.
-
 Q: Why BPF has implicit prologue and epilogue?
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 A: Because architectures like sparc have register windows and in general

instruction-set.rst

@@ -154,24 +154,27 @@ otherwise identical operations.
 The 'code' field encodes the operation as below, where 'src' and 'dst' refer
 to the values of the source and destination registers, respectively.
 
-========  =====  ==========================================================
-code      value  description
-========  =====  ==========================================================
-BPF_ADD   0x00   dst += src
-BPF_SUB   0x10   dst -= src
-BPF_MUL   0x20   dst \*= src
-BPF_DIV   0x30   dst = (src != 0) ? (dst / src) : 0
-BPF_OR    0x40   dst \|= src
-BPF_AND   0x50   dst &= src
-BPF_LSH   0x60   dst <<= (src & mask)
-BPF_RSH   0x70   dst >>= (src & mask)
-BPF_NEG   0x80   dst = -dst
-BPF_MOD   0x90   dst = (src != 0) ? (dst % src) : dst
-BPF_XOR   0xa0   dst ^= src
-BPF_MOV   0xb0   dst = src
-BPF_ARSH  0xc0   sign extending dst >>= (src & mask)
-BPF_END   0xd0   byte swap operations (see `Byte swap instructions`_ below)
-========  =====  ==========================================================
+=========  =====  =======  ==========================================================
+code       value  offset   description
+=========  =====  =======  ==========================================================
+BPF_ADD    0x00   0        dst += src
+BPF_SUB    0x10   0        dst -= src
+BPF_MUL    0x20   0        dst \*= src
+BPF_DIV    0x30   0        dst = (src != 0) ? (dst / src) : 0
+BPF_SDIV   0x30   1        dst = (src != 0) ? (dst s/ src) : 0
+BPF_OR     0x40   0        dst \|= src
+BPF_AND    0x50   0        dst &= src
+BPF_LSH    0x60   0        dst <<= (src & mask)
+BPF_RSH    0x70   0        dst >>= (src & mask)
+BPF_NEG    0x80   0        dst = -dst
+BPF_MOD    0x90   0        dst = (src != 0) ? (dst % src) : dst
+BPF_SMOD   0x90   1        dst = (src != 0) ? (dst s% src) : dst
+BPF_XOR    0xa0   0        dst ^= src
+BPF_MOV    0xb0   0        dst = src
+BPF_MOVSX  0xb0   8/16/32  dst = (s8,s16,s32)src
+BPF_ARSH   0xc0   0        sign extending dst >>= (src & mask)
+BPF_END    0xd0   0        byte swap operations (see `Byte swap instructions`_ below)
+=========  =====  =======  ==========================================================
 
 Underflow and overflow are allowed during arithmetic operations, meaning
 the 64-bit or 32-bit value will wrap. If eBPF program execution would
@@ -198,11 +201,20 @@ where '(u32)' indicates that the upper 32 bits are zeroed.
 
   dst = dst ^ imm32
 
-Also note that the division and modulo operations are unsigned. Thus, for
-``BPF_ALU``, 'imm' is first interpreted as an unsigned 32-bit value, whereas
-for ``BPF_ALU64``, 'imm' is first sign extended to 64 bits and the result
-interpreted as an unsigned 64-bit value. There are no instructions for
-signed division or modulo.
+Note that most instructions have instruction offset of 0. But three instructions
+(BPF_SDIV, BPF_SMOD, BPF_MOVSX) have non-zero offset.
+
+The division and modulo operations support both unsigned and signed flavors.
+For unsigned operation (BPF_DIV and BPF_MOD), for ``BPF_ALU``, 'imm' is first
+interpreted as an unsigned 32-bit value, whereas for ``BPF_ALU64``, 'imm' is
+first sign extended to 64 bits and the result interpreted as an unsigned 64-bit
+value. For signed operation (BPF_SDIV and BPF_SMOD), for ``BPF_ALU``, 'imm' is
+interpreted as a signed value. For ``BPF_ALU64``, the 'imm' is sign extended
+from 32 to 64 and interpreted as a signed 64-bit value.
+
+Instruction BPF_MOVSX does move operation with sign extension.
+``BPF_ALU | MOVSX`` sign extends 8-bit and 16-bit into 32-bit and upper 32-bit are zeroed.
+``BPF_ALU64 | MOVSX`` sign extends 8-bit, 16-bit and 32-bit into 64-bit.
 
 Shift operations use a mask of 0x3F (63) for 64-bit operations and 0x1F (31)
 for 32-bit operations.
@@ -210,21 +222,23 @@ for 32-bit operations.
 Byte swap instructions
 ~~~~~~~~~~~~~~~~~~~~~~
 
-The byte swap instructions use an instruction class of ``BPF_ALU`` and a 4-bit
-'code' field of ``BPF_END``.
+The byte swap instructions use instruction classes of ``BPF_ALU`` and ``BPF_ALU64``
+and a 4-bit 'code' field of ``BPF_END``.
 
 The byte swap instructions operate on the destination register
 only and do not use a separate source register or immediate value.
 
-The 1-bit source operand field in the opcode is used to select what byte
-order the operation convert from or to:
+For ``BPF_ALU``, the 1-bit source operand field in the opcode is used to select what byte
+order the operation converts from or to. For ``BPF_ALU64``, the 1-bit source operand
+field in the opcode is not used and must be 0.
 
-=========  =====  =================================================
-source     value  description
-=========  =====  =================================================
-BPF_TO_LE  0x00   convert between host byte order and little endian
-BPF_TO_BE  0x08   convert between host byte order and big endian
-=========  =====  =================================================
+=========  =========  =====  =================================================
+class      source     value  description
+=========  =========  =====  =================================================
+BPF_ALU    BPF_TO_LE  0x00   convert between host byte order and little endian
+BPF_ALU    BPF_TO_BE  0x08   convert between host byte order and big endian
+BPF_ALU64  BPF_TO_LE  0x00   do byte swap unconditionally
+=========  =========  =====  =================================================
 
 The 'imm' field encodes the width of the swap operations. The following widths
 are supported: 16, 32 and 64.
@@ -239,6 +253,12 @@ Examples:
 
   dst = htobe64(dst)
 
+``BPF_ALU64 | BPF_TO_LE | BPF_END`` with imm = 16/32/64 means::
+
+  dst = bswap16 dst
+  dst = bswap32 dst
+  dst = bswap64 dst
+
 Jump instructions
 -----------------
 
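
A rough C sketch of the unconditional byte swap described for ``BPF_ALU64 | BPF_TO_LE | BPF_END``. The helper name ``bpf_bswap`` is hypothetical, and clearing the bits above ``width`` is an assumption, since the hunk above does not state what happens to them.

```c
#include <stdint.h>

/* Reverse the low `width` bits of dst byte by byte, regardless of host
 * byte order. width is expected to be 16, 32 or 64 (the 'imm' field).
 * Assumption: bits above `width` are cleared in the result. */
static uint64_t bpf_bswap(uint64_t dst, int width)
{
    uint64_t out = 0;

    for (int i = 0; i < width / 8; i++) {
        out = (out << 8) | (dst & 0xff);  /* move the lowest byte to the top */
        dst >>= 8;
    }
    return out;
}
```

Unlike ``htole``/``htobe`` style conversions, which are a no-op on a host that already uses the target byte order, this swap always reverses the bytes.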
@@ -249,7 +269,8 @@ The 'code' field encodes the operation as below:
 ========  =====  ===  ===========================================  =========================================
 code      value  src  description                                  notes
 ========  =====  ===  ===========================================  =========================================
-BPF_JA    0x0    0x0  PC += offset                                 BPF_JMP only
+BPF_JA    0x0    0x0  PC += offset                                 BPF_JMP class
+BPF_JA    0x0    0x0  PC += imm                                    BPF_JMP32 class
 BPF_JEQ   0x1    any  PC += offset if dst == src
 BPF_JGT   0x2    any  PC += offset if dst > src                    unsigned
 BPF_JGE   0x3    any  PC += offset if dst >= src                   unsigned
@@ -278,6 +299,16 @@ Example:
 
 where 's>=' indicates a signed '>=' comparison.
 
+``BPF_JA | BPF_K | BPF_JMP32`` (0x06) means::
+
+  gotol +imm
+
+where 'imm' means the branch offset comes from insn 'imm' field.
+
+Note there are two flavors of BPF_JA instructions. BPF_JMP class permits 16-bit jump offset while
+BPF_JMP32 permits 32-bit jump offset. A >16bit conditional jmp can be converted to a <16bit
+conditional jmp plus a 32-bit unconditional jump.
+
 Helper functions
 ~~~~~~~~~~~~~~~~
 
@@ -320,6 +351,7 @@ The mode modifier is one of:
 BPF_ABS        0x20   legacy BPF packet access (absolute)   `Legacy BPF Packet access instructions`_
 BPF_IND        0x40   legacy BPF packet access (indirect)   `Legacy BPF Packet access instructions`_
 BPF_MEM        0x60   regular load and store operations     `Regular load and store operations`_
+BPF_MEMSX      0x80   sign-extension load operations        `Sign-extension load operations`_
 BPF_ATOMIC     0xc0   atomic operations                     `Atomic operations`_
 =============  =====  ====================================  =============
 
@@ -350,9 +382,20 @@ instructions that transfer data between a register and memory.
 
 ``BPF_MEM | <size> | BPF_LDX`` means::
 
-  dst = *(size *) (src + offset)
+  dst = *(unsigned size *) (src + offset)
 
-Where size is one of: ``BPF_B``, ``BPF_H``, ``BPF_W``, or ``BPF_DW``.
+Where size is one of: ``BPF_B``, ``BPF_H``, ``BPF_W``, or ``BPF_DW`` and
+'unsigned size' is one of u8, u16, u32 and u64.
+
+The ``BPF_MEMSX`` mode modifier is used to encode sign-extension load
+instructions that transfer data between a register and memory.
+
+``BPF_MEMSX | <size> | BPF_LDX`` means::
+
+  dst = *(signed size *) (src + offset)
+
+Where size is one of: ``BPF_B``, ``BPF_H`` or ``BPF_W``, and
+'signed size' is one of s8, s16 and s32.
 
 Atomic operations
 -----------------
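
To illustrate the difference between the ``BPF_MEM`` and ``BPF_MEMSX`` loads in the hunk above, here is a small C sketch for the byte-sized case. The helper names ``ldx_b`` and ``ldsx_b`` are hypothetical, and the sketch assumes the destination is a 64-bit register, so the loaded value is zero extended or sign extended to 64 bits.

```c
#include <stdint.h>
#include <string.h>

/* BPF_MEM | BPF_B | BPF_LDX: dst = *(u8 *)(src + offset), zero extended. */
static uint64_t ldx_b(const void *src, int16_t offset)
{
    uint8_t v;

    memcpy(&v, (const char *)src + offset, sizeof(v));
    return v;
}

/* BPF_MEMSX | BPF_B | BPF_LDX: dst = *(s8 *)(src + offset), sign extended. */
static uint64_t ldsx_b(const void *src, int16_t offset)
{
    int8_t v;

    memcpy(&v, (const char *)src + offset, sizeof(v));
    return (uint64_t)(int64_t)v;
}
```

Loading the byte ``0xfe`` gives ``0xfe`` through ``ldx_b`` but ``0xfffffffffffffffe`` (that is, -2) through ``ldsx_b``.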