linux/kernel/bpf
Daniel Borkmann 54869daa6a bpf: Adjust insufficient default bpf_jit_limit
[ Upstream commit 10ec8ca8ec ]

We've seen recent AWS EKS (Kubernetes) user reports like the following:

  After upgrading EKS nodes from v20230203 to v20230217 on our 1.24 EKS
  clusters after a few days a number of the nodes have containers stuck
  in ContainerCreating state or liveness/readiness probes reporting the
  following error:

    Readiness probe errored: rpc error: code = Unknown desc = failed to
    exec in container: failed to start exec "4a11039f730203ffc003b7[...]":
    OCI runtime exec failed: exec failed: unable to start container process:
    unable to init seccomp: error loading seccomp filter into kernel:
    error loading seccomp filter: errno 524: unknown

  However, we had not been seeing this issue on previous AMIs and it only
  started to occur on v20230217 (following the upgrade from kernel 5.4 to
  5.10) with no other changes to the underlying cluster or workloads.

  We tried the suggestions from that issue (sysctl net.core.bpf_jit_limit=452534528)
  which helped to immediately allow containers to be created and probes to
  execute but after approximately a day the issue returned and the value
  returned by cat /proc/vmallocinfo | grep bpf_jit | awk '{s+=$2} END {print s}'
  was steadily increasing.

I tested bpf tree to observe bpf_jit_charge_modmem, bpf_jit_uncharge_modmem
their sizes passed in as well as bpf_jit_current under tcpdump BPF filter,
seccomp BPF and native (e)BPF programs, and the behavior all looks sane
and expected, that is nothing "leaking" from an upstream perspective.

The bpf_jit_limit knob was originally added in order to avoid a situation
where unprivileged applications loading BPF programs (e.g. seccomp BPF
policies) consuming all the module memory space via BPF JIT such that loading
of kernel modules would be prevented. The default limit was defined back in
2018 and while good enough back then, we are generally seeing far more BPF
consumers today.

Adjust the limit for the BPF JIT pool from originally 1/4 to now 1/2 of the
module memory space to better reflect today's needs and avoid more users
running into potentially hard to debug issues.

Fixes: fdadd04931 ("bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K")
Reported-by: Stephen Haynes <sh@synk.net>
Reported-by: Lefteris Alexakis <lefteris.alexakis@kpn.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://github.com/awslabs/amazon-eks-ami/issues/1179
Link: https://github.com/awslabs/amazon-eks-ami/issues/1219
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230320143725.8394-1-daniel@iogearbox.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-30 12:47:47 +02:00
..
preload libbpf: Move BPF_SEQ_PRINTF and BPF_SNPRINTF to bpf_helpers.h 2021-05-26 10:45:41 -07:00
arraymap.c bpf: Acquire map uref in .init_seq_private for array map iterator 2022-08-25 11:40:03 +02:00
bpf_inode_storage.c bpf: Fix spelling mistakes 2021-05-24 21:13:05 -07:00
bpf_iter.c bpf: Refactor BPF_PROG_RUN into a function 2021-08-17 00:45:07 +02:00
bpf_local_storage.c bpf: Do not copy spin lock field from user in bpf_selem_alloc 2022-12-08 11:28:39 +01:00
bpf_lru_list.c bpf_lru_list: Read double-checked variable once without lock 2021-02-10 15:54:26 -08:00
bpf_lru_list.h bpf: Fix a typo "inacitve" -> "inactive" 2020-04-06 21:54:10 +02:00
bpf_lsm.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2021-06-17 11:54:56 -07:00
bpf_struct_ops_types.h
bpf_struct_ops.c bpf: Handle return value of BPF_PROG_TYPE_STRUCT_OPS prog 2021-09-14 11:09:50 -07:00
bpf_task_storage.c bpf: Use this_cpu_{inc|dec|inc_return} for bpf_task_storage_busy 2022-10-26 12:34:41 +02:00
btf.c btf: fix resolving BTF_KIND_VAR after ARRAY, STRUCT, UNION, PTR 2023-03-17 08:48:55 +01:00
cgroup.c bpf, cgroup: Fix kernel BUG in purge_effective_progs 2022-09-08 12:28:01 +02:00
core.c bpf: Adjust insufficient default bpf_jit_limit 2023-03-30 12:47:47 +02:00
cpumap.c bpf: cpumap: Implement generic cpumap 2021-07-07 20:01:45 -07:00
devmap.c bpf, devmap: Exclude XDP broadcast to master device 2021-08-09 23:25:14 +02:00
disasm.c bpf: Relicense disassembler as GPL-2.0-only OR BSD-2-Clause 2021-09-02 14:49:23 +02:00
disasm.h bpf: Relicense disassembler as GPL-2.0-only OR BSD-2-Clause 2021-09-02 14:49:23 +02:00
dispatcher.c bpf: Remove bpf_image tree 2020-03-13 12:49:52 -07:00
hashtab.c bpf: Propagate error from htab_lock_bucket() to userspace 2022-10-26 12:34:40 +02:00
helpers.c bpf: Add MEM_RDONLY for helper args that are pointers to rdonly mem. 2022-05-01 17:22:26 +02:00
inode.c bpf: Fix mount source show for bpffs 2022-01-27 11:05:26 +01:00
Kconfig sock_map: Relax config dependency to CONFIG_NET 2021-07-15 18:17:49 -07:00
local_storage.c bpf: Increase supported cgroup storage value size 2021-07-27 15:59:29 -07:00
lpm_trie.c bpf: Allow RCU-protected lookups to happen from bh context 2021-06-24 19:41:15 +02:00
Makefile bpf: Enable task local storage for tracing programs 2021-02-26 11:51:47 -08:00
map_in_map.c bpf: Remember BTF of inner maps. 2021-07-15 22:31:10 +02:00
map_in_map.h bpf: Add map_meta_equal map ops 2020-08-28 15:41:30 +02:00
map_iter.c bpf: Introduce MEM_RDONLY flag 2022-05-01 17:22:24 +02:00
net_namespace.c bpf: Add support for forced LINK_DETACH command 2020-08-01 20:38:28 -07:00
offload.c bpf: restore the ebpf program ID for BPF_AUDIT_UNLOAD and PERF_BPF_EVENT_PROG_UNLOAD 2023-01-24 07:22:46 +01:00
percpu_freelist.c bpf: Initialize same number of free nodes for each pcpu_freelist 2022-11-26 09:24:38 +01:00
percpu_freelist.h bpf: Use raw_spin_trylock() for pcpu_freelist_push/pop in NMI 2020-10-06 00:04:11 +02:00
prog_iter.c bpf: Refactor bpf_iter_reg to have separate seq_info member 2020-07-25 20:16:32 -07:00
queue_stack_maps.c bpf: Eliminate rlimit-based memory accounting for queue_stack_maps maps 2020-12-02 18:32:46 -08:00
reuseport_array.c bpf: Fix spelling mistakes 2021-05-24 21:13:05 -07:00
ringbuf.c bpf: Add MEM_RDONLY for helper args that are pointers to rdonly mem. 2022-05-01 17:22:26 +02:00
stackmap.c bpf: Fix excessive memory allocation in stack_map_alloc() 2022-06-06 08:43:42 +02:00
syscall.c bpf: restore the ebpf program ID for BPF_AUDIT_UNLOAD and PERF_BPF_EVENT_PROG_UNLOAD 2023-01-24 07:22:46 +01:00
sysfs_btf.c bpf: Load and verify kernel module BTFs 2020-11-10 15:25:53 -08:00
task_iter.c bpf: Consolidate task_struct BTF_ID declarations 2021-08-25 10:37:05 -07:00
tnum.c bpf, tnums: Provably sound, faster, and more precise algorithm for tnum_mul 2021-06-01 13:34:15 +02:00
trampoline.c bpf: Fix potential array overflow in bpf_trampoline_get_progs() 2022-06-06 08:43:42 +02:00
verifier.c bpf: Skip invalid kfunc call in backtrack_insn 2023-02-09 11:26:48 +01:00