linux/kernel
Daniel Borkmann 5dd0a6b858 bpf: Fix tail_call_reachable rejection for interpreter when jit failed
During testing of f263a81451 ("bpf: Track subprog poke descriptors correctly
and fix use-after-free") under various failure conditions, for example, when
jit_subprogs() fails and tries to clean up the program to be run under the
interpreter, we ran into the following freeze:

  [...]
  #127/8 tailcall_bpf2bpf_3:FAIL
  [...]
  [   92.041251] BUG: KASAN: slab-out-of-bounds in ___bpf_prog_run+0x1b9d/0x2e20
  [   92.042408] Read of size 8 at addr ffff88800da67f68 by task test_progs/682
  [   92.043707]
  [   92.044030] CPU: 1 PID: 682 Comm: test_progs Tainted: G   O   5.13.0-53301-ge6c08cb33a30-dirty #87
  [   92.045542] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 04/01/2014
  [   92.046785] Call Trace:
  [   92.047171]  ? __bpf_prog_run_args64+0xc0/0xc0
  [   92.047773]  ? __bpf_prog_run_args32+0x8b/0xb0
  [   92.048389]  ? __bpf_prog_run_args64+0xc0/0xc0
  [   92.049019]  ? ktime_get+0x117/0x130
  [...] // few hundred [similar] lines more
  [   92.659025]  ? ktime_get+0x117/0x130
  [   92.659845]  ? __bpf_prog_run_args64+0xc0/0xc0
  [   92.660738]  ? __bpf_prog_run_args32+0x8b/0xb0
  [   92.661528]  ? __bpf_prog_run_args64+0xc0/0xc0
  [   92.662378]  ? print_usage_bug+0x50/0x50
  [   92.663221]  ? print_usage_bug+0x50/0x50
  [   92.664077]  ? bpf_ksym_find+0x9c/0xe0
  [   92.664887]  ? ktime_get+0x117/0x130
  [   92.665624]  ? kernel_text_address+0xf5/0x100
  [   92.666529]  ? __kernel_text_address+0xe/0x30
  [   92.667725]  ? unwind_get_return_address+0x2f/0x50
  [   92.668854]  ? ___bpf_prog_run+0x15d4/0x2e20
  [   92.670185]  ? ktime_get+0x117/0x130
  [   92.671130]  ? __bpf_prog_run_args64+0xc0/0xc0
  [   92.672020]  ? __bpf_prog_run_args32+0x8b/0xb0
  [   92.672860]  ? __bpf_prog_run_args64+0xc0/0xc0
  [   92.675159]  ? ktime_get+0x117/0x130
  [   92.677074]  ? lock_is_held_type+0xd5/0x130
  [   92.678662]  ? ___bpf_prog_run+0x15d4/0x2e20
  [   92.680046]  ? ktime_get+0x117/0x130
  [   92.681285]  ? __bpf_prog_run32+0x6b/0x90
  [   92.682601]  ? __bpf_prog_run64+0x90/0x90
  [   92.683636]  ? lock_downgrade+0x370/0x370
  [   92.684647]  ? mark_held_locks+0x44/0x90
  [   92.685652]  ? ktime_get+0x117/0x130
  [   92.686752]  ? lockdep_hardirqs_on+0x79/0x100
  [   92.688004]  ? ktime_get+0x117/0x130
  [   92.688573]  ? __cant_migrate+0x2b/0x80
  [   92.689192]  ? bpf_test_run+0x2f4/0x510
  [   92.689869]  ? bpf_test_timer_continue+0x1c0/0x1c0
  [   92.690856]  ? rcu_read_lock_bh_held+0x90/0x90
  [   92.691506]  ? __kasan_slab_alloc+0x61/0x80
  [   92.692128]  ? eth_type_trans+0x128/0x240
  [   92.692737]  ? __build_skb+0x46/0x50
  [   92.693252]  ? bpf_prog_test_run_skb+0x65e/0xc50
  [   92.693954]  ? bpf_prog_test_run_raw_tp+0x2d0/0x2d0
  [   92.694639]  ? __fget_light+0xa1/0x100
  [   92.695162]  ? bpf_prog_inc+0x23/0x30
  [   92.695685]  ? __sys_bpf+0xb40/0x2c80
  [   92.696324]  ? bpf_link_get_from_fd+0x90/0x90
  [   92.697150]  ? mark_held_locks+0x24/0x90
  [   92.698007]  ? lockdep_hardirqs_on_prepare+0x124/0x220
  [   92.699045]  ? finish_task_switch+0xe6/0x370
  [   92.700072]  ? lockdep_hardirqs_on+0x79/0x100
  [   92.701233]  ? finish_task_switch+0x11d/0x370
  [   92.702264]  ? __switch_to+0x2c0/0x740
  [   92.703148]  ? mark_held_locks+0x24/0x90
  [   92.704155]  ? __x64_sys_bpf+0x45/0x50
  [   92.705146]  ? do_syscall_64+0x35/0x80
  [   92.706953]  ? entry_SYSCALL_64_after_hwframe+0x44/0xae
  [...]

Turns out that the program rejection from e411901c0b ("bpf: allow for tailcalls
in BPF subprograms for x64 JIT") is buggy since env->prog->aux->tail_call_reachable
is never true. Commit ebf7d1f508 ("bpf, x64: rework pro/epilogue and tailcall
handling in JIT") added a tracker into check_max_stack_depth() which propagates
the tail_call_reachable condition throughout the subprograms. This info is then
assigned to the subprogram's func[i]->aux->tail_call_reachable. However, in the
case of the rejection check upon JIT failure, env->prog->aux->tail_call_reachable
is used. func[0]->aux->tail_call_reachable which represents the main program's
information did not propagate this to the outer env->prog->aux, though. Add this
propagation into check_max_stack_depth() where it needs to belong so that the
check can be done reliably.

Fixes: ebf7d1f508 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT")
Fixes: e411901c0b ("bpf: allow for tailcalls in BPF subprograms for x64 JIT")
Co-developed-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/618c34e3163ad1a36b1e82377576a6081e182f25.1626123173.git.daniel@iogearbox.net
2021-07-13 08:19:13 -07:00
..
bpf bpf: Fix tail_call_reachable rejection for interpreter when jit failed 2021-07-13 08:19:13 -07:00
cgroup Merge branch 'akpm' (patches from Andrew) 2021-06-29 17:29:11 -07:00
configs drivers/char: remove /dev/kmem for good 2021-05-07 00:26:34 -07:00
debug printk changes for 5.14 2021-06-29 12:07:18 -07:00
dma Merge branch 'stable/for-linus-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb 2021-06-23 09:04:07 -07:00
entry tick/nohz: Only check for RCU deferred wakeup on user/guest entry when needed 2021-05-31 10:14:49 +02:00
events Merge branch 'akpm' (patches from Andrew) 2021-06-29 17:29:11 -07:00
gcov Kconfig: Introduce ARCH_WANTS_NO_INSTR and CC_HAS_NO_PROFILE_FN_ATTR 2021-06-22 11:07:18 -07:00
irq irqchip updates for 5.14 2021-06-28 11:55:20 +02:00
kcsan sched: Introduce task_is_running() 2021-06-18 11:43:07 +02:00
livepatch Livepatching changes for 5.13 2021-04-27 18:14:38 -07:00
locking Scheduler udpates for this cycle: 2021-06-28 12:14:19 -07:00
power PM: hibernate: remove leading spaces before tabs 2021-06-11 18:52:05 +02:00
printk printk changes for 5.14 2021-06-29 12:07:18 -07:00
rcu sched: Change task_struct::state 2021-06-18 11:43:09 +02:00
sched - Fix a small inconsistency (bug) in load tracking, caught by a 2021-06-30 15:37:49 -07:00
time -----BEGIN PGP SIGNATURE----- 2021-06-29 12:31:16 -07:00
trace once: implement DO_ONCE_LITE for non-fast-path "do once" functionality 2021-06-28 15:54:57 -07:00
.gitignore .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
acct.c kernel/acct.c: use #elif instead of #end and #elif 2020-12-15 22:46:15 -08:00
async.c kernel/async.c: remove async_unregister_domain() 2021-05-07 00:26:33 -07:00
audit_fsnotify.c audit_alloc_mark(): don't open-code ERR_CAST() 2021-02-23 10:25:27 -05:00
audit_tree.c audit: Use list_move instead of list_del/list_add 2021-06-08 22:18:35 -04:00
audit_watch.c fsnotify: generalize handle_inode_event() 2020-12-03 14:58:35 +01:00
audit.c lsm: separate security_task_getsecid() into subjective and objective variants 2021-03-22 15:23:32 -04:00
audit.h audit: remove trailing spaces and tabs 2021-06-10 20:59:05 -04:00
auditfilter.c lsm: separate security_task_getsecid() into subjective and objective variants 2021-03-22 15:23:32 -04:00
auditsc.c audit: remove trailing spaces and tabs 2021-06-10 20:59:05 -04:00
backtracetest.c
bounds.c
capability.c capability: handle idmapped mounts 2021-01-24 14:27:16 +01:00
cfi.c add support for Clang CFI 2021-04-08 16:04:20 -07:00
compat.c
configs.c
context_tracking.c
cpu_pm.c
cpu.c A fix for the CPU hotplug and cpusets interaction: 2021-06-29 12:23:02 -07:00
crash_core.c mm: replace CONFIG_FLAT_NODE_MEM_MAP with CONFIG_FLATMEM 2021-06-29 10:53:55 -07:00
crash_dump.c
cred.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2021-06-28 20:39:26 -07:00
delayacct.c delayacct: Add sysctl to enable at runtime 2021-05-12 11:43:25 +02:00
dma.c
exec_domain.c
exit.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2021-06-28 20:39:26 -07:00
extable.c
fail_function.c fault-injection: handle EI_ETYPE_TRUE 2020-12-15 22:46:19 -08:00
fork.c Merge branch 'akpm' (patches from Andrew) 2021-06-29 17:29:11 -07:00
freezer.c sched: Add get_current_state() 2021-06-18 11:43:08 +02:00
futex.c Locking changes for this cycle: 2021-06-28 11:45:29 -07:00
gen_kheaders.sh kbuild: redo fake deps at include/config/*.h 2021-04-25 05:26:10 +09:00
groups.c groups: simplify struct group_info allocation 2021-02-26 09:41:03 -08:00
hung_task.c sched: Change task_struct::state 2021-06-18 11:43:09 +02:00
iomem.c
irq_work.c irq_work: Make irq_work_queue() NMI-safe again 2021-06-10 10:00:08 +02:00
jump_label.c jump_label: Free jump_entry::key bit1 for build use 2021-05-12 14:54:55 +02:00
kallsyms.c kallsyms: strip ThinLTO hashes from static functions 2021-04-08 16:04:21 -07:00
kcmp.c Merge branch 'exec-update-lock-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2020-12-15 19:36:48 -08:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt sched/core: Disable CONFIG_SCHED_CORE by default 2021-06-28 22:43:05 +02:00
kcov.c
kexec_core.c kexec: dump kmessage before machine_kexec 2021-05-07 00:26:32 -07:00
kexec_elf.c
kexec_file.c kernel: kexec_file: fix error return code of kexec_calculate_store_digests() 2021-05-07 00:26:32 -07:00
kexec_internal.h kexec: move machine_kexec_post_load() to public interface 2021-02-22 12:33:26 +00:00
kexec.c
kheaders.c
kmod.c modules: add CONFIG_MODPROBE_PATH 2021-05-07 00:26:33 -07:00
kprobes.c kprobes: Remove kprobe::fault_handler 2021-06-01 16:00:08 +02:00
ksysfs.c
kthread.c Merge branch 'akpm' (patches from Andrew) 2021-06-29 17:29:11 -07:00
latencytop.c
Makefile kbuild: update config_data.gz only when the content of .config is changed 2021-05-02 00:43:35 +09:00
module_signature.c module: harden ELF info handling 2021-01-19 10:24:45 +01:00
module_signing.c module: harden ELF info handling 2021-01-19 10:24:45 +01:00
module-internal.h
module.c module: limit enabling module.sig_enforce 2021-06-22 11:13:19 -07:00
notifier.c
nsproxy.c fixes-v5.11 2020-12-14 16:40:27 -08:00
padata.c
panic.c panic: don't dump stack twice on warn 2020-11-14 11:26:04 -08:00
params.c Modules updates for v5.11 2020-12-17 13:01:31 -08:00
pid_namespace.c fixes-v5.11 2020-12-14 16:40:27 -08:00
pid.c Merge branch 'exec-update-lock-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2020-12-15 19:36:48 -08:00
profile.c kernel: Initialize cpumask before parsing 2021-04-10 13:35:54 +02:00
ptrace.c sched: Change task_struct::state 2021-06-18 11:43:09 +02:00
range.c
reboot.c reboot: Add hardware protection power-off 2021-06-21 13:08:36 +01:00
regset.c
relay.c relay: allow the use of const callback structs 2020-12-15 22:46:18 -08:00
resource_kunit.c resource: provide meaningful MODULE_LICENSE() in test suite 2020-11-25 18:52:35 +01:00
resource.c kernel/resource: fix return code check in __request_free_mem_region 2021-05-14 19:41:32 -07:00
rseq.c rseq: Optimise rseq_get_rseq_cs() and clear_rseq_cs() 2021-04-14 18:04:09 +02:00
scftorture.c scftorture: Add debug output for wrong-CPU warning 2021-01-04 13:53:41 -08:00
scs.c scs: switch to vmapped shadow stacks 2020-12-01 10:30:28 +00:00
seccomp.c seccomp: Support atomic "addfd + send reply" 2021-06-28 12:49:52 -07:00
signal.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2021-06-28 20:39:26 -07:00
smp.c smp: Fix smp_call_function_single_async prototype 2021-05-06 15:33:49 +02:00
smpboot.c sched/core: Initialize the idle task with preemption disabled 2021-05-12 13:01:45 +02:00
smpboot.h
softirq.c sched: Introduce task_is_running() 2021-06-18 11:43:07 +02:00
stackleak.c
stacktrace.c
static_call.c static_call: Fix unused variable warn w/o MODULE 2021-04-09 13:22:12 +02:00
stop_machine.c stop_machine: Add caller debug info to queue_stop_cpus_work 2021-03-23 16:01:58 +01:00
sys_ni.c Add Landlock, a new LSM from Mickaël Salaün <mic@linux.microsoft.com> 2021-05-01 18:50:44 -07:00
sys.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2021-06-28 20:39:26 -07:00
sysctl-test.c
sysctl.c for-5.14/block-2021-06-29 2021-06-30 12:12:56 -07:00
task_work.c kasan: record task_work_add() call stack 2021-04-30 11:20:42 -07:00
taskstats.c treewide: rename nla_strlcpy to nla_strscpy. 2020-11-16 08:08:54 -08:00
test_kprobes.c
torture.c torture: Replace torture_init_begin string with %s 2021-03-08 14:22:28 -08:00
tracepoint.c tracepoints: Code clean up 2021-02-09 12:27:29 -05:00
tsacct.c
ucount.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2021-06-28 20:39:26 -07:00
uid16.c
uid16.h
umh.c kernel/umh.c: fix some spelling mistakes 2021-05-07 00:26:34 -07:00
up.c A set of locking related fixes and updates: 2021-05-09 13:07:03 -07:00
user_namespace.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2021-06-28 20:39:26 -07:00
user-return-notifier.c
user.c Reimplement RLIMIT_MEMLOCK on top of ucounts 2021-04-30 14:14:02 -05:00
usermode_driver.c bpf: Fix umd memory leak in copy_process() 2021-03-19 22:23:19 +01:00
utsname_sysctl.c
utsname.c
watch_queue.c watch_queue: rectify kernel-doc for init_watch() 2021-01-26 11:16:34 +00:00
watchdog_hld.c
watchdog.c kernel: watchdog: modify the explanation related to watchdog thread 2021-06-29 10:53:46 -07:00
workqueue_internal.h
workqueue.c wq: handle VM suspension in stall detection 2021-05-20 12:58:30 -04:00