Mark Rutland 99cf983cc8 sched/preempt: Add PREEMPT_DYNAMIC using static keys
Where an architecture selects HAVE_STATIC_CALL but not
HAVE_STATIC_CALL_INLINE, each static call has an out-of-line trampoline
which will either branch to a callee or return to the caller.

On such architectures, a number of constraints can conspire to make
those trampolines more complicated and potentially less useful than we'd
like. For example:

* Hardware and software control flow integrity schemes can require the
  addition of "landing pad" instructions (e.g. `BTI` for arm64), which
  will also be present at the "real" callee.

* Limited branch ranges can require that trampolines generate or load an
  address into a register and perform an indirect branch (or at least
  have a slow path that does so). This loses some of the benefits of
  having a direct branch.

* Interaction with SW CFI schemes can be complicated and fragile, e.g.
  requiring that we can recognise idiomatic codegen and remove
  indirections, at least until clang provides more helpful mechanisms
  for dealing with this.

For PREEMPT_DYNAMIC, we don't need the full power of static calls, as we
really only need to enable/disable specific preemption functions. We can
achieve the same effect without a number of the pain points above by
using static keys to fold early returns into the preemption functions
themselves rather than in an out-of-line trampoline, effectively
inlining the trampoline into the start of the function.

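As a rough C sketch of that pattern (illustrative, not a verbatim copy of
the patch; the key and wrapper names follow dynamic_cond_resched() below):

| /* One static key per preemption function. The key defaults to true, so
|  * the branch to the real body is initially present; disabling the key
|  * patches that branch to a NOP and the wrapper falls through to an
|  * early "return 0".
|  */
| DEFINE_STATIC_KEY_TRUE(sk_dynamic_cond_resched);
|
| int __sched dynamic_cond_resched(void)
| {
|         if (!static_branch_unlikely(&sk_dynamic_cond_resched))
|                 return 0;
|
|         return __cond_resched();
| }
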
For arm64, this results in good code generation. For example, the
dynamic_cond_resched() wrapper looks as follows when enabled. When
disabled, the first `B` is replaced with a `NOP`, resulting in an early
return.

| <dynamic_cond_resched>:
|        bti     c
|        b       <dynamic_cond_resched+0x10>     // or `nop`
|        mov     w0, #0x0
|        ret
|        mrs     x0, sp_el0
|        ldr     x0, [x0, #8]
|        cbnz    x0, <dynamic_cond_resched+0x8>
|        paciasp
|        stp     x29, x30, [sp, #-16]!
|        mov     x29, sp
|        bl      <preempt_schedule_common>
|        mov     w0, #0x1
|        ldp     x29, x30, [sp], #16
|        autiasp
|        ret

... compared to the regular form of the function:

| <__cond_resched>:
|        bti     c
|        mrs     x0, sp_el0
|        ldr     x1, [x0, #8]
|        cbz     x1, <__cond_resched+0x18>
|        mov     w0, #0x0
|        ret
|        paciasp
|        stp     x29, x30, [sp, #-16]!
|        mov     x29, sp
|        bl      <preempt_schedule_common>
|        mov     w0, #0x1
|        ldp     x29, x30, [sp], #16
|        autiasp
|        ret

Any architecture which implements static keys should be able to use this
to implement PREEMPT_DYNAMIC with similar cost to non-inlined static
calls. Since this is likely to have greater overhead than (inlined)
static calls, PREEMPT_DYNAMIC is only defaulted to enabled when
HAVE_PREEMPT_DYNAMIC_CALL is selected.

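For illustration, flipping a given preemption function on or off then only
requires the generic static key API, roughly along these lines (a sketch;
the real mode-switch code in kernel/sched/core.c covers all preempt= modes
and several functions at once):

| /* Illustrative helpers: enable/disable one function's key. Disabling
|  * patches the wrapper's branch to a NOP, so it takes its early return;
|  * enabling restores the branch to the real body.
|  */
| #define preempt_dynamic_enable(f)  static_key_enable(&sk_dynamic_##f.key)
| #define preempt_dynamic_disable(f) static_key_disable(&sk_dynamic_##f.key)
|
| static void preempt_dynamic_set_full(void)     /* illustrative */
| {
|         /* Full preemption: cond_resched() becomes a NOP'd early return. */
|         preempt_dynamic_disable(cond_resched);
| }
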
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20220214165216.2231574-6-mark.rutland@arm.com
2022-02-19 11:11:08 +01:00