linux/kernel/sched
Mathieu Desnoyers 99a22e8839 sched: Add missing memory barrier in switch_mm_cid
commit fe90f3967b upstream.

Many architectures' switch_mm() (e.g. arm64) do not have an smp_mb()
which the core scheduler code has depended upon since commit:

    commit 223baf9d17 ("sched: Fix performance regression introduced by mm_cid")

If switch_mm() doesn't call smp_mb(), sched_mm_cid_remote_clear() can
unset the actively used cid when it fails to observe active task after it
sets lazy_put.

There *is* a memory barrier between storing to rq->curr and _return to
userspace_ (as required by membarrier), but the rseq mm_cid has stricter
requirements: the barrier needs to be issued between store to rq->curr
and switch_mm_cid(), which happens earlier than:

  - spin_unlock(),
  - switch_to().

So it's fine when the architecture switch_mm() happens to have that
barrier already, but less so when the architecture only provides the
full barrier in switch_to() or spin_unlock().

It is a bug in the rseq switch_mm_cid() implementation. All architectures
that don't have memory barriers in switch_mm(), but rather have the full
barrier either in finish_lock_switch() or switch_to() have them too late
for the needs of switch_mm_cid().

Introduce a new smp_mb__after_switch_mm(), defined as smp_mb() in the
generic barrier.h header, and use it in switch_mm_cid() for scheduler
transitions where switch_mm() is expected to provide a memory barrier.

Architectures can override smp_mb__after_switch_mm() if their
switch_mm() implementation provides an implicit memory barrier.
Override it with a no-op on x86 which implicitly provide this memory
barrier by writing to CR3.

Fixes: 223baf9d17 ("sched: Fix performance regression introduced by mm_cid")
Reported-by: levi.yun <yeoreum.yun@arm.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> # for arm64
Acked-by: Dave Hansen <dave.hansen@linux.intel.com> # for x86
Cc: <stable@vger.kernel.org> # 6.4.x
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20240415152114.59122-2-mathieu.desnoyers@efficios.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-27 17:13:01 +02:00
..
autogroup.c sched/all: Change all BUG_ON() instances in the scheduler to WARN_ON_ONCE() 2022-08-12 11:25:10 +02:00
autogroup.h sched/headers: Add header guard to kernel/sched/stats.h and kernel/sched/autogroup.h 2022-02-23 08:22:00 +01:00
build_policy.c sched: Fix missing prototype warnings 2022-05-01 10:03:43 +02:00
build_utility.c sched/headers: Remove duplicate header inclusions 2023-10-03 21:27:55 +02:00
clock.c Locking changes for v6.5: 2023-06-27 14:14:30 -07:00
completion.c sched: add a few helpers to wake up tasks on the current cpu 2023-07-17 16:08:08 -07:00
core_sched.c sched: Rename task_running() to task_on_cpu() 2022-09-07 21:53:47 +02:00
core.c header cleanups for 6.8 2024-01-10 16:43:55 -08:00
cpuacct.c Merge branch 'sched/fast-headers' into sched/core 2022-03-15 09:05:05 +01:00
cpudeadline.c sched/topology: Consolidate and clean up access to a CPU's max compute capacity 2023-10-09 12:59:48 +02:00
cpudeadline.h
cpufreq_schedutil.c sched/fair: Fix frequency selection for non-invariant case 2024-01-16 10:41:25 +01:00
cpufreq.c sched/headers: Introduce kernel/sched/build_utility.c and build multiple .c files there 2022-02-23 10:58:33 +01:00
cpupri.c sched/rt: Fix live lock between select_fallback_rq() and RT push 2023-09-28 22:58:13 +02:00
cpupri.h sched/cpupri: Add CPUPRI_HIGHER 2020-10-29 11:00:30 +01:00
cputime.c cputime: remove cputime_to_nsecs fallback 2022-12-27 12:52:17 +01:00
deadline.c sched/deadline: Introduce deadline servers 2023-11-15 09:57:51 +01:00
debug.c sched/fair: Simplify util_est 2023-12-23 15:59:58 +01:00
fair.c sched/fair: Take the scheduling domain into account in select_idle_core() 2024-03-26 18:16:30 -04:00
features.h sched/fair: Remove SCHED_FEAT(UTIL_EST_FASTUP, true) 2023-12-23 15:59:56 +01:00
idle.c sched/cpuidle: Comment about timers requirements VS idle handler 2023-11-15 09:57:51 +01:00
isolation.c sched/headers: Introduce kernel/sched/build_utility.c and build multiple .c files there 2022-02-23 10:58:33 +01:00
loadavg.c sched/headers: Introduce kernel/sched/build_utility.c and build multiple .c files there 2022-02-23 10:58:33 +01:00
Makefile sched/headers: Introduce kernel/sched/build_policy.c and build multiple .c files there 2022-02-23 10:58:33 +01:00
membarrier.c sched/membarrier: reduce the ability to hammer on sys_membarrier 2024-02-20 09:38:05 -08:00
pelt.c sched: Make PELT acronym definition searchable 2023-10-13 09:56:30 +02:00
pelt.h sched/fair: Simplify util_est 2023-12-23 15:59:58 +01:00
psi.c sched/psi: Update poll => rtpoll in relevant comments 2023-10-16 13:42:49 +02:00
rt.c sched: Unify runtime accounting across classes 2023-11-15 09:57:48 +01:00
sched-pelt.h
sched.h sched: Add missing memory barrier in switch_mm_cid 2024-04-27 17:13:01 +02:00
smp.h sched, smp: Trace smp callback causing an IPI 2023-03-24 11:01:29 +01:00
stats.c sched/headers: Introduce kernel/sched/build_utility.c and build multiple .c files there 2022-02-23 10:58:33 +01:00
stats.h sched/psi: Use task->psi_flags to clear in CPU migration 2022-10-30 10:12:15 +01:00
stop_task.c sched: Unify runtime accounting across classes 2023-11-15 09:57:48 +01:00
swait.c sched: add a few helpers to wake up tasks on the current cpu 2023-07-17 16:08:08 -07:00
topology.c sched/fair: Scan cluster before scanning LLC in wake-up path 2023-10-24 10:38:43 +02:00
wait_bit.c wait_on_bit: add an acquire memory barrier 2022-08-26 09:30:25 -07:00
wait.c sched: remove wait bookmarks 2023-10-18 14:34:18 -07:00