linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-13 14:24:11 +08:00

Mathieu Desnoyers af7f588d8f sched: Introduce per-memory-map concurrency ID	2022-12-27 12:52:11 +01:00

This feature allows the scheduler to expose a per-memory map concurrency ID to user-space. This concurrency ID is within the possible cpus range, and is temporarily (and uniquely) assigned while threads are actively running within a memory map. If a memory map has fewer threads than cores, or is limited to run on few cores concurrently through sched affinity or cgroup cpusets, the concurrency IDs will be values close to 0, thus allowing efficient use of user-space memory for per-cpu data structures.

This feature is meant to be exposed by a new rseq thread area field.

The primary purpose of this feature is to do the heavy-lifting needed by memory allocators to allow them to use per-cpu data structures efficiently in the following situations:

- Single-threaded applications,
- Multi-threaded applications on large systems (many cores) with limited cpu affinity mask,
- Multi-threaded applications on large systems (many cores) with restricted cgroup cpuset per container.

One of the key concerns from scheduler maintainers is the overhead associated with additional spin locks or atomic operations in the scheduler fast-path. This is why the following optimization is implemented.

On context switch between threads belonging to the same memory map, transfer the mm_cid from prev to next without any atomic ops. This takes care of use-cases involving frequent context switch between threads belonging to the same memory map.

Additional optimizations can be done if the spin locks added when context switching between threads belonging to different memory maps end up being a performance bottleneck. Those are left out of this patch though. A performance impact would have to be clearly demonstrated to justify the added complexity.

The credit goes to Paul Turner (Google) for the original virtual cpu id idea. This feature is implemented based on the discussions with Paul Turner and Peter Oskolkov (Google), but I took the liberty to implement scheduler fast-path optimizations and my own NUMA-awareness scheme. Rumor has it that Google has been running a rseq vcpu_id extension internally in production for a year. The tcmalloc source code indeed has comments hinting at a vcpu_id prototype extension to the rseq system call [1].

The following benchmarks do not show any significant overhead added to the scheduler context switch by this feature:

* perf bench sched messaging (process)
  Baseline: 86.5±0.3 ms
  With mm_cid: 86.7±2.6 ms

* perf bench sched messaging (threaded)
  Baseline: 84.3±3.0 ms
  With mm_cid: 84.7±2.6 ms

* hackbench (process)
  Baseline: 82.9±2.7 ms
  With mm_cid: 82.9±2.9 ms

* hackbench (threaded)
  Baseline: 85.2±2.6 ms
  With mm_cid: 84.4±2.9 ms

[1] https://github.com/google/tcmalloc/blob/master/tcmalloc/internal/linux_syscall_support.h#L26

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221122203932.231377-8-mathieu.desnoyers@efficios.com
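The allocation scheme described in the commit message can be illustrated with a small user-space simulation. This is only a sketch under stated assumptions, not the kernel implementation: the names `mm_sim`, `task_sim`, `cid_get`, `cid_put`, `switch_cid` and the constant `NR_CPUS_SIM` are hypothetical, there is no locking, and the simulation ignores the NUMA-awareness scheme. It shows two ideas from the message: the lowest free ID is handed out per memory map, so IDs stay close to 0, and a switch between two threads of the same memory map hands the ID over without touching the shared mask.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS_SIM 8 /* hypothetical: number of possible CPUs in the simulation */

/* Hypothetical stand-in for a memory map: one bit per concurrency ID in use. */
struct mm_sim {
    uint64_t cid_mask;
};

/* Hypothetical stand-in for a thread: cid is -1 while it is not running. */
struct task_sim {
    struct mm_sim *mm;
    int cid;
};

/* Allocate the lowest free concurrency ID of this memory map, or -1 if none. */
static int cid_get(struct mm_sim *mm)
{
    for (int i = 0; i < NR_CPUS_SIM; i++) {
        if (!(mm->cid_mask & (1ULL << i))) {
            mm->cid_mask |= 1ULL << i;
            return i;
        }
    }
    return -1;
}

/* Return a concurrency ID to its memory map. */
static void cid_put(struct mm_sim *mm, int cid)
{
    assert(cid < NR_CPUS_SIM);
    if (cid >= 0)
        mm->cid_mask &= ~(1ULL << cid);
}

/*
 * Simulated context switch on one CPU. prev may be NULL when the CPU was idle.
 * Same memory map: transfer the ID from prev to next, mask left untouched.
 * Different memory maps: release prev's ID, then allocate one for next.
 */
static void switch_cid(struct task_sim *prev, struct task_sim *next)
{
    if (prev && next && prev->mm == next->mm && prev->cid >= 0) {
        next->cid = prev->cid;
        prev->cid = -1;
        return;
    }
    if (prev && prev->cid >= 0) {
        cid_put(prev->mm, prev->cid);
        prev->cid = -1;
    }
    if (next)
        next->cid = cid_get(next->mm);
}
```

In this sketch, a process that only ever runs one thread at a time keeps seeing ID 0 regardless of which CPU it lands on, which is what lets an allocator size its per-cpu structures by the number of concurrently running threads rather than by the number of cores.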
.gitignore	kbuild: build init/built-in.a just once	2022-09-29 04:40:15 +09:00
build-version	kbuild: build init/built-in.a just once	2022-09-29 04:40:15 +09:00
calibrate.c	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
do_mounts_initrd.c	freezer,umh: Clean up freezer/initrd interaction	2022-09-07 21:53:48 +02:00
do_mounts_rd.c	init: add an init_unlink helper	2020-07-31 08:17:52 +02:00
do_mounts.c	init: move from strlcpy with unused retval to strscpy	2022-09-11 21:55:10 -07:00
do_mounts.h	init: add an init_mknod helper	2020-07-31 08:17:54 +02:00
init_task.c	rcu-tasks: Add data structures for lightweight grace periods	2022-06-20 09:22:28 -07:00
initramfs.c	initramfs: remove unnecessary (void*) conversion	2022-11-18 13:55:08 -08:00
Kconfig	sched: Introduce per-memory-map concurrency ID	2022-12-27 12:52:11 +01:00
main.c	New Feature:	2022-12-17 14:06:53 -06:00
Makefile	kbuild: generate include/generated/compile.h in top Makefile	2022-09-29 04:40:15 +09:00
noinitramfs.c	init: move usermodehelper_enable() to populate_rootfs()	2021-09-08 11:50:27 -07:00
version-timestamp.c	kbuild: build init/built-in.a just once	2022-09-29 04:40:15 +09:00
version.c	init/version.c: remove #include <generated/utsrelease.h>	2022-12-10 10:33:20 +09:00