linux/ipc
Alexey Gladkov 50ec499b9a sysctl: allow change system v ipc sysctls inside ipc namespace
Patch series "Allow to change ipc/mq sysctls inside ipc namespace", v3.

Right now ipc and mq limits are accounted per ipc namespace, but only real
root can change them.  By default, these limits are set so high that they
can only be reduced.  Since only root can change the values, it is
impossible to reduce these limits in a rootless container.

We can allow limit changes within an ipc namespace because mq parameters
are limited by RLIMIT_MSGQUEUE and ipc parameters are not limited by
anything other than cgroups.


This patch (of 3):

Rootless containers are not allowed to modify kernel IPC parameters.

All default limits are set to such high values that in effect there are no
limits at all.  The limits are not inherited; they are initialized to
default values when a new ipc_namespace is created.

For a new ipc_namespace:

size_t       ipc_ns.shm_ctlmax = SHMMAX; // (ULONG_MAX - (1UL << 24))
size_t       ipc_ns.shm_ctlall = SHMALL; // (ULONG_MAX - (1UL << 24))
int          ipc_ns.shm_ctlmni = IPCMNI; // (1 << 15)
int          ipc_ns.shm_rmid_forced = 0;
unsigned int ipc_ns.msg_ctlmax = MSGMAX; // 8192
unsigned int ipc_ns.msg_ctlmni = MSGMNI; // 32000
unsigned int ipc_ns.msg_ctlmnb = MSGMNB; // 16384

shm_tot (the total number of shared pages) is also no longer global: it is
located in ipc_namespace and is not inherited from anywhere.

Under these conditions, these limits cannot be said to limit anything.
The real limiter is cgroups.

If we allow rootless containers to change these parameters, the values can
only be reduced.

Link: https://lkml.kernel.org/r/cover.1705333426.git.legion@kernel.org
Link: https://lkml.kernel.org/r/d2f4603305cbfed58a24755aa61d027314b73a45.1705333426.git.legion@kernel.org
Signed-off-by: Alexey Gladkov <legion@kernel.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Link: https://lkml.kernel.org/r/e2d84d3ec0172cfff759e6065da84ce0cc2736f8.1663756794.git.legion@kernel.org
Cc: Christian Brauner <brauner@kernel.org>
Cc: Joel Granados <joel.granados@gmail.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-02-22 15:38:52 -08:00
compat.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
ipc_sysctl.c sysctl: allow change system v ipc sysctls inside ipc namespace 2024-02-22 15:38:52 -08:00
Makefile License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mq_sysctl.c sysctl: Add a size arg to __register_sysctl_table 2023-08-15 15:26:17 -07:00
mqueue.c ipc: convert to new timestamp accessors 2023-10-18 14:08:30 +02:00
msg.c ipc/msg.c: fix percpu_counter use after free 2022-10-28 13:37:22 -07:00
msgutil.c ipc: Use generic ns_common::count 2020-08-19 14:13:52 +02:00
namespace.c ipc,namespace: batch free ipc_namespace structures 2023-01-27 19:08:00 -05:00
sem.c ipc/sem: use flexible array in 'struct sem_undo' 2023-08-18 10:18:51 -07:00
shm.c shm: Slim down dependencies 2023-12-20 19:26:31 -05:00
syscall.c y2038: remove CONFIG_64BIT_TIME 2019-11-15 14:38:27 +01:00
util.c ipc/util.c: cleanup and improve sysvipc_find_ipc() 2022-09-11 21:55:05 -07:00
util.h sched.h: move pid helpers to pid.h 2023-12-20 19:26:31 -05:00