linux/kernel
Tejun Heo 0d5936344f sched: Implement interface for cgroup unified hierarchy
There are a couple interface issues which can be addressed in cgroup2
interface.

* Stats from cpuacct being reported separately from the cpu stats.

* Use of different time units.  Writable control knobs use
  microseconds, some stat fields use nanoseconds while other cpuacct
  stat fields use centiseconds.

* Control knobs which can't be used in the root cgroup still show up
  in the root.

* Control knob names and semantics aren't consistent with other
  controllers.

This patchset implements cpu controller's interface on cgroup2 which
adheres to the controller file conventions described in
Documentation/cgroups/cgroup-v2.txt.  Overall, the following changes
are made.

* cpuacct is implictly enabled and disabled by cpu and its information
  is reported through "cpu.stat" which now uses microseconds for all
  time durations.  All time duration fields now have "_usec" appended
  to them for clarity.

  Note that cpuacct.usage_percpu is currently not included in
  "cpu.stat".  If this information is actually called for, it will be
  added later.

* "cpu.shares" is replaced with "cpu.weight" and operates on the
  standard scale defined by CGROUP_WEIGHT_MIN/DFL/MAX (1, 100, 10000).
  The weight is scaled to scheduler weight so that 100 maps to 1024
  and the ratio relationship is preserved - if weight is W and its
  scaled value is S, W / 100 == S / 1024.  While the mapped range is a
  bit smaller than the orignal scheduler weight range, the dead zones
  on both sides are relatively small and covers wider range than the
  nice value mappings.  This file doesn't make sense in the root
  cgroup and isn't created on root.

* "cpu.weight.nice" is added. When read, it reads back the nice value
  which is closest to the current "cpu.weight".  When written, it sets
  "cpu.weight" to the weight value which matches the nice value.  This
  makes it easy to configure cgroups when they're competing against
  threads in threaded subtrees.

* "cpu.cfs_quota_us" and "cpu.cfs_period_us" are replaced by "cpu.max"
  which contains both quota and period.

v4: - Use cgroup2 basic usage stat as the information source instead
      of cpuacct.

v3: - Added "cpu.weight.nice" to allow using nice values when
      configuring the weight.  The feature is requested by PeterZ.
    - Merge the patch to enable threaded support on cpu and cpuacct.
    - Dropped the bits about getting rid of cpuacct from patch
      description as there is a pretty strong case for making cpuacct
      an implicit controller so that basic cpu usage stats are always
      available.
    - Documentation updated accordingly.  "cpu.rt.max" section is
      dropped for now.

v2: - cpu_stats_show() was incorrectly using CONFIG_FAIR_GROUP_SCHED
      for CFS bandwidth stats and also using raw division for u64.
      Use CONFIG_CFS_BANDWITH and do_div() instead.  "cpu.rt.max" is
      not included yet.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
2017-09-29 14:30:37 -07:00
..
bpf bpf: fix ri->map_owner pointer on bpf_prog_realloc 2017-09-19 16:38:53 -07:00
cgroup cgroup: statically initialize init_css_set->dfl_cgrp 2017-09-25 14:02:53 -07:00
configs ANDROID: binder: add hwbinder,vndbinder to BINDER_DEVICES. 2017-08-22 18:43:23 -07:00
debug sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h> 2017-03-02 08:42:34 +01:00
events bpf: one perf event close won't free bpf program attached by another perf event 2017-09-20 14:10:29 -07:00
gcov gcov: support GCC 7.1 2017-05-12 15:57:15 -07:00
irq genirq: Fix cpumask check in __irq_startup_managed() 2017-09-16 20:20:56 +02:00
livepatch livepatch: Fix stacking of patches with respect to RCU 2017-06-20 10:42:19 +02:00
locking mm: treewide: remove GFP_TEMPORARY allocation flag 2017-09-13 18:53:16 -07:00
power Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-09-12 11:30:56 -07:00
printk Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk 2017-09-07 21:00:52 -07:00
rcu treewide: make "nr_cpu_ids" unsigned 2017-09-08 18:26:48 -07:00
sched sched: Implement interface for cgroup unified hierarchy 2017-09-29 14:30:37 -07:00
time drivers/pps: aesthetic tweaks to PPS-related content 2017-09-08 18:26:51 -07:00
trace This includes 3 minor fixes. 2017-09-20 06:38:07 -10:00
.gitignore
acct.c fs: make the buf argument to __kernel_write a void pointer 2017-09-04 19:05:15 -04:00
async.c async: Adjust system_state checks 2017-05-23 10:01:37 +02:00
audit_fsnotify.c Merge branch 'fsnotify' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2017-05-03 11:05:15 -07:00
audit_tree.c Merge branch 'fsnotify' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2017-05-03 11:05:15 -07:00
audit_watch.c audit/stable-4.13 PR 20170816 2017-08-16 16:48:34 -07:00
audit.c audit: update the function comments 2017-09-05 09:46:59 -04:00
audit.h ipc: mqueue: Replace timespec with timespec64 2017-09-03 20:21:24 -04:00
auditfilter.c audit: kernel generated netlink traffic should have a portid of 0 2017-05-02 10:16:05 -04:00
auditsc.c Merge branch 'work.ipc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-09-14 17:37:26 -07:00
backtracetest.c
bounds.c
capability.c
compat.c semtimedop(): move compat to native 2017-07-15 20:46:47 -04:00
configs.c
context_tracking.c
cpu_pm.c PM / CPU: replace raw_notifier with atomic_notifier 2017-07-31 13:09:49 +02:00
cpu.c Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-09-04 13:53:53 -07:00
crash_core.c kdump: protect vmcoreinfo data under the crash memory 2017-07-12 16:26:00 -07:00
crash_dump.c
cred.c doc: ReSTify credentials.txt 2017-05-18 10:30:19 -06:00
delayacct.c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h> 2017-03-02 08:42:39 +01:00
dma.c
elfcore.c
exec_domain.c
exit.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2017-09-11 18:34:47 -07:00
extable.c lib/extable.c: use bsearch() library function in search_extable() 2017-07-10 16:32:35 -07:00
fork.c selinux/stable-4.14 PR 20170831 2017-09-12 13:21:00 -07:00
freezer.c
futex_compat.c
futex.c futex: Remove duplicated code and fix undefined behaviour 2017-08-25 22:49:59 +02:00
groups.c kernel/groups.c: use sort library function 2017-07-10 16:32:34 -07:00
hung_task.c kernel/hung_task.c: defer showing held locks 2017-05-08 17:15:10 -07:00
irq_work.c
jump_label.c jump_label: Provide hotplug context variants 2017-08-10 12:28:59 +02:00
kallsyms.c kernel/kallsyms.c: replace all_var with IS_ENABLED(CONFIG_KALLSYMS_ALL) 2017-07-10 16:32:34 -07:00
kcmp.c kcmp: add KCMP_EPOLL_TFD mode to compare epoll target files 2017-07-12 16:26:01 -07:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kcov.c kcov: support compat processes 2017-09-08 18:26:51 -07:00
kexec_core.c x86/mm, kexec: Allow kexec to be used with SME 2017-07-18 11:38:04 +02:00
kexec_file.c kexec_file: adjust declaration of kexec_purgatory 2017-07-12 16:26:02 -07:00
kexec_internal.h kexec_file: adjust declaration of kexec_purgatory 2017-07-12 16:26:02 -07:00
kexec.c kdump: protect vmcoreinfo data under the crash memory 2017-07-12 16:26:00 -07:00
kmod.c kmod: move #ifdef CONFIG_MODULES wrapper to Makefile 2017-09-08 18:26:51 -07:00
kprobes.c kprobes: Ensure that jprobe probepoints are at function entry 2017-07-08 11:05:35 +02:00
ksysfs.c kexec: move vmcoreinfo out of the kernel's .bss section 2017-07-12 16:25:59 -07:00
kthread.c kernel/kthread.c: kthread_worker: don't hog the cpu 2017-08-31 16:33:15 -07:00
latencytop.c sched/headers: Prepare to move sched_info_on() and force_schedstat_enabled() from <linux/sched.h> to <linux/sched/stat.h> 2017-03-02 08:42:39 +01:00
Makefile kmod: move #ifdef CONFIG_MODULES wrapper to Makefile 2017-09-08 18:26:51 -07:00
memremap.c mm/device-public-memory: device memory cache coherent with CPU 2017-09-08 18:26:46 -07:00
module_signing.c
module-internal.h
module.c module: fix ddebug_remove_module() 2017-07-25 15:08:32 +02:00
notifier.c kernel/notifier.c: simplify expression 2017-02-24 17:46:56 -08:00
nsproxy.c perf: Add PERF_RECORD_NAMESPACES to include namespaces related info 2017-03-13 15:57:41 -03:00
padata.c padata: Avoid nested calls to cpus_read_lock() in pcrypt_init_padata() 2017-05-26 10:10:37 +02:00
panic.c locking/refcounts, x86/asm: Implement fast refcount overflow protection 2017-08-17 10:40:26 +02:00
params.c boot/param: Move next_arg() function to lib/cmdline.c for later reuse 2017-04-18 10:37:13 +02:00
pid_namespace.c userns,pidns: Verify the userns for new pid namespaces 2017-07-20 07:43:58 -05:00
pid.c pids: make task_tgid_nr_ns() safe 2017-08-21 12:47:31 -07:00
profile.c sched/headers: Prepare to move sched_info_on() and force_schedstat_enabled() from <linux/sched.h> to <linux/sched/stat.h> 2017-03-02 08:42:39 +01:00
ptrace.c signal: Remove kernel interal si_code magic 2017-07-24 14:30:28 -05:00
range.c
reboot.c
relay.c Merge branch 'work.splice' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-02 11:38:06 -07:00
resource.c
seccomp.c seccomp: Implement SECCOMP_RET_KILL_PROCESS action 2017-08-14 13:46:50 -07:00
signal.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2017-09-11 18:34:47 -07:00
smp.c treewide: make "nr_cpu_ids" unsigned 2017-09-08 18:26:48 -07:00
smpboot.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task.h> 2017-03-02 08:42:35 +01:00
smpboot.h
softirq.c sched/core: Remove 'task' parameter and rename tsk_restore_flags() to current_restore_flags() 2017-04-11 09:06:32 +02:00
stacktrace.c stacktrace/x86: add function for detecting reliable stack traces 2017-03-08 09:18:02 +01:00
stop_machine.c stop_machine: Provide stop_machine_cpuslocked() 2017-05-26 10:10:36 +02:00
sys_ni.c
sys.c prctl: Allow local CAP_SYS_ADMIN changing exe_file 2017-07-20 07:46:07 -05:00
sysctl_binary.c fs: fix kernel_write prototype 2017-09-04 19:05:15 -04:00
sysctl.c kernel/watchdog: split up config options 2017-07-12 16:26:02 -07:00
task_work.c task_work: Replace spin_unlock_wait() with lock/unlock pair 2017-07-25 10:08:58 -07:00
taskstats.c taskstats: add e/u/stime for TGID command 2017-05-08 17:15:12 -07:00
test_kprobes.c
torture.c torture: Fix typo suppressing CPU-hotplug statistics 2017-07-25 13:04:45 -07:00
tracepoint.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task.h> 2017-03-02 08:42:35 +01:00
tsacct.c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h> 2017-03-02 08:42:39 +01:00
ucount.c ucount: Remove the atomicity from ucount->count 2017-03-06 15:26:37 -06:00
uid16.c sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h> 2017-03-02 08:42:31 +01:00
umh.c kmod: split out umh code into its own file 2017-09-08 18:26:50 -07:00
up.c smp: Avoid using two cache lines for struct call_single_data 2017-08-29 15:14:38 +02:00
user_namespace.c userns,pidns: Verify the userns for new pid namespaces 2017-07-20 07:43:58 -05:00
user-return-notifier.c
user.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/user.h> 2017-03-02 08:42:29 +01:00
utsname_sysctl.c sched/headers: Remove <linux/rwsem.h> from <linux/sched.h> 2017-03-03 01:45:36 +01:00
utsname.c sched/headers: Prepare to move the task_lock()/unlock() APIs to <linux/sched/task.h> 2017-03-02 08:42:38 +01:00
watchdog_hld.c kernel/watchdog: Prevent false positives with turbo modes 2017-08-18 12:35:02 +02:00
watchdog.c kernel/watchdog: Prevent false positives with turbo modes 2017-08-18 12:35:02 +02:00
workqueue_internal.h
workqueue.c Merge branch 'for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq 2017-09-06 21:59:31 -07:00