linux/kernel/sched
Josh Don 304000390f sched: Cgroup SCHED_IDLE support
This extends SCHED_IDLE to cgroups.

Interface: cgroup/cpu.idle.
 0: default behavior
 1: SCHED_IDLE
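
As a rough userspace illustration of the interface above, the helper below
writes 0 or 1 to a cpu.idle file. This is only a sketch: the function name
set_cgroup_idle() is hypothetical, and the caller has to supply the full
path, since the cgroup mount point and group name are not given here.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write an idle setting to a cgroup's cpu.idle file.
 * `path` is the full path to that file, supplied by the caller (the mount
 * point and group name are assumptions outside this sketch).
 * Returns 0 on success, -1 on failure.
 */
int set_cgroup_idle(const char *path, int idle)
{
    FILE *f;
    int ok;

    assert(path != NULL);

    /* Only 0 (default behavior) and 1 (SCHED_IDLE) are described above. */
    if (idle != 0 && idle != 1)
        return -1;

    f = fopen(path, "w");
    if (!f)
        return -1;

    ok = fprintf(f, "%d\n", idle) > 0;
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

A caller would pass something like the cpu.idle file of the group it wants
to mark idle; writing 0 to the same file restores the default behavior.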

Extending SCHED_IDLE to cgroups means that we incorporate the existing
aspects of SCHED_IDLE; a SCHED_IDLE cgroup will count all of its
descendant threads towards the idle_h_nr_running count of all of its
ancestor cgroups. Thus, sched_idle_rq() will work properly.
Additionally, SCHED_IDLE cgroups are configured with minimum weight.
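
The accounting described above can be modeled in a few lines. This is a
simplified standalone sketch, not the code in fair.c: struct group,
enqueue() and root_is_sched_idle() are hypothetical names, and the model
ignores runqueues, throttling and per-CPU state. It only shows how a task
below an idle cgroup ends up counted as idle in every ancestor of that
cgroup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a cgroup's per-CPU queue. */
struct group {
    struct group *parent;            /* NULL for the root */
    bool idle;                       /* true when cpu.idle == 1 */
    unsigned int h_nr_running;       /* all runnable descendant tasks */
    unsigned int idle_h_nr_running;  /* descendants counted as idle */
};

/* Account one newly runnable task, starting at the group that holds it. */
void enqueue(struct group *g, bool task_is_idle)
{
    bool counts_as_idle = task_is_idle;

    assert(g != NULL);
    for (; g; g = g->parent) {
        g->h_nr_running++;
        if (counts_as_idle)
            g->idle_h_nr_running++;
        /* Above an idle group, every task below it counts as idle. */
        if (g->idle)
            counts_as_idle = true;
    }
}

/* Modeled after the idea of sched_idle_rq(): only idle work is runnable. */
bool root_is_sched_idle(const struct group *root)
{
    return root->h_nr_running != 0 &&
           root->h_nr_running == root->idle_h_nr_running;
}
```

In this model, a normal task enqueued under an idle group raises
idle_h_nr_running at the root but not in the idle group itself, so the
root looks idle until some task outside the idle subtree becomes runnable.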

There are two key differences between the per-task and per-cgroup
SCHED_IDLE interface:

  - The cgroup interface allows tasks within a SCHED_IDLE hierarchy to
    maintain their relative weights. The entity that is "idle" is the
    cgroup, not the tasks themselves.

  - Since the idle entity is the cgroup, our SCHED_IDLE wakeup preemption
    decision is not made by comparing the current task with the woken
    task, but rather by comparing their matching sched_entity.
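
The second point can be sketched as follows. This is a simplified model
under stated assumptions, not the kernel implementation: struct entity,
find_matching() and idle_wakeup_decision() are hypothetical names, and
FALL_THROUGH stands for "let the regular weight-based comparison decide".
Both entities are walked up the hierarchy until they are siblings, and
only then is their idle status compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified scheduling entity (a task or a cgroup). */
struct entity {
    struct entity *parent;  /* entity of the enclosing cgroup, NULL at top */
    int depth;              /* 0 at top level, parent depth + 1 below */
    bool idle;              /* idle cgroup, or task with SCHED_IDLE policy */
};

enum wakeup_decision { PREEMPT, NO_PREEMPT, FALL_THROUGH };

/* Walk both entities up until they share the same parent. */
static void find_matching(struct entity **a, struct entity **b)
{
    assert(*a != NULL && *b != NULL);
    while ((*a)->depth > (*b)->depth)
        *a = (*a)->parent;
    while ((*b)->depth > (*a)->depth)
        *b = (*b)->parent;
    while ((*a)->parent != (*b)->parent) {
        *a = (*a)->parent;
        *b = (*b)->parent;
    }
}

/* Decide based on the idle status of the matching entities only. */
enum wakeup_decision idle_wakeup_decision(struct entity *curr,
                                          struct entity *woken)
{
    find_matching(&curr, &woken);

    if (curr->idle && !woken->idle)
        return PREEMPT;       /* non-idle side wakes against idle side */
    if (curr->idle != woken->idle)
        return NO_PREEMPT;    /* idle side wakes against non-idle side */
    return FALL_THROUGH;      /* same idle status: use the normal check */
}
```

Two tasks inside the same idle cgroup are already siblings, so the model
returns FALL_THROUGH for them, which reflects the first point: their
relative weights still decide between them.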

A typical use-case for this is a user that creates an idle and a
non-idle subtree. The non-idle subtree will dominate competition vs
the idle subtree, but the idle subtree will still be high priority vs
other users on the system. The latter is accomplished via comparing
matching sched_entity in the wakeup preemption path (this could also be
improved by making the sched_idle_rq() decision dependent on the
perspective of a specific task).

For now, we maintain the existing SCHED_IDLE semantics. Future patches
may make improvements that extend how we treat SCHED_IDLE entities.

The per-task_group idle field is an integer that currently only holds
either a 0 or a 1. This is explicitly typed as an integer to allow for
further extensions to this API. For example, a negative value may
indicate a highly latency-sensitive cgroup that should be preferred
for preemption/placement/etc.

Signed-off-by: Josh Don <joshdon@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20210730020019.1487127-2-joshdon@google.com
2021-08-20 12:32:58 +02:00
autogroup.c sched/autogroup: Make autogroup_path() always available 2019-06-24 19:23:40 +02:00
autogroup.h
clock.c sched: Fix various typos 2021-03-22 00:11:52 +01:00
completion.c completion: Use lockdep_assert_RT_in_threaded_ctx() in complete_all() 2020-03-23 18:40:25 +01:00
core_sched.c sched: prctl() core-scheduling interface 2021-05-12 11:43:31 +02:00
core.c sched: Cgroup SCHED_IDLE support 2021-08-20 12:32:58 +02:00
cpuacct.c sched: Wrap rq::lock access 2021-05-12 11:43:26 +02:00
cpudeadline.c sched,rt: Use the full cpumask for balancing 2020-11-10 18:39:00 +01:00
cpudeadline.h
cpufreq_schedutil.c sched/cpufreq: Consider reduced CPU capacity in energy calculation 2021-06-17 14:11:43 +02:00
cpufreq.c cpufreq: Avoid leaving stale IRQ work items during CPU offline 2019-12-12 17:59:43 +01:00
cpupri.c sched: Fix various typos 2021-03-22 00:11:52 +01:00
cpupri.h sched/cpupri: Add CPUPRI_HIGHER 2020-10-29 11:00:30 +01:00
cputime.c Scheduler updates for this cycle are: 2021-04-28 13:33:57 -07:00
deadline.c sched/deadline: Fix missing clock update in migrate_task_rq_dl() 2021-08-06 14:25:24 +02:00
debug.c sched: Cgroup SCHED_IDLE support 2021-08-20 12:32:58 +02:00
fair.c sched: Cgroup SCHED_IDLE support 2021-08-20 12:32:58 +02:00
features.h sched: Warn on long periods of pending need_resched 2021-04-21 13:55:41 +02:00
idle.c sched: Trivial forced-newidle balancer 2021-05-12 11:43:30 +02:00
isolation.c sched/isolation: Reconcile rcu_nocbs= and nohz_full= 2021-05-13 14:12:47 +02:00
loadavg.c sched: Make multiple runqueue task counters 32-bit 2021-05-12 21:34:17 +02:00
Makefile sched: Trivial core scheduling cookie management 2021-05-12 11:43:31 +02:00
membarrier.c sched/membarrier: fix missing local execution of ipi_sync_rq_state() 2021-03-06 12:40:21 +01:00
pelt.c sched: Fix various typos 2021-03-22 00:11:52 +01:00
pelt.h Merge branch 'sched/urgent' into sched/core, to resolve conflicts 2021-06-18 11:31:25 +02:00
psi.c psi: Fix race between psi_trigger_create/destroy 2021-06-24 09:07:50 +02:00
rt.c sched/rt: Fix RT utilization tracking during policy change 2021-06-22 16:41:59 +02:00
sched-pelt.h sched/fair: Fix "runnable_avg_yN_inv" not used warnings 2019-06-17 12:15:58 +02:00
sched.h sched: Cgroup SCHED_IDLE support 2021-08-20 12:32:58 +02:00
smp.h sched/headers: Split out open-coded prototypes into kernel/sched/smp.h 2020-05-28 11:03:20 +02:00
stats.c sched: Fix various typos 2021-03-22 00:11:52 +01:00
stats.h sched: Introduce task_is_running() 2021-06-18 11:43:07 +02:00
stop_task.c sched: Introduce sched_class::pick_task() 2021-05-12 11:43:28 +02:00
swait.c sched/swait: Prepare usage in completions 2020-03-21 16:00:23 +01:00
topology.c sched/topology: Skip updating masks for non-online nodes 2021-08-20 12:32:57 +02:00
wait_bit.c sched/wait: fix ___wait_var_event(exclusive) 2019-12-17 13:32:50 +01:00
wait.c sched/wait: Add add_wait_queue_priority() 2020-11-15 09:49:09 -05:00