korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-27 14:14:24 +08:00

Phil Auld baa9be4ffb sched/fair: Fix throttle_list starvation with low CFS quota

With a very low cpu.cfs_quota_us setting, such as the minimum of 1000,
distribute_cfs_runtime may not empty the throttled_list before it runs
out of runtime to distribute. In that case, due to the change from
c06f04c704 to put throttled entries at the head of the list, later entries
on the list will starve. Essentially, the same X processes will get pulled
off the list, given CPU time and then, when expired, get put back on the
head of the list where distribute_cfs_runtime will give runtime to the same
set of processes leaving the rest.

Fix the issue by setting a bit in struct cfs_bandwidth when
distribute_cfs_runtime is running, so that the code in throttle_cfs_rq can
decide to put the throttled entry on the tail or the head of the list. The
bit is set/cleared by the callers of distribute_cfs_runtime while they hold
cfs_bandwidth->lock.

This is easy to reproduce with a handful of CPU consumers. I use 'crash' on
the live system. In some cases you can simply look at the throttled list and
see the later entries are not changing:

    crash> list cfs_rq.throttled_list -H 0xffff90b54f6ade40 -s cfs_rq.runtime_remaining | paste - - | awk '{print $1" "$4}' | pr -t -n3
      1 ffff90b56cb2d200 -976050
      2 ffff90b56cb2cc00 -484925
      3 ffff90b56cb2bc00 -658814
      4 ffff90b56cb2ba00 -275365
      5 ffff90b166a45600 -135138
      6 ffff90b56cb2da00 -282505
      7 ffff90b56cb2e000 -148065
      8 ffff90b56cb2fa00 -872591
      9 ffff90b56cb2c000 -84687
     10 ffff90b56cb2f000 -87237
     11 ffff90b166a40a00 -164582

    crash> list cfs_rq.throttled_list -H 0xffff90b54f6ade40 -s cfs_rq.runtime_remaining | paste - - | awk '{print $1" "$4}' | pr -t -n3
      1 ffff90b56cb2d200 -994147
      2 ffff90b56cb2cc00 -306051
      3 ffff90b56cb2bc00 -961321
      4 ffff90b56cb2ba00 -24490
      5 ffff90b166a45600 -135138
      6 ffff90b56cb2da00 -282505
      7 ffff90b56cb2e000 -148065
      8 ffff90b56cb2fa00 -872591
      9 ffff90b56cb2c000 -84687
     10 ffff90b56cb2f000 -87237
     11 ffff90b166a40a00 -164582

Sometimes it is easier to see by finding a process getting starved and
looking at the sched_info:

    crash> task ffff8eb765994500 sched_info
    PID: 7800 TASK: ffff8eb765994500 CPU: 16 COMMAND: "cputest"
      sched_info = {
        pcount = 8,
        run_delay = 697094208,
        last_arrival = 240260125039,
        last_queued = 240260327513
      },

    crash> task ffff8eb765994500 sched_info
    PID: 7800 TASK: ffff8eb765994500 CPU: 16 COMMAND: "cputest"
      sched_info = {
        pcount = 8,
        run_delay = 697094208,
        last_arrival = 240260125039,
        last_queued = 240260327513
      },

Signed-off-by: Phil Auld <pauld@redhat.com>
Reviewed-by: Ben Segall <bsegall@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Fixes: c06f04c704 ("sched: Fix potential near-infinite distribute_cfs_runtime() loop")
Link: http://lkml.kernel.org/r/20181008143639.GA4019@pauld.bos.csb
Signed-off-by: Ingo Molnar <mingo@kernel.org>

2018-10-11 13:10:18 +02:00
autogroup.c	sched/autogroup: Fix possible Spectre-v1 indexing for sched_prio_to_weight[]	2018-05-05 08:34:42 +02:00
autogroup.h	sched/headers: Simplify and clean up header usage in the scheduler	2018-03-04 12:39:29 +01:00
clock.c	sched/clock: Disable interrupts when calling generic_sched_clock_init()	2018-07-30 19:33:35 +02:00
completion.c	sched/Documentation: Update wake_up() & co. memory-barrier guarantees	2018-07-17 09:30:34 +02:00
core.c	sched/numa: Pass destination CPU as a parameter to migrate_task_rq	2018-10-02 09:42:21 +02:00
cpuacct.c	sched/headers: Simplify and clean up header usage in the scheduler	2018-03-04 12:39:29 +01:00
cpudeadline.c	sched/headers: Simplify and clean up header usage in the scheduler	2018-03-04 12:39:29 +01:00
cpudeadline.h	sched/headers: Simplify and clean up header usage in the scheduler	2018-03-04 12:39:29 +01:00
cpufreq_schedutil.c	sched/fair: Remove #ifdefs from scale_rt_capacity()	2018-07-25 11:41:05 +02:00
cpufreq.c	sched/headers: Simplify and clean up header usage in the scheduler	2018-03-04 12:39:29 +01:00
cpupri.c	sched/headers: Simplify and clean up header usage in the scheduler	2018-03-04 12:39:29 +01:00
cpupri.h	sched/headers: Simplify and clean up header usage in the scheduler	2018-03-04 12:39:29 +01:00
cputime.c	sched/headers: Simplify and clean up header usage in the scheduler	2018-03-04 12:39:29 +01:00
deadline.c	sched/numa: Pass destination CPU as a parameter to migrate_task_rq	2018-10-02 09:42:21 +02:00
debug.c	sched/debug: Fix potential deadlock when writing to sched_features	2018-09-10 10:13:45 +02:00
fair.c	sched/fair: Fix throttle_list starvation with low CFS quota	2018-10-11 13:10:18 +02:00
features.h	sched/fair: Update util_est only on util_avg updates	2018-03-20 08:11:09 +01:00
idle.c	sched: idle: Avoid retaining the tick when it has been stopped	2018-08-20 11:25:55 +02:00
isolation.c	sched/headers: Simplify and clean up header usage in the scheduler	2018-03-04 12:39:29 +01:00
loadavg.c	sched/headers: Simplify and clean up header usage in the scheduler	2018-03-04 12:39:29 +01:00
Makefile	sched/pelt: Move PELT related code in a dedicated file	2018-07-15 23:51:20 +02:00
membarrier.c	sched/headers: Simplify and clean up header usage in the scheduler	2018-03-04 12:39:29 +01:00
pelt.c	sched/core: Use PELT for scale_rt_capacity()	2018-07-16 00:16:25 +02:00
pelt.h	sched/irq: Add IRQ utilization tracking	2018-07-15 23:51:21 +02:00
rt.c	Merge branch 'sched/urgent' into sched/core, to pick up fixes	2018-07-25 11:29:58 +02:00
sched-pelt.h	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
sched.h	sched/fair: Fix throttle_list starvation with low CFS quota	2018-10-11 13:10:18 +02:00
stats.c	proc: introduce proc_create_seq{,_data}	2018-05-16 07:23:35 +02:00
stats.h	sched: Clean up and harmonize the coding style of the scheduler code base	2018-03-03 15:50:21 +01:00
stop_task.c	sched: Clean up and harmonize the coding style of the scheduler code base	2018-03-03 15:50:21 +01:00
swait.c	sched/swait: Rename to exclusive	2018-06-20 11:35:56 +02:00
topology.c	sched/topology: Set correct NUMA topology type	2018-09-10 10:13:45 +02:00
wait_bit.c	sched/wait: Improve __var_waitqueue() code generation	2018-03-20 08:23:25 +01:00
wait.c	sched/wait: assert the wait_queue_head lock is held in __wake_up_common	2018-08-22 10:52:47 -07:00