linux/kernel
Paul Turner 19e5eebb8e sched: Fix interactivity bug by charging unaccounted run-time on entity re-weight
Mike Galbraith reported poor interactivity[*] when the new shares distribution
code was combined with autogroups.

The root cause turns out to be a mis-ordering of accounting accrued execution
time and shares updates.  Since update_curr() is issued hierarchically,
updating the parent entity weights to reflect child enqueue/dequeue results in
the parent's unaccounted execution time then being accrued (vs vruntime) at the
new weight as opposed to the weight present at accumulation.

While this doesn't have much effect on processes with timeslices that cross a
tick, it is particularly problematic for an interactive process (e.g. Xorg)
which incurs many (tiny) timeslices.  In this scenario almost all updates are
at dequeue which can result in significant fairness perturbation (especially if
it is the only thread, resulting in potential {tg->shares, MIN_SHARES}
transitions).

Correct this by ensuring unaccounted time is accumulated prior to manipulating
an entity's weight.

[*] http://xkcd.com/619/ is perversely Nostradamian here.

Signed-off-by: Paul Turner <pjt@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
LKML-Reference: <20101216031038.159704378@google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-19 16:36:30 +01:00
..
debug kdb: fix crash when KDB_BASE_CMD_MAX is exceeded 2010-11-17 13:54:57 -06:00
gcov llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
irq Merge branch 'linus' into sched/core 2010-12-08 20:15:29 +01:00
power PM / Hibernate: Fix memory corruption related to swap 2010-12-06 23:52:08 +01:00
time ntp: Clamp PLL update interval 2010-09-09 20:48:37 +02:00
trace Merge branch 'linus' into sched/core 2010-12-08 20:15:29 +01:00
.gitignore
acct.c pass a struct path to vfs_statfs 2010-08-09 16:48:42 -04:00
async.c async: use workqueue for worker pool 2010-07-14 11:29:46 +02:00
audit_tree.c in untag_chunk() we need to do alloc_chunk() a bit earlier 2010-10-30 02:18:32 -04:00
audit_watch.c audit: make functions static 2010-10-30 01:42:19 -04:00
audit.c audit: Use rcu for task lookup protection 2010-10-30 08:45:42 -04:00
audit.h audit: make functions static 2010-10-30 01:42:19 -04:00
auditfilter.c Audit: add support to match lsm labels on user audit messages 2010-10-30 01:41:57 -04:00
auditsc.c audit mmap 2010-10-30 08:45:43 -04:00
backtracetest.c
bounds.c
capability.c sched: Remove remaining USER_SCHED code 2010-04-02 20:12:00 +02:00
cgroup_freezer.c cgroup_freezer: update_freezer_state() does incorrect state transitions 2010-10-27 18:03:08 -07:00
cgroup.c convert cgroup and cpuset 2010-10-29 04:17:06 -04:00
compat.c compat: Make compat_alloc_user_space() incorporate the access_ok() 2010-09-14 16:08:45 -07:00
configs.c llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
cpu.c cpu: Remove incorrect BUG_ON 2010-11-23 10:29:08 +01:00
cpuset.c convert cgroup and cpuset 2010-10-29 04:17:06 -04:00
cred.c signals: move cred_guard_mutex from task_struct to signal_struct 2010-10-27 18:03:12 -07:00
delayacct.c
dma.c
elfcore.c elf coredump: add extended numbering support 2010-03-06 11:26:46 -08:00
exec_domain.c sys_personality: remove the bogus checks in sys_personality()->__set_personality() path 2010-08-09 20:45:05 -07:00
exit.c do_exit(): make sure that we run with get_fs() == USER_DS 2010-12-02 14:51:16 -08:00
extable.c
fork.c sched: Add 'autogroup' scheduling feature: automated per session task groups 2010-11-30 16:03:35 +01:00
freezer.c
futex_compat.c futex: Address compiler warnings in exit_robust_list 2010-11-10 13:27:50 +01:00
futex.c futex: Address compiler warnings in exit_robust_list 2010-11-10 13:27:50 +01:00
groups.c kernel/groups.c: fix integer overflow in groups_search 2010-09-09 18:57:24 -07:00
hrtimer.c hrtimer: Preserve timer state in remove_hrtimer() 2010-10-14 13:29:59 +02:00
hung_task.c lockup detector: Fix grammar by adding a missing "to" in the comments 2010-08-17 09:11:52 +02:00
hw_breakpoint.c perf,hw_breakpoint: Initialize hardware api earlier 2010-11-12 14:51:55 +01:00
irq_work.c irq_work: Drop cmpxchg() result 2010-11-18 13:18:47 +01:00
itimer.c
jump_label.c jump label: Make arch_jump_label_text_poke_early() optional 2010-10-29 12:56:13 -04:00
kallsyms.c Revert "kernel: make /proc/kallsyms mode 400 to reduce ease of attacking" 2010-11-19 11:54:40 -08:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kexec.c use clear_page()/copy_page() in favor of memset()/memcpy() on whole pages 2010-10-26 16:52:13 -07:00
kfifo.c kfifo: fix scatterlist usage 2010-10-01 10:50:58 -07:00
kmod.c Make do_execve() take a const filename pointer 2010-08-17 18:07:43 -07:00
kprobes.c jump label: Fix error with preempt disable holding mutex 2010-10-29 12:55:55 -04:00
ksysfs.c sysfs: add struct file* to bin_attr callbacks 2010-05-21 09:37:31 -07:00
kthread.c sched: Make sched_param argument static in sched_setscheduler() callers 2010-10-23 17:56:48 +02:00
latencytop.c latencytop: fix per task accumulator 2010-11-12 07:55:31 -08:00
lockdep_internals.h lockdep: No need to disable preemption in debug atomic ops 2010-05-04 05:38:16 +02:00
lockdep_proc.c lockstat: Make lockstat counting per cpu 2010-04-06 00:15:37 +02:00
lockdep_states.h
lockdep.c lockdep: Check the depth of subclass 2010-10-18 18:44:26 +02:00
Makefile Merge branch 'core-memblock-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2010-10-21 18:52:11 -07:00
module.c tracing: Fix module use of trace_bprintk() 2010-11-10 22:19:24 -05:00
mutex-debug.c
mutex-debug.h
mutex.c mutexes, sched: Introduce arch_mutex_cpu_relax() 2010-11-26 15:05:34 +01:00
mutex.h
notifier.c
ns_cgroup.c cgroup: notify ns_cgroup deprecated 2010-10-27 18:03:09 -07:00
nsproxy.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
padata.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2010-08-04 15:23:14 -07:00
panic.c lib/bug.c: add oops end marker to WARN implementation 2010-08-11 08:59:22 -07:00
params.c param: locking for kernel parameters 2010-08-11 23:04:20 +09:30
perf_event.c perf: Fix the software context switch counter 2010-11-26 15:00:59 +01:00
pid_namespace.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
pid.c Add RCU check for find_task_by_vpid(). 2010-08-19 17:18:02 -07:00
pm_qos_params.c PM / PM QoS: Fix reversed min and max 2010-11-15 22:45:22 +01:00
posix-cpu-timers.c posix-cpu-timers: Rcu_read_lock/unlock protect find_task_by_vpid call 2010-11-10 13:07:06 +01:00
posix-timers.c posix_timer: Move copy_to_user(created_timer_id) down in timer_create() 2010-07-23 15:08:12 +02:00
printk.c printk: Use this_cpu_{read|write} api on printk_pending 2010-12-08 20:16:01 +01:00
profile.c llseek: automatically add .llseek fop 2010-10-15 15:53:27 +02:00
ptrace.c signals: move cred_guard_mutex from task_struct to signal_struct 2010-10-27 18:03:12 -07:00
range.c kernel/range.c: fix clean_sort_range() for the case of full array 2010-11-12 07:55:31 -08:00
rcupdate.c Merge branch 'rcu/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu into core/rcu 2010-10-07 09:43:11 +02:00
rcutiny_plugin.h rcu: performance fixes to TINY_PREEMPT_RCU callback checking 2010-08-27 10:51:17 -07:00
rcutiny.c rcu: Add a TINY_PREEMPT_RCU 2010-08-20 08:55:00 -07:00
rcutorture.c rcu: fix sparse errors in rcutorture.c 2010-09-23 09:16:42 -07:00
rcutree_plugin.h rcu: fix _oddness handling of verbose stall warnings 2010-09-02 16:15:30 -07:00
rcutree_trace.c rcu: Add tracing data to support queueing models 2010-09-23 09:16:53 -07:00
rcutree.c rcu: using ACCESS_ONCE() to observe the jiffies_stall/rnp->qsmask value 2010-10-07 10:41:06 -07:00
rcutree.h rcu: Add tracing data to support queueing models 2010-09-23 09:16:53 -07:00
relay.c Clean up relay_alloc_page_array() slightly by using vzalloc rather than vmalloc and memset 2010-11-05 08:21:34 -07:00
res_counter.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
resource.c Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 2010-10-28 11:59:52 -07:00
rtmutex_common.h
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c rtmutex-tester: make it build without BKL 2010-10-19 11:29:56 +02:00
rtmutex.c
rtmutex.h
rwsem.c
sched_autogroup.c sched: Add 'autogroup' scheduling feature: automated per session task groups 2010-11-30 16:03:35 +01:00
sched_autogroup.h sched: Add 'autogroup' scheduling feature: automated per session task groups 2010-11-30 16:03:35 +01:00
sched_clock.c sched: Add some clock info to sched_debug 2010-11-23 10:29:08 +01:00
sched_cpupri.c sched: No need for bootmem special cases 2010-07-17 12:06:22 +02:00
sched_cpupri.h sched: No need for bootmem special cases 2010-07-17 12:06:22 +02:00
sched_debug.c sched: Add 'autogroup' scheduling feature: automated per session task groups 2010-11-30 16:03:35 +01:00
sched_fair.c sched: Fix interactivity bug by charging unaccounted run-time on entity re-weight 2010-12-19 16:36:30 +01:00
sched_features.h sched: Rewrite tg_shares_up) 2010-11-18 13:27:46 +01:00
sched_idletask.c sched: Cure load average vs NO_HZ woes 2010-04-23 11:02:02 +02:00
sched_rt.c sched: Implement on-demand (active) cfs_rq list 2010-11-18 13:27:47 +01:00
sched_stats.h sched_stat: Update sched_info_queue/dequeue() code comments 2010-10-24 13:29:01 +02:00
sched_stoptask.c sched: Fix cross-sched-class wakeup preemption 2010-11-11 14:37:23 +01:00
sched.c sched: Make pushable_tasks CONFIG_SMP dependant 2010-12-08 20:16:00 +01:00
seccomp.c
semaphore.c
signal.c signals: annotate lock context change on ptrace_stop() 2010-10-27 18:03:12 -07:00
smp.c Typedef SMP call function pointer 2010-10-27 17:28:36 +01:00
softirq.c Merge commit 'v2.6.37-rc2' into sched/core 2010-11-18 13:22:26 +01:00
spinlock.c
srcu.c kernel: Remove undead ifdef CONFIG_DEBUG_LOCK_ALLOC 2010-09-23 09:14:51 -07:00
stacktrace.c
stop_machine.c stop_machine: convert cpu notifier to return encapsulate errno value 2010-10-26 16:52:15 -07:00
sys_ni.c powerpc: define a compat_sys_recv cond_syscall 2010-09-23 17:03:55 +10:00
sys.c sched: Add 'autogroup' scheduling feature: automated per session task groups 2010-11-30 16:03:35 +01:00
sysctl_binary.c sysctl: don't use own implementation of hex_to_bin() 2010-05-25 08:07:05 -07:00
sysctl_check.c sysctl: min/max bounds are optional 2010-10-15 14:42:24 -07:00
sysctl.c sched: Add 'autogroup' scheduling feature: automated per session task groups 2010-11-30 16:03:35 +01:00
taskstats.c taskstats: split fill_pid function 2010-10-27 18:03:17 -07:00
test_kprobes.c kprobes: Fix selftest to clear flags field for reusing probes 2010-10-14 08:55:27 +02:00
time.c time: Kill off CONFIG_GENERIC_TIME 2010-07-27 12:40:54 +02:00
timeconst.pl
timer.c irq_work: Add generic hardirq context callbacks 2010-10-18 19:58:50 +02:00
tracepoint.c jump_label: Use more consistent naming 2010-10-18 19:58:56 +02:00
tsacct.c taskstats: use real microsecond granularity for CPU times 2010-10-27 18:03:17 -07:00
uid16.c
up.c
user_namespace.c user_ns: Introduce user_nsmap_uid and user_ns_map_gid. 2010-06-16 14:55:34 -07:00
user-return-notifier.c
user.c kernel/user.c: add lock release annotation on free_user() 2010-10-26 16:52:15 -07:00
utsname_sysctl.c
utsname.c
wait.c docbook: add more wait/wake/completion to device-drivers docbook 2010-10-26 17:32:41 -07:00
watchdog.c Merge commit 'v2.6.37-rc2' into sched/core 2010-11-18 13:22:26 +01:00
workqueue_sched.h workqueue: implement concurrency managed dynamic worker pool 2010-06-29 10:07:14 +02:00
workqueue.c workqueue: It is likely that WORKER_NOT_RUNNING is true 2010-12-14 15:05:54 +01:00