linux/kernel
Thomas Gleixner bc7a34b8b9 timer: Reduce timer migration overhead if disabled
Eric reported that the timer_migration sysctl is not really nice
performance wise as it needs to check at every timer insertion whether
the feature is enabled or not. Further the check does not live in the
timer code, so we have an extra function call which checks an extra
cache line to figure out that it is disabled.

We can do better and store that information in the per cpu (hr)timer
bases. I pondered to use a static key, but that's a nightmare to
update from the nohz code and the timer base cache line is hot anyway
when we select a timer base.

The old logic enabled the timer migration unconditionally if
CONFIG_NO_HZ was set even if nohz was disabled on the kernel command
line.

With this modification, we start off with migration disabled. The user
visible sysctl is still set to enabled. If the kernel switches to NOHZ
migration is enabled, if the user did not disable it via the sysctl
prior to the switch. If nohz=off is on the kernel command line,
migration stays disabled no matter what.

Before:
  47.76%  hog       [.] main
  14.84%  [kernel]  [k] _raw_spin_lock_irqsave
   9.55%  [kernel]  [k] _raw_spin_unlock_irqrestore
   6.71%  [kernel]  [k] mod_timer
   6.24%  [kernel]  [k] lock_timer_base.isra.38
   3.76%  [kernel]  [k] detach_if_pending
   3.71%  [kernel]  [k] del_timer
   2.50%  [kernel]  [k] internal_add_timer
   1.51%  [kernel]  [k] get_nohz_timer_target
   1.28%  [kernel]  [k] __internal_add_timer
   0.78%  [kernel]  [k] timerfn
   0.48%  [kernel]  [k] wake_up_nohz_cpu

After:
  48.10%  hog       [.] main
  15.25%  [kernel]  [k] _raw_spin_lock_irqsave
   9.76%  [kernel]  [k] _raw_spin_unlock_irqrestore
   6.50%  [kernel]  [k] mod_timer
   6.44%  [kernel]  [k] lock_timer_base.isra.38
   3.87%  [kernel]  [k] detach_if_pending
   3.80%  [kernel]  [k] del_timer
   2.67%  [kernel]  [k] internal_add_timer
   1.33%  [kernel]  [k] __internal_add_timer
   0.73%  [kernel]  [k] timerfn
   0.54%  [kernel]  [k] wake_up_nohz_cpu


Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Joonwoo Park <joonwoop@codeaurora.org>
Cc: Wenbo Wang <wenbo.wang@memblaze.com>
Link: http://lkml.kernel.org/r/20150526224512.127050787@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-06-19 15:18:28 +02:00
..
bpf bpf: fix 64-bit divide 2015-04-27 23:11:49 -04:00
configs x86: Add "make tinyconfig" to configure the tiniest possible kernel 2014-08-08 16:30:24 -07:00
debug debug: prevent entering debug mode on panic/exception. 2015-02-19 12:39:03 -06:00
events Merge branch 'linus' into timers/core 2015-05-19 16:12:32 +02:00
gcov gcov: fix softlockups 2015-04-17 09:04:08 -04:00
irq genirq: Set IRQCHIP_SKIP_SET_WAKE flag for dummy_irq_chip 2015-04-24 20:57:06 +02:00
livepatch Merge branch 'for-4.1/core-noarch' into for-linus 2015-04-13 23:57:20 +02:00
locking Merge branch 'linus' into timers/core 2015-05-19 16:12:32 +02:00
power Merge back earlier suspend/hibernate material for v4.1. 2015-04-10 12:01:59 +02:00
printk TTY/Serial patches for 4.1-rc1 2015-04-21 09:33:10 -07:00
rcu timer: Reduce timer migration overhead if disabled 2015-06-19 15:18:28 +02:00
sched timer: Reduce timer migration overhead if disabled 2015-06-19 15:18:28 +02:00
time timer: Reduce timer migration overhead if disabled 2015-06-19 15:18:28 +02:00
trace tracing: Make ftrace_print_array_seq compute buf_len 2015-05-06 23:03:23 -04:00
.gitignore
acct.c acct: check FMODE_CAN_WRITE 2015-04-11 22:27:55 -04:00
async.c kernel/async.c: switch to pr_foo() 2014-10-09 22:26:04 -04:00
audit_tree.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-26 17:22:07 -07:00
audit_watch.c VFS: audit: d_backing_inode() annotations 2015-04-15 15:06:55 -04:00
audit.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-26 17:22:07 -07:00
audit.h Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/audit 2015-04-22 14:49:23 -07:00
auditfilter.c Merge branch 'upstream' of git://git.infradead.org/users/pcmoore/audit 2015-02-11 20:07:47 -08:00
auditsc.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-26 17:22:07 -07:00
backtracetest.c
bounds.c page-cgroup: get rid of NR_PCG_FLAGS 2014-08-08 15:57:18 -07:00
capability.c kernel: conditionally support non-root users, groups and capabilities 2015-04-15 16:35:22 -07:00
cgroup_freezer.c
cgroup.c cgroup: remove use of seq_printf return value 2015-04-15 16:35:25 -07:00
compat.c all arches, signal: move restart_block to struct task_struct 2015-02-12 18:54:12 -08:00
configs.c
context_tracking.c context_tracking: Export context_tracking_user_enter/exit 2015-03-09 15:43:00 +01:00
cpu_pm.c
cpu.c Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-04-14 13:36:04 -07:00
cpuset.c kernel, cpuset: remove exception for __GFP_THISNODE 2015-04-14 16:49:03 -07:00
crash_dump.c crash_dump: Make is_kdump_kernel() accessible from modules 2014-08-25 15:42:19 -07:00
cred.c kernel: conditionally support non-root users, groups and capabilities 2015-04-15 16:35:22 -07:00
delayacct.c
dma.c
elfcore.c
exec_domain.c Remove rest of exec domains. 2015-04-12 21:03:31 +02:00
exit.c Remove execution domain support 2015-04-12 20:58:24 +02:00
extable.c ftrace/x86/extable: Add is_ftrace_trampoline() function 2014-11-19 15:25:26 -05:00
fork.c oprofile: reduce mmap_sem hold for mm->exe_file 2015-04-17 09:04:11 -04:00
freezer.c freezer: remove obsolete comments in __thaw_task() 2014-10-21 23:44:20 +02:00
futex_compat.c
futex.c futex: Remove bogus hrtimer_active() check 2015-04-22 17:06:51 +02:00
groups.c kernel: conditionally support non-root users, groups and capabilities 2015-04-15 16:35:22 -07:00
hung_task.c kernel/hung_task.c: change hung_task.c to use for_each_process_thread() 2015-04-15 16:35:22 -07:00
irq_work.c percpu: Convert remaining __get_cpu_var uses in 3.18-rcX 2014-10-29 11:18:18 -04:00
jump_label.c
kallsyms.c kernel/kallsyms.c: use __seq_open_private() 2014-10-14 02:18:16 +02:00
kcmp.c kcmp: fix standard comparison bug 2014-09-10 15:42:12 -07:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks locking/mcs: Better differentiate between MCS variants 2015-01-14 15:07:32 +01:00
Kconfig.preempt
kexec.c kexec: allocate the kexec control page with KEXEC_CONTROL_MEMORY_GFP 2015-04-23 16:52:01 +02:00
kmod.c usermodehelper: kill the kmod_thread_locker logic 2014-12-10 17:41:17 -08:00
kprobes.c kprobes: makes kprobes/enabled works correctly for optimized kprobes. 2015-02-13 21:21:42 -08:00
ksysfs.c
kthread.c kernel/kthread.c: partial revert of 81c98869fa ("kthread: ensure locality of task_struct allocations") 2014-10-09 22:25:51 -04:00
latencytop.c
Makefile modsign: change default key details 2015-04-30 09:35:41 -07:00
module_signing.c
module-internal.h
module.c Quentin opened a can of worms by adding extable entry checking to modpost, 2015-04-22 09:49:24 -07:00
notifier.c rcu: Make SRCU optional by using CONFIG_SRCU 2015-01-06 11:04:29 -08:00
nsproxy.c bury struct proc_ns in fs/proc 2014-12-04 14:34:54 -05:00
padata.c padata: use %*pb[l] to print bitmaps including cpumasks and nodemasks 2015-02-13 21:21:38 -08:00
panic.c livepatch: kernel: add TAINT_LIVEPATCH 2014-12-22 15:40:48 +01:00
params.c params: handle quotes properly for values not of form foo="bar". 2015-04-15 13:31:23 +09:30
pid_namespace.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2014-12-16 15:53:03 -08:00
pid.c fork: report pid reservation failure properly 2015-04-17 09:04:06 -04:00
profile.c profile: use %*pb[l] to print bitmaps including cpumasks and nodemasks 2015-02-13 21:21:38 -08:00
ptrace.c ptrace: ptrace_detach() can no longer race with SIGKILL 2015-04-17 09:04:06 -04:00
range.c kernel: avoid overflow in cmp_range 2015-01-17 10:02:23 +13:00
reboot.c kernel/reboot.c: add orderly_reboot for graceful reboot 2015-04-15 16:35:23 -07:00
relay.c VFS: kernel/: d_inode() annotations 2015-04-15 15:06:55 -04:00
resource.c kernel/resource.c: remove deprecated __check_region() and friends 2015-04-15 16:35:22 -07:00
seccomp.c seccomp: cap SECCOMP_RET_ERRNO data to MAX_ERRNO 2015-02-17 14:34:55 -08:00
signal.c signal: remove warning about using SI_TKILL in rt_[tg]sigqueueinfo 2015-04-17 09:04:06 -04:00
smp.c smp: Fix error case handling in smp_call_function_*() 2015-04-19 13:19:23 -07:00
smpboot.c smpboot: Add common code for notification from dying CPU 2015-03-11 13:20:25 -07:00
smpboot.h
softirq.c Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-02-09 15:24:03 -08:00
stacktrace.c stacktrace: introduce snprint_stack_trace for buffer output 2014-12-13 12:42:48 -08:00
stop_machine.c
sys_ni.c kernel: conditionally support non-root users, groups and capabilities 2015-04-15 16:35:22 -07:00
sys.c prctl: avoid using mmap_sem for exe_file serialization 2015-04-17 09:04:07 -04:00
sysctl_binary.c kernel: add panic_on_warn 2014-12-10 17:41:10 -08:00
sysctl.c timer: Reduce timer migration overhead if disabled 2015-06-19 15:18:28 +02:00
system_certificates.S
system_keyring.c
task_work.c
taskstats.c netlink: make nlmsg_end() and genlmsg_end() void 2015-01-18 01:03:45 -05:00
test_kprobes.c kernel/test_kprobes.c: use current logging functions 2014-08-08 15:57:18 -07:00
torture.c torture: Address race in module cleanup 2014-09-16 13:41:06 -07:00
tracepoint.c
tsacct.c
uid16.c groups: Consolidate the setgroups permission checks 2014-12-05 17:19:27 -06:00
up.c
user_namespace.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2014-12-17 12:31:40 -08:00
user-return-notifier.c scheduler: Replace __get_cpu_var with this_cpu_ptr 2014-08-26 13:45:45 -04:00
user.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2014-12-17 12:31:40 -08:00
utsname_sysctl.c
utsname.c copy address of proc_ns_ops into ns_common 2014-12-04 14:34:47 -05:00
watchdog.c watchdog: Fix merge 'conflict' 2015-05-18 10:08:29 -07:00
workqueue_internal.h
workqueue.c workqueue: Reorder sysfs code 2015-04-06 11:16:04 -04:00