linux/kernel/sched
Peter Zijlstra (Intel) 86038c5ea8 perf: Avoid horrible stack usage
Both Linus (most recent) and Steve (a while ago) reported that perf
related callbacks have massive stack bloat.

The problem is that software events need a pt_regs in order to
properly report the event location and unwind stack. And because we
could not assume one was present we allocated one on stack and filled
it with minimal bits required for operation.

Now, pt_regs is quite large, so this is undesirable. Furthermore it
turns out that most sites actually have a pt_regs pointer available,
making this even more onerous, as the stack space is pointless waste.

This patch addresses the problem by observing that software events
have well defined nesting semantics, therefore we can use static
per-cpu storage instead of on-stack.

Linus made the further observation that all but the scheduler callers
of perf_sw_event() have a pt_regs available, so we change the regular
perf_sw_event() to require a valid pt_regs (where it used to be
optional) and add perf_sw_event_sched() for the scheduler.

We have a scheduler specific call instead of a more generic _noregs()
like construct because we can assume non-recursion from the scheduler
and thereby simplify the code further (_noregs would have to put the
recursion context call inline in order to assertain which __perf_regs
element to use).

One last note on the implementation of perf_trace_buf_prepare(); we
allow .regs = NULL for those cases where we already have a pt_regs
pointer available and do not need another.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Javi Merino <javi.merino@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Petr Mladek <pmladek@suse.cz>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Cc: Vaibhav Nagarnaik <vnagarnaik@google.com>
Link: http://lkml.kernel.org/r/20141216115041.GW3337@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-14 15:11:45 +01:00
..
auto_group.c sched: Change autogroup_move_group() to use for_each_thread() 2014-08-20 09:47:18 +02:00
auto_group.h Revert "sched/autogroup: Fix crash on reboot when autogroup is disabled" 2012-12-11 10:23:45 +01:00
clock.c time: Replace __get_cpu_var uses 2014-08-26 13:45:44 -04:00
completion.c sched/completion: Document when to use wait_for_completion_io_*() 2014-11-16 10:58:54 +01:00
core.c perf: Avoid horrible stack usage 2015-01-14 15:11:45 +01:00
cpuacct.c cgroup: rename cgroup_subsys->base_cftypes to ->legacy_cftypes 2014-07-15 11:05:09 -04:00
cpuacct.h sched/cpuacct: Initialize root cpuacct earlier 2013-04-10 13:54:20 +02:00
cpudeadline.c sched/deadline: Fix inter- exclusive cpusets migrations 2014-09-24 14:46:57 +02:00
cpudeadline.h sched/deadline: Remove unnecessary definitions in cpudeadline.h 2014-11-16 10:59:00 +01:00
cpupri.c Merge commit '3cf2f34' into sched/core, to fix build error 2014-06-12 13:46:37 +02:00
cpupri.h sched/cpupri: Remove unnecessary definitions in cpupri.h 2014-11-16 10:58:59 +01:00
cputime.c sched, time: Fix build error with 64 bit cputime_t on 32 bit systems 2014-10-03 05:46:55 +02:00
deadline.c sched/deadline: Avoid double-accounting in case of missed deadlines 2015-01-09 11:18:57 +01:00
debug.c sched: Refactor task_struct to use numa_faults instead of numa_* pointers 2014-11-04 07:17:57 +01:00
fair.c sched/fair: Fix RCU stall upon -ENOMEM in sched_create_group() 2015-01-09 11:19:00 +01:00
features.h sched: Rename capacity related flags 2014-06-05 11:52:32 +02:00
idle_task.c sched: Provide update_curr callbacks for stop/idle scheduling classes 2014-11-23 14:14:40 -08:00
idle.c sched: Let the scheduler see CPU idle states 2014-09-24 14:46:58 +02:00
Makefile sched/idle: Move cpu/idle.c to sched/idle.c 2014-02-11 09:58:30 +01:00
proc.c cpuidle: menu: Lookup CPU runqueues less 2014-08-06 21:17:45 +02:00
rt.c sched: Move p->nr_cpus_allowed check to select_task_rq() 2014-11-16 10:58:55 +01:00
sched.h Merge branch 'sched/urgent' into sched/core, to pick up fixes before applying more changes 2014-11-16 10:50:25 +01:00
stats.c kernel: audit/fix non-modular users of module_init in core code 2014-04-03 16:21:07 -07:00
stats.h sched: Micro-optimize by dropping unnecessary task_rq() calls 2013-09-25 13:51:06 +02:00
stop_task.c sched: Provide update_curr callbacks for stop/idle scheduling classes 2014-11-23 14:14:40 -08:00
wait.c sched/wait: Fix a kthread race with wait_woken() 2014-11-04 07:17:44 +01:00