linux/arch/x86/kernel/cpu
Stephane Eranian 90151c35b1 perf_events: Fix event scheduling issues introduced by transactional API
The transactional API patch between the generic and model-specific
code introduced several important bugs with event scheduling, at
least on X86. If you had pinned events, e.g., watchdog,  and were
over-committing the PMU, you would get bogus counts. The bug was
showing up on Intel CPU because events would move around more
often that on AMD. But the problem also existed on AMD, though
harder to expose.

The issues were:

 - group_sched_in() was missing a cancel_txn() in the error path

 - cpuc->n_added was not properly maintained, leading to missing
   actions in hw_perf_enable(), i.e., n_running being 0. You cannot
   update n_added until you know the transaction has succeeded. In
   case of failed transaction n_added was not adjusted back.

 - in case of failed transactions, event_sched_out() was called
   and eventually invoked x86_disable_event() to touch the HW reg.
   But with transactions, on X86, event_sched_in() does not touch
   HW registers, it simply collects events into a list. Thus, you
   could end up calling x86_disable_event() on a counter which
   did not correspond to the current event when idx != -1.

The patch modifies the generic and X86 code to avoid all those problems.

First, we keep track of the number of events added last. In case the
transaction fails, we substract them from n_added. This approach is
necessary (as opposed to delaying updates to n_added) because not all
event updates use the transaction API, e.g., single events.

Second, we encapsulate the event_sched_in() and event_sched_out() in
group_sched_in() inside the transaction. That makes the operations
symmetrical and you can also detect that you are inside a transaction
and skip the HW reg access by checking cpuc->group_flag.

With this patch, you can now overcommit the PMU even with pinned
system-wide events present and still get valid counts.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1274796225.5882.1389.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-31 08:46:10 +02:00
..
cpufreq x86, k8: Fix section mismatch for powernowk8_exit() 2010-05-25 15:42:21 -07:00
mcheck Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6 2010-05-28 14:42:18 -07:00
mtrr include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
.gitignore Update .gitignore files for generated targets 2008-10-20 11:24:31 -07:00
addon_cpuid_features.c x86, cpu: Make APERF/MPERF a normal table-driven flag 2010-05-03 15:49:31 -07:00
amd.c x86, amd: Get multi-node CPU info from NodeId MSR instead of PCI config space 2009-12-16 15:06:23 -08:00
bugs_64.c x86/cpu: Clean up various files a bit 2009-07-11 11:24:09 +02:00
bugs.c x86: Avoid check hlt for newer cpus 2010-05-07 15:31:17 -07:00
centaur.c x86, cpu: mv display_cacheinfo -> cpu_detect_cache_sizes 2009-11-23 11:59:53 -08:00
cmpxchg.c x86: move cmpxchg fallbacks to a generic place 2008-08-18 16:05:47 +02:00
common.c numa: x86_64: use generic percpu var numa_node_id() implementation 2010-05-27 09:12:57 -07:00
cpu.h x86, cpu: mv display_cacheinfo -> cpu_detect_cache_sizes 2009-11-23 11:59:53 -08:00
cyrix.c x86, cpu: mv display_cacheinfo -> cpu_detect_cache_sizes 2009-11-23 11:59:53 -08:00
hypervisor.c x86, hypervisor: add missing <linux/module.h> 2010-05-09 22:46:54 -07:00
intel_cacheinfo.c x86, cacheinfo: Turn off L3 cache index disable feature in virtualized environments 2010-05-14 11:53:01 -07:00
intel.c Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2010-05-18 08:49:13 -07:00
Makefile x86: Detect running on a Microsoft HyperV system 2010-05-06 18:24:15 -07:00
mkcapflags.pl x86: generate names for /proc/cpuinfo from <asm/cpufeature.h> 2008-08-27 19:23:22 -07:00
mshyperv.c x86, hypervisor: add missing <linux/module.h> 2010-05-09 22:46:54 -07:00
perf_event_amd.c perf, x86: Fix __initconst vs const 2010-04-02 19:52:05 +02:00
perf_event_intel_ds.c perf, x86: Improve the PEBS ABI 2010-05-07 11:31:02 +02:00
perf_event_intel_lbr.c perf, x86: Clean up debugctlmsr bit definitions 2010-03-26 09:41:03 +01:00
perf_event_intel.c perf, x86: Improve the PEBS ABI 2010-05-07 11:31:02 +02:00
perf_event_p4.c perf, x86: P4_pmu_schedule_events -- use smp_processor_id instead of raw_ 2010-05-19 09:41:05 +02:00
perf_event_p6.c perf, x86: Fix __initconst vs const 2010-04-02 19:52:05 +02:00
perf_event.c perf_events: Fix event scheduling issues introduced by transactional API 2010-05-31 08:46:10 +02:00
perfctr-watchdog.c perf, x86: rename macro in ARCH_PERFMON_EVENTSEL_ENABLE 2010-03-01 14:21:23 +01:00
powerflags.c x86: generate names for /proc/cpuinfo from <asm/cpufeature.h> 2008-08-27 19:23:22 -07:00
proc.c Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2009-09-14 07:57:32 -07:00
sched.c sched: x86: Name old_perf in a unique way 2009-09-16 11:21:07 +02:00
transmeta.c x86, cpu: mv display_cacheinfo -> cpu_detect_cache_sizes 2009-11-23 11:59:53 -08:00
umc.c x86: move various CPU initialization objects into .cpuinit.rodata 2009-03-12 13:13:07 +01:00
vmware.c x86, hypervisor: Export the x86_hyper* symbols 2010-05-09 01:10:34 -07:00