linux/arch/arm64/kernel
Michael O'Farrell 9d2dcc8fc6 arm64: perf: Add cap_user_time aarch64
It is useful to get the running time of a thread.  Doing so in an
efficient manner can be important for performance of user applications.
Avoiding system calls in `clock_gettime` when handling
CLOCK_THREAD_CPUTIME_ID is important.  Other clocks are handled in the
VDSO, but CLOCK_THREAD_CPUTIME_ID falls back on the system call.

CLOCK_THREAD_CPUTIME_ID is not handled in the VDSO since it would have
costs associated with maintaining updated user space accessible time
offsets.  These offsets have to be updated everytime the a thread is
scheduled/descheduled.  However, for programs regularly checking the
running time of a thread, this is a performance improvement.

This patch takes a middle ground, and adds support for cap_user_time an
optional feature of the perf_event API.  This way costs are only
incurred when the perf_event api is enabled.  This is done the same way
as it is in x86.

Ultimately this allows calculating the thread running time in userspace
on aarch64 as follows (adapted from perf_event_open manpage):

u32 seq, time_mult, time_shift;
u64 running, count, time_offset, quot, rem, delta;
struct perf_event_mmap_page *pc;
pc = buf;  // buf is the perf event mmaped page as documented in the API.

if (pc->cap_usr_time) {
    do {
        seq = pc->lock;
        barrier();
        running = pc->time_running;

        count = readCNTVCT_EL0();  // Read ARM hardware clock.
        time_offset = pc->time_offset;
        time_mult   = pc->time_mult;
        time_shift  = pc->time_shift;

        barrier();
    } while (pc->lock != seq);

    quot = (count >> time_shift);
    rem = count & (((u64)1 << time_shift) - 1);
    delta = time_offset + quot * time_mult +
            ((rem * time_mult) >> time_shift);

    running += delta;
    // running now has the current nanosecond level thread time.
}

Summary of changes in the patch:

For aarch64 systems, make arch_perf_update_userpage update the timing
information stored in the perf_event page.  Requiring the following
calculations:
  - Calculate the appropriate time_mult, and time_shift factors to convert
    ticks to nano seconds for the current clock frequency.
  - Adjust the mult and shift factors to avoid shift factors of 32 bits.
    (possibly unnecessary)
  - The time_offset userspace should apply when doing calculations:
    negative the current sched time (now), because time_running and
    time_enabled fields of the perf_event page have just been updated.
Toggle bits to appropriate values:
  - Enable cap_user_time

Signed-off-by: Michael O'Farrell <micpof@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-31 10:14:00 +01:00
..
probes License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
vdso arm64 updates for 4.15 2017-11-15 10:56:56 -08:00
.gitignore
acpi_numa.c arm64: numa: rework ACPI NUMA initialization 2018-07-09 18:21:40 +01:00
acpi_parking_protocol.c arm64: fix endianness annotation in acpi_parking_protocol.c 2017-06-29 11:33:15 +01:00
acpi.c arm64: acpi: fix alignment fault in accessing ACPI 2018-07-23 15:34:12 +01:00
alternative.c arm64: Avoid flush_icache_range() in alternatives patching code 2018-06-27 18:21:53 +01:00
arm64ksyms.c arm64: export tishift functions to modules 2018-05-21 19:00:48 +01:00
armv8_deprecated.c arm64: kill config_sctlr_el1() 2018-07-12 14:40:38 +01:00
asm-offsets.c arm64: KVM: Handle guest's ARCH_WORKAROUND_2 requests 2018-05-31 18:00:57 +01:00
cacheinfo.c arm64: Add support for ACPI based firmware tables 2018-05-17 17:28:09 +01:00
cpu_errata.c arm64: kill config_sctlr_el1() 2018-07-12 14:40:38 +01:00
cpu_ops.c arm64: cpu_ops: Add missing 'const' qualifiers 2017-12-01 13:05:08 +00:00
cpu-reset.h arm64: kexec: always reset to EL2 if present 2018-07-04 18:34:24 +01:00
cpu-reset.S arm64: idmap: Use "awx" flags for .idmap.text .pushsection directives 2018-02-06 22:53:27 +00:00
cpufeature.c arm64: use PSR_AA32 definitions 2018-07-05 17:24:14 +01:00
cpuidle.c ARM64 / cpuidle: Use new cpuidle macro for entering retention state 2018-01-02 13:50:34 +00:00
cpuinfo.c arm64: Expose Arm v8.4 features 2018-03-19 18:14:27 +00:00
crash_dump.c arm64: kdump: provide /proc/vmcore file 2017-04-05 18:31:38 +01:00
debug-monitors.c arm64: Use arm64_force_sig_info instead of force_sig_info 2018-03-06 18:52:32 +00:00
efi-entry.S arm64: Add software workaround for Falkor erratum 1041 2018-02-06 22:53:13 +00:00
efi-header.S arm64: efi: split Image code and data into separate PE/COFF sections 2017-04-04 17:50:59 +01:00
efi-rt-wrapper.S efi/arm64: Check whether x18 is preserved by runtime services calls 2018-03-09 08:58:22 +01:00
efi.c efi/arm64: Check whether x18 is preserved by runtime services calls 2018-03-09 08:58:22 +01:00
entry-fpsimd.S arm64/sve: Write ZCR_EL1 on context switch only if changed 2018-05-17 18:19:53 +01:00
entry-ftrace.S arm64: Fix static use of function graph 2017-11-03 12:05:23 +00:00
entry.S arm64: Add support for STACKLEAK gcc plugin 2018-07-26 11:36:34 +01:00
fpsimd.c arm64: move sve_user_{enable,disable} to <asm/fpsimd.h> 2018-07-12 14:40:39 +01:00
ftrace.c arm64: ftrace: emit ftrace-mod.o contents through code 2017-12-01 13:04:59 +00:00
head.S arm64/kvm: Prohibit guest LOR accesses 2018-02-26 10:48:01 +01:00
hibernate-asm.S arm64: assembler: Change order of macro arguments in phys_to_ttbr 2018-02-06 22:53:21 +00:00
hibernate.c arm64: ssbd: Restore mitigation status on CPU resume 2018-05-31 17:35:19 +01:00
hw_breakpoint.c compat: Move compat_timespec/ timeval to compat_time.h 2018-04-19 13:29:54 +02:00
hyp-stub.S arm64: hyp-stub: Zero x0 on successful stub handling 2017-04-09 07:49:35 -07:00
image.h arm64/efi: Make strrchr() available to the EFI namespace 2018-03-05 13:45:38 -06:00
insn.c arm64: insn: Don't fallback on nosync path for general insn patching 2018-07-05 17:24:48 +01:00
io.c arm64: Avoid aligning normal memory pointers in __memcpy_{to,from}io 2017-10-24 16:23:07 +01:00
irq.c arm64: Add vmap_stack header file 2018-01-13 10:45:03 +00:00
jump_label.c jump_label: Rename JUMP_LABEL_{EN,DIS}ABLE to JUMP_LABEL_{JMP,NOP} 2015-08-03 11:34:12 +02:00
kaslr.c arm64/kernel: kaslr: reduce module randomization range to 4 GB 2018-03-08 13:49:26 +00:00
kgdb.c arm64/debug: Fix registers on sleeping tasks 2018-03-06 18:52:34 +00:00
kuser32.S arm64: Add __NR_* definitions for compat syscalls 2014-07-10 11:02:40 +01:00
machine_kexec.c arm64: kexec: machine_kexec should call __flush_icache_range 2018-07-30 17:58:11 +01:00
Makefile arm64: convert compat wrappers to C 2018-07-12 14:49:48 +01:00
module-plts.c arm64/kernel: rename module_emit_adrp_veneer->module_emit_veneer_for_adrp 2018-04-24 19:07:35 +01:00
module.c arm64: Avoid flush_icache_range() in alternatives patching code 2018-06-27 18:21:53 +01:00
module.lds arm64: ftrace: emit ftrace-mod.o contents through code 2017-12-01 13:04:59 +00:00
paravirt.c arm64: introduce CONFIG_PARAVIRT, PARAVIRT_TIME_ACCOUNTING and pv_time_ops 2015-12-21 14:40:54 +00:00
pci.c PCI: Add a generic weak pcibios_align_resource() 2017-08-02 14:53:16 -05:00
perf_callchain.c arm64: unwind: remove sp from struct stackframe 2017-08-09 14:10:29 +01:00
perf_event.c arm64: perf: Add cap_user_time aarch64 2018-07-31 10:14:00 +01:00
perf_regs.c compat: Move compat_timespec/ timeval to compat_time.h 2018-04-19 13:29:54 +02:00
process.c arm64: Add support for STACKLEAK gcc plugin 2018-07-26 11:36:34 +01:00
psci.c arm64: Use __pa_symbol for kernel symbols 2017-01-12 15:05:39 +00:00
ptrace.c arm64: Add stack information to on_accessible_stack 2018-07-26 11:36:07 +01:00
reloc_test_core.c arm64: fix undefined reference to 'printk' 2018-03-19 18:14:25 +00:00
reloc_test_syms.S arm64: fix undefined reference to 'printk' 2018-03-19 18:14:25 +00:00
relocate_kernel.S arm64: Add software workaround for Falkor erratum 1041 2018-02-06 22:53:13 +00:00
return_address.c arm64: unwind: remove sp from struct stackframe 2017-08-09 14:10:29 +01:00
sdei.c arm64: Add stack information to on_accessible_stack 2018-07-26 11:36:07 +01:00
setup.c arm64: export memblock_reserve()d regions via /proc/iomem 2018-07-23 15:30:32 +01:00
signal32.c arm64: use {COMPAT,}SYSCALL_DEFINE0 for sigreturn 2018-07-12 14:49:48 +01:00
signal.c arm64: use {COMPAT,}SYSCALL_DEFINE0 for sigreturn 2018-07-12 14:49:48 +01:00
sleep.S arm64: idmap: Use "awx" flags for .idmap.text .pushsection directives 2018-02-06 22:53:27 +00:00
smccc-call.S firmware: qcom: scm: Fix interrupted SCM calls 2017-02-03 18:46:33 +00:00
smp_spin_table.c arm64: Use __pa_symbol for kernel symbols 2017-01-12 15:05:39 +00:00
smp.c arm64: numa: rework ACPI NUMA initialization 2018-07-09 18:21:40 +01:00
ssbd.c arm64: ssbd: Add prctl interface for per-thread mitigation 2018-05-31 18:00:52 +01:00
stacktrace.c arm64: Add stack information to on_accessible_stack 2018-07-26 11:36:07 +01:00
suspend.c arm64: ssbd: Restore mitigation status on CPU resume 2018-05-31 17:35:19 +01:00
sys32.c arm64: implement syscall wrappers 2018-07-12 14:49:48 +01:00
sys_compat.c signal: Ensure every siginfo we send has all bits initialized 2018-04-25 10:40:51 -05:00
sys.c arm64: implement syscall wrappers 2018-07-12 14:49:48 +01:00
syscall.c arm64: svc: Ensure hardirq tracing is updated before return 2018-07-30 17:43:39 +01:00
time.c arm64: fix unwind_frame() for filtered out fn for function graph tracing 2018-02-23 13:46:38 +00:00
topology.c arm64: topology: re-introduce numa mask check for scheduler MC selection 2018-07-06 13:18:18 +01:00
trace-events-emulation.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
traps.c arm64: convert raw syscall invocation to C 2018-07-12 14:43:09 +01:00
vdso.c arm64/vdso: Support mremap() for vDSO 2017-08-09 12:16:28 +01:00
vmlinux.lds.S arm64: remove no-op macro VMLINUX_SYMBOL() 2018-05-15 18:14:24 +01:00