Commit Graph

Will Deacon
05492f2fd8 arm64: lse: convert lse alternatives NOP padding to use __nops
The LSE atomics are implemented using alternative code sequences of
different lengths, and explicit NOP padding is used to ensure the
patching works correctly.

This patch converts the bulk of the LSE code over to using the __nops
macro, which makes it slightly clearer what is going on and also
consolidates all of the padding at the end of the various sequences.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 18:12:34 +01:00
Will Deacon
f99a250cb6 arm64: barriers: introduce nops and __nops macros for NOP sequences
NOP sequences tend to get used for padding out alternative sections
and uarch-specific pipeline flushes in errata workarounds.

This patch adds macros for generating these sequences, both as inline
asm blocks and as strings suitable for embedding directly in other
asm blocks.
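
For illustration, the macros could look roughly like this (a sketch;
the exact definitions may differ):

  /* String form, for embedding in other asm blocks. */
  #define __nops(n)   ".rept " #n "\nnop\n.endr\n"

  /* Inline asm block form. */
  #define nops(n)     asm volatile(__nops(n))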

Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 18:12:28 +01:00
Will Deacon
8a71f0c656 arm64: sysreg: replace open-coded mrs_s/msr_s with {read,write}_sysreg_s
Similar to our {read,write}_sysreg accessors for architected, named
system registers, this patch introduces {read,write}_sysreg_s variants
that can take arbitrary sys_reg output, and can therefore access IMPDEF
registers or registers that are unsupported by binutils.
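
For example (the IMPDEF encoding below is made up for illustration):

  u64 val;

  val = read_sysreg(ctr_el0);                    /* named, architected register */
  val = read_sysreg_s(sys_reg(3, 0, 15, 0, 0));  /* arbitrary encoding */
  write_sysreg_s(val, sys_reg(3, 0, 15, 0, 0));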

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 18:12:08 +01:00
Robin Murphy
0e27a7fce6 arm64: Remove shadowed asm-generic headers
We've grown our own versions of bug.h, ftrace.h, pci.h and topology.h,
so generating the generic ones as well is unnecessary and a potential
source of build hiccups. At the very least, having them present has
confused my source-indexing tool, and that simply will not do.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 15:05:08 +01:00
Suzuki K Poulose
116c81f427 arm64: Work around systems with mismatched cache line sizes
Systems with differing CPU i-cache/d-cache line sizes can cause
problems with software cache management when execution migrates
from one CPU to another. Usually, an application reads the cache
size on a CPU and then uses that length to perform cache
operations. However, if it gets migrated to another CPU with a
smaller cache line size, things could go completely wrong. To
prevent such cases, always use the smallest cache line size among
the CPUs. The kernel CPU feature infrastructure already keeps track
of the safe value for all CPUID registers, including CTR. This
patch works around the problem as follows:

For the kernel, dynamically patch the kernel to read the cache size
from the system-wide copy of CTR_EL0.

For applications, trap read accesses to CTR_EL0 (by clearing the
SCTLR.UCT bit) and emulate the mrs instruction to return the
system-wide safe value of CTR_EL0.

For faster access (i.e. to avoid looking up the system-wide value of
CTR_EL0 via read_system_reg), we keep track of a pointer to the
table entry for CTR_EL0 in the CPU feature infrastructure.
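
A rough sketch of the userspace trap path (simplified; the exact
field extraction and names may differ):

  static void ctr_read_handler(unsigned int esr, struct pt_regs *regs)
  {
          int rt = (esr >> 5) & 0x1f;     /* Rt field of the ESR_ELx ISS */

          /* Return the system-wide safe value, not this CPU's copy. */
          regs->regs[rt] = arm64_ftr_reg_ctrel0.sys_val;
          regs->pc += 4;                  /* step over the emulated mrs */
  }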

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 15:03:29 +01:00
Suzuki K Poulose
9dbd5bb25c arm64: Refactor sysinstr exception handling
Right now we trap some of the userspace data cache operations
based on a few errata (ARM 819472, 826319, 827319 and 824069).
We need to trap userspace access to CTR_EL0 if we detect mismatched
cache line sizes. Since both of these traps share the EC, refactor
the handler a little to make it more reader-friendly.
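
The refactoring is along the lines of a small dispatch table (a
sketch; the struct and field names are illustrative):

  struct sys64_hook {
          unsigned int esr_mask;  /* bits of interest in the ESR */
          unsigned int esr_val;   /* expected value of those bits */
          void (*handler)(unsigned int esr, struct pt_regs *regs);
  };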

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 15:03:29 +01:00
Suzuki K Poulose
072f0a6338 arm64: Introduce raw_{d,i}cache_line_size
On systems with mismatched i/d cache min line sizes, we need to use
the smallest size possible across all CPUs. This is done by fetching
the system-wide safe value from the CPU feature infrastructure.
However, some special users (e.g. kexec, hibernate) need the line
size of the current CPU (rather than the system-wide value), either
when the system-wide value may not be accessible or when the caller
is guaranteed to execute without being migrated.
Provide another helper which fetches the cache line size on the
current CPU.
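
The real helpers are assembler macros; in C terms, the per-CPU value
boils down to this (a sketch):

  static inline u32 raw_dcache_line_size(void)
  {
          u32 ctr = read_sysreg(ctr_el0);  /* this CPU's copy, not the safe one */

          return 4 << ((ctr >> 16) & 0xf); /* CTR_EL0.DminLine, in words */
  }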

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: James Morse <james.morse@arm.com>
Reviewed-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 15:03:29 +01:00
Suzuki K Poulose
c831b2ae25 arm64: alternative: Add support for patching adrp instructions
adrp uses a PC-relative address offset to the page (of 4K size) of
a symbol. If it appears in alternative code that gets patched in, we
should adjust the offset to reflect the address the code will run
from. This patch adds support for fixing up the offset of adrp
instructions.
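
The fixup amounts to recomputing the page offset relative to the new
PC (a sketch, using the helpers added by the companion insn patch;
orig_pc is the address the instruction was assembled for, new_pc the
address it will run from):

  u64 target_page = (orig_pc & ~0xfffUL) + aarch64_insn_adrp_get_offset(insn);
  s64 new_offset  = target_page - (new_pc & ~0xfffUL);

  insn = aarch64_insn_adrp_set_offset(insn, new_offset);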

Cc: Will Deacon <will.deacon@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 15:03:28 +01:00
Suzuki K Poulose
46084bc253 arm64: insn: Add helpers for adrp offsets
Adds helpers for decoding/encoding the PC-relative addresses for adrp.
This will be used for handling dynamic patching of 'adrp' instructions
in alternative code patching.
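
For reference, decoding boils down to extracting the split 21-bit
page count and scaling it (a sketch; the helper name is illustrative):

  static s64 adrp_offset(u32 insn)
  {
          u64 immlo = (insn >> 29) & 0x3;      /* bits [30:29] */
          u64 immhi = (insn >> 5) & 0x7ffff;   /* bits [23:5]  */
          s64 imm21 = (s64)(((immhi << 2) | immlo) << 43) >> 43;  /* sign-extend */

          return imm21 << 12;                  /* page-granular byte offset */
  }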

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 15:03:28 +01:00
Suzuki K Poulose
baa763b565 arm64: alternative: Disallow patching instructions using literals
The alternative code patching doesn't check whether the replaced
instruction uses a PC-relative literal. This could cause silent
corruption in the instruction stream, as the instruction will be
executed from a different address than the one it was compiled for.
Catch all such cases.
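
Conceptually (the helper name is illustrative):

  /* Reject unhandled instructions that encode a PC-relative literal;
   * branches and adrp are fixed up separately. */
  if (aarch64_insn_uses_literal(insn))
          BUG();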

Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Suggested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 15:03:28 +01:00
Suzuki K Poulose
c47a1900ad arm64: Rearrange CPU errata workaround checks
Right now we run through the workaround checks on a CPU
from __cpuinfo_store_cpu. There are some problems with that:

1) We initialise the system-wide CPU feature registers only after the
boot CPU updates its cpuinfo. Now, if a workaround depends on the
variance of a CPU ID feature (e.g. a check for cache line size mismatch),
we have no way of performing it cleanly for the boot CPU.

2) It is out of place, invoked from __cpuinfo_store_cpu() in cpuinfo.c,
which is not an obvious place for it.

This patch rearranges the CPU-specific capability (aka workaround)
checks; see the sketch after this list.

1) At the moment we use verify_local_cpu_capabilities() to check if a new
CPU has all the system-advertised features. Use this for the secondary CPUs
to perform the workaround check. For that we rename
  verify_local_cpu_capabilities() => check_local_cpu_capabilities()
which:

   If the system-wide capabilities haven't been initialised (i.e. the CPU
   is brought up at boot), updates the system-wide detected workarounds.

   Otherwise (i.e. for a CPU hotplugged in later), verifies that this CPU
   conforms to the system-wide capabilities.

2) The boot CPU updates the workarounds from smp_prepare_boot_cpu(), after
we have initialised the system-wide CPU feature values.
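
A sketch of the resulting flow (names as used in this series;
internals simplified):

  void check_local_cpu_capabilities(void)
  {
          if (!sys_caps_initialised)
                  update_cpu_errata_workarounds();   /* boot-time CPU: detect */
          else
                  verify_local_cpu_capabilities();   /* hotplugged CPU: verify */
  }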

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 15:03:28 +01:00
Suzuki K Poulose
89ba26458b arm64: Use consistent naming for errata handling
This is a cosmetic change, renaming the functions dealing with the
errata workarounds to make their naming more consistent.

1) check_local_cpu_errata() => update_cpu_errata_workarounds()
check_local_cpu_errata() actually updates the system's errata
workarounds, so rename it to reflect that.

2) verify_local_cpu_errata() => verify_local_cpu_errata_workarounds()
Use errata_workarounds instead of _errata.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 15:03:28 +01:00
Suzuki K Poulose
ee7bc638f1 arm64: Set the safe value for L1 icache policy
Right now we use 0 as the safe value for CTR_EL0:L1Ip, which is
not a defined policy at the moment. The safer value for L1Ip is
the weakest of the policies, which happens to be AIVIVT. While at it,
fix the comment about safe_val.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 15:03:28 +01:00
Zhen Lei
7ba5f605f3 arm64/numa: remove the limitation that cpu0 must bind to node0
1. Remove the old binding code.
2. Read the nid of cpu0 from the dts.
3. Fall back the nid of cpu0 to 0 when numa=off is set in the boot
arguments.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 14:59:09 +01:00
Zhen Lei
df7ffa34cc arm64/numa: remove some useless code
At the point where the deleted code runs, only cpu0's bit is set in
cpu_possible_mask, so only set_cpu_numa_node(0, NUMA_NO_NODE) is
executed, and map_cpu_to_node(0, 0) is called soon afterwards. This
code can therefore be safely removed.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 14:59:09 +01:00
Zhen Lei
7af3a0a992 arm64/numa: support HAVE_SETUP_PER_CPU_AREA
Allocate each percpu area from its local NUMA node. Without this
patch, all percpu areas are allocated from the node to which cpu0
belongs.
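
A sketch of the node-aware allocation hook this enables (simplified;
early_cpu_to_node() is assumed to be the per-cpu node lookup added by
this series):

  static void * __init pcpu_fc_alloc(unsigned int cpu, size_t size,
                                     size_t align)
  {
          int nid = early_cpu_to_node(cpu);

          /* Allocate the percpu chunk from the CPU's own node. */
          return memblock_virt_alloc_try_nid(size, align,
                          __pa(MAX_DMA_ADDRESS),
                          MEMBLOCK_ALLOC_ACCESSIBLE, nid);
  }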

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 14:59:09 +01:00
Kefeng Wang
f11c7bacd5 arm64: numa: Use pr_fmt()
Use pr_fmt to prefix kernel output, and remove the duplicated
"NUMA turned off" message.
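
The idiom (the "NUMA: " prefix is illustrative; pr_fmt must be
defined before the printk helpers are pulled in):

  #define pr_fmt(fmt) "NUMA: " fmt

  /* Every pr_*() call in the file now gets the prefix automatically: */
  pr_info("No NUMA configuration found\n");
  /* -> "NUMA: No NUMA configuration found" */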

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 14:59:09 +01:00
Kefeng Wang
ad02180515 of_numa: Use pr_fmt()
Use pr_fmt to prefix kernel output.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 14:59:09 +01:00
Kefeng Wang
837dae1b43 of_numa: Use of_get_next_parent to simplify code
Use of_get_next_parent() instead of open-coding it.
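
The helper replaces the usual three-line dance:

  /* Open-coded: take the parent, drop the child's reference. */
  parent = of_get_parent(node);
  of_node_put(node);
  node = parent;

  /* Equivalent: */
  node = of_get_next_parent(node);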

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 14:59:08 +01:00
Zhen Lei
794224ea56 arm64/numa: avoid inconsistent information to be printed
numa_init() may return an error because of a NUMA configuration
error, in which case printing "No NUMA configuration found" is
inaccurate. Instead, the specific configuration error should be
printed immediately by the branch that detects it.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 14:59:08 +01:00
Zhen Lei
9787ed6e5c of/numa: remove a duplicated warning
This warning has already been printed by of_numa_parse_cpu_nodes().

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 14:59:08 +01:00
Zhen Lei
571a588fec of/numa: add nid check for memory block
If the NUMA id configured in a memory@ devicetree node is greater
than MAX_NUMNODES, we should report a warning. We already do this for
the cpu and distance-map dt nodes; this patch makes the memory nodes
consistent with them.

Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 14:59:08 +01:00
Zhen Lei
84b14256c1 of/numa: fix a memory@ node can only contains one memory block
A normal memory@ devicetree node can describe more than one memory
block in its reg property.

Because we don't know in advance how many memory blocks it contains,
we try from index 0 and increment by 1 until an error is returned
(the end).
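
A sketch of the probing loop:

  struct resource rsrc;
  int i = 0;

  /* Keep pulling reg entries until of_address_to_resource() fails. */
  while (of_address_to_resource(node, i++, &rsrc) == 0)
          numa_add_memblk(nid, rsrc.start, rsrc.end + 1);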

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 14:59:08 +01:00
Zhen Lei
16a82f06c4 of/numa: remove a duplicated pr_debug information
This information is already printed by the subfunction
numa_add_memblk(). The two messages are not identical, but very
similar.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 14:59:08 +01:00
Mark Rutland
48538b5863 drivers/perf: arm_pmu: expose a cpumask in sysfs
In systems with heterogeneous CPUs, there are multiple logical CPU PMUs,
each of which covers a subset of CPUs in the system. In some cases
userspace needs to know which CPUs a given logical PMU covers, so we'd
like to expose a cpumask under sysfs, similar to what is done for uncore
PMUs.

Unfortunately, prior to commit 00e727bb38 ("perf stat: Balance
opening and reading events"), perf stat only correctly handled a cpumask
holding a single CPU, and only when profiling in system-wide mode. In
other cases, the presence of a cpumask file could cause perf stat to
behave erratically.

Thus, exposing a cpumask file would break older perf binaries in cases
where they would otherwise work.

To avoid this issue while still providing userspace with the information
it needs, this patch exposes a differently-named file (cpus) under
sysfs. New tools can look for this and operate correctly, while older
tools will not be adversely affected by its presence.
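
A sketch of the new attribute (helper names assumed from the arm_pmu
code):

  static ssize_t armpmu_cpumask_show(struct device *dev,
                                     struct device_attribute *attr,
                                     char *buf)
  {
          struct pmu *pmu = dev_get_drvdata(dev);
          struct arm_pmu *armpmu = to_arm_pmu(pmu);

          return cpumap_print_to_pagebuf(true, buf, &armpmu->supported_cpus);
  }
  static DEVICE_ATTR(cpus, S_IRUGO, armpmu_cpumask_show, NULL);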

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 14:51:51 +01:00
Mark Rutland
1589680da6 drivers/perf: arm_pmu: only use common attr_groups
Now that the 32-bit and 64-bit perf backends use the common groups
directly, remove the fallback and no longer allow the groups array to be
overridden.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 14:51:51 +01:00
Mark Rutland
9268c5dafa arm: perf: move to common attr_group fields
By using a common attr_groups array, the common arm_pmu code can set up
common files (e.g. cpumask) for us in subsequent patches.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 14:51:51 +01:00
Mark Rutland
569de9026c arm64: perf: move to common attr_group fields
By using a common attr_groups array, the common arm_pmu code can set up
common files (e.g. cpumask) for us in subsequent patches.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 14:51:51 +01:00
Mark Rutland
86cdd72af9 drivers/perf: arm_pmu: add common attr group fields
In preparation for adding common attribute groups, add an array of
attribute group pointers to arm_pmu, which will be used if the
backend hasn't already set pmu::attr_groups.

Subsequent patches will move backends over to using these, before adding
common fields.
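
The fallback amounts to something like this (a sketch):

  /* Use the common groups unless the backend installed its own. */
  if (!pmu->pmu.attr_groups)
          pmu->pmu.attr_groups = pmu->attr_groups;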

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 14:51:51 +01:00
Mark Rutland
d3ea42aad5 arm64: simplify contextidr_thread_switch
When CONFIG_PID_IN_CONTEXTIDR is not selected, we use an empty stub
definition of contextidr_thread_switch(). As everything we rely upon
exists regardless of CONFIG_PID_IN_CONTEXTIDR, we don't strictly require
an empty stub.

By using IS_ENABLED() rather than ifdeffery, we avoid duplication, and
get compiler coverage on all the code even when CONFIG_PID_IN_CONTEXTIDR
is not selected and the code is optimised away.
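
The result looks roughly like this (a sketch of the IS_ENABLED()
form):

  static inline void contextidr_thread_switch(struct task_struct *next)
  {
          if (!IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR))
                  return;

          write_sysreg(task_pid_nr(next), contextidr_el1);
          isb();
  }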

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 11:43:50 +01:00
Mark Rutland
adf7589997 arm64: simplify sysreg manipulation
A while back we added {read,write}_sysreg accessors to handle accesses
to system registers, without the usual boilerplate asm volatile,
temporary variable, etc.

This patch makes use of these across arm64 to make code shorter and
clearer. For sequences with a trailing ISB, the existing isb() macro is
also used so that asm blocks can be removed entirely.

A few uses of inline assembly for msr/mrs are left as-is. Those
manipulating sp_el0 for the current thread_info value have special
clobber requirements.
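
A representative before/after (register choice is illustrative):

  /* Before: boilerplate asm with a temporary variable. */
  unsigned long tcr;
  asm volatile("mrs %0, tcr_el1" : "=r" (tcr));

  /* After: */
  unsigned long tcr = read_sysreg(tcr_el1);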

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 11:43:50 +01:00
Mark Rutland
1f3d8699be arm64/kvm: use {read,write}_sysreg()
A while back we added {read,write}_sysreg accessors to handle accesses
to system registers, without the usual boilerplate asm volatile,
temporary variable, etc.

This patch makes use of these in the arm64 KVM code to make the code
shorter and clearer.

At the same time, a comment style violation next to a system register
access is fixed up in reset_pmcr, and comments describing whether
operations are reads or writes are removed as this is now painfully
obvious.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 11:42:27 +01:00
Mark Rutland
d0a69d9f38 arm64: dcc: simplify accessors
A while back we added {read,write}_sysreg accessors to handle accesses
to system registers, without the usual boilerplate asm volatile,
temporary variable, etc.

This patch makes use of these in the arm64 DCC accessors to make the
code shorter and clearer.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 11:41:13 +01:00
Mark Rutland
cd5f22d796 arm64: arch_timer: simplify accessors
A while back we added {read,write}_sysreg accessors to handle accesses
to system registers, without the usual boilerplate asm volatile,
temporary variable, etc.

This patch makes use of these in the arm64 arch timer accessors to make
the code shorter and clearer.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 11:41:13 +01:00
Mark Rutland
7aff4a2dd3 arm64: sysreg: allow write_sysreg to use XZR
Currently write_sysreg has to allocate a temporary register to write
zero to a system register, which is unfortunate given that the MSR
instruction accepts XZR as an operand.

Allow XZR to be used when appropriate by fiddling with the assembly
constraints.
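
A sketch of the constraint change: "rZ" lets the compiler substitute
XZR when the value is a constant zero, and "%x0" prints the operand
as an X register:

  #define write_sysreg(v, r) do {                            \
          u64 __val = (u64)(v);                              \
          asm volatile("msr " __stringify(r) ", %x0"         \
                       : : "rZ" (__val));                    \
  } while (0)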

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 11:40:39 +01:00
Robin Murphy
ee5e41b5f2 arm64/io: Allow I/O writes to use {W,X}ZR
When zeroing an I/O location, the current accessors are forced to
allocate a temporary register to store the zero for the write. By
tweaking the assembly constraints, we can allow the compiler to use
the zero register directly in such cases, and save some juggling.
Compiling a representative kernel configuration with GCC 6 shows
that 2.3KB worth of code can be wasted just on that!

  text     data    bss      dec      hex     filename
 13316776 3248256 18176769 34741801 2121e29 vmlinux.o.new
 13319140 3248256 18176769 34744165 2122765 vmlinux.o.old

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-08 11:04:13 +01:00
Catalin Marinas
efd9e03fac arm64: Use static keys for CPU features
This patch adds static keys transparently for all the cpu_hwcaps
features by implementing an array of default-false static keys and
enabling them when detected. The cpus_have_cap() check uses the static
keys if the feature being checked is a constant, otherwise the compiler
generates the bitmap test.
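
A sketch of that check (simplified):

  static inline bool cpus_have_cap(unsigned int num)
  {
          if (num >= ARM64_NCAPS)
                  return false;
          if (__builtin_constant_p(num))          /* constant: static key */
                  return static_branch_unlikely(&cpu_hwcap_keys[num]);

          return test_bit(num, cpu_hwcaps);       /* otherwise: bitmap test */
  }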

Because of the early call to static_branch_enable() via
check_local_cpu_errata() -> update_cpu_capabilities(), the jump labels
are initialised in cpuinfo_store_boot_cpu().

Cc: Will Deacon <will.deacon@arm.com>
Cc: Suzuki K. Poulose <Suzuki.Poulose@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-07 09:41:42 +01:00
Catalin Marinas
ef0da55a84 jump_labels: Allow array initialisers
The static key API is currently designed around single variable
definitions. There are cases where an array of static keys is desirable,
so extend the API to allow this rather than using the internal static
key implementation directly.
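
An array-friendly initialiser could look like this (a sketch; the
macro name is illustrative):

  #define DEFINE_STATIC_KEY_ARRAY_FALSE(name, count)           \
          struct static_key_false name[count] = {              \
                  [0 ... (count) - 1] = STATIC_KEY_FALSE_INIT, \
          }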

Cc: Jason Baron <jbaron@akamai.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Suggested-by: Dave P Martin <Dave.Martin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-07 09:41:11 +01:00
Kefeng Wang
dae8c235d9 arm64: mm: drop fixup_init() and mm.h
mm.h contains only fixup_init(), which is only called from
free_initmem(), so move the code from fixup_init() into
free_initmem(), then drop fixup_init() and mm.h.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-06 19:09:38 +01:00
Marc Zyngier
282b879635 drivers/perf: arm_pmu: Always consider IRQ0 as an error
As declared by the chief penguin, and enforced by the NO_IRQ brigade,
IRQ0 doesn't exist, and is considered an error (no irq).

Unfortunately, the arm_pmu driver still considers it as valid in
a large number of cases. Let's fix this.
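
The pattern being enforced (illustrative fragment):

  int irq = platform_get_irq(pdev, 0);

  /* IRQ0 is not a valid IRQ: treat it like any other failure. */
  if (irq <= 0)
          return irq < 0 ? irq : -ENODEV;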

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-06 18:26:33 +01:00
Pratyush Anand
98ab10e977 arm64: ftrace: add save_stack_trace_regs()
Currently, enabling stacktrace for kprobe events generates a warning:

  echo stacktrace > /sys/kernel/debug/tracing/trace_options
  echo "p xhci_irq" > /sys/kernel/debug/tracing/kprobe_events
  echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable

save_stack_trace_regs() not implemented yet.
------------[ cut here ]------------
WARNING: CPU: 1 PID: 0 at ../kernel/stacktrace.c:74 save_stack_trace_regs+0x3c/0x48
Modules linked in:

CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.8.0-rc4-dirty #5128
Hardware name: ARM Juno development board (r1) (DT)
task: ffff800975dd1900 task.stack: ffff800975ddc000
PC is at save_stack_trace_regs+0x3c/0x48
LR is at save_stack_trace_regs+0x3c/0x48
pc : [<ffff000008126c64>] lr : [<ffff000008126c64>] pstate: 600003c5
sp : ffff80097ef52c00

Call trace:
   save_stack_trace_regs+0x3c/0x48
   __ftrace_trace_stack+0x168/0x208
   trace_buffer_unlock_commit_regs+0x5c/0x7c
   kprobe_trace_func+0x308/0x3d8
   kprobe_dispatcher+0x58/0x60
   kprobe_breakpoint_handler+0xbc/0x18c
   brk_handler+0x50/0x90
   do_debug_exception+0x50/0xbc

This patch implements save_stack_trace_regs(), so that stacktraces
for kprobe events can be obtained.

After this patch, there is no warning and we can see the stacktrace
for kprobe events in the trace buffer.

more /sys/kernel/debug/tracing/trace
          <idle>-0     [004] d.h.  1356.000496: p_xhci_irq_0:(xhci_irq+0x0/0x9ac)
          <idle>-0     [004] d.h.  1356.000497: <stack trace>
  => xhci_irq
  => __handle_irq_event_percpu
  => handle_irq_event_percpu
  => handle_irq_event
  => handle_fasteoi_irq
  => generic_handle_irq
  => __handle_domain_irq
  => gic_handle_irq
  => el1_irq
  => arch_cpu_idle
  => default_idle_call
  => cpu_startup_entry
  => secondary_start_kernel
  =>

Tested-by: David A. Long <dave.long@linaro.org>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-05 13:41:52 +01:00
Ard Biesheuvel
dc00247576 arm64: kernel: re-export _cpu_resume() from sleep.S
Commit b5fe242972 ("arm64: kernel: fix style issues in sleep.S")
changed the linkage of _cpu_resume() to local, even though the symbol
is also referenced from hibernate.c. So revert this change.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-05 10:24:55 +01:00
James Morse
f928c16dbf arm64: Drop generic xlate_dev_mem_{k,}ptr()
The code that provides /dev/mem uses xlate_dev_mem_{k,}ptr() to
avoid making a cachable mapping of a non-cachable area on ia64.
On arm64 we do this via phys_mem_access_prot() instead, but provide
dummy versions of xlate_dev_mem_{k,}ptr().

These are the same as those in asm-generic/io.h, which we include from
asm/io.h.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-05 10:23:56 +01:00
Will Deacon
adeb68ef85 arm64: debug: report TRAP_TRACE instead of TRAP_HWBRPT for singlestep
Single-step traps to userspace (e.g. via ptrace) are expected to use
TRAP_TRACE for the si_code field of the siginfo, as opposed to the
TRAP_HWBRPT that we report currently.

Fix the reported value, which has no effect on existing and legacy
builds of GDB.

Reported-by: Yao Qi <yao.qi@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-02 16:55:58 +01:00
Ard Biesheuvel
a9be2ee093 arm64: head.S: document the use of callee saved registers
Now that the only remaining uses of callee-saved registers are on the
primary boot path, add a comment to the code documenting which
register is used for what.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-02 11:47:51 +01:00
Ard Biesheuvel
60699ba18b arm64: head.S: use ordinary stack frame for __primary_switched()
Instead of stashing the value of the link register in x28 before setting
up the stack and calling into C code, create an ordinary PCS compatible
stack frame so that we can push the return address onto the stack.

Since exception handlers require a stack as well, assign the stack pointer
register before installing the vector table.

Note that this accounts for the difference between THREAD_START_SP and
THREAD_SIZE, given that the stack pointer is always decremented before
calling into any C code.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-02 11:47:51 +01:00
Ard Biesheuvel
b929fe320e arm64: kernel: drop use of x24 from primary boot path
Keeping __PHYS_OFFSET in x24 is actually less clear than simply taking
the value of __PHYS_OFFSET using an adrp instruction in the three places
that we need it. So change that.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-02 11:47:51 +01:00
Ard Biesheuvel
9dcf7914ae arm64: kernel: use x30 for __enable_mmu return address
Using x27 for passing to __enable_mmu what is essentially the return
address makes the code look more complicated than it needs to be. So
switch to x30/lr, and update the secondary and cpu_resume call sites to
simply call __enable_mmu as an ordinary function, with a bl instruction.
This requires the callers to be covered by .idmap.text.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-02 11:47:51 +01:00
Ard Biesheuvel
3c5e9f238b arm64: head.S: move KASLR processing out of __enable_mmu()
The KASLR processing is only used by the primary boot path, and
complements the processing that takes place in __primary_switch().
Move the two parts together, to make the code easier to understand.

Also, fix up a minor whitespace issue.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[will: fixed conflict with -rc3 due to lack of fd363bd417]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-02 11:30:13 +01:00
Ard Biesheuvel
23c8a500c2 arm64: kernel: use ordinary return/argument register for el2_setup()
The function el2_setup() passes its return value in register w20, and
in the two cases where the caller actually cares about this return value,
it is passed into set_cpu_boot_mode_flag() [almost] directly, which
expects its input in w20 as well.

So there is no reason to use a 'special' callee-saved register here;
we can simply follow the PCS for the return value and first argument,
respectively.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-02 11:21:15 +01:00