With the recent rewrite of the arm64 KVM hypervisor code in C, enabling
certain options like KASAN would allow the compiler to generate memory
accesses or function calls to addresses not mapped at EL2. This patch
disables the compiler instrumentation on the arm64 hypervisor code for
gcov-based profiling (GCOV_KERNEL), undefined behaviour sanity checker
(UBSAN) and kernel address sanitizer (KASAN).
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: <stable@vger.kernel.org> # 4.5+
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
- Initial page table creation reworked to avoid breaking large block
mappings (huge pages) into smaller ones. The ARM architecture requires
break-before-make in such cases to avoid TLB conflicts but that's not
always possible on live page tables
- Kernel virtual memory layout: the kernel image is no longer linked to
the bottom of the linear mapping (PAGE_OFFSET) but at the bottom of
the vmalloc space, allowing the kernel to be loaded (nearly) anywhere
in physical RAM
- Kernel ASLR: position independent kernel Image and modules being
randomly mapped in the vmalloc space with the randomness is provided
by UEFI (efi_get_random_bytes() patches merged via the arm64 tree,
acked by Matt Fleming)
- Implement relative exception tables for arm64, required by KASLR
(initial code for ARCH_HAS_RELATIVE_EXTABLE added to lib/extable.c but
actual x86 conversion to deferred to 4.7 because of the merge
dependencies)
- Support for the User Access Override feature of ARMv8.2: this allows
uaccess functions (get_user etc.) to be implemented using LDTR/STTR
instructions. Such instructions, when run by the kernel, perform
unprivileged accesses adding an extra level of protection. The
set_fs() macro is used to "upgrade" such instruction to privileged
accesses via the UAO bit
- Half-precision floating point support (part of ARMv8.2)
- Optimisations for CPUs with or without a hardware prefetcher (using
run-time code patching)
- copy_page performance improvement to deal with 128 bytes at a time
- Sanity checks on the CPU capabilities (via CPUID) to prevent
incompatible secondary CPUs from being brought up (e.g. weird
big.LITTLE configurations)
- valid_user_regs() reworked for better sanity check of the sigcontext
information (restored pstate information)
- ACPI parking protocol implementation
- CONFIG_DEBUG_RODATA enabled by default
- VDSO code marked as read-only
- DEBUG_PAGEALLOC support
- ARCH_HAS_UBSAN_SANITIZE_ALL enabled
- Erratum workaround Cavium ThunderX SoC
- set_pte_at() fix for PROT_NONE mappings
- Code clean-ups
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJW6u95AAoJEGvWsS0AyF7xMyoP/3x2O6bgreSQ84BdO4JChN4+
RQ9OVdX8u2ItO9sgaCY2AA6KoiBuEjGmPl/XRuK0I7DpODTtRjEXQHuNNhz8AelC
hn4AEVqamY6Z5BzHFIjs8G9ydEbq+OXcKWEdwSsBhP/cMvI7ss3dps1f5iNPT5Vv
50E/kUz+aWYy7pKlB18VDV7TUOA3SuYuGknWV8+bOY5uPb8hNT3Y3fHOg/EuNNN3
DIuYH1V7XQkXtF+oNVIGxzzJCXULBE7egMcWAm1ydSOHK0JwkZAiL7OhI7ceVD0x
YlDxBnqmi4cgzfBzTxITAhn3OParwN6udQprdF1WGtFF6fuY2eRDSH/L/iZoE4DY
OulL951OsBtF8YC3+RKLk908/0bA2Uw8ftjCOFJTYbSnZBj1gWK41VkCYMEXiHQk
EaN8+2Iw206iYIoyvdjGCLw7Y0oakDoVD9vmv12SOaHeQljTkjoN8oIlfjjKTeP7
3AXj5v9BDMDVh40nkVayysRNvqe48Kwt9Wn0rhVTLxwdJEiFG/OIU6HLuTkretdN
dcCNFSQrRieSFHpBK9G0vKIpIss1ZwLm8gjocVXH7VK4Mo/TNQe4p2/wAF29mq4r
xu1UiXmtU3uWxiqZnt72LOYFCarQ0sFA5+pMEvF5W+NrVB0wGpXhcwm+pGsIi4IM
LepccTgykiUBqW5TRzPz
=/oS+
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"Here are the main arm64 updates for 4.6. There are some relatively
intrusive changes to support KASLR, the reworking of the kernel
virtual memory layout and initial page table creation.
Summary:
- Initial page table creation reworked to avoid breaking large block
mappings (huge pages) into smaller ones. The ARM architecture
requires break-before-make in such cases to avoid TLB conflicts but
that's not always possible on live page tables
- Kernel virtual memory layout: the kernel image is no longer linked
to the bottom of the linear mapping (PAGE_OFFSET) but at the bottom
of the vmalloc space, allowing the kernel to be loaded (nearly)
anywhere in physical RAM
- Kernel ASLR: position independent kernel Image and modules being
randomly mapped in the vmalloc space with the randomness is
provided by UEFI (efi_get_random_bytes() patches merged via the
arm64 tree, acked by Matt Fleming)
- Implement relative exception tables for arm64, required by KASLR
(initial code for ARCH_HAS_RELATIVE_EXTABLE added to lib/extable.c
but actual x86 conversion to deferred to 4.7 because of the merge
dependencies)
- Support for the User Access Override feature of ARMv8.2: this
allows uaccess functions (get_user etc.) to be implemented using
LDTR/STTR instructions. Such instructions, when run by the kernel,
perform unprivileged accesses adding an extra level of protection.
The set_fs() macro is used to "upgrade" such instruction to
privileged accesses via the UAO bit
- Half-precision floating point support (part of ARMv8.2)
- Optimisations for CPUs with or without a hardware prefetcher (using
run-time code patching)
- copy_page performance improvement to deal with 128 bytes at a time
- Sanity checks on the CPU capabilities (via CPUID) to prevent
incompatible secondary CPUs from being brought up (e.g. weird
big.LITTLE configurations)
- valid_user_regs() reworked for better sanity check of the
sigcontext information (restored pstate information)
- ACPI parking protocol implementation
- CONFIG_DEBUG_RODATA enabled by default
- VDSO code marked as read-only
- DEBUG_PAGEALLOC support
- ARCH_HAS_UBSAN_SANITIZE_ALL enabled
- Erratum workaround Cavium ThunderX SoC
- set_pte_at() fix for PROT_NONE mappings
- Code clean-ups"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (99 commits)
arm64: kasan: Fix zero shadow mapping overriding kernel image shadow
arm64: kasan: Use actual memory node when populating the kernel image shadow
arm64: Update PTE_RDONLY in set_pte_at() for PROT_NONE permission
arm64: Fix misspellings in comments.
arm64: efi: add missing frame pointer assignment
arm64: make mrs_s prefixing implicit in read_cpuid
arm64: enable CONFIG_DEBUG_RODATA by default
arm64: Rework valid_user_regs
arm64: mm: check at build time that PAGE_OFFSET divides the VA space evenly
arm64: KVM: Move kvm_call_hyp back to its original localtion
arm64: mm: treat memstart_addr as a signed quantity
arm64: mm: list kernel sections in order
arm64: lse: deal with clobbered IP registers after branch via PLT
arm64: mm: dump: Use VA_START directly instead of private LOWEST_ADDR
arm64: kconfig: add submenu for 8.2 architectural features
arm64: kernel: acpi: fix ioremap in ACPI parking protocol cpu_postboot
arm64: Add support for Half precision floating point
arm64: Remove fixmap include fragility
arm64: Add workaround for Cavium erratum 27456
arm64: mm: Mark .rodata as RO
...
So far, we're always writing all possible LRs, setting the empty
ones with a zero value. This is obvious doing a low of work for
nothing, and we're better off clearing those we've actually
dirtied on the exit path (it is very rare to inject more than one
interrupt at a time anyway).
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
In order to let the GICv3 code be more lazy in the way it
accesses the LRs, it is necessary to start with a clean slate.
Let's reset the LRs on each CPU when the vgic is probed (which
includes a round trip to EL2...).
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
On exit, any empty LR will be signaled in ICH_ELRSR_EL2. Which
means that we do not have to save it, and we can just clear
its state in the in-memory copy.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Next on our list of useless accesses is the maintenance interrupt
status registers (ICH_MISR_EL2, ICH_EISR_EL2).
It is pointless to save them if we haven't asked for a maintenance
interrupt the first place, which can only happen for two reasons:
- Underflow: ICH_HCR_UIE will be set,
- EOI: ICH_LR_EOI will be set.
These conditions can be checked on the in-memory copies of the regs.
Should any of these two condition be valid, we must read GICH_MISR.
We can then check for ICH_MISR_EOI, and only when set read
ICH_EISR_EL2.
This means that in most case, we don't have to save them at all.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Just like on GICv2, we're a bit hammer-happy with GICv3, and access
them more often than we should.
Adopt a policy similar to what we do for GICv2, only save/restoring
the minimal set of registers. As we don't access the registers
linearly anymore (we may skip some), the convoluted accessors become
slightly simpler, and we can drop the ugly indexing macro that
tended to confuse the reviewers.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
This register resets as unknown in 64bit mode while it resets as zero
in 32bit mode. Here we choose to reset it as zero for consistency.
PMUSERENR_EL0 holds some bits which decide whether PMU registers can be
accessed from EL0. Add some check helpers to handle the access from EL0.
When these bits are zero, only reading PMUSERENR will trap to EL2 and
writing PMUSERENR or reading/writing other PMU registers will trap to
EL1 other than EL2 when HCR.TGE==0. To current KVM configuration
(HCR.TGE==0) there is no way to get these traps. Here we write 0xf to
physical PMUSERENR register on VM entry, so that it will trap PMU access
from EL0 to EL2. Within the register access handler we check the real
value of guest PMUSERENR register to decide whether this access is
allowed. If not allowed, return false to inject UND to guest.
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We already have virt/kvm/arm/ containing timer and vgic stuff.
Add yet another subdirectory to contain the hyp-specific files
(timer and vgic again).
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
In order to be able to move code outside of kvm/hyp, we need to make
the global hyp.h file accessible from a standard location.
include/asm/kvm_hyp.h seems good enough.
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The fault decoding process (including computing the IPA in the case
of a permission fault) would be much better done in C code, as we
have a reasonable infrastructure to deal with the VHE/non-VHE
differences.
Let's move the whole thing to C, including the workaround for
erratum 834220, and just patch the odd ESR_EL2 access remaining
in hyp-entry.S.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
As the kernel fully runs in HYP when VHE is enabled, we can
directly branch to the kernel's panic() implementation, and
not perform an exception return.
Add the alternative code to deal with this.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Despite the fact that a VHE enabled kernel runs at EL2, it uses
CPACR_EL1 to trap FPSIMD access. Add the required alternative
code to re-enable guest FPSIMD access when it has trapped to
EL2.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Switch the timer code to the unified sysreg accessors.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Running the kernel in HYP mode requires the HCR_E2H bit to be set
at all times, and the HCR_TGE bit to be set when running as a host
(and cleared when running as a guest). At the same time, the vector
must be set to the current role of the kernel (either host or
hypervisor), and a couple of system registers differ between VHE
and non-VHE.
We implement these by using another set of alternate functions
that get dynamically patched.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
As non-VHE and VHE have different ways to express the trapping of
FPSIMD registers to EL2, make __fpsimd_enabled a patchable predicate
and provide a VHE implementation.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We're now in a position where we can introduce VHE's minimal
save/restore, which is limited to the handful of shared sysregs.
Add the required alternative function calls that result in a
"do nothing" call on VHE, and the normal save/restore for non-VHE.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Use the recently introduced unified system register accessors for
those sysregs that behave differently depending on VHE being in
use or not.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
A handful of system registers are still shared between host and guest,
even while using VHE (tpidr*_el[01] and actlr_el1).
Also, some of the vcpu state (sp_el0, PC and PSTATE) must be
save/restored on entry/exit, as they are used on the host as well.
In order to facilitate the introduction of a VHE-specific sysreg
save/restore, make move the access to these registers to their
own save/restore functions.
No functional change.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
With ARMv8, host and guest share the same system register file,
making the save/restore procedure completely symetrical.
With VHE, host and guest now have different requirements, as they
use different sysregs.
In order to prepare for this, add split sysreg save/restore functions
for both host and guest. No functional changes yet.
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
VHE brings its own bag of new system registers, or rather system
register accessors, as it define new ways to access both guest
and host system registers. For example, from the host:
- The host TCR_EL2 register is accessed using the TCR_EL1 accessor
- The guest TCR_EL1 register is accessed using the TCR_EL12 accessor
Obviously, this is confusing. A way to somehow reduce the complexity
of writing code for both ARMv8 and ARMv8.1 is to use a set of unified
accessors that will generate the right sysreg, depending on the mode
the CPU is running in. For example:
- read_sysreg_el1(tcr) will use TCR_EL1 on ARMv8, and TCR_EL12 on
ARMv8.1 with VHE.
- read_sysreg_el2(tcr) will use TCR_EL2 on ARMv8, and TCR_EL1 on
ARMv8.1 with VHE.
We end up with three sets of accessors ({read,write}_sysreg_el[012])
that can be directly used from C code. We take this opportunity to
also add the definition for the new VHE sysregs.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The kern_hyp_va macro is pretty meaninless with VHE, as there is
only one mapping - the kernel one.
In order to keep the code readable and efficient, use runtime
patching to replace the 'and' instruction used to compute the VA
with a 'nop'.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
With VHE, the host never issues an HVC instruction to get into the
KVM code, as we can simply branch there.
Use runtime code patching to simplify things a bit.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
There is no real need to leave the stage2 initialization as part
of the early HYP bootstrap, and we can easily postpone it to
the point where we can safely run C code.
This will help VHE, which doesn't need any of this bootstrap.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The GICv3 architecture spec says:
Writing to the active priority registers in any order other than
the following order will result in UNPREDICTABLE behavior:
- ICH_AP0R<n>_EL2.
- ICH_AP1R<n>_EL2.
So let's not pointlessly go against the rule...
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Currently, using BUG_ON() in header files is cumbersome, due to the fact
that asm/bug.h transitively includes a lot of other header files, resulting
in the actual BUG_ON() invocation appearing before its definition in the
preprocessor input. So let's reverse the #include dependency between
asm/bug.h and asm/debug-monitors.h, by moving the definition of BUG_BRK_IMM
from the latter to the former. Also fix up one user of asm/debug-monitors.h
which relied on a transitive include.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Some bits in CPTR are defined as RES1 in the architecture. Setting
these bits to zero may unintentionally enable future architecture
extensions, allowing guests to use them without supervision by the host.
This would be bad: for forwards compatibility, this patch makes
sure the affected bits are always written with 1, not 0.
This patch only addresses CPTR_EL2. Initialisation of other system
registers may still need review.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
As we've now switched to the new world switch implementation,
remove the weak attributes, as nobody is supposed to override
it anymore.
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Having the system register numbers as #defines has been a pain
since day one, as the ordering is pretty fragile, and moving
things around leads to renumbering and epic conflict resolutions.
Now that we're mostly acessing the sysreg file in C, an enum is
a much better type to use, and we can clean things up a bit.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
So far, we've implemented the new world switch with a completely
different namespace, so that we could have both implementation
compiled in.
Let's take things one step further by adding weak aliases that
have the same names as the original implementation. The weak
attributes allows the new implementation to be overriden by the
old one, and everything still work.
At a later point, we'll be able to simply drop the old code, and
everything will hopefully keep working, thanks to the aliases we
have just added. This also saves us repainting all the callers.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Add the panic handler, together with the small bits of assembly
code to call the kernel's panic implementation.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Add the entry points for HYP mode (both for hypercalls and
exception handling).
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Implement the TLB handling as a direct translation of the assembly
code version.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Implement the fpsimd save restore, keeping the lazy part in
assembler (as returning to C would be overkill).
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Implement the core of the world switch in C. Not everything is there
yet, and there is nothing to re-enter the world switch either.
But this already outlines the code structure well enough.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
KVM so far relies on code patching, and is likely to use it more
in the future. The main issue is that our alternative system works
at the instruction level, while we'd like to have alternatives at
the function level.
In order to cope with this, add the "hyp_alternate_select" macro that
outputs a brief sequence of code that in turn can be patched, allowing
an alternative function to be selected.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Contrary to the previous patch, the guest entry is fairly different
from its assembly counterpart, mostly because it is only concerned
with saving/restoring the GP registers, and nothing else.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Implement the debug save restore as a direct translation of
the assembly code version.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Implement the 32bit system register save/restore as a direct
translation of the assembly code version.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Implement the system register save/restore as a direct translation of
the assembly code version.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
In order to expose the various EL2 services that are private to
the hypervisor, add a new hyp.h file.
So far, it only contains mundane things such as section annotation
and VA manipulation.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>