linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-12-11 04:54:13 +08:00

Author	SHA1	Message	Date
Marc Zyngier	619064afa9	KVM: arm64: vgic: Tidy-up calls to vgic_{get,set}_common_attr() The userspace accessors have an early call to vgic_{get,set}_common_attr() that makes the code hard to follow. Move it to the default: clause of the decoding switch statement, which results in a nice cleanup. This requires us to move the handling of the pending table into the common handling, even if it is strictly a GICv3 feature (it has the benefit of keeping the whole control group handling in the same function). Also cleanup vgic_v3_{get,set}_attr() while we're at it, deduplicating the calls to vgic_v3_attr_regs_access(). Suggested-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	4b85080f4e	KVM: arm64: vgic: Consolidate userspace access for base address setting Align kvm_vgic_addr() with the rest of the code by moving the userspace accesses into it. kvm_vgic_addr() is also made static. Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	9f968c9266	KVM: arm64: vgic-v2: Add helper for legacy dist/cpuif base address setting We carry a legacy interface to set the base addresses for GICv2. As this is currently plumbed into the same handling code as the modern interface, it limits the evolution we can make there. Add a helper dedicated to this handling, with a view of maybe removing this in the future. Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	d7df6f282d	KVM: arm64: vgic: Use {get,put}_user() instead of copy_{from.to}_user Tidy-up vgic_get_common_attr() and vgic_set_common_attr() to use {get,put}_user() instead of the more complex (and less type-safe) copy_{from,to}_user(). Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	7e9f723c2a	KVM: arm64: vgic-v2: Consolidate userspace access for MMIO registers Align the GICv2 MMIO accesses from userspace with the way the GICv3 code is now structured. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	e1246f3f2d	KVM: arm64: vgic-v3: Consolidate userspace access for MMIO registers For userspace accesses to GICv3 MMIO registers (and related data), vgic_v3_{get,set}_attr are littered with {get,put}_user() calls, making it hard to audit and reason about. Consolidate all userspace accesses in vgic_v3_attr_regs_access(), making the code far simpler to audit. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	38cf0bb762	KVM: arm64: vgic-v3: Use u32 to manage the line level from userspace Despite the userspace ABI clearly defining the bits dealt with by KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO as a __u32, the kernel uses a u64. Use a u32 to match the userspace ABI, which will subsequently lead to some simplifications. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	db25081e14	KVM: arm64: vgic-v3: Push user access into vgic_v3_cpu_sysregs_uaccess() In order to start making the vgic sysreg access from userspace similar to all the other sysregs, push the userspace memory access one level down into vgic_v3_cpu_sysregs_uaccess(). The next step will be to rely on the sysreg infrastructure to perform this task. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	b61fc0857a	KVM: arm64: vgic-v3: Simplify vgic_v3_has_cpu_sysregs_attr() Finding out whether a sysreg exists has little to do with that register being accessed, so drop the is_write parameter. Also, the reg pointer is completely unused, and we're better off just passing the attr pointer to the function. This result in a small cleanup of the calling site, with a new helper converting the vGIC view of a sysreg into the canonical one (this is purely cosmetic, as the encoding is the same). Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-07-17 11:55:33 +01:00
Marc Zyngier	98432ccdec	KVM: arm64: Replace vgic_v3_uaccess_read_pending with vgic_uaccess_read_pending Now that GICv2 has a proper userspace accessor for the pending state, switch GICv3 over to it, dropping the local version, moving over the specific behaviours that CGIv3 requires (such as the distinction between pending latch and line level which were never enforced with GICv2). We also gain extra locking that isn't really necessary for userspace, but that's a small price to pay for getting rid of superfluous code. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20220607131427.1164881-3-maz@kernel.org	2022-06-08 10:16:15 +01:00
Marc Zyngier	2cdea19a34	KVM: arm64: Don't read a HW interrupt pending state in user context Since `5bfa685e62` ("KVM: arm64: vgic: Read HW interrupt pending state from the HW"), we're able to source the pending bit for an interrupt that is stored either on the physical distributor or on a device. However, this state is only available when the vcpu is loaded, and is not intended to be accessed from userspace. Unfortunately, the GICv2 emulation doesn't provide specific userspace accessors, and we fallback with the ones that are intended for the guest, with fatal consequences. Add a new vgic_uaccess_read_pending() accessor for userspace to use, build on top of the existing vgic_mmio_read_pending(). Reported-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Tested-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Fixes: `5bfa685e62` ("KVM: arm64: vgic: Read HW interrupt pending state from the HW") Link: https://lore.kernel.org/r/20220607131427.1164881-2-maz@kernel.org Cc: stable@vger.kernel.org	2022-06-07 16:28:19 +01:00
Paolo Bonzini	47e8eec832	KVM/arm64 updates for 5.19 - Add support for the ARMv8.6 WFxT extension - Guard pages for the EL2 stacks - Trap and emulate AArch32 ID registers to hide unsupported features - Ability to select and save/restore the set of hypercalls exposed to the guest - Support for PSCI-initiated suspend in collaboration with userspace - GICv3 register-based LPI invalidation support - Move host PMU event merging into the vcpu data structure - GICv3 ITS save/restore fixes - The usual set of small-scale cleanups and fixes -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmKGAGsPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDB/gQAMhyZ+wCG0OMEZhwFF6iDfxVEX2Kw8L41NtD a/e6LDWuIOGihItpRkYROc5myG74D7XckF2Bz3G7HJoU4vhwHOV/XulE26GFizoC O1GVRekeSUY81wgS1yfo0jojLupBkTjiq3SjTHoDP7GmCM0qDPBtA0QlMRzd2bMs Kx0+UUXZUHFSTXc7Lp4vqNH+tMp7se+yRx7hxm6PCM5zG+XYJjLxnsZ0qpchObgU 7f6YFojsLUs1SexgiUqJ1RChVQ+FkgICh5HyzORvGtHNNzK6D2sIbsW6nqMGAMql Kr3A5O/VOkCztSYnLxaa76/HqD21mvUrXvr3grhabNc7rOmuzWV0dDgr6c6wHKHb uNCtH4d7Ra06gUrEOrfsgLOLn0Zqik89y6aIlMsnTudMg9gMNgFHy1jz4LM7vMkY FS5AVj059heg2uJcfgTvzzcqneyuBLBmF3dS4coowO6oaj8SycpaEmP5e89zkPMI 1kk8d0e6RmXuCh/2AJ8GxxnKvBPgqp2mMKXOCJ8j4AmHEDX/CKpEBBqIWLKkplUU 8DGiOWJUtRZJg398dUeIpiVLoXJthMODjAnkKkuhiFcQbXomlwgg7YSnNAz6TRED Z7KR2leC247kapHnnagf02q2wED8pBeyrxbQPNdrHtSJ9Usm4nTkY443HgVTJW3s aTwPZAQ7 =mh7W -----END PGP SIGNATURE----- Merge tag 'kvmarm-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 5.19 - Add support for the ARMv8.6 WFxT extension - Guard pages for the EL2 stacks - Trap and emulate AArch32 ID registers to hide unsupported features - Ability to select and save/restore the set of hypercalls exposed to the guest - Support for PSCI-initiated suspend in collaboration with userspace - GICv3 register-based LPI invalidation support - Move host PMU event merging into the vcpu data structure - GICv3 ITS save/restore fixes - The usual set of small-scale cleanups and fixes [Due to the conflict, KVM_SYSTEM_EVENT_SEV_TERM is relocated from 4 to 6. - Paolo]	2022-05-25 05:09:23 -04:00
Marc Zyngier	5c0ad551e9	Merge branch kvm-arm64/its-save-restore-fixes-5.19 into kvmarm-master/next * kvm-arm64/its-save-restore-fixes-5.19: : . : Tighten the ITS save/restore infrastructure to fail early rather : than late. Patches courtesy of Rocardo Koller. : . KVM: arm64: vgic: Undo work in failed ITS restores KVM: arm64: vgic: Do not ignore vgic_its_restore_cte failures KVM: arm64: vgic: Add more checks when restoring ITS tables KVM: arm64: vgic: Check that new ITEs could be saved in guest memory Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-05-16 17:48:36 +01:00
Marc Zyngier	822ca7f82b	Merge branch kvm-arm64/misc-5.19 into kvmarm-master/next * kvm-arm64/misc-5.19: : . : Misc fixes and general improvements for KVMM/arm64: : : - Better handle out of sequence sysregs in the global tables : : - Remove a couple of unnecessary loads from constant pool : : - Drop unnecessary pKVM checks : : - Add all known M1 implementations to the SEIS workaround : : - Cleanup kerneldoc warnings : . KVM: arm64: vgic-v3: List M1 Pro/Max as requiring the SEIS workaround KVM: arm64: pkvm: Don't mask already zeroed FEAT_SVE KVM: arm64: pkvm: Drop unnecessary FP/SIMD trap handler KVM: arm64: nvhe: Eliminate kernel-doc warnings KVM: arm64: Avoid unnecessary absolute addressing via literals KVM: arm64: Print emulated register table name when it is unsorted KVM: arm64: Don't BUG_ON() if emulated register table is unsorted Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-05-16 17:48:36 +01:00
Ricardo Koller	8c5e74c90b	KVM: arm64: vgic: Undo work in failed ITS restores Failed ITS restores should clean up all state restored until the failure. There is some cleanup already present when failing to restore some tables, but it's not complete. Add the missing cleanup. Note that this changes the behavior in case of a failed restore of the device tables. restore ioctl: 1. restore collection tables 2. restore device tables With this commit, failures in 2. clean up everything created so far, including state created by 1. Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220510001633.552496-5-ricarkol@google.com	2022-05-16 13:58:04 +01:00
Ricardo Koller	a1ccfd6f6e	KVM: arm64: vgic: Do not ignore vgic_its_restore_cte failures Restoring a corrupted collection entry (like an out of range ID) is being ignored and treated as success. More specifically, a vgic_its_restore_cte failure is treated as success by vgic_its_restore_collection_table. vgic_its_restore_cte uses positive and negative numbers to return error, and +1 to return success. The caller then uses "ret > 0" to check for success. Fix this by having vgic_its_restore_cte only return negative numbers on error. Do this by changing alloc_collection return codes to only return negative numbers on error. Signed-off-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220510001633.552496-4-ricarkol@google.com	2022-05-16 13:58:04 +01:00
Ricardo Koller	243b1f6c8f	KVM: arm64: vgic: Add more checks when restoring ITS tables Try to improve the predictability of ITS save/restores (and debuggability of failed ITS saves) by failing early on restore when trying to read corrupted tables. Restoring the ITS tables does some checks for corrupted tables, but not as many as in a save: an overflowing device ID will be detected on save but not on restore. The consequence is that restoring a corrupted table won't be detected until the next save; including the ITS not working as expected after the restore. As an example, if the guest sets tables overlapping each other, which would most likely result in some corrupted table, this is what we would see from the host point of view: guest sets base addresses that overlap each other save ioctl restore ioctl save ioctl (fails) Ideally, we would like the first save to fail, but overlapping tables could actually be intended by the guest. So, let's at least fail on the restore with some checks: like checking that device and event IDs don't overflow their tables. Signed-off-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220510001633.552496-3-ricarkol@google.com	2022-05-16 13:58:04 +01:00
Ricardo Koller	cafe7e544d	KVM: arm64: vgic: Check that new ITEs could be saved in guest memory Try to improve the predictability of ITS save/restores by failing commands that would lead to failed saves. More specifically, fail any command that adds an entry into an ITS table that is not in guest memory, which would otherwise lead to a failed ITS save ioctl. There are already checks for collection and device entries, but not for ITEs. Add the corresponding check for the ITT when adding ITEs. Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220510001633.552496-2-ricarkol@google.com	2022-05-16 13:58:04 +01:00
Marc Zyngier	cae889302e	KVM: arm64: vgic-v3: List M1 Pro/Max as requiring the SEIS workaround Unsusprisingly, Apple M1 Pro/Max have the exact same defect as the original M1 and generate random SErrors in the host when a guest tickles the GICv3 CPU interface the wrong way. Add the part numbers for both the CPU types found in these two new implementations, and add them to the hall of shame. This also applies to the Ultra version, as it is composed of 2 Max SoCs. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220514102524.3188730-1-maz@kernel.org	2022-05-15 11:18:50 +01:00
Marc Zyngier	49a1a2c70a	KVM: arm64: vgic-v3: Advertise GICR_CTLR.{IR, CES} as a new GICD_IIDR revision Since adversising GICR_CTLR.{IC,CES} is directly observable from a guest, we need to make it selectable from userspace. For that, bump the default GICD_IIDR revision and let userspace downgrade it to the previous default. For GICv2, the two distributor revisions are strictly equivalent. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220405182327.205520-5-maz@kernel.org	2022-05-04 14:09:53 +01:00
Marc Zyngier	4645d11f4a	KVM: arm64: vgic-v3: Implement MMIO-based LPI invalidation Since GICv4.1, it has become legal for an implementation to advertise GICR_{INVLPIR,INVALLR,SYNCR} while having an ITS, allowing for a more efficient invalidation scheme (no guest command queue contention when multiple CPUs are generating invalidations). Provide the invalidation registers as a primitive to their ITS counterpart. Note that we don't advertise them to the guest yet (the architecture allows an implementation to do this). Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Oliver Upton <oupton@google.com> Link: https://lore.kernel.org/r/20220405182327.205520-4-maz@kernel.org	2022-05-04 14:09:53 +01:00
Marc Zyngier	94828468a6	KVM: arm64: vgic-v3: Expose GICR_CTLR.RWP when disabling LPIs When disabling LPIs, a guest needs to poll GICR_CTLR.RWP in order to be sure that the write has taken effect. We so far reported it as 0, as we didn't advertise that LPIs could be turned off the first place. Start tracking this state during which LPIs are being disabled, and expose the 'in progress' state via the RWP bit. We also take this opportunity to disallow enabling LPIs and programming GICR_{PEND,PROP}BASER while LPI disabling is in progress, as allowed by the architecture (UNPRED behaviour). We don't advertise the feature to the guest yet (which is allowed by the architecture). Reviewed-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220405182327.205520-3-maz@kernel.org	2022-05-04 14:09:53 +01:00
Sean Christopherson	f502cc568d	KVM: Add max_vcpus field in common 'struct kvm' For TDX guests, the maximum number of vcpus needs to be specified when the TDX guest VM is initialized (creating the TDX data corresponding to TDX guest) before creating vcpu. It needs to record the maximum number of vcpus on VM creation (KVM_CREATE_VM) and return error if the number of vcpus exceeds it Because there is already max_vcpu member in arm64 struct kvm_arch, move it to common struct kvm and initialize it to KVM_MAX_VCPUS before kvm_arch_init_vm() instead of adding it to x86 struct kvm_arch. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Message-Id: <e53234cdee6a92357d06c80c03d77c19cdefb804.1646422845.git.isaku.yamahata@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-02 11:42:42 -04:00
Yu Zhe	c707663e81	KVM: arm64: vgic: Remove unnecessary type castings Remove unnecessary casts. Signed-off-by: Yu Zhe <yuzhe@nfschina.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220329102059.268983-1-yuzhe@nfschina.com	2022-04-06 10:42:55 +01:00
Paolo Bonzini	714797c98e	KVM/arm64 updates for 5.18 - Proper emulation of the OSLock feature of the debug architecture - Scalibility improvements for the MMU lock when dirty logging is on - New VMID allocator, which will eventually help with SVA in VMs - Better support for PMUs in heterogenous systems - PSCI 1.1 support, enabling support for SYSTEM_RESET2 - Implement CONFIG_DEBUG_LIST at EL2 - Make CONFIG_ARM64_ERRATUM_2077057 default y - Reduce the overhead of VM exit when no interrupt is pending - Remove traces of 32bit ARM host support from the documentation - Updated vgic selftests - Various cleanups, doc updates and spelling fixes -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmI0lrQPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDy0YQAIX2bWcPFMqHqn3CAYhTSTiOK5s+OWx9im5f 5yTPRj+SJ88SWv030r8a5dxWh2dEK2IetM9KifZ0dvmcCs8lYW/9/IUkHYY9lAYJ 9VLH4iPgs9dOD9wtfovfb+vcM8bso9Ndi3aCFJUj+bcNwYU3kBIJ+8AxA5DZoLty 5LPF38eoxrSEv9N0VwqvhGxdgqDp8Zahykr693r+8Wd3Rj6yRoqoEvqWhHdVWlWJ 3quRNkYN4LzjN3x1T9CLaZUqMofbUjfYCAvbZorALJy6In1FfgoyocFe6/JvsmzZ xOlrWWbJz/1NNI6Hoy5aZtQavTFrHu4XbCkjBDL7RhRxj636KWelVoXAbV05XX2r hQYMnN0bwlnAljTefguIZ7frnQyjg5OV8GMu3CTIPMqu//fA+61z+bXoyVy6pzaV gcXHtDgIdiRaT6BJiHST8ctxZWDTr2GUgTGfdlCde7hgmJ7DjManLXvgYx101/Nz VfvKzz3oSvVTelNa/6ZWxuUlwvly0eKONSkwjp0uq5TZ9G8NLaKitA8nKDSkoegx 41iIUEztivuu9KQvQkl8wdcCPwEk8K2sOTH7ikINS/wJ0khiUztndxCAlEPbQo50 567OiSaj5+vqFPZsxWBVTIbmkdBVKCzrG+4B1H4didMb1Q1n2lHhgj1keHTmZyVP jlFofZxf =J1mn -----END PGP SIGNATURE----- Merge tag 'kvmarm-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 5.18 - Proper emulation of the OSLock feature of the debug architecture - Scalibility improvements for the MMU lock when dirty logging is on - New VMID allocator, which will eventually help with SVA in VMs - Better support for PMUs in heterogenous systems - PSCI 1.1 support, enabling support for SYSTEM_RESET2 - Implement CONFIG_DEBUG_LIST at EL2 - Make CONFIG_ARM64_ERRATUM_2077057 default y - Reduce the overhead of VM exit when no interrupt is pending - Remove traces of 32bit ARM host support from the documentation - Updated vgic selftests - Various cleanups, doc updates and spelling fixes	2022-03-18 12:43:24 -04:00
Julia Lawall	21ea457842	KVM: arm64: fix typos in comments Various spelling mistakes in comments. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220318103729.157574-24-Julia.Lawall@inria.fr	2022-03-18 14:04:15 +00:00
Marc Zyngier	5bfa685e62	KVM: arm64: vgic: Read HW interrupt pending state from the HW It appears that a read access to GIC[DR]_I[CS]PENDRn doesn't always result in the pending interrupts being accurately reported if they are mapped to a HW interrupt. This is particularily visible when acking the timer interrupt and reading the GICR_ISPENDR1 register immediately after, for example (the interrupt appears as not-pending while it really is...). This is because a HW interrupt has its 'active and pending state' kept in the physical distributor, and not in the virtual one, as mandated by the spec (this is what allows the direct deactivation). The virtual distributor only caries the pending and active states (note the plural, as these are two independent and non-overlapping states). Fix it by reading the HW state back, either from the timer itself or from the distributor if necessary. Reported-by: Ricardo Koller <ricarkol@google.com> Tested-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Ricardo Koller <ricarkol@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220208123726.3604198-1-maz@kernel.org	2022-02-11 11:01:12 +00:00
Paolo Bonzini	17179d0068	KVM/arm64 fixes for 5.17, take #1 - Correctly update the shadow register on exception injection when running in nVHE mode - Correctly use the mm_ops indirection when performing cache invalidation from the page-table walker - Restrict the vgic-v3 workaround for SEIS to the two known broken implementations -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmHzv5EPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpD0DcQAMF0hcKYxwuXi+UwQ8u5SsrpQQZ1BWC6euvB FFQiUPANXq/u0xM2kV+5FhjEfHqqjnh7nYLVKpBcetcvGSfWUnZlVI4DKI+5pdte PTa/minS5sq9BDZ/clRnnomNw0UwtH2OLeolg7+UAqBMihicddVBBU6IqvY1Nx+z F2qovZa3Qqb1EB+9+hPS+qGcjlguaBOEzrJ9uIaw532G1JD1K9hhMlabdhJhiJA3 gWuUJO+cuYEdctli+OJb9g92zIDt0hVP+/1tndlbib5BUw6e2vkdyKF0+/7u77xr SDKNmUosvZt/fABZpv6ycgRszoKRjBCIC5takQCZI/l2QzZFbiP/414E8L0J/zLV PI8e1bs/H9pBF3c7WG+if/3jYs+D+/nYhkE+PeW3k5lxzsHo7XE5ei6mzoxzBusC l4c0QQ7lpwep4dOWm4oRxzE0/9IONgVKKlIKGBkpSbtznDkAToTWobAIFVeZj+nm BVxf+A6ddcnQSzXYa/FUsfV3ZEsJVPSs/DL6mBBJuG8lxNzZnabkt+ODfXuhyrXe 6kGkF9+4HE9XyItieZVDUgRcZ9x57c+3q7A9b7Kl+Ds1Z+hsu0tVqghf5YVQAj3a 4IkOBdPEtaGCSrJWxupX+oimCXqdNfbnOqf4VsO8l1O0O8WBRvYaqYL2RKR32kX2 n3nzO/vE =BKqv -----END PGP SIGNATURE----- Merge tag 'kvmarm-fixes-5.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 5.17, take #1 - Correctly update the shadow register on exception injection when running in nVHE mode - Correctly use the mm_ops indirection when performing cache invalidation from the page-table walker - Restrict the vgic-v3 workaround for SEIS to the two known broken implementations	2022-01-28 07:45:15 -05:00
Marc Zyngier	d11a327ed9	KVM: arm64: vgic-v3: Restrict SEIS workaround to known broken systems Contrary to what `df652bcf11` ("KVM: arm64: vgic-v3: Work around GICv3 locally generated SErrors") was asserting, there is at least one other system out there (Cavium ThunderX2) implementing SEIS, and not in an obviously broken way. So instead of imposing the M1 workaround on an innocent bystander, let's limit it to the two known broken Apple implementations. Fixes: `df652bcf11` ("KVM: arm64: vgic-v3: Work around GICv3 locally generated SErrors") Reported-by: Ard Biesheuvel <ardb@kernel.org> Tested-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220122103912.795026-1-maz@kernel.org	2022-01-22 11:38:16 +00:00
Paolo Bonzini	7fd55a02a4	KVM/arm64 updates for Linux 5.16 - Simplification of the 'vcpu first run' by integrating it into KVM's 'pid change' flow - Refactoring of the FP and SVE state tracking, also leading to a simpler state and less shared data between EL1 and EL2 in the nVHE case - Tidy up the header file usage for the nvhe hyp object - New HYP unsharing mechanism, finally allowing pages to be unmapped from the Stage-1 EL2 page-tables - Various pKVM cleanups around refcounting and sharing - A couple of vgic fixes for bugs that would trigger once the vcpu xarray rework is merged, but not sooner - Add minimal support for ARMv8.7's PMU extension - Rework kvm_pgtable initialisation ahead of the NV work - New selftest for IRQ injection - Teach selftests about the lack of default IPA space and page sizes - Expand sysreg selftest to deal with Pointer Authentication - The usual bunch of cleanups and doc update -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmHYIpgPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDndsP/RsBmX6bmQnDEhaaqfGAxOETyq/my1eT9r/V 3Ax4fEqSFfD5yHbYvqNRC8ueycH4r8WAr4ACWDAI6XpS/pYx00nx2N+HCSgjGyQR FeXqITuGPEsn4NkGuPci0PFmI8rVUzanl1ugRGQAETVrZo2ZVH2uqKVGT8XOlu0J FB/0x6Z4vMuIgEXyfa+DZ8WdW1aCRgPU2oyOdSdWE57/grjyLJqk6EdMmLyaQ19E vz6vXuRnA/GQwOtByqYEnQ8a4VXsQedCMqg/f9mj0BxpDzxC1ps8Nrpv36aJXKUN LEXapP9bCWPW9LqaKAOZnQYrUIIEFHsCUom0n3reDHrgObA+jivpz75L8GEr3CdC Bv78N04Yymjpp2WA6CuO3r9HjL1nJ6tYqobXU2pvqln4nNC3Ukucjq9ZVuWgS6Hx qOZXgPcZ/HpS3l/U+dAu8yIcV2SchQXDudaq8BsfLd8M1bD+oirSBolZFSvz7MYZ 6+jtEDLUOEO5s4rXiJF46+MauxiELcjaewAEK4WwrS8NBwEyhYe9EPsYcQ5pcrQF QwAd1+y7oLfhpGHv5KJKWswfvbtlLCm6NOAhawq0UXM8bS+79tu0dGjiDzVPBuSf SyA3VtBSKxcpvCrljw9ubtjxvKrviE0MDvlmTP2B1NU+lwm8xRBiwUwOH29qP9zU HDeUj2fy =HkZk -----END PGP SIGNATURE----- Merge tag 'kvmarm-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for Linux 5.16 - Simplification of the 'vcpu first run' by integrating it into KVM's 'pid change' flow - Refactoring of the FP and SVE state tracking, also leading to a simpler state and less shared data between EL1 and EL2 in the nVHE case - Tidy up the header file usage for the nvhe hyp object - New HYP unsharing mechanism, finally allowing pages to be unmapped from the Stage-1 EL2 page-tables - Various pKVM cleanups around refcounting and sharing - A couple of vgic fixes for bugs that would trigger once the vcpu xarray rework is merged, but not sooner - Add minimal support for ARMv8.7's PMU extension - Rework kvm_pgtable initialisation ahead of the NV work - New selftest for IRQ injection - Teach selftests about the lack of default IPA space and page sizes - Expand sysreg selftest to deal with Pointer Authentication - The usual bunch of cleanups and doc update	2022-01-07 10:42:19 -05:00
Marc Zyngier	ce5b5b05c1	Merge branch kvm-arm64/vgic-fixes-5.17 into kvmarm-master/next * kvm-arm64/vgic-fixes-5.17: : . : A few vgic fixes: : - Harden vgic-v3 error handling paths against signed vs unsigned : comparison that will happen once the xarray-based vcpus are in : - Demote userspace-triggered console output to kvm_debug() : . KVM: arm64: vgic: Demote userspace-triggered console prints to kvm_debug() KVM: arm64: vgic-v3: Fix vcpu index comparison Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-12-16 12:54:12 +00:00
Marc Zyngier	440523b92b	KVM: arm64: vgic: Demote userspace-triggered console prints to kvm_debug() Running the KVM selftests results in these messages being dumped in the kernel console: [ 188.051073] kvm [469]: VGIC redist and dist frames overlap [ 188.056820] kvm [469]: VGIC redist and dist frames overlap [ 188.076199] kvm [469]: VGIC redist and dist frames overlap Being amle to trigger this from userspace is definitely not on, so demote these warnings to kvm_debug(). Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211216104507.1482017-1-maz@kernel.org	2021-12-16 10:47:48 +00:00
Marc Zyngier	c95b1d7ca7	KVM: arm64: vgic-v3: Fix vcpu index comparison When handling an error at the point where we try and register all the redistributors, we unregister all the previously registered frames by counting down from the failing index. However, the way the code is written relies on that index being a signed value. Which won't be true once we switch to an xarray-based vcpu set. Since this code is pretty awkward the first place, and that the failure mode is hard to spot, rewrite this loop to iterate over the vcpus upwards rather than downwards. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211216104526.1482124-1-maz@kernel.org	2021-12-16 10:47:24 +00:00
Marc Zyngier	7b6871f670	Merge branch kvm-arm64/pkvm-cleanups-5.17 into kvmarm-master/next * kvm-arm64/pkvm-cleanups-5.17: : . : pKVM cleanups from Quentin Perret: : : This series is a collection of various fixes and cleanups for KVM/arm64 : when running in nVHE protected mode. The first two patches are real : fixes/improvements, the following two are minor cleanups, and the last : two help satisfy my paranoia so they're certainly optional. : . KVM: arm64: pkvm: Make kvm_host_owns_hyp_mappings() robust to VHE KVM: arm64: pkvm: Stub io map functions KVM: arm64: Make __io_map_base static KVM: arm64: Make the hyp memory pool static KVM: arm64: pkvm: Disable GICv2 support KVM: arm64: pkvm: Fix hyp_pool max order Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-12-15 14:21:23 +00:00
Quentin Perret	a770ee80e6	KVM: arm64: pkvm: Disable GICv2 support GICv2 requires having device mappings in guests and the hypervisor, which is incompatible with the current pKVM EL2 page ownership model which only covers memory. While it would be desirable to support pKVM with GICv2, this will require a lot more work, so let's make the current assumption clear until then. Co-developed-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211208152300.2478542-3-qperret@google.com	2021-12-15 14:16:28 +00:00
Marc Zyngier	46808a4cb8	KVM: Use 'unsigned long' as kvm_for_each_vcpu()'s index Everywhere we use kvm_for_each_vpcu(), we use an int as the vcpu index. Unfortunately, we're about to move rework the iterator, which requires this to be upgrade to an unsigned long. Let's bite the bullet and repaint all of it in one go. Signed-off-by: Marc Zyngier <maz@kernel.org> Message-Id: <20211116160403.4074052-7-maz@kernel.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-12-08 04:24:15 -05:00
Marc Zyngier	94b4a6d521	Merge branch kvm-arm64/misc-5.17 into kvmarm-master/next * kvm-arm64/misc-5.17: : . : - Add minimal support for ARMv8.7's PMU extension : - Constify kvm_io_gic_ops : - Drop kvm_is_transparent_hugepage() prototype : . KVM: Drop stale kvm_is_transparent_hugepage() declaration KVM: arm64: Constify kvm_io_gic_ops KVM: arm64: Add minimal handling for the ARMv8.7 PMU Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-12-07 09:14:53 +00:00
Rikard Falkeborn	636dcd0204	KVM: arm64: Constify kvm_io_gic_ops The only usage of kvm_io_gic_ops is to make a comparison with its address and to pass its address to kvm_iodevice_init() which takes a pointer to const kvm_io_device_ops as input. Make it const to allow the compiler to put it in read-only memory. Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211204213518.83642-1-rikard.falkeborn@gmail.com	2021-12-06 08:34:06 +00:00
Marc Zyngier	cc5705fb1b	KVM: arm64: Drop vcpu->arch.has_run_once for vcpu->pid With the transition to kvm_arch_vcpu_run_pid_change() to handle the "run once" activities, it becomes obvious that has_run_once is now an exact shadow of vcpu->pid. Replace vcpu->arch.has_run_once with a new vcpu_has_run_once() helper that directly checks for vcpu->pid, and get rid of the now unused field. Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-12-01 11:51:22 +00:00
Marc Zyngier	5f8b2591de	Merge branch kvm-arm64/memory-accounting into kvmarm-master/next * kvm-arm64/memory-accounting: : . : Sprinkle a bunch of GFP_KERNEL_ACCOUNT all over the code base : to better track memory allocation made on behalf of a VM. : . KVM: arm64: Add memcg accounting to KVM allocations KVM: arm64: vgic: Add memcg accounting to vgic allocations Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-10-17 11:29:36 +01:00
Jia He	3ef231670b	KVM: arm64: vgic: Add memcg accounting to vgic allocations Inspired by commit `254272ce65` ("kvm: x86: Add memcg accounting to KVM allocations"), it would be better to make arm64 vgic consistent with common kvm codes. The memory allocations of VM scope should be charged into VM process cgroup, hence change GFP_KERNEL to GFP_KERNEL_ACCOUNT. There remain a few cases since these allocations are global, not in VM scope. Signed-off-by: Jia He <justin.he@arm.com> Reviewed-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210907123112.10232-2-justin.he@arm.com	2021-10-17 11:25:55 +01:00
Marc Zyngier	20a3043075	Merge branch kvm-arm64/vgic-fixes-5.16 into kvmarm-master/next * kvm-arm64/vgic-fixes-5.16: : . : Multiple updates to the GICv3 emulation in order to better support : the dreadful Apple M1 that only implements half of it, and in a : broken way... : . KVM: arm64: vgic-v3: Align emulated cpuif LPI state machine with the pseudocode KVM: arm64: vgic-v3: Don't advertise ICC_CTLR_EL1.SEIS KVM: arm64: vgic-v3: Reduce common group trapping to ICV_DIR_EL1 when possible KVM: arm64: vgic-v3: Work around GICv3 locally generated SErrors KVM: arm64: Force ID_AA64PFR0_EL1.GIC=1 when exposing a virtual GICv3 Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-10-17 11:10:14 +01:00
Marc Zyngier	0924729b21	KVM: arm64: vgic-v3: Reduce common group trapping to ICV_DIR_EL1 when possible On systems that advertise ICH_VTR_EL2.SEIS, we trap all GICv3 sysreg accesses from the guest. From a performance perspective, this is OK as long as the guest doesn't hammer the GICv3 CPU interface. In most cases, this is fine, unless the guest actively uses priorities and switches PMR_EL1 very often. Which is exactly what happens when a Linux guest runs with irqchip.gicv3_pseudo_nmi=1. In these condition, the performance plumets as we hit PMR each time we mask/unmask interrupts. Not good. There is however an opportunity for improvement. Careful reading of the architecture specification indicates that the only GICv3 sysreg belonging to the common group (which contains the SGI registers, PMR, DIR, CTLR and RPR) that is allowed to generate a SError is DIR. Everything else is safe. It is thus possible to substitute the trapping of all the common group with just that of DIR if it supported by the implementation. Yes, that's yet another optional bit of the architecture. So let's just do that, as it leads to some impressive result on the M1: Without this change: bash-5.1# /host/home/maz/hackbench 100 process 1000 Running with 10040 (== 4000) tasks. Time: 56.596 With this change: bash-5.1# /host/home/maz/hackbench 100 process 1000 Running with 10040 (== 4000) tasks. Time: 8.649 which is a pretty convincing result. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Link: https://lore.kernel.org/r/20211010150910.2911495-4-maz@kernel.org	2021-10-17 11:06:36 +01:00
Marc Zyngier	df652bcf11	KVM: arm64: vgic-v3: Work around GICv3 locally generated SErrors The infamous M1 has a feature nobody else ever implemented, in the form of the "GIC locally generated SError interrupts", also known as SEIS for short. These SErrors are generated when a guest does something that violates the GIC state machine. It would have been simpler to just ignore the damned thing, but that's not what this HW does. Oh well. This part of of the architecture is also amazingly under-specified. There is a whole 10 lines that describe the feature in a spec that is 930 pages long, and some of these lines are factually wrong. Oh, and it is deprecated, so the insentive to clarify it is low. Now, the spec says that this should be a virtual SError when HCR_EL2.AMO is set. As it turns out, that's not always the case on this CPU, and the SError sometimes fires on the host as a physical SError. Goodbye, cruel world. This clearly is a HW bug, and it means that a guest can easily take the host down, on demand. Thankfully, we have seen systems that were just as broken in the past, and we have the perfect vaccine for it. Apple M1, please meet the Cavium ThunderX workaround. All your GIC accesses will be trapped, sanitised, and emulated. Only the signalling aspect of the HW will be used. It won't be super speedy, but it will at least be safe. You're most welcome. Given that this has only ever been seen on this single implementation, that the spec is unclear at best and that we cannot trust it to ever be implemented correctly, gate the workaround solely on ICH_VTR_EL2.SEIS being set. Tested-by: Joey Gouly <joey.gouly@arm.com> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211010150910.2911495-3-maz@kernel.org	2021-10-17 11:06:36 +01:00
Ricardo Koller	96e9038969	KVM: arm64: vgic: Drop vgic_check_ioaddr() There are no more users of vgic_check_ioaddr(). Move its checks to vgic_check_iorange() and then remove it. Signed-off-by: Ricardo Koller <ricarkol@google.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211005011921.437353-6-ricarkol@google.com	2021-10-11 09:31:42 +01:00
Ricardo Koller	2ec02f6c64	KVM: arm64: vgic-v3: Check ITS region is not above the VM IPA size Verify that the ITS region does not extend beyond the VM-specified IPA range (phys_size). base + size > phys_size AND base < phys_size Add the missing check into vgic_its_set_attr() which is called when setting the region. Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Ricardo Koller <ricarkol@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211005011921.437353-5-ricarkol@google.com	2021-10-11 09:31:42 +01:00
Ricardo Koller	c56a87da0a	KVM: arm64: vgic-v2: Check cpu interface region is not above the VM IPA size Verify that the GICv2 CPU interface does not extend beyond the VM-specified IPA range (phys_size). base + size > phys_size AND base < phys_size Add the missing check into kvm_vgic_addr() which is called when setting the region. This patch also enables some superfluous checks for the distributor (vgic_check_ioaddr was enough as alignment == size for the distributors). Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Ricardo Koller <ricarkol@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211005011921.437353-4-ricarkol@google.com	2021-10-11 09:31:41 +01:00
Ricardo Koller	4612d98f58	KVM: arm64: vgic-v3: Check redist region is not above the VM IPA size Verify that the redistributor regions do not extend beyond the VM-specified IPA range (phys_size). This can happen when using KVM_VGIC_V3_ADDR_TYPE_REDIST or KVM_VGIC_V3_ADDR_TYPE_REDIST_REGIONS with: base + size > phys_size AND base < phys_size Add the missing check into vgic_v3_alloc_redist_region() which is called when setting the regions, and into vgic_v3_check_base() which is called when attempting the first vcpu-run. The vcpu-run check does not apply to KVM_VGIC_V3_ADDR_TYPE_REDIST_REGIONS because the regions size is known before the first vcpu-run. Note that using the REDIST_REGIONS API results in a different check, which already exists, at first vcpu run: that the number of redist regions is enough for all vcpus. Finally, this patch also enables some extra tests in vgic_v3_alloc_redist_region() by calculating "size" early for the legacy redist api: like checking that the REDIST region can fit all the already created vcpus. Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Ricardo Koller <ricarkol@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211005011921.437353-3-ricarkol@google.com	2021-10-11 09:31:41 +01:00
Ricardo Koller	f25c5e4daf	kvm: arm64: vgic: Introduce vgic_check_iorange Add the new vgic_check_iorange helper that checks that an iorange is sane: the start address and size have valid alignments, the range is within the addressable PA range, start+size doesn't overflow, and the start wasn't already defined. No functional change. Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Ricardo Koller <ricarkol@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211005011921.437353-2-ricarkol@google.com	2021-10-11 09:31:41 +01:00
Marc Zyngier	3134cc8beb	KVM: arm64: vgic: Resample HW pending state on deactivation When a mapped level interrupt (a timer, for example) is deactivated by the guest, the corresponding host interrupt is equally deactivated. However, the fate of the pending state still needs to be dealt with in SW. This is specially true when the interrupt was in the active+pending state in the virtual distributor at the point where the guest was entered. On exit, the pending state is potentially stale (the guest may have put the interrupt in a non-pending state). If we don't do anything, the interrupt will be spuriously injected in the guest. Although this shouldn't have any ill effect (spurious interrupts are always possible), we can improve the emulation by detecting the deactivation-while-pending case and resample the interrupt. While we're at it, move the logic into a common helper that can be shared between the two GIC implementations. Fixes: `e40cc57bac` ("KVM: arm/arm64: vgic: Support level-triggered mapped interrupts") Reported-by: Raghavendra Rao Ananta <rananta@google.com> Tested-by: Raghavendra Rao Ananta <rananta@google.com> Reviewed-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210819180305.1670525-1-maz@kernel.org	2021-08-20 08:53:22 +01:00

1 2

97 Commits