linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2025-01-03 20:34:58 +08:00

Author	SHA1	Message	Date
Mark Brown	3bb72d86d8	arm64: Always use individual bits in CPACR floating point enables CPACR_EL1 has several bitfields for controlling traps for floating point features to EL1, each of which has a separate bits for EL0 and EL1. Marc Zyngier noted that we are not consistent in our use of defines to manipulate these, sometimes using a define covering the whole field and sometimes using defines for the individual bits. Make this consistent by expanding the whole field defines where they are used (currently only in the KVM code) and deleting them so that no further uses can be introduced. Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220207152109.197566-3-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2022-02-25 14:28:18 +00:00
Mark Brown	879358fc67	arm64: Define CPACR_EL1_FPEN similarly to other floating point controls The base floating point, SVE and SME all have enable controls for EL0 and EL1 in CPACR_EL1 which have a similar layout and function. Currently the basic floating point enable FPEN is defined differently to the SVE control, specified as a single define in kvm_arm.h rather than in sysreg.h. Move the define to sysreg.h and provide separate EL0 and EL1 control bits so code managing the different floating point enables can look consistent. Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220207152109.197566-2-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2022-02-25 14:28:18 +00:00
Fangrui Song	4013e26670	arm64: module: remove (NOLOAD) from linker script On ELF, (NOLOAD) sets the section type to SHT_NOBITS[1]. It is conceptually inappropriate for .plt and .text.* sections which are always SHT_PROGBITS. In GNU ld, if PLT entries are needed, .plt will be SHT_PROGBITS anyway and (NOLOAD) will be essentially ignored. In ld.lld, since https://reviews.llvm.org/D118840 ("[ELF] Support (TYPE=<value>) to customize the output section type"), ld.lld will report a `section type mismatch` error. Just remove (NOLOAD) to fix the error. [1] https://lld.llvm.org/ELF/linker_script.html As of today, "The section should be marked as not loadable" on https://sourceware.org/binutils/docs/ld/Output-Section-Type.html is outdated for ELF. Tested-by: Nathan Chancellor <nathan@kernel.org> Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Fangrui Song <maskray@google.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20220218081209.354383-1-maskray@google.com Signed-off-by: Will Deacon <will@kernel.org>	2022-02-25 14:06:50 +00:00
Vladimir Murzin	def8c222f0	arm64: Add support of PAuth QARMA3 architected algorithm QARMA3 is relaxed version of the QARMA5 algorithm which expected to reduce the latency of calculation while still delivering a suitable level of security. Support for QARMA3 can be discovered via ID_AA64ISAR2_EL1 APA3, bits [15:12] Indicates whether the QARMA3 algorithm is implemented in the PE for address authentication in AArch64 state. GPA3, bits [11:8] Indicates whether the QARMA3 algorithm is implemented in the PE for generic code authentication in AArch64 state. Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220224124952.119612-4-vladimir.murzin@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2022-02-25 13:38:52 +00:00
Arnd Bergmann	12700c17fc	uaccess: generalize access_ok() There are many different ways that access_ok() is defined across architectures, but in the end, they all just compare against the user_addr_max() value or they accept anything. Provide one definition that works for most architectures, checking against TASK_SIZE_MAX for user processes or skipping the check inside of uaccess_kernel() sections. For architectures without CONFIG_SET_FS(), this should be the fastest check, as it comes down to a single comparison of a pointer against a compile-time constant, while the architecture specific versions tend to do something more complex for historic reasons or get something wrong. Type checking for __user annotations is handled inconsistently across architectures, but this is easily simplified as well by using an inline function that takes a 'const void __user *' argument. A handful of callers need an extra __user annotation for this. Some architectures had trick to use 33-bit or 65-bit arithmetic on the addresses to calculate the overflow, however this simpler version uses fewer registers, which means it can produce better object code in the end despite needing a second (statically predicted) branch. Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Mark Rutland <mark.rutland@arm.com> [arm64, asm-generic] Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Stafford Horne <shorne@gmail.com> Acked-by: Dinh Nguyen <dinguyen@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2022-02-25 09:36:05 +01:00
Arnd Bergmann	52fe8d125c	arm64: simplify access_ok() arm64 has an inline asm implementation of access_ok() that is derived from the 32-bit arm version and optimized for the case that both the limit and the size are variable. With set_fs() gone, the limit is always constant, and the size usually is as well, so just using the default implementation reduces the check into a comparison against a constant that can be scheduled by the compiler. On a defconfig build, this saves over 28KB of .text. Acked-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2022-02-25 09:36:05 +01:00
Arnd Bergmann	34737e2698	uaccess: add generic __{get,put}_kernel_nofault Nine architectures are still missing __{get,put}_kernel_nofault: alpha, ia64, microblaze, nds32, nios2, openrisc, sh, sparc32, xtensa. Add a generic version that lets everything use the normal copy_{from,to}_kernel_nofault() code based on these, removing the last use of get_fs()/set_fs() from architecture-independent code. Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2022-02-25 09:36:05 +01:00
James Morse	228a26b912	arm64: Use the clearbhb instruction in mitigations Future CPUs may implement a clearbhb instruction that is sufficient to mitigate SpectreBHB. CPUs that implement this instruction, but not CSV2.3 must be affected by Spectre-BHB. Add support to use this instruction as the BHB mitigation on CPUs that support it. The instruction is in the hint space, so it will be treated by a NOP as older CPUs. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com>	2022-02-24 14:02:44 +00:00
James Morse	558c303c97	arm64: Mitigate spectre style branch history side channels Speculation attacks against some high-performance processors can make use of branch history to influence future speculation. When taking an exception from user-space, a sequence of branches or a firmware call overwrites or invalidates the branch history. The sequence of branches is added to the vectors, and should appear before the first indirect branch. For systems using KPTI the sequence is added to the kpti trampoline where it has a free register as the exit from the trampoline is via a 'ret'. For systems not using KPTI, the same register tricks are used to free up a register in the vectors. For the firmware call, arch-workaround-3 clobbers 4 registers, so there is no choice but to save them to the EL1 stack. This only happens for entry from EL0, so if we take an exception due to the stack access, it will not become re-entrant. For KVM, the existing branch-predictor-hardening vectors are used. When a spectre version of these vectors is in use, the firmware call is sufficient to mitigate against Spectre-BHB. For the non-spectre versions, the sequence of branches is added to the indirect vector. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com>	2022-02-24 13:58:52 +00:00
Ingo Molnar	669f45f19c	sched/headers: Add initial new headers as identity mappings This allows code sharing between fast-headers tree and the vanilla scheduler tree. Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Peter Zijlstra <peterz@infradead.org>	2022-02-23 10:58:28 +01:00
Peter Collingbourne	38ddf7dafa	arm64: mte: avoid clearing PSTATE.TCO on entry unless necessary On some microarchitectures, clearing PSTATE.TCO is expensive. Clearing TCO is only necessary if in-kernel MTE is enabled, or if MTE is enabled in the userspace process in synchronous (or, soon, asymmetric) mode, because we do not report uaccess faults to userspace in none or asynchronous modes. Therefore, adjust the kernel entry code to clear TCO only if necessary. Because it is now possible to switch to a task in which TCO needs to be clear from a task in which TCO is set, we also need to do the same thing on task switch. Signed-off-by: Peter Collingbourne <pcc@google.com> Link: https://linux-review.googlesource.com/id/I52d82a580bd0500d420be501af2c35fa8c90729e Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220219012945.894950-2-pcc@google.com Signed-off-by: Will Deacon <will@kernel.org>	2022-02-22 21:48:44 +00:00
Hou Tao	fa1114d9eb	arm64: insn: add encoders for atomic operations It is a preparation patch for eBPF atomic supports under arm64. eBPF needs support atomic[64]_fetch_add, atomic[64]_[fetch_]{and,or,xor} and atomic[64]_{xchg\|cmpxchg}. The ordering semantics of eBPF atomics are the same with the implementations in linux kernel. Add three helpers to support LDCLR/LDEOR/LDSET/SWP, CAS and DMB instructions. STADD/STCLR/STEOR/STSET are simply encoded as aliases for LDADD/LDCLR/LDEOR/LDSET with XZR as the destination register, so no extra helper is added. atomic_fetch_add() and other atomic ops needs support for STLXR instruction, so extend enum aarch64_insn_ldst_type to do that. LDADD/LDEOR/LDSET/SWP and CAS instructions are only available when LSE atomics is enabled, so just return AARCH64_BREAK_FAULT directly in these newly-added helpers if CONFIG_ARM64_LSE_ATOMICS is disabled. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20220217072232.1186625-3-houtao1@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2022-02-22 21:25:48 +00:00
Hou Tao	97e58e395e	arm64: move AARCH64_BREAK_FAULT into insn-def.h If CONFIG_ARM64_LSE_ATOMICS is off, encoders for LSE-related instructions can return AARCH64_BREAK_FAULT directly in insn.h. In order to access AARCH64_BREAK_FAULT in insn.h, we can not include debug-monitors.h in insn.h, because debug-monitors.h has already depends on insn.h, so just move AARCH64_BREAK_FAULT into insn-def.h. It will be used by the following patch to eliminate unnecessary LSE-related encoders when CONFIG_ARM64_LSE_ATOMICS is off. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20220217072232.1186625-2-houtao1@huawei.com Signed-off-by: Will Deacon <will@kernel.org>	2022-02-22 21:25:48 +00:00
Mark Rutland	0f61f6be1f	arm64: clean up symbol aliasing Now that we have SYM_FUNC_ALIAS() and SYM_FUNC_ALIAS_WEAK(), use those to simplify and more consistently define function aliases across arch/arm64. Aliases are now defined in terms of a canonical function name. For position-independent functions I've made the __pi_<func> name the canonical name, and defined other alises in terms of this. The SYM_FUNC_{START,END}_PI(func) macros obscure the __pi_<func> name, and make this hard to seatch for. The SYM_FUNC_START_WEAK_PI() macro also obscures the fact that the __pi_<func> fymbol is global and the <func> symbol is weak. For clarity, I have removed these macros and used SYM_FUNC_{START,END}() directly with the __pi_<func> name. For example: SYM_FUNC_START_WEAK_PI(func) ... asm insns ... SYM_FUNC_END_PI(func) EXPORT_SYMBOL(func) ... becomes: SYM_FUNC_START(__pi_func) ... asm insns ... SYM_FUNC_END(__pi_func) SYM_FUNC_ALIAS_WEAK(func, __pi_func) EXPORT_SYMBOL(func) For clarity, where there are multiple annotations such as EXPORT_SYMBOL(), I've tried to keep annotations grouped by symbol. For example, where a function has a name and an alias which are both exported, this is organised as: SYM_FUNC_START(func) ... asm insns ... SYM_FUNC_END(func) EXPORT_SYMBOL(func) SYM_FUNC_ALIAS(alias, func) EXPORT_SYMBOL(alias) For consistency with the other string functions, I've defined strrchr as a position-independent function, as it can safely be used as such even though we have no users today. As we no longer use SYM_FUNC_{START,END}_ALIAS(), our local copies are removed. The common versions will be removed by a subsequent patch. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Mark Brown <broonie@kernel.org> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220216162229.1076788-3-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2022-02-22 16:21:34 +00:00
Ingo Molnar	6255b48aeb	Linux 5.17-rc5 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmISrYgeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGg20IAKDZr7rfSHBopjQV Cocw744tom0XuxpvSZpp2GGOOXF+tkswcNNaRIrbGOl1mkyxA7eBZCTMpDeDS9aQ wB0D0Gxx8QBAJp4KgB1W7TB+hIGes/rs8Ve+6iO4ulLLdCVWX/q2boI0aZ7QX9O9 qNi8OsoZQtk6falRvciZFHwV5Av1p2Sy1AW57udQ7DvJ4H98AfKf1u8/z208WWW8 1ixC+qJxQcUcM9vI+7P9Tt7NbFSKv8SvAmqjFY7P+DxQAsVw6KXoqVXykDzeOv0t fUNOE/t0oFZafwtn8h7KBQnwS9lH03+3KkslVZs+iMFyUj/Bar+NVVyKoDhWXtVg /PuMhEg= =eU1o -----END PGP SIGNATURE----- Merge tag 'v5.17-rc5' into sched/core, to resolve conflicts New conflicts in sched/core due to the following upstream fixes: `44585f7bc0` ("psi: fix "defined but not used" warnings when CONFIG_PROC_FS=n") `a06247c680` ("psi: Fix uaf issue when psi trigger is destroyed while being polled") Conflicts: include/linux/psi_types.h kernel/sched/psi.c Signed-off-by: Ingo Molnar <mingo@kernel.org>	2022-02-21 11:53:51 +01:00
Mark Rutland	1b2d3451ee	arm64: Support PREEMPT_DYNAMIC This patch enables support for PREEMPT_DYNAMIC on arm64, allowing the preemption model to be chosen at boot time. Specifically, this patch selects HAVE_PREEMPT_DYNAMIC_KEY, so that each preemption function is an out-of-line call with an early return depending upon a static key. This leaves almost all the codegen up to the compiler, and side-steps a number of pain points with static calls (e.g. interaction with CFI schemes). This should have no worse overhead than using non-inline static calls, as those use out-of-line trampolines with early returns. For example, the dynamic_cond_resched() wrapper looks as follows when enabled. When disabled, the first `B` is replaced with a `NOP`, resulting in an early return. \| <dynamic_cond_resched>: \| bti c \| b <dynamic_cond_resched+0x10> // or `nop` \| mov w0, #0x0 \| ret \| mrs x0, sp_el0 \| ldr x0, [x0, #8] \| cbnz x0, <dynamic_cond_resched+0x8> \| paciasp \| stp x29, x30, [sp, #-16]! \| mov x29, sp \| bl <preempt_schedule_common> \| mov w0, #0x1 \| ldp x29, x30, [sp], #16 \| autiasp \| ret ... compared to the regular form of the function: \| <__cond_resched>: \| bti c \| mrs x0, sp_el0 \| ldr x1, [x0, #8] \| cbz x1, <__cond_resched+0x18> \| mov w0, #0x0 \| ret \| paciasp \| stp x29, x30, [sp, #-16]! \| mov x29, sp \| bl <preempt_schedule_common> \| mov w0, #0x1 \| ldp x29, x30, [sp], #16 \| autiasp \| ret Since arm64 does not yet use the generic entry code, we must define our own `sk_dynamic_irqentry_exit_cond_resched`, which will be enabled/disabled by the common code in kernel/sched/core.c. All other preemption functions and associated static keys are defined there. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/r/20220214165216.2231574-8-mark.rutland@arm.com	2022-02-19 11:11:09 +01:00
James Morse	dee435be76	arm64: proton-pack: Report Spectre-BHB vulnerabilities as part of Spectre-v2 Speculation attacks against some high-performance processors can make use of branch history to influence future speculation as part of a spectre-v2 attack. This is not mitigated by CSV2, meaning CPUs that previously reported 'Not affected' are now moderately mitigated by CSV2. Update the value in /sys/devices/system/cpu/vulnerabilities/spectre_v2 to also show the state of the BHB mitigation. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com>	2022-02-16 13:22:26 +00:00
James Morse	bd09128d16	arm64: Add percpu vectors for EL1 The Spectre-BHB workaround adds a firmware call to the vectors. This is needed on some CPUs, but not others. To avoid the unaffected CPU in a big/little pair from making the firmware call, create per cpu vectors. The per-cpu vectors only apply when returning from EL0. Systems using KPTI can use the canonical 'full-fat' vectors directly at EL1, the trampoline exit code will switch to this_cpu_vector on exit to EL0. Systems not using KPTI should always use this_cpu_vector. this_cpu_vector will point at a vector in tramp_vecs or __bp_harden_el1_vectors, depending on whether KPTI is in use. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com>	2022-02-16 13:17:30 +00:00
James Morse	ba2689234b	arm64: entry: Add vectors that have the bhb mitigation sequences Some CPUs affected by Spectre-BHB need a sequence of branches, or a firmware call to be run before any indirect branch. This needs to go in the vectors. No CPU needs both. While this can be patched in, it would run on all CPUs as there is a single set of vectors. If only one part of a big/little combination is affected, the unaffected CPUs have to run the mitigation too. Create extra vectors that include the sequence. Subsequent patches will allow affected CPUs to select this set of vectors. Later patches will modify the loop count to match what the CPU requires. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com>	2022-02-16 13:16:08 +00:00
Catalin Marinas	ab1e435ca7	arm64: mte: Define the number of bytes for storing the tags in a page Rather than explicitly calculating the number of bytes for a compact tag storage format corresponding to a page, just add a MTE_PAGE_TAG_STORAGE macro. With the current MTE implementation of 4 bits per tag, we store 2 tags in a byte. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Luis Machado <luis.machado@linaro.org> Link: https://lore.kernel.org/r/20220131165456.2160675-4-catalin.marinas@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2022-02-15 22:53:29 +00:00
Mark Rutland	16860a209c	arm64: atomics: remove redundant static branch Due to a historical oversight, we emit a redundant static branch for each atomic/atomic64 operation when CONFIG_ARM64_LSE_ATOMICS is selected. We can safely remove this, making the kernel Image reasonably smaller. When CONFIG_ARM64_LSE_ATOMICS is selected, every LSE atomic operation has two preceding static branches with the same target, e.g. b f7c <kernel_init_freeable+0xa4> b f7c <kernel_init_freeable+0xa4> mov w0, #0x1 // #1 ldadd w0, w0, [x19] This is because the __lse_ll_sc_body() wrapper uses system_uses_lse_atomics(), which checks both `arm64_const_caps_ready` and `cpu_hwcap_keys[ARM64_HAS_LSE_ATOMICS]`, each of which emits a static branch. This has been the case since commit: `addfc38672` ("arm64: atomics: avoid out-of-line ll/sc atomics") However, there was never a need to check `arm64_const_caps_ready`, which was itself introduced in commit: `63a1e1c95e` ("arm64/cpufeature: don't use mutex in bringup path") ... so that cpus_have_const_cap() could fall back to checking the `cpu_hwcaps` bitmap prior to the static keys for individual caps becoming enabled. As system_uses_lse_atomics() doesn't check `cpu_hwcaps`, and doesn't need to as we can safely use the LL/SC atomics prior to enabling the `ARM64_HAS_LSE_ATOMICS` static key, it doesn't need to check `arm64_const_caps_ready`. This patch removes the `arm64_const_caps_ready` check from system_uses_lse_atomics(). As the arch_atomic_* routines are meant to be safely usable in noinstr code, I've also marked system_uses_lse_atomics() as __always_inline. This results in one fewer static branch per atomic operation, with the prior example becoming: b f78 <kernel_init_freeable+0xa0> mov w0, #0x1 // #1 ldadd w0, w0, [x19] Each static branch consists of the branch itself and an associated __jump_table entry. Removing these has a reasonable impact on the Image size, with a GCC 11.1.0 defconfig v5.17-rc2 Image being reduced by 128KiB: \| [mark@lakrids:~/src/linux]% ls -al Image* \| -rw-r--r-- 1 mark mark 34619904 Feb 3 18:24 Image.baseline \| -rw-r--r-- 1 mark mark 34488832 Feb 3 18:33 Image.onebranch Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Suzuki Poulose <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220204104439.270567-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2022-02-15 17:54:08 +00:00
James Morse	a9c406e646	arm64: entry: Allow the trampoline text to occupy multiple pages Adding a second set of vectors to .entry.tramp.text will make it larger than a single 4K page. Allow the trampoline text to occupy up to three pages by adding two more fixmap slots. Previous changes to tramp_valias allowed it to reach beyond a single page. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com>	2022-02-15 17:40:28 +00:00
James Morse	c091fb6ae0	arm64: entry: Move the trampoline data page before the text page The trampoline code has a data page that holds the address of the vectors, which is unmapped when running in user-space. This ensures that with CONFIG_RANDOMIZE_BASE, the randomised address of the kernel can't be discovered until after the kernel has been mapped. If the trampoline text page is extended to include multiple sets of vectors, it will be larger than a single page, making it tricky to find the data page without knowing the size of the trampoline text pages, which will vary with PAGE_SIZE. Move the data page to appear before the text page. This allows the data page to be found without knowing the size of the trampoline text pages. 'tramp_vectors' is used to refer to the beginning of the .entry.tramp.text section, do that explicitly. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com>	2022-02-15 17:39:14 +00:00
James Morse	5bdf343760	KVM: arm64: Allow indirect vectors to be used without SPECTRE_V3A CPUs vulnerable to Spectre-BHB either need to make an SMC-CC firmware call from the vectors, or run a sequence of branches. This gets added to the hyp vectors. If there is no support for arch-workaround-1 in firmware, the indirect vector will be used. kvm_init_vector_slots() only initialises the two indirect slots if the platform is vulnerable to Spectre-v3a. pKVM's hyp_map_vectors() only initialises __hyp_bp_vect_base if the platform is vulnerable to Spectre-v3a. As there are about to more users of the indirect vectors, ensure their entries in hyp_spectre_vector_selector[] are always initialised, and __hyp_bp_vect_base defaults to the regular VA mapping. The Spectre-v3a check is moved to a helper kvm_system_needs_idmapped_vectors(), and merged with the code that creates the hyp mappings. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: James Morse <james.morse@arm.com>	2022-02-15 17:38:25 +00:00
Anshuman Khandual	e921da6bc7	arm64/mm: Consolidate TCR_EL1 fields This renames and moves SYS_TCR_EL1_TCMA1 and SYS_TCR_EL1_TCMA0 definitions into pgtable-hwdef.h thus consolidating all TCR fields in a single header. This does not cause any functional change. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/1643121513-21854-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2022-02-15 15:34:22 +00:00
Ard Biesheuvel	35bde68bba	arm64: random: implement arch_get_random_int/_long based on RNDR When support for RNDR/RNDRRS was introduced, we elected to only implement arch_get_random_seed_int/_long(), and back them by RNDR instead of RNDRRS. This was needed to prevent potential performance and/or starvation issues resulting from the fact that the /dev/random driver used to invoke these routines on various hot paths. These issues have all been addressed now [0] [1], and so we can wire up this API more straight-forwardly: - map arch_get_random_int/_long() onto RNDR, which returns the output of a DRBG that is reseeded at an implemented defined rate; - map arch_get_random_seed_int/_long() onto the TRNG firmware service, which returns true, conditioned entropy, or onto RNDRRS if the TRNG service is unavailable, which returns the output of a DRBG that is reseeded every time it is used. [0] `390596c995` random: avoid arch_get_random_seed_long() when collecting IRQ randomness [1] `2ee25b6968` random: avoid superfluous call to RDRAND in CRNG extraction Cc: Andre Przywara <andre.przywara@arm.com> Cc: Mark Brown <broonie@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220113131239.1610455-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org>	2022-02-15 15:06:39 +00:00
Joakim Tjernlund	4f6de676d9	arm64: Correct wrong label in macro __init_el2_gicv3 In commit: `114945d84a` ("arm64: Fix labels in el2_setup macros") We renamed a label from '1' to '.Lskip_gicv3_\@', but failed to update a branch to it, which now targets a later label also called '1'. The branch is taken rarely, when GICv3 is present but SRE is disabled at EL3, causing a boot-time crash. Update the caller to the new label name. Fixes: `114945d84a` ("arm64: Fix labels in el2_setup macros") Cc: <stable@vger.kernel.org> # 5.12.x Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com> Link: https://lore.kernel.org/r/20220214175643.21931-1-joakim.tjernlund@infinera.com Reviewed-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-02-14 18:37:07 +00:00
Ard Biesheuvel	297565aa22	lib/xor: make xor prototypes more friendly to compiler vectorization Modern compilers are perfectly capable of extracting parallelism from the XOR routines, provided that the prototypes reflect the nature of the input accurately, in particular, the fact that the input vectors are expected not to overlap. This is not documented explicitly, but is implied by the interchangeability of the various C routines, some of which use temporary variables while others don't: this means that these routines only behave identically for non-overlapping inputs. So let's decorate these input vectors with the __restrict modifier, which informs the compiler that there is no overlap. While at it, make the input-only vectors pointer-to-const as well. Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://github.com/ClangBuiltLinux/linux/issues/563 Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2022-02-11 20:39:39 +11:00
Marc Zyngier	00e6dae00e	Merge branch kvm-arm64/pmu-bl into kvmarm-master/next * kvm-arm64/pmu-bl: : . : Improve PMU support on heterogeneous systems, courtesy of Alexandru Elisei : . KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPU KVM: arm64: Add KVM_ARM_VCPU_PMU_V3_SET_PMU attribute KVM: arm64: Keep a list of probed PMUs KVM: arm64: Keep a per-VM pointer to the default PMU perf: Fix wrong name in comment for struct perf_cpu_context KVM: arm64: Do not change the PMU event filter after a VCPU has run Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-02-08 17:54:41 +00:00
Alexandru Elisei	583cda1b0e	KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPU Userspace can assign a PMU to a VCPU with the KVM_ARM_VCPU_PMU_V3_SET_PMU device ioctl. If the VCPU is scheduled on a physical CPU which has a different PMU, the perf events needed to emulate a guest PMU won't be scheduled in and the guest performance counters will stop counting. Treat it as an userspace error and refuse to run the VCPU in this situation. Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220127161759.53553-7-alexandru.elisei@arm.com	2022-02-08 17:51:22 +00:00
Marc Zyngier	46b1878214	KVM: arm64: Keep a per-VM pointer to the default PMU As we are about to allow selection of the PMU exposed to a guest, start by keeping track of the default one instead of only the PMU version. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Link: https://lore.kernel.org/r/20220127161759.53553-4-alexandru.elisei@arm.com	2022-02-08 17:51:21 +00:00
Marc Zyngier	5177fe91e4	KVM: arm64: Do not change the PMU event filter after a VCPU has run Userspace can specify which events a guest is allowed to use with the KVM_ARM_VCPU_PMU_V3_FILTER attribute. The list of allowed events can be identified by a guest from reading the PMCEID{0,1}_EL0 registers. Changing the PMU event filter after a VCPU has run can cause reads of the registers performed before the filter is changed to return different values than reads performed with the new event filter in place. The architecture defines the two registers as read-only, and this behaviour contradicts that. Keep track when the first VCPU has run and deny changes to the PMU event filter to prevent this from happening. Signed-off-by: Marc Zyngier <maz@kernel.org> [ Alexandru E: Added commit message, updated ioctl documentation ] Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220127161759.53553-2-alexandru.elisei@arm.com	2022-02-08 17:51:21 +00:00
Marc Zyngier	ebca68972e	Merge branch kvm-arm64/vmid-allocator into kvmarm-master/next * kvm-arm64/vmid-allocator: : . : VMID allocation rewrite from Shameerali Kolothum Thodi, paving the : way for pinned VMIDs and SVA. : . KVM: arm64: Make active_vmids invalid on vCPU schedule out KVM: arm64: Align the VMID allocation with the arm64 ASID KVM: arm64: Make VMID bits accessible outside of allocator KVM: arm64: Introduce a new VMID allocator for KVM Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-02-08 14:58:38 +00:00
Shameer Kolothum	100b4f092f	KVM: arm64: Make active_vmids invalid on vCPU schedule out Like ASID allocator, we copy the active_vmids into the reserved_vmids on a rollover. But it's unlikely that every CPU will have a vCPU as current task and we may end up unnecessarily reserving the VMID space. Hence, set active_vmids to an invalid one when scheduling out a vCPU. Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211122121844.867-5-shameerali.kolothum.thodi@huawei.com	2022-02-08 14:57:04 +00:00
Julien Grall	3248136b36	KVM: arm64: Align the VMID allocation with the arm64 ASID At the moment, the VMID algorithm will send an SGI to all the CPUs to force an exit and then broadcast a full TLB flush and I-Cache invalidation. This patch uses the new VMID allocator. The benefits are: - Aligns with arm64 ASID algorithm. - CPUs are not forced to exit at roll-over. Instead, the VMID will be marked reserved and context invalidation is broadcasted. This will reduce the IPIs traffic. - More flexible to add support for pinned KVM VMIDs in the future. With the new algo, the code is now adapted: - The call to update_vmid() will be done with preemption disabled as the new algo requires to store information per-CPU. Signed-off-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211122121844.867-4-shameerali.kolothum.thodi@huawei.com	2022-02-08 14:57:03 +00:00
Shameer Kolothum	f8051e9609	KVM: arm64: Make VMID bits accessible outside of allocator Since we already set the kvm_arm_vmid_bits in the VMID allocator init function, make it accessible outside as well so that it can be used in the subsequent patch. Suggested-by: Will Deacon <will@kernel.org> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211122121844.867-3-shameerali.kolothum.thodi@huawei.com	2022-02-08 14:46:28 +00:00
Shameer Kolothum	417838392f	KVM: arm64: Introduce a new VMID allocator for KVM A new VMID allocator for arm64 KVM use. This is based on arm64 ASID allocator algorithm. One major deviation from the ASID allocator is the way we flush the context. Unlike ASID allocator, we expect less frequent rollover in the case of VMIDs. Hence, instead of marking the CPU as flush_pending and issuing a local context invalidation on the next context switch, we broadcast TLB flush + I-cache invalidation over the inner shareable domain on rollover. Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211122121844.867-2-shameerali.kolothum.thodi@huawei.com	2022-02-08 14:46:28 +00:00
Marc Zyngier	2bb48074b3	Merge branch kvm-arm64/mmu-rwlock into kvmarm-master/next * kvm-arm64/mmu-rwlock: : . : MMU locking optimisations from Jing Zhang, allowing permission : relaxations to occur in parallel. : . KVM: selftests: Add vgic initialization for dirty log perf test for ARM KVM: arm64: Add fast path to handle permission relaxation during dirty logging KVM: arm64: Use read/write spin lock for MMU protection Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-02-08 14:29:29 +00:00
Jing Zhang	fcc5bf8963	KVM: arm64: Use read/write spin lock for MMU protection Replace MMU spinlock with rwlock and update all instances of the lock being acquired with a write lock acquisition. Future commit will add a fast path for permission relaxation during dirty logging under a read lock. Signed-off-by: Jing Zhang <jingzhangos@google.com> Tested-by: Fuad Tabba <tabba@google.com> Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220118015703.3630552-2-jingzhangos@google.com	2022-02-08 14:27:52 +00:00
Oliver Upton	7dabf02f43	KVM: arm64: Emulate the OS Lock The OS lock blocks all debug exceptions at every EL. To date, KVM has not implemented the OS lock for its guests, despite the fact that it is mandatory per the architecture. Simple context switching between the guest and host is not appropriate, as its effects are not constrained to the guest context. Emulate the OS Lock by clearing MDE and SS in MDSCR_EL1, thereby blocking all but software breakpoint instructions. Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220203174159.2887882-5-oupton@google.com	2022-02-08 14:23:41 +00:00
Oliver Upton	f24adc65c5	KVM: arm64: Allow guest to set the OSLK bit Allow writes to OSLAR and forward the OSLK bit to OSLSR. Do nothing with the value for now. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220203174159.2887882-4-oupton@google.com	2022-02-08 14:23:41 +00:00
Oliver Upton	d42e26716d	KVM: arm64: Stash OSLSR_EL1 in the cpu context An upcoming change to KVM will emulate the OS Lock from the PoV of the guest. Add OSLSR_EL1 to the cpu context and handle reads using the stored value. Define some mnemonics for for handling the OSLM field and use them to make the reset value of OSLSR_EL1 more readable. Wire up a custom handler for writes from userspace and prevent any of the invariant bits from changing. Note that the OSLK bit is not invariant and will be made writable by the aforementioned change. Reviewed-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Oliver Upton <oupton@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220203174159.2887882-3-oupton@google.com	2022-02-08 14:23:41 +00:00
Marc Zyngier	11db7410cf	irqchip/apple-aic: Move PMU-specific registers to their own include file As we are about to have a PMU driver, move the PMU bits from the AIC driver into a common include file. Reviewed-by: Hector Martin <marcan@marcan.st> Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-02-07 16:00:42 +00:00
Catalin Marinas	df20597044	coresight: trbe: Workaround Cortex-A510 erratas This pull request is providing arm64 definitions to support TRBE Cortex-A510 erratas. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> -----BEGIN PGP SIGNATURE----- iQFPBAABCgA5FiEEeTrpXvBwUkra1RYWo5FxFnwrV6EFAmHy7RkbHG1hdGhpZXUu cG9pcmllckBsaW5hcm8ub3JnAAoJEKORcRZ8K1ehmvEH/2O01bctGwhEtg7wRxdu b/pksZSJZkrTq7cUU/xRUzGEj38owoYb/QFle1+e1qMW8Lt5nI11xLLCuBxTTZFT zazoYnHHciKK5kiQSCK1cN4hTjGfL0dn/cEUkwGMA9PX6B8jG+WvMEHYXZkebt5b BV88QUNB5+S5PPZzF+UczLVQoZ1UmlwkoVyTpRQN97qunqOZ6C1esDgOeghAXTg4 EKni3tl7IkkuDDsWvg4ez4hvnYbCbPaMaFqVI81n1NGHl2fhsKAa3GXKzj+wiG8H gQEXw0q8G8rxJ4Ik/K4/VApWGrqFFSCFCeho8GFqxputUkzGoCRZ1U6JPQIbFWrN lJM= =HLQt -----END PGP SIGNATURE----- Merge tag 'trbe-cortex-a510-errata' of gitolite.kernel.org:pub/scm/linux/kernel/git/coresight/linux into for-next/fixes coresight: trbe: Workaround Cortex-A510 erratas This pull request is providing arm64 definitions to support TRBE Cortex-A510 erratas. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> * tag 'trbe-cortex-a510-errata' of gitolite.kernel.org:pub/scm/linux/kernel/git/coresight/linux: arm64: errata: Add detection for TRBE trace data corruption arm64: errata: Add detection for TRBE invalid prohibited states arm64: errata: Add detection for TRBE ignored system register writes arm64: Add Cortex-A510 CPU part definition	2022-01-28 16:14:06 +00:00
Anshuman Khandual	53960faf2b	arm64: Add Cortex-A510 CPU part definition Add the CPU Partnumbers for the new Arm designs. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Suzuki Poulose <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/1643120437-14352-2-git-send-email-anshuman.khandual@arm.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>	2022-01-27 12:01:53 -07:00
Anshuman Khandual	72bb9dcb6c	arm64: Add Cortex-X2 CPU part definition Add the CPU Partnumbers for the new Arm designs. Cc: Will Deacon <will@kernel.org> Cc: Suzuki Poulose <suzuki.poulose@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Link: https://lore.kernel.org/r/1642994138-25887-2-git-send-email-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-01-24 14:20:50 +00:00
Linus Torvalds	3689f9f8b0	bitmap patches for 5.17-rc1 -----BEGIN PGP SIGNATURE----- iQHJBAABCgAzFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmHi+xgVHHl1cnkubm9y b3ZAZ21haWwuY29tAAoJELFEgP06H77IxdoMAMf3E+L51Ys/4iAiyJQNVoT3aIBC A8ZVOB9he1OA3o3wBNIRKmICHk+ovnfCWcXTr9fG/Ade2wJz88NAsGPQ1Phywb+s iGlpySllFN72RT9ZqtJhLEzgoHHOL0CzTW07TN9GJy4gQA2h2G9CTP+OmsQdnVqE m9Fn3PSlJ5lhzePlKfnln8rGZFgrriJakfEFPC79n/7an4+2Hvkb5rWigo7KQc4Z 9YNqYUcHWZFUgq80adxEb9LlbMXdD+Z/8fCjOrAatuwVkD4RDt6iKD0mFGjHXGL7 MZ9KRS8AfZXawmetk3jjtsV+/QkeS+Deuu7k0FoO0Th2QV7BGSDhsLXAS5By/MOC nfSyHhnXHzCsBMyVNrJHmNhEZoN29+tRwI84JX9lWcf/OLANcCofnP6f2UIX7tZY CAZAgVELp+0YQXdybrfzTQ8BT3TinjS/aZtCrYijRendI1GwUXcyl69vdOKqAHuk 5jy8k/xHyp+ZWu6v+PyAAAEGowY++qhL0fmszA== =RKW4 -----END PGP SIGNATURE----- Merge tag 'bitmap-5.17-rc1' of git://github.com/norov/linux Pull bitmap updates from Yury Norov: - introduce for_each_set_bitrange() - use find_first__bit() instead of find_next__bit() where possible - unify for_each_bit() macros * tag 'bitmap-5.17-rc1' of git://github.com/norov/linux: vsprintf: rework bitmap_list_string lib: bitmap: add performance test for bitmap_print_to_pagebuf bitmap: unify find_bit operations mm/percpu: micro-optimize pcpu_is_populated() Replace for_each__bit_from() with for_each__bit() where appropriate find: micro-optimize for_each_{set,clear}_bit() include/linux: move for_each_bit() macros from bitops.h to find.h cpumask: replace cpumask_next_* with cpumask_first_* where appropriate tools: sync tools/bitmap with mother linux all: replace find_next{,_zero}_bit with find_first{,_zero}_bit where appropriate cpumask: use find_first_and_bit() lib: add find_first_and_bit() arch: remove GENERIC_FIND_FIRST_BIT entirely include: move find.h from asm_generic to linux bitops: move find_bit_*_le functions from le.h to find.h bitops: protect find_first_{,zero}_bit properly	2022-01-23 06:20:44 +02:00
Linus Torvalds	b21bae9af1	arm64 fixes/cleanups: - Add brackets to the io_stop_wc macro. - Avoid -Warray-bounds warning with the LSE atomics inline asm. - Apply __ro_after_init to memory_limit. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmHq44IACgkQa9axLQDI XvFbHBAAqVG6jcW7LfNYbGED28iIj08c2o0Y4+p87caXR0xNDobrPYtUl3zy76l9 lJ+OTrUdBCBrd5exHghyJPvnC+dB1yjSu595DrsN4eSJ0/jfkfyv+vGHTASoyoeK HHL6L9rj3jBlpgDMuRiJFggdUddTSocWMpLDjUbZCNDpQOkf8TakFYrAOosooixS g0Psq5fJ6aP+yymQMQuIJyGiRzskSilTdjFSbMSqFvrCPfzQMZWLnMtF5OAs9Gbi Wcxb1AKAJLZFfy+XslgK709jIeTZ6CExTbWoY/yP9sxqgz+ALbj/dQBY4ryyUMVp C/Vu/bMLEnPW5jgppGXftsnsjivmVj1frb2/5+IY8SHI6Niq+8dna774jOLy6Mad PjYWSg5QxQSw4loZJpGzYduf6G3keKjSSkgeNns3n4f1ZqzAgHCuFF7TglIOeQvp /8AmR3u5fHjp0sMl6DE6zJ+cMTXwAxdXZYPHY9PWexLEsKiS1vkMbN2LYcb4tIjt xjymVGUUZ5/qKQSeZoKCyhjdgc06G6A5W+A/TmgVDGqkBa/U33/PvwoAxYwmaIf8 z02/MEjliP3IRvG0KzX+Cw7SHLtJ+nIgaZQgSK4QZBmKrJQs6CI1M5V1g4gVhpmM sqwtAiIHpUspmu+dP/fF4vcGzvgnZam2kQDv5mv1vFy3Dh1yn0k= =LonE -----END PGP SIGNATURE----- Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes/cleanups from Catalin Marinas: "Some fixes that turned up during the merge window: - Add brackets to the io_stop_wc macro - Avoid -Warray-bounds warning with the LSE atomics inline asm - Apply __ro_after_init to memory_limit" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: mm: apply __ro_after_init to memory_limit arm64: atomics: lse: Dereference matching size asm-generic: Add missing brackets for io_stop_wc macro	2022-01-22 09:22:10 +02:00
Kees Cook	3364c6ce23	arm64: atomics: lse: Dereference matching size When building with -Warray-bounds, the following warning is generated: In file included from ./arch/arm64/include/asm/lse.h:16, from ./arch/arm64/include/asm/cmpxchg.h:14, from ./arch/arm64/include/asm/atomic.h:16, from ./include/linux/atomic.h:7, from ./include/asm-generic/bitops/atomic.h:5, from ./arch/arm64/include/asm/bitops.h:25, from ./include/linux/bitops.h:33, from ./include/linux/kernel.h:22, from kernel/printk/printk.c:22: ./arch/arm64/include/asm/atomic_lse.h:247:9: warning: array subscript 'long unsigned int[0]' is partly outside array bounds of 'atomic_t[1]' [-Warray-bounds] 247 \| asm volatile( \ \| ^~~ ./arch/arm64/include/asm/atomic_lse.h:266:1: note: in expansion of macro '__CMPXCHG_CASE' 266 \| __CMPXCHG_CASE(w, , acq_, 32, a, "memory") \| ^~~~~~~~~~~~~~ kernel/printk/printk.c:3606:17: note: while referencing 'printk_cpulock_owner' 3606 \| static atomic_t printk_cpulock_owner = ATOMIC_INIT(-1); \| ^~~~~~~~~~~~~~~~~~~~ This is due to the compiler seeing an unsigned long * cast against something (atomic_t) that is int sized. Replace the cast with the matching size cast. This results in no change in binary output. Note that __ll_sc__cmpxchg_case_##name##sz already uses the same constraint: [v] "+Q" ((u##sz )ptr Which is why only the LSE form needs updating and not the LL/SC form, so this change is unlikely to be problematic. Cc: Will Deacon <will@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: linux-arm-kernel@lists.infradead.org Acked-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220112202259.3950286-1-keescook@chromium.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2022-01-20 09:13:48 +00:00
Linus Torvalds	79e06c4c49	RISCV: - Use common KVM implementation of MMU memory caches - SBI v0.2 support for Guest - Initial KVM selftests support - Fix to avoid spurious virtual interrupts after clearing hideleg CSR - Update email address for Anup and Atish ARM: - Simplification of the 'vcpu first run' by integrating it into KVM's 'pid change' flow - Refactoring of the FP and SVE state tracking, also leading to a simpler state and less shared data between EL1 and EL2 in the nVHE case - Tidy up the header file usage for the nvhe hyp object - New HYP unsharing mechanism, finally allowing pages to be unmapped from the Stage-1 EL2 page-tables - Various pKVM cleanups around refcounting and sharing - A couple of vgic fixes for bugs that would trigger once the vcpu xarray rework is merged, but not sooner - Add minimal support for ARMv8.7's PMU extension - Rework kvm_pgtable initialisation ahead of the NV work - New selftest for IRQ injection - Teach selftests about the lack of default IPA space and page sizes - Expand sysreg selftest to deal with Pointer Authentication - The usual bunch of cleanups and doc update s390: - fix sigp sense/start/stop/inconsistency - cleanups x86: - Clean up some function prototypes more - improved gfn_to_pfn_cache with proper invalidation, used by Xen emulation - add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel delivery - completely remove potential TOC/TOU races in nested SVM consistency checks - update some PMCs on emulated instructions - Intel AMX support (joint work between Thomas and Intel) - large MMU cleanups - module parameter to disable PMU virtualization - cleanup register cache - first part of halt handling cleanups - Hyper-V enlightened MSR bitmap support for nested hypervisors Generic: - clean up Makefiles - introduce CONFIG_HAVE_KVM_DIRTY_RING - optimize memslot lookup using a tree - optimize vCPU array usage by converting to xarray -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmHhxvsUHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroPZkAf+Nz92UL/5nNGcdHtE4m7AToMmitE9 bYkesf9BMQvAe5wjkABLuoHGi6ay4jabo4fiGzbdkiK7lO5YgfsWiMB3/MT5fl4E jRPzaVQabp3YZLM8UYCBmfUVuRj524S967SfSRe0AvYjDEH8y7klPf4+7sCsFT0/ Px9Vf2KGuOlf0eM78yKg4rGaF0jS22eLgXm6FfNMY8/e29ZAo/jyUmqBY+Z2xxZG aWhceDtSheW1jwLHLj3nOlQJvHTn8LVGXBE/R8Gda3ZjrBV2rKaDi4Fh+HD+dz86 2zVXwzQ7uck2CMW73GMoXMTWoKSHMyvlBOs1BdvBm4UsnGcXR+q8IFCeuQ== =s73m -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "RISCV: - Use common KVM implementation of MMU memory caches - SBI v0.2 support for Guest - Initial KVM selftests support - Fix to avoid spurious virtual interrupts after clearing hideleg CSR - Update email address for Anup and Atish ARM: - Simplification of the 'vcpu first run' by integrating it into KVM's 'pid change' flow - Refactoring of the FP and SVE state tracking, also leading to a simpler state and less shared data between EL1 and EL2 in the nVHE case - Tidy up the header file usage for the nvhe hyp object - New HYP unsharing mechanism, finally allowing pages to be unmapped from the Stage-1 EL2 page-tables - Various pKVM cleanups around refcounting and sharing - A couple of vgic fixes for bugs that would trigger once the vcpu xarray rework is merged, but not sooner - Add minimal support for ARMv8.7's PMU extension - Rework kvm_pgtable initialisation ahead of the NV work - New selftest for IRQ injection - Teach selftests about the lack of default IPA space and page sizes - Expand sysreg selftest to deal with Pointer Authentication - The usual bunch of cleanups and doc update s390: - fix sigp sense/start/stop/inconsistency - cleanups x86: - Clean up some function prototypes more - improved gfn_to_pfn_cache with proper invalidation, used by Xen emulation - add KVM_IRQ_ROUTING_XEN_EVTCHN and event channel delivery - completely remove potential TOC/TOU races in nested SVM consistency checks - update some PMCs on emulated instructions - Intel AMX support (joint work between Thomas and Intel) - large MMU cleanups - module parameter to disable PMU virtualization - cleanup register cache - first part of halt handling cleanups - Hyper-V enlightened MSR bitmap support for nested hypervisors Generic: - clean up Makefiles - introduce CONFIG_HAVE_KVM_DIRTY_RING - optimize memslot lookup using a tree - optimize vCPU array usage by converting to xarray" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (268 commits) x86/fpu: Fix inline prefix warnings selftest: kvm: Add amx selftest selftest: kvm: Move struct kvm_x86_state to header selftest: kvm: Reorder vcpu_load_state steps for AMX kvm: x86: Disable interception for IA32_XFD on demand x86/fpu: Provide fpu_sync_guest_vmexit_xfd_state() kvm: selftests: Add support for KVM_CAP_XSAVE2 kvm: x86: Add support for getting/setting expanded xstate buffer x86/fpu: Add uabi_size to guest_fpu kvm: x86: Add CPUID support for Intel AMX kvm: x86: Add XCR0 support for Intel AMX kvm: x86: Disable RDMSR interception of IA32_XFD_ERR kvm: x86: Emulate IA32_XFD_ERR for guest kvm: x86: Intercept #NM for saving IA32_XFD_ERR x86/fpu: Prepare xfd_err in struct fpu_guest kvm: x86: Add emulation for IA32_XFD x86/fpu: Provide fpu_update_guest_xfd() for IA32_XFD emulation kvm: x86: Enable dynamic xfeatures at KVM_SET_CPUID2 x86/fpu: Provide fpu_enable_guest_xfd_features() for KVM x86/fpu: Add guest support to xfd_enable_feature() ...	2022-01-16 16:15:14 +02:00
Linus Torvalds	d0a231f01e	pci-v5.17-changes -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmHgpugUHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vz59g//eWRLb0j2Vgv84ZH4x1iv6MaBboQr 2wScnfoN+MIoh+tuM4kRak15X4nB8rJhNZZCzesMUN6PeZvrkoPo4sz/xdzIrA/N qY3h8NZ3nC4yCvs/tGem0zZUcSCJsxUAD0eegyMSa142xGIOQTHBSJRflR9osKSo bnQlKTkugV8t4kD7NlQ5M3HzN3R+mjsII5JNzCqv2XlzAZG3D8DhPyIpZnRNAOmW KiHOVXvQOocfUlvSs5kBlhgR1HgJkGnruCrJ1iDCWQH1Zk0iuVgoZWgVda6Cs3Xv gcTJLB7VoSdNZKnct9aMNYPKziHkYc7clilPeDsJs5TlSv3kKERzLj6c/5ZAxFWN +RsH+zYHDXJSsL/w0twPnaF5WCuVYUyrs3UiSjUvShKl1T9k9J+Jo8zwUUZx8Xb0 qXX8jRGMHolBGwPXm2fHEb4bwTUI8emPj29qK4L96KsQ3zKXWB8eGSosxUP52Tti RR2WZjkvwlREZCJp6jSEJYkhzoEaVAm8CjKpKUuneX9WcUOsMBSs9k7EXbUy7JeM hq5Keuqa8PZo/IK2DYYAchNnBJUDMsWJeduBW12qSmx3J+9victP2qOFu+9skP0a 85xlO6Cx8beiQh+XnY7jyROvIFuxTnGKHgkq/89Ham/whEzdJ+GRIiYB218kLLCW ILdas3C2iiGz99I= =Vgg4 -----END PGP SIGNATURE----- Merge tag 'pci-v5.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Use pci_find_vsec_capability() instead of open-coding it (Andy Shevchenko) - Convert pci_dev_present() stub from macro to static inline to avoid 'unused variable' errors (Hans de Goede) - Convert sysfs slot attributes from default_attrs to default_groups (Greg Kroah-Hartman) - Use DWORD accesses for LTR, L1 SS to avoid BayHub OZ711LV2 erratum (Rajat Jain) - Remove unnecessary initialization of static variables (Longji Guo) Resource management: - Always write Intel I210 ROM BAR on update to work around device defect (Bjorn Helgaas) PCIe native device hotplug: - Fix pciehp lockdep errors on Thunderbolt undock (Hans de Goede) - Fix infinite loop in pciehp IRQ handler on power fault (Lukas Wunner) Power management: - Convert amd64-agp, sis-agp, via-agp from legacy PCI power management to generic power management (Vaibhav Gupta) IOMMU: - Add function 1 DMA alias quirk for Marvell 88SE9125 SATA controller so it can work with an IOMMU (Yifeng Li) Error handling: - Add PCI_ERROR_RESPONSE and related definitions for signaling and checking for transaction errors on PCI (Naveen Naidu) - Fabricate PCI_ERROR_RESPONSE data (~0) in config read wrappers, instead of in host controller drivers, when transactions fail on PCI (Naveen Naidu) - Use PCI_POSSIBLE_ERROR() to check for possible failure of config reads (Naveen Naidu) Peer-to-peer DMA: - Add Logan Gunthorpe as P2PDMA maintainer (Bjorn Helgaas) ASPM: - Calculate link L0s and L1 exit latencies when needed instead of caching them (Saheed O. Bolarinwa) - Calculate device L0s and L1 acceptable exit latencies when needed instead of caching them (Saheed O. Bolarinwa) - Remove struct aspm_latency since it's no longer needed (Saheed O. Bolarinwa) APM X-Gene PCIe controller driver: - Fix IB window setup, which was broken by the fact that IB resources are now sorted in address order instead of DT dma-ranges order (Rob Herring) Apple PCIe controller driver: - Enable clock gating to save power (Hector Martin) - Fix REFCLK1 enable/poll logic (Hector Martin) Broadcom STB PCIe controller driver: - Declare bitmap correctly for use by bitmap interfaces (Christophe JAILLET) - Clean up computation of legacy and non-legacy MSI bitmasks (Florian Fainelli) - Update suspend/resume/remove error handling to warn about errors and not fail the operation (Jim Quinlan) - Correct the "pcie" and "msi" interrupt descriptions in DT binding (Jim Quinlan) - Add DT bindings for endpoint voltage regulators (Jim Quinlan) - Split brcm_pcie_setup() into two functions (Jim Quinlan) - Add mechanism for turning on voltage regulators for connected devices (Jim Quinlan) - Turn voltage regulators for connected devices on/off when bus is added or removed (Jim Quinlan) - When suspending, don't turn off voltage regulators for wakeup devices (Jim Quinlan) Freescale i.MX6 PCIe controller driver: - Add i.MX8MM support (Richard Zhu) Freescale Layerscape PCIe controller driver: - Use DWC common ops instead of layerscape-specific link-up functions (Hou Zhiqiang) Intel VMD host bridge driver: - Honor platform ACPI _OSC feature negotiation for Root Ports below VMD (Kai-Heng Feng) - Add support for Raptor Lake SKUs (Karthik L Gopalakrishnan) - Reset everything below VMD before enumerating to work around failure to enumerate NVMe devices when guest OS reboots (Nirmal Patel) Bridge emulation (used by Marvell Aardvark and MVEBU): - Make emulated ROM BAR read-only by default (Pali Rohár) - Make some emulated legacy PCI bits read-only for PCIe devices (Pali Rohár) - Update reserved bits in emulated PCIe Capability (Pali Rohár) - Allow drivers to emulate different PCIe Capability versions (Pali Rohár) - Set emulated Capabilities List bit for all PCIe devices, since they must have at least a PCIe Capability (Pali Rohár) Marvell Aardvark PCIe controller driver: - Add bridge emulation definitions for PCIe DEVCAP2, DEVCTL2, DEVSTA2, LNKCAP2, LNKCTL2, LNKSTA2, SLTCAP2, SLTCTL2, SLTSTA2 (Pali Rohár) - Add aardvark support for DEVCAP2, DEVCTL2, LNKCAP2 and LNKCTL2 registers (Pali Rohár) - Clear all MSIs at setup to avoid spurious interrupts (Pali Rohár) - Disable bus mastering when unbinding host controller driver (Pali Rohár) - Mask all interrupts when unbinding host controller driver (Pali Rohár) - Fix memory leak in host controller unbind (Pali Rohár) - Assert PERST# when unbinding host controller driver (Pali Rohár) - Disable link training when unbinding host controller driver (Pali Rohár) - Disable common PHY when unbinding host controller driver (Pali Rohár) - Fix resource type checking to check only IORESOURCE_MEM, not IORESOURCE_MEM_64, which is a flavor of IORESOURCE_MEM (Pali Rohár) Marvell MVEBU PCIe controller driver: - Implement pci_remap_iospace() for ARM so mvebu can use devm_pci_remap_iospace() instead of the previous ARM-specific pci_ioremap_io() interface (Pali Rohár) - Use the standard pci_host_probe() instead of the device-specific mvebu_pci_host_probe() (Pali Rohár) - Replace all uses of ARM-specific pci_ioremap_io() with the ARM implementation of the standard pci_remap_iospace() interface and remove pci_ioremap_io() (Pali Rohár) - Skip initializing invalid Root Ports (Pali Rohár) - Check for errors from pci_bridge_emul_init() (Pali Rohár) - Ignore any bridges at non-zero function numbers (Pali Rohár) - Return ~0 data for invalid config read size (Pali Rohár) - Disallow mapping interrupts on emulated bridges (Pali Rohár) - Clear Root Port Memory & I/O Space Enable and Bus Master Enable at initialization (Pali Rohár) - Make type bits in Root Port I/O Base register read-only (Pali Rohár) - Disable Root Port windows when base/limit set to invalid values (Pali Rohár) - Set controller to Root Complex mode (Pali Rohár) - Set Root Port Class Code to PCI Bridge (Pali Rohár) - Update emulated Root Port secondary bus numbers to better reflect the actual topology (Pali Rohár) - Add PCI_BRIDGE_CTL_BUS_RESET support to emulated Root Ports so pci_reset_secondary_bus() can reset connected devices (Pali Rohár) - Add PCI_EXP_DEVCTL Error Reporting Enable support to emulated Root Ports (Pali Rohár) - Add PCI_EXP_RTSTA PME Status bit support to emulated Root Ports (Pali Rohár) - Add DEVCAP2, DEVCTL2 and LNKCTL2 support to emulated Root Ports on Armada XP and newer devices (Pali Rohár) - Export mvebu-mbus.c symbols to allow pci-mvebu.c to be a module (Pali Rohár) - Add support for compiling as a module (Pali Rohár) MediaTek PCIe controller driver: - Assert PERST# for 100ms to allow power and clock to stabilize (qizhong cheng) MediaTek PCIe Gen3 controller driver: - Disable Mediatek DVFSRC voltage request since lack of DVFSRC to respond to the request causes failure to exit L1 PM Substate (Jianjun Wang) MediaTek MT7621 PCIe controller driver: - Declare mt7621_pci_ops static (Sergio Paracuellos) - Give pcibios_root_bridge_prepare() access to host bridge windows (Sergio Paracuellos) - Move MIPS I/O coherency unit setup from driver to pcibios_root_bridge_prepare() (Sergio Paracuellos) - Add missing MODULE_LICENSE() (Sergio Paracuellos) - Allow COMPILE_TEST for all arches (Sergio Paracuellos) Microsoft Hyper-V host bridge driver: - Add hv-internal interfaces to encapsulate arch IRQ dependencies (Sunil Muthuswamy) - Add arm64 Hyper-V vPCI support (Sunil Muthuswamy) Qualcomm PCIe controller driver: - Undo PM setup in qcom_pcie_probe() error handling path (Christophe JAILLET) - Use __be16 type to store return value from cpu_to_be16() (Manivannan Sadhasivam) - Constify static dw_pcie_ep_ops (Rikard Falkeborn) Renesas R-Car PCIe controller driver: - Fix aarch32 abort handler so it doesn't check the wrong bus clock before accessing the host controller (Marek Vasut) TI Keystone PCIe controller driver: - Add register offset for ti,syscon-pcie-id and ti,syscon-pcie-mode DT properties (Kishon Vijay Abraham I) MicroSemi Switchtec management driver: - Add Gen4 automotive device IDs (Kelvin Cao) - Declare state_names[] as static so it's not allocated and initialized for every call (Kelvin Cao) Host controller driver cleanups: - Use of_device_get_match_data(), not of_match_device(), when we only need the device data in altera, artpec6, cadence, designware-plat, dra7xx, keystone, kirin (Fan Fei) - Drop pointless of_device_get_match_data() cast in j721e (Bjorn Helgaas) - Drop redundant struct device * from j721e since struct cdns_pcie already has one (Bjorn Helgaas) - Rename driver structs to _pcie in intel-gw, iproc, ls-gen4, mediatek-gen3, microchip, mt7621, rcar-gen2, tegra194, uniphier, xgene, xilinx, xilinx-cpm for consistency across drivers (Fan Fei) - Fix invalid address space conversions in hisi, spear13xx (Bjorn Helgaas) Miscellaneous: - Sort Intel Device IDs by value (Andy Shevchenko) - Change Capability offsets to hex to match spec (Baruch Siach) - Correct misspellings (Krzysztof Wilczyński) - Terminate statement with semicolon in pci_endpoint_test.c (Ming Wang)" tag 'pci-v5.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (151 commits) PCI: mt7621: Allow COMPILE_TEST for all arches PCI: mt7621: Add missing MODULE_LICENSE() PCI: mt7621: Move MIPS setup to pcibios_root_bridge_prepare() PCI: Let pcibios_root_bridge_prepare() access bridge->windows PCI: mt7621: Declare mt7621_pci_ops static PCI: brcmstb: Do not turn off WOL regulators on suspend PCI: brcmstb: Add control of subdevice voltage regulators PCI: brcmstb: Add mechanism to turn on subdev regulators PCI: brcmstb: Split brcm_pcie_setup() into two funcs dt-bindings: PCI: Add bindings for Brcmstb EP voltage regulators dt-bindings: PCI: Correct brcmstb interrupts, interrupt-map. PCI: brcmstb: Fix function return value handling PCI: brcmstb: Do not use __GENMASK PCI: brcmstb: Declare 'used' as bitmap, not unsigned long PCI: hv: Add arm64 Hyper-V vPCI support PCI: hv: Make the code arch neutral by adding arch specific interfaces PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors x86/PCI: Remove initialization of static variables to false PCI: Use DWORD accesses for LTR, L1 SS to avoid erratum misc: pci_endpoint_test: Terminate statement with semicolon ...	2022-01-16 08:08:11 +02:00
Linus Torvalds	f56caedaf9	Merge branch 'akpm' (patches from Andrew) Merge misc updates from Andrew Morton: "146 patches. Subsystems affected by this patch series: kthread, ia64, scripts, ntfs, squashfs, ocfs2, vfs, and mm (slab-generic, slab, kmemleak, dax, kasan, debug, pagecache, gup, shmem, frontswap, memremap, memcg, selftests, pagemap, dma, vmalloc, memory-failure, hugetlb, userfaultfd, vmscan, mempolicy, oom-kill, hugetlbfs, migration, thp, ksm, page-poison, percpu, rmap, zswap, zram, cleanups, hmm, and damon)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (146 commits) mm/damon: hide kernel pointer from tracepoint event mm/damon/vaddr: hide kernel pointer from damon_va_three_regions() failure log mm/damon/vaddr: use pr_debug() for damon_va_three_regions() failure logging mm/damon/dbgfs: remove an unnecessary variable mm/damon: move the implementation of damon_insert_region to damon.h mm/damon: add access checking for hugetlb pages Docs/admin-guide/mm/damon/usage: update for schemes statistics mm/damon/dbgfs: support all DAMOS stats Docs/admin-guide/mm/damon/reclaim: document statistics parameters mm/damon/reclaim: provide reclamation statistics mm/damon/schemes: account how many times quota limit has exceeded mm/damon/schemes: account scheme actions that successfully applied mm/damon: remove a mistakenly added comment for a future feature Docs/admin-guide/mm/damon/usage: update for kdamond_pid and (mk\|rm)_contexts Docs/admin-guide/mm/damon/usage: mention tracepoint at the beginning Docs/admin-guide/mm/damon/usage: remove redundant information Docs/admin-guide/mm/damon/usage: update for scheme quotas and watermarks mm/damon: convert macro functions to static inline functions mm/damon: modify damon_rand() macro to static inline function mm/damon: move damon_rand() definition into damon.h ...	2022-01-15 20:37:06 +02:00
Yury Norov	47d8c15615	include: move find.h from asm_generic to linux find_bit API and bitmap API are closely related, but inclusion paths are different - include/asm-generic and include/linux, correspondingly. In the past it made a lot of troubles due to circular dependencies and/or undefined symbols. Fix this by moving find.h under include/linux. Signed-off-by: Yury Norov <yury.norov@gmail.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	2022-01-15 08:47:31 -08:00
Aneesh Kumar K.V	21b084fdf2	mm/mempolicy: wire up syscall set_mempolicy_home_node Link: https://lkml.kernel.org/r/20211202123810.267175-4-aneesh.kumar@linux.ibm.com Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Ben Widawsky <ben.widawsky@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Feng Tang <feng.tang@intel.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Huang Ying <ying.huang@intel.com> Cc: <linux-api@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-01-15 16:30:30 +02:00
Linus Torvalds	8e5b0adeea	Peter Zijlstra says: "Cleanup of the perf/kvm interaction." -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmHdvbkACgkQEsHwGGHe VUrX7w/9FwKUm0WlGcQIAOSdWk85N2qAVH3brYcQHNpTCVe68TOqTCrxCDrGgyUq 2XnCOim99MUlnsVU6QRZqF4yJ8S1tGrc0COJ/qR4SGntucu0oYuDe2aMVq+mWUD7 /IThA0oMRfhki9WwAyUuyCrXzk4blZdlrXyYIRMJGl9xeGNy3cvUtU8f68Kiy22E OcmQ/o9Etsr38dueAMU1KYEmgSTvG47rS8nfyRUu3QpJHbyLmRXH32PQrm3tduxS Bw3gMAH5vqq1UDZJ8ZvsPsO0vFX7dtnKEwEKz4qdtRWk9gi8oLGHIwIXC+VtNqpf mCmX33Jw8uFz9h3JhE84J0j/CgsWHoU6MOs0MOch4Tb69/BfCjQnw1enImhejG8q YEIDjJf/vgRNaw9PYshiTHT+EJTe9inT3S4eK/ynLRDUEslAqyWZZm7bUE/XrEDi yRyGIxry/hNZVvRkXT9QBw32fpgnIH2NAMPLEjJSGCRxT89Tfqz0aRDfacCuHTTh P8pAeiDuy/6RkDlQckOZJWOFFh2IHsykX2l3IJcHqVRqt4ob9b+SZB5qoH/Mv9qb MSAqdFUupYZFC+6XuPAeX5/Mo+wSkP+pYYSbWNxjUa0yNiYecOjE7/8T2SB2y6Mx lk2L0ypsZUYSmpHSfvOdPmf6ucj19/5B4+VCX6PQfcNJTnvvhTE= =tU5G -----END PGP SIGNATURE----- Merge tag 'perf_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Borislav Petkov: "Cleanup of the perf/kvm interaction." * tag 'perf_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Drop guest callback (un)register stubs KVM: arm64: Drop perf.c and fold its tiny bits of code into arm.c KVM: arm64: Hide kvm_arm_pmu_available behind CONFIG_HW_PERF_EVENTS=y KVM: arm64: Convert to the generic perf callbacks KVM: x86: Move Intel Processor Trace interrupt handler to vmx.c KVM: Move x86's perf guest info callbacks to generic KVM KVM: x86: More precisely identify NMI from guest when handling PMI KVM: x86: Drop current_vcpu for kvm_running_vcpu + kvm_arch_vcpu variable perf/core: Use static_call to optimize perf_guest_info_callbacks perf: Force architectures to opt-in to guest callbacks perf: Add wrappers for invoking guest callbacks perf/core: Rework guest callbacks to prepare for static_call support perf: Drop dead and useless guest "support" from arm, csky, nds32 and riscv perf: Stop pretending that perf can handle multiple guest callbacks KVM: x86: Register Processor Trace interrupt hook iff PT enabled in guest KVM: x86: Register perf callbacks after calling vendor's hardware_setup() perf: Protect perf_guest_cbs with RCU	2022-01-12 16:26:58 -08:00
Sunil Muthuswamy	d9932b4691	PCI: hv: Add arm64 Hyper-V vPCI support Add arm64 Hyper-V vPCI support by implementing the arch specific interfaces. Introduce an IRQ domain and chip specific to Hyper-v vPCI that is based on SPIs. The IRQ domain parents itself to the arch GIC IRQ domain for basic vector management. [bhelgaas: squash in fix from Yang Li <yang.lee@linux.alibaba.com>: https://lore.kernel.org/r/20220112003324.62755-1-yang.lee@linux.alibaba.com] Link: https://lore.kernel.org/r/1641411156-31705-3-git-send-email-sunilmut@linux.microsoft.com Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Michael Kelley <mikelley@microsoft.com>	2022-01-12 08:24:29 -06:00
Linus Torvalds	daadb3bd0e	Peter Zijlstra says: "Lots of cleanups and preparation; highlights: - futex: Cleanup and remove runtime futex_cmpxchg detection - rtmutex: Some fixes for the PREEMPT_RT locking infrastructure - kcsan: Share owner_on_cpu() between mutex,rtmutex and rwsem and annotate the racy owner->on_cpu access once. - atomic64: Dead-Code-Elemination" -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmHdvssACgkQEsHwGGHe VUrbBg//VQvz5BwddIJDj9utt5AvSixNcTF5mJyFKCSIqO0S4J8nCNcvJjZ2bs4S w1YmInFbp0WFGUhaIZiw0e6KWJUoINTng4MfHDZosS1doT2of53ZaQqXs3i81jDz 87w8ADVHL0x4+BNjdsIwbcuPSDTmJFoyFOdeXTIl9hv9ZULT8m4Mt+LJuUHNZ+vF rS1jyseVPWkcm5y+Yie0rhip+ygzbfbt0ArsLfRcrBJsKr6oxLxV2DDF+2djXuuP d2OgGT7VkbgAhoKpzVXUiHsT6ppR5Mn5TLSa4EZ4bPPCUFldOhKuCAImF3T6yVIa 44iX5vQN9v5VHBy6ocPbdOIBuYBYVGCMurh1t7pbpB6G+mmSxMiyta5MY37POwjv K2JT9mC2A6a4d17gue5FT3mnJMBB4eHwVaDfAwCZs/5rRNuoTz4aY5Xy04Mq0ltI 39uarwBd5hwSugBWg44AS5E9h52E654FQ7g6iS4NtUvJuuaXBTl43EcZWx2+mnPL zY+iOMVMgg33VIVcm/mlf/6zWL0LXPmILUiA1fp4Q9/n8u1EuOOyeA/GsC9Pl3wO HY3KpYJA5eQpIk/JEnzKm5ZE3pCrUdH6VDC/SB4owQtafQG6OxyQVP1Gj7KYxZsD NqqpJ4nkKooc5f5DqVEN8wrjyYsnVxEfriEG09OoR6wI3MqyUA4= =vrYy -----END PGP SIGNATURE----- Merge tag 'locking_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Borislav Petkov: "Lots of cleanups and preparation. Highlights: - futex: Cleanup and remove runtime futex_cmpxchg detection - rtmutex: Some fixes for the PREEMPT_RT locking infrastructure - kcsan: Share owner_on_cpu() between mutex,rtmutex and rwsem and annotate the racy owner->on_cpu access once. - atomic64: Dead-Code-Elemination" [ Description above by Peter Zijlstra ] * tag 'locking_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/atomic: atomic64: Remove unusable atomic ops futex: Fix additional regressions locking: Allow to include asm/spinlock_types.h from linux/spinlock_types_raw.h x86/mm: Include spinlock_t definition in pgtable. locking: Mark racy reads of owner->on_cpu locking: Make owner_on_cpu() into <linux/sched.h> lockdep/selftests: Adapt ww-tests for PREEMPT_RT lockdep/selftests: Skip the softirq related tests on PREEMPT_RT lockdep/selftests: Unbalanced migrate_disable() & rcu_read_lock(). lockdep/selftests: Avoid using local_lock_{acquire\|release}(). lockdep: Remove softirq accounting on PREEMPT_RT. locking/rtmutex: Add rt_mutex_lock_nest_lock() and rt_mutex_lock_killable(). locking/rtmutex: Squash self-deadlock check for ww_rt_mutex. locking: Remove rt_rwlock_is_contended(). sched: Trigger warning if ->migration_disabled counter underflows. futex: Fix sparc32/m68k/nds32 build regression futex: Remove futex_cmpxchg detection futex: Ensure futex_atomic_cmpxchg_inatomic() is present kernel/locking: Use a pointer in ww_mutex_trylock().	2022-01-11 17:24:45 -08:00
Linus Torvalds	b35b6d4d71	Power management updates for 5.17-rc1 - Add new P-state driver for AMD processors (Huang Rui). - Fix initialization of min and max frequency QoS requests in the cpufreq core (Rafael Wysocki). - Fix EPP handling on Alder Lake in intel_pstate (Srinivas Pandruvada). - Make intel_pstate update cpuinfo.max_freq when notified of HWP capabilities changes and drop a redundant function call from that driver (Rafael Wysocki). - Improve IRQ support in the Qcom cpufreq driver (Ard Biesheuvel, Stephen Boyd, Vladimir Zapolskiy). - Fix double devm_remap() in the Mediatek cpufreq driver (Hector Yuan). - Introduce thermal pressure helpers for cpufreq CPU cooling (Lukasz Luba). - Make cpufreq use default_groups in kobj_type (Greg Kroah-Hartman). - Make cpuidle use default_groups in kobj_type (Greg Kroah-Hartman). - Fix two comments in cpuidle code (Jason Wang, Yang Li). - Allow model-specific normal EPB value to be used in the intel_epb sysfs attribute handling code (Srinivas Pandruvada). - Simplify locking in pm_runtime_put_suppliers() (Rafael Wysocki). - Add safety net to supplier device release in the runtime PM core code (Rafael Wysocki). - Capture device status before disabling runtime PM for it (Rafael Wysocki). - Add new macros for declaring PM operations to allow drivers to avoid guarding them with CONFIG_PM #ifdefs or __maybe_unused and update some drivers to use these macros (Paul Cercueil). - Allow ACPI hardware signature to be honoured during restore from hibernation (David Woodhouse). - Update outdated operating performance points (OPP) documentation (Tang Yizhou). - Reduce log severity for informative message regarding frequency transition failures in devfreq (Tzung-Bi Shih). - Add DRAM frequency controller devfreq driver for Allwinner sunXi SoCs (Samuel Holland). - Add missing COMMON_CLK dependency to sun8i devfreq driver (Arnd Bergmann). - Add support for new layout of Psys PowerLimit Register on SPR to the Intel RAPL power capping driver (Zhang Rui). - Fix typo in a comment in idle_inject.c (Jason Wang). - Remove unused function definition from the DTPM (Dynamit Thermal Power Management) power capping framework (Daniel Lezcano). - Reduce DTPM trace verbosity (Daniel Lezcano). -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmHcgkgSHHJqd0Byand5 c29ja2kubmV0AAoJEILEb/54YlRxs34P/3kFhRk7qrwEekx6F11im6caLKT9+Qap PuGVqfTbK7TupVQDVGFBEjTjgKY7Ph7Fcr4bqn6wvNOp96cjXyOSk/c1fcpS3Bpr b1PYsFsb9diNKE462sGGYClyCT3X5qQqtpxzOl3g4I1PWKTC1mKFm4Jm2m6S6cFq DKhsgYKFzQSZNb1wJM4JjHS9c3BRygqp4nfEAmifu5b9tLZf7stWnFHhbGq63M9m OwHOrEEnzhf4pOXGZTvIXeczgE6IcuDdlGkIg7XMHnmKSNvj1HqhEgi2lfSRb98z 5eI4S6JymCJGVK+gr8iVCq1iJ+LKqV3YPXRqvI35/+NqIKYxMt2ZivQQf5s3aQLe 26gUulD3O6Pz5tMlwcDElD4/tcClfg35PCD/VzpRR8TAo8vLBb63kZ5v6+HM34ZJ 6QbLTNZJTnGmEqxMccUxP+HhZz8ssqpLAC+R2sE5yXbNpIZq8CbPiGb65RGiX3SG CmRKqH/xQVNKBYP0ChjmUyhKcBxOnx1Xu8AhsN7gRAy0aht7j7OdjTnJuGiX6gu3 Q5WxvVvkekyfhuFQ5TST9y/fzvMJWzeaA6GhVIr6RoBmshNQGTb0H4HXARxS3Ah5 qjd7ao7BFLa898FCHaHIpmFWp0wF5iljwCJQVP3I2qUpPvDJxEtsxc4CF/AZzyNR VudoFqLoIV5C =1egI -----END PGP SIGNATURE----- Merge tag 'pm-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "The most signigicant change here is the addition of a new cpufreq 'P-state' driver for AMD processors as a better replacement for the venerable acpi-cpufreq driver. There are also other cpufreq updates (in the core, intel_pstate, ARM drivers), PM core updates (mostly related to adding new macros for declaring PM operations which should make the lives of driver developers somewhat easier), and a bunch of assorted fixes and cleanups. Summary: - Add new P-state driver for AMD processors (Huang Rui). - Fix initialization of min and max frequency QoS requests in the cpufreq core (Rafael Wysocki). - Fix EPP handling on Alder Lake in intel_pstate (Srinivas Pandruvada). - Make intel_pstate update cpuinfo.max_freq when notified of HWP capabilities changes and drop a redundant function call from that driver (Rafael Wysocki). - Improve IRQ support in the Qcom cpufreq driver (Ard Biesheuvel, Stephen Boyd, Vladimir Zapolskiy). - Fix double devm_remap() in the Mediatek cpufreq driver (Hector Yuan). - Introduce thermal pressure helpers for cpufreq CPU cooling (Lukasz Luba). - Make cpufreq use default_groups in kobj_type (Greg Kroah-Hartman). - Make cpuidle use default_groups in kobj_type (Greg Kroah-Hartman). - Fix two comments in cpuidle code (Jason Wang, Yang Li). - Allow model-specific normal EPB value to be used in the intel_epb sysfs attribute handling code (Srinivas Pandruvada). - Simplify locking in pm_runtime_put_suppliers() (Rafael Wysocki). - Add safety net to supplier device release in the runtime PM core code (Rafael Wysocki). - Capture device status before disabling runtime PM for it (Rafael Wysocki). - Add new macros for declaring PM operations to allow drivers to avoid guarding them with CONFIG_PM #ifdefs or __maybe_unused and update some drivers to use these macros (Paul Cercueil). - Allow ACPI hardware signature to be honoured during restore from hibernation (David Woodhouse). - Update outdated operating performance points (OPP) documentation (Tang Yizhou). - Reduce log severity for informative message regarding frequency transition failures in devfreq (Tzung-Bi Shih). - Add DRAM frequency controller devfreq driver for Allwinner sunXi SoCs (Samuel Holland). - Add missing COMMON_CLK dependency to sun8i devfreq driver (Arnd Bergmann). - Add support for new layout of Psys PowerLimit Register on SPR to the Intel RAPL power capping driver (Zhang Rui). - Fix typo in a comment in idle_inject.c (Jason Wang). - Remove unused function definition from the DTPM (Dynamit Thermal Power Management) power capping framework (Daniel Lezcano). - Reduce DTPM trace verbosity (Daniel Lezcano)" * tag 'pm-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (53 commits) x86, sched: Fix undefined reference to init_freq_invariance_cppc() build error cpufreq: amd-pstate: Fix Kconfig dependencies for AMD P-State cpufreq: amd-pstate: Fix struct amd_cpudata kernel-doc comment cpuidle: use default_groups in kobj_type x86: intel_epb: Allow model specific normal EPB value MAINTAINERS: Add AMD P-State driver maintainer entry Documentation: amd-pstate: Add AMD P-State driver introduction cpufreq: amd-pstate: Add AMD P-State performance attributes cpufreq: amd-pstate: Add AMD P-State frequencies attributes cpufreq: amd-pstate: Add boost mode support for AMD P-State cpufreq: amd-pstate: Add trace for AMD P-State module cpufreq: amd-pstate: Introduce the support for the processors with shared memory solution cpufreq: amd-pstate: Add fast switch function for AMD P-State cpufreq: amd-pstate: Introduce a new AMD P-State driver to support future processors ACPI: CPPC: Add CPPC enable register function ACPI: CPPC: Check present CPUs for determining _CPC is valid ACPI: CPPC: Implement support for SystemIO registers x86/msr: Add AMD CPPC MSR definitions x86/cpufeatures: Add AMD Collaborative Processor Performance Control feature flag cpufreq: use default_groups in kobj_type ...	2022-01-10 20:34:00 -08:00
Linus Torvalds	8efd0d9c31	Networking changes for 5.17. Core ---- - Defer freeing TCP skbs to the BH handler, whenever possible, or at least perform the freeing outside of the socket lock section to decrease cross-CPU allocator work and improve latency. - Add netdevice refcount tracking to locate sources of netdevice and net namespace refcount leaks. - Make Tx watchdog less intrusive - avoid pausing Tx and restarting all queues from a single CPU removing latency spikes. - Various small optimizations throughout the stack from Eric Dumazet. - Make netdev->dev_addr[] constant, force modifications to go via appropriate helpers to allow us to keep addresses in ordered data structures. - Replace unix_table_lock with per-hash locks, improving performance of bind() calls. - Extend skb drop tracepoint with a drop reason. - Allow SO_MARK and SO_PRIORITY setsockopt under CAP_NET_RAW. BPF --- - New helpers: - bpf_find_vma(), find and inspect VMAs for profiling use cases - bpf_loop(), runtime-bounded loop helper trading some execution time for much faster (if at all converging) verification - bpf_strncmp(), improve performance, avoid compiler flakiness - bpf_get_func_arg(), bpf_get_func_ret(), bpf_get_func_arg_cnt() for tracing programs, all inlined by the verifier - Support BPF relocations (CO-RE) in the kernel loader. - Further the support for BTF_TYPE_TAG annotations. - Allow access to local storage in sleepable helpers. - Convert verifier argument types to a composable form with different attributes which can be shared across types (ro, maybe-null). - Prepare libbpf for upcoming v1.0 release by cleaning up APIs, creating new, extensible ones where missing and deprecating those to be removed. Protocols --------- - WiFi (mac80211/cfg80211): - notify user space about long "come back in N" AP responses, allow it to react to such temporary rejections - allow non-standard VHT MCS 10/11 rates - use coarse time in airtime fairness code to save CPU cycles - Bluetooth: - rework of HCI command execution serialization to use a common queue and work struct, and improve handling errors reported in the middle of a batch of commands - rework HCI event handling to use skb_pull_data, avoiding packet parsing pitfalls - support AOSP Bluetooth Quality Report - SMC: - support net namespaces, following the RDMA model - improve connection establishment latency by pre-clearing buffers - introduce TCP ULP for automatic redirection to SMC - Multi-Path TCP: - support ioctls: SIOCINQ, OUTQ, and OUTQNSD - support socket options: IP_TOS, IP_FREEBIND, IP_TRANSPARENT, IPV6_FREEBIND, and IPV6_TRANSPARENT, TCP_CORK and TCP_NODELAY - support cmsgs: TCP_INQ - improvements in the data scheduler (assigning data to subflows) - support fastclose option (quick shutdown of the full MPTCP connection, similar to TCP RST in regular TCP) - MCTP (Management Component Transport) over serial, as defined by DMTF spec DSP0253 - "MCTP Serial Transport Binding". Driver API ---------- - Support timestamping on bond interfaces in active/passive mode. - Introduce generic phylink link mode validation for drivers which don't have any quirks and where MAC capability bits fully express what's supported. Allow PCS layer to participate in the validation. Convert a number of drivers. - Add support to set/get size of buffers on the Rx rings and size of the tx copybreak buffer via ethtool. - Support offloading TC actions as first-class citizens rather than only as attributes of filters, improve sharing and device resource utilization. - WiFi (mac80211/cfg80211): - support forwarding offload (ndo_fill_forward_path) - support for background radar detection hardware - SA Query Procedures offload on the AP side New hardware / drivers ---------------------- - tsnep - FPGA based TSN endpoint Ethernet MAC used in PLCs with real-time requirements for isochronous communication with protocols like OPC UA Pub/Sub. - Qualcomm BAM-DMUX WWAN - driver for data channels of modems integrated into many older Qualcomm SoCs, e.g. MSM8916 or MSM8974 (qcom_bam_dmux). - Microchip LAN966x multi-port Gigabit AVB/TSN Ethernet Switch driver with support for bridging, VLANs and multicast forwarding (lan966x). - iwlmei driver for co-operating between Intel's WiFi driver and Intel's Active Management Technology (AMT) devices. - mse102x - Vertexcom MSE102x Homeplug GreenPHY chips - Bluetooth: - MediaTek MT7921 SDIO devices - Foxconn MT7922A - Realtek RTL8852AE Drivers ------- - Significantly improve performance in the datapaths of: lan78xx, ax88179_178a, lantiq_xrx200, bnxt. - Intel Ethernet NICs: - igb: support PTP/time PEROUT and EXTTS SDP functions on 82580/i354/i350 adapters - ixgbevf: new PF -> VF mailbox API which avoids the risk of mailbox corruption with ESXi - iavf: support configuration of VLAN features of finer granularity, stacked tags and filtering - ice: PTP support for new E822 devices with sub-ns precision - ice: support firmware activation without reboot - Mellanox Ethernet NICs (mlx5): - expose control over IRQ coalescing mode (CQE vs EQE) via ethtool - support TC forwarding when tunnel encap and decap happen between two ports of the same NIC - dynamically size and allow disabling various features to save resources for running in embedded / SmartNIC scenarios - Broadcom Ethernet NICs (bnxt): - use page frag allocator to improve Rx performance - expose control over IRQ coalescing mode (CQE vs EQE) via ethtool - Other Ethernet NICs: - amd-xgbe: add Ryzen 6000 (Yellow Carp) Ethernet support - Microsoft cloud/virtual NIC (mana): - add XDP support (PASS, DROP, TX) - Mellanox Ethernet switches (mlxsw): - initial support for Spectrum-4 ASICs - VxLAN with IPv6 underlay - Marvell Ethernet switches (prestera): - support flower flow templates - add basic IP forwarding support - NXP embedded Ethernet switches (ocelot & felix): - support Per-Stream Filtering and Policing (PSFP) - enable cut-through forwarding between ports by default - support FDMA to improve packet Rx/Tx to CPU - Other embedded switches: - hellcreek: improve trapping management (STP and PTP) packets - qca8k: support link aggregation and port mirroring - Qualcomm 802.11ax WiFi (ath11k): - qca6390, wcn6855: enable 802.11 power save mode in station mode - BSS color change support - WCN6855 hw2.1 support - 11d scan offload support - scan MAC address randomization support - full monitor mode, only supported on QCN9074 - qca6390/wcn6855: report signal and tx bitrate - qca6390: rfkill support - qca6390/wcn6855: regdb.bin support - Intel WiFi (iwlwifi): - support SAR GEO Offset Mapping (SGOM) and Time-Aware-SAR (TAS) in cooperation with the BIOS - support for Optimized Connectivity Experience (OCE) scan - support firmware API version 68 - lots of preparatory work for the upcoming Bz device family - MediaTek WiFi (mt76): - Specific Absorption Rate (SAR) support - mt7921: 160 MHz channel support - RealTek WiFi (rtw88): - Specific Absorption Rate (SAR) support - scan offload - Other WiFi NICs - ath10k: support fetching (pre-)calibration data from nvmem - brcmfmac: configure keep-alive packet on suspend - wcn36xx: beacon filter support Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmHbkZAACgkQMUZtbf5S IruYkQ//XX7BggcwBfukPK83j0dONolClijqKcKR08g4vB5L8GXvv6OErKIWrh4k h8JanCH352ZkbCSw3MvFdm825UYQv8vPMd6Qks/LJ4aSKqCuy4MIlAo+yOw4Km3O i7++lRfma6DqHHI59wvLjWoxZSPu8lL+rI8UsZ5qMOlnNlGAOXsNrzRjaqQ3FddY AMxZeBUtrPqUCCQZFq3U8apkYzUp7CA/3XR9zRcja3uPbrtOV2G+4whRF90qGNWz Tm/QvJ9F/Ab292cbhxR4KuaQ3hUhaCQyDjbZk3+FZzZpAVhYTVqcNjny6+yXmbiP NXRtwemnl1NlWKMnJM8lEeY48u626tRIkxA/Wtd61uoO5uKUSxfGP+UpUi+DfXbF yIw50VQ7L2bpxXP/HjtmhVgZDaWKYyh22Zw4Hp/muMJz0hgUB0KODY3tf2jUWbjJ 0oEgocWyzhhwMQKqupTDCIaRgIs2ewYr4ZrFDhI3HnHC/vv1VjoPRUPIyxwppD2N cXvZb3B1sWK8iX5gCbISGzyU4bB7I0rvJSTU42ueti7n6NqRFZ79qHQpYnnY+JdO z1qOwY/d/yWfBoXVKRtRg2qz6CdEt5BQklwAgVEBgrFpf58gp694EwGMb1htY14J r/k9bVpmyIFpUnBH2CPMRfBVA3tUTqzyzzFV4AMw40NYLKmhLdo= =KLm3 -----END PGP SIGNATURE----- Merge tag '5.17-net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core ---- - Defer freeing TCP skbs to the BH handler, whenever possible, or at least perform the freeing outside of the socket lock section to decrease cross-CPU allocator work and improve latency. - Add netdevice refcount tracking to locate sources of netdevice and net namespace refcount leaks. - Make Tx watchdog less intrusive - avoid pausing Tx and restarting all queues from a single CPU removing latency spikes. - Various small optimizations throughout the stack from Eric Dumazet. - Make netdev->dev_addr[] constant, force modifications to go via appropriate helpers to allow us to keep addresses in ordered data structures. - Replace unix_table_lock with per-hash locks, improving performance of bind() calls. - Extend skb drop tracepoint with a drop reason. - Allow SO_MARK and SO_PRIORITY setsockopt under CAP_NET_RAW. BPF --- - New helpers: - bpf_find_vma(), find and inspect VMAs for profiling use cases - bpf_loop(), runtime-bounded loop helper trading some execution time for much faster (if at all converging) verification - bpf_strncmp(), improve performance, avoid compiler flakiness - bpf_get_func_arg(), bpf_get_func_ret(), bpf_get_func_arg_cnt() for tracing programs, all inlined by the verifier - Support BPF relocations (CO-RE) in the kernel loader. - Further the support for BTF_TYPE_TAG annotations. - Allow access to local storage in sleepable helpers. - Convert verifier argument types to a composable form with different attributes which can be shared across types (ro, maybe-null). - Prepare libbpf for upcoming v1.0 release by cleaning up APIs, creating new, extensible ones where missing and deprecating those to be removed. Protocols --------- - WiFi (mac80211/cfg80211): - notify user space about long "come back in N" AP responses, allow it to react to such temporary rejections - allow non-standard VHT MCS 10/11 rates - use coarse time in airtime fairness code to save CPU cycles - Bluetooth: - rework of HCI command execution serialization to use a common queue and work struct, and improve handling errors reported in the middle of a batch of commands - rework HCI event handling to use skb_pull_data, avoiding packet parsing pitfalls - support AOSP Bluetooth Quality Report - SMC: - support net namespaces, following the RDMA model - improve connection establishment latency by pre-clearing buffers - introduce TCP ULP for automatic redirection to SMC - Multi-Path TCP: - support ioctls: SIOCINQ, OUTQ, and OUTQNSD - support socket options: IP_TOS, IP_FREEBIND, IP_TRANSPARENT, IPV6_FREEBIND, and IPV6_TRANSPARENT, TCP_CORK and TCP_NODELAY - support cmsgs: TCP_INQ - improvements in the data scheduler (assigning data to subflows) - support fastclose option (quick shutdown of the full MPTCP connection, similar to TCP RST in regular TCP) - MCTP (Management Component Transport) over serial, as defined by DMTF spec DSP0253 - "MCTP Serial Transport Binding". Driver API ---------- - Support timestamping on bond interfaces in active/passive mode. - Introduce generic phylink link mode validation for drivers which don't have any quirks and where MAC capability bits fully express what's supported. Allow PCS layer to participate in the validation. Convert a number of drivers. - Add support to set/get size of buffers on the Rx rings and size of the tx copybreak buffer via ethtool. - Support offloading TC actions as first-class citizens rather than only as attributes of filters, improve sharing and device resource utilization. - WiFi (mac80211/cfg80211): - support forwarding offload (ndo_fill_forward_path) - support for background radar detection hardware - SA Query Procedures offload on the AP side New hardware / drivers ---------------------- - tsnep - FPGA based TSN endpoint Ethernet MAC used in PLCs with real-time requirements for isochronous communication with protocols like OPC UA Pub/Sub. - Qualcomm BAM-DMUX WWAN - driver for data channels of modems integrated into many older Qualcomm SoCs, e.g. MSM8916 or MSM8974 (qcom_bam_dmux). - Microchip LAN966x multi-port Gigabit AVB/TSN Ethernet Switch driver with support for bridging, VLANs and multicast forwarding (lan966x). - iwlmei driver for co-operating between Intel's WiFi driver and Intel's Active Management Technology (AMT) devices. - mse102x - Vertexcom MSE102x Homeplug GreenPHY chips - Bluetooth: - MediaTek MT7921 SDIO devices - Foxconn MT7922A - Realtek RTL8852AE Drivers ------- - Significantly improve performance in the datapaths of: lan78xx, ax88179_178a, lantiq_xrx200, bnxt. - Intel Ethernet NICs: - igb: support PTP/time PEROUT and EXTTS SDP functions on 82580/i354/i350 adapters - ixgbevf: new PF -> VF mailbox API which avoids the risk of mailbox corruption with ESXi - iavf: support configuration of VLAN features of finer granularity, stacked tags and filtering - ice: PTP support for new E822 devices with sub-ns precision - ice: support firmware activation without reboot - Mellanox Ethernet NICs (mlx5): - expose control over IRQ coalescing mode (CQE vs EQE) via ethtool - support TC forwarding when tunnel encap and decap happen between two ports of the same NIC - dynamically size and allow disabling various features to save resources for running in embedded / SmartNIC scenarios - Broadcom Ethernet NICs (bnxt): - use page frag allocator to improve Rx performance - expose control over IRQ coalescing mode (CQE vs EQE) via ethtool - Other Ethernet NICs: - amd-xgbe: add Ryzen 6000 (Yellow Carp) Ethernet support - Microsoft cloud/virtual NIC (mana): - add XDP support (PASS, DROP, TX) - Mellanox Ethernet switches (mlxsw): - initial support for Spectrum-4 ASICs - VxLAN with IPv6 underlay - Marvell Ethernet switches (prestera): - support flower flow templates - add basic IP forwarding support - NXP embedded Ethernet switches (ocelot & felix): - support Per-Stream Filtering and Policing (PSFP) - enable cut-through forwarding between ports by default - support FDMA to improve packet Rx/Tx to CPU - Other embedded switches: - hellcreek: improve trapping management (STP and PTP) packets - qca8k: support link aggregation and port mirroring - Qualcomm 802.11ax WiFi (ath11k): - qca6390, wcn6855: enable 802.11 power save mode in station mode - BSS color change support - WCN6855 hw2.1 support - 11d scan offload support - scan MAC address randomization support - full monitor mode, only supported on QCN9074 - qca6390/wcn6855: report signal and tx bitrate - qca6390: rfkill support - qca6390/wcn6855: regdb.bin support - Intel WiFi (iwlwifi): - support SAR GEO Offset Mapping (SGOM) and Time-Aware-SAR (TAS) in cooperation with the BIOS - support for Optimized Connectivity Experience (OCE) scan - support firmware API version 68 - lots of preparatory work for the upcoming Bz device family - MediaTek WiFi (mt76): - Specific Absorption Rate (SAR) support - mt7921: 160 MHz channel support - RealTek WiFi (rtw88): - Specific Absorption Rate (SAR) support - scan offload - Other WiFi NICs - ath10k: support fetching (pre-)calibration data from nvmem - brcmfmac: configure keep-alive packet on suspend - wcn36xx: beacon filter support" * tag '5.17-net-next' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2048 commits) tcp: tcp_send_challenge_ack delete useless param `skb` net/qla3xxx: Remove useless DMA-32 fallback configuration rocker: Remove useless DMA-32 fallback configuration hinic: Remove useless DMA-32 fallback configuration lan743x: Remove useless DMA-32 fallback configuration net: enetc: Remove useless DMA-32 fallback configuration cxgb4vf: Remove useless DMA-32 fallback configuration cxgb4: Remove useless DMA-32 fallback configuration cxgb3: Remove useless DMA-32 fallback configuration bnx2x: Remove useless DMA-32 fallback configuration et131x: Remove useless DMA-32 fallback configuration be2net: Remove useless DMA-32 fallback configuration vmxnet3: Remove useless DMA-32 fallback configuration bna: Simplify DMA setting net: alteon: Simplify DMA setting myri10ge: Simplify DMA setting qlcnic: Simplify DMA setting net: allwinner: Fix print format page_pool: remove spinlock in page_pool_refill_alloc_cache() amt: fix wrong return type of amt_send_membership_update() ...	2022-01-10 19:06:09 -08:00
Rafael J. Wysocki	5561f25beb	Merge branch 'pm-cpufreq' Merge cpufreq updates for 5.17-rc1: - Add new P-state driver for AMD processors (Huang Rui). - Fix initialization of min and max frequency QoS requests in the cpufreq core (Rafael Wysocki). - Fix EPP handling on Alder Lake in intel_pstate (Srinivas Pandruvada). - Make intel_pstate update cpuinfo.max_freq when notified of HWP capabilities changes and drop a redundant function call from that driver (Rafael Wysocki). - Improve IRQ support in the Qcom cpufreq driver (Ard Biesheuvel, Stephen Boyd, Vladimir Zapolskiy). - Fix double devm_remap() in the Mediatek cpufreq driver (Hector Yuan). - Introduce thermal pressure helpers for cpufreq CPU cooling (Lukasz Luba). - Make cpufreq use default_groups in kobj_type (Greg Kroah-Hartman). * pm-cpufreq: (32 commits) x86, sched: Fix undefined reference to init_freq_invariance_cppc() build error cpufreq: amd-pstate: Fix Kconfig dependencies for AMD P-State cpufreq: amd-pstate: Fix struct amd_cpudata kernel-doc comment MAINTAINERS: Add AMD P-State driver maintainer entry Documentation: amd-pstate: Add AMD P-State driver introduction cpufreq: amd-pstate: Add AMD P-State performance attributes cpufreq: amd-pstate: Add AMD P-State frequencies attributes cpufreq: amd-pstate: Add boost mode support for AMD P-State cpufreq: amd-pstate: Add trace for AMD P-State module cpufreq: amd-pstate: Introduce the support for the processors with shared memory solution cpufreq: amd-pstate: Add fast switch function for AMD P-State cpufreq: amd-pstate: Introduce a new AMD P-State driver to support future processors ACPI: CPPC: Add CPPC enable register function ACPI: CPPC: Check present CPUs for determining _CPC is valid ACPI: CPPC: Implement support for SystemIO registers x86/msr: Add AMD CPPC MSR definitions x86/cpufeatures: Add AMD Collaborative Processor Performance Control feature flag cpufreq: use default_groups in kobj_type cpufreq: mediatek-hw: Fix double devm_remap in hotplug case cpufreq: intel_pstate: Update cpuinfo.max_freq on HWP_CAP changes ...	2022-01-10 17:54:45 +01:00
Linus Torvalds	9b9e211360	arm64 updates for 5.17: - KCSAN enabled for arm64. - Additional kselftests to exercise the syscall ABI w.r.t. SVE/FPSIMD. - Some more SVE clean-ups and refactoring in preparation for SME support (scalable matrix extensions). - BTI clean-ups (SYM_FUNC macros etc.) - arm64 atomics clean-up and codegen improvements. - HWCAPs for FEAT_AFP (alternate floating point behaviour) and FEAT_RPRESS (increased precision of reciprocal estimate and reciprocal square root estimate). - Use SHA3 instructions to speed-up XOR. - arm64 unwind code refactoring/unification. - Avoid DC (data cache maintenance) instructions when DCZID_EL0.DZP == 1 (potentially set by a hypervisor; user-space already does this). - Perf updates for arm64: support for CI-700, HiSilicon PCIe PMU, Marvell CN10K LLC-TAD PMU, miscellaneous clean-ups. - Other fixes and clean-ups; highlights: fix the handling of erratum `1418040`, correct the calculation of the nomap region boundaries, introduce io_stop_wc() mapped to the new DGH instruction (data gathering hint). -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmHXNtYACgkQa9axLQDI XvHBGw/+OVGdbORxwrU+uRb7N6qIJkrW/mmM4x1KLo1i+REZLb8/VlXm0xC60FG+ 39x6FSVkRr+lLDfTqpQsOez5FpdsvOe9Fc4L3bwniDg+EPo7x65VmP2dw/Ae2q0i 87xyWCczx5hFEPF/1sb1R1pm3bTXjeklBkdv+OXhwflLOwpCp1J8z8WJK8qJVFX6 CmuE6Q4fDQr0ghl9Nf8DiAr20mHDh8wMKNUJOg4waaQOOCta6q1oJ3qfz6E9z1eW zEE3dfZgBCx7HCRc3KGgzT7H4Ces3BYvhBYP6bJRliVI88XdPiM4MfdGL4UIb27Q NLAdr+FVzk/YLzMHtxSfkT10nBqoOPWUTckLu9jIIl5cpBX73Wiz7jfzBvqFmC/y opSFMZ3lwQPM5WAPtAlZptA3GPPySeInVmvUgB7IQ+1Q1T1n8ri1y5hzTYC4Sc/g amJI1rXf1Al8+2zFBggr6Up+EOnfV9nAwrzLXkRlASsfmvY4dnVWg3NWfBqtEHAq VuZCecSgawxuSlpmJ4VGbLrBFaz18bn9EzujR5fFvi5Qcg1CMFOROi2+6IynopNV IS0R8j6fwgQPA5lcnNIPeJRRkQoqO4l8bPDzeXEny0BSw313EgBSo9aQtnjyIJbp BTuDHARKs+/NvDPvd8GQkxNPgwJnVOL9pdgNAolEu1/k7JtnIS0= =ecyi -----END PGP SIGNATURE----- Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - KCSAN enabled for arm64. - Additional kselftests to exercise the syscall ABI w.r.t. SVE/FPSIMD. - Some more SVE clean-ups and refactoring in preparation for SME support (scalable matrix extensions). - BTI clean-ups (SYM_FUNC macros etc.) - arm64 atomics clean-up and codegen improvements. - HWCAPs for FEAT_AFP (alternate floating point behaviour) and FEAT_RPRESS (increased precision of reciprocal estimate and reciprocal square root estimate). - Use SHA3 instructions to speed-up XOR. - arm64 unwind code refactoring/unification. - Avoid DC (data cache maintenance) instructions when DCZID_EL0.DZP == 1 (potentially set by a hypervisor; user-space already does this). - Perf updates for arm64: support for CI-700, HiSilicon PCIe PMU, Marvell CN10K LLC-TAD PMU, miscellaneous clean-ups. - Other fixes and clean-ups; highlights: fix the handling of erratum `1418040`, correct the calculation of the nomap region boundaries, introduce io_stop_wc() mapped to the new DGH instruction (data gathering hint). * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (81 commits) arm64: Use correct method to calculate nomap region boundaries arm64: Drop outdated links in comments arm64: perf: Don't register user access sysctl handler multiple times drivers: perf: marvell_cn10k: fix an IS_ERR() vs NULL check perf/smmuv3: Fix unused variable warning when CONFIG_OF=n arm64: errata: Fix exec handling in erratum `1418040` workaround arm64: Unhash early pointer print plus improve comment asm-generic: introduce io_stop_wc() and add implementation for ARM64 arm64: Ensure that the 'bti' macro is defined where linkage.h is included arm64: remove __dma_*_area() aliases docs/arm64: delete a space from tagged-address-abi arm64: Enable KCSAN kselftest/arm64: Add pidbench for floating point syscall cases arm64/fp: Add comments documenting the usage of state restore functions kselftest/arm64: Add a test program to exercise the syscall ABI kselftest/arm64: Allow signal tests to trigger from a function kselftest/arm64: Parameterise ptrace vector length information arm64/sve: Minor clarification of ABI documentation arm64/sve: Generalise vector length configuration prctl() for SME arm64/sve: Make sysctl interface for SVE reusable by SME ...	2022-01-10 08:49:37 -08:00
Paolo Bonzini	7fd55a02a4	KVM/arm64 updates for Linux 5.16 - Simplification of the 'vcpu first run' by integrating it into KVM's 'pid change' flow - Refactoring of the FP and SVE state tracking, also leading to a simpler state and less shared data between EL1 and EL2 in the nVHE case - Tidy up the header file usage for the nvhe hyp object - New HYP unsharing mechanism, finally allowing pages to be unmapped from the Stage-1 EL2 page-tables - Various pKVM cleanups around refcounting and sharing - A couple of vgic fixes for bugs that would trigger once the vcpu xarray rework is merged, but not sooner - Add minimal support for ARMv8.7's PMU extension - Rework kvm_pgtable initialisation ahead of the NV work - New selftest for IRQ injection - Teach selftests about the lack of default IPA space and page sizes - Expand sysreg selftest to deal with Pointer Authentication - The usual bunch of cleanups and doc update -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmHYIpgPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDndsP/RsBmX6bmQnDEhaaqfGAxOETyq/my1eT9r/V 3Ax4fEqSFfD5yHbYvqNRC8ueycH4r8WAr4ACWDAI6XpS/pYx00nx2N+HCSgjGyQR FeXqITuGPEsn4NkGuPci0PFmI8rVUzanl1ugRGQAETVrZo2ZVH2uqKVGT8XOlu0J FB/0x6Z4vMuIgEXyfa+DZ8WdW1aCRgPU2oyOdSdWE57/grjyLJqk6EdMmLyaQ19E vz6vXuRnA/GQwOtByqYEnQ8a4VXsQedCMqg/f9mj0BxpDzxC1ps8Nrpv36aJXKUN LEXapP9bCWPW9LqaKAOZnQYrUIIEFHsCUom0n3reDHrgObA+jivpz75L8GEr3CdC Bv78N04Yymjpp2WA6CuO3r9HjL1nJ6tYqobXU2pvqln4nNC3Ukucjq9ZVuWgS6Hx qOZXgPcZ/HpS3l/U+dAu8yIcV2SchQXDudaq8BsfLd8M1bD+oirSBolZFSvz7MYZ 6+jtEDLUOEO5s4rXiJF46+MauxiELcjaewAEK4WwrS8NBwEyhYe9EPsYcQ5pcrQF QwAd1+y7oLfhpGHv5KJKWswfvbtlLCm6NOAhawq0UXM8bS+79tu0dGjiDzVPBuSf SyA3VtBSKxcpvCrljw9ubtjxvKrviE0MDvlmTP2B1NU+lwm8xRBiwUwOH29qP9zU HDeUj2fy =HkZk -----END PGP SIGNATURE----- Merge tag 'kvmarm-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for Linux 5.16 - Simplification of the 'vcpu first run' by integrating it into KVM's 'pid change' flow - Refactoring of the FP and SVE state tracking, also leading to a simpler state and less shared data between EL1 and EL2 in the nVHE case - Tidy up the header file usage for the nvhe hyp object - New HYP unsharing mechanism, finally allowing pages to be unmapped from the Stage-1 EL2 page-tables - Various pKVM cleanups around refcounting and sharing - A couple of vgic fixes for bugs that would trigger once the vcpu xarray rework is merged, but not sooner - Add minimal support for ARMv8.7's PMU extension - Rework kvm_pgtable initialisation ahead of the NV work - New selftest for IRQ injection - Teach selftests about the lack of default IPA space and page sizes - Expand sysreg selftest to deal with Pointer Authentication - The usual bunch of cleanups and doc update	2022-01-07 10:42:19 -05:00
Catalin Marinas	945409a6ef	Merge branches 'for-next/misc', 'for-next/cache-ops-dzp', 'for-next/stacktrace', 'for-next/xor-neon', 'for-next/kasan', 'for-next/armv8_7-fp', 'for-next/atomics', 'for-next/bti', 'for-next/sve', 'for-next/kselftest' and 'for-next/kcsan', remote-tracking branch 'arm64/for-next/perf' into for-next/core * arm64/for-next/perf: (32 commits) arm64: perf: Don't register user access sysctl handler multiple times drivers: perf: marvell_cn10k: fix an IS_ERR() vs NULL check perf/smmuv3: Fix unused variable warning when CONFIG_OF=n arm64: perf: Support new DT compatibles arm64: perf: Simplify registration boilerplate arm64: perf: Support Denver and Carmel PMUs drivers/perf: hisi: Add driver for HiSilicon PCIe PMU docs: perf: Add description for HiSilicon PCIe PMU driver dt-bindings: perf: Add YAML schemas for Marvell CN10K LLC-TAD pmu bindings drivers: perf: Add LLC-TAD perf counter support perf/smmuv3: Synthesize IIDR from CoreSight ID registers perf/smmuv3: Add devicetree support dt-bindings: Add Arm SMMUv3 PMCG binding perf/arm-cmn: Add debugfs topology info perf/arm-cmn: Add CI-700 Support dt-bindings: perf: arm-cmn: Add CI-700 perf/arm-cmn: Support new IP features perf/arm-cmn: Demarcate CMN-600 specifics perf/arm-cmn: Move group validation data off-stack perf/arm-cmn: Optimise DTC counter accesses ... * for-next/misc: : Miscellaneous patches arm64: Use correct method to calculate nomap region boundaries arm64: Drop outdated links in comments arm64: errata: Fix exec handling in erratum `1418040` workaround arm64: Unhash early pointer print plus improve comment asm-generic: introduce io_stop_wc() and add implementation for ARM64 arm64: remove __dma__area() aliases docs/arm64: delete a space from tagged-address-abi arm64/fp: Add comments documenting the usage of state restore functions arm64: mm: Use asid feature macro for cheanup arm64: mm: Rename asid2idx() to ctxid2asid() arm64: kexec: reduce calls to page_address() arm64: extable: remove unused ex_handler_t definition arm64: entry: Use SDEI event constants arm64: Simplify checking for populated DT arm64/kvm: Fix bitrotted comment for SVE handling in handle_exit.c for-next/cache-ops-dzp: : Avoid DC instructions when DCZID_EL0.DZP == 1 arm64: mte: DC {GVA,GZVA} shouldn't be used when DCZID_EL0.DZP == 1 arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1 * for-next/stacktrace: : Unify the arm64 unwind code arm64: Make some stacktrace functions private arm64: Make dump_backtrace() use arch_stack_walk() arm64: Make profile_pc() use arch_stack_walk() arm64: Make return_address() use arch_stack_walk() arm64: Make __get_wchan() use arch_stack_walk() arm64: Make perf_callchain_kernel() use arch_stack_walk() arm64: Mark __switch_to() as __sched arm64: Add comment for stack_info::kr_cur arch: Make ARCH_STACKWALK independent of STACKTRACE * for-next/xor-neon: : Use SHA3 instructions to speed up XOR arm64/xor: use EOR3 instructions when available * for-next/kasan: : Log potential KASAN shadow aliases arm64: mm: log potential KASAN shadow alias arm64: mm: use die_kernel_fault() in do_mem_abort() * for-next/armv8_7-fp: : Add HWCAPS for ARMv8.7 FEAT_AFP amd FEAT_RPRES arm64: cpufeature: add HWCAP for FEAT_RPRES arm64: add ID_AA64ISAR2_EL1 sys register arm64: cpufeature: add HWCAP for FEAT_AFP * for-next/atomics: : arm64 atomics clean-ups and codegen improvements arm64: atomics: lse: define RETURN ops in terms of FETCH ops arm64: atomics: lse: improve constraints for simple ops arm64: atomics: lse: define ANDs in terms of ANDNOTs arm64: atomics lse: define SUBs in terms of ADDs arm64: atomics: format whitespace consistently * for-next/bti: : BTI clean-ups arm64: Ensure that the 'bti' macro is defined where linkage.h is included arm64: Use BTI C directly and unconditionally arm64: Unconditionally override SYM_FUNC macros arm64: Add macro version of the BTI instruction arm64: ftrace: add missing BTIs arm64: kexec: use __pa_symbol(empty_zero_page) arm64: update PAC description for kernel * for-next/sve: : SVE code clean-ups and refactoring in prepararation of Scalable Matrix Extensions arm64/sve: Minor clarification of ABI documentation arm64/sve: Generalise vector length configuration prctl() for SME arm64/sve: Make sysctl interface for SVE reusable by SME * for-next/kselftest: : arm64 kselftest additions kselftest/arm64: Add pidbench for floating point syscall cases kselftest/arm64: Add a test program to exercise the syscall ABI kselftest/arm64: Allow signal tests to trigger from a function kselftest/arm64: Parameterise ptrace vector length information * for-next/kcsan: : Enable KCSAN for arm64 arm64: Enable KCSAN	2022-01-05 18:14:32 +00:00
Marc Zyngier	1c53a1ae36	Merge branch kvm-arm64/misc-5.17 into kvmarm-master/next * kvm-arm64/misc-5.17: : . : Misc fixes and improvements: : - Add minimal support for ARMv8.7's PMU extension : - Constify kvm_io_gic_ops : - Drop kvm_is_transparent_hugepage() prototype : - Drop unused workaround_flags field : - Rework kvm_pgtable initialisation : - Documentation fixes : - Replace open-coded SCTLR_EL1.EE useage with its defined macro : - Sysreg list selftest update to handle PAuth : - Include cleanups : . KVM: arm64: vgic: Replace kernel.h with the necessary inclusions KVM: arm64: Fix comment typo in kvm_vcpu_finalize_sve() KVM: arm64: selftests: get-reg-list: Add pauth configuration KVM: arm64: Fix comment on barrier in kvm_psci_vcpu_on() KVM: arm64: Fix comment for kvm_reset_vcpu() KVM: arm64: Use defined value for SCTLR_ELx_EE KVM: arm64: Rework kvm_pgtable initialisation KVM: arm64: Drop unused workaround_flags vcpu field Signed-off-by: Marc Zyngier <maz@kernel.org>	2022-01-04 17:16:15 +00:00
Jakub Kicinski	aec53e60e0	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net drivers/net/ethernet/mellanox/mlx5/core/en_tc.c commit `077cdda764` ("net/mlx5e: TC, Fix memory leak with rules with internal port") commit `31108d142f` ("net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()'") commit `4390c6edc0` ("net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()'") https://lore.kernel.org/all/20211229065352.30178-1-saeed@kernel.org/ net/smc/smc_wr.c commit `49dc9013e3` ("net/smc: Use the bitmap API when applicable") commit `349d43127d` ("net/smc: fix kernel panic caused by race of smc_sock") bitmap_zero()/memset() is removed by the fix Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-12-30 12:12:12 -08:00
Rafael J. Wysocki	5ee22fa4a9	Merge branch 'cpufreq/arm/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm Pull ARM cpufreq updates for 5.17-rc1 from Viresh Kumar: "- Qcom cpufreq driver updates improve irq support (Ard Biesheuvel, Stephen Boyd, and Vladimir Zapolskiy). - Fixes double devm_remap for mediatek driver (Hector Yuan). - Introduces thermal pressure helpers (Lukasz Luba)." * 'cpufreq/arm/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: cpufreq: mediatek-hw: Fix double devm_remap in hotplug case cpufreq: qcom-hw: Use optional irq API cpufreq: qcom-hw: Set CPU affinity of dcvsh interrupts cpufreq: qcom-hw: Fix probable nested interrupt handling cpufreq: qcom-cpufreq-hw: Avoid stack buffer for IRQ name arch_topology: Remove unused topology_set_thermal_pressure() and related cpufreq: qcom-cpufreq-hw: Use new thermal pressure update function cpufreq: qcom-cpufreq-hw: Update offline CPUs per-cpu thermal pressure thermal: cpufreq_cooling: Use new thermal pressure update function arch_topology: Introduce thermal pressure update function	2021-12-30 15:49:54 +01:00
Linus Torvalds	a8ad9a2434	Another EFI fix for v5.16 - Prevent missing prototype warning from breaking the build under CONFIG_WERROR=y -----BEGIN PGP SIGNATURE----- iQGzBAABCgAdFiEE+9lifEBpyUIVN1cpw08iOZLZjyQFAmHJrdYACgkQw08iOZLZ jyRS/wwAvrrv6xidZNqm/BAr4bhY9saQ4BZTracZuHJwD7300H9mjlMDMnVbLhOU oVv9hEHDqBqBKgZxIYc0ok4SMZEpieFhLuTdXzVf5dMg+BL+qHR0EQp8kLp8mssL xbPnMidecL2TfjPqLn34MVPe46uQo2yioxET4hDvWA5xsWNKKd0AeKVzUzdV2AVE elnHsdsBWpCexaIJkQEFJWf/Y3xUlzcqp7zcqgJq4hq3fZlEbD8SJ7TUM7ReAQJ6 5afzYXBm086sn602hZVbAtOy94vLBBQswnbKntmFpcT+z/GLNjycx+A4Z2h+6jjx ImiIkNZA2N5ij12riXdx3n6OG0LMuILlRhhqyckgrgHyc1oYCwaUq+LOuVQENZF+ YNGfyntQ+Y0F9dfG28atnFiL5ZFZcCSJxVZ3Dl2bql2DHptbQNHOBdXKOcDWTkpc JKTs3j3rigZ1I01b1SgWzH418do8s8y4iv4nmp3EdMUO8v8rEFMx3yHIHKSRjMqj 1vAPZtWQ =coEy -----END PGP SIGNATURE----- Merge tag 'efi-urgent-for-v5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fix from Ard Biesheuvel: "Another EFI fix for v5.16: - Prevent missing prototype warning from breaking the build under CONFIG_WERROR=y" * tag 'efi-urgent-for-v5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi: Move efifb_setup_from_dmi() prototype from arch headers	2021-12-27 08:58:35 -08:00
Xiongfeng Wang	d5624bb29f	asm-generic: introduce io_stop_wc() and add implementation for ARM64 For memory accesses with write-combining attributes (e.g. those returned by ioremap_wc()), the CPU may wait for prior accesses to be merged with subsequent ones. But in some situation, such wait is bad for the performance. We introduce io_stop_wc() to prevent the merging of write-combining memory accesses before this macro with those after it. We add implementation for ARM64 using DGH instruction and provide NOP implementation for other architectures. Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Suggested-by: Will Deacon <will@kernel.org> Suggested-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20211221035556.60346-1-wangxiongfeng2@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-22 10:44:53 +00:00
Fuad Tabba	500ca5241b	KVM: arm64: Use defined value for SCTLR_ELx_EE Replace the hardcoded value with the existing definition. No functional change intended. Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211208192810.657360-1-tabba@google.com	2021-12-20 13:55:48 +00:00
Catalin Marinas	dd73d18e7f	arm64: Ensure that the 'bti' macro is defined where linkage.h is included Not all .S files include asm/assembler.h, however the SYM_FUNC_* definitions invoke the 'bti' macro. Include asm/assembler.h in asm/linkage.h. Fixes: `9be34be87c` ("arm64: Add macro version of the BTI instruction") Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-17 16:20:45 +00:00
Marc Zyngier	9d8604b285	KVM: arm64: Rework kvm_pgtable initialisation Ganapatrao reported that the kvm_pgtable->mmu pointer is more or less hardcoded to the main S2 mmu structure, while the nested code needs it to point to other instances (as we have one instance per nested context). Rework the initialisation of the kvm_pgtable structure so that this assumtion doesn't hold true anymore. This requires some minor changes to the order in which things are initialised (the mmu->arch pointer being the critical one). Reported-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Reviewed-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211129200150.351436-5-maz@kernel.org	2021-12-16 17:01:05 +00:00
Marc Zyngier	43d8ac2212	Merge branch kvm-arm64/pkvm-hyp-sharing into kvmarm-master/next * kvm-arm64/pkvm-hyp-sharing: : . : Series from Quentin Perret, implementing HYP page share/unshare: : : This series implements an unshare hypercall at EL2 in nVHE : protected mode, and makes use of it to unmmap guest-specific : data-structures from EL2 stage-1 during guest tear-down. : Crucially, the implementation of the share and unshare : routines use page refcounts in the host kernel to avoid : accidentally unmapping data-structures that overlap a common : page. : [...] : . KVM: arm64: pkvm: Unshare guest structs during teardown KVM: arm64: Expose unshare hypercall to the host KVM: arm64: Implement do_unshare() helper for unsharing memory KVM: arm64: Implement __pkvm_host_share_hyp() using do_share() KVM: arm64: Implement do_share() helper for sharing memory KVM: arm64: Introduce wrappers for host and hyp spin lock accessors KVM: arm64: Extend pkvm_page_state enumeration to handle absent pages KVM: arm64: pkvm: Refcount the pages shared with EL2 KVM: arm64: Introduce kvm_share_hyp() KVM: arm64: Implement kvm_pgtable_hyp_unmap() at EL2 KVM: arm64: Hook up ->page_count() for hypervisor stage-1 page-table KVM: arm64: Fixup hyp stage-1 refcount KVM: arm64: Refcount hyp stage-1 pgtable pages KVM: arm64: Provide {get,put}_page() stubs for early hyp allocator Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-12-16 13:06:09 +00:00
Quentin Perret	52b28657eb	KVM: arm64: pkvm: Unshare guest structs during teardown Make use of the newly introduced unshare hypercall during guest teardown to unmap guest-related data structures from the hyp stage-1. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211215161232.1480836-15-qperret@google.com	2021-12-16 12:58:57 +00:00
Will Deacon	b8cc6eb5bd	KVM: arm64: Expose unshare hypercall to the host Introduce an unshare hypercall which can be used to unmap memory from the hypervisor stage-1 in nVHE protected mode. This will be useful to update the EL2 ownership state of pages during guest teardown, and avoids keeping dangling mappings to unreferenced portions of memory. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211215161232.1480836-14-qperret@google.com	2021-12-16 12:58:57 +00:00
Quentin Perret	3f868e142c	KVM: arm64: Introduce kvm_share_hyp() The create_hyp_mappings() function can currently be called at any point in time. However, its behaviour in protected mode changes widely depending on when it is being called. Prior to KVM init, it is used to create the temporary page-table used to bring-up the hypervisor, and later on it is transparently turned into a 'share' hypercall when the kernel has lost control over the hypervisor stage-1. In order to prepare the ground for also unsharing pages with the hypervisor during guest teardown, introduce a kvm_share_hyp() function to make it clear in which places a share hypercall should be expected, as we will soon need a matching unshare hypercall in all those places. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211215161232.1480836-7-qperret@google.com	2021-12-16 12:58:56 +00:00
Will Deacon	82bb02445d	KVM: arm64: Implement kvm_pgtable_hyp_unmap() at EL2 Implement kvm_pgtable_hyp_unmap() which can be used to remove hypervisor stage-1 mappings at EL2. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211215161232.1480836-6-qperret@google.com	2021-12-16 12:58:56 +00:00
Mark Brown	30c43e73b3	arm64/sve: Generalise vector length configuration prctl() for SME In preparation for adding SME support update the bulk of the implementation for the vector length configuration prctl() calls to be independent of vector type. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20211210184133.320748-3-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 18:33:44 +00:00
Mark Brown	742a15b1a2	arm64: Use BTI C directly and unconditionally Now we have a macro for BTI C that looks like a regular instruction change all the users of the current BTI_C macro to just emit a BTI C directly and remove the macro. This does mean that we now unconditionally BTI annotate all assembly functions, meaning that they are worse in this respect than code generated by the compiler. The overhead should be minimal for implementations with a reasonable HINT implementation. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20211214152714.2380849-4-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 18:12:58 +00:00
Mark Brown	481ee45ce9	arm64: Unconditionally override SYM_FUNC macros Currently we only override the SYM_FUNC macros when we need to insert BTI C into them, do this unconditionally to make it more likely that we'll notice bugs in our override. Suggested-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20211214152714.2380849-3-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 18:12:58 +00:00
Mark Brown	9be34be87c	arm64: Add macro version of the BTI instruction BTI is only available from v8.5 so we need to encode it using HINT in generic code and for older toolchains. Add an assembler macro based on one written by Mark Rutland which lets us use the mnemonic and update the existing users. Suggested-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20211214152714.2380849-2-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 18:12:58 +00:00
Mark Rutland	053f58bab3	arm64: atomics: lse: define RETURN ops in terms of FETCH ops The FEAT_LSE atomic instructions include LD* instructions which return the original value of a memory location can be used to directly implement FETCH opertations. Each RETURN op is implemented as a copy of the corresponding FETCH op with a trailing instruction to generate the new value of the memory location. We only directly implement _fetch_add(), for which we have a trailing `add` instruction. As the compiler has no visibility of the `add`, this leads to less than optimal code generation when consuming the result. For example, the compiler cannot constant-fold the addition into later operations, and currently GCC 11.1.0 will compile: return __lse_atomic_sub_return(1, v) == 0; As: mov w1, #0xffffffff ldaddal w1, w2, [x0] add w1, w1, w2 cmp w1, #0x0 cset w0, eq // eq = none ret This patch improves this by replacing the `add` with C addition after the inline assembly block, e.g. ret += i; This allows the compiler to manipulate `i`. This permits the compiler to merge the `add` and `cmp` for the above, e.g. mov w1, #0xffffffff ldaddal w1, w1, [x0] cmp w1, #0x1 cset w0, eq // eq = none ret With this change the assembly for each RETURN op is identical to the corresponding FETCH op (including barriers and clobbers) so I've removed the inline assembly and rewritten each RETURN op in terms of the corresponding FETCH op, e.g. \| static inline void __lse_atomic_add_return(int i, atomic_t *v) \| { \| return __lse_atomic_fetch_add(i, v) + i \| } The new construction does not adversely affect the common case, and before and after this patch GCC 11.1.0 can compile: __lse_atomic_add_return(i, v) As: ldaddal w0, w2, [x1] add w0, w0, w2 ... while having the freedom to do better elsewhere. This is intended as an optimization and cleanup. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20211210151410.2782645-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 13:00:33 +00:00
Mark Rutland	8a578a759a	arm64: atomics: lse: improve constraints for simple ops We have overly conservative assembly constraints for the basic FEAT_LSE atomic instructions, and using more accurate and permissive constraints will allow for better code generation. The FEAT_LSE basic atomic instructions have come in two forms: LD{op}{order}{size} <Rs>, <Rt>, [<Rn>] ST{op}{order}{size} <Rs>, [<Rn>] The ST* forms are aliases of the LD* forms where: ST{op}{order}{size} <Rs>, [<Rn>] Is: LD{op}{order}{size} <Rs>, XZR, [<Rn>] For either form, both <Rs> and <Rn> are read but not written back to, and <Rt> is written with the original value of the memory location. Where (<Rt> == <Rs>) or (<Rt> == <Rn>), <Rt> is written after the other register value(s) are consumed. There are no UNPREDICTABLE or CONSTRAINED UNPREDICTABLE behaviours when any pair of <Rs>, <Rt>, or <Rn> are the same register. Our current inline assembly always uses <Rs> == <Rt>, treating this register as both an input and an output (using a '+r' constraint). This forces the compiler to do some unnecessary register shuffling and/or redundant value generation. For example, the compiler cannot reuse the <Rs> value, and currently GCC 11.1.0 will compile: __lse_atomic_add(1, a); __lse_atomic_add(1, b); __lse_atomic_add(1, c); As: mov w3, #0x1 mov w4, w3 stadd w4, [x0] mov w0, w3 stadd w0, [x1] stadd w3, [x2] We can improve this with more accurate constraints, separating <Rs> and <Rt>, where <Rs> is an input-only register ('r'), and <Rt> is an output-only value ('=r'). As <Rt> is written back after <Rs> is consumed, it does not need to be earlyclobber ('=&r'), leaving the compiler free to use the same register for both <Rs> and <Rt> where this is desirable. At the same time, the redundant 'r' constraint for `v` is removed, as the `+Q` constraint is sufficient. With this change, the above example becomes: mov w3, #0x1 stadd w3, [x0] stadd w3, [x1] stadd w3, [x2] I've made this change for the non-value-returning and FETCH ops. The RETURN ops have a multi-instruction sequence for which we cannot use the same constraints, and a subsequent patch will rewrite hte RETURN ops in terms of the FETCH ops, relying on the ability for the compiler to reuse the <Rs> value. This is intended as an optimization. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20211210151410.2782645-5-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 13:00:23 +00:00
Mark Rutland	5e9e43c987	arm64: atomics: lse: define ANDs in terms of ANDNOTs The FEAT_LSE atomic instructions include atomic bit-clear instructions (`ldclr` and `stclr`) which can be used to directly implement ANDNOT operations. Each AND op is implemented as a copy of the corresponding ANDNOT op with a leading `mvn` instruction to apply a bitwise NOT to the `i` argument. As the compiler has no visibility of the `mvn`, this leads to less than optimal code generation when generating `i` into a register. For example, __lse_atomic_fetch_and(0xf, v) can be compiled to: mov w1, #0xf mvn w1, w1 ldclral w1, w1, [x2] This patch improves this by replacing the `mvn` with NOT in C before the inline assembly block, e.g. i = ~i; This allows the compiler to generate `i` into a register more optimally, e.g. mov w1, #0xfffffff0 ldclral w1, w1, [x2] With this change the assembly for each AND op is identical to the corresponding ANDNOT op (including barriers and clobbers), so I've removed the inline assembly and rewritten each AND op in terms of the corresponding ANDNOT op, e.g. \| static inline void __lse_atomic_and(int i, atomic_t *v) \| { \| return __lse_atomic_andnot(~i, v); \| } This is intended as an optimization and cleanup. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20211210151410.2782645-4-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 13:00:23 +00:00
Mark Rutland	ef53245060	arm64: atomics lse: define SUBs in terms of ADDs The FEAT_LSE atomic instructions include atomic ADD instructions (`stadd` and `ldadd`), but do not include atomic SUB instructions, so we must build all of the SUB operations using the ADD instructions. We open-code these today, with each SUB op implemented as a copy of the corresponding ADD op with a leading `neg` instruction in the inline assembly to negate the `i` argument. As the compiler has no visibility of the `neg`, this leads to less than optimal code generation when generating `i` into a register. For example, __les_atomic_fetch_sub(1, v) can be compiled to: mov w1, #0x1 neg w1, w1 ldaddal w1, w1, [x2] This patch improves this by replacing the `neg` with negation in C before the inline assembly block, e.g. i = -i; This allows the compiler to generate `i` into a register more optimally, e.g. mov w1, #0xffffffff ldaddal w1, w1, [x2] With this change the assembly for each SUB op is identical to the corresponding ADD op (including barriers and clobbers), so I've removed the inline assembly and rewritten each SUB op in terms of the corresponding ADD op, e.g. \| static inline void __lse_atomic_sub(int i, atomic_t *v) \| { \| __lse_atomic_add(-i, v); \| } For clarity I've moved the definition of each SUB op immediately after the corresponding ADD op, and used a single macro to create the RETURN forms of both ops. This is intended as an optimization and cleanup. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20211210151410.2782645-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 13:00:23 +00:00
Mark Rutland	8e6082e94a	arm64: atomics: format whitespace consistently The code for the atomic ops is formatted inconsistently, and while this is not a functional problem it is rather distracting when working on them. Some have ops have consistent indentation, e.g. \| #define ATOMIC_OP_ADD_RETURN(name, mb, cl...) \ \| static inline int __lse_atomic_add_return##name(int i, atomic_t v) \ \| { \ \| u32 tmp; \ \| \ \| asm volatile( \ \| __LSE_PREAMBLE \ \| " ldadd" #mb " %w[i], %w[tmp], %[v]\n" \ \| " add %w[i], %w[i], %w[tmp]" \ \| : [i] "+r" (i), [v] "+Q" (v->counter), [tmp] "=&r" (tmp) \ \| : "r" (v) \ \| : cl); \ \| \ \| return i; \ \| } While others have negative indentation for some lines, and/or have misaligned trailing backslashes, e.g. \| static inline void __lse_atomic_##op(int i, atomic_t v) \ \| { \ \| asm volatile( \ \| __LSE_PREAMBLE \ \| " " #asm_op " %w[i], %[v]\n" \ \| : [i] "+r" (i), [v] "+Q" (v->counter) \ \| : "r" (v)); \ \| } This patch makes the indentation consistent and also aligns the trailing backslashes. This makes the code easier to read for those (like myself) who are easily distracted by these inconsistencies. This is intended as a cleanup. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Acked-by: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20211210151410.2782645-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-14 13:00:23 +00:00
Joey Gouly	1175011a7d	arm64: cpufeature: add HWCAP for FEAT_RPRES Add a new HWCAP to detect the Increased precision of Reciprocal Estimate and Reciprocal Square Root Estimate feature (FEAT_RPRES), introduced in Armv8.7. Also expose this to userspace in the ID_AA64ISAR2_EL1 feature register. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211210165432.8106-4-joey.gouly@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-13 18:53:00 +00:00
Joey Gouly	9e45365f14	arm64: add ID_AA64ISAR2_EL1 sys register This is a new ID register, introduced in 8.7. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Reiji Watanabe <reijiw@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211210165432.8106-3-joey.gouly@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-13 18:53:00 +00:00
Joey Gouly	5c13f042e7	arm64: cpufeature: add HWCAP for FEAT_AFP Add a new HWCAP to detect the Alternate Floating-point Behaviour feature (FEAT_AFP), introduced in Armv8.7. Also expose this to userspace in the ID_AA64MMFR1_EL1 feature register. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20211210165432.8106-2-joey.gouly@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-13 18:52:59 +00:00
Javier Martinez Canillas	4bc5e64e6c	efi: Move efifb_setup_from_dmi() prototype from arch headers Commit `8633ef82f1` ("drivers/firmware: consolidate EFI framebuffer setup for all arches") made the Generic System Framebuffers (sysfb) driver able to be built on non-x86 architectures. But it left the efifb_setup_from_dmi() function prototype declaration in the architecture specific headers. This could lead to the following compiler warning as reported by the kernel test robot: drivers/firmware/efi/sysfb_efi.c:70:6: warning: no previous prototype for function 'efifb_setup_from_dmi' [-Wmissing-prototypes] void efifb_setup_from_dmi(struct screen_info si, const char opt) ^ drivers/firmware/efi/sysfb_efi.c:70:1: note: declare 'static' if the function is not intended to be used outside of this translation unit void efifb_setup_from_dmi(struct screen_info si, const char opt) Fixes: `8633ef82f1` ("drivers/firmware: consolidate EFI framebuffer setup for all arches") Reported-by: kernel test robot <lkp@intel.com> Cc: <stable@vger.kernel.org> # 5.15.x Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://lore.kernel.org/r/20211126001333.555514-1-javierm@redhat.com Signed-off-by: Ard Biesheuvel <ardb@kernel.org>	2021-12-13 15:07:16 +01:00
Ingo Molnar	6773cc31a9	Linux 5.16-rc5 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmG2fU0eHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGC7EH/3R7Rt+OD8Wn8Ss3 w8V+dBxVwa2u2oMTyUHPxaeOXZ7bi38XlUdLFPOK/76bGwO0a5TmYZqsWdRbGyT0 HfcYjHsQ0lbJXk/nh2oM47oJxJXVpThIHXJEk0FZ0Y5t+DYjIYlNHzqZymUyhLem St74zgWcyT+MXuqY34vB827FJDUnOxhhhi85tObeunaSPAomy9aiYidSC1ARREnz iz2VUntP/QnRnKVvL2nUZNzcz1xL5vfCRSKsRGRSv3qW1Y/1M71ylt6JVmSftWq+ VmMdFxFhdrb1OK/1ct/930Un/UP2NG9EJsWxote2XYlnVSZHzDqH7lUhbqgdCcLz 1m2tVNY= =7wRd -----END PGP SIGNATURE----- Merge tag 'v5.16-rc5' into locking/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2021-12-13 10:48:46 +01:00
Mark Rutland	d2d1d2645c	arm64: Make some stacktrace functions private Now that open-coded stack unwinds have been converted to arch_stack_walk(), we no longer need to expose any of unwind_frame(), walk_stackframe(), or start_backtrace() outside of stacktrace.c. Make those functions private to stacktrace.c, removing their prototypes from <asm/stacktrace.h> and marking them static. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Cc: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> Link: https://lore.kernel.org/r/20211129142849.3056714-10-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-10 14:06:04 +00:00
Mark Rutland	1e5428b2b7	arm64: Add comment for stack_info::kr_cur We added stack_info::kr_cur in commit: `cd9bc2c925` ("arm64: Recover kretprobe modified return address in stacktrace") ... but didn't add anything in the corresponding comment block. For consistency, add a corresponding comment. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviwed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211129142849.3056714-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-10 14:06:03 +00:00
Marc Zyngier	142ff9bddb	KVM: arm64: Drop unused workaround_flags vcpu field workaround_flags is a leftover from our earlier Spectre-v4 workaround implementation, and now serves no purpose. Get rid of the field and the corresponding asm-offset definition. Fixes: `29e8910a56` ("KVM: arm64: Simplify handling of ARCH_WORKAROUND_2") Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-12-08 14:54:07 +00:00
Paolo Bonzini	93b350f884	Merge branch 'kvm-on-hv-msrbm-fix' into HEAD Merge bugfix for enlightened MSR Bitmap, before adding support to KVM for exposing the feature to nested guests.	2021-12-08 05:30:48 -05:00
Sean Christopherson	005467e06b	KVM: Drop obsolete kvm_arch_vcpu_block_finish() Drop kvm_arch_vcpu_block_finish() now that all arch implementations are nops. No functional change intended. Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Matlack <dmatlack@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211009021236.4122790-10-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-12-08 04:24:50 -05:00
Sean Christopherson	6109c5a6ab	KVM: arm64: Move vGIC v4 handling for WFI out arch callback hook Move the put and reload of the vGIC out of the block/unblock callbacks and into a dedicated WFI helper. Functionally, this is nearly a nop as the block hook is called at the very beginning of kvm_vcpu_block(), and the only code in kvm_vcpu_block() after the unblock hook is to update the halt-polling controls, i.e. can only affect the next WFI. Back when the arch (un)blocking hooks were added by commits `3217f7c25b` ("KVM: Add kvm_arch_vcpu_{un}blocking callbacks) and `d35268da66` ("arm/arm64: KVM: arch_timer: Only schedule soft timer on vcpu_block"), the hooks were invoked only when KVM was about to "block", i.e. schedule out the vCPU. The use case at the time was to schedule a timer in the host based on the earliest timer in the guest in order to wake the blocking vCPU when the emulated guest timer fired. Commit `accb99bcd0` ("KVM: arm/arm64: Simplify bg_timer programming") reworked the timer logic to be even more precise, by waiting until the vCPU was actually scheduled out, and so move the timer logic from the (un)blocking hooks to vcpu_load/put. In the meantime, the hooks gained usage for enabling vGIC v4 doorbells in commit `df9ba95993` ("KVM: arm/arm64: GICv4: Use the doorbell interrupt as an unblocking source"), and added related logic for the VMCR in commit `5eeaf10eec` ("KVM: arm/arm64: Sync ICH_VMCR_EL2 back when about to block"). Finally, commit `07ab0f8d9a` ("KVM: Call kvm_arch_vcpu_blocking early into the blocking sequence") hoisted the (un)blocking hooks so that they wrapped KVM's halt-polling logic in addition to the core "block" logic. In other words, the original need for arch hooks to take action _only_ in the block path is long since gone. Cc: Oliver Upton <oupton@google.com> Cc: Marc Zyngier <maz@kernel.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211009021236.4122790-11-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-12-08 04:24:49 -05:00
Sebastian Andrzej Siewior	77993b595a	locking: Allow to include asm/spinlock_types.h from linux/spinlock_types_raw.h The printk header file includes ratelimit_types.h for its __ratelimit() based usage. It is required for the static initializer used in printk_ratelimited(). It uses a raw_spinlock_t and includes the spinlock_types.h. PREEMPT_RT substitutes spinlock_t with a rtmutex based implementation and so its spinlock_t implmentation (provided by spinlock_rt.h) includes rtmutex.h and atomic.h which leads to recursive includes where defines are missing. By including only the raw_spinlock_t defines it avoids the atomic.h related includes at this stage. An example on powerpc: \| CALL scripts/atomic/check-atomics.sh \|In file included from include/linux/bug.h:5, \| from include/linux/page-flags.h:10, \| from kernel/bounds.c:10: \|arch/powerpc/include/asm/page_32.h: In function âclear_pageâ: \|arch/powerpc/include/asm/bug.h:87:4: error: implicit declaration of function â=80=98__WARNâ=80=99 [-Werror=3Dimplicit-function-declaration] \| 87 \| __WARN(); \ \| \| ^~~~~~ \|arch/powerpc/include/asm/page_32.h:48:2: note: in expansion of macro âWARN_ONâ=99 \| 48 \| WARN_ON((unsigned long)addr & (L1_CACHE_BYTES - 1)); \| \| ^~~~~~~ \|arch/powerpc/include/asm/bug.h:58:17: error: invalid application of âsizeofâ=99 to incomplete type âstruct bug_entryâ=99 \| 58 \| "i" (sizeof(struct bug_entry)), \ \| \| ^~~~~~ \|arch/powerpc/include/asm/bug.h:89:3: note: in expansion of macro âBUG_ENTRYâ=99 \| 89 \| BUG_ENTRY(PPC_TLNEI " %4, 0", \ \| \| ^~~~~~~~~ \|arch/powerpc/include/asm/page_32.h:48:2: note: in expansion of macro âWARN_ONâ=99 \| 48 \| WARN_ON((unsigned long)addr & (L1_CACHE_BYTES - 1)); \| \| ^~~~~~~ \|In file included from arch/powerpc/include/asm/ptrace.h:298, \| from arch/powerpc/include/asm/hw_irq.h:12, \| from arch/powerpc/include/asm/irqflags.h:12, \| from include/linux/irqflags.h:16, \| from include/asm-generic/cmpxchg-local.h:6, \| from arch/powerpc/include/asm/cmpxchg.h:526, \| from arch/powerpc/include/asm/atomic.h:11, \| from include/linux/atomic.h:7, \| from include/linux/rwbase_rt.h:6, \| from include/linux/rwlock_types.h:55, \| from include/linux/spinlock_types.h:74, \| from include/linux/ratelimit_types.h:7, \| from include/linux/printk.h:10, \| from include/asm-generic/bug.h:22, \| from arch/powerpc/include/asm/bug.h:109, \| from include/linux/bug.h:5, \| from include/linux/page-flags.h:10, \| from kernel/bounds.c:10: \|include/linux/thread_info.h: In function â=80=98copy_overflowâ=80=99: \|include/linux/thread_info.h:210:2: error: implicit declaration of function â=80=98WARNâ=80=99 [-Werror=3Dimplicit-function-declaration] \| 210 \| WARN(1, "Buffer overflow detected (%d < %lu)!\n", size, count); \| \| ^~~~ The WARN / BUG include pulls in printk.h and then ptrace.h expects WARN (from bug.h) which is not yet complete. Even hw_irq.h has WARN_ON() statements. On POWERPC64 there are missing atomic64 defines while building 32bit VDSO: \| VDSO32C arch/powerpc/kernel/vdso32/vgettimeofday.o \|In file included from include/linux/atomic.h:80, \| from include/linux/rwbase_rt.h:6, \| from include/linux/rwlock_types.h:55, \| from include/linux/spinlock_types.h:74, \| from include/linux/ratelimit_types.h:7, \| from include/linux/printk.h:10, \| from include/linux/kernel.h:19, \| from arch/powerpc/include/asm/page.h:11, \| from arch/powerpc/include/asm/vdso/gettimeofday.h:5, \| from include/vdso/datapage.h:137, \| from lib/vdso/gettimeofday.c:5, \| from <command-line>: \|include/linux/atomic-arch-fallback.h: In function âarch_atomic64_incâ=99: \|include/linux/atomic-arch-fallback.h:1447:2: error: implicit declaration of function âarch_atomic64_addâ; did you mean âarch_atomic_addâ? [-Werror=3Dimpl \|icit-function-declaration] \| 1447 \| arch_atomic64_add(1, v); \| \| ^~~~~~~~~~~~~~~~~ \| \| arch_atomic_add The generic fallback is not included, atomics itself are not used. If kernel.h does not include printk.h then it comes later from the bug.h include. Allow asm/spinlock_types.h to be included from linux/spinlock_types_raw.h. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20211129174654.668506-12-bigeasy@linutronix.de	2021-12-07 15:14:12 +01:00
Marc Zyngier	94b4a6d521	Merge branch kvm-arm64/misc-5.17 into kvmarm-master/next * kvm-arm64/misc-5.17: : . : - Add minimal support for ARMv8.7's PMU extension : - Constify kvm_io_gic_ops : - Drop kvm_is_transparent_hugepage() prototype : . KVM: Drop stale kvm_is_transparent_hugepage() declaration KVM: arm64: Constify kvm_io_gic_ops KVM: arm64: Add minimal handling for the ARMv8.7 PMU Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-12-07 09:14:53 +00:00
Marc Zyngier	370a17f531	Merge branch kvm-arm64/hyp-header-split into kvmarm-master/next * kvm-arm64/hyp-header-split: : . : Tidy up the header file usage for the nvhe hyp object so : that header files under arch/arm64/kvm/hyp/include are not : included by host code running at EL1. : . KVM: arm64: Move host EL1 code out of hyp/ directory KVM: arm64: Generate hyp_constants.h for the host arm64: Add missing include of asm/cpufeature.h to asm/mmu.h Signed-off-by: Marc Zyngier <maz@kernel.org>	2021-12-07 09:14:49 +00:00
Reiji Watanabe	685e2564da	arm64: mte: DC {GVA,GZVA} shouldn't be used when DCZID_EL0.DZP == 1 Currently, mte_set_mem_tag_range() and mte_zero_clear_page_tags() use DC {GVA,GZVA} unconditionally. But, they should make sure that DCZID_EL0.DZP, which indicates whether or not use of those instructions is prohibited, is zero when using those instructions. Use ST{G,ZG,Z2G} instead when DCZID_EL0.DZP == 1. Fixes: `013bb59dbb` ("arm64: mte: handle tags zeroing at page allocation time") Fixes: `3d0cca0b02` ("kasan: speed up mte_set_mem_tag_range") Signed-off-by: Reiji Watanabe <reijiw@google.com> Link: https://lore.kernel.org/r/20211206004736.1520989-3-reijiw@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2021-12-06 17:02:10 +00:00

1 2 3 4 5 ...

4061 Commits