mirror of
https://mirrors.bfsu.edu.cn/git/linux.git
synced 2024-12-04 17:44:14 +08:00
c947948f7a
160 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Mark Rutland
|
f750255fda |
arm64: insn: always inline hint generation
All users of aarch64_insn_gen_hint() (e.g. aarch64_insn_gen_nop()) pass a constant argument and generate a constant value. Some of those users are noinstr code (e.g. for alternatives patching). For noinstr code it is necessary to either inline these functions or to ensure the out-of-line versions are noinstr. Since in all cases these are generating a constant, make them __always_inline. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Link: https://lore.kernel.org/r/20221114135928.3000571-5-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> |
||
Mark Rutland
|
4488f90c86 |
arm64: insn: simplify insn group identification
The only code which needs to check for an entire instruction group is the aarch64_insn_is_steppable() helper function used by kprobes, which must not be instrumented, and only needs to check for the "Branch, exception generation and system instructions" class. Currently we have an out-of-line helper in insn.c which must be marked as __kprobes, which indexes a table with some bits extracted from the instruction. In aarch64_insn_is_steppable() we then need to compare the result with an expected enum value. It would be simpler to have a predicate for this, as with the other aarch64_insn_is_*() helpers, which would be always inlined to prevent inadvertent instrumentation, and would permit better code generation. This patch adds a predicate function for this instruction group using the existing __AARCH64_INSN_FUNCS() helpers, and removes the existing out-of-line helper. As the only class we currently care about is the branch+exception+sys class, I have only added helpers for this, and left the other classes unimplemented for now. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Link: https://lore.kernel.org/r/20221114135928.3000571-4-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> |
||
Mark Rutland
|
9b948e79d5 |
arm64: insn: always inline predicates
We have a number of aarch64_insn_*() predicates which are used in code which is not instrumentation safe (e.g. alternatives patching, kprobes). Some of those are marked with __kprobes, but most are not, and are implemented out-of-line in insn.c. This patch moves the predicates to insn.h and marks them with __always_inline. This is ensures that they will respect the instrumentation requirements of their caller which they will be inlined into. At the same time, I've formatted each of the functions consistently as a list, to make them easier to read and update in future. Other than preventing unwanted instrumentation, there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Link: https://lore.kernel.org/r/20221114135928.3000571-3-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> |
||
Mark Rutland
|
1b6bf2da7b |
arm64: insn: remove aarch64_insn_gen_prefetch()
There are no users of aarch64_insn_gen_prefetch(), and which encodes a PRFM (immediate) with a hard-coded offset of 0. Remove it for now; we can always restore it with tests if we need it in future. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Link: https://lore.kernel.org/r/20221114135928.3000571-2-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> |
||
Linus Torvalds
|
f86d1fbbe7 |
Networking changes for 6.0.
Core ---- - Refactor the forward memory allocation to better cope with memory pressure with many open sockets, moving from a per socket cache to a per-CPU one - Replace rwlocks with RCU for better fairness in ping, raw sockets and IP multicast router. - Network-side support for IO uring zero-copy send. - A few skb drop reason improvements, including codegen the source file with string mapping instead of using macro magic. - Rename reference tracking helpers to a more consistent netdev_* schema. - Adapt u64_stats_t type to address load/store tearing issues. - Refine debug helper usage to reduce the log noise caused by bots. BPF --- - Improve socket map performance, avoiding skb cloning on read operation. - Add support for 64 bits enum, to match types exposed by kernel. - Introduce support for sleepable uprobes program. - Introduce support for enum textual representation in libbpf. - New helpers to implement synproxy with eBPF/XDP. - Improve loop performances, inlining indirect calls when possible. - Removed all the deprecated libbpf APIs. - Implement new eBPF-based LSM flavor. - Add type match support, which allow accurate queries to the eBPF used types. - A few TCP congetsion control framework usability improvements. - Add new infrastructure to manipulate CT entries via eBPF programs. - Allow for livepatch (KLP) and BPF trampolines to attach to the same kernel function. Protocols --------- - Introduce per network namespace lookup tables for unix sockets, increasing scalability and reducing contention. - Preparation work for Wi-Fi 7 Multi-Link Operation (MLO) support. - Add support to forciby close TIME_WAIT TCP sockets via user-space tools. - Significant performance improvement for the TLS 1.3 receive path, both for zero-copy and not-zero-copy. - Support for changing the initial MTPCP subflow priority/backup status - Introduce virtually contingus buffers for sockets over RDMA, to cope better with memory pressure. - Extend CAN ethtool support with timestamping capabilities - Refactor CAN build infrastructure to allow building only the needed features. Driver API ---------- - Remove devlink mutex to allow parallel commands on multiple links. - Add support for pause stats in distributed switch. - Implement devlink helpers to query and flash line cards. - New helper for phy mode to register conversion. New hardware / drivers ---------------------- - Ethernet DSA driver for the rockchip mt7531 on BPI-R2 Pro. - Ethernet DSA driver for the Renesas RZ/N1 A5PSW switch. - Ethernet DSA driver for the Microchip LAN937x switch. - Ethernet PHY driver for the Aquantia AQR113C EPHY. - CAN driver for the OBD-II ELM327 interface. - CAN driver for RZ/N1 SJA1000 CAN controller. - Bluetooth: Infineon CYW55572 Wi-Fi plus Bluetooth combo device. Drivers ------- - Intel Ethernet NICs: - i40e: add support for vlan pruning - i40e: add support for XDP framented packets - ice: improved vlan offload support - ice: add support for PPPoE offload - Mellanox Ethernet (mlx5) - refactor packet steering offload for performance and scalability - extend support for TC offload - refactor devlink code to clean-up the locking schema - support stacked vlans for bridge offloads - use TLS objects pool to improve connection rate - Netronome Ethernet NICs (nfp): - extend support for IPv6 fields mangling offload - add support for vepa mode in HW bridge - better support for virtio data path acceleration (VDPA) - enable TSO by default - Microsoft vNIC driver (mana) - add support for XDP redirect - Others Ethernet drivers: - bonding: add per-port priority support - microchip lan743x: extend phy support - Fungible funeth: support UDP segmentation offload and XDP xmit - Solarflare EF100: add support for virtual function representors - MediaTek SoC: add XDP support - Mellanox Ethernet/IB switch (mlxsw): - dropped support for unreleased H/W (XM router). - improved stats accuracy - unified bridge model coversion improving scalability (parts 1-6) - support for PTP in Spectrum-2 asics - Broadcom PHYs - add PTP support for BCM54210E - add support for the BCM53128 internal PHY - Marvell Ethernet switches (prestera): - implement support for multicast forwarding offload - Embedded Ethernet switches: - refactor OcteonTx MAC filter for better scalability - improve TC H/W offload for the Felix driver - refactor the Microchip ksz8 and ksz9477 drivers to share the probe code (parts 1, 2), add support for phylink mac configuration - Other WiFi: - Microchip wilc1000: diable WEP support and enable WPA3 - Atheros ath10k: encapsulation offload support Old code removal: - Neterion vxge ethernet driver: this is untouched since more than 10 years. Signed-off-by: Paolo Abeni <pabeni@redhat.com> -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmLqN+oSHHBhYmVuaUBy ZWRoYXQuY29tAAoJECkkeY3MjxOkB9kQAI9VqW0c3SfiTJnkVBEIovZ6Tnh5stD2 UYFkh1BdchLsYxi7W4XMpVPSzRztiTP87mIx5c/KvIzj+QNeWL1XWRJSPdI9HhTD pTAA/tM2OG7bqrbyQiKDNfpQdNl7+kk1RwnYd+f9RFl1QVuIJaYhmjVwrsN5xF/+ jUsotpROarM2dGFWiFwJbKhP2zMDT+6qEEahM8pEPggKhv8wRLYjany2cZVEe4e0 WGUpbINAS8gEKm0Ob922WaDfDrcK/N1Z0jNz/kMaENkK18Vvc7F6bCO0DzAawKX9 QZMMwm6mHp3EThflJAMAzCGIYiIcwLhykgdyj8rrjPhFrWbMD2Sdsbo21HOXU/8j u4aAhVl+d+h7emmbgBoJ8sycVJ7BQlXz7lX20sTgADv9xI4/dPhQ17CMRuwX6fXX JSrn6P6e1LTV5CEg6vrlSPnKPY6uhFn/cPw47FxCjRwJ9phVnp+8uZWQmf9Pz3yf Ok/tcj+juFbsmuOshHy2cbRkuNZNS0oRWlSTBo5795ZwOLSakMonR3L+ev2aOvzz DVrFp2Y/iIVwMSFdCbouYdYnhArPRhOAtCmZc2afY8aBN7aaMgrdTy3+mzUoHy3I FG3K+VuKpfi0vY4zn6ZoLZDIpyXIoJJ93RcSGltD32t3Dp1RaQMVEI4s45k05PVm 1nYpXKHA8qML =hxEG -----END PGP SIGNATURE----- Merge tag 'net-next-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking changes from Paolo Abeni: "Core: - Refactor the forward memory allocation to better cope with memory pressure with many open sockets, moving from a per socket cache to a per-CPU one - Replace rwlocks with RCU for better fairness in ping, raw sockets and IP multicast router. - Network-side support for IO uring zero-copy send. - A few skb drop reason improvements, including codegen the source file with string mapping instead of using macro magic. - Rename reference tracking helpers to a more consistent netdev_* schema. - Adapt u64_stats_t type to address load/store tearing issues. - Refine debug helper usage to reduce the log noise caused by bots. BPF: - Improve socket map performance, avoiding skb cloning on read operation. - Add support for 64 bits enum, to match types exposed by kernel. - Introduce support for sleepable uprobes program. - Introduce support for enum textual representation in libbpf. - New helpers to implement synproxy with eBPF/XDP. - Improve loop performances, inlining indirect calls when possible. - Removed all the deprecated libbpf APIs. - Implement new eBPF-based LSM flavor. - Add type match support, which allow accurate queries to the eBPF used types. - A few TCP congetsion control framework usability improvements. - Add new infrastructure to manipulate CT entries via eBPF programs. - Allow for livepatch (KLP) and BPF trampolines to attach to the same kernel function. Protocols: - Introduce per network namespace lookup tables for unix sockets, increasing scalability and reducing contention. - Preparation work for Wi-Fi 7 Multi-Link Operation (MLO) support. - Add support to forciby close TIME_WAIT TCP sockets via user-space tools. - Significant performance improvement for the TLS 1.3 receive path, both for zero-copy and not-zero-copy. - Support for changing the initial MTPCP subflow priority/backup status - Introduce virtually contingus buffers for sockets over RDMA, to cope better with memory pressure. - Extend CAN ethtool support with timestamping capabilities - Refactor CAN build infrastructure to allow building only the needed features. Driver API: - Remove devlink mutex to allow parallel commands on multiple links. - Add support for pause stats in distributed switch. - Implement devlink helpers to query and flash line cards. - New helper for phy mode to register conversion. New hardware / drivers: - Ethernet DSA driver for the rockchip mt7531 on BPI-R2 Pro. - Ethernet DSA driver for the Renesas RZ/N1 A5PSW switch. - Ethernet DSA driver for the Microchip LAN937x switch. - Ethernet PHY driver for the Aquantia AQR113C EPHY. - CAN driver for the OBD-II ELM327 interface. - CAN driver for RZ/N1 SJA1000 CAN controller. - Bluetooth: Infineon CYW55572 Wi-Fi plus Bluetooth combo device. Drivers: - Intel Ethernet NICs: - i40e: add support for vlan pruning - i40e: add support for XDP framented packets - ice: improved vlan offload support - ice: add support for PPPoE offload - Mellanox Ethernet (mlx5) - refactor packet steering offload for performance and scalability - extend support for TC offload - refactor devlink code to clean-up the locking schema - support stacked vlans for bridge offloads - use TLS objects pool to improve connection rate - Netronome Ethernet NICs (nfp): - extend support for IPv6 fields mangling offload - add support for vepa mode in HW bridge - better support for virtio data path acceleration (VDPA) - enable TSO by default - Microsoft vNIC driver (mana) - add support for XDP redirect - Others Ethernet drivers: - bonding: add per-port priority support - microchip lan743x: extend phy support - Fungible funeth: support UDP segmentation offload and XDP xmit - Solarflare EF100: add support for virtual function representors - MediaTek SoC: add XDP support - Mellanox Ethernet/IB switch (mlxsw): - dropped support for unreleased H/W (XM router). - improved stats accuracy - unified bridge model coversion improving scalability (parts 1-6) - support for PTP in Spectrum-2 asics - Broadcom PHYs - add PTP support for BCM54210E - add support for the BCM53128 internal PHY - Marvell Ethernet switches (prestera): - implement support for multicast forwarding offload - Embedded Ethernet switches: - refactor OcteonTx MAC filter for better scalability - improve TC H/W offload for the Felix driver - refactor the Microchip ksz8 and ksz9477 drivers to share the probe code (parts 1, 2), add support for phylink mac configuration - Other WiFi: - Microchip wilc1000: diable WEP support and enable WPA3 - Atheros ath10k: encapsulation offload support Old code removal: - Neterion vxge ethernet driver: this is untouched since more than 10 years" * tag 'net-next-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1890 commits) doc: sfp-phylink: Fix a broken reference wireguard: selftests: support UML wireguard: allowedips: don't corrupt stack when detecting overflow wireguard: selftests: update config fragments wireguard: ratelimiter: use hrtimer in selftest net/mlx5e: xsk: Discard unaligned XSK frames on striding RQ net: usb: ax88179_178a: Bind only to vendor-specific interface selftests: net: fix IOAM test skip return code net: usb: make USB_RTL8153_ECM non user configurable net: marvell: prestera: remove reduntant code octeontx2-pf: Reduce minimum mtu size to 60 net: devlink: Fix missing mutex_unlock() call net/tls: Remove redundant workqueue flush before destroy net: txgbe: Fix an error handling path in txgbe_probe() net: dsa: Fix spelling mistakes and cleanup code Documentation: devlink: add add devlink-selftests to the table of contents dccp: put dccp_qpolicy_full() and dccp_qpolicy_push() in the same lock net: ionic: fix error check for vlan flags in ionic_set_nic_features() net: ice: fix error NETIF_F_HW_VLAN_CTAG_FILTER check in ice_vsi_sync_fltr() nfp: flower: add support for tunnel offload without key ID ... |
||
Xu Kuohai
|
f1e8a24ed2 |
arm64: Add LDR (literal) instruction
Add LDR (literal) instruction to load data from address relative to PC. This instruction will be used to implement long jump from bpf prog to bpf trampoline in the follow-up patch. The instruction encoding: 3 2 2 2 0 0 0 7 6 4 5 0 +-----+-------+---+-----+-------------------------------------+--------+ | 0 x | 0 1 1 | 0 | 0 0 | imm19 | Rt | +-----+-------+---+-----+-------------------------------------+--------+ for 32-bit, variant x == 0; for 64-bit, x == 1. branch_imm_common() is used to check the distance between pc and target address, since it's reused by this patch and LDR (literal) is not a branch instruction, rename it to label_imm_common(). Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/bpf/20220711150823.2128542-3-xukuohai@huawei.com |
||
Mark Brown
|
e97575533a |
arm64/mte: Standardise GMID field name definitions
Usually our defines for bitfields in system registers do not include a SYS_ prefix but those for GMID do. In preparation for automatic generation of defines remove that prefix. No functional change. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220704170302.2609529-9-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org> |
||
Linus Torvalds
|
bf9095424d |
S390:
* ultravisor communication device driver * fix TEID on terminating storage key ops RISC-V: * Added Sv57x4 support for G-stage page table * Added range based local HFENCE functions * Added remote HFENCE functions based on VCPU requests * Added ISA extension registers in ONE_REG interface * Updated KVM RISC-V maintainers entry to cover selftests support ARM: * Add support for the ARMv8.6 WFxT extension * Guard pages for the EL2 stacks * Trap and emulate AArch32 ID registers to hide unsupported features * Ability to select and save/restore the set of hypercalls exposed to the guest * Support for PSCI-initiated suspend in collaboration with userspace * GICv3 register-based LPI invalidation support * Move host PMU event merging into the vcpu data structure * GICv3 ITS save/restore fixes * The usual set of small-scale cleanups and fixes x86: * New ioctls to get/set TSC frequency for a whole VM * Allow userspace to opt out of hypercall patching * Only do MSR filtering for MSRs accessed by rdmsr/wrmsr AMD SEV improvements: * Add KVM_EXIT_SHUTDOWN metadata for SEV-ES * V_TSC_AUX support Nested virtualization improvements for AMD: * Support for "nested nested" optimizations (nested vVMLOAD/VMSAVE, nested vGIF) * Allow AVIC to co-exist with a nested guest running * Fixes for LBR virtualizations when a nested guest is running, and nested LBR virtualization support * PAUSE filtering for nested hypervisors Guest support: * Decoupling of vcpu_is_preempted from PV spinlocks -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmKN9M4UHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroNLeAf+KizAlQwxEehHHeNyTkZuKyMawrD6 zsqAENR6i1TxiXe7fDfPFbO2NR0ZulQopHbD9mwnHJ+nNw0J4UT7g3ii1IAVcXPu rQNRGMVWiu54jt+lep8/gDg0JvPGKVVKLhxUaU1kdWT9PhIOC6lwpP3vmeWkUfRi PFL/TMT0M8Nfryi0zHB0tXeqg41BiXfqO8wMySfBAHUbpv8D53D2eXQL6YlMM0pL 2quB1HxHnpueE5vj3WEPQ3PCdy1M2MTfCDBJAbZGG78Ljx45FxSGoQcmiBpPnhJr C6UGP4ZDWpml5YULUoA70k5ylCbP+vI61U4vUtzEiOjHugpPV5wFKtx5nw== =ozWx -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm updates from Paolo Bonzini: "S390: - ultravisor communication device driver - fix TEID on terminating storage key ops RISC-V: - Added Sv57x4 support for G-stage page table - Added range based local HFENCE functions - Added remote HFENCE functions based on VCPU requests - Added ISA extension registers in ONE_REG interface - Updated KVM RISC-V maintainers entry to cover selftests support ARM: - Add support for the ARMv8.6 WFxT extension - Guard pages for the EL2 stacks - Trap and emulate AArch32 ID registers to hide unsupported features - Ability to select and save/restore the set of hypercalls exposed to the guest - Support for PSCI-initiated suspend in collaboration with userspace - GICv3 register-based LPI invalidation support - Move host PMU event merging into the vcpu data structure - GICv3 ITS save/restore fixes - The usual set of small-scale cleanups and fixes x86: - New ioctls to get/set TSC frequency for a whole VM - Allow userspace to opt out of hypercall patching - Only do MSR filtering for MSRs accessed by rdmsr/wrmsr AMD SEV improvements: - Add KVM_EXIT_SHUTDOWN metadata for SEV-ES - V_TSC_AUX support Nested virtualization improvements for AMD: - Support for "nested nested" optimizations (nested vVMLOAD/VMSAVE, nested vGIF) - Allow AVIC to co-exist with a nested guest running - Fixes for LBR virtualizations when a nested guest is running, and nested LBR virtualization support - PAUSE filtering for nested hypervisors Guest support: - Decoupling of vcpu_is_preempted from PV spinlocks" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (199 commits) KVM: x86: Fix the intel_pt PMI handling wrongly considered from guest KVM: selftests: x86: Sync the new name of the test case to .gitignore Documentation: kvm: reorder ARM-specific section about KVM_SYSTEM_EVENT_SUSPEND x86, kvm: use correct GFP flags for preemption disabled KVM: LAPIC: Drop pending LAPIC timer injection when canceling the timer x86/kvm: Alloc dummy async #PF token outside of raw spinlock KVM: x86: avoid calling x86 emulator without a decoded instruction KVM: SVM: Use kzalloc for sev ioctl interfaces to prevent kernel data leak x86/fpu: KVM: Set the base guest FPU uABI size to sizeof(struct kvm_xsave) s390/uv_uapi: depend on CONFIG_S390 KVM: selftests: x86: Fix test failure on arch lbr capable platforms KVM: LAPIC: Trace LAPIC timer expiration on every vmentry KVM: s390: selftest: Test suppression indication on key prot exception KVM: s390: Don't indicate suppression on dirtying, failing memop selftests: drivers/s390x: Add uvdevice tests drivers/s390/char: Add Ultravisor io device MAINTAINERS: Update KVM RISC-V entry to cover selftests support RISC-V: KVM: Introduce ISA extension register RISC-V: KVM: Cleanup stale TLB entries when host CPU changes RISC-V: KVM: Add remote HFENCE functions based on VCPU requests ... |
||
Linus Torvalds
|
7e062cda7d |
Networking changes for 5.19.
Core ---- - Support TCPv6 segmentation offload with super-segments larger than 64k bytes using the IPv6 Jumbogram extension header (AKA BIG TCP). - Generalize skb freeing deferral to per-cpu lists, instead of per-socket lists. - Add a netdev statistic for packets dropped due to L2 address mismatch (rx_otherhost_dropped). - Continue work annotating skb drop reasons. - Accept alternative netdev names (ALT_IFNAME) in more netlink requests. - Add VLAN support for AF_PACKET SOCK_RAW GSO. - Allow receiving skb mark from the socket as a cmsg. - Enable memcg accounting for veth queues, sysctl tables and IPv6. BPF --- - Add libbpf support for User Statically-Defined Tracing (USDTs). - Speed up symbol resolution for kprobes multi-link attachments. - Support storing typed pointers to referenced and unreferenced objects in BPF maps. - Add support for BPF link iterator. - Introduce access to remote CPU map elements in BPF per-cpu map. - Allow middle-of-the-road settings for the kernel.unprivileged_bpf_disabled sysctl. - Implement basic types of dynamic pointers e.g. to allow for dynamically sized ringbuf reservations without extra memory copies. Protocols --------- - Retire port only listening_hash table, add a second bind table hashed by port and address. Avoid linear list walk when binding to very popular ports (e.g. 443). - Add bridge FDB bulk flush filtering support allowing user space to remove all FDB entries matching a condition. - Introduce accept_unsolicited_na sysctl for IPv6 to implement router-side changes for RFC9131. - Support for MPTCP path manager in user space. - Add MPTCP support for fallback to regular TCP for connections that have never connected additional subflows or transmitted out-of-sequence data (partial support for RFC8684 fallback). - Avoid races in MPTCP-level window tracking, stabilize and improve throughput. - Support lockless operation of GRE tunnels with seq numbers enabled. - WiFi support for host based BSS color collision detection. - Add support for SO_TXTIME/SCM_TXTIME on CAN sockets. - Support transmission w/o flow control in CAN ISOTP (ISO 15765-2). - Support zero-copy Tx with TLS 1.2 crypto offload (sendfile). - Allow matching on the number of VLAN tags via tc-flower. - Add tracepoint for tcp_set_ca_state(). Driver API ---------- - Improve error reporting from classifier and action offload. - Add support for listing line cards in switches (devlink). - Add helpers for reporting page pool statistics with ethtool -S. - Add support for reading clock cycles when using PTP virtual clocks, instead of having the driver convert to time before reporting. This makes it possible to report time from different vclocks. - Support configuring low-latency Tx descriptor push via ethtool. - Separate Clause 22 and Clause 45 MDIO accesses more explicitly. New hardware / drivers ---------------------- - Ethernet: - Marvell's Octeon NIC PCI Endpoint support (octeon_ep) - Sunplus SP7021 SoC (sp7021_emac) - Add support for Renesas RZ/V2M (in ravb) - Add support for MediaTek mt7986 switches (in mtk_eth_soc) - Ethernet PHYs: - ADIN1100 industrial PHYs (w/ 10BASE-T1L and SQI reporting) - TI DP83TD510 PHY - Microchip LAN8742/LAN88xx PHYs - WiFi: - Driver for pureLiFi X, XL, XC devices (plfxlc) - Driver for Silicon Labs devices (wfx) - Support for WCN6750 (in ath11k) - Support Realtek 8852ce devices (in rtw89) - Mobile: - MediaTek T700 modems (Intel 5G 5000 M.2 cards) - CAN: - ctucanfd: add support for CTU CAN FD open-source IP core from Czech Technical University in Prague Drivers ------- - Delete a number of old drivers still using virt_to_bus(). - Ethernet NICs: - intel: support TSO on tunnels MPLS - broadcom: support multi-buffer XDP - nfp: support VF rate limiting - sfc: use hardware tx timestamps for more than PTP - mlx5: multi-port eswitch support - hyper-v: add support for XDP_REDIRECT - atlantic: XDP support (including multi-buffer) - macb: improve real-time perf by deferring Tx processing to NAPI - High-speed Ethernet switches: - mlxsw: implement basic line card information querying - prestera: add support for traffic policing on ingress and egress - Embedded Ethernet switches: - lan966x: add support for packet DMA (FDMA) - lan966x: add support for PTP programmable pins - ti: cpsw_new: enable bc/mc storm prevention - Qualcomm 802.11ax WiFi (ath11k): - Wake-on-WLAN support for QCA6390 and WCN6855 - device recovery (firmware restart) support - support setting Specific Absorption Rate (SAR) for WCN6855 - read country code from SMBIOS for WCN6855/QCA6390 - enable keep-alive during WoWLAN suspend - implement remain-on-channel support - MediaTek WiFi (mt76): - support Wireless Ethernet Dispatch offloading packet movement between the Ethernet switch and WiFi interfaces - non-standard VHT MCS10-11 support - mt7921 AP mode support - mt7921 IPv6 NS offload support - Ethernet PHYs: - micrel: ksz9031/ksz9131: cabletest support - lan87xx: SQI support for T1 PHYs - lan937x: add interrupt support for link detection Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmKNMPQACgkQMUZtbf5S IrsRARAAuDyYs6jFYB3p+xazZdOnbF4iAgVv71+DQGvmsCl6CB9OrsNZMlvE85OL Q3gjcRbgjrkN4lhgI8DmiGYbsUJnAvVjFdNjccz1Z/vTLYvuIM0ol54MUp5S+9WY StncOJkOGJxxR/Gi5gzVmejPDsysU3Jik+hm/fpIcz8pybXxAsFKU5waY5qfl+/T TZepfV0VCfqRDjqcF1qA5+jJZNU8pdodQlZ1+mh8bwu6Jk1ZkWkj6Ov8MWdwQldr LnPeK/9hIGzkdJYHZfajxA3t8D0K5CHzSuih2bJ9ry8ZXgVBkXEThew778/R5izW uB0YZs9COFlrIP7XHjtRTy/2xHOdYIPlj2nWhVdfuQDX8Crvt4VRN6EZ1rjko1ZJ WanfG6WHF8NH5pXBRQbh3kIMKBnYn6OIzuCfCQSqd+niHcxFIM4vRiggeXI5C5TW vJgEWfK6X+NfDiFVa3xyCrEmp5ieA/pNecpwd8rVkql+MtFAAw4vfsotLKOJEAru J/XL6UE+YuLqIJV9ACZ9x1AFXXAo661jOxBunOo4VXhXVzWS9lYYz5r5ryIkgT/8 /Fr0zjANJWgfIuNdIBtYfQ4qG+LozGq038VA06RhFUAZ5tF9DzhqJs2Q2AFuWWBC ewCePJVqo1j2Ceq2mGonXRt47OEnlePoOxTk9W+cKZb7ZWE+zEo= =Wjii -----END PGP SIGNATURE----- Merge tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core ---- - Support TCPv6 segmentation offload with super-segments larger than 64k bytes using the IPv6 Jumbogram extension header (AKA BIG TCP). - Generalize skb freeing deferral to per-cpu lists, instead of per-socket lists. - Add a netdev statistic for packets dropped due to L2 address mismatch (rx_otherhost_dropped). - Continue work annotating skb drop reasons. - Accept alternative netdev names (ALT_IFNAME) in more netlink requests. - Add VLAN support for AF_PACKET SOCK_RAW GSO. - Allow receiving skb mark from the socket as a cmsg. - Enable memcg accounting for veth queues, sysctl tables and IPv6. BPF --- - Add libbpf support for User Statically-Defined Tracing (USDTs). - Speed up symbol resolution for kprobes multi-link attachments. - Support storing typed pointers to referenced and unreferenced objects in BPF maps. - Add support for BPF link iterator. - Introduce access to remote CPU map elements in BPF per-cpu map. - Allow middle-of-the-road settings for the kernel.unprivileged_bpf_disabled sysctl. - Implement basic types of dynamic pointers e.g. to allow for dynamically sized ringbuf reservations without extra memory copies. Protocols --------- - Retire port only listening_hash table, add a second bind table hashed by port and address. Avoid linear list walk when binding to very popular ports (e.g. 443). - Add bridge FDB bulk flush filtering support allowing user space to remove all FDB entries matching a condition. - Introduce accept_unsolicited_na sysctl for IPv6 to implement router-side changes for RFC9131. - Support for MPTCP path manager in user space. - Add MPTCP support for fallback to regular TCP for connections that have never connected additional subflows or transmitted out-of-sequence data (partial support for RFC8684 fallback). - Avoid races in MPTCP-level window tracking, stabilize and improve throughput. - Support lockless operation of GRE tunnels with seq numbers enabled. - WiFi support for host based BSS color collision detection. - Add support for SO_TXTIME/SCM_TXTIME on CAN sockets. - Support transmission w/o flow control in CAN ISOTP (ISO 15765-2). - Support zero-copy Tx with TLS 1.2 crypto offload (sendfile). - Allow matching on the number of VLAN tags via tc-flower. - Add tracepoint for tcp_set_ca_state(). Driver API ---------- - Improve error reporting from classifier and action offload. - Add support for listing line cards in switches (devlink). - Add helpers for reporting page pool statistics with ethtool -S. - Add support for reading clock cycles when using PTP virtual clocks, instead of having the driver convert to time before reporting. This makes it possible to report time from different vclocks. - Support configuring low-latency Tx descriptor push via ethtool. - Separate Clause 22 and Clause 45 MDIO accesses more explicitly. New hardware / drivers ---------------------- - Ethernet: - Marvell's Octeon NIC PCI Endpoint support (octeon_ep) - Sunplus SP7021 SoC (sp7021_emac) - Add support for Renesas RZ/V2M (in ravb) - Add support for MediaTek mt7986 switches (in mtk_eth_soc) - Ethernet PHYs: - ADIN1100 industrial PHYs (w/ 10BASE-T1L and SQI reporting) - TI DP83TD510 PHY - Microchip LAN8742/LAN88xx PHYs - WiFi: - Driver for pureLiFi X, XL, XC devices (plfxlc) - Driver for Silicon Labs devices (wfx) - Support for WCN6750 (in ath11k) - Support Realtek 8852ce devices (in rtw89) - Mobile: - MediaTek T700 modems (Intel 5G 5000 M.2 cards) - CAN: - ctucanfd: add support for CTU CAN FD open-source IP core from Czech Technical University in Prague Drivers ------- - Delete a number of old drivers still using virt_to_bus(). - Ethernet NICs: - intel: support TSO on tunnels MPLS - broadcom: support multi-buffer XDP - nfp: support VF rate limiting - sfc: use hardware tx timestamps for more than PTP - mlx5: multi-port eswitch support - hyper-v: add support for XDP_REDIRECT - atlantic: XDP support (including multi-buffer) - macb: improve real-time perf by deferring Tx processing to NAPI - High-speed Ethernet switches: - mlxsw: implement basic line card information querying - prestera: add support for traffic policing on ingress and egress - Embedded Ethernet switches: - lan966x: add support for packet DMA (FDMA) - lan966x: add support for PTP programmable pins - ti: cpsw_new: enable bc/mc storm prevention - Qualcomm 802.11ax WiFi (ath11k): - Wake-on-WLAN support for QCA6390 and WCN6855 - device recovery (firmware restart) support - support setting Specific Absorption Rate (SAR) for WCN6855 - read country code from SMBIOS for WCN6855/QCA6390 - enable keep-alive during WoWLAN suspend - implement remain-on-channel support - MediaTek WiFi (mt76): - support Wireless Ethernet Dispatch offloading packet movement between the Ethernet switch and WiFi interfaces - non-standard VHT MCS10-11 support - mt7921 AP mode support - mt7921 IPv6 NS offload support - Ethernet PHYs: - micrel: ksz9031/ksz9131: cabletest support - lan87xx: SQI support for T1 PHYs - lan937x: add interrupt support for link detection" * tag 'net-next-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1809 commits) ptp: ocp: Add firmware header checks ptp: ocp: fix PPS source selector debugfs reporting ptp: ocp: add .init function for sma_op vector ptp: ocp: vectorize the sma accessor functions ptp: ocp: constify selectors ptp: ocp: parameterize input/output sma selectors ptp: ocp: revise firmware display ptp: ocp: add Celestica timecard PCI ids ptp: ocp: Remove #ifdefs around PCI IDs ptp: ocp: 32-bit fixups for pci start address Revert "net/smc: fix listen processing for SMC-Rv2" ath6kl: Use cc-disable-warning to disable -Wdangling-pointer selftests/bpf: Dynptr tests bpf: Add dynptr data slices bpf: Add bpf_dynptr_read and bpf_dynptr_write bpf: Dynptr support for ring buffers bpf: Add bpf_dynptr_from_mem for local dynptrs bpf: Add verifier support for dynptrs bpf: Suppress 'passing zero to PTR_ERR' warning bpf: Introduce bpf_arch_text_invalidate for bpf_prog_pack ... |
||
Robin Murphy
|
b4d6bb38f9 |
arm64: mte: Clean up user tag accessors
Invoking user_ldst to explicitly add a post-increment of 0 is silly. Just use a normal USER() annotation and save the redundant instruction. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Tong Tiangen <tongtiangen@huawei.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20220420030418.3189040-6-tongtiangen@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> |
||
Marc Zyngier
|
7d26b0516a |
arm64: Use WFxT for __delay() when possible
Marginally optimise __delay() by using a WFIT/WFET sequence. It probably is a win if no interrupt fires during the delay. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220419182755.601427-11-maz@kernel.org |
||
Xu Kuohai
|
30c90f6757 |
arm64, insn: Add ldr/str with immediate offset
This patch introduces ldr/str with immediate offset support to simplify the JIT implementation of BPF LDX/STX instructions on arm64. Although arm64 ldr/str immediate is available in pre-index, post-index and unsigned offset forms, the unsigned offset form is sufficient for BPF, so this patch only adds this type. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220321152852.2334294-2-xukuohai@huawei.com |
||
Linus Torvalds
|
93e220a62d |
Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu: "API: - hwrng core now credits for low-quality RNG devices. Algorithms: - Optimisations for neon aes on arm/arm64. - Add accelerated crc32_be on arm64. - Add ffdheXYZ(dh) templates. - Disallow hmac keys < 112 bits in FIPS mode. - Add AVX assembly implementation for sm3 on x86. Drivers: - Add missing local_bh_disable calls for crypto_engine callback. - Ensure BH is disabled in crypto_engine callback path. - Fix zero length DMA mappings in ccree. - Add synchronization between mailbox accesses in octeontx2. - Add Xilinx SHA3 driver. - Add support for the TDES IP available on sama7g5 SoC in atmel" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (137 commits) crypto: xilinx - Turn SHA into a tristate and allow COMPILE_TEST MAINTAINERS: update HPRE/SEC2/TRNG driver maintainers list crypto: dh - Remove the unused function dh_safe_prime_dh_alg() hwrng: nomadik - Change clk_disable to clk_disable_unprepare crypto: arm64 - cleanup comments crypto: qat - fix initialization of pfvf rts_map_msg structures crypto: qat - fix initialization of pfvf cap_msg structures crypto: qat - remove unneeded assignment crypto: qat - disable registration of algorithms crypto: hisilicon/qm - fix memset during queues clearing crypto: xilinx: prevent probing on non-xilinx hardware crypto: marvell/octeontx - Use swap() instead of open coding it crypto: ccree - Fix use after free in cc_cipher_exit() crypto: ccp - ccp_dmaengine_unregister release dma channels crypto: octeontx2 - fix missing unlock hwrng: cavium - fix NULL but dereferenced coccicheck error crypto: cavium/nitrox - don't cast parameter in bit operations crypto: vmx - add missing dependencies MAINTAINERS: Add maintainer for Xilinx ZynqMP SHA3 driver crypto: xilinx - Add Xilinx SHA3 driver ... |
||
Will Deacon
|
515e5da7b6 |
Merge branch 'for-next/strings' into for-next/core
* for-next/strings: Revert "arm64: Mitigate MTE issues with str{n}cmp()" arm64: lib: Import latest version of Arm Optimized Routines' strncmp arm64: lib: Import latest version of Arm Optimized Routines' strcmp |
||
Will Deacon
|
563c463595 |
Merge branch 'for-next/linkage' into for-next/core
* for-next/linkage: arm64: module: remove (NOLOAD) from linker script linkage: remove SYM_FUNC_{START,END}_ALIAS() x86: clean up symbol aliasing arm64: clean up symbol aliasing linkage: add SYM_FUNC_ALIAS{,_LOCAL,_WEAK}() |
||
Will Deacon
|
b7323ae691 |
Merge branch 'for-next/insn' into for-next/core
* for-next/insn: arm64: insn: add encoders for atomic operations arm64: move AARCH64_BREAK_FAULT into insn-def.h arm64: insn: Generate 64 bit mask immediates correctly |
||
Joey Gouly
|
e33c89256e |
Revert "arm64: Mitigate MTE issues with str{n}cmp()"
This reverts commit
|
||
Joey Gouly
|
387d828adf |
arm64: lib: Import latest version of Arm Optimized Routines' strncmp
Import the latest version of the Arm Optimized Routines strncmp function based on the upstream code of string/aarch64/strncmp.S at commit 189dfefe37d5 from: https://github.com/ARM-software/optimized-routines This latest version includes MTE support. Note that for simplicity Arm have chosen to contribute this code to Linux under GPLv2 rather than the original MIT OR Apache-2.0 WITH LLVM-exception license. Arm is the sole copyright holder for this code. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220301101435.19327-3-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org> |
||
Joey Gouly
|
507f788d05 |
arm64: lib: Import latest version of Arm Optimized Routines' strcmp
Import the latest version of the Arm Optimized Routines strcmp function based on the upstream code of string/aarch64/strcmp.S at commit 189dfefe37d5 from: https://github.com/ARM-software/optimized-routines This latest version includes MTE support. Note that for simplicity Arm have chosen to contribute this code to Linux under GPLv2 rather than the original MIT OR Apache-2.0 WITH LLVM-exception license. Arm is the sole copyright holder for this code. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220301101435.19327-2-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org> |
||
Hou Tao
|
fa1114d9eb |
arm64: insn: add encoders for atomic operations
It is a preparation patch for eBPF atomic supports under arm64. eBPF needs support atomic[64]_fetch_add, atomic[64]_[fetch_]{and,or,xor} and atomic[64]_{xchg|cmpxchg}. The ordering semantics of eBPF atomics are the same with the implementations in linux kernel. Add three helpers to support LDCLR/LDEOR/LDSET/SWP, CAS and DMB instructions. STADD/STCLR/STEOR/STSET are simply encoded as aliases for LDADD/LDCLR/LDEOR/LDSET with XZR as the destination register, so no extra helper is added. atomic_fetch_add() and other atomic ops needs support for STLXR instruction, so extend enum aarch64_insn_ldst_type to do that. LDADD/LDEOR/LDSET/SWP and CAS instructions are only available when LSE atomics is enabled, so just return AARCH64_BREAK_FAULT directly in these newly-added helpers if CONFIG_ARM64_LSE_ATOMICS is disabled. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20220217072232.1186625-3-houtao1@huawei.com Signed-off-by: Will Deacon <will@kernel.org> |
||
Mark Rutland
|
0f61f6be1f |
arm64: clean up symbol aliasing
Now that we have SYM_FUNC_ALIAS() and SYM_FUNC_ALIAS_WEAK(), use those to simplify and more consistently define function aliases across arch/arm64. Aliases are now defined in terms of a canonical function name. For position-independent functions I've made the __pi_<func> name the canonical name, and defined other alises in terms of this. The SYM_FUNC_{START,END}_PI(func) macros obscure the __pi_<func> name, and make this hard to seatch for. The SYM_FUNC_START_WEAK_PI() macro also obscures the fact that the __pi_<func> fymbol is global and the <func> symbol is weak. For clarity, I have removed these macros and used SYM_FUNC_{START,END}() directly with the __pi_<func> name. For example: SYM_FUNC_START_WEAK_PI(func) ... asm insns ... SYM_FUNC_END_PI(func) EXPORT_SYMBOL(func) ... becomes: SYM_FUNC_START(__pi_func) ... asm insns ... SYM_FUNC_END(__pi_func) SYM_FUNC_ALIAS_WEAK(func, __pi_func) EXPORT_SYMBOL(func) For clarity, where there are multiple annotations such as EXPORT_SYMBOL(), I've tried to keep annotations grouped by symbol. For example, where a function has a name and an alias which are both exported, this is organised as: SYM_FUNC_START(func) ... asm insns ... SYM_FUNC_END(func) EXPORT_SYMBOL(func) SYM_FUNC_ALIAS(alias, func) EXPORT_SYMBOL(alias) For consistency with the other string functions, I've defined strrchr as a position-independent function, as it can safely be used as such even though we have no users today. As we no longer use SYM_FUNC_{START,END}_ALIAS(), our local copies are removed. The common versions will be removed by a subsequent patch. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Mark Brown <broonie@kernel.org> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220216162229.1076788-3-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> |
||
Catalin Marinas
|
ab1e435ca7 |
arm64: mte: Define the number of bytes for storing the tags in a page
Rather than explicitly calculating the number of bytes for a compact tag storage format corresponding to a page, just add a MTE_PAGE_TAG_STORAGE macro. With the current MTE implementation of 4 bits per tag, we store 2 tags in a byte. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Luis Machado <luis.machado@linaro.org> Link: https://lore.kernel.org/r/20220131165456.2160675-4-catalin.marinas@arm.com Signed-off-by: Will Deacon <will@kernel.org> |
||
James Morse
|
a6aab01882 |
arm64: insn: Generate 64 bit mask immediates correctly
When the insn framework is used to encode an AND/ORR/EOR instruction, aarch64_encode_immediate() is used to pick the immr imms values. If the immediate is a 64bit mask, with bit 63 set, and zeros in any of the upper 32 bits, the immr value is incorrectly calculated meaning the wrong mask is generated. For example, 0x8000000000000001 should have an immr of 1, but 32 is used, meaning the resulting mask is 0x0000000300000000. It would appear eBPF is unable to hit these cases, as build_insn()'s imm value is a s32, so when used with BPF_ALU64, the sign-extended u64 immediate would always have all-1s or all-0s in the upper 32 bits. KVM does not generate a va_mask with any of the top bits set as these VA wouldn't be usable with TTBR0_EL2. This happens because the rotation is calculated from fls(~imm), which takes an unsigned int, but the immediate may be 64bit. Use fls64() so the 64bit mask doesn't get truncated to a u32. Signed-off-by: James Morse <james.morse@arm.com> Brown-paper-bag-for: Marc Zyngier <maz@kernel.org> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20220127162127.2391947-4-james.morse@arm.com Signed-off-by: Will Deacon <will@kernel.org> |
||
Ard Biesheuvel
|
297565aa22 |
lib/xor: make xor prototypes more friendly to compiler vectorization
Modern compilers are perfectly capable of extracting parallelism from the XOR routines, provided that the prototypes reflect the nature of the input accurately, in particular, the fact that the input vectors are expected not to overlap. This is not documented explicitly, but is implied by the interchangeability of the various C routines, some of which use temporary variables while others don't: this means that these routines only behave identically for non-overlapping inputs. So let's decorate these input vectors with the __restrict modifier, which informs the compiler that there is no overlap. While at it, make the input-only vectors pointer-to-const as well. Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://github.com/ClangBuiltLinux/linux/issues/563 Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> |
||
Kevin Bracey
|
5f2f5eaa3e |
arm64: lib: accelerate crc32_be
It makes no sense to leave crc32_be using the generic code while we only accelerate the little-endian ops. Even though the big-endian form doesn't fit as smoothly into the arm64, we can speed it up and avoid hitting the D cache. Tested on Cortex-A53. Without acceleration: crc32: CRC_LE_BITS = 64, CRC_BE BITS = 64 crc32: self tests passed, processed 225944 bytes in 192240 nsec crc32c: CRC_LE_BITS = 64 crc32c: self tests passed, processed 112972 bytes in 21360 nsec With acceleration: crc32: CRC_LE_BITS = 64, CRC_BE BITS = 64 crc32: self tests passed, processed 225944 bytes in 53480 nsec crc32c: CRC_LE_BITS = 64 crc32c: self tests passed, processed 112972 bytes in 21480 nsec Signed-off-by: Kevin Bracey <kevin@bracey.fi> Tested-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> |
||
Catalin Marinas
|
945409a6ef |
Merge branches 'for-next/misc', 'for-next/cache-ops-dzp', 'for-next/stacktrace', 'for-next/xor-neon', 'for-next/kasan', 'for-next/armv8_7-fp', 'for-next/atomics', 'for-next/bti', 'for-next/sve', 'for-next/kselftest' and 'for-next/kcsan', remote-tracking branch 'arm64/for-next/perf' into for-next/core
* arm64/for-next/perf: (32 commits)
arm64: perf: Don't register user access sysctl handler multiple times
drivers: perf: marvell_cn10k: fix an IS_ERR() vs NULL check
perf/smmuv3: Fix unused variable warning when CONFIG_OF=n
arm64: perf: Support new DT compatibles
arm64: perf: Simplify registration boilerplate
arm64: perf: Support Denver and Carmel PMUs
drivers/perf: hisi: Add driver for HiSilicon PCIe PMU
docs: perf: Add description for HiSilicon PCIe PMU driver
dt-bindings: perf: Add YAML schemas for Marvell CN10K LLC-TAD pmu bindings
drivers: perf: Add LLC-TAD perf counter support
perf/smmuv3: Synthesize IIDR from CoreSight ID registers
perf/smmuv3: Add devicetree support
dt-bindings: Add Arm SMMUv3 PMCG binding
perf/arm-cmn: Add debugfs topology info
perf/arm-cmn: Add CI-700 Support
dt-bindings: perf: arm-cmn: Add CI-700
perf/arm-cmn: Support new IP features
perf/arm-cmn: Demarcate CMN-600 specifics
perf/arm-cmn: Move group validation data off-stack
perf/arm-cmn: Optimise DTC counter accesses
...
* for-next/misc:
: Miscellaneous patches
arm64: Use correct method to calculate nomap region boundaries
arm64: Drop outdated links in comments
arm64: errata: Fix exec handling in erratum
|
||
Mark Brown
|
742a15b1a2 |
arm64: Use BTI C directly and unconditionally
Now we have a macro for BTI C that looks like a regular instruction change all the users of the current BTI_C macro to just emit a BTI C directly and remove the macro. This does mean that we now unconditionally BTI annotate all assembly functions, meaning that they are worse in this respect than code generated by the compiler. The overhead should be minimal for implementations with a reasonable HINT implementation. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20211214152714.2380849-4-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> |
||
Ard Biesheuvel
|
2c54b423cf |
arm64/xor: use EOR3 instructions when available
Use the EOR3 instruction to implement xor_blocks() if the instruction is available, which is the case if the CPU implements the SHA-3 extension. This is about 20% faster on Apple M1 when using the 5-way version. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20211213140252.2856053-1-ardb@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> |
||
Reiji Watanabe
|
685e2564da |
arm64: mte: DC {GVA,GZVA} shouldn't be used when DCZID_EL0.DZP == 1
Currently, mte_set_mem_tag_range() and mte_zero_clear_page_tags() use DC {GVA,GZVA} unconditionally. But, they should make sure that DCZID_EL0.DZP, which indicates whether or not use of those instructions is prohibited, is zero when using those instructions. Use ST{G,ZG,Z2G} instead when DCZID_EL0.DZP == 1. Fixes: |
||
Reiji Watanabe
|
f0616abd4e |
arm64: clear_page() shouldn't use DC ZVA when DCZID_EL0.DZP == 1
Currently, clear_page() uses DC ZVA instruction unconditionally. But it
should make sure that DCZID_EL0.DZP, which indicates whether or not use
of DC ZVA instruction is prohibited, is zero when using the instruction.
Use STNP instead when DCZID_EL0.DZP == 1.
Fixes:
|
||
Linus Torvalds
|
1e9ed9360f |
Kbuild updates for v5.16
- Remove the global -isystem compiler flag, which was made possible by the introduction of <linux/stdarg.h> - Improve the Kconfig help to print the location in the top menu level - Fix "FORCE prerequisite is missing" build warning for sparc - Add new build targets, tarzst-pkg and perf-tarzst-src-pkg, which generate a zstd-compressed tarball - Prevent gen_init_cpio tool from generating a corrupted cpio when KBUILD_BUILD_TIMESTAMP is set to 2106-02-07 or later - Misc cleanups -----BEGIN PGP SIGNATURE----- iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmGGkysVHG1hc2FoaXJv eUBrZXJuZWwub3JnAAoJED2LAQed4NsGgZkQAIX4i9Tt6pyl/2xGDGkzUqjprfoH QUIo1DoUclLUygoakrrrX3EnZLWrslgPTKjQxdiV6RA6xHfe4cYgNTSq8zM9lsPT lu+B4nEDqoXQ5gyLxMlnjS3FRQTNYIeBZEhSAIiW8TENdLKlKc+NYdoj7th50dO0 SkXRa2dpWHa6t7ZRqHIHMpUWA7gm0w22ZbgQmyUv1CDGO4IHPLqe2b2PMsrzhSZ1 yypP1l6aQVKuP0hN9aytbTRqDxUd0uOzBf00PK5zx23hjdwZ9wmZrFTKDf9fAu/+ nR7gBsa5YoYNQh3UkayZXjR5dClmgsCXZ25OXI7YucQp/8OJ5fadfn1NFpJHsw56 n5cckbHIXgnFUcel5YlkR6qTHjpzdr9vHm90MmiuX99b3oy9czl6pY3qkNfRkllQ v7ME5L1qlw3P3ia1KA+H4zW/LIJ8p5cbKBwaY22m3kY3bTx7PiOfMlep4UVqxXSb 0/OqxSsmYg5LlmwEQ0SSsx45hE0o9nG/cdjkHu1jUOUHxYfpt1T4MTILeGUwmjzd TydJym5MZyXBawu4NVB3QLoKm5Jt2BXtyaWOtq74VSrs77roNCdYuQWJ+1aBf2Pg 0s4CVC2cC7KlxJDImoqswZATGXPMfbiVDcuVSSukYRgBMeCBPUzRhB8YP36BZyD3 9vFYmqSujtUU7nWb =ATFN -----END PGP SIGNATURE----- Merge tag 'kbuild-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Remove the global -isystem compiler flag, which was made possible by the introduction of <linux/stdarg.h> - Improve the Kconfig help to print the location in the top menu level - Fix "FORCE prerequisite is missing" build warning for sparc - Add new build targets, tarzst-pkg and perf-tarzst-src-pkg, which generate a zstd-compressed tarball - Prevent gen_init_cpio tool from generating a corrupted cpio when KBUILD_BUILD_TIMESTAMP is set to 2106-02-07 or later - Misc cleanups * tag 'kbuild-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (28 commits) kbuild: use more subdir- for visiting subdirectories while cleaning sh: remove meaningless archclean line initramfs: Check timestamp to prevent broken cpio archive kbuild: split DEBUG_CFLAGS out to scripts/Makefile.debug gen_init_cpio: add static const qualifiers kbuild: Add make tarzst-pkg build option scripts: update the comments of kallsyms support sparc: Add missing "FORCE" target when using if_changed kconfig: refactor conf_touch_dep() kconfig: refactor conf_write_dep() kconfig: refactor conf_write_autoconf() kconfig: add conf_get_autoheader_name() kconfig: move sym_escape_string_value() to confdata.c kconfig: refactor listnewconfig code kconfig: refactor conf_write_symbol() kconfig: refactor conf_write_heading() kconfig: remove 'const' from the return type of sym_escape_string_value() kconfig: rename a variable in the lexer to a clearer name kconfig: narrow the scope of variables in the lexer kconfig: Create links to main menu items in search ... |
||
Mark Rutland
|
819771cc28 |
arm64: extable: consolidate definitions
In subsequent patches we'll alter the structure and usage of struct exception_table_entry. For inline assembly, we create these using the `_ASM_EXTABLE()` CPP macro defined in <asm/uaccess.h>, and for plain assembly code we use the `_asm_extable()` GAS macro defined in <asm/assembler.h>, which are largely identical save for different escaping and stringification requirements. This patch moves the common definitions to a new <asm/asm-extable.h> header, so that it's easier to keep the two in-sync, and to remove the implication that these are only used for uaccess helpers (as e.g. load_unaligned_zeropad() is only used on kernel memory, and depends upon `_ASM_EXTABLE()`. At the same time, a few minor modifications are made for clarity and in preparation for subsequent patches: * The structure creation is factored out into an `__ASM_EXTABLE_RAW()` macro. This will make it easier to support different fixup variants in subsequent patches without needing to update all users of `_ASM_EXTABLE()`, and makes it easier to see tha the CPP and GAS variants of the macros are structurally identical. For the CPP macro, the stringification of fields is left to the wrapper macro, `_ASM_EXTABLE()`, as in subsequent patches it will be necessary to stringify fields in wrapper macros to safely concatenate strings which cannot be token-pasted together in CPP. * The fields of the structure are created separately on their own lines. This will make it easier to add/remove/modify individual fields clearly. * Additional parentheses are added around the use of macro arguments in field definitions to avoid any potential problems with evaluation due to operator precedence, and to make errors upon misuse clearer. * USER() is moved into <asm/asm-uaccess.h>, as it is not required by all assembly code, and is already refered to by comments in that file. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211019160219.5202-8-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> |
||
Mark Rutland
|
139f9ab73d |
arm64: lib: __arch_copy_to_user(): fold fixups into body
Like other functions, __arch_copy_to_user() places its exception fixups in the `.fixup` section without any clear association with __arch_copy_to_user() itself. If we backtrace the fixup code, it will be symbolized as an offset from the nearest prior symbol, which happens to be `__entry_tramp_text_end`. Further, since the PC adjustment for the fixup is akin to a direct branch rather than a function call, __arch_copy_to_user() itself will be missing from the backtrace. This is confusing and hinders debugging. In general this pattern will also be problematic for CONFIG_LIVEPATCH, since fixups often return to their associated function, but this isn't accurately captured in the stacktrace. To solve these issues for assembly functions, we must move fixups into the body of the functions themselves, after the usual fast-path returns. This patch does so for __arch_copy_to_user(). Inline assembly will be dealt with in subsequent patches. Other than the improved backtracing, there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211019160219.5202-4-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> |
||
Mark Rutland
|
4012e0e227 |
arm64: lib: __arch_copy_from_user(): fold fixups into body
Like other functions, __arch_copy_from_user() places its exception fixups in the `.fixup` section without any clear association with __arch_copy_from_user() itself. If we backtrace the fixup code, it will be symbolized as an offset from the nearest prior symbol, which happens to be `__entry_tramp_text_end`. Further, since the PC adjustment for the fixup is akin to a direct branch rather than a function call, __arch_copy_from_user() itself will be missing from the backtrace. This is confusing and hinders debugging. In general this pattern will also be problematic for CONFIG_LIVEPATCH, since fixups often return to their associated function, but this isn't accurately captured in the stacktrace. To solve these issues for assembly functions, we must move fixups into the body of the functions themselves, after the usual fast-path returns. This patch does so for __arch_copy_from_user(). Inline assembly will be dealt with in subsequent patches. Other than the improved backtracing, there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211019160219.5202-3-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> |
||
Mark Rutland
|
35d67794b8 |
arm64: lib: __arch_clear_user(): fold fixups into body
Like other functions, __arch_clear_user() places its exception fixups in the `.fixup` section without any clear association with __arch_clear_user() itself. If we backtrace the fixup code, it will be symbolized as an offset from the nearest prior symbol, which happens to be `__entry_tramp_text_end`. Further, since the PC adjustment for the fixup is akin to a direct branch rather than a function call, __arch_clear_user() itself will be missing from the backtrace. This is confusing and hinders debugging. In general this pattern will also be problematic for CONFIG_LIVEPATCH, since fixups often return to their associated function, but this isn't accurately captured in the stacktrace. To solve these issues for assembly functions, we must move fixups into the body of the functions themselves, after the usual fast-path returns. This patch does so for __arch_clear_user(). Inline assembly will be dealt with in subsequent patches. Other than the improved backtracing, there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20211019160219.5202-2-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> |
||
Alexey Dobriyan
|
04e85bbf71 |
isystem: delete global -isystem compile option
Further isolate kernel from userspace, prevent accidental inclusion of undesireable headers, mainly float.h and stdatomic.h. nds32 keeps -isystem globally due to intrinsics used in entrenched header. -isystem is selectively reenabled for some files, again, for intrinsics. Compile tested on: hexagon-defconfig hexagon-allmodconfig alpha-allmodconfig alpha-allnoconfig alpha-defconfig arm64-allmodconfig arm64-allnoconfig arm64-defconfig arm-am200epdkit arm-aspeed_g4 arm-aspeed_g5 arm-assabet arm-at91_dt arm-axm55xx arm-badge4 arm-bcm2835 arm-cerfcube arm-clps711x arm-cm_x300 arm-cns3420vb arm-colibri_pxa270 arm-colibri_pxa300 arm-collie arm-corgi arm-davinci_all arm-dove arm-ep93xx arm-eseries_pxa arm-exynos arm-ezx arm-footbridge arm-gemini arm-h3600 arm-h5000 arm-hackkit arm-hisi arm-imote2 arm-imx_v4_v5 arm-imx_v6_v7 arm-integrator arm-iop32x arm-ixp4xx arm-jornada720 arm-keystone arm-lart arm-lpc18xx arm-lpc32xx arm-lpd270 arm-lubbock arm-magician arm-mainstone arm-milbeaut_m10v arm-mini2440 arm-mmp2 arm-moxart arm-mps2 arm-multi_v4t arm-multi_v5 arm-multi_v7 arm-mv78xx0 arm-mvebu_v5 arm-mvebu_v7 arm-mxs arm-neponset arm-netwinder arm-nhk8815 arm-omap1 arm-omap2plus arm-orion5x arm-oxnas_v6 arm-palmz72 arm-pcm027 arm-pleb arm-pxa arm-pxa168 arm-pxa255-idp arm-pxa3xx arm-pxa910 arm-qcom arm-realview arm-rpc arm-s3c2410 arm-s3c6400 arm-s5pv210 arm-sama5 arm-shannon arm-shmobile arm-simpad arm-socfpga arm-spear13xx arm-spear3xx arm-spear6xx arm-spitz arm-stm32 arm-sunxi arm-tct_hammer arm-tegra arm-trizeps4 arm-u8500 arm-versatile arm-vexpress arm-vf610m4 arm-viper arm-vt8500_v6_v7 arm-xcep arm-zeus csky-allmodconfig csky-allnoconfig csky-defconfig h8300-edosk2674 h8300-h8300h-sim h8300-h8s-sim i386-allmodconfig i386-allnoconfig i386-defconfig ia64-allmodconfig ia64-allnoconfig ia64-bigsur ia64-generic ia64-gensparse ia64-tiger ia64-zx1 m68k-amcore m68k-amiga m68k-apollo m68k-atari m68k-bvme6000 m68k-hp300 m68k-m5208evb m68k-m5249evb m68k-m5272c3 m68k-m5275evb m68k-m5307c3 m68k-m5407c3 m68k-m5475evb m68k-mac m68k-multi m68k-mvme147 m68k-mvme16x m68k-q40 m68k-stmark2 m68k-sun3 m68k-sun3x microblaze-allmodconfig microblaze-allnoconfig microblaze-mmu mips-ar7 mips-ath25 mips-ath79 mips-bcm47xx mips-bcm63xx mips-bigsur mips-bmips_be mips-bmips_stb mips-capcella mips-cavium_octeon mips-ci20 mips-cobalt mips-cu1000-neo mips-cu1830-neo mips-db1xxx mips-decstation mips-decstation_64 mips-decstation_r4k mips-e55 mips-fuloong2e mips-gcw0 mips-generic mips-gpr mips-ip22 mips-ip27 mips-ip28 mips-ip32 mips-jazz mips-jmr3927 mips-lemote2f mips-loongson1b mips-loongson1c mips-loongson2k mips-loongson3 mips-malta mips-maltaaprp mips-malta_kvm mips-malta_qemu_32r6 mips-maltasmvp mips-maltasmvp_eva mips-maltaup mips-maltaup_xpa mips-mpc30x mips-mtx1 mips-nlm_xlp mips-nlm_xlr mips-omega2p mips-pic32mzda mips-pistachio mips-qi_lb60 mips-rb532 mips-rbtx49xx mips-rm200 mips-rs90 mips-rt305x mips-sb1250_swarm mips-tb0219 mips-tb0226 mips-tb0287 mips-vocore2 mips-workpad mips-xway nds32-allmodconfig nds32-allnoconfig nds32-defconfig nios2-10m50 nios2-3c120 nios2-allmodconfig nios2-allnoconfig openrisc-allmodconfig openrisc-allnoconfig openrisc-or1klitex openrisc-or1ksim openrisc-simple_smp parisc-allnoconfig parisc-generic-32bit parisc-generic-64bit powerpc-acadia powerpc-adder875 powerpc-akebono powerpc-amigaone powerpc-arches powerpc-asp8347 powerpc-bamboo powerpc-bluestone powerpc-canyonlands powerpc-cell powerpc-chrp32 powerpc-cm5200 powerpc-currituck powerpc-ebony powerpc-eiger powerpc-ep8248e powerpc-ep88xc powerpc-fsp2 powerpc-g5 powerpc-gamecube powerpc-ge_imp3a powerpc-holly powerpc-icon powerpc-iss476-smp powerpc-katmai powerpc-kilauea powerpc-klondike powerpc-kmeter1 powerpc-ksi8560 powerpc-linkstation powerpc-lite5200b powerpc-makalu powerpc-maple powerpc-mgcoge powerpc-microwatt powerpc-motionpro powerpc-mpc512x powerpc-mpc5200 powerpc-mpc7448_hpc2 powerpc-mpc8272_ads powerpc-mpc8313_rdb powerpc-mpc8315_rdb powerpc-mpc832x_mds powerpc-mpc832x_rdb powerpc-mpc834x_itx powerpc-mpc834x_itxgp powerpc-mpc834x_mds powerpc-mpc836x_mds powerpc-mpc836x_rdk powerpc-mpc837x_mds powerpc-mpc837x_rdb powerpc-mpc83xx powerpc-mpc8540_ads powerpc-mpc8560_ads powerpc-mpc85xx_cds powerpc-mpc866_ads powerpc-mpc885_ads powerpc-mvme5100 powerpc-obs600 powerpc-pasemi powerpc-pcm030 powerpc-pmac32 powerpc-powernv powerpc-ppa8548 powerpc-ppc40x powerpc-ppc44x powerpc-ppc64 powerpc-ppc64e powerpc-ppc6xx powerpc-pq2fads powerpc-ps3 powerpc-pseries powerpc-rainier powerpc-redwood powerpc-sam440ep powerpc-sbc8548 powerpc-sequoia powerpc-skiroot powerpc-socrates powerpc-storcenter powerpc-stx_gp3 powerpc-taishan powerpc-tqm5200 powerpc-tqm8540 powerpc-tqm8541 powerpc-tqm8548 powerpc-tqm8555 powerpc-tqm8560 powerpc-tqm8xx powerpc-walnut powerpc-warp powerpc-wii powerpc-xes_mpc85xx riscv-allmodconfig riscv-allnoconfig riscv-nommu_k210 riscv-nommu_k210_sdcard riscv-nommu_virt riscv-rv32 s390-allmodconfig s390-allnoconfig s390-debug s390-zfcpdump sh-ap325rxa sh-apsh4a3a sh-apsh4ad0a sh-dreamcast sh-ecovec24 sh-ecovec24-romimage sh-edosk7705 sh-edosk7760 sh-espt sh-hp6xx sh-j2 sh-kfr2r09 sh-kfr2r09-romimage sh-landisk sh-lboxre2 sh-magicpanelr2 sh-microdev sh-migor sh-polaris sh-r7780mp sh-r7785rp sh-rsk7201 sh-rsk7203 sh-rsk7264 sh-rsk7269 sh-rts7751r2d1 sh-rts7751r2dplus sh-sdk7780 sh-sdk7786 sh-se7206 sh-se7343 sh-se7619 sh-se7705 sh-se7712 sh-se7721 sh-se7722 sh-se7724 sh-se7750 sh-se7751 sh-se7780 sh-secureedge5410 sh-sh03 sh-sh2007 sh-sh7710voipgw sh-sh7724_generic sh-sh7757lcr sh-sh7763rdp sh-sh7770_generic sh-sh7785lcr sh-sh7785lcr_32bit sh-shmin sh-shx3 sh-titan sh-ul2 sh-urquell sparc-allmodconfig sparc-allnoconfig sparc-sparc32 sparc-sparc64 um-i386-allmodconfig um-i386-allnoconfig um-i386-defconfig um-x86_64-allmodconfig um-x86_64-allnoconfig x86_64-allmodconfig x86_64-allnoconfig x86_64-defconfig xtensa-allmodconfig xtensa-allnoconfig xtensa-audio_kc705 xtensa-cadence_csp xtensa-common xtensa-generic_kc705 xtensa-iss xtensa-nommu_kc705 xtensa-smp_lx200 xtensa-virt xtensa-xip_kc705 Tested-by: Nathan Chancellor <nathan@kernel.org> # build (hexagon) Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> |
||
Robin Murphy
|
59a68d4138 |
arm64: Mitigate MTE issues with str{n}cmp()
As with strlen(), the patches importing the updated str{n}cmp() implementations were originally developed and tested before the advent of CONFIG_KASAN_HW_TAGS, and have subsequently revealed not to be MTE-safe. Since in-kernel MTE is still a rather niche case, let it temporarily fall back to the generic C versions for correctness until we can figure out the best fix. Fixes: |
||
Arnd Bergmann
|
a7a08b275a |
arch: remove compat_alloc_user_space
All users of compat_alloc_user_space() and copy_in_user() have been removed from the kernel, only a few functions in sparc remain that can be changed to calling arch_copy_in_user() instead. Link: https://lkml.kernel.org/r/20210727144859.4150043-7-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Feng Tang <feng.tang@intel.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> |
||
Jason Wang
|
2806556c5e |
arm64: use __func__ to get function name in pr_err
Prefer using '"%s...", __func__' to get current function's name in a debug message. Signed-off-by: Jason Wang <wangborong@cdjrlc.com> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210726122907.51529-1-wangborong@cdjrlc.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> |
||
Robin Murphy
|
295cf15623 |
arm64: Avoid premature usercopy failure
Al reminds us that the usercopy API must only return complete failure if absolutely nothing could be copied. Currently, if userspace does something silly like giving us an unaligned pointer to Device memory, or a size which overruns MTE tag bounds, we may fail to honour that requirement when faulting on a multi-byte access even though a smaller access could have succeeded. Add a mitigation to the fixup routines to fall back to a single-byte copy if we faulted on a larger access before anything has been written to the destination, to guarantee making *some* forward progress. We needn't be too concerned about the overall performance since this should only occur when callers are doing something a bit dodgy in the first place. Particularly broken userspace might still be able to trick generic_perform_write() into an infinite loop by targeting write() at an mmap() of some read-only device register where the fault-in load succeeds but any store synchronously aborts such that copy_to_user() is genuinely unable to make progress, but, well, don't do that... CC: stable@vger.kernel.org Reported-by: Chen Huang <chenhuang5@huawei.com> Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/dc03d5c675731a1f24a62417dba5429ad744234e.1626098433.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org> |
||
Mark Rutland
|
5f34b1eb2f |
arm64: fix strlen() with CONFIG_KASAN_HW_TAGS
When the kernel is built with CONFIG_KASAN_HW_TAGS and the CPU supports
MTE, memory accesses are checked at 16-byte granularity, and
out-of-bounds accesses can result in tag check faults. Our current
implementation of strlen() makes unaligned 16-byte accesses (within a
naturally aligned 4096-byte window), and can trigger tag check faults.
This can be seen at boot time, e.g.
| BUG: KASAN: invalid-access in __pi_strlen+0x14/0x150
| Read at addr f4ff0000c0028300 by task swapper/0/0
| Pointer tag: [f4], memory tag: [fe]
|
| CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.13.0-09550-g03c2813535a2-dirty #20
| Hardware name: linux,dummy-virt (DT)
| Call trace:
| dump_backtrace+0x0/0x1b0
| show_stack+0x1c/0x30
| dump_stack_lvl+0x68/0x84
| print_address_description+0x7c/0x2b4
| kasan_report+0x138/0x38c
| __do_kernel_fault+0x190/0x1c4
| do_tag_check_fault+0x78/0x90
| do_mem_abort+0x44/0xb4
| el1_abort+0x40/0x60
| el1h_64_sync_handler+0xb0/0xd0
| el1h_64_sync+0x78/0x7c
| __pi_strlen+0x14/0x150
| __register_sysctl_table+0x7c4/0x890
| register_leaf_sysctl_tables+0x1a4/0x210
| register_leaf_sysctl_tables+0xc8/0x210
| __register_sysctl_paths+0x22c/0x290
| register_sysctl_table+0x2c/0x40
| sysctl_init+0x20/0x30
| proc_sys_init+0x3c/0x48
| proc_root_init+0x80/0x9c
| start_kernel+0x640/0x69c
| __primary_switched+0xc0/0xc8
To fix this, we can reduce the (strlen-internal) MIN_PAGE_SIZE to 16
bytes when CONFIG_KASAN_HW_TAGS is selected. This will cause strlen() to
align the base pointer downwards to a 16-byte boundary, and to discard
the additional prefix bytes without counting them. All subsequent
accesses will be 16-byte aligned 16-byte LDPs. While the comments say
the body of the loop will access 32 bytes, this is performed as two
16-byte acceses, with the second made only if the first did not
encounter a NUL byte, so the body of the loop will not over-read across
a 16-byte boundary.
No other string routines are affected. The other str*() routines will
not make any access which straddles a 16-byte boundary, and the mem*()
routines will only make acceses which straddle a 16-byte boundary when
which is entirely within the bounds of the relevant base and size
arguments.
Fixes:
|
||
Will Deacon
|
fdceddb06a |
Merge branch 'for-next/mte' into for-next/core
KASAN optimisations for the hardware tagging (MTE) implementation. * for-next/mte: kasan: disable freed user page poisoning with HW tags arm64: mte: handle tags zeroing at page allocation time kasan: use separate (un)poison implementation for integrated init mm: arch: remove indirection level in alloc_zeroed_user_highpage_movable() kasan: speed up mte_set_mem_tag_range |
||
Will Deacon
|
2c9bd9d806 |
Merge branch 'for-next/kasan' into for-next/core
Optimise out-of-line KASAN checking when using software tagging. * for-next/kasan: kasan: arm64: support specialized outlined tag mismatch checks |
||
Will Deacon
|
181a126979 |
Merge branch 'for-next/insn' into for-next/core
Refactoring of our instruction decoding routines and addition of some missing encodings. * for-next/insn: arm64: insn: avoid circular include dependency arm64: insn: move AARCH64_INSN_SIZE into <asm/insn.h> arm64: insn: decouple patching from insn code arm64: insn: Add load/store decoding helpers arm64: insn: Add some opcodes to instruction decoder arm64: insn: Add barrier encodings arm64: insn: Add SVE instruction class arm64: Move instruction encoder/decoder under lib/ arm64: Move aarch32 condition check functions arm64: Move patching utilities out of instruction encoding/decoding |
||
Will Deacon
|
5ceb045541 |
Merge branch 'for-next/cortex-strings' into for-next/core
Update our kernel string routines to the latest Cortex Strings implementation. * for-next/cortex-strings: arm64: update string routine copyrights and URLs arm64: Rewrite __arch_clear_user() arm64: Better optimised memchr() arm64: Import latest memcpy()/memmove() implementation arm64: Add assembly annotations for weak-PI-alias madness arm64: Import latest version of Cortex Strings' strncmp arm64: Import updated version of Cortex Strings' strlen arm64: Import latest version of Cortex Strings' strcmp arm64: Import latest version of Cortex Strings' memcmp |
||
Peter Collingbourne
|
013bb59dbb |
arm64: mte: handle tags zeroing at page allocation time
Currently, on an anonymous page fault, the kernel allocates a zeroed page and maps it in user space. If the mapping is tagged (PROT_MTE), set_pte_at() additionally clears the tags. It is, however, more efficient to clear the tags at the same time as zeroing the data on allocation. To avoid clearing the tags on any page (which may not be mapped as tagged), only do this if the vma flags contain VM_MTE. This requires introducing a new GFP flag that is used to determine whether to clear the tags. The DC GZVA instruction with a 0 top byte (and 0 tag) requires top-byte-ignore. Set the TCR_EL1.{TBI1,TBID1} bits irrespective of whether KASAN_HW is enabled. Signed-off-by: Peter Collingbourne <pcc@google.com> Co-developed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://linux-review.googlesource.com/id/Id46dc94e30fe11474f7e54f5d65e7658dbdddb26 Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Link: https://lore.kernel.org/r/20210602235230.3928842-4-pcc@google.com Signed-off-by: Will Deacon <will@kernel.org> |
||
Mark Rutland
|
6b8f648959 |
arm64: update string routine copyrights and URLs
To make future archaeology easier, let's have the string routine comment blocks encode the specific upstream commit ID they were imported from. These are the same commit IDs as listed in the commits importing the code, expanded to 16 characters. Note that the routines have different commit IDs, each reprsenting the latest upstream commit which changed the particular routine. At the same time, let's consistently include 2021 in the copyright dates. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210602151358.35571-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> |
||
Robin Murphy
|
344323e042 |
arm64: Rewrite __arch_clear_user()
Now that we're always using STTR variants rather than abstracting two different addressing modes, the user_ldst macro here is frankly more obfuscating than helpful. Rewrite __arch_clear_user() with regular USER() annotations so that it's clearer what's going on, and take the opportunity to minimise the branchiness in the most common paths, while also allowing the exception fixup to return an accurate result. Apparently some folks examine large reads from /dev/zero closely enough to notice the loop being hot, so align it per the other critical loops (presumably around a typical instruction fetch granularity). Reviewed-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/1cbd78b12c076a8ad4656a345811cfb9425df0b3.1622128527.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org> |
||
Robin Murphy
|
9e51cafd78 |
arm64: Better optimised memchr()
Although we implement our own assembly version of memchr(), it turns out to be barely any better than what GCC can generate for the generic C version (and would go wrong if the size_t argument were ever large enough to be interpreted as negative). Unfortunately we can't import the tuned implementation from the Arm optimized-routines library, since that has some Advanced SIMD parts which are not really viable for general kernel library code. What we can do, however, is pep things up with some relatively straightforward word-at-a-time logic for larger calls. Adding some timing to optimized-routines' memchr() test for a simple benchmark, overall this version comes in around half as fast as the SIMD code, but still nearly 4x faster than our existing implementation. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/58471b42f9287e039dafa9e5e7035077152438fd.1622128527.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org> |
||
Robin Murphy
|
285133040e |
arm64: Import latest memcpy()/memmove() implementation
Import the latest implementation of memcpy(), based on the upstream code of string/aarch64/memcpy.S at commit afd6244 from https://github.com/ARM-software/optimized-routines, and subsuming memmove() in the process. Note that for simplicity Arm have chosen to contribute this code to Linux under GPLv2 rather than the original MIT license. Note also that the needs of the usercopy routines vs. regular memcpy() have now diverged so far that we abandon the shared template idea and the damage which that incurred to the tuning of LDP/STP loops. We'll be back to tackle those routines separately in future. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/3c953af43506581b2422f61952261e76949ba711.1622128527.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org> |