Commit Graph

3875 Commits

Author SHA1 Message Date
Linus Torvalds
51835949dd Networking changes for 6.11. Not much excitement - a handful of large
patchsets (devmem among them) did not make it in time.
 
 Core & protocols
 ----------------
 
  - Use local_lock in addition to local_bh_disable() to protect per-CPU
    resources in networking, a step closer for local_bh_disable() not
    to act as a big lock on PREEMPT_RT.
 
  - Use flex array for netdevice priv area, ensure its cache alignment.
 
  - Add a sysctl knob to allow user to specify a default rto_min at socket
    init time. Bit of a big hammer but multiple companies were
    independently carrying such patch downstream so clearly it's useful.
 
  - Support scheduling transmission of packets based on CLOCK_TAI.
 
  - Un-pin TCP TIMEWAIT timer to avoid it firing on CPUs later cordoned off
    using cpusets.
 
  - Support multiple L2TPv3 UDP tunnels using the same 5-tuple address.
 
  - Allow configuration of multipath hash seed, to both allow synchronizing
    hashing of two routers, and preventing partial accidental sync.
 
  - Improve TCP compliance with RFC 9293 for simultaneous connect().
 
  - Support sending NAT keepalives in IPsec ESP in UDP states. Userspace
    IKE daemon had to do this before, but the kernel can better keep
    track of it.
 
  - Support sending supervision HSR frames with MAC addresses stored in
    ProxyNodeTable when RedBox (i.e. HSR-SAN) is enabled.
 
  - Introduce IPPROTO_SMC for selecting SMC when socket is created.
 
  - Allow UDP GSO transmit from devices with no checksum offload.
 
  - openvswitch: add packet sampling via psample, separating the sampled
    traffic from "upcall" packets sent to user space for forwarding.
 
  - nf_tables: shrink memory consumption for transaction objects.
 
 Things we sprinkled into general kernel code
 --------------------------------------------
 
  - Power Sequencing subsystem (used by Qualcomm Bluetooth driver
    for QCA6390).
 
  - Add IRQ information in sysfs for auxiliary bus.
 
  - Introduce guard definition for local_lock.
 
  - Add aligned flavor of __cacheline_group_{begin, end}() markings for
    grouping fields in structures.
 
 BPF
 ---
 
  - Notify user space (via epoll) when a struct_ops object is getting
    detached/unregistered.
 
  - Add new kfuncs for a generic, open-coded bits iterator.
 
  - Enable BPF programs to declare arrays of kptr, bpf_rb_root, and
    bpf_list_head.
 
  - Support resilient split BTF which cuts down on duplication and makes
    BTF as compact as possible WRT BTF from modules.
 
  - Add support for dumping kfunc prototypes from BTF which enables both
    detecting as well as dumping compilable prototypes for kfuncs.
 
  - riscv64 BPF JIT improvements in particular to add 12-argument support
    for BPF trampolines and to utilize bpf_prog_pack for the latter.
 
  - Add the capability to offload the netfilter flowtable in XDP layer
    through kfuncs.
 
 Driver API
 ----------
 
  - Allow users to configure IRQ tresholds between which automatic IRQ
    moderation can choose.
 
  - Expand Power Sourcing (PoE) status with power, class and failure
    reason. Support setting power limits.
 
  - Track additional RSS contexts in the core, make sure configuration
    changes don't break them.
 
  - Support IPsec crypto offload for IPv6 ESP and IPv4 UDP-encapsulated ESP
    data paths.
 
  - Support updating firmware on SFP modules.
 
 Tests and tooling
 -----------------
 
  - mptcp: use net/lib.sh to manage netns.
 
  - TCP-AO and TCP-MD5: replace debug prints used by tests with
    tracepoints.
 
  - openvswitch: make test self-contained (don't depend on OvS CLI tools).
 
 Drivers
 -------
 
  - Ethernet high-speed NICs:
    - Broadcom (bnxt):
      - increase the max total outstanding PTP TX packets to 4
      - add timestamping statistics support
      - implement netdev_queue_mgmt_ops
      - support new RSS context API
    - Intel (100G, ice, idpf):
      - implement FEC statistics and dumping signal quality indicators
      - support E825C products (with 56Gbps PHYs)
    - nVidia/Mellanox:
      - support HW-GRO
      - mlx4/mlx5: support per-queue statistics via netlink
      - obey the max number of EQs setting in sub-functions
    - AMD/Solarflare:
      - support new RSS context API
    - AMD/Pensando:
      - ionic: rework fix for doorbell miss to lower overhead
        and skip it on new HW
    - Wangxun:
      - txgbe: support Flow Director perfect filters
 
  - Ethernet NICs consumer, embedded and virtual:
    - Add driver for Tehuti Networks TN40xx chips
    - Add driver for Meta's internal NIC chips
    - Add driver for Ethernet MAC on Airoha EN7581 SoCs
    - Add driver for Renesas Ethernet-TSN devices
    - Google cloud vNIC:
      - flow steering support
    - Microsoft vNIC:
      - support page sizes other than 4KB on ARM64
    - vmware vNIC:
      - support latency measurement (update to version 9)
    - VirtIO net:
      - support for Byte Queue Limits
      - support configuring thresholds for automatic IRQ moderation
      - support for AF_XDP Rx zero-copy
    - Synopsys (stmmac):
      - support for STM32MP13 SoC
      - let platforms select the right PCS implementation
    - TI:
      - icssg-prueth: add multicast filtering support
      - icssg-prueth: enable PTP timestamping and PPS
    - Renesas:
      - ravb: improve Rx performance 30-400% by using page pool,
        theaded NAPI and timer-based IRQ coalescing
      - ravb: add MII support for R-Car V4M
    - Cadence (macb):
      - macb: add ARP support to Wake-On-LAN
    - Cortina:
      - use phylib for RX and TX pause configuration
 
  - Ethernet switches:
    - nVidia/Mellanox:
      - support configuration of multipath hash seed
      - report more accurate max MTU
      - use page_pool to improve Rx performance
    - MediaTek:
      - mt7530: add support for bridge port isolation
    - Qualcomm:
      - qca8k: add support for bridge port isolation
    - Microchip:
      - lan9371/2: add 100BaseTX PHY support
    - NXP:
      - vsc73xx: implement VLAN operations
 
  - Ethernet PHYs:
    - aquantia: enable support for aqr115c
    - aquantia: add support for PHY LEDs
    - realtek: add support for rtl8224 2.5Gbps PHY
    - xpcs: add memory-mapped device support
    - add BroadR-Reach link mode and support in Broadcom's PHY driver
 
  - CAN:
    - add document for ISO 15765-2 protocol support
    - mcp251xfd: workaround for erratum DS80000789E, use timestamps
      to catch when device returns incorrect FIFO status
 
  - WiFi:
    - mac80211/cfg80211:
      - parse Transmit Power Envelope (TPE) data in mac80211 instead of
        in drivers
      - improvements for 6 GHz regulatory flexibility
      - multi-link improvements
      - support multiple radios per wiphy
      - remove DEAUTH_NEED_MGD_TX_PREP flag
    - Intel (iwlwifi):
      - bump FW API to 91 for BZ/SC devices
      - report 64-bit radiotap timestamp
      - enable P2P low latency by default
      - handle Transmit Power Envelope (TPE) advertised by AP
      - remove support for older FW for new devices
      - fast resume (keeping the device configured)
      - mvm: re-enable Multi-Link Operation (MLO)
      - aggregation (A-MSDU) optimizations
    - MediaTek (mt76):
      - mt7925 Multi-Link Operation (MLO) support
    - Qualcomm (ath10k):
      - LED support for various chipsets
    - Qualcomm (ath12k):
      - remove unsupported Tx monitor handling
      - support channel 2 in 6 GHz band
      - support Spatial Multiplexing Power Save (SMPS) in 6 GHz band
      - supprt multiple BSSID (MBSSID) and Enhanced Multi-BSSID
        Advertisements (EMA)
      - support dynamic VLAN
      - add panic handler for resetting the firmware state
      - DebugFS support for datapath statistics
      - WCN7850: support for Wake on WLAN
    - Microchip (wilc1000):
      - read MAC address during probe to make it visible to user space
      - suspend/resume improvements
    - TI (wl18xx):
      - support newer firmware versions
    - RealTek (rtw89):
      - preparation for RTL8852BE-VT support
      - Wake on WLAN support for WiFi 6 chips
      - 36-bit PCI DMA support
    - RealTek (rtlwifi):
      - RTL8192DU support
    - Broadcom (brcmfmac):
      - Management Frame Protection support (to enable WPA3)
 
  - Bluetooth:
    - qualcomm: use the power sequencer for QCA6390
    - btusb: mediatek: add ISO data transmission functions
    - hci_bcm4377: add BCM4388 support
    - btintel: add support for BlazarU core
    - btintel: add support for Whale Peak2
    - btnxpuart: add support for AW693 A1 chipset
    - btnxpuart: add support for IW615 chipset
    - btusb: add Realtek RTL8852BE support ID 0x13d3:0x3591
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmaWjBwACgkQMUZtbf5S
 IrvuSRAAkJuEzTRqgURBCe4eNEQde6mJJig7l2CKHwCbFiHZpRkFHf8qKbcGWbL6
 uLW33SWnKtJVDhxVKWHLq635XW7BAa80YhqGw21GDi+mIEhWXZglHj3xbXNxsMfE
 4eg/kG4BkfYWFmHaXOwVWV/mr7nXf6j7WmXNeXEi32ufE1j0OL+YlQenKnMj8yP2
 j9JmYa2Chwppng1SblHmcjmGkdNVwFhStKeCG+2K7v06wdDH/QYBlbgUv9gw/cxp
 NlW//wgiaeX40U4O3kDwt9C+LDoh+0VrDDeVdQ+IsScLtY3PhAzEoKolFYTq2HSr
 I1JpoaHNnyNsJq3DZrACQ5WlH4yDn6C2EUB6dxNnFaI9F1ZPsi+7MTl6Sei1AklD
 TuQTj/lxOACBwW2Q77NU72uoxiIUauesGPHcnrAFuoCIEhZF0mso7k59BvrXhsOP
 QwcLbQdc1YHNkqv/Vc7NBY+ruMsYB+5Ubbhhj2p27dp/CWFIwxI29fze4dn2uhO6
 ejHN3mbqwPdSzg12YJtM6Iq61Cnwo2eVSvhTxl+ZVSZtI4nu2arzR+y7QTYmNrXP
 6tkgVN9UsWeLl2xJ8wyyqL5mcvNHP2rPXWZ2X56iTaa26m+UlleeQ7YRaYtQAAr0
 Ec/vlDMX64SwHhd+qwE99DXGQf2g+KklHKSLsnajJUVrWFTlRI0=
 =opz8
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Not much excitement - a handful of large patchsets (devmem among them)
  did not make it in time.

  Core & protocols:

   - Use local_lock in addition to local_bh_disable() to protect per-CPU
     resources in networking, a step closer for local_bh_disable() not
     to act as a big lock on PREEMPT_RT

   - Use flex array for netdevice priv area, ensure its cache alignment

   - Add a sysctl knob to allow user to specify a default rto_min at
     socket init time. Bit of a big hammer but multiple companies were
     independently carrying such patch downstream so clearly it's useful

   - Support scheduling transmission of packets based on CLOCK_TAI

   - Un-pin TCP TIMEWAIT timer to avoid it firing on CPUs later cordoned
     off using cpusets

   - Support multiple L2TPv3 UDP tunnels using the same 5-tuple address

   - Allow configuration of multipath hash seed, to both allow
     synchronizing hashing of two routers, and preventing partial
     accidental sync

   - Improve TCP compliance with RFC 9293 for simultaneous connect()

   - Support sending NAT keepalives in IPsec ESP in UDP states.
     Userspace IKE daemon had to do this before, but the kernel can
     better keep track of it

   - Support sending supervision HSR frames with MAC addresses stored in
     ProxyNodeTable when RedBox (i.e. HSR-SAN) is enabled

   - Introduce IPPROTO_SMC for selecting SMC when socket is created

   - Allow UDP GSO transmit from devices with no checksum offload

   - openvswitch: add packet sampling via psample, separating the
     sampled traffic from "upcall" packets sent to user space for
     forwarding

   - nf_tables: shrink memory consumption for transaction objects

  Things we sprinkled into general kernel code:

   - Power Sequencing subsystem (used by Qualcomm Bluetooth driver for
     QCA6390)           [ Already merged separately - Linus ]

   - Add IRQ information in sysfs for auxiliary bus

   - Introduce guard definition for local_lock

   - Add aligned flavor of __cacheline_group_{begin, end}() markings for
     grouping fields in structures

  BPF:

   - Notify user space (via epoll) when a struct_ops object is getting
     detached/unregistered

   - Add new kfuncs for a generic, open-coded bits iterator

   - Enable BPF programs to declare arrays of kptr, bpf_rb_root, and
     bpf_list_head

   - Support resilient split BTF which cuts down on duplication and
     makes BTF as compact as possible WRT BTF from modules

   - Add support for dumping kfunc prototypes from BTF which enables
     both detecting as well as dumping compilable prototypes for kfuncs

   - riscv64 BPF JIT improvements in particular to add 12-argument
     support for BPF trampolines and to utilize bpf_prog_pack for the
     latter

   - Add the capability to offload the netfilter flowtable in XDP layer
     through kfuncs

  Driver API:

   - Allow users to configure IRQ tresholds between which automatic IRQ
     moderation can choose

   - Expand Power Sourcing (PoE) status with power, class and failure
     reason. Support setting power limits

   - Track additional RSS contexts in the core, make sure configuration
     changes don't break them

   - Support IPsec crypto offload for IPv6 ESP and IPv4 UDP-encapsulated
     ESP data paths

   - Support updating firmware on SFP modules

  Tests and tooling:

   - mptcp: use net/lib.sh to manage netns

   - TCP-AO and TCP-MD5: replace debug prints used by tests with
     tracepoints

   - openvswitch: make test self-contained (don't depend on OvS CLI
     tools)

  Drivers:

   - Ethernet high-speed NICs:
      - Broadcom (bnxt):
         - increase the max total outstanding PTP TX packets to 4
         - add timestamping statistics support
         - implement netdev_queue_mgmt_ops
         - support new RSS context API
      - Intel (100G, ice, idpf):
         - implement FEC statistics and dumping signal quality indicators
         - support E825C products (with 56Gbps PHYs)
      - nVidia/Mellanox:
         - support HW-GRO
         - mlx4/mlx5: support per-queue statistics via netlink
         - obey the max number of EQs setting in sub-functions
      - AMD/Solarflare:
         - support new RSS context API
      - AMD/Pensando:
         - ionic: rework fix for doorbell miss to lower overhead and
           skip it on new HW
      - Wangxun:
         - txgbe: support Flow Director perfect filters

   - Ethernet NICs consumer, embedded and virtual:
      - Add driver for Tehuti Networks TN40xx chips
      - Add driver for Meta's internal NIC chips
      - Add driver for Ethernet MAC on Airoha EN7581 SoCs
      - Add driver for Renesas Ethernet-TSN devices
      - Google cloud vNIC:
         - flow steering support
      - Microsoft vNIC:
         - support page sizes other than 4KB on ARM64
      - vmware vNIC:
         - support latency measurement (update to version 9)
      - VirtIO net:
         - support for Byte Queue Limits
         - support configuring thresholds for automatic IRQ moderation
         - support for AF_XDP Rx zero-copy
      - Synopsys (stmmac):
         - support for STM32MP13 SoC
         - let platforms select the right PCS implementation
      - TI:
         - icssg-prueth: add multicast filtering support
         - icssg-prueth: enable PTP timestamping and PPS
      - Renesas:
         - ravb: improve Rx performance 30-400% by using page pool,
           theaded NAPI and timer-based IRQ coalescing
         - ravb: add MII support for R-Car V4M
      - Cadence (macb):
         - macb: add ARP support to Wake-On-LAN
      - Cortina:
         - use phylib for RX and TX pause configuration

   - Ethernet switches:
      - nVidia/Mellanox:
         - support configuration of multipath hash seed
         - report more accurate max MTU
         - use page_pool to improve Rx performance
      - MediaTek:
         - mt7530: add support for bridge port isolation
      - Qualcomm:
         - qca8k: add support for bridge port isolation
      - Microchip:
         - lan9371/2: add 100BaseTX PHY support
      - NXP:
         - vsc73xx: implement VLAN operations

   - Ethernet PHYs:
      - aquantia: enable support for aqr115c
      - aquantia: add support for PHY LEDs
      - realtek: add support for rtl8224 2.5Gbps PHY
      - xpcs: add memory-mapped device support
      - add BroadR-Reach link mode and support in Broadcom's PHY driver

   - CAN:
      - add document for ISO 15765-2 protocol support
      - mcp251xfd: workaround for erratum DS80000789E, use timestamps to
        catch when device returns incorrect FIFO status

   - WiFi:
      - mac80211/cfg80211:
         - parse Transmit Power Envelope (TPE) data in mac80211 instead
           of in drivers
         - improvements for 6 GHz regulatory flexibility
         - multi-link improvements
         - support multiple radios per wiphy
         - remove DEAUTH_NEED_MGD_TX_PREP flag
      - Intel (iwlwifi):
         - bump FW API to 91 for BZ/SC devices
         - report 64-bit radiotap timestamp
         - enable P2P low latency by default
         - handle Transmit Power Envelope (TPE) advertised by AP
         - remove support for older FW for new devices
         - fast resume (keeping the device configured)
         - mvm: re-enable Multi-Link Operation (MLO)
         - aggregation (A-MSDU) optimizations
      - MediaTek (mt76):
         - mt7925 Multi-Link Operation (MLO) support
      - Qualcomm (ath10k):
         - LED support for various chipsets
      - Qualcomm (ath12k):
         - remove unsupported Tx monitor handling
         - support channel 2 in 6 GHz band
         - support Spatial Multiplexing Power Save (SMPS) in 6 GHz band
         - supprt multiple BSSID (MBSSID) and Enhanced Multi-BSSID
           Advertisements (EMA)
         - support dynamic VLAN
         - add panic handler for resetting the firmware state
         - DebugFS support for datapath statistics
         - WCN7850: support for Wake on WLAN
      - Microchip (wilc1000):
         - read MAC address during probe to make it visible to user space
         - suspend/resume improvements
      - TI (wl18xx):
         - support newer firmware versions
      - RealTek (rtw89):
         - preparation for RTL8852BE-VT support
         - Wake on WLAN support for WiFi 6 chips
         - 36-bit PCI DMA support
      - RealTek (rtlwifi):
         - RTL8192DU support
      - Broadcom (brcmfmac):
         - Management Frame Protection support (to enable WPA3)

   - Bluetooth:
      - qualcomm: use the power sequencer for QCA6390
      - btusb: mediatek: add ISO data transmission functions
      - hci_bcm4377: add BCM4388 support
      - btintel: add support for BlazarU core
      - btintel: add support for Whale Peak2
      - btnxpuart: add support for AW693 A1 chipset
      - btnxpuart: add support for IW615 chipset
      - btusb: add Realtek RTL8852BE support ID 0x13d3:0x3591"

* tag 'net-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1589 commits)
  eth: fbnic: Fix spelling mistake "tiggerring" -> "triggering"
  tcp: Replace strncpy() with strscpy()
  wifi: ath12k: fix build vs old compiler
  tcp: Don't access uninit tcp_rsk(req)->ao_keyid in tcp_create_openreq_child().
  eth: fbnic: Write the TCAM tables used for RSS control and Rx to host
  eth: fbnic: Add L2 address programming
  eth: fbnic: Add basic Rx handling
  eth: fbnic: Add basic Tx handling
  eth: fbnic: Add link detection
  eth: fbnic: Add initial messaging to notify FW of our presence
  eth: fbnic: Implement Rx queue alloc/start/stop/free
  eth: fbnic: Implement Tx queue alloc/start/stop/free
  eth: fbnic: Allocate a netdevice and napi vectors with queues
  eth: fbnic: Add FW communication mechanism
  eth: fbnic: Add message parsing for FW messages
  eth: fbnic: Add register init to set PCIe/Ethernet device config
  eth: fbnic: Allocate core device specific structures and devlink interface
  eth: fbnic: Add scaffolding for Meta's NIC driver
  PCI: Add Meta Platforms vendor ID
  net/sched: cls_flower: propagate tca[TCA_OPTIONS] to NL_REQ_ATTR_CHECK
  ...
2024-07-16 19:28:34 -07:00
Linus Torvalds
d80f2996b8 asm-generic updates for 6.11
Most of this is part of my ongoing work to clean up the system call
 tables. In this bit, all of the newer architectures are converted to
 use the machine readable syscall.tbl format instead in place of complex
 macros in include/uapi/asm-generic/unistd.h.
 
 This follows an earlier series that fixed various API mismatches
 and in turn is used as the base for planned simplifications.
 
 The other two patches are dead code removal and a warning fix.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmaVB1cACgkQYKtH/8kJ
 UicMqxAAnYKOxfjoMIhYYK6bl126wg/vIcDcjIR9cNWH21Nhn3qxn11ZXau3S7xv
 3l/HreEhyEQr4gC2a70IlXyHUadYOlrk+83OURrunWk1oKPmZlMKcfPVbtp8GL7x
 PUNXQfwM1XZLveKwufY24hoZdwKC+Y/5WLc1t0ReznJuAqgeO2rM9W5dnV5bAfCp
 he3F5hFcr196Dz3/GJjJIWrY+cbwfmZWsNtj1vFTL5/r/LuCu8HTkqhsGj8tE5BJ
 NGVEEXbp5eaVTCIGqJWhnuZcsnKN9kM51M7CtdwWf8OTckUVuJap5OsDVKQkWkGl
 bLPbd2jhDltph0sah51hAIvv4WdkThW76u9FRW7KR3fo7ra67eF7l5j7wc1lE2JB
 GwLJ1X56Bxe1GhvvNTlDmb7DrnlP/DMPuRv3Z6xyH6l8iZ2pMGlnAxuw6Bs1s6Y5
 WSs36ZpnS0ctgjfx37ZITsZSvbKFPpQFJP4siwS8aRNv/NFALNNdFyOCY5lNzspZ
 0dxwjn6/7UpHE4MKh6/hvCg2QwupXXBTRytibw+75/rOsR+EYlmtuONtyq2sLUHe
 ktJ5pg+8XuZm27+wLffuluzmY7sv2F8OU4cTYeM60Ynmc6pRzwUY6/VhG52S1/mU
 Ua4VgYIpzOtlLrYmz5QTWIZpdSFSVbIc/3pLriD6hn4Mvg+BwdA=
 =XOhL
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic updates from Arnd Bergmann:
 "Most of this is part of my ongoing work to clean up the system call
  tables. In this bit, all of the newer architectures are converted to
  use the machine readable syscall.tbl format instead in place of
  complex macros in include/uapi/asm-generic/unistd.h.

  This follows an earlier series that fixed various API mismatches and
  in turn is used as the base for planned simplifications.

  The other two patches are dead code removal and a warning fix"

* tag 'asm-generic-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  vmlinux.lds.h: catch .bss..L* sections into BSS")
  fixmap: Remove unused set_fixmap_offset_io()
  riscv: convert to generic syscall table
  openrisc: convert to generic syscall table
  nios2: convert to generic syscall table
  loongarch: convert to generic syscall table
  hexagon: use new system call table
  csky: convert to generic syscall table
  arm64: rework compat syscall macros
  arm64: generate 64-bit syscall.tbl
  arm64: convert unistd_32.h to syscall.tbl format
  arc: convert to generic syscall table
  clone3: drop __ARCH_WANT_SYS_CLONE3 macro
  kbuild: add syscall table generation to scripts/Makefile.asm-headers
  kbuild: verify asm-generic header list
  loongarch: avoid generating extra header files
  um: don't generate asm/bpf_perf_event.h
  csky: drop asm/gpio.h wrapper
  syscalls: add generic scripts/syscall.tbl
2024-07-16 12:09:03 -07:00
Linus Torvalds
a9a4cd9c33 soc: defconfig updates for 6.11
These are the usual updates to enable newly added drivers, mostly for
 arm64 and riscv this time.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmaVOFUACgkQYKtH/8kJ
 Uiecqw/+LleDzQp7Gf2DaQUesmLwfqs6xj69GIDI69y1PLPI0bQK/hSQ3B+ZTpZP
 fF56oxBbevytgf6LzhI/Ma+RLcSWqyNw6wgzkS+pBnslCC/8zc66wU7WF53CL9LL
 Ovxo0MRKzQdCgXpu1OZFeUM38YT72yKNebjiYE502vHHfm3Q7XmGmPgCgIJqF0rE
 BSQaSCPLEJU/pFsus+gAILKn+ptgt8GUWDKRSGksJZ8oyuWB30fl+GZ4yg6TaFie
 /D3gm3Qh9uXviG7w4ks1gTvuOgoTlRAecrzxstPrQvHDaG2WCdmvI68aek7Dj/DH
 AYb6krExF58VBdTK+HhD9hxxF9nLvk58GVNfTd7FryIejv9Nw0sMWC7SGoYkZShF
 bSkuX2cxy6S0+ejvJ+HOkmLFAXY2D3ScuVpIEaoE3V+R017Izq5vyYWdvZv2CyMv
 zv9wZW6oiDMDRyVbnrw2dZ4JQwna3XWf3ffr8Ds6NNJt4e07npPlkmlwsUfZEkMl
 yQAtUBeeGBDT3GKP2ZyaexHbIwXro6EcICa1hupyLZwR3RSiU7ZsNL98lKW9nTkJ
 2ZK5f9vKAat+yFZ+XfZS+khA2qz2gtrykonHYZvXcV7FcM8UmpvlO22gliwRR4VI
 JOwu+biwM6hnGoJc7WyC5YS2rwcOZZ1JIv6vvUDznOZ3pLKFb7E=
 =OqPR
 -----END PGP SIGNATURE-----

Merge tag 'soc-defconfig-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC defconfig updates from Arnd Bergmann:
 "These are the usual updates to enable newly added drivers, mostly for
  arm64 and riscv this time"

* tag 'soc-defconfig-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  arm64: defconfig: Enable the IWLWIFI driver
  ARM: multi_v7_defconfig: Add MCP23S08 pinctrl support
  arm64: defconfig: Enable NVIDIA CoreSight PMU driver
  arm64: defconfig: enable SHM Bridge support for the TZ memory allocator
  arm64: defconfig: Enable secure QFPROM driver
  ARM: imx_v6_v7_defconfig: enable DRM_SII902X and DRM_DISPLAY_CONNECTOR
  ARM: imx_v6_v7_defconfig: Enable drivers for TQMa7x/MBa7x
  riscv: defconfig: Enable StarFive JH7110 drivers
  arm64: defconfig: Enable TI LP873X PMIC
  arm64: defconfig: Enable USB2 PHY Driver
  arm64: defconfig: Enable MTD support for Hyperbus
  ARM: configs: at91: Enable LVDS serializer support
  arm64: defconfig: enable several Qualcomm interconnects
  arm64: defconfig: Enable Marvell 88Q2XXX PHY support
  arm64: defconfig: make CONFIG_INTERCONNECT_QCOM_SM8350 built-in
  arm64: defconfig: enable CONFIG_SM_GPUCC_8350
  arm64: defconfig: Enable Renesas R-Car Gen4 PCIe controller
2024-07-16 11:56:15 -07:00
Linus Torvalds
e3950967f6 soc: dt updates for 6.11
The devicetree updates are fairly well spread out across platforms,
 with Qualcomm making up about a third of the total.
 
 There are three new SoCs in existing product families this:
 
  - NXP i.MX95 is a variant of i.MX93, now with six Cortex-A55 cores
    instead of just two as well as a GPU and more high-speed I/O
    devices.
 
  - Qualcomm QCS8550 is a variant of SM8550 for IOT devices
 
  - Airoha EN7581 is a 10G-PON network chip and related to
    the MT7981 Wireless router chip from its parent Mediatek.
 
 In total there are 58 new machines, including four riscv
 boards and eight for 32-bit arm.
 
 The most exciting new addition is probably a pair of laptops
 based on the Qualcomm x1e80100 (Snapdragon X1 Elite) chip,
 the Asus Vivobook S15 and the Lenovo Yoga Slim7x.
 
 Other noteworthy new additions are:
 
  - A total of 20 Qualcomm based machines, mostly Android devices
    from Samsung, Motorola and LG, as well as a wireless router
    and some reference designs
 
  - Six NXP i.MX based machines, mostly industrial boards along
    with some reference designs
 
  - Mediatek sees some interesting Filogic based routers
    including the "OpenWRT One", a few new Chromebooks as
    well as single-board computers.
 
  - Four machines from Solidrun based on Marvell cn913x,
    replacing the older Armada 8000 based counterparts
 
  - The four Amlogic machines are all set top boxes or reference
    designs for them
 
  - The nine new Rockchips machines are mostly single-board
    computers including some interesting ones based on the
    rk3588 chip like the ROCK 5 ITX board and the CM3588
    with its four NVMe slots
 
  - The RISC-V boards are all single-board computers based on
    Starfive JH7110, Microchip MPFS and Allwinner D1, which all
    had similar boards already
 
 There are also a lot of updates to already supported machines,
 notably for the TI K3, Rockchips, Freescale and of course
 Qualcomm platforms.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmaVTSYACgkQYKtH/8kJ
 UidZrQ/9GKrfiZ9xJ/7Vvh/jtF5uObsoVuEC2ZFNXY4q6x6KV8BxuHV6LVHgWVaS
 3+Mp5ER1N+h13cB8aDNQ9lq/TYfINQrAGFPMWK2Ytkg57klqeCblfSiKuQxIfdmG
 SH146R3NPe6lqEZ9yv8KWr1GS8kkkVFgzcOBD2BPwx77elazBvG4Ff5rd3Nizua2
 aAcrO2tKHMOJz4eUOJNvrDppwBZUARwPlScBx+QrJWUIDvjRafGvmwSp80FEQorz
 k258DeBzn3JiHUtvE5MLsaBC1WNghV5WTujEI+SLd5T0XohSr5Y8oisSnn/9fAn4
 CCji0eeeqG/KfIWzEGvs7AKmym1oW1OpdbLRN601YSNxLS7mLE5gEySjFXR3dYje
 IxbYzDV9A8qst/znk+uR6be8YB9r7r+aYi4IlE4lg9xWripTOPNuCx/5tdfa2Ge6
 +fBs4WBz+t0Xba19VjonaP+6HsEPqC2LP0/D44QMktG7QRrYbqILX66Mg/jgPccM
 f167D9WGcWUwoKH2nDZ+m1oXQj0UkSge40gBOFRtGfdCsV77TssmGeq0OeDDSA9K
 bIQgaDVwZuYXr9kyNoYIqziU0JA+mhALLiaAVaMLS8+VcNXRZKscv3fs+yFgCGFy
 aDkqWw6j2M3/O93+t4j4He/KNglquA81DBT8ZZPV1KJ4flTQIk0=
 =xGqj
 -----END PGP SIGNATURE-----

Merge tag 'soc-dt-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC dt updates from Arnd Bergmann:
 "The devicetree updates are fairly well spread out across platforms,
  with Qualcomm making up about a third of the total.

  There are three new SoCs in existing product families this:

   - NXP i.MX95 is a variant of i.MX93, now with six Cortex-A55 cores
     instead of just two as well as a GPU and more high-speed I/O
     devices.

   - Qualcomm QCS8550 is a variant of SM8550 for IOT devices

   - Airoha EN7581 is a 10G-PON network chip and related to the MT7981
     Wireless router chip from its parent Mediatek.

  In total there are 58 new machines, including four riscv boards and
  eight for 32-bit arm.

  The most exciting new addition is probably a pair of laptops based on
  the Qualcomm x1e80100 (Snapdragon X1 Elite) chip, the Asus Vivobook
  S15 and the Lenovo Yoga Slim7x.

  Other noteworthy new additions are:

   - A total of 20 Qualcomm based machines, mostly Android devices from
     Samsung, Motorola and LG, as well as a wireless router and some
     reference designs

   - Six NXP i.MX based machines, mostly industrial boards along with
     some reference designs

   - Mediatek sees some interesting Filogic based routers including the
     "OpenWRT One", a few new Chromebooks as well as single-board
     computers.

   - Four machines from Solidrun based on Marvell cn913x, replacing the
     older Armada 8000 based counterparts

   - The four Amlogic machines are all set top boxes or reference
     designs for them

   - The nine new Rockchips machines are mostly single-board computers
     including some interesting ones based on the rk3588 chip like the
     ROCK 5 ITX board and the CM3588 with its four NVMe slots

   - The RISC-V boards are all single-board computers based on Starfive
     JH7110, Microchip MPFS and Allwinner D1, which all had similar
     boards already

  There are also a lot of updates to already supported machines, notably
  for the TI K3, Rockchips, Freescale and of course Qualcomm platforms"

* tag 'soc-dt-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (846 commits)
  arm64: dts: allwinner: h616: add crypto engine node
  riscv: dts: add clock generator for Sophgo SG2042 SoC
  arm64: dts: rockchip: Add Xunlong Orange Pi 3B
  dt-bindings: arm: rockchip: Add Xunlong Orange Pi 3B
  arm64: dts: rockchip: Add Radxa ROCK 3B
  dt-bindings: arm: rockchip: Add Radxa ROCK 3B
  mailmap: Update Luca Weiss's email address
  ARM: dts: ixp4xx: nslu2: beeper uses PWM
  arm64: dts: rockchip: add ROCK 5 ITX board
  dt-bindings: arm: rockchip: Add ROCK 5 ITX board
  arm64: dts: rockchip: Add dma-names to uart1 on Pine64 rk3566 devices
  arm64: dts: rockchip: Add avdd supplies to hdmi on rock64
  arm64: dts: qcom: msm8916-lg-c50: add initial dts for LG Leon LTE
  arm64: dts: qcom: msm8916-lg-m216: Add initial device tree
  dt-bindings: arm: qcom: Add msm8916 based LG devices
  ARM: dts: qcom: msm8960: correct memory base
  arm64: dts: qcom: ipq9574: Add icc provider ability to gcc
  dt-bindings: interconnect: Add Qualcomm IPQ9574 support
  arm64: dts: qcom: sm8150: Add video clock controller node
  arm64: dts: qcom: pm6150: Add vibrator
  ...
2024-07-16 11:43:51 -07:00
Paolo Bonzini
86014c1e20 KVM generic changes for 6.11
- Enable halt poll shrinking by default, as Intel found it to be a clear win.
 
  - Setup empty IRQ routing when creating a VM to avoid having to synchronize
    SRCU when creating a split IRQCHIP on x86.
 
  - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag
    that arch code can use for hooking both sched_in() and sched_out().
 
  - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid
    truncating a bogus value from userspace, e.g. to help userspace detect bugs.
 
  - Mark a vCPU as preempted if and only if it's scheduled out while in the
    KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest
    memory when retrieving guest state during live migration blackout.
 
  - A few minor cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmaRuOYACgkQOlYIJqCj
 N/1UnQ/8CI5Qfr+/0gzYgtWmtEMczGG+rMNpzD3XVqPjJjXcMcBiQnplnzUVLhha
 vlPdYVK7vgmEt003XGzV55mik46LHL+DX/v4hI3HEdblfyCeNLW3fKEWVRB44qJe
 o+YUQwSK42SORUp9oXuQINxhA//U9EnI7CQxlJ8w8wenv5IJKfIGr01DefmfGPAV
 PKm9t6WLcNqvhZMEyy/zmzM3KVPCJL0NcwI97x6sHxFpQYIDtL0E/VexA4AFqMoT
 QK7cSDC/2US41Zvem/r/GzM/ucdF6vb9suzZYBohwhxtVhwJe2CDeYQZvtNKJ1U7
 GOHPaKL6nBWdZCm/yyWbbX2nstY1lHqxhN3JD0X8wqU5rNcwm2b8Vfyav0Ehc7H+
 jVbDTshOx4YJmIgajoKjgM050rdBK59TdfVL+l+AAV5q/TlHocalYtvkEBdGmIDg
 2td9UHSime6sp20vQfczUEz4bgrQsh4l2Fa/qU2jFwLievnBw0AvEaMximkSGMJe
 b8XfjmdTjlOesWAejANKtQolfrq14+1wYw0zZZ8PA+uNVpKdoovmcqSOcaDC9bT8
 GO/NFUvoG+lkcvJcIlo1SSl81SmGLosijwxWfGvFAqsgpR3/3l3dYp0QtztoCNJO
 d3+HnjgYn5o5FwufuTD3eUOXH4AFjG108DH0o25XrIkb2Kymy0o=
 =BalU
 -----END PGP SIGNATURE-----

Merge tag 'kvm-x86-generic-6.11' of https://github.com/kvm-x86/linux into HEAD

KVM generic changes for 6.11

 - Enable halt poll shrinking by default, as Intel found it to be a clear win.

 - Setup empty IRQ routing when creating a VM to avoid having to synchronize
   SRCU when creating a split IRQCHIP on x86.

 - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag
   that arch code can use for hooking both sched_in() and sched_out().

 - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid
   truncating a bogus value from userspace, e.g. to help userspace detect bugs.

 - Mark a vCPU as preempted if and only if it's scheduled out while in the
   KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest
   memory when retrieving guest state during live migration blackout.

 - A few minor cleanups
2024-07-16 09:51:36 -04:00
Masahiro Yamada
b9d73218d7 treewide: change conditional prompt for choices to 'depends on'
While Documentation/kbuild/kconfig-language.rst provides a brief
explanation, there are recurring confusions regarding the usage of a
prompt followed by 'if <expr>'. This conditional controls _only_ the
prompt.

A typical usage is as follows:

    menuconfig BLOCK
            bool "Enable the block layer" if EXPERT
            default y

When EXPERT=n, the prompt is hidden, but this config entry is still
active, and BLOCK is set to its default value 'y'. This is reasonable
because you are likely want to enable the block device support. When
EXPERT=y, the prompt is shown, allowing you to toggle BLOCK.

Please note that it is different from 'depends on EXPERT', which would
enable and disable the entire config entry.

However, this conditional prompt has never worked in a choice block.

The following two work in the same way: when EXPERT is disabled, the
choice block is entirely disabled.

[Test Code 1]

    choice
            prompt "choose" if EXPERT

    config A
            bool "A"

    config B
            bool "B"

    endchoice

[Test Code 2]

    choice
            prompt "choose"
            depends on EXPERT

    config A
            bool "A"

    config B
            bool "B"

    endchoice

I believe the first case should hide only the prompt, producing the
default:

   CONFIG_A=y
   # CONFIG_B is not set

The next commit will change (fix) the behavior of the conditional prompt
in choice blocks.

I see several choice blocks wrongly using a conditional prompt, where
'depends on' makes more sense.

To preserve the current behavior, this commit converts such misuses.

I did not touch the following entry in arch/x86/Kconfig:

    choice
            prompt "Memory split" if EXPERT
            default VMSPLIT_3G

This is truly the correct use of the conditional prompt; when EXPERT=n,
this choice block should silently select the reasonable VMSPLIT_3G,
although the resulting PAGE_OFFSET will not be affected anyway.

Presumably, the one in fs/jffs2/Kconfig is also correct, but I converted
it to 'depends on' to avoid any potential behavioral change.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-07-16 01:08:37 +09:00
Qingfang Deng
93b63f68d0
riscv: lib: relax assembly constraints in hweight
rd and rs don't have to be the same. In some cases where rs needs to be
saved for later usage, this will save us some mv instructions.

Signed-off-by: Qingfang Deng <qingfang.deng@siflower.com.cn>
Reviewed-by: Xiao Wang <xiao.w.wang@intel.com>
Link: https://lore.kernel.org/r/20240527092405.134967-1-dqfext@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-15 08:46:46 -07:00
Christophe Leroy
e6c0c03245 mm: provide mm_struct and address to huge_ptep_get()
On powerpc 8xx huge_ptep_get() will need to know whether the given ptep is
a PTE entry or a PMD entry.  This cannot be known with the PMD entry
itself because there is no easy way to know it from the content of the
entry.

So huge_ptep_get() will need to know either the size of the page or get
the pmd.

In order to be consistent with huge_ptep_get_and_clear(), give mm and
address to huge_ptep_get().

Link: https://lkml.kernel.org/r/cc00c70dd384298796a4e1b25d6c4eb306d3af85.1719928057.git.christophe.leroy@csgroup.eu
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-12 15:52:15 -07:00
yang.zhang
6ad8735994
riscv: set trap vector earlier
The exception vector of the booting hart is not set before enabling
the mmu and then still points to the value of the previous firmware,
typically _start. That makes it hard to debug setup_vm() when bad
things happen. So fix that by setting the exception vector earlier.

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: yang.zhang <yang.zhang@hexintek.com>
Link: https://lore.kernel.org/r/20240508022445.6131-1-gaoshanliukou@163.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-12 08:55:31 -07:00
Palmer Dabbelt
5ee121a393
Merge patch series "riscv: Apply Zawrs when available"
Andrew Jones <ajones@ventanamicro.com> says:

Zawrs provides two instructions (wrs.nto and wrs.sto), where both are
meant to allow the hart to enter a low-power state while waiting on a
store to a memory location. The instructions also both wait an
implementation-defined "short" duration (unless the implementation
terminates the stall for another reason). The difference is that while
wrs.sto will terminate when the duration elapses, wrs.nto, depending on
configuration, will either just keep waiting or an ILL exception will be
raised. Linux will use wrs.nto, so if platforms have an implementation
which falls in the "just keep waiting" category (which is not expected),
then it should _not_ advertise Zawrs in the hardware description.

Like wfi (and with the same {m,h}status bits to configure it), when
wrs.nto is configured to raise exceptions it's expected that the higher
privilege level will see the instruction was a wait instruction, do
something, and then resume execution following the instruction. For
example, KVM does configure exceptions for wfi (hstatus.VTW=1) and
therefore also for wrs.nto. KVM does this for wfi since it's better to
allow other tasks to be scheduled while a VCPU waits for an interrupt.
For waits such as those where wrs.nto/sto would be used, which are
typically locks, it is also a good idea for KVM to be involved, as it
can attempt to schedule the lock holding VCPU.

This series starts with Christoph's addition of the riscv
smp_cond_load_relaxed function which applies wrs.sto when available.
That patch has been reworked to use wrs.nto and to use the same approach
as Arm for the wait loop, since we can't have arbitrary C code between
the load-reserved and the wrs. Then, hwprobe support is added (since the
instructions are also usable from usermode), and finally KVM is
taught about wrs.nto, allowing guests to see and use the Zawrs
extension.

We still don't have test results from hardware, and it's not possible to
prove that using Zawrs is a win when testing on QEMU, not even when
oversubscribing VCPUs to guests. However, it is possible to use KVM
selftests to force a scenario where we can prove Zawrs does its job and
does it well. [4] is a test which does this and, on my machine, without
Zawrs it takes 16 seconds to complete and with Zawrs it takes 0.25
seconds.

This series is also available here [1]. In order to use QEMU for testing
a build with [2] is needed. In order to enable guests to use Zawrs with
KVM using kvmtool, the branch at [3] may be used.

[1] https://github.com/jones-drew/linux/commits/riscv/zawrs-v3/
[2] https://lore.kernel.org/all/20240312152901.512001-2-ajones@ventanamicro.com/
[3] https://github.com/jones-drew/kvmtool/commits/riscv/zawrs/
[4] cb2beccebc

Link: https://lore.kernel.org/r/20240426100820.14762-8-ajones@ventanamicro.com

* b4-shazam-merge:
  KVM: riscv: selftests: Add Zawrs extension to get-reg-list test
  KVM: riscv: Support guest wrs.nto
  riscv: hwprobe: export Zawrs ISA extension
  riscv: Add Zawrs support for spinlocks
  dt-bindings: riscv: Add Zawrs ISA extension description
  riscv: Provide a definition for 'pause'

Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-12 08:55:29 -07:00
Andrew Jones
86d6a86e59
KVM: riscv: Support guest wrs.nto
When a guest traps on wrs.nto, call kvm_vcpu_on_spin() to attempt
to yield to the lock holding VCPU. Also extend the KVM ISA extension
ONE_REG interface to allow KVM userspace to detect and enable the
Zawrs extension for the Guest/VM.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Acked-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240426100820.14762-13-ajones@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-12 08:54:51 -07:00
Andrew Jones
244c18fbf6
riscv: hwprobe: export Zawrs ISA extension
Export Zawrs ISA extension through hwprobe.

[Palmer: there's a gap in the numbers here as there will be a merge
conflict when this is picked up.  To avoid confusion I just set the
hwprobe ID to match what it would be post-merge.]

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Link: https://lore.kernel.org/r/20240426100820.14762-12-ajones@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-12 08:53:50 -07:00
Paolo Bonzini
c8b8b8190a LoongArch KVM changes for v6.11
1. Add ParaVirt steal time support.
 2. Add some VM migration enhancement.
 3. Add perf kvm-stat support for loongarch.
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmaOS6UWHGNoZW5odWFj
 YWlAa2VybmVsLm9yZwAKCRAChivD8uImehejD/9pACGe3h3krXLcFVWXOFIu5Hpc
 5kQLP0lSPJ/o5Xs8t/oPLrnDX70z90wXI1LOmltc7h32MSwFa2l8COQh+sN5eJBQ
 PNyt7u7bMipp0yJS4Gl3LQQ5vklcGOSpQc/gbeXnVx8J/tz+Mo9YGGLIXVRXRM6W
 Ri8D2VVFiwzQQYeTpPo1u1Ob8C6mA4KOppwvhscMTM3vj4NMbsinBzRnR0lG0Tdw
 meFhxDPly1Ksxsbnj9UGO6UnEY0A2SLONs6MiO4y4DtoqoDlw/lbqFJuYo4vvbx1
 pxtjyirD/PX/wjslQFWUOuU0hMfAodera+JupZ5BZWfcG8FltA4DQfDsm/U9RjK/
 7gGNnr8Xk2/tp6+4AVV+HU2iTgRvq+mXCL72zSy2Y4r7ElBAANDfk4n+Zn/PWisn
 U9wwV8Ue7tVB15BRpRsg77NzBidiCFEe/6flWYiX2y24ke71gwDJBGUy8hMdKt6t
 4Cq8atsU0MvDAzfYMsK9JjskJp4UFq6wb1tXbbuADM4TDhnzlK6s6h3vM+pFlh/f
 my7fDH8/2qsCWhBDM4pmsJskVp+I1GOk/80RjTQISwx7iHktJWvxNYTaisK2fvD5
 Qs1IUWfNFbDX0Lr0QpN6j6X4rZkghR4R6XoFkd4nkicwi+UHVn3oK9GSqv24QJn9
 7+Ev3dfRTUYLd6mC4Q==
 =DpIK
 -----END PGP SIGNATURE-----

Merge tag 'loongarch-kvm-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD

LoongArch KVM changes for v6.11

1. Add ParaVirt steal time support.
2. Add some VM migration enhancement.
3. Add perf kvm-stat support for loongarch.
2024-07-12 11:24:12 -04:00
Paolo Bonzini
60d2b2f3c4 KVM/riscv changes for 6.11
- Redirect AMO load/store access fault traps to guest
 - Perf kvm stat support for RISC-V
 - Use HW IMSIC guest files when available
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmaRDW4ACgkQrUjsVaLH
 LAcU2Q/+IaL17M8D8ueOcbmCMqZRReyVdR9vH6q87E9NJYRH2dewZ656bNQnnU20
 3hkbHOnF+NJAHJ0SfXwqNTVkJcQ8u+F3Xui4DlnFZ/lkpcWpvT/DRI5SCjIjiB/G
 SS/xWaRoSjvVJ7M8SyQhHUb2Y/tiDRXOOEl59ROGAKjzC3SY5/NJJ6g5FeE5akT8
 /Q7WisZmc+ZH+a9EEOnl+Do7AFakrlaFM5KnweamfqSlSFrQB12YNpSsmA16k6X9
 fqK/xPQTjeNakdQDPKw8INCbXkt8dsnlrPS6ivL0FCVf38aIJK0jxyLk9JbZGBK8
 +dGCJOLVJontEyOVTYheq2oWv40xAlkXDjLNbnz+Nf7Sau8evFBpE2mPnbUBoGZi
 fu5UCddSw3CFwrFNM+qiBRPz/mNuUpCC4pCh8yJSCDZ374ew9ili2l3Nb2IvBcJ2
 36lQuxlPVTPOv1J76/WtYwsSwaYBHHcBshweTJCkAkezp0d/wAE8bpaw3n4YnfSn
 l4u8/rrnEBb3Cd9cbW1Vk77Vw5e02RlZY5T+JLj7TXWSAFzstYxMpLsf097tqqcn
 vY1iTrpxTcJuY0Rra3SI05eKgliXI5snh08xlW2NiVxu8NjjZMU73b6tg3JX8FHl
 DMCafyQUBueV2jCpwbYribpbWv/UuUl92AKyJOwZ76W/e9YVBLA=
 =atJQ
 -----END PGP SIGNATURE-----

Merge tag 'kvm-riscv-6.11-1' of https://github.com/kvm-riscv/linux into HEAD

KVM/riscv changes for 6.11

- Redirect AMO load/store access fault traps to guest
- Perf kvm stat support for RISC-V
- Use guest files for IMSIC virtualization, when available

ONE_REG support for the Zimop, Zcmop, Zca, Zcf, Zcd, Zcb and Zawrs ISA
extensions is coming through the RISC-V tree.
2024-07-12 11:19:51 -04:00
Christoph Müllner
b8ddb0df30
riscv: Add Zawrs support for spinlocks
RISC-V code uses the generic ticket lock implementation, which calls
the macros smp_cond_load_relaxed() and smp_cond_load_acquire().
Introduce a RISC-V specific implementation of smp_cond_load_relaxed()
which applies WRS.NTO of the Zawrs extension in order to reduce power
consumption while waiting and allows hypervisors to enable guests to
trap while waiting. smp_cond_load_acquire() doesn't need a RISC-V
specific implementation as the generic implementation is based on
smp_cond_load_relaxed() and smp_acquire__after_ctrl_dep() sufficiently
provides the acquire semantics.

This implementation is heavily based on Arm's approach which is the
approach Andrea Parri also suggested.

The Zawrs specification can be found here:
https://github.com/riscv/riscv-zawrs/blob/main/zawrs.adoc

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Co-developed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240426100820.14762-11-ajones@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-12 03:16:42 -07:00
Andrew Jones
6da111574b
riscv: Provide a definition for 'pause'
If we're going to provide the encoding for 'pause' in cpu_relax()
anyway, then we can drop the toolchain checks and just always use
it. The advantage of doing this is that other code that need
pause don't need to also define it (yes, another use is coming).
Add the definition to insn-def.h since it's an instruction
definition and also because insn-def.h doesn't include much, so
it's safe to include from asm/vdso/processor.h without concern for
circular dependencies.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240426100820.14762-9-ajones@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-12 03:16:39 -07:00
Jakub Kicinski
7c8267275d Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

net/sched/act_ct.c
  26488172b0 ("net/sched: Fix UAF when resolving a clash")
  3abbd7ed8b ("act_ct: prepare for stolen verdict coming from conntrack and nat engine")

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-11 12:58:13 -07:00
Clément Léger
c9b8cd139c
riscv: hwprobe: export highest virtual userspace address
Some userspace applications (OpenJDK for instance) uses the free MSBs
in pointers to insert additional information for their own logic and
need to get this information from somewhere. Currently they rely on
parsing /proc/cpuinfo "mmu=svxx" string to obtain the current value of
virtual address usable bits [1]. Since this reflect the raw supported
MMU mode, it might differ from the logical one used internally which is
why arch_get_mmap_end() is used. Exporting the highest mmapable address
through hwprobe will allow a more stable interface to be used. For that
purpose, add a new hwprobe key named
RISCV_HWPROBE_KEY_HIGHEST_VIRT_ADDRESS which will export the highest
userspace virtual address.

Link: https://github.com/openjdk/jdk/blob/master/src/hotspot/os_cpu/linux_riscv/vm_version_linux_riscv.cpp#L171 [1]
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240410144558.1104006-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-11 08:57:33 -07:00
Thorsten Blum
fb9086e95a riscv: Remove unnecessary int cast in variable_fls()
__builtin_clz() returns an int and casting the whole expression to int
is unnecessary. Remove it.

Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2024-07-10 14:30:35 -07:00
Alexandre Ghiti
16badacd8a
riscv: Improve sbi_ecall() code generation by reordering arguments
The sbi_ecall() function arguments are not in the same order as the
ecall arguments, so we end up re-ordering the registers before the
ecall which is useless and costly.

So simply reorder the arguments in the same way as expected by ecall.
Instead of reordering directly the arguments of sbi_ecall(), use a proxy
macro since the current ordering is more natural.

Before:

Dump of assembler code for function sbi_ecall:
   0xffffffff800085e0 <+0>: add sp,sp,-32
   0xffffffff800085e2 <+2>: sd s0,24(sp)
   0xffffffff800085e4 <+4>: mv t1,a0
   0xffffffff800085e6 <+6>: add s0,sp,32
   0xffffffff800085e8 <+8>: mv t3,a1
   0xffffffff800085ea <+10>: mv a0,a2
   0xffffffff800085ec <+12>: mv a1,a3
   0xffffffff800085ee <+14>: mv a2,a4
   0xffffffff800085f0 <+16>: mv a3,a5
   0xffffffff800085f2 <+18>: mv a4,a6
   0xffffffff800085f4 <+20>: mv a5,a7
   0xffffffff800085f6 <+22>: mv a6,t3
   0xffffffff800085f8 <+24>: mv a7,t1
   0xffffffff800085fa <+26>: ecall
   0xffffffff800085fe <+30>: ld s0,24(sp)
   0xffffffff80008600 <+32>: add sp,sp,32
   0xffffffff80008602 <+34>: ret

After:

Dump of assembler code for function __sbi_ecall:
   0xffffffff8000b6b2 <+0>:	add	sp,sp,-32
   0xffffffff8000b6b4 <+2>:	sd	s0,24(sp)
   0xffffffff8000b6b6 <+4>:	add	s0,sp,32
   0xffffffff8000b6b8 <+6>:	ecall
   0xffffffff8000b6bc <+10>:	ld	s0,24(sp)
   0xffffffff8000b6be <+12>:	add	sp,sp,32
   0xffffffff8000b6c0 <+14>:	ret

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Yunhui Cui <cuiyunhui@bytedance.com>
Link: https://lore.kernel.org/r/20240322112629.68170-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-10 13:30:56 -07:00
Samuel Holland
56c1c1a09a
riscv: Add tracepoints for SBI calls and returns
These are useful for measuring the latency of SBI calls. The SBI HSM
extension is excluded because those functions are called from contexts
such as cpuidle where instrumentation is not allowed.

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20240321230131.1838105-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-10 13:23:09 -07:00
Xiao Wang
a43fe27d65
riscv: Optimize crc32 with Zbc extension
As suggested by the B-ext spec, the Zbc (carry-less multiplication)
instructions can be used to accelerate CRC calculations. Currently, the
crc32 is the most widely used crc function inside kernel, so this patch
focuses on the optimization of just the crc32 APIs.

Compared with the current table-lookup based optimization, Zbc based
optimization can also achieve large stride during CRC calculation loop,
meantime, it avoids the memory access latency of the table-lookup based
implementation and it reduces memory footprint.

If Zbc feature is not supported in a runtime environment, then the
table-lookup based implementation would serve as fallback via alternative
mechanism.

By inspecting the vmlinux built by gcc v12.2.0 with default optimization
level (-O2), we can see below instruction count change for each 8-byte
stride in the CRC32 loop:

rv64: crc32_be (54->31), crc32_le (54->13), __crc32c_le (54->13)
rv32: crc32_be (50->32), crc32_le (50->16), __crc32c_le (50->16)

The compile target CPU is little endian, extra effort is needed for byte
swapping for the crc32_be API, thus, the instruction count change is not
as significant as that in the *_le cases.

This patch is tested on QEMU VM with the kernel CRC32 selftest for both
rv64 and rv32. Running the CRC32 selftest on a real hardware (SpacemiT K1)
with Zbc extension shows 65% and 125% performance improvement respectively
on crc32_test() and crc32c_test().

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240621054707.1847548-1-xiao.w.wang@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-10 13:19:50 -07:00
Arnd Bergmann
3db80c999d riscv: convert to generic syscall table
The uapi/asm/unistd_{32,64}.h and asm/syscall_table_{32,64}.h headers can
now be generated from scripts/syscall.tbl, which makes this consistent
with the other architectures that have their own syscall.tbl.

riscv has two extra system call that gets added to scripts/syscall.tbl.

The newstat and rlimit entries in the syscall_abis_64 line are for system
calls that were part of the generic ABI when riscv64 got added but are
no longer enabled by default for new architectures. Both riscv32 and
riscv64 also implement memfd_secret, which is optional for all
architectures.

Unlike all the other 32-bit architectures, the time32 and stat64
sets of syscalls are not enabled on riscv32.

Both the user visible side of asm/unistd.h and the internal syscall
table in the kernel should have the same effective contents after this.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-07-10 14:23:38 +02:00
Arnd Bergmann
505d66d1ab clone3: drop __ARCH_WANT_SYS_CLONE3 macro
When clone3() was introduced, it was not obvious how each architecture
deals with setting up the stack and keeping the register contents in
a fork()-like system call, so this was left for the architecture
maintainers to implement, with __ARCH_WANT_SYS_CLONE3 defined by those
that already implement it.

Five years later, we still have a few architectures left that are missing
clone3(), and the macro keeps getting in the way as it's fundamentally
different from all the other __ARCH_WANT_SYS_* macros that are meant
to provide backwards-compatibility with applications using older
syscalls that are no longer provided by default.

Address this by reversing the polarity of the macro, adding an
__ARCH_BROKEN_SYS_CLONE3 macro to all architectures that don't
already provide the syscall, and remove __ARCH_WANT_SYS_CLONE3
from all the other ones.

Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-07-10 14:23:38 +02:00
Paolo Abeni
7b769adc26 bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZoxN0AAKCRDbK58LschI
 g0c5AQDa3ZV9gfbN42y1zSDoM1uOgO60fb+ydxyOYh8l3+OiQQD/fLfpTY3gBFSY
 9yi/pZhw/QdNzQskHNIBrHFGtJbMxgs=
 =p1Zz
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-07-08

The following pull-request contains BPF updates for your *net-next* tree.

We've added 102 non-merge commits during the last 28 day(s) which contain
a total of 127 files changed, 4606 insertions(+), 980 deletions(-).

The main changes are:

1) Support resilient split BTF which cuts down on duplication and makes BTF
   as compact as possible wrt BTF from modules, from Alan Maguire & Eduard Zingerman.

2) Add support for dumping kfunc prototypes from BTF which enables both detecting
   as well as dumping compilable prototypes for kfuncs, from Daniel Xu.

3) Batch of s390x BPF JIT improvements to add support for BPF arena and to implement
   support for BPF exceptions, from Ilya Leoshkevich.

4) Batch of riscv64 BPF JIT improvements in particular to add 12-argument support
   for BPF trampolines and to utilize bpf_prog_pack for the latter, from Pu Lehui.

5) Extend BPF test infrastructure to add a CHECKSUM_COMPLETE validation option
   for skbs and add coverage along with it, from Vadim Fedorenko.

6) Inline bpf_get_current_task/_btf() helpers in the arm64 BPF JIT which gives
   a small 1% performance improvement in micro-benchmarks, from Puranjay Mohan.

7) Extend the BPF verifier to track the delta between linked registers in order
   to better deal with recent LLVM code optimizations, from Alexei Starovoitov.

8) Fix bpf_wq_set_callback_impl() kfunc signature where the third argument should
   have been a pointer to the map value, from Benjamin Tissoires.

9) Extend BPF selftests to add regular expression support for test output matching
   and adjust some of the selftest when compiled under gcc, from Cupertino Miranda.

10) Simplify task_file_seq_get_next() and remove an unnecessary loop which always
    iterates exactly once anyway, from Dan Carpenter.

11) Add the capability to offload the netfilter flowtable in XDP layer through
    kfuncs, from Florian Westphal & Lorenzo Bianconi.

12) Various cleanups in networking helpers in BPF selftests to shave off a few
    lines of open-coded functions on client/server handling, from Geliang Tang.

13) Properly propagate prog->aux->tail_call_reachable out of BPF verifier, so
    that x86 JIT does not need to implement detection, from Leon Hwang.

14) Fix BPF verifier to add a missing check_func_arg_reg_off() to prevent an
    out-of-bounds memory access for dynpointers, from Matt Bobrowski.

15) Fix bpf_session_cookie() kfunc to return __u64 instead of long pointer as
    it might lead to problems on 32-bit archs, from Jiri Olsa.

16) Enhance traffic validation and dynamic batch size support in xsk selftests,
    from Tushar Vyavahare.

bpf-next-for-netdev

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (102 commits)
  selftests/bpf: DENYLIST.aarch64: Remove fexit_sleep
  selftests/bpf: amend for wrong bpf_wq_set_callback_impl signature
  bpf: helpers: fix bpf_wq_set_callback_impl signature
  libbpf: Add NULL checks to bpf_object__{prev_map,next_map}
  selftests/bpf: Remove exceptions tests from DENYLIST.s390x
  s390/bpf: Implement exceptions
  s390/bpf: Change seen_reg to a mask
  bpf: Remove unnecessary loop in task_file_seq_get_next()
  riscv, bpf: Optimize stack usage of trampoline
  bpf, devmap: Add .map_alloc_check
  selftests/bpf: Remove arena tests from DENYLIST.s390x
  selftests/bpf: Add UAF tests for arena atomics
  selftests/bpf: Introduce __arena_global
  s390/bpf: Support arena atomics
  s390/bpf: Enable arena
  s390/bpf: Support address space cast instruction
  s390/bpf: Support BPF_PROBE_MEM32
  s390/bpf: Land on the next JITed instruction after exception
  s390/bpf: Introduce pre- and post- probe functions
  s390/bpf: Get rid of get_probe_mem_regno()
  ...
====================

Link: https://patch.msgid.link/20240708221438.10974-1-daniel@iogearbox.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-07-09 17:01:46 +02:00
Arnd Bergmann
95ab7b209b RISC-V Devicetrees for v6.11
Sopgho:
 Add clock support for SG2042.
 
 Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEdoBX2jyDC9ZCTwZjDCzASqG0i0IFAmaMg3wACgkQDCzASqG0
 i0Jh1wv/SVACZmbHdQ0ldCrjdPeSDQ5cN+TP2NHihFfE8+H7rzP/baHv9EnezGVQ
 Jcjw4Qj8uYqggN0wpWbWvuHC8tNPSrYIoQgbtnjyLBEjtwFINSOx7vwvldBszN1g
 ZXQuAf5KRyuBAcqUTVUBqX9bsZ9BvNvrV3xcLL9PZzH05tQiQG4J5ix9DF6wfkNW
 7O0Ibz88EnTAUOhqLWiR+tOB6Us/fN2lZPgyVHjqy3L3OqwsDOtGjMlfDuCW2SJk
 Yw9Yi1tjSCr3Vipvz5RgJfuCElgOTgJCcyW6EkR9aBmVflpBQpFFYsw88EFA7qeO
 sKWKdln2+DuX9xqneDmpVLr7uEBP3QDdI+lr+gjC07k3lVO778meOqYa+NHIFhRL
 zT+VZ9NPD311/mNXt67yUoBntA7SgeaE6hPpcYeXEpl05T4nN9rdyFFNWm/IF3qH
 7gd07Z78e8Cl02fz4V2rMdK61+2vfDbCITrCmbCChaWAdiPgI2Glo6ejGns+xzcw
 dPzLpga/
 =NA9d
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmaM+eMACgkQYKtH/8kJ
 UicYiw//ZH53sAC+TS2NiaW6GeqcXf5FOyt5uJpjG/Sga+OJvWqlL4rd9068aTGv
 Jvx0ZbFm0pml3OaVH7k23xyyGQN74v12Wnj9mZvKaTXTv2MKJfUMN+v3HkbUcju8
 OBGrQcYdxWCg7ZYSDVYdm+ni6HwxiEripjRaVLbBIiAog+HkDtFs0ynpMLDv38MN
 0rdFT5v7Zmfcz226De4zmhA2HK/1eP+vhsl3A/BqozUOk4z6J2eJQ8dUIwri8Cn3
 YAttT0DPasdi3C8DBW0YZpwRIPYcpmPpjgtUyuc6LnqwWnsw+2HPCZsvpT0P336F
 67ZHjYcKPQSPwqCicIgY94s2oqCWuDftD5kUfiQR7y6qfmQB/g0G3PCStH+DHBCC
 SXWOlRjIiACCycMVa1b7CKxCbo524/TjKd6Buf9RpxxrYV6IFaCMx6KH+rQMMa+f
 UV8yXJ7bP0OttvX4jpMelZpdH52Zk6SciJ2MHI7Ca2PHWBKeIPsWmMQeMqEHJ4hi
 ecEZIHMJ3iuhZ1aCv11LMtvFIDUVRyzgPybp241MXiCeSE0pg8QfiQazfnqidwsw
 a7hYb7ic+sCRhNx0QoFnGWQf4Y8PCf/8/oVHXyHFUGtTDM25Z4a1Prp5ax9JW0xx
 FTt7TRIxCTVTrCHWliEk2SiUbje9PWl3lCIPITOt7B9ZW5jYazA=
 =kAb0
 -----END PGP SIGNATURE-----

Merge tag 'riscv-sophgo-dt-for-v6.11' of https://github.com/sophgo/linux into soc/dt

RISC-V Devicetrees for v6.11

Sopgho:
Add clock support for SG2042.

Signed-off-by: Chen Wang <unicorn_wang@outlook.com>

* tag 'riscv-sophgo-dt-for-v6.11' of https://github.com/sophgo/linux:
  riscv: dts: add clock generator for Sophgo SG2042 SoC

Link: https://lore.kernel.org/r/PN1P287MB281861EA2B1706B430D2FA3EFEDB2@PN1P287MB2818.INDP287.PROD.OUTLOOK.COM
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-07-09 10:50:42 +02:00
Chen Wang
b1240a3951 riscv: dts: add clock generator for Sophgo SG2042 SoC
Add clock generator node to device tree for SG2042, and enable clock for
uart.

Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
2024-07-09 08:19:52 +08:00
Arnd Bergmann
43528789a0 RISC-V config update for v6.11
StarFive:
 Enable most of the options needed for the jh7100 based boards to be
 properly testable with defconfig.
 
 Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRh246EGq/8RLhDjO14tDGHoIJi0gUCZorLpAAKCRB4tDGHoIJi
 0jCRAPwP4dWXvXRr3889fZKWhwG8QgPp4WOMWUN6qEXXrc5p1gD+L6VOKPblFH6e
 4cVv8BwYHpqhhVt0P+wU6lvtq2xnLAc=
 =PAkK
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmaMPwMACgkQYKtH/8kJ
 UieEvQ//XP74fK0j/RCHTJlTzpT7J6rcl5dumQ6Y4Abt+DQTFcwC5evMmFwLcZFc
 WBVsDiuJi7NXJhIJaXUEbAjQhgAMR+lNm985Gi1dKMEuYGtUvpoNH2D4h0/RBJJH
 rnJ3FNZBwkfsXf7pF3BY8jlBQk9Qu9yGYfiD3F2sjuClLYvU0AIjv0cO7yph/pUR
 NLrgfPr748A5SnwbQDN9zO8S6uv6hjpL4qG0q/ao9UjiD76yTK61Xet27KRr9coE
 B3ylawAw484D72/iJBtoYgtqZlUDCTJQxsXIE9Fxz3kfGzCIhEV4dG0LSqqNkn8I
 nq3hUkoONt1HYkgOP5HkV4Nl51Z0Haf7Nu4h0ikORg6VFGGKeou+6MYrc/tRNgTl
 bqa0hKAyeQhr/xZuPgKFve8prIeJNoBaxLEzC1pvk++pgfyLaCMvu2/8uH0C1mhW
 dH7TxvmF/P7nsaGANOIWD7cz7d4Gear0qiFYXsNHOBRoHR2wHpqipiVTIv4cgw4A
 u4NWC9e9OKoB1HYqaUYreHkMYuocCLvwAj+Bi22kH5YgofbQx/zn4GCQaQn6XX6f
 /s9XV+Tn8kt/wz39DJr9MVTFoJ0YdUlJwCYmA+F2qz+DXDOmExj14U/sP7nnKqgR
 jEolQVmRv9PjcTjhu48P1NBQb5SsWINUZH9cz8+rUVmrdBO6vsU=
 =ib+4
 -----END PGP SIGNATURE-----

Merge tag 'riscv-config-for-v6.11' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into soc/defconfig

RISC-V config update for v6.11

StarFive:
Enable most of the options needed for the jh7100 based boards to be
properly testable with defconfig.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>

* tag 'riscv-config-for-v6.11' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
  riscv: defconfig: Enable StarFive JH7110 drivers

Link: https://lore.kernel.org/r/20240707-unused-outflank-aa127ccb2cfe@spud
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-07-08 21:33:23 +02:00
Arnd Bergmann
31f6b5a651 RISC-V Devicetrees for v6.11
T-Head:
 Last change from me before this starts going via Drew's tree is the
 addition of the SBI PMU events node for the th1520.
 
 StarFive:
 A dts for the Pin64 Star64, another board with a jh7110 SoC. This board
 is almost identical to the existing Milk-v Mars and VisionFive 2 boards
 that are already support - just with a different PHY configuration and
 only one of the two PCIe ports exposed. Additionally, the Mars and
 VisionFive 2 get their PCie configuration added.
 
 Microchip:
 A dts for the BeagleV Fire. PCIe is disabled on it for now, as some
 binding and driver changes are required.
 
 Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRh246EGq/8RLhDjO14tDGHoIJi0gUCZorVOAAKCRB4tDGHoIJi
 0j35AQDgZyo3BkwRVpEqGo4GXNWvOPPfiEAuQ+m8hlnDG8PteQD+LOkcfjK/Y2Bk
 vj6vhJYGy/9yODRXaI756szEf6VGDg8=
 =Tefh
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmaL/p0ACgkQYKtH/8kJ
 UicuDQ/9EQiTkV7dluXIl1/imUBIR1cxqH7nLDjy4A7aC5woNBIPKJA0yDI9DAGn
 1hDcQG9zqLNmtpPtfXnlIFMmHoA2DF3UNH+34/RjHkPoweg2J+t+k0SdYL+DRbJi
 cdLNChYScauZAzOllMLR82Z4k8bBNYAUheF/yMnRL6m928slx1bxYMp7u080VhVA
 eN3RbLlsvyeifl/wyhH+U7WgQvvJ5WtVfq3bEGjUo26izGTIqh6FBiguBcj0yGvH
 5tOT6NKtXfuZQPivo6hDGaedgz6FDjfGBGSk48Cg6arcktHNAeyUNwjIMv19w+si
 FW8LZiNXFB8l3nVO/CoqjEcwO5Z2dyeJvp7CQSOcJYm/3XoQSnH38nm0Qmf6+NWY
 pJ3wG0CIbPgSAt2bbumtYW/WRRtrBUEU0de5aZzXGH/iK3aF0bdkQAKj9DOmnc1h
 3dhYkxSXPMbtuS6Zf9zvlchiRyN+CnVhAzqQlf4ibQ6NgfHS9c/ZyKsbRbHGRJGm
 Q5XH/9XgSEp4hCr8/U0e3ZPnuBe1IBfAW2BTVLoKKmWpq/VLqiHvxxrlF9B8ufoU
 /j0tEGqNSC+aUCFvZtn9FMb+KPaQCqj/qRRuFaLlA2zfrv7PTOl9w6sEa6x3+PoL
 u/AIvwsZW1ptuN5Ou72gybxOL/zIEd9urchfAZfDQ+6DoXENgTw=
 =tdm/
 -----END PGP SIGNATURE-----

Merge tag 'riscv-dt-for-v6.11' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into soc/dt

RISC-V Devicetrees for v6.11

T-Head:
Last change from me before this starts going via Drew's tree is the
addition of the SBI PMU events node for the th1520.

StarFive:
A dts for the Pin64 Star64, another board with a jh7110 SoC. This board
is almost identical to the existing Milk-v Mars and VisionFive 2 boards
that are already support - just with a different PHY configuration and
only one of the two PCIe ports exposed. Additionally, the Mars and
VisionFive 2 get their PCie configuration added.

Microchip:
A dts for the BeagleV Fire. PCIe is disabled on it for now, as some
binding and driver changes are required.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>

* tag 'riscv-dt-for-v6.11' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
  riscv: dts: starfive: add PCIe dts configuration for JH7110
  riscv: dts: microchip: add an initial devicetree for the BeagleV Fire
  dt-bindings: riscv: microchip: document beaglev-fire
  riscv: dts: starfive: Update flash partition layout
  riscv: dts: thead: th1520: Add PMU event node
  riscv: dts: starfive: add Star64 board devicetree
  dt-bindings: riscv: starfive: add Star64 board compatible
  dt-bindings: riscv: Add T-HEAD C908 compatible

Link: https://lore.kernel.org/r/20240707-nuttiness-lustfully-4aaf03c991b2@spud
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-07-08 16:58:37 +02:00
Arnd Bergmann
9289b97a67 Allwinner SoC device tree changes for 6.11
This includes a commit shared with the clk tree. This commit adds clock
 and reset indices to the device tree binding, and thus is needed for
 both the device tree and driver changes.
 
 ARM64 device tree and binding-only changes
 - Add LRADC (low resolution ADC for resistor network based keys) for H616 SoC
 - Add cache information for A64, H6, and H616 SoCs
 - Correct model names and descriptions for Pine64 boards
 - Add GPADC (general purpose ADC) for H616 SoC
 - Add ADC joysticks based on GPADC for anbernic-rg35xx-h board
 - Add additional CPU OPPs for the H700 on top of existing H616 ones
 - Enable DVFS for rg35xx boards
 - Add IOMMU for H616 SoC
 
 RISC-V device tree changes
 - Add system LDOs to D1s/T113 SoC
 - Add ClockworkPi and DevTerm device trees
 -----BEGIN PGP SIGNATURE-----
 
 iQJCBAABCgAsFiEE2nN1m/hhnkhOWjtHOJpUIZwPJDAFAmaC5BEOHHdlbnNAY3Np
 ZS5vcmcACgkQOJpUIZwPJDDszRAAohtAXVlhcOZUyY6jnToNDRhRDe4WR2dKNuth
 D/sM93uBxEtHkHsV11xKdmXALYszT3c1IxA84Cj3uwBvutIK4/KQUDbtDvGedz/o
 90v3bgAR1ETf7RrQFCxpQUFYNufpnXb5ZHA5jRSdxyg9coLR7sDab0r6UCh/EzyN
 rykvB2oSmjxyG0OjBUU3IcR0B5nliGRc9esVhdSW4cGblaJaT/NrlsmUGL39hXP2
 BayJ3FFfS2izs5uHMOeQUaveadBGssc/kzJvnwy/jeM+76uY4SlH58/dl0ivWM0/
 h5mifWbMIdl5kUEkUz7OaPT9O6K/1O1/heQYrQSHNKtgD6NN9uS4QWuGyfsWcpUo
 iRXnYrscWuGQH8ICAG9upNCto2RXn1nS3+FYv/CmNQ+Nl39RpGGWKW9KPesCttdr
 OgjWGgspyBT3xh4NhrxLbEOmZaCg6ZGfr8Vp4VrXLMbSOrV6Qu8VsJcRWkfwm3yb
 tg9ONSFrSBNwmWQIqjtpWt3j81sOKkz2WYV6+hNgDyQ5HZQZWNHhOUVQ2W912TNh
 vJ9d+3jgtn9wj2DyPdkzPRaNaVK+KqRbpUIXLc7TUAJaMcHzX6+ekt1576ZBNUYL
 hFOoMvzu6k02Fmr8qUe9PX2p5Ie6ubF+2/aWGkZ0pF2cdht1Q6IDiXsSX3PW+VBU
 acB1TT0=
 =3vmg
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmaL9/8ACgkQYKtH/8kJ
 UifiHRAA0qHvjj5uaQiQnjJgOFF+Rmf3goNje2S9f35ziO2AGKqsfp651mBbZFpJ
 yFkIMlELMBLmgB9cEXfvoWlvUjITxduiMjn0aNfu4czRhhk1qiqwy+CVhDHT681g
 71s1crZ0gT/B9hNQkfHz6AmrvMl6k9NPHioaci4abRNXsGNrZuuSFNOAC82l8nrb
 PLJZtNOEbtkVP5pdBatTlsdJywOLpl1KwYvXbbC8oXvMobywUSsW/gjcHV0hUjlp
 4kwDEyi1cUGeNEuPAcp2Pp0H+WtHJsaTN1BhIy3kyCGiCXX3ShkZiaYIPwBGdUrz
 NJihJ/WoU2pMzlX5OH2L2d1KybZhOPB+kzSqw2TWyZuKNc+HY3BaXeZ8uVUz59Tv
 WSml2dz4+T816jDcBrK9sFC1uiJlbieKNJsH8Ol0FykJcfjD48tP7Y1uMhqN3ict
 y/FZ6nvPNEf7YPGGTXDjn0d5ifXAyGXLhIWxRXBP701HAzTkUPeTEdMLwj+6H1eh
 HBxFaa4yeAtH9+0c/Tf1Ge244OxbavwjJU4ufyD7X440nHqFVXQvF2FD451tBd5F
 HbXZ3cm+E8/zg55L0sAcY++5W+Aa2S+TvK64nrKcGEEDonBj3WwdNL5iwtzugWCv
 /AQg1UP3uQIyPvrwcyyoG8lenyHjOhZnACH3qbxpO9JnnwEIXQk=
 =Emn4
 -----END PGP SIGNATURE-----

Merge tag 'sunxi-dt-for-6.11' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into soc/dt

Allwinner SoC device tree changes for 6.11

This includes a commit shared with the clk tree. This commit adds clock
and reset indices to the device tree binding, and thus is needed for
both the device tree and driver changes.

ARM64 device tree and binding-only changes
- Add LRADC (low resolution ADC for resistor network based keys) for H616 SoC
- Add cache information for A64, H6, and H616 SoCs
- Correct model names and descriptions for Pine64 boards
- Add GPADC (general purpose ADC) for H616 SoC
- Add ADC joysticks based on GPADC for anbernic-rg35xx-h board
- Add additional CPU OPPs for the H700 on top of existing H616 ones
- Enable DVFS for rg35xx boards
- Add IOMMU for H616 SoC

RISC-V device tree changes
- Add system LDOs to D1s/T113 SoC
- Add ClockworkPi and DevTerm device trees

* tag 'sunxi-dt-for-6.11' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
  riscv: dts: allwinner: Add ClockworkPi and DevTerm devicetrees
  riscv: dts: allwinner: d1s-t113: Add system LDOs
  arm64: dts: allwinner: h616: add IOMMU node
  arm64: dts: allwinner: rg35xx: Enable DVFS CPU frequency scaling
  arm64: dts: allwinner: h616: add additional CPU OPPs for the H700
  arm64: dts: allwinner: anbernic-rg35xx-h: Add ADC joysticks
  arm64: dts: allwinner: h616: Add GPADC device node
  dt-bindings: clock: sun50i-h616-ccu: Add GPADC clocks
  ARM: dts: sunxi: remove duplicated entries in makefile
  arm64: dts: allwinner: Add cache information to the SoC dtsi for H616
  arm64: dts: allwinner: Add cache information to the SoC dtsi for A64
  arm64: dts: allwinner: Correct the model names for Pine64 boards
  dt-bindings: arm: sunxi: Correct the descriptions for Pine64 boards
  arm64: dts: allwinner: Add cache information to the SoC dtsi for H6
  ARM: dts: sun50i: Add LRADC node
  dt-bindings: input: sun4i-lradc-keys: Add H616 compatible

Link: https://lore.kernel.org/r/ZoQa8r1N8yi7FlPV@wens.tw
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-07-08 16:30:23 +02:00
Puranjay Mohan
a5912c37fa riscv, bpf: Optimize stack usage of trampoline
When BPF_TRAMP_F_CALL_ORIG is not set, stack space for passing arguments
on stack doesn't need to be reserved because the original function is
not called.

Only reserve space for stacked arguments when BPF_TRAMP_F_CALL_ORIG is
set.

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/bpf/20240708114758.64414-1-puranjay@kernel.org
2024-07-08 15:44:08 +02:00
Linus Torvalds
b673f2bda0 RISC-V Fixes for 6.10-rc7
* A fix for the CMODX example in therecently added icache flushing
   prctl().
 * A fix to the perf driver to avoid corrupting event data on counter
   overflows when external overflow handlers are in use.
 * A fix to clear all hardware performance monitor events on boot, to
   avoid dangling events firmware or previously booted kernels from
   triggering spuriously.
 * A fix to the perf event probing logic to avoid erroneously reporting
   the presence of unimplemented counters.  This also prevents some
   implemented counters from being reported.
 * A build fix for the vector sigreturn selftest on clang.
 * A fix to ftrace, which now requires the previously optional index
   argument to ftrace_graph_ret_addr().
 * A fix to avoid deadlocking if kexec crash handling triggers in an
   interrupt context.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmaIJBATHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiYk6D/981DUAWJ5JPsqve7PihWnhFXh7T/fm
 KZL7cNQN7/9QmqzJMD756oQCHZT2TeDTxwji4WUQo27uoS1SamsAxRWCPdW8GqDt
 GwBJeviyWDwjNMgrejWwgH3d9so+WZ4kNKfiUrY+j1vgQ8TkE4h5wMzUtOBTSgDI
 5EhHT5B5yjiRcadPshXivZAyimc6mxKJKph5v8W3BGgtLQRHs5tYop4ZkP5Utmv3
 yBie7orfMRx5fNxE6fgn0c/3r49i+KGTSCzkK+0689qPlQNt7MTj4kqDVp7xu2ll
 jl5GJNZrWSZR0cST9AG3VByqfeN2f9sbGYq5fAozkZy3idEYovtvGIU2xJVZRuIU
 ZhY+VTk0fwO8HlilTLMbyk7t99EJ4a7bXcUuD6ub3BthlKfc41PArhZgasL/dFPd
 VOSjy5hfGpJgmifSTpPXElf8jgBq6N4Kw9N+rBNkNiruEiwtWfsyqOckYAfNbULe
 Z8Nikl+3pfWlwzQrAb30X78s4ZyJyOX+XxP118lvx+UAbZofxg5qJJGo7U0Ru54r
 JPBCW8swlco6AXwvAj3yKcaL3qtKlc6f068QvcSaRELUvS2qfuJ7w4fjKdl/IT93
 QggGUyuEVG3UC1Dj961plrACXmqISTAlW8HqkdPvUgLY9rSPuTLuCR54b3fGI+n/
 3wJF6gl5leEPMw==
 =/Gsf
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - A fix for the CMODX example in the recently added icache flushing
   prctl()

 - A fix to the perf driver to avoid corrupting event data on counter
   overflows when external overflow handlers are in use

 - A fix to clear all hardware performance monitor events on boot, to
   avoid dangling events firmware or previously booted kernels from
   triggering spuriously

 - A fix to the perf event probing logic to avoid erroneously reporting
   the presence of unimplemented counters. This also prevents some
   implemented counters from being reported

 - A build fix for the vector sigreturn selftest on clang

 - A fix to ftrace, which now requires the previously optional index
   argument to ftrace_graph_ret_addr()

 - A fix to avoid deadlocking if kexec crash handling triggers in an
   interrupt context

* tag 'riscv-for-linus-6.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: kexec: Avoid deadlock in kexec crash path
  riscv: stacktrace: fix usage of ftrace_graph_ret_addr()
  riscv: selftests: Fix vsetivli args for clang
  perf: RISC-V: Check standard event availability
  drivers/perf: riscv: Reset the counter to hpmevent mapping while starting cpus
  drivers/perf: riscv: Do not update the event data if uptodate
  documentation: Fix riscv cmodx example
2024-07-05 12:22:51 -07:00
Jakub Kicinski
76ed626479 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

drivers/net/phy/aquantia/aquantia.h
  219343755e ("net: phy: aquantia: add missing include guards")
  61578f6793 ("net: phy: aquantia: add support for PHY LEDs")

drivers/net/ethernet/wangxun/libwx/wx_hw.c
  bd07a98178 ("net: txgbe: remove separate irq request for MSI and INTx")
  b501d261a5 ("net: txgbe: add FDIR ATR support")
https://lore.kernel.org/all/20240703112936.483c1975@canb.auug.org.au/

include/linux/mlx5/mlx5_ifc.h
  048a403648 ("net/mlx5: IFC updates for changing max EQs")
  99be56171f ("net/mlx5e: SHAMPO, Re-enable HW-GRO")
https://lore.kernel.org/all/20240701133951.6926b2e3@canb.auug.org.au/

Adjacent changes:

drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c
  4130c67cd1 ("wifi: iwlwifi: mvm: check vif for NULL/ERR_PTR before dereference")
  3f3126515f ("wifi: iwlwifi: mvm: add mvm-specific guard")

include/net/mac80211.h
  816c6bec09 ("wifi: mac80211: fix BSS_CHANGED_UNSOL_BCAST_PROBE_RESP")
  5a009b42e0 ("wifi: mac80211: track changes in AP's TPE")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-04 14:16:11 -07:00
Bang Li
8f65aa3223 mm: implement update_mmu_tlb() using update_mmu_tlb_range()
Let's make update_mmu_tlb() simply a generic wrapper around
update_mmu_tlb_range().  Only the latter can now be overridden by the
architecture.  We can now remove __HAVE_ARCH_UPDATE_MMU_TLB as well.

Link: https://lkml.kernel.org/r/20240522061204.117421-3-libang.li@antgroup.com
Signed-off-by: Bang Li <libang.li@antgroup.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-03 19:29:57 -07:00
Bang Li
23b1b44e6c mm: add update_mmu_tlb_range()
Patch series "Add update_mmu_tlb_range() to simplify code", v4.

This series of commits mainly adds the update_mmu_tlb_range() to batch
update tlb in an address range and implement update_mmu_tlb() using
update_mmu_tlb_range().

After commit 19eaf44954 ("mm: thp: support allocation of anonymous
multi-size THP"), We may need to batch update tlb of a certain address
range by calling update_mmu_tlb() in a loop.  Using the
update_mmu_tlb_range(), we can simplify the code and possibly reduce the
execution of some unnecessary code in some architectures.


This patch (of 3):

Add update_mmu_tlb_range(), we can batch update tlb of an address range.

Link: https://lkml.kernel.org/r/20240522061204.117421-1-libang.li@antgroup.com
Link: https://lkml.kernel.org/r/20240522061204.117421-2-libang.li@antgroup.com
Signed-off-by: Bang Li <libang.li@antgroup.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-03 19:29:57 -07:00
Song Shuai
c562ba719d
riscv: kexec: Avoid deadlock in kexec crash path
If the kexec crash code is called in the interrupt context, the
machine_kexec_mask_interrupts() function will trigger a deadlock while
trying to acquire the irqdesc spinlock and then deactivate irqchip in
irq_set_irqchip_state() function.

Unlike arm64, riscv only requires irq_eoi handler to complete EOI and
keeping irq_set_irqchip_state() will only leave this possible deadlock
without any use. So we simply remove it.

Link: https://lore.kernel.org/linux-riscv/20231208111015.173237-1-songshuaishuai@tinylab.org/
Fixes: b17d19a531 ("riscv: kexec: Fixup irq controller broken in kexec crash path")
Signed-off-by: Song Shuai <songshuaishuai@tinylab.org>
Reviewed-by: Ryo Takakura <takakura@valinux.co.jp>
Link: https://lore.kernel.org/r/20240626023316.539971-1-songshuaishuai@tinylab.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-03 13:11:30 -07:00
Puranjay Mohan
393da6cbb2
riscv: stacktrace: fix usage of ftrace_graph_ret_addr()
ftrace_graph_ret_addr() takes an `idx` integer pointer that is used to
optimize the stack unwinding. Pass it a valid pointer to utilize the
optimizations that might be available in the future.

The commit is making riscv's usage of ftrace_graph_ret_addr() match
x86_64.

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20240618145820.62112-1-puranjay@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-03 13:10:03 -07:00
Palmer Dabbelt
210ac17ded
Merge patch series "Assorted fixes in RISC-V PMU driver"
Atish Patra <atishp@rivosinc.com> says:

This series contains 3 fixes out of which the first one is a new fix
for invalid event data reported in lkml[2]. The last two are v3 of Samuel's
patch[1]. I added the RB/TB/Fixes tag and moved 1 unrelated change
to its own patch. I also changed an error message in kvm vcpu_pmu from
pr_err to pr_debug to avoid redundant failure error messages generated
due to the boot time quering of events implemented in the patch[1]

Here is the original cover letter for the patch[1]

Before this patch:
$ perf list hw

List of pre-defined events (to be used in -e or -M):

  branch-instructions OR branches                    [Hardware event]
  branch-misses                                      [Hardware event]
  bus-cycles                                         [Hardware event]
  cache-misses                                       [Hardware event]
  cache-references                                   [Hardware event]
  cpu-cycles OR cycles                               [Hardware event]
  instructions                                       [Hardware event]
  ref-cycles                                         [Hardware event]
  stalled-cycles-backend OR idle-cycles-backend      [Hardware event]
  stalled-cycles-frontend OR idle-cycles-frontend    [Hardware event]

$ perf stat -ddd true

 Performance counter stats for 'true':

              4.36 msec task-clock                       #    0.744 CPUs utilized
                 1      context-switches                 #  229.325 /sec
                 0      cpu-migrations                   #    0.000 /sec
                38      page-faults                      #    8.714 K/sec
         4,375,694      cycles                           #    1.003 GHz                         (60.64%)
           728,945      instructions                     #    0.17  insn per cycle
            79,199      branches                         #   18.162 M/sec
            17,709      branch-misses                    #   22.36% of all branches
           181,734      L1-dcache-loads                  #   41.676 M/sec
             5,547      L1-dcache-load-misses            #    3.05% of all L1-dcache accesses
     <not counted>      LLC-loads                                                               (0.00%)
     <not counted>      LLC-load-misses                                                         (0.00%)
     <not counted>      L1-icache-loads                                                         (0.00%)
     <not counted>      L1-icache-load-misses                                                   (0.00%)
     <not counted>      dTLB-loads                                                              (0.00%)
     <not counted>      dTLB-load-misses                                                        (0.00%)
     <not counted>      iTLB-loads                                                              (0.00%)
     <not counted>      iTLB-load-misses                                                        (0.00%)
     <not counted>      L1-dcache-prefetches                                                    (0.00%)
     <not counted>      L1-dcache-prefetch-misses                                               (0.00%)

       0.005860375 seconds time elapsed

       0.000000000 seconds user
       0.010383000 seconds sys

After this patch:
$ perf list hw

List of pre-defined events (to be used in -e or -M):

  branch-instructions OR branches                    [Hardware event]
  branch-misses                                      [Hardware event]
  cache-misses                                       [Hardware event]
  cache-references                                   [Hardware event]
  cpu-cycles OR cycles                               [Hardware event]
  instructions                                       [Hardware event]

$ perf stat -ddd true

 Performance counter stats for 'true':

              5.16 msec task-clock                       #    0.848 CPUs utilized
                 1      context-switches                 #  193.817 /sec
                 0      cpu-migrations                   #    0.000 /sec
                37      page-faults                      #    7.171 K/sec
         5,183,625      cycles                           #    1.005 GHz
           961,696      instructions                     #    0.19  insn per cycle
            85,853      branches                         #   16.640 M/sec
            20,462      branch-misses                    #   23.83% of all branches
           243,545      L1-dcache-loads                  #   47.203 M/sec
             5,974      L1-dcache-load-misses            #    2.45% of all L1-dcache accesses
   <not supported>      LLC-loads
   <not supported>      LLC-load-misses
   <not supported>      L1-icache-loads
   <not supported>      L1-icache-load-misses
   <not supported>      dTLB-loads
            19,619      dTLB-load-misses
   <not supported>      iTLB-loads
             6,831      iTLB-load-misses
   <not supported>      L1-dcache-prefetches
   <not supported>      L1-dcache-prefetch-misses

       0.006085625 seconds time elapsed

       0.000000000 seconds user
       0.013022000 seconds sys

[1] https://lore.kernel.org/linux-riscv/20240418014652.1143466-1-samuel.holland@sifive.com/
[2] https://lore.kernel.org/all/CC51D53B-846C-4D81-86FC-FBF969D0A0D6@pku.edu.cn/

* b4-shazam-merge:
  perf: RISC-V: Check standard event availability
  drivers/perf: riscv: Reset the counter to hpmevent mapping while starting cpus
  drivers/perf: riscv: Do not update the event data if uptodate

Link: https://lore.kernel.org/r/20240628-misc_perf_fixes-v4-0-e01cfddcf035@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-03 12:57:41 -07:00
Samuel Holland
16d3b1af09
perf: RISC-V: Check standard event availability
The RISC-V SBI PMU specification defines several standard hardware and
cache events. Currently, all of these events are exposed to userspace,
even when not actually implemented. They appear in the `perf list`
output, and commands like `perf stat` try to use them.

This is more than just a cosmetic issue, because the PMU driver's .add
function fails for these events, which causes pmu_groups_sched_in() to
prematurely stop scheduling in other (possibly valid) hardware events.

Add logic to check which events are supported by the hardware (i.e. can
be mapped to some counter), so only usable events are reported to
userspace. Since the kernel does not know the mapping between events and
possible counters, this check must happen during boot, when no counters
are in use. Make the check asynchronous to minimize impact on boot time.

Fixes: e999143459 ("RISC-V: Add perf platform driver based on SBI PMU extension")

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20240628-misc_perf_fixes-v4-3-e01cfddcf035@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-03 12:56:22 -07:00
Pu Lehui
6801b0aef7 riscv, bpf: Add 12-argument support for RV64 bpf trampoline
This patch adds 12 function arguments support for riscv64 bpf trampoline.
The current bpf trampoline supports <= sizeof(u64) bytes scalar arguments [0]
and <= 16 bytes struct arguments [1]. Therefore, we focus on the situation
where scalars are at most XLEN bits and aggregates whose total size does not
exceed 2×XLEN bits in the riscv calling convention [2].

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Acked-by: Björn Töpel <bjorn@kernel.org>
Acked-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://elixir.bootlin.com/linux/v6.8/source/kernel/bpf/btf.c#L6184 [0]
Link: https://elixir.bootlin.com/linux/v6.8/source/kernel/bpf/btf.c#L6769 [1]
Link: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/releases/download/draft-20230929-e5c800e661a53efe3c2678d71a306323b60eb13b/riscv-abi.pdf [2]
Link: https://lore.kernel.org/bpf/20240702121944.1091530-2-pulehui@huaweicloud.com
2024-07-02 16:01:30 +02:00
Linus Torvalds
651ab78190 SoC fixes for 6.10, part 2
A number of devicetree fixes came in for the rockchip platforms, correcting
 some of the address information, and reverting a change to the MMC controller
 configuration that caused regressions.
 
 Four drivers have one code change each, addressing minor build issues for
 the optee firmware driver, the litex SoC platform driver and two reset
 drivers.
 
 The riscv fixes as also simple, mainly turning off device nodes in the
 canaan dts files unless they are actually usable on a particular board.
 
 Finally, Drew takes over maintaining the THEAD RISC-V SoC platform.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmaCpVIACgkQYKtH/8kJ
 UieIfQ/+KtxzYPfLsUgSZJCeKa6d1A9EgtTBtcEn6gI4HAc5NFDGvYIIetWI/RYN
 2+zOLRdQ8t3CIi2c1sTqy1m+j5vX94p4/2WSW41zASHxN+ryz8VM2SKzE5TGV8IE
 WyHv5fBIHY97u2zegq8K4c/ze2W7bdwBd1V5tYwk7tZnm6VFNMCESQ+Q4mu7kjdy
 irCdxr0j+uDO0cGppwuGWSSR+BiCCCDhu9YjmluD9B57IIB95lyQRgdGy/V9cT9D
 yJS6VwEi+EFBNbzt7TzNrPiXvymQzDeC5K7JavfSRRxW1a/rWLmkmxripiSVZirf
 nHR7cIivj0gvjeZiM3UH/ZMPUdzRk4YXr9889EbO+JQ/iZy/1YIJHKuqLbCjAQZp
 RgjuYYAuG91aHeCHN6cSflNoC49FmmBi9k1GkBdWviU0mhBrfaNfqspMkWUuH34g
 w49H9iflFzJsh6p18h2wG3pB28zubUhLfpmbG5EOP4kXBYvgegAsY9mgSNWcdFFk
 JBxcJlToCm6ooha8bBCpsnDf8/Y1LMlUIcP2Spd/iZ0pxtF2wR8Pt80KDuVko+fj
 TZF3UTu3wuDCyHnSvX0Fc0YTQV0huVawd23upWBxpcdOa4vsScnUbMdKqBIjA6TX
 cGK+fCtVSk45KoAlRi4eLcnNmgse1G2J+ujcrs0QakLxdCx+Ggk=
 =+O9s
 -----END PGP SIGNATURE-----

Merge tag 'arm-fixes-6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC fixes from Arnd Bergmann:
 "A number of devicetree fixes came in for the rockchip platforms,
  correcting some of the address information, and reverting a change to
  the MMC controller configuration that caused regressions.

  Four drivers have one code change each, addressing minor build issues
  for the optee firmware driver, the litex SoC platform driver and two
  reset drivers.

  The riscv fixes as also simple, mainly turning off device nodes in the
  canaan dts files unless they are actually usable on a particular
  board.

  Finally, Drew takes over maintaining the THEAD RISC-V SoC platform"

* tag 'arm-fixes-6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  drivers/soc/litex: drop obsolete dependency on COMPILE_TEST
  tee: optee: ffa: Fix missing-field-initializers warning
  arm64: dts: rockchip: Add sound-dai-cells for RK3368
  arm64: dts: rockchip: Fix the i2c address of es8316 on Cool Pi 4B
  reset: hisilicon: hi6220: add missing MODULE_DESCRIPTION() macro
  reset: gpio: Fix missing gpiolib dependency for GPIO reset controller
  MAINTAINERS: thead: update Maintainer
  arm64: dts: rockchip: fix PMIC interrupt pin on ROCK Pi E
  riscv: dts: starfive: Set EMMC vqmmc maximum voltage to 3.3V on JH7110 boards
  arm64: dts: rockchip: make poweroff(8) work on Radxa ROCK 5A
  Revert "arm64: dts: rockchip: remove redundant cd-gpios from rk3588 sdmmc nodes"
  ARM: dts: rockchip: rk3066a: add #sound-dai-cells to hdmi node
  arm64: dts: rockchip: Fix the value of `dlg,jack-det-rate` mismatch on rk3399-gru
  arm64: dts: rockchip: set correct pwm0 pinctrl on rk3588-tiger
  riscv: dts: canaan: Disable I/O devices unless used
  riscv: dts: canaan: Clean up serial aliases
  arm64: dts: rockchip: Rename LED related pinctrl nodes on rk3308-rock-pi-s
  arm64: dts: rockchip: Fix SD NAND and eMMC init on rk3308-rock-pi-s
  arm64: dts: rockchip: Fix rk3308 codec@ff560000 reset-names
  arm64: dts: rockchip: Fix the DCDC_REG2 minimum voltage on Quartz64 Model B
2024-07-01 09:36:20 -07:00
Pu Lehui
2382a405c5 riscv, bpf: Use bpf_prog_pack for RV64 bpf trampoline
We used bpf_prog_pack to aggregate bpf programs into huge page to
relieve the iTLB pressure on the system. We can apply it to bpf
trampoline, as Song had been implemented it in core and x86 [0]. This
patch is going to use bpf_prog_pack to RV64 bpf trampoline. Since Song
and Puranjay have done a lot of work for bpf_prog_pack on RV64,
implementing this function will be easy.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Björn Töpel <bjorn@rivosinc.com> #riscv
Link: https://lore.kernel.org/all/20231206224054.492250-1-song@kernel.org [0]
Link: https://lore.kernel.org/bpf/20240622030437.3973492-4-pulehui@huaweicloud.com
2024-07-01 17:10:46 +02:00
Pu Lehui
9f1e16fb1f riscv, bpf: Fix out-of-bounds issue when preparing trampoline image
We get the size of the trampoline image during the dry run phase and
allocate memory based on that size. The allocated image will then be
populated with instructions during the real patch phase. But after
commit 26ef208c20 ("bpf: Use arch_bpf_trampoline_size"), the `im`
argument is inconsistent in the dry run and real patch phase. This may
cause emit_imm in RV64 to generate a different number of instructions
when generating the 'im' address, potentially causing out-of-bounds
issues. Let's emit the maximum number of instructions for the "im"
address during dry run to fix this problem.

Fixes: 26ef208c20 ("bpf: Use arch_bpf_trampoline_size")
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240622030437.3973492-3-pulehui@huaweicloud.com
2024-07-01 17:10:46 +02:00
Minda Chen
2904244a8c riscv: dts: starfive: add PCIe dts configuration for JH7110
Add PCIe dts configuraion for JH7110 SoC platform. The Star64 only has
one exposed PCIe port, so only the Mars and VisionFive 2 get two
enabled.

Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
Reviewed-by: Hal Feng <hal.feng@starfivetech.com>
[conor: squash in star64's single exposed port]
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-07-01 13:20:19 +01:00
Greg Kroah-Hartman
33827dc4ad Linux 6.10-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmaB0NweHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGkvwH/36UJRk/o6wvXnyH
 E6QjCSWo2226APyWks22NjtC3I/8Iqdvkneuh6wG0qL2sXAB078EMjUq5R81bF8H
 wWFBJwetjYTp8GEyLioMEb2wCH/J3R29dLFC4UYTplafXRGP6//xcpJaKmTxcgdR
 31IzvTPXbApZ7L3k1U6rA2bK9PNKcFCOvZlrNMUCuwMrabymHsDfOUt1DqXyg2xp
 zjqiWYBwlklozmgawSWt/mdEgkWuTcAbg+KyqDVQF59s9aj/OOwZ0j+HACq5V8CM
 quTPIAYL6CC9p7uxa69lGr/sgC0Is/BZLPX7RTZAwCgarGvnX+1HUsjDcaFCtrVg
 O6fPUV8=
 =pgUx
 -----END PGP SIGNATURE-----

Merge 6.10-rc6 into tty-next

This resolves the merge issues in the 8250 code due to some reverts in
6.10-rc6 in the console changes.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-01 14:16:48 +02:00
Samuel Holland
0ce1d34678 riscv: dts: allwinner: Add ClockworkPi and DevTerm devicetrees
Clockwork Tech manufactures several SoMs for their RasPi CM3-compatible
"ClockworkPi" mainboard. Their R-01 SoM features the Allwinner D1 SoC.
The R-01 contains only the CPU, DRAM, and always-on voltage regulation;
it does not merit a separate devicetree.

The ClockworkPi mainboard features analog audio, a MIPI-DSI panel, USB
host and peripheral ports, an Ampak AP6256 WiFi/Bluetooth module, and an
X-Powers AXP228 PMIC for managing a Li-ion battery.

The DevTerm is a complete system which extends the ClockworkPi mainboard
with a MIPI-DSI panel and a pair of expansion boards. These expansion
boards provide a fan, a USB keyboard, speakers, and a thermal printer.

Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Link: https://lore.kernel.org/r/20240622150731.1105901-4-wens@kernel.org
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2024-06-30 23:06:52 +08:00
Chen-Yu Tsai
8f2cf4442b riscv: dts: allwinner: d1s-t113: Add system LDOs
Now that the bindings for the system LDOs have been merged, the nodes
for the system LDOs can be added. These are used on the ClockworkPi.

This was originally part of Samuel's D1 device tree series [1], but was
dropped in v5 as the regulator bindings weren't merged at the time.

[1] https://lore.kernel.org/linux-sunxi/20221231233851.24923-1-samuel@sholland.org/

Link: https://lore.kernel.org/r/20240622150731.1105901-3-wens@kernel.org
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2024-06-30 23:06:51 +08:00
Linus Torvalds
de0a9f4486 RISC-V Fixes for 6.10-rc6
* A fix for vector load/store instruction decoding, which could result
   in reserved vector element length encodings decoding as valid vector
   instructions.
 * Instruction patching now aggressively flushes the local instruction
   cache, to avoid situations where patching functions on the flush path
   results in torn instructions being fetched.
 * A fix to prevent the stack walker from showing up as part of traces.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmZ+4zUTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiTXxD/wPSWbHf24Mr4CrFYbKR7lHWjGku+jG
 8LQa+B9uUgpA8XjNjeeECg7lsJq/1avbPrUlValRckUIMZPHSWK4U7aFkkPs1WFa
 87D5pA4AVkt5U8v/3c5GQ8Tod0Afa7OyFggxdglC3XFvUa5TNdn3pdv0rdE4Mx5l
 QRijFyLlhRv/D5We+exNAVJmkdHfSXQEEyEjXeb83VK+PsZzAXvHLj3omxCyQ1kH
 Kt2RyN8QpkUisFNTpSvPHiuoPjUeJvRs0JIyrO1SwBGHyYs4kg6g0KBk4YjTTXG5
 xbRVG2tMO9TS2jRRHJS7fdI+yF0X1z+t+WUG1F1WHvKxnqbWczJ0UczFgacaVSkw
 BM/yO+VPg22f6bM4H85K5GBdhN8PplFDuHDdVQ8/LDGOrQKaByrorXWq3WrbwJcq
 vVwbnBGW6v4we5COzyHvwnakzl4bEMHoUb2NVTzZM+fFleiEdx4Sg+F5Us8+UlZh
 PztwjPao8spIm81l/wXStxYzschDyonCt74/odic2LDtFBirZWzDdUInXFVzUZs7
 CUxF38XJ6SNQJBVVwQv6qisoWhy6Ca9SGKwY2GwW7Ustx3C4Eh0nrOVmI/DcHRgN
 9rGm13Qfm8eUSznTM+buWTTluZvtmZupGpAP2GhvaUgTMDIfK/vttidW9Kf4KsP8
 hn+jllIc1WgE6Q==
 =0bpZ
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - A fix for vector load/store instruction decoding, which could result
   in reserved vector element length encodings decoding as valid vector
   instructions.

 - Instruction patching now aggressively flushes the local instruction
   cache, to avoid situations where patching functions on the flush path
   results in torn instructions being fetched.

 - A fix to prevent the stack walker from showing up as part of traces.

* tag 'riscv-for-linus-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: stacktrace: convert arch_stack_walk() to noinstr
  riscv: patch: Flush the icache right after patching to avoid illegal insns
  RISC-V: fix vector insn load/store width mask
2024-06-28 16:14:59 -07:00
Jakub Kicinski
193b9b2002 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:
  e3f02f32a0 ("ionic: fix kernel panic due to multi-buffer handling")
  d9c0420999 ("ionic: Mark error paths in the data path as unlikely")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-27 12:14:11 -07:00
Arnd Bergmann
53ed12744c RISC-V Devicetree fixes for v6.10-rc5+
T-Head:
 Jisheng hasn't got enough time to look after the platform, so Drew
 Fustini is going to take over.
 
 StarFive:
 A fix for a regulator voltage range that prevented using low performance
 SD cards.
 
 Canaan:
 Cleanup for some "over eager" aliases for serial ports that did not
 exist on some boards and I/O devices disabled on boards where they were
 not actually in use.
 
 Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRh246EGq/8RLhDjO14tDGHoIJi0gUCZnWJ7wAKCRB4tDGHoIJi
 0pYNAP9f3zNfNJF+nIm10LzLnrX3U/sqYZaFlTPqpzQCpvdtTgEAvaxfN5x6zcR9
 6obpBSAlYN+7OlFVEtMfUf6UYq2rwwA=
 =AQmx
 -----END PGP SIGNATURE-----

Merge tag 'riscv-dt-fixes-for-v6.10-rc5+' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into arm/fixes

RISC-V Devicetree fixes for v6.10-rc5+

T-Head:
Jisheng hasn't got enough time to look after the platform, so Drew
Fustini is going to take over.

StarFive:
A fix for a regulator voltage range that prevented using low performance
SD cards.

Canaan:
Cleanup for some "over eager" aliases for serial ports that did not
exist on some boards and I/O devices disabled on boards where they were
not actually in use.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-27 16:09:13 +02:00
Palmer Dabbelt
60a6707f58
Merge patch series "riscv: Memory Hot(Un)Plug support"
Björn Töpel <bjorn@kernel.org> says:

From: Björn Töpel <bjorn@rivosinc.com>

================================================================
Memory Hot(Un)Plug support (and ZONE_DEVICE) for the RISC-V port
================================================================

Introduction
============

To quote "Documentation/admin-guide/mm/memory-hotplug.rst": "Memory
hot(un)plug allows for increasing and decreasing the size of physical
memory available to a machine at runtime."

This series adds memory hot(un)plugging, and ZONE_DEVICE support for
the RISC-V Linux port.

MM configuration
================

RISC-V MM has the following configuration:

 * Memory blocks are 128M, analogous to x86-64. It uses PMD
   ("hugepage") vmemmaps. From that follows that 2M (PMD) worth of
   vmemmap spans 32768 pages á 4K which gets us 128M.

 * The pageblock size is the minimum minimum virtio_mem size, and on
   RISC-V it's 2M (2^9 * 4K).

Implementation
==============

The PGD table on RISC-V is shared/copied between for all processes. To
avoid doing page table synchronization, the first patch (patch 1)
pre-allocated the PGD entries for vmemmap/direct map. By doing that
the init_mm PGD will be fixed at kernel init, and synchronization can
be avoided all together.

The following two patches (patch 2-3) does some preparations, followed
by the actual MHP implementation (patch 4-5). Then, MHP and virtio-mem
are enabled (patch 6-7), and finally ZONE_DEVICE support is added
(patch 8).

MHP and locking
===============

TL;DR: The MHP does not step on any toes, except for ptdump.
Additional locking is required for ptdump.

Long version: For v2 I spent some time digging into init_mm
synchronization/update. Here are my findings, and I'd love them to be
corrected if incorrect.

It's been a gnarly path...

The `init_mm` structure is a special mm (perhaps not a "real" one).
It's a "lazy context" that tracks kernel page table resources, e.g.,
the kernel page table (swapper_pg_dir), a kernel page_table_lock (more
about the usage below), mmap_lock, and such.

`init_mm` does not track/contain any VMAs. Having the `init_mm` is
convenient, so that the regular kernel page table walk/modify
functions can be used.

Now, `init_mm` being special means that the locking for kernel page
tables are special as well.

On RISC-V the PGD (top-level page table structure), similar to x86, is
shared (copied) with user processes. If the kernel PGD is modified, it
has to be synched to user-mode processes PGDs. This is avoided by
pre-populating the PGD, so it'll be fixed from boot.

The in-kernel pgd regions are documented in
`Documentation/arch/riscv/vm-layout.rst`.

The distinct regions are:
 * vmemmap
 * vmalloc/ioremap space
 * direct mapping of all physical memory
 * kasan
 * modules, BPF
 * kernel

Memory hotplug is the process of adding/removing memory to/from the
kernel.

Adding is done in two phases:
 1. Add the memory to the kernel
 2. Online memory, making it available to the page allocator.

Step 1 is partially architecture dependent, and updates the init_mm
page table:
 * Update the direct map page tables. The direct map is a linear map,
   representing all physical memory: `virt = phys + PAGE_OFFSET`
 * Add a `struct page` for each added page of memory. Update the
   vmemmap (virtual mapping to the `struct page`, so we can easily
   transform a kernel virtual address to a `struct page *` address.

From an MHP perspective, there are two regions of the PGD that are
updated:
 * vmemmap
 * direct mapping of all physical memory

The `struct mm_struct` has a couple of locks in play:
 * `spinlock_t page_table_lock` protects the page table, and some
    counters
 * `struct rw_semaphore mmap_lock` protect an mm's VMAs

Note again that `init_mm` does not contain any VMAs, but still uses
the mmap_lock in some places.

The `page_table_lock` was originally used to to protect all pages
tables, but more recently a split page table lock has been introduced.
The split lock has a per-table lock for the PTE and PMD tables. If
split lock is disabled, all tables are guarded by
`mm->page_table_lock` (for user processes). Split page table locks are
not used for init_mm.

MHP operations is typically synchronized using
`DEFINE_STATIC_PERCPU_RWSEM(mem_hotplug_lock)`.

Actors
------

The following non-MHP actors in the kernel traverses (read), and/or
modifies the kernel PGD.

 * `ptdump`

   Walks the entire `init_mm`, via `ptdump_walk_pgd()` with the
   `mmap_write_lock(init_mm)` taken.

   Observation: ptdump can race with MHP, and needs additional locking
   to avoid crashes/races.

 * `set_direct_*` / `arch/riscv/mm/pageattr.c`

   The `set_direct_*` functionality is used to "synchronize" the
   direct map to other kernel mappings, e.g. modules/kernel text. The
   direct map is using "as large huge table mappings as possible",
   which means that the `set_direct_*` might need to split the direct
   map.

  The `set_direct_*` functions operates with the
  `mmap_write_lock(init_mm)` taken.

  Observation: `set_direct_*` uses the direct map, but will never
  modify the same entry as MHP. If there is a mapping, that entry will
  never race with MHP. Further, MHP acts when memory is offline.

 * HVO / `mm/hugetlb_vmemmap`

   HVO optimizes the backing `struct page` for hugetlb pages, which
   means changing the "vmemmap" region. HVO can split (merge?) a
   vmemmap pmd. However, it will never race with MHP, since HVO only
   operates at online memory. HVO cannot touch memory being MHP added
   or removed.

 * `apply_to_page_range`

   Walks a range, creates pages and applies a callback (setting
   permissions) for the page.

   When creating a table, it might use `int __pte_alloc_kernel(pmd_t
   *pmd)` which takes the `init_mm.page_table_lock` to synchronize pmd
   populate.

   Used by: `mm/vmalloc.c` and `mm/kasan/shadow.c`. The KASAN callback
   takes the `init_mm.page_table_lock` to synchronize pte creation.

   Observations: `apply_to_page_range` applies to the "vmalloc/ioremap
   space" region, and "kasan" region. *Not* affected by MHP.

 * `apply_to_existing_page_range`

   Walks a range, applies a callback (setting permissions) for the
   page (no page creation).

   Used by: `kernel/bpf/arena.c` and `mm/kasan/shadow.c`. The KASAN
   callback takes the `init_mm.page_table_lock` to synchronize pte
   creation. *Not* affected by MHP regions.

 * `apply_to_existing_page_range` applies to the "vmalloc/ioremap
   space" region, and "kasan" region. *Not* affected by MHP regions.

 *  `ioremap_page_range` and `vmap_page_range`

    Uses the same internal function, and might create table entries at
    the "vmalloc/ioremap space" region. Can call
    `__pte_alloc_kernel()` which takes the `init_mm.page_table_lock`
    synchronizing pmd populate in the region. *Not* affected by MHP
    regions.

Summary:
  * MHP add will never modify the same page table entries, as any of
    the other actors.
  * MHP remove is done when memory is offlined, and will not clash
    with any of the actors.
  * Functions that walk the entire kernel page table need
    synchronization

  * It's sufficient to add the MHP lock ptdump.

Testing
=======

This series adds basic DT supported hotplugging. There is a QEMU
series enabling MHP for the RISC-V "virt" machine here: [1]

ACPI/MSI support is still in the making for RISC-V, and prior proper
(ACPI) PCI MSI support lands [2] and NUMA SRAT support [3], it hard to
try it out.

I've prepared a QEMU branch with proper ACPI GED/PC-DIMM support [4],
and a this series with the required prerequisites [5] (AIA, ACPI AIA
MADT, ACPI NUMA SRAT).

To test with virtio-mem, e.g.:
  | qemu-system-riscv64 \
  |     -machine virt,aia=aplic-imsic \
  |     -cpu rv64,v=true,vlen=256,elen=64,h=true,zbkb=on,zbkc=on,zbkx=on,zkr=on,zkt=on,svinval=on,svnapot=on,svpbmt=on \
  |     -nodefaults \
  |     -nographic -smp 8 -kernel rv64-u-boot.bin \
  |     -drive file=rootfs.img,format=raw,if=virtio \
  |     -device virtio-rng-pci \
  |     -m 16G,slots=3,maxmem=32G \
  |     -object memory-backend-ram,id=mem0,size=16G \
  |     -numa node,nodeid=0,memdev=mem0 \
  |     -serial chardev:char0 \
  |     -mon chardev=char0,mode=readline \
  |     -chardev stdio,mux=on,id=char0 \
  |     -device pci-serial,id=serial0,chardev=char0 \
  |     -object memory-backend-ram,id=vmem0,size=2G \
  |     -device virtio-mem-pci,id=vm0,memdev=vmem0,node=0

where "rv64-u-boot.bin" is U-boot with EFI/ACPI-support (use [6] if
you're lazy).

In the QEMU monitor:
  | (qemu) info memory-devices
  | (qemu) qom-set vm0 requested-size 1G

...to test DAX/KMEM, use the follow QEMU parameters:
  |  -object memory-backend-file,id=mem1,share=on,mem-path=virtio_pmem.img,size=4G \
  |  -device virtio-pmem-pci,memdev=mem1,id=nv1

and the regular ndctl/daxctl dance.

If you're brave to try the ACPI branch, add "acpi=on" to "-machine
virt", and test PC-DIMM MHP (in addition to virtio-{p},mem):

In the QEMU monitor:
  | (qemu) object_add memory-backend-ram,id=mem1,size=1G
  | (qemu) device_add pc-dimm,id=dimm1,memdev=mem1

You can also try hot-remove with some QEMU options, say:
  | -object memory-backend-file,id=mem-1,size=256M,mem-path=/pagesize-2MB
  | -device pc-dimm,id=mem1,memdev=mem-1
  | -object memory-backend-file,id=mem-2,size=1G,mem-path=/pagesize-1GB
  | -device pc-dimm,id=mem2,memdev=mem-2
  | -object memory-backend-file,id=mem-3,size=256M,mem-path=/pagesize-2MB
  | -device pc-dimm,id=mem3,memdev=mem-3

Remove "acpi=on" to run with DT.

Thanks to Alex, Andrew, David, and Oscar for all
comments/tests/fixups.

References
==========

[1] https://lore.kernel.org/qemu-devel/20240521105635.795211-1-bjorn@kernel.org/
[2] https://lore.kernel.org/linux-riscv/20240501121742.1215792-1-sunilvl@ventanamicro.com/
[3] https://lore.kernel.org/linux-riscv/cover.1713778236.git.haibo1.xu@intel.com/
[4] https://github.com/bjoto/qemu/commits/virtio-mem-pc-dimm-mhp-acpi-v2/
[5] https://github.com/bjoto/linux/commits/mhp-v4-acpi
[6] https://github.com/bjoto/riscv-rootfs-utils/tree/acpi

* b4-shazam-merge:
  riscv: Enable DAX VMEMMAP optimization
  riscv: mm: Add support for ZONE_DEVICE
  virtio-mem: Enable virtio-mem for RISC-V
  riscv: Enable memory hotplugging for RISC-V
  riscv: mm: Take memory hotplug read-lock during kernel page table dump
  riscv: mm: Add memory hotplugging support
  riscv: mm: Add pfn_to_kaddr() implementation
  riscv: mm: Refactor create_linear_mapping_range() for memory hot add
  riscv: mm: Change attribute from __init to __meminit for page functions
  riscv: mm: Pre-allocate vmemmap/direct map/kasan PGD entries
  riscv: mm: Properly forward vmemmap_populate() altmap parameter

Link: https://lore.kernel.org/r/20240605114100.315918-1-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 08:43:07 -07:00
Björn Töpel
4705c1571a
riscv: Enable DAX VMEMMAP optimization
Now that DAX is usable, enable the DAX VMEMMAP optimization as well.

Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-12-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 08:42:47 -07:00
Björn Töpel
216e04bf1e
riscv: mm: Add support for ZONE_DEVICE
ZONE_DEVICE pages need DEVMAP PTEs support to function
(ARCH_HAS_PTE_DEVMAP). Claim another RSW (reserved for software) bit
in the PTE for DEVMAP mark, add the corresponding helpers, and enable
ARCH_HAS_PTE_DEVMAP for riscv64.

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-11-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 08:42:46 -07:00
Björn Töpel
f8c2a24055
riscv: Enable memory hotplugging for RISC-V
Enable ARCH_ENABLE_MEMORY_HOTPLUG and ARCH_ENABLE_MEMORY_HOTREMOVE for
RISC-V.

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-9-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 08:42:44 -07:00
Björn Töpel
37992b7f10
riscv: mm: Take memory hotplug read-lock during kernel page table dump
During memory hot remove, the ptdump functionality can end up touching
stale data. Avoid any potential crashes (or worse), by holding the
memory hotplug read-lock while traversing the page table.

This change is analogous to arm64's commit bf2b59f60e ("arm64/mm:
Hold memory hotplug lock while walking for kernel page table dump").

Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-8-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 08:42:43 -07:00
Björn Töpel
c75a74f4ba
riscv: mm: Add memory hotplugging support
For an architecture to support memory hotplugging, a couple of
callbacks needs to be implemented:

 arch_add_memory()
  This callback is responsible for adding the physical memory into the
  direct map, and call into the memory hotplugging generic code via
  __add_pages() that adds the corresponding struct page entries, and
  updates the vmemmap mapping.

 arch_remove_memory()
  This is the inverse of the callback above.

 vmemmap_free()
  This function tears down the vmemmap mappings (if
  CONFIG_SPARSEMEM_VMEMMAP is enabled), and also deallocates the
  backing vmemmap pages. Note that for persistent memory, an
  alternative allocator for the backing pages can be used; The
  vmem_altmap. This means that when the backing pages are cleared,
  extra care is needed so that the correct deallocation method is
  used.

 arch_get_mappable_range()
  This functions returns the PA range that the direct map can map.
  Used by the MHP internals for sanity checks.

The page table unmap/teardown functions are heavily based on code from
the x86 tree. The same remove_pgd_mapping() function is used in both
vmemmap_free() and arch_remove_memory(), but in the latter function
the backing pages are not removed.

Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-7-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 08:42:42 -07:00
Björn Töpel
6e6c5e21b8
riscv: mm: Add pfn_to_kaddr() implementation
The pfn_to_kaddr() function is used by KASAN's memory hotplugging
path. Add the missing function to the RISC-V port, so that it can be
built with MHP and CONFIG_KASAN.

Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-6-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 08:42:41 -07:00
Björn Töpel
007480fe84
riscv: mm: Refactor create_linear_mapping_range() for memory hot add
Add a parameter to the direct map setup function, so it can be used in
arch_add_memory() later.

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-5-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 08:42:40 -07:00
Björn Töpel
fe122b89da
riscv: mm: Change attribute from __init to __meminit for page functions
Prepare for memory hotplugging support by changing from __init to
__meminit for the page table functions that are used by the upcoming
architecture specific callbacks.

Changing the __init attribute to __meminit, avoids that the functions
are removed after init. The __meminit attribute makes sure the
functions are kept in the kernel text post init, but only if memory
hotplugging is enabled for the build.

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-4-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 08:42:39 -07:00
Björn Töpel
66673099f7
riscv: mm: Pre-allocate vmemmap/direct map/kasan PGD entries
The RISC-V port copies the PGD table from init_mm/swapper_pg_dir to
all userland page tables, which means that if the PGD level table is
changed, other page tables has to be updated as well.

Instead of having the PGD changes ripple out to all tables, the
synchronization can be avoided by pre-allocating the PGD entries/pages
at boot, avoiding the synchronization all together.

This is currently done for the bpf/modules, and vmalloc PGD regions.
Extend this scheme for the PGD regions touched by memory hotplugging.

Prepare the RISC-V port for memory hotplug by pre-allocate
vmemmap/direct map/kasan entries at the PGD level. This will roughly
waste ~128 (plus 32 if KASAN is enabled) worth of 4K pages when memory
hotplugging is enabled in the kernel configuration.

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-3-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 08:42:39 -07:00
Björn Töpel
e3ecf2fdc8
riscv: mm: Properly forward vmemmap_populate() altmap parameter
Make sure that the altmap parameter is properly passed on to
vmemmap_populate_hugepages().

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-2-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 08:42:37 -07:00
Haibo Xu
d6ecd18893
riscv: dmi: Add SMBIOS/DMI support
Enable the dmi driver for riscv which would allow access the
SMBIOS info through some userspace file(/sys/firmware/dmi/*).

The change was based on that of arm64 and has been verified
by dmidecode tool.

Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20240613065507.287577-1-haibo1.xu@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 08:02:33 -07:00
Alexandre Ghiti
50b5bae5be
riscv: Implement pte_accessible()
Like other architectures, a pte is accessible if it is present or if
there is a pending tlb flush and the pte is protnone (which could be the
case when a pte is downgraded to protnone before a flush tlb is
executed).

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240128115953.25085-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:56:26 -07:00
Palmer Dabbelt
914e618b43
Merge patch series "Add support for a few Zc* extensions, Zcmop and Zimop"
Clément Léger <cleger@rivosinc.com> says:

Add support for (yet again) more RVA23U64 missing extensions. Add
support for Zimop, Zcmop, Zca, Zcf, Zcd and Zcb extensions ISA string
parsing, hwprobe and kvm support. Zce, Zcmt and Zcmp extensions have
been left out since they target microcontrollers/embedded CPUs and are
not needed by RVA23U64.

Since Zc* extensions states that C implies Zca, Zcf (if F and RV32), Zcd
(if D), this series modifies the way ISA string is parsed and now does
it in two phases. First one parses the string and the second one
validates it for the final ISA description.

* b4-shazam-merge:
  KVM: riscv: selftests: Add Zcmop extension to get-reg-list test
  RISC-V: KVM: Allow Zcmop extension for Guest/VM
  riscv: hwprobe: export Zcmop ISA extension
  riscv: add ISA extension parsing for Zcmop
  dt-bindings: riscv: add Zcmop ISA extension description
  KVM: riscv: selftests: Add some Zc* extensions to get-reg-list test
  RISC-V: KVM: Allow Zca, Zcf, Zcd and Zcb extensions for Guest/VM
  riscv: hwprobe: export Zca, Zcf, Zcd and Zcb ISA extensions
  riscv: add ISA parsing for Zca, Zcf, Zcd and Zcb
  riscv: add ISA extensions validation callback
  dt-bindings: riscv: add Zca, Zcf, Zcd and Zcb ISA extension description
  KVM: riscv: selftests: Add Zimop extension to get-reg-list test
  RISC-V: KVM: Allow Zimop extension for Guest/VM
  riscv: hwprobe: export Zimop ISA extension
  riscv: add ISA extension parsing for Zimop
  dt-bindings: riscv: add Zimop ISA extension description

Link: https://lore.kernel.org/r/20240619113529.676940-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:55:02 -07:00
Clément Léger
29cf9b803e
RISC-V: KVM: Allow Zcmop extension for Guest/VM
Extend the KVM ISA extension ONE_REG interface to allow KVM user space
to detect and enable Zcmop extension for Guest/VM.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Acked-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240619113529.676940-16-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:54:59 -07:00
Clément Léger
fc078ea317
riscv: hwprobe: export Zcmop ISA extension
Export Zcmop ISA extension through hwprobe.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
Link: https://lore.kernel.org/r/20240619113529.676940-15-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:54:58 -07:00
Clément Léger
164d644059
riscv: add ISA extension parsing for Zcmop
Add parsing for Zcmop ISA extension which was ratified in commit
c732a4f39a4c ("Zcmop is ratified/1.0") of the riscv-isa-manual.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240619113529.676940-14-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:54:57 -07:00
Clément Léger
d964e8f2ae
RISC-V: KVM: Allow Zca, Zcf, Zcd and Zcb extensions for Guest/VM
Extend the KVM ISA extension ONE_REG interface to allow KVM user space
to detect and enable Zca, Zcf, Zcd and Zcb extensions for Guest/VM.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Acked-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240619113529.676940-11-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:54:54 -07:00
Clément Léger
0ad70db5eb
riscv: hwprobe: export Zca, Zcf, Zcd and Zcb ISA extensions
Export Zca, Zcf, Zcd and Zcb ISA extension through hwprobe.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240619113529.676940-10-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:54:53 -07:00
Clément Léger
ba4cd85583
riscv: add ISA parsing for Zca, Zcf, Zcd and Zcb
The Zc* standard extension for code reduction introduces new extensions.
This patch adds support for Zca, Zcf, Zcd and Zcb. Zce, Zcmt and Zcmp
are left out of this patch since they are targeting microcontrollers/
embedded CPUs instead of application processors.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240619113529.676940-9-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:54:52 -07:00
Clément Léger
625034abd5
riscv: add ISA extensions validation callback
Since a few extensions (Zicbom/Zicboz) already needs validation and
future ones will need it as well (Zc*) add a validate() callback to
struct riscv_isa_ext_data. This require to rework the way extensions are
parsed and split it in two phases. First phase is isa string or isa
extension list parsing and consists in enabling all the extensions in a
temporary bitmask (source isa) without any validation. The second step
"resolves" the final isa bitmap, handling potential missing dependencies.
The mechanism is quite simple and simply validate each extension
described in the source bitmap before enabling it in the resolved isa
bitmap. validate() callbacks can return either 0 for success,
-EPROBEDEFER if extension needs to be validated again at next loop. A
previous ISA bitmap is kept to avoid looping multiple times if an
extension dependencies are never satisfied until we reach a stable
state. In order to avoid any potential infinite looping, allow looping
a maximum of the number of extension we handle. Zicboz and Zicbom
extensions are modified to use this validation mechanism.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240619113529.676940-8-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:54:51 -07:00
Clément Léger
fb2a3d63ef
RISC-V: KVM: Allow Zimop extension for Guest/VM
Extend the KVM ISA extension ONE_REG interface to allow KVM user space
to detect and enable Zimop extension for Guest/VM.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Acked-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240619113529.676940-5-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:54:48 -07:00
Clément Léger
36f8960de8
riscv: hwprobe: export Zimop ISA extension
Export Zimop ISA extension through hwprobe.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240619113529.676940-4-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:54:47 -07:00
Clément Léger
2467c2104f
riscv: add ISA extension parsing for Zimop
Add parsing for Zimop ISA extension which was ratified in commit
58220614a5f of the riscv-isa-manual.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240619113529.676940-3-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:54:46 -07:00
Palmer Dabbelt
cc2c169e34
Merge patch "riscv: stacktrace: convert arch_stack_walk() to noinstr"
This first patch in the larger series is a fix, so I'm merging it into
fixes while the rest of the patch set is still under development.

* b4-shazam-merge:
  riscv: stacktrace: convert arch_stack_walk() to noinstr

Link: https://lore.kernel.org/r/20240613-dev-andyc-dyn-ftrace-v4-v1-0-1a538e12c01e@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:38:02 -07:00
Andy Chiu
23b2188920
riscv: stacktrace: convert arch_stack_walk() to noinstr
arch_stack_walk() is called intensively in function_graph when the
kernel is compiled with CONFIG_TRACE_IRQFLAGS. As a result, the kernel
logs a lot of arch_stack_walk and its sub-functions into the ftrace
buffer. However, these functions should not appear on the trace log
because they are part of the ftrace itself. This patch references what
arm64 does for the smae function. So it further prevent the re-enter
kprobe issue, which is also possible on riscv.

Related-to: commit 0fbcd8abf3 ("arm64: Prohibit instrumentation on arch_stack_walk()")
Fixes: 680341382d ("riscv: add CALLER_ADDRx support")
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240613-dev-andyc-dyn-ftrace-v4-v1-1-1a538e12c01e@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:37:59 -07:00
Alexandre Ghiti
edf2d546bf
riscv: patch: Flush the icache right after patching to avoid illegal insns
We cannot delay the icache flush after patching some functions as we may
have patched a function that will get called before the icache flush.

The only way to completely avoid such scenario is by flushing the icache
as soon as we patch a function. This will probably be costly as we don't
batch the icache maintenance anymore.

Fixes: 6ca445d8af ("riscv: Fix early ftrace nop patching")
Reported-by: Conor Dooley <conor.dooley@microchip.com>
Closes: https://lore.kernel.org/linux-riscv/20240613-lubricant-breath-061192a9489a@wendy/
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andy Chiu <andy.chiu@sifive.com>
Link: https://lore.kernel.org/r/20240624082141.153871-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:37:27 -07:00
Palmer Dabbelt
d4b539adc8
Merge patch series "riscv: Various text patching improvements"
Samuel Holland <samuel.holland@sifive.com> says:

Here are a few changes to minimize calls to stop_machine() and
flush_icache_*() in the various text patching functions, as well as
to simplify the code.

* b4-shazam-merge:
  riscv: Remove extra variable in patch_text_nosync()
  riscv: Use offset_in_page() in text patching functions
  riscv: Pass patch_text() the length in bytes
  riscv: Simplify text patching loops
  riscv: kprobes: Use patch_text_nosync() for insn slots
  riscv: jump_label: Simplify assembly syntax
  riscv: jump_label: Batch icache maintenance

Link: https://lore.kernel.org/r/20240327160520.791322-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:36:35 -07:00
Samuel Holland
47742484ee
riscv: Remove extra variable in patch_text_nosync()
This cast is superfluous, and is incorrect anyway if compressed
instructions may be present.

Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240327160520.791322-8-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:36:33 -07:00
Samuel Holland
eaee548756
riscv: Use offset_in_page() in text patching functions
This is a bit easier to parse than the equivalent bit manipulation.

Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240327160520.791322-7-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:36:32 -07:00
Samuel Holland
51781ce8f4
riscv: Pass patch_text() the length in bytes
patch_text_nosync() already handles an arbitrary length of code, so this
removes a superfluous loop and reduces the number of icache flushes.

Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240327160520.791322-6-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:36:31 -07:00
Samuel Holland
5080ca0fe9
riscv: Simplify text patching loops
This reduces the number of variables and makes the code easier to parse.

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240327160520.791322-5-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:36:30 -07:00
Samuel Holland
b1756750a3
riscv: kprobes: Use patch_text_nosync() for insn slots
These instructions are not yet visible to the rest of the system,
so there is no need to do the whole stop_machine() dance.

Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20240327160520.791322-4-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:36:29 -07:00
Samuel Holland
2aa30d19cf
riscv: jump_label: Simplify assembly syntax
The idiomatic way to write "jal zero" is "j".

Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20240327160520.791322-3-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:36:28 -07:00
Samuel Holland
652b56b184
riscv: jump_label: Batch icache maintenance
Switch to the batched version of the jump label update functions so
instruction cache maintenance is deferred until the end of the update.

Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20240327160520.791322-2-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:36:27 -07:00
Yu-Wei Hsu
e325618349 RISC-V: KVM: Redirect AMO load/store access fault traps to guest
The KVM RISC-V does not delegate AMO load/store access fault traps to
VS-mode (hedeleg) so typically M-mode takes these traps and redirects
them back to HS-mode. However, upon returning from M-mode, the KVM
RISC-V running in HS-mode terminates VS-mode software.

The KVM RISC-V should redirect AMO load/store access fault traps back
to VS-mode and let the VS-mode trap handler determine the next steps.

Signed-off-by: Yu-Wei Hsu <betterman5240@gmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240429092113.70695-1-betterman5240@gmail.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-06-26 18:37:41 +05:30
Shenlin Liang
91195a90f1 RISCV: KVM: add tracepoints for entry and exit events
Like other architectures, RISCV KVM also needs to add these event
tracepoints to count the number of times kvm guest entry/exit.

Signed-off-by: Shenlin Liang <liangshenlin@eswincomputing.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20240422080833.8745-2-liangshenlin@eswincomputing.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-06-26 18:37:36 +05:30
Anup Patel
3385339296 RISC-V: KVM: Use IMSIC guest files when available
Let us discover and use IMSIC guest files from the IMSIC global
config provided by the IMSIC irqchip driver.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Link: https://lore.kernel.org/r/20240411090639.237119-3-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-06-26 18:37:34 +05:30
Anup Patel
e5b088c1dc RISC-V: KVM: Share APLIC and IMSIC defines with irqchip drivers
We have common APLIC and IMSIC headers available under
include/linux/irqchip/ directory which are used by APLIC
and IMSIC irqchip drivers. Let us replace the use of
kvm_aia_*.h headers with include/linux/irqchip/riscv-*.h
headers.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Link: https://lore.kernel.org/r/20240411090639.237119-2-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-06-26 18:37:32 +05:30
Jesse Taube
04a2aef59c
RISC-V: fix vector insn load/store width mask
RVFDQ_FL_FS_WIDTH_MASK should be 3 bits [14-12], shifted down by 12 bits.
Replace GENMASK(3, 0) with GENMASK(2, 0).

Fixes: cd05483724 ("riscv: Allocate user's vector context in the first-use trap")
Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240606182800.415831-1-jesse@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-25 08:47:10 -07:00
Arnd Bergmann
295f10061a syscalls: mmap(): use unsigned offset type consistently
Most architectures that implement the old-style mmap() with byte offset
use 'unsigned long' as the type for that offset, but microblaze and
riscv have the off_t type that is shared with userspace, matching the
prototype in include/asm-generic/syscalls.h.

Make this consistent by using an unsigned argument everywhere. This
changes the behavior slightly, as the argument is shifted to a page
number, and an user input with the top bit set would result in a
negative page offset rather than a large one as we use elsewhere.

For riscv, the 32-bit sys_mmap2() definition actually used a custom
type that is different from the global declaration, but this was
missed due to an incorrect type check.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-25 15:57:38 +02:00
Hal Feng
4ed81d9dd7 riscv: dts: starfive: jh7110: Add the core reset and jh7110 compatible for uarts
Add the core reset for uarts, which is necessary for uarts to work.

Signed-off-by: Hal Feng <hal.feng@starfivetech.com>
Link: https://lore.kernel.org/r/20240604084729.57239-4-hal.feng@starfivetech.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-24 16:09:11 +02:00
Rafael Passos
9919c5c98c bpf: remove unused parameter in bpf_jit_binary_pack_finalize
Fixes a compiler warning. the bpf_jit_binary_pack_finalize function
was taking an extra bpf_prog parameter that went unused.
This removves it and updates the callers accordingly.

Signed-off-by: Rafael Passos <rafael@rcpassos.me>
Link: https://lore.kernel.org/r/20240615022641.210320-2-rafael@rcpassos.me
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-06-20 19:50:26 -07:00
Conor Dooley
3f41368fbf riscv: dts: microchip: add an initial devicetree for the BeagleV Fire
Add an initial devicetree for the BeagleV Fire. This devicetree differs
from that in the BeagleBoard BSP as it has a different memory
configuration, however it will boot on the same FPGA images. PCI is
disabled for now, as the Linux PCI driver (and the binding) assume
which root port instance is in use. This will need to be fixed before
PCI can be enabled.

Link: https://www.beagleboard.org/boards/beaglev-fire
Co-developed-by: Jamie Gibbons <jamie.gibbons@microchip.com>
Signed-off-by: Jamie Gibbons <jamie.gibbons@microchip.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-06-19 12:36:55 +01:00
Shengyu Qu
3c1f81a1b5 riscv: dts: starfive: Set EMMC vqmmc maximum voltage to 3.3V on JH7110 boards
Currently, for JH7110 boards with EMMC slot, vqmmc voltage for EMMC is
fixed to 1.8V, while the spec needs it to be 3.3V on low speed mode and
should support switching to 1.8V when using higher speed mode. Since
there are no other peripherals using the same voltage source of EMMC's
vqmmc(ALDO4) on every board currently supported by mainline kernel,
regulator-max-microvolt of ALDO4 should be set to 3.3V.

Cc: stable@vger.kernel.org
Signed-off-by: Shengyu Qu <wiagn233@outlook.com>
Fixes: 7dafcfa79c ("riscv: dts: starfive: enable DCDC1&ALDO4 node in axp15060")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-06-19 12:19:32 +01:00
Matthias Brugger
edbce932b1 riscv: dts: starfive: Update flash partition layout
Up to now, the describe flash partition layout has some gaps.
Use the whole flash chip by getting rid of the gaps.

Suggested-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Reviewed-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Reviewed-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-06-19 11:05:43 +01:00
Inochi Amaoto
c61fea676b riscv: dts: thead: th1520: Add PMU event node
T-HEAD th1520 uses standard C910 chip and its pmu is already supported
by OpenSBI.

Add the pmu event description for T-HEAD th1520 SoC.

Signed-off-by: Inochi Amaoto <inochiama@outlook.com>
Link: https://www.xrvm.com/product/xuantie/4240217381324001280?spm=xrvm.27140568.0.0.7f979b29nzIa1m
Reviewed-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-06-19 11:05:43 +01:00
Henry Bell
2606bf583b riscv: dts: starfive: add Star64 board devicetree
The Pine64 Star64 is a development board based on the Starfive JH7110 SoC.
The board features:

- JH7110 SoC
- 4/8 GiB LPDDR4 DRAM
- AXP15060 PMIC
- 40 pin GPIO header
- 1x USB 3.0 host port
- 3x USB 2.0 host port
- 1x eMMC slot
- 1x MicroSD slot
- 1x QSPI Flash
- 2x 1Gbps Ethernet port
- 1x HDMI port
- 1x 4-lane DSI
- 1x 2-lane CSI
- 1x PCIe 2.0 x1 lane

Signed-off-by: Henry Bell <dmoo_dv@protonmail.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-06-19 11:05:43 +01:00
Haylen Chu
890182bb3d riscv: dts: sophgo: disable write-protection for milkv duo
Milkv Duo does not have a write-protect pin, so disable write protect
to prevent SDcards misdetected as read-only.

Fixes: 89a7056ed4 ("riscv: dts: sophgo: add sdcard support for milkv duo")
Signed-off-by: Haylen Chu <heylenay@outlook.com>
Link: https://lore.kernel.org/r/SEYPR01MB4221943C7B101DD2318DA0D3D7CE2@SEYPR01MB4221.apcprd01.prod.exchangelabs.com
Signed-off-by: Inochi Amaoto <inochiama@outlook.com>
Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
2024-06-19 08:46:03 +08:00
David Matlack
a6816314af KVM: Introduce vcpu->wants_to_run
Introduce vcpu->wants_to_run to indicate when a vCPU is in its core run
loop, i.e. when the vCPU is running the KVM_RUN ioctl and immediate_exit
was not set.

Replace all references to vcpu->run->immediate_exit with
!vcpu->wants_to_run to avoid TOCTOU races with userspace. For example, a
malicious userspace could invoked KVM_RUN with immediate_exit=true and
then after KVM reads it to set wants_to_run=false, flip it to false.
This would result in the vCPU running in KVM_RUN with
wants_to_run=false. This wouldn't cause any real bugs today but is a
dangerous landmine.

Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20240503181734.1467938-2-dmatlack@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-18 09:20:01 -07:00
Jakub Kicinski
4c7d3d79c7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts, no adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-13 13:13:46 -07:00
Hal Feng
d8a7d89abb riscv: defconfig: Enable StarFive JH7110 drivers
Add support for StarFive JH7110 SoC and VisionFive 2 board.
- Cache
- Temperature sensor
- PMIC (AXP15060)
- Ethernet PHY (YT8531)
- Restart GPIO
- RNG
- I2C
- SPI
- Quad SPI
- PCIe
- USB & USB 2.0 PHY & PCIe 2.0/USB 3.0 PHY
- Audio (I2S / TDM / PWM-DAC)
- MIPI-CSI2 RX & D-PHY RX

Signed-off-by: Hal Feng <hal.feng@starfivetech.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-06-12 23:16:55 +01:00
Sean Christopherson
2a27c43140 KVM: Delete the now unused kvm_arch_sched_in()
Delete kvm_arch_sched_in() now that all implementations are nops.

Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Acked-by: Kai Huang <kai.huang@intel.com>
Link: https://lore.kernel.org/r/20240522014013.1672962-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-11 14:18:45 -07:00
Steven Rostedt (Google)
5f7fb89a11 function_graph: Everyone uses HAVE_FUNCTION_GRAPH_RET_ADDR_PTR, remove it
All architectures that implement function graph also implements
HAVE_FUNCTION_GRAPH_RET_ADDR_PTR. Remove it, as it is no longer a
differentiator.

Link: https://lore.kernel.org/linux-trace-kernel/20240611031737.982047614@goodmis.org

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-06-11 11:18:24 -04:00
Jakub Kicinski
b1156532bc bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZmIsRAAKCRDbK58LschI
 g4SSAP0bkl6rPMn7zp1h+/l7hlvpp2aVOmasBTe8hIhAGUbluwD/TGq4sNsGgXFI
 i4tUtFRhw8pOjy2guy6526qyJvBs8wY=
 =WMhY
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-06-06

We've added 54 non-merge commits during the last 10 day(s) which contain
a total of 50 files changed, 1887 insertions(+), 527 deletions(-).

The main changes are:

1) Add a user space notification mechanism via epoll when a struct_ops
   object is getting detached/unregistered, from Kui-Feng Lee.

2) Big batch of BPF selftest refactoring for sockmap and BPF congctl
   tests, from Geliang Tang.

3) Add BTF field (type and string fields, right now) iterator support
   to libbpf instead of using existing callback-based approaches,
   from Andrii Nakryiko.

4) Extend BPF selftests for the latter with a new btf_field_iter
   selftest, from Alan Maguire.

5) Add new kfuncs for a generic, open-coded bits iterator,
   from Yafang Shao.

6) Fix BPF selftests' kallsyms_find() helper under kernels configured
   with CONFIG_LTO_CLANG_THIN, from Yonghong Song.

7) Remove a bunch of unused structs in BPF selftests,
   from David Alan Gilbert.

8) Convert test_sockmap section names into names understood by libbpf
   so it can deduce program type and attach type, from Jakub Sitnicki.

9) Extend libbpf with the ability to configure log verbosity
   via LIBBPF_LOG_LEVEL environment variable, from Mykyta Yatsenko.

10) Fix BPF selftests with regards to bpf_cookie and find_vma flakiness
    in nested VMs, from Song Liu.

11) Extend riscv32/64 JITs to introduce shift/add helpers to generate Zba
    optimization, from Xiao Wang.

12) Enable BPF programs to declare arrays and struct fields with kptr,
    bpf_rb_root, and bpf_list_head, from Kui-Feng Lee.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (54 commits)
  selftests/bpf: Drop useless arguments of do_test in bpf_tcp_ca
  selftests/bpf: Use start_test in test_dctcp in bpf_tcp_ca
  selftests/bpf: Use start_test in test_dctcp_fallback in bpf_tcp_ca
  selftests/bpf: Add start_test helper in bpf_tcp_ca
  selftests/bpf: Use connect_to_fd_opts in do_test in bpf_tcp_ca
  libbpf: Auto-attach struct_ops BPF maps in BPF skeleton
  selftests/bpf: Add btf_field_iter selftests
  selftests/bpf: Fix send_signal test with nested CONFIG_PARAVIRT
  libbpf: Remove callback-based type/string BTF field visitor helpers
  bpftool: Use BTF field iterator in btfgen
  libbpf: Make use of BTF field iterator in BTF handling code
  libbpf: Make use of BTF field iterator in BPF linker code
  libbpf: Add BTF field iterator
  selftests/bpf: Ignore .llvm.<hash> suffix in kallsyms_find()
  selftests/bpf: Fix bpf_cookie and find_vma in nested VM
  selftests/bpf: Test global bpf_list_head arrays.
  selftests/bpf: Test global bpf_rb_root arrays and fields in nested struct types.
  selftests/bpf: Test kptr arrays and kptrs in nested struct fields.
  bpf: limit the number of levels of a nested struct type.
  bpf: look into the types of the fields of a struct type recursively.
  ...
====================

Link: https://lore.kernel.org/r/20240606223146.23020-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-10 18:02:14 -07:00
Linus Torvalds
0a02756d91 RISC-V Fixes for 6.10-rc3
* Another fix to avoid allocating pages that overlap with ERR_PTR, which
   manifests on rv32.
 * A revert for the badaccess patch I incorrectly picked up an early
   version of.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmZjGDcTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiWGoD/9LuIorx3Oxtq8oWq42DpMHQDKIJU/p
 PlopOk0acCucfg13ZfAQDJ0H8PN64SLJ/Zhzh/+Ge+6s1FRD1b5pqsT0ZKUytsrk
 pdWVha8QAQYCm3YekLGvQCHHN1Lh3/dilOvZEJZN1l+ltnRFh+tfurB+E/Cre/mb
 pRMGDtAASXGUAkuEZqKfxEe7zQ4xRnyVx0DtcFl1xVjNP11ywwwbnkf7GyrhmDHT
 SITcjF3NaoqQ4zgEwD8MflL6k0fwhwECw3RBvwT23jfHX56ttPJRw0bn9l8UlY25
 HdvBw45bcIbva0COIMIBnYjpokj5wIOnaOhJSHbHllkVFTpBcaiO+CgDj440sAlJ
 4vvl7ll8Fn/DCEjzQgvePndjzWPyxEGkAU6b8f68wj0dfRU/FW16GE3nO/mYLjjx
 RWob8+/qh6kg51KsaCdDNHaASoeVG6ZYokIOYecc30rDZv2fjQv5ly81k6UpKcCu
 0v9el1vX7KaM6GYQkIhyJbnlCCTGLfKJWcWU/Sup7pw7S2HmAkwV3LnuUQwkVXrG
 Qc6GWTbD7YrfVVQF+yX/PrpmuhQWk6ZUZjoA20jmkoeN4IDUG5l1j6xy6I0dDp6T
 OYSYdQNuGfic2X6wO73iImFJTdleuqFVW5GjiD2LkZXziV5rMGpgaXVGzhwnltk8
 A+B53s9STVmbyQ==
 =tO1B
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - Another fix to avoid allocating pages that overlap with ERR_PTR,
   which manifests on rv32

 - A revert for the badaccess patch I incorrectly picked up an early
   version of

* tag 'riscv-for-linus-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  Revert "riscv: mm: accelerate pagefault when badaccess"
  riscv: fix overlap of allocated page and PTR_ERR
2024-06-07 14:47:38 -07:00
Jakub Kicinski
62b5bf58b9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:

drivers/net/ethernet/pensando/ionic/ionic_txrx.c
  d9c0420999 ("ionic: Mark error paths in the data path as unlikely")
  491aee894a ("ionic: fix kernel panic in XDP_TX action")

net/ipv6/ip6_fib.c
  b4cb4a1391 ("net: use unrcu_pointer() helper")
  b01e1c0307 ("ipv6: fix possible race in __fib6_drop_pcpu_from()")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-06 12:06:56 -07:00
Paolo Bonzini
b50788f7cd KVM/riscv fixes for 6.10, take #1
- No need to use mask when hart-index-bits is 0
 - Fix incorrect reg_subtype labels in kvm_riscv_vcpu_set_reg_isa_ext()
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmZZ/j0ACgkQrUjsVaLH
 LAcLyg//Tz9gFc7kTPR9kBLhDgTytS0WuBDnh8P09bGAv/36+AJZ8ZbXDTvh44Ul
 w1sLYjDVW08kCWxkTg9VPVxv467O3E7F1bduXMTBI2EBZxPul+T+p8Rv9iunAGD/
 945nkCpAMOl6xiaIFlIISvOH1aviTp/0YcmFGaE3AcbBb7kEaZ2MONuvffedSGV4
 5/MkAKtHpKj7Rekjrnpz0Nl+KVqYqCMS8PAvJNKQgG/e7MC2IHfRX+2GYnBZH1ur
 cENKUotSbmrzoeA9nmB8Yl74qgW/OA7fUU5CBb/x7JYl5oZbQs203z8ve9XIqcwk
 HtQEOp0G/EL3rmQanZl47zWG0ySI5pXRglc7DC8lnyiKjd9+Ut7VIqAVCd4H4ZC5
 5c413zPxrCtEEK+tnkrvnvK+Hks5ZtqXMfpFJQSf0twXt5JpBlgzVL6Suf48/9sA
 IRkkt4kq/XVq4ZErkeHE/pH1kwVqYMfjUHlc0L0dChcA2lNARQbfO+P215xU/H9d
 1ym02fL6P5ZCjzW0t+IfdvK6u7EbI9nAyBw/+8cW8ppPoJBP5AaiBKfWz20XcasE
 3Vktr7njqJf/eaI9oyUnTlkfaktb4Ll8WPnifZgsc9zMJQtCVw1TK4Q1bMtDJ1zX
 GcQlGlu7i1vrmkUUXPE37aQehuUBYk7rmX6B+CW56ILUm62hS0A=
 =oPq/
 -----END PGP SIGNATURE-----

Merge tag 'kvm-riscv-fixes-6.10-1' of https://github.com/kvm-riscv/linux into HEAD

KVM/riscv fixes for 6.10, take #1

- No need to use mask when hart-index-bits is 0
- Fix incorrect reg_subtype labels in kvm_riscv_vcpu_set_reg_isa_ext()
2024-06-03 13:18:18 -04:00
Xiao Wang
96a27ee76f riscv, bpf: Introduce shift add helper with Zba optimization
Zba extension is very useful for generating addresses that index into array
of basic data types. This patch introduces sh2add and sh3add helpers for
RV32 and RV64 respectively, to accelerate addressing for array of unsigned
long data.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/bpf/20240524075543.4050464-3-xiao.w.wang@intel.com
2024-06-03 16:45:23 +02:00
Palmer Dabbelt
e2c79b4c5c
Revert "riscv: mm: accelerate pagefault when badaccess"
I accidentally picked up an earlier version of this patch, which had
already landed via mm.  The patch  I picked up contains a bug, which I
kept as I thought it was a fix.  So let's just revert it.

This reverts commit 4c6c002042.

Fixes: 4c6c002042 ("riscv: mm: accelerate pagefault when badaccess")
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240530164451.21336-1-palmer@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-03 07:41:13 -07:00
Nam Cao
994af1825a
riscv: fix overlap of allocated page and PTR_ERR
On riscv32, it is possible for the last page in virtual address space
(0xfffff000) to be allocated. This page overlaps with PTR_ERR, so that
shouldn't happen.

There is already some code to ensure memblock won't allocate the last page.
However, buddy allocator is left unchecked.

Fix this by reserving physical memory that would be mapped at virtual
addresses greater than 0xfffff000.

Reported-by: Björn Töpel <bjorn@kernel.org>
Closes: https://lore.kernel.org/linux-riscv/878r1ibpdn.fsf@all.your.base.are.belong.to.us
Fixes: 76d2a0493a ("RISC-V: Init and Halt Code")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: <stable@vger.kernel.org>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Link: https://lore.kernel.org/r/20240425115201.3044202-1-namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-03 07:41:09 -07:00
Jakub Kicinski
e19de2064f Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

drivers/net/ethernet/ti/icssg/icssg_classifier.c
  abd5576b9c ("net: ti: icssg-prueth: Add support for ICSSG switch firmware")
  56a5cf538c ("net: ti: icssg-prueth: Fix start counter for ft1 filter")
https://lore.kernel.org/all/20240531123822.3bb7eadf@canb.auug.org.au/

No other adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-31 14:10:28 -07:00
Quan Zhou
c66f3b40b1 RISC-V: KVM: Fix incorrect reg_subtype labels in kvm_riscv_vcpu_set_reg_isa_ext function
In the function kvm_riscv_vcpu_set_reg_isa_ext, the original code
used incorrect reg_subtype labels KVM_REG_RISCV_SBI_MULTI_EN/DIS.
These have been corrected to KVM_REG_RISCV_ISA_MULTI_EN/DIS respectively.
Although they are numerically equivalent, the actual processing
will not result in errors, but it may lead to ambiguous code semantics.

Fixes: 613029442a ("RISC-V: KVM: Extend ONE_REG to enable/disable multiple ISA extensions")
Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/ff1c6771a67d660db94372ac9aaa40f51e5e0090.1716429371.git.zhouquan@iscas.ac.cn
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-05-31 10:40:39 +05:30
Yong-Xuan Wang
2d707b4e37 RISC-V: KVM: No need to use mask when hart-index-bit is 0
When the maximum hart number within groups is 1, hart-index-bit is set to
0. Consequently, there is no need to restore the hart ID from IMSIC
addresses and hart-index-bit settings. Currently, QEMU and kvmtool do not
pass correct hart-index-bit values when the maximum hart number is a
power of 2, thereby avoiding this issue. Corresponding patches for QEMU
and kvmtool will also be dispatched.

Fixes: 89d01306e3 ("RISC-V: KVM: Implement device interface for AIA irqchip")
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240415064905.25184-1-yongxuan.wang@sifive.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-05-31 09:47:08 +05:30
Andy Chiu
ac295b6742
riscv: vector: adjust minimum Vector requirement to ZVE32X
Make has_vector() to check for ZVE32X. Every in-kernel usage of V that
requires a more complicate version of V must then call out explicitly.

Also, change riscv_v_first_use_handler(), and boot code that calls
riscv_v_setup_vsize() to accept ZVE32X.

Most kernel/user interfaces requires minimum of ZVE32X. Thus, programs
compiled and run with ZVE32X should be supported by the kernel on most
aspects. This includes context-switch, signal, ptrace, prctl, and
hwprobe.

One exception is that ELF_HWCAP returns 'V' only if full V is supported
on the platform. This means that the system without a full V must not
rely on ELF_HWCAP to tell whether it is allowable to execute Vector
without first invoking a prctl() check.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Acked-by: Joel Granados <j.granados@samsung.com>
Link: https://lore.kernel.org/r/20240510-zve-detection-v5-7-0711bdd26c12@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-30 14:33:10 -07:00
Andy Chiu
de8f8282a9
riscv: hwprobe: add zve Vector subextensions into hwprobe interface
The following Vector subextensions for "embedded" platforms are added
into RISCV_HWPROBE_KEY_IMA_EXT_0:
 - ZVE32X
 - ZVE32F
 - ZVE64X
 - ZVE64F
 - ZVE64D

Extensions ending with an X indicates that the platform doesn't have a
vector FPU.
Extensions ending with F/D mean that whether single (F) or double (D)
precision vector operation is supported.
The number 32 or 64 follows from ZVE tells the maximum element length.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Link: https://lore.kernel.org/r/20240510-zve-detection-v5-6-0711bdd26c12@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-30 14:33:09 -07:00
Andy Chiu
1e7483542b
riscv: cpufeature: add zve32[xf] and zve64[xfd] isa detection
Multiple Vector subextensions are added. Also, the patch takes care of
the dependencies of Vector subextensions by macro expansions. So, if
some "embedded" platform only reports "zve64f" on the ISA string, the
parser is able to expand it to zve32x zve32f zve64x and zve64f.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240510-zve-detection-v5-5-0711bdd26c12@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-30 14:33:08 -07:00
Andy Chiu
98a5700dfa
riscv: cpufeature: call match_isa_ext() for single-letter extensions
Single-letter extensions may also imply multiple subextensions. For
example, Vector extension implies zve64d, and zve64d implies zve64f.

Extension parsing for "riscv,isa-extensions" has the ability to resolve
the dependency by calling match_isa_ext(). This patch makes deprecated
parser call the same function for single letter extensions.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240510-zve-detection-v5-3-0711bdd26c12@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-30 14:33:06 -07:00
Andy Chiu
38a94c4666
riscv: smp: fail booting up smp if inconsistent vlen is detected
Currently we only support Vector for SMP platforms, that is, all SMP
cores have the same vlenb. If we happen to detect a mismatching vlen, it
is better to just fail bootting it up to prevent further race/scheduling
issues.

Also, move .Lsecondary_park forward and chage `tail smp_callin` into a
regular call in the early assembly. So a core would be parked right
after a return from smp_callin. Note that a successful smp_callin
does not return.

Fixes: 7017858eb2 ("riscv: Introduce riscv_v_vsize to record size of Vector context")
Reported-by: Conor Dooley <conor.dooley@microchip.com>
Closes: https://lore.kernel.org/linux-riscv/20240228-vicinity-cornstalk-4b8eb5fe5730@spud/
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Yunhui Cui <cuiyunhui@bytedance.com>
Link: https://lore.kernel.org/r/20240510-zve-detection-v5-2-0711bdd26c12@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-30 14:33:05 -07:00
Andy Chiu
77afe3e514
riscv: vector: add a comment when calling riscv_setup_vsize()
The function would fail when it detects the calling hart's vlen doesn't
match the first one's. The boot hart is the first hart calling this
function during riscv_fill_hwcap, so it is impossible to fail here. Add
a comment about this behavior.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240510-zve-detection-v5-1-0711bdd26c12@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-30 14:33:04 -07:00
Alexandre Ghiti
1d84afaf02
riscv: Fix fully ordered LR/SC xchg[8|16]() implementations
The fully ordered versions of xchg[8|16]() using LR/SC lack the
necessary memory barriers to guarantee the order.

Fix this by matching what is already implemented in the fully ordered
versions of cmpxchg() using LR/SC.

Suggested-by: Andrea Parri <parri.andrea@gmail.com>
Reported-by: Andrea Parri <parri.andrea@gmail.com>
Closes: https://lore.kernel.org/linux-riscv/ZlYbupL5XgzgA0MX@andrea/T/#u
Fixes: a8ed2b7a2c ("riscv/cmpxchg: Implement xchg for variables of size 1 and 2")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrea Parri <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20240530145546.394248-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-30 09:43:14 -07:00
Nam Cao
7bed516174
riscv: enable HAVE_ARCH_HUGE_VMAP for XIP kernel
HAVE_ARCH_HUGE_VMAP also works on XIP kernel, so remove its dependency on
!XIP_KERNEL.

This also fixes a boot problem for XIP kernel introduced by the commit in
"Fixes:". This commit used huge page mapping for vmemmap, but huge page
vmap was not enabled for XIP kernel.

Fixes: ff172d4818 ("riscv: Use hugepage mappings for vmemmap")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: <stable@vger.kernel.org>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240526110104.470429-1-namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-30 09:42:52 -07:00
Sergey Matyukevich
a638b0461b
riscv: prevent pt_regs corruption for secondary idle threads
Top of the kernel thread stack should be reserved for pt_regs. However
this is not the case for the idle threads of the secondary boot harts.
Their stacks overlap with their pt_regs, so both may get corrupted.

Similar issue has been fixed for the primary hart, see c7cdd96eca
("riscv: prevent stack corruption by reserving task_pt_regs(p) early").
However that fix was not propagated to the secondary harts. The problem
has been noticed in some CPU hotplug tests with V enabled. The function
smp_callin stored several registers on stack, corrupting top of pt_regs
structure including status field. As a result, kernel attempted to save
or restore inexistent V context.

Fixes: 9a2451f186 ("RISC-V: Avoid using per cpu array for ordered booting")
Fixes: 2875fe0561 ("RISC-V: Add cpu_ops and modify default booting method")
Signed-off-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240523084327.2013211-1-geomatsi@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-30 09:42:51 -07:00
Jakub Kicinski
4b3529edbb bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZlWtmQAKCRDbK58LschI
 g0TUAQDT76jx7Rq1DShCtZ3eqiBMNkYczK8b+GqNsSG8YGduaAEA1jn/GN+H65Rh
 atQZ/pYAfLZflMV04+XE0GyBr5q1uQg=
 =NczG
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-05-28

We've added 23 non-merge commits during the last 11 day(s) which contain
a total of 45 files changed, 696 insertions(+), 277 deletions(-).

The main changes are:

1) Rename skb's mono_delivery_time to tstamp_type for extensibility
   and add SKB_CLOCK_TAI type support to bpf_skb_set_tstamp(),
   from Abhishek Chauhan.

2) Add netfilter CT zone ID and direction to bpf_ct_opts so that arbitrary
   CT zones can be used from XDP/tc BPF netfilter CT helper functions,
   from Brad Cowie.

3) Several tweaks to the instruction-set.rst IETF doc to address
   the Last Call review comments, from Dave Thaler.

4) Small batch of riscv64 BPF JIT optimizations in order to emit more
   compressed instructions to the JITed image for better icache efficiency,
   from Xiao Wang.

5) Sort bpftool C dump output from BTF, aiming to simplify vmlinux.h
   diffing and forcing more natural type definitions ordering,
   from Mykyta Yatsenko.

6) Use DEV_STATS_INC() macro in BPF redirect helpers to silence
   a syzbot/KCSAN race report for the tx_errors counter,
   from Jiang Yunshui.

7) Un-constify bpf_func_info in bpftool to fix compilation with LLVM 17+
   which started treating const structs as constants and thus breaking
   full BTF program name resolution, from Ivan Babrou.

8) Fix up BPF program numbers in test_sockmap selftest in order to reduce
   some of the test-internal array sizes, from Geliang Tang.

9) Small cleanup in Makefile.btf script to use test-ge check for v1.25-only
   pahole, from Alan Maguire.

10) Fix bpftool's make dependencies for vmlinux.h in order to avoid needless
    rebuilds in some corner cases, from Artem Savkov.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (23 commits)
  bpf, net: Use DEV_STAT_INC()
  bpf, docs: Fix instruction.rst indentation
  bpf, docs: Clarify call local offset
  bpf, docs: Add table captions
  bpf, docs: clarify sign extension of 64-bit use of 32-bit imm
  bpf, docs: Use RFC 2119 language for ISA requirements
  bpf, docs: Move sentence about returning R0 to abi.rst
  bpf: constify member bpf_sysctl_kern:: Table
  riscv, bpf: Try RVC for reg move within BPF_CMPXCHG JIT
  riscv, bpf: Use STACK_ALIGN macro for size rounding up
  riscv, bpf: Optimize zextw insn with Zba extension
  selftests/bpf: Handle forwarding of UDP CLOCK_TAI packets
  net: Add additional bit to support clockid_t timestamp type
  net: Rename mono_delivery_time to tstamp_type for scalabilty
  selftests/bpf: Update tests for new ct zone opts for nf_conntrack kfuncs
  net: netfilter: Make ct zone opts configurable for bpf ct helpers
  selftests/bpf: Fix prog numbers in test_sockmap
  bpf: Remove unused variable "prev_state"
  bpftool: Un-const bpf_func_info to fix it for llvm 17 and newer
  bpf: Fix order of args in call to bpf_map_kvcalloc
  ...
====================

Link: https://lore.kernel.org/r/20240528105924.30905-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-28 07:27:29 -07:00
Geert Uytterhoeven
2c917b55d6 riscv: dts: canaan: Disable I/O devices unless used
It is considered good practice to disable on-SoC devices providing
external I/O in the SoC-specific .dtsi, and enable them explicitly in
the board-specific DTS files when actually wired-up and used.

Hence:
  - Set the status of I/O devices in k210.dtsi to "disabled",
  - Override the status of used I/O devices in board-specific DTS files
    to "okay",
  - Drop unneeded status overrides in board DTS-specific files for the
    always-enabled pin controller.

On e.g. MAiXBiT, this gets rid of an error message when probing the
unused slave-only spi2 controller:

    dw_spi_mmio 50240000.spi: error -22: problem registering spi host
    dw_spi_mmio 50240000.spi: probe with driver dw_spi_mmio failed with error -22

which is seen since commit 98d75b9ef2 ("spi: dw: Drop default
number of CS setting").

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-05-28 12:25:54 +01:00
Geert Uytterhoeven
9235784cb6 riscv: dts: canaan: Clean up serial aliases
The SoC-specific k210.dtsi declares aliases for all four serial ports.
However, none of the board-specific DTS files configure pin control for
any but the first serial port, so the last three ports are not usable.

Move the aliases node from the SoC-specific k210.dtsi to the
board-specific DTS files, as these are really board-specific, and retain
the sole port that is usable.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-05-28 12:25:54 +01:00
Linus Torvalds
f1f9984fdc RISC-V Patches for the 6.10 Merge Window, Part 2
* The compression format used for boot images is now configurable at
   build time, and these formats are shown in `make help`.
 * access_ok() has been optimized.
 * A pair of performance bugs have been fixed in the uaccess handlers.
 * Various fixes and cleanups, including one for the IMSIC build failure
   and one for the early-boot ftrace illegal NOPs bug.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmZQtRwTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiWPBD/0UitwMg88m6urvMd0Pfvwwbu/OnGqW
 TZT8C55iJi/e5f9K4mBrSyjATI8z/MblD+Zz0adX8ygavS4JuQ7DoWwb1yTT3pww
 +z74FkWeJuiar+HfbhQ602CfMrnzvWjnyJ3URemqy5pIBKyvD9gGkDJDZwf8hJTk
 Vqh5qVtnBqFBO9kWpIx+/pLCfpyHVNkhWr1AzKfoqQ1WPIpZ/o0IGdvS88rL+EBR
 QOXiwVhEsRfC+LT6Jhn8l2bGp7PaSRVOid19OxNsJKpAhpL6AOscaafclVrLBuTd
 gkys0rT2dHdoWTAkPHQpvlOI6OmGTgopxo5pUKJHS8J9VRoBun25zC1FGBF8uyVd
 05CabWPnh7olNsRge9XiNj3x8PXjGVi7X7wUbRgOBG5aDc6TbKdxu37J0tXe0M7a
 Q74ctQvk8Nk6bQWirgTNlfJJHzL5pJbKc9VwY5uGX4qTmH+yEvCIt45ZXgXOuS/F
 eqijStkkdXUDnkMdcpaZJvXP80rHcgfP8bqevvPymRli8ER9zj9aXJQ3rmCUcPz+
 EtbyS+vOEN31wNTA1EQlfIRxfvr22x7r70DDdRwmhuD1W1tgfblm+R0Cq76I5rnJ
 VSgXKq1b4mY0eautqXEnPGyqb7H8iJIq7AoyfbzzWN+4u6yVEUvpDKueeksy+fFt
 sGNtjWqGhWyKXg==
 =/Qtt
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.10-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull more RISC-V updates from Palmer Dabbelt:

 - The compression format used for boot images is now configurable at
   build time, and these formats are shown in `make help`

 - access_ok() has been optimized

 - A pair of performance bugs have been fixed in the uaccess handlers

 - Various fixes and cleanups, including one for the IMSIC build failure
   and one for the early-boot ftrace illegal NOPs bug

* tag 'riscv-for-linus-6.10-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Fix early ftrace nop patching
  irqchip: riscv-imsic: Fixup riscv_ipi_set_virq_range() conflict
  riscv: selftests: Add signal handling vector tests
  riscv: mm: accelerate pagefault when badaccess
  riscv: uaccess: Relax the threshold for fast path
  riscv: uaccess: Allow the last potential unrolled copy
  riscv: typo in comment for get_f64_reg
  Use bool value in set_cpu_online()
  riscv: selftests: Add hwprobe binaries to .gitignore
  riscv: stacktrace: fixed walk_stackframe()
  ftrace: riscv: move from REGS to ARGS
  riscv: do not select MODULE_SECTIONS by default
  riscv: show help string for riscv-specific targets
  riscv: make image compression configurable
  riscv: cpufeature: Fix extension subset checking
  riscv: cpufeature: Fix thead vector hwcap removal
  riscv: rewrite __kernel_map_pages() to fix sleeping in invalid context
  riscv: force PAGE_SIZE linear mapping if debug_pagealloc is enabled
  riscv: Define TASK_SIZE_MAX for __access_ok()
  riscv: Remove PGDIR_SIZE_L3 and TASK_SIZE_MIN
2024-05-24 10:46:35 -07:00
Xiao Wang
99fa63d9ca riscv, bpf: Try RVC for reg move within BPF_CMPXCHG JIT
We could try to emit compressed insn for reg move operation during CMPXCHG
JIT, the instruction compression has no impact on the jump offsets of
following forward and backward jump instructions.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/bpf/20240519050507.2217791-1-xiao.w.wang@intel.com
2024-05-24 17:40:33 +02:00
Xiao Wang
e944fc8152 riscv, bpf: Use STACK_ALIGN macro for size rounding up
Use the macro STACK_ALIGN that is defined in asm/processor.h for stack size
rounding up, just like bpf_jit_comp32.c does.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/bpf/20240523031835.3977713-1-xiao.w.wang@intel.com
2024-05-24 17:16:56 +02:00
Xiao Wang
c12603e76e riscv, bpf: Optimize zextw insn with Zba extension
The Zba extension provides add.uw insn which can be used to implement
zext.w with rs2 set as ZERO.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Pu Lehui <pulehui@huawei.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/bpf/20240516090430.493122-1-xiao.w.wang@intel.com
2024-05-24 16:53:12 +02:00
Alexandre Ghiti
6ca445d8af
riscv: Fix early ftrace nop patching
Commit c97bf62996 ("riscv: Fix text patching when IPI are used")
converted ftrace_make_nop() to use patch_insn_write() which does not
emit any icache flush relying entirely on __ftrace_modify_code() to do
that.

But we missed that ftrace_make_nop() was called very early directly when
converting mcount calls into nops (actually on riscv it converts 2B nops
emitted by the compiler into 4B nops).

This caused crashes on multiple HW as reported by Conor and Björn since
the booting core could have half-patched instructions in its icache
which would trigger an illegal instruction trap: fix this by emitting a
local flush icache when early patching nops.

Fixes: c97bf62996 ("riscv: Fix text patching when IPI are used")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reported-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240523115134.70380-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-23 08:22:17 -07:00
Linus Torvalds
c760b3725e - A series ("kbuild: enable more warnings by default") from Arnd
Bergmann which enables a number of additional build-time warnings.  We
   fixed all the fallout which we could find, there may still be a few
   stragglers.
 
 - Samuel Holland has developed the series "Unified cross-architecture
   kernel-mode FPU API".  This does a lot of consolidation of
   per-architecture kernel-mode FPU usage and enables the use of newer AMD
   GPUs on RISC-V.
 
 - Tao Su has fixed some selftests build warnings in the series
   "Selftests: Fix compilation warnings due to missing _GNU_SOURCE
   definition".
 
 - This pull also includes a nilfs2 fixup from Ryusuke Konishi.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZk6OSAAKCRDdBJ7gKXxA
 jpTGAP9hQaZ+g7CO38hKQAtEI8rwcZJtvUAP84pZEGMjYMGLxQD/S8z1o7UHx61j
 DUbnunbOkU/UcPx3Fs/gp4KcJARMEgs=
 =EPi9
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2024-05-22-17-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more non-mm updates from Andrew Morton:

 - A series ("kbuild: enable more warnings by default") from Arnd
   Bergmann which enables a number of additional build-time warnings. We
   fixed all the fallout which we could find, there may still be a few
   stragglers.

 - Samuel Holland has developed the series "Unified cross-architecture
   kernel-mode FPU API". This does a lot of consolidation of
   per-architecture kernel-mode FPU usage and enables the use of newer
   AMD GPUs on RISC-V.

 - Tao Su has fixed some selftests build warnings in the series
   "Selftests: Fix compilation warnings due to missing _GNU_SOURCE
   definition".

 - This pull also includes a nilfs2 fixup from Ryusuke Konishi.

* tag 'mm-nonmm-stable-2024-05-22-17-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (23 commits)
  nilfs2: make block erasure safe in nilfs_finish_roll_forward()
  selftests/harness: use 1024 in place of LINE_MAX
  Revert "selftests/harness: remove use of LINE_MAX"
  selftests/fpu: allow building on other architectures
  selftests/fpu: move FP code to a separate translation unit
  drm/amd/display: use ARCH_HAS_KERNEL_FPU_SUPPORT
  drm/amd/display: only use hard-float, not altivec on powerpc
  riscv: add support for kernel-mode FPU
  x86: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  powerpc: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  LoongArch: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  lib/raid6: use CC_FLAGS_FPU for NEON CFLAGS
  arm64: crypto: use CC_FLAGS_FPU for NEON CFLAGS
  arm64: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  ARM: crypto: use CC_FLAGS_FPU for NEON CFLAGS
  ARM: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  arch: add ARCH_HAS_KERNEL_FPU_SUPPORT
  x86/fpu: fix asm/fpu/types.h include guard
  kbuild: enable -Wcast-function-type-strict unconditionally
  kbuild: enable -Wformat-truncation on clang
  ...
2024-05-22 18:59:29 -07:00
Palmer Dabbelt
c6c901b7d9
Merge patch series "riscv: Extension parsing fixes"
Charlie Jenkins <charlie@rivosinc.com> says:

This series contains two minor fixes for the extension parsing in
cpufeature.c.

Some T-Head boards without vector 1.0 support report "v" in the isa
string in their DT which will cause the kernel to run vector code. The
code to blacklist "v" from these boards was doing so by using
riscv_cached_mvendorid() which has not been populated at the time of
extension parsing. This fix instead greedily reads the mvendorid CSR of
the boot hart to determine if the cpu is from T-Head.

The other fix is for an incorrect indexing bug. riscv extensions
sometimes imply other extensions. When adding these "subset" extensions
to the hardware capabilities array, they need to be checked if they are
valid. The current code only checks if the extension that is including
other extensions is valid and not the subset extensions.

These patches were previously included in:
https://lore.kernel.org/lkml/20240420-dev-charlie-support_thead_vector_6_9-v3-0-67cff4271d1d@rivosinc.com/

* b4-shazam-merge:
  riscv: cpufeature: Fix extension subset checking
  riscv: cpufeature: Fix thead vector hwcap removal

Link: https://lore.kernel.org/r/20240502-cpufeature_fixes-v4-0-b3d1a088722d@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-22 16:12:58 -07:00
Kefeng Wang
4c6c002042
riscv: mm: accelerate pagefault when badaccess
The access_error() of vma already checked under per-VMA lock, if it
is a bad access, directly handle error, no need to retry with mmap_lock
again. Since the page faut is handled under per-VMA lock, count it as
a vma lock event with VMA_LOCK_SUCCESS.

Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240403083805.1818160-6-wangkefeng.wang@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-22 16:12:56 -07:00
Xiao Wang
9850e73e82
riscv: uaccess: Relax the threshold for fast path
The bytes copy for unaligned head would cover at most SZREG-1 bytes, so
it's better to set the threshold as >= (SZREG-1 + word_copy stride size)
which equals to 9*SZREG-1.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240313091929.4029960-1-xiao.w.wang@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-22 16:12:55 -07:00
Xiao Wang
f1905946be
riscv: uaccess: Allow the last potential unrolled copy
When the dst buffer pointer points to the last accessible aligned addr, we
could still run another iteration of unrolled copy.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240313103334.4036554-1-xiao.w.wang@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-22 16:12:54 -07:00
Xingyou Chen
7e6eae24da
riscv: typo in comment for get_f64_reg
Signed-off-by: Xingyou Chen <rockrush@rockwork.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20240317055556.9449-1-rockrush@rockwork.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-22 16:12:53 -07:00
Zhao Ke
10378a39ed
Use bool value in set_cpu_online()
The declaration of set_cpu_online() takes a bool value. So replace
int here to make it consistent with the declaration.

Signed-off-by: Zhao Ke <ke.zhao@shingroup.cn>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240318065404.123668-1-ke.zhao@shingroup.cn
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-22 16:12:52 -07:00
Palmer Dabbelt
855ad0f7a1
Merge patch series "riscv: fix debug_pagealloc"
Nam Cao <namcao@linutronix.de> says:

The debug_pagealloc feature is not functional on RISCV. With this feature
enabled (CONFIG_DEBUG_PAGEALLOC=y and debug_pagealloc=on), kernel crashes
early during boot.

QEMU command that can reproduce this problem:
   qemu-system-riscv64 -machine virt \
   -kernel Image \
   -append "console=ttyS0 root=/dev/vda debug_pagealloc=on" \
   -nographic \
   -drive "file=root.img,format=raw,id=hd0" \
   -device virtio-blk-device,drive=hd0 \
   -m 4G \

This series makes debug_pagealloc functional.

* b4-shazam-merge:
  riscv: rewrite __kernel_map_pages() to fix sleeping in invalid context
  riscv: force PAGE_SIZE linear mapping if debug_pagealloc is enabled

Link: https://lore.kernel.org/r/cover.1715750938.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-22 16:12:50 -07:00
Matthew Bystrin
a2a4d4a6a0
riscv: stacktrace: fixed walk_stackframe()
If the load access fault occures in a leaf function (with
CONFIG_FRAME_POINTER=y), when wrong stack trace will be displayed:

[<ffffffff804853c2>] regmap_mmio_read32le+0xe/0x1c
---[ end trace 0000000000000000 ]---

Registers dump:
    ra     0xffffffff80485758 <regmap_mmio_read+36>
    sp     0xffffffc80200b9a0
    fp     0xffffffc80200b9b0
    pc     0xffffffff804853ba <regmap_mmio_read32le+6>

Stack dump:
    0xffffffc80200b9a0:  0xffffffc80200b9e0  0xffffffc80200b9e0
    0xffffffc80200b9b0:  0xffffffff8116d7e8  0x0000000000000100
    0xffffffc80200b9c0:  0xffffffd8055b9400  0xffffffd8055b9400
    0xffffffc80200b9d0:  0xffffffc80200b9f0  0xffffffff8047c526
    0xffffffc80200b9e0:  0xffffffc80200ba30  0xffffffff8047fe9a

The assembler dump of the function preambula:
    add     sp,sp,-16
    sd      s0,8(sp)
    add     s0,sp,16

In the fist stack frame, where ra is not stored on the stack we can
observe:

        0(sp)                  8(sp)
        .---------------------------------------------.
    sp->|       frame->fp      | frame->ra (saved fp) |
        |---------------------------------------------|
    fp->|         ....         |         ....         |
        |---------------------------------------------|
        |                      |                      |

and in the code check is performed:
	if (regs && (regs->epc == pc) && (frame->fp & 0x7))

I see no reason to check frame->fp value at all, because it is can be
uninitialized value on the stack. A better way is to check frame->ra to
be an address on the stack. After the stacktrace shows as expect:

[<ffffffff804853c2>] regmap_mmio_read32le+0xe/0x1c
[<ffffffff80485758>] regmap_mmio_read+0x24/0x52
[<ffffffff8047c526>] _regmap_bus_reg_read+0x1a/0x22
[<ffffffff8047fe9a>] _regmap_read+0x5c/0xea
[<ffffffff80480376>] _regmap_update_bits+0x76/0xc0
...
---[ end trace 0000000000000000 ]---
As pointed by Samuel Holland it is incorrect to remove check of the stackframe
entirely.

Changes since v2 [2]:
 - Add accidentally forgotten curly brace

Changes since v1 [1]:
 - Instead of just dropping frame->fp check, replace it with validation of
   frame->ra, which should be a stack address.
 - Move frame pointer validation into the separate function.

[1] https://lore.kernel.org/linux-riscv/20240426072701.6463-1-dev.mbstr@gmail.com/
[2] https://lore.kernel.org/linux-riscv/20240521131314.48895-1-dev.mbstr@gmail.com/

Fixes: f766f77a74 ("riscv/stacktrace: Fix stack output without ra on the stack top")
Signed-off-by: Matthew Bystrin <dev.mbstr@gmail.com>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20240521191727.62012-1-dev.mbstr@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-22 16:12:49 -07:00
Puranjay Mohan
7caa976546
ftrace: riscv: move from REGS to ARGS
This commit replaces riscv's support for FTRACE_WITH_REGS with support
for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop
relying on stop_machine() for RISCV's implementation of ftrace.

The main relevant benefit that this change will bring for the above
use-case is that now we don't have separate ftrace_caller and
ftrace_regs_caller trampolines. This will allow the callsite to call
ftrace_caller by modifying a single instruction. Now the callsite can
do something similar to:

When not tracing:            |             When tracing:

func:                                      func:
  auipc t0, ftrace_caller_top                auipc t0, ftrace_caller_top
  nop  <=========<Enable/Disable>=========>  jalr  t0, ftrace_caller_bottom
  [...]                                      [...]

The above assumes that we are dropping the support of calling a direct
trampoline from the callsite. We need to drop this as the callsite can't
change the target address to call, it can only enable/disable a call to
a preset target (ftrace_caller in the above diagram). We can later optimize
this by calling an intermediate dispatcher trampoline before ftrace_caller.

Currently, ftrace_regs_caller saves all CPU registers in the format of
struct pt_regs and allows the tracer to modify them. We don't need to
save all of the CPU registers because at function entry only a subset of
pt_regs is live:

|----------+----------+---------------------------------------------|
| Register | ABI Name | Description                                 |
|----------+----------+---------------------------------------------|
| x1       | ra       | Return address for traced function          |
| x2       | sp       | Stack pointer                               |
| x5       | t0       | Return address for ftrace_caller trampoline |
| x8       | s0/fp    | Frame pointer                               |
| x10-11   | a0-1     | Function arguments/return values            |
| x12-17   | a2-7     | Function arguments                          |
|----------+----------+---------------------------------------------|

See RISCV calling convention[1] for the above table.

Saving just the live registers decreases the amount of stack space
required from 288 Bytes to 112 Bytes.

Basic testing was done with this on the VisionFive 2 development board.

Note:
  - Moving from REGS to ARGS will mean that RISCV will stop supporting
    KPROBES_ON_FTRACE as it requires full pt_regs to be saved.
  - KPROBES_ON_FTRACE will be supplanted by FPROBES see [2].

[1] https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf
[2] https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240405142453.4187-1-puranjay@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-22 16:12:48 -07:00
Palmer Dabbelt
12cf29c6f9
Merge patch series "riscv: access_ok() optimization"
Samuel Holland <samuel.holland@sifive.com> says:

This series optimizes access_ok() by defining TASK_SIZE_MAX. At Alex's
suggestion, I also tried making TASK_SIZE constant (specifically by
making PGDIR_SHIFT a variable instead of a ternary expression, then
replacing the load with an immediate using ALTERNATIVE). This appeared
to slightly improve performance on some implementations (C906) but
regressed it on others (FU740). So I am leaving further optimizations to
a later series.

* b4-shazam-merge:
  riscv: Define TASK_SIZE_MAX for __access_ok()
  riscv: Remove PGDIR_SIZE_L3 and TASK_SIZE_MIN

Link: https://lore.kernel.org/r/20240327143858.711792-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-22 16:12:47 -07:00
Qingfang Deng
2aff5f955b
riscv: do not select MODULE_SECTIONS by default
Since commit aad15bc85c ("riscv: Change code model of module to
medany to improve data accessing"), kernel modules have not been built
with -fPIC, so they wouldn't have R_RISCV_GOT_HI20 or R_RISCV_CALL_PLT
relocations, and handling of those relocations is unnecessary.

If RELOCATABLE=y, kernel modules will be built with -fPIE, which would
reintroduce said relocations, so only select MODULE_SECTIONS when
RELOCATABLE.

Signed-off-by: Qingfang Deng <qingfang.deng@siflower.com.cn>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240511015725.1162-1-dqfext@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-22 16:12:46 -07:00
Emil Renner Berthing
07501c4907
riscv: show help string for riscv-specific targets
Define the archhelp variable so that 'make ACRH=riscv help' will show
the targets specific to building a RISC-V kernel like other
architectures.

Tested-by: Björn Töpel <bjorn@rivosinc.com>
Signed-off-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20240504193446.196886-3-emil.renner.berthing@canonical.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-22 16:12:45 -07:00
Emil Renner Berthing
e79dfcbfb9
riscv: make image compression configurable
Previously the build process would always set KBUILD_IMAGE to the
uncompressed Image file (unless XIP_KERNEL or EFI_ZBOOT was enabled) and
unconditionally compress it into Image.gz. However there are already
build targets for Image.bz2, Image.lz4, Image.lzma, Image.lzo and
Image.zstd, so let's make use of those, make the compression method
configurable and set KBUILD_IMAGE accordingly so that targets like
'make install' and 'make bindeb-pkg' will use the chosen image.

Tested-by: Björn Töpel <bjorn@rivosinc.com>
Signed-off-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
Link: https://lore.kernel.org/r/20240504193446.196886-2-emil.renner.berthing@canonical.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-22 16:12:44 -07:00
Linus Torvalds
0bfbc914d9 RISC-V Patches for the 6.10 Merge Window, Part 1
* Support for byte/half-word compare-and-exchange, emulated via LR/SC
   loops.
 * Support for Rust.
 * Support for Zihintpause in hwprobe.
 * Support for the PR_RISCV_SET_ICACHE_FLUSH_CTX prctl().
 * Support for lockless lockrefs.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmZN/hcTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiVrGEACUT3gsbTx1q7fa11iQNxOjVkpl66Qn
 7+kI+V9xt5+GuH2EjJk6AsSNHPKeQ8totbSTA8AZjINFvgVjXslN+DPpcjCFKvnh
 NN5/Lyd64X0PZMsxGWlN9SHTFWf2b7lalCnY51BlX/IpBbHWc/no9XUsPSVixx6u
 9q+JoS3D1DDV92nGcA/UK9ICCsDcf4omWgZW7KbjnVWnuY9jt4ctTy11jtF2RM9R
 Z9KAWh0RqPzjz0vNbBBf9Iw7E4jt/Px6HDYPfZAiE2dVsCTHjdsC7TcGRYXzKt6F
 4q9zg8kzwvUG5GaBl7/XprXO1vaeOUmPcTVoE7qlRkSdkknRH/iBz1P4hk+r0fze
 f+h5ZUV/oJP7vDb+vHm/BExtGufgLuJ2oMA2Bp9qI17EMcMsGiRMt7DsBMEafWDk
 bNrFcJdqqYBz6HxfTwzNH5ErxfS/59PuwYl913BTSOH//raCZCFXOfyrSICH7qXd
 UFOLLmBpMuApLa8ayFeI9Mp3flWfbdQHR52zLRLiUvlpWNEDKrNQN417juVwTXF0
 DYkjJDhFPLfFOr/sJBboftOMOUdA9c/CJepY9o4kPvBXUvPtRHN1jdXDNSCVDZRb
 nErnsJ9rv0PzfxQU7Xjhd2QmCMeMlbCQDpXAKKETyyimpTbgF33rovN0i5ixX3m4
 KG6RvKDubOzZdA==
 =YLoD
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.10-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Add byte/half-word compare-and-exchange, emulated via LR/SC loops

 - Support for Rust

 - Support for Zihintpause in hwprobe

 - Add PR_RISCV_SET_ICACHE_FLUSH_CTX prctl()

 - Support lockless lockrefs

* tag 'riscv-for-linus-6.10-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (42 commits)
  riscv: defconfig: Enable CONFIG_CLK_SOPHGO_CV1800
  riscv: select ARCH_HAS_FAST_MULTIPLIER
  riscv: mm: still create swiotlb buffer for kmalloc() bouncing if required
  riscv: Annotate pgtable_l{4,5}_enabled with __ro_after_init
  riscv: Remove redundant CONFIG_64BIT from pgtable_l{4,5}_enabled
  riscv: mm: Always use an ASID to flush mm contexts
  riscv: mm: Preserve global TLB entries when switching contexts
  riscv: mm: Make asid_bits a local variable
  riscv: mm: Use a fixed layout for the MM context ID
  riscv: mm: Introduce cntx2asid/cntx2version helper macros
  riscv: Avoid TLB flush loops when affected by SiFive CIP-1200
  riscv: Apply SiFive CIP-1200 workaround to single-ASID sfence.vma
  riscv: mm: Combine the SMP and UP TLB flush code
  riscv: Only send remote fences when some other CPU is online
  riscv: mm: Broadcast kernel TLB flushes only when needed
  riscv: Use IPIs for remote cache/TLB flushes by default
  riscv: Factor out page table TLB synchronization
  riscv: Flush the instruction cache during SMP bringup
  riscv: hwprobe: export Zihintpause ISA extension
  riscv: misaligned: remove CONFIG_RISCV_M_MODE specific code
  ...
2024-05-22 09:56:00 -07:00
Charlie Jenkins
e67e98ee89
riscv: cpufeature: Fix extension subset checking
This loop is supposed to check if ext->subset_ext_ids[j] is valid, rather
than if ext->subset_ext_ids[i] is valid, before setting the extension
id ext->subset_ext_ids[j] in isainfo->isa.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Fixes: 0d8295ed97 ("riscv: add ISA extension parsing for scalar crypto")
Link: https://lore.kernel.org/r/20240502-cpufeature_fixes-v4-2-b3d1a088722d@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-22 09:41:03 -07:00
Charlie Jenkins
e482eab4d1
riscv: cpufeature: Fix thead vector hwcap removal
The riscv_cpuinfo struct that contains mvendorid and marchid is not
populated until all harts are booted which happens after the DT parsing.
Use the mvendorid/marchid from the boot hart to determine if the DT
contains an invalid V.

Fixes: d82f32202e ("RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs")
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/20240502-cpufeature_fixes-v4-1-b3d1a088722d@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-22 09:41:02 -07:00
Nam Cao
fb1cf08783
riscv: rewrite __kernel_map_pages() to fix sleeping in invalid context
__kernel_map_pages() is a debug function which clears the valid bit in page
table entry for deallocated pages to detect illegal memory accesses to
freed pages.

This function set/clear the valid bit using __set_memory(). __set_memory()
acquires init_mm's semaphore, and this operation may sleep. This is
problematic, because  __kernel_map_pages() can be called in atomic context,
and thus is illegal to sleep. An example warning that this causes:

BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1578
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2, name: kthreadd
preempt_count: 2, expected: 0
CPU: 0 PID: 2 Comm: kthreadd Not tainted 6.9.0-g1d4c6d784ef6 #37
Hardware name: riscv-virtio,qemu (DT)
Call Trace:
[<ffffffff800060dc>] dump_backtrace+0x1c/0x24
[<ffffffff8091ef6e>] show_stack+0x2c/0x38
[<ffffffff8092baf8>] dump_stack_lvl+0x5a/0x72
[<ffffffff8092bb24>] dump_stack+0x14/0x1c
[<ffffffff8003b7ac>] __might_resched+0x104/0x10e
[<ffffffff8003b7f4>] __might_sleep+0x3e/0x62
[<ffffffff8093276a>] down_write+0x20/0x72
[<ffffffff8000cf00>] __set_memory+0x82/0x2fa
[<ffffffff8000d324>] __kernel_map_pages+0x5a/0xd4
[<ffffffff80196cca>] __alloc_pages_bulk+0x3b2/0x43a
[<ffffffff8018ee82>] __vmalloc_node_range+0x196/0x6ba
[<ffffffff80011904>] copy_process+0x72c/0x17ec
[<ffffffff80012ab4>] kernel_clone+0x60/0x2fe
[<ffffffff80012f62>] kernel_thread+0x82/0xa0
[<ffffffff8003552c>] kthreadd+0x14a/0x1be
[<ffffffff809357de>] ret_from_fork+0xe/0x1c

Rewrite this function with apply_to_existing_page_range(). It is fine to
not have any locking, because __kernel_map_pages() works with pages being
allocated/deallocated and those pages are not changed by anyone else in the
meantime.

Fixes: 5fde3db5eb ("riscv: add ARCH_SUPPORTS_DEBUG_PAGEALLOC support")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/1289ecba9606a19917bc12b6c27da8aa23e1e5ae.1715750938.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-22 09:15:09 -07:00
Nam Cao
c67ddf59ac
riscv: force PAGE_SIZE linear mapping if debug_pagealloc is enabled
debug_pagealloc is a debug feature which clears the valid bit in page table
entry for freed pages to detect illegal accesses to freed memory.

For this feature to work, virtual mapping must have PAGE_SIZE resolution.
(No, we cannot map with huge pages and split them only when needed; because
pages can be allocated/freed in atomic context and page splitting cannot be
done in atomic context)

Force linear mapping to use small pages if debug_pagealloc is enabled.

Note that it is not necessary to force the entire linear mapping, but only
those that are given to memory allocator. Some parts of memory can keep
using huge page mapping (for example, kernel's executable code). But these
parts are minority, so keep it simple. This is just a debug feature, some
extra overhead should be acceptable.

Fixes: 5fde3db5eb ("riscv: add ARCH_SUPPORTS_DEBUG_PAGEALLOC support")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: stable@vger.kernel.org
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/2e391fa6c6f9b3fcf1b41cefbace02ee4ab4bf59.1715750938.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-22 09:15:08 -07:00
Linus Torvalds
1b03616209 soc: devicetree updates for v6.10, part 2
This is a follow-up to an earlier pull request for device tree changes,
 as three platform maintainers sent their contents too late to be included
 in the main set, but had not caused any further problems since then:
 
  - The Amlogic platform now containts support for two new SoC types,
    the A4 and A5 chips for audio applications. Both come with a
    reference board, and one more dts file gets addded for the
    combination of the MNT Reform Laptop with the BPI-CM4 CPU
    module
 
  - The ASpeed platform adds support for six addititional server
    platforms that use ast2500 or ast2600 as their BMC, while
    another one gets removed.
 
  - The RISC-V platforms from Microchip, Starfive and and T-HEAD
    get additional features for existing hardware, plus the
    addition of the Milk-V Mars based on the StarFive VisionFive v2
    board.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmZLwrEACgkQYKtH/8kJ
 UidIwA//U0iMWGgChr15MkJmLM+IR1OM0dcb1yZL5ws7tkZMdE16istGBU0a5EIE
 0+OQNc/K9eMV4XCTgevVnvIq2woFeyBQ+ztZg/Ht5r9ZGN5UJ1liWeTEdbdtKhiA
 t9oRL6XJWZEvvS0xassSczI018R+AmXI3LqAmEtcirGhykP12hhpXrYiHPqgCvx+
 j7p8RuG2wwwlPvh/7N9H8BoKTQdMP2CSN/kNQbLmay/6h20DgSiC7SrHfmmKxgkE
 waKMDy4XE3114MowvMX7Uv3wjtldVktx+Mi3aYTzg8Ze7i/6Y+FmjBEY8ZpPp2uE
 tTPADhwiyVrg5dZWvlofsufqktF3JrHN5Ma4ikF0GKigcTKVdqGArt2ohck7nL4K
 EyBIUBWn6AEXwxJlqdsDCNyo3o8QWKgFk0s6GU/qrVj5/0f0T7YcrEVa/6cHEPZc
 jKFVifIIe9agyqBK2grD+sU7WtCGBZOEhLPk622B8PrZTF0Pa5/8pqV2CIzM8GBn
 sE5PFUmKh3QvCB6yg22cv9B+W3tllhE4agtQGEMxCcrwNM++1Kmru6ZMCKTOX6vt
 +HjIdX+6q32tB9HazfGHYhCvnS27/3AyaWa68hy+mMlJDE6JFu/8Pv8hlZi7TV08
 6fcYQAzgZ6LYUdr1ObtcHw8BzYPicDcuThaHJh5l8yg17GzRhF4=
 =iChs
 -----END PGP SIGNATURE-----

Merge tag 'soc-dt-late-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull more SoC devicetree updates from Arnd Bergmann:
 "This is a follow-up to an earlier pull request for device tree
  changes, as three platform maintainers sent their contents too late to
  be included in the main set, but had not caused any further problems
  since then:

   - The Amlogic platform now containts support for two new SoC types,
     the A4 and A5 chips for audio applications. Both come with a
     reference board, and one more dts file gets addded for the
     combination of the MNT Reform Laptop with the BPI-CM4 CPU module

   - The ASpeed platform adds support for six addititional server
     platforms that use ast2500 or ast2600 as their BMC, while another
     one gets removed

   - The RISC-V platforms from Microchip, Starfive and and T-HEAD get
     additional features for existing hardware, plus the addition of the
     Milk-V Mars based on the StarFive VisionFive v2 board"

* tag 'soc-dt-late-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (76 commits)
  riscv: dts: microchip: add pac1934 power-monitor to icicle
  riscv: dts: thead: Fix node ordering in TH1520 device tree
  ARM: dts: aspeed: Add ASRock E3C256D4I BMC
  dt-bindings: arm: aspeed: document ASRock E3C256D4I
  dt-bindings: trivial-devices: add isil,isl69269
  ARM: dts: aspeed: x4tf: Add dts for asus x4tf project
  dt-bindings: arm: aspeed: add ASUS X4TF board
  ARM: dts: aspeed: Remove Facebook Cloudripper dts
  ARM: dts: aspeed: drop unused ref_voltage ADC property
  ARM: dts: aspeed: harma: correct Mellanox multi-host property
  ARM: dts: aspeed: yosemitev2: correct Mellanox multi-host property
  ARM: dts: aspeed: yosemite4: correct Mellanox multi-host property
  ARM: dts: aspeed: greatlakes: correct Mellanox multi-host property
  ARM: dts: aspeed: Modify I2C bus configuration
  ARM: dts: aspeed: Disable unused ADC channels for Asrock X570D4U BMC
  ARM: dts: aspeed: Modify GPIO table for Asrock X570D4U BMC
  ARM: dts: aspeed: yosemite4: set bus13 frequency to 100k
  ARM: dts: Aspeed: Bonnell: Fix NVMe LED labels
  ARM: dts: aspeed: yosemite4: Enable ipmb device for OCP debug card
  ARM: dts: aspeed: ahe50dc: Update lm25066 regulator name
  ...
2024-05-20 15:11:53 -07:00
Samuel Holland
77acc6b55a riscv: add support for kernel-mode FPU
This is motivated by the amdgpu DRM driver, which needs floating-point
code to support recent hardware.  That code is not performance-critical,
so only provide a minimal non-preemptible implementation for now.

Support is limited to riscv64 because riscv32 requires runtime (libgcc)
assistance to convert between doubles and 64-bit integers.

Link: https://lkml.kernel.org/r/20240329072441.591471-12-samuel.holland@sifive.com
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christian König <christian.koenig@amd.com> 
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: WANG Xuerui <git@xen0n.name>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-19 14:36:19 -07:00
Linus Torvalds
61307b7be4 The usual shower of singleton fixes and minor series all over MM,
documented (hopefully adequately) in the respective changelogs.  Notable
 series include:
 
 - Lucas Stach has provided some page-mapping
   cleanup/consolidation/maintainability work in the series "mm/treewide:
   Remove pXd_huge() API".
 
 - In the series "Allow migrate on protnone reference with
   MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's
   MPOL_PREFERRED_MANY mode, yielding almost doubled performance in one
   test.
 
 - In their series "Memory allocation profiling" Kent Overstreet and
   Suren Baghdasaryan have contributed a means of determining (via
   /proc/allocinfo) whereabouts in the kernel memory is being allocated:
   number of calls and amount of memory.
 
 - Matthew Wilcox has provided the series "Various significant MM
   patches" which does a number of rather unrelated things, but in largely
   similar code sites.
 
 - In his series "mm: page_alloc: freelist migratetype hygiene" Johannes
   Weiner has fixed the page allocator's handling of migratetype requests,
   with resulting improvements in compaction efficiency.
 
 - In the series "make the hugetlb migration strategy consistent" Baolin
   Wang has fixed a hugetlb migration issue, which should improve hugetlb
   allocation reliability.
 
 - Liu Shixin has hit an I/O meltdown caused by readahead in a
   memory-tight memcg.  Addressed in the series "Fix I/O high when memory
   almost met memcg limit".
 
 - In the series "mm/filemap: optimize folio adding and splitting" Kairui
   Song has optimized pagecache insertion, yielding ~10% performance
   improvement in one test.
 
 - Baoquan He has cleaned up and consolidated the early zone
   initialization code in the series "mm/mm_init.c: refactor
   free_area_init_core()".
 
 - Baoquan has also redone some MM initializatio code in the series
   "mm/init: minor clean up and improvement".
 
 - MM helper cleanups from Christoph Hellwig in his series "remove
   follow_pfn".
 
 - More cleanups from Matthew Wilcox in the series "Various page->flags
   cleanups".
 
 - Vlastimil Babka has contributed maintainability improvements in the
   series "memcg_kmem hooks refactoring".
 
 - More folio conversions and cleanups in Matthew Wilcox's series
 
 	"Convert huge_zero_page to huge_zero_folio"
 	"khugepaged folio conversions"
 	"Remove page_idle and page_young wrappers"
 	"Use folio APIs in procfs"
 	"Clean up __folio_put()"
 	"Some cleanups for memory-failure"
 	"Remove page_mapping()"
 	"More folio compat code removal"
 
 - David Hildenbrand chipped in with "fs/proc/task_mmu: convert hugetlb
   functions to work on folis".
 
 - Code consolidation and cleanup work related to GUP's handling of
   hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2".
 
 - Rick Edgecombe has developed some fixes to stack guard gaps in the
   series "Cover a guard gap corner case".
 
 - Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the series
   "mm/ksm: fix ksm exec support for prctl".
 
 - Baolin Wang has implemented NUMA balancing for multi-size THPs.  This
   is a simple first-cut implementation for now.  The series is "support
   multi-size THP numa balancing".
 
 - Cleanups to vma handling helper functions from Matthew Wilcox in the
   series "Unify vma_address and vma_pgoff_address".
 
 - Some selftests maintenance work from Dev Jain in the series
   "selftests/mm: mremap_test: Optimizations and style fixes".
 
 - Improvements to the swapping of multi-size THPs from Ryan Roberts in
   the series "Swap-out mTHP without splitting".
 
 - Kefeng Wang has significantly optimized the handling of arm64's
   permission page faults in the series
 
 	"arch/mm/fault: accelerate pagefault when badaccess"
 	"mm: remove arch's private VM_FAULT_BADMAP/BADACCESS"
 
 - GUP cleanups from David Hildenbrand in "mm/gup: consistently call it
   GUP-fast".
 
 - hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault path to
   use struct vm_fault".
 
 - selftests build fixes from John Hubbard in the series "Fix
   selftests/mm build without requiring "make headers"".
 
 - Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the
   series "Improved Memory Tier Creation for CPUless NUMA Nodes".  Fixes
   the initialization code so that migration between different memory types
   works as intended.
 
 - David Hildenbrand has improved follow_pte() and fixed an errant driver
   in the series "mm: follow_pte() improvements and acrn follow_pte()
   fixes".
 
 - David also did some cleanup work on large folio mapcounts in his
   series "mm: mapcount for large folios + page_mapcount() cleanups".
 
 - Folio conversions in KSM in Alex Shi's series "transfer page to folio
   in KSM".
 
 - Barry Song has added some sysfs stats for monitoring multi-size THP's
   in the series "mm: add per-order mTHP alloc and swpout counters".
 
 - Some zswap cleanups from Yosry Ahmed in the series "zswap same-filled
   and limit checking cleanups".
 
 - Matthew Wilcox has been looking at buffer_head code and found the
   documentation to be lacking.  The series is "Improve buffer head
   documentation".
 
 - Multi-size THPs get more work, this time from Lance Yang.  His series
   "mm/madvise: enhance lazyfreeing with mTHP in madvise_free" optimizes
   the freeing of these things.
 
 - Kemeng Shi has added more userspace-visible writeback instrumentation
   in the series "Improve visibility of writeback".
 
 - Kemeng Shi then sent some maintenance work on top in the series "Fix
   and cleanups to page-writeback".
 
 - Matthew Wilcox reduces mmap_lock traffic in the anon vma code in the
   series "Improve anon_vma scalability for anon VMAs".  Intel's test bot
   reported an improbable 3x improvement in one test.
 
 - SeongJae Park adds some DAMON feature work in the series
 
 	"mm/damon: add a DAMOS filter type for page granularity access recheck"
 	"selftests/damon: add DAMOS quota goal test"
 
 - Also some maintenance work in the series
 
 	"mm/damon/paddr: simplify page level access re-check for pageout"
 	"mm/damon: misc fixes and improvements"
 
 - David Hildenbrand has disabled some known-to-fail selftests ni the
   series "selftests: mm: cow: flag vmsplice() hugetlb tests as XFAIL".
 
 - memcg metadata storage optimizations from Shakeel Butt in "memcg:
   reduce memory consumption by memcg stats".
 
 - DAX fixes and maintenance work from Vishal Verma in the series
   "dax/bus.c: Fixups for dax-bus locking".
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZkgQYwAKCRDdBJ7gKXxA
 jrdKAP9WVJdpEcXxpoub/vVE0UWGtffr8foifi9bCwrQrGh5mgEAx7Yf0+d/oBZB
 nvA4E0DcPrUAFy144FNM0NTCb7u9vAw=
 =V3R/
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull mm updates from Andrew Morton:
 "The usual shower of singleton fixes and minor series all over MM,
  documented (hopefully adequately) in the respective changelogs.
  Notable series include:

   - Lucas Stach has provided some page-mapping cleanup/consolidation/
     maintainability work in the series "mm/treewide: Remove pXd_huge()
     API".

   - In the series "Allow migrate on protnone reference with
     MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's
     MPOL_PREFERRED_MANY mode, yielding almost doubled performance in
     one test.

   - In their series "Memory allocation profiling" Kent Overstreet and
     Suren Baghdasaryan have contributed a means of determining (via
     /proc/allocinfo) whereabouts in the kernel memory is being
     allocated: number of calls and amount of memory.

   - Matthew Wilcox has provided the series "Various significant MM
     patches" which does a number of rather unrelated things, but in
     largely similar code sites.

   - In his series "mm: page_alloc: freelist migratetype hygiene"
     Johannes Weiner has fixed the page allocator's handling of
     migratetype requests, with resulting improvements in compaction
     efficiency.

   - In the series "make the hugetlb migration strategy consistent"
     Baolin Wang has fixed a hugetlb migration issue, which should
     improve hugetlb allocation reliability.

   - Liu Shixin has hit an I/O meltdown caused by readahead in a
     memory-tight memcg. Addressed in the series "Fix I/O high when
     memory almost met memcg limit".

   - In the series "mm/filemap: optimize folio adding and splitting"
     Kairui Song has optimized pagecache insertion, yielding ~10%
     performance improvement in one test.

   - Baoquan He has cleaned up and consolidated the early zone
     initialization code in the series "mm/mm_init.c: refactor
     free_area_init_core()".

   - Baoquan has also redone some MM initializatio code in the series
     "mm/init: minor clean up and improvement".

   - MM helper cleanups from Christoph Hellwig in his series "remove
     follow_pfn".

   - More cleanups from Matthew Wilcox in the series "Various
     page->flags cleanups".

   - Vlastimil Babka has contributed maintainability improvements in the
     series "memcg_kmem hooks refactoring".

   - More folio conversions and cleanups in Matthew Wilcox's series:
	"Convert huge_zero_page to huge_zero_folio"
	"khugepaged folio conversions"
	"Remove page_idle and page_young wrappers"
	"Use folio APIs in procfs"
	"Clean up __folio_put()"
	"Some cleanups for memory-failure"
	"Remove page_mapping()"
	"More folio compat code removal"

   - David Hildenbrand chipped in with "fs/proc/task_mmu: convert
     hugetlb functions to work on folis".

   - Code consolidation and cleanup work related to GUP's handling of
     hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2".

   - Rick Edgecombe has developed some fixes to stack guard gaps in the
     series "Cover a guard gap corner case".

   - Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the
     series "mm/ksm: fix ksm exec support for prctl".

   - Baolin Wang has implemented NUMA balancing for multi-size THPs.
     This is a simple first-cut implementation for now. The series is
     "support multi-size THP numa balancing".

   - Cleanups to vma handling helper functions from Matthew Wilcox in
     the series "Unify vma_address and vma_pgoff_address".

   - Some selftests maintenance work from Dev Jain in the series
     "selftests/mm: mremap_test: Optimizations and style fixes".

   - Improvements to the swapping of multi-size THPs from Ryan Roberts
     in the series "Swap-out mTHP without splitting".

   - Kefeng Wang has significantly optimized the handling of arm64's
     permission page faults in the series
	"arch/mm/fault: accelerate pagefault when badaccess"
	"mm: remove arch's private VM_FAULT_BADMAP/BADACCESS"

   - GUP cleanups from David Hildenbrand in "mm/gup: consistently call
     it GUP-fast".

   - hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault
     path to use struct vm_fault".

   - selftests build fixes from John Hubbard in the series "Fix
     selftests/mm build without requiring "make headers"".

   - Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the
     series "Improved Memory Tier Creation for CPUless NUMA Nodes".
     Fixes the initialization code so that migration between different
     memory types works as intended.

   - David Hildenbrand has improved follow_pte() and fixed an errant
     driver in the series "mm: follow_pte() improvements and acrn
     follow_pte() fixes".

   - David also did some cleanup work on large folio mapcounts in his
     series "mm: mapcount for large folios + page_mapcount() cleanups".

   - Folio conversions in KSM in Alex Shi's series "transfer page to
     folio in KSM".

   - Barry Song has added some sysfs stats for monitoring multi-size
     THP's in the series "mm: add per-order mTHP alloc and swpout
     counters".

   - Some zswap cleanups from Yosry Ahmed in the series "zswap
     same-filled and limit checking cleanups".

   - Matthew Wilcox has been looking at buffer_head code and found the
     documentation to be lacking. The series is "Improve buffer head
     documentation".

   - Multi-size THPs get more work, this time from Lance Yang. His
     series "mm/madvise: enhance lazyfreeing with mTHP in madvise_free"
     optimizes the freeing of these things.

   - Kemeng Shi has added more userspace-visible writeback
     instrumentation in the series "Improve visibility of writeback".

   - Kemeng Shi then sent some maintenance work on top in the series
     "Fix and cleanups to page-writeback".

   - Matthew Wilcox reduces mmap_lock traffic in the anon vma code in
     the series "Improve anon_vma scalability for anon VMAs". Intel's
     test bot reported an improbable 3x improvement in one test.

   - SeongJae Park adds some DAMON feature work in the series
	"mm/damon: add a DAMOS filter type for page granularity access recheck"
	"selftests/damon: add DAMOS quota goal test"

   - Also some maintenance work in the series
	"mm/damon/paddr: simplify page level access re-check for pageout"
	"mm/damon: misc fixes and improvements"

   - David Hildenbrand has disabled some known-to-fail selftests ni the
     series "selftests: mm: cow: flag vmsplice() hugetlb tests as
     XFAIL".

   - memcg metadata storage optimizations from Shakeel Butt in "memcg:
     reduce memory consumption by memcg stats".

   - DAX fixes and maintenance work from Vishal Verma in the series
     "dax/bus.c: Fixups for dax-bus locking""

* tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (426 commits)
  memcg, oom: cleanup unused memcg_oom_gfp_mask and memcg_oom_order
  selftests/mm: hugetlb_madv_vs_map: avoid test skipping by querying hugepage size at runtime
  mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_wp
  mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_fault
  selftests: cgroup: add tests to verify the zswap writeback path
  mm: memcg: make alloc_mem_cgroup_per_node_info() return bool
  mm/damon/core: fix return value from damos_wmark_metric_value
  mm: do not update memcg stats for NR_{FILE/SHMEM}_PMDMAPPED
  selftests: cgroup: remove redundant enabling of memory controller
  Docs/mm/damon/maintainer-profile: allow posting patches based on damon/next tree
  Docs/mm/damon/maintainer-profile: change the maintainer's timezone from PST to PT
  Docs/mm/damon/design: use a list for supported filters
  Docs/admin-guide/mm/damon/usage: fix wrong schemes effective quota update command
  Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file
  selftests/damon: classify tests for functionalities and regressions
  selftests/damon/_damon_sysfs: use 'is' instead of '==' for 'None'
  selftests/damon/_damon_sysfs: find sysfs mount point from /proc/mounts
  selftests/damon/_damon_sysfs: check errors from nr_schemes file reads
  mm/damon/core: initialize ->esz_bp from damos_quota_init_priv()
  selftests/damon: add a test for DAMOS quota goal
  ...
2024-05-19 09:21:03 -07:00
Linus Torvalds
ff9a79307f Kbuild updates for v6.10
- Avoid 'constexpr', which is a keyword in C23
 
  - Allow 'dtbs_check' and 'dt_compatible_check' run independently of
    'dt_binding_check'
 
  - Fix weak references to avoid GOT entries in position-independent
    code generation
 
  - Convert the last use of 'optional' property in arch/sh/Kconfig
 
  - Remove support for the 'optional' property in Kconfig
 
  - Remove support for Clang's ThinLTO caching, which does not work with
    the .incbin directive
 
  - Change the semantics of $(src) so it always points to the source
    directory, which fixes Makefile inconsistencies between upstream and
    downstream
 
  - Fix 'make tar-pkg' for RISC-V to produce a consistent package
 
  - Provide reasonable default coverage for objtool, sanitizers, and
    profilers
 
  - Remove redundant OBJECT_FILES_NON_STANDARD, KASAN_SANITIZE, etc.
 
  - Remove the last use of tristate choice in drivers/rapidio/Kconfig
 
  - Various cleanups and fixes in Kconfig
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmZFlGcVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsG8voQALC8NtFpduWVfLRj2Qg6Ll/xf1vX
 2igcTJEOFHkeqXLGoT8dTDKLEipUBUvKyguPq66CGwVTe2g6zy/nUSXeVtFrUsIa
 msLTi8FqhqUo5lodNvGMRf8qqmuqcvnXoiQwIocF92jtsFy14bhiFY+n4HfcFNjj
 GOKwqBZYQUwY/VVb090efc7RfS9c7uwABJSBelSoxg3AGZriwjGy7Pw5aSKGgVYi
 inqL1eR6qwPP6z7CgQWM99soP+zwybFZmnQrsD9SniRBI4rtAat8Ih5jQFaSUFUQ
 lk2w0NQBRFN88/uR2IJ2GWuIlQ74WeJ+QnCqVuQ59tV5zw90wqSmLzngfPD057Dv
 JjNuhk0UyXVtpIg3lRtd4810ppNSTe33b9OM4O2H846W/crju5oDRNDHcflUXcwm
 Rmn5ho1rb5QVzDVejJbgwidnUInSgJ9PZcvXQ/RJVZPhpgsBzAY9pQexG1G3hviw
 y9UDrt6KP6bF9tHjmolmtdIes9Pj0c4dN6/Rdj4HS4hIQ/GDar0tnwvOvtfUctNL
 orJlBsA6GeMmDVXKkR0ytOCWRYqWWbyt8g70RVKQJfuHX7/hGyAQPaQ2/u4mQhC2
 aevYfbNJMj0VDfGz81HDBKFtkc5n+Ite8l157dHEl2LEabkOkRdNVcn7SNbOvZmd
 ZCSnZ31h7woGfNho
 =D5B/
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Avoid 'constexpr', which is a keyword in C23

 - Allow 'dtbs_check' and 'dt_compatible_check' run independently of
   'dt_binding_check'

 - Fix weak references to avoid GOT entries in position-independent code
   generation

 - Convert the last use of 'optional' property in arch/sh/Kconfig

 - Remove support for the 'optional' property in Kconfig

 - Remove support for Clang's ThinLTO caching, which does not work with
   the .incbin directive

 - Change the semantics of $(src) so it always points to the source
   directory, which fixes Makefile inconsistencies between upstream and
   downstream

 - Fix 'make tar-pkg' for RISC-V to produce a consistent package

 - Provide reasonable default coverage for objtool, sanitizers, and
   profilers

 - Remove redundant OBJECT_FILES_NON_STANDARD, KASAN_SANITIZE, etc.

 - Remove the last use of tristate choice in drivers/rapidio/Kconfig

 - Various cleanups and fixes in Kconfig

* tag 'kbuild-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (46 commits)
  kconfig: use sym_get_choice_menu() in sym_check_prop()
  rapidio: remove choice for enumeration
  kconfig: lxdialog: remove initialization with A_NORMAL
  kconfig: m/nconf: merge two item_add_str() calls
  kconfig: m/nconf: remove dead code to display value of bool choice
  kconfig: m/nconf: remove dead code to display children of choice members
  kconfig: gconf: show checkbox for choice correctly
  kbuild: use GCOV_PROFILE and KCSAN_SANITIZE in scripts/Makefile.modfinal
  Makefile: remove redundant tool coverage variables
  kbuild: provide reasonable defaults for tool coverage
  modules: Drop the .export_symbol section from the final modules
  kconfig: use menu_list_for_each_sym() in sym_check_choice_deps()
  kconfig: use sym_get_choice_menu() in conf_write_defconfig()
  kconfig: add sym_get_choice_menu() helper
  kconfig: turn defaults and additional prompt for choice members into error
  kconfig: turn missing prompt for choice members into error
  kconfig: turn conf_choice() into void function
  kconfig: use linked list in sym_set_changed()
  kconfig: gconf: use MENU_CHANGED instead of SYMBOL_CHANGED
  kconfig: gconf: remove debug code
  ...
2024-05-18 12:39:20 -07:00
Linus Torvalds
0cc6f45cec IOMMU Updates for Linux v6.10
Including:
 
 	- Core:
 	  - IOMMU memory usage observability - This will make the memory used
 	    for IO page tables explicitly visible.
 	  - Simplify arch_setup_dma_ops()
 
 	- Intel VT-d:
 	  - Consolidate domain cache invalidation
 	  - Remove private data from page fault message
 	  - Allocate DMAR fault interrupts locally
 	  - Cleanup and refactoring
 
 	- ARM-SMMUv2:
 	  - Support for fault debugging hardware on Qualcomm implementations
 	  - Re-land support for the ->domain_alloc_paging() callback
 
 	- ARM-SMMUv3:
 	  - Improve handling of MSI allocation failure
 	  - Drop support for the "disable_bypass" cmdline option
 	  - Major rework of the CD creation code, following on directly from the
 	    STE rework merged last time around.
 	  - Add unit tests for the new STE/CD manipulation logic
 
 	- AMD-Vi:
 	  - Final part of SVA changes with generic IO page fault handling
 
 	- Renesas IPMMU:
 	  - Add support for R8A779H0 hardware
 
 	- A couple smaller fixes and updates across the sub-tree
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmZHJMkACgkQK/BELZcB
 GuND1Q/+M4RN5jM66XCfhqoP8QaI8I7zDlPDd14ismx0bjtOZhoiXpptKkAA8guo
 7mS57MLqBw/hKYucm1mw+F1qi1HnRWSstKXiCPmzDm3UXYgZJlKkrOw6vydFeHJH
 zx2ei7TmBrc0SrsybWK3NWRfVBBkO8enGZTmti0DfHL/rOFcUM0LHegY51GcDaaH
 SlDr+LLDMeGynSQWhRlVNJVmEI5gpVPitY/mDUpVPoELiW9C0WGk8kPlR11z2pCR
 eUNiqGJUcGasOhmfiYnpJR462eg7J41glquu+YHj8ivPbbu3C4wxgruY/tR4dmJG
 8s6AMAWR53JzG2SrCCwtzyRPSXmKfvixF+VKmlB2Ksc7VAn1xA0DYnY5Tx99EtXu
 qcEaR4SICMti0urmBGo/cGFdXi2TB1ccXqwoRtp1N3KiYnnOaQdLNO9qZdl9uUTI
 uleXACzkCVSssSpBfGjFcPyHU4r3WjMfX0f5ZJPpFMoQmvwV1yeMX7xTEZz4Sxew
 cHfBt9FAW9+4mBMTQfokBt0hZ6jwKcYl/z3Xi2oD+Ik/Qrzx5kcLA8LZLEVRXIBa
 SZh2ASazq/dr8YoZ744VRmlmi+nISAIHbbQMeqQEQgYQh0HpwS9g5HtpsBzNP6aB
 91RHqZSccb/zNdi8e+RH79Y7pX/G5QcuVKcW6KQUBcAAb6hAgOg=
 =JUzp
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:
 "Core:
   - IOMMU memory usage observability - This will make the memory used
     for IO page tables explicitly visible.
   - Simplify arch_setup_dma_ops()

  Intel VT-d:
   - Consolidate domain cache invalidation
   - Remove private data from page fault message
   - Allocate DMAR fault interrupts locally
   - Cleanup and refactoring

  ARM-SMMUv2:
   - Support for fault debugging hardware on Qualcomm implementations
   - Re-land support for the ->domain_alloc_paging() callback

  ARM-SMMUv3:
   - Improve handling of MSI allocation failure
   - Drop support for the "disable_bypass" cmdline option
   - Major rework of the CD creation code, following on directly from
     the STE rework merged last time around.
   - Add unit tests for the new STE/CD manipulation logic

  AMD-Vi:
   - Final part of SVA changes with generic IO page fault handling

  Renesas IPMMU:
   - Add support for R8A779H0 hardware

  ... and a couple smaller fixes and updates across the sub-tree"

* tag 'iommu-updates-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (80 commits)
  iommu/arm-smmu-v3: Make the kunit into a module
  arm64: Properly clean up iommu-dma remnants
  iommu/amd: Enable Guest Translation after reading IOMMU feature register
  iommu/vt-d: Decouple igfx_off from graphic identity mapping
  iommu/amd: Fix compilation error
  iommu/arm-smmu-v3: Add unit tests for arm_smmu_write_entry
  iommu/arm-smmu-v3: Build the whole CD in arm_smmu_make_s1_cd()
  iommu/arm-smmu-v3: Move the CD generation for SVA into a function
  iommu/arm-smmu-v3: Allocate the CD table entry in advance
  iommu/arm-smmu-v3: Make arm_smmu_alloc_cd_ptr()
  iommu/arm-smmu-v3: Consolidate clearing a CD table entry
  iommu/arm-smmu-v3: Move the CD generation for S1 domains into a function
  iommu/arm-smmu-v3: Make CD programming use arm_smmu_write_entry()
  iommu/arm-smmu-v3: Add an ops indirection to the STE code
  iommu/arm-smmu-qcom: Don't build debug features as a kernel module
  iommu/amd: Add SVA domain support
  iommu: Add ops->domain_alloc_sva()
  iommu/amd: Initial SVA support for AMD IOMMU
  iommu/amd: Add support for enable/disable IOPF
  iommu/amd: Add IO page fault notifier handler
  ...
2024-05-18 10:55:13 -07:00
Linus Torvalds
70a663205d Probes updates for v6.10:
- tracing/probes: Adding new pseudo-types %pd and %pD support for dumping
   dentry name from 'struct dentry *' and file name from 'struct file *'.
 
 - uprobes: Some performance optimizations have been done.
  . Speed up the BPF uprobe event by delaying the fetching of the uprobe
    event arguments that are not used in BPF.
  . Avoid locking by speculatively checking whether uprobe event is valid.
  . Reduce lock contention by using read/write_lock instead of spinlock for
    uprobe list operation. This improved BPF uprobe benchmark result 43% on
    average.
 
 - rethook: Removes non-fatal warning messages when tracing stack from BPF
   and skip rcu_is_watching() validation in rethook if possible.
 
 - objpool: Optimizing objpool (which is used by kretprobes and fprobe as
   rethook backend storage) by inlining functions and avoid caching nr_cpu_ids
   because it is a const value.
 
 - fprobe: Add entry/exit callbacks types (code cleanup)
 - kprobes: Check ftrace was killed in kprobes if it uses ftrace.
 -----BEGIN PGP SIGNATURE-----
 
 iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmZFUxsbHG1hc2FtaS5o
 aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8b+fIH/A96/SeC5WRLhXmHfTCM
 IvKUea2n0b0oV/2pVfHqfkCBTICuUZ97Opd9VH9jLtjBOTh0fUOGZ2DNVGdSYfWm
 IIkS5dhuZxHXrSHEVYykwLHI3AOL7Q6Ny9EmOg1CNMidUkPMNtBvppsBYPlFU/B/
 qQJAvOdkVOnNITCaas0+MNgepoVVKdJzdNQ1I4WrGyG8isCZBaCYKo2QcGyheCNN
 y8NXvnVHgmgHQ8nTaeE5AawclFzFnhwHfPQPe1kiyGrx15b8K+VYmaZxPKv33A1a
 KT3TKJ1Ep7s7iWFh2iPVJzIwOXCmSnvNTKfNx/MDuKtO7UVfFwytoMEaekbmv3bG
 VqM=
 =n/mW
 -----END PGP SIGNATURE-----

Merge tag 'probes-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull probes updates from Masami Hiramatsu:

 - tracing/probes: Add new pseudo-types %pd and %pD support for dumping
   dentry name from 'struct dentry *' and file name from 'struct file *'

 - uprobes performance optimizations:
    - Speed up the BPF uprobe event by delaying the fetching of the
      uprobe event arguments that are not used in BPF
    - Avoid locking by speculatively checking whether uprobe event is
      valid
    - Reduce lock contention by using read/write_lock instead of
      spinlock for uprobe list operation. This improved BPF uprobe
      benchmark result 43% on average

 - rethook: Remove non-fatal warning messages when tracing stack from
   BPF and skip rcu_is_watching() validation in rethook if possible

 - objpool: Optimize objpool (which is used by kretprobes and fprobe as
   rethook backend storage) by inlining functions and avoid caching
   nr_cpu_ids because it is a const value

 - fprobe: Add entry/exit callbacks types (code cleanup)

 - kprobes: Check ftrace was killed in kprobes if it uses ftrace

* tag 'probes-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  kprobe/ftrace: bail out if ftrace was killed
  selftests/ftrace: Fix required features for VFS type test case
  objpool: cache nr_possible_cpus() and avoid caching nr_cpu_ids
  objpool: enable inlining objpool_push() and objpool_pop() operations
  rethook: honor CONFIG_FTRACE_VALIDATE_RCU_IS_WATCHING in rethook_try_get()
  ftrace: make extra rcu_is_watching() validation check optional
  uprobes: reduce contention on uprobes_tree access
  rethook: Remove warning messages printed for finding return address of a frame.
  fprobe: Add entry/exit callbacks types
  selftests/ftrace: add fprobe test cases for VFS type "%pd" and "%pD"
  selftests/ftrace: add kprobe test cases for VFS type "%pd" and "%pD"
  Documentation: tracing: add new type '%pd' and '%pD' for kprobe
  tracing/probes: support '%pD' type for print struct file's name
  tracing/probes: support '%pd' type for print struct dentry's name
  uprobes: add speculative lockless system-wide uprobe filter check
  uprobes: prepare uprobe args buffer lazily
  uprobes: encapsulate preparation of uprobe args buffer
2024-05-17 18:29:30 -07:00
Samuel Holland
ad5643cf2f
riscv: Define TASK_SIZE_MAX for __access_ok()
TASK_SIZE_MAX should be set to a constant value, at least the largest
valid userspace address under any runtime configuration. This optimizes
the check in __access_ok(), which no longer needs to compute the runtime
value of TASK_SIZE. The check does not need to be exact, as long as it
accepts all valid userspace addresses and rejects all valid kernel
addresses; well-behaved programs will never fail the access_ok() check.

For RISC-V, which requires all virtual addresses to be sign extended,
the optimal choice is LONG_MAX because it simplifies the limit
comparison to a sign bit test.

This removes about half of the references to pgtable_l[45]_enabled.

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240327143858.711792-3-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-16 12:59:58 -07:00
Samuel Holland
9ad6bb3298
riscv: Remove PGDIR_SIZE_L3 and TASK_SIZE_MIN
TASK_SIZE_MIN is unused since commit 085e2ff9ae ("efi: libstub: Drop
randomization of runtime memory map"). PGDIR_SIZE_L3 is only used in the
definition of TASK_SIZE_MIN.

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240327143858.711792-2-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-16 12:59:57 -07:00
Stephen Brennan
1a7d0890dd kprobe/ftrace: bail out if ftrace was killed
If an error happens in ftrace, ftrace_kill() will prevent disarming
kprobes. Eventually, the ftrace_ops associated with the kprobes will be
freed, yet the kprobes will still be active, and when triggered, they
will use the freed memory, likely resulting in a page fault and panic.

This behavior can be reproduced quite easily, by creating a kprobe and
then triggering a ftrace_kill(). For simplicity, we can simulate an
ftrace error with a kernel module like [1]:

[1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer

  sudo perf probe --add commit_creds
  sudo perf trace -e probe:commit_creds
  # In another terminal
  make
  sudo insmod ftrace_killer.ko  # calls ftrace_kill(), simulating bug
  # Back to perf terminal
  # ctrl-c
  sudo perf probe --del commit_creds

After a short period, a page fault and panic would occur as the kprobe
continues to execute and uses the freed ftrace_ops. While ftrace_kill()
is supposed to be used only in extreme circumstances, it is invoked in
FTRACE_WARN_ON() and so there are many places where an unexpected bug
could be triggered, yet the system may continue operating, possibly
without the administrator noticing. If ftrace_kill() does not panic the
system, then we should do everything we can to continue operating,
rather than leave a ticking time bomb.

Link: https://lore.kernel.org/all/20240501162956.229427-1-stephen.s.brennan@oracle.com/

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-05-16 07:23:30 +09:00
Linus Torvalds
f4b0c4b508 ARM:
* Move a lot of state that was previously stored on a per vcpu
   basis into a per-CPU area, because it is only pertinent to the
   host while the vcpu is loaded. This results in better state
   tracking, and a smaller vcpu structure.
 
 * Add full handling of the ERET/ERETAA/ERETAB instructions in
   nested virtualisation. The last two instructions also require
   emulating part of the pointer authentication extension.
   As a result, the trap handling of pointer authentication has
   been greatly simplified.
 
 * Turn the global (and not very scalable) LPI translation cache
   into a per-ITS, scalable cache, making non directly injected
   LPIs much cheaper to make visible to the vcpu.
 
 * A batch of pKVM patches, mostly fixes and cleanups, as the
   upstreaming process seems to be resuming. Fingers crossed!
 
 * Allocate PPIs and SGIs outside of the vcpu structure, allowing
   for smaller EL2 mapping and some flexibility in implementing
   more or less than 32 private IRQs.
 
 * Purge stale mpidr_data if a vcpu is created after the MPIDR
   map has been created.
 
 * Preserve vcpu-specific ID registers across a vcpu reset.
 
 * Various minor cleanups and improvements.
 
 LoongArch:
 
 * Add ParaVirt IPI support.
 
 * Add software breakpoint support.
 
 * Add mmio trace events support.
 
 RISC-V:
 
 * Support guest breakpoints using ebreak
 
 * Introduce per-VCPU mp_state_lock and reset_cntx_lock
 
 * Virtualize SBI PMU snapshot and counter overflow interrupts
 
 * New selftests for SBI PMU and Guest ebreak
 
 * Some preparatory work for both TDX and SNP page fault handling.
   This also cleans up the page fault path, so that the priorities
   of various kinds of fauls (private page, no memory, write
   to read-only slot, etc.) are easier to follow.
 
 x86:
 
 * Minimize amount of time that shadow PTEs remain in the special
   REMOVED_SPTE state.  This is a state where the mmu_lock is held for
   reading but concurrent accesses to the PTE have to spin; shortening
   its use allows other vCPUs to repopulate the zapped region while
   the zapper finishes tearing down the old, defunct page tables.
 
 * Advertise the max mappable GPA in the "guest MAXPHYADDR" CPUID field,
   which is defined by hardware but left for software use.  This lets KVM
   communicate its inability to map GPAs that set bits 51:48 on hosts
   without 5-level nested page tables.  Guest firmware is expected to
   use the information when mapping BARs; this avoids that they end up at
   a legal, but unmappable, GPA.
 
 * Fixed a bug where KVM would not reject accesses to MSR that aren't
   supposed to exist given the vCPU model and/or KVM configuration.
 
 * As usual, a bunch of code cleanups.
 
 x86 (AMD):
 
 * Implement a new and improved API to initialize SEV and SEV-ES VMs, which
   will also be extendable to SEV-SNP.  The new API specifies the desired
   encryption in KVM_CREATE_VM and then separately initializes the VM.
   The new API also allows customizing the desired set of VMSA features;
   the features affect the measurement of the VM's initial state, and
   therefore enabling them cannot be done tout court by the hypervisor.
 
   While at it, the new API includes two bugfixes that couldn't be
   applied to the old one without a flag day in userspace or without
   affecting the initial measurement.  When a SEV-ES VM is created with
   the new VM type, KVM_GET_REGS/KVM_SET_REGS and friends are
   rejected once the VMSA has been encrypted.  Also, the FPU and AVX
   state will be synchronized and encrypted too.
 
 * Support for GHCB version 2 as applicable to SEV-ES guests.  This, once
   more, is only accessible when using the new KVM_SEV_INIT2 flow for
   initialization of SEV-ES VMs.
 
 x86 (Intel):
 
 * An initial bunch of prerequisite patches for Intel TDX were merged.
   They generally don't do anything interesting.  The only somewhat user
   visible change is a new debugging mode that checks that KVM's MMU
   never triggers a #VE virtualization exception in the guest.
 
 * Clear vmcs.EXIT_QUALIFICATION when synthesizing an EPT Misconfig VM-Exit to
   L1, as per the SDM.
 
 Generic:
 
 * Use vfree() instead of kvfree() for allocations that always use vcalloc()
   or __vcalloc().
 
 * Remove .change_pte() MMU notifier - the changes to non-KVM code are
   small and Andrew Morton asked that I also take those through the KVM
   tree.  The callback was only ever implemented by KVM (which was also the
   original user of MMU notifiers) but it had been nonfunctional ever since
   calls to set_pte_at_notify were wrapped with invalidate_range_start
   and invalidate_range_end... in 2012.
 
 Selftests:
 
 * Enhance the demand paging test to allow for better reporting and stressing
   of UFFD performance.
 
 * Convert the steal time test to generate TAP-friendly output.
 
 * Fix a flaky false positive in the xen_shinfo_test due to comparing elapsed
   time across two different clock domains.
 
 * Skip the MONITOR/MWAIT test if the host doesn't actually support MWAIT.
 
 * Avoid unnecessary use of "sudo" in the NX hugepage test wrapper shell
   script, to play nice with running in a minimal userspace environment.
 
 * Allow skipping the RSEQ test's sanity check that the vCPU was able to
   complete a reasonable number of KVM_RUNs, as the assert can fail on a
   completely valid setup.  If the test is run on a large-ish system that is
   otherwise idle, and the test isn't affined to a low-ish number of CPUs, the
   vCPU task can be repeatedly migrated to CPUs that are in deep sleep states,
   which results in the vCPU having very little net runtime before the next
   migration due to high wakeup latencies.
 
 * Define _GNU_SOURCE for all selftests to fix a warning that was introduced by
   a change to kselftest_harness.h late in the 6.9 cycle, and because forcing
   every test to #define _GNU_SOURCE is painful.
 
 * Provide a global pseudo-RNG instance for all tests, so that library code can
   generate random, but determinstic numbers.
 
 * Use the global pRNG to randomly force emulation of select writes from guest
   code on x86, e.g. to help validate KVM's emulation of locked accesses.
 
 * Allocate and initialize x86's GDT, IDT, TSS, segments, and default exception
   handlers at VM creation, instead of forcing tests to manually trigger the
   related setup.
 
 Documentation:
 
 * Fix a goof in the KVM_CREATE_GUEST_MEMFD documentation.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmZE878UHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroOukQf+LcvZsWtrC7Wd5K9SQbYXaS4Rk6P6
 JHoQW2d0hUN893J2WibEw+l1J/0vn5JumqHXyZgJ7CbaMtXkWWQTwDSDLuURUKpv
 XNB3Sb17G87NH+s1tOh0tA9h5upbtlHVHvrtIwdbb9+XHgQ6HTL4uk+HdfO/p9fW
 cWBEZAKoWcCIa99Numv3pmq5vdrvBlNggwBugBS8TH69EKMw+V1Vu1SFkIdNDTQk
 NJJ28cohoP3wnwlIHaXSmU4RujipPH3Lm/xupyA5MwmzO713eq2yUqV49jzhD5/I
 MA4Ruvgrdm4wpp89N9lQMyci91u6q7R9iZfMu0tSg2qYI3UPKIdstd8sOA==
 =2lED
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "ARM:

   - Move a lot of state that was previously stored on a per vcpu basis
     into a per-CPU area, because it is only pertinent to the host while
     the vcpu is loaded. This results in better state tracking, and a
     smaller vcpu structure.

   - Add full handling of the ERET/ERETAA/ERETAB instructions in nested
     virtualisation. The last two instructions also require emulating
     part of the pointer authentication extension. As a result, the trap
     handling of pointer authentication has been greatly simplified.

   - Turn the global (and not very scalable) LPI translation cache into
     a per-ITS, scalable cache, making non directly injected LPIs much
     cheaper to make visible to the vcpu.

   - A batch of pKVM patches, mostly fixes and cleanups, as the
     upstreaming process seems to be resuming. Fingers crossed!

   - Allocate PPIs and SGIs outside of the vcpu structure, allowing for
     smaller EL2 mapping and some flexibility in implementing more or
     less than 32 private IRQs.

   - Purge stale mpidr_data if a vcpu is created after the MPIDR map has
     been created.

   - Preserve vcpu-specific ID registers across a vcpu reset.

   - Various minor cleanups and improvements.

  LoongArch:

   - Add ParaVirt IPI support

   - Add software breakpoint support

   - Add mmio trace events support

  RISC-V:

   - Support guest breakpoints using ebreak

   - Introduce per-VCPU mp_state_lock and reset_cntx_lock

   - Virtualize SBI PMU snapshot and counter overflow interrupts

   - New selftests for SBI PMU and Guest ebreak

   - Some preparatory work for both TDX and SNP page fault handling.

     This also cleans up the page fault path, so that the priorities of
     various kinds of fauls (private page, no memory, write to read-only
     slot, etc.) are easier to follow.

  x86:

   - Minimize amount of time that shadow PTEs remain in the special
     REMOVED_SPTE state.

     This is a state where the mmu_lock is held for reading but
     concurrent accesses to the PTE have to spin; shortening its use
     allows other vCPUs to repopulate the zapped region while the zapper
     finishes tearing down the old, defunct page tables.

   - Advertise the max mappable GPA in the "guest MAXPHYADDR" CPUID
     field, which is defined by hardware but left for software use.

     This lets KVM communicate its inability to map GPAs that set bits
     51:48 on hosts without 5-level nested page tables. Guest firmware
     is expected to use the information when mapping BARs; this avoids
     that they end up at a legal, but unmappable, GPA.

   - Fixed a bug where KVM would not reject accesses to MSR that aren't
     supposed to exist given the vCPU model and/or KVM configuration.

   - As usual, a bunch of code cleanups.

  x86 (AMD):

   - Implement a new and improved API to initialize SEV and SEV-ES VMs,
     which will also be extendable to SEV-SNP.

     The new API specifies the desired encryption in KVM_CREATE_VM and
     then separately initializes the VM. The new API also allows
     customizing the desired set of VMSA features; the features affect
     the measurement of the VM's initial state, and therefore enabling
     them cannot be done tout court by the hypervisor.

     While at it, the new API includes two bugfixes that couldn't be
     applied to the old one without a flag day in userspace or without
     affecting the initial measurement. When a SEV-ES VM is created with
     the new VM type, KVM_GET_REGS/KVM_SET_REGS and friends are rejected
     once the VMSA has been encrypted. Also, the FPU and AVX state will
     be synchronized and encrypted too.

   - Support for GHCB version 2 as applicable to SEV-ES guests.

     This, once more, is only accessible when using the new
     KVM_SEV_INIT2 flow for initialization of SEV-ES VMs.

  x86 (Intel):

   - An initial bunch of prerequisite patches for Intel TDX were merged.

     They generally don't do anything interesting. The only somewhat
     user visible change is a new debugging mode that checks that KVM's
     MMU never triggers a #VE virtualization exception in the guest.

   - Clear vmcs.EXIT_QUALIFICATION when synthesizing an EPT Misconfig
     VM-Exit to L1, as per the SDM.

  Generic:

   - Use vfree() instead of kvfree() for allocations that always use
     vcalloc() or __vcalloc().

   - Remove .change_pte() MMU notifier - the changes to non-KVM code are
     small and Andrew Morton asked that I also take those through the
     KVM tree.

     The callback was only ever implemented by KVM (which was also the
     original user of MMU notifiers) but it had been nonfunctional ever
     since calls to set_pte_at_notify were wrapped with
     invalidate_range_start and invalidate_range_end... in 2012.

  Selftests:

   - Enhance the demand paging test to allow for better reporting and
     stressing of UFFD performance.

   - Convert the steal time test to generate TAP-friendly output.

   - Fix a flaky false positive in the xen_shinfo_test due to comparing
     elapsed time across two different clock domains.

   - Skip the MONITOR/MWAIT test if the host doesn't actually support
     MWAIT.

   - Avoid unnecessary use of "sudo" in the NX hugepage test wrapper
     shell script, to play nice with running in a minimal userspace
     environment.

   - Allow skipping the RSEQ test's sanity check that the vCPU was able
     to complete a reasonable number of KVM_RUNs, as the assert can fail
     on a completely valid setup.

     If the test is run on a large-ish system that is otherwise idle,
     and the test isn't affined to a low-ish number of CPUs, the vCPU
     task can be repeatedly migrated to CPUs that are in deep sleep
     states, which results in the vCPU having very little net runtime
     before the next migration due to high wakeup latencies.

   - Define _GNU_SOURCE for all selftests to fix a warning that was
     introduced by a change to kselftest_harness.h late in the 6.9
     cycle, and because forcing every test to #define _GNU_SOURCE is
     painful.

   - Provide a global pseudo-RNG instance for all tests, so that library
     code can generate random, but determinstic numbers.

   - Use the global pRNG to randomly force emulation of select writes
     from guest code on x86, e.g. to help validate KVM's emulation of
     locked accesses.

   - Allocate and initialize x86's GDT, IDT, TSS, segments, and default
     exception handlers at VM creation, instead of forcing tests to
     manually trigger the related setup.

  Documentation:

   - Fix a goof in the KVM_CREATE_GUEST_MEMFD documentation"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (225 commits)
  selftests/kvm: remove dead file
  KVM: selftests: arm64: Test vCPU-scoped feature ID registers
  KVM: selftests: arm64: Test that feature ID regs survive a reset
  KVM: selftests: arm64: Store expected register value in set_id_regs
  KVM: selftests: arm64: Rename helper in set_id_regs to imply VM scope
  KVM: arm64: Only reset vCPU-scoped feature ID regs once
  KVM: arm64: Reset VM feature ID regs from kvm_reset_sys_regs()
  KVM: arm64: Rename is_id_reg() to imply VM scope
  KVM: arm64: Destroy mpidr_data for 'late' vCPU creation
  KVM: arm64: Use hVHE in pKVM by default on CPUs with VHE support
  KVM: arm64: Fix hvhe/nvhe early alias parsing
  KVM: SEV: Allow per-guest configuration of GHCB protocol version
  KVM: SEV: Add GHCB handling for termination requests
  KVM: SEV: Add GHCB handling for Hypervisor Feature Support requests
  KVM: SEV: Add support to handle AP reset MSR protocol
  KVM: x86: Explicitly zero kvm_caps during vendor module load
  KVM: x86: Fully re-initialize supported_mce_cap on vendor module load
  KVM: x86: Fully re-initialize supported_vm_types on vendor module load
  KVM: x86/mmu: Sanity check that __kvm_faultin_pfn() doesn't create noslot pfns
  KVM: x86/mmu: Initialize kvm_page_fault's pfn and hva to error values
  ...
2024-05-15 14:46:43 -07:00
Linus Torvalds
a49468240e Modules changes for v6.10-rc1
Finally something fun. Mike Rapoport does some cleanup to allow us to
 take out module_alloc() out of modules into a new paint shedded execmem_alloc()
 and execmem_free() so to make emphasis these helpers are actually used outside
 of modules. It starts with a no-functional changes API rename / placeholders
 to then allow architectures to define their requirements into a new shiny
 struct execmem_info with ranges, and requirements for those ranges. Archs
 now can intitialize this execmem_info as the last part of mm_core_init() if
 they have to diverge from the norm. Each range is a known type clearly
 articulated and spelled out in enum execmem_type.
 
 Although a lot of this is major cleanup and prep work for future enhancements an
 immediate clear gain is we get to enable KPROBES without MODULES now. That is
 ultimately what motiviated to pick this work up again, now with smaller goal as
 concrete stepping stone.
 
 This has been sitting on linux-next for a little less than a month, a few issues
 were found already and fixed, in particular an odd mips boot issue. Arch folks
 reviewed the code too. This is ready for wider exposure and testing.
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmZDHfMSHG1jZ3JvZkBr
 ZXJuZWwub3JnAAoJEM4jHQowkoinfIwP/iFsr89v9BjWdRTqzufuHwjOxvFymWxU
 BbEpOppRny3CckDU9ag9hLIlUaSL1Bg56Zb+znzp5stKOoiQYMDBvjSYdfybPxW2
 mRS6SClMF1ubWbzdysdp5Ld9u8T0MQPCLX+P2pKhZRGi0wjkBf5WEkTje+muJKI3
 4vYkXS7bNhuTwRQ+EGfze4+AeleGdQJKDWFY00TW9mZTTBADjfHyYU5o0m9ijf5l
 3V/weUznODvjVJStbIF7wEQ845Ae02LN1zXfsloIOuBMhcMju+x8IjPgPbD0KhX2
 yA48q7mVWkirYp0L5GSQchtqV1GBiP0NK1xXWEpyx6EqQZ4RJCsQhlhjijoExYBR
 ylP4bqiGVuE3IN075X0OzGCnmOStuzwssfDmug0sMAZH/MvmOQ21WzZdet2nLMas
 wwJArHqZsBI9BnBlvH9ZM4Y9f1zC7iR1wULaNGwXLPx34X9PIch8Yk+RElP1kMFQ
 +YrjOuWPjl63pmSkrkk+Pe2eesMPcPB41M6Q2iCjDlp0iBp63LIx2XISUbTf0ljM
 EsI4ZQseYpx+BmC7AuQfmXvEOjuXII9z072/artVWcB2u/87ixIprnqZVhcs/spy
 73DnXB4ufor2PCCC5Xrb/6kT6G+PzF3VwTbHQ1D+fYZ5n2qdyG+LKxgXbtxsRVTp
 oUg+Z/AJaCMt
 =Nsg4
 -----END PGP SIGNATURE-----

Merge tag 'modules-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull modules updates from Luis Chamberlain:
 "Finally something fun. Mike Rapoport does some cleanup to allow us to
  take out module_alloc() out of modules into a new paint shedded
  execmem_alloc() and execmem_free() so to make emphasis these helpers
  are actually used outside of modules.

  It starts with a non-functional changes API rename / placeholders to
  then allow architectures to define their requirements into a new shiny
  struct execmem_info with ranges, and requirements for those ranges.

  Archs now can intitialize this execmem_info as the last part of
  mm_core_init() if they have to diverge from the norm. Each range is a
  known type clearly articulated and spelled out in enum execmem_type.

  Although a lot of this is major cleanup and prep work for future
  enhancements an immediate clear gain is we get to enable KPROBES
  without MODULES now. That is ultimately what motiviated to pick this
  work up again, now with smaller goal as concrete stepping stone"

* tag 'modules-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
  bpf: remove CONFIG_BPF_JIT dependency on CONFIG_MODULES of
  kprobes: remove dependency on CONFIG_MODULES
  powerpc: use CONFIG_EXECMEM instead of CONFIG_MODULES where appropriate
  x86/ftrace: enable dynamic ftrace without CONFIG_MODULES
  arch: make execmem setup available regardless of CONFIG_MODULES
  powerpc: extend execmem_params for kprobes allocations
  arm64: extend execmem_info for generated code allocations
  riscv: extend execmem_params for generated code allocations
  mm/execmem, arch: convert remaining overrides of module_alloc to execmem
  mm/execmem, arch: convert simple overrides of module_alloc to execmem
  mm: introduce execmem_alloc() and execmem_free()
  module: make module_memory_{alloc,free} more self-contained
  sparc: simplify module_alloc()
  nios2: define virtual address space for modules
  mips: module: rename MODULE_START to MODULES_VADDR
  arm64: module: remove unneeded call to kasan_alloc_module_shadow()
  kallsyms: replace deprecated strncpy with strscpy
  module: allow UNUSED_KSYMS_WHITELIST to be relative against objtree.
2024-05-15 14:05:08 -07:00
Linus Torvalds
a19264d086 printk changes for 6.10
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmZEh6AACgkQUqAMR0iA
 lPJzFw/+PCKHOFI1Z5Aj9negx97sYKAIuJrY9pFQfaxVUBzqZgIEXB/rkn4yurad
 KpK1UppMnLset+GLm+95+BTyP7G256q2jhWZ9u55i089YGrUKcibp+Jy9cCO02r5
 1c0+ZkyMxONgPnE+WcOW7B5p34cie6NFdvqrRzrW5WB4eLaGs3ksBow2j3jXWTii
 aOrsPPZWmT6wEJd4Hm1kZIgnz8gmlsm+VGTSHjjEvWWtvh5garKxCQ3COmdw1WAc
 dL+YjYqTIOQsifJeOpECy8+hZA4uoKpw2dWxfdHEH7F8RkhdumQdWxiGON+KXwXA
 cG1rIaas0gGvVpcvja/bPiATwzqTmXlGAHlrwiDEeiNqh/VckinDw/S82QdIVTii
 qttE2yv8cAVCpsk8GVjuE7unZREc0Ao2tAIz3on7dzFgVGVsK3mJBGAiqVJWDA/A
 3jlFsMoM899IJJ8Fvg0rcu/vkwE4ViiQCurcPgWWqPicHC310PSJ6O0cImbBsL+U
 kQxpkpEUnlgiDy19vKzhHlGR89xxLUxIiq78TRCYrM+NQ4PCvdGQMHe/Wm5EfhPx
 bgzYcNsWjmN4fzokIl+a641wvTCqiUmUqoy7TU+a8a2ssBNaVrHubMrJzkl2OLts
 miLz0xXG+RZA0Z1FNqy3+3EyxoGmUJqjM9jomDAxPvMvrNQjMHA=
 =y45P
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

 - Use no_printk() instead of "if (0) printk()" constructs to avoid
   generating printk index for messages disabled at compile time

 - Remove deprecated strncpy/strcpy from printk.c

 - Remove redundant CONFIG_BASE_FULL in favor of CONFIG_BASE_SMALL

* tag 'printk-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printk: cleanup deprecated uses of strncpy/strcpy
  printk: Remove redundant CONFIG_BASE_FULL
  printk: Change type of CONFIG_BASE_SMALL to bool
  printk: Fix LOG_CPU_MAX_BUF_SHIFT when BASE_SMALL is enabled
  ceph: Use no_printk() helper
  dyndbg: Use *no_printk() helpers
  dev_printk: Add and use dev_no_printk()
  printk: Let no_printk() use _printk()
2024-05-15 12:34:46 -07:00
Linus Torvalds
1b294a1f35 Networking changes for 6.10.
Core & protocols
 ----------------
 
  - Complete rework of garbage collection of AF_UNIX sockets.
    AF_UNIX is prone to forming reference count cycles due to fd passing
    functionality. New method based on Tarjan's Strongly Connected Components
    algorithm should be both faster and remove a lot of workarounds
    we accumulated over the years.
 
  - Add TCP fraglist GRO support, allowing chaining multiple TCP packets
    and forwarding them together. Useful for small switches / routers which
    lack basic checksum offload in some scenarios (e.g. PPPoE).
 
  - Support using SMP threads for handling packet backlog i.e. packet
    processing from software interfaces and old drivers which don't
    use NAPI. This helps move the processing out of the softirq jumble.
 
  - Continue work of converting from rtnl lock to RCU protection.
    Don't require rtnl lock when reading: IPv6 routing FIB, IPv6 address
    labels, netdev threaded NAPI sysfs files, bonding driver's sysfs files,
    MPLS devconf, IPv4 FIB rules, netns IDs, tcp metrics, TC Qdiscs,
    neighbor entries, ARP entries via ioctl(SIOCGARP), a lot of the link
    information available via rtnetlink.
 
  - Small optimizations from Eric to UDP wake up handling, memory accounting,
    RPS/RFS implementation, TCP packet sizing etc.
 
  - Allow direct page recycling in the bulk API used by XDP, for +2% PPS.
 
  - Support peek with an offset on TCP sockets.
 
  - Add MPTCP APIs for querying last time packets were received/sent/acked,
    and whether MPTCP "upgrade" succeeded on a TCP socket.
 
  - Add intra-node communication shortcut to improve SMC performance.
 
  - Add IPv6 (and IPv{4,6}-over-IPv{4,6}) support to the GTP protocol driver.
 
  - Add HSR-SAN (RedBOX) mode of operation to the HSR protocol driver.
 
  - Add reset reasons for tracing what caused a TCP reset to be sent.
 
  - Introduce direction attribute for xfrm (IPSec) states.
    State can be used either for input or output packet processing.
 
 Things we sprinkled into general kernel code
 --------------------------------------------
 
  - Add bitmap_{read,write}(), bitmap_size(), expose BYTES_TO_BITS().
    This required touch-ups and renaming of a few existing users.
 
  - Add Endian-dependent __counted_by_{le,be} annotations.
 
  - Make building selftests "quieter" by printing summaries like
    "CC object.o" rather than full commands with all the arguments.
 
 Netfilter
 ---------
 
  - Use GFP_KERNEL to clone elements, to deal better with OOM situations
    and avoid failures in the .commit step.
 
 BPF
 ---
 
  - Add eBPF JIT for ARCv2 CPUs.
 
  - Support attaching kprobe BPF programs through kprobe_multi link in
    a session mode, meaning, a BPF program is attached to both function entry
    and return, the entry program can decide if the return program gets
    executed and the entry program can share u64 cookie value with return
    program. "Session mode" is a common use-case for tetragon and bpftrace.
 
  - Add the ability to specify and retrieve BPF cookie for raw tracepoint
    programs in order to ease migration from classic to raw tracepoints.
 
  - Add an internal-only BPF per-CPU instruction for resolving per-CPU
    memory addresses and implement support in x86, ARM64 and RISC-V JITs.
    This allows inlining functions which need to access per-CPU state.
 
  - Optimize x86 BPF JIT's emit_mov_imm64, and add support for various
    atomics in bpf_arena which can be JITed as a single x86 instruction.
    Support BPF arena on ARM64.
 
  - Add a new bpf_wq API for deferring events and refactor process-context
    bpf_timer code to keep common code where possible.
 
  - Harden the BPF verifier's and/or/xor value tracking.
 
  - Introduce crypto kfuncs to let BPF programs call kernel crypto APIs.
 
  - Support bpf_tail_call_static() helper for BPF programs with GCC 13.
 
  - Add bpf_preempt_{disable,enable}() kfuncs in order to allow a BPF
    program to have code sections where preemption is disabled.
 
 Driver API
 ----------
 
  - Skip software TC processing completely if all installed rules are
    marked as HW-only, instead of checking the HW-only flag rule by rule.
 
  - Add support for configuring PoE (Power over Ethernet), similar to
    the already existing support for PoDL (Power over Data Line) config.
 
  - Initial bits of a queue control API, for now allowing a single queue
    to be reset without disturbing packet flow to other queues.
 
  - Common (ethtool) statistics for hardware timestamping.
 
 Tests and tooling
 -----------------
 
  - Remove the need to create a config file to run the net forwarding tests
    so that a naive "make run_tests" can exercise them.
 
  - Define a method of writing tests which require an external endpoint
    to communicate with (to send/receive data towards the test machine).
    Add a few such tests.
 
  - Create a shared code library for writing Python tests. Expose the YAML
    Netlink library from tools/ to the tests for easy Netlink access.
 
  - Move netfilter tests under net/, extend them, separate performance tests
    from correctness tests, and iron out issues found by running them
    "on every commit".
 
  - Refactor BPF selftests to use common network helpers.
 
  - Further work filling in YAML definitions of Netlink messages for:
    nftables, team driver, bonding interfaces, vlan interfaces, VF info,
    TC u32 mark, TC police action.
 
  - Teach Python YAML Netlink to decode attribute policies.
 
  - Extend the definition of the "indexed array" construct in the specs
    to cover arrays of scalars rather than just nests.
 
  - Add hyperlinks between definitions in generated Netlink docs.
 
 Drivers
 -------
 
  - Make sure unsupported flower control flags are rejected by drivers,
    and make more drivers report errors directly to the application rather
    than dmesg (large number of driver changes from Asbjørn Sloth Tønnesen).
 
  - Ethernet high-speed NICs:
    - Broadcom (bnxt):
      - support multiple RSS contexts and steering traffic to them
      - support XDP metadata
      - make page pool allocations more NUMA aware
    - Intel (100G, ice, idpf):
      - extract datapath code common among Intel drivers into a library
      - use fewer resources in switchdev by sharing queues with the PF
      - add PFCP filter support
      - add Ethernet filter support
      - use a spinlock instead of HW lock in PTP clock ops
      - support 5 layer Tx scheduler topology
    - nVidia/Mellanox:
      - 800G link modes and 100G SerDes speeds
      - per-queue IRQ coalescing configuration
    - Marvell Octeon:
      - support offloading TC packet mark action
 
  - Ethernet NICs consumer, embedded and virtual:
    - stop lying about skb->truesize in USB Ethernet drivers, it messes up
      TCP memory calculations
    - Google cloud vNIC:
      - support changing ring size via ethtool
      - support ring reset using the queue control API
    - VirtIO net:
      - expose flow hash from RSS to XDP
      - per-queue statistics
      - add selftests
    - Synopsys (stmmac):
      - support controllers which require an RX clock signal from the MII
        bus to perform their hardware initialization
    - TI:
      - icssg_prueth: support ICSSG-based Ethernet on AM65x SR1.0 devices
      - icssg_prueth: add SW TX / RX Coalescing based on hrtimers
      - cpsw: minimal XDP support
    - Renesas (ravb):
      - support describing the MDIO bus
    - Realtek (r8169):
      - add support for RTL8168M
    - Microchip Sparx5:
      - matchall and flower actions mirred and redirect
 
  - Ethernet switches:
    - nVidia/Mellanox:
      - improve events processing performance
    - Marvell:
      - add support for MV88E6250 family internal PHYs
    - Microchip:
      - add DCB and DSCP mapping support for KSZ switches
      - vsc73xx: convert to PHYLINK
    - Realtek:
      - rtl8226b/rtl8221b: add C45 instances and SerDes switching
 
  - Many driver changes related to PHYLIB and PHYLINK deprecated API cleanup.
 
  - Ethernet PHYs:
    - Add a new driver for Airoha EN8811H 2.5 Gigabit PHY.
    - micrel: lan8814: add support for PPS out and external timestamp trigger
 
  - WiFi:
    - Disable Wireless Extensions (WEXT) in all Wi-Fi 7 devices drivers.
      Modern devices can only be configured using nl80211.
    - mac80211/cfg80211
      - handle color change per link for WiFi 7 Multi-Link Operation
    - Intel (iwlwifi):
      - don't support puncturing in 5 GHz
      - support monitor mode on passive channels
      - BZ-W device support
      - P2P with HE/EHT support
      - re-add support for firmware API 90
      - provide channel survey information for Automatic Channel Selection
    - MediaTek (mt76):
      - mt7921 LED control
      - mt7925 EHT radiotap support
      - mt7920e PCI support
    - Qualcomm (ath11k):
      - P2P support for QCA6390, WCN6855 and QCA2066
      - support hibernation
      - ieee80211-freq-limit Device Tree property support
    - Qualcomm (ath12k):
      - refactoring in preparation of multi-link support
      - suspend and hibernation support
      - ACPI support
      - debugfs support, including dfs_simulate_radar support
    - RealTek:
      - rtw88: RTL8723CS SDIO device support
      - rtw89: RTL8922AE Wi-Fi 7 PCI device support
      - rtw89: complete features of new WiFi 7 chip 8922AE including
        BT-coexistence and Wake-on-WLAN
      - rtw89: use BIOS ACPI settings to set TX power and channels
      - rtl8xxxu: enable Management Frame Protection (MFP) support
 
  - Bluetooth:
    - support for Intel BlazarI and Filmore Peak2 (BE201)
    - support for MediaTek MT7921S SDIO
    - initial support for Intel PCIe BT driver
    - remove HCI_AMP support
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmZD6sQACgkQMUZtbf5S
 IrtLYw/+I73ePGIye37o2jpbodcLAUZVfF3r6uYUzK8hokEcKD0QVJa9w7PizLZ3
 UO45ClOXFLJCkfP4reFenLfxGCel2AJI+F7VFl2xaO2XgrcH/lnVrHqKZEAEXjls
 KoYMnShIolv7h2MKP6hHtyTi2j1wvQUKsZC71o9/fuW+4fUT8gECx1YtYcL73wrw
 gEMdlUgBYC3jiiCUHJIFX6iPJ2t/TC+q1eIIF2K/Osrk2kIqQhzoozcL4vpuAZQT
 99ljx/qRelXa8oppDb7nM5eulg7WY8ZqxEfFZphTMC5nLEGzClxuOTTl2kDYI/D/
 UZmTWZDY+F5F0xvNk2gH84qVJXBOVDoobpT7hVA/tDuybobc/kvGDzRayEVqVzKj
 Q0tPlJs+xBZpkK5TVnxaFLJVOM+p1Xosxy3kNVXmuYNBvT/R89UbJiCrUKqKZF+L
 z/1mOYUv8UklHqYAeuJSptHvqJjTGa/fsEYP7dAUBbc1N2eVB8mzZ4mgU5rYXbtC
 E6UXXiWnoSRm8bmco9QmcWWoXt5UGEizHSJLz6t1R5Df/YmXhWlytll5aCwY1ksf
 FNoL7S4u7AZThL1Nwi7yUs4CAjhk/N4aOsk+41S0sALCx30BJuI6UdesAxJ0lu+Z
 fwCQYbs27y4p7mBLbkYwcQNxAxGm7PSK4yeyRIy2njiyV4qnLf8=
 =EsC2
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core & protocols:

   - Complete rework of garbage collection of AF_UNIX sockets.

     AF_UNIX is prone to forming reference count cycles due to fd
     passing functionality. New method based on Tarjan's Strongly
     Connected Components algorithm should be both faster and remove a
     lot of workarounds we accumulated over the years.

   - Add TCP fraglist GRO support, allowing chaining multiple TCP
     packets and forwarding them together. Useful for small switches /
     routers which lack basic checksum offload in some scenarios (e.g.
     PPPoE).

   - Support using SMP threads for handling packet backlog i.e. packet
     processing from software interfaces and old drivers which don't use
     NAPI. This helps move the processing out of the softirq jumble.

   - Continue work of converting from rtnl lock to RCU protection.

     Don't require rtnl lock when reading: IPv6 routing FIB, IPv6
     address labels, netdev threaded NAPI sysfs files, bonding driver's
     sysfs files, MPLS devconf, IPv4 FIB rules, netns IDs, tcp metrics,
     TC Qdiscs, neighbor entries, ARP entries via ioctl(SIOCGARP), a lot
     of the link information available via rtnetlink.

   - Small optimizations from Eric to UDP wake up handling, memory
     accounting, RPS/RFS implementation, TCP packet sizing etc.

   - Allow direct page recycling in the bulk API used by XDP, for +2%
     PPS.

   - Support peek with an offset on TCP sockets.

   - Add MPTCP APIs for querying last time packets were received/sent/acked
     and whether MPTCP "upgrade" succeeded on a TCP socket.

   - Add intra-node communication shortcut to improve SMC performance.

   - Add IPv6 (and IPv{4,6}-over-IPv{4,6}) support to the GTP protocol
     driver.

   - Add HSR-SAN (RedBOX) mode of operation to the HSR protocol driver.

   - Add reset reasons for tracing what caused a TCP reset to be sent.

   - Introduce direction attribute for xfrm (IPSec) states. State can be
     used either for input or output packet processing.

  Things we sprinkled into general kernel code:

   - Add bitmap_{read,write}(), bitmap_size(), expose BYTES_TO_BITS().

     This required touch-ups and renaming of a few existing users.

   - Add Endian-dependent __counted_by_{le,be} annotations.

   - Make building selftests "quieter" by printing summaries like
     "CC object.o" rather than full commands with all the arguments.

  Netfilter:

   - Use GFP_KERNEL to clone elements, to deal better with OOM
     situations and avoid failures in the .commit step.

  BPF:

   - Add eBPF JIT for ARCv2 CPUs.

   - Support attaching kprobe BPF programs through kprobe_multi link in
     a session mode, meaning, a BPF program is attached to both function
     entry and return, the entry program can decide if the return
     program gets executed and the entry program can share u64 cookie
     value with return program. "Session mode" is a common use-case for
     tetragon and bpftrace.

   - Add the ability to specify and retrieve BPF cookie for raw
     tracepoint programs in order to ease migration from classic to raw
     tracepoints.

   - Add an internal-only BPF per-CPU instruction for resolving per-CPU
     memory addresses and implement support in x86, ARM64 and RISC-V
     JITs. This allows inlining functions which need to access per-CPU
     state.

   - Optimize x86 BPF JIT's emit_mov_imm64, and add support for various
     atomics in bpf_arena which can be JITed as a single x86
     instruction. Support BPF arena on ARM64.

   - Add a new bpf_wq API for deferring events and refactor
     process-context bpf_timer code to keep common code where possible.

   - Harden the BPF verifier's and/or/xor value tracking.

   - Introduce crypto kfuncs to let BPF programs call kernel crypto
     APIs.

   - Support bpf_tail_call_static() helper for BPF programs with GCC 13.

   - Add bpf_preempt_{disable,enable}() kfuncs in order to allow a BPF
     program to have code sections where preemption is disabled.

  Driver API:

   - Skip software TC processing completely if all installed rules are
     marked as HW-only, instead of checking the HW-only flag rule by
     rule.

   - Add support for configuring PoE (Power over Ethernet), similar to
     the already existing support for PoDL (Power over Data Line)
     config.

   - Initial bits of a queue control API, for now allowing a single
     queue to be reset without disturbing packet flow to other queues.

   - Common (ethtool) statistics for hardware timestamping.

  Tests and tooling:

   - Remove the need to create a config file to run the net forwarding
     tests so that a naive "make run_tests" can exercise them.

   - Define a method of writing tests which require an external endpoint
     to communicate with (to send/receive data towards the test
     machine). Add a few such tests.

   - Create a shared code library for writing Python tests. Expose the
     YAML Netlink library from tools/ to the tests for easy Netlink
     access.

   - Move netfilter tests under net/, extend them, separate performance
     tests from correctness tests, and iron out issues found by running
     them "on every commit".

   - Refactor BPF selftests to use common network helpers.

   - Further work filling in YAML definitions of Netlink messages for:
     nftables, team driver, bonding interfaces, vlan interfaces, VF
     info, TC u32 mark, TC police action.

   - Teach Python YAML Netlink to decode attribute policies.

   - Extend the definition of the "indexed array" construct in the specs
     to cover arrays of scalars rather than just nests.

   - Add hyperlinks between definitions in generated Netlink docs.

  Drivers:

   - Make sure unsupported flower control flags are rejected by drivers,
     and make more drivers report errors directly to the application
     rather than dmesg (large number of driver changes from Asbjørn
     Sloth Tønnesen).

   - Ethernet high-speed NICs:
      - Broadcom (bnxt):
         - support multiple RSS contexts and steering traffic to them
         - support XDP metadata
         - make page pool allocations more NUMA aware
      - Intel (100G, ice, idpf):
         - extract datapath code common among Intel drivers into a library
         - use fewer resources in switchdev by sharing queues with the PF
         - add PFCP filter support
         - add Ethernet filter support
         - use a spinlock instead of HW lock in PTP clock ops
         - support 5 layer Tx scheduler topology
      - nVidia/Mellanox:
         - 800G link modes and 100G SerDes speeds
         - per-queue IRQ coalescing configuration
      - Marvell Octeon:
         - support offloading TC packet mark action

   - Ethernet NICs consumer, embedded and virtual:
      - stop lying about skb->truesize in USB Ethernet drivers, it
        messes up TCP memory calculations
      - Google cloud vNIC:
         - support changing ring size via ethtool
         - support ring reset using the queue control API
      - VirtIO net:
         - expose flow hash from RSS to XDP
         - per-queue statistics
         - add selftests
      - Synopsys (stmmac):
         - support controllers which require an RX clock signal from the
           MII bus to perform their hardware initialization
      - TI:
         - icssg_prueth: support ICSSG-based Ethernet on AM65x SR1.0 devices
         - icssg_prueth: add SW TX / RX Coalescing based on hrtimers
         - cpsw: minimal XDP support
      - Renesas (ravb):
         - support describing the MDIO bus
      - Realtek (r8169):
         - add support for RTL8168M
      - Microchip Sparx5:
         - matchall and flower actions mirred and redirect

   - Ethernet switches:
      - nVidia/Mellanox:
         - improve events processing performance
      - Marvell:
         - add support for MV88E6250 family internal PHYs
      - Microchip:
         - add DCB and DSCP mapping support for KSZ switches
         - vsc73xx: convert to PHYLINK
      - Realtek:
         - rtl8226b/rtl8221b: add C45 instances and SerDes switching

   - Many driver changes related to PHYLIB and PHYLINK deprecated API
     cleanup

   - Ethernet PHYs:
      - Add a new driver for Airoha EN8811H 2.5 Gigabit PHY.
      - micrel: lan8814: add support for PPS out and external timestamp trigger

   - WiFi:
      - Disable Wireless Extensions (WEXT) in all Wi-Fi 7 devices
        drivers. Modern devices can only be configured using nl80211.
      - mac80211/cfg80211
         - handle color change per link for WiFi 7 Multi-Link Operation
      - Intel (iwlwifi):
         - don't support puncturing in 5 GHz
         - support monitor mode on passive channels
         - BZ-W device support
         - P2P with HE/EHT support
         - re-add support for firmware API 90
         - provide channel survey information for Automatic Channel Selection
      - MediaTek (mt76):
         - mt7921 LED control
         - mt7925 EHT radiotap support
         - mt7920e PCI support
      - Qualcomm (ath11k):
         - P2P support for QCA6390, WCN6855 and QCA2066
         - support hibernation
         - ieee80211-freq-limit Device Tree property support
      - Qualcomm (ath12k):
         - refactoring in preparation of multi-link support
         - suspend and hibernation support
         - ACPI support
         - debugfs support, including dfs_simulate_radar support
      - RealTek:
         - rtw88: RTL8723CS SDIO device support
         - rtw89: RTL8922AE Wi-Fi 7 PCI device support
         - rtw89: complete features of new WiFi 7 chip 8922AE including
           BT-coexistence and Wake-on-WLAN
         - rtw89: use BIOS ACPI settings to set TX power and channels
         - rtl8xxxu: enable Management Frame Protection (MFP) support

   - Bluetooth:
      - support for Intel BlazarI and Filmore Peak2 (BE201)
      - support for MediaTek MT7921S SDIO
      - initial support for Intel PCIe BT driver
      - remove HCI_AMP support"

* tag 'net-next-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1827 commits)
  selftests: netfilter: fix packetdrill conntrack testcase
  net: gro: fix napi_gro_cb zeroed alignment
  Bluetooth: btintel_pcie: Refactor and code cleanup
  Bluetooth: btintel_pcie: Fix warning reported by sparse
  Bluetooth: hci_core: Fix not handling hdev->le_num_of_adv_sets=1
  Bluetooth: btintel: Fix compiler warning for multi_v7_defconfig config
  Bluetooth: btintel_pcie: Fix compiler warnings
  Bluetooth: btintel_pcie: Add *setup* function to download firmware
  Bluetooth: btintel_pcie: Add support for PCIe transport
  Bluetooth: btintel: Export few static functions
  Bluetooth: HCI: Remove HCI_AMP support
  Bluetooth: L2CAP: Fix div-by-zero in l2cap_le_flowctl_init()
  Bluetooth: qca: Fix error code in qca_read_fw_build_info()
  Bluetooth: hci_conn: Use __counted_by() and avoid -Wfamnae warning
  Bluetooth: btintel: Add support for Filmore Peak2 (BE201)
  Bluetooth: btintel: Add support for BlazarI
  LE Create Connection command timeout increased to 20 secs
  dt-bindings: net: bluetooth: Add MediaTek MT7921S SDIO Bluetooth
  Bluetooth: compute LE flow credits based on recvbuf space
  Bluetooth: hci_sync: Use cmd->num_cis instead of magic number
  ...
2024-05-14 19:42:24 -07:00
Linus Torvalds
6bfd2d442a Updates for the interrupt subsystem:
- Core code:
 
    - Interrupt storm detection for the lockup watchdog:
 
      Lockups which are caused by interrupt storms are not easy to debug
      because there is no information about the events which make the lockup
      detector trigger.
 
      To make this more user friendly, provide an extenstion to interrupt
      statistics which allows to take snapshots and an interface to retrieve
      the delta to the snapshot. Use this new mechanism in the watchdog code
      to do a two stage lockup analysis by taking the snapshot and printing
      the deltas for the topmost active interrupts on the second trigger.
 
      Note: This contains both the interrupt and the watchdog changes as
      the latter depend on the former obviously.
 
   - Avoid summation loops in the /proc/interrupts output and use the global
     counter when possible
 
   - Skip suspended interrupts on CPU hotplug operations to ensure that they
     are not delivered before the system resumes the device drivers when
     coming out of suspend.
 
   - On CPU hot-unplug interrupts which are affine to the outgoing CPU are
     migrated to a different CPU in the affinity mask. This can fail when
     the CPUs have no vectors left. Instead of giving up try to migrate it
     to any online CPU and thereby breaking the affinity setting in order to
     prevent a stale device interrupt which targets an offline CPU
 
   - The usual small cleanups
 
  - Driver code:
 
   - Support for the RISCV AIA MSI controller
 
   - Make the interrupt allocation for the Loongson PCH controller more
     flexible to prevent vector exhaustion
 
   - The usual set of cleanups and fixes all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmZBCM0THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoeZHEACqMLN3K+1HyWflYtcTHJeYCjZLHS77
 2tQeKaaskOA4W6dcGXPxMw5CHqAobHVQQMqgcJxhUdqQiOJnFFnrtCD7JtqM0hWK
 UORNbyeovuhAo+iJ0fTuS8p63H7vm2GIWwBLWJnOuChYv/6Yyx5Cald1skvyvbzL
 zePhiiAf5mkdmJMeT5wJSCqEWSRYOXsVAJ/0YAwFG3bKkJH3bmDo6SDJY02sXT5P
 pjbtD/0hum9wIVT4fNdYleHHQMdBdj9dLlcxXBikHq50mDMw7GxvjKiLcXmoerw3
 rEBfVVJp3qpSofpNJZ3HH0ywcF3yUzq04/LPE9Tk2MoQ8NF0GzP8r9Ahke4B7cUj
 FysWNiAlC2IisEi6th313FZkTLx0zgewdsdEBTLt8eAE9TU0wamRbo99LZ8i/Qr3
 hk7jV8DzL+EDQJLgl4p1iPJgA708eW17tbCxLEa15VKVV6P58miohmhx/IfPO2Gx
 FV1PPehtItsmiK/UoRtUCoFdFsqNQtOE+h8DWLyy8RDmhBqGbn9Ut4euXiQIF+rX
 WJKPFfslCTR39BrBcZnZeNsgOCN7tEfFRstzjzkey1DaeTGWtxmA5UGhpC2vT74y
 YyXluvZlgKr4S64ABmcqQj++hQLho0OQAih3uW5YVxt4VxEUcXYMJOsV1AQGpMjF
 UnewWH5opBQdfw==
 =jFLf
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2024-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull interrupt subsystem updates from Thomas Gleixner:
 "Core code:

   - Interrupt storm detection for the lockup watchdog:

     Lockups which are caused by interrupt storms are not easy to debug
     because there is no information about the events which make the
     lockup detector trigger.

     To make this more user friendly, provide an extenstion to interrupt
     statistics which allows to take snapshots and an interface to
     retrieve the delta to the snapshot. Use this new mechanism in the
     watchdog code to do a two stage lockup analysis by taking the
     snapshot and printing the deltas for the topmost active interrupts
     on the second trigger.

     Note: This contains both the interrupt and the watchdog changes as
     the latter depend on the former obviously.

   - Avoid summation loops in the /proc/interrupts output and use the
     global counter when possible

   - Skip suspended interrupts on CPU hotplug operations to ensure that
     they are not delivered before the system resumes the device drivers
     when coming out of suspend.

   - On CPU hot-unplug interrupts which are affine to the outgoing CPU
     are migrated to a different CPU in the affinity mask. This can fail
     when the CPUs have no vectors left. Instead of giving up try to
     migrate it to any online CPU and thereby breaking the affinity
     setting in order to prevent a stale device interrupt which targets
     an offline CPU

   - The usual small cleanups

  Driver code:

   - Support for the RISCV AIA MSI controller

   - Make the interrupt allocation for the Loongson PCH controller more
     flexible to prevent vector exhaustion

   - The usual set of cleanups and fixes all over the place"

* tag 'irq-core-2024-05-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
  irqchip/gic-v3-its: Remove BUG_ON in its_vpe_irq_domain_alloc
  cpuidle: Avoid explicit cpumask allocation on stack
  irqchip/sifive-plic: Avoid explicit cpumask allocation on stack
  irqchip/riscv-aplic-direct: Avoid explicit cpumask allocation on stack
  irqchip/loongson-eiointc: Avoid explicit cpumask allocation on stack
  irqchip/gic-v3-its: Avoid explicit cpumask allocation on stack
  irqchip/irq-bcm6345-l1: Avoid explicit cpumask allocation on stack
  cpumask: Introduce cpumask_first_and_and()
  irqchip/irq-brcmstb-l2: Avoid saving mask on shutdown
  genirq: Reuse irq_is_nmi()
  genirq/cpuhotplug: Retry with cpu_online_mask when migration fails
  genirq/cpuhotplug: Skip suspended interrupts when restoring affinity
  arm64: dts: st: Add interrupt parent to pinctrl on stm32mp251
  arm64: dts: st: Add exti1 and exti2 nodes on stm32mp251
  ARM: dts: stm32: List exti parent interrupts on stm32mp131
  ARM: dts: stm32: List exti parent interrupts on stm32mp151
  arm64: Kconfig.platforms: Enable STM32_EXTI for ARCH_STM32
  irqchip/stm32-exti: Mark events reserved with RIF configuration check
  irqchip/stm32-exti: Skip secure events
  irqchip/stm32-exti: Convert driver to standard PM
  ...
2024-05-14 09:47:14 -07:00
Masahiro Yamada
7f7f6f7ad6 Makefile: remove redundant tool coverage variables
Now Kbuild provides reasonable defaults for objtool, sanitizers, and
profilers.

Remove redundant variables.

Note:

This commit changes the coverage for some objects:

  - include arch/mips/vdso/vdso-image.o into UBSAN, GCOV, KCOV
  - include arch/sparc/vdso/vdso-image-*.o into UBSAN
  - include arch/sparc/vdso/vma.o into UBSAN
  - include arch/x86/entry/vdso/extable.o into KASAN, KCSAN, UBSAN, GCOV, KCOV
  - include arch/x86/entry/vdso/vdso-image-*.o into KASAN, KCSAN, UBSAN, GCOV, KCOV
  - include arch/x86/entry/vdso/vdso32-setup.o into KASAN, KCSAN, UBSAN, GCOV, KCOV
  - include arch/x86/entry/vdso/vma.o into GCOV, KCOV
  - include arch/x86/um/vdso/vma.o into KASAN, GCOV, KCOV

I believe these are positive effects because all of them are kernel
space objects.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Roberto Sassu <roberto.sassu@huawei.com>
2024-05-14 23:35:48 +09:00
Mike Rapoport (IBM)
0cc2dc4902 arch: make execmem setup available regardless of CONFIG_MODULES
execmem does not depend on modules, on the contrary modules use
execmem.

To make execmem available when CONFIG_MODULES=n, for instance for
kprobes, split execmem_params initialization out from
arch/*/kernel/module.c and compile it when CONFIG_EXECMEM=y

Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2024-05-14 00:31:44 -07:00
Mike Rapoport (IBM)
4d7b321a9c riscv: extend execmem_params for generated code allocations
The memory allocations for kprobes and BPF on RISC-V are not placed in
the modules area and these custom allocations are implemented with
overrides of alloc_insn_page() and  bpf_jit_alloc_exec().

Define MODULES_VADDR and MODULES_END as VMALLOC_START and VMALLOC_END for
32 bit and slightly reorder execmem_params initialization to support both
32 and 64 bit variants, define EXECMEM_KPROBES and EXECMEM_BPF ranges in
riscv::execmem_params and drop overrides of alloc_insn_page() and
bpf_jit_alloc_exec().

Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2024-05-14 00:31:43 -07:00
Mike Rapoport (IBM)
f6bec26c0a mm/execmem, arch: convert simple overrides of module_alloc to execmem
Several architectures override module_alloc() only to define address
range for code allocations different than VMALLOC address space.

Provide a generic implementation in execmem that uses the parameters for
address space ranges, required alignment and page protections provided
by architectures.

The architectures must fill execmem_info structure and implement
execmem_arch_setup() that returns a pointer to that structure. This way the
execmem initialization won't be called from every architecture, but rather
from a central place, namely a core_initcall() in execmem.

The execmem provides execmem_alloc() API that wraps __vmalloc_node_range()
with the parameters defined by the architectures.  If an architecture does
not implement execmem_arch_setup(), execmem_alloc() will fall back to
module_alloc().

Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2024-05-14 00:31:43 -07:00
Jakub Kicinski
6e62702feb bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZkGcZAAKCRDbK58LschI
 g6o6APwLsqhrM2w71VUN5ciCxu4H5VDtZp6wkdqtVbxxU4qNxQEApKgYgKt8ZLF3
 Kily5c7m+S4ZXhMX21rb8JhSAz0dfQk=
 =5Dk7
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-05-13

We've added 119 non-merge commits during the last 14 day(s) which contain
a total of 134 files changed, 9462 insertions(+), 4742 deletions(-).

The main changes are:

1) Add BPF JIT support for 32-bit ARCv2 processors, from Shahab Vahedi.

2) Add BPF range computation improvements to the verifier in particular
   around XOR and OR operators, refactoring of checks for range computation
   and relaxing MUL range computation so that src_reg can also be an unknown
   scalar, from Cupertino Miranda.

3) Add support to attach kprobe BPF programs through kprobe_multi link in
   a session mode, meaning, a BPF program is attached to both function entry
   and return, the entry program can decide if the return program gets
   executed and the entry program can share u64 cookie value with return
   program. Session mode is a common use-case for tetragon and bpftrace,
   from Jiri Olsa.

4) Fix a potential overflow in libbpf's ring__consume_n() and improve libbpf
   as well as BPF selftest's struct_ops handling, from Andrii Nakryiko.

5) Improvements to BPF selftests in context of BPF gcc backend,
   from Jose E. Marchesi & David Faust.

6) Migrate remaining BPF selftest tests from test_sock_addr.c to prog_test-
   -style in order to retire the old test, run it in BPF CI and additionally
   expand test coverage, from Jordan Rife.

7) Big batch for BPF selftest refactoring in order to remove duplicate code
   around common network helpers, from Geliang Tang.

8) Another batch of improvements to BPF selftests to retire obsolete
   bpf_tcp_helpers.h as everything is available vmlinux.h,
   from Martin KaFai Lau.

9) Fix BPF map tear-down to not walk the map twice on free when both timer
   and wq is used, from Benjamin Tissoires.

10) Fix BPF verifier assumptions about socket->sk that it can be non-NULL,
    from Alexei Starovoitov.

11) Change BTF build scripts to using --btf_features for pahole v1.26+,
    from Alan Maguire.

12) Small improvements to BPF reusing struct_size() and krealloc_array(),
    from Andy Shevchenko.

13) Fix s390 JIT to emit a barrier for BPF_FETCH instructions,
    from Ilya Leoshkevich.

14) Extend TCP ->cong_control() callback in order to feed in ack and
    flag parameters and allow write-access to tp->snd_cwnd_stamp
    from BPF program, from Miao Xu.

15) Add support for internal-only per-CPU instructions to inline
    bpf_get_smp_processor_id() helper call for arm64 and riscv64 BPF JITs,
    from Puranjay Mohan.

16) Follow-up to remove the redundant ethtool.h from tooling infrastructure,
    from Tushar Vyavahare.

17) Extend libbpf to support "module:<function>" syntax for tracing
    programs, from Viktor Malik.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (119 commits)
  bpf: make list_for_each_entry portable
  bpf: ignore expected GCC warning in test_global_func10.c
  bpf: disable strict aliasing in test_global_func9.c
  selftests/bpf: Free strdup memory in xdp_hw_metadata
  selftests/bpf: Fix a few tests for GCC related warnings.
  bpf: avoid gcc overflow warning in test_xdp_vlan.c
  tools: remove redundant ethtool.h from tooling infra
  selftests/bpf: Expand ATTACH_REJECT tests
  selftests/bpf: Expand getsockname and getpeername tests
  sefltests/bpf: Expand sockaddr hook deny tests
  selftests/bpf: Expand sockaddr program return value tests
  selftests/bpf: Retire test_sock_addr.(c|sh)
  selftests/bpf: Remove redundant sendmsg test cases
  selftests/bpf: Migrate ATTACH_REJECT test cases
  selftests/bpf: Migrate expected_attach_type tests
  selftests/bpf: Migrate wildcard destination rewrite test
  selftests/bpf: Migrate sendmsg6 v4 mapped address tests
  selftests/bpf: Migrate sendmsg deny test cases
  selftests/bpf: Migrate WILDCARD_IP test
  selftests/bpf: Handle SYSCALL_EPERM and SYSCALL_ENOTSUPP test cases
  ...
====================

Link: https://lore.kernel.org/r/20240513134114.17575-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-13 16:41:10 -07:00
Inochi Amaoto
92cce91949
riscv: defconfig: Enable CONFIG_CLK_SOPHGO_CV1800
CONFIG_CLK_SOPHGO_CV1800 is required when booting the minimum
system for CV1800 series board.

Signed-off-by: Inochi Amaoto <inochiama@outlook.com>
Link: https://lore.kernel.org/r/IA1PR20MB49537E8B2D1FAAA7D5B8BDA2BB052@IA1PR20MB4953.namprd20.prod.outlook.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-13 14:26:34 -07:00
Linus Torvalds
14a60290ed soc: drivers for 6.10
As usual, these are updates for drivers that are specific to certain
 SoCs or firmware running on them. Notable updates include
 
  - The new STMicroelectronics STM32 "firewall" bus driver that is
    used to provide a barrier between different parts of an SoC
 
  - Lots of updates for the Qualcomm platform drivers, in particular
    SCM, which gets a rewrite of its initialization code
 
  - Firmware driver updates for Arm FF-A notification interrupts
    and indirect messaging, SCMI firmware support for pin control
    and vendor specific interfaces, and TEE firmware interface
    changes across multiple TEE drivers
 
  - A larger cleanup of the Mediatek CMDQ driver and some related bits
 
  - Kconfig changes for riscv drivers to prepare for adding Kanaan
    k230 support
 
  - Multiple minor updates for the TI sysc bus driver, memory controllers,
    hisilicon hccs and more
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmY+dbEACgkQYKtH/8kJ
 UifGTBAA3lh2qw++S5i6nk71388/nswb5fZKwqPKl1m+44SndE7r0/nauGm7IZhd
 oM5xiBZzsoYCKuesSuejkBNgPmUPtUhyHBJKSKjwrcak4k1mrjDgXxfSxCqGptVZ
 Ps683koJ/Ic7O/LQNxlVzUlssG/3gmhJELfpaVIB7rG8pmdgF9ocM73+iJrRwW1Q
 fTFXUXeCcXJ2N5Yki7z2+4oB3RebPzTBz4NeIYNdGQj5/u61oG0KzXwvk8eqWhNb
 0KJYsfAQZGzdyAys6XU1MHv4T4L2a3DQL6NMgLnovVEMhP2Hk0XlBmI7X+uAXYiM
 2z289d9Wx3HMoiekulDJ+rpDUPxPXrEqaRkfWZ8G+HSY4KcIeSP7YGmhylr0kdvw
 +Qo6orxZ9lkSPaT1aUkNIIywDzet/E2hY8zV1EcLBu9GWjkybAvT/Uy2lSSN+LLH
 yEQyDf+s90N6QuZwdXN8a3QliP39tHqlye8wou6UQG8aZ7z870fKAKlvA6DjTfPM
 JyhY1rXYH/bvC87sVTi5Qb09+2R6ftvk5xijiMOyXugPpO/6PQKULVataeUnzwgs
 YTgOPhaqXVadDR/nkrG3FzEtvpYeTspwGpDiEpDrNHf5H1tFg6VfPNS8y0QOlSPY
 JcmylQNCtwxCRLTw2NHOb3tLcY4ruDHNmrWf5INTzf6cJe49jaU=
 =4rf0
 -----END PGP SIGNATURE-----

Merge tag 'soc-drivers-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC driver updates from Arnd Bergmann:
 "As usual, these are updates for drivers that are specific to certain
  SoCs or firmware running on them.

  Notable updates include

   - The new STMicroelectronics STM32 "firewall" bus driver that is used
     to provide a barrier between different parts of an SoC

   - Lots of updates for the Qualcomm platform drivers, in particular
     SCM, which gets a rewrite of its initialization code

   - Firmware driver updates for Arm FF-A notification interrupts and
     indirect messaging, SCMI firmware support for pin control and
     vendor specific interfaces, and TEE firmware interface changes
     across multiple TEE drivers

   - A larger cleanup of the Mediatek CMDQ driver and some related bits

   - Kconfig changes for riscv drivers to prepare for adding Kanaan k230
     support

   - Multiple minor updates for the TI sysc bus driver, memory
     controllers, hisilicon hccs and more"

* tag 'soc-drivers-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (103 commits)
  firmware: qcom: uefisecapp: Allow on sc8180x Primus and Flex 5G
  soc: qcom: pmic_glink: Make client-lock non-sleeping
  dt-bindings: soc: qcom,wcnss: fix bluetooth address example
  soc/tegra: pmc: Add EQOS wake event for Tegra194 and Tegra234
  bus: stm32_firewall: fix off by one in stm32_firewall_get_firewall()
  bus: etzpc: introduce ETZPC firewall controller driver
  firmware: arm_ffa: Avoid queuing work when running on the worker queue
  bus: ti-sysc: Drop legacy idle quirk handling
  bus: ti-sysc: Drop legacy quirk handling for smartreflex
  bus: ti-sysc: Drop legacy quirk handling for uarts
  bus: ti-sysc: Add a description and copyrights
  bus: ti-sysc: Move check for no-reset-on-init
  soc: hisilicon: kunpeng_hccs: replace MAILBOX dependency with PCC
  soc: hisilicon: kunpeng_hccs: Add the check for obtaining complete port attribute
  firmware: arm_ffa: Fix memory corruption in ffa_msg_send2()
  bus: rifsc: introduce RIFSC firewall controller driver
  of: property: fw_devlink: Add support for "access-controller"
  soc: mediatek: mtk-socinfo: Correct the marketing name for MT8188GV
  soc: mediatek: mtk-socinfo: Add entry for MT8395AV/ZA Genio 1200
  soc: mediatek: mtk-mutex: Add support for MT8188 VPPSYS
  ...
2024-05-13 08:48:42 -07:00
Linus Torvalds
6c60000f0b soc: devicetree updates for 6.10, part 1
The updates this time are a bit smaller than most times, mainly because
 it is not totally dominated by new Qualcomm hardware support. Instead,
 we larger than average updates for Rockchips, NXP, Allwinner and TI.
 The only two new SoCs this time are both from NXP and are minor variants
 of already supported ones.
 
 The updates for aspeed, amlogic and mediatek came a little late, so
 I'm saving those for part 2 in a few days if everything turns out fine.
 
 New machines this time contain:
 
  - two Broadcom SoC based wireless routers from Asus
 
  - Five allwinner based consumer devices for gaming, set-top-box and
    eboot reader applications
 
  - Three older phones based on Qualcomm chips, plus the more recent
    Sony Xperia 1 V
 
  - 14 industrial and embedded boards based on NXP i.MX6, i.MX8,
    layerscape and s32g3 SoCs
 
  - six rockchips boards including another handheld game console
    and a few single-board computers
 
 On top of these, we have the usual cleanups for dtc warnings and
 updates to add more features to already merged machines.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmY+daEACgkQYKtH/8kJ
 UieIXxAAya9WFfjgJrkOCJnn/3+4Q9nr4HzvCaKPfvbw2YH8krxuXTAI5xyeoUNY
 NMb8qyugi94Xw4Jh3qbAYyvQUXFFNgFTvuTQIGqOBIMSNkPjfghiq45emCr+d1ea
 0lsCuu1Mpw62H6038xiyMNJNUFyWjLEoLlLaYqPIZj5/jHiOT1hiKZAB5d3+epzx
 IhkWEvnSgDo37o6XEtaijFDpu64khdpcWhS4aOt2nJQAR73YYO+jySqEGVwDW4Ht
 VXEim70ckcj1WLhGjNYakwkDIw2It24vndzcnmLLJMq5k+9mIZ7D3RYPJrKYLcZk
 /F/hozcYFOxVX0TX+ATwaiKsnUQthvBGEKaaeDTO/VCD87ya6/3FIr7LJezLy4fh
 t8Vvmgme0JH0kFczWr36YVxdGk6QolkQvNGawTIqPdj5Guj2eSkDHLYIc0HOOps+
 4pDKDLO5MUXrOjtWXYy48zGE+7zF58m3QySwieoJAVF5LbnLuXAevmcL+AjQ+QfK
 pBTtyDe6hUHxh5vQHSoY05loQ2dELWBxza+G5lNByYMPX4/qzQHcxeZlF7kMm0t5
 XE0T0lG/C25QPKQRa1NQ950WtJDoGIWtF0+Kk0qzRP6WbgkX0Wo/NemSmCmVD4IJ
 PM/nQYCkccwdD5TjyUl0ZiS/LVfd54MXFHqcrTU2zOMC+YryZHM=
 =SnV4
 -----END PGP SIGNATURE-----

Merge tag 'soc-dt-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC devicetree updates from Arnd Bergmann:
 "The updates this time are a bit smaller than most times, mainly
  because it is not totally dominated by new Qualcomm hardware support.

  Instead, we larger than average updates for Rockchips, NXP, Allwinner
  and TI. The only two new SoCs this time are both from NXP and are
  minor variants of already supported ones.

  The updates for aspeed, amlogic and mediatek came a little late, so
  I'm saving those for part 2 in a few days if everything turns out
  fine.

  New machines this time contain:

   - two Broadcom SoC based wireless routers from Asus

   - Five allwinner based consumer devices for gaming, set-top-box and
     eboot reader applications

   - Three older phones based on Qualcomm chips, plus the more recent
     Sony Xperia 1 V

   - 14 industrial and embedded boards based on NXP i.MX6, i.MX8,
     layerscape and s32g3 SoCs

   - six rockchips boards including another handheld game console and a
     few single-board computers

  On top of these, we have the usual cleanups for dtc warnings and
  updates to add more features to already merged machines"

* tag 'soc-dt-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (612 commits)
  arm64: dts: marvell: espressobin-ultra: fix Ethernet Switch unit address
  arm64: dts: marvell: turris-mox: drop unneeded flash address/size-cells
  arm64: dts: marvell: eDPU: drop redundant address/size-cells
  arm64: dts: qcom: pm6150: correct USB VBUS regulator compatible
  arm64: dts: rockchip: add rk3588 pcie and php IOMMUs
  arm64: dts: rockchip: enable onboard spi flash for rock-3a
  arm64: dts: rockchip: add USB-C support to rk3588s-orangepi-5
  arm64: dts: rockchip: Enable GPU on Orange Pi 5
  arm64: dts: rockchip: enable GPU on khadas-edge2
  arm64: dts: rockchip: Add USB3 on Edgeble NCM6A-IO board
  arm64: dts: rockchip: Support poweroff on Edgeble Neural Compute Module
  arm64: dts: rockchip: Add Radxa ROCK 3C
  dt-bindings: arm: rockchip: add Radxa ROCK 3C
  arm64: dts: exynos: gs101: specify empty clocks for remaining pinctrl
  arm64: dts: exynos: gs101: specify bus clock for pinctrl_hsi2
  arm64: dts: exynos: gs101: specify bus clock for pinctrl_peric[01]
  arm64: dts: exynos: gs101: specify bus clock for pinctrl (far) alive
  arm64: dts: Add/fix /memory node unit-addresses
  arm64: dts: qcom: qcs404: fix bluetooth device address
  arm64: dts: qcom: sc8280xp-x13s: enable USB MP and fingerprint reader
  ...
2024-05-13 08:45:18 -07:00
Joerg Roedel
2bd5059c6c Merge branches 'arm/renesas', 'arm/smmu', 'x86/amd', 'core' and 'x86/vt-d' into next 2024-05-13 14:06:54 +02:00
Puranjay Mohan
20a759df3b riscv, bpf: make some atomic operations fully ordered
The BPF atomic operations with the BPF_FETCH modifier along with
BPF_XCHG and BPF_CMPXCHG are fully ordered but the RISC-V JIT implements
all atomic operations except BPF_CMPXCHG with relaxed ordering.

Section 8.1 of the "The RISC-V Instruction Set Manual Volume I:
Unprivileged ISA" [1], titled, "Specifying Ordering of Atomic
Instructions" says:

| To provide more efficient support for release consistency [5], each
| atomic instruction has two bits, aq and rl, used to specify additional
| memory ordering constraints as viewed by other RISC-V harts.

and

| If only the aq bit is set, the atomic memory operation is treated as
| an acquire access.
| If only the rl bit is set, the atomic memory operation is treated as a
| release access.
|
| If both the aq and rl bits are set, the atomic memory operation is
| sequentially consistent.

Fix this by setting both aq and rl bits as 1 for operations with
BPF_FETCH and BPF_XCHG.

[1] https://riscv.org/wp-content/uploads/2017/05/riscv-spec-v2.2.pdf

Fixes: dd642ccb45 ("riscv, bpf: Implement more atomic operations for RV64")
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Reviewed-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/r/20240505201633.123115-1-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 17:00:06 -07:00
Xiao Wang
80c5a07ae6 riscv, bpf: Fix typo in comment
We can use either "instruction" or "insn" in the comment.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/r/20240507111618.437121-1-xiao.w.wang@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 16:56:43 -07:00
Puranjay Mohan
2ddec2c80b riscv, bpf: inline bpf_get_smp_processor_id()
Inline the calls to bpf_get_smp_processor_id() in the riscv bpf jit.

RISCV saves the pointer to the CPU's task_struct in the TP (thread
pointer) register. This makes it trivial to get the CPU's processor id.
As thread_info is the first member of task_struct, we can read the
processor id from TP + offsetof(struct thread_info, cpu).

          RISCV64 JIT output for `call bpf_get_smp_processor_id`
	  ======================================================

                Before                           After
               --------                         -------

         auipc   t1,0x848c                  ld    a5,32(tp)
         jalr    604(t1)
         mv      a5,a0

Benchmark using [1] on Qemu.

./benchs/run_bench_trigger.sh glob-arr-inc arr-inc hash-inc

+---------------+------------------+------------------+--------------+
|      Name     |     Before       |       After      |   % change   |
|---------------+------------------+------------------+--------------|
| glob-arr-inc  | 1.077 ± 0.006M/s | 1.336 ± 0.010M/s |   + 24.04%   |
| arr-inc       | 1.078 ± 0.002M/s | 1.332 ± 0.015M/s |   + 23.56%   |
| hash-inc      | 0.494 ± 0.004M/s | 0.653 ± 0.001M/s |   + 32.18%   |
+---------------+------------------+------------------+--------------+

NOTE: This benchmark includes changes from this patch and the previous
      patch that implemented the per-cpu insn.

[1] https://github.com/anakryiko/linux/commit/8dec900975ef

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20240502151854.9810-3-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 16:54:34 -07:00
Puranjay Mohan
19c56d4e5b riscv, bpf: add internal-only MOV instruction to resolve per-CPU addrs
Support an instruction for resolving absolute addresses of per-CPU
data from their per-CPU offsets. This instruction is internal-only and
users are not allowed to use them directly. They will only be used for
internal inlining optimizations for now between BPF verifier and BPF
JITs.

RISC-V uses generic per-cpu implementation where the offsets for CPUs
are kept in an array called __per_cpu_offset[cpu_number]. RISCV stores
the address of the task_struct in TP register. The first element in
task_struct is struct thread_info, and we can get the cpu number by
reading from the TP register + offsetof(struct thread_info, cpu).

Once we have the cpu number in a register we read the offset for that
cpu from address: &__per_cpu_offset + cpu_number << 3. Then we add this
offset to the destination register.

To measure the improvement from this change, the benchmark in [1] was
used on Qemu:

Before:
glob-arr-inc   :    1.127 ± 0.013M/s
arr-inc        :    1.121 ± 0.004M/s
hash-inc       :    0.681 ± 0.052M/s

After:
glob-arr-inc   :    1.138 ± 0.011M/s
arr-inc        :    1.366 ± 0.006M/s
hash-inc       :    0.676 ± 0.001M/s

[1] https://github.com/anakryiko/linux/commit/8dec900975ef

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20240502151854.9810-2-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-12 16:54:34 -07:00
Paolo Bonzini
4232da23d7 Merge tag 'loongarch-kvm-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD
LoongArch KVM changes for v6.10

1. Add ParaVirt IPI support.
2. Add software breakpoint support.
3. Add mmio trace events support.
2024-05-10 13:20:18 -04:00
Masahiro Yamada
b1992c3772 kbuild: use $(src) instead of $(srctree)/$(src) for source directory
Kbuild conventionally uses $(obj)/ for generated files, and $(src)/ for
checked-in source files. It is merely a convention without any functional
difference. In fact, $(obj) and $(src) are exactly the same, as defined
in scripts/Makefile.build:

    src := $(obj)

When the kernel is built in a separate output directory, $(src) does
not accurately reflect the source directory location. While Kbuild
resolves this discrepancy by specifying VPATH=$(srctree) to search for
source files, it does not cover all cases. For example, when adding a
header search path for local headers, -I$(srctree)/$(src) is typically
passed to the compiler.

This introduces inconsistency between upstream and downstream Makefiles
because $(src) is used instead of $(srctree)/$(src) for the latter.

To address this inconsistency, this commit changes the semantics of
$(src) so that it always points to the directory in the source tree.

Going forward, the variables used in Makefiles will have the following
meanings:

  $(obj)     - directory in the object tree
  $(src)     - directory in the source tree  (changed by this commit)
  $(objtree) - the top of the kernel object tree
  $(srctree) - the top of the kernel source tree

Consequently, $(srctree)/$(src) in upstream Makefiles need to be replaced
with $(src).

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2024-05-10 04:34:52 +09:00
Arnd Bergmann
01a7f9e1a9 RISC-V Devicetrees for v6.10
Microchip:
 A simple addition of a power-monitor on the Icicle dev board, as the
 binding for it is now in mainline.
 
 StarFive:
 Support for the Milk-V Mars. This board is incredibly similar to the
 VisionFive v2 that is already supported, with only the really ethernet
 configuration being slightly different. Emil requested that a common
 dtsi file, so my fixes branch is pulled into for-next to avoid an
 annoying conflict between moved content and some erroneously added
 nodes that were removed as fixes this cycle.
 
 T-Head:
 Re-ordering of some nodes to match the DTS coding style on the th1520.
 
 Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRh246EGq/8RLhDjO14tDGHoIJi0gUCZjvZxQAKCRB4tDGHoIJi
 0rt/AQDBIGqrvMfTFUM4Rs8FznDpMffDR5DR4h+zrTbT5XR1LwEAnFnDuMTGufFT
 xmfgB9+kbOCPpMyBSyu8NB7wUF47qgI=
 =Vcbm
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmY73yQACgkQYKtH/8kJ
 Uif7bQ//W4qvHgqtqK9A6eGNOV1NOs8NM7rx86dl9Fi0VclDgdeCDT8rsF5yoUST
 3JuRS18HHTD+1Irvh6zrlOxMnxf6i8B36RKZ9zZkoP7gWHo9jDkqYx6PdrxE6be7
 5uPGvdj6amvav0bincF/Cpnwct8FghVKjo680wp4Aku00miG+0ANyTPW+0FAidqJ
 7zsJS3ebTVy2GiDFecG123BfZhnMxuxM0bQNe+yFURVwFat09g3hTrPXUzZkWKtp
 6wlP1wEpjWPGXYnBG3a3ESX6l12vc1ab8J0j5vDgW4C64v1etFbztVWT7yMYFVoA
 rTyyBHyLlL+1R/AAJ2C4B0/LRNsJc88U8kk6/lwtRKZkr+zumO4QUvrgewIF0LJD
 LUEMyWbTGLQ6RfIAEjutTQS2EXejxlV9dGqk/PZzUhwrR3NiVDWuosrjDHOFEtk3
 5VOgpgBNt2Q3GbDkGl+RKlzmnLdHJ6Dk52DuJfJTwZbwssogflxVS4OmUxhiY7jU
 6rxnk4JRvmygYEGWbqVueixloJgU8mer1QeDcyjbUnSiM/fcRfxSdzxFCzjRfYg5
 b/uIYC8uSrIycYy31XTeO1Ukwjv1Vr2uU3I8Pfl5VkTx1msR6uGtZOZM7tLBrpZq
 VsrHzij8VLm3/nPnY49CmvwYSTvFbLfZik/TM6ZF0Hj8/212AnE=
 =Yem+
 -----END PGP SIGNATURE-----

Merge tag 'riscv-dt-for-v6.10-take2' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into soc/dt-late

RISC-V Devicetrees for v6.10

Microchip:
A simple addition of a power-monitor on the Icicle dev board, as the
binding for it is now in mainline.

StarFive:
Support for the Milk-V Mars. This board is incredibly similar to the
VisionFive v2 that is already supported, with only the really ethernet
configuration being slightly different. Emil requested that a common
dtsi file, so my fixes branch is pulled into for-next to avoid an
annoying conflict between moved content and some erroneously added
nodes that were removed as fixes this cycle.

T-Head:
Re-ordering of some nodes to match the DTS coding style on the th1520.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>

* tag 'riscv-dt-for-v6.10-take2' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
  riscv: dts: microchip: add pac1934 power-monitor to icicle
  riscv: dts: thead: Fix node ordering in TH1520 device tree
  riscv: dts: starfive: add Milkv Mars board device tree
  riscv: dts: starfive: introduce a common board dtsi for jh7110 based boards
  riscv: dts: starfive: visionfive 2: add "disable-wp" for tfcard
  riscv: dts: starfive: visionfive 2: add tf cd-gpios
  riscv: dts: starfive: visionfive 2: use cpus label for timebase freq
  riscv: dts: starfive: visionfive 2: update sound and codec dt node name
  dt-bindings: riscv: starfive: add Milkv Mars board
  riscv: dts: starfive: add 'cpus' label to jh7110 and jh7100 soc dtsi
  riscv: dts: starfive: visionfive 2: Remove non-existing I2S hardware
  riscv: dts: starfive: visionfive 2: Remove non-existing TDM hardware
  riscv: dts: starfive: Remove PMIC interrupt info for Visionfive 2 board

Link: https://lore.kernel.org/r/20240508-crafter-cement-4f54e4182270@spud
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-08 22:23:00 +02:00
Paolo Bonzini
aa24865fb5 KVM/riscv changes for 6.10
- Support guest breakpoints using ebreak
 - Introduce per-VCPU mp_state_lock and reset_cntx_lock
 - Virtualize SBI PMU snapshot and counter overflow interrupts
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmYwgroACgkQrUjsVaLH
 LAfckxAAnCvW9Ahcy0GgM2EwTtYDoNkQp1A6Wkp/a3nXBvc3hXMnlyZQ4YkyJ1T3
 BfQABCWEXWiDyEVpN9KUKtzUJi7WJz0MFuph5kvyZwMl53zddUNFqXpN4Hbb58/d
 dqjTJg7AnHbvirfhlHay/Rp+EaYsDq1E5GviDBi46yFkH/vB8IPpWdFLh3pD/+7f
 bmG5jeLos8zsWEwe3pAIC2hLDj0vFRRe2YJuXTZ9fvPzGBsPN9OHrtq0JbB3lRGt
 WRiYKPJiFjt2P3TjPkjh4N1Xmy8pJaEetu0Qwa1TR6I+ULs2ZcFzx9cw2VuoRQ2C
 uNhVx0o5ulAzJwGgX4U49ZTK4M7a5q6xf6zpqNFHbyy5tZylKJuBEWucuSyF1kTU
 RpjNinZ1PShzjx7HU+2gKPu+bmKHgfwKlr2Dp9Cx92IV9It3Wt1VEXWsjatciMfj
 EGYx+E9VcEOfX6INwX/TiO4ti7chLH/sFc+LhLqvw/1elhi83yAWbszjUmJ1Vrx1
 k1eATN2Hehvw06Y72lc+PrD0sYUmJPcDMVk3MSh/cSC8OODmZ9vi32v8Ie2bjNS5
 gHRLc05av1aX8yX+GRpUSPkCRL/XQ2J3jLG4uc3FmBMcWEhAtnIPsvXnCvV8f2mw
 aYrN+VF/FuRfumuYX6jWN6dwEwDO96AN425Rqu9MXik5KqSASXQ=
 =mGfY
 -----END PGP SIGNATURE-----

Merge tag 'kvm-riscv-6.10-1' of https://github.com/kvm-riscv/linux into HEAD

 KVM/riscv changes for 6.10

- Support guest breakpoints using ebreak
- Introduce per-VCPU mp_state_lock and reset_cntx_lock
- Virtualize SBI PMU snapshot and counter overflow interrupts
- New selftests for SBI PMU and Guest ebreak
2024-05-07 13:03:03 -04:00
Conor Dooley
1c80d50bb6 riscv: dts: microchip: add pac1934 power-monitor to icicle
The binding for this landed in v6.9, add the description. In the
off-chance that there were people carrying local patches for this based
on the driver shipped on the Microchip website (or vendor kernel) both
the binding and sysfs filenames changed during upstreaming.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-05-07 17:14:06 +01:00
Conor Dooley
04a228aadb RISC-V: add Milkv Mars board devicetree
The Milkv Mars is a development board based on the Starfive JH7110 SoC.
The board features:

- JH7110 SoC
- 1/2/4/8 GiB LPDDR4 DRAM
- AXP15060 PMIC
- 40 pin GPIO header
- 3x USB 3.0 host port
- 1x USB 2.0 host port
- 1x M.2 E-Key
- 1x eMMC slot
- 1x MicroSD slot
- 1x QSPI Flash
- 1x 1Gbps Ethernet port
- 1x HDMI port
- 1x 2-lane DSI and 1x 4-lane DSI
- 1x 2-lane CSI

I fixed up some nits Emil pointed out. This merges fixes into for-next
to avoid messing around with some nodes that were removed as fixes this
cycle.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-05-07 17:14:06 +01:00
Thomas Bonnefille
9abd613a85 riscv: dts: thead: Fix node ordering in TH1520 device tree
According to the device tree coding style, nodes shall be ordered by
unit address in ascending order.

Signed-off-by: Thomas Bonnefille <thomas.bonnefille@bootlin.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-05-07 17:14:06 +01:00
Arnd Bergmann
a3116c8881 RISC-V SoC Kconfig Updates for v6.10
A few different bits of SoC-related Kconfig work. The first part of
 this is shared with the DT updates - the modification of all SOC_CANAAN
 users to SOC_CANAAN_K210 to split the existing m-mode nommu k210 away
 from the k230 that is able to be used in a "common" kernel.
 
 The other thing here is the removal of most of the SOC_VENDOR options,
 with their ARCH_VENDOR equivalents that've been waiting in the wings for
 1 year+ now made visible. Due a lapse on my part when originally adding
 the ARCH_VENDOR stuff, the Microchip transition isn't complete - the
 _POLARFIRE was a mistake to keep as there's gonna be non-PolarFire
 RISC-V stuff from Microchip soonTM.
 
 Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRh246EGq/8RLhDjO14tDGHoIJi0gUCZjUIHwAKCRB4tDGHoIJi
 0r8RAQD+B5rJde/sQuQlmkJGrmZfyE/6/I1ZFxv0/xHhRPNWRAD/RFTTDthL/7c4
 frMGl/nWSD3fvGmXrQ7Dp6wc1APIdQI=
 =BWHh
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmY57kAACgkQYKtH/8kJ
 Uie0Sg/8D8tZVOtBrpVJG/P5F7W1QdcC336GHBMQCho38C0/4AK5ltvdyxmSxmCl
 k84qaLdYNrGIvJ5o6MRmSAZyzT3y3jLYVA8C2Zsrp+Do8KkvvJGl219pUvp5A0J3
 eoMApJ34wx6dQM9LfpLcvU9C3Z767KeiiRm0h5CTV0IUfJnZB/7IQgwSajEGLOr/
 CHtFZpbYK6VgCDgVhbacSY8495jJrIU4i5RDlILst5K64XrmS2UU2oen2L3X/u8h
 xi5nQ//3qCiIfp5UqvBY12OYF8lVzB+F/Uo9vDCpeF9HFiIE69qMgYJiSviCPbwH
 T54zue+oBPGfL57HQoMTYQGUG4GvlnW7JR841GsIlPjrs54uw2kXDZB586n3tqzN
 esAQCc/sNnCuUX9TYKKBzkIrmQ1oTPRdGO61r+lSgxjYQ/ed++eh6KStlbPmttVo
 piEaxSpLS7TOZcVOyXHFrWK6OR4yB6MD6ZvOGJlJKJZSdfMNTlGcoymiJ0j7hVQb
 QJSr3LaIfQMP+Uf5ZWWlNZIxvwxQKER8v6MbyH3vGAVYa+DnDBzaj9Fh364thJnt
 Uybpz7SDQMIugnB+uSe+D1o65XfTEKOn/OHXYEQVYyoWs63QfTE9012+Q4KqBFBQ
 5ylIkvM9r3xMBkzZT3EU8lgr5gx9r5QQuX9czJ7INSBxo4SmuKE=
 =QGOJ
 -----END PGP SIGNATURE-----

Merge tag 'riscv-config-for-v6.10' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into soc/drivers

RISC-V SoC Kconfig Updates for v6.10

A few different bits of SoC-related Kconfig work. The first part of
this is shared with the DT updates - the modification of all SOC_CANAAN
users to SOC_CANAAN_K210 to split the existing m-mode nommu k210 away
from the k230 that is able to be used in a "common" kernel.

The other thing here is the removal of most of the SOC_VENDOR options,
with their ARCH_VENDOR equivalents that've been waiting in the wings for
1 year+ now made visible. Due a lapse on my part when originally adding
the ARCH_VENDOR stuff, the Microchip transition isn't complete - the
_POLARFIRE was a mistake to keep as there's gonna be non-PolarFire
RISC-V stuff from Microchip soonTM.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>

* tag 'riscv-config-for-v6.10' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
  riscv: config: enable ARCH_CANAAN in defconfig
  RISC-V: drop SOC_VIRT for ARCH_VIRT
  RISC-V: drop SOC_SIFIVE for ARCH_SIFIVE
  RISC-V: drop SOC_MICROCHIP_POLARFIRE for ARCH_MICROCHIP
  RISC-V: Drop unused SOC_CANAAN
  reset: k210: Deprecate SOC_CANAAN and use SOC_CANAAN_K210
  pinctrl: k210: Deprecate SOC_CANAAN and use SOC_CANAAN_K210
  clk: k210: Deprecate SOC_CANAAN and use SOC_CANAAN_K210
  soc: canaan: Deprecate SOC_CANAAN and use SOC_CANAAN_K210 for K210
  riscv: Kconfig.socs: Split ARCH_CANAAN and SOC_CANAAN_K210

Link: https://lore.kernel.org/r/20240503-mardi-underling-3d81a9f97329@spud
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-07 11:02:56 +02:00
Yoann Congal
27021649ec printk: Remove redundant CONFIG_BASE_FULL
CONFIG_BASE_FULL is equivalent to !CONFIG_BASE_SMALL and is enabled by
default: CONFIG_BASE_SMALL is the special case to take care of.
So, remove CONFIG_BASE_FULL and move the config choice to
CONFIG_BASE_SMALL (which defaults to 'n')

For defconfigs explicitely disabling BASE_FULL, explicitely enable
BASE_SMALL.
For defconfigs explicitely enabling BASE_FULL, drop it as it is the
default.

Signed-off-by: Yoann Congal <yoann.congal@smile.fr>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20240505080343.1471198-4-yoann.congal@smile.fr
Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-05-06 17:39:09 +02:00
Jakub Kicinski
e958da0ddb Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

include/linux/filter.h
kernel/bpf/core.c
  66e13b615a ("bpf: verifier: prevent userspace memory access")
  d503a04f8b ("bpf: Add support for certain atomics in bpf_arena to x86 JIT")
https://lore.kernel.org/all/20240429114939.210328b0@canb.auug.org.au/

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-02 12:06:25 -07:00
Linus Torvalds
545c494465 Including fixes from bpf.
Relatively calm week, likely due to public holiday in most places.
 No known outstanding regressions.
 
 Current release - regressions:
 
   - rxrpc: fix wrong alignmask in __page_frag_alloc_align()
 
   - eth: e1000e: change usleep_range to udelay in PHY mdic access
 
 Previous releases - regressions:
 
   - gro: fix udp bad offset in socket lookup
 
   - bpf: fix incorrect runtime stat for arm64
 
   - tipc: fix UAF in error path
 
   - netfs: fix a potential infinite loop in extract_user_to_sg()
 
   - eth: ice: ensure the copied buf is NUL terminated
 
   - eth: qeth: fix kernel panic after setting hsuid
 
 Previous releases - always broken:
 
   - bpf:
     - verifier: prevent userspace memory access
     - xdp: use flags field to disambiguate broadcast redirect
 
   - bridge: fix multicast-to-unicast with fraglist GSO
 
   - mptcp: ensure snd_nxt is properly initialized on connect
 
   - nsh: fix outer header access in nsh_gso_segment().
 
   - eth: bcmgenet: fix racing registers access
 
   - eth: vxlan: fix stats counters.
 
 Misc:
 
   - a bunch of MAINTAINERS file updates
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmYzaRsSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkh70P/jzsTsvzHspu3RUwcsyvWpSoJPcxP2tF
 5SKR66o8sbSjB5I26zUi/LtRZgbPO32GmLN2Y8GvP74h9lwKdDo4AY4volZKCT6f
 lRG6GohvMa0lSPSn1fti7CKVzDOsaTHvLz3uBBr+Xb9ITCKh+I+zGEEDGj/47SQN
 tmDWHPF8OMs2ezmYS5NqRIQ3CeRz6uyLmEoZhVm4SolypZ18oEg7GCtL3u6U48n+
 e3XB3WwKl0ZxK8ipvPgUDwGIDuM5hEyAaeNon3zpYGoqitRsRITUjULpb9dT4DtJ
 Jma3OkarFJNXgm4N/p/nAtQ9AdiAloF9ivZXs2t0XCdrrUZJUh05yuikoX+mLfpw
 GedG2AbaVl6mdqNkrHeyf5SXKuiPgeCLVfF2xMjS0l1kFbY+Bt8BqnRSdOrcoUG0
 zlSzBeBtajttMdnalWv2ZshjP8uo/NjXydUjoVNwuq8xGO5wP+zhNnwhOvecNyUg
 t7q2PLokahlz4oyDqyY/7SQ0hSEndqxOlt43I6CthoWH0XkS83nTPdQXcTKQParD
 ntJUk5QYwefUT1gimbn/N8GoP7a1+ysWiqcf/7+SNm932gJGiDt36+HOEmyhIfIG
 IDWTWJJW64SnPBIUw59MrG7hMtbfaiZiFQqeUJQpFVrRr+tg5z5NUZ5thA+EJVd8
 qiVDvmngZFiv
 =f6KY
 -----END PGP SIGNATURE-----

Merge tag 'net-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from bpf.

  Relatively calm week, likely due to public holiday in most places. No
  known outstanding regressions.

  Current release - regressions:

   - rxrpc: fix wrong alignmask in __page_frag_alloc_align()

   - eth: e1000e: change usleep_range to udelay in PHY mdic access

  Previous releases - regressions:

   - gro: fix udp bad offset in socket lookup

   - bpf: fix incorrect runtime stat for arm64

   - tipc: fix UAF in error path

   - netfs: fix a potential infinite loop in extract_user_to_sg()

   - eth: ice: ensure the copied buf is NUL terminated

   - eth: qeth: fix kernel panic after setting hsuid

  Previous releases - always broken:

   - bpf:
       - verifier: prevent userspace memory access
       - xdp: use flags field to disambiguate broadcast redirect

   - bridge: fix multicast-to-unicast with fraglist GSO

   - mptcp: ensure snd_nxt is properly initialized on connect

   - nsh: fix outer header access in nsh_gso_segment().

   - eth: bcmgenet: fix racing registers access

   - eth: vxlan: fix stats counters.

  Misc:

   - a bunch of MAINTAINERS file updates"

* tag 'net-6.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits)
  MAINTAINERS: mark MYRICOM MYRI-10G as Orphan
  MAINTAINERS: remove Ariel Elior
  net: gro: add flush check in udp_gro_receive_segment
  net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb
  ipv4: Fix uninit-value access in __ip_make_skb()
  s390/qeth: Fix kernel panic after setting hsuid
  vxlan: Pull inner IP header in vxlan_rcv().
  tipc: fix a possible memleak in tipc_buf_append
  tipc: fix UAF in error path
  rxrpc: Clients must accept conn from any address
  net: core: reject skb_copy(_expand) for fraglist GSO skbs
  net: bridge: fix multicast-to-unicast with fraglist GSO
  mptcp: ensure snd_nxt is properly initialized on connect
  e1000e: change usleep_range to udelay in PHY mdic access
  net: dsa: mv88e6xxx: Fix number of databases for 88E6141 / 88E6341
  cxgb4: Properly lock TX queue for the selftest.
  rxrpc: Fix using alignmask being zero for __page_frag_alloc_align()
  vxlan: Add missing VNI filter counter update in arp_reduce().
  vxlan: Fix racy device stats updates.
  net: qede: use return from qede_parse_actions()
  ...
2024-05-02 08:51:47 -07:00
Arnd Bergmann
0ea32f50b3 RISC-V Devicetrees for v6.10
Sophgo:
 Added sdhci support for cv18xx/duo.
 Added clock support for cv18xx.
 Added clock for uart/sdhci.
 Added spi support for cv18xx.
 Added i2c support for cv18xx.
 Added reserved memory node for cv1800b/duo.
 
 Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEyXBewyFOIoMq4pDN7RKLaY1m8IIFAmYxmBsACgkQ7RKLaY1m
 8IKJbwwA2MF4SGcM7ZBdrYrom0leEIv7/UrGyGLOK3GUY57hp2upsrpE9j66x8sV
 5GGEdlUoAzlcHFe23FzA44JVhRm3NcgUo5CAZRrjrk9M17RKdCWziiJ5h4ZbP1Ue
 a787yAAg58OmMLSpaJBoO0IwjFiUS8nUC9YUSqvWmVhtsvHO/n7aZKvYOdgZ3lnA
 LtlQGpsuz3q8q5Gk/tmS8astIkxzCMVlLfipkETqF5H5dtlw/tl7sIqubxUnnWea
 /qj6QiSn+li2jdk4VQOw57Fc6IEkFZtYJGRwnCKmZE7thHA2z8HZ+gIXDq/X0CU8
 PTnwQ7nO0oRJn0bQLACy+vO/9iznRNo+y3HAoVQtDjB/Xef6JbIrI/1pnft8UUQO
 ZBGvuvXbq7K1uTpVHIrup9AnxwcNTzNFJ6wK/jap/R+CAIZKA2GSpNgd0qDOxzGh
 OLjTVnU1jc7VhB2gaNEaox9zs0s6MOxMwz9iRLAA3Dsyitbx1AEjq63vYXE3KR72
 dnvT4Y8j
 =d7SS
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmYzjYsACgkQYKtH/8kJ
 Uie5nA//Sib+nyw4b/v6Mk1c+OWl+N+V9iaennHJ3hcRxC/AQDF6RbwPi3DVItkz
 KvXFGa7buTl1HzmiBVqjv1k2lZTFdZo6ZndwGwr/V1qd9pP+USoC/5ua4XatYdP+
 UT9RtXnqTMRzYFXVYnYKzs3phhteGfqg8yAmqOqNN6mtkgzy0Wi3NGoPWVD83G2V
 PZY0LcC3H9qU0lM3KXicHGWAPWgrvOo9iVdeXNlKt0QKvaMRw0t3tkKQSb2HEBvv
 M2T6uknTir8lFdXhC/6r26zNiC0Iuk9/xtVuTvyI1NWyscLy+F0orR7QT49axEBO
 C425MQ34OkyPfplGvROWq9PN4jpnbyMIlFnP0/VJyjPfO6oZ7MYNJ6hbYjrNDf/7
 n/8RugMxtUZVwBHTb5pKvCgB51/zf80N5bNVW1yRQ3HbUFz91VaPTh5m+WKFz7BQ
 6rETCpdS4ai0tkREbfbNPhBmid3SuqF9wTWEOcfh2BqRCjc+c4In/XsVFkDLGxL/
 peZku7C11hTDc3L/WazU2hwwjye28aoqykhr6B7CQ7MiThkUnknXrXkdfqwiTDJj
 Dv4e2wl0LU3zN66NFTlpUQ4fYXTD6IH8d3ps9qLaY3KbALbMR8x+oTnwBfxYUJuL
 oP/NT6kEFCRP1pdU9E4OXhvaqvfGQI4/gETnefXmsG68u0dtS24=
 =5D+b
 -----END PGP SIGNATURE-----

Merge tag 'riscv-sophgo-dt-for-v6.10' of https://github.com/sophgo/linux into soc/dt

RISC-V Devicetrees for v6.10

Sophgo:
Added sdhci support for cv18xx/duo.
Added clock support for cv18xx.
Added clock for uart/sdhci.
Added spi support for cv18xx.
Added i2c support for cv18xx.
Added reserved memory node for cv1800b/duo.

Signed-off-by: Chen Wang <unicorn_wang@outlook.com>

* tag 'riscv-sophgo-dt-for-v6.10' of https://github.com/sophgo/linux:
  riscv: dts: sophgo: add reserved memory node for CV1800B
  riscv: dts: sophgo: use real clock for sdhci
  riscv: dts: sophgo: cv18xx: Add i2c devices
  riscv: dts: sophgo: cv18xx: Add spi devices
  riscv: dts: sophgo: add uart clock for Sophgo CV1800 series SoC
  riscv: dts: sophgo: add clock generator for Sophgo CV1800 series SoC
  riscv: dts: sophgo: add sdcard support for milkv duo

Link: https://lore.kernel.org/r/MA0P287MB2822CA2DE757787D6EA3B1F8FE192@MA0P287MB2822.INDP287.PROD.OUTLOOK.COM
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-02 14:56:43 +02:00
Jisheng Zhang
9276badd9d riscv: dts: starfive: add Milkv Mars board device tree
The Milkv Mars is a development board based on the Starfive JH7110 SoC.
The board features:

- JH7110 SoC
- 1/2/4/8 GiB LPDDR4 DRAM
- AXP15060 PMIC
- 40 pin GPIO header
- 3x USB 3.0 host port
- 1x USB 2.0 host port
- 1x M.2 E-Key
- 1x eMMC slot
- 1x MicroSD slot
- 1x QSPI Flash
- 1x 1Gbps Ethernet port
- 1x HDMI port
- 1x 2-lane DSI and 1x 4-lane DSI
- 1x 2-lane CSI

Add the devicetree file describing the currently supported features,
namely PMIC, UART, I2C, GPIO, SD card, QSPI Flash, eMMC and Ethernet.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-04-30 22:04:17 +01:00
Jisheng Zhang
ac9a37e2d6 riscv: dts: starfive: introduce a common board dtsi for jh7110 based boards
This is to prepare for Milkv Mars board dts support in the following
patch. Let's factored out common part into .dtsi.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-04-30 22:04:17 +01:00
Jisheng Zhang
07da6ddf51 riscv: dts: starfive: visionfive 2: add "disable-wp" for tfcard
No physical write-protect line is present, so setting "disable-wp".

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-04-30 22:04:17 +01:00
Jisheng Zhang
0ffce9d49a riscv: dts: starfive: visionfive 2: add tf cd-gpios
Per VisionFive 2 1.2B, and 1.3A boards' SCH, GPIO 41 is used as
card detect. So add "cd-gpios" property for this.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-04-30 22:04:16 +01:00
Jisheng Zhang
ffddddf4aa riscv: dts: starfive: visionfive 2: use cpus label for timebase freq
As pointed out by Krzysztof "Board should not bring new CPU nodes.
Override by label instead."

Suggested-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-04-30 22:04:16 +01:00
Jisheng Zhang
b9a1481f25 riscv: dts: starfive: visionfive 2: update sound and codec dt node name
Use "audio-codec" as the codec dt node name, and "sound" as the simple
audio card dt name.

Suggested-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-04-30 22:04:16 +01:00
Jisheng Zhang
5e7922abdd riscv: dts: starfive: add 'cpus' label to jh7110 and jh7100 soc dtsi
Add the 'cpus' label so that we can reference it in board dts files.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-04-30 22:04:16 +01:00
Palmer Dabbelt
4f16345d92
Merge patch series "riscv: ASID-related and UP-related TLB flush enhancements"
Samuel Holland <samuel.holland@sifive.com> says:

This series converts uniprocessor kernel builds to use the same TLB
flushing code as SMP builds, to take advantage of batching and existing
range- and ASID-based TLB flush optimizations. It optimizes out IPIs and
SBI calls based on the online CPU count, which also covers the scenario
where SMP was enabled at build time but only one CPU is present/online.
A final optimization is to use single-ASID flushes wherever possible, to
avoid unnecessary TLB misses for kernel mappings.

This series has a semantic conflict with the AIA patches that are in
linux-next due to the removal of the third parameter of
riscv_ipi_set_virq_range(), which is called from imsic_ipi_domain_init()
in drivers/irqchip/irq-riscv-imsic-early.c. The resolution is to remove
the extra argument from the call site.

Here are some numbers from D1 which show the performance impact:

v6.9-rc1:
 System Benchmarks Partial Index              BASELINE       RESULT    INDEX
 Execl Throughput                                 43.0        198.5     46.2
 File Copy 1024 bufsize 2000 maxblocks          3960.0      73934.4    186.7
 File Copy 256 bufsize 500 maxblocks            1655.0      20242.6    122.3
 File Copy 4096 bufsize 8000 maxblocks          5800.0     197706.4    340.9
 Pipe Throughput                               12440.0     176974.2    142.3
 Pipe-based Context Switching                   4000.0      23626.8     59.1
 Process Creation                                126.0        449.9     35.7
 Shell Scripts (1 concurrent)                     42.4        544.4    128.4
 Shell Scripts (16 concurrent)                     ---         35.3      ---
 Shell Scripts (8 concurrent)                      6.0         71.6    119.3
 System Call Overhead                          15000.0     248072.6    165.4
                                                                    ========
 System Benchmarks Index Score (Partial Only)                          110.6

v6.9-rc1 + this patch series:
 System Benchmarks Partial Index              BASELINE       RESULT    INDEX
 Execl Throughput                                 43.0        196.8     45.8
 File Copy 1024 bufsize 2000 maxblocks          3960.0      71782.2    181.3
 File Copy 256 bufsize 500 maxblocks            1655.0      21269.4    128.5
 File Copy 4096 bufsize 8000 maxblocks          5800.0     199424.0    343.8
 Pipe Throughput                               12440.0     196468.6    157.9
 Pipe-based Context Switching                   4000.0      24261.8     60.7
 Process Creation                                126.0        459.0     36.4
 Shell Scripts (1 concurrent)                     42.4        543.8    128.2
 Shell Scripts (16 concurrent)                     ---         35.5      ---
 Shell Scripts (8 concurrent)                      6.0         71.7    119.6
 System Call Overhead                          15000.0     259415.2    172.9
                                                                    ========
 System Benchmarks Index Score (Partial Only)                          113.0

* b4-shazam-lts:
  riscv: mm: Always use an ASID to flush mm contexts
  riscv: mm: Preserve global TLB entries when switching contexts
  riscv: mm: Make asid_bits a local variable
  riscv: mm: Use a fixed layout for the MM context ID
  riscv: mm: Introduce cntx2asid/cntx2version helper macros
  riscv: Avoid TLB flush loops when affected by SiFive CIP-1200
  riscv: Apply SiFive CIP-1200 workaround to single-ASID sfence.vma
  riscv: mm: Combine the SMP and UP TLB flush code
  riscv: Only send remote fences when some other CPU is online
  riscv: mm: Broadcast kernel TLB flushes only when needed
  riscv: Use IPIs for remote cache/TLB flushes by default
  riscv: Factor out page table TLB synchronization
  riscv: Flush the instruction cache during SMP bringup

Link: https://lore.kernel.org/r/20240327045035.368512-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-30 10:35:48 -07:00
Jisheng Zhang
48b4fc6693
riscv: select ARCH_HAS_FAST_MULTIPLIER
Currently, riscv linux requires at least IMA, so all platforms have a
multiplier. And I assume the 'mul' efficiency is comparable or better
than a sequence of five or so register-dependent arithmetic
instructions. Select ARCH_HAS_FAST_MULTIPLIER to get slightly nicer
codegen. Refer to commit f9b4192923 ("[PATCH] bitops: hweight()
speedup") for more details.

In a simple benchmark test calling hweight64() in a loop, it got:
about 14% performance improvement on JH7110, tested on Milkv Mars.

about 23% performance improvement on TH1520 and SG2042, tested on
Sipeed LPI4A and SG2042 platform.

a slight performance drop on CV1800B, tested on milkv duo. Among all
riscv platforms in my hands, this is the only one which sees a slight
performance drop. It means the 'mul' isn't quick enough. However, the
situation exists on x86 too, for example, P4 doesn't have fast
integer multiplies as said in the above commit, x86 also selects
ARCH_HAS_FAST_MULTIPLIER. So let's select ARCH_HAS_FAST_MULTIPLIER
which can benefit almost riscv platforms.

Samuel also provided some performance numbers:
On Unmatched: 20% speedup for __sw_hweight32 and 30% speedup for
__sw_hweight64.
On D1: 8% speedup for __sw_hweight32 and 8% slowdown for
__sw_hweight64.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Tested-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240325105823.1483-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-30 10:35:47 -07:00
Palmer Dabbelt
7845f52256
Merge patch series "riscv: enable lockless lockref implementation"
Jisheng Zhang <jszhang@kernel.org> says:

This series selects ARCH_USE_CMPXCHG_LOCKREF to enable the
cmpxchg-based lockless lockref implementation for riscv. Then,
implement arch_cmpxchg64_{relaxed|acquire|release}.

After patch1:
Using Linus' test case[1] on TH1520 platform, I see a 11.2% improvement.
On JH7110 platform, I see 12.0% improvement.

After patch2:
on both TH1520 and JH7110 platforms, I didn't see obvious
performance improvement with Linus' test case [1]. IMHO, this may
be related with the fence and lr.d/sc.d hw implementations. In theory,
lr/sc without fence could give performance improvement over lr/sc plus
fence, so add the code here to leave performance improvement room on
newer HW platforms.

* b4-shazam-merge:
  riscv: cmpxchg: implement arch_cmpxchg64_{relaxed|acquire|release}
  riscv: select ARCH_USE_CMPXCHG_LOCKREF

Link: http://marc.info/?l=linux-fsdevel&m=137782380714721&w=4 [1]
Link: https://lore.kernel.org/r/20240325111038.1700-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-30 10:35:46 -07:00
Jisheng Zhang
dcb2743d1e
riscv: mm: still create swiotlb buffer for kmalloc() bouncing if required
After commit f51f7a0fc2 ("riscv: enable DMA_BOUNCE_UNALIGNED_KMALLOC
for !dma_coherent"), for non-coherent platforms with less than 4GB
memory, we rely on users to pass "swiotlb=mmnn,force" kernel parameters
to enable DMA bouncing for unaligned kmalloc() buffers. Now let's go
further: If no bouncing needed for ZONE_DMA, let kernel automatically
allocate 1MB swiotlb buffer per 1GB of RAM for kmalloc() bouncing on
non-coherent platforms, so that no need to pass "swiotlb=mmnn,force"
any more.

The math of "1MB swiotlb buffer per 1GB of RAM for kmalloc() bouncing"
is taken from arm64. Users can still force smaller swiotlb buffer by
passing "swiotlb=mmnn".

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240325110036.1564-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-30 10:35:45 -07:00