Commit Graph

117007 Commits

Author SHA1 Message Date
Peter Maydell
8a132968b2 softfloat: Allow 2-operand NaN propagation rule to be set at runtime
IEEE 758 does not define a fixed rule for which NaN to pick as the
result if both operands of a 2-operand operation are NaNs.  As a
result different architectures have ended up with different rules for
propagating NaNs.

QEMU currently hardcodes the NaN propagation logic into the binary
because pickNaN() has an ifdef ladder for different targets.  We want
to make the propagation rule instead be selectable at runtime,
because:
 * this will let us have multiple targets in one QEMU binary
 * the Arm FEAT_AFP architectural feature includes letting
   the guest select a NaN propagation rule at runtime
 * x86 specifies different propagation rules for x87 FPU ops
   and for SSE ops, and specifying the rule in the float_status
   would let us emulate this, instead of wrongly using the
   x87 rules everywhere

In this commit we add an enum for the propagation rule, the field in
float_status, and the corresponding getters and setters.  We change
pickNaN to honour this, but because all targets still leave this
field at its default 0 value, the fallback logic will pick the rule
type with the old ifdef ladder.

It's valid not to set a propagation rule if default_nan_mode is
enabled, because in that case there's no need to pick a NaN; all the
callers of pickNaN() catch this case and skip calling it.  So we can
already assert that we don't get into the "no rule defined" codepath
for our four targets which always set default_nan_mode: Hexagon,
RiscV, SH4 and Tricore, and for the one target which does not have FP
at all: avr.  These targets will not need to be updated to call
set_float_2nan_prop_rule().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20241025141254.2141506-2-peter.maydell@linaro.org
2024-11-05 10:09:52 +00:00
Peter Maydell
9a7b0a8618 aspeed queue:
* Fixed eMMC size calculation
 * Fixed IRQ definitions on AST2700
 * Added RTC support to AST2700
 * Fixed timer IRQ status on AST2600
 * Improved SDHCI model with new registers
 * Added -nodefaults support to AST1030
 * Provided a way to use an eMMC device without boot partitions
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmcoo4oACgkQUaNDx8/7
 7KGYAQ/9GWiwM7SHFD/WTEo6iClQCk+Do3pzGXZPQq7WLqYhBU8mYwSaqMDUtXj+
 MQVywyLxSYaKdCKessN0haATyzEDVRtxKwIRnbrSDWWnxG8NGj2esOTsU6/wgfD4
 FqARaMH91FQB6rY8QbmbGmqTJ1QbWEPXj7v2piJol5dvI2Oe8iqn/6z1Cv4NMXwh
 aYHwSVwcHLD9tfmyXP0DKN/XHLC4pTAOoU96ajcN6RRW+D6vuQEsQq0caZt8CHQc
 I2oSptU+RZF2DPbSeEB42y9I138/kQzTIaVnbBN//NLRwbzRsLlXhA92F2CJyDrD
 FGNQyynteil8F7M5Oab47fFia1QF/v4G45VOAsHpT1tLBsZPKJdRwfLLqDPZbVbG
 2lAVuukqW0gKoEHsXfVsDzcIxpX81SlUsccHY4kCxsRNnwSzCWaDK9OOTx3CAxjG
 CzzDgQszNr/12dzkWExIhLpMhQNeiUXX1veAH/jzbjyRAKxzjkDYaX2lUC3MfmqX
 irjmzOU0AbtComv4ybeBqtqNmQvUx5/y993Hgakc9mqqCoAm/Fn4qtx6uW5vSSZJ
 w4heyWbzcLp5RIzSYZypWlmgI+3bJgJq2aX276MYqAe3m8PnUCkuW9NTsfb+ARMl
 XGExHPNrAsw7eiiQsTa7Byt/jkEf3KmEp8ye+3cAvJwPgxlDyys=
 =ms8H
 -----END PGP SIGNATURE-----

Merge tag 'pull-aspeed-20241104' of https://github.com/legoater/qemu into staging

aspeed queue:

* Fixed eMMC size calculation
* Fixed IRQ definitions on AST2700
* Added RTC support to AST2700
* Fixed timer IRQ status on AST2600
* Improved SDHCI model with new registers
* Added -nodefaults support to AST1030
* Provided a way to use an eMMC device without boot partitions

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmcoo4oACgkQUaNDx8/7
# 7KGYAQ/9GWiwM7SHFD/WTEo6iClQCk+Do3pzGXZPQq7WLqYhBU8mYwSaqMDUtXj+
# MQVywyLxSYaKdCKessN0haATyzEDVRtxKwIRnbrSDWWnxG8NGj2esOTsU6/wgfD4
# FqARaMH91FQB6rY8QbmbGmqTJ1QbWEPXj7v2piJol5dvI2Oe8iqn/6z1Cv4NMXwh
# aYHwSVwcHLD9tfmyXP0DKN/XHLC4pTAOoU96ajcN6RRW+D6vuQEsQq0caZt8CHQc
# I2oSptU+RZF2DPbSeEB42y9I138/kQzTIaVnbBN//NLRwbzRsLlXhA92F2CJyDrD
# FGNQyynteil8F7M5Oab47fFia1QF/v4G45VOAsHpT1tLBsZPKJdRwfLLqDPZbVbG
# 2lAVuukqW0gKoEHsXfVsDzcIxpX81SlUsccHY4kCxsRNnwSzCWaDK9OOTx3CAxjG
# CzzDgQszNr/12dzkWExIhLpMhQNeiUXX1veAH/jzbjyRAKxzjkDYaX2lUC3MfmqX
# irjmzOU0AbtComv4ybeBqtqNmQvUx5/y993Hgakc9mqqCoAm/Fn4qtx6uW5vSSZJ
# w4heyWbzcLp5RIzSYZypWlmgI+3bJgJq2aX276MYqAe3m8PnUCkuW9NTsfb+ARMl
# XGExHPNrAsw7eiiQsTa7Byt/jkEf3KmEp8ye+3cAvJwPgxlDyys=
# =ms8H
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 04 Nov 2024 10:35:54 GMT
# gpg:                using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@redhat.com>" [full]
# gpg:                 aka "Cédric Le Goater <clg@kaod.org>" [full]
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B  0B60 51A3 43C7 CFFB ECA1

* tag 'pull-aspeed-20241104' of https://github.com/legoater/qemu:
  aspeed: Don't set always boot properties of the emmc device
  aspeed: Support create flash devices via command line for AST1030
  hw/sd/aspeed_sdhci: Introduce Capabilities Register 2 for SD slot 0 and 1
  hw/timer/aspeed: Fix interrupt status does not be cleared for AST2600
  hw/timer/aspeed: Fix coding style
  aspeed/soc: Support RTC for AST2700
  hw/arm/aspeed_ast27x0: Avoid hardcoded '256' in IRQ calculation
  hw/arm/aspeed_ast27x0: Use bsa.h for PPI definitions
  hw/sd/sdcard: Fix calculation of size when using eMMC boot partitions
  hw/arm: enable at24c with aspeed

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-11-05 10:06:08 +00:00
Peter Maydell
6b829602e2 * Various bug fixes
* Big cleanup of deprecated machines
 * Power11 support for spapr
 * XIVE improvements
 * Goodbye to Cedric and David as ppc reviewers, thank you both o7
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEETkN92lZhb0MpsKeVZ7MCdqhiHK4FAmcoEicACgkQZ7MCdqhi
 HK5M8Q//fz+ZkJndXkBjb1Oinx+q+eVtNm2JrvcWIsXyhG3K+6VxYPp69H+SRv/Z
 TWuUqMQPxq8mhQvBJlDAttp/oaUEiOcCRvs/iUoBN12L4mVxXfdoT88TZ4frN3eP
 8bePq+DW2N/7gpmsJm5CyEZPpcf9AjVHgLRp3KYFkOJ/14uzvuwnocU39gl+2IUh
 MXHTedQgMNXaKorJXk1NVdM6NxMuVhOvwxAs6ya2gwhxyA5tteo5PiQOnDJWkejf
 xg3RRsNzGYcs1Qg/3kFIf3RfEB0aYbPxROM8IfPaJWKN5KnMggj/JAkHyK1x/V3J
 wml7+cB0doMt/yRiuYJhXpyrtOqpvjRWPA6RhxECWW2kwrovv8NAF8IrFnw9NvOQ
 QC66ZaaFcbAcFrVT1e/iggU76d01II6m4OAgKcXw+FRHgps4VU9y83j7ApNnNUWN
 IXp9hkzoHi5VwX0FrG4ELUr2iEf1HASMvM8EZ/0AxzWj5iNtQB8lFsrEdaGVXyIS
 M5JaJeNjCn4koCyYaFSctH5eKtbzIwnGWnDcdTwaOuQ+9itBvY8O+HZalE6sAc5S
 kLFZ7i/Ut/qxbY5pMumt8LKD4pR1SsOxFB8dJCmn/f/tvRGtIVsoY6btNe4M0+24
 42MxZbWO6W379C32bwbtsPiGA+aLSgShjP4cWm9cgRjz4RJFnwg=
 =vmIG
 -----END PGP SIGNATURE-----

Merge tag 'pull-ppc-for-9.2-1-20241104' of https://gitlab.com/npiggin/qemu into staging

* Various bug fixes
* Big cleanup of deprecated machines
* Power11 support for spapr
* XIVE improvements
* Goodbye to Cedric and David as ppc reviewers, thank you both o7

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEETkN92lZhb0MpsKeVZ7MCdqhiHK4FAmcoEicACgkQZ7MCdqhi
# HK5M8Q//fz+ZkJndXkBjb1Oinx+q+eVtNm2JrvcWIsXyhG3K+6VxYPp69H+SRv/Z
# TWuUqMQPxq8mhQvBJlDAttp/oaUEiOcCRvs/iUoBN12L4mVxXfdoT88TZ4frN3eP
# 8bePq+DW2N/7gpmsJm5CyEZPpcf9AjVHgLRp3KYFkOJ/14uzvuwnocU39gl+2IUh
# MXHTedQgMNXaKorJXk1NVdM6NxMuVhOvwxAs6ya2gwhxyA5tteo5PiQOnDJWkejf
# xg3RRsNzGYcs1Qg/3kFIf3RfEB0aYbPxROM8IfPaJWKN5KnMggj/JAkHyK1x/V3J
# wml7+cB0doMt/yRiuYJhXpyrtOqpvjRWPA6RhxECWW2kwrovv8NAF8IrFnw9NvOQ
# QC66ZaaFcbAcFrVT1e/iggU76d01II6m4OAgKcXw+FRHgps4VU9y83j7ApNnNUWN
# IXp9hkzoHi5VwX0FrG4ELUr2iEf1HASMvM8EZ/0AxzWj5iNtQB8lFsrEdaGVXyIS
# M5JaJeNjCn4koCyYaFSctH5eKtbzIwnGWnDcdTwaOuQ+9itBvY8O+HZalE6sAc5S
# kLFZ7i/Ut/qxbY5pMumt8LKD4pR1SsOxFB8dJCmn/f/tvRGtIVsoY6btNe4M0+24
# 42MxZbWO6W379C32bwbtsPiGA+aLSgShjP4cWm9cgRjz4RJFnwg=
# =vmIG
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 04 Nov 2024 00:15:35 GMT
# gpg:                using RSA key 4E437DDA56616F4329B0A79567B30276A8621CAE
# gpg: Good signature from "Nicholas Piggin <npiggin@gmail.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 4E43 7DDA 5661 6F43 29B0  A795 67B3 0276 A862 1CAE

* tag 'pull-ppc-for-9.2-1-20241104' of https://gitlab.com/npiggin/qemu: (67 commits)
  MAINTAINERS: Remove myself as reviewer
  MAINTAINERS: Remove myself from XIVE
  MAINTAINERS: Remove myself from the PowerNV machines
  hw/ppc: Consolidate ppc440 initial mapping creation functions
  hw/ppc: Consolidate e500 initial mapping creation functions
  tests/qtest: Add XIVE tests for the powernv10 machine
  pnv/xive2: TIMA CI ops using alternative offsets or byte lengths
  pnv/xive2: TIMA support for 8-byte OS context push for PHYP
  pnv/xive: Update PIPR when updating CPPR
  pnv/xive: Add special handling for pool targets
  ppc/xive2: Support "Pull Thread Context to Odd Thread Reporting Line"
  ppc/xive2: Change context/ring specific functions to be generic
  ppc/xive2: Support "Pull Thread Context to Register" operation
  ppc/xive2: Allow 1-byte write of Target field in TIMA
  ppc/xive2: Dump the VP-group and crowd tables with 'info pic'
  ppc/xive2: Dump more NVP state with 'info pic'
  pnv/xive2: Support for "OS LGS Push" TIMA operation
  ppc/xive2: Support TIMA "Pull OS Context to Odd Thread Reporting Line"
  pnv/xive2: Define OGEN field in the TIMA
  pnv/xive: TIMA patch sets pre-req alignment and formatting changes
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-11-05 10:05:59 +00:00
Pierrick Bouvier
55c84a72ab contrib/plugins: remove Makefile for contrib/plugins
Now replaced by meson build.

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-Id: <20241023212812.1376972-4-pierrick.bouvier@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2024-11-05 09:13:51 +00:00
Pierrick Bouvier
2181b92887 meson: build contrib/plugins with meson
Tried to unify this meson.build with tests/tcg/plugins/meson.build but
the resulting modules are not output in the right directory.

Originally proposed by Anton Kochkov, thank you!

Solves: https://gitlab.com/qemu-project/qemu/-/issues/1710
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-Id: <20241023212812.1376972-3-pierrick.bouvier@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2024-11-05 09:13:51 +00:00
Pierrick Bouvier
6d630d84ca contrib/plugins/cflow: fix warning
contrib/plugins/cflow.c: In function ‘plugin_exit’:
contrib/plugins/cflow.c:167:19: error: declaration of ‘n’ shadows a previous local [-Werror=shadow=local]
  167 |         NodeData *n = l->data;
      |                   ^
contrib/plugins/cflow.c:139:9: note: shadowed declaration is here
  139 |     int n = 0;
      |         ^

Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-Id: <20241023212812.1376972-2-pierrick.bouvier@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2024-11-05 09:13:51 +00:00
Jessica Clarke
52a523af71 bsd-user: Set TaskState ts_tid for initial threads
Currently we only set it on fork.

Note: Upstream (blitz) commit also did new threads, but that code isn't
in qemu project repo yet.

Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com>
Pull-Request: https://github.com/qemu-bsd-user/qemu-bsd-user/pull/52
Reviewed-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Warner Losh <imp@bsdimp.com>
2024-11-04 20:26:40 -07:00
Ilya Leoshkevich
8997452334 bsd-user/main: Allow setting tb-size
While qemu-system can set tb-size using -accel tcg,tb-size=n, there
is no similar knob for qemu-bsd-user. Add one in a way similar to how
one-insn-per-tb is already handled.

Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Warner Losh <imp@bsdimp.com>
2024-11-04 20:26:40 -07:00
Ilya Leoshkevich
1f31243a8c bsd-user/x86_64/target_arch_thread.h: Align stack
bsd-user qemu-x86_64 almost immediately dies with:

    qemu: 0x4002201a68: unhandled CPU exception 0xd - aborting

on FreeBSD 14.1-RELEASE. This is an instruction that requires
alignment:

    (gdb) x/i 0x4002201a68
       0x4002201a68:        movaps %xmm0,-0x40(%rbp)

and the argument is not aligned:

    (gdb) p/x env->regs[5]
    $1 = 0x822443b58

A quick experiment shows that the userspace entry point expects
misaligned rsp:

    (gdb) starti
    (gdb) p/x $rsp
    $1 = 0x7fffffffeaa8

Emulate this behavior in bsd-user.

[[ applied Richard's suggestion ]]

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Warner Losh <imp@bsdimp.com>
Signed-off-by: Warner Losh <imp@bsdimp.com>
2024-11-04 20:26:40 -07:00
Zhenzhong Duan
096d96e7be intel_iommu: Add missed reserved bit check for IEC descriptor
IEC descriptor is 128-bit invalidation descriptor, must be padded with
128-bits of 0s in the upper bytes to create a 256-bit descriptor when
the invalidation queue is configured for 256-bit descriptors (IQA_REG.DW=1).

Fixes: 02a2cbc872 ("x86-iommu: introduce IEC notifiers")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20241104125536.1236118-4-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Zhenzhong Duan
8e761fb61c intel_iommu: Add missed sanity check for 256-bit invalidation queue
According to VTD spec, a 256-bit descriptor will result in an invalid
descriptor error if submitted in an IQ that is setup to provide hardware
with 128-bit descriptors (IQA_REG.DW=0). Meanwhile, there are old inv desc
types (e.g. iotlb_inv_desc) that can be either 128bits or 256bits. If a
128-bit version of this descriptor is submitted into an IQ that is setup
to provide hardware with 256-bit descriptors will also result in an invalid
descriptor error.

The 2nd will be captured by the tail register update. So we only need to
focus on the 1st.

Because the reserved bit check between different types of invalidation desc
are common, so introduce a common function vtd_inv_desc_reserved_check()
to do all the checks and pass the differences as parameters.

With this change, need to replace error_report_once() call with error_report()
to catch different call sites. This isn't an issue as error_report_once()
here is mainly used to help debug guest error, but it only dumps once in
qemu life cycle and doesn't help much, we need error_report() instead.

Fixes: c0c1d35184 ("intel_iommu: add 256 bits qi_desc support")
Suggested-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20241104125536.1236118-3-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Zhenzhong Duan
e70e83f561 intel_iommu: Send IQE event when setting reserved bit in IQT_TAIL
According to VTD spec, Figure 11-22, Invalidation Queue Tail Register,
"When Descriptor Width (DW) field in Invalidation Queue Address Register
(IQA_REG) is Set (256-bit descriptors), hardware treats bit-4 as reserved
and a value of 1 in the bit will result in invalidation queue error."

Current code missed to send IQE event to guest, fix it.

Fixes: c0c1d35184 ("intel_iommu: add 256 bits qi_desc support")
Suggested-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Message-Id: <20241104125536.1236118-2-zhenzhong.duan@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Salil Mehta
65fb66980d hw/acpi: Update GED with vCPU Hotplug VMSD for migration
The ACPI CPU hotplug states must be migrated along with other vCPU
hotplug states to the destination VM. Update the GED's VM State
Description (VMSD) table subsection to conditionally include the CPU
Hotplug VM State Description (VMSD).

Excerpt of GED VMSD State Dump at Source:

    "acpi-ged (16)": {
        "ged_state": {
            "sel": "0x00000000"
        },
        [...]
        "acpi-ged/cpuhp": {
            "cpuhp_state": {
                "selector": "0x00000005",
                "command": "0x00",
                "devs": [
                    {
                        "is_inserting": false,
                        "is_removing": false,
                        "ost_event": "0x00000000",
                        "ost_status": "0x00000000"
                    },
		    [...]
                    {
                        "is_inserting": false,
                        "is_removing": false,
                        "ost_event": "0x00000000",
                        "ost_status": "0x00000000"
                    }
                ]
            }
        }
    },

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Message-Id: <20241103102419.202225-6-salil.mehta@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Salil Mehta
4d62d15b11 tests/qtest/bios-tables-test: Update DSDT golden masters for x86/{pc,q35}
Update DSDT golden master files for x86/pc and x86/q35 platforms to
accommodate changes made in the architecture-agnostic CPU AML. These
updates notify the guest OS of vCPU hot-plug and hot-unplug status
using the ACPI `_STA.Enabled` bit.

The following is a diff of the changes in the .dsl file generated with
IASL:

@@ -1480,6 +1480,7 @@
                 CRMV,   1,
                 CEJ0,   1,
                 CEJF,   1,
+                CPRS,   1,
                 Offset (0x05),
                 CCMD,   8
             }
@@ -1514,9 +1515,16 @@
                 Acquire (\_SB.PCI0.PRES.CPLK, 0xFFFF)
                 \_SB.PCI0.PRES.CSEL = Arg0
                 Local0 = Zero
-                If ((\_SB.PCI0.PRES.CPEN == One))
-                {
-                    Local0 = 0x0F
+                If ((\_SB.PCI0.PRES.CPRS == One))
+                {
+                    If ((\_SB.PCI0.PRES.CPEN == One))
+                    {
+                        Local0 = 0x0F
+                    }
+                    Else
+                    {
+                        Local0 = 0x0D
+                    }
                 }

                 Release (\_SB.PCI0.PRES.CPLK)

Reported-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Salil Mehta
bf1ecc8dad hw/acpi: Update ACPI _STA method with QOM vCPU ACPI Hotplug states
Reflect the QOM vCPUs ACPI CPU hotplug states in the `_STA.Present` and
and `_STA.Enabled` bits when the guest kernel evaluates the ACPI
`_STA` method during initialization, as well as when vCPUs are
hot-plugged or hot-unplugged. If the CPU is present then the its
`enabled` status can be fetched using architecture-specific code [1].

Reference:
[1] Example implementation of architecture-specific hook to fetch CPU
    `enabled status
    Link: c0b416b11e

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Message-Id: <20241103102419.202225-4-salil.mehta@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Salil Mehta
e98411c2cb qtest: allow ACPI DSDT Table changes
list changed files in tests/qtest/bios-tables-test-allowed-diff.h

Reported-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Message-Id: <20241103102419.202225-3-salil.mehta@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Salil Mehta
2d6cfbaf17 hw/acpi: Make CPUs ACPI presence conditional during vCPU hot-unplug
On most architectures, during vCPU hot-plug and hot-unplug actions, the
firmware or VMM/QEMU can update the OS on vCPU status by toggling the
ACPI method `_STA.Present` bit. However, certain CPU architectures
prohibit [1] modifications to a CPU’s `presence` status after the kernel
has booted.

This limitation [2][3] exists because many per-CPU components, such as
interrupt controllers and various per-CPU features tightly integrated
with CPUs, may not support reconfiguration once the kernel is
initialized. Often, these components cannot be powered down, as they may
belong to an `always-on` power domain. As a result, some architectures
require all CPUs to remain `_STA.Present` after system initialization.

Therefore, it is essential to mirror the exact QOM vCPU status through
ACPI for the Guest kernel. For this, we should determine—via
architecture-specific code[4]—whether vCPUs must always remain present
and whether the associated `AcpiCpuStatus::cpu` object should remain
valid, even following a vCPU hot-unplug operation.

References:
[1] Check comment 5 in the bugzilla entry
    Link: https://bugzilla.tianocore.org/show_bug.cgi?id=4481#c5
[2] KVMForum 2023 Presentation: Challenges Revisited in Supporting Virt CPU Hotplug on
    architectures that don’t Support CPU Hotplug (like ARM64)
    a. Kernel Link: https://kvm-forum.qemu.org/2023/KVM-forum-cpu-hotplug_7OJ1YyJ.pdf
    b. Qemu Link:  https://kvm-forum.qemu.org/2023/Challenges_Revisited_in_Supporting_Virt_CPU_Hotplug_-__ii0iNb3.pdf
[3] KVMForum 2020 Presentation: Challenges in Supporting Virtual CPU Hotplug on
    SoC Based Systems (like ARM64)
    Link: https://kvmforum2020.sched.com/event/eE4m
[4] Example implementation of architecture-specific CPU persistence hook
    Link: c0b416b11e

Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Message-Id: <20241103102419.202225-2-salil.mehta@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Roque Arcudia Hernandez
26f2660bf7 hw/pci: Add parenthesis to PCI_BUILD_BDF macro
The bus parameter in the macro PCI_BUILD_BDF is not surrounded by
parenthesis. This can create a compile error when warnings are
treated as errors or can potentially create runtime errors due to the
operator precedence.

For instance:

 file.c32: error: suggest parentheses around '-' inside '<<'
 [-Werror=parentheses]
   171 | uint16_t bdf = PCI_BUILD_BDF(a - b, sdev->devfn);
       |                              ~~^~~
 include/hw/pci/pci.h:19:41: note: in definition of macro
 'PCI_BUILD_BDF'
    19 | #define PCI_BUILD_BDF(bus, devfn)     ((bus << 8) | (devfn))
       |                                         ^~~
 cc1: all warnings being treated as errors

Signed-off-by: Roque Arcudia Hernandez <roqueh@google.com>
Reviewed-by: Nabih Estefan <nabihestefan@google.com>
Message-Id: <20241101215923.3399311-1-roqueh@google.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Jonathan Cameron
721c99aefc hw/cxl: Ensure there is enough data to read the input header in cmd_get_physical_port_state()
If len_in is smaller than the header length then the accessing the
number of ports will result in an out of bounds access.
Add a check to avoid this.

Reported-by: Esifiel <esifiel@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20241101133917.27634-11-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Jonathan Cameron
5300bdf589 hw/cxl: Ensure there is enough data for the header in cmd_ccls_set_lsa()
The properties of the requested set command cannot be established if
len_in is less than the size of the header.

Reported-by: Esifiel <esifiel@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20241101133917.27634-10-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Jonathan Cameron
c1c4d6b38b hw/cxl: Check that writes do not go beyond end of target attributes
In cmd_features_set_feature() the an offset + data size schemed
is used to allow for large features.  Ensure this does not write
beyond the end fo the buffers used to accumulate the full feature
attribute set.

Reported-by: Esifiel <esifiel@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20241101133917.27634-9-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Jonathan Cameron
c0f122419f hw/cxl: Ensuring enough data to read parameters in cmd_tunnel_management_cmd()
If len_in is less than the minimum spec allowed value, then return
CXL_MBOX_INVALID_PAYLOAD_LENGTH

Reported-by: Esifiel <esifiel@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20241101133917.27634-8-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Jonathan Cameron
a3de73c2a8 hw/cxl: Avoid accesses beyond the end of cel_log.
Add a check that the requested offset + length does not go beyond the end
of the cel_log.

Whilst the cci->cel_log is large enough to include all possible CEL
entries, the guest might still ask for entries beyond the end of it.
Move the comment to this new check rather than before the check on the
type of log requested.

Reported-by: Esifiel <esifiel@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20241101133917.27634-7-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Jonathan Cameron
f9f0fa2438 hw/cxl: Check the length of data requested fits in get_log()
Checking offset + length is of no relevance when verifying the CEL
data will fit in the mailbox payload. Only the length is is relevant.

Note that this removes a potential overflow.

Reported-by: Esifiel <esifiel@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20241101133917.27634-6-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Jonathan Cameron
a3995360ae hw/cxl: Check enough data in cmd_firmware_update_transfer()
Buggy guest can write a message that advertises more data that
is provided. As QEMU internally duplicates the reported message
size, this may result in an out of bounds access.
Add sanity checks on the size to avoid this.

Reported-by: Esifiel <esifiel@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20241101133917.27634-5-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Jonathan Cameron
f4a12ba66b hw/cxl: Check input length is large enough in cmd_events_clear_records()
Buggy software might write a message that is too short for
either the header, or the header + the event data that is specified
in the header.  This may result in accesses beyond the range of the
message allocated as a duplicate of the incoming message buffer.

Reported-by: Esifiel <esifiel@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20241101133917.27634-4-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Jonathan Cameron
91a743bd02 hw/cxl: Check input includes at least the header in cmd_features_set_feature()
A buggy guest might write an insufficiently large message.
Check the header is present. Whilst zero data after the header is very
odd it will just result in failure to copy any data.

Reported-by: Esifiel <esifiel@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20241101133917.27634-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Jonathan Cameron
7edbbff5ee hw/cxl: Check size of input data to dynamic capacity mailbox commands
cxl_cmd_dcd_release_dyn_cap() and cmd_dcd_add_dyn_cap_rsp() are missing
input message size checks.  These must be done in the individual
commands when the command has a variable length input payload.

A buggy or malicious guest might send undersized messages via the mailbox.
As that size is used to take a copy of the mailbox content, each command
must check there is sufficient data. In this case the first check is that
there is enough data to read how many extents there are, and the second
that there is enough for those elements to be accessed.

Reported-by: Esifiel <esifiel@gmail.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20241101133917.27634-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Fan Ni
802671c37a hw/cxl/cxl-mailbox-util: Fix output buffer index update when retrieving DC extents
In the function of retrieving DC extents (cmd_dcd_get_dyn_cap_ext_list),
the output buffer index was not correctly updated while iterating the
extent list on the device, leaving the extents returned incorrect except for
the first one.

Fixes: 1c9221f19e ("hw/mem/cxl_type3: Add DC extent list representative and get DC extent list mailbox support")
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20241101132005.26633-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Fan Ni
0564019bf1 cxl/cxl-mailbox-utils: Fix size check for cmd_firmware_update_get_info
In the function cmd_firmware_update_get_info for handling Get FW info
command (0x0200h), the vmem, pmem and DC capacity size check were
incorrect. The size should be aligned to 256MiB, not smaller than
256MiB.

Signed-off-by: Fan Ni <fan.ni@samsung.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20241101132005.26633-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Marcin Juszkiewicz
449dca6ac9 pcie: enable Extended tag field support
>From what I read PCI has 32 transactions, PCI Express devices can handle
256 with Extended tag enabled (spec mentions also larger values but I
lack PCIe knowledge).

QEMU leaves 'Extended tag field' with 0 as value:

Capabilities: [e0] Express (v1) Root Complex Integrated Endpoint, IntMsgNum 0
        DevCap: MaxPayload 128 bytes, PhantFunc 0
                ExtTag- RBE+ FLReset- TEE-IO-

SBSA ACS has test 824 which checks for PCIe device capabilities. BSA
specification [1] (SBSA is on top of BSA) in section F.3.2 lists
expected values for Device Capabilities Register:

Device Capabilities Register     Requirement
Role based error reporting       RCEC and RCiEP: Hardwired to 1
Endpoint L0s acceptable latency  RCEC and RCiEP: Hardwired to 0
L1 acceptable latency            RCEC and RCiEP: Hardwired to 0
Captured slot power limit scale  RCEC and RCiEP: Hardwired to 0
Captured slot power limit value  RCEC and RCiEP: Hardwired to 0
Max payload size                 value must be compliant with PCIe spec
Phantom functions                RCEC and RCiEP: Recommendation is to
                                 hardwire this bit to 0.
Extended tag field               Hardwired to 1

1. https://developer.arm.com/documentation/den0094/c/

This change enables Extended tag field. All versioned platforms should
have it disabled for older versions (tested with Arm/virt).

Signed-off-by: Marcin Juszkiewicz <marcin.juszkiewicz@linaro.org>
Message-Id: <20241023113820.486017-1-marcin.juszkiewicz@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Zhenzhong Duan
6ce12bd297 intel_iommu: Introduce property "stale-tm" to control Transient Mapping (TM) field
VT-d spec removed Transient Mapping (TM) field from second-level page-tables
and treat the field as Reserved(0) since revision 3.2.

Changing the field as reserved(0) will break backward compatibility, so
introduce a property "stale-tm" to allow user to control the setting.

Use pc_compat_9_1 to handle the compatibility for machines before 9.2 which
allow guest to set the field. Starting from 9.2, this field is reserved(0)
by default to match spec. Of course, user can force it on command line.

This doesn't impact function of vIOMMU as there was no logic to emulate
Transient Mapping.

Suggested-by: Yi Liu <yi.l.liu@intel.com>
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Yi Liu <yi.l.liu@intel.com>
Message-Id: <20241028022514.806657-1-zhenzhong.duan@intel.com>
Reviewed-by: Clément Mathieu--Drif<clement.mathieu--drif@eviden.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Albert Esteve
eea5aeef84 vhost-user: fix shared object return values
VHOST_USER_BACKEND_SHARED_OBJECT_ADD and
VHOST_USER_BACKEND_SHARED_OBJECT_REMOVE state
in the spec that they return 0 for successful
operations, non-zero otherwise. However,
implementation relies on the return types
of the virtio-dmabuf library, with opposite
semantics (true if everything is correct,
false otherwise). Therefore, current
implementation violates the specification.

Revert the logic so that the implementation
of the vhost-user handling methods matches
the specification.

Fixes: 043e127a12
Fixes: 1609476662
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Albert Esteve <aesteve@redhat.com>
Message-Id: <20241022124615.585596-1-aesteve@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Jonathan Cameron
d4d5212c54 hw/pci-bridge: Make pxb_dev_realize_common() return if it succeeded
For the CXL PXB there is additional code after pxb_dev_realize_common()
is called.  If that realize failed (e.g. due to an out of range numa_node)
we will get a segfault.  Return a bool so the caller can check if the
pxb_dev_realize_common() succeeded or not without having to poke around
in the errp.

Fixes: 4f8db8711c ("hw/pxb: Allow creation of a CXL PXB (host bridge)")
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20241014121902.2146424-8-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:25 -05:00
Jonathan Cameron
d1978226c8 hw/cxl: Fix indent of structure member
Add missing 4 spaces of indent to structure element.

Reported-by: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20241014121902.2146424-7-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:24 -05:00
Shiju Jose
d1853190db hw/cxl/cxl-mailbox-utils: Fix for device DDR5 ECS control feature tables
CXL spec 3.1 section 8.2.9.9.11.2 describes the DDR5 Error Check Scrub (ECS)
control feature.

ECS log capabilities field in following ECS tables, which is common for all
memory media FRUs in a CXL device.

Fix struct CXLMemECSReadAttrs and struct CXLMemECSWriteAttrs to make
log entry type field common.

Fixes: 2d41ce38fb ("hw/cxl/cxl-mailbox-utils: Add device DDR5 ECS control feature")
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20241014121902.2146424-6-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:24 -05:00
Fan Ni
80ee960f8d hw/mem/cxl_type3: Fix More flag setting for dynamic capacity event records
Per cxl spec r3.1, for multiple dynamic capacity event records grouped via
the More flag, the last record in the sequence should clear the More flag.

Before the change, the More flag of the event record is cleared before
the loop of inserting records into the event log, which will leave the flag
always set once it is set in the loop.

Fixes: d0b9b28a5b ("hw/cxl/events: Add qmp interfaces to add/release dynamic capacity extents")
Signed-off-by: Fan Ni <fan.ni@samsung.com>
Link: https://lore.kernel.org/r/20240827164304.88876-2-nifan.cxl@gmail.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20241014121902.2146424-5-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:24 -05:00
Yao Xingtao
5eabca7ec0 mem/cxl_type3: Fix overlapping region validation error
When injecting a new poisoned region through qmp_cxl_inject_poison(),
the newly injected region should not overlap with existing poisoned
regions.

The current validation method does not consider the following
overlapping region:
┌───┬───────┬───┐
│a  │  b(a) │a  │
└───┴───────┴───┘
(a is a newly added region, b is an existing region, and b is a
 subregion of a)

Fixes: 9547754f40 ("hw/cxl: QMP based poison injection support")
Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20241014121902.2146424-4-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:24 -05:00
Ajay Joshi
8352756ffa hw/cxl: Fix background completion percentage calculation
The current completion percentage calculation does not account for the
relative time since the start of the background activity, this leads to
showing incorrect start percentage vs what has actually been completed.

This patch calculates the percentage based on the actual elapsed time since
the start of the operation.

Fixes: 221d2cfbdb ("hw/cxl/mbox: Add support for background operations")
Signed-off-by: Ajay Joshi <ajay.opensrc@micron.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/r/20240729102338.22337-1-ajay.opensrc@micron.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20241014121902.2146424-3-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:24 -05:00
Dmitry Frolov
df66b85f35 hw/cxl: Fix uint32 overflow cxl-mailbox-utils.c
The sum offset + length may overflow uint32. Since this sum is
compared with uint64_t return value of get_lsa_size(), it makes
sense to choose uint64_t type for offset and length.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 3ebe676a34 ("hw/cxl/device: Implement get/set Label Storage Area (LSA)")
Signed-off-by: Dmitry Frolov <frolov@swemel.ru>
Link: https://lore.kernel.org/r/20240917080925.270597-2-frolov@swemel.ru
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20241014121902.2146424-2-Jonathan.Cameron@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:24 -05:00
yaozhenguo
963b027645 virtio/vhost-user: fix qemu abort when hotunplug vhost-user-net device
During the hot-unplugging of vhost-user-net type network cards,
the vhost_user_cleanup function may add the same rcu node to
the rcu linked list. The function call in this case is as follows:

vhost_user_cleanup
    ->vhost_user_host_notifier_remove
        ->call_rcu(n, vhost_user_host_notifier_free, rcu);
    ->g_free_rcu(n, rcu);

When this happens, QEMU will abort in try_dequeue:

if (head == &dummy && qatomic_mb_read(&tail) == &dummy.next) {
    abort();
}

backtrace is as follows:
0  __pthread_kill_implementation () at /usr/lib64/libc.so.6
1  raise () at /usr/lib64/libc.so.6
2  abort () at /usr/lib64/libc.so.6
3  try_dequeue () at ../util/rcu.c:235
4  call_rcu_thread (0) at ../util/rcu.c:288
5  qemu_thread_start (0) at ../util/qemu-thread-posix.c:541
6  start_thread () at /usr/lib64/libc.so.6
7  clone3 () at /usr/lib64/libc.so.6

The reason for the abort is that adding two identical nodes to
the rcu linked list will cause the rcu linked list to become a ring,
but when the dummy node is added after the two identical nodes,
the ring is opened. But only one node is added to list with
rcu_call_count added twice. This will cause rcu try_dequeue abort.

This happens when n->addr != 0. In some scenarios, this does happen.
For example, this situation will occur when using a 32-queue DPU
vhost-user-net type network card for hot-unplug testing, because
VhostUserHostNotifier->addr will be cleared during the processing of
VHOST_USER_BACKEND_VRING_HOST_NOTIFIER_MSG. However,it is asynchronous,
so we cannot guarantee that VhostUserHostNotifier->addr is zero in
vhost_user_cleanup. Therefore, it is necessary to merge g_free_rcu
and vhost_user_host_notifier_free into one rcu node.

Fixes: 503e355465 ("virtio/vhost-user: dynamically assign VhostUserHostNotifiers")
Signed-off-by: yaozhenguo <yaozhenguo@jd.com>
Message-Id: <20241011102913.45582-1-yaozhenguo@jd.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:24 -05:00
Gao Shiyuan
55fa4be6f7 virtio-pci: fix memory_region_find for VirtIOPCIRegion's MR
As shown below, if a virtio PCI device is attached under a pci-bridge, the MR
of VirtIOPCIRegion does not belong to any address space. So memory_region_find
cannot be used to search for this MR.

Introduce the virtio-pci and pci_bridge address spaces to solve this problem.

Before:
memory-region: pci_bridge_pci
  0000000000000000-ffffffffffffffff (prio 0, i/o): pci_bridge_pci
    00000000fe840000-00000000fe840fff (prio 1, i/o): virtio-net-pci-msix
      00000000fe840000-00000000fe84003f (prio 0, i/o): msix-table
      00000000fe840800-00000000fe840807 (prio 0, i/o): msix-pba
    0000380000000000-0000380000003fff (prio 1, i/o): virtio-pci
      0000380000000000-0000380000000fff (prio 0, i/o): virtio-pci-common-virtio-net
      0000380000001000-0000380000001fff (prio 0, i/o): virtio-pci-isr-virtio-net
      0000380000002000-0000380000002fff (prio 0, i/o): virtio-pci-device-virtio-net
      0000380000003000-0000380000003fff (prio 0, i/o): virtio-pci-notify-virtio-net

After:
address-space: virtio-pci-cfg-mem-as
  0000380000000000-0000380000003fff (prio 1, i/o): virtio-pci
    0000380000000000-0000380000000fff (prio 0, i/o): virtio-pci-common-virtio-net
    0000380000001000-0000380000001fff (prio 0, i/o): virtio-pci-isr-virtio-net
    0000380000002000-0000380000002fff (prio 0, i/o): virtio-pci-device-virtio-net
    0000380000003000-0000380000003fff (prio 0, i/o): virtio-pci-notify-virtio-net

address-space: pci_bridge_pci_mem
  0000000000000000-ffffffffffffffff (prio 0, i/o): pci_bridge_pci
    00000000fe840000-00000000fe840fff (prio 1, i/o): virtio-net-pci-msix
      00000000fe840000-00000000fe84003f (prio 0, i/o): msix-table
      00000000fe840800-00000000fe840807 (prio 0, i/o): msix-pba
    0000380000000000-0000380000003fff (prio 1, i/o): virtio-pci
      0000380000000000-0000380000000fff (prio 0, i/o): virtio-pci-common-virtio-net
      0000380000001000-0000380000001fff (prio 0, i/o): virtio-pci-isr-virtio-net
      0000380000002000-0000380000002fff (prio 0, i/o): virtio-pci-device-virtio-net
      0000380000003000-0000380000003fff (prio 0, i/o): virtio-pci-notify-virtio-net

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2576
Fixes: ffa8a3e3b2 ("virtio-pci: Add lookup subregion of VirtIOPCIRegion MR")
Co-developed-by: Zuo Boqun <zuoboqun@baidu.com>
Signed-off-by: Zuo Boqun <zuoboqun@baidu.com>
Co-developed-by: Wang Liang <wangliang44@baidu.com>
Signed-off-by: Wang Liang <wangliang44@baidu.com>
Signed-off-by: Gao Shiyuan <gaoshiyuan@baidu.com>
Message-Id: <20241030131324.34144-1-gaoshiyuan@baidu.com>
Tested-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:24 -05:00
Suravee Suthikulpanit
b12cb3819b amd_iommu: Check APIC ID > 255 for XTSup
The XTSup mode enables x2APIC support for AMD IOMMU, which is needed
to support vcpu w/ APIC ID > 255.

Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
Message-Id: <20240927172913.121477-6-santosh.shukla@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:24 -05:00
Suravee Suthikulpanit
f84aad4d71 amd_iommu: Send notification when invalidate interrupt entry cache
In order to support AMD IOMMU interrupt remapping emulation with PCI
pass-through devices, QEMU needs to notify VFIO when guest IOMMU driver
updates and invalidate the guest interrupt remapping table (IRT), and
communicate information so that the host IOMMU driver can update
the shadowed interrupt remapping table in the host IOMMU.

Therefore, send notification when guest IOMMU emulates the IRT
invalidation commands.

Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
Message-Id: <20240927172913.121477-5-santosh.shukla@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:24 -05:00
Suravee Suthikulpanit
9fc9dbac61 amd_iommu: Use shared memory region for Interrupt Remapping
Use shared memory region for interrupt remapping which can be
aliased by all devices.

Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
Message-Id: <20240927172913.121477-4-santosh.shukla@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:24 -05:00
Suravee Suthikulpanit
c1f46999ef amd_iommu: Add support for pass though mode
Introduce 'nodma' shared memory region to support PT mode
so that for each device, we only create an alias to shared memory
region when DMA-remapping is disabled.

Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
Message-Id: <20240927172913.121477-3-santosh.shukla@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:24 -05:00
Suravee Suthikulpanit
2e6f051cfc amd_iommu: Rename variable mmio to mr_mmio
Rename the MMIO memory region variable 'mmio' to 'mr_mmio'
so to correctly name align with struct AMDVIState::variable type.

No functional change intended.

Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Santosh Shukla <santosh.shukla@amd.com>
Message-Id: <20240927172913.121477-2-santosh.shukla@amd.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:24 -05:00
Ricardo Ribalda
9848a76c0b tests/acpi: pc: update golden masters for DSDT
Note: since all we did is replace VarPackageOp with PackageOP,
and both are represented by Package() in ASL, the AML is
different but ASL is the same.

Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Message-Id: <20240924132417.739809-4-ribalda@chromium.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
2024-11-04 16:03:24 -05:00
Ricardo Ribalda
7916bb5431 hw/i386/acpi-build: return a non-var package from _PRT()
Windows XP seems to have issues when _PRT() returns a variable package.
We know in advance the size, so we can return a fixed package instead.
https://lore.kernel.org/qemu-devel/c82d9331-a8ce-4bb0-b51f-2ee789e27c86@ilande.co.uk/T/#m541190c942676bccf7a7f7fbcb450d94a4e2da53

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fixes: 99cb2c6c7b ("hw/i386/acpi-build: Return a pre-computed _PRT table")
Closes: https://lore.kernel.org/all/eb11c984-ebe4-4a09-9d71-1e9db7fe7e6f@ilande.co.uk/
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Message-Id: <20240924132417.739809-3-ribalda@chromium.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-11-04 16:03:24 -05:00
Ricardo Ribalda
d944497b55 tests/acpi: pc: allow DSDT acpi table changes
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
Message-Id: <20240924132417.739809-2-ribalda@chromium.org>

Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
2024-11-04 16:03:24 -05:00