OpenPOWER systems expect to be notified with such an event before a
shutdown or a reboot. An OEM SEL message is sent with specific
identifiers and a user data containing the request : OFF or REBOOT.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Skiboot, the firmware for the PowerNV platform, expects the BMC to
provide some specific IPMI sensors. These sensors are exposed in the
device tree and their values are updated by the firmware at boot time.
Sensors of interest are :
"FW Boot Progress"
"Boot Count"
As such a device is defined on the command line, we can only detect
its presence at reset time.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It adds the Naples chip which supports proper LPC interrupts via the
LPC controller rather than via an external CPLD.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: - updated for qemu-2.9
- ported on latest PowerNV patchset
- moved the IRQ handler in pnv_lpc.c
- introduced pnv_lpc_isa_irq_create() to create the ISA IRQs ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
xics_system_init() does not need 'nr_servers' anymore as it is only
used to define the 'interrupt-controller' node in the device tree. So
let's just compute the value when calling spapr_dt_xics().
This also gives us an opportunity to simplify the xics_system_init()
routine and introduce a specific spapr_ics_create() helper to create
the sPAPR ICS object.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It will be used to fill the message buffer with custom events expected
by some systems. Typically, an Open PowerNV platform guest is notified
with an OEM SEL message before a shutdown or a reboot.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This patch exposes a new IPMI routine to query a sdr entry from the
sdr table maintained by the IPMI BMC simulator. The API is very
similar to the internal sdr_find_entry() routine and should be used
the same way to query one or all sdrs.
A typical use would be to loop on the sdrs to build nodes of a device
tree.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The OCC is an on-chip microcontroller based on a ppc405 core used
for various power management tasks. It comes with a pile of additional
hardware sitting on the PIB (aka XSCOM bus). At this point we don't
emulate it (nor plan to do so). However there is one facility which
is provided by the surrounding hardware that we do need, which is the
interrupt generation facility. OPAL uses it to send itself interrupts
under some circumstances and there are other uses around the corner.
So this implement just enough to support this.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: - updated for qemu-2.9
- changed the XSCOM interface to fit new model
- QOMified the model ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The Processor Service Interface (PSI) Controller is one of the engines
of the "Bridge" unit which connects the different interfaces to the
Power Processor.
This adds just enough of the PSI bridge to handle various on-chip and
the one external interrupt. The rest of PSI has to do with the link to
the IBM FSP service processor which we don't plan to emulate (not used
on OpenPower machines).
The ics_get() and ics_resend() handlers of the XICSFabric interface of
the PowerNV machine are now defined to handle the Interrupt Control
Source of PSI. The InterruptStatsProvider interface is also modified
to dump the new ICS.
Originally from Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This provides to a PowerNV chip (POWER8) access to the Interrupt
Management area, which contains the registers of the Interrupt Control
Presenters of each thread. These are used to accept, return, forward
interrupts in the system.
This area is modeled with a per-chip container memory region holding
all the ICP registers. Each thread of a chip is then associated with
its ICP registers using a memory subregion indexed by its PIR number
in the overall region.
The device tree is populated accordingly.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Some controllers (ICP, PSI) have a base register address which is
calculated using the chip id.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This provides a new ICPState object for the PowerNV machine (POWER8).
Access to the Interrupt Management area is done though a memory
region. It contains the registers of the Interrupt Control Presenters
of each thread which are used to accept, return, forward interrupts in
the system.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It will be used by derived classes in PowerNV for customization.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Today, all the ICPs are created before the CPUs, stored in an array
under the sPAPR machine and linked to the CPU when the core threads
are realized. This modeling brings some complexity when a lookup in
the array is required and it can be simplified by allocating the ICPs
when the CPUs are.
This is the purpose of this proposal which introduces a new 'icp_type'
field under the machine and creates the ICP objects of the right type
(KVM or not) before the PowerPCCPU object are.
This change allows more cleanups : the removal of the icps array under
the sPAPR machine and the removal of the xics_get_cpu_index_by_dt_id()
helper.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Today, the ICPState array of the sPAPR machine is indexed with
'cpu_index' of the CPUState. This numbering of CPUs is internal to
QEMU and the guest only knows about what is exposed in the device
tree, that is the 'cpu_dt_id'. This is why sPAPR uses the helper
xics_get_cpu_index_by_dt_id() to do the mapping in a couple of places.
To provide a more generic XICS layer, we need to abstract the IRQ
'server' number and remove any assumption made on its nature. It
should not be used as a 'cpu_index' for lookups like xics_cpu_setup()
and xics_cpu_destroy() do.
To reach that goal, we choose to introduce a generic 'intc' backlink
under PowerPCCPU, and let the machine core init routine do the
ICPState lookup. The resulting object is passed on to xics_cpu_setup()
which does the store under PowerPCCPU. The IRQ 'server' number in XICS
is now generic. sPAPR uses 'cpu_dt_id' and PowerNV will use 'PIR'
number.
This also has the benefit of simplifying the sPAPR hcall routines
which do not need to do any ICPState lookups anymore.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
For a little while around 4.9, Linux kernels that saw the radix bit in
ibm,pa-features would attempt to set up the MMU as if they were a
hypervisor, even if they were a guest, which would cause them to
crash.
Work around this by detecting pre-ISA 3.0 guests by their lack of that
bit in option vector 1, and then removing the radix bit from
ibm,pa-features. Note: This now requires regeneration of that node
after CAS negotiation.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
[dwg: Fix style nits]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add the new node, /chosen/ibm,arch-vec-5-platform-support to the
device tree. This allows the guest to determine which modes are
supported by the hypervisor.
Update the option vector processing in h_client_architecture_support()
to handle the new MMU bits. This allows guests to request hash or
radix mode and QEMU to create the guest's HPT at this time if it is
necessary but hasn't yet been done. QEMU will terminate the guest if
it requests an unavailable mode, as required by the architecture.
Extend the ibm,pa-features node with the new ISA 3.0 values
and set the radix bit if KVM supports radix mode. This probably won't
be used directly by guests to determine the availability of radix mode
(that is indicated by the new node added above) but the architecture
requires that it be set when the hardware supports it.
If QEMU is using KVM, and KVM is capable of running in radix mode,
guests can be run in real-mode without allocating a HPT (because KVM
will use a minimal RPT). So in this case, we avoid creating the HPT
at reset time and later (during CAS) create it if it is necessary.
ISA 3.0 guests will now begin to call h_register_process_table(),
which has been added previously.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
[dwg: Strip some unneeded prefix from error messages]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The H_REGISTER_PROCESS_TABLE H_CALL is used by a guest to indicate to the
hypervisor where in memory its process table is and how translation should
be performed using this process table.
Provide the implementation of this H_CALL for a guest.
We first check for invalid flags, then parse the flags to determine the
operation, and then check the other parameters for valid values based on
the operation (register new table/deregister table/maintain registration).
The process table is then stored in the appropriate location and registered
with the hypervisor (if running under KVM), and the LPCR_[UPRT/GTSE] bits
are updated as required.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
[dwg: Correct missing prototype and uninitialized variable]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The use of the new in memory tables introduced in ISAv3.00 for translation,
also referred to as process tables, requires the introduction of 3 new
H-CALLs; H_REGISTER_PROCESS_TABLE, H_CLEAN_SLB, and H_INVALIDATE_PID.
Add shells for each of these and register them as the hypercall handlers.
Currently they all log an unimplemented hypercall and return H_FUNCTION.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
[dwg: Fix style nits]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Use the new ioctl, KVM_PPC_GET_RMMU_INFO, to fetch radix MMU
information from KVM and present the page encodings in the device tree
under ibm,processor-radix-AP-encodings. This provides page size
information to the guest which is necessary for it to use radix mode.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
[dwg: Compile fix for 32-bit targets, style nit fix]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Also use an 'sPAPRRTCState' attribute under the sPAPR machine to hold
the RTC object. Overall, these changes remove an unnecessary and
implicit dependency on SysBus.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Version: GnuPG v1
iQEcBAABAgAGBQJY/zFbAAoJEO8Ells5jWIR7LgH/A6lWkODVSKihnibRH82J9oe
rTsDdLgAGAMAur++tmNorPadZyMe/2+Cu0VsiIv591ldILruN6+jJydBzFtWFYE5
JQKa2VSTDu6bHPhr/UpRnWLhGzaJogklJR6YLkonDJznb1UnnTwEZ8c8+XD4gWLo
byo/dYF1yMnpVxSak/FkmCmwxc2K7s7P+r4FWO2CgAgY28F+/qERWJMbl1iUevQP
E1PC/XXEvhMdxi+6oYmWACdbW9/KwC5KKVELsQWYU1DcpQ7rWXCtA/mtKxvX+ePw
7CUK9ldeFXHE8uWVDnh3cWUL65Q8OtZarjMbrnN7xzcQDhMysStvVNS4QckN6/I=
=PEvc
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Tue 25 Apr 2017 12:22:03 BST
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
COLO-compare: Optimize tcp compare trace event
COLO-compare: Optimize tcp compare for option field
slirp: add a fake NC-SI backend
aspeed: add a FTGMAC100 nic
net/ftgmac100: add a 'aspeed' property
net: add FTGMAC100 support
hw/net: add MII definitions
colo-compare: Fix old packet check bug.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
NC-SI (Network Controller Sideband Interface) enables a BMC to manage
a set of NICs on a system. This model takes the simplest approach and
reverses the NC-SI packets to pretend a NIC is present and exercise
the Linux driver.
The NCSI header file <ncsi-pkt.h> comes from mainline Linux and was
untabified.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
There is a second NIC but we do not use it for the moment. We use the
'aspeed' property to tune the definition of the end of ring buffer bit
for the Aspeed SoCs.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The Aspeed SoCs have a different definition of the end of the ring
buffer bit. Add a property to specify which set of bits should be used
by the NIC.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
iQIcBAABAgAGBQJY/k9tAAoJEL2+eyfA3jBXmS4P/jwl1Z9lyFJeNNyTR/pgLqnQ
/izpS7OBhjzpXuft99dQT8XcX1q4D0Ok0Hf0cs6EffiYa1Ob925wiKi7l+Om1F/c
23aqvaLFsOAxg7DV4kZ+PN1vClfZ/8qlF5hy/CzjmORCGJZXcr29h8JnhTJctrz9
I7KEzxL+aBZHi2QrMpWFIoQgOopLbhnanvxR/SmreQvVV6/AtIBI7qIoR/1HedQu
5EPKSaB81bK2tb+EfRJa9EDGGgufZqEssN0S30DlmYk+M4L5uDNihVD/MfLAGDgB
Kz42EWNuaZj5oBHtDrpcrcEv3/bl8GCGjRJNPnyrwEEYp4/ZLxgzgoRMZNMlC2c0
J+KqY0q+3L8XZMIWb2A8rzAheuzc9CA0b1D+mZ50y0j45wFDmCSSw+LKX907isro
ZP6I2MoSufH22AbihGxKPpZXATYcIZKNnCBaZWpDFcvTu7x/5LveogpqXnqFc6Z5
+eyNYJCQoepTNucuD3+9Blqosz4hEj7qZFvHWGxpDxIsjBTe6SgZTXsNR48osWlb
xvd6hTzycYnvO5R96/e/nudJ5WQjhX3VedkkWG5VAYShCl0WeRNf/JawVYTspgtq
P2Pbq6iBcnDfXd8Xrng69fd3OMGnKKgaEVFrk+NkhU2ZvqcUqP3OCReL62UZDpB5
F83rHk4WbC8B9eqC+afO
=CVO6
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging
# gpg: Signature made Mon 24 Apr 2017 20:18:05 BST
# gpg: using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg: aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg: aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98 D624 BDBE 7B27 C0DE 3057
* remotes/cody/tags/block-pull-request:
qemu-iotests: _cleanup_qemu must be called on exit
block/rbd: Add support for reopen()
block/rbd - update variable names to more apt names
block: use bdrv_can_set_read_only() during reopen
block: introduce bdrv_can_set_read_only()
block: code movement
block: honor BDRV_O_ALLOW_RDWR when clearing bs->read_only
block: do not set BDS read_only if copy_on_read enabled
block: add bdrv_set_read_only() helper function
qemu-iotests: exclude vxhs from image creation via protocol
block/vxhs.c: Add qemu-iotests for new block device type "vxhs"
block/vxhs.c: Add support for a new block device type called "vxhs"
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Introduce check function for setting read_only flags. Will return < 0 on
error, with appropriate Error value set. Does not alter any flags.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: e2bba34ac3bc76a0c42adc390413f358ae0566e8.1491597120.git.jcody@redhat.com
A few block drivers will set the BDS read_only flag from their
.bdrv_open() function. This means the bs->read_only flag could
be set after we enable copy_on_read, as the BDRV_O_COPY_ON_READ
flag check occurs prior to the call to bdrv->bdrv_open().
This adds an error return to bdrv_set_read_only(), and an error will be
return if we try to set the BDS to read_only while copy_on_read is
enabled.
This patch also changes the behavior of vvfat. Before, vvfat could
override the drive 'readonly' flag with its own, internal 'rw' flag.
For instance, this -drive parameter would result in a writable image:
"-drive format=vvfat,dir=/tmp/vvfat,rw,if=virtio,readonly=on"
This is not correct. Now, attempting to use the above -drive parameter
will result in an error (i.e., 'rw' is incompatible with 'readonly=on').
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 0c5b4c1cc2c651471b131f21376dfd5ea24d2196.1491597120.git.jcody@redhat.com
We have a helper wrapper for checking for the BDS read_only flag,
add a helper wrapper to set the read_only flag as well.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 9b18972d05f5fa2ac16c014f0af98d680553048d.1491597120.git.jcody@redhat.com
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJY/aZSAAoJEDhwtADrkYZTPS4P/1VcTOmdzD5WnPDmAPDBi7L6
HLy1AYjFknYhE4LH3bmYHwIw32C9fIsMcXELBcYwTJAqXSyh1i27aMweq4BwBn9u
eURVldSVny7u1lyLVGwjRfxnT6073QdOnjMIxMBHkZitmD9Oov883s+yJOdoLa83
/E1lXqTgsltUXOOdD3yj9LhYoU4wLz0G07xUOB2zvk1f9UJYQWWWg9XP3158IFfI
ikJSBDI5T6gD0tucJbhzpTzkuoIoZggMCtF9gpHbmTuL/ukkunRYiPTbhA+ZNpI+
HWyh33U5v+GYAGh4ZcQmonQUztk0u6y5eisgTslCaRlfhHLEXgPmKGCFKjbhhtIz
XyhLykoR94yvfp0k4xAU5VXsogJajg84qibyIMfPeyL2cFbIbjgruPWVoclefzEL
ekZPzXxcZKH0rcfTnSbgqVnnNNuk7Nj5AYvfqKLEBFcP3I5d/D+3KZOj1a7iZxKq
hyPQ8fFtIytBs8UsAW7qOKD56PJnfCQ0Lo3vhPo4Jx9/lGdtHsCvj8QQAHo9J+Jc
3L/qZtddhEqZdowCF8/c9KtkqByPkq8BywJEc/4vUMl17SpjvFpkbzACaw+i8sY2
pQAQnnA+xeAeEbXZ/6QltrF089nSSlQaj666gUXrLOK82blksdBV3+tmXi9fKLKu
OplAMTSiBycheUNGGtGd
=ItSk
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2017-04-24' into staging
Error reporting patches for 2017-04-24
# gpg: Signature made Mon 24 Apr 2017 08:16:34 BST
# gpg: using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-error-2017-04-24:
error: Apply error_propagate_null.cocci again
qga: Make errp the last parameter of qga_vss_fsfreeze
migration: Make errp the last parameter of local functions
scsi: Make errp the last parameter of virtio_scsi_common_realize
fdc: Make errp the last parameter of fdctrl_connect_drives
nfs: Make errp the last parameter of nfs_client_open
block: Make errp the last parameter of commit_active_start
mirror: Make errp the last parameter of mirror_start_job
crypto: Make errp the last parameter of functions
block: Make errp the last parameter of bdrv_img_create
socket: Make errp the last parameter of vsock_connect_saddr
socket: Make errp the last parameter of unix_connect_saddr
socket: Make errp the last parameter of inet_connect_saddr
socket: Make errp the last parameter of socket_connect
util/error: Fix leak in error_vprepend()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Only the display controller part is created automatically on PCI
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 647d292c6f5abba8b2a614687229949b5dcb864e.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Adding vmstate saving is not in this patch because the state structure
will be changed in further patches, then another patch will add
vmstate descriptor after those changes.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: a32b7fc981a20205f96d530d8e958f12ace1104c.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch adds support for getting and using a local copy of the dirty
bitmap.
memory_region_snapshot_and_clear_dirty() will create a snapshot of the
dirty bitmap for the specified range, clear the dirty bitmap and return
the copy. The returned bitmap can be a bit larger than requested, the
range is expanded so the code can copy unsigned longs from the bitmap
and avoid atomic bit update operations.
memory_region_snapshot_get_dirty() will return the dirty status of
pages, pretty much like memory_region_get_dirty(), but using the copy
returned by memory_region_copy_and_clear_dirty().
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-3-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Move opaque to 2nd instead of the 2nd to last, so that compilers help
check with the conversion.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-7-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Commit message typo corrected]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The FTGMAC100 device is an Ethernet controller with DMA function that
can be found on Aspeed SoCs (which include NCSI).
It is fully compliant with IEEE 802.3 specification for 10/100 Mbps
Ethernet and IEEE 802.3z specification for 1000 Mbps Ethernet and
includes Reduced Media Independent Interface (RMII) and Reduced
Gigabit Media Independent Interface (RGMII) interfaces. It adopts an
AHB bus interface and integrates a link list DMA engine with direct
M-Bus accesses for transmitting and receiving packets. It has
independent TX/RX fifos, supports half and full duplex (1000 Mbps mode
only supports full duplex), flow control for full duplex and
backpressure for half duplex.
The FTGMAC100 also implements IP, TCP, UDP checksum offloads and
supports IEEE 802.1Q VLAN tag insertion and removal. It offers
high-priority transmit queue for QoS and CoS applications
This model is backed with a RealTek 8211E PHY which is the chip found
on the AST2500 EVB. It is complete enough to satisfy two different
Linux drivers and a U-Boot driver. Not supported features are :
- IEEE 802.1Q VLAN
- High Priority Transmit Queue
- Wake-On-LAN functions
The code is based on the Coldfire Fast Ethernet Controller model.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This adds comments on the Basic mode control and status registers bit
definitions. It also adds a couple of bits for 1000BASE-T and the
RealTek 8211E PHY for the FTGMAC100 model to use.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Do not use the ring.h header installed on the system. Instead, import
the header into the QEMU codebase. This avoids problems when QEMU is
built against a Xen version too old to provide all the ring macros.
Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
CC: anthony.perard@citrix.com
CC: jgross@suse.com
Commit f0f272baf3a7 "xen: use libxendevice model to restrict operations"
added a command-line option (-xen-domid-restrict) to limit operations
using the libxendevicemodel API to a specified domid. The commit also
noted that the restriction would be extended to cover operations issued
via other xen libraries by subsequent patches.
My recent Xen patch [1] added a call to the xenforeignmemory API to allow
it to be restricted. This patch now makes use of that new call when the
-xen-domid-restrict option is passed.
[1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=5823d6eb
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
This patch adds a command-line option (-xen-domid-restrict) which will
use the new libxendevicemodel API to restrict devicemodel [1] operations
to the specified domid. (Such operations are not applicable to the xenpv
machine type).
This patch also adds a tracepoint to allow successful enabling of the
restriction to be monitored.
[1] I.e. operations issued by libxendevicemodel. Operation issued by other
xen libraries (e.g. libxenforeignmemory) are currently still unrestricted
but this will be rectified by subsequent patches.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Today qemu is using e.g. the value 480 for Xen version 4.8.0. As some
Xen version tests are using ">" relations this scheme will lead to
problems when Xen version 4.10.0 is being reached.
Instead of the 3 digit schem use a 5 digit scheme (e.g. 40800 for
version 4.8.0).
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
This patch modifies the wrapper functions in xen_common.h to use the
new xendevicemodel interface if it is available along with compatibility
code to use the old libxenctrl interface if it is not.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Anthony Perard <anthony.perard@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJY+d69AAoJEPSH7xhYctcj/4oQAIFFEyWaqrL9ve5ySiJgdtcY
zYtiIhZQ+nPuy2i1oDSX+vbMcmkJDDyfO5qLovxyHGkZHniR8HtxNHP+MkZQa07p
DiSIvd51HvcixIouhbGcoUCU63AYxqNL3o5/TyNpUI72nvsgwl3yfOot7PtutE/F
r384j8DrOJ9VwC5GGPg27mJvRPvyfDQWfxDCyMYVw153HTuwVYtgiu/layWojJDV
D2L1KV45ezBuGckZTHt9y6K4J5qz8qHb/dJc+whBBjj4j9T9XOILU9NPDAEuvjFZ
gHbrUyxj7EiApjHcDZoQm9Raez422ALU30yc9Kn7ik7vSqTxk2Ejq6Gz7y9MJrDn
KdMj75OETJNjBL+0T9MmbtWts28+aalpTUXtBpmi3eWQV5Hcox2NF1RP42jtD9Pa
lkrM6jv0nsdNfBPlQ+ZmBTJxysWECcMqy487nrzmPNC8vZfokjXL5be12puho9fh
ziU4gx9C6/k82S+/H6WD/AUtRiXJM7j4oTU2mnjrsSXQC1JNWqODBOFUo9zsDufl
vtcrxfPhSD1DwOInFSIBHf/RylcgTkPCL0rPoJ8npNDly6rHFYJ+oIbsn84Z4uYY
RWvH8xB9wgRlK9L1WdRgOd2q7PaeHQoSSdPOiS9YVEVMVvSW8Es5CRlhcAsw/M/T
1Tl65cNrjETAuZKL3dLH
=EsZ5
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20170421' into staging
migration/next for 20170421
# gpg: Signature made Fri 21 Apr 2017 11:28:13 BST
# gpg: using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg: aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723
* remotes/juanquintela/tags/migration/20170421: (65 commits)
hmp: info migrate_parameters format tunes
hmp: info migrate_capability format tunes
migration: rename max_size to threshold_size
migration: set current_active_state once
virtio-rng: stop virtqueue while the CPU is stopped
migration: don't close a file descriptor while it can be in use
ram: Remove migration_bitmap_extend()
migration: Disable hotplug/unplug during migration
qdev: Move qdev_unplug() to qdev-monitor.c
qdev: Export qdev_hot_removed
qdev: qdev_hotplug is really a bool
migration: Remove MigrationState parameter from migration_is_idle()
ram: Use RAMBitmap type for coherence
ram: rename last_ram_offset() last_ram_pages()
ram: Use ramblock and page offset instead of absolute offset
ram: Change offset field in PageSearchStatus to page
ram: Remember last_page instead of last_offset
ram: Use page number instead of an address for the bitmap operations
ram: reorganize last_sent_block
ram: ram_discard_range() don't use the mis parameter
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
- the new compat machine
- several cleanups and optimizations
- introspection for css ids
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJY+bZ5AAoJEN7Pa5PG8C+vsfgQAJvMTeJo9iyGYe8t/Xs5x+8J
SHIifYCM+d7XMJr5izYcRJ173DxliFekXTmRW+0soZqoLraZ2p8AL/+VWHZlWeN0
AcYFzVduYbR2gub/HAgE4ZS46pjbuD4vTmOsLLz3FBn3J9QAiS9kGJ/eL/kZuHmY
tIIXTjR/dQew3kknZiBZ9uD+Zy5ZeWOZUgYyAQwya3HBhqgOZ2ui2WJz8lUNDput
PgGf886Cb6eTl5I3kTnvHrsiwFfrWzjpD+5A1zwBMHZtJLfW5hG5fWtNJzTeihqX
N1sCUsTSdM7gspp0Xjkg8KpEPHccg0xe7cXGr9QCT9mXodppWnyvh+abowH5aNCd
Na1+VwrtWL7kCKQo6JErId8wn0Wbam+9ymvY68occCgrNDcnCAgtmttOQDlgqJFd
WyKMM9vnQkRpObZunUCJfH9wEMKx6OlRFBJVftdL40J/A8+DsVWNZ/qMMBkb5Qpg
fNkESE1PePlH8kJqkbzs/uyXVpP0vb7/2EEx8+0ilo9lj3iU3Kektp55cGMNRBRL
EwM0Ft2kTl3XTxcA8OAyHnF5FgKJpI6Cqh6PYo33uN1Od/SUv5INIG8VfUiFUE6E
8QLUck5ucpEvzFp0FRlU+ZAXE4LFgb3ZTLRE9yUljagy/EC2LL+O3qxnc9/iY4V/
NI07uSLZxb5Bpw8h9/9s
=QDBR
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20170421' into staging
The first batch of s390x changes for 2.10:
- the new compat machine
- several cleanups and optimizations
- introspection for css ids
# gpg: Signature made Fri 21 Apr 2017 08:36:25 BST
# gpg: using RSA key 0xDECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF
* remotes/cohuck/tags/s390x-20170421:
s390x: Drop useless casts
s390x: register I/O adapters per ISC during init
s390x/flic: cache flic in s390_get_flic
s390x: initialize flic before I/O subsystems
s390x: use enum for adapter type and standardize its naming
s390x/css: consolidate the devno property for ccw devices
s390x/css: provide introspection for virtual subchannel and device busid
s390x/css: introduce read-only property type for device ids
s390x/pci: make printf always compile in debug output
s390x/kvm: make printf always compile in debug output
s390x: introduce 2.10 compat machine
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In migration codes (especially in migration_thread()), max_size is used
in many place for the threshold value that we will start to do the final
flush and jump to the next stage to dump the whole rest things to
destination. However its name is confusing to first readers. Let's
rename it to "threshold_size" when proper and add a comment for it. No
functional change is made.
CC: Juan Quintela <quintela@redhat.com>
CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
If we modify the virtio-rng virqueue while the
vmstate is already migrated we can have some
inconsistencies between the virtqueue state and
the memory content.
To avoid this, stop the virtqueue while the CPU
is stopped.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Amit Shah <amit@kernel.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
We have disabled memory hotplug, so we don't need to handle
migration_bitamp there.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
I need to move qdev_unplug to qdev-monitor in the following patch, and
it needs access to this variable.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Only user don't have a MigrationState handly.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
We change the meaning of start to be the offset from the beggining of
the block.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
It was used as a size in all cases except one.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
We need to call for the migrate_get_current() in more that half of the
uses, so call that inside.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Treat it like the rest of ram stats counters. Export its value the
same way. As an added bonus, no more MigrationState used in
migration_bitmap_sync();
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Again, dave was the one reviewing it
It can be recalculated from dirty_pages_rate.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Dave was the one that reviewed it O:-)
This is a ram field that was inside MigrationState. Move it to
RAMState and make it the same that the other ram stats.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
This are the last postcopy fields still at MigrationState. Once there
Move MigrationSrcPageRequest to ram.c and remove MigrationState
parameters where appropiate.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
It was on MigrationState when it is only used inside ram.c for
postcopy. Problem is that we need to access it without being able to
pass it RAMState directly.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Its value can be calculated by other exported.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
For compatibility, we need to still send a value, but just specify it
and comment the fact.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
It reflects better what it does.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJY+QGLAAoJECgHk2+YTcWm9U0QAKaZOefXp73ZKJOtYuS+oF5w
L57dGuULzpxzOlQXug2aHNBRaaBPcNqGI/LvId6eWQBDHEqSWCmpfq5DXsM3qNCq
PCDbveo1gn582MGPgRKYHeN8wqcyt32mpd6TgjBPIhmcLwCR9meCtRvFMDykFuLK
+2typZrWLUqsMh0tRz+mVnbkIeBIN4um85Bqx9gknUHerc+jr2boIpzpvHHNlx4q
37JcIgXFnhz3sjMwuRyK/XihbeMiC1zK6XYxZLbn8JEL9tVBSAiv0PUguJwyjLVP
zftENpiXf4fqURZ/cia43Oy+MQDJEKhiw+oIikGDd1NqAmgCLHKYNQ6lZUL6x+ja
xuyh4sLZgA0o8pPJBuiPseFe3ZcOUSw+TAoTnOorOfYrc65meFJ2byc7STVbh/G9
hGTBfz19mvAtgMk5JsaJsB5MB12vCDizoTxgF35yOq4+jdOQUKGRLIYHt80Q37jl
/5JbWC0xVP1BsCYhsnpleA7V8d7/V1AUaxiVZYeOZiFV/ALzEFxlyQYmaDhjjQEe
FIdFea1eGrTX7P93uGoFDokIkXn9OxD+PvJxrCy076g+noFCERaX4y3MePjOlpm1
gj5A+JTVPvDgB0H53+sDT6i6cqLGqNJPPuIPtX5vQEGFkKIVHNj67NRZlxgWdFj9
lgE7/vHttYI2zUfgFZv+
=Em1p
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/ehabkost/tags/machine-pull-request' into staging
Machine queue for 2.10
# gpg: Signature made Thu 20 Apr 2017 19:44:27 BST
# gpg: using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/machine-pull-request:
qdev: Constify local variable returned by blk_bs
qdev: Constify value passed to qdev_prop_set_macaddr
hostmem: use host_memory_backend_mr_inited() where proper
hostmem: introduce host_memory_backend_mr_inited()
hw/core/null-machine: Print error message when using the -kernel parameter
qdev: Make "hotplugged" property read-only
intel_iommu: enable remote IOTLB
intel_iommu: allow dynamic switch of IOMMU region
intel_iommu: provide its own replay() callback
intel_iommu: use the correct memory region for device IOTLB notification
memory: add MemoryRegionIOMMUOps.replay() callback
memory: introduce memory_region_notify_one()
memory: provide iommu_replay_all()
memory: provide IOMMU_NOTIFIER_FOREACH macro
memory: add section range info for IOMMU notifier
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The I/O adapters should exist as soon as the bus/infrastructure
exists, and not only when the guest is actually trying to do something
with them. While the lazy allocation was not wrong, allocating at init
time is cleaner, both for the architecture and the code. Let's adjust
this by having each device type (currently for PCI and virtio-ccw)
register the adapters for each ISC (as now we don't know which ISC the
guest will use) as soon as it initializes.
Use a two-dimensional array io_adapters[type][isc] to store adapters
in ChannelSubSys, so that we can conveniently get the adapter id by
the helper function css_get_adapter_id(type, isc).
Signed-off-by: Fei Li <sherrylf@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Let's use an enum for io adapter type, and standardize its naming to
CSS_IO_ADAPTER_* by changing S390_PCIPT_ADAPTER to CSS_IO_ADAPTER_PCI.
Signed-off-by: Fei Li <sherrylf@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Let's introduce a read-only property type that handles device ids of the
CssDevId type used for channel devices for future use. e.g. exposing the
busid of an I/O subchannel that is assigned to a ccw device.
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
As all users have been removed, we can remove
cannot_destroy_with_object_finalize_yet field
from the DeviceClass structure.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170414083717.13641-5-lvivier@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The 'value' argument is not modified so this can be made const for code
safeness.
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Message-Id: <20170310200550.13313-2-krzk@kernel.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
We were checking this against memory region size of host memory
backend's mr field to see whether the mr has been inited. This is
efficient but less elegant. Let's make a helper for it to avoid
confusions, along with some notes.
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1489151370-15453-2-git-send-email-peterx@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This patch is based on Aviv Ben-David (<bd.aviv@gmail.com>)'s patch
upstream:
"IOMMU: enable intel_iommu map and unmap notifiers"
https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg01453.html
However I removed/fixed some content, and added my own codes.
Instead of translate() every page for iotlb invalidations (which is
slower), we walk the pages when needed and notify in a hook function.
This patch enables vfio devices for VT-d emulation.
And, since we already have vhost DMAR support via device-iotlb, a
natural benefit that this patch brings is that vt-d enabled vhost can
live even without ATS capability now. Though more tests are needed.
Signed-off-by: Aviv Ben-David <bdaviv@cs.technion.ac.il>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-10-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This is preparation work to finally enabled dynamic switching ON/OFF for
VT-d protection. The old VT-d codes is using static IOMMU address space,
and that won't satisfy vfio-pci device listeners.
Let me explain.
vfio-pci devices depend on the memory region listener and IOMMU replay
mechanism to make sure the device mapping is coherent with the guest
even if there are domain switches. And there are two kinds of domain
switches:
(1) switch from domain A -> B
(2) switch from domain A -> no domain (e.g., turn DMAR off)
Case (1) is handled by the context entry invalidation handling by the
VT-d replay logic. What the replay function should do here is to replay
the existing page mappings in domain B.
However for case (2), we don't want to replay any domain mappings - we
just need the default GPA->HPA mappings (the address_space_memory
mapping). And this patch helps on case (2) to build up the mapping
automatically by leveraging the vfio-pci memory listeners.
Another important thing that this patch does is to seperate
IR (Interrupt Remapping) from DMAR (DMA Remapping). IR region should not
depend on the DMAR region (like before this patch). It should be a
standalone region, and it should be able to be activated without
DMAR (which is a common behavior of Linux kernel - by default it enables
IR while disabled DMAR).
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-9-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The default replay() don't work for VT-d since vt-d will have a huge
default memory region which covers address range 0-(2^64-1). This will
normally consumes a lot of time (which looks like a dead loop).
The solution is simple - we don't walk over all the regions. Instead, we
jump over the regions when we found that the page directories are empty.
It'll greatly reduce the time to walk the whole region.
To achieve this, we provided a page walk helper to do that, invoking
corresponding hook function when we found an page we are interested in.
vtd_page_walk_level() is the core logic for the page walking. It's
interface is designed to suite further use case, e.g., to invalidate a
range of addresses.
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-8-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Originally we have one memory_region_iommu_replay() function, which is
the default behavior to replay the translations of the whole IOMMU
region. However, on some platform like x86, we may want our own replay
logic for IOMMU regions. This patch adds one more hook for IOMMUOps for
the callback, and it'll override the default if set.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-6-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Generalizing the notify logic in memory_region_notify_iommu() into a
single function. This can be further used in customized replay()
functions for IOMMUs.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-5-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This is an "global" version of existing memory_region_iommu_replay() -
we announce the translations to all the registered notifiers, instead of
a specific one.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-4-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
A new macro is provided to iterate all the IOMMU notifiers hooked
under specific IOMMU memory region.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-3-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
In this patch, IOMMUNotifier.{start|end} are introduced to store section
information for a specific notifier. When notification occurs, we not
only check the notification type (MAP|UNMAP), but also check whether the
notified iova range overlaps with the range of specific IOMMU notifier,
and skip those notifiers if not in the listened range.
When removing an region, we need to make sure we removed the correct
VFIOGuestIOMMU by checking the IOMMUNotifier.start address as well.
This patch is solving the problem that vfio-pci devices receive
duplicated UNMAP notification on x86 platform when vIOMMU is there. The
issue is that x86 IOMMU has a (0, 2^64-1) IOMMU region, which is
splitted by the (0xfee00000, 0xfeefffff) IRQ region. AFAIK
this (splitted IOMMU region) is only happening on x86.
This patch also helps vhost to leverage the new interface as well, so
that vhost won't get duplicated cache flushes. In that sense, it's an
slight performance improvement.
Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-2-git-send-email-peterx@redhat.com>
[ehabkost: included extra vhost_iommu_region_del() change from Peter Xu]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
We already require gcc 4.1 or newer (for the atomic
support), so the fallback codepaths for older gcc
versions than that are now dead code and we can
just delete them.
NB: clang reports itself as gcc 4.2 (regardless of
clang version), so clang won't be using the fallbacks
either.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Expose the Cadence GEM revision as a property.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 541324373cf87b50f8be0439a0cb89f5028b016f.1491947224.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
During block job completion, nothing is preventing
block_job_defer_to_main_loop_bh from being called in a nested
aio_poll(), which is a trouble, such as in this code path:
qmp_block_commit
commit_active_start
bdrv_reopen
bdrv_reopen_multiple
bdrv_reopen_prepare
bdrv_flush
aio_poll
aio_bh_poll
aio_bh_call
block_job_defer_to_main_loop_bh
stream_complete
bdrv_reopen
block_job_defer_to_main_loop_bh is the last step of the stream job,
which should have been "paused" by the bdrv_drained_begin/end in
bdrv_reopen_multiple, but it is not done because it's in the form of a
main loop BH.
Similar to why block jobs should be paused between drained_begin and
drained_end, BHs they schedule must be excluded as well. To achieve
this, this patch forces draining the BH in BDRV_POLL_WHILE.
As a side effect this fixes a hang in block_job_detach_aio_context
during system_reset when a block job is ready:
#0 0x0000555555aa79f3 in bdrv_drain_recurse
#1 0x0000555555aa825d in bdrv_drained_begin
#2 0x0000555555aa8449 in bdrv_drain
#3 0x0000555555a9c356 in blk_drain
#4 0x0000555555aa3cfd in mirror_drain
#5 0x0000555555a66e11 in block_job_detach_aio_context
#6 0x0000555555a62f4d in bdrv_detach_aio_context
#7 0x0000555555a63116 in bdrv_set_aio_context
#8 0x0000555555a9d326 in blk_set_aio_context
#9 0x00005555557e38da in virtio_blk_data_plane_stop
#10 0x00005555559f9d5f in virtio_bus_stop_ioeventfd
#11 0x00005555559fa49b in virtio_bus_stop_ioeventfd
#12 0x00005555559f6a18 in virtio_pci_stop_ioeventfd
#13 0x00005555559f6a18 in virtio_pci_reset
#14 0x00005555559139a9 in qdev_reset_one
#15 0x0000555555916738 in qbus_walk_children
#16 0x0000555555913318 in qdev_walk_children
#17 0x0000555555916738 in qbus_walk_children
#18 0x00005555559168ca in qemu_devices_reset
#19 0x000055555581fcbb in pc_machine_reset
#20 0x00005555558a4d96 in qemu_system_reset
#21 0x000055555577157a in main_loop_should_exit
#22 0x000055555577157a in main_loop
#23 0x000055555577157a in main
The rationale is that the loop in block_job_detach_aio_context cannot
make any progress in pausing/completing the job, because bs->in_flight
is 0, so bdrv_drain doesn't process the block_job_defer_to_main_loop
BH. With this patch, it does.
Reported-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170418143044.12187-3-famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Tested-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
They start the coroutine on the specified context.
Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
It's a variant of qemu_coroutine_enter with an explicit AioContext
parameter.
Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Minor differences from:
Message-Id: <20170405132503.32125-1-alex.bennee@linaro.org>
- dropped new feature patches
- last minute typo fix from Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJY62CSAAoJEPvQ2wlanipEu/wH/jNzNMus/zaM/+gS3GWpHer2
/aLkKS/zNHZoYzvDE4JZx3sX4q6HyeRZ0Hu46jWs3WAECgHhjV4Rfn3btK+x/5r8
wtmC0DM59ULbE2e6NjDRdJAocdjU6j9Zu+c09/sfssBLRHCJOGyAH8BEbyhHcmlq
hUqTFvZAuLdko6CWfKjtFv+KQm+za9ypiLIncZZDhUi5vt2PIuUV6qSUyqs5EpwP
JyDlgDD8Rzohq62dWIXYTg5dV7tU6/g9vou7tEUoqhMVTHF1usA++j6yfIpGq3Z5
MGN/63Q9tdSX/Kzot9yrHKdsjQEm7k7/03LKT6BIvM1tk0hjzumHFGJBDFYytVc=
=/8U7
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/stsquad/tags/pull-mttcg-fixups-for-rc2-100417-1' into staging
Final icount and misc MTTCG fixes for 2.9
Minor differences from:
Message-Id: <20170405132503.32125-1-alex.bennee@linaro.org>
- dropped new feature patches
- last minute typo fix from Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
# gpg: Signature made Mon 10 Apr 2017 11:38:10 BST
# gpg: using RSA key 0xFBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>"
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* remotes/stsquad/tags/pull-mttcg-fixups-for-rc2-100417-1:
replay: assert time only goes forward
cpus: call cpu_update_icount on read
cpu-exec: update icount after each TB_EXIT
cpus: introduce cpu_update_icount helper
cpus: don't credit executed instructions before they have run
cpus: move icount preparation out of tcg_exec_cpu
cpus: check cpu->running in cpu_get_icount_raw()
cpus: remove icount handling from qemu_tcg_cpu_thread_fn
target/i386/misc_helper: wrap BQL around another IRQ generator
cpus: fix wrong define name
scripts/qemugdb/mtree.py: fix up mtree dump
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
By holding off updates to timer_state.qemu_icount we can run into
trouble when the non-vCPU thread needs to know the time. This helper
ensures we atomically update timers_state.qemu_icount based on what
has been currently executed.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Outside of the vCPU thread icount time will only be tracked against
timers_state.qemu_icount. We no longer credit cycles until they have
completed the run. Inside the vCPU thread we adjust for passage of
time by looking at how many have run so far. This is only valid inside
the vCPU thread while it is running.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>