mirrors/qemu

mirror of https://github.com/qemu/qemu.git synced 2024-11-30 23:33:51 +08:00

Author	SHA1	Message	Date
Peter Xu	698feb5e13	memory: add section range info for IOMMU notifier In this patch, IOMMUNotifier.{start\|end} are introduced to store section information for a specific notifier. When notification occurs, we not only check the notification type (MAP\|UNMAP), but also check whether the notified iova range overlaps with the range of specific IOMMU notifier, and skip those notifiers if not in the listened range. When removing an region, we need to make sure we removed the correct VFIOGuestIOMMU by checking the IOMMUNotifier.start address as well. This patch is solving the problem that vfio-pci devices receive duplicated UNMAP notification on x86 platform when vIOMMU is there. The issue is that x86 IOMMU has a (0, 2^64-1) IOMMU region, which is splitted by the (0xfee00000, 0xfeefffff) IRQ region. AFAIK this (splitted IOMMU region) is only happening on x86. This patch also helps vhost to leverage the new interface as well, so that vhost won't get duplicated cache flushes. In that sense, it's an slight performance improvement. Suggested-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Eric Auger <eric.auger@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <1491562755-23867-2-git-send-email-peterx@redhat.com> [ehabkost: included extra vhost_iommu_region_del() change from Peter Xu] Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-04-20 15:22:41 -03:00
Peter Maydell	da92ada855	target-arm queue: * implement M profile exception return properly * cadence GEM: fix multiqueue handling bugs * pxa2xx.c: QOMify a device * arm/kvm: Remove trailing newlines from error_report() * stellaris: Don't hw_error() on bad register accesses * Add assertion about FSC format for syndrome registers * Move excnames[] array into arm_log_exceptions() * exynos: minor code cleanups * hw/arm/boot: take Linux/arm64 TEXT_OFFSET header field into account * Fix APSR writes via M profile MSR -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABCAAGBQJY+ORHAAoJEDwlJe0UNgzeCAsP/isQdHG5XkvKXNQmg8LC6GYG 0tvHjgOiQMyuJBEpOKEpmFTWXMqfd04CDahEl/3MqUPnG3p/ILwjjNvgbcGACH11 VyvOCd3XwwbdQkYAJNyTHZu+8/Ila+FzjlJcW95MAvb/wxnzVyyBws9mvBletfVn th0qVPgIgIxPYjae0Y714/k6UrTWPaUvSGEbjXvvsSqjwwHUYs7TJKuyDEQzrShG +RsDuo4Qjx2YOATg1coKdY2nFDzpHn/my+RYSGqhYhNpEd12dlkLJ2cZKESU4RcI 2GqsW4sfqDBoGIOejuDVU3ZI4wP2wvzOlXVXsm117b9dndMGoMtEcxIjWD9rdtsk ZXOTHjHWB+6SlUbzmJpt+aCuW2QkG3jMS3RKGChwNZCHNjx28AthZLw+6rqjPR3e DkAAsk9mYpsbAWYD9/B0kyjkFhJlrNbvr+NbxrT19LjpztGvHYTJqDgbK1cneVF2 DB7IubVlLx/ILVkH8TEYqXG6COWTspLa8kW14DmZi1ts7ns3UGcTghdBbVIXNTaE C4KrpFSgt2hAGeeXwJs704pCug9Rkpha2MM2//5Udk3YDLtFwzrTWPv6nq5+68HO T3hhXZ7OfPRVrc88M8u46KjlzdDJTFKQbrmvL2ecyPaodEAbA3+cdbqmMbmWh0Qd Fl8q4dFYPLii4xfOUkoH =Pipo -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20170420' into staging target-arm queue: * implement M profile exception return properly * cadence GEM: fix multiqueue handling bugs * pxa2xx.c: QOMify a device * arm/kvm: Remove trailing newlines from error_report() * stellaris: Don't hw_error() on bad register accesses * Add assertion about FSC format for syndrome registers * Move excnames[] array into arm_log_exceptions() * exynos: minor code cleanups * hw/arm/boot: take Linux/arm64 TEXT_OFFSET header field into account * Fix APSR writes via M profile MSR # gpg: Signature made Thu 20 Apr 2017 17:39:35 BST # gpg: using RSA key 0x3C2525ED14360CDE # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" # gpg: aka "Peter Maydell <pmaydell@gmail.com>" # gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" # Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE * remotes/pmaydell/tags/pull-target-arm-20170420: (24 commits) arm: Remove workarounds for old M-profile exception return implementation arm: Implement M profile exception return properly arm: Track M profile handler mode state in TB flags arm: Abstract out "are we singlestepping" test to utility function arm: Move condition-failed codepath generation out of if() arm: Move gen_set_condexec() and gen_set_pc_im() up in the file arm: Factor out "generate right kind of step exception" arm: Thumb shift operations should not permit interworking branches arm: Don't implement BXJ on M-profile CPUs xlnx-zynqmp: Set the Cadence GEM revision cadence_gem: Make the revision a property cadence_gem: Correct the interupt logic cadence_gem: Correct the multi-queue can rx logic cadence_gem: Read the correct queue descriptor hw/arm: Qomify pxa2xx.c arm/kvm: Remove trailing newlines from error_report() stellaris: Don't hw_error() on bad register accesses target/arm: Add assertion about FSC format for syndrome registers arm: Move excnames[] array into arm_log_exceptions() target/arm: Add missing entries to excnames[] for log strings ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-20 17:41:34 +01:00
Peter Maydell	f4e8e4edda	arm: Remove workarounds for old M-profile exception return implementation Now that we've rewritten M-profile exception return so that the magic PC values are not visible to other parts of QEMU, we can delete the special casing of them elsewhere. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-id: 1491844419-12485-10-git-send-email-peter.maydell@linaro.org	2017-04-20 17:39:17 +01:00
Peter Maydell	3bb8a96f53	arm: Implement M profile exception return properly On M profile, return from exceptions happen when code in Handler mode executes one of the following function call return instructions: * POP or LDM which loads the PC * LDR to PC * BX register and the new PC value is 0xFFxxxxxx. QEMU tries to implement this by not treating the instruction specially but then catching the attempt to execute from the magic address value. This is not ideal, because: * there are guest visible differences from the architecturally specified behaviour (for instance jumping to 0xFFxxxxxx via a different instruction should not cause an exception return but it will in the QEMU implementation) * we have to account for it in various places (like refusing to take an interrupt if the PC is at a magic value, and making sure that the MPU doesn't deny execution at the magic value addresses) Drop these hacks, and instead implement exception return the way the architecture specifies -- by having the relevant instructions check for the magic value and raise the 'do an exception return' QEMU internal exception immediately. The effect on the generated code is minor: bx lr, old code (and new code for Thread mode): TCG: mov_i32 tmp5,r14 movi_i32 tmp6,$0xfffffffffffffffe and_i32 pc,tmp5,tmp6 movi_i32 tmp6,$0x1 and_i32 tmp5,tmp5,tmp6 st_i32 tmp5,env,$0x218 exit_tb $0x0 set_label $L0 exit_tb $0x7f2aabd61993 x86_64 generated code: 0x7f2aabe87019: mov %ebx,%ebp 0x7f2aabe8701b: and $0xfffffffffffffffe,%ebp 0x7f2aabe8701e: mov %ebp,0x3c(%r14) 0x7f2aabe87022: and $0x1,%ebx 0x7f2aabe87025: mov %ebx,0x218(%r14) 0x7f2aabe8702c: xor %eax,%eax 0x7f2aabe8702e: jmpq 0x7f2aabe7c016 bx lr, new code when in Handler mode: TCG: mov_i32 tmp5,r14 movi_i32 tmp6,$0xfffffffffffffffe and_i32 pc,tmp5,tmp6 movi_i32 tmp6,$0x1 and_i32 tmp5,tmp5,tmp6 st_i32 tmp5,env,$0x218 movi_i32 tmp5,$0xffffffffff000000 brcond_i32 pc,tmp5,geu,$L1 exit_tb $0x0 set_label $L1 movi_i32 tmp5,$0x8 call exception_internal,$0x0,$0,env,tmp5 x86_64 generated code: 0x7fe8fa1264e3: mov %ebp,%ebx 0x7fe8fa1264e5: and $0xfffffffffffffffe,%ebx 0x7fe8fa1264e8: mov %ebx,0x3c(%r14) 0x7fe8fa1264ec: and $0x1,%ebp 0x7fe8fa1264ef: mov %ebp,0x218(%r14) 0x7fe8fa1264f6: cmp $0xff000000,%ebx 0x7fe8fa1264fc: jae 0x7fe8fa126509 0x7fe8fa126502: xor %eax,%eax 0x7fe8fa126504: jmpq 0x7fe8fa122016 0x7fe8fa126509: mov %r14,%rdi 0x7fe8fa12650c: mov $0x8,%esi 0x7fe8fa126511: mov $0x56095dbeccf5,%r10 0x7fe8fa12651b: callq *%r10 which is a difference of one cmp/branch-not-taken. This will be lost in the noise of having to exit generated code and look up the next TB anyway. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 1491844419-12485-9-git-send-email-peter.maydell@linaro.org	2017-04-20 17:39:17 +01:00
Peter Maydell	064c379c99	arm: Track M profile handler mode state in TB flags For M profile exception-return handling we'd like to generate different code for some instructions depending on whether we are in Handler mode or Thread mode. This isn't the same as "are we privileged or user", so we need an extra bit in the TB flags to distinguish. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 1491844419-12485-8-git-send-email-peter.maydell@linaro.org	2017-04-20 17:39:17 +01:00
Peter Maydell	b636649f5a	arm: Abstract out "are we singlestepping" test to utility function We now test for "are we singlestepping" in several places and it's not a trivial check because we need to care about both architectural singlestep and QEMU gdbstub singlestep. We're also about to add another place that needs to make this check, so pull the condition out into a function. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 1491844419-12485-7-git-send-email-peter.maydell@linaro.org	2017-04-20 17:39:17 +01:00
Peter Maydell	f021b2c462	arm: Move condition-failed codepath generation out of if() Move the code to generate the "condition failed" instruction codepath out of the if (singlestepping) {} else {}. This will allow adding support for handling a new is_jmp type which can't be neatly split into "singlestepping case" versus "not singlestepping case". Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-id: 1491844419-12485-6-git-send-email-peter.maydell@linaro.org	2017-04-20 17:39:17 +01:00
Peter Maydell	4d5e8c969a	arm: Move gen_set_condexec() and gen_set_pc_im() up in the file Move the utility routines gen_set_condexec() and gen_set_pc_im() up in the file, as we will want to use them from a function placed earlier in the file than their current location. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-id: 1491844419-12485-5-git-send-email-peter.maydell@linaro.org	2017-04-20 17:39:17 +01:00
Peter Maydell	5425415ebb	arm: Factor out "generate right kind of step exception" We currently have two places that do: if (dc->ss_active) { gen_step_complete_exception(dc); } else { gen_exception_internal(EXCP_DEBUG); } Factor this out into its own function, as we're about to add a third place that needs the same logic. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-id: 1491844419-12485-4-git-send-email-peter.maydell@linaro.org	2017-04-20 17:39:17 +01:00
Peter Maydell	bedb8a6b09	arm: Thumb shift operations should not permit interworking branches In Thumb mode, the only instructions which can cause an interworking branch by writing the PC are BLX, BX, BXJ, LDR, POP and LDM. Unlike ARM mode, data processing instructions which target the PC do not cause interworking branches. When we added support for doing interworking branches on writes to PC from data processing instructions in commit `21aeb3430c`, we accidentally changed a Thumb instruction to have interworking branch behaviour for writes to PC. (MOV, MOVS register-shifted register, encoding T2; this is the standard encoding for LSL/LSR/ASR/ROR (register).) For this encoding, behaviour with Rd == R15 is specified as UNPREDICTABLE, so allowing an interworking branch is within spec, but it's confusing and differs from our handling of this class of UNPREDICTABLE for other Thumb ALU operations. Make it perform a simple (non-interworking) branch like the others. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 1491844419-12485-3-git-send-email-peter.maydell@linaro.org	2017-04-20 17:39:17 +01:00
Peter Maydell	9d7c59c84d	arm: Don't implement BXJ on M-profile CPUs For M-profile CPUs, the BXJ instruction does not exist at all, and the encoding should always UNDEF. We were accidentally implementing it to behave like A-profile BXJ; correct the error. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Message-id: 1491844419-12485-2-git-send-email-peter.maydell@linaro.org	2017-04-20 17:39:17 +01:00
Alistair Francis	20bff21307	xlnx-zynqmp: Set the Cadence GEM revision Signed-off-by: Alistair Francis <alistair.francis@xilinx.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 026dbe01a1d42619eee30ce3f2079741bf04bc73.1491947224.git.alistair.francis@xilinx.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-20 17:39:17 +01:00
Alistair Francis	a5517666b2	cadence_gem: Make the revision a property Expose the Cadence GEM revision as a property. Signed-off-by: Alistair Francis <alistair.francis@xilinx.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 541324373cf87b50f8be0439a0cb89f5028b016f.1491947224.git.alistair.francis@xilinx.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-20 17:39:17 +01:00
Alistair Francis	596b6f51b7	cadence_gem: Correct the interupt logic This patch fixes two mistakes in the interrupt logic. First we only trigger single-queue or multi-queue interrupts if the status register is set. This logic was already used for non multi-queue interrupts but it also applies to multi-queue interrupts. Secondly we need to lower the interrupts if the ISR isn't set. As part of this we can remove the other interrupt lowering logic and consolidate it inside gem_update_int_status(). Signed-off-by: Alistair Francis <alistair.francis@xilinx.com> Message-id: 438bcc014f8f8a2f8f68f322cb6a53f4c04688c2.1491947224.git.alistair.francis@xilinx.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-20 17:39:17 +01:00
Alistair Francis	dacc0566ac	cadence_gem: Correct the multi-queue can rx logic Correct the buffer descriptor busy logic to work correctly when using multiple queues. Signed-off-by: Alistair Francis <alistair.francis@xilinx.com> Message-id: 8a7e8059984e27d46a276a66299d035a0afd280f.1491947224.git.alistair.francis@xilinx.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-20 17:39:17 +01:00
Alistair Francis	75b7760212	cadence_gem: Read the correct queue descriptor Read the correct descriptor instead of hardcoding the first (q=0). Signed-off-by: Alistair Francis <alistair.francis@xilinx.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 988b183dcf951856d8b3379f7e911ec95233bbf4.1491947224.git.alistair.francis@xilinx.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-20 17:39:17 +01:00
Suramya Shah	0493a139c9	hw/arm: Qomify pxa2xx.c Signed-off-by: Suramya Shah <shah.suramya@gmail.com> Message-id: 20170415180316.2694-1-shah.suramya@gmail.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-20 17:39:17 +01:00
Ishani Chugh	dffc58519e	arm/kvm: Remove trailing newlines from error_report() Signed-off-by: Ishani Chugh <chugh.ishani@research.iiit.ac.in> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1491629987-6826-1-git-send-email-chugh.ishani@research.iiit.ac.in Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-20 17:39:17 +01:00
Peter Maydell	df3692e04b	stellaris: Don't hw_error() on bad register accesses Current recommended style is to log a guest error on bad register accesses, not kill the whole system with hw_error(). Change the hw_error() calls to log as LOG_GUEST_ERROR or LOG_UNIMP or use g_assert_not_reached() as appropriate. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 1491486314-25823-1-git-send-email-peter.maydell@linaro.org	2017-04-20 17:39:17 +01:00
Peter Maydell	65ed2ed90d	target/arm: Add assertion about FSC format for syndrome registers In tlb_fill() we construct a syndrome register value from a fault status register value which is filled in by arm_tlb_fill(). arm_tlb_fill() returns FSR values which might be in the format used with short-format page descriptors, or the format used with long-format (LPAE) descriptors. The syndrome register always uses LPAE-format FSR status codes. It isn't actually possible to end up delivering a syndrome register value to the guest for a fault which is reported with a short-format FSR (that kind of stage 1 fault will only happen for an AArch32 translation regime which doesn't have a syndrome register, and can never be redirected to an AArch64 or Hyp exception level). Add an assertion which checks this, and adjust the code so that we construct a syndrome with an invalid status code, rather than allowing set bits in the FSR input to randomly corrupt other fields in the syndrome. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Message-id: 1491486152-24304-1-git-send-email-peter.maydell@linaro.org	2017-04-20 17:39:17 +01:00
Peter Maydell	2c4a7cc5af	arm: Move excnames[] array into arm_log_exceptions() The excnames[] array is defined in internals.h because we used to use it from two different source files for handling logging of AArch32 and AArch64 exception entry. Refactoring means that it's now used only in arm_log_exception() in helper.c, so move the array into that function. Suggested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 1491821097-5647-1-git-send-email-peter.maydell@linaro.org	2017-04-20 17:39:17 +01:00
Peter Maydell	32b81e620e	target/arm: Add missing entries to excnames[] for log strings Recent changes have added new EXCP_ values to ARM but forgot to update the excnames[] array which is used to provide human-readable strings when printing information about the exception for debug logging. Add the missing entries, and add a comment to the list of #defines to help avoid the mistake being repeated in future. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Message-id: 1491486340-25988-1-git-send-email-peter.maydell@linaro.org	2017-04-20 17:39:17 +01:00
Krzysztof Kozlowski	885f271056	hw/misc/exynos4210_pmu: Reorder local variables for readability Short declaration of 'i' was in the middle of declarations with assignments. Make it a little bit more readable. Additionally switch from "unsigned" to "unsigned int" as this pattern is more widely used. No functional change. Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20170313184750.429-4-krzk@kernel.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-20 17:39:17 +01:00
Krzysztof Kozlowski	75c6d92e4c	hw/char/exynos4210_uart: Constify static array and few arguments The static array exynos4210_uart_regs with register values is not modified so it can be made const. Few other functions accept driver or uart state as an argument but they do not change it and do not cast it so this can be made const for code safeness. Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Message-id: 20170313184750.429-3-krzk@kernel.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-20 17:39:17 +01:00
Krzysztof Kozlowski	f2ad5140fa	hw/arm/exynos: Convert fprintf to qemu_log_mask/error_report qemu_log_mask() and error_report() are preferred over fprintf() for logging errors. Also remove square brackets [] and additional new line characters in printed messages. Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20170313184750.429-2-krzk@kernel.org [PMM: wrapped long line] Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-20 17:39:17 +01:00
Ard Biesheuvel	68115ed5fc	hw/arm/boot: take Linux/arm64 TEXT_OFFSET header field into account The arm64 boot protocol stipulates that the kernel must be loaded TEXT_OFFSET bytes beyond a 2 MB aligned base address, where TEXT_OFFSET could be any 4 KB multiple between 0 and 2 MB, and whose value can be found in the header of the Image file. So after attempts to load the arm64 kernel image as an ELF file or as a U-Boot image have failed (both of which have their own way of specifying the load offset), try to determine the TEXT_OFFSET from the image after loading it but before mapping it as a ROM mapping into the guest address space. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1489414630-21609-1-git-send-email-ard.biesheuvel@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-20 17:39:17 +01:00
Peter Maydell	64c8ed97cc	Open 2.10 development tree Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-20 15:42:31 +01:00
Peter Maydell	359c41abe3	Update version for v2.9.0 release Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-20 15:31:34 +01:00
Peter Maydell	ca55019dac	Update version for v2.9.0-rc5 release Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-18 17:13:50 +01:00
Peter Maydell	d1263f8f18	-----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEtBAABCAAXBQJY9imYEBxmYW16QHJlZGhhdC5jb20ACgkQyjViTGqRccYulQf/ aMygBnFxgH/YJ9TkoSlXmss1M+nyjXsckJ4GD5hecSmDefEDvlbhzMANe4b/Rcma Q6ibxvgnxUEKGSsQx71UzDo64vLPZxsyLkPjbp03ojkiq+1I8rVdBKJmsH8W3RPv cFL8FmfGJh+PjhrPXZ50fwcQclCLzXkeP4Nlxi0PgyvJAzzpNFRP+DxLgFqOfJCf rB8ziO8XXdSbaLO3e5GwWqJiLvEZ9l91Gt+S5MW4yB7zjjjd/ODrFcslExL7fLEX skXjCBIGUNlRVhYKLdRoK8Q+zQhn+72wq1HTd5RRvJp7qXocqm8xNIiIBaUwobhd LeT97Cmr8gOGIwr5Q4HNVw== =6F+u -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/famz/tags/block-pull-request' into staging # gpg: Signature made Tue 18 Apr 2017 15:58:32 BST # gpg: using RSA key 0xCA35624C6A9171C6 # gpg: Good signature from "Fam Zheng <famz@redhat.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6 * remotes/famz/tags/block-pull-request: block: Drain BH in bdrv_drained_begin block: Walk bs->children carefully in bdrv_drain_recurse Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-18 16:18:15 +01:00
Fam Zheng	91af091f92	block: Drain BH in bdrv_drained_begin During block job completion, nothing is preventing block_job_defer_to_main_loop_bh from being called in a nested aio_poll(), which is a trouble, such as in this code path: qmp_block_commit commit_active_start bdrv_reopen bdrv_reopen_multiple bdrv_reopen_prepare bdrv_flush aio_poll aio_bh_poll aio_bh_call block_job_defer_to_main_loop_bh stream_complete bdrv_reopen block_job_defer_to_main_loop_bh is the last step of the stream job, which should have been "paused" by the bdrv_drained_begin/end in bdrv_reopen_multiple, but it is not done because it's in the form of a main loop BH. Similar to why block jobs should be paused between drained_begin and drained_end, BHs they schedule must be excluded as well. To achieve this, this patch forces draining the BH in BDRV_POLL_WHILE. As a side effect this fixes a hang in block_job_detach_aio_context during system_reset when a block job is ready: #0 0x0000555555aa79f3 in bdrv_drain_recurse #1 0x0000555555aa825d in bdrv_drained_begin #2 0x0000555555aa8449 in bdrv_drain #3 0x0000555555a9c356 in blk_drain #4 0x0000555555aa3cfd in mirror_drain #5 0x0000555555a66e11 in block_job_detach_aio_context #6 0x0000555555a62f4d in bdrv_detach_aio_context #7 0x0000555555a63116 in bdrv_set_aio_context #8 0x0000555555a9d326 in blk_set_aio_context #9 0x00005555557e38da in virtio_blk_data_plane_stop #10 0x00005555559f9d5f in virtio_bus_stop_ioeventfd #11 0x00005555559fa49b in virtio_bus_stop_ioeventfd #12 0x00005555559f6a18 in virtio_pci_stop_ioeventfd #13 0x00005555559f6a18 in virtio_pci_reset #14 0x00005555559139a9 in qdev_reset_one #15 0x0000555555916738 in qbus_walk_children #16 0x0000555555913318 in qdev_walk_children #17 0x0000555555916738 in qbus_walk_children #18 0x00005555559168ca in qemu_devices_reset #19 0x000055555581fcbb in pc_machine_reset #20 0x00005555558a4d96 in qemu_system_reset #21 0x000055555577157a in main_loop_should_exit #22 0x000055555577157a in main_loop #23 0x000055555577157a in main The rationale is that the loop in block_job_detach_aio_context cannot make any progress in pausing/completing the job, because bs->in_flight is 0, so bdrv_drain doesn't process the block_job_defer_to_main_loop BH. With this patch, it does. Reported-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170418143044.12187-3-famz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Tested-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-04-18 22:56:28 +08:00
Fam Zheng	178bd438af	block: Walk bs->children carefully in bdrv_drain_recurse The recursive bdrv_drain_recurse may run a block job completion BH that drops nodes. The coming changes will make that more likely and use-after-free would happen without this patch Stash the bs pointer and use bdrv_ref/bdrv_unref in addition to QLIST_FOREACH_SAFE to prevent such a case from happening. Since bdrv_unref accesses global state that is not protected by the AioContext lock, we cannot use bdrv_ref/bdrv_unref unconditionally. Fortunately the protection is not needed in IOThread because only main loop can modify a graph with the AioContext lock held. Signed-off-by: Fam Zheng <famz@redhat.com> Message-Id: <20170418143044.12187-2-famz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Tested-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-04-18 22:56:28 +08:00
Greg Kurz	9c6b899f7a	9pfs: local: set the path of the export root to "." The local backend was recently converted to using "at()" syscalls in order to ensure all accesses happen below the shared directory. This requires that we only pass relative paths, otherwise the dirfd argument to the "at()" syscalls is ignored and the path is treated as an absolute path in the host. This is actually the case for paths in all fids, with the notable exception of the root fid, whose path is "/". This causes the following backend ops to act on the "/" directory of the host instead of the virtfs shared directory when the export root is involved: - lstat - chmod - chown - utimensat ie, chmod /9p_mount_point in the guest will be converted to chmod / in the host for example. This could cause security issues with a privileged QEMU. All "*at()" syscalls are being passed an open file descriptor. In the case of the export root, this file descriptor points to the path in the host that was passed to -fsdev. The fix is thus as simple as changing the path of the export root fid to be "." instead of "/". This is CVE-2017-7471. Cc: qemu-stable@nongnu.org Reported-by: Léo Gaspard <leo@gaspard.io> Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-18 14:01:43 +01:00
Peter Maydell	372b3fe0b2	Update version for v2.9.0-rc4 release Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-11 17:18:03 +01:00
Max Reitz	e3e0003a8f	block/io: Comment out permission assertions In case of block migration, there may be writes to BlockBackends that do not have the write permission taken. Before this issue is fixed (which is not going to happen in 2.9), we therefore cannot assert that this is the case. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Tested-by: Kevin Wolf <kwolf@redhat.com> Message-id: 20170411145050.31290-1-mreitz@redhat.com Tested-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-11 16:09:31 +01:00
Kevin Wolf	5eceb01adf	sheepdog: Fix crash in co_read_response() This fixes a regression introduced in commit `9d456654`. aio_co_wake() can only be used to reenter a coroutine that was already previously entered, otherwise co->ctx is uninitialised and we access garbage. Using it immediately after qemu_coroutine_create() like in co_read_response() is wrong and causes segfaults. Replace the call with aio_co_enter(), which gets an explicit AioContext parameter and works even for new coroutines. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Tested-by: Kashyap Chamarthy <kchamart@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 1491919733-21065-1-git-send-email-kwolf@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-11 16:08:29 +01:00
Peter Maydell	f5ac5cfeb6	Block patches for 2.9.0-rc4 -----BEGIN PGP SIGNATURE----- iQFGBAABCAAwFiEEkb62CjDbPohX0Rgp9AfbAGHVz0AFAljs3LcSHG1yZWl0ekBy ZWRoYXQuY29tAAoJEPQH2wBh1c9Ap5UIAL86fSzqRhpFE+327nah6krBvHaRtXRR vY+EuYEOAUotmxrBRqdXS6FB+yOHOCkSw3D3MDFoRoGIGFKEVB7i6WmS3esvsziO ziuoECU1d8R0HdfT4vqOwaTHqmkUNmwvjXvNN3myj0xpq2YNBL+vsuw7/Jb5bOPy j/AxCPgW8+hNjTT/OtAqQL15zSL46Xl5xH2SyksNUog2tOhw3j2C8VMBASP+apY1 9wL/H8DcN7xExRQCeDCPZlGjPGfybxXqpbHC2blTt2PrPyRRwYmbM7mPAl+ilamG O5qThEKZUzU5VlWWB23C3jZksIXlOoeqvCGLabV775i9bRXtynC/9yA= =yblq -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2017-04-11' into staging Block patches for 2.9.0-rc4 # gpg: Signature made Tue 11 Apr 2017 14:40:07 BST # gpg: using RSA key 0xF407DB0061D5CF40 # gpg: Good signature from "Max Reitz <mreitz@redhat.com>" # Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40 * remotes/maxreitz/tags/pull-block-2017-04-11: iscsi: Fix iscsi_create throttle: Remove block from group on hot-unplug block: pass the right options for BlockDriver.bdrv_open() Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-11 14:53:33 +01:00
Fam Zheng	2ec9a782d1	iscsi: Fix iscsi_create Since `d5895fcb` (iscsi: Split URL into individual options), creating qcow2 image on an iscsi LUN fails: qemu-img create -f qcow2 iscsi://$SERVER/$IQN/0 1G qemu-img: iscsi://$SERVER/$IQN/0: Could not create image: Invalid argument The problem is iscsi_open now expects that transport_name, portal and target are already parsed into structured options by iscsi_parse_filename, but it is not called in iscsi_create. Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 20170410075451.21329-1-famz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> [mreitz: Dropped now superfluous qdict_put(bs_options, "filename", ...)] Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-04-11 15:33:00 +02:00
Eric Blake	1606e4cf8a	throttle: Remove block from group on hot-unplug When a block device that is part of a throttle group is hot-unplugged, we forgot to remove it from the throttle group. This leaves stale memory around, and causes an easily reproducible crash: $ ./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio \ -device virtio-scsi-pci,bus=pci.0 -drive \ id=drive_image2,if=none,format=raw,file=file2,bps=512000,iops=100,group=foo \ -device scsi-hd,id=image2,drive=drive_image2 -drive \ id=drive_image3,if=none,format=raw,file=file3,bps=512000,iops=100,group=foo \ -device scsi-hd,id=image3,drive=drive_image3 {'execute':'qmp_capabilities'} {'execute':'device_del','arguments':{'id':'image3'}} {'execute':'system_reset'} Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1428810 Suggested-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-id: 20170406190847.29347-1-eblake@redhat.com Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-04-11 15:33:00 +02:00
Dong Jia Shi	7a9e51198c	block: pass the right options for BlockDriver.bdrv_open() raw_open() expects the caller always passing in the right actual @options parameter. But when trying to applying snapshot on a RBD image, bdrv_snapshot_goto() calls raw_open() (by calling the bdrv_open callback on the BlockDriver) with a NULL @options, and that will result in a Segmentation fault. For the other non-raw format drivers, it also makes sense to passing in the actual options, althought they don't trigger the problem so far. Let's prepare a @options by adding the "file" key-value pair to a copy of the actual options that were given for the node (i.e. bs->options), and pass it to the callback. BlockDriver.bdrv_open() expects bs->file to be NULL and just overwrites it with the result from bdrv_open_child(). That means we should actually make sure it's NULL because otherwise the child BDS will have a reference count that is 1 too high. So we unconditionally invoke bdrv_unref_child() before calling BlockDriver.bdrv_open(), and we wrap everything in bdrv_ref()/bdrv_unref() so the BDS isn't deleted in the meantime. Suggested-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Message-id: 20170405091909.36357-2-bjsdjshi@linux.vnet.ibm.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-04-11 15:33:00 +02:00
Peter Maydell	aa388ddc36	-----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEtBAABCAAXBQJY7MfPEBxmYW16QHJlZGhhdC5jb20ACgkQyjViTGqRccbnCgf+ N6q6oHpcO6LsisYNPxpJ5p/Vnah/DKNgGhCnwCOHXkhv17KJHCIKqHo+07zcRJUk kO9Yc2g9PaWkgM61NsecnbTc4YE8G5SGrBCRj0o3f5Re8X3OnEi2Or8etaa9o/ZF qIkRk5SMYE2aQFRzn7Hw7LZ/9cta5DPI6+vCzs//S4mnx06p/068Ai21lAWC4Mxy 8dLbz+0GCXEdr9XRweoQKYTYH5HE+gLpRtzlD4QtbfbGN2qkUtR/DLnHTFCUqJ10 F+GBi440L4A7ZXzHIGKVWSH4wO67X65lnWhEFUhIyXHZqkuZzTwsEfIqR005sMub hViMFGYv6sEbf+A9sBz1Xw== =BCcn -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/famz/tags/block-pull-request' into staging # gpg: Signature made Tue 11 Apr 2017 13:10:55 BST # gpg: using RSA key 0xCA35624C6A9171C6 # gpg: Good signature from "Fam Zheng <famz@redhat.com>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6 * remotes/famz/tags/block-pull-request: sheepdog: Use bdrv_coroutine_enter before BDRV_POLL_WHILE block: Fix bdrv_co_flush early return block: Use bdrv_coroutine_enter to start I/O coroutines qemu-io-cmds: Use bdrv_coroutine_enter blockjob: Use bdrv_coroutine_enter to start coroutine block: Introduce bdrv_coroutine_enter async: Introduce aio_co_enter coroutine: Extract qemu_aio_coroutine_enter tests/block-job-txn: Don't start block job before adding to txn block: Quiesce old aio context during bdrv_set_aio_context block: Make bdrv_parent_drained_begin/end public Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-04-11 13:27:05 +01:00
Fam Zheng	76296dff97	sheepdog: Use bdrv_coroutine_enter before BDRV_POLL_WHILE When called from main thread, the coroutine should run in the context of bs. Use bdrv_coroutine_enter to ensure that. Signed-off-by: Fam Zheng <famz@redhat.com>	2017-04-11 20:07:15 +08:00
Fam Zheng	49ca625913	block: Fix bdrv_co_flush early return bdrv_inc_in_flight and bdrv_dec_in_flight are mandatory for BDRV_POLL_WHILE to work, even for the shortcut case where flush is unnecessary. Move the if block to below bdrv_dec_in_flight, and BTW fix the variable declaration position. Signed-off-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>	2017-04-11 20:07:15 +08:00
Fam Zheng	e92f0e1910	block: Use bdrv_coroutine_enter to start I/O coroutines BDRV_POLL_WHILE waits for the started I/O by releasing bs's ctx then polling the main context, which relies on the yielded coroutine continuing on bs->ctx before notifying qemu_aio_context with bdrv_wakeup(). Thus, using qemu_coroutine_enter to start I/O is wrong because if the coroutine is entered from main loop, co->ctx will be qemu_aio_context, as a result of the "release, poll, acquire" loop of BDRV_POLL_WHILE, race conditions happen when both main thread and the iothread access the same BDS: main loop iothread ----------------------------------------------------------------------- blockdev_snapshot aio_context_acquire(bs->ctx) virtio_scsi_data_plane_handle_cmd bdrv_drained_begin(bs->ctx) bdrv_flush(bs) bdrv_co_flush(bs) aio_context_acquire(bs->ctx).enter ... qemu_coroutine_yield(co) BDRV_POLL_WHILE() aio_context_release(bs->ctx) aio_context_acquire(bs->ctx).return ... aio_co_wake(co) aio_poll(qemu_aio_context) ... co_schedule_bh_cb() ... qemu_coroutine_enter(co) ... /* (A) bdrv_co_flush(bs) /* (B) I/O on bs / continues... / aio_context_release(bs->ctx) aio_context_acquire(bs->ctx) Note that in above case, bdrv_drained_begin() doesn't do the "release, poll, acquire" in BDRV_POLL_WHILE, because bs->in_flight == 0. Fix this by using bdrv_coroutine_enter and enter coroutine in the right context. iotests 109 output is updated because the coroutine reenter flow during mirror job complete is different (now through co_queue_wakeup, instead of the unconditional qemu_coroutine_switch before), making the end job len different. Signed-off-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2017-04-11 20:07:15 +08:00
Fam Zheng	324ec3e4f2	qemu-io-cmds: Use bdrv_coroutine_enter qemu_coroutine_create associates @co to qemu_aio_context but we poll blk's context below. If the coroutine yields, it may never get resumed again. Use bdrv_coroutine_enter to make sure we are starting the I/O on the right context. Signed-off-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2017-04-11 20:07:15 +08:00
Fam Zheng	aef4278c5a	blockjob: Use bdrv_coroutine_enter to start coroutine Resuming and especially starting of the block job coroutine, could be issued in the main thread. However the coroutine's "home" ctx should be set to the same context as job->blk. Use bdrv_coroutine_enter to ensure that. Signed-off-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2017-04-11 20:07:15 +08:00
Fam Zheng	052a75721f	block: Introduce bdrv_coroutine_enter Signed-off-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2017-04-11 20:07:15 +08:00
Fam Zheng	8865852e00	async: Introduce aio_co_enter They start the coroutine on the specified context. Signed-off-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2017-04-11 20:07:15 +08:00
Fam Zheng	ba9e75ceef	coroutine: Extract qemu_aio_coroutine_enter It's a variant of qemu_coroutine_enter with an explicit AioContext parameter. Signed-off-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2017-04-11 20:07:15 +08:00
Fam Zheng	d90fce9446	tests/block-job-txn: Don't start block job before adding to txn Previously, before test_block_job_start returns, the job can already complete, as a result, the transactional state of other jobs added to the same txn later cannot be handled correctly. Move the block_job_start() calls to callers after block_job_txn_add_job() calls. Signed-off-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2017-04-11 20:07:15 +08:00

1 2 3 4 5 ...

52584 Commits