linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-12-28 05:24:47 +08:00

Author	SHA1	Message	Date
Vitaly Kuznetsov	7cc80c9807	Drivers: hv: don't leak memory in vmbus_establish_gpadl() In some cases create_gpadl_header() allocates submessages but we never free them. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-08-31 13:05:41 +02:00
Vitaly Kuznetsov	4d63763296	Drivers: hv: get rid of redundant messagecount in create_gpadl_header() We use messagecount only once in vmbus_establish_gpadl() to check if it is safe to iterate through the submsglist. We can just initialize the list header in all cases in create_gpadl_header() instead. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-08-31 13:05:40 +02:00
Vitaly Kuznetsov	a9f61ca793	Drivers: hv: avoid vfree() on crash When we crash from NMI context (e.g. after NMI injection from host when 'sysctl -w kernel.unknown_nmi_panic=1' is set) we hit kernel BUG at mm/vmalloc.c:1530! as vfree() is denied. While the issue could be solved with in_nmi() check instead I opted for skipping vfree on all sorts of crashes to reduce the amount of work which can cause consequent crashes. We don't really need to free anything on crash. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-08-31 13:05:40 +02:00
Stephan Mueller	4b44f2d18a	random: add interrupt callback to VMBus IRQ handler The Hyper-V Linux Integration Services use the VMBus implementation for communication with the Hypervisor. VMBus registers its own interrupt handler that completely bypasses the common Linux interrupt handling. This implies that the interrupt entropy collector is not triggered. This patch adds the interrupt entropy collection callback into the VMBus interrupt handler function. Cc: stable@kernel.org Signed-off-by: Stephan Mueller <stephan.mueller@atsec.com> Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2016-06-13 11:54:33 -04:00
Vitaly Kuznetsov	d19a55d6ed	Drivers: hv: balloon: reset host_specified_ha_region We set host_specified_ha_region = true on certain request but this is a global state which stays 'true' forever. We need to reset it when we receive a request where ha_region is not specified. I did not see any real issues, the bug was found by code inspection. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-05-01 09:23:14 -07:00
Vitaly Kuznetsov	77c0c9735b	Drivers: hv: balloon: don't crash when memory is added in non-sorted order When we iterate through all HA regions in handle_pg_range() we have an assumption that all these regions are sorted in the list and the 'start_pfn >= has->end_pfn' check is enough to find the proper region. Unfortunately it's not the case with WS2016 where host can hot-add regions in a different order. We end up modifying the wrong HA region and crashing later on pages online. Modify the check to make sure we found the region we were searching for while iterating. Fix the same check in pfn_covered() as well. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-05-01 09:23:14 -07:00
Vitaly Kuznetsov	cd95aad557	Drivers: hv: vmbus: handle various crash scenarios Kdump keeps biting. Turns out CHANNELMSG_UNLOAD_RESPONSE is always delivered to the CPU which was used for initial contact or to CPU0 depending on host version. vmbus_wait_for_unload() doesn't account for the fact that in case we're crashing on some other CPU we won't get the CHANNELMSG_UNLOAD_RESPONSE message and our wait on the current CPU will never end. Do the following: 1) Check for completion_done() in the loop. In case interrupt handler is still alive we'll get the confirmation we need. 2) Read message pages for all CPUs message page as we're unsure where CHANNELMSG_UNLOAD_RESPONSE is going to be delivered to. We can race with still-alive interrupt handler doing the same, add cmpxchg() to vmbus_signal_eom() to not lose CHANNELMSG_UNLOAD_RESPONSE message. 3) Cleanup message pages on all CPUs. This is required (at least for the current CPU as we're clearing CPU0 messages now but we may want to bring up additional CPUs on crash) as new messages won't be delivered till we consume what's pending. On boot we'll place message pages somewhere else and we won't be able to read stale messages. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-05-01 09:23:14 -07:00
Vitaly Kuznetsov	4dbfc2e680	Drivers: hv: kvp: fix IP Failover Hyper-V VMs can be replicated to another hosts and there is a feature to set different IP for replicas, it is called 'Failover TCP/IP'. When such guest starts Hyper-V host sends it KVP_OP_SET_IP_INFO message as soon as we finish negotiation procedure. The problem is that it can happen (and it actually happens) before userspace daemon connects and we reply with HV_E_FAIL to the message. As there are no repetitions we fail to set the requested IP. Solve the issue by postponing our reply to the negotiation message till userspace daemon is connected. We can't wait too long as there is a host-side timeout (cca. 75 seconds) and if we fail to reply in this time frame the whole KVP service will become inactive. The solution is not ideal - if it takes userspace daemon more than 60 seconds to connect IP Failover will still fail but I don't see a solution with our current separation between kernel and userspace parts. Other two modules (VSS and FCOPY) don't require such delay, leave them untouched. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-05-01 09:23:14 -07:00
Jake Oshins	ea37a6b8a0	drivers:hv: Separate out frame buffer logic when picking MMIO range Simplify the logic that picks MMIO ranges by pulling out the logic related to trying to lay frame buffer claim on top of where the firmware placed the frame buffer. Signed-off-by: Jake Oshins <jakeo@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-30 14:01:37 -07:00
Jake Oshins	6d146aefba	drivers:hv: Record MMIO range in use by frame buffer Later in the boot sequence, we need to figure out which memory ranges can be given out to various paravirtual drivers. The hyperv_fb driver should, ideally, be placed right on top of the frame buffer, without some other device getting plopped on top of this range in the meantime. Recording this now allows that to be guaranteed. Signed-off-by: Jake Oshins <jakeo@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-30 14:01:37 -07:00
Jake Oshins	be000f93e5	drivers:hv: Track allocations of children of hv_vmbus in private resource tree This patch changes vmbus_allocate_mmio() and vmbus_free_mmio() so that when child paravirtual devices allocate memory-mapped I/O space, they allocate it privately from a resource tree pointed at by hyperv_mmio and also by the public resource tree iomem_resource. This allows the region to be marked as "busy" in the private tree, but a "bridge window" in the public tree, guaranteeing that no two bridge windows will overlap each other but while also allowing the PCI device children of the bridge windows to overlap that window. One might conclude that this belongs in the pnp layer, rather than in this driver. Rafael Wysocki, the maintainter of the pnp layer, has previously asked that we not modify the pnp layer as it is considered deprecated. This patch is thus essentially a workaround. Signed-off-by: Jake Oshins <jakeo@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-30 14:01:37 -07:00
Jake Oshins	23a0683186	drivers:hv: Reverse order of resources in hyperv_mmio A patch later in this series allocates child nodes in this resource tree. For that to work, this tree needs to be sorted in ascending order. Signed-off-by: Jake Oshins <jakeo@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-30 14:01:37 -07:00
Jake Oshins	97fb77dc87	drivers:hv: Make a function to free mmio regions through vmbus This patch introduces a function that reverses everything done by vmbus_allocate_mmio(). Existing code just called release_mem_region(). Future patches in this series require a more complex sequence of actions, so this function is introduced to wrap those actions. Signed-off-by: Jake Oshins <jakeo@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-30 14:01:37 -07:00
Jake Oshins	e16dad6bfe	drivers:hv: Lock access to hyperv_mmio resource tree In existing code, this tree of resources is created in single-threaded code and never modified after it is created, and thus needs no locking. This patch introduces a semaphore for tree access, as other patches in this series introduce run-time modifications of this resource tree which can happen on multiple threads. Signed-off-by: Jake Oshins <jakeo@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-30 14:01:37 -07:00
K. Y. Srinivasan	ab028db41c	Drivers: hv: vmbus: Implement APIs to support "in place" consumption of vmbus packets Implement APIs for in-place consumption of vmbus packets. Currently, each packet is copied and processed one at a time and as part of processing each packet we potentially may signal the host (if it is waiting for room to produce a packet). These APIs help batched in-place processing of vmbus packets. We also optimize host signaling by having a separate API to signal the end of in-place consumption. With netvsc using these APIs, on an iperf run on average I see about 20X reduction in checks to signal the host. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-30 14:00:19 -07:00
K. Y. Srinivasan	687f32e6d9	Drivers: hv: vmbus: Move some ring buffer functions to hyperv.h In preparation for implementing APIs for in-place consumption of VMBUS packets, movve some ring buffer functionality into hyperv.h Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-30 14:00:19 -07:00
K. Y. Srinivasan	5cc472477f	Drivers: hv: vmbus: Export the vmbus_set_event() API In preparation for moving some ring buffer functionality out of the vmbus driver, export the API for signaling the host. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-30 14:00:19 -07:00
K. Y. Srinivasan	dcd0eeca44	Drivers: hv: vmbus: Use the new virt_xx barrier code Use the virt_xx barriers that have been defined for use in virtual machines. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-30 14:00:19 -07:00
K. Y. Srinivasan	d45faaeedb	Drivers: hv: vmbus: Use READ_ONCE() to read variables that are volatile Use the READ_ONCE macro to access variabes that can change asynchronously. This is the recommended mechanism for dealing with "unsafe" compiler optimizations. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-30 14:00:19 -07:00
K. Y. Srinivasan	a6341f0000	Drivers: hv: vmbus: Introduce functions for estimating room in the ring buffer Introduce separate functions for estimating how much can be read from and written to the ring buffer. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-30 14:00:19 -07:00
K. Y. Srinivasan	a389fcfd2c	Drivers: hv: vmbus: Fix signaling logic in hv_need_to_signal_on_read() On the consumer side, we have interrupt driven flow management of the producer. It is sufficient to base the signaling decision on the amount of space that is available to write after the read is complete. The current code samples the previous available space and uses this in making the signaling decision. This state can be stale and is unnecessary. Since the state can be stale, we end up not signaling the host (when we should) and this can result in a hang. Fix this problem by removing the unnecessary check. I would like to thank Arseney Romanenko <arseneyr@microsoft.com> for pointing out this issue. Also, issue a full memory barrier before making the signaling descision to correctly deal with potential reordering of the write (read index) followed by the read of pending_sz. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Tested-by: Dexuan Cui <decui@microsoft.com> Cc: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-04-30 14:00:16 -07:00
Linus Torvalds	8eee93e257	Char/Misc patches for 4.6-rc1 Here is the big char/misc driver update for 4.6-rc1. The majority of the patches here is hwtracing and some new mic drivers, but there's a lot of other driver updates as well. Full details in the shortlog. All have been in linux-next for a while with no reported issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEABECAAYFAlbp9IcACgkQMUfUDdst+ykyJgCeLTC2QNGrh51kiJglkVJ0yD36 q4MAn0NkvSX2+iv5Jq8MaX6UQoRa4Nun =MNjR -----END PGP SIGNATURE----- Merge tag 'char-misc-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc updates from Greg KH: "Here is the big char/misc driver update for 4.6-rc1. The majority of the patches here is hwtracing and some new mic drivers, but there's a lot of other driver updates as well. Full details in the shortlog. All have been in linux-next for a while with no reported issues" * tag 'char-misc-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (238 commits) goldfish: Fix build error of missing ioremap on UM nvmem: mediatek: Fix later provider initialization nvmem: imx-ocotp: Fix return value of imx_ocotp_read nvmem: Fix dependencies for !HAS_IOMEM archs char: genrtc: replace blacklist with whitelist drivers/hwtracing: make coresight-etm-perf.c explicitly non-modular drivers: char: mem: fix IS_ERROR_VALUE usage char: xillybus: Fix internal data structure initialization pch_phub: return -ENODATA if ROM can't be mapped Drivers: hv: vmbus: Support kexec on ws2012 r2 and above Drivers: hv: vmbus: Support handling messages on multiple CPUs Drivers: hv: utils: Remove util transport handler from list if registration fails Drivers: hv: util: Pass the channel information during the init call Drivers: hv: vmbus: avoid unneeded compiler optimizations in vmbus_wait_for_unload() Drivers: hv: vmbus: remove code duplication in message handling Drivers: hv: vmbus: avoid wait_for_completion() on crash Drivers: hv: vmbus: don't loose HVMSG_TIMER_EXPIRED messages misc: at24: replace memory_accessor with nvmem_device_read eeprom: 93xx46: extend driver to plug into the NVMEM framework eeprom: at25: extend driver to plug into the NVMEM framework ...	2016-03-17 13:47:50 -07:00
Alex Ng	7268644734	Drivers: hv: vmbus: Support kexec on ws2012 r2 and above WS2012 R2 and above hosts can support kexec in that thay can support reconnecting to the host (as would be needed in the kexec path) on any CPU. Enable this. Pre ws2012 r2 hosts don't have this ability and consequently cannot support kexec. Signed-off-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-03-01 16:57:20 -08:00
K. Y. Srinivasan	d81274aae6	Drivers: hv: vmbus: Support handling messages on multiple CPUs Starting with Windows 2012 R2, message inteerupts can be delivered on any VCPU in the guest. Support this functionality. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-03-01 16:57:20 -08:00
Alex Ng	e66853b090	Drivers: hv: utils: Remove util transport handler from list if registration fails If util transport fails to initialize for any reason, the list of transport handlers may become corrupted due to freeing the transport handler without removing it from the list. Fix this by cleaning it up from the list. Signed-off-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-03-01 16:57:20 -08:00
K. Y. Srinivasan	b9830d120c	Drivers: hv: util: Pass the channel information during the init call Pass the channel information to the util drivers that need to defer reading the channel while they are processing a request. This would address the following issue reported by Vitaly: Commit `3cace4a616` ("Drivers: hv: utils: run polling callback always in interrupt context") removed direct _transaction.state = HVUTIL_READY assignments from _handle_handshake() functions introducing the following race: if a userspace daemon connects before we get first non-negotiation request from the server hv_poll_channel() won't set transaction state to HVUTIL_READY as (!channel) condition will fail, we set it to non-NULL on the first real request from the server. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-03-01 16:57:20 -08:00
Vitaly Kuznetsov	d452ab7b4c	Drivers: hv: vmbus: avoid unneeded compiler optimizations in vmbus_wait_for_unload() Message header is modified by the hypervisor and we read it in a loop, we need to prevent compilers from optimizing accesses. There are no such optimizations at this moment, this is just a future proof. Suggested-by: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Radim Kr.má<rkrcmar@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-03-01 16:57:20 -08:00
Vitaly Kuznetsov	0f70b66975	Drivers: hv: vmbus: remove code duplication in message handling We have 3 functions dealing with messages and they all implement the same logic to finalize reads, move it to vmbus_signal_eom(). Suggested-by: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Radim Kr.má<rkrcmar@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-03-01 16:57:20 -08:00
Vitaly Kuznetsov	75ff3a8a91	Drivers: hv: vmbus: avoid wait_for_completion() on crash wait_for_completion() may sleep, it enables interrupts and this is something we really want to avoid on crashes because interrupt handlers can cause other crashes. Switch to the recently introduced vmbus_wait_for_unload() doing busy wait instead. Reported-by: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Radim Kr.má<rkrcmar@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-03-01 16:57:20 -08:00
Vitaly Kuznetsov	7be3e16944	Drivers: hv: vmbus: don't loose HVMSG_TIMER_EXPIRED messages We must handle HVMSG_TIMER_EXPIRED messages in the interrupt context and we offload all the rest to vmbus_on_msg_dpc() tasklet. This functions loops to see if there are new messages pending. In case we'll ever see HVMSG_TIMER_EXPIRED message there we're going to lose it as we can't handle it from there. Avoid looping in vmbus_on_msg_dpc(), we're OK with handling one message per interrupt. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Radim Kr.má<rkrcmar@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-03-01 16:57:20 -08:00
Andrey Smetanin	18f098618a	drivers/hv: Move VMBus hypercall codes into Hyper-V UAPI header VMBus hypercall codes inside Hyper-V UAPI header will be used by QEMU to implement VMBus host devices support. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Acked-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Joerg Roedel <joro@8bytes.org> CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org [Do not rename the constant at the same time as moving it, as that would cause semantic conflicts with the Hyper-V tree. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-02-16 18:48:40 +01:00
K. Y. Srinivasan	fe760e4d64	Drivers: hv: vmbus: Give control over how the ring access is serialized On the channel send side, many of the VMBUS device drivers explicity serialize access to the outgoing ring buffer. Give more control to the VMBUS device drivers in terms how to serialize accesss to the outgoing ring buffer. The default behavior will be to aquire the ring lock to preserve the current behavior. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-02-07 21:34:12 -08:00
K. Y. Srinivasan	3eba9a77d5	Drivers: hv: vmbus: Eliminate the spin lock on the read path The function hv_ringbuffer_read() is called always on a pre-assigned CPU. Each chnnel is bound to a specific CPU and this function is always called on the CPU the channel is bound. There is no need to acquire the spin lock; get rid of this overhead. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-02-07 21:34:12 -08:00
Dexuan Cui	85d9aa7051	Drivers: hv: vmbus: add an API vmbus_hvsock_device_unregister() The hvsock driver needs this API to release all the resources related to the channel. Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-02-07 21:34:12 -08:00
Dexuan Cui	499e8401a5	Drivers: hv: vmbus: add a per-channel rescind callback This will be used by the coming hv_sock driver. Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-02-07 21:34:12 -08:00
Dexuan Cui	8981da320a	Drivers: hv: vmbus: add a hvsock flag in struct hv_driver Only the coming hv_sock driver has a "true" value for this flag. We treat the hvsock offers/channels as special VMBus devices. Since the hv_sock driver handles all the hvsock offers/channels, we need to tweak vmbus_match() for hv_sock driver, so we introduce this flag. Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-02-07 21:34:12 -08:00
Dexuan Cui	5c23a1a5c6	Drivers: hv: vmbus: define a new VMBus message type for hvsock A function to send the type of message is also added. The coming net/hvsock driver will use this function to proactively request the host to offer a VMBus channel for a new hvsock connection. Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-02-07 21:34:12 -08:00
Dexuan Cui	5f363bc38f	Drivers: hv: vmbus: vmbus_sendpacket_ctl: hvsock: avoid unnecessary signaling When the hvsock channel's outbound ringbuffer is full (i.e., hv_ringbuffer_write() returns -EAGAIN), we should avoid the unnecessary signaling the host. Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-02-07 21:34:12 -08:00
Vitaly Kuznetsov	3ccb4fd8f4	Drivers: hv: vmbus: don't manipulate with clocksources on crash clocksource_change_rating() involves mutex usage and can't be called in interrupt context. It also makes sense to avoid doing redundant work on crash. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-02-07 21:34:12 -08:00
Vitaly Kuznetsov	415719160d	Drivers: hv: vmbus: avoid scheduling in interrupt context in vmbus_initiate_unload() We have to call vmbus_initiate_unload() on crash to make kdump work but the crash can also be happening in interrupt (e.g. Sysrq + c results in such) where we can't schedule or the following will happen: [ 314.905786] bad: scheduling from the idle thread! Just skipping the wait (and even adding some random wait here) won't help: to make host-side magic working we're supposed to receive CHANNELMSG_UNLOAD (and actually confirm the fact that we received it) but we can't use interrupt-base path (vmbus_isr()-> vmbus_on_msg_dpc()). Implement a simple busy wait ignoring all the other messages and use it if we're in an interrupt context. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-02-07 21:34:12 -08:00
Vitaly Kuznetsov	79fd8e7066	Drivers: hv: vmbus: avoid infinite loop in init_vp_index() When we pick a CPU to use for a new subchannel we try find a non-used one on the appropriate NUMA node, we keep track of them with the primary->alloced_cpus_in_node mask. Under normal circumstances we don't run out of available CPUs but it is possible when we we don't initialize some cpus in Linux, e.g. when we boot with 'nr_cpus=' limitation. Avoid the infinite loop in init_vp_index() by checking that we still have non-used CPUs in the alloced_cpus_in_node mask and resetting it in case we don't. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-02-07 21:34:12 -08:00
K. Y. Srinivasan	7047f17d70	Drivers: hv: vmbus: Add vendor and device atttributes Add vendor and device attributes to VMBUS devices. These will be used by Hyper-V tools as well user-level RDMA libraries that will use the vendor/device tuple to discover the RDMA device. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-02-07 21:32:57 -08:00
K. Y. Srinivasan	1b807e1011	Drivers: hv: vmbus: Cleanup vmbus_set_event() Cleanup vmbus_set_event() by inlining the hypercall to post the event and since the return value of vmbus_set_event() is not checked, make it void. As part of this cleanup, get rid of the function hv_signal_event() as it is only callled from vmbus_set_event(). Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2016-02-07 21:32:57 -08:00
Linus Torvalds	4c257ec37b	char/misc patches for 4.5-rc1 Here's the big set of char/misc patches for 4.5-rc1. Nothing major, lots of different driver subsystem updates, full details in the shortlog. All of these have been in linux-next for a while. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEABECAAYFAlaV0GsACgkQMUfUDdst+ymlPgCg07GSc4SOWlUL2V36vQ0kucoO YjAAoMfeUEhsf/NJ7iaAMGQVUQKuYVqr =BlqH -----END PGP SIGNATURE----- Merge tag 'char-misc-4.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc updates from Greg KH: "Here's the big set of char/misc patches for 4.5-rc1. Nothing major, lots of different driver subsystem updates, full details in the shortlog. All of these have been in linux-next for a while" * tag 'char-misc-4.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (71 commits) mei: fix fasync return value on error parport: avoid assignment in if parport: remove unneeded space parport: change style of NULL comparison parport: remove unnecessary out of memory message parport: remove braces parport: quoted strings should not be split parport: code indent should use tabs parport: fix coding style parport: EXPORT_SYMBOL should follow function parport: remove trailing white space parport: fix a trivial typo coresight: Fix a typo in Kconfig coresight: checking for NULL string in coresight_name_match() Drivers: hv: vmbus: Treat Fibre Channel devices as performance critical Drivers: hv: utils: fix hvt_op_poll() return value on transport destroy Drivers: hv: vmbus: fix the building warning with hyperv-keyboard extcon: add Maxim MAX3355 driver Drivers: hv: ring_buffer: eliminate hv_ringbuffer_peek() Drivers: hv: remove code duplication between vmbus_recvpacket()/vmbus_recvpacket_raw() ...	2016-01-13 10:23:36 -08:00
K. Y. Srinivasan	879a650a27	Drivers: hv: vmbus: Treat Fibre Channel devices as performance critical For performance critical devices, we distribute the incoming channel interrupt load across available CPUs in the guest. Include Fibre channel devices in the set of devices for which we would distribute the interrupt load. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-21 13:14:56 -08:00
Vitaly Kuznetsov	77b744a598	Drivers: hv: utils: fix hvt_op_poll() return value on transport destroy The return type of hvt_op_poll() is unsigned int and -EBADF is inappropriate, poll functions return POLL* statuses. Reported-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-21 13:14:56 -08:00
Andrey Smetanin	c71acc4c74	drivers/hv: Move struct hv_timer_message_payload into UAPI Hyper-V x86 header This struct is required for Hyper-V SynIC timers implementation inside KVM and for upcoming Hyper-V VMBus support by userspace(QEMU). So place it into Hyper-V UAPI header. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: Vitaly Kuznetsov <vkuznets@redhat.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-12-16 18:49:41 +01:00
Andrey Smetanin	5b423efe11	drivers/hv: Move struct hv_message into UAPI Hyper-V x86 header This struct is required for Hyper-V SynIC timers implementation inside KVM and for upcoming Hyper-V VMBus support by userspace(QEMU). So place it into Hyper-V UAPI header. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Acked-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: Vitaly Kuznetsov <vkuznets@redhat.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-12-16 18:49:40 +01:00
Andrey Smetanin	4f39bcfd1c	drivers/hv: Move HV_SYNIC_STIMER_COUNT into Hyper-V UAPI x86 header This constant is required for Hyper-V SynIC timers MSR's support by userspace(QEMU). Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Acked-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: Vitaly Kuznetsov <vkuznets@redhat.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-12-16 18:49:40 +01:00
Andrey Smetanin	7797dcf63f	drivers/hv: replace enum hv_message_type by u32 enum hv_message_type inside struct hv_message, hv_post_message is not size portable. Replace enum by u32. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: Vitaly Kuznetsov <vkuznets@redhat.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-12-16 18:49:39 +01:00
Vitaly Kuznetsov	940b68e2c3	Drivers: hv: ring_buffer: eliminate hv_ringbuffer_peek() Currently, there is only one user for hv_ringbuffer_read()/ hv_ringbuffer_peak() functions and the usage of these functions is: - insecure as we drop ring_lock between them, someone else (in theory only) can acquire it in between; - non-optimal as we do a number of things (acquire/release the above mentioned lock, calculate available space on the ring, ...) twice and this path is performance-critical. Remove hv_ringbuffer_peek() moving the logic from __vmbus_recvpacket() to hv_ringbuffer_read(). Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:27:30 -08:00
Vitaly Kuznetsov	667d374064	Drivers: hv: remove code duplication between vmbus_recvpacket()/vmbus_recvpacket_raw() vmbus_recvpacket() and vmbus_recvpacket_raw() are almost identical but there are two discrepancies: 1) vmbus_recvpacket() doesn't propagate errors from hv_ringbuffer_read() which looks like it is not desired. 2) There is an error message printed in packetlen > bufferlen case in vmbus_recvpacket(). I'm removing it as it is usless for users to see such messages and /vmbus_recvpacket_raw() doesn't have it. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:27:30 -08:00
Vitaly Kuznetsov	b5f53dde8d	Drivers: hv: ring_buffer: remove code duplication from hv_ringbuffer_peek/read() hv_ringbuffer_peek() does the same as hv_ringbuffer_read() without advancing the read index. The only functional change this patch brings is moving hv_need_to_signal_on_read() call under the ring_lock but this function is just a couple of comparisons. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:27:30 -08:00
Vitaly Kuznetsov	822f18d4d3	Drivers: hv: ring_buffer.c: fix comment style Convert 6+-string comments repeating function names to normal kernel-style comments and fix a couple of other comment style issues. No textual or functional changes intended. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:27:30 -08:00
Vitaly Kuznetsov	9420098adc	Drivers: hv: utils: fix crash when device is removed from host side The crash is observed when a service is being disabled host side while userspace daemon is connected to the device: [ 90.244859] general protection fault: 0000 [#1] SMP ... [ 90.800082] Call Trace: [ 90.800082] [<ffffffff81187008>] __fput+0xc8/0x1f0 [ 90.800082] [<ffffffff8118716e>] ____fput+0xe/0x10 ... [ 90.800082] [<ffffffff81015278>] do_signal+0x28/0x580 [ 90.800082] [<ffffffff81086656>] ? finish_task_switch+0xa6/0x180 [ 90.800082] [<ffffffff81443ebf>] ? __schedule+0x28f/0x870 [ 90.800082] [<ffffffffa01ebbaa>] ? hvt_op_read+0x12a/0x140 [hv_utils] ... The problem is that hvutil_transport_destroy() which does misc_deregister() freeing the appropriate device is reachable by two paths: module unload and from util_remove(). While module unload path is protected by .owner in struct file_operations util_remove() path is not. Freeing the device while someone holds an open fd for it is a show stopper. In general, it is not possible to revoke an fd from all users so the only way to solve the issue is to defer freeing the hvutil_transport structure. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:27:30 -08:00
Vitaly Kuznetsov	a15025660d	Drivers: hv: utils: introduce HVUTIL_TRANSPORT_DESTROY mode When Hyper-V host asks us to remove some util driver by closing the appropriate channel there is no easy way to force the current file descriptor holder to hang up but we can start to respond -EBADF to all operations asking it to exit gracefully. As we're setting hvt->mode from two separate contexts now we need to use a proper locking. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:27:30 -08:00
Vitaly Kuznetsov	a72f3a4ccf	Drivers: hv: utils: rename outmsg_lock As a preparation to reusing outmsg_lock to protect test-and-set openrations on 'mode' rename it the more general 'lock'. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:27:30 -08:00
Vitaly Kuznetsov	1f75338b6f	Drivers: hv: utils: fix memory leak on on_msg() failure inmsg should be freed in case of on_msg() failure to avoid memory leak. Preserve the error code from on_msg(). Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:27:30 -08:00
K. Y. Srinivasan	2d0c3b5ad7	Drivers: hv: utils: Invoke the poll function after handshake When the handshake with daemon is complete, we should poll the channel since during the handshake, we will not be processing any messages. This is a potential bug if the host is waiting for a response from the guest. I would like to thank Dexuan for pointing this out. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:15:05 -08:00
K. Y. Srinivasan	b282e4c06f	Drivers: hv: vmbus: Force all channel messages to be delivered on CPU 0 Force all channel messages to be delivered on CPU0. These messages are not performance critical and are used during the setup and teardown of the channel. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:15:05 -08:00
Andrey Smetanin	c35b82ef02	drivers/hv: correct tsc page sequence invalid value Hypervisor Top Level Functional Specification v3/4 says that TSC page sequence value = -1(0xFFFFFFFF) is used to indicate that TSC page no longer reliable source of reference timer. Unfortunately, we found that Windows Hyper-V guest side implementation uses sequence value = 0 to indicate that Tsc page no longer valid. This is clearly visible inside Windows 2012R2 ntoskrnl.exe HvlGetReferenceTime() function dissassembly: HvlGetReferenceTime proc near xchg ax, ax loc_1401C3132: mov rax, cs:HvlpReferenceTscPage mov r9d, [rax] test r9d, r9d jz short loc_1401C3176 rdtsc mov rcx, cs:HvlpReferenceTscPage shl rdx, 20h or rdx, rax mov rax, [rcx+8] mov rcx, cs:HvlpReferenceTscPage mov r8, [rcx+10h] mul rdx mov rax, cs:HvlpReferenceTscPage add rdx, r8 mov ecx, [rax] cmp ecx, r9d jnz short loc_1401C3132 jmp short loc_1401C3184 loc_1401C3176: mov ecx, 40000020h rdmsr shl rdx, 20h or rdx, rax loc_1401C3184: mov rax, rdx retn HvlGetReferenceTime endp This patch aligns Tsc page invalid sequence value with Windows Hyper-V guest implementation which is more compatible with both Hyper-V hypervisor and KVM hypervisor. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:15:05 -08:00
K. Y. Srinivasan	8599846d73	Drivers: hv: vmbus: Fix a Host signaling bug Currently we have two policies for deciding when to signal the host: One based on the ring buffer state and the other based on what the VMBUS client driver wants to do. Consider the case when the client wants to explicitly control when to signal the host. In this case, if the client were to defer signaling, we will not be able to signal the host subsequently when the client does want to signal since the ring buffer state will prevent the signaling. Implement logic to have only one signaling policy in force for a given channel. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Tested-by: Haiyang Zhang <haiyangz@microsoft.com> Cc: <stable@vger.kernel.org> # v4.2+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:15:05 -08:00
Jake Oshins	40f26f3168	drivers:hv: Allow for MMIO claims that span ACPI _CRS records This patch makes 16GB GPUs work in Hyper-V VMs, since, for compatibility reasons, the Hyper-V BIOS lists MMIO ranges in 2GB chunks in its root bus's _CRS object. Signed-off-by: Jake Oshins <jakeo@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:15:05 -08:00
Dexuan Cui	d6f591e339	Drivers: hv: vmbus: channge vmbus_connection.channel_lock to mutex spinlock is unnecessary here. mutex is enough. Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:15:05 -08:00
Dexuan Cui	f52078cf57	Drivers: hv: vmbus: release relid on error in vmbus_process_offer() We want to simplify vmbus_onoffer_rescind() by not invoking hv_process_channel_removal(NULL, ...). Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:15:05 -08:00
Dexuan Cui	34c6801e33	Drivers: hv: vmbus: fix rescind-offer handling for device without a driver In the path vmbus_onoffer_rescind() -> vmbus_device_unregister() -> device_unregister() -> ... -> __device_release_driver(), we can see for a device without a driver loaded: dev->driver is NULL, so dev->bus->remove(dev), namely vmbus_remove(), isn't invoked. As a result, vmbus_remove() -> hv_process_channel_removal() isn't invoked and some cleanups(like sending a CHANNELMSG_RELID_RELEASED message to the host) aren't done. We can demo the issue this way: 1. rmmod hv_utils; 2. disable the Heartbeat Integration Service in Hyper-V Manager and lsvmbus shows the device disappears. 3. re-enable the Heartbeat in Hyper-V Manager and modprobe hv_utils, but lsvmbus shows the device can't appear again. This is because, the host thinks the VM hasn't released the relid, so can't re-offer the device to the VM. We can fix the issue by moving hv_process_channel_removal() from vmbus_close_internal() to vmbus_device_release(), since the latter is always invoked on device_unregister(), whether or not the dev has a driver loaded. Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:15:05 -08:00
Dexuan Cui	64b7faf903	Drivers: hv: vmbus: do sanity check of channel state in vmbus_close_internal() This fixes an incorrect assumption of channel state in the function. Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:15:05 -08:00
Dexuan Cui	63d55b2aeb	Drivers: hv: vmbus: serialize process_chn_event() and vmbus_close_internal() process_chn_event(), running in the tasklet, can race with vmbus_close_internal() in the case of SMP guest, e.g., when the former is accessing channel->inbound.ring_buffer, the latter could be freeing the ring_buffer pages. To resolve the race, we can serialize them by disabling the tasklet when the latter is running here. Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:15:05 -08:00
K. Y. Srinivasan	efc267226b	Drivers: hv: vmbus: Get rid of the unused irq variable The irq we extract from ACPI is not used - we deliver hypervisor interrupts on a special vector. Make the necessary adjustments. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:15:05 -08:00
K. Y. Srinivasan	4ae9250893	Drivers: hv: vmbus: Use uuid_le_cmp() for comparing GUIDs Use uuid_le_cmp() for comparing GUIDs. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:15:05 -08:00
K. Y. Srinivasan	af3ff643ea	Drivers: hv: vmbus: Use uuid_le type consistently Consistently use uuid_le type in the Hyper-V driver code. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:15:05 -08:00
Olaf Hering	ed9ba608e4	Drivers: hv: vss: run only on supported host versions The Backup integration service on WS2012 has appearently trouble to negotiate with a guest which does not support the provided util version. Currently the VSS driver supports only version 5/0. A WS2012 offers only version 1/x and 3/x, and vmbus_prep_negotiate_resp correctly returns an empty icframe_vercnt/icmsg_vercnt. But the host ignores that and continues to send ICMSGTYPE_NEGOTIATE messages. The result are weird errors during boot and general misbehaviour. Check the Windows version to work around the host bug, skip hv_vss_init on WS2012 and older. Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:15:05 -08:00
Jake Oshins	3053c76244	drivers:hv: Define the channel type for Hyper-V PCI Express pass-through This defines the channel type for PCI front-ends in Hyper-V VMs. Signed-off-by: Jake Oshins <jakeo@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:12:21 -08:00
Jake Oshins	a108393dbf	drivers:hv: Export the API to invoke a hypercall on Hyper-V This patch exposes the function that hv_vmbus.ko uses to make hypercalls. This is necessary for retargeting an interrupt when it is given a new affinity. Since we are exporting this API, rename the API as it will be visible outside the hv.c file. Signed-off-by: Jake Oshins <jakeo@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:12:21 -08:00
Jake Oshins	619848bd07	drivers:hv: Export a function that maps Linux CPU num onto Hyper-V proc num This patch exposes the mapping between Linux CPU number and Hyper-V virtual processor number. This is necessary because the hypervisor needs to know which virtual processors to target when making a mapping in the Interrupt Redirection Table in the I/O MMU. Signed-off-by: Jake Oshins <jakeo@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:12:21 -08:00
Andrey Smetanin	17efbee8ba	drivers/hv: cleanup synic msrs if vmbus connect failed Before vmbus_connect() synic is setup per vcpu - this means hypervisor receives writes at synic msr's and probably allocate hypervisor resources per synic setup. If vmbus_connect() failed for some reason it's neccessary to cleanup synic setup by call hv_synic_cleanup() at each vcpu to get a chance to free allocated resources by hypervisor per synic. This patch does appropriate cleanup in case of vmbus_connect() failure. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:12:21 -08:00
Olaf Hering	b00359642c	Drivers: hv: utils: use memdup_user in hvt_op_write Use memdup_user to handle OOM. Fixes: `14b50f80c3` ('Drivers: hv: util: introduce hv_utils_transport abstraction') Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:12:21 -08:00
Olaf Hering	cdc0c0c94e	Drivers: hv: util: catch allocation errors Catch allocation errors in hvutil_transport_send. Fixes: `14b50f80c3` ('Drivers: hv: util: introduce hv_utils_transport abstraction') Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:12:21 -08:00
Olaf Hering	3cace4a616	Drivers: hv: utils: run polling callback always in interrupt context All channel interrupts are bound to specific VCPUs in the guest at the point channel is created. While currently, we invoke the polling function on the correct CPU (the CPU to which the channel is bound to) in some cases we may run the polling function in a non-interrupt context. This potentially can cause an issue as the polling function can be interrupted by the channel callback function. Fix the issue by running the polling function on the appropriate CPU at interrupt level. Additional details of the issue being addressed by this patch are given below: Currently hv_fcopy_onchannelcallback is called from interrupts and also via the ->write function of hv_utils. Since the used global variables to maintain state are not thread safe the state can get out of sync. This affects the variable state as well as the channel inbound buffer. As suggested by KY adjust hv_poll_channel to always run the given callback on the cpu which the channel is bound to. This avoids the need for locking because all the util services are single threaded and only one transaction is active at any given point in time. Additionally, remove the context variable, they will always be the same as recv_channel. Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:12:21 -08:00
K. Y. Srinivasan	c0b200cfb0	Drivers: hv: util: Increase the timeout for util services Util services such as KVP and FCOPY need assistance from daemon's running in user space. Increase the timeout so we don't prematurely terminate the transaction in the kernel. Host sets up a 60 second timeout for all util driver transactions. The host will retry the transaction if it times out. Set the guest timeout at 30 seconds. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 19:12:21 -08:00
Sudip Mukherjee	9220e39b5c	Drivers: hv: vmbus: fix build warning We were getting build warning about unused variable "tsc_msr" and "va_tsc" while building for i386 allmodconfig. Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-12-14 13:34:50 -08:00
Andrey Smetanin	c75efa974e	drivers/hv: share Hyper-V SynIC constants with userspace Moved Hyper-V synic contants from guest Hyper-V drivers private header into x86 arch uapi Hyper-V header. Added Hyper-V synic msr's flags into x86 arch uapi Hyper-V header. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Vitaly Kuznetsov <vkuznets@redhat.com> CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2015-11-04 16:24:33 +01:00
Dexuan Cui	ca1c4b7457	Drivers: hv: vmbus: fix init_vp_index() for reloading hv_netvsc This fixes the recent commit `3b71107d73`: Drivers: hv: vmbus: Further improve CPU affiliation logic Without the fix, reloading hv_netvsc hangs the guest. Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-09-20 22:44:51 -07:00
Vitaly Kuznetsov	f39c4280a3	Drivers: hv: vmbus: use cpu_hotplug_enable/disable Commit `e513229b4c` ("Drivers: hv: vmbus: prevent cpu offlining on newer hypervisors") was altering smp_ops.cpu_disable to prevent CPU offlining. We can bo better by using cpu_hotplug_enable/disable functions instead of such hard-coding. Reported-by: Radim Kr.má <rkrcmar@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-05 11:46:44 -07:00
Dexuan Cui	042ab0313b	Drivers: hv: vmbus: add a sysfs attr to show the binding of channel/VP This is useful to analyze performance issue. Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-05 11:44:29 -07:00
K. Y. Srinivasan	ca9357bd26	Drivers: hv: vmbus: Implement a clocksource based on the TSC page The current Hyper-V clock source is based on the per-partition reference counter and this counter is being accessed via s synthetic MSR - HV_X64_MSR_TIME_REF_COUNT. Hyper-V has a more efficient way of computing the per-partition reference counter value that does not involve reading a synthetic MSR. We implement a time source based on this mechanism. Tested-by: Vivek Yadav <vyadav@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-05 11:44:29 -07:00
Viresh Kumar	bc609cb47f	drivers/hv: Migrate to new 'set-state' interface Migrate hv driver to the new 'set-state' interface provided by clockevents core, the earlier 'set-mode' interface is marked obsolete now. This also enables us to implement callbacks for new states of clockevent devices, for example: ONESHOT_STOPPED. Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: devel@linuxdriverproject.org Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-05 11:44:29 -07:00
Christopher Oo	a5cca686ce	Drivers: hv_vmbus: Fix signal to host condition Fixes a bug where previously hv_ringbuffer_read would pass in the old number of bytes available to read instead of the expected old read index when calculating when to signal to the host that the ringbuffer is empty. Since the previous write size is already saved, also changes the hv_need_to_signal_on_read to use the previously read value rather than recalculating it. Signed-off-by: Christopher Oo <t-chriso@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-05 11:44:29 -07:00
Dexuan Cui	3b71107d73	Drivers: hv: vmbus: Further improve CPU affiliation logic Keep track of CPU affiliations of sub-channels within the scope of the primary channel. This will allow us to better distribute the load amongst available CPUs. Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-05 11:44:28 -07:00
K. Y. Srinivasan	9f01ec5345	Drivers: hv: vmbus: Improve the CPU affiliation for channels The current code tracks the assigned CPUs within a NUMA node in the context of the primary channel. So, if we have a VM with a single NUMA node with 8 VCPUs, we may end up unevenly distributing the channel load. Fix the issue by tracking affiliations globally. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-05 11:41:31 -07:00
Jake Oshins	3546448338	drivers:hv: Move MMIO range picking from hyper_fb to hv_vmbus This patch deletes the logic from hyperv_fb which picked a range of MMIO space for the frame buffer and adds new logic to hv_vmbus which picks ranges for child drivers. The new logic isn't quite the same as the old, as it considers more possible ranges. Signed-off-by: Jake Oshins <jakeo@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-05 11:41:31 -07:00
Jake Oshins	7f163a6fd9	drivers:hv: Modify hv_vmbus to search for all MMIO ranges available. This patch changes the logic in hv_vmbus to record all of the ranges in the VM's firmware (BIOS or UEFI) that offer regions of memory-mapped I/O space for use by paravirtual front-end drivers. The old logic just found one range above 4GB and called it good. This logic will find any ranges above 1MB. It would have been possible with this patch to just use existing resource allocation functions, rather than keep track of the entire set of Hyper-V related MMIO regions in VMBus. This strategy, however, is not sufficient when the resource allocator needs to be aware of the constraints of a Hyper-V virtual machine, which is what happens in the next patch in the series. So this first patch exists to show the first steps in reworking the MMIO allocation paths for Hyper-V front-end drivers. Signed-off-by: Jake Oshins <jakeo@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-05 11:41:30 -07:00
K. Y. Srinivasan	379e4f756b	Drivers: hv: vmbus: Consider ND NIC in binding channels to CPUs We cycle through all the "high performance" channels to distribute load across the available CPUs. Process the NetworkDirect as a high performance device. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-04 22:30:44 -07:00
Denis V. Lunev	cc2dd4027a	mshyperv: fix recognition of Hyper-V guest crash MSR's Hypervisor Top Level Functional Specification v3.1/4.0 notes that cpuid (0x40000003) EDX's 10th bit should be used to check that Hyper-V guest crash MSR's functionality available. This patch should fix this recognition. Currently the code checks EAX register instead of EDX. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-04 22:30:44 -07:00
Vitaly Kuznetsov	4a54243fc0	Drivers: hv: vmbus: don't send CHANNELMSG_UNLOAD on pre-Win2012R2 hosts Pre-Win2012R2 hosts don't properly handle CHANNELMSG_UNLOAD and wait_for_completion() hangs. Avoid sending such request on old hosts. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-04 22:29:45 -07:00
Nik Nyby	e26009aad0	Drivers: hv: vmbus: fix typo in hv_port_info struct This fixes a typo: base_flag_bumber to base_flag_number Signed-off-by: Nik Nyby <nikolas@gnu.org> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-04 22:29:45 -07:00
Dan Carpenter	9dd6a06430	hv: util: checking the wrong variable We don't catch this allocation failure because there is a typo and we check the wrong variable. Fixes: `14b50f80c3` ('Drivers: hv: util: introduce hv_utils_transport abstraction') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-04 22:29:45 -07:00
K. Y. Srinivasan	b81658cf5d	Drivers: hv: vmbus: Permit sending of packets without payload The guest may have to send a completion packet back to the host. To support this usage, permit sending a packet without a payload - we would be only sending the descriptor in this case. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-04 22:28:39 -07:00
Alex Ng	b6ddeae160	Drivers: hv: balloon: Enable dynamic memory protocol negotiation with Windows 10 hosts Support Win10 protocol for Dynamic Memory. Thia patch allows guests on Win10 hosts to hot-add memory even when dynamic memory is not enabled on the guest. Signed-off-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-04 22:28:39 -07:00
Vitaly Kuznetsov	25ef06fe27	Drivers: hv: fcopy: dynamically allocate smsg_out in fcopy_send_data() struct hv_start_fcopy is too big to be on stack on i386, the following warning is reported: >> drivers/hv/hv_fcopy.c:159:1: warning: the frame size of 1088 bytes is larger than 1024 bytes [-Wframe-larger-than=] Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-04 22:28:38 -07:00
Vitaly Kuznetsov	b36fda3397	Drivers: hv: kvp: check kzalloc return value kzalloc() return value check was accidentally lost in `11bc3a5fa9`: "Drivers: hv: kvp: convert to hv_utils_transport" commit. We don't need to reset kvp_transaction.state here as we have the kvp_timeout_func() timeout function and in case we're in OOM situation it is preferable to wait. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-04 22:28:38 -07:00
Vitaly Kuznetsov	510f7aef65	Drivers: hv: vmbus: prefer 'die' notification chain to 'panic' current_pt_regs() sometimes returns regs of the userspace process and in case of a kernel crash this is not what we need to report. E.g. when we trigger crash with sysrq we see the following: ... RIP: 0010:[<ffffffff815b8696>] [<ffffffff815b8696>] sysrq_handle_crash+0x16/0x20 RSP: 0018:ffff8800db0a7d88 EFLAGS: 00010246 RAX: 000000000000000f RBX: ffffffff820a0660 RCX: 0000000000000000 ... at the same time current_pt_regs() give us: ip=7f899ea7e9e0, ax=ffffffffffffffda, bx=26c81a0, cx=7f899ea7e9e0, ... These registers come from the userspace process triggered the crash. As we don't even know which process it was this information is rather useless. When kernel crash happens through 'die' proper regs are being passed to all receivers on the die_chain (and panic_notifier_list is being notified with the string passed to panic() only). If panic() is called manually (e.g. on BUG()) we won't get 'die' notification so keep the 'panic' notification reporter as well but guard against double reporting. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-04 22:28:38 -07:00
Vitaly Kuznetsov	b4370df2b1	Drivers: hv: vmbus: add special crash handler Full kernel hang is observed when kdump kernel starts after a crash. This hang happens in vmbus_negotiate_version() function on wait_for_completion() as Hyper-V host (Win2012R2 in my testing) never responds to CHANNELMSG_INITIATE_CONTACT as it thinks the connection is already established. We need to perform some mandatory minimalistic cleanup before we start new kernel. Reported-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-04 22:28:38 -07:00
Vitaly Kuznetsov	d7646eaa76	Drivers: hv: don't do hypercalls when hypercall_page is NULL At the very late stage of kexec a driver (which are not being unloaded) can try to post a message or signal an event. This will crash the kernel as we already did hv_cleanup() and the hypercall page is NULL. Move all common (between 32 and 64 bit code) declarations to the beginning of the do_hypercall() function. Unfortunately we have to write the !hypercall_page check twice to not mix declarations and code. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-04 22:25:29 -07:00
Vitaly Kuznetsov	2517281d63	Drivers: hv: vmbus: add special kexec handler When general-purpose kexec (not kdump) is being performed in Hyper-V guest the newly booted kernel fails with an MCE error coming from the host. It is the same error which was fixed in the "Drivers: hv: vmbus: Implement the protocol for tearing down vmbus state" commit - monitor pages remain special and when they're being written to (as the new kernel doesn't know these pages are special) bad things happen. We need to perform some minimalistic cleanup before booting a new kernel on kexec. To do so we need to register a special machine_ops.shutdown handler to be executed before the native_machine_shutdown(). Registering a shutdown notification handler via the register_reboot_notifier() call is not sufficient as it happens to early for our purposes. machine_ops is not being exported to modules (and I don't think we want to export it) so let's do this in mshyperv.c The minimalistic cleanup consists of cleaning up clockevents, synic MSRs, guest os id MSR, and hypercall MSR. Kdump doesn't require all this stuff as it lives in a separate memory space. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-04 22:25:29 -07:00
Vitaly Kuznetsov	06210b42f3	Drivers: hv: vmbus: remove hv_synic_free_cpu() call from hv_synic_cleanup() We already have hv_synic_free() which frees all per-cpu pages for all CPUs, let's remove the hv_synic_free_cpu() call from hv_synic_cleanup() so it will be possible to do separate cleanup (writing to MSRs) and final freeing. This is going to be used to assist kexec. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-08-04 22:25:28 -07:00
K. Y. Srinivasan	294409d205	Drivers: hv: vmbus: Allocate ring buffer memory in NUMA aware fashion Allocate ring buffer memory from the NUMA node assigned to the channel. Since this is a performance and not a correctness issue, if the node specific allocation were to fail, fall back and allocate without specifying the node. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-06-12 16:58:33 -07:00
K. Y. Srinivasan	1f656ff3fd	Drivers: hv: vmbus: Implement NUMA aware CPU affinity for channels Channels/sub-channels can be affinitized to VCPUs in the guest. Implement this affinity in a way that is NUMA aware. The current protocol distributed the primary channels uniformly across all available CPUs. The new protocol is NUMA aware: primary channels are distributed across the available NUMA nodes while the sub-channels within a primary channel are distributed amongst CPUs within the NUMA node assigned to the primary channel. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-06-01 10:56:31 +09:00
K. Y. Srinivasan	9c6e64adf2	Drivers: hv: vmbus: Use the vp_index map even for channels bound to CPU 0 Map target_cpu to target_vcpu using the mapping table. We should use the mapping table to transform guest CPU ID to VP Index as is done for the non-performance critical channels. While the value CPU 0 is special and will map to VP index 0, it is good to be consistent. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-06-01 10:56:31 +09:00
Vitaly Kuznetsov	4e4bd36f97	Drivers: hv: balloon: check if ha_region_mutex was acquired in MEM_CANCEL_ONLINE case Memory notifiers are being executed in a sequential order and when one of them fails returning something different from NOTIFY_OK the remainder of the notification chain is not being executed. When a memory block is being onlined in online_pages() we do memory_notify(MEM_GOING_ONLINE, ) and if one of the notifiers in the chain fails we end up doing memory_notify(MEM_CANCEL_ONLINE, ) so it is possible for a notifier to see MEM_CANCEL_ONLINE without seeing the corresponding MEM_GOING_ONLINE event. E.g. when CONFIG_KASAN is enabled the kasan_mem_notifier() is being used to prevent memory hotplug, it returns NOTIFY_BAD for all MEM_GOING_ONLINE events. As kasan_mem_notifier() comes before the hv_memory_notifier() in the notification chain we don't see the MEM_GOING_ONLINE event and we do not take the ha_region_mutex. We, however, see the MEM_CANCEL_ONLINE event and unconditionally try to release the lock, the following is observed: [ 110.850927] ===================================== [ 110.850927] [ BUG: bad unlock balance detected! ] [ 110.850927] 4.1.0-rc3_bugxxxxxxx_test_xxxx #595 Not tainted [ 110.850927] ------------------------------------- [ 110.850927] systemd-udevd/920 is trying to release lock (&dm_device.ha_region_mutex) at: [ 110.850927] [<ffffffff81acda0e>] mutex_unlock+0xe/0x10 [ 110.850927] but there are no more locks to release! At the same time we can have the ha_region_mutex taken when we get the MEM_CANCEL_ONLINE event in case one of the memory notifiers after the hv_memory_notifier() in the notification chain failed so we need to add the mutex_is_locked() check. In case of MEM_ONLINE we are always supposed to have the mutex locked. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-06-01 06:38:21 +09:00
Keith Mange	6c4e5f9c9f	Drivers: hv: vmbus:Update preferred vmbus protocol version to windows 10. Add support for Windows 10. Signed-off-by: Keith Mange <keith.mange@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-06-01 06:38:21 +09:00
Vitaly Kuznetsov	ce59fec836	Drivers: hv: vmbus: distribute subchannels among all vcpus Primary channels are distributed evenly across all vcpus we have. When the host asks us to create subchannels it usually makes us num_cpus-1 offers and we are supposed to distribute the work evenly among the channel itself and all its subchannels. Make sure they are all assigned to different vcpus. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:19:00 -07:00
Vitaly Kuznetsov	f38e7dd723	Drivers: hv: vmbus: move init_vp_index() call to vmbus_process_offer() We need to call init_vp_index() after we added the channel to the appropriate list (global or subchannel) to be able to use this information when assigning the channel to the particular vcpu. To do so we need to move a couple of functions around. The only real change is the init_vp_index() call. This is a small refactoring without a functional change. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:19:00 -07:00
Vitaly Kuznetsov	357e836a60	Drivers: hv: vmbus: decrease num_sc on subchannel removal It is unlikely that that host will ask us to close only one subchannel for a device but let's be consistent. Do both num_sc++ and num_sc-- with channel->lock to be on the safe side. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:19:00 -07:00
Vitaly Kuznetsov	8dfd332674	Drivers: hv: vmbus: unify calls to percpu_channel_enq() Remove some code duplication, no functional change intended. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:19:00 -07:00
Vitaly Kuznetsov	1959a28e26	Drivers: hv: vmbus: kill tasklets on module unload Explicitly kill tasklets we create on module unload. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:19:00 -07:00
Vitaly Kuznetsov	ffc151f3c8	Drivers: hv: vmbus: do cleanup on all vmbus_open() failure paths In case there was an error reported in the response to the CHANNELMSG_OPENCHANNEL call we need to do the cleanup as a vmbus_open() user won't be doing it after receiving an error. The cleanup should be done on all failure paths. We also need to avoid returning open_info->response.open_result.status as the return value as all other errors we return from vmbus_open() are -EXXX and vmbus_open() callers are not supposed to analyze host error codes. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:19:00 -07:00
K. Y. Srinivasan	2db84eff12	Drivers: hv: vmbus: Implement the protocol for tearing down vmbus state Implement the protocol for tearing down the monitor state established with the host. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:18:24 -07:00
Dexuan Cui	813c5b7958	hv: vmbus_free_channels(): remove the redundant free_channel() free_channel() has been invoked in vmbus_remove() -> hv_process_channel_removal(), or vmbus_remove() -> ... -> vmbus_close_internal() -> hv_process_channel_removal(). We also change to use list_for_each_entry_safe(), because the entry is removed in hv_process_channel_removal(). This patch fixes a bug in the vmbus unload path. Thank Dan Carpenter for finding the issue! Signed-off-by: Dexuan Cui <decui@microsoft.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:18:23 -07:00
Vitaly Kuznetsov	096c605feb	Drivers: hv: vmbus: unregister panic notifier on module unload Commit `96c1d0581d` ("Drivers: hv: vmbus: Add support for VMBus panic notifier handler") introduced atomic_notifier_chain_register() call on module load. We also need to call atomic_notifier_chain_unregister() on module unload as otherwise the following crash is observed when we bring hv_vmbus back: [ 39.788877] BUG: unable to handle kernel paging request at ffffffffa00078a8 [ 39.788877] IP: [<ffffffff8109d63f>] notifier_call_chain+0x3f/0x80 ... [ 39.788877] Call Trace: [ 39.788877] [<ffffffff8109de7d>] __atomic_notifier_call_chain+0x5d/0x90 ... [ 39.788877] [<ffffffff8109d788>] ? atomic_notifier_chain_register+0x38/0x70 [ 39.788877] [<ffffffff8109d767>] ? atomic_notifier_chain_register+0x17/0x70 [ 39.788877] [<ffffffffa002814f>] hv_acpi_init+0x14f/0x1000 [hv_vmbus] [ 39.788877] [<ffffffff81002144>] do_one_initcall+0xd4/0x210 Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:18:23 -07:00
Vitaly Kuznetsov	e4ecb41cf9	Drivers: hv: vmbus: introduce vmbus_acpi_remove In case we do request_resource() in vmbus_acpi_add() we need to tear it down to be able to load the driver again. Otherwise the following crash in observed when hv_vmbus unload/load sequence is performed on a Generation2 instance: [ 38.165701] BUG: unable to handle kernel paging request at ffffffffa00075a0 [ 38.166315] IP: [<ffffffff8107dc5f>] __request_resource+0x2f/0x50 [ 38.166315] PGD 1f34067 PUD 1f35063 PMD 3f723067 PTE 0 [ 38.166315] Oops: 0000 [#1] SMP [ 38.166315] Modules linked in: hv_vmbus(+) [last unloaded: hv_vmbus] [ 38.166315] CPU: 0 PID: 267 Comm: modprobe Not tainted 3.19.0-rc5_bug923184+ #486 [ 38.166315] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012 [ 38.166315] task: ffff88003f401cb0 ti: ffff88003f60c000 task.ti: ffff88003f60c000 [ 38.166315] RIP: 0010:[<ffffffff8107dc5f>] [<ffffffff8107dc5f>] __request_resource+0x2f/0x50 [ 38.166315] RSP: 0018:ffff88003f60fb58 EFLAGS: 00010286 Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:18:23 -07:00
Vitaly Kuznetsov	7c95912758	Drivers: hv: utils: unify driver registration reporting Unify driver registration reporting and move it to debug level as normally daemons write to syslog themselves and these kernel messages are useless. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:17:42 -07:00
Vitaly Kuznetsov	a4d1ee5b02	Drivers: hv: fcopy: full handshake support Introduce FCOPY_VERSION_1 to support kernel replying to the negotiation message with its own version. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:17:42 -07:00
Vitaly Kuznetsov	cd8dc05485	Drivers: hv: vss: full handshake support Introduce VSS_OP_REGISTER1 to support kernel replying to the negotiation message with its own version. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov	11bc3a5fa9	Drivers: hv: kvp: convert to hv_utils_transport Convert to hv_utils_transport to support both netlink and /dev/vmbus/hv_kvp communication methods. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov	c7e490fc23	Drivers: hv: fcopy: convert to hv_utils_transport Unify the code with the recently introduced hv_utils_transport. Netlink communication is disabled for fcopy. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov	6472f80a2e	Drivers: hv: vss: convert to hv_utils_transport Convert to hv_utils_transport to support both netlink and /dev/vmbus/hv_vss communication methods. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov	14b50f80c3	Drivers: hv: util: introduce hv_utils_transport abstraction The intention is to make KVP/VSS drivers work through misc char devices. Introduce an abstraction for kernel/userspace communication to make the migration smoother. Transport operational mode (netlink or char device) is determined by the first received message. To support driver upgrades the switch from netlink to chardev operational mode is supported. Every hv_util daemon is supposed to register 2 callbacks: 1) on_msg() to get notified when the userspace daemon sent a message; 2) on_reset() to get notified when the userspace daemon drops the connection. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov	3f0dccf86c	Drivers: hv: fcopy: set .owner reference for file operations Get an additional reference otherwise a crash is observed when hv_utils module is being unloaded while fcopy daemon is still running. .owner gives us an additional reference when someone holds a descriptor for the device. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov	4c93ccccf4	Drivers: hv: fcopy: switch to using the hvutil_device_state state machine Switch to using the hvutil_device_state state machine from using 3 different state variables: fcopy_transaction.active, opened, and in_hand_shake. State transitions are: -> HVUTIL_DEVICE_INIT when driver loads or on device release -> HVUTIL_READY if the handshake was successful -> HVUTIL_HOSTMSG_RECEIVED when there is a non-negotiation message from the host -> HVUTIL_USERSPACE_REQ after userspace daemon read the message -> HVUTIL_USERSPACE_RECV after/if userspace has replied -> HVUTIL_READY after we respond to the host -> HVUTIL_DEVICE_DYING on driver unload In hv_fcopy_onchannelcallback() process ICMSGTYPE_NEGOTIATE messages even when the userspace daemon is disconnected, otherwise we can make the host think we don't support FCOPY and disable the service completely. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov	086a6f68d6	Drivers: hv: vss: switch to using the hvutil_device_state state machine Switch to using the hvutil_device_state state machine from using kvp_transaction.active. State transitions are: -> HVUTIL_DEVICE_INIT when driver loads or on device release -> HVUTIL_READY if the handshake was successful -> HVUTIL_HOSTMSG_RECEIVED when there is a non-negotiation message from the host -> HVUTIL_USERSPACE_REQ after we sent the message to the userspace daemon -> HVUTIL_USERSPACE_RECV after/if the userspace daemon has replied -> HVUTIL_READY after we respond to the host -> HVUTIL_DEVICE_DYING on driver unload In hv_vss_onchannelcallback() process ICMSGTYPE_NEGOTIATE messages even when the userspace daemon is disconnected, otherwise we can make the host think we don't support VSS and disable the service completely. Unfortunately there is no good way we can figure out that the userspace daemon has died (unless we start treating all timeouts as such), add a protection against processing new VSS_OP_REGISTER messages while being in the middle of a transaction (HVUTIL_USERSPACE_REQ or HVUTIL_USERSPACE_RECV state). Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov	97bf16cd30	Drivers: hv: kvp: switch to using the hvutil_device_state state machine Switch to using the hvutil_device_state state machine from using 2 different state variables: kvp_transaction.active and in_hand_shake. State transitions are: -> HVUTIL_DEVICE_INIT when driver loads or on device release -> HVUTIL_READY if the handshake was successful -> HVUTIL_HOSTMSG_RECEIVED when there is a non-negotiation message from the host -> HVUTIL_USERSPACE_REQ after we sent the message to the userspace daemon -> HVUTIL_USERSPACE_RECV after/if the userspace daemon has replied -> HVUTIL_READY after we respond to the host -> HVUTIL_DEVICE_DYING on driver unload In hv_kvp_onchannelcallback() process ICMSGTYPE_NEGOTIATE messages even when the userspace daemon is disconnected, otherwise we can make the host think we don't support KVP and disable the service completely. Unfortunately there is no good way we can figure out that the userspace daemon has died (unless we start treating all timeouts as such). In case the daemon restarts we skip the negotiation procedure (so the daemon is supposed to has the same version). This behavior is unchanged from in_handshake approach. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov	636c88da6d	Drivers: hv: util: introduce state machine for util drivers KVP/VSS/FCOPY drivers work in fully serialized mode: we wait till userspace daemon registers, wait for a message from the host, send this message to the daemon, get the reply, send it back to host, wait for another message. Introduce enum hvutil_device_state to represend this state in all 3 drivers. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:17:41 -07:00
Vitaly Kuznetsov	1d072339fa	Drivers: hv: fcopy: rename fcopy_work -> fcopy_timeout_work 'fcopy_work' (and fcopy_work_func) is a misnomer as it sounds like we expect this useful work to happen and in reality it is just an emergency escape when timeout happens. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:17:40 -07:00
Vitaly Kuznetsov	68c8b39a0d	Drivers: hv: kvp: rename kvp_work -> kvp_timeout_work 'kvp_work' (and kvp_work_func) is a misnomer as it sounds like we expect this useful work to happen and in reality it is just an emergency escape when timeout happens. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:17:40 -07:00
Vitaly Kuznetsov	38c06c29ba	Drivers: hv: vss: process deferred messages when we complete the transaction In theory, the host is not supposed to issue any requests before be reply to the previous one. In KVP we, however, support the following scenarios: 1) A message was received before userspace daemon registered; 2) A message was received while the previous one is still being processed. In VSS we support only the former. Add support for the later, use hv_poll_channel() to do the job. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:17:40 -07:00
Vitaly Kuznetsov	242f31221d	Drivers: hv: fcopy: process deferred messages when we complete the transaction In theory, the host is not supposed to issue any requests before be reply to the previous one. In KVP we, however, support the following scenarios: 1) A message was received before userspace daemon registered; 2) A message was received while the previous one is still being processed. In FCOPY we support only the former. Add support for the later, use hv_poll_channel() to do the job. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:17:40 -07:00
Vitaly Kuznetsov	8efe78fdb1	Drivers: hv: kvp: move poll_channel() to hyperv_vmbus.h Move poll_channel() to hyperv_vmbus.h and make it inline and rename it to hv_poll_channel() so it can be reused in other hv_util modules. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:17:40 -07:00
Vitaly Kuznetsov	5fa97480b9	Drivers: hv: kvp: reset kvp_context We set kvp_context when we want to postpone receiving a packet from vmbus due to the previous transaction being unfinished. We, however, never reset this state, all consequent kvp_respond_to_host() calls will result in poll_channel() calling hv_kvp_onchannelcallback(). This doesn't cause real issues as: 1) Host is supposed to serialize transactions as well 2) If no message is pending vmbus_recvpacket() will return 0 recvlen. This is just a cleanup. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:17:40 -07:00
Vitaly Kuznetsov	3647a83d9d	Drivers: hv: util: move kvp/vss function declarations to hyperv_vmbus.h These declarations are internal to hv_util module and hv_fcopy_* declarations already reside there. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Alex Ng <alexng@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-05-24 12:17:40 -07:00
Vitaly Kuznetsov	797f88c987	Drivers: hv: hv_balloon: correctly handle num_pages>INT_MAX case balloon_wrk.num_pages is __u32 and it comes from host in struct dm_balloon where it is also __u32. We, however, use 'int' in balloon_up() and in case we happen to receive num_pages>INT_MAX request we'll end up allocating zero pages as 'num_pages < alloc_unit' check in alloc_balloon_pages() will pass. Change num_pages type to unsigned int. In real life ballooning request come with num_pages in [512, 32768] range so this is more a future-proof/cleanup. Reported-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-04-03 16:20:12 +02:00
Vitaly Kuznetsov	ba0c444153	Drivers: hv: hv_balloon: correctly handle val.freeram<num_pages case 'Drivers: hv: hv_balloon: refuse to balloon below the floor' fix does not correctly handle the case when val.freeram < num_pages as val.freeram is __kernel_ulong_t and the 'val.freeram - num_pages' value will be a huge positive value instead of being negative. Usually host doesn't ask us to balloon more than val.freeram but in case he have a memory hog started after we post the last pressure report we can get into troubles. Suggested-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-04-03 16:20:12 +02:00
Haiyang Zhang	e1c0d82dab	hv_vmbus: Add gradually increased delay for retries in vmbus_post_msg() Most of the retries can be done within a millisecond successfully, so we sleep 1ms before the first retry, then gradually increase the retry interval to 2^n with max value of 2048ms. Doing so, we will have shorter overall delay time, because most of the cases succeed within 1-2 attempts. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-04-03 16:18:02 +02:00
Vitaly Kuznetsov	0a1a86ac04	Drivers: hv: hv_balloon: survive ballooning request with num_pages=0 ... and simplify alloc_balloon_pages() interface by removing redundant alloc_error from it. If we happen to enter balloon_up() with balloon_wrk.num_pages = 0 we will enter infinite 'while (!done)' loop as alloc_balloon_pages() will be always returning 0 and not setting alloc_error. We will also be sending a meaningless message to the host on every iteration. The 'alloc_unit == 1 && alloc_error -> num_ballooned == 0' change and alloc_error elimination requires a special comment. We do alloc_balloon_pages() with 2 different alloc_unit values and there are 4 different alloc_balloon_pages() results, let's check them all. alloc_unit = 512: 1) num_ballooned = 0, alloc_error = 0: we do 'alloc_unit=1' and retry pre- and post-patch. 2) num_ballooned > 0, alloc_error = 0: we check 'num_ballooned == num_pages' and act accordingly, pre- and post-patch. 3) num_ballooned > 0, alloc_error > 0: we report this chunk and remain within the loop, no changes here. 4) num_ballooned = 0, alloc_error > 0: we do 'alloc_unit=1' and retry pre- and post-patch. alloc_unit = 1: 1) num_ballooned = 0, alloc_error = 0: this can happen in two cases: when we passed 'num_pages=0' to alloc_balloon_pages() or when there was no space in bl_resp to place a single response. The second option is not possible as bl_resp is of PAGE_SIZE size and single response 'union dm_mem_page_range' is 8 bytes, but the first one is (in theory, I think that Hyper-V host never places such requests). Pre-patch code loops forever, post-patch code sends a reply with more_pages = 0 and finishes. 2) num_ballooned > 0, alloc_error = 0: we ran out of space in bl_resp, we report partial success and remain within the loop, no changes pre- and post-patch. 3) num_ballooned > 0, alloc_error > 0: pre-patch code finishes, post-patch code does one more try and if there is no progress (we finish with 'num_ballooned = 0') we finish. So we try a bit harder with this patch. 4) num_ballooned = 0, alloc_error > 0: both pre- and post-patch code enter 'more_pages = 0' branch and finish. So this patch has two real effects: 1) We reply with an empty response to 'num_pages=0' request. 2) We try a bit harder on alloc_unit=1 allocations (and reply with an empty tail reply in case we fail). An empty reply should be supported by host as we were able to send it even with pre-patch code when we were not able to allocate a single page. Suggested-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-04-03 16:18:02 +02:00
Vitaly Kuznetsov	7fb0e1a650	Drivers: hv: hv_balloon: eliminate jumps in piecewiese linear floor function Commit `79208c57da` ("Drivers: hv: hv_balloon: Make adjustments in computing the floor") was inacurate as it introduced a jump in our piecewiese linear 'floor' function: At 2048MB we have: Left limit: 104 + 2048/8 = 360 Right limit: 256 + 2048/16 = 384 (so the right value is 232) We now have to make an adjustment at 8192 boundary: 232 + 8192/16 = 744 512 + 8192/32 = 768 (so the right value is 488) Suggested-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-04-03 16:18:02 +02:00
Vitaly Kuznetsov	d6cbd2c3a3	Drivers: hv: hv_balloon: do not online pages in offline blocks Currently we add memory in 128Mb blocks but the request from host can be aligned differently. In such case we add a partially backed block and when this block goes online we skip onlining pages which are not backed (hv_online_page() callback serves this purpose). When we receive next request for the same host add region we online pages which were not backed before with hv_bring_pgs_online(). However, we don't check if the the block in question was onlined and online this tail unconditionally. This is bad as we avoid all online_pages() logic: these pages are not accounted, we don't send notifications (and hv_balloon is not the only receiver of them),... And, first of all, nobody asked as to online these pages. Solve the issue by checking if the last previously backed page was onlined and onlining the tail only in case it was. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-04-03 16:18:02 +02:00
Dexuan Cui	aadc3780f3	hv: remove the per-channel workqueue It's not necessary any longer, since we can safely run the blocking message handlers in vmbus_connection.work_queue now. Signed-off-by: Dexuan Cui <decui@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-04-03 16:18:02 +02:00
Dexuan Cui	d43e2fe7da	hv: don't schedule new works in vmbus_onoffer()/vmbus_onoffer_rescind() Since the 2 fucntions can safely run in vmbus_connection.work_queue without hang, we don't need to schedule new work items into the per-channel workqueue. Actally we can even remove the per-channel workqueue now -- we'll do it in the next patch. Signed-off-by: Dexuan Cui <decui@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-04-03 16:18:02 +02:00
Dexuan Cui	652594c7df	hv: run non-blocking message handlers in the dispatch tasklet A work item in vmbus_connection.work_queue can sleep, waiting for a new host message (usually it is some kind of "completion" message). Currently the new message will be handled in the same workqueue, but since work items in the workqueue is serialized, we actually have no chance to handle the new message if the current work item is sleeping -- as as result, the current work item will hang forever. K. Y. has posted the below fix to resolve the issue: Drivers: hv: vmbus: Perform device register in the per-channel work element Actually we can simplify the fix by directly running non-blocking message handlers in the dispatch tasklet (inspired by K. Y.). This patch is the fundamental change. The following 2 patches will simplify the message offering and rescind-offering handling a lot. Signed-off-by: Dexuan Cui <decui@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-04-03 16:18:01 +02:00
K. Y. Srinivasan	73cffdb65e	Drivers: hv: vmbus: Don't wait after requesting offers Don't wait after sending request for offers to the host. This wait is unnecessary and simply adds 5 seconds to the boot time. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Cc: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2015-03-26 23:51:14 +01:00

1 2 3 4 5 ...

483 Commits