This patch introduces a function that reverses everything
done by vmbus_allocate_mmio(). Existing code just called
release_mem_region(). Future patches in this series
require a more complex sequence of actions, so this function
is introduced to wrap those actions.
Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In existing code, this tree of resources is created
in single-threaded code and never modified after it is
created, and thus needs no locking. This patch introduces
a semaphore for tree access, as other patches in this
series introduce run-time modifications of this resource
tree which can happen on multiple threads.
Signed-off-by: Jake Oshins <jakeo@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Implement APIs for in-place consumption of vmbus packets. Currently, each
packet is copied and processed one at a time and as part of processing
each packet we potentially may signal the host (if it is waiting for
room to produce a packet).
These APIs help batched in-place processing of vmbus packets.
We also optimize host signaling by having a separate API to signal
the end of in-place consumption. With netvsc using these APIs,
on an iperf run on average I see about 20X reduction in checks to
signal the host.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In preparation for implementing APIs for in-place consumption of VMBUS
packets, movve some ring buffer functionality into hyperv.h
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In preparation for moving some ring buffer functionality out of the
vmbus driver, export the API for signaling the host.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use the virt_xx barriers that have been defined for use in virtual machines.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Use the READ_ONCE macro to access variabes that can change asynchronously.
This is the recommended mechanism for dealing with "unsafe" compiler
optimizations.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Introduce separate functions for estimating how much can be read from
and written to the ring buffer.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On the consumer side, we have interrupt driven flow management of the
producer. It is sufficient to base the signaling decision on the
amount of space that is available to write after the read is complete.
The current code samples the previous available space and uses this
in making the signaling decision. This state can be stale and is
unnecessary. Since the state can be stale, we end up not signaling
the host (when we should) and this can result in a hang. Fix this
problem by removing the unnecessary check. I would like to thank
Arseney Romanenko <arseneyr@microsoft.com> for pointing out this issue.
Also, issue a full memory barrier before making the signaling descision
to correctly deal with potential reordering of the write (read index)
followed by the read of pending_sz.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Tested-by: Dexuan Cui <decui@microsoft.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Here is the big char/misc driver update for 4.6-rc1.
The majority of the patches here is hwtracing and some new mic drivers,
but there's a lot of other driver updates as well. Full details in the
shortlog.
All have been in linux-next for a while with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iEYEABECAAYFAlbp9IcACgkQMUfUDdst+ykyJgCeLTC2QNGrh51kiJglkVJ0yD36
q4MAn0NkvSX2+iv5Jq8MaX6UQoRa4Nun
=MNjR
-----END PGP SIGNATURE-----
Merge tag 'char-misc-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc updates from Greg KH:
"Here is the big char/misc driver update for 4.6-rc1.
The majority of the patches here is hwtracing and some new mic
drivers, but there's a lot of other driver updates as well. Full
details in the shortlog.
All have been in linux-next for a while with no reported issues"
* tag 'char-misc-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (238 commits)
goldfish: Fix build error of missing ioremap on UM
nvmem: mediatek: Fix later provider initialization
nvmem: imx-ocotp: Fix return value of imx_ocotp_read
nvmem: Fix dependencies for !HAS_IOMEM archs
char: genrtc: replace blacklist with whitelist
drivers/hwtracing: make coresight-etm-perf.c explicitly non-modular
drivers: char: mem: fix IS_ERROR_VALUE usage
char: xillybus: Fix internal data structure initialization
pch_phub: return -ENODATA if ROM can't be mapped
Drivers: hv: vmbus: Support kexec on ws2012 r2 and above
Drivers: hv: vmbus: Support handling messages on multiple CPUs
Drivers: hv: utils: Remove util transport handler from list if registration fails
Drivers: hv: util: Pass the channel information during the init call
Drivers: hv: vmbus: avoid unneeded compiler optimizations in vmbus_wait_for_unload()
Drivers: hv: vmbus: remove code duplication in message handling
Drivers: hv: vmbus: avoid wait_for_completion() on crash
Drivers: hv: vmbus: don't loose HVMSG_TIMER_EXPIRED messages
misc: at24: replace memory_accessor with nvmem_device_read
eeprom: 93xx46: extend driver to plug into the NVMEM framework
eeprom: at25: extend driver to plug into the NVMEM framework
...
WS2012 R2 and above hosts can support kexec in that thay can support
reconnecting to the host (as would be needed in the kexec path)
on any CPU. Enable this. Pre ws2012 r2 hosts don't have this ability
and consequently cannot support kexec.
Signed-off-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Starting with Windows 2012 R2, message inteerupts can be delivered
on any VCPU in the guest. Support this functionality.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If util transport fails to initialize for any reason, the list of transport
handlers may become corrupted due to freeing the transport handler without
removing it from the list. Fix this by cleaning it up from the list.
Signed-off-by: Alex Ng <alexng@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pass the channel information to the util drivers that need to defer
reading the channel while they are processing a request. This would address
the following issue reported by Vitaly:
Commit 3cace4a616 ("Drivers: hv: utils: run polling callback always in
interrupt context") removed direct *_transaction.state = HVUTIL_READY
assignments from *_handle_handshake() functions introducing the following
race: if a userspace daemon connects before we get first non-negotiation
request from the server hv_poll_channel() won't set transaction state to
HVUTIL_READY as (!channel) condition will fail, we set it to non-NULL on
the first real request from the server.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Message header is modified by the hypervisor and we read it in a loop,
we need to prevent compilers from optimizing accesses. There are no such
optimizations at this moment, this is just a future proof.
Suggested-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Radim Kr.má<rkrcmar@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We have 3 functions dealing with messages and they all implement
the same logic to finalize reads, move it to vmbus_signal_eom().
Suggested-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Radim Kr.má<rkrcmar@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
wait_for_completion() may sleep, it enables interrupts and this
is something we really want to avoid on crashes because interrupt
handlers can cause other crashes. Switch to the recently introduced
vmbus_wait_for_unload() doing busy wait instead.
Reported-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Radim Kr.má<rkrcmar@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We must handle HVMSG_TIMER_EXPIRED messages in the interrupt context
and we offload all the rest to vmbus_on_msg_dpc() tasklet. This functions
loops to see if there are new messages pending. In case we'll ever see
HVMSG_TIMER_EXPIRED message there we're going to lose it as we can't
handle it from there. Avoid looping in vmbus_on_msg_dpc(), we're OK
with handling one message per interrupt.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Radim Kr.má<rkrcmar@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
VMBus hypercall codes inside Hyper-V UAPI header will
be used by QEMU to implement VMBus host devices support.
Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Joerg Roedel <joro@8bytes.org>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
[Do not rename the constant at the same time as moving it, as that
would cause semantic conflicts with the Hyper-V tree. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
On the channel send side, many of the VMBUS
device drivers explicity serialize access to the
outgoing ring buffer. Give more control to the
VMBUS device drivers in terms how to serialize
accesss to the outgoing ring buffer.
The default behavior will be to aquire the
ring lock to preserve the current behavior.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The function hv_ringbuffer_read() is called always on a pre-assigned
CPU. Each chnnel is bound to a specific CPU and this function is
always called on the CPU the channel is bound. There is no need to
acquire the spin lock; get rid of this overhead.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The hvsock driver needs this API to release all the resources related
to the channel.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This will be used by the coming hv_sock driver.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Only the coming hv_sock driver has a "true" value for this flag.
We treat the hvsock offers/channels as special VMBus devices.
Since the hv_sock driver handles all the hvsock offers/channels, we need to
tweak vmbus_match() for hv_sock driver, so we introduce this flag.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A function to send the type of message is also added.
The coming net/hvsock driver will use this function to proactively request
the host to offer a VMBus channel for a new hvsock connection.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When the hvsock channel's outbound ringbuffer is full (i.e.,
hv_ringbuffer_write() returns -EAGAIN), we should avoid the unnecessary
signaling the host.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
clocksource_change_rating() involves mutex usage and can't be called
in interrupt context. It also makes sense to avoid doing redundant work
on crash.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We have to call vmbus_initiate_unload() on crash to make kdump work but
the crash can also be happening in interrupt (e.g. Sysrq + c results in
such) where we can't schedule or the following will happen:
[ 314.905786] bad: scheduling from the idle thread!
Just skipping the wait (and even adding some random wait here) won't help:
to make host-side magic working we're supposed to receive CHANNELMSG_UNLOAD
(and actually confirm the fact that we received it) but we can't use
interrupt-base path (vmbus_isr()-> vmbus_on_msg_dpc()). Implement a simple
busy wait ignoring all the other messages and use it if we're in an
interrupt context.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When we pick a CPU to use for a new subchannel we try find a non-used one
on the appropriate NUMA node, we keep track of them with the
primary->alloced_cpus_in_node mask. Under normal circumstances we don't run
out of available CPUs but it is possible when we we don't initialize some
cpus in Linux, e.g. when we boot with 'nr_cpus=' limitation.
Avoid the infinite loop in init_vp_index() by checking that we still have
non-used CPUs in the alloced_cpus_in_node mask and resetting it in case
we don't.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add vendor and device attributes to VMBUS devices. These will be used
by Hyper-V tools as well user-level RDMA libraries that will use the
vendor/device tuple to discover the RDMA device.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cleanup vmbus_set_event() by inlining the hypercall to post
the event and since the return value of vmbus_set_event() is not checked,
make it void. As part of this cleanup, get rid of the function
hv_signal_event() as it is only callled from vmbus_set_event().
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Here's the big set of char/misc patches for 4.5-rc1.
Nothing major, lots of different driver subsystem updates, full details
in the shortlog. All of these have been in linux-next for a while.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iEYEABECAAYFAlaV0GsACgkQMUfUDdst+ymlPgCg07GSc4SOWlUL2V36vQ0kucoO
YjAAoMfeUEhsf/NJ7iaAMGQVUQKuYVqr
=BlqH
-----END PGP SIGNATURE-----
Merge tag 'char-misc-4.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc updates from Greg KH:
"Here's the big set of char/misc patches for 4.5-rc1.
Nothing major, lots of different driver subsystem updates, full
details in the shortlog. All of these have been in linux-next for a
while"
* tag 'char-misc-4.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (71 commits)
mei: fix fasync return value on error
parport: avoid assignment in if
parport: remove unneeded space
parport: change style of NULL comparison
parport: remove unnecessary out of memory message
parport: remove braces
parport: quoted strings should not be split
parport: code indent should use tabs
parport: fix coding style
parport: EXPORT_SYMBOL should follow function
parport: remove trailing white space
parport: fix a trivial typo
coresight: Fix a typo in Kconfig
coresight: checking for NULL string in coresight_name_match()
Drivers: hv: vmbus: Treat Fibre Channel devices as performance critical
Drivers: hv: utils: fix hvt_op_poll() return value on transport destroy
Drivers: hv: vmbus: fix the building warning with hyperv-keyboard
extcon: add Maxim MAX3355 driver
Drivers: hv: ring_buffer: eliminate hv_ringbuffer_peek()
Drivers: hv: remove code duplication between vmbus_recvpacket()/vmbus_recvpacket_raw()
...
For performance critical devices, we distribute the incoming
channel interrupt load across available CPUs in the guest.
Include Fibre channel devices in the set of devices for which
we would distribute the interrupt load.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The return type of hvt_op_poll() is unsigned int and -EBADF is
inappropriate, poll functions return POLL* statuses.
Reported-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This struct is required for Hyper-V SynIC timers implementation inside KVM
and for upcoming Hyper-V VMBus support by userspace(QEMU). So place it into
Hyper-V UAPI header.
Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This struct is required for Hyper-V SynIC timers implementation inside KVM
and for upcoming Hyper-V VMBus support by userspace(QEMU). So place it into
Hyper-V UAPI header.
Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This constant is required for Hyper-V SynIC timers MSR's
support by userspace(QEMU).
Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Acked-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, there is only one user for hv_ringbuffer_read()/
hv_ringbuffer_peak() functions and the usage of these functions is:
- insecure as we drop ring_lock between them, someone else (in theory
only) can acquire it in between;
- non-optimal as we do a number of things (acquire/release the above
mentioned lock, calculate available space on the ring, ...) twice and
this path is performance-critical.
Remove hv_ringbuffer_peek() moving the logic from __vmbus_recvpacket() to
hv_ringbuffer_read().
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
vmbus_recvpacket() and vmbus_recvpacket_raw() are almost identical but
there are two discrepancies:
1) vmbus_recvpacket() doesn't propagate errors from hv_ringbuffer_read()
which looks like it is not desired.
2) There is an error message printed in packetlen > bufferlen case in
vmbus_recvpacket(). I'm removing it as it is usless for users to see
such messages and /vmbus_recvpacket_raw() doesn't have it.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
hv_ringbuffer_peek() does the same as hv_ringbuffer_read() without
advancing the read index. The only functional change this patch brings
is moving hv_need_to_signal_on_read() call under the ring_lock but this
function is just a couple of comparisons.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Convert 6+-string comments repeating function names to normal kernel-style
comments and fix a couple of other comment style issues. No textual or
functional changes intended.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The crash is observed when a service is being disabled host side while
userspace daemon is connected to the device:
[ 90.244859] general protection fault: 0000 [#1] SMP
...
[ 90.800082] Call Trace:
[ 90.800082] [<ffffffff81187008>] __fput+0xc8/0x1f0
[ 90.800082] [<ffffffff8118716e>] ____fput+0xe/0x10
...
[ 90.800082] [<ffffffff81015278>] do_signal+0x28/0x580
[ 90.800082] [<ffffffff81086656>] ? finish_task_switch+0xa6/0x180
[ 90.800082] [<ffffffff81443ebf>] ? __schedule+0x28f/0x870
[ 90.800082] [<ffffffffa01ebbaa>] ? hvt_op_read+0x12a/0x140 [hv_utils]
...
The problem is that hvutil_transport_destroy() which does misc_deregister()
freeing the appropriate device is reachable by two paths: module unload
and from util_remove(). While module unload path is protected by .owner in
struct file_operations util_remove() path is not. Freeing the device while
someone holds an open fd for it is a show stopper.
In general, it is not possible to revoke an fd from all users so the only
way to solve the issue is to defer freeing the hvutil_transport structure.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When Hyper-V host asks us to remove some util driver by closing the
appropriate channel there is no easy way to force the current file
descriptor holder to hang up but we can start to respond -EBADF to all
operations asking it to exit gracefully.
As we're setting hvt->mode from two separate contexts now we need to use
a proper locking.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
As a preparation to reusing outmsg_lock to protect test-and-set openrations
on 'mode' rename it the more general 'lock'.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
inmsg should be freed in case of on_msg() failure to avoid memory leak.
Preserve the error code from on_msg().
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When the handshake with daemon is complete, we should poll the channel since
during the handshake, we will not be processing any messages. This is a
potential bug if the host is waiting for a response from the guest.
I would like to thank Dexuan for pointing this out.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Force all channel messages to be delivered on CPU0. These messages are not
performance critical and are used during the setup and teardown of the
channel.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hypervisor Top Level Functional Specification v3/4 says
that TSC page sequence value = -1(0xFFFFFFFF) is used to
indicate that TSC page no longer reliable source of reference
timer. Unfortunately, we found that Windows Hyper-V guest
side implementation uses sequence value = 0 to indicate
that Tsc page no longer valid. This is clearly visible
inside Windows 2012R2 ntoskrnl.exe HvlGetReferenceTime()
function dissassembly:
HvlGetReferenceTime proc near
xchg ax, ax
loc_1401C3132:
mov rax, cs:HvlpReferenceTscPage
mov r9d, [rax]
test r9d, r9d
jz short loc_1401C3176
rdtsc
mov rcx, cs:HvlpReferenceTscPage
shl rdx, 20h
or rdx, rax
mov rax, [rcx+8]
mov rcx, cs:HvlpReferenceTscPage
mov r8, [rcx+10h]
mul rdx
mov rax, cs:HvlpReferenceTscPage
add rdx, r8
mov ecx, [rax]
cmp ecx, r9d
jnz short loc_1401C3132
jmp short loc_1401C3184
loc_1401C3176:
mov ecx, 40000020h
rdmsr
shl rdx, 20h
or rdx, rax
loc_1401C3184:
mov rax, rdx
retn
HvlGetReferenceTime endp
This patch aligns Tsc page invalid sequence value with
Windows Hyper-V guest implementation which is more
compatible with both Hyper-V hypervisor and KVM hypervisor.
Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
CC: Haiyang Zhang <haiyangz@microsoft.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently we have two policies for deciding when to signal the host:
One based on the ring buffer state and the other based on what the
VMBUS client driver wants to do. Consider the case when the client
wants to explicitly control when to signal the host. In this case,
if the client were to defer signaling, we will not be able to signal
the host subsequently when the client does want to signal since the
ring buffer state will prevent the signaling. Implement logic to
have only one signaling policy in force for a given channel.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Tested-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: <stable@vger.kernel.org> # v4.2+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>