Restrict the protocol version(s) that will be negotiated with the host
to be 5.2 or greater if the guest is running isolated. This reduces the
footprint of the code that will be exercised by Confidential VMs and
hence the exposure to bugs and vulnerabilities.
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210201144814.2701-4-parri.andrea@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Only the VSCs or ICs that have been hardened and that are critical for
the successful adoption of Confidential VMs should be allowed if the
guest is running isolated. This change reduces the footprint of the
code that will be exercised by Confidential VMs and hence the exposure
to bugs and vulnerabilities.
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210201144814.2701-3-parri.andrea@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
When a Linux VM runs on Hyper-V, if the host toolstack doesn't support
hibernation for the VM (this happens on old Hyper-V hosts like Windows
Server 2016, or new Hyper-V hosts if the admin or user doesn't declare
the hibernation intent for the VM), the VM is discouraged from trying
hibernation (because the host doesn't guarantee that the VM's virtual
hardware configuration will remain exactly the same across hibernation),
i.e. the VM should not try to set up the swap partition/file for
hibernation, etc.
x86 Hyper-V uses the presence of the virtual ACPI S4 state as the
indication of the host toolstack support for a VM. Currently there is
no easy and reliable way for the userspace to detect the presence of
the state (see https://lkml.org/lkml/2020/12/11/1097). Add
/sys/bus/vmbus/hibernation for this purpose.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210107014552.14234-1-decui@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
An erroneous or malicious host could send multiple rescind messages for
a same channel. In vmbus_onoffer_rescind(), the guest maps the channel
ID to obtain a pointer to the channel object and it eventually releases
such object and associated data. The host could time rescind messages
and lead to an use-after-free. Add a new flag to the channel structure
to make sure that only one instance of vmbus_onoffer_rescind() can get
the reference to the channel object.
Reported-by: Juan Vazquez <juvazq@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20201209070827.29335-6-parri.andrea@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
When channel->device_obj is non-NULL, vmbus_onoffer_rescind() could
invoke put_device(), that will eventually release the device and free
the channel object (cf. vmbus_device_release()). However, a pointer
to the object is dereferenced again later to load the primary_channel.
The use-after-free can be avoided by noticing that this load/check is
redundant if device_obj is non-NULL: primary_channel must be NULL if
device_obj is non-NULL, cf. vmbus_add_channel_work().
Fixes: 54a66265d6 ("Drivers: hv: vmbus: Fix rescind handling")
Reported-by: Juan Vazquez <juvazq@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20201209070827.29335-5-parri.andrea@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Since the message is in memory shared with the host, an erroneous or a
malicious Hyper-V could 'corrupt' the message while vmbus_on_msg_dpc()
or individual message handlers are executing. To prevent it, copy the
message into private memory.
Reported-by: Juan Vazquez <juvazq@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20201209070827.29335-4-parri.andrea@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Simplify the function by removing various references to the hv_message
'msg', introduce local variables 'msgtype' and 'payload_size'.
Suggested-by: Juan Vazquez <juvazq@microsoft.com>
Suggested-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20201209070827.29335-3-parri.andrea@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
__vmbus_open() and vmbus_teardown_gpadl() do not inizialite the memory
for the vmbus_channel_open_channel and the vmbus_channel_gpadl_teardown
objects they allocate respectively. These objects contain padding bytes
and fields that are left uninitialized and that are later sent to the
host, potentially leaking guest data. Zero initialize such fields to
avoid leaking sensitive information to the host.
Reported-by: Juan Vazquez <juvazq@microsoft.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20201209070827.29335-2-parri.andrea@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
For additional robustness in the face of Hyper-V errors or malicious
behavior, validate all values that originate from packets that Hyper-V
has sent to the guest in the host-to-guest ring buffer. Ensure that
invalid values cannot cause indexing off the end of the icversion_data
array in vmbus_prep_negotiate_resp().
Signed-off-by: Andres Beltran <lkmlabelt@gmail.com>
Co-developed-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20201109100704.9152-1-parri.andrea@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Pointers to ring-buffer packets sent by Hyper-V are used within the
guest VM. Hyper-V can send packets with erroneous values or modify
packet fields after they are processed by the guest. To defend
against these scenarios, return a copy of the incoming VMBus packet
after validating its length and offset fields in hv_pkt_iter_first().
In this way, the packet can no longer be modified by the host.
Signed-off-by: Andres Beltran <lkmlabelt@gmail.com>
Co-developed-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: netdev@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20201208045311.10244-1-parri.andrea@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Currently the kexec kernel can panic or hang due to 2 causes:
1) hv_cpu_die() is not called upon kexec, so the hypervisor corrupts the
old VP Assist Pages when the kexec kernel runs. The same issue is fixed
for hibernation in commit 421f090c81 ("x86/hyperv: Suspend/resume the
VP assist page for hibernation"). Now fix it for kexec.
2) hyperv_cleanup() is called too early. In the kexec path, the other CPUs
are stopped in hv_machine_shutdown() -> native_machine_shutdown(), so
between hv_kexec_handler() and native_machine_shutdown(), the other CPUs
can still try to access the hypercall page and cause panic. The workaround
"hv_hypercall_pg = NULL;" in hyperv_cleanup() is unreliabe. Move
hyperv_cleanup() to a better place.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20201222065541.24312-1-decui@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Unlike virtio_balloon/virtio_mem/xen balloon drivers, Hyper-V balloon driver
does not adjust managed pages count when ballooning/un-ballooning and this leads
to incorrect stats being reported, e.g. unexpected 'free' output.
Note, the calculation in post_status() seems to remain correct: ballooned out
pages are never 'available' and we manually add dm->num_pages_ballooned to
'commited'.
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201202161245.2406143-3-vkuznets@redhat.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
'alloc_unit' in alloc_balloon_pages() is either '512' for 2M allocations or
'1' for 4k allocations. So
1 << get_order(alloc_unit << PAGE_SHIFT)
equals to 'alloc_unit' and the for loop basically sets all them offline.
Simplify the math to improve the readability.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201202161245.2406143-2-vkuznets@redhat.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Removes an obsolete TODO in the VMBus module and fixes a misleading typo
in the comment for the macro MAX_NUM_CHANNELS, where two digits have been
twisted.
Signed-off-by: Stefan Eschenbacher <stefan.eschenbacher@fau.de>
Co-developed-by: Max Stolze <max.stolze@fau.de>
Signed-off-by: Max Stolze <max.stolze@fau.de>
Link: https://lore.kernel.org/r/20201206104850.24843-1-stefan.eschenbacher@fau.de
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Checkpatch emits WARNING: quoted string split across lines.
To keep the code clean and with the 80 column length indentation the
check and registration code for kmsg_dump_register has been transferred
to a new function hv_kmsg_dump_register.
Signed-off-by: Matheus Castello <matheus@castello.eng.br>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20201125032926.17002-1-matheus@castello.eng.br
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Fixed checkpatch warning: MSLEEP: msleep < 20ms can sleep for up to
20ms; see Documentation/timers/timers-howto.rst
Signed-off-by: Matheus Castello <matheus@castello.eng.br>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20201115195734.8338-7-matheus@castello.eng.br
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Fixed checkpatch warning: Missing a blank line after declarations
checkpatch(LINE_SPACING)
Signed-off-by: Matheus Castello <matheus@castello.eng.br>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20201115195734.8338-4-matheus@castello.eng.br
Signed-off-by: Wei Liu <wei.liu@kernel.org>
This fixed the below checkpatch issue:
WARNING: Symbolic permissions 'S_IRUGO' are not preferred.
Consider using octal permissions '0444'.
Signed-off-by: Matheus Castello <matheus@castello.eng.br>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20201115195734.8338-3-matheus@castello.eng.br
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Fix the kernel parameter path in the comment, in the documentation the
parameter is correct but if someone who is studying the code and see
this first can get confused and try to access the wrong path/parameter
Signed-off-by: Matheus Castello <matheus@castello.eng.br>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20201115195734.8338-2-matheus@castello.eng.br
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Currently, VMbus drivers use pointers into guest memory as request IDs
for interactions with Hyper-V. To be more robust in the face of errors
or malicious behavior from a compromised Hyper-V, avoid exposing
guest memory addresses to Hyper-V. Also avoid Hyper-V giving back a
bad request ID that is then treated as the address of a guest data
structure with no validation. Instead, encapsulate these memory
addresses and provide small integers as request IDs.
Signed-off-by: Andres Beltran <lkmlabelt@gmail.com>
Co-developed-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Signed-off-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/r/20201109100402.8946-2-parri.andrea@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQFHBAABCAAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAl+ytzsTHHdlaS5saXVA
a2VybmVsLm9yZwAKCRB2FHBfkEGgXtgkCADLbUGTwl/XXWEMBVASxk9rX9s6ONoN
qoEZXZ6OcleziWmYoxqcyHUKcbNNmN31iKcw4wuld7jHQJSExcwxbPCYS2mAlBUb
urHbPgm7u0u+9rILQi1Qbp5fHP8uQAvDKxe8sKXXzDvnWUNNVSyKlv3nj0kyN8zi
SmpAszx5cdxXkyzwtnsL5GlUkVHyoGF03wMomcMnWgKZh4xsdIOQm5M0xrDFBqiY
Lu+GK62845ZZgIyop4AN74bPNNPWDV29SnU8GMN7neFELdiIOPI1QbDX65qn0QTT
W+oKtv52JVDkYLi7fTY5JUoM7O1eek3DFdvB9ig4QJdNdQ9YkJvnogsM
=1shq
-----END PGP SIGNATURE-----
Merge tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull Hyper-V fix from Wei Liu:
"One patch from Chris to fix kexec on Hyper-V"
* tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
Drivers: hv: vmbus: Allow cleanup of VMBUS_CONNECT_CPU if disconnected
When invoking kexec() on a Linux guest running on a Hyper-V host, the
kernel panics.
RIP: 0010:cpuhp_issue_call+0x137/0x140
Call Trace:
__cpuhp_remove_state_cpuslocked+0x99/0x100
__cpuhp_remove_state+0x1c/0x30
hv_kexec_handler+0x23/0x30 [hv_vmbus]
hv_machine_shutdown+0x1e/0x30
machine_shutdown+0x10/0x20
kernel_kexec+0x6d/0x96
__do_sys_reboot+0x1ef/0x230
__x64_sys_reboot+0x1d/0x20
do_syscall_64+0x6b/0x3d8
entry_SYSCALL_64_after_hwframe+0x44/0xa9
This was due to hv_synic_cleanup() callback returning -EBUSY to
cpuhp_issue_call() when tearing down the VMBUS_CONNECT_CPU, even
if the vmbus_connection.conn_state = DISCONNECTED. hv_synic_cleanup()
should succeed in the case where vmbus_connection.conn_state
is DISCONNECTED.
Fix is to add an extra condition to test for
vmbus_connection.conn_state == CONNECTED on the VMBUS_CONNECT_CPU and
only return early if true. This way the kexec() path can still shut
everything down while preserving the initial behavior of preventing
CPU offlining on the VMBUS_CONNECT_CPU while the VM is running.
Fixes: 8a857c5542 ("Drivers: hv: vmbus: Always handle the VMBus messages on CPU0")
Signed-off-by: Chris Co <chrco@microsoft.com>
Reviewed-by: Andrea Parri (Microsoft) <parri.andrea@gmail.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201110190118.15596-1-chrco@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
It is not an error if the host requests to balloon down, but the VM
refuses to do so. Without this change a warning is logged in dmesg
every five minutes.
Fixes: b3bb97b8a4 ("Drivers: hv: balloon: Add logging for dynamic memory operations")
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20201008071216.16554-1-olaf@aepfle.de
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Merge more updates from Andrew Morton:
"155 patches.
Subsystems affected by this patch series: mm (dax, debug, thp,
readahead, page-poison, util, memory-hotplug, zram, cleanups), misc,
core-kernel, get_maintainer, MAINTAINERS, lib, bitops, checkpatch,
binfmt, ramfs, autofs, nilfs, rapidio, panic, relay, kgdb, ubsan,
romfs, and fault-injection"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (155 commits)
lib, uaccess: add failure injection to usercopy functions
lib, include/linux: add usercopy failure capability
ROMFS: support inode blocks calculation
ubsan: introduce CONFIG_UBSAN_LOCAL_BOUNDS for Clang
sched.h: drop in_ubsan field when UBSAN is in trap mode
scripts/gdb/tasks: add headers and improve spacing format
scripts/gdb/proc: add struct mount & struct super_block addr in lx-mounts command
kernel/relay.c: drop unneeded initialization
panic: dump registers on panic_on_warn
rapidio: fix the missed put_device() for rio_mport_add_riodev
rapidio: fix error handling path
nilfs2: fix some kernel-doc warnings for nilfs2
autofs: harden ioctl table
ramfs: fix nommu mmap with gaps in the page cache
mm: remove the now-unnecessary mmget_still_valid() hack
mm/gup: take mmap_lock in get_dump_page()
binfmt_elf, binfmt_elf_fdpic: use a VMA list snapshot
coredump: rework elf/elf_fdpic vma_dump_size() into common helper
coredump: refactor page range dumping into common helper
coredump: let dump_emit() bail out on short writes
...
Let's try to merge system ram resources we add, to minimize the number of
resources in /proc/iomem. We don't care about the boundaries of
individual chunks we added.
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Leonardo Bras <leobras.c@gmail.com>
Cc: Libor Pechacek <lpechacek@suse.cz>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pingfan Liu <kernelfans@gmail.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lkml.kernel.org/r/20200911103459.10306-9-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We soon want to pass flags, e.g., to mark added System RAM resources.
mergeable. Prepare for that.
This patch is based on a similar patch by Oscar Salvador:
https://lkml.kernel.org/r/20190625075227.15193-3-osalvador@suse.de
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Juergen Gross <jgross@suse.com> # Xen related part
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Acked-by: Wei Liu <wei.liu@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Pingfan Liu <kernelfans@gmail.com>
Cc: Nathan Lynch <nathanl@linux.ibm.com>
Cc: Libor Pechacek <lpechacek@suse.cz>
Cc: Anton Blanchard <anton@ozlabs.org>
Cc: Leonardo Bras <leobras.c@gmail.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Julien Grall <julien@xen.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Link: https://lkml.kernel.org/r/20200911103459.10306-5-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
iQFHBAABCAAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAl+IfF4THHdlaS5saXVA
a2VybmVsLm9yZwAKCRB2FHBfkEGgXgdnCACUnI5sKbEN/uEvWz4JGJzTSwr20VHt
FkzpbeS4A9vHgl4hXVvGc4eMrwF/RtWY6RrLlJauZSQA1mjU0paAjf2noBFYX41m
zHX6f8awJrPd0cFChrOKcAlPnQy5OHYTJb7id2EakGGIrd0rmR/TkVAdEku23SDD
N7wheh5dVLnkSPwfiERz8Iq0CswMrSjgTljKnwU7XqUqwcNt+7rLRDFAH/M3NG/x
omBrWO8k6t2r0h4otqCQZIyCSLwPO+Wdb9BSaA147eOFHHbhqZlHNJYjIkMROZau
CJn7S0nZorsAUvka3l7W8nyMQmK4PXOh36bwkXzpkV4b+lgit0euXIzA
=H2vc
-----END PGP SIGNATURE-----
Merge tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull another Hyper-V update from Wei Liu:
"One patch from Michael to get VMbus interrupt from ACPI DSDT"
* tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
Drivers: hv: vmbus: Add parsing of VMbus interrupt in ACPI DSDT
On ARM64, Hyper-V now specifies the interrupt to be used by VMbus
in the ACPI DSDT. This information is not used on x86 because the
interrupt vector must be hardcoded. But update the generic
VMbus driver to do the parsing and pass the information to the
architecture specific code that sets up the Linux IRQ. Update
consumers of the interrupt to get it from an architecture specific
function.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1597434304-40631-1-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQFHBAABCAAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAl+FqrsTHHdlaS5saXVA
a2VybmVsLm9yZwAKCRB2FHBfkEGgXnN8B/4sRg7j9OTzVBlDiXF2vj6vbuplTIH6
JR6S5f4PNjUg4gV6ghzSnsx1zqNhPSOr78zDqYto8vv+wqqj3thmld8+gAnSbKtt
yoAa7mhbbN1ryJiwPlZzvX4ApzGZPC7byqEi3+zPIcag6TEl8eyYJOmvY3x1zv8x
CsAb57oCC4erD0n4xlTyfuc8TLpO+EiU53PXbR9AovKQHe4m2/8LWyEbmrm5cRUR
gx8RxoLkkrqK0unzcmanbm47QodiaOTUpycs3IvaBeWZQsqSgFZdI1RAdTZNg+U+
GT8eMRXAwpgDpilPm/0n1O0PKGAsVh9Lbw8Btb/ggqnjTUlA4Z3Df23E
=Wy5n
-----END PGP SIGNATURE-----
Merge tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull Hyper-V updates from Wei Liu:
- a series from Boqun Feng to support page size larger than 4K
- a few miscellaneous clean-ups
* tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
hv: clocksource: Add notrace attribute to read_hv_sched_clock_*() functions
x86/hyperv: Remove aliases with X64 in their name
PCI: hv: Document missing hv_pci_protocol_negotiation() parameter
scsi: storvsc: Support PAGE_SIZE larger than 4K
Driver: hv: util: Use VMBUS_RING_SIZE() for ringbuffer sizes
HID: hyperv: Use VMBUS_RING_SIZE() for ringbuffer sizes
Input: hyperv-keyboard: Use VMBUS_RING_SIZE() for ringbuffer sizes
hv_netvsc: Use HV_HYP_PAGE_SIZE for Hyper-V communication
hv: hyperv.h: Introduce some hvpfn helper functions
Drivers: hv: vmbus: Move virt_to_hvpfn() to hyperv header
Drivers: hv: Use HV_HYP_PAGE in hv_synic_enable_regs()
Drivers: hv: vmbus: Introduce types of GPADL
Drivers: hv: vmbus: Move __vmbus_open()
Drivers: hv: vmbus: Always use HV_HYP_PAGE_SIZE for gpadl
drivers: hv: remove cast from hyperv_die_event
For a Hyper-V vmbus, the size of the ringbuffer has two requirements:
1) it has to take one PAGE_SIZE for the header
2) it has to be PAGE_SIZE aligned so that double-mapping can work
VMBUS_RING_SIZE() could calculate a correct ringbuffer size which
fulfills both requirements, therefore use it to make sure vmbus work
when PAGE_SIZE != HV_HYP_PAGE_SIZE (4K).
Note that since the argument for VMBUS_RING_SIZE() is the size of
payload (data part), so it will be minus 4k (the size of header when
PAGE_SIZE = 4k) than the original value to keep the ringbuffer total
size unchanged when PAGE_SIZE = 4k.
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Cc: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200916034817.30282-11-boqun.feng@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
There will be more places other than vmbus where we need to calculate
the Hyper-V page PFN from a virtual address, so move virt_to_hvpfn() to
hyperv generic header.
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200916034817.30282-6-boqun.feng@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Both the base_*_gpa should use the guest page number in Hyper-V page, so
use HV_HYP_PAGE instead of PAGE.
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200916034817.30282-5-boqun.feng@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
This patch introduces two types of GPADL: HV_GPADL_{BUFFER, RING}. The
types of GPADL are purely the concept in the guest, IOW the hypervisor
treat them as the same.
The reason of introducing the types for GPADL is to support guests whose
page size is not 4k (the page size of Hyper-V hypervisor). In these
guests, both the headers and the data parts of the ringbuffers need to
be aligned to the PAGE_SIZE, because 1) some of the ringbuffers will be
mapped into userspace and 2) we use "double mapping" mechanism to
support fast wrap-around, and "double mapping" relies on ringbuffers
being page-aligned. However, the Hyper-V hypervisor only uses 4k
(HV_HYP_PAGE_SIZE) headers. Our solution to this is that we always make
the headers of ringbuffers take one guest page and when GPADL is
established between the guest and hypervisor, the only first 4k of
header is used. To handle this special case, we need the types of GPADL
to differ different guest memory usage for GPADL.
Type enum is introduced along with several general interfaces to
describe the differences between normal buffer GPADL and ringbuffer
GPADL.
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200916034817.30282-4-boqun.feng@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Pure function movement, no functional changes. The move is made, because
in a later change, __vmbus_open() will rely on some static functions
afterwards, so we separate the move and the modification of
__vmbus_open() in two patches to make it easy to review.
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200916034817.30282-3-boqun.feng@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Since the hypervisor always uses 4K as its page size, the size of PFNs
used for gpadl should be HV_HYP_PAGE_SIZE rather than PAGE_SIZE, so
adjust this accordingly as the preparation for supporting 16K/64K page
size guests. No functional changes on x86, since PAGE_SIZE is always 4k
(equals to HV_HYP_PAGE_SIZE).
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200916034817.30282-2-boqun.feng@gmail.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQFHBAABCAAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAl9grKITHHdlaS5saXVA
a2VybmVsLm9yZwAKCRB2FHBfkEGgXu7gB/4+uML5b8rd06pJ3+NC53Mh5ooePvL5
CdZsLJP+reK8WeG4s3sXSs4kgzwZQcDFEC6k4STvM7B6fOEhvpi2Y7kuhayy75X4
6mZ7qoAnVvy83RZxAxDdgr0PuPRpL8m3cq1C+2hgOe2JaShhLVOC35IZqcBFKxrL
pp6blm7BFC2ST93JKQ3rLT0dhuT2CyLVkXheXbUK+36UnR1OxmgZLVp3PuPXVjdm
1zEkSjWCPAV/I78U7GOoLFx+sUgJwuG+owRQ1o8ZldMO/yj2gfCzfsSDivnjgB10
Q8ihUYZS2jBxg3L5y1pSYwgkOE/Q0iBRG91PueSq0c06mSuKp8COI0xG
=VCYI
-----END PGP SIGNATURE-----
Merge tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv fixes from Wei Liu:
"Two patches from Michael and Dexuan to fix vmbus hanging issues"
* tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
Drivers: hv: vmbus: Add timeout to vmbus_wait_for_unload
Drivers: hv: vmbus: hibernation: do not hang forever in vmbus_bus_resume()
vmbus_wait_for_unload() looks for a CHANNELMSG_UNLOAD_RESPONSE message
coming from Hyper-V. But if the message isn't found for some reason,
the panic path gets hung forever. Add a timeout of 10 seconds to prevent
this.
Fixes: 415719160d ("Drivers: hv: vmbus: avoid scheduling in interrupt context in vmbus_initiate_unload()")
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/1600026449-23651-1-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
After we Stop and later Start a VM that uses Accelerated Networking (NIC
SR-IOV), currently the VF vmbus device's Instance GUID can change, so after
vmbus_bus_resume() -> vmbus_request_offers(), vmbus_onoffer() can not find
the original vmbus channel of the VF, and hence we can't complete()
vmbus_connection.ready_for_resume_event in check_ready_for_resume_event(),
and the VM hangs in vmbus_bus_resume() forever.
Fix the issue by adding a timeout, so the resuming can still succeed, and
the saved state is not lost, and according to my test, the user can disable
Accelerated Networking and then will be able to SSH into the VM for
further recovery. Also prevent the VM in question from suspending again.
The host will be fixed so in future the Instance GUID will stay the same
across hibernation.
Fixes: d8bd2d442b ("Drivers: hv: vmbus: Resume after fixing up old primary channels")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200905025555.45614-1-decui@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQFHBAABCAAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAl9GaMcTHHdlaS5saXVA
a2VybmVsLm9yZwAKCRB2FHBfkEGgXi1yCADJRrElI0kP5KAJJ6dQCXzhs8PX5rZG
rDNguVWy3Gocsngrn+8fQVHGdQmO/Y0nqVo3Y4zV1NtWx4MqO6pFtDc1kCYOSgws
bgHLwm16qHYDar1oBrplgEny1U9FVs4zOIw7diUJeOctX7TDu2MXrjl9F+XvlGqJ
xf90H/h8VdODh0rOWY5i6+RuM/ztcVwvqjne/uxhx5Gl/sO+Piwp18AL7C2ItOC4
b+ZnM3c1plAetTqN1taGiYTqlKCdvSoDtVkySMseLPYeVoVt3CwZTNiX6GlWJzbw
RujSrXimOmsxWEVi0qsI0PG6rQRCO+6ojQHEj7amjBR7C1yVnxKeJCM9
=/xNi
-----END PGP SIGNATURE-----
Merge tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv fixes from Wei Liu:
"Two patches from Vineeth to improve Hyper-V timesync facility"
* tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
hv_utils: drain the timesync packets on onchannelcallback
hv_utils: return error if host timesysnc update is stale
There could be instances where a system stall prevents the timesync
packets to be consumed. And this might lead to more than one packet
pending in the ring buffer. Current code empties one packet per callback
and it might be a stale one. So drain all the packets from ring buffer
on each callback.
Signed-off-by: Vineeth Pillai <viremana@linux.microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200821152849.99517-1-viremana@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
If for any reason, host timesync messages were not processed by
the guest, hv_ptp_gettime() returns a stale value and the
caller (clock_gettime, PTP ioctl etc) has no means to know this
now. Return an error so that the caller knows about this.
Signed-off-by: Vineeth Pillai <viremana@linux.microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200821152523.99364-1-viremana@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQFHBAABCAAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAl82Y6cTHHdlaS5saXVA
a2VybmVsLm9yZwAKCRB2FHBfkEGgXlcxCACP21ZI7RZvQBcFtTj5MWa0uwoofFqF
JDG0MvZ5zFKIJFX0pwlZIUrZY5aVJ1NwCDgCI0EXbZEazTaNCD2knFPqrLe3WUFY
mSDF9df7oW9UvTe9L4g3rYAdqsrkbgqhBypm9Vpbcazg/Ki6QVCgAhIo1lbq62+m
J2/0kLO1lVY6opr6vyobaWbm/Y4b0fbrx7N6KwUDhZUYGLGKaOc+WvsZinNl4XW6
VPiEVQUApvVxwG43rLNXjPe83DtassJ2GevSS1whXnZ+K0bViWhyYicbqEl9iV1i
nlNIkEMX5A1rdwV1zEAGyY/zWi+fi2+IdKGGEbtyUsely1vHtZuaDCiQ
=DE2Y
-----END PGP SIGNATURE-----
Merge tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyper-v fixes from Wei Liu:
- fix oops reporting on Hyper-V
- make objtool happy
* tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
x86/hyperv: Make hv_setup_sched_clock inline
Drivers: hv: vmbus: Only notify Hyper-V for die events that are oops
Hyper-V currently may be notified of a panic for any die event. But
this results in false panic notifications for various user space traps
that are die events. Fix this by ignoring die events that aren't oops.
Fixes: 510f7aef65 ("Drivers: hv: vmbus: prefer 'die' notification chain to 'panic'")
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/1596730935-11564-1-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQFHBAABCAAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAl8pNiATHHdlaS5saXVA
a2VybmVsLm9yZwAKCRB2FHBfkEGgXuoQCACiOoe/FKHAYZDR6fZzwZSNVJueYu1o
Pnwj00EjeDWM/M4kmF7u0PVv0v8PlP7fi2b0nvGmOeqwFS4n7Xr6mdDO1ZxRDzpd
QW9ZoLc8OUpg9UU9WlEqj4IqRalwKsZi0eX3P2jGoeQCEXHcRu5AdjkDYk0s/LNq
GBAtpHEPADu/uhEVMyya1KrY9DXsgybPGbCml1pZpeNLPbph3m2ld1M0aHJXqap9
1qBykxPgCSkSd270XfUrHCkYariC/g7khnBP0zAkzsgt90uraqKoUNhihN8GFacF
eZ6oh45LWWUjCNIAzKBiLlrpt7nuOGQszeIZywAuyvkbo0pdlZ+hu1qB
=FMf5
-----END PGP SIGNATURE-----
Merge tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv updates from Wei Liu:
- A patch series from Andrea to improve vmbus code
- Two clean-up patches from Alexander and Randy
* tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
hyperv: hyperv.h: drop a duplicated word
tools: hv: change http to https in hv_kvp_daemon.c
Drivers: hv: vmbus: Remove the lock field from the vmbus_channel struct
scsi: storvsc: Introduce the per-storvsc_device spinlock
Drivers: hv: vmbus: Remove unnecessary channel->lock critical sections (sc_list updaters)
Drivers: hv: vmbus: Use channel_mutex in channel_vp_mapping_show()
Drivers: hv: vmbus: Remove unnecessary channel->lock critical sections (sc_list readers)
Drivers: hv: vmbus: Replace cpumask_test_cpu(, cpu_online_mask) with cpu_online()
Drivers: hv: vmbus: Remove the numa_node field from the vmbus_channel struct
Drivers: hv: vmbus: Remove the target_vp field from the vmbus_channel struct
When the kernel panics, one page of kmsg data may be collected and sent to
Hyper-V to aid in diagnosing the failure. The collected kmsg data typically
contains 50 to 100 lines, each of which has a log level prefix that isn't
very useful from a diagnostic standpoint. So tell kmsg_dump_get_buffer()
to not include the log level, enabling more information that *is* useful to
fit in the page.
Requesting in stable kernels, since many kernels running in production are
stable releases.
Cc: stable@vger.kernel.org
Signed-off-by: Joseph Salisbury <joseph.salisbury@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1593210497-114310-1-git-send-email-joseph.salisbury@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>