Go to file
Michal Wnukowski f1ed3df20d nvme-pci: add a memory barrier to nvme_dbbuf_update_and_check_event
In many architectures loads may be reordered with older stores to
different locations.  In the nvme driver the following two operations
could be reordered:

 - Write shadow doorbell (dbbuf_db) into memory.
 - Read EventIdx (dbbuf_ei) from memory.

This can result in a potential race condition between driver and VM host
processing requests (if given virtual NVMe controller has a support for
shadow doorbell).  If that occurs, then the NVMe controller may decide to
wait for MMIO doorbell from guest operating system, and guest driver may
decide not to issue MMIO doorbell on any of subsequent commands.

This issue is purely timing-dependent one, so there is no easy way to
reproduce it. Currently the easiest known approach is to run "Oracle IO
Numbers" (orion) that is shipped with Oracle DB:

orion -run advanced -num_large 0 -size_small 8 -type rand -simulate \
	concat -write 40 -duration 120 -matrix row -testname nvme_test

Where nvme_test is a .lun file that contains a list of NVMe block
devices to run test against. Limiting number of vCPUs assigned to given
VM instance seems to increase chances for this bug to occur. On test
environment with VM that got 4 NVMe drives and 1 vCPU assigned the
virtual NVMe controller hang could be observed within 10-20 minutes.
That correspond to about 400-500k IO operations processed (or about
100GB of IO read/writes).

Orion tool was used as a validation and set to run in a loop for 36
hours (equivalent of pushing 550M IO operations). No issues were
observed. That suggest that the patch fixes the issue.

Fixes: f9f38e3338 ("nvme: improve performance for virtual NVMe devices")
Signed-off-by: Michal Wnukowski <wnukowski@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
[hch: updated changelog and comment a bit]
Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-08-28 08:40:42 +02:00
arch IOMMU Update for Linux v4.19 2018-08-24 13:10:38 -07:00
block block: bsg: move atomic_t ref_count variable to refcount API 2018-08-27 19:17:02 -06:00
certs Replace magic for trusting the secondary keyring with #define 2018-08-16 09:57:20 -07:00
crypto DMAengine updates for v4.19-rc1 2018-08-18 15:55:59 -07:00
Documentation Merge branch 'stable/for-jens-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-linus 2018-08-27 11:27:32 -06:00
drivers nvme-pci: add a memory barrier to nvme_dbbuf_update_and_check_event 2018-08-28 08:40:42 +02:00
firmware kbuild: remove all dummy assignments to obj- 2017-11-18 11:46:06 +09:00
fs This pull request contains a single fix for UBIFS: 2018-08-25 13:27:35 -07:00
include Merge branch 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata 2018-08-24 13:20:33 -07:00
init Merge branch 'akpm' (patches from Andrew) 2018-08-22 12:34:08 -07:00
ipc ipc/util.c: update return value of ipc_getref from int to bool 2018-08-22 10:52:52 -07:00
kernel Merge branch 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2018-08-24 13:19:27 -07:00
lib lib/fonts: convert comments to utf-8 2018-08-23 18:48:43 -07:00
LICENSES LICENSES: Add Linux-OpenIB license text 2018-04-27 16:41:53 -06:00
mm mm/cow: don't bother write protecting already write-protected pages 2018-08-25 13:15:03 -07:00
net Merge branch 'akpm' (patches from Andrew) 2018-08-23 19:20:12 -07:00
samples samples/bpf: all XDP samples should unload xdp/bpf prog on SIGTERM 2018-08-16 21:55:32 +02:00
scripts treewide: correct "differenciate" and "instanciate" typos 2018-08-23 18:48:43 -07:00
security + Cleanups 2018-08-24 13:00:33 -07:00
sound Merge branch 'akpm' (patches from Andrew) 2018-08-23 19:20:12 -07:00
tools treewide: convert ISO_8859-1 text comments to utf-8 2018-08-23 18:48:43 -07:00
usr kbuild: rename built-in.o to built-in.a 2018-03-26 02:01:19 +09:00
virt ARM: Support for Group0 interrupts in guests, Cache management 2018-08-22 13:52:44 -07:00
.clang-format clang-format: Set IndentWrappedFunctionNames false 2018-08-01 18:38:51 +02:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore Kbuild updates for v4.17 (2nd) 2018-04-15 17:21:30 -07:00
.mailmap Merge branch 'linus/master' into rdma.git for-next 2018-08-16 14:21:29 -06:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS 9p: remove Ron Minnich from MAINTAINERS 2018-08-17 16:20:26 -07:00
Kbuild Kbuild updates for v4.15 2017-11-17 17:45:29 -08:00
Kconfig kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt 2018-08-02 08:06:55 +09:00
MAINTAINERS libata: maintainership update 2018-08-25 12:35:45 -07:00
Makefile Updates for v4.19: 2018-08-20 18:32:00 -07:00
README Docs: Added a pointer to the formatted docs to README 2018-03-21 09:02:53 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.