Go to file
Peter Zijlstra 2658687568 locking/qspinlock, x86: Provide liveness guarantee
commit 7aa54be297 upstream.

On x86 we cannot do fetch_or() with a single instruction and thus end up
using a cmpxchg loop, this reduces determinism. Replace the fetch_or()
with a composite operation: tas-pending + load.

Using two instructions of course opens a window we previously did not
have. Consider the scenario:

	CPU0		CPU1		CPU2

 1)	lock
	  trylock -> (0,0,1)

 2)			lock
			  trylock /* fail */

 3)	unlock -> (0,0,0)

 4)					lock
					  trylock -> (0,0,1)

 5)			  tas-pending -> (0,1,1)
			  load-val <- (0,1,0) from 3

 6)			  clear-pending-set-locked -> (0,0,1)

			  FAIL: _2_ owners

where 5) is our new composite operation. When we consider each part of
the qspinlock state as a separate variable (as we can when
_Q_PENDING_BITS == 8) then the above is entirely possible, because
tas-pending will only RmW the pending byte, so the later load is able
to observe prior tail and lock state (but not earlier than its own
trylock, which operates on the whole word, due to coherence).

To avoid this we need 2 things:

 - the load must come after the tas-pending (obviously, otherwise it
   can trivially observe prior state).

 - the tas-pending must be a full word RmW instruction, it cannot be an XCHGB for
   example, such that we cannot observe other state prior to setting
   pending.

On x86 we can realize this by using "LOCK BTS m32, r32" for
tas-pending followed by a regular load.

Note that observing later state is not a problem:

 - if we fail to observe a later unlock, we'll simply spin-wait for
   that store to become visible.

 - if we observe a later xchg_tail(), there is no difference from that
   xchg_tail() having taken place before the tas-pending.

Suggested-by: Will Deacon <will.deacon@arm.com>
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: andrea.parri@amarulasolutions.com
Cc: longman@redhat.com
Fixes: 59fb586b4a ("locking/qspinlock: Remove unbounded cmpxchg() loop from locking slowpath")
Link: https://lkml.kernel.org/r/20181003130957.183726335@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[bigeasy: GEN_BINARY_RMWcc macro redo]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2018-12-21 14:15:12 +01:00
arch locking/qspinlock, x86: Provide liveness guarantee 2018-12-21 14:15:12 +01:00
block block/bio: Do not zero user pages 2018-12-19 19:19:50 +01:00
certs export.h: remove VMLINUX_SYMBOL() and VMLINUX_SYMBOL_STR() 2018-08-22 23:21:44 +09:00
crypto crypto: do not free algorithm before using 2018-12-13 09:16:21 +01:00
Documentation x86/speculation: Provide IBPB always command line options 2018-12-05 19:32:04 +01:00
drivers dm zoned: Fix target BIO completion handling 2018-12-19 19:19:53 +01:00
firmware kbuild: remove all dummy assignments to obj- 2017-11-18 11:46:06 +09:00
fs fuse: continue to send FUSE_RELEASEDIR when FUSE_OPEN returns ENOSYS 2018-12-19 19:19:51 +01:00
include pstore/ram: Correctly calculate usable PRZ bytes 2018-12-17 09:24:39 +01:00
init sched/pelt: Fix warning and clean up IRQ PELT config 2018-12-19 19:19:49 +01:00
ipc ipc/shm.c: use ERR_CAST() for shm_lock() error return 2018-10-05 16:32:04 -07:00
kernel locking/qspinlock, x86: Provide liveness guarantee 2018-12-21 14:15:12 +01:00
lib debugobjects: avoid recursive calls with kmemleak 2018-12-17 09:24:41 +01:00
LICENSES LICENSES: Remove CC-BY-SA-4.0 license text 2018-10-18 11:28:50 +02:00
mm mm/page_alloc.c: fix calculation of pgdat->nr_zones 2018-12-17 09:24:40 +01:00
net tcp: lack of available data can also cause TSO defer 2018-12-17 09:24:42 +01:00
samples samples: disable CONFIG_SAMPLES for UML 2018-10-11 02:15:46 +09:00
scripts scripts/spdxcheck.py: always open files in binary mode 2018-12-19 19:19:50 +01:00
security selinux: add support for RTM_NEWCHAIN, RTM_DELCHAIN, and RTM_GETCHAIN 2018-12-08 12:59:08 +01:00
sound ALSA: hda/realtek - Fix the mute LED regresion on Lenovo X1 Carbon 2018-12-17 09:24:42 +01:00
tools bpf: fix off-by-one error in adjust_subprog_starts 2018-12-17 09:24:42 +01:00
usr initramfs: move gen_initramfs_list.sh from scripts/ to usr/ 2018-08-22 23:21:44 +09:00
virt KVM: arm64: Fix caching of host MDCR_EL2 value 2018-11-13 11:08:47 -08:00
.clang-format clang-format: Set IndentWrappedFunctionNames false 2018-08-01 18:38:51 +02:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore Kbuild updates for v4.17 (2nd) 2018-04-15 17:21:30 -07:00
.mailmap libnvdimm-for-4.19_misc 2018-08-25 18:13:10 -07:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS 9p: remove Ron Minnich from MAINTAINERS 2018-08-17 16:20:26 -07:00
Kbuild Kbuild updates for v4.15 2017-11-17 17:45:29 -08:00
Kconfig kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt 2018-08-02 08:06:55 +09:00
MAINTAINERS MAINTAINERS: Add Sasha as a stable branch maintainer 2018-12-01 09:37:25 +01:00
Makefile Linux 4.19.11 2018-12-19 19:19:54 +01:00
README Docs: Added a pointer to the formatted docs to README 2018-03-21 09:02:53 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.