Commit Graph

72 Commits

Author SHA1 Message Date
Linus Torvalds
105ff3cbf2 atomic: remove all traces of READ_ONCE_CTRL() and atomic*_read_ctrl()
This seems to be a mis-reading of how alpha memory ordering works, and
is not backed up by the alpha architecture manual.  The helper functions
don't do anything special on any other architectures, and the arguments
that support them being safe on other architectures also argue that they
are safe on alpha.

Basically, the "control dependency" is between a previous read and a
subsequent write that is dependent on the value read.  Even if the
subsequent write is actually done speculatively, there is no way that
such a speculative write could be made visible to other cpu's until it
has been committed, which requires validating the speculation.

Note that most weakely ordered architectures (very much including alpha)
do not guarantee any ordering relationship between two loads that depend
on each other on a control dependency:

    read A
    if (val == 1)
        read B

because the conditional may be predicted, and the "read B" may be
speculatively moved up to before reading the value A.  So we require the
user to insert a smp_rmb() between the two accesses to be correct:

    read A;
    if (A == 1)
        smp_rmb()
        read B

Alpha is further special in that it can break that ordering even if the
*address* of B depends on the read of A, because the cacheline that is
read later may be stale unless you have a memory barrier in between the
pointer read and the read of the value behind a pointer:

    read ptr
    read offset(ptr)

whereas all other weakly ordered architectures guarantee that the data
dependency (as opposed to just a control dependency) will order the two
accesses.  As a result, alpha needs a "smp_read_barrier_depends()" in
between those two reads for them to be ordered.

The coontrol dependency that "READ_ONCE_CTRL()" and "atomic_read_ctrl()"
had was a control dependency to a subsequent *write*, however, and
nobody can finalize such a subsequent write without having actually done
the read.  And were you to write such a value to a "stale" cacheline
(the way the unordered reads came to be), that would seem to lose the
write entirely.

So the things that make alpha able to re-order reads even more
aggressively than other weak architectures do not seem to be relevant
for a subsequent write.  Alpha memory ordering may be strange, but
there's no real indication that it is *that* strange.

Also, the alpha architecture reference manual very explicitly talks
about the definition of "Dependence Constraints" in section 5.6.1.7,
where a preceding read dominates a subsequent write.

Such a dependence constraint admittedly does not impose a BEFORE (alpha
architecture term for globally visible ordering), but it does guarantee
that there can be no "causal loop".  I don't see how you could avoid
such a loop if another cpu could see the stored value and then impact
the value of the first read.  Put another way: the read and the write
could not be seen as being out of order wrt other cpus.

So I do not see how these "x_ctrl()" functions can currently be necessary.

I may have to eat my words at some point, but in the absense of clear
proof that alpha actually needs this, or indeed even an explanation of
how alpha could _possibly_ need it, I do not believe these functions are
called for.

And if it turns out that alpha really _does_ need a barrier for this
case, that barrier still should not be "smp_read_barrier_depends()".
We'd have to make up some new speciality barrier just for alpha, along
with the documentation for why it really is necessary.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul E McKenney <paulmck@us.ibm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-03 17:22:17 -08:00
Linus Torvalds
d63a978865 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking changes from Ingo Molnar:
 "The main changes in this cycle were:

   - More gradual enhancements to atomic ops: new atomic*_read_ctrl()
     ops, synchronize atomic_{read,set}() ordering requirements between
     architectures, add atomic_long_t bitops.  (Peter Zijlstra)

   - Add _{relaxed|acquire|release}() variants for inc/dec atomics and
     use them in various locking primitives: mutex, rtmutex, mcs, rwsem.
     This enables weakly ordered architectures (such as arm64) to make
     use of more locking related optimizations.  (Davidlohr Bueso)

   - Implement atomic[64]_{inc,dec}_relaxed() on ARM.  (Will Deacon)

   - Futex kernel data cache footprint micro-optimization.  (Rasmus
     Villemoes)

   - pvqspinlock runtime overhead micro-optimization.  (Waiman Long)

   - misc smaller fixlets"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  ARM, locking/atomics: Implement _relaxed variants of atomic[64]_{inc,dec}
  locking/rwsem: Use acquire/release semantics
  locking/mcs: Use acquire/release semantics
  locking/rtmutex: Use acquire/release semantics
  locking/mutex: Use acquire/release semantics
  locking/asm-generic: Add _{relaxed|acquire|release}() variants for inc/dec atomics
  atomic: Implement atomic_read_ctrl()
  atomic, arch: Audit atomic_{read,set}()
  atomic: Add atomic_long_t bitops
  futex: Force hot variables into a single cache line
  locking/pvqspinlock: Kick the PV CPU unconditionally when _Q_SLOW_VAL
  locking/osq: Relax atomic semantics
  locking/qrwlock: Rename ->lock to ->wait_lock
  locking/Documentation/lockstat: Fix typo - lokcing -> locking
  locking/atomics, cmpxchg: Privatize the inclusion of asm/cmpxchg.h
2015-11-03 16:10:43 -08:00
Paul E. McKenney
ad2ad5d31f documentation: Add lockless_dereference()
The recently added lockless_dereference() macro is not present in the
Documentation/ directory, so this commit fixes that.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06 11:22:57 -07:00
Paul E. McKenney
27566139b6 documentation: No acquire/release for RCU readers
Documentation/memory-barriers.txt calls out RCU as one of the sets
of primitives associated with ACQUIRE and RELEASE.  There really
is an association in that rcu_assign_pointer() includes a RELEASE
operation, but a quick read can convince people that rcu_read_lock() and
rcu_read_unlock() have ACQUIRE and RELEASE semantics, which they do not.

This commit therefore removes RCU from this list in order to avoid
this confusion.

Reported-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2015-10-06 11:22:28 -07:00
Peter Zijlstra
e3e72ab80a atomic: Implement atomic_read_ctrl()
Provide atomic_read_ctrl() to mirror READ_ONCE_CTRL(), such that we can
more conveniently use atomics in control dependencies.

Since we can assume atomic_read() implies a READ_ONCE(), we must only
emit an extra smp_read_barrier_depends() in order to upgrade to
READ_ONCE_CTRL() semantics.

Requested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Cc: oleg@redhat.com
Link: http://lkml.kernel.org/r/20150918115637.GM3604@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-09-23 09:54:29 +02:00
Linus Torvalds
ca520cab25 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking and atomic updates from Ingo Molnar:
 "Main changes in this cycle are:

   - Extend atomic primitives with coherent logic op primitives
     (atomic_{or,and,xor}()) and deprecate the old partial APIs
     (atomic_{set,clear}_mask())

     The old ops were incoherent with incompatible signatures across
     architectures and with incomplete support.  Now every architecture
     supports the primitives consistently (by Peter Zijlstra)

   - Generic support for 'relaxed atomics':

       - _acquire/release/relaxed() flavours of xchg(), cmpxchg() and {add,sub}_return()
       - atomic_read_acquire()
       - atomic_set_release()

     This came out of porting qwrlock code to arm64 (by Will Deacon)

   - Clean up the fragile static_key APIs that were causing repeat bugs,
     by introducing a new one:

       DEFINE_STATIC_KEY_TRUE(name);
       DEFINE_STATIC_KEY_FALSE(name);

     which define a key of different types with an initial true/false
     value.

     Then allow:

       static_branch_likely()
       static_branch_unlikely()

     to take a key of either type and emit the right instruction for the
     case.  To be able to know the 'type' of the static key we encode it
     in the jump entry (by Peter Zijlstra)

   - Static key self-tests (by Jason Baron)

   - qrwlock optimizations (by Waiman Long)

   - small futex enhancements (by Davidlohr Bueso)

   - ... and misc other changes"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits)
  jump_label/x86: Work around asm build bug on older/backported GCCs
  locking, ARM, atomics: Define our SMP atomics in terms of _relaxed() operations
  locking, include/llist: Use linux/atomic.h instead of asm/cmpxchg.h
  locking/qrwlock: Make use of _{acquire|release|relaxed}() atomics
  locking/qrwlock: Implement queue_write_unlock() using smp_store_release()
  locking/lockref: Remove homebrew cmpxchg64_relaxed() macro definition
  locking, asm-generic: Add _{relaxed|acquire|release}() variants for 'atomic_long_t'
  locking, asm-generic: Rework atomic-long.h to avoid bulk code duplication
  locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operations
  locking, compiler.h: Cast away attributes in the WRITE_ONCE() magic
  locking/static_keys: Make verify_keys() static
  jump label, locking/static_keys: Update docs
  locking/static_keys: Provide a selftest
  jump_label: Provide a self-test
  s390/uaccess, locking/static_keys: employ static_branch_likely()
  x86, tsc, locking/static_keys: Employ static_branch_likely()
  locking/static_keys: Add selftest
  locking/static_keys: Add a new static_key interface
  locking/static_keys: Rework update logic
  locking/static_keys: Add static_key_{en,dis}able() helpers
  ...
2015-09-03 15:46:07 -07:00
Paul E. McKenney
12d560f4ea rcu,locking: Privatize smp_mb__after_unlock_lock()
RCU is the only thing that uses smp_mb__after_unlock_lock(), and is
likely the only thing that ever will use it, so this commit makes this
macro private to RCU.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>
2015-08-04 08:49:21 -07:00
Will Deacon
ed2de9f74e locking/Documentation: Clarify failed cmpxchg() memory ordering semantics
A failed cmpxchg does not provide any memory ordering guarantees, a
property that is used to optimise the cmpxchg implementations on Alpha,
PowerPC and arm64.

This patch updates atomic_ops.txt and memory-barriers.txt to reflect
this.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Douglas Hatch <doug.hatch@hp.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <waiman.long@hp.com>
Link: http://lkml.kernel.org/r/20150716151006.GH26390@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-03 10:57:09 +02:00
Paul E. McKenney
96d7744e0a doc: Call out smp_mb__after_unlock_lock() transitivity
Although "full barrier" should be interpreted as providing transitivity,
it is worth eliminating any possible confusion.  This commit therefore
adds "(including transitivity)" to eliminate any possible confusion.

Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-15 14:43:14 -07:00
Paul E. McKenney
9af194cefc documentation: Replace ACCESS_ONCE() by READ_ONCE() and WRITE_ONCE()
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-15 14:43:13 -07:00
Paul E. McKenney
57aecae950 documentation: Fix variable-name typo in memory-barriers.txt
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-07-15 14:43:11 -07:00
Linus Torvalds
1bf7067c6e Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "The main changes are:

   - 'qspinlock' support, enabled on x86: queued spinlocks - these are
     now the spinlock variant used by x86 as they outperform ticket
     spinlocks in every category.  (Waiman Long)

   - 'pvqspinlock' support on x86: paravirtualized variant of queued
     spinlocks.  (Waiman Long, Peter Zijlstra)

   - 'qrwlock' support, enabled on x86: queued rwlocks.  Similar to
     queued spinlocks, they are now the variant used by x86:

       CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
       CONFIG_QUEUED_SPINLOCKS=y
       CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
       CONFIG_QUEUED_RWLOCKS=y

   - various lockdep fixlets

   - various locking primitives cleanups, further WRITE_ONCE()
     propagation"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  locking/lockdep: Remove hard coded array size dependency
  locking/qrwlock: Don't contend with readers when setting _QW_WAITING
  lockdep: Do not break user-visible string
  locking/arch: Rename set_mb() to smp_store_mb()
  locking/arch: Add WRITE_ONCE() to set_mb()
  rtmutex: Warn if trylock is called from hard/softirq context
  arch: Remove __ARCH_HAVE_CMPXCHG
  locking/rtmutex: Drop usage of __HAVE_ARCH_CMPXCHG
  locking/qrwlock: Rename QUEUE_RWLOCK to QUEUED_RWLOCKS
  locking/pvqspinlock: Rename QUEUED_SPINLOCK to QUEUED_SPINLOCKS
  locking/pvqspinlock: Replace xchg() by the more descriptive set_mb()
  locking/pvqspinlock, x86: Enable PV qspinlock for Xen
  locking/pvqspinlock, x86: Enable PV qspinlock for KVM
  locking/pvqspinlock, x86: Implement the paravirt qspinlock call patching
  locking/pvqspinlock: Implement simple paravirt support for the qspinlock
  locking/qspinlock: Revert to test-and-set on hypervisors
  locking/qspinlock: Use a simple write to grab the lock
  locking/qspinlock: Optimize for smaller NR_CPUS
  locking/qspinlock: Extract out code snippets for the next patch
  locking/qspinlock: Add pending bit
  ...
2015-06-22 14:54:22 -07:00
Paul E. McKenney
0868aa2216 Merge branches 'array.2015.05.27a', 'doc.2015.05.27a', 'fixes.2015.05.27a', 'hotplug.2015.05.27a', 'init.2015.05.27a', 'tiny.2015.05.27a' and 'torture.2015.05.27a' into HEAD
array.2015.05.27a:  Remove all uses of RCU-protected array indexes.
doc.2015.05.27a:  Docuemntation updates.
fixes.2015.05.27a:  Miscellaneous fixes.
hotplug.2015.05.27a:  CPU-hotplug updates.
init.2015.05.27a:  Initialization/Kconfig updates.
tiny.2015.05.27a:  Updates to Tiny RCU.
torture.2015.05.27a:  Torture-testing updates.
2015-05-27 13:00:49 -07:00
Paul E. McKenney
5af4692a75 smp: Make control dependencies work on Alpha, improve documentation
The current formulation of control dependencies fails on DEC Alpha,
which does not respect dependencies of any kind unless an explicit
memory barrier is provided.  This means that the current fomulation of
control dependencies fails on Alpha.  This commit therefore creates a
READ_ONCE_CTRL() that has the same overhead on non-Alpha systems, but
causes Alpha to produce the needed ordering.  This commit also applies
READ_ONCE_CTRL() to the one known use of control dependencies.

Use of READ_ONCE_CTRL() also has the beneficial effect of adding a bit
of self-documentation to control dependencies.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2015-05-27 12:58:02 -07:00
Will Deacon
d956028e99 documentation: memory-barriers: Fix smp_mb__before_spinlock() semantics
Our current documentation claims that, when followed by an ACQUIRE,
smp_mb__before_spinlock() orders prior loads against subsequent loads
and stores, which isn't the intent.  This commit therefore fixes the
documentation to state that this sequence orders only prior stores
against subsequent loads and stores.

In addition, the original intent of smp_mb__before_spinlock() was to only
order prior loads against subsequent stores, however, people have started
using it as if it ordered prior loads against subsequent loads and stores.
This commit therefore also updates smp_mb__before_spinlock()'s header
comment to reflect this new reality.

Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-05-27 12:57:27 -07:00
Peter Zijlstra
b92b8b35a2 locking/arch: Rename set_mb() to smp_store_mb()
Since set_mb() is really about an smp_mb() -- not a IO/DMA barrier
like mb() rename it to match the recent smp_load_acquire() and
smp_store_release().

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-05-19 08:32:00 +02:00
Linus Torvalds
d6a24d0640 The documentation tree update for 4.1. Numerous fixes, the overdue removal
of the i2o docs, some new Chinese translations, and, hopefully, the README
 fix that will end the flow of identical patches to that file.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVMOeZAAoJEI3ONVYwIuV6/1YQAJwcVvd3ow6cMKuf8eRKMgjd
 crJUdRF9FTwyY21SRHBaonyiKOthnVedOUYnFQ5Z7jbII0EohJ//72nW0pQrHoGi
 0avkbM+ZAZzXfd/paOiZ5HtYkc8Bdc70mPU1fzfexnPm/JACOGznxQsob05r6/sT
 W1GyJcrLlp4uPrba9rhAdGtaa+mEFrt4SVCI+odXOnxQ/KOZSIu1n1F3bSSvL9zV
 YMBgRrCso+Cdtuhe4N3O1jsVy/hyOnvtqcUgwlD4VzElsshKvxdxHn47yeeWK1qI
 zZfGTv+q5QI/eZgIhBIrBpdCafgLipAmNZhX+M76xeNydhfhp60VizRgKfb6JiW1
 M+nLSnv2vxh4wgkJs7yWgW+8TJ0eCF/w2P/mBi6hDXuIul9gwgmNk+EZ7LONSh+2
 l3PV/dyswNs04qbYgFt2rygsxcg79RRbD54zi+S/3NU38/gh7nlidASKmNu1boG/
 KdZx3F0rmX/xQu4aQ5nIQl2N7sVLkEec+oN+ukQGyBTLVHkfAK06Z4EWUeYmmBKh
 M6gqRY5XJMtCm8D5bons/yZmwmpdZFZMxFGJ4enUwrfsJ8FQ8qy/KmFqF8SojGWQ
 HYs4ZUptz6SYa7K0Txe/q0pkrW+doy7t/Bz+JBNNdG7eLeHIpKhSqSlLIdB7MRKw
 NFA8a4PdgdMpf+Zr4bRy
 =LDbk
 -----END PGP SIGNATURE-----

Merge tag 'docs-for-linus' of git://git.lwn.net/linux-2.6

Pull documentation updates from Jonathan Corbet:
 "Numerous fixes, the overdue removal of the i2o docs, some new Chinese
  translations, and, hopefully, the README fix that will end the flow of
  identical patches to that file"

* tag 'docs-for-linus' of git://git.lwn.net/linux-2.6: (34 commits)
  Documentation/memcg: update memcg/kmem status
  Documentation: blackfin: Makefile: Typo building issue
  Documentation/vm/pagemap.txt: correct location of page-types tool
  Documentation/memory-barriers.txt: typo fix
  doc: Add guest_nice column to example output of `cat /proc/stat'
  Documentation/kernel-parameters: Move "eagerfpu" to its right place
  Documentation: gpio: Update ACPI part of the document to mention _DSD
  docs/completion.txt: Various tweaks and corrections
  doc: completion: context, scope and language fixes
  Documentation:Update Documentation/zh_CN/arm64/memory.txt
  Documentation:Update Documentation/zh_CN/arm64/booting.txt
  Documentation: Chinese translation of arm64/legacy_instructions.txt
  DocBook media: fix broken EIA hyperlink
  Documentation: tweak the maintainers entry
  README: Change gzip/bzip2 to xz compression format
  README: Update version number reference
  doc:pci: Fix typo in Documentation/PCI
  Documentation: drm: Use '->' when describing access through pointers.
  Documentation: Remove mentioning of block barriers
  Documentation/email-clients.txt: Fix one grammar mistake, add extra info about TB
  ...
2015-04-18 11:10:49 -04:00
Sylvain Trias
7a4580075d Documentation/memory-barriers.txt: typo fix
Fix an obvious typo in the documentation.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2015-04-08 22:23:48 +02:00
Paul E. McKenney
ff38281059 documentation: Clarify control-dependency pairing
This commit explicitly states that control dependencies pair normally
with other barriers, and gives an example of such pairing.

Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2015-02-26 11:57:32 -08:00
Davidlohr Bueso
d87510c5a6 documentation: Fix smp typo in memory-barriers.txt
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-07 08:59:32 -08:00
Paul E. McKenney
432fbf3c6a documentation: Record limitations of bitfields and small variables
This commit documents the fact that it is not safe to use bitfields
as shared variables in synchronization algorithms.  It also documents
that CPUs must be able to concurrently load from and store to adjacent
one-byte and two-byte variables, which is in fact required by the
C11 standard (Section 3.14).

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2015-01-07 08:57:09 -08:00
Alexander Duyck
1077fa36f2 arch: Add lightweight memory barriers dma_rmb() and dma_wmb()
There are a number of situations where the mandatory barriers rmb() and
wmb() are used to order memory/memory operations in the device drivers
and those barriers are much heavier than they actually need to be.  For
example in the case of PowerPC wmb() calls the heavy-weight sync
instruction when for coherent memory operations all that is really needed
is an lsync or eieio instruction.

This commit adds a coherent only version of the mandatory memory barriers
rmb() and wmb().  In most cases this should result in the barrier being the
same as the SMP barriers for the SMP case, however in some cases we use a
barrier that is somewhere in between rmb() and smp_rmb().  For example on
ARM the rmb barriers break down as follows:

  Barrier   Call     Explanation
  --------- -------- ----------------------------------
  rmb()     dsb()    Data synchronization barrier - system
  dma_rmb() dmb(osh) data memory barrier - outer sharable
  smp_rmb() dmb(ish) data memory barrier - inner sharable

These new barriers are not as safe as the standard rmb() and wmb().
Specifically they do not guarantee ordering between coherent and incoherent
memories.  The primary use case for these would be to enforce ordering of
reads and writes when accessing coherent memory that is shared between the
CPU and a device.

It may also be noted that there is no dma_mb().  Most architectures don't
provide a good mechanism for performing a coherent only full barrier without
resorting to the same mechanism used in mb().  As such there isn't much to
be gained in trying to define such a function.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: David Miller <davem@davemloft.net>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-11 21:15:06 -05:00
Linus Torvalds
c30110608c Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
 "These are the main changes in this cycle:

    - Streamline RCU's use of per-CPU variables, shifting from "cpu"
      arguments to functions to "this_"-style per-CPU variable
      accessors.

    - signal-handling RCU updates.

    - real-time updates.

    - torture-test updates.

    - miscellaneous fixes.

    - documentation updates"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
  rcu: Fix FIXME in rcu_tasks_kthread()
  rcu: More info about potential deadlocks with rcu_read_unlock()
  rcu: Optimize cond_resched_rcu_qs()
  rcu: Add sparse check for RCU_INIT_POINTER()
  documentation: memory-barriers.txt: Correct example for reorderings
  documentation: Add atomic_long_t to atomic_ops.txt
  documentation: Additional restriction for control dependencies
  documentation: Document RCU self test boot params
  rcutorture: Fix rcu_torture_cbflood() memory leak
  rcutorture: Remove obsolete kversion param in kvm.sh
  rcutorture: Remove stale test configurations
  rcutorture: Enable RCU self test in configs
  rcutorture: Add early boot self tests
  torture: Run Linux-kernel binary out of results directory
  cpu: Avoid puts_pending overflow
  rcu: Remove "cpu" argument to rcu_cleanup_after_idle()
  rcu: Remove "cpu" argument to rcu_prepare_for_idle()
  rcu: Remove "cpu" argument to rcu_needs_cpu()
  rcu: Remove "cpu" argument to rcu_note_context_switch()
  rcu: Remove "cpu" argument to rcu_preempt_check_callbacks()
  ...
2014-12-09 20:23:19 -08:00
Pranith Kumar
8ab8b3e183 documentation: memory-barriers.txt: Correct example for reorderings
Correct the example of memory orderings in memory-barriers.txt

Commit 615cc2c9cf "Documentation/memory-barriers.txt: fix important typo re
memory barriers" changed the assignment to x and y. Change the rest of the
example to match this change.

Reported-by: Ganesh Rapolu <ganesh.rapolu@hotmail.com>
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-11-13 10:34:55 -08:00
Paul E. McKenney
8b19d1dead documentation: Additional restriction for control dependencies
Short-circuit booleans are not defences against compilers breaking
your intended control dependencies.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
2014-11-13 10:34:53 -08:00
Will Deacon
a8e0aead70 documentation: memory-barriers: clarify relaxed io accessor semantics
This patch extends the paragraph describing the relaxed read io accessors
so that the relaxed accessors are defined to be:

 - Ordered with respect to each other if accessing the same peripheral

 - Unordered with respect to normal memory accesses

 - Unordered with respect to LOCK/UNLOCK operations

Whilst many architectures will provide stricter semantics, ARM, Alpha and
PPC can achieve significant performance gains by taking advantage of some
or all of the above relaxations.

Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2014-10-20 18:49:18 +01:00
Paul E. McKenney
2456d2a617 memory-barriers: Fix description of 2-legged-if-based control dependencies
Sad to say, current compilers really will hoist identical stores from both
branches of an "if" statement to precede the conditional.  This commit
therefore updates the description of control dependencies to reflect this
ugly reality.

Reported-by: Pranith Kumar <bobby.prani@gmail.com>
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-09-07 16:15:53 -07:00
Paul E. McKenney
efdcd51a4d memory-barriers: Retain barrier() in fold-to-zero example
The transformation in the fold-to-zero example incorrectly omits the
barrier() directive.  This commit therefore adds it back in.

Reported-by: Pranith Kumar <pranith@gatech.edu>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-09-07 16:15:52 -07:00
Paul E. McKenney
5646f7acc9 memory-barriers: Fix control-ordering no-transitivity example
The control-ordering example demonstrating lack of transitivity had
multiple problems.  This commit fixes them.

Reported-by: Nikolay Samofatov <nikolay.samofatov@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Pranith Kumar <bobby.prani@gmail.com>
2014-09-07 16:15:41 -07:00
Paul E. McKenney
128ea442b1 documentation: Add acquire/release barriers to pairing rules
It is possible to pair acquire and release barriers with other barriers,
so this commit adds them to the list in the SMP barrier pairing section.

Reported-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
[ paulmck: Updated pairing discussion as suggested by Peter Zijlstra. ]
2014-07-08 08:32:51 -07:00
Paul E. McKenney
5726ce06ad documentation: Clarify wake-up/memory-barrier relationship
This commit adds an example demonstrating that if a wake_up() doesn't
actually wake something up, no memory ordering is provided.

Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
2014-07-08 08:13:03 -07:00
Alexey Dobriyan
615cc2c9cf Documentation/memory-barriers.txt: fix important typo re memory barriers
Examples introducing neccesity of RMB+WMP pair reads as

        A=3	READ B
        www	rrrrrr
        B=4	READ A

Note the opposite order of reads vs writes.

But the first example without barriers reads as

        A=3	READ A
        B=4	READ B

There are 4 outcomes in the first example.

But if someone new to the concept tries to insert barriers like this:

        A=3	READ A
        www	rrrrrr
        B=4	READ B

he will still get all 4 possible outcomes, because "READ A" is first.

All this can be utterly confusing because barrier pair seems to be
superfluous.  In short, fixup first example to match latter examples
with barriers.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-06 16:08:11 -07:00
Peter Zijlstra
1b15611e1c arch,doc: Convert smp_mb__*()
Update the documentation to reflect the change of barrier primitives.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: David Howells <dhowells@redhat.com>
Link: http://lkml.kernel.org/n/tip-xslfehiga1twbk5uk94rij1e@git.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-04-18 14:20:48 +02:00
Linus Torvalds
159d8133d0 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina:
 "Usual rocket science -- mostly documentation and comment updates"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
  sparse: fix comment
  doc: fix double words
  isdn: capi: fix "CAPI_VERSION" comment
  doc: DocBook: Fix typos in xml and template file
  Bluetooth: add module name for btwilink
  driver core: unexport static function create_syslog_header
  mmc: core: typo fix in printk specifier
  ARM: spear: clean up editing mistake
  net-sysfs: fix comment typo 'CONFIG_SYFS'
  doc: Insert MODULE_ in module-signing macros
  Documentation: update URL to hfsplus Technote 1150
  gpio: update path to documentation
  ixgbe: Fix format string in ixgbe_fcoe.
  Kconfig: Remove useless "default N" lines
  user_namespace.c: Remove duplicated word in comment
  CREDITS: fix formatting
  treewide: Fix typo in Documentation/DocBook
  mm: Fix warning on make htmldocs caused by slab.c
  ata: ata-samsung_cf: cleanup in header file
  idr: remove unused prototype of idr_free()
2014-04-02 16:23:38 -07:00
Masanari Iida
df5cbb2783 doc: fix double words
Fix double words "the the" in various files
within Documentations.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2014-03-21 13:16:58 +01:00
Paul E. McKenney
8dd853d7b6 Documentation/memory-barriers.txt: Clarify release/acquire ordering
This commit fixes a couple of typos and clarifies what happens when
the CPU chooses to execute a later lock acquisition before a prior
lock release, in particular, why deadlock is avoided.

Reported-by: Peter Hurley <peter@hurleysoftware.com>
Reported-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Reported-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-02-24 08:37:29 -08:00
Paul E. McKenney
9b2b3bf531 Documentation/memory-barriers.txt: Need barriers() for some control dependencies
Current compilers can "speculate" stores in the case where both legs
of the "if" statement start with identical stores.  Because the stores
are identical, the compiler knows that the store will unconditionally
execute regardless of the "if" condition, and so the compiler is within
its rights to hoist the store to precede the condition.  Such hoisting
destroys the control-dependency ordering.  This ordering can be restored
by placing a barrier() at the beginning of each leg of the "if" statement.
This commit adds this requirement to the control-dependencies section.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-02-17 14:56:09 -08:00
Paul E. McKenney
586dd56a4c Documentation/memory-barriers.txt: Conditional must use prior load
A control dependency consists of a load, a conditional that depends on
that load, and a store.  This commit emphasizes this point in the
summary.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2014-02-17 14:56:07 -08:00
Paul E. McKenney
449f7413c8 Documentation/memory-barriers.txt: ACCESS_ONCE() provides cache coherence
The ACCESS_ONCE() primitive provides cache coherence, but the
documentation does not clearly state this.  This commit therefore upgrades
the documentation.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2014-02-17 14:56:06 -08:00
Peter Zijlstra
2e4f5382d1 locking/doc: Rename LOCK/UNLOCK to ACQUIRE/RELEASE
The LOCK and UNLOCK barriers as described in our barrier document are
generally known as ACQUIRE and RELEASE barriers in other literature.

Since we plan to introduce the acquire and release nomenclature in
generic kernel primitives we should amend the document to avoid
confusion as to what an acquire/release means.

Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Link: http://lkml.kernel.org/r/20131217092435.GC21999@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-12 10:37:13 +01:00
Paul E. McKenney
17eb88e068 Documentation/memory-barriers.txt: Downgrade UNLOCK+BLOCK
Historically, an UNLOCK+LOCK pair executed by one CPU, by one
task, or on a given lock variable has implied a full memory
barrier.  In a recent LKML thread, the wisdom of this historical
approach was called into question:
http://www.spinics.net/lists/linux-mm/msg65653.html, in part due
to the memory-order complexities of low-handoff-overhead queued
locks on x86 systems.

This patch therefore removes this guarantee from the
documentation, and further documents how to restore it via a new
smp_mb__after_unlock_lock() primitive.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <linux-arch@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1386799151-2219-6-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-12-16 11:36:15 +01:00
Paul E. McKenney
692118dac4 Documentation/memory-barriers.txt: Document ACCESS_ONCE()
The situations in which ACCESS_ONCE() is required are not well
documented, so this commit adds some verbiage to
memory-barriers.txt.

Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <linux-arch@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1386799151-2219-4-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-12-16 11:36:12 +01:00
Peter Zijlstra
18c03c6144 Documentation/memory-barriers.txt: Prohibit speculative writes
No SMP architecture currently supporting Linux allows
speculative writes, so this commit updates
Documentation/memory-barriers.txt to prohibit them in Linux core
code.  It also records restrictions on their use.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <linux-arch@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1386799151-2219-3-git-send-email-paulmck@linux.vnet.ibm.com
[ Paul modified the original patch from Peter. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-12-16 11:36:11 +01:00
Paul E. McKenney
fb2b581968 Documentation/memory-barriers.txt: Add long atomic examples to memory-barriers.txt
Although the atomic_long_t functions are quite useful, they are
a bit obscure.  This commit therefore adds the common ones
alongside their atomic_t counterparts in
Documentation/memory-barriers.txt.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <linux-arch@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1386799151-2219-2-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-12-16 11:36:09 +01:00
Paul E. McKenney
2ecf810121 Documentation/memory-barriers.txt: Add needed ACCESS_ONCE() calls to memory-barriers.txt
The Documentation/memory-barriers.txt file was written before
the need for ACCESS_ONCE() was fully appreciated.  It therefore
contains no ACCESS_ONCE() calls, which can be a problem when
people lift examples from it.  This commit therefore adds
ACCESS_ONCE() calls.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <linux-arch@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1386799151-2219-1-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-12-16 11:36:08 +01:00
Ingo Molnar
e0edc78f25 Documentation/memory-barriers.txt: Fix a typo in the data dependency description
This typo has been there forever, it is 7.5 years old, looks like this
section of our memory ordering documentation is a place where most eyes
are glazed over already ;-)

[ Also fix some stray spaces and stray tabs while at it, shrinking the
  file by 49 bytes. Visual output unchanged. ]

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-gncea9cb8igosblizfqMXrie@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-22 11:46:12 +01:00
Paul E. McKenney
45c8a36a55 doc: Fix memory-barrier control-dependency example
Each control-dependency example needs its barriers between the "if"
condition and the body of the "if" because a control dependency is
a dependency induced by a branch.  This commit makes the needed
adjustment.

Reported-by: Yongming Shen <symingz@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-08-19 21:39:42 -07:00
Richard Braun
7e8b1e78ea Documentation: Memory barrier semantics of atomic_xchg()
Add atomic_xchg() to documentation for atomic operations and
memory barriers.

Signed-off-by: Richard Braun <rbraun@sceen.net>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2013-01-08 14:14:55 -08:00
Paul E. McKenney
f191eec588 Documentation: Fix memory-barriers.txt example
This commit fixes a broken example of overlapping stores in the
Documentation/memory-barriers.txt file.

Reported-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2012-10-23 14:44:46 -07:00
Paul Bolle
395cf9691d doc: fix broken references
There are numerous broken references to Documentation files (in other
Documentation files, in comments, etc.). These broken references are
caused by typo's in the references, and by renames or removals of the
Documentation files. Some broken references are simply odd.

Fix these broken references, sometimes by dropping the irrelevant text
they were part of.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-09-27 18:08:04 +02:00