linux/arch/ia64/include/asm/barrier.h
Alexander Duyck 1077fa36f2 arch: Add lightweight memory barriers dma_rmb() and dma_wmb()
There are a number of situations where the mandatory barriers rmb() and
wmb() are used to order memory/memory operations in the device drivers
and those barriers are much heavier than they actually need to be.  For
example in the case of PowerPC wmb() calls the heavy-weight sync
instruction when for coherent memory operations all that is really needed
is an lsync or eieio instruction.

This commit adds a coherent only version of the mandatory memory barriers
rmb() and wmb().  In most cases this should result in the barrier being the
same as the SMP barriers for the SMP case, however in some cases we use a
barrier that is somewhere in between rmb() and smp_rmb().  For example on
ARM the rmb barriers break down as follows:

  Barrier   Call     Explanation
  --------- -------- ----------------------------------
  rmb()     dsb()    Data synchronization barrier - system
  dma_rmb() dmb(osh) data memory barrier - outer sharable
  smp_rmb() dmb(ish) data memory barrier - inner sharable

These new barriers are not as safe as the standard rmb() and wmb().
Specifically they do not guarantee ordering between coherent and incoherent
memories.  The primary use case for these would be to enforce ordering of
reads and writes when accessing coherent memory that is shared between the
CPU and a device.

It may also be noted that there is no dma_mb().  Most architectures don't
provide a good mechanism for performing a coherent only full barrier without
resorting to the same mechanism used in mb().  As such there isn't much to
be gained in trying to define such a function.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: David Miller <davem@davemloft.net>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-11 21:15:06 -05:00

94 lines
2.7 KiB
C

/*
* Memory barrier definitions. This is based on information published
* in the Processor Abstraction Layer and the System Abstraction Layer
* manual.
*
* Copyright (C) 1998-2003 Hewlett-Packard Co
* David Mosberger-Tang <davidm@hpl.hp.com>
* Copyright (C) 1999 Asit Mallick <asit.k.mallick@intel.com>
* Copyright (C) 1999 Don Dugger <don.dugger@intel.com>
*/
#ifndef _ASM_IA64_BARRIER_H
#define _ASM_IA64_BARRIER_H
#include <linux/compiler.h>
/*
* Macros to force memory ordering. In these descriptions, "previous"
* and "subsequent" refer to program order; "visible" means that all
* architecturally visible effects of a memory access have occurred
* (at a minimum, this means the memory has been read or written).
*
* wmb(): Guarantees that all preceding stores to memory-
* like regions are visible before any subsequent
* stores and that all following stores will be
* visible only after all previous stores.
* rmb(): Like wmb(), but for reads.
* mb(): wmb()/rmb() combo, i.e., all previous memory
* accesses are visible before all subsequent
* accesses and vice versa. This is also known as
* a "fence."
*
* Note: "mb()" and its variants cannot be used as a fence to order
* accesses to memory mapped I/O registers. For that, mf.a needs to
* be used. However, we don't want to always use mf.a because (a)
* it's (presumably) much slower than mf and (b) mf.a is supported for
* sequential memory pages only.
*/
#define mb() ia64_mf()
#define rmb() mb()
#define wmb() mb()
#define dma_rmb() mb()
#define dma_wmb() mb()
#ifdef CONFIG_SMP
# define smp_mb() mb()
#else
# define smp_mb() barrier()
#endif
#define smp_rmb() smp_mb()
#define smp_wmb() smp_mb()
#define read_barrier_depends() do { } while (0)
#define smp_read_barrier_depends() do { } while (0)
#define smp_mb__before_atomic() barrier()
#define smp_mb__after_atomic() barrier()
/*
* IA64 GCC turns volatile stores into st.rel and volatile loads into ld.acq no
* need for asm trickery!
*/
#define smp_store_release(p, v) \
do { \
compiletime_assert_atomic_type(*p); \
barrier(); \
ACCESS_ONCE(*p) = (v); \
} while (0)
#define smp_load_acquire(p) \
({ \
typeof(*p) ___p1 = ACCESS_ONCE(*p); \
compiletime_assert_atomic_type(*p); \
barrier(); \
___p1; \
})
/*
* XXX check on this ---I suspect what Linus really wants here is
* acquire vs release semantics but we can't discuss this stuff with
* Linus just yet. Grrr...
*/
#define set_mb(var, value) do { (var) = (value); mb(); } while (0)
/*
* The group barrier in front of the rsm & ssm are necessary to ensure
* that none of the previous instructions in the same group are
* affected by the rsm/ssm.
*/
#endif /* _ASM_IA64_BARRIER_H */