linux/arch/x86/lib
Luca Barbieri a7e926abc3 x86-32: Rewrite 32-bit atomic64 functions in assembly
This patch replaces atomic64_32.c with two assembly implementations,
one for 386/486 machines using pushf/cli/popf and one for 586+ machines
using cmpxchg8b.

The cmpxchg8b implementation provides the following advantages over the
current one:

1. Implements atomic64_add_unless, atomic64_dec_if_positive and
   atomic64_inc_not_zero

2. Uses the ZF flag changed by cmpxchg8b instead of doing a comparison

3. Uses custom register calling conventions that reduce or eliminate
   register moves to suit cmpxchg8b

4. Reads the initial value instead of using cmpxchg8b to do that.
   Currently we use lock xaddl and movl, which seems the fastest.

5. Does not use the lock prefix for atomic64_set
   64-bit writes are already atomic, so we don't need that.
   We still need it for atomic64_read to avoid restoring a value
   changed in the meantime.

6. Allocates registers as well or better than gcc

The 386 implementation provides support for 386 and 486 machines.
386/486 SMP is not supported (we dropped it), but such support can be
added easily if desired.

A pure assembly implementation is required due to the custom calling
conventions, and desire to use %ebp in atomic64_add_return (we need
7 registers...), as well as the ability to use pushf/popf in the 386
code without an intermediate pop/push.

The parameter names are changed to match the convention in atomic_64.h

Changes in v3 (due to rebasing to tip/x86/asm):
- Patches atomic64_32.h instead of atomic_32.h
- Uses the CALL alternative mechanism from commit
  1b1d925818

Changes in v2:
- Merged 386 and cx8 support in the same patch
- 386 support now done in assembly, C code no longer used at all
- cmpxchg64 is used for atomic64_cmpxchg
- stop using macros, use one-line inline functions instead
- miscellanous changes and improvements

Signed-off-by: Luca Barbieri <luca@luca-barbieri.com>
LKML-Reference: <1267005265-27958-5-git-send-email-luca@luca-barbieri.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-02-25 20:47:30 -08:00
..
.gitignore x86: Gitignore: arch/x86/lib/inat-tables.c 2009-11-04 13:11:28 +01:00
atomic64_32.c x86-32: Rewrite 32-bit atomic64 functions in assembly 2010-02-25 20:47:30 -08:00
atomic64_386_32.S x86-32: Rewrite 32-bit atomic64 functions in assembly 2010-02-25 20:47:30 -08:00
atomic64_cx8_32.S x86-32: Rewrite 32-bit atomic64 functions in assembly 2010-02-25 20:47:30 -08:00
checksum_32.S
clear_page_64.S x86: Fix symbol annotation for arch/x86/lib/clear_page_64.S::clear_page_c 2009-06-30 23:43:15 +02:00
cmpxchg8b_emu.S x86: Provide an alternative() based cmpxchg64() 2009-09-30 22:55:59 +02:00
copy_page_64.S x86_64: move lib 2007-10-11 11:17:08 +02:00
copy_user_64.S x86-64: Modify copy_user_generic() alternatives mechanism 2009-12-30 11:57:31 +01:00
copy_user_nocache_64.S x86: wrong register was used in align macro 2008-07-30 10:10:39 -07:00
csum-copy_64.S x86_64: move lib 2007-10-11 11:17:08 +02:00
csum-partial_64.c x86: fix csum_partial() export 2008-05-13 19:38:47 +02:00
csum-wrappers_64.c x86: clean up csum-wrappers_64.c some more 2008-02-19 16:18:32 +01:00
delay.c x86, delay: tsc based udelay should have rdtsc_barrier 2009-06-25 16:47:40 -07:00
getuser.S x86: use _types.h headers in asm where available 2009-02-13 11:35:01 -08:00
inat.c x86: AVX instruction set decoder support 2009-10-29 08:47:46 +01:00
insn.c x86: AVX instruction set decoder support 2009-10-29 08:47:46 +01:00
io_64.c x86: coding style fixes in arch/x86/lib/io_64.c 2008-02-19 16:18:32 +01:00
iomap_copy_64.S x86_64: move lib 2007-10-11 11:17:08 +02:00
Makefile x86-32: Rewrite 32-bit atomic64 functions in assembly 2010-02-25 20:47:30 -08:00
memcpy_32.c x86: coding style fixes to arch/x86/lib/memcpy_32.c 2008-04-17 17:40:49 +02:00
memcpy_64.S x86-64: Modify memcpy()/memset() alternatives mechanism 2009-12-30 11:57:32 +01:00
memmove_64.c x86: coding style fixes to arch/x86/lib/memmove_64.c 2008-04-17 17:40:48 +02:00
memset_64.S x86-64: Modify memcpy()/memset() alternatives mechanism 2009-12-30 11:57:32 +01:00
mmx_32.c x86: clean up mmx_32.c 2008-04-17 17:40:47 +02:00
msr-reg-export.c x86, msr: change msr-reg.o to obj-y, and export its symbols 2009-09-04 10:00:09 -07:00
msr-reg.S x86, msr: Fix msr-reg.S compilation with gas 2.16.1, on 32-bit too 2009-09-03 21:26:34 +02:00
msr-smp.c x86, msr: msrs_alloc/free for CONFIG_SMP=n 2009-12-16 15:36:32 -08:00
msr.c x86, msr: msrs_alloc/free for CONFIG_SMP=n 2009-12-16 15:36:32 -08:00
putuser.S x86: merge putuser asm functions. 2008-07-09 09:14:13 +02:00
rwlock_64.S x86: rename .i assembler includes to .h 2007-10-17 20:16:29 +02:00
semaphore_32.S Generic semaphore implementation 2008-04-17 10:42:34 -04:00
string_32.c x86: coding style fixes to arch/x86/lib/string_32.c 2008-08-15 16:53:25 +02:00
strstr_32.c x86: coding style fixes to arch/x86/lib/strstr_32.c 2008-08-15 16:53:24 +02:00
thunk_32.S ftrace: trace irq disabled critical timings 2008-05-23 20:32:46 +02:00
thunk_64.S ftrace: trace irq disabled critical timings 2008-05-23 20:32:46 +02:00
usercopy_32.c x86: Turn the copy_from_user check into an (optional) compile time warning 2009-10-01 11:31:04 +02:00
usercopy_64.c x86, 64-bit: Clean up user address masking 2009-06-20 15:40:00 -07:00
x86-opcode-map.txt x86: Add Intel FMA instructions to x86 opcode map 2009-10-29 08:47:47 +01:00