linux/arch/powerpc/lib
Anton Blanchard 9b83ecb0a3 powerpc: Optimise 64bit csum_partial
The main loop of csum_partial runs very slowly on recent POWER CPUs. After some
analysis on both POWER6 and POWER7 I came up with the routine below. First we get
the source aligned to a double word, ignoring any odd alignment to keep things
simple. Then we do 64 bytes at a time, with an entry and exit limb of a further
64 bytes. On both POWER6 and POWER7 this should be as fast as we can go since
we are limited by the latency of the adde instructions.
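
The approach described above can be sketched in portable C. This is only an illustration, not the assembly in checksum_64.S: the names csum_sketch, add_carry64, load64 and fold64 are hypothetical, the end-around-carry add stands in for the adde chain, and the alignment prologue is skipped because memcpy-based loads do not require an aligned source. It returns the folded, non-inverted 16-bit sum in native byte order.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Add b into a with end-around carry (the role adde plays in the asm). */
static uint64_t add_carry64(uint64_t a, uint64_t b)
{
    uint64_t s = a + b;
    return s + (s < b);          /* a carry out wraps back into bit 0 */
}

/* Hypothetical helper: load a doubleword without assuming alignment. */
static uint64_t load64(const unsigned char *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Fold a 64-bit ones' complement sum down to 16 bits. */
static uint16_t fold64(uint64_t sum)
{
    sum = (sum & 0xffffffffu) + (sum >> 32);
    sum = (sum & 0xffffffffu) + (sum >> 32);
    sum = (sum & 0xffff) + (sum >> 16);
    sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Ones' complement sum of len bytes, accumulated 64 bits at a time. */
uint16_t csum_sketch(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    uint64_t sum = 0;

    /* Main loop: 64 bytes (eight doublewords) per iteration. */
    while (len >= 64) {
        for (int i = 0; i < 8; i++)
            sum = add_carry64(sum, load64(p + 8 * i));
        p += 64;
        len -= 64;
    }

    /* Remaining whole doublewords. */
    while (len >= 8) {
        sum = add_carry64(sum, load64(p));
        p += 8;
        len -= 8;
    }

    /* Trailing bytes, zero-padded to a full doubleword. */
    if (len) {
        uint64_t tail = 0;
        memcpy(&tail, p, len);
        sum = add_carry64(sum, tail);
    }

    return fold64(sum);
}
```

Each add in the sketch depends on the carry from the previous one, which is the same serial dependency that makes adde latency the limiting factor in the real routine.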

To test this I forced checksumming to be enabled over loopback and ran
socklib (a simple TCP benchmark). On a POWER6 575, throughput improved by
11% with this patch.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02 14:07:29 +10:00
alloc.c [POWERPC] Limit range of __init_ref_ok somewhat 2007-10-03 11:48:44 +10:00
checksum_32.S powerpc: Rename files to have consistent _32/_64 suffixes 2005-10-10 21:52:43 +10:00
checksum_64.S powerpc: Optimise 64bit csum_partial 2010-09-02 14:07:29 +10:00
code-patching.c PAGE_ALIGN(): correctly handle 64-bit values on 32-bit architectures 2008-07-24 10:47:21 -07:00
copy_32.S powerpc/8xx: Start using dcbX instructions in various copy routines 2009-12-09 17:10:37 +11:00
copypage_64.S powerpc: Pair loads and stores in copy_4k_page 2010-02-17 14:03:16 +11:00
copyuser_64.S powerpc: Improve 64bit copy_tofrom_user 2010-02-17 14:03:16 +11:00
crtsavres.S powerpc: Fix module building for gcc 4.5 and 64 bit 2010-07-08 18:11:38 +10:00
devres.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
div64.S powerpc: Fix a corner case in __div64_32 2005-10-20 09:37:02 +10:00
feature-fixups-test.S powerpc: Fixup lwsync at runtime 2008-07-03 16:58:10 +10:00
feature-fixups.c powerpc: Fix feature-fixup tests for gcc 4.5 2010-07-08 18:11:41 +10:00
ldstfp.S powerpc: Emulate most Book I instructions in emulate_step() 2010-06-22 19:40:29 +10:00
locks.c locking: Convert raw_rwlock to arch_rwlock 2009-12-14 23:55:32 +01:00
Makefile Merge commit 'paulus-perf/master' into next 2010-07-09 11:25:48 +10:00
mem_64.S [POWERPC] Use mtocrf instruction in asm when CONFIG_POWER4_ONLY=y 2007-04-13 03:55:13 +10:00
memcpy_64.S powerpc: Fix 64bit memcpy() regression 2009-02-26 14:02:53 +11:00
rheap.c powerpc: Fix corruption error in rh_alloc_fixed() 2008-12-17 10:06:14 -06:00
sstep.c powerpc: Emulate most Book I instructions in emulate_step() 2010-06-22 19:40:29 +10:00
string.S powerpc: Fix string library functions 2010-05-21 17:31:08 +10:00
usercopy_64.c powerpc: Rename files to have consistent _32/_64 suffixes 2005-10-10 21:52:43 +10:00