mirror of https://github.com/edk2-porting/linux-next.git synced 2025-01-24 22:55:35 +08:00
linux-next/include
Nicolas Pitre fa4adc6149 [ARM] 3611/4: optimize do_div() when divisor is constant
On ARM all divisions have to be performed "manually".  For 64-bit
divisions that may take more than a hundred cycles in many cases.

With 32-bit divisions gcc already uses the reciprocal of constant
divisors to perform a multiplication, but not with 64-bit divisions.

Since the kernel is increasingly relying upon 64-bit divisions it is
worth optimizing at least those cases where the divisor is a constant.
This is what this patch does using plain C code that gets optimized away
at compile time.

For example, despite the amount of added C code, do_div(x, 10000) now
produces the following assembly code (where x is assigned to r0-r1):

	adr	r4, .L0
	ldmia	r4, {r4-r5}
	umull	r2, r3, r4, r0
	mov	r2, #0
	umlal	r3, r2, r5, r0
	umlal	r3, r2, r4, r1
	mov	r3, #0
	umlal	r2, r3, r5, r1
	mov	r0, r2, lsr #11
	orr	r0, r0, r3, lsl #21
	mov	r1, r3, lsr #11
	...
.L0:
	.word	948328779
	.word	879609302

which is the fastest that can be done for any value of x in that case,
many times faster than the __do_div64 code (except for the small x value
space for which the result ends up being zero or a single bit).
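
As a rough illustration only (not the kernel's actual do_div() macro), the
assembly above can be mirrored in portable C.  The two .word values form the
64-bit reciprocal m = (879609302 << 32) + 948328779, and the quotient is
(x * m) >> 75, computed here from 32-bit partial products.  The names
div_by_10000() and check() are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Reciprocal constant taken from the .L0 literal pool in the listing above. */
#define M_HI 879609302u   /* upper 32 bits of m */
#define M_LO 948328779u   /* lower 32 bits of m */

/* Hypothetical helper: x / 10000 computed as (x * m) >> 75. */
static uint64_t div_by_10000(uint64_t x)
{
	uint32_t x_lo = (uint32_t)x;
	uint32_t x_hi = (uint32_t)(x >> 32);

	/* umull: only the upper word of m_lo * x_lo is kept. */
	uint64_t acc = ((uint64_t)M_LO * x_lo) >> 32;

	/* Two umlal: add both cross products.  Both words of m are below
	 * 2^30, so this sum cannot overflow 64 bits. */
	acc += (uint64_t)M_HI * x_lo;
	acc += (uint64_t)M_LO * x_hi;

	/* Last umlal: carry the upper word and add m_hi * x_hi.
	 * At this point high == (x * m) >> 64. */
	uint64_t high = (acc >> 32) + (uint64_t)M_HI * x_hi;

	/* The remaining shift by 11 gives (x * m) >> 75. */
	return high >> 11;
}

/* Self-check against the compiler's own 64-bit division. */
static void check(uint64_t x)
{
	assert(div_by_10000(x) == x / 10000);
}
```

Calling check() on a handful of values (0, 9999, 10000, UINT64_MAX) is a
quick way to compare this sketch against ordinary division.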

The fact that this code is generated inline produces a tiny increase in
.text size, but not significant compared to the needed code around each
__do_div64 call site this code is replacing.

The algorithm used has been validated on a 16-bit scale for all possible
values, and then recoded for 64-bit values.  Furthermore I've been
running it with the final BUG_ON() uncommented for over two months now
with no problem.
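
The kind of exhaustive 16-bit validation described above could look like the
following sketch.  It is an assumption for illustration: it uses the textbook
round-up reciprocal m = ceil(2^(16+p) / d) with p = ceil(log2(d)), not
necessarily the exact algorithm in this patch, and the names ceil_log2() and
count_mismatches16() are hypothetical.

```c
#include <stdint.h>

/* Smallest p such that 2^p >= d (d must be nonzero). */
static unsigned ceil_log2(uint32_t d)
{
	unsigned p = 0;

	while ((1u << p) < d)
		p++;
	return p;
}

/* Try every 16-bit dividend against one nonzero 16-bit divisor and
 * return how many quotients differ from plain division. */
static uint32_t count_mismatches16(uint16_t d)
{
	unsigned p = ceil_log2(d);
	/* Rounded-up reciprocal scaled by 2^(16+p); fits easily in 64 bits. */
	uint64_t m = ((1ull << (16 + p)) + d - 1) / d;
	uint32_t bad = 0;

	for (uint32_t x = 0; x <= 0xFFFF; x++) {
		uint32_t q = (uint32_t)((x * m) >> (16 + p));

		if (q != x / d)
			bad++;
	}
	return bad;
}
```

Looping count_mismatches16() over every divisor from 1 to 65535 covers the
whole 16-bit space; a result of zero for each divisor means the reciprocal
path agrees with ordinary division everywhere.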

Note that this new code is only compiled with gcc versions 4.0 or later.
Earlier gcc versions proved themselves too problematic and only the
original code is used with them.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-12-07 16:06:09 +00:00
acpi ACPI: Processor native C-states using MWAIT 2006-10-14 00:35:39 -04:00
asm-alpha [PATCH] Consolidate check_signature 2006-10-11 11:14:23 -07:00
asm-arm [ARM] 3611/4: optimize do_div() when divisor is constant 2006-12-07 16:06:09 +00:00
asm-arm26 fix file specification in comments 2006-10-03 23:01:26 +02:00
asm-avr32 AVR32: Wire up sys_epoll_pwait 2006-11-06 14:07:15 +01:00
asm-cris [PATCH] remove remaining errno and __KERNEL_SYSCALLS__ references 2006-10-02 07:57:23 -07:00
asm-frv [PATCH] FRV: Use the correct preemption primitives in kmap_atomic() and co 2006-10-16 08:32:29 -07:00
asm-generic Add "pure_initcall" for static variable initialization 2006-11-20 11:47:18 -08:00
asm-h8300 [PATCH] remove remaining errno and __KERNEL_SYSCALLS__ references 2006-10-02 07:57:23 -07:00
asm-i386 [PATCH] i386: Fix compilation with UP genericarch 2006-11-28 20:12:59 +01:00
asm-ia64 [PATCH] mspec driver build fix 2006-11-13 07:40:42 -08:00
asm-m32r [PATCH] Consolidate check_signature 2006-10-11 11:14:23 -07:00
asm-m68k [PATCH] sun3_ioremap() prototype 2006-10-15 11:00:58 -07:00
asm-m68knommu [PATCH] m68knommu: fix up for the irq_handler_t changes 2006-11-20 10:16:49 -08:00
asm-mips [PATCH] make au1xxx-ide compile again 2006-11-22 23:34:02 +00:00
asm-parisc [PATCH] Fix incorrent type of flags in <asm/semaphore.h> 2006-11-26 16:30:29 -08:00
asm-powerpc [POWERPC] Revert "[POWERPC] Add powerpc get/set_rtc_time interface to new generic rtc class" 2006-11-22 12:13:36 +11:00
asm-ppc [PATCH] Consolidate check_signature 2006-10-11 11:14:23 -07:00
asm-s390 [S390] Fix pte type checking. 2006-10-18 18:30:51 +02:00
asm-sh sh: Fix IPR-IRQ's for IRQ-chip change breakage. 2006-10-31 12:53:28 +09:00
asm-sh64 [PATCH] Consolidate check_signature 2006-10-11 11:14:23 -07:00
asm-sparc [SPARC]: Fix robust futex syscalls and wire up migrate_pages. 2006-11-05 16:51:03 -08:00
asm-sparc64 [SPARC]: Fix robust futex syscalls and wire up migrate_pages. 2006-11-05 16:51:03 -08:00
asm-um [PATCH] uml: add INITCALLS 2006-10-31 08:07:00 -08:00
asm-v850 [PATCH] remove remaining errno and __KERNEL_SYSCALLS__ references 2006-10-02 07:57:23 -07:00
asm-x86_64 [PATCH] x86-64: Fix race in exit_idle 2006-11-14 16:57:46 +01:00
asm-xtensa fix file specification in comments 2006-10-03 23:01:26 +02:00
crypto [CRYPTO] digest: Added user API for new hash type 2006-09-21 11:46:17 +10:00
keys
linux [NET]: Fix MAX_HEADER setting. 2006-11-28 20:59:39 -08:00
math-emu
media V4L/DVB (4666): Ensure the WM8775 driver is loaded generically for any board. 2006-10-03 15:13:48 -03:00
mtd Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2006-10-01 17:55:53 +01:00
net [NET]: Re-fix of doc-comment in sock.h 2006-11-25 15:16:51 -08:00
pcmcia
rdma RDMA/addr: Use client registration to fix module unload race 2006-11-02 14:26:04 -08:00
rxrpc
scsi [PATCH] add missing libsas include to fix s390 compilation. 2006-11-28 17:26:50 -08:00
sound [ALSA] version 1.0.13 2006-11-28 15:07:33 +01:00
video fix file specification in comments 2006-10-03 23:01:26 +02:00
Kbuild [HEADERS] One line per header in Kbuild files to reduce conflicts 2006-09-19 12:43:58 +01:00