linux/lib
Peter Zijlstra aab8621776 lib/int_sqrt: optimize small argument
commit 3f3295709e upstream.

The current int_sqrt() computation is sub-optimal for the case of small
@x.  Which is the interesting case when we're going to do cumulative
distribution functions on idle times, which we assume to be a random
variable, where the target residency of the deepest idle state gives an
upper bound on the variable (5e6ns on recent Intel chips).

In the case of small @x, the compute loop:

	while (m != 0) {
		b = y + m;
		y >>= 1;

		if (x >= b) {
			x -= b;
			y += m;
		}
		m >>= 2;
	}

can be reduced to:

	while (m > x)
		m >>= 2;

Because y==0, b==m and until x>=m y will remain 0.

And while this is computationally equivalent, it runs much faster
because there's less code, in particular less branches.

      cycles:                 branches:              branch-misses:

OLD:

hot:   45.109444 +- 0.044117  44.333392 +- 0.002254  0.018723 +- 0.000593
cold: 187.737379 +- 0.156678  44.333407 +- 0.002254  6.272844 +- 0.004305

PRE:

hot:   67.937492 +- 0.064124  66.999535 +- 0.000488  0.066720 +- 0.001113
cold: 232.004379 +- 0.332811  66.999527 +- 0.000488  6.914634 +- 0.006568

POST:

hot:   43.633557 +- 0.034373  45.333132 +- 0.002277  0.023529 +- 0.000681
cold: 207.438411 +- 0.125840  45.333132 +- 0.002277  6.976486 +- 0.004219

Averages computed over all values <128k using a LFSR to generate order.
Cold numbers have a LFSR based branch trace buffer 'confuser' ran between
each int_sqrt() invocation.

Link: http://lkml.kernel.org/r/20171020164644.876503355@infradead.org
Fixes: 30493cc9dd ("lib/int_sqrt.c: optimize square root algorithm")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Suggested-by: Anshul Garg <aksgarg1989@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Joe Perches <joe@perches.com>
Cc: David Miller <davem@davemloft.net>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michael Davidson <md@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-03-27 14:13:54 +09:00
..
842 License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
fonts License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
lz4 lib/lz4: make arrays static const, reduces object code size 2017-10-03 17:54:25 -07:00
lzo License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
mpi lib/mpi: Fix umul_ppmm() for MIPS64r6 2018-03-03 10:24:29 +01:00
raid6 raid6/ppc: Fix build for clang 2019-01-13 10:01:04 +01:00
reed_solomon
xz
zlib_deflate
zlib_inflate lib/zlib_inflate/inftrees.c: fix potential buffer overflow 2017-05-08 17:15:12 -07:00
zstd lib: Add zstd modules 2017-08-15 09:02:08 -07:00
.gitignore
argv_split.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
asn1_decoder.c ASN.1: check for error from ASN1_OP_END__ACT actions 2017-12-14 09:52:52 +01:00
assoc_array.c assoc_array: Fix shortcut creation 2019-03-23 14:35:14 +01:00
atomic64_test.c lib/atomic64_test.c: add a test that atomic64_inc_not_zero() returns an int 2017-07-14 15:05:13 -07:00
atomic64.c locking/atomic: Implement atomic{,64,_long}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}() 2016-06-16 10:48:32 +02:00
audit.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
bcd.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
bch.c
bitmap.c lib: fix stall in __bitmap_parselist() 2018-04-19 08:56:20 +02:00
bitrev.c
bsearch.c lib/bsearch.c: micro-optimize pivot position calculation 2017-07-10 16:32:35 -07:00
btree.c
bug.c lib/bug.c: exclude non-BUG/WARN exceptions from report_bug() 2018-03-15 10:54:32 +01:00
build_OID_registry
bust_spinlocks.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
chacha20.c random: replace non-blocking pool with a Chacha20-based CRNG 2016-07-03 00:57:23 -04:00
check_signature.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
checksum.c
clz_ctz.c
clz_tab.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
cmdline.c lib/cmdline.c: remove meaningless comment 2017-09-08 18:26:49 -07:00
compat_audit.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
cordic.c
cpu_rmap.c
cpumask.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
crc4.c lib: Add crc4 module 2017-06-09 11:52:07 +02:00
crc7.c
crc8.c
crc16.c
crc32.c lib: add module support to crc32 tests 2017-02-24 17:46:57 -08:00
crc32defs.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
crc32test.c lib: add module support to crc32 tests 2017-02-24 17:46:57 -08:00
crc-ccitt.c
crc-itu-t.c
crc-t10dif.c
ctype.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
debug_info.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
debug_locks.c locking/lockdep: Fix debug_locks off performance problem 2018-11-13 11:14:51 -08:00
debugobjects.c debugobjects: avoid recursive calls with kmemleak 2018-12-17 09:28:54 +01:00
dec_and_lock.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
decompress_bunzip2.c
decompress_inflate.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
decompress_unlz4.c lib/decompress_unlz4: change module to work with new LZ4 module version 2017-02-24 17:46:57 -08:00
decompress_unlzma.c
decompress_unlzo.c
decompress_unxz.c
decompress.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
devres.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
digsig.c lib/digsig: fix dereference of NULL user_key_payload 2017-10-12 17:16:40 +01:00
div64.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
dma-debug.c dmaengine updates for 4.12-rc1 2017-05-09 15:40:28 -07:00
dma-noop.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
dma-virt.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
dump_stack.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
dynamic_debug.c dynamic-debug-howto: fix optional/omitted ending line number to be LARGE instead of 0 2017-12-14 09:53:08 +01:00
dynamic_queue_limits.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
earlycpio.c lib/cpio: Make find_cpio_data()'s offset arg optional 2016-06-08 11:04:19 +02:00
errseq.c errseq: Always report a writeback error once 2018-05-09 09:51:54 +02:00
extable.c lib/extable.c: use bsearch() library function in search_extable() 2017-07-10 16:32:35 -07:00
fault-inject.c fault-inject: fix wrong should_fail() decision in task context 2017-08-10 15:54:06 -07:00
fdt_empty_tree.c
fdt_ro.c
fdt_rw.c
fdt_strerror.c
fdt_sw.c
fdt_wip.c
fdt.c
find_bit.c lib/find_bit.c: micro-optimise find_next_*_bit 2017-02-24 17:46:57 -08:00
flex_array.c
flex_proportions.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
gcd.c lib/GCD.c: use binary GCD algorithm instead of Euclidean 2016-05-20 17:58:30 -07:00
gen_crc32table.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
genalloc.c lib/genalloc.c: make the avail variable an atomic_long_t 2017-12-14 09:53:08 +01:00
glob.c lib: add module support to glob tests 2017-02-24 17:46:57 -08:00
globtest.c lib: add module support to glob tests 2017-02-24 17:46:57 -08:00
hexdump.c lib/hexdump.c: return -EINVAL in case of error in hex2bin() 2017-09-08 18:26:49 -07:00
hweight.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
idr.c lib/idr.c: fix comment for idr_replace() 2017-10-03 17:54:25 -07:00
inflate.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
int_sqrt.c lib/int_sqrt: optimize small argument 2019-03-27 14:13:54 +09:00
interval_tree_test.c lib/rbtree-test: lower default params 2018-12-17 09:28:55 +01:00
interval_tree.c
iomap_copy.c
iomap.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
iommu-common.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
iommu-helper.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
ioremap.c ioremap: Update pgtable free interfaces with addr 2018-08-17 21:01:11 +02:00
iov_iter.c iov_iter: fix page_copy_sane for compound pages 2017-09-20 23:27:48 -04:00
irq_poll.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
irq_regs.c
is_single_threaded.c sched/headers: Prepare to move 'init_task' and 'init_thread_union' from <linux/sched.h> to <linux/sched/task.h> 2017-03-02 08:42:38 +01:00
jedec_ddr_data.c
kasprintf.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
Kconfig Merge branch 'zstd-minimal' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2017-09-14 17:30:49 -07:00
Kconfig.debug kmemcheck: rip it out 2018-02-22 15:42:24 +01:00
Kconfig.kasan kasan: rework Kconfig settings 2018-02-16 20:23:04 +01:00
Kconfig.kgdb lib: update location of kgdb documentation 2017-05-16 08:44:22 -03:00
Kconfig.ubsan Kconfig: lib/Kconfig.ubsan fix reference to ubsan documentation 2016-12-14 16:04:08 -08:00
kfifo.c
klist.c scsi: klist: Make it safe to use klists in atomic context 2018-10-03 17:00:48 -07:00
kobject_uevent.c driver core: suppress sending MODALIAS in UNBIND uevents 2017-09-18 16:48:33 +02:00
kobject.c kobject: Replace strncpy with memcpy 2018-12-08 13:03:35 +01:00
kstrtox.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
kstrtox.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
lcm.c
libcrc32c.c crypto: Work around deallocated stack frame reference gcc bug on sparc. 2017-06-08 17:36:03 +08:00
list_debug.c bug: switch data corruption check to __must_check 2017-02-24 17:46:56 -08:00
list_sort.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
llist.c
locking-selftest-hardirq.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
locking-selftest-mutex.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
locking-selftest-rlock-hardirq.h
locking-selftest-rlock-softirq.h
locking-selftest-rlock.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
locking-selftest-rsem.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
locking-selftest-rtmutex.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
locking-selftest-softirq.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
locking-selftest-spin-hardirq.h
locking-selftest-spin-softirq.h
locking-selftest-spin.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
locking-selftest-wlock-hardirq.h
locking-selftest-wlock-softirq.h
locking-selftest-wlock.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
locking-selftest-wsem.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
locking-selftest.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
lockref.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
lru_cache.c
Makefile License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
memory-notifier-error-inject.c
memweight.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
net_utils.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
netdev-notifier-error-inject.c
nlattr.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
nmi_backtrace.c printk/nmi: Prevent deadlock when accessing the main log buffer in NMI 2018-09-05 09:26:35 +02:00
nodemask.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
notifier-error-inject.c
notifier-error-inject.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
of-reconfig-notifier-error-inject.c
oid_registry.c 509: fix printing uninitialized stack memory when OID is empty 2018-02-25 11:08:01 +01:00
once.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
parman.c lib: Introduce priority array area manager 2017-02-03 16:35:42 -05:00
parser.c parser: add u64 number parser 2016-12-06 10:17:03 +02:00
pci_iomap.c
percpu_counter.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
percpu_ida.c sched/headers: Prepare to remove the <linux/gfp.h> include from <linux/sched.h> 2017-03-02 08:42:34 +01:00
percpu_test.c
percpu-refcount.c percpu-refcount: support synchronous switch to atomic mode. 2017-03-22 19:18:43 -07:00
plist.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/clock.h> 2017-03-02 08:42:27 +01:00
pm-notifier-error-inject.c
prime_numbers.c lib/prime_numbers: Suppress warn on kmalloc failure 2017-01-23 09:17:12 +01:00
radix-tree.c idr: fix invalid ptr dereference on item delete 2018-05-30 07:51:49 +02:00
random32.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
ratelimit.c lib/ratelimit.c: use deferred printk() version 2017-10-03 17:54:26 -07:00
rational.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
rbtree_test.c lib/rbtree-test: lower default params 2018-12-17 09:28:55 +01:00
rbtree.c rbtree: add some additional comments for rebalancing cases 2017-09-08 18:26:48 -07:00
reciprocal_div.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
refcount.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
rhashtable.c rhashtable: add schedule points 2018-09-19 22:43:45 +02:00
sbitmap.c sbitmap: add sbitmap_get_shallow() operation 2017-04-14 14:06:52 -06:00
scatterlist.c scatterlist: add sg_zero_buffer() helper 2017-06-15 14:30:14 +02:00
seq_buf.c seq_buf: Make seq_buf_puts() null-terminate the buffer 2019-02-12 19:46:08 +01:00
sg_pool.c lib: scatterlist: move SG pool code from SCSI driver to lib/sg_pool.c 2016-04-15 16:53:14 -04:00
sg_split.c
sha1.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
show_mem.c lib/show_mem.c: teach show_mem to work with the given nodemask 2017-02-22 16:41:30 -08:00
siphash.c siphash: implement HalfSipHash1-3 for hash tables 2017-01-09 13:58:57 -05:00
smp_processor_id.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
sort.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
stackdepot.c lib/stackdepot: export save/fetch stack for drivers 2016-11-11 08:12:37 -08:00
stmp_device.c
string_helpers.c mm: treewide: remove GFP_TEMPORARY allocation flag 2017-09-13 18:53:16 -07:00
string.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
strncpy_from_user.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
strnlen_user.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
swiotlb.c swiotlb: clean up reporting 2018-12-13 09:18:52 +01:00
syscall.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
test_bitmap.c lib/test_bitmap.c: fix bitmap optimisation tests to report errors correctly 2018-05-22 18:53:58 +02:00
test_bpf.c test_bpf: Fix testing with CONFIG_BPF_JIT_ALWAYS_ON=y on other arches 2018-11-04 14:52:43 +01:00
test_debug_virtual.c lib: fix build failure in CONFIG_DEBUG_VIRTUAL test 2019-01-13 10:01:07 +01:00
test_firmware.c test_firmware: fix error return getting clobbered 2018-12-13 09:18:46 +01:00
test_hash.c lib/test_hash.c: fix warning in preprocessor symbol evaluation 2016-09-01 17:52:01 -07:00
test_hexdump.c test_hexdump: use memcpy instead of strncpy 2018-12-08 13:03:35 +01:00
test_kasan.c kasan: report only the first error by default 2017-03-31 17:13:30 -07:00
test_kmod.c lib/test_kmod.c: potential double free in error handling 2019-03-13 14:03:18 -07:00
test_list_sort.c lib: add module support to linked list sorting tests 2017-05-08 17:15:10 -07:00
test_module.c
test_parman.c lib: fix spelling mistake: "actualy" -> "actually" 2017-02-26 11:03:38 -05:00
test_printf.c
test_rhashtable.c lib: test_rhashtable: Fix KASAN warning 2017-07-25 12:35:23 -07:00
test_siphash.c siphash: implement HalfSipHash1-3 for hash tables 2017-01-09 13:58:57 -05:00
test_sort.c Revert "lib/test_sort.c: make it explicitly non-modular" 2017-05-08 17:15:10 -07:00
test_static_key_base.c
test_static_keys.c
test_sysctl.c test_sysctl: test against int proc_dointvec() array support 2017-07-12 16:26:00 -07:00
test_user_copy.c lib: remove check for AVR32 arch in test_user_copy 2017-05-01 09:36:30 +02:00
test_uuid.c uuid: fix incorrect uuid_equal conversion in test_uuid_test 2017-07-21 09:38:30 +02:00
test-kstrtox.c
test-string_helpers.c
textsearch.c
timerqueue.c timerqueue: Use rb_entry_safe() instead of open-coding it 2017-01-20 08:03:42 +01:00
ts_bm.c
ts_fsm.c textsearch: fix typos in library helpers 2017-10-22 03:14:07 +01:00
ts_kmp.c textsearch: fix typos in library helpers 2017-10-22 03:14:07 +01:00
ubsan.c lib/ubsan.c: don't mark __ubsan_handle_builtin_unreachable as noreturn 2018-11-21 09:24:15 +01:00
ubsan.h lib/ubsan: add type mismatch handler for new GCC/Clang 2018-02-16 20:23:09 +01:00
ucs2_string.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
usercopy.c Fix misannotated out-of-line _copy_to_user() 2018-03-19 08:42:56 +01:00
uuid.c uuid: hoist uuid_is_null() helper from libnvdimm 2017-06-05 16:59:05 +02:00
vsprintf.c lib/vsprintf: Remove atomic-unsafe support for %pCr 2018-07-03 11:24:48 +02:00
win_minmax.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
xxhash.c lib: Add xxhash module 2017-08-15 09:02:07 -07:00