linux-next

mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-20 03:04:01 +08:00

Eric Dumazet 0ad9500e16 slub: prefetch next freelist pointer in slab_alloc()		2012-01-24 21:53:57 +02:00

Recycling a page is a problem, since freelist link chain is hot on cpu(s) which freed objects, and possibly very cold on cpu currently owning slab.

Adding a prefetch of cache line containing the pointer to next object in slab_alloc() helps a lot in many workloads, in particular on asymmetric ones (allocations done on one cpu, frees on other cpus). Added cost is three machine instructions only.

Examples on my dual socket quad core ht machine (Intel CPU E5540 @2.53GHz) (16 logical cpus, 2 memory nodes), 64bit kernel.

Before patch :

# perf stat -r 32 hackbench 50 process 4000 >/dev/null

 Performance counter stats for 'hackbench 50 process 4000' (32 runs):

     327577,471718 task-clock                #   15,821 CPUs utilized            ( +-  0,64% )
        28 866 491 context-switches          #    0,088 M/sec                    ( +-  1,80% )
         1 506 929 CPU-migrations            #    0,005 M/sec                    ( +-  3,24% )
           127 151 page-faults               #    0,000 M/sec                    ( +-  0,16% )
   829 399 813 448 cycles                    #    2,532 GHz                      ( +-  0,64% )
   580 664 691 740 stalled-cycles-frontend   #   70,01% frontend cycles idle     ( +-  0,71% )
   197 431 700 448 stalled-cycles-backend    #   23,80% backend cycles idle      ( +-  1,03% )
   503 548 648 975 instructions              #    0,61  insns per cycle
                                             #    1,15  stalled cycles per insn  ( +-  0,46% )
    95 780 068 471 branches                  #  292,389 M/sec                    ( +-  0,48% )
     1 426 407 916 branch-misses             #    1,49% of all branches          ( +-  1,35% )

      20,705679994 seconds time elapsed                                          ( +-  0,64% )

After patch :

# perf stat -r 32 hackbench 50 process 4000 >/dev/null

 Performance counter stats for 'hackbench 50 process 4000' (32 runs):

     286236,542804 task-clock                #   15,786 CPUs utilized            ( +-  1,32% )
        19 703 372 context-switches          #    0,069 M/sec                    ( +-  4,99% )
         1 658 249 CPU-migrations            #    0,006 M/sec                    ( +-  6,62% )
           126 776 page-faults               #    0,000 M/sec                    ( +-  0,12% )
   724 636 593 213 cycles                    #    2,532 GHz                      ( +-  1,32% )
   499 320 714 837 stalled-cycles-frontend   #   68,91% frontend cycles idle     ( +-  1,47% )
   156 555 126 809 stalled-cycles-backend    #   21,60% backend cycles idle      ( +-  2,22% )
   463 897 792 661 instructions              #    0,64  insns per cycle
                                             #    1,08  stalled cycles per insn  ( +-  0,94% )
    87 717 352 563 branches                  #  306,451 M/sec                    ( +-  0,99% )
       941 738 280 branch-misses             #    1,07% of all branches          ( +-  3,35% )

      18,132070670 seconds time elapsed                                          ( +-  1,30% )

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Christoph Lameter <cl@linux.com>
CC: Matt Mackall <mpm@selenic.com>
CC: David Rientjes <rientjes@google.com>
CC: "Alex,Shi" <alex.shi@intel.com>
CC: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
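
The idea described in the commit message can be illustrated with a small user-space sketch. This is not the kernel patch itself: `toy_cache`, `toy_obj`, `toy_alloc`, `toy_free` and `toy_demo` are hypothetical names invented for this example, the free list is single-threaded, and `__builtin_prefetch` (a GCC/Clang builtin) is assumed as a stand-in for the kernel's prefetch helper. The point it shows is only the ordering: after popping an object, the allocator issues a prefetch for the cache line holding the link pointer of the object that will be handed out next.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object layout: payload first, free-list link at the end. */
struct toy_obj {
    char payload[56];
    void *next_free;            /* meaningful only while the object is free */
};

/* Hypothetical stand-in for a slab cache; not the kernel's kmem_cache. */
struct toy_cache {
    void *freelist;             /* most recently freed object, or NULL */
    size_t offset;              /* byte offset of the link inside an object */
};

/* Read the link pointer stored inside a free object. */
static void *get_freepointer(const struct toy_cache *c, void *object)
{
    return *(void **)((char *)object + c->offset);
}

/* Store the link pointer inside a free object. */
static void set_freepointer(const struct toy_cache *c, void *object, void *next)
{
    *(void **)((char *)object + c->offset) = next;
}

/* Hint the CPU to fetch the cache line holding the object's link pointer. */
static void prefetch_freepointer(const struct toy_cache *c, void *object)
{
    if (object)
        __builtin_prefetch((char *)object + c->offset);
}

/* Pop the head of the free list, then prefetch the new head's link. */
static void *toy_alloc(struct toy_cache *c)
{
    void *object = c->freelist;
    void *next;

    if (!object)
        return NULL;
    next = get_freepointer(c, object);
    c->freelist = next;
    /* The next allocation will read next's link; warm that line now. */
    prefetch_freepointer(c, next);
    return object;
}

/* Push an object back onto the head of the free list. */
static void toy_free(struct toy_cache *c, void *object)
{
    assert(object != NULL);
    set_freepointer(c, object, c->freelist);
    c->freelist = object;
}

/* Demo: free four objects, then count how many can be allocated back. */
int toy_demo(void)
{
    static struct toy_obj pool[4];
    struct toy_cache c = { NULL, offsetof(struct toy_obj, next_free) };
    int i, n = 0;

    for (i = 0; i < 4; i++)
        toy_free(&c, &pool[i]);
    while (toy_alloc(&c))
        n++;
    return n;
}
```

The prefetch is only a hint and does not change what the allocator returns, so any benefit would show up in counters such as cycles and stalled cycles, as in the measurements quoted in the commit message, rather than in functional behaviour.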
..
backing-dev.c	freezer: implement and use kthread_freezable_should_stop()	2011-11-21 12:32:23 -08:00
bootmem.c	mm: bootmem: try harder to free pages in bulk	2012-01-10 16:30:45 -08:00
bounce.c	Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux	2011-11-06 19:44:47 -08:00
cleancache.c	mm: cleancache core ops functions and config	2011-05-26 10:01:36 -06:00
compaction.c	mm: compaction: introduce sync-light migration for use by compaction	2012-01-12 20:13:09 -08:00
debug-pagealloc.c	mm, x86: Remove debug_pagealloc_enabled	2011-12-06 09:24:07 +01:00
dmapool.c	mm: fix implicit stat.h usage in dmapool.c	2011-10-31 09:20:12 -04:00
fadvise.c	fadvise: only initiate writeback for specified range with FADV_DONTNEED	2012-01-10 16:30:43 -08:00
failslab.c	switch debugfs to umode_t	2012-01-03 22:54:56 -05:00
filemap_xip.c	mm: Map most files to use export.h instead of module.h	2011-10-31 09:20:12 -04:00
filemap.c	memcg: add mem_cgroup_replace_page_cache() to fix LRU issue	2012-01-12 20:13:04 -08:00
fremap.c	mm: delete various needless include <linux/module.h>	2011-10-31 09:20:11 -04:00
highmem.c	Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux	2011-11-06 19:44:47 -08:00
huge_memory.c	memcg: fix split_huge_page_refcounts()	2012-01-12 20:13:09 -08:00
hugetlb.c	mm/hugetlb.c: avoid bogus counter of surplus huge page	2012-01-10 16:30:45 -08:00
hwpoison-inject.c	Fix common misspellings	2011-03-31 11:26:23 -03:00
init-mm.c	atomic: use <linux/atomic.h>	2011-07-26 16:49:47 -07:00
internal.h	mm: thp: tail page refcounting fix	2011-11-02 16:06:57 -07:00
Kconfig	Merge branch 'master' into x86/memblock	2011-11-28 09:46:22 -08:00
Kconfig.debug	mm: more intensive memory corruption debugging	2012-01-10 16:30:42 -08:00
kmemcheck.c	kmemcheck: Fix build errors due to missing slab.h	2010-03-30 22:02:32 +09:00
kmemleak-test.c	kmemleak: remove memset by using kzalloc	2011-01-27 18:31:51 +00:00
kmemleak.c	kmemleak: Add support for memory hotplug	2011-12-02 16:12:42 +00:00
ksm.c	memcg: clear pc->mem_cgroup if necessary.	2012-01-12 20:13:07 -08:00
maccess.c	mm: Map most files to use export.h instead of module.h	2011-10-31 09:20:12 -04:00
madvise.c	fs: kill i_alloc_sem	2011-07-20 20:47:46 -04:00
Makefile	Cross Memory Attach	2011-10-31 17:30:44 -07:00
memblock.c	memblock: Reimplement memblock allocation using reverse free area iterator	2011-12-08 10:22:09 -08:00
memcontrol.c	net: move sock_update_memcg outside of CONFIG_INET	2012-01-17 10:15:45 -05:00
memory_hotplug.c	mm: compaction: introduce sync-light migration for use by compaction	2012-01-12 20:13:09 -08:00
memory-failure.c	mm: compaction: introduce sync-light migration for use by compaction	2012-01-12 20:13:09 -08:00
memory.c	thp: add tlb_remove_pmd_tlb_entry	2012-01-12 20:13:08 -08:00
mempolicy.c	mm: compaction: introduce sync-light migration for use by compaction	2012-01-12 20:13:09 -08:00
mempool.c	mempool: fix first round failure behavior	2012-01-10 16:30:45 -08:00
migrate.c	mm: compaction: introduce sync-light migration for use by compaction	2012-01-12 20:13:09 -08:00
mincore.c	mm: clarify the radix_tree exceptional cases	2011-08-03 14:25:24 -10:00
mlock.c	Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux	2011-11-06 19:44:47 -08:00
mm_init.c	mm: Map most files to use export.h instead of module.h	2011-10-31 09:20:12 -04:00
mmap.c	mm: simplify find_vma_prev()	2012-01-10 16:30:44 -08:00
mmu_context.c	mm: Map most files to use export.h instead of module.h	2011-10-31 09:20:12 -04:00
mmu_notifier.c	mm: Map most files to use export.h instead of module.h	2011-10-31 09:20:12 -04:00
mmzone.c	mm: delete various needless include <linux/module.h>	2011-10-31 09:20:11 -04:00
mprotect.c	thp: mprotect: transparent huge page support	2011-01-13 17:32:44 -08:00
mremap.c	mremap: enforce rmap src/dst vma ordering in case of vma_merge() succeeding in copy_vma()	2012-01-10 16:30:44 -08:00
msync.c	sanitize vfs_fsync calling conventions	2010-05-21 18:31:21 -04:00
nobootmem.c	Merge branch 'master' into x86/memblock	2011-11-28 09:46:22 -08:00
nommu.c	xen: map foreign pages for shared rings by updating the PTEs directly	2011-11-16 12:13:08 -05:00
oom_kill.c	mm: unify remaining mem_cont, mem, etc. variable names to memcg	2012-01-12 20:13:06 -08:00
page_alloc.c	mm: enum lru_list lru	2012-01-12 20:13:10 -08:00
page_cgroup.c	page_cgroup: drop multi CONFIG_MEMORY_HOTPLUG	2012-01-12 20:13:08 -08:00
page_io.c	block: kill off REQ_UNPLUG	2011-03-10 08:52:27 +01:00
page_isolation.c	mm: page_isolation: codeclean fix comment and rm unneeded val init	2010-10-26 16:52:11 -07:00
page-writeback.c	Merge branch 'writeback-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/wfg/linux	2012-01-10 16:59:59 -08:00
pagewalk.c	pagewalk: fix code comment for THP	2011-07-25 20:57:09 -07:00
percpu-km.c	percpu: clear memory allocated with the km allocator	2010-10-02 10:28:42 +03:00
percpu-vm.c	percpu: fix chunk range calculation	2011-11-22 08:09:46 -08:00
percpu.c	Kmemleak patches	2012-01-14 18:11:11 -08:00
pgtable-generic.c	mm/pgtable-generic.c: fix CONFIG_SWAP=n build	2011-01-26 10:49:58 +10:00
prio_tree.c	sanitize <linux/prefetch.h> usage	2011-05-20 12:50:29 -07:00
process_vm_access.c	Cross Memory Attach	2011-10-31 17:30:44 -07:00
quicklist.c	mm: delete various needless include <linux/module.h>	2011-10-31 09:20:11 -04:00
readahead.c	mm: Map most files to use export.h instead of module.h	2011-10-31 09:20:12 -04:00
rmap.c	mm: unify remaining mem_cont, mem, etc. variable names to memcg	2012-01-12 20:13:06 -08:00
shmem.c	vfs: switch ->show_options() to struct dentry *	2012-01-06 23:19:54 -05:00
slab.c	slab, cleanup: remove unneeded return	2012-01-23 15:32:26 +02:00
slob.c	mm: Map most files to use export.h instead of module.h	2011-10-31 09:20:12 -04:00
slub.c	slub: prefetch next freelist pointer in slab_alloc()	2012-01-24 21:53:57 +02:00
sparse-vmemmap.c	mm: delete various needless include <linux/module.h>	2011-10-31 09:20:11 -04:00
sparse.c	mm: Map most files to use export.h instead of module.h	2011-10-31 09:20:12 -04:00
swap_state.c	memcg: clear pc->mem_cgroup if necessary.	2012-01-12 20:13:07 -08:00
swap.c	mm: remove del_page_from_lru, add page_off_lru	2012-01-12 20:13:10 -08:00
swapfile.c	mm: unify remaining mem_cont, mem, etc. variable names to memcg	2012-01-12 20:13:06 -08:00
thrash.c	mm/thrash.c: quiet sparse noise	2011-10-31 17:30:50 -07:00
truncate.c	mm: Map most files to use export.h instead of module.h	2011-10-31 09:20:12 -04:00
util.c	mm: Map most files to use export.h instead of module.h	2011-10-31 09:20:12 -04:00
vmalloc.c	mm/vmalloc.c: eliminate extra loop in pcpu_get_vm_areas error path	2012-01-12 20:13:10 -08:00
vmscan.c	mm: rearrange putback_inactive_pages	2012-01-12 20:13:10 -08:00
vmstat.c	mm,x86,um: move CMPXCHG_LOCAL config option	2012-01-12 20:13:03 -08:00