linux/mm
Huang Ying f6ab1f7f6b mm, swap: use offset of swap entry as key of swap cache
This patch is to improve the performance of swap cache operations when
the type of the swap device is not 0.  Originally, the whole swap entry
value is used as the key of the swap cache, even though there is one
radix tree for each swap device.  If the type of the swap device is not
0, the height of the radix tree of the swap cache will be increased
unnecessary, especially on 64bit architecture.  For example, for a 1GB
swap device on the x86_64 architecture, the height of the radix tree of
the swap cache is 11.  But if the offset of the swap entry is used as
the key of the swap cache, the height of the radix tree of the swap
cache is 4.  The increased height causes unnecessary radix tree
descending and increased cache footprint.

This patch reduces the height of the radix tree of the swap cache via
using the offset of the swap entry instead of the whole swap entry value
as the key of the swap cache.  In 32 processes sequential swap out test
case on a Xeon E5 v3 system with RAM disk as swap, the lock contention
for the spinlock of the swap cache is reduced from 20.15% to 12.19%,
when the type of the swap device is 1.

Use the whole swap entry as key,

  perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.__add_to_swap_cache.add_to_swap_cache.add_to_swap.shrink_page_list: 10.37,
  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.__remove_mapping.shrink_page_list.shrink_inactive_list.shrink_node_memcg: 9.78,

Use the swap offset as key,

  perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.__add_to_swap_cache.add_to_swap_cache.add_to_swap.shrink_page_list: 6.25,
  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.__remove_mapping.shrink_page_list.shrink_inactive_list.shrink_node_memcg: 5.94,

Link: http://lkml.kernel.org/r/1473270649-27229-1-git-send-email-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-07 18:46:28 -07:00
..
kasan kasan: remove the unnecessary WARN_ONCE from quarantine.c 2016-08-11 16:58:14 -07:00
backing-dev.c block: fix bdi vs gendisk lifetime mismatch 2016-08-04 14:19:16 -06:00
balloon_compaction.c mm: balloon: use general non-lru movable page feature 2016-07-26 16:19:19 -07:00
bootmem.c mm/bootmem.c: replace kzalloc() by kzalloc_node() 2016-10-07 18:46:28 -07:00
cleancache.c cleancache: constify cleancache_ops structure 2016-01-27 09:09:57 -05:00
cma_debug.c
cma.c mm/cma: silence warnings due to max() usage 2016-05-27 14:49:37 -07:00
cma.h
compaction.c mm, compaction: require only min watermarks for non-costly orders 2016-10-07 18:46:27 -07:00
debug_page_ref.c mm/page_ref: add tracepoint to track down page reference manipulation 2016-03-17 15:09:34 -07:00
debug.c mm: avoid endless recursion in dump_page() 2016-09-19 15:36:16 -07:00
dmapool.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
early_ioremap.c
fadvise.c mm/fadvise.c: do not discard partial pages with POSIX_FADV_DONTNEED 2016-06-09 14:23:11 -07:00
failslab.c mm: fault-inject take over bootstrap kmem_cache check 2016-03-15 16:55:16 -07:00
filemap.c do_generic_file_read(): fail immediately if killed 2016-10-07 18:46:27 -07:00
frame_vector.c mm/gup: Switch all callers of get_user_pages() to not pass tsk/mm 2016-02-16 10:11:12 +01:00
frontswap.c mm, frontswap: convert frontswap_enabled to static key 2016-07-26 16:19:19 -07:00
gup.c - ARM: GICv3 ITS emulation and various fixes. Removal of the old 2016-08-02 16:11:27 -04:00
highmem.c mm/highmem: make nr_free_highpages() handles all highmem zones by itself 2016-05-19 19:12:14 -07:00
huge_memory.c thp: reduce usage of huge zero page's atomic counter 2016-10-07 18:46:28 -07:00
hugetlb_cgroup.c mm, hugetlb_cgroup: round limit_in_bytes down to hugepage size 2016-05-20 17:58:30 -07:00
hugetlb.c mm/hugetlb: fix incorrect hugepages count during mem hotplug 2016-08-11 16:58:13 -07:00
hwpoison-inject.c
init-mm.c
internal.h mm, compaction: make whole_zone flag ignore cached scanner positions 2016-10-07 18:46:27 -07:00
interval_tree.c
Kconfig mm: clarify COMPACTION Kconfig text 2016-08-26 17:39:35 -07:00
Kconfig.debug PM / Hibernate: allow hibernation with PAGE_POISONING_ZERO 2016-09-13 02:35:27 +02:00
khugepaged.c mm, thp: fix leaking mapped pte in __collapse_huge_page_swapin() 2016-09-19 15:36:16 -07:00
kmemcheck.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
kmemleak-test.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
kmemleak.c kmemleak: don't hang if user disables scanning early 2016-07-28 16:07:41 -07:00
ksm.c mm,ksm: fix endless looping in allocating memory when ksm enable 2016-09-28 16:19:01 -07:00
list_lru.c mm: memcontrol: move kmem accounting code to CONFIG_MEMCG 2016-01-20 17:09:18 -08:00
maccess.c x86: remove more uaccess_32.h complexity 2016-05-22 17:21:27 -07:00
madvise.c mm: make mmap_sem for write waits killable for mm syscalls 2016-05-23 17:04:14 -07:00
Makefile Implements HARDENED_USERCOPY verification of copy_to_user/copy_from_user 2016-08-08 14:48:14 -07:00
memblock.c mm/memblock.c: expose total reserved memory 2016-10-07 18:46:28 -07:00
memcontrol.c mm, swap: use offset of swap entry as key of swap cache 2016-10-07 18:46:28 -07:00
memory_hotplug.c mem-hotplug: use nodes that contain memory as mask in new_node_page() 2016-09-28 16:19:02 -07:00
memory-failure.c mm: hwpoison: remove incorrect comments 2016-07-28 16:07:41 -07:00
memory.c mm: fix cache mode tracking in vm_insert_mixed() 2016-10-07 18:46:28 -07:00
mempolicy.c mm: use zonelist name instead of using hardcoded index 2016-10-07 18:46:28 -07:00
mempool.c Revert "mm, mempool: only set __GFP_NOMEMALLOC if there are free elements" 2016-07-28 16:07:41 -07:00
memtest.c
migrate.c mm, thp: remove __GFP_NORETRY from khugepaged and madvised allocations 2016-07-28 16:07:41 -07:00
mincore.c mm, swap: use offset of swap entry as key of swap cache 2016-10-07 18:46:28 -07:00
mlock.c mm: mlock: avoid increase mm->locked_vm on mlock() when already mlock2(,MLOCK_ONFAULT) 2016-10-07 18:46:28 -07:00
mm_init.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
mmap.c Merge branch 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-10-03 17:29:01 -07:00
mmu_context.c mm/mmu_context, sched/core: Fix mmu_context.h assumption 2016-04-28 11:44:19 +02:00
mmu_notifier.c fix Christoph's email addresses 2016-03-17 15:09:34 -07:00
mmzone.c mm, page_alloc: inline the fast path of the zonelist iterator 2016-05-19 19:12:14 -07:00
mprotect.c mm: thp: check pmd_trans_unstable() after split_huge_pmd() 2016-07-26 16:19:19 -07:00
mremap.c mm: thp: check pmd_trans_unstable() after split_huge_pmd() 2016-07-26 16:19:19 -07:00
msync.c
nobootmem.c mm/nobootmem.c: remove duplicate macro ARCH_LOW_ADDRESS_LIMIT statements 2016-10-07 18:46:28 -07:00
nommu.c mm: introduce fault_env 2016-07-26 16:19:19 -07:00
oom_kill.c oom: warn if we go OOM for higher order and compaction is disabled 2016-10-07 18:46:28 -07:00
page_alloc.c mm: introduce arch_reserved_kernel_pages() 2016-10-07 18:46:28 -07:00
page_counter.c mm: page_counter: let page_counter_try_charge() return bool 2015-11-05 19:34:48 -08:00
page_ext.c mm/page_ext: support extra space allocation by page_ext user 2016-10-07 18:46:27 -07:00
page_idle.c mm, vmscan: move lru_lock to the node 2016-07-28 16:07:41 -07:00
page_io.c mm: fix the page_swap_info() BUG_ON check 2016-09-19 15:36:17 -07:00
page_isolation.c mm/page_isolation: clean up confused code 2016-07-26 16:19:19 -07:00
page_owner.c mm/page_owner: don't define fields on struct page_ext by hard-coding 2016-10-07 18:46:27 -07:00
page_poison.c mm: check the return value of lookup_page_ext for all call sites 2016-06-03 15:06:22 -07:00
page-writeback.c mm: don't use radix tree writeback tags for pages in swap cache 2016-10-07 18:46:28 -07:00
pagewalk.c thp: rename split_huge_page_pmd() to split_huge_pmd() 2016-01-15 17:56:32 -08:00
percpu-km.c mm: percpu: use pr_fmt to prefix output 2016-03-17 15:09:34 -07:00
percpu-vm.c
percpu.c percpu: fix synchronization between synchronous map extension and chunk destruction 2016-05-25 11:48:25 -04:00
pgtable-generic.c mm/thp/migration: switch from flush_tlb_range to flush_pmd_tlb_range 2016-03-17 15:09:34 -07:00
process_vm_access.c mm/gup: Introduce get_user_pages_remote() 2016-02-16 10:04:09 +01:00
quicklist.c fix Christoph's email addresses 2016-03-17 15:09:34 -07:00
readahead.c mm: silently skip readahead for DAX inodes 2016-08-26 17:39:35 -07:00
rmap.c rmap: fix compound check logic in page_remove_file_rmap 2016-08-10 16:40:56 -07:00
shmem.c huge tmpfs: fix Committed_AS leak 2016-09-24 11:20:01 -07:00
slab_common.c mm: charge/uncharge kmemcg from generic page allocator paths 2016-07-26 16:19:19 -07:00
slab.c slab: Convert to hotplug state machine 2016-09-06 18:30:20 +02:00
slab.h mm, kasan: switch SLUB to stackdepot, enable memory quarantine for SLUB 2016-07-28 16:07:41 -07:00
slob.c mm: slab: free kmem_cache_node after destroy sysfs file 2016-02-18 16:23:24 -08:00
slub.c slub: Convert to hotplug state machine 2016-09-06 18:30:20 +02:00
sparse-vmemmap.c treewide: replace obsolete _refok by __ref 2016-08-02 17:31:41 -04:00
sparse.c treewide: replace obsolete _refok by __ref 2016-08-02 17:31:41 -04:00
swap_cgroup.c mm: convert printk(KERN_<LEVEL> to pr_<level> 2016-03-17 15:09:34 -07:00
swap_state.c mm, swap: use offset of swap entry as key of swap cache 2016-10-07 18:46:28 -07:00
swap.c thp: reduce usage of huge zero page's atomic counter 2016-10-07 18:46:28 -07:00
swapfile.c mm, swap: use offset of swap entry as key of swap cache 2016-10-07 18:46:28 -07:00
truncate.c truncate: handle file thp 2016-07-26 16:19:19 -07:00
usercopy.c mm: usercopy: Check for module addresses 2016-09-20 16:07:39 -07:00
userfaultfd.c mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
util.c mm: move most file-based accounting to the node 2016-07-28 16:07:41 -07:00
vmacache.c mm: unrig VMA cache hit ratio 2016-10-07 18:46:27 -07:00
vmalloc.c mm/vmalloc.c: fix align value calculation error 2016-10-07 18:46:26 -07:00
vmpressure.c mm/vmpressure.c: fix subtree pressure detection 2016-02-03 08:28:43 -08:00
vmscan.c mm: use zonelist name instead of using hardcoded index 2016-10-07 18:46:28 -07:00
vmstat.c cpu: fix node state for whether it contains CPU 2016-10-07 18:46:28 -07:00
workingset.c mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page() 2016-09-30 15:26:52 -07:00
z3fold.c mm/z3fold.c: avoid modifying HEADLESS page and minor cleanup 2016-06-03 16:02:55 -07:00
zbud.c mm/zbud.c: use list_last_entry() instead of list_tail_entry() 2016-01-15 11:40:52 -08:00
zpool.c mm: zsmalloc: constify struct zs_pool name 2015-11-06 17:50:42 -08:00
zsmalloc.c zsmalloc: Delete an unnecessary check before the function call "iput" 2016-07-28 16:07:41 -07:00
zswap.c mm/zswap: use workqueue to destroy pool 2016-05-20 17:58:30 -07:00