linux/include
Mike Kravetz 2938396771 hugetlb: add per-hstate mutex to synchronize user adjustments
The helper routine hstate_next_node_to_alloc accesses and modifies the
hstate variable next_nid_to_alloc.  The helper is used by the routines
alloc_pool_huge_page and adjust_pool_surplus.  adjust_pool_surplus is
called with hugetlb_lock held.  However, alloc_pool_huge_page cannot be
called with hugetlb_lock held because it calls the page allocator.
Two instances of alloc_pool_huge_page could run in parallel, or
alloc_pool_huge_page could run in parallel with adjust_pool_surplus.
Either case may leave next_nid_to_alloc invalid for the caller and
cause pages to be allocated on the wrong node.

Both alloc_pool_huge_page and adjust_pool_surplus are only called from
the routine set_max_huge_pages after boot.  set_max_huge_pages is only
called as the result of a user writing to the proc/sysfs nr_hugepages
or nr_hugepages_mempolicy file to adjust the number of hugetlb pages.

It makes little sense to allow multiple adjustments to the number of
hugetlb pages in parallel.  Add a mutex to the hstate and use it to allow
only one hugetlb page adjustment at a time.  This will synchronize
modifications to the next_nid_to_alloc variable.
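
The serialization described above can be illustrated with a small user-space
sketch.  It is not the kernel change itself: struct hstate_sketch, NR_NODES,
resize_lock, next_node_to_alloc and grow_pool are hypothetical names chosen
for illustration, and a pthread mutex stands in for the kernel mutex.  The
point is only that the round-robin node cursor is read and advanced while the
per-hstate mutex is held, so two adjustments cannot interleave.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NR_NODES 4  /* hypothetical number of NUMA nodes */

/* Hypothetical stand-in for the kernel's struct hstate. */
struct hstate_sketch {
	pthread_mutex_t resize_lock;  /* assumed name; serializes user adjustments */
	int next_nid_to_alloc;        /* round-robin cursor protected by resize_lock */
	unsigned long nr_pages_per_node[NR_NODES];
};

static void hstate_sketch_init(struct hstate_sketch *h)
{
	pthread_mutex_init(&h->resize_lock, NULL);
	h->next_nid_to_alloc = 0;
	for (size_t i = 0; i < NR_NODES; i++)
		h->nr_pages_per_node[i] = 0;
}

/* Return the node to use and advance the cursor; caller holds resize_lock. */
static int next_node_to_alloc(struct hstate_sketch *h)
{
	int nid = h->next_nid_to_alloc;

	assert(nid >= 0 && nid < NR_NODES);
	h->next_nid_to_alloc = (nid + 1) % NR_NODES;
	return nid;
}

/*
 * Model of one user adjustment: the whole operation runs under the mutex,
 * so only one adjustment touches next_nid_to_alloc at a time.  A sleeping
 * lock is used because the real allocation path may sleep.
 */
static void grow_pool(struct hstate_sketch *h, unsigned long count)
{
	pthread_mutex_lock(&h->resize_lock);
	for (unsigned long i = 0; i < count; i++) {
		int nid = next_node_to_alloc(h);

		h->nr_pages_per_node[nid]++;  /* stands in for a real allocation */
	}
	pthread_mutex_unlock(&h->resize_lock);
}
```

Calling grow_pool from several threads on the same hstate_sketch distributes
pages round-robin across nodes without two callers observing the same cursor
value.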

Link: https://lkml.kernel.org/r/20210409205254.242291-4-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>
Cc: Barry Song <song.bao.hua@hisilicon.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Hillf Danton <hdanton@sina.com>
Cc: HORIGUCHI NAOYA <naoya.horiguchi@nec.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-05 11:27:22 -07:00
..
acpi Merge branches 'acpi-cppc', 'acpi-video' and 'acpi-utils' 2021-04-26 17:04:27 +02:00
asm-generic - removed get_fs/set_fs 2021-04-29 11:28:08 -07:00
clocksource ARM: platform support for Apple M1 2021-04-26 12:30:36 -07:00
crypto
drm Merge drm/drm-fixes into drm-next 2021-04-13 23:15:09 +02:00
dt-bindings Here's a collection of largely clk driver updates for the merge window. The 2021-04-28 17:13:56 -07:00
keys Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2021-04-26 08:51:23 -07:00
kunit kunit: fix -Wunused-function warning for __kunit_fail_current_test 2021-04-06 15:22:39 -06:00
kvm
linux hugetlb: add per-hstate mutex to synchronize user adjustments 2021-05-05 11:27:22 -07:00
math-emu
media media updates for v5.13-rc1 2021-04-28 09:24:36 -07:00
memory
misc
net net: page_pool: use alloc_pages_bulk in refill code path 2021-04-30 11:20:43 -07:00
pcmcia
ras
rdma
scsi SCSI misc on 20210428 2021-04-28 17:22:10 -07:00
soc Networking changes for 5.13. 2021-04-29 11:57:23 -07:00
sound
target
trace mm, tracing: improve rss_stat tracepoint message 2021-04-30 11:20:39 -07:00
uapi Networking changes for 5.13. 2021-04-29 11:57:23 -07:00
vdso time64.h: Consolidated PSEC_PER_SEC definition 2021-04-06 16:32:17 -07:00
video
xen xen/arm: introduce XENFEAT_direct_mapped and XENFEAT_not_direct_mapped 2021-04-23 11:33:50 +02:00