linux/ipc
Andi Kleen 42d7395feb mm: support more pagesizes for MAP_HUGETLB/SHM_HUGETLB
There was some desire in large applications using MAP_HUGETLB or
SHM_HUGETLB to use 1GB huge pages on some mappings, and stay with 2MB on
others.  This is useful together with NUMA policy: use 2MB interleaving
on some mappings, but 1GB on local mappings.

This patch extends the IPC/SHM syscall interfaces slightly to allow
specifying the page size.

It borrows some upper bits in the existing flag arguments and allows
encoding the log of the desired page size in addition to the *_HUGETLB
flag.  When 0 is specified the default size is used, this makes the
change fully compatible.

Extending the internal hugetlb code to handle this is straight forward.
Instead of a single mount it just keeps an array of them and selects the
right mount based on the specified page size.  When no page size is
specified it uses the mount of the default page size.

The change is not visible in /proc/mounts because internal mounts don't
appear there.  It also has very little overhead: the additional mounts
just consume a super block, but not more memory when not used.

I also exported the new flags to the user headers (they were previously
under __KERNEL__).  Right now only symbols for x86 and some other
architecture for 1GB and 2MB are defined.  The interface should already
work for all other architectures though.  Only architectures that define
multiple hugetlb sizes actually need it (that is currently x86, tile,
powerpc).  However tile and powerpc have user configurable hugetlb
sizes, so it's not easy to add defines.  A program on those
architectures would need to query sysfs and use the appropiate log2.

[akpm@linux-foundation.org: cleanups]
[rientjes@google.com: fix build]
[akpm@linux-foundation.org: checkpatch fixes]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Hillf Danton <dhillf@gmail.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-12-11 17:22:25 -08:00
..
compat_mq.c ipc: initialize structure memory to zero for compat functions 2010-10-27 18:03:13 -07:00
compat.c ipc: use Kconfig options for __ARCH_WANT_[COMPAT_]IPC_PARSE_VERSION 2012-07-30 17:25:21 -07:00
ipc_sysctl.c ipc: introduce shm_rmid_forced sysctl 2011-07-26 16:49:44 -07:00
ipcns_notifier.c ipc: do not use a negative value to re-enable msgmni automatic recomputing 2008-07-25 10:53:42 -07:00
Makefile Add generic sys_ipc wrapper 2010-03-12 15:52:32 -08:00
mq_sysctl.c mqueue: separate mqueue default value from maximum value 2012-05-31 17:49:31 -07:00
mqueue.c audit: make audit_inode take struct filename 2012-10-12 20:15:09 -04:00
msg.c userns: Convert ipc to use kuid and kgid where appropriate 2012-09-06 22:17:20 -07:00
msgutil.c security: trim security.h 2012-02-14 10:45:42 +11:00
namespace.c userns: Use cred->user_ns instead of cred->user->user_ns 2012-04-07 16:55:51 -07:00
sem.c userns: Convert ipc to use kuid and kgid where appropriate 2012-09-06 22:17:20 -07:00
shm.c mm: support more pagesizes for MAP_HUGETLB/SHM_HUGETLB 2012-12-11 17:22:25 -08:00
syscall.c ipc: add COMPAT_SHMLBA support 2012-07-30 17:25:20 -07:00
util.c userns: Convert ipc to use kuid and kgid where appropriate 2012-09-06 22:17:20 -07:00
util.h userns: Convert ipc to use kuid and kgid where appropriate 2012-09-06 22:17:20 -07:00