linux/Documentation
Vlastimil Babka ad59baa316 slab, rust: extend kmalloc() alignment guarantees to remove Rust padding
Slab allocators have been guaranteeing natural alignment for
power-of-two sizes since commit 59bb47985c ("mm, sl[aou]b: guarantee
natural alignment for kmalloc(power-of-two)"), while any other sizes are
guaranteed to be aligned only to ARCH_KMALLOC_MINALIGN bytes (although
in practice are aligned more than that in non-debug scenarios).

Rust's allocator API specifies size and alignment per allocation, which
have to satisfy the following rules, per Alice Ryhl [1]:

  1. The alignment is a power of two.
  2. The size is non-zero.
  3. When you round up the size to the next multiple of the alignment,
     then it must not overflow the signed type isize / ssize_t.

In order to map this to kmalloc()'s guarantees, some requested
allocation sizes have to be padded to the next power-of-two size [2].
For example, an allocation of size 96 and alignment of 32 will be padded
to an allocation of size 128, because the existing kmalloc-96 bucket
doesn't guarantee alignent above ARCH_KMALLOC_MINALIGN. Without slab
debugging active, the layout of the kmalloc-96 slabs however naturally
align the objects to 32 bytes, so extending the size to 128 bytes is
wasteful.

To improve the situation we can extend the kmalloc() alignment
guarantees in a way that

1) doesn't change the current slab layout (and thus does not increase
   internal fragmentation) when slab debugging is not active
2) reduces waste in the Rust allocator use case
3) is a superset of the current guarantee for power-of-two sizes.

The extended guarantee is that alignment is at least the largest
power-of-two divisor of the requested size. For power-of-two sizes the
largest divisor is the size itself, but let's keep this case documented
separately for clarity.

For current kmalloc size buckets, it means kmalloc-96 will guarantee
alignment of 32 bytes and kmalloc-196 will guarantee 64 bytes.

This covers the rules 1 and 2 above of Rust's API as long as the size is
a multiple of the alignment. The Rust layer should now only need to
round up the size to the next multiple if it isn't, while enforcing the
rule 3.

Implementation-wise, this changes the alignment calculation in
create_boot_cache(). While at it also do the calulation only for caches
with the SLAB_KMALLOC flag, because the function is also used to create
the initial kmem_cache and kmem_cache_node caches, where no alignment
guarantee is necessary.

In the Rust allocator's krealloc_aligned(), remove the code that padded
sizes to the next power of two (suggested by Alice Ryhl) as it's no
longer necessary with the new guarantees.

Reported-by: Alice Ryhl <aliceryhl@google.com>
Reported-by: Boqun Feng <boqun.feng@gmail.com>
Link: https://lore.kernel.org/all/CAH5fLggjrbdUuT-H-5vbQfMazjRDpp2%2Bk3%3DYhPyS17ezEqxwcw@mail.gmail.com/ [1]
Link: https://lore.kernel.org/all/CAH5fLghsZRemYUwVvhk77o6y1foqnCeDzW4WZv6ScEWna2+_jw@mail.gmail.com/ [2]
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2024-07-03 12:23:27 +02:00
..
ABI Char/Misc and other driver subsystem changes for 6.10-rc1 2024-05-22 12:26:46 -07:00
accel
accounting
admin-guide TTY/Serial changes for 6.10-rc1 2024-05-22 11:53:02 -07:00
arch arm64 fixes for -rc1 2024-05-23 12:09:22 -07:00
block
bpf bpf, docs: Fix the description of 'src' in ALU instructions 2024-05-15 09:34:54 -07:00
cdrom
core-api slab, rust: extend kmalloc() alignment guarantees to remove Rust padding 2024-07-03 12:23:27 +02:00
cpu-freq
crypto
dev-tools Mainly singleton patches, documented in their respective changelogs. 2024-05-19 14:02:03 -07:00
devicetree RTC for 6.10 2024-05-25 13:33:53 -07:00
doc-guide doc: fix spelling about ReStructured Text 2024-04-02 10:07:51 -06:00
driver-api Char/Misc and other driver subsystem changes for 6.10-rc1 2024-05-22 12:26:46 -07:00
fault-injection
fb
features
filesystems 16 hotfixes, 11 of which are cc:stable. 2024-05-25 15:10:33 -07:00
firmware_class
firmware-guide Documentation: firmware-guide: ACPI: Fix namespace typo 2024-04-26 18:58:13 +02:00
fpga
gpu Merge tag 'drm-misc-next-2024-04-19' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next 2024-04-22 12:29:18 +10:00
hid Merge branch 'for-6.10/intel-ish' into for-linus 2024-05-14 13:53:15 +02:00
hwmon hwmon: (emc1403) Add support for EMC1428 and EMC1438. 2024-05-12 09:02:00 -07:00
i2c
iio docs: iio: ad7944: add documentation for chain mode 2024-04-29 20:53:25 +01:00
images
infiniband
input
isdn
kbuild kbuild: use $(src) instead of $(srctree)/$(src) for source directory 2024-05-10 04:34:52 +09:00
kernel-hacking
leds
litmus-tests Documentation/litmus-tests: Make cmpxchg() tests safe for klitmus 2024-05-06 14:29:21 -07:00
livepatch
locking
maintainer
mhi
misc-devices
mm The usual shower of singleton fixes and minor series all over MM, 2024-05-19 09:21:03 -07:00
netlabel
netlink NFSD 6.10 Release Notes 2024-05-18 14:04:20 -07:00
networking net: revert partially applied PHY topology series 2024-05-13 18:35:02 -07:00
nvdimm
nvme
PCI Merge branch 'pci/enumeration' 2024-05-16 18:14:10 -05:00
pcmcia
peci
power Documentation: PM: Update platform_pci_wakeup_init() reference 2024-03-27 16:22:19 +01:00
process Mainly singleton patches, documented in their respective changelogs. 2024-05-19 14:02:03 -07:00
RCU doc: Remove references to arrayRCU.rst 2024-04-09 15:13:05 +02:00
rust RISC-V Patches for the 6.10 Merge Window, Part 1 2024-05-22 09:56:00 -07:00
scheduler Linux 6.9-rc1 2024-03-25 11:32:29 +01:00
scsi scsi: documentation: Clean up scsi_mid_low_api.rst 2024-04-08 22:10:05 -04:00
security Another not-too-busy cycle for documentation, including: 2024-05-13 10:51:53 -07:00
sound Documentation: sound: Fix trailing whitespaces 2024-05-16 16:00:30 +02:00
sphinx docs: kernel_include.py: Cope with docutils 0.21 2024-05-02 09:50:59 -06:00
sphinx-static
spi spi: pxa2xx: Drop the stale entry in documentation TOC 2024-05-07 23:53:21 +09:00
staging
target
tee Documentation: tee: Add TS-TEE driver 2024-04-03 14:03:09 +02:00
timers sched/isolation: Prevent boot crash when the boot CPU is nohz_full 2024-04-28 10:07:12 +02:00
tools rtla: Documentation: Fix -t, --trace 2024-05-16 16:52:16 +02:00
trace Char/Misc and other driver subsystem changes for 6.10-rc1 2024-05-22 12:26:46 -07:00
translations pci-v6.10-changes 2024-05-21 10:09:28 -07:00
usb
userspace-api mseal: add documentation 2024-05-23 19:40:26 -07:00
virt powerpc updates for 6.10 2024-05-17 09:05:46 -07:00
w1
watchdog
wmi platform/x86: wmi: Add MSI WMI Platform driver 2024-04-29 12:06:21 +02:00
.gitignore
atomic_bitops.txt
atomic_t.txt Documentation/atomic_t: Emphasize that failed atomic operations give no ordering 2024-05-06 14:29:04 -07:00
Changes
CodingStyle
conf.py compiler_types: add Endianness-dependent __counted_by_{le,be} 2024-03-28 18:50:47 -07:00
docutils.conf
dontdiff
index.rst doc: fix spelling about ReStructured Text 2024-04-02 10:07:51 -06:00
Kconfig
Makefile Kbuild updates for v6.10 2024-05-18 12:39:20 -07:00
memory-barriers.txt
SubmittingPatches
subsystem-apis.rst