x86-64: mm: clarify the 'positive addresses' user address rules

Dave Hansen found the "(long) addr >= 0" code in the x86-64 access_ok
checks somewhat confusing, and suggested using a helper to clarify what
the code is doing.

So do exactly that: add a helper macro that makes it clear what the
sign bit check is testing.
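
As a rough illustration of the check, here is a userspace sketch of the
sign test.  The name valid_user_address_demo() and the use of fixed-width
types are assumptions made for this example only; the kernel helper is
the macro in the diff and operates on 'long'.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the kernel helper.
 * Assumes a 64-bit two's complement target where converting an
 * out-of-range uint64_t to int64_t wraps (true for gcc and clang).
 */
static inline bool valid_user_address_demo(uint64_t addr)
{
	/* User half: sign bit clear, so the signed value is >= 0. */
	return (int64_t)addr >= 0;
}
```

Zero and everything up to 0x7fffffffffffffff pass, and anything with
bit 63 set fails.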

This also adds explicit comments about how, even with LAM enabled, any
address with the sign bit set will still GP-fault in the non-canonical
region just above the sign bit.
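
To make the "non-canonical region" concrete, here is a sketch of the
canonicality rule, assuming 4-level paging with 48-bit virtual addresses
and no LAM.  The function name is_canonical_48() is hypothetical and is
not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Sketch only: with 48-bit virtual addresses, an address is canonical
 * when bits 63..47 are all zeros or all ones.
 */
static inline bool is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* the 17 bits 63..47 */

	return top == 0 || top == 0x1ffffULL;
}
```

Under that assumption, an address such as 0x8000000000000000 has the
sign bit set but is not canonical, so an access to it faults instead of
reaching kernel memory.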

This is what allows us to do the user address checks with just the
sign bit, and furthermore be a bit cavalier about accesses that might be
done with an additional offset even past that point.
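
The resulting range check can be sketched in userspace as follows.  The
sketch_ names and SKETCH_PAGE_SIZE are assumptions for illustration, and
the small-size test here is done at run time, whereas the kernel version
only takes that path when the size is a compile-time constant.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumed page size for the sketch */

/* Sign-bit test; assumes uint64_t to int64_t conversion wraps. */
static inline bool sketch_valid_user_address(uint64_t addr)
{
	return (int64_t)addr >= 0;
}

/* Hypothetical userspace model of the access check in the diff. */
static inline bool sketch_access_ok(uint64_t ptr, uint64_t size)
{
	uint64_t sum;

	/* Small sizes: checking the base pointer alone is enough. */
	if (size <= SKETCH_PAGE_SIZE)
		return sketch_valid_user_address(ptr);

	/* Otherwise the end must stay positive and must not wrap. */
	sum = ptr + size;
	return sketch_valid_user_address(sum) && sum >= ptr;
}
```

Note that a small access starting at the very top of the user half is
accepted even though it would cross the sign change.  That is the slop
the message describes: such an access lands in the non-canonical region
and faults.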

(And yes, this talks about 'positive' even though zero is also a valid
user address and so technically we should call them 'non-negative'.  But
I don't think using 'non-negative' ends up being more understandable).

Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds 2023-05-03 10:13:41 -07:00
parent 1dbc0a9515
commit 798dec3304
2 changed files with 33 additions and 15 deletions


@@ -50,27 +50,45 @@ static inline unsigned long __untagged_addr_remote(struct mm_struct *mm,
#endif
/*
* On x86-64, we may have tag bits in the user pointer. Rather than
* mask them off, just change the rules for __access_ok().
* The virtual address space is logically divided into a kernel
* half and a user half. When cast to a signed type, user pointers
* are positive and kernel pointers are negative.
*/
#define valid_user_address(x) ((long)(x) >= 0)
/*
* User pointers can have tag bits on x86-64. This scheme tolerates
* arbitrary values in those bits rather than masking them off.
*
* Make the rule be that 'ptr+size' must not overflow, and must not
* have the high bit set. Compilers generally understand about
* unsigned overflow and the CF bit and generate reasonable code for
* this. Although it looks like the combination confuses at least
* clang (and instead of just doing an "add" followed by a test of
* SF and CF, you'll see that unnecessary comparison).
* Enforce two rules:
* 1. 'ptr' must be in the user half of the address space
* 2. 'ptr+size' must not overflow into kernel addresses
*
* For the common case of small sizes that can be checked at compile
* time, don't even bother with the addition, and just check that the
* base pointer is ok.
* Note that addresses around the sign change are not valid addresses,
* and will GP-fault even with LAM enabled if the sign bit is set (see
* "CR3.LAM_SUP" that can narrow the canonicality check if we ever
* enable it, but not remove it entirely).
*
* So the "overflow into kernel addresses" does not imply some sudden
* exact boundary at the sign bit, and we can allow a lot of slop on the
* size check.
*
* In fact, we could probably remove the size check entirely, since
* any kernel accesses will be in increasing address order starting
* at 'ptr', and even if the end might be in kernel space, we'll
* hit the GP faults for non-canonical accesses before we ever get
* there.
*
* That's a separate optimization, for now just handle the small
* constant case.
*/
static inline bool __access_ok(const void __user *ptr, unsigned long size)
{
if (__builtin_constant_p(size <= PAGE_SIZE) && size <= PAGE_SIZE) {
return (long)ptr >= 0;
return valid_user_address(ptr);
} else {
unsigned long sum = size + (unsigned long)ptr;
return (long) sum >= 0 && sum >= (unsigned long)ptr;
return valid_user_address(sum) && sum >= (unsigned long)ptr;
}
}
#define __access_ok __access_ok


@@ -143,12 +143,12 @@ static bool gp_fault_address_ok(unsigned long fault_address)
{
#ifdef CONFIG_X86_64
/* Is it in the "user space" part of the non-canonical space? */
if ((long) fault_address >= 0)
if (valid_user_address(fault_address))
return true;
/* .. or just above it? */
fault_address -= PAGE_SIZE;
if ((long) fault_address >= 0)
if (valid_user_address(fault_address))
return true;
#endif
return false;