Changes in this cycle were:

 - Add the "ratelimit:N" parameter to the split_lock_detect= boot option,
   to rate-limit the generation of bus-lock exceptions. This is both
   easier on system resources and kinder to offending applications than
   the current policy of outright killing them.

 - Document the split-lock detection feature and its parameters.

Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge tag 'x86-splitlock-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 splitlock updates from Ingo Molnar:

 - Add the "ratelimit:N" parameter to the split_lock_detect= boot
   option, to rate-limit the generation of bus-lock exceptions.

   This is both easier on system resources and kinder to offending
   applications than the current policy of outright killing them.

 - Document the split-lock detection feature and its parameters.

* tag 'x86-splitlock-2021-06-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Documentation/x86: Add ratelimit in buslock.rst
  Documentation/admin-guide: Add bus lock ratelimit
  x86/bus_lock: Set rate limit for bus lock
  Documentation/x86: Add buslock.rst
Linus Torvalds 2021-06-28 13:30:02 -07:00
commit 1b1cf8fe99
4 changed files with 175 additions and 2 deletions

@@ -5278,6 +5278,14 @@
exception. Default behavior is by #AC if
both features are enabled in hardware.
ratelimit:N -
Set system wide rate limit to N bus locks
per second for bus lock detection.
0 < N <= 1000.
N/A for split lock detection.
If an #AC exception is hit in the kernel or in
firmware (i.e. not while executing in user mode)
the kernel will oops in either "warn" or "fatal"

@@ -0,0 +1,126 @@
.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>
===============================
Bus lock detection and handling
===============================
:Copyright: |copy| 2021 Intel Corporation
:Authors: - Fenghua Yu <fenghua.yu@intel.com>
- Tony Luck <tony.luck@intel.com>
Problem
=======
A split lock is any atomic operation whose operand crosses two cache lines.
Since the operand spans two cache lines and the operation must be atomic,
the system locks the bus while the CPU accesses the two cache lines.
A bus lock is acquired through either split locked access to writeback (WB)
memory or any locked access to non-WB memory. This is typically thousands of
cycles slower than an atomic operation within a cache line. It also disrupts
performance on other cores and brings the whole system to its knees.
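The definition above can be illustrated with a small check. This is a minimal userspace sketch, not kernel code: the helper name crosses_cache_line() is hypothetical, and the 64-byte cache line size is an assumption that holds for typical current x86 CPUs but is not stated in this document.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines (typical on current x86 CPUs). */
#define CACHE_LINE_SIZE 64u

/*
 * Hypothetical helper: returns true if an operand of `size` bytes
 * starting at `addr` spans two cache lines. A locked access to such
 * an operand would be a split lock.
 */
static bool crosses_cache_line(uintptr_t addr, size_t size)
{
	assert(size > 0);

	/* Compare the cache line index of the first and last byte. */
	return addr / CACHE_LINE_SIZE != (addr + size - 1) / CACHE_LINE_SIZE;
}
```

For example, an 8-byte operand at offset 60 within a line occupies bytes 60 to 67 and therefore crosses into the next line, while the same operand at offset 56 does not.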
Detection
=========
Intel processors may support either or both of the following hardware
mechanisms to detect split locks and bus locks.
#AC exception for split lock detection
--------------------------------------
Beginning with the Tremont Atom CPU, an attempted split lock operation may
raise an Alignment Check (#AC) exception.
#DB exception for bus lock detection
------------------------------------
Some CPUs have the ability to notify the kernel by an #DB trap after a user
instruction acquires a bus lock and is executed. This allows the kernel to
terminate the application or to enforce throttling.
Software handling
=================
The kernel #AC and #DB handlers handle bus locks according to the kernel
parameter "split_lock_detect". The options are summarized below:
+------------------+----------------------------+-----------------------+
|split_lock_detect=|#AC for split lock          |#DB for bus lock       |
+------------------+----------------------------+-----------------------+
|off               |Do nothing                  |Do nothing             |
+------------------+----------------------------+-----------------------+
|warn              |Kernel OOPs                 |Warn once per task and |
|(default)         |Warn once per task and      |continue to run.       |
|                  |disable future checking     |                       |
|                  |When both features are      |                       |
|                  |supported, warn in #AC      |                       |
+------------------+----------------------------+-----------------------+
|fatal             |Kernel OOPs                 |Send SIGBUS to user.   |
|                  |Send SIGBUS to user         |                       |
|                  |When both features are      |                       |
|                  |supported, fatal in #AC     |                       |
+------------------+----------------------------+-----------------------+
|ratelimit:N       |Do nothing                  |Limit bus lock rate to |
|(0 < N <= 1000)   |                            |N bus locks per second |
|                  |                            |system wide and warn on|
|                  |                            |bus locks.             |
+------------------+----------------------------+-----------------------+
Usages
======
Detecting and handling bus locks is useful in several areas:
It is critical for designers of consolidated real time systems. These
systems run hard real time code on some cores and "untrusted" user
processes on other cores. The hard real time code cannot afford to have
bus locks from the untrusted processes hurt real time performance. To date
the designers have been unable to deploy these solutions because they had
no way to prevent the "untrusted" user code from generating split locks and
bus locks, which block the hard real time code from accessing memory for
the duration of the bus lock.
It is also useful in general computing to prevent guests or user
applications from slowing down the overall system by executing instructions
that take a bus lock.
Guidance
========
off
---
Disable checking for split lock and bus lock. This option can be useful if
there are legacy applications that trigger these events at a low rate so
that mitigation is not needed.
warn
----
A warning is emitted when a bus lock is detected, which makes it possible
to identify the offending application. This is the default behavior.
fatal
-----
In this case, the bus lock is not tolerated and the process is killed.
ratelimit
---------
A system wide bus lock rate limit N is specified where 0 < N <= 1000. This
allows a bus lock rate up to N bus locks per second. When the bus lock rate
is exceeded then any task which is caught via the buslock #DB exception is
throttled by enforced sleeps until the rate goes under the limit again.
This is an effective mitigation in cases where a minimal impact can be
tolerated, but a potential Denial of Service attack has to be prevented. It
also makes it possible to identify the offending processes and analyze
whether they are malicious or just badly written.
Selecting a rate limit of 1000 allows the bus to be locked for up to about
seven million cycles each second (assuming 7000 cycles for each bus
lock). On a 2 GHz processor that would be about 0.35% system slowdown.
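The 0.35% figure follows from 1000 locks/sec * 7000 cycles/lock = 7,000,000 cycles/sec, divided by the 2,000,000,000 cycles/sec of a 2 GHz processor. As a minimal sketch, the same arithmetic for other rate limits could be written as below. The function name bus_lock_slowdown() is hypothetical, and the cycle cost per bus lock is an input assumption (7000 in the text above), not a measured value.

```c
#include <assert.h>

/*
 * Hypothetical helper: fraction of each second during which the bus is
 * locked, given a rate limit, an assumed cost per bus lock in cycles,
 * and the CPU frequency in Hz.
 */
static double bus_lock_slowdown(unsigned int locks_per_sec,
				double cycles_per_lock, double cpu_hz)
{
	assert(cpu_hz > 0.0);

	return locks_per_sec * cycles_per_lock / cpu_hz;
}
```

With the values from the text, bus_lock_slowdown(1000, 7000.0, 2e9) evaluates to 0.0035, i.e. 0.35%. A lower limit such as ratelimit:100 would scale the result down proportionally under the same assumptions.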

@@ -29,6 +29,7 @@ x86-specific Documentation
microcode
resctrl
tsx_async_abort
buslock
usb-legacy-support
i386/index
x86_64/index

@@ -10,6 +10,7 @@
 #include <linux/thread_info.h>
 #include <linux/init.h>
 #include <linux/uaccess.h>
+#include <linux/delay.h>

 #include <asm/cpufeature.h>
 #include <asm/msr.h>
@@ -41,6 +42,7 @@ enum split_lock_detect_state {
 	sld_off = 0,
 	sld_warn,
 	sld_fatal,
+	sld_ratelimit,
 };

 /*
@@ -999,13 +1001,30 @@ static const struct {
 	{ "off", sld_off },
 	{ "warn", sld_warn },
 	{ "fatal", sld_fatal },
+	{ "ratelimit:", sld_ratelimit },
 };

+static struct ratelimit_state bld_ratelimit;
+
 static inline bool match_option(const char *arg, int arglen, const char *opt)
 {
-	int len = strlen(opt);
+	int len = strlen(opt), ratelimit;

-	return len == arglen && !strncmp(arg, opt, len);
+	if (strncmp(arg, opt, len))
+		return false;
+
+	/*
+	 * Min ratelimit is 1 bus lock/sec.
+	 * Max ratelimit is 1000 bus locks/sec.
+	 */
+	if (sscanf(arg, "ratelimit:%d", &ratelimit) == 1 &&
+	    ratelimit > 0 && ratelimit <= 1000) {
+		ratelimit_state_init(&bld_ratelimit, HZ, ratelimit);
+		ratelimit_set_flags(&bld_ratelimit, RATELIMIT_MSG_ON_RELEASE);
+		return true;
+	}
+
+	return len == arglen;
 }

 static bool split_lock_verify_msr(bool on)
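The bounds check added to match_option() can be tried outside the kernel. The following is a minimal userspace sketch, assuming the C library sscanf() in place of the kernel's own implementation. The helper name parse_ratelimit() is hypothetical and is not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical userspace helper mirroring the range check in
 * match_option(): accept "ratelimit:N" only for 0 < N <= 1000.
 * On success the parsed value is stored in *limit.
 */
static bool parse_ratelimit(const char *arg, int *limit)
{
	int n;

	assert(arg != NULL && limit != NULL);

	/* sscanf() returns 1 only if the prefix matched and N was parsed. */
	if (sscanf(arg, "ratelimit:%d", &n) != 1)
		return false;

	if (n <= 0 || n > 1000)
		return false;

	*limit = n;
	return true;
}
```

In this sketch "ratelimit:100" is accepted, while "ratelimit:0", "ratelimit:1001" and "warn" are rejected. Because %d stops at the first non-digit, the sketch does not reject trailing characters after the number.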
@@ -1084,6 +1103,15 @@ static void sld_update_msr(bool on)
 static void split_lock_init(void)
 {
+	/*
+	 * #DB for bus lock handles ratelimit and #AC for split lock is
+	 * disabled.
+	 */
+	if (sld_state == sld_ratelimit) {
+		split_lock_verify_msr(false);
+		return;
+	}
+
 	if (cpu_model_supports_sld)
 		split_lock_verify_msr(sld_state != sld_off);
 }
@@ -1156,6 +1184,12 @@ void handle_bus_lock(struct pt_regs *regs)
 	switch (sld_state) {
 	case sld_off:
 		break;
+	case sld_ratelimit:
+		/* Enforce no more than bld_ratelimit bus locks/sec. */
+		while (!__ratelimit(&bld_ratelimit))
+			msleep(20);
+		/* Warn on the bus lock. */
+		fallthrough;
 	case sld_warn:
 		pr_warn_ratelimited("#DB: %s/%d took a bus_lock trap at address: 0x%lx\n",
 				    current->comm, current->pid, regs->ip);
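The sld_ratelimit case above sleeps in 20 ms steps until the limiter admits the task again. The idea can be sketched in userspace with a simple fixed-window counter. This is an illustration only: struct window_limit and window_limit_allow() are hypothetical names, and the logic is a simplified stand-in for the kernel's __ratelimit(), not a copy of it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical fixed-window limiter: at most `burst` events per `interval` ticks. */
struct window_limit {
	long window_start;	/* tick at which the current window began */
	long interval;		/* window length in ticks */
	int burst;		/* events allowed per window */
	int used;		/* events admitted in the current window */
};

/* Returns true if an event at tick `now` is within the limit. */
static bool window_limit_allow(struct window_limit *wl, long now)
{
	assert(wl->interval > 0);

	/* Start a fresh window once the current one has elapsed. */
	if (now - wl->window_start >= wl->interval) {
		wl->window_start = now;
		wl->used = 0;
	}

	if (wl->used < wl->burst) {
		wl->used++;
		return true;
	}

	return false;
}
```

A caller that gets false would sleep briefly and retry, which is the role msleep(20) plays in the loop above. The offending task is delayed rather than killed, and once admitted it falls through to the warning.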
@@ -1261,6 +1295,10 @@ static void sld_state_show(void)
 				" from non-WB" : "");
 		}
 		break;
+	case sld_ratelimit:
+		if (boot_cpu_has(X86_FEATURE_BUS_LOCK_DETECT))
+			pr_info("#DB: setting system wide bus lock rate limit to %u/sec\n", bld_ratelimit.burst);
+		break;
 	}
 }