Jakub reports build failures when merging linux/master with net tree:
CXX test_cpp
In file included from <built-in>:454:
<command line>:2:9: error: '_GNU_SOURCE' macro redefined [-Werror,-Wmacro-redefined]
2 | #define _GNU_SOURCE
| ^
<built-in>:445:9: note: previous definition is here
445 | #define _GNU_SOURCE 1
The culprit is commit cc937dad85 ("selftests: centralize -D_GNU_SOURCE= to
CFLAGS in lib.mk") which unconditionally added -D_GNU_SOUCE to CLFAGS.
Apparently clang++ also unconditionally adds it for the C++ targets [0]
which causes a conflict. Add small change in the selftests makefile
to filter it out for test_cpp.
Not sure which tree it should go via, targeting bpf for now, but net
might be better?
0: https://stackoverflow.com/questions/11670581/why-is-gnu-source-defined-by-default-and-how-to-turn-it-off
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240725214029.1760809-1-sdf@fomichev.me
Add test for lsm tail call to ensure tail call can only be used between
bpf lsm progs attached to the same hook.
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Link: https://lore.kernel.org/r/20240719110059.797546-9-xukuohai@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
The return ranges of some bpf lsm test progs can not be deduced by
the verifier accurately. To avoid erroneous rejections, add explicit
return value checks for these progs.
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Link: https://lore.kernel.org/r/20240719110059.797546-8-xukuohai@huaweicloud.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
There is an existing "bpf_tcp_ca/unsupp_cong_op" test to ensure
the unsupported tcp-cc "get_info" struct_ops prog cannot be loaded.
This patch adds a new test in the bpf_testmod such that the
unsupported ops test does not depend on other kernel subsystem
where its supporting ops may be changed in the future.
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240722183049.2254692-4-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
The tramp_1 to tramp_40 ops is not set in the cfi_stubs in the
bpf_testmod_ops. It fails the struct_ops_multi_pages test after
retiring the unsupported_ops in the earlier patch.
This patch initializes them in a loop during the bpf_testmod_init().
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20240722183049.2254692-3-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEoEVH9lhNrxiMPSyI7MXwXhnZSjYFAmanUD8THGJlbnRpc3NA
a2VybmVsLm9yZwAKCRDsxfBeGdlKNnEYD/9CjuuWX862zYmFxMuobRo7IkY+3ic7
MlnWrwf0FFg91zYEd4NjaH/V8ikLHxv5gDc5HYDhX7XDLdHVjgNsa3+eXKMZJsOP
FIQPwQdncncbH3BFkNxh8Ehq832vqAvpa0H+InrUzlmCqfqf/kRClS3LlwJ0wETe
j+/lrN5Lcg3cj78IQiBi/P4V0QNNTrB5aOZfzcdafUVogqM7fqZPE4SWOiZbZNlk
xhB5vGvj9GJAZ/W2c1cmLPfsOsqGrXQuadAMl+0NIqBn3r9PS2eIabFMK3P3CGHc
GW+xGHNJ8Bn2pujCZVamamodJTZG2ZSZ2l4l+f42iOxPXR629U+LhN1b4I69GFTc
lwHH3Ez8+VIkyVVGKRz9Xk6Zwl8+x8YHsAlAX4Kg6JcZbJFam/IZIeL7cZghN6Nh
25eDNo2V3mdZSCTh7KEhTN1LU9IKR8RrE8umuke+C/iRn8t9DpMwnFnkA9OEP6dc
JaKmNFGTwSfIdSaXF4YEUG24NAc32XhfD3PEaQ9dYo56TCLw6EBChubMOkEF/TDy
FNDROAe3y8NcwAWLaQ1ttrOtMcXFyhzLq3gNb7yF4389IswJ10S/7qwxYcJ5fd+s
iXsavwoMZPqO3wQvNqNEPRAnZGNMYZZOtXUBMQcu+UhFh+dj86ygfTSZ9hePQ8MR
d3jZqsgHDP2hiw==
=194H
-----END PGP SIGNATURE-----
Merge tag 'for-linus-2024072901' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Benjamin Tissoires:
- fixes for HID-BPF after the merge with the bpf tree (Arnd Bergmann
and Benjamin Tissoires)
- some tool type fix for the Wacom driver (Tatsunosuke Tobita)
- a reorder of the sensor discovery to ensure the HID AMD SFH is
removed when no sensors are available (Basavaraj Natikar)
* tag 'for-linus-2024072901' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
selftests/hid: add test for attaching multiple time the same struct_ops
HID: bpf: prevent the same struct_ops to be attached more than once
selftests/hid: disable struct_ops auto-attach
selftests/hid: fix bpf_wq new API
HID: amd_sfh: Move sensor discovery before HID device initialization
hid: bpf: add BPF_JIT dependency
HID: wacom: more appropriate tool type categorization
HID: wacom: Modify pen IDs
This commit adds sample output for net attach/detach on
tcx subcommand.
Signed-off-by: Tao Chen <chen.dylane@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Quentin Monnet <qmo@kernel.org>
Link: https://lore.kernel.org/bpf/20240721144252.96264-1-chen.dylane@gmail.com
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
This commit adds bash-completion for attaching tcx program on interface.
Signed-off-by: Tao Chen <chen.dylane@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Quentin Monnet <qmo@kernel.org>
Link: https://lore.kernel.org/bpf/20240721144238.96246-1-chen.dylane@gmail.com
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Now, attach/detach tcx prog supported in libbpf, so we can add new
command 'bpftool attach/detach tcx' to attach tcx prog with bpftool
for user.
# bpftool prog load tc_prog.bpf.o /sys/fs/bpf/tc_prog
# bpftool prog show
...
192: sched_cls name tc_prog tag 187aeb611ad00cfc gpl
loaded_at 2024-07-11T15:58:16+0800 uid 0
xlated 152B jited 97B memlock 4096B map_ids 100,99,97
btf_id 260
# bpftool net attach tcx_ingress name tc_prog dev lo
# bpftool net
...
tc:
lo(1) tcx/ingress tc_prog prog_id 29
# bpftool net detach tcx_ingress dev lo
# bpftool net
...
tc:
# bpftool net attach tcx_ingress name tc_prog dev lo
# bpftool net
tc:
lo(1) tcx/ingress tc_prog prog_id 29
Test environment: ubuntu_22_04, 6.7.0-060700-generic
Signed-off-by: Tao Chen <chen.dylane@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Quentin Monnet <qmo@kernel.org>
Link: https://lore.kernel.org/bpf/20240721144221.96228-1-chen.dylane@gmail.com
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
This commit no logical changed, just increases code readability and
facilitates TCX prog expansion, which will be implemented in the next
patch.
Signed-off-by: Tao Chen <chen.dylane@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Quentin Monnet <qmo@kernel.org>
Link: https://lore.kernel.org/bpf/20240721143353.95980-2-chen.dylane@gmail.com
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
find_equal_scalars() is renamed to sync_linked_regs(),
this commit updates existing references in the selftests comments.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240718202357.1746514-5-eddyz87@gmail.com
Add a few test cases to verify precision tracking for scalars gaining
range because of sync_linked_regs():
- check what happens when more than 6 registers might gain range in
sync_linked_regs();
- check if precision is propagated correctly when operand of
conditional jump gained range in sync_linked_regs() and one of
linked registers is marked precise;
- check if precision is propagated correctly when operand of
conditional jump gained range in sync_linked_regs() and a
other-linked operand of the conditional jump is marked precise;
- add a minimized reproducer for precision tracking bug reported in [0];
- Check that mark_chain_precision() for one of the conditional jump
operands does not trigger equal scalars precision propagation.
[0] https://lore.kernel.org/bpf/CAEf4BzZ0xidVCqB47XnkXcNhkPWF6_nTV7yt+_Lf0kcFEut2Mg@mail.gmail.com/
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240718202357.1746514-4-eddyz87@gmail.com
Function mark_precise_scalar_ids() is superseded by
bt_sync_linked_regs() and equal scalars tracking in jump history.
mark_precise_scalar_ids() propagates precision over registers sharing
same ID on parent/child state boundaries, while jump history records
allow bt_sync_linked_regs() to propagate same information with
instruction level granularity, which is strictly more precise.
This commit removes mark_precise_scalar_ids() and updates test cases
in progs/verifier_scalar_ids to reflect new verifier behavior.
The tests are updated in the following manner:
- mark_precise_scalar_ids() propagated precision regardless of
presence of conditional jumps, while new jump history based logic
only kicks in when conditional jumps are present.
Hence test cases are augmented with conditional jumps to still
trigger precision propagation.
- As equal scalars tracking no longer relies on parent/child state
boundaries some test cases are no longer interesting,
such test cases are removed, namely:
- precision_same_state and precision_cross_state are superseded by
linked_regs_bpf_k;
- precision_same_state_broken_link and equal_scalars_broken_link
are superseded by linked_regs_broken_link.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240718202357.1746514-3-eddyz87@gmail.com
Use bpf_verifier_state->jmp_history to track which registers were
updated by find_equal_scalars() (renamed to collect_linked_regs())
when conditional jump was verified. Use recorded information in
backtrack_insn() to propagate precision.
E.g. for the following program:
while verifying instructions
1: r1 = r0 |
2: if r1 < 8 goto ... | push r0,r1 as linked registers in jmp_history
3: if r0 > 16 goto ... | push r0,r1 as linked registers in jmp_history
4: r2 = r10 |
5: r2 += r0 v mark_chain_precision(r0)
while doing mark_chain_precision(r0)
5: r2 += r0 | mark r0 precise
4: r2 = r10 |
3: if r0 > 16 goto ... | mark r0,r1 as precise
2: if r1 < 8 goto ... | mark r0,r1 as precise
1: r1 = r0 v
Technically, do this as follows:
- Use 10 bits to identify each register that gains range because of
sync_linked_regs():
- 3 bits for frame number;
- 6 bits for register or stack slot number;
- 1 bit to indicate if register is spilled.
- Use u64 as a vector of 6 such records + 4 bits for vector length.
- Augment struct bpf_jmp_history_entry with a field 'linked_regs'
representing such vector.
- When doing check_cond_jmp_op() remember up to 6 registers that
gain range because of sync_linked_regs() in such a vector.
- Don't propagate range information and reset IDs for registers that
don't fit in 6-value vector.
- Push a pair {instruction index, linked registers vector}
to bpf_verifier_state->jmp_history.
- When doing backtrack_insn() check if any of recorded linked
registers is currently marked precise, if so mark all linked
registers as precise.
This also requires fixes for two test_verifier tests:
- precise: test 1
- precise: test 2
Both tests contain the following instruction sequence:
19: (bf) r2 = r9 ; R2=scalar(id=3) R9=scalar(id=3)
20: (a5) if r2 < 0x8 goto pc+1 ; R2=scalar(id=3,umin=8)
21: (95) exit
22: (07) r2 += 1 ; R2_w=scalar(id=3+1,...)
23: (bf) r1 = r10 ; R1_w=fp0 R10=fp0
24: (07) r1 += -8 ; R1_w=fp-8
25: (b7) r3 = 0 ; R3_w=0
26: (85) call bpf_probe_read_kernel#113
The call to bpf_probe_read_kernel() at (26) forces r2 to be precise.
Previously, this forced all registers with same id to become precise
immediately when mark_chain_precision() is called.
After this change, the precision is propagated to registers sharing
same id only when 'if' instruction is backtracked.
Hence verification log for both tests is changed:
regs=r2,r9 -> regs=r2 for instructions 25..20.
Fixes: 904e6ddf41 ("bpf: Use scalar ids in mark_chain_precision()")
Reported-by: Hao Sun <sunhao.th@gmail.com>
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240718202357.1746514-2-eddyz87@gmail.com
Closes: https://lore.kernel.org/bpf/CAEf4BzZ0xidVCqB47XnkXcNhkPWF6_nTV7yt+_Lf0kcFEut2Mg@mail.gmail.com/
Make use of -M compiler options when building .test.o objects to
generate .d files and avoid re-building all tests every time.
Previously, if a single test bpf program under selftests/bpf/progs/*.c
has changed, make would rebuild all the *.bpf.o, *.skel.h and *.test.o
objects, which is a lot of unnecessary work.
A typical dependency chain is:
progs/x.c -> x.bpf.o -> x.skel.h -> x.test.o -> trunner_binary
However for many tests it's not a 1:1 mapping by name, and so far
%.test.o have been simply dependent on all %.skel.h files, and
%.skel.h files on all %.bpf.o objects.
Avoid full rebuilds by instructing the compiler (via -MMD) to
produce *.d files with real dependencies, and appropriately including
them. Exploit make feature that rebuilds included makefiles if they
were changed by setting %.test.d as prerequisite for %.test.o files.
A couple of examples of compilation time speedup (after the first
clean build):
$ touch progs/verifier_and.c && time make -j8
Before: real 0m16.651s
After: real 0m2.245s
$ touch progs/read_vsyscall.c && time make -j8
Before: real 0m15.743s
After: real 0m1.575s
A drawback of this change is that now there is an overhead due to make
processing lots of .d files, which potentially may slow down unrelated
targets. However a time to make all from scratch hasn't changed
significantly:
$ make clean && time make -j8
Before: real 1m31.148s
After: real 1m30.309s
Suggested-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/VJihUTnvtwEgv_mOnpfy7EgD9D2MPNoHO-MlANeLIzLJPGhDeyOuGKIYyKgk0O6KPjfM-MuhtvPwZcngN8WFqbTnTRyCSMc2aMZ1ODm1T_g=@pm.me
Similar to connect_to_addr() helper for connecting to a server with the
given sockaddr_storage type address, this patch adds a new helper named
connect_to_addr_str() for connecting to a server with the given string
type address "addr_str", together with its "family" and "port" as other
parameters of connect_to_addr_str().
In connect_to_addr_str(), the parameters "family", "addr_str" and "port"
are used to create a sockaddr_storage type address "addr" by invoking
make_sockaddr(). Then pass this "addr" together with "addrlen", "type"
and "opts" to connect_to_addr().
Suggested-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/647e82170831558dbde132a7a3d86df660dba2c4.1721282219.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
The struct member "must_fail" of network_helper_opts() is only used in
cgroup_v1v2 tests, it makes sense to drop it from network_helper_opts.
Return value (fd) of connect_to_fd_opts() and the expect errno (EPERM)
can be checked in cgroup_v1v2.c directly, no need to check them in
connect_fd_to_addr() in network_helpers.c.
This also makes connect_fd_to_addr() function useless. It can be replaced
by connect().
Suggested-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/3faf336019a9a48e2e8951f4cdebf19e3ac6e441.1721282219.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
The "type" parameter of connect_to_fd_opts() is redundant of "server_fd".
Since the "type" can be obtained inside by invoking getsockopt(SO_TYPE),
without passing it in as a parameter.
This patch drops the "type" parameter of connect_to_fd_opts() and updates
its callers.
Suggested-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/50d8ce7ab7ab0c0f4d211fc7cc4ebe3d3f63424c.1721282219.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
In main_loop_s function, when the open(cfg_input, O_RDONLY) function is
run, the last fd is not closed if the "--cfg_repeat > 0" branch is not
taken.
Fixes: 05be5e273c ("selftests: mptcp: add disconnect tests")
Cc: stable@vger.kernel.org
Signed-off-by: Liu Jing <liujing@cmss.chinamobile.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
pm_nl_check_endpoint() currently calls an not existing helper
to mark the test as failed. Fix the wrong call.
Fixes: 03668c65d1 ("selftests: mptcp: join: rework detailed report")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Delete and re-create a signal endpoint and ensure that the PM
actually deletes and re-create the subflow.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We had a handful of bugs relating to key being either all 0
or just reported incorrectly as all 0. Check for this in
the tests.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Even if a vgem device is configured in, we will skip the import_vgem_fd()
test almost every time.
TAP version 13
1..11
# Testing heap: system
# =======================================
# Testing allocation and importing:
ok 1 # SKIP Could not open vgem -1
The problem is that we use the DRM_IOCTL_VERSION ioctl to query the driver
version information but leave the name field a non-null-terminated string.
Terminate it properly to actually test against the vgem device.
While at it, let's check the length of the driver name is exactly 4 bytes
and return early otherwise (in case there is a name like "vgemfoo" that
gets converted to "vgem\0" unexpectedly).
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20240729024604.2046-1-yuzenghui@huawei.com
Fix compile error introduced by commit d27c34a735 ("KVM: riscv:
selftests: Add some Zc* extensions to get-reg-list test"). These
4 lines should be end with ";".
Fixes: d27c34a735 ("KVM: riscv: selftests: Add some Zc* extensions to get-reg-list test")
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Link: https://lore.kernel.org/r/20240726084931.28924-5-yongxuan.wang@sifive.com
Signed-off-by: Anup Patel <anup@brainfault.org>
This just standardizes the use of MIN() and MAX() macros, with the very
traditional semantics. The goal is to use these for C constant
expressions and for top-level / static initializers, and so be able to
simplify the min()/max() macros.
These macro names were used by various kernel code - they are very
traditional, after all - and all such users have been fixed up, with a
few different approaches:
- trivial duplicated macro definitions have been removed
Note that 'trivial' here means that it's obviously kernel code that
already included all the major kernel headers, and thus gets the new
generic MIN/MAX macros automatically.
- non-trivial duplicated macro definitions are guarded with #ifndef
This is the "yes, they define their own versions, but no, the include
situation is not entirely obvious, and maybe they don't get the
generic version automatically" case.
- strange use case #1
A couple of drivers decided that the way they want to describe their
versioning is with
#define MAJ 1
#define MIN 2
#define DRV_VERSION __stringify(MAJ) "." __stringify(MIN)
which adds zero value and I just did my Alexander the Great
impersonation, and rewrote that pointless Gordian knot as
#define DRV_VERSION "1.2"
instead.
- strange use case #2
A couple of drivers thought that it's a good idea to have a random
'MIN' or 'MAX' define for a value or index into a table, rather than
the traditional macro that takes arguments.
These values were re-written as C enum's instead. The new
function-line macros only expand when followed by an open
parenthesis, and thus don't clash with enum use.
Happily, there weren't really all that many of these cases, and a lot of
users already had the pattern of using '#ifndef' guarding (or in one
case just using '#undef MIN') before defining their own private version
that does the same thing. I left such cases alone.
Cc: David Laight <David.Laight@aculab.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Enable turbostat extensions to add both perf and PMT
(Intel Platform Monitoring Technology) counters via the cmdline.
Demonstrate PMT access with built-in support for Meteor Lake's Die%c6 counter.
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEE67dNfPFP+XUaA73mB9BFOha3NhcFAmaj7c8UHGxlbi5icm93
bkBpbnRlbC5jb20ACgkQB9BFOha3Nhe6XBAArHsMOw7r1dF94yctBukD94szLasz
9BGI64NNVYz4pUlM0BUayZzr0kYxMonLvKMHcg1XaeCF4DRByUyhM86QfPAUVskx
qEY3raRu18wTlfku/90jYm5AM09Dp846zOf4dtrV2Io/JM8TLqo9gAsHtZI5Qaiu
bR9nPL4vjysnIiUG5aowBhYBnI9xvAecxbID/qfOofZMGyb6h06xlXPkDkD2KTkL
F/5Bv7AWohAQ7cX8EEg865eQCea4YDnQtcZg4yYRMkA78bVfE1WzCgA+qjb1dJ0A
2bbL6m7cBvikWkii82VwYWzPLbJTlQDIpQRKxRQP4SvsRc0NinZmS5d4AO9I63h3
JfIjtj7NEhzyPnaykJqcsgyOI3lBFbcEOUckutj0M8S3B6UJSC9WQnU2D6sGuQcW
iwav+zbsEkxL3ebYkOLuTEGLhqJFy6ZNxvPVAh+Q63jcBxraZoHVqG1VbmtqslnE
fSrhZ/hJIqo1F25X7DsMjyk7txDQYed9g/EArQWBb+DiL/ggvaO+FT/TGE3hCGCQ
dt2J1+hhshHesQIhF4/9sj/wHrw8RA08BVqGWzAOvwx69wevvSszSQrGU5ISypTv
9JfeOXGvc72ZQzC8MaotQfyHmxIIJ5ZQ6wFAcThc1Og39OU2+CbfIC5tLIqRoLR6
sEWH07ty9KpN0aQ=
=wsSy
-----END PGP SIGNATURE-----
Merge tag 'v6.11-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull turbostat updates from Len Brown:
- Enable turbostat extensions to add both perf and PMT (Intel
Platform Monitoring Technology) counters via the cmdline
- Demonstrate PMT access with built-in support for Meteor Lake's
Die C6 counter
* tag 'v6.11-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
tools/power turbostat: version 2024.07.26
tools/power turbostat: Include umask=%x in perf counter's config
tools/power turbostat: Document PMT in turbostat.8
tools/power turbostat: Add MTL's PMT DC6 builtin counter
tools/power turbostat: Add early support for PMT counters
tools/power turbostat: Add selftests for added perf counters
tools/power turbostat: Add selftests for SMI, APERF and MPERF counters
tools/power turbostat: Move verbose counter messages to level 2
tools/power turbostat: Move debug prints from stdout to stderr
tools/power turbostat: Fix typo in turbostat.8
tools/power turbostat: Add perf added counter example to turbostat.8
tools/power turbostat: Fix formatting in turbostat.8
tools/power turbostat: Extend --add option with perf counters
tools/power turbostat: Group SMI counter with APERF and MPERF
tools/power turbostat: Add ZERO_ARRAY for zero initializing builtin array
tools/power turbostat: Replace enum rapl_source and cstate_source with counter_source
tools/power turbostat: Remove anonymous union from rapl_counter_info_t
tools/power/turbostat: Switch to new Intel CPU model defines
New Changes:
- Refactor to a common struct for DRAM and general media CXL events
- Add abstract distance calculation support for CXL
- Add CXL maturity map documentation to detail current state of CXL enabling
- Add warning on mixed CXL VH and RCH/RCD hierachy to inform unsupported config
- Replace ENXIO with EBUSY for inject poison limit reached via debugfs
- Replace ENXIO with EBUSY for inject poison cxl-test support
- XOR math fixup for DPA to SPA translation. Current math works for MODULO arithmetic
where HPA==SPA, however not for XOR decode.
- Move pci config read in cxl_dvsec_rr_decode() to avoid unnecessary acess
Fixes:
- Add a fix to address race condition in CXL memory hotplug notifier
- Add missing MODULE_DESCRIPTION() for CXL modules
- Fix incorrect vendor debug UUID define
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE5DAy15EJMCV1R6v9YGjFFmlTOEoFAmahJiMACgkQYGjFFmlT
OEq8URAArcnzmH9yLvgE2pFOtaKg34vIGDWZGC4R1LpTnFEea04FuJslmxEKNgWo
DJgPt9VZ66ump/oIvzcbvgLl/yMCTbnSxt5U6J8G5EmpO50PvxOTeWnEgAYVa0NH
Diuzk/aF4GA94T3w+iAOzYx2N36kF+ezsY3/kqSORT7MC+DipSSUaPUiJcjr6FC6
/ZIwkhhRi51ONJ8IgaXD+oEU9kxx7WUEyZoQZrJ9bv8/fGbeEfqy04pz2xDKHmLD
rlQjm3l9um67VMsCvZ62Ce14HXqM213jZ3l0FmYjO4GbdXd2+0ZmIRNAb5vvTG9n
5cY8vNsL6fND9FKkxlcRSdzI/O/vV+gcU+jzJxiul0p5fWHh/gaYjVH7fFq3dYc+
vYE5lr97BfyA61bdmylIc2xwDH4yNKVQLZZPVTz5XTxfzBjYCjLPb5vGQKfg/nrB
N66wjCIWLfCH6DqusUXem1c6BSrrjob8MwXpg00eBE0AA4ihieiy5fxuApnv9mI2
f809AXRV1k24s5upStZ9iGZSEILBBqiw/KwDyWfRvxjNz36Z1Q2eiXBwbHrVQHBa
PFtRPPFsZ9+ouIG/8otFaLwDQdITRdA0+drG8lmJ+gs8239Z3eIMMS0+CYdLDbva
S8vo4POOQSS+cVUjLkC9zIxwPaXq96TLIkCtiLI9xUx5eIzv4K0=
=HaEG
-----END PGP SIGNATURE-----
Merge tag 'cxl-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull CXL updates from Dave Jiang:
"Core:
- A CXL maturity map has been added to the documentation to detail
the current state of CXL enabling.
It provides the status of the current state of various CXL features
to inform current and future contributors of where things are and
which areas need contribution.
- A notifier handler has been added in order for a newly created CXL
memory region to trigger the abstract distance metrics calculation.
This should bring parity for CXL memory to the same level vs
hotplugged DRAM for NUMA abstract distance calculation. The
abstract distance reflects relative performance used for memory
tiering handling.
- An addition for XOR math has been added to address the CXL DPA to
SPA translation.
CXL address translation did not support address interleave math
with XOR prior to this change.
Fixes:
- Fix to address race condition in the CXL memory hotplug notifier
- Add missing MODULE_DESCRIPTION() for CXL modules
- Fix incorrect vendor debug UUID define
Misc:
- A warning has been added to inform users of an unsupported
configuration when mixing CXL VH and RCH/RCD hierarchies
- The ENXIO error code has been replaced with EBUSY for inject poison
limit reached via debugfs and cxl-test support
- Moving the PCI config read in cxl_dvsec_rr_decode() to avoid
unnecessary PCI config reads
- A refactor to a common struct for DRAM and general media CXL
events"
* tag 'cxl-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
cxl/core/pci: Move reading of control register to immediately before usage
cxl: Remove defunct code calculating host bridge target positions
cxl/region: Verify target positions using the ordered target list
cxl: Restore XOR'd position bits during address translation
cxl/core: Fold cxl_trace_hpa() into cxl_dpa_to_hpa()
cxl/test: Replace ENXIO with EBUSY for inject poison limit reached
cxl/memdev: Replace ENXIO with EBUSY for inject poison limit reached
cxl/acpi: Warn on mixed CXL VH and RCH/RCD Hierarchy
cxl/core: Fix incorrect vendor debug UUID define
Documentation: CXL Maturity Map
cxl/region: Simplify cxl_region_nid()
cxl/region: Support to calculate memory tier abstract distance
cxl/region: Fix a race condition in memory hotplug notifier
cxl: add missing MODULE_DESCRIPTION() macros
cxl/events: Use a common struct for DRAM and General Media events
-----BEGIN PGP SIGNATURE-----
iIYEABYKAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCZqFEchAcbWljQGRpZ2lr
b2QubmV0AAoJEOXj0OiMgvbSULcBAPEV5Viu/zox2FdS87EGTqWxEQJcBRvc3ahj
MQk44WtMAP4o2CnwrOoMyZXeq9npteL5lQsVhEzeI+p8oN9C9bThBg==
=zizo
-----END PGP SIGNATURE-----
Merge tag 'landlock-6.11-rc1-houdini-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull landlock fix from Mickaël Salaün:
"Jann Horn reported a sandbox bypass for Landlock. This includes the
fix and new tests. This should be backported"
* tag 'landlock-6.11-rc1-houdini-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
selftests/landlock: Add cred_transfer test
landlock: Don't lose track of restrictions on cred_transfer
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZqQWWQAKCRDdBJ7gKXxA
jqJVAP9vU9HNzIyKDOOqoNHKMI+VzGn39w1FihWjG6AU5a+9NQD+MZJwr7bBwkpH
ii43HLUGvNRQtsldBZSRypsaitCSwAI=
=HGce
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2024-07-26-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc hotfixes from Andrew Morton:
"11 hotfixes, 7 of which are cc:stable. 7 are MM, 4 are other"
* tag 'mm-hotfixes-stable-2024-07-26-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
nilfs2: handle inconsistent state in nilfs_btnode_create_block()
selftests/mm: skip test for non-LPA2 and non-LVA systems
mm/page_alloc: fix pcp->count race between drain_pages_zone() vs __rmqueue_pcplist()
mm: memcg: add cacheline padding after lruvec in mem_cgroup_per_node
alloc_tag: outline and export free_reserved_page()
decompress_bunzip2: fix rare decompression failure
mm/huge_memory: avoid PMD-size page cache if needed
mm: huge_memory: use !CONFIG_64BIT to relax huge page alignment on 32 bit machines
mm: fix old/young bit handling in the faulting path
dt-bindings: arm: update James Clark's email address
MAINTAINERS: mailmap: update James Clark's email address
Post my improvement of the test in e4a4ba4154 ("selftests/mm:
va_high_addr_switch: dynamically initialize testcases to enable LPA2
testing"):
The test begins to fail on 4k and 16k pages, on non-LPA2 systems. To
reduce noise in the CI systems, let us skip the test when higher address
space is not implemented.
Link: https://lkml.kernel.org/r/20240718052504.356517-1-dev.jain@arm.com
Fixes: e4a4ba4154 ("selftests/mm: va_high_addr_switch: dynamically initialize testcases to enable LPA2 testing")
Signed-off-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Release 2024.07.26:
Enable turbostat extensions to add both perf and PMT
(Intel Platform Monitoring Technology) counters from the cmdline.
Demonstrate PMT access with built-in support for Meteor Lake's Die%c6 counter.
This commit:
Clean up white-space nits introduced since version 2024.05.10
Signed-off-by: Len Brown <len.brown@intel.com>
Some counters, like cpu/cache-misses/, expose and require umask=%x
parameter alongside event=%x in the sysfs perf counter's event file.
This change make sure we parse and use it when opening user added
counters.
Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Add a general description of the user interface for adding PMT
counters with the new --add pmt,... option.
Provide a complete example for requesting two counters.
Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Provide a definition for metadata that allows reading DC6 residency
counter via PMT and exposes it as a builtin counter.
Note that this residency counter is updated and read via
entirely different mechanisms vs the MSR-based residency counters.
On MTL processors, there are times when Die%c6 will report above 100%.
This is still useful, but don't expect 3 digits of precision...
Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Allows users to read Intel PMT (Platform Monitoring Technology)
counters, providing interface similar to one used to add MSR and perf
counters. Because PMT is exposed as a raw MMIO range, without metadata,
user has to supply the necessary information to find and correctly
display the requested counter.
Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Records the commands for cross compilation with two methods.
The first method relies on Multiarch. The second approach is to explicitly
specify the PKG_CONFIG variables, which is widely used in build system
(like Buildroot, Yocto, etc).
Co-developed-by: James Clark <james.clark@arm.com>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: amadio@gentoo.org
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240717082211.524826-7-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
When build static perf, Makefile reports the error:
Makefile.config:480: No libdw DWARF unwind found, Please install
elfutils-devel/libdw-dev >= 0.158 and/or set LIBDW_DIR
The libdw has been installed on the system, but the build system fails
to build the feature detecting binary 'test-libdw-dwarf-unwind'. The
failure is caused by missing to link the lib 'zstd'.
Link lib 'zstd' for the static build, in the end, the dwarf feature can
be enabled in the static perf.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: amadio@gentoo.org
Cc: James Clark <james.clark@linaro.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240717082211.524826-6-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
The libunwind feature test failed with the static linkage. This is due
to the 'lzma' lib is missed, so link it to dismiss building failure.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: amadio@gentoo.org
Cc: James Clark <james.clark@linaro.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240717082211.524826-5-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Since libdw version 0.177, elfutils has merged libebl.a into libdw (see
the commit "libebl: Don't install libebl.a, libebl.h and remove backends
from spec." in the elfutils repository).
As a result, libebl.a does not exist on Debian Bullseye and newer
releases, causing static perf builds to fail on these distributions.
This commit checks the libdw version and only links libebl.a if it
detects that the libdw version is older than 0.177.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: amadio@gentoo.org
Cc: James Clark <james.clark@linaro.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240717082211.524826-4-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Python configuration has dedicated folders for different architectures.
For example, Python 3.11 has two folders as shown below, one for Arm64
and another for x86_64:
/usr/lib/python3.11/config-3.11-aarch64-linux-gnu/
/usr/lib/python3.11/config-3.11-x86_64-linux-gnu/
This commit updates the Python configuration path based on the
compiler's machine type, guiding the compiler to find the correct path
for Python libraries. It also renames the generated .so file name to
match the machine name.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: amadio@gentoo.org
Cc: James Clark <james.clark@linaro.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240717082211.524826-3-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
On recent Linux distros like Ubuntu Noble and Debian Bookworm, the
'pkg-config-aarch64-linux-gnu' package is missing. As a result, the
aarch64-linux-gnu-pkg-config command is not available, which causes
build failures.
When a build passes the environment variables PKG_CONFIG_LIBDIR or
PKG_CONFIG_PATH, like a user uses make command or a build system
(like Yocto, Buildroot, etc) prepares the variables and passes to the
Perf's Makefile, the commit keeps these variables for package
configuration. Otherwise, this commit sets the PKG_CONFIG_LIBDIR
variable to use the Multiarch libs for the cross compilation.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: amadio@gentoo.org
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240717082211.524826-2-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
With 0dd5041c9a ("perf addr_location: Add init/exit/copy functions"),
when cpumode is 3 (macro PERF_RECORD_MISC_HYPERVISOR),
thread__find_map() could return with al->maps being NULL.
The path below could add a callchain_cursor_node with NULL ms.maps.
add_callchain_ip()
thread__find_symbol(.., &al)
thread__find_map(.., &al) // al->maps becomes NULL
ms.maps = maps__get(al.maps)
callchain_cursor_append(..., &ms, ...)
node->ms.maps = maps__get(ms->maps)
Then the path below would dereference NULL maps and get segfault.
fill_callchain_info()
maps__machine(node->ms.maps);
Fix it by checking if maps is NULL in fill_callchain_info().
Fixes: 0dd5041c9a ("perf addr_location: Add init/exit/copy functions")
Signed-off-by: Casey Chen <cachen@purestorage.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: yzhong@purestorage.com
Link: https://lore.kernel.org/r/20240722211548.61455-1-cachen@purestorage.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Test adds several perf counters from msr, cstate_core and cstate_pkg
groups and checks if the columns for those counters show up.
The test skips the counters that are not present. It is not an error,
but the test may not be as exhaustive.
Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The test requests BICs that are dependent on SMI, APERF and MPERF
counters and checks if the columns show up in the output and the
turbostat doesn't crash. Read the counters in both --no-msr
and --no-perf mode.
The test skips counters that are not present or user does not have
permissions to read. It is not an error, but the test may not be as
exhaustive.
Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Printing information about the source and value during initialization and
reading of the counter for each cpu, while useful when debugging,
results in too verbose output.
Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
This leaves the stdout cleaner, having only counter data. It makes it
easier for programs to parse the output of turbostat, for example
selftests.
Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
- Remove some redundant Kconfig conditionals
- Fix string output in ptrace selftest
- Fix fast GUP crashes in some page-table configurations
- Remove obsolete linker option when building the vDSO
- Fix some sysreg field definitions for the GIC
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmaiSAMQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNJ8PB/9lyDbJ+qTNwECGKtz+vOAbronZncJy4yzd
ElPRNeQ+B7QqrrYZM2TCrz6/ppeKXp0OurwNk9vKBqzrCfy/D6kKXWfcOYqeWlyI
C2NImLHZgC6pIRwF3GlJ/E0VDtf/wQsJoWk7ikVssPtyIWOufafaB53FRacc1vnf
bmEpcdXox+FsTG4q8YhBE6DZnqqQTnm7MvAt4wgskk6tTyKj/FuQmSk50ZW22oXb
G2UOZxhYZV7IIXlRaClsY/iv62pTfMYlqDAvZeH81aiol/vfYXVFSeca5Mca67Ji
P1o8HPd++hTw9WVyCrrbSGcZ/XNs96yTmahJWM+eneiV7OzKxj4v
=Mr4K
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"The usual summary below, but the main fix is for the fast GUP lockless
page-table walk when we have a combination of compile-time and
run-time folding of the p4d and the pud respectively.
- Remove some redundant Kconfig conditionals
- Fix string output in ptrace selftest
- Fix fast GUP crashes in some page-table configurations
- Remove obsolete linker option when building the vDSO
- Fix some sysreg field definitions for the GIC"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mm: Fix lockless walks with static and dynamic page-table folding
arm64/sysreg: Correct the values for GICv4.1
arm64/vdso: Remove --hash-style=sysv
kselftest: missing arg in ptrace.c
arm64/Kconfig: Remove redundant 'if HAVE_FUNCTION_GRAPH_TRACER'
arm64: remove redundant 'if HAVE_ARCH_KASAN' in Kconfig
Random fixes for v6.11.
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmahKbIACgkQsUSA/Tof
vsh8zQwAvguyeNubDFqdMe3E/Vp1J3WqXsBFzbE1rGLCyI2S0cgJFL5BlW51zY47
70wLt9EmroEobwj1qHSQlzejNp31kSBQ1Sqq25oivfJqEF1elDT5PQxYqBbU1C9Y
kVWnxtb+oKaoFd5jiBK8+iTl8dXjT6H2RoV0zpPab/JPcqsjwFfkUvtENt/Kpo5c
aRrGTFwshdp5eT4sEZQv57VKroBcwZOvv2//qrklFHrJHl4pjMT8eaX3twcQysoy
umTVt+TK6NErLnht+VRQJ2/L02FKi7b+bHePVgNzaT+1FSDMT4FltmZd96Xwbzah
hSkwWtqy0N2gaTcqie9nwdEiCJGjF39M7k2wangUS91CeDsbIUSsJgDCESUCm+zK
hRqleGOnoeg4+jZBci7M53lKa/pADlmLhnU8iAc3BSKozsaioltkT+hHn8vAkstk
h/kHlbfkzasufUWAhduBpIn384gWWEY6RACffgCsOuvbT+ZyDKUJpKYaEwVx+Pri
l72j0hs9
=RbET
-----END PGP SIGNATURE-----
Merge tag 'bitmap-6.11-rc1' of https://github.com:/norov/linux
Pull bitmap updates from Yury Norov:
"Random fixes"
* tag 'bitmap-6.11-rc1' of https://github.com:/norov/linux:
riscv: Remove unnecessary int cast in variable_fls()
radix tree test suite: put definition of bitmap_clear() into lib/bitmap.c
bitops: Add a comment explaining the double underscore macros
lib: bitmap: add missing MODULE_DESCRIPTION() macros
cpumask: introduce assign_cpu() macro
catching COVID, so relatively short PR. Including fixes from bpf
and netfilter.
Current release - regressions:
- tcp: process the 3rd ACK with sk_socket for TFO and MPTCP
Current release - new code bugs:
- l2tp: protect session IDR and tunnel session list with one lock,
make sure the state is coherent to avoid a warning
- eth: bnxt_en: update xdp_rxq_info in queue restart logic
- eth: airoha: fix location of the MBI_RX_AGE_SEL_MASK field
Previous releases - regressions:
- xsk: require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len,
the field reuses previously un-validated pad
Previous releases - always broken:
- tap/tun: drop short frames to prevent crashes later in the stack
- eth: ice: add a per-VF limit on number of FDIR filters
- af_unix: disable MSG_OOB handling for sockets in sockmap/sockhash
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmaibxAACgkQMUZtbf5S
IruuIRAAu96TiN/urPwmKznyb/Sk8x7p8iUzn6OvPS/TUlFUkURQtOh6M9uvbpN4
x/L//EWkMR0hY4SkBegoiXfb1GS0PjBdWTWUiROm5X9nVHqp5KRZAxWXhjFiS1BO
BIYOT+JfCl7mQiPs90Mys/cEtYOggMBsCZQVIGw/iYoJLFREqxFSONwa0dG+tGMX
jn9WNu4yCVDhJ/jtl2MaTsCNtYUaBUgYrKHJBfNGfJ2Lz/7rH9yFui2WSMlmOd/U
QGeCb1DWURlShlCqY37wNinbFsxWkI5JN00ukTtwFAXLIaqc+zgHcIjrDjTJwK43
F4tKbJT3+bmehMU/h3Uo3c7DhXl7n9zDGiDtbCxnkykp0sFGJpjhDrWydo51c+YB
qW5HaNrII2LiDicOVN8L29ylvKp7AEkClxgivEhZVGGk2f/szJRXfp9u3WBn5kAx
3paH55YN0DEsKbYbb1ZENEI1Vnc/4ff4PxZJCUNKwzcS8wCn1awqwcriK9TjS/cp
fjilNFT4J3/uFrodHWTkx0jJT6UJFT0aF03qPLUH/J5kG+EVukOf1jBPInNdf1si
1j47SpblHUe86HiHphFMt32KZ210lJzWxh8uGma57Y2sB9makdLiK4etrFjkiMJJ
Z8A3kGp3KpFjbuK4tHY25rp+5oxLNNOBNpay29lQrWtCL/NDcaQ=
=9OsH
-----END PGP SIGNATURE-----
Merge tag 'net-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf and netfilter.
A lot of networking people were at a conference last week, busy
catching COVID, so relatively short PR.
Current release - regressions:
- tcp: process the 3rd ACK with sk_socket for TFO and MPTCP
Current release - new code bugs:
- l2tp: protect session IDR and tunnel session list with one lock,
make sure the state is coherent to avoid a warning
- eth: bnxt_en: update xdp_rxq_info in queue restart logic
- eth: airoha: fix location of the MBI_RX_AGE_SEL_MASK field
Previous releases - regressions:
- xsk: require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len,
the field reuses previously un-validated pad
Previous releases - always broken:
- tap/tun: drop short frames to prevent crashes later in the stack
- eth: ice: add a per-VF limit on number of FDIR filters
- af_unix: disable MSG_OOB handling for sockets in sockmap/sockhash"
* tag 'net-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (34 commits)
tun: add missing verification for short frame
tap: add missing verification for short frame
mISDN: Fix a use after free in hfcmulti_tx()
gve: Fix an edge case for TSO skb validity check
bnxt_en: update xdp_rxq_info in queue restart logic
tcp: process the 3rd ACK with sk_socket for TFO/MPTCP
selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test
xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len
bpf: Fix a segment issue when downgrading gso_size
net: mediatek: Fix potential NULL pointer dereference in dummy net_device handling
MAINTAINERS: make Breno the netconsole maintainer
MAINTAINERS: Update bonding entry
net: nexthop: Initialize all fields in dumped nexthops
net: stmmac: Correct byte order of perfect_match
selftests: forwarding: skip if kernel not support setting bridge fdb learning limit
tipc: Return non-zero value from tipc_udp_addr2str() on error
netfilter: nft_set_pipapo_avx2: disable softinterrupts
ice: Fix recipe read procedure
ice: Add a per-VF limit on number of FDIR filters
net: bonding: correctly annotate RCU in bond_should_notify_peers()
...
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZqIl1AAKCRDbK58LschI
g/MdAP9oyZV9/IZ6Y6Z1fWfio0SB+yJGugcwbFjWcEtNrzsqJQEAwipQnemAI4NC
HBMfK2a/w7vhAFMXrP/SbkB/gUJJ7QE=
=vovf
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2024-07-25
We've added 14 non-merge commits during the last 8 day(s) which contain
a total of 19 files changed, 177 insertions(+), 70 deletions(-).
The main changes are:
1) Fix af_unix to disable MSG_OOB handling for sockets in BPF sockmap and
BPF sockhash. Also add test coverage for this case, from Michal Luczaj.
2) Fix a segmentation issue when downgrading gso_size in the BPF helper
bpf_skb_adjust_room(), from Fred Li.
3) Fix a compiler warning in resolve_btfids due to a missing type cast,
from Liwei Song.
4) Fix stack allocation for arm64 to align the stack pointer at a 16 byte
boundary in the fexit_sleep BPF selftest, from Puranjay Mohan.
5) Fix a xsk regression to require a flag when actuating tx_metadata_len,
from Stanislav Fomichev.
6) Fix function prototype BTF dumping in libbpf for prototypes that have
no input arguments, from Andrii Nakryiko.
7) Fix stacktrace symbol resolution in perf script for BPF programs
containing subprograms, from Hou Tao.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test
xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len
bpf: Fix a segment issue when downgrading gso_size
tools/resolve_btfids: Fix comparison of distinct pointer types warning in resolve_btfids
bpf, events: Use prog to emit ksymbol event for main program
selftests/bpf: Test sockmap redirect for AF_UNIX MSG_OOB
selftests/bpf: Parametrize AF_UNIX redir functions to accept send() flags
selftests/bpf: Support SOCK_STREAM in unix_inet_redir_to_connected()
af_unix: Disable MSG_OOB handling for sockets in sockmap/sockhash
bpftool: Fix typo in usage help
libbpf: Fix no-args func prototype BTF dumping syntax
MAINTAINERS: Update powerpc BPF JIT maintainers
MAINTAINERS: Update email address of Naveen
selftests/bpf: fexit_sleep: Fix stack allocation for arm64
====================
Link: https://patch.msgid.link/20240725114312.32197-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This flag is now required to use tx_metadata_len.
Fixes: 40808a237d ("selftests/bpf: Add TX side to xdp_metadata")
Reported-by: Julian Schindel <mail@arctic-alpaca.de>
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20240713015253.121248-3-sdf@fomichev.me
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmaarzgACgkQSfxwEqXe
A66ZWBAAlhXx8bve0uKlDRK8fffWHgruho/fOY4lZJ137AKwA9JCtmOyqdfL4Dmk
VxFe7pEQJlQhcA/6kH54uO7SBXwfKlKZJth6SYnaCRMUIbFifHjjIQ0QqldjEKi0
rP90Hu4FVsbwQC7u9i9lQj9n2P36zb6pn83BzpZQ/2PtoVCSCrdSJUe0Rxa3H3GN
0+nNkDSXQt5otCByLaeE3x7KJgXLWL9+G2eFSFLTZ8rSVfMx1CdOIAG37WlLGdWm
BaFYPDKMyBTVvVJBNgAe9YSqtrsZ5nlmLz+Z9wAe/hTL7RlL03kWUu34/Udcpull
zzMDH0WMntiGK3eFQ2gOYSWqypvAjwHgn3BzqNmjUb69+89mZsdU1slcvnxWsUwU
D3vphrscaqarF629tfsXti3jc5PoXwUTjROZVcCyeFPBhyAZgzK8xUvPpJO+RT+K
EuUABob9cpA6FCpW/QeolDmMDhXlNT8QgsZu1juokZac2xP3Ly3REyEvT7HLbU2W
ZJjbEqm1ppp3RmGELUOJbyhwsLrnbt+OMDO7iEWoG8aSFK4diBK/ZM6WvLMkr8Oi
7ioXGIsYkCy3c47wpZKTrAapOPJp5keqNAiHSEbXw8mozp6429QAEZxNOcczgHKC
Ea2JzRkctqutcIT+Slw/uUe//i1iSsIHXbE81fp5udcQTJcUByo=
=P8aI
-----END PGP SIGNATURE-----
Merge tag 'random-6.11-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator updates from Jason Donenfeld:
"This adds getrandom() support to the vDSO.
First, it adds a new kind of mapping to mmap(2), MAP_DROPPABLE, which
lets the kernel zero out pages anytime under memory pressure, which
enables allocating memory that never gets swapped to disk but also
doesn't count as being mlocked.
Then, the vDSO implementation of getrandom() is introduced in a
generic manner and hooked into random.c.
Next, this is implemented on x86. (Also, though it's not ready for
this pull, somebody has begun an arm64 implementation already)
Finally, two vDSO selftests are added.
There are also two housekeeping cleanup commits"
* tag 'random-6.11-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
MAINTAINERS: add random.h headers to RNG subsection
random: note that RNDGETPOOL was removed in 2.6.9-rc2
selftests/vDSO: add tests for vgetrandom
x86: vdso: Wire up getrandom() vDSO implementation
random: introduce generic vDSO getrandom() implementation
mm: add MAP_DROPPABLE for designating always lazily freeable mappings
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZqDFUwAKCRCRxhvAZXjc
omD6APwJKlepwDYlu5XZptI6/1kmai6SqaYnifTX1+ELR/rQQAD/Z37aho42v2JZ
NYr+KFj02vj7ryKA5OWuSD8cw+6GlwQ=
=dfob
-----END PGP SIGNATURE-----
Merge tag 'vfs-6.11-rc1.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
"VFS:
- The new 64bit mount ids start after the old mount id, i.e., at the
first non-32 bit value. However, we started counting one id too
late and thus lost 4294967296 as the first valid id. Fix that.
- Update a few comments on some vfs_*() creation helpers.
- Move copying of the xattr name out from the locks required to start
a filesystem write.
- Extend the filelock lock UAF fix to the compat code as well.
- Now that we added the ability to look up an inode under RCU it's
possible that lockless hash lookup can find and lock an inode after
it gets I_FREEING set. It then waits until inode teardown in
evict() is finished.
The flag however is still set after evict() has woken up all
waiters. If the inode lock is taken late enough on the waiting side
after hash removal and wakeup happened the waiting thread will
never be woken.
Before RCU based lookup this was synchronized via the
inode_hash_lock. But since unhashing requires the inode lock as
well we can check whether the inode is unhashed while holding inode
lock even without holding inode_hash_lock.
pidfd:
- The nsproxy structure contains nearly all of the namespaces
associated with a task. When a namespace type isn't supported
nsproxy might contain a NULL pointer or always point to the initial
namespace type. The logic isn't consistent. So when deriving
namespace fds we need to ensure that the namespace type is
supported.
First, so that we don't risk dereferncing NULL pointers. The
correct bigger fix would be to change all namespaces to always set
a valid namespace pointer in struct nsproxy independent of whether
or not it is compiled in. But that requires quite a few changes.
Second, so that we don't allow deriving namespace fds when the
namespace type doesn't exist and thus when they couldn't also be
derived via /proc/self/ns/.
- Add missing selftests for the new pidfd ioctls to derive namespace
fds. This simply extends the already existing testsuite.
netfs:
- Fix debug logging and fix kconfig variable name so it actually
works.
- Fix writeback that goes both to the server and cache. The streams
are only activated once a subreq is added. When a server write
happens the subreq doesn't need to have finished by the time the
cache write is started. If the server write has already finished by
the time the cache write is about to start the cache write will
operate on a folio that might already have been reused. Fix this by
preactivating the cache write.
- Limit cachefiles subreq size for cache writes to MAX_RW_COUNT"
* tag 'vfs-6.11-rc1.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
inode: clarify what's locked
vfs: Fix potential circular locking through setxattr() and removexattr()
filelock: Fix fcntl/close race recovery compat path
fs: use all available ids
cachefiles: Set the max subreq size for cache writes to MAX_RW_COUNT
netfs: Fix writeback that needs to go to both server and cache
pidfs: add selftests for new namespace ioctls
pidfs: handle kernels without namespaces cleanly
pidfs: when time ns disabled add check for ioctl
vfs: correct the comments of vfs_*() helpers
vfs: handle __wait_on_freeing_inode() and evict() race
netfs: Rename CONFIG_FSCACHE_DEBUG to CONFIG_NETFS_DEBUG
netfs: Revert "netfs: Switch debug logging to pr_debug()"
Since commit 08ac454e25 ("libbpf: Auto-attach struct_ops BPF maps in
BPF skeleton"), libbpf automatically calls bpf_map__attach_struct_ops()
on every struct_ops it sees in the bpf object. The problem is that
our test bpf object has many of them but only one should be manually
loaded at a time, or we end up locking the syscall.
Link: https://patch.msgid.link/20240723-fix-6-11-bpf-v1-2-b9d770346784@kernel.org
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Since commit f56f4d541e ("bpf: helpers: fix bpf_wq_set_callback_impl
signature"), the API for bpf_wq changed a bit.
We need to update the selftests/hid code to reflect that or the
bpf program will not load.
Link: https://patch.msgid.link/20240723-fix-6-11-bpf-v1-1-b9d770346784@kernel.org
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
If the testing kernel doesn't support setting fdb_max_learned or show
fdb_n_learned, just skip it. Or we will get errors like
./bridge_fdb_learning_limit.sh: line 218: [: null: integer expression expected
./bridge_fdb_learning_limit.sh: line 225: [: null: integer expression expected
Fixes: 6f84090333 ("selftests: forwarding: bridge_fdb_learning_limit: Add a new selftest")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Johannes Nixdorf <jnixdorf-oss@avm.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Two fixes about building perf and other tools:
* Fix breakage in tracing tools due to pkg-config for libtrace{event,fs}
* Fix build of perf when libunwind is used
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHQEABYIAB0WIQSo2x5BnqMqsoHtzsmMstVUGiXMgwUCZqA1SAAKCRCMstVUGiXM
g3TfAQDXLi+XcSDE/u5JcDN3H6+bXvavDn2k8Gsd6vWZQc5LEQD3X1E+GbtWTQsE
ruk5ZT3voy8qBPgmrUg72NJwmRxYAQ==
=0RKR
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-fixes-for-v6.11-2024-07-23' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools fixes from Namhyung Kim:
"Two fixes for building perf and other tools:
- Fix breakage in tracing tools due to pkg-config for
libtrace{event,fs}
- Fix build of perf when libunwind is used"
* tag 'perf-tools-fixes-for-v6.11-2024-07-23' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
perf dso: Fix build when libunwind is enabled
tools/latency: Use pkg-config in lib_setup of Makefile.config
tools/rtla: Use pkg-config in lib_setup of Makefile.config
tools/verification: Use pkg-config in lib_setup of Makefile.config
tools: Make pkg-config dependency checks usable by other tools
perf build: Warn if libtracefs is not found
- Remove tristate choice support from Kconfig
- Stop using the PROVIDE() directive in the linker script
- Reduce the number of links for the combination of CONFIG_DEBUG_INFO_BTF
and CONFIG_KALLSYMS
- Enable the warning for symbol reference to .exit.* sections by default
- Fix warnings in RPM package builds
- Improve scripts/make_fit.py to generate a FIT image with separate base
DTB and overlays
- Improve choice value calculation in Kconfig
- Fix conditional prompt behavior in choice in Kconfig
- Remove support for the uncommon EMAIL environment variable in Debian
package builds
- Remove support for the uncommon "name <email>" form for the DEBEMAIL
environment variable
- Raise the minimum supported GNU Make version to 4.0
- Remove stale code for the absolute kallsyms
- Move header files commonly used for host programs to scripts/include/
- Introduce the pacman-pkg target to generate a pacman package used in
Arch Linux
- Clean up Kconfig
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmagBLUVHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGmoUQAJ8pnURs0g+Rcyk6bdY/qtXBYkS+
nXpIK1ssFgRRgAQdeszYtvBqLFzb0wRCSie87G1AriD/JkVVTjCCY1For1y+vs0u
a7HfxitHhZpPyZW/T+WMQ3LViNccpkx+DFAcoRH8xOY/XPEJKVUby332jOIXMuyg
+NKIELQJVsLhcDofTUGb5VfIQektw219n5c4jKjXdNk4ZtE24xCRM5X528ZebwWJ
RZhMvJ968PyIH1IRXvNt6dsKBxoGIwPP8IO6yW9hzHaNsBqt7MGSChSel7r1VKpk
iwCNApJvEiVBe5wvTSVOVro7/8p/AZ70CQAqnMJV+dNnRqtGqW7NvL6XAjZRJgJJ
Uxe5NSrXgQd3FtqfcbXLetBgp9zGVt328nHm1HXHR5rFsvoOiTvO7hHPbhA+OoWJ
fs+jHzEXdAMRgsNrczPWU5Svq6MgGe4v8HBf0m8N1Uy65t/O+z9ti2QAw7kIFlbu
/VSFNjw4CHmNxGhnH0khCMsy85FwVIt9Ux+2d6IEc0gP8S1Qa1HgHGAoVI4U51eS
9dxEPVJNPOugaIVHheuS3wimEO6wzaJcQHn4IXaasMA7P6Yo4G/jiGoy4cb9qPTM
Hb+GaOltUy7vDoG4D2LSym8zR8rdKwbIf/5psdZrq/IWVKq5p+p7KWs3aOykSoM7
o6Hb532Ioalhm8je
=BYu7
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Remove tristate choice support from Kconfig
- Stop using the PROVIDE() directive in the linker script
- Reduce the number of links for the combination of CONFIG_KALLSYMS and
CONFIG_DEBUG_INFO_BTF
- Enable the warning for symbol reference to .exit.* sections by
default
- Fix warnings in RPM package builds
- Improve scripts/make_fit.py to generate a FIT image with separate
base DTB and overlays
- Improve choice value calculation in Kconfig
- Fix conditional prompt behavior in choice in Kconfig
- Remove support for the uncommon EMAIL environment variable in Debian
package builds
- Remove support for the uncommon "name <email>" form for the DEBEMAIL
environment variable
- Raise the minimum supported GNU Make version to 4.0
- Remove stale code for the absolute kallsyms
- Move header files commonly used for host programs to scripts/include/
- Introduce the pacman-pkg target to generate a pacman package used in
Arch Linux
- Clean up Kconfig
* tag 'kbuild-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (65 commits)
kbuild: doc: gcc to CC change
kallsyms: change sym_entry::percpu_absolute to bool type
kallsyms: unify seq and start_pos fields of struct sym_entry
kallsyms: add more original symbol type/name in comment lines
kallsyms: use \t instead of a tab in printf()
kallsyms: avoid repeated calculation of array size for markers
kbuild: add script and target to generate pacman package
modpost: use generic macros for hash table implementation
kbuild: move some helper headers from scripts/kconfig/ to scripts/include/
Makefile: add comment to discourage tools/* addition for kernel builds
kbuild: clean up scripts/remove-stale-files
kconfig: recursive checks drop file/lineno
kbuild: rpm-pkg: introduce a simple changelog section for kernel.spec
kallsyms: get rid of code for absolute kallsyms
kbuild: Create INSTALL_PATH directory if it does not exist
kbuild: Abort make on install failures
kconfig: remove 'e1' and 'e2' macros from expression deduplication
kconfig: remove SYMBOL_CHOICEVAL flag
kconfig: add const qualifiers to several function arguments
kconfig: call expr_eliminate_yn() at least once in expr_eliminate_dups()
...
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmafyQEACgkQUqAMR0iA
lPIb5Q/+Puck/TtP9W7uyFMDi6AQMm2/Ppja12OV0cpzKwvZci81mvANEJDC2gGa
Wlg4++WVqeA6UfxsPLqDWDWI46dSHWfbMR5d9c5ZZBWh8byNLznBvgQdV04m0KaT
567gXwRaVc3Rx4q0WkoWYn12FFvZV5IZPJbUuM4hKS1WmDrW5tPKteBoxGVujU5N
j9BB3zMHH+cDrgB2+tasUGkEXRLQtesjVrPNQp4lJA535JwnpFk3961+y4cnaAaQ
EqrGy/uUVk6Hl/4Q/JgnYRREmDytMMmvxJJx3FIZm6i5xLOY0y1rJ1YF0z7fL+Eu
+WJi6Uncwo1mFUz5MvKAg2tIUilbOp3awQwxPdPFXoziF0A1mS+S6qHlz6Y8Kk4N
kqgmfIXZeb/K+UCWX6freqqvAg/sJ0JlNIcaGOGZi7+KRwoi8US3eahIAEUxHuis
DRnwtF9qI6YRyjfwy3INDc9Atgnhe6O/mcUpAe1TiSHgbgKT8kQgIQUWb1BwurVE
Ey1Hpci+gpJyj1MW0f1jE08lqQ1O7EdFTJyxJvgjs1IeGA2Zgc0U2GMemSidh6qZ
pqQnEwOfvCdVgR8W4R0Gu8IZYsyCYEawNVHixk2V+GQSdvZsPfJNvnhkqYH/IzUQ
yXKBCZVVBwTQW6zkGI6fjeXgjVH2cEL29vDO4ChQsSH4qEZb/GA=
=k4I2
-----END PGP SIGNATURE-----
Merge tag 'livepatching-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching
Pull livepatching update from Petr Mladek:
- show patch->replace flag in sysfs
- add or improve few selftests
* tag 'livepatching-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching:
livepatch: Replace snprintf() with sysfs_emit()
selftests/livepatch: Add selftests for "replace" sysfs attribute
livepatch: Add "replace" sysfs attribute
selftests: livepatch: Test atomic replace against multiple modules
selftests/livepatch: define max test-syscall processes
The string passed to ksft_test_result_skip is missing the `type_name`
Signed-off-by: Remington Brasga <rbrasga@uci.edu>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20240712231730.2794-1-rbrasga@uci.edu
Signed-off-by: Will Deacon <will@kernel.org>
Add a type cast for set8->pairs to fix below compile warning:
main.c: In function 'sets_patch':
main.c:699:50: warning: comparison of distinct pointer types lacks a cast
699 | BUILD_BUG_ON(set8->pairs != &set8->pairs[0].id);
| ^~
Fixes: 9707ac4fe2 ("tools/resolve_btfids: Refactor set sorting with types from btf_ids.h")
Signed-off-by: Liwei Song <liwei.song.lsong@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240722083305.4009723-1-liwei.song.lsong@gmail.com
Kuan-Wei Chiu has significantly reworked the min_heap library code and
has taught bcachefs to use the new more generic implementation.
- Yury Norov's series "Cleanup cpumask.h inclusion in core headers"
reworks the cpumask and nodemask headers to make things generally more
rational.
- Kuan-Wei Chiu has sent along some maintenance work against our sorting
library code in the series "lib/sort: Optimizations and cleanups".
- More library maintainance work from Christophe Jaillet in the series
"Remove usage of the deprecated ida_simple_xx() API".
- Ryusuke Konishi continues with the nilfs2 fixes and clanups in the
series "nilfs2: eliminate the call to inode_attach_wb()".
- Kuan-Ying Lee has some fixes to the gdb scripts in the series "Fix GDB
command error".
- Plus the usual shower of singleton patches all over the place. Please
see the relevant changelogs for details.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZp2GvwAKCRDdBJ7gKXxA
jlf/AP48xP5ilIHbtpAKm2z+MvGuTxJQ5VSC0UXFacuCbc93lAEA+Yo+vOVRmh6j
fQF2nVKyKLYfSz7yqmCyAaHWohIYLgg=
=Stxz
-----END PGP SIGNATURE-----
Merge tag 'mm-nonmm-stable-2024-07-21-15-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- In the series "treewide: Refactor heap related implementation",
Kuan-Wei Chiu has significantly reworked the min_heap library code
and has taught bcachefs to use the new more generic implementation.
- Yury Norov's series "Cleanup cpumask.h inclusion in core headers"
reworks the cpumask and nodemask headers to make things generally
more rational.
- Kuan-Wei Chiu has sent along some maintenance work against our
sorting library code in the series "lib/sort: Optimizations and
cleanups".
- More library maintainance work from Christophe Jaillet in the series
"Remove usage of the deprecated ida_simple_xx() API".
- Ryusuke Konishi continues with the nilfs2 fixes and clanups in the
series "nilfs2: eliminate the call to inode_attach_wb()".
- Kuan-Ying Lee has some fixes to the gdb scripts in the series "Fix
GDB command error".
- Plus the usual shower of singleton patches all over the place. Please
see the relevant changelogs for details.
* tag 'mm-nonmm-stable-2024-07-21-15-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (98 commits)
ia64: scrub ia64 from poison.h
watchdog/perf: properly initialize the turbo mode timestamp and rearm counter
tsacct: replace strncpy() with strscpy()
lib/bch.c: use swap() to improve code
test_bpf: convert comma to semicolon
init/modpost: conditionally check section mismatch to __meminit*
init: remove unused __MEMINIT* macros
nilfs2: Constify struct kobj_type
nilfs2: avoid undefined behavior in nilfs_cnt32_ge macro
math: rational: add missing MODULE_DESCRIPTION() macro
lib/zlib: add missing MODULE_DESCRIPTION() macro
fs: ufs: add MODULE_DESCRIPTION()
lib/rbtree.c: fix the example typo
ocfs2: add bounds checking to ocfs2_check_dir_entry()
fs: add kernel-doc comments to ocfs2_prepare_orphan_dir()
coredump: simplify zap_process()
selftests/fpu: add missing MODULE_DESCRIPTION() macro
compiler.h: simplify data_race() macro
build-id: require program headers to be right after ELF header
resource: add missing MODULE_DESCRIPTION()
...
walkers") is known to cause a performance regression
(https://lore.kernel.org/all/3acefad9-96e5-4681-8014-827d6be71c7a@linux.ibm.com/T/#mfa809800a7862fb5bdf834c6f71a3a5113eb83ff).
Yu has a fix which I'll send along later via the hotfixes branch.
- In the series "mm: Avoid possible overflows in dirty throttling" Jan
Kara addresses a couple of issues in the writeback throttling code.
These fixes are also targetted at -stable kernels.
- Ryusuke Konishi's series "nilfs2: fix potential issues related to
reserved inodes" does that. This should actually be in the
mm-nonmm-stable tree, along with the many other nilfs2 patches. My bad.
- More folio conversions from Kefeng Wang in the series "mm: convert to
folio_alloc_mpol()"
- Kemeng Shi has sent some cleanups to the writeback code in the series
"Add helper functions to remove repeated code and improve readability of
cgroup writeback"
- Kairui Song has made the swap code a little smaller and a little
faster in the series "mm/swap: clean up and optimize swap cache index".
- In the series "mm/memory: cleanly support zeropage in
vm_insert_page*(), vm_map_pages*() and vmf_insert_mixed()" David
Hildenbrand has reworked the rather sketchy handling of the use of the
zeropage in MAP_SHARED mappings. I don't see any runtime effects here -
more a cleanup/understandability/maintainablity thing.
- Dev Jain has improved selftests/mm/va_high_addr_switch.c's handling of
higher addresses, for aarch64. The (poorly named) series is
"Restructure va_high_addr_switch".
- The core TLB handling code gets some cleanups and possible slight
optimizations in Bang Li's series "Add update_mmu_tlb_range() to
simplify code".
- Jane Chu has improved the handling of our
fake-an-unrecoverable-memory-error testing feature MADV_HWPOISON in the
series "Enhance soft hwpoison handling and injection".
- Jeff Johnson has sent a billion patches everywhere to add
MODULE_DESCRIPTION() to everything. Some landed in this pull.
- In the series "mm: cleanup MIGRATE_SYNC_NO_COPY mode", Kefeng Wang has
simplified migration's use of hardware-offload memory copying.
- Yosry Ahmed performs more folio API conversions in his series "mm:
zswap: trivial folio conversions".
- In the series "large folios swap-in: handle refault cases first",
Chuanhua Han inches us forward in the handling of large pages in the
swap code. This is a cleanup and optimization, working toward the end
objective of full support of large folio swapin/out.
- In the series "mm,swap: cleanup VMA based swap readahead window
calculation", Huang Ying has contributed some cleanups and a possible
fixlet to his VMA based swap readahead code.
- In the series "add mTHP support for anonymous shmem" Baolin Wang has
taught anonymous shmem mappings to use multisize THP. By default this
is a no-op - users must opt in vis sysfs controls. Dramatic
improvements in pagefault latency are realized.
- David Hildenbrand has some cleanups to our remaining use of
page_mapcount() in the series "fs/proc: move page_mapcount() to
fs/proc/internal.h".
- David also has some highmem accounting cleanups in the series
"mm/highmem: don't track highmem pages manually".
- Build-time fixes and cleanups from John Hubbard in the series
"cleanups, fixes, and progress towards avoiding "make headers"".
- Cleanups and consolidation of the core pagemap handling from Barry
Song in the series "mm: introduce pmd|pte_needs_soft_dirty_wp helpers
and utilize them".
- Lance Yang's series "Reclaim lazyfree THP without splitting" has
reduced the latency of the reclaim of pmd-mapped THPs under fairly
common circumstances. A 10x speedup is seen in a microbenchmark.
It does this by punting to aother CPU but I guess that's a win unless
all CPUs are pegged.
- hugetlb_cgroup cleanups from Xiu Jianfeng in the series
"mm/hugetlb_cgroup: rework on cftypes".
- Miaohe Lin's series "Some cleanups for memory-failure" does just that
thing.
- Is anyone reading this stuff? If so, email me!
- Someone other than SeongJae has developed a DAMON feature in Honggyu
Kim's series "DAMON based tiered memory management for CXL memory".
This adds DAMON features which may be used to help determine the
efficiency of our placement of CXL/PCIe attached DRAM.
- DAMON user API centralization and simplificatio work in SeongJae
Park's series "mm/damon: introduce DAMON parameters online commit
function".
- In the series "mm: page_type, zsmalloc and page_mapcount_reset()"
David Hildenbrand does some maintenance work on zsmalloc - partially
modernizing its use of pageframe fields.
- Kefeng Wang provides more folio conversions in the series "mm: remove
page_maybe_dma_pinned() and page_mkclean()".
- More cleanup from David Hildenbrand, this time in the series
"mm/memory_hotplug: use PageOffline() instead of PageReserved() for
!ZONE_DEVICE". It "enlightens memory hotplug more about PageOffline()
pages" and permits the removal of some virtio-mem hacks.
- Barry Song's series "mm: clarify folio_add_new_anon_rmap() and
__folio_add_anon_rmap()" is a cleanup to the anon folio handling in
preparation for mTHP (multisize THP) swapin.
- Kefeng Wang's series "mm: improve clear and copy user folio"
implements more folio conversions, this time in the area of large folio
userspace copying.
- The series "Docs/mm/damon/maintaier-profile: document a mailing tool
and community meetup series" tells people how to get better involved
with other DAMON developers. From SeongJae Park.
- A large series ("kmsan: Enable on s390") from Ilya Leoshkevich does
that.
- David Hildenbrand sends along more cleanups, this time against the
migration code. The series is "mm/migrate: move NUMA hinting fault
folio isolation + checks under PTL".
- Jan Kara has found quite a lot of strangenesses and minor errors in
the readahead code. He addresses this in the series "mm: Fix various
readahead quirks".
- SeongJae Park's series "selftests/damon: test DAMOS tried regions and
{min,max}_nr_regions" adds features and addresses errors in DAMON's self
testing code.
- Gavin Shan has found a userspace-triggerable WARN in the pagecache
code. The series "mm/filemap: Limit page cache size to that supported
by xarray" addresses this. The series is marked cc:stable.
- Chengming Zhou's series "mm/ksm: cmp_and_merge_page() optimizations
and cleanup" cleans up and slightly optimizes KSM.
- Roman Gushchin has separated the memcg-v1 and memcg-v2 code - lots of
code motion. The series (which also makes the memcg-v1 code
Kconfigurable) are
"mm: memcg: separate legacy cgroup v1 code and put under config
option" and
"mm: memcg: put cgroup v1-specific memcg data under CONFIG_MEMCG_V1"
- Dan Schatzberg's series "Add swappiness argument to memory.reclaim"
adds an additional feature to this cgroup-v2 control file.
- The series "Userspace controls soft-offline pages" from Jiaqi Yan
permits userspace to stop the kernel's automatic treatment of excessive
correctable memory errors. In order to permit userspace to monitor and
handle this situation.
- Kefeng Wang's series "mm: migrate: support poison recover from migrate
folio" teaches the kernel to appropriately handle migration from
poisoned source folios rather than simply panicing.
- SeongJae Park's series "Docs/damon: minor fixups and improvements"
does those things.
- In the series "mm/zsmalloc: change back to per-size_class lock"
Chengming Zhou improves zsmalloc's scalability and memory utilization.
- Vivek Kasireddy's series "mm/gup: Introduce memfd_pin_folios() for
pinning memfd folios" makes the GUP code use FOLL_PIN rather than bare
refcount increments. So these paes can first be moved aside if they
reside in the movable zone or a CMA block.
- Andrii Nakryiko has added a binary ioctl()-based API to /proc/pid/maps
for much faster reading of vma information. The series is "query VMAs
from /proc/<pid>/maps".
- In the series "mm: introduce per-order mTHP split counters" Lance Yang
improves the kernel's presentation of developer information related to
multisize THP splitting.
- Michael Ellerman has developed the series "Reimplement huge pages
without hugepd on powerpc (8xx, e500, book3s/64)". This permits
userspace to use all available huge page sizes.
- In the series "revert unconditional slab and page allocator fault
injection calls" Vlastimil Babka removes a performance-affecting and not
very useful feature from slab fault injection.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZp2C+QAKCRDdBJ7gKXxA
joTkAQDvjqOoFStqk4GU3OXMYB7WCU/ZQMFG0iuu1EEwTVDZ4QEA8CnG7seek1R3
xEoo+vw0sWWeLV3qzsxnCA1BJ8cTJA8=
=z0Lf
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2024-07-21-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- In the series "mm: Avoid possible overflows in dirty throttling" Jan
Kara addresses a couple of issues in the writeback throttling code.
These fixes are also targetted at -stable kernels.
- Ryusuke Konishi's series "nilfs2: fix potential issues related to
reserved inodes" does that. This should actually be in the
mm-nonmm-stable tree, along with the many other nilfs2 patches. My
bad.
- More folio conversions from Kefeng Wang in the series "mm: convert to
folio_alloc_mpol()"
- Kemeng Shi has sent some cleanups to the writeback code in the series
"Add helper functions to remove repeated code and improve readability
of cgroup writeback"
- Kairui Song has made the swap code a little smaller and a little
faster in the series "mm/swap: clean up and optimize swap cache
index".
- In the series "mm/memory: cleanly support zeropage in
vm_insert_page*(), vm_map_pages*() and vmf_insert_mixed()" David
Hildenbrand has reworked the rather sketchy handling of the use of
the zeropage in MAP_SHARED mappings. I don't see any runtime effects
here - more a cleanup/understandability/maintainablity thing.
- Dev Jain has improved selftests/mm/va_high_addr_switch.c's handling
of higher addresses, for aarch64. The (poorly named) series is
"Restructure va_high_addr_switch".
- The core TLB handling code gets some cleanups and possible slight
optimizations in Bang Li's series "Add update_mmu_tlb_range() to
simplify code".
- Jane Chu has improved the handling of our
fake-an-unrecoverable-memory-error testing feature MADV_HWPOISON in
the series "Enhance soft hwpoison handling and injection".
- Jeff Johnson has sent a billion patches everywhere to add
MODULE_DESCRIPTION() to everything. Some landed in this pull.
- In the series "mm: cleanup MIGRATE_SYNC_NO_COPY mode", Kefeng Wang
has simplified migration's use of hardware-offload memory copying.
- Yosry Ahmed performs more folio API conversions in his series "mm:
zswap: trivial folio conversions".
- In the series "large folios swap-in: handle refault cases first",
Chuanhua Han inches us forward in the handling of large pages in the
swap code. This is a cleanup and optimization, working toward the end
objective of full support of large folio swapin/out.
- In the series "mm,swap: cleanup VMA based swap readahead window
calculation", Huang Ying has contributed some cleanups and a possible
fixlet to his VMA based swap readahead code.
- In the series "add mTHP support for anonymous shmem" Baolin Wang has
taught anonymous shmem mappings to use multisize THP. By default this
is a no-op - users must opt in vis sysfs controls. Dramatic
improvements in pagefault latency are realized.
- David Hildenbrand has some cleanups to our remaining use of
page_mapcount() in the series "fs/proc: move page_mapcount() to
fs/proc/internal.h".
- David also has some highmem accounting cleanups in the series
"mm/highmem: don't track highmem pages manually".
- Build-time fixes and cleanups from John Hubbard in the series
"cleanups, fixes, and progress towards avoiding "make headers"".
- Cleanups and consolidation of the core pagemap handling from Barry
Song in the series "mm: introduce pmd|pte_needs_soft_dirty_wp helpers
and utilize them".
- Lance Yang's series "Reclaim lazyfree THP without splitting" has
reduced the latency of the reclaim of pmd-mapped THPs under fairly
common circumstances. A 10x speedup is seen in a microbenchmark.
It does this by punting to aother CPU but I guess that's a win unless
all CPUs are pegged.
- hugetlb_cgroup cleanups from Xiu Jianfeng in the series
"mm/hugetlb_cgroup: rework on cftypes".
- Miaohe Lin's series "Some cleanups for memory-failure" does just that
thing.
- Someone other than SeongJae has developed a DAMON feature in Honggyu
Kim's series "DAMON based tiered memory management for CXL memory".
This adds DAMON features which may be used to help determine the
efficiency of our placement of CXL/PCIe attached DRAM.
- DAMON user API centralization and simplificatio work in SeongJae
Park's series "mm/damon: introduce DAMON parameters online commit
function".
- In the series "mm: page_type, zsmalloc and page_mapcount_reset()"
David Hildenbrand does some maintenance work on zsmalloc - partially
modernizing its use of pageframe fields.
- Kefeng Wang provides more folio conversions in the series "mm: remove
page_maybe_dma_pinned() and page_mkclean()".
- More cleanup from David Hildenbrand, this time in the series
"mm/memory_hotplug: use PageOffline() instead of PageReserved() for
!ZONE_DEVICE". It "enlightens memory hotplug more about PageOffline()
pages" and permits the removal of some virtio-mem hacks.
- Barry Song's series "mm: clarify folio_add_new_anon_rmap() and
__folio_add_anon_rmap()" is a cleanup to the anon folio handling in
preparation for mTHP (multisize THP) swapin.
- Kefeng Wang's series "mm: improve clear and copy user folio"
implements more folio conversions, this time in the area of large
folio userspace copying.
- The series "Docs/mm/damon/maintaier-profile: document a mailing tool
and community meetup series" tells people how to get better involved
with other DAMON developers. From SeongJae Park.
- A large series ("kmsan: Enable on s390") from Ilya Leoshkevich does
that.
- David Hildenbrand sends along more cleanups, this time against the
migration code. The series is "mm/migrate: move NUMA hinting fault
folio isolation + checks under PTL".
- Jan Kara has found quite a lot of strangenesses and minor errors in
the readahead code. He addresses this in the series "mm: Fix various
readahead quirks".
- SeongJae Park's series "selftests/damon: test DAMOS tried regions and
{min,max}_nr_regions" adds features and addresses errors in DAMON's
self testing code.
- Gavin Shan has found a userspace-triggerable WARN in the pagecache
code. The series "mm/filemap: Limit page cache size to that supported
by xarray" addresses this. The series is marked cc:stable.
- Chengming Zhou's series "mm/ksm: cmp_and_merge_page() optimizations
and cleanup" cleans up and slightly optimizes KSM.
- Roman Gushchin has separated the memcg-v1 and memcg-v2 code - lots of
code motion. The series (which also makes the memcg-v1 code
Kconfigurable) are "mm: memcg: separate legacy cgroup v1 code and put
under config option" and "mm: memcg: put cgroup v1-specific memcg
data under CONFIG_MEMCG_V1"
- Dan Schatzberg's series "Add swappiness argument to memory.reclaim"
adds an additional feature to this cgroup-v2 control file.
- The series "Userspace controls soft-offline pages" from Jiaqi Yan
permits userspace to stop the kernel's automatic treatment of
excessive correctable memory errors. In order to permit userspace to
monitor and handle this situation.
- Kefeng Wang's series "mm: migrate: support poison recover from
migrate folio" teaches the kernel to appropriately handle migration
from poisoned source folios rather than simply panicing.
- SeongJae Park's series "Docs/damon: minor fixups and improvements"
does those things.
- In the series "mm/zsmalloc: change back to per-size_class lock"
Chengming Zhou improves zsmalloc's scalability and memory
utilization.
- Vivek Kasireddy's series "mm/gup: Introduce memfd_pin_folios() for
pinning memfd folios" makes the GUP code use FOLL_PIN rather than
bare refcount increments. So these paes can first be moved aside if
they reside in the movable zone or a CMA block.
- Andrii Nakryiko has added a binary ioctl()-based API to
/proc/pid/maps for much faster reading of vma information. The series
is "query VMAs from /proc/<pid>/maps".
- In the series "mm: introduce per-order mTHP split counters" Lance
Yang improves the kernel's presentation of developer information
related to multisize THP splitting.
- Michael Ellerman has developed the series "Reimplement huge pages
without hugepd on powerpc (8xx, e500, book3s/64)". This permits
userspace to use all available huge page sizes.
- In the series "revert unconditional slab and page allocator fault
injection calls" Vlastimil Babka removes a performance-affecting and
not very useful feature from slab fault injection.
* tag 'mm-stable-2024-07-21-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (411 commits)
mm/mglru: fix ineffective protection calculation
mm/zswap: fix a white space issue
mm/hugetlb: fix kernel NULL pointer dereference when migrating hugetlb folio
mm/hugetlb: fix possible recursive locking detected warning
mm/gup: clear the LRU flag of a page before adding to LRU batch
mm/numa_balancing: teach mpol_to_str about the balancing mode
mm: memcg1: convert charge move flags to unsigned long long
alloc_tag: fix page_ext_get/page_ext_put sequence during page splitting
lib: reuse page_ext_data() to obtain codetag_ref
lib: add missing newline character in the warning message
mm/mglru: fix overshooting shrinker memory
mm/mglru: fix div-by-zero in vmpressure_calc_level()
mm/kmemleak: replace strncpy() with strscpy()
mm, page_alloc: put should_fail_alloc_page() back behing CONFIG_FAIL_PAGE_ALLOC
mm, slab: put should_failslab() back behind CONFIG_SHOULD_FAILSLAB
mm: ignore data-race in __swap_writepage
hugetlbfs: ensure generic_hugetlb_get_unmapped_area() returns higher address than mmap_min_addr
mm: shmem: rename mTHP shmem counters
mm: swap_state: use folio_alloc_mpol() in __read_swap_cache_async()
mm/migrate: putback split folios when numa hint migration fails
...
* Initial infrastructure for shadow stage-2 MMUs, as part of nested
virtualization enablement
* Support for userspace changes to the guest CTR_EL0 value, enabling
(in part) migration of VMs between heterogenous hardware
* Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of
the protocol
* FPSIMD/SVE support for nested, including merged trap configuration
and exception routing
* New command-line parameter to control the WFx trap behavior under KVM
* Introduce kCFI hardening in the EL2 hypervisor
* Fixes + cleanups for handling presence/absence of FEAT_TCRX
* Miscellaneous fixes + documentation updates
LoongArch:
* Add paravirt steal time support.
* Add support for KVM_DIRTY_LOG_INITIALLY_SET.
* Add perf kvm-stat support for loongarch.
RISC-V:
* Redirect AMO load/store access fault traps to guest
* perf kvm stat support
* Use guest files for IMSIC virtualization, when available
ONE_REG support for the Zimop, Zcmop, Zca, Zcf, Zcd, Zcb and Zawrs ISA
extensions is coming through the RISC-V tree.
s390:
* Assortment of tiny fixes which are not time critical
x86:
* Fixes for Xen emulation.
* Add a global struct to consolidate tracking of host values, e.g. EFER
* Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the effective APIC
bus frequency, because TDX.
* Print the name of the APICv/AVIC inhibits in the relevant tracepoint.
* Clean up KVM's handling of vendor specific emulation to consistently act on
"compatible with Intel/AMD", versus checking for a specific vendor.
* Drop MTRR virtualization, and instead always honor guest PAT on CPUs
that support self-snoop.
* Update to the newfangled Intel CPU FMS infrastructure.
* Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as it reads
'0' and writes from userspace are ignored.
* Misc cleanups
x86 - MMU:
* Small cleanups, renames and refactoring extracted from the upcoming
Intel TDX support.
* Don't allocate kvm_mmu_page.shadowed_translation for shadow pages that can't
hold leafs SPTEs.
* Unconditionally drop mmu_lock when allocating TDP MMU page tables for eager
page splitting, to avoid stalling vCPUs when splitting huge pages.
* Bug the VM instead of simply warning if KVM tries to split a SPTE that is
non-present or not-huge. KVM is guaranteed to end up in a broken state
because the callers fully expect a valid SPTE, it's all but dangerous
to let more MMU changes happen afterwards.
x86 - AMD:
* Make per-CPU save_area allocations NUMA-aware.
* Force sev_es_host_save_area() to be inlined to avoid calling into an
instrumentable function from noinstr code.
* Base support for running SEV-SNP guests. API-wise, this includes
a new KVM_X86_SNP_VM type, encrypting/measure the initial image into
guest memory, and finalizing it before launching it. Internally,
there are some gmem/mmu hooks needed to prepare gmem-allocated pages
before mapping them into guest private memory ranges.
This includes basic support for attestation guest requests, enough to
say that KVM supports the GHCB 2.0 specification.
There is no support yet for loading into the firmware those signing
keys to be used for attestation requests, and therefore no need yet
for the host to provide certificate data for those keys. To support
fetching certificate data from userspace, a new KVM exit type will be
needed to handle fetching the certificate from userspace. An attempt to
define a new KVM_EXIT_COCO/KVM_EXIT_COCO_REQ_CERTS exit type to handle
this was introduced in v1 of this patchset, but is still being discussed
by community, so for now this patchset only implements a stub version
of SNP Extended Guest Requests that does not provide certificate data.
x86 - Intel:
* Remove an unnecessary EPT TLB flush when enabling hardware.
* Fix a series of bugs that cause KVM to fail to detect nested pending posted
interrupts as valid wake eents for a vCPU executing HLT in L2 (with
HLT-exiting disable by L1).
* KVM: x86: Suppress MMIO that is triggered during task switch emulation
Explicitly suppress userspace emulated MMIO exits that are triggered when
emulating a task switch as KVM doesn't support userspace MMIO during
complex (multi-step) emulation. Silently ignoring the exit request can
result in the WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to
userspace for some other reason prior to purging mmio_needed.
See commit 0dc902267c ("KVM: x86: Suppress pending MMIO write exits if
emulator detects exception") for more details on KVM's limitations with
respect to emulated MMIO during complex emulator flows.
Generic:
* Rename the AS_UNMOVABLE flag that was introduced for KVM to AS_INACCESSIBLE,
because the special casing needed by these pages is not due to just
unmovability (and in fact they are only unmovable because the CPU cannot
access them).
* New ioctl to populate the KVM page tables in advance, which is useful to
mitigate KVM page faults during guest boot or after live migration.
The code will also be used by TDX, but (probably) not through the ioctl.
* Enable halt poll shrinking by default, as Intel found it to be a clear win.
* Setup empty IRQ routing when creating a VM to avoid having to synchronize
SRCU when creating a split IRQCHIP on x86.
* Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag
that arch code can use for hooking both sched_in() and sched_out().
* Take the vCPU @id as an "unsigned long" instead of "u32" to avoid
truncating a bogus value from userspace, e.g. to help userspace detect bugs.
* Mark a vCPU as preempted if and only if it's scheduled out while in the
KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest
memory when retrieving guest state during live migration blackout.
Selftests:
* Remove dead code in the memslot modification stress test.
* Treat "branch instructions retired" as supported on all AMD Family 17h+ CPUs.
* Print the guest pseudo-RNG seed only when it changes, to avoid spamming the
log for tests that create lots of VMs.
* Make the PMU counters test less flaky when counting LLC cache misses by
doing CLFLUSH{OPT} in every loop iteration.
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmaZQB0UHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroNkZwf/bv2jiENaLFNGPe/VqTKMQ6PHQLMG
+sNHx6fJPP35gTM8Jqf0/7/ummZXcSuC1mWrzYbecZm7Oeg3vwNXHZ4LquwwX6Dv
8dKcUzLbWDAC4WA3SKhi8C8RV2v6E7ohy69NtAJmFWTc7H95dtIQm6cduV2osTC3
OEuHe1i8d9umk6couL9Qhm8hk3i9v2KgCsrfyNrQgLtS3hu7q6yOTR8nT0iH6sJR
KE5A8prBQgLmF34CuvYDw4Hu6E4j+0QmIqodovg2884W1gZQ9LmcVqYPaRZGsG8S
iDdbkualLKwiR1TpRr3HJGKWSFdc7RblbsnHRvHIZgFsMQiimh4HrBSCyQ==
=zepX
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"ARM:
- Initial infrastructure for shadow stage-2 MMUs, as part of nested
virtualization enablement
- Support for userspace changes to the guest CTR_EL0 value, enabling
(in part) migration of VMs between heterogenous hardware
- Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1
of the protocol
- FPSIMD/SVE support for nested, including merged trap configuration
and exception routing
- New command-line parameter to control the WFx trap behavior under
KVM
- Introduce kCFI hardening in the EL2 hypervisor
- Fixes + cleanups for handling presence/absence of FEAT_TCRX
- Miscellaneous fixes + documentation updates
LoongArch:
- Add paravirt steal time support
- Add support for KVM_DIRTY_LOG_INITIALLY_SET
- Add perf kvm-stat support for loongarch
RISC-V:
- Redirect AMO load/store access fault traps to guest
- perf kvm stat support
- Use guest files for IMSIC virtualization, when available
s390:
- Assortment of tiny fixes which are not time critical
x86:
- Fixes for Xen emulation
- Add a global struct to consolidate tracking of host values, e.g.
EFER
- Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the
effective APIC bus frequency, because TDX
- Print the name of the APICv/AVIC inhibits in the relevant
tracepoint
- Clean up KVM's handling of vendor specific emulation to
consistently act on "compatible with Intel/AMD", versus checking
for a specific vendor
- Drop MTRR virtualization, and instead always honor guest PAT on
CPUs that support self-snoop
- Update to the newfangled Intel CPU FMS infrastructure
- Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as
it reads '0' and writes from userspace are ignored
- Misc cleanups
x86 - MMU:
- Small cleanups, renames and refactoring extracted from the upcoming
Intel TDX support
- Don't allocate kvm_mmu_page.shadowed_translation for shadow pages
that can't hold leafs SPTEs
- Unconditionally drop mmu_lock when allocating TDP MMU page tables
for eager page splitting, to avoid stalling vCPUs when splitting
huge pages
- Bug the VM instead of simply warning if KVM tries to split a SPTE
that is non-present or not-huge. KVM is guaranteed to end up in a
broken state because the callers fully expect a valid SPTE, it's
all but dangerous to let more MMU changes happen afterwards
x86 - AMD:
- Make per-CPU save_area allocations NUMA-aware
- Force sev_es_host_save_area() to be inlined to avoid calling into
an instrumentable function from noinstr code
- Base support for running SEV-SNP guests. API-wise, this includes a
new KVM_X86_SNP_VM type, encrypting/measure the initial image into
guest memory, and finalizing it before launching it. Internally,
there are some gmem/mmu hooks needed to prepare gmem-allocated
pages before mapping them into guest private memory ranges
This includes basic support for attestation guest requests, enough
to say that KVM supports the GHCB 2.0 specification
There is no support yet for loading into the firmware those signing
keys to be used for attestation requests, and therefore no need yet
for the host to provide certificate data for those keys.
To support fetching certificate data from userspace, a new KVM exit
type will be needed to handle fetching the certificate from
userspace.
An attempt to define a new KVM_EXIT_COCO / KVM_EXIT_COCO_REQ_CERTS
exit type to handle this was introduced in v1 of this patchset, but
is still being discussed by community, so for now this patchset
only implements a stub version of SNP Extended Guest Requests that
does not provide certificate data
x86 - Intel:
- Remove an unnecessary EPT TLB flush when enabling hardware
- Fix a series of bugs that cause KVM to fail to detect nested
pending posted interrupts as valid wake eents for a vCPU executing
HLT in L2 (with HLT-exiting disable by L1)
- KVM: x86: Suppress MMIO that is triggered during task switch
emulation
Explicitly suppress userspace emulated MMIO exits that are
triggered when emulating a task switch as KVM doesn't support
userspace MMIO during complex (multi-step) emulation
Silently ignoring the exit request can result in the
WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to userspace
for some other reason prior to purging mmio_needed
See commit 0dc902267c ("KVM: x86: Suppress pending MMIO write
exits if emulator detects exception") for more details on KVM's
limitations with respect to emulated MMIO during complex emulator
flows
Generic:
- Rename the AS_UNMOVABLE flag that was introduced for KVM to
AS_INACCESSIBLE, because the special casing needed by these pages
is not due to just unmovability (and in fact they are only
unmovable because the CPU cannot access them)
- New ioctl to populate the KVM page tables in advance, which is
useful to mitigate KVM page faults during guest boot or after live
migration. The code will also be used by TDX, but (probably) not
through the ioctl
- Enable halt poll shrinking by default, as Intel found it to be a
clear win
- Setup empty IRQ routing when creating a VM to avoid having to
synchronize SRCU when creating a split IRQCHIP on x86
- Rework the sched_in/out() paths to replace kvm_arch_sched_in() with
a flag that arch code can use for hooking both sched_in() and
sched_out()
- Take the vCPU @id as an "unsigned long" instead of "u32" to avoid
truncating a bogus value from userspace, e.g. to help userspace
detect bugs
- Mark a vCPU as preempted if and only if it's scheduled out while in
the KVM_RUN loop, e.g. to avoid marking it preempted and thus
writing guest memory when retrieving guest state during live
migration blackout
Selftests:
- Remove dead code in the memslot modification stress test
- Treat "branch instructions retired" as supported on all AMD Family
17h+ CPUs
- Print the guest pseudo-RNG seed only when it changes, to avoid
spamming the log for tests that create lots of VMs
- Make the PMU counters test less flaky when counting LLC cache
misses by doing CLFLUSH{OPT} in every loop iteration"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits)
crypto: ccp: Add the SNP_VLEK_LOAD command
KVM: x86/pmu: Add kvm_pmu_call() to simplify static calls of kvm_pmu_ops
KVM: x86: Introduce kvm_x86_call() to simplify static calls of kvm_x86_ops
KVM: x86: Replace static_call_cond() with static_call()
KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event
x86/sev: Move sev_guest.h into common SEV header
KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event
KVM: x86: Suppress MMIO that is triggered during task switch emulation
KVM: x86/mmu: Clean up make_huge_page_split_spte() definition and intro
KVM: x86/mmu: Bug the VM if KVM tries to split a !hugepage SPTE
KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY
KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory()
KVM: x86/mmu: Make kvm_mmu_do_page_fault() return mapped level
KVM: x86/mmu: Account pf_{fixed,emulate,spurious} in callers of "do page fault"
KVM: x86/mmu: Bump pf_taken stat only in the "real" page fault handler
KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory
KVM: Document KVM_PRE_FAULT_MEMORY ioctl
mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE
perf kvm: Add kvm-stat for loongarch64
LoongArch: KVM: Add PV steal time support in guest side
...
One small cleanup to use sizeof(*pointer)
A series of patches to add MODULE_DESCRIPTIONS() to eliminate make W=1
warnings.
-----BEGIN PGP SIGNATURE-----
iIoEABYKADIWIQSgX9xt+GwmrJEQ+euebuN7TNx1MQUCZpqNARQcaXJhLndlaW55
QGludGVsLmNvbQAKCRCebuN7TNx1McgTAQDx5VvRC3htc7UM/i6524si2kurfIOd
uuB+AHV53PfrkAD/ad0DfzW22kWR/QzyXtVLguNYoNKN+ipOHnJ0Atzgxgw=
=HPWt
-----END PGP SIGNATURE-----
Merge tag 'libnvdimm-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Ira Weiny:
- One small cleanup to use sizeof(*pointer)
- Add MODULE_DESCRIPTIONS() to eliminate make W=1 warnings
* tag 'libnvdimm-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
testing: nvdimm: Add MODULE_DESCRIPTION() macros
testing: nvdimm: iomap: add MODULE_DESCRIPTION()
dax: add missing MODULE_DESCRIPTION() macros
nvdimm: add missing MODULE_DESCRIPTION() macros
ACPI: NFIT: add missing MODULE_DESCRIPTION() macro
nvdimm/btt: use sizeof(*pointer) instead of sizeof(type)
* Support for various new ISA extensions:
* The Zve32[xf] and Zve64[xfd] sub-extensios of the vector
extension.
* Zimop and Zcmop for may-be-operations.
* The Zca, Zcf, Zcd and Zcb sub-extensions of the C extension.
* Zawrs,
* riscv,cpu-intc is now dtschema.
* A handful of performance improvements and cleanups to text patching.
* Support for memory hot{,un}plug
* The highest user-allocatable virtual address is now visible in
hwprobe.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmabIGETHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYiQe8D/9QPCaOnoP5OCZbwjkRBwaVxyknNyD0
l+YNXk7Jk3B/oaOv3d7Bz+uWt1SG4j4jkfyuGJ81StZykp4/R7T823TZrPhog9VX
IJm580MtvE49I2i1qJ+ZQti9wpiM+80lFnyMCzY6S7rrM9m62tKgUpARZcWoA55P
iUo5bku99TYCcU2k1pnPrNSPQvVpECpv7tG0PwKpQd5DiYjbPp+aw5cQWN+izdOB
6raOZ0buzP7McszvO/gcJs+kuHwrp0JSRvNxc2pwYZ0lx00p3hSV8UdtIMlI9Qm/
z3gkQGHwc6UVMPHo1x0Gr5ShUTCI/iSwy4/7aY4NNXF6Sj99b8alt9GcbYqNAE7V
k7sibCR7dhL4ods/GFMmzR7cQYlwlwtO+/ILak7rXhNvA32Xy1WUABguhP9ElTmw
1ZS2hnRv6wc7MA2V7HBamf5mPXM6HQyC3oKy3njzDSJdiGIG7aa+TOfRAD+L/1Du
QjIrKp6XcPIsZNjh8H3nMDVJ0VvDNnS4d4LbfNQc23VPzf57kFUqbli1pS0hBjFT
ELEItH9dgSx+T5Qebdy/QMC3RG8Yc1IUdw6VQ7Jny/uCCEZNq+VZ+bXxspMmswCp
sUIyDplJTJfRt3G2OxK0b95x6oj8jbaJOQfv6PBF71dDBsChg8eXFVJ2NDrX4Bvr
h2MPK7vGBtFz8w==
=+ICi
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-6.11-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- Support for various new ISA extensions:
* The Zve32[xf] and Zve64[xfd] sub-extensios of the vector
extension
* Zimop and Zcmop for may-be-operations
* The Zca, Zcf, Zcd and Zcb sub-extensions of the C extension
* Zawrs
- riscv,cpu-intc is now dtschema
- A handful of performance improvements and cleanups to text patching
- Support for memory hot{,un}plug
- The highest user-allocatable virtual address is now visible in
hwprobe
* tag 'riscv-for-linus-6.11-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (58 commits)
riscv: lib: relax assembly constraints in hweight
riscv: set trap vector earlier
KVM: riscv: selftests: Add Zawrs extension to get-reg-list test
KVM: riscv: Support guest wrs.nto
riscv: hwprobe: export Zawrs ISA extension
riscv: Add Zawrs support for spinlocks
dt-bindings: riscv: Add Zawrs ISA extension description
riscv: Provide a definition for 'pause'
riscv: hwprobe: export highest virtual userspace address
riscv: Improve sbi_ecall() code generation by reordering arguments
riscv: Add tracepoints for SBI calls and returns
riscv: Optimize crc32 with Zbc extension
riscv: Enable DAX VMEMMAP optimization
riscv: mm: Add support for ZONE_DEVICE
virtio-mem: Enable virtio-mem for RISC-V
riscv: Enable memory hotplugging for RISC-V
riscv: mm: Take memory hotplug read-lock during kernel page table dump
riscv: mm: Add memory hotplugging support
riscv: mm: Add pfn_to_kaddr() implementation
riscv: mm: Refactor create_linear_mapping_range() for memory hot add
...
Commit cf8e865810 ("arch: Remove Itanium (IA-64) architecture")
removed the last use of the absolute kallsyms.
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/all/20240221202655.2423854-1-jannh@google.com/
[masahiroy@kernel.org: rebase the code and reword the commit description]
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
- Remove support for 40x CPUs & platforms.
- Add support to the 64-bit BPF JIT for cpu v4 instructions.
- Fix PCI hotplug driver crash on powernv.
- Fix doorbell emulation for KVM on PAPR guests (nestedv2).
- Fix KVM nested guest handling of some less used SPRs.
- Online NUMA nodes with no CPU/memory if they have a PCI device attached.
- Reduce memory overhead of enabling kfence on 64-bit Radix MMU kernels.
- Reimplement the iommu table_group_ops for pseries for VFIO SPAPR TCE.
Thanks to: Anjali K, Artem Savkov, Athira Rajeev, Breno Leitao, Brian King,
Celeste Liu, Christophe Leroy, Esben Haabendal, Gaurav Batra, Gautam Menghani,
Haren Myneni, Hari Bathini, Jeff Johnson, Krishna Kumar, Krzysztof Kozlowski,
Nathan Lynch, Nicholas Piggin, Nick Bowler, Nilay Shroff, Rob Herring (Arm),
Shawn Anastasio, Shivaprasad G Bhat, Sourabh Jain, Srikar Dronamraju, Timothy
Pearson, Uwe Kleine-König, Vaibhav Jain.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmaaUNITHG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgDA+D/4o7OZ+SY0plTlMKSy3hW/SRXVj/byA
CCKdizNY+3Rf/+K7KhuLOUPXhZOemLPE0xfKS3ND4mIEKCswzzXqmi6kjPH0qd8q
qUhkHbt/LNpNJzZOYYw+usaklMTMdZtAl/jD9WEvGwgu2EYHgrujRIq04kEI1b0e
OPiRnXOZcfevRBepQmYZKHvFlCRRa5vvsQcvLfY64yFqD0AsKTHgIi/48Dn33pb2
hqHYyV1tZA3uT86Z1TgF1OG83VOSDsgc19Sb2xn14O9aJJ7lD2TOgVa4P4FfBlXA
TXYYGQwK31ymGVWGcGfebVdC1ECeTem9n28vlk5I0NO9xNgPok/Ov4DAiZ+u1G0E
3CXRDx9Uz2yPcGBJI2dpxfp2iw83Ad2DtBzAdukMD36xnC7xfrQz+W9SQfbcPJ8e
I5SMAstWuLNgrX7YkjAOnXh1N41kht/mdV6KHdcMxPc7jOtAD65gUOZcgwYLeXlT
Av17Ax0PMbiQ1BpFe2KNr/0T9Ba5k5rN7oDSKncDAq4uX8LcZKHj4bSHT9KroT1C
q+GERspoCYp2VDMO742Jm7KTmQDHsS5y4Q+iSdOR8cQBXF613FaryDxSoJZhg2pf
C2zIVED13RGcjIFcWlv73iA6QpBsphM+WWFz7mjULyJhxFQwm6BYt+Wy6jFu84oH
sOgvPH8YyaK2uA==
=eHVd
-----END PGP SIGNATURE-----
Merge tag 'powerpc-6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
- Remove support for 40x CPUs & platforms
- Add support to the 64-bit BPF JIT for cpu v4 instructions
- Fix PCI hotplug driver crash on powernv
- Fix doorbell emulation for KVM on PAPR guests (nestedv2)
- Fix KVM nested guest handling of some less used SPRs
- Online NUMA nodes with no CPU/memory if they have a PCI device
attached
- Reduce memory overhead of enabling kfence on 64-bit Radix MMU kernels
- Reimplement the iommu table_group_ops for pseries for VFIO SPAPR TCE
Thanks to: Anjali K, Artem Savkov, Athira Rajeev, Breno Leitao, Brian
King, Celeste Liu, Christophe Leroy, Esben Haabendal, Gaurav Batra,
Gautam Menghani, Haren Myneni, Hari Bathini, Jeff Johnson, Krishna
Kumar, Krzysztof Kozlowski, Nathan Lynch, Nicholas Piggin, Nick Bowler,
Nilay Shroff, Rob Herring (Arm), Shawn Anastasio, Shivaprasad G Bhat,
Sourabh Jain, Srikar Dronamraju, Timothy Pearson, Uwe Kleine-König, and
Vaibhav Jain.
* tag 'powerpc-6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (57 commits)
Documentation/powerpc: Mention 40x is removed
powerpc: Remove 40x leftovers
macintosh/therm_windtunnel: fix module unload.
powerpc: Check only single values are passed to CPU/MMU feature checks
powerpc/xmon: Fix disassembly CPU feature checks
powerpc: Drop clang workaround for builtin constant checks
powerpc64/bpf: jit support for signed division and modulo
powerpc64/bpf: jit support for sign extended mov
powerpc64/bpf: jit support for sign extended load
powerpc64/bpf: jit support for unconditional byte swap
powerpc64/bpf: jit support for 32bit offset jmp instruction
powerpc/pci: Hotplug driver bridge support
pci/hotplug/pnv_php: Fix hotplug driver crash on Powernv
powerpc/configs: Update defconfig with now user-visible CONFIG_FSL_IFC
powerpc: add missing MODULE_DESCRIPTION() macros
macintosh/mac_hid: add MODULE_DESCRIPTION()
KVM: PPC: add missing MODULE_DESCRIPTION() macros
powerpc/kexec: Use of_property_read_reg()
powerpc/64s/radix/kfence: map __kfence_pool at page granularity
powerpc/pseries/iommu: Define spapr_tce_table_group_ops only with CONFIG_IOMMU_API
...
Here is the big set of USB and Thunderbolt changes for 6.11-rc1.
Nothing earth-shattering in here, just constant forward progress in
adding support for new hardware and better debugging functionalities for
thunderbolt devices and the subsystem. Included in here are:
- thunderbolt debugging update and driver additions
- xhci driver updates
- typec driver updates
- kselftest device driver changes (acked by the relevant maintainers,
depended on other changes in this tree.)
- cdns3 driver updates
- gadget driver updates
- MODULE_DESCRIPTION() additions
- dwc3 driver updates and fixes
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZppaNA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylXZwCgrEtIAQw0x6EF7w/iTWVS5UJj9AEAoLCj5UwO
WX978uThyUctuYYKbw+8
=Cm7j
-----END PGP SIGNATURE-----
Merge tag 'usb-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt updates from Greg KH:
"Here is the big set of USB and Thunderbolt changes for 6.11-rc1.
Nothing earth-shattering in here, just constant forward progress in
adding support for new hardware and better debugging functionalities
for thunderbolt devices and the subsystem. Included in here are:
- thunderbolt debugging update and driver additions
- xhci driver updates
- typec driver updates
- kselftest device driver changes (acked by the relevant maintainers,
depended on other changes in this tree.)
- cdns3 driver updates
- gadget driver updates
- MODULE_DESCRIPTION() additions
- dwc3 driver updates and fixes
All of these have been in linux-next for a while with no reported
issues"
* tag 'usb-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (112 commits)
kselftest: devices: Add test to detect device error logs
kselftest: Move ksft helper module to common directory
kselftest: devices: Move discoverable devices test to subdirectory
usb: gadget: f_uac2: fix non-newline-terminated function name
USB: uas: Implement the new shutdown callback
USB: core: add 'shutdown' callback to usb_driver
usb: typec: Drop explicit initialization of struct i2c_device_id::driver_data to 0
usb: dwc3: enable CCI support for AMD-xilinx DWC3 controller
usb: dwc2: add support for other Lantiq SoCs
usb: gadget: Use u16 types for 16-bit fields
usb: gadget: midi2: Fix incorrect default MIDI2 protocol setup
usb: dwc3: core: Check all ports when set phy suspend
usb: typec: tcpci: add support to set connector orientation
dt-bindings: usb: Convert fsl-usb to yaml
usb: typec: ucsi: reorder operations in ucsi_run_command()
usb: typec: ucsi: extract common code for command handling
usb: typec: ucsi: inline ucsi_read_message_in
usb: typec: ucsi: rework command execution functions
usb: typec: ucsi: split read operation
usb: typec: ucsi: simplify command sending API
...
Including fixes from netfilter.
Current release - new code bugs:
- eth: fbnic: fix s390 build.
- eth: airoha: fix NULL pointer dereference in airoha_qdma_cleanup_rx_queue()
Previous releases - regressions:
- flow_dissector: use DEBUG_NET_WARN_ON_ONCE
- ipv4: fix incorrect TOS in route get reply
- dsa: fix chip-wide frame size config in some drivers
Previous releases - always broken:
- netfilter: nf_set_pipapo: fix initial map fill
- eth: gve: fix XDP TX completion handling when counters overflow
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmaafj4SHHBhYmVuaUBy
ZWRoYXQuY29tAAoJECkkeY3MjxOkSsIP/jLODokNb/RcIuOYZVlgn2C5icMeZtqv
6BIliZeE1EcnMWqOnheHaJVklq17ChAYbj/GNN8zQeOQhPA2eFCzPPLl/UpF9/ik
wuvk+QQ4EM3T/SwWLKmht9UJioVi66szl3vZ+ByJ4BgGiJWORW1dK/AYzyYrsplk
BMpQTs/Q/ekzWJzBtX+1Cz8izX1gl+dXhMezSdg9cW4KaBu1Hqwny954HF6keUQn
h7bbAWiOAb5bhqUzgCdgF4gXp+uaWEzG1a1kY1G86NjXC5H0G03+vl37RbY/1kYh
lUa/3nfH2V7Sy+MYTvjQe66obVeQeOh/PhRMlbkEVphpKs8XtHfsP433iOMZZn2D
Z2Yb4yICnpNwbPmK1fyFvOrY7zGp2yZ2dSDEhowGjcyoQPO6RPnBfvArl66phIOm
DWfEq79dYa0eEnCM174BLZbsjcHeeYQX2b8lagUslC0Oel+ijUG26XeMqs5W3at9
QVcMV+aPE7pibpAq4M+qqkldv49QmXRsQai0BMa0+qEPyDKLe2nnf6L+TrDfN7Zf
tpiiZZZGpcctZ1cOfyuebB37z/jtHJQ94IfLCD9RzpHlpOtCFycO18adrohhI58Q
TaZf7U1CqC/UREckwE5F9ypQhdiQHEH84rgvKT3zFFRYftWTQmBCAUs3JzBPbYwO
RO2GDFM/htOu
=Q1xF
-----END PGP SIGNATURE-----
Merge tag 'net-6.11-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from netfilter.
Notably this includes fixes for a s390 build breakage.
Current release - new code bugs:
- eth: fbnic: fix s390 build
- eth: airoha: fix NULL pointer dereference in
airoha_qdma_cleanup_rx_queue()
Previous releases - regressions:
- flow_dissector: use DEBUG_NET_WARN_ON_ONCE
- ipv4: fix incorrect TOS in route get reply
- dsa: fix chip-wide frame size config in some drivers
Previous releases - always broken:
- netfilter: nf_set_pipapo: fix initial map fill
- eth: gve: fix XDP TX completion handling when counters overflow"
* tag 'net-6.11-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net:
eth: fbnic: don't build the driver when skb has more than 21 frags
net: dsa: b53: Limit chip-wide jumbo frame config to CPU ports
net: dsa: mv88e6xxx: Limit chip-wide frame size config to CPU ports
net: airoha: Fix NULL pointer dereference in airoha_qdma_cleanup_rx_queue()
net: wwan: t7xx: add support for Dell DW5933e
ipv4: Fix incorrect TOS in fibmatch route get reply
ipv4: Fix incorrect TOS in route get reply
net: flow_dissector: use DEBUG_NET_WARN_ON_ONCE
driver core: auxiliary bus: Fix documentation of auxiliary_device
net: airoha: fix error branch in airoha_dev_xmit and airoha_set_gdm_ports
gve: Fix XDP TX completion handling when counters overflow
ipvs: properly dereference pe in ip_vs_add_service
selftests: netfilter: add test case for recent mismatch bug
netfilter: nf_set_pipapo: fix initial map fill
netfilter: ctnetlink: use helper function to calculate expect ID
eth: fbnic: fix s390 build.
Lots of changes in this cycle, but mostly for cleanups and
refactoring. Significant amount of changes are about DT schema
conversions for ASoC at this time while we see other usual
suspects, too. Some highlights below:
Core:
- Re-introduction of PCM sync ID support API
- MIDI2 time-base extension in ALSA sequencer API
ASoC:
- Syncing of features between simple-audio-card and the two
audio-graph cards
- Support for specifying the order of operations for components
within cards to allow quirking for unusual systems
- Lots of DT schema conversions
- Continued SOF/Intel updates for topology, SoundWire, IPC3/4
- New support for Asahi Kasei AK4619, Cirrus Logic CS530x, Everest
Semiconductors ES8311, NXP i.MX95 and LPC32xx, Qualcomm LPASS
v2.5 and WCD937x, Realtek RT1318 and RT1320 and Texas
Instruments PCM5242
HD-audio:
- More quirks, Intel PantherLake support, senarytech codec support
- Refactoring of Cirrus codec component-binding
Others:
- ALSA control kselftest improvements, and fixes for input value
checks in various drivers
-----BEGIN PGP SIGNATURE-----
iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAmaZNdoOHHRpd2FpQHN1
c2UuZGUACgkQLtJE4w1nLE/PWw//XYFQ2v+bc0x62LI1rIEt1/mSz6R1moHf85fK
CjDOvHoGlZEkXuTmycK8b522/9tslHyE+8P97TZAy/6ph/yT44JgwQaadAvTZdWK
eKrchogf+v6DaQar8+nmXp8409HBcfJdrSJth2xR5OhY741/kGBF1/YCBHZaIQan
T87ag0tu1PVWQuLhdRlghkNYds+oaSX6wMaLRzVYI2TFYfHZOWYfVYd/NACb8KtO
z66TqybOxOpq4xCi+umNaGn2TxdDvo427JgioAKzcGLodowRKmqNV+mXddfrhBEE
Fwq4o8YGxgX+oaNn4aLQdrrREc1tuwQj0Kwpt/rkh4ESTgugcElq5hJCgPY8U3Ej
5+ih7ZeIojKnfjNivHuath7tXe1inqPEK3RBt3qMoUldIxNhJ8WfIF0RNzW/QRY2
g4JAI/4lswqPz6vYKULatDk+ZEW6PiV72kwW+4Vt7NxZnn9VFzP27qHuwkUHP5HM
0q4/NKrv+MFPedOLEeEm/1dmE7NRT4tRJuIV+RwMJ0cyP4l2jSCwyDpxfkFqGitc
wB0AXK3YLwISlKjziCox1cAex8F2XhjCdpOyOV6hTc3Dv/DySMHysv+4Uf4/kvst
3GrqdkMHy4cEUYj/Sj+VunfColsX2KnQAN+e4Sonn+5nPsw7ypGkpM1Kf+wTQuNK
EoxpzGo=
=hn0h
-----END PGP SIGNATURE-----
Merge tag 'sound-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound updates from Takashi Iwai:
"Lots of changes in this cycle, but mostly for cleanups and
refactoring.
Significant amount of changes are about DT schema conversions for ASoC
at this time while we see other usual suspects, too.
Some highlights below:
Core:
- Re-introduction of PCM sync ID support API
- MIDI2 time-base extension in ALSA sequencer API
ASoC:
- Syncing of features between simple-audio-card and the two
audio-graph cards
- Support for specifying the order of operations for components
within cards to allow quirking for unusual systems
- Lots of DT schema conversions
- Continued SOF/Intel updates for topology, SoundWire, IPC3/4
- New support for Asahi Kasei AK4619, Cirrus Logic CS530x, Everest
Semiconductors ES8311, NXP i.MX95 and LPC32xx, Qualcomm LPASS v2.5
and WCD937x, Realtek RT1318 and RT1320 and Texas Instruments
PCM5242
HD-audio:
- More quirks, Intel PantherLake support, senarytech codec support
- Refactoring of Cirrus codec component-binding
Others:
- ALSA control kselftest improvements, and fixes for input value
checks in various drivers"
* tag 'sound-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (349 commits)
kselftest/alsa: Log the PCM ID in pcm-test
kselftest/alsa: Use card name rather than number in test names
ALSA: hda/realtek: Fix the speaker output on Samsung Galaxy Book Pro 360
ALSA: hda/tas2781: Add new quirk for Lenovo Hera2 Laptop
ALSA: seq: ump: Skip useless ports for static blocks
ALSA: pcm_dmaengine: Don't synchronize DMA channel when DMA is paused
ALSA: usb: Use BIT() for bit values
ALSA: usb: Fix UBSAN warning in parse_audio_unit()
ALSA: hda/realtek: Enable headset mic on Positivo SU C1400
ASoC: tas2781: Add new Kontrol to set tas2563 digital Volume
ASoC: codecs: wcd937x: Remove separate handling for vdd-buck supply
ASoC: codecs: wcd937x: Remove the string compare in MIC BIAS widget settings
ASoC: codecs: wcd937x-sdw: Fix Unbalanced pm_runtime_enable
ASoC: dt-bindings: cirrus,cs42xx8: Convert to dtschema
ASoC: cs530x: Remove bclk from private structure
ASoC: cs530x: Calculate proper bclk rate using TDM
ASoC: dt-bindings: cirrus,cs4270: Convert to dtschema
firmware: cs_dsp: Rename fw_ver to wmfw_ver
firmware: cs_dsp: Clarify wmfw format version log message
firmware: cs_dsp: Make wmfw and bin filename arguments const char *
...
Several new features here:
- Virtio find vqs API has been reworked
(required to fix the scalability issue we have with
adminq, which I hope to merge later in the cycle)
- vDPA driver for Marvell OCTEON
- virtio fs performance improvement
- mlx5 migration speedups
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmaXjQQPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpnIsH/jVNqAQbe/vaBQdNMdnsA+P9A9unLbYRxYCQ
tN73mQRIXKtnZHBRAEbMGq52HPYg8HlN2HJSgyNo6I6t8VD+PiOco7m+3GpmqEcW
aXPOPl0BAbVoDgyutxRuuodP8Z61lBx0mG6iOxpzTXOPGlpQqtPCFHO8YnodqnPf
tMix/5uAqgZKV2siCbw5DtzwEc0gDHU8qsD0/nyoS5nBDF9yh/ardr5P/qiyFDQH
atCNYTOhIFU83pLAaw0fpCGbkt7gxf+5RpWVx3wkYww+/MwvYhsveRvQyaGbBz3n
WDtET3SOtVTta98OAGIKCq/2z8f6mYXBP7vXapBgnJG3vwS/poQ=
=LYua
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:
"Several new features here:
- Virtio find vqs API has been reworked (required to fix the
scalability issue we have with adminq, which I hope to merge later
in the cycle)
- vDPA driver for Marvell OCTEON
- virtio fs performance improvement
- mlx5 migration speedups
Fixes, cleanups all over the place"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (56 commits)
virtio: rename virtio_find_vqs_info() to virtio_find_vqs()
virtio: remove unused virtio_find_vqs() and virtio_find_vqs_ctx() helpers
virtio: convert the rest virtio_find_vqs() users to virtio_find_vqs_info()
virtio_balloon: convert to use virtio_find_vqs_info()
virtiofs: convert to use virtio_find_vqs_info()
scsi: virtio_scsi: convert to use virtio_find_vqs_info()
virtio_net: convert to use virtio_find_vqs_info()
virtio_crypto: convert to use virtio_find_vqs_info()
virtio_console: convert to use virtio_find_vqs_info()
virtio_blk: convert to use virtio_find_vqs_info()
virtio: rename find_vqs_info() op to find_vqs()
virtio: remove the original find_vqs() op
virtio: call virtio_find_vqs_info() from virtio_find_single_vq() directly
virtio: convert find_vqs() op implementations to find_vqs_info()
virtio_pci: convert vp_*find_vqs() ops to find_vqs_info()
virtio: introduce virtio_queue_info struct and find_vqs_info() config op
virtio: make virtio_find_single_vq() call virtio_find_vqs()
virtio: make virtio_find_vqs() call virtio_find_vqs_ctx()
caif_virtio: use virtio_find_single_vq() for single virtqueue finding
vdpa/mlx5: Don't enable non-active VQs in .set_vq_ready()
...
This adds two tests for vgetrandom. The first one, vdso_test_chacha,
simply checks that the assembly implementation of chacha20 matches that
of libsodium, a basic sanity check that should catch most errors. The
second, vdso_test_getrandom, is a full "libc-like" implementation of the
userspace side of vgetrandom() support. It's meant to be used also as
example code for libcs that might be integrating this.
Cc: linux-kselftest@vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
The vDSO getrandom() implementation works with a buffer allocated with a
new system call that has certain requirements:
- It shouldn't be written to core dumps.
* Easy: VM_DONTDUMP.
- It should be zeroed on fork.
* Easy: VM_WIPEONFORK.
- It shouldn't be written to swap.
* Uh-oh: mlock is rlimited.
* Uh-oh: mlock isn't inherited by forks.
- It shouldn't reserve actual memory, but it also shouldn't crash when
page faulting in memory if none is available
* Uh-oh: VM_NORESERVE means segfaults.
It turns out that the vDSO getrandom() function has three really nice
characteristics that we can exploit to solve this problem:
1) Due to being wiped during fork(), the vDSO code is already robust to
having the contents of the pages it reads zeroed out midway through
the function's execution.
2) In the absolute worst case of whatever contingency we're coding for,
we have the option to fallback to the getrandom() syscall, and
everything is fine.
3) The buffers the function uses are only ever useful for a maximum of
60 seconds -- a sort of cache, rather than a long term allocation.
These characteristics mean that we can introduce VM_DROPPABLE, which
has the following semantics:
a) It never is written out to swap.
b) Under memory pressure, mm can just drop the pages (so that they're
zero when read back again).
c) It is inherited by fork.
d) It doesn't count against the mlock budget, since nothing is locked.
e) If there's not enough memory to service a page fault, it's not fatal,
and no signal is sent.
This way, allocations used by vDSO getrandom() can use:
VM_DROPPABLE | VM_DONTDUMP | VM_WIPEONFORK | VM_NORESERVE
And there will be no problem with OOMing, crashing on overcommitment,
using memory when not in use, not wiping on fork(), coredumps, or
writing out to swap.
In order to let vDSO getrandom() use this, expose these via mmap(2) as
MAP_DROPPABLE.
Note that this involves removing the MADV_FREE special case from
sort_folio(), which according to Yu Zhao is unnecessary and will simply
result in an extra call to shrink_folio_list() in the worst case. The
chunk removed reenables the swapbacked flag, which we don't want for
VM_DROPPABLE, and we can't conditionalize it here because there isn't a
vma reference available.
Finally, the provided self test ensures that this is working as desired.
Cc: linux-mm@kvack.org
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Major changes:
- The iova_bitmap logic for efficiently reporting dirty pages back to
userspace has a few more tricky corner case bugs that have been resolved
and backed with new tests. The revised version has simpler logic.
- Shared branch with iommu for handle support when doing domain
attach. Handles allow the domain owner to include additional private data
on a per-device basis.
- IO Page Fault Reporting to userspace via iommufd. Page faults can be
generated on fault capable HWPTs when a translation is not present.
Routing them to userspace would allow a VMM to be able to virtualize them
into an emulated vIOMMU. This is the next step to fully enabling vSVA
support.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCZpfo4AAKCRCFwuHvBreF
YTO6APwMLxeWmHbE1H+7ZPuXP7B1aDuwRLczZOo3i816pIj+bQD+OywEA/NcljK6
6NLeqyUe7tECtVrFPSiRT9lWVuzZSQs=
=rnN/
-----END PGP SIGNATURE-----
Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd
Pull iommufd updates from Jason Gunthorpe:
- The iova_bitmap logic for efficiently reporting dirty pages back to
userspace has a few more tricky corner case bugs that have been
resolved and backed with new tests.
The revised version has simpler logic.
- Shared branch with iommu for handle support when doing domain attach.
Handles allow the domain owner to include additional private data on
a per-device basis.
- IO Page Fault Reporting to userspace via iommufd. Page faults can be
generated on fault capable HWPTs when a translation is not present.
Routing them to userspace would allow a VMM to be able to virtualize
them into an emulated vIOMMU. This is the next step to fully enabling
vSVA support.
* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (26 commits)
iommufd: Put constants for all the uAPI enums
iommufd: Fix error pointer checking
iommufd: Add check on user response code
iommufd: Remove IOMMUFD_PAGE_RESP_FAILURE
iommufd: Require drivers to supply the cache_invalidate_user ops
iommufd/selftest: Add coverage for IOPF test
iommufd/selftest: Add IOPF support for mock device
iommufd: Associate fault object with iommufd_hw_pgtable
iommufd: Fault-capable hwpt attach/detach/replace
iommufd: Add iommufd fault object
iommufd: Add fault and response message definitions
iommu: Extend domain attach group with handle support
iommu: Add attach handle to struct iopf_group
iommu: Remove sva handle list
iommu: Introduce domain attachment handle
iommufd/iova_bitmap: Remove iterator logic
iommufd/iova_bitmap: Dynamic pinning on iova_bitmap_set()
iommufd/iova_bitmap: Consolidate iova_bitmap_set exit conditionals
iommufd/iova_bitmap: Move initial pinning to iova_bitmap_for_each()
iommufd/iova_bitmap: Cache mapped length in iova_bitmap_map struct
...
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmaXl0kACgkQu+CwddJF
iJrOlgf+N/G7BmgoW2CBF7mKsvCYs+pX3xeBuxPtsuq4FD386nsPFMN8gWAYLG3q
ZU1z1S+0M8LhTg6/G9jMYLHt2Y7WhYbhFTjTHmULJkuhMDTUP9CRYy4XZ+hdPtHF
30ezSdJQF9x/XxCSaaRVK1s+SMVHFg5xAOHKpfkNSamcMz9g+ZkYyPBr10/VoKd0
JqwhW7r6hrlvWAiqY3QKCOvohIWglgvBUnNjUGMh1cUkOE2aYLYHklhRwICKgA6z
p/2BUXiAEWUtgBkUrizwm/pdhJXLs0pOeYarVZP1v83tQMxyrc6XLNnqhvxP3DPW
31thF5Rf9I8WaWTczXhxsAwFjqO3KQ==
=4uf9
-----END PGP SIGNATURE-----
Merge tag 'slab-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab updates from Vlastimil Babka:
"The most prominent change this time is the kmem_buckets based
hardening of kmalloc() allocations from Kees Cook.
We have also extended the kmalloc() alignment guarantees for
non-power-of-two sizes in a way that benefits rust.
The rest are various cleanups and non-critical fixups.
- Dedicated bucket allocator (Kees Cook)
This series [1] enhances the probabilistic defense against heap
spraying/grooming of CONFIG_RANDOM_KMALLOC_CACHES from last year.
kmalloc() users that are known to be useful for exploits can get
completely separate set of kmalloc caches that can't be shared with
other users. The first converted users are alloc_msg() and
memdup_user().
The hardening is enabled by CONFIG_SLAB_BUCKETS.
- Extended kmalloc() alignment guarantees (Vlastimil Babka)
For years now we have guaranteed natural alignment for power-of-two
allocations, but nothing was defined for other sizes (in practice,
we have two such buckets, kmalloc-96 and kmalloc-192).
To avoid unnecessary padding in the rust layer due to its alignment
rules, extend the guarantee so that the alignment is at least the
largest power-of-two divisor of the requested size.
This fits what rust needs, is a superset of the existing
power-of-two guarantee, and does not in practice change the layout
(and thus does not add overhead due to padding) of the kmalloc-96
and kmalloc-192 caches, unless slab debugging is enabled for them.
- Cleanups and non-critical fixups (Chengming Zhou, Suren
Baghdasaryan, Matthew Willcox, Alex Shi, and Vlastimil Babka)
Various tweaks related to the new alloc profiling code, folio
conversion, debugging and more leftovers after SLAB"
Link: https://lore.kernel.org/all/20240701190152.it.631-kees@kernel.org/ [1]
* tag 'slab-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
mm/memcg: alignment memcg_data define condition
mm, slab: move prepare_slab_obj_exts_hook under CONFIG_MEM_ALLOC_PROFILING
mm, slab: move allocation tagging code in the alloc path into a hook
mm/util: Use dedicated slab buckets for memdup_user()
ipc, msg: Use dedicated slab buckets for alloc_msg()
mm/slab: Introduce kmem_buckets_create() and family
mm/slab: Introduce kvmalloc_buckets_node() that can take kmem_buckets argument
mm/slab: Plumb kmem_buckets into __do_kmalloc_node()
mm/slab: Introduce kmem_buckets typedef
slab, rust: extend kmalloc() alignment guarantees to remove Rust padding
slab: delete useless RED_INACTIVE and RED_ACTIVE
slab: don't put freepointer outside of object if only orig_size
slab: make check_object() more consistent
mm: Reduce the number of slab->folio casts
mm, slab: don't wrap internal functions with alloc_hooks()
* reserve_mem command line parameter to allow creation of named memory
reservation at boot time.
The driving use-case is to improve the ability of pstore to retain
ramoops data across reboots.
* cleaunps and small improvements in memblock and mm_init
* new tests cases in memblock test suite
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEeOVYVaWZL5900a/pOQOGJssO/ZEFAmaXfoIQHHJwcHRAa2Vy
bmVsLm9yZwAKCRA5A4Ymyw79kU5mCAC23vIrB8FRlORczMYj+V3VFss3OjKT92lS
fHGwq2oxHW+rdDpHXFObHU0D3k8d2l5jyrENRAAyA02qR0L6Pv8Na6pGxQua1eic
VIdw0PFQMsizD1AIj84Y6skkyyF/tvZHpmX0B12D5+Ur65DC/Z867Cm/lE33/fHv
/1+QB0JlG7W+FzxVstYyebY5/DVkH+bC7/A57FE2oB4BRXjEd8v9tTHBS4kRSvrE
zE2KFxeGajN749LHztIpIprPKehu8Gc3oLrYLNJO+uLFVCV8ey3OqVj0RXMG2wLl
hmVYqhbZM/Uz59D/P8pULD49f1Thjv/5A/MvUZ3SxM6zpWlsincf
=xrZd
-----END PGP SIGNATURE-----
Merge tag 'memblock-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
Pull memblock updates from Mike Rapoport:
- 'reserve_mem' command line parameter to allow creation of named
memory reservation at boot time.
The driving use-case is to improve the ability of pstore to retain
ramoops data across reboots.
- cleanups and small improvements in memblock and mm_init
- new tests cases in memblock test suite
* tag 'memblock-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
memblock tests: fix implicit declaration of function 'numa_valid_node'
memblock: Move late alloc warning down to phys alloc
pstore/ramoops: Add ramoops.mem_name= command line option
mm/memblock: Add "reserve_mem" to reserved named memory at boot up
mm/mm_init.c: don't initialize page->lru again
mm/mm_init.c: not always search next deferred_init_pfn from very beginning
mm/mm_init.c: use deferred_init_mem_pfn_range_in_zone() to decide loop condition
mm/mm_init.c: get the highest zone directly
mm/mm_init.c: move nr_initialised reset down a bit
mm/memblock: fix a typo in description of for_each_mem_region()
mm/mm_init.c: use memblock_region_memory_base_pfn() to get startpfn
mm/memblock: use PAGE_ALIGN_DOWN to get pgend in free_memmap
mm/memblock: return true directly on finding overlap region
memblock tests: add memblock_overlaps_region_checks
mm/memblock: fix comment for memblock_isolate_range()
memblock tests: add memblock_reserve_many_may_conflict_check()
memblock tests: add memblock_reserve_all_locations_check()
mm/memblock: remove empty dummy entry
Build
-----
* Build each directory as a library so that depedency check for the
python extension module can be automatic. But it also introduces
some trivial merge conflicts with other trees that touched perf tools
codes. Basically it changes perf-y to perf-util-y or similar and you
can find the resolution in the perf-next tree here.
- https://lore.kernel.org/r/Zn8HeRRX3JV2IcxQ@sirena.org.uk
- https://lore.kernel.org/r/20240709100536.238f4d12@canb.auug.org.au
* Use pkg-config to check libtraceevent and libtracefs.
perf sched
----------
* Add --task-name and --fuzzy-name options for `perf sched map`. It's
to focus on selected tasks only by removing unrelated tasks in the
output. It matches the task comm with the given string and the
--fuzzy-name option allows the partial matching.
$ sudo perf sched record -a sleep 1
$ sudo perf sched map --task-name kworker --fuzzy-name
. . . . - *A0 . . 481065.315131 secs A0 => kworker/5:2-i91:438521
. . . . - *- . . 481065.315160 secs
*B0 . . . - . . . 481065.316435 secs B0 => kworker/0:0-i91:437860
*- . . . . . . . 481065.316441 secs
. . . . . *A0 . . 481065.318703 secs
. . . . . *- . . 481065.318717 secs
. . *C0 . . . . . 481065.320544 secs C0 => kworker/u16:30-:430186
. . *- . . . . . 481065.320555 secs
. . *D0 . . . . . 481065.328524 secs D0 => kworker/2:0-kdm:429654
*B0 . D0 . - . . . 481065.328527 secs
*- . D0 . - . . . 481065.328535 secs
. . *- . . . . . 481065.328535 secs
* Fix -r/--repeat option of perf sched replay. The documentation said
-1 will work as infinity but it didn't accept the value. Update the
code and document to use 0 instead.
* Fix perf sched timehist to account the delay time for preempted tasks.
Perf event filtering
--------------------
* perf top gained filtering support on regular events using BPF like
perf record. Previously it was able to use it for tracepoints only.
* The BPF filter now supports filtering by UID/GID. This should be
preferred than -u <UID> option as it's racy to scan /proc to check
tasks for the user and fails to open an event for the task if it's
already gone.
$ sudo perf top -e cycles --filter "uid == $(id -u)"
perf report
-----------
* Skip dummy events in the group output by default. The --skip-empty
option controls display of empty events without samples. But perf
report can force display all events in a group. In this case, auto-
added a dummy event (for a system-wide record) ends up in the output.
Now it can skip those empty events even in the group display mode.
To preserve the old behavior, run this:
$ perf report --group --no-skip-empty
perf stat
---------
* Choose the most disaggregate option when multiple aggregation options
are given. It used to pick the last option in the command line but
it can be confusing and not consistent. Now it'll choose the smallest
unit.
For example, it'd aggregate the result per-core when the user gave
both --per-socket and --per-core options at the same time.
Internals
---------
* Fix `perf bench` when some CPUs are offline.
* Fix handling of JIT symbol mappings to accept "/tmp/perf-${PID}.map
patterns only so that it can not be confused by other /tmp/perf-*
files.
* Many improvements and fixes for `perf test`.
Others
------
* Support some new instructions for Intel-PT.
* Fix syscall ID mapping in perf trace.
* Document AMD IBS PMU usages.
* Change `perf lock info` to show map and thread info by default.
Vendor JSON events
------------------
* Update Intel events and metrics
* Add i.MX9[35] DDR metrics
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQSo2x5BnqMqsoHtzsmMstVUGiXMgwUCZpbe+QAKCRCMstVUGiXM
g4X6AQCXJ+eCuBy/9IvDpo86KjemPQk/brA0X8Ufi7JCd+Vj4QD7BvDobClewn+v
W3MhNQwIlFPJ8u3Act+tJZfUJjVF4wU=
=ZkPv
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-for-v6.11-2024-07-16' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Namhyung Kim:
"Build:
- Build each directory as a library so that depedency check for the
python extension module can be automatic
- Use pkg-config to check libtraceevent and libtracefs
perf sched:
- Add --task-name and --fuzzy-name options for `perf sched map`
It focuses on selected tasks only by removing unrelated tasks in
the output. It matches the task comm with the given string and the
--fuzzy-name option allows the partial matching:
$ sudo perf sched record -a sleep 1
$ sudo perf sched map --task-name kworker --fuzzy-name
. . . . - *A0 . . 481065.315131 secs A0 => kworker/5:2-i91:438521
. . . . - *- . . 481065.315160 secs
*B0 . . . - . . . 481065.316435 secs B0 => kworker/0:0-i91:437860
*- . . . . . . . 481065.316441 secs
. . . . . *A0 . . 481065.318703 secs
. . . . . *- . . 481065.318717 secs
. . *C0 . . . . . 481065.320544 secs C0 => kworker/u16:30-:430186
. . *- . . . . . 481065.320555 secs
. . *D0 . . . . . 481065.328524 secs D0 => kworker/2:0-kdm:429654
*B0 . D0 . - . . . 481065.328527 secs
*- . D0 . - . . . 481065.328535 secs
. . *- . . . . . 481065.328535 secs
- Fix -r/--repeat option of perf sched replay
The documentation said -1 will work as infinity but it didn't
accept the value. Update the code and document to use 0 instead
- Fix perf sched timehist to account the delay time for preempted
tasks
Perf event filtering:
- perf top gained filtering support on regular events using BPF like
perf record. Previously it was able to use it for tracepoints only
- The BPF filter now supports filtering by UID/GID. This should be
preferred than -u <UID> option as it's racy to scan /proc to check
tasks for the user and fails to open an event for the task if it's
already gone
$ sudo perf top -e cycles --filter "uid == $(id -u)"
perf report:
- Skip dummy events in the group output by default. The --skip-empty
option controls display of empty events without samples. But perf
report can force display all events in a group
In this case, auto-added a dummy event (for a system-wide record)
ends up in the output. Now it can skip those empty events even in
the group display mode
To preserve the old behavior, run this:
$ perf report --group --no-skip-empty
perf stat:
- Choose the most disaggregate option when multiple aggregation
options are given. It used to pick the last option in the command
line but it can be confusing and not consistent. Now it'll choose
the smallest unit
For example, it'd aggregate the result per-core when the user gave
both --per-socket and --per-core options at the same time
Internals:
- Fix `perf bench` when some CPUs are offline
- Fix handling of JIT symbol mappings to accept "/tmp/perf-${PID}.map
patterns only so that it can not be confused by other /tmp/perf-*
files
- Many improvements and fixes for `perf test`
Others:
- Support some new instructions for Intel-PT
- Fix syscall ID mapping in perf trace
- Document AMD IBS PMU usages
- Change `perf lock info` to show map and thread info by default
Vendor JSON events:
- Update Intel events and metrics
- Add i.MX9[35] DDR metrics"
* tag 'perf-tools-for-v6.11-2024-07-16' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (125 commits)
perf trace: Fix iteration of syscall ids in syscalltbl->entries
perf dso: Fix address sanitizer build
perf mem: Warn if memory events are not supported on all CPUs
perf arm-spe: Support multiple Arm SPE PMUs
perf build x86: Fix SC2034 error in syscalltbl.sh
perf record: Fix memset out-of-range error
perf sched map: Add --fuzzy-name option for fuzzy matching in task names
perf sched map: Add support for multiple task names using CSV
perf sched map: Add task-name option to filter the output map
perf build: Conditionally add feature check flags for libtrace{event,fs}
perf install: Don't propagate subdir to Documentation submake
perf vendor events arm64:: Add i.MX95 DDR Performance Monitor metrics
perf vendor events arm64:: Add i.MX93 DDR Performance Monitor metrics
perf dsos: When adding a dso into sorted dsos maintain the sort order
perf comm str: Avoid sort during insert
perf report: Calling available function for stats printing
perf intel-pt: Fix exclude_guest setting
perf intel-pt: Fix aux_watermark calculation for 64-bit size
perf sched replay: Fix -r/--repeat command line option for infinity
perf: pmus: Remove unneeded semicolon
...
- Use pretty formatting only on interactive tty in rtla/osnoise
- Better reporting when histogram is empty in rtla/osnoise
- Use the correct library name for "libtracefs" in feature detection
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZpbYbRQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qpWSAQD69Ea2bOCRuqCW6mjH4iPgWA3Re4Xt
rnUI3mWT7+FSNwD/T1ElUQ8/mfJptawZl1KyACNovtE1xMn365yHsFA2Mgg=
=fnCa
-----END PGP SIGNATURE-----
Merge tag 'trace-tools-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing tools updates from Steven Rostedt:
"Trivial updates for 6.11:
- Use pretty formatting only on interactive tty in rtla/osnoise
- Better reporting when histogram is empty in rtla/osnoise
- Use the correct library name for "libtracefs" in feature detection"
* tag 'trace-tools-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tools: build: use correct lib name for libtracefs feature detection
rtla/osnoise: Better report when histogram is empty
rtla/osnoise: Use pretty formatting only on interactive tty
Up until now, the function graph tracer could only have a single user
attached to it. If another user tried to attach to the function graph
tracer while one was already attached, it would fail. Allowing function
graph tracer to have more than one user has been asked for since 2009, but
it required a rewrite to the logic to pull it off so it never happened.
Until now!
There's three systems that trace the return of a function. That is
kretprobes, function graph tracer, and BPF. kretprobes and function graph
tracing both do it similarly. The difference is that kretprobes uses a
shadow stack per callback and function graph tracer creates a shadow stack
for all tasks. The function graph tracer method makes it possible to trace
the return of all functions. As kretprobes now needs that feature too,
allowing it to use function graph tracer was needed. BPF also wants to
trace the return of many probes and its method doesn't scale either.
Having it use function graph tracer would improve that.
By allowing function graph tracer to have multiple users allows both
kretprobes and BPF to use function graph tracer in these cases. This will
allow kretprobes code to be removed in the future as it's version will no
longer be needed. Note, function graph tracer is only limited to 16
simultaneous users, due to shadow stack size and allocated slots.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZpbWlxQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qgtvAP9jxmgEiEhz4Bpe1vRKVSMYK6ozXHTT
7MFKRMeQqQ8zeAEA2sD5Zrt9l7zKzg0DFpaDLgc3/yh14afIDxzTlIvkmQ8=
=umuf
-----END PGP SIGNATURE-----
Merge tag 'ftrace-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull ftrace updates from Steven Rostedt:
"Rewrite of function graph tracer to allow multiple users
Up until now, the function graph tracer could only have a single user
attached to it. If another user tried to attach to the function graph
tracer while one was already attached, it would fail. Allowing
function graph tracer to have more than one user has been asked for
since 2009, but it required a rewrite to the logic to pull it off so
it never happened. Until now!
There's three systems that trace the return of a function. That is
kretprobes, function graph tracer, and BPF. kretprobes and function
graph tracing both do it similarly. The difference is that kretprobes
uses a shadow stack per callback and function graph tracer creates a
shadow stack for all tasks. The function graph tracer method makes it
possible to trace the return of all functions. As kretprobes now needs
that feature too, allowing it to use function graph tracer was needed.
BPF also wants to trace the return of many probes and its method
doesn't scale either. Having it use function graph tracer would
improve that.
By allowing function graph tracer to have multiple users allows both
kretprobes and BPF to use function graph tracer in these cases. This
will allow kretprobes code to be removed in the future as it's version
will no longer be needed.
Note, function graph tracer is only limited to 16 simultaneous users,
due to shadow stack size and allocated slots"
* tag 'ftrace-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (49 commits)
fgraph: Use str_plural() in test_graph_storage_single()
function_graph: Add READ_ONCE() when accessing fgraph_array[]
ftrace: Add missing kerneldoc parameters to unregister_ftrace_direct()
function_graph: Everyone uses HAVE_FUNCTION_GRAPH_RET_ADDR_PTR, remove it
function_graph: Fix up ftrace_graph_ret_addr()
function_graph: Make fgraph_update_pid_func() a stub for !DYNAMIC_FTRACE
function_graph: Rename BYTE_NUMBER to CHAR_NUMBER in selftests
fgraph: Remove some unused functions
ftrace: Hide one more entry in stack trace when ftrace_pid is enabled
function_graph: Do not update pid func if CONFIG_DYNAMIC_FTRACE not enabled
function_graph: Make fgraph_do_direct static key static
ftrace: Fix prototypes for ftrace_startup/shutdown_subops()
ftrace: Assign RCU list variable with rcu_assign_ptr()
ftrace: Assign ftrace_list_end to ftrace_ops_list type cast to RCU
ftrace: Declare function_trace_op in header to quiet sparse warning
ftrace: Add comments to ftrace_hash_move() and friends
ftrace: Convert "inc" parameter to bool in ftrace_hash_rec_update_modify()
ftrace: Add comments to ftrace_hash_rec_disable/enable()
ftrace: Remove "filter_hash" parameter from __ftrace_hash_rec_update()
ftrace: Rename dup_hash() and comment it
...
Uprobes:
- x86/shstk: Make return uprobe work with shadow stack.
- Add uretprobe syscall which speeds up the uretprobe 10-30% faster. This
syscall is automatically used from user-space trampolines which are
generated by the uretprobe. If this syscall is used by normal
user program, it will cause SIGILL. Note that this is currently only
implemented on x86_64.
(This also has 2 fixes for adjusting the syscall number to avoid conflict
with new *attrat syscalls.)
- uprobes/perf: fix user stack traces in the presence of pending uretprobe.
This corrects the uretprobe's trampoline address in the stacktrace with
correct return address.
- selftests/x86: Add a return uprobe with shadow stack test.
- selftests/bpf: Add uretprobe syscall related tests.
. test case for register integrity check.
. test case with register changing case.
. test case for uretprobe syscall without uprobes (expected to be failed).
. test case for uretprobe with shadow stack.
- selftests/bpf: add test validating uprobe/uretprobe stack traces
- MAINTAINERS: Add uprobes entry. This does not specify the tree but to
clarify who maintains and reviews the uprobes.
Kprobes:
- tracing/kprobes: Test case cleanups. Replace redundant WARN_ON_ONCE() +
pr_warn() with WARN_ONCE() and remove unnecessary code from selftest.
- tracing/kprobes: Add symbol counting check when module loads. This
checks the uniqueness of the probed symbol on modules. The same check
has already done for kernel symbols.
(This also has a fix for build error with CONFIG_MODULES=n)
Cleanup:
- Add MODULE_DESCRIPTION() macros for fprobe and kprobe examples.
-----BEGIN PGP SIGNATURE-----
iQFPBAABCgA5FiEEh7BulGwFlgAOi5DV2/sHvwUrPxsFAmaWYxwbHG1hc2FtaS5o
aXJhbWF0c3VAZ21haWwuY29tAAoJENv7B78FKz8bsUgH/3JcSzDZujQWCZ1f4fJn
QecvTFSYcCl6ck8+/3wm4EsgeCXIFOyPnoPc7k2Gm+l6Dlk1DKGV6wV4tuKFUq9X
9mplcwoVA0Ln+EX9zv9v4s99yUGxcU9xjgC9XT7J52SvqYncPIi6dR0Z9wlJBmyd
Bx3cZk+wSzCYaoqYngI2fKlzsEcYgDIP999fQPRi0HGzNZujc4xeJyjCTC/48yWO
9kreRQq6wFdgRQTwMcR/fKPDKIGZQCU8jkXv5crVV5K3rNaBcwBmCJJMP8PzPU0V
UQ0+8RZK+Qk8SBwXcMNVRqm/efTderob4IYxP8OBe5wjAIE7+vu8r6sqwxRIS54M
Cyg=
=DRSr
-----END PGP SIGNATURE-----
Merge tag 'probes-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes updates from Masami Hiramatsu:
"Uprobes:
- x86/shstk: Make return uprobe work with shadow stack
- Add uretprobe syscall which speeds up the uretprobe 10-30% faster.
This syscall is automatically used from user-space trampolines
which are generated by the uretprobe. If this syscall is used by
normal user program, it will cause SIGILL. Note that this is
currently only implemented on x86_64.
(This also has two fixes for adjusting the syscall number to avoid
conflict with new *attrat syscalls.)
- uprobes/perf: fix user stack traces in the presence of pending
uretprobe. This corrects the uretprobe's trampoline address in the
stacktrace with correct return address
- selftests/x86: Add a return uprobe with shadow stack test
- selftests/bpf: Add uretprobe syscall related tests.
- test case for register integrity check
- test case with register changing case
- test case for uretprobe syscall without uprobes (expected to fail)
- test case for uretprobe with shadow stack
- selftests/bpf: add test validating uprobe/uretprobe stack traces
- MAINTAINERS: Add uprobes entry. This does not specify the tree but
to clarify who maintains and reviews the uprobes
Kprobes:
- tracing/kprobes: Test case cleanups.
Replace redundant WARN_ON_ONCE() + pr_warn() with WARN_ONCE() and
remove unnecessary code from selftest
- tracing/kprobes: Add symbol counting check when module loads.
This checks the uniqueness of the probed symbol on modules. The
same check has already done for kernel symbols
(This also has a fix for build error with CONFIG_MODULES=n)
Cleanup:
- Add MODULE_DESCRIPTION() macros for fprobe and kprobe examples"
* tag 'probes-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
MAINTAINERS: Add uprobes entry
selftests/bpf: Change uretprobe syscall number in uprobe_syscall test
uprobe: Change uretprobe syscall scope and number
tracing/kprobes: Fix build error when find_module() is not available
tracing/kprobes: Add symbol counting check when module loads
selftests/bpf: add test validating uprobe/uretprobe stack traces
perf,uprobes: fix user stack traces in the presence of pending uretprobes
tracing/kprobe: Remove cleanup code unrelated to selftest
tracing/kprobe: Integrate test warnings into WARN_ONCE
selftests/bpf: Add uretprobe shadow stack test
selftests/bpf: Add uretprobe syscall call from user space test
selftests/bpf: Add uretprobe syscall test for regs changes
selftests/bpf: Add uretprobe syscall test for regs integrity
selftests/x86: Add return uprobe shadow stack test
uprobe: Add uretprobe syscall to speed up return probe
uprobe: Wire up uretprobe system call
x86/shstk: Make return uprobe work with shadow stack
samples: kprobes: add missing MODULE_DESCRIPTION() macros
fprobe: add missing MODULE_DESCRIPTION() macro
Currently for the PCM and mixer tests we report test names which identify
the card being tested with the card number. This ensures we have unique
names but since card numbers are dynamically assigned at runtime the names
we end up with will often not be stable on systems with multiple cards
especially where those cards are provided by separate modules loeaded at
runtime. This makes it difficult for automated systems and UIs to relate
test results between runs on affected platforms.
Address this by replacing our use of card numbers with card names which are
more likely to be stable across runs. We use the card ID since it is
guaranteed to be unique by default, unlike the long name. There is still
some vulnerability to ordering issues if multiple cards with the same base
ID are present in the system but have separate dependencies but not all
drivers put distinguishing information in their long names.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://patch.msgid.link/20240716-alsa-kselftest-board-name-v2-1-60f1acdde096@kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEN9lkrMBJgcdVAPub1V2XiooUIOQFAmaYOsUACgkQ1V2XiooU
IOSZfhAAiylt0bXfNLh2dGFpigNqejO0y7uJG/lNALA6kbhqkfdaXyiEGI+ZyIUM
qtkpk0BUDEhCtpspOVLNtDYcZkKuH7zzt6MsA9BcNm/5AS4XPB56buPsZ94ic3Uk
LZB8X/+q+gT+HC3RF4hf92rfJP6WKvt3mO8GGLstg0r6orGX6DywfwD+jCAIS/HS
CgDot0SXS0MQZpTUoMugRpGF8wWBANLjUvLYNjIQhV5vLpRvX/Fal+St1+AsFnzq
hdtMDOmqJHy0GNwrOdWX5muK+3X5Fg1rHUIgA1SqALd52lpsv62U9JHN6cvV8nqG
h24u3v17rtHg82zbKz4pBVHwurDyIT3TM3Pmp8v+b8ZnYmWsyXLr3lqv3GU5Rx14
HWsEi+FL/ND2mfWF+GOn4JqkWy/97eZ78VkekuuWXfzuNAXedp7OUw7RgfhCl5Pr
DZPus0rrnl3nCOeQyXGpo0RacFDOhKESa0EBjHS0+S8nv4vpzS+5XbiZctM0FlSu
QhwZM2Q+M6Klr0LWkLYyS99s5Cra3z5z/Oud7xtnlBOp81r0O9YVBYopoV6MM0kV
b6IHvOvs+5W+dj8RmH6Chq9D+EmFuYGwPoa5uv6zdSn/2ZeXqKeGq/HnlXuqct37
RCbuV8S1IzJi5EgtwbU0uxBclKkiyCtkhqSNrdjSHxLwf7wgI2s=
=I/g9
-----END PGP SIGNATURE-----
Merge tag 'nf-24-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter/IPVS fixes for net
The following patchset contains Netfilter/IPVS fixes for net:
1) Call nf_expect_get_id() to delete expectation by ID. By trial and
error it is possible to leak the LSB of the expectation address on
x86_64. This bug is a leftover when converting the existing code
to use nf_expect_get_id().
2) Incorrect initialization in pipapo set backend leads to packet
mismatches. From Florian Westphal.
3) Extend netfilter's selftests to cover for the pipapo set backend,
also from Florian.
4) Fix sparse warning in IPVS when adding service, from Chen Hanxiao.
netfilter pull request 24-07-17
* tag 'nf-24-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
ipvs: properly dereference pe in ip_vs_add_service
selftests: netfilter: add test case for recent mismatch bug
netfilter: nf_set_pipapo: fix initial map fill
netfilter: ctnetlink: use helper function to calculate expect ID
====================
Link: https://patch.msgid.link/20240717215214.225394-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The TOS value that is returned to user space in the route get reply is
the one with which the lookup was performed ('fl4->flowi4_tos'). This is
fine when the matched route is configured with a TOS as it would not
match if its TOS value did not match the one with which the lookup was
performed.
However, matching on TOS is only performed when the route's TOS is not
zero. It is therefore possible to have the kernel incorrectly return a
non-zero TOS:
# ip link add name dummy1 up type dummy
# ip address add 192.0.2.1/24 dev dummy1
# ip route get fibmatch 192.0.2.2 tos 0xfc
192.0.2.0/24 tos 0x1c dev dummy1 proto kernel scope link src 192.0.2.1
Fix by instead returning the DSCP field from the FIB result structure
which was populated during the route lookup.
Output after the patch:
# ip link add name dummy1 up type dummy
# ip address add 192.0.2.1/24 dev dummy1
# ip route get fibmatch 192.0.2.2 tos 0xfc
192.0.2.0/24 dev dummy1 proto kernel scope link src 192.0.2.1
Extend the existing selftests to not only verify that the correct route
is returned, but that it is also returned with correct "tos" value (or
without it).
Fixes: b61798130f ("net: ipv4: RTM_GETROUTE: return matched fib result when requested")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEoEVH9lhNrxiMPSyI7MXwXhnZSjYFAmaWVnITHGJlbnRpc3NA
a2VybmVsLm9yZwAKCRDsxfBeGdlKNvC2D/0ZkIRcJn8OU3j8vbSE2D10hy3tyDZa
3P5rI2UrlE6NPlJUo755VBEaLe608481TNZlhIKQ6LFzmUdlj3C7bKiCOQ6KLOyT
ZoCeRS3cVgNfSEnF5N6SwfuVW3PgXo6GC6pueNcNepLIVnWGJ5QhLmiLOzPr0YER
mW/y3s447TxecQ803UYtaFQnwSOhxzWvN+G7mnzkz2PNpta3UJ68jsqxQOivrSV0
mEx4W5VN6MYaSVZ2c5s+LIcn48+LMGNwkRAdMkFUAksLDNwvSIgdgRcjaJhpVIPK
MfYrQ9QAXezFxzUxbEoCI5PXOA44MODhT3095fyq+Uf3r2OB/gGKE8p3f2jv24nv
RR/TR5S4y8FD+bWh12/BL8j4bv0weXFFUjwJwZmmpXnL3ev0oN92TaRrKRPuNO4Y
GDmRV5qwUZrL2+e7j0bpXFGxulsxc+1JYxb8UY03BHIB2M8LnUTpsfpcxtSi1MYW
N1U//fObXBfRl1CcdDPbT2cTJD9jwuozJm5l1p/BHOHu3cwhJTStH1XzsnKQXL9g
O5izXWqwCgNmbG8egGR3ddV53ZGi1MsD7tGcc5GGcYnevdBi4l+Q4Zl+oFxjfHvs
MKWMKygdaHUBqmYfOGgspA+S2zY38smbul8ZQUxP0yOl3+MyqCCedYQHi/w8MU+L
k2w+NzXIWXBifA==
=hVVT
-----END PGP SIGNATURE-----
Merge tag 'for-linus-2024071601' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID updates from Benjamin Tissoires:
- rewrite of the HID-BPF internal implementation to use bpf struct_ops
instead of a tracing endpoint (Benjamin Tissoires)
- add two new HID-BPF hooks to be able to intercept userspace calls
targeting a HID device and filtering them (Benjamin Tissoires)
- add support for various new devices through HID-BPF filters (Benjamin
Tissoires)
- add support for the magic keyboard backlight (Orlando Chamberlain)
- add the missing MODULE_DESCRIPTION() macros in HID drivers (Jeff
Johnson)
- use of kvzalloc in case memory gets too fragmented (Hailong Liu)
- retrieve the device firmware node in the child HID device (Danny
Kaehn)
- some hid-uclogic improvements (José Expósito)
- some more typos, trivial fixes, kernel doctext and unused functions
cleanups
* tag 'for-linus-2024071601' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: (60 commits)
HID: hid-steam: Fix typo in goto label
HID: mcp2221: Remove unnecessary semicolon
HID: Fix spelling mistakes "Kensigton" -> "Kensington"
HID: add more missing MODULE_DESCRIPTION() macros
HID: samples: fix the 2 struct_ops definitions
HID: fix for amples in for-6.11/bpf
HID: apple: Add support for magic keyboard backlight on T2 Macs
HID: bpf: Thrustmaster TCA Yoke Boeing joystick fix
HID: bpf: Add Huion Dial 2 bpf fixup
HID: bpf: Add support for the XP-PEN Deco Mini 4
HID: bpf: move the BIT() macro to hid_bpf_helpers.h
HID: bpf: add a driver for the Huion Inspiroy 2S (H641P)
HID: bpf: Add a HID report composition helper macros
HID: bpf: doc fixes for hid_hw_request() hooks
HID: bpf: doc fixes for hid_hw_request() hooks
HID: bpf: fix gcc warning and unify __u64 into u64
selftests/hid: ensure CKI can compile our new tests on old kernels
selftests/hid: add an infinite loop test for hid_bpf_try_input_report
selftests/hid: add another test for injecting an event from an event hook
HID: bpf: allow hid_device_event hooks to inject input reports on self
...
Highlights:
- amd/pmf: Report system state changes using existing input
events
- asus-wmi: Zenbook 2023 camera LED disable support and fix
TUF laptop keyboard RGB LED sysfs interface
- dell-pc: Fan modes / platform profile support
- hp-wmi: Fix platform profile switching on Omen/Victus
laptops
- intel/ISST: Use only TPMI interface when TPMI and legacy
interfaces are available
- intel/pmc: LTR restore support to pair with LTR ignore
- intel/tpmi: Performance Limit Reasons (PLR) and APIC <-> Punit
CPU numbering mapping support
- WMI: driver override support and docs improvements
- lenovo-yoga-c630: Support for EC (platform/arm64)
- platform/arm64: Fix build with COMPILE_TEST (broke after addition
of C630)
- tools: Intel Speed Select Turbo Ratio Limit fix
- Miscellaneous cleanups / refactoring / improvements
The following is an automated shortlog grouped by driver:
amd/pmf:
- Remove update system state document
- Use existing input event codes to update system states
- Use memdup_user()
arm64:
- add Lenovo Yoga C630 WOS EC driver
- build drivers even on non-ARM64 platforms
- EC_ACER_ASPIRE1 should depend on ARCH_QCOM
- EC_LENOVO_YOGA_C630 should depend on ARCH_QCOM
arm64: lenovo-yoga-c630:
- select AUXILIARY_BUS
asus-tf103c-dock:
- Use 2-argument strscpy()
asus-wmi:
- fix TUF laptop RGB variant
- support the disable camera LED on F10 of Zenbook 2023
dell-pc:
- avoid double free and invalid unregistration
- Implement platform_profile
dell-smbios:
- Add helper for checking supported class
- Move request functions for reuse
Docs/admin-guide:
- Remove pmf leftover reference from the index
doc: TPMI:
- Add entry for Performance Limit Reasons
dt-bindings: platform:
- Add Lenovo Yoga C630 EC
hp: hp-bioscfg:
- Use 2-argument strscpy()
hp-wmi:
- Fix implementation of the platform_profile_omen_get function
- Fix platform profile option switch bug on Omen and Victus laptops
ideapad-laptop:
- use cleanup.h
intel: chtwc_int33fe:
- Use 2-argument strscpy()
intel/ifs:
- Switch to new Intel CPU model defines
intel_ips:
- Switch to new Intel CPU model defines
intel/pmc:
- Add support to show ltr_ignore value
- Add support to undo ltr_ignore
- Convert index variables to be unsigned
- Move pmc assignment closer to first usage
- Remove unneeded min_t check
- Simplify mutex usage with cleanup helpers
- Switch to new Intel CPU model defines
- Use DEFINE_SHOW_STORE_ATTRIBUTE macro
- Use the Elvis operator
- Use the return value of pmc_core_send_msg
intel_scu_wdt:
- Switch to new Intel CPU model defines
intel_speed_select_if:
- Switch to new Intel CPU model defines
intel_telemetry:
- Switch to new Intel CPU model defines
intel/tpmi:
- Add API to get debugfs root
- Add new auxiliary driver for performance limits
- Add support for performance limit reasons
intel:
- TPMI domain id and CPU mapping
intel/tpmi/plr:
- Add support for the plr mailbox
- Fix output in plr_print_bits()
intel_turbo_max_3:
- Switch to new Intel CPU model defines
intel-uncore-freq:
- Get rid of magic min_max argument
- Get rid of magic values
- Get rid of uncore_read_freq driver API
- Re-arrange bit masks
- Rename the sysfs helper macro names
- Switch to new Intel CPU model defines
- Use generic helpers for current frequency
- Use uncore_index with read_control_freq
ISST:
- Add model specific loading for common module
- Avoid some SkyLake server models
- Use only TPMI interface when present
p2sb:
- Switch to new Intel CPU model defines
serial-multi-instantiate:
- Use 2-argument strscpy()
think-lmi:
- Use 2-argument strscpy()
thinkpad_acpi:
- Use 2-argument strscpy()
tools/power/x86/intel-speed-select:
- Set TRL MSR in 100 MHz units
- v1.20 release
wmi:
- Add bus ABI documentation
- Add driver_override support
x86/platform/atom:
- Switch to new Intel CPU model defines
Merges:
- Merge branch 'pdx86/platform-drivers-x86-lenovo-c630' into review-ilpo
- Merge branch 'pdx86/platform-drivers-x86-lenovo-c630' into review-ilpo
- Merge branch 'pdx86/platform-drivers-x86-lenovo-c630' into review-ilpo
- Merge remote-tracking branch 'intel-speed-select/intel-sst' into review-ilpo
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQSCSUwRdwTNL2MhaBlZrE9hU+XOMQUCZpZIdQAKCRBZrE9hU+XO
MbIEAQCMVjDuOJSSuS2u7/iVb41Q3+kjP6X0CmSpf8dmt3rH0gD/Z9Qynw6ArRY4
PPHY25ur8kPtwtyxHfCMcar6ESpztwU=
=L2LD
-----END PGP SIGNATURE-----
Merge tag 'platform-drivers-x86-v6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver updates from Ilpo Järvinen:
- amd/pmf: Report system state changes using existing input events
- asus-wmi: Zenbook 2023 camera LED disable support and fix TUF laptop
keyboard RGB LED sysfs interface
- dell-pc: Fan modes / platform profile support
- hp-wmi: Fix platform profile switching on Omen/Victus laptops
- intel/ISST: Use only TPMI interface when TPMI and legacy interfaces
are available
- intel/pmc: LTR restore support to pair with LTR ignore
- intel/tpmi: Performance Limit Reasons (PLR) and APIC <-> Punit CPU
numbering mapping support
- WMI: driver override support and docs improvements
- lenovo-yoga-c630: Support for EC (platform/arm64)
- platform/arm64: Fix build with COMPILE_TEST (broke after addition of
C630)
- tools: Intel Speed Select Turbo Ratio Limit fix
- Miscellaneous cleanups / refactoring / improvements
* tag 'platform-drivers-x86-v6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (65 commits)
platform/x86: asus-wmi: fix TUF laptop RGB variant
platform/x86/intel/tpmi/plr: Fix output in plr_print_bits()
Docs/admin-guide: Remove pmf leftover reference from the index
platform/x86: ideapad-laptop: use cleanup.h
platform/x86: hp-wmi: Fix implementation of the platform_profile_omen_get function
platform: arm64: EC_LENOVO_YOGA_C630 should depend on ARCH_QCOM
platform: arm64: EC_ACER_ASPIRE1 should depend on ARCH_QCOM
platform/x86/amd/pmf: Remove update system state document
platform/x86/amd/pmf: Use existing input event codes to update system states
platform/x86: hp-wmi: Fix platform profile option switch bug on Omen and Victus laptops
platform/x86:intel/pmc: Add support to undo ltr_ignore
platform/x86:intel/pmc: Use the Elvis operator
platform/x86:intel/pmc: Use DEFINE_SHOW_STORE_ATTRIBUTE macro
platform/x86:intel/pmc: Remove unneeded min_t check
platform/x86:intel/pmc: Add support to show ltr_ignore value
platform/x86:intel/pmc: Move pmc assignment closer to first usage
platform/x86:intel/pmc: Convert index variables to be unsigned
platform/x86:intel/pmc: Simplify mutex usage with cleanup helpers
platform/x86:intel/pmc: Use the return value of pmc_core_send_msg
tools/power/x86/intel-speed-select: v1.20 release
...
Verify that out-of-band packets are silently dropped before they reach the
redirection logic.
The idea is to test with a 2 byte long send(). Should a MSG_OOB flag be in
use, only the last byte will be treated as out-of-band. Test fails if
verd_mapfd indicates a wrong number of packets processed (e.g. if OOB
wasn't dropped at the source) or if it was possible to recv() MSG_OOB from
the mapped socket, or if any stale OOB data have been left reachable from
the unmapped socket.
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20240713200218.2140950-5-mhal@rbox.co
Extend pairs_redir_to_connected() and unix_inet_redir_to_connected() with a
send_flags parameter. Replace write() with send() allowing packets to be
sent as MSG_OOB.
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20240713200218.2140950-4-mhal@rbox.co
Function ignores the AF_UNIX socket type argument, SOCK_DGRAM is hardcoded.
Fix to respect the argument provided.
Fixes: 75e0e27db6 ("selftest/bpf: Change udp to inet in some function names")
Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/20240713200218.2140950-3-mhal@rbox.co
The usage help for "bpftool prog help" contains a ° instead of the _
symbol for cgroup/sendmsg_unix. Fix the typo.
Fixes: 8b3cba987e ("bpftool: Add support for cgroup unix socket address hooks")
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Quentin Monnet <qmo@kernel.org>
Link: https://lore.kernel.org/bpf/20240717134508.77488-1-donald.hunter@gmail.com
For all these years libbpf's BTF dumper has been emitting not strictly
valid syntax for function prototypes that have no input arguments.
Instead of `int (*blah)()` we should emit `int (*blah)(void)`.
This is not normally a problem, but it manifests when we get kfuncs in
vmlinux.h that have no input arguments. Due to compiler internal
specifics, we get no BTF information for such kfuncs, if they are not
declared with proper `(void)`.
The fix is trivial. We also need to adjust a few ancient tests that
happily assumed `()` is correct.
Fixes: 351131b51c ("libbpf: add btf_dump API for BTF-to-C conversion")
Reported-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://lore.kernel.org/bpf/20240712224442.282823-1-andrii@kernel.org
Now that symsrc_filename is always accessed through an accessor, we also
need a free() function for it to avoid the following compilation error:
util/unwind-libunwind-local.c:416:12: error: lvalue required as unary
‘&’ operand
416 | zfree(&dso__symsrc_filename(dso));
Fixes: 1553419c3c ("perf dso: Fix address sanitizer build")
Signed-off-by: James Clark <james.clark@linaro.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Leo Yan <leo.yan@arm.com>
Tested-by: Florian Fainelli <florian.fainelli@broadcom.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20240715094715.3914813-1-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
This allows to build against libtraceevent and libtracefs installed
in non-standard locations.
Signed-off-by: Guilherme Amadio <amadio@gentoo.org>
Tested-by: Thorsten Leemhuis <linux@leemhuis.info>
Tested-by: Leo Yan <leo.yan@arm.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20240717174739.186988-6-amadio@gentoo.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
This allows to build against libtraceevent and libtracefs installed
in non-standard locations.
Signed-off-by: Guilherme Amadio <amadio@gentoo.org>
Tested-by: Thorsten Leemhuis <linux@leemhuis.info>
Tested-by: Leo Yan <leo.yan@arm.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20240712194511.3973899-5-amadio@gentoo.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
This allows to build against libtraceevent and libtracefs installed
in non-standard locations.
Signed-off-by: Guilherme Amadio <amadio@gentoo.org>
Tested-by: Thorsten Leemhuis <linux@leemhuis.info>
Tested-by: Leo Yan <leo.yan@arm.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20240712194511.3973899-4-amadio@gentoo.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Other tools, in tools/verification and tools/tracing, make use of
libtraceevent and libtracefs as dependencies. This allows setting
up the feature check flags for them as well.
Signed-off-by: Guilherme Amadio <amadio@gentoo.org>
Tested-by: Thorsten Leemhuis <linux@leemhuis.info>
Tested-by: Leo Yan <leo.yan@arm.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: linux-trace-devel@vger.kernel.org
Link: https://lore.kernel.org/r/20240717174739.186988-3-amadio@gentoo.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
On ARM64 the stack pointer should be aligned at a 16 byte boundary or
the SPAlignmentFault can occur. The fexit_sleep selftest allocates the
stack for the child process as a character array, this is not guaranteed
to be aligned at 16 bytes.
Because of the SPAlignmentFault, the child process is killed before it
can do the nanosleep call and hence fentry_cnt remains as 0. This causes
the main thread to hang on the following line:
while (READ_ONCE(fexit_skel->bss->fentry_cnt) != 2);
Fix this by allocating the stack using mmap() as described in the
example in the man page of clone().
Remove the fexit_sleep test from the DENYLIST of arm64.
Fixes: eddbe8e652 ("selftest/bpf: Add a test to check trampoline freeing logic.")
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240715173327.8657-1-puranjay@kernel.org
Without 'netfilter: nf_set_pipapo: fix initial map fill' this fails:
TEST: reported issues
Add two elements, flush, re-add 1s [ OK ]
net,mac with reload 1s [ OK ]
net,port,proto 1s [FAIL]
post-add: should have returned 10.5.8.0/24 . 51-60 . 6-17 but got table inet filter {
set test {
type ipv4_addr . inet_service . inet_proto
flags interval,timeout
elements = { 10.5.7.0/24 . 51-60 . 6-17 }
}
}
The other sets defined in the selftest do not trigger this bug, it only
occurs if the first field group bitsize is smaller than the largest
group bitsize.
For each added element, check 'get' works and actually returns the
requested range.
After map has been filled, check all added ranges can still be
retrieved.
For each deleted element, check that 'get' fails.
Based on a reproducer script from Yi Chen.
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
patchsets (devmem among them) did not make it in time.
Core & protocols
----------------
- Use local_lock in addition to local_bh_disable() to protect per-CPU
resources in networking, a step closer for local_bh_disable() not
to act as a big lock on PREEMPT_RT.
- Use flex array for netdevice priv area, ensure its cache alignment.
- Add a sysctl knob to allow user to specify a default rto_min at socket
init time. Bit of a big hammer but multiple companies were
independently carrying such patch downstream so clearly it's useful.
- Support scheduling transmission of packets based on CLOCK_TAI.
- Un-pin TCP TIMEWAIT timer to avoid it firing on CPUs later cordoned off
using cpusets.
- Support multiple L2TPv3 UDP tunnels using the same 5-tuple address.
- Allow configuration of multipath hash seed, to both allow synchronizing
hashing of two routers, and preventing partial accidental sync.
- Improve TCP compliance with RFC 9293 for simultaneous connect().
- Support sending NAT keepalives in IPsec ESP in UDP states. Userspace
IKE daemon had to do this before, but the kernel can better keep
track of it.
- Support sending supervision HSR frames with MAC addresses stored in
ProxyNodeTable when RedBox (i.e. HSR-SAN) is enabled.
- Introduce IPPROTO_SMC for selecting SMC when socket is created.
- Allow UDP GSO transmit from devices with no checksum offload.
- openvswitch: add packet sampling via psample, separating the sampled
traffic from "upcall" packets sent to user space for forwarding.
- nf_tables: shrink memory consumption for transaction objects.
Things we sprinkled into general kernel code
--------------------------------------------
- Power Sequencing subsystem (used by Qualcomm Bluetooth driver
for QCA6390).
- Add IRQ information in sysfs for auxiliary bus.
- Introduce guard definition for local_lock.
- Add aligned flavor of __cacheline_group_{begin, end}() markings for
grouping fields in structures.
BPF
---
- Notify user space (via epoll) when a struct_ops object is getting
detached/unregistered.
- Add new kfuncs for a generic, open-coded bits iterator.
- Enable BPF programs to declare arrays of kptr, bpf_rb_root, and
bpf_list_head.
- Support resilient split BTF which cuts down on duplication and makes
BTF as compact as possible WRT BTF from modules.
- Add support for dumping kfunc prototypes from BTF which enables both
detecting as well as dumping compilable prototypes for kfuncs.
- riscv64 BPF JIT improvements in particular to add 12-argument support
for BPF trampolines and to utilize bpf_prog_pack for the latter.
- Add the capability to offload the netfilter flowtable in XDP layer
through kfuncs.
Driver API
----------
- Allow users to configure IRQ tresholds between which automatic IRQ
moderation can choose.
- Expand Power Sourcing (PoE) status with power, class and failure
reason. Support setting power limits.
- Track additional RSS contexts in the core, make sure configuration
changes don't break them.
- Support IPsec crypto offload for IPv6 ESP and IPv4 UDP-encapsulated ESP
data paths.
- Support updating firmware on SFP modules.
Tests and tooling
-----------------
- mptcp: use net/lib.sh to manage netns.
- TCP-AO and TCP-MD5: replace debug prints used by tests with
tracepoints.
- openvswitch: make test self-contained (don't depend on OvS CLI tools).
Drivers
-------
- Ethernet high-speed NICs:
- Broadcom (bnxt):
- increase the max total outstanding PTP TX packets to 4
- add timestamping statistics support
- implement netdev_queue_mgmt_ops
- support new RSS context API
- Intel (100G, ice, idpf):
- implement FEC statistics and dumping signal quality indicators
- support E825C products (with 56Gbps PHYs)
- nVidia/Mellanox:
- support HW-GRO
- mlx4/mlx5: support per-queue statistics via netlink
- obey the max number of EQs setting in sub-functions
- AMD/Solarflare:
- support new RSS context API
- AMD/Pensando:
- ionic: rework fix for doorbell miss to lower overhead
and skip it on new HW
- Wangxun:
- txgbe: support Flow Director perfect filters
- Ethernet NICs consumer, embedded and virtual:
- Add driver for Tehuti Networks TN40xx chips
- Add driver for Meta's internal NIC chips
- Add driver for Ethernet MAC on Airoha EN7581 SoCs
- Add driver for Renesas Ethernet-TSN devices
- Google cloud vNIC:
- flow steering support
- Microsoft vNIC:
- support page sizes other than 4KB on ARM64
- vmware vNIC:
- support latency measurement (update to version 9)
- VirtIO net:
- support for Byte Queue Limits
- support configuring thresholds for automatic IRQ moderation
- support for AF_XDP Rx zero-copy
- Synopsys (stmmac):
- support for STM32MP13 SoC
- let platforms select the right PCS implementation
- TI:
- icssg-prueth: add multicast filtering support
- icssg-prueth: enable PTP timestamping and PPS
- Renesas:
- ravb: improve Rx performance 30-400% by using page pool,
theaded NAPI and timer-based IRQ coalescing
- ravb: add MII support for R-Car V4M
- Cadence (macb):
- macb: add ARP support to Wake-On-LAN
- Cortina:
- use phylib for RX and TX pause configuration
- Ethernet switches:
- nVidia/Mellanox:
- support configuration of multipath hash seed
- report more accurate max MTU
- use page_pool to improve Rx performance
- MediaTek:
- mt7530: add support for bridge port isolation
- Qualcomm:
- qca8k: add support for bridge port isolation
- Microchip:
- lan9371/2: add 100BaseTX PHY support
- NXP:
- vsc73xx: implement VLAN operations
- Ethernet PHYs:
- aquantia: enable support for aqr115c
- aquantia: add support for PHY LEDs
- realtek: add support for rtl8224 2.5Gbps PHY
- xpcs: add memory-mapped device support
- add BroadR-Reach link mode and support in Broadcom's PHY driver
- CAN:
- add document for ISO 15765-2 protocol support
- mcp251xfd: workaround for erratum DS80000789E, use timestamps
to catch when device returns incorrect FIFO status
- WiFi:
- mac80211/cfg80211:
- parse Transmit Power Envelope (TPE) data in mac80211 instead of
in drivers
- improvements for 6 GHz regulatory flexibility
- multi-link improvements
- support multiple radios per wiphy
- remove DEAUTH_NEED_MGD_TX_PREP flag
- Intel (iwlwifi):
- bump FW API to 91 for BZ/SC devices
- report 64-bit radiotap timestamp
- enable P2P low latency by default
- handle Transmit Power Envelope (TPE) advertised by AP
- remove support for older FW for new devices
- fast resume (keeping the device configured)
- mvm: re-enable Multi-Link Operation (MLO)
- aggregation (A-MSDU) optimizations
- MediaTek (mt76):
- mt7925 Multi-Link Operation (MLO) support
- Qualcomm (ath10k):
- LED support for various chipsets
- Qualcomm (ath12k):
- remove unsupported Tx monitor handling
- support channel 2 in 6 GHz band
- support Spatial Multiplexing Power Save (SMPS) in 6 GHz band
- supprt multiple BSSID (MBSSID) and Enhanced Multi-BSSID
Advertisements (EMA)
- support dynamic VLAN
- add panic handler for resetting the firmware state
- DebugFS support for datapath statistics
- WCN7850: support for Wake on WLAN
- Microchip (wilc1000):
- read MAC address during probe to make it visible to user space
- suspend/resume improvements
- TI (wl18xx):
- support newer firmware versions
- RealTek (rtw89):
- preparation for RTL8852BE-VT support
- Wake on WLAN support for WiFi 6 chips
- 36-bit PCI DMA support
- RealTek (rtlwifi):
- RTL8192DU support
- Broadcom (brcmfmac):
- Management Frame Protection support (to enable WPA3)
- Bluetooth:
- qualcomm: use the power sequencer for QCA6390
- btusb: mediatek: add ISO data transmission functions
- hci_bcm4377: add BCM4388 support
- btintel: add support for BlazarU core
- btintel: add support for Whale Peak2
- btnxpuart: add support for AW693 A1 chipset
- btnxpuart: add support for IW615 chipset
- btusb: add Realtek RTL8852BE support ID 0x13d3:0x3591
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmaWjBwACgkQMUZtbf5S
IrvuSRAAkJuEzTRqgURBCe4eNEQde6mJJig7l2CKHwCbFiHZpRkFHf8qKbcGWbL6
uLW33SWnKtJVDhxVKWHLq635XW7BAa80YhqGw21GDi+mIEhWXZglHj3xbXNxsMfE
4eg/kG4BkfYWFmHaXOwVWV/mr7nXf6j7WmXNeXEi32ufE1j0OL+YlQenKnMj8yP2
j9JmYa2Chwppng1SblHmcjmGkdNVwFhStKeCG+2K7v06wdDH/QYBlbgUv9gw/cxp
NlW//wgiaeX40U4O3kDwt9C+LDoh+0VrDDeVdQ+IsScLtY3PhAzEoKolFYTq2HSr
I1JpoaHNnyNsJq3DZrACQ5WlH4yDn6C2EUB6dxNnFaI9F1ZPsi+7MTl6Sei1AklD
TuQTj/lxOACBwW2Q77NU72uoxiIUauesGPHcnrAFuoCIEhZF0mso7k59BvrXhsOP
QwcLbQdc1YHNkqv/Vc7NBY+ruMsYB+5Ubbhhj2p27dp/CWFIwxI29fze4dn2uhO6
ejHN3mbqwPdSzg12YJtM6Iq61Cnwo2eVSvhTxl+ZVSZtI4nu2arzR+y7QTYmNrXP
6tkgVN9UsWeLl2xJ8wyyqL5mcvNHP2rPXWZ2X56iTaa26m+UlleeQ7YRaYtQAAr0
Ec/vlDMX64SwHhd+qwE99DXGQf2g+KklHKSLsnajJUVrWFTlRI0=
=opz8
-----END PGP SIGNATURE-----
Merge tag 'net-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Not much excitement - a handful of large patchsets (devmem among them)
did not make it in time.
Core & protocols:
- Use local_lock in addition to local_bh_disable() to protect per-CPU
resources in networking, a step closer for local_bh_disable() not
to act as a big lock on PREEMPT_RT
- Use flex array for netdevice priv area, ensure its cache alignment
- Add a sysctl knob to allow user to specify a default rto_min at
socket init time. Bit of a big hammer but multiple companies were
independently carrying such patch downstream so clearly it's useful
- Support scheduling transmission of packets based on CLOCK_TAI
- Un-pin TCP TIMEWAIT timer to avoid it firing on CPUs later cordoned
off using cpusets
- Support multiple L2TPv3 UDP tunnels using the same 5-tuple address
- Allow configuration of multipath hash seed, to both allow
synchronizing hashing of two routers, and preventing partial
accidental sync
- Improve TCP compliance with RFC 9293 for simultaneous connect()
- Support sending NAT keepalives in IPsec ESP in UDP states.
Userspace IKE daemon had to do this before, but the kernel can
better keep track of it
- Support sending supervision HSR frames with MAC addresses stored in
ProxyNodeTable when RedBox (i.e. HSR-SAN) is enabled
- Introduce IPPROTO_SMC for selecting SMC when socket is created
- Allow UDP GSO transmit from devices with no checksum offload
- openvswitch: add packet sampling via psample, separating the
sampled traffic from "upcall" packets sent to user space for
forwarding
- nf_tables: shrink memory consumption for transaction objects
Things we sprinkled into general kernel code:
- Power Sequencing subsystem (used by Qualcomm Bluetooth driver for
QCA6390) [ Already merged separately - Linus ]
- Add IRQ information in sysfs for auxiliary bus
- Introduce guard definition for local_lock
- Add aligned flavor of __cacheline_group_{begin, end}() markings for
grouping fields in structures
BPF:
- Notify user space (via epoll) when a struct_ops object is getting
detached/unregistered
- Add new kfuncs for a generic, open-coded bits iterator
- Enable BPF programs to declare arrays of kptr, bpf_rb_root, and
bpf_list_head
- Support resilient split BTF which cuts down on duplication and
makes BTF as compact as possible WRT BTF from modules
- Add support for dumping kfunc prototypes from BTF which enables
both detecting as well as dumping compilable prototypes for kfuncs
- riscv64 BPF JIT improvements in particular to add 12-argument
support for BPF trampolines and to utilize bpf_prog_pack for the
latter
- Add the capability to offload the netfilter flowtable in XDP layer
through kfuncs
Driver API:
- Allow users to configure IRQ tresholds between which automatic IRQ
moderation can choose
- Expand Power Sourcing (PoE) status with power, class and failure
reason. Support setting power limits
- Track additional RSS contexts in the core, make sure configuration
changes don't break them
- Support IPsec crypto offload for IPv6 ESP and IPv4 UDP-encapsulated
ESP data paths
- Support updating firmware on SFP modules
Tests and tooling:
- mptcp: use net/lib.sh to manage netns
- TCP-AO and TCP-MD5: replace debug prints used by tests with
tracepoints
- openvswitch: make test self-contained (don't depend on OvS CLI
tools)
Drivers:
- Ethernet high-speed NICs:
- Broadcom (bnxt):
- increase the max total outstanding PTP TX packets to 4
- add timestamping statistics support
- implement netdev_queue_mgmt_ops
- support new RSS context API
- Intel (100G, ice, idpf):
- implement FEC statistics and dumping signal quality indicators
- support E825C products (with 56Gbps PHYs)
- nVidia/Mellanox:
- support HW-GRO
- mlx4/mlx5: support per-queue statistics via netlink
- obey the max number of EQs setting in sub-functions
- AMD/Solarflare:
- support new RSS context API
- AMD/Pensando:
- ionic: rework fix for doorbell miss to lower overhead and
skip it on new HW
- Wangxun:
- txgbe: support Flow Director perfect filters
- Ethernet NICs consumer, embedded and virtual:
- Add driver for Tehuti Networks TN40xx chips
- Add driver for Meta's internal NIC chips
- Add driver for Ethernet MAC on Airoha EN7581 SoCs
- Add driver for Renesas Ethernet-TSN devices
- Google cloud vNIC:
- flow steering support
- Microsoft vNIC:
- support page sizes other than 4KB on ARM64
- vmware vNIC:
- support latency measurement (update to version 9)
- VirtIO net:
- support for Byte Queue Limits
- support configuring thresholds for automatic IRQ moderation
- support for AF_XDP Rx zero-copy
- Synopsys (stmmac):
- support for STM32MP13 SoC
- let platforms select the right PCS implementation
- TI:
- icssg-prueth: add multicast filtering support
- icssg-prueth: enable PTP timestamping and PPS
- Renesas:
- ravb: improve Rx performance 30-400% by using page pool,
theaded NAPI and timer-based IRQ coalescing
- ravb: add MII support for R-Car V4M
- Cadence (macb):
- macb: add ARP support to Wake-On-LAN
- Cortina:
- use phylib for RX and TX pause configuration
- Ethernet switches:
- nVidia/Mellanox:
- support configuration of multipath hash seed
- report more accurate max MTU
- use page_pool to improve Rx performance
- MediaTek:
- mt7530: add support for bridge port isolation
- Qualcomm:
- qca8k: add support for bridge port isolation
- Microchip:
- lan9371/2: add 100BaseTX PHY support
- NXP:
- vsc73xx: implement VLAN operations
- Ethernet PHYs:
- aquantia: enable support for aqr115c
- aquantia: add support for PHY LEDs
- realtek: add support for rtl8224 2.5Gbps PHY
- xpcs: add memory-mapped device support
- add BroadR-Reach link mode and support in Broadcom's PHY driver
- CAN:
- add document for ISO 15765-2 protocol support
- mcp251xfd: workaround for erratum DS80000789E, use timestamps to
catch when device returns incorrect FIFO status
- WiFi:
- mac80211/cfg80211:
- parse Transmit Power Envelope (TPE) data in mac80211 instead
of in drivers
- improvements for 6 GHz regulatory flexibility
- multi-link improvements
- support multiple radios per wiphy
- remove DEAUTH_NEED_MGD_TX_PREP flag
- Intel (iwlwifi):
- bump FW API to 91 for BZ/SC devices
- report 64-bit radiotap timestamp
- enable P2P low latency by default
- handle Transmit Power Envelope (TPE) advertised by AP
- remove support for older FW for new devices
- fast resume (keeping the device configured)
- mvm: re-enable Multi-Link Operation (MLO)
- aggregation (A-MSDU) optimizations
- MediaTek (mt76):
- mt7925 Multi-Link Operation (MLO) support
- Qualcomm (ath10k):
- LED support for various chipsets
- Qualcomm (ath12k):
- remove unsupported Tx monitor handling
- support channel 2 in 6 GHz band
- support Spatial Multiplexing Power Save (SMPS) in 6 GHz band
- supprt multiple BSSID (MBSSID) and Enhanced Multi-BSSID
Advertisements (EMA)
- support dynamic VLAN
- add panic handler for resetting the firmware state
- DebugFS support for datapath statistics
- WCN7850: support for Wake on WLAN
- Microchip (wilc1000):
- read MAC address during probe to make it visible to user space
- suspend/resume improvements
- TI (wl18xx):
- support newer firmware versions
- RealTek (rtw89):
- preparation for RTL8852BE-VT support
- Wake on WLAN support for WiFi 6 chips
- 36-bit PCI DMA support
- RealTek (rtlwifi):
- RTL8192DU support
- Broadcom (brcmfmac):
- Management Frame Protection support (to enable WPA3)
- Bluetooth:
- qualcomm: use the power sequencer for QCA6390
- btusb: mediatek: add ISO data transmission functions
- hci_bcm4377: add BCM4388 support
- btintel: add support for BlazarU core
- btintel: add support for Whale Peak2
- btnxpuart: add support for AW693 A1 chipset
- btnxpuart: add support for IW615 chipset
- btusb: add Realtek RTL8852BE support ID 0x13d3:0x3591"
* tag 'net-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1589 commits)
eth: fbnic: Fix spelling mistake "tiggerring" -> "triggering"
tcp: Replace strncpy() with strscpy()
wifi: ath12k: fix build vs old compiler
tcp: Don't access uninit tcp_rsk(req)->ao_keyid in tcp_create_openreq_child().
eth: fbnic: Write the TCAM tables used for RSS control and Rx to host
eth: fbnic: Add L2 address programming
eth: fbnic: Add basic Rx handling
eth: fbnic: Add basic Tx handling
eth: fbnic: Add link detection
eth: fbnic: Add initial messaging to notify FW of our presence
eth: fbnic: Implement Rx queue alloc/start/stop/free
eth: fbnic: Implement Tx queue alloc/start/stop/free
eth: fbnic: Allocate a netdevice and napi vectors with queues
eth: fbnic: Add FW communication mechanism
eth: fbnic: Add message parsing for FW messages
eth: fbnic: Add register init to set PCIe/Ethernet device config
eth: fbnic: Allocate core device specific structures and devlink interface
eth: fbnic: Add scaffolding for Meta's NIC driver
PCI: Add Meta Platforms vendor ID
net/sched: cls_flower: propagate tca[TCA_OPTIONS] to NL_REQ_ATTR_CHECK
...
This kselftest next update for Linux 6.11-rc1 consists of:
-- changes to resctrl test to cleanup resctrl_val() and
generalize it by removing test name specific handling
from the function.
-- several clang build failure fixes to framework and tests
-- adds tests to verify IFS (In Field Scan) driver functionality
-- cleanups to remove unused variables and document changes
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmaWzrUACgkQCwJExA0N
Qxx4UQ//VCkYyI/R6CUDSWU0+mUk4SxoMqwjw1cpqC/QkW3u03AyD7sjNyUtZT57
JaOHlbC1drgGDUOtLLiVNaZRXNBhJEmr0+917BtyxIO3iRaMR3FOw8mQ1w8BiMHq
46L2+/wk7xZrSFhj1/qjzwelKVTC+I+CwG8xkmBqcBSs41DyqldHkeddEf47/ijQ
qYt6RyNTTKZMQrTN/KhjtdyMk3KBigE3UwrVUuYEFT6Nlvc0fKX+2XfAlS8CdZu1
EDyq7HBPMgr/UhZt+Gvo62+T/9HBiBckVw3EYdM/WbAK45rDmTPXrL32lYEvia0q
NXjWpFLuIc1CjTEMdP1dLS1yuZlngKKco2odbPOTE9EUVpS9y9IvmzdNxMTX59mZ
AkpSkEx5PKrHHuTrN+GRxfvEYnrzYbjLvgXO2StSFuuR/huZaC7juPVcBQwXFwvM
ekciAMxt8TG0UMeEQ3+3U4HysFz6Ra7qgLv3aBfe6tbw3IMFP6b1kHoMLlbuFdwl
/A+z6Ty5rePqc/8WSCCQgwvloUeif2jzDwUXCOXWoQuQpGGnNHcumMktKIKrVzav
Zi5s32qGSqwUdvZQaSLUGL/AXjC6EzgEaMBdR6Ve+7jL+DB0ug5mkCZe8ukGuEgG
rr+sHZ5y4Eu5u0KInGKSS1b9ou37GErDVfmCUgNxrisDCr6F2hQ=
=3vp1
-----END PGP SIGNATURE-----
Merge tag 'linux_kselftest-next-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest updates from Shuah Khan:
- change resctrl test to cleanup resctrl_val() and generalize it by
removing test name specific handling from the function.
- several clang build failure fixes to framework and tests
- add tests to verify IFS (In Field Scan) driver functionality
- cleanups to remove unused variables and document changes
* tag 'linux_kselftest-next-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (33 commits)
selftests: ifs: verify IFS ARRAY BIST functionality
selftests: ifs: verify IFS scan test functionality
selftests: ifs: verify test image loading functionality
selftests: ifs: verify test interfaces are created by the driver
selftests/dma:remove unused variable
selftests/breakpoints:Remove unused variable
selftests/x86: fix printk warnings reported by clang
selftests/x86: remove (or use) unused variables and functions
selftests/x86: avoid -no-pie warnings from clang during compilation
selftests/x86: build sysret_rip.c with clang
selftests/x86: build fsgsbase_restore.c with clang
selftests: x86: test_FISTTP: use fisttps instead of ambiguous fisttp
selftests/x86: fix Makefile dependencies to work with clang
selftests/timers: remove unused irqcount variable
selftests: Add information about TAP conformance in tests
selftests/resctrl: Remove test name comparing from write_bm_pid_to_resctrl()
selftests/resctrl: Remove mongrp from CMT test
selftests/resctrl: Remove mongrp from MBA test
selftests/resctrl: Convert ctrlgrp & mongrp to pointers
selftests/resctrl: Make some strings passed to resctrlfs functions const
...
- Fix bug that caused objtool to confuse certain memory ops
added by KASAN instrumentation as stack accesses
- Various faddr2line optimizations
- Improve error messages
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmaVsrARHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1jvlhAAj9I7UXMMWRq+lOtYp/nUNpq4XxETeEbx
IvF6b1TVbD2M8+a5y8Wo9cy+UL1i781oHaM+cggDWMnvt6IV6dUIbb72cLjtWYzx
bLD1NDw4exrSFrlbn2ri8U1dkpHHmxoSLHmFdcpKNCV5NLEtFPSj9Bmd6DCmrQ2U
kbgAAjtc2XWx2ObXtgId8MfQso1/xzX6u2KhduQCEa/vuAzEUy7ppVXC5lwpKaVY
MSo3vzC2kt5HlZirsp4ALiHkmJgxytoXVtzzKKnsxyLeG84GcUQy8aQndCevOhqU
nPI3BCyDs0o1DYNWj/qP5d1HnPdHKPptMZtPlmg6ukBNqIuFXSGwaLdWp06dTQ+6
Q7COfVwG24H6QQD8aixvBDCIVGbzDJRF6DCL1zZWVtV1faxL1iM+vjOsh5C7l5A4
ZuaiZCa79JT3nQ6NHD5k5rv++FWEcC2dpwRBgSZU8HbfCtDIEEKB6ZYA/YVkt0Ae
hAfkPX6j4Cl4MuFS2VmuPeaEmnciuTUHq65MGM1PxW7qaYgDDdT58T3Kt6SFrPy0
8KOeVKUMI7sRNDOcIIRbsoSzQ4xtXSeurX6zNsyj17L9O8QxzMCTLHi1tZGSPd9g
VtKNkIHBNOch6CAcE2tgntzJ/0c0NTyN8F7Xh7PK2o981Kmj/Ic0r1TVMdY7PCm3
riLimaXEhJ8=
=RwK9
-----END PGP SIGNATURE-----
Merge tag 'objtool-core-2024-07-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool updates from Ingo Molnar:
- Fix bug that caused objtool to confuse certain memory ops added by
KASAN instrumentation as stack accesses
- Various faddr2line optimizations
- Improve error messages
* tag 'objtool-core-2024-07-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool/x86: objtool can confuse memory and stack access
objtool: Use "action" in error message to be consistent with help
scripts/faddr2line: Check only two symbols when calculating symbol size
scripts/faddr2line: Remove call to addr2line from find_dir_prefix()
scripts/faddr2line: Invoke addr2line as a single long-running process
scripts/faddr2line: Pass --addresses argument to addr2line
scripts/faddr2line: Check vmlinux only once
scripts/faddr2line: Combine three readelf calls into one
scripts/faddr2line: Reduce number of readelf calls to three
- Add Loongson-3 CPUFreq driver support (Huacai Chen).
- Add support for the Arrow Lake and Lunar Lake platforms and
the out-of-band (OOB) mode on Emerald Rapids to the intel_pstate
cpufreq driver, make it support the highest performance change
interrupt and clean it up (Srinivas Pandruvada).
- Switch cpufreq to new Intel CPU model defines (Tony Luck).
- Simplify the cpufreq driver interface by switching the .exit() driver
callback to the void return data type (Lizhe, Viresh Kumar).
- Make cpufreq_boost_enabled() return bool (Dhruva Gole).
- Add fast CPPC support to the amd-pstate cpufreq driver, address
multiple assorted issues in it and clean it up (Perry Yuan, Mario
Limonciello, Dhananjay Ugwekar, Meng Li, Xiaojian Du).
- Add Allwinner H700 speed bin to the sun50i cpufreq driver (Ryan
Walklin).
- Fix memory leaks and of_node_put() usage in the sun50i and qcom-nvmem
cpufreq drivers (Javier Carrasco).
- Clean up the sti and dt-platdev cpufreq drivers (Jeff Johnson,
Raphael Gallais-Pou).
- Fix deferred probe handling in the TI cpufreq driver and wrong return
values of ti_opp_supply_probe(), and add OPP tables for the AM62Ax and
AM62Px SoCs to it (Bryan Brattlof, Primoz Fiser).
- Avoid overflow of target_freq in .fast_switch() in the SCMI cpufreq
driver (Jagadeesh Kona).
- Use dev_err_probe() in every error path in probe in the Mediatek
cpufreq driver (Nícolas Prado).
- Fix kernel-doc param for longhaul_setstate in the longhaul cpufreq
driver (Yang Li).
- Fix system resume handling in the CPPC cpufreq driver (Riwen Lu).
- Improve the teo cpuidle governor and clean up leftover comments from
the menu cpuidle governor (Christian Loehle).
- Clean up a comment typo in the teo cpuidle governor (Atul Kumar
Pant).
- Add missing MODULE_DESCRIPTION() macro to cpuidle haltpoll (Jeff
Johnson).
- Switch the intel_idle driver to new Intel CPU model defines (Tony
Luck).
- Switch the Intel RAPL driver new Intel CPU model defines (Tony Luck).
- Simplify if condition in the idle_inject driver (Thorsten Blum).
- Fix missing cleanup on error in _opp_attach_genpd() (Viresh Kumar).
- Introduce an OF helper function to inform if required-opps is used
and drop a redundant in-parameter to _set_opp_level() (Ulf Hansson).
- Update pm-graph to v5.12 which includes fixes and major code revamp
for python3.12 (Todd Brandt).
- Address several assorted issues in the cpupower utility (Roman
Storozhenko).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmaVb+8SHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxXIUQALFhNTO+wo8uPWUmsp0SV81Sbf17zM0f
9IDpzJTUZLK0stTdLtxY4khcClPE4MrwS/LjSJlvkEVZChHpUw6vFezHmx0O42Ti
Tmv3ezABSAmx6QVRSpyVhE3Hb0BmXW9V+3dtoefofV0JWenN7mqk4Hbb2Jx1Cvbh
zyerUeWWl97yqVMM2l5owKHSvk7SYO6cfML73XcdXQ6pBfQePfekG87i1+r40l+d
qEzdyh6JjqGbdkvZKtI4zO1Hdai9FdlLWSqYmVZGS5XRN8RVvDaHDIDlSijNXAei
DFPFoBVAvl8CymBXXnzDyJJhCCkEb2aX3xD6WzthoCygZt5W+tqfGxyZfViBfb55
kvpyiWZUVaDyX4Hfz1PLnJ7Xg9kPUKUcDDrsV5vKA7W0Sq2T0RbORsVkaP2nIhlY
4Xspp9nEv+78DG0UjT7jT0Py2Oq9I6BTG+pmMTxcgA7G/U5H2uAvvIM/kwQ+30vi
yUxO3W5o9TQmvJF1klHgp3YsCNWZG3IYacHZzUIoPbPusEbevYrCuUNriT+zlANc
Pv/FMfBfHDmU2lHWyLzuoKhlzQosNi9NajMANBJgd55zACWKzgNzFV4P5gIMd1KR
moJYfosbT2RWetEH8Zrh7xA5dewUphe6tibshElbKJHilnP0iFjYhhdb6aQRcuPd
q/RECFYT7z0r
=imBx
-----END PGP SIGNATURE-----
Merge tag 'pm-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These add a new cpufreq driver for Loongson-3, add support for new
features in the intel_pstate (Lunar Lake and Arrow Lake platforms, OOB
mode for Emerald Rapids, highest performance change interrupt),
amd-pstate (fast CPPC) and sun50i (Allwinner H700 speed bin) cpufreq
drivers, simplify the cpufreq driver interface, simplify the teo
cpuidle governor, adjust the pm-graph utility for a new version of
Python, address issues and clean up code.
Specifics:
- Add Loongson-3 CPUFreq driver support (Huacai Chen)
- Add support for the Arrow Lake and Lunar Lake platforms and the
out-of-band (OOB) mode on Emerald Rapids to the intel_pstate
cpufreq driver, make it support the highest performance change
interrupt and clean it up (Srinivas Pandruvada)
- Switch cpufreq to new Intel CPU model defines (Tony Luck)
- Simplify the cpufreq driver interface by switching the .exit()
driver callback to the void return data type (Lizhe, Viresh Kumar)
- Make cpufreq_boost_enabled() return bool (Dhruva Gole)
- Add fast CPPC support to the amd-pstate cpufreq driver, address
multiple assorted issues in it and clean it up (Perry Yuan, Mario
Limonciello, Dhananjay Ugwekar, Meng Li, Xiaojian Du)
- Add Allwinner H700 speed bin to the sun50i cpufreq driver (Ryan
Walklin)
- Fix memory leaks and of_node_put() usage in the sun50i and
qcom-nvmem cpufreq drivers (Javier Carrasco)
- Clean up the sti and dt-platdev cpufreq drivers (Jeff Johnson,
Raphael Gallais-Pou)
- Fix deferred probe handling in the TI cpufreq driver and wrong
return values of ti_opp_supply_probe(), and add OPP tables for the
AM62Ax and AM62Px SoCs to it (Bryan Brattlof, Primoz Fiser)
- Avoid overflow of target_freq in .fast_switch() in the SCMI cpufreq
driver (Jagadeesh Kona)
- Use dev_err_probe() in every error path in probe in the Mediatek
cpufreq driver (Nícolas Prado)
- Fix kernel-doc param for longhaul_setstate in the longhaul cpufreq
driver (Yang Li)
- Fix system resume handling in the CPPC cpufreq driver (Riwen Lu)
- Improve the teo cpuidle governor and clean up leftover comments
from the menu cpuidle governor (Christian Loehle)
- Clean up a comment typo in the teo cpuidle governor (Atul Kumar
Pant)
- Add missing MODULE_DESCRIPTION() macro to cpuidle haltpoll (Jeff
Johnson)
- Switch the intel_idle driver to new Intel CPU model defines (Tony
Luck)
- Switch the Intel RAPL driver new Intel CPU model defines (Tony
Luck)
- Simplify if condition in the idle_inject driver (Thorsten Blum)
- Fix missing cleanup on error in _opp_attach_genpd() (Viresh Kumar)
- Introduce an OF helper function to inform if required-opps is used
and drop a redundant in-parameter to _set_opp_level() (Ulf Hansson)
- Update pm-graph to v5.12 which includes fixes and major code revamp
for python3.12 (Todd Brandt)
- Address several assorted issues in the cpupower utility (Roman
Storozhenko)"
* tag 'pm-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (77 commits)
cpufreq: sti: fix build warning
cpufreq: mediatek: Use dev_err_probe in every error path in probe
cpufreq: Add Loongson-3 CPUFreq driver support
cpufreq: Make cpufreq_driver->exit() return void
cpufreq/amd-pstate: Fix the scaling_max_freq setting on shared memory CPPC systems
cpufreq/amd-pstate-ut: Convert nominal_freq to khz during comparisons
cpufreq: pcc: Remove empty exit() callback
cpufreq: loongson2: Remove empty exit() callback
cpufreq: nforce2: Remove empty exit() callback
cpupower: fix lib default installation path
cpufreq: docs: Add missing scaling_available_frequencies description
cpuidle: teo: Don't count non-existent intercepts
cpupower: Disable direct build of the 'bench' subproject
cpuidle: teo: Remove recent intercepts metric
Revert: "cpuidle: teo: Introduce util-awareness"
cpufreq: make cpufreq_boost_enabled() return bool
cpufreq: intel_pstate: Support highest performance change interrupt
x86/cpufeatures: Add HWP highest perf change feature flag
Documentation: cpufreq: amd-pstate: update doc for Per CPU boost control method
cpufreq: amd-pstate: Cap the CPPC.max_perf to nominal_perf if CPB is off
...
- interrupt SECCOMP_IOCTL_NOTIF_RECV when all users exit (Andrei Vagin)
- Update selftests to check for expected NOTIF_RECV exits (Andrei Vagin)
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmaVTOkACgkQiXL039xt
wCayUA//fJTZUq+idihdKMql9SEwYh0uIJQpGOxxBEesUksZUM4alTCswzmheLFL
q9BxlWWJJfUsp3djeZK+0vnDv3izaR+LfA1JPEJf64ImbympbEenVXK0ZkmVrTqg
bNAlD5c/LVpzYtB7cnOaglq18uUja6/E+EQvNYz5NLHrIhYYCieJZIiFATkHQ9Lj
3Wq3g9FWEa5pZxpKbEI3UA2HllADnmHeb/Z78Zdvyue5lOOvsBIheQfL4m0pW38x
xgBWNglIg7b+X+YgwYSv8w50Lhn4SJVtIynWnwzBz19qFJRQL7oJRj1zyFHZPCwZ
ajHVIj5LOuts/BYxSiGzczxVqZaAqeOyCY5e8G+Mjk5ZD5kLYznbbcrIFIUIaHpx
rpRD/TVVwJ3PHsOIpWHwrXKgoKnbe/0n8lJT+Ehnm/2lLrlGyZj9hLyl6/+JsizE
dGIWgE2emykYI+52IRRYSZaw4hLb+CU52d1vd5a35wUk1ie5fcVGZAWnaul23x0I
maQtXcyB6tYuhX3oPnxoxFVqGvCKGi3Tc5N+Vg4JR/RzTy2H7fZ02DRZq8Vs0QST
EgO3cpCD3035qxkK6ivaV4ebPJjkL158D/+uFyne+PWQlSfrmXJvhfggPFA+Oqtb
y8PTUV73+HxDiruXbGZ7MD8C7d+ZvGI/D+xheohYrijhivcXxcc=
=tlFN
-----END PGP SIGNATURE-----
Merge tag 'seccomp-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull seccomp updates from Kees Cook:
- interrupt SECCOMP_IOCTL_NOTIF_RECV when all users exit (Andrei Vagin)
- Update selftests to check for expected NOTIF_RECV exits (Andrei
Vagin)
* tag 'seccomp-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
selftests/seccomp: check that a zombie leader doesn't affect others
selftests/seccomp: add test for NOTIF_RECV and unused filters
seccomp: release task filters when the task exits
seccomp: interrupt SECCOMP_IOCTL_NOTIF_RECV when all users have exited
- Use value of kernel.randomize_va_space once per exec (Alexey Dobriyan)
- Honor PT_LOAD alignment for static PIE
- Make bprm->argmin only visible under CONFIG_MMU
- Add KUnit testing of bprm_stack_limits()
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmaVTFAACgkQiXL039xt
wCYyZw//ZcPV2hu48WqqOImL8LI9HIUaZqKQpixGQRD5VcTRb5MKg8g3Wi4EBHz+
Kg6QvTEOQdg6NbfE9fH8VIIwcp3dAxdWN6g+3A0HHDSRdb8Ye1ucnzB2kgmEkM1l
huBRn5tnoS0vn2fxafu1O5tj330kKAvTsemsy316cxmbKNs7ckHdfwuVgZHcuyEt
OrOA3ZSTWwjkSiA9tatsi5iAQ34tQYGwDEosf06avlnPkQqsRzn3wNlohAPjQF6V
kjRfX/Mxz2EHa0mjXy2OkhNyPSn6wu0OcmF0ympySHzxm726uRggG+olT5ziUc+2
DW6Gz6TJ1P8Gu+uTEoz6AY+l5Bpo9ZLYSBm+Mp88sxAT6+Xcc68XeZsFZHmefJzs
6g6EdmwhDEP/Xd3sIsNphdkS5q1RMgc7tdAtyK8GCaACsHUlU4CfzRYh2mWxpIg6
hA7oM5KF9FuToLtaIS6K/yYQIVsTKAaA7t+5K/a1RUyKzcJ0O7UpMx1oEge2sPEK
RnETCYhQs0Cxm11iJ/eqEFzWm0Puxjsjz/P/j5H5U8usx9VUoz0HuS91fNEIY3S9
y7bn09wxuUv4QddKYgltkurxCCB//Nv7jPYo96pKIW3T56XkfsrYLvNH2W95cCNz
OMvZImA1J/vQubSODrgeQsfMRsaJodHU3acWyYQ90HmmoWx4JS4=
=bO7x
-----END PGP SIGNATURE-----
Merge tag 'execve-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull execve updates from Kees Cook:
- Use value of kernel.randomize_va_space once per exec (Alexey
Dobriyan)
- Honor PT_LOAD alignment for static PIE
- Make bprm->argmin only visible under CONFIG_MMU
- Add KUnit testing of bprm_stack_limits()
* tag 'execve-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
exec: Avoid pathological argc, envc, and bprm->p values
execve: Keep bprm->argmin behind CONFIG_MMU
ELF: fix kernel.randomize_va_space double read
exec: Add KUnit test for bprm_stack_limits()
binfmt_elf: Honor PT_LOAD alignment for static PIE
binfmt_elf: Calculate total_size earlier
selftests/exec: Build both static and non-static load_address tests
Most of this is part of my ongoing work to clean up the system call
tables. In this bit, all of the newer architectures are converted to
use the machine readable syscall.tbl format instead in place of complex
macros in include/uapi/asm-generic/unistd.h.
This follows an earlier series that fixed various API mismatches
and in turn is used as the base for planned simplifications.
The other two patches are dead code removal and a warning fix.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmaVB1cACgkQYKtH/8kJ
UicMqxAAnYKOxfjoMIhYYK6bl126wg/vIcDcjIR9cNWH21Nhn3qxn11ZXau3S7xv
3l/HreEhyEQr4gC2a70IlXyHUadYOlrk+83OURrunWk1oKPmZlMKcfPVbtp8GL7x
PUNXQfwM1XZLveKwufY24hoZdwKC+Y/5WLc1t0ReznJuAqgeO2rM9W5dnV5bAfCp
he3F5hFcr196Dz3/GJjJIWrY+cbwfmZWsNtj1vFTL5/r/LuCu8HTkqhsGj8tE5BJ
NGVEEXbp5eaVTCIGqJWhnuZcsnKN9kM51M7CtdwWf8OTckUVuJap5OsDVKQkWkGl
bLPbd2jhDltph0sah51hAIvv4WdkThW76u9FRW7KR3fo7ra67eF7l5j7wc1lE2JB
GwLJ1X56Bxe1GhvvNTlDmb7DrnlP/DMPuRv3Z6xyH6l8iZ2pMGlnAxuw6Bs1s6Y5
WSs36ZpnS0ctgjfx37ZITsZSvbKFPpQFJP4siwS8aRNv/NFALNNdFyOCY5lNzspZ
0dxwjn6/7UpHE4MKh6/hvCg2QwupXXBTRytibw+75/rOsR+EYlmtuONtyq2sLUHe
ktJ5pg+8XuZm27+wLffuluzmY7sv2F8OU4cTYeM60Ynmc6pRzwUY6/VhG52S1/mU
Ua4VgYIpzOtlLrYmz5QTWIZpdSFSVbIc/3pLriD6hn4Mvg+BwdA=
=XOhL
-----END PGP SIGNATURE-----
Merge tag 'asm-generic-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic updates from Arnd Bergmann:
"Most of this is part of my ongoing work to clean up the system call
tables. In this bit, all of the newer architectures are converted to
use the machine readable syscall.tbl format instead in place of
complex macros in include/uapi/asm-generic/unistd.h.
This follows an earlier series that fixed various API mismatches and
in turn is used as the base for planned simplifications.
The other two patches are dead code removal and a warning fix"
* tag 'asm-generic-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
vmlinux.lds.h: catch .bss..L* sections into BSS")
fixmap: Remove unused set_fixmap_offset_io()
riscv: convert to generic syscall table
openrisc: convert to generic syscall table
nios2: convert to generic syscall table
loongarch: convert to generic syscall table
hexagon: use new system call table
csky: convert to generic syscall table
arm64: rework compat syscall macros
arm64: generate 64-bit syscall.tbl
arm64: convert unistd_32.h to syscall.tbl format
arc: convert to generic syscall table
clone3: drop __ARCH_WANT_SYS_CLONE3 macro
kbuild: add syscall table generation to scripts/Makefile.asm-headers
kbuild: verify asm-generic header list
loongarch: avoid generating extra header files
um: don't generate asm/bpf_perf_event.h
csky: drop asm/gpio.h wrapper
syscalls: add generic scripts/syscall.tbl
- Remove dead code in the memslot modification stress test.
- Treat "branch instructions retired" as supported on all AMD Family 17h+ CPUs.
- Print the guest pseudo-RNG seed only when it changes, to avoid spamming the
log for tests that create lots of VMs.
- Make the PMU counters test less flaky when counting LLC cache misses by
doing CLFLUSH{OPT} in every loop iteration.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmaRvAwACgkQOlYIJqCj
N/2PSw//UgZJnVNvh87kxYY48hNamwaFCkbCgBCx4J4SBgZkz/6hqzEo8SsFIQEP
bb1W6z1cAthL1f5OuTsCkROGfjnUrCi4igLfnSl7vaJjkInwKz4kQmW37XCWhQ4p
VGayOPvGk122uY63tVo7041v2ByKNJFEwSWQVCIGTY+ZyYH0uH2GoeN/PRllPw1Z
CY9JxFmyLyUZCCSoNbEF8I0uxrKeFj42NHZ8PebWKpRm4ZWCa6Nd3o4q3mrFAqth
BuIrg3bYKrD7qyGFtR0Hrn2RTzyVJimFILFg3CxQfVqw32kwuZxmttYKuXgeUYo3
lMmYXLc/sYzoOIIojEFFwAVOrt4vegbar8sQ8VyglCfMRuLFRS4qEm9SEy7y8p14
s5mjcKBoTW6PSSoqGbrUO6fmA2Ex0yrQzYP+sC4QG6u57f41Pv2zF7vbzA3UItT7
ujjKTRqG1LJLY3cYQy6j+4pVcEJGTPTGE/2QbYElyFtG+mVrDZybnYR/g6Xb9SH6
OVtnIHtB0PZ8wm64hhszLjSBoL49iqSP7K4GLusdD9l8y92yGnveurj9shVn2OqM
zLMdhrwe/ioTZTNAyeHI2IsmWHcHqaoB5yNADvcHLoIFFUaihEkGugt767JFVo7q
4xTqapa+DSMe7fYfRUI92V1TFwNpq0tThbDIZ1wI6dF+AGNm2Dg=
=zg8U
-----END PGP SIGNATURE-----
Merge tag 'kvm-x86-selftests-6.11' of https://github.com/kvm-x86/linux into HEAD
KVM selftests for 6.11
- Remove dead code in the memslot modification stress test.
- Treat "branch instructions retired" as supported on all AMD Family 17h+ CPUs.
- Print the guest pseudo-RNG seed only when it changes, to avoid spamming the
log for tests that create lots of VMs.
- Make the PMU counters test less flaky when counting LLC cache misses by
doing CLFLUSH{OPT} in every loop iteration.
- Add a global struct to consolidate tracking of host values, e.g. EFER, and
move "shadow_phys_bits" into the structure as "maxphyaddr".
- Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the effective APIC
bus frequency, because TDX.
- Print the name of the APICv/AVIC inhibits in the relevant tracepoint.
- Clean up KVM's handling of vendor specific emulation to consistently act on
"compatible with Intel/AMD", versus checking for a specific vendor.
- Misc cleanups
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmaRub0ACgkQOlYIJqCj
N/2LMxAArGzhcWZ6Qdo2aMRaMIPtSBJHmbEgEuHvHMumgsTZQzDcn9cxDi/hNSrc
l8ODOwAM2qNcq95YfwjU7F0ae3E+HRzGvKcBnmZWuQeCDp2HhVEoCphFu1sHst+t
XEJTL02b6OgyJUEU3h40mYk12eiq2S4FCnFYXPCqijwwuL6Y5KQvvTqek3c2/SDn
c+VneutYGax/S0GiiCkYh4wrwWh9g7qm0IX70ycBwJbW5qBFKgyglvHxvL8JLJC9
Nkkw/p2657wcOdraH+fOBuRy2dMwE5fv++1tOjWwB5WAAhSOJPZh0BGYvgA2yfN7
OE+k7APKUQd9Xxtud8H3LrTPoyMA4hz2sdDFyqrrWK9yjpBY7zXNyN50Fxi7VVsm
T8nTIiKAGyRbjotY+m7krXQPXjfZYhVqrJ/jtxESOZLZ93q2gSWU2p/ZXpUPVHnH
+YOBAI1owP3wepaYlrthtI4LQx9lF422dnmeSflztfKFGabRbQZxg3uHMCCxIaGc
lJ6CD546+D45f/uBXRDMqk//qFTqXhKUbDk9sutmU/C2oWufMwW0R8kOyItGPyvk
9PP1vd8vSsIHj+tpwg+i04jBqYDaAcPBOcTZaHm9SYYP+1e11Uu5Vjep37JL1bkA
xJWxnDZOCGcfKQi2jkh51HJ/dOAHXY1GQKMfyAoPQOSonYHvGVY=
=Cf2R
-----END PGP SIGNATURE-----
Merge tag 'kvm-x86-misc-6.11' of https://github.com/kvm-x86/linux into HEAD
KVM x86 misc changes for 6.11
- Add a global struct to consolidate tracking of host values, e.g. EFER, and
move "shadow_phys_bits" into the structure as "maxphyaddr".
- Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the effective APIC
bus frequency, because TDX.
- Print the name of the APICv/AVIC inhibits in the relevant tracepoint.
- Clean up KVM's handling of vendor specific emulation to consistently act on
"compatible with Intel/AMD", versus checking for a specific vendor.
- Misc cleanups
- Enable halt poll shrinking by default, as Intel found it to be a clear win.
- Setup empty IRQ routing when creating a VM to avoid having to synchronize
SRCU when creating a split IRQCHIP on x86.
- Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag
that arch code can use for hooking both sched_in() and sched_out().
- Take the vCPU @id as an "unsigned long" instead of "u32" to avoid
truncating a bogus value from userspace, e.g. to help userspace detect bugs.
- Mark a vCPU as preempted if and only if it's scheduled out while in the
KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest
memory when retrieving guest state during live migration blackout.
- A few minor cleanups
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmaRuOYACgkQOlYIJqCj
N/1UnQ/8CI5Qfr+/0gzYgtWmtEMczGG+rMNpzD3XVqPjJjXcMcBiQnplnzUVLhha
vlPdYVK7vgmEt003XGzV55mik46LHL+DX/v4hI3HEdblfyCeNLW3fKEWVRB44qJe
o+YUQwSK42SORUp9oXuQINxhA//U9EnI7CQxlJ8w8wenv5IJKfIGr01DefmfGPAV
PKm9t6WLcNqvhZMEyy/zmzM3KVPCJL0NcwI97x6sHxFpQYIDtL0E/VexA4AFqMoT
QK7cSDC/2US41Zvem/r/GzM/ucdF6vb9suzZYBohwhxtVhwJe2CDeYQZvtNKJ1U7
GOHPaKL6nBWdZCm/yyWbbX2nstY1lHqxhN3JD0X8wqU5rNcwm2b8Vfyav0Ehc7H+
jVbDTshOx4YJmIgajoKjgM050rdBK59TdfVL+l+AAV5q/TlHocalYtvkEBdGmIDg
2td9UHSime6sp20vQfczUEz4bgrQsh4l2Fa/qU2jFwLievnBw0AvEaMximkSGMJe
b8XfjmdTjlOesWAejANKtQolfrq14+1wYw0zZZ8PA+uNVpKdoovmcqSOcaDC9bT8
GO/NFUvoG+lkcvJcIlo1SSl81SmGLosijwxWfGvFAqsgpR3/3l3dYp0QtztoCNJO
d3+HnjgYn5o5FwufuTD3eUOXH4AFjG108DH0o25XrIkb2Kymy0o=
=BalU
-----END PGP SIGNATURE-----
Merge tag 'kvm-x86-generic-6.11' of https://github.com/kvm-x86/linux into HEAD
KVM generic changes for 6.11
- Enable halt poll shrinking by default, as Intel found it to be a clear win.
- Setup empty IRQ routing when creating a VM to avoid having to synchronize
SRCU when creating a split IRQCHIP on x86.
- Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag
that arch code can use for hooking both sched_in() and sched_out().
- Take the vCPU @id as an "unsigned long" instead of "u32" to avoid
truncating a bogus value from userspace, e.g. to help userspace detect bugs.
- Mark a vCPU as preempted if and only if it's scheduled out while in the
KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest
memory when retrieving guest state during live migration blackout.
- A few minor cleanups
- Initial infrastructure for shadow stage-2 MMUs, as part of nested
virtualization enablement
- Support for userspace changes to the guest CTR_EL0 value, enabling
(in part) migration of VMs between heterogenous hardware
- Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of
the protocol
- FPSIMD/SVE support for nested, including merged trap configuration
and exception routing
- New command-line parameter to control the WFx trap behavior under KVM
- Introduce kCFI hardening in the EL2 hypervisor
- Fixes + cleanups for handling presence/absence of FEAT_TCRX
- Miscellaneous fixes + documentation updates
-----BEGIN PGP SIGNATURE-----
iI0EABYIADUWIQSNXHjWXuzMZutrKNKivnWIJHzdFgUCZpTCAxccb2xpdmVyLnVw
dG9uQGxpbnV4LmRldgAKCRCivnWIJHzdFjChAQCWs9ucJag4USgvXpg5mo9sxzly
kBZZ1o49N/VLxs4cagEAtq3KVNQNQyGXelYH6gr20aI85j6VnZW5W5z+sy5TAgk=
=sSOt
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 changes for 6.11
- Initial infrastructure for shadow stage-2 MMUs, as part of nested
virtualization enablement
- Support for userspace changes to the guest CTR_EL0 value, enabling
(in part) migration of VMs between heterogenous hardware
- Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of
the protocol
- FPSIMD/SVE support for nested, including merged trap configuration
and exception routing
- New command-line parameter to control the WFx trap behavior under KVM
- Introduce kCFI hardening in the EL2 hypervisor
- Fixes + cleanups for handling presence/absence of FEAT_TCRX
- Miscellaneous fixes + documentation updates
environments
- Remove duplicated Spectre cmdline option documentation
- Add separate macro definitions for syscall handlers which do not
return in order to address objtool warnings
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmaVXXMACgkQEsHwGGHe
VUrd3A/9FFJZcpxdpWJikyEskb3CO1xthfM/6QvV5U3/Nldpz4aROEteqsMYc+xB
OcA/RkCc8mBBFuydZjNxlNwyMXkoab/rQJC/Dz7q1O61sho4RWk8yCh6xM1JRofF
WeKGCClz1KnsCc8FlVaHAEhp6gBMJiiqawjXBklfHhUqmbY7UZgcAyeM3uMIwAEG
qCS7opOSZVijJadoyvROf5na23hggUVO++qS4HYT66G3bI3MdEEWp06dUxXBD/Er
2zRAY6III4wuGTxe8L49ftsyW9RS7AKY2rUmhpffkeA8tLYBfXogYVSQYyR3S9Ou
gZg9Yeu64rjqZZUYpzRR+kATUpuSKO6nQBHxd+ICRIUbzSmXUNzvPTi5SWSWh2vC
HTLgFbGXxg8fLlpqCJ21oaU982w3eteOJ+wgf/AH3hBykFljck9EcaGsaQ5OfeDE
MA0XaDy2V4jypyxmLpRfRIWJWtNVTgza2Jl0Dg3X+UipAXtvCvJzW1ZJ0ksA+2P0
K1GeWy4tC51uFndeYpNC1eQ0cJjv1mfAugHcqgVdAhwMYUZdXchaPJHr/fcF7AEG
xjV7fnoGK6WKKUni+Tnmom3FzBVDztKAtZ4iYgwIWReRj9bKLhP2k779rMXkCftt
WtiencSCtVn+K/4acYBx0vbRKlDv769Lq64FZ8xNgGw6uRXjhhM=
=AP9P
-----END PGP SIGNATURE-----
Merge tag 'x86_bugs_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpu mitigation updates from Borislav Petkov:
- Add a spectre_bhi=vmexit mitigation option aimed at cloud
environments
- Remove duplicated Spectre cmdline option documentation
- Add separate macro definitions for syscall handlers which do not
return in order to address objtool warnings
* tag 'x86_bugs_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/bugs: Add 'spectre_bhi=vmexit' cmdline option
x86/bugs: Remove duplicate Spectre cmdline option descriptions
x86/syscall: Mark exit[_group] syscall handlers __noreturn
they're the only ones who can interpret the results properly
- The usual cleanups and fixes, left and right
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmaVOU0ACgkQEsHwGGHe
VUqeFBAAl9X4bj08GwSAXfqBangXaGpKO4Nx0VZiFCYDkQ/TDnchMEBbpRWSuVzS
SEnVSrcAXCxKqhv295UyFMmv2a+q3UUidkxTzRfznekMZMMylHYcfCFrg16w9ZNJ
N/cBquTu96hSJHd2/usNUvNPLllTrMoIg3gofBav+NTaHQQDmzvM5htfewREY9OF
SRS/86o3u5oIsRKKiJRyzfLzzX9lEGUvU+lvxv/yu1x2Q6SG0guhfM3HeaSxCIOs
yeB23bwe/N/pO5KlqOtEJJL49Ypu2k/jfiS2rhH6AxSqNfXVpBlDbnahu9sA973n
irzWwycJhVU4OQ3pqmPXdcKDqn7GmUWDsjrkEIOqJeBCSukmlM7APi8Ss8yGZ3X4
HgDw10c900ldrxSo0H5PdpeULvowpeptpzBY8gzcdum4s0vNUvZLy/n1AKo7ydea
oJ+ZBdXvywnR66uGQLkTxLvpGTNgyFrKDORHuyOAwJTN5CbLuco2SV/82mkcQCZt
sAgyiWFvIcLoHZPfY8BNztYWVX01lWDIxFHJE8ca/B97mBeZCC3w1DnHJla8Kxsg
zCMV0yn61BdMvjVS9AGaKqEuN0gYYrs/QOjtOp5ggAv7QC1ke/wqgZoFGvLbmcP9
pIf8GzCt34u3tACGAl76toP0rtnMjGvKD8xXdHGHf7AAj1jKo28=
=rd6Q
-----END PGP SIGNATURE-----
Merge tag 'x86_misc_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 updates from Borislav Petkov:
- Make error checking of AMD SMN accesses more robust in the callers as
they're the only ones who can interpret the results properly
- The usual cleanups and fixes, left and right
* tag 'x86_misc_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/kmsan: Fix hook for unaligned accesses
x86/platform/iosf_mbi: Convert PCIBIOS_* return codes to errnos
x86/pci/xen: Fix PCIBIOS_* return code handling
x86/pci/intel_mid_pci: Fix PCIBIOS_* return code handling
x86/of: Return consistent error type from x86_of_pci_irq_enable()
hwmon: (k10temp) Rename _data variable
hwmon: (k10temp) Remove unused HAVE_TDIE() macro
hwmon: (k10temp) Reduce k10temp_get_ccd_support() parameters
hwmon: (k10temp) Define a helper function to read CCD temperature
x86/amd_nb: Enhance SMN access error checking
hwmon: (k10temp) Check return value of amd_smn_read()
EDAC/amd64: Check return value of amd_smn_read()
EDAC/amd64: Remove unused register accesses
tools/x86/kcpuid: Add missing dir via Makefile
x86, arm: Add missing license tag to syscall tables files
need to "spell out" the number of alternates in an ALTERNATIVE_n() macro and
thus have an ever-increasing complexity in those definitions.
For ease of bisection, the old macros are converted to the new, nested
variants in a step-by-step manner so that in case an issue is encountered
during testing, one can pinpoint the place where it fails easier. Because
debugging alternatives is a serious pain.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmaU/+MACgkQEsHwGGHe
VUpMGhAAqVbB3DZohv0Oa4BRRvaKFuQ3L7H0NTjK/pbT3EG+phol0zrHby2MnGjD
HWXskps86n91QBB/06vAyZRimV/dvAPvlSKllsRx6ie3VCE4FJPzA4nTWQn/41dC
HWamj78mQuSMgioLzIYdTY79KObtJcUw/X/xz+TTMemfkzkQxukKY7+Y71nZbuKi
rUuCSrfAWNHQaIaoGs2JowGw7te7yNOtKQMCW5TdNLwvJfOAECuoLIFeiEcWHvoO
uGl6FTABNLp26wmaeceUxdjBbTJcM3iV3joZQYED7B+mbJcU/a7tZw7I+mavPrbh
Y6+EOn7rzR0wbcmj0iJ74TKr+uKDme/Qzm3YEKgGvJPj9tRjTDwxWRBnyTeCMbav
NkKVwWTep8K+1qJtGVBwACY6iz89u3P8V5owD8O++KIPQa8rA0m8pN5gaU3PVYYQ
D2UUdqXWIPIFoD4Sveb/WFU8OJKY+Nx7IK8KD03h5tiXW8MmGSa2e5b57gIfCLP7
DbSHyCkTiqEdBrSM4/RaVVckD6NZ39M87H+iV51vYUkCYmODa/riMj0M7SVMi5Jo
S/30jvdHEzWnmDBbOsn9d1XbvB5I+zz3BrcZQ2VSyBx+Y9m+SZ9qsyMEkOWw4uKM
kfFp+XlYVWSnQFo7jY3UUIgVrU9dmdX1WxNX7/2HABDjg3MtWog=
=tGap
-----END PGP SIGNATURE-----
Merge tag 'x86_alternatives_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 alternatives updates from Borislav Petkov:
"This is basically PeterZ's idea to nest the alternative macros to
avoid the need to "spell out" the number of alternates in an
ALTERNATIVE_n() macro and thus have an ever-increasing complexity in
those definitions.
For ease of bisection, the old macros are converted to the new, nested
variants in a step-by-step manner so that in case an issue is
encountered during testing, one can pinpoint the place where it fails
easier.
Because debugging alternatives is a serious pain"
* tag 'x86_alternatives_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/alternatives, kvm: Fix a couple of CALLs without a frame pointer
x86/alternative: Replace the old macros
x86/alternative: Convert the asm ALTERNATIVE_3() macro
x86/alternative: Convert the asm ALTERNATIVE_2() macro
x86/alternative: Convert the asm ALTERNATIVE() macro
x86/alternative: Convert ALTERNATIVE_3()
x86/alternative: Convert ALTERNATIVE_TERNARY()
x86/alternative: Convert alternative_call_2()
x86/alternative: Convert alternative_call()
x86/alternative: Convert alternative_io()
x86/alternative: Convert alternative_input()
x86/alternative: Convert alternative_2()
x86/alternative: Convert alternative()
x86/alternatives: Add nested alternatives macros
x86/alternative: Zap alternative_ternary()
GPIOLIB core:
- rework kfifo handling rework in the character device code
- improve the labeling of GPIOs requested as interrupts and show more info
on interrupt-only GPIOs in debugfs
- remove unused APIs
- unexport interfaces that are only used from the core GPIO code
- drop the return value from gpiochip_set_desc_names() as it cannot fail
- move a string array definition out of a header and into a specific
compilation unit
- convert the last user of gpiochip_get_desc() other than GPIO core to using
a safer alternative
- use array_index_nospec() where applicable
New drivers:
- add a "virtual GPIO consumer" module that allows requesting GPIOs from actual
hardware and driving tests of the in-kernel GPIO API from user-space over
debugfs
- add a GPIO-based "sloppy" logic analyzer module useful for "first glance"
debugging on remote boards
Driver improvements:
- add support for a new model to gpio-pca953x
- lock GPIOs as interrupts in gpio-sim when the lines are requested as irqs
via the simulator domain + some other minor improvements
- improve error reporting in gpio-syscon
- convert gpio-ath79 to using dynamic GPIO base and range
- use pcibios_err_to_errno() for converting PCIBIOS error codes to errno
vaues in gpio-amd8111 and gpio-rdc321x
- allow building gpio-brcmstb for the BCM2835 architecture
DT bindings:
- convert DT bindings for lsi,zevio, mpc8xxx, and atmel to DT schema
- document new properties for aspeed,gpio, fsl,qoriq-gpio and gpio-vf610
- document new compatibles for pca953x and fsl,qoriq-gpio
Documentation:
- document stricter behavior of the GPIO character device uAPI with regards to
reconfiguring requested line without direction set
- clarify the effect of the active-low flag on line values and edges
- remove documentation for the legacy GPIO API in order to stop tempting
people to use it
- document the preference for using pread() for reading edge events in the
sysfs API
Other:
- add an extended initializer to the interrupt simulator allowing to specify
a number of callbacks callers can use to be notified about irqs being
requested and released
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAmaU3n4ACgkQEacuoBRx
13LnYxAAgIfV2MQxdlB8+I3+PrObiom9uykNpoj8Ho3FiGroJEmSi2Vg9NOP5j8n
7THjBDFqKk0USMdzGgWDe+u0oCpql8ONLd+lxPiKzRxkebMVzlumeNzWNEE3wqxO
MdV3AOs9DLM1a4MAuv9E8PgooBVR8Cyqs3tc3wwpZRKoSZBIzwrjFL3tO1P8Dezv
9xoPqIiMJRBZr8jifU/ZRdLG3gYKqgQH1Mha7bm94ebUwA6q/hxtGYAtc2a3Q+dF
6lPrPONJBN6/YwvmoDddm2ppoiyWN7QdX9DQjJvKBcNRTZSE1EAggdh8kNnCoa1d
+PeClIAJLl8ZSkdMS8yvMZIpduK4gTl7yEBMkER1d0JoJLkTowqKsvONgU12Npr2
3rwbpACt/kVt5v0lRwaafj5vnD3NgiiVCeuZZz99ICbrXqe6rYszMIemKDYWWlTn
kEFrTM5ql+dwAfvp8Y9JZf4oOgInHbF3LBKM34PKMW9D0a4aQC/HTfmtHobeNHzn
FmY9ysHjMG6fvuwnkpojW6N3/LLwt+TX8jik9x0O42AE7qXn6a8U2g6RUg6rJOdd
mUiIX3+rn+AaI6eKPvUNp2h391jH1K3hBCAca4cNAIKpqPuE/A/B5RyZZnL5Q7HQ
Iz2G3hSlTBVPf7QWMkBUfMzQMwmvqfoKsZljC5y7YgafJc5cf4k=
=kK88
-----END PGP SIGNATURE-----
Merge tag 'gpio-updates-for-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio updates from Bartosz Golaszewski:
"The majority of added lines are two new modules: the GPIO virtual
consumer module that improves our ability to add automated tests for
the kernel API and the "sloppy" logic analyzer module that uses the
GPIO API to implement a coarse-grained debugging tool for useful for
remote development.
Other than that we have the usual assortment of various driver
extensions, improvements to the core GPIO code, DT-bindings and other
documentation updates as well as an extension to the interrupt
simulator:
GPIOLIB core:
- rework kfifo handling rework in the character device code
- improve the labeling of GPIOs requested as interrupts and show more
info on interrupt-only GPIOs in debugfs
- remove unused APIs
- unexport interfaces that are only used from the core GPIO code
- drop the return value from gpiochip_set_desc_names() as it cannot
fail
- move a string array definition out of a header and into a specific
compilation unit
- convert the last user of gpiochip_get_desc() other than GPIO core
to using a safer alternative
- use array_index_nospec() where applicable
New drivers:
- add a "virtual GPIO consumer" module that allows requesting GPIOs
from actual hardware and driving tests of the in-kernel GPIO API
from user-space over debugfs
- add a GPIO-based "sloppy" logic analyzer module useful for "first
glance" debugging on remote boards
Driver improvements:
- add support for a new model to gpio-pca953x
- lock GPIOs as interrupts in gpio-sim when the lines are requested
as irqs via the simulator domain + some other minor improvements
- improve error reporting in gpio-syscon
- convert gpio-ath79 to using dynamic GPIO base and range
- use pcibios_err_to_errno() for converting PCIBIOS error codes to
errno vaues in gpio-amd8111 and gpio-rdc321x
- allow building gpio-brcmstb for the BCM2835 architecture
DT bindings:
- convert DT bindings for lsi,zevio, mpc8xxx, and atmel to DT schema
- document new properties for aspeed,gpio, fsl,qoriq-gpio and
gpio-vf610
- document new compatibles for pca953x and fsl,qoriq-gpio
Documentation:
- document stricter behavior of the GPIO character device uAPI with
regards to reconfiguring requested line without direction set
- clarify the effect of the active-low flag on line values and edges
- remove documentation for the legacy GPIO API in order to stop
tempting people to use it
- document the preference for using pread() for reading edge events
in the sysfs API
Other:
- add an extended initializer to the interrupt simulator allowing to
specify a number of callbacks callers can use to be notified about
irqs being requested and released"
* tag 'gpio-updates-for-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: (41 commits)
gpio: mc33880: Convert comma to semicolon
gpio: virtuser: actually use the "trimmed" local variable
dt-bindings: gpio: convert Atmel GPIO to json-schema
gpio: virtuser: new virtual testing driver for the GPIO API
dt-bindings: gpio: vf610: Allow gpio-line-names to be set
gpio: sim: lock GPIOs as interrupts when they are requested
genirq/irq_sim: add an extended irq_sim initializer
dt-bindings: gpio: fsl,qoriq-gpio: Add compatible string fsl,ls1046a-gpio
gpiolib: unexport gpiochip_get_desc()
gpio: add sloppy logic analyzer using polling
Documentation: gpio: Reconfiguration with unset direction (uAPI v2)
Documentation: gpio: Reconfiguration with unset direction (uAPI v1)
dt-bindings: gpio: fsl,qoriq-gpio: add common property gpio-line-names
gpio: ath79: convert to dynamic GPIO base allocation
pinctrl: da9062: replace gpiochip_get_desc() with gpio_device_get_desc()
gpiolib: put gpio_suffixes in a single compilation unit
Documentation: gpio: Clarify effect of active low flag on line edges
Documentation: gpio: Clarify effect of active low flag on line values
gpiolib: Remove data-less gpiochip_add() function
gpio: sim: use devm_mutex_init()
...
* Virtual CPU hotplug support for arm64 ACPI systems
* cpufeature infrastructure cleanups and making the FEAT_ECBHB ID bits
visible to guests
* CPU errata: expand the speculative SSBS workaround to more CPUs
* arm64 ACPI:
- acpi=nospcr option to disable SPCR as default console for arm64
- Move some ACPI code (cpuidle, FFH) to drivers/acpi/arm64/
* GICv3, use compile-time PMR values: optimise the way regular IRQs are
masked/unmasked when GICv3 pseudo-NMIs are used, removing the need for
a static key in fast paths by using a priority value chosen
dynamically at boot time
* arm64 perf updates:
- Rework of the IMX PMU driver to enable support for I.MX95
- Enable support for tertiary match groups in the CMN PMU driver
- Initial refactoring of the CPU PMU code to prepare for the fixed
instruction counter introduced by Arm v9.4
- Add missing PMU driver MODULE_DESCRIPTION() strings
- Hook up DT compatibles for recent CPU PMUs
* arm64 kselftest updates:
- Kernel mode NEON fp-stress
- Cleanups, spelling mistakes
* arm64 Documentation update with a minor clarification on TBI
* Miscellaneous:
- Fix missing IPI statistics
- Implement raw_smp_processor_id() using thread_info rather than a
per-CPU variable (better code generation)
- Make MTE checking of in-kernel asynchronous tag faults conditional
on KASAN being enabled
- Minor cleanups, typos
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmaQKN4ACgkQa9axLQDI
XvE0Nw/+JZ6OEQ+DMUHXZfbWanvn1p0nVOoEV3MYVpOeQK1ILYCoDapatLNIlet0
wcja7tohKbL1ifc7GOqlkitu824LMlotncrdOBycRqb/4C5KuJ+XhygFv5hGfX0T
Uh2zbo4w52FPPEUMICfEAHrKT3QB9tv7f66xeUNbWWFqUn3rY02/ZVQVVdw6Zc0e
fVYWGUUoQDR7+9hRkk6tnYw3+9YFVAUAbLWk+DGrW7WsANi6HuJ/rBMibwFI6RkG
SZDZHum6vnwx0Dj9H7WrYaQCvUMm7AlckhQGfPbIFhUk6pWysfJtP5Qk49yiMl7p
oRk/GrSXpiKumuetgTeOHbokiE1Nb8beXx0OcsjCu4RrIaNipAEpH1AkYy5oiKoT
9vKZErMDtQgd96JHFVaXc+A3D2kxVfkc1u7K3TEfVRnZFV7CN+YL+61iyZ+uLxVi
d9xrAmwRsWYFVQzlZG3NWvSeQBKisUA1L8JROlzWc/NFDwTqDGIt/zS4pZNL3+OM
EXW0LyKt7Ijl6vPXKCXqrODRrPlcLc66VMZxofZOl0/dEqyJ+qLL4GUkWZu8lTqO
BqydYnbTSjiDg/ntWjTrD0uJ8c40Qy7KTPEdaPqEIQvkDEsUGlOnhAQjHrnGNb9M
psZtpDW2xm7GykEOcd6rgSz4Xeky2iLsaR4Wc7FTyDS0YRmeG44=
=ob2k
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"The biggest part is the virtual CPU hotplug that touches ACPI,
irqchip. We also have some GICv3 optimisation for pseudo-NMIs that has
been queued via the arm64 tree. Otherwise the usual perf updates,
kselftest, various small cleanups.
Core:
- Virtual CPU hotplug support for arm64 ACPI systems
- cpufeature infrastructure cleanups and making the FEAT_ECBHB ID
bits visible to guests
- CPU errata: expand the speculative SSBS workaround to more CPUs
- GICv3, use compile-time PMR values: optimise the way regular IRQs
are masked/unmasked when GICv3 pseudo-NMIs are used, removing the
need for a static key in fast paths by using a priority value
chosen dynamically at boot time
ACPI:
- 'acpi=nospcr' option to disable SPCR as default console for arm64
- Move some ACPI code (cpuidle, FFH) to drivers/acpi/arm64/
Perf updates:
- Rework of the IMX PMU driver to enable support for I.MX95
- Enable support for tertiary match groups in the CMN PMU driver
- Initial refactoring of the CPU PMU code to prepare for the fixed
instruction counter introduced by Arm v9.4
- Add missing PMU driver MODULE_DESCRIPTION() strings
- Hook up DT compatibles for recent CPU PMUs
Kselftest updates:
- Kernel mode NEON fp-stress
- Cleanups, spelling mistakes
Miscellaneous:
- arm64 Documentation update with a minor clarification on TBI
- Fix missing IPI statistics
- Implement raw_smp_processor_id() using thread_info rather than a
per-CPU variable (better code generation)
- Make MTE checking of in-kernel asynchronous tag faults conditional
on KASAN being enabled
- Minor cleanups, typos"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (69 commits)
selftests: arm64: tags: remove the result script
selftests: arm64: tags_test: conform test to TAP output
perf: add missing MODULE_DESCRIPTION() macros
arm64: smp: Fix missing IPI statistics
irqchip/gic-v3: Fix 'broken_rdists' unused warning when !SMP and !ACPI
ACPI: Add acpi=nospcr to disable ACPI SPCR as default console on ARM64
Documentation: arm64: Update memory.rst for TBI
arm64/cpufeature: Replace custom macros with fields from ID_AA64PFR0_EL1
KVM: arm64: Replace custom macros with fields from ID_AA64PFR0_EL1
perf: arm_pmuv3: Include asm/arm_pmuv3.h from linux/perf/arm_pmuv3.h
perf: arm_v6/7_pmu: Drop non-DT probe support
perf/arm: Move 32-bit PMU drivers to drivers/perf/
perf: arm_pmuv3: Drop unnecessary IS_ENABLED(CONFIG_ARM64) check
perf: arm_pmuv3: Avoid assigning fixed cycle counter with threshold
arm64: Kconfig: Fix dependencies to enable ACPI_HOTPLUG_CPU
perf: imx_perf: add support for i.MX95 platform
perf: imx_perf: fix counter start and config sequence
perf: imx_perf: refactor driver for imx93
perf: imx_perf: let the driver manage the counter usage rather the user
perf: imx_perf: add macro definitions for parsing config attr
...
- Added Michal Koutný as a maintainer.
- Counters in pids.events were behaving inconsistently. pids.events made
properly hierarchical and pids.events.local added.
- misc.peak and misc.events.local added.
- cpuset remote partition creation and cpuset.cpus.exclusive handling
improved.
- Code cleanups, non-critical fixes, doc updates.
- for-6.10-fixes is merged in to receive two non-critical fixes that didn't
trigger pull.
-----BEGIN PGP SIGNATURE-----
iIQEABYKACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCZpSsdw4cdGpAa2VybmVs
Lm9yZwAKCRCxYfJx3gVYGSEMAQDQ5VfcRz+rW20ez5IAgyN3EKIwSbW6pY6jojgj
bJtJSQD/TzA8DoRxcCvTdHcZcwJ2e2zBVcuM8NkZHfSCNiPrrgs=
=5f3I
-----END PGP SIGNATURE-----
Merge tag 'cgroup-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup updates from Tejun Heo:
- Added Michal Koutný as a maintainer
- Counters in pids.events were behaving inconsistently. pids.events
made properly hierarchical and pids.events.local added
- misc.peak and misc.events.local added
- cpuset remote partition creation and cpuset.cpus.exclusive handling
improved
- Code cleanups, non-critical fixes, doc updates
- for-6.10-fixes is merged in to receive two non-critical fixes that
didn't trigger pull
* tag 'cgroup-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (23 commits)
cgroup: Add Michal Koutný as a maintainer
cgroup/misc: Introduce misc.events.local
cgroup/rstat: add force idle show helper
cgroup: Protect css->cgroup write under css_set_lock
cgroup/misc: Introduce misc.peak
cgroup_misc: add kernel-doc comments for enum misc_res_type
cgroup/cpuset: Prevent UAF in proc_cpuset_show()
selftest/cgroup: Update test_cpuset_prs.sh to match changes
cgroup/cpuset: Make cpuset.cpus.exclusive independent of cpuset.cpus
cgroup/cpuset: Delay setting of CS_CPU_EXCLUSIVE until valid partition
selftest/cgroup: Fix test_cpuset_prs.sh problems reported by test robot
cgroup/cpuset: Fix remote root partition creation problem
cgroup: avoid the unnecessary list_add(dying_tasks) in cgroup_exit()
cgroup/cpuset: Optimize isolated partition only generate_sched_domains() calls
cgroup/cpuset: Reduce the lock protecting CS_SCHED_LOAD_BALANCE
kernel/cgroup: cleanup cgroup_base_files when fail to add cgroup_psi_files
selftests: cgroup: Add basic tests for pids controller
selftests: cgroup: Lexicographic order in Makefile
cgroup/pids: Add pids.events.local
cgroup/pids: Make event counters hierarchical
...
o Fix selftest printf format mismatch in expect_str_buf_eq().
o Stop using brk() and sbrk() when testing against musl, which
implements these two functions with ENOMEM.
o Make tests use -Werror to force failure on compiler warnings.
o Add limits for the {u,}intmax_t, ulong and {u,}llong types.
o Implement strtol() and friends.
o Add facility to skip nolibc-specific tests when running against
non-nolibc libraries.
o Implement strerror().
o Also use strerror() on nolibc when running kselftests.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmaVk7sTHHBhdWxtY2tA
a2VybmVsLm9yZwAKCRCevxLzctn7jG64D/9xBRXDbz6DKs3F8hx0hFCEbVKd0jIm
bWvr3sB07x9jFhRM2fGVyNGzGOpWzdj5IPEe2Qj+qon7IENHZUZYFMzMMJJQrR4q
P5lICIzCouTtowceP039Q2Gw7Zzpxxmd4ZGyJzuofHys4dk2O2FbGBF2GoypJ/Ut
88webb1kKFWKaZOU4o9qGRGSUoQml69TP3USWR/JvBohxHh9nLaDglYeLxi0BTYQ
nXtk8QL1fHRKiOsgJKJYfS9QXd4l2dPB5yiN5zhEA8RYPDFtrcYDyD4y1YVgQoiy
G32eLBCSnKaO96tk7aeq9j++X2PBuMi7EUd8DoLjdV5gRdVeiLnmhrrNYI5Eb18r
0wxlbz1nBchOfF6mnGvSH1jfvCTkZ5rVzkNbKZO5vgM0B84tTraHSDawjdFa6G0Y
7AnbfGouUayq1GAIvCjX7TfALOnSBPbeohtJhMaXNGv9bgWvAO5MZk1HbNkVBVlN
LM0EtaEgbawFsPNHbiBt17BlcJuMAMoJXoDj7NwkhOQRwLvmmzd4tbX2PtykgoPc
abX49qSx6Ok9nPiNsq5Tc9JNJVyHgKZiKaiPujX4xQs0x47Hq/NsrZH22Q4dbyQH
+YvswoskE6wKZ4GszF4QA8If7257wV73ZJimR9xC7iNEknkyCbljwcWjWsIvOiot
vA3kUf4JyxbkWA==
=ch3B
-----END PGP SIGNATURE-----
Merge tag 'nolibc.2024.07.15a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull nolibc updates from Paul McKenney:
- Fix selftest printf format mismatch in expect_str_buf_eq()
- Stop using brk() and sbrk() when testing against musl, which
implements these two functions with ENOMEM
- Make tests use -Werror to force failure on compiler warnings
- Add limits for the {u,}intmax_t, ulong and {u,}llong types
- Implement strtol() and friends
- Add facility to skip nolibc-specific tests when running against
non-nolibc libraries
- Implement strerror()
- Also use strerror() on nolibc when running kselftests
* tag 'nolibc.2024.07.15a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
selftests: kselftest: also use strerror() on nolibc
tools/nolibc: implement strerror()
selftests/nolibc: introduce condition to run tests only on nolibc
tools/nolibc: implement strtol() and friends
tools/nolibc: add limits for {u,}intmax_t, ulong and {u,}llong
selftests/nolibc: run-tests.sh: use -Werror by default
selftests/nolibc: disable brk()/sbrk() tests on musl
selftests/nolibc: fix printf format mismatch in expect_str_buf_eq()
This series contains on commit that improves the documentation for the
new __data_racy type qualifier to the data_race() macro's kernel-doc
header and to the LKMM's access-marking documentation.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmaRubITHHBhdWxtY2tA
a2VybmVsLm9yZwAKCRCevxLzctn7jLWlD/99wMLUIfPh1cvWVVyhv8tytWuLJn6y
olwukg9pOCz6WGucECfjAC2kFivSQYxS3b54K697tF4hnABEtpsx7ozkv11kgyR8
niHNv4P+L+0J7CnM/g7gkLrAGosdo+PF7rUhX3u3kpeasVjJ5NyK379jUeTjrDrc
ia34vjbVVNdJ5v0c4ITxbbV/NKecbsadRSqDLzjtNrFPkwo/yfFCrz8ddztadsTd
4jftt/L4Up51QQ8NAgiHsHdp+iY/FRjwd/QtvEXSVUZ38sGseH+eNqn4hMuyTnka
cjnHwAlTbEfAuR3Mdcz25ToiaI6qNxptvvM7E9+LtHuBhI+h6vEvdvPMmSFo48Rv
kzdPix39O5dqpu49nz/Qsmmr/AWn9i3M5oht62YMJ5twoutDFAcc5MMppPT6sNsX
Kq3fn8fjRvdHqgE3jBYOgDDWEZuTxdtcdv8wKfplID/SgyjXL48fghMEl62b2O1d
X6PrHhKARk7RKMndAPk0UNz4m8T5oDmKu9Z9mF8UidjeZjYU0tVYPAoN1H/9lVdH
Kn2DoTW5Oz2A21z37r5SGAro0QBE50ZMYA+xH+Ab3fyfFcd4l4ia/wu06c3AVJFc
DqgKJ7f3YQ7tHGLf0AEpO6o/nNZGxwDHJ5Pjbteu2RwbCcUHmUtKxrM+McPXTvuv
MjW/IAvvPdMxCw==
=r0/X
-----END PGP SIGNATURE-----
Merge tag 'kcsan.2024.07.12a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull KCSAN updates from Paul McKenney:
- improve the documentation for the new __data_racy type qualifier
to the data_race() macro's kernel-doc header and to the LKMM's
access-marking documentation
- add missing MODULE_DESCRIPTION
* tag 'kcsan.2024.07.12a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
kcsan: Add missing MODULE_DESCRIPTION() macro
kcsan: Add example to data_race() kerneldoc header
doc.2024.06.06a: Update Tasks RCU and Tasks Rude RCU description in
Requirements.rst and clarify rcu_assign_pointer() and
rcu_dereference() ordering properties.
fixes.2024.07.04a: Add lockdep assertions for RCU readers, limit inline
wakeups for callback-bypass synchronize_rcu(), add an
rcutree.nohz_full_patience_delay to reduce nohz_full OS jitter,
add Uladzislau Rezki as RCU maintainer, and fix a subtle
callback-migration memory-ordering issue.
mb.2024.06.28a: Remove a number of redundant memory barriers.
nocb.2024.06.03a: Remove unnecessary bypass-list lock-contention
mitigation, use parking API instead of open-coded ad-hoc
equivalent, and upgrade obsolete comments.
rcu-tasks.2024.06.06a: Revert avoidance of a deadlock that can no
longer occur and properly synchronize Tasks Trace RCU checking
of runqueues.
rcutorture.2024.06.06a: Add tests for handling of double-call_rcu()
bug, add missing MODULE_DESCRIPTION, and add a script that
histograms the number of calls to RCU updaters.
srcu.2024.06.18a: Fill out SRCU polled-grace-period API.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmaR7/QTHHBhdWxtY2tA
a2VybmVsLm9yZwAKCRCevxLzctn7jGwAEACJKef2LryG6khoJdorWbvRf1V2k23H
19CxXexCE4UoGsgGST9z1/5rM8kBdNhdhQ0JB9CitW+zGlXpOM79/mO3gALKMj++
YBPw9B5EM622H2cKJGFzoHFSO4X9nM1CCMeuFCo6bVsbWfMtX3ENqsYl2IQy1JkB
pGiKqcNXGWU0mdUcZKs/8ilfLG1NhaLwrkfinlsP9V1+8z8LxxDH5Qh27AT3rIvu
W87OITTZoHlUaDVHYTautHTZoqM381xv9kNoQlS9lpH/gcFOPiO9DLj8NcLjkJ4y
S/OrxOwfQ+BGKwnk8daFQFAc3Nr9KeVAQH7CbOW7guARhj3z97J0+wPm6nZGEE2s
tDzg8zLT9LtbmUypJLurl29+wFE4fPNsnd69XDONbMFN1Ox2tJM3dd/rPCsHSUvz
kEOK9gUreHOv7/Ou6UIHlYVlHY7HHuD7TAsrhaaWk7CEmlY31UKwXG+fMl1FAnSy
F3PcBF/1M687RRFWVeMlug/+0/+ghtc+kZ1YyR79KZR6dI0C7ueQbCBGztCCtFDz
RjrHcDifS0Y2GNQO9+zAyrJvttidRATdYDeFstk+8nnta3CnYzxCp4rn5hs3Ss3N
AJVJm244jR3AcoL4V/tQwiQlYh9ZYN5tZ7qxFiASdtV50Uc8HoIrWXeP0Ar+GHiV
2z/f5fKF4+5clQ==
=7a1C
-----END PGP SIGNATURE-----
Merge tag 'rcu.2024.07.12a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull RCU updates from Paul McKenney:
- Update Tasks RCU and Tasks Rude RCU description in Requirements.rst
and clarify rcu_assign_pointer() and rcu_dereference() ordering
properties
- Add lockdep assertions for RCU readers, limit inline wakeups for
callback-bypass synchronize_rcu(), add an
rcutree.nohz_full_patience_delay to reduce nohz_full OS jitter, add
Uladzislau Rezki as RCU maintainer, and fix a subtle
callback-migration memory-ordering issue
- Remove a number of redundant memory barriers
- Remove unnecessary bypass-list lock-contention mitigation, use
parking API instead of open-coded ad-hoc equivalent, and upgrade
obsolete comments
- Revert avoidance of a deadlock that can no longer occur and properly
synchronize Tasks Trace RCU checking of runqueues
- Add tests for handling of double-call_rcu() bug, add missing
MODULE_DESCRIPTION, and add a script that histograms the number of
calls to RCU updaters
- Fill out SRCU polled-grace-period API
* tag 'rcu.2024.07.12a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (29 commits)
rcu: Fix rcu_barrier() VS post CPUHP_TEARDOWN_CPU invocation
rcu: Eliminate lockless accesses to rcu_sync->gp_count
MAINTAINERS: Add Uladzislau Rezki as RCU maintainer
rcu: Add rcutree.nohz_full_patience_delay to reduce nohz_full OS jitter
rcu/exp: Remove redundant full memory barrier at the end of GP
rcu: Remove full memory barrier on RCU stall printout
rcu: Remove full memory barrier on boot time eqs sanity check
rcu/exp: Remove superfluous full memory barrier upon first EQS snapshot
rcu: Remove superfluous full memory barrier upon first EQS snapshot
rcu: Remove full ordering on second EQS snapshot
srcu: Fill out polled grace-period APIs
srcu: Update cleanup_srcu_struct() comment
srcu: Add NUM_ACTIVE_SRCU_POLL_OLDSTATE
srcu: Disable interrupts directly in srcu_gp_end()
rcu: Disable interrupts directly in rcu_gp_init()
rcu/tree: Reduce wake up for synchronize_rcu() common case
rcu/tasks: Fix stale task snaphot for Tasks Trace
tools/rcu: Add rcu-updaters.sh script
rcutorture: Add missing MODULE_DESCRIPTION() macros
rcutorture: Fix rcu_torture_fwd_cb_cr() data race
...
A simple but odd single-process litmus test acquires and immediately
releases a lock, then calls spin_is_locked(). LKMM acts if it was
a deadlock due to an assumption that spin_is_locked() will follow a
spin_lock() or some other process's spin_unlock(). This litmus test
manages to violate this assumption because the spin_is_locked() follows
the same process's spin_unlock().
This series fixes this bug, reorganizes and optimizes the lock.cat model,
and updates documentation.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmaRwSUTHHBhdWxtY2tA
a2VybmVsLm9yZwAKCRCevxLzctn7jNmcD/0dhWCkytKMLwVke6oLP++HTFK/Uzrb
XsXkq5t0Y+xzf7KKca9XuqEMRnaenLM07Swx0ujn2fH7FrXCNlzHD6R4T2e8r+L5
9TVa1ewRfpjH8tM6YLsMbiH4yRah47F9p6CulRHJiHtyU2pLgThIF0/KrhiYKn2P
aBF4zclt0TJkTgJDaoJ45z6YTWNFMv+DopRwWT9vm0CNS4jaxHrn8CaBtiv1arWe
9CMArXtPS6eJjbZ9RPLSVlNUufLD7QnyPXRPjd7hwBHReEgcfvpAYKfsAZanhSqv
vTp9PRQlu1j9pcjrSCzZ2baBXd0XhuX2wtjNBZXcb2yzIj6V2t5rADKKEG7IONTj
eV0ZpMm5bF6lUZBtr1YowgNPwKktdqIVavXk1Sl/fiUs/B8thKZI/pekOMLwq5An
SdcCO2m6tY5fYvsvIpfRExvx3TGrRHw+64C/jZrGPhLL6d87tTr0uuyZe1XDxv2+
yGkDuKHUGg7lAXbd6sPvPLOej6pOFbb43qUa7AwBtXBrUs9nF1AuegTM8ux+5eCe
qbA6GrvPyeSXOwsy0XD83oOQ4o93f0GdpMnIgQNK32uaEvQJf1MenTcTjXD8RfGv
FqxSqhSGNQSXjmBCRFHujR2lR7aWSEUWTEdV+hnSd8O0q4IG2LMwK+TzbtJXJTl8
KZgRYz7lne7sQw==
=kUWy
-----END PGP SIGNATURE-----
Merge tag 'lkmm.2024.07.12a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull memory model updates from Paul McKenney:
"lkmm: Fix corner-case locking bug and improve documentation
A simple but odd single-process litmus test acquires and immediately
releases a lock, then calls spin_is_locked(). LKMM acts if it was a
deadlock due to an assumption that spin_is_locked() will follow a
spin_lock() or some other process's spin_unlock(). This litmus test
manages to violate this assumption because the spin_is_locked()
follows the same process's spin_unlock().
This series fixes this bug, reorganizes and optimizes the lock.cat
model, and updates documentation"
* tag 'lkmm.2024.07.12a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
tools/memory-model: Code reorganization in lock.cat
tools/memory-model: Fix bug in lock.cat
tools/memory-model: Add access-marking.txt to README
tools/memory-model: Add KCSAN LF mentorship session citation
Merge in late fixes to prepare for the 6.11 net-next PR.
Conflicts:
93c3a96c30 ("net: pse-pd: Do not return EOPNOSUPP if config is null")
4cddb0f15e ("net: ethtool: pse-pd: Fix possible null-deref")
30d7b67277 ("net: ethtool: Add new power limit get and set features")
https://lore.kernel.org/20240715123204.623520bb@canb.auug.org.au/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZpEHCgAKCRCRxhvAZXjc
om+gAQCC4nJqJqs9bJZIItRtZ71GnxZQO3HVohhIlNM2KKh0VgEA47JhD0O0Srfb
CleII4cQWqB32BfB/vMeff6hpNa7SA4=
=7ltk
-----END PGP SIGNATURE-----
Merge tag 'vfs-6.11.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs mount query updates from Christian Brauner:
"This contains work to extend the abilities of listmount() and
statmount() and various fixes and cleanups.
Features:
- Allow iterating through mounts via listmount() from newest to
oldest. This makes it possible for mount(8) to keep iterating the
mount table in reverse order so it gets newest mounts first.
- Relax permissions on listmount() and statmount().
It's not necessary to have capabilities in the initial namespace:
it is sufficient to have capabilities in the owning namespace of
the mount namespace we're located in to list unreachable mounts in
that namespace.
- Extend both listmount() and statmount() to list and stat mounts in
foreign mount namespaces.
Currently the only way to iterate over mount entries in mount
namespaces that aren't in the caller's mount namespace is by
crawling through /proc in order to find /proc/<pid>/mountinfo for
the relevant mount namespace.
This is both very clumsy and hugely inefficient. So extend struct
mnt_id_req with a new member that allows to specify the mount
namespace id of the mount namespace we want to look at.
Luckily internally we already have most of the infrastructure for
this so we just need to expose it to userspace. Give userspace a
way to retrieve the id of a mount namespace via statmount() and
through a new nsfs ioctl() on mount namespace file descriptor.
This comes with appropriate selftests.
- Expose mount options through statmount().
Currently if userspace wants to get mount options for a mount and
with statmount(), they still have to open /proc/<pid>/mountinfo to
parse mount options. Simply the information through statmount()
directly.
Afterwards it's possible to only rely on statmount() and
listmount() to retrieve all and more information than
/proc/<pid>/mountinfo provides.
This comes with appropriate selftests.
Fixes:
- Avoid copying to userspace under the namespace semaphore in
listmount.
Cleanups:
- Simplify the error handling in listmount by relying on our newly
added cleanup infrastructure.
- Refuse invalid mount ids early for both listmount and statmount"
* tag 'vfs-6.11.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fs: reject invalid last mount id early
fs: refuse mnt id requests with invalid ids early
fs: find rootfs mount of the mount namespace
fs: only copy to userspace on success in listmount()
sefltests: extend the statmount test for mount options
fs: use guard for namespace_sem in statmount()
fs: export mount options via statmount()
fs: rename show_mnt_opts -> show_vfsmnt_opts
selftests: add a test for the foreign mnt ns extensions
fs: add an ioctl to get the mnt ns id from nsfs
fs: Allow statmount() in foreign mount namespace
fs: Allow listmount() in foreign mount namespace
fs: export the mount ns id via statmount
fs: keep an index of current mount namespaces
fs: relax permissions for statmount()
listmount: allow listing in reverse order
fs: relax permissions for listmount()
fs: simplify error handling
fs: don't copy to userspace under namespace semaphore
path: add cleanup helper
Merge OPP (operating performance points) and tooling updates for
6.11-rc1:
- Fix missing cleanup on error in _opp_attach_genpd() (Viresh Kumar).
- Introduce an OF helper function to inform if required-opps is used
and drop a redundant in-parameter to _set_opp_level() (Ulf Hansson).
- Update pm-graph to v5.12 which includes fixes and major code revamp
for python3.12 (Todd Brandt).
- Address several assorted issues in the cpupower utility (Roman
Storozhenko).
* pm-opp:
OPP: Introduce an OF helper function to inform if required-opps is used
OPP: Drop a redundant in-parameter to _set_opp_level()
OPP: Fix missing cleanup on error in _opp_attach_genpd()
* pm-tools:
cpupower: fix lib default installation path
cpupower: Disable direct build of the 'bench' subproject
cpupower: Change the var type of the 'monitor' subcommand display mode
cpupower: Remove absent 'v' parameter from monitor man page
cpupower: Improve cpupower build process description
cpupower: Add 'help' target to the main Makefile
cpupower: Replace a dead reference link with working ones
pm-graph: v5.12, code revamp for python3.12
pm-graph: v5.12, fixes
There are a lot of changes in here, though the big bulk of things is
cleanups and simplifications of various kinds which are internally
rather than externally visible. A good chunk of those are DT schema
conversions, but there's also a lot of changes in the code.
Highlights:
- Syncing of features between simple-audio-card and the two
audio-graph cards so there is no reason to stick with an older
driver.
- Support for specifying the order of operations for components within
cards to allow quirking for unusual systems.
- New support for Asahi Kasei AK4619, Cirrus Logic CS530x, Everest
Semiconductors ES8311, NXP i.MX95 and LPC32xx, Qualcomm LPASS v2.5
and WCD937x, Realtek RT1318 and RT1320 and Texas Instruments PCM5242.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmaVJSQACgkQJNaLcl1U
h9C8vwf/Q/wzwY5DSx8JM+qRkhjQdN11ILm5ZL8CD36K5frpv4YuqkHxvI3AO8Yb
+LGLVzmcf6XW4+SGBkXoSOUZOYK726Ld2+BoqTM0isPXHinGdrkcUhUcHKy7qS7g
3MImaVM+nGJGyO718cJ++XnEy7uNkbiA0ztIxy2Ui2Dzxq5LX++tT0IroRxf4AAf
zIFgZpaZz4lueTJ1d0FB7uIG4XHxg4nTn7cSllPhGr5mjiZZhOIwDGE1+9GQC44q
k8oMOACrh887qDSScCbW+pplLJunlei2EC28oVNxsUkNaxl+ItEj1s+X0XH1id6u
FZquRQPHZ9mJ0/QTlzo2l4g4EvOxKg==
=YXaR
-----END PGP SIGNATURE-----
Merge tag 'asoc-v6.11' of https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Updates for for v6.11
There are a lot of changes in here, though the big bulk of things is
cleanups and simplifications of various kinds which are internally
rather than externally visible. A good chunk of those are DT schema
conversions, but there's also a lot of changes in the code.
Highlights:
- Syncing of features between simple-audio-card and the two
audio-graph cards so there is no reason to stick with an older
driver.
- Support for specifying the order of operations for components within
cards to allow quirking for unusual systems.
- New support for Asahi Kasei AK4619, Cirrus Logic CS530x, Everest
Semiconductors ES8311, NXP i.MX95 and LPC32xx, Qualcomm LPASS v2.5
and WCD937x, Realtek RT1318 and RT1320 and Texas Instruments PCM5242.
The goal is to check that the source address selected by the kernel is
routable when a leaking route is used. ICMP, TCP and UDP connections are
tested.
The symmetric topology is enough for this test.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240710081521.3809742-5-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Lately, an additional locking was added by commit c0a40097f0
("drivers: core: synchronize really_probe() and dev_uevent()"). The
locking protects dev_uevent() calling. This function is used to send
messages from the kernel to user space. Uevent messages notify user space
about changes in device states, such as when a device is added, removed,
or changed. These messages are used by udev (or other similar user-space
tools) to apply device-specific rules.
After reloading devlink instance, udev events should be processed. This
locking causes a short delay of udev events handling.
One example for useful udev rule is renaming ports. 'forwading.config'
can be configured to use names after udev rules are applied. Some tests run
devlink_reload() and immediately use the updated names. This worked before
the above mentioned commit was pushed, but now the delay of uevent messages
causes that devlink_reload() returns before udev events are handled and
tests fail.
Adjust devlink_reload() to not assume that udev events are already
processed when devlink reload is done, instead, wait for udev events to
ensure they are processed before returning from the function.
Without this patch:
TESTS='rif_mac_profile' ./resource_scale.sh
TEST: 'rif_mac_profile' 4 [ OK ]
sysctl: cannot stat /proc/sys/net/ipv6/conf/swp1/disable_ipv6: No such file or directory
sysctl: cannot stat /proc/sys/net/ipv6/conf/swp1/disable_ipv6: No such file or directory
sysctl: cannot stat /proc/sys/net/ipv6/conf/swp2/disable_ipv6: No such file or directory
sysctl: cannot stat /proc/sys/net/ipv6/conf/swp2/disable_ipv6: No such file or directory
Cannot find device "swp1"
Cannot find device "swp2"
TEST: setup_wait_dev (: Interface swp1 does not come up.) [FAIL]
With this patch:
$ TESTS='rif_mac_profile' ./resource_scale.sh
TEST: 'rif_mac_profile' 4 [ OK ]
TEST: 'rif_mac_profile' overflow 5 [ OK ]
This is relevant not only for this test.
Fixes: bc7cbb1e9f ("selftests: forwarding: Add devlink_lib.sh")
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/89367666e04b38a8993027f1526801ca327ab96a.1720709333.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
* kvm-arm64/ctr-el0:
: Support for user changes to CTR_EL0, courtesy of Sebastian Ott
:
: Allow userspace to change the guest-visible value of CTR_EL0 for a VM,
: so long as the requested value represents a subset of features supported
: by hardware. In other words, prevent the VMM from over-promising the
: capabilities of hardware.
:
: Make this happen by fitting CTR_EL0 into the existing infrastructure for
: feature ID registers.
KVM: selftests: Assert that MPIDR_EL1 is unchanged across vCPU reset
KVM: arm64: nv: Unfudge ID_AA64PFR0_EL1 masking
KVM: selftests: arm64: Test writes to CTR_EL0
KVM: arm64: rename functions for invariant sys regs
KVM: arm64: show writable masks for feature registers
KVM: arm64: Treat CTR_EL0 as a VM feature ID register
KVM: arm64: unify code to prepare traps
KVM: arm64: nv: Use accessors for modifying ID registers
KVM: arm64: Add helper for writing ID regs
KVM: arm64: Use read-only helper for reading VM ID registers
KVM: arm64: Make idregs debugfs iterator search sysreg table directly
KVM: arm64: Get sys_reg encoding from descriptor in idregs_debug_show()
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
It looks like we missed these two errors recently:
- SC2068: Double quote array expansions to avoid re-splitting elements.
- SC2145: Argument mixes string and array. Use * or separate argument.
Two simple fixes, it is not supposed to change the behaviour as the
variable names should not have any spaces in their names. Still, better
to fix them to easily spot new issues.
Fixes: f265d3119a ("selftests: mptcp: lib: use setup/cleanup_ns helpers")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240712-upstream-net-next-20240712-selftests-mptcp-fix-shellcheck-v1-1-1cb7180db40a@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
tcp_ao/self-connect.c checked the following SNMP stats before/after
connect() to confirm that the test exercises the simultaneous connect()
path.
* TCPChallengeACK
* TCPSYNChallenge
But the stats should not be counted for self-connect in the first place,
and the assumption is no longer true.
Let's remove the check.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Dmitry Safonov <dima@arista.com>
Link: https://patch.msgid.link/20240710171246.87533-3-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add install target for vsock to make Yocto easy to install the images.
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20240710122728.45044-1-peng.fan@oss.nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZpGVmAAKCRDbK58LschI
gxB4AQCgquQis63yqTI36j4iXBT+TuxHEBNoQBSLyzYdrLS1dgD/S5DRJDA+3LD+
394hn/VtB1qvX5vaqjsov4UIwSMyxA0=
=OhSn
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2024-07-12
We've added 23 non-merge commits during the last 3 day(s) which contain
a total of 18 files changed, 234 insertions(+), 243 deletions(-).
The main changes are:
1) Improve BPF verifier by utilizing overflow.h helpers to check
for overflows, from Shung-Hsi Yu.
2) Fix NULL pointer dereference in resolve_prog_type() for BPF_PROG_TYPE_EXT
when attr->attach_prog_fd was not specified, from Tengda Wu.
3) Fix arm64 BPF JIT when generating code for BPF trampolines with
BPF_TRAMP_F_CALL_ORIG which corrupted upper address bits,
from Puranjay Mohan.
4) Remove test_run callback from lwt_seg6local_prog_ops which never worked
in the first place and caused syzbot reports,
from Sebastian Andrzej Siewior.
5) Relax BPF verifier to accept non-zero offset on KF_TRUSTED_ARGS/
/KF_RCU-typed BPF kfuncs, from Matt Bobrowski.
6) Fix a long standing bug in libbpf with regards to handling of BPF
skeleton's forward and backward compatibility, from Andrii Nakryiko.
7) Annotate btf_{seq,snprintf}_show functions with __printf,
from Alan Maguire.
8) BPF selftest improvements to reuse common network helpers in sk_lookup
test and dropping the open-coded inetaddr_len() and make_socket() ones,
from Geliang Tang.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (23 commits)
selftests/bpf: Test for null-pointer-deref bugfix in resolve_prog_type()
bpf: Fix null pointer dereference in resolve_prog_type() for BPF_PROG_TYPE_EXT
selftests/bpf: DENYLIST.aarch64: Skip fexit_sleep again
bpf: use check_sub_overflow() to check for subtraction overflows
bpf: use check_add_overflow() to check for addition overflows
bpf: fix overflow check in adjust_jmp_off()
bpf: Eliminate remaining "make W=1" warnings in kernel/bpf/btf.o
bpf: annotate BTF show functions with __printf
bpf, arm64: Fix trampoline for BPF_TRAMP_F_CALL_ORIG
selftests/bpf: Close obj in error path in xdp_adjust_tail
selftests/bpf: Null checks for links in bpf_tcp_ca
selftests/bpf: Use connect_fd_to_fd in sk_lookup
selftests/bpf: Use start_server_addr in sk_lookup
selftests/bpf: Use start_server_str in sk_lookup
selftests/bpf: Close fd in error path in drop_on_reuseport
selftests/bpf: Add ASSERT_OK_FD macro
selftests/bpf: Add backlog for network_helper_opts
selftests/bpf: fix compilation failure when CONFIG_NF_FLOW_TABLE=m
bpf: Remove tst_run from lwt_seg6local_prog_ops.
bpf: relax zero fixed offset constraint on KF_TRUSTED_ARGS/KF_RCU
...
====================
Link: https://patch.msgid.link/20240712212448.5378-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Extend existing proc-pid-vm.c tests with PROCMAP_QUERY ioctl() API. Test
a few successful and negative cases, validating querying filtering and
exact vs next VMA logic works as expected.
Link: https://lkml.kernel.org/r/20240627170900.1672542-7-andrii@kernel.org
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
We need this UAPI header in tools/include subdirectory for using it from
BPF selftests.
Link: https://lkml.kernel.org/r/20240627170900.1672542-6-andrii@kernel.org
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Since the memfd pages associated with a udmabuf may be migrated as part of
udmabuf create, we need to verify the data coherency after successful
migration. The new tests added in this patch try to do just that using 4k
sized pages and also 2 MB sized huge pages for the memfd.
Successful completion of the tests would mean that there is no disconnect
between the memfd pages and the ones associated with a udmabuf. And,
these tests can also be augmented in the future to test newer udmabuf
features (such as handling memfd hole punch).
The idea for these tests comes from a patch by Mike Kravetz here:
https://lists.freedesktop.org/archives/dri-devel/2023-June/410623.html
v1->v2: (suggestions from Shuah)
- Use ksft_* functions to print and capture results of tests
- Use appropriate KSFT_* status codes for exit()
- Add Mike Kravetz's suggested-by tag
Link: https://lkml.kernel.org/r/20240624063952.1572359-10-vivek.kasireddy@intel.com
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Suggested-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Hugh Dickins <hughd@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Dongwon Kim <dongwon.kim@intel.com>
Cc: Junxiao Chang <junxiao.chang@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This test verifies that resolve_prog_type() works as expected when
`attach_prog_fd` is not passed in.
`prog->aux->dst_prog` in resolve_prog_type() is assigned by
`attach_prog_fd`, and would be NULL if `attach_prog_fd` is not provided.
Loading EXT prog with bpf_dynptr_from_skb() kfunc call in this way will
lead to null-pointer-deref.
Verify that the null-pointer-deref bug in resolve_prog_type() is fixed.
Signed-off-by: Tengda Wu <wutengda@huaweicloud.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240711145819.254178-3-wutengda@huaweicloud.com
This is a bug found when implementing pretty-printing for the
landlock_add_rule system call, I decided to send this patch separately
because this is a serious bug that should be fixed fast.
I wrote a test program to do landlock_add_rule syscall in a loop,
yet perf trace -e landlock_add_rule freezes, giving no output.
This bug is introduced by the false understanding of the variable "key"
below:
```
for (key = 0; key < trace->sctbl->syscalls.nr_entries; ++key) {
struct syscall *sc = trace__syscall_info(trace, NULL, key);
...
}
```
The code above seems right at the beginning, but when looking at
syscalltbl.c, I found these lines:
```
for (i = 0; i <= syscalltbl_native_max_id; ++i)
if (syscalltbl_native[i])
++nr_entries;
entries = tbl->syscalls.entries = malloc(sizeof(struct syscall) * nr_entries);
...
for (i = 0, j = 0; i <= syscalltbl_native_max_id; ++i) {
if (syscalltbl_native[i]) {
entries[j].name = syscalltbl_native[i];
entries[j].id = i;
++j;
}
}
```
meaning the key is merely an index to traverse the syscall table,
instead of the actual syscall id for this particular syscall.
So if one uses key to do trace__syscall_info(trace, NULL, key), because
key only goes up to trace->sctbl->syscalls.nr_entries, for example, on
my X86_64 machine, this number is 373, it will end up neglecting all
the rest of the syscall, in my case, everything after `rseq`, because
the traversal will stop at 373, and `rseq` is the last syscall whose id
is lower than 373
in tools/perf/arch/x86/include/generated/asm/syscalls_64.c:
```
...
[334] = "rseq",
[424] = "pidfd_send_signal",
...
```
The reason why the key is scrambled but perf trace works well is that
key is used in trace__syscall_info(trace, NULL, key) to do
trace->syscalls.table[id], this makes sure that the struct syscall returned
actually has an id the same value as key, making the later bpf_prog
matching all correct.
After fixing this bug, I can do perf trace on 38 more syscalls, and
because more syscalls are visible, we get 8 more syscalls that can be
augmented.
before:
perf $ perf trace -vv --max-events=1 |& grep Reusing
Reusing "open" BPF sys_enter augmenter for "stat"
Reusing "open" BPF sys_enter augmenter for "lstat"
Reusing "open" BPF sys_enter augmenter for "access"
Reusing "connect" BPF sys_enter augmenter for "accept"
Reusing "sendto" BPF sys_enter augmenter for "recvfrom"
Reusing "connect" BPF sys_enter augmenter for "bind"
Reusing "connect" BPF sys_enter augmenter for "getsockname"
Reusing "connect" BPF sys_enter augmenter for "getpeername"
Reusing "open" BPF sys_enter augmenter for "execve"
Reusing "open" BPF sys_enter augmenter for "truncate"
Reusing "open" BPF sys_enter augmenter for "chdir"
Reusing "open" BPF sys_enter augmenter for "mkdir"
Reusing "open" BPF sys_enter augmenter for "rmdir"
Reusing "open" BPF sys_enter augmenter for "creat"
Reusing "open" BPF sys_enter augmenter for "link"
Reusing "open" BPF sys_enter augmenter for "unlink"
Reusing "open" BPF sys_enter augmenter for "symlink"
Reusing "open" BPF sys_enter augmenter for "readlink"
Reusing "open" BPF sys_enter augmenter for "chmod"
Reusing "open" BPF sys_enter augmenter for "chown"
Reusing "open" BPF sys_enter augmenter for "lchown"
Reusing "open" BPF sys_enter augmenter for "mknod"
Reusing "open" BPF sys_enter augmenter for "statfs"
Reusing "open" BPF sys_enter augmenter for "pivot_root"
Reusing "open" BPF sys_enter augmenter for "chroot"
Reusing "open" BPF sys_enter augmenter for "acct"
Reusing "open" BPF sys_enter augmenter for "swapon"
Reusing "open" BPF sys_enter augmenter for "swapoff"
Reusing "open" BPF sys_enter augmenter for "delete_module"
Reusing "open" BPF sys_enter augmenter for "setxattr"
Reusing "open" BPF sys_enter augmenter for "lsetxattr"
Reusing "openat" BPF sys_enter augmenter for "fsetxattr"
Reusing "open" BPF sys_enter augmenter for "getxattr"
Reusing "open" BPF sys_enter augmenter for "lgetxattr"
Reusing "openat" BPF sys_enter augmenter for "fgetxattr"
Reusing "open" BPF sys_enter augmenter for "listxattr"
Reusing "open" BPF sys_enter augmenter for "llistxattr"
Reusing "open" BPF sys_enter augmenter for "removexattr"
Reusing "open" BPF sys_enter augmenter for "lremovexattr"
Reusing "fsetxattr" BPF sys_enter augmenter for "fremovexattr"
Reusing "open" BPF sys_enter augmenter for "mq_open"
Reusing "open" BPF sys_enter augmenter for "mq_unlink"
Reusing "fsetxattr" BPF sys_enter augmenter for "add_key"
Reusing "fremovexattr" BPF sys_enter augmenter for "request_key"
Reusing "fremovexattr" BPF sys_enter augmenter for "inotify_add_watch"
Reusing "fremovexattr" BPF sys_enter augmenter for "mkdirat"
Reusing "fremovexattr" BPF sys_enter augmenter for "mknodat"
Reusing "fremovexattr" BPF sys_enter augmenter for "fchownat"
Reusing "fremovexattr" BPF sys_enter augmenter for "futimesat"
Reusing "fremovexattr" BPF sys_enter augmenter for "newfstatat"
Reusing "fremovexattr" BPF sys_enter augmenter for "unlinkat"
Reusing "fremovexattr" BPF sys_enter augmenter for "linkat"
Reusing "open" BPF sys_enter augmenter for "symlinkat"
Reusing "fremovexattr" BPF sys_enter augmenter for "readlinkat"
Reusing "fremovexattr" BPF sys_enter augmenter for "fchmodat"
Reusing "fremovexattr" BPF sys_enter augmenter for "faccessat"
Reusing "fremovexattr" BPF sys_enter augmenter for "utimensat"
Reusing "connect" BPF sys_enter augmenter for "accept4"
Reusing "fremovexattr" BPF sys_enter augmenter for "name_to_handle_at"
Reusing "fremovexattr" BPF sys_enter augmenter for "renameat2"
Reusing "open" BPF sys_enter augmenter for "memfd_create"
Reusing "fremovexattr" BPF sys_enter augmenter for "execveat"
Reusing "fremovexattr" BPF sys_enter augmenter for "statx"
after
perf $ perf trace -vv --max-events=1 |& grep Reusing
Reusing "open" BPF sys_enter augmenter for "stat"
Reusing "open" BPF sys_enter augmenter for "lstat"
Reusing "open" BPF sys_enter augmenter for "access"
Reusing "connect" BPF sys_enter augmenter for "accept"
Reusing "sendto" BPF sys_enter augmenter for "recvfrom"
Reusing "connect" BPF sys_enter augmenter for "bind"
Reusing "connect" BPF sys_enter augmenter for "getsockname"
Reusing "connect" BPF sys_enter augmenter for "getpeername"
Reusing "open" BPF sys_enter augmenter for "execve"
Reusing "open" BPF sys_enter augmenter for "truncate"
Reusing "open" BPF sys_enter augmenter for "chdir"
Reusing "open" BPF sys_enter augmenter for "mkdir"
Reusing "open" BPF sys_enter augmenter for "rmdir"
Reusing "open" BPF sys_enter augmenter for "creat"
Reusing "open" BPF sys_enter augmenter for "link"
Reusing "open" BPF sys_enter augmenter for "unlink"
Reusing "open" BPF sys_enter augmenter for "symlink"
Reusing "open" BPF sys_enter augmenter for "readlink"
Reusing "open" BPF sys_enter augmenter for "chmod"
Reusing "open" BPF sys_enter augmenter for "chown"
Reusing "open" BPF sys_enter augmenter for "lchown"
Reusing "open" BPF sys_enter augmenter for "mknod"
Reusing "open" BPF sys_enter augmenter for "statfs"
Reusing "open" BPF sys_enter augmenter for "pivot_root"
Reusing "open" BPF sys_enter augmenter for "chroot"
Reusing "open" BPF sys_enter augmenter for "acct"
Reusing "open" BPF sys_enter augmenter for "swapon"
Reusing "open" BPF sys_enter augmenter for "swapoff"
Reusing "open" BPF sys_enter augmenter for "delete_module"
Reusing "open" BPF sys_enter augmenter for "setxattr"
Reusing "open" BPF sys_enter augmenter for "lsetxattr"
Reusing "openat" BPF sys_enter augmenter for "fsetxattr"
Reusing "open" BPF sys_enter augmenter for "getxattr"
Reusing "open" BPF sys_enter augmenter for "lgetxattr"
Reusing "openat" BPF sys_enter augmenter for "fgetxattr"
Reusing "open" BPF sys_enter augmenter for "listxattr"
Reusing "open" BPF sys_enter augmenter for "llistxattr"
Reusing "open" BPF sys_enter augmenter for "removexattr"
Reusing "open" BPF sys_enter augmenter for "lremovexattr"
Reusing "fsetxattr" BPF sys_enter augmenter for "fremovexattr"
Reusing "open" BPF sys_enter augmenter for "mq_open"
Reusing "open" BPF sys_enter augmenter for "mq_unlink"
Reusing "fsetxattr" BPF sys_enter augmenter for "add_key"
Reusing "fremovexattr" BPF sys_enter augmenter for "request_key"
Reusing "fremovexattr" BPF sys_enter augmenter for "inotify_add_watch"
Reusing "fremovexattr" BPF sys_enter augmenter for "mkdirat"
Reusing "fremovexattr" BPF sys_enter augmenter for "mknodat"
Reusing "fremovexattr" BPF sys_enter augmenter for "fchownat"
Reusing "fremovexattr" BPF sys_enter augmenter for "futimesat"
Reusing "fremovexattr" BPF sys_enter augmenter for "newfstatat"
Reusing "fremovexattr" BPF sys_enter augmenter for "unlinkat"
Reusing "fremovexattr" BPF sys_enter augmenter for "linkat"
Reusing "open" BPF sys_enter augmenter for "symlinkat"
Reusing "fremovexattr" BPF sys_enter augmenter for "readlinkat"
Reusing "fremovexattr" BPF sys_enter augmenter for "fchmodat"
Reusing "fremovexattr" BPF sys_enter augmenter for "faccessat"
Reusing "fremovexattr" BPF sys_enter augmenter for "utimensat"
Reusing "connect" BPF sys_enter augmenter for "accept4"
Reusing "fremovexattr" BPF sys_enter augmenter for "name_to_handle_at"
Reusing "fremovexattr" BPF sys_enter augmenter for "renameat2"
Reusing "open" BPF sys_enter augmenter for "memfd_create"
Reusing "fremovexattr" BPF sys_enter augmenter for "execveat"
Reusing "fremovexattr" BPF sys_enter augmenter for "statx"
TL;DR:
These are the new syscalls that can be augmented
Reusing "openat" BPF sys_enter augmenter for "open_tree"
Reusing "openat" BPF sys_enter augmenter for "openat2"
Reusing "openat" BPF sys_enter augmenter for "mount_setattr"
Reusing "openat" BPF sys_enter augmenter for "move_mount"
Reusing "open" BPF sys_enter augmenter for "fsopen"
Reusing "openat" BPF sys_enter augmenter for "fspick"
Reusing "openat" BPF sys_enter augmenter for "faccessat2"
Reusing "openat" BPF sys_enter augmenter for "fchmodat2"
as for the perf trace output:
before
perf $ perf trace -e faccessat2 --max-events=1
[no output]
after
perf $ ./perf trace -e faccessat2 --max-events=1
0.000 ( 0.037 ms): waybar/958 faccessat2(dfd: 40, filename: "uevent") = 0
P.S. The reason why this bug was not found in the past five years is
probably because it only happens to the newer syscalls whose id is
greater, for instance, faccessat2 of id 439, which not a lot of people
care about when using perf trace.
[Arnaldo]: notes
That and the fact that the BPF code was hidden before having to use -e,
that got changed kinda recently when we switched to using BPF skels for
augmenting syscalls in 'perf trace':
⬢[acme@toolbox perf-tools-next]$ git log --oneline tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c
a9f4c6c999 perf trace: Collect sys_nanosleep first argument
29d16de26d perf augmented_raw_syscalls.bpf: Move 'struct timespec64' to vmlinux.h
5069211e2f perf trace: Use the right bpf_probe_read(_str) variant for reading user data
33b725ce7b perf trace: Avoid compile error wrt redefining bool
7d9642311b perf bpf augmented_raw_syscalls: Add an assert to make sure sizeof(augmented_arg->value) is a power of two.
262b54b6c9 perf bpf augmented_raw_syscalls: Add an assert to make sure sizeof(saddr) is a power of two.
1836480429 perf bpf_skel augmented_raw_syscalls: Cap the socklen parameter using &= sizeof(saddr)
cd2cece61a perf trace: Tidy comments related to BPF + syscall augmentation
5e6da6be30 perf trace: Migrate BPF augmentation to use a skeleton
⬢[acme@toolbox perf-tools-next]$
⬢[acme@toolbox perf-tools-next]$ git show --oneline --pretty=reference 5e6da6be30 | head -1
5e6da6be30 (perf trace: Migrate BPF augmentation to use a skeleton, 2023-08-10)
⬢[acme@toolbox perf-tools-next]$
I.e. from August, 2023.
One had as well to ask for BUILD_BPF_SKEL=1, which now is default if all
it needs is available on the system.
I simplified the code to not expose the 'struct syscall' outside of
tools/perf/util/syscalltbl.c, instead providing a function to go from
the index to the syscall id:
int syscalltbl__id_at_idx(struct syscalltbl *tbl, int idx);
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/lkml/ZmhlAxbVcAKoPTg8@x1
Link: https://lore.kernel.org/r/20240705132059.853205-2-howardchu95@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Various files had been missed from having accessor functions added for
the sake of dso reference count checking. Add the function calls and
missing dso accessor functions.
Fixes: ee756ef749 ("perf dso: Add reference count checking and accessor functions")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20240704011745.1021288-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
It is possible that memory events are not supported on all CPUs.
Prints a warning by dumping the enabled CPU maps in this case.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: John Garry <john.g.garry@oracle.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: coresight@lists.linaro.org
Link: https://lore.kernel.org/r/20240706152035.86983-3-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
A platform can have more than one Arm SPE PMU. For example, a system
with multiple clusters may have each cluster enabled with its own Arm
SPE instance. In such case, the PMU devices will be named 'arm_spe_0',
'arm_spe_1', and so on.
Currently, the tool only supports 'arm_spe_0'. This commit extends
support to multiple Arm SPE PMUs by detecting the substring 'arm_spe_'.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: John Garry <john.g.garry@oracle.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: coresight@lists.linaro.org
Link: https://lore.kernel.org/r/20240706152035.86983-2-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Change the unused var in 'arch/x86/entry/syscalls/syscalltbl.sh' to '_'
when reading from '$sorted_table'. This change allows the script to pass
tests of ShellCheck before and after version 0.7.2 at the same time.
When building in arch x86, syscalltbl.sh got a ShellCheck warning, which
makes compilation error:
In arch/x86/entry/syscalls/syscalltbl.sh line 27:
while read nr _abi name entry _compat; do
^-^ SC2034: abi appears unused.
Verify use (or export if used externally).
^----^ SC2034: compat appears unused.
Verify use (or export if used externally).
The script reads unused param abi and compat. It uses format '_xxx' to
indicate dummy vars, which won't work properly when ShellCheck <= 0.7.2.
According to SC2034, the more general way of writing is to use directly
'_' to indicate discarding vars. 'entry' is also replaced by '_' because
it just happens to be defined in emit function, otherwise it will lead
to some misunderstandings.
Link: https://www.shellcheck.net/wiki/SC2034
Signed-off-by: Haoze Xie <royenheart@gmail.com>
Signed-off-by: Yuan Tan <tanyuan@tinylab.org>
Link: https://lore.kernel.org/r/2143cab4cd8468c88860f4e5e382d0e6b4d89ac9.1720372178.git.royenheart@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Modified the object of 'memset' from '&lost.lost' to '&lost' in
record__read_lost_samples. This allows 'memset' to access memory properly
without causing out-of-bounds problems.
The problems got from builtin-record.c are:
In file included from /usr/include/string.h:495,
from util/parse-events.h:13,
from builtin-record.c:14:
In function 'memset',
inlined from 'record__read_lost_samples' at
builtin-record.c:1958:6,
inlined from '__cmd_record.constprop' at builtin-record.c:2817:2:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:71:10: error:
'__builtin_memset' offset [17, 64] from the object at 'lost' is out
of the bounds of referenced subobject 'lost' with type
'struct perf_record_lost_samples' at offset 0 [-Werror=array-bounds]
71|return __builtin___memset_chk (__dest,__ch,__len,__bos0 (__dest));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The error arised when performing a memset operation on the 'lost' variable,
the bytes of 'sizeof(lost)' exceeds that of '&lost.lost', which are 64
and 16.
Fixes: 6c1785cd75 ("perf record: Ensure space for lost samples")
Signed-off-by: Haoze Xie <royenheart@gmail.com>
Signed-off-by: Yuan Tan <tanyuan@tinylab.org>
Link: https://lore.kernel.org/r/11e12f171b846577cac698cd3999db3d7f6c4d03.1720372317.git.royenheart@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
By default, perf sched map prints sched-in events for all the tasks
which may not be required all the time as it prints lot of symbols
and rows to the terminal.
With --task-name option, one could specify the specific task name
for which the map has to be shown. This would help in analyzing the
CPU usage patterns easier for that specific task. Since multiple
PID's might have the same task name, using task-name filter
would be more useful for debugging.
For other tasks, instead of printing the symbol, '-' is printed and
the same '.' is used to represent idle. '-' is used instead of symbol
for other tasks because it helps in clear visualization of task
of interest and secondly the symbol itself doesn't mean anything
because the sched-in of that symbol will not be printed(first sched-in
contains pid and the corresponding symbol).
When using the --task-name option, the sched-out time is represented
by a '*-'. Since not all task sched-in events are printed, the sched-out
time of the relevant task might be lost. This representation ensures
that the sched-out time of the interested task is not overlooked.
6.10.0-rc1
==========
*A0 131040.639793 secs A0 => migration/0:19
*. 131040.639801 secs . => swapper:0
. *B0 131040.639830 secs B0 => migration/1:24
. *. 131040.639836 secs
. . *C0 131040.640108 secs C0 => migration/2:30
. . *. 131040.640163 secs
. . . *D0 131040.640386 secs D0 => migration/3:36
. . . *. 131040.640395 secs
6.10.0-rc1 + patch (--task-name wdavdaemon)
=============
. *A0 . . . . - . 131040.641346 secs A0 => wdavdaemon:62509
. A0 *B0 . . . - . 131040.641378 secs B0 => wdavdaemon:62274
- *- B0 . . . - . 131040.641379 secs
*C0 . B0 . . . . . 131040.641572 secs C0 => wdavdaemon:62283
C0 . B0 . *D0 . . . 131040.641572 secs D0 => wdavdaemon:62277
C0 . B0 . D0 . *E0 . 131040.641578 secs E0 => wdavdaemon:62270
*- . B0 . D0 . E0 . 131040.641581 secs
. . B0 . D0 . *- . 131040.641583 secs
Reviewed-and-tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Cc: Chen Yu <yu.c.chen@intel.com>
Link: https://lore.kernel.org/r/20240707182716.22054-2-vineethr@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Revert commit 90dc946059 ("selftests/bpf: DENYLIST.aarch64: Remove
fexit_sleep") again. The fix in 19d3c179a3 ("bpf, arm64: Fix trampoline
for BPF_TRAMP_F_CALL_ORIG") does not address all of the issues and BPF
CI is still hanging and timing out:
https://github.com/kernel-patches/bpf/actions/runs/9905842936/job/27366435436
[...]
#89/11 fexit_bpf2bpf/func_replace_global_func:OK
#89/12 fexit_bpf2bpf/fentry_to_cgroup_bpf:OK
#89/13 fexit_bpf2bpf/func_replace_progmap:OK
#89 fexit_bpf2bpf:OK
Error: The operation was canceled.
Thus more investigation work & fixing is needed before the test can be put
in place again.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/bpf/20240705145009.32340-1-puranjay@kernel.org
Andrew Jones <ajones@ventanamicro.com> says:
Zawrs provides two instructions (wrs.nto and wrs.sto), where both are
meant to allow the hart to enter a low-power state while waiting on a
store to a memory location. The instructions also both wait an
implementation-defined "short" duration (unless the implementation
terminates the stall for another reason). The difference is that while
wrs.sto will terminate when the duration elapses, wrs.nto, depending on
configuration, will either just keep waiting or an ILL exception will be
raised. Linux will use wrs.nto, so if platforms have an implementation
which falls in the "just keep waiting" category (which is not expected),
then it should _not_ advertise Zawrs in the hardware description.
Like wfi (and with the same {m,h}status bits to configure it), when
wrs.nto is configured to raise exceptions it's expected that the higher
privilege level will see the instruction was a wait instruction, do
something, and then resume execution following the instruction. For
example, KVM does configure exceptions for wfi (hstatus.VTW=1) and
therefore also for wrs.nto. KVM does this for wfi since it's better to
allow other tasks to be scheduled while a VCPU waits for an interrupt.
For waits such as those where wrs.nto/sto would be used, which are
typically locks, it is also a good idea for KVM to be involved, as it
can attempt to schedule the lock holding VCPU.
This series starts with Christoph's addition of the riscv
smp_cond_load_relaxed function which applies wrs.sto when available.
That patch has been reworked to use wrs.nto and to use the same approach
as Arm for the wait loop, since we can't have arbitrary C code between
the load-reserved and the wrs. Then, hwprobe support is added (since the
instructions are also usable from usermode), and finally KVM is
taught about wrs.nto, allowing guests to see and use the Zawrs
extension.
We still don't have test results from hardware, and it's not possible to
prove that using Zawrs is a win when testing on QEMU, not even when
oversubscribing VCPUs to guests. However, it is possible to use KVM
selftests to force a scenario where we can prove Zawrs does its job and
does it well. [4] is a test which does this and, on my machine, without
Zawrs it takes 16 seconds to complete and with Zawrs it takes 0.25
seconds.
This series is also available here [1]. In order to use QEMU for testing
a build with [2] is needed. In order to enable guests to use Zawrs with
KVM using kvmtool, the branch at [3] may be used.
[1] https://github.com/jones-drew/linux/commits/riscv/zawrs-v3/
[2] https://lore.kernel.org/all/20240312152901.512001-2-ajones@ventanamicro.com/
[3] https://github.com/jones-drew/kvmtool/commits/riscv/zawrs/
[4] cb2beccebc
Link: https://lore.kernel.org/r/20240426100820.14762-8-ajones@ventanamicro.com
* b4-shazam-merge:
KVM: riscv: selftests: Add Zawrs extension to get-reg-list test
KVM: riscv: Support guest wrs.nto
riscv: hwprobe: export Zawrs ISA extension
riscv: Add Zawrs support for spinlocks
dt-bindings: riscv: Add Zawrs ISA extension description
riscv: Provide a definition for 'pause'
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
KVM RISC-V allows the Zawrs extension for the Guest/VM, so add it
to the get-reg-list test.
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Acked-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240426100820.14762-14-ajones@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
1. Add ParaVirt steal time support.
2. Add some VM migration enhancement.
3. Add perf kvm-stat support for loongarch.
-----BEGIN PGP SIGNATURE-----
iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmaOS6UWHGNoZW5odWFj
YWlAa2VybmVsLm9yZwAKCRAChivD8uImehejD/9pACGe3h3krXLcFVWXOFIu5Hpc
5kQLP0lSPJ/o5Xs8t/oPLrnDX70z90wXI1LOmltc7h32MSwFa2l8COQh+sN5eJBQ
PNyt7u7bMipp0yJS4Gl3LQQ5vklcGOSpQc/gbeXnVx8J/tz+Mo9YGGLIXVRXRM6W
Ri8D2VVFiwzQQYeTpPo1u1Ob8C6mA4KOppwvhscMTM3vj4NMbsinBzRnR0lG0Tdw
meFhxDPly1Ksxsbnj9UGO6UnEY0A2SLONs6MiO4y4DtoqoDlw/lbqFJuYo4vvbx1
pxtjyirD/PX/wjslQFWUOuU0hMfAodera+JupZ5BZWfcG8FltA4DQfDsm/U9RjK/
7gGNnr8Xk2/tp6+4AVV+HU2iTgRvq+mXCL72zSy2Y4r7ElBAANDfk4n+Zn/PWisn
U9wwV8Ue7tVB15BRpRsg77NzBidiCFEe/6flWYiX2y24ke71gwDJBGUy8hMdKt6t
4Cq8atsU0MvDAzfYMsK9JjskJp4UFq6wb1tXbbuADM4TDhnzlK6s6h3vM+pFlh/f
my7fDH8/2qsCWhBDM4pmsJskVp+I1GOk/80RjTQISwx7iHktJWvxNYTaisK2fvD5
Qs1IUWfNFbDX0Lr0QpN6j6X4rZkghR4R6XoFkd4nkicwi+UHVn3oK9GSqv24QJn9
7+Ev3dfRTUYLd6mC4Q==
=DpIK
-----END PGP SIGNATURE-----
Merge tag 'loongarch-kvm-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD
LoongArch KVM changes for v6.11
1. Add ParaVirt steal time support.
2. Add some VM migration enhancement.
3. Add perf kvm-stat support for loongarch.
- Redirect AMO load/store access fault traps to guest
- Perf kvm stat support for RISC-V
- Use HW IMSIC guest files when available
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmaRDW4ACgkQrUjsVaLH
LAcU2Q/+IaL17M8D8ueOcbmCMqZRReyVdR9vH6q87E9NJYRH2dewZ656bNQnnU20
3hkbHOnF+NJAHJ0SfXwqNTVkJcQ8u+F3Xui4DlnFZ/lkpcWpvT/DRI5SCjIjiB/G
SS/xWaRoSjvVJ7M8SyQhHUb2Y/tiDRXOOEl59ROGAKjzC3SY5/NJJ6g5FeE5akT8
/Q7WisZmc+ZH+a9EEOnl+Do7AFakrlaFM5KnweamfqSlSFrQB12YNpSsmA16k6X9
fqK/xPQTjeNakdQDPKw8INCbXkt8dsnlrPS6ivL0FCVf38aIJK0jxyLk9JbZGBK8
+dGCJOLVJontEyOVTYheq2oWv40xAlkXDjLNbnz+Nf7Sau8evFBpE2mPnbUBoGZi
fu5UCddSw3CFwrFNM+qiBRPz/mNuUpCC4pCh8yJSCDZ374ew9ili2l3Nb2IvBcJ2
36lQuxlPVTPOv1J76/WtYwsSwaYBHHcBshweTJCkAkezp0d/wAE8bpaw3n4YnfSn
l4u8/rrnEBb3Cd9cbW1Vk77Vw5e02RlZY5T+JLj7TXWSAFzstYxMpLsf097tqqcn
vY1iTrpxTcJuY0Rra3SI05eKgliXI5snh08xlW2NiVxu8NjjZMU73b6tg3JX8FHl
DMCafyQUBueV2jCpwbYribpbWv/UuUl92AKyJOwZ76W/e9YVBLA=
=atJQ
-----END PGP SIGNATURE-----
Merge tag 'kvm-riscv-6.11-1' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv changes for 6.11
- Redirect AMO load/store access fault traps to guest
- Perf kvm stat support for RISC-V
- Use guest files for IMSIC virtualization, when available
ONE_REG support for the Zimop, Zcmop, Zca, Zcf, Zcd, Zcb and Zawrs ISA
extensions is coming through the RISC-V tree.
Add a test case to exercise KVM_PRE_FAULT_MEMORY and run the guest to access the
pre-populated area. It tests KVM_PRE_FAULT_MEMORY ioctl for KVM_X86_DEFAULT_VM
and KVM_X86_SW_PROTECTED_VM.
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Message-ID: <32427791ef42e5efaafb05d2ac37fa4372715f47.1712785629.git.isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Log errors are the most widely used mechanism for reporting issues in
the kernel. When an error is logged using the device helpers, eg
dev_err(), it gets metadata attached that identifies the subsystem and
device where the message is coming from. Introduce a new test that makes
use of that metadata to report which devices logged errors (or more
critical messages).
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20240705-dev-err-log-selftest-v2-3-163b9cd7b3c1@collabora.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Move the ksft python module, which provides generic helpers for
kselftests, to a common directory so it can be more easily shared
between different tests.
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20240705-dev-err-log-selftest-v2-2-163b9cd7b3c1@collabora.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Move the discoverable devices test to a subdirectory to allow other
related tests to be added to the devices directory.
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20240705-dev-err-log-selftest-v2-1-163b9cd7b3c1@collabora.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are a couple of places where the test script "sleep"s to wait for
some external condition to be met.
This is error prone, specially in slow systems (identified in CI by
"KSFT_MACHINE_SLOW=yes").
To fix this, add a "ovs_wait" function that tries to execute a command
a few times until it succeeds. The timeout used is set to 5s for
"normal" systems and doubled if a slow CI machine is detected.
This should make the following work:
$ vng --build \
--config tools/testing/selftests/net/config \
--config kernel/configs/debug.config
$ vng --run . --user root -- "make -C tools/testing/selftests/ \
KSFT_MACHINE_SLOW=yes TARGETS=net/openvswitch run_tests"
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Reviewed-by: Ilya Maximets <i.maximets@ovn.org>
Link: https://patch.msgid.link/20240710090500.1655212-1-amorenoz@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This avoids reported warnings when the packages are not installed.
[namhyung]: Removed the dummy assignment and unnecessary ifeq checks.
Fixes: 0f0e1f4456 ("perf build: Use pkg-config for feature check for libtrace{event,fs}")
Signed-off-by: Guilherme Amadio <amadio@gentoo.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Thorsten Leemhuis <linux@leemhuis.info>
Link: https://lore.kernel.org/r/20240628203432.3273625-1-amadio@gentoo.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cross-merge networking fixes after downstream PR.
Conflicts:
net/sched/act_ct.c
26488172b0 ("net/sched: Fix UAF when resolving a clash")
3abbd7ed8b ("act_ct: prepare for stolen verdict coming from conntrack and nat engine")
No adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The run_tags_test.sh script is used to run tags_test and print out if
the test succeeded or failed. As tags_test has been TAP conformed, this
script is unneeded and hence can be removed.
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Link: https://lore.kernel.org/r/20240602132502.4186771-2-usama.anjum@collabora.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Conform the layout, informational and status messages to TAP. No
functional change is intended other than the layout of output messages.
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Link: https://lore.kernel.org/r/20240602132502.4186771-1-usama.anjum@collabora.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
There are two selftest scenarios for ARRAY BIST(Board Integrated System
Test) tests:
1. Perform IFS ARRAY BIST tests once on each CPU.
2. Perform IFS ARRAY BIST tests on a random CPU with 3 rounds.
These are not meant to be exhaustive, but are some minimal tests for
for checking IFS ARRAY BIST.
Reviewed-by: Jithu Joseph <jithu.joseph@intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Co-developed-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Pengfei Xu <pengfei.xu@intel.com>
Acked-by: Jithu Joseph <jithu.joseph@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Two selftests are added to verify IFS scan test feature:
1. Perform IFS scan test once on each CPU using all the available image
files.
2. Perform IFS scan test with the default image on a random cpu for 3
rounds.
Reviewed-by: Jithu Joseph <jithu.joseph@intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Co-developed-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Pengfei Xu <pengfei.xu@intel.com>
Acked-by: Jithu Joseph <jithu.joseph@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Scan test image files have to be loaded before starting IFS test.
Verify that In Field scan driver is able to load valid test image files.
Also check if loading an invalid test image file fails.
Reviewed-by: Jithu Joseph <jithu.joseph@intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Co-developed-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Pengfei Xu <pengfei.xu@intel.com>
Acked-by: Jithu Joseph <jithu.joseph@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
IFS (In Field Scan) driver exposes its functionality via sysfs interfaces.
Applications prepare and exercise the tests by interacting with the
aforementioned sysfs files.
Verify that the necessary sysfs entries are created after loading the IFS
driver.
Initialize test variables needed for building subsequent kself-test cases.
Reviewed-by: Jithu Joseph <jithu.joseph@intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Co-developed-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Pengfei Xu <pengfei.xu@intel.com>
Acked-by: Jithu Joseph <jithu.joseph@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
The variable are never referenced in the code, just remove it
that this problem was discovered by reading code
Signed-off-by: Zhu Jun <zhujun2@cmss.chinamobile.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
This variable is never referenced in the code, just remove them
that this problem was discovered by reading the code
Signed-off-by: Zhu Jun <zhujun2@cmss.chinamobile.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
These warnings are all of the form, "the format specified a short
(signed or unsigned) int, but the value is a full length int".
Acked-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
When building with clang, via:
make LLVM=1 -C tools/testing/selftests
...quite a few functions are variables are generating "unused" warnings.
Fix the warnings by deleting the unused items.
One item, the "nerrs" variable in vsdo_restorer.c's main(), is unused
but probably wants to be returned from main(), as a non-zero result.
That result is also unused right now, so another option would be to
delete it entirely, but this way, main() also gets fixed. It was missing
a return value.
Acked-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
When building with clang, via:
make LLVM=1 -C tools/testing/selftests
...clang warns that -no-pie is "unused during compilation".
This occurs because clang only wants to see -no-pie during linking.
Here, we don't have a separate linking stage, so a compiler warning is
unavoidable without (wastefully) restructuring the Makefile.
Avoid the warning by simply disabling that warning, for clang builds.
Acked-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
When building with clang, via:
make LLVM=1 -C tools/testing/selftests
...the build fails because clang's inline asm doesn't support all of the
features that are used in the asm() snippet in sysret_rip.c.
Fix this by moving the asm code into the clang_helpers_64.S file, where
it can be built with the assembler's full set of features.
Acked-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
When building with clang, via:
make LLVM=1 -C tools/testing/selftests
Fix this by moving the inline asm to "pure" assembly, in two new files:
clang_helpers_32.S, clang_helpers_64.S.
As a bonus, the pure asm avoids the need for ifdefs, and is now very
simple and easy on the eyes.
Acked-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Use fisttps instead of fisttp to specify correctly that the output
variable is of size short.
test_FISTTP.c:28:3: error: ambiguous instructions require an explicit suffix (could be 'fisttps', or 'fisttpl')
28 | " fisttp res16""\n"
| ^
<inline asm>:3:2: note: instantiated into assembly here
3 | fisttp res16
| ^
...followed by three more cases of the same warning for other lines.
[jh: removed a bit of duplication from the warnings report, above, and
fixed a typo in the title]
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
When building with clang, via:
make LLVM=1 -C tools/testing/selftests
...the following build failure occurs in selftests/x86:
clang: error: cannot specify -o when generating multiple output files
This happens because, although gcc doesn't complain if you invoke it
like this:
gcc file1.c header2.h
...clang won't accept that form--it rejects the .h file(s). Also, the
above approach is inaccurate anyway, because file.c includes header2.h
in this case, and the inclusion of header2.h on the invocation is an
artifact of the Makefile's desire to maintain dependencies.
In Makefiles of this type, a better way to do it is to use Makefile
dependencies to trigger the appropriate incremental rebuilds, and
separately use file lists (see EXTRA_FILES in this commit) to track what
to pass to the compiler.
This commit splits those concepts up, by setting up both EXTRA_FILES and
the Makefile dependencies with a single call to the new Makefile
function extra-files.
That fixes the build failure, while still providing the correct
dependencies in all cases.
Acked-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
When building with clang, via:
make LLVM=1 -C tools/testing/selftest
...clang warns about an unused irqcount variable. clang is correct: the
variable is incremented and then ignored.
Fix this by deleting the irqcount variable.
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: John Stultz <jstultz@google.com>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
write_bm_pid_to_resctrl() uses resctrl_val to check test name which is
not a good interface generic resctrl FS functions should provide.
Tests define mongrp when needed. Remove the test name check in
write_bm_pid_to_resctrl() to only rely on the mongrp parameter being
non-NULL.
Remove write_bm_pid_to_resctrl() resctrl_val parameter and resctrl_val
member from the struct resctrl_val_param that are not used anymore.
Similarly, remove the test name constants that are no longer used.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
The CMT selftest instantiates a monitor group to read LLC occupancy.
Since the test also creates a control group, it is unnecessary to
create another one for monitoring because control groups already
provide monitoring too.
Remove the unnecessary monitor group from the CMT selftest.
Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Nothing during MBA test uses mongrp even if it has been defined ever
since the introduction of the MBA test in the commit 01fee6b4d1
("selftests/resctrl: Add MBA test").
Remove the mongrp from MBA test.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
The struct resctrl_val_param has control and monitor groups as char
arrays but they are not supposed to be mutated within resctrl_val().
Convert the ctrlgrp and mongrp char array within resctrl_val_param to
plain const char pointers and adjust the strlen() based checks to
check NULL instead.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Control group, monitor group and resctrl_val are not mutated and
should not be mutated within resctrlfs.c functions.
Mark this by using const char * for the arguments.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
bw_report is only needed for selecting the correct value from the
values IMC measured. It is a member in the resctrl_val_param struct and
is always set to "reads". The value is then checked in resctrl_val()
using validate_bw_report_request() that besides validating the input,
assumes it can mutate the string which is questionable programming
practice.
Simplify handling bw_report:
- Convert validate_bw_report_request() into get_bw_report_type() that
inputs and returns const char *. Use NULL to indicate error.
- Validate the report types inside measure_mem_bw(), not in
resctrl_val().
- Pass bw_report to measure_mem_bw() from ->measure() hook because
resctrl_val() no longer needs bw_report for anything.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
The struct resctrl_val_param is there to customize behavior inside
resctrl_val() which is currently not used to full extent and there are
number of strcmp()s for test name in resctrl_val done by resctrl_val().
Create ->init() hook into the struct resctrl_val_param to cleanly
do per test initialization.
Remove also unused branches to setup paths and the related #defines
for CMT test.
While touching kerneldoc, make the adjacent line consistent with the
newly added form (callback vs call back).
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
The measurement done in resctrl_val() varies depending on test type.
The decision for how to measure is decided based on the string compare
to test name which is quite inflexible.
Add ->measure() callback into the struct resctrl_val_param to allow
each test to provide necessary code as a function which simplifies what
resctrl_val() has to do.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
initialize_mem_bw_resctrl() and set_mbm_path() contain complicated set
of conditions, each yielding different file to be opened to measure
memory bandwidth through resctrl FS. In practice, only two of them are
used. For MBA test, ctrlgrp is always provided, and for MBM test both
ctrlgrp and mongrp are set.
The file used differ between MBA/MBM test, however, MBM test
unnecessarily create monitor group because resctrl FS already provides
monitoring interface underneath any ctrlgrp too, which is what the MBA
selftest uses.
Consolidate memory bandwidth file used to the one used by the MBA
selftest. Remove all unused branches opening other files to simplify
the code.
Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
measure_vals() is awfully generic name so rename it to measure_mem_bw()
to describe better what it does and document the function parameters.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
'bm_pid' and 'ppid' are global variables. As they are used by different
processes and in signal handler, they cannot be entirely converted into
local variables.
The scope of those variables can still be reduced into resctrl_val.c
only. As PARENT_EXIT() macro is using 'ppid', make it a function in
resctrl_val.c and pass ppid to it as an argument because it is easier
to understand than using the global variable directly.
Pass 'bm_pid' into measure_vals() instead of relying on the global
variable which helps to make the call signatures of measure_vals() and
measure_llc_resctrl() more similar to each other.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
A few functions receive PIDs through int arguments. PIDs variables
should be of type pid_t, not int.
Convert pid arguments from int to pid_t.
Before printing PID, match the type to %d by casting to int which is
enough for Linux (standard would allow using a longer integer type but
generalizing for that would complicate the code unnecessarily, the
selftest code does not need to be portable).
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Both initialize_mem_bw_resctrl() and initialize_llc_occu_resctrl() that
are called from resctrl_val() need to determine domain ID to construct
resctrl fs related paths. Both functions do it by taking CPU ID which
neither needs for any other purpose than determining the domain ID.
Consolidate determining the domain ID into resctrl_val() and pass the
domain ID instead of CPU ID to initialize_mem_bw_resctrl() and
initialize_llc_occu_resctrl().
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Resctrl selftests refer to "bandwidth" currently in two other forms in
the code ("B/W" and "band width").
Use "bandwidth" consistently everywhere. While at it, fix also one
"over flow" -> "overflow" on a line that is touched by the change.
Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
For MBM/MBA tests, measure_vals() calls get_mem_bw_imc() that performs
the measurement over a duration of sleep(1) call. The memory bandwidth
numbers from IMC are derived over this duration. The resctrl FS derived
memory bandwidth, however, is calculated inside measure_vals() and only
takes delta between the previous value and the current one which
besides the actual test, also samples inter-test noise.
Rework the logic in measure_vals() and get_mem_bw_imc() such that the
resctrl FS memory bandwidth section covers much shorter duration
closely matching that of the IMC perf counters to improve measurement
accuracy.
For the second read after rewind() to return a fresh value, also
newline has to be consumed by the fscanf().
Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
The imc perf fd close() calls are missing from all error paths. In
addition, get_mem_bw_imc() handles fds in a for loop but close() is
based on two fixed indexes READ and WRITE.
Open code inner for loops to READ+WRITE entries for clarity and add a
function to close() IMC fds properly in all cases.
Fixes: 7f4d257e3a ("selftests/resctrl: Add callback to start a benchmark")
Suggested-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
There are extra spaces in the middle of #define. It is recommended
to delete the spaces to make the code look more comfortable.
Signed-off-by: aigourensheng <shechenglong001@gmail.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
gcc defaults to silence (off) for the following warnings, but clang
defaults to the opposite. The warnings are not useful for the kernel
itself, which is why they have remained disabled in gcc for the main
kernel build. And it is only due to including kernel data structures in
the selftests, that we get the warnings from clang.
-Waddress-of-packed-member
-Wgnu-variable-sized-type-not-at-end
In other words, the warnings are not unique to the selftests: there is
nothing that the selftests' code does that triggers these warnings,
other than the act of including the kernel's data structures. Therefore,
silence them for the clang builds as well.
This eliminates warnings for the net/ and user_events/ kselftest
subsystems, in these files:
./net/af_unix/scm_rights.c
./net/timestamping.c
./net/ipsec.c
./user_events/perf_test.c
Cc: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Current release - regressions:
- core: fix rc7's __skb_datagram_iter() regression
Current release - new code bugs:
- eth: bnxt: fix crashes when reducing ring count with active RSS contexts
Previous releases - regressions:
- sched: fix UAF when resolving a clash
- skmsg: skip zero length skb in sk_msg_recvmsg2
- sunrpc: fix kernel free on connection failure in xs_tcp_setup_socket
- tcp: avoid too many retransmit packets
- tcp: fix incorrect undo caused by DSACK of TLP retransmit
- udp: Set SOCK_RCU_FREE earlier in udp_lib_get_port().
- eth: ks8851: fix deadlock with the SPI chip variant
- eth: i40e: fix XDP program unloading while removing the driver
Previous releases - always broken:
- bpf:
- fix too early release of tcx_entry
- fail bpf_timer_cancel when callback is being cancelled
- bpf: fix order of args in call to bpf_map_kvcalloc
- netfilter: nf_tables: prefer nft_chain_validate
- ppp: reject claimed-as-LCP but actually malformed packets
- wireguard: avoid unaligned 64-bit memory accesses
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmaP4GYSHHBhYmVuaUBy
ZWRoYXQuY29tAAoJECkkeY3MjxOkzUgQAKroA17iMt2rD8h385hL8T9r483CGsR+
MX6SuWn6T8v4cuhKbqhkf25pWOD0mKH2i+dmgYon7g9LjLG4DMiZBZAqmBwArbyM
mITgndWH57MnQQh3pgkDFp0lhzYkeERCVSgcgh2AFTcNoxXbazHkMMIghsBENx3+
wccTsqtPmT2GWRpw6IrHO6kUs98Gry4O2p6fw3dX3/umD0z8OgnyRoCdVkylCDTM
2tBl4rWsXw8LvzSzmQ7qX3FSzdS++RJk2iXKWdrglah8cuKohZ98WbUwlt2ObCxz
fLbaAocPzOijaX2YAsNzKYzWizq0i4IjpgSebNUcI3YFthAog5nNGJv79w+cxBKy
NpaQA31Hd6K6oFJybkysHAf776RC3ueF48Isp3kag7NQ8Qy2+nfWAM9g1wq4UnOu
IePFdgojqUbfp3GzOG5yFyqqD8RPppJp7DowSjjfYN8Dxw1y5R090suDyrjFeiiC
MezV4xu7vZdi/6R8RXVfskR/iczHqDHGuuMikTlkm1LaLty9dfnoIZC5AagFH1rA
Jkzztkd9MnSGK94G9upIhu+t8F22/wzmrJhJOgl9LFvbXP91uGjZGr75u4cc8Q/G
jn2uy05T/vzoBSiLRQc0Z2Wp9GYEWRXHKSjLkabrGeakw6tYYgZSAVAlxqrpvqbV
+fm9Ibpvs9fb
=qwVA
-----END PGP SIGNATURE-----
Merge tag 'net-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from bpf and netfilter.
Current release - regressions:
- core: fix rc7's __skb_datagram_iter() regression
Current release - new code bugs:
- eth: bnxt: fix crashes when reducing ring count with active RSS
contexts
Previous releases - regressions:
- sched: fix UAF when resolving a clash
- skmsg: skip zero length skb in sk_msg_recvmsg2
- sunrpc: fix kernel free on connection failure in
xs_tcp_setup_socket
- tcp: avoid too many retransmit packets
- tcp: fix incorrect undo caused by DSACK of TLP retransmit
- udp: Set SOCK_RCU_FREE earlier in udp_lib_get_port().
- eth: ks8851: fix deadlock with the SPI chip variant
- eth: i40e: fix XDP program unloading while removing the driver
Previous releases - always broken:
- bpf:
- fix too early release of tcx_entry
- fail bpf_timer_cancel when callback is being cancelled
- bpf: fix order of args in call to bpf_map_kvcalloc
- netfilter: nf_tables: prefer nft_chain_validate
- ppp: reject claimed-as-LCP but actually malformed packets
- wireguard: avoid unaligned 64-bit memory accesses"
* tag 'net-6.10-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (33 commits)
net, sunrpc: Remap EPERM in case of connection failure in xs_tcp_setup_socket
net/sched: Fix UAF when resolving a clash
net: ks8851: Fix potential TX stall after interface reopen
udp: Set SOCK_RCU_FREE earlier in udp_lib_get_port().
netfilter: nf_tables: prefer nft_chain_validate
netfilter: nfnetlink_queue: drop bogus WARN_ON
ethtool: netlink: do not return SQI value if link is down
ppp: reject claimed-as-LCP but actually malformed packets
selftests/bpf: Add timer lockup selftest
net: ethernet: mtk-star-emac: set mac_managed_pm when probing
e1000e: fix force smbus during suspend flow
tcp: avoid too many retransmit packets
bpf: Defer work in bpf_timer_cancel_and_free
bpf: Fail bpf_timer_cancel when callback is being cancelled
bpf: fix order of args in call to bpf_map_kvcalloc
net: ethernet: lantiq_etop: fix double free in detach
i40e: Fix XDP program unloading while removing the driver
net: fix rc7's __skb_datagram_iter()
net: ks8851: Fix deadlock with the SPI chip variant
octeontx2-af: Fix incorrect value output on error path in rvu_check_rsrc_availability()
...
Add a selftest that tries to trigger a situation where two timer callbacks
are attempting to cancel each other's timer. By running them continuously,
we hit a condition where both run in parallel and cancel each other.
Without the fix in the previous patch, this would cause a lockup as
hrtimer_cancel on either side will wait for forward progress from the
callback.
Ensure that this situation leads to a EDEADLK error.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240711052709.2148616-1-memxor@gmail.com
The CXL driver was recently updated to return EBUSY rather than
ENXIO when the device reports that an injection request exceeds
the device's limit. That change to EBUSY allows debug users to
differentiate between limit reached and inject failures for any
other reason.
Change cxl-test to also return EBUSY and tidy up the dev_dbg()
messaging to emit the correct limit.
Reminder: the cxl-test per device injection limit is a configurable
attribute: /sys/bus/platform/drivers/cxl_mock_mem/poison_inject_max
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Tested-by: Xingtao Yao <yaoxt.fnst@fujitsu.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://patch.msgid.link/ba1b80e1658b644d85d0d5e2287112d00a48b9cf.1720316188.git.alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
In tools/ directory, function bitmap_clear() is currently only used in
object file tools/testing/radix-tree/xarray.o.
But instead of keeping a bitmap.c with only bitmap_clear() definition in
radix-tree's own directory, it would be more proper to put it in common
directory lib/.
Sync the kernel definition and link some related libs, no functional
change is expected.
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
CC: Matthew Wilcox <willy@infradead.org>
CC: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
If bpf_object__load() fails in test_xdp_adjust_frags_tail_grow(), "obj"
opened before this should be closed. So use "goto out" to close it instead
of using "return" here.
Fixes: 110221081a ("bpf: selftests: update xdp_adjust_tail selftest to include xdp frags")
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/f282a1ed2d0e3fb38cceefec8e81cabb69cab260.1720615848.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Run bpf_tcp_ca selftests (./test_progs -t bpf_tcp_ca) on a Loongarch
platform, some "Segmentation fault" errors occur:
'''
test_dctcp:PASS:bpf_dctcp__open_and_load 0 nsec
test_dctcp:FAIL:bpf_map__attach_struct_ops unexpected error: -524
#29/1 bpf_tcp_ca/dctcp:FAIL
test_cubic:PASS:bpf_cubic__open_and_load 0 nsec
test_cubic:FAIL:bpf_map__attach_struct_ops unexpected error: -524
#29/2 bpf_tcp_ca/cubic:FAIL
test_dctcp_fallback:PASS:dctcp_skel 0 nsec
test_dctcp_fallback:PASS:bpf_dctcp__load 0 nsec
test_dctcp_fallback:FAIL:dctcp link unexpected error: -524
#29/4 bpf_tcp_ca/dctcp_fallback:FAIL
test_write_sk_pacing:PASS:open_and_load 0 nsec
test_write_sk_pacing:FAIL:attach_struct_ops unexpected error: -524
#29/6 bpf_tcp_ca/write_sk_pacing:FAIL
test_update_ca:PASS:open 0 nsec
test_update_ca:FAIL:attach_struct_ops unexpected error: -524
settcpca:FAIL:setsockopt unexpected setsockopt: \
actual -1 == expected -1
(network_helpers.c:99: errno: No such file or directory) \
Failed to call post_socket_cb
start_test:FAIL:start_server_str unexpected start_server_str: \
actual -1 == expected -1
test_update_ca:FAIL:ca1_ca1_cnt unexpected ca1_ca1_cnt: \
actual 0 <= expected 0
#29/9 bpf_tcp_ca/update_ca:FAIL
#29 bpf_tcp_ca:FAIL
Caught signal #11!
Stack trace:
./test_progs(crash_handler+0x28)[0x5555567ed91c]
linux-vdso.so.1(__vdso_rt_sigreturn+0x0)[0x7ffffee408b0]
./test_progs(bpf_link__update_map+0x80)[0x555556824a78]
./test_progs(+0x94d68)[0x5555564c4d68]
./test_progs(test_bpf_tcp_ca+0xe8)[0x5555564c6a88]
./test_progs(+0x3bde54)[0x5555567ede54]
./test_progs(main+0x61c)[0x5555567efd54]
/usr/lib64/libc.so.6(+0x22208)[0x7ffff2aaa208]
/usr/lib64/libc.so.6(__libc_start_main+0xac)[0x7ffff2aaa30c]
./test_progs(_start+0x48)[0x55555646bca8]
Segmentation fault
'''
This is because BPF trampoline is not implemented on Loongarch yet,
"link" returned by bpf_map__attach_struct_ops() is NULL. test_progs
crashs when this NULL link passes to bpf_link__update_map(). This
patch adds NULL checks for all links in bpf_tcp_ca to fix these errors.
If "link" is NULL, goto the newly added label "out" to destroy the skel.
v2:
- use "goto out" instead of "return" as Eduard suggested.
Fixes: 06da9f3bd6 ("selftests/bpf: Test switching TCP Congestion Control algorithms.")
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/r/b4c841492bd4ed97964e4e61e92827ce51bf1dc9.1720615848.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
CONFIG_MEMCG_KMEM used to be a user-visible option for whether slab
tracking is enabled. It has been default-enabled and equivalent to
CONFIG_MEMCG for almost a decade. We've only grown more kernel memory
accounting sites since, and there is no imaginable cgroup usecase going
forward that wants to track user pages but not the multitude of
user-drivable kernel allocations.
Link: https://lkml.kernel.org/r/20240701153148.452230-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Centralize the _GNU_SOURCE definition to CFLAGS in lib.mk. Remove
redundant defines from Makefiles that import lib.mk. Convert any usage of
"#define _GNU_SOURCE 1" to "#define _GNU_SOURCE".
This uses the form "-D_GNU_SOURCE=", which is equivalent to
"#define _GNU_SOURCE".
Otherwise using "-D_GNU_SOURCE" is equivalent to "-D_GNU_SOURCE=1" and
"#define _GNU_SOURCE 1", which is less commonly seen in source code and
would require many changes in selftests to avoid redefinition warnings.
Link: https://lkml.kernel.org/r/20240625223454.1586259-2-edliaw@google.com
Signed-off-by: Edward Liaw <edliaw@google.com>
Suggested-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: André Almeida <andrealmeid@igalia.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Kees Cook <kees@kernel.org>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Both Ryan and Chris have been utilizing the small test program to aid in
debugging and identifying issues with swap entry allocation. While a real
or intricate workload might be more suitable for assessing the correctness
and effectiveness of the swap allocation policy, a small test program
presents a simpler means of understanding the problem and initially
verifying the improvements being made.
Let's endeavor to integrate it into tools/mm. Although it presently only
accommodates 64KB and 4KB, I'm optimistic that we can expand its
capabilities to support multiple sizes and simulate more complex systems
in the future as required.
Basically, we have
1. Use MADV_PAGEPUT for rapid swap-out, putting the swap allocation
code under high exercise in a short time.
2. Use MADV_DONTNEED to simulate the behavior of libc and Java heap in
freeing memory, as well as for munmap, app exits, or OOM killer
scenarios. This ensures new mTHP is always generated, released or
swapped out, similar to the behavior on a PC or Android phone where
many applications are frequently started and terminated.
3. Swap in with or without the "-a" option to observe how fragments
due to swap-in and the incoming swap-in of large folios will impact
swap-out fallback.
Due to 2, we ensure a certain proportion of mTHP. Similarly, because of
3, we maintain a certain proportion of small folios, as we don't support
large folios swap-in, meaning any swap-in will immediately result in small
folios. Therefore, with both 2 and 3, we automatically achieve a system
containing both mTHP and small folios. Additionally, 1 provides the
ability to continuously swap them out.
We can also use "-s" to add a dedicated small folios memory area.
[akpm@linux-foundation.org: thp_swap_allocator_test.c needs mman.h, per Kairui Song]
Link: https://lkml.kernel.org/r/20240622071231.576056-2-21cnbao@gmail.com
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Acked-by: Chris Li <chrisl@kernel.org>
Tested-by: Chris Li <chrisl@kernel.org>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Tested-by: Ryan Roberts <ryan.roberts@arm.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kalesh Singh <kaleshsingh@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
This patch uses public helper connect_fd_to_fd() exported in
network_helpers.h instead of using getsockname() + connect() in
run_lookup_prog() in prog_tests/sk_lookup.c. This can simplify
the code.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/7077c277cde5a1864cdc244727162fb75c8bb9c5.1720515893.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
This patch uses public helper start_server_addr() in udp_recv_send()
in prog_tests/sk_lookup.c to simplify the code.
And use ASSERT_OK_FD() to check fd returned by start_server_addr().
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/f11cabfef4a2170ecb66a1e8e2e72116d8f621b3.1720515893.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
This patch uses public helper start_server_str() to simplify make_server()
in prog_tests/sk_lookup.c.
Add a callback setsockopts() to do all sockopts, set it to post_socket_cb
pointer of struct network_helper_opts. And add a new struct cb_opts to save
the data needed to pass to the callback. Then pass this network_helper_opts
to start_server_str().
Also use ASSERT_OK_FD() to check fd returned by start_server_str().
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/5981539f5591d2c4998c962ef2bf45f34c940548.1720515893.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
In the error path when update_lookup_map() fails in drop_on_reuseport in
prog_tests/sk_lookup.c, "server1", the fd of server 1, should be closed.
This patch fixes this by using "goto close_srv1" lable instead of "detach"
to close "server1" in this case.
Fixes: 0ab5539f85 ("selftests/bpf: Tests for BPF_SK_LOOKUP attach point")
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/86aed33b4b0ea3f04497c757845cff7e8e621a2d.1720515893.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Add a new dedicated ASSERT macro ASSERT_OK_FD to test whether a socket
FD is valid or not. It can be used to replace macros ASSERT_GT(fd, 0, ""),
ASSERT_NEQ(fd, -1, "") or statements (fd < 0), (fd != -1).
Suggested-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/ded75be86ac630a3a5099739431854c1ec33f0ea.1720515893.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Some callers expect __start_server() helper to pass their own "backlog"
value to listen() instead of the default of 1. So this patch adds struct
member "backlog" for network_helper_opts to allow callers to set "backlog"
value via start_server_str() helper.
listen(fd, 0 /* backlog */) can be used to enforce syncookie. Meaning
backlog 0 is a legit value.
Using 0 as a default and changing it to 1 here is fine. It makes the test
program easier to write for the common case. Enforcing syncookie mode by
using backlog 0 is a niche use case but it should at least have a way for
the caller to do that. Thus, -ve backlog value is used here for the
syncookie use case. Please see the comment in network_helpers.h for
the details.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/1660229659b66eaad07aa2126e9c9fe217eba0dd.1720515893.git.tanggeliang@kylinos.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
In many cases, kernel netfilter functionality is built as modules.
If CONFIG_NF_FLOW_TABLE=m in particular, progs/xdp_flowtable.c
(and hence selftests) will fail to compile, so add a ___local
version of "struct flow_ports".
Fixes: c77e572d3a ("selftests/bpf: Add selftest for bpf_xdp_flow_lookup kfunc")
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/r/20240710150051.192598-1-alan.maguire@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
When clone3() was introduced, it was not obvious how each architecture
deals with setting up the stack and keeping the register contents in
a fork()-like system call, so this was left for the architecture
maintainers to implement, with __ARCH_WANT_SYS_CLONE3 defined by those
that already implement it.
Five years later, we still have a few architectures left that are missing
clone3(), and the macro keeps getting in the way as it's fundamentally
different from all the other __ARCH_WANT_SYS_* macros that are meant
to provide backwards-compatibility with applications using older
syscalls that are no longer provided by default.
Address this by reversing the polarity of the macro, adding an
__ARCH_BROKEN_SYS_CLONE3 macro to all architectures that don't
already provide the syscall, and remove __ARCH_WANT_SYS_CLONE3
from all the other ones.
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Currently, BPF kfuncs which accept trusted pointer arguments
i.e. those flagged as KF_TRUSTED_ARGS, KF_RCU, or KF_RELEASE, all
require an original/unmodified trusted pointer argument to be supplied
to them. By original/unmodified, it means that the backing register
holding the trusted pointer argument that is to be supplied to the BPF
kfunc must have its fixed offset set to zero, or else the BPF verifier
will outright reject the BPF program load. However, this zero fixed
offset constraint that is currently enforced by the BPF verifier onto
BPF kfuncs specifically flagged to accept KF_TRUSTED_ARGS or KF_RCU
trusted pointer arguments is rather unnecessary, and can limit their
usability in practice. Specifically, it completely eliminates the
possibility of constructing a derived trusted pointer from an original
trusted pointer. To put it simply, a derived pointer is a pointer
which points to one of the nested member fields of the object being
pointed to by the original trusted pointer.
This patch relaxes the zero fixed offset constraint that is enforced
upon BPF kfuncs which specifically accept KF_TRUSTED_ARGS, or KF_RCU
arguments. Although, the zero fixed offset constraint technically also
applies to BPF kfuncs accepting KF_RELEASE arguments, relaxing this
constraint for such BPF kfuncs has subtle and unwanted
side-effects. This was discovered by experimenting a little further
with an initial version of this patch series [0]. The primary issue
with relaxing the zero fixed offset constraint on BPF kfuncs accepting
KF_RELEASE arguments is that it'd would open up the opportunity for
BPF programs to supply both trusted pointers and derived trusted
pointers to them. For KF_RELEASE BPF kfuncs specifically, this could
be problematic as resources associated with the backing pointer could
be released by the backing BPF kfunc and cause instabilities for the
rest of the kernel.
With this new fixed offset semantic in-place for BPF kfuncs accepting
KF_TRUSTED_ARGS and KF_RCU arguments, we now have more flexibility
when it comes to the BPF kfuncs that we're able to introduce moving
forward.
Early discussions covering the possibility of relaxing the zero fixed
offset constraint can be found using the link below. This will provide
more context on where all this has stemmed from [1].
Notably, pre-existing tests have been updated such that they provide
coverage for the updated zero fixed offset
functionality. Specifically, the nested offset test was converted from
a negative to positive test as it was already designed to assert zero
fixed offset semantics of a KF_TRUSTED_ARGS BPF kfunc.
[0] https://lore.kernel.org/bpf/ZnA9ndnXKtHOuYMe@google.com/
[1] https://lore.kernel.org/bpf/ZhkbrM55MKQ0KeIV@google.com/
Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20240709210939.1544011-1-mattbobrowski@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Improve how we handle old BPF skeletons when it comes to BPF map
auto-attachment. Emit one warn-level message per each struct_ops map
that could have been auto-attached, if user provided recent enough BPF
skeleton version. Don't spam log if there are no relevant struct_ops
maps, though.
This should help users realize that they probably need to regenerate BPF
skeleton header with more recent bpftool/libbpf-cargo (or whatever other
means of BPF skeleton generation).
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20240708204540.4188946-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
BPF skeleton was designed from day one to be extensible. Generated BPF
skeleton code specifies actual sizes of map/prog/variable skeletons for
that reason and libbpf is supposed to work with newer/older versions
correctly.
Unfortunately, it was missed that we implicitly embed hard-coded most
up-to-date (according to libbpf's version of libbpf.h header used to
compile BPF skeleton header) sizes of those structs, which can differ
from the actual sizes at runtime when libbpf is used as a shared
library.
We have a few places were we just index array of maps/progs/vars, which
implicitly uses these potentially invalid sizes of structs.
This patch aims to fix this problem going forward. Once this lands,
we'll backport these changes in Github repo to create patched releases
for older libbpfs.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Fixes: d66562fba1 ("libbpf: Add BPF object skeleton support")
Fixes: 430025e5dc ("libbpf: Add subskeleton scaffolding")
Fixes: 08ac454e25 ("libbpf: Auto-attach struct_ops BPF maps in BPF skeleton")
Co-developed-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20240708204540.4188946-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Old versions of libbpf don't handle varying sizes of bpf_map_skeleton
struct correctly. As such, BPF skeleton generated by newest bpftool
might not be compatible with older libbpf (though only when libbpf is
used as a shared library), even though it, by design, should.
Going forward libbpf will be fixed, plus we'll release bug fixed
versions of relevant old libbpfs, but meanwhile try to mitigate from
bpftool side by conservatively assuming older and smaller definition of
bpf_map_skeleton, if possible. Meaning, if there are no struct_ops maps.
If there are struct_ops, then presumably user would like to have
auto-attaching logic and struct_ops map link placeholders, so use the
full bpf_map_skeleton definition in that case.
Acked-by: Quentin Monnet <qmo@kernel.org>
Co-developed-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20240708204540.4188946-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Some workloads may want to rehash the flows in response to an imbalance.
Most effective way to do that is changing the RSS key. Check that changing
the key does not cause link flaps or traffic disruption.
Disrupting traffic for key update is not incorrect, but makes the key
update unusable for rehashing under load.
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240708213627.226025-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Some devices dynamically increase and decrease the size of the RSS
indirection table based on the number of enabled queues.
When that happens driver must maintain the balance of entries
(preferably duplicating the smaller table).
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240708213627.226025-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
By default main RSS table should change to include all queues.
When user sets a specific RSS config the driver should preserve it,
even when queue count changes. Driver should refuse to deactivate
queues used in the user-set RSS config.
For additional contexts driver should still refuse to deactivate
queues in use. Whether the contexts should get resized like
context 0 when queue count increases is a bit unclear. I anticipate
most drivers today don't do that. Since main use case for additional
contexts is to set the indir table - it doesn't seem worthwhile to
care about behavior of the default table too much. Don't test that.
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240708213627.226025-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Wrap up sending traffic and checking in which queues it landed
in a helper.
The method used for testing is to send a lot of iperf traffic
and check which queues received the most packets. Those should
be the queues where we expect iperf to land - either because we
installed a filter for the port iperf uses, or we didn't and
expect it to use context 0.
Contexts get disjoint queue sets, but the main context (AKA context 0)
may receive some background traffic (noise).
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240708213627.226025-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The basic test may fail without resetting the RSS indir table.
Use the .exec() method to run cleanup early since we re-test
with traffic that returning to default state works.
While at it reformat the doc a tiny bit.
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240708213627.226025-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The ageing time used by the test is too short for debug kernels and
results in entries being aged out prematurely [1].
Fix by increasing the ageing time.
The same change was done for the VLAN-aware version of the test in
commit dfbab74044 ("selftests: forwarding: Make vxlan-bridge-1q pass
on debug kernels").
[1]
# ./vxlan_bridge_1d.sh
[...]
# TEST: VXLAN: flood before learning [ OK ]
# TEST: VXLAN: show learned FDB entry [ OK ]
# TEST: VXLAN: learned FDB entry [FAIL]
# veth3: Expected to capture 0 packets, got 4.
# RTNETLINK answers: No such file or directory
# TEST: VXLAN: deletion of learned FDB entry [ OK ]
# TEST: VXLAN: Ageing of learned FDB entry [FAIL]
# veth3: Expected to capture 0 packets, got 2.
[...]
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240707095458.2870260-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Lu Baolu says:
====================
This series implements the functionality of delivering IO page faults to
user space through the IOMMUFD framework. One feasible use case is the
nested translation. Nested translation is a hardware feature that supports
two-stage translation tables for IOMMU. The second-stage translation table
is managed by the host VMM, while the first-stage translation table is
owned by user space. This allows user space to control the IOMMU mappings
for its devices.
When an IO page fault occurs on the first-stage translation table, the
IOMMU hardware can deliver the page fault to user space through the
IOMMUFD framework. User space can then handle the page fault and respond
to the device top-down through the IOMMUFD. This allows user space to
implement its own IO page fault handling policies.
User space application that is capable of handling IO page faults should
allocate a fault object, and bind the fault object to any domain that it
is willing to handle the fault generatd for them. On a successful return
of fault object allocation, the user can retrieve and respond to page
faults by reading or writing to the file descriptor (FD) returned.
The iommu selftest framework has been updated to test the IO page fault
delivery and response functionality.
====================
* iommufd_pri:
iommufd/selftest: Add coverage for IOPF test
iommufd/selftest: Add IOPF support for mock device
iommufd: Associate fault object with iommufd_hw_pgtable
iommufd: Fault-capable hwpt attach/detach/replace
iommufd: Add iommufd fault object
iommufd: Add fault and response message definitions
iommu: Extend domain attach group with handle support
iommu: Add attach handle to struct iopf_group
iommu: Remove sva handle list
iommu: Introduce domain attachment handle
Link: https://lore.kernel.org/all/20240702063444.105814-1-baolu.lu@linux.intel.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Extend the selftest tool to add coverage of testing IOPF handling. This
would include the following tests:
- Allocating and destroying an iommufd fault object.
- Allocating and destroying an IOPF-capable HWPT.
- Attaching/detaching/replacing an IOPF-capable HWPT on a device.
- Triggering an IOPF on the mock device.
- Retrieving and responding to the IOPF through the file interface.
Link: https://lore.kernel.org/r/20240702063444.105814-11-baolu.lu@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
This kselftest fixes update for Linux 6.10 consists of fixes to clang
build failures to timerns, vDSO tests and fixes to vDSO makefile.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmaMaSgACgkQCwJExA0N
Qxz+xw/8DG3OciD30wrXyPbH7Lw35AJXH3IsIjirQET6hoE+saxYOWbVN9POqqVy
VTh21ZwJpuSrh8gIYHUEZPLezTwIYWN7aZm2Zps7VttlfXjNvbWiLBB4ptAL/XWu
SogHFeE1u1KHk8JZY3v3j//hQxL0FqnUbqRjv5nnOUS1krgL4shP6JsdUU65Bs9x
TLSCJrJSCSpG/u7KAXSHlYy0kn9fnL+F2LUqTFf+kzOOdLZ+XaxHS/02GsZYgcVI
SUvL6x4NEqVMyxcnvL4QBs91SD1/q80vf7g0+gKHkcuHKluto/Zmnwhw40oN92lr
T6muSS2jW+OemZzglJdD4aIbCEisVtwPsPkdtux9JZV9VAH2lyYz0+G0J2fX7r11
LOcd4Y7HhoYA5UL6s6puE8xQEZOUrBNMY4exfeOkW/UaJhscewtyTMQsNRs8qW+4
lEoHFJSsVQtfuZSxUaiXm49loVxu8JueynG6dafRue8tf9mCWpOzl01fVpkoLL/1
5lMOau3DZallsiHKU0COg6eJhAi6QQjC2nYNMJHwO3DFCKpwneMYbbU9xqS5MZ5Y
wVijpgyFdIMk5qxHDdVEmevFNyYG3xGYKq/sReDuwb4qJkdx7rDS5mMTkVxyHdCe
ezHxw6tuiLohHXDHVCR/KxQwjiHkXZF2uudzTFDt6Lxeu68PAFk=
=c2zt
-----END PGP SIGNATURE-----
Merge tag 'linux_kselftest-fixes-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest fixes from Shuah Khan
"Fixes to clang build failures to timerns, vDSO tests and fixes to vDSO
makefile"
* tag 'linux_kselftest-fixes-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/vDSO: remove duplicate compiler invocations from Makefile
selftests/vDSO: remove partially duplicated "all:" target in Makefile
selftests/vDSO: fix clang build errors and warnings
selftest/timerns: fix clang build failures for abs() calls
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZoxN0AAKCRDbK58LschI
g0c5AQDa3ZV9gfbN42y1zSDoM1uOgO60fb+ydxyOYh8l3+OiQQD/fLfpTY3gBFSY
9yi/pZhw/QdNzQskHNIBrHFGtJbMxgs=
=p1Zz
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2024-07-08
The following pull-request contains BPF updates for your *net-next* tree.
We've added 102 non-merge commits during the last 28 day(s) which contain
a total of 127 files changed, 4606 insertions(+), 980 deletions(-).
The main changes are:
1) Support resilient split BTF which cuts down on duplication and makes BTF
as compact as possible wrt BTF from modules, from Alan Maguire & Eduard Zingerman.
2) Add support for dumping kfunc prototypes from BTF which enables both detecting
as well as dumping compilable prototypes for kfuncs, from Daniel Xu.
3) Batch of s390x BPF JIT improvements to add support for BPF arena and to implement
support for BPF exceptions, from Ilya Leoshkevich.
4) Batch of riscv64 BPF JIT improvements in particular to add 12-argument support
for BPF trampolines and to utilize bpf_prog_pack for the latter, from Pu Lehui.
5) Extend BPF test infrastructure to add a CHECKSUM_COMPLETE validation option
for skbs and add coverage along with it, from Vadim Fedorenko.
6) Inline bpf_get_current_task/_btf() helpers in the arm64 BPF JIT which gives
a small 1% performance improvement in micro-benchmarks, from Puranjay Mohan.
7) Extend the BPF verifier to track the delta between linked registers in order
to better deal with recent LLVM code optimizations, from Alexei Starovoitov.
8) Fix bpf_wq_set_callback_impl() kfunc signature where the third argument should
have been a pointer to the map value, from Benjamin Tissoires.
9) Extend BPF selftests to add regular expression support for test output matching
and adjust some of the selftest when compiled under gcc, from Cupertino Miranda.
10) Simplify task_file_seq_get_next() and remove an unnecessary loop which always
iterates exactly once anyway, from Dan Carpenter.
11) Add the capability to offload the netfilter flowtable in XDP layer through
kfuncs, from Florian Westphal & Lorenzo Bianconi.
12) Various cleanups in networking helpers in BPF selftests to shave off a few
lines of open-coded functions on client/server handling, from Geliang Tang.
13) Properly propagate prog->aux->tail_call_reachable out of BPF verifier, so
that x86 JIT does not need to implement detection, from Leon Hwang.
14) Fix BPF verifier to add a missing check_func_arg_reg_off() to prevent an
out-of-bounds memory access for dynpointers, from Matt Bobrowski.
15) Fix bpf_session_cookie() kfunc to return __u64 instead of long pointer as
it might lead to problems on 32-bit archs, from Jiri Olsa.
16) Enhance traffic validation and dynamic batch size support in xsk selftests,
from Tushar Vyavahare.
bpf-next-for-netdev
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (102 commits)
selftests/bpf: DENYLIST.aarch64: Remove fexit_sleep
selftests/bpf: amend for wrong bpf_wq_set_callback_impl signature
bpf: helpers: fix bpf_wq_set_callback_impl signature
libbpf: Add NULL checks to bpf_object__{prev_map,next_map}
selftests/bpf: Remove exceptions tests from DENYLIST.s390x
s390/bpf: Implement exceptions
s390/bpf: Change seen_reg to a mask
bpf: Remove unnecessary loop in task_file_seq_get_next()
riscv, bpf: Optimize stack usage of trampoline
bpf, devmap: Add .map_alloc_check
selftests/bpf: Remove arena tests from DENYLIST.s390x
selftests/bpf: Add UAF tests for arena atomics
selftests/bpf: Introduce __arena_global
s390/bpf: Support arena atomics
s390/bpf: Enable arena
s390/bpf: Support address space cast instruction
s390/bpf: Support BPF_PROBE_MEM32
s390/bpf: Land on the next JITed instruction after exception
s390/bpf: Introduce pre- and post- probe functions
s390/bpf: Get rid of get_probe_mem_regno()
...
====================
Link: https://patch.msgid.link/20240708221438.10974-1-daniel@iogearbox.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
"After" was missing an "r", nothing to see here.
Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
We had few lines about the feature, but without any complete examples.
Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
We had an extra "+" at the beginning of some lines that look like a
poorly formated patch.
Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
User can now read perf counters using "--add perf/<device>/<event>".
Other details work similarly to how --add works with MSRs.
Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
These three counters now are treated similar to other perf counters
groups. This simplifies and gets rid of a lot of special cases for APERF
and MPERF.
Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
These addresses the performance issues reported by Matt, Namhyung and
Linus. Recently it changed processing comm string and DSO with sorted
arrays but it required to sort the array whenever it adds a new entry.
This caused a performance issue and fix is to enhance the sorting by
finding the insertion point in the sorted array and to shift righthand
side using memmove().
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQSo2x5BnqMqsoHtzsmMstVUGiXMgwUCZov/wgAKCRCMstVUGiXM
g3olAQCFzp/BnopE7VgUSK5j0EOnMjSsvkQGkocqCVN1Km3y8AEAlV3EKb1rUN8s
SQ+QcEx7F4V38s+Aoo2SqU1yAwYsXAc=
=Ao/v
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-fixes-for-v6.10-2024-07-08' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools fixes from Namhyung Kim:
"Fix performance issue for v6.10
These address the performance issues reported by Matt, Namhyung and
Linus. Recently perf changed the processing of the comm string and DSO
using sorted arrays but this caused it to sort the array whenever
adding a new entry.
This caused a performance issue and the fix is to enhance the sorting
by finding the insertion point in the sorted array and to shift
righthand side using memmove()"
* tag 'perf-tools-fixes-for-v6.10-2024-07-08' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
perf dsos: When adding a dso into sorted dsos maintain the sort order
perf comm str: Avoid sort during insert
fexit_sleep test runs successfully now on the BPF CI so remove it
from the deny list. ftrace direct calls was blocking tracing programs
on arm64 but it has been resolved by now. For more details see also
discussion in [*].
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240705145009.32340-1-puranjay@kernel.org [*]
See the previous patch: the API was wrong, we were provided the pointer
to the value, not the actual struct bpf_wq *.
Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>
Link: https://lore.kernel.org/r/20240708-fix-wq-v2-2-667e5c9fbd99@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
In the current state, an erroneous call to
bpf_object__find_map_by_name(NULL, ...) leads to a segmentation
fault through the following call chain:
bpf_object__find_map_by_name(obj = NULL, ...)
-> bpf_object__for_each_map(pos, obj = NULL)
-> bpf_object__next_map((obj = NULL), NULL)
-> return (obj = NULL)->maps
While calling bpf_object__find_map_by_name with obj = NULL is
obviously incorrect, this should not lead to a segmentation
fault but rather be handled gracefully.
As __bpf_map__iter already handles this situation correctly, we
can delegate the check for the regular case there and only add
a check in case the prev or next parameter is NULL.
Signed-off-by: Andreas Ziegler <ziegler.andreas@siemens.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240703083436.505124-1-ziegler.andreas@siemens.com
This cpupower second update for Linux 6.11-rc1 consists of
-- fix to install cpupower library in standard librray intall
location - /usr/lib
-- disable direct build of cpupower bench as it can only be
built from the cpupower main makefile.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmaIdeoACgkQCwJExA0N
Qxx7iA//UehgsvCngPaOwBAyDALv5mBK4LHviUCZ95NqZHkmE2b9YOpVXLRBmxna
K2VU4HpfN05P5yOwi+ah46vJr4zq0mS3GThqHGdxNruVzky3TwmAiLIkZoeiOQ/I
v0o4+3RTingMylyR6DxdMV5leHdct1W89JbM+CPkeU4BRtYYxEbAQc2fElTANaLR
CsPqXsW0sk09XoQnjdclrBp86mTP9OvjSq8ZgBlXOHyIRzKGE3PADn0oi3BiFv1L
YojkciiVHxbavtbp0QOGyeTZCjVb2DeMRCeO8ZLlLxmnZzYBBsGuQm5SMM5Sv6sl
9wej7MwHuWBctgKoqD6MZt0EjunPFcy5gdV3oh8f2r07yPDsYnL28RgolGh9n9zC
XELzV2N1QXfp4373mYO5d2OlppYd+1jPsshFnv5ALbdeiGElya+wonEz/Q/J6a9f
Ip23k4hOM8PnWVDJlqyMp3K0Kii7gOXkHYwmvOfxSU5daMlrEWfkOpxc48w1nrVy
W9kScrfTOOedPWu7YL56yif9FBjR1Elf7GmaAthwI1V6h1HLJFtFm3j1oJN9Mu8A
ek9npCHN+qOTysz9V2IcWG+6AX9dd59aSldcrPPRePi6S8nrm1/4pAUyHFdCsoop
uBljD6SFssJr37K0fdkWZz3oi9emFUMuZ7WcvPkhTdTytJkkNc8=
=WIv4
-----END PGP SIGNATURE-----
Merge tag 'linux-cpupower-6.11-rc1-2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/shuah/linux into pm-tools
Merge more cpupower utility changes for 6.11-rc1 from Shuah Khan:
"This cpupower second update for Linux 6.11-rc1 consists of
-- fix to install cpupower library in standard librray intall
location - /usr/lib
-- disable direct build of cpupower bench as it can only be
built from the cpupower main makefile."
* tag 'linux-cpupower-6.11-rc1-2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/shuah/linux:
cpupower: fix lib default installation path
cpupower: Disable direct build of the 'bench' subproject
dsos__add would add at the end of the dso array possibly requiring a
later find to re-sort the array. Patterns of find then add were
becoming O(n*log n) due to the sorts. Change the add routine to be
O(n) rather than O(1) but to maintain the sorted-ness of the dsos
array so that later finds don't need the O(n*log n) sort.
Fixes: 3f4ac23a99 ("perf dsos: Switch backing storage to array from rbtree/list")
Reported-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Steinar Gunderson <sesse@google.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Matt Fleming <matt@readmodwrite.com>
Link: https://lore.kernel.org/r/20240703172117.810918-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
The array is sorted, so just move the elements and insert in order.
Fixes: 13ca628716 ("perf comm: Add reference count checking to 'struct comm_str'")
Reported-by: Matt Fleming <matt@readmodwrite.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Matt Fleming <matt@readmodwrite.com>
Cc: Steinar Gunderson <sesse@google.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20240703172117.810918-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
This version addresses one issue:
- Fix updating TRL MSR after SST-TF is disabled in auto mode.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>