Commit Graph

32514 Commits

Author SHA1 Message Date
Jakub Kicinski
31f1aa4f74 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c
  2871edb32f ("can: kvaser_usb: Fix possible completions during init_completion")
  abb8670938 ("can: kvaser_usb_leaf: Ignore stale bus-off after start")
  8d21f5927a ("can: kvaser_usb_leaf: Fix improved state not being reported")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27 16:56:36 -07:00
Linus Torvalds
2375886721 Including fixes from 802.15.4 (Zigbee et al.).
Current release - regressions:
 
  - ipa: fix bugs in the register conversion for IPA v3.1 and v3.5.1
 
 Current release - new code bugs:
 
  - mptcp: fix abba deadlock on fastopen
 
  - eth: stmmac: rk3588: allow multiple gmac controllers in one system
 
 Previous releases - regressions:
 
  - ip: rework the fix for dflt addr selection for connected nexthop
 
  - net: couple more fixes for misinterpreting bits in struct page after
    the signature was added
 
 Previous releases - always broken:
 
  - ipv6: ensure sane device mtu in tunnels
 
  - openvswitch: switch from WARN to pr_warn on a user-triggerable path
 
  - ethtool: eeprom: fix null-deref on genl_info in dump
 
  - ieee802154: more return code fixes for corner cases in dgram_sendmsg
 
  - mac802154: fix link-quality-indicator recording
 
  - eth: mlx5: fixes for IPsec, PTP timestamps, OvS and conntrack offload
 
  - eth: fec: limit register access on i.MX6UL
 
  - eth: bcm4908_enet: update TX stats after actual transmission
 
  - can: rcar_canfd: improve IRQ handling for RZ/G2L
 
 Misc:
 
  - genetlink: piggy back on the newly added resv_op_start to enforce
    more sanity checks on new commands
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmNa2CIACgkQMUZtbf5S
 IrsEDhAAsqvsIqhnwaDuvzTpdz/l2ZiLyRixue+Z5Q88/LkSYC7SRMjh70TzbYEj
 ENbB+hzGt9zDYIga1+vtLU13rENiI+3V0Pr5eOK9jVV2KBwQmgj1PatjlLhfQ8aa
 q9c/dg3YqKFcsLjHpCZC1O3imDEU+Wt1XV+N2tuoOhJ1QVPSemjSVUEgIP+qLTD7
 cXd+bWpcEXq/X0jkptElGsCM4RHxuN9MCcQDoGfdyoGEmXDi17BmmJEVu4LWdamg
 bPlky2uerFBtuUyK3jSvsoTI0VHwcxAr/MSmMxwcRGMr/smy/1UIKfehSJUOXFsr
 XeN4pfgezqPvl4l7LjC0xx83zg1UffKGhkGuu47MS3A8rS+zSo9CEH993owOb5Ty
 ZH5ZhBsdS6wchCbM15eqEby2ATYh/pYf8gNEBYfItsj2QuIPoqt8h19yQ4Gu1eX2
 1w1RpDJH0SyD02hsmfRWKzjehHNbNM+cQ2+prVazhXuSmhGxTOqWsirv6mThlfm6
 IEuG62d0VOYFoRBKxTV27S57QyfT0/+uMyu7UjDX5lieJGXvN6wGH7UlOUDBC5j/
 4GhW8Li4hxskxv292S8nvwANAOY02wWaunVsEtLYwB+7erkPDISUkiUjdxi4Uc7W
 yfxqbhW70Yd9sDEoKXGRsQ21nl82ZBeUIWPx/xLr+F6PuKdvUHo=
 =g5TW
 -----END PGP SIGNATURE-----

Merge tag 'net-6.1-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from 802.15.4 (Zigbee et al).

  Current release - regressions:

   - ipa: fix bugs in the register conversion for IPA v3.1 and v3.5.1

  Current release - new code bugs:

   - mptcp: fix abba deadlock on fastopen

   - eth: stmmac: rk3588: allow multiple gmac controllers in one system

  Previous releases - regressions:

   - ip: rework the fix for dflt addr selection for connected nexthop

   - net: couple more fixes for misinterpreting bits in struct page
     after the signature was added

  Previous releases - always broken:

   - ipv6: ensure sane device mtu in tunnels

   - openvswitch: switch from WARN to pr_warn on a user-triggerable path

   - ethtool: eeprom: fix null-deref on genl_info in dump

   - ieee802154: more return code fixes for corner cases in
     dgram_sendmsg

   - mac802154: fix link-quality-indicator recording

   - eth: mlx5: fixes for IPsec, PTP timestamps, OvS and conntrack
     offload

   - eth: fec: limit register access on i.MX6UL

   - eth: bcm4908_enet: update TX stats after actual transmission

   - can: rcar_canfd: improve IRQ handling for RZ/G2L

  Misc:

   - genetlink: piggy back on the newly added resv_op_start to enforce
     more sanity checks on new commands"

* tag 'net-6.1-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits)
  net: enetc: survive memory pressure without crashing
  kcm: do not sense pfmemalloc status in kcm_sendpage()
  net: do not sense pfmemalloc status in skb_append_pagefrags()
  net/mlx5e: Fix macsec sci endianness at rx sa update
  net/mlx5e: Fix wrong bitwise comparison usage in macsec_fs_rx_add_rule function
  net/mlx5e: Fix macsec rx security association (SA) update/delete
  net/mlx5e: Fix macsec coverity issue at rx sa update
  net/mlx5: Fix crash during sync firmware reset
  net/mlx5: Update fw fatal reporter state on PCI handlers successful recover
  net/mlx5e: TC, Fix cloned flow attr instance dests are not zeroed
  net/mlx5e: TC, Reject forwarding from internal port to internal port
  net/mlx5: Fix possible use-after-free in async command interface
  net/mlx5: ASO, Create the ASO SQ with the correct timestamp format
  net/mlx5e: Update restore chain id for slow path packets
  net/mlx5e: Extend SKB room check to include PTP-SQ
  net/mlx5: DR, Fix matcher disconnect error flow
  net/mlx5: Wait for firmware to enable CRS before pci_restore_state
  net/mlx5e: Do not increment ESN when updating IPsec ESN state
  netdevsim: remove dir in nsim_dev_debugfs_init() when creating ports dir failed
  netdevsim: fix memory leak in nsim_drv_probe() when nsim_dev_resources_register() failed
  ...
2022-10-27 13:36:59 -07:00
Aaron Conole
25f16c873f selftests: add openvswitch selftest suite
Previous commit resolves a WARN splat that can be difficult to reproduce,
but with the ovs-dpctl.py utility, it can be trivial.  Introduce a test
case which creates a DP, and then downgrades the feature set.  This will
include a utility 'ovs-dpctl.py' that can be extended to do additional
tests and diagnostics.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-10-27 12:31:24 +02:00
Victor Nogueira
95d9a3dab1 selftests: tc-testing: Add matchJSON to tdc
This allows the use of a matchJSON field in tests to match
against JSON output from the command under test, if that
command outputs JSON.

You specify what you want to match against as a JSON array
or object in the test's matchJSON field. You can leave out
any fields you don't want to match against that are present
in the output and they will be skipped.

An example matchJSON value would look like this:

"matchJSON": [
  {
    "Value": {
      "neighIP": {
        "family": 4,
        "addr": "AQIDBA==",
        "width": 32
      },
      "nsflags": 142,
      "ncflags": 0,
      "LLADDR": "ESIzRFVm"
    }
  }
]

The real output from the command under test might have some
extra fields that we don't care about for matching, and
since we didn't include them in our matchJSON value, those
fields will not be attempted to be matched. If everything
we included above has the same values as the real command
output, the test will pass.

The matchJSON field's type must be the same as the command
output's type, otherwise the test will fail. So if the
command outputs an array, then the value of matchJSON must
also be an array.

If matchJSON is an array, it must not contain more elements
than the command output's array, otherwise the test will
fail.

Signed-off-by: Jeremy Carter <jeremy@mojatatu.com>
Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20221024111603.2185410-1-victor@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-26 20:22:33 -07:00
Colin Ian King
96f341a475 bpftool: Fix spelling mistake "disasembler" -> "disassembler"
There is a spelling mistake in an error message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/r/20221026081645.3186878-1-colin.i.king@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-10-26 18:20:22 -07:00
Yonghong Song
d96d4276ea selftests/bpf: Fix bpftool synctypes checking failure
kernel-patches/bpf failed with error:
  Running bpftool checks...
  Comparing /data/users/ast/net-next/tools/include/uapi/linux/bpf.h (bpf_map_type) and
            /data/users/ast/net-next/tools/bpf/bpftool/map.c (do_help() TYPE):
            {'cgroup_storage_deprecated', 'cgroup_storage'}
  Comparing /data/users/ast/net-next/tools/include/uapi/linux/bpf.h (bpf_map_type) and
            /data/users/ast/net-next/tools/bpf/bpftool/Documentation/bpftool-map.rst (TYPE):
            {'cgroup_storage_deprecated', 'cgroup_storage'}
The selftests/bpf/test_bpftool_synctypes.py runs checking in the above.

The failure is introduced by Commit c4bcfb38a95e("bpf: Implement cgroup storage available
to non-cgroup-attached bpf progs"). The commit introduced BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED
which has the same enum value as BPF_MAP_TYPE_CGROUP_STORAGE.

In test_bpftool_synctypes.py, one test is to compare uapi bpf.h map types and
bpftool supported maps. The tool picks 'cgroup_storage_deprecated' from bpf.h
while bpftool supported map is displayed as 'cgroup_storage'. The test failure
can be fixed by explicitly replacing 'cgroup_storage_deprecated' with 'cgroup_storage'
in uapi bpf.h map types.

Signed-off-by: Yonghong Song <yhs@fb.com>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/r/20221026163014.470732-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-26 10:02:25 -07:00
Shang XiaoJing
e9229d5b62 perf vendor events arm64: Fix incorrect Hisi hip08 L3 metrics
Commit 0cc177cfc9 ("perf vendor events arm64: Add Hisi hip08 L3
metrics") add L3 metrics of hip08, but some metrics (IF_BP_MISP_BR_RET,
IF_BP_MISP_BR_RET, IF_BP_MISP_BR_BL) have incorrect event number due to
the mistakes in document, which caused incorrect result. Fix the
incorrect metrics.

Before:

  65,811,214,308	armv8_pmuv3_0/event=0x1014/	# 18.87 push_branch
  							# -40.19 other_branch
  3,564,316,780	BR_MIS_PRED				# 0.51 indirect_branch
  							# 21.81 pop_branch

After:

  6,537,146,245	BR_MIS_PRED			# 0.48 indirect_branch
  						# 0.47 pop_branch
  						# 0.00 push_branch
  						# 0.05 other_branch

Fixes: 0cc177cfc9 ("perf vendor events arm64: Add Hisi hip08 L3 metrics")
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com>
Acked-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20221021105035.10000-2-shangxiaojing@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-26 11:01:56 -03:00
Adrian Hunter
cba04f3136 perf auxtrace: Fix address filter symbol name match for modules
For modules, names from kallsyms__parse() contain the module name which
meant that module symbols did not match exactly by name.

Fix by matching the name string up to the separating tab character.

Fixes: 1b36c03e35 ("perf record: Add support for using symbols in address filters")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20221026072736.2982-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-26 10:49:42 -03:00
Arnaldo Carvalho de Melo
831c05a762 tools headers UAPI: Sync linux/perf_event.h with the kernel sources
To pick the changes in:

  cfef80bad4 ("perf/uapi: Define PERF_MEM_SNOOPX_PEER in kernel header file")
  ee3e88dfec ("perf/mem: Introduce PERF_MEM_LVLNUM_{EXTN_MEM|IO}")
  b4e12b2d70 ("perf: Kill __PERF_SAMPLE_CALLCHAIN_EARLY")

There is a kernel patch pending that renames PERF_MEM_LVLNUM_EXTN_MEM to
PERF_MEM_LVLNUM_CXL, tooling this time is ahead of the kernel :-)

This thus partially addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h'
  diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/lkml/Y1k53KMdzypmU0WS@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-26 10:45:16 -03:00
Daniel Müller
5ed88f8151 selftests/bpf: Panic on hard/soft lockup
When running tests, we should probably accept any help we can get when
it comes to detecting issues early or making them more debuggable. We
have seen a few cases where a test_progs_noalu32 run, for example,
encountered a soft lockup and stopped making progress. It was only
interrupted once we hit the overall test timeout [0]. We can not and do
not want to necessarily rely on test timeouts, because those rely on
infrastructure provided by the environment we run in (and which is not
present in tools/testing/selftests/bpf/vmtest.sh, for example).
To that end, let's enable panics on soft as well as hard lockups to fail
fast should we encounter one. That's happening in the configuration
indented to be used for selftests (including when using vmtest.sh or
when running in BPF CI).

[0] https://github.com/kernel-patches/bpf/runs/7844499997

Signed-off-by: Daniel Müller <deso@posteo.net>
Link: https://lore.kernel.org/r/20221025231546.811766-1-deso@posteo.net
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-25 23:42:03 -07:00
Yonghong Song
0a1b69d1c7 selftests/bpf: Add test cgrp_local_storage to DENYLIST.s390x
Test cgrp_local_storage have some programs utilizing trampoline.
Arch s390x does not support trampoline so add the test to
the corresponding DENYLIST file.

Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221026042917.675685-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-25 23:19:19 -07:00
Yonghong Song
12bb6ca4e2 selftests/bpf: Add selftests for new cgroup local storage
Add four tests for new cgroup local storage, (1) testing bpf program helpers
and user space map APIs, (2) testing recursive fentry triggering won't deadlock,
(3) testing progs attached to cgroups, and (4) a negative test if the
bpf_cgrp_storage_get() helper key is not a cgroup btf id.

Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221026042911.675546-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-25 23:19:19 -07:00
Yonghong Song
fd4ca6c1fa selftests/bpf: Fix test test_libbpf_str/bpf_map_type_str
Previous bpf patch made a change to uapi bpf.h like
  @@ -922,7 +922,14 @@ enum bpf_map_type {
        BPF_MAP_TYPE_SOCKHASH,
  -     BPF_MAP_TYPE_CGROUP_STORAGE,
  +     BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED,
  +     BPF_MAP_TYPE_CGROUP_STORAGE = BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED,
        BPF_MAP_TYPE_REUSEPORT_SOCKARRAY,
where BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED and BPF_MAP_TYPE_CGROUP_STORAGE
have the same enum value. This will cause selftest test_libbpf_str/bpf_map_type_str
failing. This patch fixed the issue by avoid the check for
BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED in the test.

Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221026042906.674830-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-25 23:19:19 -07:00
Yonghong Song
f7f0f1657d bpftool: Support new cgroup local storage
Add support for new cgroup local storage

Acked-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221026042901.674177-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-25 23:19:19 -07:00
Yonghong Song
4fe64af23c libbpf: Support new cgroup local storage
Add support for new cgroup local storage.

Acked-by: David Vernet <void@manifault.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221026042856.673989-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-25 23:19:19 -07:00
Yonghong Song
c4bcfb38a9 bpf: Implement cgroup storage available to non-cgroup-attached bpf progs
Similar to sk/inode/task storage, implement similar cgroup local storage.

There already exists a local storage implementation for cgroup-attached
bpf programs.  See map type BPF_MAP_TYPE_CGROUP_STORAGE and helper
bpf_get_local_storage(). But there are use cases such that non-cgroup
attached bpf progs wants to access cgroup local storage data. For example,
tc egress prog has access to sk and cgroup. It is possible to use
sk local storage to emulate cgroup local storage by storing data in socket.
But this is a waste as it could be lots of sockets belonging to a particular
cgroup. Alternatively, a separate map can be created with cgroup id as the key.
But this will introduce additional overhead to manipulate the new map.
A cgroup local storage, similar to existing sk/inode/task storage,
should help for this use case.

The life-cycle of storage is managed with the life-cycle of the
cgroup struct.  i.e. the storage is destroyed along with the owning cgroup
with a call to bpf_cgrp_storage_free() when cgroup itself
is deleted.

The userspace map operations can be done by using a cgroup fd as a key
passed to the lookup, update and delete operations.

Typically, the following code is used to get the current cgroup:
    struct task_struct *task = bpf_get_current_task_btf();
    ... task->cgroups->dfl_cgrp ...
and in structure task_struct definition:
    struct task_struct {
        ....
        struct css_set __rcu            *cgroups;
        ....
    }
With sleepable program, accessing task->cgroups is not protected by rcu_read_lock.
So the current implementation only supports non-sleepable program and supporting
sleepable program will be the next step together with adding rcu_read_lock
protection for rcu tagged structures.

Since map name BPF_MAP_TYPE_CGROUP_STORAGE has been used for old cgroup local
storage support, the new map name BPF_MAP_TYPE_CGRP_STORAGE is used
for cgroup storage available to non-cgroup-attached bpf programs. The old
cgroup storage supports bpf_get_local_storage() helper to get the cgroup data.
The new cgroup storage helper bpf_cgrp_storage_get() can provide similar
functionality. While old cgroup storage pre-allocates storage memory, the new
mechanism can also pre-allocate with a user space bpf_map_update_elem() call
to avoid potential run-time memory allocation failure.
Therefore, the new cgroup storage can provide all functionality w.r.t.
the old one. So in uapi bpf.h, the old BPF_MAP_TYPE_CGROUP_STORAGE is alias to
BPF_MAP_TYPE_CGROUP_STORAGE_DEPRECATED to indicate the old cgroup storage can
be deprecated since the new one can provide the same functionality.

Acked-by: David Vernet <void@manifault.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221026042850.673791-1-yhs@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-25 23:19:19 -07:00
Martin KaFai Lau
387b532138 selftests/bpf: Tracing prog can still do lookup under busy lock
This patch modifies the task_ls_recursion test to check that
the first bpf_task_storage_get(&map_a, ...) in BPF_PROG(on_update)
can still do the lockless lookup even it cannot acquire the percpu
busy lock.  If the lookup succeeds, it will increment the value
by 1 and the value in the task storage map_a will become 200+1=201.
After that, BPF_PROG(on_update) tries to delete from map_a and
should get -EBUSY because it cannot acquire the percpu busy lock
after finding the data.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20221025184524.3526117-10-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-25 23:11:47 -07:00
Martin KaFai Lau
0334b4d882 selftests/bpf: Ensure no task storage failure for bpf_lsm.s prog due to deadlock detection
This patch adds a test to check for deadlock failure
in bpf_task_storage_{get,delete} when called by a sleepable bpf_lsm prog.
It also checks if the prog_info.recursion_misses is non zero.

The test starts with 32 threads and they are affinitized to one cpu.
In my qemu setup, with CONFIG_PREEMPT=y, I can reproduce it within
one second if it is run without the previous patches of this set.

Here is the test error message before adding the no deadlock detection
version of the bpf_task_storage_{get,delete}:

test_nodeadlock:FAIL:bpf_task_storage_get busy unexpected bpf_task_storage_get busy: actual 2 != expected 0
test_nodeadlock:FAIL:bpf_task_storage_delete busy unexpected bpf_task_storage_delete busy: actual 2 != expected 0

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20221025184524.3526117-9-martin.lau@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-25 23:11:46 -07:00
Alan Maguire
f3c51fe02c libbpf: Btf dedup identical struct test needs check for nested structs/arrays
When examining module BTF, it is common to see core kernel structures
such as sk_buff, net_device duplicated in the module.  After adding
debug messaging to BTF it turned out that much of the problem
was down to the identical struct test failing during deduplication;
sometimes the compiler adds identical structs.  However
it turns out sometimes that type ids of identical struct members
can also differ, even when the containing structs are still identical.

To take an example, for struct sk_buff, debug messaging revealed
that the identical struct matching was failing for the anon
struct "headers"; specifically for the first field:

__u8       __pkt_type_offset[0]; /*   128     0 */

Looking at the code in BTF deduplication, we have code that guards
against the possibility of identical struct definitions, down to
type ids, and identical array definitions.  However in this case
we have a struct which is being defined twice but does not have
identical type ids since each duplicate struct has separate type
ids for the above array member.   A similar problem (though not
observed) could occur for struct-in-struct.

The solution is to make the "identical struct" test check members
not just for matching ids, but to also check if they in turn are
identical structs or arrays.

The results of doing this are quite dramatic (for some modules
at least); I see the number of type ids drop from around 10000
to just over 1000 in one module for example.

For testing use latest pahole or apply [1], otherwise dedups
can fail for the reasons described there.

Also fix return type of btf_dedup_identical_arrays() as
suggested by Andrii to match boolean return type used
elsewhere.

Fixes: efdd3eb801 ("libbpf: Accommodate DWARF/compiler bug with duplicated structs")
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1666622309-22289-1-git-send-email-alan.maguire@oracle.com

[1] https://lore.kernel.org/bpf/1666364523-9648-1-git-send-email-alan.maguire
2022-10-25 16:16:14 -07:00
Arnaldo Carvalho de Melo
74455fd7e4 tools headers cpufeatures: Sync with the kernel sources
To pick the changes from:

  257449c6a5 ("x86/cpufeatures: Add LbrExtV2 feature bit")

This only causes these perf files to be rebuilt:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/lkml/Y1g6vGPqPhOrXoaN@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25 17:40:48 -03:00
Arnaldo Carvalho de Melo
49c75d30b0 tools headers uapi: Sync linux/stat.h with the kernel sources
To pick the changes from:

  825cf206ed ("statx: add direct I/O alignment information")

That add a constant that was manually added to tools/perf/trace/beauty/statx.c,
at some point this should move to the shell based automated way.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/stat.h' differs from latest version at 'include/uapi/linux/stat.h'
  diff -u tools/include/uapi/linux/stat.h include/uapi/linux/stat.h

Cc: Eric Biggers <ebiggers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/Y1gGQL5LonnuzeYd@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25 17:40:48 -03:00
Arnaldo Carvalho de Melo
82c50d8937 tools include UAPI: Sync sound/asound.h copy with the kernel sources
Picking the changes from:

  69ab6f5b00 ("ALSA: Remove some left-over license text in include/uapi/sound/")

Which entails no changes in the tooling side as it doesn't introduce new
SNDRV_PCM_IOCTL_ ioctls.

To silence this perf tools build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/sound/asound.h' differs from latest version at 'include/uapi/sound/asound.h'
  diff -u tools/include/uapi/sound/asound.h include/uapi/sound/asound.h

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25 17:40:48 -03:00
Arnaldo Carvalho de Melo
036b8f5b89 tools headers uapi: Update linux/in.h copy
To get the changes in:

  65b32f801b ("uapi: move IPPROTO_L2TP to in.h")
  5854a09b49 ("net/ipv4: Use __DECLARE_FLEX_ARRAY() helper")

That ends up automatically adding the new IPPROTO_L2TP to the socket
args beautifiers:

  $ tools/perf/trace/beauty/socket.sh > before
  $ cp include/uapi/linux/in.h tools/include/uapi/linux/in.h
  $ tools/perf/trace/beauty/socket.sh > after
  $ diff -u before after
  --- before	2022-10-25 12:17:02.577892416 -0300
  +++ after	2022-10-25 12:17:10.806113033 -0300
  @@ -20,6 +20,7 @@
   	[98] = "ENCAP",
   	[103] = "PIM",
   	[108] = "COMP",
  +	[115] = "L2TP",
   	[132] = "SCTP",
   	[136] = "UDPLITE",
   	[137] = "MPLS",
  $

Now 'perf trace' will decode that 115 into "L2TP" and it will also be
possible to use it in tracepoint filter expressions.

Addresses this tools/perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/in.h' differs from latest version at 'include/uapi/linux/in.h'
  diff -u tools/include/uapi/linux/in.h include/uapi/linux/in.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Wojciech Drewek <wojciech.drewek@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/lkml/Y1f%2FGe6vjQrGjYiK@kernel.org/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25 17:40:48 -03:00
Arnaldo Carvalho de Melo
4402e360d0 tools headers: Update the copy of x86's memcpy_64.S used in 'perf bench'
We also need to add SYM_TYPED_FUNC_START() to util/include/linux/linkage.h
and update tools/perf/check_headers.sh to ignore the include cfi_types.h
line when checking if the kernel original files drifted from the copies
we carry.

This is to get the changes from:

  ccace936ee ("x86: Add types to indirectly called assembly functions")

Addressing these tools/perf build warnings:

  Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
  diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Sami Tolvanen <samitolvanen@google.com>
Link: https://lore.kernel.org/lkml/Y1f3VRIec9EBgX6F@kernel.org/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25 17:40:48 -03:00
Arnaldo Carvalho de Melo
ffc1df3dc9 tools headers arm64: Sync arm64's cputype.h with the kernel sources
To get the changes in:

  0e5d5ae837 ("arm64: Add AMPERE1 to the Spectre-BHB affected list")

That addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/arm64/include/asm/cputype.h' differs from latest version at 'arch/arm64/include/asm/cputype.h'
  diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: D Scott Phillips <scott@os.amperecomputing.com>
https://lore.kernel.org/lkml/Y1fy5GD7ZYvkeufv@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25 17:40:48 -03:00
Namhyung Kim
246122a856 perf test: Do not fail Intel-PT misc test w/o libpython
The virtual LBR test uses a python script to check the max size of
branch stack in the Intel-PT generated LBR.  But it didn't check whether
python scripting is available (as it's optional).

Let's skip the test if the python support is not available.

Fixes: f77811a0f6 ("perf test: test_intel_pt.sh: Add 9 tests")
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Ammy Yi <ammy.yi@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221021181055.60183-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25 17:40:48 -03:00
Thomas Richter
5a6c184a72 perf list: Fix PMU name pai_crypto in perf list on s390
Commit e0b23af82d ("perf list: Add PMU pai_crypto event
description for IBM z16") introduced the "Processor Activity
Instrumentation" for cryptographic counters for z16. The PMU device
driver exports the counters via sysfs files listed in directory
/sys/devices/pai_crypto.

To specify an event from that PMU, use 'perf stat -e pai_crypto/XXX/'.

However the JSON file mentioned in above commit exports the counter
decriptions in file pmu-events/arch/s390/cf_z16/pai.json.  Rename this
file to pmu-events/arch/s390/cf_z16/pai_crypto.json to make the naming
consistent.

Now 'perf list' shows the counter names under pai_crypto section:

  pai_crypto:

    CRYPTO_ALL
         [CRYPTO ALL. Unit: pai_crypto]
    ...

Output before was

  pai:
    CRYPTO_ALL
         [CRYPTO ALL. Unit: pai_crypto]
    ...

Fixes: e0b23af82d ("perf list: Add PMU pai_crypto event description for IBM z16")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20221021082557.2695382-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25 17:40:48 -03:00
Ian Rogers
304f0a2f6a perf record: Fix event fd races
The write call may set errno which is problematic if occurring in a
function also setting errno. Save and restore errno around the write
call.

done_fd may be used after close, clear it as part of the close and check
its validity in the signal handler.

Suggested-by: <gthelen@google.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anand K Mistry <amistry@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20221024011024.462518-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25 17:40:48 -03:00
Arnaldo Carvalho de Melo
f1bdebbb67 perf bpf: Fix build with libbpf 0.7.0 by checking if bpf_program__set_insns() is available
During the transition to libbpf 1.0 some functions that perf used were
deprecated and finally removed from libbpf, so bpf_program__set_insns()
was introduced for perf to continue to use its bpf loader.

But when build with LIBBPF_DYNAMIC=1 we now need to check if that
function is available so that perf can build with older libbpf versions,
even if the end result is emitting a warning to the user that the use
of the perf BPF loader requires a newer libbpf, since bpf_program__set_insns()
touches libbpf objects internal state.

This affects only 'perf trace' when using bpf C code or pre-compiled
bytecode as an event.

Noticed on RHEL9, that has libbpf 0.7.0, where bpf_program__set_insns()
isn't available.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25 17:40:48 -03:00
Arnaldo Carvalho de Melo
409fb6bdd6 perf bpf: Fix build with libbpf 0.7.0 by adding prototype for bpf_load_program()
The bpf_load_program() prototype appeared in tools/lib/bpf/bpf.h as
deprecated, but nowadays its completely removed, so add it back for
building with the system libbpf when using 'make LIBBPF_DYNAMIC=1'.

This is a stop gap hack till we do like tools/bpf does with bpftool,
i.e. bootstrap the libbpf build and install it in the perf build
directory when not using 'make LIBBPF_DYNAMIC=1'.

That has to be done to all libraries in tools/lib/, so tha we can
remove -Itools/lib/ from the tools/perf CFLAGS.

Noticed when building with LIBBPF_DYNAMIC=1 and libbpf 0.7.0 on RHEL9.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25 17:40:48 -03:00
Kajol Jain
b92dd11725 perf vendor events power10: Fix hv-24x7 metric events
Testcase stat_all_metrics.sh fails in powerpc:

  90: perf all metrics test : FAILED!

The testcase "stat_all_metrics.sh" verifies perf stat result for all the
metric events present in perf list.  It runs perf metric events with
various commands and expects non-empty metric result.

Incase of powerpc:hv-24x7 events, some of the event count can be 0 based
on system configuration. And if that event used as denominator in divide
equation, it can cause divide by 0 error. The current nest_metric.json
file creating divide by 0 issue for some of the metric events, which
results in failure of the "stat_all_metrics.sh" test case.

Most of the metrics events have cycles or an event which expect to have
a larger value as denominator, so adding 1 to the denominator of the
metric expression as a fix.

Result in powerpc box after this patch changes:

  90: perf all metrics test : Ok

Fixes: a3cbcadfdf ("perf vendor events power10: Adds 24x7 nest metric events for power10 platform")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Link: https://lore.kernel.org/r/20221014140220.122251-1-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25 17:40:48 -03:00
Adrian Hunter
231e61bc2e perf docs: Fix man page build wrt perf-arm-coresight.txt
perf build assumes documentation files starting with "perf-" are man
pages but perf-arm-coresight.txt is not a man page:

  asciidoc: ERROR: perf-arm-coresight.txt: line 2: malformed manpage title
  asciidoc: ERROR: perf-arm-coresight.txt: line 3: name section expected
  asciidoc: FAILED: perf-arm-coresight.txt: line 3: section title expected
  make[3]: *** [Makefile:266: perf-arm-coresight.xml] Error 1
  make[3]: *** Waiting for unfinished jobs....
  make[2]: *** [Makefile.perf:895: man] Error 2

Fix by renaming it.

Fixes: dc2e0fb00b ("perf test coresight: Add relevant documentation about ARM64 CoreSight testing")
Reported-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Reported-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: coresight@lists.linaro.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/a176a3e1-6ddc-bb63-e41c-15cda8c2d5d2@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25 17:40:48 -03:00
Arnaldo Carvalho de Melo
8886461194 tools headers UAPI: Sync powerpc syscall tables with the kernel sources
To pick the changes in these csets:

  e237506238 ("powerpc/32: fix syscall wrappers with 64-bit arguments of unaligned register-pairs")

That doesn't cause any changes in the perf tools.

As a reminder, this table is used in tools perf to allow features such as:

  [root@five ~]# perf trace -e set_mempolicy_home_node
  ^C[root@five ~]#
  [root@five ~]# perf trace -v -e set_mempolicy_home_node
  Using CPUID AuthenticAMD-25-21-0
  event qualifier tracepoint filter: (common_pid != 253729 && common_pid != 3585) && (id == 450)
  mmap size 528384B
  ^C[root@five ~]
  [root@five ~]# perf trace -v -e set*  --max-events 5
  Using CPUID AuthenticAMD-25-21-0
  event qualifier tracepoint filter: (common_pid != 253734 && common_pid != 3585) && (id == 38 || id == 54 || id == 105 || id == 106 || id == 109 || id == 112 || id == 113 || id == 114 || id == 116 || id == 117 || id == 119 || id == 122 || id == 123 || id == 141 || id == 160 || id == 164 || id == 170 || id == 171 || id == 188 || id == 205 || id == 218 || id == 238 || id == 273 || id == 308 || id == 450)
  mmap size 528384B
       0.000 ( 0.008 ms): bash/253735 setpgid(pid: 253735 (bash), pgid: 253735 (bash))      = 0
    6849.011 ( 0.008 ms): bash/16046 setpgid(pid: 253736 (bash), pgid: 253736 (bash))       = 0
    6849.080 ( 0.005 ms): bash/253736 setpgid(pid: 253736 (bash), pgid: 253736 (bash))      = 0
    7437.718 ( 0.009 ms): gnome-shell/253737 set_robust_list(head: 0x7f34b527e920, len: 24) = 0
   13445.986 ( 0.010 ms): bash/16046 setpgid(pid: 253738 (bash), pgid: 253738 (bash))       = 0
  [root@five ~]#

That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.

  $ find tools/perf/arch/ -name "syscall*tbl" | xargs grep -w set_mempolicy_home_node
  tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl:450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
  tools/perf/arch/powerpc/entry/syscalls/syscall.tbl:450 	nospu	set_mempolicy_home_node		sys_set_mempolicy_home_node
  tools/perf/arch/s390/entry/syscalls/syscall.tbl:450  common	set_mempolicy_home_node	sys_set_mempolicy_home_node	sys_set_mempolicy_home_node
  tools/perf/arch/x86/entry/syscalls/syscall_64.tbl:450	common	set_mempolicy_home_node	sys_set_mempolicy_home_node
  $

  $ grep -w set_mempolicy_home_node /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c
	[450] = "set_mempolicy_home_node",
  $

This addresses these perf build warnings:

  Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
  diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
  Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
  diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
  Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
  diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
  Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl'
  diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
  Warning: Kernel ABI header at 'tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl' differs from latest version at 'arch/mips/kernel/syscalls/syscall_n64.tbl'
  diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Link: https://lore.kernel.org/lkml/Y01HN2DGkWz8tC%2FJ@kernel.org/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-25 17:40:48 -03:00
Jiri Olsa
b2440443a6 selftests/bpf: Add kprobe_multi kmod attach api tests
Adding kprobe_multi kmod attach api tests that attach bpf_testmod
functions via bpf_program__attach_kprobe_multi_opts.

Running it as serial test, because we don't want other tests to
reload bpf_testmod while it's running.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20221025134148.3300700-9-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-25 10:14:51 -07:00
Jiri Olsa
e697d8dceb selftests/bpf: Add kprobe_multi check to module attach test
Adding test that makes sure the kernel module won't be removed
if there's kprobe multi link defined on top of it.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20221025134148.3300700-8-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-25 10:14:51 -07:00
Jiri Olsa
fee356ede9 selftests/bpf: Add bpf_testmod_fentry_* functions
Adding 3 bpf_testmod_fentry_* functions to have a way to test
kprobe multi link on kernel module. They follow bpf_fentry_test*
functions prototypes/code.

Adding equivalent functions to all bpf_fentry_test* does not
seems necessary at the moment, could be added later.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20221025134148.3300700-7-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-25 10:14:51 -07:00
Jiri Olsa
10705b2b7a selftests/bpf: Add load_kallsyms_refresh function
Adding load_kallsyms_refresh function to re-read symbols from
/proc/kallsyms file.

This will be needed to get proper functions addresses from
bpf_testmod.ko module, which is loaded/unloaded several times
during the tests run, so symbols might be already old when
we need to use them.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20221025134148.3300700-6-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-25 10:14:51 -07:00
Quentin Monnet
08b8191ba7 bpftool: Add llvm feature to "bpftool version"
Similarly to "libbfd", add a "llvm" feature to the output of command
"bpftool version" to indicate that LLVM is used for disassembling JIT-ed
programs. This feature is mutually exclusive (from Makefile definitions)
with "libbfd".

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Tested-by: Niklas Söderlund <niklas.soderlund@corigine.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221025150329.97371-9-quentin@isovalent.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-25 10:11:57 -07:00
Quentin Monnet
ce4f660862 bpftool: Support setting alternative arch for JIT disasm with LLVM
For offloaded BPF programs, instead of failing to create the
LLVM disassembler without even looking for a triple at all, do run the
function that attempts to retrieve a valid architecture name for the
device.

It will still fail for the LLVM disassembler, because currently we have
no valid triple to return (NFP disassembly is not supported by LLVM).
But failing in that function is more logical than to assume in
jit_disasm.c that passing an "arch" name is simply not supported.

Suggested-by: Song Liu <song@kernel.org>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/r/20221025150329.97371-8-quentin@isovalent.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-25 10:11:56 -07:00
Quentin Monnet
eb9d1acf63 bpftool: Add LLVM as default library for disassembling JIT-ed programs
To disassemble instructions for JIT-ed programs, bpftool has relied on
the libbfd library. This has been problematic in the past: libbfd's
interface is not meant to be stable and has changed several times. For
building bpftool, we have to detect how the libbfd version on the system
behaves, which is why we have to handle features disassembler-four-args
and disassembler-init-styled in the Makefile. When it comes to shipping
bpftool, this has also caused issues with several distribution
maintainers unwilling to support the feature (see for example Debian's
page for binutils-dev, which ships libbfd: "Note that building Debian
packages which depend on the shared libbfd is Not Allowed." [0]).

For these reasons, we add support for LLVM as an alternative to libbfd
for disassembling instructions of JIT-ed programs. Thanks to the
preparation work in the previous commits, it's easy to add the library
by passing the relevant compilation options in the Makefile, and by
adding the functions for setting up the LLVM disassembler in file
jit_disasm.c.

The LLVM disassembler requires the LLVM development package (usually
llvm-dev or llvm-devel).

The expectation is that the interface for this disassembler will be more
stable. There is a note in LLVM's Developer Policy [1] stating that the
stability for the C API is "best effort" and not guaranteed, but at
least there is some effort to keep compatibility when possible (which
hasn't really been the case for libbfd so far). Furthermore, the Debian
page for the related LLVM package does not caution against linking to
the lib, as binutils-dev page does.

Naturally, the display of disassembled instructions comes with a few
minor differences. Here is a sample output with libbfd (already
supported before this patch):

    # bpftool prog dump jited id 56
    bpf_prog_6deef7357e7b4530:
       0:   nopl   0x0(%rax,%rax,1)
       5:   xchg   %ax,%ax
       7:   push   %rbp
       8:   mov    %rsp,%rbp
       b:   push   %rbx
       c:   push   %r13
       e:   push   %r14
      10:   mov    %rdi,%rbx
      13:   movzwq 0xb4(%rbx),%r13
      1b:   xor    %r14d,%r14d
      1e:   or     $0x2,%r14d
      22:   mov    $0x1,%eax
      27:   cmp    $0x2,%r14
      2b:   jne    0x000000000000002f
      2d:   xor    %eax,%eax
      2f:   pop    %r14
      31:   pop    %r13
      33:   pop    %rbx
      34:   leave
      35:   ret

LLVM supports several variants that we could set when initialising the
disassembler, for example with:

    LLVMSetDisasmOptions(*ctx,
                         LLVMDisassembler_Option_AsmPrinterVariant);

but the default printer is used for now. Here is the output with LLVM:

    # bpftool prog dump jited id 56
    bpf_prog_6deef7357e7b4530:
       0:   nopl    (%rax,%rax)
       5:   nop
       7:   pushq   %rbp
       8:   movq    %rsp, %rbp
       b:   pushq   %rbx
       c:   pushq   %r13
       e:   pushq   %r14
      10:   movq    %rdi, %rbx
      13:   movzwq  180(%rbx), %r13
      1b:   xorl    %r14d, %r14d
      1e:   orl     $2, %r14d
      22:   movl    $1, %eax
      27:   cmpq    $2, %r14
      2b:   jne     0x2f
      2d:   xorl    %eax, %eax
      2f:   popq    %r14
      31:   popq    %r13
      33:   popq    %rbx
      34:   leave
      35:   retq

The LLVM disassembler comes as the default choice, with libbfd as a
fall-back.

Of course, we could replace libbfd entirely and avoid supporting two
different libraries. One reason for keeping libbfd is that, right now,
it works well, we have all we need in terms of features detection in the
Makefile, so it provides a fallback for disassembling JIT-ed programs if
libbfd is installed but LLVM is not. The other motivation is that libbfd
supports nfp instruction for Netronome's SmartNICs and can be used to
disassemble offloaded programs, something that LLVM cannot do. If
libbfd's interface breaks again in the future, we might reconsider
keeping support for it.

[0] https://packages.debian.org/buster/binutils-dev
[1] https://llvm.org/docs/DeveloperPolicy.html#c-api-changes

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Tested-by: Niklas Söderlund <niklas.soderlund@corigine.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221025150329.97371-7-quentin@isovalent.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-25 10:11:56 -07:00
Quentin Monnet
e1947c750f bpftool: Refactor disassembler for JIT-ed programs
Refactor disasm_print_insn() to extract the code specific to libbfd and
move it to dedicated functions. There is no functional change. This is
in preparation for supporting an alternative library for disassembling
the instructions.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Tested-by: Niklas Söderlund <niklas.soderlund@corigine.com>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20221025150329.97371-6-quentin@isovalent.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-25 10:11:56 -07:00
Quentin Monnet
2ea4d86a50 bpftool: Group libbfd defs in Makefile, only pass them if we use libbfd
Bpftool uses libbfd for disassembling JIT-ed programs. But the feature
is optional, and the tool can be compiled without libbfd support. The
Makefile sets the relevant variables accordingly. It also sets variables
related to libbfd's interface, given that it has changed over time.

Group all those libbfd-related definitions so that it's easier to
understand what we are testing for, and only use variables related to
libbfd's interface if we need libbfd in the first place.

In addition to make the Makefile clearer, grouping the definitions
related to disassembling JIT-ed programs will help support alternatives
to libbfd.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Tested-by: Niklas Söderlund <niklas.soderlund@corigine.com>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20221025150329.97371-5-quentin@isovalent.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-25 10:11:56 -07:00
Quentin Monnet
108326d6fa bpftool: Split FEATURE_TESTS/FEATURE_DISPLAY definitions in Makefile
Make FEATURE_TESTS and FEATURE_DISPLAY easier to read and less likely to
be subject to conflicts on updates by having one feature per line.

Suggested-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Tested-by: Niklas Söderlund <niklas.soderlund@corigine.com>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20221025150329.97371-4-quentin@isovalent.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-25 10:11:56 -07:00
Quentin Monnet
55b4de58d0 bpftool: Remove asserts from JIT disassembler
The JIT disassembler in bpftool is the only components (with the JSON
writer) using asserts to check the return values of functions. But it
does not do so in a consistent way, and diasm_print_insn() returns no
value, although sometimes the operation failed.

Remove the asserts, and instead check the return values, print messages
on errors, and propagate the error to the caller from prog.c.

Remove the inclusion of assert.h from jit_disasm.c, and also from map.c
where it is unused.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Tested-by: Niklas Söderlund <niklas.soderlund@corigine.com>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20221025150329.97371-3-quentin@isovalent.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-25 10:11:56 -07:00
Quentin Monnet
b3d84af7cd bpftool: Define _GNU_SOURCE only once
_GNU_SOURCE is defined in several source files for bpftool, but only one
of them takes the precaution of checking whether the value is already
defined. Add #ifndef for other occurrences too.

This is in preparation for the support of disassembling JIT-ed programs
with LLVM, with $(llvm-config --cflags) passing -D_GNU_SOURCE as a
compilation argument.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Tested-by: Niklas Söderlund <niklas.soderlund@corigine.com>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20221025150329.97371-2-quentin@isovalent.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-25 10:11:56 -07:00
Todd Brandt
9bfb09774e pm-graph v5.10
sleepgraph:
- add -wifitrace argument for tracing all the way to wifi reconnect
- include more data in ftrace to mark the end of kernel resume
- add async_synchronize_full to the list of funcs to chart
- add thermal zone info to the log data
- include a check for s0ix support (s2idle is the default mem_sleep)
- if s2idle does not support s0ix, remove the SYS%LPI turbostat var
- fix -dev crash when kprobe caller is just an address (not a symbol)
- fix the cpuexec data in -proc to display in resume

sleepgraph.8:
- add -wifitrace documentation

README:
- change links from 01.org to developer.intel.com

Signed-off-by: Todd Brandt <todd.e.brandt@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-10-25 17:46:15 +02:00
Kuniyuki Iwashima
6df96146b2 selftest: Add test for SO_INCOMING_CPU.
Some highly optimised applications use SO_INCOMING_CPU to make them
efficient, but they didn't test if it's working correctly by getsockopt()
to avoid slowing down.  As a result, no one noticed it had been broken
for years, so it's a good time to add a test to catch future regression.

The test does

  1) Create $(nproc) TCP listeners associated with each CPU.

  2) Create 32 child sockets for each listener by calling
     sched_setaffinity() for each CPU.

  3) Check if accept()ed sockets' sk_incoming_cpu matches
     listener's one.

If we see -EAGAIN, SO_INCOMING_CPU is broken.  However, we might not see
any error even if broken; the kernel could miraculously distribute all SYN
to correct listeners.  Not to let that happen, we must increase the number
of clients and CPUs to some extent, so the test requires $(nproc) >= 2 and
creates 64 sockets at least.

Test:
  $ nproc
  96
  $ ./so_incoming_cpu

Before the previous patch:

  # Starting 12 tests from 5 test cases.
  #  RUN           so_incoming_cpu.before_reuseport.test1 ...
  # so_incoming_cpu.c:191:test1:Expected cpu (5) == i (0)
  # test1: Test terminated by assertion
  #          FAIL  so_incoming_cpu.before_reuseport.test1
  not ok 1 so_incoming_cpu.before_reuseport.test1
  ...
  # FAILED: 0 / 12 tests passed.
  # Totals: pass:0 fail:12 xfail:0 xpass:0 skip:0 error:0

After:

  # Starting 12 tests from 5 test cases.
  #  RUN           so_incoming_cpu.before_reuseport.test1 ...
  # so_incoming_cpu.c:199:test1:SO_INCOMING_CPU is very likely to be working correctly with 3072 sockets.
  #            OK  so_incoming_cpu.before_reuseport.test1
  ok 1 so_incoming_cpu.before_reuseport.test1
  ...
  # PASSED: 12 / 12 tests passed.
  # Totals: pass:12 fail:0 xfail:0 xpass:0 skip:0 error:0

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-10-25 11:35:16 +02:00
Jakub Kicinski
96917bb3a3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
include/linux/net.h
  a5ef058dc4 ("net: introduce and use custom sockopt socket flag")
  e993ffe3da ("net: flag sockets supporting msghdr originated zerocopy")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-24 13:44:11 -07:00
Linus Torvalds
337a0a0b63 Including fixes from bpf.
Current release - regressions:
 
  - eth: fman: re-expose location of the MAC address to userspace,
    apparently some udev scripts depended on the exact value
 
 Current release - new code bugs:
 
  - bpf:
    - wait for busy refill_work when destroying bpf memory allocator
    - allow bpf_user_ringbuf_drain() callbacks to return 1
    - fix dispatcher patchable function entry to 5 bytes nop
 
 Previous releases - regressions:
 
  - net-memcg: avoid stalls when under memory pressure
 
  - tcp: fix indefinite deferral of RTO with SACK reneging
 
  - tipc: fix a null-ptr-deref in tipc_topsrv_accept
 
  - eth: macb: specify PHY PM management done by MAC
 
  - tcp: fix a signed-integer-overflow bug in tcp_add_backlog()
 
 Previous releases - always broken:
 
  - eth: amd-xgbe: SFP fixes and compatibility improvements
 
 Misc:
 
  - docs: netdev: offer performance feedback to contributors
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmNW024ACgkQMUZtbf5S
 IrvX7w//SP/zKZwgzC13zd2rrCP16TX2QvkHPmSLvcldQDXdCypmsoc5Vb8UNpkG
 jwAuy2pxqPy2oxTwTBQv9TNRT2oqEFOsFTK+w410whlL7g1wZ02aXU8qFhV2XumW
 o4gRtM+UISPUKFbOnawdK1XlrNdeLF3bjETvW2GP9zxCb0iqoQXtDDNKxv2B2iQA
 MSyTtzHA4n9GS7LKGtPgsP2Ose7h1Z+AjTIpQH1nvfEHJUf/wmxUdCK+fuwfeLjY
 PhmYaPG/333j1bfBk1Ms/nUYA5KRXlEj9A/7jDtxhxNEwaTNKyLB19a6oVxXxpSQ
 x/k+nZP1RColn5xeco5a1X9aHHQ46PJQ8wVAmxYDIeIA5XPMgShNmhAyjrq1ac+o
 9vYeYpmnMGSTLdBMvGbWpynWHe7SddgF8LkbnYf2HLKbxe4bgkOnmxOUH4q9iinZ
 MfVSknjax4DP0C7X1kGgR6WyltWnkrahOdUkINsIUNxj0KxJa/eStpJIbJrfkxNV
 gHbOjB2/bF3SXENrS4A0IJCgsbO9YugN83Eyu0WDWQOw9wVgopzxOJx9R+H0wkVH
 XpGGP8qi1DZiTE3iQiq1LHj6f6kirFmtt9QFH5yzaqtKBaqXakHaXwUO4VtD+BI9
 NPFKvFL6jrp8EAn0PTM/RrvhJZN+V0bFXiyiMe0TLx+aR0UMxGc=
 =dD6N
 -----END PGP SIGNATURE-----

Merge tag 'net-6.1-rc3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bpf.

  The net-memcg fix stands out, the rest is very run-off-the-mill. Maybe
  I'm biased.

  Current release - regressions:

   - eth: fman: re-expose location of the MAC address to userspace,
     apparently some udev scripts depended on the exact value

  Current release - new code bugs:

   - bpf:
       - wait for busy refill_work when destroying bpf memory allocator
       - allow bpf_user_ringbuf_drain() callbacks to return 1
       - fix dispatcher patchable function entry to 5 bytes nop

  Previous releases - regressions:

   - net-memcg: avoid stalls when under memory pressure

   - tcp: fix indefinite deferral of RTO with SACK reneging

   - tipc: fix a null-ptr-deref in tipc_topsrv_accept

   - eth: macb: specify PHY PM management done by MAC

   - tcp: fix a signed-integer-overflow bug in tcp_add_backlog()

  Previous releases - always broken:

   - eth: amd-xgbe: SFP fixes and compatibility improvements

  Misc:

   - docs: netdev: offer performance feedback to contributors"

* tag 'net-6.1-rc3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (37 commits)
  net-memcg: avoid stalls when under memory pressure
  tcp: fix indefinite deferral of RTO with SACK reneging
  tcp: fix a signed-integer-overflow bug in tcp_add_backlog()
  net: lantiq_etop: don't free skb when returning NETDEV_TX_BUSY
  net: fix UAF issue in nfqnl_nf_hook_drop() when ops_init() failed
  docs: netdev: offer performance feedback to contributors
  kcm: annotate data-races around kcm->rx_wait
  kcm: annotate data-races around kcm->rx_psock
  net: fman: Use physical address for userspace interfaces
  net/mlx5e: Cleanup MACsec uninitialization routine
  atlantic: fix deadlock at aq_nic_stop
  nfp: only clean `sp_indiff` when application firmware is unloaded
  amd-xgbe: add the bit rate quirk for Molex cables
  amd-xgbe: fix the SFP compliance codes check for DAC cables
  amd-xgbe: enable PLL_CTL for fixed PHY modes only
  amd-xgbe: use enums for mailbox cmd and sub_cmds
  amd-xgbe: Yellow carp devices do not need rrc
  bpf: Use __llist_del_all() whenever possbile during memory draining
  bpf: Wait for busy refill_work when destroying bpf memory allocator
  MAINTAINERS: add keyword match on PTP
  ...
2022-10-24 12:43:51 -07:00
Linus Torvalds
21c92498e9 linux-kselftest-fixes-6.1-rc3
This Kselftest fixes update for Linux 6.1-rc3 consists of:
 
 - futex, intel_pstate, kexec build fixes
 - ftrace dynamic_events dependency check fix
 - memory-hotplug fix to remove redundant warning from test report
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmNWu/cACgkQCwJExA0N
 Qxz/nBAAhKn5l5G69tUcJIPT2XNNVnslJJKg8CBuFhfzw9U+BIJqbR9ZCqyzLAz2
 ZVYTEbZtshbjgEJbe8KwDskK3H9F3RYkMEJ+EOH7N6GwcL9Ol7Ubzx6Yf9RM3V8S
 +Y8rUE3iZfC4grR4uQNK0edUb2mXZwMLB+ZYEetp9YlTS62vtpUUtaMWRAU4mvd5
 2Ws+6YKV89kP4zoZB6SCJe9/ORvp0wvskoSCshw9BAaJrbJ6MkGToOZtFWzTCKNx
 x3he60mj4n5rE6PtxuWh1sAar1m/7ZF9DT9gGy6t99/zT0BUWTUQ8RECi2/IIe5m
 SqXJkWiNqO+ytFm17IqDPd/fk/XQlCdCRR423fkAjie6DGdZzu1KeJjg+Q9vPpeP
 qaCpFXwc3E4culDKDWBn2ZRQZFBoFbrCUEbqO0uZB5Ua9jB9FuTJhofanEmz3TUh
 XZlXUrPTVwE6SLur7f5x8ubj//0qGcbdM0qyYl6C9/i213axYdj3V6Xh1FStHJpK
 IvNcUqcxrSAPEKXmOY9hEZeAMHYevBwRXOqHSLNhrZbjGN0ASGfIhU5Ys1ffNS0c
 SuVsxfxzqR1Eu11JqRnSIcHj9hJBgroaBv+c5h2ivZD+MqZSrocFoRnnwjV3nW9S
 B+n6Whi1JuatLK4pfn8Ba+VoZ7hl9Bb3dxvALRdqDbqYBs5cp+I=
 =awHm
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-fixes-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest fixes from Shuah Khan:

 - futex, intel_pstate, kexec build fixes

 - ftrace dynamic_events dependency check fix

 - memory-hotplug fix to remove redundant warning from test report

* tag 'linux-kselftest-fixes-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/ftrace: fix dynamic_events dependency check
  selftests/memory-hotplug: Remove the redundant warning information
  selftests/kexec: fix build for ARCH=x86_64
  selftests/intel_pstate: fix build for ARCH=x86_64
  selftests/futex: fix build for clang
2022-10-24 12:10:55 -07:00
Jakub Kicinski
e28c44450b bpf-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmNVkYkACgkQ6rmadz2v
 bTqzHw/+NYMwfLm5Ck+BK0+HiYU5VVLoG4jp8G7B3sJL/6nUDduajzoqa+nM19Xl
 +HEjbMza7CizmhkCRkzIs1VVtx8mtvKdTxbhvm77SU2+GBn+X1es+XhtFd4EOpok
 MINNHs+cOC/HlnPD/QbFgvxKiKkjyjWxInjUp6Y/mLMcKCn7l9KOkc07/la9Dj3j
 RI0gXCywq1pJaPuTCnt0/wcYLJvzn6QsZnKmmksQwt59GQqOd11HWid3rBWZhDp6
 beEoHDIMGHROtu60vm4DB0p4l6tauZfeXyPCeu3Tx5ZSsypJIyU1iTdKiIUjG963
 ilpy55nrX9bWxadB7LIKHyYfW3in4o+D1ZZaUvLIato/69CZJZ7Uc4kU1RF4Ay1F
 Df1Fmal2WeNAxxETPmQPvVeCePvQvwLHl4KNogdZZvd/67cyc1cDhnuTJp37iPak
 FALHaaw0VOrTdTvxsWym7yEbkhPbCHpPrKYFZFHgGrRTFk/GM2k38mM07lcLxFGw
 aKyooS+eoIZMEgtK5Hma2wpeIVSlkJiJk1d0K20OxdnIUyYEsMXmI+uV1gMxq/8z
 EHNi0+296xOoxy22I1Bd5Tu7fIeecHFN44q7YFmpGsB54UNLpFsP0vYUmYT/6hLI
 Y0KVZu4c3oQDX7ttifMvkeOCURDJBPrZx37bpNpNXF55fB5ehNk=
 =eV7W
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Alexei Starovoitov says:

====================
pull-request: bpf 2022-10-23

We've added 7 non-merge commits during the last 18 day(s) which contain
a total of 8 files changed, 69 insertions(+), 5 deletions(-).

The main changes are:

1) Wait for busy refill_work when destroying bpf memory allocator, from Hou.

2) Allow bpf_user_ringbuf_drain() callbacks to return 1, from David.

3) Fix dispatcher patchable function entry to 5 bytes nop, from Jiri.

4) Prevent decl_tag from being referenced in func_proto, from Stanislav.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Use __llist_del_all() whenever possbile during memory draining
  bpf: Wait for busy refill_work when destroying bpf memory allocator
  bpf: Fix dispatcher patchable function entry to 5 bytes nop
  bpf: prevent decl_tag from being referenced in func_proto
  selftests/bpf: Add reproducer for decl_tag in func_proto return type
  selftests/bpf: Make bpf_user_ringbuf_drain() selftest callback return 1
  bpf: Allow bpf_user_ringbuf_drain() callbacks to return 1
====================

Link: https://lore.kernel.org/r/20221023192244.81137-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-24 10:32:01 -07:00
Linus Torvalds
05b4ebd2c7 RISC-V:
- Fix compilation without RISCV_ISA_ZICBOM
 
 - Fix kvm_riscv_vcpu_timer_pending() for Sstc
 
 ARM:
 
 - Fix a bug preventing restoring an ITS containing mappings
   for very large and very sparse device topology
 
 - Work around a relocation handling error when compiling
   the nVHE object with profile optimisation
 
 - Fix for stage-2 invalidation holding the VM MMU lock
   for too long by limiting the walk to the largest
   block mapping size
 
 - Enable stack protection and branch profiling for VHE
 
 - Two selftest fixes
 
 x86:
 
 - add compat implementation for KVM_X86_SET_MSR_FILTER ioctl
 
 selftests:
 
 - synchronize includes between include/uapi and tools/include/uapi
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmNT2hQUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNeVwgAkGGk2F2SF5s+MQUQ9tDPxyuRbddN
 NPo/YRTKszKc8rK6d1TCbQi56I3e8Oa7kNkMF7CiBlAekB7B1r1ySg5qc+3lQebx
 moME30Ru4nmfqPcZ7971MA8Me7zZxGzvIviL5KIwm1ownGifdTsPZ9jCvu4EPdzv
 3dd10guH3GeBIq8QeQGEqNP4fticziwhE+IA3HZstcWsq96800Le7WNAgklfzdC+
 YTB81QU6whHv6N/7YvRcTbp+tER3VIKdFMmRD1FwC90flhXMbxTymESFXULfHCM2
 x/arGz2E31/QGgJo0/Yy2VPenr5ZMU57dL4SYWR02mwSfJQnJWb1cRdWnw==
 =rxQ7
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "RISC-V:

   - Fix compilation without RISCV_ISA_ZICBOM

   - Fix kvm_riscv_vcpu_timer_pending() for Sstc

  ARM:

   - Fix a bug preventing restoring an ITS containing mappings for very
     large and very sparse device topology

   - Work around a relocation handling error when compiling the nVHE
     object with profile optimisation

   - Fix for stage-2 invalidation holding the VM MMU lock for too long
     by limiting the walk to the largest block mapping size

   - Enable stack protection and branch profiling for VHE

   - Two selftest fixes

  x86:

   - add compat implementation for KVM_X86_SET_MSR_FILTER ioctl

  selftests:

   - synchronize includes between include/uapi and tools/include/uapi"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  tools: include: sync include/api/linux/kvm.h
  KVM: x86: Add compat handler for KVM_X86_SET_MSR_FILTER
  KVM: x86: Copy filter arg outside kvm_vm_ioctl_set_msr_filter()
  kvm: Add support for arch compat vm ioctls
  RISC-V: KVM: Fix kvm_riscv_vcpu_timer_pending() for Sstc
  RISC-V: Fix compilation without RISCV_ISA_ZICBOM
  KVM: arm64: vgic: Fix exit condition in scan_its_table()
  KVM: arm64: nvhe: Fix build with profile optimization
  KVM: selftests: Fix number of pages for memory slot in memslot_modification_stress_test
  KVM: arm64: selftests: Fix multiple versions of GIC creation
  KVM: arm64: Enable stack protection and branch profiling for VHE
  KVM: arm64: Limit stage2_apply_range() batch size to largest block
  KVM: arm64: Work out supported block level at compile time
2022-10-23 15:00:43 -07:00
Linus Torvalds
a703852408 - Fix raw data handling when perf events are used in bpf
- Rework how SIGTRAPs get delivered to events to address a bunch of
   problems with it. Add a selftest for that too
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmNVE9UACgkQEsHwGGHe
 VUpodRAAsn3s+xX02qfxRG0mYX/1nnOmnFnDhmUEPAZpDXjB6g3PFXAh26F9tw92
 dLm8fbfp8W3tK7TQrGyOGWMnb7oj4gEoXAcQBXvsq2KOJMVwRHHKwCeSSJ89DLAW
 zZEl2nzgea2cGyZn2RZJvhmbm8YdFON3v++Vv34yovs1MMoCcD7FOPLLGWkD2gk5
 6PK6lIlWEI/+vYKWhZdxgk6/PanBInXRUnbaBVj42US1XwfDLzPBEi9yyUUJQrht
 CQhfTpHn4Z5MC/hJTlFat4Jlaajql4JBcQ1SS5LW59M+6gdlBK4tL6zFt10zvU2m
 +kywOOIWiYLRRgFf6idGO45P5BuWOdmsXEaEg5KW7b6nJfGvgkd7WUYgCVOgtEY1
 r4Mf4hAQUunDHGQ4e8eHk7XemFJsoBSweYCTQ2O0yr/QzO2M6QBi/BR9PzUajyH+
 yShKEfrxXt4595BMH0nonSMpTKcE4Zxdj06LZSnGecEN8UUlx/n49uYwhFNdUkqM
 s6Wz6kSR76YRlKUmYnNzP1gkY6nJZ1nR6z7SjmMkioxf3VxhT9SY8K587r6hRUlr
 /NVA69iUhJy75VdttxZEmZzB03A7AjdudmZEisF0ImEmB1hxzYLHcDKJMTIj/r4/
 f8OXCg5ACKhFlnx1SdBVRtA+6+5ab368Fs2rItJQ4dzdxRi6VVY=
 =DXG7
 -----END PGP SIGNATURE-----

Merge tag 'perf_urgent_for_v6.1_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Borislav Petkov:

 - Fix raw data handling when perf events are used in bpf

 - Rework how SIGTRAPs get delivered to events to address a bunch of
   problems with it. Add a selftest for that too

* tag 'perf_urgent_for_v6.1_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  bpf: Fix sample_flags for bpf_perf_event_output
  selftests/perf_events: Add a SIGTRAP stress test with disables
  perf: Fix missing SIGTRAPs
2022-10-23 10:14:45 -07:00
Paolo Bonzini
9aec606c16 tools: include: sync include/api/linux/kvm.h
Provide a definition of KVM_CAP_DIRTY_LOG_RING_ACQ_REL.

Fixes: 17601bfed9 ("KVM: Add KVM_CAP_DIRTY_LOG_RING_ACQ_REL capability and config option")
Cc: Marc Zyngier <maz@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-22 07:54:19 -04:00
Paolo Bonzini
5834816829 KVM/arm64 fixes for 6.1, take #1
- Fix for stage-2 invalidation holding the VM MMU lock
   for too long by limiting the walk to the largest
   block mapping size
 
 - Enable stack protection and branch profiling for VHE
 
 - Two selftest fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmNIEAUPHG1hekBrZXJu
 ZWwub3JnAAoJECPQ0LrRPXpDIXcP/AyWlCEJlmc1Jcd9rlaW1Wenr82U+StLVeMy
 qP5P02gMWdbGExWIWEi4zkt+pAm7K2WRgXid9z5Vjw7kZY/+WwswTzKHWcQhVuZv
 cBHfeOqgtoHVGR8NcwX6xcp406y3WRqYIsyAmbc5qmo75L8Ew1o3m+3eDfFtAq7l
 3XuTCv+lQGNSGMhXHN2SVewZ+pCAo3XJmuHfCBXTqRjwqH4Tzh+54IKzo+9mqBWW
 7yeIm5qcbIKGuXLuLL7XCf99gWy/3kQ0xQ1yJeXLAyiHswHqEISZXGHnKeATvD+6
 RdbmQ9oRmIYfZfoDKZRUJg8TyTvW1rIKokFbe0q2iyuDnI5D/fAJ48epZaLw+kEf
 PUzdB3UgPk19SLwgZKQddqY4wOD420ZD5x1TUFUQuLL7sjVv1vUILDvuCLWpq7F7
 GyfSB+LEMgexHGsZ1wjslN/ivTbG+dQgaSS9mlV8/WDOLPtD2uOf65vYR3P28hAX
 zOHrwm3e2+UV83BsEFEY2FQiiIBD24JmSecMbmAIHY09MCSZ+vJ/WbF4J1PcPP8C
 3vjueIYTcjhzLtQrfIkGZcS7+wC9ji/RRmpJjbg79EpwrjhEs9G8h1+HyL9+zBZ4
 Xn6X+ZG/cv0/ZYdin0ZRzJMvM0RutbsR77blVCLY97PBuLtBlqJDcxr+lmmjyIZ2
 Db8Qd6uW
 =IOxM
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 6.1, take #1

- Fix for stage-2 invalidation holding the VM MMU lock
  for too long by limiting the walk to the largest
  block mapping size

- Enable stack protection and branch profiling for VHE

- Two selftest fixes
2022-10-22 03:32:23 -04:00
Dave Marchevsky
8f4bc15b9a selftests/bpf: Add write to hashmap to array_map iter test
Modify iter prog in existing bpf_iter_bpf_array_map.c, which currently
dumps arraymap key/val, to also do a write of (val, key) into a
newly-added hashmap. Confirm that the write succeeds as expected by
modifying the userspace runner program.

Before a change added in an earlier commit - considering PTR_TO_BUF reg
a valid input to helpers which expect MAP_{KEY,VAL} - the verifier
would've rejected this prog change due to type mismatch. Since using
current iter's key/val to access a separate map is a reasonable usecase,
let's add support for it.

Note that the test prog cannot directly write (val, key) into hashmap
via bpf_map_update_elem when both come from iter context because key is
marked MEM_RDONLY. This is due to bpf_map_update_elem - and other basic
map helpers - taking ARG_PTR_TO_MAP_{KEY,VALUE} w/o MEM_RDONLY type
flag. bpf_map_{lookup,update,delete}_elem don't modify their
input key/val so it should be possible to tag their args READONLY, but
due to the ubiquitous use of these helpers and verifier checks for
type == MAP_VALUE, such a change is nontrivial and seems better to
address in a followup series.

Also fixup some 'goto's in test runner's map checking loop.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20221020160721.4030492-4-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-21 19:23:34 -07:00
Dave Marchevsky
51ee71d38d selftests/bpf: Add test verifying bpf_ringbuf_reserve retval use in map ops
Add a test_ringbuf_map_key test prog, borrowing heavily from extant
test_ringbuf.c. The program tries to use the result of
bpf_ringbuf_reserve as map_key, which was not possible before previouis
commits in this series. The test runner added to prog_tests/ringbuf.c
verifies that the program loads and does basic sanity checks to confirm
that it runs as expected.

Also, refactor test_ringbuf such that runners for existing test_ringbuf
and newly-added test_ringbuf_map_key are subtests of 'ringbuf' top-level
test.

Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20221020160721.4030492-3-davemarchevsky@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-21 19:23:34 -07:00
Manu Bretelle
94d52a1918 selftests/bpf: Initial DENYLIST for aarch64
Those tests are currently failing on aarch64, ignore them until they are
individually addressed.

Using this deny list, vmtest.sh ran successfully using

LLVM_STRIP=llvm-strip-16 CLANG=clang-16 \
    tools/testing/selftests/bpf/vmtest.sh  -- \
        ./test_progs -d \
            \"$(cat tools/testing/selftests/bpf/DENYLIST{,.aarch64} \
                | cut -d'#' -f1 \
                | sed -e 's/^[[:space:]]*//' \
                      -e 's/[[:space:]]*$//' \
                | tr -s '\n' ','\
            )\"

Signed-off-by: Manu Bretelle <chantr4@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221021210701.728135-5-chantr4@gmail.com
2022-10-21 16:27:26 -07:00
Manu Bretelle
20776b72ae selftests/bpf: Update vmtests.sh to support aarch64
Add handling of aarch64 when setting QEMU options and provide the right
path to aarch64 kernel image.

Signed-off-by: Manu Bretelle <chantr4@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221021210701.728135-4-chantr4@gmail.com
2022-10-21 16:27:25 -07:00
Manu Bretelle
ec99451f0a selftests/bpf: Add config.aarch64
config.aarch64, similarly to config.{s390x,x86_64} is a config enabling
building a kernel on aarch64 to be used in bpf's
selftests/kernel-patches CI.

Signed-off-by: Manu Bretelle <chantr4@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221021210701.728135-3-chantr4@gmail.com
2022-10-21 16:27:25 -07:00
Manu Bretelle
7a42af4b94 selftests/bpf: Remove entries from config.s390x already present in config
`config.s390x` had entries already present in `config`.

When generating the config used by vmtest, we concatenate the `config`
file with the `config.{arch}` one, making those entries duplicated.

This patch removes that duplication.

Before:
$ comm -1 -2  <(sort tools/testing/selftests/bpf/config.s390x) <(sort
tools/testing/selftests/bpf/config)
CONFIG_MODULE_SIG=y
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
$

Ater:
$ comm -1 -2  <(sort tools/testing/selftests/bpf/config.s390x) <(sort
tools/testing/selftests/bpf/config)
$

Signed-off-by: Manu Bretelle <chantr4@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221021210701.728135-2-chantr4@gmail.com
2022-10-21 16:27:25 -07:00
Quentin Monnet
2c76238ead bpftool: Add "bootstrap" feature to version output
Along with the version number, "bpftool version" displays a list of
features that were selected at compilation time for bpftool. It would be
useful to indicate in that list whether a binary is a bootstrap version
of bpftool. Given that an increasing number of components rely on
bootstrap versions for generating skeletons, this could help understand
what a binary is capable of if it has been copied outside of the usual
"bootstrap" directory.

To detect a bootstrap version, we simply rely on the absence of
implementation for the do_prog() function. To do this, we must move the
(unchanged) list of commands before do_version(), which in turn requires
renaming this "cmds" array to avoid shadowing it with the "cmds"
argument in cmd_select().

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20221020100332.69563-1-quentin@isovalent.com
2022-10-21 23:41:13 +02:00
Quentin Monnet
7e5eb725cf bpftool: Set binary name to "bpftool" in help and version output
Commands "bpftool help" or "bpftool version" use argv[0] to display the
name of the binary. While it is a convenient way to retrieve the string,
it does not always produce the most readable output. For example,
because of the way bpftool is currently packaged on Ubuntu (using a
wrapper script), the command displays the absolute path for the binary:

    $ bpftool version | head -n 1
    /usr/lib/linux-tools/5.15.0-50-generic/bpftool v5.15.60

More generally, there is no apparent reason for keeping the whole path
and exact binary name in this output. If the user wants to understand
what binary is being called, there are other ways to do so. This commit
replaces argv[0] with "bpftool", to simply reflect what the tool is
called. This is aligned on what "ip" or "tc" do, for example.

As an additional benefit, this seems to help with integration with
Meson for packaging [0].

[0] https://github.com/NixOS/nixpkgs/pull/195934

Suggested-by: Vladimír Čunát <vladimir.cunat@nic.cz>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221020100300.69328-1-quentin@isovalent.com
2022-10-21 14:37:46 -07:00
Xu Kuohai
d9740535b8 libbpf: Avoid allocating reg_name with sscanf in parse_usdt_arg()
The reg_name in parse_usdt_arg() is used to hold register name, which
is short enough to be held in a 16-byte array, so we could define
reg_name as char reg_name[16] to avoid dynamically allocating reg_name
with sscanf.

Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20221018145538.2046842-1-xukuohai@huaweicloud.com
2022-10-21 14:28:14 -07:00
Delyan Kratunov
eb814cf1ad selftests/bpf: fix task_local_storage/exit_creds rcu usage
BPF CI has revealed flakiness in the task_local_storage/exit_creds test.
The failure point in CI [1] is that null_ptr_count is equal to 0,
which indicates that the program hasn't run yet. This points to the
kern_sync_rcu (sys_membarrier -> synchronize_rcu underneath) not
waiting sufficiently.

Indeed, synchronize_rcu only waits for read-side sections that started
before the call. If the program execution starts *during* the
synchronize_rcu invocation (due to, say, preemption), the test won't
wait long enough.

As a speculative fix, make the synchornize_rcu calls in a loop until
an explicit run counter has gone up.

  [1]: https://github.com/kernel-patches/bpf/actions/runs/3268263235/jobs/5374940791

Signed-off-by: Delyan Kratunov <delyank@meta.com>
Link: https://lore.kernel.org/r/156d4ef82275a074e8da8f4cffbd01b0c1466493.camel@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-21 13:58:09 -07:00
Linus Torvalds
ce3d90a877 Tracing/tools updates for 6.1-rc1:
- Make dot2c generate monitor's automata definition static
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCY1G25RQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qv7vAQCmyxA4f7V1eoGAmCUKly5MUqXijUYU
 L6msA1x+s7ENeQEAvUWBSVcu/bzwsonijQJp2OCbUKq9NkAhR5Yk69D4OQM=
 =BL24
 -----END PGP SIGNATURE-----

Merge tag 'trace-tools-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing tool update from Steven Rostedt:

 - Make dot2c generate monitor's automata definition static

* tag 'trace-tools-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  rv/dot2c: Make automaton definition static
2022-10-21 12:29:52 -07:00
Wang Yufen
b81a677400 bpftool: Update the bash completion(add autoattach to prog load)
Add autoattach optional to prog load|loadall for supporting
one-step load-attach-pin_link.

Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Link: https://lore.kernel.org/r/1665736275-28143-4-git-send-email-wangyufen@huawei.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-21 08:59:00 -07:00
Wang Yufen
ff0e9a579e bpftool: Update doc (add autoattach to prog load)
Add autoattach optional to prog load|loadall for supporting
one-step load-attach-pin_link.

Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Link: https://lore.kernel.org/r/1665736275-28143-3-git-send-email-wangyufen@huawei.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-21 08:59:00 -07:00
Wang Yufen
19526e701e bpftool: Add autoattach for bpf prog load|loadall
Add autoattach optional to support one-step load-attach-pin_link.

For example,
   $ bpftool prog loadall test.o /sys/fs/bpf/test autoattach

   $ bpftool link
   26: tracing  name test1  tag f0da7d0058c00236  gpl
   	loaded_at 2022-09-09T21:39:49+0800  uid 0
   	xlated 88B  jited 55B  memlock 4096B  map_ids 3
   	btf_id 55
   28: kprobe  name test3  tag 002ef1bef0723833  gpl
   	loaded_at 2022-09-09T21:39:49+0800  uid 0
   	xlated 88B  jited 56B  memlock 4096B  map_ids 3
   	btf_id 55
   57: tracepoint  name oncpu  tag 7aa55dfbdcb78941  gpl
   	loaded_at 2022-09-09T21:41:32+0800  uid 0
   	xlated 456B  jited 265B  memlock 4096B  map_ids 17,13,14,15
   	btf_id 82

   $ bpftool link
   1: tracing  prog 26
   	prog_type tracing  attach_type trace_fentry
   3: perf_event  prog 28
   10: perf_event  prog 57

The autoattach optional can support tracepoints, k(ret)probes,
u(ret)probes.

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Link: https://lore.kernel.org/r/1665736275-28143-2-git-send-email-wangyufen@huawei.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-21 08:59:00 -07:00
Benjamin Poirier
b2c0921b92 selftests: net: Fix netdev name mismatch in cleanup
lag_lib.sh creates the interfaces dummy1 and dummy2 whereas
dev_addr_lists.sh:destroy() deletes the interfaces dummy0 and dummy1. Fix
the mismatch in names.

Fixes: bbb774d921 ("net: Add tests for bonding and team address list management")
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Reviewed-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-20 21:09:22 -07:00
Benjamin Poirier
ae108c48b5 selftests: net: Fix cross-tree inclusion of scripts
When exporting and running a subset of selftests via kselftest, files from
parts of the source tree which were not exported are not available. A few
tests are trying to source such files. Address the problem by using
symlinks.

The problem can be reproduced by running:
make -C tools/testing/selftests gen_tar TARGETS="drivers/net/bonding"
[... extract archive ...]
./run_kselftest.sh

or:
make kselftest KBUILD_OUTPUT=/tmp/kselftests TARGETS="drivers/net/bonding"

Fixes: bbb774d921 ("net: Add tests for bonding and team address list management")
Fixes: eccd0a80dc ("selftests: net: dsa: add a stress test for unlocked FDB operations")
Link: https://lore.kernel.org/netdev/40f04ded-0c86-8669-24b1-9a313ca21076@redhat.com/
Reported-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Reviewed-by: Jonathan Toppins <jtoppins@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-20 21:09:22 -07:00
Wang Yufen
98af374602 selftests/bpf: fix missing BPF object files
After commit afef88e655 ("selftests/bpf: Store BPF object files with
.bpf.o extension"), we should use *.bpf.o instead of *.o.

In addition, use the BPF_FILE variable to save the BPF object file name,
which can be better identified and modified.

Fixes: afef88e655 ("selftests/bpf: Store BPF object files with .bpf.o extension")
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/1666235134-562-1-git-send-email-wangyufen@huawei.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-20 19:05:50 -07:00
Jakub Kicinski
94adb5e29e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-20 17:49:10 -07:00
Linus Torvalds
6d36c728bc Networking fixes for 6.1-rc2, including fixes from netfilter
Current release - regressions:
   - revert "net: fix cpu_max_bits_warn() usage in netif_attrmask_next{,_and}"
 
   - revert "net: sched: fq_codel: remove redundant resource cleanup in fq_codel_init()"
 
   - dsa: uninitialized variable in dsa_slave_netdevice_event()
 
   - eth: sunhme: uninitialized variable in happy_meal_init()
 
 Current release - new code bugs:
   - eth: octeontx2: fix resource not freed after malloc
 
 Previous releases - regressions:
   - sched: fix return value of qdisc ingress handling on success
 
   - sched: fix race condition in qdisc_graft()
 
   - udp: update reuse->has_conns under reuseport_lock.
 
   - tls: strp: make sure the TCP skbs do not have overlapping data
 
   - hsr: avoid possible NULL deref in skb_clone()
 
   - tipc: fix an information leak in tipc_topsrv_kern_subscr
 
   - phylink: add mac_managed_pm in phylink_config structure
 
   - eth: i40e: fix DMA mappings leak
 
   - eth: hyperv: fix a RX-path warning
 
   - eth: mtk: fix memory leaks
 
 Previous releases - always broken:
   - sched: cake: fix null pointer access issue when cake_init() fails
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmNRKdcSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkYn8P/31xjE9/BRQKVGQOMxj78vhQvVHEZYXJ
 OJcaLcjxUCqj6hu3pEcuf88PeTicyfEqN32zzH1k+SS8jGQCmoVtXUfbZ7pDR6Tc
 rsAqVhLD6JnYkGEgtzm3i+8EfSeBoCy9kT4JZzRxQmOfZr1rBmtoMHOB4cGk9g1K
 lSF3KJcKT1GDacB/gVei+ms0Y+Q9WULOg3OFuyLSeltAkhZKaTfx/qqsLLEHFqZc
 u6eR31GwG28Y4GVurLQSOdaWrFOKqmPFOpzvjmeKC2RBqS6hVl4/YKZTmTV53Lee
 brm6kuVlU7CJVZEN2qF8G2+/SqLgcB0o26JVnml1kT8n0GlFAbyFf5akawjT8/Je
 G/zgz6k6wUAI2g3nSPNmgqVtobsypthzWL/bOpWfGfJFXxGOgLG3pbZYIl816Tha
 KnibZqQOBHxfPaUzh0xCLhidoi5G0T8ip9o9tyKlnmvbKY/EWk6HiIjCWlxnkPiO
 GRdHkyF7KMxqo/QuE9AK3LnD/AsLeWcuqzMveiaTbYLkMjGDW1yz3otL7KXW5l8U
 zYoUn1HQLkNqDE17+PjRo28awOMyN6ujXggKBK/hfXVnPpdW3yWPUoslqQdVn5KC
 3PLeSNM1v4UQSMWx1alRx1PvA+zYDX4GQpSSXbQgYGVim8LMwZQ1xui2qWF8xEau
 k9ZKfMSUNGEr
 =y2As
 -----END PGP SIGNATURE-----

Merge tag 'net-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from netfilter.

  Current release - regressions:

   - revert "net: fix cpu_max_bits_warn() usage in
     netif_attrmask_next{,_and}"

   - revert "net: sched: fq_codel: remove redundant resource cleanup in
     fq_codel_init()"

   - dsa: uninitialized variable in dsa_slave_netdevice_event()

   - eth: sunhme: uninitialized variable in happy_meal_init()

  Current release - new code bugs:

   - eth: octeontx2: fix resource not freed after malloc

  Previous releases - regressions:

   - sched: fix return value of qdisc ingress handling on success

   - sched: fix race condition in qdisc_graft()

   - udp: update reuse->has_conns under reuseport_lock.

   - tls: strp: make sure the TCP skbs do not have overlapping data

   - hsr: avoid possible NULL deref in skb_clone()

   - tipc: fix an information leak in tipc_topsrv_kern_subscr

   - phylink: add mac_managed_pm in phylink_config structure

   - eth: i40e: fix DMA mappings leak

   - eth: hyperv: fix a RX-path warning

   - eth: mtk: fix memory leaks

  Previous releases - always broken:

   - sched: cake: fix null pointer access issue when cake_init() fails"

* tag 'net-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (43 commits)
  net: phy: dp83822: disable MDI crossover status change interrupt
  net: sched: fix race condition in qdisc_graft()
  net: hns: fix possible memory leak in hnae_ae_register()
  wwan_hwsim: fix possible memory leak in wwan_hwsim_dev_new()
  sfc: include vport_id in filter spec hash and equal()
  genetlink: fix kdoc warnings
  selftests: add selftest for chaining of tc ingress handling to egress
  net: Fix return value of qdisc ingress handling on success
  net: sched: sfb: fix null pointer access issue when sfb_init() fails
  Revert "net: sched: fq_codel: remove redundant resource cleanup in fq_codel_init()"
  net: sched: cake: fix null pointer access issue when cake_init() fails
  ethernet: marvell: octeontx2 Fix resource not freed after malloc
  netfilter: nf_tables: relax NFTA_SET_ELEM_KEY_END set flags requirements
  netfilter: rpfilter/fib: Set ->flowic_uid correctly for user namespaces.
  ionic: catch NULL pointer issue on reconfig
  net: hsr: avoid possible NULL deref in skb_clone()
  bnxt_en: fix memory leak in bnxt_nvm_test()
  ip6mr: fix UAF issue in ip6mr_sk_done() when addrconf_init_net() failed
  udp: Update reuse->has_conns under reuseport_lock.
  net: ethernet: mediatek: ppe: Remove the unused function mtk_foe_entry_usable()
  ...
2022-10-20 17:24:59 -07:00
Daniel Bristot de Oliveira
21a1994b64 rv/dot2c: Make automaton definition static
Monitor's automata definition is only used locally, so make dot2c generate
a static definition.

Link: https://lore.kernel.org/all/202208210332.gtHXje45-lkp@intel.com
Link: https://lore.kernel.org/all/202208210358.6HH3OrVs-lkp@intel.com
Link: https://lkml.kernel.org/r/ffbb92010f643307766c9307fd42f416e5b85fa0.1661266564.git.bristot@kernel.org

Cc: Steven Rostedt <rostedt@goodmis.org>
Fixes: e3c9fc78f0 ("tools/rv: Add dot2c")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-10-20 16:02:45 -04:00
Jie Meng
8662de2321 bpf: add selftests for lsh, rsh, arsh with reg operand
Current tests cover only shifts with an immediate as the source
operand/shift counts; add a new test case to cover register operand.

Signed-off-by: Jie Meng <jmeng@fb.com>
Link: https://lore.kernel.org/r/20221007202348.1118830-4-jmeng@fb.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-19 16:53:51 -07:00
Andrii Nakryiko
2f968e9f4a libbpf: add non-mmapable data section selftest
Add non-mmapable data section to test_skeleton selftest and make sure it
really isn't mmapable by trying to mmap() it anyways.

Also make sure that libbpf doesn't report BPF_F_MMAPABLE flag to users.

Additional, some more manual testing was performed that this feature
works as intended.

Looking at created map through bpftool shows that flags passed to kernel are
indeed zero:

  $ bpftool map show
  ...
  1782: array  name .data.non_mmapa  flags 0x0
          key 4B  value 16B  max_entries 1  memlock 4096B
          btf_id 1169
          pids test_progs(8311)
  ...

Checking BTF uploaded to kernel for this map shows that zero_key and
zero_value are indeed marked as static, even though zero_key is actually
original global (but STV_HIDDEN) variable:

  $ bpftool btf dump id 1169
  ...
  [51] VAR 'zero_key' type_id=2, linkage=static
  [52] VAR 'zero_value' type_id=7, linkage=static
  ...
  [62] DATASEC '.data.non_mmapable' size=16 vlen=2
          type_id=51 offset=0 size=4 (VAR 'zero_key')
          type_id=52 offset=4 size=12 (VAR 'zero_value')
  ...

And original BTF does have zero_key marked as linkage=global:

  $ bpftool btf dump file test_skeleton.bpf.linked3.o
  ...
  [51] VAR 'zero_key' type_id=2, linkage=global
  [52] VAR 'zero_value' type_id=7, linkage=static
  ...
  [62] DATASEC '.data.non_mmapable' size=16 vlen=2
          type_id=51 offset=0 size=4 (VAR 'zero_key')
          type_id=52 offset=4 size=12 (VAR 'zero_value')

Bpftool didn't require any changes at all because it checks whether internal
map is mmapable already, but just to double-check generated skeleton, we
see that .data.non_mmapable neither sets mmaped pointer nor has
a corresponding field in the skeleton:

  $ grep non_mmapable test_skeleton.skel.h
                  struct bpf_map *data_non_mmapable;
          s->maps[7].name = ".data.non_mmapable";
          s->maps[7].map = &obj->maps.data_non_mmapable;

But .data.read_mostly has all of those things:

  $ grep read_mostly test_skeleton.skel.h
                  struct bpf_map *data_read_mostly;
          struct test_skeleton__data_read_mostly {
                  int read_mostly_var;
          } *data_read_mostly;
          s->maps[6].name = ".data.read_mostly";
          s->maps[6].map = &obj->maps.data_read_mostly;
          s->maps[6].mmaped = (void **)&obj->data_read_mostly;
          _Static_assert(sizeof(s->data_read_mostly->read_mostly_var) == 4, "unexpected size of 'read_mostly_var'");

Acked-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20221019002816.359650-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-19 16:40:45 -07:00
Andrii Nakryiko
4fcac46c7e libbpf: only add BPF_F_MMAPABLE flag for data maps with global vars
Teach libbpf to not add BPF_F_MMAPABLE flag unnecessarily for ARRAY maps
that are backing data sections, if such data sections don't expose any
variables to user-space. Exposed variables are those that have
STB_GLOBAL or STB_WEAK ELF binding and correspond to BTF VAR's
BTF_VAR_GLOBAL_ALLOCATED linkage.

The overall idea is that if some data section doesn't have any variable that
is exposed through BPF skeleton, then there is no reason to make such
BPF array mmapable. Making BPF array mmapable is not a free no-op
action, because BPF verifier doesn't allow users to put special objects
(such as BPF spin locks, RB tree nodes, linked list nodes, kptrs, etc;
anything that has a sensitive internal state that should not be modified
arbitrarily from user space) into mmapable arrays, as there is no way to
prevent user space from corrupting such sensitive state through direct
memory access through memory-mapped region.

By making sure that libbpf doesn't add BPF_F_MMAPABLE flag to BPF array
maps corresponding to data sections that only have static variables
(which are not supposed to be visible to user space according to libbpf
and BPF skeleton rules), users now can have spinlocks, kptrs, etc in
either default .bss/.data sections or custom .data.* sections (assuming
there are no global variables in such sections).

The only possible hiccup with this approach is the need to use global
variables during BPF static linking, even if it's not intended to be
shared with user space through BPF skeleton. To allow such scenarios,
extend libbpf's STV_HIDDEN ELF visibility attribute handling to
variables. Libbpf is already treating global hidden BPF subprograms as
static subprograms and adjusts BTF accordingly to make BPF verifier
verify such subprograms as static subprograms with preserving entire BPF
verifier state between subprog calls. This patch teaches libbpf to treat
global hidden variables as static ones and adjust BTF information
accordingly as well. This allows to share variables between multiple
object files during static linking, but still keep them internal to BPF
program and not get them exposed through BPF skeleton.

Note, that if the user has some advanced scenario where they absolutely
need BPF_F_MMAPABLE flag on .data/.bss/.rodata BPF array map despite
only having static variables, they still can achieve this by forcing it
through explicit bpf_map__set_map_flags() API.

Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/r/20221019002816.359650-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-19 16:40:45 -07:00
Andrii Nakryiko
f33f742d56 libbpf: clean up and refactor BTF fixup step
Refactor libbpf's BTF fixup step during BPF object open phase. The only
functional change is that we now ignore BTF_VAR_GLOBAL_EXTERN variables
during fix up, not just BTF_VAR_STATIC ones, which shouldn't cause any
change in behavior as there shouldn't be any extern variable in data
sections for valid BPF object anyways.

Otherwise it's just collapsing two functions that have no reason to be
separate, and switching find_elf_var_offset() helper to return entire
symbol pointer, not just its offset. This will be used by next patch to
get ELF symbol visibility.

While refactoring, also "normalize" debug messages inside
btf_fixup_datasec() to follow general libbpf style and print out data
section name consistently, where it's available.

Acked-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20221019002816.359650-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-10-19 16:40:45 -07:00
Mickaël Salaün
091873e47e
selftests/landlock: Build without static libraries
The only (forced) static test binary doesn't depend on libcap.  Because
using -lcap on systems that don't have such static library would fail
(e.g. on Arch Linux), let's be more specific and require only dynamic
libcap linking.

Fixes: a52540522c ("selftests/landlock: Fix out-of-tree builds")
Cc: Anders Roxell <anders.roxell@linaro.org>
Cc: Guillaume Tucker <guillaume.tucker@collabora.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: stable@vger.kernel.org
Signed-off-by: Mickaël Salaün <mic@digikod.net>
Link: https://lore.kernel.org/r/20221019200536.2771316-1-mic@digikod.net
2022-10-19 22:10:56 +02:00
Daniel Müller
81bfcc3fcd bpf/docs: Summarize CI system and deny lists
This change adds a brief summary of the BPF continuous integration (CI)
to the BPF selftest documentation. The summary focuses not so much on
actual workings of the CI, as it is maintained outside of the
repository, but aims to document the few bits of it that are sourced
from this repository and that developers may want to adjust as part of
patch submissions: the BPF kernel configuration and the deny list
file(s).

Changelog:
- v1->v2:
  - use s390x instead of s390 for consistency

Signed-off-by: Daniel Müller <deso@posteo.net>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20221018164015.1970862-1-deso@posteo.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-10-19 11:42:01 -07:00
Paul Blakey
fd602f5cb5 selftests: add selftest for chaining of tc ingress handling to egress
This test runs a simple ingress tc setup between two veth pairs,
then adds a egress->ingress rule to test the chaining of tc ingress
pipeline to tc egress piepline.

Signed-off-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-19 14:04:36 +01:00
Ido Schimmel
b526b2ea14 selftests: bridge_igmp: Remove unnecessary address deletion
The test group address is added and removed in v2reportleave_test().
There is no need to delete it again during cleanup as it results in the
following error message:

 # bash -x ./bridge_igmp.sh
 [...]
 + cleanup
 + pre_cleanup
 [...]
 + ip address del dev swp4 239.10.10.10/32
 RTNETLINK answers: Cannot assign requested address
 + h2_destroy

Solve by removing the unnecessary address deletion.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-19 14:01:08 +01:00
Ido Schimmel
6fb1faa1b9 selftests: bridge_vlan_mcast: Delete qdiscs during cleanup
The qdiscs are added during setup, but not deleted during cleanup,
resulting in the following error messages:

 # ./bridge_vlan_mcast.sh
 [...]
 # ./bridge_vlan_mcast.sh
 Error: Exclusivity flag on, cannot modify.
 Error: Exclusivity flag on, cannot modify.

Solve by deleting the qdiscs during cleanup.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-19 14:01:08 +01:00
Jakub Kicinski
3566a79c9e bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCY08JQQAKCRDbK58LschI
 g0M0AQCWGrJcnQFut1qwR9efZUadwxtKGAgpaA/8Smd8+v7c8AD/SeHQuGfkFiD6
 rx18hv1mTfG0HuPnFQy6YZQ98vmznwE=
 =DaeS
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2022-10-18

We've added 33 non-merge commits during the last 14 day(s) which contain
a total of 31 files changed, 874 insertions(+), 538 deletions(-).

The main changes are:

1) Add RCU grace period chaining to BPF to wait for the completion
   of access from both sleepable and non-sleepable BPF programs,
   from Hou Tao & Paul E. McKenney.

2) Improve helper UAPI by explicitly defining BPF_FUNC_xxx integer
   values. In the wild we have seen OS vendors doing buggy backports
   where helper call numbers mismatched. This is an attempt to make
   backports more foolproof, from Andrii Nakryiko.

3) Add libbpf *_opts API-variants for bpf_*_get_fd_by_id() functions,
   from Roberto Sassu.

4) Fix libbpf's BTF dumper for structs with padding-only fields,
   from Eduard Zingerman.

5) Fix various libbpf bugs which have been found from fuzzing with
   malformed BPF object files, from Shung-Hsi Yu.

6) Clean up an unneeded check on existence of SSE2 in BPF x86-64 JIT,
   from Jie Meng.

7) Fix various ASAN bugs in both libbpf and selftests when running
   the BPF selftest suite on arm64, from Xu Kuohai.

8) Fix missing bpf_iter_vma_offset__destroy() call in BPF iter selftest
   and use in-skeleton link pointer to remove an explicit bpf_link__destroy(),
   from Jiri Olsa.

9) Fix BPF CI breakage by pointing to iptables-legacy instead of relying
   on symlinked iptables which got upgraded to iptables-nft,
   from Martin KaFai Lau.

10) Minor BPF selftest improvements all over the place, from various others.

* tag 'for-netdev' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (33 commits)
  bpf/docs: Update README for most recent vmtest.sh
  bpf: Use rcu_trace_implies_rcu_gp() for program array freeing
  bpf: Use rcu_trace_implies_rcu_gp() in local storage map
  bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator
  rcu-tasks: Provide rcu_trace_implies_rcu_gp()
  selftests/bpf: Use sys_pidfd_open() helper when possible
  libbpf: Fix null-pointer dereference in find_prog_by_sec_insn()
  libbpf: Deal with section with no data gracefully
  libbpf: Use elf_getshdrnum() instead of e_shnum
  selftest/bpf: Fix error usage of ASSERT_OK in xdp_adjust_tail.c
  selftests/bpf: Fix error failure of case test_xdp_adjust_tail_grow
  selftest/bpf: Fix memory leak in kprobe_multi_test
  selftests/bpf: Fix memory leak caused by not destroying skeleton
  libbpf: Fix memory leak in parse_usdt_arg()
  libbpf: Fix use-after-free in btf_dump_name_dups
  selftests/bpf: S/iptables/iptables-legacy/ in the bpf_nf and xdp_synproxy test
  selftests/bpf: Alphabetize DENYLISTs
  selftests/bpf: Add tests for _opts variants of bpf_*_get_fd_by_id()
  libbpf: Introduce bpf_link_get_fd_by_id_opts()
  libbpf: Introduce bpf_btf_get_fd_by_id_opts()
  ...
====================

Link: https://lore.kernel.org/r/20221018210631.11211-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-18 18:56:43 -07:00
Sven Schnelle
cb05c81ada selftests/ftrace: fix dynamic_events dependency check
commit 95c104c378 ("tracing: Auto generate event name when creating a
group of events") changed the syntax in the ftrace README file which is
used by the selftests to check what features are support. Adjust the
string to make test_duplicates.tc and trigger-synthetic-eprobe.tc work
again.

Fixes: 95c104c378 ("tracing: Auto generate event name when creating a group of events")
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-10-18 14:27:23 -06:00
Zhao Gongyi
eb6789b0c3 selftests/memory-hotplug: Remove the redundant warning information
Remove the redundant warning information of online_all_offline_memory()
since there is a warning in online_memory_expect_success().

Signed-off-by: Zhao Gongyi <zhaogongyi@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-10-18 14:21:18 -06:00
Ricardo Cañuelo
2a8e366b23 selftests/kexec: fix build for ARCH=x86_64
Handle the scenario where the build is launched with the ARCH envvar
defined as x86_64.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-10-18 14:13:25 -06:00
Ricardo Cañuelo
beb7d862ed selftests/intel_pstate: fix build for ARCH=x86_64
Handle the scenario where the build is launched with the ARCH envvar
defined as x86_64.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-10-18 14:13:19 -06:00
Ricardo Cañuelo
03cab65a07 selftests/futex: fix build for clang
Don't use the test-specific header files as source files to force a
target dependency, as clang will complain if more than one source file
is used for a compile command with a single '-o' flag.

Use the proper Makefile variables instead as defined in
tools/testing/selftests/lib.mk.

Signed-off-by: Ricardo Cañuelo <ricardo.canuelo@collabora.com>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-10-18 14:13:11 -06:00
Daniel Müller
6c4e777fbb bpf/docs: Update README for most recent vmtest.sh
Since commit 40b09653b1 ("selftests/bpf: Adjust vmtest.sh to use local
kernel configuration") the vmtest.sh script no longer downloads a kernel
configuration but uses the local, in-repository one.
This change updates the README, which still mentions the old behavior.

Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20221017232458.1272762-1-deso@posteo.net
2022-10-18 21:54:05 +02:00
Stanislav Fomichev
35cc9d622e selftests/bpf: Add reproducer for decl_tag in func_proto return type
It should trigger a WARN_ON_ONCE in btf_type_id_size.

     btf_func_proto_check kernel/bpf/btf.c:4447 [inline]
     btf_check_all_types kernel/bpf/btf.c:4723 [inline]
     btf_parse_type_sec kernel/bpf/btf.c:4752 [inline]
     btf_parse kernel/bpf/btf.c:5026 [inline]
     btf_new_fd+0x1926/0x1e70 kernel/bpf/btf.c:6892
     bpf_btf_load kernel/bpf/syscall.c:4324 [inline]
     __sys_bpf+0xb7d/0x4cf0 kernel/bpf/syscall.c:5010
     __do_sys_bpf kernel/bpf/syscall.c:5069 [inline]
     __se_sys_bpf kernel/bpf/syscall.c:5067 [inline]
     __x64_sys_bpf+0x75/0xb0 kernel/bpf/syscall.c:5067
     do_syscall_x64 arch/x86/entry/common.c:50 [inline]
     do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
     entry_SYSCALL_64_after_hwframe+0x63/0xcd

Cc: Yonghong Song <yhs@fb.com>
Cc: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20221015002444.2680969-1-sdf@google.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-10-17 10:56:12 -07:00
Marco Elver
23488ec668 selftests/perf_events: Add a SIGTRAP stress test with disables
Add a SIGTRAP stress test that exercises repeatedly enabling/disabling
an event while it concurrently keeps firing.

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/Y0E3uG7jOywn7vy3@elver.google.com/
2022-10-17 16:32:06 +02:00
Matti Vaittinen
72b2aa3819 tools: iio: iio_utils: fix digit calculation
The iio_utils uses a digit calculation in order to know length of the
file name containing a buffer number. The digit calculation does not
work for number 0.

This leads to allocation of one character too small buffer for the
file-name when file name contains value '0'. (Eg. buffer0).

Fix digit calculation by returning one digit to be present for number
'0'.

Fixes: 096f9b862e ("tools:iio:iio_utils: implement digit calculation")
Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com>
Link: https://lore.kernel.org/r/Y0f+tKCz+ZAIoroQ@dc75zzyyyyyyyyyyyyycy-3.rev.dnainternet.fi
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2022-10-17 08:51:26 +01:00
Linus Torvalds
8636df94ec perf tools changes for v6.1: 2nd batch
- Use BPF CO-RE (Compile Once, Run Everywhere) to support old kernels
   when using bperf (perf BPF based counters) with cgroups.
 
 - Support HiSilicon PCIe Performance Monitoring Unit (PMU), that
   monitors bandwidth, latency, bus utilization and buffer occupancy.
 
   Documented in Documentation/admin-guide/perf/hisi-pcie-pmu.rst.
 
 - User space tasks can migrate between CPUs, so when tracing selected
   CPUs, system-wide sideband is still needed, fix it in the setup of
   Intel PT on hybrid systems.
 
 - Fix metricgroups title message in 'perf list', it should state that
   the metrics groups are to be used with the '-M' option, not '-e'.
 
 - Sync the msr-index.h copy with the kernel sources, adding support
   for using "AMD64_TSC_RATIO" in filter expressions in 'perf trace' as
   well as decoding it when printing the MSR tracepoint arguments.
 
 - Fix program header size and alignment when generating a JIT ELF
   in 'perf inject'.
 
 - Add multiple new Intel PT 'perf test' entries, including a jitdump one.
 
 - Fix the 'perf test' entries for 'perf stat' CSV and JSON output when
   running on PowerPC due to an invalid topology number in that arch.
 
 - Fix the 'perf test' for arm_coresight failures on the ARM Juno system.
 
 - Fix the 'perf test' attr entry for PERF_FORMAT_LOST, adding this option
   to the or expression expected in the intercepted perf_event_open() syscall.
 
 - Add missing condition flags ('hs', 'lo', 'vc', 'vs') for arm64 in the 'perf
   annotate' asm parser.
 
 - Fix 'perf mem record -C' option processing, it was being chopped up
   when preparing the underlying 'perf record -e mem-events' and thus being
   ignored, requiring using '-- -C CPUs' as a workaround.
 
 - Improvements and tidy ups for 'perf test' shell infra.
 
 - Fix Intel PT information printing segfault in uClibc, where a NULL
   format was being passed to fprintf.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCY0vyXAAKCRCyPKLppCJ+
 J3EgAQDgr9FhTCTG+u46iGqPG4lxc46ZWKB3MgZwPuX6P2jwLwD9GCwGow4qHQVP
 F/m7S/3tK/ShPfPWB2m4nVHd9xp7uwM=
 =F1IB
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v6.1-2-2022-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull more perf tools updates from Arnaldo Carvalho de Melo:

 - Use BPF CO-RE (Compile Once, Run Everywhere) to support old kernels
   when using bperf (perf BPF based counters) with cgroups.

 - Support HiSilicon PCIe Performance Monitoring Unit (PMU), that
   monitors bandwidth, latency, bus utilization and buffer occupancy.

   Documented in Documentation/admin-guide/perf/hisi-pcie-pmu.rst.

 - User space tasks can migrate between CPUs, so when tracing selected
   CPUs, system-wide sideband is still needed, fix it in the setup of
   Intel PT on hybrid systems.

 - Fix metricgroups title message in 'perf list', it should state that
   the metrics groups are to be used with the '-M' option, not '-e'.

 - Sync the msr-index.h copy with the kernel sources, adding support for
   using "AMD64_TSC_RATIO" in filter expressions in 'perf trace' as well
   as decoding it when printing the MSR tracepoint arguments.

 - Fix program header size and alignment when generating a JIT ELF in
   'perf inject'.

 - Add multiple new Intel PT 'perf test' entries, including a jitdump
   one.

 - Fix the 'perf test' entries for 'perf stat' CSV and JSON output when
   running on PowerPC due to an invalid topology number in that arch.

 - Fix the 'perf test' for arm_coresight failures on the ARM Juno
   system.

 - Fix the 'perf test' attr entry for PERF_FORMAT_LOST, adding this
   option to the or expression expected in the intercepted
   perf_event_open() syscall.

 - Add missing condition flags ('hs', 'lo', 'vc', 'vs') for arm64 in the
   'perf annotate' asm parser.

 - Fix 'perf mem record -C' option processing, it was being chopped up
   when preparing the underlying 'perf record -e mem-events' and thus
   being ignored, requiring using '-- -C CPUs' as a workaround.

 - Improvements and tidy ups for 'perf test' shell infra.

 - Fix Intel PT information printing segfault in uClibc, where a NULL
   format was being passed to fprintf.

* tag 'perf-tools-for-v6.1-2-2022-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (23 commits)
  tools arch x86: Sync the msr-index.h copy with the kernel sources
  perf auxtrace arm64: Add support for parsing HiSilicon PCIe Trace packet
  perf auxtrace arm64: Add support for HiSilicon PCIe Tune and Trace device driver
  perf auxtrace arm: Refactor event list iteration in auxtrace_record__init()
  perf tests stat+json_output: Include sanity check for topology
  perf tests stat+csv_output: Include sanity check for topology
  perf intel-pt: Fix system_wide dummy event for hybrid
  perf intel-pt: Fix segfault in intel_pt_print_info() with uClibc
  perf test: Fix attr tests for PERF_FORMAT_LOST
  perf test: test_intel_pt.sh: Add 9 tests
  perf inject: Fix GEN_ELF_TEXT_OFFSET for jit
  perf test: test_intel_pt.sh: Add jitdump test
  perf test: test_intel_pt.sh: Tidy some alignment
  perf test: test_intel_pt.sh: Print a message when skipping kernel tracing
  perf test: test_intel_pt.sh: Tidy some perf record options
  perf test: test_intel_pt.sh: Fix return checking again
  perf: Skip and warn on unknown format 'configN' attrs
  perf list: Fix metricgroups title message
  perf mem: Fix -C option behavior for perf mem record
  perf annotate: Add missing condition flags for arm64
  ...
2022-10-16 15:14:29 -07:00
Arnaldo Carvalho de Melo
a3a365655a tools arch x86: Sync the msr-index.h copy with the kernel sources
To pick up the changes in:

  b8d1d16360 ("x86/apic: Don't disable x2APIC if locked")
  ca5b7c0d96 ("perf/x86/amd/lbr: Add LbrExtV2 branch record support")

Addressing these tools/perf build warnings:

    diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
    Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'

That makes the beautification scripts to pick some new entries:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  --- before	2022-10-14 18:06:34.294561729 -0300
  +++ after	2022-10-14 18:06:41.285744044 -0300
  @@ -264,6 +264,7 @@
   	[0xc0000102 - x86_64_specific_MSRs_offset] = "KERNEL_GS_BASE",
   	[0xc0000103 - x86_64_specific_MSRs_offset] = "TSC_AUX",
   	[0xc0000104 - x86_64_specific_MSRs_offset] = "AMD64_TSC_RATIO",
  +	[0xc000010e - x86_64_specific_MSRs_offset] = "AMD64_LBR_SELECT",
   	[0xc000010f - x86_64_specific_MSRs_offset] = "AMD_DBG_EXTN_CFG",
   	[0xc0000300 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_STATUS",
   	[0xc0000301 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_CTL",
  $

Now one can trace systemwide asking to see backtraces to where that MSR
is being read/written, see this example with a previous update:

  # perf trace -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB"
  ^C#

If we use -v (verbose mode) we can see what it does behind the scenes:

  # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB"
  Using CPUID AuthenticAMD-25-21-0
  0x6a0
  0x6a8
  New filter for msr:read_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313)
  0x6a0
  0x6a8
  New filter for msr:write_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313)
  mmap size 528384B
  ^C#

Example with a frequent msr:

  # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2
  Using CPUID AuthenticAMD-25-21-0
  0x48
  New filter for msr:read_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
  0x48
  New filter for msr:write_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
  mmap size 528384B
  Looking at the vmlinux_path (8 entries long)
  symsrc__init: build id mismatch for vmlinux.
  Using /proc/kcore for kernel data
  Using /proc/kallsyms for symbols
     0.000 Timer/2525383 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
                                       do_trace_write_msr ([kernel.kallsyms])
                                       do_trace_write_msr ([kernel.kallsyms])
                                       __switch_to_xtra ([kernel.kallsyms])
                                       __switch_to ([kernel.kallsyms])
                                       __schedule ([kernel.kallsyms])
                                       schedule ([kernel.kallsyms])
                                       futex_wait_queue_me ([kernel.kallsyms])
                                       futex_wait ([kernel.kallsyms])
                                       do_futex ([kernel.kallsyms])
                                       __x64_sys_futex ([kernel.kallsyms])
                                       do_syscall_64 ([kernel.kallsyms])
                                       entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
                                       __futex_abstimed_wait_common64 (/usr/lib64/libpthread-2.33.so)
     0.030 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2)
                                       do_trace_write_msr ([kernel.kallsyms])
                                       do_trace_write_msr ([kernel.kallsyms])
                                       __switch_to_xtra ([kernel.kallsyms])
                                       __switch_to ([kernel.kallsyms])
                                       __schedule ([kernel.kallsyms])
                                       schedule_idle ([kernel.kallsyms])
                                       do_idle ([kernel.kallsyms])
                                       cpu_startup_entry ([kernel.kallsyms])
                                       secondary_startup_64_no_verify ([kernel.kallsyms])
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/lkml/Y0nQkz2TUJxwfXJd@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
Qi Liu
5e91e57e68 perf auxtrace arm64: Add support for parsing HiSilicon PCIe Trace packet
Add support for using 'perf report --dump-raw-trace' to parse PTT packet.

Example usage:

Output will contain raw PTT data and its textual representation, such
as (8DW format):

0 0 0x5810 [0x30]: PERF_RECORD_AUXTRACE size: 0x400000  offset: 0
ref: 0xa5d50c725  idx: 0  tid: -1  cpu: 0
.
. ... HISI PTT data: size 4194304 bytes
.  00000000: 00 00 00 00                                 Prefix
.  00000004: 08 20 00 60                                 Header DW0
.  00000008: ff 02 00 01                                 Header DW1
.  0000000c: 20 08 00 00                                 Header DW2
.  00000010: 10 e7 44 ab                                 Header DW3
.  00000014: 2a a8 1e 01                                 Time
.  00000020: 00 00 00 00                                 Prefix
.  00000024: 01 00 00 60                                 Header DW0
.  00000028: 0f 1e 00 01                                 Header DW1
.  0000002c: 04 00 00 00                                 Header DW2
.  00000030: 40 00 81 02                                 Header DW3
.  00000034: ee 02 00 00                                 Time
....

This patch only add basic parsing support according to the definition of
the PTT packet described in Documentation/trace/hisi-ptt.rst. And the
fields of each packet can be further decoded following the PCIe Spec's
definition of TLP packet.

Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi6124@gmail.com>
Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zeng Prime <prime.zeng@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: linuxarm@huawei.com
Link: https://lore.kernel.org/r/20220927081400.14364-4-yangyicong@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
Qi Liu
057381a7ec perf auxtrace arm64: Add support for HiSilicon PCIe Tune and Trace device driver
HiSilicon PCIe tune and trace device (PTT) could dynamically tune the
PCIe link's events, and trace the TLP headers).

This patch add support for PTT device in perf tool, so users could use
'perf record' to get TLP headers trace data.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Acked-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi6124@gmail.com>
Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zeng Prime <prime.zeng@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: linuxarm@huawei.com
Link: https://lore.kernel.org/r/20220927081400.14364-3-yangyicong@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
Qi Liu
45a3975f8e perf auxtrace arm: Refactor event list iteration in auxtrace_record__init()
Add find_pmu_for_event() and use to simplify logic in
auxtrace_record_init(). find_pmu_for_event() will be reused in
subsequent patches.

Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi6124@gmail.com>
Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zeng Prime <prime.zeng@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: linuxarm@huawei.com
Link: https://lore.kernel.org/r/20220927081400.14364-2-yangyicong@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
Athira Rajeev
58d4802a5e perf tests stat+json_output: Include sanity check for topology
Testcase stat+json_output.sh fails in powerpc:

	86: perf stat JSON output linter : FAILED!

The testcase "stat+json_output.sh" verifies perf stat JSON output. The
test covers aggregation modes like per-socket, per-core, per-die, -A
(no_aggr mode) along with few other tests. It counts expected fields for
various commands. For example say -A (i.e, AGGR_NONE mode), expects 7
fields in the output having "CPU" as first field. Same way, for
per-socket, it expects the first field in result to point to socket id.
The testcases compares the result with expected count.

The values for socket, die, core and cpu are fetched from topology
directory:

  /sys/devices/system/cpu/cpu*/topology.

For example, socket value is fetched from "physical_package_id" file of
topology directory.  (cpu__get_topology_int() in util/cpumap.c)

If a platform fails to fetch the topology information, values will be
set to -1. For example, incase of pSeries platform of powerpc, value for
"physical_package_id" is restricted and not exposed. So, -1 will be
assigned.

Perf code has a checks for valid cpu id in "aggr_printout"
(stat-display.c), which displays the fields. So, in cases where topology
values not exposed, first field of the output displaying will be empty.
This cause the testcase to fail, as it counts  number of fields in the
output.

Incase of -A (AGGR_NONE mode,), testcase expects 7 fields in the output,
becos of -1 value obtained from topology files for some, only 6 fields
are printed. Hence a testcase failure reported due to mismatch in number
of fields in the output.

Patch here adds a sanity check in the testcase for topology.  Check will
help to skip the test if -1 value found.

Fixes: 0c343af2a2 ("perf test: JSON format checking")
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Suggested-by: Ian Rogers <irogers@google.com>
Suggested-by: James Clark <james.clark@arm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Claire Jensen <cjense@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Link: https://lore.kernel.org/r/20221006155149.67205-2-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
Athira Rajeev
cd400f6f18 perf tests stat+csv_output: Include sanity check for topology
Testcase stat+csv_output.sh fails in powerpc:

	84: perf stat CSV output linter: FAILED!

The testcase "stat+csv_output.sh" verifies perf stat CSV output. The
test covers aggregation modes like per-socket, per-core, per-die, -A
(no_aggr mode) along with few other tests. It counts expected fields for
various commands. For example say -A (i.e, AGGR_NONE mode), expects 7
fields in the output having "CPU" as first field. Same way, for
per-socket, it expects the first field in result to point to socket id.
The testcases compares the result with expected count.

The values for socket, die, core and cpu are fetched from topology
directory:

  /sys/devices/system/cpu/cpu*/topology.

For example, socket value is fetched from "physical_package_id" file of
topology directory.  (cpu__get_topology_int() in util/cpumap.c)

If a platform fails to fetch the topology information, values will be
set to -1. For example, incase of pSeries platform of powerpc, value for
"physical_package_id" is restricted and not exposed. So, -1 will be
assigned.

Perf code has a checks for valid cpu id in "aggr_printout"
(stat-display.c), which displays the fields. So, in cases where topology
values not exposed, first field of the output displaying will be empty.
This cause the testcase to fail, as it counts  number of fields in the
output.

Incase of -A (AGGR_NONE mode,), testcase expects 7 fields in the output,
becos of -1 value obtained from topology files for some, only 6 fields
are printed. Hence a testcase failure reported due to mismatch in number
of fields in the output.

Patch here adds a sanity check in the testcase for topology.  Check will
help to skip the test if -1 value found.

Fixes: 7473ee56db ("perf test: Add checking for perf stat CSV output.")
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Suggested-by: Ian Rogers <irogers@google.com>
Suggested-by: James Clark <james.clark@arm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Claire Jensen <cjense@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Link: https://lore.kernel.org/r/20221006155149.67205-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
Adrian Hunter
6cef7dab3e perf intel-pt: Fix system_wide dummy event for hybrid
User space tasks can migrate between CPUs, so when tracing selected CPUs,
system-wide sideband is still needed, however evlist->core.has_user_cpus
is not set in the hybrid case, so check the target cpu_list instead.

Fixes: 7d189cadbe ("perf intel-pt: Track sideband system-wide when needed")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20221012082259.22394-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
Adrian Hunter
5a3d47071f perf intel-pt: Fix segfault in intel_pt_print_info() with uClibc
uClibc segfaulted because NULL was passed as the format to fprintf().

That happened because one of the format strings was missing and
intel_pt_print_info() didn't check that before calling fprintf().

Add the missing format string, and check format is not NULL before calling
fprintf().

Fixes: 11fa7cb86b ("perf tools: Pass Intel PT information for decoding MTC and CYC")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20221012082259.22394-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
James Clark
e28039667c perf test: Fix attr tests for PERF_FORMAT_LOST
Since PERF_FORMAT_LOST was added, the default read format has that bit
set, so add it to the tests. Keep the old value as well so that the test
still passes on older kernels.

This fixes the following failure:

  expected read_format=0|4, got 20
  FAILED './tests/attr/test-record-C0' - match failure

Fixes: 85b425f31c8866e0 ("perf record: Set PERF_FORMAT_LOST by default")
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221012094633.21669-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
Ammy Yi
f77811a0f6 perf test: test_intel_pt.sh: Add 9 tests
Add tests:
	Test with MTC and TSC disabled
	Test with branches disabled
	Test with/without CYC
	Test recording with sample mode
	Test with kernel trace
	Test virtual LBR
	Test power events
	Test with TNT packets disabled
	Test with event_trace

These tests mostly check that perf record works with the corresponding
Intel PT config terms, sometimes also checking that certain packets do or
do not appear in the resulting trace as appropriate.

The "Test virtual LBR" is slightly trickier, using a Python script to
check that branch stacks are actually synthesized.

Signed-off-by: Ammy Yi <ammy.yi@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221014170905.64069-8-adrian.hunter@intel.com
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
Adrian Hunter
89b15d0052 perf inject: Fix GEN_ELF_TEXT_OFFSET for jit
When a program header was added, it moved the text section but
GEN_ELF_TEXT_OFFSET was not updated.

Fix by adding the program header size and aligning.

Fixes: babd04386b ("perf jit: Include program header in ELF files")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Lieven Hey <lieven.hey@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221014170905.64069-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
Adrian Hunter
973db24079 perf test: test_intel_pt.sh: Add jitdump test
Add a test for decoding self-modifying code using a jitdump file.

The test creates a workload that uses self-modifying code and generates its
own jitdump file.  The result is processed with perf inject --jit and
checked for decoding errors.

Note the test will fail without patch "perf inject: Fix GEN_ELF_TEXT_OFFSET
for jit" applied.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221014170905.64069-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
Adrian Hunter
40053a4b7e perf test: test_intel_pt.sh: Tidy some alignment
Tidy alignment of test function lines to make them more readable.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221014170905.64069-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
Adrian Hunter
9637bf8ff0 perf test: test_intel_pt.sh: Print a message when skipping kernel tracing
Messages display with the perf test -v option. Add a message to show when
skipping a test because the user cannot do kernel tracing.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221014170905.64069-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
Adrian Hunter
5021d82bca perf test: test_intel_pt.sh: Tidy some perf record options
When not decoding, the options "-B -N --no-bpf-event" speed up perf record.
Make a common function for them.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221014170905.64069-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
Adrian Hunter
e408049287 perf test: test_intel_pt.sh: Fix return checking again
count_result() does not always reset ret=0 which means the value can spill
into the next test result.

Fix by explicitly setting it to zero between tests.

Committer testing:

  # perf test "Miscellaneous Intel PT testing"
  110: Miscellaneous Intel PT testing               : Ok
  #

Tested as well with:

  # perf test -v "Miscellaneous Intel PT testing"

Fixes: fd9b45e39c ("perf test: test_intel_pt.sh: Fix return checking")
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221014170905.64069-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
Linus Torvalds
5e714bf171 - Alistair Popple has a series which addresses a race which causes page
refcounting errors in ZONE_DEVICE pages.
 
 - Peter Xu fixes some userfaultfd test harness instability.
 
 - Various other patches in MM, mainly fixes.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0j6igAKCRDdBJ7gKXxA
 jnGxAP99bV39ZtOsoY4OHdZlWU16BUjKuf/cb3bZlC2G849vEwD+OKlij86SG20j
 MGJQ6TfULJ8f1dnQDd6wvDfl3FMl7Qc=
 =tbdp
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more MM updates from Andrew Morton:

 - fix a race which causes page refcounting errors in ZONE_DEVICE pages
   (Alistair Popple)

 - fix userfaultfd test harness instability (Peter Xu)

 - various other patches in MM, mainly fixes

* tag 'mm-stable-2022-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (29 commits)
  highmem: fix kmap_to_page() for kmap_local_page() addresses
  mm/page_alloc: fix incorrect PGFREE and PGALLOC for high-order page
  mm/selftest: uffd: explain the write missing fault check
  mm/hugetlb: use hugetlb_pte_stable in migration race check
  mm/hugetlb: fix race condition of uffd missing/minor handling
  zram: always expose rw_page
  LoongArch: update local TLB if PTE entry exists
  mm: use update_mmu_tlb() on the second thread
  kasan: fix array-bounds warnings in tests
  hmm-tests: add test for migrate_device_range()
  nouveau/dmem: evict device private memory during release
  nouveau/dmem: refactor nouveau_dmem_fault_copy_one()
  mm/migrate_device.c: add migrate_device_range()
  mm/migrate_device.c: refactor migrate_vma and migrate_deivce_coherent_page()
  mm/memremap.c: take a pgmap reference on page allocation
  mm: free device private pages have zero refcount
  mm/memory.c: fix race when faulting a device private page
  mm/damon: use damon_sz_region() in appropriate place
  mm/damon: move sz_damon_region to damon_sz_region
  lib/test_meminit: add checks for the allocation functions
  ...
2022-10-14 12:28:43 -07:00
Rob Herring
e552b7be12 perf: Skip and warn on unknown format 'configN' attrs
If the kernel exposes a new perf_event_attr field in a format attr, perf
will return an error stating the specified PMU can't be found. For
example, a format attr with 'config3:0-63' causes an error as config3 is
unknown to perf. This causes a compatibility issue between a newer
kernel with older perf tool.

Before this change with a kernel adding 'config3' I get:

  $ perf record -e arm_spe// -- true
  event syntax error: 'arm_spe//'
                       \___ Cannot find PMU `arm_spe'. Missing kernel support?
  Run 'perf list' for a list of valid events

   Usage: perf record [<options>] [<command>]
      or: perf record [<options>] -- <command> [<options>]

      -e, --event <event>   event selector. use 'perf list' to list
  available events

After this change, I get:

  $ perf record -e arm_spe// -- true
  WARNING: 'arm_spe_0' format 'inv_event_filter' requires 'perf_event_attr::config3' which is not supported by this version of perf!
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.091 MB perf.data ]

To support unknown configN formats, rework the YACC implementation to
pass any config[0-9]+ format to perf_pmu__new_format() to handle with a
warning.

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220914-arm-perf-tool-spe1-2-v2-v4-1-83c098e6212e@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-14 12:23:09 -03:00
Andi Kleen
0cef141e86 perf list: Fix metricgroups title message
$ perf list metricgroups

gives

  List of pre-defined events (to be used in -e):

  Metric Groups:

  Backend
  Bad
  BadSpec

But that's incorrect of course because metric groups or metrics can only
be specified with -M. So fix the message to say -e or -M

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20221004192634.998984-1-ak@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-14 12:21:42 -03:00
Namhyung Kim
7d60fa2cde perf mem: Fix -C option behavior for perf mem record
The -C/--cpu option was maily for report but it also affected record as
it ate the option.  So users needed to use "--" after perf mem record to
pass the info to the perf record properly.

Check if this option is set for record, and pass it to the actual perf
record.

Before)
  $ sudo perf --debug perf-event-open mem record -C 0 2>&1 | grep -a sys_perf_event_open
  ...
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 4
  sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8 = 5
  sys_perf_event_open: pid -1  cpu 2  group_fd -1  flags 0x8 = 6
  sys_perf_event_open: pid -1  cpu 3  group_fd -1  flags 0x8 = 7
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 8
  sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8 = 9
  sys_perf_event_open: pid -1  cpu 2  group_fd -1  flags 0x8 = 10
  sys_perf_event_open: pid -1  cpu 3  group_fd -1  flags 0x8 = 11
  ...

After)
  $ sudo perf --debug perf-event-open mem record -C 0 2>&1 | grep -a sys_perf_event_open
  ...
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 4
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 5
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 6
  sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 7

Reported-by: Ravi Bangoria <ravi.bangoria@amd.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221004200211.1444521-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-14 12:21:34 -03:00
Namhyung Kim
531778b129 perf annotate: Add missing condition flags for arm64
According to the document [1], it can also have 'hs', 'lo', 'vc', 'vs' as a
condition code.  Let's add them too.

[1] https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/condition-codes-1-condition-flags-and-codes

Reported-by: Kevin Nomura <nomurak@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20221006222232.266416-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-14 10:47:35 -03:00
Arnaldo Carvalho de Melo
d5e57375a5 libperf: Do not include non-UAPI linux/compiler.h header
Its just for that __packed define, so use it expanded as __attribute__((packed)),
like the other files in /usr/include do.

This was problem was preventing building the libperf examples on ALT
Linux and Fedora 35, fix it.

Reported-by: Vitaly Chikunov <vt@altlinux.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Dmitry Levin <ldv@altlinux.org
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/Y0lnpl2Ix7VljVDc@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-14 10:44:20 -03:00
James Clark
fe180a5201 perf test: Fix test_arm_coresight.sh failures on Juno
This test commonly fails on Arm Juno because the instruction interval
is large enough to miss generating any samples for Perf in system-wide
mode.

Fix this by lowering the interval until a comfortable number of Perf
instructions are generated. The test is still quick to run because only
a small amount of trace is gathered.

Before:

  sudo ./perf test coresight -vvv
  ...
  Recording trace with system wide mode
  Looking at perf.data file for dumping branch samples:
  Looking at perf.data file for reporting branch samples:
  Looking at perf.data file for instruction samples:
  CoreSight system wide testing: FAIL
  ...

After:

  sudo ./perf test coresight -vvv
  ...
  Recording trace with system wide mode
  Looking at perf.data file for dumping branch samples:
  Looking at perf.data file for reporting branch samples:
  Looking at perf.data file for instruction samples:
  CoreSight system wide testing: PASS
  ...

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Link: https://lore.kernel.org/r/20221005140508.1537277-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-14 10:32:39 -03:00
Namhyung Kim
f21cb52036 perf stat: Support old kernels for bperf cgroup counting
The recent change in the cgroup will break the backward compatiblity in
the BPF program.  It should support both old and new kernels using BPF
CO-RE technique.

Like the task_struct->__state handling in the offcpu analysis, we can
check the field name in the cgroup struct.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: bpf@vger.kernel.org
Cc: cgroups@vger.kernel.org
Cc: zefan li <lizefan.x@bytedance.com>
Link: http://lore.kernel.org/lkml/20221011052808.282394-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-14 10:29:05 -03:00
Hou Tao
62c69e89e8 selftests/bpf: Use sys_pidfd_open() helper when possible
SYS_pidfd_open may be undefined for old glibc, so using sys_pidfd_open()
helper defined in task_local_storage_helpers.h instead to fix potential
build failure.

And according to commit 7615d9e178 ("arch: wire-up pidfd_open()"), the
syscall number of pidfd_open is always 434 except for alpha architure,
so update the definition of __NR_pidfd_open accordingly.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221011071249.3471760-1-houtao@huaweicloud.com
2022-10-13 12:09:19 -07:00
Shung-Hsi Yu
d0d382f95a libbpf: Fix null-pointer dereference in find_prog_by_sec_insn()
When there are no program sections, obj->programs is left unallocated,
and find_prog_by_sec_insn()'s search lands on &obj->programs[0] == NULL,
and will cause null-pointer dereference in the following access to
prog->sec_idx.

Guard the search with obj->nr_programs similar to what's being done in
__bpf_program__iter() to prevent null-pointer access from happening.

Fixes: db2b8b0642 ("libbpf: Support CO-RE relocations for multi-prog sections")
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221012022353.7350-4-shung-hsi.yu@suse.com
2022-10-13 10:53:34 -07:00
Shung-Hsi Yu
35a855509e libbpf: Deal with section with no data gracefully
ELF section data pointer returned by libelf may be NULL (if section has
SHT_NOBITS), so null check section data pointer before attempting to
copy license and kversion section.

Fixes: cb1e5e9619 ("bpf tools: Collect version and license from ELF sections")
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221012022353.7350-3-shung-hsi.yu@suse.com
2022-10-13 10:53:34 -07:00
Shung-Hsi Yu
51deedc9b8 libbpf: Use elf_getshdrnum() instead of e_shnum
This commit replace e_shnum with the elf_getshdrnum() helper to fix two
oss-fuzz-reported heap-buffer overflow in __bpf_object__open. Both
reports are incorrectly marked as fixed and while still being
reproducible in the latest libbpf.

  # clusterfuzz-testcase-minimized-bpf-object-fuzzer-5747922482888704
  libbpf: loading object 'fuzz-object' from buffer
  libbpf: sec_cnt is 0
  libbpf: elf: section(1) .data, size 0, link 538976288, flags 2020202020202020, type=2
  libbpf: elf: section(2) .data, size 32, link 538976288, flags 202020202020ff20, type=1
  =================================================================
  ==13==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020000000c0 at pc 0x0000005a7b46 bp 0x7ffd12214af0 sp 0x7ffd12214ae8
  WRITE of size 4 at 0x6020000000c0 thread T0
  SCARINESS: 46 (4-byte-write-heap-buffer-overflow-far-from-bounds)
      #0 0x5a7b45 in bpf_object__elf_collect /src/libbpf/src/libbpf.c:3414:24
      #1 0x5733c0 in bpf_object_open /src/libbpf/src/libbpf.c:7223:16
      #2 0x5739fd in bpf_object__open_mem /src/libbpf/src/libbpf.c:7263:20
      ...

The issue lie in libbpf's direct use of e_shnum field in ELF header as
the section header count. Where as libelf implemented an extra logic
that, when e_shnum == 0 && e_shoff != 0, will use sh_size member of the
initial section header as the real section header count (part of ELF
spec to accommodate situation where section header counter is larger
than SHN_LORESERVE).

The above inconsistency lead to libbpf writing into a zero-entry calloc
area. So intead of using e_shnum directly, use the elf_getshdrnum()
helper provided by libelf to retrieve the section header counter into
sec_cnt.

Fixes: 0d6988e16a ("libbpf: Fix section counting logic")
Fixes: 25bbbd7a44 ("libbpf: Remove assumptions about uniqueness of .rodata/.data/.bss maps")
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=40868
Link: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=40957
Link: https://lore.kernel.org/bpf/20221012022353.7350-2-shung-hsi.yu@suse.com
2022-10-13 10:53:34 -07:00
Xu Kuohai
cbc1c998da selftest/bpf: Fix error usage of ASSERT_OK in xdp_adjust_tail.c
xdp_adjust_tail.c calls ASSERT_OK() to check the return value of
bpf_prog_test_load(), but the condition is not correct. Fix it.

Fixes: 791cad0250 ("bpf: selftests: Get rid of CHECK macro in xdp_adjust_tail.c")
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/bpf/20221011120108.782373-7-xukuohai@huaweicloud.com
2022-10-13 10:53:30 -07:00
Xu Kuohai
4abdb1d5b2 selftests/bpf: Fix error failure of case test_xdp_adjust_tail_grow
test_xdp_adjust_tail_grow failed with ipv6:
  test_xdp_adjust_tail_grow:FAIL:ipv6 unexpected error: -28 (errno 28)

The reason is that this test case tests ipv4 before ipv6, and when ipv4
test finished, topts.data_size_out was set to 54, which is smaller than the
ipv6 output data size 114, so ipv6 test fails with NOSPC error.

Fix it by reset topts.data_size_out to sizeof(buf) before testing ipv6.

Fixes: 04fcb5f9a1 ("selftests/bpf: Migrate from bpf_prog_test_run")
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/bpf/20221011120108.782373-6-xukuohai@huaweicloud.com
2022-10-13 10:53:28 -07:00
Xu Kuohai
6d2e21dc4d selftest/bpf: Fix memory leak in kprobe_multi_test
The get_syms() function in kprobe_multi_test.c does not free the string
memory allocated by sscanf correctly. Fix it.

Fixes: 5b6c7e5c44 ("selftests/bpf: Add attach bench test")
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/bpf/20221011120108.782373-5-xukuohai@huaweicloud.com
2022-10-13 10:53:24 -07:00
Xu Kuohai
6e8280b958 selftests/bpf: Fix memory leak caused by not destroying skeleton
Some test cases does not destroy skeleton object correctly, causing ASAN
to report memory leak warning. Fix it.

Fixes: 0ef6740e97 ("selftests/bpf: Add tests for kptr_ref refcounting")
Fixes: 1642a3945e ("selftests/bpf: Add struct argument tests with fentry/fexit programs.")
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/bpf/20221011120108.782373-4-xukuohai@huaweicloud.com
2022-10-13 10:53:22 -07:00
Xu Kuohai
0dc9254e03 libbpf: Fix memory leak in parse_usdt_arg()
In the arm64 version of parse_usdt_arg(), when sscanf returns 2, reg_name
is allocated but not freed. Fix it.

Fixes: 0f8619929c ("libbpf: Usdt aarch64 arg parsing support")
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/bpf/20221011120108.782373-3-xukuohai@huaweicloud.com
2022-10-13 10:53:18 -07:00
Xu Kuohai
93c660ca40 libbpf: Fix use-after-free in btf_dump_name_dups
ASAN reports an use-after-free in btf_dump_name_dups:

ERROR: AddressSanitizer: heap-use-after-free on address 0xffff927006db at pc 0xaaaab5dfb618 bp 0xffffdd89b890 sp 0xffffdd89b928
READ of size 2 at 0xffff927006db thread T0
    #0 0xaaaab5dfb614 in __interceptor_strcmp.part.0 (test_progs+0x21b614)
    #1 0xaaaab635f144 in str_equal_fn tools/lib/bpf/btf_dump.c:127
    #2 0xaaaab635e3e0 in hashmap_find_entry tools/lib/bpf/hashmap.c:143
    #3 0xaaaab635e72c in hashmap__find tools/lib/bpf/hashmap.c:212
    #4 0xaaaab6362258 in btf_dump_name_dups tools/lib/bpf/btf_dump.c:1525
    #5 0xaaaab636240c in btf_dump_resolve_name tools/lib/bpf/btf_dump.c:1552
    #6 0xaaaab6362598 in btf_dump_type_name tools/lib/bpf/btf_dump.c:1567
    #7 0xaaaab6360b48 in btf_dump_emit_struct_def tools/lib/bpf/btf_dump.c:912
    #8 0xaaaab6360630 in btf_dump_emit_type tools/lib/bpf/btf_dump.c:798
    #9 0xaaaab635f720 in btf_dump__dump_type tools/lib/bpf/btf_dump.c:282
    #10 0xaaaab608523c in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:236
    #11 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875
    #12 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062
    #13 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697
    #14 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308
    #15 0xaaaab5d65990  (test_progs+0x185990)

0xffff927006db is located 11 bytes inside of 16-byte region [0xffff927006d0,0xffff927006e0)
freed by thread T0 here:
    #0 0xaaaab5e2c7c4 in realloc (test_progs+0x24c7c4)
    #1 0xaaaab634f4a0 in libbpf_reallocarray tools/lib/bpf/libbpf_internal.h:191
    #2 0xaaaab634f840 in libbpf_add_mem tools/lib/bpf/btf.c:163
    #3 0xaaaab636643c in strset_add_str_mem tools/lib/bpf/strset.c:106
    #4 0xaaaab6366560 in strset__add_str tools/lib/bpf/strset.c:157
    #5 0xaaaab6352d70 in btf__add_str tools/lib/bpf/btf.c:1519
    #6 0xaaaab6353e10 in btf__add_field tools/lib/bpf/btf.c:2032
    #7 0xaaaab6084fcc in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:232
    #8 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875
    #9 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062
    #10 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697
    #11 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308
    #12 0xaaaab5d65990  (test_progs+0x185990)

previously allocated by thread T0 here:
    #0 0xaaaab5e2c7c4 in realloc (test_progs+0x24c7c4)
    #1 0xaaaab634f4a0 in libbpf_reallocarray tools/lib/bpf/libbpf_internal.h:191
    #2 0xaaaab634f840 in libbpf_add_mem tools/lib/bpf/btf.c:163
    #3 0xaaaab636643c in strset_add_str_mem tools/lib/bpf/strset.c:106
    #4 0xaaaab6366560 in strset__add_str tools/lib/bpf/strset.c:157
    #5 0xaaaab6352d70 in btf__add_str tools/lib/bpf/btf.c:1519
    #6 0xaaaab6353ff0 in btf_add_enum_common tools/lib/bpf/btf.c:2070
    #7 0xaaaab6354080 in btf__add_enum tools/lib/bpf/btf.c:2102
    #8 0xaaaab6082f50 in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:162
    #9 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875
    #10 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062
    #11 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697
    #12 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308
    #13 0xaaaab5d65990  (test_progs+0x185990)

The reason is that the key stored in hash table name_map is a string
address, and the string memory is allocated by realloc() function, when
the memory is resized by realloc() later, the old memory may be freed,
so the address stored in name_map references to a freed memory, causing
use-after-free.

Fix it by storing duplicated string address in name_map.

Fixes: 919d2b1dbb ("libbpf: Allow modification of BTF and add btf__add_str API")
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/bpf/20221011120108.782373-2-xukuohai@huaweicloud.com
2022-10-13 10:53:03 -07:00
Linus Torvalds
66ae04368e Including fixes from netfilter, and wifi.
Current release - regressions:
 
  - Revert "net/sched: taprio: make qdisc_leaf() see
    the per-netdev-queue pfifo child qdiscs", it may cause crashes
    when the qdisc is reconfigured
 
  - inet: ping: fix splat due to packet allocation refactoring in inet
 
  - tcp: clean up kernel listener's reqsk in inet_twsk_purge(),
    fix UAF due to races when per-netns hash table is used
 
 Current release - new code bugs:
 
  - eth: adin1110: check in netdev_event that netdev belongs to driver
 
  - fixes for PTR_ERR() vs NULL bugs in driver code, from Dan and co.
 
 Previous releases - regressions:
 
  - ipv4: handle attempt to delete multipath route when fib_info
    contains an nh reference, avoid oob access
 
  - wifi: fix handful of bugs in the new Multi-BSSID code
 
  - wifi: mt76: fix rate reporting / throughput regression on mt7915
    and newer, fix checksum offload
 
  - wifi: iwlwifi: mvm: fix double list_add at
    iwl_mvm_mac_wake_tx_queue (other cases)
 
  - wifi: mac80211: do not drop packets smaller than the LLC-SNAP
    header on fast-rx
 
 Previous releases - always broken:
 
  - ieee802154: don't warn zero-sized raw_sendmsg()
 
  - ipv6: ping: fix wrong checksum for large frames
 
  - mctp: prevent double key removal and unref
 
  - tcp/udp: fix memory leaks and races around IPV6_ADDRFORM
 
  - hv_netvsc: fix race between VF offering and VF association message
 
 Misc:
 
  - remove -Warray-bounds silencing in the drivers, compilers fixed
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmNISaMACgkQMUZtbf5S
 IruEARAArjYZbOEGkUVqtcEbnV0vmxQ5GsVyvurDkmzUULJ1rVAITtG7BbxcPyZ7
 tJf5BPmmpXxEXh/lZBIlgHLOGgf/cx4gkCH9Jz6LYlSoTpTZiTqxlOfAZNeei0FI
 PD95Slvd3TnIOEysv5RH/pQzIoKdd6+YqOhVITbwCW36cCLaUm+r7JUhzDrnHMNE
 KCcsOX9DDtW7MDJrJj/E0wlWeWcudpHY4DLG2A723X6Esu+8k6krK32XtkrFIKqa
 PFxeU1NPgMkn4S2xRPKqy+W3dTMfMKB4WWBMMUzEU220MIxV4l/RZSrnI5nrnLh2
 uXyUefpx+lD92D5BOiqUw8rK7B4Jq0uUrawuCf+70tbO1f13ThkkAlV6cEzrlnZY
 tGQxs0ayFIDVypU1tpY9cemUiYXrnPpCkpz+V1G0us8L323eCHxjz/f5TUlb51Na
 BVFvRqvxkjztprBv2LrH2SmnVtcH2kvQG8qMYmXRchBM+11rivz6BrPdE0V+muMg
 Hjr6HefYMBpSgcD+ADVFr8a/OB/W7AuWpTBd3z/WyNQ5MxkFX9Kf2Lt2+j8SRfpE
 ELO0AANFQZ1Gyp6LTbEkA3mFs1LhNNQyfjHcMHC16ZExHmV3i37BE9LJdnFM27N8
 R8lIm4YDs6Jj6YIUDy2wExgUAgkUk7mfZNCMPNi2nSsdJksyAsc=
 =AyqG
 -----END PGP SIGNATURE-----

Merge tag 'net-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter, and wifi.

Current release - regressions:

   - Revert "net/sched: taprio: make qdisc_leaf() see the
     per-netdev-queue pfifo child qdiscs", it may cause crashes when the
     qdisc is reconfigured

   - inet: ping: fix splat due to packet allocation refactoring in inet

   - tcp: clean up kernel listener's reqsk in inet_twsk_purge(), fix UAF
     due to races when per-netns hash table is used

  Current release - new code bugs:

   - eth: adin1110: check in netdev_event that netdev belongs to driver

   - fixes for PTR_ERR() vs NULL bugs in driver code, from Dan and co.

  Previous releases - regressions:

   - ipv4: handle attempt to delete multipath route when fib_info
     contains an nh reference, avoid oob access

   - wifi: fix handful of bugs in the new Multi-BSSID code

   - wifi: mt76: fix rate reporting / throughput regression on mt7915
     and newer, fix checksum offload

   - wifi: iwlwifi: mvm: fix double list_add at
     iwl_mvm_mac_wake_tx_queue (other cases)

   - wifi: mac80211: do not drop packets smaller than the LLC-SNAP
     header on fast-rx

  Previous releases - always broken:

   - ieee802154: don't warn zero-sized raw_sendmsg()

   - ipv6: ping: fix wrong checksum for large frames

   - mctp: prevent double key removal and unref

   - tcp/udp: fix memory leaks and races around IPV6_ADDRFORM

   - hv_netvsc: fix race between VF offering and VF association message

  Misc:

   - remove -Warray-bounds silencing in the drivers, compilers fixed"

* tag 'net-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (73 commits)
  sunhme: fix an IS_ERR() vs NULL check in probe
  net: marvell: prestera: fix a couple NULL vs IS_ERR() checks
  kcm: avoid potential race in kcm_tx_work
  tcp: Clean up kernel listener's reqsk in inet_twsk_purge()
  net: phy: micrel: Fixes FIELD_GET assertion
  openvswitch: add nf_ct_is_confirmed check before assigning the helper
  tcp: Fix data races around icsk->icsk_af_ops.
  ipv6: Fix data races around sk->sk_prot.
  tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct().
  udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM).
  tcp/udp: Fix memory leak in ipv6_renew_options().
  mctp: prevent double key removal and unref
  selftests: netfilter: Fix nft_fib.sh for all.rp_filter=1
  netfilter: rpfilter/fib: Populate flowic_l3mdev field
  selftests: netfilter: Test reverse path filtering
  net/mlx5: Make ASO poll CQ usable in atomic context
  tcp: cdg: allow tcp_cdg_release() to be called multiple times
  inet: ping: fix recent breakage
  ipv6: ping: fix wrong checksum for large frames
  net: ethernet: ti: am65-cpsw: set correct devlink flavour for unused ports
  ...
2022-10-13 10:51:01 -07:00
David Vernet
6e44b9f375 selftests/bpf: Make bpf_user_ringbuf_drain() selftest callback return 1
In commit 1bfe26fb08 ("bpf: Add verifier support for custom callback
return range"), the verifier was updated to require callbacks to BPF
helpers to explicitly specify the range of values that can be returned.
bpf_user_ringbuf_drain() was merged after this in commit 2057156738
("bpf: Add bpf_user_ringbuf_drain() helper"), and this change in default
behavior was missed. This patch updates the BPF_MAP_TYPE_USER_RINGBUF
selftests to also return 1 from a bpf_user_ringbuf_drain() callback so
as to properly test this going forward.

Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221012232015.1510043-3-void@manifault.com
2022-10-13 08:27:38 -07:00
Martin KaFai Lau
de9c8d848d selftests/bpf: S/iptables/iptables-legacy/ in the bpf_nf and xdp_synproxy test
The recent vm image in CI has reported error in selftests that use
the iptables command.  Manu Bretelle has pointed out the difference
in the recent vm image that the iptables is sym-linked to the iptables-nft.
With this knowledge,  I can also reproduce the CI error by manually running
with the 'iptables-nft'.

This patch is to replace the iptables command with iptables-legacy
to unblock the CI tests.

Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20221012221235.3529719-1-martin.lau@linux.dev
2022-10-13 08:03:04 -07:00
Gavin Shan
05c2224d4b KVM: selftests: Fix number of pages for memory slot in memslot_modification_stress_test
It's required by vm_userspace_mem_region_add() that memory size
should be aligned to host page size. However, one guest page is
provided by memslot_modification_stress_test. It triggers failure
in the scenario of 64KB-page-size-host and 4KB-page-size-guest,
as the following messages indicate.

 # ./memslot_modification_stress_test
 Testing guest mode: PA-bits:40,  VA-bits:48,  4K pages
 guest physical test memory: [0xffbfff0000, 0xffffff0000)
 Finished creating vCPUs
 Started all vCPUs
 ==== Test Assertion Failure ====
   lib/kvm_util.c:824: vm_adjust_num_guest_pages(vm->mode, npages) == npages
   pid=5712 tid=5712 errno=0 - Success
      1	0x0000000000404eeb: vm_userspace_mem_region_add at kvm_util.c:822
      2	0x0000000000401a5b: add_remove_memslot at memslot_modification_stress_test.c:82
      3	 (inlined by) run_test at memslot_modification_stress_test.c:110
      4	0x0000000000402417: for_each_guest_mode at guest_modes.c:100
      5	0x00000000004016a7: main at memslot_modification_stress_test.c:187
      6	0x0000ffffb8cd4383: ?? ??:0
      7	0x0000000000401827: _start at :?
   Number of guest pages is not compatible with the host. Try npages=16

Fix the issue by providing 16 guest pages to the memory slot for this
particular combination of 64KB-page-size-host and 4KB-page-size-guest
on aarch64.

Fixes: ef4c9f4f65 ("KVM: selftests: Fix 32-bit truncation of vm_get_max_gfn()")
Signed-off-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20221013063020.201856-1-gshan@redhat.com
2022-10-13 11:46:51 +01:00
Peter Xu
26c92d37d3 mm/selftest: uffd: explain the write missing fault check
It's not obvious why we had a write check for each of the missing
messages, especially when it should be a locking op.  Add a rich comment
for that, and also try to explain its good side and limitations, so that
if someone hit it again for either a bug or a different glibc impl
there'll be some clue to start with.

Link: https://lkml.kernel.org/r/20221004193400.110155-4-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-12 18:51:50 -07:00
Alistair Popple
ad4c365221 hmm-tests: add test for migrate_device_range()
Link: https://lkml.kernel.org/r/a73cf109de0224cfd118d22be58ddebac3ae2897.1664366292.git-series.apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Alex Sierra <alex.sierra@amd.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-12 18:51:50 -07:00
Linus Torvalds
a185a09955 linux-kselftest-kunit-6.1-rc1-2
This second KUnit update for Linux 6.1-rc1 consists of features and
 fixes:
 
 - simplifying resource use.
 - make kunit_malloc() and kunit_free() allocations and frees consistent.
   kunit_free() frees only the memory allocated by kunit_malloc().
 - stop downloading risc-v  opensbi binaries using wget.
 - other fixes and improvements to tool and KUnit framework.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmNG/4EACgkQCwJExA0N
 QxxznhAAqtXbCYxIxerdiAHwYifnsrLcCMm/Ol2yuFJhmTn6sZh7w4S8bRBt0RlX
 +1IfqtzOi1K1fTpmWQqnq0/fH8gNZrhZHHqXxx3c353pG0BfrC3vODx1VzxuPCMi
 nr/OHqAQ0VSTuxgWxsIr0SuhOM4LFDjhBcLDoCDoBF5aQSJricpa++ixiYsVgaUt
 nG+E1i7I/hvEYwqqUqtJLp9fOD6LK2IeiOP4oH2PwYBIpFO+BXwk0Gbs/ISL+fRP
 F8pph2Qm2jxCJ4kRDvs/N41mkIvG9PwC1h7fW4vDXix0zryJdh0TbilFQFFwiuW3
 S8kFE1tarMBWyqEZU/2cln9MFdZpxXAWtJu1/B8dqOvLA06mBOaNbB4tOXzfyriE
 QBOnEJNqgT0wqnwWONvrljz7L+YaFAkJAGxbub1cGIUa/t5HHs0WX5XncctGfsaE
 Ec6bLOXMgemb3dm35fDpBHyN6np9K5BMmz8Ggv02+V8FH8nrXAzblOW/CN8KgXiG
 R5+1vd3SxaLq7npal4S88LmNRoJCVCSWnNPItBTgWFXy6Ni2T5WEoi6rSdqJNX+/
 bpPM4G47IO5BH0YEbl9IPvKLfDGczVB4TVLpIt61QST4rf+puUhysr76ZweqoU6f
 sOyEenr3YZ7C3EpSbcAztzgyPomPAacR/lNbG5lezcEPRSo184I=
 =FgDN
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-kunit-6.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull more KUnit updates from Shuah Khan:
 "Features and fixes:

   - simplify resource use

   - make kunit_malloc() and kunit_free() allocations and frees
     consistent. kunit_free() frees only the memory allocated by
     kunit_malloc()

   - stop downloading risc-v opensbi binaries using wget

   - other fixes and improvements to tool and KUnit framework"

* tag 'linux-kselftest-kunit-6.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  Documentation: kunit: Update description of --alltests option
  kunit: declare kunit_assert structs as const
  kunit: rename base KUNIT_ASSERTION macro to _KUNIT_FAILED
  kunit: remove format func from struct kunit_assert, get it to 0 bytes
  kunit: tool: Don't download risc-v opensbi firmware with wget
  kunit: make kunit_kfree(NULL) a no-op to match kfree()
  kunit: make kunit_kfree() not segfault on invalid inputs
  kunit: make kunit_kfree() only work on pointers from kunit_malloc() and friends
  kunit: drop test pointer in string_stream_fragment
  kunit: string-stream: Simplify resource use
2022-10-12 15:01:58 -07:00
Linus Torvalds
661e00960f linux-kselftest-next-6.1-rc1-2
This second Kselftest update for Linux 6.1-rc1 consists of fixes
 and improvements to memory-hotplug test and a minor spelling fix
 to ftrace test.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmNG4hgACgkQCwJExA0N
 QxwkPBAAyPd0ZUHlF7JjzdV2obHDGxbjMzi0x8Di8md4B24gE0PvGY79E7eM/uKd
 pBsop5cnvwGBZGuoBM0E/1J7UB/Lgedl2iYDUFXQe8JoPlOgvmBMbJCdZ3Zv8gxp
 sk5yIrLakgyp2WZng0QyQwZQY4nvq8Lf/f50T8/3+g8OBqF+xTo60DyEpsaDNHS4
 3SddH8/jJ6TkG/5lRoEOlfYFrhCDuxq1e8R0jts1vgnpdhpSD9JZPr26VNGVcygB
 dkp4icsQFWAaZjNO6+7scgp1yfxBFJ2Fh/gDdfWqEAYvZtvnnr2XhwlYK+O7JZRp
 DuglF4Lo/AN3betWuAz4rWyqAYoBZxrUTxrsIVyzb3FqpRAlR32YPFfMo6iWYYn4
 638E6cYvkNbbbhCEEgHJJiFZzUB/xbLR/Y8gD4Que/Y+Ck7+zuvQMzZWHQNJfsGx
 OhhfUcJlw/VzRpdZx1UToT++DqOqJLBL7DVMATbiXd2rDGKbnEw2pKkeuURXVged
 1nis9odge5yY42Q5I3doyPHO7rENOAP2wmlKvJqFDKZFoD23MsGv/m6gpg/HaS1Q
 T27L1hHFXPrAZ14MxGva1DTVTPU8D/ciHcqjCWWmnHp8M359JvjOsDL7PyMUcGlm
 bVSqcciy71utp1XaFfaF7kT1bwnNeGgGqs0EXcmj4xE8CyOtat8=
 =GXpd
 -----END PGP SIGNATURE-----

Merge tag 'linux-kselftest-next-6.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull more Kselftest updates from Shuah Khan:
 "This consists of fixes and improvements to memory-hotplug test and a
  minor spelling fix to ftrace test"

* tag 'linux-kselftest-next-6.1-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  docs: notifier-error-inject: Correct test's name
  selftests/memory-hotplug: Adjust log info for maintainability
  selftests/memory-hotplug: Restore memory before exit
  selftests/memory-hotplug: Add checking after online or offline
  selftests/ftrace: func_event_triggers: fix typo in user message
2022-10-12 14:59:13 -07:00
Linus Torvalds
676cb49573 - hfs and hfsplus kmap API modernization from Fabio Francesco
- Valentin Schneider makes crash-kexec work properly when invoked from
   an NMI-time panic.
 
 - ntfs bugfixes from Hawkins Jiawei
 
 - Jiebin Sun improves IPC msg scalability by replacing atomic_t's with
   percpu counters.
 
 - nilfs2 cleanups from Minghao Chi
 
 - lots of other single patches all over the tree!
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0Yf0gAKCRDdBJ7gKXxA
 joapAQDT1d1zu7T8yf9cQXkYnZVuBKCjxKE/IsYvqaq1a42MjQD/SeWZg0wV05B8
 DhJPj9nkEp6R3Rj3Mssip+3vNuceAQM=
 =lUQY
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - hfs and hfsplus kmap API modernization (Fabio Francesco)

 - make crash-kexec work properly when invoked from an NMI-time panic
   (Valentin Schneider)

 - ntfs bugfixes (Hawkins Jiawei)

 - improve IPC msg scalability by replacing atomic_t's with percpu
   counters (Jiebin Sun)

 - nilfs2 cleanups (Minghao Chi)

 - lots of other single patches all over the tree!

* tag 'mm-nonmm-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (71 commits)
  include/linux/entry-common.h: remove has_signal comment of arch_do_signal_or_restart() prototype
  proc: test how it holds up with mapping'less process
  mailmap: update Frank Rowand email address
  ia64: mca: use strscpy() is more robust and safer
  init/Kconfig: fix unmet direct dependencies
  ia64: update config files
  nilfs2: replace WARN_ONs by nilfs_error for checkpoint acquisition failure
  fork: remove duplicate included header files
  init/main.c: remove unnecessary (void*) conversions
  proc: mark more files as permanent
  nilfs2: remove the unneeded result variable
  nilfs2: delete unnecessary checks before brelse()
  checkpatch: warn for non-standard fixes tag style
  usr/gen_init_cpio.c: remove unnecessary -1 values from int file
  ipc/msg: mitigate the lock contention with percpu counter
  percpu: add percpu_counter_add_local and percpu_counter_sub_local
  fs/ocfs2: fix repeated words in comments
  relay: use kvcalloc to alloc page array in relay_alloc_page_array
  proc: make config PROC_CHILDREN depend on PROC_FS
  fs: uninline inode_maybe_inc_iversion()
  ...
2022-10-12 11:00:22 -07:00
Phil Sutter
6a91e72709 selftests: netfilter: Fix nft_fib.sh for all.rp_filter=1
If net.ipv4.conf.all.rp_filter is set, it overrides the per-interface
setting and thus defeats the fix from bbe4c0896d ("selftests:
netfilter: disable rp_filter on router"). Unset it as well to cover that
case.

Fixes: bbe4c0896d ("selftests: netfilter: disable rp_filter on router")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Florian Westphal <fw@strlen.de>
2022-10-12 14:08:15 +02:00
Phil Sutter
6e31ce831c selftests: netfilter: Test reverse path filtering
Test reverse path (filter) matches in iptables, ip6tables and nftables.
Both with a regular interface and a VRF.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
2022-10-12 14:08:15 +02:00
David Vernet
d31ada3b51 selftests/bpf: Alphabetize DENYLISTs
The DENYLIST and DENYLIST.s390x files are used to specify testcases
which should not be run on CI. Currently, testcases are appended to the
end of these files as needed. This can make it a pain to resolve merge
conflicts. This patch alphabetizes the DENYLIST files to ease this
burden.

Signed-off-by: David Vernet <void@manifault.com>
Acked-by: Daniel Müller <deso@posteo.net>
Link: https://lore.kernel.org/r/20221011165255.774014-1-void@manifault.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2022-10-11 22:55:28 -07:00
Linus Torvalds
49da070062 memblock: test suite improvements
* Added verification that memblock allocations zero the allocated memory
 * Added more test cases for memblock_add(), memblock_remove(),
   memblock_reserve() and memblock_free()
 * Added tests for memblock_*_raw() family
 * Added tests for NUMA-aware allocations in memblock_alloc_try_nid() and
   memblock_alloc_try_nid_raw()
 -----BEGIN PGP SIGNATURE-----
 
 iQFMBAABCAA2FiEEeOVYVaWZL5900a/pOQOGJssO/ZEFAmNFRd8YHG1pa2UucmFw
 b3BvcnRAZ21haWwuY29tAAoJEDkDhibLDv2RD7AH/2aaJqtGakZGPtx1IBULg/AE
 P3/b0OERUjSCzyU3gH8rZ2zG9AsV6Lq/NKeq7B6zdrTJrLs+IlLaGGnbBoO3sEZn
 KWJWVQlVTyFdT8bdM0BGAAtQvehE6Km1IEoH5c7ColdcBeZKi98MuXY1BEY0KjgJ
 76Z+sWOfC+SKV2Yp8A+7anSmD95jZwU4ogr2rHjAgdXAqPQgyUYNvAeawKLapnpQ
 XjvfWwsEC54PVCw7S1k83Q1VDQ7Vo/b8SMP9/XphcVpwt/6rq3B2uJosRE91ViSe
 G8M6moNbYtj4n06nNvRNN5zq9igkbjgONCQnBj2D/r6s3SF8Tb3hcznxzORvdK4=
 =U6/l
 -----END PGP SIGNATURE-----

Merge tag 'memblock-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock

Pull memblock updates from Mike Rapoport:
 "Test suite improvements:

   - Added verification that memblock allocations zero the allocated
     memory

   - Added more test cases for memblock_add(), memblock_remove(),
     memblock_reserve() and memblock_free()

   - Added tests for memblock_*_raw() family

   - Added tests for NUMA-aware allocations in memblock_alloc_try_nid()
     and memblock_alloc_try_nid_raw()"

* tag 'memblock-v6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
  memblock tests: add generic NUMA tests for memblock_alloc_try_nid*
  memblock tests: add bottom-up NUMA tests for memblock_alloc_try_nid*
  memblock tests: add top-down NUMA tests for memblock_alloc_try_nid*
  memblock tests: add simulation of physical memory with multiple NUMA nodes
  memblock_tests: move variable declarations to single block
  memblock tests: remove 'cleared' from comment blocks
  memblock tests: add tests for memblock_trim_memory
  memblock tests: add tests for memblock_*bottom_up functions
  memblock tests: update alloc_nid_api to test memblock_alloc_try_nid_raw
  memblock tests: update alloc_api to test memblock_alloc_raw
  memblock tests: add additional tests for basic api and memblock_alloc
  memblock tests: add labels to verbose output for generic alloc tests
  memblock tests: update zeroed memory check for memblock_alloc_* tests
  memblock tests: update tests to check if memblock_alloc zeroed memory
  memblock tests: update reference to obsolete build option in comments
  memblock tests: add command line help option
2022-10-11 20:48:55 -07:00
Linus Torvalds
f311d498be ARM:
* Fixes for single-stepping in the presence of an async
   exception as well as the preservation of PSTATE.SS
 
 * Better handling of AArch32 ID registers on AArch64-only
   systems
 
 * Fixes for the dirty-ring API, allowing it to work on
   architectures with relaxed memory ordering
 
 * Advertise the new kvmarm mailing list
 
 * Various minor cleanups and spelling fixes
 
 RISC-V:
 
 * Improved instruction encoding infrastructure for
   instructions not yet supported by binutils
 
 * Svinval support for both KVM Host and KVM Guest
 
 * Zihintpause support for KVM Guest
 
 * Zicbom support for KVM Guest
 
 * Record number of signal exits as a VCPU stat
 
 * Use generic guest entry infrastructure
 
 x86:
 
 * Misc PMU fixes and cleanups.
 
 * selftests: fixes for Hyper-V hypercall
 
 * selftests: fix nx_huge_pages_test on TDP-disabled hosts
 
 * selftests: cleanups for fix_hypercall_test
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmM7OcMUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPAFgf/Rqc9hrXZVdbh2OZ+gScSsFsPK1zO
 DISUksLcXaYVYYsvQAEg/N2BPz3XbmO4jA+z8bIUrYTA7fC98we2C4jfR+EaX/fO
 +/Kzf0lAgu/nQZyFzUya+1jRsZqvVbC/HmDCI2kzN4u78e/LZ7NVcMijdV/ly6ib
 cq0b0LLqJHe/fcpJ806JZP3p5sndQhDmlUkZ2AWZf6CUKSEFcufbbYkt+84ZK4PL
 N9mEqXYQ3DXClLQmIBv+NZhtGlmADkWDE4BNouw8dVxhaXH7Hw/jfBHdb6SSHMRe
 tQ6Src1j8AYOhf5J35SMudgkbGcMelm0yeZ7Sizk+5Ft0EmdbZsnkvsGdQ==
 =4RA+
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull more kvm updates from Paolo Bonzini:
 "The main batch of ARM + RISC-V changes, and a few fixes and cleanups
  for x86 (PMU virtualization and selftests).

  ARM:

   - Fixes for single-stepping in the presence of an async exception as
     well as the preservation of PSTATE.SS

   - Better handling of AArch32 ID registers on AArch64-only systems

   - Fixes for the dirty-ring API, allowing it to work on architectures
     with relaxed memory ordering

   - Advertise the new kvmarm mailing list

   - Various minor cleanups and spelling fixes

  RISC-V:

   - Improved instruction encoding infrastructure for instructions not
     yet supported by binutils

   - Svinval support for both KVM Host and KVM Guest

   - Zihintpause support for KVM Guest

   - Zicbom support for KVM Guest

   - Record number of signal exits as a VCPU stat

   - Use generic guest entry infrastructure

  x86:

   - Misc PMU fixes and cleanups.

   - selftests: fixes for Hyper-V hypercall

   - selftests: fix nx_huge_pages_test on TDP-disabled hosts

   - selftests: cleanups for fix_hypercall_test"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (57 commits)
  riscv: select HAVE_POSIX_CPU_TIMERS_TASK_WORK
  RISC-V: KVM: Use generic guest entry infrastructure
  RISC-V: KVM: Record number of signal exits as a vCPU stat
  RISC-V: KVM: add __init annotation to riscv_kvm_init()
  RISC-V: KVM: Expose Zicbom to the guest
  RISC-V: KVM: Provide UAPI for Zicbom block size
  RISC-V: KVM: Make ISA ext mappings explicit
  RISC-V: KVM: Allow Guest use Zihintpause extension
  RISC-V: KVM: Allow Guest use Svinval extension
  RISC-V: KVM: Use Svinval for local TLB maintenance when available
  RISC-V: Probe Svinval extension form ISA string
  RISC-V: KVM: Change the SBI specification version to v1.0
  riscv: KVM: Apply insn-def to hlv encodings
  riscv: KVM: Apply insn-def to hfence encodings
  riscv: Introduce support for defining instructions
  riscv: Add X register names to gpr-nums
  KVM: arm64: Advertise new kvmarm mailing list
  kvm: vmx: keep constant definition format consistent
  kvm: mmu: fix typos in struct kvm_arch
  KVM: selftests: Fix nx_huge_pages_test on TDP-disabled hosts
  ...
2022-10-11 20:07:44 -07:00
Alexey Dobriyan
5bc73bb345 proc: test how it holds up with mapping'less process
Create process without mappings and check

	/proc/*/maps
	/proc/*/numa_maps
	/proc/*/smaps
	/proc/*/smaps_rollup

They must be empty (excluding vsyscall page) or full of zeroes.

Retroactively this test should've caught embarassing /proc/*/smaps_rollup
oops:

[17752.703567] BUG: kernel NULL pointer dereference, address: 0000000000000000
[17752.703580] #PF: supervisor read access in kernel mode
[17752.703583] #PF: error_code(0x0000) - not-present page
[17752.703587] PGD 0 P4D 0
[17752.703593] Oops: 0000 [#1] PREEMPT SMP PTI
[17752.703598] CPU: 0 PID: 60649 Comm: cat Tainted: G        W         5.19.9-100.fc35.x86_64 #1
[17752.703603] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X99 Extreme6/3.1, BIOS P3.30 08/05/2016
[17752.703607] RIP: 0010:show_smaps_rollup+0x159/0x2e0

Note 1:
	ProtectionKey field in /proc/*/smaps is optional,
	so check most of its contents, not everything.

Note 2:
	due to the nature of this test, child process hardly can signal
	its readiness (after unmapping everything!) to parent.
	I feel like "sleep(1)" is justified.
	If you know how to do it without sleep please tell me.

Note 3:
	/proc/*/statm is not tested but can be.

Link: https://lkml.kernel.org/r/Yz3liL6Dn+n2SD8Q@localhost.localdomain
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-11 18:51:11 -07:00
Linus Torvalds
d465bff130 perf tools changes for v6.1: 1st batch
- Add support for AMD on 'perf mem' and 'perf c2c', the kernel enablement
   patches went via tip.
 
   Example:
 
   $ sudo perf mem record -- -c 10000
   ^C[ perf record: Woken up 227 times to write data ]
   [ perf record: Captured and wrote 58.760 MB perf.data (836978 samples) ]
 
   $ sudo perf mem report -F mem,sample,snoop
   Samples: 836K of event 'ibs_op//', Event count (approx.): 8418762
   Memory access                  Samples  Snoop
   N/A                             700620  N/A
   L1 hit                          126675  N/A
   L2 hit                             424  N/A
   L3 hit                             664  HitM
   L3 hit                              10  N/A
   Local RAM hit                        2  N/A
   Remote RAM (1 hop) hit            8558  N/A
   Remote Cache (1 hop) hit             3  N/A
   Remote Cache (1 hop) hit             2  HitM
   Remote Cache (2 hops) hit           10  HitM
   Remote Cache (2 hops) hit            6  N/A
   Uncached hit                         4  N/A
   $
 
 - "perf lock" improvements:
 
   - Add -E/--entries option to limit the number of entries to display, say to ask for
     just the top 5 contended locks.
 
   - Add -q/--quiet option to suppress header and debug messages.
 
   - Add a 'perf test' kernel lock contention entry to test 'perf lock'.
 
 - "perf lock contention" improvements:
 
   - Ask BPF's bpf_get_stackid() to skip some callchain entries.
 
     The ones closer to the tooling are bpf related and not that interesting, the
     ones calling the locking function are the ones we're interested in, example
     of a full, unskipped callstack:
 
   - Allow changing the callstack depth and number of entries to skip.
 
            1     10.74 us     10.74 us     10.74 us     spinlock   __bpf_trace_contention_begin+0xb
                           0xffffffffc03b5c47  bpf_prog_bf07ae9e2cbd02c5_contention_begin+0x117
                           0xffffffffc03b5c47  bpf_prog_bf07ae9e2cbd02c5_contention_begin+0x117
                           0xffffffffbb8b8e75  bpf_trace_run2+0x35
                           0xffffffffbb7eab9b  __bpf_trace_contention_begin+0xb
                           0xffffffffbb7ebe75  queued_spin_lock_slowpath+0x1f5
                           0xffffffffbc1c26ff  _raw_spin_lock+0x1f
                           0xffffffffbb841015  tick_do_update_jiffies64+0x25
                           0xffffffffbb8409ee  tick_irq_enter+0x9e
 
   - Show full callstack in verbose mode (-v option), sometimes this is desirable
     instead of showing just one callstack entry.
 
 - Allow multiple time ranges in 'perf record --delay' to help in reducing the
   amount of data collected from hardware tracing (Intel PT, etc) when there is
   a rough idea of periods of time where events of interest take time.
 
 - Add Intel PT to record only decoder debug messages when error happens.
 
 - Improve layout of Intel PT man page.
 
 - Add new branch types: alignment, data and inst faults and arch specific ones,
   such as fiq, debug_halt, debug_exit, debug_inst and debug_data on arm64.
 
   Kernel enablement went thru the tip tree.
 
 - Fix 'perf probe' error log check in 'perf test' when no debuginfo is
   available.
 
 - Fix 'perf stat' aggregation mode logic, it should be looking at the CPU
   not at the core number.
 
 - Fix flags parsing in 'perf trace' filters.
 
 - Introduce compact encoding of CPU range encoding on perf.data, to avoid
   having a bitmap with all the CPUs.
 
 - Improvements to the 'perf stat' metrics, including adding "core_wide", and
   computing "smt" from the CPU topology.
 
 - Add support to the new PERF_FORMAT_LOST perf_event_attr.read_format, that allows
   tooling to ask for the precise number of lost samples for a given event.
 
 - Add 'addr' sort key to see just the address of sampled instructions:
 
   $ perf record -o- true | perf report -i- -s addr
   [ perf record: Woken up 1 times to write data ]
   [ perf record: Captured and wrote 0.000 MB - ]
   # Samples: 12  of event 'cycles:u'
   # Event count (approx.): 252512
   #
   # Overhead  Address
   # ........  ..................
       42.96%  0x7f96f08443d7
       29.55%  0x7f96f0859b50
       14.76%  0x7f96f0852e02
        8.30%  0x7f96f0855028
        4.43%  0xffffffff8de01087
 
   perf annotate: Toggle full address <-> offset display
 
 - Add 'f' hotkey to the 'perf annotate' TUI interface when in 'disassembler output'
   mode ('o' hotkey) to toggle showing full virtual address or just the offset.
 
 - Cache DSO build-ids when synthesizing PERF_RECORD_MMAP records for pre-existing threads,
   at the start of a 'perf record' session, speeding up that record startup phase.
 
 - Add a command line option to specify build ids in 'perf inject'.
 
 - Update JSON event files for the Intel alderlake, broadwell, broadwellde,
   broadwellx, cascadelakex, haswell, haswellx, icelake, icelakex, ivybridge,
   ivytown, jaketown, sandybridge, sapphirerapids, skylake, skylakex, and
   tigerlake processors.
 
 - Update vendor JSON event files for the ARM Neoverse V1 and E1 platforms.
 
 - Add a 'perf test' entry for 'perf mem' where a struct has false sharing and
   this gets detected in the 'perf mem' output, tested with Intel, AMD and ARM64
   systems.
 
 - Add a 'perf test' entry to test the resolution of java symbols, where an
   output like this is expected:
 
      8.18%  jshell    jitted-50116-29.so    [.] Interpreter
      0.75%  Thread-1  jitted-83602-1670.so  [.] jdk.internal.jimage.BasicImageReader.getString(int)
 
 - Add tests for the ARM64 CoreSight hardware tracing feature, with specially
   crafted pureloop, memcpy, thread loop and unroll tread that then gets
   traced and the output compared with expected output.
 
   Documentation explaining it is also included.
 
 - Add per thread Intel PT 'perf test' entry to check that PERF_RECORD_TEXT_POKE events
   are recorded per CPU, resulting in a mixture of per thread and per CPU events and mmaps,
   verify that this gets all recorded correctly.
 
 - Introduce pthread mutex wrappers to allow for building with clang's
   -Wthread-safety, i.e. using the "guarded_by" "pt_guarded_by" "lockable",
   "exclusive_lock_function", "exclusive_trylock_function",
   "exclusive_locks_required", and "no_thread_safety_analysis" compiler function
   attributes.
 
 - Fix empty version number when building outside of a git repo.
 
 - Improve feature detection display when multiple versions of a feature are present, such
   as for binutils libbfd, that has a mix of possible ways to detect according to the
   Linux distribution.
 
   Previously in some cases we had:
 
   Auto-detecting system features
   <SNIP>
   ...                                  libbfd: [ on  ]
   ...                          libbfd-liberty: [ on  ]
   ...                        libbfd-liberty-z: [ on  ]
   <SNIP>
 
   Now for this case we show just the main feature:
 
   Auto-detecting system features
   <SNIP>
   ...                                  libbfd: [ on  ]
   <SNIP>
 
 - Remove some unused structs, variables, macros, function prototypes and
   includes from various places.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCY0CKuAAKCRCyPKLppCJ+
 JywwAQDWLForEnEZNk92Fd3y342Lh9W/8z1V51dKK7XdY1cV6AD/Rn5L57v7k/yG
 mG5w2Fd1J/xBjlsL/BvNlimUD2tbkQA=
 =XPMg
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v6.1-1-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools updates from Arnaldo Carvalho de Melo:

 - Add support for AMD on 'perf mem' and 'perf c2c', the kernel
   enablement patches went via tip.

   Example:

      $ sudo perf mem record -- -c 10000
      ^C[ perf record: Woken up 227 times to write data ]
      [ perf record: Captured and wrote 58.760 MB perf.data (836978 samples) ]

      $ sudo perf mem report -F mem,sample,snoop
      Samples: 836K of event 'ibs_op//', Event count (approx.): 8418762
      Memory access                  Samples  Snoop
      N/A                             700620  N/A
      L1 hit                          126675  N/A
      L2 hit                             424  N/A
      L3 hit                             664  HitM
      L3 hit                              10  N/A
      Local RAM hit                        2  N/A
      Remote RAM (1 hop) hit            8558  N/A
      Remote Cache (1 hop) hit             3  N/A
      Remote Cache (1 hop) hit             2  HitM
      Remote Cache (2 hops) hit           10  HitM
      Remote Cache (2 hops) hit            6  N/A
      Uncached hit                         4  N/A
      $

 - "perf lock" improvements:

     - Add -E/--entries option to limit the number of entries to
       display, say to ask for just the top 5 contended locks.

     - Add -q/--quiet option to suppress header and debug messages.

     - Add a 'perf test' kernel lock contention entry to test 'perf
       lock'.

 - "perf lock contention" improvements:

     - Ask BPF's bpf_get_stackid() to skip some callchain entries.

       The ones closer to the tooling are bpf related and not that
       interesting, the ones calling the locking function are the ones
       we're interested in, example of a full, unskipped callstack:

     - Allow changing the callstack depth and number of entries to skip.

           1     10.74 us     10.74 us     10.74 us     spinlock   __bpf_trace_contention_begin+0xb
                          0xffffffffc03b5c47  bpf_prog_bf07ae9e2cbd02c5_contention_begin+0x117
                          0xffffffffc03b5c47  bpf_prog_bf07ae9e2cbd02c5_contention_begin+0x117
                          0xffffffffbb8b8e75  bpf_trace_run2+0x35
                          0xffffffffbb7eab9b  __bpf_trace_contention_begin+0xb
                          0xffffffffbb7ebe75  queued_spin_lock_slowpath+0x1f5
                          0xffffffffbc1c26ff  _raw_spin_lock+0x1f
                          0xffffffffbb841015  tick_do_update_jiffies64+0x25
                          0xffffffffbb8409ee  tick_irq_enter+0x9e

     - Show full callstack in verbose mode (-v option), sometimes this
       is desirable instead of showing just one callstack entry.

 - Allow multiple time ranges in 'perf record --delay' to help in
   reducing the amount of data collected from hardware tracing (Intel
   PT, etc) when there is a rough idea of periods of time where events
   of interest take time.

 - Add Intel PT to record only decoder debug messages when error
   happens.

 - Improve layout of Intel PT man page.

 - Add new branch types: alignment, data and inst faults and arch
   specific ones, such as fiq, debug_halt, debug_exit, debug_inst and
   debug_data on arm64.

   Kernel enablement went thru the tip tree.

 - Fix 'perf probe' error log check in 'perf test' when no debuginfo is
   available.

 - Fix 'perf stat' aggregation mode logic, it should be looking at the
   CPU not at the core number.

 - Fix flags parsing in 'perf trace' filters.

 - Introduce compact encoding of CPU range encoding on perf.data, to
   avoid having a bitmap with all the CPUs.

 - Improvements to the 'perf stat' metrics, including adding
   "core_wide", and computing "smt" from the CPU topology.

 - Add support to the new PERF_FORMAT_LOST perf_event_attr.read_format,
   that allows tooling to ask for the precise number of lost samples for
   a given event.

 - Add 'addr' sort key to see just the address of sampled instructions:

      $ perf record -o- true | perf report -i- -s addr
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 0.000 MB - ]
      # Samples: 12  of event 'cycles:u'
      # Event count (approx.): 252512
      #
      # Overhead  Address
      # ........  ..................
          42.96%  0x7f96f08443d7
          29.55%  0x7f96f0859b50
          14.76%  0x7f96f0852e02
           8.30%  0x7f96f0855028
           4.43%  0xffffffff8de01087

      perf annotate: Toggle full address <-> offset display

 - Add 'f' hotkey to the 'perf annotate' TUI interface when in
   'disassembler output' mode ('o' hotkey) to toggle showing full
   virtual address or just the offset.

 - Cache DSO build-ids when synthesizing PERF_RECORD_MMAP records for
   pre-existing threads, at the start of a 'perf record' session,
   speeding up that record startup phase.

 - Add a command line option to specify build ids in 'perf inject'.

 - Update JSON event files for the Intel alderlake, broadwell,
   broadwellde, broadwellx, cascadelakex, haswell, haswellx, icelake,
   icelakex, ivybridge, ivytown, jaketown, sandybridge, sapphirerapids,
   skylake, skylakex, and tigerlake processors.

 - Update vendor JSON event files for the ARM Neoverse V1 and E1
   platforms.

 - Add a 'perf test' entry for 'perf mem' where a struct has false
   sharing and this gets detected in the 'perf mem' output, tested with
   Intel, AMD and ARM64 systems.

 - Add a 'perf test' entry to test the resolution of java symbols, where
   an output like this is expected:

       8.18%  jshell    jitted-50116-29.so    [.] Interpreter
       0.75%  Thread-1  jitted-83602-1670.so  [.] jdk.internal.jimage.BasicImageReader.getString(int)

 - Add tests for the ARM64 CoreSight hardware tracing feature, with
   specially crafted pureloop, memcpy, thread loop and unroll tread that
   then gets traced and the output compared with expected output.

   Documentation explaining it is also included.

 - Add per thread Intel PT 'perf test' entry to check that
   PERF_RECORD_TEXT_POKE events are recorded per CPU, resulting in a
   mixture of per thread and per CPU events and mmaps, verify that this
   gets all recorded correctly.

 - Introduce pthread mutex wrappers to allow for building with clang's
   -Wthread-safety, i.e. using the "guarded_by" "pt_guarded_by"
   "lockable", "exclusive_lock_function", "exclusive_trylock_function",
   "exclusive_locks_required", and "no_thread_safety_analysis" compiler
   function attributes.

 - Fix empty version number when building outside of a git repo.

 - Improve feature detection display when multiple versions of a feature
   are present, such as for binutils libbfd, that has a mix of possible
   ways to detect according to the Linux distribution.

   Previously in some cases we had:

      Auto-detecting system features
      <SNIP>
      ...                                  libbfd: [ on  ]
      ...                          libbfd-liberty: [ on  ]
      ...                        libbfd-liberty-z: [ on  ]
      <SNIP>

   Now for this case we show just the main feature:

      Auto-detecting system features
      <SNIP>
      ...                                  libbfd: [ on  ]
      <SNIP>

 - Remove some unused structs, variables, macros, function prototypes
   and includes from various places.

* tag 'perf-tools-for-v6.1-1-2022-10-07' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (169 commits)
  perf script: Add missing fields in usage hint
  perf mem: Print "LFB/MAB" for PERF_MEM_LVLNUM_LFB
  perf mem/c2c: Avoid printing empty lines for unsupported events
  perf mem/c2c: Add load store event mappings for AMD
  perf mem/c2c: Set PERF_SAMPLE_WEIGHT for LOAD_STORE events
  perf mem: Add support for printing PERF_MEM_LVLNUM_{CXL|IO}
  perf amd ibs: Sync arch/x86/include/asm/amd-ibs.h header with the kernel
  tools headers UAPI: Sync include/uapi/linux/perf_event.h header with the kernel
  perf stat: Fix cpu check to use id.cpu.cpu in aggr_printout()
  perf test coresight: Add relevant documentation about ARM64 CoreSight testing
  perf test: Add git ignore for tmp and output files of ARM CoreSight tests
  perf test coresight: Add unroll thread test shell script
  perf test coresight: Add unroll thread test tool
  perf test coresight: Add thread loop test shell scripts
  perf test coresight: Add thread loop test tool
  perf test coresight: Add memcpy thread test shell script
  perf test coresight: Add memcpy thread test tool
  perf test: Add git ignore for perf data generated by the ARM CoreSight tests
  perf test: Add arm64 asm pureloop test shell script
  perf test: Add asm pureloop test tool
  ...
2022-10-11 15:02:25 -07:00
Linus Torvalds
27bc50fc90 - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
linux-next for a couple of months without, to my knowledge, any negative
   reports (or any positive ones, come to that).
 
 - Also the Maple Tree from Liam R.  Howlett.  An overlapping range-based
   tree for vmas.  It it apparently slight more efficient in its own right,
   but is mainly targeted at enabling work to reduce mmap_lock contention.
 
   Liam has identified a number of other tree users in the kernel which
   could be beneficially onverted to mapletrees.
 
   Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
   (https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com).
   This has yet to be addressed due to Liam's unfortunately timed
   vacation.  He is now back and we'll get this fixed up.
 
 - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer.  It uses
   clang-generated instrumentation to detect used-unintialized bugs down to
   the single bit level.
 
   KMSAN keeps finding bugs.  New ones, as well as the legacy ones.
 
 - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
   memory into THPs.
 
 - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support
   file/shmem-backed pages.
 
 - userfaultfd updates from Axel Rasmussen
 
 - zsmalloc cleanups from Alexey Romanov
 
 - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure
 
 - Huang Ying adds enhancements to NUMA balancing memory tiering mode's
   page promotion, with a new way of detecting hot pages.
 
 - memcg updates from Shakeel Butt: charging optimizations and reduced
   memory consumption.
 
 - memcg cleanups from Kairui Song.
 
 - memcg fixes and cleanups from Johannes Weiner.
 
 - Vishal Moola provides more folio conversions
 
 - Zhang Yi removed ll_rw_block() :(
 
 - migration enhancements from Peter Xu
 
 - migration error-path bugfixes from Huang Ying
 
 - Aneesh Kumar added ability for a device driver to alter the memory
   tiering promotion paths.  For optimizations by PMEM drivers, DRM
   drivers, etc.
 
 - vma merging improvements from Jakub Matěn.
 
 - NUMA hinting cleanups from David Hildenbrand.
 
 - xu xin added aditional userspace visibility into KSM merging activity.
 
 - THP & KSM code consolidation from Qi Zheng.
 
 - more folio work from Matthew Wilcox.
 
 - KASAN updates from Andrey Konovalov.
 
 - DAMON cleanups from Kaixu Xia.
 
 - DAMON work from SeongJae Park: fixes, cleanups.
 
 - hugetlb sysfs cleanups from Muchun Song.
 
 - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0HaPgAKCRDdBJ7gKXxA
 joPjAQDZ5LlRCMWZ1oxLP2NOTp6nm63q9PWcGnmY50FjD/dNlwEAnx7OejCLWGWf
 bbTuk6U2+TKgJa4X7+pbbejeoqnt5QU=
 =xfWx
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
   linux-next for a couple of months without, to my knowledge, any
   negative reports (or any positive ones, come to that).

 - Also the Maple Tree from Liam Howlett. An overlapping range-based
   tree for vmas. It it apparently slightly more efficient in its own
   right, but is mainly targeted at enabling work to reduce mmap_lock
   contention.

   Liam has identified a number of other tree users in the kernel which
   could be beneficially onverted to mapletrees.

   Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
   at [1]. This has yet to be addressed due to Liam's unfortunately
   timed vacation. He is now back and we'll get this fixed up.

 - Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
   clang-generated instrumentation to detect used-unintialized bugs down
   to the single bit level.

   KMSAN keeps finding bugs. New ones, as well as the legacy ones.

 - Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
   memory into THPs.

 - Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to
   support file/shmem-backed pages.

 - userfaultfd updates from Axel Rasmussen

 - zsmalloc cleanups from Alexey Romanov

 - cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and
   memory-failure

 - Huang Ying adds enhancements to NUMA balancing memory tiering mode's
   page promotion, with a new way of detecting hot pages.

 - memcg updates from Shakeel Butt: charging optimizations and reduced
   memory consumption.

 - memcg cleanups from Kairui Song.

 - memcg fixes and cleanups from Johannes Weiner.

 - Vishal Moola provides more folio conversions

 - Zhang Yi removed ll_rw_block() :(

 - migration enhancements from Peter Xu

 - migration error-path bugfixes from Huang Ying

 - Aneesh Kumar added ability for a device driver to alter the memory
   tiering promotion paths. For optimizations by PMEM drivers, DRM
   drivers, etc.

 - vma merging improvements from Jakub Matěn.

 - NUMA hinting cleanups from David Hildenbrand.

 - xu xin added aditional userspace visibility into KSM merging
   activity.

 - THP & KSM code consolidation from Qi Zheng.

 - more folio work from Matthew Wilcox.

 - KASAN updates from Andrey Konovalov.

 - DAMON cleanups from Kaixu Xia.

 - DAMON work from SeongJae Park: fixes, cleanups.

 - hugetlb sysfs cleanups from Muchun Song.

 - Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.

Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1]

* tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits)
  hugetlb: allocate vma lock for all sharable vmas
  hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer
  hugetlb: fix vma lock handling during split vma and range unmapping
  mglru: mm/vmscan.c: fix imprecise comments
  mm/mglru: don't sync disk for each aging cycle
  mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
  mm: memcontrol: use do_memsw_account() in a few more places
  mm: memcontrol: deprecate swapaccounting=0 mode
  mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled
  mm/secretmem: remove reduntant return value
  mm/hugetlb: add available_huge_pages() func
  mm: remove unused inline functions from include/linux/mm_inline.h
  selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
  selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd
  selftests/vm: add thp collapse shmem testing
  selftests/vm: add thp collapse file and tmpfs testing
  selftests/vm: modularize thp collapse memory operations
  selftests/vm: dedup THP helpers
  mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
  mm/madvise: add file and shmem support to MADV_COLLAPSE
  ...
2022-10-10 17:53:04 -07:00
Roberto Sassu
a9c7c18b57 selftests/bpf: Add tests for _opts variants of bpf_*_get_fd_by_id()
Introduce the data_input map, write-protected with a small eBPF program
implementing the lsm/bpf_map hook.

Then, ensure that bpf_map_get_fd_by_id() and bpf_map_get_fd_by_id_opts()
with NULL opts don't succeed due to requesting read-write access to the
write-protected map. Also, ensure that bpf_map_get_fd_by_id_opts() with
open_flags in opts set to BPF_F_RDONLY instead succeeds.

After obtaining a read-only fd, ensure that only map lookup succeeds and
not update. Ensure that update works only with the read-write fd obtained
at program loading time, when the write protection was not yet enabled.

Finally, ensure that the other _opts variants of bpf_*_get_fd_by_id() don't
work if the BPF_F_RDONLY flag is set in opts (due to the kernel not
handling the open_flags member of bpf_attr).

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221006110736.84253-7-roberto.sassu@huaweicloud.com
2022-10-10 16:49:43 -07:00
Roberto Sassu
97c8f9dd5d libbpf: Introduce bpf_link_get_fd_by_id_opts()
Introduce bpf_link_get_fd_by_id_opts(), for symmetry with
bpf_map_get_fd_by_id_opts(), to let the caller pass the newly introduced
data structure bpf_get_fd_by_id_opts. Keep the existing
bpf_link_get_fd_by_id(), and call bpf_link_get_fd_by_id_opts() with NULL as
opts argument, to prevent setting open_flags.

Currently, the kernel does not support non-zero open_flags for
bpf_link_get_fd_by_id_opts(), and a call with them will result in an error
returned by the bpf() system call. The caller should always pass zero
open_flags.

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221006110736.84253-6-roberto.sassu@huaweicloud.com
2022-10-10 16:49:20 -07:00
Roberto Sassu
2ce7cbf2ba libbpf: Introduce bpf_btf_get_fd_by_id_opts()
Introduce bpf_btf_get_fd_by_id_opts(), for symmetry with
bpf_map_get_fd_by_id_opts(), to let the caller pass the newly introduced
data structure bpf_get_fd_by_id_opts. Keep the existing
bpf_btf_get_fd_by_id(), and call bpf_btf_get_fd_by_id_opts() with NULL as
opts argument, to prevent setting open_flags.

Currently, the kernel does not support non-zero open_flags for
bpf_btf_get_fd_by_id_opts(), and a call with them will result in an error
returned by the bpf() system call. The caller should always pass zero
open_flags.

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221006110736.84253-5-roberto.sassu@huaweicloud.com
2022-10-10 16:49:20 -07:00
Roberto Sassu
8f13f168ea libbpf: Introduce bpf_prog_get_fd_by_id_opts()
Introduce bpf_prog_get_fd_by_id_opts(), for symmetry with
bpf_map_get_fd_by_id_opts(), to let the caller pass the newly introduced
data structure bpf_get_fd_by_id_opts. Keep the existing
bpf_prog_get_fd_by_id(), and call bpf_prog_get_fd_by_id_opts() with NULL as
opts argument, to prevent setting open_flags.

Currently, the kernel does not support non-zero open_flags for
bpf_prog_get_fd_by_id_opts(), and a call with them will result in an error
returned by the bpf() system call. The caller should always pass zero
open_flags.

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221006110736.84253-4-roberto.sassu@huaweicloud.com
2022-10-10 16:49:20 -07:00