Commit Graph

David Vernet
4a54de6596 bpf/selftests: Fix send_signal tracepoint tests
The send_signal tracepoint tests are non-deterministically failing in
CI. The test works as follows:

1. Two pairs of file descriptors are created using the pipe() function.
   One pair is used to communicate between a parent process -> child
   process, and the other for the reverse direction.

2. A child is fork()'ed. The child process registers a signal handler,
   notifies its parent that the signal handler is registered, and then
   waits for its parent to have enabled a BPF program that sends a
   signal.

3. The parent opens and loads a BPF skeleton with programs that send
   signals to the child process. The different programs are triggered by
   different perf events (either NMI or normal perf), or by regular
   tracepoints. The signal is delivered to the child whenever the child
   triggers the program.

4. The child's signal handler is invoked, which sets a flag saying that
   the signal handler was reached. The child then signals to the parent
   that it received the signal, and the test ends.

The perf testcases (send_signal_perf{_thread} and
send_signal_nmi{_thread}) work 100% of the time, but the tracepoint
testcases fail non-deterministically because the tracepoint is not
always fired for the child.

There are two tracepoint programs registered in the test:
'tracepoint/sched/sched_switch', and
'tracepoint/syscalls/sys_enter_nanosleep'. The child never intentionally
blocks or sleeps, so neither tracepoint is guaranteed to be triggered.
To fix this, we can have the child trigger the nanosleep program with a
usleep().

Before this patch, the test would fail locally every 2-3 runs. Now, it
doesn't fail after more than 1000 runs.
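
A sketch of the fix (illustrative, not the exact selftest diff; it
assumes the flag that the child's signal handler already sets):

  /* in the child: spin until signaled, issuing a nanosleep syscall so
   * the 'tracepoint/syscalls/sys_enter_nanosleep' program is triggered
   */
  while (!sigusr1_received)
          usleep(1);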

Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20230310061909.1420887-1-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-10 10:36:46 -08:00
Andrii Nakryiko
52c2b005a3 bpf: take into account liveness when propagating precision
When doing state comparison, if the old state has a register that is not
marked as REG_LIVE_READ, we just skip the comparison, regardless of the
state of the corresponding register in the current state. This is because
a register that is not REG_LIVE_READ is irrelevant for further program
execution and correctness. All good here.

But when we get to precision propagation, after two states were declared
equivalent, we don't take the old register's liveness into account, and
thus attempt to propagate precision for a register in the current state
even if that register in the old state was no longer REG_LIVE_READ. This
is bad, because the register in the current state could be anything at
all, and this could cause -EFAULT due to internal logic bugs.

Fix this by taking the REG_LIVE_READ liveness mark into account, keeping
the state comparison logic in sync with precision propagation.
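
A minimal sketch of the fix in the precision propagation path (variable
names are illustrative of verifier internals, not the verbatim patch):

  /* while walking the old (equivalent) state's registers */
  if (!(state_reg->live & REG_LIVE_READ))
          continue; /* skipped in state comparison, so skip here too */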

Fixes: a3ce685dd0 ("bpf: fix precision tracking")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230309224131.57449-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-10 10:11:42 -08:00
Andrii Nakryiko
4b5ce570db bpf: ensure state checkpointing at iter_next() call sites
The state equivalence check and checkpointing performed in
is_state_visited() employ certain heuristics to try to save memory by
avoiding state checkpoints if not enough jumps and instructions have
happened since the last checkpoint. This makes it unpredictable whether a
particular instruction will be checkpointed and how regularly. While
normally this doesn't cause many problems (apart from inconveniences for
predictable verifier tests, which we overcome with the
BPF_F_TEST_STATE_FREQ flag), it turns out that's not the case for
open-coded iterators.

Checking and saving state checkpoints at each iter_next() call site is
crucial for fast convergence of the open-coded iterator loop logic, so we
need to force it. If we don't, is_state_visited() might skip saving a
checkpoint, causing an unnecessarily long sequence of uncheckpointed
instructions and jumps, leading to exhaustion of the jump history buffer,
and potentially other undesired outcomes. It is expected that with correct
open-coded iterators convergence will happen quickly, so we don't run a
risk of exhausting memory.

In addition to the existing prune and jump instruction marks, this patch
adds a "forced checkpoint" mark and makes sure that any iter_next() call
instruction is marked as such.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230310060149.625887-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-10 08:31:42 -08:00
Alexei Starovoitov
1456ddcce5 Merge branch 'selftests/bpf: make BPF_CFLAGS stricter with -Wall'
Andrii Nakryiko says:

====================

Make the BPF-side compiler flags stricter by adding -Wall. Fix the tons of
small issues the compiler pointed out immediately after that. That includes
the newly added bpf_for(), bpf_for_each(), and bpf_repeat() macros.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-10 08:14:08 -08:00
Andrii Nakryiko
3d5a55ddc2 selftests/bpf: make BPF compiler flags stricter
We recently added -Wuninitialized, but it's not enough to catch various
silly mistakes or omissions. Let's go all the way to -Wall, just like we
do for user-space code.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230309054015.4068562-5-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-10 08:14:08 -08:00
Andrii Nakryiko
c8ed668593 selftests/bpf: fix lots of silly mistakes pointed out by compiler
Once we enable -Wall for BPF sources, the compiler will complain about
lots of unused variables, variables that are set but never read, etc.

Fix all these issues first before enabling -Wall in Makefile.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230309054015.4068562-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-10 08:14:08 -08:00
Andrii Nakryiko
713461b895 selftests/bpf: add __sink() macro to fake variable consumption
Add a __sink(expr) macro that forces the compiler to believe that the
passed-in expression is both read and written. It uses a simple embedded
asm statement for this. This is useful in a lot of tests where we assign
a value to some variable to trigger some action, but never read the
variable back, causing the compiler to complain (if the corresponding
compiler warnings are turned on, which we'll do in the next patch).
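
The macro can be as simple as an empty asm statement with a read-write
operand, along these lines (a sketch; see bpf_misc.h for the actual
definition):

  /* tell the compiler that 'expr' is both read and modified */
  #define __sink(expr) asm volatile("" : "+g"(expr))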

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230309054015.4068562-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-10 08:14:07 -08:00
Andrii Nakryiko
2498e6231b selftests/bpf: prevent unused variable warning in bpf_for()
Add __attribute__((unused)) to the inner __p variable inside the
bpf_for(), bpf_for_each(), and bpf_repeat() macros to avoid compiler
warnings about an unused variable.
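
Roughly, the inner declaration now looks like this (a sketch, not the
verbatim bpf_misc.h code):

  struct bpf_iter_num ___it
          __attribute__((cleanup(bpf_iter_num_destroy))),
          /* ___p exists only to run the constructor once; never read */
          *___p __attribute__((unused)) =
                  (bpf_iter_num_new(&___it, (start), (end)), (void *)0);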

Reported-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230309054015.4068562-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-10 08:14:07 -08:00
Yonghong Song
63d78b7e8c selftests/bpf: Workaround verification failure for fexit_bpf2bpf/func_replace_return_code
With the latest llvm17, the selftest fexit_bpf2bpf/func_replace_return_code
has the following verification failure:

  0: R1=ctx(off=0,imm=0) R10=fp0
  ; int connect_v4_prog(struct bpf_sock_addr *ctx)
  0: (bf) r7 = r1                       ; R1=ctx(off=0,imm=0) R7_w=ctx(off=0,imm=0)
  1: (b4) w6 = 0                        ; R6_w=0
  ; memset(&tuple.ipv4.saddr, 0, sizeof(tuple.ipv4.saddr));
  ...
  ; return do_bind(ctx) ? 1 : 0;
  179: (bf) r1 = r7                     ; R1=ctx(off=0,imm=0) R7=ctx(off=0,imm=0)
  180: (85) call pc+147
  Func#3 is global and valid. Skipping.
  181: R0_w=scalar()
  181: (bc) w6 = w0                     ; R0_w=scalar() R6_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
  182: (05) goto pc-129
  ; }
  54: (bc) w0 = w6                      ; R0_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff)) R6_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
  55: (95) exit
  At program exit the register R0 has value (0x0; 0xffffffff) should have been in (0x0; 0x1)
  processed 281 insns (limit 1000000) max_states_per_insn 1 total_states 26 peak_states 26 mark_read 13
  -- END PROG LOAD LOG --
  libbpf: prog 'connect_v4_prog': failed to load: -22

The corresponding source code:

  __attribute__ ((noinline))
  int do_bind(struct bpf_sock_addr *ctx)
  {
        struct sockaddr_in sa = {};

        sa.sin_family = AF_INET;
        sa.sin_port = bpf_htons(0);
        sa.sin_addr.s_addr = bpf_htonl(SRC_REWRITE_IP4);

        if (bpf_bind(ctx, (struct sockaddr *)&sa, sizeof(sa)) != 0)
                return 0;

        return 1;
  }
  ...
  SEC("cgroup/connect4")
  int connect_v4_prog(struct bpf_sock_addr *ctx)
  {
  ...
        return do_bind(ctx) ? 1 : 0;
  }

Insn 180 is a call to 'do_bind'. The call's return value is also the
return value of the program. Since do_bind() returns 0/1, it is legitimate
for the compiler to optimize 'return do_bind(ctx) ? 1 : 0' to
'return do_bind(ctx)'. However, such an optimization breaks the verifier,
as the return value of 'do_bind()' is marked as an arbitrary scalar, which
violates the requirement that the prog's return value be 0/1.

There are two ways to fix this problem: (1) changing 'return 1' in
do_bind() to e.g. 'return 10' so the compiler has to emit
'do_bind(ctx) ? 1 : 0', or (2), suggested by Andrii, marking do_bind()
with the __weak attribute so the compiler cannot make any assumption
about do_bind()'s return value.

This patch adopts the __weak approach, which is simpler and more
resistant to potential compiler optimizations.
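
The adopted fix boils down to changing one attribute (sketch):

  /* __weak: the compiler can no longer assume do_bind() returns 0/1 */
  __attribute__ ((weak))
  int do_bind(struct bpf_sock_addr *ctx)
  {
        ...
  }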

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230310012410.2920570-1-yhs@fb.com
2023-03-09 18:59:54 -08:00
Lorenzo Bianconi
c1cd734c1b selftests/bpf: Improve error logs in XDP compliance test tool
Improve some error logs reported in the XDP compliance test tool.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/212fc5bd214ff706f6ef1acbe7272cf4d803ca9c.1678382940.git.lorenzo@kernel.org
2023-03-09 20:52:40 +01:00
Lorenzo Bianconi
27a36bc3cd selftests/bpf: Use ifname instead of ifindex in XDP compliance test tool
Rely on interface name instead of interface index in error messages or
logs from XDP compliance test tool.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/7dc5a8ff56c252b1a7ae29b059d0b2b1543c8b5d.1678382940.git.lorenzo@kernel.org
2023-03-09 20:52:30 +01:00
Michael Weiß
5a70f4a630 bpf: Fix a typo for BPF_F_ANY_ALIGNMENT in bpf.h
Fix s/BPF_PROF_LOAD/BPF_PROG_LOAD/ typo in the documentation comment
for BPF_F_ANY_ALIGNMENT in bpf.h.

Signed-off-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230309133823.944097-1-michael.weiss@aisec.fraunhofer.de
2023-03-09 20:42:57 +01:00
Martin KaFai Lau
a686557631 selftests/bpf: Fix flaky fib_lookup test
There is a report that the fib_lookup test is flaky when run in parallel,
a symptom of slowness or delay. An example:

Testing IPv6 stale neigh
set_lookup_params:PASS:inet_pton(IPV6_IFACE_ADDR) 0 nsec
test_fib_lookup:PASS:bpf_prog_test_run_opts 0 nsec
test_fib_lookup:FAIL:fib_lookup_ret unexpected fib_lookup_ret: actual 0 != expected 7
test_fib_lookup:FAIL:dmac not match unexpected dmac not match: actual 1 != expected 0
dmac expected 11:11:11:11:11:11 actual 00:00:00:00:00:00

[ Note that the "fib_lookup_ret unexpected fib_lookup_ret actual 0 ..."
  message has the expected and actual values reversed. This is also fixed
  in this patch. ]

One possibility is that the stale neigh entry being tested was marked
dead by the gc (in neigh_periodic_work). The default gc_stale_time sysctl
is 60s. This patch increases it to 15 mins, as sketched below.
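
The bump could look roughly like this in the test setup (a sketch
assuming the selftests' SYS() shell helper and the test's interface
name):

  /* 900s = 15 min; keep the stale neigh entry from being gc'ed mid-test */
  SYS(fail, "sysctl -wq net.ipv4.neigh.veth1.gc_stale_time=900");
  SYS(fail, "sysctl -wq net.ipv6.neigh.veth1.gc_stale_time=900");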

It also:

- fixes the reversed args (actual vs expected) in one of the
  ASSERT_EQ tests
- removes the nodad command arg when adding the v4 neigh entry, which
  currently triggers a warning.

Fixes: 168de02335 ("selftests/bpf: Add bpf_fib_lookup test")
Reported-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230309060244.3242491-1-martin.lau@linux.dev
2023-03-09 20:37:55 +01:00
Alexei Starovoitov
23e403b326 Merge branch 'BPF open-coded iterators'
Andrii Nakryiko says:

====================

Add support for open-coded (aka inline) iterators in the BPF world. This is
the next evolution of gradually allowing more powerful and less restrictive
looping and iteration capabilities in BPF programs.

We set up a framework for implementing all kinds of iterators (e.g., cgroup,
task, file, etc. iterators), but this patch set only implements the numbers
iterator, which is used to implement the ergonomic bpf_for() for-like
construct (see patches #4-#5). We also add bpf_for_each(), which is a generic
foreach-like construct that will work with any kind of open-coded iterator
implementation, as long as we stick with the
bpf_iter_<type>_{new,next,destroy}() naming pattern (which we now enforce on
the kernel side).

Patch #1 is preparatory refactoring that makes it easier to check for
special kfunc calls. Patch #2 adds iterator kfunc registration and
validation logic, which is mostly independent from the rest of the
open-coded iterator logic, so it is separated out for easier reviewing.

The meat of the verifier-side logic is in patch #3. Patch #4 implements the
numbers iterator. I kept them separate to have a clean reference for how to
integrate new iterator types (now even simpler to do than in v1 of this patch
set). Patch #5 adds bpf_for(), bpf_for_each(), and bpf_repeat() macros to
bpf_misc.h, and also adds yet another pyperf test variant, now with a
bpf_for() loop. Patch #6 is verification tests, based on the numbers iterator
(the only one available right now). Patch #7 actually tests the runtime
behavior of the numbers iterator.

Finally, with the changes in v2, it's possible and trivial to implement
custom iterators completely in kernel modules, which we showcase and test by
adding to bpf_testmod a simple iterator that returns the same number a given
number of times. Patch #8 is where all this happens and is tested.

Most of the relevant details are in corresponding commit messages or code
comments.

v4->v5:
  - fix missed inner for() in is_iter_reg_valid_uninit, and fix the
    'return false' (kernel test robot);
  - typo fixes and comment/commit description improvements throughout the
    patch set;
v3->v4:
  - remove unused variable from is_iter_reg_valid_init (kernel test robot);
v2->v3:
  - remove special kfunc leftovers for bpf_iter_num_{new,next,destroy};
  - add iters/testmod_seq* to DENYLIST.s390x, it doesn't support kfuncs in
    modules yet (CI);
v1->v2:
  - rebased on latest, dropping previously landed preparatory patches;
  - each iterator type now has its own `struct bpf_iter_<type>`, which allows
    each iterator implementation to use exactly as much stack space as
    necessary, avoiding runtime allocations (Alexei);
  - reworked how iterator kfuncs are defined, no verifier changes are required
    when adding new iterator type;
  - added bpf_testmod-based iterator implementation;
  - address the rest of feedback, comments, commit message adjustment, etc.

Cc: Tejun Heo <tj@kernel.org>
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-08 16:19:51 -08:00
Andrii Nakryiko
7e86a8c4ac selftests/bpf: implement and test custom testmod_seq iterator
Implement a trivial iterator returning the same specified integer value
N times as part of the bpf_testmod kernel module. Add selftests to validate
that everything works end to end.

We also reuse these tests as "verification-only" tests to validate that the
kernel prints the state of the custom kernel module-defined iterator
correctly:

  fp-16=iter_testmod_seq(ref_id=1,state=drained,depth=0)

"testmod_seq" part is an iterator type, and is coming from module's BTF
data dynamically at runtime.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230308184121.1165081-9-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-08 16:19:51 -08:00
Andrii Nakryiko
f59b146092 selftests/bpf: add number iterator tests
Add number iterator (bpf_iter_num_{new,next,destroy}()) tests,
validating the correct handling of various corner and common cases
*at runtime*.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230308184121.1165081-8-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-08 16:19:51 -08:00
Andrii Nakryiko
57400dcce6 selftests/bpf: add iterators tests
Add various tests for open-coded iterators. Some of them exercise
various possible coding patterns in C, some go down to low-level
assembly for more control over various conditions, especially invalid
ones.

We also make use of bpf_for(), bpf_for_each(), bpf_repeat() macros in
some of these tests.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230308184121.1165081-7-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-08 16:19:51 -08:00
Andrii Nakryiko
8c2b5e9050 selftests/bpf: add bpf_for_each(), bpf_for(), and bpf_repeat() macros
Add bpf_for_each(), bpf_for(), and bpf_repeat() macros that make writing
open-coded iterator-based loops much more convenient and natural. These
macros utilize the cleanup attribute to ensure proper destruction of the
iterator and, thanks to that, manage to provide ergonomics very close to
the C language's for() construct. A typical loop would look like:

  int i;
  int arr[N];

  bpf_for(i, 0, N) {
      /* verifier will know that i >= 0 && i < N, so could be used to
       * directly access array elements with no extra checks
       */
       arr[i] = i;
  }

bpf_repeat() is very similar, but it doesn't expose the iteration number
and is meant as a simple "repeat action N times" loop:

  bpf_repeat(N) { /* whatever, N times */ }

Note that `break` and `continue` statements inside the {} block work as
expected.
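
Under the hood, bpf_for() can be built on the cleanup attribute roughly
like this (a simplified sketch; the actual macro in bpf_misc.h handles
more details):

  #define bpf_for(i, start, end)                                       \
          /* ___it is destroyed on scope exit via the cleanup attr */  \
          for (struct bpf_iter_num ___it                               \
                  __attribute__((cleanup(bpf_iter_num_destroy))),      \
               /* ___p only exists to run the constructor once */      \
               *___p __attribute__((unused)) =                         \
                  (bpf_iter_num_new(&___it, (start), (end)),           \
                   (void *)0);                                         \
               ({ int *___t = bpf_iter_num_next(&___it);               \
                  ___t ? ((i) = *___t, 1) : 0; });                     \
               /* nothing */)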

bpf_for_each() is a generalization over any kind of BPF open-coded
iterator, allowing a for-each-like approach to be used instead of calling
the low-level bpf_iter_<type>_{new,next,destroy}() APIs explicitly. E.g.:

  struct cgroup *cg;

  bpf_for_each(cgroup, cg, some, input, args) {
      /* do something with each cg */
  }

would call (not-yet-implemented) bpf_iter_cgroup_{new,next,destroy}()
functions to form a loop over cgroups, where `some, input, args` are
passed verbatim into the constructor as

  bpf_iter_cgroup_new(&it, some, input, args).

As a first demonstration, add a pyperf variant based on the bpf_for() loop.

Also clean up a few tests that either included the bpf_misc.h header
unnecessarily from user-space, which is unsupported, or included it
before any common types were defined (thus potentially leading to
unnecessary compilation warnings).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230308184121.1165081-6-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-08 16:19:51 -08:00
Andrii Nakryiko
6018e1f407 bpf: implement numbers iterator
Implement the first open-coded iterator type over a range of integers.

Its public API consists of:
  - bpf_iter_num_new() constructor, which accepts a [start, end) range
    (that is, start is inclusive, end is exclusive).
  - bpf_iter_num_next(), which keeps returning a read-only pointer to an int
    until the range is exhausted, at which point NULL is returned. If
    bpf_iter_num_next() keeps being called after that, NULL will be
    persistently returned.
  - bpf_iter_num_destroy() destructor, which needs to be called at some
    point to clean up iterator state. The BPF verifier enforces that the
    iterator destructor is called at some point before the BPF program exits.

Note that `start = end = X` is a valid combination to set up an empty
iterator. bpf_iter_num_new() will return 0 (success) for any such
combination.

If bpf_iter_num_new() detects an invalid combination of input arguments,
it returns an error and resets the iterator state to, effectively, an
empty iterator, so any subsequent call to bpf_iter_num_next() will keep
returning NULL.

The BPF verifier has no knowledge that the returned integers are in the
[start, end) value range, as both `start` and `end` are not statically
known or enforced: they are runtime values.

While the implementation is pretty trivial, some care needs to be taken
to avoid overflows and underflows. Subsequent selftests will validate
correctness of [start, end) semantics, especially around extremes
(INT_MIN and INT_MAX).

Similarly to bpf_loop(), we enforce that no more than BPF_MAX_LOOPS can
be specified.

bpf_iter_num_{new,next,destroy}() is a logical evolution from bounded
BPF loops and the bpf_loop() helper, and is the basis for implementing
ergonomic BPF loops with no statically known or verified bounds.
Subsequent patches implement the bpf_for() macro, demonstrating how this
can be wrapped into something that works and feels like a normal for()
loop in the C language.
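
An illustrative model of the [start, end) semantics (a simplified
sketch, not the kernel's actual implementation, which also handles the
overflow corner cases mentioned above more carefully):

  struct iter_num_state {
          int cur; /* storage for the element handed out to the caller */
          int nxt; /* next value to produce */
          int end; /* exclusive upper bound */
  };

  static int iter_num_new(struct iter_num_state *s, int start, int end)
  {
          if (start > end || (long long)end - start > BPF_MAX_LOOPS) {
                  s->nxt = s->end = 0; /* empty iterator */
                  return -EINVAL;
          }
          s->nxt = start;
          s->end = end;
          return 0;
  }

  static int *iter_num_next(struct iter_num_state *s)
  {
          if (s->nxt >= s->end) /* exhausted: NULL now and forever */
                  return NULL;
          s->cur = s->nxt++;    /* copy out, then advance */
          return &s->cur;       /* stays valid until the next call */
  }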

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230308184121.1165081-5-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-08 16:19:51 -08:00
Andrii Nakryiko
06accc8779 bpf: add support for open-coded iterator loops
Teach the verifier about the concept of open-coded (or inline) iterators.

This patch adds generic iterator loop verification logic, a new STACK_ITER
stack slot type to contain iterator state, and the necessary kfunc plumbing
for an iterator's constructor, destructor, and next methods. The next patch
implements the first specific iterator (the numbers iterator for
implementing for() loop logic). Such a split allows more focused commits
for the verifier logic and a separate commit that we can later point to as
a demonstration of what it takes to add a new kind of iterator.

Each kind of iterator has its own associated struct bpf_iter_<type>,
where <type> denotes a specific type of iterator. struct bpf_iter_<type>
state is supposed to live on the BPF program stack, so there will be no way
to change its size later on without breaking backwards compatibility, so
choose wisely! But given this struct is specific to a given <type> of
iterator, this allows a lot of flexibility: simple iterators could be
fine with just one stack slot (8 bytes), like the numbers iterator in the
next patch, while some other more complicated iterators might need way
more to keep their iterator state. Either way, such a design avoids
runtime memory allocations, which would otherwise be necessary if we
fixed the on-stack size and it turned out to be too small for a given
iterator implementation.

The way the BPF verifier logic is implemented, there are no artificial
restrictions on the number of active iterators; it should work correctly
with multiple active iterators at the same time. This also means you
can have multiple nested iteration loops. A struct bpf_iter_<type>
reference can be safely passed to subprograms as well.

The general flow is easiest to demonstrate with a simple example using the
numbers iterator implemented in the next patch. Here's the simplest
possible loop:

  struct bpf_iter_num it;
  int *v;

  bpf_iter_num_new(&it, 2, 5);
  while ((v = bpf_iter_num_next(&it))) {
      bpf_printk("X = %d", *v);
  }
  bpf_iter_num_destroy(&it);

The above snippet should output "X = 2", "X = 3", "X = 4". Note that 5 is
exclusive and is not returned. This matches similar APIs (e.g., slices
in Go or Rust) that implement a range of elements, where the end index is
non-inclusive.

In the above example, we see a trio of functions:
  - the constructor, bpf_iter_num_new(), which initializes iterator state
  (struct bpf_iter_num it) on the stack. If any of the input arguments
  are invalid, the constructor should make sure to still initialize it such
  that subsequent bpf_iter_num_next() calls will return NULL. I.e., on
  error, return an error and construct an empty iterator.
  - the next method, bpf_iter_num_next(), which accepts a pointer to the
  iterator state and produces an element. The next method should always
  return a pointer. The contract with the BPF verifier is that the next
  method will always eventually return NULL when elements are exhausted.
  Once NULL is returned, subsequent next calls should keep returning NULL.
  In the case of the numbers iterator, bpf_iter_num_next() returns a pointer
  to an int (storage for this integer is inside the iterator state itself),
  which can be dereferenced after the corresponding NULL check.
  - once done with the iterator, the user is mandated to clean up its
  state with a call to the destructor, bpf_iter_num_destroy() in this
  case. The destructor frees up any resources and marks the stack space
  used by struct bpf_iter_num as usable for something else.

Any other iterator implementation will have to implement at least these
three methods. It is enforced that for any given type of iterator only the
applicable constructor/destructor/next methods are callable. I.e., the
verifier ensures you can't pass the numbers iterator's state into, say,
the cgroup iterator's next method.

It is important to keep the naming pattern consistent to be able to
create generic macros that help with BPF iter usability. E.g., one
of the follow-up patches adds a generic bpf_for_each() macro to bpf_misc.h
in selftests, which allows the iterator "trio" to be utilized nicely
without having to code the above somewhat tedious loop explicitly every
time. This is enforced at the kfunc registration point by one of the
previous patches in this series.

At the implementation level, iterator state tracking for verification
purposes is very similar to dynptr. We add a STACK_ITER stack slot type,
reserve the necessary number of slots, depending on
sizeof(struct bpf_iter_<type>), and keep track of the necessary extra state
in the "main" slot, which is marked with a non-zero ref_obj_id. Other
slots are also marked as STACK_ITER, but have a zero ref_obj_id. This is
simpler than having a separate "is_first_slot" flag.

Another big distinction is that STACK_ITER is *always refcounted*, which
simplifies implementation without sacrificing usability. So no need for
extra "iter_id", no need to anticipate reuse of STACK_ITER slots for new
constructors, etc. Keeping it simple here.

As far as the verification logic goes, there are two extensive comments:
in process_iter_next_call() and iter_active_depths_differ() explaining
some important and sometimes subtle aspects. Please refer to them for
details.

But from a 10,000-foot point of view, next methods are the points of
forking a verification state, which is conceptually similar to what the
verifier does when validating a conditional jump. We branch out at a
`call bpf_iter_<type>_next` instruction and simulate two outcomes: NULL
(iteration is done) and non-NULL (a new element is returned). NULL is
simulated first and is supposed to reach exit without looping. After
that, the non-NULL case is validated and it either reaches exit (for
trivial examples with no real loop), or reaches another
`call bpf_iter_<type>_next` instruction with a state equivalent to an
already (partially) validated one. State equivalency at that point means
we are technically going to be looping forever without "breaking out" of
the established "state envelope" (i.e., subsequent iterations don't add
any new knowledge or constraints to the verifier state, so running 1, 2,
10, or a million iterations doesn't matter). But taking into account the
contract stating that the iterator's next method *has to* return NULL
eventually, we can conclude that the loop body is safe and will
eventually terminate. Given we validated the logic outside of the loop
(the NULL case), and concluded that the loop body is safe (though
potentially looping many times), the verifier can claim safety of the
overall program logic.
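
In pseudocode (names here are illustrative, not the actual verifier
internals):

  /* at a `call bpf_iter_<type>_next` instruction: */
  /* branch A: pretend NULL was returned; must reach exit w/o looping */
  explore(state_with_null_retval);
  /* branch B: pretend a new element was returned; accepted once it
   * loops back to an equivalent, already (partially) verified state
   */
  explore(state_with_non_null_retval);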

The rest of the patch is necessary plumbing for state tracking, marking,
validation, and necessary further kfunc plumbing to allow implementing
iterator constructor, destructor, and next methods.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230308184121.1165081-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-08 16:19:50 -08:00
Andrii Nakryiko
215bf4962f bpf: add iterator kfuncs registration and validation logic
Add the ability to register kfuncs that implement the BPF open-coded
iterator contract, and enforce the naming and function proto conventions.
Enforcement happens at the time of kfunc registration and significantly
simplifies the rest of the iterators logic in the verifier.

More details follow in subsequent patches, but we enforce the following
conditions.

All kfuncs (constructor, next, destructor) have to be named consistently
as bpf_iter_<type>_{new,next,destroy}(), respectively. <type> represents
the iterator type, and the iterator state should be represented as a
matching `struct bpf_iter_<type>` state type. Also, all iter kfuncs should
have a pointer to this `struct bpf_iter_<type>` as the very first argument.

Additionally:
  - Constructor, i.e., bpf_iter_<type>_new(), can have an arbitrary number
  of extra arguments. The return type is not enforced either.
  - Next method, i.e., bpf_iter_<type>_next(), has to return a pointer
  type and should have exactly one argument: `struct bpf_iter_<type> *`
  (const/volatile/restrict and typedefs are ignored).
  - Destructor, i.e., bpf_iter_<type>_destroy(), should return void and
  should have exactly one argument, similar to the next method.
  - struct bpf_iter_<type> size is enforced to be positive and
  a multiple of 8 bytes (to fit stack slots correctly).

Such strictness and consistency allow us to build generic helpers
abstracting away important, but boilerplate, details, so that open-coded
iterators can be used effectively and ergonomically (see bpf_for_each()
in subsequent patches). It also simplifies the verifier logic in some
places. At the same time, this doesn't hurt the generality of possible
iterator implementations. Win-win.

The constructor kfunc is marked with a new KF_ITER_NEW flag, the next
method is marked with KF_ITER_NEXT (and should also have KF_RET_NULL, of
course), while the destructor kfunc is marked as KF_ITER_DESTROY.
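
For example, the numbers iterator added later in the series would be
registered along these lines (sketch):

  BTF_ID_FLAGS(func, bpf_iter_num_new, KF_ITER_NEW)
  BTF_ID_FLAGS(func, bpf_iter_num_next, KF_ITER_NEXT | KF_RET_NULL)
  BTF_ID_FLAGS(func, bpf_iter_num_destroy, KF_ITER_DESTROY)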

Additionally, we add a trivial kfunc name validation: it should be
a valid non-NULL and non-empty string.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230308184121.1165081-3-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-08 16:19:50 -08:00
Andrii Nakryiko
07236eab7a bpf: factor out fetching basic kfunc metadata
Factor out the logic to fetch basic kfunc metadata based on struct
bpf_insn. This is not exactly short or trivial code to just copy/paste,
and this information is sometimes necessary in other parts of the verifier
logic. Subsequent patches will rely on this to determine whether an
instruction is a kfunc call to an iterator's next method.

No functional changes intended, including the verbose() warning
behavior when a kfunc is not allowed for a particular program type.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20230308184121.1165081-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-03-08 16:19:50 -08:00
Jakub Kicinski
ed69e0667d Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Andrii Nakryiko says:

====================
pull-request: bpf-next 2023-03-08

We've added 23 non-merge commits during the last 2 day(s) which contain
a total of 28 files changed, 414 insertions(+), 104 deletions(-).

The main changes are:

1) Add more precise memory usage reporting for all BPF map types,
   from Yafang Shao.

2) Add ARM32 USDT support to libbpf, from Puranjay Mohan.

3) Fix BTF_ID_LIST size causing problems in !CONFIG_DEBUG_INFO_BTF,
   from Nathan Chancellor.

4) IMA selftests fix, from Roberto Sassu.

5) libbpf fix in APK support code, from Daniel Müller.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (23 commits)
  selftests/bpf: Fix IMA test
  libbpf: USDT arm arg parsing support
  libbpf: Refactor parse_usdt_arg() to re-use code
  libbpf: Fix theoretical u32 underflow in find_cd() function
  bpf: enforce all maps having memory usage callback
  bpf: offload map memory usage
  bpf, net: xskmap memory usage
  bpf, net: sock_map memory usage
  bpf, net: bpf_local_storage memory usage
  bpf: local_storage memory usage
  bpf: bpf_struct_ops memory usage
  bpf: queue_stack_maps memory usage
  bpf: devmap memory usage
  bpf: cpumap memory usage
  bpf: bloom_filter memory usage
  bpf: ringbuf memory usage
  bpf: reuseport_array memory usage
  bpf: stackmap memory usage
  bpf: arraymap memory usage
  bpf: hashtab memory usage
  ...
====================

Link: https://lore.kernel.org/r/20230308193533.1671597-1-andrii@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-08 14:34:22 -08:00
Roberto Sassu
12fabae03c selftests/bpf: Fix IMA test
Commit 62622dab0a ("ima: return IMA digest value only when IMA_COLLECTED
flag is set") caused bpf_ima_inode_hash() to refuse to give out non-fresh
digests. IMA test #3 assumed the old behavior, where bpf_ima_inode_hash()
also returned non-fresh digests.

Correct the test by accepting both cases. If one sample is returned,
assume that the commit above is applied and that the returned digest is
fresh. If two samples are returned, assume that the commit above is not
applied, and check both the non-fresh and fresh digests.
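
In the test this boils down to accepting either sample count (a sketch,
assuming the test's existing sample counter):

  /* 1 sample: only the fresh digest; 2: non-fresh first, then fresh */
  ASSERT_TRUE(samples == 1 || samples == 2, "sample_count");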

Fixes: 62622dab0a ("ima: return IMA digest value only when IMA_COLLECTED flag is set")
Reported-by: David Vernet <void@manifault.com>
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Matt Bobrowski <mattbobrowski@google.com>
Link: https://lore.kernel.org/bpf/20230308103713.1681200-1-roberto.sassu@huaweicloud.com
2023-03-08 11:15:39 -08:00
Eric Dumazet
1036908045 net: reclaim skb->scm_io_uring bit
Commit 0091bfc817 ("io_uring/af_unix: defer registered
files gc to io_uring release") added one bit to struct sk_buff.

This structure is critical for networking, and we try very hard
not to bloat it unless absolutely required.

For instance, we can use a specific destructor as a wrapper
around unix_destruct_scm() to identify the skbs that unix_gc()
has to special-case.
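
A sketch of the wrapper approach (illustrative, not the exact patch):

  static void io_uring_destruct_scm(struct sk_buff *skb)
  {
          unix_destruct_scm(skb);
  }

  /* unix_gc() can then special-case these skbs by destructor identity,
   * i.e. skb->destructor == io_uring_destruct_scm, instead of burning
   * an sk_buff bit
   */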

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Cc: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-08 13:21:47 +00:00
David S. Miller
b3f4cd07df Merge branch 'sparx5-tc-flower-templates'
Steen Hegelund says:

====================
Add support for TC flower templates in Sparx5

This adds support for the TC template mechanism in the Sparx5 flower filter
implementation.

Templates as such are handled by the TC framework, but when a template is
created (using a chain id) there are by definition no filters on that
chain (an error will be returned if there are any).

If the template's chain id is one that is represented by a VCAP lookup,
then when the template is created, we know that it is safe to use the keys
provided in the template to change the keyset configuration for the (port,
lookup) combination, if this is needed to improve the match on the
template.

The original port keyset configuration is captured in the template state
information which is kept per port, so that when the template is deleted
the port keyset configuration can be restored to its previous setting.

The template also provides the protocol parameter which is the basic
information that is used to find out which port keyset configuration needs
to be changed.

The VCAPs and lookups are slightly different when it comes to which keys,
keysets and protocols are supported and used for selection, so in some
cases a bit of tweaking is needed to find a useful match. This is done by
e.g. removing a key that prevents the best matching keyset from being
selected.

The debugfs output that is provided for a port allows inspection of the
currently used keyset in each of the VCAP lookups. So when a template has
been created, the debugfs output allows you to verify whether the keyset
configuration has been changed successfully.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-08 13:19:44 +00:00
Steen Hegelund
e1d597ecbe net: microchip: sparx5: Add TC template support
This adds support for using the "template add" and "template destroy"
functionality to change the port keyset configuration.

If the VCAP lookup already contains rules, the port keyset is left
unchanged, as a change would make these rules unusable.

When the template is destroyed the port keyset configuration is restored.
The filters using the template chain will automatically be deleted by the
TC framework.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-08 13:19:43 +00:00
Steen Hegelund
d9f175b0df net: microchip: sparx5: Add port keyset changing functionality
With this, it is now possible for clients (like TC) to change the port
keyset configuration in the Sparx5 VCAPs.

This is typically done per traffic class, guided by the L3 protocol
information.
Before the change, the current keyset configuration is collected in a list
that is handed back to the client.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-08 13:19:43 +00:00
Steen Hegelund
1c14432dce net: microchip: sparx5: Add TC template list to a port
This adds a list that is used to collect the templates that are active on a
port.

This allows the template creation to change the port configuration
and the template destruction to change it back.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-08 13:19:43 +00:00
Steen Hegelund
bfcb94aacc net: microchip: sparx5: Provide rule count, key removal and keyset select
This provides these 3 functions in the VCAP API:

- Count the number of rules in a VCAP lookup (chain)
- Remove a key from a VCAP rule
- Find the keyset that gives the smallest rule list from a list of keysets

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-08 13:19:43 +00:00
Steen Hegelund
fbd3dce958 net: microchip: sparx5: Correct the spelling of the keysets in debugfs
Correct the name used in the debugfs output.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-08 13:19:43 +00:00
Arınç ÜNAL
7d8c48917a dt-bindings: net: dsa: mediatek,mt7530: change some descriptions to literal
The line endings must be preserved on gpio-controller, io-supply, and
reset-gpios properties to look proper when the YAML file is parsed.

Currently each description is interpreted as a single line when parsed.
Change the style of the descriptions of these properties to literal style
to preserve the line endings.

Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-08 13:05:37 +00:00
Jiapeng Chong
ecf729f93b emulex/benet: clean up some inconsistent indenting
No functional modification involved.

drivers/net/ethernet/emulex/benet/be_cmds.c:1120 be_cmd_pmac_add() warn: inconsistent indenting.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4396
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-08 13:02:07 +00:00
Gustavo A. R. Silva
966b6b809f net/mlx4_en: Replace fake flex-array with flexible-array member
Zero-length arrays as fake flexible arrays are deprecated and we are
moving towards adopting C99 flexible-array members instead.

Transform zero-length array into flexible-array member in struct
mlx4_en_rx_desc.
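
The transformation itself follows this pattern (sketch):

  struct mlx4_en_rx_desc {
          /* before: struct mlx4_wqe_data_seg data[0]; */
          struct mlx4_wqe_data_seg data[]; /* C99 flexible-array member */
  };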

Address the following warnings found with GCC-13 and
-fstrict-flex-arrays=3 enabled:
drivers/net/ethernet/mellanox/mlx4/en_rx.c:88:30: warning: array subscript i is outside array bounds of ‘struct mlx4_wqe_data_seg[0]’ [-Warray-bounds=]
drivers/net/ethernet/mellanox/mlx4/en_rx.c:149:30: warning: array subscript 0 is outside array bounds of ‘struct mlx4_wqe_data_seg[0]’ [-Warray-bounds=]
drivers/net/ethernet/mellanox/mlx4/en_rx.c:127:30: warning: array subscript i is outside array bounds of ‘struct mlx4_wqe_data_seg[0]’ [-Warray-bounds=]
drivers/net/ethernet/mellanox/mlx4/en_rx.c:128:30: warning: array subscript i is outside array bounds of ‘struct mlx4_wqe_data_seg[0]’ [-Warray-bounds=]
drivers/net/ethernet/mellanox/mlx4/en_rx.c:129:30: warning: array subscript i is outside array bounds of ‘struct mlx4_wqe_data_seg[0]’ [-Warray-bounds=]
drivers/net/ethernet/mellanox/mlx4/en_rx.c:117:30: warning: array subscript i is outside array bounds of ‘struct mlx4_wqe_data_seg[0]’ [-Warray-bounds=]
drivers/net/ethernet/mellanox/mlx4/en_rx.c:119:30: warning: array subscript i is outside array bounds of ‘struct mlx4_wqe_data_seg[0]’ [-Warray-bounds=]

This helps with the ongoing efforts to tighten the FORTIFY_SOURCE
routines on memcpy() and helps us make progress towards globally
enabling -fstrict-flex-arrays=3 [1].

Link: https://github.com/KSPP/linux/issues/21
Link: https://github.com/KSPP/linux/issues/264
Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html [1]
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-08 09:53:49 +00:00
David S. Miller
db067ef342 Merge branch 'r8169-disable-ASPM-during-NAPI-poll'
Heiner Kallweit says:

====================
r8169: disable ASPM during NAPI poll

This is a rework of ideas from Kai-Heng on how to avoid the known
ASPM issues whilst still allowing for a maximum of ASPM-related power
savings. As a prerequisite some locking is added first.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-08 09:31:31 +00:00
Heiner Kallweit
2ab19de62d r8169: remove ASPM restrictions now that ASPM is disabled during NAPI poll
Now that ASPM is disabled during NAPI poll, we can remove all ASPM
restrictions. This allows for higher power savings if the network
isn't fully loaded.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-08 09:30:41 +00:00
Heiner Kallweit
e1ed3e4d91 r8169: disable ASPM during NAPI poll
Several chip versions have problems with ASPM, which may result in
rx_missed errors or tx timeouts. The root cause isn't known, but
experience shows that disabling ASPM during NAPI poll can avoid
these problems.

Suggested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-08 09:30:41 +00:00
Heiner Kallweit
49ef7d846d r8169: prepare rtl_hw_aspm_clkreq_enable for usage in atomic context
Bail out if the function is used with chip versions that don't support
ASPM configuration. In addition, remove the delay; it turned out that
it's not needed, and the vendor driver r8125 doesn't have it either.

Suggested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-08 09:30:41 +00:00
Heiner Kallweit
59ee97c0c1 r8169: enable cfg9346 config register access in atomic context
For disabling ASPM during NAPI poll we'll have to unlock access
to the config registers in atomic context. Other code paths that
run with config register access unlocked are partly longer and
can sleep. Add a usage counter to enable parallel execution of
code parts requiring unlocked config registers.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-08 09:30:41 +00:00
Heiner Kallweit
6bc6c4e689 r8169: use spinlock to protect access to registers Config2 and Config5
For disabling ASPM during NAPI poll we'll have to access both registers
in atomic context. Use a spinlock to protect access.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-08 09:30:41 +00:00
Heiner Kallweit
91c8643578 r8169: use spinlock to protect mac ocp register access
For disabling ASPM during NAPI poll we'll have to access mac ocp
registers in atomic context. This could result in races because
a mac ocp read consists of a write to register OCPDR, followed
by a read from the same register. Therefore add a spinlock to
protect access to mac ocp registers.
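
A sketch of the read path under the new lock (the lock field name is
illustrative; RTL_W32()/RTL_R32() are the driver's register accessors):

  static u16 r8168_mac_ocp_read(struct rtl8169_private *tp, u32 reg)
  {
          unsigned long flags;
          u16 val;

          spin_lock_irqsave(&tp->mac_ocp_lock, flags);
          RTL_W32(tp, OCPDR, reg << 15); /* select the mac ocp register */
          val = RTL_R32(tp, OCPDR);      /* read its value */
          spin_unlock_irqrestore(&tp->mac_ocp_lock, flags);

          return val;
  }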

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-08 09:30:41 +00:00
Vadim Fedorenko
8ca5a5790b net-timestamp: extend SOF_TIMESTAMPING_OPT_ID to HW timestamps
When the feature was added, it was enabled for SW timestamps only, but
with current hardware the same out-of-order timestamps can be seen.
Let's extend the feature to all types of timestamps.

Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-08 09:27:14 +00:00
Gustavo A. R. Silva
2549347972 netxen_nic: Replace fake flex-array with flexible-array member
Zero-length arrays as fake flexible arrays are deprecated and we are
moving towards adopting C99 flexible-array members instead.

Transform zero-length array into flexible-array member in struct
nx_cardrsp_rx_ctx_t.

Address the following warnings found with GCC-13 and
-fstrict-flex-arrays=3 enabled:
drivers/net/ethernet/qlogic/netxen/netxen_nic_ctx.c:361:26: warning: array subscript <unknown> is outside array bounds of ‘char[0]’ [-Warray-bounds=]
drivers/net/ethernet/qlogic/netxen/netxen_nic_ctx.c:372:25: warning: array subscript <unknown> is outside array bounds of ‘char[0]’ [-Warray-bounds=]

This helps with the ongoing efforts to tighten the FORTIFY_SOURCE
routines on memcpy() and helps us make progress towards globally
enabling -fstrict-flex-arrays=3 [1].

Link: https://github.com/KSPP/linux/issues/21
Link: https://github.com/KSPP/linux/issues/265
Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html [1]
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/ZAZ57I6WdQEwWh7v@work
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-08 00:16:58 -08:00
Heiner Kallweit
4310e2f420 net: phy: smsc: simplify lan95xx_config_aneg_ext
lan95xx_config_aneg_ext() can be simplified by using phy_set_bits().

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/3da785c7-3ef8-b5d3-89a0-340f550be3c2@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-07 23:57:32 -08:00
Eric Dumazet
40bbae583e net: remove enum skb_free_reason
enum skb_drop_reason is more generic; we can adopt it instead.

Provide dev_kfree_skb_irq_reason() and dev_kfree_skb_any_reason().

This means drivers can use more precise drop reasons if they want to.
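
The old helpers then become thin wrappers (sketch):

  static inline void dev_kfree_skb_irq(struct sk_buff *skb)
  {
          dev_kfree_skb_irq_reason(skb, SKB_DROP_REASON_NOT_SPECIFIED);
  }

  static inline void dev_consume_skb_irq(struct sk_buff *skb)
  {
          dev_kfree_skb_irq_reason(skb, SKB_CONSUMED);
  }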

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Link: https://lore.kernel.org/r/20230306204313.10492-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-07 23:57:19 -08:00
Heiner Kallweit
0194b64578 net: phy: improve phy_read_poll_timeout
cond is sometimes (val & MASK), which may result in a false positive
if val is a negative errno. We shouldn't evaluate cond if val < 0.
This has no functional impact here, but it's not nice.
Therefore switch the order of the checks.
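
The essence of the change (sketch; see the full macro in phy.h):

  /* before: cond could observe a negative errno in val */
  read_poll_timeout(phy_read, val, (cond) || val < 0, ...);
  /* after: bail out on a read error before evaluating cond */
  read_poll_timeout(phy_read, val, val < 0 || (cond), ...);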

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/6d8274ac-4344-23b4-d9a3-cad4c39517d4@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-07 18:19:09 -08:00
Andrii Nakryiko
d1d51a62d0 Merge branch 'libbpf: usdt arm arg parsing support'
Puranjay Mohan says:

====================

This series adds support for the ARM architecture to libbpf USDT. This
involves implementing the parse_usdt_arg() function for ARM.

It was seen that the last part of parse_usdt_arg() is repeated for all
architectures, so the first patch in this series refactors these functions
and moves the post-processing to parse_usdt_spec().

Changes in V2[1] to V3:

- Use a tabular approach to find register offsets.
- Add the patch for refactoring parse_usdt_arg()
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2023-03-07 15:35:56 -08:00
Puranjay Mohan
720d93b60a libbpf: USDT arm arg parsing support
Parsing of USDT arguments is architecture-specific; on arm it is
relatively easy since the registers used are r[0-10], fp, ip, sp, lr,
and pc. The format is slightly different compared to aarch64; the forms
are:

- "size @ [ reg, #offset ]" for dereferences, for example
  "-8 @ [ sp, #76 ]" ; " -4 @ [ sp ]"
- "size @ reg" for register values; for example
  "-4@r0"
- "size @ #value" for raw values; for example
  "-8@#1"

Add support for parsing USDT arguments for ARM architecture.

To test the above changes, QEMU's virt[1] board with a cortex-a15
CPU was used. libbpf-bootstrap's usdt example[2] was modified to attach
to a test program with DTRACE_PROBE1/2/3/4... probes to test different
combinations.

[1] https://www.qemu.org/docs/master/system/arm/virt.html
[2] https://github.com/libbpf/libbpf-bootstrap/blob/master/examples/c/usdt.bpf.c

Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230307120440.25941-3-puranjay12@gmail.com
2023-03-07 15:35:53 -08:00
Puranjay Mohan
98e678e9bc libbpf: Refactor parse_usdt_arg() to re-use code
The parse_usdt_arg() function is defined differently for each
architecture, but the last part of the function is repeated
verbatim for every architecture.

Refactor parse_usdt_arg() to fill the arg_sz and then do the repeated
post-processing in parse_usdt_spec().

Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230307120440.25941-2-puranjay12@gmail.com
2023-03-07 15:35:05 -08:00
Daniel Müller
3ecde2182a libbpf: Fix theoretical u32 underflow in find_cd() function
Coverity reported a potential underflow of the offset variable used in
the find_cd() function. Switch to using a signed 64-bit integer for the
representation of the offset, to make sure we can never underflow.

Fixes: 1eebcb6063 ("libbpf: Implement basic zip archive parsing support")
Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230307215504.837321-1-deso@posteo.net
2023-03-07 15:30:47 -08:00