Commit Graph

1009512 Commits

Author SHA1 Message Date
Jussi Maki
096eccdef0 selftests/bpf: Rewrite test_tc_redirect.sh as prog_tests/tc_redirect.c
As discussed in [0], this ports test_tc_redirect.sh to the test_progs
framework and removes the old test.

This makes it more in line with rest of the tests and makes it possible
to run this test case with vmtest.sh and under the bpf CI.

The upcoming skb_change_head() helper fix in [0] is depending on it and
extending the test case to redirect a packet from L3 device to veth.

  [0] https://lore.kernel.org/bpf/20210427135550.807355-1-joamaki@gmail.com

Signed-off-by: Jussi Maki <joamaki@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20210505085925.783985-1-joamaki@gmail.com
2021-05-11 23:15:43 +02:00
Arnaldo Carvalho de Melo
67e7ec0bd4 libbpf: Provide GELF_ST_VISIBILITY() define for older libelf
Where that macro isn't available.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/YJaspEh0qZr4LYOc@kernel.org
2021-05-11 23:07:33 +02:00
Florent Revest
e2d5b2bb76 bpf: Fix nested bpf_bprintf_prepare with more per-cpu buffers
The bpf_seq_printf, bpf_trace_printk and bpf_snprintf helpers share one
per-cpu buffer that they use to store temporary data (arguments to
bprintf). They "get" that buffer with try_get_fmt_tmp_buf and "put" it
by the end of their scope with bpf_bprintf_cleanup.

If one of these helpers gets called within the scope of one of these
helpers, for example: a first bpf program gets called, uses
bpf_trace_printk which calls raw_spin_lock_irqsave which is traced by
another bpf program that calls bpf_snprintf, then the second "get"
fails. Essentially, these helpers are not re-entrant. They would return
-EBUSY and print a warning message once.

This patch triples the number of bprintf buffers to allow three levels
of nesting. This is very similar to what was done for tracepoints in
"9594dc3c7e7 bpf: fix nested bpf tracepoints with per-cpu data"

Fixes: d9c9e4db18 ("bpf: Factorize bpf_trace_printk and bpf_seq_printf")
Reported-by: syzbot+63122d0bc347f18c1884@syzkaller.appspotmail.com
Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210511081054.2125874-1-revest@chromium.org
2021-05-11 14:02:33 -07:00
Jiri Olsa
35e3815fa8 bpf: Add deny list of btf ids check for tracing programs
The recursion check in __bpf_prog_enter and __bpf_prog_exit
leaves some (not inlined) functions unprotected:

In __bpf_prog_enter:
  - migrate_disable is called before prog->active is checked

In __bpf_prog_exit:
  - migrate_enable,rcu_read_unlock_strict are called after
    prog->active is decreased

When attaching trampoline to them we get panic like:

  traps: PANIC: double fault, error_code: 0x0
  double fault: 0000 [#1] SMP PTI
  RIP: 0010:__bpf_prog_enter+0x4/0x50
  ...
  Call Trace:
   <IRQ>
   bpf_trampoline_6442466513_0+0x18/0x1000
   migrate_disable+0x5/0x50
   __bpf_prog_enter+0x9/0x50
   bpf_trampoline_6442466513_0+0x18/0x1000
   migrate_disable+0x5/0x50
   __bpf_prog_enter+0x9/0x50
   bpf_trampoline_6442466513_0+0x18/0x1000
   migrate_disable+0x5/0x50
   __bpf_prog_enter+0x9/0x50
   bpf_trampoline_6442466513_0+0x18/0x1000
   migrate_disable+0x5/0x50
   ...

Fixing this by adding deny list of btf ids for tracing
programs and checking btf id during program verification.
Adding above functions to this list.

Suggested-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210429114712.43783-1-jolsa@kernel.org
2021-05-11 14:00:53 -07:00
Daniel Borkmann
08389d8882 bpf: Add kconfig knob for disabling unpriv bpf by default
Add a kconfig knob which allows for unprivileged bpf to be disabled by default.
If set, the knob sets /proc/sys/kernel/unprivileged_bpf_disabled to value of 2.

This still allows a transition of 2 -> {0,1} through an admin. Similarly,
this also still keeps 1 -> {1} behavior intact, so that once set to permanently
disabled, it cannot be undone aside from a reboot.

We've also added extra2 with max of 2 for the procfs handler, so that an admin
still has a chance to toggle between 0 <-> 2.

Either way, as an additional alternative, applications can make use of CAP_BPF
that we added a while ago.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/74ec548079189e4e4dffaeb42b8987bb3c852eee.1620765074.git.daniel@iogearbox.net
2021-05-11 13:56:16 -07:00
Daniel Borkmann
b24abcff91 bpf, kconfig: Add consolidated menu entry for bpf with core options
Right now, all core BPF related options are scattered in different Kconfig
locations mainly due to historic reasons. Moving forward, lets add a proper
subsystem entry under ...

  General setup  --->
    BPF subsystem  --->

... in order to have all knobs in a single location and thus ease BPF related
configuration. Networking related bits such as sockmap are out of scope for
the general setup and therefore better suited to remain in net/Kconfig.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/f23f58765a4d59244ebd8037da7b6a6b2fb58446.1620765074.git.daniel@iogearbox.net
2021-05-11 13:56:16 -07:00
Andrii Nakryiko
04ea3086c4 bpf: Prevent writable memory-mapping of read-only ringbuf pages
Only the very first page of BPF ringbuf that contains consumer position
counter is supposed to be mapped as writeable by user-space. Producer
position is read-only and can be modified only by the kernel code. BPF ringbuf
data pages are read-only as well and are not meant to be modified by
user-code to maintain integrity of per-record headers.

This patch allows to map only consumer position page as writeable and
everything else is restricted to be read-only. remap_vmalloc_range()
internally adds VM_DONTEXPAND, so all the established memory mappings can't be
extended, which prevents any future violations through mremap()'ing.

Fixes: 457f44363a ("bpf: Implement BPF ring buffer and verifier support for it")
Reported-by: Ryota Shiga (Flatt Security)
Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-05-11 13:31:10 +02:00
Thadeu Lima de Souza Cascardo
4b81ccebae bpf, ringbuf: Deny reserve of buffers larger than ringbuf
A BPF program might try to reserve a buffer larger than the ringbuf size.
If the consumer pointer is way ahead of the producer, that would be
successfully reserved, allowing the BPF program to read or write out of
the ringbuf allocated area.

Reported-by: Ryota Shiga (Flatt Security)
Fixes: 457f44363a ("bpf: Implement BPF ring buffer and verifier support for it")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-05-11 13:30:45 +02:00
Daniel Borkmann
049c4e1371 bpf: Fix alu32 const subreg bound tracking on bitwise operations
Fix a bug in the verifier's scalar32_min_max_*() functions which leads to
incorrect tracking of 32 bit bounds for the simulation of and/or/xor bitops.
When both the src & dst subreg is a known constant, then the assumption is
that scalar_min_max_*() will take care to update bounds correctly. However,
this is not the case, for example, consider a register R2 which has a tnum
of 0xffffffff00000000, meaning, lower 32 bits are known constant and in this
case of value 0x00000001. R2 is then and'ed with a register R3 which is a
64 bit known constant, here, 0x100000002.

What can be seen in line '10:' is that 32 bit bounds reach an invalid state
where {u,s}32_min_value > {u,s}32_max_value. The reason is scalar32_min_max_*()
delegates 32 bit bounds updates to scalar_min_max_*(), however, that really
only takes place when both the 64 bit src & dst register is a known constant.
Given scalar32_min_max_*() is intended to be designed as closely as possible
to scalar_min_max_*(), update the 32 bit bounds in this situation through
__mark_reg32_known() which will set all {u,s}32_{min,max}_value to the correct
constant, which is 0x00000000 after the fix (given 0x00000001 & 0x00000002 in
32 bit space). This is possible given var32_off already holds the final value
as dst_reg->var_off is updated before calling scalar32_min_max_*().

Before fix, invalid tracking of R2:

  [...]
  9: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,smin_value=-9223372036854775807 (0x8000000000000001),smax_value=9223372032559808513 (0x7fffffff00000001),umin_value=1,umax_value=0xffffffff00000001,var_off=(0x1; 0xffffffff00000000),s32_min_value=1,s32_max_value=1,u32_min_value=1,u32_max_value=1) R3_w=inv4294967298 R10=fp0
  9: (5f) r2 &= r3
  10: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,smin_value=0,smax_value=4294967296 (0x100000000),umin_value=0,umax_value=0x100000000,var_off=(0x0; 0x100000000),s32_min_value=1,s32_max_value=0,u32_min_value=1,u32_max_value=0) R3_w=inv4294967298 R10=fp0
  [...]

After fix, correct tracking of R2:

  [...]
  9: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,smin_value=-9223372036854775807 (0x8000000000000001),smax_value=9223372032559808513 (0x7fffffff00000001),umin_value=1,umax_value=0xffffffff00000001,var_off=(0x1; 0xffffffff00000000),s32_min_value=1,s32_max_value=1,u32_min_value=1,u32_max_value=1) R3_w=inv4294967298 R10=fp0
  9: (5f) r2 &= r3
  10: R0_w=inv1337 R1=ctx(id=0,off=0,imm=0) R2_w=inv(id=0,smin_value=0,smax_value=4294967296 (0x100000000),umin_value=0,umax_value=0x100000000,var_off=(0x0; 0x100000000),s32_min_value=0,s32_max_value=0,u32_min_value=0,u32_max_value=0) R3_w=inv4294967298 R10=fp0
  [...]

Fixes: 3f50f132d8 ("bpf: Verifier, do explicit ALU32 bounds tracking")
Fixes: 2921c90d47 ("bpf: Fix a verifier failure with xor")
Reported-by: Manfred Paul (@_manfp)
Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-05-11 08:55:53 +02:00
Jiri Olsa
31379397dc bpf: Forbid trampoline attach for functions with variable arguments
We can't currently allow to attach functions with variable arguments.
The problem is that we should save all the registers for arguments,
which is probably doable, but if caller uses more than 6 arguments,
we need stack data, which will be wrong, because of the extra stack
frame we do in bpf trampoline, so we could crash.

Also currently there's malformed trampoline code generated for such
functions at the moment as described in:

  https://lore.kernel.org/bpf/20210429212834.82621-1-jolsa@kernel.org/

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210505132529.401047-1-jolsa@kernel.org
2021-05-07 01:28:28 +02:00
Magnus Karlsson
3b80d106e1 samples/bpf: Consider frame size in tx_only of xdpsock sample
Fix the tx_only micro-benchmark in xdpsock to take frame size into
consideration. It was hardcoded to the default value of frame_size
which is 4K. Changing this on the command line to 2K made half of the
packets illegal as they were outside the umem and were therefore
discarded by the kernel.

Fixes: 46738f73ea ("samples/bpf: add use of need_wakeup flag in xdpsock")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210506124349.6666-1-magnus.karlsson@gmail.com
2021-05-07 01:19:55 +02:00
Ian Rogers
9683e5775c libbpf: Add NULL check to add_dummy_ksym_var
Avoids a segv if btf isn't present. Seen on the call path
__bpf_object__open calling bpf_object__collect_externs.

Fixes: 5bd022ec01 (libbpf: Support extern kernel function)
Suggested-by: Stanislav Fomichev <sdf@google.com>
Suggested-by: Petar Penkov <ppenkov@google.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210504234910.976501-1-irogers@google.com
2021-05-05 11:29:33 -07:00
David S. Miller
1682d8df20 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2021-05-04

The following pull-request contains BPF updates for your *net* tree.

We've added 5 non-merge commits during the last 4 day(s) which contain
a total of 6 files changed, 52 insertions(+), 30 deletions(-).

The main changes are:

1) Fix libbpf overflow when processing BPF ring buffer in case of extreme
   application behavior, from Brendan Jackman.

2) Fix potential data leakage of uninitialized BPF stack under speculative
   execution, from Daniel Borkmann.

3) Fix off-by-one when validating xsk pool chunks, from Xuan Zhuo.

4) Fix snprintf BPF selftest with a pid filter to avoid racing its output
   test buffer, from Florent Revest.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-03 18:40:17 -07:00
Xuan Zhuo
ac31565c21 xsk: Fix for xp_aligned_validate_desc() when len == chunk_size
When desc->len is equal to chunk_size, it is legal. But when the
xp_aligned_validate_desc() got chunk_end from desc->addr + desc->len
pointing to the next chunk during the check, it caused the check to
fail.

This problem was first introduced in bbff2f321a ("xsk: new descriptor
addressing scheme"). Later in 2b43470add ("xsk: Introduce AF_XDP buffer
allocation API") this piece of code was moved into the new function called
xp_aligned_validate_desc(). This function was then moved into xsk_queue.h
via 26062b185e ("xsk: Explicitly inline functions and move definitions").

Fixes: bbff2f321a ("xsk: new descriptor addressing scheme")
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20210428094424.54435-1-xuanzhuo@linux.alibaba.com
2021-05-04 00:28:06 +02:00
Daniele Palmas
bd1af6b5ff Documentation: ABI: sysfs-class-net-qmi: document pass-through file
Add documentation for /sys/class/net/<iface>/qmi/pass_through

Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-03 13:40:17 -07:00
Xie He
d362fd0be4 Revert "drivers/net/wan/hdlc_fr: Fix a double free in pvc_xmit"
This reverts commit 1b479fb801
("drivers/net/wan/hdlc_fr: Fix a double free in pvc_xmit").

1. This commit is incorrect. "__skb_pad" will NOT free the skb on
failure when its "free_on_error" parameter is "false".

2. This commit claims to fix my commit. But it didn't CC me??

Fixes: 1b479fb801 ("drivers/net/wan/hdlc_fr: Fix a double free in pvc_xmit")
Cc: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-03 13:37:58 -07:00
David S. Miller
d89ecd154c Merge branch 'sctp-race-fix'
Xin Long says:

====================
sctp: fix the race condition in sctp_destroy_sock in a proper way

The original fix introduced a dead lock, and has to be removed in
Patch 1/2, and we will get a proper way to fix it in Patch 2/2.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-03 13:36:21 -07:00
Xin Long
34e5b01186 sctp: delay auto_asconf init until binding the first addr
As Or Cohen described:

  If sctp_destroy_sock is called without sock_net(sk)->sctp.addr_wq_lock
  held and sp->do_auto_asconf is true, then an element is removed
  from the auto_asconf_splist without any proper locking.

  This can happen in the following functions:
  1. In sctp_accept, if sctp_sock_migrate fails.
  2. In inet_create or inet6_create, if there is a bpf program
     attached to BPF_CGROUP_INET_SOCK_CREATE which denies
     creation of the sctp socket.

This patch is to fix it by moving the auto_asconf init out of
sctp_init_sock(), by which inet_create()/inet6_create() won't
need to operate it in sctp_destroy_sock() when calling
sk_common_release().

It also makes more sense to do auto_asconf init while binding the
first addr, as auto_asconf actually requires an ANY addr bind,
see it in sctp_addr_wq_timeout_handler().

This addresses CVE-2021-23133.

Fixes: 6102365876 ("bpf: Add new cgroup attach type to enable sock modifications")
Reported-by: Or Cohen <orcohen@paloaltonetworks.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-03 13:36:21 -07:00
Xin Long
01bfe5e8e4 Revert "net/sctp: fix race condition in sctp_destroy_sock"
This reverts commit b166a20b07.

This one has to be reverted as it introduced a dead lock, as
syzbot reported:

       CPU0                    CPU1
       ----                    ----
  lock(&net->sctp.addr_wq_lock);
                               lock(slock-AF_INET6);
                               lock(&net->sctp.addr_wq_lock);
  lock(slock-AF_INET6);

CPU0 is the thread of sctp_addr_wq_timeout_handler(), and CPU1
is that of sctp_close().

The original issue this commit fixed will be fixed in the next
patch.

Reported-by: syzbot+959223586843e69a2674@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-03 13:36:21 -07:00
Phillip Potter
2e9f60932a net: hsr: check skb can contain struct hsr_ethhdr in fill_frame_info
Check at start of fill_frame_info that the MAC header in the supplied
skb is large enough to fit a struct hsr_ethhdr, as otherwise this is
not a valid HSR frame. If it is too small, return an error which will
then cause the callers to clean up the skb. Fixes a KMSAN-found
uninit-value bug reported by syzbot at:
https://syzkaller.appspot.com/bug?id=f7e9b601f1414f814f7602a82b6619a8d80bce3f

Reported-by: syzbot+e267bed19bfc5478fb33@syzkaller.appspotmail.com
Signed-off-by: Phillip Potter <phil@philpotter.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-03 13:33:54 -07:00
Xin Long
f282df0391 sctp: fix a SCTP_MIB_CURRESTAB leak in sctp_sf_do_dupcook_b
Normally SCTP_MIB_CURRESTAB is always incremented once asoc enter into
ESTABLISHED from the state < ESTABLISHED and decremented when the asoc
is being deleted.

However, in sctp_sf_do_dupcook_b(), the asoc's state can be changed to
ESTABLISHED from the state >= ESTABLISHED where it shouldn't increment
SCTP_MIB_CURRESTAB. Otherwise, one asoc may increment MIB_CURRESTAB
multiple times but only decrement once at the end.

I was able to reproduce it by using scapy to do the 4-way shakehands,
after that I replayed the COOKIE-ECHO chunk with 'peer_vtag' field
changed to different values, and SCTP_MIB_CURRESTAB was incremented
multiple times and never went back to 0 even when the asoc was freed.

This patch is to fix it by only incrementing SCTP_MIB_CURRESTAB when
the state < ESTABLISHED in sctp_sf_do_dupcook_b().

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-03 13:32:12 -07:00
David S. Miller
a52445ae5a Merge branch 'sctp-bad-revert'
Xin Long says:

====================
sctp: fix the incorrect revert

commit 35b4f24415 ("sctp: do asoc update earlier in
sctp_sf_do_dupcook_a") only keeps the SHUTDOWN and
COOKIE-ACK with the same asoc, not transport.

So instead of revert commit 145cb2f717 ("sctp: Fix bundling
of SHUTDOWN with COOKIE-ACK"), we should revert 12dfd78e3a
("sctp: Fix SHUTDOWN CTSN Ack in the peer restart case").
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-03 13:30:04 -07:00
Xin Long
7aa4e54739 Revert "sctp: Fix SHUTDOWN CTSN Ack in the peer restart case"
This reverts commit 12dfd78e3a.

This can be reverted as shutdown and cookie_ack chunk are using the
same asoc since commit 35b4f24415 ("sctp: do asoc update earlier
in sctp_sf_do_dupcook_a").

Reported-by: Jere Leppänen <jere.leppanen@nokia.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-03 13:30:04 -07:00
Xin Long
22008f560b Revert "Revert "sctp: Fix bundling of SHUTDOWN with COOKIE-ACK""
This reverts commit 7e9269a5ac.

As Jere notice, commit 35b4f24415 ("sctp: do asoc update earlier
in sctp_sf_do_dupcook_a") only keeps the SHUTDOWN and COOKIE-ACK
with the same asoc, not transport. So we have to bring this patch
back.

Reported-by: Jere Leppänen <jere.leppanen@nokia.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-03 13:30:04 -07:00
Lv Yunlong
643001b47a ethernet:enic: Fix a use after free bug in enic_hard_start_xmit
In enic_hard_start_xmit, it calls enic_queue_wq_skb(). Inside
enic_queue_wq_skb, if some error happens, the skb will be freed
by dev_kfree_skb(skb). But the freed skb is still used in
skb_tx_timestamp(skb).

My patch makes enic_queue_wq_skb() return error and goto spin_unlock()
incase of error. The solution is provided by Govind.
See https://lkml.org/lkml/2021/4/30/961.

Fixes: fb7516d424 ("enic: add sw timestamp support")
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Acked-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-03 13:23:00 -07:00
Brendan Jackman
2a30f94406 libbpf: Fix signed overflow in ringbuf_process_ring
One of our benchmarks running in (Google-internal) CI pushes data
through the ringbuf faster htan than userspace is able to consume
it. In this case it seems we're actually able to get >INT_MAX entries
in a single ring_buffer__consume() call. ASAN detected that cnt
overflows in this case.

Fix by using 64-bit counter internally and then capping the result to
INT_MAX before converting to the int return type. Do the same for
the ring_buffer__poll().

Fixes: bf99c936f9 (libbpf: Add BPF ring buffer support)
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210429130510.1621665-1-jackmanb@google.com
2021-05-03 09:54:12 -07:00
Daniel Borkmann
801c6058d1 bpf: Fix leakage of uninitialized bpf stack under speculation
The current implemented mechanisms to mitigate data disclosure under
speculation mainly address stack and map value oob access from the
speculative domain. However, Piotr discovered that uninitialized BPF
stack is not protected yet, and thus old data from the kernel stack,
potentially including addresses of kernel structures, could still be
extracted from that 512 bytes large window. The BPF stack is special
compared to map values since it's not zero initialized for every
program invocation, whereas map values /are/ zero initialized upon
their initial allocation and thus cannot leak any prior data in either
domain. In the non-speculative domain, the verifier ensures that every
stack slot read must have a prior stack slot write by the BPF program
to avoid such data leaking issue.

However, this is not enough: for example, when the pointer arithmetic
operation moves the stack pointer from the last valid stack offset to
the first valid offset, the sanitation logic allows for any intermediate
offsets during speculative execution, which could then be used to
extract any restricted stack content via side-channel.

Given for unprivileged stack pointer arithmetic the use of unknown
but bounded scalars is generally forbidden, we can simply turn the
register-based arithmetic operation into an immediate-based arithmetic
operation without the need for masking. This also gives the benefit
of reducing the needed instructions for the operation. Given after
the work in 7fedb63a83 ("bpf: Tighten speculative pointer arithmetic
mask"), the aux->alu_limit already holds the final immediate value for
the offset register with the known scalar. Thus, a simple mov of the
immediate to AX register with using AX as the source for the original
instruction is sufficient and possible now in this case.

Reported-by: Piotr Krysiuk <piotras@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Piotr Krysiuk <piotras@gmail.com>
Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-05-03 11:56:23 +02:00
Daniel Borkmann
b9b34ddbe2 bpf: Fix masking negation logic upon negative dst register
The negation logic for the case where the off_reg is sitting in the
dst register is not correct given then we cannot just invert the add
to a sub or vice versa. As a fix, perform the final bitwise and-op
unconditionally into AX from the off_reg, then move the pointer from
the src to dst and finally use AX as the source for the original
pointer arithmetic operation such that the inversion yields a correct
result. The single non-AX mov in between is possible given constant
blinding is retaining it as it's not an immediate based operation.

Fixes: 979d63d50c ("bpf: prevent out of bounds speculation on pointer arithmetic")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Piotr Krysiuk <piotras@gmail.com>
Reviewed-by: Piotr Krysiuk <piotras@gmail.com>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
2021-05-03 11:56:16 +02:00
Wan Jiabing
f18c51b651 net: stmmac: Remove duplicate declaration of stmmac_priv
In commit f4da56529d ("net: stmmac: Add support for external
trigger timestamping"), struct stmmac_priv was declared at line 507
which caused duplicate struct declarations.
Remove later duplicate declaration here.

Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Reviewed-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-30 15:21:44 -07:00
Maxim Kochetkov
8385b1f0ad net: phy: marvell: enable downshift by default
A number of PHYs support the PHY tunable to set and get
downshift. However, only 88E1116R enables downshift by default. Extend
this default enabled to all the PHYs that support the downshift
tunable.

Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-30 15:20:47 -07:00
David S. Miller
c5fab809bf Merge branch 'sctp-chunk-fix'
Xin Long says:

====================
sctp: always send a chunk with the asoc that it belongs to

Currently when processing a duplicate COOKIE-ECHO chunk, a new temp
asoc would be created, then it creates the chunks with the new asoc.
However, later on it uses the old asoc to send these chunks, which
has caused quite a few issues.

This patchset is to fix this and make sure that the COOKIE-ACK and
SHUTDOWN chunks are created with the same asoc that will be used to
send them out.

v1->v2:
  - see Patch 3/3.
====================

Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-30 15:06:34 -07:00
Xin Long
51eac7f2f0 sctp: do asoc update earlier in sctp_sf_do_dupcook_b
The same thing should be done for sctp_sf_do_dupcook_b().
Meanwhile, SCTP_CMD_UPDATE_ASSOC cmd can be removed.

v1->v2:
  - Fix the return value in sctp_sf_do_assoc_update().

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-30 15:06:34 -07:00
Xin Long
7e9269a5ac Revert "sctp: Fix bundling of SHUTDOWN with COOKIE-ACK"
This can be reverted as shutdown and cookie_ack chunk are using the
same asoc since the last patch.

This reverts commit 145cb2f717.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-30 15:06:34 -07:00
Xin Long
35b4f24415 sctp: do asoc update earlier in sctp_sf_do_dupcook_a
There's a panic that occurs in a few of envs, the call trace is as below:

  [] general protection fault, ... 0x29acd70f1000a: 0000 [#1] SMP PTI
  [] RIP: 0010:sctp_ulpevent_notify_peer_addr_change+0x4b/0x1fa [sctp]
  []  sctp_assoc_control_transport+0x1b9/0x210 [sctp]
  []  sctp_do_8_2_transport_strike.isra.16+0x15c/0x220 [sctp]
  []  sctp_cmd_interpreter.isra.21+0x1231/0x1a10 [sctp]
  []  sctp_do_sm+0xc3/0x2a0 [sctp]
  []  sctp_generate_timeout_event+0x81/0xf0 [sctp]

This is caused by a transport use-after-free issue. When processing a
duplicate COOKIE-ECHO chunk in sctp_sf_do_dupcook_a(), both COOKIE-ACK
and SHUTDOWN chunks are allocated with the transort from the new asoc.
However, later in the sideeffect machine, the old asoc is used to send
them out and old asoc's shutdown_last_sent_to is set to the transport
that SHUTDOWN chunk attached to in sctp_cmd_setup_t2(), which actually
belongs to the new asoc. After the new_asoc is freed and the old asoc
T2 timeout, the old asoc's shutdown_last_sent_to that is already freed
would be accessed in sctp_sf_t2_timer_expire().

Thanks Alexander and Jere for helping dig into this issue.

To fix it, this patch is to do the asoc update first, then allocate
the COOKIE-ACK and SHUTDOWN chunks with the 'updated' old asoc. This
would make more sense, as a chunk from an asoc shouldn't be sent out
with another asoc. We had fixed quite a few issues caused by this.

Fixes: 145cb2f717 ("sctp: Fix bundling of SHUTDOWN with COOKIE-ACK")
Reported-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Reported-by: syzbot+bbe538efd1046586f587@syzkaller.appspotmail.com
Reported-by: Michal Tesar <mtesar@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-30 15:06:34 -07:00
Marc Dionne
c5197b4ec9 afs, rxrpc: Add Marc Dionne as co-maintainer
Add Marc Dionne as a co-maintainer for kafs and rxrpc.

Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-30 15:04:24 -07:00
Johannes Berg
a57d3d4836 net: atheros: nic-devel@qualcomm.com is dead
Remove it from the MODULE_AUTHOR statements referencing it.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-30 15:03:21 -07:00
Yang Li
f0a5818b47 vsock/vmci: Remove redundant assignment to err
Variable 'err' is set to zero but this value is never read as it is
overwritten with a new value later on, hence it is a redundant
assignment and can be removed.

Clean up the following clang-analyzer warning:

net/vmw_vsock/vmci_transport.c:948:2: warning: Value stored to 'err' is
never read [clang-analyzer-deadcode.DeadStores]

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-30 15:00:59 -07:00
David S. Miller
00207c7d12 Merge branch 'hns3-fixes'
Huazhong Tan says:

====================
net: hns3: fixes for -net

This series adds some bugfixes for the HNS3 ethernet driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-30 14:49:34 -07:00
Yufeng Mo
472497d0bd net: hns3: disable phy loopback setting in hclge_mac_start_phy
If selftest and reset are performed at the same time, the phy
loopback setting may be still in enable state after the reset,
and device cannot link up. So fix this issue by disabling phy
loopback before phy_start().

Fixes: 256727da73 ("net: hns3: Add MDIO support to HNS3 Ethernet driver for hip08 SoC")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-30 14:49:33 -07:00
Yufeng Mo
8c9200e387 net: hns3: clear unnecessary reset request in hclge_reset_rebuild
HW error and global reset are reported through MSIX interrupts.
The same error may be reported to different functions at the
same time. When global reset begins, the pending reset request
set by this error is unnecessary. So clear the pending reset
request after the reset is complete to avoid the repeated reset.

Fixes: f6162d4412 ("net: hns3: add handling of hw errors reported through MSIX")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-30 14:49:33 -07:00
Peng Li
b416e872be net: hns3: use netif_tx_disable to stop the transmit queue
Currently, netif_tx_stop_all_queues() is used to ensure that
the xmit is not running, but for the concurrent case it will
not take effect, since netif_tx_stop_all_queues() just sets
a flag without locking to indicate that the xmit queue(s)
should not be run.

So use netif_tx_disable() to replace netif_tx_stop_all_queues(),
it takes the xmit queue lock while marking the queue stopped.

Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-30 14:49:33 -07:00
Hao Chen
905416f18f net: hns3: fix for vxlan gpe tx checksum bug
When skb->ip_summed is CHECKSUM_PARTIAL, for non-tunnel udp packet,
which has a dest port as the IANA assigned, the hardware is expected
to do the checksum offload, but the hardware whose version is below
V3 will not do the checksum offload when udp dest port is 4790.

So fixes it by doing the checksum in software for this case.

Fixes: 76ad4f0ee7 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-30 14:49:33 -07:00
Mohammad Athari Bin Ismail
db7c691d7f net: stmmac: cleared __FPE_REMOVING bit in stmmac_fpe_start_wq()
An issue found when network interface is down and up again, FPE handshake
fails to trigger. This is due to __FPE_REMOVING bit remains being set in
stmmac_fpe_stop_wq() but not cleared in stmmac_fpe_start_wq(). This
cause FPE workqueue task, stmmac_fpe_lp_task() not able to be executed.

To fix this, add clearing __FPE_REMOVING bit in stmmac_fpe_start_wq().

Fixes: 5a5586112b ("net: stmmac: support FPE link partner hand-shaking procedure")
Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-30 14:43:53 -07:00
Florent Revest
f80f88f0e2 selftests/bpf: Fix the snprintf test
The BPF program for the snprintf selftest runs on all syscall entries.
On busy multicore systems this can cause concurrency issues.

For example it was observed that sometimes the userspace part of the
test reads "    4 0000" instead of "    4 000" (extra '0' at the end)
which seems to happen just before snprintf on another core sets
end[-1] = '\0'.

This patch adds a pid filter to the test to ensure that no
bpf_snprintf() will write over the test's output buffers while the
userspace reads the values.

Fixes: c2e39c6bdc ("selftests/bpf: Add a series of tests for bpf_snprintf")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210428152501.1024509-1-revest@chromium.org
2021-04-30 10:36:59 -07:00
Oleksij Rempel
d4eecfb28b net: dsa: ksz: ksz8863_smi_probe: set proper return value for ksz_switch_alloc()
ksz_switch_alloc() will return NULL only if allocation is failed. So,
the proper return value is -ENOMEM.

Fixes: 60a3647600 ("net: dsa: microchip: Add Microchip KSZ8863 SMI based driver support")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-29 15:54:35 -07:00
Oleksij Rempel
ba46b576a7 net: dsa: ksz: ksz8795_spi_probe: fix possible NULL pointer dereference
Fix possible NULL pointer dereference in case devm_kzalloc() failed to
allocate memory

Fixes: cc13e52c3a ("net: dsa: microchip: Add Microchip KSZ8863 SPI based driver support")
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-29 15:54:35 -07:00
Oleksij Rempel
d27f0201b9 net: dsa: ksz: ksz8863_smi_probe: fix possible NULL pointer dereference
Fix possible NULL pointer dereference in case devm_kzalloc() failed to
allocate memory.

Fixes: 60a3647600 ("net: dsa: microchip: Add Microchip KSZ8863 SMI based driver support")
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-29 15:54:35 -07:00
Yang Li
8343b1f8b9 bnx2x: Remove redundant assignment to err
Variable 'err' is set to -EIO but this value is never read as it is
overwritten with a new value later on, hence it is a redundant
assignment and can be removed.

Clean up the following clang-analyzer warning:
drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c:1195:2: warning: Value
stored to 'err' is never read [clang-analyzer-deadcode.DeadStores]

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-29 15:52:07 -07:00
Jiapeng Chong
bbf6acea6e net: macb: Remove redundant assignment to queue
Variable queue is set to bp->queues but these values is not used as it
is overwritten later on, hence redundant assignment  can be removed.

Cleans up the following clang-analyzer warning:

drivers/net/ethernet/cadence/macb_main.c:4919:21: warning: Value stored
to 'queue' during its initialization is never read
[clang-analyzer-deadcode.DeadStores].

drivers/net/ethernet/cadence/macb_main.c:4832:21: warning: Value stored
to 'queue' during its initialization is never read
[clang-analyzer-deadcode.DeadStores].

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-29 15:50:20 -07:00
Michael Walle
57e1d8206e MAINTAINERS: move Murali Karicheri to credits
His email bounces with permanent error "550 Invalid recipient". His last
email was from 2020-09-09 on the LKML and he seems to have left TI.

Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-29 15:47:30 -07:00