linux-next

mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-17 01:34:00 +08:00

Author	SHA1	Message	Date
Daniel Borkmann	0af2ffc93a	bpf: Fix incorrect verifier simulation of ARSH under ALU32 Anatoly has been fuzzing with kBdysch harness and reported a hang in one of the outcomes: 0: R1=ctx(id=0,off=0,imm=0) R10=fp0 0: (85) call bpf_get_socket_cookie#46 1: R0_w=invP(id=0) R10=fp0 1: (57) r0 &= 808464432 2: R0_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0 2: (14) w0 -= 810299440 3: R0_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0 3: (c4) w0 s>>= 1 4: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0 4: (76) if w0 s>= 0x30303030 goto pc+216 221: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0 221: (95) exit processed 6 insns (limit 1000000) [...] Taking a closer look, the program was xlated as follows: # ./bpftool p d x i 12 0: (85) call bpf_get_socket_cookie#7800896 1: (bf) r6 = r0 2: (57) r6 &= 808464432 3: (14) w6 -= 810299440 4: (c4) w6 s>>= 1 5: (76) if w6 s>= 0x30303030 goto pc+216 6: (05) goto pc-1 7: (05) goto pc-1 8: (05) goto pc-1 [...] 220: (05) goto pc-1 221: (05) goto pc-1 222: (95) exit Meaning, the visible effect is very similar to `f54c7898ed` ("bpf: Fix precision tracking for unbounded scalars"), that is, the fall-through branch in the instruction 5 is considered to be never taken given the conclusion from the min/max bounds tracking in w6, and therefore the dead-code sanitation rewrites it as goto pc-1. However, real-life input disagrees with verification analysis since a soft-lockup was observed. The bug sits in the analysis of the ARSH. The definition is that we shift the target register value right by K bits through shifting in copies of its sign bit. In adjust_scalar_min_max_vals(), we do first coerce the register into 32 bit mode, same happens after simulating the operation. However, for the case of simulating the actual ARSH, we don't take the mode into account and act as if it's always 64 bit, but location of sign bit is different: dst_reg->smin_value >>= umin_val; dst_reg->smax_value >>= umin_val; dst_reg->var_off = tnum_arshift(dst_reg->var_off, umin_val); Consider an unknown R0 where bpf_get_socket_cookie() (or others) would for example return 0xffff. With the above ARSH simulation, we'd see the following results: [...] 1: R1=ctx(id=0,off=0,imm=0) R2_w=invP65535 R10=fp0 1: (85) call bpf_get_socket_cookie#46 2: R0_w=invP(id=0) R10=fp0 2: (57) r0 &= 808464432 -> R0_runtime = 0x3030 3: R0_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0 3: (14) w0 -= 810299440 -> R0_runtime = 0xcfb40000 4: R0_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0 (0xffffffff) 4: (c4) w0 s>>= 1 -> R0_runtime = 0xe7da0000 5: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0 (0x67c00000) (0x7ffbfff8) [...] In insn 3, we have a runtime value of 0xcfb40000, which is '1100 1111 1011 0100 0000 0000 0000 0000', the result after the shift has 0xe7da0000 that is '1110 0111 1101 1010 0000 0000 0000 0000', where the sign bit is correctly retained in 32 bit mode. In insn4, the umax was 0xffffffff, and changed into 0x7ffbfff8 after the shift, that is, '0111 1111 1111 1011 1111 1111 1111 1000' and means here that the simulation didn't retain the sign bit. With above logic, the updates happen on the 64 bit min/max bounds and given we coerced the register, the sign bits of the bounds are cleared as well, meaning, we need to force the simulation into s32 space for 32 bit alu mode. Verification after the fix below. We're first analyzing the fall-through branch on 32 bit signed >= test eventually leading to rejection of the program in this specific case: 0: R1=ctx(id=0,off=0,imm=0) R10=fp0 0: (b7) r2 = 808464432 1: R1=ctx(id=0,off=0,imm=0) R2_w=invP808464432 R10=fp0 1: (85) call bpf_get_socket_cookie#46 2: R0_w=invP(id=0) R10=fp0 2: (bf) r6 = r0 3: R0_w=invP(id=0) R6_w=invP(id=0) R10=fp0 3: (57) r6 &= 808464432 4: R0_w=invP(id=0) R6_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0 4: (14) w6 -= 810299440 5: R0_w=invP(id=0) R6_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0 5: (c4) w6 s>>= 1 6: R0_w=invP(id=0) R6_w=invP(id=0,umin_value=3888119808,umax_value=4294705144,var_off=(0xe7c00000; 0x183bfff8)) R10=fp0 (0x67c00000) (0xfffbfff8) 6: (76) if w6 s>= 0x30303030 goto pc+216 7: R0_w=invP(id=0) R6_w=invP(id=0,umin_value=3888119808,umax_value=4294705144,var_off=(0xe7c00000; 0x183bfff8)) R10=fp0 7: (30) r0 = (u8 )skb[808464432] BPF_LD_[ABS\|IND] uses reserved fields processed 8 insns (limit 1000000) [...] Fixes: `9cbe1f5a32` ("bpf/verifier: improve register value range tracking with ARSH") Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200115204733.16648-1-daniel@iogearbox.net	2020-01-15 13:39:59 -08:00
Martin KaFai Lau	555089fdfc	bpftool: Fix printing incorrect pointer in btf_dump_ptr For plain text output, it incorrectly prints the pointer value "void data". The "void data" is actually pointing to memory that contains a bpf-map's value. The intention is to print the content of the bpf-map's value instead of printing the pointer pointing to the bpf-map's value. In this case, a member of the bpf-map's value is a pointer type. Thus, it should print the "(void *)data". Fixes: `22c349e8db` ("tools: bpftool: fix format strings and arguments for jsonw_printf()") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Link: https://lore.kernel.org/bpf/20200110231644.3484151-1-kafai@fb.com	2020-01-11 19:08:21 -08:00
Lorenz Bauer	2e012c7482	net: bpf: Don't leak time wait and request sockets It's possible to leak time wait and request sockets via the following BPF pseudo code: sk = bpf_skc_lookup_tcp(...) if (sk) bpf_sk_release(sk) If sk->sk_state is TCP_NEW_SYN_RECV or TCP_TIME_WAIT the refcount taken by bpf_skc_lookup_tcp is not undone by bpf_sk_release. This is because sk_flags is re-used for other data in both kinds of sockets. The check !sock_flag(sk, SOCK_RCU_FREE) therefore returns a bogus result. Check that sk_flags is valid by calling sk_fullsock. Skip checking SOCK_RCU_FREE if we already know that sk is not a full socket. Fixes: `edbf8c01de` ("bpf: add skc_lookup_tcp helper") Fixes: `f7355a6c04` ("bpf: Check sk_fullsock() before returning from bpf_sk_lookup()") Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200110132336.26099-1-lmb@cloudflare.com	2020-01-10 10:42:23 -08:00
Lingpeng Chen	e7a5f1f1cd	bpf/sockmap: Read psock ingress_msg before sk_receive_queue Right now in tcp_bpf_recvmsg, sock read data first from sk_receive_queue if not empty than psock->ingress_msg otherwise. If a FIN packet arrives and there's also some data in psock->ingress_msg, the data in psock->ingress_msg will be purged. It is always happen when request to a HTTP1.0 server like python SimpleHTTPServer since the server send FIN packet after data is sent out. Fixes: `604326b41a` ("bpf, sockmap: convert to generic sk_msg interface") Reported-by: Arika Chen <eaglesora@gmail.com> Suggested-by: Arika Chen <eaglesora@gmail.com> Signed-off-by: Lingpeng Chen <forrest0579@gmail.com> Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20200109014833.18951-1-forrest0579@gmail.com	2020-01-09 23:13:48 +01:00
Jose Abreu	da29f2d84b	net: stmmac: Fixed link does not need MDIO Bus When using fixed link we don't need the MDIO bus support. Reported-by: Heiko Stuebner <heiko@sntech.de> Reported-by: kernelci.org bot <bot@kernelci.org> Fixes: `d3e014ec7d` ("net: stmmac: platform: Fix MDIO init for platforms without PHY") Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com> Acked-by: Sriram Dash <Sriram.dash@samsung.com> Tested-by: Patrice Chotard <patrice.chotard@st.com> Tested-by: Heiko Stuebner <heiko@sntech.de> Acked-by: Neil Armstrong <narmstrong@baylibre.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Florian Fainelli <f.fainelli@gmail> # Lamobo R1 (fixed-link + MDIO sub node for roboswitch). Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-07 13:40:29 -08:00
David S. Miller	b57e1fff7d	Merge branch 'vlan-rtnetlink-newlink-fixes' Eric Dumazet says: ==================== vlan: rtnetlink newlink fixes First patch fixes a potential memory leak found by syzbot Second patch makes vlan_changelink() aware of errors and report them to user. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-07 13:35:14 -08:00
Eric Dumazet	eb8ef2a3c5	vlan: vlan_changelink() should propagate errors Both vlan_dev_change_flags() and vlan_dev_set_egress_priority() can return an error. vlan_changelink() should not ignore them. Fixes: `07b5b17e15` ("[VLAN]: Use rtnl_link API") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-07 13:35:14 -08:00
Eric Dumazet	9bbd917e0b	vlan: fix memory leak in vlan_dev_set_egress_priority There are few cases where the ndo_uninit() handler might be not called if an error happens while device is initialized. Since vlan_newlink() calls vlan_changelink() before trying to register the netdevice, we need to make sure vlan_dev_uninit() has been called at least once, or we might leak allocated memory. BUG: memory leak unreferenced object 0xffff888122a206c0 (size 32): comm "syz-executor511", pid 7124, jiffies 4294950399 (age 32.240s) hex dump (first 32 bytes): 00 00 00 00 00 00 61 73 00 00 00 00 00 00 00 00 ......as........ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<000000000eb3bb85>] kmemleak_alloc_recursive include/linux/kmemleak.h:43 [inline] [<000000000eb3bb85>] slab_post_alloc_hook mm/slab.h:586 [inline] [<000000000eb3bb85>] slab_alloc mm/slab.c:3320 [inline] [<000000000eb3bb85>] kmem_cache_alloc_trace+0x145/0x2c0 mm/slab.c:3549 [<000000007b99f620>] kmalloc include/linux/slab.h:556 [inline] [<000000007b99f620>] vlan_dev_set_egress_priority+0xcc/0x150 net/8021q/vlan_dev.c:194 [<000000007b0cb745>] vlan_changelink+0xd6/0x140 net/8021q/vlan_netlink.c:126 [<0000000065aba83a>] vlan_newlink+0x135/0x200 net/8021q/vlan_netlink.c:181 [<00000000fb5dd7a2>] __rtnl_newlink+0x89a/0xb80 net/core/rtnetlink.c:3305 [<00000000ae4273a1>] rtnl_newlink+0x4e/0x80 net/core/rtnetlink.c:3363 [<00000000decab39f>] rtnetlink_rcv_msg+0x178/0x4b0 net/core/rtnetlink.c:5424 [<00000000accba4ee>] netlink_rcv_skb+0x61/0x170 net/netlink/af_netlink.c:2477 [<00000000319fe20f>] rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5442 [<00000000d51938dc>] netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline] [<00000000d51938dc>] netlink_unicast+0x223/0x310 net/netlink/af_netlink.c:1328 [<00000000e539ac79>] netlink_sendmsg+0x2c0/0x570 net/netlink/af_netlink.c:1917 [<000000006250c27e>] sock_sendmsg_nosec net/socket.c:639 [inline] [<000000006250c27e>] sock_sendmsg+0x54/0x70 net/socket.c:659 [<00000000e2a156d1>] ____sys_sendmsg+0x2d0/0x300 net/socket.c:2330 [<000000008c87466e>] ___sys_sendmsg+0x8a/0xd0 net/socket.c:2384 [<00000000110e3054>] __sys_sendmsg+0x80/0xf0 net/socket.c:2417 [<00000000d71077c8>] __do_sys_sendmsg net/socket.c:2426 [inline] [<00000000d71077c8>] __se_sys_sendmsg net/socket.c:2424 [inline] [<00000000d71077c8>] __x64_sys_sendmsg+0x23/0x30 net/socket.c:2424 Fixe: `07b5b17e15` ("[VLAN]: Use rtnl_link API") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-07 13:35:14 -08:00
David S. Miller	96b11e9358	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2020-01-07 The following pull-request contains BPF updates for your net tree. We've added 2 non-merge commits during the last 1 day(s) which contain a total of 2 files changed, 16 insertions(+), 4 deletions(-). The main changes are: 1) Fix a use-after-free in cgroup BPF due to auto-detachment, from Roman Gushchin. 2) Fix skb out-of-bounds access in ld_abs/ind instruction, from Daniel Borkmann. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-07 13:31:23 -08:00
Jiping Ma	481a7d154c	stmmac: debugfs entry name is not be changed when udev rename device name. Add one notifier for udev changes net device name. Fixes: b6601323ef9e ("net: stmmac: debugfs entry name is not be changed when udev rename") Signed-off-by: Jiping Ma <jiping.ma2@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-07 13:26:16 -08:00
David S. Miller	c101fffcd7	mlx5-fixes-2020-01-06 -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl4Twv0ACgkQSD+KveBX +j6NzAf/Z4FkuBtjvroUcRpR/R2B9d9ipWemqvjHUobLDm/q0HLRvz3YHGFOW/1H JLHVEh3jze5DkBegDmJWpFk/T2MWT5GZQ62jccUtIvjtIOwE5R6+EiUTZc3IHGZ1 Yzzoo+t0LyML6O+jxf3x+ZKjBHCh0jMvI0R2PFoxXESkPK8dKFDg7u6eLAmshYdX akN9gzvrrClcsG0kx5llGzWJuNaRFGy/LA1rYM/9IpyFkgPG6yuwljWLk6U3era3 bPCOmL3X6SN4ji55RWvsnvwtBB2LY5ZIsFzF9rbkXBhu/bb3AdYfdDREUvemzfbH dUnFsNECYnmjIx/52/C7ozW7gTtzdg== =0Pxs -----END PGP SIGNATURE----- Merge tag 'mlx5-fixes-2020-01-06' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== Mellanox, mlx5 fixes 2020-01-06 This series introduces some fixes to mlx5 driver. Please pull and let me know if there is any problem. For -stable v5.3 ('net/mlx5: Move devlink registration before interfaces load') For -stable v5.4 ('net/mlx5e: Fix hairpin RSS table size') ('net/mlx5: DR, Init lists that are used in rule's member') ('net/mlx5e: Always print health reporter message to dmesg') ('net/mlx5: DR, No need for atomic refcount for internal SW steering resources') ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-06 18:24:18 -08:00
Erez Shitrit	df55c5586e	net/mlx5: DR, Init lists that are used in rule's member Whenever adding new member of rule object we attach it to 2 lists, These 2 lists should be initialized first. Fixes: `41d0707415` ("net/mlx5: DR, Expose steering rule functionality") Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-01-06 15:30:05 -08:00
Eli Cohen	6412bb396a	net/mlx5e: Fix hairpin RSS table size Set hairpin table size to the corret size, based on the groups that would be created in it. Groups are laid out on the table such that a group occupies a range of entries in the table. This implies that the group ranges should have correspondence to the table they are laid upon. The patch cited below made group 1's size to grow hence causing overflow of group range laid on the table. Fixes: `a795d8db2a` ("net/mlx5e: Support RSS for IP-in-IP and IPv6 tunneled packets") Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-01-06 15:30:05 -08:00
Yevgeny Kliteynik	4ce380ca47	net/mlx5: DR, No need for atomic refcount for internal SW steering resources No need for an atomic refcounter for the STE and hashtables. These are internal SW steering resources and they are always under domain mutex. This also fixes the following refcount error: refcount_t: addition on 0; use-after-free. WARNING: CPU: 9 PID: 3527 at lib/refcount.c:25 refcount_warn_saturate+0x81/0xe0 Call Trace: dr_table_init_nic+0x10d/0x110 [mlx5_core] mlx5dr_table_create+0xb4/0x230 [mlx5_core] mlx5_cmd_dr_create_flow_table+0x39/0x120 [mlx5_core] __mlx5_create_flow_table+0x221/0x5f0 [mlx5_core] esw_create_offloads_fdb_tables+0x180/0x5a0 [mlx5_core] ... Fixes: `26d688e33f` ("net/mlx5: DR, Add Steering entry (STE) utilities") Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com> Reviewed-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-01-06 15:30:04 -08:00
Parav Pandit	1f0593e791	Revert "net/mlx5: Support lockless FTE read lookups" This reverts commit `7dee607ed0`. During cleanup path, FTE's parent node group is removed which is referenced by the FTE while freeing the FTE. Hence FTE's lockless read lookup optimization done in cited commit is not possible at the moment. Hence, revert the commit. This avoid below KAZAN call trace. [ 110.390896] BUG: KASAN: use-after-free in find_root.isra.14+0x56/0x60 [mlx5_core] [ 110.391048] Read of size 4 at addr ffff888c19e6d220 by task swapper/12/0 [ 110.391219] CPU: 12 PID: 0 Comm: swapper/12 Not tainted 5.5.0-rc1+ [ 110.391222] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 08/02/2014 [ 110.391225] Call Trace: [ 110.391229] <IRQ> [ 110.391246] dump_stack+0x95/0xd5 [ 110.391307] ? find_root.isra.14+0x56/0x60 [mlx5_core] [ 110.391320] print_address_description.constprop.5+0x20/0x320 [ 110.391379] ? find_root.isra.14+0x56/0x60 [mlx5_core] [ 110.391435] ? find_root.isra.14+0x56/0x60 [mlx5_core] [ 110.391441] __kasan_report+0x149/0x18c [ 110.391499] ? find_root.isra.14+0x56/0x60 [mlx5_core] [ 110.391504] kasan_report+0x12/0x20 [ 110.391511] __asan_report_load4_noabort+0x14/0x20 [ 110.391567] find_root.isra.14+0x56/0x60 [mlx5_core] [ 110.391625] del_sw_fte_rcu+0x4a/0x100 [mlx5_core] [ 110.391633] rcu_core+0x404/0x1950 [ 110.391640] ? rcu_accelerate_cbs_unlocked+0x100/0x100 [ 110.391649] ? run_rebalance_domains+0x201/0x280 [ 110.391654] rcu_core_si+0xe/0x10 [ 110.391661] __do_softirq+0x181/0x66c [ 110.391670] irq_exit+0x12c/0x150 [ 110.391675] smp_apic_timer_interrupt+0xf0/0x370 [ 110.391681] apic_timer_interrupt+0xf/0x20 [ 110.391684] </IRQ> [ 110.391695] RIP: 0010:cpuidle_enter_state+0xfa/0xba0 [ 110.391703] Code: 3d c3 9b b5 50 e8 56 75 6e fe 48 89 45 c8 0f 1f 44 00 00 31 ff e8 a6 94 6e fe 45 84 ff 0f 85 f6 02 00 00 fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 db 06 00 00 4d 63 fe 4b 8d 04 7f 49 8d 04 87 49 8d [ 110.391706] RSP: 0018:ffff888c23a6fce8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13 [ 110.391712] RAX: dffffc0000000000 RBX: ffffe8ffff7002f8 RCX: 000000000000001f [ 110.391715] RDX: 1ffff11184ee6cb5 RSI: 0000000040277d83 RDI: ffff888c277365a8 [ 110.391718] RBP: ffff888c23a6fd40 R08: 0000000000000002 R09: 0000000000035280 [ 110.391721] R10: ffff888c23a6fc80 R11: ffffed11847485d0 R12: ffffffffb1017740 [ 110.391723] R13: 0000000000000003 R14: 0000000000000003 R15: 0000000000000000 [ 110.391732] ? cpuidle_enter_state+0xea/0xba0 [ 110.391738] cpuidle_enter+0x4f/0xa0 [ 110.391747] call_cpuidle+0x6d/0xc0 [ 110.391752] do_idle+0x360/0x430 [ 110.391758] ? arch_cpu_idle_exit+0x40/0x40 [ 110.391765] ? complete+0x67/0x80 [ 110.391771] cpu_startup_entry+0x1d/0x20 [ 110.391779] start_secondary+0x2f3/0x3c0 [ 110.391784] ? set_cpu_sibling_map+0x2500/0x2500 [ 110.391795] secondary_startup_64+0xa4/0xb0 [ 110.391841] Allocated by task 290: [ 110.391917] save_stack+0x21/0x90 [ 110.391921] __kasan_kmalloc.constprop.8+0xa7/0xd0 [ 110.391925] kasan_kmalloc+0x9/0x10 [ 110.391929] kmem_cache_alloc_trace+0xf6/0x270 [ 110.391987] create_root_ns.isra.36+0x58/0x260 [mlx5_core] [ 110.392044] mlx5_init_fs+0x5fd/0x1ee0 [mlx5_core] [ 110.392092] mlx5_load_one+0xc7a/0x3860 [mlx5_core] [ 110.392139] init_one+0x6ff/0xf90 [mlx5_core] [ 110.392145] local_pci_probe+0xde/0x190 [ 110.392150] work_for_cpu_fn+0x56/0xa0 [ 110.392153] process_one_work+0x678/0x1140 [ 110.392157] worker_thread+0x573/0xba0 [ 110.392162] kthread+0x341/0x400 [ 110.392166] ret_from_fork+0x1f/0x40 [ 110.392218] Freed by task 2742: [ 110.392288] save_stack+0x21/0x90 [ 110.392292] __kasan_slab_free+0x137/0x190 [ 110.392296] kasan_slab_free+0xe/0x10 [ 110.392299] kfree+0x94/0x250 [ 110.392357] tree_put_node+0x257/0x360 [mlx5_core] [ 110.392413] tree_remove_node+0x63/0xb0 [mlx5_core] [ 110.392469] clean_tree+0x199/0x240 [mlx5_core] [ 110.392525] mlx5_cleanup_fs+0x76/0x580 [mlx5_core] [ 110.392572] mlx5_unload+0x22/0xc0 [mlx5_core] [ 110.392619] mlx5_unload_one+0x99/0x260 [mlx5_core] [ 110.392666] remove_one+0x61/0x160 [mlx5_core] [ 110.392671] pci_device_remove+0x10b/0x2c0 [ 110.392677] device_release_driver_internal+0x1e4/0x490 [ 110.392681] device_driver_detach+0x36/0x40 [ 110.392685] unbind_store+0x147/0x200 [ 110.392688] drv_attr_store+0x6f/0xb0 [ 110.392693] sysfs_kf_write+0x127/0x1d0 [ 110.392697] kernfs_fop_write+0x296/0x420 [ 110.392702] __vfs_write+0x66/0x110 [ 110.392707] vfs_write+0x1a0/0x500 [ 110.392711] ksys_write+0x164/0x250 [ 110.392715] __x64_sys_write+0x73/0xb0 [ 110.392720] do_syscall_64+0x9f/0x3a0 [ 110.392725] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: `7dee607ed0` ("net/mlx5: Support lockless FTE read lookups") Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-01-06 15:30:04 -08:00
Michael Guralnik	a6f3b62386	net/mlx5: Move devlink registration before interfaces load Register devlink before interfaces are added. This will allow interfaces to use devlink while initalizing. For example, call mlx5_is_roce_enabled. Fixes: `aba25279c1` ("net/mlx5e: Add TX reporter support") Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-01-06 15:30:04 -08:00
Eran Ben Elisha	99cda45426	net/mlx5e: Always print health reporter message to dmesg In case a reporter exists, error message is logged only to the devlink tracer. The devlink tracer is a visibility utility only, which user can choose not to monitor. After cited patch, 3rd party monitoring tools that tracks these error message will no longer find them in dmesg, causing a regression. With this patch, error messages are also logged into the dmesg. Fixes: `c50de4af1d` ("net/mlx5e: Generalize tx reporter's functionality") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-01-06 15:30:04 -08:00
Dmytro Linkin	554fe75c1b	net/mlx5e: Avoid duplicating rule destinations Following scenario easily break driver logic and crash the kernel: 1. Add rule with mirred actions to same device. 2. Delete this rule. In described scenario rule is not added to database and on deletion driver access invalid entry. Example: $ tc filter add dev ens1f0_0 ingress protocol ip prio 1 \ flower skip_sw \ action mirred egress mirror dev ens1f0_1 pipe \ action mirred egress redirect dev ens1f0_1 $ tc filter del dev ens1f0_0 ingress protocol ip prio 1 Dmesg output: [ 376.634396] mlx5_core 0000:82:00.0: mlx5_cmd_check:756:(pid 3439): DESTROY_FLOW_GROUP(0x934) op_mod(0x0) failed, status bad resource state(0x9), syndrome (0x563e2f) [ 376.654983] mlx5_core 0000:82:00.0: del_hw_flow_group:567:(pid 3439): flow steering can't destroy fg 89 of ft 3145728 [ 376.673433] kasan: CONFIG_KASAN_INLINE enabled [ 376.683769] kasan: GPF could be caused by NULL-ptr deref or user memory access [ 376.695229] general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI [ 376.705069] CPU: 7 PID: 3439 Comm: tc Not tainted 5.4.0-rc5+ #76 [ 376.714959] Hardware name: Supermicro SYS-2028TP-DECTR/X10DRT-PT, BIOS 2.0a 08/12/2016 [ 376.726371] RIP: 0010:mlx5_del_flow_rules+0x105/0x960 [mlx5_core] [ 376.735817] Code: 01 00 00 00 48 83 eb 08 e8 28 d9 ff ff 4c 39 e3 75 d8 4c 8d bd c0 02 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 84 04 00 00 48 8d 7d 28 8b 9 d [ 376.761261] RSP: 0018:ffff888847c56db8 EFLAGS: 00010202 [ 376.770054] RAX: dffffc0000000000 RBX: ffff8888582a6da0 RCX: ffff888847c56d60 [ 376.780743] RDX: 0000000000000058 RSI: 0000000000000008 RDI: 0000000000000282 [ 376.791328] RBP: 0000000000000000 R08: fffffbfff0c60ea6 R09: fffffbfff0c60ea6 [ 376.802050] R10: fffffbfff0c60ea5 R11: ffffffff8630752f R12: ffff8888582a6da0 [ 376.812798] R13: dffffc0000000000 R14: ffff8888582a6da0 R15: 00000000000002c0 [ 376.823445] FS: 00007f675f9a8840(0000) GS:ffff88886d200000(0000) knlGS:0000000000000000 [ 376.834971] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 376.844179] CR2: 00000000007d9640 CR3: 00000007d3f26003 CR4: 00000000001606e0 [ 376.854843] Call Trace: [ 376.868542] __mlx5_eswitch_del_rule+0x49/0x300 [mlx5_core] [ 376.877735] mlx5e_tc_del_fdb_flow+0x6ec/0x9e0 [mlx5_core] [ 376.921549] mlx5e_flow_put+0x2b/0x50 [mlx5_core] [ 376.929813] mlx5e_delete_flower+0x5b6/0xbd0 [mlx5_core] [ 376.973030] tc_setup_cb_reoffload+0x29/0xc0 [ 376.980619] fl_reoffload+0x50a/0x770 [cls_flower] [ 377.015087] tcf_block_playback_offloads+0xbd/0x250 [ 377.033400] tcf_block_setup+0x1b2/0xc60 [ 377.057247] tcf_block_offload_cmd+0x195/0x240 [ 377.098826] tcf_block_offload_unbind+0xe7/0x180 [ 377.107056] __tcf_block_put+0xe5/0x400 [ 377.114528] ingress_destroy+0x3d/0x60 [sch_ingress] [ 377.122894] qdisc_destroy+0xf1/0x5a0 [ 377.129993] qdisc_graft+0xa3d/0xe50 [ 377.151227] tc_get_qdisc+0x48e/0xa20 [ 377.165167] rtnetlink_rcv_msg+0x35d/0x8d0 [ 377.199528] netlink_rcv_skb+0x11e/0x340 [ 377.219638] netlink_unicast+0x408/0x5b0 [ 377.239913] netlink_sendmsg+0x71b/0xb30 [ 377.267505] sock_sendmsg+0xb1/0xf0 [ 377.273801] ___sys_sendmsg+0x635/0x900 [ 377.312784] __sys_sendmsg+0xd3/0x170 [ 377.338693] do_syscall_64+0x95/0x460 [ 377.344833] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 377.352321] RIP: 0033:0x7f675e58e090 To avoid this, for every mirred action check if output device was already processed. If so - drop rule with EOPNOTSUPP error. Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2020-01-06 15:30:03 -08:00
Daniel Borkmann	6d4f151acf	bpf: Fix passing modified ctx to ld/abs/ind instruction Anatoly has been fuzzing with kBdysch harness and reported a KASAN slab oob in one of the outcomes: [...] [ 77.359642] BUG: KASAN: slab-out-of-bounds in bpf_skb_load_helper_8_no_cache+0x71/0x130 [ 77.360463] Read of size 4 at addr ffff8880679bac68 by task bpf/406 [ 77.361119] [ 77.361289] CPU: 2 PID: 406 Comm: bpf Not tainted 5.5.0-rc2-xfstests-00157-g2187f215eba #1 [ 77.362134] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 [ 77.362984] Call Trace: [ 77.363249] dump_stack+0x97/0xe0 [ 77.363603] print_address_description.constprop.0+0x1d/0x220 [ 77.364251] ? bpf_skb_load_helper_8_no_cache+0x71/0x130 [ 77.365030] ? bpf_skb_load_helper_8_no_cache+0x71/0x130 [ 77.365860] __kasan_report.cold+0x37/0x7b [ 77.366365] ? bpf_skb_load_helper_8_no_cache+0x71/0x130 [ 77.366940] kasan_report+0xe/0x20 [ 77.367295] bpf_skb_load_helper_8_no_cache+0x71/0x130 [ 77.367821] ? bpf_skb_load_helper_8+0xf0/0xf0 [ 77.368278] ? mark_lock+0xa3/0x9b0 [ 77.368641] ? kvm_sched_clock_read+0x14/0x30 [ 77.369096] ? sched_clock+0x5/0x10 [ 77.369460] ? sched_clock_cpu+0x18/0x110 [ 77.369876] ? bpf_skb_load_helper_8+0xf0/0xf0 [ 77.370330] ___bpf_prog_run+0x16c0/0x28f0 [ 77.370755] __bpf_prog_run32+0x83/0xc0 [ 77.371153] ? __bpf_prog_run64+0xc0/0xc0 [ 77.371568] ? match_held_lock+0x1b/0x230 [ 77.371984] ? rcu_read_lock_held+0xa1/0xb0 [ 77.372416] ? rcu_is_watching+0x34/0x50 [ 77.372826] sk_filter_trim_cap+0x17c/0x4d0 [ 77.373259] ? sock_kzfree_s+0x40/0x40 [ 77.373648] ? __get_filter+0x150/0x150 [ 77.374059] ? skb_copy_datagram_from_iter+0x80/0x280 [ 77.374581] ? do_raw_spin_unlock+0xa5/0x140 [ 77.375025] unix_dgram_sendmsg+0x33a/0xa70 [ 77.375459] ? do_raw_spin_lock+0x1d0/0x1d0 [ 77.375893] ? unix_peer_get+0xa0/0xa0 [ 77.376287] ? __fget_light+0xa4/0xf0 [ 77.376670] __sys_sendto+0x265/0x280 [ 77.377056] ? __ia32_sys_getpeername+0x50/0x50 [ 77.377523] ? lock_downgrade+0x350/0x350 [ 77.377940] ? __sys_setsockopt+0x2a6/0x2c0 [ 77.378374] ? sock_read_iter+0x240/0x240 [ 77.378789] ? __sys_socketpair+0x22a/0x300 [ 77.379221] ? __ia32_sys_socket+0x50/0x50 [ 77.379649] ? mark_held_locks+0x1d/0x90 [ 77.380059] ? trace_hardirqs_on_thunk+0x1a/0x1c [ 77.380536] __x64_sys_sendto+0x74/0x90 [ 77.380938] do_syscall_64+0x68/0x2a0 [ 77.381324] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 77.381878] RIP: 0033:0x44c070 [...] After further debugging, turns out while in case of other helper functions we disallow passing modified ctx, the special case of ld/abs/ind instruction which has similar semantics (except r6 being the ctx argument) is missing such check. Modified ctx is impossible here as bpf_skb_load_helper_8_no_cache() and others are expecting skb fields in original position, hence, add check_ctx_reg() to reject any modified ctx. Issue was first introduced back in `f1174f77b5` ("bpf/verifier: rework value tracking"). Fixes: `f1174f77b5` ("bpf/verifier: rework value tracking") Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200106215157.3553-1-daniel@iogearbox.net	2020-01-06 14:19:47 -08:00
David S. Miller	d76063c506	Merge branch 'atlantic-bugfixes' Igor Russkikh says: ==================== Aquantia/Marvell atlantic bugfixes 2020/01 Here is a set of recently discovered bugfixes, ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-06 14:06:11 -08:00
Igor Russkikh	b585f8602a	net: atlantic: remove duplicate entries Function entries were duplicated accidentally, removing the dups. Fixes: `ea4b4d7fc1` ("net: atlantic: loopback tests via private flags") Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-06 14:06:11 -08:00
Igor Russkikh	883daa1854	net: atlantic: loopback configuration in improper place Initial loopback configuration should be called earlier, before starting traffic on HW blocks. Otherwise depending on race conditions it could be kept disabled. Fixes: `ea4b4d7fc1` ("net: atlantic: loopback tests via private flags") Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-06 14:06:11 -08:00
Igor Russkikh	ac70957ee1	net: atlantic: broken link status on old fw Last code/checkpatch cleanup did a copy paste error where code from firmware 3 API logic was moved to firmware 1 logic. This resulted in FW1.x users would never see the link state as active. Fixes: `7b0c342f1f` ("net: atlantic: code style cleanup") Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-06 14:06:11 -08:00
Roman Gushchin	e10360f815	bpf: cgroup: prevent out-of-order release of cgroup bpf Before commit `4bfc0bb2c6` ("bpf: decouple the lifetime of cgroup_bpf from cgroup itself") cgroup bpf structures were released with corresponding cgroup structures. It guaranteed the hierarchical order of destruction: children were always first. It preserved attached programs from being released before their propagated copies. But with cgroup auto-detachment there are no such guarantees anymore: cgroup bpf is released as soon as the cgroup is offline and there are no live associated sockets. It means that an attached program can be detached and released, while its propagated copy is still living in the cgroup subtree. This will obviously lead to an use-after-free bug. To reproduce the issue the following script can be used: #!/bin/bash CGROOT=/sys/fs/cgroup mkdir -p ${CGROOT}/A ${CGROOT}/B ${CGROOT}/A/C sleep 1 ./test_cgrp2_attach ${CGROOT}/A egress & A_PID=$! ./test_cgrp2_attach ${CGROOT}/B egress & B_PID=$! echo $$ > ${CGROOT}/A/C/cgroup.procs iperf -s & S_PID=$! iperf -c localhost -t 100 & C_PID=$! sleep 1 echo $$ > ${CGROOT}/B/cgroup.procs echo ${S_PID} > ${CGROOT}/B/cgroup.procs echo ${C_PID} > ${CGROOT}/B/cgroup.procs sleep 1 rmdir ${CGROOT}/A/C rmdir ${CGROOT}/A sleep 1 kill -9 ${S_PID} ${C_PID} ${A_PID} ${B_PID} On the unpatched kernel the following stacktrace can be obtained: [ 33.619799] BUG: unable to handle page fault for address: ffffbdb4801ab002 [ 33.620677] #PF: supervisor read access in kernel mode [ 33.621293] #PF: error_code(0x0000) - not-present page [ 33.622754] Oops: 0000 [#1] SMP NOPTI [ 33.623202] CPU: 0 PID: 601 Comm: iperf Not tainted 5.5.0-rc2+ #23 [ 33.625545] RIP: 0010:__cgroup_bpf_run_filter_skb+0x29f/0x3d0 [ 33.635809] Call Trace: [ 33.636118] ? __cgroup_bpf_run_filter_skb+0x2bf/0x3d0 [ 33.636728] ? __switch_to_asm+0x40/0x70 [ 33.637196] ip_finish_output+0x68/0xa0 [ 33.637654] ip_output+0x76/0xf0 [ 33.638046] ? __ip_finish_output+0x1c0/0x1c0 [ 33.638576] __ip_queue_xmit+0x157/0x410 [ 33.639049] __tcp_transmit_skb+0x535/0xaf0 [ 33.639557] tcp_write_xmit+0x378/0x1190 [ 33.640049] ? _copy_from_iter_full+0x8d/0x260 [ 33.640592] tcp_sendmsg_locked+0x2a2/0xdc0 [ 33.641098] ? sock_has_perm+0x10/0xa0 [ 33.641574] tcp_sendmsg+0x28/0x40 [ 33.641985] sock_sendmsg+0x57/0x60 [ 33.642411] sock_write_iter+0x97/0x100 [ 33.642876] new_sync_write+0x1b6/0x1d0 [ 33.643339] vfs_write+0xb6/0x1a0 [ 33.643752] ksys_write+0xa7/0xe0 [ 33.644156] do_syscall_64+0x5b/0x1b0 [ 33.644605] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fix this by grabbing a reference to the bpf structure of each ancestor on the initialization of the cgroup bpf structure, and dropping the reference at the end of releasing the cgroup bpf structure. This will restore the hierarchical order of cgroup bpf releasing, without adding any operations on hot paths. Thanks to Josef Bacik for the debugging and the initial analysis of the problem. Fixes: `4bfc0bb2c6` ("bpf: decouple the lifetime of cgroup_bpf from cgroup itself") Reported-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-01-06 14:00:30 -08:00
Vikas Gupta	4012a6f2fa	firmware: tee_bnxt: Fix multiple call to tee_client_close_context Fix calling multiple tee_client_close_context in case of shm allocation fails. Fixes: `246880958a` (“firmware: broadcom: add OP-TEE based BNXT f/w manager”) Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-06 13:51:37 -08:00
Andrew Lunn	d8dc2c9676	net: dsa: mv88e6xxx: Preserve priority when setting CPU port. The 6390 family uses an extended register to set the port connected to the CPU. The lower 5 bits indicate the port, the upper three bits are the priority of the frames as they pass through the switch, what egress queue they should use, etc. Since frames being set to the CPU are typically management frames, BPDU, IGMP, ARP, etc set the priority to 7, the reset default, and the highest. Fixes: `33641994a6` ("net: dsa: mv88e6xxx: Monitor and Management tables") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Chris Healy <cphealy@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-06 13:35:11 -08:00
Krzysztof Kozlowski	5adcb8b186	net: ethernet: sxgbe: Rename Samsung to lowercase Fix up inconsistent usage of upper and lowercase letters in "Samsung" name. "SAMSUNG" is not an abbreviation but a regular trademarked name. Therefore it should be written with lowercase letters starting with capital letter. Although advertisement materials usually use uppercase "SAMSUNG", the lowercase version is used in all legal aspects (e.g. on Wikipedia and in privacy/legal statements on https://www.samsung.com/semiconductor/privacy-global/). Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-06 13:33:14 -08:00
Krzysztof Kozlowski	00c0688cec	net: wan: sdla: Fix cast from pointer to integer of different size Since net_device.mem_start is unsigned long, it should not be cast to int right before casting to pointer. This fixes warning (compile testing on alpha architecture): drivers/net/wan/sdla.c: In function ‘sdla_transmit’: drivers/net/wan/sdla.c:711:13: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast] Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-06 13:30:03 -08:00
Xin Long	be7a772920	sctp: free cmd->obj.chunk for the unprocessed SCTP_CMD_REPLY This patch is to fix a memleak caused by no place to free cmd->obj.chunk for the unprocessed SCTP_CMD_REPLY. This issue occurs when failing to process a cmd while there're still SCTP_CMD_REPLY cmds on the cmd seq with an allocated chunk in cmd->obj.chunk. So fix it by freeing cmd->obj.chunk for each SCTP_CMD_REPLY cmd left on the cmd seq when any cmd returns error. While at it, also remove 'nomem' label. Reported-by: syzbot+107c4aff5f392bf1517f@syzkaller.appspotmail.com Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-06 13:28:37 -08:00
Ying Xue	a7869e5f91	tipc: eliminate KMSAN: uninit-value in __tipc_nl_compat_dumpit error syzbot found the following crash on: ===================================================== BUG: KMSAN: uninit-value in __nlmsg_parse include/net/netlink.h:661 [inline] BUG: KMSAN: uninit-value in nlmsg_parse_deprecated include/net/netlink.h:706 [inline] BUG: KMSAN: uninit-value in __tipc_nl_compat_dumpit+0x553/0x11e0 net/tipc/netlink_compat.c:215 CPU: 0 PID: 12425 Comm: syz-executor062 Not tainted 5.5.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x220 lib/dump_stack.c:118 kmsan_report+0x128/0x220 mm/kmsan/kmsan_report.c:108 __msan_warning+0x57/0xa0 mm/kmsan/kmsan_instr.c:245 __nlmsg_parse include/net/netlink.h:661 [inline] nlmsg_parse_deprecated include/net/netlink.h:706 [inline] __tipc_nl_compat_dumpit+0x553/0x11e0 net/tipc/netlink_compat.c:215 tipc_nl_compat_dumpit+0x761/0x910 net/tipc/netlink_compat.c:308 tipc_nl_compat_handle net/tipc/netlink_compat.c:1252 [inline] tipc_nl_compat_recv+0x12e9/0x2870 net/tipc/netlink_compat.c:1311 genl_family_rcv_msg_doit net/netlink/genetlink.c:672 [inline] genl_family_rcv_msg net/netlink/genetlink.c:717 [inline] genl_rcv_msg+0x1dd0/0x23a0 net/netlink/genetlink.c:734 netlink_rcv_skb+0x431/0x620 net/netlink/af_netlink.c:2477 genl_rcv+0x63/0x80 net/netlink/genetlink.c:745 netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline] netlink_unicast+0xfa0/0x1100 net/netlink/af_netlink.c:1328 netlink_sendmsg+0x11f0/0x1480 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:639 [inline] sock_sendmsg net/socket.c:659 [inline] ____sys_sendmsg+0x1362/0x13f0 net/socket.c:2330 ___sys_sendmsg net/socket.c:2384 [inline] __sys_sendmsg+0x4f0/0x5e0 net/socket.c:2417 __do_sys_sendmsg net/socket.c:2426 [inline] __se_sys_sendmsg+0x97/0xb0 net/socket.c:2424 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2424 do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:295 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x444179 Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 1b d8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007ffd2d6409c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00000000004002e0 RCX: 0000000000444179 RDX: 0000000000000000 RSI: 0000000020000140 RDI: 0000000000000003 RBP: 00000000006ce018 R08: 0000000000000000 R09: 00000000004002e0 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000401e20 R13: 0000000000401eb0 R14: 0000000000000000 R15: 0000000000000000 Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:149 [inline] kmsan_internal_poison_shadow+0x5c/0x110 mm/kmsan/kmsan.c:132 kmsan_slab_alloc+0x8a/0xe0 mm/kmsan/kmsan_hooks.c:86 slab_alloc_node mm/slub.c:2774 [inline] __kmalloc_node_track_caller+0xe47/0x11f0 mm/slub.c:4382 __kmalloc_reserve net/core/skbuff.c:141 [inline] __alloc_skb+0x309/0xa50 net/core/skbuff.c:209 alloc_skb include/linux/skbuff.h:1049 [inline] nlmsg_new include/net/netlink.h:888 [inline] tipc_nl_compat_dumpit+0x6e4/0x910 net/tipc/netlink_compat.c:301 tipc_nl_compat_handle net/tipc/netlink_compat.c:1252 [inline] tipc_nl_compat_recv+0x12e9/0x2870 net/tipc/netlink_compat.c:1311 genl_family_rcv_msg_doit net/netlink/genetlink.c:672 [inline] genl_family_rcv_msg net/netlink/genetlink.c:717 [inline] genl_rcv_msg+0x1dd0/0x23a0 net/netlink/genetlink.c:734 netlink_rcv_skb+0x431/0x620 net/netlink/af_netlink.c:2477 genl_rcv+0x63/0x80 net/netlink/genetlink.c:745 netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline] netlink_unicast+0xfa0/0x1100 net/netlink/af_netlink.c:1328 netlink_sendmsg+0x11f0/0x1480 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:639 [inline] sock_sendmsg net/socket.c:659 [inline] ____sys_sendmsg+0x1362/0x13f0 net/socket.c:2330 ___sys_sendmsg net/socket.c:2384 [inline] __sys_sendmsg+0x4f0/0x5e0 net/socket.c:2417 __do_sys_sendmsg net/socket.c:2426 [inline] __se_sys_sendmsg+0x97/0xb0 net/socket.c:2424 __x64_sys_sendmsg+0x4a/0x70 net/socket.c:2424 do_syscall_64+0xb6/0x160 arch/x86/entry/common.c:295 entry_SYSCALL_64_after_hwframe+0x44/0xa9 ===================================================== The complaint above occurred because the memory region pointed by attrbuf variable was not initialized. To eliminate this warning, we use kcalloc() rather than kmalloc_array() to allocate memory for attrbuf. Reported-by: syzbot+b1fd2bf2c89d8407e15f@syzkaller.appspotmail.com Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-06 13:24:31 -08:00
Stephen Boyd	d89091a493	macb: Don't unregister clks unconditionally The only clk init function in this driver that register a clk is fu540_c000_clk_init(), and thus we need to unregister the clk when this driver is removed on that platform. Other init functions, for example macb_clk_init(), don't register clks and therefore we shouldn't unregister the clks when this driver is removed. Convert this registration path to devm so it gets auto-unregistered when this driver is removed and drop the clk_unregister() calls in driver remove (and error paths) so that we don't erroneously remove a clk from the system that isn't registered by this driver. Otherwise we get strange crashes with a use-after-free when the devm_clk_get() call in macb_clk_init() calls clk_put() on a clk pointer that has become invalid because it is freed in clk_unregister(). Cc: Nicolas Ferre <nicolas.ferre@microchip.com> Cc: Yash Shah <yash.shah@sifive.com> Reported-by: Guenter Roeck <linux@roeck-us.net> Fixes: `c218ad5590` ("macb: Add support for SiFive FU540-C000") Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-05 15:09:58 -08:00
Krzysztof Kozlowski	15a821f050	MAINTAINERS: Drop obsolete entries from Samsung sxgbe ethernet driver The emails to ks.giri@samsung.com and vipul.pandya@samsung.com bounce with 550 error code: host mailin.samsung.com[203.254.224.12] said: 550 5.1.1 Recipient address rejected: User unknown (in reply to RCPT TO command)" Drop Girish K S and Vipul Pandya from sxgbe maintainers entry. Cc: Byungho An <bh74.an@samsung.com> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-05 14:49:31 -08:00
Carl Huang	ce57785bf9	net: qrtr: fix len of skb_put_padto in qrtr_node_enqueue The len used for skb_put_padto is wrong, it need to add len of hdr. In qrtr_node_enqueue, local variable size_t len is assign with skb->len, then skb_push(skb, sizeof(hdr)) will add skb->len with sizeof(hdr), so local variable size_t len is not same with skb->len after skb_push(skb, sizeof(hdr)). Then the purpose of skb_put_padto(skb, ALIGN(len, 4)) is to add add pad to the end of the skb's data if skb->len is not aligned to 4, but unfortunately it use len instead of skb->len, at this line, skb->len is 32 bytes(sizeof(hdr)) more than len, for example, len is 3 bytes, then skb->len is 35 bytes(3 + 32), and ALIGN(len, 4) is 4 bytes, so __skb_put_padto will do nothing after check size(35) < len(4), the correct value should be 36(sizeof(hdr) + ALIGN(len, 4) = 32 + 4), then __skb_put_padto will pass check size(35) < len(36) and add 1 byte to the end of skb's data, then logic is correct. function of skb_push: void skb_push(struct sk_buff skb, unsigned int len) { skb->data -= len; skb->len += len; if (unlikely(skb->data < skb->head)) skb_under_panic(skb, len, __builtin_return_address(0)); return skb->data; } function of skb_put_padto static inline int skb_put_padto(struct sk_buff skb, unsigned int len) { return __skb_put_padto(skb, len, true); } function of __skb_put_padto static inline int __skb_put_padto(struct sk_buff *skb, unsigned int len, bool free_on_error) { unsigned int size = skb->len; if (unlikely(size < len)) { len -= size; if (__skb_pad(skb, len, free_on_error)) return -ENOMEM; __skb_put(skb, len); } return 0; } Signed-off-by: Carl Huang <cjhuang@codeaurora.org> Signed-off-by: Wen Gong <wgong@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-05 14:46:05 -08:00
Fenghua Yu	f11421ba4a	drivers/net/b44: Change to non-atomic bit operations on pwol_mask Atomic operations that span cache lines are super-expensive on x86 (not just to the current processor, but also to other processes as all memory operations are blocked until the operation completes). Upcoming x86 processors have a switch to cause such operations to generate a #AC trap. It is expected that some real time systems will enable this mode in BIOS. In preparation for this, it is necessary to fix code that may execute atomic instructions with operands that cross cachelines because the #AC trap will crash the kernel. Since "pwol_mask" is local and never exposed to concurrency, there is no need to set bits in pwol_mask using atomic operations. Directly operate on the byte which contains the bit instead of using __set_bit() to avoid any big endian concern due to type cast to unsigned long in __set_bit(). Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-05 14:21:23 -08:00
Liran Alon	b54ef37b1c	net: Google gve: Remove dma_wmb() before ringing doorbell Current code use dma_wmb() to ensure Rx/Tx descriptors are visible to device before writing to doorbell. However, these dma_wmb() are wrong and unnecessary. Therefore, they should be removed. iowrite32be() called from gve_rx_write_doorbell()/gve_tx_put_doorbell() should guaratee that all previous writes to WB/UC memory is visible to device before the write done by iowrite32be(). E.g. On ARM64, iowrite32be() calls __iowmb() which expands to dma_wmb() and only then calls __raw_writel(). Reviewed-by: Si-Wei Liu <si-wei.liu@oracle.com> Signed-off-by: Liran Alon <liran.alon@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-03 12:40:53 -08:00
Russell King	eed70fd945	net: phylink: fix failure to register on x86 systems The kernel test robot reports a boot failure with qemu in 5.5-rc, referencing commit `2203cbf2c8` ("net: sfp: move fwnode parsing into sfp-bus layer"). This is caused by phylink_create() being passed a NULL fwnode, causing fwnode_property_get_reference_args() to return -EINVAL. Don't attempt to attach to a SFP bus if we have no fwnode, which avoids this issue. Reported-by: kernel test robot <rong.a.chen@intel.com> Fixes: `2203cbf2c8` ("net: sfp: move fwnode parsing into sfp-bus layer") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-03 12:37:55 -08:00
Jesper Dangaard Brouer	e64b274c95	doc/net: Update git https URLs in netdev-FAQ documentation DaveM's git tree have been moved into a named subdir 'netdev' to deal with allowing Jakub Kicinski to help co-maintain the trees. Link: https://www.kernel.org/doc/html/latest/networking/netdev-FAQ.html Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-03 12:33:54 -08:00
Hangbin Liu	cc7e3f63d7	selftests: loopback.sh: skip this test if the driver does not support The loopback feature is only supported on a few drivers like broadcom, mellanox, etc. The default veth driver has not supported it yet. To avoid returning failed and making the runner feel confused, let's just skip the test on drivers that not support loopback. Fixes: `ad11340994` ("selftests: Add loopback test") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-03 12:23:34 -08:00
David S. Miller	cd82dbf0d3	net: Update GIT url in maintainers. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-02 17:45:34 -08:00
David S. Miller	542d0f607e	linux-can-fixes-for-5.5-20200102 -----BEGIN PGP SIGNATURE----- iQFHBAABCgAxFiEEmvEkXzgOfc881GuFWsYho5HknSAFAl4OCPwTHG1rbEBwZW5n dXRyb25peC5kZQAKCRBaxiGjkeSdIMnWCACpMWqGPtvJPCDyCSqge5ncoWYIIzGX nncH134TgBpkViYMybYBdHet7RUptJ5ItKVMCYvE9gmK11D1aZ84ylVll8dyz3od ce9Y1+GK74bF1GXP5DJa+AbeLqFoW6X+iJPUpupCC3VnEnJ418f5R2RoS7LEnlqW 6pxZsylbULlcSxHxuU9Hii5zNtNSrXRZhSfTUsou5bNp3+65XCJ3JVPFc8Kg4iRw ZrlC2fOKTcDDx53UO/OhPIkfwir9WEHJIVWWw+bm5+yqz8gtdC3hlFXSwK+E0Nuv 5ZQ9Q3adj0xNMRwapFk46GAhOJTPTu5dZm5504AETuFMCSKDRUmVufiU =faNF -----END PGP SIGNATURE----- Merge tag 'linux-can-fixes-for-5.5-20200102' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2020-01-02 this is a pull request of 9 patches for net/master. The first 5 patches target all the tcan4x5x driver. The first 3 patches of them are by Dan Murphy and Sean Nyekjaer and improve the device initialization (power on, reset and get device out of standby before register access). The next patch is by Dan Murphy and disables the INH pin device-state if the GPIO is unavailable. The last patch for the tcan4x5x driver is by Gustavo A. R. Silva and fixes an inconsistent PTR_ERR check in the tcan4x5x_parse_config() function. The next patch is by Oliver Hartkopp and targets the generic CAN device infrastructure. It ensures that an initialized headroom in outgoing CAN sk_buffs (e.g. if injected by AF_PACKET). The last 2 patches are by Johan Hovold and fix the kvaser_usb and gs_usb drivers by always using the current alternate setting not blindly the first one. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-02 16:40:51 -08:00
Andrew Lunn	c72a0bc0aa	net: freescale: fec: Fix ethtool -d runtime PM In order to dump the FECs registers the clocks have to be ticking, otherwise a data abort occurs. Add calls to runtime PM so they are enabled and later disabled. Fixes: `e8fcfcd568` ("net: fec: optimize the clock management to save power") Reported-by: Chris Healy <Chris.Healy@zii.aero> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-02 16:38:09 -08:00
Hangbin Liu	71130f2997	vxlan: fix tos value before xmit Before ip_tunnel_ecn_encap() and udp_tunnel_xmit_skb() we should filter tos value by RT_TOS() instead of using config tos directly. vxlan_get_route() would filter the tos to fl4.flowi4_tos but we didn't return it back, as geneve_get_v4_rt() did. So we have to use RT_TOS() directly in function ip_tunnel_ecn_encap(). Fixes: `206aaafcd2` ("VXLAN: Use IP Tunnels tunnel ENC encap API") Fixes: `1400615d64` ("vxlan: allow setting ipv6 traffic class") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-02 16:35:48 -08:00
Wen Yang	68aab823c2	sch_cake: avoid possible divide by zero in cake_enqueue() The variables 'window_interval' is u64 and do_div() truncates it to 32 bits, which means it can test non-zero and be truncated to zero for division. The unit of window_interval is nanoseconds, so its lower 32-bit is relatively easy to exceed. Fix this issue by using div64_u64() instead. Fixes: `7298de9cd7` ("sch_cake: Add ingress mode") Signed-off-by: Wen Yang <wenyang@linux.alibaba.com> Cc: Kevin Darbyshire-Bryant <ldir@darbyshire-bryant.me.uk> Cc: Toke Høiland-Jørgensen <toke@redhat.com> Cc: David S. Miller <davem@davemloft.net> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: cake@lists.bufferbloat.net Cc: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-02 16:34:28 -08:00
David S. Miller	d8513df259	net: Correct type of tcp_syncookies sysctl. It can take on the values of '0', '1', and '2' and thus is not a boolean. Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-02 16:07:38 -08:00
Pengcheng Yang	c9655008e7	tcp: fix "old stuff" D-SACK causing SACK to be treated as D-SACK When we receive a D-SACK, where the sequence number satisfies: undo_marker <= start_seq < end_seq <= prior_snd_una we consider this is a valid D-SACK and tcp_is_sackblock_valid() returns true, then this D-SACK is discarded as "old stuff", but the variable first_sack_index is not marked as negative in tcp_sacktag_write_queue(). If this D-SACK also carries a SACK that needs to be processed (for example, the previous SACK segment was lost), this SACK will be treated as a D-SACK in the following processing of tcp_sacktag_write_queue(), which will eventually lead to incorrect updates of undo_retrans and reordering. Fixes: `fd6dad616d` ("[TCP]: Earlier SACK block verification & simplify access to them") Signed-off-by: Pengcheng Yang <yangpc@wangsu.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-02 15:42:28 -08:00
Baruch Siach	f7a48b68ab	net: dsa: mv88e6xxx: force cmode write on 6141/6341 mv88e6xxx_port_set_cmode() relies on cmode stored in struct mv88e6xxx_port to skip cmode update when the requested value matches the cached value. It turns out that mv88e6xxx_port_hidden_write() might change the port cmode setting as a side effect, so we can't rely on the cached value to determine that cmode update in not necessary. Force cmode update in mv88e6341_port_set_cmode(), to make serdes configuration work again. Other mv88e6xxx_port_set_cmode() callers keep the current behaviour. This fixes serdes configuration of the 6141 switch on SolidRun Clearfog GT-8K. Fixes: `7a3007d22e` ("net: dsa: mv88e6xxx: fully support SERDES on Topaz family") Reported-by: Denis Odintsov <d.odintsov@traviangames.com> Signed-off-by: Baruch Siach <baruch@tkos.co.il> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-02 15:30:48 -08:00
Florian Faber	2d77bd61a2	can: mscan: mscan_rx_poll(): fix rx path lockup when returning from polling to irq mode Under load, the RX side of the mscan driver can get stuck while TX still works. Restarting the interface locks up the system. This behaviour could be reproduced reliably on a MPC5121e based system. The patch fixes the return value of the NAPI polling function (should be the number of processed packets, not constant 1) and the condition under which IRQs are enabled again after polling is finished. With this patch, no more lockups were observed over a test period of ten days. Fixes: `afa17a500a` ("net/can: add driver for mscan family & mpc52xx_mscan") Signed-off-by: Florian Faber <faber@faberman.de> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2020-01-02 15:34:27 +01:00
Johan Hovold	2f361cd947	can: gs_usb: gs_usb_probe(): use descriptors of current altsetting Make sure to always use the descriptors of the current alternate setting to avoid future issues when accessing fields that may differ between settings. Signed-off-by: Johan Hovold <johan@kernel.org> Fixes: `d08e973a77` ("can: gs_usb: Added support for the GS_USB CAN devices") Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2020-01-02 15:34:27 +01:00
Johan Hovold	5660493c63	can: kvaser_usb: fix interface sanity check Make sure to use the current alternate setting when verifying the interface descriptors to avoid binding to an invalid interface. Failing to do so could cause the driver to misbehave or trigger a WARN() in usb_submit_urb() that kernels with panic_on_warn set would choke on. Fixes: `aec5fb2268` ("can: kvaser_usb: Add support for Kvaser USB hydra family") Cc: stable <stable@vger.kernel.org> # 4.19 Cc: Jimmy Assarsson <extja@kvaser.com> Cc: Christer Beskow <chbe@kvaser.com> Cc: Nicklas Johansson <extnj@kvaser.com> Cc: Martin Henriksson <mh@kvaser.com> Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2020-01-02 15:34:27 +01:00
Oliver Hartkopp	e7153bf70c	can: can_dropped_invalid_skb(): ensure an initialized headroom in outgoing CAN sk_buffs KMSAN sysbot detected a read access to an untinitialized value in the headroom of an outgoing CAN related sk_buff. When using CAN sockets this area is filled appropriately - but when using a packet socket this initialization is missing. The problematic read access occurs in the CAN receive path which can only be triggered when the sk_buff is sent through a (virtual) CAN interface. So we check in the sending path whether we need to perform the missing initializations. Fixes: `d3b58c47d3` ("can: replace timestamp as unique skb attribute") Reported-by: syzbot+b02ff0707a97e4e79ebb@syzkaller.appspotmail.com Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Tested-by: Oliver Hartkopp <socketcan@hartkopp.net> Cc: linux-stable <stable@vger.kernel.org> # >= v4.1 Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2020-01-02 15:34:27 +01:00

1 2 3 4 5 ...

887980 Commits