There is a deadlock in rr_close(), which is shown below:
(Thread 1) | (Thread 2)
| rr_open()
rr_close() | add_timer()
spin_lock_irqsave() //(1) | (wait a time)
... | rr_timer()
del_timer_sync() | spin_lock_irqsave() //(2)
(wait timer to stop) | ...
We hold rrpriv->lock in position (1) of thread 1 and
use del_timer_sync() to wait timer to stop, but timer handler
also need rrpriv->lock in position (2) of thread 2.
As a result, rr_close() will block forever.
This patch extracts del_timer_sync() from the protection of
spin_lock_irqsave(), which could let timer handler to obtain
the needed lock.
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Link: https://lore.kernel.org/r/20220417125519.82618-1-duoming@zju.edu.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The test verifies that packets are correctly flooded by the bridge and
the VXLAN device by matching on the encapsulated packets at the other
end. However, if packets other than those generated by the test also
ingress the bridge (e.g., MLD packets), they will be flooded as well and
interfere with the expected count.
Make the test more robust by making sure that only the packets generated
by the test can ingress the bridge. Drop all the rest using tc filters
on the egress of 'br0' and 'h1'.
In the software data path, the problem can be solved by matching on the
inner destination MAC or dropping unwanted packets at the egress of the
VXLAN device, but this is not currently supported by mlxsw.
Fixes: d01724dd2a ("selftests: mlxsw: spectrum-2: Add a test for VxLAN flooding with IPv6")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The test verifies that packets are correctly flooded by the bridge and
the VXLAN device by matching on the encapsulated packets at the other
end. However, if packets other than those generated by the test also
ingress the bridge (e.g., MLD packets), they will be flooded as well and
interfere with the expected count.
Make the test more robust by making sure that only the packets generated
by the test can ingress the bridge. Drop all the rest using tc filters
on the egress of 'br0' and 'h1'.
In the software data path, the problem can be solved by matching on the
inner destination MAC or dropping unwanted packets at the egress of the
VXLAN device, but this is not currently supported by mlxsw.
Fixes: 94d302deae ("selftests: mlxsw: Add a test for VxLAN flooding")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a Bug section, indicating preferred mailing method for bug reports,
to NFC Subsystem entry.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The init_systime() may be invoked in atomic state. We have observed the
following call trace when running "phc_ctl /dev/ptp0 set" on a Intel
Agilex board.
BUG: sleeping function called from invalid context at drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c:74
in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 381, name: phc_ctl
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
Preemption disabled at:
[<ffff80000892ef78>] stmmac_set_time+0x34/0x8c
CPU: 2 PID: 381 Comm: phc_ctl Not tainted 5.18.0-rc2-next-20220414-yocto-standard+ #567
Hardware name: SoCFPGA Agilex SoCDK (DT)
Call trace:
dump_backtrace.part.0+0xc4/0xd0
show_stack+0x24/0x40
dump_stack_lvl+0x7c/0xa0
dump_stack+0x18/0x34
__might_resched+0x154/0x1c0
__might_sleep+0x58/0x90
init_systime+0x78/0x120
stmmac_set_time+0x64/0x8c
ptp_clock_settime+0x60/0x9c
pc_clock_settime+0x6c/0xc0
__arm64_sys_clock_settime+0x88/0xf0
invoke_syscall+0x5c/0x130
el0_svc_common.constprop.0+0x4c/0x100
do_el0_svc+0x7c/0xa0
el0_svc+0x58/0xcc
el0t_64_sync_handler+0xa4/0x130
el0t_64_sync+0x18c/0x190
So we should use readl_poll_timeout_atomic() here instead of
readl_poll_timeout().
Also adjust the delay time to 10us to fix a "__bad_udelay" build error
reported by "kernel test robot <lkp@intel.com>". I have tested this on
Intel Agilex and NXP S32G boards, there is no delay needed at all.
So the 10us delay should be long enough for most cases.
Fixes: ff8ed73786 ("net: stmmac: use readl_poll_timeout() function in init_systime()")
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Let's describe this sysctl.
Fixes: 5cbf777cfd ("route: add support for directed broadcast forwarding")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the user runs:
bridge link set dev $br_port mcast_flood on
this command should affect not only L2 multicast, but also IPv4 and IPv6
multicast.
In the Ocelot switch, unknown multicast gets flooded according to
different PGIDs according to its type, and PGID_MC only handles L2
multicast. Therefore, by leaving PGID_MCIPV4 and PGID_MCIPV6 at their
default value of 0, unknown IP multicast traffic is never flooded.
Fixes: 421741ea56 ("net: mscc: ocelot: offload bridge port flags to device")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220415151950.219660-1-vladimir.oltean@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
In case the checksum calculation is offloaded to the DSA master network
interface, it will include the switch trailing tag. As soon as the switch strips
that tag on egress, the calculated checksum is wrong.
Therefore, add the checksum calculation to the tagger (if required) before
adding the switch tag. This way, the hellcreek code works with all DSA master
interfaces regardless of their declared feature set.
Fixes: 01ef09caad ("net: dsa: Add tag handling for Hirschmann Hellcreek switches")
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220415103320.90657-1-kurt@linutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This will reset deeply on freeze and thaw instead of suspend and
resume and prevent null pointer dereferences of the uninitialized ring
0 buffer while thawing.
The impact is an indefinitely hanging kernel. You can't switch
consoles after this and the only possible user interaction is SysRq.
BUG: kernel NULL pointer dereference
RIP: 0010:aq_ring_rx_fill+0xcf/0x210 [atlantic]
aq_vec_init+0x85/0xe0 [atlantic]
aq_nic_init+0xf7/0x1d0 [atlantic]
atl_resume_common+0x4f/0x100 [atlantic]
pci_pm_thaw+0x42/0xa0
resolves in aq_ring.o to
```
0000000000000ae0 <aq_ring_rx_fill>:
{
/* ... */
baf: 48 8b 43 08 mov 0x8(%rbx),%rax
buff->flags = 0U; /* buff is NULL */
```
The bug has been present since the introduction of the new pm code in
8aaa112a57 ("net: atlantic: refactoring pm logic") and was hidden
until 8ce8427169 ("net: atlantic: changes for multi-TC support"),
which refactored the aq_vec_{free,alloc} functions into
aq_vec_{,ring}_{free,alloc}, but is technically not wrong. The
original functions just always reinitialized the buffers on S3/S4. If
the interface is down before freezing, the bug does not occur. It does
not matter, whether the initrd contains and loads the module before
thawing.
So the fix is to invert the boolean parameter deep in all pm function
calls, which was clearly intended to be set like that.
First report was on Github [1], which you have to guess from the
resume logs in the posted dmesg snippet. Recently I posted one on
Bugzilla [2], since I did not have an AQC device so far.
#regzbot introduced: 8ce8427169
#regzbot from: koo5 <kolman.jindrich@gmail.com>
#regzbot monitor: https://github.com/Aquantia/AQtion/issues/32
Fixes: 8aaa112a57 ("net: atlantic: refactoring pm logic")
Link: https://github.com/Aquantia/AQtion/issues/32 [1]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215798 [2]
Cc: stable@vger.kernel.org
Reported-by: koo5 <kolman.jindrich@gmail.com>
Signed-off-by: Manuel Ullmann <labre@posteo.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEEBsvAIBsPu6mG7thcrX5LkNig010FAmJcMUsTHG1rbEBwZW5n
dXRyb25peC5kZQAKCRCtfkuQ2KDTXf+UB/9DyZD5tzOgeHEj/BHyN15TQ/KyABou
fifW4Al5awFaoNG4gAmKfBiSqI0H4pejhIIB76t5xCfdftEiW6VbLDQz048oOezK
olAW71Te0ESHP57ZOPsGPb2Im/qIl0b1nymatgse6XSsDzMupQiDT+BD79BTq8re
sO/IzAky6L1nnaxfG2GrJMHMqsttyKQtYLJReDRTMaskI11ya9TjaJjI/uYOpm6F
jCNTv0M/rEjDal85rN4m1E6NIh5E0qhaPAR5O+olKCfuVmneIl5YESnOIovA/19M
0KPcOAT1rCoKaH7K94IHHJ9XG2bO1P5GTtHu3eUkjLwtEusMpHjfOMME
=Andf
-----END PGP SIGNATURE-----
Merge tag 'linux-can-fixes-for-5.18-20220417' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2022-04-17
this is a pull request of 1 patch for net/master.
The patch is by Oliver Hartkopp and fixes a timeout monitoring problem
in the ISO TP protocol found by the syzbot.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The first attempt to fix a the 'impossible' WARN_ON_ONCE(1) in
isotp_tx_timer_handler() focussed on the identical CAN IDs created by
the syzbot reproducer and lead to upstream fix/commit 3ea566422c
("can: isotp: sanitize CAN ID checks in isotp_bind()"). But this did
not catch the root cause of the wrong tx.state in the tx_timer handler.
In the isotp 'first frame' case a timeout monitoring needs to be started
before the 'first frame' is send. But when this sending failed the timeout
monitoring for this specific frame has to be disabled too.
Otherwise the tx_timer is fired with the 'warn me' tx.state of ISOTP_IDLE.
Fixes: e057dd3fc2 ("can: add ISO 15765-2:2016 transport protocol")
Link: https://lore.kernel.org/all/20220405175112.2682-1-socketcan@hartkopp.net
Reported-by: syzbot+2339c27f5c66c652843e@syzkaller.appspotmail.com
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Commit b5f862180d was introduced to discard lowest hash bit for layer3+4 hashing
but it also removes last bit from non layer3+4 hashing
Below script shows layer2+3 hashing will result in same slave to be used with above commit.
$ cat hash.py
#/usr/bin/python3.6
h_dests=[0xa0, 0xa1]
h_source=0xe3
hproto=0x8
saddr=0x1e7aa8c0
daddr=0x17aa8c0
for h_dest in h_dests:
hash = (h_dest ^ h_source ^ hproto ^ saddr ^ daddr)
hash ^= hash >> 16
hash ^= hash >> 8
print(hash)
print("with last bit removed")
for h_dest in h_dests:
hash = (h_dest ^ h_source ^ hproto ^ saddr ^ daddr)
hash ^= hash >> 16
hash ^= hash >> 8
hash = hash >> 1
print(hash)
Output:
$ python3.6 hash.py
522133332
522133333 <-------------- will result in both slaves being used
with last bit removed
261066666
261066666 <-------------- only single slave used
Signed-off-by: suresh kumar <suresh2514@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reads and Writes to ip6_rt_gc_expire always have been racy,
as syzbot reported lately [1]
There is a possible risk of under-flow, leading
to unexpected high value passed to fib6_run_gc(),
although I have not observed this in the field.
Hosts hitting ip6_dst_gc() very hard are under pretty bad
state anyway.
[1]
BUG: KCSAN: data-race in ip6_dst_gc / ip6_dst_gc
read-write to 0xffff888102110744 of 4 bytes by task 13165 on cpu 1:
ip6_dst_gc+0x1f3/0x220 net/ipv6/route.c:3311
dst_alloc+0x9b/0x160 net/core/dst.c:86
ip6_dst_alloc net/ipv6/route.c:344 [inline]
icmp6_dst_alloc+0xb2/0x360 net/ipv6/route.c:3261
mld_sendpack+0x2b9/0x580 net/ipv6/mcast.c:1807
mld_send_cr net/ipv6/mcast.c:2119 [inline]
mld_ifc_work+0x576/0x800 net/ipv6/mcast.c:2651
process_one_work+0x3d3/0x720 kernel/workqueue.c:2289
worker_thread+0x618/0xa70 kernel/workqueue.c:2436
kthread+0x1a9/0x1e0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30
read-write to 0xffff888102110744 of 4 bytes by task 11607 on cpu 0:
ip6_dst_gc+0x1f3/0x220 net/ipv6/route.c:3311
dst_alloc+0x9b/0x160 net/core/dst.c:86
ip6_dst_alloc net/ipv6/route.c:344 [inline]
icmp6_dst_alloc+0xb2/0x360 net/ipv6/route.c:3261
mld_sendpack+0x2b9/0x580 net/ipv6/mcast.c:1807
mld_send_cr net/ipv6/mcast.c:2119 [inline]
mld_ifc_work+0x576/0x800 net/ipv6/mcast.c:2651
process_one_work+0x3d3/0x720 kernel/workqueue.c:2289
worker_thread+0x618/0xa70 kernel/workqueue.c:2436
kthread+0x1a9/0x1e0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30
value changed: 0x00000bb3 -> 0x00000ba9
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 11607 Comm: kworker/0:21 Not tainted 5.18.0-rc1-syzkaller-00037-g42e7a03d3bad-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: mld mld_ifc_work
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220413181333.649424-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David Ahern says:
====================
l3mdev: Fix ip tunnel case after recent l3mdev change
Second patch provides a fix for ip tunnels after the recent l3mdev change
that avoids touching the oif in the flow struct. First patch preemptively
provides a fix to an existing function that the second patch uses.
====================
Link: https://lore.kernel.org/r/20220413174320.28989-1-dsahern@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido reported that the commit referenced in the Fixes tag broke
a gre use case with dummy devices. Add a check to ip_tunnel_init_flow
to see if the oif is an l3mdev port and if so set the oif to 0 to
avoid the oif comparison in fib_lookup_good_nhc.
Fixes: 40867d74c3 ("net: Add l3mdev index to flow struct and avoid oif reset for port devices")
Reported-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet says:
====================
net/sched: two fixes for cls_u32
One syzbot report brought my attention to cls_u32.
This series addresses the syzbot report, and an additional
issue discovered in code review.
====================
Link: https://lore.kernel.org/r/20220413173542.533060-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
While investigating a related syzbot report,
I found that whenever call to tcf_exts_init()
from u32_init_knode() is failing, we end up
with an elevated refcount on ht->refcnt
To avoid that, only increase the refcount after
all possible errors have been evaluated.
Fixes: b9a24bb76b ("net_sched: properly handle failure case of tcf_exts_init()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2022-04-13
This series contains updates to igc and e1000e drivers.
Sasha removes waiting for hardware semaphore as it could cause an
infinite loop and changes usleep_range() calls done under atomic
context to udelay() for igc. For e1000e, he changes some variables from
u16 to u32 to prevent possible overflow of values.
Vinicius disables PTM when going to suspend as it is causing hang issues
on some platforms for igc.
* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
e1000e: Fix possible overflow in LTR decoding
igc: Fix suspending when PTM is active
igc: Fix BUG: scheduling while atomic
igc: Fix infinite loop in release_swfw_sync
====================
Link: https://lore.kernel.org/r/20220413170814.2066855-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The displayed list of Ethernet devices in make menuconfig
has gotten out of order. This is mostly due to changes in vendor
names etc, but also because of new Microsoft entry in wrong place.
This restores so that the display is in order even if the names
of the sub directories are not.
Fixes: ca9c54d2d6 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Given a sufficiently large number of actions, while copying and
reserving memory for a new action of a new flow, if next_offset is
greater than MAX_ACTIONS_BUFSIZE, the function reserve_sfa_size() does
not return -EMSGSIZE as expected, but it allocates MAX_ACTIONS_BUFSIZE
bytes increasing actions_len by req_size. This can then lead to an OOB
write access, especially when further actions need to be copied.
Fix it by rearranging the flow action size check.
KASAN splat below:
==================================================================
BUG: KASAN: slab-out-of-bounds in reserve_sfa_size+0x1ba/0x380 [openvswitch]
Write of size 65360 at addr ffff888147e4001c by task handler15/836
CPU: 1 PID: 836 Comm: handler15 Not tainted 5.18.0-rc1+ #27
...
Call Trace:
<TASK>
dump_stack_lvl+0x45/0x5a
print_report.cold+0x5e/0x5db
? __lock_text_start+0x8/0x8
? reserve_sfa_size+0x1ba/0x380 [openvswitch]
kasan_report+0xb5/0x130
? reserve_sfa_size+0x1ba/0x380 [openvswitch]
kasan_check_range+0xf5/0x1d0
memcpy+0x39/0x60
reserve_sfa_size+0x1ba/0x380 [openvswitch]
__add_action+0x24/0x120 [openvswitch]
ovs_nla_add_action+0xe/0x20 [openvswitch]
ovs_ct_copy_action+0x29d/0x1130 [openvswitch]
? __kernel_text_address+0xe/0x30
? unwind_get_return_address+0x56/0xa0
? create_prof_cpu_mask+0x20/0x20
? ovs_ct_verify+0xf0/0xf0 [openvswitch]
? prep_compound_page+0x198/0x2a0
? __kasan_check_byte+0x10/0x40
? kasan_unpoison+0x40/0x70
? ksize+0x44/0x60
? reserve_sfa_size+0x75/0x380 [openvswitch]
__ovs_nla_copy_actions+0xc26/0x2070 [openvswitch]
? __zone_watermark_ok+0x420/0x420
? validate_set.constprop.0+0xc90/0xc90 [openvswitch]
? __alloc_pages+0x1a9/0x3e0
? __alloc_pages_slowpath.constprop.0+0x1da0/0x1da0
? unwind_next_frame+0x991/0x1e40
? __mod_node_page_state+0x99/0x120
? __mod_lruvec_page_state+0x2e3/0x470
? __kasan_kmalloc_large+0x90/0xe0
ovs_nla_copy_actions+0x1b4/0x2c0 [openvswitch]
ovs_flow_cmd_new+0x3cd/0xb10 [openvswitch]
...
Cc: stable@vger.kernel.org
Fixes: f28cd2af22 ("openvswitch: fix flow actions reallocation")
Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Feng reported an skb_under_panic BUG triggered by running
test_ip6gretap() in tools/testing/selftests/bpf/test_tunnel.sh:
[ 82.492551] skbuff: skb_under_panic: text:ffffffffb268bb8e len:403 put:12 head:ffff9997c5480000 data:ffff9997c547fff8 tail:0x18b end:0x2c0 dev:ip6gretap11
<...>
[ 82.607380] Call Trace:
[ 82.609389] <TASK>
[ 82.611136] skb_push.cold.109+0x10/0x10
[ 82.614289] __gre6_xmit+0x41e/0x590
[ 82.617169] ip6gre_tunnel_xmit+0x344/0x3f0
[ 82.620526] dev_hard_start_xmit+0xf1/0x330
[ 82.623882] sch_direct_xmit+0xe4/0x250
[ 82.626961] __dev_queue_xmit+0x720/0xfe0
<...>
[ 82.633431] packet_sendmsg+0x96a/0x1cb0
[ 82.636568] sock_sendmsg+0x30/0x40
<...>
The following sequence of events caused the BUG:
1. During ip6gretap device initialization, tunnel->tun_hlen (e.g. 4) is
calculated based on old flags (see ip6gre_calc_hlen());
2. packet_snd() reserves header room for skb A, assuming
tunnel->tun_hlen is 4;
3. Later (in clsact Qdisc), the eBPF program sets a new tunnel key for
skb A using bpf_skb_set_tunnel_key() (see _ip6gretap_set_tunnel());
4. __gre6_xmit() detects the new tunnel key, and recalculates
"tun_hlen" (e.g. 12) based on new flags (e.g. TUNNEL_KEY and
TUNNEL_SEQ);
5. gre_build_header() calls skb_push() with insufficient reserved header
room, triggering the BUG.
As sugguested by Cong, fix it by moving the call to skb_cow_head() after
the recalculation of tun_hlen.
Reproducer:
OBJ=$LINUX/tools/testing/selftests/bpf/test_tunnel_kern.o
ip netns add at_ns0
ip link add veth0 type veth peer name veth1
ip link set veth0 netns at_ns0
ip netns exec at_ns0 ip addr add 172.16.1.100/24 dev veth0
ip netns exec at_ns0 ip link set dev veth0 up
ip link set dev veth1 up mtu 1500
ip addr add dev veth1 172.16.1.200/24
ip netns exec at_ns0 ip addr add ::11/96 dev veth0
ip netns exec at_ns0 ip link set dev veth0 up
ip addr add dev veth1 ::22/96
ip link set dev veth1 up
ip netns exec at_ns0 \
ip link add dev ip6gretap00 type ip6gretap seq flowlabel 0xbcdef key 2 \
local ::11 remote ::22
ip netns exec at_ns0 ip addr add dev ip6gretap00 10.1.1.100/24
ip netns exec at_ns0 ip addr add dev ip6gretap00 fc80::100/96
ip netns exec at_ns0 ip link set dev ip6gretap00 up
ip link add dev ip6gretap11 type ip6gretap external
ip addr add dev ip6gretap11 10.1.1.200/24
ip addr add dev ip6gretap11 fc80::200/24
ip link set dev ip6gretap11 up
tc qdisc add dev ip6gretap11 clsact
tc filter add dev ip6gretap11 egress bpf da obj $OBJ sec ip6gretap_set_tunnel
tc filter add dev ip6gretap11 ingress bpf da obj $OBJ sec ip6gretap_get_tunnel
ping6 -c 3 -w 10 -q ::11
Fixes: 6712abc168 ("ip6_gre: add ip6 gre and gretap collect_md mode")
Reported-by: Feng Zhou <zhoufeng.zf@bytedance.com>
Co-developed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do not update tunnel->tun_hlen in data plane code. Use a local variable
instead, just like "tunnel_hlen" in net/ipv4/ip_gre.c:gre_fb_xmit().
Co-developed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2022-04-14
This series contains updates to ice driver only.
Maciej adjusts implementation in __ice_alloc_rx_bufs_zc() for when
ice_fill_rx_descs() does not return the entire buffer request and fixes a
return value for !CONFIG_NET_SWITCHDEV configuration which was preventing
VF creation.
Wojciech prevents eswitch transmit when VFs are being removed which was
causing NULL pointer dereference.
Jianglei Nie fixes a memory leak on error path of getting OROM data.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Steffen Klassert says:
====================
pull request (net): ipsec 2022-04-14
1) Fix the output interface for VRF cases in xfrm_dst_lookup.
From David Ahern.
2) Fix write out of bounds by doing COW on esp output when the
packet size is larger than a page.
From Sabrina Dubroca.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
packet_sock xmit could be dev_queue_xmit, which also returns negative
errors. So only checking positive errors is not enough, or userspace
sendmsg may return success while packet is not send out.
Move the net_xmit_errno() assignment in the braces as checkpatch.pl said
do not use assignment in if condition.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit e5d5aadcf3 ("net/smc: fix sk_refcnt underflow on linkdown
and fallback"), for a fallback connection, __smc_release() does not call
sock_put() if its state is already SMC_CLOSED.
When calling smc_shutdown() after falling back, its state is set to
SMC_CLOSED but does not call sock_put(), so this patch calls it.
Reported-and-tested-by: syzbot+6e29a053eb165bd50de5@syzkaller.appspotmail.com
Fixes: e5d5aadcf3 ("net/smc: fix sk_refcnt underflow on linkdown and fallback")
Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A recent patch[1] from Eric Dumazet flipped the order in which the
keepalive timer and the keepalive worker were cancelled in order to fix a
syzbot reported issue[2]. Unfortunately, this enables the mirror image bug
whereby the timer races with rxrpc_exit_net(), restarting the worker after
it has been cancelled:
CPU 1 CPU 2
=============== =====================
if (rxnet->live)
<INTERRUPT>
rxnet->live = false;
cancel_work_sync(&rxnet->peer_keepalive_work);
rxrpc_queue_work(&rxnet->peer_keepalive_work);
del_timer_sync(&rxnet->peer_keepalive_timer);
Fix this by restoring the removed del_timer_sync() so that we try to remove
the timer twice. If the timer runs again, it should see ->live == false
and not restart the worker.
Fixes: 1946014ca3 ("rxrpc: fix a race in rxrpc_exit_net()")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Eric Dumazet <edumazet@google.com>
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: https://lore.kernel.org/r/20220404183439.3537837-1-eric.dumazet@gmail.com/ [1]
Link: https://syzkaller.appspot.com/bug?extid=724378c4bb58f703b09a [2]
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Added the phy_poll_cable_test flag for the lan937x phy driver.
Tested using command - ethtool --cable-test <dev>
Fixes: 680baca546 ("net: phy: added the LAN937x phy support")
Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
netfilter.
Current release - regressions:
- smc: fix af_ops of child socket pointing to released memory
- wifi: ath9k: fix usage of driver-private space in tx_info
Previous releases - regressions:
- ipv6: fix panic when forwarding a pkt with no in6 dev
- sctp: use the correct skb for security_sctp_assoc_request
- smc: fix NULL pointer dereference in smc_pnet_find_ib()
- sched: fix initialization order when updating chain 0 head
- phy: don't defer probe forever if PHY IRQ provider is missing
- dsa: revert "net: dsa: setup master before ports"
- dsa: felix: fix tagging protocol changes with multiple CPU ports
- eth: ice:
- fix use-after-free when freeing @rx_cpu_rmap
- revert "iavf: fix deadlock occurrence during resetting VF interface"
- eth: lan966x: stop processing the MAC entry is port is wrong
Previous releases - always broken:
- sched
- flower: fix parsing of ethertype following VLAN header
- taprio: check if socket flags are valid
- nfc: add flush_workqueue to prevent uaf
- veth: ensure eth header is in skb's linear part
- eth: stmmac: fix altr_tse_pcs function when using a fixed-link
- eth: macb: restart tx only if queue pointer is lagging
- eth: macvlan: fix leaking skb in source mode with nodst option
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmJX6CoSHHBhYmVuaUBy
ZWRoYXQuY29tAAoJECkkeY3MjxOkMRIP/0CKmGetL0i1WQ0LD8regv+E6NizyXDB
+YCchHMMgYJ/aIRVps9GSWqD6ncU2gCSW27aCxHsN+Esw/HytrmsaHfS1SWRgIfb
6hn1CB3/Ojd1eXZOdXgVrlayhJKj//c0KdoHQoY+sQaLKNvULx5VfTq4y1yjyXy2
upfwW4JaQaUlkaTbdhjRhq5TuRY2JCwscc7EmJbwrqmqWG7nSEngrLU0XLHs8tWr
X/gQbI/wDepf/KidE59rLV7yMYlovCkoZBVUrN4oLZRqxUIBtU0a8nZNX5NFWiVD
KwTahrUh8eOizzSMEzjO0HpuGBWQFrmyC0eOb8KwixMJrRfDtqAWJKwHfxZtzHr/
JpUHlgvCN9iQoPYY0LZ8uU2CLFJM+p4hn1sOt/swoo23iAvCiJWWbRoAvsvwLbZ3
4pAsof8w6SRb34J/6Lcu78LECD/y62TJ27Ay82vdjrYbz1g8wb1wGlAMAbJ7K7sE
NQ4iB6wzd6UVF3Qm4qo7kRJT5TTY4TxZMQUKl1gj/OEV9hrJ8zF1MjRhxNWRC2R9
+A8Rw3weL8zYKoezCEEbogaDp+h7IdeNlbZE8r19p2CMiJ/Y+NQd//p4rSJ0Oqcw
1tOtg5x7LVVMiSz38uHk0HbiKJ9UbHLJJZ35N5oi/IDPg+LWFfar23I4BfuII2kB
Wo5Ii/0DMozb
=Ha9M
-----END PGP SIGNATURE-----
Merge tag 'net-5.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from wireless and netfilter.
Current release - regressions:
- smc: fix af_ops of child socket pointing to released memory
- wifi: ath9k: fix usage of driver-private space in tx_info
Previous releases - regressions:
- ipv6: fix panic when forwarding a pkt with no in6 dev
- sctp: use the correct skb for security_sctp_assoc_request
- smc: fix NULL pointer dereference in smc_pnet_find_ib()
- sched: fix initialization order when updating chain 0 head
- phy: don't defer probe forever if PHY IRQ provider is missing
- dsa: revert "net: dsa: setup master before ports"
- dsa: felix: fix tagging protocol changes with multiple CPU ports
- eth: ice:
- fix use-after-free when freeing @rx_cpu_rmap
- revert "iavf: fix deadlock occurrence during resetting VF
interface"
- eth: lan966x: stop processing the MAC entry is port is wrong
Previous releases - always broken:
- sched:
- flower: fix parsing of ethertype following VLAN header
- taprio: check if socket flags are valid
- nfc: add flush_workqueue to prevent uaf
- veth: ensure eth header is in skb's linear part
- eth: stmmac: fix altr_tse_pcs function when using a fixed-link
- eth: macb: restart tx only if queue pointer is lagging
- eth: macvlan: fix leaking skb in source mode with nodst option"
* tag 'net-5.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (52 commits)
net: bcmgenet: Revert "Use stronger register read/writes to assure ordering"
rtnetlink: Fix handling of disabled L3 stats in RTM_GETSTATS replies
net: dsa: felix: fix tagging protocol changes with multiple CPU ports
tun: annotate access to queue->trans_start
nfc: nci: add flush_workqueue to prevent uaf
net: dsa: realtek: don't parse compatible string for RTL8366S
net: dsa: realtek: fix Kconfig to assure consistent driver linkage
net: ftgmac100: access hardware register after clock ready
Revert "net: dsa: setup master before ports"
macvlan: Fix leaking skb in source mode with nodst option
netfilter: nf_tables: nft_parse_register can return a negative value
net: lan966x: Stop processing the MAC entry is port is wrong.
net: lan966x: Fix when a port's upper is changed.
net: lan966x: Fix IGMP snooping when frames have vlan tag
net: lan966x: Update lan966x_ptp_get_nominal_value
sctp: Initialize daddr on peeled off socket
net/smc: Fix af_ops of child socket pointing to released memory
net/smc: Fix NULL pointer dereference in smc_pnet_find_ib()
net/smc: use memcpy instead of snprintf to avoid out of bounds read
net: macb: Restart tx only if queue pointer is lagging
...
This became an unexpectedly large pull request due to various
regression fixes in the previous kernels.
The majority of fixes are a series of patches to address the
regression at probe errors in devres'ed drivers, while there are
yet more fixes for the x86 SG allocations and for USB-audio
buffer management. In addition, a few HD-audio quirks and other
small fixes are found.
-----BEGIN PGP SIGNATURE-----
iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAmJWp0oOHHRpd2FpQHN1
c2UuZGUACgkQLtJE4w1nLE8uUxAAozqgnJSXJPj/ADhgdkYEWEQQlz2nMomQFZmN
QSvh+gHmPD5HATrwwGFcVpmmwXHiZdG3pzh+gJPseaNNastO6jiSuRAeEFuX/h/a
Mw8mdvT1git7rHBliRUAIZnbIV7e1NKVOXdujTPF7FLt2OlcoALpCeygGh6zTTEp
0UmRUYrHhtZbXzkXszaq6SEhLjGG8Maa1AiaOo2ltl4O9fTfrFvGiOP3usxEY4xx
aTAtF+fmY24UEmD6/44L3LuQ568cDf+XXDCgStOOCM9jlz3Q5ptTCcmv6pj5KaM3
zGG4gmjialzG+QS8qySvo4BU3sYWASwUklw/Yyyx8tFcnLG7q2nWjswQA9g8X19h
MyZXwgh9iKKfF3XQRETcQupsOAFnzpUJj1ZQzxG2fwj/ohFdbKQ2Z/OO9f8eCvbo
IEQrQwBgGS6eHetUDOCdC357XwzDlRFafx0W6o5cm+XdxcXzMIDcJZbyKoEwvzr3
tozNc8L9AUYdDqZ8dKiNMOrDy7OHGTN/VU+pV1xVBF7R7/5DKvbnF4s+mAg25uXF
1/V+z9zDciDBXSuiuSzsfz/MbHUPlMn+c8qKwHfUiw5sLX2AKXGwgmkhr3ly1ga0
/4YZfrQgTVFrzQITfMxxn8MGG5QOy42g4uKmTnEUeH2ucJ9uY5hE8FUDH3jOeC/V
tJB+f4Y=
=NERK
-----END PGP SIGNATURE-----
Merge tag 'sound-5.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"This became an unexpectedly large pull request due to various
regression fixes in the previous kernels.
The majority of fixes are a series of patches to address the
regression at probe errors in devres'ed drivers, while there are yet
more fixes for the x86 SG allocations and for USB-audio buffer
management. In addition, a few HD-audio quirks and other small fixes
are found"
* tag 'sound-5.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (52 commits)
ALSA: usb-audio: Limit max buffer and period sizes per time
ALSA: memalloc: Add fallback SG-buffer allocations for x86
ALSA: nm256: Don't call card private_free at probe error path
ALSA: mtpav: Don't call card private_free at probe error path
ALSA: rme9652: Fix the missing snd_card_free() call at probe error
ALSA: hdspm: Fix the missing snd_card_free() call at probe error
ALSA: hdsp: Fix the missing snd_card_free() call at probe error
ALSA: oxygen: Fix the missing snd_card_free() call at probe error
ALSA: lx6464es: Fix the missing snd_card_free() call at probe error
ALSA: cmipci: Fix the missing snd_card_free() call at probe error
ALSA: aw2: Fix the missing snd_card_free() call at probe error
ALSA: als300: Fix the missing snd_card_free() call at probe error
ALSA: lola: Fix the missing snd_card_free() call at probe error
ALSA: bt87x: Fix the missing snd_card_free() call at probe error
ALSA: sis7019: Fix the missing error handling
ALSA: intel_hdmi: Fix the missing snd_card_free() call at probe error
ALSA: via82xx: Fix the missing snd_card_free() call at probe error
ALSA: sonicvibes: Fix the missing snd_card_free() call at probe error
ALSA: rme96: Fix the missing snd_card_free() call at probe error
ALSA: rme32: Fix the missing snd_card_free() call at probe error
...
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmJUfvwACgkQxWXV+ddt
WDtiuA//csj0CrJq7wyRvgDkbPtCCMyDtL4zfmjP4s++NWaMDOKTxBU8msuGUJgR
Xribel2zWqlFiOzptd9sGEfxfKfz1p5Rm/gtFj57WVSGV7YtiAGyFGuzn/vpCrgq
NP5LFY2z5N36VxDXUPKHzvdqczO5W2n9KdaysJCr6FpCz+vVrplFiT5U+X175Sgg
zWS/XrPIHYbtaEFdb3rUKol6riu7vXW3MxEA9di8K4Xo9TJrp0NtBoGZDsWFQxf7
vfVwtYsQPsACJxw+MjBcVQ5fNXO5iATL1JfBb9ltN589xouja7bDCb40Fm2gEwWB
IUatnCq/4MN2S2NdPtEcXQ52W9svT/87z6ZblefSEiQqvQBBJHN131RTM/s8LBG4
fkHcGV6PsiutSIFycrID0bpXr1Mhvg2aMjxdriLPBtYopaMPh+ivK6LPYoE5MggQ
/rshWfjiWJhPKXPsng+H7UGbViOiYeG0kUBuaqFx4ARnESpN1gF2dJRYvXYFL/8a
Q0wmLr1tf3M82VAaAFNOl/BVk8blutCSHcJDLDKxcl3DhVVlY5J8onEPXjneJRkk
rRB3zoxLlptgfW75CPNyFrpicPdAXzGCoccienIUoEdHSX/W5rNA4L6XpVE5Hv94
dWR1aVAbCUcuhY1QfBU7H2iYw0RHzqDO3+IRfqJuYsLGciijLbs=
=APMY
-----END PGP SIGNATURE-----
Merge tag 'for-5.18-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"A few more code and warning fixes.
There's one feature ioctl removal patch slated for 5.18 that did not
make it to the main pull request. It's just a one-liner and the ioctl
has a v2 that's in use for a long time, no point to postpone it to
5.19.
Late update:
- remove balance v1 ioctl, superseded by v2 in 2012
Fixes:
- add back cgroup attribution for compressed writes
- add super block write start/end annotations to asynchronous balance
- fix root reference count on an error handling path
- in zoned mode, activate zone at the chunk allocation time to avoid
ENOSPC due to timing issues
- fix delayed allocation accounting for direct IO
Warning fixes:
- simplify assertion condition in zoned check
- remove an unused variable"
* tag 'for-5.18-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix btrfs_submit_compressed_write cgroup attribution
btrfs: fix root ref counts in error handling in btrfs_get_root_ref
btrfs: zoned: activate block group only for extent allocation
btrfs: return allocated block group from do_chunk_alloc()
btrfs: mark resumed async balance as writing
btrfs: remove support of balance v1 ioctl
btrfs: release correct delalloc amount in direct IO write path
btrfs: remove unused variable in btrfs_{start,write}_dirty_block_groups()
btrfs: zoned: remove redundant condition in btrfs_run_delalloc_range
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAmJW6J0ACgkQ+7dXa6fL
C2sXZw//Z5if0BzqI0SjZ7JeozZ4ADgAKYrT0WvYRg7TPfJBThZsa9NEaQfueXBp
0DlZpM/ToNQXMYV/jRlDjzooPstKt45IdS4wMewTtXZAXDuZNDygOIRTY5S3zKHY
wW4ZuZ/3AtNejY8YAnxFtykcbVfxMzYkQ+cgX79fd/+ZLgpYImtk4k3fS53pyRhv
Q6bT7zxIOIGzaPVda/lu4t1vdTXNTNI2b/9lYOPEXMC6OFmHQ94dnqsyW6AKUD6K
oYldKUFIFcv5CeiH8sTXFrkckNqgkUln5OXdkFLUjRPHE8Yyq57ZyX7ilp4yFdfS
HE8VpDaDfk11hVpFLngkPu5RJKsRbad5CQ+Jl8+xiEn0fbuR4q7LFvqSVoykh/x5
hh5veuSc/KvfpRIijNlCff4gry9gnAhHlLbt8xFbpNWPaYFTVdkkuDtoB8Kv/KoV
HmxyNyQEW2eL7i7JXfSI4EjNPI85RVhVXmccbJJoKHQRvSVCZuWBlR7NUkykzuXg
CQbnO/SeqilpOpbFvhTUXdOXNQEHbtcKp9yO9hd/f67E7gNKiDOewOmItMt4HfFx
pjKv48tLR1cnZtL8pNf7nSMKuGe0bofDu46rnra0Z4mcz4d4y9mkIrdoIVT/heae
E3Ut/1GJ0MkETl+nYXdV9IxTMHWIVGyzt7HyJTziXj03dfVquqA=
=fUQK
-----END PGP SIGNATURE-----
Merge tag 'fscache-fixes-20220413' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull fscache fixes from David Howells:
"Here's a collection of fscache and cachefiles fixes and misc small
cleanups. The two main fixes are:
- Add a missing unmark of the inode in-use mark in an error path.
- Fix a KASAN slab-out-of-bounds error when setting the xattr on a
cachefiles volume due to the wrong length being given to memcpy().
In addition, there's the removal of an unused parameter, removal of an
unused Kconfig option, conditionalising a bit of procfs-related stuff
and some doc fixes"
* tag 'fscache-fixes-20220413' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
fscache: remove FSCACHE_OLD_API Kconfig option
fscache: Use wrapper fscache_set_cache_state() directly when relinquishing
fscache: Move fscache_cookies_seq_ops specific code under CONFIG_PROC_FS
fscache: Remove the cookie parameter from fscache_clear_page_bits()
docs: filesystems: caching/backend-api.rst: fix an object withdrawn API
docs: filesystems: caching/backend-api.rst: correct two relinquish APIs use
cachefiles: Fix KASAN slab-out-of-bounds in cachefiles_set_volume_xattr
cachefiles: unmark inode in use in error path
A memory chunk was allocated for orom_data in ice_get_orom_civd_data()
by vzmalloc(). But when ice_read_flash_module() fails, the allocated
memory is not freed, which will lead to a memory leak.
We can fix it by freeing the orom_data when ce_read_flash_module() fails.
Fixes: af18d8866c ("ice: reduce time to read Option ROM CIVD data")
Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Currently for !CONFIG_NET_SWITCHDEV kernel builds it is not possible to
create VFs properly as call to ice_eswitch_configure() returns
-EOPNOTSUPP for us. This is because CONFIG_ICE_SWITCHDEV depends on
CONFIG_NET_SWITCHDEV.
Change the ice_eswitch_configure() implementation for
!CONFIG_ICE_SWITCHDEV to return 0 instead -EOPNOTSUPP and let
ice_ena_vfs() finish its work properly.
CC: Grzegorz Nitka <grzegorz.nitka@intel.com>
Fixes: 1a1c40df2e ("ice: set and release switchdev environment")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
__ice_alloc_rx_bufs_zc() checks if a number of the descriptors to be
allocated would cause the ring wrap. In that case, driver will issue two
calls to xsk_buff_alloc_batch() - one that will fill the ring up to the
end and the second one that will start with filling descriptors from the
beginning of the ring.
ice_fill_rx_descs() is a wrapper for taking care of what
xsk_buff_alloc_batch() gave back to the driver. It works in a best
effort approach, so for example when driver asks for 64 buffers,
ice_fill_rx_descs() could assign only 32. Such case needs to be checked
when ring is being filled up to the end, because in that situation ntu
might not reached the end of the ring.
Fix the ring wrap by checking if nb_buffs_extra has the expected value.
If not, bump ntu and go directly to tail update.
Fixes: 3876ff525d ("ice: xsk: Handle SW XDP ring wrap and bump tail more often")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Shwetha Nagaraju <Shwetha.nagaraju@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
When L3 stats are disabled, rtnl_offload_xstats_get_size_stats() returns
size of 0, which is supposed to be an indication that the corresponding
attribute should not be emitted. However, instead, the current code
reserves a 0-byte attribute.
The reason this does not show up as a citation on a kasan kernel is that
netdev_offload_xstats_get(), which is supposed to fill in the data, never
ends up getting called, because rtnl_offload_xstats_get_stats() notices
that the stats are not actually used and skips the call.
Thus a zero-length IFLA_OFFLOAD_XSTATS_L3_STATS attribute ends up in a
response, confusing the userspace.
Fix by skipping the L3-stats related block in rtnl_offload_xstats_fill().
Fixes: 0e7788fd76 ("net: rtnetlink: Add UAPI for obtaining L3 offload xstats")
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/591b58e7623edc3eb66dd1fcfa8c8f133d090974.1649794741.git.petrm@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
When the device tree has 2 CPU ports defined, a single one is active
(has any dp->cpu_dp pointers point to it). Yet the second one is still a
CPU port, and DSA still calls ->change_tag_protocol on it.
On the NXP LS1028A, the CPU ports are ports 4 and 5. Port 4 is the
active CPU port and port 5 is inactive.
After the following commands:
# Initial setting
cat /sys/class/net/eno2/dsa/tagging
ocelot
echo ocelot-8021q > /sys/class/net/eno2/dsa/tagging
echo ocelot > /sys/class/net/eno2/dsa/tagging
traffic is now broken, because the driver has moved the NPI port from
port 4 to port 5, unbeknown to DSA.
The problem can be avoided by detecting that the second CPU port is
unused, and not doing anything for it. Further rework will be needed
when proper support for multiple CPU ports is added.
Treat this as a bug and prepare current kernels to work in single-CPU
mode with multiple-CPU DT blobs.
Fixes: adb3dccf09 ("net: dsa: felix: convert to the new .change_tag_protocol DSA API")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220412172209.2531865-1-vladimir.oltean@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Commit 5337824f4d ("net: annotate accesses to queue->trans_start")
introduced a new helper, txq_trans_cond_update, to update
queue->trans_start using WRITE_ONCE. One snippet in drivers/net/tun.c
was missed, as it was introduced roughly at the same time.
Fixes: 5337824f4d ("net: annotate accesses to queue->trans_start")
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220412135852.466386-1-atenart@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
When we decode the latency and the max_latency, u16 value may not fit
the required size and could lead to the wrong LTR representation.
Scaling is represented as:
scale 0 - 1 (2^(5*0)) = 2^0
scale 1 - 32 (2^(5 *1))= 2^5
scale 2 - 1024 (2^(5 *2)) =2^10
scale 3 - 32768 (2^(5 *3)) =2^15
scale 4 - 1048576 (2^(5 *4)) = 2^20
scale 5 - 33554432 (2^(5 *4)) = 2^25
scale 4 and scale 5 required 20 and 25 bits respectively.
scale 6 reserved.
Replace the u16 type with the u32 type and allow corrected LTR
representation.
Cc: stable@vger.kernel.org
Fixes: 44a13a5d99 ("e1000e: Fix the max snoop/no-snoop latency for 10M")
Reported-by: James Hutchinson <jahutchinson99@googlemail.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215689
Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Tested-by: James Hutchinson <jahutchinson99@googlemail.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Some mainboard/CPU combinations, in particular, Alder Lake-S with a
W680 mainboard, have shown problems (system hangs usually, no kernel
logs) with suspend/resume when PCIe PTM is enabled and active. In some
cases, it could be reproduced when removing the igc module.
The best we can do is to stop PTM dialogs from the downstream/device
side before the interface is brought down. PCIe PTM will be re-enabled
when the interface is being brought up.
Fixes: a90ec84837 ("igc: Add support for PTP getcrosststamp()")
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Acked-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
An infinite loop may occur if we fail to acquire the HW semaphore,
which is needed for resource release.
This will typically happen if the hardware is surprise-removed.
At this stage there is nothing to do, except log an error and quit.
Fixes: c0071c7aa5 ("igc: Add HW initialization code")
Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>