Following the cited commit, sparse started complaining about:
../include/net/gro.h:58:1: warning: directive in macro's argument list
../include/net/gro.h:59:1: warning: directive in macro's argument list
Fix that by moving the defines out of the struct_group() macro.
Fixes: de5a1f3ce4 ("net: gro: minor optimization for dev_gro_receive()")
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Acked-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
'destroy_workqueue()' already drains the queue before destroying it, so
there is no need to flush it explicitly.
Remove the redundant 'flush_workqueue()' calls.
Signed-off-by: Xu Wang <vulab@iscas.ac.cn>
Acked-by: Alexandra Winter <wintera@linux.ibm.com>
Link: https://lore.kernel.org/r/20220216075155.940-1-vulab@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Introduced in commit cf96357303 ("net: dsa: Allow providing PHY
statistics from CPU port"), it appears these were never used.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220216193726.2926320-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
prot->memory_allocated should only be set if prot->sysctl_mem
is also set.
This is a followup of commit 2520611151 ("crypto: af_alg - get
rid of alg_memory_allocated").
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220216171801.3604366-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
XTE_MAX_JUMBO_FRAME_SIZE is over 9000 bytes and the default value for
'rx_bd_num' is RX_BD_NUM_DEFAULT (i.e. 1024)
So this loop allocates more than 9 Mo of memory.
Previous memory allocations in this function already use GFP_KERNEL, so
use __netdev_alloc_skb_ip_align() and an explicit GFP_KERNEL instead of a
implicit GFP_ATOMIC.
This gives more opportunities of successful allocation.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/694abd65418b2b3974106a82d758e3474c65ae8f.1645042560.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
NIXGE_MAX_JUMBO_FRAME_SIZE is over 9000 bytes and RX_BD_NUM 128.
So this loop allocates more than 1 Mo of memory.
Previous memory allocations in this function already use GFP_KERNEL, so
use __netdev_alloc_skb_ip_align() and an explicit GFP_KERNEL instead of a
implicit GFP_ATOMIC.
This gives more opportunities of successful allocation.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/28d2c8e05951ad02a57eb48333672947c8bb4f81.1645043881.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Mat Martineau says:
====================
mptcp: Selftest fine-tuning and cleanup
Patch 1 adjusts the mptcp selftest timeout to account for slow machines
running debug builds.
Patch 2 simplifies one test function.
Patches 3-6 do some cleanup, like deleting unused variables and avoiding
extra work when only printing usage information.
Patch 7 improves the checksum tests by utilizing existing checksum MIBs.
====================
Link: https://lore.kernel.org/r/20220218030311.367536-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch added the data checksum error mib counters check for the
script mptcp_connect.sh when the data checksum is enabled.
In do_transfer(), got the mib counters twice, before and after running
the mptcp_connect commands. The latter minus the former is the actual
number of the data checksum mib counter.
The output looks like this:
ns1 MPTCP -> ns2 (dead:beef:1::2:10007) MPTCP (duration 86ms) [ OK ]
ns1 MPTCP -> ns2 (10.0.2.1:10008 ) MPTCP (duration 66ms) [ FAIL ]
server got 1 data checksum error[s]
Fixes: 94d66ba1d8 ("selftests: mptcp: enable checksum in mptcp_connect.sh")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/255
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
To allow showing the 'help' menu even if these tools are not available.
While at it, also avoid launching the command then checking $?. Instead,
the check is directly done in the 'if'.
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
These tmp files will only be created when a test will be launched.
This avoid 'dd' output when '-h' is used for example.
While at it, also avoid creating netns that will be removed when
starting the first test.
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Shellcheck found that these variables were set but never used.
Note that rndh is no longer prefixed with '0-' but it doesn't change
anything.
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
With an error if it is an unknown option.
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch simplified pm_nl_change_endpoint(), using id-based address
lookups only. And dropped the fragile way of parsing 'addr' and 'id'
from the output of pm_nl_show_endpoints().
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
With the increase number of tests, one CI instance, using a debug kernel
config and not recent hardware, takes around 10 minutes to execute the
slowest MPTCP test: mptcp_join.sh.
Even if most CIs don't take that long to execute these tests --
typically max 10 minutes to run all selftests -- it will help some of
them if the timeout is increased.
The timeout could be disabled but it is always good to have an extra
safeguard, just in case.
Please note that on slow public CIs with kernel debug settings, it has
been observed it can easily take up to 45 minutes to execute all tests
in this very slow environment with other jobs running in parallel.
The slowest test, mptcp_join.sh takes ~30 minutes in this case.
In such environments, the selftests timeout set in the 'settings' file
is disabled because this environment is known as being exceptionnally
slow. It has been decided not to take such exceptional environments into
account and set the timeout to 20min.
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Daniel Borkmann says:
====================
bpf-next 2022-02-17
We've added 29 non-merge commits during the last 8 day(s) which contain
a total of 34 files changed, 1502 insertions(+), 524 deletions(-).
The main changes are:
1) Add BTFGen support to bpftool which allows to use CO-RE in kernels without
BTF info, from Mauricio Vásquez, Rafael David Tinoco, Lorenzo Fontana and
Leonardo Di Donato. (Details: https://lpc.events/event/11/contributions/948/)
2) Prepare light skeleton to be used in both kernel module and user space
and convert bpf_preload.ko to use light skeleton, from Alexei Starovoitov.
3) Rework bpftool's versioning scheme and align with libbpf's version number;
also add linked libbpf version info to "bpftool version", from Quentin Monnet.
4) Add minimal C++ specific additions to bpftool's skeleton codegen to
facilitate use of C skeletons in C++ applications, from Andrii Nakryiko.
5) Add BPF verifier sanity check whether relative offset on kfunc calls overflows
desc->imm and reject the BPF program if the case, from Hou Tao.
6) Fix libbpf to use a dynamically allocated buffer for netlink messages to
avoid receiving truncated messages on some archs, from Toke Høiland-Jørgensen.
7) Various follow-up fixes to the JIT bpf_prog_pack allocator, from Song Liu.
8) Various BPF selftest and vmtest.sh fixes, from Yucong Sun.
9) Fix bpftool pretty print handling on dumping map keys/values when no BTF
is available, from Jiri Olsa and Yinjun Zhang.
10) Extend XDP frags selftest to check for invalid length, from Lorenzo Bianconi.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (29 commits)
bpf: bpf_prog_pack: Set proper size before freeing ro_header
selftests/bpf: Fix crash in core_reloc when bpftool btfgen fails
selftests/bpf: Fix vmtest.sh to launch smp vm.
libbpf: Fix memleak in libbpf_netlink_recv()
bpftool: Fix C++ additions to skeleton
bpftool: Fix pretty print dump for maps without BTF loaded
selftests/bpf: Test "bpftool gen min_core_btf"
bpftool: Gen min_core_btf explanation and examples
bpftool: Implement btfgen_get_btf()
bpftool: Implement "gen min_core_btf" logic
bpftool: Add gen min_core_btf command
libbpf: Expose bpf_core_{add,free}_cands() to bpftool
libbpf: Split bpf_core_apply_relo()
bpf: Reject kfunc calls that overflow insn->imm
selftests/bpf: Add Skeleton templated wrapper as an example
bpftool: Add C++-specific open/load/etc skeleton wrappers
selftests/bpf: Fix GCC11 compiler warnings in -O2 mode
bpftool: Fix the error when lookup in no-btf maps
libbpf: Use dynamically allocated buffer when receiving netlink messages
libbpf: Fix libbpf.map inheritance chain for LIBBPF_0.7.0
...
====================
Link: https://lore.kernel.org/r/20220217232027.29831-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yevhen Orlov says:
====================
net: marvell: prestera: add basic routes offloading
Add support for blackhole and local routes for Marvell Prestera driver.
Subscribe on fib notifications and handle add/del.
Add features:
- Support route adding.
e.g.: "ip route add blackhole 7.7.1.1/24"
e.g.: "ip route add local 9.9.9.9 dev sw1p30"
- Support "rt_trap", "rt_offload", "rt_offload_failed" flags
- Handle case, when route in "local" table overlaps route in "main" table
example:
ip ro add blackhole 7.7.7.7
ip ro add local 7.7.7.7 dev sw1p30
# blackhole route will be deoffloaded. rt_offload flag disappeared
Limitations:
- Only "blackhole" and "local" routes supported. "nexthop" routes is TRAP
for now and will be implemented soon.
- Only "local" and "main" tables supported
====================
Co-developed-by: Taras Chornyi <tchornyi@marvell.com>
Signed-off-by: Taras Chornyi <tchornyi@marvell.com>
Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
For now we support only TRAP or DROP, so we can offload only "local" or
"blackhole" routes.
Nexthop routes is TRAP for now. Will be implemented soon.
Co-developed-by: Taras Chornyi <tchornyi@marvell.com>
Signed-off-by: Taras Chornyi <tchornyi@marvell.com>
Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add new router_hw object "fib_node". For now it support only DROP and
TRAP mode.
Co-developed-by: Taras Chornyi <tchornyi@marvell.com>
Signed-off-by: Taras Chornyi <tchornyi@marvell.com>
Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add functions to create/delete lpm entry in hw.
prestera_hw_lpm_add() take index of allocated virtual router.
Also it takes grp_id, which is index of allocated nexthop group.
ABI to create nexthop group will be added soon.
Co-developed-by: Taras Chornyi <tchornyi@marvell.com>
Signed-off-by: Taras Chornyi <tchornyi@marvell.com>
Co-developed-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexei Starovoitov says:
====================
pull-request: bpf 2022-02-17
We've added 8 non-merge commits during the last 7 day(s) which contain
a total of 8 files changed, 119 insertions(+), 15 deletions(-).
The main changes are:
1) Add schedule points in map batch ops, from Eric.
2) Fix bpf_msg_push_data with len 0, from Felix.
3) Fix crash due to incorrect copy_map_value, from Kumar.
4) Fix crash due to out of bounds access into reg2btf_ids, from Kumar.
5) Fix a bpf_timer initialization issue with clang, from Yonghong.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf: Add schedule points in batch ops
bpf: Fix crash due to out of bounds access into reg2btf_ids.
selftests: bpf: Check bpf_msg_push_data return value
bpf: Fix a bpf_timer initialization issue
bpf: Emit bpf_timer in vmlinux BTF
selftests/bpf: Add test for bpf_timer overwriting crash
bpf: Fix crash due to incorrect copy_map_value
bpf: Do not try bpf_msg_push_data with len 0
====================
Link: https://lore.kernel.org/r/20220217190000.37925-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
netfilter.
Current release - regressions:
- dsa: lantiq_gswip: fix use after free in gswip_remove()
- smc: avoid overwriting the copies of clcsock callback functions
Current release - new code bugs:
- iwlwifi:
- fix use-after-free when no FW is present
- mei: fix the pskb_may_pull check in ipv4
- mei: retry mapping the shared area
- mvm: don't feed the hardware RFKILL into iwlmei
Previous releases - regressions:
- ipv6: mcast: use rcu-safe version of ipv6_get_lladdr()
- tipc: fix wrong publisher node address in link publications
- iwlwifi: mvm: don't send SAR GEO command for 3160 devices,
avoid FW assertion
- bgmac: make idm and nicpm resource optional again
- atl1c: fix tx timeout after link flap
Previous releases - always broken:
- vsock: remove vsock from connected table when connect is
interrupted by a signal
- ping: change destination interface checks to match raw sockets
- crypto: af_alg - get rid of alg_memory_allocated to avoid confusing
semantics (and null-deref) after SO_RESERVE_MEM was added
- ipv6: make exclusive flowlabel checks per-netns
- bonding: force carrier update when releasing slave
- sched: limit TC_ACT_REPEAT loops
- bridge: multicast: notify switchdev driver whenever MC processing
gets disabled because of max entries reached
- wifi: brcmfmac: fix crash in brcm_alt_fw_path when WLAN not found
- iwlwifi: fix locking when "HW not ready"
- phy: mediatek: remove PHY mode check on MT7531
- dsa: mv88e6xxx: flush switchdev FDB workqueue before removing VLAN
- dsa: lan9303:
- fix polarity of reset during probe
- fix accelerated VLAN handling
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmIOm44ACgkQMUZtbf5S
IruWWBAAmNJBxVoVUahidwIVKHnKYeHClzDee3B6sYSupRW22Eeuh7Q8fqPMO4J7
KO9nP/vGibFuhKfjApvS6wPvpYWCuXAoSozfOa+JWNrFg9uVpdyHIhfdXl0WPZqx
A4+p2vIs1ldV0yOac/7ZGMWg57dzlYUurkld3xFwf7KyOOhV5/PkxkxpGN9eFkv1
NTFWTaIsVzFMXMjNXkGfGHmt/8mSmZHgsH+tYd+KXsjbs2UpbGM3SyfHBlUf3aA0
bceT4h07xA6C4rlUCbmalRqwvtcdM15MwlDBtSBXm5fXy0c59XxQOqj/dLhPuAO4
42sQlO2MhqDrZjR0tOjmuP2cpc7llj1lIZe1Qs3nKiNFHcuOJEHGw1PYCO85jKdn
xiWquuoe3G5YkQeoOoi+HqmXcP6aBZpUbROvYNjSJhcci4Ck0Qjna5J1rk8IRjb8
AkDf68dodn8I+W5dx/EnopH/ShPQcqGw1+tH4215UB7b40Ecpc+laqFAHRcgs654
ONuJVdRC4k3TyES1B9z8vawLcGYWa06fz8Mh/dS3gnLDphe5ZiH2tTrESfBdYixH
idmuO5C/YDhsVelVuO+B0RT/yziPb3Lr+BTplSfkODCXT6LuOdCYUcHx5nGZ1TYW
EeZ9hMSaxp2E06llEyD6JQQ+0Q17wnDGjLxtOMk+A8fmNX2F17g=
=Nrzq
-----END PGP SIGNATURE-----
Merge tag 'net-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from wireless and netfilter.
Current release - regressions:
- dsa: lantiq_gswip: fix use after free in gswip_remove()
- smc: avoid overwriting the copies of clcsock callback functions
Current release - new code bugs:
- iwlwifi:
- fix use-after-free when no FW is present
- mei: fix the pskb_may_pull check in ipv4
- mei: retry mapping the shared area
- mvm: don't feed the hardware RFKILL into iwlmei
Previous releases - regressions:
- ipv6: mcast: use rcu-safe version of ipv6_get_lladdr()
- tipc: fix wrong publisher node address in link publications
- iwlwifi: mvm: don't send SAR GEO command for 3160 devices, avoid FW
assertion
- bgmac: make idm and nicpm resource optional again
- atl1c: fix tx timeout after link flap
Previous releases - always broken:
- vsock: remove vsock from connected table when connect is
interrupted by a signal
- ping: change destination interface checks to match raw sockets
- crypto: af_alg - get rid of alg_memory_allocated to avoid confusing
semantics (and null-deref) after SO_RESERVE_MEM was added
- ipv6: make exclusive flowlabel checks per-netns
- bonding: force carrier update when releasing slave
- sched: limit TC_ACT_REPEAT loops
- bridge: multicast: notify switchdev driver whenever MC processing
gets disabled because of max entries reached
- wifi: brcmfmac: fix crash in brcm_alt_fw_path when WLAN not found
- iwlwifi: fix locking when "HW not ready"
- phy: mediatek: remove PHY mode check on MT7531
- dsa: mv88e6xxx: flush switchdev FDB workqueue before removing VLAN
- dsa: lan9303:
- fix polarity of reset during probe
- fix accelerated VLAN handling"
* tag 'net-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (65 commits)
bonding: force carrier update when releasing slave
nfp: flower: netdev offload check for ip6gretap
ipv6: fix data-race in fib6_info_hw_flags_set / fib6_purge_rt
ipv4: fix data races in fib_alias_hw_flags_set
net: dsa: lan9303: add VLAN IDs to master device
net: dsa: lan9303: handle hwaccel VLAN tags
vsock: remove vsock from connected table when connect is interrupted by a signal
Revert "net: ethernet: bgmac: Use devm_platform_ioremap_resource_byname"
ping: fix the dif and sdif check in ping_lookup
net: usb: cdc_mbim: avoid altsetting toggling for Telit FN990
net: sched: limit TC_ACT_REPEAT loops
tipc: fix wrong notification node addresses
net: dsa: lantiq_gswip: fix use after free in gswip_remove()
ipv6: per-netns exclusive flowlabel checks
net: bridge: multicast: notify switchdev driver whenever MC processing gets disabled
CDC-NCM: avoid overflow in sanity checking
mctp: fix use after free
net: mscc: ocelot: fix use-after-free in ocelot_vlan_del()
bonding: fix data-races around agg_select_timer
dpaa2-eth: Initialize mutex used in one step timestamping path
...
In __bond_release_one(), bond_set_carrier() is only called when bond
device has no slave. Therefore, if we remove the up slave from a master
with two slaves and keep the down slave, the master will remain up.
Fix this by moving bond_set_carrier() out of if (!bond_has_slaves(bond))
statement.
Reproducer:
$ insmod bonding.ko mode=0 miimon=100 max_bonds=2
$ ifconfig bond0 up
$ ifenslave bond0 eth0 eth1
$ ifconfig eth0 down
$ ifenslave -d bond0 eth1
$ cat /proc/net/bonding/bond0
Fixes: ff59c4563a ("[PATCH] bonding: support carrier state for master")
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Link: https://lore.kernel.org/r/1645021088-38370-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
syzbot reported various soft lockups caused by bpf batch operations.
INFO: task kworker/1:1:27 blocked for more than 140 seconds.
INFO: task hung in rcu_barrier
Nothing prevents batch ops to process huge amount of data,
we need to add schedule points in them.
Note that maybe_wait_bpf_programs(map) calls from
generic_map_delete_batch() can be factorized by moving
the call after the loop.
This will be done later in -next tree once we get this fix merged,
unless there is strong opinion doing this optimization sooner.
Fixes: aa2e93b8e5 ("bpf: Add generic support for update and delete batch ops")
Fixes: cb4d03ab49 ("bpf: Add generic support for lookup batch op")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Brian Vazquez <brianvv@google.com>
Link: https://lore.kernel.org/bpf/20220217181902.808742-1-eric.dumazet@gmail.com
Commit b42bc9a3c5 ("Fix regression due to "fs: move binfmt_misc sysctl
to its own file") fixed a regression, however it failed to add a
kmemleak_not_leak().
Fixes: b42bc9a3c5 ("Fix regression due to "fs: move binfmt_misc sysctl to its own file")
Reported-by: Tong Zhang <ztong0001@gmail.com>
Cc: Tong Zhang <ztong0001@gmail.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Fix corrupt inject files when only last branch option is enabled with ARM CoreSight ETM.
- Fix use-after-free for realloc(..., 0) in libsubcmd, found by gcc 12.
- Defer freeing string after possible strlen() on it in the BPF loader, found by gcc 12.
- Avoid early exit in 'perf trace' due SIGCHLD from non-workload processes.
- Fix arm64 perf_event_attr 'perf test's wrt --call-graph initialization.
- Fix libperf 32-bit build for 'perf test' wrt uint64_t printf.
- Fix perf_cpu_map__for_each_cpu macro in libperf, providing access to the CPU iterator.
- Sync linux/perf_event.h UAPI with the kernel sources.
- Update Jiri Olsa's email address in MAINTAINERS.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYg5q1gAKCRCyPKLppCJ+
JxbAAP9P7iIQuXecNop1ye4eLsWHYhBAJPSJdvUHwqpwGRm8HQEA2r+2tRtBcbei
3qgOY0od4Xtw1yji1YmTeQ6jmKFuMQM=
=c5Ce
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-fixes-for-v5.17-2022-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix corrupt inject files when only last branch option is enabled with
ARM CoreSight ETM
- Fix use-after-free for realloc(..., 0) in libsubcmd, found by gcc 12
- Defer freeing string after possible strlen() on it in the BPF loader,
found by gcc 12
- Avoid early exit in 'perf trace' due SIGCHLD from non-workload
processes
- Fix arm64 perf_event_attr 'perf test's wrt --call-graph
initialization
- Fix libperf 32-bit build for 'perf test' wrt uint64_t printf
- Fix perf_cpu_map__for_each_cpu macro in libperf, providing access to
the CPU iterator
- Sync linux/perf_event.h UAPI with the kernel sources
- Update Jiri Olsa's email address in MAINTAINERS
* tag 'perf-tools-fixes-for-v5.17-2022-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf bpf: Defer freeing string after possible strlen() on it
perf test: Fix arm64 perf_event_attr tests wrt --call-graph initialization
libsubcmd: Fix use-after-free for realloc(..., 0)
libperf: Fix perf_cpu_map__for_each_cpu macro
perf cs-etm: Fix corrupt inject files when only last branch option is enabled
perf cs-etm: No-op refactor of synth opt usage
libperf: Fix 32-bit build for tests uint64_t printf
tools headers UAPI: Sync linux/perf_event.h with the kernel sources
perf trace: Avoid early exit due SIGCHLD from non-workload processes
MAINTAINERS: Update Jiri's email address
The only fix trickled down for v5.17-rc cycle so far is
the fix for module decompression when CONFIG_SYSFS=n. This
was reported through 0-day.
-----BEGIN PGP SIGNATURE-----
iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmINZTUSHG1jZ3JvZkBr
ZXJuZWwub3JnAAoJEM4jHQowkoinrUgP/Ai8ae+IHpscLbmBM/bdbqinKoOUo17M
0nHTJIMnpf8J7B/MSx64VLBmfkCKEVdYfiQhkGcxAwmTcw7n+1+ekM5kTAhCRJfd
Ej/4n0Q5cWl7SnCmhO94vBYFmDHn/NgoXhsAAFaZIRtCcyRWzXAb946lBgGUDOHl
AtybuRIPJpU+6AeXeZyz9WCdYF2/uyz5HFs929SOE1iYazmpshei8bpOepVvH9LK
Sa72/gVKC3CuVGx+6VwwvF+wL0dBvp8Gufg5/xh7xOhlls5N3AJUhJ7fkg7v1K9+
gjeBhQrKzfbtd0MFsjnfNhN0zbQ/A+d3f2M6SNGcZK/xbDEnEypmcNyXiy0QKXjw
QVSS5dXEI8bNoNPM2oqVmmCXliTqD3jY1mS7ynTwI7w0zUN8NL1y2tcF9xjBvk7O
X7sirOUtiaEIfFkkt7BxHILEAVd/jqYg43g/cq0EMglxeRMJXJo04kY6EvzFSxCm
bKUUV4d+wCq+JzQmPDIO01UcfKmlTC3a8bexusVz06MhDYW7KdVfNTDxu2/EQaNn
gU3/v5SvmucV9Oo/norXbDf0jzDTSbmPc0yYbanXEV2CrHx0CEAAGMN5W+iNOWCz
3YVMON3qu6nJcZuu7YemjKr8v8ZA5M37L9bUDlBZDCoEUqbaJV7wjRRjKqeL2d6F
1gZ+hV6sqN+1
=h/ue
-----END PGP SIGNATURE-----
Merge tag 'modules-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
Pull module fix from Luis Chamberlain:
"Fixes module decompression when CONFIG_SYSFS=n
The only fix trickled down for v5.17-rc cycle so far is the fix for
module decompression when CONFIG_SYSFS=n. This was reported through
0-day"
* tag 'modules-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
module: fix building with sysfs disabled
IPv6 GRE tunnels are not being offloaded, this is caused by a missing
netdev offload check. The functionality of IPv6 GRE tunnel offloading
was previously added but this check was not included. Adding the
ip6gretap check allows IPv6 GRE tunnels to be offloaded correctly.
Fixes: f7536ffb09 ("nfp: flower: Allow ipv6gretap interface for offloading")
Signed-off-by: Danie du Toit <danie.dutoit@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20220217124820.40436-1-louis.peens@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
fib_alias_hw_flags_set() can be used by concurrent threads,
and is only RCU protected.
We need to annotate accesses to following fields of struct fib_alias:
offload, trap, offload_failed
Because of READ_ONCE()WRITE_ONCE() limitations, make these
field u8.
BUG: KCSAN: data-race in fib_alias_hw_flags_set / fib_alias_hw_flags_set
read to 0xffff888134224a6a of 1 bytes by task 2013 on cpu 1:
fib_alias_hw_flags_set+0x28a/0x470 net/ipv4/fib_trie.c:1050
nsim_fib4_rt_hw_flags_set drivers/net/netdevsim/fib.c:350 [inline]
nsim_fib4_rt_add drivers/net/netdevsim/fib.c:367 [inline]
nsim_fib4_rt_insert drivers/net/netdevsim/fib.c:429 [inline]
nsim_fib4_event drivers/net/netdevsim/fib.c:461 [inline]
nsim_fib_event drivers/net/netdevsim/fib.c:881 [inline]
nsim_fib_event_work+0x1852/0x2cf0 drivers/net/netdevsim/fib.c:1477
process_one_work+0x3f6/0x960 kernel/workqueue.c:2307
process_scheduled_works kernel/workqueue.c:2370 [inline]
worker_thread+0x7df/0xa70 kernel/workqueue.c:2456
kthread+0x1bf/0x1e0 kernel/kthread.c:377
ret_from_fork+0x1f/0x30
write to 0xffff888134224a6a of 1 bytes by task 4872 on cpu 0:
fib_alias_hw_flags_set+0x2d5/0x470 net/ipv4/fib_trie.c:1054
nsim_fib4_rt_hw_flags_set drivers/net/netdevsim/fib.c:350 [inline]
nsim_fib4_rt_add drivers/net/netdevsim/fib.c:367 [inline]
nsim_fib4_rt_insert drivers/net/netdevsim/fib.c:429 [inline]
nsim_fib4_event drivers/net/netdevsim/fib.c:461 [inline]
nsim_fib_event drivers/net/netdevsim/fib.c:881 [inline]
nsim_fib_event_work+0x1852/0x2cf0 drivers/net/netdevsim/fib.c:1477
process_one_work+0x3f6/0x960 kernel/workqueue.c:2307
process_scheduled_works kernel/workqueue.c:2370 [inline]
worker_thread+0x7df/0xa70 kernel/workqueue.c:2456
kthread+0x1bf/0x1e0 kernel/kthread.c:377
ret_from_fork+0x1f/0x30
value changed: 0x00 -> 0x02
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 4872 Comm: kworker/0:0 Not tainted 5.17.0-rc3-syzkaller-00188-g1d41d2e82623-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events nsim_fib_event_work
Fixes: 90b93f1b31 ("ipv4: Add "offload" and "trap" indications to routes")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20220216173217.3792411-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If the master device does VLAN filtering, the IDs used by the switch
must be added for any frames to be received. Do this in the
port_enable() function, and remove them in port_disable().
Fixes: a1292595e0 ("net: dsa: add new DSA switch driver for the SMSC-LAN9303")
Signed-off-by: Mans Rullgard <mans@mansr.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20220216204818.28746-1-mans@mansr.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Check for a hwaccel VLAN tag on rx and use it if present. Otherwise,
use __skb_vlan_pop() like the other tag parsers do. This fixes the case
where the VLAN tag has already been consumed by the master.
Fixes: a1292595e0 ("net: dsa: add new DSA switch driver for the SMSC-LAN9303")
Signed-off-by: Mans Rullgard <mans@mansr.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20220216124634.23123-1-mans@mansr.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Oded Gabbay reports that enabling NUMA balancing causes corruption with
his Gaudi accelerator test load:
"All the details are in the bug, but the bottom line is that somehow,
this patch causes corruption when the numa balancing feature is
enabled AND we don't use process affinity AND we use GUP to pin pages
so our accelerator can DMA to/from system memory.
Either disabling numa balancing, using process affinity to bind to
specific numa-node or reverting this patch causes the bug to
disappear"
and Oded bisected the issue to commit 09854ba94c ("mm: do_wp_page()
simplification").
Now, the NUMA balancing shouldn't actually be changing the writability
of a page, and as such shouldn't matter for COW. But it appears it
does. Suspicious.
However, regardless of that, the condition for enabling NUMA faults in
change_pte_range() is nonsensical. It uses "page_mapcount(page)" to
decide if a COW page should be NUMA-protected or not, and that makes
absolutely no sense.
The number of mappings a page has is irrelevant: not only does GUP get a
reference to a page as in Oded's case, but the other mappings migth be
paged out and the only reference to them would be in the page count.
Since we should never try to NUMA-balance a page that we can't move
anyway due to other references, just fix the code to use 'page_count()'.
Oded confirms that that fixes his issue.
Now, this does imply that something in NUMA balancing ends up changing
page protections (other than the obvious one of making the page
inaccessible to get the NUMA faulting information). Otherwise the COW
simplification wouldn't matter - since doing the GUP on the page would
make sure it's writable.
The cause of that permission change would be good to figure out too,
since it clearly results in spurious COW events - but fixing the
nonsensical test that just happened to work before is obviously the
CorrectThing(tm) to do regardless.
Fixes: 09854ba94c ("mm: do_wp_page() simplification")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215616
Link: https://lore.kernel.org/all/CAFCwf10eNmwq2wD71xjUhqkvv5+_pJMR1nPug2RqNDcFT4H86Q@mail.gmail.com/
Reported-and-tested-by: Oded Gabbay <oded.gabbay@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
vsock_connect() expects that the socket could already be in the
TCP_ESTABLISHED state when the connecting task wakes up with a signal
pending. If this happens the socket will be in the connected table, and
it is not removed when the socket state is reset. In this situation it's
common for the process to retry connect(), and if the connection is
successful the socket will be added to the connected table a second
time, corrupting the list.
Prevent this by calling vsock_remove_connected() if a signal is received
while waiting for a connection. This is harmless if the socket is not in
the connected table, and if it is in the table then removing it will
prevent list corruption from a double add.
Note for backporting: this patch requires d5afa82c97 ("vsock: correct
removal of socket from the list"), which is in all current stable trees
except 4.9.y.
Fixes: d021c34405 ("VSOCK: Introduce VM Sockets")
Signed-off-by: Seth Forshee <sforshee@digitalocean.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20220217141312.2297547-1-sforshee@digitalocean.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This reverts commit 3710e80952.
Since idm_base and nicpm_base are still optional resources not present
on all platforms, this breaks the driver for everything except Northstar
2 (which has both).
The same change was already reverted once with 755f5738ff ("net:
broadcom: fix a mistake about ioremap resource").
So let's do it again.
Fixes: 3710e80952 ("net: ethernet: bgmac: Use devm_platform_ioremap_resource_byname")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
[florian: Added comments to explain the resources are optional]
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220216184634.2032460-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Before freeing the hash table in addrconf_exit_net(),
we need to make sure the work queue has completed,
or risk NULL dereference or UAF.
Thus, use cancel_delayed_work_sync() to enforce this.
We do not hold RTNL in addrconf_exit_net(), making this safe.
Fixes: 8805d13ff1 ("ipv6/addrconf: use one delayed work per netns")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220216182037.3742-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sprinkle for each loops to allow netdevices to be unregistered
out of order, as their refs are released.
This prevents problems caused by dependencies between netdevs
which want to release references in their ->priv_destructor.
See commit d6ff94afd9 ("vlan: move dev_put into vlan_dev_uninit")
for example.
Eric has removed the only known ordering requirement in
commit c002496bab ("Merge branch 'ipv6-loopback'")
so let's try this and see if anything explodes...
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/20220215225310.3679266-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In prep for unregistering netdevs out of order move the netdev
state validation and change outside of the loop.
While at it modernize this code and use WARN() instead of
pr_err() + dump_stack().
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/20220215225310.3679266-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When 'ping' changes to use PING socket instead of RAW socket by:
# sysctl -w net.ipv4.ping_group_range="0 100"
There is another regression caused when matching sk_bound_dev_if
and dif, RAW socket is using inet_iif() while PING socket lookup
is using skb->dev->ifindex, the cmd below fails due to this:
# ip link add dummy0 type dummy
# ip link set dummy0 up
# ip addr add 192.168.111.1/24 dev dummy0
# ping -I dummy0 192.168.111.1 -c1
The issue was also reported on:
https://github.com/iputils/iputils/issues/104
But fixed in iputils in a wrong way by not binding to device when
destination IP is on device, and it will cause some of kselftests
to fail, as Jianlin noticed.
This patch is to use inet(6)_iif and inet(6)_sdif to get dif and
sdif for PING socket, and keep consistent with RAW socket.
Fixes: c319b4d76b ("net: ipv4: add IPPROTO_ICMP socket kind")
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add quirk CDC_MBIM_FLAG_AVOID_ALTSETTING_TOGGLE for Telit FN990
0x1071 composition in order to avoid bind error.
Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski says:
====================
net: ping6: support setting basic SOL_IPV6 options via cmsg
Support for IPV6_HOPLIMIT, IPV6_TCLASS, IPV6_DONTFRAG on ICMPv6
sockets and associated tests. I have no immediate plans to
implement IPV6_FLOWINFO and all the extension header stuff.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a basic test to make sure ping sockets don't crash
with IPV6_2292* options.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Test setting IPV6_HOPLIMIT via setsockopt and cmsg
across socket types.
Output without the kernel support (this series):
Case HOPLIMIT ICMP cmsg - packet data returned 1, expected 0
Case HOPLIMIT ICMP diff - packet data returned 1, expected 0
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Test setting IPV6_TCLASS via setsockopt and cmsg
across socket types.
Output without the kernel support (this series):
Case TCLASS ICMP cmsg - packet data returned 1, expected 0
Case TCLASS ICMP cmsg - rejection returned 0, expected 1
Case TCLASS ICMP diff - pass returned 1, expected 0
Case TCLASS ICMP diff - packet data returned 1, expected 0
Case TCLASS ICMP diff - rejection returned 0, expected 1
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Test setting IPV6_DONTFRAG via setsockopt and cmsg
across socket types.
Output without the kernel support (this series):
Case DONTFRAG ICMP setsock returned 0, expected 1
Case DONTFRAG ICMP cmsg returned 0, expected 1
Case DONTFRAG ICMP both returned 0, expected 1
Case DONTFRAG ICMP diff returned 0, expected 1
FAIL - 4/24 cases failed
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>