rds_notify_queue_get() is potentially copying uninitialized kernel stack
memory to userspace since the compiler may leave a 4-byte hole at the end
of `cmsg`.
In 2016 we tried to fix this issue by doing `= { 0 };` on `cmsg`, which
unfortunately does not always initialize that 4-byte hole. Fix it by using
memset() instead.
Cc: stable@vger.kernel.org
Fixes: f037590fff ("rds: fix a leak of kernel memory")
Fixes: bdbe6fbc6a ("RDS: recv.c")
Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2020-07-30
This series contains updates to the e1000e and igb drivers.
Aaron Ma allows PHY initialization to continue if ULP disable failed for
e1000e.
Francesco Ruggeri fixes race conditions in igb reset that could cause panics.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Exchange the positions of the err_tbl_init and err_register labels in
ct_init_module function.
Fixes: c34b961a24 ("net/sched: act_ct: Create nf flow table per zone")
Signed-off-by: liujian <liujian56@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* remove a warning that can trigger in certain races
* check a function pointer before using it
* check before adding 6 GHz to avoid a warning in mesh
* fix two memory leaks in mesh
* fix a TX status bug leading to a memory leak
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAl8i4fUACgkQB8qZga/f
l8TryA/+NVsEyCUgXRH4ZanXAGpC3QB7925f2THbgkOgdlKwR1iKGtmwZVIQKwl4
DdxUql/6C02l4fW+OBz23ohKkb9MixpSwiYjHKeqlC5EeNvmzLujG0A/X+Qvj/j0
iuttFNJxrwQ7ZCzqKbsufRL9BfWwaIVYB62HFd4gR/FBnPwNhZPlHLJMyzT54eIC
ZdDQ8ynqij3uB2BS31J2l3clwO7iOg1ek81DsTgLp3nCGqIDwJzYMRqrVh5+9I1j
yAwE9TCBxv8eEGJIFwpF+Yl0ffPvYXT0DjjeRQMOG7lIDa25orbIT8zHSt83y7My
6Rq+8Paw8Z2bQO0Z7EN3ys4fgqMqY1TTVzrTpF0XQaW+CnEdc3mvRN9UGDiViNcP
JO9QH7RlCV+CQld/I5F/b3kbpt9DP+TFQyVaE62+OhgG5TCf7upJ7XmHK9IpoX7A
cSjJxV8isWLzILTWLiHFXEkyCbdIYeq8PSvuIvJcxOH8hIQ+Si8Rv01K2Wvp0QpM
mReW4MYbpHPju88aBl82Tl1fUwfQNfaMp41iN+b2zBS4xB4+zmTFnpq0lIS7azT/
8Zx+9mlJKUd+YTTKcMvUrwJ2dH8b7w5bEllf5WAEGbLubo+dhgsZ+6PuLRrw/8J7
ujINEQTT+UarZWgevyCG6j/c229UiSfaQNCeoADUAhW4wMCPm3E=
=xoMY
-----END PGP SIGNATURE-----
Merge tag 'mac80211-for-davem-2020-07-30' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
A couple of more changes:
* remove a warning that can trigger in certain races
* check a function pointer before using it
* check before adding 6 GHz to avoid a warning in mesh
* fix two memory leaks in mesh
* fix a TX status bug leading to a memory leak
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the missing clk_disable_unprepare() before return
from gemini_ethernet_port_probe() in the error handling case.
Fixes: 4d5ae32f5e ("net: ethernet: Add a driver for Gemini gigabit ethernet")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On an error return, jump to the unlock at the end to be sure
to unlock the queue_lock mutex.
Fixes: 0925e9db4d ("ionic: use mutex to protect queue operations")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
atmtcp_remove_persistent() invokes atm_dev_lookup(), which returns a
reference of atm_dev with increased refcount or NULL if fails.
The refcount leaks issues occur in two error handling paths. If
dev_data->persist is zero or PRIV(dev)->vcc isn't NULL, the function
returns 0 without decreasing the refcount kept by a local variable,
resulting in refcount leaks.
Fix the issue by adding atm_dev_put() before returning 0 both when
dev_data->persist is zero or PRIV(dev)->vcc isn't NULL.
Signed-off-by: Xin Xiong <xiongx18@fudan.edu.cn>
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
in recent kernel versions there are warnings about incorrect MTU size
like these:
eth0: mtu greater than device maximum
mtk_soc_eth 1b100000.ethernet eth0: error -22 setting MTU to include DSA overhead
Fixes: bfcb813203 ("net: dsa: configure the MTU for switch ports")
Fixes: 72579e14a1 ("net: dsa: don't fail to probe if we couldn't set the MTU")
Fixes: 7a4c53bee3 ("net: report invalid mtu value via netlink extack")
Signed-off-by: Landen Chao <landen.chao@mediatek.com>
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
If some processes in nixge_probe() fail, free_netdev(dev)
needs to be called to aviod a memory leak.
Fixes: 87ab207981 ("net: nixge: Separate ctrl and dma resources")
Fixes: abcd3d6fc6 ("net: nixge: Fix error path for obtaining mac address")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Lu Wei <luwei32@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Number of .dumpit functions try to ignore -EOPNOTSUPP errors.
Recent change missed that, and started reporting all errors
but -EMSGSIZE back from dumps. This leads to situation like
this:
$ devlink dev info
devlink answers: Operation not supported
Dump should not report an error just because the last device
to be queried could not provide an answer.
To fix this and avoid similar confusion make sure we clear
err properly, and not leave it set to an error if we don't
terminate the iteration.
Fixes: c62c2cfb80 ("net: devlink: don't ignore errors during dumpit")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There's a race between rxrpc_sendmsg setting up a call, but then failing to
send anything on it due to an error, and recvmsg() seeing the call
completion occur and trying to return the state to the user.
An assertion fails in rxrpc_recvmsg() because the call has already been
released from the socket and is about to be released again as recvmsg deals
with it. (The recvmsg_q queue on the socket holds a ref, so there's no
problem with use-after-free.)
We also have to be careful not to end up reporting an error twice, in such
a way that both returns indicate to userspace that the user ID supplied
with the call is no longer in use - which could cause the client to
malfunction if it recycles the user ID fast enough.
Fix this by the following means:
(1) When sendmsg() creates a call after the point that the call has been
successfully added to the socket, don't return any errors through
sendmsg(), but rather complete the call and let recvmsg() retrieve
them. Make sendmsg() return 0 at this point. Further calls to
sendmsg() for that call will fail with ESHUTDOWN.
Note that at this point, we haven't send any packets yet, so the
server doesn't yet know about the call.
(2) If sendmsg() returns an error when it was expected to create a new
call, it means that the user ID wasn't used.
(3) Mark the call disconnected before marking it completed to prevent an
oops in rxrpc_release_call().
(4) recvmsg() will then retrieve the error and set MSG_EOR to indicate
that the user ID is no longer known by the kernel.
An oops like the following is produced:
kernel BUG at net/rxrpc/recvmsg.c:605!
...
RIP: 0010:rxrpc_recvmsg+0x256/0x5ae
...
Call Trace:
? __init_waitqueue_head+0x2f/0x2f
____sys_recvmsg+0x8a/0x148
? import_iovec+0x69/0x9c
? copy_msghdr_from_user+0x5c/0x86
___sys_recvmsg+0x72/0xaa
? __fget_files+0x22/0x57
? __fget_light+0x46/0x51
? fdget+0x9/0x1b
do_recvmmsg+0x15e/0x232
? _raw_spin_unlock+0xa/0xb
? vtime_delta+0xf/0x25
__x64_sys_recvmmsg+0x2c/0x2f
do_syscall_64+0x4c/0x78
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: 357f5ef646 ("rxrpc: Call rxrpc_release_call() on error in rxrpc_new_client_call()")
Reported-by: syzbot+b54969381df354936d96@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is to replace Thor Thayer as Altera Triple Speed Ethernet
maintainer as he is moving to a different role.
Signed-off-by: Joyce Ooi <joyce.ooi@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When read netdevsim trap_flow_action_cookie, we need to init it first,
or we will get "Invalid argument" error.
Fixes: d3cbb907ae ("netdevsim: add ACL trap reporting cookie as a metadata")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
IPV6_ADDRFORM causes resource leaks when converting an IPv6 socket
to IPv4, particularly struct ipv6_ac_socklist. Similar to
struct ipv6_mc_socklist, we should just close it on this path.
This bug can be easily reproduced with the following C program:
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <arpa/inet.h>
int main()
{
int s, value;
struct sockaddr_in6 addr;
struct ipv6_mreq m6;
s = socket(AF_INET6, SOCK_DGRAM, 0);
addr.sin6_family = AF_INET6;
addr.sin6_port = htons(5000);
inet_pton(AF_INET6, "::ffff:192.168.122.194", &addr.sin6_addr);
connect(s, (struct sockaddr *)&addr, sizeof(addr));
inet_pton(AF_INET6, "fe80::AAAA", &m6.ipv6mr_multiaddr);
m6.ipv6mr_interface = 5;
setsockopt(s, SOL_IPV6, IPV6_JOIN_ANYCAST, &m6, sizeof(m6));
value = AF_INET;
setsockopt(s, SOL_IPV6, IPV6_ADDRFORM, &value, sizeof(value));
close(s);
return 0;
}
Reported-by: ch3332xr@gmail.com
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We observed two panics involving races with igb_reset_task.
The first panic is caused by this race condition:
kworker reboot -f
igb_reset_task
igb_reinit_locked
igb_down
napi_synchronize
__igb_shutdown
igb_clear_interrupt_scheme
igb_free_q_vectors
igb_free_q_vector
adapter->q_vector[v_idx] = NULL;
napi_disable
Panics trying to access
adapter->q_vector[v_idx].napi_state
The second panic (a divide error) is caused by this race:
kworker reboot -f tx packet
igb_reset_task
__igb_shutdown
rtnl_lock()
...
igb_clear_interrupt_scheme
igb_free_q_vectors
adapter->num_tx_queues = 0
...
rtnl_unlock()
rtnl_lock()
igb_reinit_locked
igb_down
igb_up
netif_tx_start_all_queues
dev_hard_start_xmit
igb_xmit_frame
igb_tx_queue_mapping
Panics on
r_idx % adapter->num_tx_queues
This commit applies to igb_reset_task the same changes that
were applied to ixgbe in commit 2f90b8657e ("ixgbe: this patch
adds support for DCB to the kernel and ixgbe driver"),
commit 8f4c5c9fb8 ("ixgbe: reinit_locked() should be called with
rtnl_lock") and commit 88adce4ea8 ("ixgbe: fix possible race in
reset subtask").
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
After 'commit e086ba2fcc ("e1000e: disable s0ix entry and exit flows
for ME systems")',
ThinkPad P14s always failed to disable ULP by ME.
'commit 0c80cdbf33 ("e1000e: Warn if disabling ULP failed")'
break out of init phy:
error log:
[ 42.364753] e1000e 0000:00:1f.6 enp0s31f6: Failed to disable ULP
[ 42.524626] e1000e 0000:00:1f.6 enp0s31f6: PHY Wakeup cause - Unicast Packet
[ 42.822476] e1000e 0000:00:1f.6 enp0s31f6: Hardware Error
When disable s0ix, E1000_FWSM_ULP_CFG_DONE will never be 1.
If continue to init phy like before, it can work as before.
iperf test result good too.
Fixes: 0c80cdbf33 ("e1000e: Warn if disabling ULP failed")
Signed-off-by: Aaron Ma <aaron.ma@canonical.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
This warning can trigger if there is a mismatch between frames that were
sent with the sta pointer set vs tx status frames reported for the sta address.
This can happen due to race conditions on re-creating stations, or even
in the case of .sta_add/remove being used instead of .sta_state, which can cause
frames to be sent to a station that has not been uploaded yet.
If there is an actual underflow issue, it should show up in the device airtime
warning below, so it is better to remove this one.
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20200725084533.13829-1-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Allocated ack_frame id from local->ack_status_frames is not really
stored in the tx_info for 802.3 Tx path. Due to this, tx ack status
is not reported and ack_frame id is not freed for the buffers requiring
tx ack status. Also move the memset to 0 of tx_info before
IEEE80211_TX_CTL_REQ_TX_STATUS flag assignment.
Fixes: 50ff477a86 ("mac80211: add 802.11 encapsulation offloading support")
Signed-off-by: Vasanthakumar Thiagarajan <vthiagar@codeaurora.org>
Link: https://lore.kernel.org/r/1595427617-1713-1-git-send-email-vthiagar@codeaurora.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
In the case where a vendor command does not implement doit, and has no
flags set, doit would not be validated and a NULL pointer dereference
would occur, for example when invoking the vendor command via iw.
I encountered this while developing new vendor commands. Perhaps in
practice it is advisable to always implement doit along with dumpit,
but it seems reasonable to me to always check doit anyway, not just
when NEED_WDEV.
Signed-off-by: Julian Squires <julian@cipht.net>
Link: https://lore.kernel.org/r/20200706211353.2366470-1-julian@cipht.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The commit 24a2042cb2 ("mac80211: add HE 6 GHz Band Capability
element") failed to check device capability before adding HE 6 GHz
capability element. Below warning is reported in 11ac device in mesh.
Fix that by checking device capability at HE 6 GHz cap IE addition
in mesh beacon and association request.
WARNING: CPU: 1 PID: 1897 at net/mac80211/util.c:2878
ieee80211_ie_build_he_6ghz_cap+0x149/0x150 [mac80211]
[ 3138.720358] Call Trace:
[ 3138.720361] ieee80211_mesh_build_beacon+0x462/0x530 [mac80211]
[ 3138.720363] ieee80211_start_mesh+0xa8/0xf0 [mac80211]
[ 3138.720365] __cfg80211_join_mesh+0x122/0x3e0 [cfg80211]
[ 3138.720368] nl80211_join_mesh+0x3d3/0x510 [cfg80211]
Fixes: 24a2042cb2 ("mac80211: add HE 6 GHz Band Capability element")
Reported-by: Markus Theil <markus.theil@tu-ilmenau.de>
Signed-off-by: Rajkumar Manoharan <rmanohar@codeaurora.org>
Link: https://lore.kernel.org/r/1593656424-18240-1-git-send-email-rmanohar@codeaurora.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
RX queue IRQ mappings are disposed in both the TX IRQ and RX IRQ
error paths. Fix this and dispose of TX IRQ mappings correctly in
case of an error.
Fixes: ea22d51a78 ("ibmvnic: simplify and improve driver probe function")
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel says:
====================
mlxsw fixes
This patch set contains various fixes for mlxsw.
Patches #1-#2 fix two trap related issues introduced in previous cycle.
Patches #3-#5 fix rare use-after-frees discovered by syzkaller. After
over a week of fuzzing with the fixes, the bugs did not reproduce.
Patch #6 from Amit fixes an issue in the ethtool selftest that was
recently discovered after running the test on a new platform that
supports only 1Gbps and 10Gbps speeds.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The test case check_highest_speed_is_chosen() configures $h1 to
advertise a subset of its supported speeds and checks that $h2 chooses
the highest speed from the subset.
To find the common advertised speeds between $h1 and $h2,
common_speeds_get() is called.
Currently, the first speed returned from common_speeds_get() is removed
claiming "h1 does not advertise this speed". The claim is wrong because
the function is called after $h1 already advertised a subset of speeds.
In case $h1 supports only two speeds, it will advertise a single speed
which will be later removed because of previously mentioned bug. This
results in the test needlessly failing. When more than two speeds are
supported this is not an issue because the first advertised speed
is the lowest one.
Fix this by not removing any speed from the list of commonly advertised
speeds.
Fixes: 64916b57c0 ("selftests: forwarding: Add speed and auto-negotiation test")
Reported-by: Danielle Ratson <danieller@mellanox.com>
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The lifetime of the Rx listener item ('rxl_item') is managed using RCU,
but is dereferenced outside of RCU read-side critical section, which can
lead to a use-after-free.
Fix this by increasing the scope of the RCU read-side critical section.
Fixes: 93c1edb27f ("mlxsw: Introduce Mellanox switch driver core")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cited commit mistakenly removed the trap group for externally routed
packets (e.g., via the management interface) and grouped locally routed
and externally routed packet traps under the same group, thereby
subjecting them to the same policer.
This can result in problems, for example, when FRR is restarted and
suddenly all transient traffic is trapped to the CPU because of a
default route through the management interface. Locally routed packets
required to re-establish a BGP connection will never reach the CPU and
the routing tables will not be re-populated.
Fix this by using a different trap group for externally routed packets.
Fixes: 8110668ecd ("mlxsw: spectrum_trap: Register layer 3 control traps")
Reported-by: Alex Veber <alexve@mellanox.com>
Tested-by: Alex Veber <alexve@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cited commit added the ability to program link-local prefix routes to
the ASIC so that relevant packets are routed and trapped correctly.
However, host routes were not included in the change and thus not
programmed to the ASIC. This can result in packets being trapped via an
external route trap instead of a local route trap as in IPv4.
Fix this by programming all the link-local routes to the ASIC.
Fixes: 10d3757fcb ("mlxsw: spectrum_router: Allow programming link-local prefix routes")
Reported-by: Alex Veber <alexve@mellanox.com>
Tested-by: Alex Veber <alexve@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
fib_trie_unmerge() is called with RTNL held, but not from an RCU
read-side critical section. This leads to the following warning [1] when
the FIB alias list in a leaf is traversed with
hlist_for_each_entry_rcu().
Since the function is always called with RTNL held and since
modification of the list is protected by RTNL, simply use
hlist_for_each_entry() and silence the warning.
[1]
WARNING: suspicious RCU usage
5.8.0-rc4-custom-01520-gc1f937f3f83b #30 Not tainted
-----------------------------
net/ipv4/fib_trie.c:1867 RCU-list traversed in non-reader section!!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
1 lock held by ip/164:
#0: ffffffff85a27850 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x49a/0xbd0
stack backtrace:
CPU: 0 PID: 164 Comm: ip Not tainted 5.8.0-rc4-custom-01520-gc1f937f3f83b #30
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
Call Trace:
dump_stack+0x100/0x184
lockdep_rcu_suspicious+0x153/0x15d
fib_trie_unmerge+0x608/0xdb0
fib_unmerge+0x44/0x360
fib4_rule_configure+0xc8/0xad0
fib_nl_newrule+0x37a/0x1dd0
rtnetlink_rcv_msg+0x4f7/0xbd0
netlink_rcv_skb+0x17a/0x480
rtnetlink_rcv+0x22/0x30
netlink_unicast+0x5ae/0x890
netlink_sendmsg+0x98a/0xf40
____sys_sendmsg+0x879/0xa00
___sys_sendmsg+0x122/0x190
__sys_sendmsg+0x103/0x1d0
__x64_sys_sendmsg+0x7d/0xb0
do_syscall_64+0x54/0xa0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fc80a234e97
Code: Bad RIP value.
RSP: 002b:00007ffef8b66798 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc80a234e97
RDX: 0000000000000000 RSI: 00007ffef8b66800 RDI: 0000000000000003
RBP: 000000005f141b1c R08: 0000000000000001 R09: 0000000000000000
R10: 00007fc80a2a8ac0 R11: 0000000000000246 R12: 0000000000000001
R13: 0000000000000000 R14: 00007ffef8b67008 R15: 0000556fccb10020
Fixes: 0ddcf43d5d ("ipv4: FIB Local/MAIN table collapse")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit cited below removed the RCU read-side critical section from
rtnl_fdb_dump() which means that the ndo_fdb_dump() callback is invoked
without RCU protection.
This results in the following warning [1] in the VXLAN driver, which
relied on the callback being invoked from an RCU read-side critical
section.
Fix this by calling rcu_read_lock() in the VXLAN driver, as already done
in the bridge driver.
[1]
WARNING: suspicious RCU usage
5.8.0-rc4-custom-01521-g481007553ce6 #29 Not tainted
-----------------------------
drivers/net/vxlan.c:1379 RCU-list traversed in non-reader section!!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
1 lock held by bridge/166:
#0: ffffffff85a27850 (rtnl_mutex){+.+.}-{3:3}, at: netlink_dump+0xea/0x1090
stack backtrace:
CPU: 1 PID: 166 Comm: bridge Not tainted 5.8.0-rc4-custom-01521-g481007553ce6 #29
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.fc32 04/01/2014
Call Trace:
dump_stack+0x100/0x184
lockdep_rcu_suspicious+0x153/0x15d
vxlan_fdb_dump+0x51e/0x6d0
rtnl_fdb_dump+0x4dc/0xad0
netlink_dump+0x540/0x1090
__netlink_dump_start+0x695/0x950
rtnetlink_rcv_msg+0x802/0xbd0
netlink_rcv_skb+0x17a/0x480
rtnetlink_rcv+0x22/0x30
netlink_unicast+0x5ae/0x890
netlink_sendmsg+0x98a/0xf40
__sys_sendto+0x279/0x3b0
__x64_sys_sendto+0xe6/0x1a0
do_syscall_64+0x54/0xa0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7fe14fa2ade0
Code: Bad RIP value.
RSP: 002b:00007fff75bb5b88 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00005614b1ba0020 RCX: 00007fe14fa2ade0
RDX: 000000000000011c RSI: 00007fff75bb5b90 RDI: 0000000000000003
RBP: 00007fff75bb5b90 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00005614b1b89160
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Fixes: 5e6d243587 ("bridge: netlink dump interface at par with brctl")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Removed redundant words.
Fixes: 571912c69f ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In multiproto mode, bareudp_xmit() accepts sending multicast MPLS and
IPv6 packets regardless of the bareudp ethertype. In practice, this
let an IP tunnel send multicast MPLS packets, or an MPLS tunnel send
IPv6 packets.
We need to restrict the test further, so that the multiproto mode only
enables
* IPv6 for IPv4 tunnels,
* or multicast MPLS for unicast MPLS tunnels.
To improve clarity, the protocol validation is moved to its own
function, where each logical test has its own condition.
v2: s/ntohs/htons/
Fixes: 4b5f67232d ("net: Special handling for IP & MPLS.")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
ip6_route_info_create() invokes nexthop_get(), which increases the
refcount of the "nh".
When ip6_route_info_create() returns, local variable "nh" becomes
invalid, so the refcount should be decreased to keep refcount balanced.
The reference counting issue happens in one exception handling path of
ip6_route_info_create(). When nexthops can not be used with source
routing, the function forgets to decrease the refcnt increased by
nexthop_get(), causing a refcnt leak.
Fix this issue by pulling up the error source routing handling when
nexthops can not be used with source routing.
Fixes: f88d8ea67f ("ipv6: Plumb support for nexthop object in a fib6_info")
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Subbaraya Sundeep says:
====================
Fix bugs in Octeontx2 netdev driver
There are problems in the existing Octeontx2
netdev drivers like missing cancel_work for the
reset task, missing lock in reset task and
missing unergister_netdev in driver remove.
This patch set fixes the above problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Added unregister_netdev in the driver remove
function. Generally unregister_netdev is called
after disabling all the device interrupts but here
it is called before disabling device mailbox
interrupts. The reason behind this is VF needs
mailbox interrupt to communicate with its PF to
clean up its resources during otx2_stop.
otx2_stop disables packet I/O and queue interrupts
first and by using mailbox interrupt communicates
to PF to free VF resources. Hence this patch
calls unregister_device just before
disabling mailbox interrupts.
Fixes: 3184fb5ba9 ("octeontx2-vf: Virtual function driver support")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
During driver exit cancel the queued
reset_task work in VF driver.
Fixes: 3184fb5ba9 ("octeontx2-vf: Virtual function driver support")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Two bugs exist in the code related to reset_task
in PF driver one is the missing protection
against network stack ndo_open and ndo_close.
Other one is the missing cancel_work.
This patch fixes those problems.
Fixes: 4ff7d1488a ("octeontx2-pf: Error handling support")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu says:
====================
rhashtable: Fix unprotected RCU dereference in __rht_ptr
This patch series fixes an unprotected dereference in __rht_ptr.
The first patch is a minimal fix that does not use the correct
RCU markings but is suitable for backport, and the second patch
cleans up the RCU markings.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch restores the RCU marking on bucket_table->buckets as
it really does need RCU protection. Its removal had led to a fatal
bug.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
The rcu_dereference call in rht_ptr_rcu is completely bogus because
we've already dereferenced the value in __rht_ptr and operated on it.
This causes potential double readings which could be fatal. The RCU
dereference must occur prior to the comparison in __rht_ptr.
This patch changes the order of RCU dereference so that it is done
first and the result is then fed to __rht_ptr. The RCU marking
changes have been minimised using casts which will be removed in
a follow-up patch.
Fixes: ba6306e3f6 ("rhashtable: Remove RCU marking from...")
Reported-by: "Gong, Sishuai" <sishuai@purdue.edu>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Modify mtk_gmac0_rgmii_adjust() so it can always be called.
mtk_gmac0_rgmii_adjust() sets-up the TRGMII clocks.
Signed-off-by: René van Dorst <opensource@vdorst.com>
Signed-off-By: David Woodhouse <dwmw2@infradead.org>
Tested-by: Frank Wunderlich <frank-w@public-files.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl8ggswACgkQSD+KveBX
+j7yOgf8DjzPtSpVfUA7Iq28WO6YxJy208oUdLjKNuRNr74vXulHegNlP6cFDHU3
QIddvTBNMTp2BpJpoFMbEod7sVTBq5KVQZvpIuFM2JU4h76vL4cYbWeBuT6rFIoJ
m5vuuUyAB+16QbJzagY/rqfQMs0w7KnR+Zhv18JzwyHhBiRLaPzYdmSWM2kkF8HZ
3DrY8RWgkeaI9vTpE6Fau7BRNDUOMgjIahiUrojJuyPsYZpJf5g+KaMj4xvgcqMa
vaPaw8iHN7+N3KIdcf6MJhfzx3SHP5YNieU/MfvE9sLvdPvfLETdpexYPB8b0/vs
L9w2D8j0uZyXek30fIiIwHaibQGZPw==
=W0Fn
-----END PGP SIGNATURE-----
Merge tag 'mlx5-fixes-2020-07-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5 fixes-2020-07-28
This series introduces some fixes to mlx5 driver.
v1->v2:
- Drop the "Hold reference on mirred devices" patch, until Or's
comments are addressed.
- Imporve "Modify uplink state" patch commit message per Or's request.
Please pull and let me know if there is any problem.
For -Stable:
For -stable v4.9
('net/mlx5e: Fix error path of device attach')
For -stable v4.15
('net/mlx5: Verify Hardware supports requested ptp function on a given
pin')
For -stable v5.3
('net/mlx5e: Modify uplink state on interface up/down')
For -stable v5.4
('net/mlx5e: Fix kernel crash when setting vf VLANID on a VF dev')
('net/mlx5: E-switch, Destroy TSAR when fail to enable the mode')
For -stable v5.5
('net/mlx5: E-switch, Destroy TSAR after reload interface')
For -stable v5.7
('net/mlx5: Fix a bug of using ptp channel index as pin index')
====================
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johan Hovold says:
====================
net: lan78xx: fix NULL deref and memory leak
The first two patches fix a NULL-pointer dereference at probe that can
be triggered by a malicious device and a small transfer-buffer memory
leak, respectively.
For another subsystem I would have marked them:
Cc: stable@vger.kernel.org # 4.3
The third one replaces the driver's current broken endpoint lookup
helper, which could end up accepting incomplete interfaces and whose
results weren't even useeren
Johan
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Drop the bogus endpoint-lookup helper which could end up accepting
interfaces based on endpoints belonging to unrelated altsettings.
Note that the returned bulk pipes and interrupt endpoint descriptor
were never actually used. Instead the bulk-endpoint numbers are
hardcoded to 1 and 2 (matching the specification), while the interrupt-
endpoint descriptor was assumed to be the third descriptor created by
USB core.
Try to bring some order to this by dropping the bogus lookup helper and
adding the missing endpoint sanity checks while keeping the interrupt-
descriptor assumption for now.
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The interrupt URB transfer-buffer was never freed on disconnect or after
probe errors.
Fixes: 55d7de9de6 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Cc: Woojung.Huh@microchip.com <Woojung.Huh@microchip.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the missing endpoint sanity check to prevent a NULL-pointer
dereference should a malicious device lack the expected endpoints.
Note that the driver has a broken endpoint-lookup helper,
lan78xx_get_endpoints(), which can end up accepting interfaces in an
altsetting without endpoints as long as *some* altsetting has a bulk-in
and a bulk-out endpoint.
Fixes: 55d7de9de6 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Cc: Woojung.Huh@microchip.com <Woojung.Huh@microchip.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>