linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-17 01:04:19 +08:00

Author	SHA1	Message	Date
Eric Dumazet	5c701549c9	tcp: move retrans_out, sacked_out, tlp_high_seq, last_oow_ack_time init to tcp_disconnect() If we make sure all listeners have these fields cleared, then a clone will also inherit zero values. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 22:19:05 -08:00
Eric Dumazet	5d83676462	tcp: do not clear urg_data in tcp_create_openreq_child All listeners have this field cleared already, since tcp_disconnect() clears it and newly created sockets have also a zero value here. So a clone will inherit a zero value here. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 22:19:05 -08:00
Eric Dumazet	3a9a57f637	tcp: move snd_cwnd & snd_cwnd_cnt init to tcp_disconnect() Passive connections can inherit proper value by cloning, if we make sure all listeners have the proper values there. tcp_disconnect() was setting snd_cwnd to 2, which seems quite obsolete since IW10 adoption. Also remove an obsolete comment. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 22:19:05 -08:00
Eric Dumazet	b9e2e689aa	tcp: move mdev_us init to tcp_disconnect() If we make sure a listener always has its mdev_us field set to TCP_TIMEOUT_INIT, we do not need to rewrite this field after a new clone is created. tcp_disconnect() is very seldom used in real applications. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 22:19:05 -08:00
Eric Dumazet	a0070e463f	tcp: do not clear srtt_us in tcp_create_openreq_child All listeners have this field cleared already, since tcp_disconnect() clears it and newly created sockets have also a zero value here. So a clone will inherit a zero value here. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 22:19:05 -08:00
Eric Dumazet	eb2c80ca87	tcp: do not clear packets_out in tcp_create_openreq_child() New sockets have this field cleared, and tcp_disconnect() calls tcp_write_queue_purge() which among other things also clear tp->packets_out So a listener is guaranteed to have this field cleared. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 22:19:04 -08:00
Eric Dumazet	6a408147ea	tcp: move icsk_rto init to tcp_disconnect() If we make sure a listener always has its icsk_rto field set to TCP_TIMEOUT_INIT, we do not need to rewrite this field after a new clone is created. tcp_disconnect() is very seldom used in real applications. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 22:19:04 -08:00
Eric Dumazet	b84235e291	tcp: do not set snd_ssthresh in tcp_create_openreq_child() New sockets get the field set to TCP_INFINITE_SSTHRESH in tcp_init_sock() In case a socket had this field changed and transitions to TCP_LISTEN state, tcp_disconnect() also makes sure snd_ssthresh is set to TCP_INFINITE_SSTHRESH. So a listener has this field set to TCP_INFINITE_SSTHRESH already. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 22:19:04 -08:00
YueHaibing	bec03debe2	net/mlx4: remove unneeded semicolon Remove unneeded semicolon. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 22:05:42 -08:00
YueHaibing	5c423d7114	net: ethernet: ti: cpsw-phy-sel: remove unneeded semicolon Remove unneeded semicolon. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 22:05:16 -08:00
YueHaibing	d4fb30f6f1	tipc: remove unneeded semicolon in trace.c Remove unneeded semicolon Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 22:04:43 -08:00
YueHaibing	8b59bfe83c	qed: remove duplicated include from qed_if.h Remove duplicated include. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Denis Bolotin <dbolotin@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 21:57:45 -08:00
Colin Ian King	6394d98df6	sb1000: fix a couple of indentation issues and remove assignment in if statements There is an if statement and a return statement that are incorrectly indented. Fix these. Also replace the assignment-in-if statements to assignment followed by an if to keep to the coding style. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 21:51:36 -08:00
Peter Oskolkov	22c2ad616b	net: add a route cache full diagnostic message In some testing scenarios, dst/route cache can fill up so quickly that even an explicit GC call occasionally fails to clean it up. This leads to sporadically failing calls to dst_alloc and "network unreachable" errors to the user, which is confusing. This patch adds a diagnostic message to make the cause of the failure easier to determine. Signed-off-by: Peter Oskolkov <posk@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:37:25 -08:00
Ioana Ciocoi Radulescu	68d7431553	dpaa2-eth: Fix ndo_stop routine In the current implementation, on interface down we disabled NAPI and then manually drained any remaining ingress frames. This could lead to a situation when, under heavy traffic, the data availability notification for some of the channels would not get rearmed correctly. Change the implementation such that we let all remaining ingress frames be processed as usual and only disable NAPI once the hardware queues are empty. We also add a wait on the Tx side, to allow hardware time to process all in-flight Tx frames before issueing the disable command. Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:37:02 -08:00
Colin Ian King	5191673b69	wan: dscc4: fix various indentation issues There are some lines that have indentation issues, fix these. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:36:31 -08:00
David S. Miller	039d52e15e	Merge branch 'vxlan-FDB-veto' Petr Machata says: ==================== vxlan: Allow vetoing FDB operations mlxsw does not implement handling of the more advanced types of VXLAN FDB entries. In order to provide visibility to users, it is important to be able to reject such FDB entries, ideally with an explanation passed in extended ack. This patch set implements this. In patches #1-#4, vxlan is gradually transformed to support vetoing of FDB entries added (or modified) through vxlan_fdb_update(), and the default FDB entry added in __vxlan_dev_create(). Patches #5-#7 deal with vxlan_changelink(). The existing code recognizes that vxlan_fdb_update() may fail, but doesn't attempt to keep things intact if it does. These patches change the function in several steps to gracefully handle vetoes (or other failures). Then in patches #8-#11, extack arguments are added, respectively, to ndo_fdb_add(), mlxsw's mlxsw_sp_nve_ops.fdb_replay, the functions that connect to the VXLAN vetoing code, and call_switchdev_notifiers(). Note that call_switchdev_blocking_notifiers() already does support extack. Finally in patch #12, mlxsw is extended to add extack messages to rejected FDB entries. In patch #13, the functionality is tested. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:47 -08:00
Petr Machata	7e1046fd1f	selftests: mlxsw: Test veto of unsupported VXLAN FDBs mlxsw doesn't implement offloading of all types of FDB entries that the VXLAN driver supports. Test that such FDB entries are rejected. That makes sure that the decision made by the existing validation code in mlxsw propagates up the stack. It also exercises rollback functionality in VXLAN, and tests that extack is returned. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:47 -08:00
Petr Machata	a40313d956	mlxsw: spectrum: Add extack messages to VXLAN FDB rejection Annotate the rejections in mlxsw_sp_switchdev_vxlan_work_prepare() with textual reasons. Because this code ends up being invoked for FDB replay as well, drop the default message from there, so that the more accurate error message is not overwritten. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:47 -08:00
Petr Machata	6685987c29	switchdev: Add extack argument to call_switchdev_notifiers() A follow-up patch will enable vetoing of FDB entries. Make it possible to communicate details of why an FDB entry is not acceptable back to the user. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:47 -08:00
Petr Machata	4c59b7d160	vxlan: Add extack to switchdev operations There are four sources of VXLAN switchdev notifier calls: - the changelink() link operation, which already supports extack, - ndo_fdb_add() which got extack support in a previous patch, - FDB updates due to packet forwarding, - and vxlan_fdb_replay(). Extend vxlan_fdb_switchdev_call_notifiers() to include extack in the switchdev message that it sends, and propagate the argument upwards to the callers. For the first two cases, pass in the extack gotten through the operation. For case #3, pass in NULL. To cover the last case, extend vxlan_fdb_replay() to take extack argument, which might come from whatever operation necessitated the FDB replay. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:47 -08:00
Petr Machata	d907f58fa9	mlxsw: Add extack to mlxsw_sp_nve_ops.fdb_replay A follow-up patch will extend vxlan_fdb_replay() with an extack argument. Extend the fdb_replay callback in mlxsw likewise so that the argument is ready for the vxlan conversion. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:47 -08:00
Petr Machata	87b0984ebf	net: Add extack argument to ndo_fdb_add() Drivers may not be able to support certain FDB entries, and an error code is insufficient to give clear hints as to the reasons of rejection. In order to make it possible to communicate the rejection reason, extend ndo_fdb_add() with an extack argument. Adapt the existing implementations of ndo_fdb_add() to take the parameter (and ignore it). Pass the extack parameter when invoking ndo_fdb_add() from rtnl_fdb_add(). Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:47 -08:00
Petr Machata	1cdc98c271	vxlan: changelink: Delete remote after update If a change in remote address prompts a change in a default FDB entry, that change might be vetoed. If that happens, it would then be necessary to reinstate the already-removed default FDB entry corresponding to the previous remote address. Instead, arrange to have the previous address removed only after the FDB is successfully vetted. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:46 -08:00
Petr Machata	038a5a99e9	vxlan: changelink: Postpone vxlan_config_apply() When an FDB entry is vetoed, it is necessary to unroll the changes that have already been done. To avoid having to unroll vxlan_config_apply(), postpone the call after the point where the vetoing takes place. Since the call can't fail, it doesn't necessitate any cleanups in the preceding FDB update logic. Correspondingly, move down the mod_timer() call as well. References to *dst need to be replaced with references to conf. Additionally, old_dst and old_age_interval are not necessary anymore, and therefore drop them. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:46 -08:00
Petr Machata	8db9427d52	vxlan: changelink: Inline vxlan_dev_configure() The changelink operation may cause change in remote address, and therefore an FDB update, which can be vetoed. To properly handle vetoing, vxlan_changelink() needs to be gradually updated. In this patch simply replace vxlan_dev_configure() with the two constituent calls. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:46 -08:00
Petr Machata	61f46fe8c6	vxlan: Allow vetoing of FDB notifications Change vxlan_fdb_switchdev_call_notifiers() to return the result from calling switchdev notifiers. Propagate the error number up the stack. In vxlan_fdb_update_existing() and vxlan_fdb_update_create() add rollbacks to clean up the work that was done before the veto. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:46 -08:00
Petr Machata	ccdfd4f71d	vxlan: Have vxlan_fdb_replace() save original rdst value To enable rollbacks after vetoed FDB updates, extend vxlan_fdb_replace() to take an additional argument where it should store the original values of a modified rdst. Update the sole caller. The following patch will make use of the saved value. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:46 -08:00
Petr Machata	a76d1ca296	vxlan: Split vxlan_fdb_update() in two In order to make it easier to implement rollbacks after FDB update vetoing, separate the FDB update code to two parts: one that deals with updates of existing FDB entries, and one that creates new entries. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:46 -08:00
Petr Machata	c2b200e0ba	vxlan: Move up vxlan_fdb_free(), vxlan_fdb_destroy() These functions will be needed for rollbacks of vetoed FDB entries. Move them up so that they are visible at their intended point of use. Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:18:46 -08:00
David S. Miller	12ff91c8ba	Merge branch 'improving-TCP-behavior-on-host-congestion' Yuchung Cheng says: ==================== improving TCP behavior on host congestion This patch set aims to improve how TCP handle local qdisc congestion by simplifying the previous implementation. Previously when an skb fails to (re)transmit due to local qdisc congestion or other resource issue, TCP refrains from setting the skb timestamp or the recovery starting time. This design makes determining when to abort a stalling socket more complicated, as the timestamps of these tranmission attempts were missing. The stack needs to sort of infer when the original attempt happens. A by-product is a socket may disregard the system timeout limit (i.e. sysctl net.ipv4.tcp_retries2 or USER_TIMEOUT option), and continue to retry until the transmission is successful. In data-center environment when TCP RTO is small, this could cause the socket to retry frequently for long during qdisc congestion. The solution is to first unconditionally timestamp skb and recovery attempt. Then retry more conservatively (twice a second) on local qdisc congestion but abort the sockets according to the system limit. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:12:26 -08:00
Yuchung Cheng	c1d5674f83	tcp: less aggressive window probing on local congestion Previously when the sender fails to send (original) data packet or window probes due to congestion in the local host (e.g. throttling in qdisc), it'll retry within an RTO or two up to 500ms. In low-RTT networks such as data-centers, RTO is often far below the default minimum 200ms. Then local host congestion could trigger a retry storm pouring gas to the fire. Worse yet, the probe counter (icsk_probes_out) is not properly updated so the aggressive retry may exceed the system limit (15 rounds) until the packet finally slips through. On such rare events, it's wise to retry more conservatively (500ms) and update the stats properly to reflect these incidents and follow the system limit. Note that this is consistent with the behaviors when a keep-alive probe or RTO retry is dropped due to local congestion. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:12:26 -08:00
Yuchung Cheng	590d2026d6	tcp: retry more conservatively on local congestion Previously when the sender fails to retransmit a data packet on timeout due to congestion in the local host (e.g. throttling in qdisc), it'll retry within an RTO up to 500ms. In low-RTT networks such as data-centers, RTO is often far below the default minimum 200ms (and the cap 500ms). Then local host congestion could trigger a retry storm pouring gas to the fire. Worse yet, the retry counter (icsk_retransmits) is not properly updated so the aggressive retry may exceed the system limit (15 rounds) until the packet finally slips through. On such rare events, it's wise to retry more conservatively (500ms) and update the stats properly to reflect these incidents and follow the system limit. Note that this is consistent with the behavior when a keep-alive probe is dropped due to local congestion. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:12:26 -08:00
Yuchung Cheng	9721e709fa	tcp: simplify window probe aborting on USER_TIMEOUT Previously we use the next unsent skb's timestamp to determine when to abort a socket stalling on window probes. This no longer works as skb timestamp reflects the last instead of the first transmission. Instead we can estimate how long the socket has been stalling with the probe count and the exponential backoff behavior. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:12:26 -08:00
Yuchung Cheng	01a523b071	tcp: create a helper to model exponential backoff Create a helper to model TCP exponential backoff for the next patch. This is pure refactor w no behavior change. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:12:26 -08:00
Yuchung Cheng	c7d13c8faa	tcp: properly track retry time on passive Fast Open This patch addresses a corner issue on timeout behavior of a passive Fast Open socket. A passive Fast Open server may write and close the socket when it is re-trying SYN-ACK to complete the handshake. After the handshake is completely, the server does not properly stamp the recovery start time (tp->retrans_stamp is 0), and the socket may abort immediately on the very first FIN timeout, instead of retying until it passes the system or user specified limit. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:12:26 -08:00
Yuchung Cheng	7ae189759c	tcp: always set retrans_stamp on recovery Previously TCP socket's retrans_stamp is not set if the retransmission has failed to send. As a result if a socket is experiencing local issues to retransmit packets, determining when to abort a socket is complicated w/o knowning the starting time of the recovery since retrans_stamp may remain zero. This complication causes sub-optimal behavior that TCP may use the latest, instead of the first, retransmission time to compute the elapsed time of a stalling connection due to local issues. Then TCP may disrecard TCP retries settings and keep retrying until it finally succeed: not a good idea when the local host is already strained. The simple fix is to always timestamp the start of a recovery. It's worth noting that retrans_stamp is also used to compare echo timestamp values to detect spurious recovery. This patch does not break that because retrans_stamp is still later than when the original packet was sent. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:12:26 -08:00
Yuchung Cheng	7f12422c48	tcp: always timestamp on every skb transmission Previously TCP skbs are not always timestamped if the transmission failed due to memory or other local issues. This makes deciding when to abort a socket tricky and complicated because the first unacknowledged skb's timestamp may be 0 on TCP timeout. The straight-forward fix is to always timestamp skb on every transmission attempt. Also every skb retransmission needs to be flagged properly to avoid RTT under-estimation. This can happen upon receiving an ACK for the original packet and the a previous (spurious) retransmission has failed. It's worth noting that this reverts to the old time-stamping style before commit `8c72c65b42` ("tcp: update skb->skb_mstamp more carefully") which addresses a problem in computing the elapsed time of a stalled window-probing socket. The problem will be addressed differently in the next patches with a simpler approach. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:12:26 -08:00
Yuchung Cheng	88f8598d0a	tcp: exit if nothing to retransmit on RTO timeout Previously TCP only warns if its RTO timer fires and the retransmission queue is empty, but it'll cause null pointer reference later on. It's better to avoid such catastrophic failure and simply exit with a warning. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:12:26 -08:00
Heiner Kallweit	9b420eff9f	net: phy: micrel: use phy_read_mmd and phy_write_mmd This driver implements open-coded versions of phy_read_mmd() and phy_write_mmd() for KSZ9031. That's not needed, let's use the phylib functions directly. This is compile-tested only because I have no such hardware. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:10:00 -08:00
Mathieu Malaterre	43deda5408	davicom: Annotate implicit fall through in dm9000_set_io There is a plan to build the kernel with -Wimplicit-fallthrough and this place in the code produced a warning (W=1). This commit removes the following warning: include/linux/device.h:1480:5: warning: this statement may fall through [-Wimplicit-fallthrough=] drivers/net/ethernet/davicom/dm9000.c:397:3: note: in expansion of macro 'dev_dbg' drivers/net/ethernet/davicom/dm9000.c:398:2: note: here Signed-off-by: Mathieu Malaterre <malat@debian.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 15:08:17 -08:00
David Herrmann	49b4994c14	net/ipv6/udp_tunnel: prefer SO_BINDTOIFINDEX over SO_BINDTODEVICE The udp-tunnel setup allows binding sockets to a network device. Prefer the new SO_BINDTOIFINDEX to avoid temporarily resolving the device-name just to look it up in the ioctl again. Reviewed-by: Tom Gundersen <teg@jklm.no> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 14:55:52 -08:00
David Herrmann	2eadee72db	net/ipv4/udp_tunnel: prefer SO_BINDTOIFINDEX over SO_BINDTODEVICE The udp-tunnel setup allows binding sockets to a network device. Prefer the new SO_BINDTOIFINDEX to avoid temporarily resolving the device-name just to look it up in the ioctl again. Reviewed-by: Tom Gundersen <teg@jklm.no> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 14:55:52 -08:00
David Herrmann	f5dd3d0c96	net: introduce SO_BINDTOIFINDEX sockopt This introduces a new generic SOL_SOCKET-level socket option called SO_BINDTOIFINDEX. It behaves similar to SO_BINDTODEVICE, but takes a network interface index as argument, rather than the network interface name. User-space often refers to network-interfaces via their index, but has to temporarily resolve it to a name for a call into SO_BINDTODEVICE. This might pose problems when the network-device is renamed asynchronously by other parts of the system. When this happens, the SO_BINDTODEVICE might either fail, or worse, it might bind to the wrong device. In most cases user-space only ever operates on devices which they either manage themselves, or otherwise have a guarantee that the device name will not change (e.g., devices that are UP cannot be renamed). However, particularly in libraries this guarantee is non-obvious and it would be nice if that race-condition would simply not exist. It would make it easier for those libraries to operate even in situations where the device-name might change under the hood. A real use-case that we recently hit is trying to start the network stack early in the initrd but make it survive into the real system. Existing distributions rename network-interfaces during the transition from initrd into the real system. This, obviously, cannot affect devices that are up and running (unless you also consider moving them between network-namespaces). However, the network manager now has to make sure its management engine for dormant devices will not run in parallel to these renames. Particularly, when you offload operations like DHCP into separate processes, these might setup their sockets early, and thus have to resolve the device-name possibly running into this race-condition. By avoiding a call to resolve the device-name, we no longer depend on the name and can run network setup of dormant devices in parallel to the transition off the initrd. The SO_BINDTOIFINDEX ioctl plugs this race. Reviewed-by: Tom Gundersen <teg@jklm.no> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 14:55:51 -08:00
Vakul Garg	692d7b5d1f	tls: Fix recvmsg() to be able to peek across multiple records This fixes recvmsg() to be able to peek across multiple tls records. Without this patch, the tls's selftests test case 'recv_peek_large_buf_mult_recs' fails. Each tls receive context now maintains a 'rx_list' to retain incoming skb carrying tls records. If a tls record needs to be retained e.g. for peek case or for the case when the buffer passed to recvmsg() has a length smaller than decrypted record length, then it is added to 'rx_list'. Additionally, records are added in 'rx_list' if the crypto operation runs in async mode. The records are dequeued from 'rx_list' after the decrypted data is consumed by copying into the buffer passed to recvmsg(). In case, the MSG_PEEK flag is used in recvmsg(), then records are not consumed or removed from the 'rx_list'. Signed-off-by: Vakul Garg <vakul.garg@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 14:20:40 -08:00
David S. Miller	fb73d62025	Merge branch 'dsa-lantiq_gswip-probe-fixes-and-remove-cleanup' Johan Hovold says: ==================== net: dsa: lantiq_gswip: probe fixes and remove cleanup This series fix a few issues found through inspection when fixing up new bad uses of of_find_compatible_node() that have crept in since 4.19. Note that these have only been compile tested. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 12:12:19 -08:00
Johan Hovold	8bb18f69c7	net: dsa: lantiq_gswip: drop bogus drvdata check The platform-device driver data is set on successful probe and will never be NULL on remove (or we have much bigger problems). Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 12:12:19 -08:00
Johan Hovold	c8cbcb0d8b	net: dsa: lantiq_gswip: fix OF child-node lookups Use the new of_get_compatible_child() helper to look up child nodes to avoid ever matching non-child nodes elsewhere in the tree. Also fix up the related struct device_node leaks. Fixes: `14fceff477` ("net: dsa: Add Lantiq / Intel DSA driver for vrx200") Cc: stable <stable@vger.kernel.org> # 4.20 Cc: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 12:12:19 -08:00
Johan Hovold	aed13f2e00	net: dsa: lantiq_gswip: fix use-after-free on failed probe Make sure to disable and deregister the switch on late probe errors to avoid use-after-free when the device-resource-managed switch is freed. Fixes: `14fceff477` ("net: dsa: Add Lantiq / Intel DSA driver for vrx200") Cc: stable <stable@vger.kernel.org> # 4.20 Cc: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 12:12:19 -08:00
Bert Kenward	5fb1beecea	sfc: extend MTD support for newer hardware The X2 family of NICs (based on the SFC9250) have additional MTD partitions for firmware and configuration. This includes partitions that are read-only. The NICs also have extended versions of the NVRAM interface, allowing more detailed status information to be returned. Signed-off-by: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-01-17 12:10:14 -08:00

1 2 3 4 5 ...

810809 Commits