The structs of CLC accept and confirm messages for SMCv1 and SMCv2 are
separately defined and often casted to each other in the code, which may
increase the risk of errors caused by future divergence of them. So
unify them into one struct for better maintainability.
Suggested-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a large if-else block in smc_clc_send_confirm_accept() and it
is better to split it into two sub-functions.
Suggested-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename some functions or variables with 'fce' in their name but used in
SMCv2.1 as 'fce_v2x' for clarity.
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 2f3ce7a56c ("net: sfp: rework the RollBall PHY waiting code")
changed the long wait before accessing RollBall / FS modules into
probing for PHY every 1 second, and trying 25 times.
Wei Lei reports that this does not work correctly on FS modules: when
initializing, they may report values different from 0xffff in PHY ID
registers for some MMDs, causing get_phy_c45_ids() to find some bogus
MMD.
Fix this by adding the module_t_wait member back, and setting it to 4
seconds for FS modules.
Fixes: 2f3ce7a56c ("net: sfp: rework the RollBall PHY waiting code")
Reported-by: Wei Lei <quic_leiwei@quicinc.com>
Signed-off-by: Marek Behún <kabel@kernel.org>
Tested-by: Lei Wei <quic_leiwei@quicinc.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei says:
====================
dpaa2-switch: small improvements
This patch set consists of a series of small improvements on the
dpaa2-switch driver ranging from adding some more verbosity when
encountering errors to reorganizing code to be easily extensible.
Changes in v3:
- 4/8: removed the fixes tag and moved it to the commit message
- 5/8: specified that there is no user-visible effect
- 6/8: removed the initialization of the err variable
Changes in v2:
- No changes to the actual diff, only rephrased some commit messages and
added more information.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In case a DPAA2 switch interface joins a bridge, the FDB used on the
port will be changed to the one associated with the bridge. What this
means exactly is that any VLAN installed on the port will need to be
removed and then installed back so that it points to the new FDB.
Once this is done, the previous FDB will become unused (no VLAN to
point to it). Even though no traffic will reach this FDB, it's best to
just cleanup the state of the FDB by zeroing its egress flood domain.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Two different DPAA2 switch ports from two different DPSW instances
cannot be under the same bridge. Instead of checking for this
unsupported configuration in the CHANGEUPPER event, check it as early as
possible in the PRECHANGEUPPER one.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Create separate functions, dpaa2_switch_port_prechangeupper and
dpaa2_switch_port_changeupper, to be called directly when a DPSW port
changes its upper device.
This way we are not open-coding everything in the main event callback
and we can easily extent, for example, with bond offload.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The DPSW object has multiple event sources multiplexed over the same
IRQ. The driver has the capability to configure only some of these
events to trigger the IRQ.
The dpsw_get_irq_status() can clear events automatically based on the
value stored in the 'status' variable passed to it. We don't want that
to happen because we could get into a situation when we are clearing
more events than we actually handled.
Just resort to manually clearing the events that we handled. Also, since
status is not used on the out path we remove its initialization to zero.
This change does not have a user-visible effect because the dpaa2-switch
driver enables and handles all the DPSW events which exist at the
moment.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 84cba72956 ("dpaa2-switch: integrate the MAC endpoint support")
added support for MAC endpoints in the dpaa2-switch driver but omitted
to add the ENDPOINT_CHANGED irq to the list of interrupt sources. Fix
this by extending the list of events which can raise an interrupt by
extending the mask passed to the dpsw_set_irq_mask() firmware API.
There is no user visible impact even without this patch since whenever a
switch interface is connected/disconnected from an endpoint both events
are set (LINK_CHANGED and ENDPOINT_CHANGED) and, luckily, the
LINK_CHANGED event could actually raise the interrupt and thus get the
MAC/PHY SW configuration started.
Even with this, it's better to just not rely on undocumented firmware
behavior which can change.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Print a netdev error when we hit a case in which a specific VLAN is
already configured on the port. While at it, change the already existing
netdev_warn into an _err for consistency purposes.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no restriction around the change of the MAC address on the
switch ports, thus declare the interface netdevs IFF_LIVE_ADDR_CHANGE
capable.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no point in updating the MAC address of a switch interface each
time the link state changes, this only needs to happen in case the
endpoint changes (the switch interface is [dis]connected from/to a MAC).
Just move the call to dpaa2_switch_port_set_mac_addr() under
DPSW_IRQ_EVENT_ENDPOINT_CHANGED.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roger Quadros says:
====================
net: ethernet: am65-cpsw: Add mqprio, frame preemption & coalescing
This series adds mqprio qdisc offload in channel mode,
Frame Preemption MAC merge support and RX/TX coalesing
for AM65 CPSW driver.
In v11 following changes were made
- Fix patch "net: ethernet: ti: am65-cpsw: add mqprio qdisc offload in channel mode"
by including units.h
Changelog information in each patch file.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add SW IRQ coalescing based on hrtimers for TX and RX data path which
can be enabled by ethtool commands:
- RX coalescing
ethtool -C eth1 rx-usecs 50
- TX coalescing can be enabled per TX queue
- by default enables coalesing for TX0
ethtool -C eth1 tx-usecs 50
- configure TX0
ethtool -Q eth0 queue_mask 1 --coalesce tx-usecs 100
- configure TX1
ethtool -Q eth0 queue_mask 2 --coalesce tx-usecs 100
- configure TX0 and TX1
ethtool -Q eth0 queue_mask 3 --coalesce tx-usecs 100 --coalesce tx-usecs 100
show configuration for TX0 and TX1:
ethtool -Q eth0 queue_mask 3 --show-coalesce
Comparing to gro_flush_timeout and napi_defer_hard_irqs, this patch
allows to enable IRQ coalesing for RX path separately.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add driver support for viewing / changing the MAC Merge sublayer
parameters and seeing the verification state machine's current state
via ethtool.
As hardware does not support interrupt notification for verification
events we resort to polling on link up. On link up we try a couple of
times for verification success and if unsuccessful then give up.
The Frame Preemption feature is described in the Technical Reference
Manual [1] in section:
12.3.1.4.6.7 Intersperced Express Traffic (IET – P802.3br/D2.0)
Due to Silicon Errata i2208 [2] we set limit min IET fragment size to
124 (excluding 4 bytes mCRC).
[1] AM62x TRM - https://www.ti.com/lit/ug/spruiv7a/spruiv7a.pdf
[2] AM62x Silicon Errata - https://www.ti.com/lit/er/sprz487c/sprz487c.pdf
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds MQPRIO Qdisc offload in full 'channel' mode which allows
not only setting up pri:tc mapping, but also configuring TX shapers
(rate-limiting) on external port FIFOs.
The MQPRIO Qdisc offload is expected to work with or without VLAN/priority
tagged packets.
The CPSW external Port FIFO has 8 Priority queues. The rate-limit can be
set for each of these priority queues. Which Priority queue a packet is
assigned to depends on PN_REG_TX_PRI_MAP register which maps header
priority to switch priority.
The header priority of a packet is assigned via the RX_PRI_MAP_REG which
maps packet priority to header priority.
The packet priority is either the VLAN priority (for VLAN tagged packets)
or the thread/channel offset.
For simplicity, we assign the same priority queue to all queues of a
Traffic Class so it can be rate-limited correctly.
Configuration example:
ethtool -L eth1 tx 5
ethtool --set-priv-flags eth1 p0-rx-ptype-rrobin off
tc qdisc add dev eth1 parent root handle 100: mqprio num_tc 3 \
map 0 0 1 2 0 0 0 0 0 0 0 0 0 0 0 0 \
queues 1@0 1@1 1@2 hw 1 mode channel \
shaper bw_rlimit min_rate 0 100mbit 200mbit max_rate 0 101mbit 202mbit
tc qdisc replace dev eth2 handle 100: parent root mqprio num_tc 1 \
map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 queues 1@0 hw 1
ip link add link eth1 name eth1.100 type vlan id 100
ip link set eth1.100 type vlan egress 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
In the above example two ports share the same TX CPPI queue 0 for low
priority traffic. 3 traffic classes are defined for eth1 and mapped to:
TC0 - low priority, TX CPPI queue 0 -> ext Port 1 fifo0, no rate limit
TC1 - prio 2, TX CPPI queue 1 -> ext Port 1 fifo1, CIR=100Mbit/s, EIR=1Mbit/s
TC2 - prio 3, TX CPPI queue 2 -> ext Port 1 fifo2, CIR=200Mbit/s, EIR=2Mbit/s
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move register definitions to header file. No functional change.
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move this code around to avoid forward declaration.
No functional change.
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Handle offloading commands using switch-case in
am65_cpsw_setup_taprio().
Move checks to am65_cpsw_taprio_replace().
Use NL_SET_ERR_MSG_MOD for error messages.
Change error message from "Failed to set cycle time extension"
to "cycle time extension not supported"
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We will use this Kconfig option to not only enable TAS/EST offload
but also other QoS features like Multiqueue priority descriptors
and MAC-Merge/Frame Preemption. TI_AM65_CPSW_QOS seems a more
appropriate Kconfig option name than TI_AM65_CPSW_TAS.
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Build am65-cpsw-qos only if CONFIG_TI_AM65_CPSW_TAS is enabled.
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some devices do not support individual 'pmac' and 'emac' stats.
For such devices, resort to 'aggregate' stats.
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some devices have errata due to which they cannot report ETH_ZLEN (60)
in the rx-min-frag-size. This was foreseen of course, and lldpad has
logic that when we request it to advertise addFragSize 0, it will round
it up to the lowest value that is _actually_ supported by the hardware.
The problem is that the selftest expects lldpad to report back to us the
same value as we requested.
Make the selftest smarter by figuring out on its own what is a
reasonable value to expect.
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu says:
====================
Convert net selftests to run in unique namespace (last part)
Here is the last part of converting net selftests to run in unique namespace.
This part converts all left tests. After the conversion, we can run the net
sleftests in parallel. e.g.
# ./run_kselftest.sh -n -t net:reuseport_bpf
TAP version 13
1..1
# selftests: net: reuseport_bpf
ok 1 selftests: net: reuseport_bpf
mod 10...
# Socket 0: 0
# Socket 1: 1
...
# Socket 4: 19
# Testing filter add without bind...
# SUCCESS
# ./run_kselftest.sh -p -n -t net:cmsg_so_mark.sh -t net:cmsg_time.sh -t net:cmsg_ipv6.sh
TAP version 13
1..3
# selftests: net: cmsg_so_mark.sh
ok 1 selftests: net: cmsg_so_mark.sh
# selftests: net: cmsg_time.sh
ok 2 selftests: net: cmsg_time.sh
# selftests: net: cmsg_ipv6.sh
ok 3 selftests: net: cmsg_ipv6.sh
# ./run_kselftest.sh -p -n -c net
TAP version 13
1..95
# selftests: net: reuseport_bpf_numa
ok 3 selftests: net: reuseport_bpf_numa
# selftests: net: reuseport_bpf_cpu
ok 2 selftests: net: reuseport_bpf_cpu
# selftests: net: sk_bind_sendto_listen
ok 9 selftests: net: sk_bind_sendto_listen
# selftests: net: reuseaddr_conflict
ok 5 selftests: net: reuseaddr_conflict
...
Here is the part 1 link:
https://lore.kernel.org/netdev/20231202020110.362433-1-liuhangbin@gmail.com
part 2 link:
https://lore.kernel.org/netdev/20231206070801.1691247-1-liuhangbin@gmail.com
part 3 link:
https://lore.kernel.org/netdev/20231213060856.4030084-1-liuhangbin@gmail.com
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a variable RUN_IN_NETNS if the user wants to run all the selected tests
in namespace in parallel. With this, we can save a lot of testing time.
Note that some tests may not fit to run in namespace, e.g.
net/drop_monitor_tests.sh, as the dwdump needs to be run in init ns.
I also added another parameter -p to make all the logs reported separately
instead of mixing them in the stdout or output.log.
Nit: the NUM in run_one is not used, rename it to test_num.
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
pmtu test use /bin/sh, so we need to source ./lib.sh instead of lib.sh
Here is the test result after conversion.
# ./pmtu.sh
TEST: ipv4: PMTU exceptions [ OK ]
TEST: ipv4: PMTU exceptions - nexthop objects [ OK ]
TEST: ipv6: PMTU exceptions [ OK ]
TEST: ipv6: PMTU exceptions - nexthop objects [ OK ]
...
TEST: ipv4: list and flush cached exceptions - nexthop objects [ OK ]
TEST: ipv6: list and flush cached exceptions [ OK ]
TEST: ipv6: list and flush cached exceptions - nexthop objects [ OK ]
TEST: ipv4: PMTU exception w/route replace [ OK ]
TEST: ipv4: PMTU exception w/route replace - nexthop objects [ OK ]
TEST: ipv6: PMTU exception w/route replace [ OK ]
TEST: ipv6: PMTU exception w/route replace - nexthop objects [ OK ]
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The setup_loopback and setup_veth use their own way to create namespace.
So let's just re-define server_ns/client_ns to unique name.
At the same time update the namespace name in gro.sh and toeplitz.sh.
As I don't have env to run toeplitz.sh. Here is only the gro test result.
# ./gro.sh
running test ipv4 data
Expected {200 }, Total 1 packets
Received {200 }, Total 1 packets.
...
Gro::large test passed.
All Tests Succeeded!
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Here is the test result after conversion.
# ./xfrm_policy.sh
PASS: policy before exception matches
PASS: ping to .254 bypassed ipsec tunnel (exceptions)
PASS: direct policy matches (exceptions)
PASS: policy matches (exceptions)
PASS: ping to .254 bypassed ipsec tunnel (exceptions and block policies)
PASS: direct policy matches (exceptions and block policies)
PASS: policy matches (exceptions and block policies)
PASS: ping to .254 bypassed ipsec tunnel (exceptions and block policies after hresh changes)
PASS: direct policy matches (exceptions and block policies after hresh changes)
PASS: policy matches (exceptions and block policies after hresh changes)
PASS: ping to .254 bypassed ipsec tunnel (exceptions and block policies after hthresh change in ns3)
PASS: direct policy matches (exceptions and block policies after hthresh change in ns3)
PASS: policy matches (exceptions and block policies after hthresh change in ns3)
PASS: ping to .254 bypassed ipsec tunnel (exceptions and block policies after htresh change to normal)
PASS: direct policy matches (exceptions and block policies after htresh change to normal)
PASS: policy matches (exceptions and block policies after htresh change to normal)
PASS: policies with repeated htresh change
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Here is the test result after conversion.
# ./stress_reuseport_listen.sh
listen 24000 socks took 0.47714
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When running the test in namespace, the debugfs may not load automatically.
So add a checking to make sure debugfs loaded. Here is the test result
after conversion.
# ./rtnetlink.sh
PASS: policy routing
PASS: route get
...
PASS: address proto IPv4
PASS: address proto IPv6
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This test will move the device to netns 1. Add a new test_ns to do this.
Here is the test result after conversion.
# ./netns-name.sh
netns-name.sh [ OK ]
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Here is the test result after conversion.
# ./gre_gso.sh
TEST: GREv6/v4 - copy file w/ TSO [ OK ]
TEST: GREv6/v4 - copy file w/ GSO [ OK ]
TEST: GREv6/v6 - copy file w/ TSO [ OK ]
TEST: GREv6/v6 - copy file w/ GSO [ OK ]
Tests passed: 4
Tests failed: 0
Acked-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Trivial fix for out-of-tree build that I wasn't testing previously:
1. Create a directory for library object files, fixes:
> gcc lib/kconfig.c -Wall -O2 -g -D_GNU_SOURCE -fno-strict-aliasing -I ../../../../../usr/include/ -iquote /tmp/kselftest/kselftest/net/tcp_ao/lib -I ../../../../include/ -o /tmp/kselftest/kselftest/net/tcp_ao/lib/kconfig.o -c
> Assembler messages:
> Fatal error: can't create /tmp/kselftest/kselftest/net/tcp_ao/lib/kconfig.o: No such file or directory
> make[1]: *** [Makefile:46: /tmp/kselftest/kselftest/net/tcp_ao/lib/kconfig.o] Error 1
2. Include $(KHDR_INCLUDES) that's exported by selftests/Makefile, fixes:
> In file included from lib/kconfig.c:6:
> lib/aolib.h:320:45: warning: ‘struct tcp_ao_add’ declared inside parameter list will not be visible outside of this definition or declaration
> 320 | extern int test_prepare_key_sockaddr(struct tcp_ao_add *ao, const char *alg,
> | ^~~~~~~~~~
...
3. While at here, clean-up $(KSFT_KHDR_INSTALL): it's not needed anymore
since commit f2745dc0ba ("selftests: stop using KSFT_KHDR_INSTALL")
4. Also, while at here, drop .DEFAULT_GOAL definition: that has a
self-explaining comment, that was valid when I made these selftests
compile on local v4.19 kernel, but not needed since
commit 8ce72dc325 ("selftests: fix headers_install circular dependency")
Fixes: cfbab37b3d ("selftests/net: Add TCP-AO library")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312190645.q76MmHyq-lkp@intel.com/
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove documentation for nonexistent struct members, addressing these
warnings:
./net/tipc/link.c:228: warning: Excess struct member 'media_addr' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'timer' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'refcnt' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'proto_msg' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'pmsg' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'backlog_limit' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'exp_msg_count' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'reset_rcv_checkpt' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'transmitq' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'snt_nxt' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'deferred_queue' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'unacked_window' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'next_out' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'long_msg_seq_no' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'bc_rcvr' description in 'tipc_link'
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove documentation for nonexistent structure members, addressing these
warnings:
./include/linux/skbuff.h:1063: warning: Excess struct member 'sp' description in 'sk_buff'
./include/linux/skbuff.h:1063: warning: Excess struct member 'nf_bridge' description in 'sk_buff'
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kuniyuki Iwashima says:
====================
tcp: Refactor bhash2 and remove sk_bind2_node.
This series refactors code around bhash2 and remove some bhash2-specific
fields; sock.sk_bind2_node, and inet_timewait_sock.tw_bind2_node.
patch 1 : optimise bind() for non-wildcard v4-mapped-v6 address
patch 2 - 4 : optimise bind() conflict tests
patch 5 - 12 : Link bhash2 to bhash and unlink sk from bhash2 to
remove sk_bind2_node
The patch 8 will trigger a false-positive error by checkpatch.
v2: resend of https://lore.kernel.org/netdev/20231213082029.35149-1-kuniyu@amazon.com/
* Rebase on latest net-next
* Patch 11
* Add change in inet_diag_dump_icsk() for recent bhash dump patch
v1: https://lore.kernel.org/netdev/20231023190255.39190-1-kuniyu@amazon.com/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Now all sockets including TIME_WAIT are linked to bhash2 using
sock_common.skc_bind_node.
We no longer use inet_bind2_bucket.deathrow, sock.sk_bind2_node,
and inet_timewait_sock.tw_bind2_node.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now we can use sk_bind_node/tw_bind_node for bhash2, which means
we need not link TIME_WAIT sockets separately.
The dead code and sk_bind2_node will be removed in the next patch.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now we do not use tb->owners and can unlink sockets from bhash.
sk_bind_node/tw_bind_node are available for bhash2 and will be
used in the following patch.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We use hlist_empty(&tb->owners) to check if the bhash bucket has a socket.
We can check the child bhash2 buckets instead.
For this to work, the bhash2 bucket must be freed before the bhash bucket.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sockets in bhash are also linked to bhash2, but TIME_WAIT sockets
are linked separately in tb2->deathrow.
Let's replace tb->owners iteration in inet_csk_bind_conflict() with
two iterations over tb2->owners and tb2->deathrow.
This can be done safely under bhash's lock because socket insertion/
deletion in bhash2 happens with bhash's lock held.
Note that twsk_for_each_bound_bhash() will be removed later.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The following patch adds code in the !inet_use_bhash2_on_bind(sk)
case in inet_csk_bind_conflict().
To avoid adding nest and make the change cleaner, this patch
rearranges tests in inet_csk_bind_conflict().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bhash2 added a new member sk_bind2_node in struct sock to link
sockets to bhash2 in addition to bhash.
bhash is still needed to search conflicting sockets efficiently
from a port for the wildcard address. However, bhash itself need
not have sockets.
If we link each bhash2 bucket to the corresponding bhash bucket,
we can iterate the same set of the sockets from bhash2 via bhash.
This patch links bhash2 to bhash only, and the actual use will be
in the later patches. Finally, we will remove sk_bind2_node.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Later, we no longer link sockets to bhash. Instead, each bhash2
bucket is linked to the corresponding bhash bucket.
Then, we pass the bhash bucket to bhash2 allocation functions as
tb. However, tb is already used in inet_bind2_bucket_create() and
inet_bind2_bucket_init() as the bhash2 bucket.
To make the following diff clear, let's use tb2 for the bhash2 bucket
there.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
inet_bind2_bucket_addr_match() and inet_bind2_bucket_match_addr_any()
are called for each bhash2 bucket to check conflicts. Thus, we call
ipv6_addr_any() and ipv6_addr_v4mapped() over and over during bind().
Let's avoid calling them by saving the address type in inet_bind2_bucket.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In bhash2, IPv4/IPv6 addresses are saved in two union members,
which complicate address checks in inet_bind2_bucket_addr_match()
and inet_bind2_bucket_match_addr_any() considering uninitialised
memory and v4-mapped-v6 conflicts.
Let's simplify that by saving IPv4 address as v4-mapped-v6 address
and defining tb2.rcv_saddr as tb2.v6_rcv_saddr.s6_addr32[3].
Then, we can compare v6 address as is, and after checking v4-mapped-v6,
we can compare v4 address easily. Also, we can remove tb2->family.
Note these functions will be further refactored in the next patch.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The protocol family tests in inet_bind2_bucket_addr_match() and
inet_bind2_bucket_match_addr_any() are ordered as follows.
if (sk->sk_family != tb2->family)
else if (sk->sk_family == AF_INET6)
else
This patch rearranges them so that AF_INET6 socket is handled first
to make the following patch tidy, where tb2->family will be removed.
if (sk->sk_family == AF_INET6)
else if (tb2->family == AF_INET6)
else
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While checking port availability in bind() or listen(), we used only
bhash for all v4-mapped-v6 addresses. But there is no good reason not
to use bhash2 for v4-mapped-v6 non-wildcard addresses.
Let's do it by returning true in inet_use_bhash2_on_bind(). Then, we
also need to add a test in inet_bind2_bucket_match_addr_any() so that
::ffff:X.X.X.X will match with 0.0.0.0.
Note that sk->sk_rcv_saddr is initialised for v4-mapped-v6 sk in
__inet6_bind().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>