Add skip_hw and skip_sw flags for user to control whether
offload action to hardware.
Also we add hw_count to show how many hardwares accept to offload
the action.
Change man page to describe the usage of skip_sw and skip_hw flag.
An example to add and query action as below.
$ tc actions add action police rate 1mbit burst 100k index 100 skip_sw
$ tc -s -d actions list action police
total acts 1
action order 0: police 0x64 rate 1Mbit burst 100Kb mtu 2Kb action reclassify overhead 0b linklayer ethernet
ref 1 bind 0 installed 2 sec used 2 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
skip_sw in_hw in_hw_count 1
used_hw_stats delayed
Signed-off-by: baowen zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Victor Nogueira <victor@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
This patch added the id check for deleting address in mptcp_parse_opt().
The ADDRESS argument is invalid for the non-zero id address, only needed
for the id 0 address.
# ip mptcp endpoint delete id 1
# ip mptcp endpoint delete id 0 10.0.1.1
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Acked-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Implement mtu setting for vdpa device.
$ vdpa mgmtdev show
vdpasim_net:
supported_classes net
Add the device with mac address and mtu:
$ vdpa dev add name bar mgmtdev vdpasim_net mac 00:11:22:33:44:55 mtu 9000
In above command only mac address or only mtu can also be set.
View the config after setting:
$ vdpa dev config show
bar: mac 00:11:22:33:44:55 link up link_announce false mtu 9000
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
vdpa: Enable user to set mtu of the vdpa device
Implement mtu setting for vdpa device.
$ vdpa mgmtdev show
vdpasim_net:
supported_classes net
Add the device with specified mac address:
$ vdpa dev add name bar mgmtdev vdpasim_net mac 00:11:22:33:44:55
View the config after setting:
$ vdpa dev config show
bar: mac 00:11:22:33:44:55 link up link_announce false mtu 1500
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Query the device configuration layout whenever kernel supports it.
An example of configuration layout of vdpa device of type network:
$ vdpa dev add name bar mgmtdev vdpasim_net
$ vdpa dev config show
bar: mac 00:35:09:19:48:05 link up link_announce false mtu 1500
$ vdpa dev config show -jp
{
"config": {
"bar": {
"mac": "00:35:09:19:48:05",
"link ": "up",
"link_announce ": false,
"mtu": 1500,
}
}
}
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Linux supports 'MPTCP_PM_CMD_SET_FLAGS' since v5.12, and this control has
recently been extended to allow setting flags for a given endpoint id.
Although there is no use for changing 'signal' or 'subflow' flags, it can
be helpful to set/clear the backup bit on existing endpoints: add the 'ip
mptcp endpoint change <...>' command for this purpose.
Link: https://github.com/multipath-tcp/mptcp_net-next/issues/158
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Commit dfcb63ce1de6 ("fq_codel: generalise ce_threshold marking for subset
of traffic") added support in fq_codel for setting a value and mask that
will be applied to the diffserv/ECN byte to turn on the ce_threshold
feature for a subset of traffic.
This adds support to iproute for setting these values. The parameter is
called ce_threshold_selector and takes a value followed by a
slash-separated mask. Some examples:
# apply ce_threshold to ECT(1) traffic
tc qdisc replace dev eth0 root fq_codel ce_threshold 1ms ce_threshold_selector 0x1/0x3
# apply ce_threshold to ECN-capable traffic marked as diffserv AF22
tc qdisc replace dev eth0 root fq_codel ce_threshold 1ms ce_threshold_selector 0x50/0xfc
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
The link kernel supports this endpoint flag since v5.15, let's
expose it to user-space. It allows creation on fullmesh topolgy
via MPTCP subflow.
Additionally update the related man-page, clarifying the behavior
of related options.
Acked-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Commit 690b11f4a6 ("tc: u32: Fix firstfrag filter.") applied in 2012
changed the "ip firstfrag" selector to not match non-fragmented packets
anymore.
However, the documentation added in f15a23966f ("tc: add a man page
for u32 filter") in 2015 includes an example that relies on the previous
behavior (non-fragmented packet counted as first fragment).
Due to this, the example does not work correctly and does not actually
classify regular SSH packets.
Modify the example to use a raw u16 selector on the fragment offset to
make it work, and also make the firstfrag description more clear about
the current behavior.
Fixes: f15a23966f ("tc: add a man page for u32 filter")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: Phil Sutter <phil@nwl.cc>
Cc: Hiroaki SHIMODA <shimoda.hiroaki@gmail.com>
Acked-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
When configuring a devlink PCI port, the pfnumber can be specified
using 'pfnum' and not 'pcipf' as stated in the man page. Fix this.
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Currently, ip neigh does not support the NTF_EXT_MANAGED flag. Add cmdline
support.
Usage example:
# ./ip/ip n replace 192.168.178.30 dev enp5s0 managed extern_learn
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a managed extern_learn REACHABLE
[...]
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David Ahern <dsahern@kernel.org>
Currently, ip neigh does not support the NTF_USE flag. Similar to other flags
such as extern_learn, add cmdline support. The flag dump support is explicitly
missing here, since the kernel does not propagate the flag back to user space.
Usage example:
# ./ip/ip n replace 192.168.178.30 dev enp5s0 use extern_learn
# ./ip/ip n
192.168.178.30 dev enp5s0 lladdr f4:8c:50:5e:71:9a extern_learn REACHABLE
[...]
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David Ahern <dsahern@kernel.org>
Two new commands to manage default policies:
- ip xfrm policy setdefault
- ip xfrm policy getdefault
And the corresponding part in 'ip xfrm monitor'.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
This patch provides an extension to the rdma statistics tool
that allows to set/unset optional counters set dynamically,
using new netlink commands.
Note that the optional counter statistic implementation is
driver-specific and may impact the performance.
Examples:
To enable a set of optional counters on link rocep8s0f0/1:
$ sudo rdma statistic set link rocep8s0f0/1 optional-counters cc_rx_ce_pkts,cc_rx_cnp_pkts
To disable all optional counters on link rocep8s0f0/1:
$ sudo rdma statistic unset link rocep8s0f0/1 optional-counters
Signed-off-by: Neta Ostrovsky <netao@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
This patch introduces the "mode" command, which presents the enabled or
supported (when the "supported" argument is available) optional
counters.
An optional counter is a vendor-specific counter that may be
dynamically enabled/disabled. This enhancement of hwcounters allows
exposing of counters which are for example mutual exclusive and cannot
be enabled at the same time, counters that might degrades performance,
optional debug counters, etc.
Examples:
To present currently enabled optional counters on link rocep8s0f0/1:
$ rdma statistic mode link rocep8s0f0/1
link rocep8s0f0/1 optional-counters cc_rx_ce_pkts
To present supported optional counters on link rocep8s0f0/1:
$ rdma statistic mode supported link rocep8s0f0/1
link rocep8s0f0/1 supported optional-counters cc_rx_ce_pkts,cc_rx_cnp_pkts,cc_tx_cnp_pkts
Signed-off-by: Neta Ostrovsky <netao@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
br. were added between options of the same command. That is not needed
and makes the output to be one 3 lines for no particular reason.
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Values should be .I, square brackets should be used for optional values,
curly brackets for lists. Follow this in the devlink-port man page.
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
When configuring a devlink PCI SF port, the sfnumber can be specified
using 'sfnum' and not 'pcisf' as stated in the man page. Fix this.
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This patch updates the IOAM documentation (ip-route man page) to reflect the
three encap modes that were introduced.
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: David Ahern <dsahern@kernel.org>
Commit d3432bf10f17 ("net: Support filtering interfaces on no master")
in the kernel added support for filtering interfaces/neighbours that
have no master interface.
This patch completes it and adds this support to iproute2:
1. ip link show nomaster
2. ip address show nomaster
3. ip neighbour {show | flush} nomaster
Signed-off-by: Lahav Schlesinger <lschlesinger@drivenets.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
The 'ip link add' invocation template at the top of the ip-macsec man
page formats with a pair of extra double quotes:
ip link add link DEVICE name NAME type macsec [ [ address <lladdr> ]
port PORT | sci <u64> ] [ cipher { default | gcm-aes-128 | gcm-
aes-256"}][" icvlen ICVLEN ] [ encrypt { on | off } ] [ send_sci { on |
This is due to missing whitespace around the gcm-aes-256 identifier
in the source file.
Fixes: b16f525323 ("Add support for configuring MACsec gcm-aes-256 cipher type.")
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add support for setting and dumping per-vlan/interface mcast_router
option. It controls the mcast router mode of a vlan/interface pair.
For bridge devices only modes 0 - 2 are allowed. The possible modes
are:
0 - disabled
1 - automatic router presence detection (default)
2 - permanent router
3 - temporary router (available only for ports)
Example:
# mark port ens16 as a permanent mcast router for vlan 100
$ bridge vlan set dev ens16 vid 100 mcast_router 2
# disable mcast router for port ens16 and vlan 200
$ bridge vlan set dev ens16 vid 200 mcast_router 0
$ bridge -d vlan show
port vlan-id
ens16 1 PVID Egress Untagged
state forwarding mcast_router 1
100
state forwarding mcast_router 2
200
state forwarding mcast_router 0
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Not sure if anyone uses the routel script. The script was
a combination of ip route, shell and awk doing command scraping.
It is now possible to do this much better using the JSON
output formats and python.
Rewriting also fixes the bug where the old script could not parse
the current output format. At the end was getting:
/usr/bin/routel: 48: shift: can't shift that many
The new script also has IPv6 as option.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
This script is old and limited to IPv4.
Using ip route command directly is better option.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
This script was from olden days of ifcfg.
I don't see any distribution using it and it is time to put
it out to pasture.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
This script was a one off hack for a special case.
Now that ip commands have better formatting, there is no
real reason for it.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
Iproute2 has not supported DECnet or IPX since version 5.0.
There were some leftover support in the ip options flags
and parsing, remove these.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Recently we added SKBMOD_F_ECN option support to the kernel; support it in
the tc-skbmod(8) front end, and update its man page accordingly.
The 2 least significant bits of the Traffic Class field in IPv4 and IPv6
headers are used to represent different ECN states [1]:
0b00: "Non ECN-Capable Transport", Non-ECT
0b10: "ECN Capable Transport", ECT(0)
0b01: "ECN Capable Transport", ECT(1)
0b11: "Congestion Encountered", CE
This new option, "ecn", marks ECT(0) and ECT(1) IPv{4,6} packets as CE,
which is useful for ECN-based rate limiting. For example:
$ tc filter add dev eth0 parent 1: protocol ip prio 10 \
u32 match ip protocol 1 0xff flowid 1:2 \
action skbmod \
ecn
The updated tc-skbmod SYNOPSIS looks like the following:
tc ... action skbmod { set SETTABLE | swap SWAPPABLE | ecn } ...
Only one of "set", "swap" or "ecn" shall be used in a single tc-skbmod
command. Trying to use more than one of them at a time is considered
undefined behavior; pipe multiple tc-skbmod commands together instead.
"set" and "swap" only affect Ethernet packets, while "ecn" only affects
IP packets.
Depends on kernel patch "net/sched: act_skbmod: Add SKBMOD_F_ECN option
support", as well as iproute2 patch "tc/skbmod: Remove misinformation
about the swap action".
[1] https://en.wikipedia.org/wiki/Explicit_Congestion_Notification
Reviewed-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
This patch provides man8 documentation for IOAM inside ip, ip-ioam and ip-route.
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: David Ahern <dsahern@kernel.org>
Make use of the already available brief flag and print the basic details of
the IPv4 or IPv6 neighbour cache in a tabular format for better readability
when the brief output is expected.
$ ip -br neigh
172.16.12.100 bridge0 b0:fc:36:2f:07:43
172.16.12.174 bridge0 8c:16:45:2f:bc:1c
172.16.12.250 bridge0 04:d9:f5:c1:0c:74
fe80::267b:9f70:745e:d54d bridge0 b0:fc:36:2f:07:43
fd16:a115:6a62:0:8744:efa1:9933:2c4c bridge0 8c:16:45:2f:bc:1c
fe80::6d9:f5ff:fec1:c74 bridge0 04:d9:f5:c1:0c:74
And add "ip neigh show" to the list of ip sub commands mentioned in the man
page that support the brief output in tabular format.
Signed-off-by: Gokul Sivakumar <gokulkumar792@gmail.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Add control and dump support for the global mcast_querier option which
controls if the bridge will act as a multicast querier for that vlan.
Syntax: $ bridge vlan global set dev bridge vid 1 mcast_querier 1
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Add control and dump support for the global mcast_startup_query_interval
option which controls the interval between queries in the startup phase.
To be consistent with the same bridge-wide option the value is reported
with USER_HZ granularity and the same granularity is expected when setting
it.
Syntax:
$ bridge vlan global set dev bridge vid 1 mcast_startup_query_interval 15000
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Add control and dump support for the global mcast_query_response_interval
option which sets the Max Response Time/Maximum Response Delay for IGMP/MLD
queries sent by the bridge. To be consistent with the same bridge-wide
option the value is reported with USER_HZ granularity and the same
granularity is expected when setting it.
Syntax:
$ bridge vlan global set dev bridge vid 1 mcast_query_response_interval 13000
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Add control and dump support for the global mcast_query_interval
option which controls the interval between queries sent by the bridge
after the end of the startup phase. To be consistent with the same
bridge-wide option the value is reported with USER_HZ granularity and
the same granularity is expected when setting it.
Syntax:
$ bridge vlan global set dev bridge vid 1 mcast_query_interval 13000
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Add control and dump support for the global mcast_querier_interval
option which controls the interval after which if no other router
queries are seen the bridge will start sending its own queries.
To be consistent with the same bridge-wide option the value is reported
with USER_HZ granularity and the same granularity is expected when
setting it.
Syntax:
$ bridge vlan global set dev bridge vid 1 mcast_querier_interval 13000
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Add control and dump support for the global mcast_membership_interval
option which controls the interval after which the bridge will leave a
group if no reports have been received for it. To be consistent with the
same bridge-wide option the value is reported with USER_HZ granularity and
the same granularity is expected when setting it.
The default is 26000 (260 seconds).
Syntax:
$ bridge vlan global set dev bridge vid 1 mcast_membership_interval 13000
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Add control and dump support for the global mcast_last_member_interval
option which controls the interval between queries to find remaining
members of a group after a leave message. To be consistent with the same
bridge-wide option the value is reported with USER_HZ granularity and
the same granularity is expected when setting it.
The default is 100 (1 second).
Syntax:
$ bridge vlan global set dev bridge vid 1 mcast_last_member_interval 200
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Add control and dump support for the global mcast_startup_query_count
option which controls the number of queries the bridge will send on the
vlan during startup phase (default 2).
Syntax:
$ bridge vlan global set dev bridge vid 1 mcast_startup_query_count 5
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Add control and dump support for the global mcast_last_member_count option
which controls the number of queries the bridge will send on the vlan after
a leave is received (default 2).
Syntax:
$ bridge vlan global set dev bridge vid 1 mcast_last_member_count 10
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Add control and dump support for the global mcast_mld_version option
which controls the MLD version on the vlan (default 1).
Syntax: $ bridge vlan global set dev bridge vid 1 mcast_mld_version 2
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Add control and dump support for the global mcast_igmp_version option
which controls the IGMP version on the vlan (default 2).
Syntax: $ bridge vlan global set dev bridge vid 1 mcast_igmp_version 3
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Add control and dump support for the global mcast_snooping option which
controls if multicast snooping is enabled or disabled for a single vlan.
Syntax: $ bridge vlan global set dev bridge vid 1 mcast_snooping 1
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Add support to change global vlan options via a new vlan global
set subcommand similar to the current vlan set subcommand. The man page
and help are updated accordingly. The command works only with bridge
devices. It doesn't support any options yet.
Syntax: $ bridge vlan global set vid VID dev DEV
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>