Add support for the new TCA_MPLS_ACT_MAC_PUSH action (kernel commit
a45294af9e96 ("net/sched: act_mpls: Add action to push MPLS LSE before
Ethernet header")). This action let TC push an MPLS header before the
MAC header of a frame.
Example (encapsulate all outgoing frames with label 20, then add an
outer Ethernet header):
# tc filter add dev ethX matchall \
action mpls mac_push label 20 ttl 64 \
action vlan push_eth dst_mac 0a:00:00:00:00:02 \
src_mac 0a:00:00:00:00:01
This patch also adds an alias for ETH_P_TEB, since it is useful when
decapsulating MPLS packets that contain an Ethernet frame.
With MAC_PUSH, there's no previous Ethertype to modify. However, the
"protocol" option is still needed, because the kernel uses it to set
skb->protocol. So rename can_modify_ethtype() to can_set_ethtype().
Also add a test suite for m_mpls, which covers the new action and the
pre-existing ones.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add support for the new TCA_VLAN_ACT_POP_ETH and TCA_VLAN_ACT_PUSH_ETH
actions (kernel commit 19fbcb36a39e ("net/sched: act_vlan:
Add {POP,PUSH}_ETH actions"). These action let TC remove or add the
Ethernet at the head of a frame.
Drop an Ethernet header:
# tc filter add dev ethX matchall action vlan pop_eth
Push an Ethernet header (the original frame must have no MAC header):
# tc filter add dev ethX matchall action vlan \
push_eth dst_mac 0a:00:00:00:00:02 src_mac 0a:00:00:00:00:01
Also add a test suite for m_vlan, which covers these new actions and
the pre-existing ones.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
This patch adds the user-space control and dump of mdb entry source
address. When setting the new MDBA_SET_ENTRY_ATTRS nested attribute is
used and inside is added MDBE_ATTR_SOURCE based on the address family.
When dumping we look for MDBA_MDB_EATTR_SOURCE and if present we add the
"src x.x.x.x" output. The source address will be always shown as it's
needed to match the entry to modify it from user-space.
Example:
$ bridge mdb add dev bridge port ens13 grp 239.0.0.1 src 1.2.3.4 permanent vid 100
$ bridge mdb show
dev bridge port ens13 grp 239.0.0.1 src 1.2.3.4 permanent vid 100
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
The XFRMA_SET_MARK_MASK attribute can be set in states (4.19+)
It is optional and the kernel default is 0xffffffff
It is the mask of XFRMA_SET_MARK(a.k.a. XFRMA_OUTPUT_MARK in 4.18)
e.g.
./ip/ip xfrm state add output-mark 0x6 mask 0xab proto esp \
auth digest_null 0 enc cipher_null ''
ip xfrm state
src 0.0.0.0 dst 0.0.0.0
proto esp spi 0x00000000 reqid 0 mode transport
replay-window 0
output-mark 0x6/0xab
auth-trunc digest_null 0x30 0
enc ecb(cipher_null)
anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000
sel src 0.0.0.0/0 dst 0.0.0.0/0
Signed-off-by: Antony Antony <antony@phenome.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Add health reporter test command and allow user to trigger a test event.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Added description of link flags allmulticast, promisc and trailers.
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This commit adds support to expose the following inet socket options:
-- recverr
-- is_icsk
-- freebind
-- hdrincl
-- mc_loop
-- transparent
-- mc_all
-- nodefrag
-- bind_address_no_port
-- recverr_rfc4884
-- defer_connect
with the option --inet-sockopt. The individual option is only shown
when set.
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
This patch adds support for recently
added link IFLA_PROTO_DOWN_REASON attribute.
IFLA_PROTO_DOWN_REASON enumerates reasons
for the already existing IFLA_PROTO_DOWN link
attribute.
$ cat /etc/iproute2/protodown_reasons.d/r.conf
0 mlag
1 evpn
2 vrrp
3 psecurity
$ ip link set dev vx10 protodown on protodown_reason vrrp on
$ip link show dev vx10
14: vx10: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
DEFAULT group default qlen 1000
link/ether f2:32:28:b8:35:ff brd ff:ff:ff:ff:ff:ff protodown on
protodown_reason <vrrp>
$ip -p -j link show dev vx10
[ {
<snip>
"proto_down": true,
"proto_down_reason": [ "vrrp" ]
} ]
$ip link set dev vx10 protodown_reason mlag on
$ip link show dev vx10
14: vx10: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode
DEFAULT group default qlen 1000
link/ether f2:32:28:b8:35:ff brd ff:ff:ff:ff:ff:ff protodown on
protodown_reason <mlag,vrrp>
$ip -p -j link show dev vx10
[ {
<snip>
"proto_down": true,
"protodown_reason": [ "mlag","vrrp" ]
} ]
$ip -p -j link show dev vx10
$ip link set dev vx10 protodown off protodown_reason vrrp off
Error: Cannot clear protodown, active reasons.
$ip link set dev vx10 protodown off protodown_reason mlag off
$
Note: for somereason the json and non-json key for protodown
are different (protodown and proto_down). I have kept the
same for protodown reason for consistency (protodown_reason and
proto_down_reason).
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
PRP support requires a proto parameter which is 0 for hsr and 1 for
prp. Default is hsr and is backward compatible.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Document the new supported criteria of auto mode. Examples:
$ rdma statistic qp set link mlx5_2/1 auto pid on
$ rdma statistic qp set link mlx5_2/1 auto pid,type on
Signed-off-by: Mark Zhang <markz@mellanox.com>
Reviewed-by: Ido Kalir <idok@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
This flag allows to create SA where sequence number can cycle in
outbound packets if set.
Signed-off-by: Petr Vaněk <pv@excello.cz>
Signed-off-by: David Ahern <dsahern@kernel.org>
In most of cases a user wants to see only the dynamic mac addresses
in the fdb output. But currently the 'fdb show' displays tons of
various self entries, those only waste the output without any useful
goal.
New option 'dynamic' for 'show' and 'get' commands forces display
only relevant records.
Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
To improve the usability better use case-insensitive pattern-matching
in ifstat, nstat and ss tools.
Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The XFRMA_IF_ID attribute is set in policies for them to be
associated with an XFRM interface (4.19+).
Add support for getting/deleting policies with this attribute.
For supporting 'deleteall' the XFRMA_IF_ID attribute needs to be
explicitly copied.
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
In commit aed63ae1ac ("ip xfrm: support setting/printing XFRMA_IF_ID attribute in states/policies")
I added the ability to set/print the xfrm interface ID without updating
the man page.
Fixes: aed63ae1ac ("ip xfrm: support setting/printing XFRMA_IF_ID attribute in states/policies")
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
When setting a policer to a trap group, a value of "0" will unbind the
currently bound policer from the group.
The behavior is intentional and tested in kernel selftests, so document
it.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Suggested-by: Alex Kushnarov <alexanderk@mellanox.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Add the new "mpls" keyword that can be used to match MPLS fields in
arbitrary Label Stack Entries.
LSEs are introduced by the "lse" keyword and followed by LSE options:
"depth", "label", "tc", "bos" and "ttl". The depth is manadtory, the
other options are optionals.
For example, the following filter drops MPLS packets having two labels,
where the first label is 21 and has TTL 64 and the second label is 22:
$ tc filter add dev ethX ingress proto mpls_uc flower mpls \
lse depth 1 label 21 ttl 64 \
lse depth 2 label 22 bos 1 \
action drop
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Bareudp devices provide a generic L3 encapsulation for tunnelling
different protocols like MPLS, IP, NSH, etc. inside a UDP tunnel.
This patch is based on original work from Martin Varghese:
https://lore.kernel.org/netdev/1570532361-15163-1-git-send-email-martinvarghesenokia@gmail.com/
Examples:
- ip link add dev bareudp0 type bareudp dstport 6635 ethertype mpls_uc
This creates a bareudp tunnel device which tunnels L3 traffic with
ethertype 0x8847 (unicast MPLS traffic). The destination port of the
UDP header will be set to 6635. The device will listen on UDP port 6635
to receive traffic.
- ip link add dev bareudp0 type bareudp dstport 6635 ethertype ipv4 multiproto
Same as the MPLS example, but for IPv4. The "multiproto" keyword allows
the device to also tunnel IPv6 traffic.
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Before can be possible show only all qeueue disciplines on an interface.
There wasn't a way to get the qdisc info by handle or parent, only full
dump of the disciplines with a following grep/sed usage.
Now new and old options work as expected to filter a qdisc by handle or
parent.
Full syntax of the qdisc show command:
tc qdisc { show | list } [ dev STRING ] [ QDISC_ID ] [ invisible ]
QDISC_ID := { root | ingress | handle QHANDLE | parent CLASSID }
This change doesn't require any changes in the kernel.
Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Use a single font macro for a single argument.
Remove unnecessary quotes for a single-font macro.
Join two lines into one.
The output of "nroff" and "groff" is unchanged.
Signed-off-by: Bjarni Ingi Gislason <bjarniig@rhi.hi.is>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Use a single-font macro for one argument.
Remove unnecessary quotes for a single font macro.
Join some lines into one.
The output of "nroff" and "groff" is unchanged, except for a font
change in two lines.
Signed-off-by: Bjarni Ingi Gislason <bjarniig@rhi.hi.is>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Use a single-font macro for a single argument
Signed-off-by: Bjarni Ingi Gislason <bjarniig@rhi.hi.is>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Use a single-font macro for a single argument.
Signed-off-by: Bjarni Ingi Gislason <bjarniig@rhi.hi.is>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Typeset section numbers in roman font, see man-pages(7).
###
Details:
Output is from: test-groff -b -mandoc -T utf8 -rF0 -t -w w -z
[ "test-groff" is a developmental version of "groff" ]
<./man/man3/libnetlink.3>:53 (macro BR): only 1 argument, but more are expected
<./man/man3/libnetlink.3>:132 (macro BR): only 1 argument, but more are expected
<./man/man3/libnetlink.3>:134 (macro BR): only 1 argument, but more are expected
<./man/man3/libnetlink.3>:197 (macro BR): only 1 argument, but more are expected
<./man/man3/libnetlink.3>:198 (macro BR): only 1 argument, but more are expected
Signed-off-by: Bjarni Ingi Gislason <bjarniig@rhi.hi.is>
Add 'raw' argument to get the resource in raw format.
When RDMA_NLDEV_ATTR_RES_RAW is set in the netlink message,
then the resource fields are in raw format, print it as byte array.
Example:
$rdma res show qp link rocep0s12f0/1 lqpn 1137 -j -r
[{"ifindex":7,"ifname":"mlx5_1","port":1,
"data":[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...]}]
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
The "early_drop" qevent matches packets that have been early-dropped. The
"mark" qevent matches packets that have been ECN-marked.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
This patch adds support to assign a nexthop group
id to an fdb entry.
$bridge fdb add 02:02:00:00:00:13 dev vx10 nhid 102 self
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
This patch adds support to add and delete
ecmp nexthops of type fdb. Such nexthops can
be linked to vxlan fdb entries.
$ip nexthop add id 12 via 172.16.1.2 fdb
$ip nexthop add id 13 via 172.16.1.3 fdb
$ip nexthop add id 102 group 12/13 fdb
$bridge fdb add 02:02:00:00:00:13 dev vx10 nhid 102 self
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
optimistic DAD is controllable via sysctl for an interface
or all interfaces on the system. This would affect addresses
added by the kernel only.
Recent kernels, however, have enabled support for adding optimistic
address via userspace. This plumbs that support.
Signed-off-by: David Ahern <dsahern@gmail.com>
This patch introduces two new features: obtaining cgroup information and
filtering sockets by cgroups. These features work based on cgroup v2 ID
field in the socket (kernel should be compiled with CONFIG_SOCK_CGROUP_DATA).
Cgroup information can be obtained by specifying --cgroup flag and now contains
only pathname. For faster pathname lookups cgroup cache is implemented. This
cache is filled on ss startup and missed entries are resolved and saved
on the fly.
Cgroup filter extends EXPRESSION and allows to specify cgroup pathname
(relative or absolute) to obtain sockets attached only to this cgroup.
Filter syntax: ss [ cgroup PATHNAME ]
Examples:
ss -a cgroup /sys/fs/cgroup/unified (or ss -a cgroup .)
ss -a cgroup /sys/fs/cgroup/unified/cgroup1 (or ss -a cgroup cgroup1)
v2:
- style fixes (David Ahern)
Signed-off-by: Dmitry Yakunin <zeil@yandex-team.ru>
Signed-off-by: David Ahern <dsahern@gmail.com>
This patch is to add TCA_FLOWER_KEY_ENC_OPTS_ERSPAN's parse and
print to implement erspan options support in m_tunnel_key, like
Commit 56155d4df8 ("tc: f_flower: add geneve option match
support to flower") for geneve options support.
Option is expressed as version:index:dir:hwid, dir and hwid will
be parsed when version is 2, while index will be parsed when
version is 1. erspan doesn't support multiple options.
With this patch, users can add and dump erspan options like:
# ip link add name erspan1 type erspan external
# tc qdisc add dev erspan1 ingress
# tc filter add dev erspan1 protocol ip parent ffff: \
flower \
enc_src_ip 10.0.99.192 \
enc_dst_ip 10.0.99.193 \
enc_key_id 11 \
erspan_opts 1:2:0:0/1:255:0:0 \
ip_proto udp \
action mirred egress redirect dev eth1
# tc -s filter show dev erspan1 parent ffff:
filter protocol ip pref 49152 flower chain 0 handle 0x1
eth_type ipv4
ip_proto udp
enc_dst_ip 10.0.99.193
enc_src_ip 10.0.99.192
enc_key_id 11
erspan_opts 1:2:0:0/1:255:0:0
not_in_hw
action order 1: mirred (Egress Redirect to device eth1) stolen
index 1 ref 1 bind 1
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
v1->v2:
- no change.
v2->v3:
- no change.
v3->v4:
- keep the same format between input and output, json and non json.
- print version, index, dir and hwid as uint.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
This patch is to add TCA_FLOWER_KEY_ENC_OPTS_VXLAN's parse and
print to implement vxlan options support in m_tunnel_key, like
Commit 56155d4df8 ("tc: f_flower: add geneve option match
support to flower") for geneve options support.
Option is expressed a 32bit number for gbp only, and vxlan
doesn't support multiple options.
With this patch, users can add and dump vxlan options like:
# ip link add name vxlan1 type vxlan dstport 0 external
# tc qdisc add dev vxlan1 ingress
# tc filter add dev vxlan1 protocol ip parent ffff: \
flower \
enc_src_ip 10.0.99.192 \
enc_dst_ip 10.0.99.193 \
enc_key_id 11 \
vxlan_opts 65793/4008635966 \
ip_proto udp \
action mirred egress redirect dev eth1
# tc -s filter show dev vxlan1 parent ffff:
filter protocol ip pref 49152 flower chain 0 handle 0x1
eth_type ipv4
ip_proto udp
enc_dst_ip 10.0.99.193
enc_src_ip 10.0.99.192
enc_key_id 11
vxlan_opts 65793/4008635966
not_in_hw
action order 1: mirred (Egress Redirect to device eth1) stolen
index 3 ref 1 bind 1
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
v1->v2:
- get_u32 with base = 0 for gbp.
v2->v3:
- implement proper JSON array for opts.
v3->v4:
- keep the same format between input and output, json and non json.
- print gbp as uint.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
This patch is to add TCA_TUNNEL_KEY_ENC_OPTS_ERSPAN's parse and
print to implement erspan options support in m_tunnel_key, like
Commit 6217917a38 ("tc: m_tunnel_key: Add tunnel option support
to act_tunnel_key") for geneve options support.
Option is expressed as version:index:dir:hwid, dir and hwid will
be parsed when version is 2, while index will be parsed when
version is 1. erspan doesn't support multiple options.
With this patch, users can add and dump erspan options like:
# ip link add name erspan1 type erspan external
# tc qdisc add dev eth0 ingress
# tc filter add dev eth0 protocol ip parent ffff: \
flower indev eth0 \
ip_proto udp \
action tunnel_key \
set src_ip 10.0.99.192 \
dst_ip 10.0.99.193 \
dst_port 6081 \
id 11 \
erspan_opts 1:2:0:0 \
action mirred egress redirect dev erspan1
# tc -s filter show dev eth0 parent ffff:
filter protocol ip pref 49151 flower chain 0 handle 0x1
indev eth0
eth_type ipv4
ip_proto udp
not_in_hw
action order 1: tunnel_key set
src_ip 10.0.99.192
dst_ip 10.0.99.193
key_id 11
dst_port 6081
erspan_opts 1:2:0:0
csum pipe
index 2 ref 1 bind 1
...
v1->v2:
- no change.
v2->v3:
- no change.
v3->v4:
- keep the same format between input and output, json and non json.
- print version, index, dir and hwid as uint.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
This patch is to add TCA_TUNNEL_KEY_ENC_OPTS_VXLAN's parse and
print to implement vxlan options support in m_tunnel_key, like
Commit 6217917a38 ("tc: m_tunnel_key: Add tunnel option support
to act_tunnel_key") for geneve options support.
Option is expressed a 32bit number for gbp only, and vxlan
doesn't support multiple options.
With this patch, users can add and dump vxlan options like:
# ip link add name vxlan1 type vxlan dstport 0 external
# tc qdisc add dev eth0 ingress
# tc filter add dev eth0 protocol ip parent ffff: \
flower indev eth0 \
ip_proto udp \
action tunnel_key \
set src_ip 10.0.99.192 \
dst_ip 10.0.99.193 \
dst_port 6081 \
id 11 \
vxlan_opts 65793 \
action mirred egress redirect dev vxlan1
# tc -s filter show dev eth0 parent ffff:
filter protocol ip pref 49152 flower chain 0 handle 0x1
indev eth0
eth_type ipv4
ip_proto udp
not_in_hw
action order 1: tunnel_key set
src_ip 10.0.99.192
dst_ip 10.0.99.193
key_id 11
dst_port 6081
vxlan_opts 65793
...
v1->v2:
- get_u32 with base = 0 for gbp.
- use to print_unint("0x%x") to print gbp.
v2->v3:
- implement proper JSON array for opts.
v3->v4:
- keep the same format between input and output, json and non json.
- print gbp as uint.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
The Type I ERSPAN frame format is based on the barebones
IP + GRE(4-byte) encapsulation on top of the raw mirrored frame.
Both type I and II use 0x88BE as protocol type. Unlike type II
and III, no sequence number or key is required.
To creat a type I erspan tunnel device:
$ ip link add dev erspan11 type erspan \
local 172.16.1.100 remote 172.16.1.200 \
erspan_ver 0
CC: Dmitriy Andreyevskiy <dandreye@cisco.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
While at it, additionally fix a mandoc warning in mptcp.8
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
This patch adds support for configuring offload mode upon MACsec
device creation.
If offload mode is not specified, then netlink attribute is not
added. Default behavior on the kernel side in this case is
backward-compatible (offloading is disabled by default).
Example:
$ ip link add link eth0 macsec0 type macsec port 11 encrypt on offload mac
Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David Ahern <dsahern@gmail.com>