When link-netns or link-netnsid is supplied, look up the link in that netns.
If both netns and link-netns are given, IFLA_LINK_NETNSID should be
the nsid of link-netns as seen from the target netns, not from the current
one.
For example, when handling:
# ip -n ns1 link add netns ns2 link-netns ns3 link eth1 eth1.100 type vlan id 100
ip should look up eth1 in ns3, and IFLA_LINK_NETNSID is the id of ns3 as
seen from ns2.
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Move set_netnsid_from_name() outside for reuse, like what's done for
netns_id_from_name().
Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
'rdma resource show cq' supports object 'dev' but not 'link', and
doesn't support device name with port.
Fixes: b0b8e32cbf ("rdma: Add CQ resource tracking information")
Signed-off-by: wenglianfa <wenglianfa@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The rtnetlink.sh kernel test started reporting errors after
iproute2 update. The error checking introduced by commit
under fixes is incorrect. rtnl_listen() always returns
an error, because the only way to break the loop is to
return an error from the handler, it seems.
Switch this code to using normal rtnl_talk(), instead of
the rtnl_listen() abuse. As far as I can tell the use of
rtnl_listen() was to make get and dump use common handling
but that's no longer the case, anyway.
Before:
$ ip -6 netconf show dev lo
inet6 lo forwarding off mc_forwarding off proxy_neigh off ignore_routes_with_linkdown off
$ echo $?
2
After:
$ ./ip/ip -6 netconf show dev lo
inet6 lo forwarding off mc_forwarding off proxy_neigh off ignore_routes_with_linkdown off
$ echo $?
0
Fixes: 00e8a64dac ("ip: detect errors in netconf monitor mode")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The mroute family is reset to RTNL_FAMILY_IPMR or RTNL_FAMILY_IP6MR when
retrieving the multicast routing cache. However, the get_prefix() and
subsequently __get_addr_1() cannot identify these families. Using
preferred_family to obtain the prefix can resolve this issue.
Fixes: 98ce99273f ("mroute: fix up family handling")
Reported-by: Jianlin Shi <jishi@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The STP state parsing stored its result in a __u8, which meant
that the check for an invalid string could never be true.
Caught by enabling -Wextra:
CC mst.o
mst.c: In function ‘mst_set’:
mst.c:217:27: warning: comparison is always false due to limited range of data type [-Wtype-limits]
217 | if (state == -1) {
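The pitfall can be reproduced outside of mst.c. The following is a minimal sketch, not the bridge code: parse_state() and both checker functions are hypothetical names, and the parser is only assumed to return -1 for an unknown string.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical parser: index of a known name, or -1 if unknown. */
static int parse_state(const char *name)
{
	static const char *const names[] = {
		"disabled", "listening", "learning", "forwarding", "blocking",
	};

	for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++)
		if (strcmp(name, names[i]) == 0)
			return (int)i;
	return -1;
}

/* Buggy pattern: -1 becomes 255 in the 8-bit variable, and 255
 * promoted back to int never compares equal to -1. */
static bool invalid_detected_u8(const char *name)
{
	uint8_t state = (uint8_t)parse_state(name);

	return state == -1;
}

/* Fixed pattern: keep the result in an int until it has been checked. */
static bool invalid_detected_int(const char *name)
{
	int state = parse_state(name);

	return state == -1;
}
```

With an unknown name, only the int variant reports the error; the 8-bit variant silently accepts the value 255.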
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The command 'ip link set foo netns mynetns' opens a file descriptor to fill
the netlink attribute IFLA_NET_NS_FD. This file descriptor is never closed.
When batch mode is used, the number of open file descriptors may grow until
it reaches the maximum number of file descriptors that can be opened.
This fd can be closed only after the netlink answer. Moreover, a second
fd could be opened because some (struct link_util)->parse_opt() handlers
call iplink_parse().
Let's add a helper to manage these fds:
- open_fds_add() stores a fd, up to 5 (arbitrary choice, it seems enough);
- open_fds_close() closes all stored fds.
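A helper of this kind could look roughly like the sketch below. It is an illustration under assumptions, not the code added by this patch: the fd_stash_* names are hypothetical, and returning -1 when the table is full is a choice made for the example only.

```c
#include <stddef.h>
#include <unistd.h>

#define FD_STASH_MAX 5	/* same arbitrary limit as described above */

static int fd_stash[FD_STASH_MAX];
static size_t fd_stash_cnt;

/* Remember fd for later cleanup; 0 on success, -1 if the table is full. */
static int fd_stash_add(int fd)
{
	if (fd_stash_cnt >= FD_STASH_MAX)
		return -1;
	fd_stash[fd_stash_cnt++] = fd;
	return 0;
}

/* Close every remembered fd, e.g. once the netlink answer has been read. */
static void fd_stash_close(void)
{
	for (size_t i = 0; i < fd_stash_cnt; i++)
		close(fd_stash[i]);
	fd_stash_cnt = 0;
}
```

A caller would add each fd right after opening it and call the close function once per command, so batch mode no longer accumulates descriptors.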
Fixes: 0dc34c7713 ("iproute2: Add processless network namespace support")
Reported-by: Alexandre Ferrieux <alexandre.ferrieux@orange.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This patch fixes the following error:
arpd.c:442:17: error: initialization of 'int' from 'void *' makes integer from pointer without a cast [-Wint-conversion]
442 | NULL, 0,
raised by Buildroot autobuilder [1].
In the case in question, the analysis of socket.h [2] containing the
msghdr structure shows that it has been modified with the addition of
padding fields, which cause the compilation error. The use of designated
initializers allows the issue to be fixed.
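struct msghdr {
void *msg_name;
socklen_t msg_namelen;
struct iovec *msg_iov;
#if __LONG_MAX > 0x7fffffff && __BYTE_ORDER == __BIG_ENDIAN
int __pad1;
#endif
int msg_iovlen;
#if __LONG_MAX > 0x7fffffff && __BYTE_ORDER == __LITTLE_ENDIAN
int __pad1;
#endif
void *msg_control;
#if __LONG_MAX > 0x7fffffff && __BYTE_ORDER == __BIG_ENDIAN
int __pad2;
#endif
socklen_t msg_controllen;
#if __LONG_MAX > 0x7fffffff && __BYTE_ORDER == __LITTLE_ENDIAN
int __pad2;
#endif
int msg_flags;
};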
[1] http://autobuild.buildroot.org/results/e4cdfa38ae9578992f1c0ff5c4edae3cc0836e3c/
[2] iproute2/host/mips64-buildroot-linux-musl/sysroot/usr/include/sys/socket.h
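As an illustration of the technique (not the arpd.c hunk itself), a msghdr can be built with designated initializers so that neither the member order nor any padding field matters. make_msghdr() is a hypothetical helper used only for this sketch.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Build a msghdr without relying on member order or padding fields. */
static struct msghdr make_msghdr(struct sockaddr *addr, socklen_t addrlen,
				 struct iovec *iov, size_t iovcnt)
{
	struct msghdr msg = {
		.msg_name = addr,
		.msg_namelen = addrlen,
		.msg_iov = iov,
		.msg_iovlen = iovcnt,
		/* members not named here are zero-initialized */
	};

	return msg;
}
```

A positional initializer list, by contrast, assigns values in declaration order, so a padding int inserted before msg_iovlen shifts every following value by one member.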
Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This patch fixes the following build errors:
In file included from mst.c:11:
../include/json_print.h:80:30: warning: 'struct timeval' declared inside parameter list will not be visible outside of this definition or declaration
80 | _PRINT_FUNC(tv, const struct timeval *)
| ^~~~~~~
../include/json_print.h:50:37: note: in definition of macro '_PRINT_FUNC'
50 | type value); \
| ^~~~
../include/json_print.h:80:30: warning: 'struct timeval' declared inside parameter list will not be visible outside of this definition or declaration
80 | _PRINT_FUNC(tv, const struct timeval *)
| ^~~~~~~
../include/json_print.h:55:45: note: in definition of macro '_PRINT_FUNC'
55 | type value) \
| ^~~~
../include/json_print.h: In function 'print_tv':
../include/json_print.h:58:48: error: passing argument 5 of 'print_color_tv' from incompatible pointer type [-Wincompatible-pointer-types]
58 | value); \
| ^~~~~
| |
| const struct timeval *
../include/json_print.h:80:1: note: in expansion of macro '_PRINT_FUNC'
80 | _PRINT_FUNC(tv, const struct timeval *)
| ^~~~~~~~~~~
../include/json_print.h:50:42: note: expected 'const struct timeval *' but argument is of type 'const struct timeval *'
50 | type value); \
| ^
../include/json_print.h:80:1: note: in expansion of macro '_PRINT_FUNC'
80 | _PRINT_FUNC(tv, const struct timeval *)
Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This patch fixes a compilation error raised by the bump to version 6.11.0
in Buildroot using musl as the C library for the cross-compilation
toolchain.
After setting the CFLAGS
ifeq ($(BR2_TOOLCHAIN_USES_MUSL),y)
IPROUTE2_CFLAGS += -D__UAPI_DEF_IN6_ADDR=0 -D__UAPI_DEF_SOCKADDR_IN6=0 \
-D__UAPI_DEF_IPV6_MREQ=0
endif
to fix the following errors:
In file included from ../../../host/mips64-buildroot-linux-musl/sysroot/usr/include/arpa/inet.h:9,
from ../include/libnetlink.h:14,
from mst.c:10:
../../../host/mips64-buildroot-linux-musl/sysroot/usr/include/netinet/in.h:23:8: error: redefinition of 'struct in6_addr'
23 | struct in6_addr {
| ^~~~~~~~
In file included from ../include/uapi/linux/if_bridge.h:19,
from mst.c:7:
../include/uapi/linux/in6.h:33:8: note: originally defined here
33 | struct in6_addr {
| ^~~~~~~~
../../../host/mips64-buildroot-linux-musl/sysroot/usr/include/netinet/in.h:34:8: error: redefinition of 'struct sockaddr_in6'
34 | struct sockaddr_in6 {
| ^~~~~~~~~~~~
../include/uapi/linux/in6.h:50:8: note: originally defined here
50 | struct sockaddr_in6 {
| ^~~~~~~~~~~~
../../../host/mips64-buildroot-linux-musl/sysroot/usr/include/netinet/in.h:42:8: error: redefinition of 'struct ipv6_mreq'
42 | struct ipv6_mreq {
| ^~~~~~~~~
../include/uapi/linux/in6.h:60:8: note: originally defined here
60 | struct ipv6_mreq {
I got these further errors:
../include/uapi/linux/in6.h:72:25: error: field 'flr_dst' has incomplete type
72 | struct in6_addr flr_dst;
| ^~~~~~~
../include/uapi/linux/if_bridge.h:711:41: error: field 'ip6' has incomplete type
711 | struct in6_addr ip6;
| ^~~
fixed by including the netinet/in.h header.
Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Avoid use of the term whitelist because it propagates white == good
assumptions. It is not really needed on the man page.
See: https://inclusivenaming.org/word-lists/tier-1/whitelist/
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The term segregate carries a lot of racist baggage in the US.
It is on the Inclusive Naming word list.
See: https://inclusivenaming.org/word-lists/tier-3/segregate/
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Add support for setting/getting the new "tunsrc" feature.
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: David Ahern <dsahern@kernel.org>
Two interlinked changes related to the nexthop group management have been
recently merged in kernel commit e96f6fd30eec ("Merge branch
'net-nexthop-increase-weight-to-u16'").
- One of the reserved bytes in struct nexthop_grp was redefined to carry
high-order bits of the nexthop weight, thus allowing 16-bit nexthop
weights.
- NHA_OP_FLAGS started getting dumped on nexthop group dump to carry a
flag, NHA_OP_FLAG_RESP_GRP_RESVD_0, that indicates that reserved fields
in struct nexthop_grp are zeroed before dumping.
If NHA_OP_FLAG_RESP_GRP_RESVD_0 is given, it is safe to interpret the newly
named nexthop_grp.weight_high as high-order bits of nexthop weight.
Extend ipnexthop to support configuring nexthop weights of up to 65536, and
when dumping, to interpret nexthop_grp.weight_high if safe.
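The split across two bytes can be sketched as follows. This is an illustration under assumptions, not the ipnexthop code: struct demo_nh_grp is a hypothetical stand-in for the two relevant members of struct nexthop_grp, and the wire value is assumed to be the weight minus one, which is what lets 65536 fit into 16 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant part of struct nexthop_grp. */
struct demo_nh_grp {
	uint8_t weight;		/* low-order bits */
	uint8_t weight_high;	/* high-order bits, formerly reserved */
};

/* Encode a user-visible weight in 1..65536; false if out of range. */
static bool demo_weight_encode(uint32_t w, struct demo_nh_grp *g)
{
	if (w < 1 || w > 65536)
		return false;
	w--;			/* assumed wire format: weight - 1 */
	g->weight = w & 0xff;
	g->weight_high = w >> 8;
	return true;
}

/* Decode; weight_high is trusted only if the kernel signalled that
 * reserved fields are zeroed (NHA_OP_FLAG_RESP_GRP_RESVD_0). */
static uint32_t demo_weight_decode(const struct demo_nh_grp *g,
				   bool resvd_zeroed)
{
	uint32_t w = g->weight;

	if (resvd_zeroed)
		w |= (uint32_t)g->weight_high << 8;
	return w + 1;
}
```

Without the flag, the decoder falls back to the 8-bit interpretation, so a dump from an older kernel with garbage in the reserved byte is not misread.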
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
This patch fixes a problem with the libbpf version comparison to decide
if ENABLE_BPF_SKSTORAGE_SUPPORT could be enabled.
- The code enabled by ENABLE_BPF_SKSTORAGE_SUPPORT uses the function
btf_dump__new with an API that was introduced in libbpf 0.6.0. So
check now against libbpf version to be >= 0.6.x instead of 0.5.x.
- This code still depends on the necessity to have LIBBPF_MAJOR_VERSION
and LIBBPF_MINOR_VERSION defined, even if libbpf_version.h is not
present in the library development package. This was ensured with
the previous patch for the configure script.
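The shape of such a version guard is sketched below. The DEMO_* names are hypothetical and the fallback values exist only so the fragment compiles on its own; in a real build the two LIBBPF_* macros come from libbpf_version.h or from CFLAGS.

```c
/* Illustration-only fallbacks; normally provided by libbpf_version.h
 * or by the build system. */
#ifndef LIBBPF_MAJOR_VERSION
#define LIBBPF_MAJOR_VERSION 0
#endif
#ifndef LIBBPF_MINOR_VERSION
#define LIBBPF_MINOR_VERSION 6
#endif

/* Enable the feature for libbpf >= 0.6, including any 1.x release. */
#if (LIBBPF_MAJOR_VERSION > 0) || (LIBBPF_MINOR_VERSION >= 6)
#define DEMO_HAVE_SKSTORAGE 1
#else
#define DEMO_HAVE_SKSTORAGE 0
#endif

static int demo_skstorage_supported(void)
{
	return DEMO_HAVE_SKSTORAGE;
}
```

Checking the major version first matters: a test on the minor version alone would wrongly reject a release such as 1.0.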
Fixes: e3ecf048 ("ss: pretty-print BPF socket-local storage")
Signed-off-by: Stefan Mätje <stefan.maetje@esd.eu>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Old libbpf library versions (< 0.7.x) may not have the libbpf_version.h
header packaged. This header would provide LIBBPF_MAJOR_VERSION and
LIBBPF_MINOR_VERSION which are then missing to control conditional
compilation in some source files.
Provide surrogates for these defines via CFLAGS that are derived from
the LIBBPF_VERSION determined with $(${PKG_CONFIG} libbpf --modversion).
Signed-off-by: Stefan Mätje <stefan.maetje@esd.eu>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reformat tc-cake to use man format (nroff) instead of pre-formatting.
Signed-off-by: Lương Việt Hoàng <tcm4095@gmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Linux kernel commit 7298de9cd7255a783ba ("sch_cake: Add ingress mode") added
an ingress mode for CAKE, which can be enabled with the 'ingress' parameter.
Document the changes in CAKE's behavior when ingress mode is enabled.
Signed-off-by: Lương Việt Hoàng <tcm4095@gmail.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Instead of a pre-formatted bullet list, use the man macros.
Make sure the same sentence format is used for all options.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The man page had a dangling quote character in the usage text
which can confuse auto-color/format code like Emacs and Vim.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Also print the peer policy. Example:
$ ip -d l sh dev netkit0
...
netkit mode l2 type primary policy blackhole peer policy forward
...
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
include/utils.h already provides textify(), which is functionally
equivalent to __stringify().
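For readers unfamiliar with the idiom, both macros are assumed here to be the usual two-level stringification pattern. The sketch below uses hypothetical DEMO_* names rather than the definitions from include/utils.h.

```c
#define DEMO_STR_1(x)	#x
#define DEMO_STR(x)	DEMO_STR_1(x)	/* expands x first, then stringifies */

#define DEMO_LIMIT	16

/* Two levels: the argument is macro-expanded before being quoted. */
static const char *demo_limit_text(void)
{
	return DEMO_STR(DEMO_LIMIT);
}

/* One level: the argument is quoted as written. */
static const char *demo_limit_name(void)
{
	return DEMO_STR_1(DEMO_LIMIT);
}
```

The extra level of indirection is what turns a macro's value, rather than its name, into a string literal.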
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Extend TC flower for matching on tunnel metadata.
Changes since v2:
- split uAPI changes and TC code in separate patches, as per David's request [2]
Changes since v1:
- fix inconsistent naming in explain() and in tc-flower.8 (Asbjørn)
Changes since RFC:
- update uAPI bits to Asbjørn's most recent code [1]
- add 'tun' prefix to all flag names (Asbjørn)
- allow parsing 'enc_flags' multiple times, without clearing the match
mask every time, like happens for 'ip_flags' (Asbjørn)
- don't use "matches()" for parsing argv[] (Stephen)
- (hopefully) improve usage() printout (Asbjørn)
- update man page
[1] https://lore.kernel.org/netdev/20240709163825.1210046-1-ast@fiberby.net/
[2] https://lore.kernel.org/netdev/cc73004c-9aa8-9cd3-b46e-443c0727c34d@kernel.org/
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David Ahern <dsahern@kernel.org>
Adding an endpoint with 'id 0' is not allowed. In this case, the kernel
will ignore this 'id 0' and set another one.
Similarly, because there are no endpoints with this 'id 0', changing an
attribute for such endpoint will not be possible.
To avoid confusion, it is better to clearly report an error stating
that the ID cannot be 0 in these cases.
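A check along these lines might look like the sketch below. It is not the ip mptcp parser: demo_parse_endpoint_id() is a hypothetical function, and the ID is taken to be an 8-bit value, so the accepted range is 1..255.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse an endpoint ID; accepts 1..255. Returns 0 on success, -1 otherwise. */
static int demo_parse_endpoint_id(const char *arg, uint8_t *id)
{
	char *end;
	unsigned long v;

	errno = 0;
	v = strtoul(arg, &end, 0);
	if (errno || end == arg || *end != '\0')
		return -1;		/* not a number */
	if (v == 0 || v > UINT8_MAX)
		return -1;		/* 0 is not allowed, 255 is the maximum */
	*id = (uint8_t)v;
	return 0;
}
```

Rejecting 0 in user space gives an immediate error instead of letting the kernel silently pick a different ID.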
Acked-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Explain the range (u8), and the special case for ID 0.
The endpoints here are for all the connections, while the ID 0 is a
special case per connection, depending on the source address used by the
initial subflow. This ID 0 can then not be used for the global
endpoints.
Acked-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
'fullmesh' affects subflow creation, so it has to be used with the
'subflow' flag. That's what is enforced on the kernel side.
Acked-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
That's the behaviour with the default packet scheduler.
In some early design, the default scheduler was supposed to take into
account only the received backup flags, but it ended up not being the
case, and setting the flag would also affect outgoing data.
Suggested-by: Mat Martineau <martineau@kernel.org>
Acked-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
That's what is enforced by the kernel: the 'port' is used to create a
new listening socket on that port, not to create a new subflow from/to
that port. It then requires the 'signal' flag.
Acked-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
According to some bug reports on the MPTCP project, these options can
be confusing.
Mentioning that the 'signal' flag is typically for a server, and the
'subflow' one is typically for a client, should help users know which
flag to pick in which context.
Acked-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
It was missing, while it is a very important option.
Indeed, without it, the kernel might not pick the right interface to
send packets for additional subflows. Mention that in the man page.
Acked-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The ip address man page had some small things that needed update:
- ip address delete without address returns not supported
- always use full words for commands in man pages
(i.e. "delete" not "del")
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
When running "ip monitor", accept_msg() first prints the prefix and
then calls the object-specific print function, which also does the
filtering. Therefore, it is possible that the prefix is printed even
for events that get ignored later. For example:
ip link add dummy1 type dummy
ip link set dummy1 up
ip -ts monitor all dev dummy1 &
ip link add dummy2 type dummy
ip addr add dev dummy1 192.0.2.1/24
generates:
[2024-07-12T22:11:26.338342] [LINK][2024-07-12T22:11:26.339846] [ADDR]314: dummy1 inet 192.0.2.1/24 scope global dummy1
valid_lft forever preferred_lft forever
Fix this by printing the prefix only after the filtering. Now the
output for the commands above is:
[2024-07-12T22:11:26.339846] [ADDR]314: dummy1 inet 192.0.2.1/24 scope global dummy1
valid_lft forever preferred_lft forever
See also commit 7e0a889b54 ("bridge: Do not print stray prefixes in
monitor mode") which fixed the same problem in the bridge tool.
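The ordering can be illustrated with a small sketch. It does not use the ip monitor code: struct demo_event and demo_format_event() are hypothetical, and the filter is reduced to a device-name comparison.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical event record; not an iproute2 structure. */
struct demo_event {
	const char *label;	/* e.g. "LINK" or "ADDR" */
	const char *dev;	/* device the event refers to */
	const char *text;	/* already formatted message body */
};

/*
 * Format one event into out. The filter runs first, so a rejected event
 * produces no output at all, not even the "[LABEL]" prefix.
 * Returns true if something was written.
 */
static bool demo_format_event(char *out, size_t len,
			      const struct demo_event *ev,
			      const char *filter_dev)
{
	if (filter_dev && strcmp(ev->dev, filter_dev) != 0) {
		if (len)
			out[0] = '\0';
		return false;
	}
	snprintf(out, len, "[%s]%s", ev->label, ev->text);
	return true;
}
```

Printing the prefix before the filter decision, as the old code did, is what left a stray "[LINK]" in the output for the ignored dummy2 event.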
Signed-off-by: Beniamino Galvani <b.galvani@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
The expression 'ttl & ~(255 >> 0)' is always zero, because the right
operand has its 8 low-order bits cleared, and the left operand is only
8 bits wide.
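The claim can be checked exhaustively. The sketch below is standalone and uses hypothetical function names; it assumes ttl is an unsigned 8-bit value, as stated above.

```c
#include <stdbool.h>
#include <stdint.h>

/* The quoted check: can it ever fire for an 8-bit ttl? */
static bool ttl_check_fires(uint8_t ttl)
{
	return (ttl & ~(255 >> 0)) != 0;
}

/* Try all 256 possible values of an 8-bit ttl. */
static bool ttl_check_ever_fires(void)
{
	for (unsigned int v = 0; v <= UINT8_MAX; v++)
		if (ttl_check_fires((uint8_t)v))
			return true;
	return false;
}
```

Since no value makes the check fire, removing it does not change behavior.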
Found by RASU JSC.
Signed-off-by: Maks Mishin <maks.mishinFZ@gmail.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>