linux/net/ipv4
David Ahern 40867d74c3 net: Add l3mdev index to flow struct and avoid oif reset for port devices
The fundamental premise of VRF and l3mdev core code is binding a socket
to a device (l3mdev or netdev with an L3 domain) to indicate L3 scope.
Legacy code resets flowi_oif to the l3mdev losing any original port
device binding. Ben (among others) has demonstrated use cases where the
original port device binding is important and needs to be retained.
This patch handles that by adding a new entry to the common flow struct
that can indicate the l3mdev index for later rule and table matching
avoiding the need to reset flowi_oif.

In addition to allowing more use cases that require port device binds,
this patch brings a few datapath simplications:

1. l3mdev_fib_rule_match is only called when walking fib rules and
   always after l3mdev_update_flow. That allows an optimization to bail
   early for non-VRF type uses cases when flowi_l3mdev is not set. Also,
   only that index needs to be checked for the FIB table id.

2. l3mdev_update_flow can be called with flowi_oif set to a l3mdev
   (e.g., VRF) device. By resetting flowi_oif only for this case the
   FLOWI_FLAG_SKIP_NH_OIF flag is not longer needed and can be removed,
   removing several checks in the datapath. The flowi_iif path can be
   simplified to only be called if the it is not loopback (loopback can
   not be assigned to an L3 domain) and the l3mdev index is not already
   set.

3. Avoid another device lookup in the output path when the fib lookup
   returns a reject failure.

Note: 2 functional tests for local traffic with reject fib rules are
updated to reflect the new direct failure at FIB lookup time for ping
rather than the failure on packet path. The current code fails like this:

    HINT: Fails since address on vrf device is out of device scope
    COMMAND: ip netns exec ns-A ping -c1 -w1 -I eth1 172.16.3.1
    ping: Warning: source address might be selected on device other than: eth1
    PING 172.16.3.1 (172.16.3.1) from 172.16.3.1 eth1: 56(84) bytes of data.

    --- 172.16.3.1 ping statistics ---
    1 packets transmitted, 0 received, 100% packet loss, time 0ms

where the test now directly fails:

    HINT: Fails since address on vrf device is out of device scope
    COMMAND: ip netns exec ns-A ping -c1 -w1 -I eth1 172.16.3.1
    ping: connect: No route to host

Signed-off-by: David Ahern <dsahern@kernel.org>
Tested-by: Ben Greear <greearb@candelatech.com>
Link: https://lore.kernel.org/r/20220314204551.16369-1-dsahern@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-15 20:20:02 -07:00
..
bpfilter net: Revert "net: optimize the sockptr_t for unified kernel/user address spaces" 2020-08-10 12:06:44 -07:00
netfilter netfilter: conntrack: pptp: use single option structure 2022-02-04 06:30:28 +01:00
af_inet.c gso: do not skip outer ip header in case of ipip and net_failover 2022-02-21 11:41:30 +00:00
ah4.c Networking changes for 5.14. 2021-06-30 15:51:09 -07:00
arp.c net: neigh: add skb drop reasons to arp_error_report() 2022-02-26 12:53:59 +00:00
bpf_tcp_ca.c bpf: reject program if a __user tagged memory accessed in kernel way 2022-01-27 12:03:46 -08:00
cipso_ipv4.c NET: IPV4: fix error "do not initialise globals to 0" 2021-09-19 12:43:56 +01:00
datagram.c net/ipv4/datagram.c: remove superfluous header files from datagram.c 2021-09-29 11:39:33 +01:00
devinet.c net: Add new protocol attribute to IP addresses 2022-02-18 21:20:06 -08:00
esp4_offload.c net: Fix esp GSO on inter address family tunnels. 2022-03-07 13:14:04 +01:00
esp4.c esp: Fix possible buffer overflow in ESP transformation 2022-03-07 13:14:03 +01:00
fib_frontend.c net: Add l3mdev index to flow struct and avoid oif reset for port devices 2022-03-15 20:20:02 -07:00
fib_lookup.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-02-17 11:44:20 -08:00
fib_notifier.c net: ipv4: remove superfluous header files from fib_notifier.c 2021-09-28 17:32:56 -07:00
fib_rules.c ipv4: Reject again rules with high DSCP values 2022-02-10 15:33:33 +00:00
fib_semantics.c net: Add l3mdev index to flow struct and avoid oif reset for port devices 2022-03-15 20:20:02 -07:00
fib_trie.c net: Add l3mdev index to flow struct and avoid oif reset for port devices 2022-03-15 20:20:02 -07:00
fou.c gro: remove rcu_read_lock/rcu_read_unlock from gro_complete handlers 2021-11-24 17:21:42 -08:00
gre_demux.c net: Remove the member netns_ok 2021-05-17 15:29:35 -07:00
gre_offload.c gro: remove rcu_read_lock/rcu_read_unlock from gro_complete handlers 2021-11-24 17:21:42 -08:00
icmp.c ipv4: do not use per netns icmp sockets 2022-01-25 11:25:21 +00:00
igmp.c ipv4: drop unused assignment 2021-11-14 12:20:44 +00:00
inet_connection_sock.c tcp: Use BPF timeout setting for SYN ACK RTO 2022-02-02 14:45:18 +00:00
inet_diag.c inet_diag: fix kernel-infoleak for UDP sockets 2021-12-10 21:14:49 -08:00
inet_fragment.c net: ip: Handle delivery_time in ip defrag 2022-03-03 14:38:48 +00:00
inet_hashtables.c tcp: Don't acquire inet_listen_hashbucket::lock with disabled BH. 2022-02-09 21:28:36 -08:00
inet_timewait_sock.c tcp: allocate tcp_death_row outside of struct netns_ipv4 2022-01-26 19:00:31 -08:00
inetpeer.c inetpeer: use div64_ul() and clamp_val() calculate inet_peer_threshold 2021-03-01 13:32:12 -08:00
ip_forward.c net: Add skb_clear_tstamp() to keep the mono delivery_time 2022-03-03 14:38:48 +00:00
ip_fragment.c net: ip: Handle delivery_time in ip defrag 2022-03-03 14:38:48 +00:00
ip_gre.c gre: Don't accidentally set RTO_ONLINK in gre_fill_metadata_dst() 2022-01-11 20:36:08 -08:00
ip_input.c net: Postpone skb_clear_delivery_time() until knowing the skb is delivered locally 2022-03-03 14:38:48 +00:00
ip_options.c ipv4: drop fragmentation code from ip_options_build() 2022-01-29 17:53:07 +00:00
ip_output.c net: Set skb->mono_delivery_time and clear it after sch_handle_ingress() 2022-03-03 14:38:48 +00:00
ip_sockglue.c ipv4: Exposing __ip_sock_set_tos() in ip.h 2021-11-20 14:11:00 +00:00
ip_tunnel_core.c net: ip_tunnel: clean up endianness conversions 2021-01-08 19:25:35 -08:00
ip_tunnel.c ip: use dev_addr_set() in tunnels 2021-10-13 09:41:37 -07:00
ip_vti.c ip: use dev_addr_set() in tunnels 2021-10-13 09:41:37 -07:00
ipcomp.c Networking changes for 5.14. 2021-06-30 15:51:09 -07:00
ipconfig.c net: ipconfig: Release the rtnl_lock while waiting for carrier 2021-10-28 14:36:41 +01:00
ipip.c ip: use dev_addr_set() in tunnels 2021-10-13 09:41:37 -07:00
ipmr_base.c
ipmr.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-02-10 17:29:56 -08:00
Kconfig net: ipv4: remove duplicate "the the" phrase in Kconfig text 2020-08-18 16:02:16 -07:00
Makefile bpf: Clean up sockmap related Kconfigs 2021-02-26 12:28:03 -08:00
metrics.c treewide: rename nla_strlcpy to nla_strscpy. 2020-11-16 08:08:54 -08:00
netfilter.c netfilter: Dissect flow after packet mangling 2021-04-18 22:04:16 +02:00
netlink.c
nexthop.c nexthop: change nexthop_net_exit() to nexthop_net_exit_batch() 2022-02-08 20:41:33 -08:00
ping.c ping: remove pr_err from ping_lookup 2022-02-24 09:18:29 -08:00
proc.c tcp: allocate tcp_death_row outside of struct netns_ipv4 2022-01-26 19:00:31 -08:00
protocol.c net: Remove the member netns_ok 2021-05-17 15:29:35 -07:00
raw_diag.c net: Use nlmsg_unicast() instead of netlink_unicast() 2021-07-13 09:28:29 -07:00
raw.c Networking fixes for 5.17-rc2, including fixes from netfilter and can. 2022-01-27 20:58:39 +02:00
route.c net: Add l3mdev index to flow struct and avoid oif reset for port devices 2022-03-15 20:20:02 -07:00
syncookies.c net: align static siphash keys 2021-11-16 19:07:54 -08:00
sysctl_net_ipv4.c tcp: adjust TSO packet sizes based on min_rtt 2022-03-09 20:05:44 -08:00
tcp_bbr.c bpf: Remove check_kfunc_call callback and old kfunc BTF ID API 2022-01-18 14:26:41 -08:00
tcp_bic.c
tcp_bpf.c bpf, sockmap: Fix return codes from tcp_bpf_recvmsg_parser() 2022-01-05 20:43:08 +01:00
tcp_cdg.c
tcp_cong.c tcp: unexport tcp_ca_get_key_by_name and tcp_ca_get_name_by_key 2022-03-11 22:51:40 -08:00
tcp_cubic.c bpf: Remove check_kfunc_call callback and old kfunc BTF ID API 2022-01-18 14:26:41 -08:00
tcp_dctcp.c bpf: Remove check_kfunc_call callback and old kfunc BTF ID API 2022-01-18 14:26:41 -08:00
tcp_dctcp.h
tcp_diag.c
tcp_fastopen.c net/ipv4/tcp_fastopen.c: remove superfluous header files from tcp_fastopen.c 2021-09-20 13:09:06 +01:00
tcp_highspeed.c Replace HTTP links with HTTPS ones: IPv* 2020-07-06 13:23:03 -07:00
tcp_htcp.c Replace HTTP links with HTTPS ones: IPv* 2020-07-06 13:23:03 -07:00
tcp_hybla.c
tcp_illinois.c
tcp_input.c net: tcp: use tcp_drop_reason() for tcp_data_queue_ofo() 2022-02-20 13:55:31 +00:00
tcp_ipv4.c tcp: adjust TSO packet sizes based on min_rtt 2022-03-09 20:05:44 -08:00
tcp_lp.c ipv4: tcp_lp.c: Couple of typo fixes 2021-03-28 17:31:13 -07:00
tcp_metrics.c fixes-v5.11 2020-12-14 16:40:27 -08:00
tcp_minisocks.c tcp: Use BPF timeout setting for SYN ACK RTO 2022-02-02 14:45:18 +00:00
tcp_nv.c net/ipv4/tcp_nv.c: remove superfluous header files from tcp_nv.c 2021-09-27 12:47:39 +01:00
tcp_offload.c net: move gro definitions to include/net/gro.h 2021-11-16 13:16:54 +00:00
tcp_output.c tcp: adjust TSO packet sizes based on min_rtt 2022-03-09 20:05:44 -08:00
tcp_rate.c tcp: tracking packets with CE marks in BW rate sample 2021-09-24 14:16:40 +01:00
tcp_recovery.c tcp: more accurately check DSACKs to grow RACK reordering window 2021-07-27 20:07:21 +01:00
tcp_scalable.c net: ipv4: delete repeated words 2020-08-24 17:31:20 -07:00
tcp_timer.c net: sock: introduce sk_error_report 2021-06-29 11:28:21 -07:00
tcp_ulp.c
tcp_vegas.c tcp: use semicolons rather than commas to separate statements 2020-10-13 17:11:52 -07:00
tcp_vegas.h
tcp_veno.c Replace HTTP links with HTTPS ones: IPv* 2020-07-06 13:23:03 -07:00
tcp_westwood.c
tcp_yeah.c tcp_yeah: check struct yeah size at compile time 2021-06-29 11:54:36 -07:00
tcp.c tcp: autocork: take MSG_EOR hint into consideration 2022-03-09 20:05:20 -08:00
tunnel4.c net: Remove the member netns_ok 2021-05-17 15:29:35 -07:00
udp_bpf.c net: Implement ->sock_is_readable() for UDP and AF_UNIX 2021-10-26 12:29:33 -07:00
udp_diag.c net: Use nlmsg_unicast() instead of netlink_unicast() 2021-07-13 09:28:29 -07:00
udp_impl.h net: pass a sockptr_t into ->setsockopt 2020-07-24 15:41:54 -07:00
udp_offload.c gro: remove rcu_read_lock/rcu_read_unlock from gro_complete handlers 2021-11-24 17:21:42 -08:00
udp_tunnel_core.c net/ipv4/udp_tunnel_core.c: remove superfluous header files from udp_tunnel_core.c 2021-09-21 10:17:20 +01:00
udp_tunnel_nic.c udp_tunnel: Fix end of loop test in udp_tunnel_nic_unregister() 2022-02-23 12:35:00 +00:00
udp_tunnel_stub.c udp_tunnel: add central NIC RX port offload infrastructure 2020-07-10 13:54:00 -07:00
udp.c net: udp: use kfree_skb_reason() in __udp_queue_rcv_skb() 2022-02-07 11:18:49 +00:00
udplite.c net: Remove the member netns_ok 2021-05-17 15:29:35 -07:00
xfrm4_input.c
xfrm4_output.c
xfrm4_policy.c net: Add l3mdev index to flow struct and avoid oif reset for port devices 2022-03-15 20:20:02 -07:00
xfrm4_protocol.c net: Remove the member netns_ok 2021-05-17 15:29:35 -07:00
xfrm4_state.c
xfrm4_tunnel.c net/ipv4/xfrm4_tunnel.c: remove superfluous header files from xfrm4_tunnel.c 2021-09-23 10:10:00 +02:00