linux/net
Florian Westphal 80cd0487f6 netfilter: bridge: confirm multicast packets before passing them up the stack
[ Upstream commit 62e7151ae3 ]

conntrack nf_confirm logic cannot handle cloned skbs referencing
the same nf_conn entry, which will happen for multicast (broadcast)
frames on bridges.

 Example:
    macvlan0
       |
      br0
     /  \
  ethX    ethY

 ethX (or Y) receives a L2 multicast or broadcast packet containing
 an IP packet, flow is not yet in conntrack table.

 1. skb passes through bridge and fake-ip (br_netfilter)Prerouting.
    -> skb->_nfct now references a unconfirmed entry
 2. skb is broad/mcast packet. bridge now passes clones out on each bridge
    interface.
 3. skb gets passed up the stack.
 4. In macvlan case, macvlan driver retains clone(s) of the mcast skb
    and schedules a work queue to send them out on the lower devices.

    The clone skb->_nfct is not a copy, it is the same entry as the
    original skb.  The macvlan rx handler then returns RX_HANDLER_PASS.
 5. Normal conntrack hooks (in NF_INET_LOCAL_IN) confirm the orig skb.

The Macvlan broadcast worker and normal confirm path will race.

This race will not happen if step 2 already confirmed a clone. In that
case later steps perform skb_clone() with skb->_nfct already confirmed (in
hash table).  This works fine.

But such confirmation won't happen when eb/ip/nftables rules dropped the
packets before they reached the nf_confirm step in postrouting.

Pablo points out that nf_conntrack_bridge doesn't allow use of stateful
nat, so we can safely discard the nf_conn entry and let inet call
conntrack again.

This doesn't work for bridge netfilter: skb could have a nat
transformation. Also bridge nf prevents re-invocation of inet prerouting
via 'sabotage_in' hook.

Work around this problem by explicit confirmation of the entry at LOCAL_IN
time, before upper layer has a chance to clone the unconfirmed entry.

The downside is that this disables NAT and conntrack helpers.

Alternative fix would be to add locking to all code parts that deal with
unconfirmed packets, but even if that could be done in a sane way this
opens up other problems, for example:

-m physdev --physdev-out eth0 -j SNAT --snat-to 1.2.3.4
-m physdev --physdev-out eth1 -j SNAT --snat-to 1.2.3.5

For multicast case, only one of such conflicting mappings will be
created, conntrack only handles 1:1 NAT mappings.

Users should set create a setup that explicitly marks such traffic
NOTRACK (conntrack bypass) to avoid this, but we cannot auto-bypass
them, ruleset might have accept rules for untracked traffic already,
so user-visible behaviour would change.

Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217777
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-03-06 14:48:36 +00:00
..
6lowpan
9p net: 9p: avoid freeing uninit memory in p9pdu_vreadf 2024-01-01 12:42:41 +00:00
802
8021q vlan: skip nested type that is not IFLA_VLAN_QOS_MAPPING 2024-01-31 16:19:01 -08:00
appletalk appletalk: Fix Use-After-Free in atalk_ioctl 2023-12-20 17:01:50 +01:00
atm atm: Fix Use-After-Free in do_vcc_ioctl 2023-12-20 17:01:48 +01:00
ax25 ax25: Kconfig: Update link for linux-ax25.org 2023-09-18 12:56:58 +01:00
batman-adv Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-08-24 10:51:39 -07:00
bluetooth Bluetooth: Enforce validation on max value of connection interval 2024-03-06 14:48:36 +00:00
bpf bpf: Fix a few selftest failures due to llvm18 change 2024-02-05 20:14:20 +00:00
bpfilter
bridge netfilter: bridge: confirm multicast packets before passing them up the stack 2024-03-06 14:48:36 +00:00
caif
can can: j1939: Fix UAF in j1939_sk_match_filter during setsockopt(SO_J1939_FILTER) 2024-02-23 09:25:17 +01:00
ceph libceph: fail sparse-read if the data length doesn't match 2024-03-01 13:34:56 +01:00
core bpf, sockmap: Fix NULL pointer dereference in sk_psock_verdict_data_ready() 2024-03-01 13:35:08 +01:00
dcb net: dcb: choose correct policy to parse DCB_ATTR_BCN 2023-08-01 21:07:46 -07:00
dccp dccp/tcp: Call security_inet_conn_request() after setting IPv6 addresses. 2023-11-20 11:59:35 +01:00
devlink devlink: fix port dump cmd type 2024-03-01 13:35:09 +01:00
dns_resolver keys, dns: Fix size check of V1 server-list header 2024-01-25 15:35:41 -08:00
dsa net: dsa: mark parsed interface mode for legacy switch drivers 2023-08-09 13:08:09 -07:00
ethernet
ethtool ethtool: netlink: Add missing ethnl_ops_begin/complete 2024-01-25 15:36:00 -08:00
handshake net/handshake: Fix handshake_req_destroy_test1 2024-02-23 09:24:50 +01:00
hsr net: hsr: remove WARN_ONCE() in send_hsr_supervision_frame() 2024-02-23 09:25:03 +01:00
ieee802154 sysctl-6.6-rc1 2023-08-29 17:39:15 -07:00
ife net: sched: ife: fix potential use-after-free 2024-01-01 12:42:30 +00:00
ipv4 net: ip_tunnel: prevent perpetual headroom growth 2024-03-06 14:48:34 +00:00
ipv6 ipv6: fix potential "struct net" leak in inet6_rtm_getaddr() 2024-03-06 14:48:35 +00:00
iucv
kcm net: kcm: fix direct access to bv_len 2024-02-05 20:14:25 +00:00
key Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2023-08-18 12:44:56 -07:00
l2tp l2tp: pass correct message length to ip6_append_data 2024-03-01 13:35:01 +01:00
l3mdev
lapb
llc llc: call sock_orphan() at release time 2024-02-05 20:14:36 +00:00
mac80211 wifi: mac80211: accept broadcast probe responses on 6 GHz 2024-03-01 13:34:55 +01:00
mac802154
mctp net: mctp: take ownership of skb in mctp_local_output 2024-03-06 14:48:34 +00:00
mpls networking: Update to register_net_sysctl_sz 2023-08-15 15:26:18 -07:00
mptcp mptcp: add needs_id for netlink appending addr 2024-03-01 13:35:11 +01:00
ncsi net/ncsi: Fix netlink major/minor version numbers 2024-01-25 15:35:20 -08:00
netfilter netfilter: bridge: confirm multicast packets before passing them up the stack 2024-03-06 14:48:36 +00:00
netlabel calipso: fix memory leak in netlbl_calipso_add_pass() 2024-01-25 15:35:14 -08:00
netlink netlink: Fix kernel-infoleak-after-free in __skb_datagram_iter 2024-03-06 14:48:34 +00:00
netrom netrom: Deny concurrent connect(). 2023-08-28 06:58:46 +01:00
nfc nfc: nci: free rx_data_reassembly skb on NCI device cleanup 2024-02-23 09:25:02 +01:00
nsh
openvswitch net: openvswitch: limit the number of recursions from action sets 2024-02-23 09:24:51 +01:00
packet packet: Move reference count in packet_sock to atomic_long_t 2023-12-13 18:45:23 +01:00
phonet phonet/pep: fix racy skb_queue_empty() use 2024-03-01 13:35:10 +01:00
psample psample: Require 'CAP_NET_ADMIN' when joining "packets" group 2023-12-13 18:45:10 +01:00
qrtr net: qrtr: ns: Return 0 if server port is not present 2024-01-20 11:51:47 +01:00
rds net/rds: Fix UBSAN: array-index-out-of-bounds in rds_cmsg_recv 2024-01-31 16:19:01 -08:00
rfkill net: rfkill: gpio: set GPIO direction 2024-01-01 12:42:41 +00:00
rose net/rose: fix races in rose_kill_by_device() 2024-01-01 12:42:31 +00:00
rxrpc rxrpc: Fix counting of new acks and nacks 2024-02-16 19:10:50 +01:00
sched net/sched: flower: Add lock protection when remove filter handle 2024-03-01 13:35:09 +01:00
sctp sctp: fix busy polling 2024-01-25 15:35:30 -08:00
smc net/smc: disable SEID on non-s390 archs where virtual ISM may be used 2024-02-05 20:14:25 +00:00
strparser
sunrpc SUNRPC: Fix a suspicious RCU usage warning 2024-02-05 20:14:17 +00:00
switchdev net: bridge: switchdev: Skip MDB replays of deferred events on offload 2024-03-01 13:35:06 +01:00
tipc tipc: Check the bearer type before calling tipc_udp_nl_bearer_add() 2024-02-16 19:10:50 +01:00
tls tls: don't skip over different type records from the rx_list 2024-03-01 13:35:09 +01:00
unix af_unix: Call kfree_skb() for dead unix_(sk)->oob_skb in GC. 2024-02-16 19:10:50 +01:00
vmw_vsock virtio/vsock: send credit update during setting SO_RCVLOWAT 2024-01-25 15:35:26 -08:00
wireless wifi: cfg80211: fix missing interfaces when dumping 2024-03-01 13:34:48 +01:00
x25
xdp xsk: Add truesize to skb_add_rx_frag(). 2024-03-01 13:35:05 +01:00
xfrm ipsec-2023-10-17 2023-10-17 18:21:13 -07:00
compat.c
devres.c
Kconfig bpf: Add fd-based tcx multi-prog infra with link support 2023-07-19 10:07:27 -07:00
Kconfig.debug
Makefile
socket.c net: Save and restore msg_namelen in sock_sendmsg 2024-01-10 17:16:51 +01:00
sysctl_net.c sysctl: Add size to register_net_sysctl function 2023-08-15 15:26:17 -07:00