linux/net
Eric Dumazet 92e6e36ecd inet: fully convert sk->sk_rx_dst to RCU rules
commit 8f905c0e73 upstream.

syzbot reported various issues around early demux,
one being included in this changelog [1]

sk->sk_rx_dst is using RCU protection without clearly
documenting it.

And following sequences in tcp_v4_do_rcv()/tcp_v6_do_rcv()
are not following standard RCU rules.

[a]    dst_release(dst);
[b]    sk->sk_rx_dst = NULL;

They look wrong because a delete operation of RCU protected
pointer is supposed to clear the pointer before
the call_rcu()/synchronize_rcu() guarding actual memory freeing.

In some cases indeed, dst could be freed before [b] is done.

We could cheat by clearing sk_rx_dst before calling
dst_release(), but this seems the right time to stick
to standard RCU annotations and debugging facilities.

[1]
BUG: KASAN: use-after-free in dst_check include/net/dst.h:470 [inline]
BUG: KASAN: use-after-free in tcp_v4_early_demux+0x95b/0x960 net/ipv4/tcp_ipv4.c:1792
Read of size 2 at addr ffff88807f1cb73a by task syz-executor.5/9204

CPU: 0 PID: 9204 Comm: syz-executor.5 Not tainted 5.16.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 print_address_description.constprop.0.cold+0x8d/0x320 mm/kasan/report.c:247
 __kasan_report mm/kasan/report.c:433 [inline]
 kasan_report.cold+0x83/0xdf mm/kasan/report.c:450
 dst_check include/net/dst.h:470 [inline]
 tcp_v4_early_demux+0x95b/0x960 net/ipv4/tcp_ipv4.c:1792
 ip_rcv_finish_core.constprop.0+0x15de/0x1e80 net/ipv4/ip_input.c:340
 ip_list_rcv_finish.constprop.0+0x1b2/0x6e0 net/ipv4/ip_input.c:583
 ip_sublist_rcv net/ipv4/ip_input.c:609 [inline]
 ip_list_rcv+0x34e/0x490 net/ipv4/ip_input.c:644
 __netif_receive_skb_list_ptype net/core/dev.c:5508 [inline]
 __netif_receive_skb_list_core+0x549/0x8e0 net/core/dev.c:5556
 __netif_receive_skb_list net/core/dev.c:5608 [inline]
 netif_receive_skb_list_internal+0x75e/0xd80 net/core/dev.c:5699
 gro_normal_list net/core/dev.c:5853 [inline]
 gro_normal_list net/core/dev.c:5849 [inline]
 napi_complete_done+0x1f1/0x880 net/core/dev.c:6590
 virtqueue_napi_complete drivers/net/virtio_net.c:339 [inline]
 virtnet_poll+0xca2/0x11b0 drivers/net/virtio_net.c:1557
 __napi_poll+0xaf/0x440 net/core/dev.c:7023
 napi_poll net/core/dev.c:7090 [inline]
 net_rx_action+0x801/0xb40 net/core/dev.c:7177
 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558
 invoke_softirq kernel/softirq.c:432 [inline]
 __irq_exit_rcu+0x123/0x180 kernel/softirq.c:637
 irq_exit_rcu+0x5/0x20 kernel/softirq.c:649
 common_interrupt+0x52/0xc0 arch/x86/kernel/irq.c:240
 asm_common_interrupt+0x1e/0x40 arch/x86/include/asm/idtentry.h:629
RIP: 0033:0x7f5e972bfd57
Code: 39 d1 73 14 0f 1f 80 00 00 00 00 48 8b 50 f8 48 83 e8 08 48 39 ca 77 f3 48 39 c3 73 3e 48 89 13 48 8b 50 f8 48 89 38 49 8b 0e <48> 8b 3e 48 83 c3 08 48 83 c6 08 eb bc 48 39 d1 72 9e 48 39 d0 73
RSP: 002b:00007fff8a413210 EFLAGS: 00000283
RAX: 00007f5e97108990 RBX: 00007f5e97108338 RCX: ffffffff81d3aa45
RDX: ffffffff81d3aa45 RSI: 00007f5e97108340 RDI: ffffffff81d3aa45
RBP: 00007f5e97107eb8 R08: 00007f5e97108d88 R09: 0000000093c2e8d9
R10: 0000000000000000 R11: 0000000000000000 R12: 00007f5e97107eb0
R13: 00007f5e97108338 R14: 00007f5e97107ea8 R15: 0000000000000019
 </TASK>

Allocated by task 13:
 kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38
 kasan_set_track mm/kasan/common.c:46 [inline]
 set_alloc_info mm/kasan/common.c:434 [inline]
 __kasan_slab_alloc+0x90/0xc0 mm/kasan/common.c:467
 kasan_slab_alloc include/linux/kasan.h:259 [inline]
 slab_post_alloc_hook mm/slab.h:519 [inline]
 slab_alloc_node mm/slub.c:3234 [inline]
 slab_alloc mm/slub.c:3242 [inline]
 kmem_cache_alloc+0x202/0x3a0 mm/slub.c:3247
 dst_alloc+0x146/0x1f0 net/core/dst.c:92
 rt_dst_alloc+0x73/0x430 net/ipv4/route.c:1613
 ip_route_input_slow+0x1817/0x3a20 net/ipv4/route.c:2340
 ip_route_input_rcu net/ipv4/route.c:2470 [inline]
 ip_route_input_noref+0x116/0x2a0 net/ipv4/route.c:2415
 ip_rcv_finish_core.constprop.0+0x288/0x1e80 net/ipv4/ip_input.c:354
 ip_list_rcv_finish.constprop.0+0x1b2/0x6e0 net/ipv4/ip_input.c:583
 ip_sublist_rcv net/ipv4/ip_input.c:609 [inline]
 ip_list_rcv+0x34e/0x490 net/ipv4/ip_input.c:644
 __netif_receive_skb_list_ptype net/core/dev.c:5508 [inline]
 __netif_receive_skb_list_core+0x549/0x8e0 net/core/dev.c:5556
 __netif_receive_skb_list net/core/dev.c:5608 [inline]
 netif_receive_skb_list_internal+0x75e/0xd80 net/core/dev.c:5699
 gro_normal_list net/core/dev.c:5853 [inline]
 gro_normal_list net/core/dev.c:5849 [inline]
 napi_complete_done+0x1f1/0x880 net/core/dev.c:6590
 virtqueue_napi_complete drivers/net/virtio_net.c:339 [inline]
 virtnet_poll+0xca2/0x11b0 drivers/net/virtio_net.c:1557
 __napi_poll+0xaf/0x440 net/core/dev.c:7023
 napi_poll net/core/dev.c:7090 [inline]
 net_rx_action+0x801/0xb40 net/core/dev.c:7177
 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558

Freed by task 13:
 kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38
 kasan_set_track+0x21/0x30 mm/kasan/common.c:46
 kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:370
 ____kasan_slab_free mm/kasan/common.c:366 [inline]
 ____kasan_slab_free mm/kasan/common.c:328 [inline]
 __kasan_slab_free+0xff/0x130 mm/kasan/common.c:374
 kasan_slab_free include/linux/kasan.h:235 [inline]
 slab_free_hook mm/slub.c:1723 [inline]
 slab_free_freelist_hook+0x8b/0x1c0 mm/slub.c:1749
 slab_free mm/slub.c:3513 [inline]
 kmem_cache_free+0xbd/0x5d0 mm/slub.c:3530
 dst_destroy+0x2d6/0x3f0 net/core/dst.c:127
 rcu_do_batch kernel/rcu/tree.c:2506 [inline]
 rcu_core+0x7ab/0x1470 kernel/rcu/tree.c:2741
 __do_softirq+0x29b/0x9c2 kernel/softirq.c:558

Last potentially related work creation:
 kasan_save_stack+0x1e/0x50 mm/kasan/common.c:38
 __kasan_record_aux_stack+0xf5/0x120 mm/kasan/generic.c:348
 __call_rcu kernel/rcu/tree.c:2985 [inline]
 call_rcu+0xb1/0x740 kernel/rcu/tree.c:3065
 dst_release net/core/dst.c:177 [inline]
 dst_release+0x79/0xe0 net/core/dst.c:167
 tcp_v4_do_rcv+0x612/0x8d0 net/ipv4/tcp_ipv4.c:1712
 sk_backlog_rcv include/net/sock.h:1030 [inline]
 __release_sock+0x134/0x3b0 net/core/sock.c:2768
 release_sock+0x54/0x1b0 net/core/sock.c:3300
 tcp_sendmsg+0x36/0x40 net/ipv4/tcp.c:1441
 inet_sendmsg+0x99/0xe0 net/ipv4/af_inet.c:819
 sock_sendmsg_nosec net/socket.c:704 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:724
 sock_write_iter+0x289/0x3c0 net/socket.c:1057
 call_write_iter include/linux/fs.h:2162 [inline]
 new_sync_write+0x429/0x660 fs/read_write.c:503
 vfs_write+0x7cd/0xae0 fs/read_write.c:590
 ksys_write+0x1ee/0x250 fs/read_write.c:643
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x44/0xae

The buggy address belongs to the object at ffff88807f1cb700
 which belongs to the cache ip_dst_cache of size 176
The buggy address is located 58 bytes inside of
 176-byte region [ffff88807f1cb700, ffff88807f1cb7b0)
The buggy address belongs to the page:
page:ffffea0001fc72c0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x7f1cb
flags: 0xfff00000000200(slab|node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000000200 dead000000000100 dead000000000122 ffff8881413bb780
raw: 0000000000000000 0000000000100010 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x112a20(GFP_ATOMIC|__GFP_NOWARN|__GFP_NORETRY|__GFP_HARDWALL), pid 5, ts 108466983062, free_ts 108048976062
 prep_new_page mm/page_alloc.c:2418 [inline]
 get_page_from_freelist+0xa72/0x2f50 mm/page_alloc.c:4149
 __alloc_pages+0x1b2/0x500 mm/page_alloc.c:5369
 alloc_pages+0x1a7/0x300 mm/mempolicy.c:2191
 alloc_slab_page mm/slub.c:1793 [inline]
 allocate_slab mm/slub.c:1930 [inline]
 new_slab+0x32d/0x4a0 mm/slub.c:1993
 ___slab_alloc+0x918/0xfe0 mm/slub.c:3022
 __slab_alloc.constprop.0+0x4d/0xa0 mm/slub.c:3109
 slab_alloc_node mm/slub.c:3200 [inline]
 slab_alloc mm/slub.c:3242 [inline]
 kmem_cache_alloc+0x35c/0x3a0 mm/slub.c:3247
 dst_alloc+0x146/0x1f0 net/core/dst.c:92
 rt_dst_alloc+0x73/0x430 net/ipv4/route.c:1613
 __mkroute_output net/ipv4/route.c:2564 [inline]
 ip_route_output_key_hash_rcu+0x921/0x2d00 net/ipv4/route.c:2791
 ip_route_output_key_hash+0x18b/0x300 net/ipv4/route.c:2619
 __ip_route_output_key include/net/route.h:126 [inline]
 ip_route_output_flow+0x23/0x150 net/ipv4/route.c:2850
 ip_route_output_key include/net/route.h:142 [inline]
 geneve_get_v4_rt+0x3a6/0x830 drivers/net/geneve.c:809
 geneve_xmit_skb drivers/net/geneve.c:899 [inline]
 geneve_xmit+0xc4a/0x3540 drivers/net/geneve.c:1082
 __netdev_start_xmit include/linux/netdevice.h:4994 [inline]
 netdev_start_xmit include/linux/netdevice.h:5008 [inline]
 xmit_one net/core/dev.c:3590 [inline]
 dev_hard_start_xmit+0x1eb/0x920 net/core/dev.c:3606
 __dev_queue_xmit+0x299a/0x3650 net/core/dev.c:4229
page last free stack trace:
 reset_page_owner include/linux/page_owner.h:24 [inline]
 free_pages_prepare mm/page_alloc.c:1338 [inline]
 free_pcp_prepare+0x374/0x870 mm/page_alloc.c:1389
 free_unref_page_prepare mm/page_alloc.c:3309 [inline]
 free_unref_page+0x19/0x690 mm/page_alloc.c:3388
 qlink_free mm/kasan/quarantine.c:146 [inline]
 qlist_free_all+0x5a/0xc0 mm/kasan/quarantine.c:165
 kasan_quarantine_reduce+0x180/0x200 mm/kasan/quarantine.c:272
 __kasan_slab_alloc+0xa2/0xc0 mm/kasan/common.c:444
 kasan_slab_alloc include/linux/kasan.h:259 [inline]
 slab_post_alloc_hook mm/slab.h:519 [inline]
 slab_alloc_node mm/slub.c:3234 [inline]
 kmem_cache_alloc_node+0x255/0x3f0 mm/slub.c:3270
 __alloc_skb+0x215/0x340 net/core/skbuff.c:414
 alloc_skb include/linux/skbuff.h:1126 [inline]
 alloc_skb_with_frags+0x93/0x620 net/core/skbuff.c:6078
 sock_alloc_send_pskb+0x783/0x910 net/core/sock.c:2575
 mld_newpack+0x1df/0x770 net/ipv6/mcast.c:1754
 add_grhead+0x265/0x330 net/ipv6/mcast.c:1857
 add_grec+0x1053/0x14e0 net/ipv6/mcast.c:1995
 mld_send_initial_cr.part.0+0xf6/0x230 net/ipv6/mcast.c:2242
 mld_send_initial_cr net/ipv6/mcast.c:1232 [inline]
 mld_dad_work+0x1d3/0x690 net/ipv6/mcast.c:2268
 process_one_work+0x9b2/0x1690 kernel/workqueue.c:2298
 worker_thread+0x658/0x11f0 kernel/workqueue.c:2445

Memory state around the buggy address:
 ffff88807f1cb600: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88807f1cb680: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
>ffff88807f1cb700: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                        ^
 ffff88807f1cb780: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
 ffff88807f1cb800: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Fixes: 41063e9dd1 ("ipv4: Early TCP socket demux.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20211220143330.680945-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[cmllamas: fixed trivial merge conflict]
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-10-26 13:17:14 +02:00
..
6lowpan 6lowpan: Off by one handling ->nexthdr 2020-01-27 14:46:30 +01:00
9p net/9p: Initialize the iounit field during fid creation 2022-08-25 11:11:30 +02:00
802 net/802/garp: fix memleak in garp_request_join() 2021-08-04 12:22:14 +02:00
8021q net: vlan: avoid leaks on register_vlan_dev() failures 2021-01-17 13:58:58 +01:00
appletalk appletalk: Fix skb allocation size in loopback case 2021-04-07 12:47:02 +02:00
atm atm: fix a memory leak of vcc->user_back 2020-10-01 13:12:42 +02:00
ax25 ax25: Fix UAF bugs in ax25 timers 2022-04-27 13:15:32 +02:00
batman-adv batman-adv: Don't skb_split skbuffs with frag_list 2022-05-18 09:18:05 +02:00
bluetooth Bluetooth: L2CAP: Fix user-after-free 2022-10-26 13:17:11 +02:00
bpf bpf: fix panic due to oob in bpf_prog_test_run_skb 2021-12-22 09:17:58 +01:00
bridge netfilter: ebtables: fix memory leak when blob is malformed 2022-09-28 10:56:51 +02:00
caif net-caif: avoid user-triggerable WARN_ON(1) 2021-09-22 11:45:33 +02:00
can can: bcm: check the result of can_send() in bcm_can_tx() 2022-10-26 13:17:10 +02:00
ceph libceph: clear con->out_msg on Policy::stateful_server faults 2020-11-05 11:07:03 +01:00
core net: If sock is dead don't access sock's sk_wq in sk_stream_wait_memory 2022-10-26 13:17:10 +02:00
dcb net: dcb: disable softirqs in dcbnl_flush_dev() 2022-03-08 19:01:58 +01:00
dccp dccp: put dccp_qpolicy_full() and dccp_qpolicy_push() in the same lock 2022-08-25 11:11:20 +02:00
decnet net: decnet: Fix sleeping inside in af_decnet 2021-07-28 11:12:18 +02:00
dns_resolver KEYS: Don't write out to userspace while holding key semaphore 2020-04-24 08:01:25 +02:00
dsa net: dsa: Fix duplicate frames flooded by learning 2020-04-02 16:34:24 +02:00
ethernet net: add annotations on hh->hh_len lockless accesses 2020-01-09 10:17:59 +01:00
hsr hsr: use netdev_err() instead of WARN_ONCE() 2021-05-22 10:57:24 +02:00
ieee802154 net/ieee802154: don't warn zero-sized raw_sendmsg() 2022-10-26 13:17:13 +02:00
ife net: sched: ife: check on metadata length 2018-04-29 11:33:13 +02:00
ipv4 inet: fully convert sk->sk_rx_dst to RCU rules 2022-10-26 13:17:14 +02:00
ipv6 inet: fully convert sk->sk_rx_dst to RCU rules 2022-10-26 13:17:14 +02:00
ipx
iucv net/af_iucv: set correct sk_protocol for child sockets 2020-12-08 10:17:32 +01:00
kcm kcm: fix strp_init() order and cleanup 2022-09-15 12:23:50 +02:00
key af_key: Do not call xfrm_probe_algs in parallel 2022-09-05 10:25:02 +02:00
l2tp l2tp: fix race in pppol2tp_release with session object destroy 2022-06-25 11:46:45 +02:00
l3mdev
lapb net: lapb: Copy the skb before sending a packet 2021-02-10 09:12:08 +01:00
llc llc: only change llc->dev when bind() succeeds 2022-03-28 08:22:27 +02:00
mac80211 wifi: mac80211: allow bw change during channel switch in mesh 2022-10-26 13:16:59 +02:00
mac802154 net: mac802154: Fix a condition in the receive path 2022-09-15 12:23:51 +02:00
mpls net: mpls: Fix notifications when deleting a device 2021-12-08 08:46:55 +01:00
ncsi net/ncsi: Avoid GFP_KERNEL in response handler 2021-04-16 11:57:51 +02:00
netfilter netfilter: nf_queue: fix socket leak 2022-10-26 13:16:53 +02:00
netlabel net: fix NULL pointer reference in cipso_v4_doi_free 2021-09-22 11:45:32 +02:00
netlink netlink: do not reset transport header in netlink_recvmsg() 2022-05-18 09:18:05 +02:00
netrom netrom: Decrease sock refcount when sock timers expire 2021-07-28 11:12:18 +02:00
nfc NFC: NULL out the dev->rfkill to prevent UAF 2022-06-14 16:53:48 +02:00
nsh nsh: set mac len based on inner packet 2018-07-22 14:28:49 +02:00
openvswitch openvswitch: Fix overreporting of drops in dropwatch 2022-10-26 13:17:09 +02:00
packet net/packet: fix packet_sock xmit return value checking 2022-04-27 13:15:29 +02:00
phonet phonet: refcount leak in pep_sock_accep 2022-01-11 13:57:37 +01:00
psample net: psample: fix skb_over_panic 2019-12-05 15:38:15 +01:00
qrtr net: qrtr: fix a kernel-infoleak in qrtr_recvmsg() 2021-03-30 14:40:12 +02:00
rds net: rds: don't hold sock lock when cancelling work from rds_tcp_reset_callbacks() 2022-10-26 13:17:00 +02:00
rfkill rfkill: Fix incorrect check to avoid NULL pointer dereference 2020-01-12 12:11:57 +01:00
rose rose: check NULL rose_loopback_neigh->loopback 2022-09-05 10:25:03 +02:00
rxrpc rxrpc: Don't try to resend the request if we're receiving the reply 2022-06-14 16:53:50 +02:00
sched sch_sfb: Also store skb len before calling child enqueue 2022-09-15 12:23:52 +02:00
sctp sctp: read sk->sk_bound_dev_if once in sctp_rcv() 2022-06-14 16:53:50 +02:00
smc net/smc: non blocking recvmsg() return -EAGAIN when no data and signal_pending 2022-05-18 09:18:06 +02:00
strparser strparser: Remove early eaten to fix full tcp receive buffer stall 2018-07-22 14:28:47 +02:00
sunrpc SUNRPC: use _bh spinlocking on ->transport_lock 2022-09-15 12:23:53 +02:00
switchdev
tipc tipc: fix shift wrapping bug in map_get() 2022-09-15 12:23:52 +02:00
tls net/tls: Fixed return value when tls_complete_pending_work() fails 2018-12-05 19:41:11 +01:00
unix af_unix: annote lockless accesses to unix_tot_inflight & gc_in_progress 2022-01-27 09:01:00 +01:00
vmw_vsock vhost/vsock: Use kvmalloc/kvfree for larger packets. 2022-10-26 13:17:00 +02:00
wimax
wireless wifi: cfg80211: debugfs: fix return type in ht40allow_map_read() 2022-09-15 12:23:49 +02:00
x25 net/x25: Fix null-ptr-deref caused by x25_disconnect 2022-04-20 09:08:21 +02:00
xfrm xfrm: Update ipcomp_scratches with NULL when freed 2022-10-26 13:17:09 +02:00
compat.c net: Return the correct errno code 2021-06-30 08:48:47 -04:00
Kconfig
Makefile net: split out functions related to registering inflight socket files 2021-08-04 12:22:14 +02:00
socket.c net: Fix a data-race around sysctl_somaxconn. 2022-09-05 10:25:04 +02:00
sysctl_net.c