linux/net/core
Eric Dumazet c20029db28 net: avoid potential underflow in qdisc_pkt_len_init() with UFO
After commit 7c6d2ecbda ("net: be more gentle about silly gso
requests coming from user") virtio_net_hdr_to_skb() had sanity check
to detect malicious attempts from user space to cook a bad GSO packet.

Then commit cf9acc90c8 ("net: virtio_net_hdr_to_skb: count
transport header in UFO") while fixing one issue, allowed user space
to cook a GSO packet with the following characteristic :

IPv4 SKB_GSO_UDP, gso_size=3, skb->len = 28.

When this packet arrives in qdisc_pkt_len_init(), we end up
with hdr_len = 28 (IPv4 header + UDP header), matching skb->len

Then the following sets gso_segs to 0 :

gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
                        shinfo->gso_size);

Then later we set qdisc_skb_cb(skb)->pkt_len to back to zero :/

qdisc_skb_cb(skb)->pkt_len += (gso_segs - 1) * hdr_len;

This leads to the following crash in fq_codel [1]

qdisc_pkt_len_init() is best effort, we only want an estimation
of the bytes sent on the wire, not crashing the kernel.

This patch is fixing this particular issue, a following one
adds more sanity checks for another potential bug.

[1]
[   70.724101] BUG: kernel NULL pointer dereference, address: 0000000000000000
[   70.724561] #PF: supervisor read access in kernel mode
[   70.724561] #PF: error_code(0x0000) - not-present page
[   70.724561] PGD 10ac61067 P4D 10ac61067 PUD 107ee2067 PMD 0
[   70.724561] Oops: Oops: 0000 [#1] SMP NOPTI
[   70.724561] CPU: 11 UID: 0 PID: 2163 Comm: b358537762 Not tainted 6.11.0-virtme #991
[   70.724561] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[   70.724561] RIP: 0010:fq_codel_enqueue (net/sched/sch_fq_codel.c:120 net/sched/sch_fq_codel.c:168 net/sched/sch_fq_codel.c:230) sch_fq_codel
[ 70.724561] Code: 24 08 49 c1 e1 06 44 89 7c 24 18 45 31 ed 45 31 c0 31 ff 89 44 24 14 4c 03 8b 90 01 00 00 eb 04 39 ca 73 37 4d 8b 39 83 c7 01 <49> 8b 17 49 89 11 41 8b 57 28 45 8b 5f 34 49 c7 07 00 00 00 00 49
All code
========
   0:	24 08                	and    $0x8,%al
   2:	49 c1 e1 06          	shl    $0x6,%r9
   6:	44 89 7c 24 18       	mov    %r15d,0x18(%rsp)
   b:	45 31 ed             	xor    %r13d,%r13d
   e:	45 31 c0             	xor    %r8d,%r8d
  11:	31 ff                	xor    %edi,%edi
  13:	89 44 24 14          	mov    %eax,0x14(%rsp)
  17:	4c 03 8b 90 01 00 00 	add    0x190(%rbx),%r9
  1e:	eb 04                	jmp    0x24
  20:	39 ca                	cmp    %ecx,%edx
  22:	73 37                	jae    0x5b
  24:	4d 8b 39             	mov    (%r9),%r15
  27:	83 c7 01             	add    $0x1,%edi
  2a:*	49 8b 17             	mov    (%r15),%rdx		<-- trapping instruction
  2d:	49 89 11             	mov    %rdx,(%r9)
  30:	41 8b 57 28          	mov    0x28(%r15),%edx
  34:	45 8b 5f 34          	mov    0x34(%r15),%r11d
  38:	49 c7 07 00 00 00 00 	movq   $0x0,(%r15)
  3f:	49                   	rex.WB

Code starting with the faulting instruction
===========================================
   0:	49 8b 17             	mov    (%r15),%rdx
   3:	49 89 11             	mov    %rdx,(%r9)
   6:	41 8b 57 28          	mov    0x28(%r15),%edx
   a:	45 8b 5f 34          	mov    0x34(%r15),%r11d
   e:	49 c7 07 00 00 00 00 	movq   $0x0,(%r15)
  15:	49                   	rex.WB
[   70.724561] RSP: 0018:ffff95ae85e6fb90 EFLAGS: 00000202
[   70.724561] RAX: 0000000002000000 RBX: ffff95ae841de000 RCX: 0000000000000000
[   70.724561] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001
[   70.724561] RBP: ffff95ae85e6fbf8 R08: 0000000000000000 R09: ffff95b710a30000
[   70.724561] R10: 0000000000000000 R11: bdf289445ce31881 R12: ffff95ae85e6fc58
[   70.724561] R13: 0000000000000000 R14: 0000000000000040 R15: 0000000000000000
[   70.724561] FS:  000000002c5c1380(0000) GS:ffff95bd7fcc0000(0000) knlGS:0000000000000000
[   70.724561] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   70.724561] CR2: 0000000000000000 CR3: 000000010c568000 CR4: 00000000000006f0
[   70.724561] Call Trace:
[   70.724561]  <TASK>
[   70.724561] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
[   70.724561] ? page_fault_oops (arch/x86/mm/fault.c:715)
[   70.724561] ? exc_page_fault (./arch/x86/include/asm/irqflags.h:26 ./arch/x86/include/asm/irqflags.h:87 ./arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539)
[   70.724561] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623)
[   70.724561] ? fq_codel_enqueue (net/sched/sch_fq_codel.c:120 net/sched/sch_fq_codel.c:168 net/sched/sch_fq_codel.c:230) sch_fq_codel
[   70.724561] dev_qdisc_enqueue (net/core/dev.c:3784)
[   70.724561] __dev_queue_xmit (net/core/dev.c:3880 (discriminator 2) net/core/dev.c:4390 (discriminator 2))
[   70.724561] ? irqentry_enter (kernel/entry/common.c:237)
[   70.724561] ? sysvec_apic_timer_interrupt (./arch/x86/include/asm/hardirq.h:74 (discriminator 2) arch/x86/kernel/apic/apic.c:1043 (discriminator 2) arch/x86/kernel/apic/apic.c:1043 (discriminator 2))
[   70.724561] ? trace_hardirqs_on (kernel/trace/trace_preemptirq.c:58 (discriminator 4))
[   70.724561] ? asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:702)
[   70.724561] ? virtio_net_hdr_to_skb.constprop.0 (./include/linux/virtio_net.h:129 (discriminator 1))
[   70.724561] packet_sendmsg (net/packet/af_packet.c:3145 (discriminator 1) net/packet/af_packet.c:3177 (discriminator 1))
[   70.724561] ? _raw_spin_lock_bh (./arch/x86/include/asm/atomic.h:107 (discriminator 4) ./include/linux/atomic/atomic-arch-fallback.h:2170 (discriminator 4) ./include/linux/atomic/atomic-instrumented.h:1302 (discriminator 4) ./include/asm-generic/qspinlock.h:111 (discriminator 4) ./include/linux/spinlock.h:187 (discriminator 4) ./include/linux/spinlock_api_smp.h:127 (discriminator 4) kernel/locking/spinlock.c:178 (discriminator 4))
[   70.724561] ? netdev_name_node_lookup_rcu (net/core/dev.c:325 (discriminator 1))
[   70.724561] __sys_sendto (net/socket.c:730 (discriminator 1) net/socket.c:745 (discriminator 1) net/socket.c:2210 (discriminator 1))
[   70.724561] ? __sys_setsockopt (./include/linux/file.h:34 net/socket.c:2355)
[   70.724561] __x64_sys_sendto (net/socket.c:2222 (discriminator 1) net/socket.c:2218 (discriminator 1) net/socket.c:2218 (discriminator 1))
[   70.724561] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1))
[   70.724561] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
[   70.724561] RIP: 0033:0x41ae09

Fixes: cf9acc90c8 ("net: virtio_net_hdr_to_skb: count transport header in UFO")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jonathan Davies <jonathan.davies@nutanix.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Jonathan Davies <jonathan.davies@nutanix.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-01 11:47:05 +02:00
..
bpf_sk_storage.c netlink: introduce type-checking attribute iteration 2024-03-29 15:06:02 -07:00
datagram.c net: add support for skbs with unreadable frags 2024-09-11 20:44:31 -07:00
dev_addr_lists_test.c net: dev_addr_lists: move locking out of init/exit in kunit 2024-04-15 10:26:35 +01:00
dev_addr_lists.c net: Correct spelling in net/core 2024-08-26 09:37:23 -07:00
dev_ioctl.c netdevice: convert private flags > BIT(31) to bitfields 2024-09-03 11:36:43 +02:00
dev.c net: avoid potential underflow in qdisc_pkt_len_init() with UFO 2024-10-01 11:47:05 +02:00
dev.h net: softnet_data: Make xmit per task. 2024-06-24 16:41:23 -07:00
devmem.c memory-provider: dmabuf devmem memory provider 2024-09-11 20:44:31 -07:00
devmem.h tcp: RX path for devmem TCP 2024-09-11 20:44:32 -07:00
drop_monitor.c net: add rx_sk to trace_kfree_skb 2024-06-19 12:44:22 +01:00
dst_cache.c net: dst_cache: add two DEBUG_NET warnings 2024-06-03 18:50:09 -07:00
dst.c net: dst: Make dst_destroy() static and return void. 2024-02-06 11:45:53 +01:00
failover.c net: failover: use IFF_NO_ADDRCONF flag to prevent ipv6 addrconf 2022-12-12 15:18:25 -08:00
fib_notifier.c
fib_rules.c net: fib_rules: Enable DSCP selector usage 2024-09-13 21:15:45 -07:00
filter.c bpf-next-6.12 2024-09-21 09:27:50 -07:00
flow_dissector.c net: flow_dissector: use DEBUG_NET_WARN_ON_ONCE 2024-07-18 10:52:17 +02:00
flow_offload.c tc: flower: Enable offload support IPSEC SPI field. 2023-08-02 10:09:32 +01:00
gen_estimator.c net: use unrcu_pointer() helper 2024-06-06 11:52:52 +02:00
gen_stats.c net: Remove the obsolte u64_stats_fetch_*_irq() users (net). 2022-10-28 20:13:54 -07:00
gro_cells.c net: move netdev_max_backlog to net_hotdata 2024-03-07 21:12:42 -08:00
gro.c net: Add netif_get_gro_max_size helper for GRO 2024-10-01 10:48:51 +02:00
gso.c net: introduce struct net_hotdata 2024-03-07 21:12:41 -08:00
hotdata.c net: move sysctl_mem_pcpu_rsv to net_hotdata 2024-04-30 18:46:52 -07:00
hwbm.c
ieee8021q_helpers.c net: add IEEE 802.1q specific helpers 2024-05-08 10:35:09 +01:00
link_watch.c net: linkwatch: use system_unbound_wq 2024-08-06 12:12:53 -07:00
lwt_bpf.c bpf: lwtunnel: Unmask upper DSCP bits in bpf_lwt_xmit_reroute() 2024-09-09 14:14:52 +01:00
lwtunnel.c xfrm: lwtunnel: squelch kernel warning in case XFRM encap type is not available 2022-10-12 10:45:51 +02:00
Makefile netdev: support binding dma-buf to netdevice 2024-09-11 20:44:31 -07:00
mp_dmabuf_devmem.h memory-provider: dmabuf devmem memory provider 2024-09-11 20:44:31 -07:00
neighbour.c neighbour: delete redundant judgment statements 2024-08-23 14:27:45 +01:00
net_namespace.c struct fd layout change (and conversion to accessor helpers) 2024-09-23 09:35:36 -07:00
net_test.c pfcp: always set pfcp metadata 2024-04-01 10:49:28 +01:00
net-procfs.c net: make softnet_data.dropped an atomic_t 2024-04-01 11:28:32 +01:00
net-sysfs.c net: sysfs: Fix weird usage of class's namespace relevant fields 2024-09-09 10:30:52 +01:00
net-sysfs.h
net-traces.c udp6: add a missing call into udp_fail_queue_rcv_skb tracepoint 2023-07-07 09:16:52 +01:00
netclassid_cgroup.c cgroup, netclassid: on modifying netclassid in cgroup, only consider the main process. 2023-10-16 16:36:53 -07:00
netdev_rx_queue.c memory-provider: dmabuf devmem memory provider 2024-09-11 20:44:31 -07:00
netdev-genl-gen.c netdev: support binding dma-buf to netdevice 2024-09-11 20:44:31 -07:00
netdev-genl-gen.h netdev: support binding dma-buf to netdevice 2024-09-11 20:44:31 -07:00
netdev-genl.c netdev: add dmabuf introspection 2024-09-11 20:44:32 -07:00
netevent.c
netmem_priv.h page_pool: devmem support 2024-09-11 20:44:31 -07:00
netpoll.c netpoll: remove netpoll_srcu 2024-09-06 18:24:59 -07:00
netprio_cgroup.c
of_net.c net: Explicitly include correct DT includes 2023-07-27 20:33:16 -07:00
page_pool_priv.h memory-provider: dmabuf devmem memory provider 2024-09-11 20:44:31 -07:00
page_pool_user.c netdev: add dmabuf introspection 2024-09-11 20:44:32 -07:00
page_pool.c memory-provider: dmabuf devmem memory provider 2024-09-11 20:44:31 -07:00
pktgen.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-08-29 11:49:10 -07:00
ptp_classifier.c
request_sock.c tcp: make sure init the accept_queue's spinlocks once 2024-01-19 21:13:25 -08:00
rtnetlink.c netdevice: convert private flags > BIT(31) to bitfields 2024-09-03 11:36:43 +02:00
scm.c af_unix: Add dead flag to struct scm_fp_list. 2024-05-10 18:52:45 -07:00
secure_seq.c tcp: Fix data-races around sysctl knobs related to SYN option. 2022-07-20 10:14:49 +01:00
selftests.c net: fill in MODULE_DESCRIPTION()s under net/core 2023-10-28 11:29:27 +01:00
skbuff.c net: add support for skbs with unreadable frags 2024-09-11 20:44:31 -07:00
skmsg.c bpf, sockmap: Correct spelling skmsg.c 2024-09-02 18:43:58 +02:00
sock_destructor.h
sock_diag.c net: use unrcu_pointer() helper 2024-06-06 11:52:52 +02:00
sock_map.c bpf-next-6.12-struct-fd 2024-09-24 14:54:26 -07:00
sock_reuseport.c net: core: annotate socks of struct sock_reuseport with __counted_by 2024-08-02 17:16:59 -07:00
sock.c vfs-6.12.file 2024-09-16 09:14:02 +02:00
stream.c net: Return error from sk_stream_wait_connect() if sk_wait_event() fails 2023-12-15 10:48:51 +00:00
sysctl_net_core.c sysctl: treewide: constify the ctl_table argument of proc_handlers 2024-07-24 20:59:29 +02:00
timestamping.c net: Change the API of PHY default timestamp to MAC 2024-07-15 08:02:26 -07:00
tso.c net: tso: inline tso_count_descs() 2022-12-12 15:04:39 -08:00
utils.c net: Correct spelling in net/core 2024-08-26 09:37:23 -07:00
xdp.c xdp: fix invalid wait context of page_pool_destroy() 2024-07-14 20:40:21 -07:00