linux/net/ipv4
Dragos Tatulea c3198822c6 net: esp: fix bad handling of pages from page_pool
When the skb is reorganized during esp_output (!esp->inline), the pages
coming from the original skb fragments are supposed to be released back
to the system through put_page. But if the skb fragment pages are
originating from a page_pool, calling put_page on them will trigger a
page_pool leak which will eventually result in a crash.

This leak can be easily observed when using CONFIG_DEBUG_VM and doing
ipsec + gre (non offloaded) forwarding:

  BUG: Bad page state in process ksoftirqd/16  pfn:1451b6
  page:00000000de2b8d32 refcount:0 mapcount:0 mapping:0000000000000000 index:0x1451b6000 pfn:0x1451b6
  flags: 0x200000000000000(node=0|zone=2)
  page_type: 0xffffffff()
  raw: 0200000000000000 dead000000000040 ffff88810d23c000 0000000000000000
  raw: 00000001451b6000 0000000000000001 00000000ffffffff 0000000000000000
  page dumped because: page_pool leak
  Modules linked in: ip_gre gre mlx5_ib mlx5_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat nf_nat xt_addrtype br_netfilter rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm ib_uverbs ib_core overlay zram zsmalloc fuse [last unloaded: mlx5_core]
  CPU: 16 PID: 96 Comm: ksoftirqd/16 Not tainted 6.8.0-rc4+ #22
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
  Call Trace:
   <TASK>
   dump_stack_lvl+0x36/0x50
   bad_page+0x70/0xf0
   free_unref_page_prepare+0x27a/0x460
   free_unref_page+0x38/0x120
   esp_ssg_unref.isra.0+0x15f/0x200
   esp_output_tail+0x66d/0x780
   esp_xmit+0x2c5/0x360
   validate_xmit_xfrm+0x313/0x370
   ? validate_xmit_skb+0x1d/0x330
   validate_xmit_skb_list+0x4c/0x70
   sch_direct_xmit+0x23e/0x350
   __dev_queue_xmit+0x337/0xba0
   ? nf_hook_slow+0x3f/0xd0
   ip_finish_output2+0x25e/0x580
   iptunnel_xmit+0x19b/0x240
   ip_tunnel_xmit+0x5fb/0xb60
   ipgre_xmit+0x14d/0x280 [ip_gre]
   dev_hard_start_xmit+0xc3/0x1c0
   __dev_queue_xmit+0x208/0xba0
   ? nf_hook_slow+0x3f/0xd0
   ip_finish_output2+0x1ca/0x580
   ip_sublist_rcv_finish+0x32/0x40
   ip_sublist_rcv+0x1b2/0x1f0
   ? ip_rcv_finish_core.constprop.0+0x460/0x460
   ip_list_rcv+0x103/0x130
   __netif_receive_skb_list_core+0x181/0x1e0
   netif_receive_skb_list_internal+0x1b3/0x2c0
   napi_gro_receive+0xc8/0x200
   gro_cell_poll+0x52/0x90
   __napi_poll+0x25/0x1a0
   net_rx_action+0x28e/0x300
   __do_softirq+0xc3/0x276
   ? sort_range+0x20/0x20
   run_ksoftirqd+0x1e/0x30
   smpboot_thread_fn+0xa6/0x130
   kthread+0xcd/0x100
   ? kthread_complete_and_exit+0x20/0x20
   ret_from_fork+0x31/0x50
   ? kthread_complete_and_exit+0x20/0x20
   ret_from_fork_asm+0x11/0x20
   </TASK>

The suggested fix is to introduce a new wrapper (skb_page_unref) that
covers page refcounting for page_pool pages as well.

Cc: stable@vger.kernel.org
Fixes: 6a5bcd84e8 ("page_pool: Allow drivers to hint on SKB recycling")
Reported-and-tested-by: Anatoli N.Chechelnickiy <Anatoli.Chechelnickiy@m.interpipe.biz>
Reported-by: Ian Kumlien <ian.kumlien@gmail.com>
Link: https://lore.kernel.org/netdev/CAA85sZvvHtrpTQRqdaOx6gd55zPAVsqMYk_Lwh4Md5knTq7AyA@mail.gmail.com
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2024-03-18 11:53:46 +01:00
..
netfilter netfilter: xtables: fix up kconfig dependencies 2024-02-21 11:57:11 +01:00
af_inet.c net: introduce include/net/rps.h 2024-03-07 21:12:43 -08:00
ah4.c net: fill in MODULE_DESCRIPTION()s for ipv4 modules 2024-02-09 14:12:02 -08:00
arp.c arp: Prevent overflow in arp_req_get(). 2024-02-20 10:50:19 +01:00
bpf_tcp_ca.c bpf: treewide: Annotate BPF kfuncs in BTF 2024-01-31 20:40:56 -08:00
cipso_ipv4.c netlabel: remove impossible return value in netlbl_bitmap_walk 2024-02-28 19:37:34 -08:00
datagram.c ipv4: Set the routing scope properly in ip_route_output_ports(). 2024-02-12 17:33:05 -08:00
devinet.c netlink: let core handle error cases in dump operations 2024-03-07 20:48:22 -08:00
esp4_offload.c xfrm: Support GRO for IPv4 ESP in UDP encapsulation 2023-10-06 07:30:40 +02:00
esp4.c net: esp: fix bad handling of pages from page_pool 2024-03-18 11:53:46 +01:00
fib_frontend.c netlink: let core handle error cases in dump operations 2024-03-07 20:48:22 -08:00
fib_lookup.h
fib_notifier.c
fib_rules.c fib: remove unnecessary input parameters in fib_default_rule_add 2024-01-03 16:42:48 -08:00
fib_semantics.c ipv4: fib: annotate races around nh->nh_saddr_genid and nh->nh_saddr 2023-10-18 18:11:31 -07:00
fib_trie.c inet: switch inet_dump_fib() to RCU protection 2024-02-26 11:46:13 +00:00
fou_bpf.c bpf: treewide: Annotate BPF kfuncs in BTF 2024-01-31 20:40:56 -08:00
fou_core.c net: gro: rename skb_gro_header_hard() 2024-03-05 13:30:11 +01:00
fou_nl.c net: ynl: prefix uAPI header include with uapi/ 2023-05-26 10:30:14 +01:00
fou_nl.h net: ynl: prefix uAPI header include with uapi/ 2023-05-26 10:30:14 +01:00
gre_demux.c
gre_offload.c net: gro: rename skb_gro_header_hard() 2024-03-05 13:30:11 +01:00
icmp.c xfrm: pass struct net to xfrm_decode_session wrappers 2023-10-06 08:31:53 +02:00
igmp.c inet: annotate devconf data-races 2024-02-28 19:36:39 -08:00
inet_connection_sock.c sock: Use unsafe_memcpy() for sock_copy() 2024-03-05 18:35:12 -08:00
inet_diag.c inet_diag: skip over empty buckets 2024-01-23 15:13:55 +01:00
inet_fragment.c net: dropreason: add SKB_DROP_REASON_FRAG_REASM_TIMEOUT 2022-10-31 20:14:27 -07:00
inet_hashtables.c tcp: Fix refcnt handling in __inet_hash_connect(). 2024-03-14 10:57:02 +01:00
inet_timewait_sock.c tcp: Fix NEW_SYN_RECV handling in inet_twsk_purge() 2024-03-12 18:56:15 -07:00
inetpeer.c net: ipv4: Simplify the allocation of slab caches in inet_initpeers 2024-01-31 16:39:42 -08:00
ip_forward.c net: fix IPSTATS_MIB_OUTFORWDATAGRAMS increment after fragment check 2023-10-13 09:58:45 -07:00
ip_fragment.c networking: Update to register_net_sysctl_sz 2023-08-15 15:26:18 -07:00
ip_gre.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-02-15 16:20:04 -08:00
ip_input.c ipv4: ignore dst hint for multipath routes 2023-09-01 08:11:51 +01:00
ip_options.c
ip_output.c net: Re-use and set mono_delivery_time bit for userspace tstamp packets 2024-03-05 13:41:16 +01:00
ip_sockglue.c inet: Add getsockopt support for IP_ROUTER_ALERT and IPV6_ROUTER_ALERT 2024-03-06 12:37:06 +00:00
ip_tunnel_core.c tunnels: fix out of bounds access when building IPv6 PMTU error 2024-02-03 12:43:19 +00:00
ip_tunnel.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-03-11 20:38:36 -07:00
ip_vti.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-02-15 16:20:04 -08:00
ipcomp.c
ipconfig.c net: ipconfig: move ic_nameservers_fallback into #ifdef block 2023-05-22 11:17:55 +01:00
ipip.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-02-15 16:20:04 -08:00
ipmr_base.c
ipmr.c ipmr: fix incorrect parameter validation in the ip_mroute_getsockopt() function 2024-03-11 09:53:22 +00:00
Kconfig net/tcp: Add TCP-AO config and structures 2023-10-27 10:35:44 +01:00
Makefile bpfilter: remove bpfilter 2024-01-04 10:23:10 -08:00
metrics.c ipv4: prevent potential spectre v1 gadget in ip_metrics_convert() 2023-01-23 21:37:25 -08:00
netfilter.c xfrm: pass struct net to xfrm_decode_session wrappers 2023-10-06 08:31:53 +02:00
netlink.c
nexthop.c nexthop: Fix splat with CONFIG_DEBUG_PREEMPT=y 2024-03-11 20:35:20 -07:00
ping.c bpf-next-for-netdev 2023-10-16 21:05:33 -07:00
proc.c inet: annotate devconf data-races 2024-02-28 19:36:39 -08:00
protocol.c
raw_diag.c inet_diag: add module pointer to "struct inet_diag_handler" 2024-01-23 15:13:54 +01:00
raw.c ipv4: raw: check sk->sk_rcvbuf earlier 2024-03-08 11:39:44 -08:00
route.c inet: annotate devconf data-races 2024-02-28 19:36:39 -08:00
syncookies.c tcp: use drop reasons in cookie check for ipv4 2024-02-28 10:39:21 +00:00
sysctl_net_ipv4.c Use READ/WRITE_ONCE() for IP local_port_range. 2023-12-08 10:44:42 -08:00
tcp_ao.c net: tcp: Remove redundant initialization of variable len 2024-02-20 11:40:15 +01:00
tcp_bbr.c bpf: treewide: Annotate BPF kfuncs in BTF 2024-01-31 20:40:56 -08:00
tcp_bic.c
tcp_bpf.c tcp_bpf: properly release resources on error paths 2023-10-18 18:09:31 -07:00
tcp_cdg.c
tcp_cong.c bpf, net: validate struct_ops when updating value. 2024-03-04 10:03:57 -08:00
tcp_cubic.c bpf: treewide: Annotate BPF kfuncs in BTF 2024-01-31 20:40:56 -08:00
tcp_dctcp.c bpf: treewide: Annotate BPF kfuncs in BTF 2024-01-31 20:40:56 -08:00
tcp_dctcp.h
tcp_diag.c inet_diag: add module pointer to "struct inet_diag_handler" 2024-01-23 15:13:54 +01:00
tcp_fastopen.c inet: move inet->defer_connect to inet->inet_flags 2023-08-16 11:09:18 +01:00
tcp_highspeed.c
tcp_htcp.c
tcp_hybla.c
tcp_illinois.c
tcp_input.c tcp: add dropreasons in tcp_rcv_state_process() 2024-02-28 10:39:22 +00:00
tcp_ipv4.c tcp: make dropreason in tcp_child_process() work 2024-02-28 10:39:22 +00:00
tcp_lp.c tcp: rename tcp_time_stamp() to tcp_time_stamp_ts() 2023-10-23 09:35:01 +01:00
tcp_metrics.c tcp_metrics: optimize tcp_metrics_flush_all() 2023-10-03 10:05:22 +02:00
tcp_minisocks.c rds: tcp: Fix use-after-free of net in reqsk_timer_handler(). 2024-03-12 18:56:16 -07:00
tcp_nv.c
tcp_offload.c net: move tcpv4_offload and tcpv6_offload to net_hotdata 2024-03-07 21:12:42 -08:00
tcp_output.c net: Remove acked SYN flag from packet in the transmit queue correctly 2023-12-12 15:56:02 -08:00
tcp_plb.c prandom: remove prandom_u32_max() 2022-12-20 03:13:45 +01:00
tcp_rate.c
tcp_recovery.c tcp: fix excessive TLP and RACK timeouts from HZ rounding 2023-10-17 17:25:42 -07:00
tcp_scalable.c
tcp_sigpool.c net/tcp_sigpool: Use kref_get_unless_zero() 2024-01-01 14:42:05 +00:00
tcp_timer.c tcp: use tp->total_rto to track number of linear timeouts in SYN_SENT state 2023-11-16 23:35:12 +00:00
tcp_ulp.c net/ulp: use consistent error code when blocking ULP 2023-01-19 09:26:16 -08:00
tcp_vegas.c
tcp_vegas.h
tcp_veno.c
tcp_westwood.c
tcp_yeah.c
tcp.c tcp: annotate a data-race around sysctl_tcp_wmem[0] 2024-03-11 10:37:40 +00:00
tunnel4.c net: fill in MODULE_DESCRIPTION()s for ipv4 modules 2024-02-09 14:12:02 -08:00
udp_bpf.c bpf, sockmap: Fix an infinite loop error when len is 0 in tcp_bpf_recvmsg_parser() 2023-03-03 17:25:15 +01:00
udp_diag.c inet_diag: add module pointer to "struct inet_diag_handler" 2024-01-23 15:13:54 +01:00
udp_impl.h sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES) 2023-06-24 15:50:13 -07:00
udp_offload.c udp: move udpv4_offload and udpv6_offload to net_hotdata 2024-03-07 21:12:43 -08:00
udp_tunnel_core.c net: fill in MODULE_DESCRIPTION()s for ipv4 modules 2024-02-09 14:12:02 -08:00
udp_tunnel_nic.c udp_tunnel: Use flex array to simplify code 2023-10-03 11:39:34 +02:00
udp_tunnel_stub.c
udp.c udp: no longer touch sk->sk_refcnt in early demux 2024-03-11 09:56:03 +00:00
udplite.c udplite: remove UDPLITE_BIT 2023-09-14 16:16:36 +02:00
xfrm4_input.c net: adopt skb_network_offset() and similar helpers 2024-03-04 08:47:06 +00:00
xfrm4_output.c
xfrm4_policy.c sysctl-6.6-rc1 2023-08-29 17:39:15 -07:00
xfrm4_protocol.c
xfrm4_state.c
xfrm4_tunnel.c net: fill in MODULE_DESCRIPTION()s for ipv4 modules 2024-02-09 14:12:02 -08:00