linux/net/ipv4
Eric Dumazet 3fb07daff8 ipv4: add reference counting to metrics
Andrey Konovalov reported crashes in ipv4_mtu()

I could reproduce the issue with KASAN kernels, between
10.246.7.151 and 10.246.7.152 :

1) 20 concurrent netperf -t TCP_RR -H 10.246.7.152 -l 1000 &

2) At the same time run following loop :
while :
do
 ip ro add 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500
 ip ro del 10.246.7.152 dev eth0 src 10.246.7.151 mtu 1500
done

Cong Wang attempted to add back rt->fi in commit
82486aa6f1 ("ipv4: restore rt->fi for reference counting")
but this proved to add some issues that were complex to solve.

Instead, I suggested to add a refcount to the metrics themselves,
being a standalone object (in particular, no reference to other objects)

I tried to make this patch as small as possible to ease its backport,
instead of being super clean. Note that we believe that only ipv4 dst
need to take care of the metric refcount. But if this is wrong,
this patch adds the basic infrastructure to extend this to other
families.

Many thanks to Julian Anastasov for reviewing this patch, and Cong Wang
for his efforts on this problem.

Fixes: 2860583fe8 ("ipv4: Kill rt->fi")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-26 14:57:07 -04:00
..
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2017-05-01 10:47:53 -04:00
af_inet.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-05-02 16:40:27 -07:00
ah4.c IPsec: do not ignore crypto err in ah4 input 2017-01-16 12:57:48 +01:00
arp.c arp: fixed -Wuninitialized compiler warning 2017-05-25 13:38:20 -04:00
cipso_ipv4.c netlabel: out of bound access in cipso_v4_validate() 2017-02-04 19:44:22 -05:00
datagram.c
devinet.c net: rtnetlink: plumb extended ack to doit function 2017-04-17 15:35:38 -04:00
esp4_offload.c esp4/6: Fix GSO path for non-GSO SW-crypto packets 2017-04-19 07:48:57 +02:00
esp4.c esp4: Fix udpencap for local TCP packets. 2017-05-04 07:27:26 +02:00
fib_frontend.c net: Improve handling of failures on link and route dumps 2017-05-16 14:54:11 -04:00
fib_lookup.h
fib_notifier.c ipv4: fib: Remove redundant argument 2017-03-10 09:45:09 -08:00
fib_rules.c ipv4: fib_rules: Dump FIB rules when registering FIB notifier 2017-03-16 10:18:34 -07:00
fib_semantics.c ipv4: add reference counting to metrics 2017-05-26 14:57:07 -04:00
fib_trie.c net: Improve handling of failures on link and route dumps 2017-05-16 14:54:11 -04:00
fou.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-10-30 12:42:58 -04:00
gre_demux.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-06-30 05:03:36 -04:00
gre_offload.c net: add recursion limit to GRO 2016-10-20 14:32:22 -04:00
icmp.c net: ipv4: add support for ECMP hash policy choice 2017-03-21 15:27:19 -07:00
igmp.c igmp, mld: Fix memory leak in igmpv3/mld_del_delrec() 2017-02-09 16:43:45 -05:00
inet_connection_sock.c dccp/tcp: do not inherit mc_list from parent 2017-05-09 15:17:49 -04:00
inet_diag.c tcp: remove early retransmit 2017-01-13 22:37:16 -05:00
inet_fragment.c net: disable fragment reassembly if high_thresh is zero 2016-06-05 22:56:42 -04:00
inet_hashtables.c treewide: use kv[mz]alloc* rather than opencoded variants 2017-05-08 17:15:13 -07:00
inet_timewait_sock.c ipv4: Namespaceify tcp_tw_recycle and tcp_max_tw_buckets knob 2016-12-29 11:38:31 -05:00
inetpeer.c
ip_forward.c ipv4: allow local fragmentation in ip_finish_output_gso() 2016-11-03 16:10:26 -04:00
ip_fragment.c inet: frag: release spinlock before calling icmp_send() 2017-03-22 15:40:45 -07:00
ip_gre.c ip_tunnel: Allow policy-based routing through tunnels 2017-04-21 13:21:31 -04:00
ip_input.c net: Add sysctl to toggle early demux for tcp and udp 2017-03-24 13:17:07 -07:00
ip_options.c Replace <asm/uaccess.h> with <linux/uaccess.h> globally 2016-12-24 11:46:01 -08:00
ip_output.c udp: avoid ufo handling on IP payload compression packets 2017-03-09 18:28:42 -08:00
ip_sockglue.c ipv4: get rid of ip_ra_lock 2017-04-30 22:44:04 -04:00
ip_tunnel_core.c netlink: pass extended ACK struct to parsing functions 2017-04-13 13:58:22 -04:00
ip_tunnel.c ip_tunnel: Allow policy-based routing through tunnels 2017-04-21 13:21:31 -04:00
ip_vti.c vti: check nla_put_* return value 2017-05-08 15:10:31 -04:00
ipcomp.c
ipconfig.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-04-06 08:24:51 -07:00
ipip.c ip_tunnel: Allow policy-based routing through tunnels 2017-04-21 13:21:31 -04:00
ipmr.c ipmr: vrf: Find VIFs using the actual device 2017-05-16 12:52:17 -04:00
Kconfig Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2017-02-16 21:25:49 -05:00
Makefile ipv4: fib: Move FIB notification code to a separate file 2017-03-10 09:45:09 -08:00
netfilter.c netfilter: use skb_to_full_sk in ip_route_me_harder 2017-02-28 12:49:36 +01:00
ping.c ping: implement proper locking 2017-03-24 20:50:28 -07:00
proc.c net/tcp_fastopen: Add snmp counter for blackhole detection 2017-04-24 14:27:17 -04:00
protocol.c net: Add sysctl to toggle early demux for tcp and udp 2017-03-24 13:17:07 -07:00
raw_diag.c net: ip, raw_diag -- Use jump for exiting from nested loop 2016-11-03 15:25:26 -04:00
raw.c ipv4, ipv6: ensure raw socket message is big enough to hold an IP header 2017-05-04 11:02:46 -04:00
route.c ipv4: add reference counting to metrics 2017-05-26 14:57:07 -04:00
syncookies.c tcp: randomize timestamps on syncookies 2017-05-05 12:00:11 -04:00
sysctl_net_ipv4.c net/tcp_fastopen: Disable active side TFO in certain scenarios 2017-04-24 14:27:17 -04:00
tcp_bbr.c tcp_bbr: add a state transition diagram and accompanying comment 2016-10-29 17:12:43 -04:00
tcp_bic.c tcp: replace cnt & rtt with struct in pkts_acked() 2016-05-11 14:43:19 -04:00
tcp_cdg.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/clock.h> 2017-03-02 08:42:27 +01:00
tcp_cong.c tcp: memset ca_priv data to 0 properly 2017-04-26 14:58:32 -04:00
tcp_cubic.c tcp_cubic: fix typo in module param description 2017-04-20 16:16:44 -04:00
tcp_dctcp.c Revert "dctcp: update cwnd on congestion event" 2016-12-06 11:34:24 -05:00
tcp_diag.c net: diag: Fix refcnt leak in error path destroying socket 2016-08-23 23:11:36 -07:00
tcp_fastopen.c net/tcp_fastopen: Add snmp counter for blackhole detection 2017-04-24 14:27:17 -04:00
tcp_highspeed.c tcp: add cwnd_undo functions to various tcp cc algorithms 2016-11-21 13:20:17 -05:00
tcp_htcp.c tcp: replace cnt & rtt with struct in pkts_acked() 2016-05-11 14:43:19 -04:00
tcp_hybla.c tcp: make undo_cwnd mandatory for congestion modules 2016-11-21 13:20:17 -05:00
tcp_illinois.c tcp: add cwnd_undo functions to various tcp cc algorithms 2016-11-21 13:20:17 -05:00
tcp_input.c tcp: eliminate negative reordering in tcp_clean_rtx_queue 2017-05-16 12:45:21 -04:00
tcp_ipv4.c Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-05-10 10:30:46 -07:00
tcp_lp.c tcp: fix wraparound issue in tcp_lp 2017-05-02 15:07:02 -04:00
tcp_metrics.c treewide: use kv[mz]alloc* rather than opencoded variants 2017-05-08 17:15:13 -07:00
tcp_minisocks.c tcp: do not inherit fastopen_req from parent 2017-05-04 11:00:04 -04:00
tcp_nv.c tcp: add NV congestion control 2016-06-10 23:07:49 -07:00
tcp_offload.c gso: Support partial splitting at the frag_list pointer 2016-09-19 20:59:34 -04:00
tcp_output.c tcp: make congestion control optionally skip slow start after idle 2017-05-08 14:37:07 -04:00
tcp_probe.c tcp: Revert "tcp: tcp_probe: use spin_lock_bh()" 2017-02-21 13:26:03 -05:00
tcp_rate.c tcp: do not pass timestamp to tcp_rate_gen() 2017-04-26 14:44:38 -04:00
tcp_recovery.c tcp: tcp_rack_reo_timeout() must update tp->tcp_mstamp 2017-04-27 11:46:15 -04:00
tcp_scalable.c tcp: add cwnd_undo functions to various tcp cc algorithms 2016-11-21 13:20:17 -05:00
tcp_timer.c net/tcp_fastopen: Remove mss check in tcp_write_timeout() 2017-04-24 14:27:17 -04:00
tcp_vegas.c tcp: make undo_cwnd mandatory for congestion modules 2016-11-21 13:20:17 -05:00
tcp_vegas.h tcp: replace cnt & rtt with struct in pkts_acked() 2016-05-11 14:43:19 -04:00
tcp_veno.c tcp: add cwnd_undo functions to various tcp cc algorithms 2016-11-21 13:20:17 -05:00
tcp_westwood.c tcp_westwood: fix tcp_westwood_info() style mistakes 2017-03-16 20:23:28 -07:00
tcp_yeah.c tcp: add cwnd_undo functions to various tcp cc algorithms 2016-11-21 13:20:17 -05:00
tcp.c tcp: avoid fastopen API to be used on AF_UNSPEC 2017-05-25 13:30:34 -04:00
tunnel4.c tunnels: correct conditional build of MPLS and IPv6 2016-07-11 13:27:06 -07:00
udp_diag.c net: inet: diag: expose the socket mark to privileged processes. 2016-09-08 16:13:09 -07:00
udp_impl.h udp: make *udp*_queue_rcv_skb() functions static 2017-05-18 10:23:33 -04:00
udp_offload.c udp: disable inner UDP checksum offloads in IPsec case 2017-04-24 13:48:54 -04:00
udp_tunnel.c net: Remove deprecated tunnel specific UDP offload functions 2016-06-17 20:23:32 -07:00
udp.c udp: make *udp*_queue_rcv_skb() functions static 2017-05-18 10:23:33 -04:00
udplite.c udplite: call proper backlog handlers 2016-11-24 15:32:14 -05:00
xfrm4_input.c esp: Add a software GRO codepath 2017-02-15 11:04:11 +01:00
xfrm4_mode_beet.c
xfrm4_mode_transport.c xfrm: Add encapsulation header offsets while SKB is not encrypted 2017-04-14 10:07:39 +02:00
xfrm4_mode_tunnel.c xfrm: Add encapsulation header offsets while SKB is not encrypted 2017-04-14 10:07:39 +02:00
xfrm4_output.c xfrm: Add an IPsec hardware offloading API 2017-04-14 10:06:10 +02:00
xfrm4_policy.c xfrm: policy: make policy backend const 2017-02-09 10:22:19 +01:00
xfrm4_protocol.c xfrm: input: constify xfrm_input_afinfo 2017-02-09 10:22:17 +01:00
xfrm4_state.c xfrm: remove unused function 2017-01-10 10:57:12 +01:00
xfrm4_tunnel.c