linux/net
Ido Schimmel df6afe2f7c nexthop: Fix performance regression in nexthop deletion
While insertion of 16k nexthops all using the same netdev ('dummy10')
takes less than a second, deletion takes about 130 seconds:

# time -p ip -b nexthop.batch
real 0.29
user 0.01
sys 0.15

# time -p ip link set dev dummy10 down
real 131.03
user 0.06
sys 0.52

This is because of repeated calls to synchronize_rcu() whenever a
nexthop is removed from a nexthop group:

# /usr/share/bcc/tools/offcputime -p `pgrep -nx ip` -K
...
    b'finish_task_switch'
    b'schedule'
    b'schedule_timeout'
    b'wait_for_completion'
    b'__wait_rcu_gp'
    b'synchronize_rcu.part.0'
    b'synchronize_rcu'
    b'__remove_nexthop'
    b'remove_nexthop'
    b'nexthop_flush_dev'
    b'nh_netdev_event'
    b'raw_notifier_call_chain'
    b'call_netdevice_notifiers_info'
    b'__dev_notify_flags'
    b'dev_change_flags'
    b'do_setlink'
    b'__rtnl_newlink'
    b'rtnl_newlink'
    b'rtnetlink_rcv_msg'
    b'netlink_rcv_skb'
    b'rtnetlink_rcv'
    b'netlink_unicast'
    b'netlink_sendmsg'
    b'____sys_sendmsg'
    b'___sys_sendmsg'
    b'__sys_sendmsg'
    b'__x64_sys_sendmsg'
    b'do_syscall_64'
    b'entry_SYSCALL_64_after_hwframe'
    -                ip (277)
        126554955

Since nexthops are always deleted under RTNL, synchronize_net() can be
used instead. It will call synchronize_rcu_expedited() which only blocks
for several microseconds as opposed to multiple milliseconds like
synchronize_rcu().

With this patch deletion of 16k nexthops takes less than a second:

# time -p ip link set dev dummy10 down
real 0.12
user 0.00
sys 0.04

Tested with fib_nexthops.sh which includes torture tests that prompted
the initial change:

# ./fib_nexthops.sh
...
Tests passed: 134
Tests failed:   0

Fixes: 90f33bffa3 ("nexthops: don't modify published nexthop groups")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Link: https://lore.kernel.org/r/20201016172914.643282-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-19 20:07:15 -07:00
..
6lowpan treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
9p treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
802
8021q net: vlan: Fixed signedness in vlan_group_prealloc_vid() 2020-09-28 00:51:39 -07:00
appletalk appletalk: Fix atalk_proc_init() return path 2020-08-03 15:48:32 -07:00
atm net: atm: delete duplicated words 2020-09-18 14:12:43 -07:00
ax25 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-07-25 17:49:04 -07:00
batman-adv genetlink: move to smaller ops wherever possible 2020-10-02 19:11:11 -07:00
bluetooth Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next 2020-09-29 13:22:53 -07:00
bpf bpf: fix raw_tp test run in preempt kernel 2020-09-30 08:34:08 -07:00
bpfilter Revert "bpfilter: Fix build error with CONFIG_BPFILTER_UMH" 2020-10-15 12:33:24 -07:00
bridge net: bridge: use new function dev_fetch_sw_netstats 2020-10-13 17:33:49 -07:00
caif caif: Remove duplicate macro SRVL_CTRL_PKT_SIZE 2020-09-05 15:57:05 -07:00
can Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-10-15 12:43:21 -07:00
ceph libceph: use sendpage_ok() in ceph_tcp_sendpage() 2020-10-02 15:27:08 -07:00
core net: core: use list_del_init() instead of list_del() in netdev_run_todo() 2020-10-18 14:50:25 -07:00
dcb net: DCB: Validate DCB_ATTR_DCB_BUFFER argument 2020-09-10 15:09:08 -07:00
dccp inet: remove icsk_ack.blocked 2020-09-30 14:21:30 -07:00
decnet treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
dns_resolver
dsa net: dsa: tag_ksz: KSZ8795 and KSZ9477 also use tail tags 2020-10-19 17:32:50 -07:00
ethernet
ethtool ethtool: correct policy for ETHTOOL_MSG_CHANNELS_SET 2020-10-08 16:06:01 -07:00
hsr genetlink: move to smaller ops wherever possible 2020-10-02 19:11:11 -07:00
ieee802154 genetlink: move to smaller ops wherever possible 2020-10-02 19:11:11 -07:00
ife
ipv4 nexthop: Fix performance regression in nexthop deletion 2020-10-19 20:07:15 -07:00
ipv6 networking changes for the 5.10 merge window 2020-10-15 18:42:13 -07:00
iucv net/iucv: fix indentation in __iucv_message_receive() 2020-10-03 16:51:07 -07:00
kcm net: pass a sockptr_t into ->setsockopt 2020-07-24 15:41:54 -07:00
key Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-08-02 01:02:12 -07:00
l2tp genetlink: move to smaller ops wherever possible 2020-10-02 19:11:11 -07:00
l3mdev net: Fix some comments 2020-08-27 07:55:59 -07:00
lapb treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
llc net: pass a sockptr_t into ->setsockopt 2020-07-24 15:41:54 -07:00
mac80211 mac80211: use new function dev_fetch_sw_netstats 2020-10-13 17:33:49 -07:00
mac802154 Merge tag 'ieee802154-for-davem-2020-09-08' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan 2020-09-08 20:12:58 -07:00
mpls treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
mptcp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-10-15 12:43:21 -07:00
ncsi genetlink: move to smaller ops wherever possible 2020-10-02 19:11:11 -07:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-10-15 12:43:21 -07:00
netlabel genetlink: move to smaller ops wherever possible 2020-10-02 19:11:11 -07:00
netlink netlink: export policy in extended ACK 2020-10-09 20:22:32 -07:00
netrom treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
nfc NFC: digital: Remove two unused macroes 2020-09-05 16:01:52 -07:00
nsh treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
openvswitch net: openvswitch: fix to make sure flow_lookup() is not preempted 2020-10-18 12:29:36 -07:00
packet net/packet: Fix a comment about network_header 2020-09-19 16:40:48 -07:00
phonet treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
psample genetlink: move to smaller ops wherever possible 2020-10-02 19:11:11 -07:00
qrtr net: qrtr: ns: Fix the incorrect usage of rcu_read_lock() 2020-10-06 06:01:35 -07:00
rds net/rds: suppress page allocation failure error in recv buffer refill 2020-10-09 12:32:03 -07:00
rfkill
rose treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
rxrpc rxrpc: Fix loss of final ack on shutdown 2020-10-15 13:28:00 +01:00
sched net/sched: get rid of qdisc->padded 2020-10-09 08:08:08 -07:00
sctp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-10-08 15:44:50 -07:00
smc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-10-15 12:43:21 -07:00
strparser
sunrpc networking changes for the 5.10 merge window 2020-10-15 18:42:13 -07:00
switchdev net: switchdev: Fixed kerneldoc warning 2020-09-23 17:46:31 -07:00
tipc tipc: fix incorrect setting window for bcast link 2020-10-16 14:09:12 -07:00
tls Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-10-15 12:43:21 -07:00
unix networking changes for the 5.10 merge window 2020-10-15 18:42:13 -07:00
vmw_vsock vsock: fix potential null pointer dereference in vsock_poll() 2020-08-12 12:56:06 -07:00
wimax genetlink: move to smaller ops wherever possible 2020-10-02 19:11:11 -07:00
wireless A handful of changes: 2020-10-10 09:12:52 -07:00
x25 treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
xdp Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2020-10-12 16:16:50 -07:00
xfrm xfrm: use new function dev_fetch_sw_netstats 2020-10-13 17:33:49 -07:00
compat.c iov_iter: transparently handle compat iovecs in import_iovec 2020-10-03 00:02:13 -04:00
devres.c net: devres: rename the release callback of devm_register_netdev() 2020-06-30 15:57:34 -07:00
Kconfig drop_monitor: Convert to using devlink tracepoint 2020-09-30 18:01:26 -07:00
Makefile
socket.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-10-05 18:40:01 -07:00
sysctl_net.c