linux/net
Florian Westphal e0df8cae6c netfilter: conntrack: refine gc worker heuristics
Nicolas Dichtel says:
  After commit b87a2f9199 ("netfilter: conntrack: add gc worker to
  remove timed-out entries"), netlink conntrack deletion events may be
  sent with a huge delay.

Nicolas further points at this line:

  goal = min(nf_conntrack_htable_size / GC_MAX_BUCKETS_DIV, GC_MAX_BUCKETS);

and indeed, this isn't optimal at all.  Rationale here was to ensure that
we don't block other work items for too long, even if
nf_conntrack_htable_size is huge.  But in order to have some guarantee
about maximum time period where a scan of the full conntrack table
completes we should always use a fixed slice size, so that once every
N scans the full table has been examined at least once.

We also need to balance this vs. the case where the system is either idle
(i.e., conntrack table (almost) empty) or very busy (i.e. eviction happens
from packet path).

So, after some discussion with Nicolas:

1. want hard guarantee that we scan entire table at least once every X s
-> need to scan fraction of table (get rid of upper bound)

2. don't want to eat cycles on idle or very busy system
-> increase interval if we did not evict any entries

3. don't want to block other worker items for too long
-> make fraction really small, and prefer small scan interval instead

4. Want reasonable short time where we detect timed-out entry when
system went idle after a burst of traffic, while not doing scans
all the time.
-> Store next gc scan in worker, increasing delays when no eviction
happened and shrinking delay when we see timed out entries.

The old gc interval is turned into a max number, scans can now happen
every jiffy if stale entries are present.

Longest possible time period until an entry is evicted is now 2 minutes
in worst case (entry expires right after it was deemed 'not expired').

Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-11-08 23:53:38 +01:00
..
6lowpan 6lowpan: ndisc: no overreact if no short address is available 2016-09-19 20:19:34 +02:00
9p IB/core: add support to create a unsafe global rkey to ib_create_pd 2016-09-23 13:47:44 -04:00
802
8021q net: add recursion limit to GRO 2016-10-20 14:32:22 -04:00
appletalk appletalk: use IS_ENABLED() instead of checking for built-in or module 2016-09-10 21:19:10 -07:00
atm lec: use IS_ENABLED() instead of checking for built-in or module 2016-09-10 21:19:10 -07:00
ax25 AX.25: Close socket connection on session completion 2016-06-18 20:55:34 -07:00
batman-adv treewide: remove redundant #include <linux/kconfig.h> 2016-10-11 15:06:33 -07:00
bluetooth Bluetooth: Fix append max 11 bytes of name to scan rsp data 2016-10-19 18:42:37 +02:00
bridge bridge: multicast: restore perm router ports on multicast enable 2016-10-18 13:52:13 -04:00
caif caif: Remove unneeded header file 2016-06-28 05:26:14 -04:00
can can: only call can_stat_update with procfs 2016-06-23 11:23:49 +02:00
ceph crush: remove redundant local variable 2016-10-05 23:02:10 +02:00
core netns: revert "netns: avoid disabling irq for netns id" 2016-10-22 16:16:29 -04:00
dcb
dccp Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2016-07-29 17:38:46 -07:00
decnet net: fix decnet rtnexthop parsing 2016-07-05 14:08:47 -07:00
dns_resolver
dsa net: dsa: add port fast ageing 2016-09-23 08:38:50 -04:00
ethernet net: add recursion limit to GRO 2016-10-20 14:32:22 -04:00
hsr net/hsr: Remove unused but set variable 2016-10-18 10:28:18 -04:00
ieee802154 ieee802154: 6lowpan: fix intra pan id check 2016-07-08 13:23:12 +02:00
ipv4 netfilter: nft_dup: do not use sreg_dev if the user doesn't specify it 2016-10-31 13:17:38 +01:00
ipv6 netfilter: nft_dup: do not use sreg_dev if the user doesn't specify it 2016-10-31 13:17:38 +01:00
ipx
irda Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-09-23 06:46:57 -04:00
iucv Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2016-07-29 17:38:46 -07:00
kcm Merge branch 'work.splice_read' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-07 15:36:58 -07:00
key
l2tp udp: must lock the socket in udp_disconnect() 2016-10-20 14:45:52 -04:00
l3mdev net: ipv6: Remove l3mdev_get_saddr6 2016-09-10 23:12:53 -07:00
lapb
llc llc: switch type to bool as the timeout is only tested versus 0 2016-09-17 10:05:05 -04:00
mac80211 mac80211: move struct aead_req off the stack 2016-10-17 16:14:04 +02:00
mac802154 mac802154: use rate limited warnings for malformed frames 2016-09-19 20:19:34 +02:00
mpls mpls: move mpls_hdr to a common location 2016-10-03 02:00:21 -04:00
ncsi net/ncsi: Improve HNCDSC AEN handler 2016-10-20 11:23:08 -04:00
netfilter netfilter: conntrack: refine gc worker heuristics 2016-11-08 23:53:38 +01:00
netlabel netlabel: Implement CALIPSO config functions for SMACK. 2016-06-27 15:06:18 -04:00
netlink netlink: do not enter direct reclaim from netlink_dump() 2016-10-06 20:53:13 -04:00
netrom
nfc NFC: digital: Fix RTOX supervisor PDU handling 2016-07-11 02:02:03 +02:00
openvswitch openvswitch: add NETIF_F_HW_VLAN_STAG_TX to internal dev 2016-10-13 10:03:23 -04:00
packet packet: call fanout_release, while UNREGISTERING a netdev 2016-10-06 20:50:18 -04:00
phonet
qrtr
rds IB/core: add support to create a unsafe global rkey to ib_create_pd 2016-09-23 13:47:44 -04:00
rfkill
rose rose: limit sk_filter trim to payload 2016-07-13 11:53:40 -07:00
rxrpc rxrpc: Fix checking of error from ip6_route_output() 2016-10-13 08:43:17 +01:00
sched net/sched: act_mirred: Use passed lastuse argument 2016-10-20 11:14:24 -04:00
sctp sctp: fix the panic caused by route update 2016-10-26 17:32:19 -04:00
strparser strparser: Propagate correct error code in strp_recv() 2016-10-12 01:51:49 -04:00
sunrpc NFS client updates for Linux 4.9 2016-10-13 21:28:20 -07:00
switchdev switchdev: Execute bridge ndos only for bridge ports 2016-10-19 10:58:04 -04:00
tipc tipc: info leak in __tipc_nl_add_udp_addr() 2016-10-13 12:10:01 -04:00
unix skb_splice_bits(): get rid of callback 2016-10-03 20:40:56 -04:00
vmw_vsock VSOCK: Don't dec ack backlog twice for rejected connections 2016-09-27 07:59:25 -04:00
wimax
wireless cfg80211: add ability to check DA/SA in A-MSDU decapsulation 2016-10-12 09:19:10 +02:00
x25 net: x25: remove null checks on arrays calling_ae and called_ae 2016-09-09 18:13:30 -07:00
xfrm proc: Reduce cache miss in xfrm_statistics_seq_show 2016-09-30 01:50:45 -04:00
compat.c packet: compat support for sock_fprog 2016-06-09 23:41:03 -07:00
Kconfig strparser: Stream parser for messages 2016-08-17 19:36:23 -04:00
Makefile strparser: Stream parser for messages 2016-08-17 19:36:23 -04:00
socket.c vfs: Remove {get,set,remove}xattr inode operations 2016-10-07 21:48:36 -04:00
sysctl_net.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2016-10-06 09:52:23 -07:00