korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-15 16:24:13 +08:00

Daniel Borkmann 628185cfdd net, sched: fix soft lockup in tc_classify	2016-12-26 11:24:10 -05:00

Shahar reported a soft lockup in tc_classify(), where we run into an endless loop when walking the classifier chain due to tp->next == tp which is a state we should never run into. The issue only seems to trigger under load in the tc control path.

What happens is that in tc_ctl_tfilter(), thread A allocates a new tp, initializes it, sets tp_created to 1, and calls into tp->ops->change() with it. In that classifier callback we had to unlock/lock the rtnl mutex and returned with -EAGAIN. One reason why we need to drop there is, for example, that we need to request an action module to be loaded.

This happens via tcf_exts_validate() -> tcf_action_init/_1() meaning after we loaded and found the requested action, we need to redo the whole request so we don't race against others. While we had to unlock rtnl in that time, thread B's request was processed next on that CPU. Thread B added a new tp instance successfully to the classifier chain. When thread A returned grabbing the rtnl mutex again, propagating -EAGAIN and destroying its tp instance which never got linked, we goto replay and redo A's request.

This time when walking the classifier chain in tc_ctl_tfilter() for checking for existing tp instances we had a priority match and found the tp instance that was created and linked by thread B. Now calling again into tp->ops->change() with that tp was successful and returned without error.

tp_created was never cleared in the second round, thus kernel thinks that we need to link it into the classifier chain (once again). tp and *back point to the same object due to the match we had earlier on. Thus for thread B's already public tp, we reset tp->next to tp itself and link it into the chain, which eventually causes the mentioned endless loop in tc_classify() once a packet hits the data path.

Fix is to clear tp_created at the beginning of each request, also when we replay it. On the paths that can cause -EAGAIN we already destroy the original tp instance we had and on replay we really need to start from scratch. It seems that this issue was first introduced in commit `12186be7d2` ("net_cls: fix unconfigured struct tcf_proto keeps chaining and avoid kernel panic when we use cls_cgroup").

Fixes: `12186be7d2` ("net_cls: fix unconfigured struct tcf_proto keeps chaining and avoid kernel panic when we use cls_cgroup")
Reported-by: Shahar Klein <shahark@mellanox.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Tested-by: Shahar Klein <shahark@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
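
The replay race described in the commit message can be reproduced in a small single-threaded userspace model. The following is only a sketch and not the kernel code: `struct tp`, `find_slot()`, `new_tp()`, `simulate_request()` and the `clear_on_replay` switch are hypothetical names made up for illustration, and thread B's interleaved request is simulated inline at the point where the real callback would drop the rtnl mutex and return -EAGAIN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified stand-in for struct tcf_proto. */
struct tp {
	int prio;
	struct tp *next;
};

/* Walk a priority-sorted chain and return the slot where prio belongs. */
static struct tp **find_slot(struct tp **chain, int prio)
{
	struct tp **back = chain;

	while (*back && (*back)->prio < prio)
		back = &(*back)->next;
	return back;
}

static struct tp *new_tp(int prio)
{
	struct tp *t = calloc(1, sizeof(*t));

	assert(t != NULL);
	t->prio = prio;
	return t;
}

/*
 * Model of thread A's request. Returns true if the chain ends up with
 * tp->next == tp, the state that makes the classifier walk spin forever.
 * clear_on_replay selects the fixed behaviour (reset tp_created on replay).
 */
static bool simulate_request(bool clear_on_replay)
{
	struct tp *chain = NULL;
	struct tp **back, *tp;
	int tp_created = 0;
	bool first_pass = true;
	bool self_loop;

replay:
	if (clear_on_replay)
		tp_created = 0;		/* the fix: start each round from scratch */

	back = find_slot(&chain, 10);
	tp = (*back && (*back)->prio == 10) ? *back : NULL;
	if (!tp) {
		tp = new_tp(10);	/* not linked into the chain yet */
		tp_created = 1;
	}

	if (first_pass) {
		/* change() "returns -EAGAIN"; meanwhile thread B links its own tp. */
		struct tp *b = new_tp(10);

		first_pass = false;
		b->next = chain;
		chain = b;
		free(tp);		/* A's never-linked tp is destroyed */
		goto replay;
	}

	/* Second round: change() succeeded on the tp that thread B linked. */
	if (tp_created) {
		tp->next = *back;	/* *back == tp here, so tp->next = tp */
		*back = tp;
	}

	self_loop = (chain->next == chain);
	free(chain);			/* the chain holds exactly one node */
	return self_loop;
}
```

In this model, `simulate_request(false)` leaves the stale `tp_created` in place and produces the self-referencing node, while `simulate_request(true)` resets the flag on replay and leaves thread B's node untouched.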
act_api.c	net_sched: gen_estimator: complete rewrite of rate estimators	2016-12-05 15:21:59 -05:00
act_bpf.c	bpf: add prog_digest and expose it via fdinfo/netlink	2016-12-05 15:33:11 -05:00
act_connmark.c	netns: make struct pernet_operations::id unsigned int	2016-11-18 10:59:15 -05:00
act_csum.c	netns: make struct pernet_operations::id unsigned int	2016-11-18 10:59:15 -05:00
act_gact.c	netns: make struct pernet_operations::id unsigned int	2016-11-18 10:59:15 -05:00
act_ife.c	netns: make struct pernet_operations::id unsigned int	2016-11-18 10:59:15 -05:00
act_ipt.c	netns: make struct pernet_operations::id unsigned int	2016-11-18 10:59:15 -05:00
act_meta_mark.c	Support to encoding decoding skb mark on IFE action	2016-03-01 17:15:23 -05:00
act_meta_skbprio.c	Support to encoding decoding skb prio on IFE action	2016-03-01 17:15:23 -05:00
act_meta_skbtcindex.c	net sched ife action: Introduce skb tcindex metadata encap decap	2016-09-19 21:55:28 -04:00
act_mirred.c	act_mirred: fix a typo in get_dev	2016-12-03 19:28:02 -05:00
act_nat.c	netns: make struct pernet_operations::id unsigned int	2016-11-18 10:59:15 -05:00
act_pedit.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2016-12-03 12:29:53 -05:00
act_police.c	net_sched: gen_estimator: complete rewrite of rate estimators	2016-12-05 15:21:59 -05:00
act_simple.c	netns: make struct pernet_operations::id unsigned int	2016-11-18 10:59:15 -05:00
act_skbedit.c	netns: make struct pernet_operations::id unsigned int	2016-11-18 10:59:15 -05:00
act_skbmod.c	netns: make struct pernet_operations::id unsigned int	2016-11-18 10:59:15 -05:00
act_tunnel_key.c	net/sched: act_tunnel_key: Fix setting UDP dst port in metadata under IPv6	2016-12-23 11:59:56 -05:00
act_vlan.c	netns: make struct pernet_operations::id unsigned int	2016-11-18 10:59:15 -05:00
cls_api.c	net, sched: fix soft lockup in tc_classify	2016-12-26 11:24:10 -05:00
cls_basic.c	net, sched: respect rcu grace period on cls destruction	2016-11-28 10:47:35 -05:00
cls_bpf.c	bpf: add prog_digest and expose it via fdinfo/netlink	2016-12-05 15:33:11 -05:00
cls_cgroup.c	net, sched: respect rcu grace period on cls destruction	2016-11-28 10:47:35 -05:00
cls_flow.c	net, sched: respect rcu grace period on cls destruction	2016-11-28 10:47:35 -05:00
cls_flower.c	net/sched: cls_flower: Mandate mask when matching on flags	2016-12-23 11:59:56 -05:00
cls_fw.c	net sched: stylistic cleanups	2016-09-19 22:04:14 -04:00
cls_matchall.c	net, sched: respect rcu grace period on cls destruction	2016-11-28 10:47:35 -05:00
cls_route.c	net_sched: check NULL on error path in route4_change()	2016-09-23 06:51:49 -04:00
cls_rsvp6.c
cls_rsvp.c
cls_rsvp.h	net, sched: respect rcu grace period on cls destruction	2016-11-28 10:47:35 -05:00
cls_tcindex.c	net, sched: respect rcu grace period on cls destruction	2016-11-28 10:47:35 -05:00
cls_u32.c	net sched: stylistic cleanups	2016-09-19 22:04:14 -04:00
em_canid.c	net: sched: remove tcf_proto from ematch calls	2014-10-06 18:02:32 -04:00
em_cmp.c	net_sched: cleanups	2011-01-19 23:31:12 -08:00
em_ipset.c	netfilter: x_tables: move hook state into xt_action_param structure	2016-11-03 10:56:21 +01:00
em_meta.c	net/sched: em_meta: Fix 'meta vlan' to correctly recognize zero VID frames	2016-10-23 17:31:25 -04:00
em_nbyte.c	net: sched: remove tcf_proto from ematch calls	2014-10-06 18:02:32 -04:00
em_text.c	net: Remove state argument from skb_find_text()	2015-02-22 15:59:54 -05:00
em_u32.c	net_sched: cleanups	2011-01-19 23:31:12 -08:00
ematch.c	ematch: Fix auto-loading of ematch modules.	2015-02-20 15:30:56 -05:00
Kconfig	net sched ife action: Introduce skb tcindex metadata encap decap	2016-09-19 21:55:28 -04:00
Makefile	net sched ife action: Introduce skb tcindex metadata encap decap	2016-09-19 21:55:28 -04:00
sch_api.c	net_sched: gen_estimator: complete rewrite of rate estimators	2016-12-05 15:21:59 -05:00
sch_atm.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_blackhole.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_cbq.c	net_sched: gen_estimator: complete rewrite of rate estimators	2016-12-05 15:21:59 -05:00
sch_choke.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_codel.c	sched: replace __skb_dequeue with __qdisc_dequeue_head	2016-09-19 01:47:18 -04:00
sch_drr.c	net_sched: gen_estimator: complete rewrite of rate estimators	2016-12-05 15:21:59 -05:00
sch_dsmark.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_fifo.c	sched: don't use skb queue helpers	2016-09-19 01:47:18 -04:00
sch_fq_codel.c	net_sched: fq_codel: cache skb->truesize into skb->cb	2016-06-25 12:19:35 -04:00
sch_fq.c	net_sched: sch_fq: use rb_entry()	2016-12-20 14:22:48 -05:00
sch_generic.c	net_sched: gen_estimator: complete rewrite of rate estimators	2016-12-05 15:21:59 -05:00
sch_gred.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_hfsc.c	net_sched: gen_estimator: complete rewrite of rate estimators	2016-12-05 15:21:59 -05:00
sch_hhf.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_htb.c	net_sched: gen_estimator: complete rewrite of rate estimators	2016-12-05 15:21:59 -05:00
sch_ingress.c	net: sched: fix tc_should_offload for specific clsact classes	2016-06-07 16:59:53 -07:00
sch_mq.c	net: sched: convert qdisc linked list to hashtable	2016-08-10 17:19:02 -07:00
sch_mqprio.c	net: sched: convert qdisc linked list to hashtable	2016-08-10 17:19:02 -07:00
sch_multiq.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_netem.c	net_sched: sch_netem: use rb_entry()	2016-12-20 14:22:48 -05:00
sch_pie.c	sched: replace __skb_dequeue with __qdisc_dequeue_head	2016-09-19 01:47:18 -04:00
sch_plug.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_prio.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2016-06-30 05:03:36 -04:00
sch_qfq.c	net_sched: gen_estimator: complete rewrite of rate estimators	2016-12-05 15:21:59 -05:00
sch_red.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_sfb.c	sch_sfb: keep backlog updated with qlen	2016-09-23 06:52:31 -04:00
sch_sfq.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_tbf.c	net_sched: drop packets after root qdisc lock is released	2016-06-25 12:19:35 -04:00
sch_teql.c	net: use core MTU range checking in core net infra	2016-10-20 14:51:09 -04:00