Go to file
Daniel Borkmann 628185cfdd net, sched: fix soft lockup in tc_classify
Shahar reported a soft lockup in tc_classify(), where we run into an
endless loop when walking the classifier chain due to tp->next == tp
which is a state we should never run into. The issue only seems to
trigger under load in the tc control path.

What happens is that in tc_ctl_tfilter(), thread A allocates a new
tp, initializes it, sets tp_created to 1, and calls into tp->ops->change()
with it. In that classifier callback we had to unlock/lock the rtnl
mutex and returned with -EAGAIN. One reason why we need to drop there
is, for example, that we need to request an action module to be loaded.

This happens via tcf_exts_validate() -> tcf_action_init/_1() meaning
after we loaded and found the requested action, we need to redo the
whole request so we don't race against others. While we had to unlock
rtnl in that time, thread B's request was processed next on that CPU.
Thread B added a new tp instance successfully to the classifier chain.
When thread A returned grabbing the rtnl mutex again, propagating -EAGAIN
and destroying its tp instance which never got linked, we goto replay
and redo A's request.

This time when walking the classifier chain in tc_ctl_tfilter() for
checking for existing tp instances we had a priority match and found
the tp instance that was created and linked by thread B. Now calling
again into tp->ops->change() with that tp was successful and returned
without error.

tp_created was never cleared in the second round, thus kernel thinks
that we need to link it into the classifier chain (once again). tp and
*back point to the same object due to the match we had earlier on. Thus
for thread B's already public tp, we reset tp->next to tp itself and
link it into the chain, which eventually causes the mentioned endless
loop in tc_classify() once a packet hits the data path.

Fix is to clear tp_created at the beginning of each request, also when
we replay it. On the paths that can cause -EAGAIN we already destroy
the original tp instance we had and on replay we really need to start
from scratch. It seems that this issue was first introduced in commit
12186be7d2 ("net_cls: fix unconfigured struct tcf_proto keeps chaining
and avoid kernel panic when we use cls_cgroup").

Fixes: 12186be7d2 ("net_cls: fix unconfigured struct tcf_proto keeps chaining and avoid kernel panic when we use cls_cgroup")
Reported-by: Shahar Klein <shahark@mellanox.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Tested-by: Shahar Klein <shahark@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-26 11:24:10 -05:00
arch 2nd round of ARC udpates for 4.10rc1 2016-12-23 10:22:47 -08:00
block Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-12-23 10:52:43 -08:00
certs certs: Add a secondary system keyring that can be added to dynamically 2016-04-11 22:48:09 +01:00
crypto This pull contains one set of changes: a conversion of the crypto DocBook 2016-12-17 16:00:34 -08:00
Documentation SCSI for-linus on 20161222 2016-12-23 10:36:19 -08:00
drivers net/mlx4_en: Fix user prio field in XDP forward 2016-12-23 17:53:47 -05:00
firmware WHENCE: use https://linuxtv.org for LinuxTV URLs 2015-12-04 10:35:11 -02:00
fs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-12-23 10:52:43 -08:00
include Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-12-23 10:52:43 -08:00
init Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-12-17 20:17:04 -08:00
ipc ipc/sem: avoid idr tree lookup for interrupted semop 2016-12-14 16:04:08 -08:00
kernel Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-12-23 10:52:43 -08:00
lib Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-12-23 10:52:43 -08:00
mm mm: fadvise: avoid expensive remote LRU cache draining after FADV_DONTNEED 2016-12-20 09:48:46 -08:00
net net, sched: fix soft lockup in tc_classify 2016-12-26 11:24:10 -05:00
samples Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild 2016-12-17 16:24:13 -08:00
scripts Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2016-12-22 10:03:52 -08:00
security Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2016-12-22 10:03:52 -08:00
sound Revert "ALSA: usb-audio: Fix race at stopping the stream" 2016-12-21 09:48:14 -08:00
tools libnvdimm for 4.10 2016-12-18 15:49:10 -08:00
usr initramfs: allow again choice of the embedded initram compression algorithm 2016-12-14 16:04:08 -08:00
virt mm: unexport __get_user_pages_unlocked() 2016-12-14 16:04:09 -08:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore Add hch to .get_maintainer.ignore 2015-08-21 14:30:10 -07:00
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore Merge branch 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild 2016-08-02 16:48:52 -04:00
.mailmap Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2016-10-15 09:26:12 -07:00
COPYING
CREDITS CREDITS: Remove outdated address information 2016-12-21 15:21:29 -08:00
Kbuild scripts/gdb: provide linux constants 2016-05-23 17:04:14 -07:00
Kconfig kbuild: migrate all arch to the kconfig mainmenu upgrade 2010-09-19 22:54:11 -04:00
MAINTAINERS SCSI for-linus on 20161222 2016-12-23 10:36:19 -08:00
Makefile Linux 4.9 2016-12-11 11:17:54 -08:00
README README: add a new README file, pointing to the Documentation/ 2016-10-24 08:12:35 -02:00

Linux kernel
============

This file was moved to Documentation/admin-guide/README.rst

Please notice that there are several guides for kernel developers and users.
These guides can be rendered in a number of formats, like HTML and PDF.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.
See Documentation/00-INDEX for a list of what is contained in each file.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.