linux-next

mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-24 05:04:00 +08:00

Author	SHA1	Message	Date
Eric Dumazet	89ad2fa3f0	bpf: fix lockdep splat pcpu_freelist_pop() needs the same lockdep awareness than pcpu_freelist_populate() to avoid a false positive. [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ] switchto-defaul/12508 [HC0[0]:SC0[6]:HE0:SE0] is trying to acquire: (&htab->buckets[i].lock){......}, at: [<ffffffff9dc099cb>] __htab_percpu_map_update_elem+0x1cb/0x300 and this task is already holding: (dev_queue->dev->qdisc_class ?: &qdisc_tx_lock#2){+.-...}, at: [<ffffffff9e135848>] __dev_queue_xmit+0 x868/0x1240 which would create a new lock dependency: (dev_queue->dev->qdisc_class ?: &qdisc_tx_lock#2){+.-...} -> (&htab->buckets[i].lock){......} but this new dependency connects a SOFTIRQ-irq-safe lock: (dev_queue->dev->qdisc_class ?: &qdisc_tx_lock#2){+.-...} ... which became SOFTIRQ-irq-safe at: [<ffffffff9db5931b>] __lock_acquire+0x42b/0x1f10 [<ffffffff9db5b32c>] lock_acquire+0xbc/0x1b0 [<ffffffff9da05e38>] _raw_spin_lock+0x38/0x50 [<ffffffff9e135848>] __dev_queue_xmit+0x868/0x1240 [<ffffffff9e136240>] dev_queue_xmit+0x10/0x20 [<ffffffff9e1965d9>] ip_finish_output2+0x439/0x590 [<ffffffff9e197410>] ip_finish_output+0x150/0x2f0 [<ffffffff9e19886d>] ip_output+0x7d/0x260 [<ffffffff9e19789e>] ip_local_out+0x5e/0xe0 [<ffffffff9e197b25>] ip_queue_xmit+0x205/0x620 [<ffffffff9e1b8398>] tcp_transmit_skb+0x5a8/0xcb0 [<ffffffff9e1ba152>] tcp_write_xmit+0x242/0x1070 [<ffffffff9e1baffc>] __tcp_push_pending_frames+0x3c/0xf0 [<ffffffff9e1b3472>] tcp_rcv_established+0x312/0x700 [<ffffffff9e1c1acc>] tcp_v4_do_rcv+0x11c/0x200 [<ffffffff9e1c3dc2>] tcp_v4_rcv+0xaa2/0xc30 [<ffffffff9e191107>] ip_local_deliver_finish+0xa7/0x240 [<ffffffff9e191a36>] ip_local_deliver+0x66/0x200 [<ffffffff9e19137d>] ip_rcv_finish+0xdd/0x560 [<ffffffff9e191e65>] ip_rcv+0x295/0x510 [<ffffffff9e12ff88>] __netif_receive_skb_core+0x988/0x1020 [<ffffffff9e130641>] __netif_receive_skb+0x21/0x70 [<ffffffff9e1306ff>] process_backlog+0x6f/0x230 [<ffffffff9e132129>] net_rx_action+0x229/0x420 [<ffffffff9da07ee8>] __do_softirq+0xd8/0x43d [<ffffffff9e282bcc>] do_softirq_own_stack+0x1c/0x30 [<ffffffff9dafc2f5>] do_softirq+0x55/0x60 [<ffffffff9dafc3a8>] __local_bh_enable_ip+0xa8/0xb0 [<ffffffff9db4c727>] cpu_startup_entry+0x1c7/0x500 [<ffffffff9daab333>] start_secondary+0x113/0x140 to a SOFTIRQ-irq-unsafe lock: (&head->lock){+.+...} ... which became SOFTIRQ-irq-unsafe at: ... [<ffffffff9db5971f>] __lock_acquire+0x82f/0x1f10 [<ffffffff9db5b32c>] lock_acquire+0xbc/0x1b0 [<ffffffff9da05e38>] _raw_spin_lock+0x38/0x50 [<ffffffff9dc0b7fa>] pcpu_freelist_pop+0x7a/0xb0 [<ffffffff9dc08b2c>] htab_map_alloc+0x50c/0x5f0 [<ffffffff9dc00dc5>] SyS_bpf+0x265/0x1200 [<ffffffff9e28195f>] entry_SYSCALL_64_fastpath+0x12/0x17 other info that might help us debug this: Chain exists of: dev_queue->dev->qdisc_class ?: &qdisc_tx_lock#2 --> &htab->buckets[i].lock --> &head->lock Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&head->lock); local_irq_disable(); lock(dev_queue->dev->qdisc_class ?: &qdisc_tx_lock#2); lock(&htab->buckets[i].lock); <Interrupt> lock(dev_queue->dev->qdisc_class ?: &qdisc_tx_lock#2); * DEADLOCK * Fixes: `e19494edab` ("bpf: introduce percpu_freelist") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-15 19:46:32 +09:00
Yonghong Song	9c019e2bc4	bpf: change helper bpf_probe_read arg2 type to ARG_CONST_SIZE_OR_ZERO The helper bpf_probe_read arg2 type is changed from ARG_CONST_SIZE to ARG_CONST_SIZE_OR_ZERO to permit size-0 buffer. Together with newer ARG_CONST_SIZE_OR_ZERO semantics which allows non-NULL buffer with size 0, this allows simpler bpf programs with verifier acceptance. The previous commit which changes ARG_CONST_SIZE_OR_ZERO semantics has details on examples. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-14 16:20:03 +09:00
Yonghong Song	9fd29c08e5	bpf: improve verifier ARG_CONST_SIZE_OR_ZERO semantics For helpers, the argument type ARG_CONST_SIZE_OR_ZERO permits the access size to be 0 when accessing the previous argument (arg). Right now, it requires the arg needs to be NULL when size passed is 0 or could be 0. It also requires a non-NULL arg when the size is proved to be non-0. This patch changes verifier ARG_CONST_SIZE_OR_ZERO behavior such that for size-0 or possible size-0, it is not required the arg equal to NULL. There are a couple of reasons for this semantics change, and all of them intends to simplify user bpf programs which may improve user experience and/or increase chances of verifier acceptance. Together with the next patch which changes bpf_probe_read arg2 type from ARG_CONST_SIZE to ARG_CONST_SIZE_OR_ZERO, the following two examples, which fail the verifier currently, are able to get verifier acceptance. Example 1: unsigned long len = pend - pstart; len = len > MAX_PAYLOAD_LEN ? MAX_PAYLOAD_LEN : len; len &= MAX_PAYLOAD_LEN; bpf_probe_read(data->payload, len, pstart); It does not have test for "len > 0" and it failed the verifier. Users may not be aware that they have to add this test. Converting the bpf_probe_read helper to have ARG_CONST_SIZE_OR_ZERO helps the above code get verifier acceptance. Example 2: Here is one example where llvm "messed up" the code and the verifier fails. ...... unsigned long len = pend - pstart; if (len > 0 && len <= MAX_PAYLOAD_LEN) bpf_probe_read(data->payload, len, pstart); ...... The compiler generates the following code and verifier fails: ...... 39: (79) r2 = (u64 )(r10 -16) 40: (1f) r2 -= r8 41: (bf) r1 = r2 42: (07) r1 += -1 43: (25) if r1 > 0xffe goto pc+3 R0=inv(id=0) R1=inv(id=0,umax_value=4094,var_off=(0x0; 0xfff)) R2=inv(id=0) R6=map_value(id=0,off=0,ks=4,vs=4095,imm=0) R7=inv(id=0) R8=inv(id=0) R9=inv0 R10=fp0 44: (bf) r1 = r6 45: (bf) r3 = r8 46: (85) call bpf_probe_read#45 R2 min value is negative, either use unsigned or 'var &= const' ...... The compiler optimization is correct. If r1 = 0, r1 - 1 = 0xffffffffffffffff > 0xffe. If r1 != 0, r1 - 1 will not wrap. r1 > 0xffe at insn #43 can actually capture both "r1 > 0" and "len <= MAX_PAYLOAD_LEN". This however causes an issue in verifier as the value range of arg2 "r2" does not properly get refined and lead to verification failure. Relaxing bpf_prog_read arg2 from ARG_CONST_SIZE to ARG_CONST_SIZE_OR_ZERO allows the following simplied code: unsigned long len = pend - pstart; if (len <= MAX_PAYLOAD_LEN) bpf_probe_read(data->payload, len, pstart); The llvm compiler will generate less complex code and the verifier is able to verify that the program is okay. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-14 16:20:03 +09:00
David S. Miller	f3edacbd69	bpf: Revert bpf_overrid_function() helper changes. NACK'd by x86 maintainer. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-11 18:24:55 +09:00
Josef Bacik	dd0bb688ea	bpf: add a bpf_override_function helper Error injection is sloppy and very ad-hoc. BPF could fill this niche perfectly with it's kprobe functionality. We could make sure errors are only triggered in specific call chains that we care about with very specific situations. Accomplish this with the bpf_override_funciton helper. This will modify the probe'd callers return value to the specified value and set the PC to an override function that simply returns, bypassing the originally probed function. This gives us a nice clean way to implement systematic error injection for all of our code paths. Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Josef Bacik <jbacik@fb.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-11 12:18:05 +09:00
David S. Miller	4dc6758d78	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Simple cases of overlapping changes in the packet scheduler. Must easier to resolve this time. Which probably means that I screwed it up somehow. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-10 10:00:18 +09:00
Linus Torvalds	3fefc31843	Final power management fixes for v4.14 - Prevent the schedutil cpufreq governor from using the utilization of a wrong CPU in some cases which started to happen after one of the recent changes in it (Chris Redpath). - Blacklist Dell XPS13 9360 from using the Low Power S0 Idle _DSM interface as that causes serious issue (related to NVMe) to appear on one of these machines, even though the other Dells XPS13 9360 in somewhat different HW configurations behave correctly (Rafael Wysocki). -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJaBFMsAAoJEILEb/54YlRxb8sP/jhqgCEiaUvSpFYBLklx1fqK QDbbYw3Os2JlAcMFElGjZEoV0+WBbadTHQxIxB353mpcAe1moI/SoDAJEgA9HWib tjJIt1lak3nLF967y0pOgAi9uvFCG6Rw+ZyVSrYBnT+eG6fw3h6MdHaOEFeCHJY4 qUgj+Bm0OtisW9mRFk5zdGam+6MxaE1z9mhjRAebrAjOI+46Oj/LQl7q32S2vWTp GJm6lSYOmSruo5vdU7T3dOSkBrkymf4E//1XkegFvcweEX/aITuCAil0Rl4X2PV0 NSeeCp66skftvlZ/vf3QOem1XLgPdVZ9dr7WRzOZt3CDctXuDaA3uKfRJf3oIMsi rnrsmWnfsT4sNaqV6zHYTkMtQfSsxFfjOde/HC7vAtC0oM/EECsonBtIlMUBmYlR 7fi+wtrE6ErsMgkThOvplG+cAnvWCK8yhmFBAKbcwg9UHwooF0HxXPqVwHE6M8pw eFALkJyagx1PTGKXNHOUTHfZcYGlhhJxH6SEhs/pJIxL8rVgey8mJfA9y8eGmEXP fKPnEb6yI7/9ZpzFFu3DLiHMifgc6o9F0mRqw3apeQFGtg46frYw51meP5Faq0EZ BAb0AXBF4qNB+XvORG0ABJkPqqDaZVPVfdo1SIA06IBr2mpVT1RAetMU0qSuWy+x STI04myU/9hHmCSdqzpg =F9fg -----END PGP SIGNATURE----- Merge tag 'pm-final-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull final power management fixes from Rafael Wysocki: "These fix a regression in the schedutil cpufreq governor introduced by a recent change and blacklist Dell XPS13 9360 from using the Low Power S0 Idle _DSM interface which triggers serious problems on one of these machines. Specifics: - Prevent the schedutil cpufreq governor from using the utilization of a wrong CPU in some cases which started to happen after one of the recent changes in it (Chris Redpath). - Blacklist Dell XPS13 9360 from using the Low Power S0 Idle _DSM interface as that causes serious issue (related to NVMe) to appear on one of these machines, even though the other Dells XPS13 9360 in somewhat different HW configurations behave correctly (Rafael Wysocki)" * tag 'pm-final-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / PM: Blacklist Low Power S0 Idle _DSM for Dell XPS13 9360 cpufreq: schedutil: Examine the correct CPU when we update util	2017-11-09 11:16:28 -08:00
Rafael J. Wysocki	e029b9bf12	Merge branch 'pm-cpufreq-sched' * pm-cpufreq-sched: cpufreq: schedutil: Examine the correct CPU when we update util	2017-11-09 00:07:56 +01:00
Linus Torvalds	e4880bc5df	Merge branch 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue fix from Tejun Heo: "Another fix for a really old bug. It only affects drain_workqueue() which isn't used often and even then triggers only during a pretty small race window, so it isn't too surprising that it stayed hidden for so long. The fix is straight-forward and low-risk. Kudos to Li Bin for reporting and fixing the bug" * 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: Fix NULL pointer dereference	2017-11-06 12:26:49 -08:00
Linus Torvalds	9d9cc4aa00	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Various fixes: - synchronize kernel and tooling headers - cgroup support fix - two tooling fixes" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tools/headers: Synchronize kernel ABI headers perf/cgroup: Fix perf cgroup hierarchy support perf tools: Unwind properly location after REJECT perf symbols: Fix memory corruption because of zero length symbols	2017-11-05 11:44:39 -08:00
Roman Gushchin	ebc614f687	bpf, cgroup: implement eBPF-based device controller for cgroup v2 Cgroup v2 lacks the device controller, provided by cgroup v1. This patch adds a new eBPF program type, which in combination of previously added ability to attach multiple eBPF programs to a cgroup, will provide a similar functionality, but with some additional flexibility. This patch introduces a BPF_PROG_TYPE_CGROUP_DEVICE program type. A program takes major and minor device numbers, device type (block/character) and access type (mknod/read/write) as parameters and returns an integer which defines if the operation should be allowed or terminated with -EPERM. Signed-off-by: Roman Gushchin <guro@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Tejun Heo <tj@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 23:26:51 +09:00
Jakub Kicinski	b37a530613	bpf: remove old offload/analyzer Thanks to the ability to load a program for a specific device, running verifier twice is no longer needed. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 22:26:20 +09:00
Jakub Kicinski	6c8dfe21c4	cls_bpf: allow attaching programs loaded for specific device If TC program is loaded with skip_sw flag, we should allow the device-specific programs to be accepted. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 22:26:19 +09:00
Jakub Kicinski	248f346ffe	xdp: allow attaching programs loaded for specific device Pass the netdev pointer to bpf_prog_get_type(). This way BPF code can decide whether the device matches what the code was loaded/translated for. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 22:26:19 +09:00
Jakub Kicinski	bd601b6ada	bpf: report offload info to user space Extend struct bpf_prog_info to contain information about program being bound to a device. Since the netdev may get destroyed while program still exists we need a flag to indicate the program is loaded for a device, even if the device is gone. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 22:26:18 +09:00
Jakub Kicinski	ab3f0063c4	bpf: offload: add infrastructure for loading programs for a specific netdev The fact that we don't know which device the program is going to be used on is quite limiting in current eBPF infrastructure. We have to reverse or limit the changes which kernel makes to the loaded bytecode if we want it to be offloaded to a networking device. We also have to invent new APIs for debugging and troubleshooting support. Make it possible to load programs for a specific netdev. This helps us to bring the debug information closer to the core eBPF infrastructure (e.g. we will be able to reuse the verifer log in device JIT). It allows device JITs to perform translation on the original bytecode. __bpf_prog_get() when called to get a reference for an attachment point will now refuse to give it if program has a device assigned. Following patches will add a version of that function which passes the expected netdev in. @type argument in __bpf_prog_get() is renamed to attach_type to make it clearer that it's only set on attachment. All calls to ndo_bpf are protected by rtnl, only verifier callbacks are not. We need a wait queue to make sure netdev doesn't get destroyed while verifier is still running and calling its driver. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 22:26:18 +09:00
Chris Redpath	d62d813c0d	cpufreq: schedutil: Examine the correct CPU when we update util After commit `674e75411f` (sched: cpufreq: Allow remote cpufreq callbacks) we stopped to always read the utilization for the CPU we are running the governor on, and instead we read it for the CPU which we've been told has updated utilization. This is stored in sugov_cpu->cpu. The value is set in sugov_register() but we clear it in sugov_start() which leads to always looking at the utilization of CPU0 instead of the correct one. Fix this by consolidating the initialization code into sugov_start(). Fixes: `674e75411f` (sched: cpufreq: Allow remote cpufreq callbacks) Signed-off-by: Chris Redpath <chris.redpath@arm.com> Reviewed-by: Patrick Bellasi <patrick.bellasi@arm.com> Reviewed-by: Brendan Jackman <brendan.jackman@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2017-11-04 17:44:28 +01:00
Ingo Molnar	649e441f49	Merge branch 'linus' into core/urgent, to pick up dependent commits We want to fix an objtool build warning that got introduced in the latest upstream kernel. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-11-04 08:53:04 +01:00
David S. Miller	2a171788ba	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Files removed in 'net-next' had their license header updated in 'net'. We take the remove from 'net-next'. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-04 09:26:51 +09:00
Ingo Molnar	294cbd05e3	Merge branch 'linus' into perf/urgent, to pick up dependent commits Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-11-03 12:30:12 +01:00
Craig Gallek	8c01c4f896	bpf: fix verifier NULL pointer dereference do_check() can fail early without allocating env->cur_state under memory pressure. Syzkaller found the stack below on the linux-next tree because of this. kasan: CONFIG_KASAN_INLINE enabled kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 1 PID: 27062 Comm: syz-executor5 Not tainted 4.14.0-rc7+ #106 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 task: ffff8801c2c74700 task.stack: ffff8801c3e28000 RIP: 0010:free_verifier_state kernel/bpf/verifier.c:347 [inline] RIP: 0010:bpf_check+0xcf4/0x19c0 kernel/bpf/verifier.c:4533 RSP: 0018:ffff8801c3e2f5c8 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: 00000000fffffff4 RCX: 0000000000000000 RDX: 0000000000000070 RSI: ffffffff817d5aa9 RDI: 0000000000000380 RBP: ffff8801c3e2f668 R08: 0000000000000000 R09: 1ffff100387c5d9f R10: 00000000218c4e80 R11: ffffffff85b34380 R12: ffff8801c4dc6a28 R13: 0000000000000000 R14: ffff8801c4dc6a00 R15: ffff8801c4dc6a20 FS: 00007f311079b700(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000004d4a24 CR3: 00000001cbcd0000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: bpf_prog_load+0xcbb/0x18e0 kernel/bpf/syscall.c:1166 SYSC_bpf kernel/bpf/syscall.c:1690 [inline] SyS_bpf+0xae9/0x4620 kernel/bpf/syscall.c:1652 entry_SYSCALL_64_fastpath+0x1f/0xbe RIP: 0033:0x452869 RSP: 002b:00007f311079abe8 EFLAGS: 00000212 ORIG_RAX: 0000000000000141 RAX: ffffffffffffffda RBX: 0000000000758020 RCX: 0000000000452869 RDX: 0000000000000030 RSI: 0000000020168000 RDI: 0000000000000005 RBP: 00007f311079aa20 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000212 R12: 00000000004b7550 R13: 00007f311079ab58 R14: 00000000004b7560 R15: 0000000000000000 Code: df 48 c1 ea 03 80 3c 02 00 0f 85 e6 0b 00 00 4d 8b 6e 20 48 b8 00 00 00 00 00 fc ff df 49 8d bd 80 03 00 00 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 b6 0b 00 00 49 8b bd 80 03 00 00 e8 d6 0c 26 RIP: free_verifier_state kernel/bpf/verifier.c:347 [inline] RSP: ffff8801c3e2f5c8 RIP: bpf_check+0xcf4/0x19c0 kernel/bpf/verifier.c:4533 RSP: ffff8801c3e2f5c8 ---[ end trace c8d37f339dc64004 ]--- Fixes: `638f5b90d4` ("bpf: reduce verifier memory consumption") Fixes: `1969db47f8` ("bpf: fix verifier memory leaks") Signed-off-by: Craig Gallek <kraig@google.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-03 15:49:15 +09:00
Arnd Bergmann	eba0c929d1	bpf: fix out-of-bounds access warning in bpf_check The bpf_verifer_ops array is generated dynamically and may be empty depending on configuration, which then causes an out of bounds access: kernel/bpf/verifier.c: In function 'bpf_check': kernel/bpf/verifier.c:4320:29: error: array subscript is above array bounds [-Werror=array-bounds] This adds a check to the start of the function as a workaround. I would assume that the function is never called in that configuration, so the warning is probably harmless. Fixes: `00176a34d9` ("bpf: remove the verifier ops from program structure") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-03 14:20:22 +09:00
Arnd Bergmann	7cce782ef3	bpf: fix link error without CONFIG_NET I ran into this link error with the latest net-next plus linux-next trees when networking is disabled: kernel/bpf/verifier.o:(.rodata+0x2958): undefined reference to `tc_cls_act_analyzer_ops' kernel/bpf/verifier.o:(.rodata+0x2970): undefined reference to `xdp_analyzer_ops' It seems that the code was written to deal with varying contents of the arrray, but the actual #ifdef was missing. Both tc_cls_act_analyzer_ops and xdp_analyzer_ops are defined in the core networking code, so adding a check for CONFIG_NET seems appropriate here, and I've verified this with many randconfig builds Fixes: `4f9218aaf8` ("bpf: move knowledge about post-translation offsets out of verifier") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-03 14:20:22 +09:00
Linus Torvalds	ead751507d	License cleanup: add SPDX license identifiers to some files Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a /uapi/ one with no licensing information in it, - file was a /uapi/ one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non /uapi/ files that summary was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a /uapi/ path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the /uapi/ ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------\|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWfswbQ8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ykvEwCfXU1MuYFQGgMdDmAZXEc+xFXZvqgAoKEcHDNA 6dVh26uchcEQLN/XqUDt =x306 -----END PGP SIGNATURE----- Merge tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull initial SPDX identifiers from Greg KH: "License cleanup: add SPDX license identifiers to some files Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a /uapi/ one with no licensing information in it, - file was a /uapi/ one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non /uapi/ files that summary was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a /uapi/ path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the /uapi/ ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------\|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" * tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: License cleanup: add SPDX license identifier to uapi header files with a license License cleanup: add SPDX license identifier to uapi header files with no license License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 10:04:46 -07:00
Jiri Slaby	e78c38f6bd	futex: futex_wake_op, do not fail on invalid op In commit `30d6e0a419` ("futex: Remove duplicated code and fix undefined behaviour"), I let FUTEX_WAKE_OP to fail on invalid op. Namely when op should be considered as shift and the shift is out of range (< 0 or > 31). But strace's test suite does this madness: futex(0x7fabd78bcffc, 0x5, 0xfacefeed, 0xb, 0x7fabd78bcffc, 0xa0caffee); futex(0x7fabd78bcffc, 0x5, 0xfacefeed, 0xb, 0x7fabd78bcffc, 0xbadfaced); futex(0x7fabd78bcffc, 0x5, 0xfacefeed, 0xb, 0x7fabd78bcffc, 0xffffffff); When I pick the first 0xa0caffee, it decodes as: 0x80000000 & 0xa0caffee: oparg is shift 0x70000000 & 0xa0caffee: op is FUTEX_OP_OR 0x0f000000 & 0xa0caffee: cmp is FUTEX_OP_CMP_EQ 0x00fff000 & 0xa0caffee: oparg is sign-extended 0xcaf = -849 0x00000fff & 0xa0caffee: cmparg is sign-extended 0xfee = -18 That means the op tries to do this: (futex \|= (1 << (-849))) == -18 which is completely bogus. The new check of op in the code is: if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28)) { if (oparg < 0 \|\| oparg > 31) return -EINVAL; oparg = 1 << oparg; } which results obviously in the "Invalid argument" errno: FAIL: futex =========== futex(0x7fabd78bcffc, 0x5, 0xfacefeed, 0xb, 0x7fabd78bcffc, 0xa0caffee) = -1: Invalid argument futex.test: failed test: ../futex failed with code 1 So let us soften the failure to print only a (ratelimited) message, crop the value and continue as if it were right. When userspace keeps up, we can switch this to return -EINVAL again. [v2] Do not return 0 immediatelly, proceed with the cropped value. Fixes: `30d6e0a419` ("futex: Remove duplicated code and fix undefined behaviour") Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Darren Hart <dvhart@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-11-02 07:41:50 -07:00
Greg Kroah-Hartman	b24413180f	License cleanup: add SPDX GPL-2.0 license identifier to files with no license Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a /uapi/ one with no licensing information in it, - file was a /uapi/ one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non /uapi/ files that summary was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a /uapi/ path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------\|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the /uapi/ ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------\|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2017-11-02 11:10:55 +01:00
Daniel Borkmann	5beca081be	bpf: also improve pattern matches for meta access Follow-up to `0fd4759c55` ("bpf: fix pattern matches for direct packet access") to cover also the remaining data_meta/data matches in the verifier. The matches are also refactored a bit to simplify handling of all the cases. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-02 17:01:38 +09:00
Daniel Borkmann	b06723da82	bpf: minor cleanups after merge Two minor cleanups after Dave's recent merge in `f8ddadc4db` ("Merge git://git.kernel.org...") of net into net-next in order to get the code in line with what was done originally in the net tree: i) use max() instead of max_t() since both ranges are u16, ii) don't split the direct access test cases in the middle with bpf_exit test cases from `390ee7e29f` ("bpf: enforce return code for cgroup-bpf programs"). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-02 17:01:38 +09:00
Jesper Dangaard Brouer	03c4cc385f	bpf: cpumap micro-optimization in cpu_map_enqueue Discovered that the compiler laid-out asm code in suboptimal way when studying perf report during benchmarking of cpumap. Help the compiler by the marking unlikely code paths. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-02 16:13:14 +09:00
David S. Miller	ed29668d1a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Smooth Cong Wang's bug fix into 'net-next'. Basically put the bulk of the tcf_block_put() logic from 'net' into tcf_block_put_ext(), but after the offload unbind. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-02 15:23:39 +09:00
Linus Torvalds	3a99df9a3d	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull signal bugfix from Eric Biederman: "When making the generic support for SIGEMT conditional on the presence of SIGEMT I made a typo that causes it to fail to activate. It was noticed comparatively quickly but the bug report just made it to me today" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: signal: Fix name of SIGEMT in #if defined() check	2017-11-01 16:04:27 -07:00
Andrew Clayton	c3aff086ea	signal: Fix name of SIGEMT in #if defined() check Commit `cc731525f2` ("signal: Remove kernel interal si_code magic") added a check for SIGMET and NSIGEMT being defined. That SIGMET should in fact be SIGEMT, with SIGEMT being defined in arch/{alpha,mips,sparc}/include/uapi/asm/signal.h This was actually pointed out by BenHutchings in a lwn.net comment here https://lwn.net/Comments/734608/ Fixes: `cc731525f2` ("signal: Remove kernel interal si_code magic") Signed-off-by: Andrew Clayton <andrew@digital-domain.net> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2017-11-01 17:04:57 -05:00
Don Zickus	42f930da7f	watchdog/hardlockup/perf: Use atomics to track in-use cpu counter Guenter reported: There is still a problem. When running echo 6 > /proc/sys/kernel/watchdog_thresh echo 5 > /proc/sys/kernel/watchdog_thresh repeatedly, the message NMI watchdog: Enabled. Permanently consumes one hw-PMU counter. stops after a while (after ~10-30 iterations, with fluctuations). Maybe watchdog_cpus needs to be atomic ? That's correct as this again is affected by the asynchronous nature of the smpboot thread unpark mechanism. CPU 0 CPU1 CPU2 write(watchdog_thresh, 6) stop() park() update() start() unpark() thread->unpark() cnt++; write(watchdog_thresh, 5) thread->unpark() stop() park() thread->park() cnt--; cnt++; update() start() unpark() That's not a functional problem, it just affects the informational message. Convert watchdog_cpus to atomic_t to prevent the problem Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Don Zickus <dzickus@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lkml.kernel.org/r/20171101181126.j727fqjmdthjz4xk@redhat.com	2017-11-01 21:18:40 +01:00
Thomas Gleixner	9c388a5ed1	watchdog/harclockup/perf: Revert `a33d44843d` ("watchdog/hardlockup/perf: Simplify deferred event destroy") Guenter reported a crash in the watchdog/perf code, which is caused by cleanup() and enable() running concurrently. The reason for this is: The watchdog functions are serialized via the watchdog_mutex and cpu hotplug locking, but the enable of the perf based watchdog happens in context of the unpark callback of the smpboot thread. But that unpark function is not synchronous inside the locking. The unparking of the thread just wakes it up and leaves so there is no guarantee when the thread is executing. If it starts running _before_ the cleanup happened then it will create a event and overwrite the dead event pointer. The new event is then cleaned up because the event is marked dead. lock(watchdog_mutex); lockup_detector_reconfigure(); cpus_read_lock(); stop(); park() update(); start(); unpark() cpus_read_unlock(); thread runs() overwrite dead event ptr cleanup(); free new event, which is active inside perf.... unlock(watchdog_mutex); The park side is safe as that actually waits for the thread to reach parked state. Commit `a33d44843d` removed the protection against this kind of scenario under the stupid assumption that the hotplug serialization and the watchdog_mutex cover everything. Bring it back. Reverts: `a33d44843d` ("watchdog/hardlockup/perf: Simplify deferred event destroy") Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Thomas Feels-stupid Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Don Zickus <dzickus@redhat.com> Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1710312145190.1942@nanos	2017-11-01 21:18:39 +01:00
Alexei Starovoitov	1969db47f8	bpf: fix verifier memory leaks fix verifier memory leaks Fixes: `638f5b90d4` ("bpf: reduce verifier memory consumption") Signed-off-by: Alexei Starovoitov <ast@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-01 22:07:31 +09:00
Peter Zijlstra	153fbd1226	futex: Fix more put_pi_state() vs. exit_pi_state_list() races Dmitry (through syzbot) reported being able to trigger the WARN in get_pi_state() and a use-after-free on: raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock); Both are due to this race: exit_pi_state_list() put_pi_state() lock(&curr->pi_lock) while() { pi_state = list_first_entry(head); hb = hash_futex(&pi_state->key); unlock(&curr->pi_lock); dec_and_test(&pi_state->refcount); lock(&hb->lock) lock(&pi_state->pi_mutex.wait_lock) // uaf if pi_state free'd lock(&curr->pi_lock); .... unlock(&curr->pi_lock); get_pi_state(); // WARN; refcount==0 The problem is we take the reference count too late, and don't allow it being 0. Fix it by using inc_not_zero() and simply retrying the loop when we fail to get a refcount. In that case put_pi_state() should remove the entry from the list. Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Gratian Crisan <gratian.crisan@ni.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: dvhart@infradead.org Cc: syzbot <bot+2af19c9e1ffe4d4ee1d16c56ae7580feaee75765@syzkaller.appspotmail.com> Cc: syzkaller-bugs@googlegroups.com Cc: <stable@vger.kernel.org> Fixes: `c74aef2d06` ("futex: Fix pi_state->owner serialization") Link: http://lkml.kernel.org/r/20171031101853.xpfh72y643kdfhjs@hirez.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-11-01 09:05:00 +01:00
Yonghong Song	07c41a295c	bpf: avoid rcu_dereference inside bpf_event_mutex lock region During perf event attaching/detaching bpf programs, the tp_event->prog_array change is protected by the bpf_event_mutex lock in both attaching and deteching functions. Although tp_event->prog_array is a rcu pointer, rcu_derefrence is not needed to access it since mutex lock will guarantee ordering. Verified through "make C=2" that sparse locking check still happy with the new change. Also change the label name in perf_event_{attach,detach}_bpf_prog from "out" to "unlock" to reflect the code action after the label. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-01 12:35:48 +09:00
John Fastabend	04686ef299	bpf: remove SK_REDIRECT from UAPI Now that SK_REDIRECT is no longer a valid return code. Remove it from the UAPI completely. Then do a namespace remapping internal to sockmap so SK_REDIRECT is no longer externally visible. Patchs primary change is to do a namechange from SK_REDIRECT to __SK_REDIRECT Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-01 11:43:50 +09:00
Alexei Starovoitov	638f5b90d4	bpf: reduce verifier memory consumption the verifier got progressively smarter over time and size of its internal state grew as well. Time to reduce the memory consumption. Before: sizeof(struct bpf_verifier_state) = 6520 After: sizeof(struct bpf_verifier_state) = 896 It's done by observing that majority of BPF programs use little to no stack whereas verifier kept all of 512 stack slots ready always. Instead dynamically reallocate struct verifier state when stack access is detected. Runtime difference before vs after is within a noise. The number of processed instructions stays the same. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-01 11:41:18 +09:00
Li Bin	cef572ad9b	workqueue: Fix NULL pointer dereference When queue_work() is used in irq (not in task context), there is a potential case that trigger NULL pointer dereference. ---------------------------------------------------------------- worker_thread() \|-spin_lock_irq() \|-process_one_work() \|-worker->current_pwq = pwq \|-spin_unlock_irq() \|-worker->current_func(work) \|-spin_lock_irq() \|-worker->current_pwq = NULL \|-spin_unlock_irq() //interrupt here \|-irq_handler \|-__queue_work() //assuming that the wq is draining \|-is_chained_work(wq) \|-current_wq_worker() //Here, 'current' is the interrupted worker! \|-current->current_pwq is NULL here! \|-schedule() ---------------------------------------------------------------- Avoid it by checking for task context in current_wq_worker(), and if not in task context, we shouldn't use the 'current' to check the condition. Reported-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Li Bin <huawei.libin@huawei.com> Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org> Fixes: `8d03ecfe47` ("workqueue: reimplement is_chained_work() using current_wq_worker()") Cc: stable@vger.kernel.org # v3.9+	2017-10-30 07:56:01 -07:00
David S. Miller	e1ea2f9856	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Several conflicts here. NFP driver bug fix adding nfp_netdev_is_nfp_repr() check to nfp_fl_output() needed some adjustments because the code block is in an else block now. Parallel additions to net/pkt_cls.h and net/sch_generic.h A bug fix in __tcp_retransmit_skb() conflicted with some of the rbtree changes in net-next. The tc action RCU callback fixes in 'net' had some overlap with some of the recent tcf_block reworking. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-30 21:09:24 +09:00
Tejun Heo	be96b316de	perf/cgroup: Fix perf cgroup hierarchy support The following commit: `864c2357ca` ("perf/core: Do not set cpuctx->cgrp for unscheduled cgroups") made list_update_cgroup_event() skip setting cpuctx->cgrp if no cgroup event targets %current's cgroup. This breaks perf_event's hierarchical support because events which target one of the ancestors get ignored. Fix it by using cgroup_is_descendant() test instead of equality. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: David Carrillo-Cisneros <davidcc@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-team@fb.com Cc: stable@vger.kernel.org # v4.9+ Fixes: `864c2357ca` ("perf/core: Do not set cpuctx->cgrp for unscheduled cgroups") Link: http://lkml.kernel.org/r/20171028164237.GA972780@devbig577.frc2.facebook.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-10-30 11:58:51 +01:00
Linus Torvalds	19e12196da	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Fix route leak in xfrm_bundle_create(). 2) In mac80211, validate user rate mask before configuring it. From Johannes Berg. 3) Properly enforce memory limits in fair queueing code, from Toke Hoiland-Jorgensen. 4) Fix lockdep splat in inet_csk_route_req(), from Eric Dumazet. 5) Fix TSO header allocation and management in mvpp2 driver, from Yan Markman. 6) Don't take socket lock in BH handler in strparser code, from Tom Herbert. 7) Don't show sockets from other namespaces in AF_UNIX code, from Andrei Vagin. 8) Fix double free in error path of tap_open(), from Girish Moodalbail. 9) Fix TX map failure path in igb and ixgbe, from Jean-Philippe Brucker and Alexander Duyck. 10) Fix DCB mode programming in stmmac driver, from Jose Abreu. 11) Fix err_count handling in various tunnels (ipip, ip6_gre). From Xin Long. 12) Properly align SKB head before building SKB in tuntap, from Jason Wang. 13) Avoid matching qdiscs with a zero handle during lookups, from Cong Wang. 14) Fix various endianness bugs in sctp, from Xin Long. 15) Fix tc filter callback races and add selftests which trigger the problem, from Cong Wang. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (73 commits) selftests: Introduce a new test case to tc testsuite selftests: Introduce a new script to generate tc batch file net_sched: fix call_rcu() race on act_sample module removal net_sched: add rtnl assertion to tcf_exts_destroy() net_sched: use tcf_queue_work() in tcindex filter net_sched: use tcf_queue_work() in rsvp filter net_sched: use tcf_queue_work() in route filter net_sched: use tcf_queue_work() in u32 filter net_sched: use tcf_queue_work() in matchall filter net_sched: use tcf_queue_work() in fw filter net_sched: use tcf_queue_work() in flower filter net_sched: use tcf_queue_work() in flow filter net_sched: use tcf_queue_work() in cgroup filter net_sched: use tcf_queue_work() in bpf filter net_sched: use tcf_queue_work() in basic filter net_sched: introduce a workqueue for RCU callbacks of tc filter sctp: fix some type cast warnings introduced since very beginning sctp: fix a type cast warnings that causes a_rwnd gets the wrong value sctp: fix some type cast warnings introduced by transport rhashtable sctp: fix some type cast warnings introduced by stream reconf ...	2017-10-29 08:11:49 -07:00
John Fastabend	bfa640757e	bpf: rename sk_actions to align with bpf infrastructure Recent additions to support multiple programs in cgroups impose a strict requirement, "all yes is yes, any no is no". To enforce this the infrastructure requires the 'no' return code, SK_DROP in this case, to be 0. To apply these rules to SK_SKB program types the sk_actions return codes need to be adjusted. This fix adds SK_PASS and makes 'SK_DROP = 0'. Finally, remove SK_ABORTED to remove any chance that the API may allow aborted program flows to be passed up the stack. This would be incorrect behavior and allow programs to break existing policies. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-29 11:18:48 +09:00
John Fastabend	8108a77515	bpf: bpf_compute_data uses incorrect cb structure SK_SKB program types use bpf_compute_data to store the end of the packet data. However, bpf_compute_data assumes the cb is stored in the qdisc layer format. But, for SK_SKB this is the wrong layer of the stack for this type. It happens to work (sort of!) because in most cases nothing happens to be overwritten today. This is very fragile and error prone. Fortunately, we have another hole in tcp_skb_cb we can use so lets put the data_end value there. Note, SK_SKB program types do not use data_meta, they are failed by sk_skb_is_valid_access(). Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-29 11:18:48 +09:00
Gianluca Borello	035226b964	bpf: remove tail_call and get_stackid helper declarations from bpf.h commit `afdb09c720` ("security: bpf: Add LSM hooks for bpf object related syscall") included linux/bpf.h in linux/security.h. As a result, bpf programs including bpf_helpers.h and some other header that ends up pulling in also security.h, such as several examples under samples/bpf, fail to compile because bpf_tail_call and bpf_get_stackid are now "redefined as different kind of symbol". >From bpf.h: u64 bpf_tail_call(u64 ctx, u64 r2, u64 index, u64 r4, u64 r5); u64 bpf_get_stackid(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5); Whereas in bpf_helpers.h they are: static void (bpf_tail_call)(void ctx, void map, int index); static int (bpf_get_stackid)(void ctx, void map, int flags); Fix this by removing the unused declaration of bpf_tail_call and moving the declaration of bpf_get_stackid in bpf_trace.c, which is the only place where it's needed. Signed-off-by: Gianluca Borello <g.borello@gmail.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-27 22:14:22 +09:00
Yonghong Song	e87c6bc385	bpf: permit multiple bpf attachments for a single perf event This patch enables multiple bpf attachments for a kprobe/uprobe/tracepoint single trace event. Each trace_event keeps a list of attached perf events. When an event happens, all attached bpf programs will be executed based on the order of attachment. A global bpf_event_mutex lock is introduced to protect prog_array attaching and detaching. An alternative will be introduce a mutex lock in every trace_event_call structure, but it takes a lot of extra memory. So a global bpf_event_mutex lock is a good compromise. The bpf prog detachment involves allocation of memory. If the allocation fails, a dummy do-nothing program will replace to-be-detached program in-place. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-25 10:47:47 +09:00
Yonghong Song	0b4c6841fe	bpf: use the same condition in perf event set/free bpf handler This is a cleanup such that doing the same check in perf_event_free_bpf_prog as we already do in perf_event_set_bpf_prog step. Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-25 10:47:46 +09:00
Jesper Dangaard Brouer	31749468c3	bpf: cpumap fix potential lost wake-up problem As pointed out by Michael, commit `1c601d829a` ("bpf: cpumap xdp_buff to skb conversion and allocation") contains a classical example of the potential lost wake-up problem. We need to recheck the condition __ptr_ring_empty() after changing current->state to TASK_INTERRUPTIBLE, this avoids a race between wake_up_process() and schedule(). After this, a race with wake_up_process() will simply change the state to TASK_RUNNING, and the schedule() call not really put us to sleep. Fixes: `1c601d829a` ("bpf: cpumap xdp_buff to skb conversion and allocation") Reported-by: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-24 18:40:22 +09:00
Linus Torvalds	06987dad0a	Merge branch 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue fix from Tejun Heo: "This is a fix for an old bug in workqueue. Workqueue used a mutex to arbitrate who gets to be the manager of a pool. When the manager role gets released, the mutex gets unlocked while holding the pool's irqsafe spinlock. This can lead to deadlocks as mutex's internal spinlock isn't irqsafe. This got discovered by recent fixes to mutex lockdep annotations. The fix is a bit invasive for rc6 but if anything were wrong with the fix it would likely have already blown up in -next, and we want the fix in -stable anyway" * 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: replace pool->manager_arb mutex with a flag	2017-10-23 11:24:52 -04:00

1 2 3 4 5 ...

25972 Commits