linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-16 16:54:20 +08:00

Author	SHA1	Message	Date
Waiman Long	e9737301f0	selftests/vm: make charge_reserved_hugetlb.sh work with existing cgroup setting [ Upstream commit `209376ed2a` ] The hugetlb cgroup reservation test charge_reserved_hugetlb.sh assume that no cgroup filesystems are mounted before running the test. That is not true in many cases. As a result, the test fails to run. Fix that by querying the current cgroup mount setting and using the existing cgroup setup instead before attempting to freshly mount a cgroup filesystem. Similar change is also made for hugetlb_reparenting_test.sh as well, though it still has problem if cgroup v2 isn't used. The patched test scripts were run on a centos 8 based system to verify that they ran properly. Link: https://lkml.kernel.org/r/20220106201359.1646575-1-longman@redhat.com Fixes: `29750f71a9` ("hugetlb_cgroup: add hugetlb_cgroup reservation tests") Signed-off-by: Waiman Long <longman@redhat.com> Acked-by: Mina Almasry <almasrymina@google.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:38 +01:00
Andrey Konovalov	1123c2fb9d	kasan: fix quarantine conflicting with init_on_free [ Upstream commit `26dca996ea` ] KASAN's quarantine might save its metadata inside freed objects. As this happens after the memory is zeroed by the slab allocator when init_on_free is enabled, the memory coming out of quarantine is not properly zeroed. This causes lib/test_meminit.c tests to fail with Generic KASAN. Zero the metadata when the object is removed from quarantine. Link: https://lkml.kernel.org/r/2805da5df4b57138fdacd671f5d227d58950ba54.1640037083.git.andreyknvl@google.com Fixes: `6471384af2` ("mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options") Signed-off-by: Andrey Konovalov <andreyknvl@google.com> Reviewed-by: Marco Elver <elver@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:38 +01:00
Kefeng Wang	f1675103e0	mm: defer kmemleak object creation of module_alloc() [ Upstream commit `60115fa54a` ] Yongqiang reports a kmemleak panic when module insmod/rmmod with KASAN enabled(without KASAN_VMALLOC) on x86[1]. When the module area allocates memory, it's kmemleak_object is created successfully, but the KASAN shadow memory of module allocation is not ready, so when kmemleak scan the module's pointer, it will panic due to no shadow memory with KASAN check. module_alloc __vmalloc_node_range kmemleak_vmalloc kmemleak_scan update_checksum kasan_module_alloc kmemleak_ignore Note, there is no problem if KASAN_VMALLOC enabled, the modules area entire shadow memory is preallocated. Thus, the bug only exits on ARCH which supports dynamic allocation of module area per module load, for now, only x86/arm64/s390 are involved. Add a VM_DEFER_KMEMLEAK flags, defer vmalloc'ed object register of kmemleak in module_alloc() to fix this issue. [1] https://lore.kernel.org/all/6d41e2b9-4692-5ec4-b1cd-cbe29ae89739@huawei.com/ [wangkefeng.wang@huawei.com: fix build] Link: https://lkml.kernel.org/r/20211125080307.27225-1-wangkefeng.wang@huawei.com [akpm@linux-foundation.org: simplify ifdefs, per Andrey] Link: https://lkml.kernel.org/r/CA+fCnZcnwJHUQq34VuRxpdoY6_XbJCDJ-jopksS5Eia4PijPzw@mail.gmail.com Link: https://lkml.kernel.org/r/20211124142034.192078-1-wangkefeng.wang@huawei.com Fixes: `793213a82d` ("s390/kasan: dynamic shadow mem allocation for modules") Fixes: `39d114ddc6` ("arm64: add KASAN support") Fixes: `bebf56a1b1` ("kasan: enable instrumentation of global variables") Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reported-by: Yongqiang Liu <liuyongqiang13@huawei.com> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Alexander Potapenko <glider@google.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:38 +01:00
Xiaoke Wang	013c2af6c1	tracing/probes: check the return value of kstrndup() for pbuf [ Upstream commit `1c1857d400` ] kstrndup() is a memory allocation-related function, it returns NULL when some internal memory errors happen. It is better to check the return value of it so to catch the memory error in time. Link: https://lkml.kernel.org/r/tencent_4D6E270731456EB88712ED7F13883C334906@qq.com Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Fixes: `a42e3c4de9` ("tracing/probe: Add immediate string parameter support") Signed-off-by: Xiaoke Wang <xkernel.wang@foxmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:38 +01:00
Xiaoke Wang	8a20fed48e	tracing/uprobes: Check the return value of kstrdup() for tu->filename [ Upstream commit `8c72242455` ] kstrdup() returns NULL when some internal memory errors happen, it is better to check the return value of it so to catch the memory error in time. Link: https://lkml.kernel.org/r/tencent_3C2E330722056D7891D2C83F29C802734B06@qq.com Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Fixes: `33ea4b2427` ("perf/core: Implement the 'perf_uprobe' PMU") Signed-off-by: Xiaoke Wang <xkernel.wang@foxmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:38 +01:00
Weizhao Ouyang	1a62246c2c	dma-buf: cma_heap: Fix mutex locking section [ Upstream commit `54329e6f7b` ] Fix cma_heap_buffer mutex locking critical section to protect vmap_cnt and vaddr. Fixes: `a5d2d29e24` ("dma-buf: heaps: Move heap-helper logic into the cma_heap implementation") Signed-off-by: Weizhao Ouyang <o451686892@gmail.com> Acked-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20220104073545.124244-1-o451686892@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:37 +01:00
Tom Rix	8654464086	i3c: master: dw: check return of dw_i3c_master_get_free_pos() [ Upstream commit `13462ba181` ] Clang static analysis reports this problem dw-i3c-master.c:799:9: warning: The result of the left shift is undefined because the left operand is negative COMMAND_PORT_DEV_INDEX(pos) \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~ pos can be negative because dw_i3c_master_get_free_pos() can return an error. So check for an error. Fixes: `1dd728f5d4` ("i3c: master: Add driver for Synopsys DesignWare IP") Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20220108150948.3988790-1-trix@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:37 +01:00
Guchun Chen	a80b13642a	drm/amdgpu: use spin_lock_irqsave to avoid deadlock by local interrupt [ Upstream commit `2096b74b1d` ] This is observed in SRIOV case with virtual KMS as display. _raw_spin_lock_irqsave+0x37/0x40 drm_handle_vblank+0x69/0x350 [drm] ? try_to_wake_up+0x432/0x5c0 ? amdgpu_vkms_prepare_fb+0x1c0/0x1c0 [amdgpu] drm_crtc_handle_vblank+0x17/0x20 [drm] amdgpu_vkms_vblank_simulate+0x4d/0x80 [amdgpu] __hrtimer_run_queues+0xfb/0x230 hrtimer_interrupt+0x109/0x220 __sysvec_apic_timer_interrupt+0x64/0xe0 asm_call_irq_on_stack+0x12/0x20 Fixes: `84ec374bd5` ("drm/amdgpu: create amdgpu_vkms (v4)") Signed-off-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Kelly Zytaruk <kelly.zytaruk@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:37 +01:00
Jiasheng Jiang	5609b78039	drm/amdkfd: Check for null pointer after calling kmemdup [ Upstream commit `abfaf0eee9` ] As the possible failure of the allocation, kmemdup() may return NULL pointer. Therefore, it should be better to check the 'props2' in order to prevent the dereference of NULL pointer. Fixes: `3a87177eb1` ("drm/amdkfd: Add topology support for dGPUs") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:37 +01:00
Wesley Sheng	950d17f190	ntb_hw_switchtec: Fix bug with more than 32 partitions [ Upstream commit `7ff351c86b` ] Switchtec could support as mush as 48 partitions, but ffs & fls are for 32 bit argument, in case of partition index larger than 31, the current code could not parse the peer partition index correctly. Change to the 64 bit version __ffs64 & fls64 accordingly to fix this bug. Fixes: `3df54c870f` ("ntb_hw_switchtec: Allow using Switchtec NTB in multi-partition setups") Signed-off-by: Wesley Sheng <wesley.sheng@microchip.com> Signed-off-by: Kelvin Cao <kelvin.cao@microchip.com> Signed-off-by: Jon Mason <jdmason@kudzu.us> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:37 +01:00
Jeremy Pallotta	377cbdc927	ntb_hw_switchtec: Fix pff ioread to read into mmio_part_cfg_all [ Upstream commit `32c3d375b0` ] Array mmio_part_cfg_all holds the partition configuration of all partitions, with partition number as index. Fix this by reading into mmio_part_cfg_all for pff. Fixes: `0ee28f26f3` ("NTB: switchtec_ntb: Add link management") Signed-off-by: Jeremy Pallotta <jmpallotta@gmail.com> Signed-off-by: Kelvin Cao <kelvin.cao@microchip.com> Signed-off-by: Jon Mason <jdmason@kudzu.us> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:37 +01:00
Liu Ying	cd07b19fbf	drm/atomic: Check new_crtc_state->active to determine if CRTC needs disable in self refresh mode [ Upstream commit `69e630016e` ] Actual hardware state of CRTC is controlled by the member 'active' in struct drm_crtc_state instead of the member 'enable', according to the kernel doc of the member 'enable'. In fact, the drm client modeset and atomic helpers are using the member 'active' to do the control. Referencing the member 'enable' of new_crtc_state, the function crtc_needs_disable() may fail to reflect if CRTC needs disable in self refresh mode, e.g., when the framebuffer emulation will be blanked through the client modeset helper with the next commit, the member 'enable' of new_crtc_state is still true while the member 'active' is false, hence the relevant potential encoder and bridges won't be disabled. So, let's check new_crtc_state->active to determine if CRTC needs disable in self refresh mode instead of new_crtc_state->enable. Fixes: `1452c25b0e` ("drm: Add helpers to kick off self refresh mode in drivers") Cc: Sean Paul <seanpaul@chromium.org> Cc: Rob Clark <robdclark@chromium.org> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Liu Ying <victor.liu@nxp.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211230040626.646807-1-victor.liu@nxp.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:37 +01:00
Miaoqian Lin	1796d5350c	drm/sun4i: dw-hdmi: Fix missing put_device() call in sun8i_hdmi_phy_get [ Upstream commit `c71af3dae3` ] The reference taken by 'of_find_device_by_node()' must be released when not needed anymore. Add the corresponding 'put_device()' in the error handling path. Fixes: `9bf3797796` ("drm/sun4i: dw-hdmi: Make HDMI PHY into a platform device") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20220107083633.20843-1-linmq006@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:37 +01:00
Chuck Lever	e209742c13	SUNRPC: Fix sockaddr handling in svcsock_accept_class trace points [ Upstream commit `1672086167` ] Avoid potentially hazardous memory copying and the needless use of "%pIS" -- in the kernel, an RPC service listener is always bound to ANYADDR. Having the network namespace is helpful when recording errors, though. Fixes: `a0469f46fa` ("SUNRPC: Replace dprintk call sites in TCP state change callouts") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:37 +01:00
Chuck Lever	bdaa8c7b71	SUNRPC: Fix sockaddr handling in the svc_xprt_create_error trace point [ Upstream commit `dc6c6fb3d6` ] While testing, I got an unexpected KASAN splat: Jan 08 13:50:27 oracle-102.nfsv4.dev kernel: BUG: KASAN: stack-out-of-bounds in trace_event_raw_event_svc_xprt_create_err+0x190/0x210 [sunrpc] Jan 08 13:50:27 oracle-102.nfsv4.dev kernel: Read of size 28 at addr ffffc9000008f728 by task mount.nfs/4628 The memcpy() in the TP_fast_assign section of this trace point copies the size of the destination buffer in order that the buffer won't be overrun. In other similar trace points, the source buffer for this memcpy is a "struct sockaddr_storage" so the actual length of the source buffer is always long enough to prevent the memcpy from reading uninitialized or unallocated memory. However, for this trace point, the source buffer can be as small as a "struct sockaddr_in". For AF_INET sockaddrs, the memcpy() reads memory that follows the source buffer, which is not always valid memory. To avoid copying past the end of the passed-in sockaddr, make the source address's length available to the memcpy(). It would be a little nicer if the tracing infrastructure was more friendly about storing socket addresses that are not AF_INET, but I could not find a way to make printk("%pIS") work with a dynamic array. Reported-by: KASAN Fixes: `4b8f380e46` ("SUNRPC: Tracepoint to record errors in svc_xpo_create()") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:37 +01:00
Matthew Auld	d3f67ceaeb	drm/i915: don't call free_mmap_offset when purging [ Upstream commit `4c2602ba8d` ] The TTM backend is in theory the only user here(also purge should only be called once we have dropped the pages), where it is setup at object creation and is only removed once the object is destroyed. Also resetting the node here might be iffy since the ttm fault handler uses the stored fake offset to determine the page offset within the pages array. This also blows up in the dontneed-before-mmap test, since the expectation is that the vma_node will live on, until the object is destroyed: <2> [749.062902] kernel BUG at drivers/gpu/drm/i915/gem/i915_gem_ttm.c:943! <4> [749.062923] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI <4> [749.062928] CPU: 0 PID: 1643 Comm: gem_madvise Tainted: G U W 5.16.0-rc8-CI-CI_DRM_11046+ #1 <4> [749.062933] Hardware name: Gigabyte Technology Co., Ltd. GB-Z390 Garuda/GB-Z390 Garuda-CF, BIOS IG1c 11/19/2019 <4> [749.062937] RIP: 0010:i915_ttm_mmap_offset.cold.35+0x5b/0x5d [i915] <4> [749.063044] Code: 00 48 c7 c2 a0 23 4e a0 48 c7 c7 26 df 4a a0 e8 95 1d d0 e0 bf 01 00 00 00 e8 8b ec cf e0 31 f6 bf 09 00 00 00 e8 5f 30 c0 e0 <0f> 0b 48 c7 c1 24 4b 56 a0 ba 5b 03 00 00 48 c7 c6 c0 23 4e a0 48 <4> [749.063052] RSP: 0018:ffffc90002ab7d38 EFLAGS: 00010246 <4> [749.063056] RAX: 0000000000000240 RBX: ffff88811f2e61c0 RCX: 0000000000000006 <4> [749.063060] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000009 <4> [749.063063] RBP: ffffc90002ab7e58 R08: 0000000000000001 R09: 0000000000000001 <4> [749.063067] R10: 000000000123d0f8 R11: ffffc90002ab7b20 R12: ffff888112a1a000 <4> [749.063071] R13: 0000000000000004 R14: ffff88811f2e61c0 R15: ffff888112a1a000 <4> [749.063074] FS: 00007f6e5fcad500(0000) GS:ffff8884ad600000(0000) knlGS:0000000000000000 <4> [749.063078] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4> [749.063081] CR2: 00007efd264e39f0 CR3: 0000000115fd6005 CR4: 00000000003706f0 <4> [749.063085] Call Trace: <4> [749.063087] <TASK> <4> [749.063089] __assign_mmap_offset+0x41/0x300 [i915] <4> [749.063171] __assign_mmap_offset_handle+0x159/0x270 [i915] <4> [749.063248] ? i915_gem_dumb_mmap_offset+0x70/0x70 [i915] <4> [749.063325] drm_ioctl_kernel+0xae/0x140 <4> [749.063330] drm_ioctl+0x201/0x3d0 <4> [749.063333] ? i915_gem_dumb_mmap_offset+0x70/0x70 [i915] <4> [749.063409] ? do_user_addr_fault+0x200/0x670 <4> [749.063415] __x64_sys_ioctl+0x6d/0xa0 <4> [749.063419] do_syscall_64+0x3a/0xb0 <4> [749.063423] entry_SYSCALL_64_after_hwframe+0x44/0xae <4> [749.063428] RIP: 0033:0x7f6e5f100317 Testcase: igt/gem_madvise/dontneed-before-mmap Fixes: `cf3e3e86d7` ("drm/i915: Use ttm mmap handling for ttm bo's.") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220106174910.280616-1-matthew.auld@intel.com (cherry picked from commit `658a0c6326`) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:36 +01:00
Vitaly Kuznetsov	e47679c06a	x86/hyperv: Properly deal with empty cpumasks in hyperv_flush_tlb_multi() [ Upstream commit `51500b71d5` ] KASAN detected the following issue: BUG: KASAN: slab-out-of-bounds in hyperv_flush_tlb_multi+0xf88/0x1060 Read of size 4 at addr ffff8880011ccbc0 by task kcompactd0/33 CPU: 1 PID: 33 Comm: kcompactd0 Not tainted 5.14.0-39.el9.x86_64+debug #1 Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.0 12/17/2019 Call Trace: dump_stack_lvl+0x57/0x7d print_address_description.constprop.0+0x1f/0x140 ? hyperv_flush_tlb_multi+0xf88/0x1060 __kasan_report.cold+0x7f/0x11e ? hyperv_flush_tlb_multi+0xf88/0x1060 kasan_report+0x38/0x50 hyperv_flush_tlb_multi+0xf88/0x1060 flush_tlb_mm_range+0x1b1/0x200 ptep_clear_flush+0x10e/0x150 ... Allocated by task 0: kasan_save_stack+0x1b/0x40 __kasan_kmalloc+0x7c/0x90 hv_common_init+0xae/0x115 hyperv_init+0x97/0x501 apic_intr_mode_init+0xb3/0x1e0 x86_late_time_init+0x92/0xa2 start_kernel+0x338/0x3eb secondary_startup_64_no_verify+0xc2/0xcb The buggy address belongs to the object at ffff8880011cc800 which belongs to the cache kmalloc-1k of size 1024 The buggy address is located 960 bytes inside of 1024-byte region [ffff8880011cc800, ffff8880011ccc00) 'hyperv_flush_tlb_multi+0xf88/0x1060' points to hv_cpu_number_to_vp_number() and '960 bytes' means we're trying to get VP_INDEX for CPU#240. 'nr_cpus' here is exactly 240 so we're trying to access past hv_vp_index's last element. This can (and will) happen when 'cpus' mask is empty and cpumask_last() will return '>=nr_cpus'. Commit `ad0a6bad44` ("x86/hyperv: check cpu mask after interrupt has been disabled") tried to deal with empty cpumask situation but apparently didn't fully fix the issue. 'cpus' cpumask which is passed to hyperv_flush_tlb_multi() is 'mm_cpumask(mm)' (which is '&mm->cpu_bitmap'). This mask changes every time the particular mm is scheduled/unscheduled on some CPU (see switch_mm_irqs_off()), disabling IRQs on the CPU which is performing remote TLB flush has zero influence on whether the particular process can get scheduled/unscheduled on _other_ CPUs so e.g. in the case where the mm was scheduled on one other CPU and got unscheduled during hyperv_flush_tlb_multi()'s execution will lead to cpumask becoming empty. It doesn't seem that there's a good way to protect 'mm_cpumask(mm)' from changing during hyperv_flush_tlb_multi()'s execution. It would be possible to copy it in the very beginning of the function but this is a waste. It seems we can deal with changing cpumask just fine. When 'cpus' cpumask changes during hyperv_flush_tlb_multi()'s execution, there are two possible issues: - 'Under-flushing': we will not flush TLB on a CPU which got added to the mask while hyperv_flush_tlb_multi() was already running. This is not a problem as this is equal to mm getting scheduled on that CPU right after TLB flush. - 'Over-flushing': we may flush TLB on a CPU which is already cleared from the mask. First, extra TLB flush preserves correctness. Second, Hyper-V's TLB flush hypercall takes 'mm->pgd' argument so Hyper-V may avoid the flush if CR3 doesn't match. Fix the immediate issue with cpumask_last()/hv_cpu_number_to_vp_number() and remove the pointless cpumask_empty() check from the beginning of the function as it really doesn't protect anything. Also, avoid the hypercall altogether when 'flush->processor_mask' ends up being empty. Fixes: `ad0a6bad44` ("x86/hyperv: check cpu mask after interrupt has been disabled") Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20220106094611.1404218-1-vkuznets@redhat.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:36 +01:00
J. Bruce Fields	4425ca3677	nfsd: fix crash on COPY_NOTIFY with special stateid [ Upstream commit `074b07d94e` ] RTM says "If the special ONE stateid is passed to nfs4_preprocess_stateid_op(), it returns status=0 but does not set *cstid. nfsd4_copy_notify() depends on stid being set if status=0, and thus can crash if the client sends the right COPY_NOTIFY RPC." RFC 7862 says "The cna_src_stateid MUST refer to either open or locking states provided earlier by the server. If it is invalid, then the operation MUST fail." The RFC doesn't specify an error, and the choice doesn't matter much as this is clearly illegal client behavior, but bad_stateid seems reasonable. Simplest is just to guarantee that nfs4_preprocess_stateid_op, called with non-NULL cstid, errors out if it can't return a stateid. Reported-by: rtm@csail.mit.edu Fixes: `624322f1ad` ("NFSD add COPY_NOTIFY operation") Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Olga Kornievskaia <kolga@netapp.com> Tested-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:36 +01:00
Chuck Lever	0f84cfb465	Revert "nfsd: skip some unnecessary stats in the v4 case" [ Upstream commit `58f258f652` ] On the wire, I observed NFSv4 OPEN(CREATE) operations sometimes returning a reasonable-looking value in the cinfo.before field and zero in the cinfo.after field. RFC 8881 Section 10.8.1 says: > When a client is making changes to a given directory, it needs to > determine whether there have been changes made to the directory by > other clients. It does this by using the change attribute as > reported before and after the directory operation in the associated > change_info4 value returned for the operation. and > ... The post-operation change > value needs to be saved as the basis for future change_info4 > comparisons. A good quality client implementation therefore saves the zero cinfo.after value. During a subsequent OPEN operation, it will receive a different non-zero value in the cinfo.before field for that directory, and it will incorrectly believe the directory has changed, triggering an undesirable directory cache invalidation. There are filesystem types where fs_supports_change_attribute() returns false, tmpfs being one. On NFSv4 mounts, this means the fh_getattr() call site in fill_pre_wcc() and fill_post_wcc() is never invoked. Subsequently, nfsd4_change_attribute() is invoked with an uninitialized @stat argument. In fill_pre_wcc(), @stat contains stale stack garbage, which is then placed on the wire. In fill_post_wcc(), ->fh_post_wc is all zeroes, so zero is placed on the wire. Both of these values are meaningless. This fix can be applied immediately to stable kernels. Once there are more regression tests in this area, this optimization can be attempted again. Fixes: `428a23d2bf` ("nfsd: skip some unnecessary stats in the v4 case") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:36 +01:00
Chuck Lever	3abe2a70f5	NFSD: Fix verifier returned in stable WRITEs [ Upstream commit `f11ad7aa65` ] RFC 8881 explains the purpose of the write verifier this way: > The final portion of the result is the field writeverf. This field > is the write verifier and is a cookie that the client can use to > determine whether a server has changed instance state (e.g., server > restart) between a call to WRITE and a subsequent call to either > WRITE or COMMIT. But then it says: > This cookie MUST be unchanged during a single instance of the > NFSv4.1 server and MUST be unique between instances of the NFSv4.1 > server. If the cookie changes, then the client MUST assume that > any data written with an UNSTABLE4 value for committed and an old > writeverf in the reply has been lost and will need to be > recovered. RFC 1813 has similar language for NFSv3. NFSv2 does not have a write verifier since it doesn't implement the COMMIT procedure. Since commit `19e0663ff9` ("nfsd: Ensure sampling of the write verifier is atomic with the write"), the Linux NFS server has returned a boot-time-based verifier for UNSTABLE WRITEs, but a zero verifier for FILE_SYNC and DATA_SYNC WRITEs. FILE_SYNC and DATA_SYNC WRITEs are not followed up with a COMMIT, so there's no need for clients to compare verifiers for stable writes. However, by returning a different verifier for stable and unstable writes, the above commit puts the Linux NFS server a step farther out of compliance with the first MUST above. At least one NFS client (FreeBSD) noticed the difference, making this a potential regression. Reported-by: Rick Macklem <rmacklem@uoguelph.ca> Link: https://lore.kernel.org/linux-nfs/YQXPR0101MB096857EEACF04A6DF1FC6D9BDD749@YQXPR0101MB0968.CANPRD01.PROD.OUTLOOK.COM/T/ Fixes: `19e0663ff9` ("nfsd: Ensure sampling of the write verifier is atomic with the write") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:36 +01:00
Pali Rohár	e7c4332703	PCI: mvebu: Fix support for DEVCAP2, DEVCTL2 and LNKCTL2 registers on emulated bridge [ Upstream commit `4ab34548c5` ] Armada XP and new hardware supports access to DEVCAP2, DEVCTL2 and LNKCTL2 configuration registers of PCIe core via PCIE_CAP_PCIEXP. So export them via emulated software root bridge. Pre-XP hardware does not support these registers and returns zeros. Link: https://lore.kernel.org/r/20211125124605.25915-16-pali@kernel.org Fixes: `1f08673eef` ("PCI: mvebu: Convert to PCI emulated bridge config space") Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:36 +01:00
Pali Rohár	a247456733	PCI: mvebu: Fix support for PCI_EXP_RTSTA on emulated bridge [ Upstream commit `838ff44a39` ] PME Status bit in Root Status Register (PCIE_RC_RTSTA_OFF) is read-only and can be cleared only by writing 0b to the Interrupt Cause RW0C register (PCIE_INT_CAUSE_OFF). Link: https://lore.kernel.org/r/20211125124605.25915-15-pali@kernel.org Fixes: `1f08673eef` ("PCI: mvebu: Convert to PCI emulated bridge config space") Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:36 +01:00
Pali Rohár	1d4200e284	PCI: mvebu: Fix support for PCI_EXP_DEVCTL on emulated bridge [ Upstream commit `ecae073e39` ] Comment in Armada 370 functional specification is misleading. PCI_EXP_DEVCTL_*RE bits are supported and configures receiving of error interrupts. Link: https://lore.kernel.org/r/20211125124605.25915-14-pali@kernel.org Fixes: `1f08673eef` ("PCI: mvebu: Convert to PCI emulated bridge config space") Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:36 +01:00
Pali Rohár	1ea3f69784	PCI: mvebu: Fix support for PCI_BRIDGE_CTL_BUS_RESET on emulated bridge [ Upstream commit `d75404cc08` ] Hardware supports PCIe Hot Reset via PCIE_CTRL_OFF register. Use it for implementing PCI_BRIDGE_CTL_BUS_RESET bit of PCI_BRIDGE_CONTROL register on emulated bridge. With this change the function pci_reset_secondary_bus() starts working and can reset connected PCIe card. Link: https://lore.kernel.org/r/20211125124605.25915-13-pali@kernel.org Fixes: `1f08673eef` ("PCI: mvebu: Convert to PCI emulated bridge config space") Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:36 +01:00
Pali Rohár	9c91c75500	PCI: mvebu: Setup PCIe controller to Root Complex mode [ Upstream commit `df08ac0161` ] This driver operates only in Root Complex mode, so ensure that hardware is properly configured in Root Complex mode. Link: https://lore.kernel.org/r/20211125124605.25915-10-pali@kernel.org Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:36 +01:00
Pali Rohár	3d394fa375	PCI: mvebu: Fix configuring secondary bus of PCIe Root Port via emulated bridge [ Upstream commit `91a8d79fc7` ] It looks like that mvebu PCIe controller has for each PCIe link fully independent PCIe host bridge and so every PCIe Root Port is isolated not only on its own bus but also isolated from each others. But in past device tree structure was defined to put all PCIe Root Ports (as PCI Bridge devices) into one root bus 0 and this bus is emulated by pci-mvebu.c driver. Probably reason for this decision was incorrect understanding of PCIe topology of these Armada SoCs and also reason of misunderstanding how is PCIe controller generating Type 0 and Type 1 config requests (it is fully different compared to other drivers). Probably incorrect setup leaded to very surprised things like having PCIe Root Port (PCI Bridge device, with even incorrect Device Class set to Memory Controller) and the PCIe device behind the Root Port on the same PCI bus, which obviously was needed to somehow hack (as these two devices cannot be in reality on the same bus). Properly set mvebu local bus number and mvebu local device number based on PCI Bridge secondary bus number configuration. Also correctly report configured secondary bus number in config space. And explain in driver comment why this setup is correct. Link: https://lore.kernel.org/r/20211125124605.25915-12-pali@kernel.org Fixes: `1f08673eef` ("PCI: mvebu: Convert to PCI emulated bridge config space") Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:35 +01:00
Pali Rohár	4396c507a8	PCI: mvebu: Fix support for bus mastering and PCI_COMMAND on emulated bridge [ Upstream commit `e42b855837` ] According to PCI specifications bits [0:2] of Command Register, this should be by default disabled on reset. So explicitly disable these bits at early beginning of driver initialization. Also remove code which unconditionally enables all 3 bits and let kernel code (via pci_set_master() function) to handle bus mastering of PCI Bridge via emulated PCI_COMMAND on emulated bridge. Adjust existing functions mvebu_pcie_handle_iobase_change() and mvebu_pcie_handle_membase_change() to handle PCI_IO_BASE and PCI_MEM_BASE registers correctly even when bus mastering on emulated bridge is disabled. Link: https://lore.kernel.org/r/20211125124605.25915-7-pali@kernel.org Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:35 +01:00
Pali Rohár	bc988b1261	PCI: mvebu: Do not modify PCI IO type bits in conf_write [ Upstream commit `2cf150216e` ] PCI IO type bits are already initialized in mvebu_pci_bridge_emul_init() function and only when IO support is enabled. These type bits are read-only and pci-bridge-emul.c code already does not allow to modify them from upper layers. When IO support is disabled then all IO registers should be read-only and return zeros. Therefore do not modify PCI IO type bits in mvebu_pci_bridge_emul_base_conf_write() callback. Link: https://lore.kernel.org/r/20211125124605.25915-8-pali@kernel.org Fixes: `1f08673eef` ("PCI: mvebu: Convert to PCI emulated bridge config space") Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:35 +01:00
Pali Rohár	c1a027629c	PCI: mvebu: Check for errors from pci_bridge_emul_init() call [ Upstream commit `5d18d702e5` ] Function pci_bridge_emul_init() may fail so correctly check for errors. Link: https://lore.kernel.org/r/20211125124605.25915-3-pali@kernel.org Fixes: `1f08673eef` ("PCI: mvebu: Convert to PCI emulated bridge config space") Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:35 +01:00
Dario Binacchi	7c93c809e0	Input: ti_am335x_tsc - fix STEPCONFIG setup for Z2 [ Upstream commit `6bfeb6c21e` ] The Z2 step configuration doesn't erase the SEL_INP_SWC_3_0 bit-field before setting the ADC channel. This way its value could be corrupted by the ADC channel selected for the Z1 coordinate. Fixes: `8c896308fe` ("input: ti_am335x_adc: use only FIFO0 and clean up a little") Signed-off-by: Dario Binacchi <dariobin@libero.it> Link: https://lore.kernel.org/r/20211212125358.14416-3-dariobin@libero.it Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:35 +01:00
Dario Binacchi	16ff93557d	Input: ti_am335x_tsc - set ADCREFM for X configuration [ Upstream commit `73cca71a90` ] As reported by the STEPCONFIG[1-16] registered field descriptions of the TI reference manual, for the ADC "in single ended, SEL_INM_SWC_3_0 must be 1xxx". Unlike the Y and Z coordinates, this bit has not been set for the step configuration registers used to sample the X coordinate. Fixes: `1b8be32e69` ("Input: add support for TI Touchscreen controller") Signed-off-by: Dario Binacchi <dariobin@libero.it> Link: https://lore.kernel.org/r/20211212125358.14416-2-dariobin@libero.it Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:35 +01:00
Beau Belgrave	628761fe05	tracing: Do not let synth_events block other dyn_event systems during create [ Upstream commit `4f67cca70c` ] synth_events is returning -EINVAL if the dyn_event create command does not contain ' \t'. This prevents other systems from getting called back. synth_events needs to return -ECANCELED in these cases when the command is not targeting the synth_event system. Link: https://lore.kernel.org/linux-trace-devel/20210930223821.11025-1-beaub@linux.microsoft.com Fixes: `c9e759b1e8` ("tracing: Rework synthetic event command parsing") Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Beau Belgrave <beaub@linux.microsoft.com> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:35 +01:00
Christophe JAILLET	f35bacbb79	i3c/master/mipi-i3c-hci: Fix a potentially infinite loop in 'hci_dat_v1_get_index()' [ Upstream commit `3f43926f27` ] The code in 'hci_dat_v1_get_index()' really looks like a hand coded version of 'for_each_set_bit()', except that a +1 is missing when searching for the next set bit. This really looks odd and it seems that it will loop until 'dat_w0_read()' returns the expected result. So use 'for_each_set_bit()' instead. It is less verbose and should be more correct. Fixes: `9ad9a52cce` ("i3c/master: introduce the mipi-i3c-hci driver") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Acked-by: Nicolas Pitre <npitre@baylibre.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/0cdf3cb10293ead1acd271fdb8a70369c298c082.1637186628.git.christophe.jaillet@wanadoo.fr Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:35 +01:00
Jamie Iles	e5264d44f7	i3c: fix incorrect address slot lookup on 64-bit [ Upstream commit `f18f98110f` ] The address slot bitmap is an array of unsigned long's which are the same size as an int on 32-bit platforms but not 64-bit. Loading the bitmap into an int could result in the incorrect status being returned for a slot and slots being reported as the wrong status. Fixes: `3a379bbcea` ("i3c: Add core I3C infrastructure") Cc: Boris Brezillon <bbrezillon@kernel.org> Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Jamie Iles <quic_jiles@quicinc.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Link: https://lore.kernel.org/r/20210922165600.179394-1-quic_jiles@quicinc.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:35 +01:00
Hou Wenlong	1adfbfaeb2	KVM: x86: Exit to userspace if emulation prepared a completion callback [ Upstream commit `adbfb12d4c` ] em_rdmsr() and em_wrmsr() return X86EMUL_IO_NEEDED if MSR accesses required an exit to userspace. However, x86_emulate_insn() doesn't return X86EMUL_*, so x86_emulate_instruction() doesn't directly act on X86EMUL_IO_NEEDED; instead, it looks for other signals to differentiate between PIO, MMIO, etc. causing RDMSR/WRMSR emulation not to exit to userspace now. Nevertheless, if the userspace_msr_exit_test testcase in selftests is changed to test RDMSR/WRMSR with a forced emulation prefix, the test passes. What happens is that first userspace exit information is filled but the userspace exit does not happen. Because x86_emulate_instruction() returns 1, the guest retries the instruction---but this time RIP has already been adjusted past the forced emulation prefix, so the guest executes RDMSR/WRMSR and the userspace exit finally happens. Since the X86EMUL_IO_NEEDED path has provided a complete_userspace_io callback, x86_emulate_instruction() can just return 0 if the callback is not NULL. Then RDMSR/WRMSR instruction emulation will exit to userspace directly, without the RDMSR/WRMSR vmexit. Fixes: `1ae099540e` ("KVM: x86: Allow deflecting unknown MSR accesses to user space") Signed-off-by: Hou Wenlong <houwenlong93@linux.alibaba.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <56f9df2ee5c05a81155e2be366c9dc1f7adc8817.1635842679.git.houwenlong93@linux.alibaba.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:35 +01:00
Sean Christopherson	3d8468045e	KVM: x86: Handle 32-bit wrap of EIP for EMULTYPE_SKIP with flat code seg [ Upstream commit `5e854864ee` ] Truncate the new EIP to a 32-bit value when handling EMULTYPE_SKIP as the decode phase does not truncate _eip. Wrapping the 32-bit boundary is legal if and only if CS is a flat code segment, but that check is implicitly handled in the form of limit checks in the decode phase. Opportunstically prepare for a future fix by storing the result of any truncation in "eip" instead of "_eip". Fixes: `1957aa63be` ("KVM: VMX: Handle single-step #DB for EMULTYPE_SKIP on EPT misconfig") Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <093eabb1eab2965201c9b018373baf26ff256d85.1635842679.git.houwenlong93@linux.alibaba.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:34 +01:00
Lai Jiangshan	00542cbacf	KVM: X86: Ensure that dirty PDPTRs are loaded [ Upstream commit `2c5653caec` ] For VMX with EPT, dirty PDPTRs need to be loaded before the next vmentry via vmx_load_mmu_pgd() But not all paths that call load_pdptrs() will cause vmx_load_mmu_pgd() to be invoked. Normally, kvm_mmu_reset_context() is used to cause KVM_REQ_LOAD_MMU_PGD, but sometimes it is skipped: * commit d81135a57aa6("KVM: x86: do not reset mmu if CR0.CD and CR0.NW are changed") skips kvm_mmu_reset_context() after load_pdptrs() when changing CR0.CD and CR0.NW. * commit 21823fbda552("KVM: x86: Invalidate all PGDs for the current PCID on MOV CR3 w/ flush") skips KVM_REQ_LOAD_MMU_PGD after load_pdptrs() when rewriting the CR3 with the same value. * commit a91a7c709600("KVM: X86: Don't reset mmu context when toggling X86_CR4_PGE") skips kvm_mmu_reset_context() after load_pdptrs() when changing CR4.PGE. Fixes: `d81135a57a` ("KVM: x86: do not reset mmu if CR0.CD and CR0.NW are changed") Fixes: `21823fbda5` ("KVM: x86: Invalidate all PGDs for the current PCID on MOV CR3 w/ flush") Fixes: `a91a7c7096` ("KVM: X86: Don't reset mmu context when toggling X86_CR4_PGE") Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Message-Id: <20211108124407.12187-2-jiangshanlai@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:34 +01:00
Sean Christopherson	723053e16d	KVM: VMX: Read Posted Interrupt "control" exactly once per loop iteration [ Upstream commit `cfb0e1306a` ] Use READ_ONCE() when loading the posted interrupt descriptor control field to ensure "old" and "new" have the same base value. If the compiler emits separate loads, and loads into "new" before "old", KVM could theoretically drop the ON bit if it were set between the loads. Fixes: `28b835d60f` ("KVM: Update Posted-Interrupts Descriptor when vCPU is preempted") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211009021236.4122790-27-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:34 +01:00
Sean Christopherson	32b758d12c	KVM: s390: Ensure kvm_arch_no_poll() is read once when blocking vCPU [ Upstream commit `6f390916c4` ] Wrap s390's halt_poll_max_steal with READ_ONCE and snapshot the result of kvm_arch_no_poll() in kvm_vcpu_block() to avoid a mostly-theoretical, largely benign bug on s390 where the result of kvm_arch_no_poll() could change due to userspace modifying halt_poll_max_steal while the vCPU is blocking. The bug is largely benign as it will either cause KVM to skip updating halt-polling times (no_poll toggles false=>true) or to update halt-polling times with a slightly flawed block_ns. Note, READ_ONCE is unnecessary in the current code, add it in case the arch hook is ever inlined, and to provide a hint that userspace can change the param at will. Fixes: `8b905d28ee` ("KVM: s390: provide kvm_arch_no_poll function") Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211009021236.4122790-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:34 +01:00
Paolo Bonzini	b63190d020	KVM: VMX: Don't unblock vCPU w/ Posted IRQ if IRQs are disabled in guest [ Upstream commit `1831fa44df` ] Don't configure the wakeup handler when a vCPU is blocking with IRQs disabled, in which case any IRQ, posted or otherwise, should not be recognized and thus should not wake the vCPU. Fixes: `bf9f6ac8d7` ("KVM: Update Posted-Interrupts Descriptor when vCPU is blocked") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20211009021236.4122790-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:34 +01:00
Pali Rohár	f303196899	PCI: aardvark: Fix checking for MEM resource type [ Upstream commit `2070b2ddea` ] IORESOURCE_MEM_64 is not a resource type but a type flag. Remove incorrect check for type IORESOURCE_MEM_64. Link: https://lore.kernel.org/r/20211125160148.26029-2-kabel@kernel.org Fixes: `64f160e19e` ("PCI: aardvark: Configure PCIe resources from 'ranges' DT property") Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:34 +01:00
Tim Harvey	a2f5e9a6f2	PCI: dwc: Do not remap invalid res [ Upstream commit `6e5ebc96ec` ] On imx6 and perhaps others when pcie probes you get a: imx6q-pcie 33800000.pcie: invalid resource This occurs because the atu is not specified in the DT and as such it should not be remapped. Link: https://lore.kernel.org/r/20211101180243.23761-1-tharvey@gateworks.com Fixes: `281f1f99cf` ("PCI: dwc: Detect number of iATU windows") Signed-off-by: Tim Harvey <tharvey@gateworks.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Richard Zhu <hongxing.zhu@nxp.com> Cc: Richard Zhu <hongxing.zhu@nxp.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:34 +01:00
Marek Vasut	d9fc43aab6	PCI: rcar: Check if device is runtime suspended instead of __clk_is_enabled() [ Upstream commit `d2a14b5498` ] Replace __clk_is_enabled() with pm_runtime_suspended(), as __clk_is_enabled() was checking the wrong bus clock and caused the following build error too: arm-linux-gnueabi-ld: drivers/pci/controller/pcie-rcar-host.o: in function `rcar_pcie_aarch32_abort_handler': pcie-rcar-host.c:(.text+0xdd0): undefined reference to `__clk_is_enabled' Link: https://lore.kernel.org/r/20211115204641.12941-1-marek.vasut@gmail.com Fixes: `a115b1bd3a` ("PCI: rcar: Add L1 link state fix into data abort hook") Signed-off-by: Marek Vasut <marek.vasut+renesas@gmail.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Randy Dunlap <rdunlap@infradead.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Cc: Stephen Boyd <sboyd@kernel.org> Cc: Wolfram Sang <wsa@the-dreams.de> Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Cc: linux-renesas-soc@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:34 +01:00
Jianjun Wang	aa805236ed	PCI: mediatek-gen3: Disable DVFSRC voltage request [ Upstream commit `ab344fd43f` ] When the DVFSRC (dynamic voltage and frequency scaling resource collector) feature is not implemented, the PCIe hardware will assert a voltage request signal when exit from the L1 PM Substates to request a specific Vcore voltage, but cannot receive the voltage ready signal, which will cause the link to fail to exit the L1 PM Substates. Disable DVFSRC voltage request by default, we need to find a common way to enable it in the future. Link: https://lore.kernel.org/r/20211015063602.29058-1-jianjun.wang@mediatek.com Fixes: `d3bf75b579` ("PCI: mediatek-gen3: Add MediaTek Gen3 driver for MT8192") Tested-by: Qizhong Cheng <qizhong.cheng@mediatek.com> Signed-off-by: Jianjun Wang <jianjun.wang@mediatek.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Tzung-Bi Shih <tzungbi@google.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:34 +01:00
Eric W. Biederman	7f361266e9	signal: In get_signal test for signal_group_exit every time through the loop [ Upstream commit `e7f7c99ba9` ] Recently while investigating a problem with rr and signals I noticed that siglock is dropped in ptrace_signal and get_signal does not jump to relock. Looking farther to see if the problem is anywhere else I see that do_signal_stop also returns if signal_group_exit is true. I believe that test can now never be true, but it is a bit hard to trace through and be certain. Testing signal_group_exit is not expensive, so move the test for signal_group_exit into the for loop inside of get_signal to ensure the test is never skipped improperly. This has been a potential problem since I added the test for signal_group_exit was added. Fixes: `35634ffa17` ("signal: Always notice exiting tasks") Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lkml.kernel.org/r/875yssekcd.fsf_-_@email.froward.int.ebiederm.org Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:34 +01:00
Huang Pei	f98371d2ac	MIPS: fix local_{add,sub}_return on MIPS64 [ Upstream commit `277c8cb3e8` ] Use "daddu/dsubu" for long int on MIPS64 instead of "addu/subu" Fixes: `7232311ef1` ("local_t: mips extension") Signed-off-by: Huang Pei <huangpei@loongson.cn> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:33 +01:00
Tudor Ambarus	64b487be33	mtd: spi-nor: Fix mtd size for s3an flashes [ Upstream commit `f656b419d4` ] As it was before the blamed commit, s3an_nor_scan() was called after mtd size was set with params->size, and it overwrote the mtd size value with '8 * nor->page_size * nor->info->n_sectors' when XSR_PAGESIZE was set. With the introduction of s3an_post_sfdp_fixups(), we missed to update the mtd size for the s3an flashes. Fix the mtd size by updating both nor->params->size, (which will update the mtd_info size later on) and nor->mtd.size (which is used in spi_nor_set_addr_width()). Fixes: `641edddb4f` ("mtd: spi-nor: Add s3an_post_sfdp_fixups()") Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Reviewed-by: Pratyush Yadav <p.yadav@ti.com> Link: https://lore.kernel.org/r/20211207140254.87681-2-tudor.ambarus@microchip.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:33 +01:00
Andrii Nakryiko	83ef63535a	tools/resolve_btf_ids: Close ELF file on error [ Upstream commit `1144ab9bdf` ] Fix one case where we don't do explicit clean up. Fixes: `fbbb68de80` ("bpf: Add resolve_btfids tool to resolve BTF IDs in ELF object") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211124002325.1737739-2-andrii@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:33 +01:00
Hao Xu	1bd12b7aae	io_uring: fix no lock protection for ctx->cq_extra [ Upstream commit `e302f1046f` ] ctx->cq_extra should be protected by completion lock so that the req_need_defer() does the right check. Cc: stable@vger.kernel.org Signed-off-by: Hao Xu <haoxu@linux.alibaba.com> Link: https://lore.kernel.org/r/20211125092103.224502-2-haoxu@linux.alibaba.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:33 +01:00
Chuck Lever	384d1b1138	NFSD: Fix zero-length NFSv3 WRITEs [ Upstream commit `6a2f774424` ] The Linux NFS server currently responds to a zero-length NFSv3 WRITE request with NFS3ERR_IO. It responds to a zero-length NFSv4 WRITE with NFS4_OK and count of zero. RFC 1813 says of the WRITE procedure's @count argument: count The number of bytes of data to be written. If count is 0, the WRITE will succeed and return a count of 0, barring errors due to permissions checking. RFC 8881 has similar language for NFSv4, though NFSv4 removed the explicit @count argument because that value is already contained in the opaque payload array. The synthetic client pynfs's WRT4 and WRT15 tests do emit zero- length WRITEs to exercise this spec requirement. Commit `fdec6114ee` ("nfsd4: zero-length WRITE should succeed") addressed the same problem there with the same fix. But interestingly the Linux NFS client does not appear to emit zero- length WRITEs, instead squelching them. I'm not aware of a test that can generate such WRITEs for NFSv3, so I wrote a naive C program to generate a zero-length WRITE and test this fix. Fixes: `8154ef2776` ("NFSD: Clean up legacy NFS WRITE argument XDR decoders") Reported-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-03-08 19:12:33 +01:00

... 2 3 4 5 6 ...

1049229 Commits