Commit Graph

1169977 Commits

Author SHA1 Message Date
Guchun Chen
3fadda5de8 drm/amdgpu: move poll enabled/disable into non DC path
Some amd asics having reliable hotplug support don't call
drm_kms_helper_poll_init in driver init sequence. However,
due to the unified suspend/resume path for all asics, because
the output_poll_work->func is not set for these asics, a warning
arrives when suspending.

[   90.656049]  <TASK>
[   90.656050]  ? console_unlock+0x4d/0x100
[   90.656053]  ? __irq_work_queue_local+0x27/0x60
[   90.656056]  ? irq_work_queue+0x2b/0x50
[   90.656057]  ? __wake_up_klogd+0x40/0x60
[   90.656059]  __cancel_work_timer+0xed/0x180
[   90.656061]  drm_kms_helper_poll_disable.cold+0x1f/0x2c [drm_kms_helper]
[   90.656072]  amdgpu_device_suspend+0x81/0x170 [amdgpu]
[   90.656180]  amdgpu_pmops_runtime_suspend+0xb5/0x1b0 [amdgpu]
[   90.656269]  pci_pm_runtime_suspend+0x61/0x1b0

drm_kms_helper_poll_enable/disable is valid when poll_init is called in
amdgpu code, which is only used in non DC path. So move such codes into
non-DC path code to get rid of such warnings.

v1: introduce use_kms_poll flag in amdgpu as the poll stuff check
v2: use dc_enabled as the flag to simply code
v3: move code into non DC path instead of relying on any flag

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2411
Fixes: a4e771729a ("drm/probe_helper: sort out poll_running vs poll_enabled")
Reported-by: Bert Karwatzki <spasswolf@web.de>
Suggested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2023-03-14 10:38:13 -04:00
Bhawanpreet Lakha
728cefa53a drm/amd/display: Fix HDCP failing to enable after suspend
[Why]
On resume some displays are not ready for HDCP, so they will fail if we
start the hdcp authentintication too soon.

Add a delay so that the displays can be ready before we start.

NOTE: Previoulsy this delay was set to 3 seconds but it was causing
issues with compliance, 2 seconds should enough for compliance and the
s3 resume case.

[How]
Change the Delay to 2 seconds.

Reviewed-by: Aurabindo Pillai <Aurabindo.Pillai@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-14 10:37:48 -04:00
Chia-I Wu
9da050b0d9 drm/amdkfd: fix potential kgd_mem UAFs
kgd_mem pointers returned by kfd_process_device_translate_handle are
only guaranteed to be valid while p->mutex is held. As soon as the mutex
is unlocked, another thread can free the BO.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-14 10:37:22 -04:00
Jane Jian
d71e38df3b drm/amdgpu/vcn: custom video info caps for sriov
for sriov, we added a new flag to indicate av1 support,
this will override the original caps info.

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-14 10:37:09 -04:00
Błażej Szczygieł
a9386ee968 drm/amd/pm: Fix sienna cichlid incorrect OD volage after resume
Always setup overdrive tables after resume. Preserve only some
user-defined settings in user_overdrive_table if they're set.

Copy restored user_overdrive_table into od_table to get correct
values.

On cold boot, BTC was triggered and GfxVfCurve was calibrated. We
got VfCurve settings (a). On resuming back, BTC will be triggered
again and GfxVfCurve will be recalibrated. VfCurve settings (b)
got may be different from those of cold boot.  So if we reuse
those VfCurve settings (a) got on cold boot on suspend, we can
run into discrepencies.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1897
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2276
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Błażej Szczygieł <mumei6102@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2023-03-14 10:36:25 -04:00
Tim Huang
ab9bdb1213 drm/amd/pm: bump SMU 13.0.4 driver_if header version
Align the SMU driver interface version with PMFW to
suppress the version mismatch message on driver loading.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x
2023-03-14 10:30:47 -04:00
Chia-I Wu
b2ca5c5d41 drm/amdkfd: fix a potential double free in pqm_create_queue
Set *q to NULL on errors, otherwise pqm_create_queue would free it
again.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-14 10:30:29 -04:00
Xiaogang Chen
8eeddc0d42 drm/amdkfd: Get prange->offset after svm_range_vram_node_new
During miration to vram prange->offset is valid after vram buffer is located,
either use old one or allocate a new one. Move svm_range_vram_node_new before
migrate for each vma to get valid prange->offset.

v2: squash in warning fix

Fixes: b4ee960637 ("drm/amdkfd: Fix BO offset for multi-VMA page migration")
Signed-off-by: Xiaogang Chen <Xiaogang.Chen@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-14 10:29:52 -04:00
Xiaogang Chen
b4ee960637 drm/amdkfd: Fix BO offset for multi-VMA page migration
svm_migrate_ram_to_vram migrates a prange from sys ram to vram. The prange may
cross multiple vma. Need remember current dst vram offset in the TTM resource for
each migration.

v2: squash in warning fix (Alex)

Signed-off-by: Xiaogang Chen <Xiaogang.Chen@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-03-14 10:27:37 -04:00
Jan Beulich
934ef33ee7 x86/PVH: obtain VGA console info in Dom0
A new platform-op was added to Xen to allow obtaining the same VGA
console information PV Dom0 is handed. Invoke the new function and have
the output data processed by xen_init_vga().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>

Link: https://lore.kernel.org/r/8f315e92-7bda-c124-71cc-478ab9c5e610@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2023-03-14 15:20:51 +01:00
Vipin Sharma
f3e707413d KVM: selftests: Sync KVM exit reasons in selftests
Add missing KVM_EXIT_* reasons in KVM selftests from
include/uapi/linux/kvm.h

Signed-off-by: Vipin Sharma <vipinsh@google.com>
Message-Id: <20230204014547.583711-5-vipinsh@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-14 10:20:10 -04:00
Sean Christopherson
1b3d660e5d KVM: selftests: Add macro to generate KVM exit reason strings
Add and use a macro to generate the KVM exit reason strings array
instead of relying on developers to correctly copy+paste+edit each
string.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230204014547.583711-4-vipinsh@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-14 10:20:10 -04:00
Vipin Sharma
6f974494b8 KVM: selftests: Print expected and actual exit reason in KVM exit reason assert
Print what KVM exit reason a test was expecting and what it actually
got int TEST_ASSERT_KVM_EXIT_REASON().

Signed-off-by: Vipin Sharma <vipinsh@google.com>
Message-Id: <20230204014547.583711-3-vipinsh@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-14 10:20:09 -04:00
Vipin Sharma
c96f57b080 KVM: selftests: Make vCPU exit reason test assertion common
Make TEST_ASSERT_KVM_EXIT_REASON() macro and replace all exit reason
test assert statements with it.

No functional changes intended.

Signed-off-by: Vipin Sharma <vipinsh@google.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Message-Id: <20230204014547.583711-2-vipinsh@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-14 10:20:09 -04:00
David Woodhouse
e6239a4ec5 KVM: selftests: Add EVTCHNOP_send slow path test to xen_shinfo_test
When kvm_xen_evtchn_send() takes the slow path because the shinfo GPC
needs to be revalidated, it used to violate the SRCU vs. kvm->lock
locking rules and potentially cause a deadlock.

Now that lockdep is learning to catch such things, make sure that code
path is exercised by the selftest.

Link: https://lore.kernel.org/all/20230113124606.10221-2-dwmw2@infradead.org
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230204024151.1373296-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-14 10:20:08 -04:00
David Woodhouse
e7062a98d0 KVM: selftests: Use enum for test numbers in xen_shinfo_test
The xen_shinfo_test started off with very few iterations, and the numbers
we used in GUEST_SYNC() were precisely mapped to the RUNSTATE_xxx values
anyway to start with.

It has since grown quite a few more tests, and it's kind of awful to be
handling them all as bare numbers. Especially when I want to add a new
test in the middle. Define an enum for the test stages, and use it both
in the guest code and the host switch statement.

No functional change, if I can count to 24.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230204024151.1373296-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-14 10:20:08 -04:00
Sean Christopherson
c0c76d9993 KVM: selftests: Add helpers to make Xen-style VMCALL/VMMCALL hypercalls
Add wrappers to do hypercalls using VMCALL/VMMCALL and Xen's register ABI
(as opposed to full Xen-style hypercalls through a hypervisor provided
page).  Using the common helpers dedups a pile of code, and uses the
native hypercall instruction when running on AMD.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230204024151.1373296-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-14 10:20:08 -04:00
Sean Christopherson
4009e0bb7b KVM: selftests: Move the guts of kvm_hypercall() to a separate macro
Extract the guts of kvm_hypercall() to a macro so that Xen hypercalls,
which have a different register ABI, can reuse the VMCALL vs. VMMCALL
logic.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230204024151.1373296-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-14 10:20:07 -04:00
Sean Christopherson
c281794eaa KVM: SVM: WARN if GATag generation drops VM or vCPU ID information
WARN if generating a GATag given a VM ID and vCPU ID doesn't yield the
same IDs when pulling the IDs back out of the tag.  Don't bother adding
error handling to callers, this is very much a paranoid sanity check as
KVM fully controls the VM ID and is supposed to reject too-big vCPU IDs.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Message-Id: <20230207002156.521736-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-14 10:20:07 -04:00
Suravee Suthikulpanit
5999715922 KVM: SVM: Modify AVIC GATag to support max number of 512 vCPUs
Define AVIC_VCPU_ID_MASK based on AVIC_PHYSICAL_MAX_INDEX, i.e. the mask
that effectively controls the largest guest physical APIC ID supported by
x2AVIC, instead of hardcoding the number of bits to 8 (and the number of
VM bits to 24).

The AVIC GATag is programmed into the AMD IOMMU IRTE to provide a
reference back to KVM in case the IOMMU cannot inject an interrupt into a
non-running vCPU.  In such a case, the IOMMU notifies software by creating
a GALog entry with the corresponded GATag, and KVM then uses the GATag to
find the correct VM+vCPU to kick.  Dropping bit 8 from the GATag results
in kicking the wrong vCPU when targeting vCPUs with x2APIC ID > 255.

Fixes: 4d1d7942e3 ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
Cc: stable@vger.kernel.org
Reported-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Message-Id: <20230207002156.521736-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-14 10:20:06 -04:00
Sean Christopherson
3ec7a1b274 KVM: SVM: Fix a benign off-by-one bug in AVIC physical table mask
Define the "physical table max index mask" as bits 8:0, not 9:0.  x2AVIC
currently supports a max of 512 entries, i.e. the max index is 511, and
the inputs to GENMASK_ULL() are inclusive.  The bug is benign as bit 9 is
reserved and never set by KVM, i.e. KVM is just clearing bits that are
guaranteed to be zero.

Note, as of this writing, APM "Rev. 3.39-October 2022" incorrectly states
that bits 11:8 are reserved in Table B-1. VMCB Layout, Control Area.  I.e.
that table wasn't updated when x2AVIC support was added.

Opportunistically fix the comment for the max AVIC ID to align with the
code, and clean up comment formatting too.

Fixes: 4d1d7942e3 ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
Cc: stable@vger.kernel.org
Cc: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Message-Id: <20230207002156.521736-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-14 10:20:06 -04:00
Paolo Bonzini
3dc40cf89b selftests: KVM: skip hugetlb tests if huge pages are not available
Right now, if KVM memory stress tests are run with hugetlb sources but hugetlb is
not available (either in the kernel or because /proc/sys/vm/nr_hugepages is 0)
the test will fail with a memory allocation error.

This makes it impossible to add tests that default to hugetlb-backed memory,
because on a machine with a default configuration they will fail.  Therefore,
check HugePages_Total as well and, if zero, direct the user to enable hugepages
in procfs.  Furthermore, return KSFT_SKIP whenever hugetlb is not available.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-14 10:20:06 -04:00
Rong Tao
53293cb81b KVM: VMX: Use tabs instead of spaces for indentation
Code indentation should use tabs where possible and miss a '*'.

Signed-off-by: Rong Tao <rongtao@cestc.cn>
Message-Id: <tencent_A492CB3F9592578451154442830EA1B02C07@qq.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-14 09:40:55 -04:00
Rong Tao
06e1854728 KVM: VMX: Fix indentation coding style issue
Code indentation should use tabs where possible.

Signed-off-by: Rong Tao <rongtao@cestc.cn>
Message-Id: <tencent_31E6ACADCB6915E157CF5113C41803212107@qq.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-14 09:40:55 -04:00
Paolo Bonzini
77900bffed KVM: nVMX: remove unnecessary #ifdef
nested_vmx_check_controls() has already run by the time KVM checks host state,
so the "host address space size" exit control can only be set on x86-64 hosts.
Simplify the condition at the cost of adding some dead code to 32-bit kernels.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-14 09:40:54 -04:00
Paolo Bonzini
112e66017b KVM: nVMX: add missing consistency checks for CR0 and CR4
The effective values of the guest CR0 and CR4 registers may differ from
those included in the VMCS12.  In particular, disabling EPT forces
CR4.PAE=1 and disabling unrestricted guest mode forces CR0.PG=CR0.PE=1.

Therefore, checks on these bits cannot be delegated to the processor
and must be performed by KVM.

Reported-by: Reima ISHII <ishiir@g.ecc.u-tokyo.ac.jp>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-03-14 09:40:54 -04:00
Paolo Bonzini
bceeedb2f0 KVM/arm64 fixes for 6.3, part #1
A single patch to address a rather annoying bug w.r.t. guest timer
 offsetting. Effectively the synchronization of timer offsets between
 vCPUs was broken, leading to inconsistent timer reads within the VM.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSNXHjWXuzMZutrKNKivnWIJHzdFgUCZAzwRwAKCRCivnWIJHzd
 Fh0nAP4seI9aMrv0EnCHS9nufCSYQZYGBxOe+8EyUOIERxyCPgEAspn6fNJWnc6o
 RWbFGMyNHPgeQgGjH+g4ehqh5LSeMww=
 =uFU2
 -----END PGP SIGNATURE-----

Merge tag 'kvmarm-fixes-6.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 6.3, part #1

A single patch to address a rather annoying bug w.r.t. guest timer
offsetting. Effectively the synchronization of timer offsets between
vCPUs was broken, leading to inconsistent timer reads within the VM.
2023-03-14 09:40:39 -04:00
Randy Dunlap
6175b70df9 powerpc/pseries: RTAS work area requires GENERIC_ALLOCATOR
The RTAS work area allocator uses code that is built by
GENERIC_ALLOCATOR, so the PSERIES Kconfig should select the
required Kconfig symbol to fix multiple build errors.

powerpc64-linux-ld: arch/powerpc/platforms/pseries/rtas-work-area.o: in function `.rtas_work_area_allocator_init':
rtas-work-area.c:(.init.text+0x288): undefined reference to `.gen_pool_create'
powerpc64-linux-ld: rtas-work-area.c:(.init.text+0x2dc): undefined reference to `.gen_pool_set_algo'
powerpc64-linux-ld: rtas-work-area.c:(.init.text+0x310): undefined reference to `.gen_pool_add_owner'
powerpc64-linux-ld: rtas-work-area.c:(.init.text+0x43c): undefined reference to `.gen_pool_destroy'
powerpc64-linux-ld: arch/powerpc/platforms/pseries/rtas-work-area.o:(.toc+0x0): undefined reference to `gen_pool_first_fit_order_align'
powerpc64-linux-ld: arch/powerpc/platforms/pseries/rtas-work-area.o: in function `.__rtas_work_area_alloc':
rtas-work-area.c:(.ref.text+0x14c): undefined reference to `.gen_pool_alloc_algo_owner'
powerpc64-linux-ld: rtas-work-area.c:(.ref.text+0x238): undefined reference to `.gen_pool_alloc_algo_owner'
powerpc64-linux-ld: arch/powerpc/platforms/pseries/rtas-work-area.o: in function `.rtas_work_area_free':
rtas-work-area.c:(.ref.text+0x44c): undefined reference to `.gen_pool_free_owner'

Fixes: 43033bc62d ("powerpc/pseries: add RTAS work area allocator")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230223070116.660-2-rdunlap@infradead.org
2023-03-14 23:03:36 +11:00
Alex Elder
512dd35471 net: ipa: fix a surprising number of bad offsets
A recent commit eliminated a hack that adjusted the offset used for
many GSI registers.  It became possible because we now specify all
GSI register offsets explicitly for every version of IPA.

Unfortunately, a large number of register offsets were *not* updated
as they should have been in that commit.  For IPA v4.5+, the offset
for every GSI register *except* the two inter-EE interrupt masking
registers were supposed to have been reduced by 0xd000.

Tested-by: Luca Weiss <luca.weiss@fairphone.com>
Tested-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> # SM8350-HDK
Fixes: 59b12b1d27 ("net: ipa: kill gsi->virt_raw")
Signed-off-by: Alex Elder <elder@linaro.org>
Link: https://lore.kernel.org/r/20230310193709.1477102-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-13 17:11:43 -07:00
Arınç ÜNAL
0b086d76e7 net: dsa: mt7530: set PLL frequency and trgmii only when trgmii is used
As my testing on the MCM MT7530 switch on MT7621 SoC shows, setting the PLL
frequency does not affect MII modes other than trgmii on port 5 and port 6.
So the assumption is that the operation here called "setting the PLL
frequency" actually sets the frequency of the TRGMII TX clock.

Make it so that it and the rest of the trgmii setup run only when the
trgmii mode is used.

Tested rgmii and trgmii modes of port 6 on MCM MT7530 on MT7621AT Unielec
U7621-06 and standalone MT7530 on MT7623NI Bananapi BPI-R2.

Fixes: b8f126a8d5 ("net-next: dsa: add dsa support for Mediatek MT7530 switch")
Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Link: https://lore.kernel.org/r/20230310073338.5836-2-arinc.unal@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-13 17:04:18 -07:00
Arınç ÜNAL
feb03fd11c net: dsa: mt7530: remove now incorrect comment regarding port 5
Remove now incorrect comment regarding port 5 as GMAC5. This is supposed to
be supported since commit 38f790a805 ("net: dsa: mt7530: Add support for
port 5") under mt7530_setup_port5().

Fixes: 38f790a805 ("net: dsa: mt7530: Add support for port 5")
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Link: https://lore.kernel.org/r/20230310073338.5836-1-arinc.unal@arinc9.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-13 17:04:18 -07:00
Daniil Tatianin
1a9dc5610e qed/qed_dev: guard against a possible division by zero
Previously we would divide total_left_rate by zero if num_vports
happened to be 1 because non_requested_count is calculated as
num_vports - req_count. Guard against this by validating num_vports at
the beginning and returning an error otherwise.

Found by Linux Verification Center (linuxtesting.org) with the SVACE
static analysis tool.

Fixes: bcd197c81f ("qed: Add vport WFQ configuration APIs")
Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230309201556.191392-1-d-tatianin@yandex-team.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-13 16:55:02 -07:00
D. Wythe
22a825c541 net/smc: fix NULL sndbuf_desc in smc_cdc_tx_handler()
When performing a stress test on SMC-R by rmmod mlx5_ib driver
during the wrk/nginx test, we found that there is a probability
of triggering a panic while terminating all link groups.

This issue dues to the race between smc_smcr_terminate_all()
and smc_buf_create().

			smc_smcr_terminate_all

smc_buf_create
/* init */
conn->sndbuf_desc = NULL;
...

			__smc_lgr_terminate
				smc_conn_kill
					smc_close_abort
						smc_cdc_get_slot_and_msg_send

			__softirqentry_text_start
				smc_wr_tx_process_cqe
					smc_cdc_tx_handler
						READ(conn->sndbuf_desc->len);
						/* panic dues to NULL sndbuf_desc */

conn->sndbuf_desc = xxx;

This patch tries to fix the issue by always to check the sndbuf_desc
before send any cdc msg, to make sure that no null pointer is
seen during cqe processing.

Fixes: 0b29ec6436 ("net/smc: immediate termination for SMCR link groups")
Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
Link: https://lore.kernel.org/r/1678263432-17329-1-git-send-email-alibuda@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-13 16:03:58 -07:00
Vadim Fedorenko
131db49916 bnxt_en: reset PHC frequency in free-running mode
When using a PHC in shared between multiple hosts, the previous
frequency value may not be reset and could lead to host being unable to
compensate the offset with timecounter adjustments. To avoid such state
reset the hardware frequency of PHC to zero on init. Some refactoring is
needed to make code readable.

Fixes: 85036aee19 ("bnxt_en: Add a non-real time mode to access NIC clock")
Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Link: https://lore.kernel.org/r/20230310151356.678059-1-vadfed@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-13 15:57:31 -07:00
NeilBrown
3bc5729227 md: avoid signed overflow in slot_store()
slot_store() uses kstrtouint() to get a slot number, but stores the
result in an "int" variable (by casting a pointer).
This can result in a negative slot number if the unsigned int value is
very large.

A negative number means that the slot is empty, but setting a negative
slot number this way will not remove the device from the array.  I don't
think this is a serious problem, but it could cause confusion and it is
best to fix it.

Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Song Liu <song@kernel.org>
2023-03-13 12:50:54 -07:00
Johan Hovold
9db481c909 memory: tegra30-emc: fix interconnect registration race
The current interconnect provider registration interface is inherently
racy as nodes are not added until the after adding the provider. This
can specifically cause racing DT lookups to fail.

Switch to using the new API where the provider is not registered until
after it has been fully initialised.

Fixes: d5ef16ba5f ("memory: tegra20: Support interconnect framework")
Cc: stable@vger.kernel.org      # 5.11
Cc: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20230306075651.2449-21-johan+linaro@kernel.org
Signed-off-by: Georgi Djakov <djakov@kernel.org>
2023-03-13 21:13:49 +02:00
Johan Hovold
c5587f61ec memory: tegra20-emc: fix interconnect registration race
The current interconnect provider registration interface is inherently
racy as nodes are not added until the after adding the provider. This
can specifically cause racing DT lookups to fail.

Switch to using the new API where the provider is not registered until
after it has been fully initialised.

Fixes: d5ef16ba5f ("memory: tegra20: Support interconnect framework")
Cc: stable@vger.kernel.org      # 5.11
Cc: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20230306075651.2449-20-johan+linaro@kernel.org
Signed-off-by: Georgi Djakov <djakov@kernel.org>
2023-03-13 21:13:48 +02:00
Johan Hovold
abd9f1b49c memory: tegra124-emc: fix interconnect registration race
The current interconnect provider registration interface is inherently
racy as nodes are not added until the after adding the provider. This
can specifically cause racing DT lookups to fail.

Switch to using the new API where the provider is not registered until
after it has been fully initialised.

Fixes: 380def2d4c ("memory: tegra124: Support interconnect framework")
Cc: stable@vger.kernel.org      # 5.12
Cc: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20230306075651.2449-19-johan+linaro@kernel.org
Signed-off-by: Georgi Djakov <djakov@kernel.org>
2023-03-13 21:13:48 +02:00
Johan Hovold
5553055c62 memory: tegra: fix interconnect registration race
The current interconnect provider registration interface is inherently
racy as nodes are not added until the after adding the provider. This
can specifically cause racing DT lookups to fail.

Switch to using the new API where the provider is not registered until
after it has been fully initialised.

Fixes: 06f079816d ("memory: tegra-mc: Add interconnect framework")
Cc: stable@vger.kernel.org      # 5.11
Cc: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20230306075651.2449-18-johan+linaro@kernel.org
Signed-off-by: Georgi Djakov <djakov@kernel.org>
2023-03-13 21:13:48 +02:00
Johan Hovold
859ad5f177 interconnect: exynos: drop redundant link destroy
There is no longer any need to explicitly destroy node links as this is
now done when the node is destroyed as part of icc_nodes_remove().

Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20230306075651.2449-17-johan+linaro@kernel.org
Signed-off-by: Georgi Djakov <djakov@kernel.org>
2023-03-13 21:13:48 +02:00
Johan Hovold
c9e46ca612 interconnect: exynos: fix registration race
The current interconnect provider registration interface is inherently
racy as nodes are not added until the after adding the provider. This
can specifically cause racing DT lookups to trigger a NULL-pointer
deference when either a NULL pointer or not fully initialised node is
returned from exynos_generic_icc_xlate().

Switch to using the new API where the provider is not registered until
after it has been fully initialised.

Fixes: 2f95b9d5cf ("interconnect: Add generic interconnect driver for Exynos SoCs")
Cc: stable@vger.kernel.org      # 5.11
Cc: Sylwester Nawrocki <s.nawrocki@samsung.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20230306075651.2449-16-johan+linaro@kernel.org
Signed-off-by: Georgi Djakov <djakov@kernel.org>
2023-03-13 21:13:48 +02:00
Johan Hovold
3aab264875 interconnect: exynos: fix node leak in probe PM QoS error path
Make sure to add the newly allocated interconnect node to the provider
before adding the PM QoS request so that the node is freed on errors.

Fixes: 2f95b9d5cf ("interconnect: Add generic interconnect driver for Exynos SoCs")
Cc: stable@vger.kernel.org      # 5.11
Cc: Sylwester Nawrocki <s.nawrocki@samsung.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20230306075651.2449-15-johan+linaro@kernel.org
Signed-off-by: Georgi Djakov <djakov@kernel.org>
2023-03-13 21:13:48 +02:00
Johan Hovold
bfe7bcd2b9 interconnect: qcom: msm8974: fix registration race
The current interconnect provider registration interface is inherently
racy as nodes are not added until the after adding the provider. This
can specifically cause racing DT lookups to fail.

Switch to using the new API where the provider is not registered until
after it has been fully initialised.

Fixes: 4e60a9568d ("interconnect: qcom: add msm8974 driver")
Cc: stable@vger.kernel.org      # 5.5
Reviewed-by: Brian Masney <bmasney@redhat.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20230306075651.2449-12-johan+linaro@kernel.org
Signed-off-by: Georgi Djakov <djakov@kernel.org>
2023-03-13 21:13:48 +02:00
Johan Hovold
74240a5beb interconnect: qcom: rpmh: fix registration race
The current interconnect provider registration interface is inherently
racy as nodes are not added until the after adding the provider. This
can specifically cause racing DT lookups to fail.

Switch to using the new API where the provider is not registered until
after it has been fully initialised.

Fixes: 976daac4a1 ("interconnect: qcom: Consolidate interconnect RPMh support")
Cc: stable@vger.kernel.org      # 5.7
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20230306075651.2449-11-johan+linaro@kernel.org
Signed-off-by: Georgi Djakov <djakov@kernel.org>
2023-03-13 21:13:48 +02:00
Johan Hovold
6570d1d46e interconnect: qcom: rpmh: fix probe child-node error handling
Make sure to clean up and release resources properly also in case probe
fails when populating child devices.

Fixes: 57eb14779d ("interconnect: qcom: icc-rpmh: Support child NoC device probe")
Cc: stable@vger.kernel.org      # 6.0
Cc: Luca Weiss <luca.weiss@fairphone.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20230306075651.2449-10-johan+linaro@kernel.org
Signed-off-by: Georgi Djakov <djakov@kernel.org>
2023-03-13 21:13:47 +02:00
Johan Hovold
90ae93d8af interconnect: qcom: rpm: fix registration race
The current interconnect provider registration interface is inherently
racy as nodes are not added until the after adding the provider. This
can specifically cause racing DT lookups to fail.

Switch to using the new API where the provider is not registered until
after it has been fully initialised.

Fixes: 62feb14ee8 ("interconnect: qcom: Consolidate interconnect RPM support")
Fixes: 30c8fa3ec6 ("interconnect: qcom: Add MSM8916 interconnect provider driver")
Cc: stable@vger.kernel.org	# 5.7
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Jun Nie <jun.nie@linaro.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Link: https://lore.kernel.org/r/20230306075651.2449-9-johan+linaro@kernel.org
Signed-off-by: Georgi Djakov <djakov@kernel.org>
2023-03-13 21:12:16 +02:00
Jurica Vukadin
ee06a3ef7e kconfig: Update config changed flag before calling callback
Prior to commit 5ee5465940 ("kconfig: change sym_change_count to a
boolean flag"), the conf_updated flag was set to the new value *before*
calling the callback. xconfig's save action depends on this behaviour,
because xconfig calls conf_get_changed() directly from the callback and
now sees the old value, thus never enabling the save button or the
shortcut.

Restore the previous behaviour.

Fixes: 5ee5465940 ("kconfig: change sym_change_count to a boolean flag")
Signed-off-by: Jurica Vukadin <jura@vukad.in>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-03-14 03:46:09 +09:00
Xiao Ni
3e45352259 md: Free resources in __md_stop
If md_run() fails after ->active_io is initialized, then percpu_ref_exit
is called in error path. However, later md_free_disk will call
percpu_ref_exit again which leads to a panic because of null pointer
dereference. It can also trigger this bug when resources are initialized
but are freed in error path, then will be freed again in md_free_disk.

BUG: kernel NULL pointer dereference, address: 0000000000000038
Oops: 0000 [#1] PREEMPT SMP
Workqueue: md_misc mddev_delayed_delete
RIP: 0010:free_percpu+0x110/0x630
Call Trace:
 <TASK>
 __percpu_ref_exit+0x44/0x70
 percpu_ref_exit+0x16/0x90
 md_free_disk+0x2f/0x80
 disk_release+0x101/0x180
 device_release+0x84/0x110
 kobject_put+0x12a/0x380
 kobject_put+0x160/0x380
 mddev_delayed_delete+0x19/0x30
 process_one_work+0x269/0x680
 worker_thread+0x266/0x640
 kthread+0x151/0x1b0
 ret_from_fork+0x1f/0x30

For creating raid device, md raid calls do_md_run->md_run, dm raid calls
md_run. We alloc those memory in md_run. For stopping raid device, md raid
calls do_md_stop->__md_stop, dm raid calls md_stop->__md_stop. So we can
free those memory resources in __md_stop.

Fixes: 72adae23a7 ("md: Change active_io to percpu")
Reported-and-tested-by: Yu Kuai <yukuai3@huawei.com>
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Song Liu <song@kernel.org>
2023-03-13 10:56:54 -07:00
Linus Torvalds
fc89d7fb49 virtio,vhost,vdpa: bugfixes
Some fixes accumulated so far.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmQOwvcPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRp2msH/0EwXAZCieuUnRU2c3wGEYnP+wnQbIBH3aUJ
 mpC3vFR9Bdo2bhxYcRjSJ+tUqWmUPn+0XZ8dYR9zu4lDli69cbWflij0+AJ/iGEe
 Rm6vhzq9KB4CKRulaW7gEdY+nXaIdoCAhdiI8Ka5QHd17wtnQ7X8g7HJRk+6m8hy
 tMfm8osrDXcHSGClipHKkw63n2vflAwA0cLfQjZYec4gwIMqOIDrztvvPJorxTLo
 NNHUF1wzyYCiVoWotDjfNcEiYULbSSE3Gau8KDyF6Y1VrkfyakXDAiH77GlttPuv
 j7DGIuWUGn84/z51C2HY164ux/bfWwKUHntGk6NvliZDWqeo718=
 =6O4V
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio fixes from Michael Tsirkin:
 "Some virtio / vhost / vdpa fixes accumulated so far"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  tools/virtio: Ignore virtio-trace/trace-agent
  vdpa_sim: set last_used_idx as last_avail_idx in vdpasim_queue_ready
  vhost-vdpa: free iommu domain after last use during cleanup
  vdpa/mlx5: should not activate virtq object when suspended
  vp_vdpa: fix the crash in hot unplug with vp_vdpa
2023-03-13 10:43:09 -07:00
Dionna Glaze
72f7754dcf virt/coco/sev-guest: Add throttling awareness
A potentially malicious SEV guest can constantly hammer the hypervisor
using this driver to send down requests and thus prevent or at least
considerably hinder other guests from issuing requests to the secure
processor which is a shared platform resource.

Therefore, the host is permitted and encouraged to throttle such guest
requests.

Add the capability to handle the case when the hypervisor throttles
excessive numbers of requests issued by the guest. Otherwise, the VM
platform communication key will be disabled, preventing the guest from
attesting itself.

Realistically speaking, a well-behaved guest should not even care about
throttling. During its lifetime, it would end up issuing a handful of
requests which the hardware can easily handle.

This is more to address the case of a malicious guest. Such guest should
get throttled and if its VMPCK gets disabled, then that's its own
wrongdoing and perhaps that guest even deserves it.

To the implementation: the hypervisor signals with SNP_GUEST_REQ_ERR_BUSY
that the guest requests should be throttled. That error code is returned
in the upper 32-bit half of exitinfo2 and this is part of the GHCB spec
v2.

So the guest is given a throttling period of 1 minute in which it
retries the request every 2 seconds. This is a good default but if it
turns out to not pan out in practice, it can be tweaked later.

For safety, since the encryption algorithm in GHCBv2 is AES_GCM, control
must remain in the kernel to complete the request with the current
sequence number. Returning without finishing the request allows the
guest to make another request but with different message contents. This
is IV reuse, and breaks cryptographic protections.

  [ bp:
    - Rewrite commit message and do a simplified version.
    - The stable tags are supposed to denote that a cleanup should go
      upfront before backporting this so that any future fixes to this
      can preserve the sanity of the backporter(s). ]

Fixes: d5af44dde5 ("x86/sev: Provide support for SNP guest request NAEs")
Signed-off-by: Dionna Glaze <dionnaglaze@google.com>
Co-developed-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: <stable@kernel.org> # d6fd48eff7 ("virt/coco/sev-guest: Check SEV_SNP attribute at probe time")
Cc: <stable@kernel.org> # 970ab82374 (" virt/coco/sev-guest: Simplify extended guest request handling")
Cc: <stable@kernel.org> # c5a338274b ("virt/coco/sev-guest: Remove the disable_vmpck label in handle_guest_request()")
Cc: <stable@kernel.org> # 0fdb6cc7c8 ("virt/coco/sev-guest: Carve out the request issuing logic into a helper")
Cc: <stable@kernel.org> # d25bae7dc7 ("virt/coco/sev-guest: Do some code style cleanups")
Cc: <stable@kernel.org> # fa4ae42cc6 ("virt/coco/sev-guest: Convert the sw_exit_info_2 checking to a switch-case")
Link: https://lore.kernel.org/r/20230214164638.1189804-2-dionnaglaze@google.com
2023-03-13 13:29:27 +01:00