linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-11 04:18:39 +08:00

Author	SHA1	Message	Date
Dave Airlie	1a6bbc4d9e	Driver Changes: - Fix ccs_mode setting for Xe2 and later (Balasubramani) - Synchronize ccs_mode setting with client creation (Balasubramani) - Apply scheduling WA for LNL in additional places as needed (Nirmoy) - Fix leak and lock handling in error paths of xe_exec ioctl (Matthew Brost) - Fix GGTT allocation leak leading to eventual crash in SR-IOV (Michal Wajdeczko) - Move run_ticks update out of job handling to avoid synchronization with reader (Lucas) -----BEGIN PGP SIGNATURE----- iQJNBAABCAA3FiEE6rM8lpABPHM5FqyDm6KlpjDL6lMFAmcuLUQZHGx1Y2FzLmRl bWFyY2hpQGludGVsLmNvbQAKCRCboqWmMMvqU3HEEACI04W0GFdk1ix9MS2vuK7U r0phWoaP4+29vfWG9BRN3S6jisAt2Y1+DBHCP78C7naV/FvHW8aJQ78cHEJ9DjF+ /OpzpErfd+BD2DSSOBRNcQJ3eqe7JVjy2Q4kV41+qkKqqOaN99L33vFexCXOu7Je wFOXn33pmrsFCp9mb6xBhLYGo8tFS5IctHvBajGFHoUQzqfA/WkD/g67x9qKSzYv BThYOm066jcWGX0LaKHCw6slQ/OeZE78q2gaQraEpCyw4dJg7QBTlrVU5662xCnI RXDv+adzsHGWah77vrnBKFBRAD66qBHj4ntecHesnhuBv31EJdqWlLb5baoiE5RN wuPm57hPJ39tsB/KxqUlTmFiBCWQ74Px4o6nRKcnme2N0m/H8xFTp7c45SRwJZ+X RhQPnv14fkLN3e4TaWbr8at89Gapk1R8cXRytZG4gwkrfFfLEGT1ufBV6lP3Pn0n 6DQvtNrKR2qGO1JM+R5MXIzzdFFqRuI/8+GX9dvs/5fFTLbTUCckyU9ZM9JXFKIl 2ShK7eSFhUPqw2p3zbQzyxQqkopWORQhRkzUmCm8e9WQLE9uBpaFbkeIzu36atY2 pjRElc6OtTlxCN1KcBnOk5UvTbEC8MxNdWjB5MZJd8zaxm5DVbgGbxp5YmU8K/gG GV1YXZlniVpdhixJFFoypQ== =jirT -----END PGP SIGNATURE----- Merge tag 'drm-xe-fixes-2024-11-08' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - Fix ccs_mode setting for Xe2 and later (Balasubramani) - Synchronize ccs_mode setting with client creation (Balasubramani) - Apply scheduling WA for LNL in additional places as needed (Nirmoy) - Fix leak and lock handling in error paths of xe_exec ioctl (Matthew Brost) - Fix GGTT allocation leak leading to eventual crash in SR-IOV (Michal Wajdeczko) - Move run_ticks update out of job handling to avoid synchronization with reader (Lucas) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/4ffcebtluaaaohquxfyf5babpihmtscxwad3jjmt5nggwh2xpm@ztw67ucywttg	2024-11-09 05:14:29 +10:00
Dave Airlie	9b984a71c2	Short summary of fixes pull: imagination: - Track PVR context per file - Break ref-counting cycle panel-orientation-quirks: - Fix matching Lenovo Yoga Tab 3 X90F panthor: - Lock VM array - Be strict about I/O mapping flags -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmct0MAACgkQaA3BHVML eiNK5wgAoiqbLQapezO0gwMwyPjpT2WGyX7lcBpPjhJRtn1y/rsLMx2P9oXaMgxa VdKhH5ONYPx/uWwc+V/BzdgUPoiShOUmIIVBwF/AXF+kSvMlWMCV/eRh/7x6Txkl kAPxVY/a10JY54gMHayzMUakXEoHmeZ9wV1tweQLZLIzbCa54i93DdbXNLF8heE9 oTLbZ1L8CXJojZ8ftD5YnTX37u/igaFLcoMmimBAKi3vYS/kdy3LQ5Q0k7hFI/Sq LMhN4/4gsjLnM1lpH3nNjkZBI22CpKptkebNaR9GiUq7EuLA8PELiTCR2luc/RcZ g2TldqnogoPE1hlnroXVBLuRLvjoXQ== =nX96 -----END PGP SIGNATURE----- Merge tag 'drm-misc-fixes-2024-11-08' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: imagination: - Track PVR context per file - Break ref-counting cycle panel-orientation-quirks: - Fix matching Lenovo Yoga Tab 3 X90F panthor: - Lock VM array - Be strict about I/O mapping flags Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20241108085058.GA37468@linux.fritz.box	2024-11-09 05:14:17 +10:00
Jann Horn	f432a1621f	drm/panthor: Be stricter about IO mapping flags The current panthor_device_mmap_io() implementation has two issues: 1. For mapping DRM_PANTHOR_USER_FLUSH_ID_MMIO_OFFSET, panthor_device_mmap_io() bails if VM_WRITE is set, but does not clear VM_MAYWRITE. That means userspace can use mprotect() to make the mapping writable later on. This is a classic Linux driver gotcha. I don't think this actually has any impact in practice: When the GPU is powered, writes to the FLUSH_ID seem to be ignored; and when the GPU is not powered, the dummy_latest_flush page provided by the driver is deliberately designed to not do any flushes, so the only thing writing to the dummy_latest_flush could achieve would be to make more flushes happen. 2. panthor_device_mmap_io() does not block MAP_PRIVATE mappings (which are mappings without the VM_SHARED flag). MAP_PRIVATE in combination with VM_MAYWRITE indicates that the VMA has copy-on-write semantics, which for VM_PFNMAP are semi-supported but fairly cursed. In particular, in such a mapping, the driver can only install PTEs during mmap() by calling remap_pfn_range() (because remap_pfn_range() wants to store the physical address of the mapped physical memory into the vm_pgoff of the VMA); installing PTEs later on with a fault handler (as panthor does) is not supported in private mappings, and so if you try to fault in such a mapping, vmf_insert_pfn_prot() splats when it hits a BUG() check. Fix it by clearing the VM_MAYWRITE flag (userspace writing to the FLUSH_ID doesn't make sense) and requiring VM_SHARED (copy-on-write semantics for the FLUSH_ID don't make sense). Reproducers for both scenarios are in the notes of my patch on the mailing list; I tested that these bugs exist on a Rock 5B machine. Note that I only compile-tested the patch, I haven't tested it; I don't have a working kernel build setup for the test machine yet. Please test it before applying it. Cc: stable@vger.kernel.org Fixes: `5fe909cae1` ("drm/panthor: Add the device logical block") Signed-off-by: Jann Horn <jannh@google.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241105-panthor-flush-page-fixes-v1-1-829aaf37db93@google.com	2024-11-07 16:39:53 +00:00
Liviu Dudau	444fa5b100	drm/panthor: Lock XArray when getting entries for the VM Similar to commit `cac075706f` ("drm/panthor: Fix race when converting group handle to group object") we need to use the XArray's internal locking when retrieving a vm pointer from there. v2: Removed part of the patch that was trying to protect fetching the heap pointer from XArray, as that operation is protected by the @pool->lock. Fixes: `647810ec24` ("drm/panthor: Add the MMU/VM logical block") Reported-by: Jann Horn <jannh@google.com> Cc: stable@vger.kernel.org Signed-off-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241106185806.389089-1-liviu.dudau@arm.com	2024-11-07 15:23:54 +00:00
Hans de Goede	052ef642bd	drm: panel-orientation-quirks: Make Lenovo Yoga Tab 3 X90F DMI match less strict There are 2G and 4G RAM versions of the Lenovo Yoga Tab 3 X90F and it turns out that the 2G version has a DMI product name of "CHERRYVIEW D1 PLATFORM" where as the 4G version has "CHERRYVIEW C0 PLATFORM". The sys-vendor + product-version check are unique enough that the product-name check is not necessary. Drop the product-name check so that the existing DMI match for the 4G RAM version also matches the 2G RAM version. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240825132131.6643-1-hdegoede@redhat.com	2024-11-07 16:16:42 +01:00
Lucas De Marchi	514447a121	drm/xe: Stop accumulating LRC timestamp on job_free The exec queue timestamp is only really useful when it's being queried through the fdinfo. There's no need to update it so often, on every job_free. Tracing a simple app like vkcube running shows an update rate of ~ 120Hz. In case of discrete, the BO is on vram, creating a lot of pcie transactions. The update on job_free() is used to cover a gap: if exec queue is created and destroyed rapidly, before a new query, the timestamp still needs to be accumulated and accounted for in the xef. Initial implementation in commit `6109f24f87` ("drm/xe: Add helper to accumulate exec queue runtime") couldn't do it on the exec_queue_fini since the xef could be gone at that point. However since commit `ce8c161cba` ("drm/xe: Add ref counting for xe_file") the xef is refcounted and the exec queue always holds a reference, making this safe now. Improve the fix in commit `2149ded630` ("drm/xe: Fix use after free when client stats are captured") by reducing the frequency in which the update is needed. Fixes: `2149ded630` ("drm/xe: Fix use after free when client stats are captured") Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241104143815.2112272-3-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 83db047d9425d9a649f01573797558eff0f632e1) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-11-05 15:40:13 -08:00
Michal Wajdeczko	a353c78459	drm/xe/pf: Fix potential GGTT allocation leak In unlikely event that we fail during sending the new VF GGTT configuration to the GuC, we will free only the GGTT node data struct but will miss to release the actual GGTT allocation. This will later lead to list corruption, GGTT space leak and finally risking crash when unloading the driver: [ ] ... [drm] GT0: PF: Failed to provision VF1 with 1073741824 (1.00 GiB) GGTT (-EIO) [ ] ... [drm] GT0: PF: VF1 provisioning remains at 0 (0 B) GGTT [ ] list_add corruption. next->prev should be prev (ffff88813cfcd628), but was 0000000000000000. (next=ffff88813cfe2028). [ ] RIP: 0010:__list_add_valid_or_report+0x6b/0xb0 [ ] Call Trace: [ ] drm_mm_insert_node_in_range+0x2c0/0x4e0 [ ] xe_ggtt_node_insert+0x46/0x70 [xe] [ ] pf_provision_vf_ggtt+0x7f5/0xa70 [xe] [ ] xe_gt_sriov_pf_config_set_ggtt+0x5e/0x770 [xe] [ ] ggtt_set+0x4b/0x70 [xe] [ ] simple_attr_write_xsigned.constprop.0.isra.0+0xb0/0x110 [ ] ... [drm] GT0: PF: Failed to provision VF1 with 1073741824 (1.00 GiB) GGTT (-ENOSPC) [ ] ... [drm] GT0: PF: VF1 provisioning remains at 0 (0 B) GGTT [ ] Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b7b: 0000 [#1] PREEMPT SMP NOPTI [ ] RIP: 0010:drm_mm_remove_node+0x1b7/0x390 [ ] Call Trace: [ ] <TASK> [ ] ? die_addr+0x2e/0x80 [ ] ? exc_general_protection+0x1a1/0x3e0 [ ] ? asm_exc_general_protection+0x22/0x30 [ ] ? drm_mm_remove_node+0x1b7/0x390 [ ] ggtt_node_remove+0xa5/0xf0 [xe] [ ] xe_ggtt_node_remove+0x35/0x70 [xe] [ ] xe_ttm_bo_destroy+0x123/0x220 [xe] [ ] intel_user_framebuffer_destroy+0x44/0x70 [xe] [ ] intel_plane_destroy_state+0x3b/0xc0 [xe] [ ] drm_atomic_state_default_clear+0x1cd/0x2f0 [ ] intel_atomic_state_clear+0x9/0x20 [xe] [ ] __drm_atomic_state_free+0x1d/0xb0 Fix that by using pf_release_ggtt() on the error path, which now works regardless if the node has GGTT allocation or not. Fixes: `34e804220f` ("drm/xe: Make xe_ggtt_node struct independent") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241104144901.1903-1-michal.wajdeczko@intel.com (cherry picked from commit 43b1dd2b550f0861ce80fbfffd5881b1b26272b1) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-11-05 15:40:12 -08:00
Matthew Brost	64a2b6ed4b	drm/xe: Drop VM dma-resv lock on xe_sync_in_fence_get failure in exec IOCTL Upon failure all locks need to be dropped before returning to the user. Fixes: `58480c1c91` ("drm/xe: Skip VMAs pin when requesting signal to the last XE_EXEC") Cc: <stable@vger.kernel.org> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241105043524.4062774-3-matthew.brost@intel.com (cherry picked from commit 7d1a4258e602ffdce529f56686925034c1b3b095) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-11-05 15:40:12 -08:00
Matthew Brost	af797b831d	drm/xe: Fix possible exec queue leak in exec IOCTL In a couple of places after an exec queue is looked up the exec IOCTL returns on input errors without dropping the exec queue ref. Fix this ensuring the exec queue ref is dropped on input error. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: <stable@vger.kernel.org> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241105043524.4062774-2-matthew.brost@intel.com (cherry picked from commit 07064a200b40ac2195cb6b7b779897d9377e5e6f) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-11-05 15:40:12 -08:00
Alex Deucher	4d75b94680	drm/amdgpu: add missing size check in amdgpu_debugfs_gprwave_read() Avoid a possible buffer overflow if size is larger than 4K. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f5d873f5825b40d886d03bd2aede91d4cf002434) Cc: stable@vger.kernel.org	2024-11-05 10:54:11 -05:00
Alex Deucher	f790a2c494	drm/amdgpu: Adjust debugfs eviction and IB access permissions Users should not be able to run these. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 7ba9395430f611cfc101b1c2687732baafa239d5) Cc: stable@vger.kernel.org	2024-11-05 10:53:48 -05:00
Alex Deucher	b46dadf7e3	drm/amdgpu: Adjust debugfs register access permissions Regular users shouldn't have read access. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit c0cfd2e652553d607b910be47d0cc5a7f3a78641) Cc: stable@vger.kernel.org	2024-11-05 10:53:21 -05:00
Lijo Lazar	3ce3f85787	drm/amdgpu: Fix DPX valid mode check on GC 9.4.3 For DPX mode, the number of memory partitions supported should be less than or equal to 2. Fixes: `1589c82a10` ("drm/amdgpu: Check memory ranges for valid xcp mode") Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 990c4f580742de7bb78fa57420ffd182fc3ab4cd) Cc: stable@vger.kernel.org	2024-11-05 10:52:40 -05:00
Thomas Zimmermann	e301aea030	Merge drm/drm-fixes into drm-misc-fixes Backmerging to get the latest fixes from v6.12-rc6. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2024-11-05 09:43:47 +01:00
Kenneth Feng	74e1006430	drm/amd/pm: correct the workload setting Correct the workload setting in order not to mix the setting with the end user. Update the workload mask accordingly. v2: changes as below: 1. the end user can not erase the workload from driver except default workload. 2. always shows the real highest priority workoad to the end user. 3. the real workload mask is combined with driver workload mask and end user workload mask. v3: apply this to the other ASICs as well. v4: simplify the code v5: refine the code based on the review comments. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 8cc438be5d49b8326b2fcade0bdb7e6a97df9e0b) Cc: stable@vger.kernel.org # 6.11.x	2024-11-04 12:51:01 -05:00
Kenneth Feng	1356bfc54c	drm/amd/pm: always pick the pptable from IFWI always pick the pptable from IFWI on smu v14.0.2/3 Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 136ce12bd5907388cb4e9aa63ee5c9c8c441640b) Cc: stable@vger.kernel.org # 6.11.x	2024-11-04 12:48:51 -05:00
Antonio Quartulli	a6dd15981c	drm/amdgpu: prevent NULL pointer dereference if ATIF is not supported acpi_evaluate_object() may return AE_NOT_FOUND (failure), which would result in dereferencing buffer.pointer (obj) while being NULL. Although this case may be unrealistic for the current code, it is still better to protect against possible bugs. Bail out also when status is AE_NOT_FOUND. This fixes 1 FORWARD_NULL issue reported by Coverity Report: CID 1600951: Null pointer dereferences (FORWARD_NULL) Signed-off-by: Antonio Quartulli <antonio@mandelbit.com> Fixes: c9b7c809b89f ("drm/amd: Guard against bad data for ATIF ACPI method") Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20241031152848.4716-1-antonio@mandelbit.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 91c9e221fe2553edf2db71627d8453f083de87a1) Cc: stable@vger.kernel.org	2024-11-04 12:48:21 -05:00
Aurabindo Pillai	694c79769c	drm/amd/display: parse umc_info or vram_info based on ASIC An upstream bug report suggests that there are production dGPUs that are older than DCN401 but still have a umc_info in VBIOS tables with the same version as expected for a DCN401 product. Hence, reading this tables should be guarded with a version check. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3678 Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 2551b4a321a68134360b860113dd460133e856e5) Fixes: `00c391102a` ("drm/amd/display: Add misc DC changes for DCN401") Cc: stable@vger.kernel.org # 6.11.x	2024-11-04 12:44:07 -05:00
Tom Chung	4f26c95ffc	drm/amd/display: Fix brightness level not retained over reboot [Why] During boot up and resume the DC layer will reset the panel brightness to fix a flicker issue. It will cause the dm->actual_brightness is not the current panel brightness level. (the dm->brightness is the correct panel level) [How] Set the backlight level after do the set mode. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Fixes: `d9e865826c` ("drm/amd/display: Simplify brightness initialization") Reported-by: Mark Herbert <mark.herbert42@gmail.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3655 Reviewed-by: Sun peng Li <sunpeng.li@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 7875afafba84817b791be6d2282b836695146060) Cc: stable@vger.kernel.org	2024-11-04 12:43:03 -05:00
Nirmoy Das	1491efb39a	drm/xe/guc/tlb: Flush g2h worker in case of tlb timeout Flush the g2h worker explicitly if TLB timeout happens which is observed on LNL and that points to the recent scheduling issue with E-cores on LNL. This is similar to the recent fix: commit e51527233804 ("drm/xe/guc/ct: Flush g2h worker in case of g2h response timeout") and should be removed once there is E core scheduling fix. v2: Add platform check(Himal) v3: Remove gfx platform check as the issue related to cpu platform(John) Use the common WA macro(John) and print when the flush resolves timeout(Matt B) v4: Remove the resolves log and do the flush before taking pending_lock(Matt A) Cc: Badal Nilawar <badal.nilawar@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: stable@vger.kernel.org # v6.11+ Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2687 Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241029120117.449694-3-nirmoy.das@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit e1f6fa55664a0eeb0a641f497e1adfcf6672e995) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-11-04 08:12:30 -08:00
Nirmoy Das	7d1e2580ed	drm/xe/ufence: Flush xe ordered_wq in case of ufence timeout Flush xe ordered_wq in case of ufence timeout which is observed on LNL and that points to recent scheduling issue with E-cores. This is similar to the recent fix: commit e51527233804 ("drm/xe/guc/ct: Flush g2h worker in case of g2h response timeout") and should be removed once there is a E-core scheduling fix for LNL. v2: Add platform check(Himal) s/__flush_workqueue/flush_workqueue(Jani) v3: Remove gfx platform check as the issue related to cpu platform(John) v4: Use the Common macro(John) and print when the flush resolves timeout(Matt B) Cc: Badal Nilawar <badal.nilawar@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: stable@vger.kernel.org # v6.11+ Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2754 Suggested-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241029120117.449694-2-nirmoy.das@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 38c4c8722bd74452280951edc44c23de47612001) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-11-04 08:12:30 -08:00
Nirmoy Das	55e8a3f37e	drm/xe: Move LNL scheduling WA to xe_device.h Move LNL scheduling WA to xe_device.h so this can be used in other places without needing keep the same comment about removal of this WA in the future. The WA, which flushes work or workqueues, is now wrapped in macros and can be reused wherever needed. Cc: Badal Nilawar <badal.nilawar@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> cc: stable@vger.kernel.org # v6.11+ Suggested-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241029120117.449694-1-nirmoy.das@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit cbe006a6492c01a0058912ae15d473f4c149896c) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-11-04 08:12:30 -08:00
Balasubramani Vivekanandan	4b468a92dd	drm/xe: Use the filelist from drm for ccs_mode change Drop the exclusive client count tracking and use the filelist from the drm to track the active clients. This also ensures the clients created internally by the driver won't block changing the ccs mode. Fixes: `ce8c161cba` ("drm/xe: Add ref counting for xe_file") Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241008073628.377433-3-balasubramani.vivekanandan@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 1c35f1ed1fe3c649f8c16214d0d3dd828b5265d9) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-11-04 08:12:30 -08:00
Balasubramani Vivekanandan	7fd3fa006f	drm/xe: Set mask bits for CCS_MODE register CCS_MODE register requires setting mask bits from Xe2+ platforms. Set the mask bits unconditionally, as those bits are unused for older platforms. Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Cc: stable@vger.kernel.org # v6.11+ Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241008073628.377433-2-balasubramani.vivekanandan@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 23ea2c7572d4735ef66beb1e4feb8ae510b78247) [ Fix conflict with mmio refactors ] Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-11-04 08:03:40 -08:00
Brendan King	b04ce1e718	drm/imagination: Break an object reference loop When remaining resources are being cleaned up on driver close, outstanding VM mappings may result in resources being leaked, due to an object reference loop, as shown below, with each object (or set of objects) referencing the object below it: PVR GEM Object GPU scheduler "finished" fence GPU scheduler “scheduled” fence PVR driver “done” fence PVR Context PVR VM Context PVR VM Mappings PVR GEM Object The reference that the PVR VM Context has on the VM mappings is a soft one, in the sense that the freeing of outstanding VM mappings is done as part of VM context destruction; no reference counts are involved, as is the case for all the other references in the loop. To break the reference loop during cleanup, free the outstanding VM mappings before destroying the PVR Context associated with the VM context. Signed-off-by: Brendan King <brendan.king@imgtec.com> Signed-off-by: Matt Coster <matt.coster@imgtec.com> Reviewed-by: Frank Binns <frank.binns@imgtec.com> Cc: stable@vger.kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/8a25924f-1bb7-4d9a-a346-58e871dfb1d1@imgtec.com	2024-11-04 09:41:38 +00:00
Brendan King	b0ef514bc6	drm/imagination: Add a per-file PVR context list This adds a linked list of VM contexts which is needed for the next patch to be able to correctly track VM contexts for destruction on file close. It is only safe for VM contexts to be removed from the list and destroyed when not in interrupt context. Signed-off-by: Brendan King <brendan.king@imgtec.com> Signed-off-by: Matt Coster <matt.coster@imgtec.com> Reviewed-by: Frank Binns <frank.binns@imgtec.com> Cc: stable@vger.kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/e57128ea-f0ce-4e93-a9d4-3f033a8b06fa@imgtec.com	2024-11-04 09:41:32 +00:00
Dave Airlie	f99c7cca2f	Driver Changes: - Fix missing HPD interrupt enabling, bringing one PM refactor with it (Imre / Maarten) - Workaround LNL GGTT invalidation not being visible to GuC (Matthew Brost) - Avoid getting jobs stuck without a protecting timeout (Matthew Brost) -----BEGIN PGP SIGNATURE----- iQJNBAABCAA3FiEE6rM8lpABPHM5FqyDm6KlpjDL6lMFAmckr1wZHGx1Y2FzLmRl bWFyY2hpQGludGVsLmNvbQAKCRCboqWmMMvqU5mUD/9ldyPILkvMvGT6yEBf7HWJ HwEs8oqUkN/NDxOXySrkiVPESLB9PH/SWo6YW7L0X7DkFq1rEEB6eWbEu4ZQ6Oef Q9wwx3XZ2oZwi5A/oa2mT9mvxN53vRVgEaKtPxej42N6zVcbtJgre9sfqa8MwF5u HNHP8tcKZNf8i+kzIEPrz33wLRH+Ecvkbp1k2E//cSHIWcsaq9ZmG8pnH7x8X/91 fo0uLpTKhSJZzs5QygxnQ3X9BgIDgHpCu0mlTcmHBsH4wPr3u2uSdQ2cT97zZyiT 7fATg/QdgwXH8lsmnbzPj0LjQd+AK4eLS3nD1PKzBN3+/5ZVy6ZVol6K41qUDNSC Tzzgo4/NOVfcRdyiWe6Elj07Snent8gtNGGphp1vAtUns8kGcJw3gDcOpX3cYHxa u+DQP7zasWwPAF2lqwApMHFZGe/zoH3w7zUZYCmGP4IImC3VM8NBaM4No6VNIvco +08yJUFBifS/4qYISYCwO9H7Dt+10WDxHrzF6w6sA89dL4mBf4fGvcfnxE33NuWn RIhfQ+WLC/vIzRwbOGmvAhbG4Owxjsacq7RQOUc1azJ66cCPCQCNfVGx6qf5Q23A 0KbDoQXVlegKOz7z5m/ORXQO6A/Ien2VABLn0t0bjmoP7S57UFfiENn2P0c+UeNZ GJUi4QS5Ssdjq/DK6awoRw== =kIfk -----END PGP SIGNATURE----- Merge tag 'drm-xe-fixes-2024-10-31' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - Fix missing HPD interrupt enabling, bringing one PM refactor with it (Imre / Maarten) - Workaround LNL GGTT invalidation not being visible to GuC (Matthew Brost) - Avoid getting jobs stuck without a protecting timeout (Matthew Brost) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/tsbftadm7owyizzdaqnqu7u4tqggxgeqeztlfvmj5fryxlfomi@5m5bfv2zvzmw	2024-11-02 04:44:27 +10:00
Dave Airlie	427360718e	Mediatek DRM Fixes - 20241028 1. Fix degradation problem of alpha blending 2. Fix color format MACROs in OVL 3. Fix get efuse issue for MT8188 DPTX 4. Fix potential NULL dereference in mtk_crtc_destroy() 5. Correct dpi power-domains property 6. Add split subschema property constraints -----BEGIN PGP SIGNATURE----- iQJMBAABCgA2FiEEACwLKSDmq+9RDv5P4cpzo8lZTiQFAmcfllQYHGNodW5rdWFu Zy5odUBrZXJuZWwub3JnAAoJEOHKc6PJWU4kS1QP/ju0jufnTZsS6m61yNKTJ/wA LjZeGOmtsbPOQc+++fD+5M4gAj9V3tuFPyfmqSRyLWj3PHDg05FY23XkFnKEDNMH n0S5i/097ZaS4fqDZcIy2A8+rOO+72aX+noLZaeMNT9V1k6EH08fycQ1CWP/fc0b 3tS9e9f7lYzO+Bf0h7TophXJ81Dga/o8hj+16qo0eSjz4vS+49W/Ir30gp8UprfK i/p5TwoAApDfe5VTOnXgZlWfjVBCoSvWHD9nX8uRw9RXunf+B0GWYe5LcKRLXtFv JEuf8+D3lO2CEuLLfxtiXsIKA9uKkHkmrpCQ++2aLFnqav/mA1tkCN38mvtxtlCs Aqb2ng19NWNtP7Fhz/N8AELLgoQXcs1KZpYvAJkWeHjQXrxWtFMvSCa8mtQImtwT aG7QvONeBHmwIqHZNQYXTE337qcua0Wz06jYv8pgnUrwdqBrwVtq6qHUPk+TR3/G GNbM4LQYTEBFoBY1gUjgs4kbrvtTWBoYxcaMdW2wpYELCJ1trbvbzk9CA7RPjRWM 4SH1OFwICv2wz6OWoywgJLLYTVC3PJf/c1bfew4AZsMHW4TSKnctXdd5ug5WSwp1 tvRRr36a5gjMxFL4zOx0yPGIIVavlKNdM14ntyH2fo1v+KC+NJtWbYGlMwnsVvMK AXzr6lalO2jfvjxy2mg9 =F/S2 -----END PGP SIGNATURE----- Merge tag 'mediatek-drm-fixes-20241028' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes Mediatek DRM Fixes - 20241028 1. Fix degradation problem of alpha blending 2. Fix color format MACROs in OVL 3. Fix get efuse issue for MT8188 DPTX 4. Fix potential NULL dereference in mtk_crtc_destroy() 5. Correct dpi power-domains property 6. Add split subschema property constraints Signed-off-by: Dave Airlie <airlied@redhat.com> From: Chun-Kuang Hu <chunkuang.hu@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20241028135846.3570-1-chunkuang.hu@kernel.org	2024-11-01 07:34:21 +10:00
Dave Airlie	8594a2d8d7	amd-drm-fixes-6.12-2024-10-31: amdgpu: - DCN 3.5 fix - Vangogh SMU KASAN fix - SMU 13 profile reporting fix -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZyOePQAKCRC93/aFa7yZ 2Jl4AP0WgPXMmGM6EBgf5CvYjxW0ZlKb8CZW05LiW7q0qHttHgD/erb+s37Z0jtl XGZeUmZ2EvV0hxr7KH9jxjHG2pD6/gM= =Ls48 -----END PGP SIGNATURE----- Merge tag 'amd-drm-fixes-6.12-2024-10-31' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.12-2024-10-31: amdgpu: - DCN 3.5 fix - Vangogh SMU KASAN fix - SMU 13 profile reporting fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241031151539.3523633-1-alexander.deucher@amd.com	2024-11-01 07:25:41 +10:00
Matthew Brost	fe05cee4d9	drm/xe: Don't short circuit TDR on jobs not started Short circuiting TDR on jobs not started is an optimization which is not required. On LNL we are facing an issue where jobs do not get scheduled by the GuC if it misses a GGTT page update. When this occurs let the TDR fire, toggle the scheduling which may get the job unstuck, and print a warning message. If the TDR fires twice on job that hasn't started, timeout the job. v2: - Add warning message (Paulo) - Add fixes tag (Paulo) - Timeout job which hasn't started after TDR firing twice v3: - Include local change v4: - Short circuit check_timeout on job not started - use warn level rather than notice (Paulo) Fixes: `7ddb9403dd` ("drm/xe: Sample ctx timestamp to determine if jobs have timed out") Cc: stable@vger.kernel.org Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241025214330.2010521-2-matthew.brost@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 35d25a4a0012e690ef0cc4c5440231176db595cc) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-31 07:03:14 -07:00
Matthew Brost	993ca0ecce	drm/xe: Add mmio read before GGTT invalidate On LNL without a mmio read before a GGTT invalidate the GuC can incorrectly read the GGTT scratch page upon next access leading to jobs not getting scheduled. A mmio read before a GGTT invalidate seems to fix this. Since a GGTT invalidate is not a hot code path, blindly do a mmio read before each GGTT invalidate. Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: stable@vger.kernel.org Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Reported-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3164 Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241023221200.1797832-1-matthew.brost@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 5a710196883e0ac019ac6df2a6d79c16ad3c32fa) [ Fix conflict with mmio vs gt argument ] Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-31 07:02:42 -07:00
Jinjie Ruan	add4163aca	drm/tests: hdmi: Fix memory leaks in drm_display_mode_from_cea_vic() modprobe drm_hdmi_state_helper_test and then rmmod it, the following memory leak occurs. The `mode` allocated in drm_mode_duplicate() called by drm_display_mode_from_cea_vic() is not freed, which cause the memory leak: unreferenced object 0xffffff80ccd18100 (size 128): comm "kunit_try_catch", pid 1851, jiffies 4295059695 hex dump (first 32 bytes): 57 62 00 00 80 02 90 02 f0 02 20 03 00 00 e0 01 Wb........ ..... ea 01 ec 01 0d 02 00 00 0a 00 00 00 00 00 00 00 ................ backtrace (crc c2f1aa95): [<000000000f10b11b>] kmemleak_alloc+0x34/0x40 [<000000001cd4cf73>] __kmalloc_cache_noprof+0x26c/0x2f4 [<00000000f1f3cffa>] drm_mode_duplicate+0x44/0x19c [<000000008cbeef13>] drm_display_mode_from_cea_vic+0x88/0x98 [<0000000019daaacf>] 0xffffffedc11ae69c [<000000000aad0f85>] kunit_try_run_case+0x13c/0x3ac [<00000000a9210bac>] kunit_generic_run_threadfn_adapter+0x80/0xec [<000000000a0b2e9e>] kthread+0x2e8/0x374 [<00000000bd668858>] ret_from_fork+0x10/0x20 ...... Free `mode` by using drm_kunit_display_mode_from_cea_vic() to fix it. Cc: stable@vger.kernel.org Fixes: `4af70f19e5` ("drm/tests: Add RGB Quantization tests") Acked-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241030023504.530425-4-ruanjinjie@huawei.com Signed-off-by: Maxime Ripard <mripard@kernel.org>	2024-10-31 10:31:35 +01:00
Jinjie Ruan	926163342a	drm/connector: hdmi: Fix memory leak in drm_display_mode_from_cea_vic() modprobe drm_connector_test and then rmmod drm_connector_test, the following memory leak occurs. The `mode` allocated in drm_mode_duplicate() called by drm_display_mode_from_cea_vic() is not freed, which cause the memory leak: unreferenced object 0xffffff80cb0ee400 (size 128): comm "kunit_try_catch", pid 1948, jiffies 4294950339 hex dump (first 32 bytes): 14 44 02 00 80 07 d8 07 04 08 98 08 00 00 38 04 .D............8. 3c 04 41 04 65 04 00 00 05 00 00 00 00 00 00 00 <.A.e........... backtrace (crc 90e9585c): [<00000000ec42e3d7>] kmemleak_alloc+0x34/0x40 [<00000000d0ef055a>] __kmalloc_cache_noprof+0x26c/0x2f4 [<00000000c2062161>] drm_mode_duplicate+0x44/0x19c [<00000000f96c74aa>] drm_display_mode_from_cea_vic+0x88/0x98 [<00000000d8f2c8b4>] 0xffffffdc982a4868 [<000000005d164dbc>] kunit_try_run_case+0x13c/0x3ac [<000000006fb23398>] kunit_generic_run_threadfn_adapter+0x80/0xec [<000000006ea56ca0>] kthread+0x2e8/0x374 [<000000000676063f>] ret_from_fork+0x10/0x20 ...... Free `mode` by using drm_kunit_display_mode_from_cea_vic() to fix it. Cc: stable@vger.kernel.org Fixes: `abb6f74973` ("drm/tests: Add HDMI TDMS character rate tests") Acked-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241030023504.530425-3-ruanjinjie@huawei.com Signed-off-by: Maxime Ripard <mripard@kernel.org>	2024-10-31 10:31:34 +01:00
Jinjie Ruan	caa714f866	drm/tests: helpers: Add helper for drm_display_mode_from_cea_vic() As Maxime suggested, add a new helper drm_kunit_display_mode_from_cea_vic(), it can replace the direct call of drm_display_mode_from_cea_vic(), and it will help solving the `mode` memory leaks. Acked-by: Maxime Ripard <mripard@kernel.org> Suggested-by: Maxime Ripard <mripard@kernel.org> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241030023504.530425-2-ruanjinjie@huawei.com Signed-off-by: Maxime Ripard <mripard@kernel.org>	2024-10-31 10:31:34 +01:00
Boris Brezillon	4700fd3e05	drm/panthor: Report group as timedout when we fail to properly suspend If we don't do that, the group is considered usable by userspace, but all further GROUP_SUBMIT will fail with -EINVAL. Changes in v3: - Add R-bs Changes in v2: - New patch Fixes: `de85488138` ("drm/panthor: Add the scheduler logical block") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241029152912.270346-3-boris.brezillon@collabora.com	2024-10-30 16:37:18 +01:00
Boris Brezillon	412a2a8fdd	drm/panthor: Fail job creation when the group is dead Userspace can use GROUP_SUBMIT errors as a trigger to check the group state and recreate the group if it became unusable. Make sure we report an error when the group became unusable. Changes in v3: - None Changes in v2: - Add R-bs Fixes: `de85488138` ("drm/panthor: Add the scheduler logical block") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241029152912.270346-2-boris.brezillon@collabora.com	2024-10-30 16:37:09 +01:00
Boris Brezillon	5d01b56f05	drm/panthor: Fix firmware initialization on systems with a page size > 4k The system and GPU MMU page size might differ, which becomes a problem for FW sections that need to be mapped at explicit addresses since our PAGE_SIZE alignment might cover a VA range that's expected to be used for another section. Make sure we never map more than we need. Changes in v3: - Add R-bs Changes in v2: - Plan for per-VM page sizes so the MCU VM and user VM can have different pages sizes Fixes: `2718d91816` ("drm/panthor: Add the FW logical block") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241030150231.768949-1-boris.brezillon@collabora.com	2024-10-30 16:30:21 +01:00
Imre Deak	6a9d2e2988	drm/xe/display: Add missing HPD interrupt enabling during non-d3cold RPM resume Atm the display HPD interrupts that got disabled during runtime suspend, are re-enabled only if d3cold is enabled. Fix things by also re-enabling the interrupts if d3cold is disabled. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241009194358.1321200-5-imre.deak@intel.com (cherry picked from commit bbc4a30de095f0349d3c278500345a1b620d495e) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-29 07:12:56 -07:00
Imre Deak	dcb6c1d071	drm/xe/display: Separate the d3cold and non-d3cold runtime PM handling For clarity separate the d3cold and non-d3cold runtime PM handling. The only change in behavior is disabling polling later during runtime resume. This shouldn't make a difference, since the poll disabling is handled from a work, which could run at any point wrt. the runtime resume handler. The work will also require a runtime PM reference, syncing it with the resume handler. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241009194358.1321200-4-imre.deak@intel.com (cherry picked from commit a4de6beb83fc5adee788518350247c629568901e) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-29 07:12:56 -07:00
Maarten Lankhorst	25f2ff5383	drm/xe: Remove runtime argument from display s/r functions The previous change ensures that pm_suspend is only called when suspending or resuming. This ensures no further bugs like those in the previous commit. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Vinod Govindapillai <vinod.govindapillai@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240905150052.174895-3-maarten.lankhorst@linux.intel.com (cherry picked from commit f90491d4b64e302e940133103d3d9908e70e454f) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-29 07:12:55 -07:00
Alex Deucher	935abb86a9	drm/amdgpu/smu13: fix profile reporting The following 3 commits landed in parallel: commit `d7d2688bf4` ("drm/amd/pm: update workload mask after the setting") commit `7a1613e47e` ("drm/amdgpu/smu13: always apply the powersave optimization") commit `7c210ca5a2` ("drm/amdgpu: handle default profile on on devices without fullscreen 3D") While everything is set correctly, this caused the profile to be reported incorrectly because both the powersave and fullscreen3d bits were set in the mask and when the driver prints the profile, it looks for the first bit set. Fixes: `d7d2688bf4` ("drm/amd/pm: update workload mask after the setting") Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit ecfe9b237687a55d596fff0650ccc8cc455edd3f) Cc: stable@vger.kernel.org	2024-10-28 17:19:45 -04:00
Tvrtko Ursulin	4aa923a6e6	drm/amd/pm: Vangogh: Fix kernel memory out of bounds write KASAN reports that the GPU metrics table allocated in vangogh_tables_init() is not large enough for the memset done in smu_cmn_init_soft_gpu_metrics(). Condensed report follows: [ 33.861314] BUG: KASAN: slab-out-of-bounds in smu_cmn_init_soft_gpu_metrics+0x73/0x200 [amdgpu] [ 33.861799] Write of size 168 at addr ffff888129f59500 by task mangoapp/1067 ... [ 33.861808] CPU: 6 UID: 1000 PID: 1067 Comm: mangoapp Tainted: G W 6.12.0-rc4 #356 1a56f59a8b5182eeaf67eb7cb8b13594dd23b544 [ 33.861816] Tainted: [W]=WARN [ 33.861818] Hardware name: Valve Galileo/Galileo, BIOS F7G0107 12/01/2023 [ 33.861822] Call Trace: [ 33.861826] <TASK> [ 33.861829] dump_stack_lvl+0x66/0x90 [ 33.861838] print_report+0xce/0x620 [ 33.861853] kasan_report+0xda/0x110 [ 33.862794] kasan_check_range+0xfd/0x1a0 [ 33.862799] __asan_memset+0x23/0x40 [ 33.862803] smu_cmn_init_soft_gpu_metrics+0x73/0x200 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.863306] vangogh_get_gpu_metrics_v2_4+0x123/0xad0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.864257] vangogh_common_get_gpu_metrics+0xb0c/0xbc0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.865682] amdgpu_dpm_get_gpu_metrics+0xcc/0x110 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.866160] amdgpu_get_gpu_metrics+0x154/0x2d0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779] [ 33.867135] dev_attr_show+0x43/0xc0 [ 33.867147] sysfs_kf_seq_show+0x1f1/0x3b0 [ 33.867155] seq_read_iter+0x3f8/0x1140 [ 33.867173] vfs_read+0x76c/0xc50 [ 33.867198] ksys_read+0xfb/0x1d0 [ 33.867214] do_syscall_64+0x90/0x160 ... [ 33.867353] Allocated by task 378 on cpu 7 at 22.794876s: [ 33.867358] kasan_save_stack+0x33/0x50 [ 33.867364] kasan_save_track+0x17/0x60 [ 33.867367] __kasan_kmalloc+0x87/0x90 [ 33.867371] vangogh_init_smc_tables+0x3f9/0x840 [amdgpu] [ 33.867835] smu_sw_init+0xa32/0x1850 [amdgpu] [ 33.868299] amdgpu_device_init+0x467b/0x8d90 [amdgpu] [ 33.868733] amdgpu_driver_load_kms+0x19/0xf0 [amdgpu] [ 33.869167] amdgpu_pci_probe+0x2d6/0xcd0 [amdgpu] [ 33.869608] local_pci_probe+0xda/0x180 [ 33.869614] pci_device_probe+0x43f/0x6b0 Empirically we can confirm that the former allocates 152 bytes for the table, while the latter memsets the 168 large block. Root cause appears that when GPU metrics tables for v2_4 parts were added it was not considered to enlarge the table to fit. The fix in this patch is rather "brute force" and perhaps later should be done in a smarter way, by extracting and consolidating the part version to size logic to a common helper, instead of brute forcing the largest possible allocation. Nevertheless, for now this works and fixes the out of bounds write. v2: * Drop impossible v3_0 case. (Mario) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Fixes: `41cec40bc9` ("drm/amd/pm: Vangogh: Add new gpu_metrics_v2_4 to acquire gpu_metrics") Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Evan Quan <evan.quan@amd.com> Cc: Wenyou Yang <WenYou.Yang@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20241025145639.19124-1-tursulin@igalia.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 0880f58f9609f0200483a49429af0f050d281703) Cc: stable@vger.kernel.org # v6.6+	2024-10-28 17:14:08 -04:00
Ovidiu Bunea	1b6063a577	Revert "drm/amd/display: update DML2 policy EnhancedPrefetchScheduleAccelerationFinal DCN35" This reverts commit 9dad21f910fc ("drm/amd/display: update DML2 policy EnhancedPrefetchScheduleAccelerationFinal DCN35") [why & how] The offending commit exposes a hang with lid close/open behavior. Both issues seem to be related to ODM 2:1 mode switching, so there is another issue generic to that sequence that needs to be investigated. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Ovidiu Bunea <Ovidiu.Bunea@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 68bf95317ebf2cfa7105251e4279e951daceefb7) Cc: stable@vger.kernel.org	2024-10-28 17:13:25 -04:00
Matthew Brost	746ae46c11	drm/sched: Mark scheduler work queues with WQ_MEM_RECLAIM drm_gpu_scheduler.submit_wq is used to submit jobs, jobs are in the path of dma-fences, and dma-fences are in the path of reclaim. Mark scheduler work queue with WQ_MEM_RECLAIM to ensure forward progress during reclaim; without WQ_MEM_RECLAIM, work queues cannot make forward progress during reclaim. v2: - Fixes tags (Philipp) - Reword commit message (Philipp) Cc: Luben Tuikov <ltuikov89@gmail.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Philipp Stanner <pstanner@redhat.com> Cc: stable@vger.kernel.org Fixes: `34f50cc644` ("drm/sched: Use drm sched lockdep map for submit_wq") Fixes: `a6149f0393` ("drm/sched: Convert drm scheduler to use a work queue rather than kthread") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Acked-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Philipp Stanner <pstanner@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241023235917.1836428-1-matthew.brost@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-10-28 14:12:56 -04:00
Thomas Zimmermann	fc5ced75d6	Merge drm/drm-fixes into drm-misc-fixes Backmerging to get the latest fixes from upstream. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2024-10-25 15:24:08 +02:00
Dave Airlie	4d95a12beb	Driver Changes: - Increase invalidation timeout to avoid errors in some hosts (Shuicheng) - Flush worker on timeout (Badal) - Better handling for force wake failure (Shuicheng) - Improve argument check on user fence creation (Nirmoy) - Don't restart parallel queues multiple times on GT reset (Nirmoy) -----BEGIN PGP SIGNATURE----- iQJNBAABCAA3FiEE6rM8lpABPHM5FqyDm6KlpjDL6lMFAmcavkQZHGx1Y2FzLmRl bWFyY2hpQGludGVsLmNvbQAKCRCboqWmMMvqU99uD/0fiq9PPSGZQlQTiMYYWR80 EZ+u1xESm+VqgFJJBDcRiKRB3TIIISeH1IOse0JzRY5kVxPU8jqA1HFkuDjvVCMI cHc5T3WTsUQUhUIyMBTex+MEYBNmaKF0a+qfOA+Uh86v9Xt5LWVyViPXM2qY0T0E skGFulZlhqGRIb4EEA8pdDwmJeFcHxVRK9Xkc8kn9kzfgv4sxHCvbCE6pNNzR1w2 ofQ85omyQWguNH5cgj5koI6LQqoKYVztBIi3rNTgYsl+FKy8BNY5KX2+C5XMNdv1 /GHSj+jLaUGFZ3WdgWkRNtBFVtX0bLSTpQd3y85rENOwOnR3mr6VTWGIBUaB4fKS /xtqC4KJptpAT4MEDlYEQ4P6/YkG5PQZ8UYzdDJfWQtNkN4G5evYIUtZQ+h8mcyh mgs8+nbQbbKsHoqVKLTGdoV+1lpQRVPeWiezPTxNS1zuEQ+CwwKAjpAi1wImJDVz gu4JqHbr16oMY1if2SD9On7eYcqDXSUs2vR6JShtwbgpBUN8/UmH9RYQxkpIU3q2 bNLCIxGAgC+G1cdGIIp9kRLMu8QmfSzf653KLuUAUogEXOcV2xwTncZhSnxx0XD7 S9vR7/9alS65zEro0KQfwSJbBb3izXy00JfCUyquIAioi2nXLbaJ0imhBn3gBlma +1t2XGyhib3poqsiDpZY2w== =+1gt -----END PGP SIGNATURE----- Merge tag 'drm-xe-fixes-2024-10-24-1' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - Increase invalidation timeout to avoid errors in some hosts (Shuicheng) - Flush worker on timeout (Badal) - Better handling for force wake failure (Shuicheng) - Improve argument check on user fence creation (Nirmoy) - Don't restart parallel queues multiple times on GT reset (Nirmoy) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/trlkoiewtc4x2cyhsxmj3atayyq4zwto4iryea5pvya2ymc3yp@fdx5nhwmiyem	2024-10-25 16:55:39 +10:00
Dave Airlie	e3e1cfe33f	Short summary of fixes pull: bridge: - aux: Fix assignment of OF node - tc358767: Add missing of_node_put() in error path -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmcaQhwACgkQaA3BHVML eiNaLAf+OXhVWlaFV7fdSlgEFHnRVk1oaDa/l3gch2nWfW/gHPZRsL3vEzHPJKUw yRuc/x/wgELVsuweGjudQijeEioCYqLMy4SBEau7h069f4BTsQJAugz+7/Dxx/ps /GWhYF2lHhN7cbkyjIk5ERElWnq+CfIjEbPvHxXADkXuZty3a3tEE3NCfWm7Mifo n4JrmuqPbegFG/0fBYswx7S0MeZ0RhrR2VuYRuRaesBaK/DlSFJRzS53WnxPgA7g 5UeTVF0TOIuqI9bQgA3LUh68r83ntGCGtMnOl4dWdjTYc5cQGIPsJRaaClliGDGZ HI8MtYC7k3DjRs1QDsbZgIhtm0k7yA== =9Aa2 -----END PGP SIGNATURE----- Merge tag 'drm-misc-fixes-2024-10-24' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: bridge: - aux: Fix assignment of OF node - tc358767: Add missing of_node_put() in error path Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20241024124921.GA20475@localhost.localdomain	2024-10-25 11:11:59 +10:00
Dave Airlie	2ba1f81ec7	Merge tag 'drm-intel-fixes-2024-10-24' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes - Fix DRM_I915_GVT_KVMGT dependencies in Kconfig Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZxniUlDg59RxOO-6@jlahtine-mobl.ger.corp.intel.com	2024-10-25 07:43:41 +10:00
Nirmoy Das	cdc21021f0	drm/xe: Don't restart parallel queues multiple times on GT reset In case of parallel submissions multiple GuC id will point to the same exec queue and on GT reset such exec queues will get restarted multiple times which is not desirable. v2: don't use exec_queue_enabled() which could race, do the same for xe_guc_submit_stop (Matt B) Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2295 Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241022103555.731557-1-nirmoy.das@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> (cherry picked from commit c8b0acd6d8745fd7e6450f5acc38f0227bd253b3) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-24 12:42:52 -05:00
Nirmoy Das	9c1813b325	drm/xe/ufence: Prefetch ufence addr to catch bogus address access_ok() only checks for addr overflow so also try to read the addr to catch invalid addr sent from userspace. Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1630 Cc: Francois Dugast <francois.dugast@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241016082304.66009-2-nirmoy.das@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> (cherry picked from commit 9408c4508483ffc60811e910a93d6425b8e63928) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-24 12:42:52 -05:00

1 2 3 4 5 ...

108504 Commits