- Fix ccs_mode setting for Xe2 and later (Balasubramani)
- Synchronize ccs_mode setting with client creation (Balasubramani)
- Apply scheduling WA for LNL in additional places as needed
(Nirmoy)
- Fix leak and lock handling in error paths of xe_exec ioctl
(Matthew Brost)
- Fix GGTT allocation leak leading to eventual crash in SR-IOV
(Michal Wajdeczko)
- Move run_ticks update out of job handling to avoid synchronization
with reader (Lucas)
-----BEGIN PGP SIGNATURE-----
iQJNBAABCAA3FiEE6rM8lpABPHM5FqyDm6KlpjDL6lMFAmcuLUQZHGx1Y2FzLmRl
bWFyY2hpQGludGVsLmNvbQAKCRCboqWmMMvqU3HEEACI04W0GFdk1ix9MS2vuK7U
r0phWoaP4+29vfWG9BRN3S6jisAt2Y1+DBHCP78C7naV/FvHW8aJQ78cHEJ9DjF+
/OpzpErfd+BD2DSSOBRNcQJ3eqe7JVjy2Q4kV41+qkKqqOaN99L33vFexCXOu7Je
wFOXn33pmrsFCp9mb6xBhLYGo8tFS5IctHvBajGFHoUQzqfA/WkD/g67x9qKSzYv
BThYOm066jcWGX0LaKHCw6slQ/OeZE78q2gaQraEpCyw4dJg7QBTlrVU5662xCnI
RXDv+adzsHGWah77vrnBKFBRAD66qBHj4ntecHesnhuBv31EJdqWlLb5baoiE5RN
wuPm57hPJ39tsB/KxqUlTmFiBCWQ74Px4o6nRKcnme2N0m/H8xFTp7c45SRwJZ+X
RhQPnv14fkLN3e4TaWbr8at89Gapk1R8cXRytZG4gwkrfFfLEGT1ufBV6lP3Pn0n
6DQvtNrKR2qGO1JM+R5MXIzzdFFqRuI/8+GX9dvs/5fFTLbTUCckyU9ZM9JXFKIl
2ShK7eSFhUPqw2p3zbQzyxQqkopWORQhRkzUmCm8e9WQLE9uBpaFbkeIzu36atY2
pjRElc6OtTlxCN1KcBnOk5UvTbEC8MxNdWjB5MZJd8zaxm5DVbgGbxp5YmU8K/gG
GV1YXZlniVpdhixJFFoypQ==
=jirT
-----END PGP SIGNATURE-----
Merge tag 'drm-xe-fixes-2024-11-08' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Driver Changes:
- Fix ccs_mode setting for Xe2 and later (Balasubramani)
- Synchronize ccs_mode setting with client creation (Balasubramani)
- Apply scheduling WA for LNL in additional places as needed
(Nirmoy)
- Fix leak and lock handling in error paths of xe_exec ioctl
(Matthew Brost)
- Fix GGTT allocation leak leading to eventual crash in SR-IOV
(Michal Wajdeczko)
- Move run_ticks update out of job handling to avoid synchronization
with reader (Lucas)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/4ffcebtluaaaohquxfyf5babpihmtscxwad3jjmt5nggwh2xpm@ztw67ucywttg
The current panthor_device_mmap_io() implementation has two issues:
1. For mapping DRM_PANTHOR_USER_FLUSH_ID_MMIO_OFFSET,
panthor_device_mmap_io() bails if VM_WRITE is set, but does not clear
VM_MAYWRITE. That means userspace can use mprotect() to make the mapping
writable later on. This is a classic Linux driver gotcha.
I don't think this actually has any impact in practice:
When the GPU is powered, writes to the FLUSH_ID seem to be ignored; and
when the GPU is not powered, the dummy_latest_flush page provided by the
driver is deliberately designed to not do any flushes, so the only thing
writing to the dummy_latest_flush could achieve would be to make *more*
flushes happen.
2. panthor_device_mmap_io() does not block MAP_PRIVATE mappings (which are
mappings without the VM_SHARED flag).
MAP_PRIVATE in combination with VM_MAYWRITE indicates that the VMA has
copy-on-write semantics, which for VM_PFNMAP are semi-supported but
fairly cursed.
In particular, in such a mapping, the driver can only install PTEs
during mmap() by calling remap_pfn_range() (because remap_pfn_range()
wants to **store the physical address of the mapped physical memory into
the vm_pgoff of the VMA**); installing PTEs later on with a fault
handler (as panthor does) is not supported in private mappings, and so
if you try to fault in such a mapping, vmf_insert_pfn_prot() splats when
it hits a BUG() check.
Fix it by clearing the VM_MAYWRITE flag (userspace writing to the FLUSH_ID
doesn't make sense) and requiring VM_SHARED (copy-on-write semantics for
the FLUSH_ID don't make sense).
Reproducers for both scenarios are in the notes of my patch on the mailing
list; I tested that these bugs exist on a Rock 5B machine.
Note that I only compile-tested the patch, I haven't tested it; I don't
have a working kernel build setup for the test machine yet. Please test it
before applying it.
Cc: stable@vger.kernel.org
Fixes: 5fe909cae1 ("drm/panthor: Add the device logical block")
Signed-off-by: Jann Horn <jannh@google.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241105-panthor-flush-page-fixes-v1-1-829aaf37db93@google.com
Similar to commit cac075706f ("drm/panthor: Fix race when converting
group handle to group object") we need to use the XArray's internal
locking when retrieving a vm pointer from there.
v2: Removed part of the patch that was trying to protect fetching
the heap pointer from XArray, as that operation is protected by
the @pool->lock.
Fixes: 647810ec24 ("drm/panthor: Add the MMU/VM logical block")
Reported-by: Jann Horn <jannh@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241106185806.389089-1-liviu.dudau@arm.com
There are 2G and 4G RAM versions of the Lenovo Yoga Tab 3 X90F and it
turns out that the 2G version has a DMI product name of
"CHERRYVIEW D1 PLATFORM" where as the 4G version has
"CHERRYVIEW C0 PLATFORM". The sys-vendor + product-version check are
unique enough that the product-name check is not necessary.
Drop the product-name check so that the existing DMI match for the 4G
RAM version also matches the 2G RAM version.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240825132131.6643-1-hdegoede@redhat.com
The exec queue timestamp is only really useful when it's being queried
through the fdinfo. There's no need to update it so often, on every
job_free. Tracing a simple app like vkcube running shows an update
rate of ~ 120Hz. In case of discrete, the BO is on vram, creating a lot
of pcie transactions.
The update on job_free() is used to cover a gap: if exec
queue is created and destroyed rapidly, before a new query, the
timestamp still needs to be accumulated and accounted for in the xef.
Initial implementation in commit 6109f24f87 ("drm/xe: Add helper to
accumulate exec queue runtime") couldn't do it on the exec_queue_fini
since the xef could be gone at that point. However since commit
ce8c161cba ("drm/xe: Add ref counting for xe_file") the xef is
refcounted and the exec queue always holds a reference, making this safe
now.
Improve the fix in commit 2149ded630 ("drm/xe: Fix use after free when
client stats are captured") by reducing the frequency in which the
update is needed.
Fixes: 2149ded630 ("drm/xe: Fix use after free when client stats are captured")
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241104143815.2112272-3-lucas.demarchi@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 83db047d9425d9a649f01573797558eff0f632e1)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
In unlikely event that we fail during sending the new VF GGTT
configuration to the GuC, we will free only the GGTT node data
struct but will miss to release the actual GGTT allocation.
This will later lead to list corruption, GGTT space leak and
finally risking crash when unloading the driver:
[ ] ... [drm] GT0: PF: Failed to provision VF1 with 1073741824 (1.00 GiB) GGTT (-EIO)
[ ] ... [drm] GT0: PF: VF1 provisioning remains at 0 (0 B) GGTT
[ ] list_add corruption. next->prev should be prev (ffff88813cfcd628), but was 0000000000000000. (next=ffff88813cfe2028).
[ ] RIP: 0010:__list_add_valid_or_report+0x6b/0xb0
[ ] Call Trace:
[ ] drm_mm_insert_node_in_range+0x2c0/0x4e0
[ ] xe_ggtt_node_insert+0x46/0x70 [xe]
[ ] pf_provision_vf_ggtt+0x7f5/0xa70 [xe]
[ ] xe_gt_sriov_pf_config_set_ggtt+0x5e/0x770 [xe]
[ ] ggtt_set+0x4b/0x70 [xe]
[ ] simple_attr_write_xsigned.constprop.0.isra.0+0xb0/0x110
[ ] ... [drm] GT0: PF: Failed to provision VF1 with 1073741824 (1.00 GiB) GGTT (-ENOSPC)
[ ] ... [drm] GT0: PF: VF1 provisioning remains at 0 (0 B) GGTT
[ ] Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b7b: 0000 [#1] PREEMPT SMP NOPTI
[ ] RIP: 0010:drm_mm_remove_node+0x1b7/0x390
[ ] Call Trace:
[ ] <TASK>
[ ] ? die_addr+0x2e/0x80
[ ] ? exc_general_protection+0x1a1/0x3e0
[ ] ? asm_exc_general_protection+0x22/0x30
[ ] ? drm_mm_remove_node+0x1b7/0x390
[ ] ggtt_node_remove+0xa5/0xf0 [xe]
[ ] xe_ggtt_node_remove+0x35/0x70 [xe]
[ ] xe_ttm_bo_destroy+0x123/0x220 [xe]
[ ] intel_user_framebuffer_destroy+0x44/0x70 [xe]
[ ] intel_plane_destroy_state+0x3b/0xc0 [xe]
[ ] drm_atomic_state_default_clear+0x1cd/0x2f0
[ ] intel_atomic_state_clear+0x9/0x20 [xe]
[ ] __drm_atomic_state_free+0x1d/0xb0
Fix that by using pf_release_ggtt() on the error path, which now
works regardless if the node has GGTT allocation or not.
Fixes: 34e804220f ("drm/xe: Make xe_ggtt_node struct independent")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241104144901.1903-1-michal.wajdeczko@intel.com
(cherry picked from commit 43b1dd2b550f0861ce80fbfffd5881b1b26272b1)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Upon failure all locks need to be dropped before returning to the user.
Fixes: 58480c1c91 ("drm/xe: Skip VMAs pin when requesting signal to the last XE_EXEC")
Cc: <stable@vger.kernel.org>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241105043524.4062774-3-matthew.brost@intel.com
(cherry picked from commit 7d1a4258e602ffdce529f56686925034c1b3b095)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
In a couple of places after an exec queue is looked up the exec IOCTL
returns on input errors without dropping the exec queue ref. Fix this
ensuring the exec queue ref is dropped on input error.
Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Cc: <stable@vger.kernel.org>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241105043524.4062774-2-matthew.brost@intel.com
(cherry picked from commit 07064a200b40ac2195cb6b7b779897d9377e5e6f)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Avoid a possible buffer overflow if size is larger than 4K.
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f5d873f5825b40d886d03bd2aede91d4cf002434)
Cc: stable@vger.kernel.org
Users should not be able to run these.
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7ba9395430f611cfc101b1c2687732baafa239d5)
Cc: stable@vger.kernel.org
Regular users shouldn't have read access.
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit c0cfd2e652553d607b910be47d0cc5a7f3a78641)
Cc: stable@vger.kernel.org
For DPX mode, the number of memory partitions supported should be less
than or equal to 2.
Fixes: 1589c82a10 ("drm/amdgpu: Check memory ranges for valid xcp mode")
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 990c4f580742de7bb78fa57420ffd182fc3ab4cd)
Cc: stable@vger.kernel.org
Correct the workload setting in order not to mix the setting
with the end user. Update the workload mask accordingly.
v2: changes as below:
1. the end user can not erase the workload from driver except default workload.
2. always shows the real highest priority workoad to the end user.
3. the real workload mask is combined with driver workload mask and end user workload mask.
v3: apply this to the other ASICs as well.
v4: simplify the code
v5: refine the code based on the review comments.
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8cc438be5d49b8326b2fcade0bdb7e6a97df9e0b)
Cc: stable@vger.kernel.org # 6.11.x
always pick the pptable from IFWI on smu v14.0.2/3
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 136ce12bd5907388cb4e9aa63ee5c9c8c441640b)
Cc: stable@vger.kernel.org # 6.11.x
acpi_evaluate_object() may return AE_NOT_FOUND (failure), which
would result in dereferencing buffer.pointer (obj) while being NULL.
Although this case may be unrealistic for the current code, it is
still better to protect against possible bugs.
Bail out also when status is AE_NOT_FOUND.
This fixes 1 FORWARD_NULL issue reported by Coverity
Report: CID 1600951: Null pointer dereferences (FORWARD_NULL)
Signed-off-by: Antonio Quartulli <antonio@mandelbit.com>
Fixes: c9b7c809b89f ("drm/amd: Guard against bad data for ATIF ACPI method")
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241031152848.4716-1-antonio@mandelbit.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 91c9e221fe2553edf2db71627d8453f083de87a1)
Cc: stable@vger.kernel.org
An upstream bug report suggests that there are production dGPUs that are
older than DCN401 but still have a umc_info in VBIOS tables with the
same version as expected for a DCN401 product. Hence, reading this
tables should be guarded with a version check.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3678
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 2551b4a321a68134360b860113dd460133e856e5)
Fixes: 00c391102a ("drm/amd/display: Add misc DC changes for DCN401")
Cc: stable@vger.kernel.org # 6.11.x
[Why]
During boot up and resume the DC layer will reset the panel
brightness to fix a flicker issue.
It will cause the dm->actual_brightness is not the current panel
brightness level. (the dm->brightness is the correct panel level)
[How]
Set the backlight level after do the set mode.
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Fixes: d9e865826c ("drm/amd/display: Simplify brightness initialization")
Reported-by: Mark Herbert <mark.herbert42@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3655
Reviewed-by: Sun peng Li <sunpeng.li@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Zaeem Mohamed <zaeem.mohamed@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 7875afafba84817b791be6d2282b836695146060)
Cc: stable@vger.kernel.org
Flush the g2h worker explicitly if TLB timeout happens which is
observed on LNL and that points to the recent scheduling issue with
E-cores on LNL.
This is similar to the recent fix:
commit e51527233804 ("drm/xe/guc/ct: Flush g2h worker in case of g2h
response timeout") and should be removed once there is E core
scheduling fix.
v2: Add platform check(Himal)
v3: Remove gfx platform check as the issue related to cpu
platform(John)
Use the common WA macro(John) and print when the flush
resolves timeout(Matt B)
v4: Remove the resolves log and do the flush before taking
pending_lock(Matt A)
Cc: Badal Nilawar <badal.nilawar@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: stable@vger.kernel.org # v6.11+
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2687
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241029120117.449694-3-nirmoy.das@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit e1f6fa55664a0eeb0a641f497e1adfcf6672e995)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Flush xe ordered_wq in case of ufence timeout which is observed
on LNL and that points to recent scheduling issue with E-cores.
This is similar to the recent fix:
commit e51527233804 ("drm/xe/guc/ct: Flush g2h worker in case of g2h
response timeout") and should be removed once there is a E-core
scheduling fix for LNL.
v2: Add platform check(Himal)
s/__flush_workqueue/flush_workqueue(Jani)
v3: Remove gfx platform check as the issue related to cpu
platform(John)
v4: Use the Common macro(John) and print when the flush resolves
timeout(Matt B)
Cc: Badal Nilawar <badal.nilawar@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: stable@vger.kernel.org # v6.11+
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2754
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241029120117.449694-2-nirmoy.das@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 38c4c8722bd74452280951edc44c23de47612001)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Move LNL scheduling WA to xe_device.h so this can be used in other
places without needing keep the same comment about removal of this WA
in the future. The WA, which flushes work or workqueues, is now wrapped
in macros and can be reused wherever needed.
Cc: Badal Nilawar <badal.nilawar@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
cc: stable@vger.kernel.org # v6.11+
Suggested-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241029120117.449694-1-nirmoy.das@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit cbe006a6492c01a0058912ae15d473f4c149896c)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Drop the exclusive client count tracking and use the filelist from the
drm to track the active clients. This also ensures the clients created
internally by the driver won't block changing the ccs mode.
Fixes: ce8c161cba ("drm/xe: Add ref counting for xe_file")
Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241008073628.377433-3-balasubramani.vivekanandan@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 1c35f1ed1fe3c649f8c16214d0d3dd828b5265d9)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
CCS_MODE register requires setting mask bits from Xe2+ platforms. Set
the mask bits unconditionally, as those bits are unused for older
platforms.
Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Cc: stable@vger.kernel.org # v6.11+
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241008073628.377433-2-balasubramani.vivekanandan@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 23ea2c7572d4735ef66beb1e4feb8ae510b78247)
[ Fix conflict with mmio refactors ]
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
When remaining resources are being cleaned up on driver close,
outstanding VM mappings may result in resources being leaked, due
to an object reference loop, as shown below, with each object (or
set of objects) referencing the object below it:
PVR GEM Object
GPU scheduler "finished" fence
GPU scheduler “scheduled” fence
PVR driver “done” fence
PVR Context
PVR VM Context
PVR VM Mappings
PVR GEM Object
The reference that the PVR VM Context has on the VM mappings is a
soft one, in the sense that the freeing of outstanding VM mappings
is done as part of VM context destruction; no reference counts are
involved, as is the case for all the other references in the loop.
To break the reference loop during cleanup, free the outstanding
VM mappings before destroying the PVR Context associated with the
VM context.
Signed-off-by: Brendan King <brendan.king@imgtec.com>
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/8a25924f-1bb7-4d9a-a346-58e871dfb1d1@imgtec.com
This adds a linked list of VM contexts which is needed for the next patch
to be able to correctly track VM contexts for destruction on file close.
It is only safe for VM contexts to be removed from the list and destroyed
when not in interrupt context.
Signed-off-by: Brendan King <brendan.king@imgtec.com>
Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/e57128ea-f0ce-4e93-a9d4-3f033a8b06fa@imgtec.com
- Fix missing HPD interrupt enabling, bringing one PM refactor with it
(Imre / Maarten)
- Workaround LNL GGTT invalidation not being visible to GuC
(Matthew Brost)
- Avoid getting jobs stuck without a protecting timeout (Matthew Brost)
-----BEGIN PGP SIGNATURE-----
iQJNBAABCAA3FiEE6rM8lpABPHM5FqyDm6KlpjDL6lMFAmckr1wZHGx1Y2FzLmRl
bWFyY2hpQGludGVsLmNvbQAKCRCboqWmMMvqU5mUD/9ldyPILkvMvGT6yEBf7HWJ
HwEs8oqUkN/NDxOXySrkiVPESLB9PH/SWo6YW7L0X7DkFq1rEEB6eWbEu4ZQ6Oef
Q9wwx3XZ2oZwi5A/oa2mT9mvxN53vRVgEaKtPxej42N6zVcbtJgre9sfqa8MwF5u
HNHP8tcKZNf8i+kzIEPrz33wLRH+Ecvkbp1k2E//cSHIWcsaq9ZmG8pnH7x8X/91
fo0uLpTKhSJZzs5QygxnQ3X9BgIDgHpCu0mlTcmHBsH4wPr3u2uSdQ2cT97zZyiT
7fATg/QdgwXH8lsmnbzPj0LjQd+AK4eLS3nD1PKzBN3+/5ZVy6ZVol6K41qUDNSC
Tzzgo4/NOVfcRdyiWe6Elj07Snent8gtNGGphp1vAtUns8kGcJw3gDcOpX3cYHxa
u+DQP7zasWwPAF2lqwApMHFZGe/zoH3w7zUZYCmGP4IImC3VM8NBaM4No6VNIvco
+08yJUFBifS/4qYISYCwO9H7Dt+10WDxHrzF6w6sA89dL4mBf4fGvcfnxE33NuWn
RIhfQ+WLC/vIzRwbOGmvAhbG4Owxjsacq7RQOUc1azJ66cCPCQCNfVGx6qf5Q23A
0KbDoQXVlegKOz7z5m/ORXQO6A/Ien2VABLn0t0bjmoP7S57UFfiENn2P0c+UeNZ
GJUi4QS5Ssdjq/DK6awoRw==
=kIfk
-----END PGP SIGNATURE-----
Merge tag 'drm-xe-fixes-2024-10-31' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Driver Changes:
- Fix missing HPD interrupt enabling, bringing one PM refactor with it
(Imre / Maarten)
- Workaround LNL GGTT invalidation not being visible to GuC
(Matthew Brost)
- Avoid getting jobs stuck without a protecting timeout (Matthew Brost)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/tsbftadm7owyizzdaqnqu7u4tqggxgeqeztlfvmj5fryxlfomi@5m5bfv2zvzmw
Short circuiting TDR on jobs not started is an optimization which is not
required. On LNL we are facing an issue where jobs do not get scheduled
by the GuC if it misses a GGTT page update. When this occurs let the TDR
fire, toggle the scheduling which may get the job unstuck, and print a
warning message. If the TDR fires twice on job that hasn't started,
timeout the job.
v2:
- Add warning message (Paulo)
- Add fixes tag (Paulo)
- Timeout job which hasn't started after TDR firing twice
v3:
- Include local change
v4:
- Short circuit check_timeout on job not started
- use warn level rather than notice (Paulo)
Fixes: 7ddb9403dd ("drm/xe: Sample ctx timestamp to determine if jobs have timed out")
Cc: stable@vger.kernel.org
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241025214330.2010521-2-matthew.brost@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 35d25a4a0012e690ef0cc4c5440231176db595cc)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
On LNL without a mmio read before a GGTT invalidate the GuC can
incorrectly read the GGTT scratch page upon next access leading to jobs
not getting scheduled. A mmio read before a GGTT invalidate seems to fix
this. Since a GGTT invalidate is not a hot code path, blindly do a mmio
read before each GGTT invalidate.
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: stable@vger.kernel.org
Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Reported-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3164
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241023221200.1797832-1-matthew.brost@intel.com
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
(cherry picked from commit 5a710196883e0ac019ac6df2a6d79c16ad3c32fa)
[ Fix conflict with mmio vs gt argument ]
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
As Maxime suggested, add a new helper
drm_kunit_display_mode_from_cea_vic(), it can replace the direct call
of drm_display_mode_from_cea_vic(), and it will help solving
the `mode` memory leaks.
Acked-by: Maxime Ripard <mripard@kernel.org>
Suggested-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241030023504.530425-2-ruanjinjie@huawei.com
Signed-off-by: Maxime Ripard <mripard@kernel.org>
If we don't do that, the group is considered usable by userspace, but
all further GROUP_SUBMIT will fail with -EINVAL.
Changes in v3:
- Add R-bs
Changes in v2:
- New patch
Fixes: de85488138 ("drm/panthor: Add the scheduler logical block")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241029152912.270346-3-boris.brezillon@collabora.com
Userspace can use GROUP_SUBMIT errors as a trigger to check the group
state and recreate the group if it became unusable. Make sure we
report an error when the group became unusable.
Changes in v3:
- None
Changes in v2:
- Add R-bs
Fixes: de85488138 ("drm/panthor: Add the scheduler logical block")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241029152912.270346-2-boris.brezillon@collabora.com
The system and GPU MMU page size might differ, which becomes a
problem for FW sections that need to be mapped at explicit addresses
since our PAGE_SIZE alignment might cover a VA range that's
expected to be used for another section.
Make sure we never map more than we need.
Changes in v3:
- Add R-bs
Changes in v2:
- Plan for per-VM page sizes so the MCU VM and user VM can
have different pages sizes
Fixes: 2718d91816 ("drm/panthor: Add the FW logical block")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Reviewed-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241030150231.768949-1-boris.brezillon@collabora.com
Atm the display HPD interrupts that got disabled during runtime
suspend, are re-enabled only if d3cold is enabled. Fix things by
also re-enabling the interrupts if d3cold is disabled.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241009194358.1321200-5-imre.deak@intel.com
(cherry picked from commit bbc4a30de095f0349d3c278500345a1b620d495e)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
For clarity separate the d3cold and non-d3cold runtime PM handling. The
only change in behavior is disabling polling later during runtime
resume. This shouldn't make a difference, since the poll disabling is
handled from a work, which could run at any point wrt. the runtime
resume handler. The work will also require a runtime PM reference,
syncing it with the resume handler.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241009194358.1321200-4-imre.deak@intel.com
(cherry picked from commit a4de6beb83fc5adee788518350247c629568901e)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
The previous change ensures that pm_suspend is only called when
suspending or resuming. This ensures no further bugs like those
in the previous commit.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Vinod Govindapillai <vinod.govindapillai@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240905150052.174895-3-maarten.lankhorst@linux.intel.com
(cherry picked from commit f90491d4b64e302e940133103d3d9908e70e454f)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
The following 3 commits landed in parallel:
commit d7d2688bf4 ("drm/amd/pm: update workload mask after the setting")
commit 7a1613e47e ("drm/amdgpu/smu13: always apply the powersave optimization")
commit 7c210ca5a2 ("drm/amdgpu: handle default profile on on devices without fullscreen 3D")
While everything is set correctly, this caused the profile to be
reported incorrectly because both the powersave and fullscreen3d bits
were set in the mask and when the driver prints the profile, it looks
for the first bit set.
Fixes: d7d2688bf4 ("drm/amd/pm: update workload mask after the setting")
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ecfe9b237687a55d596fff0650ccc8cc455edd3f)
Cc: stable@vger.kernel.org
KASAN reports that the GPU metrics table allocated in
vangogh_tables_init() is not large enough for the memset done in
smu_cmn_init_soft_gpu_metrics(). Condensed report follows:
[ 33.861314] BUG: KASAN: slab-out-of-bounds in smu_cmn_init_soft_gpu_metrics+0x73/0x200 [amdgpu]
[ 33.861799] Write of size 168 at addr ffff888129f59500 by task mangoapp/1067
...
[ 33.861808] CPU: 6 UID: 1000 PID: 1067 Comm: mangoapp Tainted: G W 6.12.0-rc4 #356 1a56f59a8b5182eeaf67eb7cb8b13594dd23b544
[ 33.861816] Tainted: [W]=WARN
[ 33.861818] Hardware name: Valve Galileo/Galileo, BIOS F7G0107 12/01/2023
[ 33.861822] Call Trace:
[ 33.861826] <TASK>
[ 33.861829] dump_stack_lvl+0x66/0x90
[ 33.861838] print_report+0xce/0x620
[ 33.861853] kasan_report+0xda/0x110
[ 33.862794] kasan_check_range+0xfd/0x1a0
[ 33.862799] __asan_memset+0x23/0x40
[ 33.862803] smu_cmn_init_soft_gpu_metrics+0x73/0x200 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[ 33.863306] vangogh_get_gpu_metrics_v2_4+0x123/0xad0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[ 33.864257] vangogh_common_get_gpu_metrics+0xb0c/0xbc0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[ 33.865682] amdgpu_dpm_get_gpu_metrics+0xcc/0x110 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[ 33.866160] amdgpu_get_gpu_metrics+0x154/0x2d0 [amdgpu 13b1bc364ec578808f676eba412c20eaab792779]
[ 33.867135] dev_attr_show+0x43/0xc0
[ 33.867147] sysfs_kf_seq_show+0x1f1/0x3b0
[ 33.867155] seq_read_iter+0x3f8/0x1140
[ 33.867173] vfs_read+0x76c/0xc50
[ 33.867198] ksys_read+0xfb/0x1d0
[ 33.867214] do_syscall_64+0x90/0x160
...
[ 33.867353] Allocated by task 378 on cpu 7 at 22.794876s:
[ 33.867358] kasan_save_stack+0x33/0x50
[ 33.867364] kasan_save_track+0x17/0x60
[ 33.867367] __kasan_kmalloc+0x87/0x90
[ 33.867371] vangogh_init_smc_tables+0x3f9/0x840 [amdgpu]
[ 33.867835] smu_sw_init+0xa32/0x1850 [amdgpu]
[ 33.868299] amdgpu_device_init+0x467b/0x8d90 [amdgpu]
[ 33.868733] amdgpu_driver_load_kms+0x19/0xf0 [amdgpu]
[ 33.869167] amdgpu_pci_probe+0x2d6/0xcd0 [amdgpu]
[ 33.869608] local_pci_probe+0xda/0x180
[ 33.869614] pci_device_probe+0x43f/0x6b0
Empirically we can confirm that the former allocates 152 bytes for the
table, while the latter memsets the 168 large block.
Root cause appears that when GPU metrics tables for v2_4 parts were added
it was not considered to enlarge the table to fit.
The fix in this patch is rather "brute force" and perhaps later should be
done in a smarter way, by extracting and consolidating the part version to
size logic to a common helper, instead of brute forcing the largest
possible allocation. Nevertheless, for now this works and fixes the out of
bounds write.
v2:
* Drop impossible v3_0 case. (Mario)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: 41cec40bc9 ("drm/amd/pm: Vangogh: Add new gpu_metrics_v2_4 to acquire gpu_metrics")
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Evan Quan <evan.quan@amd.com>
Cc: Wenyou Yang <WenYou.Yang@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Link: https://lore.kernel.org/r/20241025145639.19124-1-tursulin@igalia.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 0880f58f9609f0200483a49429af0f050d281703)
Cc: stable@vger.kernel.org # v6.6+
This reverts
commit 9dad21f910fc ("drm/amd/display: update DML2 policy EnhancedPrefetchScheduleAccelerationFinal DCN35")
[why & how]
The offending commit exposes a hang with lid close/open behavior.
Both issues seem to be related to ODM 2:1 mode switching, so there
is another issue generic to that sequence that needs to be
investigated.
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Ovidiu Bunea <Ovidiu.Bunea@amd.com>
Signed-off-by: Tom Chung <chiahsuan.chung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 68bf95317ebf2cfa7105251e4279e951daceefb7)
Cc: stable@vger.kernel.org
drm_gpu_scheduler.submit_wq is used to submit jobs, jobs are in the path
of dma-fences, and dma-fences are in the path of reclaim. Mark scheduler
work queue with WQ_MEM_RECLAIM to ensure forward progress during
reclaim; without WQ_MEM_RECLAIM, work queues cannot make forward
progress during reclaim.
v2:
- Fixes tags (Philipp)
- Reword commit message (Philipp)
Cc: Luben Tuikov <ltuikov89@gmail.com>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Philipp Stanner <pstanner@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 34f50cc644 ("drm/sched: Use drm sched lockdep map for submit_wq")
Fixes: a6149f0393 ("drm/sched: Convert drm scheduler to use a work queue rather than kthread")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Philipp Stanner <pstanner@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241023235917.1836428-1-matthew.brost@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
- Increase invalidation timeout to avoid errors in some hosts (Shuicheng)
- Flush worker on timeout (Badal)
- Better handling for force wake failure (Shuicheng)
- Improve argument check on user fence creation (Nirmoy)
- Don't restart parallel queues multiple times on GT reset (Nirmoy)
-----BEGIN PGP SIGNATURE-----
iQJNBAABCAA3FiEE6rM8lpABPHM5FqyDm6KlpjDL6lMFAmcavkQZHGx1Y2FzLmRl
bWFyY2hpQGludGVsLmNvbQAKCRCboqWmMMvqU99uD/0fiq9PPSGZQlQTiMYYWR80
EZ+u1xESm+VqgFJJBDcRiKRB3TIIISeH1IOse0JzRY5kVxPU8jqA1HFkuDjvVCMI
cHc5T3WTsUQUhUIyMBTex+MEYBNmaKF0a+qfOA+Uh86v9Xt5LWVyViPXM2qY0T0E
skGFulZlhqGRIb4EEA8pdDwmJeFcHxVRK9Xkc8kn9kzfgv4sxHCvbCE6pNNzR1w2
ofQ85omyQWguNH5cgj5koI6LQqoKYVztBIi3rNTgYsl+FKy8BNY5KX2+C5XMNdv1
/GHSj+jLaUGFZ3WdgWkRNtBFVtX0bLSTpQd3y85rENOwOnR3mr6VTWGIBUaB4fKS
/xtqC4KJptpAT4MEDlYEQ4P6/YkG5PQZ8UYzdDJfWQtNkN4G5evYIUtZQ+h8mcyh
mgs8+nbQbbKsHoqVKLTGdoV+1lpQRVPeWiezPTxNS1zuEQ+CwwKAjpAi1wImJDVz
gu4JqHbr16oMY1if2SD9On7eYcqDXSUs2vR6JShtwbgpBUN8/UmH9RYQxkpIU3q2
bNLCIxGAgC+G1cdGIIp9kRLMu8QmfSzf653KLuUAUogEXOcV2xwTncZhSnxx0XD7
S9vR7/9alS65zEro0KQfwSJbBb3izXy00JfCUyquIAioi2nXLbaJ0imhBn3gBlma
+1t2XGyhib3poqsiDpZY2w==
=+1gt
-----END PGP SIGNATURE-----
Merge tag 'drm-xe-fixes-2024-10-24-1' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Driver Changes:
- Increase invalidation timeout to avoid errors in some hosts (Shuicheng)
- Flush worker on timeout (Badal)
- Better handling for force wake failure (Shuicheng)
- Improve argument check on user fence creation (Nirmoy)
- Don't restart parallel queues multiple times on GT reset (Nirmoy)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/trlkoiewtc4x2cyhsxmj3atayyq4zwto4iryea5pvya2ymc3yp@fdx5nhwmiyem
In case of parallel submissions multiple GuC id will point to the
same exec queue and on GT reset such exec queues will get restarted
multiple times which is not desirable.
v2: don't use exec_queue_enabled() which could race,
do the same for xe_guc_submit_stop (Matt B)
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2295
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241022103555.731557-1-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
(cherry picked from commit c8b0acd6d8745fd7e6450f5acc38f0227bd253b3)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
access_ok() only checks for addr overflow so also try to read the addr
to catch invalid addr sent from userspace.
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1630
Cc: Francois Dugast <francois.dugast@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241016082304.66009-2-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
(cherry picked from commit 9408c4508483ffc60811e910a93d6425b8e63928)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>