linux/drivers/gpu/drm/i915
Chris Wilson d1d1ebf412 drm/i915: Don't touch fence->error when resetting an innocent request
If the request has been completed before the reset took effect, we don't
need to mark it up as being a victim. Touching fence->error after the
fence has been signaled is detected by dma_fence_set_error() and
triggers a BUG:

[  231.743133] kernel BUG at ./include/linux/dma-fence.h:434!
[  231.743156] invalid opcode: 0000 [#1] SMP KASAN
[  231.743172] Modules linked in: i915 drm_kms_helper drm iptable_nat nf_nat_ipv4 nf_nat x86_pkg_temp_thermal iosf_mbi i2c_algo_bit cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea fb font fbdev [last unloaded: drm]
[  231.743221] CPU: 2 PID: 20 Comm: kworker/2:0 Tainted: G     U          4.13.0-rc1+ #52
[  231.743236] Hardware name: Hewlett-Packard HP EliteBook 8460p/161C, BIOS 68SCF Ver. F.01 03/11/2011
[  231.743363] Workqueue: events_long i915_hangcheck_elapsed [i915]
[  231.743382] task: ffff8801f42e9780 task.stack: ffff8801f42f8000
[  231.743489] RIP: 0010:i915_gem_reset_engine+0x45a/0x460 [i915]
[  231.743505] RSP: 0018:ffff8801f42ff770 EFLAGS: 00010202
[  231.743521] RAX: 0000000000000007 RBX: ffff8801bf6b1880 RCX: ffffffffa02881a6
[  231.743537] RDX: dffffc0000000000 RSI: dffffc0000000000 RDI: ffff8801bf6b18c8
[  231.743551] RBP: ffff8801f42ff7c8 R08: 0000000000000001 R09: 0000000000000000
[  231.743566] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801edb02d00
[  231.743581] R13: ffff8801e19d4200 R14: 000000000000001d R15: ffff8801ce2a4000
[  231.743599] FS:  0000000000000000(0000) GS:ffff8801f5a80000(0000) knlGS:0000000000000000
[  231.743614] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  231.743629] CR2: 00007f0ebd1add10 CR3: 0000000002621000 CR4: 00000000000406e0
[  231.743643] Call Trace:
[  231.743752]  i915_gem_reset+0x6c/0x150 [i915]
[  231.743853]  i915_reset+0x175/0x210 [i915]
[  231.743958]  i915_reset_device+0x33b/0x350 [i915]
[  231.744061]  ? valleyview_pipestat_irq_handler+0xe0/0xe0 [i915]
[  231.744081]  ? trace_hardirqs_off_caller+0x70/0x110
[  231.744102]  ? _raw_spin_unlock_irqrestore+0x46/0x50
[  231.744120]  ? find_held_lock+0x119/0x150
[  231.744138]  ? mark_lock+0x6d/0x850
[  231.744241]  ? gen8_gt_irq_ack+0x1f0/0x1f0 [i915]
[  231.744262]  ? work_on_cpu_safe+0x60/0x60
[  231.744284]  ? rcu_read_lock_sched_held+0x57/0xa0
[  231.744400]  ? gen6_read32+0x2ba/0x320 [i915]
[  231.744506]  i915_handle_error+0x382/0x5f0 [i915]
[  231.744611]  ? gen6_rps_reset_ei+0x20/0x20 [i915]
[  231.744630]  ? vsnprintf+0x128/0x8e0
[  231.744649]  ? pointer+0x6b0/0x6b0
[  231.744667]  ? debug_check_no_locks_freed+0x1a0/0x1a0
[  231.744688]  ? scnprintf+0x92/0xe0
[  231.744706]  ? snprintf+0xb0/0xb0
[  231.744820]  hangcheck_declare_hang+0x15a/0x1a0 [i915]
[  231.744932]  ? engine_stuck+0x440/0x440 [i915]
[  231.744951]  ? rcu_read_lock_sched_held+0x57/0xa0
[  231.745062]  ? gen6_read32+0x2ba/0x320 [i915]
[  231.745173]  ? gen6_read16+0x320/0x320 [i915]
[  231.745284]  ? intel_engine_get_active_head+0x91/0x170 [i915]
[  231.745401]  i915_hangcheck_elapsed+0x3d8/0x400 [i915]
[  231.745424]  process_one_work+0x3e8/0xac0
[  231.745444]  ? pwq_dec_nr_in_flight+0x110/0x110
[  231.745464]  ? do_raw_spin_lock+0x8e/0x120
[  231.745484]  worker_thread+0x8d/0x720
[  231.745506]  kthread+0x19e/0x1f0
[  231.745524]  ? process_one_work+0xac0/0xac0
[  231.745541]  ? kthread_create_on_node+0xa0/0xa0
[  231.745560]  ret_from_fork+0x27/0x40
[  231.745581] Code: 8b 7d c8 e8 49 0d 02 e1 49 8b 7f 38 48 8b 75 b8 48 83 c7 10 e8 b8 89 be e1 e9 95 fc ff ff 4c 89 e7 e8 4b b9 ff ff e9 30 ff ff ff <0f> 0b 0f 1f 40 00 55 48 89 e5 41 57 41 56 41 55 41 54 49 89 fe
[  231.745767] RIP: i915_gem_reset_engine+0x45a/0x460 [i915] RSP: ffff8801f42ff770

At first glance this looks to be related to commit c64992e035
("drm/i915: Look for active requests earlier in the reset path"), but it
could easily happen before as well. On the other hand, we no longer
logged victims due to the active_request being dropped earlier.

v2: Be trickier to unwind the incomplete request as we cannot rely on
request retirement for the lockless per-engine reset.
v3: Reprobe the active request at the time of the reset.

Reported-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Fixes: c64992e035 ("drm/i915: Look for active requests earlier in the reset path")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170721123238.16428-15-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> #v1
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2017-07-27 09:38:57 +02:00
..
gvt drm/i915/hsw+: Add has_fuses power well attribute 2017-07-27 09:38:53 +02:00
selftests drm/i915/selftests: Exercise independence of per-engine resets 2017-07-27 09:38:48 +02:00
dvo_ch7xxx.c drm/i915/dvo: fix debug logging on unknown DID 2017-06-01 15:53:03 +03:00
dvo_ch7017.c drm/i915/lvds: Remove magic from PLL programming 2017-05-10 13:47:55 +03:00
dvo_ivch.c
dvo_ns2501.c
dvo_sil164.c
dvo_tfp410.c
dvo.h
i915_cmd_parser.c drm/i915: Redefine ptr_pack_bits() and friends 2017-05-17 13:38:04 +01:00
i915_debugfs.c drm/i915: Report execlists irq bit in debugfs 2017-07-27 09:38:44 +02:00
i915_drv.c drm/i915: Emit a user level message when resetting the GPU (or engine) 2017-07-27 09:38:47 +02:00
i915_drv.h drm/i915/hsw+: Add has_fuses power well attribute 2017-07-27 09:38:53 +02:00
i915_gem_batch_pool.c drm/i915: Reinstate reservation_object zapping for batch_pool objects 2017-06-14 14:06:22 +01:00
i915_gem_batch_pool.h
i915_gem_clflush.c drm/i915: Store i915_gem_object_is_coherent() as a bit next to cache-dirty 2017-06-16 14:52:27 +01:00
i915_gem_clflush.h drm/i915: Mark up clflushes as belonging to an unordered timeline 2017-05-03 11:08:45 +01:00
i915_gem_context.c drm/i915: Make i915_gem_context_mark_guilty() safe for unlocked updates 2017-07-27 09:38:47 +02:00
i915_gem_context.h drm/i915: Make i915_gem_context_mark_guilty() safe for unlocked updates 2017-07-27 09:38:47 +02:00
i915_gem_dmabuf.c drm/i915: Implement dma_buf_ops->kmap 2017-05-03 23:15:02 +01:00
i915_gem_evict.c drm/i915: Eliminate lots of iterations over the execobjects array 2017-06-16 16:54:05 +01:00
i915_gem_execbuffer.c drm/i915: Fix user ptr check size in eb_relocate_vma() 2017-07-17 14:24:16 +03:00
i915_gem_fence_reg.c
i915_gem_fence_reg.h
i915_gem_gtt.c drm/i915: Fix the kernel panic when using aliasing ppgtt 2017-07-07 11:05:53 +01:00
i915_gem_gtt.h drm/i915: pass the vma to insert_entries 2017-06-22 16:48:50 +01:00
i915_gem_internal.c drm/i915: Store i915_gem_object_is_coherent() as a bit next to cache-dirty 2017-06-16 14:52:27 +01:00
i915_gem_object.h drm/i915: Store a direct lookup from object handle to vma 2017-06-16 16:54:04 +01:00
i915_gem_render_state.c
i915_gem_render_state.h
i915_gem_request.c drm/i915: Make i915_gem_context_mark_guilty() safe for unlocked updates 2017-07-27 09:38:47 +02:00
i915_gem_request.h Merge tag 'drm-intel-next-2017-07-17' of git://anongit.freedesktop.org/git/drm-intel into drm-next 2017-07-20 11:31:43 +10:00
i915_gem_shrinker.c drm/i915: Spin for struct_mutex inside shrinker 2017-06-14 10:55:11 +01:00
i915_gem_stolen.c drm/i915: More stolen quirking 2017-07-19 14:04:03 +02:00
i915_gem_tiling.c drm/i915: Fix logical inversion for gen4 quirking 2017-06-07 16:31:34 +03:00
i915_gem_timeline.c drm/i915: Squash repeated awaits on the same fence 2017-05-03 11:08:48 +01:00
i915_gem_timeline.h drm/i915: Rename intel_timeline.sync_seqno[] to .global_sync[] 2017-05-03 11:08:52 +01:00
i915_gem_userptr.c drm/i915: Wait upon userptr get-user-pages within execbuffer 2017-06-16 16:54:05 +01:00
i915_gem.c drm/i915: Don't touch fence->error when resetting an innocent request 2017-07-27 09:38:57 +02:00
i915_gem.h drm/i915: Squash repeated awaits on the same fence 2017-05-03 11:08:48 +01:00
i915_gpu_error.c drm/i915: Make i915_gem_context_mark_guilty() safe for unlocked updates 2017-07-27 09:38:47 +02:00
i915_guc_reg.h
i915_guc_submission.c drm/i915: Differentiate between sw write location into ring and last hw read 2017-06-19 10:52:34 +03:00
i915_ioc32.c
i915_irq.c drm/i915/hsw, bdw: Add irq_pipe_mask, has_vga power well attributes 2017-07-27 09:38:53 +02:00
i915_memcpy.c
i915_mm.c
i915_oa_bdw.c drm/i915/perf: Add more OA configs for BDW, CHV, SKL + BXT 2017-06-14 12:31:57 -07:00
i915_oa_bdw.h drm/i915/perf: Add 'render basic' Gen8+ OA unit configs 2017-06-14 12:31:57 -07:00
i915_oa_bxt.c drm/i915/perf: Add more OA configs for BDW, CHV, SKL + BXT 2017-06-14 12:31:57 -07:00
i915_oa_bxt.h drm/i915/perf: Add 'render basic' Gen8+ OA unit configs 2017-06-14 12:31:57 -07:00
i915_oa_chv.c drm/i915/perf: Add more OA configs for BDW, CHV, SKL + BXT 2017-06-14 12:31:57 -07:00
i915_oa_chv.h drm/i915/perf: Add 'render basic' Gen8+ OA unit configs 2017-06-14 12:31:57 -07:00
i915_oa_glk.c drm/i915/perf: add GLK support 2017-06-14 12:31:58 -07:00
i915_oa_glk.h drm/i915/perf: add GLK support 2017-06-14 12:31:58 -07:00
i915_oa_hsw.c drm/i915/perf: Add more OA configs for BDW, CHV, SKL + BXT 2017-06-14 12:31:57 -07:00
i915_oa_hsw.h drm/i915/perf: rework mux configurations queries 2017-06-14 12:31:57 -07:00
i915_oa_kblgt2.c drm/i915/perf: add KBL support 2017-06-14 12:31:58 -07:00
i915_oa_kblgt2.h drm/i915/perf: add KBL support 2017-06-14 12:31:58 -07:00
i915_oa_kblgt3.c drm/i915/perf: add KBL support 2017-06-14 12:31:58 -07:00
i915_oa_kblgt3.h drm/i915/perf: add KBL support 2017-06-14 12:31:58 -07:00
i915_oa_sklgt2.c drm/i915/perf: Add more OA configs for BDW, CHV, SKL + BXT 2017-06-14 12:31:57 -07:00
i915_oa_sklgt2.h drm/i915/perf: Add 'render basic' Gen8+ OA unit configs 2017-06-14 12:31:57 -07:00
i915_oa_sklgt3.c drm/i915/perf: Add more OA configs for BDW, CHV, SKL + BXT 2017-06-14 12:31:57 -07:00
i915_oa_sklgt3.h drm/i915/perf: Add 'render basic' Gen8+ OA unit configs 2017-06-14 12:31:57 -07:00
i915_oa_sklgt4.c drm/i915/perf: Add more OA configs for BDW, CHV, SKL + BXT 2017-06-14 12:31:57 -07:00
i915_oa_sklgt4.h drm/i915/perf: Add 'render basic' Gen8+ OA unit configs 2017-06-14 12:31:57 -07:00
i915_params.c Revert "drm/i915: Add heuristic to determine better way to adjust brightness" 2017-07-21 09:32:48 +03:00
i915_params.h Revert "drm/i915: Add heuristic to determine better way to adjust brightness" 2017-07-21 09:32:48 +03:00
i915_pci.c drm/i915: Disable per-engine reset for Broxton 2017-07-27 09:38:48 +02:00
i915_perf.c drm/i915: Fix error checking/locking in perf/lookup_context() 2017-07-17 14:22:17 +03:00
i915_pvinfo.h drm/i915: Fix GVT-g PVINFO version compatibility check 2017-06-13 11:19:22 +03:00
i915_reg.h drm/i915: prepare pipe for YCBCR420 output 2017-07-27 09:38:55 +02:00
i915_selftest.h
i915_suspend.c
i915_sw_fence.c main drm pull for v4.13 2017-07-09 18:48:37 -07:00
i915_sw_fence.h main drm pull for v4.13 2017-07-09 18:48:37 -07:00
i915_syncmap.c drm/i915: Squash repeated awaits on the same fence 2017-05-03 11:08:48 +01:00
i915_syncmap.h drm/i915: Squash repeated awaits on the same fence 2017-05-03 11:08:48 +01:00
i915_sysfs.c drm/i915/cnl: Inherit RPS stuff from previous platforms. 2017-07-07 09:13:35 -07:00
i915_trace_points.c
i915_trace.h drm/i915: Add g4x watermark tracepoint 2017-05-10 16:48:32 +03:00
i915_utils.h drm/i915: Store a direct lookup from object handle to vma 2017-06-16 16:54:04 +01:00
i915_vgpu.c drm/i915: Fix GVT-g PVINFO version compatibility check 2017-06-13 11:19:22 +03:00
i915_vgpu.h
i915_vma.c main drm pull for v4.13 2017-07-09 18:48:37 -07:00
i915_vma.h drm/i915: Stash a pointer to the obj's resv in the vma 2017-06-16 16:54:05 +01:00
intel_acpi.c ACPI: Switch to use generic guid_t in acpi_evaluate_dsm() 2017-06-07 12:20:49 +02:00
intel_atomic_plane.c drm/i915/skl+: Check for supported plane configuration in Interlace mode 2017-07-04 16:52:30 +03:00
intel_atomic.c drm/i915/cnl: Fix Cannonlake scaler mode programing. 2017-06-12 09:45:55 -07:00
intel_audio.c drm/i915: Reorganize intel_lpe_audio_notify() arguments 2017-05-03 16:20:48 +03:00
intel_bios.c drm/i915/cnl: Don't trust VBT's alternate pin for port D for now. 2017-07-07 09:11:46 -07:00
intel_bios.h
intel_breadcrumbs.c drm/i915: Skip adding the request to the signal tree is complete 2017-06-08 12:33:08 +01:00
intel_cdclk.c drm/i915: reintroduce VLV/CHV PFI programming power domain workaround 2017-07-03 16:12:44 +03:00
intel_color.c drm/i915: prepare csc unit for YCBCR420 output 2017-07-27 09:38:56 +02:00
intel_crt.c drm/i915: Convert intel_crt connector properties to atomic. 2017-04-12 10:53:22 +02:00
intel_csr.c drm/i915: Use HAS_CSR instead of gen number on DMC load. 2017-06-12 09:45:30 -07:00
intel_ddi.c drm/i915: prepare pipe for YCBCR420 output 2017-07-27 09:38:55 +02:00
intel_device_info.c drm/i915: Use HAS_PCH_CPT() everywhere 2017-06-22 19:08:34 +03:00
intel_display.c drm/i915: prepare csc unit for YCBCR420 output 2017-07-27 09:38:56 +02:00
intel_dp_aux_backlight.c Revert "drm/i915: Add heuristic to determine better way to adjust brightness" 2017-07-21 09:32:48 +03:00
intel_dp_link_training.c drm/i915: Explicit the connector name for DP link training result 2017-07-19 08:32:42 +02:00
intel_dp_mst.c drm/i915: Drop FBDEV #ifdev in mst code 2017-07-06 10:00:33 +02:00
intel_dp.c main drm pull for v4.13 2017-07-10 21:56:39 +02:00
intel_dpio_phy.c
intel_dpll_mgr.c drm/i915/cnl: make function cnl_ddi_dp_set_dpll_hw_state static 2017-06-15 15:53:56 +03:00
intel_dpll_mgr.h drm/i915/cnl: Initialize PLLs 2017-06-12 09:42:18 -07:00
intel_drv.h drm/i915: add config function for YCBCR420 outputs 2017-07-27 09:38:55 +02:00
intel_dsi_dcs_backlight.c drm/i915: Pass atomic state to backlight enable/disable/set callbacks. 2017-06-12 16:06:28 +02:00
intel_dsi_pll.c
intel_dsi_vbt.c drm/i915/glk: Calculate high/low switch count for GLK 2017-05-15 18:29:46 +03:00
intel_dsi.c drm/i915/glk: Add cold boot sequence for GLK DSI 2017-06-15 23:03:57 +03:00
intel_dsi.h
intel_dvo.c drm/i915: Convert intel DVO connector to atomic 2017-04-12 10:53:29 +02:00
intel_engine_cs.c drm/i915: Move idle checks before intel_engine_init_global_seqno() 2017-07-27 09:38:46 +02:00
intel_fbc.c Linux 4.12-rc5 2017-06-16 13:58:27 +10:00
intel_fbdev.c Merge airlied/drm-next into drm-intel-next-queued 2017-07-27 09:33:49 +02:00
intel_fifo_underrun.c drm/i915: Consistently use enum pipe for PCH transcoders 2017-07-18 08:39:03 +02:00
intel_frontbuffer.c
intel_frontbuffer.h
intel_guc_ct.c drm/i915/guc: Introduce buffer based cmd transport 2017-05-26 13:26:53 +01:00
intel_guc_ct.h drm/i915/guc: Introduce buffer based cmd transport 2017-05-26 13:26:53 +01:00
intel_guc_fwif.h drm/i915/guc: Introduce buffer based cmd transport 2017-05-26 13:26:53 +01:00
intel_guc_loader.c drm/i915/guc: Load GuC on Coffee Lake 2017-06-09 11:56:53 -07:00
intel_guc_log.c drm/i915: Treat WC a separate cache domain 2017-04-12 12:35:17 +01:00
intel_gvt.c drm/i915/gvt: Return -EIO if host GuC submission is enabled when loading GVT-g 2017-05-30 16:00:07 +03:00
intel_gvt.h drm/i915/gvt: Add gvt options sanitize function 2017-05-30 15:59:47 +03:00
intel_hangcheck.c drm/i915: Check execlist/ring status during hangcheck 2017-07-27 09:38:45 +02:00
intel_hdmi.c drm/i915/glk: set HDMI 2.0 identifier 2017-07-27 09:38:56 +02:00
intel_hotplug.c drm/atomic: Acquire connection_mutex lock in drm_helper_probe_single_connector_modes, v4. 2017-04-06 21:29:23 +02:00
intel_huc.c drm/i915/huc: Load HuC on Coffee Lake 2017-06-09 11:57:16 -07:00
intel_i2c.c drm/i915/cnp: add CNP gmbus support 2017-06-02 13:59:32 -07:00
intel_lpe_audio.c drm/i915: Stop pretending to mask/unmask LPE audio interrupts 2017-05-26 11:51:18 +03:00
intel_lrc.c drm/i915: Flush the execlist ports if idle 2017-07-27 09:38:45 +02:00
intel_lrc.h drm/i915: Sanitize engine context sizes 2017-04-28 12:11:59 +03:00
intel_lspcon.c drm/i915: use drm DP helper to read DPCD desc 2017-05-29 13:37:46 +03:00
intel_lvds.c drm/i915: Pass crtc_state and connector state to backlight enable/disable functions 2017-06-12 16:05:45 +02:00
intel_mocs.c drm/i915/cnl: Cannonlake has same MOCS table than Skylake. 2017-06-07 07:29:51 -07:00
intel_mocs.h
intel_modes.c
intel_opregion.c drm/i915: Pass connector state to intel_panel_set_backlight_acpi 2017-06-12 16:06:10 +02:00
intel_overlay.c drm/i915: Remove pipe A quirk remnants 2017-06-15 15:38:27 +03:00
intel_panel.c drm/i915: prepare scaler for YCBCR420 modeset 2017-07-27 09:38:55 +02:00
intel_pipe_crc.c drm/i915: use memdup_user_nul 2017-05-08 09:28:39 +02:00
intel_pm.c drm/i915: Fix bad comparison in skl_compute_plane_wm, v2. 2017-07-19 13:51:58 +02:00
intel_psr.c drm/i915/psr: disable psr2 for resolution greater than 32X20 2017-06-07 16:31:06 +03:00
intel_renderstate_gen6.c
intel_renderstate_gen7.c
intel_renderstate_gen8.c
intel_renderstate_gen9.c
intel_renderstate.h
intel_ringbuffer.c drm/i915: Enforce that CS packets are qword aligned 2017-07-27 09:38:57 +02:00
intel_ringbuffer.h drm/i915: Look for active requests earlier in the reset path 2017-06-20 21:00:03 +01:00
intel_runtime_pm.c drm/i915: Gather all the power well->domain mappings to one place 2017-07-27 09:38:54 +02:00
intel_sdvo_regs.h
intel_sdvo.c Merge airlied/drm-next into drm-misc-next 2017-07-26 13:43:33 +02:00
intel_sideband.c
intel_sprite.c drm/i915: Remove intel_flip_work infrastructure 2017-07-20 22:45:36 +02:00
intel_tv.c drm/i915: Convert intel_tv connector properties to atomic, v5. 2017-04-12 10:53:22 +02:00
intel_uc.c drm/i915/guc: Clear enable_guc_loading in case of init failure 2017-06-08 12:21:19 +03:00
intel_uc.h Linux 4.12-rc5 2017-06-16 13:58:27 +10:00
intel_uncore.c drm/i915/cnl: Add force wake for gen10+. 2017-07-06 13:22:37 -07:00
intel_uncore.h drm/i915: Keep the forcewake timer alive for 1ms past the most recent use 2017-05-26 15:58:21 +01:00
intel_vbt_defs.h
Kconfig drm/i915: select CRC32 2017-06-21 11:13:27 +02:00
Kconfig.debug Merge tag 'drm-intel-next-2017-05-29' of git://anongit.freedesktop.org/git/drm-intel into drm-next 2017-05-30 15:25:28 +10:00
Makefile drm/i915/perf: add GLK support 2017-06-14 12:31:58 -07:00