linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-12-25 12:04:46 +08:00

Author	SHA1	Message	Date
Chris Wilson	ce8ff099c4	drm/i915: i915_gem_object_create_from_data() doesn't require struct_mutex Both object creation and backing storage page allocation do not require struct_mutex, so do not require the caller to take it. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170317194648.12468-1-chris@chris-wilson.co.uk Reviewed-by: Matthew Auld <matthew.auld@intel.com>	2017-03-17 22:54:06 +00:00
Chris Wilson	51a575d957	drm/i915: Retire an active batch pool object rather than allocate new Since obj->active_count is only updated upon retirement, if we see an active object in the batch pool, double check that is still active before deciding to allocate a new object. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170316132006.7976-3-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-17 17:57:20 +00:00
Chris Wilson	6c943de668	drm/i915: Skip execlists_dequeue() early if the list is empty Do an early read of the execlists' queue before we take the spinlock and start checking. This is safe as the first writer to the execlists queue will cause the tasklet to be run again after a memory barrier. v2: Keep guc in sync with execlists queue changes v3: Explain the mb between the tasklet running on one cpu and the execlist_first update and schedule from a second cpu. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170317120716.17191-1-chris@chris-wilson.co.uk	2017-03-17 15:53:26 +00:00
Chris Wilson	e637d2cba8	drm/i915: Stop using obj->obj_exec_link outside of execbuf i915_gem_stolen_list_info() sneakily takes advantage of the obj->obj_exec_link to save itself from having to allocate. Enough of the subterfuge, just allocate an array of pointers and sort them instead of the list. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170316132006.7976-7-chris@chris-wilson.co.uk	2017-03-17 15:53:26 +00:00
Chris Wilson	facbecad71	drm/i915: Squelch WARN for VLV_COUNTER_CONTROL Before rc6 is initialised (after driver load or resume), the value inside VLV_COUNTER_CONTROL is undefined so we cannot make an assertion that is in HIGH_RANGE mode. Fixes: `6b7f6aa75e` ("drm/i915: Use coarse grained residency counter with byt") Testcase: igt/drv_suspend/debugfs-reader Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170317125918.11351-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>	2017-03-17 15:53:26 +00:00
Ander Conselvan de Oliveira	234516afbb	drm/i915/glk: Enable pooled EUs for Geminilake Geminilake also supports pooled EUs. Enable it. It is unclear if the recommendation to disable it for 2x6 configurations from commit `e015dd69b2` ("drm/i915/bxt: Add WaEnablePooledEuFor2x6") should also apply to GLK, but it is applied anyway to be on the safe side. That restriction can be lifted later if determined not to impact performance. The extra restriction should not impact user space either. The only user space that uses this feature is Beignet, and it only does so for 3x6 devices. See See Beignet's commit 6901899ec90a ("Runtime: set the sub slice according to kernel pooled EU configure."). v2: Improve commit message. (Mika, Roy) Cc: Arun Siluvery <arun.siluvery@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Yang Rong <rong.r.yang@intel.com> Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170317140436.24645-1-ander.conselvan.de.oliveira@intel.com	2017-03-17 17:05:36 +02:00
Chris Wilson	e642c85b03	drm/i915: Remove superfluous i915_add_request_no_flush() helper The only time we need to emit a flush inside request emission is after an execbuffer, for which we can use the full __i915_add_request(). All other instances want the simpler i915_add_request() without flushing, so remove the useless helper. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170317114709.8388-1-chris@chris-wilson.co.uk	2017-03-17 13:03:25 +00:00
Tvrtko Ursulin	e3b1895fc1	drm/i915/vgpu: Neuter forcewakes for VGPU more thoroughly If we avoid initializing forcewake domains when running as a guest, and also use gen2 mmio accessors in that case, we can avoid the timer traffic and any looping through the forcewake code which is currently just so it can end up in the no-op forcewake implementation. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Weinan Li <weinan.z.li@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Terrence Xu <terrence.xu@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170310095747.12258-1-tvrtko.ursulin@linux.intel.com [tursulin: commit spelling fix]	2017-03-17 09:52:57 +00:00
Zhenyu Wang	fa7e8b55e9	drm/i915: Fix vGPU balloon for ggtt guard page From commit `a6508ded2a` ("drm/i915: Use page coloring to provide the guard page at the end of the GTT"), we no longer explicitly subtract guard page at end for GGTT address space init, so shouldn't subtract that for vGPU balloon too, as that will leave that end page to be available for vGPU. Change balloon to cover full range too. This fixes to use recent drm-intel tip kernel for guest OS. Found by GVT-g cmd parser that guest kernel uses end page as scratch then try to run MI_STORE_REG_MEM onto it. v2: remove old comments Cc: Terrence Xu <terrence.xu@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170310022238.3191-1-zhenyuw@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-03-17 09:41:27 +00:00
Chris Wilson	60367132a2	drm/i915: Avoid use-after-free of ctx in request tracepoints trace_i915_gem_request_out may be used after the request is completed, and so the request may have been retired on another thread, invalidating the rq->ctx. Avoid dereferencing rq->ctx in the tracepoint by switching to the fence context id instead, updating all tracepoints to match. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170316204235.27786-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2017-03-17 07:59:48 +00:00
Chris Wilson	a533b4ba77	drm/i915: Assert that the context pin_counts do not overflow This should be impossible, but let's assert that we do not pin a context 4 billion times before retiring! v2: Fix the assertion -- the patch had just one job to do! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170316171628.3228-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>	2017-03-16 20:48:58 +00:00
Chris Wilson	d3df42b76f	drm/i915: Wait for reset to complete before returning from debugfs/i915_wedged Provide some serialisation between user operations by waiting for the reset initiated by setting i915_wedged to complete. The automatic wait here makes echo 1 > i915_wedged; cat i915_error_state do the right thing, and not risk reporting "No error collected". Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170316171305.12972-4-chris@chris-wilson.co.uk	2017-03-16 17:17:15 +00:00
Chris Wilson	2e8f9d3229	drm/i915: Restore engine->submit_request before unwedging When we wedge the device, we override engine->submit_request with a nop to ensure that all in-flight requests are marked in error. However, igt would like to unwedge the device to test -EIO handling. This requires us to flush those in-flight requests and restore the original engine->submit_request. v2: Use a vfunc to unify enabling request submission to engines v3: Split new vfunc to a separate patch. v4: Make the wait interruptible -- the third party fences we wait upon may be indefinitely broken, so allow the reset to be aborted. Fixes: `821ed7df6e` ("drm/i915: Update reset path to fix incomplete requests") Testcase: igt/gem_eio Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v3 Link: http://patchwork.freedesktop.org/patch/msgid/20170316171305.12972-3-chris@chris-wilson.co.uk	2017-03-16 17:17:14 +00:00
Chris Wilson	ff44ad51eb	drm/i915: Move engine->submit_request selection to a vfunc It turns out that we may want to restore the original engine->submit_request (and engine->schedule) callbacks from more than just the guc <-> execlists transition. Move this to a vfunc so we can have a common interface. v2: Move initial selection to intel_engines_init_common(), repaint vfunc with engine->set_default_submission (and a similar colour for the helper). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170316171305.12972-2-chris@chris-wilson.co.uk	2017-03-16 17:17:12 +00:00
Chris Wilson	8c185ecaf4	drm/i915: Split I915_RESET_IN_PROGRESS into two flags I915_RESET_IN_PROGRESS is being used for both signaling the requirement to i915_mutex_lock_interruptible() to avoid taking the struct_mutex and to instruct a waiter (already holding the struct_mutex) to perform the reset. To allow for a little more coordination, split these two meaning into a couple of distinct flags. I915_RESET_BACKOFF tells i915_mutex_lock_interruptible() not to acquire the mutex and I915_RESET_HANDOFF tells the waiter to call i915_reset(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170316171305.12972-1-chris@chris-wilson.co.uk	2017-03-16 17:17:10 +00:00
Changbin Du	3fc03069bc	drm/i915: make context status notifier head be per engine GVTg has introduced the context status notifier to schedule the GVTg workload. At that time, the notifier is bound to GVTg context only, so GVTg is not aware of host workloads. Now we are going to improve GVTg's guest workload scheduler policy, and add Guc emulation support for new Gen graphics. Both these two features require acknowledgment for all contexts running on hardware. (But will not alter host workload.) So here try to make some change. The change is simple: 1. Move the context status notifier head from i915_gem_context to intel_engine_cs. Which means there is a notifier head per engine instead of per context. Execlist driver still call notifier for each context sched-in/out events of current engine. 2. At GVTg side, it binds a notifier_block for each physical engine at GVTg initialization period. Then GVTg can hear all context status events. In this patch, GVTg do nothing for host context event, but later will add a function there. But in any case, the notifier callback is a noop if this is no active vGPU. Since intel_gvt_init() is called at early initialization stage and require the status notifier head has been initiated, I initiate it in intel_engine_setup(). v2: remove a redundant newline. (chris) Fixes: `3c7ba6359d` ("drm/i915: Introduce execlist context status change notification") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100232 Signed-off-by: Changbin Du <changbin.du@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Zhi Wang <zhi.a.wang@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170313024711.28591-1-changbin.du@intel.com Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-03-16 16:24:35 +00:00
Chris Wilson	31de73501a	drm/i915/scheduler: emulate a scheduler for guc This emulates execlists on top of the GuC in order to defer submission of requests to the hardware. This deferral allows time for high priority requests to gazump their way to the head of the queue, however it nerfs the GuC by converting it back into a simple execlist (where the CPU has to wake up after every request to feed new commands into the GuC). v2: Drop hack status - though iirc there is still a lockdep inversion between fence and engine->timeline->lock (which is impossible as the nesting only occurs on different fences - hopefully just requires some judicious lockdep annotation) v3: Apply lockdep nesting to enabling signaling on the request, using the pattern we already have in __i915_gem_request_submit(); v4: Replaying requests after a hang also now needs the timeline spinlock, to disable the interrupts at least v5: Hold wq lock for completeness, and emit a tracepoint for enabling signal v6: Reorder interrupt checking for a happier gcc. v7: Only signal the tasklet after a user-interrupt if using guc scheduling v8: Restore lost update of rq through the i915_guc_irq_handler (Tvrtko) v9: Avoid re-initialising the engine->irq_tasklet from inside a reset v10: Hook up the execlists-style tracepoints v11: Clear the execlists irq_posted bit after taking over the interrupt/tasklet Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170316125619.6856-1-chris@chris-wilson.co.uk Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>	2017-03-16 14:45:07 +00:00
Chris Wilson	14a6bbf9e5	drm/i915: Replace irq_seqno_barrier on hws write with a clflush When manually overwriting the HWS, rather than assume irq_seqno_barrier does the right thing, we can explicitly flush the cacheline instead. This avoids us calling the engine->irq_seqno_barrier() from an illegal context: [ 1472.651797] BUG: scheduling while atomic: migration/0/11/0x00000002 [ 1472.651807] Modules linked in: ctr ccm arc4 snd_hda_codec_hdmi bnep rfcomm iwldvm snd_hda_codec_conexant snd_hda_codec_generic snd_hda_intel mac80211 snd_hda_codec snd_hda_core snd_pcm dm_multipath snd_hwdep intel_powerclamp coretemp snd_seq_midi crct10dif_pclmul snd_seq_midi_event crc32_pclmul iwlwifi ghash_clmulni_intel btusb snd_rawmidi btrtl aesni_intel btbcm aes_x86_64 crypto_simd btintel cryptd glue_helper bluetooth snd_seq cfg80211 snd_timer snd_seq_device intel_ips binfmt_misc snd mei_me soundcore mei dm_mirror dm_region_hash dm_log i915 intel_gtt i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea prime_numbers e1000e drm ahci libahci [ 1472.651897] CPU: 0 PID: 11 Comm: migration/0 Tainted: G U 4.11.0-rc1+ #203 [ 1472.651899] Hardware name: LENOVO 514328U/514328U, BIOS 6QET44WW (1.14 ) 04/20/2010 [ 1472.651900] Call Trace: [ 1472.651913] dump_stack+0x63/0x90 [ 1472.651922] __schedule_bug+0x5d/0x6b [ 1472.651930] __schedule+0x46a/0x5f0 [ 1472.651934] schedule+0x38/0x90 [ 1472.651938] schedule_hrtimeout_range_clock+0x85/0x110 [ 1472.651945] ? hrtimer_init+0x10/0x10 [ 1472.651949] schedule_hrtimeout_range+0xe/0x10 [ 1472.651952] usleep_range+0x4d/0x60 [ 1472.652037] gen5_seqno_barrier+0x13/0x20 [i915] [ 1472.652101] intel_engine_init_global_seqno+0xd7/0x160 [i915] [ 1472.652160] __i915_gem_set_wedged_BKL+0xa0/0x180 [i915] [ 1472.652166] multi_cpu_stop+0xbb/0xe0 [ 1472.652170] ? cpu_stop_queue_work+0x90/0x90 [ 1472.652174] cpu_stopper_thread+0x82/0x110 [ 1472.652179] smpboot_thread_fn+0x137/0x190 [ 1472.652184] kthread+0xf7/0x130 [ 1472.652187] ? sort_range+0x20/0x20 [ 1472.652191] ? kthread_park+0x90/0x90 [ 1472.652195] ret_from_fork+0x2c/0x40 Testcase: igt/gem_eio #ilk Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170314111452.9375-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>	2017-03-16 14:26:28 +00:00
Mika Kuoppala	6b7f6aa75e	drm/i915: Use coarse grained residency counter with byt Set byt rc residency counters high level as chv does by default. We lose some accuracy on byt but we can do the calculation without extra hw read on both platforms, as now they behave identically in this respect. v2: use ktime v3: keep comparison u32 (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1489592584-10422-1-git-send-email-mika.kuoppala@intel.com	2017-03-16 12:28:28 +02:00
Mika Kuoppala	679cb6c132	drm/i915: Use ktime to calculate rc0 residency We have used cz timestamp register to gain a reference time wrt to residency calculations. The residency counts are in cz clk ticks (333Mhz clock) but for some reason the cz timestamp register gives 100us units. Perhaps for some other usage, the base-ten based values are easier, but in residency calculations raw units would have been the easiest. As there is not much advantage of using base-ten clock through a more costly punit access, take our reference times directly from kernel clock. v2: use ktime (Chris, Ville) Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-03-16 12:28:28 +02:00
Mika Kuoppala	1362877ed2	drm/i915: Convert debugfs to use generic residency calculator Use intel_rc6_residency to get benefit for increased resolution in byt/chv. v2: output raw and time (Chris) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-03-16 12:28:28 +02:00
Mika Kuoppala	47c21d9a1a	drm/i915: Extend vlv/chv residency resolution Vlv and chv residency counters are 40 bits in width. With a control bit, we can choose between upper or lower 32 bit window into this counter. Lets toggle this bit on and off on and read both parts. As a result we can push the wrap from 13 seconds to 54 minutes. v2: commit msg, loop readability, goto elimination (Chris) v3: bug ref, divide outside runtime pm lock (Chris) References: https://bugs.freedesktop.org/show_bug.cgi?id=94852 Reported-by: Len Brown <len.brown@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-03-16 12:28:28 +02:00
Mika Kuoppala	c5a0ad114b	drm/i915: Return residency as microseconds Change the granularity from milliseconds to microseconds when returning rc6 residencies. This is in preparation for increased resolution on some platforms. v2: use 64bit div macro (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-03-16 12:28:28 +02:00
Mika Kuoppala	135bafa551	drm/i915: Move residency calculation into intel_pm.c Plan is to make generic residency calculation utility function for usage outside of sysfs. As a first step move residency calculation into intel_pm.c Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-03-16 12:28:28 +02:00
Chris Wilson	15c344f4d0	drm/i915/userptr: Reinvent GGTT self-faulting protection lockdep doesn't like us taking the mm->mmap_sem inside the get_pages callback for a couple of reasons. The straightforward deadlock: [13755.434059] ============================================= [13755.434061] [ INFO: possible recursive locking detected ] [13755.434064] 4.11.0-rc1-CI-CI_DRM_297+ #1 Tainted: G U [13755.434066] --------------------------------------------- [13755.434068] gem_userptr_bli/8398 is trying to acquire lock: [13755.434070] (&mm->mmap_sem){++++++}, at: [<ffffffffa00c988a>] i915_gem_userptr_get_pages+0x5a/0x2e0 [i915] [13755.434096] but task is already holding lock: [13755.434098] (&mm->mmap_sem){++++++}, at: [<ffffffff8104d485>] __do_page_fault+0x105/0x560 [13755.434105] other info that might help us debug this: [13755.434108] Possible unsafe locking scenario: [13755.434110] CPU0 [13755.434111] ---- [13755.434112] lock(&mm->mmap_sem); [13755.434115] lock(&mm->mmap_sem); [13755.434117] * DEADLOCK * [13755.434121] May be due to missing lock nesting notation [13755.434126] 2 locks held by gem_userptr_bli/8398: [13755.434128] #0: (&mm->mmap_sem){++++++}, at: [<ffffffff8104d485>] __do_page_fault+0x105/0x560 [13755.434135] #1: (&obj->mm.lock){+.+.+.}, at: [<ffffffffa00b887d>] __i915_gem_object_get_pages+0x1d/0x70 [i915] [13755.434156] stack backtrace: [13755.434161] CPU: 3 PID: 8398 Comm: gem_userptr_bli Tainted: G U 4.11.0-rc1-CI-CI_DRM_297+ #1 [13755.434165] Hardware name: GIGABYTE GB-BKi7(H)A-7500/MFLP7AP-00, BIOS F4 02/20/2017 [13755.434169] Call Trace: [13755.434174] dump_stack+0x67/0x92 [13755.434178] __lock_acquire+0x133a/0x1b50 [13755.434182] lock_acquire+0xc9/0x220 [13755.434200] ? i915_gem_userptr_get_pages+0x5a/0x2e0 [i915] [13755.434204] down_read+0x42/0x70 [13755.434221] ? i915_gem_userptr_get_pages+0x5a/0x2e0 [i915] [13755.434238] i915_gem_userptr_get_pages+0x5a/0x2e0 [i915] [13755.434255] ____i915_gem_object_get_pages+0x25/0x60 [i915] [13755.434272] __i915_gem_object_get_pages+0x59/0x70 [i915] [13755.434288] i915_gem_fault+0x397/0x6a0 [i915] [13755.434304] ? i915_gem_fault+0x1a1/0x6a0 [i915] [13755.434308] ? __lock_acquire+0x449/0x1b50 [13755.434311] ? __lock_acquire+0x449/0x1b50 [13755.434315] ? vm_mmap_pgoff+0xa9/0xd0 [13755.434318] __do_fault+0x19/0x70 [13755.434321] __handle_mm_fault+0x863/0xe50 [13755.434325] handle_mm_fault+0x17f/0x370 [13755.434329] ? handle_mm_fault+0x40/0x370 [13755.434332] __do_page_fault+0x279/0x560 [13755.434336] do_page_fault+0xc/0x10 [13755.434339] page_fault+0x22/0x30 [13755.434342] RIP: 0033:0x7f5ab91b5880 [13755.434345] RSP: 002b:00007fff62922218 EFLAGS: 00010216 [13755.434348] RAX: 0000000000b74500 RBX: 00007f5ab7f81000 RCX: 0000000000000000 [13755.434352] RDX: 0000000000100000 RSI: 00007f5ab7f81000 RDI: 00007f5aba61c000 [13755.434355] RBP: 00007f5aba61c000 R08: 0000000000000007 R09: 0000000100000000 [13755.434359] R10: 000000000000037d R11: 00007f5ab91b5840 R12: 0000000000000001 [13755.434362] R13: 0000000000000005 R14: 0000000000000001 R15: 0000000000000000 and cyclic deadlocks: [ 2566.458979] ====================================================== [ 2566.459054] [ INFO: possible circular locking dependency detected ] [ 2566.459127] 4.11.0-rc1+ #26 Not tainted [ 2566.459194] ------------------------------------------------------- [ 2566.459266] gem_streaming_w/759 is trying to acquire lock: [ 2566.459334] (&obj->mm.lock){+.+.+.}, at: [<ffffffffa034bc80>] i915_gem_object_pin_pages+0x0/0xc0 [i915] [ 2566.459605] [ 2566.459605] but task is already holding lock: [ 2566.459699] (&mm->mmap_sem){++++++}, at: [<ffffffff8106fd11>] __do_page_fault+0x121/0x500 [ 2566.459814] [ 2566.459814] which lock already depends on the new lock. [ 2566.459814] [ 2566.459934] [ 2566.459934] the existing dependency chain (in reverse order) is: [ 2566.460030] [ 2566.460030] -> #1 (&mm->mmap_sem){++++++}: [ 2566.460139] lock_acquire+0xfe/0x220 [ 2566.460214] down_read+0x4e/0x90 [ 2566.460444] i915_gem_userptr_get_pages+0x6e/0x340 [i915] [ 2566.460669] ____i915_gem_object_get_pages+0x8b/0xd0 [i915] [ 2566.460900] __i915_gem_object_get_pages+0x6a/0x80 [i915] [ 2566.461132] __i915_vma_do_pin+0x7fa/0x930 [i915] [ 2566.461352] eb_add_vma+0x67b/0x830 [i915] [ 2566.461572] eb_lookup_vmas+0xafe/0x1010 [i915] [ 2566.461792] i915_gem_do_execbuffer+0x715/0x2870 [i915] [ 2566.462012] i915_gem_execbuffer2+0x106/0x2b0 [i915] [ 2566.462152] drm_ioctl+0x36c/0x670 [drm] [ 2566.462236] do_vfs_ioctl+0x12c/0xa60 [ 2566.462317] SyS_ioctl+0x41/0x70 [ 2566.462399] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 2566.462477] [ 2566.462477] -> #0 (&obj->mm.lock){+.+.+.}: [ 2566.462587] __lock_acquire+0x1602/0x1790 [ 2566.462661] lock_acquire+0xfe/0x220 [ 2566.462893] i915_gem_object_pin_pages+0x4c/0xc0 [i915] [ 2566.463116] i915_gem_fault+0x2c2/0x8c0 [i915] [ 2566.463197] __do_fault+0x42/0x130 [ 2566.463276] __handle_mm_fault+0x92c/0x1280 [ 2566.463356] handle_mm_fault+0x1e2/0x440 [ 2566.463443] __do_page_fault+0x1c4/0x500 [ 2566.463529] do_page_fault+0xc/0x10 [ 2566.463613] page_fault+0x1f/0x30 [ 2566.463693] [ 2566.463693] other info that might help us debug this: [ 2566.463693] [ 2566.463820] Possible unsafe locking scenario: [ 2566.463820] [ 2566.463918] CPU0 CPU1 [ 2566.463988] ---- ---- [ 2566.464068] lock(&mm->mmap_sem); [ 2566.464143] lock(&obj->mm.lock); [ 2566.464226] lock(&mm->mmap_sem); [ 2566.464304] lock(&obj->mm.lock); [ 2566.464378] [ 2566.464378] * DEADLOCK * [ 2566.464378] [ 2566.464504] 1 lock held by gem_streaming_w/759: [ 2566.464576] #0: (&mm->mmap_sem){++++++}, at: [<ffffffff8106fd11>] __do_page_fault+0x121/0x500 [ 2566.464699] [ 2566.464699] stack backtrace: [ 2566.464801] CPU: 0 PID: 759 Comm: gem_streaming_w Not tainted 4.11.0-rc1+ #26 [ 2566.464881] Hardware name: GIGABYTE GB-BXBT-1900/MZBAYAB-00, BIOS F8 03/02/2016 [ 2566.464983] Call Trace: [ 2566.465061] dump_stack+0x68/0x9f [ 2566.465144] print_circular_bug+0x20b/0x260 [ 2566.465234] __lock_acquire+0x1602/0x1790 [ 2566.465323] ? debug_check_no_locks_freed+0x1a0/0x1a0 [ 2566.465564] ? i915_gem_object_wait+0x238/0x650 [i915] [ 2566.465657] ? debug_lockdep_rcu_enabled.part.4+0x1a/0x30 [ 2566.465749] lock_acquire+0xfe/0x220 [ 2566.465985] ? i915_sg_trim+0x1b0/0x1b0 [i915] [ 2566.466223] i915_gem_object_pin_pages+0x4c/0xc0 [i915] [ 2566.466461] ? i915_sg_trim+0x1b0/0x1b0 [i915] [ 2566.466699] i915_gem_fault+0x2c2/0x8c0 [i915] [ 2566.466939] ? i915_gem_pwrite_ioctl+0xce0/0xce0 [i915] [ 2566.467030] ? __lock_acquire+0x642/0x1790 [ 2566.467122] ? __lock_acquire+0x642/0x1790 [ 2566.467209] ? debug_lockdep_rcu_enabled+0x35/0x40 [ 2566.467299] ? get_unmapped_area+0x1b4/0x1d0 [ 2566.467387] __do_fault+0x42/0x130 [ 2566.467474] __handle_mm_fault+0x92c/0x1280 [ 2566.467564] ? __pmd_alloc+0x1e0/0x1e0 [ 2566.467651] ? vm_mmap_pgoff+0x160/0x190 [ 2566.467740] ? handle_mm_fault+0x111/0x440 [ 2566.467827] handle_mm_fault+0x1e2/0x440 [ 2566.467914] ? handle_mm_fault+0x5d/0x440 [ 2566.468002] __do_page_fault+0x1c4/0x500 [ 2566.468090] do_page_fault+0xc/0x10 [ 2566.468180] page_fault+0x1f/0x30 [ 2566.468263] RIP: 0033:0x557895ced32a [ 2566.468337] RSP: 002b:00007fffd6dd8a10 EFLAGS: 00010202 [ 2566.468419] RAX: 00007f659a4db000 RBX: 0000000000000003 RCX: 00007f659ad032da [ 2566.468501] RDX: 0000000000000000 RSI: 0000000000100000 RDI: 0000000000000000 [ 2566.468586] RBP: 0000000000000007 R08: 0000000000000003 R09: 0000000100000000 [ 2566.468667] R10: 0000000000000001 R11: 0000000000000246 R12: 0000557895ceda60 [ 2566.468749] R13: 0000000000000001 R14: 00007fffd6dd8ac0 R15: 00007f659a4db000 By checking the status of the gup worker (serialized by the obj->mm.lock) we can determine whether it is still active, has failed or has succeeded. If the worker is still active (or failed), we know that it cannot be bound and so we can skip taking struct_mutex (risking potential recursion). As we check the worker status, we mark it to discard any partial results, forcing us to restart on the next get_pages. Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Fixes: `1c8782dd31` ("drm/i915/userptr: Disallow wrapping GTT into a userptr") Testcase: igt/gem_userptr_blits/map-fixed-invalidate-gup Testcase: igt/gem_userptr_blits/dmabuf-sync Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170315140150.19432-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2017-03-16 10:21:25 +00:00
Michal Wajdeczko	d4a70a10f5	drm/i915: Make intel_uc_sanitize_options() more robust After negative guc fw selection we could leave guc submission flag still turned on. Reorder some checks to cover this case. While here, fix info message and return early if there is no Guc. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> [tursulin: fixup bad alignment] Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170315133741.150420-1-michal.wajdeczko@intel.com	2017-03-16 08:57:46 +00:00
Arkadiusz Hiler	6833b82e98	drm/i915/uc: Rename intel_uc_fw.fw to .type This field is used to determine which kind of firmware the struct describes (GuC/HuC) - the name does not reflect. The enum used here have "type" in the name, so let's go with that. Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170315133415.15343-1-arkadiusz.hiler@intel.com	2017-03-16 08:54:04 +00:00
Chris Wilson	a6b0a14128	drm/i915/breadcrumbs: Tweak commentary Tvrtko spotted a stale reference to b->lock (now b->rb_lock) so review the comments and try to improve them in passing. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170315222259.1469-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2017-03-16 08:49:28 +00:00
Chris Wilson	db93991bf5	drm/i915: Only attempt to signal the request once from the interrupt handler Check that request has not been signaled before acquiring a reference to the request for signaling later in the interrupt handler. The loading of the cacheline (for request->fence.flags) should be "free" when followed by the locked increment of the request->fence.refcount (which then sets the cacheline to exclusive mode), i.e. the cost of test_bit prior to an atomic_inc should be negligible. This should benefit us when we have a pile of bare breadcrumbs (interrupted execbuf) where we may get interrupts faster than we can get rid of the intel_wait, or if the device is too slow to run the bottom-half between interrupts. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170315210726.12095-5-chris@chris-wilson.co.uk	2017-03-15 21:45:41 +00:00
Chris Wilson	908a6cbf84	drm/i915/breadcrumbs: Assert that we do not shortcut the current bottom-half We need to ensure that we always serialize updates to the bottom-half using the breadcrumbs.irq_lock so that we don't race with a concurrent interrupt handler. This is most important just prior to leaving the waiter (when the intel_wait will be overwritten), so make sure we are not the current bottom-half when skipping the irq locks. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170315210726.12095-4-chris@chris-wilson.co.uk	2017-03-15 21:45:40 +00:00
Chris Wilson	a5cae7b8ed	drm/i915/breadcrumbs: Disable interrupt bottom-half first on idling Before walking the rbtree of waiters (marking them as complete and waking them), decouple the interrupt handler. This prevents a race between the missed waiter waking up and removing its intel_wait (which skips checking the lock) and the interrupt handler dereferencing the intel_wait. (Though we do not expect to encounter waiters during idle!) Fixes: `e1c0c91bda` ("drm/i915: Wake up all waiters before idling") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170315210726.12095-3-chris@chris-wilson.co.uk	2017-03-15 21:45:39 +00:00
Chris Wilson	429732e860	drm/i915/breadcrumbs: Update bottom-half before marking as complete When adding a new request to the breadcrumb rbtree, we mark all those requests inside the rbtree that are already completed as complete. This wakes those waiters up and allows them to skip the spinlock before returning to userspace. If one of those is the current bottom-half and allocated its intel_wait on the stack, it may then overwrite the b->irq_wait upon exiting i915_wait_request() just as the interrupt handler dereferences it. Fixes: `56299fb7d9` ("drm/i915: Signal first fence from irq handler if complete") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170315210726.12095-2-chris@chris-wilson.co.uk	2017-03-15 21:45:38 +00:00
Chris Wilson	4bd66391dd	drm/i915/breadcrumbs: Use booleans for intel_breadcrumbs_busy() Since commit `9b6586ae9f` ("drm/i915: Keep a global seqno per-engine") converted intel_breadcrumbs_busy() to reporting a single boolean, we need only compute a boolean internally (and not needlessly compute the flag). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170315210726.12095-1-chris@chris-wilson.co.uk	2017-03-15 21:45:38 +00:00
Chris Wilson	1604a86d08	drm/i915: Extend rpm wakelock during i915_handle_error() We take the runtime pm wakelock during i915_handle_error() to ensure that all paths that reach the error handler keep the device awake during the hw reads. However, we need to extend that from the reset handler to include the earlier capture routines. Reported-by: Antonio Argenziano <antonio.argenziano@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michel Thierry <michel.thierry@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170314171840.25706-1-chris@chris-wilson.co.uk Reviewed-by: Michel Thierry <michel.thierry@intel.com>	2017-03-15 15:44:36 +00:00
Michal Wajdeczko	16f11f4696	drm/i915/guc: Use formalized struct definition for ads object Manual pointer manipulation is error prone. Let compiler calculate right offsets for us in case we need to change ads layout. v2: don't call it object (Chris) v3: restyle offset assignments (Chris) v4: stylistic reductions Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/20170314133309.126432-1-michal.wajdeczko@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-03-15 15:43:43 +00:00
Arkadiusz Hiler	b3420dde38	drm/i915/uc: Add params for specifying firmware `guc_firmware_path` and `huc_firmware_path` module parameters are added. Using the parameter disables version checks and loads desired firmware instead of the default one. v2: make params unsafe && notice about disabled fw check (J. Lahtinen) Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-15 14:26:30 +02:00
Arkadiusz Hiler	b551f610b3	drm/i915/uc: Separate firmware selection and preparation intel_{h,g}uc_init_fw selects correct firmware and then triggers it's preparation (fetch + initial parsing). This change separates out select steps, so those can be called by the sanitize_options(). Then, during the init_fw(), we prepare the firmware if the firmware was selected. Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-15 14:26:30 +02:00
Arkadiusz Hiler	8fc2a4e427	drm/i915/uc: Simplify firmware path handling Currently fw->path values can represent one of three possible states: 1) NULL - device without the uC 2) '\0' - device with the uC but have no firmware 3) else - device with the uC and we have firmware Second case is used only to WARN at a later stage. We can WARN right away and merge cases 1 and 2. Code can be even further simplified and common (HuC/GuC logic) happening right before the fetch can be offloaded to the common function. v2: fewer temporary variables, more straightforward flow (M. Wajdeczko) v3: DRM_ERROR instead of WARN (M. Wajdeczko) v4: coding standard (J. Lahtinen) v5: non-trivial rebase v6: remove path check, we are checking fetch status (M. Wajdeczko) Cc: Anusha Srivatsa <anusha.srivatsa@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-15 14:26:30 +02:00
Arkadiusz Hiler	6cd5a72c35	drm/i915/guc: Simplify intel_guc_init_hw() Current version of intel_guc_init_hw() does a lot: - cares about submission - loads huc - implement WA This change offloads some of the logic to intel_uc_init_hw(), which now cares about the above. v2: rename guc_hw_reset and fix typo in define name (M. Wajdeczko) v3: rename once again v4: remove spurious comments and add some style (J. Lahtinen) v5: flow changes, got rid of dead checks (M. Wajdeczko) v6: rebase v7: rebase & onion teardown (J. Lahtinen) Cc: Anusha Srivatsa <anusha.srivatsa@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-15 14:26:30 +02:00
Arkadiusz Hiler	d2be9f2f41	drm/i915/guc: Extract param logic form guc_init_fw() Let intel_guc_init_fw() focus on determining and fetching the correct firmware. This patch introduces intel_uc_sanitize_options() that is called from intel_sanitize_options(). Then, if we have GuC, we can call intel_guc_init_fw() conditionally and we do not have to do the internal checks. v2: fix comment, notify when nuking GuC explicitly enabled (M. Wajdeczko) v3: fix comment again, change the nuke message (M. Wajdeczko) v4: update title to reflect new function name + rebase v5: text && remove 2 uneccessary checks (M. Wajdeczko) Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-15 14:26:30 +02:00
Arkadiusz Hiler	29ad6a30de	drm/i915/uc: Introduce intel_uc_init_fw() Instead of calling intel_guc_init() and intel_huc_init() one by one this patch introduces intel_uc_init_fw() function that calls them both. Called functions are renamed accordingly. Trying to have subject_verb_object ordering and more descriptive names, the intel_huc_init() and intel_guc_init() functions are renamed. For guc_init(): * `intel_guc` is the subject, so those functions now take intel_guc structure, instead of the dev_priv * init is the verb * fw is the object which better describes the function's role huc_init() change follows the same reasoning. v2: settle on intel_uc_fetch_fw name (M. Wajdeczko) v3: yet another rename - intel_uc_init_fw (J. Lahtinen) v4: non-trivial rebase Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-15 14:26:30 +02:00
Arkadiusz Hiler	4c0fed7911	drm/i915/uc: Move intel_uc_fw_fetch() to intel_uc.c The file fits better. Additionally rename it to intel_uc_prepare_fw(), as the function does more than simple fetch. `obj` cleanup in the function is also fixed (i.e. removed). In the fail scenario it was always 'put' but there's no possible flow that initializes the obj properly and then goes to the fail label. v2: remove second declaration, reorder (M. Wajdeczko) v3: non-trivial rebase v4: remove obj cleanup in the fail scenario (C. Wilson) Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-15 14:26:30 +02:00
Arkadiusz Hiler	882d1db09b	drm/i915/uc: Rename intel_?uc_{setup, load}() to _init_hw() GuC historically has two "startup" functions called _init() and _setup() Then HuC came with it's _init() and _load(). This commit renames intel_guc_setup() and intel_huc_load() to *uc_init_hw() as they called from the i915_gem_init_hw(). The aim is to be consistent in that entry points called during particular driver init phases (e.g. init_hw) are all suffixed by that phase. When reading the leaf functions, it should be clear at what stage during the driver load it is called and therefore what operations are legal at that point. Also, since the functions start with intel_guc and intel_huc they take appropiate structure. v2: commit message update (Chris Wilson) v3: change taken parameters to be more "semantic" (M. Wajdeczko) Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Winiarski <michal.winiarski@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-15 14:26:30 +02:00
Arkadiusz Hiler	50beba55e8	drm/i915/huc: Add huc_to_i915 Used to obtain "dev_priv" from huc struct pointer. We already have similar thing for guc. Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-15 14:26:30 +02:00
Arkadiusz Hiler	0417a2b3a1	drm/i915/uc: Drop superfluous externs in intel_uc.h Externs are implicit and we generally try to avoid them. Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-03-15 14:26:30 +02:00
Chris Wilson	bd784b7cc4	drm/i915: Avoid rcu_barrier() from reclaim paths (shrinker) The rcu_barrier() takes the cpu_hotplug mutex which itself is not reclaim-safe, and so rcu_barrier() is illegal from inside the shrinker. [ 309.661373] ========================================================= [ 309.661376] [ INFO: possible irq lock inversion dependency detected ] [ 309.661380] 4.11.0-rc1-CI-CI_DRM_2333+ #1 Tainted: G W [ 309.661383] --------------------------------------------------------- [ 309.661386] gem_exec_gttfil/6435 just changed the state of lock: [ 309.661389] (rcu_preempt_state.barrier_mutex){+.+.-.}, at: [<ffffffff81100731>] _rcu_barrier+0x31/0x160 [ 309.661399] but this lock took another, RECLAIM_FS-unsafe lock in the past: [ 309.661402] (cpu_hotplug.lock){+.+.+.} [ 309.661404] and interrupts could create inverse lock ordering between them. [ 309.661410] other info that might help us debug this: [ 309.661414] Possible interrupt unsafe locking scenario: [ 309.661417] CPU0 CPU1 [ 309.661419] ---- ---- [ 309.661421] lock(cpu_hotplug.lock); [ 309.661425] local_irq_disable(); [ 309.661432] lock(rcu_preempt_state.barrier_mutex); [ 309.661441] lock(cpu_hotplug.lock); [ 309.661446] <Interrupt> [ 309.661448] lock(rcu_preempt_state.barrier_mutex); [ 309.661453] * DEADLOCK * [ 309.661460] 4 locks held by gem_exec_gttfil/6435: [ 309.661464] #0: (sb_writers#10){.+.+.+}, at: [<ffffffff8120d83d>] vfs_write+0x17d/0x1f0 [ 309.661475] #1: (debugfs_srcu){......}, at: [<ffffffff81320491>] debugfs_use_file_start+0x41/0xa0 [ 309.661486] #2: (&attr->mutex){+.+.+.}, at: [<ffffffff8123a3e7>] simple_attr_write+0x37/0xe0 [ 309.661495] #3: (&dev->struct_mutex){+.+.+.}, at: [<ffffffffa0091b4a>] i915_drop_caches_set+0x3a/0x150 [i915] [ 309.661540] the shortest dependencies between 2nd lock and 1st lock: [ 309.661547] -> (cpu_hotplug.lock){+.+.+.} ops: 829 { [ 309.661553] HARDIRQ-ON-W at: [ 309.661560] __lock_acquire+0x5e5/0x1b50 [ 309.661565] lock_acquire+0xc9/0x220 [ 309.661572] __mutex_lock+0x6e/0x990 [ 309.661576] mutex_lock_nested+0x16/0x20 [ 309.661583] get_online_cpus+0x61/0x80 [ 309.661590] kmem_cache_create+0x25/0x1d0 [ 309.661596] debug_objects_mem_init+0x30/0x249 [ 309.661602] start_kernel+0x341/0x3fe [ 309.661607] x86_64_start_reservations+0x2a/0x2c [ 309.661612] x86_64_start_kernel+0x173/0x186 [ 309.661619] verify_cpu+0x0/0xfc [ 309.661622] SOFTIRQ-ON-W at: [ 309.661627] __lock_acquire+0x611/0x1b50 [ 309.661632] lock_acquire+0xc9/0x220 [ 309.661636] __mutex_lock+0x6e/0x990 [ 309.661641] mutex_lock_nested+0x16/0x20 [ 309.661646] get_online_cpus+0x61/0x80 [ 309.661650] kmem_cache_create+0x25/0x1d0 [ 309.661655] debug_objects_mem_init+0x30/0x249 [ 309.661660] start_kernel+0x341/0x3fe [ 309.661664] x86_64_start_reservations+0x2a/0x2c [ 309.661669] x86_64_start_kernel+0x173/0x186 [ 309.661674] verify_cpu+0x0/0xfc [ 309.661677] RECLAIM_FS-ON-W at: [ 309.661682] mark_held_locks+0x6f/0xa0 [ 309.661687] lockdep_trace_alloc+0xb3/0x100 [ 309.661693] kmem_cache_alloc_trace+0x31/0x2e0 [ 309.661699] __smpboot_create_thread.part.1+0x27/0xe0 [ 309.661704] smpboot_create_threads+0x61/0x90 [ 309.661709] cpuhp_invoke_callback+0x9c/0x8a0 [ 309.661713] cpuhp_up_callbacks+0x31/0xb0 [ 309.661718] _cpu_up+0x7a/0xc0 [ 309.661723] do_cpu_up+0x5f/0x80 [ 309.661727] cpu_up+0xe/0x10 [ 309.661734] smp_init+0x71/0xb3 [ 309.661738] kernel_init_freeable+0x94/0x19e [ 309.661743] kernel_init+0x9/0xf0 [ 309.661748] ret_from_fork+0x2e/0x40 [ 309.661752] INITIAL USE at: [ 309.661757] __lock_acquire+0x234/0x1b50 [ 309.661761] lock_acquire+0xc9/0x220 [ 309.661766] __mutex_lock+0x6e/0x990 [ 309.661771] mutex_lock_nested+0x16/0x20 [ 309.661775] get_online_cpus+0x61/0x80 [ 309.661780] __cpuhp_setup_state+0x44/0x170 [ 309.661785] page_alloc_init+0x23/0x3a [ 309.661790] start_kernel+0x124/0x3fe [ 309.661794] x86_64_start_reservations+0x2a/0x2c [ 309.661799] x86_64_start_kernel+0x173/0x186 [ 309.661804] verify_cpu+0x0/0xfc [ 309.661807] } [ 309.661813] ... key at: [<ffffffff81e37690>] cpu_hotplug+0xb0/0x100 [ 309.661817] ... acquired at: [ 309.661821] lock_acquire+0xc9/0x220 [ 309.661825] __mutex_lock+0x6e/0x990 [ 309.661829] mutex_lock_nested+0x16/0x20 [ 309.661833] get_online_cpus+0x61/0x80 [ 309.661837] _rcu_barrier+0x9f/0x160 [ 309.661841] rcu_barrier+0x10/0x20 [ 309.661847] netdev_run_todo+0x5f/0x310 [ 309.661852] rtnl_unlock+0x9/0x10 [ 309.661856] default_device_exit_batch+0x133/0x150 [ 309.661862] ops_exit_list.isra.0+0x4d/0x60 [ 309.661866] cleanup_net+0x1d8/0x2c0 [ 309.661872] process_one_work+0x1f4/0x6d0 [ 309.661876] worker_thread+0x49/0x4a0 [ 309.661881] kthread+0x107/0x140 [ 309.661884] ret_from_fork+0x2e/0x40 [ 309.661890] -> (rcu_preempt_state.barrier_mutex){+.+.-.} ops: 179 { [ 309.661896] HARDIRQ-ON-W at: [ 309.661901] __lock_acquire+0x5e5/0x1b50 [ 309.661905] lock_acquire+0xc9/0x220 [ 309.661910] __mutex_lock+0x6e/0x990 [ 309.661914] mutex_lock_nested+0x16/0x20 [ 309.661919] _rcu_barrier+0x31/0x160 [ 309.661923] rcu_barrier+0x10/0x20 [ 309.661928] netdev_run_todo+0x5f/0x310 [ 309.661932] rtnl_unlock+0x9/0x10 [ 309.661936] default_device_exit_batch+0x133/0x150 [ 309.661941] ops_exit_list.isra.0+0x4d/0x60 [ 309.661946] cleanup_net+0x1d8/0x2c0 [ 309.661951] process_one_work+0x1f4/0x6d0 [ 309.661955] worker_thread+0x49/0x4a0 [ 309.661960] kthread+0x107/0x140 [ 309.661964] ret_from_fork+0x2e/0x40 [ 309.661968] SOFTIRQ-ON-W at: [ 309.661972] __lock_acquire+0x611/0x1b50 [ 309.661977] lock_acquire+0xc9/0x220 [ 309.661981] __mutex_lock+0x6e/0x990 [ 309.661986] mutex_lock_nested+0x16/0x20 [ 309.661990] _rcu_barrier+0x31/0x160 [ 309.661995] rcu_barrier+0x10/0x20 [ 309.661999] netdev_run_todo+0x5f/0x310 [ 309.662003] rtnl_unlock+0x9/0x10 [ 309.662008] default_device_exit_batch+0x133/0x150 [ 309.662013] ops_exit_list.isra.0+0x4d/0x60 [ 309.662017] cleanup_net+0x1d8/0x2c0 [ 309.662022] process_one_work+0x1f4/0x6d0 [ 309.662027] worker_thread+0x49/0x4a0 [ 309.662031] kthread+0x107/0x140 [ 309.662035] ret_from_fork+0x2e/0x40 [ 309.662039] IN-RECLAIM_FS-W at: [ 309.662043] __lock_acquire+0x638/0x1b50 [ 309.662048] lock_acquire+0xc9/0x220 [ 309.662053] __mutex_lock+0x6e/0x990 [ 309.662058] mutex_lock_nested+0x16/0x20 [ 309.662062] _rcu_barrier+0x31/0x160 [ 309.662067] rcu_barrier+0x10/0x20 [ 309.662089] i915_gem_shrink_all+0x33/0x40 [i915] [ 309.662109] i915_drop_caches_set+0x141/0x150 [i915] [ 309.662114] simple_attr_write+0xc7/0xe0 [ 309.662119] full_proxy_write+0x4f/0x70 [ 309.662124] __vfs_write+0x23/0x120 [ 309.662128] vfs_write+0xc6/0x1f0 [ 309.662133] SyS_write+0x44/0xb0 [ 309.662138] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 309.662142] INITIAL USE at: [ 309.662147] __lock_acquire+0x234/0x1b50 [ 309.662151] lock_acquire+0xc9/0x220 [ 309.662156] __mutex_lock+0x6e/0x990 [ 309.662160] mutex_lock_nested+0x16/0x20 [ 309.662165] _rcu_barrier+0x31/0x160 [ 309.662169] rcu_barrier+0x10/0x20 [ 309.662174] netdev_run_todo+0x5f/0x310 [ 309.662178] rtnl_unlock+0x9/0x10 [ 309.662183] default_device_exit_batch+0x133/0x150 [ 309.662188] ops_exit_list.isra.0+0x4d/0x60 [ 309.662192] cleanup_net+0x1d8/0x2c0 [ 309.662197] process_one_work+0x1f4/0x6d0 [ 309.662202] worker_thread+0x49/0x4a0 [ 309.662206] kthread+0x107/0x140 [ 309.662210] ret_from_fork+0x2e/0x40 [ 309.662214] } [ 309.662220] ... key at: [<ffffffff81e4e1c8>] rcu_preempt_state+0x508/0x780 [ 309.662225] ... acquired at: [ 309.662229] check_usage_forwards+0x12b/0x130 [ 309.662233] mark_lock+0x360/0x6f0 [ 309.662237] __lock_acquire+0x638/0x1b50 [ 309.662241] lock_acquire+0xc9/0x220 [ 309.662245] __mutex_lock+0x6e/0x990 [ 309.662249] mutex_lock_nested+0x16/0x20 [ 309.662253] _rcu_barrier+0x31/0x160 [ 309.662257] rcu_barrier+0x10/0x20 [ 309.662279] i915_gem_shrink_all+0x33/0x40 [i915] [ 309.662298] i915_drop_caches_set+0x141/0x150 [i915] [ 309.662303] simple_attr_write+0xc7/0xe0 [ 309.662307] full_proxy_write+0x4f/0x70 [ 309.662311] __vfs_write+0x23/0x120 [ 309.662315] vfs_write+0xc6/0x1f0 [ 309.662319] SyS_write+0x44/0xb0 [ 309.662323] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 309.662329] stack backtrace: [ 309.662335] CPU: 1 PID: 6435 Comm: gem_exec_gttfil Tainted: G W 4.11.0-rc1-CI-CI_DRM_2333+ #1 [ 309.662342] Hardware name: Hewlett-Packard HP Compaq 8100 Elite SFF PC/304Ah, BIOS 786H1 v01.13 07/14/2011 [ 309.662348] Call Trace: [ 309.662354] dump_stack+0x67/0x92 [ 309.662359] print_irq_inversion_bug.part.19+0x1a4/0x1b0 [ 309.662365] check_usage_forwards+0x12b/0x130 [ 309.662369] mark_lock+0x360/0x6f0 [ 309.662374] ? print_shortest_lock_dependencies+0x1a0/0x1a0 [ 309.662379] __lock_acquire+0x638/0x1b50 [ 309.662383] ? __mutex_unlock_slowpath+0x3e/0x2e0 [ 309.662388] ? trace_hardirqs_on+0xd/0x10 [ 309.662392] ? _rcu_barrier+0x31/0x160 [ 309.662396] lock_acquire+0xc9/0x220 [ 309.662400] ? _rcu_barrier+0x31/0x160 [ 309.662404] ? _rcu_barrier+0x31/0x160 [ 309.662409] __mutex_lock+0x6e/0x990 [ 309.662412] ? _rcu_barrier+0x31/0x160 [ 309.662416] ? _rcu_barrier+0x31/0x160 [ 309.662421] ? synchronize_rcu_expedited+0x35/0xb0 [ 309.662426] ? _raw_spin_unlock_irqrestore+0x52/0x60 [ 309.662434] mutex_lock_nested+0x16/0x20 [ 309.662438] _rcu_barrier+0x31/0x160 [ 309.662442] rcu_barrier+0x10/0x20 [ 309.662464] i915_gem_shrink_all+0x33/0x40 [i915] [ 309.662484] i915_drop_caches_set+0x141/0x150 [i915] [ 309.662489] simple_attr_write+0xc7/0xe0 [ 309.662494] full_proxy_write+0x4f/0x70 [ 309.662498] __vfs_write+0x23/0x120 [ 309.662503] ? rcu_read_lock_sched_held+0x75/0x80 [ 309.662507] ? rcu_sync_lockdep_assert+0x2a/0x50 [ 309.662512] ? __sb_start_write+0x102/0x210 [ 309.662516] ? vfs_write+0x17d/0x1f0 [ 309.662520] vfs_write+0xc6/0x1f0 [ 309.662524] ? trace_hardirqs_on_caller+0xe7/0x200 [ 309.662529] SyS_write+0x44/0xb0 [ 309.662533] entry_SYSCALL_64_fastpath+0x1c/0xb1 [ 309.662537] RIP: 0033:0x7f507eac24a0 [ 309.662541] RSP: 002b:00007fffda8720e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 309.662548] RAX: ffffffffffffffda RBX: ffffffff81482bd3 RCX: 00007f507eac24a0 [ 309.662552] RDX: 0000000000000005 RSI: 00007fffda8720f0 RDI: 0000000000000005 [ 309.662557] RBP: ffffc9000048bf88 R08: 0000000000000000 R09: 000000000000002c [ 309.662561] R10: 0000000000000014 R11: 0000000000000246 R12: 00007fffda872230 [ 309.662566] R13: 00007fffda872228 R14: 0000000000000201 R15: 00007fffda8720f0 [ 309.662572] ? __this_cpu_preempt_check+0x13/0x20 Fixes: `0eafec6d32` ("drm/i915: Enable lockless lookup of request tracking via RCU") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100192 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: <stable@vger.kernel.org> # v4.9+ Link: http://patchwork.freedesktop.org/patch/msgid/20170314115019.18127-1-chris@chris-wilson.co.uk Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2017-03-14 14:19:51 +00:00
Ander Conselvan de Oliveira	3465dbddf0	drm/i915/glk: Improve rounding caused by pre-CSC gamma tables The 33rd entry in the pre-CSC gamma table in Geminilake can represent a value of 1.0 as 17 bits fixed point with one integer bit. However, the table was generated such that the value of 1.0 would be 0.ffff with all the intervals scaled accordingly. For instance, 0.5 mapped to 0.7fff instead of 0.8000. For a reason that is not clear to the author, the rounding seems to be different when a cursor plane is used, leading to some seemingly random failures of the kms_cursor_crc igt tests. The differences weren't perceptible at 8bpc with images captured by a Chamelium device, but did cause CRC mismatches. Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170310101835.29845-1-ander.conselvan.de.oliveira@intel.com	2017-03-14 16:07:00 +02:00
Daniel Vetter	7d2ec88149	drm/i915: Merge pre/postclose hooks There's really not a reason afaics that we can't just clean up everything at the end, in the terminal postclose hook: Since this is closing a file descriptor we know no one else can have a reference or a thread doing something with that drm_file except the close code. Ordering shouldn't matter, as long as we don't kfree before we clean stuff up. In the past this was more relevant when drivers still had to track and clean up pending drm events, but that's all done by the core now. Reviewed-by: Sean Paul <seanpaul@chromium.org> Reviewed-by: Liviu Dudau <Liviu.Dudau@arm.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170308141257.12119-13-daniel.vetter@ffwll.ch	2017-03-14 14:38:31 +01:00
Ville Syrjälä	ca0241a55f	Revert "drm/i915: Ignore panel type from OpRegion on SKL" This reverts commit `bb10d4ec3b`. Since commit `c8ebfad7a0` ("drm/i915: Ignore OpRegion panel type except on select machines") we ignore the OpRegion panel type except for specific machines (handled via a DMI match), so having SKL explicitly excluded from using the OpRegion panel type is redundant. So let's remove the SKL check. Cc: Jani Nikula <jani.nikula@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170308143334.21216-1-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com>	2017-03-14 12:21:51 +02:00
Daniel Vetter	05df49e73b	drm/i915: annote drop_caches debugfs interface with lockdep The trouble we have is that we can't really test all the shrinker recursion stuff exhaustively in BAT because any kind of thrashing stress test just takes too long. But that leaves a really big gap open, since shrinker recursions are one of the most annoying bugs. Now lockdep already has support for checking allocation deadlocks: - Direct reclaim paths are marked up with lockdep_set_current_reclaim_state() and lockdep_clear_current_reclaim_state(). - Any allocation paths are marked with lockdep_trace_alloc(). If we simply mark up our debugfs with the reclaim annotations, any code and locks taken in there will automatically complete the picture with any allocation paths we already have, as long as we have a simple testcase in BAT which throws out a few objects using this interface. Not stress test or thrashing needed at all. v2: Need to EXPORT_SYMBOL_GPL to make it compile as a module. v3: Fixup rebase fail (spotted by Chris). Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-kernel@vger.kernel.org Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: http://patchwork.freedesktop.org/patch/msgid/20170312205340.16202-1-daniel.vetter@ffwll.ch Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2017-03-14 10:10:14 +01:00

1 2 3 4 5 ...

34326 Commits