linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-12-23 19:14:30 +08:00

Author	SHA1	Message	Date
Tomas Elf	fc0768ceac	drm/i915/tdr: Initialize hangcheck struct for each engine Initialize hangcheck struct during driver load. Since we do the same after recovering from a reset, this is extracted into a helper function. v2: remove redundant hangcheck init during load as this is done when engines are initialized (Chris) Cc: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Tomas Elf <tomas.elf@intel.com> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1458577619-12006-1-git-send-email-arun.siluvery@linux.intel.com	2016-03-22 13:52:42 +02:00
Tvrtko Ursulin	26720ab97f	drm/i915: Move CSB MMIO reads out of the execlists lock By reading the CSB (slow MMIO accesses) into a temporary local buffer we can decrease the duration of holding the execlist lock. Main advantage is that during heavy batch buffer submission we reduce the execlist lock contention, which should decrease the latency and CPU usage between the submitting userspace process and interrupt handling. Downside is that we need to grab and relase the forcewake twice, but as the below numbers will show this is completely hidden by the primary gains. Testing with "gem_latency -n 100" (submit batch buffers with a hundred nops each) shows more than doubling of the throughput and more than halving of the dispatch latency, overall latency and CPU time spend in the submitting process. Submitting empty batches ("gem_latency -n 0") does not seem significantly affected by this change with throughput and CPU time improving by half a percent, and overall latency worsening by the same amount. Above tests were done in a hundred runs on a big core Broadwell. v2: * Overflow protection to local CSB buffer. * Use closer dev_priv in execlists_submit_requests. (Chris Wilson) v3: Rebase. v4: Added commend about irq needed to be disabled in execlists_submit_request. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilsno <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1458219586-20452-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-03-18 10:25:56 +00:00
Tvrtko Ursulin	39dabecd99	drm/i915: Use shorter route to dev_private where possible Where we have a request we can use req->i915 directly instead of going through the engine and device. Coccinelle script: @@ function f; identifier r; @@ f(..., struct drm_i915_gem_request r, ...) { ... - engine->dev->dev_private + r->i915 ... } @@ struct drm_i915_gem_request req; @@ ( req-> - engine->dev->dev_private + i915 ) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1458219850-21007-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-03-18 09:50:37 +00:00
Tvrtko Ursulin	117897f42c	drm/i915: More renaming of rings to engines This time using only sed and a few by hand. v2: Rename also intel_ring_id and intel_ring_initialized. v3: Fixed typo in intel_ring_initialized. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1458126040-33105-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-03-16 15:33:30 +00:00
Tvrtko Ursulin	666796da7a	drm/i915: More intel_engine_cs renaming Some trivial ones, first pass done with Coccinelle: @@ @@ ( - I915_NUM_RINGS + I915_NUM_ENGINES \| - intel_ring_flag + intel_engine_flag \| - for_each_ring + for_each_engine \| - i915_gem_request_get_ring + i915_gem_request_get_engine \| - intel_ring_idle + intel_engine_idle \| - i915_gem_reset_ring_status + i915_gem_reset_engine_status \| - i915_gem_reset_ring_cleanup + i915_gem_reset_engine_cleanup \| - init_ring_lists + init_engine_lists ) But that didn't fully work so I cleaned it up with: for f in .[hc]; do sed -i -e s/I915_NUM_RINGS/I915_NUM_ENGINES/ $f; done for f in .[hc]; do sed -i -e s/i915_gem_request_get_ring/i915_gem_request_get_engine/ $f; done for f in .[hc]; do sed -i -e s/intel_ring_flag/intel_engine_flag/ $f; done for f in .[hc]; do sed -i -e s/intel_ring_idle/intel_engine_idle/ $f; done for f in .[hc]; do sed -i -e s/init_ring_lists/init_engine_lists/ $f; done for f in .[hc]; do sed -i -e s/i915_gem_reset_ring_cleanup/i915_gem_reset_engine_cleanup/ $f; done for f in *.[hc]; do sed -i -e s/i915_gem_reset_ring_status/i915_gem_reset_engine_status/ $f; done v2: Rebase. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-03-16 15:33:24 +00:00
Tvrtko Ursulin	4a570db57c	drm/i915: Rename intel_engine_cs struct members below and a couple manual fixups. @@ identifier I, J; @@ struct I { ... - struct intel_engine_cs J; + struct intel_engine_cs engine; ... } @@ identifier I, J; @@ struct I { ... - struct intel_engine_cs J; + struct intel_engine_cs engine; ... } @@ struct drm_i915_private d; @@ ( - d->ring + d->engine ) @@ struct i915_execbuffer_params p; @@ ( - p->ring + p->engine ) @@ struct intel_ringbuffer r; @@ ( - r->ring + r->engine ) @@ struct drm_i915_gem_request req; @@ ( - req->ring + req->engine ) v2: Script missed the tracepoint code - fixed up by hand. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-03-16 15:33:17 +00:00
Tvrtko Ursulin	0bc40be85f	drm/i915: Rename intel_engine_cs function parameters @@ identifier func; @@ func(..., struct intel_engine_cs * - ring + engine , ...) { <... - ring + engine ...> } @@ identifier func; type T; @@ T func(..., struct intel_engine_cs * - ring + engine , ...); Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-03-16 15:33:10 +00:00
Tvrtko Ursulin	e2f8039147	drm/i915: Rename local struct intel_engine_cs variables Done by the Coccinelle script below plus a manual intervention to GEN8_RING_SEMAPHORE_INIT. @@ expression E; @@ - struct intel_engine_cs ring = E; + struct intel_engine_cs engine = E; <+... - ring + engine ...+> @@ @@ - struct intel_engine_cs ring; + struct intel_engine_cs engine; <+... - ring + engine ...+> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-03-16 15:33:00 +00:00
Tvrtko Ursulin	8de1b23efa	drm/i915/lrc: Do not wait atomically when stopping engines I do not see that this needs to be done atomically and up to one second is quite a long time to busy loop. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-03-03 17:26:51 +00:00
Tvrtko Ursulin	c6a2ac712d	drm/i915: Execlists small cleanups and micro-optimisations Assorted changes in the areas of code cleanup, reduction of invariant conditional in the interrupt handler and lock contention and MMIO access optimisation. * Remove needless initialization. * Improve cache locality by reorganizing code and/or using branch hints to keep unexpected or error conditions out of line. * Favor busy submit path vs. empty queue. * Less branching in hot-paths. v2: * Avoid mmio reads when possible. (Chris Wilson) * Use natural integer size for csb indices. * Remove useless return value from execlists_update_context. * Extract 32-bit ppgtt PDPs update so it is out of line and shared with two callers. * Grab forcewake across all mmio operations to ease the load on uncore lock and use chepear mmio ops. v3: * Removed some more pointless u8 data types. * Removed unused return from execlists_context_queue. * Commit message updates. v4: * Unclumsify the unqueue if statement. (Chris Wilson) * Hide forcewake from the queuing function. (Chris Wilson) Version 3 now makes the irq handling code path ~20% smaller on 48-bit PPGTT hardware, and a little bit less elsewhere. Hot paths are mostly in-line now and hammering on the uncore spinlock is greatly reduced together with mmio traffic to an extent. Benchmarking with "gem_latency -n 100" (keep submitting batches with 100 nop instruction) shows approximately 4% higher throughput, 2% less CPU time and 22% smaller latencies. This was on a big-core while small-cores could benefit even more. Most likely reason for the improvements are the MMIO optimization and uncore lock traffic reduction. One odd result is with "gem_latency -n 0" (dispatching empty batches) which shows 5% more throughput, 8% less CPU time, 25% better producer and consumer latencies, but 15% higher dispatch latency which is yet unexplained. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1456505912-22286-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-03-01 10:36:02 +00:00
Chris Wilson	2743179d95	drm/i915: Execlists cannot pin a context without the object Given that the intel_lr_context_pin cannot succeed without the object, we cannot reach intel_lr_context_unpin() without first allocating that object - so we can remove the redundant test. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1456485751-15213-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-02-26 13:16:36 +00:00
Michel Thierry	99cf8ea165	drm/i915/lrc: Only set RS ctx enable in ctx control reg if there is a RS The driver should only set the "RS context enable" bit in the context image if we plan to use the resource streamer. Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1456393738-35608-1-git-send-email-michel.thierry@intel.com	2016-02-26 11:30:28 +00:00
Michel Thierry	715629190e	drm/i915/gen9: Set value of Indirect Context Offset based on gen version The cache line offset for the Indirect CS context (0x21C8) varies from gen to gen. v2: Move it into a function (Arun), use MISSING_CASE (Chris) v3: Rebased (catched by ci bat) Cc: Arun Siluvery <arun.siluvery@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Arun Siluvery <arun.siluvery@linux.intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1456223509-6454-1-git-send-email-michel.thierry@intel.com	2016-02-26 11:30:17 +00:00
Tvrtko Ursulin	f4e2deceb6	drm/i915: Fix premature LRC unpin in GuC mode In GuC mode LRC pinning lifetime depends exclusively on the request liftime. Since that is terminated by the seqno update that opens up a race condition between GPU finishing writing out the context image and the driver unpinning the LRC. To extend the LRC lifetime we will employ a similar approach to what legacy ringbuffer submission does. We will start tracking the last submitted context per engine and keep it pinned until it is replaced by another one. Note that the driver unload path is a bit fragile and could benefit greatly from efforts to unify the legacy and exec list submission code paths. At the moment i915_gem_context_fini has special casing for the two which are potentialy not needed, and also depends on i915_gem_cleanup_ringbuffer running before itself. v2: * Move pinning into engine->emit_request and actually fix the reference/unreference logic. (Chris Wilson) * ring->dev can be NULL on driver unload so use a different route towards it. v3: * Rebase. * Handle the reset path. (Chris Wilson) * Exclude default context from the pinning - it is impossible to get it right before default context special casing in general is eliminated. v4: * Rebased & moved context tracking to intel_logical_ring_advance_and_submit. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Issue: VIZ-4277 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Nick Hoath <nicholas.hoath@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1453976997-25424-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-01-28 17:23:15 +00:00
Tvrtko Ursulin	321fe304f1	drm/i915: Make LRC pinning own a reference to the context Will simplify the following fix and sounds logical. v2: Add some whitespace to separate logic better. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Nick Hoath <nicholas.hoath@intel.com>	2016-01-28 17:23:15 +00:00
Tvrtko Ursulin	e5292823c1	drm/i915: Make LRC (un)pinning work on context and engine Previously intel_lr_context_(un)pin were operating on requests which is in conflict with their names. If we make them take a context and an engine, it makes the names make more sense and it also makes future fixes possible. v2: Rebase for default_context/kernel_context change. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Nick Hoath <nicholas.hoath@intel.com>	2016-01-28 17:23:15 +00:00
Alex Dai	397097b026	drm/i915/guc: Decouple GuC engine id from ring id Previously GuC uses ring id as engine id because of same definition. But this is not true since this commit: commit `de1add3605` Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Date: Fri Jan 15 15:12:50 2016 +0000 drm/i915: Decouple execbuf uAPI from internal implementation Added GuC engine id into GuC interface to decouple it from ring id used by driver. v2: Keep ring name print out in debugfs; using for_each_ring() where possible to keep driver consistent. (Chris W.) Signed-off-by: Alex Dai <yu.dai@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1453579094-29860-1-git-send-email-yu.dai@intel.com	2016-01-25 10:56:30 +00:00
Tvrtko Ursulin	77b04a0428	drm/i915: More use of the cached LRC state Since: commit `82352e908a` Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Date: Fri Jan 15 17:12:45 2016 +0000 drm/i915: Cache LRC state page in the context and: commit `0eb973d31d` Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Date: Fri Jan 15 15:10:28 2016 +0000 drm/i915: Cache ringbuffer GTT VMA We can also remove the ring buffer start updates on every context update since the address will not change for the duration of the LRC pin. For GuC we can remove the update altogether because it only cares about the ring buffer start. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Alex Dai <yu.dai@intel.com> Cc: Dave Gordon <david.s.gordon@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1453466567-33369-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-01-25 10:55:19 +00:00
Chris Wilson	426960bed3	drm/i915: Seal busy-ioctl uABI and prevent leaking of internal ids Tvrtko was looking through the execbuffer-ioctl and noticed that the uABI was tightly coupled to our internal engine identifiers. Close inspection also revealed that we leak those internal engine identifiers through the busy-ioctl, and those internal identifiers already do not match the user identifiers. Fortuitiously, there is only one user of the set of busy rings from the busy-ioctl, and they only wish to choose between the RENDER and the BLT engines. Let's fix the userspace ABI while we still can. v2: Update the uAPI documentation to explain the identifiers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Testcase: igt/gem_busy Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1452876706-21620-1-git-send-email-chris@chris-wilson.co.uk	2016-01-21 11:00:35 +00:00
Chris Wilson	7c17d37737	drm/i915: Use ordered seqno write interrupt generation on gen8+ execlists Broadwell and later currently use the same unordered command sequence to update the seqno in the HWS status page and then assert the user interrupt. We should apply the w/a from legacy (where we do an mmio read to delay the seqno read after the interrupt), but this is not enough to enforce coherent seqno visibilty on Skylake. Rather than search for the proper post-interrupt seqno barrier, use a strongly ordered command sequence to write the seqno, then assert the user interrupt from the ring. v2: Move around the wa tail dwords to avoid adding duplicate code. v3: Add references, comments on workarounds and bit5 check. References: https://bugs.freedesktop.org/show_bug.cgi?id=93693 Testcase: igt/gem_ring_sync_loop #skl Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1453297415-17793-1-git-send-email-mika.kuoppala@intel.com	2016-01-21 11:53:09 +02:00
Dave Gordon	e28e404c3e	drm/i915: tidy up a few leftovers There are a few bits of code which the transformations implemented by the previous patch reveal to be suboptimal, once the notion of a per- ring default context has gone away. So this tidies up the leftovers. It could have been squashed into the previous patch, but that would have made that patch less clearly a simple transformation. In particular, any change which alters the code block structure or indentation has been deferred into this separate patch, because such things tend to make diffs more difficult to read. v4: Rebased Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Nick Hoath <nicholas.hoath@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1453230175-19330-4-git-send-email-david.s.gordon@intel.com Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2016-01-21 09:21:29 +01:00
Dave Gordon	ed54c1a1d1	drm/i915: abolish separate per-ring default_context pointers Now that we've eliminated a lot of uses of ring->default_context, we can eliminate the pointer itself. All the engines share the same default intel_context, so we can just keep a single reference to it in the dev_priv structure rather than one in each of the engine[] elements. This make refcounting more sensible too, as we now have a refcount of one for the one pointer, rather than a refcount of one but multiple pointers. From an idea by Chris Wilson. v2: transform an extra instance of ring->default_context introduced by `42f1cae8c` drm/i915: Restore inhibiting the load of the default context That patch's commentary includes: v2: Mark the global default context as uninitialized on GPU reset so that the context-local workarounds are reloaded upon re-enabling The code implementing that now also benefits from the replacement of the multiple (per-ring) pointers to the default context with a single pointer to the unique kernel context. v4: Rebased, remove underused local (Nick Hoath) Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Nick Hoath <nicholas.hoath@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1453230175-19330-3-git-send-email-david.s.gordon@intel.com Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2016-01-21 09:21:29 +01:00
Dave Gordon	2682708839	drm/i915: simplify allocation of driver-internal requests There are a number of places where the driver needs a request, but isn't working on behalf of any specific user or in a specific context. At present, we associate them with the per-engine default context. A future patch will abolish those per-engine context pointers; but we can already eliminate a lot of the references to them, just by making the allocator allow NULL as a shorthand for "an appropriate context for this ring", which will mean that the callers don't need to know anything about how the "appropriate context" is found (e.g. per-ring vs per-device, etc). So this patch renames the existing i915_gem_request_alloc(), and makes it local (static inline), and replaces it with a wrapper that provides a default if the context is NULL, and also has a nicer calling convention (doesn't require a pointer to an output parameter). Then we change all callers to use the new convention: OLD: err = i915_gem_request_alloc(ring, user_ctx, &req); if (err) ... NEW: req = i915_gem_request_alloc(ring, user_ctx); if (IS_ERR(req)) ... OLD: err = i915_gem_request_alloc(ring, ring->default_context, &req); if (err) ... NEW: req = i915_gem_request_alloc(ring, NULL); if (IS_ERR(req)) ... v4: Rebased Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Nick Hoath <nicholas.hoath@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1453230175-19330-2-git-send-email-david.s.gordon@intel.com Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2016-01-21 09:21:29 +01:00
Tvrtko Ursulin	82352e908a	drm/i915: Cache LRC state page in the context LRC lifetime is well defined so we can cache the page pointing to the object backing store in the context in order to avoid walking over the object SG page list from the interrupt context without the big lock held. v2: Also cache the mapping. (Chris Wilson) v3: Unmap on the error path. v4: No need to cache the page. (Chris Wilson) v5: No need to dirty the page on unpin. (Chris Wilson) v6: kmap() cannot fail and use kmap_to_page to simplify unpin. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1452877965-32042-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-01-18 09:58:44 +00:00
Tvrtko Ursulin	0eb973d31d	drm/i915: Cache ringbuffer GTT VMA Purpose is to avoid calling i915_gem_obj_ggtt_offset from the interrupt context without the big lock held. v2: Renamed gtt_start to gtt_offset. (Daniel Vetter) v3: Cache the VMA instead of address. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1452870629-13830-2-git-send-email-tvrtko.ursulin@linux.intel.com	2016-01-18 09:58:40 +00:00
Tvrtko Ursulin	ca82580c9c	drm/i915: Do not call API requiring struct_mutex where it is not available LRC code was calling GEM API like i915_gem_obj_ggtt_offset from places where the struct_mutex cannot be grabbed (irq handlers). To avoid that this patch caches some interesting bits and values in the engine and context structures. Some usages are also removed where they are not needed like a few asserts which are either impossible or have been checked already during engine initialization. Side benefit is also that interrupt handlers and command submission stop evaluating invariant conditionals, like what Gen we are running on, on every interrupt and every command submitted. This patch deals with logical ring context id and descriptors while subsequent patches will deal with the remaining issues. v2: * Cache the VMA instead of the address. (Chris Wilson) * Incorporate Dave Gordon's good comments and function name. v3: * Extract ctx descriptor template to a function and group functions dealing with ctx descriptor & co together near top of the file. (Dave Gordon) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dave Gordon <david.s.gordon@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1452870629-13830-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-01-18 09:58:36 +00:00
Francisco Jerez	965fd602a6	drm/i915: Make sure DC writes are coherent on flush. We need to set the DC FLUSH PIPE_CONTROL bit on Gen7+ to guarantee that writes performed via the HDC are visible in memory. Fixes an intermittent failure in a Piglit test that writes to a BO from a shader using GL atomic counters (implemented as HDC untyped atomics) and then expects the memory to read back the same value after mapping it on the CPU. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91298 Tested-by: Mark Janes <mark.a.janes@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1452740379-3194-1-git-send-email-currojerez@riseup.net Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2016-01-15 11:33:52 +02:00
Tvrtko Ursulin	ec8a9776cc	drm/i915: Fix bsd2 ring name Chris Wilson noticed the "bds2" typo. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1452619956-27014-3-git-send-email-tvrtko.ursulin@linux.intel.com	2016-01-13 10:02:52 +00:00
Tvrtko Ursulin	d9f3af96c2	drm/i915: Compact logical ring interrupt initialization Identically to vfuncs interrupt mask initialization can also be compacted for more readable code. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1452619956-27014-2-git-send-email-tvrtko.ursulin@linux.intel.com	2016-01-13 10:02:36 +00:00
Tvrtko Ursulin	c9cacf9349	drm/i915: Extract vfunc setup from logical ring initializers Majority of them was duplicated code and only render ring currently overrides some of them. We can save some lines of code and also take away the confusion on why bsd2 did not do the seqno coherency workaround. (VCS2 ring does not exist on platforms where workaround is needed but that was not documented in the code.) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1452619956-27014-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-01-13 10:02:05 +00:00
Tvrtko Ursulin	7eb08a25a4	drm/i915/bdw+: Replace list_del+list_add_tail with list_move_tail Same effect for slightly less source code and resulting binary. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1452521321-4032-2-git-send-email-tvrtko.ursulin@linux.intel.com	2016-01-12 10:30:40 +00:00
Boyer, Wayne	cd7feaaad6	drm/i915: Don't warn if the workaround list is empty part 2. Extend the same reasoning as in the patch listed below. It's not an error for the workaround list to be empty if no workarounds are needed. commit `02235808b6` Author: Francisco Jerez <currojerez@riseup.net> Date: Wed Oct 7 14:44:01 2015 +0300 drm/i915: Don't warn if the workaround list is empty. Signed-off-by: Wayne Boyer <wayne.boyer@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1452129330-3484-2-git-send-email-rodrigo.vivi@intel.com	2016-01-08 10:14:30 -08:00
Ben Widawsky	91a4103206	drm/i915: Extract CSB status read This is a useful thing to have around as a function because the mechanism may change in the future. There is a net increase in LOC here, and it will continue to be the case on GEN8 and GEN9 - but future GENs may have an alternate mechanism for doing this. Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1452018609-10142-4-git-send-email-benjamin.widawsky@intel.com Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2016-01-07 15:34:41 +01:00
Ben Widawsky	f764a8b146	drm/i915: Change WARN to ERROR in CSB count There is no point in emitting a WARN since the backtrace will always be the same. Errors have actually become easier to spot given the large number of WARNs which exist today in modesetting paths. Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1452018609-10142-3-git-send-email-benjamin.widawsky@intel.com Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2016-01-07 15:34:41 +01:00
Ben Widawsky	5590a5f0af	drm/i915: Cleanup some of the CSB handling I think this patch is a worthwhile cleanup even if it might look only marginally useful. It gets more useful in upcoming patches and for handling of future GEN platforms. The only non-mechanical part of this is the removal of the extra & operation on the ring->next_context_status_buffer. This is safe because right above this, we already did a modulus operation. Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1452018609-10142-2-git-send-email-benjamin.widawsky@intel.com Reviewed-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2016-01-07 15:34:41 +01:00
Dave Gordon	c5d46ee206	drm/i915: add kerneldoc for intel_lr_context_size() This function was recently renamed & exposed, so now it gets documented Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1451996493-16079-1-git-send-email-david.s.gordon@intel.com	2016-01-05 16:14:33 +01:00
Dave Gordon	95a66f7e71	drm/i915/guc: Expose (intel)_lr_context_size() The GuC code needs to know the size of a logical context, so we expose get_lr_context_size(), renaming it intel_lr_context__size() to fit the naming conventions for nonstatic functions. For: VIZ-2021 Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Alex Dai <yu.dai@intel.com> Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1450468812-4882-2-git-send-email-yu.dai@intel.com Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2016-01-05 11:33:34 +01:00
Alex Dai	a7e02199ae	drm/i915/guc: Move GuC wq_check_space to alloc_request_extras Split GuC work queue space checking from submission and move it to ring_alloc_request_extras. The reason is that failure in later i915_add_request() won't be handled. In the case timeout happens, driver can return early in order to handle the error. v1: Move wq_reserve_space to ring_reserve_space v2: Move wq_reserve_space to alloc_request_extras (Chris Wilson) v3: The work queue head pointer is cached by driver now. So we can quickly return if space is available. s/reserve/check/g (Dave Gordon) v4: Update cached wq head after ring doorbell; check wq space before ring doorbell in case unexpected error happens; call wq space check only when GuC submission is enabled. (Dave Gordon) Signed-off-by: Alex Dai <yu.dai@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1450295155-10050-1-git-send-email-yu.dai@intel.com Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2016-01-05 11:07:32 +01:00
Ben Widawsky	eba51190f3	drm/i915: Fix whitespace (trivial) Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1451427643-7266-1-git-send-email-benjamin.widawsky@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com>	2015-12-30 11:19:50 +02:00
Ben Widawsky	1a5a9ce70f	drm/i915: Limit VF cache invalidate workaround usage to gen9 It is unclear if this is even required on BXT. v2: Make sure to set the default value to false. Uncertain how my compiler doesn't complain with v1. Cc: Imre Deak <imre.deak@intel.com> Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1450374597-7021-1-git-send-email-benjamin.widawsky@intel.com Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-12-21 13:05:36 +01:00
Dave Gordon	033908aed5	drm/i915: mark GEM object pages dirty when mapped & written by the CPU In various places, a single page of a (regular) GEM object is mapped into CPU address space and updated. In each such case, either the page or the the object should be marked dirty, to ensure that the modifications are not discarded if the object is evicted under memory pressure. The typical sequence is: va = kmap_atomic(i915_gem_object_get_page(obj, pageno)); *(va+offset) = ... kunmap_atomic(va); Here we introduce i915_gem_object_get_dirty_page(), which performs the same operation as i915_gem_object_get_page() but with the side-effect of marking the returned page dirty in the pagecache. This will ensure that if the object is subsequently evicted (due to memory pressure), the changes are written to backing store rather than discarded. Note that it works only for regular (shmfs-backed) GEM objects, but (at least for now) those are the only ones that are updated in this way -- the objects in question are contexts and batchbuffers, which are always shmfs-backed. Separate patches deal with the cases where whole objects are (or may be) dirtied. v3: Mark two more pages dirty in the page-boundary-crossing cases of the execbuffer relocation code [Chris Wilson] Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1449773486-30822-2-git-send-email-david.s.gordon@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-12-11 18:11:53 +01:00
Dave Gordon	b0366a54b4	drm/i915: intel_ring_initialized() must be simple and inline Based on Chris Wilson's patch from 6 months ago, rebased and adapted. The current implementation of intel_ring_initialized() is too heavyweight; it's a non-inlined function that chases several levels of pointers. This wouldn't matter too much if it were rarely called, but it's used inside the iterator test of for_each_ring() and is therefore called quite frequently. So let's make it simple and inline ... The idea here is to use ring->dev as an indicator showing which engines have been initialised and are therefore to be included in iterations that use for_each_ring(). This allows us to avoid multiple memory references and a (non-inlined) function call on each iteration of each such loop. Fixes regression from commit `48d823878d` Author: Oscar Mateo <oscar.mateo@intel.com> Date: Thu Jul 24 17:04:23 2014 +0100 drm/i915/bdw: Generic logical ring init and cleanup Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1449586956-32360-2-git-send-email-david.s.gordon@intel.com	2015-12-10 14:14:36 +01:00
Daniel Vetter	af3302b907	Revert "drm/i915: Extend LRC pinning to cover GPU context writeback" This reverts commit `6d65ba943a`. Mika Kuoppala traced down a use-after-free crash in module unload to this commit, because ring->last_context is leaked beyond when the context gets destroyed. Mika submitted a quick fix to patch that up in the context destruction code, but that's too much of a hack. The right fix is instead for the ring to hold a full reference onto it's last context, like we do for legacy contexts. Since this is causing a regression in BAT it gets reverted before we can close this. Cc: Nick Hoath <nicholas.hoath@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: David Gordon <david.s.gordon@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Alex Dai <yu.dai@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93248 Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2015-12-04 17:34:40 +01:00
Nick Hoath	6d65ba943a	drm/i915: Extend LRC pinning to cover GPU context writeback Use the first retired request on a new context to unpin the old context. This ensures that the hw context remains bound until it has been written back to by the GPU. Now that the context is pinned until later in the request/context lifecycle, it no longer needs to be pinned from context_queue to retire_requests. This fixes an issue with GuC submission where the GPU might not have finished writing back the context before it is unpinned. This results in a GPU hang. v2: Moved the new pin to cover GuC submission (Alex Dai) Moved the new unpin to request_retire to fix coverage leak v3: Added switch to default context if freeing a still pinned context just in case the hw was actually still using it v4: Unwrapped context unpin to allow calling without a request v5: Only create a switch to idle context if the ring doesn't already have a request pending on it (Alex Dai) Rename unsaved to dirty to avoid double negatives (Dave Gordon) Changed _no_req postfix to __ prefix for consistency (Dave Gordon) Split out per engine cleanup from context_free as it was getting unwieldy Corrected locking (Dave Gordon) v6: Removed some bikeshedding (Mika Kuoppala) Added explanation of the GuC hang that this fixes (Daniel Vetter) v7: Removed extra per request pinning from ring reset code (Alex Dai) Added forced ring unpin/clean in error case in context free (Alex Dai) Signed-off-by: Nick Hoath <nicholas.hoath@intel.com> Issue: VIZ-4277 Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: David Gordon <david.s.gordon@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Alex Dai <yu.dai@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-12-03 15:11:55 +01:00
Daniel Vetter	92907cbbef	Linux 4.4-rc2 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJWUmHZAAoJEHm+PkMAQRiGHtcH/RVRsn8re0WdRWYaTr9+Hknm CGlRJN4LKecttgYQ/2bS1QsDbt8usDPBiiYVopqGXQxPBmjyDAqPjsa+8VzCaVc6 WA+9LDB+PcW28lD6BO+qSZCOAm7hHSZq7dtw9x658IqO+mI2mVeCybsAyunw2iWi Kf5q90wq6tIBXuT8YH9MXGrSCQw00NclbYeYwB9CmCt9hT/koEFBdl7uFUFitB+Q GSPTz5fXhgc5Lms85n7flZlrVKoQKmtDQe4/DvKZm+SjsATHU9ru89OxDBdS5gSG YcEIM4zc9tMjhs3GC9t6WXf6iFOdctum8HOhUoIN/+LVfeOMRRwAhRVqtGJ//Xw= =DCUg -----END PGP SIGNATURE----- Merge tag 'v4.4-rc2' into drm-intel-next-queued Linux 4.4-rc2 Backmerge to get at commit `1b0e3a049e` Author: Imre Deak <imre.deak@intel.com> Date: Thu Nov 5 23:04:11 2015 +0200 drm/i915/skl: disable display side power well support for now so that we can proplery re-eanble skl power wells in -next. Conflicts are just adjacent lines changed, except for intel_fbdev.c where we need to interleave the changs. Nothing nefarious. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>	2015-11-23 09:04:05 +01:00
Arun Siluvery	fb1a21114f	Revert "drm/i915: Initialize HWS page address after GPU reset" This reverts commit `2e5356da37`. It is now redundant as it is already covered in below commit which introduced the changes to reuse initialization of resources in resume/reset path. commit `e84fe80337` Author: Nick Hoath <nicholas.hoath@intel.com> Date: Fri Sep 11 12:53:46 2015 +0100 drm/i915: Split alloc from init for lrc lrc_setup_hardware_status_page() in the same function gen8_init_common_ring() takes care of this. Cc: Nick Hoath <nicholas.hoath@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1447951664-9347-1-git-send-email-arun.siluvery@linux.intel.com Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2015-11-19 17:51:16 +01:00
Ville Syrjälä	f0f59a00a1	drm/i915: Type safe register read/write Make I915_READ and I915_WRITE more type safe by wrapping the register offset in a struct. This should eliminate most of the fumbles we've had with misplaced parens. This only takes care of normal mmio registers. We could extend the idea to other register types and define each with its own struct. That way you wouldn't be able to accidentally pass the wrong thing to a specific register access function. The gpio_reg setup is probably the ugliest thing left. But I figure I'd just leave it for now, and wait for some divine inspiration to strike before making it nice. As for the generated code, it's actually a bit better sometimes. Eg. looking at i915_irq_handler(), we can see the following change: lea 0x70024(%rdx,%rax,1),%r9d mov $0x1,%edx - movslq %r9d,%r9 - mov %r9,%rsi - mov %r9,-0x58(%rbp) - callq 0xd8(%rbx) + mov %r9d,%esi + mov %r9d,-0x48(%rbp) callq 0xd8(%rbx) So previously gcc thought the register offset might be signed and decided to sign extend it, just in case. The rest appears to be mostly just minor shuffling of instructions. v2: i915_mmio_reg_{offset,equal,valid}() helpers added s/_REG/_MMIO/ in the register defines mo more switch statements left to worry about ring_emit stuff got sorted in a prep patch cmd parser, lrc context and w/a batch buildup also in prep patch vgpu stuff cleaned up and moved to a prep patch all other unrelated changes split out v3: Rebased due to BXT DSI/BLC, MOCS, etc. v4: Rebased due to churn, s/i915_mmio_reg_t/i915_reg_t/ Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1447853606-2751-1-git-send-email-ville.syrjala@linux.intel.com	2015-11-18 15:39:11 +02:00
Ville Syrjälä	0d925ea023	drm/i915: Wrap context LRI init in a macro We set up a load of LRIs in the logical ring context. Wrap that stuff in a macro to avoid typos with position of each reg/value pair in the context. This also makes it easier to make the register defines type safe. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1446672017-24497-24-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2015-11-18 14:35:39 +02:00
Ville Syrjälä	35dc3f97a6	drm/i915: Give names to more ring registers The logical render context population has a bunch of raw ring register offsets. Use the names we have for them, and in cases where we we don't, give them names. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1446672017-24497-23-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2015-11-18 14:35:36 +02:00
Ville Syrjälä	9244a81701	drm/i915: Wrap ASSIGN_CTX_{PDP,PM4L} in do {} while(0) Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1446672017-24497-22-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2015-11-18 14:35:32 +02:00

1 2 3 4 5 ...

260 Commits