./drivers/gpu/drm/i915/gvt/cmd_parser.h:53:44: error: ‘struct intel_vgpu’ declared inside parameter list will not be visible outside of this definition or declaration
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210121132558.2893-1-chris@chris-wilson.co.uk
Now that we are careful to always force-restore contexts upon rewinding
(where necessary), we can restore our optimisation to skip over
completed active execlists when dequeuing.
Referenecs: 35f3fd8182 ("drm/i915/execlists: Workaround switching back to a completed context")
References: 8ab3a3812a ("drm/i915/gt: Incrementally check for rewinding")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210120121718.26435-2-chris@chris-wilson.co.uk
Simplify the frontbuffer unpin by removing the lock requirement. The LRU
bumping was primarily to protect the GTT from being evicted and from
frontbuffers being eagerly shrunk. Now we protect frontbuffers from the
shrinker, and we avoid accidentally evicting from the GTT, so the
benefit from bumping LRU is no more, and we can save more time by not.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119214336.1463-6-chris@chris-wilson.co.uk
In the shrinker, we protect framebuffers from light reclaim as we
typically expect framebuffers to be reused in the near future (and with
low latency requirements). We can apply the same logic to the GGTT
eviction and defer framebuffers to the second pass only used if the
caller is desperate enough to wait for space to become available.
In most cases, the caller will use a smaller partial vma instead of
trying to force the object into the GGTT if doing so will cause other
users to be evicted.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119214336.1463-5-chris@chris-wilson.co.uk
Give obj->mm.quirked a name much more reflective of its purpose
(i915_gem_object_has_tiling_quirk) and move it from the obj->mm field as
it doesn't denote a quirk of the backing store, but a quirk in the
object in its treatment of the backing pages, similar to tiling modes.
Then instead of abusing the pinned status of the buffer to protect it
from the shrinker, we can instead hide the buffer from the shrinker so
it is never considered for being swapped.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119214336.1463-4-chris@chris-wilson.co.uk
The obj->stolen is currently used to identify an object allocated from
stolen memory. This dates back to when there were just 1.5 types of
objects, an object backed by shmemfs and an object backed by shmemfs
with a contiguous physical address. Now that we have several different
types of objects, we no longer want to treat stolen objects as a special
case.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119214336.1463-3-chris@chris-wilson.co.uk
When flushing objects larger than the CPU cache it is preferrable to use
a single wbinvd() rather than overlapping clflush(). At runtime, we
avoid wbinvd() due to its system-wide latencies, but during
singlethreaded suspend, no one will observe the imposed latency and we
can opt for the faster wbinvd to clear all objects in a single hit.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119214336.1463-2-chris@chris-wilson.co.uk
When flushing objects larger than the CPU cache it is preferrable to use
a single wbinvd() rather than overlapping clflush(). At runtime, we
avoid wbinvd() due to its system-wide latencies, but during
singlethreaded suspend, no one will observe the imposed latency and we
can opt for the faster wbinvd to clear all objects in a single hit.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119214336.1463-1-chris@chris-wilson.co.uk
Sanity check the object size before allocating a new gem object.
Fixes: 97d5539632 ("drm/i915/region: convert object_create into object_init")
Testcase: igt/gem_create/create-massive
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210120104714.112812-1-matthew.auld@intel.com
There is a module parameter for controlling what GuC/HuC features are
enabled. Setting to -1 means 'use the default'. However, the default
was not well defined, out of date and needs to be different across
platforms.
The default is now to disable both GuC and HuC on legacy platforms
where legacy means TGL/RKL and anything prior to Gen12. For new
platforms, the default is to load HuC but not enable GuC submission
as that has not landed yet.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210113220724.2484897-1-John.C.Harrison@Intel.com
If we enable_breadcrumbs for a request while that request is being
removed from HW; we may see that the request is active as we take the
ce->signal_lock and proceed to attach the request to ce->signals.
However, during unsubmission after marking the request as inactive, we
see that the request has not yet been added to ce->signals and so skip
the removal. Pull the check during cancel_breadcrumbs under the same
spinlock as enabling so that we the two tests are consistent in
enable/cancel.
Otherwise, we may insert a request onto ce->signals that we expect should
not be there:
intel_context_remove_breadcrumbs:488 GEM_BUG_ON(!__i915_request_is_complete(rq))
While updating, we can note that we are always called with
irqs-disabled, due to the engine->active.lock being held at the single
caller, and so remove the irqsave/restore making it symmetric to
enable_breadcrumbs.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2931
Fixes: c18636f763 ("drm/i915: Remove requirement for holding i915_request.lock for breadcrumbs")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: <stable@vger.kernel.org> # v5.10+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119162057.31097-1-chris@chris-wilson.co.uk
Currently, if a modeset/pageflip needs to wait for render completion to
an object, we boost the priority of that rendering above all other work.
We can apply the same interactive priority boosting to explicit fences
that we can unwrap into a native i915_request (i.e. sync_file).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119204454.10343-1-chris@chris-wilson.co.uk
In a few places we always end up mapping the pool object with the FORCE
constraint(to prevent hitting -EBUSY) which will destroy the cached
mapping if it has a different type. As a simple first step, make the
mapping type part of the pool interface, where the behaviour is to only
give out pool objects which match the requested mapping type.
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119133106.66294-4-matthew.auld@intel.com
Looks like it belongs there anyway, otherwise we have to include the
entirety of i915_gem_object.h just to get at the enum.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119133106.66294-3-matthew.auld@intel.com
io_mapping_map_wc() expects the offset to be relative to the iomapping
base address. Currently we just pass in the physical address for the
page which only works if the region.start starts at zero.
Signed-off-by: CQ Tang <cq.tang@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119133106.66294-2-matthew.auld@intel.com
For the device local-memory case, sgt.pfn will always be equal to zero,
since we instead use sgt.dma. Also, for device local-memory it is
perfectly valid for it to start from zero anyway, so no need to add a
new check for that either.
Signed-off-by: Kui Wen <kui.wen@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119133106.66294-1-matthew.auld@intel.com
Rather than trying to avoid the use-after-free possible with the current
context chasing, simply remove presentation of the per-client stats from
debugfs. While we know from bug reports that this debugfs/i915_gem_objects
has been used by chromeos (and chrome itself) for debug purposes, google
suggests that it is unparsed, so we are free to invoke debugfs is not
ABI and remove details from it.
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210118110854.1873-1-chris@chris-wilson.co.uk
The benefit of only resetting a single engine is that we leave other
streams of userspace work intact across a hang; vital for process
isolation. We had wired up individual engine resets for gen6, but only
enabled it from gen8; now let's turn it on for the forgotten gen7. gen6
is still a mystery as how to unravel some global state that appears to
be reset along with an engine (in particular the ppgtt enabling in
GFX_MODE).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119110802.22228-6-chris@chris-wilson.co.uk
Take advantage of calling xcs_resume under a forcewake by using direct
mmio access. In particular, we can avoid the sleeping variants to allow
resume to be called from softirq context, required for engine resets.
v2: Keep the posting read at the start of resume as a guardian memory
barrier.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119110802.22228-5-chris@chris-wilson.co.uk
During the reset of ring submission, we first stop the engine by
clearing the HEAD/TAIL and marking the ring as disabled. However, it
would be safer to disable the ring (after emptying) before resetting the
HEAD/TAIL.
Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119110802.22228-4-chris@chris-wilson.co.uk
CI reports that Baytail requires one more invalidate after CACHE_MODE
for it to be happy.
Fixes: ace44e13e5 ("drm/i915/gt: Clear CACHE_MODE prior to clearing residuals")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210119110802.22228-1-chris@chris-wilson.co.uk
While immensely convenient for developing to only tackle the first
error, and not be flooded by repeated or secondiary issues, many more
casual testers are not setup to remotely capture debug traces. For those
testers, it is more beneficial to keep the system running in the remote
chance that they are able to extract the original debug logs.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210114113434.8229-2-chris@chris-wilson.co.uk
Now that i915 compiles cleanly with Werror, we can enforce enabling
DEBUG_GEM when selecting the default debug config.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210114113434.8229-1-chris@chris-wilson.co.uk
Similar to commit 49b20dbf74 ("drm/i915/gt: Perform an arbitration
check before busywaiting"), also add a check prior to the busywait
on gen8+, as we have now seen (because we added a selftest to add fault
injection into the engine resets) the same engine reset failure leading
to an indefinite wait on the ring-stop semaphore. So not a Tigerlake
specific bug after all, though it still seems odd behaviour for the
busywait as we do get the arbitration point elsewhere on a miss.
Testcase: igt_reset_fail_engine
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210117110418.3361-1-chris@chris-wilson.co.uk
Since we do a bare context switch with no restore, the clear residual
kernel runs on dirty state, and we must be careful to avoid executing
with bad state from context registers inherited from a malicious client.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2955
Fixes: 008ead6ef8 ("drm/i915/gt: Restore clear-residual mitigations for Ivybridge, Baytrail")
Fixes: 09aa9e4586 ("drm/i915/gt: Restore clear-residual mitigations for Ivybridge, Baytrail")
Testcase: igt/gem_ctx_isolation # ivb,vlv
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Reviewed-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210117093015.29143-1-chris@chris-wilson.co.uk
As context-in/out is now always serialised, we do not have to worry
about concurrent enabling/disable of the busy-stats and can reduce the
atomic_t active to a plain unsigned int, and the seqlock to a seqcount.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210115142331.24458-3-chris@chris-wilson.co.uk
Since schedule-in/out is now entirely serialised by the tasklet bitlock,
we do not need to worry about concurrent in/out operations and so reduce
the atomic operations to plain instructions.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210115142331.24458-1-chris@chris-wilson.co.uk
warning: symbol '__i915_gem_object_create_stolen' was not declared. Should it be static?
warning: symbol '_i915_gem_object_stolen_init' was not declared. Should it be static?
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210115101329.880667-1-matthew.auld@intel.com
Give more flexibility to the caller, if they already have an allocated
object, in case they wish to apply some transformation to the object
prior to handing it over to the region specific initialisation step,
like in gem_create_ext where we would like to first apply the extensions
to the object.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210114182402.840247-3-matthew.auld@intel.com
Depending on the regions min_page_size we might need to adjust the
object size, ensure this matches our expectations.
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210114182402.840247-2-matthew.auld@intel.com
In preparation for gem_create_ext break out the gem_create uAPI, so that
we don't clutter i915_gem.c once we start adding various extensions
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210114182402.840247-1-matthew.auld@intel.com
When cloning the engines from the source context, we need to ensure that
the engines are not freed as we copy them, and that the flags we clone
from the source correspond with the engines we copy across. To do this
we need only take a reference to the src->engines, rather than hold the
src->engine_mutex, so long as we verify that nothing changed under the
read.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210114135612.13210-3-chris@chris-wilson.co.uk
Originally, we used the signal->lock as a means of following the
previous link in its timeline and peeking at the previous fence.
However, we have replaced the explicit serialisation with a series of
very careful probes that anticipate the links being deleted and the
fences recycled before we are able to acquire a strong reference to it.
We do not need the signal->lock crutch anymore, nor want the contention.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210114135612.13210-2-chris@chris-wilson.co.uk
When we know that we are inside the timeline mutex, or inside the
submission flow (under active.lock or the holder's rcu lock), we know
that the rq->hwsp is stable and we can use the simpler direct version.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210114135612.13210-1-chris@chris-wilson.co.uk
Backmerging to get a common base for merging topic branches between
drm-intel-next and drm-intel-gt-next.
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
UAPI Changes:
- Deprecate I915_PMU_LAST and optimize state tracking (Tvrtko)
Avoid relying on last item ABI marker in i915_drm.h, add a
comment to mark as deprecated.
Cross-subsystem Changes:
Core Changes:
Driver Changes:
- Restore clear residuals security mitigations for Ivybridge and
Baytrail (Chris)
- Close#1858: Allow sysadmin to choose applied GPU security mitigations
through i915.mitigations=... similar to CPU (Chris)
- Fix for #2024: GPU hangs on HSW GT1 (Chris)
- Fix for #2707: Driver hang when editing UVs in Blender (Chris, Ville)
- Fix for #2797: False positive GuC loading error message (Chris)
- Fix for #2859: Missing GuC firmware for older Cometlakes (Chris)
- Lessen probability of GPU hang due to DMAR faults [reason 7,
next page table ptr is invalid] on Tigerlake (Chris)
- Fix REVID macros for TGL to fetch correct stepping (Aditya)
- Limit frequency drop to RPe on parking (Chris, Edward)
- Limit W/A 1406941453 to TGL, RKL and DG1 (Swathi)
- Make W/A 22010271021 permanent on DG1 (Lucas)
- Implement W/A 16011163337 to prevent a HS/DS hang on DG1 (Swathi)
- Only disable preemption on gen8 render engines (Chris)
- Disable arbitration around Braswell's PDP updates (Chris)
- Disable arbitration on no-preempt requests (Chris)
- Check for arbitration after writing start seqno before busywaiting (Chris)
- Retain default context state across shrinking (Venkata, CQ)
- Fix mismatch between misplaced vma check and vma insert for 32-bit
addressing userspaces (Chris, CQ)
- Propagate error for vmap() failure instead kernel NULL deref (Chris)
- Propagate error from cancelled submit due to context closure
immediately (Chris)
- Fix RCU race on HWSP tracking per request (Chris)
- Clear CMD parser shadow and GPU reloc batches (Matt A)
- Populate logical context during first pin (Maarten)
- Optimistically prune dma-resv from the shrinker (Chris)
- Fix for virtual engine ownership race (Chris)
- Remove timeslice suppression to restore fairness for virtual engines (Chris)
- Rearrange IVB/HSW workarounds properly between GT and engine (Chris)
- Taint the reset mutex with the shrinker (Chris)
- Replace direct submit with direct call to tasklet (Chris)
- Multiple corrections to virtual engine dequeue and breadcrumbs code (Chris)
- Avoid wakeref from potentially hard IRQ context in PMU (Tvrtko)
- Use raw clock for RC6 time estimation in PMU (Tvrtko)
- Differentiate OOM failures from invalid map types (Chris)
- Fix Gen9 to have 64 MOCS entries similar to Gen11 (Chris)
- Ignore repeated attempts to suspend request flow across reset (Chris)
- Remove livelock from "do_idle_maps" VT-d W/A (Chris)
- Cancel the preemption timeout early in case engine reset fails (Chris)
- Code flow optimization in the scheduling code (Chris)
- Clear the execlists timers upon reset (Chris)
- Drain the breadcrumbs just once (Chris, Matt A)
- Track the overall GT awake/busy time (Chris)
- Tweak submission tasklet flushing to avoid starvation (Chris)
- Track timelines created using the HWSP to restore on resume (Chris)
- Use cmpxchg64 for 32b compatilibity for active tracking (Chris)
- Prefer recycling an idle GGTT fence to avoid GPU wait (Chris)
- Restructure GT code organization for clearer split between GuC
and execlists (Chris, Daniele, John, Matt A)
- Remove GuC code that will remain unused by new interfaces (Matt B)
- Restructure the CS timestamp clocks code to local to GT (Chris)
- Fix error return paths in perf code (Zhang)
- Replace idr_init() by idr_init_base() in perf (Deepak)
- Fix shmem_pin_map error path (Colin)
- Drop redundant free_work worker for GEM contexts (Chris, Mika)
- Increase readability and understandability of intel_workarounds.c (Lucas)
- Defer enabling the breadcrumb interrupt to after submission (Chris)
- Deal with buddy alloc block sizes beyond 4G (Venkata, Chris)
- Encode fence specific waitqueue behaviour into the wait.flags (Chris)
- Don't cancel the breadcrumb interrupt shadow too early (Chris)
- Cancel submitted requests upon context reset (Chris)
- Use correct locks in GuC code (Tvrtko)
- Prevent use of engine->wa_ctx after error (Chris, Matt R)
- Fix build warning on 32-bit (Arnd)
- Avoid memory leak if platform would have more than 16 W/A (Tvrtko)
- Avoid unnecessary #if CONFIG_PM in PMU code (Chris, Tvrtko)
- Improve debugging output (Chris, Tvrtko, Matt R)
- Make file local variables static (Jani)
- Avoid uint*_t types in i915 (Jani)
- Selftest improvements (Chris, Matt A, Dan)
- Documentation fixes (Chris, Jose)
Signed-off-by: Dave Airlie <airlied@redhat.com>
# Conflicts:
# drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
# drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h
# drivers/gpu/drm/i915/gt/intel_lrc.c
# drivers/gpu/drm/i915/gvt/mmio_context.h
# drivers/gpu/drm/i915/i915_drv.h
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210114152232.GA21588@jlahtine-mobl.ger.corp.intel.com
amd-drm-next-5.12-2021-01-08:
amdgpu:
- Rework IH ring handling on vega and navi
- Rework HDP handling for vega and navi
- swSMU documenation updates
- Overdrive support for Sienna Cichlid and newer asics
- swSMU updates for vangogh
- swSMU updates for renoir
- Enable FP16 on DCE8-11
- Misc code cleanups and bug fixes
radeon:
- Fixes for platforms that can't access PCI resources correctly
- Misc code cleanups
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210108221811.3868-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
Remove the extraneous inlines. The only split by the compiler that
looked dubious was execlists_schedule_out, so push the code around
slightly to move all the work into the out-of-line function.
In a normal build, bloat-o-meter shows that only the
execlists_schedule_out is contentious:
add/remove: 1/0 grow/shrink: 0/2 up/down: 803/-1532 (-729)
Function old new delta
__execlists_schedule_out - 803 +803
execlists_submission_tasklet 6488 5766 -722
execlists_reset_csb.constprop 1587 777 -810
Total: Before=1605815, After=1605086, chg -0.05%
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210113151112.15212-1-chris@chris-wilson.co.uk