For icl+, have hw read out to create hw blob of gamma
lut values. icl+ platforms supports multi segmented gamma
mode by default, add hw lut creation for this mode.
This will be used to validate gamma programming using dsb
(display state buffer) which is a tgl specific feature.
Major change done-removal of readouts of coarse and fine segments
because PAL_PREC_DATA register isn't giving propoer values.
State checker limited only to "fine segment"
v2: -readout code for multisegmented gamma has to come
up with some intermediate entries that aren't preserved
in hardware (Jani N)
-linear interpolation (Ville)
-moved common code to check gamma_enable to specific funcs,
since icl doesn't support that
v3: -use u16 instead of __u16 [Jani N]
-used single lut [Jani N]
-improved and more readable for loops [Jani N]
-read values directly to actual locations and then fill gaps [Jani N]
-moved cleaning to patch 1 [Jani N]
-renamed icl_read_lut_multi_seg() to icl_read_lut_multi_segment to
make it similar to icl_load_luts()
-renamed icl_compute_interpolated_gamma_blob() to
icl_compute_interpolated_gamma_lut_values() more sensible, I guess
v4: -removed interpolated func for creating gamma lut values
-removed readouts of fine and coarse segments, failure to read PAL_PREC_DATA
correctly
Signed-off-by: Swati Sharma <swati2.sharma@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1569096654-24433-3-git-send-email-swati2.sharma@intel.com
Gen12 has dual-subslices (DSS), which compared to gen11 subslices have
some duplicated resources/paths. Although DSS behave similarly to 2
subslices, instead of splitting this and presenting userspace with bits
not directly representative of hardware resources, present userspace
with a subslice_mask made up of DSS bits instead.
v2: GEM_BUG_ON on mask size (Lionel)
Bspec: 29547
Bspec: 12247
Cc: Kelvin Gardiner <kelvin.gardiner@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
CC: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com> #v1
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190913075137.18476-2-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
On ILK-IVB the pipe colorspace is configured via PIPECONF
(as opposed to PIPEMISC in BDW+). Let's configure+readout
that stuff correctly.
Enabling YCbCr 4:4:4 output will now be a simple matter of
setting crtc_state->output_format appropriately in the encoder
.compute_config(). However, when we do that we must be
aware of the fact that YCbCr DP output doesn't seem to work
on ILK (resulting image is totally garbled), but on SNB+
it works fine. However HDMI YCbCr output does work correctly
even on ILK.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190718145053.25808-13-ville.syrjala@linux.intel.com
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Prepare the pipe csc for YCbCr output on ilk/snb. The main difference
to IVB+ is the lack of explicit post offsets, and instead we must
configure the CSC info RGB->YUV mode (which takes care of offsetting
Cb/Cr properly) and enable the "black screen offset" bit to add the
required offset to Y.
And while at it throw some comments around the bit defines to
document which platforms have which bits.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190718145053.25808-12-ville.syrjala@linux.intel.com
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
On HSW the pipe colorspace is configured via PIPECONF
(as opposed to PIPEMISC in BDW+). Let's configure+readout
that stuff correctly.
Enabling YCbCr 4:4:4 output will now be a simple matter of
setting crtc_state->output_format appropriately in the encoder
.compute_config().
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190718145053.25808-10-ville.syrjala@linux.intel.com
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Make intel_get_crtc_ycbcr_config() simpler and rename it
to bdw_get_pipemisc_output_format() to better reflect what
it does.
Also toss in some comments to document that the 4:2:0 PIPECONF
bits are glk+ only. They are mbz on earlier platforms so reading
them unconditionally is safe however.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190718145053.25808-9-ville.syrjala@linux.intel.com
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Since HSW the PIPECONF progressive vs. interlaced selection is done
with just two bits instead of the earlier three. Let's not look at the
extra bit on HSW+. Also gen2 doesn't support interlaced displays at all.
This is actually fine as is currently because the extra bit is mbz (as
are all three bits on gen2). But just to avoid mishaps in the future
if the bits get reused let's only look at what's properly defined.
v2: constify crtc_state
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190718145053.25808-8-ville.syrjala@linux.intel.com
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
crtc_state->limited_color_range only applies to RGB output but
we're currently setting it even for YCbCr output. That will
lead to conflicting MSA and PIPECONF settings which can mess
up the image. Let's make sure limited_color_range stays unset
with YCbCr output.
Also WARN if we end up with such a bogus combination when
programming the MSA MISC bits as it's impossible to even
indicate quantization rangle for YCbCr via MSA MISC. YCbCr
output is simply assumed to be limited range always. Note
that VSC SDP does provide a mechanism for full range YCbCr,
so in the future we may want to rethink how we compute/store
this state.
And for good measure we add the same WARN to the HDMI path.
v2: s/==/!=/ in the HDMI WARN
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190718164523.11738-1-ville.syrjala@linux.intel.com
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
We're configuring the AVI infoframe quantization range bits as if
we're always transmitting RGB pixels. Let's fix this so that we
correctly indicate limited range YCC quantization range when
transmitting YCbCr instead.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190718145053.25808-4-ville.syrjala@linux.intel.com
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Looks like we're currently setting the MSA to xvYCC BT.709 instead
of the YCbCr BT.601 claimed by the comment. But even that comment
is wrong since we configure the CSC matrix to BT.709.
Let's remove the bogus statement from the comment and fix the
MSA to indicate YCbCr BT.709 so that it matches the actual
pixel data we're transmitting.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190718145053.25808-3-ville.syrjala@linux.intel.com
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Make both GuC and HuC to use "." as the separator. Hardcode
the separator in MAKE_UC_FW_PATH. Remove the usage of "ver" from HuC.
The current convention being:
<platform>_<g/h>uc_<major>.<minor>.patch.bin
Update the versions of HuC being loaded of the platforms.
SKL - v2.0.0
BXT - v2.0.0
KBL - v4.0.0
GLK - v4.0.0
CFL - KBL v4.0.0
ICL - v9.0.0
CML - v4.0.0
v2: Remove the separator parameter altogether from
__MAKE_UC_FW_PATH.(Daniele)
- Squash all firmware update patches (Daniele)
v3: s/huc/HuC
- Correct the order of platforms
- Change REVID of cml to 5(Michal)
- Code space changes in huc_def (Daniele)
Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190919201204.9691-1-daniele.ceraolospurio@intel.com
Our sanitychecks indicate that while this register is context
saved/restore, the HW does not preserve this bit within the register --
it likely doesn't exist, or one of those mythical bits that the
architects insist does something despite all appearances to the
contrary.
For reference, SAMPLER_MODE is already in i915_reg.h as
GEN10_SAMPLER_MODE and is being setup in icl_ctx_workarounds_init() as
opposed to the chosen location here of rcs_engine_wa_init).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111754
Fixes: 7f0cc34b53 ("drm/i915/tgl: Implement Wa_1406941453")
Testcase: igt/i915_selftest/live_workarounds
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190920081254.18389-1-chris@chris-wilson.co.uk
As not only is the signal->timeline volatile, so will be acquiring the
timeline's HWSP. We must first carefully acquire the timeline from the
signaling request and then lock the timeline. With the removal of the
struct_mutex serialisation of request construction, we can have multiple
timelines active at once, and so we must avoid using the nested mutex
lock as it is quite possible for both timelines to be establishing
semaphores on the other and so deadlock.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190919111912.21631-3-chris@chris-wilson.co.uk
As we need to take a walk back along the signaler timeline to find the
fence before upon which we want to wait, we need to lock that timeline
to prevent it being modified as we walk. Similarly, we also need to
acquire a reference to the earlier fence while it still exists!
Though we lack the correct locking today, we are saved by the
overarching struct_mutex -- but that protection is being removed.
v2: Tvrtko made me realise I was being lax and using annotations to
ignore the AB-BA deadlock from the timeline overlap. As it would be
possible to construct a second request that was using a semaphore from the
same timeline as ourselves, we could quite easily end up in a situation
where we deadlocked in our mutex waits. Avoid that by using a trylock
and falling back to a normal dma-fence await if contended.
v3: Eek, the signal->timeline is volatile and must be carefully
dereferenced to ensure it is valid.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190919111912.21631-2-chris@chris-wilson.co.uk
The request->timeline is only valid until the request is retired (i.e.
before it is completed). Upon retiring the request, the context may be
unpinned and freed, and along with it the timeline may be freed. We
therefore need to be very careful when chasing rq->timeline that the
pointer does not disappear beneath us. The vast majority of users are in
a protected context, either during request construction or retirement,
where the timeline->mutex is held and the timeline cannot disappear. It
is those few off the beaten path (where we access a second timeline) that
need extra scrutiny -- to be added in the next patch after first adding
the warnings about dangerous access.
One complication, where we cannot use the timeline->mutex itself, is
during request submission onto hardware (under spinlocks). Here, we want
to check on the timeline to finalize the breadcrumb, and so we need to
impose a second rule to ensure that the request->timeline is indeed
valid. As we are submitting the request, it's context and timeline must
be pinned, as it will be used by the hardware. Since it is pinned, we
know the request->timeline must still be valid, and we cannot submit the
idle barrier until after we release the engine->active.lock, ergo while
submitting and holding that spinlock, a second thread cannot release the
timeline.
v2: Don't be lazy inside selftests; hold the timeline->mutex for as long
as we need it, and tidy up acquiring the timeline with a bit of
refactoring (i915_active_add_request)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190919111912.21631-1-chris@chris-wilson.co.uk
Before we execute a batch, we must first issue any and all TLB
invalidations so that batch picks up the new page table entries.
Tigerlake's preparser is weakening our post-sync CS_STALL inside the
invalidate pipe-control and allowing the loading of the batch buffer
before we have setup its page table (and so it loads the wrong page and
executes indefinitely).
The igt_cs_tlb indicates that this issue can only be observed on rcs,
even though the preparser is common to all engines. Alternatively, we
could do TLB shootdown via mmio on updating the GTT.
By inserting the pre-parser disable inside EMIT_INVALIDATE, we will also
accidentally fixup execution that writes into subsequent batches, such
as gem_exec_whisper and even relocations performed on the GPU. We should
be careful not to allow this disable to become baked into the uABI! The
issue is that if userspace relies on our disabling of the HW
optimisation, when we are ready to enable that optimisation, userspace
will then be broken...
Testcase: igt/i915_selftests/live_gtt/igt_cs_tlb
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111753
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190919151811.9526-1-chris@chris-wilson.co.uk
Modern platforms allow the transcoders hdisplay/vdisplay to exceed the
planes' max resolution. This has the nasty implication that modes on the
connectors' mode list may not be usable when the user asks for a
fullscreen plane. Seeing as that is the most common use case it seems
prudent to filter out modes that don't allow for fullscreen planes to
be enabled.
Let's do that in the connetor .mode_valid() hook so that normally
such modes are kept hidden but the user is still able to forcibly
specify such a mode if they know they don't need fullscreen planes.
This is in line with ealier policies regarding certain clock limits.
The idea is to prevent the casual user from encountering a mode that
would fail under typical conditions, but allow the expert user to
force things if they so wish.
Maybe in the future we should consider automagically using two
planes when one can't cover the entire screen? Wouldn't be a
great match for the current uapi with explicit planes though,
but I guess no worse than using two pipes (which we apparently
have to in the future anyway). Either that or we'd have to
teach userspace to do it for us.
v2: Fix icl+ max plane heigth (Manasi)
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Leho Kraav <leho@kraav.com>
Cc: Sean Paul <sean@poorly.run>
Cc: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190918150707.32420-1-ville.syrjala@linux.intel.com
The officially validated plane width limit is 4k on skl+, however
we already had people using 5k displays before we started to enforce
the limit. Also it seems Windows allows 5k resolutions as well
(though not sure if they do it with one plane or two).
According to hw folks 5k should work with the possible
exception of the following features:
- Ytile (already limited to 4k)
- FP16 (already limited to 4k)
- render compression (already limited to 4k)
- KVMR sprite and cursor (don't care)
- horizontal panning (need to verify this)
- pipe and plane scaling (need to verify this)
So apart from last two items on that list we are already
fine. We should really verify what happens with those last
two items but I don't have a 5k display on hand atm so it'll
have to wait.
In the meantime let's just bump the limit back up to 5k since
several users have already been using it without apparent issues.
At least we'll be no worse off than we were prior to lowering
the limits.
Cc: stable@vger.kernel.org
Cc: Sean Paul <sean@poorly.run>
Cc: José Roberto de Souza <jose.souza@intel.com>
Tested-by: Leho Kraav <leho@kraav.com>
Fixes: 372b9ffb57 ("drm/i915: Fix skl+ max plane width")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111501
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190905135044.2001-1-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Sean Paul <sean@poorly.run>
The MCC hpd table is just a subset of the ICP table; we can eliminate it
and use the ICP table everywhere. The extra pins in the table won't be
a problem for MCC since we still supply an appropriate hotplug trigger
mask anywhere the pin table is used.
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190918235626.3750-2-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
We generally assume future platforms will inherit the behavior of the
most recent platforms, so update our DDC pin mapping defaults to match
how ICP/TGP behave (i.e., pins starting from GMBUS_PIN_1_BXT for combo
PHY's and pins starting from GMBUS_PIN_9_TC1_ICP for TC PHY's). MCC's
non-standard handling of combo PHY C seems like a platform-specific
quirk that is unlikely to be duplicated on future platforms, so continue
handling it as a special case.
Without this change, future platforms would default to gen4-style pin
mapping which is almost certainly not what we'll want.
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190918235626.3750-1-matthew.d.roper@intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Check that we are correctly invalidating the TLB at the start of a
batch after updating the GTT.
v2: Comments and hold the request reference while spinning
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190919131414.7495-1-chris@chris-wilson.co.uk
Our assumption that the we can ask the HW to lock the SFC even if not
currently in use does not match the HW commitment. The expectation from
the HW is that SW will not try to lock the SFC if the engine is not
using it and if we do that the behavior is undefined; on ICL the HW
ends up to returning the ack and ignoring our lock request, but this is
not guaranteed and we shouldn't expect it going forward.
Also, failing to get the ack while the SFC is in use means that we can't
cleanly reset it, so fail the engine reset in that scenario.
v2: drop rmw change, keep the log as debug and handle failure (Chris),
improve comments (Tvrtko).
Reported-by: Owen Zhang <owen.zhang@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190919015330.15435-1-daniele.ceraolospurio@intel.com
A few times in CI, we have detected a GPU hang on our Haswell GT2
systems with the characteristic IPEHR of 0x780c0000. When the PSMI w/a
was first introducted, it was applied to all Haswell, but later on we
found an erratum that supposedly restricted the issue to GT1 and so
constrained it only be applied on GT1. That may have been a mistake...
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111692
Fixes: 167bc759e8 ("drm/i915: Restrict PSMI context load w/a to Haswell GT1")
References: 2c55018347 ("drm/i915: Disable PSMI sleep messages on all rings around context switches")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190917194746.26710-1-chris@chris-wilson.co.uk
The CMP PCH ID we have in the driver is correct for the CML-U machines we have
in our CI system, but the CML-S and CML-H CI machines appear to use a
different PCH ID, leading our driver to detect no PCH for them.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
References: 729ae330a0 ("drm/i915/cml: Introduce Comet Lake PCH")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111461
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190916233251.387-1-matthew.d.roper@intel.com
On Tigerlake, MI_SEMAPHORE_WAIT grew an extra dword, so be sure to
update the length field and emit that extra parameter and any padding
noop as required.
v2: Define the token shift while we are adding the updated MI_SEMAPHORE_WAIT
v3: Use int instead of bool in the addition so that readers are not left
wondering about the intricacies of the C spec. Now they just have to
worry what the integer value of a boolean operation is...
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190917123055.28965-1-chris@chris-wilson.co.uk
If we try to clear, or even set, a bit in the register that doesn't
change the register state; skip the write. There's a slight danger in
that the register acts as a latch-on-write, but I do not think we use a
rmw cycle with any such latch registers.
Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190917080029.27632-1-chris@chris-wilson.co.uk
Include the active context register state when dumping the engine.
Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190915203701.29163-1-chris@chris-wilson.co.uk
Stop setting ->pipe_mask to zero when display is disabled, allowing us
to have different code paths for not actually having display hardware,
and having display hardware disabled. This lets us develop those two
avenues independently.
There are no functional changes for when there is no display. However,
all uses of for_each_pipe() and for_each_pipe_masked() will start
running for the disabled display case. Put one of the more significant
ones behind checks for INTEL_DISPLAY_ENABLED(), otherwise the cases
should not be hit with disabled display, or they seem benign. Fingers
crossed.
All in all, this might not be the ideal solution. In fact we may have
had something along the lines of this in the past, but we ended up
conflating the two cases. Possibly even by recommendation by yours
truly; I did not dare dig up that part of the history. But the perfect
is the enemy of the good, this is a straightforward change, and lets us
get actual work done in both fronts without interfering with each other.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190916092901.31440-1-jani.nikula@intel.com
There's a helper in drm_fourcc.h these days to check of we're dealing
with a two plane YUV format. Make use if it.
Also s/plane/color_plane/ in skl_plane_relative_data_rate() to reduce
the confusion.
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190913193157.9556-2-ville.syrjala@linux.intel.com
Prepare for making a distinction between not having display and having
disabled display. Add INTEL_DISPLAY_ENABLED() and use it where
HAS_DISPLAY() is used after intel_device_info_runtime_init(). This is
initially duplication, as disabling display still leads to ->pipe_mask =
0 and HAS_DISPLAY() being false.
Note that ever since i915.display_disable was introduced, it has not
affected PCH detection even if it uses HAS_DISPLAY(), as display disable
happens after that.
Since INTEL_DISPLAY_ENABLED() will not make sense unless HAS_DISPLAY()
is true, include a warning for catching misuses making decisions on
INTEL_DISPLAY_ENABLED() when HAS_DISPLAY() is false.
v2: Remove INTEL_DISPLAY_ENABLED() check from intel_detect_pch() (Chris)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190913100407.30991-1-jani.nikula@intel.com
We think that we got rc6 problems sorted out. Flip the switch
and let CI expose our tendency to naive optimism.
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190913200638.31939-1-chris@chris-wilson.co.uk
The media ranges extend beyond what gen11 gives so we can't piggypack
on gen11 ranges, even on read side.
Introduce a table for gen12 and accessors for it.
v2: correctly implement gen12_fwtable_write/read (Daniele)
v3: update with ranges from bspec.
v4: avoid GEN11_NEEDS_FORCEWAKE (Mika)
v5: bspec ref (Daniele)
BSpec: 52078
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190913141652.27958-2-mika.kuoppala@linux.intel.com
More pruning away of features until we have a stable system and a basis
for debugging what's missing.
v2: Fixup vdbox/vebox fusing
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190913145556.23912-1-chris@chris-wilson.co.uk
While srcu may use an integer tag, it does not exclude potential error
codes and so may overlap with our own use of -EINTR. Use a separate
outparam to store the tag, and report the error code separately.
Fixes: 2caffbf117 ("drm/i915: Revoke mmaps and prevent access to fence registers across reset")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190912160834.30601-1-chris@chris-wilson.co.uk
On ICL+, the max supported plane height is 4320, so bump it up
To support 4320, we need to increase the number of bits used to
read plane_height to 13 as opposed to older 12 bits.
v4:
* Adjust the width mask also since extra bits are mbz (Ville)
v3:
* Use 0xffff for mask as extra bits are mbz (Ville)
v2:
* ICL plane height supported is 4320 (Ville)
* Add a new line between max width and max height (Jose)
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712203808.4126-1-manasi.d.navare@intel.com
On ICL+, the vertical limits for the transcoders are increased to 8192
and horizontal limits are bumped to 16K so bump up
limits in intel_mode_valid()
v4:
* Increase the hdisplay to 16K (Ville)
v3:
* Supported starting ICL (Ville)
* Use the higher limits from TRANS_VTOTAL register (Ville)
v2:
* Checkpatch warning (Manasi)
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712202214.3906-1-manasi.d.navare@intel.com
Without it we get:
Unclaimed read from register 0x1e1110
WARNING: CPU: 2 PID: 1029 at drivers/gpu/drm/i915/intel_uncore.c:1101 __unclaimed_reg_debug+0x40/0x50 [i915]
Call Trace:
fwtable_read32+0x233/0x300 [i915]
i915_interrupt_info+0xa73/0xd60 [i915]
seq_read+0xdb/0x3c0
full_proxy_read+0x51/0x80
vfs_read+0x9e/0x160
ksys_read+0x8f/0xe0
do_syscall_64+0x55/0x1c0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109824
Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190912125418.23115-2-arkadiusz.hiler@intel.com
We see failures where the context continues executing past a
preemption event, eventually leading to situations where a request has
executed before we have event submitted it to HW! It seems like tgl is
ignoring our RING_TAIL updates, but more likely is that there is a
missing update required for our semaphore waits around preemption.
v2: And disable internal semaphore usage
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190912132313.12751-1-chris@chris-wilson.co.uk
As we track when we put the GT device to sleep upon idling, we can use
that callback to sample the current rc6 counters and record the
timestamp for estimating samples after that point while asleep.
v2: Stick to using ktime_t
v3: Track user_wakerefs that interfere with the new
intel_gt_pm_wait_for_idle
v4: No need for parked/unparked estimation if !CONFIG_PM
v5: Keep timer park/unpark logic as was
v6: Refactor duplicated estimate/update rc6 logic
v7: Pull intel_get_pm_get_if_awake() out from the pmu->lock.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105010
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190912124813.19225-1-chris@chris-wilson.co.uk
Replace device info number of pipes with a bit mask of available
pipes. This will prove handy in the future. There's still a bunch of
future work to do to actually allow a non-consecutive mask of pipes, but
it's a start. No functional changes.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190911202908.19631-1-jani.nikula@intel.com
Since d0aa694b92 ("drm/i915/pmu: Always sample an active ringbuffer")
the cost of sampling the engine state on execlists platforms became a
little bit higher when both engine busyness and one of the wait states are
being monitored. (Previously the busyness sampling on legacy platforms was
done via seqno comparison so there was no cost of mmio read.)
We can avoid that by skipping busyness sampling when engine supports
software busy stats and so avoid the cost of potential mmio read and
sample accumulation.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190911160730.22687-1-tvrtko.ursulin@linux.intel.com
After we manipulate the context to allow replay after a GPU reset, force
that context to be reloaded. This should be a layer of paranoia, for if
the GPU was reset, the context will no longer be resident!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190912092933.4729-2-chris@chris-wilson.co.uk
After a GPU reset, we need to drain all the CS events so that we have an
accurate picture of the execlists state at the time of the reset. Be
paranoid and force a read of the CSB write pointer from memory.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190912092933.4729-1-chris@chris-wilson.co.uk
The FBC requires a couple of contiguous buffers, which we allocate from
stolen memory. If stolen memory is unavailable, we cannot allocate those
buffers and so cannot support FBC. Mark it so.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190911175926.31365-1-chris@chris-wilson.co.uk
Reuse the same .modeset_calc_cdclk() function for all bxt+.
The only difference in between the cnl/icl and the bxt variants
is the call to cnl_compute_min_voltage_level(). We can do that call
just fine on older platforms since they leave min_voltage_level[]
zeroed. Let's rename the function to bxt_compute_min_voltage_level()
just so it stays consistent with the rest of the naming scheme.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190911133129.27466-4-ville.syrjala@linux.intel.com
We're forgetting to mask off all three pipe select bits from the
CDCLK_CTL value on icl+ which may lead to the extra bit being
left in. That will cause us to consider the current hardware
cdclk state as invalid, and we proceed to sanitize it even
though the hardware may have active pipes and whatnot.
Fix up the mask so we get rid of all three pipe select bits
and thus hopefully no longer sanitize cdclk when it's already
correctly programmed.
Cc: Matt Roper <matthew.d.roper@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111641
Fixes: 0c1279b58f ("drm/i915: Consolidate {bxt,cnl,icl}_init_cdclk")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190911133129.27466-2-ville.syrjala@linux.intel.com
On tgl/bxt/glk the cdclk bypass frequency depends on the PLL
reference clock. So let's read out the ref clock before we
try to compute the bypass clock.
Cc: Matt Roper <matthew.d.roper@intel.com>
Fixes: 71dc367e2b ("drm/i915: Consolidate bxt/cnl/icl cdclk readout")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190911133129.27466-1-ville.syrjala@linux.intel.com
Abstract away direct access to ->num_pipes to allow further
refactoring. No functional changes.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190911092608.13009-1-jani.nikula@intel.com
There's no easy way of checking whether iommu is enabled for the GPU
(you can grep dmesg if you know the device, or you can grep
i915_gpu_info if that's available). We do have a central
i915_capabilities with the intent of listing such pertinent information,
so add the iommu status.
Suggested-by: Martin Peres <martin.peres@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Martin Peres <martin.peres@linux.intel.com>
Cc: Tomi Sarvela <tomi.p.sarvela@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Martin Peres <martin.peres@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190911114655.9254-1-chris@chris-wilson.co.uk
The same read-only affliction as befell Icelake is affecting Tigerlake.
Disable the read-only support as clearly it was not fixed.
Testcase: igt/i915_selftests/live_gem_context
References: 3936867dbc ("drm/i915: Disable read only ppgtt support for gen11")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190911125717.28997-1-chris@chris-wilson.co.uk
system_unbound_wq can't keep up sometimes and we get dropped frames.
Switch to a high priority variant.
Reported-by: Heinrich Fink <heinrich.fink@daqri.com>
Tested-by: Heinrich Fink <heinrich.fink@daqri.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190910121347.22958-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Commit 736da8112f ("drm/i915: Use literal representation of cdclk
tables") pushed the cdclk logic into tables, adding glk_cdclk_table but
not using yet:
drivers/gpu/drm/i915/display/intel_cdclk.c:1173:38: error: ‘glk_cdclk_table’ defined but not used [-Werror=unused-const-variable=]
Fixes: 736da8112f ("drm/i915: Use literal representation of cdclk tables")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190911074727.32585-1-chris@chris-wilson.co.uk
In preparation for reducing struct_mutex stranglehold around the vm,
make the vma.flags atomic so that we can acquire a pin on the vma
atomically before deciding if we need to take the mutex.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190911090243.16786-1-chris@chris-wilson.co.uk
This allows userspace to use "legacy" mode for push constants, where
they are committed at 3DPRIMITIVE or flush time, rather than being
committed at 3DSTATE_BINDING_TABLE_POINTERS_XS time. Gen6-8 and Gen11
both use the "legacy" behavior - only Gen9 works in the "new" way.
Conflating push constants with binding tables is painful for userspace,
we would like to be able to avoid doing so.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: stable@vger.kernel.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190911014801.26821-1-kenneth@whitecape.org
Both in the container_of and getting to gt->awake there is no need to go
via i915 since both the wakeref and awake are members of gt.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190910143823.10686-4-tvrtko.ursulin@linux.intel.com
Code in i915_gem_init_hw is all about GT init so move it to intel_gt.c
renaming to intel_gt_init_hw.
Existing intel_gt_init_hw is renamed to intel_gt_init_hw_early since it
is currently called from driver probe.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190910143823.10686-2-tvrtko.ursulin@linux.intel.com
The BXT and CNL functions were already basically identical, whereas
ICL's function tried to do its own sanitization rather than calling
bxt_sanitize_cdclk.
This should actually fix a bug in our ICL initialization where it would
consider the /2 CD2X divider invalid and force an unnecessary
sanitization (we now have valid clock frequencies that use this
divider).
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190910154252.30503-9-matthew.d.roper@intel.com
When reading out the BIOS-programmed cdclk state, let's make sure that
the cdclk value is on the valid list for the platform, ensure that the
VCO matches the cdclk, and ensure that the CD2X divider was set
properly.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190910154252.30503-8-matthew.d.roper@intel.com
With all of the cdclk function consolidation, we can cut down on a lot
of platform if/else logic by creating a vfunc that's initialized at
startup.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190910154252.30503-7-matthew.d.roper@intel.com
The uninitialize flow is the same on all of these platforms, aside from
calculating a different frequency level.
v2: Reverse platform conditional order for consistency. (Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190910154252.30503-6-matthew.d.roper@intel.com
The CNL variant of this function is identical to the BXT variant aside
from not needing to handle SSA precharge.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190910154252.30503-5-matthew.d.roper@intel.com
We'd previously combined ICL/TGL logic into the cnl_set_cdclk function,
but BXT is pretty similar as well. Roll the cnl/icl/tgl logic back into
the bxt function; the only things we really need to handle separately
are punit notification and calling different functions to enable/disable
the cdclk PLL.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190910154252.30503-4-matthew.d.roper@intel.com
The bspec lays out legal cdclk frequencies, PLL ratios, and CD2X
dividers in an easy-to-read table for most recent platforms. We've been
translating the data from that table into platform-specific code logic,
but it's easy to overlook an area we need to update when adding new
cdclk values or enabling new platforms. Let's just add a form of the
bspec table to the code and then adjust our functions to pull what they
need directly out of the table.
v2: Fix comparison when finding best cdclk.
v3: Another logic fix for calc_cdclk.
v4:
- Use named initializers for cdclk tables. (Ville)
- Include refclk as a field in the table instead of adding all three
ratios for each entry. (Ville)
- Terminate tables with an empty entry to avoid needing to store the
table size. (Ville)
- Don't try so hard to return reasonable values from our lookup
functions if we get impossible inputs; just WARN and return 0.
(Ville)
- Keep a bxt_ prefix on the lookup functions since they're still only
used on bxt+ for now. We can rename them later if we extend this
table-based approach back to older platforms. (Ville)
v5:
- Fix cnl table's ratios for 24mhz refclk. (Ville)
- Don't miss the named initializers on the cnl table. (Ville)
- Represent refclk in table as u16 rather than u32. (Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190910161506.7158-1-matthew.d.roper@intel.com
Aside from a few minor register changes and some different clock values,
cdclk design hasn't changed much since gen9lp. Let's consolidate the
handlers for bxt, cnl, and icl to keep the codeflow consistent.
Also, while we're at it, s/bxt_de_pll_update/bxt_de_pll_readout/ since
"update" makes me think we should be writing to hardware rather than
reading from it.
v2:
- Fix icl_calc_voltage_level() limits. (Ville)
- Use CNL_CDCLK_PLL_RATIO_MASK rather than BXT_DE_PLL_RATIO_MASK on
gen10+ to avoid confusion. (Ville)
v3:
- Also fix ehl_calc_voltage_level() limits. (Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190910160520.6587-1-matthew.d.roper@intel.com
Empirical evidence from CI tells us that our rc6 setup for Tigerlake is
off. Disable rc6 on tgl temporary so that we gain CI coverage as we
prepare a fix. It also appears that the BIOS on our tgl leaves rc6
enabled, so we have to explicitly disable it on init.
References: https://bugs.freedesktop.org/show_bug.cgi?id=111593
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190910161657.23037-1-chris@chris-wilson.co.uk
Currently, if there is time remaining before the start of the loop, we
do one full iteration over many possible different chunks within the
object. A full loop may take 50+s (depending on speed of indirect GTT
mmapings) and we try separately with LINEAR, X and Y -- at which point
igt times out. If we check more frequently, we will interrupt the loop
upon our timeout -- it is hard to argue for as this significantly reduces
the test coverage as we dramatically reduce the runtime. In practical
terms, the coverage we should prioritise is in using different fence
setups, forcing verification of the tile row computations over the
current preference of checking extracting chunks. Though the exhaustive
search is great given an infinite timeout, to improve our current
coverage, we also add a randomised smoketest of partial mmaps. So let's
do both, add a randomised smoketest of partial tiling chunks and the
exhaustive (though time limited) search for failures.
Even in adding another subtest, we should shave 100s off BAT! (With,
hopefully, no loss in coverage, at least over multiple runs.)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190910121009.13431-1-chris@chris-wilson.co.uk
As soon as we re-enable the various functions within the HW, they may go
off and read data via a GGTT offset. Hence, if we have not yet restored
the GGTT PTE before then, they may read and even *write* random locations
in memory.
Detected by DMAR faults during resume.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Martin Peres <martin.peres@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190909110011.8958-4-chris@chris-wilson.co.uk
For cherryview, add hw read out to create hw blob of gamma
lut values.
Review comments from previous series:
https://patchwork.freedesktop.org/patch/328252
v4: -No need to initialize *blob [Jani]
-Removed right shifts [Jani]
-Dropped dev local var [Jani]
v5: -Returned blob instead of assigning it internally within the
function [Ville]
-Renamed function cherryview_get_color_config() to chv_read_luts()
-Renamed cherryview_get_gamma_config() to chv_read_cgm_gamma_lut()
[Ville]
v9: -80 character limit [Uma]
-Made read func para as const [Ville, Uma]
-Renamed chv_read_cgm_gamma_lut() to chv_read_cgm_gamma_lut()
[Ville, Uma]
Signed-off-by: Swati Sharma <swati2.sharma@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1568030503-26747-4-git-send-email-swati2.sharma@intel.com
For i965, add hw read out to create hw blob of gamma
lut values.
Review comments from old series:
https://patchwork.freedesktop.org/series/58039/
v4: -No need to initialize *blob [Jani]
-Removed right shifts [Jani]
-Dropped dev local var [Jani]
v5: -Returned blob instead of assigning it internally
within the function [Ville]
-Renamed i965_get_color_config() to i965_read_lut() [Ville]
-Renamed i965_get_gamma_config_10p6() to i965_read_gamma_lut_10p6()
[Ville]
v9: -Typo and 80 character limit [Uma]
-Made read func para as const [Ville, Uma]
-Renamed i965_read_gamma_lut_10p6() to i965_read_lut_10p6() [Ville, Uma]
v10: -Swapped ldw and udw while creating hw blob [Jani]
-Added last index rgb lut value from PIPEGCMAX to h/w blob [Jani]
Signed-off-by: Swati Sharma <swati2.sharma@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1568030503-26747-3-git-send-email-swati2.sharma@intel.com
intel_color_get_gamma_bit_precision() is extended for
cherryview by adding chv_gamma_precision(), i965 will use existing
i9xx_gamma_precision() func only.
Signed-off-by: Swati Sharma <swati2.sharma@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1568030503-26747-2-git-send-email-swati2.sharma@intel.com
During reset, we try to ensure no forward progress of the CS prior to
the reset by setting the STOP_RING bit in RING_MI_MODE. Since gen9, this
register is context saved and do we end up in the odd situation where we
save the STOP_RING bit and so try to stop the engine again immediately
upon resume. This is quite unexpected and causes us to complain about an
early CS completion event!
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111514
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190910080208.4223-1-chris@chris-wilson.co.uk
It might prove useful in the future to know if the vma is utilising
huge-GTT-pages. Related to this is the GTT cache, where there is some HW
"quirkiness" where it must be disabled if using 2M pages, so include
that for good measure.
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190909171646.22090-1-matthew.auld@intel.com
Try to tidy up the cache-coloring such that we rid the code of any
mm.color_adjust assumptions, this should hopefully make it more obvious
in the code when we need to actually use the cache-level as the color,
and as a bonus should make adding a different color-scheme simpler.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190909124052.22900-3-matthew.auld@intel.com
Make it clear that the color adjust callback applies to the ggtt.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190909124052.22900-2-matthew.auld@intel.com
Export color_differs so that we can use it elsewhere.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190909124052.22900-1-matthew.auld@intel.com
As we may unwind incomplete requests (for preemption) prior to
processing the CSB and the schedule-out events, we may update rq->engine
(resetting it to point back to the parent virtual engine) prior to
calling execlists_schedule_out(), invalidating the assertion that the
request still points to the inflight engine. (The likelihood of this is
increased if the CSB interrupt processing is pushed to the ksoftirqd for
being too slow and direct submission overtakes it.)
Tvrtko summarised it as:
"So unwind from direct submission resets rq->engine and races with
process_csb from the tasklet which notices request has actually
completed."
Reported-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Fixes: df40306902 ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190907105046.19934-1-chris@chris-wilson.co.uk
We are meant to register the kmem cache at init, such the supplied exit
and shrink hooks can be called.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190905072921.7979-1-matthew.auld@intel.com
Refactor the GT power management interface to work through the GT now
that it is under the control of gt/
Based on a patch by Chris Wilson.
Signed-off-by: Andi Shyti <andi.shyti@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190905111403.10071-1-andi.shyti@intel.com
Gen12 has subtle changes in the reg state context offsets (some fields
are gone, some are in a different location), compared to previous Gens.
The simplest approach seems to be keeping Gen12 (and future platform)
changes apart from the previous gens, while keeping the registers that
are contiguous in functions we can reuse.
v2: alias, virtual engine, rpcs, prune unused regs
v3: use engine base (Daniele), take ctx_bb for all
Bspec: 46255
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Tested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
[ickle: Tweaked the GEM_WARN_ON after settling on a compromise with
Daniele]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190906122314.2146-2-mika.kuoppala@linux.intel.com
Daniele pointed out that relative mmio works differently in
on context restore. Instead of adding the engine mmio base to offset,
it masks out the base and adds bits [12:2] to current engine base.
This should allow us to construct context register state to be
applicable to all instances, including virtual. And avoid the trouble
of updating the registers on virtual instances when submitting work.
v2: only enable for gen12 for now (Mika)
v3: make enabling readable (Chris)
Bspec: 20206
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190906134957.25909-1-mika.kuoppala@linux.intel.com
Unlike gen11, which always ran at 50MHz when the cdclk PLL was disabled,
TGL runs at refclk/2. The 50MHz croclk/2 is only used by hardware
during some power state transitions.
Bspec: 49201
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190905181337.23727-1-matthew.d.roper@intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
If we make sure we grab a strong reference to each object as we dump it,
we can reduce the locks outside of our iterators to an rcu_read_lock.
This should prevent errors like:
[ 2138.371911] BUG: KASAN: use-after-free in per_file_stats+0x43/0x380 [i915]
[ 2138.371924] Read of size 8 at addr ffff888223651000 by task cat/8293
[ 2138.371947] CPU: 0 PID: 8293 Comm: cat Not tainted 5.3.0-rc6-CI-Custom_4352+ #1
[ 2138.371953] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./J4205-ITX, BIOS P1.40 07/14/2017
[ 2138.371959] Call Trace:
[ 2138.371974] dump_stack+0x7c/0xbb
[ 2138.372099] ? per_file_stats+0x43/0x380 [i915]
[ 2138.372108] print_address_description+0x73/0x3a0
[ 2138.372231] ? per_file_stats+0x43/0x380 [i915]
[ 2138.372352] ? per_file_stats+0x43/0x380 [i915]
[ 2138.372362] __kasan_report+0x14e/0x192
[ 2138.372489] ? per_file_stats+0x43/0x380 [i915]
[ 2138.372502] kasan_report+0xe/0x20
[ 2138.372625] per_file_stats+0x43/0x380 [i915]
[ 2138.372751] ? i915_panel_show+0x110/0x110 [i915]
[ 2138.372761] idr_for_each+0xa7/0x160
[ 2138.372773] ? idr_get_next_ul+0x110/0x110
[ 2138.372782] ? do_raw_spin_lock+0x10a/0x1d0
[ 2138.372923] print_context_stats+0x264/0x510 [i915]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: David Weinehall <david.weinehall@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190903062133.27360-1-chris@chris-wilson.co.uk
Tiger Lake has up to 4 pipes so the mask would need to be 0xf instead of
0x7. Do not hardcode the mask so it allows the fake MST encoders to
connect to all pipes no matter how many the platform has.
Iterating over all pipes to keep consistent with intel_ddi_init().
Initialy this patch was replaced by commit 4eaceea3a0 ("drm/i915:
Fix DP-MST crtc_mask") but userspace it not correctly using
encoder.possible_crtcs and it was reverted by
commit e838bfa8e1 ("Revert "drm/i915: Fix DP-MST crtc_mask"")
Userspace should be fixed but it might take a while, so bringing this
patch back for now.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190904230241.20638-2-jose.souza@intel.com
WA 1409120013 is also valid for TGL, so lets check for ">= 11".
BSpec: 52890
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Clinton Taylor <Clinton.A.Taylor@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190904230241.20638-1-jose.souza@intel.com
Add case for gen == 12 and add MISSING_CASE() for future gens. We were
already handling gen12 as the default, so this doesn't change the
current behavior.
BSpec: 19481 and 44980
Cc: CQ Tang <cq.tang@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190904213419.27547-7-jose.souza@intel.com
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
SAGV is not currently working for Tiger Lake. We better disable it until
the implementation is stabilized and we can enable it.
HSDES: 1409542895 2208191909
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190904213419.27547-6-jose.souza@intel.com
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Gen 12 onwards moves the DP_TP_* registers to be transcoder-based rather
than port-based. This adds the new register addresses and changes all
the callers to use the register saved in intel_dp->regs.*. This is
filled out when preparing to enable the port so we take into account if
we should use the transcoder or the port.
v2: reimplement by stashing the registers we want to access under
intel_dp->reg. Now they are initialized when enabling the port.
Ville suggested to store the transcoder to be used exclusively
by TGL+. After implementing I thought just storing the register directly
made it cleaner.
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190904213419.27547-5-jose.souza@intel.com
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
DP_TP_{CTL,STATUS} should only be programmed when the encoder is intel_dp.
Checking its current usages intel_disable_ddi_buf() is the only
offender, with other places being protected by checks like
pipe_config->fec_enable that is only set by intel_dp.
v3 (José):
- Using intel_crtc_has_dp_encoder() instead of intel_encoder_is_dp()
(Ville)
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190904213419.27547-4-jose.souza@intel.com
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
For older gens PSR IIR and IMR have fixed addresses. From TGL onwards those
registers moved to each transcoder offset. The bits for the registers
are defined without an offset per transcoder as right now we have one
register per transcoder. So add a fake "trans_shift" when calculating
the bits offsets: it will be 0 for gen12+ and psr.transcoder otherwise.
v2 (Lucas): change the implementation to use trans_shift instead of
getting each bit value with a different macro
Cc: Imre Deak <imre.deak@intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190904213419.27547-3-jose.souza@intel.com
It was enabling and checking PSR interruptions in every transcoder
while it should keep the interruptions on the non-used transcoders
masked.
While doing this it gives us trouble on Tiger Lake if we are
reading/writing to registers of disabled transcoders since from gen12
onwards the registers are relative to the transcoder. Instead of forcing
them ON to access those registers, just avoid the accesses as they are
not needed.
v2 (Lucas):
- Explain why we can't keep accessing all transcoders
- Remove TODO about extending the irq handling to multiple instances:
when/if implementing multiple instances it's pretty clear by the
singleton psr that it needs to be extended
- Fix intel_psr_debug_set() calling psr_irq_control() with
psr.transcoder not set yet (from Imre). Now we only set the debug
register right away if psr is already enabled. Otherwise we just
record the value to be set when enabling the source.
- Do not depend on the value of TRANSCODER_A. Just be relative to it
(from Imre)
- handle psr error last so we don't schedule the work before handling
the other flags
v3:
- Adding a warning about setting reserverd bits on EDP_PSR_IMR
Cc: Imre Deak <imre.deak@intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190904213419.27547-2-jose.souza@intel.com
The cpu (de)tiler hw is gone, this stopped being useful. Plus it never
supported any of the fancy new tiling formats, which means userspace
also stopped using the magic side-channel this provides.
This would totally break a lot of the igts, but they're already broken
for the same reasons as userspace on gen12 would be.
v2: Look at ggtt->num_fences instead, that also avoids the need for a
comment (Chris). This also means that gen12 support really needs to
make sure num_fences is set to 0. There is a patch for that, but it
checks for HAS_MAPPABLE_APERTURE, which I'm not sure is the right
thing really. Adding relevant people.
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190820195451.15671-1-daniel.vetter@ffwll.ch
This reverts commit 4eaceea3a0.
Several userspace clients (modesetting ddx and mutter+wayland at least)
handle encoder.possible_crtcs incorrectly. What they essentially do is
the following:
possible_crtcs = ~0;
for_each_possible_encoder(connector)
possible_crtcs &= encoder->possible_crtcs;
Ie. they calculate the intersection of the possible_crtcs
for the connector when they really should be calculating the
union instead.
In our case each MST encoder now has just one unique bit set,
and so the intersection is always zero. The end result is that
MST connectors can't be lit up because no crtc can be found to
drive them.
I've submitted a fix for the modesetting ddx [1], and complained
on #wayland about mutter, so hopefully the situation will improve
in the future. In the meantime we have regression, and so must go
back to the old way of misconfiguring possible_crtcs in the kernel.
[1] https://gitlab.freedesktop.org/xorg/xserver/merge_requests/277
Cc: Jonas Ådahl <jadahl@gmail.com>
Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111507
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190903154018.26357-1-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
This bit was fliped on for "syncing dependencies between camera and
graphics". BSpec has no recollection why, and it is causing
unrecoverable GPU hangs with Vulkan compute workloads.
From BSpec, setting bit5 to 0 enables relaxed padding requirements for
buffers, 1D and 2D non-array, non-MSAA, non-mip-mapped linear surfaces;
and *must* be set to 0h on skl+ to ensure "Out of Bounds" case is
suppressed.
Reported-by: Jason Ekstrand <jason@jlekstrand.net>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110998
Fixes: 8424171e13 ("drm/i915/gen9: h/w w/a: syncing dependencies between camera and graphics")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: denys.kostin@globallogic.com
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.1+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190904100707.7377-1-chris@chris-wilson.co.uk
For glk, add hw read out to create hw blob of gamma
lut values.
v4: -No need to initialize *blob [Jani]
-Removed right shifts [Jani]
-Dropped dev local var [Jani]
v5: -Returned blob instead of assigning it internally within the
function [Ville]
-Renamed glk_get_color_config() to glk_read_luts() [Ville]
-Added degamma validation [Ville]
v9: -80 character limit [Uma]
-Made read func para as const [Ville, Uma]
Signed-off-by: Swati Sharma <swati2.sharma@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1567538578-4489-8-git-send-email-swati2.sharma@intel.com
For ilk, add hw read out to create hw blob of gamma
lut values.
v4: -No need to initialize *blob [Jani]
-Removed right shifts [Jani]
-Dropped dev local var [Jani]
v5: -Returned blob instead of assigning it internally within the
function [Ville]
-Renamed ilk_get_color_config() to ilk_read_luts() [Ville]
v9: -80 character limit [Uma]
-Made read func para as const [Ville, Uma]
-Renamed ilk_read_gamma_lut() to ilk_read_lut_10() [Uma, Ville]
v10: -Made ilk_read_luts() static [Jani]
-ilk_load_lut_10 has lut_size, not (lut_size - 1) [Jani]
Signed-off-by: Swati Sharma <swati2.sharma@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1567538578-4489-7-git-send-email-swati2.sharma@intel.com
For the legacy(gen < 4) gamma, add hw read out to create hw blob of gamma
lut values. Also, add function intel_color_lut_pack to convert hw value
with given bit precision to lut property val.
v4: -No need to initialize *blob [Jani]
-Removed right shifts [Jani]
-Dropped dev local var [Jani]
v5: -Returned blob instead of assigning it internally within the
function [Ville]
-Renamed function i9xx_get_color_config() to i9xx_read_luts()
-Renamed i9xx_get_config_internal() to i9xx_read_lut_8() [Ville]
v9: -Change in commit message [Jani, Uma]
-Wrap commit within 75 characters [Uma]
-Use macro for 256 [Uma]
-Made read func para as const [Ville, Uma]
v10: -Made i9xx_read_luts() static [Jani]
Signed-off-by: Swati Sharma <swati2.sharma@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1567538578-4489-6-git-send-email-swati2.sharma@intel.com
Add macro to compare hw/sw gamma lut values. First need to
check whether hw/sw gamma mode matches or not. If not
no need to compare lut values, if matches then only compare
lut entries.
v5: -Called PIPE_CONF_CHECK_COLOR_LUT inside if (!adjust) [Jani]
-Added #undef PIPE_CONF_CHECK_COLOR_LUT [Jani]
v8: -Added check for gamma mode before gamma lut entry comparison
[Jani]
-Split patch 3 into 4 patches
Signed-off-by: Swati Sharma <swati2.sharma@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1567538578-4489-5-git-send-email-swati2.sharma@intel.com
Add func intel_color_lut_equal() to compare hw/sw gamma
lut values. Since hw/sw gamma lut sizes and lut entries comparison
will be different for different gamma modes, add gamma mode dependent
checks.
v3: -Rebase
v4: -Renamed intel_compare_color_lut() to intel_color_lut_equal() [Jani]
-Added the default label above the correct label [Jani]
-Corrected smatch warn "variable dereferenced before check"
[Dan Carpenter]
v5: -Added condition (!blob1 && !blob2) return true [Jani]
v6: -Made patch11 as patch3 [Jani]
v8: -Split patch 3 into 4 patches
-Optimized blob check condition [Ville]
v9: -Exclude spilt gamma mode (bdw and ivb platforms)
as there is exception in way gamma values are written in
hardware [Ville]
-Added exception made in commit [Uma]
-Dropped else, character limit and indentation [Uma]
-Added multi segmented gama mode for icl+ platforms [Uma]
v10: -Dropped multi segmented mode for icl+ platforms [Jani]
-Removed references of sw and hw state in compare code [Jani]
-Dropped inline from func [Jani]
Signed-off-by: Swati Sharma <swati2.sharma@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1567538578-4489-4-git-send-email-swati2.sharma@intel.com
Each platform supports different gamma modes and each gamma mode
has different bit precision. Here bit precision corresponds
to number of bits the hw LUT supports.
Add func per platform to return bit precision corresponding to gamma mode
which will be later used as a parameter in lut comparison function
intel_color_lut_equal().
This is done for legacy, ilk, glk and their variant platforms.
v6: -Added func intel_color_get_bit_precision() to get bit precision for
gamma and degamma lut readout depending upon platform and
corresponding to load_luts() [Ankit]
-Made patch11 as patch3 [Jani]
v7: -Renamed func intel_color_get_bit_precision() to
intel_color_get_gamma_bit_precision()
-Added separate function/platform for gamma bit precision [Ville]
-Corrected checkpatch warnings
v8: -Split patch 3 into 4 separate patches
v9: -Changed commit message, gave more info [Uma]
-Added precision func for icl+ platform
v10: -Removed precision func for chv and icl+ platforms [Jani]
-Added gamma_enable check once [Jani]
Signed-off-by: Swati Sharma <swati2.sharma@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1567538578-4489-2-git-send-email-swati2.sharma@intel.com
Add debug log for color related parameters like gamma_mode, gamma_enable,
csc_enable, etc inside intel_dump_pipe_config().
v6: -Added debug log for color para in intel_dump_pipe_config [Jani]
v7: -Split patch 3 into 4 patches
v8: -Corrected alignment [Uma]
Signed-off-by: Swati Sharma <swati2.sharma@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1567538578-4489-3-git-send-email-swati2.sharma@intel.com
It's been a long time since we accidentally reported -EIO upon wedging,
it can now only be generated by failure to swap in a page.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190902040303.14195-4-chris@chris-wilson.co.uk
obj->pin_global was originally used as a means to keep the shrinker off
the active scanout, but we use the vma->pin_count itself for that and
the obj->frontbuffer to delay shrinking active framebuffers. The other
role that obj->pin_global gained was for spotting display objects inside
GEM and working harder to keep those coherent; for which we can again
simply inspect obj->frontbuffer directly.
Coming up next, we will want to manipulate the pin_global counter
outside of the principle locks, so would need to make pin_global atomic.
However, since obj->frontbuffer is already managed atomically, it makes
sense to use that the primary key for display objects instead of having
pin_global.
Ville pointed out the principle difference is that obj->frontbuffer is
set for as long as an intel_framebuffer is attached to an object, but
obj->pin_global was only raised for as long as the object was active. In
practice, this means that we consider the object as being on the scanout
for longer than is strictly required, causing us to be more proactive in
flushing -- though it should be true that we would have flushed
eventually when the back became the front, except that on the flip path
that flush is async but when hit from another ioctl it will be
synchronous.
v2: i915_gem_object_is_framebuffer()
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190902040303.14195-5-chris@chris-wilson.co.uk
The aliasing-ppgtt is not allowed to be smaller than the ggtt, nor
should we advertise it as being any bigger, or else we may get sued for
false advertisement.
Testcase: igt/gem_exec_big
Fixes: 0b718ba1e8 ("drm/i915/gtt: Downgrade Cherryview back to aliasing-ppgtt")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190902040303.14195-1-chris@chris-wilson.co.uk
Reogranize the HDMI deep color state computation to just
loop over possible bpc values. Avoids having to maintain
so many variants of the clock etc.
The current code also looks confused w.r.t. port_clock vs.
bw_constrained. It would happily update port_clock for
deep color but then not actually enable deep color due to
bw_constrained being set. The new logic handles that case
correctly.
v2: Pull stuff into separate funcs (Jani)
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190828183424.7856-1-ville.syrjala@linux.intel.com
enum port is a mess now because it no longer matches the spec
at all. Let's start to dig ourselves out of this hole by
reducing our reliance on port_name(). This should at least make
a bunch of debug messages a bit more sensible while we think how
to fill the the hole properly.
Based on the following cocci script with a lot of manual cleanup
(all the format strings etc.):
@@
expression E;
@@
(
- port_name(E->port)
+ E->base.base.id, E->base.name
|
- port_name(E.port)
+ E.base.base.id, E.base.name
)
@@
enum port P;
expression E;
@@
P = E->port
<...
- port_name(P)
+ E->base.base.id, E->base.name
...>
@@
enum port P;
expression E;
@@
P = E.port
<...
- port_name(P)
+ E.base.base.id, E.base.name
...>
@@
expression E;
@@
{
- enum port P = E;
... when != P
}
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190830182719.32608-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
My attempt at allowing MST to use the higher color depths has
regressed some configurations. Apparently people have setups
where all MST streams will fit into the DP link with 8bpc but
won't fit with higher color depths.
What we really should be doing is reducing the bpc for all the
streams on the same link until they start to fit. But that requires
a bit more work, so in the meantime let's revert back closer to
the old behavior and limit MST to at most 8bpc.
Cc: stable@vger.kernel.org
Cc: Lyude Paul <lyude@redhat.com>
Tested-by: Geoffrey Bennett <gmux22@gmail.com>
Fixes: f147721986 ("drm/i915: Remove the 8bpc shackles from DP MST")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111505
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190828102059.2512-1-ville.syrjala@linux.intel.com
Reviewed-by: Lyude Paul <lyude@redhat.com>
We use the context->pin_mutex to serialise updates to the OA config and
the registers values written into each new context. Document this
relationship and assert we do hold the context->pin_mutex as used by
gen8_configure_all_contexts() to serialise updates to the OA config
itself.
v2: Add a white-lie for when we call intel_gt_resume() from init.
v3: Lie while we have the context pinned inside atomic reset.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> #v1
Link: https://patchwork.freedesktop.org/patch/msgid/20190830181929.18663-1-chris@chris-wilson.co.uk
The bspec was recently updated with these new cdclk values for ICL, EHL,
and TGL.
Bspec: 20598
Bspec: 49201
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190826225540.11987-3-matthew.d.roper@intel.com
The bspec has just recently been updated with new cdclk values that
require the use of a /2 CD2X divider rather than a /1 divider. Once we
add the divider selection logic to ICL+ cdclk programming, we have
pretty much the same logic we were already using on CNL, so it's simpler
to drop icl_set_cdclk() completely and reuse cnl_set_cdclk() on gen11+
platforms as well.
v2:
- Using ICL_CDCLK_CD2X_PIPE_NONE + BXT_CDCLK_CD2X_PIPE(pipe) for TGL is
correct, but looks really confusing. Add some TGL_ macros that alias
these to avoid confusion. (Ville)
- Use DIV_ROUND_CLOSEST rather than / when applying the divider. (Ville)
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190830004828.19359-1-matthew.d.roper@intel.com
When we moved the code to disable crtc's to a separate patch,
we forgot to ensure that for_each_oldnew_intel_crtc_in_state_reverse()
was moved as well.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Fixes: 66d9cec8a6 ("drm/i915/display: Move the commit_tail() disable sequence to separate function")
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190830101644.8740-1-maarten.lankhorst@linux.intel.com
This is no longer used anywhere and so can be removed. However, tracking
the dirty status on the ppgtt doesn't work very well if the ppgtt is
shared, so perhaps for the best that it is no longer required.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190830180000.24608-3-chris@chris-wilson.co.uk
With the upcoming change in timing (dramatically reducing the latency
between manipulating the ppGTT and execution), no amount of tweaking
could save Cherryview, it would always fail to invalidate its TLB.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190830180000.24608-2-chris@chris-wilson.co.uk
With the upcoming change in timing (dramatically reducing the latency
between manipulating the ppGTT and execution), no amount of tweaking
could save Baytrail, it would always fail to invalidate its TLB. Ville
was right, Baytrail is beyond hope.
v2: Rollback on all gen7; same timing instability on TLB invalidation.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190830180000.24608-1-chris@chris-wilson.co.uk
Use a single function to setup the SDE irq and make MCC, ICP and TGP use
it, just like was done for the irq handler.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190829211526.30525-4-jose.souza@intel.com
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Ice Lake, Tiger Lake and Elkhart Lake all have different port
configurations and all of them can be parameterized the same way to form
the SDE hotplug bitmask. Avoid making them a special case an just use
the parameterized macros.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190829211526.30525-3-jose.souza@intel.com
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
The differences are only on the pins, trigger and long_detect function.
The MCC handling is already partially merged, so merge TGP as well.
Remove the pins argument from icp_irq_handler() so we have all the
differences between the 3 set in a common if ladder.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190829211526.30525-2-jose.souza@intel.com
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
South, follow the north.
Instead of defining separate macros for each port, make them take port
as parameter as done for TC ports and for north engine. This will allow
us to easily extend this as needed.
tgp_ddi_port_hotplug_long_detect() is also removed as after the EHL
introduction the tgp variant is an exact copy of icp.
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190829211526.30525-1-jose.souza@intel.com
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
>From Gen12 onwards, HDCP HW block is implemented within transcoders.
Till Gen11 HDCP HW block was part of DDI.
Hence required changes in HW programming is handled here.
As ME FW needs the transcoder detail on which HDCP is enabled
on Gen12+ platform, we are populating the detail in hdcp_port_data.
v2:
_MMIO_TRANS is used [Lucas and Daniel]
platform check is moved into the caller [Lucas]
v3:
platform check is moved into a macro [Shashank]
v4:
Few optimizations in the coding [Shashank]
v5:
Fixed alignment in macro definition in i915_reg.h [Shashank]
unused variables "reg" is removed.
v6:
Configuring the transcoder at compute_config.
transcoder is used instead of pipe in macros.
Rebased.
v7:
transcoder is cached at intel_hdcp
hdcp_port_data is configured with transcoder index asper ME FW.
v8:
s/trans/cpu_transcoder
s/tc/cpu_transcoder
v9:
rep_ctl is prepared for TCD too.
return moved into deault of rep_ctl prepare function [Shashank]
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Shashank Sharma <shashank.sharma@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190828164216.405-7-ramalingam.c@intel.com
On gen12+ platforms, HDCP HW is associated to the transcoder.
Hence on every modeset update associated transcoder into the
intel_hdcp of the port.
v2:
s/trans/cpu_transcoder [Jani]
v3:
comment is added for fw_ddi init for gen12+ [Shashank]
only hdcp capable transcoder is translated into fw_tc [Shashank]
v4:
fw_tc initialization is kept for modeset. [Tomas]
few extra doc is added at port_data init [Tomas]
v5:
Few comments are improvised [Tomas]
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190828164216.405-6-ramalingam.c@intel.com
We dont need the definition of the enum port outside I915, anymore.
Hence move enum port definition into I915 driver itself.
v2:
intel_display.h is included in intel_hdcp.h
v3:
enum port is declared in headers.
v4:
commit msg is rephrased.
v5:
copyright year is updated [Tomas]
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Shashank Sharma <shashank.sharma@intel.com>
Reviewed-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190828164216.405-3-ramalingam.c@intel.com
I915 converts it's port value into ddi index defiend by ME FW
and pass it as a member of hdcp_port_data structure.
Hence expose the enum mei_fw_ddi to I915 through
i915_mei_interface.h.
v2:
Copyright years are bumped [Tomas]
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Shashank Sharma <shashank.sharma@intel.com>
Acked-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190828164216.405-2-ramalingam.c@intel.com
The addition of the DC_FLUSH failed to ensure sanctity of the post-sync
write as CI immediately got a completion CS-event before the breadcrumb
was coherent. So let's try the other idea of moving the post-sync write
into the CS_STALL.
References: https://bugs.freedesktop.org/show_bug.cgi?id=111514
References: e8f6b4952e ("drm/i915/execlists: Flush the post-sync breadcrumb write harder")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190829081150.10271-2-chris@chris-wilson.co.uk
Create a new function intel_commit_modeset_disables() consistent
with the naming in drm atomic helpers and similar to the enable function.
This helps better organize the disable sequence in atomic_commit_tail()
No functional change
v4:
* Do not create a function pointer, just a function (Maarten)
v3:
* Rebase (Manasi)
v2:
* Create a helper for old_crtc_state disables (Lucas)
Suggested-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190828224701.422-1-manasi.d.navare@intel.com
This patch has no functional changes. This just renames the update_crtcs()
hooks to commit_modeset_enables() to match the drm_atomic helper naming
conventions.
v2:
* Rebase on drm-tip
Suggested-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190827221735.29351-2-manasi.d.navare@intel.com
The sg_table for our backing store might contain addresses from
stolen-memory or in the future local-memory, at which point this is no
longer a dma-iterator. As a consequence we should now break on NULL
iter.sgp, instead of dmap == 0 which is considered an invalid dma
address.
As a bonus, gcc much prefers this construct,
Function old new delta
gen8_ggtt_insert_entries 211 192 -19
gen6_ggtt_insert_entries 292 262 -30
i915_error_object_create 996 954 -42
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190829201919.21493-1-matthew.auld@intel.com
During normal driver unload we attempt to disable GuC communication
while it is currently stopped. This results in a nop'd call to
intel_guc_ct_disable within guc_disable_communication because
stop/disable rely on the same flag to prevent further comms with CT.
We can avoid the call to disable and still leave communication in a
satisfactory state by extracting a set of shared steps from stop/disable.
This set can include guc_disable_interrupts as we do not require the
single caller of guc_stop_communication to be atomic:
"drm/i915/selftests: Fixup atomic reset checking".
This situation (stop -> disable) only occurs during intel_uc_fini_hw,
so during fini, call guc_disable_communication only if currently enabled.
The symmetric calls to enable/disable remain unmodified for all other
scenarios.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110943
Signed-off-by: Fernando Pacheco <fernando.pacheco@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190829174154.14675-1-fernando.pacheco@intel.com
Let the scheduler have a breather in between passes of the longer buddy
tests. Important if we are running under kasan etc and this takes far
longer than usual!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190829170848.969-1-chris@chris-wilson.co.uk
According to BSpc if link standby is set on TGL+, PSR will not be
enabled. Vendors should not use panels that requires link standby and
even if they do, panel should assert a PSR error that will cause PSR to
be disabled.
BSpec: 50434
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823082055.5992-8-lucas.demarchi@intel.com
Yf tiling was removed in gen-12, so do not expose Yf modifiers to user
space. Gen-12 display also is incompatible with pre-gen12 Y-tiled
CCS, so do not expose I915_FORMAT_MOD_Y_TILED_CCS.
v2: Rebase to carry forward recently added gen11 formats.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190827084516.6748-1-dhinakaran.pandiyan@intel.com
DSC was not supported on Pipe A for previous platforms. Tigerlake onwards,
all the pipes support DSC. Hence, the DSC and FEC restriction on
Pipe A needs to be removed.
v2: Changes in the logic around removing the restriction around
Pipe A (Manasi, Lucas)
Cc: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Madhumitha Tolakanahalli Pradeep <madhumitha.tolakanahalli.pradeep@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823004655.28905-1-madhumitha.tolakanahalli.pradeep@intel.com
Trust our own workers to not cause unnecessary delays and disable the
automatic timeout on their asynchronous fence waits. (Along the same
lines that we trust our own requests to complete eventually, if
necessary by force.)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190826072149.9447-6-chris@chris-wilson.co.uk
There is a difference in BSpec's and the driver's designation of DDI
ports. BSpec uses the following names:
- before GEN11:
BSpec/driver:
port A/B/C/D etc
- GEN11:
BSpec/driver:
port A-F
- GEN12:
BSpec:
port A/B/C for combo PHY ports
port TC1-6 for Type C PHY ports
driver:
port A-I.
The driver's port D name matches BSpec's TC1 port name.
So far power domains were named according to the BSpec designation, to
make it easier to match the code against the specification. That however
can be confusing when a power domain needs to be matched to a port on
GEN12+. To resolve that use the driver's port A-I designation for power
domain names too and rename the corresponding power wells so that they
reflect the mapping from the driver's to BSpec's port name.
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823100711.27833-1-imre.deak@intel.com
We've been ignoring similar coherency issues in IGT for Broadwater, and
specifically Broadwater (original gen4) and not, for example, Crestline
(same generation as Broadwater, but the mobile variant). Without any
means to reproduce locally (I have a 965GM but alas no 965G), fixing will
be slow, so tell CI to ignore any failure until we are ready with a fix.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190826133837.6784-1-chris@chris-wilson.co.uk
Our current avoidance of non readable mcr range was not
inclusive enough. Extend the start and end.
References: HSDES#1405586840
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190809145653.2279-1-mika.kuoppala@linux.intel.com
Quite rarely we see that the CS completion event fires before the
breadcrumb is coherent, which presumably is a result of the CS_STALL not
waiting for the post-sync operation. Try throwing in a DC_FLUSH into
the following pipecontrol to see if that makes any difference.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190827120615.31390-1-chris@chris-wilson.co.uk
igt_ctx_exec allocates a new context for each iteration, keeping them
all allocated until the end. Instead, release the local ctx reference at
the end of each iteration, allowing ourselves to reap those if under
mempressure.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190827161726.3640-2-chris@chris-wilson.co.uk
Upon object creation for live_gem_contexts, we fill the object with
known scratch and flush it out of the CPU cache. Before performing the
GPU fill, we don't need to flush it again and so avoid serialising with
previous fills.
However, we do need some throttling on the internal interfaces if we do
not want to run out of memory!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190827161726.3640-1-chris@chris-wilson.co.uk
The CS pre-parser can pre-fetch commands across memory sync points and
starting from gen12 it is able to pre-fetch across BB_START and BB_END
boundaries as well, so when we emit gpu relocs the pre-parser might
fetch the target location of the reloc before the memory write lands.
The parser can't pre-fetch across the ctx switch, so we use a separate
context to guarantee that the memory is synchronized before the parser
can get to it.
Note that there is no risk of the CS doing a lite restore from the reloc
context to the user context, even if the two have the same hw_id,
because since gen11 the CS also checks the LRCA when deciding if it can
lite-restore.
v2: limit new context to gen12+, release in eb_destroy, add a comment
in emit_fini_breadcrumb (Chris).
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190827185805.21799-1-daniele.ceraolospurio@intel.com
Compared to Icelake, Tigerlake's MAX_CONTEXT_HW_ID is smaller by one, but
since we just use the upper 32 bits of the lrc_desc, it's guaranteed OA
will use the correct one.
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823082055.5992-19-lucas.demarchi@intel.com
On TGL some registers moved from DDI to transcoder and the
DisplayPort training sequence has a separate BSpec page.
I started adding 'ifs' to the original intel_ddi_pre_enable_dp() but
it was becoming really hard to follow, so a new and cleaner function
for TGL was added with comments of all steps. It's similar to ICL,
but different enough to deserve a new function.
The rest of DisplayPort enable and the whole disable sequences
remained the same.
v2: FEC and DSC should be enabled on sink side before start link
training(Maarten reported and Manasi confirmed the DSC part)
v3: Add call to enable FEC on step 7.l(Manasi)
BSpec: 49190
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823082055.5992-16-lucas.demarchi@intel.com
Disable CRTC/pipes in reverse order because some features (MST in
TGL+) requires master and slave relationship between pipes, so it
should always pick the lowest pipe as master as it will be enabled
first and disable in the reverse order so the master will be the last
one to be disabled.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823082055.5992-13-lucas.demarchi@intel.com
Same as for_each_oldnew_intel_crtc_in_state() but iterates in reverse
order.
v2: Fix additional blank line
v3: Rebase
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823082055.5992-12-lucas.demarchi@intel.com
TGL PSR2 HW supports a bigger resolution, so lets add it
BSpec: 50422, 49199
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823082055.5992-10-lucas.demarchi@intel.com
On TGL+ it's possible to have PSR1 enabled in other ports besides DDIA.
PSR2 is still limited to DDIA. However currently we handle only one
instance of PSR struct. Lets guard intel_psr_init_dpcd() against
multiple eDP panels and warn about it.
v2: Reword commit message to be TGL+ only and with the info where
PSR1/PSR2 are supported (Lucas)
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823082055.5992-6-lucas.demarchi@intel.com
The point of debug_object_activate is to mark the first, and only the
first, acquisition. The object then remains active until the last
release. However, we marked up all successful first acquires even though
we allowed concurrent parties to try and acquire the i915_active
simultaneously (serialised by the i915_active.mutex).
Testcase: igt/gem_mmap_gtt/fault-concurrent
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190827132631.18627-1-chris@chris-wilson.co.uk
If we create a new live_context() we should have a mapping for each
engine. Document that assumption with an assertion.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190827094933.13778-1-chris@chris-wilson.co.uk
The intention is that we first try to pin the current vma into the
mappable aperture only if it is already in use or it fits in the free
space and will not cause contention. The first attempt was meant to be
using PIN_NOEVICT to reuse the current vma if possible, following up
with different eviction strategies.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111485
Fixes: 6846895fde ("drm/i915: Replace PIN_NONFAULT with calls to PIN_NOEVICT")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190826130750.17272-1-chris@chris-wilson.co.uk
To properly handle asynchronous migration of batch objects, we need to
couple the fences on the incoming batch into the request and should not
assume that they always start idle.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190826072149.9447-1-chris@chris-wilson.co.uk
Currently, we don't call dma_set_max_seg_size() for i915 because we
intentionally do not limit the segment length that the device supports.
However, this results in a warning being emitted if we try to map
anything larger than SZ_64K on a kernel with CONFIG_DMA_API_DEBUG_SG
enabled:
[ 7.751926] DMA-API: i915 0000:00:02.0: mapping sg segment longer
than device claims to support [len=98304] [max=65536]
[ 7.751934] WARNING: CPU: 5 PID: 474 at kernel/dma/debug.c:1220
debug_dma_map_sg+0x20f/0x340
This was originally brought up on
https://bugs.freedesktop.org/show_bug.cgi?id=108517 , and the consensus
there was it wasn't really useful to set a limit (and that dma-debug
isn't really all that useful for i915 in the first place). Unfortunately
though, CONFIG_DMA_API_DEBUG_SG is enabled in the debug configs for
various distro kernels. Since a WARN_ON() will disable automatic problem
reporting (and cause any CI with said option enabled to start
complaining), we really should just fix the problem.
Note that as me and Chris Wilson discussed, the other solution for this
would be to make DMA-API not make such assumptions when a driver hasn't
explicitly set a maximum segment size. But, taking a look at the commit
which originally introduced this behavior, commit 78c47830a5
("dma-debug: check scatterlist segments"), there is an explicit mention
of this assumption and how it applies to devices with no segment size:
Conversely, devices which are less limited than the rather
conservative defaults, or indeed have no limitations at all
(e.g. GPUs with their own internal MMU), should be encouraged to
set appropriate dma_parms, as they may get more efficient DMA
mapping performance out of it.
So unless there's any concerns (I'm open to discussion!), let's just
follow suite and call dma_set_max_seg_size() with UINT_MAX as our limit
to silence any warnings.
Changes since v3:
* Drop patch for enabling CONFIG_DMA_API_DEBUG_SG in CI. It looks like
just turning it on causes the kernel to spit out bogus WARN_ONs()
during some igt tests which would otherwise require teaching igt to
disable the various DMA-API debugging options causing this. This is
too much work to be worth it, since DMA-API debugging is useless for
us. So, we'll just settle with this single patch to squelch WARN_ONs()
during driver load for users that have CONFIG_DMA_API_DEBUG_SG turned
on for some reason.
* Move dma_set_max_seg_size() call into i915_driver_hw_probe() - Chris
Wilson
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v4.18+
Link: https://patchwork.freedesktop.org/patch/msgid/20190823205251.14298-1-lyude@redhat.com
vgpu ppgtt notification was split into 2 steps, the first step is to
update PVINFO's pdp register and then write PVINFO's g2v_notify register
with action code to tirgger ppgtt notification to GVT side.
currently these steps were not atomic operations due to no any protection,
so it is easy to enter race condition state during the MTBF, stress and
IGT test to cause GPU hang.
the solution is to add a lock to make vgpu ppgtt notication as atomic
operation.
Cc: stable@vger.kernel.org
Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/1566543451-13955-1-git-send-email-xiaolin.zhang@intel.com
Avoid having to pass around (ctx, engine) everywhere by passing the
actual intel_context we intend to use. Today we preach this lesson to
igt_gpu_fill_dw and its callers' callers.
The immediate benefit for the GEM selftests is that we aim to use the
GEM context as the control, the source of the engines on which to test
the GEM context.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823235141.31799-1-chris@chris-wilson.co.uk
Our fence management is lazy, very lazy. If the user marks an object as
untiled, we do not immediately flush the fence but merely mark it as
dirty. On the next use we have to remember to check and remove the fence,
by which time we hope it is idle and we do not have to wait.
v2: Throw away the old fence on the next ggtt_pin.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111468
Fixes: 1f7fd484ff ("drm/i915: Replace i915_vma_put_fence()")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823153944.20630-1-chris@chris-wilson.co.uk
In order for the Braswell top-level PD to remain the same from the time
of request construction to its submission onto HW, as we may be
asynchronously rewriting the page tables (thus changing the expected
register state after having already stored the old addresses in the
request), the top level PD must be preallocated.
So wave goodbye to our lazy allocation of those 4x2 pages.
v2: A little bit of write-flushing required (presumably it always has
been required, but now we are more susceptible and it is showing up!)
v3: Put back the forced-PD-reload on every batch, we can't survive
without it and explicitly marking the context for PD reload makes
Braswell turn nasty.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823141421.2398-1-chris@chris-wilson.co.uk
Sadly lockdep records when the irqs are re-enabled and then marks up the
fake lock as being irq-unsafe. Our hand is forced and so we must mark up
the entire fake lock critical section as irq-off.
Hopefully this is the last tweak required!
v2: Not quite, we need to mark the timeline spinlock as irqsafe. That
was a genuine bug being hidden by the earlier lockdep splat.
Fixes: d67739268c ("drm/i915/gt: Mark up the nested engine-pm timeline lock as irqsafe")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823132700.25286-2-chris@chris-wilson.co.uk
Use hweight8() instead of hweight32() for 8bit masks. Doesn't actually
matter for us since the arch code will go for hweight32() anyway, but
maybe we stil want to do this for documentation purposes?
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190821173033.24123-5-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
We may need to eliminate the crtc->index == pipe assumptions from
the code to support arbitrary pipes being fused off. Start that by
switching some bitmasks over to using pipe instead of the crtc index.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190821173033.24123-1-ville.syrjala@linux.intel.com
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Currently, the subslice_mask runtime parameter is stored as an
array of subslices per slice. Expand the subslice mask array to
better match what is presented to userspace through the
I915_QUERY_TOPOLOGY_INFO ioctl. The index into this array is
then calculated:
slice * subslice stride + subslice index / 8
v2: Fix 32-bit build
v3: Use new helper function in SSEU workaround warning message
v4: Use GEM_BUG_ON to force developers to use valid SSEU configurations
per platform (Chris)
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-12-stuart.summers@intel.com
Add a new function to copy subslices for a specified slice
between intel_sseu structures for the purpose of determining
power-gate status. Note that currently ss_stride has a max
of 1.
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-11-stuart.summers@intel.com
Add a subslice stride calculation when setting subslices. This
aligns more closely with the userspace expectation of the subslice
mask structure.
v2: Use local variable for subslice_mask on HSW and
clean up a few other subslice_mask local variable
changes
v3: Add GEM_BUG_ON for ss_stride to prevent array overflow (Chris)
Split main set function and refactors in intel_device_info.c
into separate patches (Chris)
v4: Reduce ss_stride size check when setting subslices per slice
based on actual expected max stride (Chris)
Move that GEM_BUG_ON check for the ss_stride out to the patch
which adds the ss_stride
v5: Use memcpy instead of looping through each stride index
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-8-stuart.summers@intel.com
When setting up subslice_mask, instead of operating on the slice
array directly, use a local variable to start bits per slice, then
use this to set the per slice array in one step.
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-6-stuart.summers@intel.com
Add a new SSEU runtime parameter, eu_stride, which is
used to mirror the userspace concept of a range of EUs
per subslice.
This patch simply adds the parameter and updates usage
in the QUERY_TOPOLOGY_INFO handler.
v2: Add GEM_BUG_ON to make sure eu_stride is valid
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-5-stuart.summers@intel.com
Add a new parameter, ss_stride, to the runtime info
structure. This is used to mirror the userspace concept
of subslice stride, which is a range of subslices per slice.
This patch simply adds the definition and updates usage
in the QUERY_TOPOLOGY_INFO handler.
v2: Add GEM_BUG_ON to make sure ss_stride is valid
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-4-stuart.summers@intel.com
Add a new function to allow each platform to set maximum
slice, subslice, and EU information to reduce code duplication.
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-3-stuart.summers@intel.com
Use a local variable to find SSEU runtime information
in various debugfs functions.
v2: Remove extra line breaks per feedback from Chris
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-2-stuart.summers@intel.com
HCP/MFX power gating is disabled by default, turn it on for the vd units
available. User space will also issue a MI_FORCE_WAKEUP properly to
wake up proper subwell.
During driver load, init_clock_gating happens after device_info_init_mmio
read the vdbox disable fuse register, so only present vd units will have
these enabled.
BSpec: 14214
HSDES: 1209977827
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Tony Ye <tony.ye@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823082055.5992-3-lucas.demarchi@intel.com