Until now the software command checker assumed that commands could
read or write at most a single register per packet. This is not
necessarily the case, MI_LOAD_REGISTER_IMM expects a variable-length
list of offset/value pairs and writes them in sequence. The previous
code would only check whether the first entry was valid, effectively
allowing userspace to write unrestricted registers of the MMIO space
by sending a multi-register write with a legal first register, with
potential security implications on Gen6 and 7 hardware.
Fix it by extending the drm_i915_cmd_descriptor table to represent
multi-register access and making validate_cmd() iterate for all
register offsets present in the command packet.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Until now the software command checker assumed that commands could
read or write at most a single register per packet. This is not
necessarily the case, MI_LOAD_REGISTER_IMM expects a variable-length
list of offset/value pairs and writes them in sequence. The previous
code would only check whether the first entry was valid, effectively
allowing userspace to write unrestricted registers of the MMIO space
by sending a multi-register write with a legal first register, with
potential security implications on Gen6 and 7 hardware.
Fix it by extending the drm_i915_cmd_descriptor table to represent
multi-register access and making validate_cmd() iterate for all
register offsets present in the command packet.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Zhigang Gong <zhigang.gong@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Now that we can subclass drm_atomic_state we can also use it to keep
track of all the pll settings. atomic_state is a better place to hold
all shared state than keeping pll->new_config everywhere.
Changes since v1:
- Assert connection_mutex is held.
Changes since v2:
- Fix swapped arguments to kzalloc for intel_atomic_state_alloc. (Jani Nikula)
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Now that the dpll updates are (mostly) atomic, the .off() code is a noop,
and intel_crtc_disable does mostly the same as intel_modeset_update_state.
Move all logic for connectors_active and setting dpms to that function.
Changes since v1:
- Move drm_atomic_helper_swap_state up.
Changes since v2:
- Split out intel_put_shared_dpll removal.
Changes since v3:
- Rebase on top of latest drm-intel.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
We need to tell BDW ULT and ULX apart.
v2: Rebased to the latest
v3: Rebased to the latest
v4: Fix for patch style problems
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Mika Kahola <mika.kahola@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Keep the cdclk maximum supported frequency around in dev_priv so that we
can verify certain things against it before actually changing the cdclk
frequency.
For now only VLV/CHV have support changing cdclk frequency, so other
plarforms get to assume cdclk is fixed.
v2: Rebased to the latest
v3: Rebased to the latest
v4: Fix for patch style problems
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Mika Kahola <mika.kahola@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
No functional changes.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
There are plenty of hotplug related fields in struct drm_i915_private
scattered all around. Group them under one hotplug struct. Clean up
naming while at it. No functional changes.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Export a new context parameter that can be set/queried through the
context_{get,set}param ioctls. This parameter is passed as a context
flag and decides whether or not a GPU address mapping is allowed to
be made at address zero. The default is to allow such mappings.
Signed-off-by: David Weinehall <david.weinehall@intel.com>
Acked-by: "Zou, Nanhai" <nanhai.zou@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Rename dpio_lock to sb_lock to inform the reader that its primary
purpose is to protect the sideband mailbox rather than some DPIO
state.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In commit 1854d5ca0d
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Apr 7 16:20:32 2015 +0100
drm/i915: Deminish contribution of wait-boosting from clients
we removed an atomic timer based check for allowing waitboosting and
moved it below the mutex taken during RPS. However, that mutex can be
held for long periods of time on Vallyview/Cherryview as communication
with the PCU is slow. As clients may frequently wait for results (e.g.
such as tranform feedback) we introduced contention between the client
and the RPS worker. We can take advantage of the RPS worker, by
switching the wait boost decision to use spin locks and defer the
actual reclocking to the worker.
Fixes a regression of up to 45% on Baytrail and Baswell!
v2 (Daniel):
- Use max_freq_softlimit instead of the not-yet-merged boost
frequency.
- Don't inject a fake irq into the boost work, instead treat
client_boost as just another legit waker.
v3: Drop the now unused mask (Chris).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90112
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
As Daniel commented on
commit b7ffe1362c5f468b853223acc9268804aa92afc8
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Mon Apr 27 13:41:24 2015 +0100
drm/i915: Free RPS boosts for all laggards
it is better to be explicit when sharing hardcoded values such as
throttle/boost timeouts. Make it so!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We need to re-init the display hardware when going out of suspend. This
includes:
- Hooking the PCH to the reset logic
- Restoring CDCDLK
- Enabling the DDB power
Among those, only the CDCDLK one is a bit tricky. There's some
complexity in that:
- DPLL0 (which is the source for CDCLK) has two VCOs, each with a set
of supported frequencies. As eDP also uses DPLL0 for its link rate,
once DPLL0 is on, we restrict the possible eDP link rates the chosen
VCO.
- CDCLK also limits the bandwidth available to push pixels.
So, as a first step, this commit restore what the BIOS set, until I can
do more testing.
In case that's of interest for the reviewer, I've unit tested the
function that derives the decimal frequency field:
#include <stdio.h>
#include <stdint.h>
#include <assert.h>
#define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x)))
static const struct dpll_freq {
unsigned int freq;
unsigned int decimal;
} freqs[] = {
{ .freq = 308570, .decimal = 0b01001100111},
{ .freq = 337500, .decimal = 0b01010100001},
{ .freq = 432000, .decimal = 0b01101011110},
{ .freq = 450000, .decimal = 0b01110000010},
{ .freq = 540000, .decimal = 0b10000110110},
{ .freq = 617140, .decimal = 0b10011010000},
{ .freq = 675000, .decimal = 0b10101000100},
};
static void intbits(unsigned int v)
{
int i;
for(i = 10; i >= 0; i--)
putchar('0' + ((v >> i) & 1));
}
static unsigned int freq_decimal(unsigned int freq /* in kHz */)
{
return (freq - 1000) / 500;
}
static void test_freq(const struct dpll_freq *entry)
{
unsigned int decimal = freq_decimal(entry->freq);
printf("freq: %d, expected: ", entry->freq);
intbits(entry->decimal);
printf(", got: ");
intbits(decimal);
putchar('\n');
assert(decimal == entry->decimal);
}
int main(int argc, char **argv)
{
int i;
for (i = 0; i < ARRAY_SIZE(freqs); i++)
test_freq(&freqs[i]);
return 0;
}
v2:
- Rebase on top of -nightly
- Use (freq - 1000) / 500 for the decimal frequency (Ville)
- Fix setting the enable bit of HSW_NDE_RSTWRN_OPT (Ville)
- Rename skl_display_{resume,suspend} to skl_{init,uninit}_cdclk to
be consistent with the BXT code (Ville)
- Store boot CDCLK in ddi_pll_init (Ville)
- Merge dev_priv's skl_boot_cdclk into cdclk_freq
- Use LCPLL_PLL_LOCK instead of (1 << 30) (Ville)
- Replace various '0' by SKL_DPLL0 to be a bit more explicit that
we're programming DPLL0
- Busy poll the PCU before doing the frequency change. It takes about
3/4 cycles, each separated by 10us, to get the ACK from the CPU
(Ville)
v3:
- Restore dev_priv->skl_boot_cdclk, leaving unification with
dev_priv->cdclk_freq for a later patch (Daniel, Ville)
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Now that we have internal clients, rather than faking a whole
drm_i915_file_private just for tracking RPS boosts, create a new struct
intel_rps_client and pass it along when waiting.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: s/rq/req/]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Since we will often pageflip to an active surface, we will often have to
wait for the surface to be written before issuing the flip. Also we are
likely to wait on that surface in plenty of time before the vblank.
Since we have a mechanism for boosting when a flip misses the expected
vblank, curtain the number of times we RPS boost when simply waiting for
mmioflip.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: s/rq/req/]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Ring switches can occur many times per frame, and are often out of
control, causing frequent RPS boosting for no practical benefit. Treat
the sw semaphore synchronisation as a separate client and only allow it
to boost once per busy/idle cycle.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: s/rq/req/]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Currently, we only track the last request globally across all engines.
This prevents us from issuing concurrent read requests on e.g. the RCS
and BCS engines (or more likely the render and media engines). Without
semaphores, we incur costly stalls as we synchronise between rings -
greatly impacting the current performance of Broadwell versus Haswell in
certain workloads (like video decode). With the introduction of
reference counted requests, it is much easier to track the last request
per ring, as well as the last global write request so that we can
optimise inter-engine read read requests (as well as better optimise
certain CPU waits).
v2: Fix inverted readonly condition for nonblocking waits.
v3: Handle non-continguous engine array after waits
v4: Rebase, tidy, rewrite ring list debugging
v5: Use obj->active as a bitfield, it looks cool
v6: Micro-optimise, mostly involving moving code around
v7: Fix retire-requests-upto for execlists (and multiple rq->ringbuf)
v8: Rebase
v9: Refactor i915_gem_object_sync() to allow the compiler to better
optimise it.
Benchmark: igt/gem_read_read_speed
hsw:gt3e (with semaphores):
Before: Time to read-read 1024k: 275.794µs
After: Time to read-read 1024k: 123.260µs
hsw:gt3e (w/o semaphores):
Before: Time to read-read 1024k: 230.433µs
After: Time to read-read 1024k: 124.593µs
bdw-u (w/o semaphores): Before After
Time to read-read 1x1: 26.274µs 10.350µs
Time to read-read 128x128: 40.097µs 21.366µs
Time to read-read 256x256: 77.087µs 42.608µs
Time to read-read 512x512: 281.999µs 181.155µs
Time to read-read 1024x1024: 1196.141µs 1118.223µs
Time to read-read 2048x2048: 5639.072µs 5225.837µs
Time to read-read 4096x4096: 22401.662µs 21137.067µs
Time to read-read 8192x8192: 89617.735µs 85637.681µs
Testcase: igt/gem_concurrent_blit (read-read and friends)
Cc: Lionel Landwerlin <lionel.g.landwerlin@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> [v8]
[danvet: s/\<rq\>/req/g]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
BUN 1: prop_coeff, int_coeff, tdctargetcnt programming updated and tied to
VCO frequencies. Program i_lockthresh in PORT_PLL_9.
VCO calculated based on the formula:
Desired Output = Port bit rate in MHz (DisplayPort HBR2 is 5400 MHz)
Fast Clock = Desired Output / 2
VCO = Fast Clock * P1 * P2
Prop_coeff, int_coeff, and tdctargetcnt modified according to above
calculation.
BUN 2: Port PLLs require additional programming at certain frequencies -
DCO amplitude in PORT_PLL_10
Review comments from Siva which were addressed in the initial version of the
patch.
- Change PORT_PLL_LOCK_THRESHOLD to PORT_PLL_LOCK_THRESHOLD_MASK
- Calculate for HDMI
- Correct values for vco = 5.4
- return in case of invalid vco range
v2: Imre's review comments addressed
- change dcoampovr_en to dcoampovr_en_h
- change PORT_PLL_DCO_AMP_OVR_EN to PORT_PLL_DCO_AMP_OVR_EN_H
- Correct lane stagger value for 324MHz
- Make coef common for HDMI and DP
- remove superfluous comments
v3: Imre's comments addressed
- Remove Prop_coeff, int_coeff, tdctargetcnt, dcoampovr_en, gain_ctl,
dcoampovr_en_h from bxt_clk_div and make them local variables.
Signed-off-by: Vandana Kannan <vandana.kannan@intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> [v1]
Cc: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Be in line with other features that we have.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
As we perform the mmio-flip without any locking and then try to acquire
the struct_mutex prior to dereferencing the request, it is possible for
userspace to queue a new pageflip before the worker can finish clearing
the old state - and then it will clear the new flip request. The result
is that the new flip could be completed before the GPU has finished
rendering.
The bugs stems from removing the seqno checking in
commit 536f5b5e86
Author: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Date: Thu Nov 6 11:03:40 2014 +0200
drm/i915: Make mmio flip wait for seqno in the work function
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We no longer interpolate domains in the same manner, and even if we did,
we should trust setting either of the other write domains would trigger
an invalidation rather than force it. Remove the tweaking of the
read_domains since it serves no purpose and use
i915_gem_object_wait_rendering() directly.
Note that this goes back to
commit a8198eea15
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Wed Apr 13 22:04:09 2011 +0100
drm/i915: Introduce i915_gem_object_finish_gpu()
and gpu domain tracking died in
commit cc889e0f6c
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Wed Jun 13 20:45:19 2012 +0200
drm/i915: disable flushing_list/gpu_write_list
which is more than 1 year older.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Add notes with information dug out of git history.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Skylake nv12 format requires dbuf (aka. ddb) calculations
and programming for each of y and uv sub-planes. Made minor
changes to reuse current dbuf calculations and programming
for uv plane. i.e., with this change, existing computation
is used for either packed format or uv portion of nv12
depending on incoming format. Added new code for dbuf
computation and programming for y plane.
This patch is a pre-requisite for adding NV12 format support.
Actual nv12 support is coming in later patches.
Signed-off-by: Chandra Konduru <chandra.konduru@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Sometimes (exactly when is a bit unclear) DISPLAY_PHY_CONTROL appears to
get corrupted. The values I've managed to read from it seem to have some
pattern but vary quite a lot. The corruption doesn't seem to just happen
when the register is accessed, but can also happen spontaneosly during
modeset. When this happens during a modeset things go south and the
display doesn't light up.
I've managed to hit the problemn when toggling HDMI on port D on and
off. When things get corrupted the display doesn't light up, but as soon
as I manually write the correct value to the register the display comes
up.
First I was suspicious that we ourselves accidentally overwrite it with
garbage, but didn't catch anything with the reg_rw tracepoint. Also I
sprinkled check all over the modeset path to see exactly when the
corruption happens, and eg. the read back value was fine just before
intel_dp_set_m(), and corrupted immediately after it. I also made my
check function repair the register value whenever it was wrong, and with
this approach the corruption repeated several times during the modeset
operation, always seeming to trigger in the same exact calls to the
check function, while other calls to the function never caught anything.
So far I've not seen this problem occurring when carefully avoiding all
read accesses to DISPLAY_PHY_CONTROL. Not sure if that's just pure luck
or an actual workaround, but we can hope it works. So let's avoid reading
the register and instead track the desired value of the register in dev_priv.
v2: Read out the power well state to determine initial register value
v3: Use DPIO_CHx names instead of raw numbers
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This allows disabling all planes affecting a crtc without caring what type it is.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This provides an option to override the value set by VBT
for selecting edp Vswing Pre-emph setting table.
v2: Adding comment about this being a temporary workaround and
making the parameter read-only (Jani)
v3: Changing mode to 0400 instead of 0 (Jani)
https://bugs.freedesktop.org/show_bug.cgi?id=89554
Signed-off-by: Sonika Jindal <sonika.jindal@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Enable runtime PM for Skylake platform
v2: After adding dmc ver 1.0 support rebased on top of nightly. (Animesh)
Issue: VIZ-2819
Signed-off-by: A.Sunil Kamath <sunil.kamath@intel.com>
Signed-off-by: Suketu Shah <suketu.j.shah@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Animesh Manna <animesh.manna@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Add triggers as per expectations mentioned in gen9_enable_dc5
and gen9_disable_dc5 patch.
Also call POSTING_READ for every write to a register to ensure that
its written immediately.
v1: Remove POSTING_READ calls as they've already been added in previous patches.
v2: Rebase to move all runtime pm specific changes to intel_runtime_pm.c file.
Modified as per review comments from Imre:
1] Change variable name 'dc5_allowed' to 'dc5_enabled' to correspond to relevant
functions.
2] Move the check dc5_enabled in skl_set_power_well() to disable DC5 into
gen9_disable_DC5 which is a more appropriate place.
3] Convert checks for 'pm.dc5_enabled' and 'pm.suspended' in skl_set_power_well()
to warnings. However, removing them for now as they'll be included in a future patch
asserting DC-state entry/exit criteria.
4] Enable DC5, only when CSR firmware is verified to be loaded. Create new structure
to track 'enabled' and 'deferred' status of DC5.
5] Ensure runtime PM reference is obtained, if CSR is not loaded, to avoid entering
runtime-suspend and release it when it's loaded.
6] Protect necessary CSR-related code with locks.
7] Move CSR-loading call to runtime PM initialization, as power domains needed to be
accessed during deferred DC5-enabling, are not initialized earlier.
v3: Rebase to latest.
Modified as per review comments from Imre:
1] Use blocking wait for CSR-loading to finish to enable DC5 for simplicity, instead of
deferring enabling DC5 until CSR is loaded.
2] Obtain runtime PM reference during CSR-loading initialization itself as deferred DC5-
enabling is removed and release it at the end of CSR-loading functionality.
3] Revert calling CSR-loading functionality to the beginning of i915 driver-load
functionality to avoid any delay in loading.
4] Define another variable to track whether CSR-loading failed and use it to avoid enabling
DC5 if it's true.
5] Define CSR-load-status accessor functions for use later.
v4:
1] Disable DC5 before enabling PG2 instead of after it.
2] DC5 was being mistaken enabled even when CSR-loading timed-out. Fix that.
3] Enable DC5-related functionality using a macro.
4] Remove dc5_enabled tracking variable and its use as it's not needed now.
v5:
1] Mark CSR failed to load where necessary in finish_csr_load function.
2] Use mutex-protected accessor function to check if CSR loaded instead of directly
accessing the variable.
3] Prefix csr_load_status_get/set function names with intel_.
v6: rebase to latest.
v7: Rebase on top of nightly (Damien)
v8: Squashed the patch from Imre - added csr helper pointers to simplify the code. (Imre)
v9: After adding dmc ver 1.0 support rebased on top of nightly. (Animesh)
v10: Added a enum for different csr states, suggested by Imre. (Animesh)
v11: Based on review comments from Imre, Damien and Daniel following changes done
- enum name chnaged to csr_state (singular form).
- FW_UNINITIALIZED used as zeroth element in enum csr_state.
- Prototype changed for helper function(set/get csr status), using enum csr_state instead of bool.
v12: Based on review comment from Imre, introduced bool fw_loaded local to finish_csr_load() which helps
calling once to set the csr status. The same flag used to fail RPM if find any issue during
firmware loading.
Issue: VIZ-2819
Signed-off-by: A.Sunil Kamath <sunil.kamath@intel.com>
Signed-off-by: Suketu Shah <suketu.j.shah@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Animesh Manna <animesh.manna@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Display Context Save and Restore support is needed for
various SKL Display C states like DC5, DC6.
This implementation is added based on first version of DMC CSR program
that we received from h/w team.
Here we are using request_firmware based design.
Finally this firmware should end up in linux-firmware tree.
For SKL platform its mandatory to ensure that we load this
csr program before enabling DC states like DC5/DC6.
As CSR program gets reset on various conditions, we should ensure
to load it during boot and in future change to be added to load
this system resume sequence too.
v1: Initial relese as RFC patch
v2: Design change as per Daniel, Damien and Shobit's review comments
request firmware method followed.
v3: Some optimization and functional changes.
Pulled register defines into drivers/gpu/drm/i915/i915_reg.h
Used kmemdup to allocate and duplicate firmware content.
Ensured to free allocated buffer.
v4: Modified as per review comments from Satheesh and Daniel
Removed temporary buffer.
Optimized number of writes by replacing I915_WRITE with I915_WRITE64.
v5:
Modified as per review comemnts from Damien.
- Changed name for functions and firmware.
- Introduced HAS_CSR.
- Reverted back previous change and used csr_buf with u8 size.
- Using cpu_to_be64 for endianness change.
Modified as per review comments from Imre.
- Modified registers and macro names to be a bit closer to bspec terminology
and the existing register naming in the driver.
- Early return for non SKL platforms in intel_load_csr_program function.
- Added locking around CSR program load function as it may be called
concurrently during system/runtime resume.
- Releasing the fw before loading the program for consistency
- Handled error path during f/w load.
v6: Modified as per review comments from Imre.
- Corrected out_freecsr sequence.
v7: Modified as per review comments from Imre.
Fail loading fw if fw->size%8!=0.
v8: Rebase to latest.
v9: Rebase on top of -nightly (Damien)
v10: Enabled support for dmc firmware ver 1.0.
According to ver 1.0 in a single binary package all the firmware's that are
required for different stepping's of the product will be stored. The package
contains the css header, followed by the package header and the actual dmc
firmwares. Package header contains the firmware/stepping mapping table and
the corresponding firmware offsets to the individual binaries, within the
package. Each individual program binary contains the header and the payload
sections whose size is specified in the header section. This changes are done
to extract the specific firmaware from the package. (Animesh)
v11: Modified as per review comemnts from Imre.
- Added code comment from bpec for header structure elements.
- Added __packed to avoid structure padding.
- Added helper functions for stepping and substepping info.
- Added code comment for CSR_MAX_FW_SIZE.
- Disabled BXT firmware loading, will be enabled with dmc 1.0 support.
- Changed skl_stepping_info based on bspec, earlier used from config DB.
- Removed duplicate call of cpu_to_be* from intel_csr_load_program function.
- Used cpu_to_be32 instead of cpu_to_be64 as firmware binary in dword aligned.
- Added sanity check for header length.
- Added sanity check for mmio address got from firmware binary.
- kmalloc done separately for dmc header and dmc firmware. (Animesh)
v12: Modified as per review comemnts from Imre.
- Corrected the typo error in skl stepping info structure.
- Added out-of-bound access for skl_stepping_info.
- Sanity check for mmio address modified.
- Sanity check added for stepping and substeppig.
- Modified the intel_dmc_info structure, cache only the required header info. (Animesh)
v13: clarify firmware load error message.
The reason for a firmware loading failure can be obscure if the driver
is built-in. Provide an explanation to the user about the likely reason for
the failure and how to resolve it. (Imre)
v14: Suggested by Jani.
- fix s/I915/CONFIG_DRM_I915/ typo
- add fw_path to the firmware object instead of using a static ptr (Jani)
v15:
1) Changed the firmware name as dmc_gen9.bin, everytime for a new firmware version a symbolic link
with same name will help not to build kernel again.
2) Changes done as per review comments from Imre.
- Error check removed for intel_csr_ucode_init.
- Moved csr-specific data structure to intel_csr.h and optimization done on structure definition.
- fw->data used directly for parsing the header info & memory allocation
only done separately for payload. (Animesh)
v16:
- No need for out_regs label in i915_driver_load(), so removed it.
- Changed the firmware name as skl_dmc_ver1.bin, followed naming convention <platform>_dmc_<api-version>.bin (Animesh)
Issue: VIZ-2569
Signed-off-by: A.Sunil Kamath <sunil.kamath@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Animesh Manna <animesh.manna@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
drm-intel-next-2015-04-23:
- dither support for ns2501 dvo (Thomas Richter)
- some polish for the gtt code and fixes to finally enable the cmd parser on hsw
- first pile of bxt stage 1 enabling (too many different people to list ...)
- more psr fixes from Rodrigo
- skl rotation support from Chandra
- more atomic work from Ander and Matt
- pile of cleanups and micro-ops for execlist from Chris
drm-intel-next-2015-04-10:
- cdclk handling cleanup and fixes from Ville
- more prep patches for olr removal from John Harrison
- gmbus pin naming rework from Jani (prep for bxt)
- remove ->new_config from Ander (more atomic conversion work)
- rps (boost) tuning and unification with byt/bsw from Chris
- cmd parser batch bool tuning from Chris
- gen8 dynamic pte allocation (Michel Thierry, based on work from Ben Widawsky)
- execlist tuning (not yet all of it) from Chris
- add drm_plane_from_index (Chandra)
- various small things all over
* tag 'drm-intel-next-2015-04-23-fixed' of git://anongit.freedesktop.org/drm-intel: (204 commits)
drm/i915/gtt: Allocate va range only if vma is not bound
drm/i915: Enable cmd parser to do secure batch promotion for aliasing ppgtt
drm/i915: fix intel_prepare_ddi
drm/i915: factor out ddi_get_encoder_port
drm/i915/hdmi: check port in ibx_infoframe_enabled
drm/i915/hdmi: fix vlv infoframe port check
drm/i915: Silence compiler warning in dvo
drm/i915: Update DRIVER_DATE to 20150423
drm/i915: Enable dithering on NatSemi DVO2501 for Fujitsu S6010
rm/i915: Move i915_get_ggtt_vma_pages into ggtt_bind_vma
drm/i915: Don't try to outsmart gcc in i915_gem_gtt.c
drm/i915: Unduplicate i915_ggtt_unbind/bind_vma
drm/i915: Move ppgtt_bind/unbind around
drm/i915: move i915_gem_restore_gtt_mappings around
drm/i915: Fix up the vma aliasing ppgtt binding
drm/i915: Remove misleading comment around bind_to_vm
drm/i915: Don't use atomics for pg_dirty_rings
drm/i915: Don't look at pg_dirty_rings for aliasing ppgtt
drm/i915/skl: Support Y tiling in MMIO flips
drm/i915: Fixup kerneldoc for struct intel_context
...
Conflicts:
drivers/gpu/drm/i915/i915_drv.c
At the moment intel_prepare_ddi buffer will iterate through both MST and
CRT encoders, which is incorrect. Neither of these encoder types have an
embedding intel_digital_port object, so for these encoder types we will
use random data when dereferencing the corresponding
intel_digital_port->port field.
Introduced in
commit b403745c84
Author: Damien Lespiau <damien.lespiau@intel.com>
Date: Mon Aug 4 22:01:33 2014 +0100
drm/i915: Iterate through the initialized DDIs to prepare their buffers
v2:
- fix getting at the port for MST encoders too
- make sure that intel_prepare_ddi_buffers() gets called for port E too
(Paulo)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90067
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Currently we have the problem that the decision whether ptes need to
be (re)written is splattered all over the codebase. Move all that into
i915_vma_bind. This needs a few changes:
- Just reuse the PIN_* flags for i915_vma_bind and do the conversion
to vma->bound in there to avoid duplicating the conversion code all
over.
- We need to make binding for EXECBUF (i.e. pick aliasing ppgtt if
around) explicit, add PIN_USER for that.
- Two callers want to update ptes, give them a PIN_UPDATE for that.
Of course we still want to avoid double-binding, but that should be
taken care of:
- A ppgtt vma will only ever see PIN_USER, so no issue with
double-binding.
- A ggtt vma with aliasing ppgtt needs both types of binding, and we
track that properly now.
- A ggtt vma without aliasing ppgtt could be bound twice. In the
lower-level ->bind_vma functions hence unconditionally set
GLOBAL_BIND when writing the ggtt ptes.
There's still a bit room for cleanup, but that's for follow-up
patches.
v2: Fixup fumbles.
v3: s/PIN_EXECBUF/PIN_USER/ for clearer meaning, suggested by Chris.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
commit ae6c480692
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Wed Aug 6 15:04:53 2014 +0200
drm/i915: Only track real ppgtt for a context
Changed the code but didn't update kerneldoc.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: "Thierry, Michel" <michel.thierry@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Purpose of this tracking is to know when to flush the cache between
the CPU and the non-coherent display engine. Prior to:
commit 121920faf2
Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Date: Mon Mar 23 11:10:37 2015 +0000
drm/i915/skl: Query display address through a wrapper
This worked by a mix of direct flag manipulation and checking for
existence of a pinned GGTT VMA.
With the introduction of rotated display mappings this approach is
no longer correct.
New simpler approach is to just keep this count over calls which pin
and unpin objects to and from display, at the slight cost of extra
space in every bo.
(Inspired and extracted code from a larger rework by Chris Wilson.)
v2: Remove the limit since it is not well defined. (Chris Wilson, Ville Syrjälä)
v3: Commit message corrections. (Chris Wilson)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The merge is clean, but the arm build fails afterwards,
due to API changes in the regulator tree.
I've included the patch into the merge to fix the build.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Not every DDIs is necessarily connected can be strapped off and, in the
future, we'll have platforms with a different number of default DDI
ports. So, let's only call intel_prepare_ddi_buffers() on DDI ports that
are actually detected.
We also use the opportunity to give a struct intel_digital_port to
intel_prepare_ddi_buffers() as we'll need it in a following patch to
query if the port supports HMDI or not.
On my HSW machine this removes the initialization of a couple of
(unused) DDIs.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Plug bxt PLL code into existing shared DPLL framework.
v2: (imre)
- squash in Satheeshakrishna's "Define BXT clock registers" and
"Add state variables for bxt clock registers" patches
- squash in Vandanas's "Change grp access to lane access for PLL"
- fix group vs. lane access in bxt_ddi_pll_get_hw_state
- add code comment why we read from lane registers while writing to
group registers
- clean up register macros
- use BXT_PORT_PLL_* macros instead of open-coding the same
- check if BXT_PORT_PCS_DW12_LN01 matches BXT_PORT_PCS_DW12_LN23
during hardware state readout
- add missing LANESTAGGER_STRAP_OVRD masking
- add note about missing step according to the latest BUN for
PORT_PLL_9/lockthresh
Signed-off-by: Satheeshakrishna M <satheeshakrishna.m@intel.com> (v1)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Rename vlv_cdclk_freq to cdclk_freq so that it can be used for all
platforms as required. Needed by the next patch.
Signed-off-by: Vandana Kannan <vandana.kannan@intel.com>
Signed-off-by: A.Sunil Kamath <sunil.kamath@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On Haswell and Broadwell with link in standby when exit event happens
between vblank and VSC packet, PSR exit on panel but DPA transmitter
still sends black pixel. When this condition hits, panel will intermittently
display black frame.
The known W/A for this case involve the of single_frame update
that isn't supported on Haswell and to be supported on Broadwell
3 other workarounds would be required. So it is better and safe to
just deprecate link_standby for now.
Also, link fully off saves more power than link_standby and afwk
no OEM is requesting link standby on VBT. There is no reason for that.
For Skylake let's just consider it behaves like Broadwell until
we prove otherwise.
v2: Fix commit message (Durga).
v3: Fix conflict with PSR2.
Reference: HSD: bdwgfx/1912559
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Separate topic branch for bxt didn't work out since we needed to
refactor the gmbus code a bit to make it look decent. So backmerge.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
The obj->pin_mappable flag only exists for debug purposes and is a
hindrance that is mistreated with rotated GGTT views. For debug
purposes, it suffices to mark objects with pin_display as being of note.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We already assign a unique identifier to every request: seqno. That
someone felt like adding a second one without even mentioning why and
tweaking ABI smells very fishy.
Fixes regression from
commit b3a38998f0
Author: Nick Hoath <nicholas.hoath@intel.com>
Date: Thu Feb 19 16:30:47 2015 +0000
drm/i915: Fix a use after free, and unbalanced refcounting
v2: Rebase
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Nick Hoath <nicholas.hoath@intel.com>
Cc: Thomas Daniel <thomas.daniel@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Jani Nikula <jani.nikula@intel.com>
[danvet: Fixup because different merge order.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This eliminates six needless spin lock/unlock pairs when writing out
ELSP.
v2: Respin with my preferred colour.
v3: Mostly back to the original colour
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> [v1]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
vma are more frequently allocated than objects and so should equally
benefit from having a dedicated slab.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
requests are even more frequently allocated than objects and equally
benefit from having a dedicated slab.
v2: Rebase
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This is mostly useful for execlists where the rings switch between
contexts (and so checking that the ring's start register matches the
context is important).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Now with the trimmed memcpy before the command parser, we try to
allocate many different sizes of batches, predominantly one or two
pages. We can therefore speed up searching for a good sized batch by
keeping the objects of buckets of roughly the same size.
v2: Add a comment about bucket sizes
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
I woke up one morning and found 50k objects sitting in the batch pool
and every search seemed to iterate the entire list... Painting the
screen in oils would provide a more fluid display.
One issue with the current design is that we only check for retirements
on the current ring when preparing to submit a new batch. This means
that we can have thousands of "active" batches on another ring that we
have to walk over. The simplest way to avoid that is to split the pools
per ring and then our LRU execution ordering will also ensure that the
inactive buffers remain at the front.
v2: execlists still requires duplicate code.
v3: execlists requires more duplicate code
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In the next patch, I want to use the structure elsewhere and so require
it defined earlier. Rather than move the definition to an earlier location
where it feels very odd, place it in its own header file.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
With boosting for missed pageflips, we have a much stronger indication
of when we need to (temporarily) boost GPU frequency to ensure smooth
delivery of frames. So now only allow each client to perform one RPS boost
in each period of GPU activity due to stalling on results.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Deepak S <deepak.s@linux.intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reuse the same reclocking strategy for Baytail as on its bigger brethren,
Sandybridge and Ivybridge. In particular, this makes the device quicker
to reclock (both up and down) though the tendency now is to downclock
more aggressively to compensate for the RPS boosts.
v2: Rebase
v3: Exclude Cherrytrail as Deepak was concerned that the increased
number of register writes would wake the common powerwell too often.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Deepak S <deepak.s@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The biggest user of i915_gem_object_get_page() is the relocation
processing during execbuffer. Typically userspace passes in a set of
relocations in sorted order. Sadly, we alternate between relocations
increasing from the start of the buffers, and relocations decreasing
from the end. However the majority of consecutive lookups will still be
in the same page. We could cache the start of the last sg chain, however
for most callers, the entire sgl is inside a single chain and so we see
no improve from the extra layer of caching.
v2: Avoid the double increment inside unlikely()
References: https://bugs.freedesktop.org/show_bug.cgi?id=88308
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Nick Hoath <nicholas.hoath@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Pipe A and b have 4 planes.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Antti Koskipää <antti.koskipaa@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Some BIOSes (e.g. the one on the Minnowboard) don't save/restore this
reg. If it's unlocked, we can just restore the previous value, and if
it's locked (in case the BIOS re-programmed it for us) the write will be
ignored and we'll still have "did it move" sanity check in the PM code to
warn us if something is still amiss.
References: https://bugs.freedesktop.org/show_bug.cgi?id=89611
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Tested-by: Darren Hart <dvhart@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Occasionally it would be interesting to read some of the DPCD registers
for debug purposes, without having to resort to logging. Add an i915
specific i915_dpcd debugfs file for DP and eDP connectors to dump parts
of the DPCD. Currently the DPCD addresses to be dumped are statically
configured, and more can be added trivially.
The implementation also makes it relatively easy to add other i915 and
connector specific debugfs files in the future, as necessary.
This is currently i915 specific just because there's no generic way to
do AUX transactions given just a drm_connector. However it's all pretty
straightforward to port to other drivers.
v2: Add more DPCD registers to dump.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Bob Paauwe <bob.j.paauwe@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We make use of HW tracking for Selective update region and enable frame sync on
sink. We use hardware's hardcoded data values for frame sync and GTC.
v2: Add 3200x2000 resolution restriction with PSR2, move psr2_support to i915_psr
struct, add aux_frame_sync to independently control aux frame sync, rename the
TP2 TIME macro for 2500us (Rodrigo, Siva)
v3: Moving the resolution restriction to intel_psr_enable so that we check it
only once(Durga)
Cc: Durgadoss R <durgadoss.r@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Sonika Jindal <sonika.jindal@intel.com>
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This will be helpful for adding future platforms. It is better to keep
the information in the single point of truth (the table) instead of
duplicating it into the validity function.
While at it, add dev_priv parameter to the function, also to prepare for
adding future platform support.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Index the gmbus tables directly using the pin instead of having a
confusing "port = i + 1" mapping. This finishes off removing the "gmbus
port" as a notion, and leaves us with just the "gmbus pin".
As pin 0 is invalid by definition and the gmbus tables will have a gap
at that index, add pin validity check to all the loops. This will be
benefitial for supporting platforms that have different numbers of pins,
or gaps.
v2: s/GMBUS_PIN_MAX/GMBUS_NUM_PINS/ (Ville, Daniel)
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Rename intel_gmbus_is_port_valid to intel_gmbus_is_valid_pin, and rename
port parameters to pin as well. This matches usage all around, as
usually a pin is passed to the validity check function. No functional
changes.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The specs refer to pin pairs. Start moving towards using pin rather than
port all around to avoid confusion. No functional changes.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The request allocation code is largely duplicated between legacy mode and
execlist mode. The actual difference between the two versions of the code is
pretty minimal.
This patch moves the common code out into a separate function. This is then
called by the execution specific version prior to setting up the one different
value.
For: VIZ-5190
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The submission portion of the execbuffer code path was abstracted into a
function pointer indirection as part of the legacy vs execlist work. The two
implementation functions are called 'i915_gem_ringbuffer_submission' and
'intel_execlists_submission' but the pointer was called 'do_execbuf'. There is
already a 'i915_gem_do_execbuffer' function (which is what calls the pointer
indirection). The name of the pointer is therefore considered to be backwards
and should be changed.
This patch renames it to 'execbuf_submit' which is hopefully a bit clearer.
For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We were missing a convenience stub to aquire the right mutex whilst
dropping the request, so add it.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
To allow for views where the view type is not defined by the view type only,
like it is in stereo or rotated 90 degree view, change the semantic to require
the whole view structure for comparison when we match a GGTT view.
This allows including parameters like offset to be included in the view which
is useful for eg. partial views.
v3:
- Rely on ggtt_view type being 0 for non-GGTT vma's, which equals to
I915_GGTT_VIEW_NORMAL. (Daniel Vetter)
- Do not use potentially slower comparison when we only want to know if
something is or is not a normal view.
- Rebase on top of rotated view patches. Add rotated view singleton.
- If one view is missing in comparison they're equal only if both are missing.
v4:
- Use comparison helper in obj_to_ggtt_view too. (Tvrtko Ursulin)
- Do WARN_ON if one view is NULL. (Tvrtko Ursulin)
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This is useful for writing igts to make sure we don't break this,
without being forced to own a one of these dinosaurs.
Suggested-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Pass a crtc_state to it and find whether the pipe has an encoder of a
given type by looking at the drm_atomic_state the crtc_state points to.
Until recently i9xx_get_refclk() used to be called indirectly from
vlv_force_pll_on() with a dummy crtc_state. That dummy crtc state is not
converted to be part of a full drm atomic state, so add a WARN in case
someone decides to call that again with a such dummy state. This was
removed in
commit 9cbe40c15a
Author: Vijay Purushothaman <vijay.a.purushothaman@linux.intel.com>
Date: Thu Mar 5 19:33:08 2015 +0530
drm/i915: Update prop, int co-eff and gain threshold for CHV
v2: Warn if there is no connectors for a given crtc. (Daniel)
Replace comment i9xx_get_refclk() with a WARN_ON(). (Ander)
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
[danvet: Add commit reference for when i9xx_get_refclk was removed.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Follow up patches will convert some functions called from there to use
the atomic state, instead of directly accessing the new or current
config. This patch just changes the parameters, but shouldn't have any
functional changes.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This flag was being mostly used as a meta flag in some
cases and not covering other cases.
One of the risks is that it was masking some frontbuffer
trackings without disabling PSR.
So, better to kill this at once and avoid umbrella parameters.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Drop unused out: label to appease gcc.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The faulting virtual address is >32bits and has been moved
to different registers. Add to error state and output upper
register first, in the same line for easy reconstruction of
the fault address.
v2: correct gen masking (Michel)
v3: s/TBL/TLB (Ville)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
To support frame buffer rotation we need to be able to pass on the information
on what kind of GGTT view is required for display.
This patch just adds the parameter and makes all the callers default to the
normal view.
v2: Rebased for ggtt view changes.
v3: Don't limit PIN_MAPPABLE to normal views just yet. (Joonas Lahtinen)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v3)
[danvet: s/BUG/WARN/ in the patch hunk because. At least where the
BUG_ON isn't fatal right away.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Two code changes:
- Extract i915_gem_shrinker_init.
- Inline i915_gem_object_is_purgeable since we open-code it everywhere
else too.
This already has the benefit of pulling all the shrinker code
together, next patch adds a bit of kerneldoc.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Use both up/down manual ei calcuations for symmetry and greater
flexibility for reclocking, instead of faking the down interrupt based
on a fixed integer number of up interrupts.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Deepak S<deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When we idle, we set the GPU frequency to the hardware minimum (not user
minimum). We introduce a new variable to distinguish between the
different roles, and to allow easy tuning of the idle frequency without
impacting over aspects of RPS. Setting the minimum frequency should be a
safety blanket as the pcu on the GPU should be power gating itself
anyway. However, in order for us to do set the absolute minimum
frequency, we need to relax a few of our assertions that we do not
exceed the user limits.
v2: Add idle_freq
v3: Init idle_freq for vlv and add a bunch of WARNs
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Deepak S <deepak.s@linux.intel.com>
Reviewed-by: Deepak S<deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
GGTT views are only applicable when dealing with GGTT. Change the code to
reject ggtt_view where it should not be used and require it when it should
be.
v2:
- Dropped _ppgtt_ infixes, allow both types to be passed
- Disregard other but normal views when no view is specified
- More checks that valid parameters are passed
- More readable error checking
v3:
- Prefer WARN_ONCE over BUG_ON when there is code path for failure
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
[danvet: Drop unecessary forward decl from earlier patch iterations.]
[danvet: Remove unused variable spotted by Tvrtko.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Same as
commit c883ef1b1c
Author: Mika Kuoppala <miku@iki.fi>
Date: Tue Oct 28 17:32:30 2014 +0200
drm/i915: Redefine WARN_ON to include the condition
but for WARN_ON_ONCE. Since the kernel WARN_ON_ONCE actually picks up
*our* version of WARN_ON, we end up with messages like
[ 838.285319] WARN_ON(!__warned)
which are not that helpful.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
For SKL, register definition for RPNSWREQ (A008), RPSTAT1(A01C)
have changed slightly. Also on SKL, frequency is specified in
units of 16.66 MHZ, compared to 50 MHZ for most of the earlier
platforms and the time values are expressed in units of 1.33 us,
compared to 1.28 us for earlier platforms.
Added new macros for the aforementioned changes.
v2: Renamed the GT_FREQ_FROM_PERIOD macro to GT_INTERVAL_FROM_US (Damien)
v3: Removed the implicit use of dev_priv in GT_INTERVAL_FROM_US macro (Chris)
Signed-off-by: Akash Goel <akash.goel@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Assuming the PND deadline mechanism works reasonably we should do
memory requests as early as possible so that PND has schedule the
requests more intelligently. Currently we're still calculating
the watermarks as if VLV/CHV are identical to g4x, which isn't
the case.
The current code also seems to calculate insufficient watermarks
and hence we're seeing some underruns, especially on high resolution
displays.
To fix it just rip out the current code and replace is with something
that tries to utilize PND as efficiently as possible.
We now calculate the WM watermark to trigger when the FIFO still has
256us worth of data. 256us is the maximum deadline value supoorted by
PND, so issuing memory requests earlier would mean we probably couldn't
utilize the full FIFO as PND would attempt to return the data at
least in at least 256us. We also clamp the watermark to at least 8
cachelines as that's the magic watermark that enabling trickle feed
would also impose. I'm assuming it matches some burst size.
In theory we could just enable trickle feed and ignore the WM values,
except trickle feed doesn't work with max fifo mode anyway, so we'd
still need to calculate the SR watermarks. It seems cleaner to just
disable trickle feed and calculate all watermarks the same way. Also
trickle feed wouldn't account for the 256us max deadline value, thoguh
that may be a moot point in non-max fifo mode sicne the FIFOs are fairly
small.
On VLV max fifo mode can be used with either primary or sprite planes.
So the code now also checks all the planes (apart from the cursor)
when calculating the SR plane watermark.
We don't have to worry about the WM1 watermarks since we're using the
PND deadline scheme which means the hardware ignores WM1 values.
v2: Use plane->state->fb instead of plane->fb
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Introduce struct vlv_wm_values to house VLV watermark/drain latency
values. We start by using it when computing the drain latency values.
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If we have a single unclaimed register, we will have lots. A WARN for
each one makes the machine unusable and does not aid debugging. Convert
the i915.mmio_debug option to a counter for how many WARNs to fire
before shutting up. Even when i915.mmio_debug was disabled it would
continue to shout an *ERROR* for every interrupt, without any
information at all for debugging.
The massive verbiage was added in
commit 5978118c39
Author: Paulo Zanoni <paulo.r.zanoni@intel.com>
Date: Wed Jul 16 17:49:29 2014 -0300
drm/i915: reorganize the unclaimed register detection code
v2: Automatically enable invalid mmio reporting for the *next* invalid
access if mmio_debug is disabled by default. This should give us clearer
debug information without polluting the logs too much.
v3: Compile fixes, rebase.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
[danvet: Update modparam text per the thread.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Apparently, this has never worked reliably and is currently disabled. Also, the
gains are not particularly impressive. Thus rather than try to keep unused code
from decaying and having to update it for other driver changes, it was decided
to simply remove it.
For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Kill the blt/render tracking we currently have and use the frontbuffer
tracking infrastructure.
Don't enable things by default yet.
v2: (Rodrigo) Fix small conflict on rebase and typo at subject.
v3: (Paulo) Rebase on RENDER_CS change.
v4: (Paulo) Rebase.
v5: (Paulo) Simplify: flushes don't have origin (Daniel).
Also rebase due to patch order changes.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We have similar macros for crtcs and encoders, and the pattern happens
often enough to justify the macro.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We want to port FBC to the frontbuffer tracking infrastructure, but
for that we need to know what caused the object invalidation so
we can react accordingly: CPU mmaps need manual, GTT mmaps and
flips don't need handling and ring rendering needs nukes.
v2: - s/ORIGIN_RENDER/ORIGIN_CS/ (Daniel, Rodrigo)
- Fix copy/pasted wrong documentation
- Rebase
v3: - Rebase
v4: - Don't pass the operation to flushes (Daniel).
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Implicit usage of local variables in macros isn't exactly the greatest
thing in the world, especially when that variable is the drm device and
we want to move towards a broader use of the i915 device structure.
Let's make for_each_sprite() take dev_priv as its first argument then.
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Chris Wilson <chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Implicit usage of local variables in macros isn't exactly the greatest
thing in the world, especially when that variable is the drm device and
we want to move towards a broader use of the i915 device structure.
Let's make for_each_plane() take dev_priv as its first argument then.
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Chris Wilson <chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJU/NacAAoJEHm+PkMAQRiGdUcIAJU5dHclwd9HRc7LX5iOwYN6
mN0aCsYjMD8Pjx2VcPCgJvkIoESQO5pkwYpFFWCwILup1bVEidqXfr8EPOdThzdh
kcaT0FwUvd19K+0jcKVNCX1RjKBtlUfUKONk6sS2x4RrYZpv0Ur8Gh+yXV8iMWtf
fAusNEYlxQJvEz5+NSKw86EZTr4VVcykKLNvj+/t/JrXEuue7IG8EyoAO/nLmNd2
V/TUKKttqpE6aUVBiBDmcMQl2SUVAfp5e+KJAHmizdDpSE80nU59UC1uyV8VCYdM
qwHXgttLhhKr8jBPOkvUxl4aSXW7S0QWO8TrMpNdEOeB3ZB8AKsiIuhe1JrK0ro=
=Xkue
-----END PGP SIGNATURE-----
Merge tag 'v4.0-rc3' into drm-next
Linux 4.0-rc3 backmerge to fix two i915 conflicts, and get
some mainline bug fixes needed for my testing box
Conflicts:
drivers/gpu/drm/i915/i915_drv.h
drivers/gpu/drm/i915/intel_display.c
- Y tiling support for scanout from Tvrtko&Damien
- Remove more UMS support
- some small prep patches for OLR removal from John Harrison
- first few patches for dynamic pagetable allocation from Ben Widawsky, rebased
by tons of other people
- DRRS support patches (Sonika&Vandana)
- fbc patches from Paulo
- make sure our vblank callbacks aren't called when the pipes are off
- various patches all over
* tag 'drm-intel-next-2015-02-27' of git://anongit.freedesktop.org/drm-intel: (61 commits)
drm/i915: Update DRIVER_DATE to 20150227
drm/i915: Clarify obj->map_and_fenceable
drm/i915/skl: Allow Y (and Yf) frame buffer creation
drm/i915/skl: Update watermarks for Y tiling
drm/i915/skl: Updated watermark programming
drm/i915/skl: Adjust get_plane_config() to support Yb/Yf tiling
drm/i915/skl: Teach pin_and_fence_fb_obj() about Y tiling constraints
drm/i915/skl: Adjust intel_fb_align_height() for Yb/Yf tiling
drm/i915/skl: Allow scanning out Y and Yf fbs
drm/i915/skl: Add new displayable tiling formats
drm/i915: Remove DRIVER_MODESET checks from modeset code
drm/i915: Remove regfile code&data for UMS suspend/resume
drm/i915: Remove DRIVER_MODESET checks from gem code
drm/i915: Remove DRIVER_MODESET checks in the gpu reset code
drm/i915: Remove DRIVER_MODESET checks from suspend/resume code
drm/i915: Remove DRIVER_MODESET checks in load/unload/close code
drm/i915: fix a printk format
drm/i915: Add media rc6 residency file to sysfs
drm/i915: Add missing description to parameter in alloc_pt_range
drm/i915: Removed the read of RP_STATE_CAP from sysfs/debugfs functions
...
- use the atomic helpers for plane_upate/disable hooks (Matt Roper)
- refactor the initial plane config code (Damien)
- ppgtt prep patches for dynamic pagetable alloc (Ben Widawsky, reworked and
rebased by a lot of other people)
- framebuffer modifier support from Tvrtko Ursulin, drm core code from Rob Clark
- piles of workaround patches for skl from Damien and Nick Hoath
- vGPU support for xengt on the client side (Yu Zhang)
- and the usual smaller things all over
* tag 'drm-intel-next-2015-02-14' of git://anongit.freedesktop.org/drm-intel: (88 commits)
drm/i915: Update DRIVER_DATE to 20150214
drm/i915: Remove references to previously removed UMS config option
drm/i915/skl: Use a LRI for WaDisableDgMirrorFixInHalfSliceChicken5
drm/i915/skl: Fix always true comparison in a revision id check
drm/i915/skl: Implement WaEnableLbsSlaRetryTimerDecrement
drm/i915/skl: Implement WaSetDisablePixMaskCammingAndRhwoInCommonSliceChicken
drm/i915: Add process identifier to requests
drm/i915/skl: Implement WaBarrierPerformanceFixDisable
drm/i915/skl: Implement WaCcsTlbPrefetchDisable:skl
drm/i915/skl: Implement WaDisableChickenBitTSGBarrierAckForFFSliceCS
drm/i915/skl: Implement WaDisableHDCInvalidation
drm/i915/skl: Implement WaDisableLSQCROPERFforOCL
drm/i915/skl: Implement WaDisablePartialResolveInVc
drm/i915/skl: Introduce a SKL specific init_workarounds()
drm/i915/skl: Document that we implement WaRsClearFWBitsAtReset
drm/i915/skl: Implement WaSetGAPSunitClckGateDisable
drm/i915/skl: Make the init clock gating function skylake specific
drm/i915/skl: Provide a gen9 specific init_render_ring()
drm/i915/skl: Document the WM read latency W/A with its name
drm/i915/skl: Also detect eDRAM on SKL
...
Lots of lines to remove!
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[danvet: Fixup makefile.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In execlist mode, the ringbuf is a function of the ring and context whereas in
legacy mode, it is derived from the ring alone. Thus the calculation required to
determine the ringbuf pointer from the ring (and context) also needs to test
execlist mode or not. This is messy.
Further, the request structure holds a pointer to both the ring and the context
for which it was created. Thus, given a request, it is possible to derive the
ringbuf in either legacy or execlist mode. Hence it is necessary to pass just
the request in to all the low level functions rather than some combination of
request, ring, context and ringbuf. However, rather than recalculating it each
time, it is much simpler to just cache the ringbuf pointer in the request
structure itself.
Caching the pointer means the calculation is done once at request creation time
and all further code and simply read it directly from the request structure.
OTC-Jira: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
[danvet: Drop contentless comment in lrc alloc request entirely. And
spelling fix in the commit message.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
v2: Adding VBT version check for low_vswing field, and correcting parsing
v3: (Damien)
- Restrain the scope of the 'vswing' variable
- Use the more idiomatic "ev_priv->vbt.edp_low_vswing = vswing == 0;"
instead of if (foo) var = true; else var = false;
- Shorten edp_vswing_premph_setting to edp_vswing_premph to fit in 80 chars
- Add the version from which the edp_vswing_premph field is valid in the
struct definition
Reviewed-by: Satheeshakrishna M <satheeshakrishna.m@intel.com> (v2)
Signed-off-by: Sonika Jindal <sonika.jindal@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When converting from implicitly tracked execlist queue items to ref counted
requests, not all frees of requests were replaced with unrefs, and extraneous
refs/unrefs of contexts were added.
Correct the unbalanced refcount & replace the frees.
Remove a noisy warning when hitting the request creation path.
drm_i915_gem_request and intel_context are both kref reference counted
structures. Upon allocation, drm_i915_gem_request's ref count should be
bumped using kref_init. When a context is assigned to the request,
the context's reference count should be bumped using i915_gem_context_reference.
i915_gem_request_reference will reduce the context reference count when
the request is freed.
Problem introduced in
commit 6d3d8274bc
Author: Nick Hoath <nicholas.hoath@intel.com>
AuthorDate: Thu Jan 15 13:10:39 2015 +0000
drm/i915: Subsume intel_ctx_submit_request in to drm_i915_gem_request
v2: Added comments explaining how the ctx pointer and the request object should
be ref-counted. Removed noisy warning.
v3: Cleaned up the language used in the commit & the header
description (Thanks David Gordon)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88652
Signed-off-by: Nick Hoath <nicholas.hoath@intel.com>
Reviewed-by: Thomas Daniel <thomas.daniel@intel.com>
Reviewed-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
When one EU is disabled in a particular subslice, we can tune how the
work is spread between subslices to improve EU utilization.
v2: - Use a bitfield to record which subslice(s) has(have) 7 EUs. That
will also make the machinery work if several sublices have 7 EUs.
(Jeff Mcgee)
- Only apply the different hashing algorithm if the slice is
effectively unbalanced by checking there's a single subslice with
7 EUs. (Jeff Mcgee)
v3: Fix typo in comment (Jeff Mcgee)
Issue: VIZ-3845
Cc: Jeff Mcgee <jeff.mcgee@intel.com>
Reviewed-by: Jeff Mcgee <jeff.mcgee@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Read fuse registers to determine the available slice total,
subslice total, subslice per slice, EU total, and EU per subslice
counts of the SKL device. The EU per subslice attribute is more
precisely defined as the maximum EU available on any one subslice,
since available EU counts may vary across subslices due to fusing.
Set flags indicating the SKL device's slice/subslice/EU (SSEU)
power gating capability. Make all values available via debugfs
entry 'i915_sseu_status'.
v2: Several small clean-ups suggested by Damien. Most notably,
used smaller types for the new device info fields to reduce
memory usage and improved the clarity/readability of the
method used to extract attribute values from the fuse
registers.
Signed-off-by: Jeff McGee <jeff.mcgee@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When reviewing patch that fixes VGA on BDW Halo Jani noticed that
we also had other ULT IDs that weren't listed there.
So this follow-up patch add these pci-ids as halo and fix comments
on i915_pciids.h
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Pull drm updates from Dave Airlie:
"This is the main drm pull, it has a shared branch with some alsa
crossover but everything should be acked by relevant people.
New drivers:
- ATMEL HLCDC driver
- designware HDMI core support (used in multiple SoCs).
core:
- lots more atomic modesetting work, properties and atomic ioctl
(hidden under option)
- bridge rework allows support for Samsung exynos chromebooks to
work finally.
- some more panels supported
i915:
- atomic plane update support
- DSI uses shared DSI infrastructure
- Skylake basic support is all merged now
- component framework used for i915/snd-hda interactions
- write-combine cpu memory mappings
- engine init code refactored
- full ppgtt enabled where execlists are enabled.
- cherryview rps/gpu turbo and pipe CRC support.
radeon:
- indirect draw support for evergreen/cayman
- SMC and manual fan control for SI/CI
- Displayport audio support
amdkfd:
- SDMA usermode queue support
- replace suballocator usage with more suitable one
- rework for allowing interfacing to more than radeon
nouveau:
- major renaming in prep for later splitting work
- merge arm platform driver into nouveau
- GK20A reclocking support
msm:
- conversion to atomic modesetting
- YUV support for mdp4/5
- eDP support
- hw cursor for mdp5
tegra:
- conversion to atomic modesetting
- better suspend/resume support for child devices
rcar-du:
- interlaced support
imx:
- move to using dw_hdmi shared support
- mode_fixup support
sti:
- DVO support
- HDMI infoframe support
exynos:
- refactoring and cleanup, removed lots of internal unnecessary
abstraction
- exynos7 DECON display controller support
Along with the usual bunch of fixes, cleanups etc"
* 'drm-next' of git://people.freedesktop.org/~airlied/linux: (724 commits)
drm/radeon: fix voltage setup on hawaii
drm/radeon/dp: Set EDP_CONFIGURATION_SET for bridge chips if necessary
drm/radeon: only enable kv/kb dpm interrupts once v3
drm/radeon: workaround for CP HW bug on CIK
drm/radeon: Don't try to enable write-combining without PAT
drm/radeon: use 0-255 rather than 0-100 for pwm fan range
drm/i915: Clamp efficient frequency to valid range
drm/i915: Really ignore long HPD pulses on eDP
drm/exynos: Add DECON driver
drm/i915: Correct the base value while updating LP_OUTPUT_HOLD in MIPI_PORT_CTRL
drm/i915: Insert a command barrier on BLT/BSD cache flushes
drm/i915: Drop vblank wait from intel_dp_link_down
drm/exynos: fix NULL pointer reference
drm/exynos: remove exynos_plane_dpms
drm/exynos: remove mode property of exynos crtc
drm/exynos: Remove exynos_plane_dpms() call with no effect
drm/i915: Squelch overzealous uncore reset WARN_ON
drm/i915: Take runtime pm reference on hangcheck_info
drm/i915: Correct the IOSF Dev_FN field for IOSF transfers
drm/exynos: fix DMA_ATTR_NO_KERNEL_MAPPING usage
...
We use the pid of the process which opened our device when
we track which was the culprit of the gpu hang. But as that
file descriptor might get inherited, we might blame the
wrong process when we record the error state.
Track process identifiers in requests to always find
the correct offender.
v2: Track only user processes (Chris)
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
[danvet: drop NULL check before put_pid as suggested by Chris.]
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The last (only?) user of this was removed in:
commit 2208d655a9
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Fri Nov 14 09:25:29 2014 +0100
drm/i915: drop WaSetupGtModeTdRowDispatch:snb
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
There have been quite a bit of development lately, leaving behing lonely
protypes. Time to bid them farewell.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Introduce a PV INFO structure, to facilitate the Intel GVT-g
technology, which is a GPU virtualization solution with mediated
pass-through. This page contains the shared information between
i915 driver and the host emulator. For now, this structure utilizes
an area of 4K bytes on HSW GPU's unused MMIO space. Future hardware
will have the reserved window architecturally defined, and layout
of the page will be added in future BSpec.
The i915 driver load routine detects if it is running in a VM by
reading the contents of this PV INFO page. Thereafter a flag,
vgpu.active is set, and intel_vgpu_active() is used by checking
this flag to conclude if GPU is virtualized with Intel GVT-g. By
now, intel_vgpu_active() will return true, only when the driver
is running as a guest in the Intel GVT-g enhanced environment on
HSW platform.
v2:
take Chris' comments:
- call the i915_check_vgpu() in intel_uncore_init()
- sanitize i915_check_vgpu() by adding BUILD_BUG_ON() and debug info
take Daniel's comments:
- put the definition of PV INFO into a new header - i915_vgt_if.h
other changes:
- access mmio regs by readq/readw in i915_check_vgpu()
v3:
take Daniel's comments:
- move the i915/vgt interfaces into a new i915_vgpu.c
- update makefile
- add kerneldoc to functions which are non-static
- add a DOC: section describing some of the high-level design
- update drm docbook
other changes:
- rename i915_vgt_if.h to i915_vgpu.h
v4:
take Tvrtko's comments:
- fix a typo in commit message
- add debug message when vgt version mismatches
- rename low_gmadr/high_gmadr to mappable/non-mappable in PV INFO
structure
Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Signed-off-by: Jike Song <jike.song@intel.com>
Signed-off-by: Eddie Dong <eddie.dong@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
To be used from the new addfb2 extension.
v2:
- Drop Intel-specific untiled modfier.
- Move to drm_fourcc.h.
- Document layouts a bit and denote them as platform-specific and not
useable for cross-driver sharing.
- Add Y-tiling for completeness.
- Drop special docstring markers to avoid confusing kerneldoc.
v3: Give Y-tiling a unique idea, noticed by Tvrtko.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (v1)
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Since the mapping from CRTCs to planes is fixed, looking at the CRTC
is essentially the same as looking at the plane. Also, the next
patches wil start using the frontbuffer_bits macros, and they take the
pipe as the parameter instead of the plane, and this could differ on
gens 2 and 3.
Another nice thing is that we don't risk accidentally initializing
things to PLANE_A if we don't set the value before it is used for the
first time. But this shouldn't be a problem with the current code.
V2: Rebase.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (v1)
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Add Skylake stepping Revision IDs definitions.
v1: Use existing revision id.
Signed-off-by: Nick Hoath <nicholas.hoath@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: Use magic __I915__ and bikeshed #defines as suggested by
Damien.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The check for previously reserved stolen space size for FBC in
i915_gem_stolen_setup_compression() did not take the compression
threshold into account. Fix this by storing and comparing to
uncompressed size instead.
The bug has been introduced in
commit 5e59f7175f
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date: Mon Jun 30 10:41:24 2014 -0700
drm/i915: Try harder to get FBC
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88975
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Replace the valleyview_set_rps() and gen6_set_rps() calls with
intel_set_rps() which itself does the IS_VALLEYVIEW() check. The
code becomes simpler since the callers don't have to do this check
themselves.
Most of the change was performe with the following semantic patch:
@@
expression E1, E2, E3;
@@
- if (IS_VALLEYVIEW(E1)) {
- valleyview_set_rps(E2, E3);
- } else {
- gen6_set_rps(E2, E3);
- }
+ intel_set_rps(E2, E3);
Adding intel_set_rps() and making valleyview_set_rps() and gen6_set_rps()
static was done manually. Also valleyview_set_rps() had to be moved a
bit avoid a forward declaration.
v2: Use a less greedy semantic patch
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Now when we declare gpu errors only through our own dedicated
hangcheck workqueue there is no need to have a separate workqueue
for handling the resetting and waking up the clients as the deadlock
concerns are no more.
The only exception is i915_debugfs::i915_set_wedged, which triggers
error handling through process context. However as this is only used through
test harness it is responsibility for test harness not to introduce hangs
through both debug interface and through hangcheck mechanism at the same time.
Remove gpu_error.work and let the hangcheck work do the tasks it used to.
v2: Add a big warning sign into i915_debugfs::i915_set_wedged (Chris)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Mainly taking care of some register offsets, otherwise things are similar to
hsw. Also, programming ddi aux to use hardcoded values for psr data select.
v2: introduce EDP_PSR_AUX_BASE macro (Chris)
v3: Moving to HW tracking for SKL+ platforms, so activating source psr during
psr_enabling and then avoiding psr entries and exits for each frontbuffer
updates.
v4: Using SKL DDI AUX regs instead of changing PSR_AUX regs definition (Rodrigo)
Signed-off-by: Sonika Jindal <sonika.jindal@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[danvet: Drop the hunks to short-circuit sw tracking: We'd need to
push this down one level, and I don't fully trust the test coverage
yet to do so. So much prefer we pick a whitelist approach for the
cases we know work correctly.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When run as a timer, i915_hangcheck_elapsed() must adhere to all the
rules of running in a softirq context. This is advantageous to us as we
want to minimise the risk that a driver bug will prevent us from
detecting a hung GPU. However, that is irrelevant if the driver bug
prevents us from resetting and recovering. Still it is prudent not to
rely on mutexes inside the checker, but given the coarseness of
dev->struct_mutex doing so is extremely hard.
Give in and run from a work queue, i.e. outside of softirq.
v2: Use own workqueue to avoid deadlocks (Daniel)
Cleanup commit msg and add comment to i915_queue_hangcheck() (Chris)
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Daniel Vetter <dnaiel.vetter@ffwll.chm>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
[danvet: Remove accidental kerneldoc comment starter, to appease the 0
day builder.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We don't have full atomic modeset support yet, but the "nuclear
pageflip" subset of functionality (i.e., plane operations only) should
be ready. Allow the user to force atomic on for debug purposes, or for
fixed-purpose embedded devices that will only use atomic for plane
updates.
The term 'nuclear' is used here instead of 'atomic' to make it clear
that this doesn't allow full atomic modeset support, just a (very
useful) subset of the atomic functionality.
We'll drop the kernel parameter and unconditionally enable atomic in a
future patch once all of the necessary pieces are in.
v2:
- Use module_param_named_unsafe() (Daniel)
- Simplify comment on DRIVER_ATOMIC guard (Daniel)
v3:
- Make the parameter "nuclear_pageflip" rather than just "nuclear"
for clarity. (Ander)
v4:
- Make the internal variable "nuclear_pageflip" as well as the
command-line option. (Ander)
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Replace all the vlv_gpu_freq(), vlv_freq_opcode(),
*GT_FREQUENCY_MULTIPLIER, and /GT_FREQUENCY_MULTIPLIER instances
with intel_gpu_freq() and intel_freq_opcode() calls.
Most of the change was performed with the following semantic patch:
@@
expression E;
@@
(
- E * GT_FREQUENCY_MULTIPLIER
+ intel_gpu_freq(dev_priv, E)
|
- E *= GT_FREQUENCY_MULTIPLIER
+ E = intel_gpu_freq(dev_priv, E)
|
- E /= GT_FREQUENCY_MULTIPLIER
+ E = intel_freq_opcode(dev_priv, E)
|
- do_div(E, GT_FREQUENCY_MULTIPLIER)
+ E = intel_freq_opcode(dev_priv, E)
)
@@
expression E1, E2;
@@
(
- vlv_gpu_freq(E1, E2)
+ intel_gpu_freq(E1, E2)
|
- vlv_freq_opcode(E1, E2)
+ intel_freq_opcode(E1, E2)
)
@@
expression E1, E2, E3, E4;
@@
(
- if (IS_VALLEYVIEW(E1)) {
- E2 = intel_gpu_freq(E3, E4);
- } else {
- E2 = intel_gpu_freq(E3, E4);
- }
+ E2 = intel_gpu_freq(E3, E4);
|
- if (IS_VALLEYVIEW(E1)) {
- E2 = intel_freq_opcode(E3, E4);
- } else {
- E2 = intel_freq_opcode(E3, E4);
- }
+ E2 = intel_freq_opcode(E3, E4);
)
One hunk was manually undone as intel_gpu_freq() ended up
calling itself. Supposedly it would be possible to exclude
certain functions via !=~, but I couldn't get that to work.
Also the removal of vlv_gpu_freq() and vlv_opcode_freq() compat
wrappers was done manually.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Rename the vlv_gpu_freq() and vlv_freq_opecode() functions to have
an intel_ prefix, and handle non-VLV/CHV platforms in them as well.
Leave the vlv_ names around for now since they're currently used.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Currently we are hitting the WARN inside
i915_gem_object_set_cache_level() as we can now have an unbound object
in the GTT write domain (due to 43566dedde "drm/i915: Broaden
application of set-domain(GTT)"). To avoid the warning, we need to track
when we elided the clflush on a cacheable object and then evict the
cache for the object when we move the object out of a cacheable domain.
Reported-by: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Jani Nikula <jani.nikula@intel.com>
Testcase: igt/gem_mmap_wc/set-cache-level
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88607
Tested-by: huax.lu@intel.com
[danvet: Split if into nested if as discussion on the m-l.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We increase it when we pin, so for the casual reader
rename it to cause less confusion.
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Thomas Daniel <thomas.daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This vfunc and related structure are only used for fast boot, so let's
rename them to not take them as general purpose ones.
v2: Fix conflicts caused by the introduction of struct intel_crtc_state
Reviewed-By: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (v1)
Suggested-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Adding new power doamins for AUX controllers
v2: Added new power domains in power_domain_str per Imre's comment
v3: Added AUX power domains to older platforms
v4: Rebase on top of POWER_DOMAIN_PLLS.
v5: Modified to address review comments from Imre
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Satheeshakrishna M <satheeshakrishna.m@intel.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> (v3)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Make the domains and domain identifiers enums. To emphasize
the difference in order to avoid mistakes.
v2: s/fw_domain/forcewake_domain (Jani)
v3: rebase
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We have multiple forcewake domains now on recent gens. Change the
function naming to reflect this.
v2: More verbose names (Chris)
v3: Rebase
v4: Rebase
v5: Add documentation for forcewake_get/put
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com> (v2)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
As we now have forcewake domains, take advantage of it
by putting the differences in gen fw handling in data rather
than in code.
In past we have opencoded this quite extensively as the fw handling
is in the fast path. There has also been a lot of cargo-culted
copy'n'pasting from older gens to newer ones.
Now when the releasing of the forcewake is done by deferred timer,
it gives chance to consolidate more. Due to the frequency of actual hw
access being significantly less.
Take advantage of this and generalize the fw handling code
as much as possible. But we still aim to keep the forcewake sequence
particularities for each gen intact. So the access pattern
to fw engines should remain the same.
v2: - s/old_ack/clear_ack (Chris)
- s/post_read/posting_read (Chris)
- less polite commit msg (Chris)
v3: - rebase
- check and clear wake_count in init
v4: - fix posting reads for gen8 (PRTS)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com> (v2)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Introduce a structure to track the individual forcewake domains and use
that to eliminate duplicate logic.
v2: - Rebase on latest dinq (Mika)
- for_each_fw_domain macro (Mika)
- Handle reset atomically, keeping the timer running (Mika)
- for_each_fw_domain parameter ordering (Chris)
- defer timer on new register access (Mika)
v3: - Fix forcewake_reset/get race by waiting pending timers
v4: - cond_resched and verbose warning on timer deletion (Chris)
- need to run pending timers manually on reset
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Acked-by: Deepak S <deepak.s@linux.intel.com> (v2)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Move all remaining elements that were unique to execlists queue items
in to the associated request.
Issue: VIZ-4274
v2: Rebase. Fixed issue of overzealous freeing of request.
v3: Removed re-addition of cleanup work queue (found by Daniel Vetter)
v4: Rebase.
v5: Actual removal of intel_ctx_submit_request. Update both tail and postfix
pointer in __i915_add_request (found by Thomas Daniel)
v6: Removed unrelated changes
Signed-off-by: Nick Hoath <nicholas.hoath@intel.com>
Reviewed-by: Thomas Daniel <thomas.daniel@intel.com>
[danvet: Reformat comment with strange linebreaks.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Where there were duplicate variables for the tail, context and ring (engine)
in the gem request and the execlist queue item, use the one from the request
and remove the duplicate from the execlist queue item.
Issue: VIZ-4274
v1: Rebase
v2: Fixed build issues. Keep separate postfix & tail pointers as these are
used in different ways. Reinserted missing full tail pointer update.
Signed-off-by: Nick Hoath <nicholas.hoath@intel.com>
Reviewed-by: Thomas Daniel <thomas.daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This reduces the number of direct users of crtc->new_config, opening up
the possibilty of removing it altogether.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The objective is to make this structure usable with the atomic helpers,
so let's start with the rename. Patch generated with coccinelle:
@@ @@
-struct intel_crtc_config {
+struct intel_crtc_state {
...
}
@@ @@
-struct intel_crtc_config
+struct intel_crtc_state
v2: Completely generate the patch with cocci. (Ander)
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Earlier, DRRS structures were specific to eDP (used only in intel_dp).
Since DRRS can be extended to other internal display types
(if the panel supports multiple RR), modifying structures
to be part of drm_i915_private and have a provision to add display related
structs like intel_dp.
Also, aligning with frontbuffer tracking mechanism, the new structure
contains data for busy frontbuffer bits.
Signed-off-by: Vandana Kannan <vandana.kannan@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Looks like latest BSW/CHV production system has sideband address > 128.
Use u32 data types to cover new offset/address range :)
Signed-off-by: Deepak S <deepak.s@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Starting with Cherryview, devices may have a varying number of EU for
a given ID due to creative fusing. Punit support different frequency for
different fuse data. We use this patch to help get total eu enabled and
read the right offset to get RP0
Based upon a patch from Jeff, but reworked to only store eu_total and
avoid sending info to userspace
v2: Format register definitions (Jani)
Signed-off-by: Deepak S <deepak.s@linux.intel.com>
Acked-by: Jeff McGee <jeff.mcgee@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
BDW with PCI-IDs ended in "2" aren't ULT, but HALO.
Let's fix it and at least allow VGA to work on this units.
v2: forgot ammend and v1 doesn't compile
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87220
Cc: Xion Zhang <xiong.y.zhang@intel.com>
Cc: Guo Jinxian <jinxianx.guo@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Stable <stable@vger.kernel.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
No functional changes on this patch. Just grouping the link_standy decision
to avoid miss any change. Also making this info available everywhere
which will help to decide when to use vbt's tp time on following patch.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
[danvet: Slight editing of the commit message which was one huge
run-on sentence.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If CONFIG_BUG=n __WARN_printf won't be defined leading to the below
build failure. The double underscores should have told us to steer clear
of it anyway.
drivers/gpu/drm/i915/intel_display.c: In function ‘assert_pll’:
drivers/gpu/drm/i915/intel_display.c:1027:2: error: implicit declaration
of function ‘__WARN_printf’ [-Werror=implicit-function-declaration]
I915_STATE_WARN(cur_state != state,
Use WARN(1, ...) instead. It handles CONFIG_BUG=n gracefully and, with
the constant condition, a sane compiler should reduce it to
__WARN_printf.
This is a regression introduced by
commit e2c719b75c
Author: Rob Clark <robdclark@gmail.com>
Date: Mon Dec 15 13:56:32 2014 -0500
drm/i915: tame the chattermouth (v2)
Reported-by: Jim Davis <jim.epost@gmail.com>
Reference: http://mid.gmane.org/CA+r1ZhgHTi7bS2irhtuSUs9aO=Br1dumN8=oAOeaMJDZ_ZhwBw@mail.gmail.com
Cc: Rob Clark <robdclark@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Conflicts:
drivers/gpu/drm/i915/intel_runtime_pm.c
Separate branch so that Takashi can also pull just this refactoring
into sound-next.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Register a component to be used to interface with the snd_hda_intel
driver. This is meant to replace the same interface that is currently
based on module symbol lookup.
v2:
- change roles between the hda and i915 components (Daniel)
- add the implementation to a new file (Jani)
- use better namespacing (Jani)
v3:
- move the implementation to intel_audio.c (Daniel)
- rename display_component to audio_component (Daniel)
- add kerneldoc (Daniel)
v4:
- run forgotten git rm i915_component.c (Jani)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This will be needed by later patches, so factor it out.
No functional change.
v2:
- s/dev_to_i915_priv/dev_to_i915/ (Jani)
- don't use the helper in i915_pm_suspend (Chris)
- simplify the helper (Chris)
v3:
- remove redundant upcasting in the helper (Daniel)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Sometimes we wish to tweak how an individual context behaves. Since we
always create a context for every filp, this means that individual
processes can fine tune their behaviour even if they do not explicitly
create a context.
The first example parameter here is to enable multi-process GPU testing,
but the interface should be able to cope with passing arbitrarily complex
parameters.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Testcase: igt/gem_reset_stats/ban-period-*
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This will allow us to set per-file, or even per-context, periods in the
future.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This reverts commit 371abae844.
This data seems unreliable and causing many issues and blocking other
teams and feature implementation. Safest way is to revert that for now.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88081
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88039
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87671
Cc: Vandana Kannan <vandana.kannan@intel.com>
Cc: Deepak M <m.deepak@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Kristian Høgsberg <hoegsberg@gmail.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
I've had these since before -rc1, but they missed my last pull
request. Real bug fixes and mostly cc: stable material.
* tag 'drm-intel-next-fixes-2014-12-30' of git://anongit.freedesktop.org/drm-intel:
drm/i915: add missing rpm ref to i915_gem_pwrite_ioctl
Revert "drm/i915: Preserve VGACNTR bits from the BIOS"
drm/i915: Don't call intel_prepare_page_flip() multiple times on gen2-4
drm/i915: Kill check_power_well() calls
This reverts commit 355a701838.
This had some bad side effects under normal operation, and should
have been dropped earlier.
Signed-off-by: Dave Airlie <airlied@redhat.com>
The VGA_2X_MODE bit apparently affects the display even when the VGA
plane is disabled. The bit will set by the BIOS when the panel width
is at least 1280 pixels. So by preserving the bit from the BIOS we
end up with corrupted display on machines with such high res panels.
I only have 1024x768 panels on my gen2 machines so never ran into
this problem.
The original reason for preserving the VGACNTR register was to make
my 830 survive S3 with acpi_sleep=s3_bios option. However after
further 830 fixes that option is no longer needed to make S3 work
and preserving VGACNTR doesn't seem to be necessary without it,
so we can just revert the entire patch.
This reverts
commit 69769f9a42
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Fri Aug 15 01:22:08 2014 +0300
drm/i915: Preserve VGACNTR bits from the BIOS
Cc: Bruno Prémont <bonbons@linux-vserver.org>
Cc: stable@vger.kernel.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87171
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Many distro's have mechanism in place to collect and automatically file
bugs for failed WARN()s. And since i915 has a lot of hw state sanity
checks which result in WARN(), it generates quite a lot of noise which
is somewhat disconcerting to the end user.
Separate out the internal hw-is-in-the-state-I-expected checks into
I915_STATE_WARN()s and allow configuration via i915.verbose_checks module
param about whether this will generate a full blown stacktrace or just
DRM_ERROR(). The new moduleparam defaults to true, so by default there
is no change in behavior. And even when disabled, you will still get
an error message logged.
v2: paint the macro names blue, clarify that the default behavior
remains the same as before
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Previously we couldn't trust the user-supplied batch length because
it came directly from userspace (i.e. untrusted code). It would have
affected what commands software parsed without regard to what hardware
would actually execute, leaving a potential hole.
With the parser now copying the user supplied batch buffer and writing
MI_NOP commands to any space after the copied region, we can safely use
the batch length input. This should be a performance win as the actual
batch length is frequently much smaller than the allocated object size.
v2: Fix handling of non-zero batch_start_offset
Issue: VIZ-4719
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-By: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This patch sets up all of the tracking and copying necessary to
use batch pools with the command parser and dispatches the copied
(shadow) batch to the hardware.
After this patch, the parser is in 'enabling' mode.
Note that performance takes a hit from the copy in some cases
and will likely need some work. At a rough pass, the memcpy
appears to be the bottleneck. Without having done a deeper
analysis, two ideas that come to mind are:
1) Copy sections of the batch at a time, as they are reached
by parsing. Might improve cache locality.
2) Copy only up to the userspace-supplied batch length and
memset the rest of the buffer. Reduces the number of reads.
v2:
- Remove setting the capacity of the pool
- One global pool instead of per-ring pools
- Replace batch_obj with shadow_batch_obj and hook into eb->vmas
- Memset any space in the shadow batch beyond what gets copied
- Rebased on execlist prep refactoring
v3:
- Rebase on chained batch handling
- Squash in setting the secure dispatch flag
- Add a note about the interaction w/secure dispatch pinning
- Check for request->batch_obj == NULL in i915_gem_free_request
v4:
- Fix read domains for shadow_batch_obj
- Remove the set_to_gtt_domain call from i915_parse_cmds
- ggtt_pin/unpin in the parser block to simplify error handling
- Check USES_FULL_PPGTT before setting DISPATCH_SECURE flag
- Remove i915_gem_batch_pool_put calls
v5:
- Move 'pending_read_domains |= I915_GEM_DOMAIN_COMMAND' after
the parser (danvet, from v4 0/7 feedback)
Issue: VIZ-4719
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-By: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This adds a small module for managing a pool of batch buffers.
The only current use case is for the command parser, as described
in the kerneldoc in the patch. The code is simple, but separating
it out makes it easier to change the underlying algorithms and to
extend to future use cases should they arise.
The interface is simple: init to create an empty pool, fini to
clean it up, get to obtain a new buffer. Note that all buffers are
expected to be inactive before cleaning up the pool.
Locking is currently based on the caller holding the struct_mutex.
We already do that in the places where we will use the batch pool
for the command parser.
v2:
- s/BUG_ON/WARN_ON/ for locking assertions
- Remove the cap on pool size
- Switch from alloc/free to init/fini
v3:
- Idiomatic looping structure in _fini
- Correct handling of purged objects
- Don't return a buffer that's too much larger than needed
v4:
- Rebased to latest -nightly
v5:
- Remove _put() function and clean up comments to match
v6:
- Move purged check inside the loop (danvet, from v4 1/7 feedback)
v7:
- Use single list instead of two. (Chris W)
- s/active_list/cache_list
- Squashed in debug patches (Chris W)
drm/i915: Add a batch pool debugfs file
It provides some useful information about the buffers in
the global command parser batch pool.
v2: rebase on global pool instead of per-ring pools
v3: rebase
drm/i915: Add batch pool details to i915_gem_objects debugfs
To better account for the potentially large memory consumption
of the batch pool.
v8:
- Keep cache in LRU order (danvet, from v6 1/5 feedback)
Issue: VIZ-4719
Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Reviewed-By: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
LFP brighness control from the VBT block 43 indicates which
controller is used for brightness.
LFP1 brightness control method:
Bit 7-4 = This field controller number of the brightnes controller.
0 = Controller 0
1 = Controller 1
2 = Controller 2
3 = Controller 3
Others = Reserved
Bits 3-0 = This field specifies the brightness control pin to be used on the
platform.
0 = PMIC pin is used for brightness control
1 = LPSS PWM is used for brightness control
2 = Display DDI is used for brightness control
3 = CABC method to control brightness
Others = Reserved
Adding the above fields in dev_priv->vbt and corresponding changes in
parse_backlight()
v2: Jani's review comments addressed
- Move PWM definitions to intel_bios.h
- Moving vbt_version to intel_vbt_data
- Rename brightness to bl_ctrl_data
- Logging just control_pin instead of string
- Avoid adding vbt_version in dev_priv
- Since only DDI option is available as of now, let control pin DDI
affect dev_priv->vbt.backlight.present
v3: Jani's review comments addressed
- Drop control_pin
- Use bdb->version
- set controller to 0 instead of using control pin define
- check controller bounds
- remove superfluous changes in intel_parse_bios
Signed-off-by: Deepak M <m.deepak@intel.com>
Signed-off-by: Vandana Kannan <vandana.kannan@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Things like reliable GGTT mappings and mirrored 2d-on-3d display will need
to map objects into the same address space multiple times.
Added a GGTT view concept and linked it with the VMA to distinguish between
multiple instances per address space.
New objects and GEM functions which do not take this new view as a parameter
assume the default of zero (I915_GGTT_VIEW_NORMAL) which preserves the
previous behaviour.
This now means that objects can have multiple VMA entries so the code which
assumed there will only be one also had to be modified.
Alternative GGTT views are supposed to borrow DMA addresses from obj->pages
which is DMA mapped on first VMA instantiation and unmapped on the last one
going away.
v2:
* Removed per view special casing in i915_gem_ggtt_prepare /
finish_object in favour of creating and destroying DMA mappings
on first VMA instantiation and last VMA destruction. (Daniel Vetter)
* Simplified i915_vma_unbind which does not need to count the GGTT views.
(Daniel Vetter)
* Also moved obj->map_and_fenceable reset under the same check.
* Checkpatch cleanups.
v3:
* Only retire objects once the last VMA is unbound.
v4:
* Keep scatter-gather table for alternative views persistent for the
lifetime of the VMA.
* Propagate binding errors to callers and handle appropriately.
v5:
* Explicitly look for normal GGTT view in i915_gem_obj_bound to align
usage in i915_gem_object_ggtt_unpin. (Michel Thierry)
* Change to single if statement in i915_gem_obj_to_ggtt. (Michel Thierry)
* Removed stray semi-colon in i915_gem_object_set_cache_level.
For: VIZ-4544
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
[danvet: Drop hunk from i915_gem_shrink since it's just prettification
but upsets a __must_check warning.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Faster feedback to errors is always better. This is inspired by the
addition to WARN_ONs to mask/enable helpers for registers to make sure
callers have the arguments ordered correctly: Pretty much always the
arguments are static.
We use WARN_ON(1) a lot in default switch statements though where we
should always handle all cases. So add a new macro specifically for
that.
The idea to use __builtin_constant_p is from Chris Wilson.
v2: Use the ({}) gcc-ism to avoid the static inline, suggested by
Dave. My first attempt used __cond as the temp var, which is the same
used by BUILD_BUG_ON, but with inverted sense. Hilarity ensued, so
sprinkle i915 into the name.
Also use a temporary variable to only evaluate the condition once,
suggested by Damien.
v3: It's crazy but apparently 32bit gcc can't compile out the
BUILD_BUG_ON in a lot of cases and just falls over. I have no idea
why, but until clue grows just disable this nifty idea on 32bit
builds. Reported by 0-day builder.
v4: Got it all wrong, apparently its the gcc version. We need 4.9+.
Now reported by Imre.
v5: Chris suggested to add the case to MISSING_CASE for speedier
debug.
v6: Even some gcc 4.9 versions don't see through the maze, so give up
for now. Keep the skeleton and MISSING_CASE stuff though.
Cc: Imre Deak <imre.deak@intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Should probably just init this in the GMbus code all the time, based on
the cdclk and HPLL like we do on newer platforms. Ville has code for
that in a rework branch, but until then we can fix this bug fairly
easily.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76301
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Nikolay <mar.kolya@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
No functional changes. This is just the begin of a FBC rework.
v2 (Paulo):
- Revert intel_fbc_init() changed parameter.
- Revert set_no_fbc_reason() rename.
- Rebase.
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
For debugging purposes, it is useful to be able to uniquely identify a given
request structure as it works its way through the system. This becomes
especially tricky once the seqno value is lazily allocated as then the request
has nothing but its pointer to identify it for much of its life.
Change-Id: Ie76b2268b940467f4cdf5a4ba6f5a54cbb96445d
For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We've lost the +1 required for correct timeouts in
commit 5ed0bdf21a
Author: Thomas Gleixner <tglx@linutronix.de>
Date: Wed Jul 16 21:05:06 2014 +0000
drm: i915: Use nsec based interfaces
Use ktime_get_raw_ns() and get rid of the back and forth timespec
conversions.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: John Stultz <john.stultz@linaro.org>
So fix this up by reinstating our handrolled _timeout function. While
at it bother with handling MAX_JIFFIES.
v2: Convert to usecs (we don't care about the accuracy anyway) first
to avoid overflow issues Dave Gordon spotted.
v3: Drop the explicit MAX_JIFFY_OFFSET check, usecs_to_jiffies should
take care of that already. It might be a bit too enthusiastic about it
though.
v4: Chris has a much nicer color, so use his implementation.
This requires to export nsec_to_jiffies from time.c.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Gordon <david.s.gordon@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82749
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Acked-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Updated the trace_irq code to use requests instead of seqnos. This includes
reference counting the request object to ensure it sticks around when required.
Note that getting access to the reference counting functions means moving the
inline i915_trace_irq_get() function from intel_ringbuffer.h to i915_drv.h.
For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
[danvet: Resolve conflict due to shuffled merge order.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The ring member of the object structure was always updated with the
last_read_seqno member. Thus with the conversion to last_read_req, obj->ring is
now a direct copy of obj->last_read_req->ring. This makes it somewhat redundant
and potentially misleading (especially as there was no comment to explain its
purpose).
This checkin removes the redundant field. Many uses were simply testing for
non-null to see if the object is active on the GPU. Some of these have been
converted to check 'obj->active' instead. Others (where the last_read_req is
about to be used anyway) have been changed to check obj->last_read_req. The rest
simply pull the ring out from the request structure and proceed as before.
For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Almost everywhere that caled i915_seqno_passed() was really asking 'has the
given seqno popped out of the hardware yet?'. Thus it had to query the current
hardware seqno and then do a signed delta comparison (which copes with wrapping
around zero but not with seqno values more than 2GB apart, although the latter
is unlikely!).
Now that the majority of seqno instances have been replaced with request
structures, it is possible to convert this test to be request based as well.
There is now a 'i915_gem_request_completed()' function which takes a request and
returns true or false as appropriate. Note that this currently just wraps up the
original _passed() test but a later patch in the series will reduce this to
simply returning a cached internal value, i.e.:
_completed(req) { return req->completed; }'
This checkin converts almost all _seqno_passed() calls. The only one left is in
the semaphore code which still requires seqnos not request structures.
For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
[danvet: Drop hunk touching the trace_irq code since I've dropped the
patch which converts that, and resolve resulting conflict.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
There is no longer any need to retrieve a seqno value from an i915_add_request()
call. The calling code already knows which request structure is being processed
(it can only be ring->OLR). And as the request itself is now used in preference
to the basic seqno value, the latter is now redundant in this situation.
For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Now that all code above is using request structures instead of seqno values, it
is possible to convert __wait_seqno() itself. Internally, it is still calling
i915_seqno_passed(), this will be updated later in the series. This step is just
changing the parameter list and function name.
For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
With refcounting it looks like you can just drop that refcount, but
that's not really the case. So make sure no one forgets.
Motivated by the unlocked call in the mmio flip code.
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Updated i915_wait_seqno() to take a request structure instead of a seqno value
and renamed it accordingly. Internally, it just pulls the seqno out of the
request and calls on to __wait_seqno() as before. However, all the code further
up the stack is now simplified as it can just pass the request object straight
through without having to peek inside.
For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
[danvet: Squash in hunk from an earlier patch which was rebased
wrongly.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Updated the _check_olr() function to actually take a request object and compare
it to the OLR rather than extracting seqnos and comparing those.
Note that there is one use case where the request object being processed is no
longer available at that point in the call stack. Hence a temporary copy of the
original function is still present (but called _check_ols() instead). This will
be removed in a subsequent patch.
Also, downgraded a BUG_ON to a WARN_ON as apparently the former is frowned upon
for shipping code.
For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The object structure contains the last read, write and fenced seqno values for
use in syncrhonisation operations. These have now been replaced with their
request structure counterparts.
Note that to ensure that objects do not end up with dangling pointers, the
assignments of last_*_req include reference count updates. Thus a request cannot
be freed if an object is still hanging on to it for any reason.
v2: Corrected 'last_rendering_' to 'last_read_' in a number of comments that did
not get updated when 'last_rendering_seqno' became 'last_read|write_seqno'
several millenia ago.
For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Added helper functions for retrieving the ring and seqno entries from a request
structure. This allows the internal workings of the request structure to be
hidden from code that is using these. It also allows for useful
workarounds/debug code to be added as or when necessary.
Note that it is intended that the majority (if not all) uses of the seqno
accessor will disappear eventually as code is updated to use the request
structure itself rather than working with seqno values.
For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The plan is to use request structures everywhere that seqno values were
previously used. This means saving pointers to structures in places that used to
be simple integers. In turn, that means that the target structure now needs much
more stringent lifetime tracking. That is, it must not be freed while some other
random object still holds a pointer to it.
To achieve this tracking, a reference count needs to be added. Whenever a
pointer to the structure is saved away, the count must be incremented and the
free must only occur when all references have been released.
For: VIZ-4377
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Thomas Daniel <Thomas.Daniel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This patch is the last in series of VLV/CHV PSR,
that finally enable PSR by adding it to HAS_PSR
and calling the proper enable and disable
functions on the right places.
Although it is still disabled by default.
v2: Rebase over intel_psr and merge Durgadoss's fixes.
v3: Fix typo.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
PSR (aka SRD) block is defined at VBT and currently being used.
Mainly/At-least to configure the amount of idle_frames require to get
back to PSR Entry.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Durgadoss R <durgadoss.r@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
drm-intel-next-2014-11-21:
- infoframe tracking (for fastboot) from Jesse
- start of the dri1/ums support removal
- vlv forcewake timeout fixes (Imre)
- bunch of patches to polish the rps code (Imre) and improve it on bdw (Tom
O'Rourke)
- on-demand pinning for execlist contexts
- vlv/chv backlight improvements (Ville)
- gen8+ render ctx w/a work from various people
- skl edp programming (Satheeshakrishna et al.)
- psr docbook (Rodrigo)
- piles of little fixes and improvements all over, as usual
* tag 'drm-intel-next-2014-11-21-fixed' of git://anongit.freedesktop.org/drm-intel: (117 commits)
drm/i915: Don't pin LRC in GGTT when dumping in debugfs
drm/i915: Update DRIVER_DATE to 20141121
drm/i915/g4x: fix g4x infoframe readout
drm/i915: Only call mod_timer() if not already pending
drm/i915: Don't rely upon encoder->type for infoframe hw state readout
drm/i915: remove the IRQs enabled WARN from intel_disable_gt_powersave
drm/i915: Use ggtt error obj capture helper for gen8 semaphores
drm/i915: vlv: increase timeout when setting idle GPU freq
drm/i915: vlv: fix cdclk setting during modeset while suspended
drm/i915: Dump hdmi pipe_config state
drm/i915: Gen9 shadowed registers
drm/i915/skl: Gen9 multi-engine forcewake
drm/i915: Read power well status before other registers for drpc info
drm/i915: Pin tiled objects for L-shaped configs
drm/i915: Update ring freq for full gpu freq range
drm/i915: change initial rps frequency for gen8
drm/i915: Keep min freq above floor on HSW/BDW
drm/i915: Use efficient frequency for HSW/BDW
drm/i915: Can i915_gem_init_ioctl
drm/i915: Sanitize ->lastclose
...
It happens on occasion that developers of generic user-space applications
abuse the dumb buffer API to get hold of drm buffers that they can both
mmap() and use for GPU acceleration, using the assumptions that dumb buffers
and buffers available for GPU are
a) The same type and can be aribtrarily type-casted.
b) fully coherent.
This patch makes the most widely used drivers warn nicely when that happens,
the next step will be to fail.
v2: Move drmP.h changes to drm_gem.h. Fix Radeon dumb mmap breakage.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Let's just throw in the towel on this one and take the cheap way out.
Based on a patch from Chris Wilson, but checking for a different bit.
Chris' patch checked for even bank layout, this one here for a magic
bit. Given the evidence we've gathered (not much) both work I think,
but checking for the magic bit might be more accurate.
Anyway, works on my gm45 here.
For paranoi restrict to gen4 (and mobile), since we've only ever seen
this on gm45 and i965gm.
Also add some debugfs output so that we can skip the tiled swapping
tests properly in these cases.
v2: Clean up the quirk'ed pin count in free_object to avoid upsetting
the WARN_ON. Spotted by Chris.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=28813
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45092
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Found one more!
With this we can clear up the ggtt init code a bit, yay!
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
With this all the ums nonsense around gem setup/teardown has
disappeared, yay!
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Again just complicates gem init functions and makes a general mess out
of everything.
Good riddance!
v2: In my enthusiasm to start removing dri1/ums crud I went overboard a
bit and killed parts of hangcheck. Resurrect it.
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
We've killed ums support by now, it's time to reap the benefits. This
one here is getting in the way of doing some ring init cleanup.
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
With the deprecation of UMS, and by association DRI1, we have a tough
choice when updating the ring access routines. We either rewrite the
DRI1 routines blindly without testing (so likely to be broken) or take
the liberty of declaring them no longer supported and remove them
entirely. This takes the latter approach.
v2: Also remove the DRI1 sarea updates
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Fix rebase conflicts.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Up until now, we have pinned every logical ring context backing object
during creation, and left it pinned until destruction. This made my life
easier, but it's a harmful thing to do, because we cause fragmentation
of the GGTT (and, eventually, we would run out of space).
This patch makes the pinning on-demand: the backing objects of the two
contexts that are written to the ELSP are pinned right before submission
and unpinned once the hardware is done with them. The only context that
is still pinned regardless is the global default one, so that the HWS can
still be accessed in the same way (ring->status_page).
v2: In the early version of this patch, we were pinning the context as
we put it into the ELSP: on the one hand, this is very efficient because
only a maximum two contexts are pinned at any given time, but on the other
hand, we cannot really pin in interrupt time :(
v3: Use a mutex rather than atomic_t to protect pin count to avoid races.
Do not unpin default context in free_request.
v4: Break out pin and unpin into functions. Fix style problems reported
by checkpatch
v5: Remove unpin_lock as all pinning and unpinning is done with the struct
mutex already locked. Add WARN_ONs to make sure this is the case in future.
Issue: VIZ-4277
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
Reviewed-by: Akash Goel <akash.goels@gmail.com>
Reviewed-by: Deepak S<deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When disabling the RPS interrupts there is a tricky dependency between
the thread disabling the interrupts, the RPS interrupt handler and the
corresponding RPS work. The RPS work can reenable the interrupts, so
there is no straightforward order in the disabling thread to (1) make
sure that any RPS work is flushed and to (2) disable all RPS
interrupts. Currently this is solved by masking the interrupts using two
separate mask registers (first level display IMR and PM IMR) and doing
the disabling when all first level interrupts are disabled.
This works, but the requirement to run with all first level interrupts
disabled is unnecessary making the suspend / unload time ordering of RPS
disabling wrt. other unitialization steps difficult and error prone.
Removing this restriction allows us to disable RPS early during suspend
/ unload and forget about it for the rest of the sequence. By adding a
more explicit method for avoiding the above race, it also becomes easier
to prove its correctness. Finally currently we can hit the WARN in
snb_update_pm_irq(), when a final RPS work runs with the first level
interrupts already disabled. This won't lead to any problem (due to the
separate interrupt masks), but with the change in this and the next
patch we can get rid of the WARN, while leaving it in place for other
scenarios.
To address the above points, add a new RPS interrupts_enabled flag and
use this during RPS disabling to avoid requeuing the RPS work and
reenabling of the RPS interrupts. Since the interrupt disabling happens
now in intel_suspend_gt_powersave(), we will disable RPS interrupts
explicitly during suspend (and not just through the first level mask),
but there is no problem doing so, it's also more consistent and allows
us to unify more of the RPS disabling during suspend and unload time in
the next patch.
v2/v3:
- rebase on patch "drm/i915: move rps irq disable one level up" in the
patchset
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In sandybridge_pcode_read and sandybridge_pcode_write,
extend the mbox parameter from u8 to u32.
On Haswell and Sandybridge, bits 7:0 encode the mailbox
command and bits 28:8 are used for address control for
specific commands.
Based on suggestion from Ville Syrjälä.
Signed-off-by: Tom O'Rourke <Tom.O'Rourke@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On skylake, DPLL 1, 2 and 3 can be used for DP and HDMI. The shared dpll
framework allows us to share those DPLLs among DDIs when possible.
The most tricky part is to provide a DPLL state that can be easily
compared. DPLL_CRTL1 is shared by all the DPLLs, 6 bits each. The
per-dpll crtl1 field of the hw state is then normalized to be the same
value if 2 DPLLs do indeed have identical values for those 6 bits.
v2: Port the code to the shared DPLL infrastructure (Damien)
v3: Rebase on top of Ander's clock computation staging work for atomic (Damien)
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> (v2)
Signed-off-by: Satheeshakrishna M <satheeshakrishna.m@intel.com> (v1)
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Adding structure/enum for SKL clocking implementation.
v2: Addressed Damien's comment
- Removed internal structure from this header file
v3: Stove this into the generic intel_dpll_id enum and give them the established
DPLL_ID_ prefixes. (Daniel)
v4: - We'll only try to share DPLL1/2/3, leaving DPLL0 to eDP
- Use SKL in the skylake shared DPLL names
- Re-add the skl_dpll enum
(Damien)
v5: Remove SKL_DPLL_NONE (Daniel)
v6: Modified as per review comments from Paulo
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Satheeshakrishna M <satheeshakrishna.m@intel.com> (v2)
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> (v4,v5)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v3)
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This is not used within the driver, and merely saving/restoring these
registers isn't going to do any good anyway. In fact, it's possible it's
actively harmful. Any code enabling the feature should handle this
completely in the regular platform specific enable/disable backlight
functions.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On VLV/CHV both pipes A and B have their own backlight control
registers. In order to correctly read out the current hardware state at
init we need to know which pipe is driving the eDP port. Pass that
information down from the eDP init code into the backlight code.
To determine the correct pipe we first look at which pipe is currently
configured in the port control register, if that look invalid we look
at which pipe's PPS is currently controlling the port, and if that
too looks invalid we just assume pipe A.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Currently objects for which the hardware needs a contiguous physical
address are allocated a shadow backing storage to satisfy the contraint.
This shadow buffer is not wired into the normal obj->pages and so the
physical object is incoherent with accesses via the GPU, GTT and CPU. By
setting up the appropriate scatter-gather table, we can allow userspace
to access the physical object via either a GTT mmaping of or by rendering
into the GEM bo. However, keeping the CPU mmap of the shmemfs backing
storage coherent with the contiguous shadow is not yet possible.
Fortuituously, CPU mmaps of objects requiring physical addresses are not
expected to be coherent anyway.
This allows the physical constraint of the GEM object to be transparent
to userspace and allow it to efficiently render into or update them via
the GTT and GPU.
v2: Fix leak of pci handle spotted by Ville
v3: Remove the now duplicate call to detach_phys_object during free.
v4: Wait for rendering before pwrite. As this patch makes it possible to
render into the phys object, we should make it correct as well!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>