Change the gen6+ max delay if the pcode read was successful (not the
inverse).
The previous code was all sorts of wrong and has existed since I broke
it:
commit 42c0526c93
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Wed Sep 26 10:34:00 2012 -0700
drm/i915: Extract PCU communication
I added some parentheses for clarity, and I also corrected the debug
message message to use the mask (wrong before I came along) and added a
print to show the value we're changing from.
Looking over the code, I'm not actually sure what we're trying to do. I
introduced the bug simply by extracting the function not implementing
anything new. We already set max_delay based on the capabilities
register (which is what we use elsewhere to determine min and max).
This would potentially increase it, I suppose? Jesse, I can't find the
document which explains the definitions of the pcode commands, maybe you
have it around.
Based on Jesse's response, this could potentially be for -fixes, or
stable, or maybe lead to us dropping it entirely. As the current code is
is, things won't completely break because of the aforementioned
capabilities register, and in my experimentation, enabling this has no
effect, it goes from 1100->1100.
I found this while reviewing Jesse's VLV patches.
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: Bikeshed-away the redudant parens spotted by Chris Wilson.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We'll re-enable select bits as needed after testing and power measurement.
v2: split out wake handling bits (Jani)
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Can prevent a hang when we get to tessellation. We need to set bit 15
as well for this workaround.
v2: update changelog with accurate info
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We could split this out into a separate routine at some point as an
optimization.
v2: use FORCEWAKE_KERNEL (Ville)
Note: Ville mentioned in his review that he declines to be responsible
if this blows up due to the lack of "readback a register != FW_ACK,
but from the same cacheline" magic we have in other forcewake
implementations.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[danvet: Bikeshed overtly long lines according to checkpatch.pl. Nope,
this time around I didn't screw up printk message since I've left
those alone.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
iQEcBAABAgAGBQJRRkrbAAoJEHm+PkMAQRiGy3oH/jrbHinYs0auurANgx4TdtWT
/WNajstKBqLOJJ6cnTR7sOqwOVlptt65EbbTs+qGyZ2Z2W/Lg0BMenHvNHo4ER8C
e7UbMdBCSLKBjAMKh1XCoZscGv4Exm8WRH3Vc5yP0Hafj3EzSAVLY1dta9WKKoQi
bh7D1ErUlbU1zczA1w5YbPF0LqFKRvyZOwebMCCAKAxv5wWAxmbcPNxVR4sufkjg
k6TkQ2ysgWivZAfy3tJYOcxiEu7ahpZVEuYdlZEJQXHRQUfoNljQlOp4BqKsYUai
5A0kaf2VpKay/7pkhvTfBBcF/jFJ68pYP6gQ2ThNdr0b5kOiAfMWj030Xyngnhg=
=iO9t
-----END PGP SIGNATURE-----
Merge tag 'v3.9-rc3' into drm-intel-next-queued
Backmerge so that I can merge Imre Deak's coalesced sg entries fixes,
which depend upon the new for_each_sg_page introduce in
commit a321e91b6d
Author: Imre Deak <imre.deak@intel.com>
Date: Wed Feb 27 17:02:56 2013 -0800
lib/scatterlist: add simple page iterator
The merge itself is just two trivial conflicts:
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We're starting to add many IS_HASWELL checks for the power well code,
so add a HAS_POWER_WELL macro to properly document that we're checking
for hardware that has the power down well.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
[danvet: Resolve conflicts since some converted code was added by
not-yet merged patches.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This increases GEN6_RC6p_THRESHOLD from 100000 to 150000. For some
reason this avoids the gen6_gt_check_fifodbg.isra warnings and
associated GPU lockups, which makes my ivy bridge machine stable.
Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Kill the HSW check from the single thread force wake code. HSW
uses MT force wake exclusively these days.
The commit that removed HSW single thread forcewake support:
commit 36ec8f8774
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Thu Oct 18 14:44:35 2012 +0200
drm/i915: unconditionally use mt forcewake on hsw/ivb
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Use the number '1' instead of FORCEWAKE_KERNEL when requesting single
thread force wake since there is only one bit in the register. Using
the FORCEWAKE_KERNEL name might give someone the wrong impression.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The MT forcewake ACK register also has a corresponding bit to each of
the bits in the MT forcewake register. Use the define we have for the
bit we care about instead of a hardcoded number.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This has been lost in the locking rework for intel_alloc_context_page:
commit 2c34b850ee
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Sat Mar 19 18:14:26 2011 -0700
drm/i915: fix ilk rc6 teardown locking
Cc: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Some early bios versions seem to ship with the wrong tuning values for
the MCH, possible resulting in pipe underruns under load. Especially
on DP outputs this can lead to black screen, since DP really doesn't
like an occasional whack from an underrun.
Unfortunately the registers seem to be locked after boot, so the only
thing we can do is politely point out issues and suggest a BIOS
upgrade.
Arthur Runyan pointed us at this issue while discussion DP bugs - thus
far no confirmation from a bug report yet that it helps. But at least
some of my machines here have wrong values, so this might be useful in
understanding bug reports.
v2: After a bit more discussion with Art and Ben we've decided to only
the check the watermark values, since the OREF ones could be be a
notch more aggressive on certain machines.
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Runyan, Arthur J <arthur.j.runyan@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This pulls in most of Linus tree up to -rc6, this fixes the worst lockdep
reported issues and re-enables fbcon lockdep.
(not the fbcon maintainer)
* 'fbcon-locking-fixes' of ssh://people.freedesktop.org/~airlied/linux: (529 commits)
Revert "Revert "console: implement lockdep support for console_lock""
fbcon: fix locking harder
fb: Yet another band-aid for fixing lockdep mess
fb: rework locking to fix lock ordering on takeover
Daniel writes:
"Probably the last feature pull for 3.9, there's some fixes outstanding
thought that I'd like to sneak in. And maybe 3.8 takes a bit longer ...
Anyway, highlights of this pull:
- Kill the horrible IS_DISPLAYREG hack to handle the mmio offset movements
on vlv, big thanks to Ville.
- Dynamic power well support for Haswell, shaves away a bit when only
using the eDP port on pipe A (Paulo). Plus unclaimed register fixes
uncovered by this.
- Clarifications of the gpu hang/reset state transitions, hopefully fixing
a few spurious -EIO deaths in userspace.
- Haswell ELD fixes.
- Some more (pp)gtt cleanups from Ben.
- A few smaller things all over.
Plus all the stuff from the previous rather small pull request:
- Broadcast RBG improvements and reduced color range fixes from Ville.
- Ben is on a "kill legacy gtt code for good" spree, first pile of patches
included.
- No-relocs and bo lut improvements for faster execbuf from Chris.
- Some refactorings from Imre."
* tag 'drm-intel-next-2013-02-01' of git://people.freedesktop.org/~danvet/drm-intel: (101 commits)
GPU/i915: Fix acpi_bus_get_device() check in drivers/gpu/drm/i915/intel_opregion.c
drm/i915: Set the SR01 "screen off" bit in i915_redisable_vga() too
drm/i915: Kill IS_DISPLAYREG()
drm/i915: Introduce i915_vgacntrl_reg()
drm/i915: gen6_gmch_remove can be static
drm/i915: dynamic Haswell display power well support
drm/i915: check the power down well on assert_pipe()
drm/i915: don't send DP "idle" pattern before "normal" on HSW PORT_A
drm/i915: don't run hsw power well code on !hsw
drm/i915: kill cargo-culted locking from power well code
drm/i915: Only run idle processing from i915_gem_retire_requests_worker
drm/i915: Fix CAGF for HSW
drm/i915: Reclaim GTT space for failed PPGTT
drm/i915: remove intel_gtt structure
drm/i915: Add probe and remove to the gtt ops
drm/i915: extract hw ppgtt setup/cleanup code
drm/i915: pte_encode is gen6+
drm/i915: vfuncs for ppgtt
drm/i915: vfuncs for gtt_clear_range/insert_entries
drm/i915: Error state should print /sys/kernel/debug
...
We may not concurrently change the power wells code. Which
is already guaranteed since modesets aren't concurrent. That
leaves races against setup/teardown/suspend/resume, and for
those we already (try) rather hard not to hit concurrent
modesets.
No debug WARN_ON added since that would require us to grab the
modeset locks in init/suspend code. Which is again just cargo
culting since just grabbing the locks in those paths isn't good
enough, we need the right order of operations, too.
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Implements WaVSRefCountFullforceMissDisable as documented in the BSpec
3D workarounds chapter.
Cc: Paulo Zanoni <przanoni@gmail.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Our suspend code touches a lot of registers all over the place, so we
need to enable the power well before suspending.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
[danvet: Fixup compilation by stealing the header decl from the
dynamic power wells patch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The current code was wrong in many different ways, so this is a full
rewrite. We don't have "different power wells for different parts of
the GPU", we have a single power well, but we have multiple registers
that can be used to request enabling/disabling the power well. So
let's be a good citizen and only use the register we're suppose to
use, except when we're loading the driver, where we clear the request
made by the BIOS.
If any of the registers is requesting the power well to be enabled, it
will be enabled. If none of the registers is requesting the power well
to be enabled, it will be disabled.
For now we're just forcing the power well to be enabled, but in the
next commits we'll change this.
V2:
- Remove debug messages that could be misleading due to possible
race conditions with KVMr, Debug and BIOS.
- Don't wait on disabling: after a conversaion with a hardware
engineer we discovered that the "restriction" on bit 31 is just
for the "enable" case, and we don't even need to wait on the
"disable" case.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The aim of this locking rework is that ioctls which a compositor should be
might call for every frame (set_cursor, page_flip, addfb, rmfb and
getfb/create_handle) should not be able to block on kms background
activities like output detection. And since each EDID read takes about
25ms (in the best case), that always means we'll drop at least one frame.
The solution is to add per-crtc locking for these ioctls, and restrict
background activities to only use the global lock. Change-the-world type
of events (modeset, dpms, ...) need to grab all locks.
Two tricky parts arose in the conversion:
- A lot of current code assumes that a kms fb object can't disappear while
holding the global lock, since the current code serializes fb
destruction with it. Hence proper lifetime management using the already
created refcounting for fbs need to be instantiated for all ioctls and
interfaces/users.
- The rmfb ioctl removes the to-be-deleted fb from all active users. But
unconditionally taking the global kms lock to do so introduces an
unacceptable potential stall point. And obviously changing the userspace
abi isn't on the table, either. Hence this conversion opportunistically
checks whether the rmfb ioctl holds the very last reference, which
guarantees that the fb isn't in active use on any crtc or plane (thanks
to the conversion to the new lifetime rules using proper refcounting).
Only if this is not the case will the code go through the slowpath and
grab all modeset locks. Sane compositors will never hit this path and so
avoid the stall, but userspace relying on these semantics will also not
break.
All these cases are exercised by the newly added subtests for the i-g-t
kms_flip, tested on a machine where a full detect cycle takes around 100
ms. It works, and no frames are dropped any more with these patches
applied. kms_flip also contains a special case to exercise the
above-describe rmfb slowpath.
* 'drm-kms-locking' of git://people.freedesktop.org/~danvet/drm-intel: (335 commits)
drm/fb_helper: check whether fbcon is bound
drm/doc: updates for new framebuffer lifetime rules
drm: don't hold crtc mutexes for connector ->detect callbacks
drm: only grab the crtc lock for pageflips
drm: optimize drm_framebuffer_remove
drm/vmwgfx: add proper framebuffer refcounting
drm/i915: dump refcount into framebuffer debugfs file
drm: refcounting for crtc framebuffers
drm: refcounting for sprite framebuffers
drm: fb refcounting for dirtyfb_ioctl
drm: don't take modeset locks in getfb ioctl
drm: push modeset_lock_all into ->fb_create driver callbacks
drm: nest modeset locks within fpriv->fbs_lock
drm: reference framebuffers which are on the idr
drm: revamp framebuffer cleanup interfaces
drm: create drm_framebuffer_lookup
drm: revamp locking around fb creation/destruction
drm: only take the crtc lock for ->cursor_move
drm: only take the crtc lock for ->cursor_set
drm: add per-crtc locks
...
Daniel writes:
- seqno wrap fixes and debug infrastructure from Mika Kuoppala and Chris
Wilson
- some leftover kill-agp on gen6+ patches from Ben
- hotplug improvements from Damien
- clear fb when allocated from stolen, avoids dirt on the fbcon (Chris)
- Stolen mem support from Chris Wilson, one of the many steps to get to
real fastboot support.
- Some DDI code cleanups from Paulo.
- Some refactorings around lvds and dp code.
- some random little bits&pieces
* tag 'drm-intel-next-2012-12-21' of git://people.freedesktop.org/~danvet/drm-intel: (93 commits)
drm/i915: Return the real error code from intel_set_mode()
drm/i915: Make GSM void
drm/i915: Move GSM mapping into dev_priv
drm/i915: Move even more gtt code to i915_gem_gtt
drm/i915: Make next_seqno debugs entry to use i915_gem_set_seqno
drm/i915: Introduce i915_gem_set_seqno()
drm/i915: Always clear semaphore mboxes on seqno wrap
drm/i915: Initialize hardware semaphore state on ring init
drm/i915: Introduce ring set_seqno
drm/i915: Missed conversion to gtt_pte_t
drm/i915: Bug on unsupported swizzled platforms
drm/i915: BUG() if fences are used on unsupported platform
drm/i915: fixup overlay stolen memory leak
drm/i915: clean up PIPECONF bpc #defines
drm/i915: add intel_dp_set_signal_levels
drm/i915: remove leftover display.update_wm assignment
drm/i915: check for the PCH when setting pch_transcoder
drm/i915: Clear the stolen fb before enabling
drm/i915: Access to snooped system memory through the GTT is incoherent
drm/i915: Remove stale comment about intel_dp_detect()
...
Conflicts:
drivers/gpu/drm/i915/intel_display.c
We stopped reading FORCEWAKE for posting reads in
commit 8dee3eea3c
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Sat Sep 1 22:59:50 2012 -0700
drm/i915: Never read FORCEWAKE
and started using something from the same cacheline instead. On the
bug reporter's machine this broke entering rc6 states after a
suspend/resume cycle. It turns out reading ECOBUS as posting read
worked fine, while GTFIFODBG did not, preventing RC6 states after
suspend/resume per the bug report referenced below. It's not entirely
clear why, but clearly GTFIFODBG was nowhere near the same cacheline
or address range as FORCEWAKE.
Trying out various registers for posting reads showed that all tested
registers for which NEEDS_FORCE_WAKE() (in i915_drv.c) returns true
work. Conversely, most (but not quite all) registers for which
NEEDS_FORCE_WAKE() returns false do not work. Details in the referenced
bug.
Based on the above, add posting reads on ECOBUS where GTFIFODBG was
previously relied on.
In true cargo cult spirit, add posting reads for FORCEWAKE_VLV writes as
well, but instead of ECOBUS, use FORCEWAKE_ACK_VLV which is in the same
address range as FORCEWAKE_VLV.
v2: Add more details to the commit message. No functional changes.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=52411
Reported-and-tested-by: Alexander Bersenev <bay@hackerdom.ru>
CC: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
[danvet: add cc: stable and make the commit message a bit clearer that
this is a regression fix and what exactly broke.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Prevent a divide-by-zero by consistently treating an 'active' CRTC
without a mode set as actually disabled.
This looks to have been first introduced with
commit 2492935248
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Mon Jul 2 20:28:59 2012 +0200
drm/i915: read out the modeset hw state at load and resume time
but then combined with
commit b0a2658acb
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Tue Dec 18 09:37:54 2012 +0100
drm/i915: don't disable disconnected outputs
it finally started oopsing.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reported-and-tested-by: Alexey Zaytsev <alexey.zaytsev@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Some fixes for 3.8:
- Watermark fixups from Chris Wilson (4 pieces).
- 2 snb workarounds, seem to be recently added to our internal DB.
- workaround for the infamous i830/i845 hang, seems now finally solid!
Based on Chris' fix for SNA, now also for UXA/mesa&old SNA.
- Some more fixlets for shrinker-pulls-the-rug issues (Chris&me).
- Fix dma-buf flags when exporting (you).
- Disable the VGA plane if it's enabled on lid open - similar fix in
spirit to the one I've sent you last weeek, BIOS' really like to mess
with the display when closing the lid (awesome debug work from Krzysztof
Mazur).
* 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel:
drm/i915: disable shrinker lock stealing for create_mmap_offset
drm/i915: optionally disable shrinker lock stealing
drm/i915: fix flags in dma buf exporting
i915: ensure that VGA plane is disabled
drm/i915: Preallocate the drm_mm_node prior to manipulating the GTT drm_mm manager
drm: Export routines for inserting preallocated nodes into the mm manager
drm/i915: don't disable disconnected outputs
drm/i915: Implement workaround for broken CS tlb on i830/845
drm/i915: Implement WaSetupGtModeTdRowDispatch
drm/i915: Implement WaDisableHiZPlanesWhenMSAAEnabled
drm/i915: Prefer CRTC 'active' rather than 'enabled' during WM computations
drm/i915: Clear self-refresh watermarks when disabled
drm/i915: Double the cursor self-refresh latency on Valleyview
drm/i915: Fixup cursor latency used for IVB lp3 watermarks
I'm not really sure, since the w/a entry is as thin on details as
ever, and Bspec doesn't say anything about it. But I've figured only
dispatching to rows 0&1 instead of all four should be the right thing
for GT1.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
[danvet: Add the missing snb server GT1 to the check, spotted by Chris
Wilson.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Quoting from Bspec, 3D_CHICKEN1, bit 10
This bit needs to be set always to "1", Project: DevSNB "
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Only the intel_crtc->active is accurate at the point where we wish to
perform WM computations, so use it instead of crtc->enabled.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If we elect to disable self-refresh as they require too many FIFO
entries, clear the values prior to writing them into the registers. If
they are too large they may occupy more bits than available and so
corrupt neighbouring WM values.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It operates at twice the declared latency, so double the latency value
used for the cursor watermark calculation.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50248
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It operates at twice the declared latency, so adjust the computation to
avoid potential flicker at low power.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50248
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
CC: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Daniel writes:
A few leftover fixes for 3.8:
- VIC support for hdmi infoframes with the associated drm helper, fixes
some black TVs (Paulo Zanoni)
- Modeset state check (and fixup if the BIOS messed with the hw) for
lid-open. modeset-rework fallout. Somehow the original reporter went
awol, so this stalled for way too long until we've found a new
victim^Wreporter with broken BIOS.
- seqno wrap fixes from Mika and Chris.
- Some minor fixes all over from various people.
- Another race fix in the pageflip vs. unpin code from Chris.
- hsw vga resume support and a few more fdi link fixes (only used for vga
on hsw) from Paulo.
- Regression fix for DMAR from Zhenyu Wang - I've scavenged memory from my
DMAR for a while and it broke right away :(
- Regression fix from Takashi Iwai for ivb lvds - some w/a needs to be
(partially) moved back into place. Note that these are regressions in
-next.
- One more fix for ivb 3 pipe support - it now actually seems to work.
* 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel: (25 commits)
drm/i915: Fix missed needs_dmar setting
drm/i915: Fix shifted screen on top of LVDS on IVY laptop
drm/i915: disable cpt phase pointer fdi rx workaround
drm/i915: set the LPT FDI RX polarity reversal bit when needed
drm/i915: add lpt_init_pch_refclk
drm/i915: add support for mPHY destination on intel_sbi_{read, write}
drm/i915: reject modes the LPT FDI receiver can't handle
drm/i915: fix hsw_fdi_link_train "retry" code
drm/i915: Close race between processing unpin task and queueing the flip
drm/i915: fixup l3 parity sysfs access check
drm/i915: Clear the existing watermarks for g4x when modifying the cursor sr
drm/i915: do not access BLC_PWM_CTL2 on pre-gen4 hardware
drm/i915: Don't allow ring tail to reach the same cacheline as head
drm/i915: Decouple the object from the unbound list before freeing pages
drm/i915: Set sync_seqno properly after seqno wrap
drm/i915: Include the last semaphore sync point in the error-state
drm/i915: Rearrange code to only have a single method for waiting upon the ring
drm/i915: Simplify flushing activity on the ring
drm/i915: Preallocate next seqno before touching the ring
drm/i915: force restore on lid open
...
The commit [23670b322: drm/i915: CPT+ pch transcoder workaround]
caused a regression on some HP laptops with IvyBridge. The whole
laptop screen is shifted downward for a few pixels constantly.
The problem appears only on LVDS while DP and VGA seem unaffected.
Also, the problem disappears once when go and back from S3.
(S4 resume still shows the same problem.)
This patch revives the minimum part the commit above dropped.
For fixing this regression, only the setup of CHICKEN2 bit in
cpt_init_clock_gating() is needed.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Alex writes:
Pretty minor -next pull request. We some additional new bits waiting
internally for release. Hopefully Monday we can get at least some of
them out. The others will probably take a few more weeks.
Highlights of the current request:
- ELD registers for passing audio information to the sound hardware
- Handle GPUVM page faults more gracefully
- Misc fixes
Merge radeon test
* 'drm-next-3.8' of git://people.freedesktop.org/~agd5f/linux: (483 commits)
drm/radeon: bump driver version for new info ioctl requests
drm/radeon: fix eDP clk and lane setup for scaled modes
drm/radeon: add new INFO ioctl requests
drm/radeon/dce32+: use fractional fb dividers for high clocks
drm/radeon: use cached memory when evicting for vram on non agp
drm/radeon: add a CS flag END_OF_FRAME
drm/radeon: stop page faults from hanging the system (v2)
drm/radeon/dce4/5: add registers for ELD handling
drm/radeon/dce3.2: add registers for ELD handling
radeon: fix pll/ctrc mapping on dce2 and dce3 hardware
Linux 3.7-rc7
powerpc/eeh: Do not invalidate PE properly
Revert "drm/i915: enable rc6 on ilk again"
ALSA: hda - Fix build without CONFIG_PM
of/address: sparc: Declare of_iomap as an extern function for sparc again
PM / QoS: fix wrong error-checking condition
bnx2x: remove redundant warning log
vxlan: fix command usage in its doc
8139cp: revert "set ring address before enabling receiver"
MPI: Fix compilation on MIPS with GCC 4.4 and newer
...
Conflicts:
drivers/gpu/drm/exynos/exynos_drm_encoder.c
drivers/gpu/drm/exynos/exynos_drm_fbdev.c
drivers/gpu/drm/nouveau/core/engine/disp/nv50.c
In a couple of places we attempt to adjust the existing watermark
registers to update them for the new cursor watermarks. This goes
horribly wrong as instead of clearing the cursor bits prior to or'ing in
the new values, we clear the rest of the register with the result that
the watermark registers contain bogus values.
References: https://bugs.freedesktop.org/show_bug.cgi?id=47034
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
As FBC is commonly disabled due to limitations of the chipset upon
output configurations, on many systems FBC is never enabled. For those
systems, it is advantageous to make use of the stolen memory for other
objects and so we defer allocation of the FBC chunk until we actually
require it. This increases the likelihood of that allocation failing,
but that in turns means that we are already taking advantage of the
stolen memory!
As well as delaying the allocation from driver initialisation until the
first use of FBC, we also return the stolen block after we finish using
it - allowing greater flexibility in our usage of stolen space. A side
effect of this is that we can then attempt to allocate only the required
amount of space (with a little slack to reduce reallocation rate and
avoid fragmentation).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Replace the wait for the ring to be clear with the more common wait for
the ring to be idle. The principle advantage is one less exported
intel_ring_wait function, and the removal of a hardcoded value.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Even with the cumulative set of ilk w/a, rc6 is demonstrably still
failing and causing GPU hangs as found by Peter Wu. So we need to disable
it again until it is stable.
This reverts
commit 456470eb58
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Wed Aug 8 23:35:40 2012 +0200
drm/i915: enable rc6 on ilk again
and the follow-on
commit cd7988eea5
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Sun Aug 26 20:33:18 2012 +0200
drm/i915: disable rc6 on ilk when vt-d is enabled
Note: The situation around the gen4/5 gpu hangs that cropped up in 3.7
is rather strange. Most useful bisects have lead to
commit 6c085a728c
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Mon Aug 20 11:40:46 2012 +0200
drm/i915: Track unbound pages
or even later commits that affect the gem bo recycling, which all is
way past the point where we re-enabled rc6. But somehow
reverting/disabling those commits doesn't help, but disabling rc6 at
least helps for many hangs on ilk. Obviously it doesn't change
anything at all on gen4, and there are still strange issues left on
gen5 (which we unfortunately can't readily reproduce).
Also, the error_state signature of the hangs which can be fixed with
this patch look remarkably different to those which seem to be
unaffected by the rc6 settings: The rc6 hangs are in the ring,
somewhere in the MI_FLUSH/PIPE_CONTROL sequence to make ilk coherent,
wheras all the other hangs tend to be at a random point in the middle
of the user batch. So it could also be that we have different issues.
Until we grow more clue, this at least helps some users.
Reported-by: Peter Wu <lekensteyn@gmail.com>
References: https://bugs.freedesktop.org/show_bug.cgi?id=55984
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Added note with some more details about the gen4/5 3.7
gpu hang regression.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Also document the WA name for the previous gens that implement it.
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We need to enable a special bit, otherwise none of the DP functions
requiring the PCH will work.
Version 2: store the PCH ID inside dev_priv, as suggested by Daniel
Vetter.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
DIV_ROUND_CLOSEST is faster if the compiler knows it will only be
dealing with unsigned dividends. This optimization rips 32 bytes of
binary code on x86_64.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The specs for gen2 say that the watermark values "should always be set
assuming a 32bpp display mode, even though the display mode may be 15 or
16 bpp."
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This allows the power related code to run independently of the rest of
the pipeline, extending the resume and init time improvements into
userspace, which would otherwise have been blocked on the struct mutex
if we were doing PCU communication.
v2: Also convert the locking for the rps sysfs interface.
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Communicating via the mailbox registers with the PCU can take quite
awhile. And updating the ring frequency or enabling turbo is not
something that needs to happen synchronously, so take it out of our init
and resume paths to speed things up (~200ms on my T420).
v2: add comment about why we use a work queue (Daniel)
make sure work queue is idle on suspend (Daniel)
use a delayed work queue since there's no hurry (Daniel)
v3: make cleanup symmetric and just call cancel work directly (Daniel)
v4: schedule the work using round_jiffies_up to batch work better (Chris)
v5: fix the right schedule_delayed_work call (Chris)
References: https://bugs.freedesktop.org/show_bug.cgi?id=54089
Signed-of-by: Jesse Barnes <jbarnes@virtuougseek.org>
[danvet: bikeshed the placement of the new delayed work, move it to
all the other gen6 power mgmt stuff.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Workaround for dual port PS dispatch on GT1.
v2: pull in register definition & offset handling
v3: use IVB GT1 macro to get the right regs (Ben)
v4: add for VLV too (Ben)
v5: don't read the reg, it's masked so we'll only enable the one extra bit (Chris)
v6: use a _GT2 suffix for the second reg (Chris)
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Antti Koskipää <antti.koskipaa@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This allows us to get the right vblank interrupt frequency.
v2: pull in register definition
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Antti Koskipää <antti.koskipaa@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Needs to be set on every context restore as well, so set it as part of
the initial state so we can save/restore it. Note this removes the IVB
workaround value from VLV and uses the default value, just adding in the
L3 cache aging disable bit, since the IVB value is wrong for VLV.
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Antti Koskipää <antti.koskipaa@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>