linux-next

mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-27 14:43:58 +08:00

Author	SHA1	Message	Date
Joonas Lahtinen	991914274b	drm/i915: Catch non-existent registers in find_fw_domain Add WARN_ON to find_fw_domain to registers related to uninitialized hardware. v2: - Print the uninitialized domains and register (Chris) Cc: Imre Deak <imre.deak@intel.com> Cc: Wang Elaine <elaine.wang@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1481120559-17413-1-git-send-email-joonas.lahtinen@linux.intel.com	2016-12-08 09:37:23 +02:00
Jani Nikula	73f67aa8cc	drm/i915: distinguish G33 and Pineview from each other Pineview deserves to use its own platform enum (which was already added, unused, previously). IS_G33() no longer matches Pineview, and gets replaced by IS_G33() \|\| IS_PINEVIEW() or equivalent. Pineview is no longer an outlier among platform definitions. Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1481143689-19672-1-git-send-email-jani.nikula@intel.com	2016-12-07 23:28:33 +02:00
Ander Conselvan de Oliveira	a3f79ca63b	drm/i915: Don't sanitize has_decoupled_mmio if platform is not broxton The check in __intel_uncore_early_sanitize() to disable decoupled mmio would disable it for every platform that is not broxton. While that's not a problem now since only broxton supports that, simply setting .has_decoupled_mmio in a new platform's device info wouldn't suffice. So avoid future confusion and change the workaround to only change the value of has_decoupled_mmio for broxton. v2: git add compile fix. (Ander) Cc: Praveen Paneri <praveen.paneri@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Imre Deak <imre.deak@intel.com> Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1479993807-29353-1-git-send-email-ander.conselvan.de.oliveira@intel.com	2016-11-25 16:43:45 +02:00
Tvrtko Ursulin	a194b8cb84	drm/i915: Waterproof verification of gen9 forcewake table ranges We have to make sure there are no holes in the table in Gen9. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1479388435-12062-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-11-17 15:37:44 +00:00
Tvrtko Ursulin	66478475b5	drm/i915: Assorted INTEL_INFO(dev) cleanups A bunch of source files with just a few instances of the incorrect INTEL_INFO use. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-11-17 13:56:35 +00:00
Tvrtko Ursulin	78424c927c	drm/i915: Fix gen9 forcewake range table Commit `0dd356bb6f` ("drm/i915: Eliminate Gen9 special case") accidentaly dropped a MMIO range between 0xc000 to 0xcfff out of the blitter forcewake domain. Fix it. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: `0dd356bb6f` ("drm/i915: Eliminate Gen9 special case") Reported-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1479373363-16528-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-11-17 13:15:17 +00:00
Praveen Paneri	85ee17ebee	drm/i915/bxt: Broxton decoupled MMIO Decoupled MMIO is an alternative way to access forcewake domain registers, which requires less cycles for a single read/write and avoids frequent software forcewake. This certainly gives advantage over the forcewake as this new mechanism “decouples” CPU cycles and allow them to complete even when GT is in a CPD (frequency change) or C6 state. This can co-exist with forcewake and we will continue to use forcewake as appropriate. E.g. 64-bit register writes to avoid writing 2 dwords separately and land into funny situations. v2: - Moved platform check out of the function and got rid of duplicate functions to find out decoupled power domain (Chris) - Added a check for forcewake already held and skipped decoupled access (Chris) - Skipped writing 64 bit registers through decoupled MMIO (Chris) v3: - Improved commit message with more info on decoupled mmio (Tvrtko) - Changed decoupled operation to enum and used u32 instead of uint_32 data type for register offset (Tvrtko) - Moved HAS_DECOUPLED_MMIO to device info (Tvrtko) - Added lookup table for converting fw_engine to pd_engine (Tvrtko) - Improved __gen9_decoupled_read and __gen9_decoupled_write routines (Tvrtko) v4: - Fixed alignment and variable names (Chris) - Write GEN9_DECOUPLED_REG0_DW1 register in just one go (Zhe Wang) v5: - Changed HAS_DECOUPLED_MMIO() argument name to dev_priv (Tvrtko) - Sanitize info->had_decoupled_mmio at init (Chris) Signed-off-by: Zhe Wang <zhe1.wang@intel.com> Signed-off-by: Praveen Paneri <praveen.paneri@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1479230360-22395-1-git-send-email-praveen.paneri@intel.com	2016-11-16 09:33:10 +00:00
Tvrtko Ursulin	9480dbf074	drm/i915: Inline binary search Instead of using bsearch library function make a local generator macro out of it so the comparison callback can be inlined. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1475569769-31108-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-10-04 11:10:01 +01:00
Tvrtko Ursulin	5a65938381	drm/i915: Use binary search when looking for shadowed registers Simply replace the linear search with the kernel's binary search implementation. There is only six registers currently in that table so this may not be that interesting. It adds a function call so hopefully remains performance neutral for now. v2: No need for manual conversion to bool for return. (Joonas Lahtinen) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-10-04 11:09:59 +01:00
Tvrtko Ursulin	47188574a9	drm/i915: Sort the shadow register table Also verify the order at runtime. This was we can start using binary search on it in a following patch. v2: Add comment on the sorted array and only check it when debug option is enabled. v3: Use IS_ENABLED. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v1) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-10-04 11:09:59 +01:00
Tvrtko Ursulin	22d48c55ba	drm/i915: Remove identical write mmmio functions We notice two identical copies of the shadow register table and following from that removal can also unify CHV and Gen9 write mmio functions and macros into a single implementation. v2: Name fwtable consistently and use HAS_FWTABLE. (Joonas Lahtinen) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-10-04 11:09:59 +01:00
Tvrtko Ursulin	6044c4a371	drm/i915: Remove identical mmio read functions It is now obvious VLV, CHV and Gen9 mmio read fcuntions are completely identical so we can remove the three copies and just keep the newly named generic implementation. v2: Use fwtable naming consistently. (Joonas Lahtinen) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-10-04 11:09:59 +01:00
Tvrtko Ursulin	895833bd97	drm/i915: Remove identical macros Remove some macros which are now obviously identical. v2: Added HAS_FWTABLE macro and simplified intel_uncore_forcewake_for_read. (Joonas Lahtinen) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-10-04 11:09:59 +01:00
Tvrtko Ursulin	15157970f7	drm/i915: Store the active forcewake range table pointer If we store this in the uncore structure we are on a good way to show more commonality between the per-platform implementations. v2: Constify table pointer and correct coding style. (Chris Wilson) v3: Rebase. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-10-04 11:09:59 +01:00
Tvrtko Ursulin	0dd356bb6f	drm/i915: Eliminate Gen9 special case If we insert blitter forcewake domain entries in the range table we can eliminate that special case and simplify the code in a few macros. This will enable more unification later. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-10-04 11:09:59 +01:00
Tvrtko Ursulin	91e630b9e6	drm/i915: Use binary search when looking up forcewake domains Instead of the existing linear seach, now that we have sorted range tables, we can do a binary search on them for some potential miniscule performance gain, but more importantly for elegance and code size. Hopefully the perfomance gain is sufficient to offset the function calls which were not there before. v2: Removed const cast away. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-10-04 11:09:59 +01:00
Tvrtko Ursulin	b008123966	drm/i915: Sort forcewake mapping tables Sorting the tables (verified at runtime to help during development) is another prerequisite for interesting work which will follow. v2: * Remove const away cast and improve comments. (Chris Wilson) * Check tables only when debug option is enabled. v3: Use IS_ENABLED. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-10-04 11:09:59 +01:00
Tvrtko Ursulin	9fc1117cf8	drm/i915: Data driven register to forcewake domains lookup Move finding the correct forcewake domains to take for register access from code to a mapping table. This will allow more interesting work in the following patches and is easier to review if singled out early. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-10-04 11:09:59 +01:00
Tvrtko Ursulin	c521b0c898	drm/i915: Do not inline forcewake taking in mmio accessors Once we know we need to take new forcewakes, that being a slow operation, it does not make sense to inline that code into every mmio accessor. Move it to a separate function and save some code. v2: Be explicit with noinline and remove stale comment. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-10-04 11:09:59 +01:00
Tvrtko Ursulin	003342a500	drm/i915: Keep track of active forcewake domains in a bitmask There are current places in the code, and there will be more in the future, which iterate the forcewake domains to find out which ones are currently active. To save them from doing this iteration, we can cheaply keep a mask of active domains in dev_priv->uncore.fw_domains_active. This has no cost in terms of object size, even manages to shrink it overall by 368 bytes on my config. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: "Paneri, Praveen" <praveen.paneri@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-10-04 11:09:59 +01:00
Tvrtko Ursulin	e9b825f4e9	drm/i915: Remove redundant hsw_write* mmio functions They are completely identical to gen6_write* ones. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-10-04 11:09:59 +01:00
Chris Wilson	dda960335e	drm/i915: Just clear the mmiodebug before a register access When we enable the per-register access mmiodebug, it is to detect which access is illegal. Reporting on earlier untraced access outside of the mmiodebug does not help debugging (as the suspicion is immediately put upon the current register which is not at fault)! References: https://bugs.freedesktop.org/show_bug.cgi?id=97985 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Cc: stable@vger.kernel.org Link: http://patchwork.freedesktop.org/patch/msgid/20161003124516.12388-1-chris@chris-wilson.co.uk	2016-10-03 20:59:54 +01:00
Chris Wilson	b18c1bb48e	drm/i915: Remove 64b mmio write vfuncs We don't have safe 64-bit mmio writes as they are really split into 2x32-bit writes. This tearing is dangerous as the hardware will operate on the intermediate value, requiring great care when assigning. (See, for example, i965_write_fence_reg.) As such we don't currently use them and strongly advise not to us them. Go one step further and remove the 64-bit write vfuncs. v2: Add some more details to the comment about why WRITE64 is absent, and why you need to think twice before using READ64. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160906144538.4204-1-chris@chris-wilson.co.uk	2016-09-06 17:51:01 +01:00
Chris Wilson	bafb0fced9	drm/i915: Make for_each_engine_masked() more compact and quicker Rather than walk the full array of engines checking whether each is in the mask in turn, we can use the mask to jump to the right engines. This should quicker for a sparse array of engines or mask, whilst generating smaller code: text data bss dec hex filename 1251010 4579 800 1256389 132bc5 drivers/gpu/drm/i915/i915.ko 1250530 4579 800 1255909 1329e5 drivers/gpu/drm/i915/i915.ko The downside is that we have to pass in a temporary, alas no C99 iterators yet. [P.S. Joonas doesn't like having to pass extra temporaries into the macro, and even less that I called them tmp. As yet, we haven't found a macro that avoids passing in a temporary that is smaller. We probably will get C99 iterators first!] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20160827075401.16470-2-chris@chris-wilson.co.uk	2016-08-27 09:42:53 +01:00
Chris Wilson	54b4f68f18	Revert "drm/i915: Enable RC6 immediately" This reverts commit `b12e0ee208` ("drm/i915: Enable RC6 immediately"), as it was never meant to be sent anywhere other than the bug report for experimentation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1469132179-4052-1-git-send-email-chris@chris-wilson.co.uk Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2016-07-21 21:43:19 +01:00
Chris Wilson	b12e0ee208	drm/i915: Enable RC6 immediately Now that PCU communication is reasonably fast, we do not need to defer RC6 initialisation to a workqueue. References: https://bugs.freedesktop.org/show_bug.cgi?id=97017 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-07-21 18:30:22 +01:00
Daniel Vetter	3d466cd67e	drm/i915: Fixup kerneldoc code snippets in intel_uncore.c We need :: before, blank lines around and indentation with 4 _additional_ spaces to make it work. Also, don't use @param in code snippets, it results in confusion. Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/1468612088-9721-8-git-send-email-daniel.vetter@ffwll.ch	2016-07-19 10:34:24 +02:00
Chris Wilson	b7137e0cf1	drm/i915: Defer enabling rc6 til after we submit the first batch/context Some hardware requires a valid render context before it can initiate rc6 power gating of the GPU; the default state of the GPU is not sufficient and may lead to undefined behaviour. The first execution of any batch will load the "golden render state", at which point it is safe to enable rc6. As we do not forcibly load the kernel context at resume, we have to hook into the batch submission to be sure that the render state is setup before enabling rc6. However, since we don't enable powersaving until that first batch, we queued a delayed task in order to guarantee that the batch is indeed submitted. v2: Rearrange intel_disable_gt_powersave() to match. v3: Apply user specified cur_freq (or idle_freq if not set). v4: Give in, and supply a delayed work to autoenable rc6 v5: Mika suggested a couple of better names for delayed_resume_work v6: Rebalance rpm_put around the autoenable task Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1468397438-21226-7-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>	2016-07-14 15:24:21 +01:00
Chris Wilson	91c8a326a1	drm/i915: Convert dev_priv->dev backpointers to dev_priv->drm Since drm_i915_private is now a subclass of drm_device we do not need to chase the drm_i915_private->dev backpointer and can instead simply access drm_i915_private->drm directly. text data bss dec hex filename `1068757` 4565 416 1073738 10624a drivers/gpu/drm/i915/i915.ko 1066949 4565 416 1071930 105b3a drivers/gpu/drm/i915/i915.ko Created by the coccinelle script: @@ struct drm_i915_private *d; identifier i; @@ ( - d->dev->i + d->drm.i \| - d->dev + &d->drm ) and for good measure the dev_priv->dev backpointer was removed entirely. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467711623-2905-4-git-send-email-chris@chris-wilson.co.uk	2016-07-05 11:58:45 +01:00
Chris Wilson	fac5e23e3c	drm/i915: Mass convert dev->dev_private to to_i915(dev) Since we now subclass struct drm_device, we can save pointer dances by noting the equivalence of struct drm_device and struct drm_i915_private, i.e. by using to_i915(). text data bss dec hex filename 1073824 4562 416 1078802 107612 drivers/gpu/drm/i915/i915.ko 1068976 4562 416 1073954 106322 drivers/gpu/drm/i915/i915.ko Created by the coccinelle script: @@ expression E; identifier p; @@ - struct drm_i915_private p = E->dev_private; + struct drm_i915_private p = to_i915(E); Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467628477-25379-1-git-send-email-chris@chris-wilson.co.uk	2016-07-04 12:54:07 +01:00
Chris Wilson	556ab7a6de	drm/i915: Hold irq uncore.lock when initialising fw_domains Acquiring the forcewake domain asserts that it is in an atomic section (as we always expect to be under the uncore.lock). This is true except for initialising the domains on Ivybridge, and so we generate a warning. Wrap the manual usage of fw_domains inside the spin_lock. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467566973-13596-1-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2016-07-04 10:27:37 +01:00
Chris Wilson	4a17fe13c0	drm/i915: Convert wait_for(I915_READ(reg)) to intel_wait_for_register() By using the out-of-line intel_wait_for_register() not only do we can efficiency from using the hybrid wait_for() contained within, but we avoid code bloat from the numerous inlined loops, in total (all patches): text data bss dec hex filename 1078551 4557 416 1083524 108884 drivers/gpu/drm/i915/i915.ko 1070775 4557 416 1075748 106a24 drivers/gpu/drm/i915/i915.ko Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467297225-21379-61-git-send-email-chris@chris-wilson.co.uk	2016-06-30 17:05:07 +01:00
Chris Wilson	87273b7110	drm/i915: Convert wait_for(I915_READ(reg)) to intel_wait_for_register() By using the out-of-line intel_wait_for_register() not only do we can efficiency from using the hybrid wait_for() contained within, but we avoid code bloat from the numerous inlined loops, in total (all patches): text data bss dec hex filename 1078551 4557 416 1083524 108884 drivers/gpu/drm/i915/i915.ko 1070775 4557 416 1075748 106a24 drivers/gpu/drm/i915/i915.ko Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467297225-21379-60-git-send-email-chris@chris-wilson.co.uk	2016-06-30 15:42:34 +01:00
Chris Wilson	1758b90e38	drm/i915: Use a hybrid scheme for fast register waits Ville Syrjälä reported that in the majority of wait_for(I915_READ()) he inspect, most completed within the first couple of reads and that the delay between those wait_for() reads was the ratelimiting step for many code paths. For example, __gen6_update_ring_freq() was blamed for slowing down boot by many milliseconds, but under Ville's scrutiny the issue was just excessive delay waiting for sandybridge_pcode_write(). We can eliminate the wait by initially using a busyspin upon the register read and only fallback to the sleeping loop in cases where the hardware is indeed too slow. A threshold of 2 microseconds is used as the initial ballpark. To avoid excessive code bloating from converting every wait_for() into a hybrid busy/sleep loop, we extend wait_for_register_fw() and export it for use by other callers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1467297225-21379-1-git-send-email-chris@chris-wilson.co.uk	2016-06-30 15:41:58 +01:00
Dave Gordon	1a3d189837	drm/i915/guc: distinguish HAS_GUC() from HAS_GUC_UCODE/HAS_GUC_SCHED For now, anything with a GuC requires uCode loading, and then supports command submission once loaded. But these are logically distinct from simply "having a GuC", so we need a separate macro for the latter. Then, various tests should use this new macro rather than HAS_GUC_UCODE() or testing enable_guc_submission. v4: Added a couple more uses of the new macro. Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2016-05-23 14:21:52 +01:00
Chris Wilson	d538704ba0	drm/i915: Move get-reset-stats ioctl from intel_uncore.c to i915_gem_context.c The get-reset-stats ioctl reports upon the statistics (number of hangs, be it as a victim or the guilty party) of a particular context. It is semantically better as being part of i915_gem_context.c user interface, as opposed to the hardware level access of intel_uncore.c Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1463137042-9669-1-git-send-email-chris@chris-wilson.co.uk	2016-05-13 12:38:42 +01:00
Tvrtko Ursulin	ae5702d245	drm/i915: Make IS_GENx macros work on a mask If instead of numerical comparison me make these test a bitmask, we enable the compiler to optimize all instances of IS_GENx \|\| IS_GENy. v2: Make bit zero of gen mask mean gen 1. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-05-11 12:27:27 +01:00
Chris Wilson	dc97997a21	drm/i915: Use drm_i915_private as the native pointer for intel_uncore.c Pass drm_i915_private to the uncore init/fini routines and their subservients as it is their native type. text data bss dec hex filename 6309978 3578778 696320 10585076 a183f4 vmlinux 6309530 3578778 696320 10584628 a18234 vmlinux a modest 400 bytes of saving, but 60 lines of code deleted! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1462885804-26750-1-git-send-email-chris@chris-wilson.co.uk	2016-05-10 17:20:20 +01:00
Chris Wilson	c033666a94	drm/i915: Store a i915 backpointer from engine, and use it text data bss dec hex filename 6309351 3578714 696320 10584385 a18141 vmlinux 6308391 3578714 696320 10583425 a17d81 vmlinux Almost 1KiB of code reduction. v2: More s/INTEL_INFO()->gen/INTEL_GEN()/ and IS_GENx() conversions text data bss dec hex filename 6304579 3578778 696320 10579677 a16edd vmlinux 6303427 3578778 696320 10578525 a16a5d vmlinux Now over 1KiB! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1462545621-30125-3-git-send-email-chris@chris-wilson.co.uk	2016-05-09 13:41:24 +01:00
Ville Syrjälä	3d7d0c85e4	drm/i915: Use fw_domains_put_with_fifo() on HSW HSW still has the wake FIFO, so let's check it. Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Deepak S <deepak.s@linux.intel.com> Fixes: `05a2fb157e` ("drm/i915: Consolidate forcewake code") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1460633942-24013-1-git-send-email-ville.syrjala@linux.intel.com Cc: stable@vger.kernel.org Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>	2016-04-14 15:02:16 +03:00
Mika Kuoppala	c02e85a06e	drm/i915: Calculate edram size With gen9+ the edram capabilities are defined so that we can calculate the edram (ellc) size accordingly. Note that there are undefined combinations for some subset of edram capability bits. Return the closest size for undefined indexes. Even if we get it wrong with beginning of future gen enabling, the size information is currently only used for boot message and in debugfs entry. v2: Use function instead of hard to read macro (Daniel) v3: s/INTEL_INFO/INTEL_GEN (Matthew) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1460557604-7126-2-git-send-email-mika.kuoppala@intel.com	2016-04-14 12:27:37 +03:00
Mika Kuoppala	3accaf7e73	drm/i915: Store and use edram capabilities Store the edram capabilities instead of only the size of edram. This is preparatory patch to allow edram size calculation based on edram capability bits for gen9+. With gen9 the edram is behind llc and is a separate entity. With hsw/bdw it was more of a victim cache for LLC so the name 'eLLC' might be warranted. Regardless, rename all mentions of eLLC to EDRAM to clear the confusion. v2: return bytes for edram size (Chris) s/eLLC/eDRAM in output if we are gen > 8 v3: rebase, INTEL_GEN (Chris) Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-04-14 12:27:37 +03:00
Ville Syrjälä	522bad5b5e	Revert "drm/i915: Limit the auto arming of mmio debugs on vlv/chv" Enable the unclaimd register detection stuff on vlv/chv since we've now fixed the known problems during suspend. This reverts commit `c81eeea6c1`. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1460382992-28728-11-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Imre Deak <imre.deak@intel.com>	2016-04-12 19:09:39 +03:00
Tvrtko Ursulin	3756685a18	drm/i915: Only grab correct forcewake for the engine with execlists Rather than blindly waking up all forcewake domains on command submission, we can teach each engine what is (or are) the correct one to take. On platforms with multiple forcewake domains like VLV, CHV, SKL and BXT, this has the potential of lowering the GPU and CPU power use and submission latency. To implement it we add a function named intel_uncore_forcewake_for_reg whose purpose is to query which forcewake domains need to be taken to read or write a specific register with raw mmio accessors. These enables the execlists engine setup to query which forcewake domains are relevant per engine on the currently running platform. v2: * Kerneldoc. * Split from intel_uncore.c macro extraction, WARN_ON, no warns on old platforms. (Chris Wilson) v3: * Single domain per engine, mention all registers, bi-directional function and a new name, fix handling of gen6 and gen7 writes. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1460468251-14069-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-04-12 15:35:22 +01:00
Tvrtko Ursulin	a70ecc16d0	drm/i915: Remove forcewake request registers from the shadowed table Chris Wilson points out that we can remove them from the array since they are always written to with raw accessors. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-04-12 15:35:22 +01:00
Tvrtko Ursulin	6863b76c62	drm/i915: Extract knowledge of register forcewake domains Knowledge of which register per platform belonds in which forcewake domain was embedded in the MMIO accessors themselves. Extract it into standalone macros so they can be used from new code in the following patches. This causes GCC to compile some of the MMIO accessors slightly differently and grows the code a tiny amount. But none of the growth is on the fast-path so it does not matter hugely. Affected sizes before: 00000000000026f0 00000000000001a5 t gen6_read16 0000000000002390 00000000000001a5 t gen6_read32 00000000000028a0 00000000000001a5 t gen6_read64 00000000000061d0 000000000000019e t gen8_write16 0000000000006510 000000000000019d t gen8_write32 0000000000006370 000000000000019d t gen8_write64 00000000000021f0 000000000000019d t gen8_write8 Affected sizes after: 0000000000002840 00000000000001aa t gen6_read16 00000000000024e0 00000000000001a9 t gen6_read32 00000000000029f0 00000000000001a9 t gen6_read64 0000000000004f20 00000000000001b5 t gen8_write16 0000000000004ba0 00000000000001b4 t gen8_write32 00000000000050e0 00000000000001b4 t gen8_write64 0000000000004d60 00000000000001b4 t gen8_write8 Other MMIO accessors are not affected in size. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-04-12 15:35:22 +01:00
Tvrtko Ursulin	4e1176dd61	drm/i915: Do not serialize forcewake acquire across domains On platforms with multiple forcewake domains it seems more efficient to request all desired ones and then to wait for acks to avoid needlessly serializing on each domain. v2: Rebase. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: http://patchwork.freedesktop.org/patch/msgid/1460045074-1006-1-git-send-email-tvrtko.ursulin@linux.intel.com	2016-04-12 14:30:41 +01:00
Tvrtko Ursulin	33c582c10a	drm/i915: Simplify for_each_fw_domain iterators As the vast majority of users do not use the domain id variable, we can eliminate it from the iterator and also change the latter using the same principle as was recently done for for_each_engine. For a couple of callers which do need the domain mask, store it in the domain array (which already has the domain id), then both can be retrieved thence. Result is clearer code and smaller generated binary, especially in the tight fw get/put loops. Also, relationship between domain id and mask is no longer assumed in the macro. v2: Improve grammar in the commit message and rename the iterator to for_each_fw_domain_masked for consistency. (Dave Gordon) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Dave Gordon <david.s.gordon@intel.com>	2016-04-12 14:30:41 +01:00
Tvrtko Ursulin	a57a4a67e5	drm/i915: Use consistent forcewake auto-release timeout across kernel configs Because it is based on jiffies, current implementation releases the forcewake at any time between straight away and between 1ms and 10ms, depending on the kernel configuration (CONFIG_HZ). This is probably not what has been desired, since the dynamics of keeping parts of the GPU awake should not be correlated with this kernel configuration parameter. Change the auto-release mechanism to use hrtimers and set the timeout to 1ms with a 1ms of slack. This should make the GPU power consistent across kernel configs, and timer slack should enable some timer coalescing where multiple force-wake domains exist, or with unrelated timers. For GlBench/T-Rex this decreases the number of forcewake releases from ~480 to ~300 per second, and for a heavy combined OGL/OCL test from ~670 to ~360 (HZ=1000 kernel). Even though this reduction can be attributed to the average release period extending from 0-1ms to 1-2ms, as discussed above, it will make the forcewake timeout consistent for different CONFIG_HZ values. Real life measurements with the above workload has shown that, with this patch, both manage to auto-release the forcewake between 2-4 times per 10ms, even though the number of forcewake gets is dramatically different. T-Rex requests between 5-10 explicit gets and 5-10 implict gets in each 10ms period, while the OGL/OCL test requests 250 and 380 times in the same period. The two data points together suggest that the nature of the forwake accesses is bursty and that further changes and potential timeout extensions, or moving the start of timeout from the first to the last automatic forcewake grab, should be carefully measured for power and performance effects. v2: * Commit spelling. (Dave Gordon) * More discussion on numbers in the commit. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Dave Gordon <david.s.gordon@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>	2016-04-12 14:30:41 +01:00
Joonas Lahtinen	2d1fe07340	drm/i915: Do not use {HAS_, IS_, INTEL_INFO}(dev_priv->dev) dev_priv is what the macro works hard to extract, pass it directly. > sed 's/\([A-Z].*(dev_priv\)->dev)/\1)/g' v2: - Include all wrapper macros too (Chris) v3: - Include sed cmdline (Chris) v4: - Break long line - Rebase Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1460016485-8089-1-git-send-email-joonas.lahtinen@linux.intel.com	2016-04-07 14:50:26 +03:00

1 2 3 4 5

222 Commits