linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-12-22 18:44:44 +08:00

Author	SHA1	Message	Date
Chris Wilson	c501ae7f33	drm/i915: Only clear the GPU domains upon a successful finish By clearing the GPU read domains before waiting upon the buffer, we run the risk of the wait being interrupted and the domains prematurely cleared. The next time we attempt to wait upon the buffer (after userspace handles the signal), we believe that the buffer is idle and so skip the wait. There are a number of bugs across all generations which show signs of an overly haste reuse of active buffers. Such as: https://bugs.freedesktop.org/show_bug.cgi?id=29046 https://bugs.freedesktop.org/show_bug.cgi?id=35863 https://bugs.freedesktop.org/show_bug.cgi?id=38952 https://bugs.freedesktop.org/show_bug.cgi?id=40282 https://bugs.freedesktop.org/show_bug.cgi?id=41098 https://bugs.freedesktop.org/show_bug.cgi?id=41102 https://bugs.freedesktop.org/show_bug.cgi?id=41284 https://bugs.freedesktop.org/show_bug.cgi?id=42141 A couple of those pre-date i915_gem_object_finish_gpu(), so may be unrelated (such as a wild write from a userspace command buffer), but this does look like a convincing cause for most of those bugs. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-03-01 21:36:13 +01:00
Chris Wilson	eadb29a9c5	drm/i915: Silence the error message from i915_wait_request() This error message has since been superseded by the hangcheck, and does not add any salient information beyond that already printed by hangcheck discovering the GPU hang that lead to i915_wait_request() bombing out in the first place. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-27 18:08:22 +01:00
Chris Wilson	a71d8d9452	drm/i915: Record the tail at each request and use it to estimate the head By recording the location of every request in the ringbuffer, we know that in order to retire the request the GPU must have finished reading it and so the GPU head is now beyond the tail of the request. We can therefore provide a conservative estimate of where the GPU is reading from in order to avoid having to read back the ring buffer registers when polling for space upon starting a new write into the ringbuffer. A secondary effect is that this allows us to convert intel_ring_buffer_wait() to use i915_wait_request() and so consolidate upon the single function to handle the complicated task of waiting upon the GPU. A necessary precaution is that we need to make that wait uninterruptible to match the existing conditions as all the callers of intel_ring_begin() have not been audited to handle ERESTARTSYS correctly. By using a conservative estimate for the head, and always processing all outstanding requests first, we prevent a race condition between using the estimate and direct reads of I915_RING_HEAD which could result in the value of the head going backwards, and the tail overflowing once again. We are also careful to mark any request that we skip over in order to free space in ring as consumed which provides a self-consistency check. Given sufficient abuse, such as a set of unthrottled GPU bound cairo-traces, avoiding the use of I915_RING_HEAD gives a 10-20% boost on Sandy Bridge (i5-2520m): firefox-paintball 18927ms -> 15646ms: 1.21x speedup firefox-fishtank 12563ms -> 11278ms: 1.11x speedup which is a mild consolation for the performance those traces achieved from exploiting the buggy autoreported head. v2: Add a few more comments and make request->tail a conservative estimate as suggested by Daniel Vetter. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> [danvet: resolve conflicts with retirement defering and the lack of the autoreport head removal (that will go in through -fixes).] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-15 14:26:03 +01:00
Daniel Vetter	53d227f282	drm/i915: fixup seqno allocation logic for lazy_request Currently we reserve seqnos only when we emit the request to the ring (by bumping dev_priv->next_seqno), but start using it much earlier for ring->oustanding_lazy_request. When 2 threads compete for the gpu and run on two different rings (e.g. ddx on blitter vs. compositor) hilarity ensued, especially when we get constantly interrupted while reserving buffers. Breakage seems to have been introduced in commit `6f392d5486` Author: Chris Wilson <chris@chris-wilson.co.uk> Date: Sat Aug 7 11:01:22 2010 +0100 drm/i915: Use a common seqno for all rings. This patch fixes up the seqno reservation logic by moving it into i915_gem_next_request_seqno. The ring->add_request functions now superflously still return the new seqno through a pointer, that will be refactored in the next patch. Note that with this change we now unconditionally allocate a seqno, even when ->add_request might fail because the rings are full and the gpu died. But this does not open up a new can of worms because we can already leave behind an outstanding_request_seqno if e.g. the caller gets interrupted with a signal while stalling for the gpu in the eviciton paths. And with the bugfix we only ever have one seqno allocated per ring (and only that ring), so there are no ordering issues with multiple outstanding seqnos on the same ring. v2: Keep i915_gem_get_seqno (but move it to i915_gem.c) to make it clear that we only have one seqno counter for all rings. Suggested by Chris Wilson. v3: As suggested by Chris Wilson use i915_gem_next_request_seqno instead of ring->oustanding_lazy_request to make the follow-up refactoring more clearly correct. Also improve the commit message with issues discussed on irc. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45181 Tested-by: Nicolas Kalkhof nkalkhof()at()web.de Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-13 10:55:57 +01:00
Daniel Vetter	5391d0cffe	drm/i915: outstanding_lazy_request is a u32 So don't assign it false, that's just confusing ... No functional change here. Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-13 10:55:48 +01:00
Daniel Vetter	e21af88d39	drm/i915: enable ppgtt We want to unconditionally enable ppgtt for two reasons: - Windows uses this on snb and later. - We need the basic hw support to work before we can think about real per-process address spaces and other cool features we want. But Chris Wilson was complaining all over irc and intel-gfx that this will blow up if we don't have a module option to disable it. Hence add one, to prevent this. ppgtt support seems to slightly change the timings and make crashy things slightly more or less crashy. Now in my testing and the testing this got on troublesome snb machines, it seems to have improved things only. But on ivb it makes quite a few crashes happen much more often, see https://bugs.freedesktop.org/show_bug.cgi?id=41353 Luckily Eugeni Dodonov seems to have a set of workarounds that fix this issue. v2: Don't try to enable ppgtt on pre-snb. v3: Pimp commit message and make Chris Wilson less grumpy by adding a module option. v4: New try at making Chris Wilson happy. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-09 21:49:30 +01:00
Daniel Vetter	7bddb01fb9	drm/i915: ppgtt binding/unbinding support This adds support to bind/unbind objects and wires it up. Objects are only put into the ppgtt when necessary, i.e. at execbuf time. Objects are still unconditionally put into the global gtt. v2: Kill the quick hack and explicitly pass cache_level to ppgtt_bind like for the global gtt function. Noticed by Chris Wilson. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Tested-by: Chris Wilson <chris@chris-wilson.co.uk> Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-09 21:25:23 +01:00
Daniel Vetter	11782b0233	drm/i915: consolidate swizzling control bit frobbing On gen5 we also need to correctly set up swizzling in the display scanout engine, but only there. Consolidate this into the same function. This has a small effect on ums setups - the kernel now also sets this bit in addition to userspace setting it. Given that this code only runs when userspace either can't (resume, gpu reset) or explicitly won't(gem_init) touch the hw this shouldn't have an adverse effect. Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-08 23:18:27 +01:00
Daniel Vetter	f691e2f4ce	drm/i915: swizzling support for snb/ivb We have to do this manually. Somebody had a Great Idea. I've measured speed-ups just a few percent above the noise level (below 5% for the best case), but no slowdows. Chris Wilson measured quite a bit more (10-20% above the usual snb variance) on a more recent and better tuned version of sna, but also recorded a few slow-downs on benchmarks know for uglier amounts of snb-induced variance. v2: Incorporate Ben Widawsky's preliminary review comments and elaborate a bit about the performance impact in the changelog. v3: Add a comment as to why we don't need to check the 3rd memory channel. v4: Fixup whitespace. Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-02-08 23:16:24 +01:00
Daniel Vetter	8461d22677	drm/i915: rewrite shmem_pread_slow to use copy_to_user Like for shmem_pwrite_slow. The only difference is that because we read data, we can leave the fetched cachelines in the cpu: In the case that the object isn't in the cpu read domain anymore, the clflush for the next cpu read domain invalidation will simply drop these cachelines. slow_shmem_bit17_copy is now ununsed, so kill it. With this patch tests/gem_mmap_gtt now actually works. v2: add __ to copy_to_user_swizzled as suggested by Chris Wilson. v3: Fixup the swizzling logic, it swizzled the wrong pages. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38115 Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-30 23:34:34 +01:00
Daniel Vetter	8c59967c44	drm/i915: rewrite shmem_pwrite_slow to use copy_from_user ... instead of get_user_pages, because that fails on non page-backed user addresses like e.g. a gtt mapping of a bo. To get there essentially copy the vfs read path into pagecache. We can't call that right away because we have to take care of bit17 swizzling. To not deadlock with our own pagefault handler we need to completely drop struct_mutex, reducing the atomicty-guarantees of our userspace abi. Implications for racing with other gem ioctl: - execbuf, pwrite, pread: Due to -EFAULT fallback to slow paths there's already the risk of the pwrite call not being atomic, no degration. - read/write access to mmaps: already fully racy, no degration. - set_tiling: Calling set_tiling while reading/writing is already pretty much undefined, now it just got a bit worse. set_tiling is only called by libdrm on unused/new bos, so no problem. - set_domain: When changing to the gtt domain while copying (without any read/write access, e.g. for synchronization), we might leave unflushed data in the cpu caches. The clflush_object at the end of pwrite_slow takes care of this problem. - truncating of purgeable objects: the shmem_read_mapping_page call could reinstate backing storage for truncated objects. The check at the end of pwrite_slow takes care of this. v2: - add missing intel_gtt_chipset_flush - add __ to copy_from_user_swizzled as suggest by Chris Wilson. v3: Fixup bit17 swizzling, it swizzled the wrong pages. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-30 23:34:21 +01:00
Daniel Vetter	5c0480f21f	drm/i915: fall through pwrite_gtt_slow to the shmem slow path The gtt_pwrite slowpath grabs the userspace memory with get_user_pages. This will not work for non-page backed memory, like a gtt mmapped gem object. Hence fall throuh to the shmem paths if we hit -EFAULT in the gtt paths. Now the shmem paths have exactly the same problem, but this way we only need to rearrange the code in one write path. v2: v1 accidentaly falls back to shmem pwrite for phys objects. Fixed. v3: Make the codeflow around phys_pwrite cleara as suggested by Chris Wilson. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-30 23:34:07 +01:00
Chris Wilson	068c6ff1cb	drm/i915: Remove the upper limit on the bo size for mapping into the CPU domain The original intention of comparing the bo against the mappable GTT limits was to prevent a subsequent faulting of the bo into the GTT from clearing the entire GTT in vain. However, that was clearly a cut'n'paste mistake as a CPU mapping never binds the bo into the aperture. Whilst there may be some merit to limiting the maximum size of the bo to something that can be utilized by the GPU, that limit itself does not belong as a safeguard to mmapping the bo, so remove the check entirely. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-30 17:54:35 +01:00
Daniel Vetter	39965b3766	drm/i915: don't trash the gtt when running out of fences With the fence accounting fixed up in the previous commit not finding enough fences is a fatal error and userspace bug. Trashing the entire gtt is not gonna turn up that missing fence, so don't to this by returning another error thatn ENOSPC. This has the added benefit that it's easier to distinguish fence accounting errors from gtt space accounting issues. TTM serves as precendence for the EDEADLK error code - it returns it when the reservation code needs resources already blocked by the current reservation. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-29 18:24:10 +01:00
Chris Wilson	1690e1eb7a	drm/i915: Separate fence pin counting from normal bind pin counting In order to correctly account for reserving space in the GTT and fences for a batch buffer, we need to independently track whether the fence is pinned due to a fenced GPU access in the batch or whether the buffer is pinned in the aperture. Currently we count the fenced as pinned if the buffer has already been seen in the execbuffer. This leads to a false accounting of available fence registers, causing frequent mass evictions. Worse, if coupled with the change to make i915_gem_object_get_fence() report EDADLK upon fence starvation, the batchbuffer can fail with only one fence required... Fixes intel-gpu-tools/tests/gem_fenced_exec_thrash Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38735 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Tested-by: Paul Neumann <paul104x@yahoo.de> [danvet: Resolve the functional conflict with Jesse Barnes sprite patches, acked by Chris Wilson on irc.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-29 18:23:37 +01:00
Ben Widawsky	b93f9cf14e	drm/i915: argument to control retiring behavior Sometimes it may be the case when we idle the gpu or wait on something we don't actually want to process the retiring list. This patch allows callers to choose the behavior. Reviewed-by: Keith Packard <keithp@keithp.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-26 11:19:19 +01:00
Eugeni Dodonov	3d29b842e5	drm/i915: add a LLC feature flag in device description LLC is not SNB/IVB-specific, so we should check for it in a more generic way. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2012-01-17 20:01:45 +01:00
Eric Anholt	e959b5db4a	drm/i915: Make the fallback IRQ wait not sleep. The waits we do here are generally so short that sleeping is a bad idea unless we have an IRQ to wake us up. Improves regression test performance from 18 minutes to 3.5 minutes on gen7, which is now consistent with the previous generation. Signed-off-by: Eric Anholt <eric@anholt.net> Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Keith Packard <keithp@keithp.com>	2012-01-03 09:31:16 -08:00
Eric Anholt	7ea29b13e5	drm/i915: Do the fallback non-IRQ wait in ring throttle, too. As a workaround for IRQ synchronization issues in the gen7 BLT ring, we want to turn the two wait functions into polling loops. Signed-off-by: Eric Anholt <eric@anholt.net> Tested-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Keith Packard <keithp@keithp.com>	2012-01-03 09:31:14 -08:00
Linus Torvalds	ed4a51842a	Revert "drm/i915: fix infinite recursion on unbind due to ilk vt-d w/a" This reverts commit `eb1711bb94`. It blows up the i915 seqno tracking, resulting in the BUG_ON(seqno == 0); in i915_wait_request() triggering, which will cause lock-ups. See for example https://bugs.launchpad.net/ubuntu/+source/linux/+bug/903010 https://lkml.org/lkml/2011/12/14/395 Reported-requested-and-tested-by: Dirk Hohndel <dirk@hohndel.org> Reported-by: Richard Eames <Richard.Eames@flinders.edu.au> Reported-by: Rocko Requin <rockorequin@hotmail.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dave Airlie <airlied@redhat.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Keith Packard <keithp@keithp.com> Cc: Eric Anholt <eric@anholt.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-12-16 12:58:39 -08:00
Daniel Vetter	eb1711bb94	drm/i915: fix infinite recursion on unbind due to ilk vt-d w/a The recursion loop goes retire_requests->unbind->gpu_idle->retire_reqeusts. Every time we go through this we need a - active object that can be retired - and there are no other references to that object than the one from the active list, so that it gets unbound and freed immediately. Otherwise the recursion stops. So the recursion is only limited by the number of objects that fit these requirements sitting in the active list any time retire_request is called. Issue exercised by tests/gem_unref_active_buffers from i-g-t. There's been a decent bikeshed discussion whether it wouldn't be better to pass around a flag, but imo this is o.k. for such a limited case that only supports a w/a. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42180 Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson> [ickle- we built better bikesheds, but this keeps the rain off for now] Tested-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2011-12-07 10:44:40 +00:00
Rakib Mullick	457eafce61	drm, i915: Fix memory leak in i915_gem_busy_ioctl(). A call to i915_add_request() has been made in function i915_gem_busy_ioctl(). i915_add_request can fail, so in it's exit path previously allocated memory needs to be freed. Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Reviewed-by: Keith Packard <keithp@keithp.com> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-11-17 12:57:45 -08:00
Jesse Barnes	680da876f4	drm/i915: enable cacheable objects on Ivybridge IVB supports these bits as well. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-11-03 16:17:57 -07:00
Daniel Vetter	4b9de737fa	drm/i915: add constants to size fence arrays and fields In preparation of to support 32 fences on Ivybdrigde. Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-11-03 09:20:37 -07:00
Eric Anholt	ff56b0bc84	drm/i915: Fix object refcount leak on mmappable size limit error path. I've been seeing memory leaks on my system in the form of large (300-400MB) GEM objects created by now-dead processes laying around clogging up memory. I usually notice when it gets to about 1.2GB of them. Hopefully this clears up the issue, but I just found this bug by inspection. Signed-off-by: Eric Anholt <eric@anholt.net> Cc: stable@kernel.org Signed-off-by: Keith Packard <keithp@keithp.com>	2011-11-01 09:15:17 -07:00
Ben Widawsky	f372b85463	drm/i915: Remove early exit on i915_gpu_idle [Description from: Daniel Vetter] I've just discussed this quickly with Chris on irc and it's probably best to just kill the list_empty early bailout. gpu_idle isn't a fastpath, so who cares. One candidate where we emit commands to the ring without adding anything onto these lists is e.g. pageflip. There are probably more. Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-10-20 15:26:38 -07:00
Daniel Vetter	130c2561de	drm/i915: drop KM_USER0 argument to k(un)map_atomic Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-10-20 15:26:37 -07:00
Chris Wilson	8ffc024681	drm/i915: Defend against userspace creating a gem object with size==0 We currently only round up the userspace size to the next page. We assume that userspace hasn't made a mistake and requested a zero-length gem object and all through our internal code we then presume that every object is backed by at least a single page. Fix that oversight and report EINVAL back to userspace if they try to create a zero length object. [danvet: This fixes tests/gem_bad_length] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-10-20 14:11:19 -07:00
Daniel Vetter	6dacfd2faa	drm/i915: simplify swapin/out swizzle checking a bit Use the helper function already employed by the pwrite/pread functions. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-10-20 14:11:18 -07:00
Dave Airlie	88ef4e3f4f	Merge branch 'drm-intel-next' of git://people.freedesktop.org/~keithp/linux into drm-next * 'drm-intel-next' of git://people.freedesktop.org/~keithp/linux: Drivers: i915: Fix all space related issues.	2011-09-20 09:36:22 +01:00
Akshay Joshi	0206e353a0	Drivers: i915: Fix all space related issues. Various issues involved with the space character were generating warnings in the checkpatch.pl file. This patch removes most of those warnings. Signed-off-by: Akshay Joshi <me@akshayjoshi.com> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-09-19 18:01:47 -07:00
Rob Clark	b464e9a25c	drm/i915: use common functions for mmap offset creation Signed-off-by: Rob Clark <rob@ti.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2011-08-30 11:07:00 +01:00
Keith Packard	df7976797f	Merge branch 'drm-intel-fixes' into drm-intel-next	2011-07-22 13:40:42 -07:00
Keith Packard	f0b69efc29	drm/i915: Skip GPU wait for scanout pin while wedged Failing to pin a scanout buffer will most likely lead to a black screen, so if the GPU is wedged, then just let the pin happen and hope that things work out OK. v2: Just ignore any error from i915_gem_object_wait_rendering, as suggested by Chris Wilson Signed-off-by: Keith Packard <keithp@keithp.com>	2011-07-21 20:18:31 -07:00
Chris Wilson	e28f871165	drm/i915: Fix unfenced alignment on pre-G33 hardware Align unfenced buffers on older hardware to the power-of-two object size. The docs suggest that it should be possible to align only to a power-of-two tile height, but using the already computed fence size is easier and always correct. We also have to make sure that we unbind misaligned buffers upon tiling changes. In order to prevent a repetition of this bug, we change the interface to the alignment computation routines to force the caller to provide the requested alignment and size of the GTT binding rather than assume the current values on the object. Reported-and-tested-by: Sitosfe Wheeler <sitsofe@yahoo.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36326 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@kernel.org Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-07-18 14:02:06 -07:00
Keith Packard	8eb2c0ee67	Merge branch 'drm-intel-fixes' into drm-intel-next	2011-06-29 10:34:54 -07:00
Ben Widawsky	3e0dc6b01f	drm/i915: hangcheck disable parameter Provide a parameter to disable hanghcheck. This is useful mostly for developers trying to debug known problems, and probably should not be touched by normal users. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-06-29 10:32:08 -07:00
Linus Torvalds	0d72c6fcb5	Merge branch 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux-2.6 * 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux-2.6: drm/i915: Use chipset-specific irq installers drm/i915: forcewake fix after reset drm/i915: add Ivy Bridge page flip support drm/i915: split page flip queueing into per-chipset functions	2011-06-28 11:15:57 -07:00
Keith Packard	6ae77e6b6a	Merge branch 'drm-intel-fixes' into drm-intel-next	2011-06-28 10:29:47 -07:00
Chris Wilson	f01c22fd59	drm/i915: Use chipset-specific irq installers Konstantin Belousov pointed out that `4697995b98` replaced the generic i915_driver_irq_install() functions with chipset specific routines accessible only through driver->irq_install(). So update the sanity check in i915_request_wait() to match. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Keith Packard <keithp@keithp.com>	2011-06-28 10:20:06 -07:00
Hugh Dickins	e2377fe0b6	drm/i915: use shmem_truncate_range The interface to ->truncate_range is changing very slightly: once "tmpfs: take control of its truncate_range" has been applied, this can be applied. For now there is only a slight inefficiency while this remains unapplied, but it will soon become essential for managing shmem's use of swap. Change i915_gem_object_truncate() to use shmem_truncate_range() directly: which should also spare i915 later change if we switch from inode_operations->truncate_range to file_operations->fallocate. Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Keith Packard <keithp@keithp.com> Cc: Dave Airlie <airlied@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-06-27 18:00:14 -07:00
Hugh Dickins	5949eac4d9	drm/i915: use shmem_read_mapping_page Soon tmpfs will stop supporting ->readpage and read_cache_page_gfp(): once "tmpfs: add shmem_read_mapping_page_gfp" has been applied, this patch can be applied to ease the transition. Make i915_gem_object_get_pages_gtt() use shmem_read_mapping_page_gfp() in the one place it's needed; elsewhere use shmem_read_mapping_page(), with the mapping's gfp_mask properly initialized. Forget about __GFP_COLD: since tmpfs initializes its pages with memset, asking for a cold page is counter-productive. Include linux/shmem_fs.h also in drm_gem.c: with shmem_file_setup() now declared there too, we shall remove the prototype from linux/mm.h later. Signed-off-by: Hugh Dickins <hughd@google.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Keith Packard <keithp@keithp.com> Cc: Dave Airlie <airlied@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-06-27 18:00:13 -07:00
Keith Packard	b97c3d9c16	drm/i915: i915_gem_object_finish_gtt must always release gtt mmap Even if the object is no longer in the GTT domain, there may still be a user space mapping which needs to be released. Without this fix, render-based text (mostly in firefox) would occasionally get corrupted when the system was under load. Signed-off-by: Keith Packard <keithp@keithp.com>	2011-06-24 21:02:59 -07:00
Keith Packard	2cd1176bd9	Merge branch 'drm-intel-fixes' into drm-intel-next	2011-06-21 12:02:57 -07:00
Eric Anholt	e92d03bff9	Revert "drm/i915: Kill GTT mappings when moving from GTT domain" This reverts commit `4a684a4117`. Userland has always been required to set the object's domain to GTT before using it through a GTT mapping, it's not something that the kernel is supposed to enforce. (The pagefault support is so that we can handle multiple mappings without userland having to pin across them, not so that userland can use GTT after GPU domains without telling the kernel). Fixes 19.2% +/- 0.8% (n=6) performance regression in cairo-gl firefox-talos-gfx on my T420 latop. Signed-off-by: Keith Packard <keithp@keithp.com>	2011-06-21 11:11:02 -07:00
Jesper Juhl	b65552f06c	drm/i915: Don't leak in i915_gem_shmem_pread_slow() It seems to me that we are leaking 'user_pages' in drivers/gpu/drm/i915/i915_gem.c::i915_gem_shmem_pread_slow() if read_cache_page_gfp() fails. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Dave Airlie <airlied@redhat.com>	2011-06-14 11:00:54 +10:00
Eric Anholt	a187111207	drm/i915: Use the LLC mode on gen6 for everything but display. Improves full-screen openarena on my laptop 20.3% +/- 4.0% (n=3) Improves 800x600 nexuiz on my laptop 12.3% +/- 0.1% (n=3) We have more room to improve with doing LLC caching for display using GFDT, and in doing LLC+MLC caching, but this was an easy performance win and incremental improvement toward those two. Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2011-06-09 21:51:22 -07:00
Eric Anholt	a7ef0640d9	drm/i915: Use the uncached domain for the display planes The simplest and common method for ensuring scanout coherency on all chipsets is to mark the scanout buffers as uncached (and for userspace to remember to flush the render cache every so often). We can improve upon this for later generations by marking scanout objects as GFDT and only flush those cachelines when required. However, we start simple. [v2: Move the set to uncached above the clflush. Otherwise, we'd skip the clflush and try to scan out data that was still sitting in the cache.] Signed-off-by: Eric Anholt <eric@anholt.net> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2011-06-09 21:51:20 -07:00
Chris Wilson	2da3b9b940	drm/i915: Combine pinning with setting to the display plane We need to perform a few operations in order to move the object into the display plane (where it can be accessed coherently by the display engine) that are important for future safety to forbid whilst pinned. As a result, we want to need to perform some of the operations before pinning, but some are required once we have been bound into the GTT. So combine the pinning performed by all the callers with set_to_display_plane(), so this complication is contained within the single function. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2011-06-09 21:51:19 -07:00
Chris Wilson	e4ffd173a1	drm/i915: Add an interface to dynamically change the cache level [anholt v2: Don't forget that when going from cached to uncached, we haven't been tracking the write domain from the CPU perspective, since we haven't needed it for GPU coherency.] [ickle v3: We also need to make sure we relinquish any fences on older chipsets and clear the GTT for sane domain tracking.] Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Eric Anholt <eric@anholt.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2011-06-09 21:51:16 -07:00

1 2 3 4 5 ...

499 Commits