2
0
mirror of https://github.com/edk2-porting/linux-next.git synced 2025-01-07 13:13:57 +08:00
Commit Graph

611 Commits

Author SHA1 Message Date
Chris Wilson
d8cb508669 drm/i915: Try harder to allocate an mmap_offset
Given the persistence of an offset for the lifetime of an object, itis
easy to contemplate how the mmap space becomes badly fragmented to the
point that further allocations fail with ENOSPC. Our only recourse at
this point is to try to purge the objects to release some space and
reattempt the allocation.

References: https://bugs.freedesktop.org/show_bug.cgi?id=39552
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-08-21 14:34:36 +02:00
Chris Wilson
c4670ad080 drm/i915: Add some sanity checks to unbound tracking
A pair of universally true checks that just need to be put in the right
place depending on where in the patch sequence you go. Note that
i915_gem_object_put_pages_gtt() already gains the
BUG_ON(obj->gtt_space), but on reflection that needed to migrate to
put_pages().

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-08-21 14:34:20 +02:00
Chris Wilson
6c085a728c drm/i915: Track unbound pages
When dealing with a working set larger than the GATT, or even the
mappable aperture when touching through the GTT, we end up with evicting
objects only to rebind them at a new offset again later. Moving an
object into and out of the GTT requires clflushing the pages, thus
causing a double-clflush penalty for rebinding.

To avoid having to clflush on rebinding, we can track the pages as they
are evicted from the GTT and only relinquish those pages on memory
pressure.

As usual, if it were not for the handling of out-of-memory condition and
having to manually shrink our own bo caches, it would be a net reduction
of code. Alas.

Note: The patch also contains a few changes to the last-hope
evict_everything logic in i916_gem_execbuffer.c - we no longer try to
only evict the purgeable stuff in a first try (since that's superflous
and only helps in OOM corner-cases, not fragmented-gtt trashing
situations).

Also, the extraction of the get_pages retry loop from bind_to_gtt (and
other callsites) to get_pages should imo have been a separate patch.

v2: Ditch the newly added put_pages (for unbound objects only) in
i915_gem_reset. A quick irc discussion hasn't revealed any important
reason for this, so if we need this, I'd like to have a git blame'able
explanation for it.

v3: Undo the s/drm_malloc_ab/kmalloc/ in get_pages that Chris noticed.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Split out code movements and rant a bit in the commit message
with a few Notes. Done v2]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-08-21 14:34:11 +02:00
Daniel Vetter
225067eedf drm/i915: move functions around
Prep work to make Chris Wilson's unbound tracking patch a bit easier
to read. Alas, I'd have preferred that moving the page allocation
retry loop from bind to get_pages would have been a separate patch,
too. But that looks like real work ;-)

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-08-20 10:59:41 +02:00
Chris Wilson
b2eadbc85b drm/i915: Lazily apply the SNB+ seqno w/a
Avoid the forcewake overhead when simply retiring requests, as often the
last seen seqno is good enough to satisfy the retirment process and will
be promptly re-run in any case. Only ensure that we force the coherent
seqno read when we are explicitly waiting upon a completion event to be
sure that none go missing, and also for when we are reporting seqno
values in case of error or debugging.

This greatly reduces the load for userspace using the busy-ioctl to
track active buffers, for instance halving the CPU used by X in pushing
the pixels from a software render (flash). The effect will be even more
magnified with userptr and so providing a zero-copy upload path in that
instance, or in similar instances where X is simply compositing DRI
buffers.

v2: Reverse the polarity of the tachyon stream. Daniel suggested that
'force' was too generic for the parameter name and that 'lazy_coherency'
better encapsulated the semantics of it being an optimization and its
purpose. Also notice that gen6_get_seqno() is only used by gen6/7
chipsets and so the test for IS_GEN6 || IS_GEN7 is redundant in that
function.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-08-10 11:11:32 +02:00
Chris Wilson
e6994aeedc drm/i915: Export ability of changing cache levels to userspace
By selecting the cache level (essentially whether or not the CPU snoops
any updates to the bo, and on more recent machines whether it resides
inside the CPU's last-level-cache) a userspace driver is able to then
manage all of its memory within buffer objects, if it so desires. This
enables the userspace driver to accelerate uploads and more importantly
downloads from the GPU and to able to mix CPU and GPU rendering/activity
efficiently.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Added code comment about where we plan to stuff platform
specific cacheing control bits in the ioctl struct.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-26 12:56:25 +02:00
Chris Wilson
42d6ab4839 drm/i915: Segregate memory domains in the GTT using coloring
Several functions of the GPU have the restriction that differing memory
domains cannot be placed next to each other (as the GPU may prefetch
beyond the end of one domain and hang as it crosses into the other
domain). We use the facility of the drm_mm to mark ranges with a
particular color that corresponds to the cache attributes of those pages
in order to prevent allocating adjacent blocks of differing memory
types.

v2: Rebase ontop of drm_mm coloring v2.
v3: Fix rebinding existing gtt_space and add a verification routine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-26 12:56:25 +02:00
Chris Wilson
f047e395dd drm/i915: Avoid concurrent access when marking the device as idle/busy
As suggested by Daniel, rip out the independent timers for device and
crtc busyness and integrate the manual powermanagement of the display
engine into the GEM core and its request tracking. The benefits are that
the code is a lot smaller, fewer moving parts and should fit more neatly
into the overall activity tracking of the driver.

v2: Complete overhaul and removal of the racy timers and workers.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-25 18:23:56 +02:00
Chris Wilson
a7b9761d0a drm/i915: Split i915_gem_flush_ring() into seperate invalidate/flush funcs
By moving the function to intel_ringbuffer and currying the appropriate
parameter, hopefully we make the callsites easier to read and
understand.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-25 18:23:55 +02:00
Chris Wilson
26b9c4a57f drm/i915: Remove the explicit flush of the GPU write domain
Rely instead on the insertion of the implicit flush before the seqno
breadcrumb.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-25 18:23:54 +02:00
Chris Wilson
86d5bc3782 drm/i915: Remove explicit flush from i915_gem_object_flush_fence()
As the flush is either performed explictly immediately after the
execbuffer dispatch, or before the serialisation of last_fenced_seqno we
can forgo the explict i915_gem_flush_ring().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-25 18:23:53 +02:00
Chris Wilson
69c2fc8913 drm/i915: Remove the per-ring write list
This is now handled by a global flag to ensure we emit a flush before
the next serialisation point (if we failed to queue one previously).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-25 18:23:53 +02:00
Chris Wilson
65ce302741 drm/i915: Remove the defunct flushing list
As we guarantee to emit a flush before emitting the breadcrumb or
the next batchbuffer, there is no further need for the flushing list.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-25 18:23:52 +02:00
Chris Wilson
0201f1ecf4 drm/i915: Replace the pending_gpu_write flag with an explicit seqno
As we always flush the GPU cache prior to emitting the breadcrumb, we no
longer have to worry about the deferred flush causing the
pending_gpu_write to be delayed. So we can instead utilize the known
last_write_seqno to hopefully minimise the wait times.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-25 18:23:52 +02:00
Chris Wilson
e5f1d962a8 drm/i915: Remove assertion over write domain after i915_gem_object_sync()
As we move to lazily clearing the GPU write domain only when the buffer
becomes inactive, this leaves a window of opportunity for
i915_gem_object_pin_to_display_plane() to detect a seemingly
inconsistent value. This function is special as it tries to pipeline the
operation to avoid the stall and so may not retires the buffer and we
may not get the opportunity to clear the write domain. However, we know
all is good, so drop the assertion.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-25 18:23:51 +02:00
Chris Wilson
3bb73aba1e drm/i915: Allow late allocation of request for i915_add_request()
Request preallocation was added to i915_add_request() in order to
support the overlay. However, not all users care and can quite happily
ignore the failure to allocate the request as they will simply repeat
the request in the future.

By pushing the allocation down into i915_add_request(), we can then
remove some rather ugly error handling in the callers.

v2: Nullify request->file_priv otherwise we chase a garbage pointer
when retiring requests.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-25 18:23:51 +02:00
Chris Wilson
e9808edd98 drm/i915: Return a mask of the active rings in the high word of busy_ioctl
The intention is to help select which engine to use for copies with
interoperating clients - such as a GL client making a request to the X
server to perform a SwapBuffers, which may require copying from the
active GL back buffer to the X front buffer.

We choose to report a mask of the active rings to future proof the
interface against any changes which may allow for the object to reside
upon multiple rings.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: bikeshed away the write ring mask and add the explanation
Chris sent in a follow-up mail why we decided to use masks.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-25 18:23:50 +02:00
Chris Wilson
eeef9b3874 drm/i915: Add -EIO to the list of known errors for __wait_seqno
This prevents a WARN introduced with

  commit de2b998552
  Author: Daniel Vetter <daniel.vetter@ffwll.ch>
  Date:   Wed Jul 4 22:52:50 2012 +0200

      drm/i915: don't return a spurious -EIO from intel_ring_begin

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-25 10:39:57 +02:00
Chris Wilson
67b1b57182 drm/i915: Disable the BLT on pre-production SNB hardware
It never quite worked despite the numerous workarounds, yet I still see
people trying to use this hardware and filing bug reports. As we no
longer even try to implement the workarounds, since 6a233c7887
(drm/i915/ringbuffer: kill snb blt workaround), simply disable the ring.

v2: Add a message to inform the user about the limited capabilities of
their pre-production hardware.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-20 12:21:37 +02:00
Chris Wilson
6b9d89b436 drm: Add colouring to the range allocator
In order to support snoopable memory on non-LLC architectures (so that
we can bind vgem objects into the i915 GATT for example), we have to
avoid the prefetcher on the GPU from crossing memory domains and so
prevent allocation of a snoopable PTE immediately following an uncached
PTE. To do that, we need to extend the range allocator with support for
tracking and segregating different node colours.

This will be used by i915 to segregate memory domains within the GTT.

v2: Now with more drm_mm helpers and less driver interference.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Airlie <airlied@redhat.com
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@gmail.com>
2012-07-16 05:59:37 +10:00
Daniel Vetter
a9340ccab5 drm/i915: properly SIGBUS on I/O errors
... instead of looping endless with no hope of ever serving that
page-fault. We only need to break out of this loop when the gpu died,
to run the reset work (and hopefully resurrect it).

To clarify questions Chris raised on irc: This is about handling I/O
errors not from our own code, but e.g. when the disk died when trying
to swap in a gem bo. So this patch remidies the issue that the current
handling only handles gpu-death-induced cases of -EIO. Admittedly,
dying disks are much rarer than hanging gpus ...To clarify questions
Chris raised on irc: This is about handling I/O errors not from our
own code, but e.g. when the disk died when trying to swap in a gem bo.
So this patch remidies the issue that the current handling only
handles gpu-death-induced cases of -EIO. Admittedly, dying disks are
much rarer than hanging gpus ...

This seems to have been lost in:

commit d9bc7e9f32
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Feb 7 13:09:31 2011 +0000

    drm/i915: Fix infinite loop regression from 21dd3734

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-05 10:03:01 +02:00
Daniel Vetter
0a6759c6ba drm/i915: don't hang userspace when the gpu reset is stuck
With the gpu reset no longer using a trylock we've increased the
chances of userspace getting stuck quite a bit. To make that
(hopefully) rare case more paletable time out when waiting for the gpu
reset code to complete and signal this little issue to the caller by
returning -EIO.

This should help userspace to somewhat gracefully fall back and
hopefully allow the user to grab some logs and reboot the machine
(instead of staring at a frozen X screen in agony).

Suggested by Chris Wilson because I've been stubborn about allowing
the gpu reset code no to fail, ever (by removing the trylock).

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-05 10:02:24 +02:00
Daniel Vetter
d6b2c790a4 drm/i915: non-interruptible sleeps can't handle -EAGAIN
So don't return -EAGAIN, even in the case of a gpu hang. Remap it to
-EIO instead. Note that this isn't really an issue with
interruptability, but more that we have quite a few codepaths (mostly
around kms stuff) that simply can't handle any errors and hence not
even -EAGAIN. Instead of adding proper failure paths so that we could
restart these ioctls we've opted for the cheap way out of sleeping
non-interruptibly.  Which works everywhere but when the gpu dies,
which this patch fixes.

So essentially interruptible == false means 'wait for the gpu or die
trying'.'

This patch is a bit ugly because intel_ring_begin is all non-interruptible
and hence only returns -EIO. But as the comment in there says,
auditing all the callsites would be a pain.

To avoid duplicating code, reuse i915_gem_check_wedge in __wait_seqno
and intel_wait_ring_buffer. Also use the opportunity to clarify the
different cases in i915_gem_check_wedge a bit with comments.

v2: Don't access dev_priv->mm.interruptible from check_wedge - we
might not hold dev->struct_mutex, making this racy. Instead pass
interruptible in as a parameter. I've noticed this because I've hit a
BUG_ON(!mutex_is_locked) at the top of check_wedge. This has been
added in

commit b4aca0106c
Author: Ben Widawsky <ben@bwidawsk.net>
Date:   Wed Apr 25 20:50:12 2012 -0700

    drm/i915: extract some common olr+wedge code

although that commit is missing any justification for this. I guess
it's just copy&paste, because the same commit add the same BUG_ON
check to check_olr, where it indeed makes sense.

But in check_wedge everything we access is protected by other means,
so this is superflous. And because it now gets in the way (we add a
new caller in __wait_seqno, which can be called without
dev->struct_mutext) let's just remove it.

v3: Group all the i915_gem_check_wedge refactoring into this patch, so
that this patch here is all about not returning -EAGAIN to callsites
that can't handle syscall restarting.

v4: Add clarification what interuptible == fales means in our code,
requested by Ben Widawsky.

v5: Fix EAGAIN mispell noticed by Chris Wilson.

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-07-05 10:01:14 +02:00
Daniel Vetter
cc889e0f6c drm/i915: disable flushing_list/gpu_write_list
This is just the minimal patch to disable all this code so that we can
do decent amounts of QA before we rip it all out.

The complicating thing is that we need to flush the gpu caches after
the batchbuffer is emitted. Which is past the point of no return where
execbuffer can't fail any more (otherwise we risk submitting the same
batch multiple times).

Hence we need to add a flag to track whether any caches associated
with that ring are dirty. And emit the flush in add_request if that's
the case.

Note that this has a quite a few behaviour changes:
- Caches get flushed/invalidated unconditionally.
- Invalidation now happens after potential inter-ring sync.

I've bantered around a bit with Chris on irc whether this fixes
anything, and it might or might not. The only thing clear is that with
these changes it's much easier to reason about correctness.

Also rip out a lone get_next_request_seqno in the execbuffer
retire_commands function. I've dug around and I couldn't figure out
why that is still there, with the outstanding lazy request stuff it
shouldn't be necessary.

v2: Chris Wilson complained that I also invalidate the read caches
when flushing after a batchbuffer. Now optimized.

v3: Added some comments to explain the new flushing behaviour.

Cc: Eric Anholt <eric@anholt.net>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-06-20 13:54:28 +02:00
Ben Widawsky
f2ef6eb145 drm/i915: switch to default context on idle
To keep things as sane as possible, switch to the default context before
idling. This should help free context objects, as well as put things in
a more well defined state before suspending.

v2: remove seqno from context switch call (daniel)
return error on failed context switch instead of WARN+continue (daniel)

v3: move idling to i915_gpu idle (from i915_gem_idle) (Chris)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
2012-06-14 17:36:20 +02:00
Ben Widawsky
254f965c39 drm/i915: preliminary context support
Very basic code for context setup/destruction in the driver.

Adds the file i915_gem_context.c This file implements HW context
support. On gen5+ a HW context consists of an opaque GPU object which is
referenced at times of context saves and restores.  With RC6 enabled,
the context is also referenced as the GPU enters and exists from RC6
(GPU has it's own internal power context, except on gen5).  Though
something like a context does exist for the media ring, the code only
supports contexts for the render ring.

In software, there is a distinction between contexts created by the
user, and the default HW context. The default HW context is used by GPU
clients that do not request setup of their own hardware context. The
default context's state is never restored to help prevent programming
errors. This would happen if a client ran and piggy-backed off another
clients GPU state.  The default context only exists to give the GPU some
offset to load as the current to invoke a save of the context we
actually care about. In fact, the code could likely be constructed,
albeit in a more complicated fashion, to never use the default context,
though that limits the driver's ability to swap out, and/or destroy
other contexts.

All other contexts are created as a request by the GPU client. These
contexts store GPU state, and thus allow GPU clients to not re-emit
state (and potentially query certain state) at any time. The kernel
driver makes certain that the appropriate commands are inserted.

There are 4 entry points into the contexts, init, fini, open, close.
The names are self-explanatory except that init can be called during
reset, and also during pm thaw/resume. As we expect our context to be
preserved across these events, we do not reinitialize in this case.

As Adam Jackson pointed out, The cutoff of 1MB where a HW context is
considered too big is arbitrary. The reason for this is even though
context sizes are increasing with every generation, they have yet to
eclipse even 32k. If we somehow read back way more than that, it
probably means BIOS has done something strange, or we're running on a
platform that wasn't designed for this.

v2: rename load/unload to init/fini (daniel)
remove ILK support for get_size() (indirectly daniel)
add HAS_HW_CONTEXTS macro to clarify supported platforms (daniel)
added comments (Ben)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
2012-06-14 17:36:16 +02:00
Daniel Vetter
8ecd1a6615 drm/i915: call intel_enable_gtt
When drm/i915 is in control of the gtt, we need to call
the enable function at all the relevant places ourselves.

Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-06-12 22:21:07 +02:00
Daniel Vetter
dd2757f8b5 drm/i915: stop using dev->agp->base
For that to work we need to export the base address of the gtt
mmio window from intel-gtt. Also replace all other uses of
dev->agp by values we already have at hand.

Reviewed-by: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-06-12 22:18:06 +02:00
Ben Widawsky
eac1f14fd1 drm/i915: Inifite timeout for wait ioctl
Change the ns_timeout parameter of the wait ioctl to a signed value.
Doing this allows the kernel to provide an infinite wait when a timeout
of less than 0 is provided. This mimics select/poll.

Initially the parameter was meant to match up with the GL spec 1:1, but
after being made aware of how much 2^64 - 1 nanoseconds actually is, I
do not think anyone will ever notice the loss of 1 bit.

The infinite timeout on waiting is similar to the existing i915
userspace interface with the exception that struct_mutex is dropped
while doing the wait in this ioctl.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-06-06 12:25:46 +02:00
Daniel Vetter
30dfebf34b drm/i915: extract object active state flushing code
Both busy_ioctl and the new wait_ioct need to do the same dance (or at
least should). Some slight changes:
- busy_ioctl now unconditionally checks for olr. Before emitting a
  require flush would have prevent the olr check and hence required a
  second call to the busy ioctl to really emit the request.
- the timeout wait now also retires request. Not really required for
  abi-reasons, but makes a notch more sense imo.

I've tested this by pimping the i-g-t test some more and also checking
the polling behviour of the wait_rendering_timeout ioctl versus what
busy_ioctl returns.

v2: Too many people complained about unplug, new color is
flush_active.

v3: Kill the comment about the unplug moniker.

v4: s/un-active/inactive/

Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-06-02 20:51:03 +02:00
Daniel Vetter
e269f90f3d Merge remote-tracking branch 'airlied/drm-prime-vmap' into drm-intel-next-queued
We need the latest dma-buf code from Dave Airlie so that we can pimp
the backing storage handling code in drm/i915 with Chris Wilson's
unbound tracking and stolen mem backed gem object code.

Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-06-01 10:52:54 +02:00
Ben Widawsky
b9524a1e1c drm/i915: remap l3 on hw init
If any l3 rows have been previously remapped, we must remap them after
GPU reset/resume too.

v2: Just return (no warn) on remapping init if not IVB (Jesse)
Move the check of schizo userspace to i915_gem_l3_remap (Jesse)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-31 12:11:29 +02:00
Dave Airlie
a21f976094 Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel into drm-fixes
* 'drm-intel-fixes' of git://people.freedesktop.org/~danvet/drm-intel:
  drm/i915: tune down the noise of the RP irq limit fail
  drm/i915: Remove the error message for unbinding pinned buffers
  drm/i915: Limit page allocations to lowmem (dma32) for i965
  drm/i915: always use RPNSWREQ for turbo change requests
  drm/i915: reject doubleclocked cea modes on dp
  drm/i915: Adding TV Out Missing modes.
  drm/i915: wait for a vblank to pass after tv detect
  drm/i915: no lvds quirk for HP t5740e Thin Client
  drm/i915: enable vdd when switching off the eDP panel
  drm/i915: Fix PCH PLL assertions to not assume CRTC:PLL relationship
  drm/i915: Always update RPS interrupts thresholds along with frequency
  drm/i915: properly handle interlaced bit for sdvo dtd conversion
  drm/i915: fix module unload since error_state rework
  drm/i915: be more careful when returning -ENXIO in gmbus transfer
2012-05-29 11:09:06 +01:00
Ben Widawsky
199b2bc25b drm/i915: s/i915_wait_request/i915_wait_seqno/g
Wait request is poorly named IMO. After working with these functions for
some time, I feel it's much clearer to name the functions more
appropriately.

Of course we must update the callers to use the new name as well.

This leaves room within our namespace for a *real* wait request function
at some point.

Note to maintainer: this patch is optional.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-25 14:18:42 +02:00
Ben Widawsky
23ba4fd0a4 drm/i915: wait render timeout ioctl
This helps implement GL_ARB_sync but stops short of allowing full blown
sync objects. Finally we can use the new timed seqno waiting function
to allow userspace to wait on a buffer object with a timeout. This
implements that interface.

The IOCTL will take as input a buffer object handle, and a timeout in
nanoseconds (flags is currently optional but will likely be used for
permutations of flush operations). Users may specify 0 nanoseconds to
instantly check.

The wait ioctl with a timeout of 0 reimplements the busy ioctl. With any
non-zero timeout parameter the wait ioctl will wait for the given number
of nanoseconds on an object becoming unbusy. Since the wait itself does
so holding struct_mutex the object may become re-busied before this
completes. A similar but shorter race condition exists in the busy
ioctl.

v2: ETIME/ERESTARTSYS instead of changing to EBUSY, and EGAIN (Chris)
Flush the object from the gpu write domain (Chris + Daniel)
Fix leaked refcount in good case (Chris)
Naturally align ioctl struct (Chris)

v3: Drop lock after getting seqno to avoid ugly dance (Chris)

v4: check for 0 timeout after olr check to allow polling (Chris)

v5: Updated the comment. (Chris)

v6: Return -ETIME instead of -EBUSY when timeout_ns is 0 (Daniel)
Fix the commit message comment to be less ugly (Ben)
Add a warning to check the return timespec (Ben)

v7: Use DRM_AUTH for the ioctl. (Eugeni)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-25 14:15:46 +02:00
Chris Wilson
31d8d651eb drm/i915: Remove the error message for unbinding pinned buffers
This is now used intentionally to prevent proliferation of is-pinned
checks upon the inactive list following:

commit 1b50247a8d
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Apr 24 15:47:30 2012 +0100

    drm/i915: Remove the list of pinned inactive objects

Reported-and-tested-by: guang.a.yang@intel.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50075
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-25 10:10:40 +02:00
Chris Wilson
bed1ea95a3 drm/i915: Limit page allocations to lowmem (dma32) for i965
Broadwater and Crestline share a limitation that prevent it from
relocating general surface state above 4GiB. The only recourse we have
since any buffer object may be used as a relocation target is then to
limit all object allocations on 965g[m] to DMA32.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-25 10:07:06 +02:00
Ben Widawsky
5c81fe85da drm/i915: timeout parameter for seqno wait
Insert a wait parameter in the code so we can possibly timeout on a
seqno wait if need be. The code should be functionally the same as
before because all the callers will continue to retry if an arbitrary
timeout elapses.

We'd like to have nanosecond granularity, but the only way to do this is
with hrtimer, and that doesn't fit well with the needs of this code.

v2: Fix rebase error (Chris)
Return proper time even in wedged + signal case (Chris + Ben)
Use timespec constructs (Ben)
Didn't take Daniel's advice regarding the Frankenstein-ness of the
  function. I did try his advice, but in the end I liked the way the
  original code looked, better.

v3: Make wakeups far less frequent for infinite waits (Chris)

v4: Remove dummy_wait variable (Daniel)
Use raw monotonic time instead of jiffies (made the code a bit cleaner) (Ben)
Added a couple of warnings (Ben)

v5: Remove warnings (Daniel)
Use more accurate time diff for default case (Daniel)
Bikeshed for setting the return timespec in timeout case (Daniel)
s/jiffies/time in one of the comments

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Eugeni Dodonov <eugeni.dodonov@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-25 09:55:08 +02:00
Daniel Vetter
1286ff7397 i915: add dmabuf/prime buffer sharing support.
This adds handle->fd and fd->handle support to i915, this is to allow
for offloading of rendering in one direction and outputs in the other.

v2 from Daniel Vetter:
- fixup conflicts with the prepare/finish gtt prep work.
- implement ppgtt binding support.

Note that we have squat i-g-t testcoverage for any of the lifetime and
access rules dma_buf/prime support brings along. And there are quite a
few intricate situations here.

Also note that the integration with the existing code is a bit
hackish, especially around get_gtt_pages and put_gtt_pages. It imo
would be easier with the prep code from Chris Wilson's unbound series,
but that is for 3.6.

Also note that I didn't bother to put the new prepare/finish gtt hooks
to good use by moving the dma_buf_map/unmap_attachment calls in there
(like we've originally planned for).

Last but not least this patch is only compile-tested, but I've changed
very little compared to Dave Airlie's version. So there's a decent
chance v2 on drm-next works as well as v1 on 3.4-rc.

v3: Right when I've hit sent I've noticed that I've screwed up one
obj->sg_list (for dmar support) and obj->sg_table (for prime support)
disdinction. We should be able to merge these 2 paths, but that's
material for another patch.

v4: fix the error reporting bugs pointed out by ickle.

v5: fix another error, and stop non-gtt mmaps on shared objects
stop pread/pwrite on imported objects, add fake kmap

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-23 10:47:10 +01:00
Chris Wilson
b4519513e8 drm/i915: Introduce for_each_ring() macro
In many places we wish to iterate over the rings associated with the
GPU, so refactor them to use a common macro.

Along the way, there are a few code removals that should be side-effect
free and some rearrangement which should only have a cosmetic impact,
such as error-state.

Note that this slightly changes the semantics in the hangcheck code:
We now always cycle through all enabled rings instead of
short-circuiting the logic.

v2: Pull in a couple of suggestions from Ben and Daniel for
intel_ring_initialized() and not removing the warning (just moving them
to a new home, closer to the error).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Added note to commit message about the small behaviour
change, suggested by Ben Widawsky.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-19 22:39:53 +02:00
Daniel Vetter
5e13a0c5ec Merge remote-tracking branch 'airlied/drm-core-next' into drm-intel-next-queued
Backmerge of drm-next to resolve a few ugly conflicts and to get a few
fixes from 3.4-rc6 (which drm-next has already merged). Note that this
merge also restricts the stencil cache lra evict policy workaround to
snb (as it should) - I had to frob the code anyway because the
CM0_MASK_SHIFT define died in the masked bit cleanups.

We need the backmerge to get Paulo Zanoni's infoframe regression fix
for gm45 - further bugfixes from him touch the same area and would
needlessly conflict.

Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-08 13:39:59 +02:00
Daniel Vetter
dc257cf154 Linux 3.4-rc6
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 
 iQEcBAABAgAGBQJPpvY9AAoJEHm+PkMAQRiGpEoIAJgbu+Y8gITnBK/wh9O6zy3S
 5jie5KK4YWdbJsvO58WbNr3CyVIwGIqQ2dUZLiU59aBVLarlGw8xor0MmW+cZwhp
 6fBHaf0qDYAV0MZjD+mnnExOiCRyISa2lPmsfu9dAWywh5KGe6/oAP6/qcXIyok3
 KZyl3qQf4ENpaZPHwZPXCEkUvtuyHgNiszN+QXEadA3s19Ot4VGe9A3VGw+GNrSm
 JqFIq3acQAbKa5BYaqf7TQC02v2FI7//eqt6QHxTqbE6a7LGbTvLfX3HlJ2mnfqa
 1R6QHhM4y4OZDHbaMT2raHZ8WuLXzhehJzhP8Co7AHFOKwVKOb5XbcUr2RrukMU=
 =HkMd
 -----END PGP SIGNATURE-----

Merge tag 'v3.4-rc6' into drm-intel-next

Conflicts:
	drivers/gpu/drm/i915/intel_display.c

Ok, this is a fun story of git totally messing things up. There
/shouldn't/ be any conflict in here, because the fixes in -rc6 do only
touch functions that have not been changed in -next.

The offending commits in drm-next are 14415745b2..1fa611065 which
simply move a few functions from intel_display.c to intel_pm.c. The
problem seems to be that git diff gets completely confused:

$ git diff 14415745b2..1fa611065

is a nice mess in intel_display.c, and the diff leaks into totally
unrelated functions, whereas

$git diff --minimal  14415745b2..1fa611065

is exactly what we want.

Unfortunately there seems to be no way to teach similar smarts to the
merge diff and conflict generation code, because with the minimal diff
there really shouldn't be any conflicts. For added hilarity, every
time something in that area changes the + and - lines in the diff move
around like crazy, again resulting in new conflicts. So I fear this
mess will stay with us for a little longer (and might result in
another backmerge down the road).

Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-07 14:02:14 +02:00
Ben Widawsky
b4aca0106c drm/i915: extract some common olr+wedge code
The new wait_rendering ioctl also needs to check for an oustanding
lazy request, and we already duplicate that logic at three places. So
extract it.

While at it, also extract the code to check the gpu wedging state to
improve code flow.

v2: Don't use seqno as an outparam (Chris)

v3 by danvet: Kill stale comment and pimp commit message

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-03 11:18:32 +02:00
Daniel Vetter
53ca26cab8 drm/i915 disallow physical batchbuffers for KMS
Even the horrible gen3 XvMC code has learned to do this
right by the time xf86-video-intel releases learned to do
kernel modesetting. So we can just disallow this.

Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-03 11:18:25 +02:00
Daniel Vetter
8781342df7 drm/i915: create dev_priv->dri1 dragon dungeon^W^W sub-struct
... and shove allow_batchbuffer in there. More dragons will
follow suit.

There's the curious case that we allow this for KMS ...

Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-03 11:18:25 +02:00
Ben Widawsky
3b88cc0dd7 drm/i915: use __wait_seqno for ring throttle
It turns out throttle had an almost identical  bit of code to do the
wait. Now we can call the new helper directly.  This is just a bonus,
and not needed for the overall series.

v2: remove irq_get/put which is now in __wait_seqno (Ben)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-03 11:18:23 +02:00
Ben Widawsky
4146b08d76 drm/i915: remove polled wait from throttle
It's about to go away anyway. Just here to help bisection.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-03 11:18:22 +02:00
Ben Widawsky
604dd3ec75 drm/i915: extract __wait_seqno from i915_wait_request
i915_wait_request is actually a fairly large function encapsulating
quite a few different operations. Because being able to wait on seqnos
in various conditions is useful, extracting that bit of code to a helper
function seems useful

v2: pull the irq_get/put as well (Ben)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-03 11:18:22 +02:00
Ben Widawsky
c58cf4f108 drm/i915: drop polled waits from i915_wait_request
The only time irq_get should fail is during unload or suspend. Both of
these points should try to quiesce the GPU before disabling interrupts
and so the atomic polling should never occur.

This was recommended by Chris Wilson as a way of reducing added
complexity to the polled wait which I introduced in an RFC patch.

09:57 < ickle_> it's only there as a fudge for waiting after irqs
after uninstalled during s&r, we aren't actually meant to hit it
09:57 < ickle_> so maybe we should just kill the code there and fix the breakage

v2: return -ENODEV instead of -EBUSY when irq_get fails

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-03 11:18:22 +02:00
Ben Widawsky
9574b3fe29 drm/i915: kill waiting_seqno
The waiting_seqno is not terribly useful, and as such we can remove it
so that we'll be able to extract lockless code.

v2: Keep the information for error_state (Chris)
Check if ring is initialized in hangcheck (Chris)
Capture the waiting ring (Chris)

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: add some bikeshed to clarify a comment.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2012-05-03 11:18:21 +02:00