Fixes overwriting the first page table entry when testing that the PRAMIN
BAR can be correctly read/written, and adds an additional BAR flush after
poking the BAR3 control regs.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
NVC0 will be able to share some of nv50's paths this way. This also makes
the card-specific VRAM code responsible for deciding whether a given set
of tile_flags is valid, rather than duplicating the allowed types in
nv50_vram.c and nouveau_gem.c.
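Roughly, the hook ends up looking like this (a sketch only; the struct and
field names are illustrative rather than copied from the patch):

    /* The chipset's vram code owns the notion of which tile_flags are
     * valid, so the GEM code just asks it instead of keeping its own
     * list of allowed types. */
    struct nouveau_vram_engine {
        bool (*flags_valid)(struct drm_device *dev, u32 tile_flags);
    };

    static int example_gem_new(struct drm_device *dev, u32 tile_flags)
    {
        struct drm_nouveau_private *dev_priv = dev->dev_private;

        if (!dev_priv->engine.vram.flags_valid(dev, tile_flags))
            return -EINVAL;

        /* ... go on and allocate the buffer ... */
        return 0;
    }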
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
As of this commit, it's guaranteed that if an object is in VRAM, its
GPU virtual address will be constant.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This is required on nv50, as we need more precise control
over physical VRAM allocations to avoid buffer corruption when using
buffers of mixed memory types.
This removes some nasty overallocation/alignment that we were previously
using to "control" this problem.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
At some point in the future, this bo won't necessarily be backed by
a drm_mm_node, so use the start/size fields of the ttm_mem_reg instead.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The regs belong to PFIFO; they're different for pretty much the same
generations that need different PFIFO control, and NVC0 is going
to be even more different than the rest.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Sleeping doesn't pay off for very short delays in comparison with the
minimum granularity of schedule_timeout().
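As a minimal sketch of the idea (the poll helper below is hypothetical, not
the actual nouveau wait routine):

    /* For waits well below the scheduler's timer granularity, just
     * poll with udelay() instead of sleeping. */
    static bool example_wait_eq(void __iomem *reg, u32 mask, u32 val,
                                unsigned int timeout_us)
    {
        while (timeout_us--) {
            if ((readl(reg) & mask) == val)
                return true;
            udelay(1);
        }
        return false;
    }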
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
There have been reports of PFIFO cache errors during context takedown
(fdo bug 31637). They are caused by some GPU objects being taken out
while the channel is still potentially processing commands. Make sure
that all the previous rendering has landed before releasing a GPU
object.
Reported-by: Grzesiek Sójka <pld@pfu.pl>
Reported-by: Patrice Mandin <patmandin@gmail.com>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Acked-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
nvxx_graph_isr is already taking care of it. In some cases this
could've made you miss PGRAPH interrupts (e.g. when you were supposed
to get several IRQs of the same kind in a row).
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
No functional changes, just simplify some code paths a bit.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
In typical Apple fashion, there's no standard information about which
encoders are present on this machine, so this patch adds a quirk to
provide it.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Fix running of destroy_context() when create_context() has never been
called for the channel, and fill in the engine's tlb_flush() function pointer.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Allows callers to install their own handlers for when a GPIO line
changes state (such as for hotplug detect).
This also fixes a bug where we weren't acknowledging the GPIO IRQ
until after the bottom half had run, causing a severe IRQ storm
in some cases.
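The interface ends up looking roughly like this (names are illustrative
only, not the exact symbols added by the patch):

    /* Callers register a handler to be run when the given GPIO tag
     * changes state; the top half acks the GPIO IRQ immediately and
     * defers the handler, which is what avoids the IRQ storm. */
    int  example_gpio_isr_add(struct drm_device *dev, int idx, u8 tag,
                              void (*handler)(void *data, int state),
                              void *data);
    void example_gpio_isr_del(struct drm_device *dev, int idx, u8 tag,
                              void (*handler)(void *data, int state),
                              void *data);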
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The point is to share more code between the PFB/PGRAPH tile region
hooks, and give the hardware-specific functions a chance to allocate
per-region resources.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
nouveau_bo_move_m2mf() needs to lock the kernel channel, and it may be
called from the pushbuf IOCTL with a user channel already locked. Use
a separate subclass for the kernel channel mutex because this is
legitimate mutex nesting.
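A minimal sketch of the pattern, assuming a per-channel mutex (the subclass
constants and helper are illustrative):

    /* Lockdep treats all channel mutexes as a single class, so taking
     * the kernel channel's mutex while a user channel is already held
     * would look like recursive locking. A separate subclass marks it
     * as intentional nesting instead. */
    #define EXAMPLE_CHAN_MUTEX_SUBCLASS_USER    0
    #define EXAMPLE_CHAN_MUTEX_SUBCLASS_KERNEL  1

    static void example_lock_kernel_channel(struct nouveau_channel *kchan)
    {
        mutex_lock_nested(&kchan->mutex,
                          EXAMPLE_CHAN_MUTEX_SUBCLASS_KERNEL);
    }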
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
In a multihead setup vblank interrupts may end up enabled in both
heads. In that case we want to ignore the vblank interrupts coming
from the wrong CRTC to avoid tearing and unbalanced calls to
drm_vblank_get/put (fdo bug 31074).
Reported-by: Felix Leimbach <felix.leimbach@gmx.net>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
nv0x-nv4x should be mostly fine; nv50 doesn't work yet.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The nouveau_fence_* functions are not type-safe, which could lead to bugs.
Additionally, every use of nouveau_fence_unref had to cast a struct
nouveau_fence ** to void **.
Fix it by renaming the old functions and creating static inline functions
with the new prototypes. We still need the old functions because we pass
function pointers to TTM.
As we are wrapping the functions anyway, drop the unused "void *arg"
parameter where possible.
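The wrapping pattern looks roughly like this (prototypes simplified and
assumed rather than copied from the patch):

    /* Old, untyped variant kept so it can still be handed to TTM's
     * function-pointer tables: */
    void __nouveau_fence_unref(void **sync_obj);

    /* New, type-safe interface for the rest of the driver: */
    static inline void nouveau_fence_unref(struct nouveau_fence **pfence)
    {
        __nouveau_fence_unref((void **)pfence);
    }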
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Only supported on NV50+ so far, and disabled by default currently. The
module parameter "msi=1" will enable it.
There's a kernel bug which will cause this to fail if the module (or the
NVIDIA binary driver) has ever been loaded before loading nouveau with
MSI enabled. As such, this is only safe to enable if you load nouveau
at boot and don't ever wish to reload it.
The workaround is to "echo 0 > /sys/bus/pci/devices/<device>/enable"
until the enable count reads 0. Then you should be able to load nouveau
with MSI enabled.
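A rough sketch of how the option gates MSI setup (parameter handling
simplified; the msi_enabled flag is assumed):

    static int nouveau_msi;
    module_param_named(msi, nouveau_msi, int, 0400);
    MODULE_PARM_DESC(msi, "Enable MSI (default: 0)");

    static void example_msi_init(struct pci_dev *pdev, bool *msi_enabled)
    {
        /* pci_enable_msi() returns 0 on success; pci_disable_msi()
         * must be called again on driver unload. */
        if (nouveau_msi && pci_enable_msi(pdev) == 0)
            *msi_enabled = true;
    }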
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Not an issue right now; we're forced to 64k size/alignment by the BO
allocator anyway. That won't be the case for much longer.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This really needs cleaning up somehow, and we should probably investigate
what's needed to do this on earlier generations; NVIDIA do something
similar there too.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We previously added all the available classes for the entire generation,
even though the objects wouldn't work on the hardware.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The structs themselves, as well as the non-sw object creation function,
are probably very misnamed now. That's a problem for later :)
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Without it there's a potential race with nouveau_fence_update().
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
It needs a "strong" channel reference because it actually writes to
the channel pushbuf; otherwise the corresponding FIFO context could
get kicked off in the middle of nouveau_fence_sync().
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Fences didn't increment the channel reference count, and the fenced
channel could go away at any time. Fixes a potential race in
nouveau_fence_update().
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
nouveau_channel_ref() takes a "weak" channel reference that doesn't
prevent the hardware channel resources from being released; it just
keeps the channel data structure alive.
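Illustratively (field and helper names are assumed, not the exact ones used):

    static void example_channel_free(struct kref *ref);

    /* A "weak" reference only pins the struct: it is safe to hold
     * across channel takedown, but it does not keep the hw channel,
     * its pushbuf or its gpuobjs alive. */
    static void example_channel_ref(struct nouveau_channel *chan,
                                    struct nouveau_channel **pchan)
    {
        if (chan)
            kref_get(&chan->ref);
        if (*pchan)
            kref_put(&(*pchan)->ref, example_channel_free);
        *pchan = chan;
    }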
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
nouveau_channel_put() can be executed after the 'refcount == 0' check
in nouveau_channel_get() and before the channel reference count is
incremented. In that case CPU0 will take the context down while CPU1
thinks it owns the channel and 'refcount == 1'.
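One common way to close that window is to make the liveness check and the
increment a single atomic step, roughly like this (a sketch only; the lookup
helper and the 'users' field are assumed, and this is not necessarily the
exact fix applied here):

    static struct nouveau_channel *
    example_channel_lookup(struct drm_device *dev, int id);

    static struct nouveau_channel *
    example_channel_get(struct drm_device *dev, int id)
    {
        struct nouveau_channel *chan = example_channel_lookup(dev, id);

        /* atomic_inc_not_zero() fails if a concurrent put() already
         * dropped the count to zero, so we can never take a reference
         * on a channel that is being torn down. */
        if (chan && !atomic_inc_not_zero(&chan->users))
            chan = NULL;

        return chan;
    }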
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The destroy_context() engine hooks call gpuobj management functions to
release the channel resources. These functions use HARDIRQ-unsafe locks,
whereas destroy_context() is called with the HARDIRQ-safe
context_switch_lock held; that's a lock ordering violation.
Push the engine-specific channel destruction logic into destroy_context()
and let the hardware-specific code lock and unlock when it's actually
needed. Change the engine destruction order to avoid a race in the small
gap between pgraph and pfifo context uninitialization.
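The resulting ordering looks roughly like this (names are illustrative; the
gpuobj call stands in for the real per-engine teardown):

    static void example_graph_destroy_context(struct nouveau_channel *chan,
                                              spinlock_t *ctxsw_lock)
    {
        unsigned long flags;

        /* Only the hardware access needs the HARDIRQ-safe lock. */
        spin_lock_irqsave(ctxsw_lock, flags);
        /* ... kick the channel's grctx off the hardware ... */
        spin_unlock_irqrestore(ctxsw_lock, flags);

        /* gpuobj teardown takes HARDIRQ-unsafe locks, so it must run
         * here, after the irq-safe lock has been dropped. */
        nouveau_gpuobj_ref(NULL, &chan->ramin_grctx);
    }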
Reported-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The pushbuf ioctl syncs after validation, no need for this anymore.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
No other driver uses this, and userspace should be responsible for handling
locking between them if they share BOs.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This fixes a race condition between fbcon acceleration and TTM buffer
moves. To reproduce:
- start X
- switch to vt and "while (true); do dmesg; done"
- switch to another vt and "sleep 2 && cat /path/to/debugfs/dri/0/evict_vram"
- switch back to vt running dmesg
We don't make use of this on any other channel yet; they're currently
protected by drm_global_mutex. This will change in the near future.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
A future commit will add locking to the DRM's channel, and there are numerous
problems that come up if we allow printk from an interrupt context to be
accelerated. It seems saner to just disallow it completely.
As a nice side-effect, all the "to accel or not to accel" logic gets moved
out of the chipset-specific code.
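Conceptually, the common entry points end up doing something like this (the
accel helper name is made up for the sketch):

    static int example_accel_fillrect(struct fb_info *info,
                                      const struct fb_fillrect *rect);

    static void example_fbcon_fillrect(struct fb_info *info,
                                       const struct fb_fillrect *rect)
    {
        /* Never touch the channel from interrupt context (printk from
         * an IRQ can end up here via fbcon); fall back to the plain
         * cfb_*() path there, or whenever acceleration fails. */
        if (in_interrupt() || example_accel_fillrect(info, rect) != 0)
            cfb_fillrect(info, rect);
    }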
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The bo lock was used only to protect the bo sync object members, and since it
is a per-bo lock, fencing a buffer list will see a lot of locks and unlocks.
Replace it with a per-device lock that protects the sync object members on
*all* bos. Reading and setting these members will always be very quick, so
the risk of heavy lock contention is microscopic. Note that waiting for
sync objects will always take place outside of this lock.
The bo device fence lock will eventually be replaced with a seqlock /
rcu mechanism so we can determine that a bo is idle under an
rcu / read seqlock.
However, this change will allow us to batch fencing and unreserving of
buffers with a minimal amount of locking.
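Schematically, the fast path becomes (simplified; the per-device lock's
field name follows the description above):

    static void *example_grab_sync_obj(struct ttm_buffer_object *bo)
    {
        struct ttm_bo_device *bdev = bo->bdev;
        void *sync_obj;

        /* Previously this took the per-bo bo->lock; now a single
         * per-device fence lock covers the sync_obj members of every
         * bo. Only quick reads and writes happen under it, and any
         * actual waiting is done outside it. */
        spin_lock(&bdev->fence_lock);
        sync_obj = bo->sync_obj;
        spin_unlock(&bdev->fence_lock);

        return sync_obj;
    }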
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Jerome Glisse <j.glisse@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The old code generated an interrupt storm bad enough to completely
take down my system.
Signed-off-by: Andy Lutomirski <luto@mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Avoid confusing userspace by not publishing backlight controls if ACPI
equivalents are available.
Reported-by: Aaron Sowry <aaron@aeneby.se>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Improvements:
- Fix bug in switch statement
- Add parts of 0x10022c, 0x10023c
- Clean up 0x100234
- Comment out assumption in 0x100228 until verified
Signed-off-by: Roy Spliet <r.spliet@student.tudelft.nl>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reported-by: Tomas Miljenovic <tomasmiljenovic@gmail.com>
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>