Resuming from RPM can happen while already holding
dev->mode_config.mutex. This means we can't actually handle fbcon in
any RPM resume workers, since restoring fbcon requires grabbing
dev->mode_config.mutex again. So move the fbcon suspend/resume code into
it's own worker, and rely on that instead to avoid deadlocking.
This fixes more deadlocks for runtime suspending the GPU on the ThinkPad
W541. Reproduction recipe:
- Get a machine with both optimus and a nvidia card with connectors
attached to it
- Wait for the nvidia GPU to suspend
- Attempt to manually reprobe any of the connectors on the nvidia GPU
using sysfs
- *deadlock*
[airlied: use READ_ONCE to address Hans's comment]
Signed-off-by: Lyude <lyude@redhat.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Kilian Singer <kilian.singer@quantumtechnology.info>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: David Airlie <airlied@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
As it turns out, on cards that actually have CRTCs on them we're already
calling drm_kms_helper_poll_enable(drm_dev) from
nouveau_display_resume() before we call it in
nouveau_pmops_runtime_resume(). This leads us to accidentally trying to
enable polling twice, which results in a potential deadlock between the
RPM locks and drm_dev->mode_config.mutex if we end up trying to enable
polling the second time while output_poll_execute is running and holding
the mode_config lock. As such, make sure we only enable polling in
nouveau_pmops_runtime_resume() if we need to.
This fixes hangs observed on the ThinkPad W541
Signed-off-by: Lyude <lyude@redhat.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Kilian Singer <kilian.singer@quantumtechnology.info>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: David Airlie <airlied@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
ktime_set(S,N) was required for the timespec storage type and is still
useful for situations where a Seconds and Nanoseconds part of a time value
needs to be converted. For anything where the Seconds argument is 0, this
is pointless and can be replaced with a simple assignment.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
- Regression fix from atomic conversion (rotation on the original G80).
- Concurrency fix when clearing compression tags.
- Fixes DP link training issues on GP102/4/6.
- Fixes backlight handling in the presence of Apple GMUX.
- Improvements to GPU error recovery in a number of scenarios.
- GP106 support.
* 'linux-4.10' of git://github.com/skeggsb/linux:
drm/nouveau/kms/nv50: fix atomic regression on original G80
drm/nouveau/bl: Do not register interface if Apple GMUX detected
drm/nouveau/bl: Assign different names to interfaces
drm/nouveau/bios/dp: fix handling of LevelEntryTableIndex on DP table 4.2
drm/nouveau/ltc: protect clearing of comptags with mutex
drm/nouveau/gr/gf100-: handle GPC/TPC/MPC trap
drm/nouveau/core: recognise GP106 chipset
drm/nouveau/ttm: wait for bo fence to signal before unmapping vmas
drm/nouveau/gr/gf100-: FECS intr handling is not relevant on proprietary ucode
drm/nouveau/gr/gf100-: properly ack all FECS error interrupts
drm/nouveau/fifo/gf100-: recover from host mmu faults
The Apple GMUX is the one managing the backlight, so there is no need for
Nouveau to register its own backlight interface.
v2: Do not split information message on two lines as it prevents from grepping
it, as pointed out by Lukas Wunner
v3: Add a missing end-of-line character to the printed message
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Currently, every backlight interface created by Nouveau uses the same name,
nv_backlight. This leads to a sysfs warning as it tries to create an already
existing folder. This patch adds a incremented number to the name, but keeps
the initial name as nv_backlight, to avoid possibly breaking userspace; the
second interface will be named nv_backlight1, and so on.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86539
v2:
* Switch to using ida for generating unique IDs, as suggested by Ilia Mirkin;
* Allocate backlight name on the stack, as suggested by Ilia Mirkin;
* Move `nouveau_get_backlight_name()` to avoid forward declaration, as
suggested by Ilia Mirkin;
* Fix reference to bug report formatting, as reported by Nick Tenney.
v3:
* Define a macro for the size of the backlight name, to avoid defining
it multiple times;
* Use snprintf in place of sprintf.
v4:
* Do not create similarly named interfaces when reaching the maximum
amount of unique names, but fail instead, as pointed out by Lukas Wunner
Signed-off-by: Pierre Moreau <pierre.morrow@free.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
TTM was changed a while back to allow for pipelining of buffer moves, and
part of this was the removal of waiting for a BO to idle before calling
move(), placing the responsibility on the driver to do this if required.
That's all well and good, except, we make use of move_notify() to handle
mapping/unmapping from the GPU VMM as move() isn't called on all paths.
This commit adds a wait before unmapping from a VMM in move_notify(), to
prevent GPU page faults where a buffer is still being accessed.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Cc: stable@vger.kernel.org [v4.8+]
This has been on the TODO list for a while now, recovering from things
such as attempting to execute a push buffer or touch a semaphore in an
unmapped memory area.
The only thing required on the HW side here is that the offending
channel is removed from the runlist, and *not* a full reset of PFIFO.
This used to be a bit messier to handle before the rework to make use
of engine topology info, but is apparently now trivial.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
- BIT_PERF_PTRS uses 32-bit pointers to its subtables, we were parsing
them as 16-bit, causing various issues on newer boards.
- Support for MXM on GM20x and up.
- More display-related fixes.
* 'linux-4.10' of git://github.com/skeggsb/linux:
drm/nouveau/mxm: warn more loudly on unsupported DCB version
drm/nouveau/mxm: handle DCB 4.1 modification
drm/nouveau/bios/mxm: handle digital connector table 1.1
drm/nouveau: Queue hpd_work on (runtime) resume
drm/nouveau: Rename acpi_work to hpd_work
drm/nouveau/kms/nv50: Fix atomic pageflip events.
drm/nouveau/fb/ram/gp100-: fix memory detection where FBP_NUM != FBPA_NUM
drm/nouveau/bios/volt: pointers are 32-bit
drm/nouveau/bios/vmap: pointers are 32-bit
drm/nouveau/bios/timing: pointers are 32-bit
drm/nouveau/bios/therm: pointers are 32-bit
drm/nouveau/bios/perf: pointers are 32-bit
drm/nouveau/bios/iccsense: pointers are 32-bit
drm/nouveau/bios/fan: pointers are 32-bit
drm/nouveau/bios/cstep: pointers are 32-bit
drm/nouveau/bios/boost: pointers are 32-bit
I suspect the version bump is just to signify that the table now specifies
pad macro/links instead of SOR/sublinks.
For our usage of the table, just recognising the new version is enough.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We need to call drm_helper_hpd_irq_event() on resume to properly detect
monitor connection / disconnection on some laptops, use hpd_work for
this to avoid deadlocks.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
We need to call drm_helper_hpd_irq_event() on resume to properly detect
monitor connection / disconnection on some laptops. For runtime-resume
(which gets called on resume from normal suspend too) we must call
drm_helper_hpd_irq_event() from a workqueue to avoid a deadlock.
Rename acpi_work to hpd_work, and move it out of the #ifdef CONFIG_ACPI
blocks to make it suitable for generic work.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
The new atomic modesetting/pageflip code for nv50+ for
Linux 4.10+ no longer uses pageflip irq's to signal
flip completion. Instead it polls for flip completion
from within a kthread/work queue.
This creates a race between the vblank irq handler
updating the vblank count and timestamp for the
vblank of flip completion, and the kthread's
polling code detecting flip completion and sending
out the flip completion event.
Depending on who executes a few microseconds earlier,
the flip completion event will either contain correct
count/timestamp or a stale count/timestamp from the
previous vblank. This error was observed for about
50% of all executed flips, e.g., observable under DRI2
by the Xorg.log filling with flip handler warning
messages.
Call drm_accurate_vblank_count() before sending
out flip completion events to enforce a vblank
count/ts update for the vblank of flip completion
and avoid stale counts/timestamps.
This fix leads to one redundant call to drm_update_vblank_count
for each completed flip, but no other side effects. On
a ~6 year old Core i7 M620@ 2.67GHz the redundant call
costs about 10 usecs per flip
Successfully tested on GeForce 9500/9600/330M so far.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
- GP102/GP104 devinit (suspend/resume, optimus) hang fix
- GP102/GP104 hardware cursor fix
- Fix for a regression on some non-MST monitors that was caused by the
MST work
- Workaround for certain laptops where ACPI sends display hotkey presses
on a modeset, causing gnome-settings-daemon to go into a continuous loop
* 'linux-4.10' of git://github.com/skeggsb/linux:
drm/nouveau/disp/gp102: rename from gp104
drm/nouveau/ce/gp102: rename from gp104
drm/nouveau/fb/gp102: rename from gp104
drm/nouveau/disp/gp102: fix cursor/overlay immediate channel indices
drm/nouveau/disp/nv50-: specify ctrl/user separately when constructing classes
drm/nouveau/disp/nv50-: split chid into chid.ctrl and chid.user
drm/nouveau: Intercept ACPI_VIDEO_NOTIFY_PROBE
drm/nouveau/devinit/gm200: drop pmu reset sequence
drm/nouveau/devinit/gm200: replace while loops with PTIMER-based timeout loops
drm/nouveau/pmu/gp102: initial implementation
drm/nouveau/pmu/gp100: initial implementation
drm/nouveau/pmu: execute reset before running devinit
drm/nouveau/pmu: move ucode handling into gt215 implementation
drm/nouveau/core: initial support for GP102
drm/nouveau/device/pci: fix oops if no mmu subdev present
drm/nouveau/kms/nv50: avoid touching DP_MSTM_CTRL if !DP_MST_CAP
Various notebooks with nvidia GPUs generate an ACPI_VIDEO_NOTIFY_PROBE
acpi-video event when an external device gets plugged in (and again on
modesets on that connector), the default behavior in the acpi-video
driver for this is to send a KEY_SWITCHVIDEOMODE evdev event, which
causes e.g. gnome-settings-daemon to ask us to rescan the connectors
(good), but also causes g-s-d to switch to mirror mode on a newly plugged
monitor rather then using the monitor to extend the desktop (bad)
as KEY_SWITCHVIDEOMODE is supposed to switch between extend the desktop
vs mirror mode.
More troublesome are the repeated ACPI_VIDEO_NOTIFY_PROBE events on
changing the mode on the connector, which cause g-s-d to switch
between mirror/extend mode, which causes a new ACPI_VIDEO_NOTIFY_PROBE
event and we end up with an endless loop.
This commit fixes this by adding an acpi notifier block handler to
nouveau_display.c to intercept ACPI_VIDEO_NOTIFY_PROBE and:
1) Wake-up runtime suspended GPUs and call drm_helper_hpd_irq_event()
on them, this is necessary in some cases for the GPU to detect connector
hotplug events while runtime suspended
2) Return NOTIFY_BAD to stop acpi-video from emitting a bogus
KEY_SWITCHVIDEOMODE key-press event
There already is another acpi notifier block handler registered in
drivers/gpu/drm/nouveau/nvkm/engine/device/acpi.c, but that is not
suitable since that one gets unregistered on runtime suspend, and
we also want to intercept ACPI_VIDEO_NOTIFY_PROBE when runtime suspended.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
This sequence is incorrect for GP102/GP104 boards. This is now being
handled correctly by the PMU subdev during preinit();
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
From visual inspection of traces, what we currently implement appears to
be identical to GP104. Seems to work well enough too.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
With atomic nv50+ is already converted over to them, but the old
display code is still using it. Found in a 2 year old patch I have
lying around to un-export these old helpers!
v2: Drop the hand-rolled versions from resume/suspend code. Now that
crtc callbacks do this, we don't need a special case for s/r anymore.
v3: Remove unused variables.
v4: Don't remove drm_crtc_vblank_off from suspend paths, non-atomic
nouveau still needs that. But still switch to drm_crtc_vblank_off
since drm_vblank_off will disappear.
Cc: Mario Kleiner <mario.kleiner.de@gmail.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20161114114101.21731-1-daniel.vetter@ffwll.ch