The same as on evergreen.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reported-by: FrankR Huang <FrankR.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
The vram scratch buffer needs to be initialized
before the mc is programmed otherwise we program
0 as the GPU address of the default GPU fault
page. In most cases we put vram at zero anyway and
reserve a page for the legacy vga buffer so in practice
this shouldn't cause any problems, but better to make
it correct.
Was changed in:
6fab3febf6
Reported-by: FrankR Huang <FrankR.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Avoid needless uvd reprogramming if uvd powergating is disabled.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
No need to try the ring tests if starting the UVD block failed.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
For powergating, we just need to re-init the registers, there
is no need to restore the uvd BOs. This just adds needless
work when powergating uvd for playback while the system is
on. We only need to restore the uvd BOs on an actual resume
from suspend or when the driver loads.
This fixes multi-stream UVD playback on KB systems.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
The table has the following format:
typedef struct _ATOM_SRC_DST_TABLE_FOR_ONE_OBJECT //usSrcDstTableOffset pointing to this structure
{
UCHAR ucNumberOfSrc;
USHORT usSrcObjectID[1];
UCHAR ucNumberOfDst;
USHORT usDstObjectID[1];
}ATOM_SRC_DST_TABLE_FOR_ONE_OBJECT;
usSrcObjectID[] and usDstObjectID[] are variably sized, so we
can't access them directly. Use pointers and update the offset
appropriately when accessing the Dst members.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Setting MC_MISC_CNTL.GART_INDEX_REG_EN causes hangs on
some boards on resume. The systems seem to work fine
without touching this bit so leave it as is.
v2: read-modify-write the GART_INDEX_REG_EN bit.
I suspect the problem is that we are losing the other
settings in the register.
fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=52952
Reported-by: Ondrej Zary <linux@rainbow-software.org>
Tested-by: Daniel Tobias <dan.g.tob@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
This fills in the GPU specific details for berlin
GPU cores so that the driver will work with them.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
These blocks need to be ungated for the other parts of
the driver properly initialize them (e.g., after a gpu
reset, etc.).
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Since we aren't using it when the crtc is disabled, turn it off
to save power. The GRPH block is the part of the display
controller that controls the primary graphics plane (size,
address, etc.).
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
These fixes make writes work properly. Previously
only reads worked. Note that this feature is off
by default.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If the LCD table contains an EDID record, properly account
for the edid size when walking through the records.
This should fix error messages about unknown LCD records.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Check the overrides in the firmware info table before
enabling spread spectrum on the engine or memory clocks.
Some boards may have valid spread spectrum tables, but
shouldn't necessarily have it enabled.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We need to allocate line buffer to each display when
setting up the watermarks. Failure to do so can lead
to a blank screen. This fixes blank screen problems
on dce8 asics.
Based on an initial fix from:
Jay Cornwall <jay.cornwall@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
We need to allocate line buffer to each display when
setting up the watermarks. Failure to do so can lead
to a blank screen. This fixes blank screen problems
on dce6 asics.
Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=64850
Based on an initial fix from:
Jay Cornwall <jay.cornwall@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
We need to allocate line buffer to each display when
setting up the watermarks. Failure to do so can lead
to a blank screen. This fixes blank screen problems
on dce4.1/5 asics.
Based on an initial fix from:
Jay Cornwall <jay.cornwall@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Also add a new RADEON_INFO query to check that CP DMA packets are
supported on the compute ring.
CP DMA has been supported since the 3.8 kernel, but due to an oversight
we forgot to teach the CS checker that the CP DMA packet was legal for
the compute ring on Southern Islands GPUs.
This patch fixes a bug where the radeon driver will incorrectly reject a legal
CP DMA packet from user space. I would like to have the patch
backported to stable so that we don't have to require Mesa users to use a
bleeding edge kernel in order to take advantage of this feature which
is already present in the stable kernels (3.8 and newer).
v2:
- Don't bump kms version, so this patch can be backported to stable
kernels.
Cc: stable@vger.kernel.org
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Now that the CP is no longer reset and cg is properly
disabled in when appropriate in the dpm code we can
now enable mgcg (medium grained clockgating).
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Clockgating needs to be disabled around certain parts
of dpm setup otherwise the smc gets into a bad state
and dpm doesn't work properly.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Clockgating needs to be disabled around certain parts
of dpm setup otherwise the smc gets into a bad state
and dpm doesn't work properly.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The format of the clearstate buffer used for pg (powergating)
changed between NI and SI. This formats it properly for what
the hardware expects on SI+.
v2: fix addresses
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Clockgating requires signalling between the CP and the
RLC to work properly. Resetting the CP block in the
CP resume code messed up the internal coordination
between the blocks. Removing the reset allows gfx
clockgating to work properly. However, when gfx clock
gating is enabled, there is a strange interaction with
dpm which causes the chip to stay in the high performance
level all the time, so leave gfx clockgating disabled
for now.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
- use new cg/pg flags for finer grained clock and
powergating control
- restructure the cg/pg code so it can be called from
other components such as dpm
v2: fix build breakage from rebase
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The format of the clearstate buffer used for pg (powergating)
changed between NI and SI. This formats it properly for what
the hardware expects on SI.
v2: fix addresses
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Now that the CP is no longer reset and cg is properly
disabled in when appropriate in the dpm code we can
now enable mgcg (medium grained clockgating).
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Clockgating needs to be disabled around certain parts
of dpm setup otherwise the smc gets into a bad state
and dpm doesn't work properly.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Clockgating requires signalling between the CP and the
RLC to work properly. Resetting the CP block in the
CP resume code messed up the internal coordination
between the blocks. Removing the reset allows gfx
clockgating to work properly. However, when gfx clock
gating is enabled, there is a strange interaction with
dpm which causes the chip to stay in the high performance
level all the time, so leave gfx clockgating disabled
for now.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Resturcture clockgating code so that it can be
enabled/disabled from other components such as
dpm.
v2: make function static
v3: add fine grained cg controls
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This commits adds flags for supported clockgating and
powergating features. This allows us to more easily
track which features are supported on a particular
asic and to enable/disable features for debugging.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This updates the audio driver to the speaker allocation
block from the EDID. A similar change was just implemented
for DCE4-8.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This updates the audio driver to the speaker allocation
block from the EDID. A similar change was just implemented
for DCE6/8.
v2: remove unused variables
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Rafał Miłecki <zajec5@gmail.com>
Do it before enabling audio channels (in AFMT_AUDIO_PACKET_CONTROL2
register).
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Similar to DCE4/5, but supports multiple audio pins
which can be assigned per afmt block.
v2: rework the driver to handle more than one audio
pin.
v3: try different dto reg
v4: properly program dto
v5 (ck): change dto programming order
v6: program speaker allocation block
v7: rebase
v8: rebase on Rafał's changes
v9: integrated Rafał's comments, update to latest
drm_edid_to_speaker_allocation API
v10: add missing line break in error message
v11: add back audio enabled messages
v12: fix copy paste typo in r600_audio_enable
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Rafał Miłecki <zajec5@gmail.com>
This adds a helper function to extract the speaker allocation
data block from the EDID. This data block describes what speakers
are present on the display device.
v2: update per Ville Syrjälä's comments
v3: fix copy/paste typo in memory allocation
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Rafał Miłecki <zajec5@gmail.com>
Similar to separating the UVD code, just put the DMA
functions into separate files.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Our different hardware blocks are actually completely
separated, so it doesn't make much sense any more to
structure the code by pure chipset generations.
Start restructuring the code by separating our the UVD block.
v2: updated commit message
v3: rebased and restructurized start/stop functions for kv dpm.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Now that we have callbacks for [rw]ptr handling we can
remove the special handling for the DMA rings and use
the callbacks instead.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The hardware just doesn't support this correctly.
Disable it before we accidentally write anywhere we shouldn't.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Give the ring functions a separate structure and let the asic
structure point to the ring specific functions. This simplifies
the code and allows us to make changes at only one point.
No change in functionality.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Need to swap the data fetched over i2c properly. This
is the same fix as the endian fix for aux channel
transactions.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
According to the internal teams, we never hit the limit for
mclk switching on these asics, so we can disable the check.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The LCD has a relatively short vblank time (216us), but
the card is able to reclock memory fine in that time.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reported-by: normalrawr@gmail.com
Disable the UVD block when not in use to save power.
The block is not actually powergated on CI, but we
switch between UVD DPM (where the uvd clocks are
adjusted on demand) and clocks off.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When we PG (powergate) UVD, we need to re-initialize it
before we can use it again.
v2: rebase on UVD stop fixes
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Starting on CIK, multi-media blocks like UVD no longer
have special power state. Rather they have their own
DPM implementation which adjusts their clocks dynamically
when active. When they are not active, the blocks are
powergated to save power.
v2: add missing pm locks
v3: rebase on uvd state selection rework
v4: fix inverted logic typo noticed by Christian
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Check if we can switch the mclk during the vblank time otherwise
we may get artifacts on the screen when the mclk changes.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This adds dpm support for btc asics. This includes:
- dynamic engine clock scaling
- dynamic memory clock scaling
- dynamic voltage scaling
- dynamic pcie gen switching
Set radeon.dpm=1 to enable.
v2: remove unused radeon_atombios.c changes,
make missing smc ucode non-fatal
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This adds dpm support for KB/KV asics. This includes:
- dynamic engine clock scaling
- dynamic voltage scaling
- power containment
- shader power scaling
Set radeon.dpm=1 to enable.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Internally we switched to using a separate header for
atombios pplib definitions. Switch over the open source
driver.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Calculate the low and high watermarks based on the low and high
clocks for the current power state. The dynamic pm hw will select
the appropriate watermark based on the internal dpm state.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Newer asics have a lot of vram so it's less of an
issue to waste a little more space for the gart
page table. This gives us some additional gart space
before having to migrate to non-gart system ram
for games, etc. where we use up most of vram.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
1. Handle the the thermal state directly in the work handler.
Remove the state selection function since nothing else uses it now.
2. On some asics there is no thermal state, so we just use a regular
state and force the low performance state.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use the UVD handle information to determine which
which power states to select when using UVD. For
example, decoding a single SD stream requires much
lower clocks than multiple HD streams.
v2: switch to a cleaner dpm/uvd interface
v3: change the uvd power state while streams
are active if need be
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add a helper function for counting the number of open stream handles.
v2: fix copy-pasta in comments and whitespace error
v3: make function static since it's only used in radeon_uvd.c
at the moment
v4: make non-static again for future changes
v5: make static again for new rework of dpm uvd changes
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Need to get my stuff out the door ;-) Highlights:
- pc8+ support from Paulo
- more vma patches from Ben.
- Kconfig option to enable preliminary support by default (Josh
Triplett)
- Optimized cpu cache flush handling and support for write-through caching
of display planes on Iris (Chris)
- rc6 tuning from Stéphane Marchesin for more stability
- VECS seqno wrap/semaphores fix (Ben)
- a pile of smaller cleanups and improvements all over
Note that I've ditched Ben's execbuf vma conversion for 3.12 since not yet
ready. But there's still other vma conversion stuff in here.
* tag 'drm-intel-next-2013-08-23' of git://people.freedesktop.org/~danvet/drm-intel: (62 commits)
drm/i915: Print seqnos as unsigned in debugfs
drm/i915: Fix context size calculation on SNB/IVB/VLV
drm/i915: Use POSTING_READ in lcpll code
drm/i915: enable Package C8+ by default
drm/i915: add i915.pc8_timeout function
drm/i915: add i915_pc8_status debugfs file
drm/i915: allow package C8+ states on Haswell (disabled)
drm/i915: fix SDEIMR assertion when disabling LCPLL
drm/i915: grab force_wake when restoring LCPLL
drm/i915: drop WaMbcDriverBootEnable workaround
drm/i915: Cleaning up the relocate entry function
drm/i915: merge HSW and SNB PM irq handlers
drm/i915: fix how we mask PMIMR when adding work to the queue
drm/i915: don't queue PM events we won't process
drm/i915: don't disable/reenable IVB error interrupts when not needed
drm/i915: add dev_priv->pm_irq_mask
drm/i915: don't update GEN6_PMIMR when it's not needed
drm/i915: wrap GEN6_PMIMR changes
drm/i915: wrap GTIMR changes
drm/i915: add the FCLK case to intel_ddi_get_cdclk_freq
...
Let applications know whether the kernel supports asynchronous page
flipping.
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>
This lets drivers see the flags requested by the application
[airlied: fixup for rcar/imx/msm]
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>
We're taking the sizeof() the wrong thing so it doesn't clear the whole
buffer.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>
Drivers that don't support PRIME will not have initialized the PRIME
specific private component of struct drm_file. If called for such
drivers, the drm_gem_remove_prime_handles() function will crash. Fix
it by checking for PRIME support prior to removing the PRIME handles.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>
This fixes the piglit test texturing/max-texture-size
causing the VM to die due to a too large SVGA command.
Signed-off-by: Jakob Bornecrantz <jakob@vmware.com>
Reviewed-by: Biran Paul <brianp@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@gmail.com>
There is a typo so deadlocks on error instead of unlocking.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@gmail.com>
Fix to return -ENOMEM in the fence manager init error handling
case instead of 0, as done elsewhere in this function.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Acked-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>
Render nodes provide an API for userspace to use non-privileged GPU
commands without any running DRM-Master. It is useful for offscreen
rendering, GPGPU clients, and normal render clients which do not perform
modesetting.
Compared to legacy clients, render clients no longer need any
authentication to perform client ioctls. Instead, user-space controls
render/client access to GPUs via filesystem access-modes on the
render-node. Once a render-node was opened, a client has full access to
the client/render operations on the GPU. However, no modesetting or ioctls
that affect global state are allowed on render nodes.
To prevent privilege-escalation, drivers must explicitly state that they
support render nodes. They must mark their render-only ioctls as
DRM_RENDER_ALLOW so render clients can use them. Furthermore, they must
support clients without any attached master.
If filesystem access-modes are not enough for fine-grained access control
to render nodes (very unlikely, considering the versaitlity of FS-ACLs),
you may still fall-back to fd-passing from server to client (which allows
arbitrary access-control). However, note that revoking access is
currently impossible and unlikely to get implemented.
Note: Render clients no longer have any associated DRM-Master as they are
supposed to be independent of any server state. DRM core highly depends on
file_priv->master to be non-NULL for modesetting/ctx/etc. commands.
Therefore, drivers must be very careful to not require DRM-Master if they
support DRIVER_RENDER.
So far render-nodes are protected by "drm_rnodes". As long as this
module-parameter is not set to 1, a driver will not create render nodes.
This allows us to experiment with the API a bit before we stabilize it.
v2: drop insecure GEM_FLINK to force use of dmabuf
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
HDMI_IDENTIFIER was felt too generic, rename it to what it is, the IEEE
OUI corresponding to HDMI Licensing, LLC.
http://standards.ieee.org/develop/regauth/oui/oui.txt
Cc: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>
With all the common infoframe bits now in place, we can finally write
the vendor specific infoframes in our driver.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>
This can then be used by DRM drivers to setup their vendor infoframes.
v2: Fix hmdi typo (Simon Farnsworth)
v3: Adapt to the hdmi_vendor_infoframe rename
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Simon Farnsworth <simon.farnsworth@onelan.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>
We just got rid of the version of hdmi_vendor_infoframe that had a byte
array for anyone to poke at. It's now time to shuffle around the naming
of hdmi_hdmi_infoframe to make hdmi_vendor_infoframe become the HDMI
vendor specific structure.
Cc: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>
We'll need the HDMI OUI for the HDMI vendor infoframe data, so let's
move the DRM one to hdmi.h, might as well use the hdmi header to store
some hdmi defines.
(Note that, in fact, infoframes are part of the CEA-861 standard, and
only the HDMI vendor specific infoframe is special to HDMI, but
details..)
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>
I just wrote the bits to define and pack HDMI vendor specific infoframe.
Port the host1x driver to use those so I can refactor the infoframe code
a bit more.
This changes the length of the infoframe payload from 6 to 5, which is
enough for the "frame packing" stereo format.
v2: Pimp up the commit message with the note about the length
(Ville Syrjälä)
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Terje Bergström <tbergstrom@nvidia.com>
Cc: linux-tegra@vger.kernel.org
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>
To set the active aspect ratio value in the AVI infoframe today, you not
only have to set the active_aspect field, but also the active_info_valid
bit. Out of the 1 user of this API, we had 100% misuse, forgetting the
_valid bit. This was fixed in:
Author: Damien Lespiau <damien.lespiau@intel.com>
Date: Tue Aug 6 20:32:17 2013 +0100
drm: Don't generate invalid AVI infoframes for CEA modes
We can do better and derive the _valid bit from the user wanting to set
the active aspect ratio.
v2: Fix multi-lines comment style (Thierry Reding)
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>
HDMI 1.4 adds 4 "4k x 2k" modes in the the CEA vendor specific block.
With this commit, we now parse this block and expose the 4k modes that
we find there.
v2: Fix the "4096x2160" string (nice catch!), add comments about
do_hdmi_vsdb_modes() arguments and make it clearer that offset is
relative to the end of the required fields of the HDMI VSDB
(Ville Syrjälä)
v3: Fix 'Unknow' typo (Simon Farnsworth)
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Tested-by: Cancan Feng <cancan.feng@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67030
Reviewed-by: Simon Farnsworth <simon.farnsworth@onelan.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>
A few styles issues have crept in here, fix them before touching this
code again.
v2: constify arguments that can be (Ville Syrjälä)
v3: constify, but better (Ville Syrjälä)
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>
This function is only used inside drm_edid.c.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Dave Airlie <airlied@gmail.com>
Fix the typo introduced in
commit 1a2eb4604b
Author: Keith Packard <keithp@keithp.com>
Date: Wed Nov 16 16:26:07 2011 -0800
drm/i915: Hook up Ivybridge eDP
This fixes eDP link-training failures and cases where all voltage swing
/pre-emphasis levels were tried and failed during clock recovery and -
as a fallback - we go on to do channel equalization with the last voltage
swing/pre-emphasis level which will succeed. Both issues can lead to a
blank screen.
v2:
- improve commit message
CC: stable@vger.kernel.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64880
Tested-by: Jeremy Moles <cubicool@gmail.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This hooks nouveau up to the runtime PM system to enable
dynamic power management for secondary GPUs in switchable
and optimus laptops.
a) rewrite suspend/resume printks to hide them during dynamic s/r
to avoid cluttering logs
b) add runtime pm suspend to irq handler, crtc display, ioctl handler,
connector status,
c) handle hdmi audio dynamic power on/off using magic register.
v0.5:
make sure we hit D3 properly
fix fbdev_set_suspend locking interaction, we only will poweroff if we have no
active crtcs/fbcon anyways.
add reference for active crtcs.
sprinkle mark last busy for autosuspend timeout
v0.6:
allow more flexible debugging - to avoid log spam
add option to enable/disable dynpm
got to D3Cold
v0.7:
add hdmi audio support.
v0.8:
call autosuspend from idle, so pci config space access doesn't go straight
back to sleep, this makes starting X faster.
only signal usage if we actually handle the irq, otherwise usb keeps us awake.
fix nv50 display active powerdown
v0.9:
use masking function to enable hdmi audio
set busy when we fail to suspend
Signed-off-by: Dave Airlie <airlied@redhat.com>
For optimus and powerxpress muxless we really want the GPU
driver deciding when to power up/down the GPU, not userspace.
This adds the ability for a driver to dynamically power up/down
the GPU and remove the switcheroo from controlling it, the
switcheroo reports the dynamic state to userspace also.
It also adds 2 power domains, one for machine where the power
switch is controlled outside the GPU D3 state, so the powerdown
ordering is done correctly, and the second for the hdmi audio
device to make sure it can resume for PCI config space accesses.
v1.1: fix build with switcheroo off
v2: add power domain support for radeon and v1 nvidia dsms
v2.1: fix typo in off case
v3: add audio power domain for hdmi audio + misc audio fixes
v4: use PCI_SLOT macro, drop power reference on hdmi audio resume
failure also.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Merge the MSM driver from Rob Clark
* 'drm-next' of git://people.freedesktop.org/~robclark/linux:
drm/msm: add basic hangcheck/recovery mechanism
drm/msm: add a3xx gpu support
drm/msm: add register definitions for gpu
drm/msm: basic KMS driver for snapdragon
drm/msm: add register definitions
GEM does already a good job in tracking access to gem buffers via handles
and drm_vma access management. However, TTM drivers currently do not
verify this during mmap().
TTM provides the verify_access() callback to test this. So fix all drivers
to actually call into gem+vma to verify access instead of always returning
0.
All drivers assume that user-space can only get access to TTM buffers via
GEM handles. So whenever the verify_access() callback is called from
ttm_bo_mmap(), the buffer must have a valid embedded gem object. This is
true for all TTM+GEM drivers. But that's why this patch doesn't touch pure
TTM drivers (ie, vmwgfx).
v2: Switch to drm_vma_node_verify_access() to correctly return -EACCES if
access was denied.
Cc: Dave Airlie <airlied@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
We implement automatic vma mmap() access management for all drivers using
gem_mmap. We use the vma manager to add each open-file that creates a
gem-handle to the vma-node of the underlying gem object. Once the handle
is destroyed, we drop the open-file again.
This allows us to use drm_vma_node_is_allowed() on _any_ gem object to see
whether an open-file is granted access. In drm_gem_mmap() we use this to
verify that unprivileged users cannot guess gem offsets and map arbitrary
buffers.
Note that this manages access for _all_ gem users (also TTM+GEM), but the
actual access checks are only done for drm_gem_mmap(). TTM drivers use the
TTM mmap helpers, which need to do that separately.
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The VMA offset manager uses a device-global address-space. Hence, any
user can currently map any offset-node they want. They only need to guess
the right offset. If we wanted per open-file offset spaces, we'd either
need VM_NONLINEAR mappings or multiple "struct address_space" trees. As
both doesn't really scale, we implement access management in the VMA
manager itself.
We use an rb-tree to store open-files for each VMA node. On each mmap
call, GEM, TTM or the drivers must check whether the current user is
allowed to map this file.
We add a separate lock for each node as there is no generic lock available
for the caller to protect the node easily.
As we currently don't know whether an object may be used for mmap(), we
have to do access management for all objects. If it turns out to slow down
handle creation/deletion significantly, we can optimize it in several
ways:
- Most times only a single filp is added per bo so we could use a static
"struct file *main_filp" which is checked/added/removed first before we
fall back to the rbtree+drm_vma_offset_file.
This could be even done lockless with rcu.
- Let user-space pass a hint whether mmap() should be supported on the
bo and avoid access-management if not.
- .. there are probably more ideas once we have benchmarks ..
v2: add drm_vma_node_verify_access() helper
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
A basic, no-frills recovery mechanism in case the gpu gets wedged. We
could try to be a bit more fancy and restart the next submit after the
one that got wedged, but for now keep it simple. This is enough to
recover things if, for example, the gpu hangs mid way through a piglit
run.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Add initial support for a3xx 3d core.
So far, with hardware that I've seen to date, we can have:
+ zero, one, or two z180 2d cores
+ a3xx or a2xx 3d core, which share a common CP (the firmware
for the CP seems to implement some different PM4 packet types
but the basics of cmdstream submission are the same)
Which means that the eventual complete "class" hierarchy, once
support for all past and present hw is in place, becomes:
+ msm_gpu
+ adreno_gpu
+ a3xx_gpu
+ a2xx_gpu
+ z180_gpu
This commit splits out the parts that will eventually be common
between a2xx/a3xx into adreno_gpu, and the parts that are even
common to z180 into msm_gpu.
Note that there is no cmdstream validation required. All memory access
from the GPU is via IOMMU/MMU. So as long as you don't map silly things
to the GPU, there isn't much damage that the GPU can do.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Generated from rnndb files in:
https://github.com/freedreno/envytools
Keep this split out as a separate commit to make it easier to review the
actual driver.
Signed-off-by: Rob Clark <robdclark@gmail.com>
The snapdragon chips have multiple different display controllers,
depending on which chip variant/version. (As far as I can tell, current
devices have either MDP3 or MDP4, and upcoming devices have MDSS.) And
then external to the display controller are HDMI, DSI, etc. blocks which
may be shared across devices which have different display controller
blocks.
To more easily add support for different display controller blocks, the
display controller specific bits are split out into a "kms" module,
which provides the kms plane/crtc/encoder objects.
The external HDMI, DSI, etc. blocks are part encoder, and part connector
currently. But I think I will pull in the drm_bridge patches from
chromeos tree, and split them into a bridge+connector, with the
registers that need to be set in modeset handled by the bridge. This
would remove the 'msm_connector' base class. But some things need to be
double checked to make sure I could get the correct ON/OFF sequencing..
This patch adds support for mdp4 crtc (including hw cursor), dtv encoder
(part of MDP4 block), and hdmi.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Generated from rnndb files in:
https://github.com/freedreno/envytools
Keep this split out as a separate commit to make it easier to review the
actual driver.
Signed-off-by: Rob Clark <robdclark@gmail.com>
I don't like seeing signed seqnos. Make them unsigned.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
All the different context sizes reported in the CXT_SIZE register
aren't meant to be simply added together.
While BSpec is somewhat unclear on the topic of the actual context
size, empirical tests have now revealed the truth. So let's add a
big fat comment to remind people how it all works.
As a result of correctly interpreting CXT_SIZE, the IVB context
size is reduced from three pages to two, while SNB context size
remains at two pages.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Acked-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If we don't use the return value of a mmio read our coding style is to
use the POSTING_READ macro. This avoids cluttering the mmio traces.
While at it add the missing posting read in the lcpll enable function
that Paulo spotted.
v2: Drop the _NOTRACE changes, tracing such wait_for loops in the modeset
code might actually be rather useful!
Cc: Paulo Zanoni <przanoni@gmail.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This should be working, so enable it by default. Also easy to revert.
v2: Rebase, s/allow/enable/.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We currently only enter PC8+ after all its required conditions are
met, there's no rendering, and we stay like that for at least 5
seconds.
I chose "5 seconds" because this value is conservative and won't make
us enter/leave PC8+ thousands of times after the screen is off: some
desktop environments have applications that wake up and do rendering
every 1-3 seconds, even when the screen is off and the machine is
completely idle.
But when I was testing my PC8+ patches I set the default value to
100ms so I could use the bad-behaving desktop environments to
stress-test my patches. I also thought it would be a good idea to ask
our power management team to test different values, but I'm pretty
sure they would ask me for an easy way to change the timeout. So to
help these 2 cases I decided to create an option that would make it
easier to change the default value. I also expect people making
specific products that use our driver could try to find the perfect
timeout for them.
Anyway, fixing the bad-behaving applications will always lead to
better power savings than just changing the timeout value: you need to
stop waking the Kernel, not quickly put it back to sleep again after
you wake it for nothing. Bad sleep leads to bad mood!
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Make it print the value of the variables on the PC8 struct.
v2: Update to recent renames and add the new fields.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This patch allows PC8+ states on Haswell. These states can only be
reached when all the display outputs are disabled, and they allow some
more power savings.
The fact that the graphics device is allowing PC8+ doesn't mean that
the machine will actually enter PC8+: all the other devices also need
to allow PC8+.
For now this option is disabled by default. You need i915.allow_pc8=1
if you want it.
This patch adds a big comment inside i915_drv.h explaining how it
works and how it tracks things. Read it.
v2: (this is not really v2, many previous versions were already sent,
but they had different names)
- Use the new functions to enable/disable GTIMR and GEN6_PMIMR
- Rename almost all variables and functions to names suggested by
Chris
- More WARNs on the IRQ handling code
- Also disable PC8 when there's GPU work to do (thanks to Ben for
the help on this), so apps can run caster
- Enable PC8 on a delayed work function that is delayed for 5
seconds. This makes sure we only enable PC8+ if we're really
idle
- Make sure we're not in PC8+ when suspending
v3: - WARN if IRQs are disabled on __wait_seqno
- Replace some DRM_ERRORs with WARNs
- Fix calls to restore GT and PM interrupts
- Use intel_mark_busy instead of intel_ring_advance to disable PC8
v4: - Use the force_wake, Luke!
v5: - Remove the "IIR is not zero" WARNs
- Move the force_wake chunk to its own patch
- Only restore what's missing from RC6, not everything
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This was causing WARNs in one machine, so instead of trying to guess
exactly which hotplug bits should exist, just do the test on the
non-HPD bits. We don't care about the state of the hotplug bits, we
just care about the others, that need to be 1.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If LCPLL is disabled, there's a chance we might be in package C8 state
or deeper, and we'll get a hard hang when restoring LCPLL (also, a red
led lights up on my motherboard). So grab the force_wake, which will
get us out of RC6 and, as a consequence, out of PC8+ (since we need
RC6 to get into PC8+).
Note: Discussions with hw designers are still ongoing what exactly
goes boom here. But I think we can go ahead and just merge this little
hack for now until it's clear what we actually need.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
[danvet: Add small note about the current state of the discussion
around this hack.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Turns out the BIOS will do this for us as needed, and if we try to do it
again we risk hangs or other bad behavior.
Note that this seems to break libva on ChromeOS after resumes (but
strangely _not_ after booting up).
This essentially reverts
commit b4ae3f22d2
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date: Thu Jun 14 11:04:48 2012 -0700
drm/i915: load boot context at driver init time
and
commit b3bf076697
Author: Paulo Zanoni <paulo.r.zanoni@intel.com>
Date: Tue Nov 20 13:27:44 2012 -0200
drm/i915: implement WaMbcDriverBootEnable on Haswell
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reported-and-Tested-by: Stéphane Marchesin <marcheu@chromium.org>
[danvet: Add note about impact and regression citation.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
As the relocate entry function was getting a bit too big I've moved
the code that used to use either the cpu or the gtt to for the
relocation into two separate functions.
Signed-off-by: Rafael Barbalho <rafael.barbalho@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Because hsw_pm_irq_handler does exactly what gen6_rps_irq_handler does
and also processes the 2 additional VEBOX bits. So merge those
functions and wrap the VEBOX bits on a HAS_VEBOX check. This
check isn't really necessary since the bits are reserved on
SNB/IVB/VLV, but it's a good documentation on who uses them.
v2: - Change IS_HASWELL check to HAS_VEBOX
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It seems we've been doing this ever since we started processing the
RPS events on a work queue, on commit "drm/i915: move gen6 rps
handling to workqueue", 4912d04193.
The problem is: when we add work to the queue, instead of just masking
the bits we queued and leaving all the others on their current state,
we mask the bits we queued and unmask all the others. This basically
means we'll be unmasking a bunch of interrupts we're not going to
process. And if you look at gen6_pm_rps_work, we unmask back only
GEN6_PM_RPS_EVENTS, which means the bits we unmasked when adding work
to the queue will remain unmasked after we process the queue.
Notice that even though we unmask those unrelated interrupts, we never
enable them on IER, so they don't fire our interrupt handler, they
just stay there on IIR waiting to be cleared when something else
triggers the interrupt handler.
So this patch does what seems to make more sense: mask only the bits
we add to the queue, without unmasking anything else, and so we'll
unmask them after we process the queue.
As a side effect we also have to remove that WARN, because it is not
only making sure we don't mask useful interrupts, it is also making
sure we do unmask useless interrupts! That piece of code should not be
responsible for knowing which bits should be unmasked, so just don't
assert anything, and trust that snb_disable_pm_irq should be doing the
right thing.
With i915.enable_pc8=1 I was getting ocasional "GEN6_PMIIR is not 0"
error messages due to the fact that we unmask those unrelated
interrupts but don't enable them.
Note: if bugs start bisecting to this patch, then it probably means
someone was relying on the fact that we unmask everything by accident,
then we should fix gen5_gt_irq_postinstall or whoever needs the
accidentally unmasked interrupts. Or maybe I was just wrong and we
need to revert this patch :)
Note: This started to be a more real issue with the addition of the
VEBOX support since now we do enable more than just the minimal set of
RPS interrupts in the IER register. Which means after the first rps
interrupt has happened we will never mask the VEBOX user interrupts
again and so will blow through cpu time needlessly when running video
workloads.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Add note that this started to matter with VEBOX much more.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On SNB/IVB/VLV we only call gen6_rps_irq_handler if one of the IIR
bits set is part of GEN6_PM_RPS_EVENTS, but at gen6_rps_irq_handler we
add all the enabled IIR bits to the work queue, not only the ones that
are part of GEN6_PM_RPS_EVENTS. But then gen6_pm_rps_work only
processes GEN6_PM_RPS_EVENTS, so it's useless to add anything that's
not GEN6_PM_RPS_EVENTS to the work queue.
As a bonus, gen6_rps_irq_handler looks more similar to
hsw_pm_irq_handler, so we may be able to merge them in the future.
v2: - Add a WARN in case we queued something we're not going to
process.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If the error interrupts are already disabled, don't disable and
reenable them. This is going to be needed when we're in PC8+, where
all the interrupts are disabled so we won't risk re-enabling
DE_ERR_INT_IVB.
v2: Use dev_priv->irq_mask (Chris)
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Just like irq_mask and gt_irq_mask, use it to track the status of
GEN6_PMIMR so we don't need to read it again every time we call
snb_update_pm_irq.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
I did some brief tests and the "new_val = pmimr" condition usually
happens a few times after exiting games.
Note: This is also prep work to track the GEN6_PMIMR register state in
dev_priv->pm_imr. This happens in the next patch.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
[danvet: Add note to explain why we want this, as per the discussion
between Chris and Paulo.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Just like we're doing with the other IMR changes.
One of the functional changes is that not every caller was doing the
POSTING_READ.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Just like the functions that touch DEIMR and SDEIMR, but for GTIMR.
The new functions contain a POSTING_READ(GTIMR) which was not present
at the 2 callers inside i915_irq.c.
The implementation is based on ibx_display_interrupt_update.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We already have code to disable LCPLL and switch to FCLK, so we need this too.
We still don't call the code to disable LCPLL, but we'll call it when we add
support for Package C8+.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On SNB and IVB, there's an MSR (also exposed through MCHBAR) we can use
to read out the amount of energy used over time. Expose this in sysfs
to make it easy to do power comparisons with different configurations.
If the platform supports it, the file will show up under the
drm/card0/power subdirectory of the PCI device in sysfs as gt_energy_uJ.
The value in the file is a running total of energy (in microjoules)
consumed by the graphics device.
v2: move to sysfs (Ben, Daniel)
expose a simple value (Chris)
drop unrelated hunk (Ben)
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
v3: by Ben
Tied it into existing rc6 sysfs entries and named that a more generic
"power attrs." Fixed rebase conflicts.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
v4: Since RAPL is a real driver that already exists to serve power
monitoring, place our entry in debugfs. This gives me a fallback
location for systems that do not expose it otherwise.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The code directly uses the registers and ring->mmio_base.
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This define hasn't been used since:
commit cfdf1fa23f
Author: Kristian Høgsberg <krh@bitplanet.net>
Date: Wed Dec 16 15:16:16 2009 -0500
drm/i915: Implement IS_* macros using static tables
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The code using this was removed in:
commit 88f23b8fa3
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Sun Dec 5 15:08:31 2010 +0000
drm/i915: Avoid using PIPE_CONTROL on Ironlake
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This define hasn't been used since:
commit 652c393a33
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date: Mon Aug 17 13:31:43 2009 -0700
drm/i915: add dynamic clock frequency control
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The existing code was trying different vswing and preemphasis settings
in the wrong place, and wasn't trying them enough. So add a loop to
walk through them, properly disabling FDI TX and RX in between if a
failure is detected.
v2: remove unneeded reg writes, add delays around bit lock checks (Jesse)
v3: fix TX and RX disable per spec (Paulo)
fix delays per spec (Paulo)
make RX symbol lock check match TX bit lock check (Paulo)
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51983
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Just one patch that soaked for quite a bit to fix a resume issue,
resulting in gpu hangs (or worse) due to tlb containing garbage.
* tag 'drm-intel-fixes-2013-08-23' of git://people.freedesktop.org/~danvet/drm-intel:
drm/i915: Invalidate TLBs for the rings after a reset
In the new execbuf code we want to track buffers using the vmas even
before they're all properly mapped. Which means that bind_to_vm needs
to deal with buffers which have preallocated vmas which aren't yet
bound.
This patch implements this prep work and adjusts our WARN/BUG checks.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Split out from Ben's big execbuf patch. Also move one BUG
back to its original place to deflate the diff a notch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The execbuf wants to do relocations usings vmas, so we need a
vma->exec_list. The eviction code also uses the old obj execbuf list
for it's own book-keeping, but would really prefer to deal in vmas
only. So switch it over to the new list.
Again this is just a prep patch for the big execbuf vma conversion.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Split out from Ben's big execbuf vma patch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
To convert the execbuf code over to use vmas natively we need to
shuffle the exec_list a bit. This patch here just prepares things with
the debugfs code, which also uses the old exec_list list_head, newly
called obj_exec_link.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: Split out from Ben's big patch.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When building kernels for a preliminary hardware target, having to add a
kernel command-line option can prove inconvenient. Add a Kconfig option
that changes the default of this option to 1.
Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: Pimp the Kconfig help text a bit as suggested by Damien in
his 2nd review.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Our driver initialization doesn't seem to be ready to load when the
power well is disabled: we hit a few "Unclaimed register" messages. So
do just like we already do for the suspend/resume path: enable the
power well before unloading.
At some point we'll want to be able to survive suspend/resume and
load/unload with the power well disabled, but for now let's just fix
the regression.
Regression introduced by the following commit:
commit bf51d5e2cd
Author: Paulo Zanoni <paulo.r.zanoni@intel.com>
Date: Wed Jul 3 17:12:13 2013 -0300
drm/i915: switch disable_power_well default value to 1
Bug can be reproduced by running the "module_reload" script from
intel-gpu-tools.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67813
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Makes it more obviously correct what tricks we play by reusing the drm
prime release helper.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This fixes a WARN in i915_gem_free_object when the
obj->pages_pin_count isn't 0.
v2: Add locking to unmap, noticed by Chris Wilson. Note that even
though we call unmap with our own dev->struct_mutex held that won't
result in an immediate deadlock since we never go through the dma_buf
interfaces for our own, reimported buffers. But it's still easy to
blow up and anger lockdep, but that's already the case with our ->map
implementation. Fixing this for real will involve per dma-buf ww mutex
locking by the callers. And lots of fun. So go with the duct-tape
approach for now.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reported-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Tested-by: Armin K. <krejzi@email.com> (v1)
Acked-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Un-masking all PM interrupts causes hardware to generate
interrupts regardless of whether the interrupts are enabled
on the DE side. Since turbo only need up/down threshold and
rc6 timeout interrupt, mask all other interrupts bits to avoid
unnecessary overhead/wake up.
Note that our interrupt handler isn't being fired since we do set the
IER bits properly (IIR bits aren't set). The overhead isn't because
our driver is reacting to these interrupts, but because hardware keeps
generating internal messages when PMINTRMSK doesn't mask out the
up/down EI interrupts (which happen periodically).
Change-Id: I6c947df6fd5f60584d39b9e8b8c89faa51a5e827
Signed-off-by: Vinit Azad <vinit.azad@intel.com>
[danvet: Add follow-up explanation of the precise effects from Vinit
as a note to the commit message.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Whenever I need to work with the HSW_PWER_WELL_* register bits I have
to look at the documentation to find out which bit is to request the
power well and which one shows its current state. Rename the bits so I
won't need to look the docs every time.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If the power well is disabled VGA is guaranteed to be disabled.
This fixes unclaimed register messages that happen on suspend/resume.
v2: Check the actual hw power well state instead of our own tracking
to make sure VGA is _really_ off (in case the BIOS/KVMr has just its
own request bit set). Requested by Ville.
Note: Ville suggested whether it wouldn't be better to just enable the
power well over a slightly longer time in our resume code, since we
already do that. I tend to agree, but there's also the modeset force
code in the lid notifier which _also_ eventually calls redisable_vga.
We shouldn't ever need this on somewhat modern hw (everything with
opregion essentially) but the code to bail out isn't there. Hence
stick with this simple approach here for now.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67517
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
[danvet: Summarize the discussion around the resume sequence and lid
notifier a bit.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
By our earlier reckoning, move from a snooped/llc setting to an uncached
setting, leaves the CPU cache in a consistent state irrespective of our
domain tracking - so we can forgo the warning about the lack of
invalidation. Similarly for any writes posted to the snooped CPU domain,
we know will be safely clflushed to the uncached PTEs after forcing the
domain change.
This WARN started to pop up with
commit d46f1c3f13
Author: Chris Wilson <chris@chris-wilson.co.uk>
AuthorDate: Thu Aug 8 14:41:06 2013 +0100
drm/i915: Allow the GPU to cache stolen memory
Ville brought up a scenario where the interaction of a set_caching
ioctl call from userspace on a scanout buffer (i.e. obj->pin_display
is set) resulted in the code getting confused and not properly
flushing stale cpu cachelines. Luckily we already prevent this by
rejecting caching changes when obj->pin_count is set.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68040
Tested-by: cancan,feng <cancan.feng@intel.com>
[danvet: Add buglink, bisect result and explain why Ville's scenario
is already taken care of.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The conditional is usually a recoverable driver bug, and so WARNing, and
preventing the drm_mm code from doing potential damage (BUG) is
desirable.
This issue was hit and fixed twice while developing the i915 multiple
address space code. The first fix is the patch just before this, and is
hit on an not frequently occuring error path. Another was fixed during
patch iteration, so it's hard to see from the patch:
commit c6cfb32567
Author: Ben Widawsky <ben@bwidawsk.net>
Date: Fri Jul 5 14:41:06 2013 -0700
drm/i915: Embed drm_mm_node in i915 gem obj
From the intel-gfx mailing list, we discussed this:
References: <20130705191235.GA3057@bwidawsk.net>
Cc: Dave Airlie <airlied@redhat.com>
CC: <dri-devel@lists.freedesktop.org>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Acked-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Use () to make for neater alignment of the split lines, too. With this
we ditch another jump through the obj_gtt_size/offset indirection
maze.
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cleanup the map and fenceable setting during bind to make more sense,
and not check i915_is_ggtt() 2 unnecessary times
v2: Move the bools into the if block (Chris) - There are ways to tidy
this function (fence calculations for instance) even further, but they
are quite invasive, so I am punting on those unless specifically asked.
v3: Add newline between variable declaration and logic (Chris)
Recommended-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
VMAs can be created and not bound. One may think of it as lazy cleanup,
and safely gloss over the conditions which manufacture it. In either
case, when the object backing the i915 vma is destroyed, we must cleanup
the vma without stumbling into a bunch of pitfalls that assume the vma
is bound.
NOTE: I was pretty certain the above condition could only happen when we
introduced the use of VMAs being looked up at execbuf, and already
existing. Paulo has hit this though, so I must be missing something. As
I believe the patch is correct anyway, therefore I won't scratch my head
too hard.
v2: use goto destroy as a compromise (Chris)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Use the standard inversely ordered goto label stack for everything.
Spotted while reviewing place where we might need to to call
vma_destroy but failed to do so.
Cc: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Ideally we could use for_each_ring with the ring flags as I've done a
couple times
(http://lists.freedesktop.org/archives/intel-gfx/2013-June/029450.html).
Until Daniel merges that patch though, we can just use this.
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We require n-1 mailboxes for proper semaphore synchronization. All
semaphore synchronization code relies on proper values in these
mailboxes. The fact that we failed to touch the vebox ring by itself
was unlikely to be an issue since the HW should be initializing the
values to 0. However the error framework for testing seqno wrap
introduced by Mika, in addition to the hangcheck via seqno, and
i915_error_first_batchbuffer() combined caused a nice explosion.
The problem is caused by seqno wrap because the wrap condition is not
properly setup. The wrap code attempts to set the sync mailboxes all
to 0, and then set the current seqno to one less than 0. In all cases,
the vebox mailbox wasn't properly being initialized. This caused a
wrap to not occur. When hangcheck kicks in with the bogus seqno
values, the rest just doesn't work. It makes me wonder if we shouldn't
consider a dumber version of hangcheck...
How we messed this up: VECS support was written before the
aforementioned other features. Upon VECS being rebased, these facts
were missed.
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65387
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67198
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It's basically the same deal as the RC6+ issues on ivy bridge
except this time with RC6 on sandy bridge. Like last time the
core of the issue is that the timings don't work 100% with our
voltage regulator. So from time to time, the kernel will print
a warning message about the GPU not getting out of RC6. In
particular, I found this fairly easy to reproduce during
suspend/resume.
Changing the threshold to 125000 instead of 50000 seems to fix
the issue. The previous patch used 150000 but as it turns out
this doesn't work everywhere. After getting such a machine, I
bisected the highest value which works, which is 125000, so here
it is.
I also measured the idle power usage before/after this patch and
didn't see a difference on a sandy bridge laptop. On haswell and
up, it makes a big difference, so we want to keep it at 50k
there. It also seems like haswell doesn't have the RC6 issues
that sandy bridge has so the 50k value is fine.
Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The machines that fall in this category are the SDVs that have a PCI
ID starting with 0x0C. These are very early pre-production machines
and may not fully work. Other Haswell SDVs have PCI IDs that match the
real Haswell machines and we expect them to work better.
Even though they have problems, they still mostly work so I don't see
a reason to refuse loading our driver. But I do see a reason to reject
bug reports from these machines, so the message should help the bug
triagers.
As far as I know, we don't implement some workarounds that are
specific to these machines and suspend/resume may not work on most of
them, but besides this, they may work.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61508
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
After computing the stage changes for the set_config, record those in
the debug log.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Caught by "make W=1 drivers/gpu/drm/i915/".
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This is primarily for the benefit of the create2 ioctl so that the
caller can avoid the later step of rebinding the bo with new PTE bits.
After introducing WT (and possibly GFDT) cacheing for display targets,
not everything in the display is earmarked as UC, and more importantly
what is is controlled by the kernel.
Note that set_cache_level/get_cache_level for DISPLAY is not necessarily
idempotent; get_cache_level may return UC for architectures that have no
special cache domain for the display engine.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Haswell GT3e has the unique feature of supporting Write-Through cacheing
of objects within the eLLC/LLC. The purpose of this is to enable the display
plane to remain coherent whilst objects lie resident in the eLLC/LLC - so
that we, in theory, get the best of both worlds, perfect display and fast
access.
However, we still need to be careful as the CPU does not see the WT when
accessing the cache. In particular, this means that we need to flush the
cache lines after writing to an object through the CPU, and on
transitioning from a cached state to WT.
v2: Actually do the clflush on transition to WT, nagging by Ville.
v3: Flush the CPU cache after writes into WT objects.
v4: Rease onto LLC updates and report WT as "uncached" for
get_cache_level_ioctl to remain symmetric with set_cache_level_ioctl.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Although I could not reproduce this (different compiler version,
perhaps), reportedly we get:
drivers/gpu/drm/i915/i915_irq.c:1943:27: warning: ‘score’ may be used
uninitialized in this function [-Wuninitialized]
Drop the 'score' variable altogether as it's not really needed.
Reported-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The short lowercase names are bound to collide. The default warnings
don't even warn about shadowing.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It's been there since i8xx_irq_handler() was added in
commit c2798b19ba
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Sun Apr 22 21:13:57 2012 +0100
drm/i915: i8xx interrupt handler
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Some Poulsbo cards seem to incorrectly report
SDVO_CMD_STATUS_TARGET_NOT_SPECIFIED instead of
SDVO_CMD_STATUS_PENDING, which causes the display to be turned off.
This could also happen to i915.
Signed-off-by: Guillaume Clement <gclement@baobob.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
I just noticed in our code we don't really check the assertion, and
given some of the code I am changing in this area, I feel a WARN is very
nice to have.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
[danvet: s/&/&&/ to fix typo on the check.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Now that we skip clflushes more often, return a boolean indicating
whether the clflush was actually performed, and only if it was do the
chipset flush. (Though on most of the architectures where the clflush will
be skipped, the chipset flush is a no-op!)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Here's some gma500 unifying and cleanups for drm-next. There is more stuff in
the pipe for 3.12 but I'd like to get these out of the way first.
* 'gma500-next' of git://github.com/patjak/drm-gma500: (35 commits)
drm/gma500/cdv: Add and hook up chip op for disabling sr
drm/gma500/cdv: Add and hook up chip op for watermarks
drm/gma500: Rename psb_intel_encoder to gma_encoder
drm/gma500: Rename psb_intel_connector to gma_connector
drm/gma500: Rename psb_intel_crtc to gma_crtc
drm/gma500/cdv: Convert to generic set_config()
drm/gma500/psb: Convert to generic set_config()
drm/gma500: Add generic set_config() function
drm/gma500/cdv: Convert to generic save/restore
drm/gma500/psb: Convert to generic save/restore
drm/gma500: Add generic crtc save/restore funcs
drm/gma500: Convert to generic encoder funcs
drm/gma500: Add generic encoder functions
drm/gma500/psb: Convert to generic cursor funcs
drm/gma500/cdv: Convert to generic cursor funcs
drm/gma500: Add generic cursor functions
drm/gma500/psb: Convert to generic crtc->destroy
drm/gma500/mdfld: Use identical generic crtc funcs
drm/gma500/oak: Use identical generic crtc funcs
drm/gma500/psb: Convert to gma_crtc_dpms()
...
Some Poulsbo cards seem to incorrectly report SDVO_CMD_STATUS_TARGET_NOT_SPECIFIED instead of SDVO_CMD_STATUS_PENDING, which causes the display to be turned off.
Signed-off-by: Guillaume Clement <gclement@baobob.org>
Acked-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
... not only when the dma-buf is freshly created. In contrived
examples someone else could have exported/imported the dma-buf already
and handed us the gem object with a flink name. If such on object gets
reexported as a dma_buf we won't have it in the handle cache already,
which breaks the guarantee that for dma-buf imports we always hand
back an existing handle if there is one.
This is exercised by igt/prime_self_import/with_one_bo_two_files
Now if we extend the locked sections just a notch more we can also
plug th racy buf/handle cache setup in handle_to_fd:
If evil userspace races a concurrent gem close against a prime export
operation we can end up tearing down the gem handle before the dma buf
handle cache is set up. When handle_to_fd gets around to adding the
handle to the cache there will be no one left to clean it up,
effectily leaking the bo (and the dma-buf, since the handle cache
holds a ref on the dma-buf):
Thread A Thread B
handle_to_fd:
lookup gem object from handle
creates new dma_buf
gem_close on the same handle
obj->dma_buf is set, but file priv buf
handle cache has no entry
obj->handle_count drops to 0
drm_prime_add_buf_handle sets up the handle cache
-> We have a dma-buf reference in the handle cache, but since the
handle_count of the gem object already dropped to 0 no on will clean
it up. When closing the drm device fd we'll hit the WARN_ON in
drm_prime_destroy_file_private.
The important change is to extend the critical section of the
filp->prime.lock to cover the gem handle lookup. This serializes with
a concurrent gem handle close.
This leak is exercised by igt/prime_self_import/export-vs-gem_close-race
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
... and move it to the top of the function to avoid a forward
declaration.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
with the reworking semantics and locking of the obj->dma_buf pointer
this pointer is always set as long as there's still a gem handle
around and a dma_buf associated with this gem object.
Also, the per file-priv lookup-cache for dma-buf importing is also
unified between foreign and native objects.
Hence we don't need to special case the clean any more and can simply
drop the clause which only runs for foreing objects, i.e. with
obj->import_attach set.
Note that with this change (actually with the previous one to always
set up obj->dma_buf even for foreign objects) it is no longer required
to set obj->import_attach when importing a foreing object. So update
comments accordingly, too.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The export dma-buf cache is semantically similar to an flink name. So
semantically it makes sense to treat it the same and remove the name
(i.e. the dma_buf pointer) and its references when the last gem handle
disappears.
Again we need to be careful, but double so: Not just could someone
race and export with a gem close ioctl (so we need to recheck
obj->handle_count again when assigning the new name), but multiple
exports can also race against each another. This is prevented by
holding the dev->object_name_lock across the entire section which
touches obj->dma_buf.
With the new scheme we also need to reinstate the obj->dma_buf link at
import time (in case the only reference userspace has held in-between
was through the dma-buf fd and not through any native gem handle). For
simplicity we don't check whether it's a native object but
unconditionally set up that link - with the new scheme of removing the
obj->dma_buf reference when the last handle disappears we can do that.
To make it clear that this is not just for exported buffers anymore
als rename it from export_dma_buf to dma_buf.
To make sure that now one can race a fd_to_handle or handle_to_fd with
gem_close we use the same tricks as in flink of extending the
dev->object_name_locking critical section. With this change we finally
have a guaranteed 1:1 relationship (at least for native objects)
between gem objects and dma-bufs, even accounting for races (which can
happen since the dma-buf itself holds a reference while in-flight).
This prevent igt/prime_self_import/export-vs-gem_close-race from
Oopsing the kernel. There is still a leak though since the per-file
priv dma-buf/handle cache handling is racy. That will be fixed in a
later patch.
v2: Remove the bogus dma_buf_put from the export_and_register_object
failure path if we've raced with the handle count dropping to 0.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>