Keep run queue, entity and scheduler handling together.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Cleanup function name, stop checking scheduler ready twice, but
check if kernel thread should stop instead.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
None of them are used any more.
v2: fix type in error message
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian K?nig <christian.koenig@amd.com>
Remove active_hw_rq and it's protecting queue_lock, they are unused.
User 32bit atomic for hw_rq_count, 64bits for counting to three is a bit
overkill.
Cleanup the function name and remove incorrect comments.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Simply not used any more. Only keep 32bit atomic for fence sequence numbering.
v2: trivial rebase
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> (v1)
Reviewed-by: Chunming Zhou <david1.zhou@amd.com> (v1)
Rename the function and update the related code with this modified function.
Add the new parameter of bool wait_all.
If wait_all is true, it will return when all fences are signaled or timeout.
If wait_all is false, it will return when any fence is signaled or timeout.
Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Use pci_alloc_consistent rather than kzalloc since we
need 256 byte aligned memory for the ring buffer.
v2: fix copy paste typo in free function noticed
by Jammy.
bug:
https://bugs.freedesktop.org/show_bug.cgi?id=91749
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
This reverts commit 992cbf19b3.
Until we make fbdev layer atomic we can't call this.
Requested-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com?
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJV2pUkAAoJEHm+PkMAQRiGCIoH/Rb29ZjdCoZJp9OtnjAG+qRc
bG3YuomIdib86x7xHRKKaLWBa7din7IYjuwT/X4S4duO5a1R5Lp1sRG3IlGfhT0W
nBNbjFl4q4bOyiTPtTRTYyh4g5UQv4IuyCnCmZyCTJyVi/O6HVM9TWKUzm68P2dJ
30LwLUcQJ+mHueGJwFBAXe2BaojEpvYCdSX6tvbrQ/8X3FrVExZXuJl4uMYNFYNK
ZwG/v5t7tYOiAe76JGbrEuVFPZWLPEW7amHOWR0T4Ye4nWTlBgx7fENiNRlfgcvI
CM16l/xkyrZQ3Q5jZy1qYDfdHYF++dyEDysX4w1ae/X0aaLZn7l+u5VQD6WpkQQ=
=IF6I
-----END PGP SIGNATURE-----
Merge tag 'v4.2-rc8' into drm-next
Linux 4.2-rc8
Backmerge required for Intel so they can fix their -next tree up properly.
- Added PLL algorithm for a new rev of G200e
- Removed the bandwidth limitation for the new G200e
Signed-off-by: Mathieu Larouche <mathieu.larouche@matrox.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
- Added support for the new deviceID for G200eW3
- Added PLL algorithm for the G200eW3
- Added some initialization code for G200eW3
Signed-off-by: Mathieu Larouche <mathieu.larouche@matrox.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This is a port of:
DRM - radeon: Don't link train DisplayPort on HPD until we get the dpcd
to amdgpu.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Most of the time this isn't an issue since hotplugging an adaptor will
trigger a crtc mode change which in turn, causes the driver to probe
every DisplayPort for a dpcd. However, in cases where hotplugging
doesn't cause a mode change (specifically when one unplugs a monitor
from a DisplayPort connector, then plugs that same monitor back in
seconds later on the same port without any other monitors connected), we
never probe for the dpcd before starting the initial link training. What
happens from there looks like this:
- GPU has only one monitor connected. It's connected via
DisplayPort, and does not go through an adaptor of any sort.
- User unplugs DisplayPort connector from GPU.
- Change in HPD is detected by the driver, we probe every
DisplayPort for a possible connection.
- Probe the port the user originally had the monitor connected
on for it's dpcd. This fails, and we clear the first (and only
the first) byte of the dpcd to indicate we no longer have a
dpcd for this port.
- User plugs the previously disconnected monitor back into the
same DisplayPort.
- radeon_connector_hotplug() is called before everyone else,
and tries to handle the link training. Since only the first
byte of the dpcd is zeroed, the driver is able to complete
link training but does so against the wrong dpcd, causing it
to initialize the link with the wrong settings.
- Display stays blank (usually), dpcd is probed after the
initial link training, and the driver prints no obvious
messages to the log.
In theory, since only one byte of the dpcd is chopped off (specifically,
the byte that contains the revision information for DisplayPort), it's
not entirely impossible that this bug may not show on certain monitors.
For instance, the only reason this bug was visible on my ASUS PB238
monitor was due to the fact that this monitor using the enhanced framing
symbol sequence, the flag for which is ignored if the radeon driver
thinks that the DisplayPort version is below 1.1.
Signed-off-by: Stephen Chandler Paul <cpaul@redhat.com>
Reviewed-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
No need to try to call ttm_bo_device_release twice during module unload.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
When a user-space process writes directly to the fbdev framebuffer,
we hit a circular locking dependency. Fix this by introducing a local
delayed work callback so that the defio lock can be released before
calling into the modesetting code for a dirty update.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
It appears some MST docks are worse than other, but the only
way to know is to see the sw revisions in here, so dump
the branch OUI so we can look at the sw revision.
v2: Thierry made me feel guilty, so I parsed the branch
OUI.
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Revert of a VBT parsing commit that should've been queued for drm-next,
not v4.2. The revert unbreaks Braswell among other things.
Also on Braswell removal of DP HBR2/TP3 and intermediate eDP frequency
support. The code was optimistically added based on incorrect
documentation; the platform does not support them. These are cc: stable.
Finally a gpu state fix from Chris, also cc: stable.
* tag 'drm-intel-fixes-2015-08-20' of git://anongit.freedesktop.org/drm-intel:
drm/i915: Avoid TP3 on CHV
drm/i915: remove HBR2 from chv supported list
Revert "drm/i915: Add eDP intermediate frequencies for CHV"
Revert "drm/i915: Allow parsing of variable size child device entries from VBT"
drm/i915: Flag the execlists context object as dirty after every use
Stop double freeing the the BO list by pulling the content
of amdgpu_cs_parser_prepare_job() into the IOCTL function again.
v2: better commit message
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com> (v1)
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
The problem now is that we don't necessarily call amdgpu_ib_get()
in some error paths and so work with uninitialized data.
Better require that the memory is already zeroed.
v2: better commit message
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com> (v1)
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
More appropriate and fixes some nasty lockdep warnings.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Used by mesa, etc. for profiling.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch add support for Two Dimensional Animation and Compositing
Engine (2D-ACE) on the Freescale SoCs.
2D-ACE is a Freescale display controller. 2D-ACE describes
the functionality of the module extremely well its name is a value
that cannot be used as a token in programming languages.
Instead the valid token "DCU" is used to tag the register names and
function names.
The Display Controller Unit (DCU) module is a system master that
fetches graphics stored in internal or external memory and displays
them on a TFT LCD panel. A wide range of panel sizes is supported
and the timing of the interface signals is highly configurable.
Graphics are read directly from memory and then blended in real-time,
which allows for dynamic content creation with minimal CPU
intervention.
The features:
(1) Full RGB888 output to TFT LCD panel.
(2) Blending of each pixel using up to 4 source layers
dependent
on size of panel.
(3) Each graphic layer can be placed with one pixel resolution
in either axis.
(4) Each graphic layer support RGB565 and RGB888 direct colors
without alpha channel and BGRA8888 BGRA4444 ARGB1555 direct
colors
with an alpha channel and YUV422 format.
(5) Each graphic layer support alpha blending with 8-bit
resolution.
This is a simplified version, only one primary plane, one
framebuffer, one crtc, one connector and one encoder for TFT
LCD panel.
Signed-off-by: Alison Wang <b18965@freescale.com>
Signed-off-by: Xiubo Li <lixiubo@cmss.chinamobile.com>
Signed-off-by: Jianwei Wang <jianwei.wang.chn@gmail.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The following PR add support for 3 more atmel SoCs and for some missing
features (new input formats and PRIME support).
* 'drm-atmel-hlcdc-devel' of https://github.com/bbrezillon/linux-at91:
drm: atmel-hlcdc: add support for sama5d4 SoCs
drm: atmel-hlcdc: add support for at91sam9n12 SoC
drm: atmel-hlcdc: add support for at91sam9x5 SoCs
drm: atmel-hlcdc: add RGB565 and RGB444 output support
drm: atmel-hlcdc: add the missing DRM_ATOMIC flag
drm: atmel-hlcdc: add PRIME support
This patch removes TP3 support on CHV since there is no support
for HBR2 on this platform.
v2: rename the function to indicate it checks source rates (Jani)
v3: update comment to indicate TP3 dependency on HBR2 supported
hardware (Jani)
Cc: stable@vger.kernel.org # v4.1+
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
[Jani: fixed a couple of checkpatch warnings.]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
This patch removes 5.4Gbps from supported link rate for CHV since
it is not supported in it.
v2: change the ordering for better readability (Ville)
Cc: stable@vger.kernel.org # v4.1+
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
This reverts
commit fe51bfb95c.
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date: Thu Mar 12 17:10:38 2015 +0200
CHV does not support intermediate frequencies so reverting the
patch that added it in the first place
Cc: stable@vger.kernel.org # v4.1+
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
This reverts
commit 047fe6e6db
Author: David Weinehall <david.weinehall@linux.intel.com>
Date: Tue Aug 4 16:55:52 2015 +0300
drm/i915: Allow parsing of variable size child device entries from VBT
That commit is not valid for v4.2, however it will be valid for v4.3. It
was simply queued too early.
The referenced regressing commit is just fine until the size of struct
common_child_dev_config changes, and that won't happen until
v4.3. Indeed, the expected size checks here rely on the increased size
of the struct, breaking new platforms.
Fixes: 047fe6e6db ("drm/i915: Allow parsing of variable size child device entries from VBT")
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Weinehall <david.weinehall@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
I2CM_ADDRESS became a MESS, fix it, also change guarding define
to __DW_HDMI_H__ , since the driver is not IMX specific.
Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The dw_hdmi enable/disable handling is particularly weak in several
regards:
* The hotplug interrupt could call hdmi_poweron() or hdmi_poweroff()
while DRM is setting a mode, which could race with a mode being set.
* Hotplug will always re-enable the phy whenever it detects an active
hotplug signal, even if DRM has disabled the output.
Resolve all of these by introducing a mutex to prevent races, and a
state-tracking bool so we know whether DRM wishes the output to be
enabled. We choose to use our own mutex rather than ->struct_mutex
so that we can still process interrupts in a timely fashion.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
dw_hdmi_phy_enable_power() is not about enabling and disabling power.
It is about allowing or preventing power-down mode being entered - the
register is documented as "Power-down enable (active low 0b)."
This can be seen as the bit has no effect when the HDMI phy is
operational on iMX6 hardware.
Rename the function to dw_hdmi_phy_enable_powerdown() to reflect the
documentation, make it take a bool for the 'enable' argument, and invert
the value to be written.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
On a mode set, DRM makes the following sequence of calls:
* for_each_encoder
* bridge mode_fixup
* encoder mode_fixup
* crtc mode_fixup
* for_each_encoder
* bridge disable
* encoder prepare
* bridge post_disable
* disable unused encoders
* crtc prepare
* crtc mode_set
* for_each_encoder
* encoder mode_set
* bridge mode_set
* crtc commit
* for_each_encoder
* bridge pre_enable
* encoder commit
* bridge enable
dw_hdmi enables the HDMI output in both the bridge mode_set() and also
the bridge enable() step. This is duplicated work - we can avoid the
setup in mode_set() and just do it in the enable() stage. This
simplifies the code a little.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Only enable audio support if the sink supports audio in some form, as
defined via its EDID. We discover this capability using the generic
drm_detect_monitor_audio() function.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The FSL kernel detects the HDMI vendor id, and uses this to set
hdmi->edid_cfg.hdmi_cap, which is then used to set mdvi appropriately,
rather than detecting whether we are outputting a CEA mode. Update
the dw_hdmi code to use this logic, but lets eliminate the mdvi
variable, prefering the more verbose "hdmi->sink_is_hdmi" instead.
Use the generic drm_detect_hdmi_monitor() to detect a HDMI sink.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
As mentioned in the previous commit, the dw-hdmi driver does not support
pixel doubled modes at present; it does not configure the PLL correctly
for these modes. Therefore, filter out the double-clocked modes as we
presently are unable to support them.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
dw_hdmi sets a pixel repetition factor of 1 for VICs 10-15, 25-30 and
35-38. However, DRM uses their native resolutions in its timing
information. For example, VIC 14 can be 1440x480 with no repetition,
or 720x480 with one pixel repetition. As DRM uses 1440 pixels per line
for this video mode, we need no pixel repetition.
In any case, pixel repetition appears broken in dw_hdmi.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
iMX6 devices suffer from an errata (ERR005174) where the audio FIFO can
be emptied while it is partially full, resulting in misalignment of the
audio samples.
To prevent this, the errata workaround recommends writing N as zero
until the audio FIFO has been loaded by DMA. Writing N=0 prevents the
HDMI bridge from reading from the audio FIFO, effectively disabling
audio.
This means we need to provide the audio driver with a pair of functions
to enable/disable audio. These are dw_hdmi_audio_enable() and
dw_hdmi_audio_disable().
A spinlock is introduced to ensure that setting the CTS/N values can't
race, ensuring that the audio driver calling the enable/disable
functions (which are called in an atomic context) can't race with a
modeset.
Tested-by: Yakir Yang <ykk@rock-chips.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Introduce dw_hdmi_set_sample_rate(), which allows us to configure the
audio sample rate, setting the CTS/N values appropriately.
Tested-by: Yakir Yang <ykk@rock-chips.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Remove the struct hdmi_vmode mhsyncpolarity/mvsyncpolarity/minterlaced
members, which are only used within a single function. We can directly
reference the appropriate mode->flags instead.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
When a YCBCR format is selected, we can merely copy the colorimetry
information directly as we use the same definitions for both the
unpacked AVI info frame and the hdmi_data_info structure.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Clean up hdmi_set_clk_regenerator() by allowing it to take the audio
sample rate and ratio directly, rather than hiding it inside the
function. Raise the unsupported pixel clock/sample rate message from
debug to error level as this results in audio not working correctly.
Tested-by: Yakir Yang <ykk@rock-chips.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The phy configuration is dependent on the SoC, and we look up values for
some of the registers in SoC specific data. However, we had partially
programmed the phy before we had successfully looked up the clock rate.
Also, we were only checking that we had a valid configuration for the
currctrl register.
Move all these lookups to the start of this function instead, so we can
check that all lookups were successful before beginning to program the
phy.
Tested-by: Yakir Yang <ykk@rock-chips.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The phy comments in dw_hdmi.c applied to the iMX6 version. Move these
comments to the iMX6 dw_hdmi-imx data along side the data.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Describe capabilities of the HLCDC IP found on sama5d4 SoCs and add a
new entry to the atmel_hlcdc_of_match table.
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Describe capabilities of the HLCDC IP found on at91sam9n12 SoC and add a
new entry to the atmel_hlcdc_of_match table.
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Describe capabilities of the HLCDC IP found on at91sam9x5 SoCs and add a
new entry to the atmel_hlcdc_of_match table.
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
The atmel-hlcdc driver already supports atomic operations, add the
missing DRM_ATOMIC flag to expose the atomic features to userspace.
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
If PM is enabled but PM_SLEEP is disabled, the suspend/resume functions
are still unused and produce a compiler warning.
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Cc: <stable@vger.kernel.org> # 4.1+
Pagetables can be moved and therefore the page directory update can be necessary
for the current cs even if none of the the bo's are moved. In that scenario
there is no fence between the sdma0 and gfx ring, so we add one.
v2 (chk): rebased
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Explicitly select BACKLIGHT_LCD_SUPPORT to satisfy the direct dependency
of BACKLIGHT_CLASS_DEVICE.
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Explicitly select BACKLIGHT_LCD_SUPPORT to satisfy the direct dependency
of BACKLIGHT_CLASS_DEVICE.
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rework run queue implementation, especially remove the odd list handling.
v2: cleanup the code only, no algorithem change.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
fix the bug that there is duplicated bo_update_mapping issued
Signed-off-by: monk.liu <monk.liu@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
bo_list_clone() will take a lot of time when bo_list hold too much
elements, like above 7000
Signed-off-by: Monk.Liu <monk.liu@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Jammy Zhou <jammy.zhou@amd.com>
It's not validated yet and causes more harm than good.
Avoids spurious resets.
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
remaining timeout returned by amdgpu_fence_wait_any can be larger than
max int value, thus the truncated 32 bit value in r ends up being
negative while its original long value is positive.
Signed-off-by: monk.liu <monk.liu@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Jammy Zhou <jammy.zhou@amd.com>
fix fence is released when pass to **fence sometimes.
add reference for it.
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian K?nig <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-and-Reviewed-by: Leo Liu <leo.liu@amd.com>
Not used any more.
v2: remove amd_sched_emit as well.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Similar to radeon, except that amdgpu doesn't even use struct_mutex to
protect anything like the shared z buffer (sane gpu architecture,
yay!). And the code already grabs the globa adev->ring_lock, so this
code can't race with itself. Which makes struct_mutex completely
redundnant. Remove it.
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It really doesn't protect anything which doesn't have other locks
already. Also this is run from driver unload code so not much need for
locks anyway.
Same changes as for radeon really.
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We already grab 2 device-global locks (write-sema rdev->pm.mclk_lock
and rdev->ring_lock), adding another global mutex won't serialize this
code more. And since there's really nothing interesting that gets
protected in radeon by dev->struct mutex (we only have the global z
buffer owners and it's still serializing gem bo destruction in the drm
core - which is irrelevant since radeon uses ttm anyway internally)
this doesn't add protection. Remove it.
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It really doesn't protect anything which doesn't have other locks
already. Also this is run from driver unload code so not much need for
locks anyway.
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The workaround simply doesn't work because VM mappings
are controlled by userspace not the kernel.
Additional to that this is just a performance problem
which happens if you have holes in your VM mapping.
v2: adjust virtual addr alignment as well.
v3: fix trivial warning
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com> (v1)
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com> (v2)
Looks like that somehow got missed while during porting the radeon changes.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
It was just a wrapper for fence_wait anyway.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
The common kernel function does the same thing.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
scheduler fence is based on kernel fence framework.
v2: squash in Christian's build fix
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian K?nig <christian.koenig@amd.com>
v2: rebased
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Avoiding a couple of casts.
v2: rename c_entity to entity as well
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Cleanup the kernel context handling.
v2: rebased
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com> (v1)
Id's are for the IOCTL ABI only.
v2: remove tgid as well
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
We didn't initialized the mutex in the cloned bo list resulting in nice
warnings from lockdep. Also fixes error handling in this function.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
This way can avoid interrupt lost, and can process sched job exactly.
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
This function is used to get the next queued sequence number
Signed-off-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
This function is to update last_emitted_v_seq and wake up the waiters.
It should be called by driver in the run_job backend function
Signed-off-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
It is clean to update last_queued_v_seq in the scheduler module
Signed-off-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Fix the code alignment, etc.
v2: rebase the code
Signed-off-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
every sbumission should be able to get a fence.
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian K?nig <christian.koenig@amd.com>
Reviewed-by: Jammy Zhou <jammy.zhou@amd.com>
It is theoretically possible that a swapped out BO gets the
same GTT address, but different backing pages while being swapped in.
Instead just use another VA state to note updated areas.
Ported from not upstream yet radeon commit with the same name.
v2: fix some bugs in the original implementation found in the radeon code.
v3: squash in VCE/UVD fix
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
thus unnecessary wake up could be avoid between rings
v2:
move wait_queue_head to fence_drv from ring
Signed-off-by: monk.liu <monk.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
origninal method will sleep/schedule at the granurarity of HZ/2 and
based on seq signal method, the new implement is based on kernel fance
interface, no unnecessary schedule at all
v2: replace logic of original amdgpu_fence_wait_any
Signed-off-by: monk.liu <monk.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
if enabling scheduler, then the queued seq is assigned
when pushing job before emitting job.
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian K?nig <christian.koenig@amd.com>
the job must be emitted by scheduler, otherwise scheduler is abnormal.
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian K?nig <christian.koenig@amd.com>
This option can be used to specify the max number of submissions in the
active HW queue. The default value is 2 now.
Signed-off-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
This option can be used to specify the max job number in the job queue,
and it is 16 by default.
Signed-off-by: Jammy Zhou <Jammy.Zhou@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
it is possible that the callback isn't defined sometimes.
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian K?nig <christian.koenig@amd.com>
fence_process may be called from kthread, user thread and interrupt context.
it is possible to called concurrently, then will wake up fence queue multiple times.
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
use kernel context to submit command for vm
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Acked-by: Christian K?nig <christian.koenig@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
user mode will still use pte ring as a normal ring.
if the prepare job generates another command(update pte) on its ring in scheduler,
then will kill scheduler which is going to waiting later job but pending running job.
Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Acked-by: Christian K?nig <christian.koenig@amd.com>
Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>