Not only do we need to release the vm.ref we acquired for the vma on the
duplicate insert branch, but also for the normal error paths, so roll
them all into one.
Reported-by: Andi Shyti <andi.shyti@intel.com>
Suggested-by: Andi Shyti <andi.shyti@intel.com>
Fixes: 2850748ef8 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200702211015.29604-1-chris@chris-wilson.co.uk
Since we no longer always take struct_mutex around everything, and want
the freedom to create GEM objects, actually taking struct_mutex inside
the lock creation ends up pulling the mutex inside other looks. Since we
don't use generally use struct_mutex, we can relax the tainting.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200702083225.20044-3-chris@chris-wilson.co.uk
Avoid waking up the device and taking stale locks if we know that the
object is not currently mmapped. This is particularly useful as not many
object are actually mmapped and so we can destroy them without waking
the device up, and gives us a little more freedom of workqueue ordering
during shutdown.
v2: Pull the release_mmap() into its single user in freeing the objects,
where there can not be any race with a concurrent user of the freed
object. Or so one hopes!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
Link: https://patchwork.freedesktop.org/patch/msgid/20200702163623.6402-2-chris@chris-wilson.co.uk
Only a GGTT mmapping will use the aperture detiling registers, so on a
tiling change for an object, we only need to revoke those mmappings and
not the CPU mmappings (which are always linear irrespective of the tiling).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200702163623.6402-1-chris@chris-wilson.co.uk
We have a mix of dport, intel_dport, intel_dig_port and dig_port to
reference a intel_digital_port struct. Numbers are around
5 intel_dport
36 dport
479 intel_dig_port
352 dig_port
Since we already removed the intel_ prefix from most of our other
structs, do the same here and prefer dig_port.
v2: rename everything in i915, not just a few display sources and
reword commit message (from Matt Roper)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200701045054.23357-1-lucas.demarchi@intel.com
As we allow for parallel threads to create the same vma instance
concurrently, and we only filter out the duplicates upon reacquiring the
spinlock for the rbtree, we have to free the loser of the constructors'
race. When freeing, we should also drop any resource references acquired
for the redundant vma.
Fixes: 2850748ef8 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200702083225.20044-1-chris@chris-wilson.co.uk
If the driver gets stuck holding the kernel timeline, we cannot issue a
heartbeat and so fail to discover that the driver is indeed stuck and do
not issue a GPU reset (which would hopefully unstick the driver!).
Switch to using a trylock so that we can query if the heartbeat's
timeline mutex is locked elsewhere, and then use the timer to probe if it
remains stuck at the same spot for consecutive heartbeats, indicating
that the mutex has not been released and the engine has not progressed.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200702095219.963-1-chris@chris-wilson.co.uk
The old epoch counter was left uninited, so the function returned a
changed state always.
While at it debug print the old epoch counter as well.
Fixes: 35205ee9ba ("drm/i915: Send hotplug event if edid had changed")
Cc: Kunal Joshi <kunal1.joshi@intel.com>
Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200701180001.15857-1-imre.deak@intel.com
intel_dp_set_source_rates() calls intel_dp_is_edp(), which is unsafe to
use before encoder_type is set. This caused GEN11+ to incorrectly strip
HBR3 from source rates for edp. Move intel_dp_set_source_rates() to
after encoder_type is set. Add comment to intel_dp_is_edp() describing
unsafe usages.
v2: Alter intel_dp_set_source_rates final position (Ville/Manasi).
Remove outdated comment (Ville).
Slight optimization of control flow in intel_dp_init_connector.
Slight rewording in commit message.
Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200630233310.10191-1-matthew.s.atwood@intel.com
'level' here means the highest level we can't use, so when checking
the fbc watermarks we need a -1 to get at the last enabled level.
While at if refactor the code a bit to declutter
g4x_compute_pipe_wm().
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429101034.8208-12-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
To simplify things, call the combo PHY/TBT PLL calculation functions
directly from the corresponding combo/TypeC PLL get functions, instead of
calling the same calculation functions after having to recheck if the
given PHY is combo or TypeC.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200629185848.20550-2-imre.deak@intel.com
When the reference clock is 38.4MHz, using the current TBT PLL
fractional divider value results in a slightly off TBT link frequency.
This causes an endless loop of link training success followed by a bad
link signaling and retraining at least on a Dell WD19TB TBT dock. The
workaround provided by the HW team is to divide the fractional divider
value by two. This fixed the link training problem on the ThinkPad dock.
The same workaround is needed on some EHL platforms and for combo PHY
PLLs, these will be addressed in a follow-up.
Bspec: 49204
References: HSDES#22010772725
References: HSDES#14011861142
Reported-and-tested-by: Khaled Almahallawy <khaled.almahallawy@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Khaled Almahallawy <khaled.almahallawy@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200629185848.20550-1-imre.deak@intel.com
The obj->lut_list is traversed when the object is closed as the file
table is destroyed during process termination. As this occurs before we
kill any outstanding context if, due to some bug or another, the closure
is blocked, then we fail to shootdown any inflight operations
potentially leaving the GPU spinning forever. As we only need to guard
the list against concurrent closures and insertions, the hold is short
and merits being treated as a simple spinlock.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200701084439.17025-1-chris@chris-wilson.co.uk
This registers will be used to implement PSR2 manual tracking/selective
fetch.
v2:
- Fixed typo in _PLANE_SEL_FETCH_BASE
- Renamed PSR2_MAN_TRK_CTL bits to better match spec names
- Renamed _PLANE_SEL_FETCH_* to better match spec names
BSpec: 55229
BSpec: 50424
BSpec: 50420
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200626010151.221388-3-jose.souza@intel.com
Future patches will bring PSR2 selective fetch configuration
validation but most of the configuration checks will be used for HW
tracking and selective fetch so the reoder was necessary.
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200626010151.221388-2-jose.souza@intel.com
Rearrange the allocation of the mm_struct registration to avoid
allocating underneath the i915->mm_lock, so that we avoid tainting the
lock (and in turn many other locks that may be held as i915->mm_lock is
taken, and those locks we may want on the free [shrinker] paths). In
doing so, we convert the lookup to be RCU protected by courtesy of
converting the free-worker to be an rcu_work.
v2: Remember to use hash_rcu variants to protect the list iteration from
concurrent add/del.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200619194038.5088-1-chris@chris-wilson.co.uk
Often we seem to detect an underrun right after modeset on gen2.
It seems to be a spurious detection (potentially the pipe is still
in a wonky state when we enable the planes). An extra vblank wait
seems to cure it.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429101034.8208-13-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
The default fbc1 compression interval we use is 500 frames. That
translates to over 8 seconds typically. That's rather excessive
so let's drop it to 1 second.
The hardware will not attempt recompression unless at least one
line has been modified, so a shorter compression interval should
not cause extra bandwidth use in the purely idle scenario. Of
course in the mostly idle case we are possibly going to recompress
a bit more.
Should really try to find some kind of sweet spot to minimize
the energy usage...
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429101034.8208-11-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
The hardware host tracking won't nuke the entire cfb (unless the
entire fb is written through the gtt) so don't clear the busy_bits
for gtt tracking.
Not that it really matters anymore since we've lost ORIGIN_GTT usage
everywhere.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429101034.8208-7-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
The current fence_y_offset calculation is broken. I think it more or
less used to do the right thing, but then I changed the plane code
to put the final x/y source offsets back into the src rectangle so
now it's just subtraacting the same value from itself. The code would
never have worked if we allowed the framebuffer to have a non-zero
offset.
Let's do this in a better way by just calculating the fence_y_offset
from the final plane surface offset. Note that we don't align the
plane surface address to fence rows so with horizontal panning there's
often a horizontal offset from the fence start to the surface address
as well. We have no way to tell the hardware about that so we just
ignore it. Based on some quick tests the invlidation still happens
correctly. I presume due to the invalidation nuking at least the full
line (or a segment of multiple lines).
Fixes: 54d4d719fa ("drm/i915: Overcome display engine stride limits via GTT remapping")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429101034.8208-4-ville.syrjala@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Added epoch counter checking to intel_encoder_hotplug
in order to be able process all the connector changes,
besides connection status. Also now any change in connector
would result in epoch counter change, so no multiple checks
are needed.
v2: Renamed change counter to epoch counter. Fixed type name.
v3: Fixed rebase conflict
v4: Remove duplicate drm_edid_equal checks from hdmi and dp,
lets use only once edid property is getting updated and
increment epoch counter from there.
Also lets now call drm_connector_update_edid_property
right after we get edid always to make sure there is a
unified way to handle edid change, without having to
change tons of source code as currently
drm_connector_update_edid_property is called only in
certain cases like reprobing and not right after edid is
actually updated.
v5: Fixed const modifiers, removed blank line
v6: Removed drm specific part from this patch, leaving only
i915 specific changes here.
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200630002700.5451-4-kunal1.joshi@intel.com
Acked-by: Jani Nikula <jani.nikula@intel.com>
Currently there is no null check for a failed memory allocation
on the dsb object and without this a null pointer dereference
error can occur. Fix this by adding a null check.
Note: added a drm_err message in keeping with the error message style
in the function.
Addresses-Coverity: ("Dereference null return")
Fixes: afeda4f3b1 ("drm/i915/dsb: Pre allocate and late cleanup of cmd buffer")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200616114221.73971-1-colin.king@canonical.com
Remove the use of dev->archdata.iommu and use the private per-device
pointer provided by IOMMU core code instead.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20200625130836.1916-3-joro@8bytes.org
A single Ri mismatch doesn't automatically mean that the link integrity
is broken. Update and check of Ri and Ri' are done asynchronously. In
case an update happens just between the read of Ri' and the check against
Ri there will be a mismatch even if the link integrity is fine otherwise.
Signed-off-by: Oliver Barta <oliver.barta@aptiv.com>
Reviewed-by: Sean Paul <sean@poorly.run>
Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200504123524.7731-1-oliver.barta@aptiv.com
The linetime watermark is a 9 bit value, which gives us
a maximum linetime of just below 64 usec. If the linetime
exceeds that value we currently just discard the high bits
and program the rest into the register, which angers the
state checker.
To avoid that let's just clamp the value to the max. I believe
it should be perfectly fine to program a smaller linetime wm
than strictly required, just means the hardware may fetch data
sooner than strictly needed. We are further reassured by the
fact that with DRRS the spec tells us to program the smaller
of the two linetimes corresponding to the two refresh rates.
Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200625200003.12436-1-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Alexandre Oliva has recently removed these files from Linux Libre
with concerns that the sources weren't available.
The sources are available on IGT repository, and only open source
tools are used to generate the {ivb,hsw}_clear_kernel.c files.
However, the remaining concern from Alexandre Oliva was around
GPL license and the source not been present when distributing
the code.
So, it looks like 2 alternatives are possible, the use of
linux-firmware.git repository to store the blob or making sure
that the source is also present in our tree. Since the goal
is to limit the i915 firmware to only the micro-controller blobs
let's make sure that we do include the asm sources here in our tree.
Btw, I tried to have some diligence here and make sure that the
asms that these commits are adding are truly the source for
the mentioned files:
igt$ ./scripts/generate_clear_kernel.sh -g ivb \
-m ~/mesa/build/src/intel/tools/i965_asm
Output file not specified - using default file "ivb-cb_assembled"
Generating gen7 CB Kernel assembled file "ivb_clear_kernel.c"
for i915 driver...
igt$ diff ~/i915/drm-tip/drivers/gpu/drm/i915/gt/ivb_clear_kernel.c \
ivb_clear_kernel.c
< * Generated by: IGT Gpu Tools on Fri 21 Feb 2020 05:29:32 AM UTC
> * Generated by: IGT Gpu Tools on Mon 08 Jun 2020 10:00:54 AM PDT
61c61
< };
> };
\ No newline at end of file
igt$ ./scripts/generate_clear_kernel.sh -g hsw \
-m ~/mesa/build/src/intel/tools/i965_asm
Output file not specified - using default file "hsw-cb_assembled"
Generating gen7.5 CB Kernel assembled file "hsw_clear_kernel.c"
for i915 driver...
igt$ diff ~/i915/drm-tip/drivers/gpu/drm/i915/gt/hsw_clear_kernel.c \
hsw_clear_kernel.c
5c5
< * Generated by: IGT Gpu Tools on Fri 21 Feb 2020 05:30:13 AM UTC
> * Generated by: IGT Gpu Tools on Mon 08 Jun 2020 10:01:42 AM PDT
61c61
< };
> };
\ No newline at end of file
Used IGT and Mesa master repositories from Fri Jun 5 2020)
IGT: 53e8c878a6fb ("tests/kms_chamelium: Force reprobe after replugging
the connector")
Mesa: 5d13c7477eb1 ("radv: set keep_statistic_info with
RADV_DEBUG=shaderstats")
Mesa built with: meson build -D platforms=drm,x11 -D dri-drivers=i965 \
-D gallium-drivers=iris -D prefix=/usr \
-D libdir=/usr/lib64/ -Dtools=intel \
-Dkulkan-drivers=intel && ninja -C build
v2: Header clean-up and include build instructions in a readme (Chris)
Modified commit message to respect check-patch
Reference: http://www.fsfla.org/pipermail/linux-libre/2020-June/003374.html
Reference: http://www.fsfla.org/pipermail/linux-libre/2020-June/003375.html
Fixes: 47f8253d2b ("drm/i915/gen7: Clear all EU/L3 residual contexts")
Cc: <stable@vger.kernel.org> # v5.7+
Cc: Alexandre Oliva <lxoliva@fsfla.org>
Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200610201807.191440-1-rodrigo.vivi@intel.com
(cherry picked from commit 5a7eeb8ba1)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
The DP spec says:
"The transmitter shall support at least three levels of voltage
swing (Levels 0, 1, and 2).
If only three levels of voltage swing are supported (VOLTAGE
SWING SET field (bits 1:0) are programmed to 10 (Level 2)),
this bit shall be set to 1, and cleared in all other cases.
If all four levels of voltage swing are supported (VOLTAGE
SWING SET field (bits 1:0) are programmed to 11 (Level 3)),
this bit shall be set to 1,and cleared in all other cases."
Let's follow that exactly instead of the current apporach
where we can set those also for vswing/preemph levels 0 or 1
(or 2 when the platform max is 3).
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200512174145.3186-7-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
Catch up with upstream, in particular to get c1e8d7c6a7 ("mmap locking
API: convert mmap_sem comments").
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
The spec requires enabling the MST Virtual Channel payload allocation
- in a separate step - after the transcoder is enabled, follow this.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200623082411.3889-1-imre.deak@intel.com
UAPI Changes:
- Add DRM_MODE_TYPE_USERDEF for video modes specified in cmdline.
Cross-subsystem Changes:
- Assorted devicetree binding updates.
- Add might_sleep() to dma_fence_wait().
- Fix fbdev's get_user_pages_fast() handling, and use pin_user_pages.
- Small cleanup with IS_BUILTIN in video/fbdev drivers.
- Fix video/hdmi coding style for infoframe size.
Core Changes:
- Silence vblank output during init.
- Fix DP-MST corruption during send msg timeout.
- Clear leak in drm_gem_objecs_lookup().
- Make newlines work with force connector attribute.
- Fix module refcounting error in drm_encoder_slave, and use new i2c api.
- Header fix for drm_managed.c
- More struct_mutex removal for !legacy drivers:
- Remove gem_free_object()
- Removal of drm_gem_object_put_unlocked().
- Show current->comm alongside pid in debug printfs.
- Add drm_client_modeset_check() + drm_client_framebuffer_flush().
- Replace drm_fb_swab16 with drm_fb_swap that also supports 32-bits.
- Remove mode->vrefresh, and compactify drm_display_mode.
- Use drm_* macros for logging and warnings.
- Add WARN when drm_gem_get_pages is used on a private obj.
- Handle importing and imported dmabuf better in shmem helpers.
- Small fix for drm/mm hole size comparison, and remove invalid entry optimization.
- Add a drm/mm selftest.
- Set DSI connector type for DSI panels.
- Assorted small fixes and documentation updates.
- Fix DDI I2C device registration for MST ports, and flushing on destroy.
- Fix master_set return type, used by vmwgfx.
- Make the drm_set/drop_master ioctl symmetrical.
Driver Changes:
Allow iommu in the sun4i driver and use it for sun8i.
- Simplify backlight lookup for omap, amba-clcd and tilcdc.
- Hold reg_lock for rockchip.
- Add support for bridge gpio and lane reordering + polarity to ti-sn65dsi86, and fix clock choice.
- Small assorted fixes to tilcdc, vc4, i915, omap, fbdev/sm712fb, fbdev/pxafb, console/newport_con, msm, virtio, udl, malidp, hdlcd, bridge/ti-sn65dsi86, panfrost.
- Remove hw cursor support for mgag200, and use simple kms helper + shmem helpers.
- Add support for KOE Allow iommu in the sun4i driver and use it for sun8i.
- Simplify backlight lookup for omap, amba-clcd and tilcdc.
- Hold reg_lock for rockchip.
- Add support for bridge gpio and lane reordering + polarity to ti-sn65dsi86, and fix clock choice.
- Small assorted fixes to tilcdc, vc4 (multiple), i915.
- Remove hw cursor support for mgag200, and use simple kms helper + shmem helpers.
- Add support for KOE TX26D202VM0BWA panel.
- Use GEM CMA functions in arc, arm, atmel-hlcdc, fsi-dcu, hisilicon, imx, ingenic, komeda, malidp, mcde, meson, msxfb, rcar-du, shmobile, stm, sti, tilcdc, tve200, zte.
- Remove gem_print_info.
- Improve gem_create_object_helper so udl can use shmem helpers.
- Convert vc4 dt bindings to schemas, and add clock properties.
- Device initialization cleanups for mgag200.
- Add a workaround to fix DP-MST short pulses handling on broken hardware in i915.
- Allow build test compiling arm drivers.
- Use managed pci functions in mgag200 and ast.
- Use dev_groups in malidp.
- Add per pixel alpha support for PX30 VOP in rockchip.
- Silence deferred probe logs in panfrost.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAl7s11IACgkQ/lWMcqZw
E8OYnQ/+K4ZGpU11t4IzXCyJYis2ZPYs/FlJ2BWXH89YhOckN1e1tq7uDBzUE8qK
Hlz0gvH5C0WXR/PWqNglPXW7INwc0LtC8PSmvS4vvrZQBaJ2bvf19y7dROqJbR0E
xTUje95eq+10H9TysCRTf1osIUuZIoR0gRna22pb+nplKVBkqsQPyPT21AWq4fN0
H/LQfKNfAAHtKwvfsMsuG2U+ZTyTYo7Xi6UP413WAqDmzhewnCm5plifM29m5LhB
9BmKk0/1pL3KzZuCQvcZw4kYUjXYsgoOqD4hkMAOLsjyf6Ad5zbPB5YTxNK0C+NU
N04aHWvkRVl62A6rehgXdS5GJ3M4ORPDpIV9zQCVxMZV/886JLTGA1Wb+b3+umdk
t5M40kzgYQTDqdSwFoCDCd1tFpEjnLbE7E+eM89AyzTPOxZowrVS0No1dJA3+ST/
g7JOdDu2Zg7VAar6zByow0pMppikZro9H1mpSnk+WHbYNF3dFmW3QHKRuxoRt+Ee
l5G52LylwH3ZMPebGH9XB4cWtAUAHOsioe3CS/PKzGeUWNlUK29AqDCCBQmUdbcT
HNm5/Yygdg3rRjkDBuUI0I/pifxMYvm+28eNfNGjwq5To9ABXPNONQCEBH6rke+S
d1Z2nMmiVDf2MqhpkJppTKtHdMz13IGyZykXB7CdGnAu6k5s59c=
=ZmKI
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-2020-06-19' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v5.9:
UAPI Changes:
- Add DRM_MODE_TYPE_USERDEF for video modes specified in cmdline.
Cross-subsystem Changes:
- Assorted devicetree binding updates.
- Add might_sleep() to dma_fence_wait().
- Fix fbdev's get_user_pages_fast() handling, and use pin_user_pages.
- Small cleanup with IS_BUILTIN in video/fbdev drivers.
- Fix video/hdmi coding style for infoframe size.
Core Changes:
- Silence vblank output during init.
- Fix DP-MST corruption during send msg timeout.
- Clear leak in drm_gem_objecs_lookup().
- Make newlines work with force connector attribute.
- Fix module refcounting error in drm_encoder_slave, and use new i2c api.
- Header fix for drm_managed.c
- More struct_mutex removal for !legacy drivers:
- Remove gem_free_object()
- Removal of drm_gem_object_put_unlocked().
- Show current->comm alongside pid in debug printfs.
- Add drm_client_modeset_check() + drm_client_framebuffer_flush().
- Replace drm_fb_swab16 with drm_fb_swap that also supports 32-bits.
- Remove mode->vrefresh, and compactify drm_display_mode.
- Use drm_* macros for logging and warnings.
- Add WARN when drm_gem_get_pages is used on a private obj.
- Handle importing and imported dmabuf better in shmem helpers.
- Small fix for drm/mm hole size comparison, and remove invalid entry optimization.
- Add a drm/mm selftest.
- Set DSI connector type for DSI panels.
- Assorted small fixes and documentation updates.
- Fix DDI I2C device registration for MST ports, and flushing on destroy.
- Fix master_set return type, used by vmwgfx.
- Make the drm_set/drop_master ioctl symmetrical.
Driver Changes:
Allow iommu in the sun4i driver and use it for sun8i.
- Simplify backlight lookup for omap, amba-clcd and tilcdc.
- Hold reg_lock for rockchip.
- Add support for bridge gpio and lane reordering + polarity to ti-sn65dsi86, and fix clock choice.
- Small assorted fixes to tilcdc, vc4, i915, omap, fbdev/sm712fb, fbdev/pxafb, console/newport_con, msm, virtio, udl, malidp, hdlcd, bridge/ti-sn65dsi86, panfrost.
- Remove hw cursor support for mgag200, and use simple kms helper + shmem helpers.
- Add support for KOE Allow iommu in the sun4i driver and use it for sun8i.
- Simplify backlight lookup for omap, amba-clcd and tilcdc.
- Hold reg_lock for rockchip.
- Add support for bridge gpio and lane reordering + polarity to ti-sn65dsi86, and fix clock choice.
- Small assorted fixes to tilcdc, vc4 (multiple), i915.
- Remove hw cursor support for mgag200, and use simple kms helper + shmem helpers.
- Add support for KOE TX26D202VM0BWA panel.
- Use GEM CMA functions in arc, arm, atmel-hlcdc, fsi-dcu, hisilicon, imx, ingenic, komeda, malidp, mcde, meson, msxfb, rcar-du, shmobile, stm, sti, tilcdc, tve200, zte.
- Remove gem_print_info.
- Improve gem_create_object_helper so udl can use shmem helpers.
- Convert vc4 dt bindings to schemas, and add clock properties.
- Device initialization cleanups for mgag200.
- Add a workaround to fix DP-MST short pulses handling on broken hardware in i915.
- Allow build test compiling arm drivers.
- Use managed pci functions in mgag200 and ast.
- Use dev_groups in malidp.
- Add per pixel alpha support for PX30 VOP in rockchip.
- Silence deferred probe logs in panfrost.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/001cd9a6-405d-4e29-43d8-354f53ae4e8b@linux.intel.com
During encoder enabling we clear the flag before starting the ACT
sequence and wait for the flag, but the clearing is missing during
encoder disabling, add it there too. Since nothing cleared the flag
automatically we could've run subsequent disabling steps too early.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200616141855.746-5-imre.deak@intel.com
It's not clear if the DP_TP_STATUS flags other than the ACT sent flag
have some side-effect, so don't clear those; we don't depend on the
state of these flags anyway.
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200616141855.746-4-imre.deak@intel.com
During transcoder enabling we'll configure the transcoder in MST mode
and enable the VC payload allocation, which will start the ACT sequence.
Before waiting for the ACT sequence completion, we need to clear the ACT
sent flag, but based on the above we can do this right before enabling
the transcoder.
For clarity, move the flag clearing closer to where we wait for it.
While at it also factor out some common code.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200616141855.746-3-imre.deak@intel.com
During the initial probing of an MST sink, MST core will determine the
sink's link bandwidth based on its own version of the sink link
rate/lane count caps it reads from the DPCD. At a later point (after
probing and 1 or more modesets) i915 may limit the link parameters wrt.
the original source/sink common caps above due to link training failures
during a modeset and the resulting link training fallback logic.
Based on the above a modeset following another modeset with a link
training error will compute the i915 HW specific and DP protocol timing
parameters (data/link M/N and MST TU values) taking into account only
the unlimited source/sink common caps, but not taking into account the
fallback limits. This will also let DRM core oversubscribe the actual
link bandwidth during the MST payload allocation.
Prevent the above problem by disabling the link training fallback on MST
links for now, until the MST probe time initialization and the MST
compute config logic can deal with changing link parameters.
The misconfigured timings lead at least to a
'Timed out waiting for DP idle patterns'
error.
v2: (Ville)
- Print link training error message on the MST path too.
- Clarify the problem in the commit log.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200616211146.23027-2-imre.deak@intel.com
MST encoders must use the master MST transcoder's DP_TP_STATUS and
DP_TP_CONTROL registers. Atm, during the HW readout of an MST encoder
connected to a slave transcoder we reset these register addresses in
intel_dp::regs.dp_tp_* to the slave transcoder's DP_TP_* register
addresses incorrectly; fix this.
One example where the above overwite happens is the encoder HW state
validation after enabling multiple streams; see
intel_dp_mst_enc_get_config(). After that during disabling any stream
we'll get a
'Timed out waiting for ACT sent when disabling'
error, due to reading from the incorrect DP_TP_STATUS register.
This change replaces
https://patchwork.freedesktop.org/patch/369577/?series=78193&rev=1
which just papered over the problem.
v2:
- Correct the failure scenario in the commit log. (José)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200616211146.23027-1-imre.deak@intel.com
Start using device specific parameters instead of module parameters for
most things. The module parameters become the immutable initial values
for i915 parameters. The device specific parameters in i915->params
start life as a copy of i915_modparams. Any later changes are only
reflected in the debugfs.
The stragglers are:
* i915.force_probe and i915.modeset. Needed before dev_priv is
available. This is fine because the parameters are read-only and never
modified.
* i915.verbose_state_checks. Passing dev_priv to I915_STATE_WARN and
I915_STATE_WARN_ON would result in massive and ugly churn. This is
handled by not exposing the parameter via debugfs, and leaving the
parameter writable in sysfs. This may be fixed up in follow-up work.
* i915.inject_probe_failure. Only makes sense in terms of the module,
not the device. This is handled by not exposing the parameter via
debugfs.
v2: Fix uc i915 lookup code (Michał Winiarski)
Cc: Juha-Pekka Heikkilä <juha-pekka.heikkila@intel.com>
Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Acked-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20200618150402.14022-1-jani.nikula@intel.com
We only emit the renderstate once now during module load, it is no
longer a concern that we are delaying context creation and so do not
need to so eagerly optimise. Since the last time we have looked at the
renderstate, we have a pin_map / flush_map facility that supports simple
single mappings, replacing the open-coded kmap_atomic() and
prepare_write. As it should be a single page, of which we only write a
small portion, we stick to a simple WB [kmap] and use clflush on !llc
platforms, rather than creating a temporary WC vmapping for the single
page.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200619234543.17499-2-chris@chris-wilson.co.uk
Since gvt calls pin_map for the shadow batch buffer, this makes the
action of prepare_write [+pin_pages] redundant. We can write into the
obj->mm.mapping directory and the flush_map routine knows when it has to
flush the cpu cache afterwards.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200619234543.17499-1-chris@chris-wilson.co.uk
Smatch warns that we may iterate over an empty array of gt->engines[].
One hopes that this is impossible, but nevertheless we can simply
appease smatch by initialising the timestamp to zero before we starting
probing the busy-time from the engines.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200619151938.21740-1-chris@chris-wilson.co.uk
Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes.
This code was detected with the help of Coccinelle and, audited and
fixed manually.
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200617220331.GA19550@embeddedor
Return the monotonic timestamp (ktime_get()) at the time of sampling the
busy-time. This is used in preference to taking ktime_get() separately
before or after the read seqlock as there can be some large variance in
reported timestamps. For selftests trying to ascertain that we are
reporting accurate to within a few microseconds, even a small delay
leads to the test failing.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200617130916.15261-2-chris@chris-wilson.co.uk
A couple of very simple tests to ensure that the basic properties of
per-engine busyness accounting [0% and 100% busy] are faithful.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200617130916.15261-1-chris@chris-wilson.co.uk
Using _MASKED_BIT_ENABLE macro to set mask register bits is straight
forward and not likely to go wrong. However when checking which bit(s)
is(are) enabled, simply bitwise AND value and _MASKED_BIT_ENABLE() won't
output expected result. Suppose the register write is disabling bit 1
by setting 0xFFFF0000, however "& _MASKED_BIT_ENABLE(1)" outputs
0x00010000, and the non-zero check will pass which cause the old code
consider the new value set as an enabling operation.
We found guest set 0x80008000 on boot, and set 0xffff8000 during resume.
Both are legal settings but old code will block latter and force vgpu
enter fail-safe mode.
Introduce two new macro and make proper masked bit check in mmio handler:
IS_MASKED_BITS_ENABLED()
IS_MASKED_BITS_DISABLED()
V2: Rebase.
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20200601030721.17129-1-colin.xu@intel.com
Like live_unlite_ring, but instead of simply looking at the impact of
intel_ring_direction(), check that preemption more generally works with
different depths of queued requests in the ring.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200616233733.18050-1-chris@chris-wilson.co.uk
Not too long ago, we realised we had issues with a rolling back a
context so far for a preemption request we considered the resubmit not
to be a rollback but a forward roll. This means we would issue a lite
restore instead of forcing a full restore, continuing execution of the
old requests rather than causing a preemption. Add a selftest to
exercise such a far rollback, such that if we were to skip the full
restore, we would execute invalid instructions in the ring and hang.
Note that while I was able to confirm that this causes us to do a
lite-restore preemption rollback (with commit e36ba817fa ("drm/i915/gt:
Incrementally check for rewinding") disabled), it did not trick the HW
into rolling past the old RING_TAIL. Myybe on other HW.
References: e36ba817fa ("drm/i915/gt: Incrementally check for rewinding")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200616185518.11948-1-chris@chris-wilson.co.uk
One i915_request_await_object is enough and we keep the one under the
object lock so it is final.
At the same time move async clflushing setup under the same locked
section and consolidate common code into a helper function.
v2:
* Emit initial breadcrumbs after aways are set up. (Chris)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200615151449.32605-1-tvrtko.ursulin@linux.intel.com
Since these inline routines only return the desired pointer from the
i915_request(after checking the preconditions for acquiring said
pointer), they can be const.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200616183139.4061-1-chris@chris-wilson.co.uk
Fix inconsistent IS_ERR and PTR_ERR in live_timeslice_nopreempt().
The proper pointer to be passed as argument to PTR_ERR() is ce.
This bug was detected with the help of Coccinelle.
Fixes: b72f02d78e ("drm/i915/gt: Prevent timeslicing into unpreemptable requests")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200616145452.GA25291@embeddedor
Since the PWM framework is switching struct pwm_state.duty_cycle's
datatype to u64, prepare for this transition by using DIV_ROUND_UP_ULL
to handle a 64-bit dividend.
Signed-off-by: Guru Das Srinagesh <gurus@codeaurora.org>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Thierry Reding <thierry.reding@gmail.com>
For all ddi, encoder->type holds output type as ddi,
assigning it to individual o/p types is no more valid.
Fixes: 362bfb995b ("drm/i915/tgl: Add DKL PHY vswing table for HDMI")
v2: Rebase, no functional change.
Signed-off-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200612082237.11886-1-vandita.kulkarni@intel.com
(cherry picked from commit 94641eb6c6)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Rescue the GT workarounds from being buried inside init_clock_gating so
that we remember to apply them after a GT reset, and that they are
included in our verification that the workarounds are applied.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20200611080140.30228-6-chris@chris-wilson.co.uk
(cherry picked from commit 2bcefd0d26)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Rescue the GT workarounds from being buried inside init_clock_gating so
that we remember to apply them after a GT reset, and that they are
included in our verification that the workarounds are applied.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20200611080140.30228-5-chris@chris-wilson.co.uk
(cherry picked from commit 806a45c083)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Rescue the GT workarounds from being buried inside init_clock_gating so
that we remember to apply them after a GT reset, and that they are
included in our verification that the workarounds are applied.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20200611080140.30228-4-chris@chris-wilson.co.uk
(cherry picked from commit c3b93a943f)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Rescue the GT workarounds from being buried inside init_clock_gating so
that we remember to apply them after a GT reset, and that they are
included in our verification that the workarounds are applied.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20200611080140.30228-3-chris@chris-wilson.co.uk
(cherry picked from commit 7331c356b6)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Rescue the GT workarounds from being buried inside init_clock_gating so
that we remember to apply them after a GT reset, and that they are
included in our verification that the workarounds are applied.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20200611080140.30228-2-chris@chris-wilson.co.uk
(cherry picked from commit 19f1f627b3)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Rescue the GT workarounds from being buried inside init_clock_gating so
that we remember to apply them after a GT reset, and that they are
included in our verification that the workarounds are applied.
v2: Leave HSW_SCRATCH to set an explicit value, not or in our disable
bit.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2011
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20200611093015.11370-1-chris@chris-wilson.co.uk
(cherry picked from commit f93ec5fb56)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
According to BSpec the Data Island Packet should be disabled after
disabling the transcoder, but before the transcoder clock select is set
to none. On an ICL RVP, daisy-chained MST config not following this
leads to a hang with the following MCE when disabling the output:
[ 870.948739] mce: [Hardware Error]: CPU 0: Machine Check Exception: 5 Bank 6: ba00000011000402
[ 871.019212] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff81aca652> {poll_idle+0x92/0xb0}
[ 871.019212] mce: [Hardware Error]: TSC 135a261fe61
[ 871.019212] mce: [Hardware Error]: PROCESSOR 0:706e5 TIME 1591739604 SOCKET 0 APIC 0 microcode 20
[ 871.019212] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
[ 871.019212] mce: [Hardware Error]: Machine check: Processor context corrupt
[ 871.019212] Kernel panic - not syncing: Fatal machine check
[ 871.019212] Kernel Offset: disabled
Bspec: 4287
Fixes: fa37a21327 ("drm/i915: Stop sending DP SDPs on ddi disable")
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200609220616.6015-1-imre.deak@intel.com
(cherry picked from commit c980216dd2)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
In commit 5ba32c7be8 ("drm/i915/execlists: Always force a context
reload when rewinding RING_TAIL"), we placed the check for rewinding a
context on actually submitting the next request in that context. This
was so that we only had to check once, and could do so with precision
avoiding as many forced restores as possible. For example, to ensure
that we can resubmit the same request a couple of times, we include a
small wa_tail such that on the next submission, the ring->tail will
appear to move forwards when resubmitting the same request. This is very
common as it will happen for every lite-restore to fill the second port
after a context switch.
However, intel_ring_direction() is limited in precision to movements of
upto half the ring size. The consequence being that if we tried to
unwind many requests, we could exceed half the ring and flip the sense
of the direction, so missing a force restore. As no request can be
greater than half the ring (i.e. 2048 bytes in the smallest case), we
can check for rollback incrementally. As we check against the tail that
would be submitted, we do not lose any sensitivity and allow lite
restores for the simple case. We still need to double check upon
submitting the context, to allow for multiple preemptions and
resubmissions.
Fixes: 5ba32c7be8 ("drm/i915/execlists: Always force a context reload when rewinding RING_TAIL")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Bruce Chang <yu.bruce.chang@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200609151723.12971-1-chris@chris-wilson.co.uk
(cherry picked from commit e36ba817fa)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
We have a I915_REQUEST_NOPREEMPT flag that we set when we must prevent
the HW from preempting during the course of this request. We need to
honour this flag and protect the HW even if we have a heartbeat request,
or other maximum priority barrier, pending. As such, restrict the
timeslicing check to avoid preempting into the topmost priority band,
leaving the unpreemptable requests in blissful peace running
uninterrupted on the HW.
v2: Set the I915_PRIORITY_BARRIER to be less than
I915_PRIORITY_UNPREEMPTABLE so that we never submit a request
(heartbeat or barrier) that can legitimately preempt the current
non-premptable request.
Fixes: 2a98f4e65b ("drm/i915: add infrastructure to hold off preemption on a request")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200527162418.24755-1-chris@chris-wilson.co.uk
(cherry picked from commit b72f02d78e)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Since we temporarily disable the heartbeat and restore back to the
default value, we can use the stored defaults on the engine and avoid
using a local.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200519063123.20673-3-chris@chris-wilson.co.uk
(cherry picked from commit 3a230a554d)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Alexandre Oliva has recently removed these files from Linux Libre
with concerns that the sources weren't available.
The sources are available on IGT repository, and only open source
tools are used to generate the {ivb,hsw}_clear_kernel.c files.
However, the remaining concern from Alexandre Oliva was around
GPL license and the source not been present when distributing
the code.
So, it looks like 2 alternatives are possible, the use of
linux-firmware.git repository to store the blob or making sure
that the source is also present in our tree. Since the goal
is to limit the i915 firmware to only the micro-controller blobs
let's make sure that we do include the asm sources here in our tree.
Btw, I tried to have some diligence here and make sure that the
asms that these commits are adding are truly the source for
the mentioned files:
igt$ ./scripts/generate_clear_kernel.sh -g ivb \
-m ~/mesa/build/src/intel/tools/i965_asm
Output file not specified - using default file "ivb-cb_assembled"
Generating gen7 CB Kernel assembled file "ivb_clear_kernel.c"
for i915 driver...
igt$ diff ~/i915/drm-tip/drivers/gpu/drm/i915/gt/ivb_clear_kernel.c \
ivb_clear_kernel.c
< * Generated by: IGT Gpu Tools on Fri 21 Feb 2020 05:29:32 AM UTC
> * Generated by: IGT Gpu Tools on Mon 08 Jun 2020 10:00:54 AM PDT
61c61
< };
> };
\ No newline at end of file
igt$ ./scripts/generate_clear_kernel.sh -g hsw \
-m ~/mesa/build/src/intel/tools/i965_asm
Output file not specified - using default file "hsw-cb_assembled"
Generating gen7.5 CB Kernel assembled file "hsw_clear_kernel.c"
for i915 driver...
igt$ diff ~/i915/drm-tip/drivers/gpu/drm/i915/gt/hsw_clear_kernel.c \
hsw_clear_kernel.c
5c5
< * Generated by: IGT Gpu Tools on Fri 21 Feb 2020 05:30:13 AM UTC
> * Generated by: IGT Gpu Tools on Mon 08 Jun 2020 10:01:42 AM PDT
61c61
< };
> };
\ No newline at end of file
Used IGT and Mesa master repositories from Fri Jun 5 2020)
IGT: 53e8c878a6fb ("tests/kms_chamelium: Force reprobe after replugging
the connector")
Mesa: 5d13c7477eb1 ("radv: set keep_statistic_info with
RADV_DEBUG=shaderstats")
Mesa built with: meson build -D platforms=drm,x11 -D dri-drivers=i965 \
-D gallium-drivers=iris -D prefix=/usr \
-D libdir=/usr/lib64/ -Dtools=intel \
-Dkulkan-drivers=intel && ninja -C build
v2: Header clean-up and include build instructions in a readme (Chris)
Modified commit message to respect check-patch
Reference: http://www.fsfla.org/pipermail/linux-libre/2020-June/003374.html
Reference: http://www.fsfla.org/pipermail/linux-libre/2020-June/003375.html
Fixes: 47f8253d2b ("drm/i915/gen7: Clear all EU/L3 residual contexts")
Cc: <stable@vger.kernel.org> # v5.7+
Cc: Alexandre Oliva <lxoliva@fsfla.org>
Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200610201807.191440-1-rodrigo.vivi@intel.com
Just in case everything fails (like for example "missed interrupt
syndrome" on Sandybridge), always flush the submission tasklet from the
heartbeat. This papers over such issues, but will still appear as a
second long glitch, and prevents us from detecting it unless we happen
to be performing a timed test.
v2: We rely on flush_submission() synchronizing with the tasklet on
another CPU.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200615165013.22973-3-chris@chris-wilson.co.uk
gcc-9 gets confused by the code flow in check_dirty_whitelist:
drivers/gpu/drm/i915/gt/selftest_workarounds.c: In function 'check_dirty_whitelist':
drivers/gpu/drm/i915/gt/selftest_workarounds.c:492:17: error: 'rsvd' may be used uninitialized in this function [-Werror=maybe-uninitialized]
I could not figure out a good way to do this in a way that gcc
understands better, so initialize the variable to zero, as last
resort.
Fixes: aee20aaed8 ("drm/i915: Implement read-only support in whitelist selftest")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200527140526.1458215-2-arnd@arndb.de
(cherry picked from commit cc649a9eaf)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Conditional spinlocks make it hard for gcc and for lockdep to
follow the code flow. This one causes a warning with at least
gcc-9 and higher:
In file included from include/linux/irq.h:14,
from drivers/gpu/drm/i915/i915_pmu.c:7:
drivers/gpu/drm/i915/i915_pmu.c: In function 'i915_sample':
include/linux/spinlock.h:289:3: error: 'flags' may be used uninitialized in this function [-Werror=maybe-uninitialized]
289 | _raw_spin_unlock_irqrestore(lock, flags); \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/i915/i915_pmu.c:288:17: note: 'flags' was declared here
288 | unsigned long flags;
| ^~~~~
Split out the part between the locks into a separate function
for readability and to let the compiler figure out what the
logic actually is.
Fixes: d79e1bd676 ("drm/i915/pmu: Only use exclusive mmio access for gen7")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200527140526.1458215-1-arnd@arndb.de
(cherry picked from commit 6ec81b8273)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
It was quite the oversight to only factor in the normal queue to decide
the timeslicing switch priority. By leaving out the next virtual request
from the priority decision, we would not timeslice the current engine if
there was an available virtual request.
Testcase: igt/gem_exec_balancer/sliced
Fixes: 3df2deed41 ("drm/i915/execlists: Enable timeslice on partial virtual engine dequeue")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200519132046.22443-3-chris@chris-wilson.co.uk
(cherry picked from commit 6ad249ba59)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
gen3 does not fully flush MI stores to memory on MI_FLUSH, such that a
subsequent read from e.g. the sampler can bypass the store and read the
stale value from memory. This is a serious issue when we are using MI
stores to rewrite the batches for relocation, as it means that the batch
is reading from random user/kernel memory. While it is particularly
sensitive [and detectable] for relocations, reading stale data at any
time is a worry.
Having started with a small number of delaying stores and doubling until
no more incoherency was seen over a few hours (with and without
background memory pressure), 32 was the magic number.
Note that it definitely doesn't fix the issue, merely adds a long delay
between requests, sufficient to mostly hide the problem, enough to raise
the mtbf to several hours. This is merely a stop gap.
v2: Follow more closer with the gen5 w/a and include some
post-invalidate flushes as well.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2018
References: a889580c08 ("drm/i915: Flush GPU relocs harder for gen3")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200612123949.7093-1-chris@chris-wilson.co.uk
Reduce the smoke depth by trimming the number of contexts, repetitions
and wait times. This is in preparation for a less greedy scheduler that
tries to be fair across contexts, resulting in a great many more context
switches. A thousand context switches may be 50-100ms, causing us to
timeout as the HW is not fast enough to complete the deep smoketests.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200607222108.14401-5-chris@chris-wilson.co.uk
Since the process_csb() does not require us to hold the
engine->active.lock, we can move the opportunistic flush before
direction submission to outside of the lock.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200612221113.9129-1-chris@chris-wilson.co.uk
If we find ourselves trying to reuse a misplaced but active vma, we
currently try to discard it to avoid having to wait to unbind it
(upsetting the current user fo the vma). An alternative to marking it as
a dicarded vma and keeping it in both the obj->vma.list and
obj->vma.tree, is to simply remove it from the lookup rbtree.
While it remains in the list of vma, it will be unbound under eviction
pressure and freed along with the object. We will never reuse it again
for new instances. As before, with no pruning, the list may continually
grow, but eventually we will have the most constrained version of the
ggtt view that meets all requirements -- so the list of vma should not
grow without bound.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2012
Fixes: 9bdcaa5e3a ("drm/i915: Discard a misplaced GGTT vma")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200611180421.23262-1-chris@chris-wilson.co.uk
For all ddi, encoder->type holds output type as ddi,
assigning it to individual o/p types is no more valid.
Fixes: 362bfb995b ("drm/i915/tgl: Add DKL PHY vswing table for HDMI")
v2: Rebase, no functional change.
Signed-off-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200612082237.11886-1-vandita.kulkarni@intel.com
Merge some more updates from Andrew Morton:
- various hotfixes and minor things
- hch's use_mm/unuse_mm clearnups
Subsystems affected by this patch series: mm/hugetlb, scripts, kcov,
lib, nilfs, checkpatch, lib, mm/debug, ocfs2, lib, misc.
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
kernel: set USER_DS in kthread_use_mm
kernel: better document the use_mm/unuse_mm API contract
kernel: move use_mm/unuse_mm to kthread.c
kernel: move use_mm/unuse_mm to kthread.c
stacktrace: cleanup inconsistent variable type
lib: test get_count_order/long in test_bitops.c
mm: add comments on pglist_data zones
ocfs2: fix spelling mistake and grammar
mm/debug_vm_pgtable: fix kernel crash by checking for THP support
lib: fix bitmap_parse() on 64-bit big endian archs
checkpatch: correct check for kernel parameters doc
nilfs2: fix null pointer dereference at nilfs_segctor_do_construct()
lib/lz4/lz4_decompress.c: document deliberate use of `&'
kcov: check kcov_softirq in kcov_remote_stop()
scripts/spelling: add a few more typos
khugepaged: selftests: fix timeout condition in wait_for_scan()
core:
- fix race in connectors sending hotplug
i915:
- Avoid use after free in cmdparser
- Avoid NULL dereference when probing all display encoders
- Fixup to module parameter type
sun4i:
- clock divider fix
ast:
- 24/32 bpp mode setting fix
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJe4ec6AAoJEAx081l5xIa+nlwQAJbFmInMITfkNoLwh/x7MPZt
MUbpO/cZqE/WaNTN8rLOMtU074yav+7YUSnvdjbTfZSy0+/f+aRzGRIL7BkckMmU
NJVGU4lB7HeGmdafIdwe11+jCOEv4DLNNqFx2xk+Frp2cRTisF10uaXrTQKQZD1N
E2kROWIuWD3cM4lqho4fhETm3HmMc62H3udgHJjMRI5lF0Gj+7uE4Eva08SPTr5E
ogC0+R+kkJdzc3iXE/kBkh65Gsygh9x3NkCd62SDHMu8SAZ00YTCn3C1FZ6w0XrB
GjaRd6NtlMElLuWy/DAcTxxOorCXVz1XslbGPcogG/lmjsIV8vzf/mTlH70sp9Ex
qGbGj1i4sn4HUzOZi7Q4IR+9NVN15r9USSMgQT+nleQXFTq2rk8lctPvF7NtQYIm
vHtieo73FjNcA7+ewI9b/2zDaOX2CxrjphCFGPMYneZQ61vYZXiJaV4ir0/k1XNv
mQwv0riIXqQrRRlWVJz+wGBLtNgzKnxpZE0V9b2kpvcYpfqMmsXGcaUIpj+ufSmQ
ERwS76aJ2QhsGRkvz8Ys9meK1K84wWWWaxDNozeQFRLd4Rjpa0cBjL41pX7RwD/+
WFsWg+ujt4slxZNvRUakO9LeQCV0GvL7WpMbdmCIJKYyTC3erf8j72H9AMxvwz9X
1V/7G+LUo863IOkaVjlG
=IBzW
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2020-06-11-1' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"One sun4i fix and a connector hotplug race The ast fix is for a
regression in 5.6, and one of the i915 ones fixes an oops reported by
dhowells.
core:
- fix race in connectors sending hotplug
i915:
- Avoid use after free in cmdparser
- Avoid NULL dereference when probing all display encoders
- Fixup to module parameter type
sun4i:
- clock divider fix
ast:
- 24/32 bpp mode setting fix"
* tag 'drm-next-2020-06-11-1' of git://anongit.freedesktop.org/drm/drm:
drm/ast: fix missing break in switch statement for format->cpp[0] case 4
drm/sun4i: hdmi ddc clk: Fix size of m divider
drm/i915/display: Only query DP state of a DDI encoder
drm/i915/params: fix i915.reset module param type
drm/i915/gem: Mark the buffer pool as active for the cmdparser
drm/connector: notify userspace on hotplug after register complete
With the removal of the internal wait-priority boosting, we can also
remove the selftest to ensure that those waits were being suppressed
from causing preemptions.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200607222108.14401-4-chris@chris-wilson.co.uk
Some TypeC -> native DP adapters, at least the Club 3D CAC-1557 adapter,
incorrectly filter out HPD short pulses with a duration less than
~540 usec, leading to MST probe failures.
According to the DP Standard 2.0 section 5.1.4:
- DP sinks should generate short pulses in the 500 usec -> 1 msec range
- DP sources should detect short pulses in the 250 usec -> 2 msec range
According to the DP Alt Mode on TypeC Standard section 3.9.2, adapters
should detect and forward short pulses according to how sources should
detect them as specified in the DP Standard (250 usec -> 2 msec).
Based on the above filtering out short pulses with a duration less than
540 usec is incorrect.
To make such adapters work add support for a driver polling on MST
inerrupt flags, and wire this up in the i915 driver. The sink can clear
an interrupt it raised after 110 msec if the source doesn't respond, so
use a 50 msec poll period to avoid missing an interrupt. Polling of the
MST interrupt flags is explicitly allowed by the DP Standard.
This fixes MST probe failures I saw using this adapter and a DELL U2515H
monitor.
v2:
- Fix the wait event timeout for the no-poll case.
v3 (Ville):
- Fix the short pulse duration limits in the commit log prescribed by the
DP Standard.
- Add code comment explaining why/how polling is used.
- Factor out a helper to schedule the port's hpd irq handler and move it
to the rest of hotplug handlers.
- Document the new MST callback.
- s/update_hpd_irq_state/poll_hpd_irq/
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200604184500.23730-2-imre.deak@intel.com
Currently MST on a port can get enabled/disabled from the hotplug work
and get disabled from the short pulse work in a racy way. Fix this by
relying on the MST state checking in the hotplug work and just schedule
a hotplug work from the short pulse handler if some problem happened
during the MST interrupt handling.
This removes the explicit MST disabling in case of an AUX failure, but
if AUX fails, then probably the detection will also fail during the
scheduled hotplug work and it's not guaranteed that we'll see
intermittent errors anyway.
While at it also simplify the error checking of the MST interrupt
handler.
v2:
- Convert intel_dp_check_mst_status() to return bool. (Ville)
- Change the intel_dp->is_mst check to an assert, since after this patch
the condition can't change after we checked it previously.
- Document the return value from intel_dp_check_mst_status().
v3:
- Remove the intel_dp->is_mst check from intel_dp_check_mst_status().
There is no point in checking the same condition twice, even though
there is a chance that the hotplug work running concurrently changes
it.
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200605094801.17709-1-imre.deak@intel.com
DSC is not supported on DP MST streams so just don't add this entry for
MST connectors.
This also fixes an OOPS, caused by the encoder->digport cast, which is
not valid for MST encoders.
v2:
- Check encoder, which is unset for an MST connector, before it gets
enabled.
v3:
- Just don't add this debugfs file for MST connectors. (Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200609184140.4937-1-imre.deak@intel.com
According to BSpec the Data Island Packet should be disabled after
disabling the transcoder, but before the transcoder clock select is set
to none. On an ICL RVP, daisy-chained MST config not following this
leads to a hang with the following MCE when disabling the output:
[ 870.948739] mce: [Hardware Error]: CPU 0: Machine Check Exception: 5 Bank 6: ba00000011000402
[ 871.019212] mce: [Hardware Error]: RIP !INEXACT! 10:<ffffffff81aca652> {poll_idle+0x92/0xb0}
[ 871.019212] mce: [Hardware Error]: TSC 135a261fe61
[ 871.019212] mce: [Hardware Error]: PROCESSOR 0:706e5 TIME 1591739604 SOCKET 0 APIC 0 microcode 20
[ 871.019212] mce: [Hardware Error]: Run the above through 'mcelog --ascii'
[ 871.019212] mce: [Hardware Error]: Machine check: Processor context corrupt
[ 871.019212] Kernel panic - not syncing: Fatal machine check
[ 871.019212] Kernel Offset: disabled
Bspec: 4287
Fixes: fa37a21327 ("drm/i915: Stop sending DP SDPs on ddi disable")
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200609220616.6015-1-imre.deak@intel.com
Patch series "improve use_mm / unuse_mm", v2.
This series improves the use_mm / unuse_mm interface by better documenting
the assumptions, and my taking the set_fs manipulations spread over the
callers into the core API.
This patch (of 3):
Use the proper API instead.
Link: http://lkml.kernel.org/r/20200404094101.672954-1-hch@lst.de
These helpers are only for use with kernel threads, and I will tie them
more into the kthread infrastructure going forward. Also move the
prototypes to kthread.h - mmu_context.h was a little weird to start with
as it otherwise contains very low-level MM bits.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: Jason Wang <jasowang@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: http://lkml.kernel.org/r/20200404094101.672954-1-hch@lst.de
Link: http://lkml.kernel.org/r/20200416053158.586887-1-hch@lst.de
Link: http://lkml.kernel.org/r/20200404094101.672954-5-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull i915 uaccess updates from Al Viro:
"Low-hanging fruit in i915; there are several trickier followups, but
that'll wait for the next cycle"
* 'uaccess.i915' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
i915:get_engines(): get rid of pointless access_ok()
i915: alloc_oa_regs(): get rid of pointless access_ok()
i915 compat ioctl(): just use drm_ioctl_kernel()
i915: switch copy_perf_config_registers_or_number() to unsafe_put_user()
i915: switch query_{topology,engine}_info() to copy_to_user()
We have a test case to exercise resetting an engine while the other
engines are busy, all the TEST_SELF adds on top is that the target
engine also has background activity. In this case it is useful to first
test resetting the engine while there is background activity, as a
separate flag from exercising all others.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200607222108.14401-3-chris@chris-wilson.co.uk
In commit 5ba32c7be8 ("drm/i915/execlists: Always force a context
reload when rewinding RING_TAIL"), we placed the check for rewinding a
context on actually submitting the next request in that context. This
was so that we only had to check once, and could do so with precision
avoiding as many forced restores as possible. For example, to ensure
that we can resubmit the same request a couple of times, we include a
small wa_tail such that on the next submission, the ring->tail will
appear to move forwards when resubmitting the same request. This is very
common as it will happen for every lite-restore to fill the second port
after a context switch.
However, intel_ring_direction() is limited in precision to movements of
upto half the ring size. The consequence being that if we tried to
unwind many requests, we could exceed half the ring and flip the sense
of the direction, so missing a force restore. As no request can be
greater than half the ring (i.e. 2048 bytes in the smallest case), we
can check for rollback incrementally. As we check against the tail that
would be submitted, we do not lose any sensitivity and allow lite
restores for the simple case. We still need to double check upon
submitting the context, to allow for multiple preemptions and
resubmissions.
Fixes: 5ba32c7be8 ("drm/i915/execlists: Always force a context reload when rewinding RING_TAIL")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Bruce Chang <yu.bruce.chang@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200609151723.12971-1-chris@chris-wilson.co.uk
RKL doesn't have DSI outputs, so we shouldn't try to read out the DSI
transcoder registers.
v2(MattR):
- Just set the 'extra panel mask' to edp | dsi0 | dsi1 and then mask
against the platform's cpu_transcoder_mask to filter out the ones
that don't exist on a given platform. (Ville)
v3(MattR):
- Only include DSI transcoders on gen11+ again. (Ville)
- Use for_each_cpu_transcoder_masked() for loop. (Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Aditya Swarup <aditya.swarup@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200606025740.3308880-5-matthew.d.roper@intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
HPD pin handling for RKL+TGP is a special case; we effectively select
the HPD pin based on the DDI (A,B,D,E) rather than the PHY (A,B,C,D).
This differs from the regular behavior of RKL+CMP (and also TGL+TGP).
v2:
- Rather than providing a custom hpd_pin mapping table, just assign
encoder->hpd_pin in a custom manner for this setup. (Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200606025740.3308880-4-matthew.d.roper@intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Rocket Lake uses the same 'abox0' mechanism to handle pixel data
transfers from memory that gen11 platforms used, rather than the
abox1/abox2 interfaces used by TGL/DG1. For the most part this is a
hardware implementation detail that's transparent to driver software,
but we do have to program a couple of tuning registers (MBUS_ABOX_CTL
and BW_BUDDY registers) according to which ABOX instances are used by a
platform. Let's track the platform's ABOX usage in the device info
structure and use that to determine which instances of these registers
to program.
As an exception to this rule is that even though TGL/DG1 use ABOX1+ABOX2
for data transfers, we're still directed to program the ABOX_CTL
register for ABOX0; so we'll handle that as a special case.
v2:
- Store the mask of platform-specific abox registers in the device
info structure.
- Add a TLB_REQ_TIMER() helper macro. (Aditya)
v3:
- Squash ABOX and BW_BUDDY patches together and use a single mask for
both of them, plus a special-case for programming the ABOX0 instance
on all gen12. (Ville)
Bspec: 50096
Bspec: 49218
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Aditya Swarup <aditya.swarup@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200606025740.3308880-2-matthew.d.roper@intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Patch series "mm: consolidate definitions of page table accessors", v2.
The low level page table accessors (pXY_index(), pXY_offset()) are
duplicated across all architectures and sometimes more than once. For
instance, we have 31 definition of pgd_offset() for 25 supported
architectures.
Most of these definitions are actually identical and typically it boils
down to, e.g.
static inline unsigned long pmd_index(unsigned long address)
{
return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
}
static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
{
return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(address);
}
These definitions can be shared among 90% of the arches provided
XYZ_SHIFT, PTRS_PER_XYZ and xyz_page_vaddr() are defined.
For architectures that really need a custom version there is always
possibility to override the generic version with the usual ifdefs magic.
These patches introduce include/linux/pgtable.h that replaces
include/asm-generic/pgtable.h and add the definitions of the page table
accessors to the new header.
This patch (of 12):
The linux/mm.h header includes <asm/pgtable.h> to allow inlining of the
functions involving page table manipulations, e.g. pte_alloc() and
pmd_alloc(). So, there is no point to explicitly include <asm/pgtable.h>
in the files that include <linux/mm.h>.
The include statements in such cases are remove with a simple loop:
for f in $(git grep -l "include <linux/mm.h>") ; do
sed -i -e '/include <asm\/pgtable.h>/ d' $f
done
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200514170327.31389-1-rppt@kernel.org
Link: http://lkml.kernel.org/r/20200514170327.31389-2-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Avoid a NULL dereference for a mismatched encoder type, hit when
probing state for all encoders.
This is a band aid to prevent the OOPS as the right fix is "probably to
swap the psr vs infoframes.enable checks, or outright disappear from
this function" (Ville).
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1892
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200525124912.16019-1-chris@chris-wilson.co.uk
(cherry picked from commit 22da5d846d)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
In some of our hangtests, we try to reset an active engine while it is
spinning inside the recursive spinner. However, we also try to flood the
engine with requests that preempt the hang, and so should disable the
preemption to be sure that we reset the right request.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200607222108.14401-2-chris@chris-wilson.co.uk
Sentinels are supposed to be last requests in the elsp queue, not the
only one, so adjust the assert accordingly.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200607222108.14401-1-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
i915:
- gvt: Fix one clang warning on debug only function
Use ARRAY_SIZE for coccicheck warn
- Use after free fix for display global state.
- Whitelisting context-local timestamp on Gen9
and two scheduler fixes with deps (Cc: stable)
- Removal of write flag from sysfs files where
ineffective
nouveau:
- HDMI/DP audio HDA fixes
- display hang fix for Volta/Turing
- GK20A regression fix.
amdgpu:
- Prevent hwmon accesses while GPU is in reset
- CTF interrupt fix
- Backlight fix for renoir
- Fix for display sync groups
- Display bandwidth validation workaround
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJe3awVAAoJEAx081l5xIa+a6QP/2KoSAxod8tFUG5Lh8e0XUQS
H8lnmikPfHhngzfHdWvC9lkxfZ+MII3Bs6I6agJtqYsavy9u9ooMuYG4I3TWULZb
rO8Z/lJHdOjlFnHXUxfZKg0oc1zrY2U+5IcnEFXvHV3/MPboshWohK7dh5c/LZuA
tL84JUM5eIdLFphM5xtgTDE4gKFyVkdw8ndlnCSxNagYhlRNyUeP4qteqmCdilCr
CP7FcVRIe+Bk7y3wtOzB43mdRJQ9vDjUvQurz8voI9WObnW4oXvEjoZUaN4KwRHL
Lpd52QAS7aFT+nnIzpYkrnHPSpNk710i/SsHamOWFLS/jI9NJq7hTK3dZT3ZSyLR
5Dw0mZuu038gwk1SHmN2DtkUR8JEULppDHphh3yL0yp0Hzze6IoeUIe9XdzVHKuV
tBk5AoFGNf98WzSHvOwQAchvB60gk861pt7p27q5JhO9umLNvkLI7z4jWpa2aUda
hZkP+4ycslN1Q77FrYjca4yZVRvhtsuejAJbn74oSNB6UdX4pfvdnF2swhMf8//i
Lyyl/s5e/qkudZYoMku2gEqFahnmDxFeyy5X2I/Doc1lfYQbdlGMS1FGT872yXW2
x8hOiMuVFNCs/RMik3HiYxZ1WjwIVeMXp7rnUOvZ4ObE4VCA0P8IVOb9MbCOScC1
QtsJN1yA0xEfSvoC+Lfh
=wXjv
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2020-06-08' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"These are the fixes from last week for the stuff merged in the merge
window. It got a bunch of nouveau fixes for HDA audio on some new
GPUs, some i915 and some amdpgu fixes.
i915:
- gvt: Fix one clang warning on debug only function
- Use ARRAY_SIZE for coccicheck warning
- Use after free fix for display global state.
- Whitelisting context-local timestamp on Gen9 and two scheduler
fixes with deps (Cc: stable)
- Removal of write flag from sysfs files where ineffective
nouveau:
- HDMI/DP audio HDA fixes
- display hang fix for Volta/Turing
- GK20A regression fix.
amdgpu:
- Prevent hwmon accesses while GPU is in reset
- CTF interrupt fix
- Backlight fix for renoir
- Fix for display sync groups
- Display bandwidth validation workaround"
* tag 'drm-next-2020-06-08' of git://anongit.freedesktop.org/drm/drm: (28 commits)
drm/nouveau/kms/nv50-: clear SW state of disabled windows harder
drm/nouveau: gr/gk20a: Use firmware version 0
drm/nouveau/disp/gm200-: detect and potentially disable HDA support on some SORs
drm/nouveau/disp/gp100: split SOR implementation from gm200
drm/nouveau/disp: modify OR allocation policy to account for HDA requirements
drm/nouveau/disp: split part of OR allocation logic into a function
drm/nouveau/disp: provide hint to OR allocation about HDA requirements
drm/amd/display: Revalidate bandwidth before commiting DC updates
drm/amdgpu/display: use blanked rather than plane state for sync groups
drm/i915/params: fix i915.fake_lmem_start module param sysfs permissions
drm/i915/params: don't expose inject_probe_failure in debugfs
drm/i915: Whitelist context-local timestamp in the gen9 cmdparser
drm/i915: Fix global state use-after-frees with a refcount
drm/i915: Check for awaits on still currently executing requests
drm/i915/gt: Do not schedule normal requests immediately along virtual
drm/i915: Reorder await_execution before await_request
drm/nouveau/kms/gt215-: fix race with audio driver runpm
drm/nouveau/disp/gm200-: fix NV_PDISP_SOR_HDMI2_CTRL(n) selection
Revert "drm/amd/display: disable dcn20 abm feature for bring up"
drm/amd/powerplay: ack the SMUToHost interrupt on receive V2
...
The reset member in i915_params was previously changed to unsigned, but
this failed to change the actual module parameter.
Fixes: aae970d845 ("drm/i915: Mark i915.reset as unsigned")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200602151126.25626-1-jani.nikula@intel.com
(cherry picked from commit 34becfdb94)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
If the execbuf is interrupted after building the cmdparser pipeline, and
before we commit to submitting the request to HW, we would attempt to
clean up the cmdparser early. While we held active references to the vma
being parsed and constructed, we did not hold an active reference for
the buffer pool itself. The result was that an interrupted execbuf could
still have run the cmdparser pipeline, but since the buffer pool was
idle, its target vma could have been recycled.
Note this problem only occurs if the cmdparser is running async due to
pipelined waits on busy fences, and the execbuf is interrupted.
Fixes: 686c7c35ab ("drm/i915/gem: Asynchronous cmdparser")
Fixes: 16e8745967 ("drm/i915/gt: Move the batch buffer pool from the engine to the gt")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200604103751.18816-1-chris@chris-wilson.co.uk
(cherry picked from commit 57a78ca4ec)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
This reverts commit 82ea174dc5.
Unfortunately according to our recent findings there is still some
unidentified factor, requiring CDCLK to be set higher - otherwise we
still get underruns on some multipipe configurations, despite CDCLK
being set according to BSpec formula. So getting again back into debug
mode to indentify the cause, meanwhile setting CDCLK=Pixel rate back in
order to remove regression in 10% of the cases due to FIFO underruns.
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Fixes: cd19154608 ("drm/i915: Adjust CDCLK accordingly to our DBuf bw needs")
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200608065552.21728-1-stanislav.lisovskiy@intel.com
- Includes gvt-next-fixes-2020-05-28
- Use after free fix for display global state.
- Whitelisting context-local timestamp on Gen9
and two scheduler fixes with deps (Cc: stable)
- Removal of write flag from sysfs files where
ineffective
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200604150454.GA59322@jlahtine-desk.ger.corp.intel.com
The IO buffer Wake and Fast Wake bit size and value have been changed from
Gen12+. It programs the default value of IO buffer Wake and Fast Wake on
Gen12+. It adds definitions of IO buffer Wake and Fast Wake for pre Gen12
and Gen12+. And it aligns PSR2 definition macros.
v2: Fix macro definitions. (José)
v3: Addressed review comments from José
- Add missing default values of IO_BUFFER_WAKE and FAST_WAKE for GEN9+
- Change a style of macro naming in order to use lines as input.
- Update Todo comments.
v4: Add parentheses to macros to avoid precedence issues.
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200607143614.185246-1-gwan-gyeong.mun@intel.com
We accidentally dropped matching for DVO_PORT_DPE from the VBT mapping
table when we refactored the function. Restore it.
Fixes: 4628142aec ("drm/i915/rkl: provide port/phy mapping for vbt")
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200606031803.3309624-1-matthew.d.roper@intel.com
As a last minute addition, I added an assertion to make sure that the
new i915_vma view would be equal to the discard. However, the positive
encouragement from CI only goes to show that we rarely take this path,
and it wasn't until the post-merge run did we hit the assert -- because
it compared the wrong view. Fixup the copy'n'paste error and compare
against both the old view and the expected new view.
Fixes: 9bdcaa5e3a ("drm/i915: Discard a misplaced GGTT vma")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200605184844.24644-1-chris@chris-wilson.co.uk
Across the many users of the GGTT vma (internal objects, mmapings,
display etc), we may end up with conflicting requirements for the
placement. Currently, we try to resolve the conflict by unbinding the
vma and rebinding it to match the new constraints; over time we will end
up with a GGTT that matches the most strict constraints over all
concurrent users. However, this causes a problem if the vma is currently
in use as we must wait until it is idle before moving it. But there is
no restriction on the number of views we may use (apart from the limited
size of the GGTT itself), and so if the active vma does not meet our
requirements, try and build a new one!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200605165258.1483-1-chris@chris-wilson.co.uk
We may choose not to submit for a number of reasons, yet not fill both
ELSP. In which case we must start timeslicing (there will be no ACK
event on which to hook the start) if the queue would benefit from the
currently active context being evicted.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200605122334.2798-2-chris@chris-wilson.co.uk
If we only submit the first port, leaving the second empty yet have
ready requests pending in the queue, use that to set the timeslicing
priority (i.e. the priority at which we will decided to enabling
timeslicing and evict the currently active context if the queue is of
equal priority after its quantum expired).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200605122334.2798-1-chris@chris-wilson.co.uk
Add engine->fw_domain/active to the pretty printer for debug dumps and
debugfs.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200605144705.31127-1-chris@chris-wilson.co.uk
Reduce the 3 relocation paths down to the single path that accommodates
all. The primary motivation for this is to guard the relocations with a
natural fence (derived from the i915_request used to write the
relocation from the GPU).
The tradeoff in using async gpu relocations is that it increases latency
over using direct CPU relocations, for the cases where the target is
idle and accessible by the CPU. The benefit is greatly reduced lock
contention and improved concurrency by pipelining.
Note that forcing the async gpu relocations does reveal a few issues
they have. Firstly, is that they are visible as writes to gem_busy,
causing to mark some buffers are being to written to by the GPU even
though userspace only reads. Secondly is that, in combination with the
cmdparser, they can cause priority inversions. This should be the case
where the work is being put into a common workqueue losing our priority
information and so being executed in FIFO from the worker, denying us
the opportunity to reorder the requests afterwards.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200604211457.19696-1-chris@chris-wilson.co.uk
This parameter is meant to be used when PSR issues are found as some
issues in the past was due wrong values set in VBT so this would be
a quick and easy way to ask users or for us to check if the issue is
due VBT values.
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200520212756.354623-1-jose.souza@intel.com
RKL doesn't have PSR2 HW tracking, it was replaced by software/manual
tracking. The driver is required to track the areas that needs update
and program hardware to send selective updates.
So until the software tracking is implemented, PSR2 needs to be disabled
for platforms without PSR2 HW tracking.
BSpec: 50422
BSpec: 50424
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200603211529.3005059-15-matthew.d.roper@intel.com
There are a couple places in our driver that loop over transcoders A..D
for gen11+; since RKL only has three pipes/transcoders, this can lead to
unclaimed register reads/writes. We should add checks for transcoder
existence where appropriate.
v2: Move one transcoder check that wound up in the wrong function after
conflict resolution. It belongs in bdw_get_trans_port_sync_config
rather than bxt_get_dsi_transcoder_state.
v3: Switch loops to use for_each_cpu_transcoder_masked() since this
iterator already checks the platform's transcoder mask for us.
(Ville)
Cc: Aditya Swarup <aditya.swarup@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200603211529.3005059-10-matthew.d.roper@intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
RKL uses DDI's A, B, TC1, and TC2 which need to map to combo PHY's A-D.
Bspec: 49181
Cc: Imre Deak <imre.deak@intel.com>
Cc: Aditya Swarup <aditya.swarup@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200603211529.3005059-6-matthew.d.roper@intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
As latest update we have now 2 voltage swing tables for DP over DKL
PHY with only one difference in Level 0 pre-emphasis 3.
So with 2 tables for DP is time to have one single function to return
all DKL voltage swing tables.
BSpec: 49292
Cc: Khaled Almahallawy <khaled.almahallawy@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Tested-by: Khaled Almahallawy <khaled.almahallawy@intel.com>
Reviewed-by: Khaled Almahallawy<khaled.almahallawy@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200602205424.138143-1-jose.souza@intel.com
Previous patch didn't take into account all pipes
but only those in state, which could cause wrong
CDCLK conclcusions and calculations.
Also there was a severe issue with min_cdclk being
assigned to 0 every compare cycle.
Too bad this was found by me only after merge.
This could be also causing the issues in test, however
not clear - anyway marking this as fixing the
"Adjust CDCLK accordingly to our DBuf bw needs".
v2: - s/pipe/crtc->pipe/
- save a bit of instructions by
skipping inactive pipes, without
getting 0 DBuf slice mask for it.
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Fixes: cd19154608 ("drm/i915: Adjust CDCLK accordingly to our DBuf bw needs")
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200601173058.5084-1-stanislav.lisovskiy@intel.com
Certain combo PHYs act as a compensation master to other PHYs and need
to be initialized with a special irefgen bit in the PORT_COMP_DW8
register. Previously PHY A was the only compensation master (for PHYs
B & C), but RKL adds a fourth PHY which is slaved to PHY C instead.
Bspec: 49291
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Aditya Swarup <aditya.swarup@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200603211529.3005059-12-matthew.d.roper@intel.com
The pin mapping for the final two outputs varies according to which PCH
is present on the platform: with TGP the pins are remapped into the TC
range, whereas with CMP they stay in the traditional combo output range.
Bspec: 49181
Cc: Aditya Swarup <aditya.swarup@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200603211529.3005059-9-matthew.d.roper@intel.com
RKL uses the DDI A, DDI B, DDI USBC1, DDI USBC2 from the DE point of
view, so all DDI/pipe/transcoder register use these indexes to refer to
them. Combo phy and IO functions follow another namespace that we keep
as "enum phy". The VBT in theory would use the DE point of view, but
that does not happen in practice.
Provide a table to convert the child devices to the "correct" port
numbering we use. Now this is the output we get while reading the VBT:
DDIA:
[drm:intel_bios_port_aux_ch [i915]] using AUX A for port A (VBT)
[drm:intel_dp_init_connector [i915]] Adding DP connector on [ENCODER:275:DDI A]
[drm:intel_hdmi_init_connector [i915]] Adding HDMI connector on [ENCODER:275:DDI A]
[drm:intel_hdmi_init_connector [i915]] Using DDC pin 0x1 for port A (VBT)
DDIB:
[drm:intel_bios_port_aux_ch [i915]] using AUX B for port B (platform default)
[drm:intel_hdmi_init_connector [i915]] Adding HDMI connector on [ENCODER:291:DDI B]
[drm:intel_hdmi_init_connector [i915]] Using DDC pin 0x2 for port B (VBT)
DDI USBC1:
[drm:intel_bios_port_aux_ch [i915]] using AUX D for port D (VBT)
[drm:intel_dp_init_connector [i915]] Adding DP connector on [ENCODER:295:DDI D]
[drm:intel_hdmi_init_connector [i915]] Adding HDMI connector on [ENCODER:295:DDI D]
[drm:intel_hdmi_init_connector [i915]] Using DDC pin 0x3 for port D (VBT)
DDI USBC2:
[drm:intel_bios_port_aux_ch [i915]] using AUX E for port E (VBT)
[drm:intel_dp_init_connector [i915]] Adding DP connector on [ENCODER:306:DDI E]
[drm:intel_hdmi_init_connector [i915]] Adding HDMI connector on [ENCODER:306:DDI E]
[drm:intel_hdmi_init_connector [i915]] Using DDC pin 0x9 for port E (VBT)
Cc: Clinton Taylor <Clinton.A.Taylor@intel.com>
Cc: Aditya Swarup <aditya.swarup@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200603211529.3005059-7-matthew.d.roper@intel.com
Although we properly captured RKL's three pipes in the device info
structure, we forgot to make the corresponding update to the transcoder
mask. Set this field so that our transcoder loops will operate
properly.
Fixes: 123f62de41 ("drm/i915/rkl: Add RKL platform info and PCI ids")
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200603211529.3005059-2-matthew.d.roper@intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Sometimes an engine might need to keep forcewake active while it is busy
submitting requests for a particular workaround. Track such nuisance
with engine->fw_domain.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200604153145.21068-1-chris@chris-wilson.co.uk
Use the plain msec_to_jiffies() rather than the _timeout variant so we
round down and do not add an extra jiffy to our interval. For example,
with timeslicing we do not want to err on the longer side as any
fairness depends on catching hogging contexts on the GPU. Bring on
CFS.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200604135938.3975-1-chris@chris-wilson.co.uk
If the execbuf is interrupted after building the cmdparser pipeline, and
before we commit to submitting the request to HW, we would attempt to
clean up the cmdparser early. While we held active references to the vma
being parsed and constructed, we did not hold an active reference for
the buffer pool itself. The result was that an interrupted execbuf could
still have run the cmdparser pipeline, but since the buffer pool was
idle, its target vma could have been recycled.
Note this problem only occurs if the cmdparser is running async due to
pipelined waits on busy fences, and the execbuf is interrupted.
Fixes: 686c7c35ab ("drm/i915/gem: Asynchronous cmdparser")
Fixes: 16e8745967 ("drm/i915/gt: Move the batch buffer pool from the engine to the gt")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200604103751.18816-1-chris@chris-wilson.co.uk
Set GS Timer to 224. Combine with Wa_1604555607 due to register FF_MODE2
not being able to be read.
V2: Math issue fixed
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Caz Yokoyama <caz.yokoyama@intel.com>
Cc: Matt Atwood <matthew.s.atwood@intel.com>
Signed-off-by: Clint Taylor <clinton.a.taylor@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200603221150.14745-1-clinton.a.taylor@intel.com
Just to remove an obnoxious HAS_ENGINES(), and in the process make the
code agnostic to the availabilty of any particular engine by making it
exercise any and all such engines declared on the system.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200604123641.767-1-chris@chris-wilson.co.uk
The reset member in i915_params was previously changed to unsigned, but
this failed to change the actual module parameter.
Fixes: aae970d845 ("drm/i915: Mark i915.reset as unsigned")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200602151126.25626-1-jani.nikula@intel.com
This code was using get_user_pages*(), in a "Case 2" scenario (DMA/RDMA),
using the categorization from [1]. That means that it's time to convert
the get_user_pages*() + put_page() calls to pin_user_pages*() +
unpin_user_pages() calls.
There is some helpful background in [2]: basically, this is a small part
of fixing a long-standing disconnect between pinning pages, and file
systems' use of those pages.
[1] Documentation/core-api/pin_user_pages.rst
[2] "Explicit pinning of user-space pages":
https://lwn.net/Articles/807108/
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: "Joonas Lahtinen" <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: http://lkml.kernel.org/r/20200519002124.2025955-5-jhubbard@nvidia.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The DP spec says:
"When the combination of the requested pre-emphasis level and
voltage swing exceeds the capability of a DPTX, the DPTX shall
set the pre-emphasis level according to the request and use the
highest voltage swing it can output with the given pre-emphasis level."
and
"When a DPTX reads a request beyond the limits of this Standard,
the DPTX shall set the pre-emphasis level according to the request
and set the highest voltage swing level it can output with the
given pre-emphasis level. If a DPTX is requested for 9.5dB of
pre-emphasis level (may be supported for a DPTX) and cannot support
that level, it shall set the pre-emphasis level to the next
highest level, 6dB."
Ie. we should first validate the pre-emphasis, and then select
the appropriate vswing for it.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200512174145.3186-6-ville.syrjala@linux.intel.com
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
According to the DP spec supporting vswing 1 + preemph 2 is
mandatory. We don't have the hw settings for that though. In
order to pretend to follow the DP spec let's just select
vswing 0 + preemph 2 in this case (the DP spec says to use
the requested preemph in preference to the vswing when the
requested values aren't supported).
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200512174145.3186-4-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
cpt/ppt support pre-emphasis level 3. Let's actually declare
support for it, instead of clamping things to level 2.
Also tweak the if-ladder in intel_dp_voltage_max() to match
intel_dp_pre_emphasis_max() to make it easier to compare them.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200512174145.3186-2-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
We infrequently use the direct i915 backpointer from the i915_request,
so do we really need to waste the space in the struct for it? 8 bytes
from the most frequently allocated struct vs an 3 bytes and pointer
chasing in using rq->engine->i915?
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200602220953.21178-1-chris@chris-wilson.co.uk
If we injected an error (such as pretending the GuC firmware was
broken), then suppress the error message as it is expected and our CI
complains if it sees any *ERROR*.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200603104657.25651-1-chris@chris-wilson.co.uk
For reasons that be, the HW only allows usersace to read its own
CTX_TIMESTAMP [context local HW runtime] on rcs. Make it available for
all by adding it to the whitelists.
v2: The change took effect from Cometlake.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200602154839.6902-1-chris@chris-wilson.co.uk
Cometlake is a small refresh of Coffeelake, but since we have found out a
difference in the plaforms, we need to identify them as separate platforms.
Since we previously took Coffeelake/Cometlake as identical, update all
IS_COFFEELAKE() to also include IS_COMETLAKE().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200602140541.5481-1-chris@chris-wilson.co.uk
As a timestamp will automatically update itself, it will not hold only
contexts we write into it, and will change from the baseline value
making us suspect that our writes are landing. As this confuses us and
we would need more careful treatment to detect invalid stores into the
timestamp, skip it when verifying the whitelists.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200602154839.6902-1-chris@chris-wilson.co.uk
- Rework the system-wide PM driver flags to make them easier to
understand and use and update their documentation (Rafael Wysocki,
Alan Stern).
- Allow cpuidle governors to be switched at run time regardless of
the kernel configuration and update the related documentation
accordingly (Hanjun Guo).
- Improve the resume device handling in the user space hibernarion
interface code (Domenico Andreoli).
- Document the intel-speed-select sysfs interface (Srinivas
Pandruvada).
- Make the ACPI code handing suspend to idle print more debug
messages to help diagnose issues with it (Rafael Wysocki).
- Fix a helper routine in the cpufreq core and correct a typo in
the struct cpufreq_driver kerneldoc comment (Rafael Wysocki, Wang
Wenhu).
- Update cpufreq drivers:
* Make the intel_pstate driver start in the passive mode by
default on systems without HWP (Rafael Wysocki).
* Add i.MX7ULP support to the imx-cpufreq-dt driver and add
i.MX7ULP to the cpufreq-dt-platdev blacklist (Peng Fan).
* Convert the qoriq cpufreq driver to a platform one, make the
platform code create a suitable device object for it and add
platform dependencies to it (Mian Yousaf Kaukab, Geert
Uytterhoeven).
* Fix wrong compatible binding in the qcom driver (Ansuel Smith).
* Build the omap driver by default for ARCH_OMAP2PLUS (Anders
Roxell).
* Add r8a7742 SoC support to the dt cpufreq driver (Lad Prabhakar).
- Update cpuidle core and drivers:
* Fix three reference count leaks in error code paths in the
cpuidle core (Qiushi Wu).
* Convert Qualcomm SPM to a generic cpuidle driver (Stephan
Gerhold).
* Fix up the execution order when entering a domain idle state in
the PSCI driver (Ulf Hansson).
- Fix a reference counting issue related to clock management and
clean up two oddities in the PM-runtime framework (Rafael Wysocki,
Andy Shevchenko).
- Add ElkhartLake support to the Intel RAPL power capping driver
and remove an unused local MSR definition from it (Jacob Pan,
Sumeet Pawnikar).
- Update devfreq core and drivers:
* Replace strncpy() with strscpy() in the devfreq core and use
lockdep asserts instead of manual checks for a locked mutex in
it (Dmitry Osipenko, Krzysztof Kozlowski).
* Add a generic imx bus scaling driver and make it register an
interconnect device (Leonard Crestez, Gustavo A. R. Silva).
* Make the cpufreq notifier in the tegra30 driver take boosting
into account and delete an unuseful error message from that
driver (Dmitry Osipenko, Markus Elfring).
- Remove unneeded semicolon from the cpupower code (Zou Wei).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl7VGjwSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRx46gP/jGAXlddFEQswi6qUT3Cff0A9mb8CdcX
dyKrjX4xxo/wtBIAwSN4achxrgse//ayo2dYTzWRDd31W9Azbv+5F+46XsDRz4hL
pH29u/E66NMtFWnHCmt78NEJn0FzSa0YBC43ZzwFwKktCK9skYIpGN2z6iuXUBSX
Q5GHqop3zvDsdKQFBGL62xvUw/AmOTPG7ohIZvqWBN2mbOqEqMcoFHT+aUF/NbLj
+i14dvTH767eDZGRVASmXWQyljjaRWm+SIw4+m8zT1D1Y3d5IFObuMN+9RQl1Tif
BYjkgJ2oDDMhCJLW7TBuJB+g7exiyaSQds3nMr2ZR+eZbJipICjU4eehNEKIUopU
DM17tHQfnwZfS/7YbCx3vYQwLkNq37AJyXS9uqCAIFM+0n4xN4/mIVmgWYISLDTs
1v9olFxtwMRNpjGGQWPJAO7ebB8Zz9qhQv7pIkSQEfwp93/SzvlVf4vvruTeFN9J
qqG60cDumXWAm+s43eQHJNn5nOd5ocWv0FBpo/cxqKbzxFVWwdB42Cm0SY+rK2ID
uHdnc2DJcK2c78UVbz3Cmk4272foJt2zxchqjFXXAZPLrOsFfzmti4B28VxGxjmP
LG3MhH5sdbF4yl/1aSC1Bnrt+PV9Lus6ut/VKhjwIpw8cqiXgpwSbMoDoaBd9UMQ
ubGz2rplGAtB
=APdj
-----END PGP SIGNATURE-----
Merge tag 'pm-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These rework the system-wide PM driver flags, make runtime switching
of cpuidle governors easier, improve the user space hibernation
interface code, add intel-speed-select interface documentation, add
more debug messages to the ACPI code handling suspend to idle, update
the cpufreq core and drivers, fix a minor issue in the cpuidle core
and update two cpuidle drivers, improve the PM-runtime framework,
update the Intel RAPL power capping driver, update devfreq core and
drivers, and clean up the cpupower utility.
Specifics:
- Rework the system-wide PM driver flags to make them easier to
understand and use and update their documentation (Rafael Wysocki,
Alan Stern).
- Allow cpuidle governors to be switched at run time regardless of
the kernel configuration and update the related documentation
accordingly (Hanjun Guo).
- Improve the resume device handling in the user space hibernarion
interface code (Domenico Andreoli).
- Document the intel-speed-select sysfs interface (Srinivas
Pandruvada).
- Make the ACPI code handing suspend to idle print more debug
messages to help diagnose issues with it (Rafael Wysocki).
- Fix a helper routine in the cpufreq core and correct a typo in the
struct cpufreq_driver kerneldoc comment (Rafael Wysocki, Wang
Wenhu).
- Update cpufreq drivers:
- Make the intel_pstate driver start in the passive mode by
default on systems without HWP (Rafael Wysocki).
- Add i.MX7ULP support to the imx-cpufreq-dt driver and add
i.MX7ULP to the cpufreq-dt-platdev blacklist (Peng Fan).
- Convert the qoriq cpufreq driver to a platform one, make the
platform code create a suitable device object for it and add
platform dependencies to it (Mian Yousaf Kaukab, Geert
Uytterhoeven).
- Fix wrong compatible binding in the qcom driver (Ansuel Smith).
- Build the omap driver by default for ARCH_OMAP2PLUS (Anders
Roxell).
- Add r8a7742 SoC support to the dt cpufreq driver (Lad
Prabhakar).
- Update cpuidle core and drivers:
- Fix three reference count leaks in error code paths in the
cpuidle core (Qiushi Wu).
- Convert Qualcomm SPM to a generic cpuidle driver (Stephan
Gerhold).
- Fix up the execution order when entering a domain idle state in
the PSCI driver (Ulf Hansson).
- Fix a reference counting issue related to clock management and
clean up two oddities in the PM-runtime framework (Rafael Wysocki,
Andy Shevchenko).
- Add ElkhartLake support to the Intel RAPL power capping driver and
remove an unused local MSR definition from it (Jacob Pan, Sumeet
Pawnikar).
- Update devfreq core and drivers:
- Replace strncpy() with strscpy() in the devfreq core and use
lockdep asserts instead of manual checks for a locked mutex in
it (Dmitry Osipenko, Krzysztof Kozlowski).
- Add a generic imx bus scaling driver and make it register an
interconnect device (Leonard Crestez, Gustavo A. R. Silva).
- Make the cpufreq notifier in the tegra30 driver take boosting
into account and delete an unuseful error message from that
driver (Dmitry Osipenko, Markus Elfring).
- Remove unneeded semicolon from the cpupower code (Zou Wei)"
* tag 'pm-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (51 commits)
cpuidle: Fix three reference count leaks
PM: runtime: Replace pm_runtime_callbacks_present()
PM / devfreq: Use lockdep asserts instead of manual checks for locked mutex
PM / devfreq: imx-bus: Fix inconsistent IS_ERR and PTR_ERR
PM / devfreq: Replace strncpy with strscpy
PM / devfreq: imx: Register interconnect device
PM / devfreq: Add generic imx bus scaling driver
PM / devfreq: tegra30: Delete an error message in tegra_devfreq_probe()
PM / devfreq: tegra30: Make CPUFreq notifier to take into account boosting
PM: hibernate: Restrict writes to the resume device
PM: runtime: clk: Fix clk_pm_runtime_get() error path
cpuidle: Convert Qualcomm SPM driver to a generic CPUidle driver
ACPI: EC: PM: s2idle: Extend GPE dispatching debug message
ACPI: PM: s2idle: Print type of wakeup debug messages
powercap: RAPL: remove unused local MSR define
PM: runtime: Make clear what we do when conditions are wrong in rpm_suspend()
Documentation: admin-guide: pm: Document intel-speed-select
PM: hibernate: Split off snapshot dev option
PM: hibernate: Incorporate concurrency handling
Documentation: ABI: make current_governer_ro as a candidate for removal
...
Merge updates from Andrew Morton:
"A few little subsystems and a start of a lot of MM patches.
Subsystems affected by this patch series: squashfs, ocfs2, parisc,
vfs. With mm subsystems: slab-generic, slub, debug, pagecache, gup,
swap, memcg, pagemap, memory-failure, vmalloc, kasan"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (128 commits)
kasan: move kasan_report() into report.c
mm/mm_init.c: report kasan-tag information stored in page->flags
ubsan: entirely disable alignment checks under UBSAN_TRAP
kasan: fix clang compilation warning due to stack protector
x86/mm: remove vmalloc faulting
mm: remove vmalloc_sync_(un)mappings()
x86/mm/32: implement arch_sync_kernel_mappings()
x86/mm/64: implement arch_sync_kernel_mappings()
mm/ioremap: track which page-table levels were modified
mm/vmalloc: track which page-table levels were modified
mm: add functions to track page directory modifications
s390: use __vmalloc_node in stack_alloc
powerpc: use __vmalloc_node in alloc_vm_stack
arm64: use __vmalloc_node in arch_alloc_vmap_stack
mm: remove vmalloc_user_node_flags
mm: switch the test_vmalloc module to use __vmalloc_node
mm: remove __vmalloc_node_flags_caller
mm: remove both instances of __vmalloc_node_flags
mm: remove the prot argument to __vmalloc_node
mm: remove the pgprot argument to __vmalloc
...
This is always PAGE_KERNEL - for long term mappings with other properties
vmap should be used.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: Gao Xiang <xiang@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Kelley <mikelley@microsoft.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mackerras <paulus@ozlabs.org>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Link: http://lkml.kernel.org/r/20200414131348.444715-19-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Doing a "get_user_pages()" on a copy-on-write page for reading can be
ambiguous: the page can be COW'ed at any time afterwards, and the
direction of a COW event isn't defined.
Yes, whoever writes to it will generally do the COW, but if the thread
that did the get_user_pages() unmapped the page before the write (and
that could happen due to memory pressure in addition to any outright
action), the writer could also just take over the old page instead.
End result: the get_user_pages() call might result in a page pointer
that is no longer associated with the original VM, and is associated
with - and controlled by - another VM having taken it over instead.
So when doing a get_user_pages() on a COW mapping, the only really safe
thing to do would be to break the COW when getting the page, even when
only getting it for reading.
At the same time, some users simply don't even care.
For example, the perf code wants to look up the page not because it
cares about the page, but because the code simply wants to look up the
physical address of the access for informational purposes, and doesn't
really care about races when a page might be unmapped and remapped
elsewhere.
This adds logic to force a COW event by setting FOLL_WRITE on any
copy-on-write mapping when FOLL_GET (or FOLL_PIN) is used to get a page
pointer as a result.
The current semantics end up being:
- __get_user_pages_fast(): no change. If you don't ask for a write,
you won't break COW. You'd better know what you're doing.
- get_user_pages_fast(): the fast-case "look it up in the page tables
without anything getting mmap_sem" now refuses to follow a read-only
page, since it might need COW breaking. Which happens in the slow
path - the fast path doesn't know if the memory might be COW or not.
- get_user_pages() (including the slow-path fallback for gup_fast()):
for a COW mapping, turn on FOLL_WRITE for FOLL_GET/FOLL_PIN, with
very similar semantics to FOLL_FORCE.
If it turns out that we want finer granularity (ie "only break COW when
it might actually matter" - things like the zero page are special and
don't need to be broken) we might need to push these semantics deeper
into the lookup fault path. So if people care enough, it's possible
that we might end up adding a new internal FOLL_BREAK_COW flag to go
with the internal FOLL_COW flag we already have for tracking "I had a
COW".
Alternatively, if it turns out that different callers might want to
explicitly control the forced COW break behavior, we might even want to
make such a flag visible to the users of get_user_pages() instead of
using the above default semantics.
But for now, this is mostly commentary on the issue (this commit message
being a lot bigger than the patch, and that patch in turn is almost all
comments), with that minimal "enable COW breaking early" logic using the
existing FOLL_WRITE behavior.
[ It might be worth noting that we've always had this ambiguity, and it
could arguably be seen as a user-space issue.
You only get private COW mappings that could break either way in
situations where user space is doing cooperative things (ie fork()
before an execve() etc), but it _is_ surprising and very subtle, and
fork() is supposed to give you independent address spaces.
So let's treat this as a kernel issue and make the semantics of
get_user_pages() easier to understand. Note that obviously a true
shared mapping will still get a page that can change under us, so this
does _not_ mean that get_user_pages() somehow returns any "stable"
page ]
Reported-by: Jann Horn <jannh@google.com>
Tested-by: Christoph Hellwig <hch@lst.de>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Kirill Shutemov <kirill@shutemov.name>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If an error is encountered during the DSI initialization setup, the
drm connector object also needs to be cleaned up along with the encoder.
The error can happen due to a missing mode in the VBT or for other
reasons.
v2: Rephrase the commit message to make it more clear.
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Vandita Kulkarni <vandita.kulkarni@intel.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200522202630.7604-1-vivek.kasireddy@intel.com
fake_lmem_start does not need to be mutable via module param sysfs. It's
only used during driver probe.
Fixes: 1629224324 ("drm/i915/lmem: add the fake lmem region")
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200601215510.18379-2-jani.nikula@intel.com
(cherry picked from commit f322e851f2)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Allow batch buffers to read their own _local_ cumulative HW runtime of
their logical context.
Fixes: 0f2f397583 ("drm/i915: Add gen9 BCS cmdparsing")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200601161942.30854-1-chris@chris-wilson.co.uk
(cherry picked from commit f9496520df)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
While the current locking/serialization of the global state
suffices for protecting the obj->state access and the actual
hardware reprogramming, we do have a problem with accessing
the old/new states during nonblocking commits.
The state computation and swap will be protected by the crtc
locks, but the commit_tails can finish out of order, thus also
causing the atomic states to be cleaned up out of order. This
would mean the commit that started first but finished last has
had its new state freed as the no-longer-needed old state by the
other commit.
To fix this let's just refcount the states. obj->state amounts
to one reference, and the intel_atomic_state holds extra references
to both its new and old global obj states.
Fixes: 0ef1905ecf ("drm/i915: Introduce better global state handling")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200527200245.13184-1-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
(cherry picked from commit f8c86ffa28)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Only support runtime changes through the debugfs.
i915.verbose_state_checks remains an exception, and is not exposed via
debugfs.
This depends on IGT having been updated to use the debugfs for modifying
the parameters.
Cc: Juha-Pekka Heikkilä <juha-pekka.heikkila@intel.com>
Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200601215510.18379-3-jani.nikula@intel.com
fake_lmem_start does not need to be mutable via module param sysfs. It's
only used during driver probe.
Fixes: 1629224324 ("drm/i915/lmem: add the fake lmem region")
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200601215510.18379-2-jani.nikula@intel.com
The parameter only makes sense as a module parameter only.
Fixes: c43c5a8818 ("drm/i915/params: add i915 parameters to debugfs")
Cc: Juha-Pekka Heikkilä <juha-pekka.heikkila@intel.com>
Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200601215510.18379-1-jani.nikula@intel.com
Pull the routines for writing CS packets out of intel_ring_submission
into their own files. These are low level operations for building CS
instructions, rather than the logic for filling the global ring buffer
with requests, and we will want to reuse them outside of this context.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200601072446.19548-2-chris@chris-wilson.co.uk
Pull uaccess/readdir updates from Al Viro:
"Finishing the conversion of readdir.c to unsafe_... API.
This includes the uaccess_{read,write}_begin series by Christophe
Leroy"
* 'uaccess.readdir' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
readdir.c: get rid of the last __put_user(), drop now-useless access_ok()
readdir.c: get compat_filldir() more or less in sync with filldir()
switch readdir(2) to unsafe_copy_dirent_name()
drm/i915/gem: Replace user_access_begin by user_write_access_begin
uaccess: Selectively open read or write user access
uaccess: Add user_read_access_begin/end and user_write_access_begin/end
Allow batch buffers to read their own _local_ cumulative HW runtime of
their logical context.
Fixes: 0f2f397583 ("drm/i915: Add gen9 BCS cmdparsing")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200601161942.30854-1-chris@chris-wilson.co.uk
Ever noticed that our interrupt handlers are where we spend most of our
time on a busy system? In part this is unavoidable as each interrupt
requires to poll and reset several registers, but we can try and do so as
efficiently as possible.
Function old new delta
ilk_irq_handler 2317 2156 -161
v2: Restore the irqreturn_t ret
Function old new delta
ilk_irq_handler.cold 63 72 +9
ilk_irq_handler 2221 2080 -141
A slight improvement in the baseline overnight as well!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200601140355.20243-1-chris@chris-wilson.co.uk
While the current locking/serialization of the global state
suffices for protecting the obj->state access and the actual
hardware reprogramming, we do have a problem with accessing
the old/new states during nonblocking commits.
The state computation and swap will be protected by the crtc
locks, but the commit_tails can finish out of order, thus also
causing the atomic states to be cleaned up out of order. This
would mean the commit that started first but finished last has
had its new state freed as the no-longer-needed old state by the
other commit.
To fix this let's just refcount the states. obj->state amounts
to one reference, and the intel_atomic_state holds extra references
to both its new and old global obj states.
Fixes: 0ef1905ecf ("drm/i915: Introduce better global state handling")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200527200245.13184-1-ville.syrjala@linux.intel.com
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Our forcewake utilisation is split into categories: automatic and
manual. Around bare register reads, we look up the right forcewake
domain and automatically acquire and release [upon a timer] the
forcewake domain. For other access, where we know we require the
forcewake across a group of register reads, we manually acquire the
forcewake domain and release it at the end. Again, this currently arms
the domain timer for a later release.
However, looking at some energy utilisation profiles, we have tried to
avoid using forcewake [and rely on the natural wake up to post register
updates] due to that even keep the fw active for a brief period
contributes to a significant power draw [i.e. when the gpu is sleeping
with rc6 at high clocks]. But as it turns out, not posting the writes
immediately also has unintended consequences, such as not reducing the
clocks and so conserving power while busy.
As a compromise, let us only arm the domain timer for automatic
forcewake usage around bare register access, but immediately release the
forcewake when manually acquired by intel_uncore_forcewake_get/_put.
The corollary to this is that we may instead have to take forcewake more
often, and so incur a latency penalty in doing so. For Sandybridge this
was significant, and even on the latest machines, taking forcewake at
interrupt frequency is a huge impact. [So we don't do that anymore!
Hopefully, this will spare us from still needing the mitigation of the
timer for steady state execution.]
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200601072446.19548-13-chris@chris-wilson.co.uk
With the advent of preempt-to-busy, a request may still be on the GPU as
we unwind. And in the case of a unpreemptible [due to HW] request, that
request will remain indefinitely on the GPU even though we have
returned it back to our submission queue, and cleared the active bit.
We only run the execution callbacks on transferring the request from our
submission queue to the execution queue, but if this is a bonded request
that the HW is waiting for, we will not submit it (as we wait for a
fresh execution) even though it is still being executed.
As we know that there are always preemption points between requests, we
know that only the currently executing request may be still active even
though we have cleared the flag. However, we do not precisely know which
request is in ELSP[0] due to a delay in processing events, and
furthermore we only store the last request in a context in our state
tracker.
Fixes: 22b7a426bb ("drm/i915/execlists: Preempt-to-busy")
Testcase: igt/gem_exec_balancer/bonded-dual
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200529143926.3245-1-chris@chris-wilson.co.uk
(cherry picked from commit b55230e5e8)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
When we push a virtual request onto the HW, we update the rq->engine to
point to the physical engine. A request that is then submitted by the
user that waits upon the virtual engine, but along the physical engine
in use, will then see that it is due to be submitted to the same engine
and take a shortcut (and be queued without waiting for the completion
fence). However, the virtual request may be preempted (either by higher
priority users, or by timeslicing) and removed from the physical engine
to be migrated over to one of its siblings. The dependent normal request
however is oblivious to the removal of the virtual request and remains
queued to execute on HW, believing that once it reaches the head of its
queue all of its predecessors will have completed executing!
v2: Beware restriction of signal->execution_mask prior to submission.
Fixes: 6d06779e86 ("drm/i915: Load balancing across a virtual engine")
Testcase: igt/gem_exec_balancer/sliced
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.3+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200526090753.11329-2-chris@chris-wilson.co.uk
(cherry picked from commit 511b6d9aed)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reorder the code so that we can reuse the await_execution from a special
case in await_request in the next patch.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200526090753.11329-1-chris@chris-wilson.co.uk
(cherry picked from commit ffb0c600c2)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Currently the plane property doesn't have support for YCBCR_BT2020,
which enables the corresponding color conversion mode on plane CSC.
Enabling the plane property for the planes for GLK & ICL+ platforms.
Also as per spec, update the Plane Color CSC from YUV601_TO_RGB709
to YUV601_TO_RGB601.
V2: Enabling support for YCBCT_BT2020 for HDR planes on
platforms GLK & ICL
V3: Refined the condition check to handle GLK & ICL+ HDR planes
Also added BT2020 handling in glk_plane_color_ctl.
V4: Combine If-else into single If
V5: Drop the checking for HDR planes and enable YCBCR_BT2020
for platforms GLK & ICL+.
V6: As per Spec, update PLANE_COLOR_CSC_MODE_YUV601_TO_RGB709
to PLANE_COLOR_CSC_MODE_YUV601_TO_RGB601 as per Ville's
feedback.
V7: Rebased
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Kishore Kadiyala <kishore.kadiyala@intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200601073544.11291-1-kishore.kadiyala@intel.com
If we declare that an object type is shrinkable (any that we can reclaim
to recover system pages), make sure we taint the object mutex so that
lockdep expects us to use it within fs_reclaim. lockdep will then
complain the first time we try to allocate while holding the plain
mutex, as doing so invites potential recursion.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200529183204.16850-1-chris@chris-wilson.co.uk
With the advent of preempt-to-busy, a request may still be on the GPU as
we unwind. And in the case of a unpreemptible [due to HW] request, that
request will remain indefinitely on the GPU even though we have
returned it back to our submission queue, and cleared the active bit.
We only run the execution callbacks on transferring the request from our
submission queue to the execution queue, but if this is a bonded request
that the HW is waiting for, we will not submit it (as we wait for a
fresh execution) even though it is still being executed.
As we know that there are always preemption points between requests, we
know that only the currently executing request may be still active even
though we have cleared the flag. However, we do not precisely know which
request is in ELSP[0] due to a delay in processing events, and
furthermore we only store the last request in a context in our state
tracker.
Fixes: 22b7a426bb ("drm/i915/execlists: Preempt-to-busy")
Testcase: igt/gem_exec_balancer/bonded-dual
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200529143926.3245-1-chris@chris-wilson.co.uk
There's no reason for I915_MODE_FLAG_INHERITED to exist as a flag
anymore. Just make it a boolean.
v2: Deal with sanitize_watermarks()
CC: Sam Ravnborg <sam@ravnborg.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429103936.11850-1-ville.syrjala@linux.intel.com
Replace the use of mode->private_flags with a truly private bitmaks
in our own crtc state. We also need a copy in the crtc itself so the
vblank code can get at it. We already have scanline_offset in there
for a similar reason, as well as the vblank->hwmode which is assigned
via drm_calc_timestamping_constants(). Fortunately we now have a
nice place for doing the crtc_state->crtc copy in
intel_crtc_update_active_timings() which gets called both for
modesets and init/resume readout.
The one slightly iffy spot is the INHERITED flag which we want to
preserve until userspace/fb_helper does the first proper commit after
actually calling .detecti() on the connectors. Otherwise we don't have
the full sink capabilities (audio,infoframes,etc.) when .compute_config()
gets called and thus we will fail to enable those features when the
first userspace commit happens. The only internal commit we do prior to
that should be from intel_initial_commit() and there we can simply
preserve the INHERITED flag from the readout.
v2: Deal with INHERITED in sanitize_watermarks() as well
CC: Sam Ravnborg <sam@ravnborg.org>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429103904.11727-1-ville.syrjala@linux.intel.com
We may choose to only submit ELSP[0], even though we have sufficient
requests to fill the whole ELSP. Normally, we only start timeslicing if
we fill more than one port, but in this case we need to start
timeslicing for the queue that we choose not to submit.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200528205727.20309-1-chris@chris-wilson.co.uk
If the ring submission is stalled on an external request, nothing can be
submitted, not even the heartbeat in the kernel context. Since nothing
is running, resetting the engine/device does not unblock the system and
is pointless. We can see if the heartbeat is supposed to be running
before declaring foul.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200528074109.28235-2-chris@chris-wilson.co.uk
Across suspend/resume, we clear the entire GGTT and rebuild from
scratch. In particular, we want to only preserve the global entries for
use by the HW, and delay reinstating the local binds until required by
the user. This means that we can evict any local binds in the global GTT,
saving any time in preserving their state, as they will be rebound on
demand.
References: https://gitlab.freedesktop.org/drm/intel/-/issues/1947
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200528082427.21402-2-chris@chris-wilson.co.uk
We should be able to skip restoring LOCAL (user) binds within the GGTT on
resume and let them be restored upon demand. However, our consistency
checks demand that the bind flags match the node state, and we cannot
simply clear the flags, we need to evict as well. For now, make sure we
restore the bind flags exactly upon resume.
Fixes: 0109a16ef3 ("drm/i915/gt: Clear LOCAL_BIND from shared GGTT on resume")
Fixes: bf0840cdb3 ("drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200528150452.7880-1-chris@chris-wilson.co.uk
We have a I915_REQUEST_NOPREEMPT flag that we set when we must prevent
the HW from preempting during the course of this request. We need to
honour this flag and protect the HW even if we have a heartbeat request,
or other maximum priority barrier, pending. As such, restrict the
timeslicing check to avoid preempting into the topmost priority band,
leaving the unpreemptable requests in blissful peace running
uninterrupted on the HW.
v2: Set the I915_PRIORITY_BARRIER to be less than
I915_PRIORITY_UNPREEMPTABLE so that we never submit a request
(heartbeat or barrier) that can legitimately preempt the current
non-premptable request.
Fixes: 2a98f4e65b ("drm/i915: add infrastructure to hold off preemption on a request")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200527162418.24755-1-chris@chris-wilson.co.uk
gcc-9 gets confused by the code flow in check_dirty_whitelist:
drivers/gpu/drm/i915/gt/selftest_workarounds.c: In function 'check_dirty_whitelist':
drivers/gpu/drm/i915/gt/selftest_workarounds.c:492:17: error: 'rsvd' may be used uninitialized in this function [-Werror=maybe-uninitialized]
I could not figure out a good way to do this in a way that gcc
understands better, so initialize the variable to zero, as last
resort.
Fixes: aee20aaed8 ("drm/i915: Implement read-only support in whitelist selftest")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200527140526.1458215-2-arnd@arndb.de
Conditional spinlocks make it hard for gcc and for lockdep to
follow the code flow. This one causes a warning with at least
gcc-9 and higher:
In file included from include/linux/irq.h:14,
from drivers/gpu/drm/i915/i915_pmu.c:7:
drivers/gpu/drm/i915/i915_pmu.c: In function 'i915_sample':
include/linux/spinlock.h:289:3: error: 'flags' may be used uninitialized in this function [-Werror=maybe-uninitialized]
289 | _raw_spin_unlock_irqrestore(lock, flags); \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/i915/i915_pmu.c:288:17: note: 'flags' was declared here
288 | unsigned long flags;
| ^~~~~
Split out the part between the locks into a separate function
for readability and to let the compiler figure out what the
logic actually is.
Fixes: d79e1bd676 ("drm/i915/pmu: Only use exclusive mmio access for gen7")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200527140526.1458215-1-arnd@arndb.de
We only restore GLOBAL binds upon resume as we expect these to be pinned
for use by HW, whereas the LOCAL binds can be recreated on demand once
userspace is resumed. For the LOCAL bind to be recreated in the global
GTT (for old systems without ppgtt), we need to clear its presence flag
on deciding not to restore the mapping upon resume.
Fixes: bf0840cdb3 ("drm/i915/gt: Stop cross-polluting PIN_GLOBAL with PIN_USER with no-ppgtt")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200526150739.26147-1-chris@chris-wilson.co.uk
When we push a virtual request onto the HW, we update the rq->engine to
point to the physical engine. A request that is then submitted by the
user that waits upon the virtual engine, but along the physical engine
in use, will then see that it is due to be submitted to the same engine
and take a shortcut (and be queued without waiting for the completion
fence). However, the virtual request may be preempted (either by higher
priority users, or by timeslicing) and removed from the physical engine
to be migrated over to one of its siblings. The dependent normal request
however is oblivious to the removal of the virtual request and remains
queued to execute on HW, believing that once it reaches the head of its
queue all of its predecessors will have completed executing!
v2: Beware restriction of signal->execution_mask prior to submission.
Fixes: 6d06779e86 ("drm/i915: Load balancing across a virtual engine")
Testcase: igt/gem_exec_balancer/sliced
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.3+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200526090753.11329-2-chris@chris-wilson.co.uk
The drrs code dereferences mode->vrefresh via some really long chain
of structures/pointers. Couldn't get coccinelle to see through all
that so let's add some local variables to help it.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200428171940.19552-3-ville.syrjala@linux.intel.com
Reduce the irq_work llist for attaching the callbacks to the signal for
both smaller structs (two fewer pointers!) and simpler [debug] code:
Function old new delta
irq_execute_cb 35 34 -1
__igt_breadcrumbs_smoketest 1684 1682 -2
i915_request_retire 2003 1996 -7
__i915_request_create 1047 1040 -7
__notify_execute_cb 135 126 -9
__i915_request_ctor 188 178 -10
__await_execution.part.constprop 451 440 -11
igt_wait_request 924 714 -210
One minor artifact is that the order of cb exection is reversed. No
current use cases are affected by that change.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200526112051.10229-1-chris@chris-wilson.co.uk
If there are no internal levels and the user priority-shift is zero, we
can help the compiler eliminate some dead code:
Function old new delta
start_timeslice 169 154 -15
__execlists_submission_tasklet 4696 4659 -37
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200525075347.582-4-chris@chris-wilson.co.uk
Before we return control to the system, and letting it reuse all the
pages being accessed by HW, we must disable the HW. At the moment, we
dare not reset the GPU if it will clobber the display, but once we know
the display has been disabled, we can proceed with the reset as we
shutdown the module. We know the next user must reinitialise the HW for
their purpose.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/489
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@kernel.org
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200525151459.12083-1-chris@chris-wilson.co.uk
drivers/gpu/drm/i915/display/intel_dsb.c:177 intel_dsb_reg_write() warn: variable dereferenced before check 'dsb' (see line 175)
Fixes: afeda4f3b1 ("drm/i915/dsb: Pre allocate and late cleanup of cmd buffer")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Animesh Manna <animesh.manna@intel.com>
Cc: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Animesh Manna <animesh.manna@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200524233900.25598-1-chris@chris-wilson.co.uk
Since the worker may rearm, we currently are only guaranteed to flush
the work if we cancel the timer. If the work was running at the time we
try and cancel it, we will wait for it to complete, but it may leave
items in the pool and requeue the work. If we rearrange the immediate
discard of the pool then cancel the work, we know that the work cannot
rearm and so our flush will be final.
<0> [314.146044] i915_mod-1321 2.... 299799443us : intel_gt_fini_buffer_pool: intel_gt_fini_buffer_pool:227 GEM_BUG_ON(!list_empty(&pool->cache_list[n]))
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1920
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200525141957.3061-1-chris@chris-wilson.co.uk
In order to be valid to dereference during the i915_fence_release, after
retiring the fence and releasing its refererences, we assume that
rq->engine can only be a real engine (that stay intact until the device
is shutdown after all fences have been flushed). However, due to a quirk
of preempt-to-busy, we may retire a request that still belongs to a
virtual engine and so eventually free it with rq->engine being invalid.
To avoid dereferencing that invalid engine, we look at the
execution_mask which if it indicates it may be executed on more than one
engine, we know it originated on a virtual engine and may still be on
one.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1906
Fixes: 43acd6516c ("drm/i915: Keep a per-engine request pool")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200521140617.30015-2-chris@chris-wilson.co.uk
(cherry picked from commit 32a4605b38)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Since the removal of the no-semaphore boosting, we rely on timeslicing to
reorder passed inter-dependency hogs across the engines. However, we
require preemption to support timeslicing into user payloads, and not all
machine support preemption so we do not universally enable timeslicing,
even when it would correctly preempt our own inter-engine semaphores.
Since timeslicing and semaphore priority deboosting is now disabled on
Broadwell/Braswell, we have to follow suite and not use semaphores.
Testcase: igt/gem_exec_schedule/semaphore-codependency # bdw/bsw
Fixes: 18e4af04d2 ("drm/i915: Drop no-semaphore boosting")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200521140617.30015-1-chris@chris-wilson.co.uk
(cherry picked from commit 0eb670aac2)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
This assertion was removed in commit b412c63f1c ("drm/i915/gt: Report
context-is-closed prior to pinning"), but accidentally restored by a
cherry-pick into drm-next and now has percolated back to
drm-intel-next-queued.
Fixes: 2e46a2a0b0 ("drm/i915: Use explicit flag to mark unreachable intel_context")
Fixes: 2b703bbda2 ("Merge drm/drm-next into drm-intel-next-queued")
References: b412c63f1c ("drm/i915/gt: Report context-is-closed prior to pinning")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200520073048.2394034-1-chris@chris-wilson.co.uk
(cherry picked from commit f2c1061a36)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
We recorded the execlists->queue_priority_hint update for the inflight
request without kicking the tasklet. The next submitted request then
failed to be scheduled as it had a lower priority than the hint, leaving
the HW running with only the inflight request.
Fixes: 6cebcf746f ("drm/i915: Tweak scheduler's kick_submission()")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200519063123.20673-1-chris@chris-wilson.co.uk
(cherry picked from commit b86fc6e5e8)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Pre-allocate command buffer in atomic_commit using intel_dsb_prepare
function which also includes pinning and map in cpu domain.
No functional change is dsb write/commit functions.
Now dsb get/put function is removed and ref-count mechanism is
not needed. Below dsb api added to do respective job mentioned
below.
intel_dsb_prepare - Allocate, pin and map the buffer.
intel_dsb_cleanup - Unpin and release the gem object.
RFC: Initial patch for design review.
v2: included _init() part in _prepare(). [Daniel, Ville]
v3: dsb_cleanup called after cleanup_planes. [Daniel]
v4: dsb structure is moved to intel_crtc_state from intel_crtc. [Maarten]
v5: dsb get/put/ref-count mechanism removed. [Maarten]
v6: Based on review feedback following changes are added,
- replaced intel_dsb structure by pointer in intel_crtc_state. [Maarten]
- passing intel_crtc_state to dsp-api to simplify the code. [Maarten]
- few dsb functions prototype modified to simplify code.
v7: added few cosmetic changes suggested by Jani and null check for
crtc_state in dsb_cleanup removed as suggested by Maarten.
v8: changed the function parameter to intel_crtc_state* of
ivb_load_lut_ext_max() from intel_crtc. [Maarten]
v9: error handling improved in _write() and prepare(). [Maarten]
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@intel.com>
Signed-off-by: Animesh Manna <animesh.manna@intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200520130737.11240-1-animesh.manna@intel.com
Removed duplicate include and fixed comment > 80 chars.
v2: Added newline after system include and between functions
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200522131843.20477-1-stanislav.lisovskiy@intel.com
This is a permanent w/a for JSL/EHL.This is to be applied to the
PCH types on JSL/EHL ie JSP/MCC
Bspec: 52888
v2: Fixed the wrong usage of logical OR(ville)
v3: Removed extra braces, changed the check(jose)
Signed-off-by: Swathi Dhanavanthri <swathi.dhanavanthri@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200521064448.29522-1-swathi.dhanavanthri@intel.com
According to BSpec max BW per slice is calculated using formula
Max BW = CDCLK * 64. Currently when calculating min CDCLK we
account only per plane requirements, however in order to avoid
FIFO underruns we need to estimate accumulated BW consumed by
all planes(ddb entries basically) residing on that particular
DBuf slice. This will allow us to put CDCLK lower and save power
when we don't need that much bandwidth or gain additional
performance once plane consumption grows.
v2: - Fix long line warning
- Limited new DBuf bw checks to only gens >= 11
v3: - Lets track used Dbuf bw per slice and per crtc in bw state
(or may be in DBuf state in future), that way we don't need
to have all crtcs in state and those only if we detect if
are actually going to change cdclk, just same way as we
do with other stuff, i.e intel_atomic_serialize_global_state
and co. Just as per Ville's paradigm.
- Made dbuf bw calculation procedure look nicer by introducing
for_each_dbuf_slice_in_mask - we often will now need to iterate
slices using mask.
- According to experimental results CDCLK * 64 accounts for
overall bandwidth across all dbufs, not per dbuf.
v4: - Fixed missing const(Ville)
- Removed spurious whitespaces(Ville)
- Fixed local variable init(reduced scope where not needed)
- Added some comments about data rate for planar formats
- Changed struct intel_crtc_bw to intel_dbuf_bw
- Moved dbuf bw calculation to intel_compute_min_cdclk(Ville)
v5: - Removed unneeded macro
v6: - Prevent too frequent CDCLK switching back and forth:
Always switch to higher CDCLK when needed to prevent bandwidth
issues, however don't switch to lower CDCLK earlier than once
in 30 minutes in order to prevent constant modeset blinking.
We could of course not switch back at all, however this is
bad from power consumption point of view.
v7: - Fixed to track cdclk using bw_state, modeset will be now
triggered only when CDCLK change is really needed.
v8: - Lock global state if bw_state->min_cdclk is changed.
- Try getting bw_state only if there are crtcs in the commit
(need to have read-locked global state)
v9: - Do not do Dbuf bw check for gens < 9 - triggers WARN
as ddb_size is 0.
v10: - Lock global state for older gens as well.
v11: - Define new bw_calc_min_cdclk hook, instead of using
a condition(Manasi Navare)
v12: - Fixed rebase conflict
v13: - Added spaces after declarations to make checkpatch happy.
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200520150058.16123-1-stanislav.lisovskiy@intel.com
We quite often need now to iterate only particular dbuf slices
in mask, whether they are active or related to particular crtc.
v2: - Minor code refactoring
v3: - Use enum for max slices instead of macro
Let's make our life a bit easier and use a macro for that.
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200519131117.17190-6-stanislav.lisovskiy@intel.com
Checking with hweight8 if plane configuration had
changed seems to be wrong as different plane configs
can result in a same hamming weight.
So lets check the bitmask itself.
v2: Fixed "from" field which got corrupted for some weird reason
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200520145827.15887-1-stanislav.lisovskiy@intel.com
In Gen11+ whenever we might exceed DBuf bandwidth we might need to
recalculate CDCLK which DBuf bandwidth is scaled with.
Total Dbuf bw used might change based on particular plane needs.
Thus to calculate if cdclk needs to be changed it is not enough
anymore to check plane configuration and plane min cdclk, per DBuf
bw can be calculated only after wm/ddb calculation is done and
all required planes are added into the state. In order to keep
all min_cdclk related checks in one place let's extract it into
separate function, checking and modifying any_ms.
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200519131117.17190-3-stanislav.lisovskiy@intel.com
We need to calculate cdclk after watermarks/ddb has been calculated
as with recent hw CDCLK needs to be adjusted accordingly to DBuf
requirements, which is not possible with current code organization.
Setting CDCLK according to DBuf BW requirements and not just rejecting
if it doesn't satisfy BW requirements, will allow us to save power when
it is possible and gain additional bandwidth when it's needed - i.e
boosting both our power management and perfomance capabilities.
This patch is preparation for that, first we now extract modeset
calculation from modeset checks, in order to call it after wm/ddb
has been calculated.
v2: - Extract only intel_modeset_calc_cdclk from intel_modeset_checks
(Ville Syrjälä)
v3: - Clear plls after intel_modeset_calc_cdclk
v4: - Added r-b from previous revision to commit message
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200519131117.17190-2-stanislav.lisovskiy@intel.com
In order to be valid to dereference during the i915_fence_release, after
retiring the fence and releasing its refererences, we assume that
rq->engine can only be a real engine (that stay intact until the device
is shutdown after all fences have been flushed). However, due to a quirk
of preempt-to-busy, we may retire a request that still belongs to a
virtual engine and so eventually free it with rq->engine being invalid.
To avoid dereferencing that invalid engine, we look at the
execution_mask which if it indicates it may be executed on more than one
engine, we know it originated on a virtual engine and may still be on
one.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1906
Fixes: 43acd6516c ("drm/i915: Keep a per-engine request pool")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200521140617.30015-2-chris@chris-wilson.co.uk
Since the removal of the no-semaphore boosting, we rely on timeslicing to
reorder passed inter-dependency hogs across the engines. However, we
require preemption to support timeslicing into user payloads, and not all
machine support preemption so we do not universally enable timeslicing,
even when it would correctly preempt our own inter-engine semaphores.
Since timeslicing and semaphore priority deboosting is now disabled on
Broadwell/Braswell, we have to follow suite and not use semaphores.
Testcase: igt/gem_exec_schedule/semaphore-codependency # bdw/bsw
Fixes: 18e4af04d2 ("drm/i915: Drop no-semaphore boosting")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200521140617.30015-1-chris@chris-wilson.co.uk
Since the number of platforms with this restriction are growing, let's
separate out the platform logic into a has_phy_misc() function.
Bspec: 50107
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200504225227.464666-11-matthew.d.roper@intel.com
RKL power wells are similar to TGL power wells, but have some important
differences:
* PG1 now has pipe A's VDSC (rather than sticking it in PG2)
* PG2 no longer exists
* DDI-C (aka TC-1) moves from PG1 -> PG3
* PG5 no longer exists due to the lack of a fourth pipe
Also note that what we refer to as 'DDI-C' and 'DDI-D' need to actually
be programmed as TC-1 and TC-2 even though this platform doesn't have TC
outputs.
Bspec: 49234
Cc: Imre Deak <imre.deak@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200504225227.464666-9-matthew.d.roper@intel.com
RKL only has five universal planes, plus a cursor. Since the
bottom-most universal plane is considered the primary plane, set the
number of sprites available on this platform to 4.
In general, the plane capabilities of the remaining planes stay the same
as TGL. However the NV12 Y-plane support moves down to the new top two
planes and now only the bottom three planes can be used for NV12 UV.
Bspec: 49181
Bspec: 49251
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200504225227.464666-8-matthew.d.roper@intel.com
The RKL platform has different memory characteristics from past
platforms. Update the values used by our memory bandwidth calculations
accordingly.
Bspec: 53998
Cc: James Ausmus <james.ausmus@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200504225227.464666-7-matthew.d.roper@intel.com
This assertion was removed in commit b412c63f1c ("drm/i915/gt: Report
context-is-closed prior to pinning"), but accidentally restored by a
cherry-pick into drm-next and now has percolated back to
drm-intel-next-queued.
Fixes: 2e46a2a0b0 ("drm/i915: Use explicit flag to mark unreachable intel_context")
Fixes: 2b703bbda2 ("Merge drm/drm-next into drm-intel-next-queued")
References: b412c63f1c ("drm/i915/gt: Report context-is-closed prior to pinning")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200520073048.2394034-1-chris@chris-wilson.co.uk
UAPI Changes:
- drm/i915: Show per-engine default property values in sysfs
By providing the default values configured into the kernel via sysfs, it
is much more convenient for userspace to restore those sane defaults, or
at least know what are considered good baseline. This is useful, for
example, to cleanup after any failed userspace prior to commencing new
jobs.
Cross-subsystem Changes:
- video/hdmi: Add Unpack only function for DRM infoframe
- Includes pull request gvt-next-2020-05-12
Driver Changes:
- Restore Cherryview back to full-ppgtt (Chris, Mika)
- Document locking guidelines for i915 (Chris, Daniel, Joonas)
- Fix GitLab #1746: Handle idling during i915_gem_evict_something busy loops (Chris)
- Display WA #1105: Require linear fb stride to be multiple of 512 bytes on
gen9/glk (Ville)
- Add Wa_14010685332 for ICP/ICL (Matt R)
- Restrict w/a 1607087056 for EHL/JSL (Swathi)
- Fix interrupt handling for DP AUX transactions on Tigerlake (Imre)
- Revert "drm/i915/tgl: Include ro parts of l3 to invalidate" (Mika)
- Fix HDC pipeline flush hardware bit on Gen12 (Mika)
- Flush L3 when flushing render on Gen12 (Mika)
- Invalidate aux table entries forcibly between BB on Gen12 (Mika)
- Add aux table invalidate for all engines on Gen12 (Mika)
- Force pte cacheline to main memory Gen8+ (Mika)
- Add and enable TGL+ SAGV support (Stanislav)
- Implement vm_ops->access on i915 mmaps for GDB (Chris, Kristian)
- Replace zero-length array with flexible-array (Gustavo)
- Improve batch buffer pool effectiveness to mitigate soft-rc6 hit (Chris)
- Remove wait priority boosting (Chris)
- Keep driver module referenced when PMU is active (Chris)
- Sanitize RPS interrupts upon resume (Chris)
- Extend pcode read timeout to 20 ms (Chris)
- Wait for ACT sent before enabling MST pipe (Ville)
- Extend support to async relocations to SNB (Chris)
- Remove CNL pre-prod workarounds (Ville)
- Don't enable WaIncreaseLatencyIPCEnabled when IPC is disabled (Sultan)
- Record the active CCID from before reset (Chris)
- Mark concurrent submissions with a weak-dependency (Chris)
- Peel dma-fence-chains for await to allow engine-to-engine sync (Lionel)
- Prevent using semaphores to chain up to external fences (Chris)
- Fix GLK watermark calculations (Ville)
- Emit await(batch) before MI_BB_START (Chris)
- Reset execlists registers before HWSP (Chris)
- Drop no-semaphore boosting in favor of fast timeslicing (Chris)
- Fix enabled infoframe states of lspcon (Gwan-gyeong)
- Program DP SDPs on pipe updates (Gwan-gyeong)
- Stop sending DP SDPs on ddi disable (Gwan-gyeong)
- Store CS timestamp frequency in Hz (Ville)
- Remove unused HAS_FWTABLE macro (Pascal)
- Use batchbuffer chaining for relocations to save ring space (Chris)
- Try different engines for relocs if MI ops not supported (Chris, Tvrtko)
- Lazily acquire the device wakeref for freeing objects (Chris)
- Streamline display code arithmetics around rounding etc. (Ville)
- Use bw state for per crtc SAGV evaluation (Stanislav)
- Track active_pipes in bw_state (Stanislav)
- Nuke mode.vrefresh usage (Ville)
- Warn if the FBC is still writing to stolen on removal (Chris)
- Added new PCode commands prepping for QGV rescricting (Stansilav)
- Stop holding onto the pinned_default_state (Chris)
- Propagate error from completed fences (Chris)
- Ignore submit-fences on the same timeline (Chris)
- Pull waiting on an external dma-fence into its routine (Chris)
- Replace the hardcoded I915_FENCE_TIMEOUT with Kconfig (Chris)
- Mark up the racy read of execlists->context_tag (Chris)
- Tidy up the return handling for completed dma-fences (Chris)
- Introduce skl_plane_wm_level accessor (Stanislav)
- Extract SKL SAGV checking (Stanislav)
- Make active_pipes check skl specific (Stanislav)
- Suspend tasklets before resume sanitization (Chris)
- Remove redundant exec_fence (Chris)
- Mark the addition of the initial-breadcrumb in the request (Chris)
- Transfer old virtual breadcrumbs to irq_worker (Chris)
- Read the DP SDPs from the video DIP (Gwan-gyeong)
- Program DP SDPs with computed configs (Gwan-gyeong)
- Add state readout for DP VSC and DP HDR Metadata Infoframe SDP
(Gwan-gyeong)
- Add compute routine for DP PSR VSC SDP (Gwan-gyeong)
- Use new DP VSC SDP compute routine on PSR (Gwan-gyeong)
- Restrict qgv points which don't have enough bandwidth. (Stanislav)
- Nuke pointless div by 64bit (Ville)
- Static checker code fixes (Nathan, Mika, Chris)
- Add logging function for DP VSC SDP (Gwan-gyeong)
- Include HDMI DRM infoframe, DP HDR metadata and DP VSC SDP in the
crtc state dump (Gwan-gyeong)
- Make timeslicing explicit engine property (Chris, Tvrtko)
- Selftest and debugging improvements (Chris)
- Align variable names with BSpec (Ville)
- Tidy up gen8+ breadcrumb emission code (Chris)
- Turn intel_digital_port_connected() in a vfunc (Ville)
- Use stashed away hpd isr bits in intel_digital_port_connected() (Ville)
- Extract i915_cs_timestamp_{ns_to_ticks,tick_to_ns}() (Ville)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200515160703.GA19043@jlahtine-desk.ger.corp.intel.com