Patch series "mm: unexport __get_user_pages_unlocked()".
This patch series continues the cleanup of get_user_pages*() functions
taking advantage of the fact we can now pass gup_flags as we please.
It firstly adds an additional 'locked' parameter to
get_user_pages_remote() to allow for its callers to utilise
VM_FAULT_RETRY functionality. This is necessary as the invocation of
__get_user_pages_unlocked() in process_vm_rw_single_vec() makes use of
this and no other existing higher level function would allow it to do
so.
Secondly existing callers of __get_user_pages_unlocked() are replaced
with the appropriate higher-level replacement -
get_user_pages_unlocked() if the current task and memory descriptor are
referenced, or get_user_pages_remote() if other task/memory descriptors
are referenced (having acquiring mmap_sem.)
This patch (of 2):
Add a int *locked parameter to get_user_pages_remote() to allow
VM_FAULT_RETRY faulting behaviour similar to get_user_pages_[un]locked().
Taking into account the previous adjustments to get_user_pages*()
functions allowing for the passing of gup_flags, we are now in a
position where __get_user_pages_unlocked() need only be exported for his
ability to allow VM_FAULT_RETRY behaviour, this adjustment allows us to
subsequently unexport __get_user_pages_unlocked() as well as allowing
for future flexibility in the use of get_user_pages_remote().
[sfr@canb.auug.org.au: merge fix for get_user_pages_remote API change]
Link: http://lkml.kernel.org/r/20161122210511.024ec341@canb.auug.org.au
Link: http://lkml.kernel.org/r/20161027095141.2569-2-lstoakes@gmail.com
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- fix dma-buf export path to return correct SG table
- trivially implement direct dma-buf mapping
- allow DRAW_INSTANCED commands in validator
- make the driver work on i.MX6SX, yielding a working 2D/3D stack
together with Mareks MXS DRM driver
* 'drm-etnaviv-next' of git://git.pengutronix.de/lst/linux:
MAINTAINERS: add etnaviv mailinglist
drm/etnaviv: move linear window on MC1.0 parts if necessary
drm/etnaviv: don't invoke OOM killer from dump code
drm/etnaviv: fix gem_prime_get_sg_table to return new SG table
drm/etnaviv: Allow DRAW_INSTANCED commands
drm/etnaviv: implement dma-buf mmap
On i.MX6SX the physical memory is placed above the 2GB mark, so the GPU
linear window has to be moved for the GPU to work at all. This doesn't
mix with the FAST_CLEAR feature, as the TS unit doesn't take the linear
window offset into account and will corrupt memory when used with a
non-zero offset.
Move the linear window if it's necessary for the GPU to work, but avoid
announcing FAST_CLEAR support to userspace in this case.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Marek Vasut <marex@denx.de>
The dumper is only a debugging aid so we don't want to invoke the OOM
killer if buffer for the potentially large GPU state can't be vmalloced.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
The object internal SG table must not be returned, as the caller
will take ownership of the returned table.
Construct a new table from the object pages and return this one
instead.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Vivante GPUs with HALTI0 feature support a DRAW_INSTANCED command in the
command stream to draw a number of instances of the same geometry.
The information that has been figured out about the command can be found
here: https://github.com/etnaviv/etna_viv/blob/master/rnndb/cmdstream.xml#L270
This command is not allowed currently by the DRM driver because it
was not known before. This patch enables parsing it in command
streams and allows using it by userspace drivers.
Signed-off-by: Wladimir J. van der Laan <laanwj@gmail.com>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
This adds the required boilerplate to allow direct mmap of exported
etnaviv BOs.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Philipp Zabel <p.zabel@pengutronix.de>
Convert DT component matching to use component_match_add_release().
Acked-by: Jyri Sarha <jsarha@ti.com>
Reviewed-by: Jyri Sarha <jsarha@ti.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/E1bwo6l-0005Io-Q1@rmk-PC.armlinux.org.uk
First -misc pull for 4.10:
- drm_format rework from Laurent
- reservation patches from Chris that missed 4.9.
- aspect ratio support in infoframe helpers and drm mode/edid code
(Shashank Sharma)
- rotation rework from Ville (first parts at least)
- another attempt at the CRC debugfs interface from Tomeu
- piles and piles of misc patches all over
* tag 'topic/drm-misc-2016-10-24' of git://anongit.freedesktop.org/drm-intel: (55 commits)
drm: Use u64 for intermediate dotclock calculations
drm/i915: Use the per-plane rotation property
drm/omap: Use per-plane rotation property
drm/omap: Set rotation property initial value to BIT(DRM_ROTATE_0) insted of 0
drm/atmel-hlcdc: Use per-plane rotation property
drm/arm: Use per-plane rotation property
drm: Add support for optional per-plane rotation property
drm/atomic: Reject attempts to use multiple rotation angles at once
drm: Add drm_rotation_90_or_270()
dma-buf/sync_file: hold reference to fence when creating sync_file
drm/virtio: kconfig: Fixup white space.
drm/fence: release fence reference when canceling event
drm/i915: Handle early failure during intel_get_load_detect_pipe
drm/fb_cma_helper: do not free fbdev if there is none
drm: fix sparse warnings on undeclared symbols in crc debugfs
gpu: Remove depends on RESET_CONTROLLER when not a provider
i915: don't call drm_atomic_state_put on invalid pointer
drm: Don't export the drm_fb_get_bpp_depth() function
drm/arm: mali-dp: Replace drm_fb_get_bpp_depth() with drm_format_plane_cpp()
drm: vmwgfx: Replace drm_fb_get_bpp_depth() with drm_format_info()
...
2 more patches to stabilize the new MMUv2 support.
* 'drm-etnaviv-fixes' of git://git.pengutronix.de/lst/linux:
drm/etnaviv: block 64K of address space behind each cmdstream
drm/etnaviv: ensure write caches are flushed at end of user cmdstream
This removes the 'write' and 'force' from get_user_pages_remote() and
replaces them with 'gup_flags' to make the use of FOLL_FORCE explicit in
callers as use of this flag can result in surprising behaviour (and
hence bugs) within the mm subsystem.
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since fence_wait_timeout_reservation_object_wait_timeout_rcu() with a
timeout of 0 becomes reservation_object_test_signaled_rcu(), we do not
need to handle such conversion in the caller. The only challenge are
those callers that wish to differentiate the error code between the
nonblocking busy check and potentially blocking wait.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <linux+etnaviv@armlinux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20160829070834.22296-2-chris@chris-wilson.co.uk
To make sure we don't place anything there which might confuse
the FE prefetcher. This gets rid of another case of FE MMU faults
when the address space gets crowded before triggering the reaper.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
If the GPU is done with one user command stream the buffers referenced
by this command stream may go away and get unmapped from the MMU. If
the write caches are still dirty at this point later evictions will run
into MMU faults, killing the GPU.
Make sure the write caches are flushed before signaling completion
of the user command stream.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Notable changes:
- Cleanups from Fabio to some error paths and proper error propagation.
- Lots of refactoring and new code to support the new MMU version 2,
still relatively unoptimized and doesn't yet provide better process
isolation than MMUv1, but enough to get newer cores up and running.
- New hardware support: GC3000, as found on the NXP i.MX6 QuadPlus SoC.
* 'drm-etnaviv-next' of git://git.pengutronix.de/git/lst/linux: (25 commits)
drm/etnaviv: mark whole context as lost in recover worker
drm/etnaviv: record correct cmdbuf IOVA in dump
drm/etnaviv: space out IOVA layout for cmdbufs on MMUv2
drm/etnaviv: fix up model and revision for GC2000+
drm/etnaviv: implement IOMMUv2 translation
drm/etnaviv: handle MMU exception in IRQ handler
drm/etnaviv: add flushing logic for MMUv2
drm/etnaviv: add function to construct MMUv2 init buffer
drm/etnaviv: map cmdbuf through MMU on version 2
drm/etnaviv: split out iova search and MMU reaping logic
drm/etnaviv: split out FE start
drm/etnaviv: split out wait for gpu idle
drm/etnaviv: move gpu_va() to etnaviv mmu
drm/etnaviv: remove unused iommu_v2 header
drm/etnaviv: move IOMMU domain allocation into etnaviv MMU
drm/etnaviv: indirect IOMMU restore through etnaviv MMU
drm/etnaviv: move linear window setup into etnaviv_iommuv1_restore
drm/etnaviv: rename etnaviv_iommu_domain_restore to etnaviv_iommuv1_restore
drm/etnaviv: only check if the cmdbuf is inside the linear window on MMUv1
drm/etnaviv: only try to use the linear window on MMUv1
...
There are many reasons other than ENOMEM that drm_dev_init() can
fail. Return ERR_PTR rather than NULL to be able to distinguish
these in the caller.
Signed-off-by: Tom Gundersen <teg@jklm.no>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20160921145919.13754-2-teg@jklm.no
If we reset the GPU to get it back into a usable state we lose
all context, not just the MMU one. Mark the whole context as
lost to trigger a restore of the exec and MMU state.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
For cmdbufs the CPU IOVA was recorded instead of the GPU one.
Fix this to make it consistent with other BOs and to make
reading the dumps easier.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
At least on the GC3000 the FE MMU is not properly flushing stale TLB
entries. Make sure to map the cmdbufs with a big enough spacing in
the IOVAs to not hit old/prefetched TLB entries when jumping to a
newly mapped cmdbuf.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
GC2000+ on the i.MX6QP is just a re-branded GC3000, lets call it by
its real name to avoid confusion in other parts of the driver.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
All other parts are now in place, so implement the actual translation
step and hook it up, so the driver claims support for cores with
the new MMU.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Bit 30 of the interrupt status signals an MMU exception. Handle this
condition properly and dump some useful registers.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Flushing works differently on MMUv2, in that it's only necessary
to set a single bit in the control register to flush all translation
units. A semaphore stall then makes sure that the flush has propagated
properly.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Both the safe/scratch address and the master TLB address are per pipe
with the CPU mapped registers not properly propagating to the
different translation units.
The only way to correctly configure all translation units is to have
a command stream snipped executed by the FE, before any other execution
can start.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
With MMUv2 all buffers need to be mapped through the MMU once it
is enabled. Align the buffer size to 4K, as the MMU is only able to
map page aligned buffers.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
With MMUv2 the command buffers need to be mapped through the MMU.
Split out the iova search and MMU reaping logic so it can be reused
for the cmdbuf mapping, where no GEM object is involved.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Split out into a new externally visible function, as the IOMMUv2
code needs this functionality, too.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Split out into a new externally visible function, as the IOMMUv2
code needs this functionality, too.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
The GPU virtual address for the command buffers differs depending on
the IOMMU version. Move the calculation of the iova into etnaviv
mmu, to enable proper dispatch.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
The GPU code doesn't need to deal with the IOMMU directly, instead
it can all be hidden behind the etnaviv mmu interface. Move the
last remaining part into etnaviv mmu.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
This function has external visibility and only handles the Vivant IOMMU
version 1. Rename to make this more clear and allow a clear separation
of the different IOMMU versions.
Also drop the domain parameter, as we can infer it from the GPU we are
dealing with.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
There is no linear window on MMUv2 and the FE can access the full 4GB
address space either directly (as long as the MMU isn't configured) or
through the MMU, once it is up.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
As the comment above the code states, the linear window is only
available on MMUv1. Don't try to use it on MMUv2.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
The driver doesn't ever enable individual clocks alone, so there
is no need to scatter the clock enable/disable sequences through
multiple functions. Fold them into the top one.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
In the etnaviv_gpu_platform_probe() error path the 'fail' label is
used to just return the error code.
This can be simplified by returning the error code immediately, so
get rid of the unneeded 'fail' label.
Signed-off-by: Fabio Estevam <festevam@gmail.com>
clk_prepare_enable() may fail, so we should better check for its return
value and propagate it in the case of failure.
Signed-off-by: Fabio Estevam <festevam@gmail.com>
It's deprecated and only should be used by drivers which still use
drm_platform_init, not by anyone else.
And indeed it's entirely unused and can be nuked.
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Christian Gmeiner <christian.gmeiner@gmail.com>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1471034937-651-6-git-send-email-daniel.vetter@ffwll.ch
Since commit 4984979b9b ("drm/irq: simplify irq checks in
drm_wait_vblank"), the drm driver feature flag DRIVER_HAVE_IRQ is only
required for drivers that have an IRQ handler managed by the DRM core.
Some drivers, armada, etnaviv, kirin and sti, set this flag without
.irq_handler setup in drm_driver. These drivers manage IRQ handler
by themselves and the flag DRIVER_HAVE_IRQ makes no sense there.
Drop the flag for these drivers to avoid confusion.
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Cc: Vincent Abriou <vincent.abriou@st.com>
Cc: Xinliang Liu <z.liuxinliang@hisilicon.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk> (for armada and etnaviv)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/1471331168-5601-1-git-send-email-shawnguo@kernel.org
Both the fence and event alloc are safe to be done without holding the GPU
lock, as they either don't need any locking (fences) or are protected by
their own lock (events).
This solves a bad locking interaction between the submit path and the
recover worker. If userspace manages to exhaust all available events while
the GPU is hung, the submit will wait for events to become available
holding the GPU lock. The recover worker waits for this lock to become
available before trying to recover the GPU which frees up the allocated
events. Essentially both paths are deadlocked until the submit path
times out waiting for available events, failing the submit that could
otherwise be handled just fine if the recover worker had the chance to
bring the GPU back in a working state.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Merge drm updates from Dave Airlie:
"This is the main drm pull request for 4.8.
I'm down with a cold at the moment so hopefully this isn't in too bad
a state, I finished pulling stuff last week mostly (nouveau fixes just
went in today), so only this message should be influenced by illness.
Apologies to anyone who's major feature I missed :-)
Core:
Lockless GEM BO freeing
Non-blocking atomic work
Documentation changes (rst/sphinx)
Prep for new fencing changes
Simple display helpers
Master/auth changes
Register/unregister rework
Loads of trivial patches/fixes.
New stuff:
ARM Mali display driver (not the 3D chip)
sii902x RGB->HDMI bridge
Panel:
Support for new panels
Improved backlight support
Bridge:
Convert ADV7511 to bridge driver
ADV7533 support
TC358767 (DSI/DPI to eDP) encoder chip support
i915:
BXT support enabled by default
GVT-g infrastructure
GuC command submission and fixes
BXT workarounds
SKL/BKL workarounds
Demidlayering device registration
Thundering herd fixes
Missing pci ids
Atomic updates
amdgpu/radeon:
ATPX improvements for better dGPU power control on PX systems
New power features for CZ/BR/ST
Pipelined BO moves and evictions in TTM
GPU scheduler improvements
GPU reset improvements
Overclocking on dGPUs with amdgpu
Polaris powermanagement enabled
nouveau:
GK20A/GM20B volt and clock improvements.
Initial support for GP100/GP104 GPUs, GP104 will not yet support
acceleration due to NVIDIA having not released firmware for them as of yet.
exynos:
Exynos5433 SoC with IOMMU support.
vc4:
Shader validation for branching
imx-drm:
Atomic mode setting conversion
Reworked DMFC FIFO allocation
External bridge support
analogix-dp:
RK3399 eDP support
Lots of fixes.
rockchip:
Lots of small fixes.
msm:
DT bindings cleanups
Shrinker and madvise support
ASoC HDMI codec support
tegra:
Host1x driver cleanups
SOR reworking for DP support
Runtime PM support
omapdrm:
PLL enhancements
Header refactoring
Gamma table support
arcgpu:
Simulator support
virtio-gpu:
Atomic modesetting fixes.
rcar-du:
Misc fixes.
mediatek:
MT8173 HDMI support
sti:
ASOC HDMI codec support
Minor fixes
fsl-dcu:
Suspend/resume support
Bridge support
amdkfd:
Minor fixes.
etnaviv:
Enable GPU clock gating
hisilicon:
Vblank and other fixes"
* tag 'drm-for-v4.8' of git://people.freedesktop.org/~airlied/linux: (1575 commits)
drm/nouveau/gr/nv3x: fix instobj write offsets in gr setup
drm/nouveau/acpi: fix lockup with PCIe runtime PM
drm/nouveau/acpi: check for function 0x1B before using it
drm/nouveau/acpi: return supported DSM functions
drm/nouveau/acpi: ensure matching ACPI handle and supported functions
drm/nouveau/fbcon: fix font width not divisible by 8
drm/amd/powerplay: remove enable_clock_power_gatings_tasks from initialize and resume events
drm/amd/powerplay: move clockgating to after ungating power in pp for uvd/vce
drm/amdgpu: add query device id and revision id into system info entry at CGS
drm/amdgpu: add new definition in bif header
drm/amd/powerplay: rename smum header guards
drm/amdgpu: enable UVD context buffer for older HW
drm/amdgpu: fix default UVD context size
drm/amdgpu: fix incorrect type of info_id
drm/amdgpu: make amdgpu_cgs_call_acpi_method as static
drm/amdgpu: comment out unused defaults_staturn_pro static const structure to fix the build
drm/amdgpu: enable UVD VM only on polaris
drm/amdgpu: increase timeout of IB test
drm/amdgpu: add destroy session when generate VCE destroy msg.
drm/amd: fix deadlock of job_list_lock V2
...
Refactor this function implementation so that the
drm_gem_object_unreference_unlocked() function will only be called once
in case of a failure according to the Linux coding style recommendation
for centralized exiting of functions.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
[seanpaul tweaked subject]
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/4af34ce6-62c6-7966-1ae3-0877d5ac909d@users.sourceforge.net
The functions drm_gem_object_unreference_unlocked() and vunmap() perform
also input parameter validation.
Thus the tests around their calls are not needed.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
[seanpaul tweaked subject]
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/9638cd74-ffc5-d9ee-a40c-9b60e860ad8b@users.sourceforge.net
etnaviv-next only contains two patches to get rid of a confusing error
message and finally one patch to enable the autonomous GPU clock gating.
* 'drm-etnaviv-next' of git://git.pengutronix.de/git/lst/linux:
drm/etnaviv: remove generic GPU init failure reporting
drm/etnaviv: improve error reporting in GPU init path
drm/etnaviv: enable GPU module level clock gating support
The GPU init path now reports any errors which might occur more accurately
than what is possible with the generic "something failed" message.
Remove the generic reporting, so we don't log an error into dmesg anymore
if any of the GPU cores are ignored.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>