Alloc GEM buffers backed by noncoherent memory on SoCs where it is
actually faster than write-combine.
This dramatically speeds up software rendering on these SoCs, even for
tasks where write-combine memory should in theory be faster (e.g. simple
blits).
v3: The option is now selected per-SoC instead of being a module
parameter.
v5: - Fix drm_atomic_get_new_plane_state() used to retrieve the old
state
- Use custom drm_gem_fb_create()
- Only check damage clips and sync DMA buffers if non-coherent
buffers are used
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20210523170415.90410-4-paul@crapouillou.net
Add support for 24-bit panels that are connected through a 8-bit bus and
use delta-RGB, which means a RGB pixel ordering on odd lines, and a GBR
pixel ordering on even lines.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20201119155559.14112-4-paul@crapouillou.net
This reverts commit 37054fc814 ("gpu/drm: ingenic: Add option to mmap
GEM buffers cached")
At the very moment this commit was created, the DMA API it relied on was
modified in the DMA tree, which caused the driver to break in
linux-next.
Revert it for now, and it will be resubmitted later to work with the new
DMA API.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20201004141758.1013317-1-paul@crapouillou.net
Starting from the JZ4725B SoC, the primary and overlay planes support
24-bit pixel modes (8 bits per color component, without dummy byte).
Add support for these in the ingenic-drm driver.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200926170501.1109197-8-paul@crapouillou.net
Ingenic SoCs are most notably used in cheap chinese handheld gaming
consoles. There, the games and applications generally render in software
directly into GEM buffers.
Traditionally, GEM buffers are mapped write-combine. Writes to the
buffer are accelerated, and reads are slow. Application doing lots of
alpha-blending paint inside shadow buffers, which is then memcpy'd into
the final GEM buffer.
On recent Ingenic SoCs however, it is much faster to have a fully cached
GEM buffer, in which applications paint directly, and whose data is
invalidated before scanout, than having a write-combine GEM buffer, even
when alpha blending is not used.
Add an optional 'cached_gem_buffers' parameter to the ingenic-drm driver
to allow GEM buffers to be mapped fully-cached, in order to speed up
software rendering.
v2: Use standard noncoherent DMA APIs
v3: Use damage clips instead of invalidating full frames
v4: Avoid dma_pgprot() which is not exported. Using vm_get_page_prot()
is enough in this case.
v5:
- Avoid calling drm_gem_cma_prime_mmap(). It has the side effect that an
extra object reference is obtained, which causes our dumb buffers to
never be freed. It should have been drm_gem_cma_mmap_obj(). However,
our custom mmap function only differs with one flag, so we can cleanly
handle both modes in ingenic_drm_gem_mmap().
- Call drm_gem_vm_close() if drm_mmap_attrs() failed, just like in
drm_gem_cma_mmap_obj().
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200912195639.176001-1-paul@crapouillou.net
Add support for the Image Processing Unit (IPU) found in all Ingenic
SoCs.
The IPU can upscale and downscale a source frame of arbitrary size
ranging from 4x4 to 4096x4096 on newer SoCs, with bicubic filtering
on newer SoCs, bilinear filtering on older SoCs. Nearest-neighbour can
also be obtained with proper coefficients.
Starting from the JZ4725B, the IPU supports a mode where its output is
sent directly to the LCDC, without having to be written to RAM first.
This makes it possible to use the IPU as a DRM plane on the compatible
SoCs, and have it convert and scale anything the userspace asks for to
what's available for the display.
Regarding pixel formats, older SoCs support packed YUV 4:2:2 and various
planar YUV formats. Newer SoCs introduced support for RGB.
Since the IPU is a separate hardware block, to make it work properly the
Ingenic DRM driver will now register itself as a component master in
case the IPU driver has been enabled in the config.
When enabled in the config, the CRTC will see the IPU as a second primary
plane. It cannot be enabled at the same time as the regular primary
plane. It has the same priority, which means that it will also display
below the overlay plane.
v2: - ingenic-ipu is no longer its own module. It will be built
into the ingenic-drm module.
- If enabled in the config, both the core driver and the IPU
driver will register as components; otherwise the core
driver will bypass that and call the ingenic_drm_bind()
function directly.
- Since both files now build into the same module, the
symbols previously exported as GPL are not exported anymore,
since they are only used internally.
- Fix SPDX license header in ingenic-ipu.h
- Avoid using 'for(;;);' loops without trailing statement(s)
v3: - Pass priv structure to IRQ handler; that way we don't hardcode
the expectation that the IPU plane is at index #0.
- Rework osd_changed() to account for src_* changes
- Add multiplanar YUV 4:4:4 support
- Commit fb addresses to HW at vblank, since addr registers are
not shadow registers
- Probe IPU component later so that IPU plane is last
- Fix driver not working on IPU-less hardware
- Use IPU driver's name as the IRQ name to avoid having two
'ingenic-drm' in /proc/interrupts
- Fix IPU only working for still images on JZ4725B
- Add a bit more code comments
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-10-paul@crapouillou.net
All Ingenic SoCs starting from the JZ4725B support OSD mode.
In this mode, two separate planes can be used. They can have different
positions and sizes, and one can be overlayed on top of the other.
v2: Use fallthrough; instead of /* fall-through */
v3: - Add custom atomic_tail function to handle case where HW gives no
VBLANK
- Use regmap_set_bits() / regmap_clear_bits() when possible
- Use dma_hwdesc_f{0,1} fields in priv structure instead of array
- Use dmam_alloc_coherent() instead of dma_alloc_coherent()
- Use more meaningful 0xf0 / 0xf1 values as DMA descriptors IDs
- Add a bit more code comments
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-9-paul@crapouillou.net