The Texas Instrument's K3 J721E SoCs have a newer next-generation
C71x DSP Subsystem in the MAIN voltage domain in addition to the
previous generation C66x DSP subsystems. The C71x DSP subsystem is
based on the TMS320C71x DSP CorePac module. The C71x CPU is a true
64-bit machine including 64-bit memory addressing and single-cycle
64-bit base arithmetic operations and supports vector signal processing
providing a significant lift in DSP processing power over C66x DSPs.
J721E SoCs use a C711 (a one-core 512-bit vector width CPU core) DSP
that is cache coherent with the A72 Arm cores.
Each subsystem has one or more Fixed/Floating-Point DSP CPUs, with 32 KB
of L1P Cache, 48 KB of L1D SRAM that can be configured and partitioned as
either RAM and/or Cache, and 512 KB of L2 SRAM configurable as either RAM
and/or Cache. The CorePac also includes a Matrix Multiplication Accelerator
(MMA), a Stream Engine (SE) and a C71x Memory Management Unit (CMMU), an
Interrupt Controller (INTC) and a Powerdown Management Unit (PMU) modules.
Update the existing K3 DSP remoteproc driver to add support for this C71x
DSP subsystem. The firmware loading support is provided by using the newly
added 64-bit ELF loader support, and is limited to images using only
external DDR memory at the moment. The L1D and L2 SRAMs are used as scratch
memory when using as RAMs, and cannot be used for loadable segments. The
CMMU is also not supported to begin with, and the driver is designed to
treat the MMU as if it is in bypass mode.
Signed-off-by: Suman Anna <s-anna@ti.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200612225357.8251-3-s-anna@ti.com
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
The resets for the DSP processors on K3 SoCs are managed through the
Power and Sleep Controller (PSC) module. Each DSP typically has two
resets - a global module reset for powering on the device, and a local
reset that affects only the CPU while allowing access to the other
sub-modules within the DSP processor sub-systems.
The C66x DSPs have two levels of internal RAMs that can be used to
boot from, and the firmware loading into these RAMs require the
local reset to be asserted with the device powered on/enabled using
the module reset. Enhance the K3 DSP remoteproc driver to add support
for loading into the internal RAMs. The local reset is deasserted on
SoC power-on-reset, so logic has to be added in probe in remoteproc
mode to balance the remoteproc state-machine.
Note that the local resets are a no-op on C71x cores, and the hardware
does not supporting loading into its internal RAMs.
Signed-off-by: Suman Anna <s-anna@ti.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200721223617.20312-7-s-anna@ti.com
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
The Texas Instrument's K3 J721E SoCs have two C66x DSP Subsystems in MAIN
voltage domain that are based on the TI's standard TMS320C66x DSP CorePac
module. Each subsystem has a Fixed/Floating-Point DSP CPU, with 32 KB each
of L1P & L1D SRAMs that can be configured and partitioned as either RAM
and/or Cache, and 288 KB of L2 SRAM with 256 KB of memory configurable as
either RAM and/or Cache. The CorePac also includes an Internal DMA (IDMA),
External Memory Controller (EMC), Extended Memory Controller (XMC) with a
Region Address Translator (RAT) unit for 32-bit to 48-bit address
extension/translations, an Interrupt Controller (INTC) and a Powerdown
Controller (PDC).
A new remoteproc module is added to perform the device management of
these DSP devices. The support is limited to images using only external
DDR memory at the moment, the loading support to internal memories and
any on-chip RAM memories will be added in a subsequent patch. RAT support
is also left for a future patch, and as such the reserved memory carveout
regions are all expected to be using memory regions within the first 2 GB.
Error Recovery and Power Management features are not currently supported.
The C66x remote processors do not have an MMU, and so require fixed memory
carveout regions matching the firmware image addresses. Support for this
is provided by mandating multiple memory regions to be attached to the
remoteproc device. The first memory region will be used to serve as the
DMA pool for all dynamic allocations like the vrings and vring buffers.
The remaining memory regions are mapped into the kernel at device probe
time, and are used to provide address translations for firmware image
segments without the need for any RSC_CARVEOUT entries. Any firmware
image using memory outside of the supplied reserved memory carveout
regions will be errored out.
The driver uses various TI-SCI interfaces to talk to the System Controller
(DMSC) for managing configuration, power and reset management of these
cores. IPC between the A72 cores and the DSP cores is supported through
the virtio rpmsg stack using shared memory and OMAP Mailboxes.
Signed-off-by: Suman Anna <s-anna@ti.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: https://lore.kernel.org/r/20200721223617.20312-6-s-anna@ti.com
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>