There are multiple issues with bcm2835_dma_abort() (which is called on
termination of a transaction):
* The algorithm to abort the transaction first pauses the channel by
clearing the ACTIVE flag in the CS register, then waits for the PAUSED
flag to clear. Page 49 of the spec documents the latter as follows:
"Indicates if the DMA is currently paused and not transferring data.
This will occur if the active bit has been cleared [...]"
https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf
So the function is entering an infinite loop because it is waiting for
PAUSED to clear which is always set due to the function having cleared
the ACTIVE flag. The only thing that's saving it from itself is the
upper bound of 10000 loop iterations.
The code comment says that the intention is to "wait for any current
AXI transfer to complete", so the author probably wanted to check the
WAITING_FOR_OUTSTANDING_WRITES flag instead. Amend the function
accordingly.
* The CS register is only read at the beginning of the function. It
needs to be read again after pausing the channel and before checking
for outstanding writes, otherwise writes which were issued between
the register read at the beginning of the function and pausing the
channel may not be waited for.
* The function seeks to abort the transfer by writing 0 to the NEXTCONBK
register and setting the ABORT and ACTIVE flags. Thereby, the 0 in
NEXTCONBK is sought to be loaded into the CONBLK_AD register. However
experimentation has shown this approach to not work: The CONBLK_AD
register remains the same as before and the CS register contains
0x00000030 (PAUSED | DREQ_STOPS_DMA). In other words, the control
block is not aborted but merely paused and it will be resumed once the
next DMA transaction is started. That is absolutely not the desired
behavior.
A simpler approach is to set the channel's RESET flag instead. This
reliably zeroes the NEXTCONBK as well as the CS register. It requires
less code and only a single MMIO write. This is also what popular
user space DMA drivers do, e.g.:
https://github.com/metachris/RPIO/blob/master/source/c_pwm/pwm.c
Note that the spec is contradictory whether the NEXTCONBK register
is writeable at all. On the one hand, page 41 claims:
"The value loaded into the NEXTCONBK register can be overwritten so
that the linked list of Control Block data structures can be
dynamically altered. However it is only safe to do this when the DMA
is paused."
On the other hand, page 40 specifies:
"Only three registers in each channel's register set are directly
writeable (CS, CONBLK_AD and DEBUG). The other registers (TI,
SOURCE_AD, DEST_AD, TXFR_LEN, STRIDE & NEXTCONBK), are automatically
loaded from a Control Block data structure held in external memory."
Fixes: 96286b5766 ("dmaengine: Add support for BCM2835")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v3.14+
Cc: Frank Pavlic <f.pavlic@kunbus.de>
Cc: Martin Sperl <kernel@martin.sperl.org>
Cc: Florian Meier <florian.meier@koalo.de>
Cc: Clive Messer <clive.m.messer@gmail.com>
Cc: Matthias Reichl <hias@horus.com>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Acked-by: Florian Kauer <florian.kauer@koalo.de>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
If IRQ handlers are threaded (either because CONFIG_PREEMPT_RT_BASE is
enabled or "threadirqs" was passed on the command line) and if system
load is sufficiently high that wakeup latency of IRQ threads degrades,
SPI DMA transactions on the BCM2835 occasionally break like this:
ks8851 spi0.0: SPI transfer timed out
bcm2835-dma 3f007000.dma: DMA transfer could not be terminated
ks8851 spi0.0 eth2: ks8851_rdfifo: spi_sync() failed
The root cause is an assumption made by the DMA driver which is
documented in a code comment in bcm2835_dma_terminate_all():
/*
* Stop DMA activity: we assume the callback will not be called
* after bcm_dma_abort() returns (even if it does, it will see
* c->desc is NULL and exit.)
*/
That assumption falls apart if the IRQ handler bcm2835_dma_callback() is
threaded: A client may terminate a descriptor and issue a new one
before the IRQ handler had a chance to run. In fact the IRQ handler may
miss an *arbitrary* number of descriptors. The result is the following
race condition:
1. A descriptor finishes, its interrupt is deferred to the IRQ thread.
2. A client calls dma_terminate_async() which sets channel->desc = NULL.
3. The client issues a new descriptor. Because channel->desc is NULL,
bcm2835_dma_issue_pending() immediately starts the descriptor.
4. Finally the IRQ thread runs and writes BCM2835_DMA_INT to the CS
register to acknowledge the interrupt. This clears the ACTIVE flag,
so the newly issued descriptor is paused in the middle of the
transaction. Because channel->desc is not NULL, the IRQ thread
finalizes the descriptor and tries to start the next one.
I see two possible solutions: The first is to call synchronize_irq()
in bcm2835_dma_issue_pending() to wait until the IRQ thread has
finished before issuing a new descriptor. The downside of this approach
is unnecessary latency if clients desire rapidly terminating and
re-issuing descriptors and don't have any use for an IRQ callback.
(The SPI TX DMA channel is a case in point.)
A better alternative is to make the IRQ thread recognize that it has
missed descriptors and avoid finalizing the newly issued descriptor.
So first of all, set the ACTIVE flag when acknowledging the interrupt.
This keeps a newly issued descriptor running.
If the descriptor was finished, the channel remains idle despite the
ACTIVE flag being set. However the ACTIVE flag can then no longer be
used to check whether the channel is idle, so instead check whether
the register containing the current control block address is zero
and finalize the current descriptor only if so.
That way, there is no impact on latency and throughput if the client
doesn't care for the interrupt: Only minimal additional overhead is
introduced for non-cyclic descriptors as one further MMIO read is
necessary per interrupt to check for idleness of the channel. Cyclic
descriptors are sped up slightly by removing one MMIO write per
interrupt.
Fixes: 96286b5766 ("dmaengine: Add support for BCM2835")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v3.14+
Cc: Frank Pavlic <f.pavlic@kunbus.de>
Cc: Martin Sperl <kernel@martin.sperl.org>
Cc: Florian Meier <florian.meier@koalo.de>
Cc: Clive Messer <clive.m.messer@gmail.com>
Cc: Matthias Reichl <hias@horus.com>
Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
Acked-by: Florian Kauer <florian.kauer@koalo.de>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
The license text is specifying GPL v2 or later but the MODULE_LICENSE
is set to GPL v2 which means GNU Public License v2 only. So choose the
license text as the correct one.
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Acked-by: Florian Kauer <florian.kauer@koalo.de>
Acked-by: Martin Sperl <kernel@martin.sperl.org>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
dma_slave_config direction was marked as deprecated quite some
time back, remove the usage from this driver so that the field
can be removed
Acked-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Vinod Koul <vkoul@kernel.org>
To avoid race with vchan_complete, use the race free way to terminate
running transfer.
Implement the device_synchronize callback to make sure that the terminated
descriptor is freed.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The code responsible for splitting periods into chunks that
can be handled by the DMA controller missed to update total_len,
the number of bytes processed in the current period, when there
are more chunks to follow.
Therefore total_len was stuck at 0 and the code didn't work at all.
This resulted in a wrong control block layout and audio issues because
the cyclic DMA callback wasn't executing on period boundaries.
Fix this by adding the missing total_len update.
Signed-off-by: Matthias Reichl <hias@horus.com>
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
Tested-by: Clive Messer <clive.messer@digitaldreamtime.co.uk>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
When building this driver on arm64, we get a harmless type
mismatch warning:
drivers/dma/bcm2835-dma.c: In function 'bcm2835_dma_fill_cb_chain_with_sg':
include/linux/kernel.h:743:17: warning: comparison of distinct pointer types lacks a cast
(void) (&_min1 == &_min2); \
^
drivers/dma/bcm2835-dma.c:409:21: note: in expansion of macro 'min'
cb->cb->length = min(len, max_len);
This changes the type of the 'len' variable to size_t, which
avoids the problem.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 388cc7a281 ("dmaengine: bcm2835: add slave_sg support to bcm2835-dma")
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The bcm2835_dma_prep_dma_memcpy() function is not exported
outside the driver, so make it static to avoid the following
warning:
drivers/dma/bcm2835-dma.c:616:32: warning: symbol 'bcm2835_dma_prep_dma_memcpy' was not declared. Should it be static?
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Fix typo in warning message that there is no "interrupt-names"
property defined in the device-tree and legacy-mode is used.
Also added newline to end of message.
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Use platform_get_irq_byname to allow for correct mapping of
interrupts to dma channels.
The currently implemented device tree is unfortunately
implemented with the wrong assumption, that each dma-channel
has its own dma channel, but dma-irq 11 is handling
dma-channel 11-14 and dma-irq 12 is actually a "catch all"
interrupt.
So here we use the byname variant and require that interrupts
are explicitly named via the interrupts-name property in the
device tree.
The use of shared interrupts is also implemented.
As a side-effect this means we can now use dma channels 12, 13 and 14
in a correct manner - also testing shows that onl using
channels 11 to 14 for spi and i2s works perfectly (when playing
some video)
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
Acked-by: Eric Anholt <eric@anholt.net>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Also added check for an error condition in bcm2835_dma_create_cb_chain
that showed up during development of this patch.
Tested using dmatest for all enabled channels.
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Add slave_sg support to bcm2835-dma using shared allocation
code for bcm2835_desc and DMA-control blocks already used by
dma_cyclic.
Note that bcm2835_dma_callback had to get modified to support
both modes of operation (cyclic and non-cyclic).
Tested using:
* Hifiberry I2S card (using cyclic DMA)
* fb_st7735r SPI-framebuffer (using slave_sg DMA via spi-bcm2835)
playing BigBuckBunny for audio and video.
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The bcm2835 dma system has 2 basic types of dma-channels:
* "normal" channels
* "light" channels
Lite channels are limited in several aspects:
* internal data-structure is 128 bit (not 256)
* does not support BCM2835_DMA_TDMODE (2D)
* DMA length register is limited to 16 bit.
so 0-65535 (not 0-65536 as mentioned in the official datasheet)
* BCM2835_DMA_S/D_IGNORE are not supported
The detection of the type of mode is implemented by looking at
the LITE bit in the DEBUG register for each channel.
This allows automatic detection.
Based on this the maximum block size is set to (64K - 4) or to 1G
and this limit is honored during generation of control block
chains. The effect is that when a LITE channel is used more
control blocks are used to do the same transfer (compared
to a normal channel).
As there are several sources/target DREQS that are 32 bit wide
we need to have the transfer to be a multiple of 4 as this would
break the transfer otherwise.
This is why the limit of (64K - 4) was chosen over the
alternative of (64K - 4K).
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
In preparation of adding slave_sg functionality this patch moves the
generation/allocation of bcm2835_desc and the building of
the corresponding DMA-control-block chain from bcm2835_dma_prep_dma_cyclic
into the newly created method bcm2835_dma_create_cb_chain.
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
In preparation to consolidating code we move the cyclic member
into the bcm_2835_desc structure.
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Add additional defines describing the DMA registers
as well as adding some more documentation to those registers.
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The original patch contained 3 dma channels that were masked out.
These - as far as research and discussions show - are a
artefacts remaining from the downstream legacy dma-api.
Right now down-stream still includes a legacy api used only
in a single (downstream only) driver (bcm2708_fb) that requires
2D DMA for speedup (DMA-channel 0).
Formerly the sd-card support driver also was using this legacy
api (DMA-channel 2), but since has been moved over to use
dmaengine directly.
The DMA-channel 3 is already masked out in the devicetree in
the default property "brcm,dma-channel-mask = <0x7f35>;"
So we can remove the whole masking of DMA channels.
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
bcm2835-dma supports residue reporting at burst level but didn't report
this via the residue_granularity field.
See also:
b015555327
for the downstream patch.
Signed-off-by: Matthias Reichl <hias@horus.com>
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Signed-off-by: Martin Sperl <kernel@martin.sperl.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
f931782917 dmaengine: bcm2835-dma: Fix memory leak when stopping a
running transfer
Fixed the memleak, but introduced another issue: the terminate_all callback
might be called with interrupts disabled and the dma_free_coherent() is
not allowed to be called when IRQs are disabled.
Convert the driver to use dma_pool_* for managing the list of control
blocks for the transfer.
Fixes: f931782917 ("dmaengine: bcm2835-dma: Fix memory leak when stopping a running transfer")
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Tested-by: Matthias Reichl <hias@horus.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The vd->node is removed from the lists when the transfer started so the
vchan_get_all_descriptors() will not find it. This results memory leak.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Now that the generic slave caps code can make use of the device assigned
capabilities, instead of relying on a callback to be implemented.
Make use of this code.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Acked-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Split the device_control callback of the Broadcom BCM2835 DMA driver to make
use of the newly introduced callbacks, that will eventually be used to retrieve
slave capabilities.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Acked-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The dmaengine header abbreviates destination as at least two different strings.
Make a coherent use of a single one.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Stephen Warren <swarren@wwwdotorg.org>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
There is no need to init .owner field.
Based on the patch from Peter Griffin <peter.griffin@linaro.org>
"mmc: remove .owner field for drivers using module_platform_driver"
This patch removes the superflous .owner field for drivers which
use the module_platform_driver API, as this is overriden in
platform_driver_register anyway."
Signed-off-by: Kiran Padwal <kiran.padwal@smartplayin.com>
[for nvidia]
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
chanctnt is already filled by dma_async_device_register, which uses the channel
list to know how much channels there is.
Since it's already filled, we can safely remove it from the drivers' probe
function.
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
The argument is always set to NULL and never used. Remove it.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Without DMA_PRIVATE the driver is not able to allocate more than one channel.
Since it uses dma_get_any_slave_channel that calls private_candidate,
the second allocation fails at
/* some channels are already publicly allocated */
Maybe it should be fixed in the core, but at least this fixes the bug.
Signed-off-by: Florian Meier <florian.meier@koalo.de>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Add support for DMA controller of BCM2835 as used in the Raspberry Pi.
Currently it only supports cyclic DMA.
Signed-off-by: Florian Meier <florian.meier@koalo.de>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>