Conflicts:
arch/mips/net/bpf_jit.c
drivers/net/can/flexcan.c
Both the flexcan and MIPS bpf_jit conflicts were cases of simple
overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
Pass the correct 'mask' and 'value' bits to c_can_hw_raminit_wait_ti(). They
seem to have been swapped in the usage instances.
Reported-by: Jay Schroeder <jay.schroeder@garmin.com>
Signed-off-by: Roger Quadros <rogerq@ti.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
devm_ioremap() returns NULL on error, not an ERR_PTR().
Fixes: 33cf756569 ('can: c_can_platform: Fix raminit, use devm_ioremap() instead of devm_ioremap_resource()')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.11
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
No need to manually copy debug settings into subdir Makefiles. kbuild
has a mechanism for inheriting, so let's use it.
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
We should prefer `struct pci_device_id` over `DEFINE_PCI_DEVICE_TABLE` to
meet kernel coding style guidelines. This issue was reported by checkpatch.
A simplified version of the semantic patch that makes this change is as
follows (http://coccinelle.lip6.fr/):
// <smpl>
@@
identifier i;
declarer name DEFINE_PCI_DEVICE_TABLE;
initializer z;
@@
- DEFINE_PCI_DEVICE_TABLE(i)
+ const struct pci_device_id i[]
= z;
// </smpl>
[bhelgaas: add semantic patch]
Signed-off-by: Benoit Taine <benoit.taine@lip6.fr>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The raminit register is shared register for both can0 and can1. Since commit:
32766ff net: can: Convert to use devm_ioremap_resource
devm_ioremap_resource() is used to map raminit register. When using both
interfaces the mapping for the can1 interface fails, leading to a non
functional can interface.
Signed-off-by: George Cherian <george.cherian@ti.com>
Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.11
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This patch uses devm_* APIs as they are device managed and make code simpler.
Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Conflicts:
drivers/net/bonding/bond_alb.c
drivers/net/ethernet/altera/altera_msgdma.c
drivers/net/ethernet/altera/altera_sgdma.c
net/ipv6/xfrm6_output.c
Several cases of overlapping changes.
The xfrm6_output.c has a bug fix which overlaps the renaming
of skb->local_df to skb->ignore_df.
In the Altera TSE driver cases, the register access cleanups
in net-next overlapped with bug fixes done in net.
Similarly a bug fix to send ALB packets in the bonding driver using
the right source address overlaps with cleanups in net-next.
Signed-off-by: David S. Miller <davem@davemloft.net>
Add helpers for 32-bit accesses and replace open-coded 32-bit access
with calls to helpers. Minimum changes are done to the pci case, as I
don't have access to that hardware.
Tested-by: Thor Thayer <tthayer@altera.com>
Signed-off-by: Thor Thayer <tthayer@altera.com>
Signed-off-by: Pavel Machek <pavel@denx.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This patch makes the {read,write}_reg functions const, this is a preparation to
make use of {read,write}_reg in the hwinit callback.
Signed-off-by: Thor Thayer <tthayer@altera.com>
Signed-off-by: Pavel Machek <pavel@denx.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
In 2b9aecdce2 ("can: c_can: Disable rx split as workaround") a new Kconfig
option was introduced as a workaround. The tests performed by Alexander Stein
confirmed this option to be obsolete with all the other cleanups and fixes
that had been discussed that time:
http://marc.info/?l=linux-can&m=139746476821294&w=2
Both (author and tester) agreed to remove this Kconfig option again:
http://marc.info/?l=linux-can&m=139883820714229&w=2
As some more cleanups took place since then a simple revert is not possible.
This patch removes the entire option as it would behave when disabled.
Further beautification’s can be done later.
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Conflicts:
drivers/net/ethernet/altera/altera_sgdma.c
net/netlink/af_netlink.c
net/sched/cls_api.c
net/sched/sch_api.c
The netlink conflict dealt with moving to netlink_capable() and
netlink_ns_capable() in the 'net' tree vs. supporting 'tc' operations
in non-init namespaces. These were simple transformations from
netlink_capable to netlink_ns_capable.
The Altera driver conflict was simply code removal overlapping some
void pointer cast cleanups in net-next.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Alexander Stein <alexander.stein@systec-electronic.com>
Acked-by: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Coverity complains that c_can_pci_probe() calls pci_enable_msi() without
checking the result:
CID 712278 (#1 of 1): Unchecked return value (CHECKED_RETURN) 3. check_return:
Calling pci_enable_msi_block without checking return value (as is done
elsewhere 88 out of 105 times).
88 pci_enable_msi(pdev);
This is CID 712278.
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com>
Reported-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Commit 6439fbce10 (can: c_can: fix error checking of priv->instance in
probe()) found the warning but applied a suboptimal solution. Since, both
pdev->id and of_alias_get_id() return integers, it makes sense to convert the
variable to an integer and avoid the cast.
Signed-off-by: Wolfram Sang <wsa@sang-engineering.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
It's suffcient to kill the TXIE bit in the message control register
even if the documentation of C and D CAN says that it's not allowed to
do that while MSGVAL is set. Reality tells a different story and this
change gives us another 2% of CPU back for not waiting on I/O.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Mark suggested to use one IF for the softirq and the other for the
xmit function to avoid the xmit lock.
That requires to write the frame into the interface first, then handle
the echo skb and store the dlc before committing the TX request to the
message ram.
We use an atomic to handle the active buffers instead of reading the
MSGVAL register as thats way faster especially on PCH/x86.
Suggested-by: Mark <mark5@del-llc.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Instead of obfuscating the code by artificial 16 bit splits use the
proper 32 bit assignments and split the result when writing to the
interface.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Remove the MASK from the TX transfer side.
Make the code readable and get rid of the annoying IFX_WRITE_XXX_16BIT
macros which are just obfuscating the code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Sigh!
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Alexander reported that the new optimized handling of the RX fifo
causes random packet loss on Intel PCH C_CAN hardware.
After a few fruitless debugging sessions I got hold of a PCH (eg20t)
afflicted system. That machine does not have the CAN interface wired
up, but it was possible to reproduce the issue with the HW loopback
mode.
As Alexander observed correctly, clearing the NewDat flag along with
reading out the message buffer causes that issue on C_CAN, while D_CAN
handles that correctly.
Instead of restoring the original message buffer handling horror the
following workaround solves the issue:
transfer buffer to IF without clearing the NewDat
handle the message
clear NewDat bit
That's similar to the original code but conditional for C_CAN.
I really wonder why all user manuals (C_CAN, Intel PCH and some more)
recommend to clear the NewDat bit right away. The knows it all Oracle
operated by Gurgle does not unearth any useful information either. I
simply cannot believe that we are the first to uncover that HW issue.
Reported-and-tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The RX buffer split causes packet loss in the hardware:
What happens is:
RX Packet 1 --> message buffer 1 (newdat bit is not cleared)
RX Packet 2 --> message buffer 2 (newdat bit is not cleared)
RX Packet 3 --> message buffer 3 (newdat bit is not cleared)
RX Packet 4 --> message buffer 4 (newdat bit is not cleared)
RX Packet 5 --> message buffer 5 (newdat bit is not cleared)
RX Packet 6 --> message buffer 6 (newdat bit is not cleared)
RX Packet 7 --> message buffer 7 (newdat bit is not cleared)
RX Packet 8 --> message buffer 8 (newdat bit is not cleared)
Clear newdat bit in message buffer 1
Clear newdat bit in message buffer 2
Clear newdat bit in message buffer 3
Clear newdat bit in message buffer 4
Clear newdat bit in message buffer 5
Clear newdat bit in message buffer 6
Clear newdat bit in message buffer 7
Clear newdat bit in message buffer 8
Now if during that clearing of newdat bits, a new message comes in,
the HW gets confused and drops it.
It does not matter how many of them you clear. I put a delay between
clear of buffer 1 and buffer 2 which was long enough that the message
should have been queued either in buffer 1 or buffer 9. But it did not
show up anywhere. The next message ended up in buffer 1. So the
hardware lost a packet of course without telling it via one of the
error handlers.
That does not happen on all clear newdat bit events. I see one of 10k
packets dropped in the scenario which allows us to reproduce. But the
trace looks always the same.
Not splitting the RX Buffer avoids the packet loss but can cause
reordering. It's hard to trigger, but it CAN happen.
With that mode we use the HW as it was probably designed for. We read
from the buffer 1 upwards and clear the buffer as we get the
message. That's how all microcontrollers use it. So I assume that the
way we handle the buffers was never really tested. According to the
public documentation it should just work :)
Let the user decide which evil is the lesser one.
[ Oliver Hartkopp: Provided a sane config option and help text and
made me switch to favour potential and unlikely reordering over
packet loss ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The driver handles pointlessly TWO interrupts per packet. The reason
is that it enables the status interrupt which fires for each rx and tx
packet and it enables the per message object interrupts as well.
The status interrupt merily acks or in case of D_CAN ignores the TX/RX
state and then the message object interrupt fires.
The message objects interrupts are only useful if all message objects
have hardware filters activated.
But we don't have that and its not simple to implement in that driver
without rewriting it completely.
So we can ditch the message object interrupts and handle the RX/TX
right away from the status interrupt. Instead of TWO we handle ONE.
Note: We must keep the TXIE/RXIE bits in the message buffers because
the status interrupt alone is not reliable enough in corner cases.
If we ever have the need for HW filtering, then this code needs a
complete overhaul and we can think about it then. For now we prefer a
lower interrupt load.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
On D_CAN the RXOK, TXOK and LEC bits are cleared/set on read of the
status register. No need to update them.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Instead of writing to the message object we can simply clear the
NewDat bit with the get method.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
If the allocation of the error skb fails, we still want to see the
error statistics.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Reading the LEC type with
return (mode & ENABLED) && (status & LEC_MASK);
is not guaranteed to return (status & LEC_MASK) if the enabled bit in
mode is set. It's guaranteed to return 0 or !=0.
Remove the inline function and call unconditionally into the
berr_handling code and return early when the reporting is disabled.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
If the allocation of an error skb fails, the state change handling
returns w/o doing any work. That leaves the interface in a wreckaged
state as the internal status is wrong.
Split the interface handling and the skb handling.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
There is no guarantee that the skb is in the same state after calling
net_receive_skb(). It might be freed or reused. Not really harmful as
its a read access, except you turn on the proper debugging options
which catch a use after free.
The whole can subsystem is full of this. Copy and paste ....
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The state change handler is called with device interrupts disabled
already. So no point in disabling them again when we enter bus off
state.
But what's worse is that we reenable the interrupts at the end of NAPI
poll unconditionally. So c_can_start() which is called from the
restart timer can trigger interrupts which confuse the hell out of the
half reinitialized driver/hw.
Remove the pointless device interrupt disable in the BUS_OFF handler
and prevent reenabling the device interrupts at the end of the poll
routine when the current state is BUS_OFF.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
c_can_start() enables interrupts way too early. The first enabling
happens when setting the control mode in c_can_chip_config() and then
again at the end of the function.
But that happens before napi_enable() and that means that an interrupt
which comes in will disable interrupts again and call napi_schedule,
which ignores the request and the later napi_enable() is not making
thinks work either. So the interface is up with all device interrupts
disabled.
Move the device interrupt after napi_enable() and add it to the other
callsites of c_can_start() in c_can_set_mode() and c_can_power_up()
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
All type checks in c_can.c are != BOSCH_D_CAN so nobody noticed so far
that the pci code does not update the type information.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Alexander Stein <alexander.stein@systec-electronic.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iEYEABECAAYFAlM6jfsACgkQjTAFq1RaXHOVTwCeOyOYOJ+Ze2x33WydGcGsvX8a
QPoAoJVAcekmL5MVbcMTTvc8QL1AebAm
=4aEp
-----END PGP SIGNATURE-----
Merge tag 'linux-can-fixes-for-3.15-20140401' of git://gitorious.org/linux-can/linux-can
linux-can-fixes-for-3.15-20140401
Marc Kleine-Budde says:
====================
this is a pull request of 16 patches for the 3.15 release cycle.
Bjorn Van Tilt contributes a patch which fixes a memory leak in usb_8dev's
usb_8dev_start_xmit()s error path. A patch by Robert Schwebel fixes a typo in
the can documentation. The remaining patches all target the c_can driver. Two
of them are by me; they add a missing netif_napi_del() and return value
checking. Thomas Gleixner contributes 12 patches, which address several
shortcomings in the driver like hardware initialisation, concurrency, message
ordering and poor performance.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no point to toggle the RX led for every packet. Especially if
we have a full FIFO we want to avoid everything we can.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The function loads the message object from the hardware to get the
payload length. The previous patch stores that information in an
array, so we can avoid the hardware access.
Remove the hardware access and move the led toggle outside of the
spinlocked region. Toggle the led only once when at least one packet
has been received.
Binary size shrinks along with the code
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
We can avoid the HW access in TX cleanup path for retrieving the DLC
of the sent package if we store the DLC in a private array.
Ideally this should be handled in the can_echo_skb functions, but I
leave that exercise to the CAN folks.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
commit 4ce78a838c (can: c_can: Speed up rx_poll function) hyped a
performance improvement by reducing the access to the interrupt
pending register from a dual 16 bit to a single 16 bit access. Wow!
Thereby it crippled the driver to cast the 16 msg objects in stone,
which is completly braindead as contemporary hardware has up to 128
message objects. Supporting larger object buffers is a major surgery,
but it'd be definitely worth it especially as the driver does not
support HW message filtering ....
The logic of the "FIFO" implementation is to split the FIFO in half.
For the lower half we read the buffers and clear the interrupt pending
bit, but keep the newdat bit set, so the HW will queue above those
buffers.
When we read out the last low buffer then we reenable all the low half
buffers by clearing the newdat bit.
The upper half buffers clear the newdat and the interrupt pending bit
right away as we know that the lower half bits are clear and give us a
headstart against the hardware.
Now the implementation is:
transfer_message_object()
read_object_and_put_into_skb();
if (obj < END_OF_LOW_BUF)
clear_intpending(obj)
else if (obj > END_OF_LOW_BUF)
clear_intpending_and_newdat(obj)
else if (obj == END_OF_LOW_BUF)
clear_newdat_of_all_low_objects()
The hardware allows to avoid most of the mess simply because we can
tell the transfer_message_object() function to clear bits right away.
So we can be clever and do:
if (obj <= END_OF_LOW_BUF)
ctrl = TRANSFER_MSG | CLEAR_INTPND;
else
ctrl = TRANSFER_MSG | CLEAR_INTPND | CLEAR_NEWDAT;
transfer_message_object(ctrl)
read_object_and_put_into_skb();
if (obj == END_OF_LOW_BUF)
clear_newdat_of_all_low_objects()
So we save a complete control operation on all message objects except
the one which is the end of the low buffer. That's a few micro seconds
per object.
I'm not adding a boasting profile to that, simply because it's self
explaining.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[mkl: adjusted subject and commit message]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
If every other line contains line breaks, that's a clear sign for
indentation level madness. Split out the inner loop and move the code
to a separate function. gcc creates slightly worse code for that, but
we'll fix that in the next step.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[mkl: adjusted subject]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The network core does not serialize the access to the hardware. The
xmit related code lets the following happen:
CPU0 CPU1
interrupt()
do_poll()
c_can_do_tx()
Fiddle with HW and xmit()
internal data Fiddle with HW and
internal data
due the complete lack of serialization.
Add proper locking.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The rx_poll code has the following gem:
if (msg_ctrl_save & IF_MCONT_EOB)
return num_rx_pkts;
The EOB bit is the indicator for the hardware that this is the last
configured FIFO object. But this object can contain valid data, if we
manage to free up objects before the overrun case hits.
Now if the code exits due to the EOB bit set, then this buffer is
stale and the interrupt bit and NewDat bit of the buffer are still
set. Results in a nice interrupt storm unless we come into an overrun
situation where the MSGLST bit gets set.
ksoftirqd/0-3 [000] ..s. 79.124101: c_can_poll: rx_poll: val: 00008001 pend 00008001
ksoftirqd/0-3 [000] ..s. 79.124176: c_can_poll: rx_poll: val: 00008000 pend 00008000
ksoftirqd/0-3 [000] ..s. 79.124187: c_can_poll: rx_poll: val: 00008002 pend 00008002
ksoftirqd/0-3 [000] ..s. 79.124256: c_can_poll: rx_poll: val: 00008000 pend 00008000
ksoftirqd/0-3 [000] ..s. 79.124267: c_can_poll: rx_poll: val: 00008000 pend 00008000
The amazing thing is that the check of the MSGLST (aka overrun bit)
used to be after the check of the EOB bit. That was "fixed" in commit
5d0f801a2c(can: c_can: Fix RX message handling, handle lost message
before EOB). But the author of this "fix" did not even understand that
the EOB check is broken as well.
Again a simple solution: Remove
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[mkl: adjusted subject and commit message]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The lost message handling is broken in several ways.
1) Clearing the message lost flag is done by writing 0 to the
message control register of the object.
#define IF_MCONT_CLR_MSGLST (0 << 14)
That clears the object buffer configuration in the worst case,
which results in a loss of the EOB flag. That leaves the FIFO chain
without a limit and causes a complete lockup of the HW
2) In case that the error skb allocation fails, the code happily
claims that it handed down a packet. Just an accounting bug, but ....
3) The code adds a lot of pointless overhead to that error case, where
we need to get stuff done as fast as possible to avoid more packet
loss.
- printk an annoying error message
- reread the object buffer for nothing
Fix is simple again:
- Use the already known MSGCTRL content and only clear the MSGLST bit
- Fix the buffer accounting by adding a proper return code
- Remove the pointless operations
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The buffer handling of c_can has been broken forever. That leads to
message reordering:
ksoftirqd/0-3 [000] ..s. 79.123776: c_can_poll: rx_poll: val: 00007fff
ksoftirqd/0-3 [000] ..s. 79.124101: c_can_poll: rx_poll: val: 00008001
What happens is:
CPU HW
queue new packet into obj 16 (0-15 are busy)
read obj 1-15
return because pending is 0
set pending obj 16 -> pending reg 8000
queue new packet into obj 1
set pending obj 1 -> pending reg 8001
So the current algorithmus reads the newest message first, which
violates the ordering rules of CAN.
Add proper handling of that situation by analyzing the contents of the
pending register for gaps.
This does NOT fix the message object corruption which can lead to
interrupt storms. Thats addressed in the next patches.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[mkl: adjusted subject]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The hardware has two message control interfaces, but the code only uses the
first one. So on SMP the following can be observed:
CPU0 CPU1
rx_poll()
write IF1 xmit()
write IF1
write IF1
That results in corrupted message object configurations. The TX/RX is not
globally serialized it's only serialized on a core.
Simple solution: Let RX use IF1 and TX use IF2 and all is good.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>