If NET_SKB_PAD is not a multiple of the cache line size, mv643xx_eth
allocates a couple of extra bytes at the start of each receive buffer
to make the data payload end up on a cache line boundary.
These extra bytes are skb_reserve()'d before DMA mapping, so they
should not be included in the DMA map byte count (as the mapping is
done starting at skb->data), nor should they be included in the
receive descriptor buffer size field, or the hardware can end up
DMAing beyond the end of the buffer, which can happen if someone
sends us a larger-than-MTU sized packet.
This problem was introduced in commit 7fd96ce47f ("mv643xx_eth:
rework receive skb cache alignment", May 6 2009), but hasn't appeared
to be problematic so far, probably as the main users of mv643xx_eth
all have NET_SKB_PAD == L1_CACHE_BYTES.
Signed-off-by: Saeed Bishara <saeed@marvell.com>
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The txq_set_wrr() function in drivers/net/mv643xx_eth.c is
unused, not even referenced under #if 0 or something like that,
which results in a compile-time warning:
drivers/net/mv643xx_eth.c:1070: warning: 'txq_set_wrr' defined but not used
Fix: remove it.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Unicast Promiscious Mode (UPM) bit in the mv643xx_eth port
configuration register doesn't do exactly what its name would suggest:
setting this bit merely enables reception of all unicast frames with a
destination address that differs from our local MAC address in bits
[47:4]. In particular, it doesn't have any effect on unicast frames
with a destination address that matches our MAC address in bits [47:4]
-- these will still be tested against the 16-entry unicast address
filter table.
Therefore, if the interface is set to promiscuous mode, just setting
the unicast promiscuous bit isn't enough -- we need to set all filter
bits in the unicast filter table to 1 as well.
Reported-by: Sachin Sanap <ssanap@marvell.com>
Signed-off-by: Prabhanjan Sarnaik <sarnaik@marvell.com>
Tested-by: Siddarth Gore <gores@marvell.com>
Tested-by: Mahavir Jain <mjain@marvell.com>
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch is inspired by patch recently posted by Johannes Berg. Basically what
my patch does is to group list and a count of addresses into newly introduced
structure netdev_hw_addr_list. This brings us two benefits:
1) struct net_device becames a bit nicer.
2) in the future there will be a possibility to operate with lists independently
on netdevices (with exporting right functions).
I wanted to introduce this patch before I'll post a multicast lists conversion.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
drivers/net/bnx2.c | 4 +-
drivers/net/e1000/e1000_main.c | 4 +-
drivers/net/ixgbe/ixgbe_main.c | 6 +-
drivers/net/mv643xx_eth.c | 2 +-
drivers/net/niu.c | 4 +-
drivers/net/virtio_net.c | 10 ++--
drivers/s390/net/qeth_l2_main.c | 2 +-
include/linux/netdevice.h | 17 +++--
net/core/dev.c | 130 ++++++++++++++++++--------------------
9 files changed, 89 insertions(+), 90 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch converts unicast address list to standard list_head using
previously introduced struct netdev_hw_addr. It also relaxes the
locking. Original spinlock (still used for multicast addresses) is not
needed and is no longer used for a protection of this list. All
reading and writing takes place under rtnl (with no changes).
I also removed a possibility to specify the length of the address
while adding or deleting unicast address. It's always dev->addr_len.
The convertion touched especially e1000 and ixgbe codes when the
change is not so trivial.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
drivers/net/bnx2.c | 13 +--
drivers/net/e1000/e1000_main.c | 24 +++--
drivers/net/ixgbe/ixgbe_common.c | 14 ++--
drivers/net/ixgbe/ixgbe_common.h | 4 +-
drivers/net/ixgbe/ixgbe_main.c | 6 +-
drivers/net/ixgbe/ixgbe_type.h | 4 +-
drivers/net/macvlan.c | 11 +-
drivers/net/mv643xx_eth.c | 11 +-
drivers/net/niu.c | 7 +-
drivers/net/virtio_net.c | 7 +-
drivers/s390/net/qeth_l2_main.c | 6 +-
drivers/scsi/fcoe/fcoe.c | 16 ++--
include/linux/netdevice.h | 18 ++--
net/8021q/vlan.c | 4 +-
net/8021q/vlan_dev.c | 10 +-
net/core/dev.c | 195 +++++++++++++++++++++++++++-----------
net/dsa/slave.c | 10 +-
net/packet/af_packet.c | 4 +-
18 files changed, 227 insertions(+), 137 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
After 2.6.29, PPC no more admits passing NULL to the dev parameter of
the DMA API. The result is a BUG followed by solid lock-up when the
mv643xx_eth driver brings an interface up. The following patch makes
the driver work on my Pegasos again; it is mostly a search and replace
of NULL by mp->dev->dev.parent in dma allocation/freeing/mapping/unmapping
functions.
Signed-off-by: Gabriel Paubert <paubert@iram.es>
Acked-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is not a good idea to blindly unmask the RX and TX_END interrupts
for all eight queues on all mv643xx_eth hardware, since some variations
of the hardware have less than eight transmit/receive queues, and the
RX/TX_END interrupts for the queues they don't have can be in use by
other interrupt sources.
Signed-off-by: Saeed Bishara <saeed@marvell.com>
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On the platforms that mv643xx_eth is used on, the manual skb->data
alignment logic in mv643xx_eth can be simplified, as the only case we
need to handle is where NET_SKB_PAD is not a multiple of the cache
line size. If this is the case, the extra padding we need can be
computed at compile time, while if NET_SKB_PAD _is_ a multiple of
the cache line size, the code can be optimised out entirely.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the definitions for the SDMA and port serial configuration
register values to where all the other register definitions live,
and expand the shifts to 32 bit constants.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On several mv643xx_eth hardware versions, the two 64bit mib counters
for 'good octets received' and 'good octets sent' are actually 32bit
counters, and reading from the upper half of the register has the same
effect as reading from the lower half of the register: an atomic
read-and-clear of the entire 32bit counter value. This can under heavy
traffic occasionally lead to small numbers being added to the upper
half of the 64bit mib counter even though no 32bit wrap has occured.
Since we poll the mib counters at least every 30 seconds anyway, we
might as well just skip the reads of the upper halves of the hardware
counters without breaking the stats, which this patch does.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, when OOM occurs during rx ring refill, mv643xx_eth will get
into an infinite loop, due to the refill function setting the OOM bit
but not clearing the 'rx refill needed' bit for this queue, while the
calling function (the NAPI poll handler) will call the refill function
in a loop until the 'rx refill needed' bit goes off, without checking
the OOM bit.
This patch fixes this by checking the OOM bit in the NAPI poll handler
before attempting to do rx refill. This means that once OOM occurs,
we won't try to do any memory allocations again until the next invocation
of the poll handler.
While we're at it, change the OOM flag to be a single bit instead of
one bit per receive queue since OOM is a system state rather than a
per-queue state, and cancel the OOM timer on entry to the NAPI poll
handler if it's running to prevent it from firing when we've already
come out of OOM.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Move SDMA configuration from interface up to port probe, to prevent
overwriting the receive coalescing timer value on interface up.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When mv643xx_eth_open() is called to up an interface, port_start()
will first re-program the unicast address filter, and then
re-initialise the PORT_CONFIG register, but that will disable unicast
promiscuous mode if it was enabled by the unicast address filter setup.
This isn't a problem on ifconfig up, as ->set_rx_mode() will be called
shortly afterwards which will program the filters again, but it does
trigger when changing the MTU, which calls mv643xx_eth_stop() and then
mv643xx_eth_open() by hand to repopulate the receive rings with skbuffs
of the new size.
Swap the initialisation of the PORT_START register and the call to
the unicast filter setup function to fix this.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A receive coalescing timeout of 250 usec appears to strike a good
balance between allowing enough received frames to be aggregated for
LRO to do its job and not allowing the connection to stall due to
delaying ACKs to the remote end for too long.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move the netif_carrier_off() call in ->open() to port probe, so that
ethtool doesn't report the link as being up before we have up'd the
interface.
Move initialisation of the rx/tx coalescing timers from ->open() to
port probe, so that we don't reset the coalescing timers every time
the interface is up'd.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mib_counters_update() also restarts the timer.
So the timer is dequeued, the stats are read and then the timer is
enqueued again. This is "okay" unless someone unloads the module.
The locking here is also broken:
mib_counters_update() grabs just a simple spinlock. The only thing the
lock is good for is to protect the timer func against other callers
namely mv643xx_eth_stop() && mv643xx_eth_get_ethtool_stats(). That means
if the spinlock is taken via the ethtool path and than the timer kicks
in then the box will lock up.
Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Acked-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Controlled by a compile-time (Kconfig) option for now, since it
isn't a win in all cases.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename the mp->default_[rt]x_ring_size variables to ->[rt]x_ring_size,
allow them to be read via the standard ethtool ->get_ringparam() op,
and add a ->set_ringparam() op to allow resizing them at run time.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch:
- increases the precision of the receive/transmit interrupt
coalescing register value computations by using 64bit temporaries;
- adds functions to read the current hardware coalescing register
values and convert them back to usecs;
- exports the {get,set} {rx,tx} coal methods via the standard
ethtool coalescing interface.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's a waste having two different versions of this structure around
when the differences between ethtool ops for phy'd and phy-less
interfaces are so minor.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Contrary to what the docs say, the 'extended interrupt cause' bit in
the interrupt cause register (bit 1) appears to not be maskable on at
least some of the mv643xx_eth platforms, making writing zeroes to the
interrupt mask register but not the extended interrupt mask register
insufficient to stop interrupts from occuring. Therefore, also write
zeroes to the extended interrupt mask register when shutting down the
port.
This fixes the interrupt storm seen on the Pegasos board when shutting
down the interface.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 66e63ffbc0 ("mv643xx_eth:
implement ->set_rx_mode()") cleaned up mv643xx_eth's multicast filter
programming, but broke it as well.
The non-special multicast filter table (for multicast addresses that
are not of the form 01:00:5e:00:00:xx) consists of 256 hash table
buckets organised as 64 32-bit words, where the 'accept' bits are
in the LSB of each byte, so in bits 24 16 8 0 of each 32-bit word.
The old code got this right, but the referenced commit broke this by
using bits 3 2 1 0 instead. This commit fixes this up.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit cd4ccf76bf.
On the Pegasos board, we can't do DMA burst that are longer than
one cache line. For now, go back to using 32 byte DMA bursts for
all mv643xx_eth platforms -- we can switch the ARM-based platforms
back to doing long 128 byte bursts in the next development cycle.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Reported-by: Alan Curry <pacman@kosh.dhis.org>
Reported-by: Gabriel Paubert <paubert@iram.es>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, if multiple unicast addresses are programmed into a
mv643xx_eth interface, the core networking will resort to enabling
promiscuous mode on the interface, as mv643xx_eth does not implement
->set_rx_mode().
This patch switches mv643xx_eth over from ->set_multicast_list()
to ->set_rx_mode(), and implements support for secondary unicast
addresses. The hardware can handle multiple unicast addresses as
long as their first 11 nibbles are the same (i.e. are of the form
xx:xx:xx:xx:xx:xy where the x part is the same for all addresses), so
if that is the case, we use that mode. If it's not the case, we enable
unicast promiscuous mode in the hardware, which is slightly better than
enabling promiscuous mode for multicasts as well, which is what would
happen before.
While we are at it, change the programming sequence so that we
don't clear all filter bits first, so we don't lose all incoming
packets while the filter is being reprogrammed.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since txq_alloc_desc_index() is a very simple function, and since
descriptor ring index handling for transmit reclaim, receive
processing and receive refill is already handled inline as well,
inline txq_alloc_desc_index() into its two call sites.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mv643xx_eth driver uses the rdl()/wrl() macros to read and
write hardware registers. Per-port registers are accessed in the
following way:
#define PORT_STATUS(p) (0x0444 + ((p) << 10))
[...]
static inline u32 rdl(struct mv643xx_eth_private *mp, int offset)
{
return readl(mp->shared->base + offset);
}
[...]
port_status = rdl(mp, PORT_STATUS(mp->port_num));
By giving the per-port 'struct mv643xx_eth_private' its own
'void __iomem *base' pointer that points to the per-port register
area, we can get rid of both the double indirection and the << 10
that is done for every per-port register access -- this patch does
that.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix up a couple of coding style issues caught by checkpatch.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When mv643xx_eth allocates skbuffs, it adds
'dma_get_cache_alignment() - 1' to the length it needs, so that it can
align the skb's ->data pointer to a cache boundary. When checking
whether a transmitted skbuff can be reused as a receive buffer, these
bytes needs to be included into the minimum bound for the recycle check.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The generic packet receive code takes care of setting
netdev->last_rx when necessary, for the sake of the
bonding ARP monitor.
Drivers need not do it any more.
Some cases had to be skipped over because the drivers
were making use of the ->last_rx value themselves.
Signed-off-by: David S. Miller <davem@davemloft.net>
The mv643xx_eth mii bus implementation uses wait_event_timeout() to
wait for SMI completion interrupts.
If wait_event_timeout() would return zero, mv643xx_eth would conclude
that the SMI access timed out, but this is not necessarily true --
wait_event_timeout() can also return zero in the case where the SMI
completion interrupt did happen in time but where it took longer than
the requested timeout for the process performing the SMI access to be
scheduled again. This would lead to occasional SMI access timeouts
when the system would be under heavy load.
The fix is to ignore the return value of wait_event_timeout(), and
to re-check the SMI done bit after wait_event_timeout() returns to
determine whether or not the SMI access timed out.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
This converts pretty much everything to print_mac. There were
a few things that had conflicts which I have just dropped for
now, no harm done.
I've built an allyesconfig with this and looked at the files
that weren't built very carefully, but it's a huge patch.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
mv643xx_eth uses ip_hdr() (defined in linux/ip.h), but relied on
another header file to include the needed header file indirectly.
In latest net-next this indirect include chain is gone, so the
driver fails to build. Include linux/ip.h explicitly to fix this.
Signed-off-by: Lennert Buytenhek <buytenh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>