linux/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c

6687 lines
178 KiB
C
Raw Normal View History

// SPDX-License-Identifier: GPL-2.0
/*
* Driver for Marvell PPv2 network controller for Armada 375 SoC.
*
* Copyright (C) 2014 Marvell
*
* Marcin Wojtas <mw@semihalf.com>
*/
net: mvpp2: enable ACPI support in the driver This patch introduces an alternative way of obtaining resources - via ACPI tables provided by firmware. Enabling coexistence with the DT support, in addition to the OF_*->device_*/fwnode_* API replacement, required following steps to be taken: * Add mvpp2_acpi_match table * Omit clock configuration and obtain tclk from the property - in ACPI world, the firmware is responsible for clock maintenance. * Disable comphy and syscon handling as they are not available for ACPI. * Modify way of obtaining interrupts - use newly introduced fwnode_irq_get() routine * Until proper MDIO bus and PHY handling with ACPI is established in the kernel, use only link interrupts feature in the driver. For the RGMII port it results in depending on GMAC settings done during firmware stage. * When booting with ACPI MVPP2_QDIST_MULTI_MODE is picked by default, as there is no need to keep any kind of the backward compatibility. Moreover, a memory region used by mvmdio driver is usually placed in the middle of the address space of the PP2 network controller. The MDIO base address is obtained without requesting memory region (by devm_ioremap() call) in mvmdio.c, later overlapping resources are requested by the network driver, which is responsible for avoiding a concurrent access. In case the MDIO memory region is declared in the ACPI, it can already appear as 'in-use' in the OS. Because it is overlapped by second region of the network controller, make sure it is released, before requesting it again. The care is taken by mvpp2 driver to avoid concurrent access to this memory region. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 20:31:44 +08:00
#include <linux/acpi.h>
#include <linux/kernel.h>
#include <linux/netdevice.h>
#include <linux/etherdevice.h>
#include <linux/platform_device.h>
#include <linux/skbuff.h>
#include <linux/inetdevice.h>
#include <linux/mbus.h>
#include <linux/module.h>
#include <linux/mfd/syscon.h>
#include <linux/interrupt.h>
#include <linux/cpumask.h>
#include <linux/of.h>
#include <linux/of_irq.h>
#include <linux/of_mdio.h>
#include <linux/of_net.h>
#include <linux/of_address.h>
#include <linux/of_device.h>
#include <linux/phy.h>
#include <linux/phylink.h>
#include <linux/phy/phy.h>
#include <linux/clk.h>
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
#include <linux/hrtimer.h>
#include <linux/ktime.h>
#include <linux/regmap.h>
#include <uapi/linux/ppp_defs.h>
#include <net/ip.h>
#include <net/ipv6.h>
#include <net/tso.h>
#include <linux/bpf_trace.h>
#include "mvpp2.h"
#include "mvpp2_prs.h"
#include "mvpp2_cls.h"
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
enum mvpp2_bm_pool_log_num {
MVPP2_BM_SHORT,
MVPP2_BM_LONG,
MVPP2_BM_JUMBO,
MVPP2_BM_POOLS_NUM
};
static struct {
int pkt_size;
int buf_num;
} mvpp2_pools[MVPP2_BM_POOLS_NUM];
/* The prototype is added here to be used in start_dev when using ACPI. This
* will be removed once phylink is used for all modes (dt+ACPI).
*/
net: phylink: Add struct phylink_config to PHYLINK API The phylink_config structure will encapsulate a pointer to a struct device and the operation type requested for this instance of PHYLINK. This patch does not make any functional changes, it just transitions the PHYLINK internals and all its users to the new API. A pointer to a phylink_config structure will be passed to phylink_create() instead of the net_device directly. Also, the same phylink_config pointer will be passed back to all phylink_mac_ops callbacks instead of the net_device. Using this mechanism, a PHYLINK user can get the original net_device using a structure such as 'to_net_dev(config->dev)' or directly the structure containing the phylink_config using a container_of call. At the moment, only the PHYLINK_NETDEV is defined as a valid operation type for PHYLINK. In this mode, a valid reference to a struct device linked to the original net_device should be passed to PHYLINK through the phylink_config structure. This API changes is mainly driven by the necessity of adding a new operation type in PHYLINK that disconnects the phy_device from the net_device and also works when the net_device is lacking. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29 01:38:12 +08:00
static void mvpp2_mac_config(struct phylink_config *config, unsigned int mode,
const struct phylink_link_state *state);
static void mvpp2_mac_link_up(struct phylink_config *config,
struct phy_device *phy,
unsigned int mode, phy_interface_t interface,
int speed, int duplex,
bool tx_pause, bool rx_pause);
/* Queue modes */
#define MVPP2_QDIST_SINGLE_MODE 0
#define MVPP2_QDIST_MULTI_MODE 1
static int queue_mode = MVPP2_QDIST_MULTI_MODE;
module_param(queue_mode, int, 0444);
MODULE_PARM_DESC(queue_mode, "Set queue_mode (single=0, multi=1)");
/* Utility/helper methods */
void mvpp2_write(struct mvpp2 *priv, u32 offset, u32 data)
{
writel(data, priv->swth_base[0] + offset);
}
u32 mvpp2_read(struct mvpp2 *priv, u32 offset)
{
return readl(priv->swth_base[0] + offset);
}
static u32 mvpp2_read_relaxed(struct mvpp2 *priv, u32 offset)
{
return readl_relaxed(priv->swth_base[0] + offset);
}
static inline u32 mvpp2_cpu_to_thread(struct mvpp2 *priv, int cpu)
{
return cpu % priv->nthreads;
}
static struct page_pool *
mvpp2_create_page_pool(struct device *dev, int num, int len,
enum dma_data_direction dma_dir)
{
struct page_pool_params pp_params = {
/* internal DMA mapping in page_pool */
.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
.pool_size = num,
.nid = NUMA_NO_NODE,
.dev = dev,
.dma_dir = dma_dir,
.offset = MVPP2_SKB_HEADROOM,
.max_len = len,
};
return page_pool_create(&pp_params);
}
/* These accessors should be used to access:
*
* - per-thread registers, where each thread has its own copy of the
* register.
*
* MVPP2_BM_VIRT_ALLOC_REG
* MVPP2_BM_ADDR_HIGH_ALLOC
* MVPP22_BM_ADDR_HIGH_RLS_REG
* MVPP2_BM_VIRT_RLS_REG
* MVPP2_ISR_RX_TX_CAUSE_REG
* MVPP2_ISR_RX_TX_MASK_REG
* MVPP2_TXQ_NUM_REG
* MVPP2_AGGR_TXQ_UPDATE_REG
* MVPP2_TXQ_RSVD_REQ_REG
* MVPP2_TXQ_RSVD_RSLT_REG
* MVPP2_TXQ_SENT_REG
* MVPP2_RXQ_NUM_REG
*
* - global registers that must be accessed through a specific thread
* window, because they are related to an access to a per-thread
* register
*
* MVPP2_BM_PHY_ALLOC_REG (related to MVPP2_BM_VIRT_ALLOC_REG)
* MVPP2_BM_PHY_RLS_REG (related to MVPP2_BM_VIRT_RLS_REG)
* MVPP2_RXQ_THRESH_REG (related to MVPP2_RXQ_NUM_REG)
* MVPP2_RXQ_DESC_ADDR_REG (related to MVPP2_RXQ_NUM_REG)
* MVPP2_RXQ_DESC_SIZE_REG (related to MVPP2_RXQ_NUM_REG)
* MVPP2_RXQ_INDEX_REG (related to MVPP2_RXQ_NUM_REG)
* MVPP2_TXQ_PENDING_REG (related to MVPP2_TXQ_NUM_REG)
* MVPP2_TXQ_DESC_ADDR_REG (related to MVPP2_TXQ_NUM_REG)
* MVPP2_TXQ_DESC_SIZE_REG (related to MVPP2_TXQ_NUM_REG)
* MVPP2_TXQ_INDEX_REG (related to MVPP2_TXQ_NUM_REG)
* MVPP2_TXQ_PENDING_REG (related to MVPP2_TXQ_NUM_REG)
* MVPP2_TXQ_PREF_BUF_REG (related to MVPP2_TXQ_NUM_REG)
* MVPP2_TXQ_PREF_BUF_REG (related to MVPP2_TXQ_NUM_REG)
*/
static void mvpp2_thread_write(struct mvpp2 *priv, unsigned int thread,
u32 offset, u32 data)
{
writel(data, priv->swth_base[thread] + offset);
}
static u32 mvpp2_thread_read(struct mvpp2 *priv, unsigned int thread,
u32 offset)
{
return readl(priv->swth_base[thread] + offset);
}
static void mvpp2_thread_write_relaxed(struct mvpp2 *priv, unsigned int thread,
u32 offset, u32 data)
{
writel_relaxed(data, priv->swth_base[thread] + offset);
}
static u32 mvpp2_thread_read_relaxed(struct mvpp2 *priv, unsigned int thread,
u32 offset)
{
return readl_relaxed(priv->swth_base[thread] + offset);
}
static dma_addr_t mvpp2_txdesc_dma_addr_get(struct mvpp2_port *port,
struct mvpp2_tx_desc *tx_desc)
{
if (port->priv->hw_version == MVPP21)
return le32_to_cpu(tx_desc->pp21.buf_dma_addr);
else
return le64_to_cpu(tx_desc->pp22.buf_dma_addr_ptp) &
MVPP2_DESC_DMA_MASK;
}
static void mvpp2_txdesc_dma_addr_set(struct mvpp2_port *port,
struct mvpp2_tx_desc *tx_desc,
dma_addr_t dma_addr)
{
dma_addr_t addr, offset;
addr = dma_addr & ~MVPP2_TX_DESC_ALIGN;
offset = dma_addr & MVPP2_TX_DESC_ALIGN;
if (port->priv->hw_version == MVPP21) {
tx_desc->pp21.buf_dma_addr = cpu_to_le32(addr);
tx_desc->pp21.packet_offset = offset;
} else {
__le64 val = cpu_to_le64(addr);
tx_desc->pp22.buf_dma_addr_ptp &= ~cpu_to_le64(MVPP2_DESC_DMA_MASK);
tx_desc->pp22.buf_dma_addr_ptp |= val;
tx_desc->pp22.packet_offset = offset;
}
}
static size_t mvpp2_txdesc_size_get(struct mvpp2_port *port,
struct mvpp2_tx_desc *tx_desc)
{
if (port->priv->hw_version == MVPP21)
return le16_to_cpu(tx_desc->pp21.data_size);
else
return le16_to_cpu(tx_desc->pp22.data_size);
}
static void mvpp2_txdesc_size_set(struct mvpp2_port *port,
struct mvpp2_tx_desc *tx_desc,
size_t size)
{
if (port->priv->hw_version == MVPP21)
tx_desc->pp21.data_size = cpu_to_le16(size);
else
tx_desc->pp22.data_size = cpu_to_le16(size);
}
static void mvpp2_txdesc_txq_set(struct mvpp2_port *port,
struct mvpp2_tx_desc *tx_desc,
unsigned int txq)
{
if (port->priv->hw_version == MVPP21)
tx_desc->pp21.phys_txq = txq;
else
tx_desc->pp22.phys_txq = txq;
}
static void mvpp2_txdesc_cmd_set(struct mvpp2_port *port,
struct mvpp2_tx_desc *tx_desc,
unsigned int command)
{
if (port->priv->hw_version == MVPP21)
tx_desc->pp21.command = cpu_to_le32(command);
else
tx_desc->pp22.command = cpu_to_le32(command);
}
static unsigned int mvpp2_txdesc_offset_get(struct mvpp2_port *port,
struct mvpp2_tx_desc *tx_desc)
{
if (port->priv->hw_version == MVPP21)
return tx_desc->pp21.packet_offset;
else
return tx_desc->pp22.packet_offset;
}
static dma_addr_t mvpp2_rxdesc_dma_addr_get(struct mvpp2_port *port,
struct mvpp2_rx_desc *rx_desc)
{
if (port->priv->hw_version == MVPP21)
return le32_to_cpu(rx_desc->pp21.buf_dma_addr);
else
return le64_to_cpu(rx_desc->pp22.buf_dma_addr_key_hash) &
MVPP2_DESC_DMA_MASK;
}
static unsigned long mvpp2_rxdesc_cookie_get(struct mvpp2_port *port,
struct mvpp2_rx_desc *rx_desc)
{
if (port->priv->hw_version == MVPP21)
return le32_to_cpu(rx_desc->pp21.buf_cookie);
else
return le64_to_cpu(rx_desc->pp22.buf_cookie_misc) &
MVPP2_DESC_DMA_MASK;
}
static size_t mvpp2_rxdesc_size_get(struct mvpp2_port *port,
struct mvpp2_rx_desc *rx_desc)
{
if (port->priv->hw_version == MVPP21)
return le16_to_cpu(rx_desc->pp21.data_size);
else
return le16_to_cpu(rx_desc->pp22.data_size);
}
static u32 mvpp2_rxdesc_status_get(struct mvpp2_port *port,
struct mvpp2_rx_desc *rx_desc)
{
if (port->priv->hw_version == MVPP21)
return le32_to_cpu(rx_desc->pp21.status);
else
return le32_to_cpu(rx_desc->pp22.status);
}
static void mvpp2_txq_inc_get(struct mvpp2_txq_pcpu *txq_pcpu)
{
txq_pcpu->txq_get_index++;
if (txq_pcpu->txq_get_index == txq_pcpu->size)
txq_pcpu->txq_get_index = 0;
}
static void mvpp2_txq_inc_put(struct mvpp2_port *port,
struct mvpp2_txq_pcpu *txq_pcpu,
void *data,
struct mvpp2_tx_desc *tx_desc,
enum mvpp2_tx_buf_type buf_type)
{
struct mvpp2_txq_pcpu_buf *tx_buf =
txq_pcpu->buffs + txq_pcpu->txq_put_index;
tx_buf->type = buf_type;
if (buf_type == MVPP2_TYPE_SKB)
tx_buf->skb = data;
else
tx_buf->xdpf = data;
tx_buf->size = mvpp2_txdesc_size_get(port, tx_desc);
tx_buf->dma = mvpp2_txdesc_dma_addr_get(port, tx_desc) +
mvpp2_txdesc_offset_get(port, tx_desc);
txq_pcpu->txq_put_index++;
if (txq_pcpu->txq_put_index == txq_pcpu->size)
txq_pcpu->txq_put_index = 0;
}
/* Get number of maximum RXQ */
static int mvpp2_get_nrxqs(struct mvpp2 *priv)
{
unsigned int nrxqs;
if (priv->hw_version == MVPP22 && queue_mode == MVPP2_QDIST_SINGLE_MODE)
return 1;
/* According to the PPv2.2 datasheet and our experiments on
* PPv2.1, RX queues have an allocation granularity of 4 (when
* more than a single one on PPv2.2).
* Round up to nearest multiple of 4.
*/
nrxqs = (num_possible_cpus() + 3) & ~0x3;
if (nrxqs > MVPP2_PORT_MAX_RXQ)
nrxqs = MVPP2_PORT_MAX_RXQ;
return nrxqs;
}
/* Get number of physical egress port */
static inline int mvpp2_egress_port(struct mvpp2_port *port)
{
return MVPP2_MAX_TCONT + port->id;
}
/* Get number of physical TXQ */
static inline int mvpp2_txq_phys(int port, int txq)
{
return (MVPP2_MAX_TCONT + port) * MVPP2_MAX_TXQ + txq;
}
/* Returns a struct page if page_pool is set, otherwise a buffer */
static void *mvpp2_frag_alloc(const struct mvpp2_bm_pool *pool,
struct page_pool *page_pool)
{
if (page_pool)
return page_pool_dev_alloc_pages(page_pool);
if (likely(pool->frag_size <= PAGE_SIZE))
return netdev_alloc_frag(pool->frag_size);
return kmalloc(pool->frag_size, GFP_ATOMIC);
}
static void mvpp2_frag_free(const struct mvpp2_bm_pool *pool,
struct page_pool *page_pool, void *data)
{
if (page_pool)
page_pool_put_full_page(page_pool, virt_to_head_page(data), false);
else if (likely(pool->frag_size <= PAGE_SIZE))
skb_free_frag(data);
else
kfree(data);
}
/* Buffer Manager configuration routines */
/* Create pool */
static int mvpp2_bm_pool_create(struct device *dev, struct mvpp2 *priv,
struct mvpp2_bm_pool *bm_pool, int size)
{
u32 val;
/* Number of buffer pointers must be a multiple of 16, as per
* hardware constraints
*/
if (!IS_ALIGNED(size, 16))
return -EINVAL;
/* PPv2.1 needs 8 bytes per buffer pointer, PPv2.2 needs 16
* bytes per buffer pointer
*/
if (priv->hw_version == MVPP21)
bm_pool->size_bytes = 2 * sizeof(u32) * size;
else
bm_pool->size_bytes = 2 * sizeof(u64) * size;
bm_pool->virt_addr = dma_alloc_coherent(dev, bm_pool->size_bytes,
&bm_pool->dma_addr,
GFP_KERNEL);
if (!bm_pool->virt_addr)
return -ENOMEM;
if (!IS_ALIGNED((unsigned long)bm_pool->virt_addr,
MVPP2_BM_POOL_PTR_ALIGN)) {
dma_free_coherent(dev, bm_pool->size_bytes,
bm_pool->virt_addr, bm_pool->dma_addr);
dev_err(dev, "BM pool %d is not %d bytes aligned\n",
bm_pool->id, MVPP2_BM_POOL_PTR_ALIGN);
return -ENOMEM;
}
mvpp2_write(priv, MVPP2_BM_POOL_BASE_REG(bm_pool->id),
lower_32_bits(bm_pool->dma_addr));
mvpp2_write(priv, MVPP2_BM_POOL_SIZE_REG(bm_pool->id), size);
val = mvpp2_read(priv, MVPP2_BM_POOL_CTRL_REG(bm_pool->id));
val |= MVPP2_BM_START_MASK;
mvpp2_write(priv, MVPP2_BM_POOL_CTRL_REG(bm_pool->id), val);
bm_pool->size = size;
bm_pool->pkt_size = 0;
bm_pool->buf_num = 0;
return 0;
}
/* Set pool buffer size */
static void mvpp2_bm_pool_bufsize_set(struct mvpp2 *priv,
struct mvpp2_bm_pool *bm_pool,
int buf_size)
{
u32 val;
bm_pool->buf_size = buf_size;
val = ALIGN(buf_size, 1 << MVPP2_POOL_BUF_SIZE_OFFSET);
mvpp2_write(priv, MVPP2_POOL_BUF_SIZE_REG(bm_pool->id), val);
}
static void mvpp2_bm_bufs_get_addrs(struct device *dev, struct mvpp2 *priv,
struct mvpp2_bm_pool *bm_pool,
dma_addr_t *dma_addr,
phys_addr_t *phys_addr)
{
unsigned int thread = mvpp2_cpu_to_thread(priv, get_cpu());
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
*dma_addr = mvpp2_thread_read(priv, thread,
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
MVPP2_BM_PHY_ALLOC_REG(bm_pool->id));
*phys_addr = mvpp2_thread_read(priv, thread, MVPP2_BM_VIRT_ALLOC_REG);
if (priv->hw_version == MVPP22) {
u32 val;
u32 dma_addr_highbits, phys_addr_highbits;
val = mvpp2_thread_read(priv, thread, MVPP22_BM_ADDR_HIGH_ALLOC);
dma_addr_highbits = (val & MVPP22_BM_ADDR_HIGH_PHYS_MASK);
phys_addr_highbits = (val & MVPP22_BM_ADDR_HIGH_VIRT_MASK) >>
MVPP22_BM_ADDR_HIGH_VIRT_SHIFT;
if (sizeof(dma_addr_t) == 8)
*dma_addr |= (u64)dma_addr_highbits << 32;
if (sizeof(phys_addr_t) == 8)
*phys_addr |= (u64)phys_addr_highbits << 32;
}
put_cpu();
}
/* Free all buffers from the pool */
static void mvpp2_bm_bufs_free(struct device *dev, struct mvpp2 *priv,
struct mvpp2_bm_pool *bm_pool, int buf_num)
{
struct page_pool *pp = NULL;
int i;
if (buf_num > bm_pool->buf_num) {
WARN(1, "Pool does not have so many bufs pool(%d) bufs(%d)\n",
bm_pool->id, buf_num);
buf_num = bm_pool->buf_num;
}
if (priv->percpu_pools)
pp = priv->page_pool[bm_pool->id];
for (i = 0; i < buf_num; i++) {
dma_addr_t buf_dma_addr;
net: mvpp2: store physical address of buffer in rx_desc->buf_cookie The RX descriptors of the PPv2 hardware allow to store several information, amongst which: - the DMA address of the buffer in which the data has been received - a "cookie" field, left to the use of the driver, and not used by the hardware In the current implementation, the "cookie" field is used to store the virtual address of the buffer, so that in the receive completion path, we can easily get the virtual address of the buffer that corresponds to a completed RX descriptors. On PPv2.1, used on 32-bit platforms, those two fields are 32-bit wide, which is enough to store a DMA address in the first field, and a virtual address in the second field. On PPv2.2, used on 64-bit platforms, these two fields have been extended to 40 bits. While 40 bits is enough to store a DMA address (as long as the DMA mask is 40 bits or lower), it is not enough to store a virtual address. Therefore, the "cookie" field can no longer be used to store the virtual address of the buffer. However, as Russell King pointed out, the RX buffers are always allocated in the kernel linear mapping, and therefore using phys_to_virt() on the physical address of the RX buffer is possible and correct. Therefore, this commit changes the driver to use the "cookie" field to store the physical address instead of the virtual address. phys_to_virt() is used in the receive completion path to retrieve the virtual address from the physical address. It is obviously important to realize that the DMA address and physical address are two different things, which is why we store both in the RX descriptors. While those addresses may be identical in some situations, it remains two distinct concepts, and both addresses should be handled separately. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:04 +08:00
phys_addr_t buf_phys_addr;
void *data;
mvpp2_bm_bufs_get_addrs(dev, priv, bm_pool,
&buf_dma_addr, &buf_phys_addr);
if (!pp)
dma_unmap_single(dev, buf_dma_addr,
bm_pool->buf_size, DMA_FROM_DEVICE);
net: mvpp2: store physical address of buffer in rx_desc->buf_cookie The RX descriptors of the PPv2 hardware allow to store several information, amongst which: - the DMA address of the buffer in which the data has been received - a "cookie" field, left to the use of the driver, and not used by the hardware In the current implementation, the "cookie" field is used to store the virtual address of the buffer, so that in the receive completion path, we can easily get the virtual address of the buffer that corresponds to a completed RX descriptors. On PPv2.1, used on 32-bit platforms, those two fields are 32-bit wide, which is enough to store a DMA address in the first field, and a virtual address in the second field. On PPv2.2, used on 64-bit platforms, these two fields have been extended to 40 bits. While 40 bits is enough to store a DMA address (as long as the DMA mask is 40 bits or lower), it is not enough to store a virtual address. Therefore, the "cookie" field can no longer be used to store the virtual address of the buffer. However, as Russell King pointed out, the RX buffers are always allocated in the kernel linear mapping, and therefore using phys_to_virt() on the physical address of the RX buffer is possible and correct. Therefore, this commit changes the driver to use the "cookie" field to store the physical address instead of the virtual address. phys_to_virt() is used in the receive completion path to retrieve the virtual address from the physical address. It is obviously important to realize that the DMA address and physical address are two different things, which is why we store both in the RX descriptors. While those addresses may be identical in some situations, it remains two distinct concepts, and both addresses should be handled separately. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:04 +08:00
data = (void *)phys_to_virt(buf_phys_addr);
if (!data)
break;
mvpp2_frag_free(bm_pool, pp, data);
}
/* Update BM driver with number of buffers removed from pool */
bm_pool->buf_num -= i;
}
/* Check number of buffers in BM pool */
static int mvpp2_check_hw_buf_num(struct mvpp2 *priv, struct mvpp2_bm_pool *bm_pool)
{
int buf_num = 0;
buf_num += mvpp2_read(priv, MVPP2_BM_POOL_PTRS_NUM_REG(bm_pool->id)) &
MVPP22_BM_POOL_PTRS_NUM_MASK;
buf_num += mvpp2_read(priv, MVPP2_BM_BPPI_PTRS_NUM_REG(bm_pool->id)) &
MVPP2_BM_BPPI_PTR_NUM_MASK;
/* HW has one buffer ready which is not reflected in the counters */
if (buf_num)
buf_num += 1;
return buf_num;
}
/* Cleanup pool */
static int mvpp2_bm_pool_destroy(struct device *dev, struct mvpp2 *priv,
struct mvpp2_bm_pool *bm_pool)
{
int buf_num;
u32 val;
buf_num = mvpp2_check_hw_buf_num(priv, bm_pool);
mvpp2_bm_bufs_free(dev, priv, bm_pool, buf_num);
/* Check buffer counters after free */
buf_num = mvpp2_check_hw_buf_num(priv, bm_pool);
if (buf_num) {
WARN(1, "cannot free all buffers in pool %d, buf_num left %d\n",
bm_pool->id, bm_pool->buf_num);
return 0;
}
val = mvpp2_read(priv, MVPP2_BM_POOL_CTRL_REG(bm_pool->id));
val |= MVPP2_BM_STOP_MASK;
mvpp2_write(priv, MVPP2_BM_POOL_CTRL_REG(bm_pool->id), val);
if (priv->percpu_pools) {
page_pool_destroy(priv->page_pool[bm_pool->id]);
priv->page_pool[bm_pool->id] = NULL;
}
dma_free_coherent(dev, bm_pool->size_bytes,
bm_pool->virt_addr,
bm_pool->dma_addr);
return 0;
}
static int mvpp2_bm_pools_init(struct device *dev, struct mvpp2 *priv)
{
int i, err, size, poolnum = MVPP2_BM_POOLS_NUM;
struct mvpp2_bm_pool *bm_pool;
if (priv->percpu_pools)
poolnum = mvpp2_get_nrxqs(priv) * 2;
/* Create all pools with maximum size */
size = MVPP2_BM_POOL_SIZE_MAX;
for (i = 0; i < poolnum; i++) {
bm_pool = &priv->bm_pools[i];
bm_pool->id = i;
err = mvpp2_bm_pool_create(dev, priv, bm_pool, size);
if (err)
goto err_unroll_pools;
mvpp2_bm_pool_bufsize_set(priv, bm_pool, 0);
}
return 0;
err_unroll_pools:
dev_err(dev, "failed to create BM pool %d, size %d\n", i, size);
for (i = i - 1; i >= 0; i--)
mvpp2_bm_pool_destroy(dev, priv, &priv->bm_pools[i]);
return err;
}
static int mvpp2_bm_init(struct device *dev, struct mvpp2 *priv)
{
enum dma_data_direction dma_dir = DMA_FROM_DEVICE;
int i, err, poolnum = MVPP2_BM_POOLS_NUM;
struct mvpp2_port *port;
if (priv->percpu_pools) {
for (i = 0; i < priv->port_count; i++) {
port = priv->port_list[i];
if (port->xdp_prog) {
dma_dir = DMA_BIDIRECTIONAL;
break;
}
}
poolnum = mvpp2_get_nrxqs(priv) * 2;
for (i = 0; i < poolnum; i++) {
/* the pool in use */
int pn = i / (poolnum / 2);
priv->page_pool[i] =
mvpp2_create_page_pool(dev,
mvpp2_pools[pn].buf_num,
mvpp2_pools[pn].pkt_size,
dma_dir);
if (IS_ERR(priv->page_pool[i])) {
int j;
for (j = 0; j < i; j++) {
page_pool_destroy(priv->page_pool[j]);
priv->page_pool[j] = NULL;
}
return PTR_ERR(priv->page_pool[i]);
}
}
}
dev_info(dev, "using %d %s buffers\n", poolnum,
priv->percpu_pools ? "per-cpu" : "shared");
for (i = 0; i < poolnum; i++) {
/* Mask BM all interrupts */
mvpp2_write(priv, MVPP2_BM_INTR_MASK_REG(i), 0);
/* Clear BM cause register */
mvpp2_write(priv, MVPP2_BM_INTR_CAUSE_REG(i), 0);
}
/* Allocate and initialize BM pools */
priv->bm_pools = devm_kcalloc(dev, poolnum,
sizeof(*priv->bm_pools), GFP_KERNEL);
if (!priv->bm_pools)
return -ENOMEM;
err = mvpp2_bm_pools_init(dev, priv);
if (err < 0)
return err;
return 0;
}
static void mvpp2_setup_bm_pool(void)
{
/* Short pool */
mvpp2_pools[MVPP2_BM_SHORT].buf_num = MVPP2_BM_SHORT_BUF_NUM;
mvpp2_pools[MVPP2_BM_SHORT].pkt_size = MVPP2_BM_SHORT_PKT_SIZE;
/* Long pool */
mvpp2_pools[MVPP2_BM_LONG].buf_num = MVPP2_BM_LONG_BUF_NUM;
mvpp2_pools[MVPP2_BM_LONG].pkt_size = MVPP2_BM_LONG_PKT_SIZE;
/* Jumbo pool */
mvpp2_pools[MVPP2_BM_JUMBO].buf_num = MVPP2_BM_JUMBO_BUF_NUM;
mvpp2_pools[MVPP2_BM_JUMBO].pkt_size = MVPP2_BM_JUMBO_PKT_SIZE;
}
/* Attach long pool to rxq */
static void mvpp2_rxq_long_pool_set(struct mvpp2_port *port,
int lrxq, int long_pool)
{
u32 val, mask;
int prxq;
/* Get queue physical ID */
prxq = port->rxqs[lrxq]->id;
if (port->priv->hw_version == MVPP21)
mask = MVPP21_RXQ_POOL_LONG_MASK;
else
mask = MVPP22_RXQ_POOL_LONG_MASK;
val = mvpp2_read(port->priv, MVPP2_RXQ_CONFIG_REG(prxq));
val &= ~mask;
val |= (long_pool << MVPP2_RXQ_POOL_LONG_OFFS) & mask;
mvpp2_write(port->priv, MVPP2_RXQ_CONFIG_REG(prxq), val);
}
/* Attach short pool to rxq */
static void mvpp2_rxq_short_pool_set(struct mvpp2_port *port,
int lrxq, int short_pool)
{
u32 val, mask;
int prxq;
/* Get queue physical ID */
prxq = port->rxqs[lrxq]->id;
if (port->priv->hw_version == MVPP21)
mask = MVPP21_RXQ_POOL_SHORT_MASK;
else
mask = MVPP22_RXQ_POOL_SHORT_MASK;
val = mvpp2_read(port->priv, MVPP2_RXQ_CONFIG_REG(prxq));
val &= ~mask;
val |= (short_pool << MVPP2_RXQ_POOL_SHORT_OFFS) & mask;
mvpp2_write(port->priv, MVPP2_RXQ_CONFIG_REG(prxq), val);
}
static void *mvpp2_buf_alloc(struct mvpp2_port *port,
struct mvpp2_bm_pool *bm_pool,
struct page_pool *page_pool,
dma_addr_t *buf_dma_addr,
net: mvpp2: store physical address of buffer in rx_desc->buf_cookie The RX descriptors of the PPv2 hardware allow to store several information, amongst which: - the DMA address of the buffer in which the data has been received - a "cookie" field, left to the use of the driver, and not used by the hardware In the current implementation, the "cookie" field is used to store the virtual address of the buffer, so that in the receive completion path, we can easily get the virtual address of the buffer that corresponds to a completed RX descriptors. On PPv2.1, used on 32-bit platforms, those two fields are 32-bit wide, which is enough to store a DMA address in the first field, and a virtual address in the second field. On PPv2.2, used on 64-bit platforms, these two fields have been extended to 40 bits. While 40 bits is enough to store a DMA address (as long as the DMA mask is 40 bits or lower), it is not enough to store a virtual address. Therefore, the "cookie" field can no longer be used to store the virtual address of the buffer. However, as Russell King pointed out, the RX buffers are always allocated in the kernel linear mapping, and therefore using phys_to_virt() on the physical address of the RX buffer is possible and correct. Therefore, this commit changes the driver to use the "cookie" field to store the physical address instead of the virtual address. phys_to_virt() is used in the receive completion path to retrieve the virtual address from the physical address. It is obviously important to realize that the DMA address and physical address are two different things, which is why we store both in the RX descriptors. While those addresses may be identical in some situations, it remains two distinct concepts, and both addresses should be handled separately. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:04 +08:00
phys_addr_t *buf_phys_addr,
gfp_t gfp_mask)
{
dma_addr_t dma_addr;
struct page *page;
void *data;
data = mvpp2_frag_alloc(bm_pool, page_pool);
if (!data)
return NULL;
if (page_pool) {
page = (struct page *)data;
dma_addr = page_pool_get_dma_addr(page);
data = page_to_virt(page);
} else {
dma_addr = dma_map_single(port->dev->dev.parent, data,
MVPP2_RX_BUF_SIZE(bm_pool->pkt_size),
DMA_FROM_DEVICE);
if (unlikely(dma_mapping_error(port->dev->dev.parent, dma_addr))) {
mvpp2_frag_free(bm_pool, NULL, data);
return NULL;
}
}
*buf_dma_addr = dma_addr;
net: mvpp2: store physical address of buffer in rx_desc->buf_cookie The RX descriptors of the PPv2 hardware allow to store several information, amongst which: - the DMA address of the buffer in which the data has been received - a "cookie" field, left to the use of the driver, and not used by the hardware In the current implementation, the "cookie" field is used to store the virtual address of the buffer, so that in the receive completion path, we can easily get the virtual address of the buffer that corresponds to a completed RX descriptors. On PPv2.1, used on 32-bit platforms, those two fields are 32-bit wide, which is enough to store a DMA address in the first field, and a virtual address in the second field. On PPv2.2, used on 64-bit platforms, these two fields have been extended to 40 bits. While 40 bits is enough to store a DMA address (as long as the DMA mask is 40 bits or lower), it is not enough to store a virtual address. Therefore, the "cookie" field can no longer be used to store the virtual address of the buffer. However, as Russell King pointed out, the RX buffers are always allocated in the kernel linear mapping, and therefore using phys_to_virt() on the physical address of the RX buffer is possible and correct. Therefore, this commit changes the driver to use the "cookie" field to store the physical address instead of the virtual address. phys_to_virt() is used in the receive completion path to retrieve the virtual address from the physical address. It is obviously important to realize that the DMA address and physical address are two different things, which is why we store both in the RX descriptors. While those addresses may be identical in some situations, it remains two distinct concepts, and both addresses should be handled separately. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:04 +08:00
*buf_phys_addr = virt_to_phys(data);
return data;
}
/* Release buffer to BM */
static inline void mvpp2_bm_pool_put(struct mvpp2_port *port, int pool,
dma_addr_t buf_dma_addr,
net: mvpp2: store physical address of buffer in rx_desc->buf_cookie The RX descriptors of the PPv2 hardware allow to store several information, amongst which: - the DMA address of the buffer in which the data has been received - a "cookie" field, left to the use of the driver, and not used by the hardware In the current implementation, the "cookie" field is used to store the virtual address of the buffer, so that in the receive completion path, we can easily get the virtual address of the buffer that corresponds to a completed RX descriptors. On PPv2.1, used on 32-bit platforms, those two fields are 32-bit wide, which is enough to store a DMA address in the first field, and a virtual address in the second field. On PPv2.2, used on 64-bit platforms, these two fields have been extended to 40 bits. While 40 bits is enough to store a DMA address (as long as the DMA mask is 40 bits or lower), it is not enough to store a virtual address. Therefore, the "cookie" field can no longer be used to store the virtual address of the buffer. However, as Russell King pointed out, the RX buffers are always allocated in the kernel linear mapping, and therefore using phys_to_virt() on the physical address of the RX buffer is possible and correct. Therefore, this commit changes the driver to use the "cookie" field to store the physical address instead of the virtual address. phys_to_virt() is used in the receive completion path to retrieve the virtual address from the physical address. It is obviously important to realize that the DMA address and physical address are two different things, which is why we store both in the RX descriptors. While those addresses may be identical in some situations, it remains two distinct concepts, and both addresses should be handled separately. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:04 +08:00
phys_addr_t buf_phys_addr)
{
unsigned int thread = mvpp2_cpu_to_thread(port->priv, get_cpu());
unsigned long flags = 0;
if (test_bit(thread, &port->priv->lock_map))
spin_lock_irqsave(&port->bm_lock[thread], flags);
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
if (port->priv->hw_version == MVPP22) {
u32 val = 0;
if (sizeof(dma_addr_t) == 8)
val |= upper_32_bits(buf_dma_addr) &
MVPP22_BM_ADDR_HIGH_PHYS_RLS_MASK;
if (sizeof(phys_addr_t) == 8)
val |= (upper_32_bits(buf_phys_addr)
<< MVPP22_BM_ADDR_HIGH_VIRT_RLS_SHIFT) &
MVPP22_BM_ADDR_HIGH_VIRT_RLS_MASK;
mvpp2_thread_write_relaxed(port->priv, thread,
MVPP22_BM_ADDR_HIGH_RLS_REG, val);
}
net: mvpp2: store physical address of buffer in rx_desc->buf_cookie The RX descriptors of the PPv2 hardware allow to store several information, amongst which: - the DMA address of the buffer in which the data has been received - a "cookie" field, left to the use of the driver, and not used by the hardware In the current implementation, the "cookie" field is used to store the virtual address of the buffer, so that in the receive completion path, we can easily get the virtual address of the buffer that corresponds to a completed RX descriptors. On PPv2.1, used on 32-bit platforms, those two fields are 32-bit wide, which is enough to store a DMA address in the first field, and a virtual address in the second field. On PPv2.2, used on 64-bit platforms, these two fields have been extended to 40 bits. While 40 bits is enough to store a DMA address (as long as the DMA mask is 40 bits or lower), it is not enough to store a virtual address. Therefore, the "cookie" field can no longer be used to store the virtual address of the buffer. However, as Russell King pointed out, the RX buffers are always allocated in the kernel linear mapping, and therefore using phys_to_virt() on the physical address of the RX buffer is possible and correct. Therefore, this commit changes the driver to use the "cookie" field to store the physical address instead of the virtual address. phys_to_virt() is used in the receive completion path to retrieve the virtual address from the physical address. It is obviously important to realize that the DMA address and physical address are two different things, which is why we store both in the RX descriptors. While those addresses may be identical in some situations, it remains two distinct concepts, and both addresses should be handled separately. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:04 +08:00
/* MVPP2_BM_VIRT_RLS_REG is not interpreted by HW, and simply
* returned in the "cookie" field of the RX
* descriptor. Instead of storing the virtual address, we
* store the physical address
*/
mvpp2_thread_write_relaxed(port->priv, thread,
MVPP2_BM_VIRT_RLS_REG, buf_phys_addr);
mvpp2_thread_write_relaxed(port->priv, thread,
MVPP2_BM_PHY_RLS_REG(pool), buf_dma_addr);
if (test_bit(thread, &port->priv->lock_map))
spin_unlock_irqrestore(&port->bm_lock[thread], flags);
put_cpu();
}
/* Allocate buffers for the pool */
static int mvpp2_bm_bufs_add(struct mvpp2_port *port,
struct mvpp2_bm_pool *bm_pool, int buf_num)
{
int i, buf_size, total_size;
dma_addr_t dma_addr;
net: mvpp2: store physical address of buffer in rx_desc->buf_cookie The RX descriptors of the PPv2 hardware allow to store several information, amongst which: - the DMA address of the buffer in which the data has been received - a "cookie" field, left to the use of the driver, and not used by the hardware In the current implementation, the "cookie" field is used to store the virtual address of the buffer, so that in the receive completion path, we can easily get the virtual address of the buffer that corresponds to a completed RX descriptors. On PPv2.1, used on 32-bit platforms, those two fields are 32-bit wide, which is enough to store a DMA address in the first field, and a virtual address in the second field. On PPv2.2, used on 64-bit platforms, these two fields have been extended to 40 bits. While 40 bits is enough to store a DMA address (as long as the DMA mask is 40 bits or lower), it is not enough to store a virtual address. Therefore, the "cookie" field can no longer be used to store the virtual address of the buffer. However, as Russell King pointed out, the RX buffers are always allocated in the kernel linear mapping, and therefore using phys_to_virt() on the physical address of the RX buffer is possible and correct. Therefore, this commit changes the driver to use the "cookie" field to store the physical address instead of the virtual address. phys_to_virt() is used in the receive completion path to retrieve the virtual address from the physical address. It is obviously important to realize that the DMA address and physical address are two different things, which is why we store both in the RX descriptors. While those addresses may be identical in some situations, it remains two distinct concepts, and both addresses should be handled separately. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:04 +08:00
phys_addr_t phys_addr;
struct page_pool *pp = NULL;
void *buf;
if (port->priv->percpu_pools &&
bm_pool->pkt_size > MVPP2_BM_LONG_PKT_SIZE) {
netdev_err(port->dev,
"attempted to use jumbo frames with per-cpu pools");
return 0;
}
buf_size = MVPP2_RX_BUF_SIZE(bm_pool->pkt_size);
total_size = MVPP2_RX_TOTAL_SIZE(buf_size);
if (buf_num < 0 ||
(buf_num + bm_pool->buf_num > bm_pool->size)) {
netdev_err(port->dev,
"cannot allocate %d buffers for pool %d\n",
buf_num, bm_pool->id);
return 0;
}
if (port->priv->percpu_pools)
pp = port->priv->page_pool[bm_pool->id];
for (i = 0; i < buf_num; i++) {
buf = mvpp2_buf_alloc(port, bm_pool, pp, &dma_addr,
net: mvpp2: store physical address of buffer in rx_desc->buf_cookie The RX descriptors of the PPv2 hardware allow to store several information, amongst which: - the DMA address of the buffer in which the data has been received - a "cookie" field, left to the use of the driver, and not used by the hardware In the current implementation, the "cookie" field is used to store the virtual address of the buffer, so that in the receive completion path, we can easily get the virtual address of the buffer that corresponds to a completed RX descriptors. On PPv2.1, used on 32-bit platforms, those two fields are 32-bit wide, which is enough to store a DMA address in the first field, and a virtual address in the second field. On PPv2.2, used on 64-bit platforms, these two fields have been extended to 40 bits. While 40 bits is enough to store a DMA address (as long as the DMA mask is 40 bits or lower), it is not enough to store a virtual address. Therefore, the "cookie" field can no longer be used to store the virtual address of the buffer. However, as Russell King pointed out, the RX buffers are always allocated in the kernel linear mapping, and therefore using phys_to_virt() on the physical address of the RX buffer is possible and correct. Therefore, this commit changes the driver to use the "cookie" field to store the physical address instead of the virtual address. phys_to_virt() is used in the receive completion path to retrieve the virtual address from the physical address. It is obviously important to realize that the DMA address and physical address are two different things, which is why we store both in the RX descriptors. While those addresses may be identical in some situations, it remains two distinct concepts, and both addresses should be handled separately. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:04 +08:00
&phys_addr, GFP_KERNEL);
if (!buf)
break;
mvpp2_bm_pool_put(port, bm_pool->id, dma_addr,
net: mvpp2: store physical address of buffer in rx_desc->buf_cookie The RX descriptors of the PPv2 hardware allow to store several information, amongst which: - the DMA address of the buffer in which the data has been received - a "cookie" field, left to the use of the driver, and not used by the hardware In the current implementation, the "cookie" field is used to store the virtual address of the buffer, so that in the receive completion path, we can easily get the virtual address of the buffer that corresponds to a completed RX descriptors. On PPv2.1, used on 32-bit platforms, those two fields are 32-bit wide, which is enough to store a DMA address in the first field, and a virtual address in the second field. On PPv2.2, used on 64-bit platforms, these two fields have been extended to 40 bits. While 40 bits is enough to store a DMA address (as long as the DMA mask is 40 bits or lower), it is not enough to store a virtual address. Therefore, the "cookie" field can no longer be used to store the virtual address of the buffer. However, as Russell King pointed out, the RX buffers are always allocated in the kernel linear mapping, and therefore using phys_to_virt() on the physical address of the RX buffer is possible and correct. Therefore, this commit changes the driver to use the "cookie" field to store the physical address instead of the virtual address. phys_to_virt() is used in the receive completion path to retrieve the virtual address from the physical address. It is obviously important to realize that the DMA address and physical address are two different things, which is why we store both in the RX descriptors. While those addresses may be identical in some situations, it remains two distinct concepts, and both addresses should be handled separately. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:04 +08:00
phys_addr);
}
/* Update BM driver with number of buffers added to pool */
bm_pool->buf_num += i;
netdev_dbg(port->dev,
"pool %d: pkt_size=%4d, buf_size=%4d, total_size=%4d\n",
bm_pool->id, bm_pool->pkt_size, buf_size, total_size);
netdev_dbg(port->dev,
"pool %d: %d of %d buffers added\n",
bm_pool->id, i, buf_num);
return i;
}
/* Notify the driver that BM pool is being used as specific type and return the
* pool pointer on success
*/
static struct mvpp2_bm_pool *
mvpp2_bm_pool_use(struct mvpp2_port *port, unsigned pool, int pkt_size)
{
struct mvpp2_bm_pool *new_pool = &port->priv->bm_pools[pool];
int num;
if ((port->priv->percpu_pools && pool > mvpp2_get_nrxqs(port->priv) * 2) ||
(!port->priv->percpu_pools && pool >= MVPP2_BM_POOLS_NUM)) {
netdev_err(port->dev, "Invalid pool %d\n", pool);
return NULL;
}
/* Allocate buffers in case BM pool is used as long pool, but packet
* size doesn't match MTU or BM pool hasn't being used yet
*/
if (new_pool->pkt_size == 0) {
int pkts_num;
/* Set default buffer number or free all the buffers in case
* the pool is not empty
*/
pkts_num = new_pool->buf_num;
if (pkts_num == 0) {
if (port->priv->percpu_pools) {
if (pool < port->nrxqs)
pkts_num = mvpp2_pools[MVPP2_BM_SHORT].buf_num;
else
pkts_num = mvpp2_pools[MVPP2_BM_LONG].buf_num;
} else {
pkts_num = mvpp2_pools[pool].buf_num;
}
} else {
mvpp2_bm_bufs_free(port->dev->dev.parent,
port->priv, new_pool, pkts_num);
}
new_pool->pkt_size = pkt_size;
new_pool->frag_size =
SKB_DATA_ALIGN(MVPP2_RX_BUF_SIZE(pkt_size)) +
MVPP2_SKB_SHINFO_SIZE;
/* Allocate buffers for this pool */
num = mvpp2_bm_bufs_add(port, new_pool, pkts_num);
if (num != pkts_num) {
WARN(1, "pool %d: %d of %d allocated\n",
new_pool->id, num, pkts_num);
return NULL;
}
}
mvpp2_bm_pool_bufsize_set(port->priv, new_pool,
MVPP2_RX_BUF_SIZE(new_pool->pkt_size));
return new_pool;
}
static struct mvpp2_bm_pool *
mvpp2_bm_pool_use_percpu(struct mvpp2_port *port, int type,
unsigned int pool, int pkt_size)
{
struct mvpp2_bm_pool *new_pool = &port->priv->bm_pools[pool];
int num;
if (pool > port->nrxqs * 2) {
netdev_err(port->dev, "Invalid pool %d\n", pool);
return NULL;
}
/* Allocate buffers in case BM pool is used as long pool, but packet
* size doesn't match MTU or BM pool hasn't being used yet
*/
if (new_pool->pkt_size == 0) {
int pkts_num;
/* Set default buffer number or free all the buffers in case
* the pool is not empty
*/
pkts_num = new_pool->buf_num;
if (pkts_num == 0)
pkts_num = mvpp2_pools[type].buf_num;
else
mvpp2_bm_bufs_free(port->dev->dev.parent,
port->priv, new_pool, pkts_num);
new_pool->pkt_size = pkt_size;
new_pool->frag_size =
SKB_DATA_ALIGN(MVPP2_RX_BUF_SIZE(pkt_size)) +
MVPP2_SKB_SHINFO_SIZE;
/* Allocate buffers for this pool */
num = mvpp2_bm_bufs_add(port, new_pool, pkts_num);
if (num != pkts_num) {
WARN(1, "pool %d: %d of %d allocated\n",
new_pool->id, num, pkts_num);
return NULL;
}
}
mvpp2_bm_pool_bufsize_set(port->priv, new_pool,
MVPP2_RX_BUF_SIZE(new_pool->pkt_size));
return new_pool;
}
/* Initialize pools for swf, shared buffers variant */
static int mvpp2_swf_bm_pool_init_shared(struct mvpp2_port *port)
{
enum mvpp2_bm_pool_log_num long_log_pool, short_log_pool;
int rxq;
/* If port pkt_size is higher than 1518B:
* HW Long pool - SW Jumbo pool, HW Short pool - SW Long pool
* else: HW Long pool - SW Long pool, HW Short pool - SW Short pool
*/
if (port->pkt_size > MVPP2_BM_LONG_PKT_SIZE) {
long_log_pool = MVPP2_BM_JUMBO;
short_log_pool = MVPP2_BM_LONG;
} else {
long_log_pool = MVPP2_BM_LONG;
short_log_pool = MVPP2_BM_SHORT;
}
if (!port->pool_long) {
port->pool_long =
mvpp2_bm_pool_use(port, long_log_pool,
mvpp2_pools[long_log_pool].pkt_size);
if (!port->pool_long)
return -ENOMEM;
port->pool_long->port_map |= BIT(port->id);
for (rxq = 0; rxq < port->nrxqs; rxq++)
mvpp2_rxq_long_pool_set(port, rxq, port->pool_long->id);
}
if (!port->pool_short) {
port->pool_short =
mvpp2_bm_pool_use(port, short_log_pool,
mvpp2_pools[short_log_pool].pkt_size);
if (!port->pool_short)
return -ENOMEM;
port->pool_short->port_map |= BIT(port->id);
for (rxq = 0; rxq < port->nrxqs; rxq++)
mvpp2_rxq_short_pool_set(port, rxq,
port->pool_short->id);
}
return 0;
}
/* Initialize pools for swf, percpu buffers variant */
static int mvpp2_swf_bm_pool_init_percpu(struct mvpp2_port *port)
{
struct mvpp2_bm_pool *bm_pool;
int i;
for (i = 0; i < port->nrxqs; i++) {
bm_pool = mvpp2_bm_pool_use_percpu(port, MVPP2_BM_SHORT, i,
mvpp2_pools[MVPP2_BM_SHORT].pkt_size);
if (!bm_pool)
return -ENOMEM;
bm_pool->port_map |= BIT(port->id);
mvpp2_rxq_short_pool_set(port, i, bm_pool->id);
}
for (i = 0; i < port->nrxqs; i++) {
bm_pool = mvpp2_bm_pool_use_percpu(port, MVPP2_BM_LONG, i + port->nrxqs,
mvpp2_pools[MVPP2_BM_LONG].pkt_size);
if (!bm_pool)
return -ENOMEM;
bm_pool->port_map |= BIT(port->id);
mvpp2_rxq_long_pool_set(port, i, bm_pool->id);
}
port->pool_long = NULL;
port->pool_short = NULL;
return 0;
}
static int mvpp2_swf_bm_pool_init(struct mvpp2_port *port)
{
if (port->priv->percpu_pools)
return mvpp2_swf_bm_pool_init_percpu(port);
else
return mvpp2_swf_bm_pool_init_shared(port);
}
static void mvpp2_set_hw_csum(struct mvpp2_port *port,
enum mvpp2_bm_pool_log_num new_long_pool)
{
const netdev_features_t csums = NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM;
/* Update L4 checksum when jumbo enable/disable on port.
* Only port 0 supports hardware checksum offload due to
* the Tx FIFO size limitation.
* Also, don't set NETIF_F_HW_CSUM because L3_offset in TX descriptor
* has 7 bits, so the maximum L3 offset is 128.
*/
if (new_long_pool == MVPP2_BM_JUMBO && port->id != 0) {
port->dev->features &= ~csums;
port->dev->hw_features &= ~csums;
} else {
port->dev->features |= csums;
port->dev->hw_features |= csums;
}
}
static int mvpp2_bm_update_mtu(struct net_device *dev, int mtu)
{
struct mvpp2_port *port = netdev_priv(dev);
enum mvpp2_bm_pool_log_num new_long_pool;
int pkt_size = MVPP2_RX_PKT_SIZE(mtu);
if (port->priv->percpu_pools)
goto out_set;
/* If port MTU is higher than 1518B:
* HW Long pool - SW Jumbo pool, HW Short pool - SW Long pool
* else: HW Long pool - SW Long pool, HW Short pool - SW Short pool
*/
if (pkt_size > MVPP2_BM_LONG_PKT_SIZE)
new_long_pool = MVPP2_BM_JUMBO;
else
new_long_pool = MVPP2_BM_LONG;
if (new_long_pool != port->pool_long->id) {
/* Remove port from old short & long pool */
port->pool_long = mvpp2_bm_pool_use(port, port->pool_long->id,
port->pool_long->pkt_size);
port->pool_long->port_map &= ~BIT(port->id);
port->pool_long = NULL;
port->pool_short = mvpp2_bm_pool_use(port, port->pool_short->id,
port->pool_short->pkt_size);
port->pool_short->port_map &= ~BIT(port->id);
port->pool_short = NULL;
port->pkt_size = pkt_size;
/* Add port to new short & long pool */
mvpp2_swf_bm_pool_init(port);
mvpp2_set_hw_csum(port, new_long_pool);
}
out_set:
dev->mtu = mtu;
dev->wanted_features = dev->features;
netdev_update_features(dev);
return 0;
}
static inline void mvpp2_interrupts_enable(struct mvpp2_port *port)
{
int i, sw_thread_mask = 0;
for (i = 0; i < port->nqvecs; i++)
sw_thread_mask |= port->qvecs[i].sw_thread_mask;
mvpp2_write(port->priv, MVPP2_ISR_ENABLE_REG(port->id),
MVPP2_ISR_ENABLE_INTERRUPT(sw_thread_mask));
}
static inline void mvpp2_interrupts_disable(struct mvpp2_port *port)
{
int i, sw_thread_mask = 0;
for (i = 0; i < port->nqvecs; i++)
sw_thread_mask |= port->qvecs[i].sw_thread_mask;
mvpp2_write(port->priv, MVPP2_ISR_ENABLE_REG(port->id),
MVPP2_ISR_DISABLE_INTERRUPT(sw_thread_mask));
}
static inline void mvpp2_qvec_interrupt_enable(struct mvpp2_queue_vector *qvec)
{
struct mvpp2_port *port = qvec->port;
mvpp2_write(port->priv, MVPP2_ISR_ENABLE_REG(port->id),
MVPP2_ISR_ENABLE_INTERRUPT(qvec->sw_thread_mask));
}
static inline void mvpp2_qvec_interrupt_disable(struct mvpp2_queue_vector *qvec)
{
struct mvpp2_port *port = qvec->port;
mvpp2_write(port->priv, MVPP2_ISR_ENABLE_REG(port->id),
MVPP2_ISR_DISABLE_INTERRUPT(qvec->sw_thread_mask));
}
/* Mask the current thread's Rx/Tx interrupts
* Called by on_each_cpu(), guaranteed to run with migration disabled,
* using smp_processor_id() is OK.
*/
static void mvpp2_interrupts_mask(void *arg)
{
struct mvpp2_port *port = arg;
/* If the thread isn't used, don't do anything */
if (smp_processor_id() > port->priv->nthreads)
return;
mvpp2_thread_write(port->priv,
mvpp2_cpu_to_thread(port->priv, smp_processor_id()),
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
MVPP2_ISR_RX_TX_MASK_REG(port->id), 0);
}
/* Unmask the current thread's Rx/Tx interrupts.
* Called by on_each_cpu(), guaranteed to run with migration disabled,
* using smp_processor_id() is OK.
*/
static void mvpp2_interrupts_unmask(void *arg)
{
struct mvpp2_port *port = arg;
u32 val;
/* If the thread isn't used, don't do anything */
if (smp_processor_id() > port->priv->nthreads)
return;
val = MVPP2_CAUSE_MISC_SUM_MASK |
MVPP2_CAUSE_RXQ_OCCUP_DESC_ALL_MASK(port->priv->hw_version);
if (port->has_tx_irqs)
val |= MVPP2_CAUSE_TXQ_OCCUP_DESC_ALL_MASK;
mvpp2_thread_write(port->priv,
mvpp2_cpu_to_thread(port->priv, smp_processor_id()),
MVPP2_ISR_RX_TX_MASK_REG(port->id), val);
}
static void
mvpp2_shared_interrupt_mask_unmask(struct mvpp2_port *port, bool mask)
{
u32 val;
int i;
if (port->priv->hw_version != MVPP22)
return;
if (mask)
val = 0;
else
val = MVPP2_CAUSE_RXQ_OCCUP_DESC_ALL_MASK(MVPP22);
for (i = 0; i < port->nqvecs; i++) {
struct mvpp2_queue_vector *v = port->qvecs + i;
if (v->type != MVPP2_QUEUE_VECTOR_SHARED)
continue;
mvpp2_thread_write(port->priv, v->sw_thread_id,
MVPP2_ISR_RX_TX_MASK_REG(port->id), val);
}
}
/* Only GOP port 0 has an XLG MAC */
static bool mvpp2_port_supports_xlg(struct mvpp2_port *port)
{
return port->gop_id == 0;
}
static bool mvpp2_port_supports_rgmii(struct mvpp2_port *port)
{
return !(port->priv->hw_version == MVPP22 && port->gop_id == 0);
}
/* Port configuration routines */
static bool mvpp2_is_xlg(phy_interface_t interface)
{
return interface == PHY_INTERFACE_MODE_10GBASER ||
interface == PHY_INTERFACE_MODE_XAUI;
}
static void mvpp2_modify(void __iomem *ptr, u32 mask, u32 set)
{
u32 old, val;
old = val = readl(ptr);
val &= ~mask;
val |= set;
if (old != val)
writel(val, ptr);
}
static void mvpp22_gop_init_rgmii(struct mvpp2_port *port)
{
struct mvpp2 *priv = port->priv;
u32 val;
regmap_read(priv->sysctrl_base, GENCONF_PORT_CTRL0, &val);
val |= GENCONF_PORT_CTRL0_BUS_WIDTH_SELECT;
regmap_write(priv->sysctrl_base, GENCONF_PORT_CTRL0, val);
regmap_read(priv->sysctrl_base, GENCONF_CTRL0, &val);
if (port->gop_id == 2)
val |= GENCONF_CTRL0_PORT0_RGMII | GENCONF_CTRL0_PORT1_RGMII;
else if (port->gop_id == 3)
val |= GENCONF_CTRL0_PORT1_RGMII_MII;
regmap_write(priv->sysctrl_base, GENCONF_CTRL0, val);
}
static void mvpp22_gop_init_sgmii(struct mvpp2_port *port)
{
struct mvpp2 *priv = port->priv;
u32 val;
regmap_read(priv->sysctrl_base, GENCONF_PORT_CTRL0, &val);
val |= GENCONF_PORT_CTRL0_BUS_WIDTH_SELECT |
GENCONF_PORT_CTRL0_RX_DATA_SAMPLE;
regmap_write(priv->sysctrl_base, GENCONF_PORT_CTRL0, val);
if (port->gop_id > 1) {
regmap_read(priv->sysctrl_base, GENCONF_CTRL0, &val);
if (port->gop_id == 2)
val &= ~GENCONF_CTRL0_PORT0_RGMII;
else if (port->gop_id == 3)
val &= ~GENCONF_CTRL0_PORT1_RGMII_MII;
regmap_write(priv->sysctrl_base, GENCONF_CTRL0, val);
}
}
static void mvpp22_gop_init_10gkr(struct mvpp2_port *port)
{
struct mvpp2 *priv = port->priv;
void __iomem *mpcs = priv->iface_base + MVPP22_MPCS_BASE(port->gop_id);
void __iomem *xpcs = priv->iface_base + MVPP22_XPCS_BASE(port->gop_id);
u32 val;
val = readl(xpcs + MVPP22_XPCS_CFG0);
val &= ~(MVPP22_XPCS_CFG0_PCS_MODE(0x3) |
MVPP22_XPCS_CFG0_ACTIVE_LANE(0x3));
val |= MVPP22_XPCS_CFG0_ACTIVE_LANE(2);
writel(val, xpcs + MVPP22_XPCS_CFG0);
val = readl(mpcs + MVPP22_MPCS_CTRL);
val &= ~MVPP22_MPCS_CTRL_FWD_ERR_CONN;
writel(val, mpcs + MVPP22_MPCS_CTRL);
val = readl(mpcs + MVPP22_MPCS_CLK_RESET);
val &= ~MVPP22_MPCS_CLK_RESET_DIV_RATIO(0x7);
val |= MVPP22_MPCS_CLK_RESET_DIV_RATIO(1);
writel(val, mpcs + MVPP22_MPCS_CLK_RESET);
}
static int mvpp22_gop_init(struct mvpp2_port *port)
{
struct mvpp2 *priv = port->priv;
u32 val;
if (!priv->sysctrl_base)
return 0;
switch (port->phy_interface) {
case PHY_INTERFACE_MODE_RGMII:
case PHY_INTERFACE_MODE_RGMII_ID:
case PHY_INTERFACE_MODE_RGMII_RXID:
case PHY_INTERFACE_MODE_RGMII_TXID:
if (!mvpp2_port_supports_rgmii(port))
goto invalid_conf;
mvpp22_gop_init_rgmii(port);
break;
case PHY_INTERFACE_MODE_SGMII:
case PHY_INTERFACE_MODE_1000BASEX:
case PHY_INTERFACE_MODE_2500BASEX:
mvpp22_gop_init_sgmii(port);
break;
case PHY_INTERFACE_MODE_10GBASER:
if (!mvpp2_port_supports_xlg(port))
goto invalid_conf;
mvpp22_gop_init_10gkr(port);
break;
default:
goto unsupported_conf;
}
regmap_read(priv->sysctrl_base, GENCONF_PORT_CTRL1, &val);
val |= GENCONF_PORT_CTRL1_RESET(port->gop_id) |
GENCONF_PORT_CTRL1_EN(port->gop_id);
regmap_write(priv->sysctrl_base, GENCONF_PORT_CTRL1, val);
regmap_read(priv->sysctrl_base, GENCONF_PORT_CTRL0, &val);
val |= GENCONF_PORT_CTRL0_CLK_DIV_PHASE_CLR;
regmap_write(priv->sysctrl_base, GENCONF_PORT_CTRL0, val);
regmap_read(priv->sysctrl_base, GENCONF_SOFT_RESET1, &val);
val |= GENCONF_SOFT_RESET1_GOP;
regmap_write(priv->sysctrl_base, GENCONF_SOFT_RESET1, val);
unsupported_conf:
return 0;
invalid_conf:
netdev_err(port->dev, "Invalid port configuration\n");
return -EINVAL;
}
static void mvpp22_gop_unmask_irq(struct mvpp2_port *port)
{
u32 val;
if (phy_interface_mode_is_rgmii(port->phy_interface) ||
phy_interface_mode_is_8023z(port->phy_interface) ||
port->phy_interface == PHY_INTERFACE_MODE_SGMII) {
/* Enable the GMAC link status irq for this port */
val = readl(port->base + MVPP22_GMAC_INT_SUM_MASK);
val |= MVPP22_GMAC_INT_SUM_MASK_LINK_STAT;
writel(val, port->base + MVPP22_GMAC_INT_SUM_MASK);
}
if (mvpp2_port_supports_xlg(port)) {
/* Enable the XLG/GIG irqs for this port */
val = readl(port->base + MVPP22_XLG_EXT_INT_MASK);
if (mvpp2_is_xlg(port->phy_interface))
val |= MVPP22_XLG_EXT_INT_MASK_XLG;
else
val |= MVPP22_XLG_EXT_INT_MASK_GIG;
writel(val, port->base + MVPP22_XLG_EXT_INT_MASK);
}
}
static void mvpp22_gop_mask_irq(struct mvpp2_port *port)
{
u32 val;
if (mvpp2_port_supports_xlg(port)) {
val = readl(port->base + MVPP22_XLG_EXT_INT_MASK);
val &= ~(MVPP22_XLG_EXT_INT_MASK_XLG |
MVPP22_XLG_EXT_INT_MASK_GIG);
writel(val, port->base + MVPP22_XLG_EXT_INT_MASK);
}
if (phy_interface_mode_is_rgmii(port->phy_interface) ||
phy_interface_mode_is_8023z(port->phy_interface) ||
port->phy_interface == PHY_INTERFACE_MODE_SGMII) {
val = readl(port->base + MVPP22_GMAC_INT_SUM_MASK);
val &= ~MVPP22_GMAC_INT_SUM_MASK_LINK_STAT;
writel(val, port->base + MVPP22_GMAC_INT_SUM_MASK);
}
}
static void mvpp22_gop_setup_irq(struct mvpp2_port *port)
{
u32 val;
if (port->phylink ||
phy_interface_mode_is_rgmii(port->phy_interface) ||
phy_interface_mode_is_8023z(port->phy_interface) ||
port->phy_interface == PHY_INTERFACE_MODE_SGMII) {
val = readl(port->base + MVPP22_GMAC_INT_MASK);
val |= MVPP22_GMAC_INT_MASK_LINK_STAT;
writel(val, port->base + MVPP22_GMAC_INT_MASK);
}
if (mvpp2_port_supports_xlg(port)) {
val = readl(port->base + MVPP22_XLG_INT_MASK);
val |= MVPP22_XLG_INT_MASK_LINK;
writel(val, port->base + MVPP22_XLG_INT_MASK);
}
mvpp22_gop_unmask_irq(port);
}
/* Sets the PHY mode of the COMPHY (which configures the serdes lanes).
*
* The PHY mode used by the PPv2 driver comes from the network subsystem, while
* the one given to the COMPHY comes from the generic PHY subsystem. Hence they
* differ.
*
* The COMPHY configures the serdes lanes regardless of the actual use of the
* lanes by the physical layer. This is why configurations like
* "PPv2 (2500BaseX) - COMPHY (2500SGMII)" are valid.
*/
static int mvpp22_comphy_init(struct mvpp2_port *port)
{
int ret;
if (!port->comphy)
return 0;
ret = phy_set_mode_ext(port->comphy, PHY_MODE_ETHERNET,
port->phy_interface);
if (ret)
return ret;
return phy_power_on(port->comphy);
}
static void mvpp2_port_enable(struct mvpp2_port *port)
{
u32 val;
if (mvpp2_port_supports_xlg(port) &&
mvpp2_is_xlg(port->phy_interface)) {
val = readl(port->base + MVPP22_XLG_CTRL0_REG);
val |= MVPP22_XLG_CTRL0_PORT_EN;
val &= ~MVPP22_XLG_CTRL0_MIB_CNT_DIS;
writel(val, port->base + MVPP22_XLG_CTRL0_REG);
} else {
val = readl(port->base + MVPP2_GMAC_CTRL_0_REG);
val |= MVPP2_GMAC_PORT_EN_MASK;
val |= MVPP2_GMAC_MIB_CNTR_EN_MASK;
writel(val, port->base + MVPP2_GMAC_CTRL_0_REG);
}
}
static void mvpp2_port_disable(struct mvpp2_port *port)
{
u32 val;
if (mvpp2_port_supports_xlg(port) &&
mvpp2_is_xlg(port->phy_interface)) {
val = readl(port->base + MVPP22_XLG_CTRL0_REG);
val &= ~MVPP22_XLG_CTRL0_PORT_EN;
writel(val, port->base + MVPP22_XLG_CTRL0_REG);
}
val = readl(port->base + MVPP2_GMAC_CTRL_0_REG);
val &= ~(MVPP2_GMAC_PORT_EN_MASK);
writel(val, port->base + MVPP2_GMAC_CTRL_0_REG);
}
/* Set IEEE 802.3x Flow Control Xon Packet Transmission Mode */
static void mvpp2_port_periodic_xon_disable(struct mvpp2_port *port)
{
u32 val;
val = readl(port->base + MVPP2_GMAC_CTRL_1_REG) &
~MVPP2_GMAC_PERIODIC_XON_EN_MASK;
writel(val, port->base + MVPP2_GMAC_CTRL_1_REG);
}
/* Configure loopback port */
static void mvpp2_port_loopback_set(struct mvpp2_port *port,
const struct phylink_link_state *state)
{
u32 val;
val = readl(port->base + MVPP2_GMAC_CTRL_1_REG);
if (state->speed == 1000)
val |= MVPP2_GMAC_GMII_LB_EN_MASK;
else
val &= ~MVPP2_GMAC_GMII_LB_EN_MASK;
if (phy_interface_mode_is_8023z(port->phy_interface) ||
port->phy_interface == PHY_INTERFACE_MODE_SGMII)
val |= MVPP2_GMAC_PCS_LB_EN_MASK;
else
val &= ~MVPP2_GMAC_PCS_LB_EN_MASK;
writel(val, port->base + MVPP2_GMAC_CTRL_1_REG);
}
enum {
ETHTOOL_XDP_REDIRECT,
ETHTOOL_XDP_PASS,
ETHTOOL_XDP_DROP,
ETHTOOL_XDP_TX,
ETHTOOL_XDP_TX_ERR,
ETHTOOL_XDP_XMIT,
ETHTOOL_XDP_XMIT_ERR,
};
struct mvpp2_ethtool_counter {
unsigned int offset;
const char string[ETH_GSTRING_LEN];
bool reg_is_64b;
};
static u64 mvpp2_read_count(struct mvpp2_port *port,
const struct mvpp2_ethtool_counter *counter)
{
u64 val;
val = readl(port->stats_base + counter->offset);
if (counter->reg_is_64b)
val += (u64)readl(port->stats_base + counter->offset + 4) << 32;
return val;
}
/* Some counters are accessed indirectly by first writing an index to
* MVPP2_CTRS_IDX. The index can represent various resources depending on the
* register we access, it can be a hit counter for some classification tables,
* a counter specific to a rxq, a txq or a buffer pool.
*/
static u32 mvpp2_read_index(struct mvpp2 *priv, u32 index, u32 reg)
{
mvpp2_write(priv, MVPP2_CTRS_IDX, index);
return mvpp2_read(priv, reg);
}
/* Due to the fact that software statistics and hardware statistics are, by
* design, incremented at different moments in the chain of packet processing,
* it is very likely that incoming packets could have been dropped after being
* counted by hardware but before reaching software statistics (most probably
* multicast packets), and in the oppposite way, during transmission, FCS bytes
* are added in between as well as TSO skb will be split and header bytes added.
* Hence, statistics gathered from userspace with ifconfig (software) and
* ethtool (hardware) cannot be compared.
*/
static const struct mvpp2_ethtool_counter mvpp2_ethtool_mib_regs[] = {
{ MVPP2_MIB_GOOD_OCTETS_RCVD, "good_octets_received", true },
{ MVPP2_MIB_BAD_OCTETS_RCVD, "bad_octets_received" },
{ MVPP2_MIB_CRC_ERRORS_SENT, "crc_errors_sent" },
{ MVPP2_MIB_UNICAST_FRAMES_RCVD, "unicast_frames_received" },
{ MVPP2_MIB_BROADCAST_FRAMES_RCVD, "broadcast_frames_received" },
{ MVPP2_MIB_MULTICAST_FRAMES_RCVD, "multicast_frames_received" },
{ MVPP2_MIB_FRAMES_64_OCTETS, "frames_64_octets" },
{ MVPP2_MIB_FRAMES_65_TO_127_OCTETS, "frames_65_to_127_octet" },
{ MVPP2_MIB_FRAMES_128_TO_255_OCTETS, "frames_128_to_255_octet" },
{ MVPP2_MIB_FRAMES_256_TO_511_OCTETS, "frames_256_to_511_octet" },
{ MVPP2_MIB_FRAMES_512_TO_1023_OCTETS, "frames_512_to_1023_octet" },
{ MVPP2_MIB_FRAMES_1024_TO_MAX_OCTETS, "frames_1024_to_max_octet" },
{ MVPP2_MIB_GOOD_OCTETS_SENT, "good_octets_sent", true },
{ MVPP2_MIB_UNICAST_FRAMES_SENT, "unicast_frames_sent" },
{ MVPP2_MIB_MULTICAST_FRAMES_SENT, "multicast_frames_sent" },
{ MVPP2_MIB_BROADCAST_FRAMES_SENT, "broadcast_frames_sent" },
{ MVPP2_MIB_FC_SENT, "fc_sent" },
{ MVPP2_MIB_FC_RCVD, "fc_received" },
{ MVPP2_MIB_RX_FIFO_OVERRUN, "rx_fifo_overrun" },
{ MVPP2_MIB_UNDERSIZE_RCVD, "undersize_received" },
{ MVPP2_MIB_FRAGMENTS_RCVD, "fragments_received" },
{ MVPP2_MIB_OVERSIZE_RCVD, "oversize_received" },
{ MVPP2_MIB_JABBER_RCVD, "jabber_received" },
{ MVPP2_MIB_MAC_RCV_ERROR, "mac_receive_error" },
{ MVPP2_MIB_BAD_CRC_EVENT, "bad_crc_event" },
{ MVPP2_MIB_COLLISION, "collision" },
{ MVPP2_MIB_LATE_COLLISION, "late_collision" },
};
static const struct mvpp2_ethtool_counter mvpp2_ethtool_port_regs[] = {
{ MVPP2_OVERRUN_ETH_DROP, "rx_fifo_or_parser_overrun_drops" },
{ MVPP2_CLS_ETH_DROP, "rx_classifier_drops" },
};
static const struct mvpp2_ethtool_counter mvpp2_ethtool_txq_regs[] = {
{ MVPP2_TX_DESC_ENQ_CTR, "txq_%d_desc_enqueue" },
{ MVPP2_TX_DESC_ENQ_TO_DDR_CTR, "txq_%d_desc_enqueue_to_ddr" },
{ MVPP2_TX_BUFF_ENQ_TO_DDR_CTR, "txq_%d_buff_euqueue_to_ddr" },
{ MVPP2_TX_DESC_ENQ_HW_FWD_CTR, "txq_%d_desc_hardware_forwarded" },
{ MVPP2_TX_PKTS_DEQ_CTR, "txq_%d_packets_dequeued" },
{ MVPP2_TX_PKTS_FULL_QUEUE_DROP_CTR, "txq_%d_queue_full_drops" },
{ MVPP2_TX_PKTS_EARLY_DROP_CTR, "txq_%d_packets_early_drops" },
{ MVPP2_TX_PKTS_BM_DROP_CTR, "txq_%d_packets_bm_drops" },
{ MVPP2_TX_PKTS_BM_MC_DROP_CTR, "txq_%d_packets_rep_bm_drops" },
};
static const struct mvpp2_ethtool_counter mvpp2_ethtool_rxq_regs[] = {
{ MVPP2_RX_DESC_ENQ_CTR, "rxq_%d_desc_enqueue" },
{ MVPP2_RX_PKTS_FULL_QUEUE_DROP_CTR, "rxq_%d_queue_full_drops" },
{ MVPP2_RX_PKTS_EARLY_DROP_CTR, "rxq_%d_packets_early_drops" },
{ MVPP2_RX_PKTS_BM_DROP_CTR, "rxq_%d_packets_bm_drops" },
};
static const struct mvpp2_ethtool_counter mvpp2_ethtool_xdp[] = {
{ ETHTOOL_XDP_REDIRECT, "rx_xdp_redirect", },
{ ETHTOOL_XDP_PASS, "rx_xdp_pass", },
{ ETHTOOL_XDP_DROP, "rx_xdp_drop", },
{ ETHTOOL_XDP_TX, "rx_xdp_tx", },
{ ETHTOOL_XDP_TX_ERR, "rx_xdp_tx_errors", },
{ ETHTOOL_XDP_XMIT, "tx_xdp_xmit", },
{ ETHTOOL_XDP_XMIT_ERR, "tx_xdp_xmit_errors", },
};
#define MVPP2_N_ETHTOOL_STATS(ntxqs, nrxqs) (ARRAY_SIZE(mvpp2_ethtool_mib_regs) + \
ARRAY_SIZE(mvpp2_ethtool_port_regs) + \
(ARRAY_SIZE(mvpp2_ethtool_txq_regs) * (ntxqs)) + \
(ARRAY_SIZE(mvpp2_ethtool_rxq_regs) * (nrxqs)) + \
ARRAY_SIZE(mvpp2_ethtool_xdp))
static void mvpp2_ethtool_get_strings(struct net_device *netdev, u32 sset,
u8 *data)
{
struct mvpp2_port *port = netdev_priv(netdev);
int i, q;
if (sset != ETH_SS_STATS)
return;
for (i = 0; i < ARRAY_SIZE(mvpp2_ethtool_mib_regs); i++) {
strscpy(data, mvpp2_ethtool_mib_regs[i].string,
ETH_GSTRING_LEN);
data += ETH_GSTRING_LEN;
}
for (i = 0; i < ARRAY_SIZE(mvpp2_ethtool_port_regs); i++) {
strscpy(data, mvpp2_ethtool_port_regs[i].string,
ETH_GSTRING_LEN);
data += ETH_GSTRING_LEN;
}
for (q = 0; q < port->ntxqs; q++) {
for (i = 0; i < ARRAY_SIZE(mvpp2_ethtool_txq_regs); i++) {
snprintf(data, ETH_GSTRING_LEN,
mvpp2_ethtool_txq_regs[i].string, q);
data += ETH_GSTRING_LEN;
}
}
for (q = 0; q < port->nrxqs; q++) {
for (i = 0; i < ARRAY_SIZE(mvpp2_ethtool_rxq_regs); i++) {
snprintf(data, ETH_GSTRING_LEN,
mvpp2_ethtool_rxq_regs[i].string,
q);
data += ETH_GSTRING_LEN;
}
}
for (i = 0; i < ARRAY_SIZE(mvpp2_ethtool_xdp); i++) {
strscpy(data, mvpp2_ethtool_xdp[i].string,
ETH_GSTRING_LEN);
data += ETH_GSTRING_LEN;
}
}
static void
mvpp2_get_xdp_stats(struct mvpp2_port *port, struct mvpp2_pcpu_stats *xdp_stats)
{
unsigned int start;
unsigned int cpu;
/* Gather XDP Statistics */
for_each_possible_cpu(cpu) {
struct mvpp2_pcpu_stats *cpu_stats;
u64 xdp_redirect;
u64 xdp_pass;
u64 xdp_drop;
u64 xdp_xmit;
u64 xdp_xmit_err;
u64 xdp_tx;
u64 xdp_tx_err;
cpu_stats = per_cpu_ptr(port->stats, cpu);
do {
start = u64_stats_fetch_begin_irq(&cpu_stats->syncp);
xdp_redirect = cpu_stats->xdp_redirect;
xdp_pass = cpu_stats->xdp_pass;
xdp_drop = cpu_stats->xdp_drop;
xdp_xmit = cpu_stats->xdp_xmit;
xdp_xmit_err = cpu_stats->xdp_xmit_err;
xdp_tx = cpu_stats->xdp_tx;
xdp_tx_err = cpu_stats->xdp_tx_err;
} while (u64_stats_fetch_retry_irq(&cpu_stats->syncp, start));
xdp_stats->xdp_redirect += xdp_redirect;
xdp_stats->xdp_pass += xdp_pass;
xdp_stats->xdp_drop += xdp_drop;
xdp_stats->xdp_xmit += xdp_xmit;
xdp_stats->xdp_xmit_err += xdp_xmit_err;
xdp_stats->xdp_tx += xdp_tx;
xdp_stats->xdp_tx_err += xdp_tx_err;
}
}
static void mvpp2_read_stats(struct mvpp2_port *port)
{
struct mvpp2_pcpu_stats xdp_stats = {};
const struct mvpp2_ethtool_counter *s;
u64 *pstats;
int i, q;
pstats = port->ethtool_stats;
for (i = 0; i < ARRAY_SIZE(mvpp2_ethtool_mib_regs); i++)
*pstats++ += mvpp2_read_count(port, &mvpp2_ethtool_mib_regs[i]);
for (i = 0; i < ARRAY_SIZE(mvpp2_ethtool_port_regs); i++)
*pstats++ += mvpp2_read(port->priv,
mvpp2_ethtool_port_regs[i].offset +
4 * port->id);
for (q = 0; q < port->ntxqs; q++)
for (i = 0; i < ARRAY_SIZE(mvpp2_ethtool_txq_regs); i++)
*pstats++ += mvpp2_read_index(port->priv,
MVPP22_CTRS_TX_CTR(port->id, q),
mvpp2_ethtool_txq_regs[i].offset);
/* Rxqs are numbered from 0 from the user standpoint, but not from the
* driver's. We need to add the port->first_rxq offset.
*/
for (q = 0; q < port->nrxqs; q++)
for (i = 0; i < ARRAY_SIZE(mvpp2_ethtool_rxq_regs); i++)
*pstats++ += mvpp2_read_index(port->priv,
port->first_rxq + q,
mvpp2_ethtool_rxq_regs[i].offset);
/* Gather XDP Statistics */
mvpp2_get_xdp_stats(port, &xdp_stats);
for (i = 0, s = mvpp2_ethtool_xdp;
s < mvpp2_ethtool_xdp + ARRAY_SIZE(mvpp2_ethtool_xdp);
s++, i++) {
switch (s->offset) {
case ETHTOOL_XDP_REDIRECT:
*pstats++ = xdp_stats.xdp_redirect;
break;
case ETHTOOL_XDP_PASS:
*pstats++ = xdp_stats.xdp_pass;
break;
case ETHTOOL_XDP_DROP:
*pstats++ = xdp_stats.xdp_drop;
break;
case ETHTOOL_XDP_TX:
*pstats++ = xdp_stats.xdp_tx;
break;
case ETHTOOL_XDP_TX_ERR:
*pstats++ = xdp_stats.xdp_tx_err;
break;
case ETHTOOL_XDP_XMIT:
*pstats++ = xdp_stats.xdp_xmit;
break;
case ETHTOOL_XDP_XMIT_ERR:
*pstats++ = xdp_stats.xdp_xmit_err;
break;
}
}
}
static void mvpp2_gather_hw_statistics(struct work_struct *work)
{
struct delayed_work *del_work = to_delayed_work(work);
struct mvpp2_port *port = container_of(del_work, struct mvpp2_port,
stats_work);
mutex_lock(&port->gather_stats_lock);
mvpp2_read_stats(port);
/* No need to read again the counters right after this function if it
* was called asynchronously by the user (ie. use of ethtool).
*/
cancel_delayed_work(&port->stats_work);
queue_delayed_work(port->priv->stats_queue, &port->stats_work,
MVPP2_MIB_COUNTERS_STATS_DELAY);
mutex_unlock(&port->gather_stats_lock);
}
static void mvpp2_ethtool_get_stats(struct net_device *dev,
struct ethtool_stats *stats, u64 *data)
{
struct mvpp2_port *port = netdev_priv(dev);
/* Update statistics for the given port, then take the lock to avoid
* concurrent accesses on the ethtool_stats structure during its copy.
*/
mvpp2_gather_hw_statistics(&port->stats_work.work);
mutex_lock(&port->gather_stats_lock);
memcpy(data, port->ethtool_stats,
sizeof(u64) * MVPP2_N_ETHTOOL_STATS(port->ntxqs, port->nrxqs));
mutex_unlock(&port->gather_stats_lock);
}
static int mvpp2_ethtool_get_sset_count(struct net_device *dev, int sset)
{
struct mvpp2_port *port = netdev_priv(dev);
if (sset == ETH_SS_STATS)
return MVPP2_N_ETHTOOL_STATS(port->ntxqs, port->nrxqs);
return -EOPNOTSUPP;
}
static void mvpp2_mac_reset_assert(struct mvpp2_port *port)
{
u32 val;
val = readl(port->base + MVPP2_GMAC_CTRL_2_REG) |
MVPP2_GMAC_PORT_RESET_MASK;
writel(val, port->base + MVPP2_GMAC_CTRL_2_REG);
if (port->priv->hw_version == MVPP22 && port->gop_id == 0) {
val = readl(port->base + MVPP22_XLG_CTRL0_REG) &
~MVPP22_XLG_CTRL0_MAC_RESET_DIS;
writel(val, port->base + MVPP22_XLG_CTRL0_REG);
}
}
static void mvpp22_pcs_reset_assert(struct mvpp2_port *port)
{
struct mvpp2 *priv = port->priv;
void __iomem *mpcs, *xpcs;
u32 val;
if (port->priv->hw_version != MVPP22 || port->gop_id != 0)
return;
mpcs = priv->iface_base + MVPP22_MPCS_BASE(port->gop_id);
xpcs = priv->iface_base + MVPP22_XPCS_BASE(port->gop_id);
val = readl(mpcs + MVPP22_MPCS_CLK_RESET);
val &= ~(MAC_CLK_RESET_MAC | MAC_CLK_RESET_SD_RX | MAC_CLK_RESET_SD_TX);
val |= MVPP22_MPCS_CLK_RESET_DIV_SET;
writel(val, mpcs + MVPP22_MPCS_CLK_RESET);
val = readl(xpcs + MVPP22_XPCS_CFG0);
writel(val & ~MVPP22_XPCS_CFG0_RESET_DIS, xpcs + MVPP22_XPCS_CFG0);
}
static void mvpp22_pcs_reset_deassert(struct mvpp2_port *port)
{
struct mvpp2 *priv = port->priv;
void __iomem *mpcs, *xpcs;
u32 val;
if (port->priv->hw_version != MVPP22 || port->gop_id != 0)
return;
mpcs = priv->iface_base + MVPP22_MPCS_BASE(port->gop_id);
xpcs = priv->iface_base + MVPP22_XPCS_BASE(port->gop_id);
switch (port->phy_interface) {
case PHY_INTERFACE_MODE_10GBASER:
val = readl(mpcs + MVPP22_MPCS_CLK_RESET);
val |= MAC_CLK_RESET_MAC | MAC_CLK_RESET_SD_RX |
MAC_CLK_RESET_SD_TX;
val &= ~MVPP22_MPCS_CLK_RESET_DIV_SET;
writel(val, mpcs + MVPP22_MPCS_CLK_RESET);
break;
case PHY_INTERFACE_MODE_XAUI:
case PHY_INTERFACE_MODE_RXAUI:
val = readl(xpcs + MVPP22_XPCS_CFG0);
writel(val | MVPP22_XPCS_CFG0_RESET_DIS, xpcs + MVPP22_XPCS_CFG0);
break;
default:
break;
}
}
/* Change maximum receive size of the port */
static inline void mvpp2_gmac_max_rx_size_set(struct mvpp2_port *port)
{
u32 val;
val = readl(port->base + MVPP2_GMAC_CTRL_0_REG);
val &= ~MVPP2_GMAC_MAX_RX_SIZE_MASK;
val |= (((port->pkt_size - MVPP2_MH_SIZE) / 2) <<
MVPP2_GMAC_MAX_RX_SIZE_OFFS);
writel(val, port->base + MVPP2_GMAC_CTRL_0_REG);
}
/* Change maximum receive size of the port */
static inline void mvpp2_xlg_max_rx_size_set(struct mvpp2_port *port)
{
u32 val;
val = readl(port->base + MVPP22_XLG_CTRL1_REG);
val &= ~MVPP22_XLG_CTRL1_FRAMESIZELIMIT_MASK;
val |= ((port->pkt_size - MVPP2_MH_SIZE) / 2) <<
MVPP22_XLG_CTRL1_FRAMESIZELIMIT_OFFS;
writel(val, port->base + MVPP22_XLG_CTRL1_REG);
}
/* Set defaults to the MVPP2 port */
static void mvpp2_defaults_set(struct mvpp2_port *port)
{
int tx_port_num, val, queue, lrxq;
if (port->priv->hw_version == MVPP21) {
/* Update TX FIFO MIN Threshold */
val = readl(port->base + MVPP2_GMAC_PORT_FIFO_CFG_1_REG);
val &= ~MVPP2_GMAC_TX_FIFO_MIN_TH_ALL_MASK;
/* Min. TX threshold must be less than minimal packet length */
val |= MVPP2_GMAC_TX_FIFO_MIN_TH_MASK(64 - 4 - 2);
writel(val, port->base + MVPP2_GMAC_PORT_FIFO_CFG_1_REG);
}
/* Disable Legacy WRR, Disable EJP, Release from reset */
tx_port_num = mvpp2_egress_port(port);
mvpp2_write(port->priv, MVPP2_TXP_SCHED_PORT_INDEX_REG,
tx_port_num);
mvpp2_write(port->priv, MVPP2_TXP_SCHED_CMD_1_REG, 0);
/* Set TXQ scheduling to Round-Robin */
mvpp2_write(port->priv, MVPP2_TXP_SCHED_FIXED_PRIO_REG, 0);
/* Close bandwidth for all queues */
for (queue = 0; queue < MVPP2_MAX_TXQ; queue++)
mvpp2_write(port->priv,
MVPP2_TXQ_SCHED_TOKEN_CNTR_REG(queue), 0);
/* Set refill period to 1 usec, refill tokens
* and bucket size to maximum
*/
mvpp2_write(port->priv, MVPP2_TXP_SCHED_PERIOD_REG,
port->priv->tclk / USEC_PER_SEC);
val = mvpp2_read(port->priv, MVPP2_TXP_SCHED_REFILL_REG);
val &= ~MVPP2_TXP_REFILL_PERIOD_ALL_MASK;
val |= MVPP2_TXP_REFILL_PERIOD_MASK(1);
val |= MVPP2_TXP_REFILL_TOKENS_ALL_MASK;
mvpp2_write(port->priv, MVPP2_TXP_SCHED_REFILL_REG, val);
val = MVPP2_TXP_TOKEN_SIZE_MAX;
mvpp2_write(port->priv, MVPP2_TXP_SCHED_TOKEN_SIZE_REG, val);
/* Set MaximumLowLatencyPacketSize value to 256 */
mvpp2_write(port->priv, MVPP2_RX_CTRL_REG(port->id),
MVPP2_RX_USE_PSEUDO_FOR_CSUM_MASK |
MVPP2_RX_LOW_LATENCY_PKT_SIZE(256));
/* Enable Rx cache snoop */
for (lrxq = 0; lrxq < port->nrxqs; lrxq++) {
queue = port->rxqs[lrxq]->id;
val = mvpp2_read(port->priv, MVPP2_RXQ_CONFIG_REG(queue));
val |= MVPP2_SNOOP_PKT_SIZE_MASK |
MVPP2_SNOOP_BUF_HDR_MASK;
mvpp2_write(port->priv, MVPP2_RXQ_CONFIG_REG(queue), val);
}
/* At default, mask all interrupts to all present cpus */
mvpp2_interrupts_disable(port);
}
/* Enable/disable receiving packets */
static void mvpp2_ingress_enable(struct mvpp2_port *port)
{
u32 val;
int lrxq, queue;
for (lrxq = 0; lrxq < port->nrxqs; lrxq++) {
queue = port->rxqs[lrxq]->id;
val = mvpp2_read(port->priv, MVPP2_RXQ_CONFIG_REG(queue));
val &= ~MVPP2_RXQ_DISABLE_MASK;
mvpp2_write(port->priv, MVPP2_RXQ_CONFIG_REG(queue), val);
}
}
static void mvpp2_ingress_disable(struct mvpp2_port *port)
{
u32 val;
int lrxq, queue;
for (lrxq = 0; lrxq < port->nrxqs; lrxq++) {
queue = port->rxqs[lrxq]->id;
val = mvpp2_read(port->priv, MVPP2_RXQ_CONFIG_REG(queue));
val |= MVPP2_RXQ_DISABLE_MASK;
mvpp2_write(port->priv, MVPP2_RXQ_CONFIG_REG(queue), val);
}
}
/* Enable transmit via physical egress queue
* - HW starts take descriptors from DRAM
*/
static void mvpp2_egress_enable(struct mvpp2_port *port)
{
u32 qmap;
int queue;
int tx_port_num = mvpp2_egress_port(port);
/* Enable all initialized TXs. */
qmap = 0;
for (queue = 0; queue < port->ntxqs; queue++) {
struct mvpp2_tx_queue *txq = port->txqs[queue];
if (txq->descs)
qmap |= (1 << queue);
}
mvpp2_write(port->priv, MVPP2_TXP_SCHED_PORT_INDEX_REG, tx_port_num);
mvpp2_write(port->priv, MVPP2_TXP_SCHED_Q_CMD_REG, qmap);
}
/* Disable transmit via physical egress queue
* - HW doesn't take descriptors from DRAM
*/
static void mvpp2_egress_disable(struct mvpp2_port *port)
{
u32 reg_data;
int delay;
int tx_port_num = mvpp2_egress_port(port);
/* Issue stop command for active channels only */
mvpp2_write(port->priv, MVPP2_TXP_SCHED_PORT_INDEX_REG, tx_port_num);
reg_data = (mvpp2_read(port->priv, MVPP2_TXP_SCHED_Q_CMD_REG)) &
MVPP2_TXP_SCHED_ENQ_MASK;
if (reg_data != 0)
mvpp2_write(port->priv, MVPP2_TXP_SCHED_Q_CMD_REG,
(reg_data << MVPP2_TXP_SCHED_DISQ_OFFSET));
/* Wait for all Tx activity to terminate. */
delay = 0;
do {
if (delay >= MVPP2_TX_DISABLE_TIMEOUT_MSEC) {
netdev_warn(port->dev,
"Tx stop timed out, status=0x%08x\n",
reg_data);
break;
}
mdelay(1);
delay++;
/* Check port TX Command register that all
* Tx queues are stopped
*/
reg_data = mvpp2_read(port->priv, MVPP2_TXP_SCHED_Q_CMD_REG);
} while (reg_data & MVPP2_TXP_SCHED_ENQ_MASK);
}
/* Rx descriptors helper methods */
/* Get number of Rx descriptors occupied by received packets */
static inline int
mvpp2_rxq_received(struct mvpp2_port *port, int rxq_id)
{
u32 val = mvpp2_read(port->priv, MVPP2_RXQ_STATUS_REG(rxq_id));
return val & MVPP2_RXQ_OCCUPIED_MASK;
}
/* Update Rx queue status with the number of occupied and available
* Rx descriptor slots.
*/
static inline void
mvpp2_rxq_status_update(struct mvpp2_port *port, int rxq_id,
int used_count, int free_count)
{
/* Decrement the number of used descriptors and increment count
* increment the number of free descriptors.
*/
u32 val = used_count | (free_count << MVPP2_RXQ_NUM_NEW_OFFSET);
mvpp2_write(port->priv, MVPP2_RXQ_STATUS_UPDATE_REG(rxq_id), val);
}
/* Get pointer to next RX descriptor to be processed by SW */
static inline struct mvpp2_rx_desc *
mvpp2_rxq_next_desc_get(struct mvpp2_rx_queue *rxq)
{
int rx_desc = rxq->next_desc_to_proc;
rxq->next_desc_to_proc = MVPP2_QUEUE_NEXT_DESC(rxq, rx_desc);
prefetch(rxq->descs + rxq->next_desc_to_proc);
return rxq->descs + rx_desc;
}
/* Set rx queue offset */
static void mvpp2_rxq_offset_set(struct mvpp2_port *port,
int prxq, int offset)
{
u32 val;
/* Convert offset from bytes to units of 32 bytes */
offset = offset >> 5;
val = mvpp2_read(port->priv, MVPP2_RXQ_CONFIG_REG(prxq));
val &= ~MVPP2_RXQ_PACKET_OFFSET_MASK;
/* Offset is in */
val |= ((offset << MVPP2_RXQ_PACKET_OFFSET_OFFS) &
MVPP2_RXQ_PACKET_OFFSET_MASK);
mvpp2_write(port->priv, MVPP2_RXQ_CONFIG_REG(prxq), val);
}
/* Tx descriptors helper methods */
/* Get pointer to next Tx descriptor to be processed (send) by HW */
static struct mvpp2_tx_desc *
mvpp2_txq_next_desc_get(struct mvpp2_tx_queue *txq)
{
int tx_desc = txq->next_desc_to_proc;
txq->next_desc_to_proc = MVPP2_QUEUE_NEXT_DESC(txq, tx_desc);
return txq->descs + tx_desc;
}
/* Update HW with number of aggregated Tx descriptors to be sent
*
* Called only from mvpp2_tx(), so migration is disabled, using
* smp_processor_id() is OK.
*/
static void mvpp2_aggr_txq_pend_desc_add(struct mvpp2_port *port, int pending)
{
/* aggregated access - relevant TXQ number is written in TX desc */
mvpp2_thread_write(port->priv,
mvpp2_cpu_to_thread(port->priv, smp_processor_id()),
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
MVPP2_AGGR_TXQ_UPDATE_REG, pending);
}
/* Check if there are enough free descriptors in aggregated txq.
* If not, update the number of occupied descriptors and repeat the check.
*
* Called only from mvpp2_tx(), so migration is disabled, using
* smp_processor_id() is OK.
*/
static int mvpp2_aggr_desc_num_check(struct mvpp2_port *port,
struct mvpp2_tx_queue *aggr_txq, int num)
{
if ((aggr_txq->count + num) > MVPP2_AGGR_TXQ_SIZE) {
/* Update number of occupied aggregated Tx descriptors */
unsigned int thread =
mvpp2_cpu_to_thread(port->priv, smp_processor_id());
u32 val = mvpp2_read_relaxed(port->priv,
MVPP2_AGGR_TXQ_STATUS_REG(thread));
aggr_txq->count = val & MVPP2_AGGR_TXQ_PENDING_MASK;
if ((aggr_txq->count + num) > MVPP2_AGGR_TXQ_SIZE)
return -ENOMEM;
}
return 0;
}
/* Reserved Tx descriptors allocation request
*
* Called only from mvpp2_txq_reserved_desc_num_proc(), itself called
* only by mvpp2_tx(), so migration is disabled, using
* smp_processor_id() is OK.
*/
static int mvpp2_txq_alloc_reserved_desc(struct mvpp2_port *port,
struct mvpp2_tx_queue *txq, int num)
{
unsigned int thread = mvpp2_cpu_to_thread(port->priv, smp_processor_id());
struct mvpp2 *priv = port->priv;
u32 val;
val = (txq->id << MVPP2_TXQ_RSVD_REQ_Q_OFFSET) | num;
mvpp2_thread_write_relaxed(priv, thread, MVPP2_TXQ_RSVD_REQ_REG, val);
val = mvpp2_thread_read_relaxed(priv, thread, MVPP2_TXQ_RSVD_RSLT_REG);
return val & MVPP2_TXQ_RSVD_RSLT_MASK;
}
/* Check if there are enough reserved descriptors for transmission.
* If not, request chunk of reserved descriptors and check again.
*/
static int mvpp2_txq_reserved_desc_num_proc(struct mvpp2_port *port,
struct mvpp2_tx_queue *txq,
struct mvpp2_txq_pcpu *txq_pcpu,
int num)
{
int req, desc_count;
unsigned int thread;
if (txq_pcpu->reserved_num >= num)
return 0;
/* Not enough descriptors reserved! Update the reserved descriptor
* count and check again.
*/
desc_count = 0;
/* Compute total of used descriptors */
for (thread = 0; thread < port->priv->nthreads; thread++) {
struct mvpp2_txq_pcpu *txq_pcpu_aux;
txq_pcpu_aux = per_cpu_ptr(txq->pcpu, thread);
desc_count += txq_pcpu_aux->count;
desc_count += txq_pcpu_aux->reserved_num;
}
req = max(MVPP2_CPU_DESC_CHUNK, num - txq_pcpu->reserved_num);
desc_count += req;
if (desc_count >
(txq->size - (MVPP2_MAX_THREADS * MVPP2_CPU_DESC_CHUNK)))
return -ENOMEM;
txq_pcpu->reserved_num += mvpp2_txq_alloc_reserved_desc(port, txq, req);
/* OK, the descriptor could have been updated: check again. */
if (txq_pcpu->reserved_num < num)
return -ENOMEM;
return 0;
}
/* Release the last allocated Tx descriptor. Useful to handle DMA
* mapping failures in the Tx path.
*/
static void mvpp2_txq_desc_put(struct mvpp2_tx_queue *txq)
{
if (txq->next_desc_to_proc == 0)
txq->next_desc_to_proc = txq->last_desc - 1;
else
txq->next_desc_to_proc--;
}
/* Set Tx descriptors fields relevant for CSUM calculation */
static u32 mvpp2_txq_desc_csum(int l3_offs, __be16 l3_proto,
int ip_hdr_len, int l4_proto)
{
u32 command;
/* fields: L3_offset, IP_hdrlen, L3_type, G_IPv4_chk,
* G_L4_chk, L4_type required only for checksum calculation
*/
command = (l3_offs << MVPP2_TXD_L3_OFF_SHIFT);
command |= (ip_hdr_len << MVPP2_TXD_IP_HLEN_SHIFT);
command |= MVPP2_TXD_IP_CSUM_DISABLE;
if (l3_proto == htons(ETH_P_IP)) {
command &= ~MVPP2_TXD_IP_CSUM_DISABLE; /* enable IPv4 csum */
command &= ~MVPP2_TXD_L3_IP6; /* enable IPv4 */
} else {
command |= MVPP2_TXD_L3_IP6; /* enable IPv6 */
}
if (l4_proto == IPPROTO_TCP) {
command &= ~MVPP2_TXD_L4_UDP; /* enable TCP */
command &= ~MVPP2_TXD_L4_CSUM_FRAG; /* generate L4 csum */
} else if (l4_proto == IPPROTO_UDP) {
command |= MVPP2_TXD_L4_UDP; /* enable UDP */
command &= ~MVPP2_TXD_L4_CSUM_FRAG; /* generate L4 csum */
} else {
command |= MVPP2_TXD_L4_CSUM_NOT;
}
return command;
}
/* Get number of sent descriptors and decrement counter.
* The number of sent descriptors is returned.
* Per-thread access
*
* Called only from mvpp2_txq_done(), called from mvpp2_tx()
* (migration disabled) and from the TX completion tasklet (migration
* disabled) so using smp_processor_id() is OK.
*/
static inline int mvpp2_txq_sent_desc_proc(struct mvpp2_port *port,
struct mvpp2_tx_queue *txq)
{
u32 val;
/* Reading status reg resets transmitted descriptor counter */
val = mvpp2_thread_read_relaxed(port->priv,
mvpp2_cpu_to_thread(port->priv, smp_processor_id()),
MVPP2_TXQ_SENT_REG(txq->id));
return (val & MVPP2_TRANSMITTED_COUNT_MASK) >>
MVPP2_TRANSMITTED_COUNT_OFFSET;
}
/* Called through on_each_cpu(), so runs on all CPUs, with migration
* disabled, therefore using smp_processor_id() is OK.
*/
static void mvpp2_txq_sent_counter_clear(void *arg)
{
struct mvpp2_port *port = arg;
int queue;
/* If the thread isn't used, don't do anything */
if (smp_processor_id() > port->priv->nthreads)
return;
for (queue = 0; queue < port->ntxqs; queue++) {
int id = port->txqs[queue]->id;
mvpp2_thread_read(port->priv,
mvpp2_cpu_to_thread(port->priv, smp_processor_id()),
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
MVPP2_TXQ_SENT_REG(id));
}
}
/* Set max sizes for Tx queues */
static void mvpp2_txp_max_tx_size_set(struct mvpp2_port *port)
{
u32 val, size, mtu;
int txq, tx_port_num;
mtu = port->pkt_size * 8;
if (mtu > MVPP2_TXP_MTU_MAX)
mtu = MVPP2_TXP_MTU_MAX;
/* WA for wrong Token bucket update: Set MTU value = 3*real MTU value */
mtu = 3 * mtu;
/* Indirect access to registers */
tx_port_num = mvpp2_egress_port(port);
mvpp2_write(port->priv, MVPP2_TXP_SCHED_PORT_INDEX_REG, tx_port_num);
/* Set MTU */
val = mvpp2_read(port->priv, MVPP2_TXP_SCHED_MTU_REG);
val &= ~MVPP2_TXP_MTU_MAX;
val |= mtu;
mvpp2_write(port->priv, MVPP2_TXP_SCHED_MTU_REG, val);
/* TXP token size and all TXQs token size must be larger that MTU */
val = mvpp2_read(port->priv, MVPP2_TXP_SCHED_TOKEN_SIZE_REG);
size = val & MVPP2_TXP_TOKEN_SIZE_MAX;
if (size < mtu) {
size = mtu;
val &= ~MVPP2_TXP_TOKEN_SIZE_MAX;
val |= size;
mvpp2_write(port->priv, MVPP2_TXP_SCHED_TOKEN_SIZE_REG, val);
}
for (txq = 0; txq < port->ntxqs; txq++) {
val = mvpp2_read(port->priv,
MVPP2_TXQ_SCHED_TOKEN_SIZE_REG(txq));
size = val & MVPP2_TXQ_TOKEN_SIZE_MAX;
if (size < mtu) {
size = mtu;
val &= ~MVPP2_TXQ_TOKEN_SIZE_MAX;
val |= size;
mvpp2_write(port->priv,
MVPP2_TXQ_SCHED_TOKEN_SIZE_REG(txq),
val);
}
}
}
/* Set the number of packets that will be received before Rx interrupt
* will be generated by HW.
*/
static void mvpp2_rx_pkts_coal_set(struct mvpp2_port *port,
struct mvpp2_rx_queue *rxq)
{
unsigned int thread = mvpp2_cpu_to_thread(port->priv, get_cpu());
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
if (rxq->pkts_coal > MVPP2_OCCUPIED_THRESH_MASK)
rxq->pkts_coal = MVPP2_OCCUPIED_THRESH_MASK;
mvpp2_thread_write(port->priv, thread, MVPP2_RXQ_NUM_REG, rxq->id);
mvpp2_thread_write(port->priv, thread, MVPP2_RXQ_THRESH_REG,
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
rxq->pkts_coal);
put_cpu();
}
/* For some reason in the LSP this is done on each CPU. Why ? */
static void mvpp2_tx_pkts_coal_set(struct mvpp2_port *port,
struct mvpp2_tx_queue *txq)
{
unsigned int thread = mvpp2_cpu_to_thread(port->priv, get_cpu());
u32 val;
if (txq->done_pkts_coal > MVPP2_TXQ_THRESH_MASK)
txq->done_pkts_coal = MVPP2_TXQ_THRESH_MASK;
val = (txq->done_pkts_coal << MVPP2_TXQ_THRESH_OFFSET);
mvpp2_thread_write(port->priv, thread, MVPP2_TXQ_NUM_REG, txq->id);
mvpp2_thread_write(port->priv, thread, MVPP2_TXQ_THRESH_REG, val);
put_cpu();
}
static u32 mvpp2_usec_to_cycles(u32 usec, unsigned long clk_hz)
{
u64 tmp = (u64)clk_hz * usec;
do_div(tmp, USEC_PER_SEC);
return tmp > U32_MAX ? U32_MAX : tmp;
}
static u32 mvpp2_cycles_to_usec(u32 cycles, unsigned long clk_hz)
{
u64 tmp = (u64)cycles * USEC_PER_SEC;
do_div(tmp, clk_hz);
return tmp > U32_MAX ? U32_MAX : tmp;
}
/* Set the time delay in usec before Rx interrupt */
static void mvpp2_rx_time_coal_set(struct mvpp2_port *port,
struct mvpp2_rx_queue *rxq)
{
unsigned long freq = port->priv->tclk;
u32 val = mvpp2_usec_to_cycles(rxq->time_coal, freq);
if (val > MVPP2_MAX_ISR_RX_THRESHOLD) {
rxq->time_coal =
mvpp2_cycles_to_usec(MVPP2_MAX_ISR_RX_THRESHOLD, freq);
/* re-evaluate to get actual register value */
val = mvpp2_usec_to_cycles(rxq->time_coal, freq);
}
mvpp2_write(port->priv, MVPP2_ISR_RX_THRESHOLD_REG(rxq->id), val);
}
static void mvpp2_tx_time_coal_set(struct mvpp2_port *port)
{
unsigned long freq = port->priv->tclk;
u32 val = mvpp2_usec_to_cycles(port->tx_time_coal, freq);
if (val > MVPP2_MAX_ISR_TX_THRESHOLD) {
port->tx_time_coal =
mvpp2_cycles_to_usec(MVPP2_MAX_ISR_TX_THRESHOLD, freq);
/* re-evaluate to get actual register value */
val = mvpp2_usec_to_cycles(port->tx_time_coal, freq);
}
mvpp2_write(port->priv, MVPP2_ISR_TX_THRESHOLD_REG(port->id), val);
}
/* Free Tx queue skbuffs */
static void mvpp2_txq_bufs_free(struct mvpp2_port *port,
struct mvpp2_tx_queue *txq,
struct mvpp2_txq_pcpu *txq_pcpu, int num)
{
int i;
for (i = 0; i < num; i++) {
net: mvpp2: fix dma unmapping of TX buffers for fragments Since commit 71ce391dfb784 ("net: mvpp2: enable proper per-CPU TX buffers unmapping"), we are not correctly DMA unmapping TX buffers for fragments. Indeed, the mvpp2_txq_inc_put() function only stores in the txq_cpu->tx_buffs[] array the physical address of the buffer to be DMA-unmapped when skb != NULL. In addition, when DMA-unmapping, we use skb_headlen(skb) to get the size to be unmapped. Both of this works fine for TX descriptors that are associated directly to a SKB, but not the ones that are used for fragments, with a NULL pointer as skb: - We have a NULL physical address when calling DMA unmap - skb_headlen(skb) crashes because skb is NULL This causes random crashes when fragments are used. To solve this problem, we need to: - Store the physical address of the buffer to be unmapped unconditionally, regardless of whether it is tied to a SKB or not. - Store the length of the buffer to be unmapped, which requires a new field. Instead of adding a third array to store the length of the buffer to be unmapped, and as suggested by David Miller, this commit refactors the tx_buffs[] and tx_skb[] arrays of 'struct mvpp2_txq_pcpu' into a separate structure 'mvpp2_txq_pcpu_buf', to which a 'size' field is added. Therefore, instead of having three arrays to allocate/free, we have a single one, which also improve data locality, reducing the impact on the CPU cache. Fixes: 71ce391dfb784 ("net: mvpp2: enable proper per-CPU TX buffers unmapping") Reported-by: Raphael G <raphael.glon@corp.ovh.com> Cc: Raphael G <raphael.glon@corp.ovh.com> Cc: stable@vger.kernel.org Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-21 18:28:49 +08:00
struct mvpp2_txq_pcpu_buf *tx_buf =
txq_pcpu->buffs + txq_pcpu->txq_get_index;
if (!IS_TSO_HEADER(txq_pcpu, tx_buf->dma) &&
tx_buf->type != MVPP2_TYPE_XDP_TX)
dma_unmap_single(port->dev->dev.parent, tx_buf->dma,
tx_buf->size, DMA_TO_DEVICE);
if (tx_buf->type == MVPP2_TYPE_SKB && tx_buf->skb)
dev_kfree_skb_any(tx_buf->skb);
else if (tx_buf->type == MVPP2_TYPE_XDP_TX ||
tx_buf->type == MVPP2_TYPE_XDP_NDO)
xdp_return_frame(tx_buf->xdpf);
mvpp2_txq_inc_get(txq_pcpu);
}
}
static inline struct mvpp2_rx_queue *mvpp2_get_rx_queue(struct mvpp2_port *port,
u32 cause)
{
int queue = fls(cause) - 1;
return port->rxqs[queue];
}
static inline struct mvpp2_tx_queue *mvpp2_get_tx_queue(struct mvpp2_port *port,
u32 cause)
{
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
int queue = fls(cause) - 1;
return port->txqs[queue];
}
/* Handle end of transmission */
static void mvpp2_txq_done(struct mvpp2_port *port, struct mvpp2_tx_queue *txq,
struct mvpp2_txq_pcpu *txq_pcpu)
{
struct netdev_queue *nq = netdev_get_tx_queue(port->dev, txq->log_id);
int tx_done;
if (txq_pcpu->thread != mvpp2_cpu_to_thread(port->priv, smp_processor_id()))
netdev_err(port->dev, "wrong cpu on the end of Tx processing\n");
tx_done = mvpp2_txq_sent_desc_proc(port, txq);
if (!tx_done)
return;
mvpp2_txq_bufs_free(port, txq, txq_pcpu, tx_done);
txq_pcpu->count -= tx_done;
if (netif_tx_queue_stopped(nq))
if (txq_pcpu->count <= txq_pcpu->wake_threshold)
netif_tx_wake_queue(nq);
}
static unsigned int mvpp2_tx_done(struct mvpp2_port *port, u32 cause,
unsigned int thread)
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
{
struct mvpp2_tx_queue *txq;
struct mvpp2_txq_pcpu *txq_pcpu;
unsigned int tx_todo = 0;
while (cause) {
txq = mvpp2_get_tx_queue(port, cause);
if (!txq)
break;
txq_pcpu = per_cpu_ptr(txq->pcpu, thread);
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
if (txq_pcpu->count) {
mvpp2_txq_done(port, txq, txq_pcpu);
tx_todo += txq_pcpu->count;
}
cause &= ~(1 << txq->log_id);
}
return tx_todo;
}
/* Rx/Tx queue initialization/cleanup methods */
/* Allocate and initialize descriptors for aggr TXQ */
static int mvpp2_aggr_txq_init(struct platform_device *pdev,
struct mvpp2_tx_queue *aggr_txq,
unsigned int thread, struct mvpp2 *priv)
{
u32 txq_dma;
/* Allocate memory for TX descriptors */
aggr_txq->descs = dma_alloc_coherent(&pdev->dev,
MVPP2_AGGR_TXQ_SIZE * MVPP2_DESC_ALIGNED_SIZE,
&aggr_txq->descs_dma, GFP_KERNEL);
if (!aggr_txq->descs)
return -ENOMEM;
aggr_txq->last_desc = MVPP2_AGGR_TXQ_SIZE - 1;
/* Aggr TXQ no reset WA */
aggr_txq->next_desc_to_proc = mvpp2_read(priv,
MVPP2_AGGR_TXQ_INDEX_REG(thread));
/* Set Tx descriptors queue starting address indirect
* access
*/
if (priv->hw_version == MVPP21)
txq_dma = aggr_txq->descs_dma;
else
txq_dma = aggr_txq->descs_dma >>
MVPP22_AGGR_TXQ_DESC_ADDR_OFFS;
mvpp2_write(priv, MVPP2_AGGR_TXQ_DESC_ADDR_REG(thread), txq_dma);
mvpp2_write(priv, MVPP2_AGGR_TXQ_DESC_SIZE_REG(thread),
MVPP2_AGGR_TXQ_SIZE);
return 0;
}
/* Create a specified Rx queue */
static int mvpp2_rxq_init(struct mvpp2_port *port,
struct mvpp2_rx_queue *rxq)
{
struct mvpp2 *priv = port->priv;
unsigned int thread;
u32 rxq_dma;
int err;
rxq->size = port->rx_ring_size;
/* Allocate memory for RX descriptors */
rxq->descs = dma_alloc_coherent(port->dev->dev.parent,
rxq->size * MVPP2_DESC_ALIGNED_SIZE,
&rxq->descs_dma, GFP_KERNEL);
if (!rxq->descs)
return -ENOMEM;
rxq->last_desc = rxq->size - 1;
/* Zero occupied and non-occupied counters - direct access */
mvpp2_write(port->priv, MVPP2_RXQ_STATUS_REG(rxq->id), 0);
/* Set Rx descriptors queue starting address - indirect access */
thread = mvpp2_cpu_to_thread(port->priv, get_cpu());
mvpp2_thread_write(port->priv, thread, MVPP2_RXQ_NUM_REG, rxq->id);
if (port->priv->hw_version == MVPP21)
rxq_dma = rxq->descs_dma;
else
rxq_dma = rxq->descs_dma >> MVPP22_DESC_ADDR_OFFS;
mvpp2_thread_write(port->priv, thread, MVPP2_RXQ_DESC_ADDR_REG, rxq_dma);
mvpp2_thread_write(port->priv, thread, MVPP2_RXQ_DESC_SIZE_REG, rxq->size);
mvpp2_thread_write(port->priv, thread, MVPP2_RXQ_INDEX_REG, 0);
put_cpu();
/* Set Offset */
mvpp2_rxq_offset_set(port, rxq->id, MVPP2_SKB_HEADROOM);
/* Set coalescing pkts and time */
mvpp2_rx_pkts_coal_set(port, rxq);
mvpp2_rx_time_coal_set(port, rxq);
/* Add number of descriptors ready for receiving packets */
mvpp2_rxq_status_update(port, rxq->id, 0, rxq->size);
if (priv->percpu_pools) {
err = xdp_rxq_info_reg(&rxq->xdp_rxq_short, port->dev, rxq->id);
if (err < 0)
goto err_free_dma;
err = xdp_rxq_info_reg(&rxq->xdp_rxq_long, port->dev, rxq->id);
if (err < 0)
goto err_unregister_rxq_short;
/* Every RXQ has a pool for short and another for long packets */
err = xdp_rxq_info_reg_mem_model(&rxq->xdp_rxq_short,
MEM_TYPE_PAGE_POOL,
priv->page_pool[rxq->logic_rxq]);
if (err < 0)
goto err_unregister_rxq_long;
err = xdp_rxq_info_reg_mem_model(&rxq->xdp_rxq_long,
MEM_TYPE_PAGE_POOL,
priv->page_pool[rxq->logic_rxq +
port->nrxqs]);
if (err < 0)
goto err_unregister_mem_rxq_short;
}
return 0;
err_unregister_mem_rxq_short:
xdp_rxq_info_unreg_mem_model(&rxq->xdp_rxq_short);
err_unregister_rxq_long:
xdp_rxq_info_unreg(&rxq->xdp_rxq_long);
err_unregister_rxq_short:
xdp_rxq_info_unreg(&rxq->xdp_rxq_short);
err_free_dma:
dma_free_coherent(port->dev->dev.parent,
rxq->size * MVPP2_DESC_ALIGNED_SIZE,
rxq->descs, rxq->descs_dma);
return err;
}
/* Push packets received by the RXQ to BM pool */
static void mvpp2_rxq_drop_pkts(struct mvpp2_port *port,
struct mvpp2_rx_queue *rxq)
{
int rx_received, i;
rx_received = mvpp2_rxq_received(port, rxq->id);
if (!rx_received)
return;
for (i = 0; i < rx_received; i++) {
struct mvpp2_rx_desc *rx_desc = mvpp2_rxq_next_desc_get(rxq);
u32 status = mvpp2_rxdesc_status_get(port, rx_desc);
int pool;
pool = (status & MVPP2_RXD_BM_POOL_ID_MASK) >>
MVPP2_RXD_BM_POOL_ID_OFFS;
mvpp2_bm_pool_put(port, pool,
mvpp2_rxdesc_dma_addr_get(port, rx_desc),
mvpp2_rxdesc_cookie_get(port, rx_desc));
}
mvpp2_rxq_status_update(port, rxq->id, rx_received, rx_received);
}
/* Cleanup Rx queue */
static void mvpp2_rxq_deinit(struct mvpp2_port *port,
struct mvpp2_rx_queue *rxq)
{
unsigned int thread;
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
if (xdp_rxq_info_is_reg(&rxq->xdp_rxq_short))
xdp_rxq_info_unreg(&rxq->xdp_rxq_short);
if (xdp_rxq_info_is_reg(&rxq->xdp_rxq_long))
xdp_rxq_info_unreg(&rxq->xdp_rxq_long);
mvpp2_rxq_drop_pkts(port, rxq);
if (rxq->descs)
dma_free_coherent(port->dev->dev.parent,
rxq->size * MVPP2_DESC_ALIGNED_SIZE,
rxq->descs,
rxq->descs_dma);
rxq->descs = NULL;
rxq->last_desc = 0;
rxq->next_desc_to_proc = 0;
rxq->descs_dma = 0;
/* Clear Rx descriptors queue starting address and size;
* free descriptor number
*/
mvpp2_write(port->priv, MVPP2_RXQ_STATUS_REG(rxq->id), 0);
thread = mvpp2_cpu_to_thread(port->priv, get_cpu());
mvpp2_thread_write(port->priv, thread, MVPP2_RXQ_NUM_REG, rxq->id);
mvpp2_thread_write(port->priv, thread, MVPP2_RXQ_DESC_ADDR_REG, 0);
mvpp2_thread_write(port->priv, thread, MVPP2_RXQ_DESC_SIZE_REG, 0);
put_cpu();
}
/* Create and initialize a Tx queue */
static int mvpp2_txq_init(struct mvpp2_port *port,
struct mvpp2_tx_queue *txq)
{
u32 val;
unsigned int thread;
int desc, desc_per_txq, tx_port_num;
struct mvpp2_txq_pcpu *txq_pcpu;
txq->size = port->tx_ring_size;
/* Allocate memory for Tx descriptors */
txq->descs = dma_alloc_coherent(port->dev->dev.parent,
txq->size * MVPP2_DESC_ALIGNED_SIZE,
&txq->descs_dma, GFP_KERNEL);
if (!txq->descs)
return -ENOMEM;
txq->last_desc = txq->size - 1;
/* Set Tx descriptors queue starting address - indirect access */
thread = mvpp2_cpu_to_thread(port->priv, get_cpu());
mvpp2_thread_write(port->priv, thread, MVPP2_TXQ_NUM_REG, txq->id);
mvpp2_thread_write(port->priv, thread, MVPP2_TXQ_DESC_ADDR_REG,
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
txq->descs_dma);
mvpp2_thread_write(port->priv, thread, MVPP2_TXQ_DESC_SIZE_REG,
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
txq->size & MVPP2_TXQ_DESC_SIZE_MASK);
mvpp2_thread_write(port->priv, thread, MVPP2_TXQ_INDEX_REG, 0);
mvpp2_thread_write(port->priv, thread, MVPP2_TXQ_RSVD_CLR_REG,
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
txq->id << MVPP2_TXQ_RSVD_CLR_OFFSET);
val = mvpp2_thread_read(port->priv, thread, MVPP2_TXQ_PENDING_REG);
val &= ~MVPP2_TXQ_PENDING_MASK;
mvpp2_thread_write(port->priv, thread, MVPP2_TXQ_PENDING_REG, val);
/* Calculate base address in prefetch buffer. We reserve 16 descriptors
* for each existing TXQ.
* TCONTS for PON port must be continuous from 0 to MVPP2_MAX_TCONT
* GBE ports assumed to be continuous from 0 to MVPP2_MAX_PORTS
*/
desc_per_txq = 16;
desc = (port->id * MVPP2_MAX_TXQ * desc_per_txq) +
(txq->log_id * desc_per_txq);
mvpp2_thread_write(port->priv, thread, MVPP2_TXQ_PREF_BUF_REG,
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
MVPP2_PREF_BUF_PTR(desc) | MVPP2_PREF_BUF_SIZE_16 |
MVPP2_PREF_BUF_THRESH(desc_per_txq / 2));
put_cpu();
/* WRR / EJP configuration - indirect access */
tx_port_num = mvpp2_egress_port(port);
mvpp2_write(port->priv, MVPP2_TXP_SCHED_PORT_INDEX_REG, tx_port_num);
val = mvpp2_read(port->priv, MVPP2_TXQ_SCHED_REFILL_REG(txq->log_id));
val &= ~MVPP2_TXQ_REFILL_PERIOD_ALL_MASK;
val |= MVPP2_TXQ_REFILL_PERIOD_MASK(1);
val |= MVPP2_TXQ_REFILL_TOKENS_ALL_MASK;
mvpp2_write(port->priv, MVPP2_TXQ_SCHED_REFILL_REG(txq->log_id), val);
val = MVPP2_TXQ_TOKEN_SIZE_MAX;
mvpp2_write(port->priv, MVPP2_TXQ_SCHED_TOKEN_SIZE_REG(txq->log_id),
val);
for (thread = 0; thread < port->priv->nthreads; thread++) {
txq_pcpu = per_cpu_ptr(txq->pcpu, thread);
txq_pcpu->size = txq->size;
txq_pcpu->buffs = kmalloc_array(txq_pcpu->size,
sizeof(*txq_pcpu->buffs),
GFP_KERNEL);
net: mvpp2: fix dma unmapping of TX buffers for fragments Since commit 71ce391dfb784 ("net: mvpp2: enable proper per-CPU TX buffers unmapping"), we are not correctly DMA unmapping TX buffers for fragments. Indeed, the mvpp2_txq_inc_put() function only stores in the txq_cpu->tx_buffs[] array the physical address of the buffer to be DMA-unmapped when skb != NULL. In addition, when DMA-unmapping, we use skb_headlen(skb) to get the size to be unmapped. Both of this works fine for TX descriptors that are associated directly to a SKB, but not the ones that are used for fragments, with a NULL pointer as skb: - We have a NULL physical address when calling DMA unmap - skb_headlen(skb) crashes because skb is NULL This causes random crashes when fragments are used. To solve this problem, we need to: - Store the physical address of the buffer to be unmapped unconditionally, regardless of whether it is tied to a SKB or not. - Store the length of the buffer to be unmapped, which requires a new field. Instead of adding a third array to store the length of the buffer to be unmapped, and as suggested by David Miller, this commit refactors the tx_buffs[] and tx_skb[] arrays of 'struct mvpp2_txq_pcpu' into a separate structure 'mvpp2_txq_pcpu_buf', to which a 'size' field is added. Therefore, instead of having three arrays to allocate/free, we have a single one, which also improve data locality, reducing the impact on the CPU cache. Fixes: 71ce391dfb784 ("net: mvpp2: enable proper per-CPU TX buffers unmapping") Reported-by: Raphael G <raphael.glon@corp.ovh.com> Cc: Raphael G <raphael.glon@corp.ovh.com> Cc: stable@vger.kernel.org Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-21 18:28:49 +08:00
if (!txq_pcpu->buffs)
return -ENOMEM;
txq_pcpu->count = 0;
txq_pcpu->reserved_num = 0;
txq_pcpu->txq_put_index = 0;
txq_pcpu->txq_get_index = 0;
txq_pcpu->tso_headers = NULL;
txq_pcpu->stop_threshold = txq->size - MVPP2_MAX_SKB_DESCS;
txq_pcpu->wake_threshold = txq_pcpu->stop_threshold / 2;
txq_pcpu->tso_headers =
dma_alloc_coherent(port->dev->dev.parent,
txq_pcpu->size * TSO_HEADER_SIZE,
&txq_pcpu->tso_headers_dma,
GFP_KERNEL);
if (!txq_pcpu->tso_headers)
return -ENOMEM;
}
return 0;
}
/* Free allocated TXQ resources */
static void mvpp2_txq_deinit(struct mvpp2_port *port,
struct mvpp2_tx_queue *txq)
{
struct mvpp2_txq_pcpu *txq_pcpu;
unsigned int thread;
for (thread = 0; thread < port->priv->nthreads; thread++) {
txq_pcpu = per_cpu_ptr(txq->pcpu, thread);
net: mvpp2: fix dma unmapping of TX buffers for fragments Since commit 71ce391dfb784 ("net: mvpp2: enable proper per-CPU TX buffers unmapping"), we are not correctly DMA unmapping TX buffers for fragments. Indeed, the mvpp2_txq_inc_put() function only stores in the txq_cpu->tx_buffs[] array the physical address of the buffer to be DMA-unmapped when skb != NULL. In addition, when DMA-unmapping, we use skb_headlen(skb) to get the size to be unmapped. Both of this works fine for TX descriptors that are associated directly to a SKB, but not the ones that are used for fragments, with a NULL pointer as skb: - We have a NULL physical address when calling DMA unmap - skb_headlen(skb) crashes because skb is NULL This causes random crashes when fragments are used. To solve this problem, we need to: - Store the physical address of the buffer to be unmapped unconditionally, regardless of whether it is tied to a SKB or not. - Store the length of the buffer to be unmapped, which requires a new field. Instead of adding a third array to store the length of the buffer to be unmapped, and as suggested by David Miller, this commit refactors the tx_buffs[] and tx_skb[] arrays of 'struct mvpp2_txq_pcpu' into a separate structure 'mvpp2_txq_pcpu_buf', to which a 'size' field is added. Therefore, instead of having three arrays to allocate/free, we have a single one, which also improve data locality, reducing the impact on the CPU cache. Fixes: 71ce391dfb784 ("net: mvpp2: enable proper per-CPU TX buffers unmapping") Reported-by: Raphael G <raphael.glon@corp.ovh.com> Cc: Raphael G <raphael.glon@corp.ovh.com> Cc: stable@vger.kernel.org Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-21 18:28:49 +08:00
kfree(txq_pcpu->buffs);
if (txq_pcpu->tso_headers)
dma_free_coherent(port->dev->dev.parent,
txq_pcpu->size * TSO_HEADER_SIZE,
txq_pcpu->tso_headers,
txq_pcpu->tso_headers_dma);
txq_pcpu->tso_headers = NULL;
}
if (txq->descs)
dma_free_coherent(port->dev->dev.parent,
txq->size * MVPP2_DESC_ALIGNED_SIZE,
txq->descs, txq->descs_dma);
txq->descs = NULL;
txq->last_desc = 0;
txq->next_desc_to_proc = 0;
txq->descs_dma = 0;
/* Set minimum bandwidth for disabled TXQs */
mvpp2_write(port->priv, MVPP2_TXQ_SCHED_TOKEN_CNTR_REG(txq->log_id), 0);
/* Set Tx descriptors queue starting address and size */
thread = mvpp2_cpu_to_thread(port->priv, get_cpu());
mvpp2_thread_write(port->priv, thread, MVPP2_TXQ_NUM_REG, txq->id);
mvpp2_thread_write(port->priv, thread, MVPP2_TXQ_DESC_ADDR_REG, 0);
mvpp2_thread_write(port->priv, thread, MVPP2_TXQ_DESC_SIZE_REG, 0);
put_cpu();
}
/* Cleanup Tx ports */
static void mvpp2_txq_clean(struct mvpp2_port *port, struct mvpp2_tx_queue *txq)
{
struct mvpp2_txq_pcpu *txq_pcpu;
int delay, pending;
unsigned int thread = mvpp2_cpu_to_thread(port->priv, get_cpu());
u32 val;
mvpp2_thread_write(port->priv, thread, MVPP2_TXQ_NUM_REG, txq->id);
val = mvpp2_thread_read(port->priv, thread, MVPP2_TXQ_PREF_BUF_REG);
val |= MVPP2_TXQ_DRAIN_EN_MASK;
mvpp2_thread_write(port->priv, thread, MVPP2_TXQ_PREF_BUF_REG, val);
/* The napi queue has been stopped so wait for all packets
* to be transmitted.
*/
delay = 0;
do {
if (delay >= MVPP2_TX_PENDING_TIMEOUT_MSEC) {
netdev_warn(port->dev,
"port %d: cleaning queue %d timed out\n",
port->id, txq->log_id);
break;
}
mdelay(1);
delay++;
pending = mvpp2_thread_read(port->priv, thread,
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
MVPP2_TXQ_PENDING_REG);
pending &= MVPP2_TXQ_PENDING_MASK;
} while (pending);
val &= ~MVPP2_TXQ_DRAIN_EN_MASK;
mvpp2_thread_write(port->priv, thread, MVPP2_TXQ_PREF_BUF_REG, val);
put_cpu();
for (thread = 0; thread < port->priv->nthreads; thread++) {
txq_pcpu = per_cpu_ptr(txq->pcpu, thread);
/* Release all packets */
mvpp2_txq_bufs_free(port, txq, txq_pcpu, txq_pcpu->count);
/* Reset queue */
txq_pcpu->count = 0;
txq_pcpu->txq_put_index = 0;
txq_pcpu->txq_get_index = 0;
}
}
/* Cleanup all Tx queues */
static void mvpp2_cleanup_txqs(struct mvpp2_port *port)
{
struct mvpp2_tx_queue *txq;
int queue;
u32 val;
val = mvpp2_read(port->priv, MVPP2_TX_PORT_FLUSH_REG);
/* Reset Tx ports and delete Tx queues */
val |= MVPP2_TX_PORT_FLUSH_MASK(port->id);
mvpp2_write(port->priv, MVPP2_TX_PORT_FLUSH_REG, val);
for (queue = 0; queue < port->ntxqs; queue++) {
txq = port->txqs[queue];
mvpp2_txq_clean(port, txq);
mvpp2_txq_deinit(port, txq);
}
on_each_cpu(mvpp2_txq_sent_counter_clear, port, 1);
val &= ~MVPP2_TX_PORT_FLUSH_MASK(port->id);
mvpp2_write(port->priv, MVPP2_TX_PORT_FLUSH_REG, val);
}
/* Cleanup all Rx queues */
static void mvpp2_cleanup_rxqs(struct mvpp2_port *port)
{
int queue;
for (queue = 0; queue < port->nrxqs; queue++)
mvpp2_rxq_deinit(port, port->rxqs[queue]);
}
/* Init all Rx queues for port */
static int mvpp2_setup_rxqs(struct mvpp2_port *port)
{
int queue, err;
for (queue = 0; queue < port->nrxqs; queue++) {
err = mvpp2_rxq_init(port, port->rxqs[queue]);
if (err)
goto err_cleanup;
}
return 0;
err_cleanup:
mvpp2_cleanup_rxqs(port);
return err;
}
/* Init all tx queues for port */
static int mvpp2_setup_txqs(struct mvpp2_port *port)
{
struct mvpp2_tx_queue *txq;
int queue, err;
for (queue = 0; queue < port->ntxqs; queue++) {
txq = port->txqs[queue];
err = mvpp2_txq_init(port, txq);
if (err)
goto err_cleanup;
/* Assign this queue to a CPU */
if (queue < num_possible_cpus())
netif_set_xps_queue(port->dev, cpumask_of(queue), queue);
}
if (port->has_tx_irqs) {
mvpp2_tx_time_coal_set(port);
for (queue = 0; queue < port->ntxqs; queue++) {
txq = port->txqs[queue];
mvpp2_tx_pkts_coal_set(port, txq);
}
}
on_each_cpu(mvpp2_txq_sent_counter_clear, port, 1);
return 0;
err_cleanup:
mvpp2_cleanup_txqs(port);
return err;
}
/* The callback for per-port interrupt */
static irqreturn_t mvpp2_isr(int irq, void *dev_id)
{
struct mvpp2_queue_vector *qv = dev_id;
mvpp2_qvec_interrupt_disable(qv);
napi_schedule(&qv->napi);
return IRQ_HANDLED;
}
/* Per-port interrupt for link status changes */
static irqreturn_t mvpp2_link_status_isr(int irq, void *dev_id)
{
struct mvpp2_port *port = (struct mvpp2_port *)dev_id;
struct net_device *dev = port->dev;
bool event = false, link = false;
u32 val;
mvpp22_gop_mask_irq(port);
if (mvpp2_port_supports_xlg(port) &&
mvpp2_is_xlg(port->phy_interface)) {
val = readl(port->base + MVPP22_XLG_INT_STAT);
if (val & MVPP22_XLG_INT_STAT_LINK) {
event = true;
val = readl(port->base + MVPP22_XLG_STATUS);
if (val & MVPP22_XLG_STATUS_LINK_UP)
link = true;
}
} else if (phy_interface_mode_is_rgmii(port->phy_interface) ||
phy_interface_mode_is_8023z(port->phy_interface) ||
port->phy_interface == PHY_INTERFACE_MODE_SGMII) {
val = readl(port->base + MVPP22_GMAC_INT_STAT);
if (val & MVPP22_GMAC_INT_STAT_LINK) {
event = true;
val = readl(port->base + MVPP2_GMAC_STATUS0);
if (val & MVPP2_GMAC_STATUS0_LINK_UP)
link = true;
}
}
if (port->phylink) {
phylink_mac_change(port->phylink, link);
goto handled;
}
if (!netif_running(dev) || !event)
goto handled;
if (link) {
mvpp2_interrupts_enable(port);
mvpp2_egress_enable(port);
mvpp2_ingress_enable(port);
netif_carrier_on(dev);
netif_tx_wake_all_queues(dev);
} else {
netif_tx_stop_all_queues(dev);
netif_carrier_off(dev);
mvpp2_ingress_disable(port);
mvpp2_egress_disable(port);
mvpp2_interrupts_disable(port);
}
handled:
mvpp22_gop_unmask_irq(port);
return IRQ_HANDLED;
}
static enum hrtimer_restart mvpp2_hr_timer_cb(struct hrtimer *timer)
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
{
struct net_device *dev;
struct mvpp2_port *port;
struct mvpp2_port_pcpu *port_pcpu;
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
unsigned int tx_todo, cause;
port_pcpu = container_of(timer, struct mvpp2_port_pcpu, tx_done_timer);
dev = port_pcpu->dev;
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
if (!netif_running(dev))
return HRTIMER_NORESTART;
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
port_pcpu->timer_scheduled = false;
port = netdev_priv(dev);
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
/* Process all the Tx queues */
cause = (1 << port->ntxqs) - 1;
tx_todo = mvpp2_tx_done(port, cause,
mvpp2_cpu_to_thread(port->priv, smp_processor_id()));
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
/* Set the timer in case not all the packets were processed */
if (tx_todo && !port_pcpu->timer_scheduled) {
port_pcpu->timer_scheduled = true;
hrtimer_forward_now(&port_pcpu->tx_done_timer,
MVPP2_TXDONE_HRTIMER_PERIOD_NS);
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
return HRTIMER_RESTART;
}
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
return HRTIMER_NORESTART;
}
/* Main RX/TX processing routines */
/* Display more error info */
static void mvpp2_rx_error(struct mvpp2_port *port,
struct mvpp2_rx_desc *rx_desc)
{
u32 status = mvpp2_rxdesc_status_get(port, rx_desc);
size_t sz = mvpp2_rxdesc_size_get(port, rx_desc);
char *err_str = NULL;
switch (status & MVPP2_RXD_ERR_CODE_MASK) {
case MVPP2_RXD_ERR_CRC:
err_str = "crc";
break;
case MVPP2_RXD_ERR_OVERRUN:
err_str = "overrun";
break;
case MVPP2_RXD_ERR_RESOURCE:
err_str = "resource";
break;
}
if (err_str && net_ratelimit())
netdev_err(port->dev,
"bad rx status %08x (%s error), size=%zu\n",
status, err_str, sz);
}
/* Handle RX checksum offload */
static void mvpp2_rx_csum(struct mvpp2_port *port, u32 status,
struct sk_buff *skb)
{
if (((status & MVPP2_RXD_L3_IP4) &&
!(status & MVPP2_RXD_IP4_HEADER_ERR)) ||
(status & MVPP2_RXD_L3_IP6))
if (((status & MVPP2_RXD_L4_UDP) ||
(status & MVPP2_RXD_L4_TCP)) &&
(status & MVPP2_RXD_L4_CSUM_OK)) {
skb->csum = 0;
skb->ip_summed = CHECKSUM_UNNECESSARY;
return;
}
skb->ip_summed = CHECKSUM_NONE;
}
/* Allocate a new skb and add it to BM pool */
static int mvpp2_rx_refill(struct mvpp2_port *port,
struct mvpp2_bm_pool *bm_pool,
struct page_pool *page_pool, int pool)
{
dma_addr_t dma_addr;
net: mvpp2: store physical address of buffer in rx_desc->buf_cookie The RX descriptors of the PPv2 hardware allow to store several information, amongst which: - the DMA address of the buffer in which the data has been received - a "cookie" field, left to the use of the driver, and not used by the hardware In the current implementation, the "cookie" field is used to store the virtual address of the buffer, so that in the receive completion path, we can easily get the virtual address of the buffer that corresponds to a completed RX descriptors. On PPv2.1, used on 32-bit platforms, those two fields are 32-bit wide, which is enough to store a DMA address in the first field, and a virtual address in the second field. On PPv2.2, used on 64-bit platforms, these two fields have been extended to 40 bits. While 40 bits is enough to store a DMA address (as long as the DMA mask is 40 bits or lower), it is not enough to store a virtual address. Therefore, the "cookie" field can no longer be used to store the virtual address of the buffer. However, as Russell King pointed out, the RX buffers are always allocated in the kernel linear mapping, and therefore using phys_to_virt() on the physical address of the RX buffer is possible and correct. Therefore, this commit changes the driver to use the "cookie" field to store the physical address instead of the virtual address. phys_to_virt() is used in the receive completion path to retrieve the virtual address from the physical address. It is obviously important to realize that the DMA address and physical address are two different things, which is why we store both in the RX descriptors. While those addresses may be identical in some situations, it remains two distinct concepts, and both addresses should be handled separately. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:04 +08:00
phys_addr_t phys_addr;
void *buf;
buf = mvpp2_buf_alloc(port, bm_pool, page_pool,
&dma_addr, &phys_addr, GFP_ATOMIC);
if (!buf)
return -ENOMEM;
mvpp2_bm_pool_put(port, pool, dma_addr, phys_addr);
return 0;
}
/* Handle tx checksum */
static u32 mvpp2_skb_tx_csum(struct mvpp2_port *port, struct sk_buff *skb)
{
if (skb->ip_summed == CHECKSUM_PARTIAL) {
int ip_hdr_len = 0;
u8 l4_proto;
__be16 l3_proto = vlan_get_protocol(skb);
if (l3_proto == htons(ETH_P_IP)) {
struct iphdr *ip4h = ip_hdr(skb);
/* Calculate IPv4 checksum and L4 checksum */
ip_hdr_len = ip4h->ihl;
l4_proto = ip4h->protocol;
} else if (l3_proto == htons(ETH_P_IPV6)) {
struct ipv6hdr *ip6h = ipv6_hdr(skb);
/* Read l4_protocol from one of IPv6 extra headers */
if (skb_network_header_len(skb) > 0)
ip_hdr_len = (skb_network_header_len(skb) >> 2);
l4_proto = ip6h->nexthdr;
} else {
return MVPP2_TXD_L4_CSUM_NOT;
}
return mvpp2_txq_desc_csum(skb_network_offset(skb),
l3_proto, ip_hdr_len, l4_proto);
}
return MVPP2_TXD_L4_CSUM_NOT | MVPP2_TXD_IP_CSUM_DISABLE;
}
static void mvpp2_xdp_finish_tx(struct mvpp2_port *port, u16 txq_id, int nxmit, int nxmit_byte)
{
unsigned int thread = mvpp2_cpu_to_thread(port->priv, smp_processor_id());
struct mvpp2_tx_queue *aggr_txq;
struct mvpp2_txq_pcpu *txq_pcpu;
struct mvpp2_tx_queue *txq;
struct netdev_queue *nq;
txq = port->txqs[txq_id];
txq_pcpu = per_cpu_ptr(txq->pcpu, thread);
nq = netdev_get_tx_queue(port->dev, txq_id);
aggr_txq = &port->priv->aggr_txqs[thread];
txq_pcpu->reserved_num -= nxmit;
txq_pcpu->count += nxmit;
aggr_txq->count += nxmit;
/* Enable transmit */
wmb();
mvpp2_aggr_txq_pend_desc_add(port, nxmit);
if (txq_pcpu->count >= txq_pcpu->stop_threshold)
netif_tx_stop_queue(nq);
/* Finalize TX processing */
if (!port->has_tx_irqs && txq_pcpu->count >= txq->done_pkts_coal)
mvpp2_txq_done(port, txq, txq_pcpu);
}
static int
mvpp2_xdp_submit_frame(struct mvpp2_port *port, u16 txq_id,
struct xdp_frame *xdpf, bool dma_map)
{
unsigned int thread = mvpp2_cpu_to_thread(port->priv, smp_processor_id());
u32 tx_cmd = MVPP2_TXD_L4_CSUM_NOT | MVPP2_TXD_IP_CSUM_DISABLE |
MVPP2_TXD_F_DESC | MVPP2_TXD_L_DESC;
enum mvpp2_tx_buf_type buf_type;
struct mvpp2_txq_pcpu *txq_pcpu;
struct mvpp2_tx_queue *aggr_txq;
struct mvpp2_tx_desc *tx_desc;
struct mvpp2_tx_queue *txq;
int ret = MVPP2_XDP_TX;
dma_addr_t dma_addr;
txq = port->txqs[txq_id];
txq_pcpu = per_cpu_ptr(txq->pcpu, thread);
aggr_txq = &port->priv->aggr_txqs[thread];
/* Check number of available descriptors */
if (mvpp2_aggr_desc_num_check(port, aggr_txq, 1) ||
mvpp2_txq_reserved_desc_num_proc(port, txq, txq_pcpu, 1)) {
ret = MVPP2_XDP_DROPPED;
goto out;
}
/* Get a descriptor for the first part of the packet */
tx_desc = mvpp2_txq_next_desc_get(aggr_txq);
mvpp2_txdesc_txq_set(port, tx_desc, txq->id);
mvpp2_txdesc_size_set(port, tx_desc, xdpf->len);
if (dma_map) {
/* XDP_REDIRECT or AF_XDP */
dma_addr = dma_map_single(port->dev->dev.parent, xdpf->data,
xdpf->len, DMA_TO_DEVICE);
if (unlikely(dma_mapping_error(port->dev->dev.parent, dma_addr))) {
mvpp2_txq_desc_put(txq);
ret = MVPP2_XDP_DROPPED;
goto out;
}
buf_type = MVPP2_TYPE_XDP_NDO;
} else {
/* XDP_TX */
struct page *page = virt_to_page(xdpf->data);
dma_addr = page_pool_get_dma_addr(page) +
sizeof(*xdpf) + xdpf->headroom;
dma_sync_single_for_device(port->dev->dev.parent, dma_addr,
xdpf->len, DMA_BIDIRECTIONAL);
buf_type = MVPP2_TYPE_XDP_TX;
}
mvpp2_txdesc_dma_addr_set(port, tx_desc, dma_addr);
mvpp2_txdesc_cmd_set(port, tx_desc, tx_cmd);
mvpp2_txq_inc_put(port, txq_pcpu, xdpf, tx_desc, buf_type);
out:
return ret;
}
static int
mvpp2_xdp_xmit_back(struct mvpp2_port *port, struct xdp_buff *xdp)
{
struct mvpp2_pcpu_stats *stats = this_cpu_ptr(port->stats);
struct xdp_frame *xdpf;
u16 txq_id;
int ret;
xdpf = xdp_convert_buff_to_frame(xdp);
if (unlikely(!xdpf))
return MVPP2_XDP_DROPPED;
/* The first of the TX queues are used for XPS,
* the second half for XDP_TX
*/
txq_id = mvpp2_cpu_to_thread(port->priv, smp_processor_id()) + (port->ntxqs / 2);
ret = mvpp2_xdp_submit_frame(port, txq_id, xdpf, false);
if (ret == MVPP2_XDP_TX) {
u64_stats_update_begin(&stats->syncp);
stats->tx_bytes += xdpf->len;
stats->tx_packets++;
stats->xdp_tx++;
u64_stats_update_end(&stats->syncp);
mvpp2_xdp_finish_tx(port, txq_id, 1, xdpf->len);
} else {
u64_stats_update_begin(&stats->syncp);
stats->xdp_tx_err++;
u64_stats_update_end(&stats->syncp);
}
return ret;
}
static int
mvpp2_xdp_xmit(struct net_device *dev, int num_frame,
struct xdp_frame **frames, u32 flags)
{
struct mvpp2_port *port = netdev_priv(dev);
int i, nxmit_byte = 0, nxmit = num_frame;
struct mvpp2_pcpu_stats *stats;
u16 txq_id;
u32 ret;
if (unlikely(test_bit(0, &port->state)))
return -ENETDOWN;
if (unlikely(flags & ~XDP_XMIT_FLAGS_MASK))
return -EINVAL;
/* The first of the TX queues are used for XPS,
* the second half for XDP_TX
*/
txq_id = mvpp2_cpu_to_thread(port->priv, smp_processor_id()) + (port->ntxqs / 2);
for (i = 0; i < num_frame; i++) {
ret = mvpp2_xdp_submit_frame(port, txq_id, frames[i], true);
if (ret == MVPP2_XDP_TX) {
nxmit_byte += frames[i]->len;
} else {
xdp_return_frame_rx_napi(frames[i]);
nxmit--;
}
}
if (likely(nxmit > 0))
mvpp2_xdp_finish_tx(port, txq_id, nxmit, nxmit_byte);
stats = this_cpu_ptr(port->stats);
u64_stats_update_begin(&stats->syncp);
stats->tx_bytes += nxmit_byte;
stats->tx_packets += nxmit;
stats->xdp_xmit += nxmit;
stats->xdp_xmit_err += num_frame - nxmit;
u64_stats_update_end(&stats->syncp);
return nxmit;
}
static int
mvpp2_run_xdp(struct mvpp2_port *port, struct mvpp2_rx_queue *rxq,
struct bpf_prog *prog, struct xdp_buff *xdp,
struct page_pool *pp, struct mvpp2_pcpu_stats *stats)
{
unsigned int len, sync, err;
struct page *page;
u32 ret, act;
len = xdp->data_end - xdp->data_hard_start - MVPP2_SKB_HEADROOM;
act = bpf_prog_run_xdp(prog, xdp);
/* Due xdp_adjust_tail: DMA sync for_device cover max len CPU touch */
sync = xdp->data_end - xdp->data_hard_start - MVPP2_SKB_HEADROOM;
sync = max(sync, len);
switch (act) {
case XDP_PASS:
stats->xdp_pass++;
ret = MVPP2_XDP_PASS;
break;
case XDP_REDIRECT:
err = xdp_do_redirect(port->dev, xdp, prog);
if (unlikely(err)) {
ret = MVPP2_XDP_DROPPED;
page = virt_to_head_page(xdp->data);
page_pool_put_page(pp, page, sync, true);
} else {
ret = MVPP2_XDP_REDIR;
stats->xdp_redirect++;
}
break;
case XDP_TX:
ret = mvpp2_xdp_xmit_back(port, xdp);
if (ret != MVPP2_XDP_TX) {
page = virt_to_head_page(xdp->data);
page_pool_put_page(pp, page, sync, true);
}
break;
default:
bpf_warn_invalid_xdp_action(act);
fallthrough;
case XDP_ABORTED:
trace_xdp_exception(port->dev, prog, act);
fallthrough;
case XDP_DROP:
page = virt_to_head_page(xdp->data);
page_pool_put_page(pp, page, sync, true);
ret = MVPP2_XDP_DROPPED;
stats->xdp_drop++;
break;
}
return ret;
}
/* Main rx processing */
static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
int rx_todo, struct mvpp2_rx_queue *rxq)
{
struct net_device *dev = port->dev;
struct mvpp2_pcpu_stats ps = {};
enum dma_data_direction dma_dir;
struct bpf_prog *xdp_prog;
struct xdp_buff xdp;
int rx_received;
int rx_done = 0;
u32 xdp_ret = 0;
rcu_read_lock();
xdp_prog = READ_ONCE(port->xdp_prog);
/* Get number of received packets and clamp the to-do */
rx_received = mvpp2_rxq_received(port, rxq->id);
if (rx_todo > rx_received)
rx_todo = rx_received;
while (rx_done < rx_todo) {
struct mvpp2_rx_desc *rx_desc = mvpp2_rxq_next_desc_get(rxq);
struct mvpp2_bm_pool *bm_pool;
struct page_pool *pp = NULL;
struct sk_buff *skb;
unsigned int frag_size;
dma_addr_t dma_addr;
phys_addr_t phys_addr;
u32 rx_status;
int pool, rx_bytes, err, ret;
void *data;
rx_done++;
rx_status = mvpp2_rxdesc_status_get(port, rx_desc);
rx_bytes = mvpp2_rxdesc_size_get(port, rx_desc);
rx_bytes -= MVPP2_MH_SIZE;
dma_addr = mvpp2_rxdesc_dma_addr_get(port, rx_desc);
phys_addr = mvpp2_rxdesc_cookie_get(port, rx_desc);
data = (void *)phys_to_virt(phys_addr);
pool = (rx_status & MVPP2_RXD_BM_POOL_ID_MASK) >>
MVPP2_RXD_BM_POOL_ID_OFFS;
bm_pool = &port->priv->bm_pools[pool];
/* In case of an error, release the requested buffer pointer
* to the Buffer Manager. This request process is controlled
* by the hardware, and the information about the buffer is
* comprised by the RX descriptor.
*/
if (rx_status & MVPP2_RXD_ERR_SUMMARY)
goto err_drop_frame;
if (port->priv->percpu_pools) {
pp = port->priv->page_pool[pool];
dma_dir = page_pool_get_dma_dir(pp);
} else {
dma_dir = DMA_FROM_DEVICE;
}
dma_sync_single_for_cpu(dev->dev.parent, dma_addr,
rx_bytes + MVPP2_MH_SIZE,
dma_dir);
/* Prefetch header */
prefetch(data);
if (bm_pool->frag_size > PAGE_SIZE)
frag_size = 0;
else
frag_size = bm_pool->frag_size;
if (xdp_prog) {
xdp.data_hard_start = data;
xdp.data = data + MVPP2_MH_SIZE + MVPP2_SKB_HEADROOM;
xdp.data_end = xdp.data + rx_bytes;
xdp.frame_sz = PAGE_SIZE;
if (bm_pool->pkt_size == MVPP2_BM_SHORT_PKT_SIZE)
xdp.rxq = &rxq->xdp_rxq_short;
else
xdp.rxq = &rxq->xdp_rxq_long;
xdp_set_data_meta_invalid(&xdp);
ret = mvpp2_run_xdp(port, rxq, xdp_prog, &xdp, pp, &ps);
if (ret) {
xdp_ret |= ret;
err = mvpp2_rx_refill(port, bm_pool, pp, pool);
if (err) {
netdev_err(port->dev, "failed to refill BM pools\n");
goto err_drop_frame;
}
ps.rx_packets++;
ps.rx_bytes += rx_bytes;
continue;
}
}
skb = build_skb(data, frag_size);
if (!skb) {
netdev_warn(port->dev, "skb build failed\n");
goto err_drop_frame;
}
err = mvpp2_rx_refill(port, bm_pool, pp, pool);
if (err) {
netdev_err(port->dev, "failed to refill BM pools\n");
dev_kfree_skb_any(skb);
goto err_drop_frame;
}
if (pp)
page_pool_release_page(pp, virt_to_page(data));
else
dma_unmap_single_attrs(dev->dev.parent, dma_addr,
bm_pool->buf_size, DMA_FROM_DEVICE,
DMA_ATTR_SKIP_CPU_SYNC);
ps.rx_packets++;
ps.rx_bytes += rx_bytes;
skb_reserve(skb, MVPP2_MH_SIZE + MVPP2_SKB_HEADROOM);
skb_put(skb, rx_bytes);
skb->protocol = eth_type_trans(skb, dev);
mvpp2_rx_csum(port, rx_status, skb);
napi_gro_receive(napi, skb);
continue;
err_drop_frame:
dev->stats.rx_errors++;
mvpp2_rx_error(port, rx_desc);
/* Return the buffer to the pool */
mvpp2_bm_pool_put(port, pool, dma_addr, phys_addr);
}
rcu_read_unlock();
if (xdp_ret & MVPP2_XDP_REDIR)
xdp_do_flush_map();
if (ps.rx_packets) {
struct mvpp2_pcpu_stats *stats = this_cpu_ptr(port->stats);
u64_stats_update_begin(&stats->syncp);
stats->rx_packets += ps.rx_packets;
stats->rx_bytes += ps.rx_bytes;
/* xdp */
stats->xdp_redirect += ps.xdp_redirect;
stats->xdp_pass += ps.xdp_pass;
stats->xdp_drop += ps.xdp_drop;
u64_stats_update_end(&stats->syncp);
}
/* Update Rx queue management counters */
wmb();
mvpp2_rxq_status_update(port, rxq->id, rx_done, rx_done);
return rx_todo;
}
static inline void
tx_desc_unmap_put(struct mvpp2_port *port, struct mvpp2_tx_queue *txq,
struct mvpp2_tx_desc *desc)
{
unsigned int thread = mvpp2_cpu_to_thread(port->priv, smp_processor_id());
struct mvpp2_txq_pcpu *txq_pcpu = per_cpu_ptr(txq->pcpu, thread);
dma_addr_t buf_dma_addr =
mvpp2_txdesc_dma_addr_get(port, desc);
size_t buf_sz =
mvpp2_txdesc_size_get(port, desc);
if (!IS_TSO_HEADER(txq_pcpu, buf_dma_addr))
dma_unmap_single(port->dev->dev.parent, buf_dma_addr,
buf_sz, DMA_TO_DEVICE);
mvpp2_txq_desc_put(txq);
}
/* Handle tx fragmentation processing */
static int mvpp2_tx_frag_process(struct mvpp2_port *port, struct sk_buff *skb,
struct mvpp2_tx_queue *aggr_txq,
struct mvpp2_tx_queue *txq)
{
unsigned int thread = mvpp2_cpu_to_thread(port->priv, smp_processor_id());
struct mvpp2_txq_pcpu *txq_pcpu = per_cpu_ptr(txq->pcpu, thread);
struct mvpp2_tx_desc *tx_desc;
int i;
dma_addr_t buf_dma_addr;
for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
void *addr = skb_frag_address(frag);
tx_desc = mvpp2_txq_next_desc_get(aggr_txq);
mvpp2_txdesc_txq_set(port, tx_desc, txq->id);
mvpp2_txdesc_size_set(port, tx_desc, skb_frag_size(frag));
buf_dma_addr = dma_map_single(port->dev->dev.parent, addr,
skb_frag_size(frag),
DMA_TO_DEVICE);
if (dma_mapping_error(port->dev->dev.parent, buf_dma_addr)) {
mvpp2_txq_desc_put(txq);
goto cleanup;
}
mvpp2_txdesc_dma_addr_set(port, tx_desc, buf_dma_addr);
if (i == (skb_shinfo(skb)->nr_frags - 1)) {
/* Last descriptor */
mvpp2_txdesc_cmd_set(port, tx_desc,
MVPP2_TXD_L_DESC);
mvpp2_txq_inc_put(port, txq_pcpu, skb, tx_desc, MVPP2_TYPE_SKB);
} else {
/* Descriptor in the middle: Not First, Not Last */
mvpp2_txdesc_cmd_set(port, tx_desc, 0);
mvpp2_txq_inc_put(port, txq_pcpu, NULL, tx_desc, MVPP2_TYPE_SKB);
}
}
return 0;
cleanup:
/* Release all descriptors that were used to map fragments of
* this packet, as well as the corresponding DMA mappings
*/
for (i = i - 1; i >= 0; i--) {
tx_desc = txq->descs + i;
tx_desc_unmap_put(port, txq, tx_desc);
}
return -ENOMEM;
}
static inline void mvpp2_tso_put_hdr(struct sk_buff *skb,
struct net_device *dev,
struct mvpp2_tx_queue *txq,
struct mvpp2_tx_queue *aggr_txq,
struct mvpp2_txq_pcpu *txq_pcpu,
int hdr_sz)
{
struct mvpp2_port *port = netdev_priv(dev);
struct mvpp2_tx_desc *tx_desc = mvpp2_txq_next_desc_get(aggr_txq);
dma_addr_t addr;
mvpp2_txdesc_txq_set(port, tx_desc, txq->id);
mvpp2_txdesc_size_set(port, tx_desc, hdr_sz);
addr = txq_pcpu->tso_headers_dma +
txq_pcpu->txq_put_index * TSO_HEADER_SIZE;
mvpp2_txdesc_dma_addr_set(port, tx_desc, addr);
mvpp2_txdesc_cmd_set(port, tx_desc, mvpp2_skb_tx_csum(port, skb) |
MVPP2_TXD_F_DESC |
MVPP2_TXD_PADDING_DISABLE);
mvpp2_txq_inc_put(port, txq_pcpu, NULL, tx_desc, MVPP2_TYPE_SKB);
}
static inline int mvpp2_tso_put_data(struct sk_buff *skb,
struct net_device *dev, struct tso_t *tso,
struct mvpp2_tx_queue *txq,
struct mvpp2_tx_queue *aggr_txq,
struct mvpp2_txq_pcpu *txq_pcpu,
int sz, bool left, bool last)
{
struct mvpp2_port *port = netdev_priv(dev);
struct mvpp2_tx_desc *tx_desc = mvpp2_txq_next_desc_get(aggr_txq);
dma_addr_t buf_dma_addr;
mvpp2_txdesc_txq_set(port, tx_desc, txq->id);
mvpp2_txdesc_size_set(port, tx_desc, sz);
buf_dma_addr = dma_map_single(dev->dev.parent, tso->data, sz,
DMA_TO_DEVICE);
if (unlikely(dma_mapping_error(dev->dev.parent, buf_dma_addr))) {
mvpp2_txq_desc_put(txq);
return -ENOMEM;
}
mvpp2_txdesc_dma_addr_set(port, tx_desc, buf_dma_addr);
if (!left) {
mvpp2_txdesc_cmd_set(port, tx_desc, MVPP2_TXD_L_DESC);
if (last) {
mvpp2_txq_inc_put(port, txq_pcpu, skb, tx_desc, MVPP2_TYPE_SKB);
return 0;
}
} else {
mvpp2_txdesc_cmd_set(port, tx_desc, 0);
}
mvpp2_txq_inc_put(port, txq_pcpu, NULL, tx_desc, MVPP2_TYPE_SKB);
return 0;
}
static int mvpp2_tx_tso(struct sk_buff *skb, struct net_device *dev,
struct mvpp2_tx_queue *txq,
struct mvpp2_tx_queue *aggr_txq,
struct mvpp2_txq_pcpu *txq_pcpu)
{
struct mvpp2_port *port = netdev_priv(dev);
int hdr_sz, i, len, descs = 0;
struct tso_t tso;
/* Check number of available descriptors */
if (mvpp2_aggr_desc_num_check(port, aggr_txq, tso_count_descs(skb)) ||
mvpp2_txq_reserved_desc_num_proc(port, txq, txq_pcpu,
tso_count_descs(skb)))
return 0;
hdr_sz = tso_start(skb, &tso);
len = skb->len - hdr_sz;
while (len > 0) {
int left = min_t(int, skb_shinfo(skb)->gso_size, len);
char *hdr = txq_pcpu->tso_headers +
txq_pcpu->txq_put_index * TSO_HEADER_SIZE;
len -= left;
descs++;
tso_build_hdr(skb, hdr, &tso, left, len == 0);
mvpp2_tso_put_hdr(skb, dev, txq, aggr_txq, txq_pcpu, hdr_sz);
while (left > 0) {
int sz = min_t(int, tso.size, left);
left -= sz;
descs++;
if (mvpp2_tso_put_data(skb, dev, &tso, txq, aggr_txq,
txq_pcpu, sz, left, len == 0))
goto release;
tso_build_data(skb, &tso, sz);
}
}
return descs;
release:
for (i = descs - 1; i >= 0; i--) {
struct mvpp2_tx_desc *tx_desc = txq->descs + i;
tx_desc_unmap_put(port, txq, tx_desc);
}
return 0;
}
/* Main tx processing */
static netdev_tx_t mvpp2_tx(struct sk_buff *skb, struct net_device *dev)
{
struct mvpp2_port *port = netdev_priv(dev);
struct mvpp2_tx_queue *txq, *aggr_txq;
struct mvpp2_txq_pcpu *txq_pcpu;
struct mvpp2_tx_desc *tx_desc;
dma_addr_t buf_dma_addr;
unsigned long flags = 0;
unsigned int thread;
int frags = 0;
u16 txq_id;
u32 tx_cmd;
thread = mvpp2_cpu_to_thread(port->priv, smp_processor_id());
txq_id = skb_get_queue_mapping(skb);
txq = port->txqs[txq_id];
txq_pcpu = per_cpu_ptr(txq->pcpu, thread);
aggr_txq = &port->priv->aggr_txqs[thread];
if (test_bit(thread, &port->priv->lock_map))
spin_lock_irqsave(&port->tx_lock[thread], flags);
if (skb_is_gso(skb)) {
frags = mvpp2_tx_tso(skb, dev, txq, aggr_txq, txq_pcpu);
goto out;
}
frags = skb_shinfo(skb)->nr_frags + 1;
/* Check number of available descriptors */
if (mvpp2_aggr_desc_num_check(port, aggr_txq, frags) ||
mvpp2_txq_reserved_desc_num_proc(port, txq, txq_pcpu, frags)) {
frags = 0;
goto out;
}
/* Get a descriptor for the first part of the packet */
tx_desc = mvpp2_txq_next_desc_get(aggr_txq);
mvpp2_txdesc_txq_set(port, tx_desc, txq->id);
mvpp2_txdesc_size_set(port, tx_desc, skb_headlen(skb));
buf_dma_addr = dma_map_single(dev->dev.parent, skb->data,
skb_headlen(skb), DMA_TO_DEVICE);
if (unlikely(dma_mapping_error(dev->dev.parent, buf_dma_addr))) {
mvpp2_txq_desc_put(txq);
frags = 0;
goto out;
}
mvpp2_txdesc_dma_addr_set(port, tx_desc, buf_dma_addr);
tx_cmd = mvpp2_skb_tx_csum(port, skb);
if (frags == 1) {
/* First and Last descriptor */
tx_cmd |= MVPP2_TXD_F_DESC | MVPP2_TXD_L_DESC;
mvpp2_txdesc_cmd_set(port, tx_desc, tx_cmd);
mvpp2_txq_inc_put(port, txq_pcpu, skb, tx_desc, MVPP2_TYPE_SKB);
} else {
/* First but not Last */
tx_cmd |= MVPP2_TXD_F_DESC | MVPP2_TXD_PADDING_DISABLE;
mvpp2_txdesc_cmd_set(port, tx_desc, tx_cmd);
mvpp2_txq_inc_put(port, txq_pcpu, NULL, tx_desc, MVPP2_TYPE_SKB);
/* Continue with other skb fragments */
if (mvpp2_tx_frag_process(port, skb, aggr_txq, txq)) {
tx_desc_unmap_put(port, txq, tx_desc);
frags = 0;
}
}
out:
if (frags > 0) {
struct mvpp2_pcpu_stats *stats = per_cpu_ptr(port->stats, thread);
struct netdev_queue *nq = netdev_get_tx_queue(dev, txq_id);
txq_pcpu->reserved_num -= frags;
txq_pcpu->count += frags;
aggr_txq->count += frags;
/* Enable transmit */
wmb();
mvpp2_aggr_txq_pend_desc_add(port, frags);
if (txq_pcpu->count >= txq_pcpu->stop_threshold)
netif_tx_stop_queue(nq);
u64_stats_update_begin(&stats->syncp);
stats->tx_packets++;
stats->tx_bytes += skb->len;
u64_stats_update_end(&stats->syncp);
} else {
dev->stats.tx_dropped++;
dev_kfree_skb_any(skb);
}
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
/* Finalize TX processing */
if (!port->has_tx_irqs && txq_pcpu->count >= txq->done_pkts_coal)
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
mvpp2_txq_done(port, txq, txq_pcpu);
/* Set the timer in case not all frags were processed */
if (!port->has_tx_irqs && txq_pcpu->count <= frags &&
txq_pcpu->count > 0) {
struct mvpp2_port_pcpu *port_pcpu = per_cpu_ptr(port->pcpu, thread);
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
if (!port_pcpu->timer_scheduled) {
port_pcpu->timer_scheduled = true;
hrtimer_start(&port_pcpu->tx_done_timer,
MVPP2_TXDONE_HRTIMER_PERIOD_NS,
HRTIMER_MODE_REL_PINNED_SOFT);
}
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
}
if (test_bit(thread, &port->priv->lock_map))
spin_unlock_irqrestore(&port->tx_lock[thread], flags);
return NETDEV_TX_OK;
}
static inline void mvpp2_cause_error(struct net_device *dev, int cause)
{
if (cause & MVPP2_CAUSE_FCS_ERR_MASK)
netdev_err(dev, "FCS error\n");
if (cause & MVPP2_CAUSE_RX_FIFO_OVERRUN_MASK)
netdev_err(dev, "rx fifo overrun error\n");
if (cause & MVPP2_CAUSE_TX_FIFO_UNDERRUN_MASK)
netdev_err(dev, "tx fifo underrun error\n");
}
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
static int mvpp2_poll(struct napi_struct *napi, int budget)
{
u32 cause_rx_tx, cause_rx, cause_tx, cause_misc;
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
int rx_done = 0;
struct mvpp2_port *port = netdev_priv(napi->dev);
struct mvpp2_queue_vector *qv;
unsigned int thread = mvpp2_cpu_to_thread(port->priv, smp_processor_id());
qv = container_of(napi, struct mvpp2_queue_vector, napi);
/* Rx/Tx cause register
*
* Bits 0-15: each bit indicates received packets on the Rx queue
* (bit 0 is for Rx queue 0).
*
* Bits 16-23: each bit indicates transmitted packets on the Tx queue
* (bit 16 is for Tx queue 0).
*
* Each CPU has its own Rx/Tx cause register
*/
cause_rx_tx = mvpp2_thread_read_relaxed(port->priv, qv->sw_thread_id,
MVPP2_ISR_RX_TX_CAUSE_REG(port->id));
cause_misc = cause_rx_tx & MVPP2_CAUSE_MISC_SUM_MASK;
if (cause_misc) {
mvpp2_cause_error(port->dev, cause_misc);
/* Clear the cause register */
mvpp2_write(port->priv, MVPP2_ISR_MISC_CAUSE_REG, 0);
mvpp2_thread_write(port->priv, thread,
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
MVPP2_ISR_RX_TX_CAUSE_REG(port->id),
cause_rx_tx & ~MVPP2_CAUSE_MISC_SUM_MASK);
}
if (port->has_tx_irqs) {
cause_tx = cause_rx_tx & MVPP2_CAUSE_TXQ_OCCUP_DESC_ALL_MASK;
if (cause_tx) {
cause_tx >>= MVPP2_CAUSE_TXQ_OCCUP_DESC_ALL_OFFSET;
mvpp2_tx_done(port, cause_tx, qv->sw_thread_id);
}
}
/* Process RX packets */
cause_rx = cause_rx_tx &
MVPP2_CAUSE_RXQ_OCCUP_DESC_ALL_MASK(port->priv->hw_version);
cause_rx <<= qv->first_rxq;
cause_rx |= qv->pending_cause_rx;
while (cause_rx && budget > 0) {
int count;
struct mvpp2_rx_queue *rxq;
rxq = mvpp2_get_rx_queue(port, cause_rx);
if (!rxq)
break;
count = mvpp2_rx(port, napi, budget, rxq);
rx_done += count;
budget -= count;
if (budget > 0) {
/* Clear the bit associated to this Rx queue
* so that next iteration will continue from
* the next Rx queue.
*/
cause_rx &= ~(1 << rxq->logic_rxq);
}
}
if (budget > 0) {
cause_rx = 0;
napi_complete_done(napi, rx_done);
mvpp2_qvec_interrupt_enable(qv);
}
qv->pending_cause_rx = cause_rx;
return rx_done;
}
static void mvpp22_mode_reconfigure(struct mvpp2_port *port)
{
u32 ctrl3;
/* Set the GMAC & XLG MAC in reset */
mvpp2_mac_reset_assert(port);
/* Set the MPCS and XPCS in reset */
mvpp22_pcs_reset_assert(port);
/* comphy reconfiguration */
mvpp22_comphy_init(port);
/* gop reconfiguration */
mvpp22_gop_init(port);
mvpp22_pcs_reset_deassert(port);
if (mvpp2_port_supports_xlg(port)) {
ctrl3 = readl(port->base + MVPP22_XLG_CTRL3_REG);
ctrl3 &= ~MVPP22_XLG_CTRL3_MACMODESELECT_MASK;
if (mvpp2_is_xlg(port->phy_interface))
ctrl3 |= MVPP22_XLG_CTRL3_MACMODESELECT_10G;
else
ctrl3 |= MVPP22_XLG_CTRL3_MACMODESELECT_GMAC;
writel(ctrl3, port->base + MVPP22_XLG_CTRL3_REG);
}
if (mvpp2_port_supports_xlg(port) && mvpp2_is_xlg(port->phy_interface))
mvpp2_xlg_max_rx_size_set(port);
else
mvpp2_gmac_max_rx_size_set(port);
}
/* Set hw internals when starting port */
static void mvpp2_start_dev(struct mvpp2_port *port)
{
int i;
mvpp2_txp_max_tx_size_set(port);
for (i = 0; i < port->nqvecs; i++)
napi_enable(&port->qvecs[i].napi);
/* Enable interrupts on all threads */
mvpp2_interrupts_enable(port);
if (port->priv->hw_version == MVPP22)
mvpp22_mode_reconfigure(port);
if (port->phylink) {
phylink_start(port->phylink);
} else {
/* Phylink isn't used as of now for ACPI, so the MAC has to be
* configured manually when the interface is started. This will
* be removed as soon as the phylink ACPI support lands in.
*/
struct phylink_link_state state = {
.interface = port->phy_interface,
};
net: phylink: Add struct phylink_config to PHYLINK API The phylink_config structure will encapsulate a pointer to a struct device and the operation type requested for this instance of PHYLINK. This patch does not make any functional changes, it just transitions the PHYLINK internals and all its users to the new API. A pointer to a phylink_config structure will be passed to phylink_create() instead of the net_device directly. Also, the same phylink_config pointer will be passed back to all phylink_mac_ops callbacks instead of the net_device. Using this mechanism, a PHYLINK user can get the original net_device using a structure such as 'to_net_dev(config->dev)' or directly the structure containing the phylink_config using a container_of call. At the moment, only the PHYLINK_NETDEV is defined as a valid operation type for PHYLINK. In this mode, a valid reference to a struct device linked to the original net_device should be passed to PHYLINK through the phylink_config structure. This API changes is mainly driven by the necessity of adding a new operation type in PHYLINK that disconnects the phy_device from the net_device and also works when the net_device is lacking. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29 01:38:12 +08:00
mvpp2_mac_config(&port->phylink_config, MLO_AN_INBAND, &state);
mvpp2_mac_link_up(&port->phylink_config, NULL,
MLO_AN_INBAND, port->phy_interface,
SPEED_UNKNOWN, DUPLEX_UNKNOWN, false, false);
}
netif_tx_start_all_queues(port->dev);
clear_bit(0, &port->state);
}
/* Set hw internals when stopping port */
static void mvpp2_stop_dev(struct mvpp2_port *port)
{
int i;
set_bit(0, &port->state);
/* Disable interrupts on all threads */
mvpp2_interrupts_disable(port);
for (i = 0; i < port->nqvecs; i++)
napi_disable(&port->qvecs[i].napi);
if (port->phylink)
phylink_stop(port->phylink);
phy_power_off(port->comphy);
}
static int mvpp2_check_ringparam_valid(struct net_device *dev,
struct ethtool_ringparam *ring)
{
u16 new_rx_pending = ring->rx_pending;
u16 new_tx_pending = ring->tx_pending;
if (ring->rx_pending == 0 || ring->tx_pending == 0)
return -EINVAL;
if (ring->rx_pending > MVPP2_MAX_RXD_MAX)
new_rx_pending = MVPP2_MAX_RXD_MAX;
else if (!IS_ALIGNED(ring->rx_pending, 16))
new_rx_pending = ALIGN(ring->rx_pending, 16);
if (ring->tx_pending > MVPP2_MAX_TXD_MAX)
new_tx_pending = MVPP2_MAX_TXD_MAX;
else if (!IS_ALIGNED(ring->tx_pending, 32))
new_tx_pending = ALIGN(ring->tx_pending, 32);
/* The Tx ring size cannot be smaller than the minimum number of
* descriptors needed for TSO.
*/
if (new_tx_pending < MVPP2_MAX_SKB_DESCS)
new_tx_pending = ALIGN(MVPP2_MAX_SKB_DESCS, 32);
if (ring->rx_pending != new_rx_pending) {
netdev_info(dev, "illegal Rx ring size value %d, round to %d\n",
ring->rx_pending, new_rx_pending);
ring->rx_pending = new_rx_pending;
}
if (ring->tx_pending != new_tx_pending) {
netdev_info(dev, "illegal Tx ring size value %d, round to %d\n",
ring->tx_pending, new_tx_pending);
ring->tx_pending = new_tx_pending;
}
return 0;
}
static void mvpp21_get_mac_address(struct mvpp2_port *port, unsigned char *addr)
{
u32 mac_addr_l, mac_addr_m, mac_addr_h;
mac_addr_l = readl(port->base + MVPP2_GMAC_CTRL_1_REG);
mac_addr_m = readl(port->priv->lms_base + MVPP2_SRC_ADDR_MIDDLE);
mac_addr_h = readl(port->priv->lms_base + MVPP2_SRC_ADDR_HIGH);
addr[0] = (mac_addr_h >> 24) & 0xFF;
addr[1] = (mac_addr_h >> 16) & 0xFF;
addr[2] = (mac_addr_h >> 8) & 0xFF;
addr[3] = mac_addr_h & 0xFF;
addr[4] = mac_addr_m & 0xFF;
addr[5] = (mac_addr_l >> MVPP2_GMAC_SA_LOW_OFFS) & 0xFF;
}
static int mvpp2_irqs_init(struct mvpp2_port *port)
{
int err, i;
for (i = 0; i < port->nqvecs; i++) {
struct mvpp2_queue_vector *qv = port->qvecs + i;
if (qv->type == MVPP2_QUEUE_VECTOR_PRIVATE) {
qv->mask = kzalloc(cpumask_size(), GFP_KERNEL);
if (!qv->mask) {
err = -ENOMEM;
goto err;
}
irq_set_status_flags(qv->irq, IRQ_NO_BALANCING);
}
err = request_irq(qv->irq, mvpp2_isr, 0, port->dev->name, qv);
if (err)
goto err;
if (qv->type == MVPP2_QUEUE_VECTOR_PRIVATE) {
unsigned int cpu;
for_each_present_cpu(cpu) {
if (mvpp2_cpu_to_thread(port->priv, cpu) ==
qv->sw_thread_id)
cpumask_set_cpu(cpu, qv->mask);
}
irq_set_affinity_hint(qv->irq, qv->mask);
}
}
return 0;
err:
for (i = 0; i < port->nqvecs; i++) {
struct mvpp2_queue_vector *qv = port->qvecs + i;
irq_set_affinity_hint(qv->irq, NULL);
kfree(qv->mask);
qv->mask = NULL;
free_irq(qv->irq, qv);
}
return err;
}
static void mvpp2_irqs_deinit(struct mvpp2_port *port)
{
int i;
for (i = 0; i < port->nqvecs; i++) {
struct mvpp2_queue_vector *qv = port->qvecs + i;
irq_set_affinity_hint(qv->irq, NULL);
kfree(qv->mask);
qv->mask = NULL;
irq_clear_status_flags(qv->irq, IRQ_NO_BALANCING);
free_irq(qv->irq, qv);
}
}
static bool mvpp22_rss_is_supported(void)
{
return queue_mode == MVPP2_QDIST_MULTI_MODE;
}
static int mvpp2_open(struct net_device *dev)
{
struct mvpp2_port *port = netdev_priv(dev);
struct mvpp2 *priv = port->priv;
unsigned char mac_bcast[ETH_ALEN] = {
0xff, 0xff, 0xff, 0xff, 0xff, 0xff };
bool valid = false;
int err;
err = mvpp2_prs_mac_da_accept(port, mac_bcast, true);
if (err) {
netdev_err(dev, "mvpp2_prs_mac_da_accept BC failed\n");
return err;
}
err = mvpp2_prs_mac_da_accept(port, dev->dev_addr, true);
if (err) {
netdev_err(dev, "mvpp2_prs_mac_da_accept own addr failed\n");
return err;
}
err = mvpp2_prs_tag_mode_set(port->priv, port->id, MVPP2_TAG_TYPE_MH);
if (err) {
netdev_err(dev, "mvpp2_prs_tag_mode_set failed\n");
return err;
}
err = mvpp2_prs_def_flow(port);
if (err) {
netdev_err(dev, "mvpp2_prs_def_flow failed\n");
return err;
}
/* Allocate the Rx/Tx queues */
err = mvpp2_setup_rxqs(port);
if (err) {
netdev_err(port->dev, "cannot allocate Rx queues\n");
return err;
}
err = mvpp2_setup_txqs(port);
if (err) {
netdev_err(port->dev, "cannot allocate Tx queues\n");
goto err_cleanup_rxqs;
}
err = mvpp2_irqs_init(port);
if (err) {
netdev_err(port->dev, "cannot init IRQs\n");
goto err_cleanup_txqs;
}
/* Phylink isn't supported yet in ACPI mode */
if (port->of_node) {
err = phylink_of_phy_connect(port->phylink, port->of_node, 0);
if (err) {
netdev_err(port->dev, "could not attach PHY (%d)\n",
err);
goto err_free_irq;
}
valid = true;
}
if (priv->hw_version == MVPP22 && port->link_irq) {
err = request_irq(port->link_irq, mvpp2_link_status_isr, 0,
dev->name, port);
if (err) {
netdev_err(port->dev, "cannot request link IRQ %d\n",
port->link_irq);
goto err_free_irq;
}
mvpp22_gop_setup_irq(port);
/* In default link is down */
netif_carrier_off(port->dev);
valid = true;
} else {
port->link_irq = 0;
}
if (!valid) {
netdev_err(port->dev,
"invalid configuration: no dt or link IRQ");
goto err_free_irq;
}
/* Unmask interrupts on all CPUs */
on_each_cpu(mvpp2_interrupts_unmask, port, 1);
mvpp2_shared_interrupt_mask_unmask(port, false);
mvpp2_start_dev(port);
/* Start hardware statistics gathering */
queue_delayed_work(priv->stats_queue, &port->stats_work,
MVPP2_MIB_COUNTERS_STATS_DELAY);
return 0;
err_free_irq:
mvpp2_irqs_deinit(port);
err_cleanup_txqs:
mvpp2_cleanup_txqs(port);
err_cleanup_rxqs:
mvpp2_cleanup_rxqs(port);
return err;
}
static int mvpp2_stop(struct net_device *dev)
{
struct mvpp2_port *port = netdev_priv(dev);
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
struct mvpp2_port_pcpu *port_pcpu;
unsigned int thread;
mvpp2_stop_dev(port);
/* Mask interrupts on all threads */
on_each_cpu(mvpp2_interrupts_mask, port, 1);
mvpp2_shared_interrupt_mask_unmask(port, true);
if (port->phylink)
phylink_disconnect_phy(port->phylink);
if (port->link_irq)
free_irq(port->link_irq, port);
mvpp2_irqs_deinit(port);
if (!port->has_tx_irqs) {
for (thread = 0; thread < port->priv->nthreads; thread++) {
port_pcpu = per_cpu_ptr(port->pcpu, thread);
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
hrtimer_cancel(&port_pcpu->tx_done_timer);
port_pcpu->timer_scheduled = false;
}
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
}
mvpp2_cleanup_rxqs(port);
mvpp2_cleanup_txqs(port);
cancel_delayed_work_sync(&port->stats_work);
mvpp2_mac_reset_assert(port);
mvpp22_pcs_reset_assert(port);
return 0;
}
static int mvpp2_prs_mac_da_accept_list(struct mvpp2_port *port,
struct netdev_hw_addr_list *list)
{
struct netdev_hw_addr *ha;
int ret;
netdev_hw_addr_list_for_each(ha, list) {
ret = mvpp2_prs_mac_da_accept(port, ha->addr, true);
if (ret)
return ret;
}
return 0;
}
static void mvpp2_set_rx_promisc(struct mvpp2_port *port, bool enable)
{
if (!enable && (port->dev->features & NETIF_F_HW_VLAN_CTAG_FILTER))
mvpp2_prs_vid_enable_filtering(port);
else
mvpp2_prs_vid_disable_filtering(port);
mvpp2_prs_mac_promisc_set(port->priv, port->id,
MVPP2_PRS_L2_UNI_CAST, enable);
mvpp2_prs_mac_promisc_set(port->priv, port->id,
MVPP2_PRS_L2_MULTI_CAST, enable);
}
static void mvpp2_set_rx_mode(struct net_device *dev)
{
struct mvpp2_port *port = netdev_priv(dev);
/* Clear the whole UC and MC list */
mvpp2_prs_mac_del_all(port);
if (dev->flags & IFF_PROMISC) {
mvpp2_set_rx_promisc(port, true);
return;
}
mvpp2_set_rx_promisc(port, false);
if (netdev_uc_count(dev) > MVPP2_PRS_MAC_UC_FILT_MAX ||
mvpp2_prs_mac_da_accept_list(port, &dev->uc))
mvpp2_prs_mac_promisc_set(port->priv, port->id,
MVPP2_PRS_L2_UNI_CAST, true);
if (dev->flags & IFF_ALLMULTI) {
mvpp2_prs_mac_promisc_set(port->priv, port->id,
MVPP2_PRS_L2_MULTI_CAST, true);
return;
}
if (netdev_mc_count(dev) > MVPP2_PRS_MAC_MC_FILT_MAX ||
mvpp2_prs_mac_da_accept_list(port, &dev->mc))
mvpp2_prs_mac_promisc_set(port->priv, port->id,
MVPP2_PRS_L2_MULTI_CAST, true);
}
static int mvpp2_set_mac_address(struct net_device *dev, void *p)
{
const struct sockaddr *addr = p;
int err;
if (!is_valid_ether_addr(addr->sa_data))
return -EADDRNOTAVAIL;
err = mvpp2_prs_update_mac_da(dev, addr->sa_data);
if (err) {
/* Reconfigure parser accept the original MAC address */
mvpp2_prs_update_mac_da(dev, dev->dev_addr);
netdev_err(dev, "failed to change MAC address\n");
}
return err;
}
/* Shut down all the ports, reconfigure the pools as percpu or shared,
* then bring up again all ports.
*/
static int mvpp2_bm_switch_buffers(struct mvpp2 *priv, bool percpu)
{
int numbufs = MVPP2_BM_POOLS_NUM, i;
struct mvpp2_port *port = NULL;
bool status[MVPP2_MAX_PORTS];
for (i = 0; i < priv->port_count; i++) {
port = priv->port_list[i];
status[i] = netif_running(port->dev);
if (status[i])
mvpp2_stop(port->dev);
}
/* nrxqs is the same for all ports */
if (priv->percpu_pools)
numbufs = port->nrxqs * 2;
for (i = 0; i < numbufs; i++)
mvpp2_bm_pool_destroy(port->dev->dev.parent, priv, &priv->bm_pools[i]);
devm_kfree(port->dev->dev.parent, priv->bm_pools);
priv->percpu_pools = percpu;
mvpp2_bm_init(port->dev->dev.parent, priv);
for (i = 0; i < priv->port_count; i++) {
port = priv->port_list[i];
mvpp2_swf_bm_pool_init(port);
if (status[i])
mvpp2_open(port->dev);
}
return 0;
}
static int mvpp2_change_mtu(struct net_device *dev, int mtu)
{
struct mvpp2_port *port = netdev_priv(dev);
bool running = netif_running(dev);
struct mvpp2 *priv = port->priv;
int err;
if (!IS_ALIGNED(MVPP2_RX_PKT_SIZE(mtu), 8)) {
netdev_info(dev, "illegal MTU value %d, round to %d\n", mtu,
ALIGN(MVPP2_RX_PKT_SIZE(mtu), 8));
mtu = ALIGN(MVPP2_RX_PKT_SIZE(mtu), 8);
}
if (MVPP2_RX_PKT_SIZE(mtu) > MVPP2_BM_LONG_PKT_SIZE) {
if (port->xdp_prog) {
netdev_err(dev, "Jumbo frames are not supported with XDP\n");
return -EINVAL;
}
if (priv->percpu_pools) {
netdev_warn(dev, "mtu %d too high, switching to shared buffers", mtu);
mvpp2_bm_switch_buffers(priv, false);
}
} else {
bool jumbo = false;
int i;
for (i = 0; i < priv->port_count; i++)
if (priv->port_list[i] != port &&
MVPP2_RX_PKT_SIZE(priv->port_list[i]->dev->mtu) >
MVPP2_BM_LONG_PKT_SIZE) {
jumbo = true;
break;
}
/* No port is using jumbo frames */
if (!jumbo) {
dev_info(port->dev->dev.parent,
"all ports have a low MTU, switching to per-cpu buffers");
mvpp2_bm_switch_buffers(priv, true);
}
}
if (running)
mvpp2_stop_dev(port);
err = mvpp2_bm_update_mtu(dev, mtu);
if (err) {
netdev_err(dev, "failed to change MTU\n");
/* Reconfigure BM to the original MTU */
mvpp2_bm_update_mtu(dev, dev->mtu);
} else {
port->pkt_size = MVPP2_RX_PKT_SIZE(mtu);
}
if (running) {
mvpp2_start_dev(port);
mvpp2_egress_enable(port);
mvpp2_ingress_enable(port);
}
return err;
}
static int mvpp2_check_pagepool_dma(struct mvpp2_port *port)
{
enum dma_data_direction dma_dir = DMA_FROM_DEVICE;
struct mvpp2 *priv = port->priv;
int err = -1, i;
if (!priv->percpu_pools)
return err;
if (!priv->page_pool[0])
return -ENOMEM;
for (i = 0; i < priv->port_count; i++) {
port = priv->port_list[i];
if (port->xdp_prog) {
dma_dir = DMA_BIDIRECTIONAL;
break;
}
}
/* All pools are equal in terms of DMA direction */
if (priv->page_pool[0]->p.dma_dir != dma_dir)
err = mvpp2_bm_switch_buffers(priv, true);
return err;
}
static void
mvpp2_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *stats)
{
struct mvpp2_port *port = netdev_priv(dev);
unsigned int start;
unsigned int cpu;
for_each_possible_cpu(cpu) {
struct mvpp2_pcpu_stats *cpu_stats;
u64 rx_packets;
u64 rx_bytes;
u64 tx_packets;
u64 tx_bytes;
cpu_stats = per_cpu_ptr(port->stats, cpu);
do {
start = u64_stats_fetch_begin_irq(&cpu_stats->syncp);
rx_packets = cpu_stats->rx_packets;
rx_bytes = cpu_stats->rx_bytes;
tx_packets = cpu_stats->tx_packets;
tx_bytes = cpu_stats->tx_bytes;
} while (u64_stats_fetch_retry_irq(&cpu_stats->syncp, start));
stats->rx_packets += rx_packets;
stats->rx_bytes += rx_bytes;
stats->tx_packets += tx_packets;
stats->tx_bytes += tx_bytes;
}
stats->rx_errors = dev->stats.rx_errors;
stats->rx_dropped = dev->stats.rx_dropped;
stats->tx_dropped = dev->stats.tx_dropped;
}
static int mvpp2_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
{
struct mvpp2_port *port = netdev_priv(dev);
if (!port->phylink)
return -ENOTSUPP;
return phylink_mii_ioctl(port->phylink, ifr, cmd);
}
static int mvpp2_vlan_rx_add_vid(struct net_device *dev, __be16 proto, u16 vid)
{
struct mvpp2_port *port = netdev_priv(dev);
int ret;
ret = mvpp2_prs_vid_entry_add(port, vid);
if (ret)
netdev_err(dev, "rx-vlan-filter offloading cannot accept more than %d VIDs per port\n",
MVPP2_PRS_VLAN_FILT_MAX - 1);
return ret;
}
static int mvpp2_vlan_rx_kill_vid(struct net_device *dev, __be16 proto, u16 vid)
{
struct mvpp2_port *port = netdev_priv(dev);
mvpp2_prs_vid_entry_remove(port, vid);
return 0;
}
static int mvpp2_set_features(struct net_device *dev,
netdev_features_t features)
{
netdev_features_t changed = dev->features ^ features;
struct mvpp2_port *port = netdev_priv(dev);
if (changed & NETIF_F_HW_VLAN_CTAG_FILTER) {
if (features & NETIF_F_HW_VLAN_CTAG_FILTER) {
mvpp2_prs_vid_enable_filtering(port);
} else {
/* Invalidate all registered VID filters for this
* port
*/
mvpp2_prs_vid_remove_all(port);
mvpp2_prs_vid_disable_filtering(port);
}
}
net: mvpp2: add an RSS classification step for each flow One of the classification action that can be performed is to compute a hash of the packet header based on some header fields, and lookup a RSS table based on this hash to determine the final RxQ. This is done by adding one lookup entry per flow per port, so that we can configure the hash generation parameters for each flow and each port. There are 2 possible engines that can be used for RSS hash generation : - C3HA, that generates a hash based on up to 4 header-extracted fields - C3HB, that does the same as c3HA, but also includes L4 info in the hash There are a lot of fields that can be extracted from the header. For now, we only use the ones that we can configure using ethtool : - DST MAC address - L3 info - Source IP - Destination IP - Source port - Destination port The C3HB engine is selected when we use L4 fields (src/dst port). Header parser Dec table Ingress pkt +-------------+ flow id +----------------------------+ ------------->| TCAM + SRAM |-------->|TCP IPv4 w/ VLAN, not frag | +-------------+ |TCP IPv4 w/o VLAN, not frag | |TCP IPv4 w/ VLAN, frag |--+ |etc. | | +----------------------------+ | | Flow table | +---------+ +------------+ +--------------------------+ | | RSS tbl |<--| Classifier |<--------| flow 0: C2 lookup | | +---------+ +------------+ | C3 lookup port 0 | | | | | C3 lookup port 1 | | +-----------+ +-------------+ | ... | | | C2 engine | | C3H engines | | flow 1: C2 lookup |<--+ +-----------+ +-------------+ | C3 lookup port 0 | | ... | | ... | | flow 51 : C2 lookup | | ... | +--------------------------+ The C2 engine also gains the role of enabling and disabling the RSS table lookup for this packet. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 19:54:26 +08:00
if (changed & NETIF_F_RXHASH) {
if (features & NETIF_F_RXHASH)
mvpp22_port_rss_enable(port);
net: mvpp2: add an RSS classification step for each flow One of the classification action that can be performed is to compute a hash of the packet header based on some header fields, and lookup a RSS table based on this hash to determine the final RxQ. This is done by adding one lookup entry per flow per port, so that we can configure the hash generation parameters for each flow and each port. There are 2 possible engines that can be used for RSS hash generation : - C3HA, that generates a hash based on up to 4 header-extracted fields - C3HB, that does the same as c3HA, but also includes L4 info in the hash There are a lot of fields that can be extracted from the header. For now, we only use the ones that we can configure using ethtool : - DST MAC address - L3 info - Source IP - Destination IP - Source port - Destination port The C3HB engine is selected when we use L4 fields (src/dst port). Header parser Dec table Ingress pkt +-------------+ flow id +----------------------------+ ------------->| TCAM + SRAM |-------->|TCP IPv4 w/ VLAN, not frag | +-------------+ |TCP IPv4 w/o VLAN, not frag | |TCP IPv4 w/ VLAN, frag |--+ |etc. | | +----------------------------+ | | Flow table | +---------+ +------------+ +--------------------------+ | | RSS tbl |<--| Classifier |<--------| flow 0: C2 lookup | | +---------+ +------------+ | C3 lookup port 0 | | | | | C3 lookup port 1 | | +-----------+ +-------------+ | ... | | | C2 engine | | C3H engines | | flow 1: C2 lookup |<--+ +-----------+ +-------------+ | C3 lookup port 0 | | ... | | ... | | flow 51 : C2 lookup | | ... | +--------------------------+ The C2 engine also gains the role of enabling and disabling the RSS table lookup for this packet. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 19:54:26 +08:00
else
mvpp22_port_rss_disable(port);
net: mvpp2: add an RSS classification step for each flow One of the classification action that can be performed is to compute a hash of the packet header based on some header fields, and lookup a RSS table based on this hash to determine the final RxQ. This is done by adding one lookup entry per flow per port, so that we can configure the hash generation parameters for each flow and each port. There are 2 possible engines that can be used for RSS hash generation : - C3HA, that generates a hash based on up to 4 header-extracted fields - C3HB, that does the same as c3HA, but also includes L4 info in the hash There are a lot of fields that can be extracted from the header. For now, we only use the ones that we can configure using ethtool : - DST MAC address - L3 info - Source IP - Destination IP - Source port - Destination port The C3HB engine is selected when we use L4 fields (src/dst port). Header parser Dec table Ingress pkt +-------------+ flow id +----------------------------+ ------------->| TCAM + SRAM |-------->|TCP IPv4 w/ VLAN, not frag | +-------------+ |TCP IPv4 w/o VLAN, not frag | |TCP IPv4 w/ VLAN, frag |--+ |etc. | | +----------------------------+ | | Flow table | +---------+ +------------+ +--------------------------+ | | RSS tbl |<--| Classifier |<--------| flow 0: C2 lookup | | +---------+ +------------+ | C3 lookup port 0 | | | | | C3 lookup port 1 | | +-----------+ +-------------+ | ... | | | C2 engine | | C3H engines | | flow 1: C2 lookup |<--+ +-----------+ +-------------+ | C3 lookup port 0 | | ... | | ... | | flow 51 : C2 lookup | | ... | +--------------------------+ The C2 engine also gains the role of enabling and disabling the RSS table lookup for this packet. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 19:54:26 +08:00
}
return 0;
}
static int mvpp2_xdp_setup(struct mvpp2_port *port, struct netdev_bpf *bpf)
{
struct bpf_prog *prog = bpf->prog, *old_prog;
bool running = netif_running(port->dev);
bool reset = !prog != !port->xdp_prog;
if (port->dev->mtu > ETH_DATA_LEN) {
NL_SET_ERR_MSG_MOD(bpf->extack, "XDP is not supported with jumbo frames enabled");
return -EOPNOTSUPP;
}
if (!port->priv->percpu_pools) {
NL_SET_ERR_MSG_MOD(bpf->extack, "Per CPU Pools required for XDP");
return -EOPNOTSUPP;
}
if (port->ntxqs < num_possible_cpus() * 2) {
NL_SET_ERR_MSG_MOD(bpf->extack, "XDP_TX needs two TX queues per CPU");
return -EOPNOTSUPP;
}
/* device is up and bpf is added/removed, must setup the RX queues */
if (running && reset)
mvpp2_stop(port->dev);
old_prog = xchg(&port->xdp_prog, prog);
if (old_prog)
bpf_prog_put(old_prog);
/* bpf is just replaced, RXQ and MTU are already setup */
if (!reset)
return 0;
/* device was up, restore the link */
if (running)
mvpp2_open(port->dev);
/* Check Page Pool DMA Direction */
mvpp2_check_pagepool_dma(port);
return 0;
}
static int mvpp2_xdp(struct net_device *dev, struct netdev_bpf *xdp)
{
struct mvpp2_port *port = netdev_priv(dev);
switch (xdp->command) {
case XDP_SETUP_PROG:
return mvpp2_xdp_setup(port, xdp);
default:
return -EINVAL;
}
}
/* Ethtool methods */
static int mvpp2_ethtool_nway_reset(struct net_device *dev)
{
struct mvpp2_port *port = netdev_priv(dev);
if (!port->phylink)
return -ENOTSUPP;
return phylink_ethtool_nway_reset(port->phylink);
}
/* Set interrupt coalescing for ethtools */
static int mvpp2_ethtool_set_coalesce(struct net_device *dev,
struct ethtool_coalesce *c)
{
struct mvpp2_port *port = netdev_priv(dev);
int queue;
for (queue = 0; queue < port->nrxqs; queue++) {
struct mvpp2_rx_queue *rxq = port->rxqs[queue];
rxq->time_coal = c->rx_coalesce_usecs;
rxq->pkts_coal = c->rx_max_coalesced_frames;
mvpp2_rx_pkts_coal_set(port, rxq);
mvpp2_rx_time_coal_set(port, rxq);
}
if (port->has_tx_irqs) {
port->tx_time_coal = c->tx_coalesce_usecs;
mvpp2_tx_time_coal_set(port);
}
for (queue = 0; queue < port->ntxqs; queue++) {
struct mvpp2_tx_queue *txq = port->txqs[queue];
txq->done_pkts_coal = c->tx_max_coalesced_frames;
if (port->has_tx_irqs)
mvpp2_tx_pkts_coal_set(port, txq);
}
return 0;
}
/* get coalescing for ethtools */
static int mvpp2_ethtool_get_coalesce(struct net_device *dev,
struct ethtool_coalesce *c)
{
struct mvpp2_port *port = netdev_priv(dev);
c->rx_coalesce_usecs = port->rxqs[0]->time_coal;
c->rx_max_coalesced_frames = port->rxqs[0]->pkts_coal;
c->tx_max_coalesced_frames = port->txqs[0]->done_pkts_coal;
c->tx_coalesce_usecs = port->tx_time_coal;
return 0;
}
static void mvpp2_ethtool_get_drvinfo(struct net_device *dev,
struct ethtool_drvinfo *drvinfo)
{
strlcpy(drvinfo->driver, MVPP2_DRIVER_NAME,
sizeof(drvinfo->driver));
strlcpy(drvinfo->version, MVPP2_DRIVER_VERSION,
sizeof(drvinfo->version));
strlcpy(drvinfo->bus_info, dev_name(&dev->dev),
sizeof(drvinfo->bus_info));
}
static void mvpp2_ethtool_get_ringparam(struct net_device *dev,
struct ethtool_ringparam *ring)
{
struct mvpp2_port *port = netdev_priv(dev);
ring->rx_max_pending = MVPP2_MAX_RXD_MAX;
ring->tx_max_pending = MVPP2_MAX_TXD_MAX;
ring->rx_pending = port->rx_ring_size;
ring->tx_pending = port->tx_ring_size;
}
static int mvpp2_ethtool_set_ringparam(struct net_device *dev,
struct ethtool_ringparam *ring)
{
struct mvpp2_port *port = netdev_priv(dev);
u16 prev_rx_ring_size = port->rx_ring_size;
u16 prev_tx_ring_size = port->tx_ring_size;
int err;
err = mvpp2_check_ringparam_valid(dev, ring);
if (err)
return err;
if (!netif_running(dev)) {
port->rx_ring_size = ring->rx_pending;
port->tx_ring_size = ring->tx_pending;
return 0;
}
/* The interface is running, so we have to force a
* reallocation of the queues
*/
mvpp2_stop_dev(port);
mvpp2_cleanup_rxqs(port);
mvpp2_cleanup_txqs(port);
port->rx_ring_size = ring->rx_pending;
port->tx_ring_size = ring->tx_pending;
err = mvpp2_setup_rxqs(port);
if (err) {
/* Reallocate Rx queues with the original ring size */
port->rx_ring_size = prev_rx_ring_size;
ring->rx_pending = prev_rx_ring_size;
err = mvpp2_setup_rxqs(port);
if (err)
goto err_out;
}
err = mvpp2_setup_txqs(port);
if (err) {
/* Reallocate Tx queues with the original ring size */
port->tx_ring_size = prev_tx_ring_size;
ring->tx_pending = prev_tx_ring_size;
err = mvpp2_setup_txqs(port);
if (err)
goto err_clean_rxqs;
}
mvpp2_start_dev(port);
mvpp2_egress_enable(port);
mvpp2_ingress_enable(port);
return 0;
err_clean_rxqs:
mvpp2_cleanup_rxqs(port);
err_out:
netdev_err(dev, "failed to change ring parameters");
return err;
}
static void mvpp2_ethtool_get_pause_param(struct net_device *dev,
struct ethtool_pauseparam *pause)
{
struct mvpp2_port *port = netdev_priv(dev);
if (!port->phylink)
return;
phylink_ethtool_get_pauseparam(port->phylink, pause);
}
static int mvpp2_ethtool_set_pause_param(struct net_device *dev,
struct ethtool_pauseparam *pause)
{
struct mvpp2_port *port = netdev_priv(dev);
if (!port->phylink)
return -ENOTSUPP;
return phylink_ethtool_set_pauseparam(port->phylink, pause);
}
static int mvpp2_ethtool_get_link_ksettings(struct net_device *dev,
struct ethtool_link_ksettings *cmd)
{
struct mvpp2_port *port = netdev_priv(dev);
if (!port->phylink)
return -ENOTSUPP;
return phylink_ethtool_ksettings_get(port->phylink, cmd);
}
static int mvpp2_ethtool_set_link_ksettings(struct net_device *dev,
const struct ethtool_link_ksettings *cmd)
{
struct mvpp2_port *port = netdev_priv(dev);
if (!port->phylink)
return -ENOTSUPP;
return phylink_ethtool_ksettings_set(port->phylink, cmd);
}
static int mvpp2_ethtool_get_rxnfc(struct net_device *dev,
struct ethtool_rxnfc *info, u32 *rules)
{
struct mvpp2_port *port = netdev_priv(dev);
int ret = 0, i, loc = 0;
if (!mvpp22_rss_is_supported())
return -EOPNOTSUPP;
switch (info->cmd) {
case ETHTOOL_GRXFH:
ret = mvpp2_ethtool_rxfh_get(port, info);
break;
case ETHTOOL_GRXRINGS:
info->data = port->nrxqs;
break;
case ETHTOOL_GRXCLSRLCNT:
info->rule_cnt = port->n_rfs_rules;
break;
case ETHTOOL_GRXCLSRULE:
ret = mvpp2_ethtool_cls_rule_get(port, info);
break;
case ETHTOOL_GRXCLSRLALL:
for (i = 0; i < MVPP2_N_RFS_ENTRIES_PER_FLOW; i++) {
if (port->rfs_rules[i])
rules[loc++] = i;
}
break;
default:
return -ENOTSUPP;
}
return ret;
}
static int mvpp2_ethtool_set_rxnfc(struct net_device *dev,
struct ethtool_rxnfc *info)
{
struct mvpp2_port *port = netdev_priv(dev);
int ret = 0;
if (!mvpp22_rss_is_supported())
return -EOPNOTSUPP;
switch (info->cmd) {
case ETHTOOL_SRXFH:
ret = mvpp2_ethtool_rxfh_set(port, info);
break;
case ETHTOOL_SRXCLSRLINS:
ret = mvpp2_ethtool_cls_rule_ins(port, info);
break;
case ETHTOOL_SRXCLSRLDEL:
ret = mvpp2_ethtool_cls_rule_del(port, info);
break;
default:
return -EOPNOTSUPP;
}
return ret;
}
static u32 mvpp2_ethtool_get_rxfh_indir_size(struct net_device *dev)
{
return mvpp22_rss_is_supported() ? MVPP22_RSS_TABLE_ENTRIES : 0;
}
static int mvpp2_ethtool_get_rxfh(struct net_device *dev, u32 *indir, u8 *key,
u8 *hfunc)
{
struct mvpp2_port *port = netdev_priv(dev);
int ret = 0;
if (!mvpp22_rss_is_supported())
return -EOPNOTSUPP;
if (indir)
ret = mvpp22_port_rss_ctx_indir_get(port, 0, indir);
if (hfunc)
*hfunc = ETH_RSS_HASH_CRC32;
return ret;
}
static int mvpp2_ethtool_set_rxfh(struct net_device *dev, const u32 *indir,
const u8 *key, const u8 hfunc)
{
struct mvpp2_port *port = netdev_priv(dev);
int ret = 0;
if (!mvpp22_rss_is_supported())
return -EOPNOTSUPP;
if (hfunc != ETH_RSS_HASH_NO_CHANGE && hfunc != ETH_RSS_HASH_CRC32)
return -EOPNOTSUPP;
if (key)
return -EOPNOTSUPP;
if (indir)
ret = mvpp22_port_rss_ctx_indir_set(port, 0, indir);
return ret;
}
static int mvpp2_ethtool_get_rxfh_context(struct net_device *dev, u32 *indir,
u8 *key, u8 *hfunc, u32 rss_context)
{
struct mvpp2_port *port = netdev_priv(dev);
int ret = 0;
if (!mvpp22_rss_is_supported())
return -EOPNOTSUPP;
if (rss_context >= MVPP22_N_RSS_TABLES)
return -EINVAL;
if (hfunc)
*hfunc = ETH_RSS_HASH_CRC32;
if (indir)
ret = mvpp22_port_rss_ctx_indir_get(port, rss_context, indir);
return ret;
}
static int mvpp2_ethtool_set_rxfh_context(struct net_device *dev,
const u32 *indir, const u8 *key,
const u8 hfunc, u32 *rss_context,
bool delete)
{
struct mvpp2_port *port = netdev_priv(dev);
int ret;
if (!mvpp22_rss_is_supported())
return -EOPNOTSUPP;
if (hfunc != ETH_RSS_HASH_NO_CHANGE && hfunc != ETH_RSS_HASH_CRC32)
return -EOPNOTSUPP;
if (key)
return -EOPNOTSUPP;
if (delete)
return mvpp22_port_rss_ctx_delete(port, *rss_context);
if (*rss_context == ETH_RXFH_CONTEXT_ALLOC) {
ret = mvpp22_port_rss_ctx_create(port, rss_context);
if (ret)
return ret;
}
return mvpp22_port_rss_ctx_indir_set(port, *rss_context, indir);
}
/* Device ops */
static const struct net_device_ops mvpp2_netdev_ops = {
.ndo_open = mvpp2_open,
.ndo_stop = mvpp2_stop,
.ndo_start_xmit = mvpp2_tx,
.ndo_set_rx_mode = mvpp2_set_rx_mode,
.ndo_set_mac_address = mvpp2_set_mac_address,
.ndo_change_mtu = mvpp2_change_mtu,
.ndo_get_stats64 = mvpp2_get_stats64,
.ndo_do_ioctl = mvpp2_ioctl,
.ndo_vlan_rx_add_vid = mvpp2_vlan_rx_add_vid,
.ndo_vlan_rx_kill_vid = mvpp2_vlan_rx_kill_vid,
.ndo_set_features = mvpp2_set_features,
.ndo_bpf = mvpp2_xdp,
.ndo_xdp_xmit = mvpp2_xdp_xmit,
};
static const struct ethtool_ops mvpp2_eth_tool_ops = {
.supported_coalesce_params = ETHTOOL_COALESCE_USECS |
ETHTOOL_COALESCE_MAX_FRAMES,
.nway_reset = mvpp2_ethtool_nway_reset,
.get_link = ethtool_op_get_link,
.set_coalesce = mvpp2_ethtool_set_coalesce,
.get_coalesce = mvpp2_ethtool_get_coalesce,
.get_drvinfo = mvpp2_ethtool_get_drvinfo,
.get_ringparam = mvpp2_ethtool_get_ringparam,
.set_ringparam = mvpp2_ethtool_set_ringparam,
.get_strings = mvpp2_ethtool_get_strings,
.get_ethtool_stats = mvpp2_ethtool_get_stats,
.get_sset_count = mvpp2_ethtool_get_sset_count,
.get_pauseparam = mvpp2_ethtool_get_pause_param,
.set_pauseparam = mvpp2_ethtool_set_pause_param,
.get_link_ksettings = mvpp2_ethtool_get_link_ksettings,
.set_link_ksettings = mvpp2_ethtool_set_link_ksettings,
.get_rxnfc = mvpp2_ethtool_get_rxnfc,
.set_rxnfc = mvpp2_ethtool_set_rxnfc,
.get_rxfh_indir_size = mvpp2_ethtool_get_rxfh_indir_size,
.get_rxfh = mvpp2_ethtool_get_rxfh,
.set_rxfh = mvpp2_ethtool_set_rxfh,
.get_rxfh_context = mvpp2_ethtool_get_rxfh_context,
.set_rxfh_context = mvpp2_ethtool_set_rxfh_context,
};
/* Used for PPv2.1, or PPv2.2 with the old Device Tree binding that
* had a single IRQ defined per-port.
*/
static int mvpp2_simple_queue_vectors_init(struct mvpp2_port *port,
struct device_node *port_node)
{
struct mvpp2_queue_vector *v = &port->qvecs[0];
v->first_rxq = 0;
v->nrxqs = port->nrxqs;
v->type = MVPP2_QUEUE_VECTOR_SHARED;
v->sw_thread_id = 0;
v->sw_thread_mask = *cpumask_bits(cpu_online_mask);
v->port = port;
v->irq = irq_of_parse_and_map(port_node, 0);
if (v->irq <= 0)
return -EINVAL;
netif_napi_add(port->dev, &v->napi, mvpp2_poll,
NAPI_POLL_WEIGHT);
port->nqvecs = 1;
return 0;
}
static int mvpp2_multi_queue_vectors_init(struct mvpp2_port *port,
struct device_node *port_node)
{
struct mvpp2 *priv = port->priv;
struct mvpp2_queue_vector *v;
int i, ret;
switch (queue_mode) {
case MVPP2_QDIST_SINGLE_MODE:
port->nqvecs = priv->nthreads + 1;
break;
case MVPP2_QDIST_MULTI_MODE:
port->nqvecs = priv->nthreads;
break;
}
for (i = 0; i < port->nqvecs; i++) {
char irqname[16];
v = port->qvecs + i;
v->port = port;
v->type = MVPP2_QUEUE_VECTOR_PRIVATE;
v->sw_thread_id = i;
v->sw_thread_mask = BIT(i);
if (port->flags & MVPP2_F_DT_COMPAT)
snprintf(irqname, sizeof(irqname), "tx-cpu%d", i);
else
snprintf(irqname, sizeof(irqname), "hif%d", i);
if (queue_mode == MVPP2_QDIST_MULTI_MODE) {
v->first_rxq = i;
v->nrxqs = 1;
} else if (queue_mode == MVPP2_QDIST_SINGLE_MODE &&
i == (port->nqvecs - 1)) {
v->first_rxq = 0;
v->nrxqs = port->nrxqs;
v->type = MVPP2_QUEUE_VECTOR_SHARED;
if (port->flags & MVPP2_F_DT_COMPAT)
strncpy(irqname, "rx-shared", sizeof(irqname));
}
net: mvpp2: enable ACPI support in the driver This patch introduces an alternative way of obtaining resources - via ACPI tables provided by firmware. Enabling coexistence with the DT support, in addition to the OF_*->device_*/fwnode_* API replacement, required following steps to be taken: * Add mvpp2_acpi_match table * Omit clock configuration and obtain tclk from the property - in ACPI world, the firmware is responsible for clock maintenance. * Disable comphy and syscon handling as they are not available for ACPI. * Modify way of obtaining interrupts - use newly introduced fwnode_irq_get() routine * Until proper MDIO bus and PHY handling with ACPI is established in the kernel, use only link interrupts feature in the driver. For the RGMII port it results in depending on GMAC settings done during firmware stage. * When booting with ACPI MVPP2_QDIST_MULTI_MODE is picked by default, as there is no need to keep any kind of the backward compatibility. Moreover, a memory region used by mvmdio driver is usually placed in the middle of the address space of the PP2 network controller. The MDIO base address is obtained without requesting memory region (by devm_ioremap() call) in mvmdio.c, later overlapping resources are requested by the network driver, which is responsible for avoiding a concurrent access. In case the MDIO memory region is declared in the ACPI, it can already appear as 'in-use' in the OS. Because it is overlapped by second region of the network controller, make sure it is released, before requesting it again. The care is taken by mvpp2 driver to avoid concurrent access to this memory region. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 20:31:44 +08:00
if (port_node)
v->irq = of_irq_get_byname(port_node, irqname);
else
v->irq = fwnode_irq_get(port->fwnode, i);
if (v->irq <= 0) {
ret = -EINVAL;
goto err;
}
netif_napi_add(port->dev, &v->napi, mvpp2_poll,
NAPI_POLL_WEIGHT);
}
return 0;
err:
for (i = 0; i < port->nqvecs; i++)
irq_dispose_mapping(port->qvecs[i].irq);
return ret;
}
static int mvpp2_queue_vectors_init(struct mvpp2_port *port,
struct device_node *port_node)
{
if (port->has_tx_irqs)
return mvpp2_multi_queue_vectors_init(port, port_node);
else
return mvpp2_simple_queue_vectors_init(port, port_node);
}
static void mvpp2_queue_vectors_deinit(struct mvpp2_port *port)
{
int i;
for (i = 0; i < port->nqvecs; i++)
irq_dispose_mapping(port->qvecs[i].irq);
}
/* Configure Rx queue group interrupt for this port */
static void mvpp2_rx_irqs_setup(struct mvpp2_port *port)
{
struct mvpp2 *priv = port->priv;
u32 val;
int i;
if (priv->hw_version == MVPP21) {
mvpp2_write(priv, MVPP21_ISR_RXQ_GROUP_REG(port->id),
port->nrxqs);
return;
}
/* Handle the more complicated PPv2.2 case */
for (i = 0; i < port->nqvecs; i++) {
struct mvpp2_queue_vector *qv = port->qvecs + i;
if (!qv->nrxqs)
continue;
val = qv->sw_thread_id;
val |= port->id << MVPP22_ISR_RXQ_GROUP_INDEX_GROUP_OFFSET;
mvpp2_write(priv, MVPP22_ISR_RXQ_GROUP_INDEX_REG, val);
val = qv->first_rxq;
val |= qv->nrxqs << MVPP22_ISR_RXQ_SUB_GROUP_SIZE_OFFSET;
mvpp2_write(priv, MVPP22_ISR_RXQ_SUB_GROUP_CONFIG_REG, val);
}
}
/* Initialize port HW */
static int mvpp2_port_init(struct mvpp2_port *port)
{
struct device *dev = port->dev->dev.parent;
struct mvpp2 *priv = port->priv;
struct mvpp2_txq_pcpu *txq_pcpu;
unsigned int thread;
int queue, err;
/* Checks for hardware constraints */
if (port->first_rxq + port->nrxqs >
MVPP2_MAX_PORTS * priv->max_port_rxqs)
return -EINVAL;
if (port->nrxqs > priv->max_port_rxqs || port->ntxqs > MVPP2_MAX_TXQ)
return -EINVAL;
/* Disable port */
mvpp2_egress_disable(port);
mvpp2_port_disable(port);
port->tx_time_coal = MVPP2_TXDONE_COAL_USEC;
port->txqs = devm_kcalloc(dev, port->ntxqs, sizeof(*port->txqs),
GFP_KERNEL);
if (!port->txqs)
return -ENOMEM;
/* Associate physical Tx queues to this port and initialize.
* The mapping is predefined.
*/
for (queue = 0; queue < port->ntxqs; queue++) {
int queue_phy_id = mvpp2_txq_phys(port->id, queue);
struct mvpp2_tx_queue *txq;
txq = devm_kzalloc(dev, sizeof(*txq), GFP_KERNEL);
if (!txq) {
err = -ENOMEM;
goto err_free_percpu;
}
txq->pcpu = alloc_percpu(struct mvpp2_txq_pcpu);
if (!txq->pcpu) {
err = -ENOMEM;
goto err_free_percpu;
}
txq->id = queue_phy_id;
txq->log_id = queue;
txq->done_pkts_coal = MVPP2_TXDONE_COAL_PKTS_THRESH;
for (thread = 0; thread < priv->nthreads; thread++) {
txq_pcpu = per_cpu_ptr(txq->pcpu, thread);
txq_pcpu->thread = thread;
}
port->txqs[queue] = txq;
}
port->rxqs = devm_kcalloc(dev, port->nrxqs, sizeof(*port->rxqs),
GFP_KERNEL);
if (!port->rxqs) {
err = -ENOMEM;
goto err_free_percpu;
}
/* Allocate and initialize Rx queue for this port */
for (queue = 0; queue < port->nrxqs; queue++) {
struct mvpp2_rx_queue *rxq;
/* Map physical Rx queue to port's logical Rx queue */
rxq = devm_kzalloc(dev, sizeof(*rxq), GFP_KERNEL);
if (!rxq) {
err = -ENOMEM;
goto err_free_percpu;
}
/* Map this Rx queue to a physical queue */
rxq->id = port->first_rxq + queue;
rxq->port = port->id;
rxq->logic_rxq = queue;
port->rxqs[queue] = rxq;
}
mvpp2_rx_irqs_setup(port);
/* Create Rx descriptor rings */
for (queue = 0; queue < port->nrxqs; queue++) {
struct mvpp2_rx_queue *rxq = port->rxqs[queue];
rxq->size = port->rx_ring_size;
rxq->pkts_coal = MVPP2_RX_COAL_PKTS;
rxq->time_coal = MVPP2_RX_COAL_USEC;
}
mvpp2_ingress_disable(port);
/* Port default configuration */
mvpp2_defaults_set(port);
/* Port's classifier configuration */
mvpp2_cls_oversize_rxq_set(port);
mvpp2_cls_port_config(port);
if (mvpp22_rss_is_supported())
mvpp22_port_rss_init(port);
/* Provide an initial Rx packet size */
port->pkt_size = MVPP2_RX_PKT_SIZE(port->dev->mtu);
/* Initialize pools for swf */
err = mvpp2_swf_bm_pool_init(port);
if (err)
goto err_free_percpu;
/* Clear all port stats */
mvpp2_read_stats(port);
memset(port->ethtool_stats, 0,
MVPP2_N_ETHTOOL_STATS(port->ntxqs, port->nrxqs) * sizeof(u64));
return 0;
err_free_percpu:
for (queue = 0; queue < port->ntxqs; queue++) {
if (!port->txqs[queue])
continue;
free_percpu(port->txqs[queue]->pcpu);
}
return err;
}
static bool mvpp22_port_has_legacy_tx_irqs(struct device_node *port_node,
unsigned long *flags)
{
char *irqs[5] = { "rx-shared", "tx-cpu0", "tx-cpu1", "tx-cpu2",
"tx-cpu3" };
int i;
for (i = 0; i < 5; i++)
if (of_property_match_string(port_node, "interrupt-names",
irqs[i]) < 0)
return false;
*flags |= MVPP2_F_DT_COMPAT;
return true;
}
/* Checks if the port dt description has the required Tx interrupts:
* - PPv2.1: there are no such interrupts.
* - PPv2.2:
* - The old DTs have: "rx-shared", "tx-cpuX" with X in [0...3]
* - The new ones have: "hifX" with X in [0..8]
*
* All those variants are supported to keep the backward compatibility.
*/
static bool mvpp2_port_has_irqs(struct mvpp2 *priv,
struct device_node *port_node,
unsigned long *flags)
{
char name[5];
int i;
/* ACPI */
if (!port_node)
return true;
if (priv->hw_version == MVPP21)
return false;
if (mvpp22_port_has_legacy_tx_irqs(port_node, flags))
return true;
for (i = 0; i < MVPP2_MAX_THREADS; i++) {
snprintf(name, 5, "hif%d", i);
if (of_property_match_string(port_node, "interrupt-names",
name) < 0)
return false;
}
return true;
}
static void mvpp2_port_copy_mac_addr(struct net_device *dev, struct mvpp2 *priv,
struct fwnode_handle *fwnode,
char **mac_from)
{
struct mvpp2_port *port = netdev_priv(dev);
char hw_mac_addr[ETH_ALEN] = {0};
char fw_mac_addr[ETH_ALEN];
if (fwnode_get_mac_address(fwnode, fw_mac_addr, ETH_ALEN)) {
*mac_from = "firmware node";
ether_addr_copy(dev->dev_addr, fw_mac_addr);
return;
}
if (priv->hw_version == MVPP21) {
mvpp21_get_mac_address(port, hw_mac_addr);
if (is_valid_ether_addr(hw_mac_addr)) {
*mac_from = "hardware";
ether_addr_copy(dev->dev_addr, hw_mac_addr);
return;
}
}
*mac_from = "random";
eth_hw_addr_random(dev);
}
static struct mvpp2_port *mvpp2_phylink_to_port(struct phylink_config *config)
{
return container_of(config, struct mvpp2_port, phylink_config);
}
net: phylink: Add struct phylink_config to PHYLINK API The phylink_config structure will encapsulate a pointer to a struct device and the operation type requested for this instance of PHYLINK. This patch does not make any functional changes, it just transitions the PHYLINK internals and all its users to the new API. A pointer to a phylink_config structure will be passed to phylink_create() instead of the net_device directly. Also, the same phylink_config pointer will be passed back to all phylink_mac_ops callbacks instead of the net_device. Using this mechanism, a PHYLINK user can get the original net_device using a structure such as 'to_net_dev(config->dev)' or directly the structure containing the phylink_config using a container_of call. At the moment, only the PHYLINK_NETDEV is defined as a valid operation type for PHYLINK. In this mode, a valid reference to a struct device linked to the original net_device should be passed to PHYLINK through the phylink_config structure. This API changes is mainly driven by the necessity of adding a new operation type in PHYLINK that disconnects the phy_device from the net_device and also works when the net_device is lacking. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29 01:38:12 +08:00
static void mvpp2_phylink_validate(struct phylink_config *config,
unsigned long *supported,
struct phylink_link_state *state)
{
struct mvpp2_port *port = mvpp2_phylink_to_port(config);
__ETHTOOL_DECLARE_LINK_MODE_MASK(mask) = { 0, };
/* Invalid combinations */
switch (state->interface) {
case PHY_INTERFACE_MODE_10GBASER:
case PHY_INTERFACE_MODE_XAUI:
if (!mvpp2_port_supports_xlg(port))
goto empty_set;
break;
case PHY_INTERFACE_MODE_RGMII:
case PHY_INTERFACE_MODE_RGMII_ID:
case PHY_INTERFACE_MODE_RGMII_RXID:
case PHY_INTERFACE_MODE_RGMII_TXID:
if (!mvpp2_port_supports_rgmii(port))
goto empty_set;
break;
default:
break;
}
phylink_set(mask, Autoneg);
phylink_set_port_modes(mask);
phylink_set(mask, Pause);
phylink_set(mask, Asym_Pause);
switch (state->interface) {
case PHY_INTERFACE_MODE_10GBASER:
case PHY_INTERFACE_MODE_XAUI:
case PHY_INTERFACE_MODE_NA:
if (mvpp2_port_supports_xlg(port)) {
phylink_set(mask, 10000baseT_Full);
phylink_set(mask, 10000baseCR_Full);
phylink_set(mask, 10000baseSR_Full);
phylink_set(mask, 10000baseLR_Full);
phylink_set(mask, 10000baseLRM_Full);
phylink_set(mask, 10000baseER_Full);
phylink_set(mask, 10000baseKR_Full);
}
if (state->interface != PHY_INTERFACE_MODE_NA)
break;
/* Fall-through */
case PHY_INTERFACE_MODE_RGMII:
case PHY_INTERFACE_MODE_RGMII_ID:
case PHY_INTERFACE_MODE_RGMII_RXID:
case PHY_INTERFACE_MODE_RGMII_TXID:
case PHY_INTERFACE_MODE_SGMII:
phylink_set(mask, 10baseT_Half);
phylink_set(mask, 10baseT_Full);
phylink_set(mask, 100baseT_Half);
phylink_set(mask, 100baseT_Full);
phylink_set(mask, 1000baseT_Full);
phylink_set(mask, 1000baseX_Full);
if (state->interface != PHY_INTERFACE_MODE_NA)
break;
/* Fall-through */
case PHY_INTERFACE_MODE_1000BASEX:
case PHY_INTERFACE_MODE_2500BASEX:
if (port->comphy ||
state->interface != PHY_INTERFACE_MODE_2500BASEX) {
phylink_set(mask, 1000baseT_Full);
phylink_set(mask, 1000baseX_Full);
}
if (port->comphy ||
state->interface == PHY_INTERFACE_MODE_2500BASEX) {
phylink_set(mask, 2500baseT_Full);
phylink_set(mask, 2500baseX_Full);
}
break;
default:
goto empty_set;
}
bitmap_and(supported, supported, mask, __ETHTOOL_LINK_MODE_MASK_NBITS);
bitmap_and(state->advertising, state->advertising, mask,
__ETHTOOL_LINK_MODE_MASK_NBITS);
phylink_helper_basex_speed(state);
return;
empty_set:
bitmap_zero(supported, __ETHTOOL_LINK_MODE_MASK_NBITS);
}
static void mvpp22_xlg_pcs_get_state(struct mvpp2_port *port,
struct phylink_link_state *state)
{
u32 val;
state->speed = SPEED_10000;
state->duplex = 1;
state->an_complete = 1;
val = readl(port->base + MVPP22_XLG_STATUS);
state->link = !!(val & MVPP22_XLG_STATUS_LINK_UP);
state->pause = 0;
val = readl(port->base + MVPP22_XLG_CTRL0_REG);
if (val & MVPP22_XLG_CTRL0_TX_FLOW_CTRL_EN)
state->pause |= MLO_PAUSE_TX;
if (val & MVPP22_XLG_CTRL0_RX_FLOW_CTRL_EN)
state->pause |= MLO_PAUSE_RX;
}
static void mvpp2_gmac_pcs_get_state(struct mvpp2_port *port,
struct phylink_link_state *state)
{
u32 val;
val = readl(port->base + MVPP2_GMAC_STATUS0);
state->an_complete = !!(val & MVPP2_GMAC_STATUS0_AN_COMPLETE);
state->link = !!(val & MVPP2_GMAC_STATUS0_LINK_UP);
state->duplex = !!(val & MVPP2_GMAC_STATUS0_FULL_DUPLEX);
switch (port->phy_interface) {
case PHY_INTERFACE_MODE_1000BASEX:
state->speed = SPEED_1000;
break;
case PHY_INTERFACE_MODE_2500BASEX:
state->speed = SPEED_2500;
break;
default:
if (val & MVPP2_GMAC_STATUS0_GMII_SPEED)
state->speed = SPEED_1000;
else if (val & MVPP2_GMAC_STATUS0_MII_SPEED)
state->speed = SPEED_100;
else
state->speed = SPEED_10;
}
state->pause = 0;
if (val & MVPP2_GMAC_STATUS0_RX_PAUSE)
state->pause |= MLO_PAUSE_RX;
if (val & MVPP2_GMAC_STATUS0_TX_PAUSE)
state->pause |= MLO_PAUSE_TX;
}
static void mvpp2_phylink_mac_pcs_get_state(struct phylink_config *config,
struct phylink_link_state *state)
{
struct mvpp2_port *port = mvpp2_phylink_to_port(config);
if (port->priv->hw_version == MVPP22 && port->gop_id == 0) {
u32 mode = readl(port->base + MVPP22_XLG_CTRL3_REG);
mode &= MVPP22_XLG_CTRL3_MACMODESELECT_MASK;
if (mode == MVPP22_XLG_CTRL3_MACMODESELECT_10G) {
mvpp22_xlg_pcs_get_state(port, state);
return;
}
}
mvpp2_gmac_pcs_get_state(port, state);
}
net: phylink: Add struct phylink_config to PHYLINK API The phylink_config structure will encapsulate a pointer to a struct device and the operation type requested for this instance of PHYLINK. This patch does not make any functional changes, it just transitions the PHYLINK internals and all its users to the new API. A pointer to a phylink_config structure will be passed to phylink_create() instead of the net_device directly. Also, the same phylink_config pointer will be passed back to all phylink_mac_ops callbacks instead of the net_device. Using this mechanism, a PHYLINK user can get the original net_device using a structure such as 'to_net_dev(config->dev)' or directly the structure containing the phylink_config using a container_of call. At the moment, only the PHYLINK_NETDEV is defined as a valid operation type for PHYLINK. In this mode, a valid reference to a struct device linked to the original net_device should be passed to PHYLINK through the phylink_config structure. This API changes is mainly driven by the necessity of adding a new operation type in PHYLINK that disconnects the phy_device from the net_device and also works when the net_device is lacking. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29 01:38:12 +08:00
static void mvpp2_mac_an_restart(struct phylink_config *config)
{
struct mvpp2_port *port = mvpp2_phylink_to_port(config);
u32 val = readl(port->base + MVPP2_GMAC_AUTONEG_CONFIG);
writel(val | MVPP2_GMAC_IN_BAND_RESTART_AN,
port->base + MVPP2_GMAC_AUTONEG_CONFIG);
writel(val & ~MVPP2_GMAC_IN_BAND_RESTART_AN,
port->base + MVPP2_GMAC_AUTONEG_CONFIG);
}
static void mvpp2_xlg_config(struct mvpp2_port *port, unsigned int mode,
const struct phylink_link_state *state)
{
u32 val;
mvpp2_modify(port->base + MVPP22_XLG_CTRL0_REG,
MVPP22_XLG_CTRL0_MAC_RESET_DIS,
MVPP22_XLG_CTRL0_MAC_RESET_DIS);
mvpp2_modify(port->base + MVPP22_XLG_CTRL4_REG,
MVPP22_XLG_CTRL4_MACMODSELECT_GMAC |
MVPP22_XLG_CTRL4_EN_IDLE_CHECK |
MVPP22_XLG_CTRL4_FWD_FC | MVPP22_XLG_CTRL4_FWD_PFC,
MVPP22_XLG_CTRL4_FWD_FC | MVPP22_XLG_CTRL4_FWD_PFC);
/* Wait for reset to deassert */
do {
val = readl(port->base + MVPP22_XLG_CTRL0_REG);
} while (!(val & MVPP22_XLG_CTRL0_MAC_RESET_DIS));
}
static void mvpp2_gmac_config(struct mvpp2_port *port, unsigned int mode,
const struct phylink_link_state *state)
{
net: marvell: mvpp2: only reprogram what is necessary on mac_config mac_config() can be called at any point, and the expected behaviour from MAC drivers is to only reprogram when necessary - and certainly avoid taking the link down on every call. Unfortunately, mvpp2 does exactly that - it takes the link down, and reprograms everything, and then releases the forced-link down. This is bad, it can cause the link to bounce: - SFP detects signal, disables LOS indication. - SFP code calls into phylink, calling phylink_sfp_link_up() which triggers a resolve. - phylink_resolve() calls phylink_get_mac_state() and finds the MAC reporting link up. - phylink wants to configure the pause mode on the MAC, so calls phylink_mac_config() - mvpp2 takes the link down temporarily, generating a MAC link down event followed by another MAC link event. - phylink calls mac_link_up() and then processes the MAC link down event. - phylink_resolve() gets called again, registers the link down, and calls mach_link_down() before re-running itself. - phylink_resolve() starts again at step 3 above. This sequence repeats. GMAC versions prior to mvpp2 do not require the link to be taken down except when certain link properties (eg, switching between SGMII and 1000base-X mode, or enabling/disabling in-band negotiation) are changed. Implement this for mvpp2. Tested-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 23:35:49 +08:00
u32 old_an, an;
u32 old_ctrl0, ctrl0;
u32 old_ctrl2, ctrl2;
u32 old_ctrl4, ctrl4;
net: marvell: mvpp2: only reprogram what is necessary on mac_config mac_config() can be called at any point, and the expected behaviour from MAC drivers is to only reprogram when necessary - and certainly avoid taking the link down on every call. Unfortunately, mvpp2 does exactly that - it takes the link down, and reprograms everything, and then releases the forced-link down. This is bad, it can cause the link to bounce: - SFP detects signal, disables LOS indication. - SFP code calls into phylink, calling phylink_sfp_link_up() which triggers a resolve. - phylink_resolve() calls phylink_get_mac_state() and finds the MAC reporting link up. - phylink wants to configure the pause mode on the MAC, so calls phylink_mac_config() - mvpp2 takes the link down temporarily, generating a MAC link down event followed by another MAC link event. - phylink calls mac_link_up() and then processes the MAC link down event. - phylink_resolve() gets called again, registers the link down, and calls mach_link_down() before re-running itself. - phylink_resolve() starts again at step 3 above. This sequence repeats. GMAC versions prior to mvpp2 do not require the link to be taken down except when certain link properties (eg, switching between SGMII and 1000base-X mode, or enabling/disabling in-band negotiation) are changed. Implement this for mvpp2. Tested-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 23:35:49 +08:00
old_an = an = readl(port->base + MVPP2_GMAC_AUTONEG_CONFIG);
old_ctrl0 = ctrl0 = readl(port->base + MVPP2_GMAC_CTRL_0_REG);
old_ctrl2 = ctrl2 = readl(port->base + MVPP2_GMAC_CTRL_2_REG);
old_ctrl4 = ctrl4 = readl(port->base + MVPP22_GMAC_CTRL_4_REG);
an &= ~(MVPP2_GMAC_AN_SPEED_EN | MVPP2_GMAC_FC_ADV_EN |
MVPP2_GMAC_FC_ADV_ASM_EN | MVPP2_GMAC_FLOW_CTRL_AUTONEG |
MVPP2_GMAC_AN_DUPLEX_EN | MVPP2_GMAC_IN_BAND_AUTONEG |
MVPP2_GMAC_IN_BAND_AUTONEG_BYPASS);
ctrl0 &= ~MVPP2_GMAC_PORT_TYPE_MASK;
net: marvell: mvpp2: phylink compliance updates Sven Auhagen reported issues with negotiation on a couple of his platforms using a mixture of SFP and PHYs in various different modes. Debugging to root cause proved difficult, but essentially the problem comes down to the mvpp2 phylink implementation being slightly at odds with what is expected. phylink operates in three modes: phy, fixed-link, and in-band mode. In the first two modes, the expected behaviour from a MAC driver is that phylink resolves the operating mode and passes the mode to the MAC driver for it to program, including when the link should be brought up or taken down. This is basically the same as the libphy approach. This does not negate the requirement to advertise a correct control word for interface modes that have control words where that can be reasonably controlled. The second mode is in-band mode, where the MAC is expected to use the in-band control word to determine the operating mode. The mvneta driver implements the correct pattern required to support this: configure the port interface type separately from the in-band mode(s). This is now specified in the phylink documentation patches. mvpp2 was programming in-band mode for SGMII and the 802.3z modes no what, and avoided forcing the link up in fixed/phy modes. This caused a problem with some boards where the PHY is by default programmed to enter AN bypass mode, the PHY would report that the link was up, but the mvpp2 never completed the exchange of control word. Another issue that mvpp2 has is it sets SGMII AN format control word for both SGMII and 802.3z modes. The format of the control word is defined by MVPP2_GMAC_INBAND_AN_MASK, which should be set for SGMII and clear for 802.3z. Available Marvell documentation for earlier GMAC implementations does not make this clear, but this has been ascertained via extensive testing on earlier GMAC implementations, and then confirmed with a Macchiatobin Single Shot connected to a Clearfog: when MVPP2_GMAC_INBAND_AN_MASK is set, the clearfog does not receive the advertised pause mode settings. Lastly, there is no flow control in the in-band control word in Cisco SGMII, setting the flow control autonegotiation bit even with a PHY that has the Marvell extension to send this information does not result in the flow control being enabled at the MAC. We need to do this manually using the information provided via phylink. Re-code mvpp2's mac_config() and mac_link_up() to follow this pattern. This allows Sven Auhagen's board and Macchiatobin to reliably bring the link up with the 88e1512 PHY with phylink operating in PHY mode with COMPHY built as a module but the rest of the networking built-in, and u-boot having brought up the interface. in-band mode requires an additional patch to resolve another problem. Tested-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 23:35:38 +08:00
ctrl2 &= ~(MVPP2_GMAC_INBAND_AN_MASK | MVPP2_GMAC_PORT_RESET_MASK |
MVPP2_GMAC_PCS_ENABLE_MASK);
net: marvell: mvpp2: phylink compliance updates Sven Auhagen reported issues with negotiation on a couple of his platforms using a mixture of SFP and PHYs in various different modes. Debugging to root cause proved difficult, but essentially the problem comes down to the mvpp2 phylink implementation being slightly at odds with what is expected. phylink operates in three modes: phy, fixed-link, and in-band mode. In the first two modes, the expected behaviour from a MAC driver is that phylink resolves the operating mode and passes the mode to the MAC driver for it to program, including when the link should be brought up or taken down. This is basically the same as the libphy approach. This does not negate the requirement to advertise a correct control word for interface modes that have control words where that can be reasonably controlled. The second mode is in-band mode, where the MAC is expected to use the in-band control word to determine the operating mode. The mvneta driver implements the correct pattern required to support this: configure the port interface type separately from the in-band mode(s). This is now specified in the phylink documentation patches. mvpp2 was programming in-band mode for SGMII and the 802.3z modes no what, and avoided forcing the link up in fixed/phy modes. This caused a problem with some boards where the PHY is by default programmed to enter AN bypass mode, the PHY would report that the link was up, but the mvpp2 never completed the exchange of control word. Another issue that mvpp2 has is it sets SGMII AN format control word for both SGMII and 802.3z modes. The format of the control word is defined by MVPP2_GMAC_INBAND_AN_MASK, which should be set for SGMII and clear for 802.3z. Available Marvell documentation for earlier GMAC implementations does not make this clear, but this has been ascertained via extensive testing on earlier GMAC implementations, and then confirmed with a Macchiatobin Single Shot connected to a Clearfog: when MVPP2_GMAC_INBAND_AN_MASK is set, the clearfog does not receive the advertised pause mode settings. Lastly, there is no flow control in the in-band control word in Cisco SGMII, setting the flow control autonegotiation bit even with a PHY that has the Marvell extension to send this information does not result in the flow control being enabled at the MAC. We need to do this manually using the information provided via phylink. Re-code mvpp2's mac_config() and mac_link_up() to follow this pattern. This allows Sven Auhagen's board and Macchiatobin to reliably bring the link up with the 88e1512 PHY with phylink operating in PHY mode with COMPHY built as a module but the rest of the networking built-in, and u-boot having brought up the interface. in-band mode requires an additional patch to resolve another problem. Tested-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 23:35:38 +08:00
/* Configure port type */
if (phy_interface_mode_is_8023z(state->interface)) {
net: marvell: mvpp2: phylink compliance updates Sven Auhagen reported issues with negotiation on a couple of his platforms using a mixture of SFP and PHYs in various different modes. Debugging to root cause proved difficult, but essentially the problem comes down to the mvpp2 phylink implementation being slightly at odds with what is expected. phylink operates in three modes: phy, fixed-link, and in-band mode. In the first two modes, the expected behaviour from a MAC driver is that phylink resolves the operating mode and passes the mode to the MAC driver for it to program, including when the link should be brought up or taken down. This is basically the same as the libphy approach. This does not negate the requirement to advertise a correct control word for interface modes that have control words where that can be reasonably controlled. The second mode is in-band mode, where the MAC is expected to use the in-band control word to determine the operating mode. The mvneta driver implements the correct pattern required to support this: configure the port interface type separately from the in-band mode(s). This is now specified in the phylink documentation patches. mvpp2 was programming in-band mode for SGMII and the 802.3z modes no what, and avoided forcing the link up in fixed/phy modes. This caused a problem with some boards where the PHY is by default programmed to enter AN bypass mode, the PHY would report that the link was up, but the mvpp2 never completed the exchange of control word. Another issue that mvpp2 has is it sets SGMII AN format control word for both SGMII and 802.3z modes. The format of the control word is defined by MVPP2_GMAC_INBAND_AN_MASK, which should be set for SGMII and clear for 802.3z. Available Marvell documentation for earlier GMAC implementations does not make this clear, but this has been ascertained via extensive testing on earlier GMAC implementations, and then confirmed with a Macchiatobin Single Shot connected to a Clearfog: when MVPP2_GMAC_INBAND_AN_MASK is set, the clearfog does not receive the advertised pause mode settings. Lastly, there is no flow control in the in-band control word in Cisco SGMII, setting the flow control autonegotiation bit even with a PHY that has the Marvell extension to send this information does not result in the flow control being enabled at the MAC. We need to do this manually using the information provided via phylink. Re-code mvpp2's mac_config() and mac_link_up() to follow this pattern. This allows Sven Auhagen's board and Macchiatobin to reliably bring the link up with the 88e1512 PHY with phylink operating in PHY mode with COMPHY built as a module but the rest of the networking built-in, and u-boot having brought up the interface. in-band mode requires an additional patch to resolve another problem. Tested-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 23:35:38 +08:00
ctrl2 |= MVPP2_GMAC_PCS_ENABLE_MASK;
ctrl4 &= ~MVPP22_CTRL4_EXT_PIN_GMII_SEL;
ctrl4 |= MVPP22_CTRL4_SYNC_BYPASS_DIS |
MVPP22_CTRL4_DP_CLK_SEL |
MVPP22_CTRL4_QSGMII_BYPASS_ACTIVE;
} else if (state->interface == PHY_INTERFACE_MODE_SGMII) {
ctrl2 |= MVPP2_GMAC_PCS_ENABLE_MASK | MVPP2_GMAC_INBAND_AN_MASK;
ctrl4 &= ~MVPP22_CTRL4_EXT_PIN_GMII_SEL;
ctrl4 |= MVPP22_CTRL4_SYNC_BYPASS_DIS |
MVPP22_CTRL4_DP_CLK_SEL |
MVPP22_CTRL4_QSGMII_BYPASS_ACTIVE;
} else if (phy_interface_mode_is_rgmii(state->interface)) {
ctrl4 &= ~MVPP22_CTRL4_DP_CLK_SEL;
ctrl4 |= MVPP22_CTRL4_EXT_PIN_GMII_SEL |
MVPP22_CTRL4_SYNC_BYPASS_DIS |
MVPP22_CTRL4_QSGMII_BYPASS_ACTIVE;
}
net: marvell: mvpp2: phylink compliance updates Sven Auhagen reported issues with negotiation on a couple of his platforms using a mixture of SFP and PHYs in various different modes. Debugging to root cause proved difficult, but essentially the problem comes down to the mvpp2 phylink implementation being slightly at odds with what is expected. phylink operates in three modes: phy, fixed-link, and in-band mode. In the first two modes, the expected behaviour from a MAC driver is that phylink resolves the operating mode and passes the mode to the MAC driver for it to program, including when the link should be brought up or taken down. This is basically the same as the libphy approach. This does not negate the requirement to advertise a correct control word for interface modes that have control words where that can be reasonably controlled. The second mode is in-band mode, where the MAC is expected to use the in-band control word to determine the operating mode. The mvneta driver implements the correct pattern required to support this: configure the port interface type separately from the in-band mode(s). This is now specified in the phylink documentation patches. mvpp2 was programming in-band mode for SGMII and the 802.3z modes no what, and avoided forcing the link up in fixed/phy modes. This caused a problem with some boards where the PHY is by default programmed to enter AN bypass mode, the PHY would report that the link was up, but the mvpp2 never completed the exchange of control word. Another issue that mvpp2 has is it sets SGMII AN format control word for both SGMII and 802.3z modes. The format of the control word is defined by MVPP2_GMAC_INBAND_AN_MASK, which should be set for SGMII and clear for 802.3z. Available Marvell documentation for earlier GMAC implementations does not make this clear, but this has been ascertained via extensive testing on earlier GMAC implementations, and then confirmed with a Macchiatobin Single Shot connected to a Clearfog: when MVPP2_GMAC_INBAND_AN_MASK is set, the clearfog does not receive the advertised pause mode settings. Lastly, there is no flow control in the in-band control word in Cisco SGMII, setting the flow control autonegotiation bit even with a PHY that has the Marvell extension to send this information does not result in the flow control being enabled at the MAC. We need to do this manually using the information provided via phylink. Re-code mvpp2's mac_config() and mac_link_up() to follow this pattern. This allows Sven Auhagen's board and Macchiatobin to reliably bring the link up with the 88e1512 PHY with phylink operating in PHY mode with COMPHY built as a module but the rest of the networking built-in, and u-boot having brought up the interface. in-band mode requires an additional patch to resolve another problem. Tested-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 23:35:38 +08:00
/* Configure advertisement bits */
if (phylink_test(state->advertising, Pause))
an |= MVPP2_GMAC_FC_ADV_EN;
if (phylink_test(state->advertising, Asym_Pause))
an |= MVPP2_GMAC_FC_ADV_ASM_EN;
net: marvell: mvpp2: phylink compliance updates Sven Auhagen reported issues with negotiation on a couple of his platforms using a mixture of SFP and PHYs in various different modes. Debugging to root cause proved difficult, but essentially the problem comes down to the mvpp2 phylink implementation being slightly at odds with what is expected. phylink operates in three modes: phy, fixed-link, and in-band mode. In the first two modes, the expected behaviour from a MAC driver is that phylink resolves the operating mode and passes the mode to the MAC driver for it to program, including when the link should be brought up or taken down. This is basically the same as the libphy approach. This does not negate the requirement to advertise a correct control word for interface modes that have control words where that can be reasonably controlled. The second mode is in-band mode, where the MAC is expected to use the in-band control word to determine the operating mode. The mvneta driver implements the correct pattern required to support this: configure the port interface type separately from the in-band mode(s). This is now specified in the phylink documentation patches. mvpp2 was programming in-band mode for SGMII and the 802.3z modes no what, and avoided forcing the link up in fixed/phy modes. This caused a problem with some boards where the PHY is by default programmed to enter AN bypass mode, the PHY would report that the link was up, but the mvpp2 never completed the exchange of control word. Another issue that mvpp2 has is it sets SGMII AN format control word for both SGMII and 802.3z modes. The format of the control word is defined by MVPP2_GMAC_INBAND_AN_MASK, which should be set for SGMII and clear for 802.3z. Available Marvell documentation for earlier GMAC implementations does not make this clear, but this has been ascertained via extensive testing on earlier GMAC implementations, and then confirmed with a Macchiatobin Single Shot connected to a Clearfog: when MVPP2_GMAC_INBAND_AN_MASK is set, the clearfog does not receive the advertised pause mode settings. Lastly, there is no flow control in the in-band control word in Cisco SGMII, setting the flow control autonegotiation bit even with a PHY that has the Marvell extension to send this information does not result in the flow control being enabled at the MAC. We need to do this manually using the information provided via phylink. Re-code mvpp2's mac_config() and mac_link_up() to follow this pattern. This allows Sven Auhagen's board and Macchiatobin to reliably bring the link up with the 88e1512 PHY with phylink operating in PHY mode with COMPHY built as a module but the rest of the networking built-in, and u-boot having brought up the interface. in-band mode requires an additional patch to resolve another problem. Tested-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 23:35:38 +08:00
/* Configure negotiation style */
if (!phylink_autoneg_inband(mode)) {
/* Phy or fixed speed - no in-band AN, nothing to do, leave the
* configured speed, duplex and flow control as-is.
*/
net: marvell: mvpp2: phylink compliance updates Sven Auhagen reported issues with negotiation on a couple of his platforms using a mixture of SFP and PHYs in various different modes. Debugging to root cause proved difficult, but essentially the problem comes down to the mvpp2 phylink implementation being slightly at odds with what is expected. phylink operates in three modes: phy, fixed-link, and in-band mode. In the first two modes, the expected behaviour from a MAC driver is that phylink resolves the operating mode and passes the mode to the MAC driver for it to program, including when the link should be brought up or taken down. This is basically the same as the libphy approach. This does not negate the requirement to advertise a correct control word for interface modes that have control words where that can be reasonably controlled. The second mode is in-band mode, where the MAC is expected to use the in-band control word to determine the operating mode. The mvneta driver implements the correct pattern required to support this: configure the port interface type separately from the in-band mode(s). This is now specified in the phylink documentation patches. mvpp2 was programming in-band mode for SGMII and the 802.3z modes no what, and avoided forcing the link up in fixed/phy modes. This caused a problem with some boards where the PHY is by default programmed to enter AN bypass mode, the PHY would report that the link was up, but the mvpp2 never completed the exchange of control word. Another issue that mvpp2 has is it sets SGMII AN format control word for both SGMII and 802.3z modes. The format of the control word is defined by MVPP2_GMAC_INBAND_AN_MASK, which should be set for SGMII and clear for 802.3z. Available Marvell documentation for earlier GMAC implementations does not make this clear, but this has been ascertained via extensive testing on earlier GMAC implementations, and then confirmed with a Macchiatobin Single Shot connected to a Clearfog: when MVPP2_GMAC_INBAND_AN_MASK is set, the clearfog does not receive the advertised pause mode settings. Lastly, there is no flow control in the in-band control word in Cisco SGMII, setting the flow control autonegotiation bit even with a PHY that has the Marvell extension to send this information does not result in the flow control being enabled at the MAC. We need to do this manually using the information provided via phylink. Re-code mvpp2's mac_config() and mac_link_up() to follow this pattern. This allows Sven Auhagen's board and Macchiatobin to reliably bring the link up with the 88e1512 PHY with phylink operating in PHY mode with COMPHY built as a module but the rest of the networking built-in, and u-boot having brought up the interface. in-band mode requires an additional patch to resolve another problem. Tested-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 23:35:38 +08:00
} else if (state->interface == PHY_INTERFACE_MODE_SGMII) {
/* SGMII in-band mode receives the speed and duplex from
* the PHY. Flow control information is not received. */
an &= ~(MVPP2_GMAC_FORCE_LINK_DOWN |
MVPP2_GMAC_FORCE_LINK_PASS |
MVPP2_GMAC_CONFIG_MII_SPEED |
MVPP2_GMAC_CONFIG_GMII_SPEED |
MVPP2_GMAC_CONFIG_FULL_DUPLEX);
net: marvell: mvpp2: phylink compliance updates Sven Auhagen reported issues with negotiation on a couple of his platforms using a mixture of SFP and PHYs in various different modes. Debugging to root cause proved difficult, but essentially the problem comes down to the mvpp2 phylink implementation being slightly at odds with what is expected. phylink operates in three modes: phy, fixed-link, and in-band mode. In the first two modes, the expected behaviour from a MAC driver is that phylink resolves the operating mode and passes the mode to the MAC driver for it to program, including when the link should be brought up or taken down. This is basically the same as the libphy approach. This does not negate the requirement to advertise a correct control word for interface modes that have control words where that can be reasonably controlled. The second mode is in-band mode, where the MAC is expected to use the in-band control word to determine the operating mode. The mvneta driver implements the correct pattern required to support this: configure the port interface type separately from the in-band mode(s). This is now specified in the phylink documentation patches. mvpp2 was programming in-band mode for SGMII and the 802.3z modes no what, and avoided forcing the link up in fixed/phy modes. This caused a problem with some boards where the PHY is by default programmed to enter AN bypass mode, the PHY would report that the link was up, but the mvpp2 never completed the exchange of control word. Another issue that mvpp2 has is it sets SGMII AN format control word for both SGMII and 802.3z modes. The format of the control word is defined by MVPP2_GMAC_INBAND_AN_MASK, which should be set for SGMII and clear for 802.3z. Available Marvell documentation for earlier GMAC implementations does not make this clear, but this has been ascertained via extensive testing on earlier GMAC implementations, and then confirmed with a Macchiatobin Single Shot connected to a Clearfog: when MVPP2_GMAC_INBAND_AN_MASK is set, the clearfog does not receive the advertised pause mode settings. Lastly, there is no flow control in the in-band control word in Cisco SGMII, setting the flow control autonegotiation bit even with a PHY that has the Marvell extension to send this information does not result in the flow control being enabled at the MAC. We need to do this manually using the information provided via phylink. Re-code mvpp2's mac_config() and mac_link_up() to follow this pattern. This allows Sven Auhagen's board and Macchiatobin to reliably bring the link up with the 88e1512 PHY with phylink operating in PHY mode with COMPHY built as a module but the rest of the networking built-in, and u-boot having brought up the interface. in-band mode requires an additional patch to resolve another problem. Tested-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 23:35:38 +08:00
an |= MVPP2_GMAC_IN_BAND_AUTONEG |
MVPP2_GMAC_AN_SPEED_EN |
MVPP2_GMAC_AN_DUPLEX_EN;
} else if (phy_interface_mode_is_8023z(state->interface)) {
/* 1000BaseX and 2500BaseX ports cannot negotiate speed nor can
* they negotiate duplex: they are always operating with a fixed
* speed of 1000/2500Mbps in full duplex, so force 1000/2500
* speed and full duplex here.
*/
ctrl0 |= MVPP2_GMAC_PORT_TYPE_MASK;
an &= ~(MVPP2_GMAC_FORCE_LINK_DOWN |
MVPP2_GMAC_FORCE_LINK_PASS |
MVPP2_GMAC_CONFIG_MII_SPEED |
MVPP2_GMAC_CONFIG_GMII_SPEED |
MVPP2_GMAC_CONFIG_FULL_DUPLEX);
net: marvell: mvpp2: only reprogram what is necessary on mac_config mac_config() can be called at any point, and the expected behaviour from MAC drivers is to only reprogram when necessary - and certainly avoid taking the link down on every call. Unfortunately, mvpp2 does exactly that - it takes the link down, and reprograms everything, and then releases the forced-link down. This is bad, it can cause the link to bounce: - SFP detects signal, disables LOS indication. - SFP code calls into phylink, calling phylink_sfp_link_up() which triggers a resolve. - phylink_resolve() calls phylink_get_mac_state() and finds the MAC reporting link up. - phylink wants to configure the pause mode on the MAC, so calls phylink_mac_config() - mvpp2 takes the link down temporarily, generating a MAC link down event followed by another MAC link event. - phylink calls mac_link_up() and then processes the MAC link down event. - phylink_resolve() gets called again, registers the link down, and calls mach_link_down() before re-running itself. - phylink_resolve() starts again at step 3 above. This sequence repeats. GMAC versions prior to mvpp2 do not require the link to be taken down except when certain link properties (eg, switching between SGMII and 1000base-X mode, or enabling/disabling in-band negotiation) are changed. Implement this for mvpp2. Tested-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 23:35:49 +08:00
an |= MVPP2_GMAC_IN_BAND_AUTONEG |
MVPP2_GMAC_CONFIG_GMII_SPEED |
net: marvell: mvpp2: phylink compliance updates Sven Auhagen reported issues with negotiation on a couple of his platforms using a mixture of SFP and PHYs in various different modes. Debugging to root cause proved difficult, but essentially the problem comes down to the mvpp2 phylink implementation being slightly at odds with what is expected. phylink operates in three modes: phy, fixed-link, and in-band mode. In the first two modes, the expected behaviour from a MAC driver is that phylink resolves the operating mode and passes the mode to the MAC driver for it to program, including when the link should be brought up or taken down. This is basically the same as the libphy approach. This does not negate the requirement to advertise a correct control word for interface modes that have control words where that can be reasonably controlled. The second mode is in-band mode, where the MAC is expected to use the in-band control word to determine the operating mode. The mvneta driver implements the correct pattern required to support this: configure the port interface type separately from the in-band mode(s). This is now specified in the phylink documentation patches. mvpp2 was programming in-band mode for SGMII and the 802.3z modes no what, and avoided forcing the link up in fixed/phy modes. This caused a problem with some boards where the PHY is by default programmed to enter AN bypass mode, the PHY would report that the link was up, but the mvpp2 never completed the exchange of control word. Another issue that mvpp2 has is it sets SGMII AN format control word for both SGMII and 802.3z modes. The format of the control word is defined by MVPP2_GMAC_INBAND_AN_MASK, which should be set for SGMII and clear for 802.3z. Available Marvell documentation for earlier GMAC implementations does not make this clear, but this has been ascertained via extensive testing on earlier GMAC implementations, and then confirmed with a Macchiatobin Single Shot connected to a Clearfog: when MVPP2_GMAC_INBAND_AN_MASK is set, the clearfog does not receive the advertised pause mode settings. Lastly, there is no flow control in the in-band control word in Cisco SGMII, setting the flow control autonegotiation bit even with a PHY that has the Marvell extension to send this information does not result in the flow control being enabled at the MAC. We need to do this manually using the information provided via phylink. Re-code mvpp2's mac_config() and mac_link_up() to follow this pattern. This allows Sven Auhagen's board and Macchiatobin to reliably bring the link up with the 88e1512 PHY with phylink operating in PHY mode with COMPHY built as a module but the rest of the networking built-in, and u-boot having brought up the interface. in-band mode requires an additional patch to resolve another problem. Tested-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 23:35:38 +08:00
MVPP2_GMAC_CONFIG_FULL_DUPLEX;
if (state->pause & MLO_PAUSE_AN && state->an_enabled)
net: marvell: mvpp2: phylink compliance updates Sven Auhagen reported issues with negotiation on a couple of his platforms using a mixture of SFP and PHYs in various different modes. Debugging to root cause proved difficult, but essentially the problem comes down to the mvpp2 phylink implementation being slightly at odds with what is expected. phylink operates in three modes: phy, fixed-link, and in-band mode. In the first two modes, the expected behaviour from a MAC driver is that phylink resolves the operating mode and passes the mode to the MAC driver for it to program, including when the link should be brought up or taken down. This is basically the same as the libphy approach. This does not negate the requirement to advertise a correct control word for interface modes that have control words where that can be reasonably controlled. The second mode is in-band mode, where the MAC is expected to use the in-band control word to determine the operating mode. The mvneta driver implements the correct pattern required to support this: configure the port interface type separately from the in-band mode(s). This is now specified in the phylink documentation patches. mvpp2 was programming in-band mode for SGMII and the 802.3z modes no what, and avoided forcing the link up in fixed/phy modes. This caused a problem with some boards where the PHY is by default programmed to enter AN bypass mode, the PHY would report that the link was up, but the mvpp2 never completed the exchange of control word. Another issue that mvpp2 has is it sets SGMII AN format control word for both SGMII and 802.3z modes. The format of the control word is defined by MVPP2_GMAC_INBAND_AN_MASK, which should be set for SGMII and clear for 802.3z. Available Marvell documentation for earlier GMAC implementations does not make this clear, but this has been ascertained via extensive testing on earlier GMAC implementations, and then confirmed with a Macchiatobin Single Shot connected to a Clearfog: when MVPP2_GMAC_INBAND_AN_MASK is set, the clearfog does not receive the advertised pause mode settings. Lastly, there is no flow control in the in-band control word in Cisco SGMII, setting the flow control autonegotiation bit even with a PHY that has the Marvell extension to send this information does not result in the flow control being enabled at the MAC. We need to do this manually using the information provided via phylink. Re-code mvpp2's mac_config() and mac_link_up() to follow this pattern. This allows Sven Auhagen's board and Macchiatobin to reliably bring the link up with the 88e1512 PHY with phylink operating in PHY mode with COMPHY built as a module but the rest of the networking built-in, and u-boot having brought up the interface. in-band mode requires an additional patch to resolve another problem. Tested-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 23:35:38 +08:00
an |= MVPP2_GMAC_FLOW_CTRL_AUTONEG;
}
/* Some fields of the auto-negotiation register require the port to be down when
* their value is updated.
*/
#define MVPP2_GMAC_AN_PORT_DOWN_MASK \
(MVPP2_GMAC_IN_BAND_AUTONEG | \
MVPP2_GMAC_IN_BAND_AUTONEG_BYPASS | \
MVPP2_GMAC_CONFIG_MII_SPEED | MVPP2_GMAC_CONFIG_GMII_SPEED | \
MVPP2_GMAC_AN_SPEED_EN | MVPP2_GMAC_CONFIG_FULL_DUPLEX | \
MVPP2_GMAC_AN_DUPLEX_EN)
net: marvell: mvpp2: only reprogram what is necessary on mac_config mac_config() can be called at any point, and the expected behaviour from MAC drivers is to only reprogram when necessary - and certainly avoid taking the link down on every call. Unfortunately, mvpp2 does exactly that - it takes the link down, and reprograms everything, and then releases the forced-link down. This is bad, it can cause the link to bounce: - SFP detects signal, disables LOS indication. - SFP code calls into phylink, calling phylink_sfp_link_up() which triggers a resolve. - phylink_resolve() calls phylink_get_mac_state() and finds the MAC reporting link up. - phylink wants to configure the pause mode on the MAC, so calls phylink_mac_config() - mvpp2 takes the link down temporarily, generating a MAC link down event followed by another MAC link event. - phylink calls mac_link_up() and then processes the MAC link down event. - phylink_resolve() gets called again, registers the link down, and calls mach_link_down() before re-running itself. - phylink_resolve() starts again at step 3 above. This sequence repeats. GMAC versions prior to mvpp2 do not require the link to be taken down except when certain link properties (eg, switching between SGMII and 1000base-X mode, or enabling/disabling in-band negotiation) are changed. Implement this for mvpp2. Tested-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 23:35:49 +08:00
if ((old_ctrl0 ^ ctrl0) & MVPP2_GMAC_PORT_TYPE_MASK ||
(old_ctrl2 ^ ctrl2) & MVPP2_GMAC_INBAND_AN_MASK ||
(old_an ^ an) & MVPP2_GMAC_AN_PORT_DOWN_MASK) {
net: marvell: mvpp2: only reprogram what is necessary on mac_config mac_config() can be called at any point, and the expected behaviour from MAC drivers is to only reprogram when necessary - and certainly avoid taking the link down on every call. Unfortunately, mvpp2 does exactly that - it takes the link down, and reprograms everything, and then releases the forced-link down. This is bad, it can cause the link to bounce: - SFP detects signal, disables LOS indication. - SFP code calls into phylink, calling phylink_sfp_link_up() which triggers a resolve. - phylink_resolve() calls phylink_get_mac_state() and finds the MAC reporting link up. - phylink wants to configure the pause mode on the MAC, so calls phylink_mac_config() - mvpp2 takes the link down temporarily, generating a MAC link down event followed by another MAC link event. - phylink calls mac_link_up() and then processes the MAC link down event. - phylink_resolve() gets called again, registers the link down, and calls mach_link_down() before re-running itself. - phylink_resolve() starts again at step 3 above. This sequence repeats. GMAC versions prior to mvpp2 do not require the link to be taken down except when certain link properties (eg, switching between SGMII and 1000base-X mode, or enabling/disabling in-band negotiation) are changed. Implement this for mvpp2. Tested-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-08 23:35:49 +08:00
/* Force link down */
old_an &= ~MVPP2_GMAC_FORCE_LINK_PASS;
old_an |= MVPP2_GMAC_FORCE_LINK_DOWN;
writel(old_an, port->base + MVPP2_GMAC_AUTONEG_CONFIG);
/* Set the GMAC in a reset state - do this in a way that
* ensures we clear it below.
*/
old_ctrl2 |= MVPP2_GMAC_PORT_RESET_MASK;
writel(old_ctrl2, port->base + MVPP2_GMAC_CTRL_2_REG);
}
if (old_ctrl0 != ctrl0)
writel(ctrl0, port->base + MVPP2_GMAC_CTRL_0_REG);
if (old_ctrl2 != ctrl2)
writel(ctrl2, port->base + MVPP2_GMAC_CTRL_2_REG);
if (old_ctrl4 != ctrl4)
writel(ctrl4, port->base + MVPP22_GMAC_CTRL_4_REG);
if (old_an != an)
writel(an, port->base + MVPP2_GMAC_AUTONEG_CONFIG);
if (old_ctrl2 & MVPP2_GMAC_PORT_RESET_MASK) {
while (readl(port->base + MVPP2_GMAC_CTRL_2_REG) &
MVPP2_GMAC_PORT_RESET_MASK)
continue;
}
}
net: phylink: Add struct phylink_config to PHYLINK API The phylink_config structure will encapsulate a pointer to a struct device and the operation type requested for this instance of PHYLINK. This patch does not make any functional changes, it just transitions the PHYLINK internals and all its users to the new API. A pointer to a phylink_config structure will be passed to phylink_create() instead of the net_device directly. Also, the same phylink_config pointer will be passed back to all phylink_mac_ops callbacks instead of the net_device. Using this mechanism, a PHYLINK user can get the original net_device using a structure such as 'to_net_dev(config->dev)' or directly the structure containing the phylink_config using a container_of call. At the moment, only the PHYLINK_NETDEV is defined as a valid operation type for PHYLINK. In this mode, a valid reference to a struct device linked to the original net_device should be passed to PHYLINK through the phylink_config structure. This API changes is mainly driven by the necessity of adding a new operation type in PHYLINK that disconnects the phy_device from the net_device and also works when the net_device is lacking. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29 01:38:12 +08:00
static void mvpp2_mac_config(struct phylink_config *config, unsigned int mode,
const struct phylink_link_state *state)
{
struct mvpp2_port *port = mvpp2_phylink_to_port(config);
bool change_interface = port->phy_interface != state->interface;
/* Check for invalid configuration */
if (mvpp2_is_xlg(state->interface) && port->gop_id != 0) {
netdev_err(port->dev, "Invalid mode on %s\n", port->dev->name);
return;
}
/* Make sure the port is disabled when reconfiguring the mode */
mvpp2_port_disable(port);
if (port->priv->hw_version == MVPP22 && change_interface) {
mvpp22_gop_mask_irq(port);
port->phy_interface = state->interface;
/* Reconfigure the serdes lanes */
phy_power_off(port->comphy);
mvpp22_mode_reconfigure(port);
}
/* mac (re)configuration */
if (mvpp2_is_xlg(state->interface))
mvpp2_xlg_config(port, mode, state);
else if (phy_interface_mode_is_rgmii(state->interface) ||
phy_interface_mode_is_8023z(state->interface) ||
state->interface == PHY_INTERFACE_MODE_SGMII)
mvpp2_gmac_config(port, mode, state);
if (port->priv->hw_version == MVPP21 && port->flags & MVPP2_F_LOOPBACK)
mvpp2_port_loopback_set(port, state);
if (port->priv->hw_version == MVPP22 && change_interface)
mvpp22_gop_unmask_irq(port);
mvpp2_port_enable(port);
}
static void mvpp2_mac_link_up(struct phylink_config *config,
struct phy_device *phy,
unsigned int mode, phy_interface_t interface,
int speed, int duplex,
bool tx_pause, bool rx_pause)
{
struct mvpp2_port *port = mvpp2_phylink_to_port(config);
u32 val;
if (mvpp2_is_xlg(interface)) {
if (!phylink_autoneg_inband(mode)) {
val = MVPP22_XLG_CTRL0_FORCE_LINK_PASS;
if (tx_pause)
val |= MVPP22_XLG_CTRL0_TX_FLOW_CTRL_EN;
if (rx_pause)
val |= MVPP22_XLG_CTRL0_RX_FLOW_CTRL_EN;
mvpp2_modify(port->base + MVPP22_XLG_CTRL0_REG,
MVPP22_XLG_CTRL0_FORCE_LINK_DOWN |
MVPP22_XLG_CTRL0_FORCE_LINK_PASS |
MVPP22_XLG_CTRL0_TX_FLOW_CTRL_EN |
MVPP22_XLG_CTRL0_RX_FLOW_CTRL_EN, val);
}
} else {
if (!phylink_autoneg_inband(mode)) {
val = MVPP2_GMAC_FORCE_LINK_PASS;
if (speed == SPEED_1000 || speed == SPEED_2500)
val |= MVPP2_GMAC_CONFIG_GMII_SPEED;
else if (speed == SPEED_100)
val |= MVPP2_GMAC_CONFIG_MII_SPEED;
if (duplex == DUPLEX_FULL)
val |= MVPP2_GMAC_CONFIG_FULL_DUPLEX;
mvpp2_modify(port->base + MVPP2_GMAC_AUTONEG_CONFIG,
MVPP2_GMAC_FORCE_LINK_DOWN |
MVPP2_GMAC_FORCE_LINK_PASS |
MVPP2_GMAC_CONFIG_MII_SPEED |
MVPP2_GMAC_CONFIG_GMII_SPEED |
MVPP2_GMAC_CONFIG_FULL_DUPLEX, val);
}
/* We can always update the flow control enable bits;
* these will only be effective if flow control AN
* (MVPP2_GMAC_FLOW_CTRL_AUTONEG) is disabled.
*/
val = 0;
if (tx_pause)
val |= MVPP22_CTRL4_TX_FC_EN;
if (rx_pause)
val |= MVPP22_CTRL4_RX_FC_EN;
mvpp2_modify(port->base + MVPP22_GMAC_CTRL_4_REG,
MVPP22_CTRL4_RX_FC_EN | MVPP22_CTRL4_TX_FC_EN,
val);
}
mvpp2_port_enable(port);
mvpp2_egress_enable(port);
mvpp2_ingress_enable(port);
netif_tx_wake_all_queues(port->dev);
}
net: phylink: Add struct phylink_config to PHYLINK API The phylink_config structure will encapsulate a pointer to a struct device and the operation type requested for this instance of PHYLINK. This patch does not make any functional changes, it just transitions the PHYLINK internals and all its users to the new API. A pointer to a phylink_config structure will be passed to phylink_create() instead of the net_device directly. Also, the same phylink_config pointer will be passed back to all phylink_mac_ops callbacks instead of the net_device. Using this mechanism, a PHYLINK user can get the original net_device using a structure such as 'to_net_dev(config->dev)' or directly the structure containing the phylink_config using a container_of call. At the moment, only the PHYLINK_NETDEV is defined as a valid operation type for PHYLINK. In this mode, a valid reference to a struct device linked to the original net_device should be passed to PHYLINK through the phylink_config structure. This API changes is mainly driven by the necessity of adding a new operation type in PHYLINK that disconnects the phy_device from the net_device and also works when the net_device is lacking. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29 01:38:12 +08:00
static void mvpp2_mac_link_down(struct phylink_config *config,
unsigned int mode, phy_interface_t interface)
{
struct mvpp2_port *port = mvpp2_phylink_to_port(config);
u32 val;
if (!phylink_autoneg_inband(mode)) {
if (mvpp2_is_xlg(interface)) {
val = readl(port->base + MVPP22_XLG_CTRL0_REG);
val &= ~MVPP22_XLG_CTRL0_FORCE_LINK_PASS;
val |= MVPP22_XLG_CTRL0_FORCE_LINK_DOWN;
writel(val, port->base + MVPP22_XLG_CTRL0_REG);
} else {
val = readl(port->base + MVPP2_GMAC_AUTONEG_CONFIG);
val &= ~MVPP2_GMAC_FORCE_LINK_PASS;
val |= MVPP2_GMAC_FORCE_LINK_DOWN;
writel(val, port->base + MVPP2_GMAC_AUTONEG_CONFIG);
}
}
netif_tx_stop_all_queues(port->dev);
mvpp2_egress_disable(port);
mvpp2_ingress_disable(port);
mvpp2_port_disable(port);
}
static const struct phylink_mac_ops mvpp2_phylink_ops = {
.validate = mvpp2_phylink_validate,
.mac_pcs_get_state = mvpp2_phylink_mac_pcs_get_state,
.mac_an_restart = mvpp2_mac_an_restart,
.mac_config = mvpp2_mac_config,
.mac_link_up = mvpp2_mac_link_up,
.mac_link_down = mvpp2_mac_link_down,
};
/* Ports initialization */
static int mvpp2_port_probe(struct platform_device *pdev,
struct fwnode_handle *port_fwnode,
struct mvpp2 *priv)
{
net: mvpp2: enable ACPI support in the driver This patch introduces an alternative way of obtaining resources - via ACPI tables provided by firmware. Enabling coexistence with the DT support, in addition to the OF_*->device_*/fwnode_* API replacement, required following steps to be taken: * Add mvpp2_acpi_match table * Omit clock configuration and obtain tclk from the property - in ACPI world, the firmware is responsible for clock maintenance. * Disable comphy and syscon handling as they are not available for ACPI. * Modify way of obtaining interrupts - use newly introduced fwnode_irq_get() routine * Until proper MDIO bus and PHY handling with ACPI is established in the kernel, use only link interrupts feature in the driver. For the RGMII port it results in depending on GMAC settings done during firmware stage. * When booting with ACPI MVPP2_QDIST_MULTI_MODE is picked by default, as there is no need to keep any kind of the backward compatibility. Moreover, a memory region used by mvmdio driver is usually placed in the middle of the address space of the PP2 network controller. The MDIO base address is obtained without requesting memory region (by devm_ioremap() call) in mvmdio.c, later overlapping resources are requested by the network driver, which is responsible for avoiding a concurrent access. In case the MDIO memory region is declared in the ACPI, it can already appear as 'in-use' in the OS. Because it is overlapped by second region of the network controller, make sure it is released, before requesting it again. The care is taken by mvpp2 driver to avoid concurrent access to this memory region. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 20:31:44 +08:00
struct phy *comphy = NULL;
struct mvpp2_port *port;
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
struct mvpp2_port_pcpu *port_pcpu;
struct device_node *port_node = to_of_node(port_fwnode);
netdev_features_t features;
struct net_device *dev;
struct phylink *phylink;
char *mac_from = "";
unsigned int ntxqs, nrxqs, thread;
unsigned long flags = 0;
bool has_tx_irqs;
u32 id;
int phy_mode;
int err, i;
has_tx_irqs = mvpp2_port_has_irqs(priv, port_node, &flags);
if (!has_tx_irqs && queue_mode == MVPP2_QDIST_MULTI_MODE) {
dev_err(&pdev->dev,
"not enough IRQs to support multi queue mode\n");
return -EINVAL;
net: mvpp2: enable ACPI support in the driver This patch introduces an alternative way of obtaining resources - via ACPI tables provided by firmware. Enabling coexistence with the DT support, in addition to the OF_*->device_*/fwnode_* API replacement, required following steps to be taken: * Add mvpp2_acpi_match table * Omit clock configuration and obtain tclk from the property - in ACPI world, the firmware is responsible for clock maintenance. * Disable comphy and syscon handling as they are not available for ACPI. * Modify way of obtaining interrupts - use newly introduced fwnode_irq_get() routine * Until proper MDIO bus and PHY handling with ACPI is established in the kernel, use only link interrupts feature in the driver. For the RGMII port it results in depending on GMAC settings done during firmware stage. * When booting with ACPI MVPP2_QDIST_MULTI_MODE is picked by default, as there is no need to keep any kind of the backward compatibility. Moreover, a memory region used by mvmdio driver is usually placed in the middle of the address space of the PP2 network controller. The MDIO base address is obtained without requesting memory region (by devm_ioremap() call) in mvmdio.c, later overlapping resources are requested by the network driver, which is responsible for avoiding a concurrent access. In case the MDIO memory region is declared in the ACPI, it can already appear as 'in-use' in the OS. Because it is overlapped by second region of the network controller, make sure it is released, before requesting it again. The care is taken by mvpp2 driver to avoid concurrent access to this memory region. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 20:31:44 +08:00
}
ntxqs = MVPP2_MAX_TXQ;
nrxqs = mvpp2_get_nrxqs(priv);
dev = alloc_etherdev_mqs(sizeof(*port), ntxqs, nrxqs);
if (!dev)
return -ENOMEM;
phy_mode = fwnode_get_phy_mode(port_fwnode);
if (phy_mode < 0) {
dev_err(&pdev->dev, "incorrect phy mode\n");
err = phy_mode;
goto err_free_netdev;
}
/*
* Rewrite 10GBASE-KR to 10GBASE-R for compatibility with existing DT.
* Existing usage of 10GBASE-KR is not correct; no backplane
* negotiation is done, and this driver does not actually support
* 10GBASE-KR.
*/
if (phy_mode == PHY_INTERFACE_MODE_10GKR)
phy_mode = PHY_INTERFACE_MODE_10GBASER;
net: mvpp2: enable ACPI support in the driver This patch introduces an alternative way of obtaining resources - via ACPI tables provided by firmware. Enabling coexistence with the DT support, in addition to the OF_*->device_*/fwnode_* API replacement, required following steps to be taken: * Add mvpp2_acpi_match table * Omit clock configuration and obtain tclk from the property - in ACPI world, the firmware is responsible for clock maintenance. * Disable comphy and syscon handling as they are not available for ACPI. * Modify way of obtaining interrupts - use newly introduced fwnode_irq_get() routine * Until proper MDIO bus and PHY handling with ACPI is established in the kernel, use only link interrupts feature in the driver. For the RGMII port it results in depending on GMAC settings done during firmware stage. * When booting with ACPI MVPP2_QDIST_MULTI_MODE is picked by default, as there is no need to keep any kind of the backward compatibility. Moreover, a memory region used by mvmdio driver is usually placed in the middle of the address space of the PP2 network controller. The MDIO base address is obtained without requesting memory region (by devm_ioremap() call) in mvmdio.c, later overlapping resources are requested by the network driver, which is responsible for avoiding a concurrent access. In case the MDIO memory region is declared in the ACPI, it can already appear as 'in-use' in the OS. Because it is overlapped by second region of the network controller, make sure it is released, before requesting it again. The care is taken by mvpp2 driver to avoid concurrent access to this memory region. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 20:31:44 +08:00
if (port_node) {
comphy = devm_of_phy_get(&pdev->dev, port_node, NULL);
if (IS_ERR(comphy)) {
if (PTR_ERR(comphy) == -EPROBE_DEFER) {
err = -EPROBE_DEFER;
goto err_free_netdev;
}
comphy = NULL;
}
}
if (fwnode_property_read_u32(port_fwnode, "port-id", &id)) {
err = -EINVAL;
dev_err(&pdev->dev, "missing port-id value\n");
goto err_free_netdev;
}
dev->tx_queue_len = MVPP2_MAX_TXD_MAX;
dev->watchdog_timeo = 5 * HZ;
dev->netdev_ops = &mvpp2_netdev_ops;
dev->ethtool_ops = &mvpp2_eth_tool_ops;
port = netdev_priv(dev);
port->dev = dev;
net: mvpp2: enable ACPI support in the driver This patch introduces an alternative way of obtaining resources - via ACPI tables provided by firmware. Enabling coexistence with the DT support, in addition to the OF_*->device_*/fwnode_* API replacement, required following steps to be taken: * Add mvpp2_acpi_match table * Omit clock configuration and obtain tclk from the property - in ACPI world, the firmware is responsible for clock maintenance. * Disable comphy and syscon handling as they are not available for ACPI. * Modify way of obtaining interrupts - use newly introduced fwnode_irq_get() routine * Until proper MDIO bus and PHY handling with ACPI is established in the kernel, use only link interrupts feature in the driver. For the RGMII port it results in depending on GMAC settings done during firmware stage. * When booting with ACPI MVPP2_QDIST_MULTI_MODE is picked by default, as there is no need to keep any kind of the backward compatibility. Moreover, a memory region used by mvmdio driver is usually placed in the middle of the address space of the PP2 network controller. The MDIO base address is obtained without requesting memory region (by devm_ioremap() call) in mvmdio.c, later overlapping resources are requested by the network driver, which is responsible for avoiding a concurrent access. In case the MDIO memory region is declared in the ACPI, it can already appear as 'in-use' in the OS. Because it is overlapped by second region of the network controller, make sure it is released, before requesting it again. The care is taken by mvpp2 driver to avoid concurrent access to this memory region. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 20:31:44 +08:00
port->fwnode = port_fwnode;
port->has_phy = !!of_find_property(port_node, "phy", NULL);
port->ntxqs = ntxqs;
port->nrxqs = nrxqs;
port->priv = priv;
port->has_tx_irqs = has_tx_irqs;
port->flags = flags;
err = mvpp2_queue_vectors_init(port, port_node);
if (err)
goto err_free_netdev;
net: mvpp2: enable ACPI support in the driver This patch introduces an alternative way of obtaining resources - via ACPI tables provided by firmware. Enabling coexistence with the DT support, in addition to the OF_*->device_*/fwnode_* API replacement, required following steps to be taken: * Add mvpp2_acpi_match table * Omit clock configuration and obtain tclk from the property - in ACPI world, the firmware is responsible for clock maintenance. * Disable comphy and syscon handling as they are not available for ACPI. * Modify way of obtaining interrupts - use newly introduced fwnode_irq_get() routine * Until proper MDIO bus and PHY handling with ACPI is established in the kernel, use only link interrupts feature in the driver. For the RGMII port it results in depending on GMAC settings done during firmware stage. * When booting with ACPI MVPP2_QDIST_MULTI_MODE is picked by default, as there is no need to keep any kind of the backward compatibility. Moreover, a memory region used by mvmdio driver is usually placed in the middle of the address space of the PP2 network controller. The MDIO base address is obtained without requesting memory region (by devm_ioremap() call) in mvmdio.c, later overlapping resources are requested by the network driver, which is responsible for avoiding a concurrent access. In case the MDIO memory region is declared in the ACPI, it can already appear as 'in-use' in the OS. Because it is overlapped by second region of the network controller, make sure it is released, before requesting it again. The care is taken by mvpp2 driver to avoid concurrent access to this memory region. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 20:31:44 +08:00
if (port_node)
port->link_irq = of_irq_get_byname(port_node, "link");
else
port->link_irq = fwnode_irq_get(port_fwnode, port->nqvecs + 1);
if (port->link_irq == -EPROBE_DEFER) {
err = -EPROBE_DEFER;
goto err_deinit_qvecs;
}
if (port->link_irq <= 0)
/* the link irq is optional */
port->link_irq = 0;
if (fwnode_property_read_bool(port_fwnode, "marvell,loopback"))
port->flags |= MVPP2_F_LOOPBACK;
port->id = id;
if (priv->hw_version == MVPP21)
port->first_rxq = port->id * port->nrxqs;
else
port->first_rxq = port->id * priv->max_port_rxqs;
port->of_node = port_node;
port->phy_interface = phy_mode;
port->comphy = comphy;
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
if (priv->hw_version == MVPP21) {
port->base = devm_platform_ioremap_resource(pdev, 2 + id);
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
if (IS_ERR(port->base)) {
err = PTR_ERR(port->base);
goto err_free_irq;
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
}
port->stats_base = port->priv->lms_base +
MVPP21_MIB_COUNTERS_OFFSET +
port->gop_id * MVPP21_MIB_COUNTERS_PORT_SZ;
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
} else {
if (fwnode_property_read_u32(port_fwnode, "gop-port-id",
&port->gop_id)) {
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
err = -EINVAL;
dev_err(&pdev->dev, "missing gop-port-id value\n");
goto err_deinit_qvecs;
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
}
port->base = priv->iface_base + MVPP22_GMAC_BASE(port->gop_id);
port->stats_base = port->priv->iface_base +
MVPP22_MIB_COUNTERS_OFFSET +
port->gop_id * MVPP22_MIB_COUNTERS_PORT_SZ;
}
/* Alloc per-cpu and ethtool stats */
port->stats = netdev_alloc_pcpu_stats(struct mvpp2_pcpu_stats);
if (!port->stats) {
err = -ENOMEM;
goto err_free_irq;
}
port->ethtool_stats = devm_kcalloc(&pdev->dev,
MVPP2_N_ETHTOOL_STATS(ntxqs, nrxqs),
sizeof(u64), GFP_KERNEL);
if (!port->ethtool_stats) {
err = -ENOMEM;
goto err_free_stats;
}
mutex_init(&port->gather_stats_lock);
INIT_DELAYED_WORK(&port->stats_work, mvpp2_gather_hw_statistics);
mvpp2_port_copy_mac_addr(dev, priv, port_fwnode, &mac_from);
port->tx_ring_size = MVPP2_MAX_TXD_DFLT;
port->rx_ring_size = MVPP2_MAX_RXD_DFLT;
SET_NETDEV_DEV(dev, &pdev->dev);
err = mvpp2_port_init(port);
if (err < 0) {
dev_err(&pdev->dev, "failed to init port %d\n", id);
goto err_free_stats;
}
mvpp2_port_periodic_xon_disable(port);
mvpp2_mac_reset_assert(port);
mvpp22_pcs_reset_assert(port);
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
port->pcpu = alloc_percpu(struct mvpp2_port_pcpu);
if (!port->pcpu) {
err = -ENOMEM;
goto err_free_txq_pcpu;
}
if (!port->has_tx_irqs) {
for (thread = 0; thread < priv->nthreads; thread++) {
port_pcpu = per_cpu_ptr(port->pcpu, thread);
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
hrtimer_init(&port_pcpu->tx_done_timer, CLOCK_MONOTONIC,
HRTIMER_MODE_REL_PINNED_SOFT);
port_pcpu->tx_done_timer.function = mvpp2_hr_timer_cb;
port_pcpu->timer_scheduled = false;
port_pcpu->dev = dev;
}
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
}
features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
NETIF_F_TSO;
dev->features = features | NETIF_F_RXCSUM;
dev->hw_features |= features | NETIF_F_RXCSUM | NETIF_F_GRO |
NETIF_F_HW_VLAN_CTAG_FILTER;
if (mvpp22_rss_is_supported()) {
net: mvpp2: add an RSS classification step for each flow One of the classification action that can be performed is to compute a hash of the packet header based on some header fields, and lookup a RSS table based on this hash to determine the final RxQ. This is done by adding one lookup entry per flow per port, so that we can configure the hash generation parameters for each flow and each port. There are 2 possible engines that can be used for RSS hash generation : - C3HA, that generates a hash based on up to 4 header-extracted fields - C3HB, that does the same as c3HA, but also includes L4 info in the hash There are a lot of fields that can be extracted from the header. For now, we only use the ones that we can configure using ethtool : - DST MAC address - L3 info - Source IP - Destination IP - Source port - Destination port The C3HB engine is selected when we use L4 fields (src/dst port). Header parser Dec table Ingress pkt +-------------+ flow id +----------------------------+ ------------->| TCAM + SRAM |-------->|TCP IPv4 w/ VLAN, not frag | +-------------+ |TCP IPv4 w/o VLAN, not frag | |TCP IPv4 w/ VLAN, frag |--+ |etc. | | +----------------------------+ | | Flow table | +---------+ +------------+ +--------------------------+ | | RSS tbl |<--| Classifier |<--------| flow 0: C2 lookup | | +---------+ +------------+ | C3 lookup port 0 | | | | | C3 lookup port 1 | | +-----------+ +-------------+ | ... | | | C2 engine | | C3H engines | | flow 1: C2 lookup |<--+ +-----------+ +-------------+ | C3 lookup port 0 | | ... | | ... | | flow 51 : C2 lookup | | ... | +--------------------------+ The C2 engine also gains the role of enabling and disabling the RSS table lookup for this packet. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 19:54:26 +08:00
dev->hw_features |= NETIF_F_RXHASH;
dev->features |= NETIF_F_NTUPLE;
}
net: mvpp2: add an RSS classification step for each flow One of the classification action that can be performed is to compute a hash of the packet header based on some header fields, and lookup a RSS table based on this hash to determine the final RxQ. This is done by adding one lookup entry per flow per port, so that we can configure the hash generation parameters for each flow and each port. There are 2 possible engines that can be used for RSS hash generation : - C3HA, that generates a hash based on up to 4 header-extracted fields - C3HB, that does the same as c3HA, but also includes L4 info in the hash There are a lot of fields that can be extracted from the header. For now, we only use the ones that we can configure using ethtool : - DST MAC address - L3 info - Source IP - Destination IP - Source port - Destination port The C3HB engine is selected when we use L4 fields (src/dst port). Header parser Dec table Ingress pkt +-------------+ flow id +----------------------------+ ------------->| TCAM + SRAM |-------->|TCP IPv4 w/ VLAN, not frag | +-------------+ |TCP IPv4 w/o VLAN, not frag | |TCP IPv4 w/ VLAN, frag |--+ |etc. | | +----------------------------+ | | Flow table | +---------+ +------------+ +--------------------------+ | | RSS tbl |<--| Classifier |<--------| flow 0: C2 lookup | | +---------+ +------------+ | C3 lookup port 0 | | | | | C3 lookup port 1 | | +-----------+ +-------------+ | ... | | | C2 engine | | C3H engines | | flow 1: C2 lookup |<--+ +-----------+ +-------------+ | C3 lookup port 0 | | ... | | ... | | flow 51 : C2 lookup | | ... | +--------------------------+ The C2 engine also gains the role of enabling and disabling the RSS table lookup for this packet. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-12 19:54:26 +08:00
if (!port->priv->percpu_pools)
mvpp2_set_hw_csum(port, port->pool_long->id);
dev->vlan_features |= features;
dev->gso_max_segs = MVPP2_MAX_TSO_SEGS;
dev->priv_flags |= IFF_UNICAST_FLT;
/* MTU range: 68 - 9704 */
dev->min_mtu = ETH_MIN_MTU;
/* 9704 == 9728 - 20 and rounding to 8 */
dev->max_mtu = MVPP2_BM_JUMBO_PKT_SIZE;
dev->dev.of_node = port_node;
/* Phylink isn't used w/ ACPI as of now */
if (port_node) {
net: phylink: Add struct phylink_config to PHYLINK API The phylink_config structure will encapsulate a pointer to a struct device and the operation type requested for this instance of PHYLINK. This patch does not make any functional changes, it just transitions the PHYLINK internals and all its users to the new API. A pointer to a phylink_config structure will be passed to phylink_create() instead of the net_device directly. Also, the same phylink_config pointer will be passed back to all phylink_mac_ops callbacks instead of the net_device. Using this mechanism, a PHYLINK user can get the original net_device using a structure such as 'to_net_dev(config->dev)' or directly the structure containing the phylink_config using a container_of call. At the moment, only the PHYLINK_NETDEV is defined as a valid operation type for PHYLINK. In this mode, a valid reference to a struct device linked to the original net_device should be passed to PHYLINK through the phylink_config structure. This API changes is mainly driven by the necessity of adding a new operation type in PHYLINK that disconnects the phy_device from the net_device and also works when the net_device is lacking. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-29 01:38:12 +08:00
port->phylink_config.dev = &dev->dev;
port->phylink_config.type = PHYLINK_NETDEV;
phylink = phylink_create(&port->phylink_config, port_fwnode,
phy_mode, &mvpp2_phylink_ops);
if (IS_ERR(phylink)) {
err = PTR_ERR(phylink);
goto err_free_port_pcpu;
}
port->phylink = phylink;
} else {
port->phylink = NULL;
}
/* Cycle the comphy to power it down, saving 270mW per port -
* don't worry about an error powering it up. When the comphy
* driver does this, we can remove this code.
*/
if (port->comphy) {
err = mvpp22_comphy_init(port);
if (err == 0)
phy_power_off(port->comphy);
}
err = register_netdev(dev);
if (err < 0) {
dev_err(&pdev->dev, "failed to register netdev\n");
goto err_phylink;
}
netdev_info(dev, "Using %s mac address %pM\n", mac_from, dev->dev_addr);
priv->port_list[priv->port_count++] = port;
return 0;
err_phylink:
if (port->phylink)
phylink_destroy(port->phylink);
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
err_free_port_pcpu:
free_percpu(port->pcpu);
err_free_txq_pcpu:
for (i = 0; i < port->ntxqs; i++)
free_percpu(port->txqs[i]->pcpu);
err_free_stats:
free_percpu(port->stats);
err_free_irq:
if (port->link_irq)
irq_dispose_mapping(port->link_irq);
err_deinit_qvecs:
mvpp2_queue_vectors_deinit(port);
err_free_netdev:
free_netdev(dev);
return err;
}
/* Ports removal routine */
static void mvpp2_port_remove(struct mvpp2_port *port)
{
int i;
unregister_netdev(port->dev);
if (port->phylink)
phylink_destroy(port->phylink);
net: mvpp2: replace TX coalescing interrupts with hrtimer The PP2 controller is capable of per-CPU TX processing, which means there are per-CPU banked register sets and queues. Current version of the driver supports TX packet coalescing - once on given CPU sent packets amount reaches a threshold value, an IRQ occurs. However, there is a single interrupt line responsible for CPU0/1 TX and RX events (the latter is not per-CPU, the hardware does not support RSS). When the top-half executes the interrupt cause is not known. This is why in NAPI poll function, along with RX processing, IRQ cause register on both CPU's is accessed in order to determine on which of them the TX coalescing threshold might have been reached. Thus the egress processing and releasing the buffers is able to take place on the corresponding CPU. Hitherto approach lead to an illegal usage of on_each_cpu function in softirq context. The problem is solved by resigning from TX coalescing interrupts and separating egress finalization from NAPI processing. For that purpose a method of using hrtimer is introduced. In main transmit function (mvpp2_tx) buffers are released once a software coalescing threshold is reached. In case not all the data is processed a timer is set on this CPU - in its interrupt context a tasklet is scheduled in which all queues are processed. At once only one timer per-CPU can be running, which is controlled by a dedicated flag. This commit removes TX processing from NAPI polling function, disables hardware coalescing and enables hrtimer with tasklet, using new per-CPU port structure (mvpp2_port_pcpu). Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-07 01:00:30 +08:00
free_percpu(port->pcpu);
free_percpu(port->stats);
for (i = 0; i < port->ntxqs; i++)
free_percpu(port->txqs[i]->pcpu);
mvpp2_queue_vectors_deinit(port);
if (port->link_irq)
irq_dispose_mapping(port->link_irq);
free_netdev(port->dev);
}
/* Initialize decoding windows */
static void mvpp2_conf_mbus_windows(const struct mbus_dram_target_info *dram,
struct mvpp2 *priv)
{
u32 win_enable;
int i;
for (i = 0; i < 6; i++) {
mvpp2_write(priv, MVPP2_WIN_BASE(i), 0);
mvpp2_write(priv, MVPP2_WIN_SIZE(i), 0);
if (i < 4)
mvpp2_write(priv, MVPP2_WIN_REMAP(i), 0);
}
win_enable = 0;
for (i = 0; i < dram->num_cs; i++) {
const struct mbus_dram_window *cs = dram->cs + i;
mvpp2_write(priv, MVPP2_WIN_BASE(i),
(cs->base & 0xffff0000) | (cs->mbus_attr << 8) |
dram->mbus_dram_target_id);
mvpp2_write(priv, MVPP2_WIN_SIZE(i),
(cs->size - 1) & 0xffff0000);
win_enable |= (1 << i);
}
mvpp2_write(priv, MVPP2_BASE_ADDR_ENABLE, win_enable);
}
/* Initialize Rx FIFO's */
static void mvpp2_rx_fifo_init(struct mvpp2 *priv)
{
int port;
for (port = 0; port < MVPP2_MAX_PORTS; port++) {
mvpp2_write(priv, MVPP2_RX_DATA_FIFO_SIZE_REG(port),
MVPP2_RX_FIFO_PORT_DATA_SIZE_4KB);
mvpp2_write(priv, MVPP2_RX_ATTR_FIFO_SIZE_REG(port),
MVPP2_RX_FIFO_PORT_ATTR_SIZE_4KB);
}
mvpp2_write(priv, MVPP2_RX_MIN_PKT_SIZE_REG,
MVPP2_RX_FIFO_PORT_MIN_PKT);
mvpp2_write(priv, MVPP2_RX_FIFO_INIT_REG, 0x1);
}
static void mvpp22_rx_fifo_init(struct mvpp2 *priv)
{
int port;
/* The FIFO size parameters are set depending on the maximum speed a
* given port can handle:
* - Port 0: 10Gbps
* - Port 1: 2.5Gbps
* - Ports 2 and 3: 1Gbps
*/
mvpp2_write(priv, MVPP2_RX_DATA_FIFO_SIZE_REG(0),
MVPP2_RX_FIFO_PORT_DATA_SIZE_32KB);
mvpp2_write(priv, MVPP2_RX_ATTR_FIFO_SIZE_REG(0),
MVPP2_RX_FIFO_PORT_ATTR_SIZE_32KB);
mvpp2_write(priv, MVPP2_RX_DATA_FIFO_SIZE_REG(1),
MVPP2_RX_FIFO_PORT_DATA_SIZE_8KB);
mvpp2_write(priv, MVPP2_RX_ATTR_FIFO_SIZE_REG(1),
MVPP2_RX_FIFO_PORT_ATTR_SIZE_8KB);
for (port = 2; port < MVPP2_MAX_PORTS; port++) {
mvpp2_write(priv, MVPP2_RX_DATA_FIFO_SIZE_REG(port),
MVPP2_RX_FIFO_PORT_DATA_SIZE_4KB);
mvpp2_write(priv, MVPP2_RX_ATTR_FIFO_SIZE_REG(port),
MVPP2_RX_FIFO_PORT_ATTR_SIZE_4KB);
}
mvpp2_write(priv, MVPP2_RX_MIN_PKT_SIZE_REG,
MVPP2_RX_FIFO_PORT_MIN_PKT);
mvpp2_write(priv, MVPP2_RX_FIFO_INIT_REG, 0x1);
}
/* Initialize Tx FIFO's: the total FIFO size is 19kB on PPv2.2 and 10G
* interfaces must have a Tx FIFO size of 10kB. As only port 0 can do 10G,
* configure its Tx FIFO size to 10kB and the others ports Tx FIFO size to 3kB.
*/
static void mvpp22_tx_fifo_init(struct mvpp2 *priv)
{
int port, size, thrs;
for (port = 0; port < MVPP2_MAX_PORTS; port++) {
if (port == 0) {
size = MVPP22_TX_FIFO_DATA_SIZE_10KB;
thrs = MVPP2_TX_FIFO_THRESHOLD_10KB;
} else {
size = MVPP22_TX_FIFO_DATA_SIZE_3KB;
thrs = MVPP2_TX_FIFO_THRESHOLD_3KB;
}
mvpp2_write(priv, MVPP22_TX_FIFO_SIZE_REG(port), size);
mvpp2_write(priv, MVPP22_TX_FIFO_THRESH_REG(port), thrs);
}
}
static void mvpp2_axi_init(struct mvpp2 *priv)
{
u32 val, rdval, wrval;
mvpp2_write(priv, MVPP22_BM_ADDR_HIGH_RLS_REG, 0x0);
/* AXI Bridge Configuration */
rdval = MVPP22_AXI_CODE_CACHE_RD_CACHE
<< MVPP22_AXI_ATTR_CACHE_OFFS;
rdval |= MVPP22_AXI_CODE_DOMAIN_OUTER_DOM
<< MVPP22_AXI_ATTR_DOMAIN_OFFS;
wrval = MVPP22_AXI_CODE_CACHE_WR_CACHE
<< MVPP22_AXI_ATTR_CACHE_OFFS;
wrval |= MVPP22_AXI_CODE_DOMAIN_OUTER_DOM
<< MVPP22_AXI_ATTR_DOMAIN_OFFS;
/* BM */
mvpp2_write(priv, MVPP22_AXI_BM_WR_ATTR_REG, wrval);
mvpp2_write(priv, MVPP22_AXI_BM_RD_ATTR_REG, rdval);
/* Descriptors */
mvpp2_write(priv, MVPP22_AXI_AGGRQ_DESCR_RD_ATTR_REG, rdval);
mvpp2_write(priv, MVPP22_AXI_TXQ_DESCR_WR_ATTR_REG, wrval);
mvpp2_write(priv, MVPP22_AXI_TXQ_DESCR_RD_ATTR_REG, rdval);
mvpp2_write(priv, MVPP22_AXI_RXQ_DESCR_WR_ATTR_REG, wrval);
/* Buffer Data */
mvpp2_write(priv, MVPP22_AXI_TX_DATA_RD_ATTR_REG, rdval);
mvpp2_write(priv, MVPP22_AXI_RX_DATA_WR_ATTR_REG, wrval);
val = MVPP22_AXI_CODE_CACHE_NON_CACHE
<< MVPP22_AXI_CODE_CACHE_OFFS;
val |= MVPP22_AXI_CODE_DOMAIN_SYSTEM
<< MVPP22_AXI_CODE_DOMAIN_OFFS;
mvpp2_write(priv, MVPP22_AXI_RD_NORMAL_CODE_REG, val);
mvpp2_write(priv, MVPP22_AXI_WR_NORMAL_CODE_REG, val);
val = MVPP22_AXI_CODE_CACHE_RD_CACHE
<< MVPP22_AXI_CODE_CACHE_OFFS;
val |= MVPP22_AXI_CODE_DOMAIN_OUTER_DOM
<< MVPP22_AXI_CODE_DOMAIN_OFFS;
mvpp2_write(priv, MVPP22_AXI_RD_SNOOP_CODE_REG, val);
val = MVPP22_AXI_CODE_CACHE_WR_CACHE
<< MVPP22_AXI_CODE_CACHE_OFFS;
val |= MVPP22_AXI_CODE_DOMAIN_OUTER_DOM
<< MVPP22_AXI_CODE_DOMAIN_OFFS;
mvpp2_write(priv, MVPP22_AXI_WR_SNOOP_CODE_REG, val);
}
/* Initialize network controller common part HW */
static int mvpp2_init(struct platform_device *pdev, struct mvpp2 *priv)
{
const struct mbus_dram_target_info *dram_target_info;
int err, i;
u32 val;
/* MBUS windows configuration */
dram_target_info = mv_mbus_dram_info();
if (dram_target_info)
mvpp2_conf_mbus_windows(dram_target_info, priv);
if (priv->hw_version == MVPP22)
mvpp2_axi_init(priv);
/* Disable HW PHY polling */
if (priv->hw_version == MVPP21) {
val = readl(priv->lms_base + MVPP2_PHY_AN_CFG0_REG);
val |= MVPP2_PHY_AN_STOP_SMI0_MASK;
writel(val, priv->lms_base + MVPP2_PHY_AN_CFG0_REG);
} else {
val = readl(priv->iface_base + MVPP22_SMI_MISC_CFG_REG);
val &= ~MVPP22_SMI_POLLING_EN;
writel(val, priv->iface_base + MVPP22_SMI_MISC_CFG_REG);
}
/* Allocate and initialize aggregated TXQs */
priv->aggr_txqs = devm_kcalloc(&pdev->dev, MVPP2_MAX_THREADS,
sizeof(*priv->aggr_txqs),
GFP_KERNEL);
if (!priv->aggr_txqs)
return -ENOMEM;
for (i = 0; i < MVPP2_MAX_THREADS; i++) {
priv->aggr_txqs[i].id = i;
priv->aggr_txqs[i].size = MVPP2_AGGR_TXQ_SIZE;
err = mvpp2_aggr_txq_init(pdev, &priv->aggr_txqs[i], i, priv);
if (err < 0)
return err;
}
/* Fifo Init */
if (priv->hw_version == MVPP21) {
mvpp2_rx_fifo_init(priv);
} else {
mvpp22_rx_fifo_init(priv);
mvpp22_tx_fifo_init(priv);
}
if (priv->hw_version == MVPP21)
writel(MVPP2_EXT_GLOBAL_CTRL_DEFAULT,
priv->lms_base + MVPP2_MNG_EXTENDED_GLOBAL_CTRL_REG);
/* Allow cache snoop when transmiting packets */
mvpp2_write(priv, MVPP2_TX_SNOOP_REG, 0x1);
/* Buffer Manager initialization */
err = mvpp2_bm_init(&pdev->dev, priv);
if (err < 0)
return err;
/* Parser default initialization */
err = mvpp2_prs_default_init(pdev, priv);
if (err < 0)
return err;
/* Classifier default initialization */
mvpp2_cls_init(priv);
return 0;
}
static int mvpp2_probe(struct platform_device *pdev)
{
net: mvpp2: enable ACPI support in the driver This patch introduces an alternative way of obtaining resources - via ACPI tables provided by firmware. Enabling coexistence with the DT support, in addition to the OF_*->device_*/fwnode_* API replacement, required following steps to be taken: * Add mvpp2_acpi_match table * Omit clock configuration and obtain tclk from the property - in ACPI world, the firmware is responsible for clock maintenance. * Disable comphy and syscon handling as they are not available for ACPI. * Modify way of obtaining interrupts - use newly introduced fwnode_irq_get() routine * Until proper MDIO bus and PHY handling with ACPI is established in the kernel, use only link interrupts feature in the driver. For the RGMII port it results in depending on GMAC settings done during firmware stage. * When booting with ACPI MVPP2_QDIST_MULTI_MODE is picked by default, as there is no need to keep any kind of the backward compatibility. Moreover, a memory region used by mvmdio driver is usually placed in the middle of the address space of the PP2 network controller. The MDIO base address is obtained without requesting memory region (by devm_ioremap() call) in mvmdio.c, later overlapping resources are requested by the network driver, which is responsible for avoiding a concurrent access. In case the MDIO memory region is declared in the ACPI, it can already appear as 'in-use' in the OS. Because it is overlapped by second region of the network controller, make sure it is released, before requesting it again. The care is taken by mvpp2 driver to avoid concurrent access to this memory region. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 20:31:44 +08:00
const struct acpi_device_id *acpi_id;
struct fwnode_handle *fwnode = pdev->dev.fwnode;
struct fwnode_handle *port_fwnode;
struct mvpp2 *priv;
struct resource *res;
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
void __iomem *base;
int i, shared;
int err;
priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
if (!priv)
return -ENOMEM;
net: mvpp2: enable ACPI support in the driver This patch introduces an alternative way of obtaining resources - via ACPI tables provided by firmware. Enabling coexistence with the DT support, in addition to the OF_*->device_*/fwnode_* API replacement, required following steps to be taken: * Add mvpp2_acpi_match table * Omit clock configuration and obtain tclk from the property - in ACPI world, the firmware is responsible for clock maintenance. * Disable comphy and syscon handling as they are not available for ACPI. * Modify way of obtaining interrupts - use newly introduced fwnode_irq_get() routine * Until proper MDIO bus and PHY handling with ACPI is established in the kernel, use only link interrupts feature in the driver. For the RGMII port it results in depending on GMAC settings done during firmware stage. * When booting with ACPI MVPP2_QDIST_MULTI_MODE is picked by default, as there is no need to keep any kind of the backward compatibility. Moreover, a memory region used by mvmdio driver is usually placed in the middle of the address space of the PP2 network controller. The MDIO base address is obtained without requesting memory region (by devm_ioremap() call) in mvmdio.c, later overlapping resources are requested by the network driver, which is responsible for avoiding a concurrent access. In case the MDIO memory region is declared in the ACPI, it can already appear as 'in-use' in the OS. Because it is overlapped by second region of the network controller, make sure it is released, before requesting it again. The care is taken by mvpp2 driver to avoid concurrent access to this memory region. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 20:31:44 +08:00
if (has_acpi_companion(&pdev->dev)) {
acpi_id = acpi_match_device(pdev->dev.driver->acpi_match_table,
&pdev->dev);
if (!acpi_id)
return -EINVAL;
net: mvpp2: enable ACPI support in the driver This patch introduces an alternative way of obtaining resources - via ACPI tables provided by firmware. Enabling coexistence with the DT support, in addition to the OF_*->device_*/fwnode_* API replacement, required following steps to be taken: * Add mvpp2_acpi_match table * Omit clock configuration and obtain tclk from the property - in ACPI world, the firmware is responsible for clock maintenance. * Disable comphy and syscon handling as they are not available for ACPI. * Modify way of obtaining interrupts - use newly introduced fwnode_irq_get() routine * Until proper MDIO bus and PHY handling with ACPI is established in the kernel, use only link interrupts feature in the driver. For the RGMII port it results in depending on GMAC settings done during firmware stage. * When booting with ACPI MVPP2_QDIST_MULTI_MODE is picked by default, as there is no need to keep any kind of the backward compatibility. Moreover, a memory region used by mvmdio driver is usually placed in the middle of the address space of the PP2 network controller. The MDIO base address is obtained without requesting memory region (by devm_ioremap() call) in mvmdio.c, later overlapping resources are requested by the network driver, which is responsible for avoiding a concurrent access. In case the MDIO memory region is declared in the ACPI, it can already appear as 'in-use' in the OS. Because it is overlapped by second region of the network controller, make sure it is released, before requesting it again. The care is taken by mvpp2 driver to avoid concurrent access to this memory region. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 20:31:44 +08:00
priv->hw_version = (unsigned long)acpi_id->driver_data;
} else {
priv->hw_version =
(unsigned long)of_device_get_match_data(&pdev->dev);
}
/* multi queue mode isn't supported on PPV2.1, fallback to single
* mode
*/
if (priv->hw_version == MVPP21)
queue_mode = MVPP2_QDIST_SINGLE_MODE;
base = devm_platform_ioremap_resource(pdev, 0);
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
if (IS_ERR(base))
return PTR_ERR(base);
if (priv->hw_version == MVPP21) {
priv->lms_base = devm_platform_ioremap_resource(pdev, 1);
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
if (IS_ERR(priv->lms_base))
return PTR_ERR(priv->lms_base);
} else {
res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
net: mvpp2: enable ACPI support in the driver This patch introduces an alternative way of obtaining resources - via ACPI tables provided by firmware. Enabling coexistence with the DT support, in addition to the OF_*->device_*/fwnode_* API replacement, required following steps to be taken: * Add mvpp2_acpi_match table * Omit clock configuration and obtain tclk from the property - in ACPI world, the firmware is responsible for clock maintenance. * Disable comphy and syscon handling as they are not available for ACPI. * Modify way of obtaining interrupts - use newly introduced fwnode_irq_get() routine * Until proper MDIO bus and PHY handling with ACPI is established in the kernel, use only link interrupts feature in the driver. For the RGMII port it results in depending on GMAC settings done during firmware stage. * When booting with ACPI MVPP2_QDIST_MULTI_MODE is picked by default, as there is no need to keep any kind of the backward compatibility. Moreover, a memory region used by mvmdio driver is usually placed in the middle of the address space of the PP2 network controller. The MDIO base address is obtained without requesting memory region (by devm_ioremap() call) in mvmdio.c, later overlapping resources are requested by the network driver, which is responsible for avoiding a concurrent access. In case the MDIO memory region is declared in the ACPI, it can already appear as 'in-use' in the OS. Because it is overlapped by second region of the network controller, make sure it is released, before requesting it again. The care is taken by mvpp2 driver to avoid concurrent access to this memory region. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 20:31:44 +08:00
if (has_acpi_companion(&pdev->dev)) {
/* In case the MDIO memory region is declared in
* the ACPI, it can already appear as 'in-use'
* in the OS. Because it is overlapped by second
* region of the network controller, make
* sure it is released, before requesting it again.
* The care is taken by mvpp2 driver to avoid
* concurrent access to this memory region.
*/
release_resource(res);
}
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
priv->iface_base = devm_ioremap_resource(&pdev->dev, res);
if (IS_ERR(priv->iface_base))
return PTR_ERR(priv->iface_base);
net: mvpp2: enable ACPI support in the driver This patch introduces an alternative way of obtaining resources - via ACPI tables provided by firmware. Enabling coexistence with the DT support, in addition to the OF_*->device_*/fwnode_* API replacement, required following steps to be taken: * Add mvpp2_acpi_match table * Omit clock configuration and obtain tclk from the property - in ACPI world, the firmware is responsible for clock maintenance. * Disable comphy and syscon handling as they are not available for ACPI. * Modify way of obtaining interrupts - use newly introduced fwnode_irq_get() routine * Until proper MDIO bus and PHY handling with ACPI is established in the kernel, use only link interrupts feature in the driver. For the RGMII port it results in depending on GMAC settings done during firmware stage. * When booting with ACPI MVPP2_QDIST_MULTI_MODE is picked by default, as there is no need to keep any kind of the backward compatibility. Moreover, a memory region used by mvmdio driver is usually placed in the middle of the address space of the PP2 network controller. The MDIO base address is obtained without requesting memory region (by devm_ioremap() call) in mvmdio.c, later overlapping resources are requested by the network driver, which is responsible for avoiding a concurrent access. In case the MDIO memory region is declared in the ACPI, it can already appear as 'in-use' in the OS. Because it is overlapped by second region of the network controller, make sure it is released, before requesting it again. The care is taken by mvpp2 driver to avoid concurrent access to this memory region. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 20:31:44 +08:00
}
net: mvpp2: enable ACPI support in the driver This patch introduces an alternative way of obtaining resources - via ACPI tables provided by firmware. Enabling coexistence with the DT support, in addition to the OF_*->device_*/fwnode_* API replacement, required following steps to be taken: * Add mvpp2_acpi_match table * Omit clock configuration and obtain tclk from the property - in ACPI world, the firmware is responsible for clock maintenance. * Disable comphy and syscon handling as they are not available for ACPI. * Modify way of obtaining interrupts - use newly introduced fwnode_irq_get() routine * Until proper MDIO bus and PHY handling with ACPI is established in the kernel, use only link interrupts feature in the driver. For the RGMII port it results in depending on GMAC settings done during firmware stage. * When booting with ACPI MVPP2_QDIST_MULTI_MODE is picked by default, as there is no need to keep any kind of the backward compatibility. Moreover, a memory region used by mvmdio driver is usually placed in the middle of the address space of the PP2 network controller. The MDIO base address is obtained without requesting memory region (by devm_ioremap() call) in mvmdio.c, later overlapping resources are requested by the network driver, which is responsible for avoiding a concurrent access. In case the MDIO memory region is declared in the ACPI, it can already appear as 'in-use' in the OS. Because it is overlapped by second region of the network controller, make sure it is released, before requesting it again. The care is taken by mvpp2 driver to avoid concurrent access to this memory region. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 20:31:44 +08:00
if (priv->hw_version == MVPP22 && dev_of_node(&pdev->dev)) {
priv->sysctrl_base =
syscon_regmap_lookup_by_phandle(pdev->dev.of_node,
"marvell,system-controller");
if (IS_ERR(priv->sysctrl_base))
/* The system controller regmap is optional for dt
* compatibility reasons. When not provided, the
* configuration of the GoP relies on the
* firmware/bootloader.
*/
priv->sysctrl_base = NULL;
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
}
if (priv->hw_version == MVPP22 &&
mvpp2_get_nrxqs(priv) * 2 <= MVPP2_BM_MAX_POOLS)
priv->percpu_pools = 1;
mvpp2_setup_bm_pool();
priv->nthreads = min_t(unsigned int, num_present_cpus(),
MVPP2_MAX_THREADS);
shared = num_present_cpus() - priv->nthreads;
if (shared > 0)
bitmap_fill(&priv->lock_map,
min_t(int, shared, MVPP2_MAX_THREADS));
for (i = 0; i < MVPP2_MAX_THREADS; i++) {
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
u32 addr_space_sz;
addr_space_sz = (priv->hw_version == MVPP21 ?
MVPP21_ADDR_SPACE_SZ : MVPP22_ADDR_SPACE_SZ);
priv->swth_base[i] = base + i * addr_space_sz;
net: mvpp2: handle register mapping and access for PPv2.2 This commit adjusts the mvpp2 driver register mapping and access logic to support PPv2.2, to handle a number of differences. Due to how the registers are laid out in memory, the Device Tree binding for the "reg" property is different: - On PPv2.1, we had a first area for the packet processor registers (common to all ports), and then one area per port. - On PPv2.2, we have a first area for the packet processor registers (common to all ports), and a second area for numerous other registers, including a large number of per-port registers In addition, on PPv2.2, the area for the common registers is split into so-called "address spaces" of 64 KB each. They allow to access per-CPU registers, where each CPU has its own copy of some registers. A few other registers, which have a single copy, also need to be accessed from those per-CPU windows if they are related to a per-CPU register. For example: - Writing to MVPP2_TXQ_NUM_REG selects a TX queue. This register is a per-CPU register, it must be accessed from the current CPU register window. - Then a write to MVPP2_TXQ_PENDING_REG, MVPP2_TXQ_DESC_ADDR_REG (and a few others) will affect the TX queue that was selected by the write to MVPP2_TXQ_NUM_REG. It must be accessed from the same CPU window as the write to the TXQ_NUM_REG. Therefore, the ->base member of 'struct mvpp2' is replaced with a ->cpu_base[] array, each entry pointing to a mapping of the per-CPU area. Since PPv2.1 doesn't have this concept of per-CPU windows, all entries in ->cpu_base[] point to the same io-remapped area. The existing mvpp2_read() and mvpp2_write() accessors use cpu_base[0], they are used for registers for which the CPU window doesn't matter. mvpp2_percpu_read() and mvpp2_percpu_write() are new accessors added to access the registers for which the CPU window does matter, which is why they take a "cpu" as argument. The driver is then changed to use mvpp2_percpu_read() and mvpp2_percpu_write() where it matters. Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-07 23:53:13 +08:00
}
if (priv->hw_version == MVPP21)
priv->max_port_rxqs = 8;
else
priv->max_port_rxqs = 32;
net: mvpp2: enable ACPI support in the driver This patch introduces an alternative way of obtaining resources - via ACPI tables provided by firmware. Enabling coexistence with the DT support, in addition to the OF_*->device_*/fwnode_* API replacement, required following steps to be taken: * Add mvpp2_acpi_match table * Omit clock configuration and obtain tclk from the property - in ACPI world, the firmware is responsible for clock maintenance. * Disable comphy and syscon handling as they are not available for ACPI. * Modify way of obtaining interrupts - use newly introduced fwnode_irq_get() routine * Until proper MDIO bus and PHY handling with ACPI is established in the kernel, use only link interrupts feature in the driver. For the RGMII port it results in depending on GMAC settings done during firmware stage. * When booting with ACPI MVPP2_QDIST_MULTI_MODE is picked by default, as there is no need to keep any kind of the backward compatibility. Moreover, a memory region used by mvmdio driver is usually placed in the middle of the address space of the PP2 network controller. The MDIO base address is obtained without requesting memory region (by devm_ioremap() call) in mvmdio.c, later overlapping resources are requested by the network driver, which is responsible for avoiding a concurrent access. In case the MDIO memory region is declared in the ACPI, it can already appear as 'in-use' in the OS. Because it is overlapped by second region of the network controller, make sure it is released, before requesting it again. The care is taken by mvpp2 driver to avoid concurrent access to this memory region. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 20:31:44 +08:00
if (dev_of_node(&pdev->dev)) {
priv->pp_clk = devm_clk_get(&pdev->dev, "pp_clk");
if (IS_ERR(priv->pp_clk))
return PTR_ERR(priv->pp_clk);
err = clk_prepare_enable(priv->pp_clk);
if (err < 0)
return err;
net: mvpp2: enable ACPI support in the driver This patch introduces an alternative way of obtaining resources - via ACPI tables provided by firmware. Enabling coexistence with the DT support, in addition to the OF_*->device_*/fwnode_* API replacement, required following steps to be taken: * Add mvpp2_acpi_match table * Omit clock configuration and obtain tclk from the property - in ACPI world, the firmware is responsible for clock maintenance. * Disable comphy and syscon handling as they are not available for ACPI. * Modify way of obtaining interrupts - use newly introduced fwnode_irq_get() routine * Until proper MDIO bus and PHY handling with ACPI is established in the kernel, use only link interrupts feature in the driver. For the RGMII port it results in depending on GMAC settings done during firmware stage. * When booting with ACPI MVPP2_QDIST_MULTI_MODE is picked by default, as there is no need to keep any kind of the backward compatibility. Moreover, a memory region used by mvmdio driver is usually placed in the middle of the address space of the PP2 network controller. The MDIO base address is obtained without requesting memory region (by devm_ioremap() call) in mvmdio.c, later overlapping resources are requested by the network driver, which is responsible for avoiding a concurrent access. In case the MDIO memory region is declared in the ACPI, it can already appear as 'in-use' in the OS. Because it is overlapped by second region of the network controller, make sure it is released, before requesting it again. The care is taken by mvpp2 driver to avoid concurrent access to this memory region. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 20:31:44 +08:00
priv->gop_clk = devm_clk_get(&pdev->dev, "gop_clk");
if (IS_ERR(priv->gop_clk)) {
err = PTR_ERR(priv->gop_clk);
goto err_pp_clk;
}
net: mvpp2: enable ACPI support in the driver This patch introduces an alternative way of obtaining resources - via ACPI tables provided by firmware. Enabling coexistence with the DT support, in addition to the OF_*->device_*/fwnode_* API replacement, required following steps to be taken: * Add mvpp2_acpi_match table * Omit clock configuration and obtain tclk from the property - in ACPI world, the firmware is responsible for clock maintenance. * Disable comphy and syscon handling as they are not available for ACPI. * Modify way of obtaining interrupts - use newly introduced fwnode_irq_get() routine * Until proper MDIO bus and PHY handling with ACPI is established in the kernel, use only link interrupts feature in the driver. For the RGMII port it results in depending on GMAC settings done during firmware stage. * When booting with ACPI MVPP2_QDIST_MULTI_MODE is picked by default, as there is no need to keep any kind of the backward compatibility. Moreover, a memory region used by mvmdio driver is usually placed in the middle of the address space of the PP2 network controller. The MDIO base address is obtained without requesting memory region (by devm_ioremap() call) in mvmdio.c, later overlapping resources are requested by the network driver, which is responsible for avoiding a concurrent access. In case the MDIO memory region is declared in the ACPI, it can already appear as 'in-use' in the OS. Because it is overlapped by second region of the network controller, make sure it is released, before requesting it again. The care is taken by mvpp2 driver to avoid concurrent access to this memory region. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 20:31:44 +08:00
err = clk_prepare_enable(priv->gop_clk);
if (err < 0)
net: mvpp2: enable ACPI support in the driver This patch introduces an alternative way of obtaining resources - via ACPI tables provided by firmware. Enabling coexistence with the DT support, in addition to the OF_*->device_*/fwnode_* API replacement, required following steps to be taken: * Add mvpp2_acpi_match table * Omit clock configuration and obtain tclk from the property - in ACPI world, the firmware is responsible for clock maintenance. * Disable comphy and syscon handling as they are not available for ACPI. * Modify way of obtaining interrupts - use newly introduced fwnode_irq_get() routine * Until proper MDIO bus and PHY handling with ACPI is established in the kernel, use only link interrupts feature in the driver. For the RGMII port it results in depending on GMAC settings done during firmware stage. * When booting with ACPI MVPP2_QDIST_MULTI_MODE is picked by default, as there is no need to keep any kind of the backward compatibility. Moreover, a memory region used by mvmdio driver is usually placed in the middle of the address space of the PP2 network controller. The MDIO base address is obtained without requesting memory region (by devm_ioremap() call) in mvmdio.c, later overlapping resources are requested by the network driver, which is responsible for avoiding a concurrent access. In case the MDIO memory region is declared in the ACPI, it can already appear as 'in-use' in the OS. Because it is overlapped by second region of the network controller, make sure it is released, before requesting it again. The care is taken by mvpp2 driver to avoid concurrent access to this memory region. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 20:31:44 +08:00
goto err_pp_clk;
if (priv->hw_version == MVPP22) {
priv->mg_clk = devm_clk_get(&pdev->dev, "mg_clk");
if (IS_ERR(priv->mg_clk)) {
err = PTR_ERR(priv->mg_clk);
goto err_gop_clk;
}
err = clk_prepare_enable(priv->mg_clk);
if (err < 0)
goto err_gop_clk;
priv->mg_core_clk = devm_clk_get(&pdev->dev, "mg_core_clk");
if (IS_ERR(priv->mg_core_clk)) {
priv->mg_core_clk = NULL;
} else {
err = clk_prepare_enable(priv->mg_core_clk);
if (err < 0)
goto err_mg_clk;
}
net: mvpp2: enable ACPI support in the driver This patch introduces an alternative way of obtaining resources - via ACPI tables provided by firmware. Enabling coexistence with the DT support, in addition to the OF_*->device_*/fwnode_* API replacement, required following steps to be taken: * Add mvpp2_acpi_match table * Omit clock configuration and obtain tclk from the property - in ACPI world, the firmware is responsible for clock maintenance. * Disable comphy and syscon handling as they are not available for ACPI. * Modify way of obtaining interrupts - use newly introduced fwnode_irq_get() routine * Until proper MDIO bus and PHY handling with ACPI is established in the kernel, use only link interrupts feature in the driver. For the RGMII port it results in depending on GMAC settings done during firmware stage. * When booting with ACPI MVPP2_QDIST_MULTI_MODE is picked by default, as there is no need to keep any kind of the backward compatibility. Moreover, a memory region used by mvmdio driver is usually placed in the middle of the address space of the PP2 network controller. The MDIO base address is obtained without requesting memory region (by devm_ioremap() call) in mvmdio.c, later overlapping resources are requested by the network driver, which is responsible for avoiding a concurrent access. In case the MDIO memory region is declared in the ACPI, it can already appear as 'in-use' in the OS. Because it is overlapped by second region of the network controller, make sure it is released, before requesting it again. The care is taken by mvpp2 driver to avoid concurrent access to this memory region. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 20:31:44 +08:00
}
priv->axi_clk = devm_clk_get(&pdev->dev, "axi_clk");
if (IS_ERR(priv->axi_clk)) {
err = PTR_ERR(priv->axi_clk);
if (err == -EPROBE_DEFER)
goto err_mg_core_clk;
priv->axi_clk = NULL;
} else {
err = clk_prepare_enable(priv->axi_clk);
if (err < 0)
goto err_mg_core_clk;
}
net: mvpp2: enable ACPI support in the driver This patch introduces an alternative way of obtaining resources - via ACPI tables provided by firmware. Enabling coexistence with the DT support, in addition to the OF_*->device_*/fwnode_* API replacement, required following steps to be taken: * Add mvpp2_acpi_match table * Omit clock configuration and obtain tclk from the property - in ACPI world, the firmware is responsible for clock maintenance. * Disable comphy and syscon handling as they are not available for ACPI. * Modify way of obtaining interrupts - use newly introduced fwnode_irq_get() routine * Until proper MDIO bus and PHY handling with ACPI is established in the kernel, use only link interrupts feature in the driver. For the RGMII port it results in depending on GMAC settings done during firmware stage. * When booting with ACPI MVPP2_QDIST_MULTI_MODE is picked by default, as there is no need to keep any kind of the backward compatibility. Moreover, a memory region used by mvmdio driver is usually placed in the middle of the address space of the PP2 network controller. The MDIO base address is obtained without requesting memory region (by devm_ioremap() call) in mvmdio.c, later overlapping resources are requested by the network driver, which is responsible for avoiding a concurrent access. In case the MDIO memory region is declared in the ACPI, it can already appear as 'in-use' in the OS. Because it is overlapped by second region of the network controller, make sure it is released, before requesting it again. The care is taken by mvpp2 driver to avoid concurrent access to this memory region. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 20:31:44 +08:00
/* Get system's tclk rate */
priv->tclk = clk_get_rate(priv->pp_clk);
} else if (device_property_read_u32(&pdev->dev, "clock-frequency",
&priv->tclk)) {
dev_err(&pdev->dev, "missing clock-frequency value\n");
return -EINVAL;
}
if (priv->hw_version == MVPP22) {
err = dma_set_mask(&pdev->dev, MVPP2_DESC_DMA_MASK);
if (err)
goto err_axi_clk;
/* Sadly, the BM pools all share the same register to
* store the high 32 bits of their address. So they
* must all have the same high 32 bits, which forces
* us to restrict coherent memory to DMA_BIT_MASK(32).
*/
err = dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32));
if (err)
goto err_axi_clk;
}
/* Initialize network controller */
err = mvpp2_init(pdev, priv);
if (err < 0) {
dev_err(&pdev->dev, "failed to initialize controller\n");
goto err_axi_clk;
}
/* Initialize ports */
fwnode_for_each_available_child_node(fwnode, port_fwnode) {
err = mvpp2_port_probe(pdev, port_fwnode, priv);
if (err < 0)
goto err_port_probe;
}
if (priv->port_count == 0) {
dev_err(&pdev->dev, "no ports enabled\n");
err = -ENODEV;
goto err_axi_clk;
}
/* Statistics must be gathered regularly because some of them (like
* packets counters) are 32-bit registers and could overflow quite
* quickly. For instance, a 10Gb link used at full bandwidth with the
* smallest packets (64B) will overflow a 32-bit counter in less than
* 30 seconds. Then, use a workqueue to fill 64-bit counters.
*/
snprintf(priv->queue_name, sizeof(priv->queue_name),
"stats-wq-%s%s", netdev_name(priv->port_list[0]->dev),
priv->port_count > 1 ? "+" : "");
priv->stats_queue = create_singlethread_workqueue(priv->queue_name);
if (!priv->stats_queue) {
err = -ENOMEM;
goto err_port_probe;
}
net: mvpp2: add a debugfs interface for the Header Parser Marvell PPv2 Packer Header Parser has a TCAM based filter, that is not trivial to configure and debug. Being able to dump TCAM entries from userspace can be really helpful to help development of new features and debug existing ones. This commit adds a basic debugfs interface for the PPv2 driver, focusing on TCAM related features. <mnt>/mvpp2/ --- f2000000.ethernet \- f4000000.ethernet --- parser --- 000 ... | \- 001 | \- ... | \- 255 --- ai | \- header_data | \- lookup_id | \- sram | \- valid \- eth1 ... \- eth2 --- mac_filter \- parser_entries \- vid_filter There's one directory per PPv2 instance, named after pdev->name to make sure names are uniques. In each of these directories, there's : - one directory per interface on the controller, each containing : - "mac_filter", which lists all filtered addresses for this port (based on TCAM, not on the kernel's uc / mc lists) - "parser_entries", which lists the indices of all valid TCAM entries that have this port in their port map - "vid_filter", which lists the vids allowed on this port, based on TCAM - one "parser" directory (the parser is common to all ports), containing : - one directory per TCAM entry (256 of them, from 0 to 255), each containing : - "ai" : Contains the 1 byte Additional Info field from TCAM, and - "header_data" : Contains the 8 bytes Header Data extracted from the packet - "lookup_id" : Contains the 4 bits LU_ID - "sram" : contains the raw SRAM data, which is the result of the TCAM lookup. This readonly at the moment. - "valid" : Indicates if the entry is valid of not. All entries are read-only, and everything is output in hex form. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-14 19:29:25 +08:00
mvpp2_dbgfs_init(priv, pdev->name);
platform_set_drvdata(pdev, priv);
return 0;
err_port_probe:
i = 0;
fwnode_for_each_available_child_node(fwnode, port_fwnode) {
if (priv->port_list[i])
mvpp2_port_remove(priv->port_list[i]);
i++;
}
err_axi_clk:
clk_disable_unprepare(priv->axi_clk);
err_mg_core_clk:
if (priv->hw_version == MVPP22)
clk_disable_unprepare(priv->mg_core_clk);
err_mg_clk:
if (priv->hw_version == MVPP22)
clk_disable_unprepare(priv->mg_clk);
err_gop_clk:
clk_disable_unprepare(priv->gop_clk);
err_pp_clk:
clk_disable_unprepare(priv->pp_clk);
return err;
}
static int mvpp2_remove(struct platform_device *pdev)
{
struct mvpp2 *priv = platform_get_drvdata(pdev);
struct fwnode_handle *fwnode = pdev->dev.fwnode;
int i = 0, poolnum = MVPP2_BM_POOLS_NUM;
struct fwnode_handle *port_fwnode;
net: mvpp2: add a debugfs interface for the Header Parser Marvell PPv2 Packer Header Parser has a TCAM based filter, that is not trivial to configure and debug. Being able to dump TCAM entries from userspace can be really helpful to help development of new features and debug existing ones. This commit adds a basic debugfs interface for the PPv2 driver, focusing on TCAM related features. <mnt>/mvpp2/ --- f2000000.ethernet \- f4000000.ethernet --- parser --- 000 ... | \- 001 | \- ... | \- 255 --- ai | \- header_data | \- lookup_id | \- sram | \- valid \- eth1 ... \- eth2 --- mac_filter \- parser_entries \- vid_filter There's one directory per PPv2 instance, named after pdev->name to make sure names are uniques. In each of these directories, there's : - one directory per interface on the controller, each containing : - "mac_filter", which lists all filtered addresses for this port (based on TCAM, not on the kernel's uc / mc lists) - "parser_entries", which lists the indices of all valid TCAM entries that have this port in their port map - "vid_filter", which lists the vids allowed on this port, based on TCAM - one "parser" directory (the parser is common to all ports), containing : - one directory per TCAM entry (256 of them, from 0 to 255), each containing : - "ai" : Contains the 1 byte Additional Info field from TCAM, and - "header_data" : Contains the 8 bytes Header Data extracted from the packet - "lookup_id" : Contains the 4 bits LU_ID - "sram" : contains the raw SRAM data, which is the result of the TCAM lookup. This readonly at the moment. - "valid" : Indicates if the entry is valid of not. All entries are read-only, and everything is output in hex form. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-14 19:29:25 +08:00
mvpp2_dbgfs_cleanup(priv);
fwnode_for_each_available_child_node(fwnode, port_fwnode) {
if (priv->port_list[i]) {
mutex_destroy(&priv->port_list[i]->gather_stats_lock);
mvpp2_port_remove(priv->port_list[i]);
}
i++;
}
mvpp2: fix panic on module removal mvpp2 uses a delayed workqueue to gather traffic statistics. On module removal the workqueue can be destroyed before calling cancel_delayed_work_sync() on its works. Fix it by moving the destroy_workqueue() call after mvpp2_port_remove(). Also remove an unneeded call to flush_workqueue() # rmmod mvpp2 [ 2743.311722] mvpp2 f4000000.ethernet eth1: phy link down 10gbase-kr/10Gbps/Full [ 2743.320063] mvpp2 f4000000.ethernet eth1: Link is Down [ 2743.572263] mvpp2 f4000000.ethernet eth2: phy link down sgmii/1Gbps/Full [ 2743.580076] mvpp2 f4000000.ethernet eth2: Link is Down [ 2744.102169] mvpp2 f2000000.ethernet eth0: phy link down 10gbase-kr/10Gbps/Full [ 2744.110441] mvpp2 f2000000.ethernet eth0: Link is Down [ 2744.115614] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 [ 2744.115615] Mem abort info: [ 2744.115616] ESR = 0x96000005 [ 2744.115617] Exception class = DABT (current EL), IL = 32 bits [ 2744.115618] SET = 0, FnV = 0 [ 2744.115619] EA = 0, S1PTW = 0 [ 2744.115620] Data abort info: [ 2744.115621] ISV = 0, ISS = 0x00000005 [ 2744.115622] CM = 0, WnR = 0 [ 2744.115624] user pgtable: 4k pages, 39-bit VAs, pgdp=0000000422681000 [ 2744.115626] [0000000000000000] pgd=0000000000000000, pud=0000000000000000 [ 2744.115630] Internal error: Oops: 96000005 [#1] SMP [ 2744.115632] Modules linked in: mvpp2(-) algif_hash af_alg nls_iso8859_1 nls_cp437 vfat fat xhci_plat_hcd m25p80 spi_nor xhci_hcd mtd usbcore i2c_mv64xxx sfp usb_common marvell10g phy_generic spi_orion mdio_i2c i2c_core mvmdio phylink sbsa_gwdt ip_tables x_tables autofs4 [last unloaded: mvpp2] [ 2744.115654] CPU: 3 PID: 8357 Comm: kworker/3:2 Not tainted 5.3.0-rc2 #1 [ 2744.115655] Hardware name: Marvell 8040 MACCHIATOBin Double-shot (DT) [ 2744.115665] Workqueue: events_power_efficient phylink_resolve [phylink] [ 2744.115669] pstate: a0000085 (NzCv daIf -PAN -UAO) [ 2744.115675] pc : __queue_work+0x9c/0x4d8 [ 2744.115677] lr : __queue_work+0x170/0x4d8 [ 2744.115678] sp : ffffff801001bd50 [ 2744.115680] x29: ffffff801001bd50 x28: ffffffc422597600 [ 2744.115684] x27: ffffff80109ae6f0 x26: ffffff80108e4018 [ 2744.115688] x25: 0000000000000003 x24: 0000000000000004 [ 2744.115691] x23: ffffff80109ae6e0 x22: 0000000000000017 [ 2744.115694] x21: ffffffc42c030000 x20: ffffffc42209e8f8 [ 2744.115697] x19: 0000000000000000 x18: 0000000000000000 [ 2744.115699] x17: 0000000000000000 x16: 0000000000000000 [ 2744.115701] x15: 0000000000000010 x14: ffffffffffffffff [ 2744.115702] x13: ffffff8090e2b95f x12: ffffff8010e2b967 [ 2744.115704] x11: ffffff8010906000 x10: 0000000000000040 [ 2744.115706] x9 : ffffff80109223b8 x8 : ffffff80109223b0 [ 2744.115707] x7 : ffffffc42bc00068 x6 : 0000000000000000 [ 2744.115709] x5 : ffffffc42bc00000 x4 : 0000000000000000 [ 2744.115710] x3 : 0000000000000000 x2 : 0000000000000000 [ 2744.115712] x1 : 0000000000000008 x0 : ffffffc42c030000 [ 2744.115714] Call trace: [ 2744.115716] __queue_work+0x9c/0x4d8 [ 2744.115718] delayed_work_timer_fn+0x28/0x38 [ 2744.115722] call_timer_fn+0x3c/0x180 [ 2744.115723] expire_timers+0x60/0x168 [ 2744.115724] run_timer_softirq+0xbc/0x1e8 [ 2744.115727] __do_softirq+0x128/0x320 [ 2744.115731] irq_exit+0xa4/0xc0 [ 2744.115734] __handle_domain_irq+0x70/0xc0 [ 2744.115735] gic_handle_irq+0x58/0xa8 [ 2744.115737] el1_irq+0xb8/0x140 [ 2744.115738] console_unlock+0x3a0/0x568 [ 2744.115740] vprintk_emit+0x200/0x2a0 [ 2744.115744] dev_vprintk_emit+0x1c8/0x1e4 [ 2744.115747] dev_printk_emit+0x6c/0x7c [ 2744.115751] __netdev_printk+0x104/0x1d8 [ 2744.115752] netdev_printk+0x60/0x70 [ 2744.115756] phylink_resolve+0x38c/0x3c8 [phylink] [ 2744.115758] process_one_work+0x1f8/0x448 [ 2744.115760] worker_thread+0x54/0x500 [ 2744.115762] kthread+0x12c/0x130 [ 2744.115764] ret_from_fork+0x10/0x1c [ 2744.115768] Code: aa1403e0 97fffbbe aa0003f5 b4000700 (f9400261) Fixes: 118d6298f6f0 ("net: mvpp2: add ethtool GOP statistics") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Matteo Croce <mcroce@redhat.com> Acked-by: Antoine Tenart <antoine.tenart@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-01 20:13:30 +08:00
destroy_workqueue(priv->stats_queue);
if (priv->percpu_pools)
poolnum = mvpp2_get_nrxqs(priv) * 2;
for (i = 0; i < poolnum; i++) {
struct mvpp2_bm_pool *bm_pool = &priv->bm_pools[i];
mvpp2_bm_pool_destroy(&pdev->dev, priv, bm_pool);
}
for (i = 0; i < MVPP2_MAX_THREADS; i++) {
struct mvpp2_tx_queue *aggr_txq = &priv->aggr_txqs[i];
dma_free_coherent(&pdev->dev,
MVPP2_AGGR_TXQ_SIZE * MVPP2_DESC_ALIGNED_SIZE,
aggr_txq->descs,
aggr_txq->descs_dma);
}
net: mvpp2: enable ACPI support in the driver This patch introduces an alternative way of obtaining resources - via ACPI tables provided by firmware. Enabling coexistence with the DT support, in addition to the OF_*->device_*/fwnode_* API replacement, required following steps to be taken: * Add mvpp2_acpi_match table * Omit clock configuration and obtain tclk from the property - in ACPI world, the firmware is responsible for clock maintenance. * Disable comphy and syscon handling as they are not available for ACPI. * Modify way of obtaining interrupts - use newly introduced fwnode_irq_get() routine * Until proper MDIO bus and PHY handling with ACPI is established in the kernel, use only link interrupts feature in the driver. For the RGMII port it results in depending on GMAC settings done during firmware stage. * When booting with ACPI MVPP2_QDIST_MULTI_MODE is picked by default, as there is no need to keep any kind of the backward compatibility. Moreover, a memory region used by mvmdio driver is usually placed in the middle of the address space of the PP2 network controller. The MDIO base address is obtained without requesting memory region (by devm_ioremap() call) in mvmdio.c, later overlapping resources are requested by the network driver, which is responsible for avoiding a concurrent access. In case the MDIO memory region is declared in the ACPI, it can already appear as 'in-use' in the OS. Because it is overlapped by second region of the network controller, make sure it is released, before requesting it again. The care is taken by mvpp2 driver to avoid concurrent access to this memory region. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 20:31:44 +08:00
if (is_acpi_node(port_fwnode))
return 0;
clk_disable_unprepare(priv->axi_clk);
clk_disable_unprepare(priv->mg_core_clk);
clk_disable_unprepare(priv->mg_clk);
clk_disable_unprepare(priv->pp_clk);
clk_disable_unprepare(priv->gop_clk);
return 0;
}
static const struct of_device_id mvpp2_match[] = {
{
.compatible = "marvell,armada-375-pp2",
.data = (void *)MVPP21,
},
{
.compatible = "marvell,armada-7k-pp22",
.data = (void *)MVPP22,
},
{ }
};
MODULE_DEVICE_TABLE(of, mvpp2_match);
net: mvpp2: enable ACPI support in the driver This patch introduces an alternative way of obtaining resources - via ACPI tables provided by firmware. Enabling coexistence with the DT support, in addition to the OF_*->device_*/fwnode_* API replacement, required following steps to be taken: * Add mvpp2_acpi_match table * Omit clock configuration and obtain tclk from the property - in ACPI world, the firmware is responsible for clock maintenance. * Disable comphy and syscon handling as they are not available for ACPI. * Modify way of obtaining interrupts - use newly introduced fwnode_irq_get() routine * Until proper MDIO bus and PHY handling with ACPI is established in the kernel, use only link interrupts feature in the driver. For the RGMII port it results in depending on GMAC settings done during firmware stage. * When booting with ACPI MVPP2_QDIST_MULTI_MODE is picked by default, as there is no need to keep any kind of the backward compatibility. Moreover, a memory region used by mvmdio driver is usually placed in the middle of the address space of the PP2 network controller. The MDIO base address is obtained without requesting memory region (by devm_ioremap() call) in mvmdio.c, later overlapping resources are requested by the network driver, which is responsible for avoiding a concurrent access. In case the MDIO memory region is declared in the ACPI, it can already appear as 'in-use' in the OS. Because it is overlapped by second region of the network controller, make sure it is released, before requesting it again. The care is taken by mvpp2 driver to avoid concurrent access to this memory region. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 20:31:44 +08:00
static const struct acpi_device_id mvpp2_acpi_match[] = {
{ "MRVL0110", MVPP22 },
{ },
};
MODULE_DEVICE_TABLE(acpi, mvpp2_acpi_match);
static struct platform_driver mvpp2_driver = {
.probe = mvpp2_probe,
.remove = mvpp2_remove,
.driver = {
.name = MVPP2_DRIVER_NAME,
.of_match_table = mvpp2_match,
net: mvpp2: enable ACPI support in the driver This patch introduces an alternative way of obtaining resources - via ACPI tables provided by firmware. Enabling coexistence with the DT support, in addition to the OF_*->device_*/fwnode_* API replacement, required following steps to be taken: * Add mvpp2_acpi_match table * Omit clock configuration and obtain tclk from the property - in ACPI world, the firmware is responsible for clock maintenance. * Disable comphy and syscon handling as they are not available for ACPI. * Modify way of obtaining interrupts - use newly introduced fwnode_irq_get() routine * Until proper MDIO bus and PHY handling with ACPI is established in the kernel, use only link interrupts feature in the driver. For the RGMII port it results in depending on GMAC settings done during firmware stage. * When booting with ACPI MVPP2_QDIST_MULTI_MODE is picked by default, as there is no need to keep any kind of the backward compatibility. Moreover, a memory region used by mvmdio driver is usually placed in the middle of the address space of the PP2 network controller. The MDIO base address is obtained without requesting memory region (by devm_ioremap() call) in mvmdio.c, later overlapping resources are requested by the network driver, which is responsible for avoiding a concurrent access. In case the MDIO memory region is declared in the ACPI, it can already appear as 'in-use' in the OS. Because it is overlapped by second region of the network controller, make sure it is released, before requesting it again. The care is taken by mvpp2 driver to avoid concurrent access to this memory region. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-18 20:31:44 +08:00
.acpi_match_table = ACPI_PTR(mvpp2_acpi_match),
},
};
module_platform_driver(mvpp2_driver);
MODULE_DESCRIPTION("Marvell PPv2 Ethernet Driver - www.marvell.com");
MODULE_AUTHOR("Marcin Wojtas <mw@semihalf.com>");
MODULE_LICENSE("GPL v2");