2
0
mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-23 20:53:53 +08:00
Commit Graph

627 Commits

Author SHA1 Message Date
Or Gerlitz
1b136de120 net/mlx4: Implement vxlan ndo calls
Add implementation for the add/del vxlan port ndo calls, using the
CONFIG_DEV firmware command.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-28 16:29:35 -04:00
Or Gerlitz
d18f141a1a mlx4: Add support for CONFIG_DEV command
Introduce the CONFIG_DEV firmware command which we will use to
configure the UDP port assumed by the firmware for the VXLAN offloads.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-28 16:29:35 -04:00
Or Gerlitz
b74757944d net/mlx4: USe one wrapper that returns -EPERM
When a VF issues a firmware command which is disallowed for them, the PF
rerturns -EPERM from that command wrapper. Move to use one such wrapper
instance, instead of repeating the same code on such commands.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-28 16:29:35 -04:00
Wei Yang
97a5221f56 net/mlx4_core: pass pci_device_id.driver_data to __mlx4_init_one during reset
The second parameter of __mlx4_init_one() is used to identify whether the
pci_dev is a PF or VF. Currently, when it is invoked in mlx4_pci_slot_reset()
this information is missed.

This patch match the pci_dev with mlx4_pci_table and passes the
pci_device_id.driver_data to __mlx4_init_one() in mlx4_pci_slot_reset().

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-27 15:35:33 -04:00
Richard Cochran
4986b4f008 ptp: drivers: set the number of programmable pins.
This patch updates the many PTP Hardware Clock drivers with the
newly introduced field that advertises the number of programmable
pins. Some of these devices do have programmable pins, but the
implementation will have to wait for follow on patches.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-21 14:21:14 -04:00
Matan Barak
dd41cc3bb9 net/mlx4: Adapt num_vfs/probed_vf params for single port VF
A new syntax is added for the module parameters num_vfs and probe_vf.

  num_vfs=p1,p2,p1+p2
  probe_bf=p1,p2,p1+p2

Where p1(2) is the number of VFs on / probed VFs for physical
port1(2) and p1+p2 is the number of dual port VFs.

Single port VFs are currently supported only when the link type
for both ports of the device is Ethernet.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-20 16:18:30 -04:00
Matan Barak
449fc48866 net/mlx4: Adapt code for N-Port VF
Adds support for N-Port VFs, this includes:
1. Adding support in the wrapped FW command
	In wrapped commands, we need to verify and convert
	the slave's port into the real physical port.
	Furthermore, when sending the response back to the slave,
	a reverse conversion should be made.
2. Adjusting sqpn for QP1 para-virtualization
	The slave assumes that sqpn is used for QP1 communication.
	If the slave is assigned to a port != (first port), we need
	to adjust the sqpn that will direct its QP1 packets into the
	correct endpoint.
3. Adjusting gid[5] to modify the port for raw ethernet
	In B0 steering, gid[5] contains the port. It needs
	to be adjusted into the physical port.
4. Adjusting number of ports in the query / ports caps in the FW commands
	When a slave queries the hardware, it needs to view only
	the physical ports it's assigned to.
5. Adjusting the sched_qp according to the port number
	The QP port is encoded in the sched_qp, thus in modify_qp we need
	to encode the correct port in sched_qp.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-20 16:18:30 -04:00
Matan Barak
f74462acf8 net/mlx4: Add utils for N-Port VFs
This patch adds the following utils:
1. Convert slave_id -> VF
2. Get the active ports by slave_id
3. Convert slave's port to real port
4. Get the slave's port from real port
5. Get all slaves that uses the i'th real port
6. Get all slaves that uses the i'th real port exclusively

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-20 16:18:29 -04:00
Matan Barak
1ab95d37bc net/mlx4: Add data structures to support N-Ports per VF
Adds the required data structures to support VFs with N (1 or 2)
ports instead of always using the number of physical ports.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-20 16:18:29 -04:00
Eric W. Biederman
38be0a347c mlx4: Don't receive packets when the napi budget == 0
Processing any incoming packets with a with a napi budget of 0
is incorrect driver behavior.

This matters as netpoll will shortly call drivers with a budget of 0
to avoid receive packet processing happening in hard irq context.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-14 22:52:48 -04:00
David S. Miller
85dcce7a73 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/usb/r8152.c
	drivers/net/xen-netback/netback.c

Both the r8152 and netback conflicts were simple overlapping
changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-14 22:31:55 -04:00
Or Gerlitz
de12326830 net/mlx4_en: Deregister multicast vxlan steering rules when going down
When mlx4_en_stop_port() is called, we need to deregister also the
tunnel steering rules that relate to multicast.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-13 13:35:40 -04:00
Eric W. Biederman
e81f44b66b mlx4: Call dev_kfree_skby_any instead of dev_kfree_skb.
Replace dev_kfree_skb with dev_kfree_skb_any in functions that can
be called in hard irq and other contexts.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 16:22:13 -04:00
Or Gerlitz
7855bff42e net/mlx4_core: Load the IB driver when the device supports IBoE
When checking what protocol drivers to load, the IB driver should be
requested also over Ethernet ports, if the device supports IBoE (RoCE).

Fixes: b046ffe 'net/mlx4_core: Load higher level modules according to ports type'
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 16:12:42 -04:00
Or Gerlitz
2a2083f7f3 net/mlx4_en: Handle vxlan steering rules for mac address changes
When the device mac address is changed, we must deregister the vxlan
steering rule associated with the previous mac, and register a new
steering rule using the new mac.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 16:12:41 -04:00
Or Gerlitz
56cb456746 net/mlx4_core: Fix wrong dump of the vxlan offloads device capability
Fix the value used to dump the vxlan offloads device capability to align
with the MLX4_DEV_CAP_FLAG2_yyy definition. While on that, add dump to
the IPoIB flow-steering device capability and fix small typo.

The vxlan cap value wasn't fully handled when a conflict was resolved
between MLX4_DEV_CAP_FLAG2_DMFS_IPOIB coming from the IB tree to
MLX4_DEV_CAP_FLAG2_VXLAN_OFFLOADS coming from net-next.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 16:12:41 -04:00
Jack Morgenstein
aa9a2d51a3 mlx4: Activate RoCE/SRIOV
To activate RoCE/SRIOV, need to remove the following:
1. In mlx4_ib_add, need to remove the error return preventing
   initialization of a RoCE port under SRIOV.
2. In update_vport_qp_params (in resource_tracker.c) need to remove
   the error return when a RoCE RC or UD qp is detected.
   This error return causes the INIT-to-RTR qp transition to fail
   in the wrapper function under RoCE/SRIOV.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:57:16 -04:00
Jack Morgenstein
5ea8bbfc49 mlx4: Implement IP based gids support for RoCE/SRIOV
Since there is no connection between the MAC/VLAN and the GID
when using IP-based addressing, the proxy QP1 (running on the
slave) must pass the source-mac, destination-mac, and vlan_id
information separately from the GID. Additionally, the Host
must pass the remote source-mac and vlan_id back to the slave,

This is achieved as follows:
Outgoing MADs:
    1. Source MAC: obtained from the CQ completion structure
       (struct ib_wc, smac field).
    2. Destination MAC: obtained from the tunnel header
    3. vlan_id: obtained from the tunnel header.
Incoming MADs
    1. The source (i.e., remote) MAC and vlan_id are passed in
       the tunnel header to the proxy QP1.

VST mode support:
     For outgoing MADs,  the vlan_id obtained from the header is
        discarded, and the vlan_id specified by the Hypervisor is used
        instead.
     For incoming MADs, the incoming vlan_id (in the wc) is discarded, and the
        "invalid" vlan (0xffff)  is substituted when forwarding to the slave.

Signed-off-by: Moni Shoua <monis@mellanox.co.il>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:57:16 -04:00
Jack Morgenstein
2f5bb47368 mlx4: Add ref counting to port MAC table for RoCE
The IB side of RoCE requires the MAC table index of the
MAC address used by its QPs.

To obtain the real MAC index, the IB side registers the
MAC (increasing its ref count, and also returning the
real MAC index) during the modify-qp sequence.

This protects against the ETH side deleting or modifying
that MAC table entry while the QP is active.

Note that until the modify-qp command returns success,
the MAC and VLAN information only has "candidate" status.
If the modify-qp succeeds, the "candidate" info is promoted
to the operational MAC/VLAN info for the qp. If the modify fails,
the candidate MAC/VLAN is unregistered, and the old qp info
is preserved.

The patch is a bit complex, because there are multiple qp
transitions where the primary-path information may be
modified:  INIT-to-RTR, and SQD-to-SQD.

Similarly for the alternate path information.

Therefore the code must handle cases where path information
has already been entered into the QP context by previous
qp transitions.

For the MAC address, the success logic is as follows:
1. If there was no previous MAC, simply move the candidate
   MAC information to the operational information, and reset
   the candidate MAC info.
2. If there was a previous MAC, unregister it.  Then move
   the MAC information from candidate to operational, and
   reset the candidate info (as in 1. above).

The MAC address failure logic is the same for all cases:
 - Unregister the candidate MAC, and reset the candidate MAC info.

For Vlan registration, the logic is similar.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:57:15 -04:00
Jack Morgenstein
b6ffaeffae mlx4: In RoCE allow guests to have multiple GIDS
The GIDs are statically distributed, as follows:
PF: gets 16 GIDs
VFs:  Remaining GIDS are divided evenly between VFs activated by the driver.
      If the division is not even, lower-numbered VFs get an extra GID.

For an IB interface, the number of gids per guest remains as before: one gid per guest.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:57:14 -04:00
Jack Morgenstein
9cd593529c mlx4_core: For RoCE, allow slaves to set the GID entry at that slave's index
For IB transport, the host determines the slave GIDs. For ETH (RoCE),
however, the slave's GID is determined by the IP address that the slave
itself assigns to the ETH device used by RoCE.

In this case, the slave must be able to write its GIDs to the HCA gid table
(at the GID indices that slave "owns").

This commit adds processing for the SET_PORT_GID_TABLE opcode modifier
for the SET_PORT command wrapper (so that slaves may modify their GIDS
for RoCE).

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:57:13 -04:00
Jack Morgenstein
6ee51a4e86 mlx4: Adjust QP1 multiplexing for RoCE/SRIOV
This requires the following modifications:
1. Fix build_mlx4_header to properly fill in the ETH fields
2. Adjust mux and demux QP1 flow to support RoCE.

This commit still assumes only one GID per slave for RoCE.
The commit enabling multiple GIDs is a subsequent commit, and
is done separately because of its complexity.

Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-12 15:57:12 -04:00
Kleber Sacilotto de Souza
c120e9e030 IB/mlx5_core: remove unreachable function call in module init
The call to mlx5_health_cleanup() in the module init function can never
be reached. Removing it.

Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Acked-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-10 15:23:22 -04:00
Sagi Grimberg
3bcdb17a5e IB/mlx5: Keep mlx5 MRs in a radix tree under device
This will be useful when processing signature errors on a specific
key.  The mlx5 driver will lookup the matching mlx5 memory region
structure and mark it as dirty (contains signature errors).

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-07 11:26:49 -08:00
Sagi Grimberg
3121e3c441 mlx5: Implement create_mr and destroy_mr
Support create_mr and destroy_mr verbs.  Creating ib_mr may be done
for either ib_mr that will register regular page lists like
alloc_fast_reg_mr routine, or indirect ib_mrs that can register other
(pre-registered) ib_mrs in an indirect manner.

In addition user may request signature enable, that will mean that the
created ib_mr may be attached with signature attributes (BSF, PSVs).

Currently we only allow direct/indirect registration modes.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-03-07 11:26:49 -08:00
Amir Vadai
97989356af net/mlx4_core: mlx4_init_slave() shouldn't access comm channel before PF is ready
Currently, the PF call to pci_enable_sriov from the PF probe function
stalls for 10 seconds times the number of VFs probed on the host. This
happens because the way for such VFs to determine of the PF
initialization finished, is by attempting to issue reset on the
comm-channel and get timeout (after 10s).

The PF probe function is called from a kenernel workqueue, and therefore
during that time, rcu lock is being held and kernel's workqueue is
stalled. This blocks other processes that try to use the workqueue
or rcu lock.  For example, interface renaming which is calling
rcu_synchronize is blocked, and timedout by systemd.

Changed mlx4_init_slave() to allow VF probed on the host to immediatly
detect that the PF is not ready, and return EPROBE_DEFER instantly.

Only when the PF finishes the initialization, allow such VFs to
access the comm channel.

This issue and fix are relevant only for probed VFs on the hypervisor,
there is no way to pass this information to a VM until comm channel is
ready, so in a VM, if PF is not ready, the first command will be timedout
after 10 seconds and return EPROBE_DEFER.

Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 17:09:04 -05:00
Amir Vadai
57352ef4f5 net/mlx4_core: Fix memory access error in mlx4_QUERY_DEV_CAP_wrapper()
Fix a regression introduced by [1]. outbox was accessed instead of
outbox->buf. Typo was copy-pasted to [2] and [3].

[1] - cc1ade9 mlx4_core: Disable memory windows for virtual functions
[2] - 4de6580 mlx4_core: Add support for steerable IB UD QPs
[3] - 7ffdf72 net/mlx4_core: Add basic support for TCP/IP offloads under
      tunneling

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 17:09:04 -05:00
Fengguang Wu
d0ceebd750 net/mlx4_en: mlx4_en_verify_params() can be static
Fix static error introduced by commit:
b97b33a3df [645/653] net/mlx4_en: Verify
mlx4_en module parameters

sparse warnings:
drivers/net/ethernet/mellanox/mlx4/en_main.c:335:6: sparse: symbol
'mlx4_en_verify_params' was not declared. Should it be static?

CC: netdev@vger.kernel.org
CC: linux-kernel@vger.kernel.org
CC: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06 16:53:36 -05:00
Gavin Shan
367d56f7b4 net/mlx4: Support shutdown() interface
In kexec scenario, we failed to load the mlx4 driver in the
second kernel because the ownership bit was hold by the first
kernel without release correctly.

The patch adds shutdown() interface so that the ownership can
be released correctly in the first kernel. It also helps avoiding
EEH error happened during boot stage of the second kernel because
of undesired traffic, which can't be handled by hardware during
that stage on Power platform.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Tested-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-05 20:40:24 -05:00
David S. Miller
67ddc87f16 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/wireless/ath/ath9k/recv.c
	drivers/net/wireless/mwifiex/pcie.c
	net/ipv6/sit.c

The SIT driver conflict consists of a bug fix being done by hand
in 'net' (missing u64_stats_init()) whilst in 'net-next' a helper
was created (netdev_alloc_pcpu_stats()) which takes care of this.

The two wireless conflicts were overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-05 20:32:02 -05:00
Eyal Perry
9717218bb2 net/mlx4_en: Change Connect-X description in kconfig
The mlx4_en driver support also 1Gbit and 40Gbit Ethernet devices,
changed the driver description in the menuconfig to reflect that.

Signed-off-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-02 20:04:01 -05:00
Amir Vadai
ec5709403e net/mlx4_en: Use union for BlueFlame WQE
When BlueFlame is turned on, control segment of the TX WQE is changed,
and the second line of it is used for QPN.
Changed code to use a union in the mlx4_wqe_ctrl_seg instead of casting.
This makes the code clearer and solves the static checker warning:

drivers/net/ethernet/mellanox/mlx4/en_tx.c:839 mlx4_en_xmit()
	warn: potential memory corrupting cast 4 vs 2 bytes

CC: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-02 20:04:01 -05:00
Eyal Perry
28d222bbaa net/mlx4_core: Fix sparse warning
This patch force conversion to u32 to fix the following sparse warning:
drivers/net/ethernet/mellanox/mlx4/fw.c:1822:53: warning: restricted __be32
degrades to integer

Casting to u32 is safe here, because token will be returned as is
from the hardware without any modification.

Signed-off-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-02 20:04:01 -05:00
Amir Vadai
313c2d375b net/mlx4_en: Fix selftest failing on non 10G link speed
Connect-X devices selftest speed test shouldn't fail on 1G and 40G link
speeds.

Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-02 20:04:01 -05:00
Eugenia Emantayev
9813337a4b net/mlx4: Replace mlx4_en_mac_to_u64() with mlx4_mac_to_u64()
Currently, the EN driver uses a private static function
mlx4_en_mac_to_u64(). Move it to a common include file (driver.h)
for mlx4_en and mlx4_ib for further use.

Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-02 20:04:01 -05:00
Eugenia Emantayev
15bffdffcc net/mlx4_en: Move queue stopped/waked counters to be per ring
Give accurate counters and avoids cache misses when several rings
update the counters of stop/wake queue.

Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-02 20:04:01 -05:00
Eugenia Emantayev
93591aaa62 net/mlx4_en: Pad ethernet packets smaller than 17 bytes
Hardware can't accept packets smaller than 17 bytes. Therefore need to
pad with zeros.

Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-02 20:04:00 -05:00
Eugenia Emantayev
b97b33a3df net/mlx4_en: Verify mlx4_en module parameters
Verify mlx4_en module parameters.
In case they are out of range - reset to default values.

Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-02 20:04:00 -05:00
Amir Vadai
fd8daa45f2 net/mlx4_en: Fix UP limit in ieee_ets->prio_tc
User priority limit has to be less than MLX4_EN_NUM_UP.

Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-02 20:04:00 -05:00
Amir Vadai
ca9f9f7039 net/mlx4_en: Fix bad use of dev_id
dev_id should be set for multiple netdev's sharing the same MAC, which
is not the case here.

Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26 15:38:07 -05:00
Amir Vadai
76a066f2a2 net/mlx4_en: Expose port number through sysfs
Initialize dev_port with port number (0 based) to be accessed through
sysfs from user space.

Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26 15:38:06 -05:00
Amir Vadai
169a1d85d0 net,IB/mlx: Bump all Mellanox driver versions
Bump all Mellanox driver versions.

Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-25 17:34:44 -05:00
Ido Shamay
bb2146bc88 net/mlx4: Fix limiting number of IRQ's instead of RSS queues
This fix a performance bug introduced by commit 90b1ebe "mlx4: set
maximal number of default RSS queues", which limits the numbers of IRQs
opened by core module.
The limit should be on the number of queues in the indirection table -
rx_rings, and not on the number of IRQ's. Also, limiting on mlx4_core
initialization instead of in mlx4_en, prevented using "ethtool -L" to
utilize all the CPU's, when performance mode is prefered, since limiting
this number to 8 reduces overall packet rate by 15%-50% in multiple TCP
streams applications.

For example, after running ethtool -L <ethx> rx 16

          Packet rate
Before the fix  897799
After the fix   1142070

Results were obtained using netperf:

S=200 ; ( for i in $(seq 1 $S) ; do ( \
  netperf -H 11.7.13.55 -t TCP_RR -l 30 &) ; \
  wait ; done | grep "1        1" | awk '{SUM+=$6} END {print SUM}' )

CC: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 18:38:14 -05:00
Ido Shamay
0251248232 net/mlx4: Set number of RX rings in a utility function
mlx4_en_add() is too long.
Moving set number of RX rings to a utiltity function to improve
readability and modulization of the code.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24 18:38:01 -05:00
David S. Miller
1e8d6421cf Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/bonding/bond_3ad.h
	drivers/net/bonding/bond_main.c

Two minor conflicts in bonding, both of which were overlapping
changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-19 01:24:22 -05:00
Linus Torvalds
b0d3f6d47e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) kvaser CAN driver has fixed limits of some of it's table, validate
    that we won't exceed those limits at probe time.  Fix from Olivier
    Sobrie.

 2) Fix rtl8192ce disabling interrupts for too long, from Olivier
    Langlois.

 3) Fix botched shift in ath5k driver, from Dan Carpenter.

 4) Fix corruption of deferred packets in TIPC, from Erik Hugne.

 5) Fix newlink error path in macvlan driver, from Cong Wang.

 6) Fix netpoll deadlock in bonding, from Ding Tianhong.

 7) Handle GSO packets properly in forwarding path when fragmentation is
    necessary on egress, from Florian Westphal.

 8) Fix axienet build errors, from Michal Simek.

 9) Fix refcounting of ubufs on tx in vhost net driver, from Michael S
    Tsirkin.

10) Carrier status isn't set properly in hyperv driver, from Haiyang
    Zhang.

11) Missing pci_disable_device() in tulip_remove_one), from Ingo Molnar.

12) AF_PACKET qdisc bypass mode doesn't adhere to driver provided TX
    queue selection method.  Add a fallback method mechanism to fix this
    bug, from Daniel Borkmann.

13) Fix regression in link local route handling on GRE tunnels, from
    Nicolas Dichtel.

14) Bonding can assign dup aggregator IDs in some sequences of
    configuration, fix by making the allocation counter per-bond instead
    of global.  From Jiri Bohac.

15) sctp_connectx() needs compat translations, from Daniel Borkmann.

16) Fix of_mdio PHY interrupt parsing, from Ben Dooks

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (62 commits)
  MAINTAINERS: add entry for the PHY library
  of_mdio: fix phy interrupt passing
  net: ethernet: update dependency and help text of mvneta
  NET: fec: only enable napi if we are successful
  af_packet: remove a stray tab in packet_set_ring()
  net: sctp: fix sctp_connectx abi for ia32 emulation/compat mode
  ipv4: fix counter in_slow_tot
  irtty-sir.c: Do not set_termios() on irtty_close()
  bonding: 802.3ad: make aggregator_identifier bond-private
  usbnet: remove generic hard_header_len check
  gre: add link local route when local addr is any
  batman-adv: fix potential kernel paging error for unicast transmissions
  batman-adv: avoid double free when orig_node initialization fails
  batman-adv: free skb on TVLV parsing success
  batman-adv: fix TT CRC computation by ensuring byte order
  batman-adv: fix potential orig_node reference leak
  batman-adv: avoid potential race condition when adding a new neighbour
  batman-adv: properly check pskb_may_pull return value
  batman-adv: release vlan object after checking the CRC
  batman-adv: fix TT-TVLV parsing on OGM reception
  ...
2014-02-18 15:52:43 -08:00
Alexander Gordeev
f3c9407bc2 mlx5: Use pci_enable_msix_range() instead of pci_enable_msix()
As result of deprecation of MSI-X/MSI enablement functions
pci_enable_msix() and pci_enable_msi_block() all drivers
using these two interfaces need to be updated to use the
new pci_enable_msi_range() and pci_enable_msix_range()
interfaces.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Eli Cohen <eli@mellanox.com>
Cc: linux-rdma@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-18 15:33:32 -05:00
Alexander Gordeev
66e2f9c1de mlx4: Use pci_enable_msix_range() instead of pci_enable_msix()
As result of deprecation of MSI-X/MSI enablement functions
pci_enable_msix() and pci_enable_msi_block() all drivers
using these two interfaces need to be updated to use the
new pci_enable_msi_range() and pci_enable_msix_range()
interfaces.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Amir Vadai <amirv@mellanox.com>
Cc: netdev@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Acked-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-18 15:33:32 -05:00
Daniel Borkmann
99932d4fc0 netdevice: add queue selection fallback handler for ndo_select_queue
Add a new argument for ndo_select_queue() callback that passes a
fallback handler. This gets invoked through netdev_pick_tx();
fallback handler is currently __netdev_pick_tx() as most drivers
invoke this function within their customized implementation in
case for skbs that don't need any special handling. This fallback
handler can then be replaced on other call-sites with different
queue selection methods (e.g. in packet sockets, pktgen etc).

This also has the nice side-effect that __netdev_pick_tx() is
then only invoked from netdev_pick_tx() and export of that
function to modules can be undone.

Suggested-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-17 00:36:34 -05:00
Eli Cohen
0861565f50 IB/mlx5: Remove dependency on X86
Remove Kconfig dependency of mlx5_ib/mlx5_core on X86, since there is
no such dependency in reality.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-02-13 20:48:02 -08:00
Linus Torvalds
4ba9920e5e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next
Pull networking updates from David Miller:

 1) BPF debugger and asm tool by Daniel Borkmann.

 2) Speed up create/bind in AF_PACKET, also from Daniel Borkmann.

 3) Correct reciprocal_divide and update users, from Hannes Frederic
    Sowa and Daniel Borkmann.

 4) Currently we only have a "set" operation for the hw timestamp socket
    ioctl, add a "get" operation to match.  From Ben Hutchings.

 5) Add better trace events for debugging driver datapath problems, also
    from Ben Hutchings.

 6) Implement auto corking in TCP, from Eric Dumazet.  Basically, if we
    have a small send and a previous packet is already in the qdisc or
    device queue, defer until TX completion or we get more data.

 7) Allow userspace to manage ipv6 temporary addresses, from Jiri Pirko.

 8) Add a qdisc bypass option for AF_PACKET sockets, from Daniel
    Borkmann.

 9) Share IP header compression code between Bluetooth and IEEE802154
    layers, from Jukka Rissanen.

10) Fix ipv6 router reachability probing, from Jiri Benc.

11) Allow packets to be captured on macvtap devices, from Vlad Yasevich.

12) Support tunneling in GRO layer, from Jerry Chu.

13) Allow bonding to be configured fully using netlink, from Scott
    Feldman.

14) Allow AF_PACKET users to obtain the VLAN TPID, just like they can
    already get the TCI.  From Atzm Watanabe.

15) New "Heavy Hitter" qdisc, from Terry Lam.

16) Significantly improve the IPSEC support in pktgen, from Fan Du.

17) Allow ipv4 tunnels to cache routes, just like sockets.  From Tom
    Herbert.

18) Add Proportional Integral Enhanced packet scheduler, from Vijay
    Subramanian.

19) Allow openvswitch to mmap'd netlink, from Thomas Graf.

20) Key TCP metrics blobs also by source address, not just destination
    address.  From Christoph Paasch.

21) Support 10G in generic phylib.  From Andy Fleming.

22) Try to short-circuit GRO flow compares using device provided RX
    hash, if provided.  From Tom Herbert.

The wireless and netfilter folks have been busy little bees too.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2064 commits)
  net/cxgb4: Fix referencing freed adapter
  ipv6: reallocate addrconf router for ipv6 address when lo device up
  fib_frontend: fix possible NULL pointer dereference
  rtnetlink: remove IFLA_BOND_SLAVE definition
  rtnetlink: remove check for fill_slave_info in rtnl_have_link_slave_info
  qlcnic: update version to 5.3.55
  qlcnic: Enhance logic to calculate msix vectors.
  qlcnic: Refactor interrupt coalescing code for all adapters.
  qlcnic: Update poll controller code path
  qlcnic: Interrupt code cleanup
  qlcnic: Enhance Tx timeout debugging.
  qlcnic: Use bool for rx_mac_learn.
  bonding: fix u64 division
  rtnetlink: add missing IFLA_BOND_AD_INFO_UNSPEC
  sfc: Use the correct maximum TX DMA ring size for SFC9100
  Add Shradha Shah as the sfc driver maintainer.
  net/vxlan: Share RX skb de-marking and checksum checks with ovs
  tulip: cleanup by using ARRAY_SIZE()
  ip_tunnel: clear IPCB in ip_tunnel_xmit() in case dst_link_failure() is called
  net/cxgb4: Don't retrieve stats during recovery
  ...
2014-01-25 11:17:34 -08:00
Roland Dreier
fb1b5034e4 Merge branch 'ip-roce' into for-next
Conflicts:
	drivers/infiniband/hw/mlx4/main.c
2014-01-22 23:24:21 -08:00
Roland Dreier
8f399921ea Merge branches 'cma', 'cxgb4', 'flowsteer', 'ipoib', 'misc', 'mlx4', 'mlx5', 'ocrdma', 'qib', 'srp' and 'usnic' into for-next 2014-01-22 23:24:13 -08:00
Eli Cohen
1bde6e301c IB/mlx5: Abort driver cleanup if teardown hca fails
Do not continue with cleanup flow. If this ever happens we can check which
resources remained open.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-01-22 23:23:53 -08:00
Eli Cohen
05bdb2ab6b mlx5_core: Fix PowerPC support
1. Fix derivation of sub-page index from the dma address in free_4k.
2. Fix the DMA address passed to dma_unmap_page by masking it properly.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-01-22 23:23:52 -08:00
Eli Cohen
db81a5c374 mlx5_core: Improve debugfs readability
Use strings to display transport service or state of QPs.  Use numeric
value for MTU of a QP.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-01-22 23:23:50 -08:00
Eli Cohen
bde51583f4 IB/mlx5: Add support for resize CQ
Implement resize CQ which is a mandatory verb in mlx5.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-01-22 23:23:50 -08:00
Eli Cohen
3bdb31f688 IB/mlx5: Implement modify CQ
Modify CQ is used by ULPs like IPoIB to change moderation parameters.  This
patch adds support in mlx5.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-01-22 23:23:49 -08:00
Eli Cohen
042b9adae8 mlx5_core: Use mlx5 core style warning
Use mlx5_core_warn(), which is the standard warning emitter function, instead
of pr_warn().

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-01-22 23:23:47 -08:00
Eli Cohen
0b6e81b910 IB/mlx5: Clear out struct before create QP command
Output structs are expected by firmware to be cleared when a command is called.
Clear the "out" struct instead of "dout" which is used only later.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-01-22 23:23:46 -08:00
Haggai Eran
e08a8761d8 mlx5_core: Fix out arg size in access_register command
The output size should be the sum of the core access reg output struct
plus the size of the specific register data provided by the caller.

Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-01-22 23:23:44 -08:00
Moni Shoua
6cd28f044b net/mlx4_core: Remove unnecessary validation for port number
This is a fix to a regression introduced by commit:
"982290a net/mlx4_core: Check port number for validity
before accessing data"

IPoIB could not attach to multicast group and we get this in dmesg:
[144214.145008] ib0: failed to attach to multicast group, ret = -22
[144214.145016] ib0: couldn't attach QP to multicast group ff12:401b:ffff:0000:0000:0000:ffff:ffff
[144214.145019] ib0: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -22

The cause to the problem is because port is extracted from gid[5].
Which is only valid for Ethernet.
Removed this validation in mlx4_qp_attach_common(), which is accessed
from both Ethernet and IB flows.
Error flow for bad port value in Ethernet is already exists in that
function.

Signed-off-by: Moni Shoua <monis@mellanox.co.il>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-21 23:14:27 -08:00
Moni Shoua
297e0dad72 IB/mlx4: Handle Ethernet L2 parameters for IP based GID addressing
IP based RoCE gids don't store Ethernet L2 parameters, MAC and VLAN.

Therefore, we need to extract them from the CQE and place them in
struct ib_wc (to be used for cases were they were taken from the gid).

Also, when modifying a QP or building address handle, instead of
parsing the dgid to get the MAC and VLAN, take them from the address
handle attributes.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-01-18 14:12:53 -08:00
Paul Bolle
f088cbb8d8 net/mlx4_core: clean up srq_res_start_move_to()
Building resource_tracker.o triggers a GCC warning:
    drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'mlx4_HW2SW_SRQ_wrapper':
    drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:3202:17: warning: 'srq' may be used uninitialized in this function [-Wmaybe-uninitialized]
      atomic_dec(&srq->mtt->ref_count);
                     ^

This is a false positive. But a cleanup of srq_res_start_move_to() can
help GCC here. The code currently uses a switch statement where a plain
if/else would do, since only two of the switch's four cases can ever
occur. Dropping that switch makes the warning go away.

While we're at it, add some missing braces, and convert state to the
correct type.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16 16:04:47 -08:00
Paul Bolle
c9218a9e67 net/mlx4_core: clean up cq_res_start_move_to()
Building resource_tracker.o triggers a GCC warning:
    drivers/net/ethernet/mellanox/mlx4/resource_tracker.c: In function 'mlx4_HW2SW_CQ_wrapper':
    drivers/net/ethernet/mellanox/mlx4/resource_tracker.c:3019:16: warning: 'cq' may be used uninitialized in this function [-Wmaybe-uninitialized]
      atomic_dec(&cq->mtt->ref_count);
                    ^

This is a false positive. But a cleanup of cq_res_start_move_to() can
help GCC here. The code currently uses a switch statement where an
if/else construct would do too, since only two of the switch's four
cases can ever occur. Dropping that switch makes the warning go away.

While we're at it, add some missing braces.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16 16:04:47 -08:00
Paul Gortmaker
a81ab36bf5 drivers/net: delete non-required instances of include <linux/init.h>
None of these files are actually using any __init type directives
and hence don't need to include <linux/init.h>.   Most are just a
left over from __devinit and __cpuinit removal, or simply due to
code getting copied from one driver to the next.

This covers everything under drivers/net except for wireless, which
has been submitted separately.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-16 11:53:26 -08:00
David S. Miller
0a379e21c5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2014-01-14 14:42:42 -08:00
Matan Barak
4de6580360 mlx4_core: Add support for steerable IB UD QPs
This patch adds support for allocating IB UD QPs that we can steer
traffic from.  We introduce a new firmware command FLOW_STEERING_IB_UC_QP_RANGE
and a capability bit.

This command isn't supported for VFs.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-01-14 14:06:50 -08:00
Dan Carpenter
24e42754f6 mlx5_core: Remove dead code
Remove leftover of debug code.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-01-14 13:54:23 -08:00
Eric Dumazet
e6a7675829 net/mlx4_en: call gro handler for encapsulated frames
In order to use the native GRO handling of encapsulated protocols on
mlx4, we need to call napi_gro_receive() instead of netif_receive_skb()
unless busy polling is in action.

While we are at it, rename mlx4_en_cq_ll_polling() to
mlx4_en_cq_busy_polling()

Tested with GRE tunnel : GRO aggregation is now performed on the
ethernet device instead of being done later on gre device.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Amir Vadai <amirv@mellanox.com>
Cc: Jerry Chu <hkchu@google.com>
Cc: Or Gerlitz <ogerlitz@mellanox.com>
Acked-By: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-13 11:41:47 -08:00
Jason Wang
f663dd9aaf net: core: explicitly select a txq before doing l2 forwarding
Currently, the tx queue were selected implicitly in ndo_dfwd_start_xmit(). The
will cause several issues:

- NETIF_F_LLTX were removed for macvlan, so txq lock were done for macvlan
  instead of lower device which misses the necessary txq synchronization for
  lower device such as txq stopping or frozen required by dev watchdog or
  control path.
- dev_hard_start_xmit() was called with NULL txq which bypasses the net device
  watchdog.
- dev_hard_start_xmit() does not check txq everywhere which will lead a crash
  when tso is disabled for lower device.

Fix this by explicitly introducing a new param for .ndo_select_queue() for just
selecting queues in the case of l2 forwarding offload. netdev_pick_tx() was also
extended to accept this parameter and dev_queue_xmit_accel() was used to do l2
forwarding transmission.

With this fixes, NETIF_F_LLTX could be preserved for macvlan and there's no need
to check txq against NULL in dev_hard_start_xmit(). Also there's no need to keep
a dedicated ndo_dfwd_start_xmit() and we can just reuse the code of
dev_queue_xmit() to do the transmission.

In the future, it was also required for macvtap l2 forwarding support since it
provides a necessary synchronization method.

Cc: John Fastabend <john.r.fastabend@intel.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: e1000-devel@lists.sourceforge.net
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-10 13:23:08 -05:00
Shawn Bohrer
74b9c3ea84 mlx4_en: Select PTP_1588_CLOCK
Now that mlx4_en includes a PHC driver it must select PTP_1588_CLOCK.

   drivers/built-in.o: In function `mlx4_en_get_ts_info':
>> en_ethtool.c:(.text+0x391a11): undefined reference to `ptp_clock_index'
   drivers/built-in.o: In function `mlx4_en_remove_timestamp':
>> (.text+0x397913): undefined reference to `ptp_clock_unregister'
   drivers/built-in.o: In function `mlx4_en_init_timestamp':
>> (.text+0x397b20): undefined reference to `ptp_clock_register'

Fixes: ad7d4eaed9 ("mlx4_en: Add PTP hardware clock")
Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-07 16:23:45 -05:00
Wei Yongjun
9ba75fb0c4 net/mlx4_en: fix error return code in mlx4_en_get_qp()
Fix to return a negative error code from the error handling
case instead of 0.

Fixes: 837052d0cc ('net/mlx4_en: Add netdev support for TCP/IP offloads of vxlan tunneling')
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-07 15:42:52 -05:00
Eyal Perry
b912b2f8fc net/mlx4_core: Warn if device doesn't have enough PCI bandwidth
Check if the device get enough bandwidth from the entire PCI chain to satisfy
its capabilities. This patch determines the PCIe device's bandwidth capabilities
by reading its PCIe Link Capabilities registers and then call the
pcie_get_minimum_link function to ensure that the adapter is hooked into a slot
which is capable of providing the necessary bandwidth capabilities.

Signed-off-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-05 20:37:05 -05:00
Shawn Bohrer
2156d9a8ac mlx4_en: Only cycle port if HW timestamp config changes
If the hwtstamp_config matches what is currently set for the device then
simply return.  Without this change any program that tries to enable
hardware timestamps will cause the link to cycle even if hardware
timstamps were already enabled.

Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
Acked-By: Hadar Hen Zion <hadarh@mellanox.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-02 03:30:36 -05:00
Shawn Bohrer
ad7d4eaed9 mlx4_en: Add PTP hardware clock
This adds a PHC to the mlx4_en driver. We use reader/writer spinlocks to
protect the timecounter since every packet received needs to call
timecounter_cycle2time() when timestamping is enabled.  This can become
a performance bottleneck with RSS and multiple receive queues if normal
spinlocks are used.

This driver has been tested with both Documentation/ptp/testptp and the
linuxptp project (http://linuxptp.sourceforge.net/) on a Mellanox
ConnectX-3 card.

Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
Acked-By: Hadar Hen Zion <hadarh@mellanox.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-02 03:30:36 -05:00
dingtianhong
c0623e587d net: mlx4: slight optimization of addr compare
Use possibly more efficient ether_addr_equal
to instead of memcmp.

Cc: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Acked-By: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-31 16:48:31 -05:00
Or Gerlitz
837052d0cc net/mlx4_en: Add netdev support for TCP/IP offloads of vxlan tunneling
When the device tunneling offloads mode is vxlan do the following

 - call SET_PORT with the relevant setting

 - add DMFS steering vxlan rule for the device self and multicast mac addresses
   of the form: {<ETH, outer-mac> <VXLAN, ANY vnid> <ETH, ANY mac>} --> RSS QP

 - set relevant QPC fields in RSS context and RX ring QPs

 - in TX flow, set WQE fields to generate HW checksum, and handle gso skbs
   which are marked for encapsulation such that the HW will segment them properly.

 - in RX flow, read HW offloaded checksum for encapsulated packets from the CQE

 - advertize hw_enc_features and NETIF_F_GSO_UDP_TUNNEL to the networking stack

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-31 14:31:43 -05:00
Or Gerlitz
7ffdf726cf net/mlx4_core: Add basic support for TCP/IP offloads under tunneling
Add the low-level device commands and definitions used for TCP/IP HW offloads
of tunneled/vxlan traffic which are supported by the ConnectX3-pro NIC.

This is done through the following elements:

 - read tunneling device caps in QUERY_DEV_CAP
 - add helper function to do SET_PORT for tunneling
 - add DMFS VXLAN steering rule definitions
 - add CQE and WQE checksum offload field definitions

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-31 14:31:43 -05:00
Matan Barak
982290a7fe net/mlx4_core: Check port number for validity before accessing data
Need to validate port number at mlx4_promisc_qp() before use.
Since port number is extracted from gid, as a cooked or corrupted gid
could lead to a crash.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-19 19:04:44 -05:00
Eugenia Emantayev
0276a33061 net/mlx4_en: Add NAPI support for transmit side
Add NAPI for TX side,
implement its support and provide NAPI callback.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-19 19:04:44 -05:00
Eugenia Emantayev
e4b59a1cb6 net/mlx4_en: Ignore irrelevant hypervisor events
MLX4_DEV_EVENT_SLAVE_INIT and MLX4_DEV_EVENT_SLAVE_SHUTDOWN
events used by Hypervisor to inform the PPF IB driver that
IB para-virtualization must be initialized/destroyed for a slave.
If this event is catched by ETH VF annoying but harmless error message
is printed into dmesg. Remove dmesg prints for these events.

Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-19 19:04:44 -05:00
Eyal Perry
be902ab122 net/mlx4_core: Set CQE/EQE size to 64B by default
To achieve out of the box performance default is to use 64 byte CQE/EQE.
In tests that we conduct in our labs, we achieved a performance
improvement of twice the message rate. For older VF/libmlx4 support,
enable_64b_cqe_eqe must be set to 0 (disabled).

Signed-off-by: Eyal Perry <eyalpe@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-19 19:04:44 -05:00
Ido Shamay
d03a68f821 net/mlx4_en: Configure the XPS queue mapping on driver load
Only TX rings of User Piority 0 are mapped.
TX rings of other UP's are using UP 0 mapping.
XPS is not in use when num_tc is set.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-19 19:04:44 -05:00
Hadar Hen Zion
84c864038d net/mlx4_en: Implement ndo_get_phys_port_id
Use the port GUID read from the firmware to identify the physical port.
This port identifier is available via ndo_get_phys_port_id for both PF
and VF net-devices.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-19 19:04:43 -05:00
Hadar Hen Zion
8e1a28e8e6 net/mlx4_core: Expose physical port id as PF/VF capability
Add the infrastructure needed to support ndo_get_phys_port_id which
allows users to identify to which physical port a net-device is connected
to by reading a unique port id.
This will work for VFs and PFs.
The driver uses a new device capability - phys_port_id, The PF driver
reads the port phys_port_id from Firmware and stores it. The VF driver
reads the port phys_port_id from the PF using QUERY_FUNC_CAP command.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-19 19:04:43 -05:00
Hadar Hen Zion
eb17711bc1 net/mlx4_core: Introduce nic_info new flag in QUERY_FUNC_CAP
Set nic_info field in QUERY_FUNC_CAP, which designates
supplementary NIC information is provided by the hypervisor.
When set, the following fields are valid: nic_num_rings,
nic_indirection_tbl_sz, cur_mac and phys_port_id.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-19 19:04:43 -05:00
Hadar Hen Zion
73e74ab4e0 net/mlx4_core: Rename QUERY_FUNC_CAP fields
Use correct names for QUERY_FUNC_CAP fields: flags0 and flags1.

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-19 19:04:43 -05:00
Hadar Hen Zion
7b25d81b7f net/mlx4_core: Remove zeroed out of explicit QUERY_FUNC_CAP fields
All mailboxes are already zeroed by commit:
571b8b9 net/mlx4_core: Initialize all mailbox buffers to zero before use
Remove explicit zero set for force mac and force vlan fields in
mlx4_QUERY_FUNC_CAP_wrapper

Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-19 19:04:43 -05:00
Tom Herbert
6917441603 net: mlx4 calls skb_set_hash
Drivers should call skb_set_hash to set the hash and its type
in an skbuff.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-18 15:00:52 -05:00
Jack Morgenstein
7c6d74d23a mlx4_core: Roll back round robin bitmap allocation commit for CQs, SRQs, and MPTs
Commit f4ec9e9 "mlx4_core: Change bitmap allocator to work in round-robin fashion"
introduced round-robin allocation (via bitmap) for all resources which allocate
via a bitmap.

Round robin allocation is desirable for mcgs, counters, pd's, UARs, and xrcds.
These are simply numbers, with no involvement of ICM memory mapping.

Round robin is required for QPs, since we had a problem with immediate
reuse of a 24-bit QP number (commit f4ec9e9).

However, for other resources which use the bitmap allocator and involve
mapping ICM memory -- MPTs, CQs, SRQs -- round-robin is not desirable.

What happens in these cases is the following:

ICM memory is allocated and mapped in chunks of 256K.

Since the resource allocation index goes up monotonically, the allocator
will eventually require mapping a new chunk. Now, chunks are also unmapped
when their reference count goes back to zero.  Thus, if a single app is
running and starts/exits frequently we will have the following situation:

When the app starts, a new chunk must be allocated and mapped.

When the app exits, the chunk reference count goes back to zero, and the
chunk is unmapped and freed. Therefore, the app must pay the cost of allocation
and mapping of ICM memory each time it runs (although the price is paid only when
allocating the initial entry in the new chunk).

For apps which allocate MPTs/SRQs/CQs and which operate as described above,
this presented a performance problem.

We therefore roll back the round-robin allocator modification for MPTs, CQs, SRQs.

Reported-by: Matthew Finlay <matt@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-09 21:12:13 -05:00
David S. Miller
426e1fa31e Merge branch 'siocghwtstamp' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next
Ben Hutchings says:

====================
SIOCGHWTSTAMP ioctl

1. Add the SIOCGHWTSTAMP ioctl and update the timestamping
documentation.
2. Implement SIOCGHWTSTAMP in most drivers that support SIOCSHWTSTAMP.
3. Add a test program to exercise SIOC{G,S}HWTSTAMP.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-05 19:45:14 -05:00
Wei Yang
1b85ee09aa net/mlx4_core: destroy workqueue when driver fails to register
When driver registration fails, we need to clean up the resources allocated
before. mlx4_core missed destroying the workqueue allocated.

This patch destroys the workqueue when registration fails.

Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-03 11:55:44 -05:00
Eugenia Emantayev
833846e8fa net/mlx4_en: Remove selftest TX queues empty condition
Remove waiting for TX queues to become empty during selftest.
This check is not necessary for any purpose, and might put
the driver into an infinite loop.

Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-01 20:36:07 -05:00
Ben Hutchings
100dbda8e4 mlx4_en: Implement the SIOCGHWTSTAMP ioctl
Compile-tested only.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
2013-11-21 17:17:40 +00:00
Linus Torvalds
1ea406c0e0 Main batch of InfiniBand/RDMA changes for 3.13:
- Re-enable flow steering verbs with new improved userspace ABI
  - Fixes for slow connection due to GID lookup scalability
  - IPoIB fixes
  - Many fixes to HW drivers including mlx4, mlx5, ocrdma and qib
  - Further improvements to SRP error handling
  - Add new transport type for Cisco usNIC
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQIcBAABCAAGBQJSil7BAAoJEENa44ZhAt0hbtgP/A+AmUalbOX6ZKzuOFxsrtY2
 r55CX9b1JBeFM/Zhn2o6y+81lpCjkckJSggESMe4izNgocGw0nW4vYGN4SBynatj
 y8sR9OSn+G3ihuENrzG41MJUGEa5WbcNMy4boN+Oa+qyTlV/WjLR7Fv4WbikK7Wm
 o8FNlXiiDhMoGfHHG5J0MD0EQsnxuLDk2XP+ciu4tLtTs+wBka+gFK8WnMvztle3
 gTeMNna5ilvCS2fdBxteuPA3KeDnJE9AgJSMJ2a4Rh+DR8uTgWYQ6n7amjmOc546
 yhAKkoBkxPE10+Yj82WOPhCFxSeWcuSwJvpgv5dTVZ1XqUUcC1V3TEcZDHmyyHQ7
 uPXgS1A+erBW3OYPBjZqtKvnHObscV12fL+rId3vIhcAQIbFroci08ZwPidEYRkn
 fvwlEKcrIsBIpRXEyjlFCxsiiDnfq1wC1VayMR3jrIK0P6idf1SXf/geiRp9+RGT
 wKUc0j51jvEx29qc65xuhEP9FQV9pCMxyd+FEE0d0KkjMz5hsIkjmcUcBbgF0CGg
 GEyDPlgRLv+vmWDGpT8XraaV/0CJOEQDIgB4WSN87/AZ4UoNt7spW2xqsLsp1toy
 5e0100tpWUleTPLe/Wig5GtBdagQ2jAUK1+186CP93pFPtkwc4/7X3hyp7qPIPTz
 VDvT9DEy6zjSMCLpMcdo
 =xxC+
 -----END PGP SIGNATURE-----

Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband

Pull infiniband/rdma updates from Roland Dreier:
 - Re-enable flow steering verbs with new improved userspace ABI
 - Fixes for slow connection due to GID lookup scalability
 - IPoIB fixes
 - Many fixes to HW drivers including mlx4, mlx5, ocrdma and qib
 - Further improvements to SRP error handling
 - Add new transport type for Cisco usNIC

* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (66 commits)
  IB/core: Re-enable create_flow/destroy_flow uverbs
  IB/core: extended command: an improved infrastructure for uverbs commands
  IB/core: Remove ib_uverbs_flow_spec structure from userspace
  IB/core: Use a common header for uverbs flow_specs
  IB/core: Make uverbs flow structure use names like verbs ones
  IB/core: Rename 'flow' structs to match other uverbs structs
  IB/core: clarify overflow/underflow checks on ib_create/destroy_flow
  IB/ucma: Convert use of typedef ctl_table to struct ctl_table
  IB/cm: Convert to using idr_alloc_cyclic()
  IB/mlx5: Fix page shift in create CQ for userspace
  IB/mlx4: Fix device max capabilities check
  IB/mlx5: Fix list_del of empty list
  IB/mlx5: Remove dead code
  IB/core: Encorce MR access rights rules on kernel consumers
  IB/mlx4: Fix endless loop in resize CQ
  RDMA/cma: Remove unused argument and minor dead code
  RDMA/ucma: Discard events for IDs not yet claimed by user space
  IB/core: Add Cisco usNIC rdma node and transport types
  RDMA/nes: Remove self-assignment from nes_query_qp()
  IB/srp: Report receive errors correctly
  ...
2013-11-18 15:36:04 -08:00
Linus Torvalds
9073e1a804 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina:
 "Usual earth-shaking, news-breaking, rocket science pile from
  trivial.git"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (23 commits)
  doc: usb: Fix typo in Documentation/usb/gadget_configs.txt
  doc: add missing files to timers/00-INDEX
  timekeeping: Fix some trivial typos in comments
  mm: Fix some trivial typos in comments
  irq: Fix some trivial typos in comments
  NUMA: fix typos in Kconfig help text
  mm: update 00-INDEX
  doc: Documentation/DMA-attributes.txt fix typo
  DRM: comment: `halve' -> `half'
  Docs: Kconfig: `devlopers' -> `developers'
  doc: typo on word accounting in kprobes.c in mutliple architectures
  treewide: fix "usefull" typo
  treewide: fix "distingush" typo
  mm/Kconfig: Grammar s/an/a/
  kexec: Typo s/the/then/
  Documentation/kvm: Update cpuid documentation for steal time and pv eoi
  treewide: Fix common typo in "identify"
  __page_to_pfn: Fix typo in comment
  Correct some typos for word frequency
  clk: fixed-factor: Fix a trivial typo
  ...
2013-11-15 16:47:22 -08:00
Eli Cohen
2b136d0253 IB/mlx5: Fix list_del of empty list
For archs with pages size of 4K, when the chunk is freed, fwp is not in the
list so avoid attempting to delete it.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-11-15 14:36:35 -08:00
Eli Cohen
1b77d2bd75 mlx5: Use enum to indicate adapter page size
The Connect-IB adapter has an inherent page size which equals 4K.
Define an new enum that equals the page shift and use it instead of
using the value 12 throughout the code.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-11-08 14:43:01 -08:00
Moshe Lazer
4e3d677ba9 mlx5_core: Change optimal_reclaimed_pages for better performance
Change optimal_reclaimed_pages() to increase the output size of each
reclaim pages command. This change reduces significantly the amount of
reclaim pages commands issued to FW when the driver is unloaded which
reduces the overall driver unload time.

Signed-off-by: Moshe Lazer <moshel@mellanox.com>
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-11-08 14:43:00 -08:00