Commit Graph

294 Commits

Author SHA1 Message Date
Arthur Kiyanovski
3a7b9d8ddd net: ena: update driver version to 2.0.1
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:13:51 -07:00
Arthur Kiyanovski
f1e90f6e2c net: ena: remove redundant parameter in ena_com_admin_init()
Remove redundant spinlock acquire parameter from ena_com_admin_init()

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:13:51 -07:00
Arthur Kiyanovski
87731f0c68 net: ena: change rx copybreak default to reduce kernel memory pressure
Improves socket memory utilization when receiving packets larger
than 128 bytes (the previous rx copybreak) and smaller than 256 bytes.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:13:51 -07:00
Arthur Kiyanovski
0574bb806d net: ena: limit refill Rx threshold to 256 to avoid latency issues
Currently Rx refill is done when the number of required descriptors is
above 1/8 queue size. With a default of 1024 entries per queue the
threshold is 128 descriptors.
There is intention to increase the queue size to 8196 entries.
In this case threshold of 1024 descriptors is too large and can hurt
latency.
Add another limitation to Rx threshold to be at most 256 descriptors.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:13:51 -07:00
Arthur Kiyanovski
bd791175a6 net: ena: explicit casting and initialization, and clearer error handling
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:13:51 -07:00
Arthur Kiyanovski
cb36bb36e1 net: ena: use CSUM_CHECKED device indication to report skb's checksum status
Set skb->ip_summed to the correct value as reported by the device.
Add counter for the case where rx csum offload is enabled but
device didn't check it.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:13:51 -07:00
Arthur Kiyanovski
38005ca816 net: ena: add functions for handling Low Latency Queues in ena_netdev
This patch includes all code changes necessary in ena_netdev to enable
packet sending via the LLQ placemnt mode.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:13:51 -07:00
Arthur Kiyanovski
689b2bdaaa net: ena: add functions for handling Low Latency Queues in ena_com
This patch introduces APIs for detection, initialization, configuration
and actual usage of low latency queues(LLQ). It extends transmit API with
creation of LLQ descriptors in device memory (which include host buffers
descriptors as well as packet header)

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:13:51 -07:00
Arthur Kiyanovski
a7982b8ec9 net: ena: introduce Low Latency Queues data structures according to ENA spec
Low Latency Queues(LLQ) allow usage of device's memory for descriptors
and headers. Such queues decrease processing time since data is already
located on the device when driver rings the doorbell.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:13:51 -07:00
Arthur Kiyanovski
095f2f1fac net: ena: complete host info to match latest ENA spec
Add new fields and definitions to host info and fill them
according to the latest ENA spec version.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:13:50 -07:00
Arthur Kiyanovski
0e575f8542 net: ena: minor performance improvement
Reduce fastpath overhead by making ena_com_tx_comp_req_id_get() inline.
Also move it to ena_eth_com.h file with its dependency function
ena_com_cq_inc_head().

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-11 10:13:50 -07:00
Arthur Kiyanovski
248ab77342 net: ena: fix auto casting to boolean
Eliminate potential auto casting compilation error.

Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-09 10:49:49 -07:00
Arthur Kiyanovski
78a55d05de net: ena: fix NULL dereference due to untimely napi initialization
napi poll functions should be initialized before running request_irq(),
to handle a rare condition where there is a pending interrupt, causing
the ISR to fire immediately while the poll function wasn't set yet,
causing a NULL dereference.

Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-09 10:49:49 -07:00
Arthur Kiyanovski
d7703ddbd7 net: ena: fix rare bug when failed restart/resume is followed by driver removal
In a rare scenario when ena_device_restore() fails, followed by device
remove, an FLR will not be issued. In this case, the device will keep
sending asynchronous AENQ keep-alive events, even after driver removal,
leading to memory corruption.

Fixes: 8c5c7abdeb ("net: ena: add power management ops to the ENA driver")
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-09 10:49:49 -07:00
Arthur Kiyanovski
d79c3888bd net: ena: fix warning in rmmod caused by double iounmap
Memory mapped with devm_ioremap is automatically freed when the driver
is disconnected from the device. Therefore there is no need to
explicitly call devm_iounmap.

Fixes: 0857d92f71 ("net: ena: add missing unmap bars on device removal")
Fixes: 411838e7b4 ("net: ena: fix rare kernel crash when bar memory remap fails")
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-09 10:49:49 -07:00
David S. Miller
6f41617bf2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Minor conflict in net/core/rtnetlink.c, David Ahern's bug fix in 'net'
overlapped the renaming of a netlink attribute in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-03 21:00:17 -07:00
Eric Dumazet
21627982e4 net: ena: remove ndo_poll_controller
As diagnosed by Song Liu, ndo_poll_controller() can
be very dangerous on loaded hosts, since the cpu
calling ndo_poll_controller() might steal all NAPI
contexts (for all RX/TX queues of the NIC). This capture
can last for unlimited amount of time, since one
cpu is generally not able to drain all the queues under load.

ena uses NAPI for TX completions, so we better let core
networking stack call the napi->poll() to avoid the capture.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Netanel Belgazal <netanel@amazon.com>
Cc: Saeed Bishara <saeedb@amazon.com>
Cc: Zorik Machulsky <zorik@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-28 11:12:29 -07:00
zhong jiang
e641e99f26 net: ethernet: remove redundant include
module.h already contained moduleparam.h,  so it is safe to remove
the redundant include.

The issue is detected with the help of Coccinelle.

Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-18 20:07:27 -07:00
Netanel Belgazal
37dff155dc net: ena: fix incorrect usage of memory barriers
Added memory barriers where they were missing to support multiple
architectures, and removed redundant ones.

As part of removing the redundant memory barriers and improving
performance, we moved to more relaxed versions of memory barriers,
as well as to the more relaxed version of writel - writel_relaxed,
while maintaining correctness.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-09 07:59:56 -07:00
Netanel Belgazal
28abf4e9c9 net: ena: fix missing calls to READ_ONCE
Add READ_ONCE calls where necessary (for example when iterating
over a memory field that gets updated by the hardware).

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-09 07:59:56 -07:00
Netanel Belgazal
944b28aa29 net: ena: fix missing lock during device destruction
acquire the rtnl_lock during device destruction to avoid
using partially destroyed device.

ena_remove() shares almost the same logic as ena_destroy_device(),
so use ena_destroy_device() and avoid duplications.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-09 07:59:56 -07:00
Netanel Belgazal
fe870c77ef net: ena: fix potential double ena_destroy_device()
ena_destroy_device() can potentially be called twice.
To avoid this, check that the device is running and
only then proceed destroying it.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-09 07:59:55 -07:00
Netanel Belgazal
cfa324a514 net: ena: fix device destruction to gracefully free resources
When ena_destroy_device() is called from ena_suspend(), the device is
still reachable from the driver. Therefore, the driver can send a command
to the device to free all resources.
However, in all other cases of calling ena_destroy_device(), the device is
potentially in an error state and unreachable from the driver. In these
cases the driver must not send commands to the device.

The current implementation does not request resource freeing from the
device even when possible. We add the graceful parameter to
ena_destroy_device() to enable resource freeing when possible, and
use it in ena_suspend().

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-09 07:59:55 -07:00
Netanel Belgazal
ef5b0771d2 net: ena: fix driver when PAGE_SIZE == 64kB
The buffer length field in the ena rx descriptor is 16 bit, and the
current driver passes a full page in each ena rx descriptor.
When PAGE_SIZE equals 64kB or more, the buffer length field becomes
zero.
To solve this issue, limit the ena Rx descriptor to use 16kB even
when allocating 64kB kernel pages. This change would not impact ena
device functionality, as 16kB is still larger than maximum MTU.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-09 07:59:55 -07:00
Netanel Belgazal
772ed869f5 net: ena: fix surprise unplug NULL dereference kernel crash
Starting with driver version 1.5.0, in case of a surprise device
unplug, there is a race caused by invoking ena_destroy_device()
from two different places. As a result, the readless register might
be accessed after it was destroyed.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-09 07:59:55 -07:00
David S. Miller
89b1698c93 Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net
The BTF conflicts were simple overlapping changes.

The virtio_net conflict was an overlap of a fix of statistics counter,
happening alongisde a move over to a bonafide statistics structure
rather than counting value on the stack.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-02 10:55:32 -07:00
Gal Pressman
101f0cd4f2 net: ena: Fix use of uninitialized DMA address bits field
UBSAN triggers the following undefined behaviour warnings:
[...]
[   13.236124] UBSAN: Undefined behaviour in drivers/net/ethernet/amazon/ena/ena_eth_com.c:468:22
[   13.240043] shift exponent 64 is too large for 64-bit type 'long long unsigned int'
[...]
[   13.744769] UBSAN: Undefined behaviour in drivers/net/ethernet/amazon/ena/ena_eth_com.c:373:4
[   13.748694] shift exponent 64 is too large for 64-bit type 'long long unsigned int'
[...]

When splitting the address to high and low, GENMASK_ULL is used to generate
a bitmask with dma_addr_bits field from io_sq (in ena_com_prepare_tx and
ena_com_add_single_rx_desc).
The problem is that dma_addr_bits is not initialized with a proper value
(besides being cleared in ena_com_create_io_queue).
Assign dma_addr_bits the correct value that is stored in ena_dev when
initializing the SQ.

Fixes: 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Gal Pressman <pressmangal@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-26 21:30:13 -07:00
Alexander Duyck
8ec56fc3c5 net: allow fallback function to pass netdev
For most of these calls we can just pass NULL through to the fallback
function as the sb_dev. The only cases where we cannot are the cases where
we might be dealing with either an upper device or a driver that would
have configured things to support an sb_dev itself.

The only driver that has any significant change in this patch set should be
ixgbe as we can drop the redundant functionality that existed in both the
ndo_select_queue function and the fallback function that was passed through
to us.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-09 13:57:25 -07:00
Alexander Duyck
4f49dec907 net: allow ndo_select_queue to pass netdev
This patch makes it so that instead of passing a void pointer as the
accel_priv we instead pass a net_device pointer as sb_dev. Making this
change allows us to pass the subordinate device through to the fallback
function eventually so that we can keep the actual code in the
ndo_select_queue call as focused on possible on the exception cases.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-07-09 13:41:34 -07:00
Kees Cook
a86854d0c5 treewide: devm_kzalloc() -> devm_kcalloc()
The devm_kzalloc() function has a 2-factor argument form, devm_kcalloc().
This patch replaces cases of:

        devm_kzalloc(handle, a * b, gfp)

with:
        devm_kcalloc(handle, a * b, gfp)

as well as handling cases of:

        devm_kzalloc(handle, a * b * c, gfp)

with:

        devm_kzalloc(handle, array3_size(a, b, c), gfp)

as it's slightly less ugly than:

        devm_kcalloc(handle, array_size(a, b), c, gfp)

This does, however, attempt to ignore constant size factors like:

        devm_kzalloc(handle, 4 * 1024, gfp)

though any constants defined via macros get caught up in the conversion.

Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.

Some manual whitespace fixes were needed in this patch, as Coccinelle
really liked to write "=devm_kcalloc..." instead of "= devm_kcalloc...".

The Coccinelle script used for this was:

// Fix redundant parens around sizeof().
@@
expression HANDLE;
type TYPE;
expression THING, E;
@@

(
  devm_kzalloc(HANDLE,
-	(sizeof(TYPE)) * E
+	sizeof(TYPE) * E
  , ...)
|
  devm_kzalloc(HANDLE,
-	(sizeof(THING)) * E
+	sizeof(THING) * E
  , ...)
)

// Drop single-byte sizes and redundant parens.
@@
expression HANDLE;
expression COUNT;
typedef u8;
typedef __u8;
@@

(
  devm_kzalloc(HANDLE,
-	sizeof(u8) * (COUNT)
+	COUNT
  , ...)
|
  devm_kzalloc(HANDLE,
-	sizeof(__u8) * (COUNT)
+	COUNT
  , ...)
|
  devm_kzalloc(HANDLE,
-	sizeof(char) * (COUNT)
+	COUNT
  , ...)
|
  devm_kzalloc(HANDLE,
-	sizeof(unsigned char) * (COUNT)
+	COUNT
  , ...)
|
  devm_kzalloc(HANDLE,
-	sizeof(u8) * COUNT
+	COUNT
  , ...)
|
  devm_kzalloc(HANDLE,
-	sizeof(__u8) * COUNT
+	COUNT
  , ...)
|
  devm_kzalloc(HANDLE,
-	sizeof(char) * COUNT
+	COUNT
  , ...)
|
  devm_kzalloc(HANDLE,
-	sizeof(unsigned char) * COUNT
+	COUNT
  , ...)
)

// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
expression HANDLE;
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@

(
- devm_kzalloc
+ devm_kcalloc
  (HANDLE,
-	sizeof(TYPE) * (COUNT_ID)
+	COUNT_ID, sizeof(TYPE)
  , ...)
|
- devm_kzalloc
+ devm_kcalloc
  (HANDLE,
-	sizeof(TYPE) * COUNT_ID
+	COUNT_ID, sizeof(TYPE)
  , ...)
|
- devm_kzalloc
+ devm_kcalloc
  (HANDLE,
-	sizeof(TYPE) * (COUNT_CONST)
+	COUNT_CONST, sizeof(TYPE)
  , ...)
|
- devm_kzalloc
+ devm_kcalloc
  (HANDLE,
-	sizeof(TYPE) * COUNT_CONST
+	COUNT_CONST, sizeof(TYPE)
  , ...)
|
- devm_kzalloc
+ devm_kcalloc
  (HANDLE,
-	sizeof(THING) * (COUNT_ID)
+	COUNT_ID, sizeof(THING)
  , ...)
|
- devm_kzalloc
+ devm_kcalloc
  (HANDLE,
-	sizeof(THING) * COUNT_ID
+	COUNT_ID, sizeof(THING)
  , ...)
|
- devm_kzalloc
+ devm_kcalloc
  (HANDLE,
-	sizeof(THING) * (COUNT_CONST)
+	COUNT_CONST, sizeof(THING)
  , ...)
|
- devm_kzalloc
+ devm_kcalloc
  (HANDLE,
-	sizeof(THING) * COUNT_CONST
+	COUNT_CONST, sizeof(THING)
  , ...)
)

// 2-factor product, only identifiers.
@@
expression HANDLE;
identifier SIZE, COUNT;
@@

- devm_kzalloc
+ devm_kcalloc
  (HANDLE,
-	SIZE * COUNT
+	COUNT, SIZE
  , ...)

// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression HANDLE;
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@

(
  devm_kzalloc(HANDLE,
-	sizeof(TYPE) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  devm_kzalloc(HANDLE,
-	sizeof(TYPE) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  devm_kzalloc(HANDLE,
-	sizeof(TYPE) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  devm_kzalloc(HANDLE,
-	sizeof(TYPE) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  devm_kzalloc(HANDLE,
-	sizeof(THING) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  devm_kzalloc(HANDLE,
-	sizeof(THING) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  devm_kzalloc(HANDLE,
-	sizeof(THING) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  devm_kzalloc(HANDLE,
-	sizeof(THING) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
)

// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression HANDLE;
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@

(
  devm_kzalloc(HANDLE,
-	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  devm_kzalloc(HANDLE,
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  devm_kzalloc(HANDLE,
-	sizeof(THING1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  devm_kzalloc(HANDLE,
-	sizeof(THING1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  devm_kzalloc(HANDLE,
-	sizeof(TYPE1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
|
  devm_kzalloc(HANDLE,
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
)

// 3-factor product, only identifiers, with redundant parens removed.
@@
expression HANDLE;
identifier STRIDE, SIZE, COUNT;
@@

(
  devm_kzalloc(HANDLE,
-	(COUNT) * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  devm_kzalloc(HANDLE,
-	COUNT * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  devm_kzalloc(HANDLE,
-	COUNT * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  devm_kzalloc(HANDLE,
-	(COUNT) * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  devm_kzalloc(HANDLE,
-	COUNT * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  devm_kzalloc(HANDLE,
-	(COUNT) * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  devm_kzalloc(HANDLE,
-	(COUNT) * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  devm_kzalloc(HANDLE,
-	COUNT * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
)

// Any remaining multi-factor products, first at least 3-factor products,
// when they're not all constants...
@@
expression HANDLE;
expression E1, E2, E3;
constant C1, C2, C3;
@@

(
  devm_kzalloc(HANDLE, C1 * C2 * C3, ...)
|
  devm_kzalloc(HANDLE,
-	(E1) * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
|
  devm_kzalloc(HANDLE,
-	(E1) * (E2) * E3
+	array3_size(E1, E2, E3)
  , ...)
|
  devm_kzalloc(HANDLE,
-	(E1) * (E2) * (E3)
+	array3_size(E1, E2, E3)
  , ...)
|
  devm_kzalloc(HANDLE,
-	E1 * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
)

// And then all remaining 2 factors products when they're not all constants,
// keeping sizeof() as the second factor argument.
@@
expression HANDLE;
expression THING, E1, E2;
type TYPE;
constant C1, C2, C3;
@@

(
  devm_kzalloc(HANDLE, sizeof(THING) * C2, ...)
|
  devm_kzalloc(HANDLE, sizeof(TYPE) * C2, ...)
|
  devm_kzalloc(HANDLE, C1 * C2 * C3, ...)
|
  devm_kzalloc(HANDLE, C1 * C2, ...)
|
- devm_kzalloc
+ devm_kcalloc
  (HANDLE,
-	sizeof(TYPE) * (E2)
+	E2, sizeof(TYPE)
  , ...)
|
- devm_kzalloc
+ devm_kcalloc
  (HANDLE,
-	sizeof(TYPE) * E2
+	E2, sizeof(TYPE)
  , ...)
|
- devm_kzalloc
+ devm_kcalloc
  (HANDLE,
-	sizeof(THING) * (E2)
+	E2, sizeof(THING)
  , ...)
|
- devm_kzalloc
+ devm_kcalloc
  (HANDLE,
-	sizeof(THING) * E2
+	E2, sizeof(THING)
  , ...)
|
- devm_kzalloc
+ devm_kcalloc
  (HANDLE,
-	(E1) * E2
+	E1, E2
  , ...)
|
- devm_kzalloc
+ devm_kcalloc
  (HANDLE,
-	(E1) * (E2)
+	E1, E2
  , ...)
|
- devm_kzalloc
+ devm_kcalloc
  (HANDLE,
-	E1 * E2
+	E1, E2
  , ...)
)

Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-12 16:19:22 -07:00
Alexander Duyck
115ddc4919 net: ena: Use pci_sriov_configure_simple() to enable VFs
Instead of implementing our own version of a SR-IOV configuration stub in
the ena driver, use the existing pci_sriov_configure_simple() function.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2018-04-24 16:47:22 -05:00
Sinan Kaya
6d2e1a8d5e net: ena: Eliminate duplicate barriers on weakly-ordered archs
Code includes barrier() followed by writel(). writel() already has a
barrier on some architectures like arm64.

This ends up CPU observing two barriers back to back before executing the
register write.

Create a new wrapper function with relaxed write operator. Use the new
wrapper when a write is following a barrier().

Since code already has an explicit barrier call, changing writel() to
writel_relaxed() and adding mmiowb() for ordering protection.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-03-26 12:47:56 -04:00
David S. Miller
a0ce093180 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-01-09 10:37:00 -05:00
Netanel Belgazal
ee4552aaf3 net: ena: fix error handling in ena_down() sequence
ENA admin command queue errors are not handled as part of ena_down().
As a result, in case of error admin queue transitions to non-running
state and aborts all subsequent commands including those coming from
ena_up(). Reset scheduled by the driver from the timer service
context would not proceed due to sharing rtnl with ena_up()/ena_down()

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-03 12:21:00 -05:00
Netanel Belgazal
7853b49ce8 net: ena: unmask MSI-X only after device initialization is completed
Under certain conditions MSI-X interrupt might arrive right after it
was unmasked in ena_up(). There is a chance it would be processed by
the driver before device ENA_FLAG_DEV_UP flag is set. In such a case
the interrupt is ignored.
ENA device operates in auto-masked mode, therefore ignoring
interrupt leaves it masked for good.
Moving unmask of interrupt to be the last step in ena_up().

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-03 12:21:00 -05:00
Netanel Belgazal
7009744568 net: ena: increase ena driver version to 1.5.0
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-02 14:35:12 -05:00
Netanel Belgazal
8510e1a3d1 net: ena: add detection and recovery mechanism for handling missed/misrouted MSI-X
A mechanism for detection of stuck Rx/Tx rings due to missed or
misrouted interrupts.
Check if there are unhandled completion descriptors before the first
MSI-X interrupt arrived.
The check is per queue and per interrupt vector.
Once such condition is detected, driver and device reset is scheduled.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-02 14:35:12 -05:00
Kees Cook
e99e88a9d2 treewide: setup_timer() -> timer_setup()
This converts all remaining cases of the old setup_timer() API into using
timer_setup(), where the callback argument is the structure already
holding the struct timer_list. These should have no behavioral changes,
since they just change which pointer is passed into the callback with
the same available pointers after conversion. It handles the following
examples, in addition to some other variations.

Casting from unsigned long:

    void my_callback(unsigned long data)
    {
        struct something *ptr = (struct something *)data;
    ...
    }
    ...
    setup_timer(&ptr->my_timer, my_callback, ptr);

and forced object casts:

    void my_callback(struct something *ptr)
    {
    ...
    }
    ...
    setup_timer(&ptr->my_timer, my_callback, (unsigned long)ptr);

become:

    void my_callback(struct timer_list *t)
    {
        struct something *ptr = from_timer(ptr, t, my_timer);
    ...
    }
    ...
    timer_setup(&ptr->my_timer, my_callback, 0);

Direct function assignments:

    void my_callback(unsigned long data)
    {
        struct something *ptr = (struct something *)data;
    ...
    }
    ...
    ptr->my_timer.function = my_callback;

have a temporary cast added, along with converting the args:

    void my_callback(struct timer_list *t)
    {
        struct something *ptr = from_timer(ptr, t, my_timer);
    ...
    }
    ...
    ptr->my_timer.function = (TIMER_FUNC_TYPE)my_callback;

And finally, callbacks without a data assignment:

    void my_callback(unsigned long data)
    {
    ...
    }
    ...
    setup_timer(&ptr->my_timer, my_callback, 0);

have their argument renamed to verify they're unused during conversion:

    void my_callback(struct timer_list *unused)
    {
    ...
    }
    ...
    timer_setup(&ptr->my_timer, my_callback, 0);

The conversion is done with the following Coccinelle script:

spatch --very-quiet --all-includes --include-headers \
	-I ./arch/x86/include -I ./arch/x86/include/generated \
	-I ./include -I ./arch/x86/include/uapi \
	-I ./arch/x86/include/generated/uapi -I ./include/uapi \
	-I ./include/generated/uapi --include ./include/linux/kconfig.h \
	--dir . \
	--cocci-file ~/src/data/timer_setup.cocci

@fix_address_of@
expression e;
@@

 setup_timer(
-&(e)
+&e
 , ...)

// Update any raw setup_timer() usages that have a NULL callback, but
// would otherwise match change_timer_function_usage, since the latter
// will update all function assignments done in the face of a NULL
// function initialization in setup_timer().
@change_timer_function_usage_NULL@
expression _E;
identifier _timer;
type _cast_data;
@@

(
-setup_timer(&_E->_timer, NULL, _E);
+timer_setup(&_E->_timer, NULL, 0);
|
-setup_timer(&_E->_timer, NULL, (_cast_data)_E);
+timer_setup(&_E->_timer, NULL, 0);
|
-setup_timer(&_E._timer, NULL, &_E);
+timer_setup(&_E._timer, NULL, 0);
|
-setup_timer(&_E._timer, NULL, (_cast_data)&_E);
+timer_setup(&_E._timer, NULL, 0);
)

@change_timer_function_usage@
expression _E;
identifier _timer;
struct timer_list _stl;
identifier _callback;
type _cast_func, _cast_data;
@@

(
-setup_timer(&_E->_timer, _callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, &_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, &_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)&_callback, _E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, (_cast_func)&_callback, (_cast_data)_E);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, &_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, &_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)_E);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, (_cast_func)&_callback, (_cast_data)&_E);
+timer_setup(&_E._timer, _callback, 0);
|
 _E->_timer@_stl.function = _callback;
|
 _E->_timer@_stl.function = &_callback;
|
 _E->_timer@_stl.function = (_cast_func)_callback;
|
 _E->_timer@_stl.function = (_cast_func)&_callback;
|
 _E._timer@_stl.function = _callback;
|
 _E._timer@_stl.function = &_callback;
|
 _E._timer@_stl.function = (_cast_func)_callback;
|
 _E._timer@_stl.function = (_cast_func)&_callback;
)

// callback(unsigned long arg)
@change_callback_handle_cast
 depends on change_timer_function_usage@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _origtype;
identifier _origarg;
type _handletype;
identifier _handle;
@@

 void _callback(
-_origtype _origarg
+struct timer_list *t
 )
 {
(
	... when != _origarg
	_handletype *_handle =
-(_handletype *)_origarg;
+from_timer(_handle, t, _timer);
	... when != _origarg
|
	... when != _origarg
	_handletype *_handle =
-(void *)_origarg;
+from_timer(_handle, t, _timer);
	... when != _origarg
|
	... when != _origarg
	_handletype *_handle;
	... when != _handle
	_handle =
-(_handletype *)_origarg;
+from_timer(_handle, t, _timer);
	... when != _origarg
|
	... when != _origarg
	_handletype *_handle;
	... when != _handle
	_handle =
-(void *)_origarg;
+from_timer(_handle, t, _timer);
	... when != _origarg
)
 }

// callback(unsigned long arg) without existing variable
@change_callback_handle_cast_no_arg
 depends on change_timer_function_usage &&
                     !change_callback_handle_cast@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _origtype;
identifier _origarg;
type _handletype;
@@

 void _callback(
-_origtype _origarg
+struct timer_list *t
 )
 {
+	_handletype *_origarg = from_timer(_origarg, t, _timer);
+
	... when != _origarg
-	(_handletype *)_origarg
+	_origarg
	... when != _origarg
 }

// Avoid already converted callbacks.
@match_callback_converted
 depends on change_timer_function_usage &&
            !change_callback_handle_cast &&
	    !change_callback_handle_cast_no_arg@
identifier change_timer_function_usage._callback;
identifier t;
@@

 void _callback(struct timer_list *t)
 { ... }

// callback(struct something *handle)
@change_callback_handle_arg
 depends on change_timer_function_usage &&
	    !match_callback_converted &&
            !change_callback_handle_cast &&
            !change_callback_handle_cast_no_arg@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _handletype;
identifier _handle;
@@

 void _callback(
-_handletype *_handle
+struct timer_list *t
 )
 {
+	_handletype *_handle = from_timer(_handle, t, _timer);
	...
 }

// If change_callback_handle_arg ran on an empty function, remove
// the added handler.
@unchange_callback_handle_arg
 depends on change_timer_function_usage &&
	    change_callback_handle_arg@
identifier change_timer_function_usage._callback;
identifier change_timer_function_usage._timer;
type _handletype;
identifier _handle;
identifier t;
@@

 void _callback(struct timer_list *t)
 {
-	_handletype *_handle = from_timer(_handle, t, _timer);
 }

// We only want to refactor the setup_timer() data argument if we've found
// the matching callback. This undoes changes in change_timer_function_usage.
@unchange_timer_function_usage
 depends on change_timer_function_usage &&
            !change_callback_handle_cast &&
            !change_callback_handle_cast_no_arg &&
	    !change_callback_handle_arg@
expression change_timer_function_usage._E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type change_timer_function_usage._cast_data;
@@

(
-timer_setup(&_E->_timer, _callback, 0);
+setup_timer(&_E->_timer, _callback, (_cast_data)_E);
|
-timer_setup(&_E._timer, _callback, 0);
+setup_timer(&_E._timer, _callback, (_cast_data)&_E);
)

// If we fixed a callback from a .function assignment, fix the
// assignment cast now.
@change_timer_function_assignment
 depends on change_timer_function_usage &&
            (change_callback_handle_cast ||
             change_callback_handle_cast_no_arg ||
             change_callback_handle_arg)@
expression change_timer_function_usage._E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type _cast_func;
typedef TIMER_FUNC_TYPE;
@@

(
 _E->_timer.function =
-_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E->_timer.function =
-&_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E->_timer.function =
-(_cast_func)_callback;
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E->_timer.function =
-(_cast_func)&_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._timer.function =
-_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._timer.function =
-&_callback;
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._timer.function =
-(_cast_func)_callback
+(TIMER_FUNC_TYPE)_callback
 ;
|
 _E._timer.function =
-(_cast_func)&_callback
+(TIMER_FUNC_TYPE)_callback
 ;
)

// Sometimes timer functions are called directly. Replace matched args.
@change_timer_function_calls
 depends on change_timer_function_usage &&
            (change_callback_handle_cast ||
             change_callback_handle_cast_no_arg ||
             change_callback_handle_arg)@
expression _E;
identifier change_timer_function_usage._timer;
identifier change_timer_function_usage._callback;
type _cast_data;
@@

 _callback(
(
-(_cast_data)_E
+&_E->_timer
|
-(_cast_data)&_E
+&_E._timer
|
-_E
+&_E->_timer
)
 )

// If a timer has been configured without a data argument, it can be
// converted without regard to the callback argument, since it is unused.
@match_timer_function_unused_data@
expression _E;
identifier _timer;
identifier _callback;
@@

(
-setup_timer(&_E->_timer, _callback, 0);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, 0L);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E->_timer, _callback, 0UL);
+timer_setup(&_E->_timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0L);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_E._timer, _callback, 0UL);
+timer_setup(&_E._timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0L);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(&_timer, _callback, 0UL);
+timer_setup(&_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0);
+timer_setup(_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0L);
+timer_setup(_timer, _callback, 0);
|
-setup_timer(_timer, _callback, 0UL);
+timer_setup(_timer, _callback, 0);
)

@change_callback_unused_data
 depends on match_timer_function_unused_data@
identifier match_timer_function_unused_data._callback;
type _origtype;
identifier _origarg;
@@

 void _callback(
-_origtype _origarg
+struct timer_list *unused
 )
 {
	... when != _origarg
 }

Signed-off-by: Kees Cook <keescook@chromium.org>
2017-11-21 15:57:07 -08:00
Netanel Belgazal
d18e4f6834 net: ena: fix race condition between device reset and link up setup
In rare cases, ena driver would reset and re-start the device,
for example, in case of misbehaving application that causes
transmit timeout

The first step in the reset procedure is to stop the Tx traffic by
calling ena_carrier_off().

After the driver have just started the device reset procedure, device
happens to send an asynchronous notification (via AENQ) to the driver
than there was a link change (to link-up state).
This link change is mapped to a call to netif_carrier_on() which
re-activates the Tx queues, violating the assumption of no tx traffic
until device reset is completed, as the reset task might still be in
the process of queues initialization, leading to an access to
uninitialized memory.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-11-20 11:35:16 +09:00
Linus Torvalds
7c225c69f8 Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:

 - a few misc bits

 - ocfs2 updates

 - almost all of MM

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (131 commits)
  memory hotplug: fix comments when adding section
  mm: make alloc_node_mem_map a void call if we don't have CONFIG_FLAT_NODE_MEM_MAP
  mm: simplify nodemask printing
  mm,oom_reaper: remove pointless kthread_run() error check
  mm/page_ext.c: check if page_ext is not prepared
  writeback: remove unused function parameter
  mm: do not rely on preempt_count in print_vma_addr
  mm, sparse: do not swamp log with huge vmemmap allocation failures
  mm/hmm: remove redundant variable align_end
  mm/list_lru.c: mark expected switch fall-through
  mm/shmem.c: mark expected switch fall-through
  mm/page_alloc.c: broken deferred calculation
  mm: don't warn about allocations which stall for too long
  fs: fuse: account fuse_inode slab memory as reclaimable
  mm, page_alloc: fix potential false positive in __zone_watermark_ok
  mm: mlock: remove lru_add_drain_all()
  mm, sysctl: make NUMA stats configurable
  shmem: convert shmem_init_inodecache() to void
  Unify migrate_pages and move_pages access checks
  mm, pagevec: rename pagevec drained field
  ...
2017-11-15 19:42:40 -08:00
Mel Gorman
453f85d43f mm: remove __GFP_COLD
As the page free path makes no distinction between cache hot and cold
pages, there is no real useful ordering of pages in the free list that
allocation requests can take advantage of.  Juding from the users of
__GFP_COLD, it is likely that a number of them are the result of copying
other sites instead of actually measuring the impact.  Remove the
__GFP_COLD parameter which simplifies a number of paths in the page
allocator.

This is potentially controversial but bear in mind that the size of the
per-cpu pagelists versus modern cache sizes means that the whole per-cpu
list can often fit in the L3 cache.  Hence, there is only a potential
benefit for microbenchmarks that alloc/free pages in a tight loop.  It's
even worse when THP is taken into account which has little or no chance
of getting a cache-hot page as the per-cpu list is bypassed and the
zeroing of multiple pages will thrash the cache anyway.

The truncate microbenchmarks are not shown as this patch affects the
allocation path and not the free path.  A page fault microbenchmark was
tested but it showed no sigificant difference which is not surprising
given that the __GFP_COLD branches are a miniscule percentage of the
fault path.

Link: http://lkml.kernel.org/r/20171018075952.10627-9-mgorman@techsingularity.net
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-11-15 18:21:06 -08:00
David S. Miller
f8ddadc4db Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
There were quite a few overlapping sets of changes here.

Daniel's bug fix for off-by-ones in the new BPF branch instructions,
along with the added allowances for "data_end > ptr + x" forms
collided with the metadata additions.

Along with those three changes came veritifer test cases, which in
their final form I tried to group together properly.  If I had just
trimmed GIT's conflict tags as-is, this would have split up the
meta tests unnecessarily.

In the socketmap code, a set of preemption disabling changes
overlapped with the rename of bpf_compute_data_end() to
bpf_compute_data_pointers().

Changes were made to the mv88e6060.c driver set addr method
which got removed in net-next.

The hyperv transport socket layer had a locking change in 'net'
which overlapped with a change of socket state macro usage
in 'net-next'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-22 13:39:14 +01:00
Netanel Belgazal
046b307189 net: ena: increase ena driver version to 1.3.0
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-19 12:51:37 +01:00
Netanel Belgazal
58894d5219 net: ena: add new admin define for future support of IPv6 RSS
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-19 12:51:37 +01:00
Netanel Belgazal
11095fdb71 net: ena: add statistics for missed tx packets
Add a new statistic to ethtool stats that show the number of packets
without transmit acknowledgement from ENA device.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-19 12:51:36 +01:00
Netanel Belgazal
8c5c7abdeb net: ena: add power management ops to the ENA driver
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-19 12:51:36 +01:00
Netanel Belgazal
dbeaf1e3c2 net: ena: remove legacy suspend suspend/resume support
Remove ena_device_io_suspend/resume() methods
Those methods were intend to be used by the device to trigger
suspend/resume but eventually it was dropped.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-19 12:51:36 +01:00
Netanel Belgazal
88aef2f51c net: ena: improve ENA driver boot time.
The ena admin commands timeout is in resolutions of 100ms.
Therefore, When the driver works in polling mode, it sleeps for 100ms
each time. The overall boot time of the ENA driver is ~1.5 sec.
To reduce the boot time, This change modifies the granularity of
the sleeps to 5ms.
This change improves the boot time to 220ms.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-19 12:51:36 +01:00
Netanel Belgazal
a59df39676 net: ena: fix wrong max Tx/Rx queues on ethtool
ethtool ena_get_channels() expose the max number of queues as the max
number of queues ENA supports (128 queues) and not the actual number
of created queues.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-19 12:49:15 +01:00
Netanel Belgazal
411838e7b4 net: ena: fix rare kernel crash when bar memory remap fails
This failure is rare and only found on testing where deliberately fail
devm_ioremap()

[  451.170464] ena 0000:04:00.0: failed to remap regs bar
451.170549] Workqueue: pciehp-1 pciehp_power_thread
[  451.170551] task: ffff88085a5f2d00 task.stack: ffffc9000756c000
[  451.170552] RIP: 0010:devm_iounmap+0x2d/0x40
[  451.170553] RSP: 0018:ffffc9000756fac0 EFLAGS: 00010282
[  451.170554] RAX: 00000000fffffffe RBX: 0000000000000000 RCX:
0000000000000000
[  451.170555] RDX: ffffffff813a7e00 RSI: 0000000000000282 RDI:
0000000000000282
[  451.170556] RBP: ffffc9000756fac8 R08: 00000000fffffffe R09:
00000000000009b7
[  451.170557] R10: 0000000000000005 R11: 00000000000009b6 R12:
ffff880856c9d0a0
[  451.170558] R13: ffffc9000f5c90c0 R14: ffff880856c9d0a0 R15:
0000000000000028
[  451.170559] FS:  0000000000000000(0000) GS:ffff88085f400000(0000)
knlGS:0000000000000000
[  451.170560] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  451.170561] CR2: 00007f169038b000 CR3: 0000000001c09000 CR4:
00000000003406f0
[  451.170562] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[  451.170562] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[  451.170563] Call Trace:
[  451.170572]  ena_release_bars.isra.48+0x34/0x60 [ena]
[  451.170574]  ena_probe+0x144/0xd90 [ena]
[  451.170579]  ? ida_simple_get+0x98/0x100
[  451.170585]  ? kernfs_next_descendant_post+0x40/0x50
[  451.170591]  local_pci_probe+0x45/0xa0
[  451.170592]  pci_device_probe+0x157/0x180
[  451.170599]  driver_probe_device+0x2a8/0x460
[  451.170600]  __device_attach_driver+0x7e/0xe0
[  451.170602]  ? driver_allows_async_probing+0x30/0x30
[  451.170603]  bus_for_each_drv+0x68/0xb0
[  451.170605]  __device_attach+0xdd/0x160
[  451.170607]  device_attach+0x10/0x20
[  451.170610]  pci_bus_add_device+0x4f/0xa0
[  451.170611]  pci_bus_add_devices+0x39/0x70
[  451.170613]  pciehp_configure_device+0x96/0x120
[  451.170614]  pciehp_enable_slot+0x1b3/0x290
[  451.170616]  pciehp_power_thread+0x3b/0xb0
[  451.170622]  process_one_work+0x149/0x360
[  451.170623]  worker_thread+0x4d/0x3c0
[  451.170626]  kthread+0x109/0x140
[  451.170627]  ? rescuer_thread+0x380/0x380
[  451.170628]  ? kthread_park+0x60/0x60
[  451.170632]  ret_from_fork+0x25/0x30

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-19 12:49:15 +01:00
Netanel Belgazal
cd7aea1875 net: ena: reduce the severity of some printouts
Decrease log level of checksum errors as these messages can be
triggered remotely by bad packets.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-10-19 12:49:15 +01:00
Tobias Klauser
1f4cf93b13 net: ena: Remove redundant unlikely()
IS_ERR() already implies unlikely(), so it can be omitted.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-26 09:54:06 -07:00
Netanel Belgazal
8523899912 net: ena: update ena driver to version 1.2.0
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23 14:15:11 -04:00
Netanel Belgazal
11a9a46019 net: ena: update driver's rx drop statistics
rx drop counter is reported by the device in the keep-alive
event.
update the driver's counter with the device counter.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23 14:15:11 -04:00
Netanel Belgazal
3ae5907c61 net: ena: use lower_32_bits()/upper_32_bits() to split dma address
In ena_com_mem_addr_set(), use the above functions to split dma address
to the lower 32 bits and the higher 16 bits.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23 14:15:10 -04:00
Netanel Belgazal
4265114d53 net: ena: separate skb allocation to dedicated function
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23 14:15:10 -04:00
Netanel Belgazal
e745dafab0 net: ena: use napi_schedule_irqoff when possible
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23 14:15:10 -04:00
Netanel Belgazal
06443684da net: ena: allow the driver to work with small number of msix vectors
Current driver tries to allocate msix vectors as the number of the
negotiated io queues. (with another msix vector for management).
If pci_alloc_irq_vectors() fails, the driver aborts the probe
and the ENA network device is never brought up.

With this patch, the driver's logic will reduce the number of IO
queues to the number of allocated msix vectors (minus one for management)
instead of failing probe().

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23 14:15:10 -04:00
Netanel Belgazal
ad974baef2 net: ena: add support for out of order rx buffers refill
ENA driver post Rx buffers through the Rx submission queue
for the ENA device to fill them with receive packets.
Each Rx buffer is marked with req_id in the Rx descriptor.

Newer ENA devices could consume the posted Rx buffer in out of order,
and as result the corresponding Rx completion queue will have Rx
completion descriptors with non contiguous req_id(s)

In this change the driver holds two rings.
The first ring (called free_rx_ids) is a mapping ring.
It holds all the unused request ids.
The values in this ring are from 0 to ring_size -1.

When the driver wants to allocate a new Rx buffer it uses the head of
free_rx_ids and uses it's value as the index for rx_buffer_info ring.
The req_id is also written to the Rx descriptor

Upon Rx completion,
The driver took the req_id from the completion descriptor and uses it
as index in rx_buffer_info.
The req_id is then return to the free_rx_ids ring.

This patch also adds statistics to inform when the driver receive out
of range or unused req_id.

Note:
free_rx_ids is only accessible from the napi handler, so no locking is
required

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23 14:15:09 -04:00
Netanel Belgazal
e2eed0e307 net: ena: add reset reason for each device FLR
For each device reset, log to the device what is the cause
the reset occur.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23 14:15:09 -04:00
Netanel Belgazal
917501109c net: ena: change sizeof() argument to be the type pointer
Instead of using:
memset(ptr, 0x0, sizeof(struct ...))
use:
memset(ptr, 0x0, sizeor(*ptr))

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23 14:15:09 -04:00
Netanel Belgazal
82ef30f13b net: ena: add hardware hints capability to the driver
With this patch, ENA device can update the ena driver about
the desired timeout values:
These values are part of the "hardware hints" which are transmitted
to the driver as Asynchronous event through ENA async
event notification queue.

In case the ENA device does not support this capability,
the driver will use its own default values.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23 14:15:08 -04:00
Netanel Belgazal
d1497638b6 net: ena: change return value for unsupported features unsupported return value
return -EOPNOTSUPP instead of -EPERM.

Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23 14:15:08 -04:00
Netanel Belgazal
e7ff7efae5 net: ena: update ena driver to version 1.1.7
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:47 -04:00
Netanel Belgazal
800c55cb76 net: ena: bug fix in lost tx packets detection mechanism
check_for_missing_tx_completions() is called from a timer
task and looking for lost tx packets.
The old implementation accumulate all the lost tx packets
and did not check if those packets were retrieved on a later stage.
This cause to a situation where the driver reset
the device for no reason.

Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:47 -04:00
Netanel Belgazal
a2cc5198da net: ena: disable admin msix while working in polling mode
Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:47 -04:00
Netanel Belgazal
a3af7c18cf net: ena: fix theoretical Rx hang on low memory systems
For the rare case where the device runs out of free rx buffer
descriptors (in case of pressure on kernel  memory),
and the napi handler continuously fail to refill new Rx descriptors
until device rx queue totally runs out of all free rx buffers
to post incoming packet, leading to a deadlock:
* The device won't send interrupts since all the new
Rx packets will be dropped.
* The napi handler won't try to allocate new Rx descriptors
since allocation is part of NAPI that's not being invoked any more

The fix involves detecting this scenario and rescheduling NAPI
(to refill buffers) by the keepalive/watchdog task.

Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:46 -04:00
Netanel Belgazal
0857d92f71 net: ena: add missing unmap bars on device removal
This patch also change the mapping functions to devm_ functions

Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:46 -04:00
Netanel Belgazal
661d2b0cce net: ena: fix race condition between submit and completion admin command
Bug:
"Completion context is occupied" error printout will be noticed in
dmesg.
This error will cause the admin command to fail, which will lead to
an ena_probe() failure or a watchdog reset (depends on which admin
command failed).

Root cause:
__ena_com_submit_admin_cmd() is the function that submits new entries to
the admin queue.
The function have a check that makes sure the queue is not full and the
function does not override any outstanding command.
It uses head and tail indexes for this check.
The head is increased by ena_com_handle_admin_completion() which runs
from interrupt context, and the tail index is increased by the submit
function (the function is running under ->q_lock, so there is no risk
of multithread increment).
Each command is associated with a completion context. This context
allocated before call to __ena_com_submit_admin_cmd() and freed by
ena_com_wait_and_process_admin_cq_interrupts(), right after the command
was completed.

This can lead to a state where the head was increased, the check passed,
but the completion context is still in use.

Solution:
Use the atomic variable ->outstanding_cmds instead of using the head and
the tail indexes.
This variable is safe for use since it is bumped in get_comp_ctx() in
__ena_com_submit_admin_cmd() and is freed by comp_ctxt_release()

Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:46 -04:00
Netanel Belgazal
2d2c600a91 net: ena: add missing return when ena_com_get_io_handlers() fails
Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:46 -04:00
Netanel Belgazal
418df30f7e net: ena: fix bug that might cause hang after consecutive open/close interface.
Fixing a bug that the driver does not unmask the IO interrupts
in ndo_open():
occasionally, the MSI-X interrupt (for one or more IO queues)
can be masked when ndo_close() was called.
If that is followed by ndo open(),
then the MSI-X will be still masked so no interrupt
will be received by the driver.

Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:46 -04:00
Netanel Belgazal
a77c1aafcc net: ena: fix rare uncompleted admin command false alarm
The current flow to detect admin completion is:
while (command_not_completed) {
	if (timeout)
		error

	check_for_completion()
		sleep()
   }
So in case the sleep took more than the timeout
(in case the thread/workqueue was not scheduled due to higher priority
task or prolonged VMexit), the driver can detect a stall even if
the completion is present.

The fix changes the order of this function to first check for
completion and only after that check if the timeout expired.

Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:45 -04:00
Christoph Hellwig
da6f4cf58e net/ena: switch to pci_alloc_irq_vectors
Remove the deprecated pci_enable_msix API in favour of its successor.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-11 11:16:03 -04:00
Lino Sanfilippo
90a6c997bd net: ena: remove superfluous check in ena_remove()
The check in ena_remove() for the pci driver data not being NULL is not
needed, since it is always set in the probe() function. Remove the
superfluous check.

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-20 10:22:59 -05:00
Netanel Belgazal
3e5d689717 net/ena: update driver version to 1.1.2
Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-09 22:27:06 -05:00
Netanel Belgazal
dd8427a78f net/ena: change condition for host attribute configuration
Move the host info config to be the first admin command that is executed.
This change require the driver to remove the 'feature check'
from host info configuration flow.
The check is removed since the supported features bitmask field
is retrieved only after calling ENA_ADMIN_DEVICE_ATTRIBUTES admin command.

If set host info is not supported an error will be returned by the device.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-09 22:27:06 -05:00
Netanel Belgazal
7102a18ac3 net/ena: change driver's default timeouts
The timeouts were too agressive and sometimes cause false alarms.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-09 22:27:06 -05:00
Netanel Belgazal
5add6e4a22 net/ena: reduce the severity of ena printouts
Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-09 22:27:06 -05:00
Netanel Belgazal
a8496eb813 net/ena: use READ_ONCE to access completion descriptors
Completion descriptors are accessed from the driver and from the device.
To avoid reading the old value, use READ_ONCE macro.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-09 22:27:06 -05:00
Netanel Belgazal
b1669c9f5a net/ena: use napi_complete_done() return value
Do not unamsk interrupts if we are in busy poll mode.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-09 22:27:06 -05:00
Netanel Belgazal
3f6159dbfc net/ena: fix potential access to freed memory during device reset
If the ena driver detects that the device is not behave as expected,
it tries to reset the device.
The reset flow calls ena_down, which will frees all the resources
the driver allocates and then it will reset the device.

This flow can cause memory corruption if the device is still writes
to the driver's memory space.
To overcome this potential race, move the reset before the device
resources are freed.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-09 22:27:06 -05:00
Netanel Belgazal
d81db24056 net/ena: refactor ena_get_stats64 to be atomic context safe
ndo_get_stat64() can be called from atomic context, but the current
implementation sends an admin command to retrieve the statistics from
the device. This admin command can sleep.

This patch re-factors the implementation of ena_get_stats64() to use
the {rx,tx}bytes/count from the driver's inner counters, and to obtain
the rx drop counter from the asynchronous keep alive (heart bit)
event.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-09 22:27:06 -05:00
Netanel Belgazal
22b331c9e0 net/ena: fix NULL dereference when removing the driver after device reset failed
If for some reason the device stops responding, and the device reset
failes to recover the device, the mmio register read data structure
will not be reinitialized.

On driver removal, the driver will also try to reset the device, but
this time the mmio data structure will be NULL.

To solve this issue, perform the device reset in the remove function
only if the device is runnig.

Crash log
   54.240382] BUG: unable to handle kernel NULL pointer dereference at           (null)
[   54.244186] IP: [<ffffffffc067de5a>] ena_com_reg_bar_read32+0x8a/0x180 [ena_drv]
[   54.244186] PGD 0
[   54.244186] Oops: 0002 [#1] SMP
[   54.244186] Modules linked in: ena_drv(OE-) snd_hda_codec_generic kvm_intel kvm crct10dif_pclmul ppdev crc32_pclmul ghash_clmulni_intel aesni_intel snd_hda_intel aes_x86_64 snd_hda_controller lrw gf128mul cirrus glue_helper ablk_helper ttm snd_hda_codec drm_kms_helper cryptd snd_hwdep drm snd_pcm pvpanic snd_timer syscopyarea sysfillrect snd parport_pc sysimgblt serio_raw soundcore i2c_piix4 mac_hid lp parport psmouse floppy
[   54.244186] CPU: 5 PID: 1841 Comm: rmmod Tainted: G           OE 3.16.0-031600-generic #201408031935
[   54.244186] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[   54.244186] task: ffff880135852880 ti: ffff8800bb640000 task.ti: ffff8800bb640000
[   54.244186] RIP: 0010:[<ffffffffc067de5a>]  [<ffffffffc067de5a>] ena_com_reg_bar_read32+0x8a/0x180 [ena_drv]
[   54.244186] RSP: 0018:ffff8800bb643d50  EFLAGS: 00010083
[   54.244186] RAX: 000000000000deb0 RBX: 0000000000030d40 RCX: 0000000000000003
[   54.244186] RDX: 0000000000000202 RSI: 0000000000000058 RDI: ffffc90000775104
[   54.244186] RBP: ffff8800bb643d88 R08: 0000000000000000 R09: cf00000000000000
[   54.244186] R10: 0000000fffffffe0 R11: 0000000000000001 R12: 0000000000000000
[   54.244186] R13: ffffc90000765000 R14: ffffc90000775104 R15: 00007fca1fa98090
[   54.244186] FS:  00007fca1f1bd740(0000) GS:ffff88013fd40000(0000) knlGS:0000000000000000
[   54.244186] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   54.244186] CR2: 0000000000000000 CR3: 00000000b9cf6000 CR4: 00000000001406e0
[   54.244186] Stack:
[   54.244186]  0000000000000202 0000005800000286 ffffc90000765000 ffffc90000765000
[   54.244186]  ffff880135f6b000 ffff8800b9360000 00007fca1fa98090 ffff8800bb643db8
[   54.244186]  ffffffffc0680b3d ffff8800b93608c0 ffffc90000765000 ffff880135f6b000
[   54.244186] Call Trace:
[   54.244186]  [<ffffffffc0680b3d>] ena_com_dev_reset+0x1d/0x1b0 [ena_drv]
[   54.244186]  [<ffffffffc0678497>] ena_remove+0xa7/0x130 [ena_drv]
[   54.244186]  [<ffffffff813d4df6>] pci_device_remove+0x46/0xc0
[   54.244186]  [<ffffffff814c3b7f>] __device_release_driver+0x7f/0xf0
[   54.244186]  [<ffffffff814c4738>] driver_detach+0xc8/0xd0
[   54.244186]  [<ffffffff814c3969>] bus_remove_driver+0x59/0xd0
[   54.244186]  [<ffffffff814c4fde>] driver_unregister+0x2e/0x60
[   54.244186]  [<ffffffff810f0a80>] ? show_refcnt+0x40/0x40
[   54.244186]  [<ffffffff813d4ec3>] pci_unregister_driver+0x23/0xa0
[   54.244186]  [<ffffffffc068413f>] ena_cleanup+0x10/0xed1 [ena_drv]
[   54.244186]  [<ffffffff810f3a47>] SyS_delete_module+0x157/0x1e0
[   54.244186]  [<ffffffff81014fb7>] ? do_notify_resume+0xc7/0xd0
[   54.244186]  [<ffffffff81793fad>] system_call_fastpath+0x1a/0x1f
[   54.244186] Code: c3 4d 8d b5 04 01 01 00 4c 89 f7 e8 e1 5a 11 c1 48 89 45 c8 41 0f b7 85 00 01 01 00 8d 48 01 66 2d 52 21 66 41 89 8d 00 01 01 00 <66> 41 89 04 24 0f b7 45 d4 89 45 d0 89 c1 41 0f b7 85 00 01 01
[   54.244186] RIP  [<ffffffffc067de5a>] ena_com_reg_bar_read32+0x8a/0x180 [ena_drv]
[   54.244186]  RSP <ffff8800bb643d50>
[   54.244186] CR2: 0000000000000000
[   54.244186] ---[ end trace 18dd9889b6497810 ]---

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-09 22:27:06 -05:00
Netanel Belgazal
422e21e761 net/ena: fix RSS default hash configuration
ENA default hash configures IPv4_frag hash twice instead of
configure non-IP packets.

The bug caused IPv4 fragmented packets to be calculated based on
L2 source and destination address instead of L3 source and destination.
IPv4 packets can reach to the wrong Rx queue.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-09 22:27:06 -05:00
Netanel Belgazal
6e2de20ddc net/ena: fix ethtool RSS flow configuration
ena_flow_data_to_flow_hash and ena_flow_hash_to_flow_type
treat the ena_flow_hash_to_flow_type enum as power of two values.

Change the values of ena_admin_flow_hash_fields to be power of two values.

This bug effect the ethtool set/get rxnfc.
ethtool will report wrong values hash fields for get and will
configure wrong hash fields in set.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-09 22:27:06 -05:00
Netanel Belgazal
6a1ce2fb67 net/ena: fix queues number calculation
The ENA driver tries to open a queue per vCPU.
To determine how many vCPUs the instance have it uses num_possible_cpus()
while it should have use num_online_cpus() instead.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-09 22:27:06 -05:00
Netanel Belgazal
fdeea0ad87 net/ena: remove ntuple filter support from device feature list
Remove NETIF_F_NTUPLE from netdev->features.
The ENA device driver does not support ntuple filtering.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-09 22:27:06 -05:00
stephen hemminger
bc1f44709c net: make ndo_get_stats64 a void function
The network device operation for reading statistics is only called
in one place, and it ignores the return value. Having a structure
return value is potentially confusing because some future driver could
incorrectly assume that the return value was used.

Fix all drivers with ndo_get_stats64 to have a void function.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-08 17:51:44 -05:00
Wei Yongjun
f850b4a77d net: ena: use setup_timer() and mod_timer()
Use setup_timer() instead of init_timer(), being the preferred/standard
way to set a timer up.

Also, quoting the mod_timer() function comment:
-> mod_timer() is a more efficient way to update the expire field of an
   active timer (if the timer is inactive it will be activated).

Use setup_timer and mod_timer to setup and arm a timer, to make the code
cleaner and easier to read.

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-26 17:21:59 -04:00
Jarod Wilson
d894be57ca ethernet: use net core MTU range checking in more drivers
Somehow, I missed a healthy number of ethernet drivers in the last pass.
Most of these drivers either were in need of an updated max_mtu to make
jumbo frames possible to enable again. In a few cases, also setting a
different min_mtu to match previous lower bounds. There are also a few
drivers that had no upper bounds checking, so they're getting a brand new
ETH_MAX_MTU that is identical to IP_MAX_MTU, but accessible by includes
all ethernet and ethernet-like drivers all have already.

acenic:
- min_mtu = 0, max_mtu = 9000

amazon/ena:
- min_mtu = 128, max_mtu = adapter->max_mtu

amd/xgbe:
- min_mtu = 0, max_mtu = 9000

sb1250:
- min_mtu = 0, max_mtu = 1518

cxgb3:
- min_mtu = 81, max_mtu = 65535

cxgb4:
- min_mtu = 81, max_mtu = 9600

cxgb4vf:
- min_mtu = 81, max_mtu = 65535

benet:
- min_mtu = 256, max_mtu = 9000

ibmveth:
- min_mtu = 68, max_mtu = 65535

ibmvnic:
- min_mtu = adapter->min_mtu, max_mtu = adapter->max_mtu
- remove now redundant ibmvnic_change_mtu

jme:
- min_mtu = 1280, max_mtu = 9202

mv643xx_eth:
- min_mtu = 64, max_mtu = 9500

mlxsw:
- min_mtu = 0, max_mtu = 65535
- Basically bypassing the core checks, and instead relying on dynamic
  checks in the respective switch drivers' ndo_change_mtu functions

ns83820:
- min_mtu = 0
- remove redundant ns83820_change_mtu, only checked for mtu > 1500

netxen:
- min_mtu = 0, max_mtu = 8000 (P2), max_mtu = 9600 (P3)

qlge:
- min_mtu = 1500, max_mtu = 9000
- driver only supports setting mtu to 1500 or 9000, so the core check only
  rules out < 1500 and > 9000, qlge_change_mtu still needs to check that
  the value is 1500 or 9000

qualcomm/emac:
- min_mtu = 46, max_mtu = 9194

xilinx_axienet:
- min_mtu = 64, max_mtu = 9000

Fixes: 61e84623ac ("net: centralize net_device min/max MTU checking")
CC: netdev@vger.kernel.org
CC: Jes Sorensen <jes@trained-monkey.org>
CC: Netanel Belgazal <netanel@annapurnalabs.com>
CC: Tom Lendacky <thomas.lendacky@amd.com>
CC: Santosh Raspatur <santosh@chelsio.com>
CC: Hariprasad S <hariprasad@chelsio.com>
CC: Sathya Perla <sathya.perla@broadcom.com>
CC: Ajit Khaparde <ajit.khaparde@broadcom.com>
CC: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
CC: Somnath Kotur <somnath.kotur@broadcom.com>
CC: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
CC: John Allen <jallen@linux.vnet.ibm.com>
CC: Guo-Fu Tseng <cooldavid@cooldavid.org>
CC: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
CC: Jiri Pirko <jiri@mellanox.com>
CC: Ido Schimmel <idosch@mellanox.com>
CC: Manish Chopra <manish.chopra@qlogic.com>
CC: Sony Chacko <sony.chacko@qlogic.com>
CC: Rajesh Borundia <rajesh.borundia@qlogic.com>
CC: Timur Tabi <timur@codeaurora.org>
CC: Anirudha Sarangi <anirudh@xilinx.com>
CC: John Linn <John.Linn@xilinx.com>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-10-20 14:51:08 -04:00
Rami Rosen
184b49c89f net: ena: change the return type of ena_set_push_mode() to be void.
This patch changes the return type of ena_set_push_mode() to be void,
as it always returns 0.

Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-23 17:42:33 -07:00
Wei Yongjun
6e22066fd0 net: ena: Fix error return code in ena_device_init()
Fix to return a negative error code from the invalid dma width
error handling case instead of 0.

Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-15 16:16:11 -07:00
Wei Yongjun
557bc7d44d net: ena: Remove unnecessary pci_set_drvdata()
The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-15 16:16:10 -07:00
Netanel Belgazal
1738cd3ed3 net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)
This is a driver for the ENA family of networking devices.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-08-12 17:12:08 -07:00