Commit Graph

1235337 Commits

Author SHA1 Message Date
Ahmed Zaki
fb6e30a725 net: ethtool: pass a pointer to parameters to get/set_rxfh ethtool ops
The get/set_rxfh ethtool ops currently takes the rxfh (RSS) parameters
as direct function arguments. This will force us to change the API (and
all drivers' functions) every time some new parameters are added.

This is part 1/2 of the fix, as suggested in [1]:

- First simplify the code by always providing a pointer to all params
   (indir, key and func); the fact that some of them may be NULL seems
   like a weird historic thing or a premature optimization.
   It will simplify the drivers if all pointers are always present.

 - Then make the functions take a dev pointer, and a pointer to a
   single struct wrapping all arguments. The set_* should also take
   an extack.

Link: https://lore.kernel.org/netdev/20231121152906.2dd5f487@kernel.org/ [1]
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Suggested-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
Link: https://lore.kernel.org/r/20231213003321.605376-2-ahmed.zaki@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13 22:07:16 -08:00
Jakub Kicinski
c3f687d8df net: page_pool: factor out releasing DMA from releasing the page
Releasing the DMA mapping will be useful for other types
of pages, so factor it out. Make sure compiler inlines it,
to avoid any regressions.

Signed-off-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13 21:59:04 -08:00
Liang Chen
0a149ab78e page_pool: transition to reference count management after page draining
To support multiple users referencing the same fragment,
'pp_frag_count' is renamed to 'pp_ref_count', transitioning pp pages
from fragment management to reference count management after draining
based on the suggestion from [1].

The idea is that the concept of fragmenting exists before the page is
drained, and all related functions retain their current names.
However, once the page is drained, its management shifts to being
governed by 'pp_ref_count'. Therefore, all functions associated with
that lifecycle stage of a pp page are renamed.

[1]
http://lore.kernel.org/netdev/f71d9448-70c8-8793-dc9a-0eb48a570300@huawei.com

Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Link: https://lore.kernel.org/r/20231212044614.42733-2-liangchen.linux@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13 18:35:16 -08:00
Kees Cook
bc044ae9d6 cxgb3: Avoid potential string truncation in desc
Builds with W=1 were warning about potential string truncations:

drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c: In function 'cxgb_up':
drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c:394:38: warning: '%d' directive output may be truncated writing between 1 and 11 bytes into a region of size between 5 and 20 [-Wformat-truncation=]
  394 |                                  "%s-%d", d->name, pi->first_qset + i);
      |                                      ^~
In function 'name_msix_vecs',
    inlined from 'cxgb_up' at drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c:1264:3: drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c:394:34: note: directive argument in the range [-2147483641, 509]
  394 |                                  "%s-%d", d->name, pi->first_qset + i);
      |                                  ^~~~~~~
drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c:393:25: note: 'snprintf' output between 3 and 28 bytes into a destination of size 21
  393 |                         snprintf(adap->msix_info[msi_idx].desc, n,
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  394 |                                  "%s-%d", d->name, pi->first_qset + i);
      |                                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Avoid open-coded %NUL-termination (this code was assuming snprintf
wasn't %NUL terminating when it does -- likely thinking of strncpy),
and grow the size of the string to handle a maximal value.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312100937.ZPZCARhB-lkp@intel.com/
Cc: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20231212220954.work.219-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13 18:32:01 -08:00
Kees Cook
84cc99199a amd-xgbe: Avoid potential string truncation in name
Build with W=1 were warning about a potential string truncation:

drivers/net/ethernet/amd/xgbe/xgbe-drv.c: In function 'xgbe_alloc_channels':
drivers/net/ethernet/amd/xgbe/xgbe-drv.c:211:73: warning: '%u' directive output may be truncated writing between 1 and 10 bytes into a region of size 8 [-Wformat-truncation=]
  211 |                 snprintf(channel->name, sizeof(channel->name), "channel-%u", i);
      |                                                                         ^~
drivers/net/ethernet/amd/xgbe/xgbe-drv.c:211:64: note: directive argument in the range [0, 4294967294]
  211 |                 snprintf(channel->name, sizeof(channel->name), "channel-%u", i);
      |                                                                ^~~~~~~~~~~~
drivers/net/ethernet/amd/xgbe/xgbe-drv.c:211:17: note: 'snprintf' output between 10 and 19 bytes into a destination of size 16
  211 |                 snprintf(channel->name, sizeof(channel->name), "channel-%u", i);
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Increase the size of the "name" buffer to handle the full format range.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202312100937.ZPZCARhB-lkp@intel.com/
Cc: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20231212221312.work.830-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13 18:31:32 -08:00
Jiri Pirko
97f265ef7f dpll: allocate pin ids in cycle
Pin ID is just a number. Nobody should rely on a certain value, instead,
user should use either pin-id-get op or RTNetlink to get it.

Unify the pin ID allocation behavior with what there is already
implemented for dpll devices.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20231212150605.1141261-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13 18:30:31 -08:00
Eric Dumazet
4746b36b1a sctp: support MSG_ERRQUEUE flag in recvmsg()
For some reason sctp_poll() generates EPOLLERR if sk->sk_error_queue
is not empty but recvmsg() can not drain the error queue yet.

This is needed to better support timestamping.

I had to export inet_recv_error(), since sctp
can be compiled as a module.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/20231212145550.3872051-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13 18:30:24 -08:00
Jakub Kicinski
36d8afbb2b Merge branch 'idpf-add-get-set-for-ethtool-s-header-split-ringparam'
Alexander Lobakin says:

====================
idpf: add get/set for Ethtool's header split ringparam

Currently, the header split feature (putting headers in one smaller
buffer and then the data in a separate bigger one) is always enabled
in idpf when supported.
One may want to not have fragmented frames per each packet, for example,
to avoid XDP frags. To better optimize setups for particular workloads,
add ability to switch the header split state on and off via Ethtool's
ringparams, as well as to query the current status.
There's currently only GET in the Ethtool Netlink interface for now,
so add SET first. I suspect idpf is not the only one supporting this.
====================

Link: https://lore.kernel.org/r/20231212142752.935000-1-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13 18:22:10 -08:00
Michal Kubiak
9b1aa3ef23 idpf: add get/set for Ethtool's header split ringparam
idpf supports the header split feature and that feature is always
enabled by default.
However, for flexibility reasons and to simplify some scenarios, it
would be useful to have the support for switching the header split
off (and on) from the userspace.

Address that need by adding the user config parameter, the functions
for disabling (or enabling) the header split feature, and calls to
them from the Ethtool ringparam callbacks.
It still is enabled by default if supported by the hardware.

Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
Co-developed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20231212142752.935000-3-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13 18:22:06 -08:00
Alexander Lobakin
50d7371071 ethtool: add SET for TCP_DATA_SPLIT ringparam
Follow up commit 9690ae6042 ("ethtool: add header/data split
indication") and add the set part of Ethtool's header split, i.e.
ability to enable/disable header split via the Ethtool Netlink
interface. This might be helpful to optimize the setup for particular
workloads, for example, to avoid XDP frags, and so on.
A driver should advertise ``ETHTOOL_RING_USE_TCP_DATA_SPLIT`` in its
ops->supported_ring_params to allow doing that. "Unknown" passed from
the userspace when the header split is supported means the driver is
free to choose the preferred state.

Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20231212142752.935000-2-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13 18:22:02 -08:00
Eric Dumazet
173b6d1cdf docs: networking: timestamping: mention MSG_EOR flag
TCP got MSG_EOR support in linux-4.7.

This is a canonical way of making sure no coalescing
will be performed on the skb, even if it could not be
immediately sent.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20231212110608.3673677-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13 18:19:39 -08:00
Jakub Kicinski
85c2674d53 Merge branch 'add-support-for-dp83tg720s-phy'
Oleksij Rempel says:

====================
add support for DP83TG720S PHY
====================

Link: https://lore.kernel.org/r/20231212054144.87527-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13 18:01:10 -08:00
Oleksij Rempel
cb80ee2f9b net: phy: Add support for the DP83TG720S Ethernet PHY
The DP83TG720S-Q1 device is an IEEE 802.3bp and Open Alliance compliant
automotive Ethernet physical layer transceiver.

This driver was tested with i.MX8MP EQOS (stmmac) on the MAC side and
same TI PHY on other side.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20231212054144.87527-3-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13 18:01:08 -08:00
Oleksij Rempel
0c47615708 net: phy: c45: add genphy_c45_pma_read_ext_abilities() function
Move part of the genphy_c45_pma_read_abilities() code to a separate
function.

Some PHYs do not implement PMA/PMD status 2 register (Register 1.8) but
do implement PMA/PMD extended ability register (Register 1.11). To make
use of it, we need to be able to access this part of code separately.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20231212054144.87527-2-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13 18:01:07 -08:00
Jakub Kicinski
a25ebbf332 Merge branch 'net-sched-optimizations-around-action-binding-and-init'
Pedro Tammela says:

====================
net/sched: optimizations around action binding and init

Scaling optimizations for action binding in rtnl-less filters.
We saw a noticeable lock contention around idrinfo->lock when
testing in a 56 core system, which disappeared after the patches.
====================

Link: https://lore.kernel.org/r/20231211181807.96028-1-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13 17:54:01 -08:00
Pedro Tammela
1dd7f18fc0 net/sched: act_api: skip idr replace on bound actions
tcf_idr_insert_many will replace the allocated -EBUSY pointer in
tcf_idr_check_alloc with the real action pointer, exposing it
to all operations. This operation is only needed when the action pointer
is created (ACT_P_CREATED). For actions which are bound to (returned 0),
the pointer already resides in the idr making such operation a nop.

Even though it's a nop, it's still not a cheap operation as internally
the idr code walks the idr and then does a replace on the appropriate slot.
So if the action was bound, better skip the idr replace entirely.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Link: https://lore.kernel.org/r/20231211181807.96028-3-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13 17:53:59 -08:00
Pedro Tammela
4b55e86736 net/sched: act_api: rely on rcu in tcf_idr_check_alloc
Instead of relying only on the idrinfo->lock mutex for
bind/alloc logic, rely on a combination of rcu + mutex + atomics
to better scale the case where multiple rtnl-less filters are
binding to the same action object.

Action binding happens when an action index is specified explicitly and
an action exists which such index exists. Example:
  tc actions add action drop index 1
  tc filter add ... matchall action drop index 1
  tc filter add ... matchall action drop index 1
  tc filter add ... matchall action drop index 1
  tc filter ls ...
     filter protocol all pref 49150 matchall chain 0 filter protocol all pref 49150 matchall chain 0 handle 0x1
     not_in_hw
           action order 1: gact action drop
            random type none pass val 0
            index 1 ref 4 bind 3

   filter protocol all pref 49151 matchall chain 0 filter protocol all pref 49151 matchall chain 0 handle 0x1
     not_in_hw
           action order 1: gact action drop
            random type none pass val 0
            index 1 ref 4 bind 3

   filter protocol all pref 49152 matchall chain 0 filter protocol all pref 49152 matchall chain 0 handle 0x1
     not_in_hw
           action order 1: gact action drop
            random type none pass val 0
            index 1 ref 4 bind 3

When no index is specified, as before, grab the mutex and allocate
in the idr the next available id. In this version, as opposed to before,
it's simplified to store the -EBUSY pointer instead of the previous
alloc + replace combination.

When an index is specified, rely on rcu to find if there's an object in
such index. If there's none, fallback to the above, serializing on the
mutex and reserving the specified id. If there's one, it can be an -EBUSY
pointer, in which case we just try again until it's an action, or an action.
Given the rcu guarantees, the action found could be dead and therefore
we need to bump the refcount if it's not 0, handling the case it's
in fact 0.

As bind and the action refcount are already atomics, these increments can
happen without the mutex protection while many tcf_idr_check_alloc race
to bind to the same action instance.

In case binding encounters a parallel delete or add, it will return
-EAGAIN in order to try again. Both filter and action apis already
have the retry machinery in-place. In case it's an unlocked filter it
retries under the rtnl lock.

Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Link: https://lore.kernel.org/r/20231211181807.96028-2-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13 17:53:59 -08:00
David S. Miller
604ca8ee7b Merge branch 'virtio-net-dynamic-coalescing-moderation'
Heng Qi says:

====================
virtio-net: support dynamic coalescing moderation

Now, virtio-net already supports per-queue moderation parameter
setting. Based on this, we use the linux dimlib to support
dynamic coalescing moderation for virtio-net.

Due to some scheduling issues, we only support and test the rx dim.

Some test results:

I. Sockperf UDP
=================================================
1. Env
rxq_0 with affinity to cpu_0.

2. Cmd
client: taskset -c 0 sockperf tp -p 8989 -i $IP -t 10 -m 16B
server: taskset -c 0 sockperf sr -p 8989

3. Result
dim off: 1143277.00 rxpps, throughput 17.844 MBps, cpu is 100%.
dim on:  1124161.00 rxpps, throughput 17.610 MBps, cpu is 83.5%.
=================================================

II. Redis
=================================================
1. Env
There are 8 rxqs, and rxq_i with affinity to cpu_i.

2. Result
When all cpus are 100%, ops/sec of memtier_benchmark client is
dim off:  978437.23
dim on:  1143638.28
=================================================

III. Nginx
=================================================
1. Env
There are 8 rxqs and rxq_i with affinity to cpu_i.

2. Result
When all cpus are 100%, requests/sec of wrk client is
dim off:  877931.67
dim on:  1019160.31
=================================================

IV. Latency of sockperf udp
=================================================
1. Rx cmd
taskset -c 0 sockperf sr -p 8989

2. Tx cmd
taskset -c 0 sockperf pp -i ${ip} -p 8989 -t 10

After running this cmd 5 times and averaging the results,

3. Result
dim off: 17.7735 usec
dim on:  18.0110 usec
=================================================

Changelog:
v7->v8:
- Add select DIMLIB.

v6->v7:
- Drop the patch titled "spin lock for ctrl cmd access"
- Use rtnl_trylock to avoid the deadlock.

v5->v6:
- Add patch(4/5): spin lock for ctrl cmd access
- Patch(5/5):
   - Use spin lock and cancel_work_sync to synchronize

v4->v5:
- Patch(4/4):
   - Fix possible synchronization issues with cancel_work_sync.
   - Reduce if/else nesting levels

v3->v4:
- Patch(5/5): drop.

v2->v3:
- Patch(4/5): some minor modifications.

v1->v2:
- Patch(2/5): a minor fix.
- Patch(4/5):
   - improve the judgment of dim switch conditions.
   - Cancel the work when vq reset.
- Patch(5/5): drop the tx dim implementation.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 12:49:05 +00:00
Heng Qi
6208799553 virtio-net: support rx netdim
By comparing the traffic information in the complete napi processes,
let the virtio-net driver automatically adjust the coalescing
moderation parameters of each receive queue.

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 12:49:05 +00:00
Heng Qi
1db43c0818 virtio-net: extract virtqueue coalescig cmd for reuse
Extract commands to set virtqueue coalescing parameters for reuse
by ethtool -Q, vq resize and netdim.

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 12:49:05 +00:00
Heng Qi
d7180080dd virtio-net: separate rx/tx coalescing moderation cmds
This patch separates the rx and tx global coalescing moderation
commands to support netdim switches in subsequent patches.

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 12:49:04 +00:00
Heng Qi
7949c06ad9 virtio-net: returns whether napi is complete
rx netdim needs to count the traffic during a complete napi process,
and start updating and comparing samples to make decisions after
the napi ends. Let virtqueue_napi_complete() return true if napi is done,
otherwise vice versa.

Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 12:49:04 +00:00
David S. Miller
d2e9464e63 Merge branch 'ionic-pci-errors'
Shannon Nelson says:

====================
ionic: updates to PCI error handling

These are improvements to our PCI error handling, including FLR and
AER events.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 12:35:55 +00:00
Shannon Nelson
c3a910e1c4 ionic: fill out pci error handlers
Set up the pci_error_handlers error_detected and resume to be useful in
handling AER events.  If the error detected is pci_channel_io_frozen we
set up to do an FLR at the end of the AER handling - this tends to clear
things up well enough that traffic can continue.  Else, let the AER/PCI
machinery do what is needed for the less serious errors seen.

Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 12:35:55 +00:00
Shannon Nelson
ce66172d33 ionic: lif debugfs refresh on reset
Remove and restore the lif's debugfs pointers on a reset,
and make sure to check for the dentry before removing it
in case an earlier reset failed to rebuild the lif.

Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 12:35:55 +00:00
Shannon Nelson
b0dbe358fb ionic: use timer_shutdown_sync
When stopping the watchdog timer at remove time we should
be using the new timer_shutdown_sync to assure the timer
doesn't ever get rearmed.

Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 12:35:55 +00:00
Shannon Nelson
219e183272 ionic: no fw read when PCI reset failed
If there was a failed attempt to reset the PCI connection,
don't later try to read from PCI as the space is unmapped
and will cause a paging request crash.  When clearing the PCI
setup we can clear the dev_info register pointer, and check
it before using it in the fw_running test.

Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 12:35:55 +00:00
Shannon Nelson
13943d6c82 ionic: prevent pci disable of already disabled device
If a reset fails, the PCI device is left in a disabled
state, so don't try to disable it again on driver remove.
This prevents a scary looking WARN trace in the kernel log.

    ionic 0000:2b:00.0: disabling already-disabled device

Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 12:35:55 +00:00
Shannon Nelson
ca5fdf9a7c ionic: bypass firmware cmds when stuck in reset
If the driver or firmware is stuck in reset state, don't bother
trying to use adminq commands.  This speeds up shutdown and
prevents unnecessary timeouts and error messages.

This includes a bit of rework on ionic_adminq_post_wait()
and ionic_adminq_post_wait_nomsg() to both use
__ionic_adminq_post_wait() which can do the checks needed in
both cases.

Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 12:35:55 +00:00
Shannon Nelson
45b84188a0 ionic: keep filters across FLR
Make sure we keep and replay the filters and RSS config across
an FLR by using our FW_RESET flag.  This gets checked on the
way down and on the way back up to help determine how much LIF
state to keep and restore across a reset action.

Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 12:35:55 +00:00
Shannon Nelson
24f110240c ionic: pass opcode to devcmd_wait
Don't rely on the PCI memory for the devcmd opcode because we
read a 0xff value if the PCI bus is broken, which can cause us
to report a bogus dev_cmd opcode later.

Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 12:35:54 +00:00
Furong Xu
e5bc1f4c65 net: stmmac: mmc: Support more counters for XGMAC Core
Complete all counters on XGMAC Core.
These can be useful for debugging.

Signed-off-by: Furong Xu <0x1207@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 10:38:24 +00:00
David S. Miller
83691d6fa7 Merge branch 'net-at803x-cleanups'
Christian Marangi says:

====================
net: phy: at803x: cleanup

The intention of this big series is to try to cleanup the big
at803x PHY driver.

It currently have 3 different family of PHY in it. at803x, qca83xx
and qca808x.

The current codebase required lots of cleanup and reworking to
make the split possible as currently there is a greater use of
adding special function matching the phy_id.

This has been reworked to make the function actually generic
and make the change only in more specific one. The result
is the addition of micro additional function but that is for good
as it massively simplify splitting the driver later.

Consider that this is all in preparation for the addition of
qca807x PHY driver that will also uso some of the functions of
at803x.

Subsequent series will come with the actual PHY split and other
required cleanup. This is only to start the process with minor
changes.

Changes v4:
- Improve at8031_probe function
Changes v3:
- Add Reviewed-by tag from Andrew
- Split patch 10 (at8031 rename) to rename and move
Changes v2:
- Drop split part due to series too big
- Split changes even more
- Fix problem pointed out by Russell (flawed reworked function logic)
- Add Reviewed-by tag from Andrew
- Minor rework to prevent further code duplication for cdt
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 10:34:28 +00:00
Christian Marangi
ef9df47b44 net: phy: at803x: drop specific PHY ID check from cable test functions
Drop specific PHY ID check for cable test functions for at803x. This is
done to make functions more generic. While at it better describe what
the functions does by using more symbolic function names.

PHYs that requires to set additional reg are moved to specific function
calling the more generic one.

cdt_start and cdt_wait_for_completion are changed to take an additional
arg to pass specific values specific to the PHY.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 10:34:28 +00:00
Christian Marangi
21a2802a83 net: phy: at803x: move at8035 specific DT parse to dedicated probe
Move at8035 specific DT parse for clock out frequency to dedicated probe
to make at803x probe function more generic.

This is to tidy code and no behaviour change are intended.

Detection logic is changed, we check if the clk 25m mask is set and if
it's not zero, we assume the qca,clk-out-frequency property is set.

The property is checked in the generic at803x_parse_dt called by
at803x_probe.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 10:34:28 +00:00
Christian Marangi
f932a6dc8b net: phy: at803x: move at8031 functions in dedicated section
Move at8031 functions in dedicated section with dedicated at8031
parse_dt and probe.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 10:34:28 +00:00
Christian Marangi
a5ab9d8e7a net: phy: at803x: make at8031 related DT functions name more specific
Rename at8031 related DT function name to a more specific name
referencing they are only related to at8031 and not to the generic
at803x PHY family.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 10:34:28 +00:00
Christian Marangi
30dd62191d net: phy: at803x: move specific at8031 config_intr to dedicated function
Move specific at8031 config_intr bits to dedicated function to make
at803x_config_initr more generic.

This is needed in preparation for PHY driver split as qca8081 share the
same function to setup interrupts.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 10:34:28 +00:00
Christian Marangi
27b89c9dc1 net: phy: at803x: move specific at8031 WOL bits to dedicated function
Move specific at8031 WOL enable/disable to dedicated function to make
at803x_set_wol more generic.

This is needed in preparation for PHY driver split as qca8081 share the
same function to toggle WOL settings.

In this new implementation WOL module in at8031 is enabled after the
generic interrupt is setup. This should not cause any problem as the
WOL_INT has a separate implementation and only relay on MAC bits.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 10:34:28 +00:00
Christian Marangi
3ae3bc426e net: phy: at803x: move specific at8031 config_init to dedicated function
Move specific at8031 config_init to dedicated function to make
at803x_config_init more generic and tidy things up.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 10:34:28 +00:00
Christian Marangi
25d2ba9400 net: phy: at803x: move specific at8031 probe mode check to dedicated probe
Move specific at8031 probe mode check to dedicated probe to make
at803x_probe more generic and keep code tidy.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 10:34:28 +00:00
Christian Marangi
900eef75cc net: phy: at803x: move specific DT option for at8031 to specific probe
Move specific DT options for at8031 to specific probe to tidy things up
and make at803x_parse_dt more generic.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 10:34:28 +00:00
Christian Marangi
d43cff3f82 net: phy: at803x: move qca83xx specific check in dedicated functions
Rework qca83xx specific check to dedicated function to tidy things up
and drop useless phy_id check.

Also drop an useless link_change_notify for QCA8337 as it did nothing an
returned early.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 10:34:27 +00:00
Christian Marangi
07b1ad83b9 net: phy: at803x: raname hw_stats functions to qca83xx specific name
The function and the struct related to hw_stats were specific to qca83xx
PHY but were called following the convention in the driver of calling
everything with at803x prefix.

To better organize the code, rename these function a more specific name
to better describe that they are specific to 83xx PHY family.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 10:34:27 +00:00
Christian Marangi
6a3b8c573b net: phy: at803x: move disable WOL to specific at8031 probe
Move the WOL disable call to specific at8031 probe to make at803x_probe
more generic and drop extra check for PHY ID.

Keep the same previous behaviour by first calling at803x_probe and then
disabling WOL.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 10:34:27 +00:00
Christian Marangi
f8fdbf3389 net: phy: at803x: fix passing the wrong reference for config_intr
Fix passing the wrong reference for config_initr on passing the function
pointer, drop the wrong & from at803x_config_intr in the PHY struct.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 10:34:27 +00:00
Jiri Pirko
4f7aa122bc dpll: remove leftover mode_supported() op and use mode_get() instead
Mode supported is currently reported to the user exactly the same, as
the current mode. That's because mode changing is not implemented.
Remove the leftover mode_supported() op and use mode_get() to fill up
the supported mode exposed to user.

One, if even, mode changing is going to be introduced, this could be
very easily taken back. In the meantime, prevent drivers form
implementing this in wrong way (as for example recent netdevsim
implementation attempt intended to do).

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-12-13 10:31:19 +00:00
Jakub Kicinski
9bab51bd66 Merge branch 'bnxt_en-update-for-net-next'
Michael Chan says:

====================
bnxt_en: Update for net-next

The first 4 patches in the series fix issues in the net-next tree
introduced in the last 4 weeks.  The first 3 patches fix ring accounting
and indexing logic.  The 4th patch fix TX timeout when the TX ring is
very small.

The next 7 patches add new features on the P7 chips, including TX
coalesced completions, VXLAN GPE and UDP GSO stateless offload, a
new rx_filter_miss counters, and more QP backing store memory for
RoCE.

The last 2 patches are PTP improvements.
====================

Link: https://lore.kernel.org/r/20231212005122.2401-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-12 16:06:01 -08:00
Pavan Chebbi
056bce63c4 bnxt_en: Make PTP TX timestamp HWRM query silent
In a busy network, especially with flow control enabled, we may
experience timestamp query failures fairly regularly. After a while,
dmesg may be flooded with timestamp query failure error messages.

Silence the error message from the low level hwrm function that
sends the firmware message.  Change netdev_err() to netdev_WARN_ONCE()
if this FW call ever fails.

Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20231212005122.2401-14-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-12 16:05:59 -08:00
Pavan Chebbi
84793a4995 bnxt_en: Skip nic close/open when configuring tstamp filters
We don't have to close and open the nic to make sure we have
valid rx timestamps. Once we have the timestamp filter applied to
the HW and the timestamp_fld_format bit is cleared in the rx
completion and the timestamp is non-zero, we can be sure that rx
timestamp is valid data.

Skip close/open when we set any timestamp filter.

Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20231212005122.2401-13-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-12 16:05:58 -08:00