Commit Graph

111009 Commits

Author SHA1 Message Date
Yang Yingliang
2e7bf4a6af net: axienet: add missing error return code in axienet_probe()
It should return error code in error path in axienet_probe().

Fixes: 00be43a74c ("net: axienet: make the 64b addresable DMA depends on 64b archectures")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20220616062917.3601-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-16 11:08:38 -07:00
Jose Alonso
36a15e1cb1 net: usb: ax88179_178a needs FLAG_SEND_ZLP
The extra byte inserted by usbnet.c when
 (length % dev->maxpacket == 0) is causing problems to device.

This patch sets FLAG_SEND_ZLP to avoid this.

Tested with: 0b95:1790 ASIX Electronics Corp. AX88179 Gigabit Ethernet

Problems observed:
======================================================================
1) Using ssh/sshfs. The remote sshd daemon can abort with the message:
   "message authentication code incorrect"
   This happens because the tcp message sent is corrupted during the
   USB "Bulk out". The device calculate the tcp checksum and send a
   valid tcp message to the remote sshd. Then the encryption detects
   the error and aborts.
2) NETDEV WATCHDOG: ... (ax88179_178a): transmit queue 0 timed out
3) Stop normal work without any log message.
   The "Bulk in" continue receiving packets normally.
   The host sends "Bulk out" and the device responds with -ECONNRESET.
   (The netusb.c code tx_complete ignore -ECONNRESET)
Under normal conditions these errors take days to happen and in
intense usage take hours.

A test with ping gives packet loss, showing that something is wrong:
ping -4 -s 462 {destination}	# 462 = 512 - 42 - 8
Not all packets fail.
My guess is that the device tries to find another packet starting
at the extra byte and will fail or not depending on the next
bytes (old buffer content).
======================================================================

Signed-off-by: Jose Alonso <joalonsof@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-15 09:32:12 +01:00
David S. Miller
371de1aa00 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-06-14

This series contains updates to ice driver only.

Michal fixes incorrect Tx timestamp offset calculation for E822 devices.

Roman enforces required VLAN filtering settings for double VLAN mode.

Przemyslaw fixes memory corruption issues with VFs by ensuring
queues are disabled in the error path of VF queue configuration and to
disabled VFs during reset.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-15 09:15:33 +01:00
Christophe JAILLET
d7dd6eccfb net: bgmac: Fix an erroneous kfree() in bgmac_remove()
'bgmac' is part of a managed resource allocated with bgmac_alloc(). It
should not be freed explicitly.

Remove the erroneous kfree() from the .remove() function.

Fixes: 34a5102c32 ("net: bgmac: allocate struct bgmac just once & don't copy it")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/a026153108dd21239036a032b95c25b5cece253b.1655153616.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-14 19:16:36 -07:00
Przemyslaw Patynowski
efe4186000 ice: Fix memory corruption in VF driver
Disable VF's RX/TX queues, when it's disabled. VF can have queues enabled,
when it requests a reset. If PF driver assumes that VF is disabled,
while VF still has queues configured, VF may unmap DMA resources.
In such scenario device still can map packets to memory, which ends up
silently corrupting it.
Previously, VF driver could experience memory corruption, which lead to
crash:
[ 5119.170157] BUG: unable to handle kernel paging request at 00001b9780003237
[ 5119.170166] PGD 0 P4D 0
[ 5119.170173] Oops: 0002 [#1] PREEMPT_RT SMP PTI
[ 5119.170181] CPU: 30 PID: 427592 Comm: kworker/u96:2 Kdump: loaded Tainted: G        W I      --------- -  - 4.18.0-372.9.1.rt7.166.el8.x86_64 #1
[ 5119.170189] Hardware name: Dell Inc. PowerEdge R740/014X06, BIOS 2.3.10 08/15/2019
[ 5119.170193] Workqueue: iavf iavf_adminq_task [iavf]
[ 5119.170219] RIP: 0010:__page_frag_cache_drain+0x5/0x30
[ 5119.170238] Code: 0f 0f b6 77 51 85 f6 74 07 31 d2 e9 05 df ff ff e9 90 fe ff ff 48 8b 05 49 db 33 01 eb b4 0f 1f 80 00 00 00 00 0f 1f 44 00 00 <f0> 29 77 34 74 01 c3 48 8b 07 f6 c4 80 74 0f 0f b6 77 51 85 f6 74
[ 5119.170244] RSP: 0018:ffffa43b0bdcfd78 EFLAGS: 00010282
[ 5119.170250] RAX: ffffffff896b3e40 RBX: ffff8fb282524000 RCX: 0000000000000002
[ 5119.170254] RDX: 0000000049000000 RSI: 0000000000000000 RDI: 00001b9780003203
[ 5119.170259] RBP: ffff8fb248217b00 R08: 0000000000000022 R09: 0000000000000009
[ 5119.170262] R10: 2b849d6300000000 R11: 0000000000000020 R12: 0000000000000000
[ 5119.170265] R13: 0000000000001000 R14: 0000000000000009 R15: 0000000000000000
[ 5119.170269] FS:  0000000000000000(0000) GS:ffff8fb1201c0000(0000) knlGS:0000000000000000
[ 5119.170274] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5119.170279] CR2: 00001b9780003237 CR3: 00000008f3e1a003 CR4: 00000000007726e0
[ 5119.170283] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 5119.170286] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 5119.170290] PKRU: 55555554
[ 5119.170292] Call Trace:
[ 5119.170298]  iavf_clean_rx_ring+0xad/0x110 [iavf]
[ 5119.170324]  iavf_free_rx_resources+0xe/0x50 [iavf]
[ 5119.170342]  iavf_free_all_rx_resources.part.51+0x30/0x40 [iavf]
[ 5119.170358]  iavf_virtchnl_completion+0xd8a/0x15b0 [iavf]
[ 5119.170377]  ? iavf_clean_arq_element+0x210/0x280 [iavf]
[ 5119.170397]  iavf_adminq_task+0x126/0x2e0 [iavf]
[ 5119.170416]  process_one_work+0x18f/0x420
[ 5119.170429]  worker_thread+0x30/0x370
[ 5119.170437]  ? process_one_work+0x420/0x420
[ 5119.170445]  kthread+0x151/0x170
[ 5119.170452]  ? set_kthread_struct+0x40/0x40
[ 5119.170460]  ret_from_fork+0x35/0x40
[ 5119.170477] Modules linked in: iavf sctp ip6_udp_tunnel udp_tunnel mlx4_en mlx4_core nfp tls vhost_net vhost vhost_iotlb tap tun xt_CHECKSUM ipt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink bridge stp llc rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache sunrpc intel_rapl_msr iTCO_wdt iTCO_vendor_support dell_smbios wmi_bmof dell_wmi_descriptor dcdbas kvm_intel kvm irqbypass intel_rapl_common isst_if_common skx_edac irdma nfit libnvdimm x86_pkg_temp_thermal i40e intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ib_uverbs rapl ipmi_ssif intel_cstate intel_uncore mei_me pcspkr acpi_ipmi ib_core mei lpc_ich i2c_i801 ipmi_si ipmi_devintf wmi ipmi_msghandler acpi_power_meter xfs libcrc32c sd_mod t10_pi sg mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ice ahci drm libahci crc32c_intel libata tg3 megaraid_sas
[ 5119.170613]  i2c_algo_bit dm_mirror dm_region_hash dm_log dm_mod fuse [last unloaded: iavf]
[ 5119.170627] CR2: 00001b9780003237

Fixes: ec4f5a436b ("ice: Check if VF is disabled for Opcode and other operations")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Co-developed-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-14 09:38:57 -07:00
Przemyslaw Patynowski
be2af71496 ice: Fix queue config fail handling
Disable VF's RX/TX queues, when VIRTCHNL_OP_CONFIG_VSI_QUEUES fail.
Not disabling them might lead to scenario, where PF driver leaves VF
queues enabled, when VF's VSI failed queue config.
In this scenario VF should not have RX/TX queues enabled. If PF failed
to set up VF's queues, VF will reset due to TX timeouts in VF driver.
Initialize iterator 'i' to -1, so if error happens prior to configuring
queues then error path code will not disable queue 0. Loop that
configures queues will is using same iterator, so error path code will
only disable queues that were configured.

Fixes: 77ca27c417 ("ice: add support for virtchnl_queue_select.[tx|rx]_queues bitmap")
Suggested-by: Slawomir Laba <slawomirx.laba@intel.com>
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-14 09:38:57 -07:00
Roman Storozhenko
9542ef4fba ice: Sync VLAN filtering features for DVM
VLAN filtering features, that is C-Tag and S-Tag, in DVM mode must be
both enabled or disabled.
In case of turning off/on only one of the features, another feature must
be turned off/on automatically with issuing an appropriate message to
the kernel log.

Fixes: 1babaf77f4 ("ice: Advertise 802.1ad VLAN filtering and offloads for PF netdev")
Signed-off-by: Roman Storozhenko <roman.storozhenko@intel.com>
Co-developed-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com>
Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-14 09:38:57 -07:00
Michal Michalik
71a579f0d3 ice: Fix PTP TX timestamp offset calculation
The offset was being incorrectly calculated for E822 - that led to
collisions in choosing TX timestamp register location when more than
one port was trying to use timestamping mechanism.

In E822 one quad is being logically split between ports, so quad 0 is
having trackers for ports 0-3, quad 1 ports 4-7 etc. Each port should
have separate memory location for tracking timestamps. Due to error for
example ports 1 and 2 had been assigned to quad 0 with same offset (0),
while port 1 should have offset 0 and 1 offset 16.

Fix it by correctly calculating quad offset.

Fixes: 3a7496234d ("ice: implement basic E822 PTP support")
Signed-off-by: Michal Michalik <michal.michalik@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-14 09:35:57 -07:00
Petr Machata
4b7a632ac4 mlxsw: spectrum_cnt: Reorder counter pools
Both RIF and ACL flow counters use a 24-bit SW-managed counter address to
communicate which counter they want to bind.

In a number of Spectrum FW releases, binding a RIF counter is broken and
slices the counter index to 16 bits. As a result, on Spectrum-2 and above,
no more than about 410 RIF counters can be effectively used. This
translates to 205 netdevices for which L3 HW stats can be enabled. (This
does not happen on Spectrum-1, because there are fewer counters available
overall and the counter index never exceeds 16 bits.)

Binding counters to ACLs does not have this issue. Therefore reorder the
counter allocation scheme so that RIF counters come first and therefore get
lower indices that are below the 16-bit barrier.

Fixes: 98e60dce4d ("Merge branch 'mlxsw-Introduce-initial-Spectrum-2-support'")
Reported-by: Maksym Yaremchuk <maksymy@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20220613125017.2018162-1-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-14 16:00:37 +02:00
Jean-Philippe Brucker
884c65e4da amd-xgbe: Use platform_irq_count()
The AMD XGbE driver currently counts the number of interrupts assigned
to the device by inspecting the pdev->resource array. Since commit
a1a2b7125e ("of/platform: Drop static setup of IRQ resource from DT
core") removed IRQs from this array, the driver now attempts to get all
interrupts from 1 to -1U and gives up probing once it reaches an invalid
interrupt index.

Obtain the number of IRQs with platform_irq_count() instead.

Fixes: a1a2b7125e ("of/platform: Drop static setup of IRQ resource from DT core")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/20220609161457.69614-1-jean-philippe@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-13 23:12:39 -07:00
Suman Ghosh
619c010a65 octeontx2-vf: Add support for adaptive interrupt coalescing
Fixes: 6e144b47f5 (octeontx2-pf: Add support for adaptive interrupt coalescing)
Added support for VF interfaces as well.

Signed-off-by: Suman Ghosh <sumang@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 13:42:24 +01:00
David S. Miller
5f7b84151a xilinx: Fix build on x86.
CONFIG_64BIT is not sufficient for checking for availability of
iowrite64() and friends.

Also, the out_addr helpers need to be inline.

Fixes: b690f8df64 ("net: axienet: Use iowrite64 to write all 64b descriptor pointers")
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 12:49:21 +01:00
Andy Chiu
b690f8df64 net: axienet: Use iowrite64 to write all 64b descriptor pointers
According to commit f735c40ed9 ("net: axienet: Autodetect 64-bit DMA
capability") and AXI-DMA spec (pg021), on 64-bit capable dma, only
writing MSB part of tail descriptor pointer causes DMA engine to start
fetching descriptors. However, we found that it is true only if dma is in
idle state. In other words, dma would use a tailp even if it only has LSB
updated, when the dma is running.

The non-atomicity of this behavior could be problematic if enough
delay were introduced in between the 2 writes. For example, if an
interrupt comes right after the LSB write and the cpu spends long
enough time in the handler for the dma to get back into idle state by
completing descriptors, then the seconcd write to MSB would treat dma
to start fetching descriptors again. Since the descriptor next to the
one pointed by current tail pointer is not filled by the kernel yet,
fetching a null descriptor here causes a dma internal error and halt
the dma engine down.

We suggest that the dma engine should start process a 64-bit MMIO write
to the descriptor pointer only if ONE 32-bit part of it is written on all
states. Or we should restrict the use of 64-bit addressable dma on 32-bit
platforms, since those devices have no instruction to guarantee the write
to LSB and MSB part of tail pointer occurs atomically to the dma.

initial condition:
curp =  x-3;
tailp = x-2;
LSB = x;
MSB = 0;

cpu:                       |dma:
 iowrite32(LSB, tailp)     |  completes #(x-3) desc, curp = x-3
 ...                       |  tailp updated
 => irq                    |  completes #(x-2) desc, curp = x-2
    ...                    |  completes #(x-1) desc, curp = x-1
    ...                    |  ...
    ...                    |  completes #x desc, curp = tailp = x
 <= irqreturn              |  reaches tailp == curp = x, idle
 iowrite32(MSB, tailp + 4) |  ...
                           |  tailp updated, starts fetching...
                           |  fetches #(x + 1) desc, sees cntrl = 0
                           |  post Tx error, halt

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reported-by: Max Hsu <max.hsu@sifive.com>
Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 12:36:55 +01:00
Andy Chiu
00be43a74c net: axienet: make the 64b addresable DMA depends on 64b archectures
Currently it is not safe to config the IP as 64-bit addressable on 32-bit
archectures, which cannot perform a double-word store on its descriptor
pointers. The pointer is 64-bit wide if the IP is configured as 64-bit,
and the device would process the partially updated pointer on some
states if the pointer was updated via two store-words. To prevent such
condition, we force a probe fail if we discover that the IP has 64-bit
capability but it is not running on a 64-Bit kernel.

This is a series of patch (1/2). The next patch must be applied in order
to make 64b DMA safe on 64b archectures.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reported-by: Max Hsu <max.hsu@sifive.com>
Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 12:36:55 +01:00
Guangbin Huang
12a3670887 net: hns3: fix tm port shapping of fibre port is incorrect after driver initialization
Currently in driver initialization process, driver will set shapping
parameters of tm port to default speed read from firmware. However, the
speed of SFP module may not be default speed, so shapping parameters of
tm port may be incorrect.

To fix this problem, driver sets new shapping parameters for tm port
after getting exact speed of SFP module in this case.

Fixes: 88d10bd6f7 ("net: hns3: add support for multiple media type")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 11:56:01 +01:00
Jie Wang
71b215f36d net: hns3: fix PF rss size initialization bug
Currently hns3 driver misuses the VF rss size to initialize the PF rss size
in hclge_tm_vport_tc_info_update. So this patch fix it by checking the
vport id before initialization.

Fixes: 7347255ea3 ("net: hns3: refactor PF rss get APIs with new common rss get APIs")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 11:56:01 +01:00
Guangbin Huang
e93530ae0e net: hns3: restore tm priority/qset to default settings when tc disabled
Currently, settings parameters of schedule mode, dwrr, shaper of tm
priority or qset of one tc are only be set when tc is enabled, they are
not restored to the default settings when tc is disabled. It confuses
users when they cat tm_priority or tm_qset files of debugfs. So this
patch fixes it.

Fixes: 848440544b ("net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 11:56:01 +01:00
Jie Wang
cfd80687a5 net: hns3: modify the ring param print info
Currently tx push is also a ring param. So the original ring param print
info in hns3_is_ringparam_changed should be adjusted.

Fixes: 07fdc163ac ("net: hns3: refactor hns3_set_ringparam()")
Signed-off-by: Jie Wang <wangjie125@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 11:56:01 +01:00
Jian Shen
283847e3ef net: hns3: don't push link state to VF if unalive
It's unnecessary to push link state to unalive VF, and the VF will
query link state from PF when it being start works.

Fixes: 18b6e31f8b ("net: hns3: PF add support for pushing link status to VFs")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 11:56:01 +01:00
Guangbin Huang
9eda7d8bcb net: hns3: set port base vlan tbl_sta to false before removing old vlan
When modify port base vlan, the port base vlan tbl_sta needs to set to
false before removing old vlan, to indicate this operation is not finish.

Fixes: c0f46de30c ("net: hns3: fix port base vlan add fail when concurrent with reset")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-13 11:56:01 +01:00
Jakub Kicinski
145684d9f9 Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-06-09

Grzegorz prevents addition of TC flower filters to TC0 and fixes queue
iteration for VF ADQ to number of actual queues for i40e.

Aleksandr prevents running of ethtool tests when device is being reset
for i40e.

Michal resolves an issue where iavf does not report its MAC address
properly.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  iavf: Fix issue with MAC address of VF shown as zero
  i40e: Fix call trace in setup_tx_descriptors
  i40e: Fix calculating the number of queue pairs
  i40e: Fix adding ADQ filter to TC0
====================

Link: https://lore.kernel.org/r/20220609162620.2619258-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-10 22:10:30 -07:00
Jakub Kicinski
bf56a0917f mlx5-fixes-2022-06-08
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmKg7PAACgkQSD+KveBX
 +j7snAgAqdyRrGVVfTDd7lMqjNJu12KA14LUVvBchtUod5KBGsuwbP2KAC0dRHRo
 F6zLwIVfjf3ZICJDdZYYMDUyp3kuaO1iS1tQq7a1N1zo/cepdzDlbnfRikCWQSq8
 yM3vvBPiy3UEF4duMZW2eMmkLW89dKsd7MwK5pQ1LitbnGgdR7x6nh5WR6FNFjrD
 bvMtH9qiePIIWn//wfz4FKJdCzGJN4URyS/YRH5SnbR0pzpucOUOEhlj1XTXyWG5
 sDwugKqYm2JcmMEVvHw+8r8ZWEZght3B1qRzbO4OtHYng3CZ0pCZnQ7la+CWZG/y
 XNEzkI+y+8kFlkNPeveok/pE/aBMEQ==
 =ICLs
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-fixes-2022-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2022-06-08

This series provides bug fixes to mlx5 driver.

* tag 'mlx5-fixes-2022-06-08' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: fs, fail conflicting actions
  net/mlx5: Rearm the FW tracer after each tracer event
  net/mlx5: E-Switch, pair only capable devices
  net/mlx5e: CT: Fix cleanup of CT before cleanup of TC ct rules
  Revert "net/mlx5e: Allow relaxed ordering over VFs"
  MAINTAINERS: adjust MELLANOX ETHERNET INNOVA DRIVERS to TLS support removal
====================

Link: https://lore.kernel.org/r/20220608185855.19818-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09 22:05:37 -07:00
Etienne van der Linde
a0b843340d nfp: flower: restructure flow-key for gre+vlan combination
Swap around the GRE and VLAN parts in the flow-key offloaded by
the driver to fit in with other tunnel types and the firmware.
Without this change used cases with GRE+VLAN on the outer header
does not get offloaded as the flow-key mismatches what the
firmware expect.

Fixes: 0d630f5898 ("nfp: flower: add support to offload QinQ match")
Fixes: 5a2b930416 ("nfp: flower-ct: compile match sections of flow_payload")
Signed-off-by: Etienne van der Linde <etienne.vanderlinde@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09 22:02:38 -07:00
Fei Qin
03d5005ff7 nfp: avoid unnecessary check warnings in nfp_app_get_vf_config
nfp_net_sriov_check is added in nfp_app_get_vf_config which intends
to ensure ivi->vlan_proto and ivi->max_tx_rate/min_tx_rate can be
read from VF config table only when firmware supports corresponding
capability.

However, "nfp_app_get_vf_config" can be called by commands like
"ip a", "ip link set $DEV up" and "ip link set $DEV vf $NUM vlan
$param" (with VF). When using commands above, many warnings
"ndo_set_vf_<cap_x> not supported" would appear if firmware doesn't
support VF rate limit and 802.1ad VLAN assingment. If more VFs are
created, things could get worse.

Thus, this patch add an extra bool parameter for nfp_net_sriov_check
to enable/disable the cap check warning report. Unnecessary warnings
in nfp_app_get_vf_config can be avoided. Valid warnings in kinds of
vf setting function can be reserved.

Fixes: e0d0e1fdf1 ("nfp: VF rate limit support")
Fixes: 59359597b0 ("nfp: support 802.1ad VLAN assingment to VF")
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09 22:02:38 -07:00
Linus Torvalds
825464e79d Networking fixes for 5.19-rc2, including fixes from bpf and netfilter.
Current release - regressions:
   - eth: amt: fix possible null-ptr-deref in amt_rcv()
 
 Previous releases - regressions:
   - tcp: use alloc_large_system_hash() to allocate table_perturb
 
   - af_unix: fix a data-race in unix_dgram_peer_wake_me()
 
   - nfc: st21nfca: fix memory leaks in EVT_TRANSACTION handling
 
   - eth: ixgbe: fix unexpected VLAN rx in promisc mode on VF
 
 Previous releases - always broken:
   - ipv6: fix signed integer overflow in __ip6_append_data
 
   - netfilter:
     - nat: really support inet nat without l3 address
     - nf_tables: memleak flow rule from commit path
 
   - bpf: fix calling global functions from BPF_PROG_TYPE_EXT programs
 
   - openvswitch: fix misuse of the cached connection on tuple changes
 
   - nfc: nfcmrvl: fix memory leak in nfcmrvl_play_deferred
 
   - eth: altera: fix refcount leak in altera_tse_mdio_create
 
 Misc:
   - add Quentin Monnet to bpftool maintainers
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmKhykgSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkN7sQAIn+ZmzQqTm5MVWnlvt/GcRGjjMP2VQY
 60oS2re8QC773yWoP6PvXqxCSFc99paDCC5BmCK6DMLbp9yuVSp5W8iAPuFuyjXE
 /Nur4Ti57LcGJ8ZpcJheBD4cRFbf+xtsGzx9a1WhUDrCYASo7vqRes5Eos2dT7P7
 qjgTduhUtaj6S1CfenfTnYqemZPzSGa+1euDuQ/Bu4mjCPUTrNZZQVYjmfTYM9p1
 UzwfCQr9TtmRKo8wLFHnYDLoWHNpfp55SNL0ShAwIQqgldiJ2OdMje+a2Sa4m6uF
 etRz8H0WrGVqfneD424tdyZv4nwhHw5dnaSrGe8DGq98c4/lIIcVyC38oDAbfWqI
 l8p7ZmtvNid7rpgoQFcxKpb2TAYAI+jaFq5GySEhvj5ZAblNQgFyghfMGPoncXCO
 XW6va8TtP2lmHFScAljQiQb6GNwDO52x77/q14Jkwvr+DILRKXMZZ3hCGrKUn5JM
 lafGkdL5ufm+E9C9RlaWN3imb2KoRj+wdThgV79efEPGG1py7yLOPVMoOCP3qmLq
 torcGcfDi1LGb7ohQxN6tCMv0JgXjS5nd1i+qJnImpkhRrUmahOfmpnElHoPuzs3
 6FU8HR77Eo15x70Jt+WOMy4oXrNh2MeEm8/Fhpj84MEhKpxVn+2o/53M+++5h+ru
 YtiLwEri0dCA
 =rdoB
 -----END PGP SIGNATURE-----

Merge tag 'net-5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from bpf and netfilter.

  Current release - regressions:

   - eth: amt: fix possible null-ptr-deref in amt_rcv()

  Previous releases - regressions:

   - tcp: use alloc_large_system_hash() to allocate table_perturb

   - af_unix: fix a data-race in unix_dgram_peer_wake_me()

   - nfc: st21nfca: fix memory leaks in EVT_TRANSACTION handling

   - eth: ixgbe: fix unexpected VLAN rx in promisc mode on VF

  Previous releases - always broken:

   - ipv6: fix signed integer overflow in __ip6_append_data

   - netfilter:
       - nat: really support inet nat without l3 address
       - nf_tables: memleak flow rule from commit path

   - bpf: fix calling global functions from BPF_PROG_TYPE_EXT programs

   - openvswitch: fix misuse of the cached connection on tuple changes

   - nfc: nfcmrvl: fix memory leak in nfcmrvl_play_deferred

   - eth: altera: fix refcount leak in altera_tse_mdio_create

  Misc:

   - add Quentin Monnet to bpftool maintainers"

* tag 'net-5.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits)
  net: amd-xgbe: fix clang -Wformat warning
  tcp: use alloc_large_system_hash() to allocate table_perturb
  net: dsa: realtek: rtl8365mb: fix GMII caps for ports with internal PHY
  net: dsa: mv88e6xxx: correctly report serdes link failure
  net: dsa: mv88e6xxx: fix BMSR error to be consistent with others
  net: dsa: mv88e6xxx: use BMSR_ANEGCOMPLETE bit for filling an_complete
  net: altera: Fix refcount leak in altera_tse_mdio_create
  net: openvswitch: fix misuse of the cached connection on tuple changes
  net: ethernet: mtk_eth_soc: fix misuse of mem alloc interface netdev[napi]_alloc_frag
  ip_gre: test csum_start instead of transport header
  au1000_eth: stop using virt_to_bus()
  ipv6: Fix signed integer overflow in l2tp_ip6_sendmsg
  ipv6: Fix signed integer overflow in __ip6_append_data
  nfc: nfcmrvl: Fix memory leak in nfcmrvl_play_deferred
  nfc: st21nfca: fix incorrect sizing calculations in EVT_TRANSACTION
  nfc: st21nfca: fix memory leaks in EVT_TRANSACTION handling
  nfc: st21nfca: fix incorrect validating logic in EVT_TRANSACTION
  net: ipv6: unexport __init-annotated seg6_hmac_init()
  net: xfrm: unexport __init-annotated xfrm4_protocol_init()
  net: mdio: unexport __init-annotated mdio_bus_init()
  ...
2022-06-09 12:06:52 -07:00
Linus Torvalds
842c3b3ddc mellanox: mlx5: avoid uninitialized variable warning with gcc-12
gcc-12 started warning about 'tracker' being used uninitialized:

  drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c: In function ‘mlx5_do_bond’:
  drivers/net/ethernet/mellanox/mlx5/core/lag/lag.c:786:28: warning: ‘tracker’ is used uninitialized [-Wuninitialized]
    786 |         struct lag_tracker tracker;
        |                            ^~~~~~~

which seems to be because it doesn't track how the use (and
initialization) is bound by the 'do_bond' flag.

But admittedly that 'do_bond' usage is fairly complicated, and involves
passing it around as an argument to helper functions, so it's somewhat
understandable that gcc doesn't see how that all works.

This function could be rewritten to make the use of that tracker
variable more obviously safe, but for now I'm just adding the forced
initialization of it.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-09 10:03:28 -07:00
Michal Wilczynski
6456038442 iavf: Fix issue with MAC address of VF shown as zero
After reinitialization of iavf, ice driver gets VIRTCHNL_OP_ADD_ETH_ADDR
message with incorrectly set type of MAC address. Hardware address should
have is_primary flag set as true. This way ice driver knows what it has
to set as a MAC address.

Check if the address is primary in iavf_add_filter function and set flag
accordingly.

To test set all-zero MAC on a VF. This triggers iavf re-initialization
and VIRTCHNL_OP_ADD_ETH_ADDR message gets sent to PF.
For example:

ip link set dev ens785 vf 0 mac 00:00:00:00:00:00

This triggers re-initialization of iavf. New MAC should be assigned.
Now check if MAC is non-zero:

ip link show dev ens785

Fixes: a3e839d539 ("iavf: Add usage of new virtchnl format to set default MAC")
Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-09 08:58:15 -07:00
Aleksandr Loktionov
fd5855e6b1 i40e: Fix call trace in setup_tx_descriptors
After PF reset and ethtool -t there was call trace in dmesg
sometimes leading to panic. When there was some time, around 5
seconds, between reset and test there were no errors.

Problem was that pf reset calls i40e_vsi_close in prep_for_reset
and ethtool -t calls i40e_vsi_close in diag_test. If there was not
enough time between those commands the second i40e_vsi_close starts
before previous i40e_vsi_close was done which leads to crash.

Add check to diag_test if pf is in reset and don't start offline
tests if it is true.
Add netif_info("testing failed") into unhappy path of i40e_diag_test()

Fixes: e17bc411ae ("i40e: Disable offline diagnostics if VFs are enabled")
Fixes: 510efb2682 ("i40e: Fix ethtool offline diagnostic with netqueues")
Signed-off-by: Michal Jaron <michalx.jaron@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-09 08:54:19 -07:00
Grzegorz Szczurek
0bb050670a i40e: Fix calculating the number of queue pairs
If ADQ is enabled for a VF, then actual number of queue pair
is a number of currently available traffic classes for this VF.

Without this change the configuration of the Rx/Tx queues
fails with error.

Fixes: d29e0d233e ("i40e: missing input validation on VF message handling by the PF")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-09 08:54:03 -07:00
Grzegorz Szczurek
c3238d36c3 i40e: Fix adding ADQ filter to TC0
Procedure of configure tc flower filters erroneously allows to create
filters on TC0 where unfiltered packets are also directed by default.
Issue was caused by insufficient checks of hw_tc parameter specifying
the hardware traffic class to pass matching packets to.

Fix checking hw_tc parameter which blocks creation of filters on TC0.

Fixes: 2f4b411a3d ("i40e: Enable cloud filters via tc-flower")
Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-09 08:53:43 -07:00
Justin Stitt
647df0d41b net: amd-xgbe: fix clang -Wformat warning
see warning:
| drivers/net/ethernet/amd/xgbe/xgbe-drv.c:2787:43: warning: format specifies
| type 'unsigned short' but the argument has type 'int' [-Wformat]
|        netdev_dbg(netdev, "Protocol: %#06hx\n", ntohs(eth->h_proto));
|                                      ~~~~~~     ^~~~~~~~~~~~~~~~~~~

Variadic functions (printf-like) undergo default argument promotion.
Documentation/core-api/printk-formats.rst specifically recommends
using the promoted-to-type's format flag.

Also, as per C11 6.3.1.1:
(https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1548.pdf)
`If an int can represent all values of the original type ..., the
value is converted to an int; otherwise, it is converted to an
unsigned int. These are called the integer promotions.`

Since the argument is a u16 it will get promoted to an int and thus it is
most accurate to use the %x format specifier here. It should be noted that the
`#06` formatting sugar does not alter the promotion rules.

Link: https://github.com/ClangBuiltLinux/linux/issues/378
Signed-off-by: Justin Stitt <jstitt007@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20220607191119.20686-1-jstitt007@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-08 21:12:03 -07:00
Alvin Šipraga
487994ff75 net: dsa: realtek: rtl8365mb: fix GMII caps for ports with internal PHY
Since commit a18e6521a7 ("net: phylink: handle NA interface mode in
phylink_fwnode_phy_connect()"), phylib defaults to GMII when no phy-mode
or phy-connection-type property is specified in a DSA port node of the
device tree. The same commit caused a regression in rtl8365mb whereby
phylink would fail to connect, because the driver did not advertise
support for GMII for ports with internal PHY.

It should be noted that the aforementioned regression is not because the
blamed commit was incorrect: on the contrary, the blamed commit is
correcting the previous behaviour whereby unspecified phy-mode would
cause the internal interface mode to be PHY_INTERFACE_MODE_NA. The
rtl8365mb driver only worked by accident before because it _did_
advertise support for PHY_INTERFACE_MODE_NA, despite NA being reserved
for internal use by phylink. With one mistake fixed, the other was
exposed.

Commit a5dba0f207 ("net: dsa: rtl8365mb: add GMII as user port mode")
then introduced implicit support for GMII mode on ports with internal
PHY to allow a PHY connection for device trees where the phy-mode is not
explicitly set to "internal". At this point everything was working OK
again.

Subsequently, commit 6ff6064605 ("net: dsa: realtek: convert to
phylink_generic_validate()") broke this behaviour again by discarding
the usage of rtl8365mb_phy_mode_supported() - where this GMII support
was indicated - while switching to the new .phylink_get_caps API.

With the new API, rtl8365mb_phy_mode_supported() is no longer needed.
Remove it altogether and add back the GMII capability - this time to
rtl8365mb_phylink_get_caps() - so that the above default behaviour works
for ports with internal PHY again.

Fixes: 6ff6064605 ("net: dsa: realtek: convert to phylink_generic_validate()")
Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20220607184624.417641-1-alvin@pqrs.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-08 21:03:51 -07:00
Jakub Kicinski
568a32f565 Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-06-07

This series contains updates to ixgbe driver only.

Olivier Matz resolves an issue so that broadcast packets can still be
received when VF removes promiscuous settings and removes setting of
VLAN promiscuous, in promiscuous mode, to prevent a loop when VFs are
bridged.

* '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  ixgbe: fix unexpected VLAN Rx in promisc mode on VF
  ixgbe: fix bcast packets Rx on VF after promisc removal
====================

Link: https://lore.kernel.org/r/20220607181538.748786-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-08 21:02:23 -07:00
Russell King (Oracle)
b4d78731b3 net: dsa: mv88e6xxx: correctly report serdes link failure
Phylink wants to know if the link has dropped since the last time state
was retrieved, and the BMSR gives us that. Read the BMSR and use it when
deciding the link state. Fill in the an_complete member as well for the
emulated PHY state.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-08 20:58:30 -07:00
Russell King (Oracle)
2b4bb9cd9b net: dsa: mv88e6xxx: fix BMSR error to be consistent with others
Other errors accessing the registers in mv88e6352_serdes_pcs_get_state()
print "PHY " before the register name, except for the BMSR. Make this
consistent with the other error messages.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-08 20:58:30 -07:00
Marek Behún
47e96930d6 net: dsa: mv88e6xxx: use BMSR_ANEGCOMPLETE bit for filling an_complete
Commit ede359d884 ("net: dsa: mv88e6xxx: Link in pcs_get_state() if AN
is bypassed") added the ability to link if AN was bypassed, and added
filling of state->an_complete field, but set it to true if AN was
enabled in BMCR, not when AN was reported complete in BMSR.

This was done because for some reason, when I wanted to use BMSR value
to infer an_complete, I was looking at BMSR_ANEGCAPABLE bit (which was
always 1), instead of BMSR_ANEGCOMPLETE bit.

Use BMSR_ANEGCOMPLETE for filling state->an_complete.

Fixes: ede359d884 ("net: dsa: mv88e6xxx: Link in pcs_get_state() if AN is bypassed")
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-08 20:58:30 -07:00
Miaoqian Lin
11ec18b1d8 net: altera: Fix refcount leak in altera_tse_mdio_create
Every iteration of for_each_child_of_node() decrements
the reference count of the previous node.
When break from a for_each_child_of_node() loop,
we need to explicitly call of_node_put() on the child node when
not need anymore.
Add missing of_node_put() to avoid refcount leak.

Fixes: bbd2190ce9 ("Altera TSE: Add main and header file for Altera Ethernet Driver")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20220607041144.7553-1-linmq006@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-08 20:52:41 -07:00
Chen Lin
2f2c0d2919 net: ethernet: mtk_eth_soc: fix misuse of mem alloc interface netdev[napi]_alloc_frag
When rx_flag == MTK_RX_FLAGS_HWLRO,
rx_data_len = MTK_MAX_LRO_RX_LENGTH(4096 * 3) > PAGE_SIZE.
netdev_alloc_frag is for alloction of page fragment only.
Reference to other drivers and Documentation/vm/page_frags.rst

Branch to use __get_free_pages when ring->frag_size > PAGE_SIZE.

Signed-off-by: Chen Lin <chen45464546@163.com>
Link: https://lore.kernel.org/r/1654692413-2598-1-git-send-email-chen45464546@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-08 20:37:27 -07:00
Mark Bloch
8fa5e7b20e net/mlx5: fs, fail conflicting actions
When combining two steering rules into one check
not only do they share the same actions but those
actions are also the same. This resolves an issue where
when creating two different rules with the same match
the actions are overwritten and one of the rules is deleted
a FW syndrome can be seen in dmesg.

mlx5_core 0000:03:00.0: mlx5_cmd_check:819:(pid 2105): DEALLOC_MODIFY_HEADER_CONTEXT(0x941) op_mod(0x0) failed, status bad resource state(0x9), syndrome (0x1ab444)

Fixes: 0d235c3fab ("net/mlx5: Add hash table to search FTEs in a flow-group")
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-06-08 11:39:44 -07:00
Feras Daoud
8bf94e6414 net/mlx5: Rearm the FW tracer after each tracer event
The current design does not arm the tracer if traces are available before
the tracer string database is fully loaded, leading to an unfunctional tracer.
This fix will rearm the tracer every time the FW triggers tracer event
regardless of the tracer strings database status.

Fixes: c71ad41ccb ("net/mlx5: FW tracer, events handling")
Signed-off-by: Feras Daoud <ferasda@nvidia.com>
Signed-off-by: Roy Novich <royno@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-06-08 11:39:44 -07:00
Mark Bloch
3008e6a004 net/mlx5: E-Switch, pair only capable devices
OFFLOADS paring using devcom is possible only on devices
that support LAG. Filter based on lag capabilities.

This fixes an issue where mlx5_get_next_phys_dev() was
called without holding the interface lock.

This issue was found when commit
bc4c2f2e01 ("net/mlx5: Lag, filter non compatible devices")
added an assert that verifies the interface lock is held.

WARNING: CPU: 9 PID: 1706 at drivers/net/ethernet/mellanox/mlx5/core/dev.c:642 mlx5_get_next_phys_dev+0xd2/0x100 [mlx5_core]
Modules linked in: mlx5_vdpa vringh vhost_iotlb vdpa mlx5_ib mlx5_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcrdma rdma_ucm ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_umad ib_ipoib ib_cm ib_uverbs ib_core overlay fuse [last unloaded: mlx5_core]
CPU: 9 PID: 1706 Comm: devlink Not tainted 5.18.0-rc7+ #11
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:mlx5_get_next_phys_dev+0xd2/0x100 [mlx5_core]
Code: 02 00 75 48 48 8b 85 80 04 00 00 5d c3 31 c0 5d c3 be ff ff ff ff 48 c7 c7 08 41 5b a0 e8 36 87 28 e3 85 c0 0f 85 6f ff ff ff <0f> 0b e9 68 ff ff ff 48 c7 c7 0c 91 cc 84 e8 cb 36 6f e1 e9 4d ff
RSP: 0018:ffff88811bf47458 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88811b398000 RCX: 0000000000000001
RDX: 0000000080000000 RSI: ffffffffa05b4108 RDI: ffff88812daaaa78
RBP: ffff88812d050380 R08: 0000000000000001 R09: ffff88811d6b3437
R10: 0000000000000001 R11: 00000000fddd3581 R12: ffff88815238c000
R13: ffff88812d050380 R14: ffff8881018aa7e0 R15: ffff88811d6b3428
FS:  00007fc82e18ae80(0000) GS:ffff88842e080000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f9630d1b421 CR3: 0000000149802004 CR4: 0000000000370ea0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 mlx5_esw_offloads_devcom_event+0x99/0x3b0 [mlx5_core]
 mlx5_devcom_send_event+0x167/0x1d0 [mlx5_core]
 esw_offloads_enable+0x1153/0x1500 [mlx5_core]
 ? mlx5_esw_offloads_controller_valid+0x170/0x170 [mlx5_core]
 ? wait_for_completion_io_timeout+0x20/0x20
 ? mlx5_rescan_drivers_locked+0x318/0x810 [mlx5_core]
 mlx5_eswitch_enable_locked+0x586/0xc50 [mlx5_core]
 ? mlx5_eswitch_disable_pf_vf_vports+0x1d0/0x1d0 [mlx5_core]
 ? mlx5_esw_try_lock+0x1b/0xb0 [mlx5_core]
 ? mlx5_eswitch_enable+0x270/0x270 [mlx5_core]
 ? __debugfs_create_file+0x260/0x3e0
 mlx5_devlink_eswitch_mode_set+0x27e/0x870 [mlx5_core]
 ? mutex_lock_io_nested+0x12c0/0x12c0
 ? esw_offloads_disable+0x250/0x250 [mlx5_core]
 ? devlink_nl_cmd_trap_get_dumpit+0x470/0x470
 ? rcu_read_lock_sched_held+0x3f/0x70
 devlink_nl_cmd_eswitch_set_doit+0x217/0x620

Fixes: dd3fddb827 ("net/mlx5: E-Switch, handle devcom events only for ports on the same device")
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-06-08 11:39:43 -07:00
Paul Blakey
15ef9efa85 net/mlx5e: CT: Fix cleanup of CT before cleanup of TC ct rules
CT cleanup assumes that all tc rules were deleted first, and so
is free to delete the CT shared resources (e.g the dr_action
fwd_action which is shared for all tuples). But currently for
uplink, this is happens in reverse, causing the below trace.

CT cleanup is called from:
mlx5e_cleanup_rep_tx()->mlx5e_cleanup_uplink_rep_tx()->
mlx5e_rep_tc_cleanup()->mlx5e_tc_esw_cleanup()->
mlx5_tc_ct_clean()

Only afterwards, tc cleanup is called from:
mlx5e_cleanup_rep_tx()->mlx5e_tc_ht_cleanup()
which would have deleted all the tc ct rules, and so delete
all the offloaded tuples.

Fix this reversing the order of init and on cleanup, which
will result in tc cleanup then ct cleanup.

[ 9443.593347] WARNING: CPU: 2 PID: 206774 at drivers/net/ethernet/mellanox/mlx5/core/steering/dr_action.c:1882 mlx5dr_action_destroy+0x188/0x1a0 [mlx5_core]
[ 9443.593349] Modules linked in: act_ct nf_flow_table rdma_ucm(O) rdma_cm(O) iw_cm(O) ib_ipoib(O) ib_cm(O) ib_umad(O) mlx5_core(O-) mlxfw(O) mlxdevm(O) auxiliary(O) ib_uverbs(O) psample ib_core(O) mlx_compat(O) ip_gre gre ip_tunnel act_vlan bonding geneve esp6_offload esp6 esp4_offload esp4 act_tunnel_key vxlan ip6_udp_tunnel udp_tunnel act_mirred act_skbedit act_gact cls_flower sch_ingress nfnetlink_cttimeout nfnetlink xfrm_user xfrm_algo 8021q garp stp ipmi_devintf mrp ipmi_msghandler llc openvswitch nsh nf_conncount nf_nat mst_pciconf(O) dm_multipath sbsa_gwdt uio_pdrv_genirq uio mlxbf_pmc mlxbf_pka mlx_trio mlx_bootctl(O) bluefield_edac sch_fq_codel ip_tables ipv6 crc_ccitt btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq raid1 raid0 crct10dif_ce i2c_mlxbf gpio_mlxbf2 mlxbf_gige aes_neon_bs aes_neon_blk [last unloaded: mlx5_ib]
[ 9443.593419] CPU: 2 PID: 206774 Comm: modprobe Tainted: G           O      5.4.0-1023.24.gc14613d-bluefield #1
[ 9443.593422] Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS BlueField:143ebaf Jan 11 2022
[ 9443.593424] pstate: 20000005 (nzCv daif -PAN -UAO)
[ 9443.593489] pc : mlx5dr_action_destroy+0x188/0x1a0 [mlx5_core]
[ 9443.593545] lr : mlx5_ct_fs_smfs_destroy+0x24/0x30 [mlx5_core]
[ 9443.593546] sp : ffff8000135dbab0
[ 9443.593548] x29: ffff8000135dbab0 x28: ffff0003a6ab8e80
[ 9443.593550] x27: 0000000000000000 x26: ffff0003e07d7000
[ 9443.593552] x25: ffff800009609de0 x24: ffff000397fb2120
[ 9443.593554] x23: ffff0003975c0000 x22: 0000000000000000
[ 9443.593556] x21: ffff0003975f08c0 x20: ffff800009609de0
[ 9443.593558] x19: ffff0003c8a13380 x18: 0000000000000014
[ 9443.593560] x17: 0000000067f5f125 x16: 000000006529c620
[ 9443.593561] x15: 000000000000000b x14: 0000000000000000
[ 9443.593563] x13: 0000000000000002 x12: 0000000000000001
[ 9443.593565] x11: ffff800011108868 x10: 0000000000000000
[ 9443.593567] x9 : 0000000000000000 x8 : ffff8000117fb270
[ 9443.593569] x7 : ffff0003ebc01288 x6 : 0000000000000000
[ 9443.593571] x5 : ffff800009591ab8 x4 : fffffe000f6d9a20
[ 9443.593572] x3 : 0000000080040001 x2 : fffffe000f6d9a20
[ 9443.593574] x1 : ffff8000095901d8 x0 : 0000000000000025
[ 9443.593577] Call trace:
[ 9443.593634]  mlx5dr_action_destroy+0x188/0x1a0 [mlx5_core]
[ 9443.593688]  mlx5_ct_fs_smfs_destroy+0x24/0x30 [mlx5_core]
[ 9443.593743]  mlx5_tc_ct_clean+0x34/0xa8 [mlx5_core]
[ 9443.593797]  mlx5e_tc_esw_cleanup+0x58/0x88 [mlx5_core]
[ 9443.593851]  mlx5e_rep_tc_cleanup+0x24/0x30 [mlx5_core]
[ 9443.593905]  mlx5e_cleanup_rep_tx+0x6c/0x78 [mlx5_core]
[ 9443.593959]  mlx5e_detach_netdev+0x74/0x98 [mlx5_core]
[ 9443.594013]  mlx5e_netdev_change_profile+0x70/0x180 [mlx5_core]
[ 9443.594067]  mlx5e_netdev_attach_nic_profile+0x34/0x40 [mlx5_core]
[ 9443.594122]  mlx5e_vport_rep_unload+0x15c/0x1a8 [mlx5_core]
[ 9443.594177]  mlx5_eswitch_unregister_vport_reps+0x228/0x298 [mlx5_core]
[ 9443.594231]  mlx5e_rep_remove+0x2c/0x38 [mlx5_core]
[ 9443.594236]  auxiliary_bus_remove+0x30/0x50 [auxiliary]
[ 9443.594246]  device_release_driver_internal+0x108/0x1d0
[ 9443.594248]  driver_detach+0x5c/0xe8
[ 9443.594250]  bus_remove_driver+0x64/0xd8
[ 9443.594253]  driver_unregister+0x38/0x60
[ 9443.594255]  auxiliary_driver_unregister+0x24/0x38 [auxiliary]
[ 9443.594311]  mlx5e_rep_cleanup+0x20/0x38 [mlx5_core]
[ 9443.594365]  mlx5e_cleanup+0x18/0x30 [mlx5_core]
[ 9443.594419]  cleanup+0xc/0x20cc [mlx5_core]
[ 9443.594424]  __arm64_sys_delete_module+0x154/0x2b0
[ 9443.594429]  el0_svc_common.constprop.0+0xf4/0x200
[ 9443.594432]  el0_svc_handler+0x38/0xa8
[ 9443.594435]  el0_svc+0x10/0x26c

Fixes: d1a3138f79 ("net/mlx5e: TC, Move flow hashtable to be per rep")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-06-08 11:39:43 -07:00
Saeed Mahameed
4d995c1b9d Revert "net/mlx5e: Allow relaxed ordering over VFs"
FW is not ready, fix was sent too soon.
This reverts commit f05ec8d9d0.

Fixes: f05ec8d9d0 ("net/mlx5e: Allow relaxed ordering over VFs")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-06-08 11:39:43 -07:00
Arnd Bergmann
a6958951eb au1000_eth: stop using virt_to_bus()
The conversion to the dma-mapping API in linux-2.6.11 was incomplete
and left a virt_to_bus() call around. There have been a number of
fixes for DMA mapping API abuse in this driver, but this one always
slipped through.

Change it to just use the existing dma_addr_t pointer, and make it
use the correct types throughout the driver to make it easier to
understand the virtual vs dma address spaces.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Manuel Lauss <manuel.lauss@gmail.com>
Link: https://lore.kernel.org/r/20220607090206.19830-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-08 11:32:02 -07:00
Masahiro Yamada
35b42dce61 net: mdio: unexport __init-annotated mdio_bus_init()
EXPORT_SYMBOL and __init is a bad combination because the .init.text
section is freed up after the initialization. Hence, modules cannot
use symbols annotated __init. The access to a freed symbol may end up
with kernel panic.

modpost used to detect it, but it has been broken for a decade.

Recently, I fixed modpost so it started to warn it again, then this
showed up in linux-next builds.

There are two ways to fix it:

  - Remove __init
  - Remove EXPORT_SYMBOL

I chose the latter for this case because the only in-tree call-site,
drivers/net/phy/phy_device.c is never compiled as modular.
(CONFIG_PHYLIB is boolean)

Fixes: 90eff9096c ("net: phy: Allow splitting MDIO bus/device support from PHYs")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-08 10:10:13 -07:00
Gal Pressman
f5826c8c9d net/mlx4_en: Fix wrong return value on ioctl EEPROM query failure
The ioctl EEPROM query wrongly returns success on read failures, fix
that by returning the appropriate error code.

Fixes: 7202da8b7f ("ethtool, net/mlx4_en: Cable info, get_module_info/eeprom ethtool support")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20220606115718.14233-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-07 20:49:58 -07:00
Miaoqian Lin
0737e018a0 net: dsa: lantiq_gswip: Fix refcount leak in gswip_gphy_fw_list
Every iteration of for_each_available_child_of_node() decrements
the reference count of the previous node.
when breaking early from a for_each_available_child_of_node() loop,
we need to explicitly call of_node_put() on the gphy_fw_np.
Add missing of_node_put() to avoid refcount leak.

Fixes: 14fceff477 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20220605072335.11257-1-linmq006@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-07 20:45:14 -07:00
Olivier Matz
7bb0fb7c63 ixgbe: fix unexpected VLAN Rx in promisc mode on VF
When the promiscuous mode is enabled on a VF, the IXGBE_VMOLR_VPE
bit (VLAN Promiscuous Enable) is set. This means that the VF will
receive packets whose VLAN is not the same than the VLAN of the VF.

For instance, in this situation:

┌────────┐    ┌────────┐    ┌────────┐
│        │    │        │    │        │
│        │    │        │    │        │
│     VF0├────┤VF1  VF2├────┤VF3     │
│        │    │        │    │        │
└────────┘    └────────┘    └────────┘
   VM1           VM2           VM3

vf 0:  vlan 1000
vf 1:  vlan 1000
vf 2:  vlan 1001
vf 3:  vlan 1001

If we tcpdump on VF3, we see all the packets, even those transmitted
on vlan 1000.

This behavior prevents to bridge VF1 and VF2 in VM2, because it will
create a loop: packets transmitted on VF1 will be received by VF2 and
vice-versa, and bridged again through the software bridge.

This patch remove the activation of VLAN Promiscuous when a VF enables
the promiscuous mode. However, the IXGBE_VMOLR_UPE bit (Unicast
Promiscuous) is kept, so that a VF receives all packets that has the
same VLAN, whatever the destination MAC address.

Fixes: 8443c1a4b1 ("ixgbe, ixgbevf: Add new mbox API xcast mode")
Cc: stable@vger.kernel.org
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-07 10:59:59 -07:00
Olivier Matz
803e9895ea ixgbe: fix bcast packets Rx on VF after promisc removal
After a VF requested to remove the promiscuous flag on an interface, the
broadcast packets are not received anymore. This breaks some protocols
like ARP.

In ixgbe_update_vf_xcast_mode(), we should keep the IXGBE_VMOLR_BAM
bit (Broadcast Accept) on promiscuous removal.

This flag is already set by default in ixgbe_set_vmolr() on VF reset.

Fixes: 8443c1a4b1 ("ixgbe, ixgbevf: Add new mbox API xcast mode")
Cc: stable@vger.kernel.org
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-07 10:58:43 -07:00
Christophe JAILLET
5e74a4b3ec stmmac: intel: Fix an error handling path in intel_eth_pci_probe()
When the managed API is used, there is no need to explicitly call
pci_free_irq_vectors().

This looks to be a left-over from the commit in the Fixes tag. Only the
.remove() function had been updated.

So remove this unused function call and update goto label accordingly.

Fixes: 8accc46775 ("stmmac: intel: use managed PCI function on probe and resume")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Link: https://lore.kernel.org/r/1ac9b6787b0db83b0095711882c55c77c8ea8da0.1654462241.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-06-07 11:56:29 +02:00