Commit Graph

1218422 Commits

Author SHA1 Message Date
Justin Stitt
a6c78d5f8d igbvf: replace deprecated strncpy with strscpy
`strncpy` is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.

We expect netdev->name to be NUL-terminated based on its usage with
`strlen` and format strings:
|       if (strlen(netdev->name) < (IFNAMSIZ - 5)) {
|               sprintf(adapter->tx_ring->name, "%s-tx-0", netdev->name);

Moreover, we do not need NUL-padding as netdev is already
zero-allocated:
|       netdev = alloc_etherdev(sizeof(struct igbvf_adapter));
...
alloc_etherdev() -> alloc_etherdev_mq() -> alloc_etherdev_mqs() ->
alloc_netdev_mqs() ...
|       p = kvzalloc(alloc_size, GFP_KERNEL_ACCOUNT | __GFP_RETRY_MAYFAIL);

Considering the above, a suitable replacement is `strscpy` [2] due to
the fact that it guarantees NUL-termination on the destination buffer
without unnecessarily NUL-padding.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Signed-off-by: Justin Stitt <justinstitt@google.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231017190411.2199743-9-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-18 18:10:18 -07:00
Justin Stitt
95e71e35e6 igb: replace deprecated strncpy with strscpy
`strncpy` is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.

We see that netdev->name is expected to be NUL-terminated based on its
usage with format strings:
|       sprintf(q_vector->name, "%s-TxRx-%u", netdev->name,
|               q_vector->rx.ring->queue_index);

Furthermore, NUL-padding is not required as netdev is already
zero-allocated:
|       netdev = alloc_etherdev_mq(sizeof(struct igb_adapter),
|                                  IGB_MAX_TX_QUEUES);
...
alloc_etherdev_mq() -> alloc_etherdev_mqs() -> alloc_netdev_mqs() ...
|       p = kvzalloc(alloc_size, GFP_KERNEL_ACCOUNT | __GFP_RETRY_MAYFAIL);

Considering the above, a suitable replacement is `strscpy` [2] due to
the fact that it guarantees NUL-termination on the destination buffer
without unnecessarily NUL-padding.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Signed-off-by: Justin Stitt <justinstitt@google.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231017190411.2199743-8-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-18 18:10:17 -07:00
Justin Stitt
be39d0a61a i40e: use scnprintf over strncpy+strncat
`strncpy` is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.

Moreover, `strncat` shouldn't really be used either as per
fortify-string.h:
 * Do not use this function. While FORTIFY_SOURCE tries to avoid
 * read and write overflows, this is only possible when the sizes
 * of @p and @q are known to the compiler. Prefer building the
 * string with formatting, via scnprintf() or similar.

Instead, use `scnprintf` with "%s%s" format string. This code is now
more readable and robust.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Signed-off-by: Justin Stitt <justinstitt@google.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231017190411.2199743-7-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-18 18:10:17 -07:00
Justin Stitt
48b238461b fm10k: replace deprecated strncpy with strscpy
`strncpy` is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.

A suitable replacement is `strscpy` [2] due to the fact that it
guarantees NUL-termination on the destination buffer without
unnecessarily NUL-padding.

Other implementations of .*get_drvinfo also use strscpy so this patch
brings fm10k_get_drvinfo in line as well:

igb/igb_ethtool.c +851
static void igb_get_drvinfo(struct net_device *netdev,

igbvf/ethtool.c
167:static void igbvf_get_drvinfo(struct net_device *netdev,

i40e/i40e_ethtool.c
1999:static void i40e_get_drvinfo(struct net_device *netdev,

e1000/e1000_ethtool.c
529:static void e1000_get_drvinfo(struct net_device *netdev,

ixgbevf/ethtool.c
211:static void ixgbevf_get_drvinfo(struct net_device *netdev,

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Signed-off-by: Justin Stitt <justinstitt@google.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231017190411.2199743-6-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-18 18:10:17 -07:00
Justin Stitt
341359e034 e1000: replace deprecated strncpy with strscpy
`strncpy` is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.

We can see that netdev->name is expected to be NUL-terminated based on
it's usage with format strings:
|       pr_info("%s NIC Link is Down\n",
|               netdev->name);

A suitable replacement is `strscpy` [2] due to the fact that it
guarantees NUL-termination on the destination buffer without
unnecessarily NUL-padding.

This is in line with other uses of strscpy on netdev->name:
$ rg "strscpy\(netdev\->name.*pci.*"

drivers/net/ethernet/intel/e1000e/netdev.c
7455:   strscpy(netdev->name, pci_name(pdev), sizeof(netdev->name));

drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
10839:  strscpy(netdev->name, pci_name(pdev), sizeof(netdev->name));

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Signed-off-by: Justin Stitt <justinstitt@google.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231017190411.2199743-5-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-18 18:10:17 -07:00
Justin Stitt
7677f635bf e100: replace deprecated strncpy with strscpy
`strncpy` is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.

The "...-1" pattern makes it evident that netdev->name is expected to be
NUL-terminated.

Meanwhile, it seems NUL-padding is not required due to alloc_etherdev
zero-allocating the buffer.

Considering the above, a suitable replacement is `strscpy` [2] due to
the fact that it guarantees NUL-termination on the destination buffer
without unnecessarily NUL-padding.

This is in line with other uses of strscpy on netdev->name:
$ rg "strscpy\(netdev\->name.*pci.*"

drivers/net/ethernet/intel/e1000e/netdev.c
7455:   strscpy(netdev->name, pci_name(pdev), sizeof(netdev->name));

drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
10839:  strscpy(netdev->name, pci_name(pdev), sizeof(netdev->name));

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Signed-off-by: Justin Stitt <justinstitt@google.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231017190411.2199743-4-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-18 18:10:17 -07:00
Jesse Brandeburg
d97af2440a intel: fix format warnings
Get ahead of the game and fix all the -Wformat=2 noted warnings in the
intel drivers directory.

There are one set of i40e and iavf warnings I couldn't figure out how to
fix because the driver is already using vsnprintf without an explicit
"const char *" format string.

Tested with both gcc-12 and clang-15. I found gcc-12 runs clean after
this series but clang-15 is a little worried about the vsnprintf lines.

summary of warnings:

drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c:148:34: warning: format string is not a string literal [-Wformat-nonliteral]
drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c:1416:24: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c:1416:24: note: treat the string as an argument to avoid this
drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c:1421:6: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c:1421:6: note: treat the string as an argument to avoid this
drivers/net/ethernet/intel/igc/igc_ethtool.c:776:24: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
drivers/net/ethernet/intel/igc/igc_ethtool.c:776:24: note: treat the string as an argument to avoid this
drivers/net/ethernet/intel/igc/igc_ethtool.c:779:6: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
drivers/net/ethernet/intel/igc/igc_ethtool.c:779:6: note: treat the string as an argument to avoid this
drivers/net/ethernet/intel/iavf/iavf_ethtool.c:199:34: warning: format string is not a string literal [-Wformat-nonliteral]
drivers/net/ethernet/intel/igb/igb_ethtool.c:2360:6: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
drivers/net/ethernet/intel/igb/igb_ethtool.c:2360:6: note: treat the string as an argument to avoid this
drivers/net/ethernet/intel/igb/igb_ethtool.c:2363:6: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
drivers/net/ethernet/intel/igb/igb_ethtool.c:2363:6: note: treat the string as an argument to avoid this
drivers/net/ethernet/intel/i40e/i40e_ethtool.c:208:34: warning: format string is not a string literal [-Wformat-nonliteral]
drivers/net/ethernet/intel/i40e/i40e_ethtool.c:2515:23: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
drivers/net/ethernet/intel/i40e/i40e_ethtool.c:2515:23: note: treat the string as an argument to avoid this
drivers/net/ethernet/intel/i40e/i40e_ethtool.c:2519:23: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
drivers/net/ethernet/intel/i40e/i40e_ethtool.c:2519:23: note: treat the string as an argument to avoid this
drivers/net/ethernet/intel/ice/ice_ethtool.c:1064:6: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
drivers/net/ethernet/intel/ice/ice_ethtool.c:1064:6: note: treat the string as an argument to avoid this
drivers/net/ethernet/intel/ice/ice_ethtool.c:1084:6: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
drivers/net/ethernet/intel/ice/ice_ethtool.c:1084:6: note: treat the string as an argument to avoid this
drivers/net/ethernet/intel/ice/ice_ethtool.c:1100:24: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
drivers/net/ethernet/intel/ice/ice_ethtool.c:1100:24: note: treat the string as an argument to avoid this

Suggested-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231017190411.2199743-3-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-18 18:10:17 -07:00
Jesse Brandeburg
1978d3ead8 intel: fix string truncation warnings
Fix -Wformat-truncated warnings to complete the intel directories' W=1
clean efforts. The W=1 recently got enhanced with a few new flags and
this brought up some new warnings.

Switch to using kasprintf() when possible so we always allocate the
right length strings.

summary of warnings:
drivers/net/ethernet/intel/iavf/iavf_virtchnl.c:1425:60: warning: ‘%s’ directive output may be truncated writing 4 bytes into a region of size between 1 and 11 [-Wformat-truncation=]
drivers/net/ethernet/intel/iavf/iavf_virtchnl.c:1425:17: note: ‘snprintf’ output between 7 and 17 bytes into a destination of size 13
drivers/net/ethernet/intel/ice/ice_ptp.c:43:27: warning: ‘%s’ directive output may be truncated writing up to 479 bytes into a region of size 64 [-Wformat-truncation=]
drivers/net/ethernet/intel/ice/ice_ptp.c:42:17: note: ‘snprintf’ output between 1 and 480 bytes into a destination of size 64
drivers/net/ethernet/intel/igb/igb_main.c:3092:53: warning: ‘%d’ directive output may be truncated writing between 1 and 5 bytes into a region of size between 1 and 13 [-Wformat-truncation=]
drivers/net/ethernet/intel/igb/igb_main.c:3092:34: note: directive argument in the range [0, 65535]
drivers/net/ethernet/intel/igb/igb_main.c:3092:34: note: directive argument in the range [0, 65535]
drivers/net/ethernet/intel/igb/igb_main.c:3090:25: note: ‘snprintf’ output between 23 and 43 bytes into a destination of size 32

Suggested-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231017190411.2199743-2-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-18 18:10:16 -07:00
Paolo Abeni
68b54aeff8 tcp_bpf: properly release resources on error paths
In the blamed commit below, I completely forgot to release the acquired
resources before erroring out in the TCP BPF code, as reported by Dan.

Address the issues by replacing the bogus return with a jump to the
relevant cleanup code.

Fixes: 419ce133ab ("tcp: allow again tcp_disconnect() when threads are waiting")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jakub Sitnicki <jakub@cloudflare.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/8f99194c698bcef12666f0a9a999c58f8b1cb52c.1697557782.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-18 18:09:31 -07:00
Pedro Tammela
a13b67c9a0 net/sched: sch_hfsc: upgrade 'rt' to 'sc' when it becomes a inner curve
Christian Theune says:
   I upgraded from 6.1.38 to 6.1.55 this morning and it broke my traffic shaping script,
   leaving me with a non-functional uplink on a remote router.

A 'rt' curve cannot be used as a inner curve (parent class), but we were
allowing such configurations since the qdisc was introduced. Such
configurations would trigger a UAF as Budimir explains:
   The parent will have vttree_insert() called on it in init_vf(),
   but will not have vttree_remove() called on it in update_vf()
   because it does not have the HFSC_FSC flag set.

The qdisc always assumes that inner classes have the HFSC_FSC flag set.
This is by design as it doesn't make sense 'qdisc wise' for an 'rt'
curve to be an inner curve.

Budimir's original patch disallows users to add classes with a 'rt'
parent, but this is too strict as it breaks users that have been using
'rt' as a inner class. Another approach, taken by this patch, is to
upgrade the inner 'rt' into a 'sc', warning the user in the process.
It avoids the UAF reported by Budimir while also being more permissive
to bad scripts/users/code using 'rt' as a inner class.

Users checking the `tc class ls [...]` or `tc class get [...]` dumps would
observe the curve change and are potentially breaking with this change.

v1->v2: https://lore.kernel.org/all/20231013151057.2611860-1-pctammela@mojatatu.com/
- Correct 'Fixes' tag and merge with revert (Jakub)

Cc: Christian Theune <ct@flyingcircus.io>
Cc: Budimir Markovic <markovicbudimir@gmail.com>
Fixes: b3d26c5702 ("net/sched: sch_hfsc: Ensure inner classes have fsc curve")
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20231017143602.3191556-1-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-18 18:08:28 -07:00
Vladimir Oltean
1f9f2143f2 net: mdio-mux: fix C45 access returning -EIO after API change
The mii_bus API conversion to read_c45() and write_c45() did not cover
the mdio-mux driver before read() and write() were made C22-only.

This broke arch/arm64/boot/dts/freescale/fsl-ls1028a-qds-13bb.dtso.
The -EOPNOTSUPP from mdiobus_c45_read() is transformed by
get_phy_c45_devs_in_pkg() into -EIO, is further propagated to
of_mdiobus_register() and this makes the mdio-mux driver fail to probe
the entire child buses, not just the PHYs that cause access errors.

Fix the regression by introducing special c45 read and write accessors
to mdio-mux which forward the operation to the parent MDIO bus.

Fixes: db1a63aed8 ("net: phy: Remove fallback to old C45 method")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20231017143144.3212657-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-18 18:08:23 -07:00
Jakub Kicinski
50ee052b39 Merge branch 'selftests-tc-testing-fixes-for-kselftest'
Pedro Tammela says:

====================
selftests: tc-testing: fixes for kselftest

While playing around with TuxSuite, we noticed a couple of things were
broken for strict CI/automated builds. We had a script that didn't make into
the kselftest tarball and a couple of missing Kconfig knobs in our
minimal config.
====================

Link: https://lore.kernel.org/r/20231017152309.3196320-1-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-18 18:07:53 -07:00
Pedro Tammela
35027c7909 selftests: tc-testing: move auxiliary scripts to a dedicated folder
Some taprio tests need auxiliary scripts to wait for workqueue events to
process. Move them to a dedicated folder in order to package them for
the kselftests tarball.

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Link: https://lore.kernel.org/r/20231017152309.3196320-3-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-18 18:07:51 -07:00
Pedro Tammela
f157b73d51 selftests: tc-testing: add missing Kconfig options to 'config'
Make sure CI builds using just tc-testing/config can run all tdc tests.
Some tests were broken because of missing knobs.

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Link: https://lore.kernel.org/r/20231017152309.3196320-2-pctammela@mojatatu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-18 18:07:51 -07:00
Eric Dumazet
f921a4a5bf tcp: tsq: relax tcp_small_queue_check() when rtx queue contains a single skb
In commit 75eefc6c59 ("tcp: tsq: add a shortcut in tcp_small_queue_check()")
we allowed to send an skb regardless of TSQ limits being hit if rtx queue
was empty or had a single skb, in order to better fill the pipe
when/if TX completions were slow.

Then later, commit 75c119afe1 ("tcp: implement rb-tree based
retransmit queue") accidentally removed the special case for
one skb in rtx queue.

Stefan Wahren reported a regression in single TCP flow throughput
using a 100Mbit fec link, starting from commit 65466904b0 ("tcp: adjust
TSO packet sizes based on min_rtt"). This last commit only made the
regression more visible, because it locked the TCP flow on a particular
behavior where TSQ prevented two skbs being pushed downstream,
adding silences on the wire between each TSO packet.

Many thanks to Stefan for his invaluable help !

Fixes: 75c119afe1 ("tcp: implement rb-tree based retransmit queue")
Link: https://lore.kernel.org/netdev/7f31ddc8-9971-495e-a1f6-819df542e0af@gmx.net/
Reported-by: Stefan Wahren <wahrenst@gmx.net>
Tested-by: Stefan Wahren <wahrenst@gmx.net>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Link: https://lore.kernel.org/r/20231017124526.4060202-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-18 18:06:36 -07:00
Shinas Rasheed
a0ca6b9dfe octeon_ep: update BQL sent bytes before ringing doorbell
Sometimes Tx is completed immediately after doorbell is updated, which
causes Tx completion routing to update completion bytes before the
same packet bytes are updated in sent bytes in transmit function, hence
hitting BUG_ON() in dql_completed(). To avoid this, update BQL
sent bytes before ringing doorbell.

Fixes: 37d79d0596 ("octeon_ep: add Tx/Rx processing and interrupt support")
Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Link: https://lore.kernel.org/r/20231017105030.2310966-1-srasheed@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-18 18:05:08 -07:00
Jiawen Wu
48e44287c6 net: wangxun: remove redundant kernel log
Since PBA info can be read from lspci, delete txgbe_read_pba_string()
and the prints. In addition, delete the redundant MAC address printing.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20231017100635.154967-1-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-18 17:57:36 -07:00
Jiri Slaby (SUSE)
31c65705a8 perf/benchmark: fix seccomp_unotify benchmark for 32-bit
Commit 7d5cb68af6 (perf/benchmark: add a new benchmark for
seccom_unotify) added a reference to __NR_seccomp into perf. This is
fine as it added also a definition of __NR_seccomp for 64-bit. But it
failed to do so for 32-bit as instead of ifndef, ifdef was used.

Fix this typo (so fix the build of perf on 32-bit).

Fixes: 7d5cb68af6 (perf/benchmark: add a new benchmark for seccom_unotify)
Cc: Andrei Vagin <avagin@google.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: "Jiri Slaby (SUSE)" <jirislaby@kernel.org>
Link: https://lore.kernel.org/r/20231017083019.31733-1-jirislaby@kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2023-10-18 17:47:18 -07:00
Jakub Kicinski
8bb0475623 Merge branch 'net-fec-fix-device_get_match_data-usage'
Alexander Stein says:

====================
net: fec: Fix device_get_match_data usage

this is v2 adressing the regression introduced by commit b0377116de
("net: ethernet: Use device_get_match_data()").

You could also remove the (!dev_info) case for Coldfire as this platform
has no quirks. But IMHO this should be kept as long as Coldfire platform
data is supported.
====================

Link: https://lore.kernel.org/r/20231017063419.925266-1-alexander.stein@ew.tq-group.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-18 17:47:13 -07:00
Alexander Stein
50254bfe14 net: fec: Remove non-Coldfire platform IDs
All i.MX platforms (non-Coldfire) use DT nowadays, so their platform ID
entries can be removed.

Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20231017063419.925266-3-alexander.stein@ew.tq-group.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-18 17:47:11 -07:00
Alexander Stein
e6809dba5e net: fec: Fix device_get_match_data usage
device_get_match_data() expects that of_device_id->data points to actual
fec_devinfo data, not a platform_device_id entry.
Fix this by adjusting OF device data pointers to their corresponding
structs.
enum imx_fec_type is now unused and can be removed.

Fixes: b0377116de ("net: ethernet: Use device_get_match_data()")
Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20231017063419.925266-2-alexander.stein@ew.tq-group.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-18 17:47:10 -07:00
Muhammad Muzammil
2c6370a13f drivers: net: wwan: iosm: Fixed multiple typos in multiple files
iosm_ipc_chnl_cfg.h: Fixed typo
iosm_ipc_imem_ops.h: Fixed typo
iosm_ipc_mux.h: Fixed typo
iosm_ipc_pm.h: Fixed typo
iosm_ipc_port.h: Fixed typo
iosm_ipc_trace.h: Fixed typo

Signed-off-by: Muhammad Muzammil <m.muzzammilashraf@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231014121407.10012-1-m.muzzammilashraf@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-18 17:45:44 -07:00
Linus Torvalds
dd72f9c7e5 spi: Fix for v6.6
A fix for the npcm-fiu driver in cases where there are no dummy bytes
 during reads.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmUf640ACgkQJNaLcl1U
 h9C5cQf+K+ml3dwDRfmQX6U9sEwfCbwqf1kf3paZmGE1Vf515+mbZY6Lw7FBOpCa
 UkoVnsN++qfT77dUSMFYZxKgYhAGdfHLgtgxJefyePEUxxVbzivzmMHDyQPHfMJG
 jz/PnG3aFc2OSxnOb2Kr4J2Fla+9orSebMwvg6CbJnkTtcsEpAoTn3cc0C+C7z9t
 C/vLl+F0grf5vgkbLCmazZZ18YfKNSb635nzh9+PXxfnBk4/D4oO/SO8foX4Dh7A
 kaX1YRj9X98dI4+KsK4HcA004Dndl2jCamDGFzb7uKJQplFPosuoH2vBapZ5NUMh
 X0/vpq1Sbur1EGrSwyD/jkovyuChRg==
 =TlYc
 -----END PGP SIGNATURE-----

Merge tag 'spi-fix-v6-6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi

Pull spi fix from Mark Brown:
 "A fix for the npcm-fiu driver in cases where there are no dummy bytes
  during reads"

* tag 'spi-fix-v6-6-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
  spi: npcm-fiu: Fix UMA reads when dummy.nbytes == 0
2023-10-18 09:37:36 -07:00
Linus Torvalds
e1e80380f1 regmap: Fix for v6.6
A straightforward fix from Johan for a long standing bug in cases where
 we both have regmaps without devices and something is using
 dev_get_regmap().
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmUv6g4ACgkQJNaLcl1U
 h9BAFAf/Xcx9jWPmYJ4Peon5j25nkzuMn0A4aka2uqfr5juTFZYXi3milp5hTpfD
 tTC0cgBaADKCTXPU2aDrQeiHd1Rr18sOxAUdQWr7POPsuT+LJFpTjl5agNgPRa9a
 LqdPA+KfrYV1snIC+sQDhEuHX1UtxDdXCS4Q0KpxIiEZGt3/Z6kDysWEQCPR8m1e
 uNauKz2GaDB1ejXnEnZFMjkleKbBPc8qMF8A/c2lTayIjDn1xYVbj9IzHKoIP+au
 S9eMBLrnPNPqhOJxe0pQKIfSz5nbMls/NCZRD/IQN8TTQ/Z6LActfUoKMS09hR6p
 t8aEB9YCm8Wb+5ad8A0e3molsylvGQ==
 =RWQo
 -----END PGP SIGNATURE-----

Merge tag 'regmap-fix-v6.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap

Pull regmap fix from Mark Brown:
 "A straightforward fix from Johan for a long standing bug in cases
  where we both have regmaps without devices and something is using
  dev_get_regmap()"

* tag 'regmap-fix-v6.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: fix NULL deref on lookup
2023-10-18 09:30:03 -07:00
Pablo Neira Ayuso
f86fb94011 netfilter: nf_tables: revert do not remove elements if set backend implements .abort
nf_tables_abort_release() path calls nft_set_elem_destroy() for
NFT_MSG_NEWSETELEM which releases the element, however, a reference to
the element still remains in the working copy.

Fixes: ebd032fa88 ("netfilter: nf_tables: do not remove elements if set backend implements .abort")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-18 13:47:32 +02:00
Pablo Neira Ayuso
d111692a59 netfilter: nft_set_rbtree: .deactivate fails if element has expired
This allows to remove an expired element which is not possible in other
existing set backends, this is more noticeable if gc-interval is high so
expired elements remain in the tree. On-demand gc also does not help in
this case, because this is delete element path. Return NULL if element
has expired.

Fixes: 8d8540c4f5 ("netfilter: nft_set_rbtree: add timeout support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-18 13:47:32 +02:00
Phil Sutter
2e2d9c7d4d selftests: netfilter: Run nft_audit.sh in its own netns
Don't mess with the host's firewall ruleset. Since audit logging is not
per-netns, add an initial delay of a second so other selftests' netns
cleanups have a chance to finish.

Fixes: e8dbde59ca ("selftests: netfilter: Test nf_tables audit logging")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-18 13:47:08 +02:00
Phil Sutter
1baf0152f7 netfilter: nf_tables: audit log object reset once per table
When resetting multiple objects at once (via dump request), emit a log
message per table (or filled skb) and resurrect the 'entries' parameter
to contain the number of objects being logged for.

To test the skb exhaustion path, perform some bulk counter and quota
adds in the kselftest.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
Acked-by: Paul Moore <paul@paul-moore.com> (Audit)
Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-18 13:43:40 +02:00
Geert Uytterhoeven
2915240edd neighbor: tracing: Move pin6 inside CONFIG_IPV6=y section
When CONFIG_IPV6=n, and building with W=1:

    In file included from include/trace/define_trace.h:102,
		     from include/trace/events/neigh.h:255,
		     from net/core/net-traces.c:51:
    include/trace/events/neigh.h: In function ‘trace_event_raw_event_neigh_create’:
    include/trace/events/neigh.h:42:34: error: variable ‘pin6’ set but not used [-Werror=unused-but-set-variable]
       42 |                 struct in6_addr *pin6;
	  |                                  ^~~~
    include/trace/trace_events.h:402:11: note: in definition of macro ‘DECLARE_EVENT_CLASS’
      402 |         { assign; }                                                     \
	  |           ^~~~~~
    include/trace/trace_events.h:44:30: note: in expansion of macro ‘PARAMS’
       44 |                              PARAMS(assign),                   \
	  |                              ^~~~~~
    include/trace/events/neigh.h:23:1: note: in expansion of macro ‘TRACE_EVENT’
       23 | TRACE_EVENT(neigh_create,
	  | ^~~~~~~~~~~
    include/trace/events/neigh.h:41:9: note: in expansion of macro ‘TP_fast_assign’
       41 |         TP_fast_assign(
	  |         ^~~~~~~~~~~~~~
    In file included from include/trace/define_trace.h:103,
		     from include/trace/events/neigh.h:255,
		     from net/core/net-traces.c:51:
    include/trace/events/neigh.h: In function ‘perf_trace_neigh_create’:
    include/trace/events/neigh.h:42:34: error: variable ‘pin6’ set but not used [-Werror=unused-but-set-variable]
       42 |                 struct in6_addr *pin6;
	  |                                  ^~~~
    include/trace/perf.h:51:11: note: in definition of macro ‘DECLARE_EVENT_CLASS’
       51 |         { assign; }                                                     \
	  |           ^~~~~~
    include/trace/trace_events.h:44:30: note: in expansion of macro ‘PARAMS’
       44 |                              PARAMS(assign),                   \
	  |                              ^~~~~~
    include/trace/events/neigh.h:23:1: note: in expansion of macro ‘TRACE_EVENT’
       23 | TRACE_EVENT(neigh_create,
	  | ^~~~~~~~~~~
    include/trace/events/neigh.h:41:9: note: in expansion of macro ‘TP_fast_assign’
       41 |         TP_fast_assign(
	  |         ^~~~~~~~~~~~~~

Indeed, the variable pin6 is declared and initialized unconditionally,
while it is only used and needlessly re-initialized when support for
IPv6 is enabled.

Fix this by dropping the unused variable initialization, and moving the
variable declaration inside the existing section protected by a check
for CONFIG_IPV6.

Fixes: fc651001d2 ("neighbor: Add tracepoint to __neigh_create")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Simon Horman <horms@kernel.org> # build-tested
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-18 11:16:43 +01:00
Phil Sutter
c4eee56e14 net: skb_find_text: Ignore patterns extending past 'to'
Assume that caller's 'to' offset really represents an upper boundary for
the pattern search, so patterns extending past this offset are to be
rejected.

The old behaviour also was kind of inconsistent when it comes to
fragmentation (or otherwise non-linear skbs): If the pattern started in
between 'to' and 'from' offsets but extended to the next fragment, it
was not found if 'to' offset was still within the current fragment.

Test the new behaviour in a kselftest using iptables' string match.

Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Fixes: f72b948dcb ("[NET]: skb_find_text ignores to argument")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Reviewed-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-18 11:09:55 +01:00
Bagas Sanjaya
1db34aa58d Revert "net: wwan: iosm: enable runtime pm support for 7560"
Runtime power management support breaks Intel LTE modem where dmesg dump
showes timeout errors:

```
[   72.027442] iosm 0000:01:00.0: msg timeout
[   72.531638] iosm 0000:01:00.0: msg timeout
[   73.035414] iosm 0000:01:00.0: msg timeout
[   73.540359] iosm 0000:01:00.0: msg timeout
```

Furthermore, when shutting down with `poweroff` and modem attached, the
system rebooted instead of powering down as expected. The modem works
again only after power cycling.

Revert runtime power management support for IOSM driver as introduced by
commit e4f5073d53 ("net: wwan: iosm: enable runtime pm support for
7560").

Fixes: e4f5073d53 ("net: wwan: iosm: enable runtime pm support for 7560")
Reported-by: Martin <mwolf@adiumentum.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217996
Link: https://lore.kernel.org/r/267abf02-4b60-4a2e-92cd-709e3da6f7d3@gmail.com/
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-18 11:08:52 +01:00
David S. Miller
37fb1c81d2 netfilter next pull request 2023-10-18
-----BEGIN PGP SIGNATURE-----
 
 iQJBBAABCAArFiEEgKkgxbID4Gn1hq6fcJGo2a1f9gAFAmUvltYNHGZ3QHN0cmxl
 bi5kZQAKCRBwkajZrV/2AGbwEADT9pI6dIyEJt0xMte+Ld41E+lfBgLXxv/cAFQo
 5NjP5SF75iMh0PoRrw1f9DJJNGo4C2OhcWcltosYLdE2xoDNjYw/J2oi/WdYHLe7
 tEFZWB0KfOBl17Gh0l9Gue8MyVKKamNDuMvfd/MQB9af7VQ6lPi/hlWyjAk4hTGu
 6Ly1asHM1kXFt6Z0mKMRQomXKZ46mEVWX8a11wtfLnd3vmCqrHlKwGKlNVi9U5ir
 aQnitw60OmCvbYLBdjDugPibSTeoha674vxVtjWuwq+1uQrkx0js/ASKPGrT2a33
 yo2/nuprdZc135YCXAlccb4TwgKTMT1R1pf9JY5fhSIPKfY49yR0JKUckXpieefy
 t/VAJWVsVvXg27XF1LCommMOhbM0c+WE/BpJ8ASTSt+79nXhLcISDirkgUcGvUKZ
 LNp1sa3QqwR7auTTzd9aKIWvZIN82MLOJJpYkhNOqYheCAm9Y2O1kClQB7t7Zi3r
 K7L9clpexYWcJQmDox2SwU7nVWeqJP6e2Y2f4W+B9aKuiyHNF/B/pEE5LO6hqZ8C
 WJ+TZZVaxDTxtVtC2W3tFFhtd/ZzuIPryXNL127XN83L0iQZ+STDe6CzTXyysD3h
 14iCr4pY4u5y5w0jqilQYdMCyCK2LUKmUoZUKGpwSlDQtEZU6BZsFnE12C6+v9N+
 BXr7aA==
 =/3kX
 -----END PGP SIGNATURE-----

Merge tag 'nf-next-23-10-18' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next

Florian Westphal says:

====================
netfilter next pull request 2023-10-18

This series contains initial netfilter skb drop_reason support, from
myself.

First few patches fix up a few spots to make sure we won't trip
when followup patches embed error numbers in the upper bits
(we already do this in some places).

Then, nftables and bridge netfilter get converted to call kfree_skb_reason
directly to let tooling pinpoint exact location of packet drops,
rather than the existing NF_DROP catchall in nf_hook_slow().

I would like to eventually convert all netfilter modules, but as some
callers cannot deal with NF_STOLEN (notably act_ct), more preparation
work is needed for this.

Last patch gets rid of an ugly 'de-const' cast in nftables.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-18 11:05:21 +01:00
Gavrilov Ilia
1d30162f35 net: pktgen: Fix interface flags printing
Device flags are displayed incorrectly:
1) The comparison (i == F_FLOW_SEQ) is always false, because F_FLOW_SEQ
is equal to (1 << FLOW_SEQ_SHIFT) == 2048, and the maximum value
of the 'i' variable is (NR_PKT_FLAG - 1) == 17. It should be compared
with FLOW_SEQ_SHIFT.

2) Similarly to the F_IPSEC flag.

3) Also add spaces to the print end of the string literal "spi:%u"
to prevent the output from merging with the flag that follows.

Found by InfoTeCS on behalf of Linux Verification Center
(linuxtesting.org) with SVACE.

Fixes: 99c6d3d20d ("pktgen: Remove brute-force printing of flags")
Signed-off-by: Gavrilov Ilia <Ilia.Gavrilov@infotecs.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-18 10:12:30 +01:00
David S. Miller
810799a066 Merge branch 'ethtool-forced-speed'
Paul Greenwalt says:

====================
ethtool: Add link mode maps for forced speeds

The following patch set was initially a part of [1]. As the purpose of the
original series was to add the support of the new hardware to the intel ice
driver, the refactoring of advertised link modes mapping was extracted to a
new set.

The patch set adds a common mechanism for mapping Ethtool forced speeds
with Ethtool supported link modes, which can be used in drivers code.

[1] https://lore.kernel.org/netdev/20230823180633.2450617-1-pawel.chmielewski@intel.com

Changelog:
v4->v5:
Separated ethtool and qede changes into two patches, fixed indentation,
and moved ethtool_forced_speed_maps_init() from ioctl.c to ethtool.h

v3->v4:
Moved the macro for setting fields into the common header file

v2->v3:
Fixed whitespaces, added missing line at end of file

v1->v2:
Fixed formatting, typo, moved declaration of iterator to loop line.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-18 09:36:36 +01:00
Pawel Chmielewski
982b0192db ice: Refactor finding advertised link speed
Refactor ice_get_link_ksettings to using forced speed to link modes
mapping.

Suggested-by : Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Pawel Chmielewski <pawel.chmielewski@intel.com>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-18 09:36:35 +01:00
Paul Greenwalt
a5b65cd2a3 qede: Refactor qede_forced_speed_maps_init()
Refactor qede_forced_speed_maps_init() to use commen implementation
ethtool_forced_speed_maps_init().

The qede driver was compile tested only.

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Pawel Chmielewski <pawel.chmielewski@intel.com>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-18 09:36:35 +01:00
Paul Greenwalt
26c5334d34 ethtool: Add forced speed to supported link modes maps
The need to map Ethtool forced speeds to Ethtool supported link modes is
common among drivers. To support this, add a common structure for forced
speed maps and a function to init them.  This is solution was originally
introduced in commit 1d4e4ecccb ("qede: populate supported link modes
maps on module init") for qede driver.

ethtool_forced_speed_maps_init() should be called during driver init
with an array of struct ethtool_forced_speed_map to populate the mapping.

Definitions for maps themselves are left in the driver code, as the sets
of supported link modes may vary between the devices.

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Pawel Chmielewski <pawel.chmielewski@intel.com>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-18 09:36:35 +01:00
Florian Westphal
2560016721 netfilter: nf_tables: de-constify set commit ops function argument
The set backend using this already has to work around this via ugly
cast, don't spread this pattern.

Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-18 10:26:43 +02:00
Florian Westphal
cf8b7c1a5b netfilter: bridge: convert br_netfilter to NF_DROP_REASON
errno is 0 because these hooks are called from prerouting and forward.
There is no socket that the errno would ever be propagated to.

Other netfilter modules (e.g. nf_nat, conntrack, ...) can be converted
in a similar way.

Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-18 10:26:43 +02:00
Florian Westphal
e0d4593140 netfilter: make nftables drops visible in net dropmonitor
net_dropmonitor blames core.c:nf_hook_slow.
Add NF_DROP_REASON() helper and use it in nft_do_chain().

The helper releases the skb, so exact drop location becomes
available. Calling code will observe the NF_STOLEN verdict
instead.

Adjust nf_hook_slow so we can embed an erro value wih
NF_STOLEN verdicts, just like we do for NF_DROP.

After this, drop in nftables can be pinpointed to a drop due
to a rule or the chain policy.

Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-18 10:26:43 +02:00
Florian Westphal
35c038b0a4 netfilter: nf_nat: mask out non-verdict bits when checking return value
Same as previous change: we need to mask out the non-verdict bits, as
upcoming patches may embed an errno value in NF_STOLEN verdicts too.

NF_DROP could already do this, but not all called functions do this.

Checks that only test ret vs NF_ACCEPT are fine, the 'errno parts'
are always 0 for those.

Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-18 10:26:43 +02:00
Florian Westphal
6291b3a67a netfilter: conntrack: convert nf_conntrack_update to netfilter verdicts
This function calls helpers that can return nf-verdicts, but then
those get converted to -1/0 as thats what the caller expects.

Theoretically NF_DROP could have an errno number set in the upper 24
bits of the return value. Or any of those helpers could return
NF_STOLEN, which would result in use-after-free.

This is fine as-is, the called functions don't do this yet.

But its better to avoid possible future problems if the upcoming
patchset to add NF_DROP_REASON() support gains further users, so remove
the 0/-1 translation from the picture and pass the verdicts down to
the caller.

Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-18 10:26:43 +02:00
Florian Westphal
4d26ab0086 netfilter: nf_tables: mask out non-verdict bits when checking return value
nftables trace infra must mask out the non-verdict bit parts of the
return value, else followup changes that 'return errno << 8 | NF_STOLEN'
will cause breakage.

Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-18 10:26:43 +02:00
Florian Westphal
e15e502710 netfilter: xt_mangle: only check verdict part of return value
These checks assume that the caller only returns NF_DROP without
any errno embedded in the upper bits.

This is fine right now, but followup patches will start to propagate
such errors to allow kfree_skb_drop_reason() in the called functions,
those would then indicate 'errno << 8 | NF_STOLEN'.

To not break things we have to mask those parts out.

Signed-off-by: Florian Westphal <fw@strlen.de>
2023-10-18 10:26:43 +02:00
David S. Miller
a0a8602247 Merge branch 'devlink-deadlock'
Jiri Pirko says:

====================
devlink: fix a deadlock when taking devlink instance lock while holding RTNL lock

devlink_port_fill() may be called sometimes with RTNL lock held.
When putting the nested port function devlink instance attrs,
current code takes nested devlink instance lock. In that case lock
ordering is wrong.

Patch #1 is a dependency of patch #2.
Patch #2 converts the peernet2id_alloc() call to rely in RCU so it could
         called without devlink instance lock.
Patch #3 takes device reference for devlink instance making sure that
         device does not disappear before devlink_release() is called.
Patch #4 benefits from the preparations done in patches #2 and #3 and
         removes the problematic nested devlink lock aquisition.
Patched #5-#7 improve documentation to reflect this issue so it is
              avoided in the future.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-18 09:23:02 +01:00
Jiri Pirko
5d77371e8c devlink: document devlink_rel_nested_in_notify() function
Add a documentation for devlink_rel_nested_in_notify() describing the
devlink instance locking consequences.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-18 09:23:01 +01:00
Jiri Pirko
bb11cf9b2c Documentation: devlink: add a note about RTNL lock into locking section
Add a note describing the locking order of taking RTNL lock with devlink
instance lock.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-18 09:23:01 +01:00
Jiri Pirko
b6f23b319a Documentation: devlink: add nested instance section
Add a part talking about nested devlink instances describing
the helpers and locking ordering.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-18 09:23:01 +01:00
Jiri Pirko
b5f4e37133 devlink: don't take instance lock for nested handle put
Lockdep reports following issue:

WARNING: possible circular locking dependency detected
------------------------------------------------------
devlink/8191 is trying to acquire lock:
ffff88813f32c250 (&devlink->lock_key#14){+.+.}-{3:3}, at: devlink_rel_devlink_handle_put+0x11e/0x2d0

                           but task is already holding lock:
ffffffff8511eca8 (rtnl_mutex){+.+.}-{3:3}, at: unregister_netdev+0xe/0x20

                           which lock already depends on the new lock.

                           the existing dependency chain (in reverse order) is:

                           -> #3 (rtnl_mutex){+.+.}-{3:3}:
       lock_acquire+0x1c3/0x500
       __mutex_lock+0x14c/0x1b20
       register_netdevice_notifier_net+0x13/0x30
       mlx5_lag_add_mdev+0x51c/0xa00 [mlx5_core]
       mlx5_load+0x222/0xc70 [mlx5_core]
       mlx5_init_one_devl_locked+0x4a0/0x1310 [mlx5_core]
       mlx5_init_one+0x3b/0x60 [mlx5_core]
       probe_one+0x786/0xd00 [mlx5_core]
       local_pci_probe+0xd7/0x180
       pci_device_probe+0x231/0x720
       really_probe+0x1e4/0xb60
       __driver_probe_device+0x261/0x470
       driver_probe_device+0x49/0x130
       __driver_attach+0x215/0x4c0
       bus_for_each_dev+0xf0/0x170
       bus_add_driver+0x21d/0x590
       driver_register+0x133/0x460
       vdpa_match_remove+0x89/0xc0 [vdpa]
       do_one_initcall+0xc4/0x360
       do_init_module+0x22d/0x760
       load_module+0x51d7/0x6750
       init_module_from_file+0xd2/0x130
       idempotent_init_module+0x326/0x5a0
       __x64_sys_finit_module+0xc1/0x130
       do_syscall_64+0x3d/0x90
       entry_SYSCALL_64_after_hwframe+0x46/0xb0

                           -> #2 (mlx5_intf_mutex){+.+.}-{3:3}:
       lock_acquire+0x1c3/0x500
       __mutex_lock+0x14c/0x1b20
       mlx5_register_device+0x3e/0xd0 [mlx5_core]
       mlx5_init_one_devl_locked+0x8fa/0x1310 [mlx5_core]
       mlx5_devlink_reload_up+0x147/0x170 [mlx5_core]
       devlink_reload+0x203/0x380
       devlink_nl_cmd_reload+0xb84/0x10e0
       genl_family_rcv_msg_doit+0x1cc/0x2a0
       genl_rcv_msg+0x3c9/0x670
       netlink_rcv_skb+0x12c/0x360
       genl_rcv+0x24/0x40
       netlink_unicast+0x435/0x6f0
       netlink_sendmsg+0x7a0/0xc70
       sock_sendmsg+0xc5/0x190
       __sys_sendto+0x1c8/0x290
       __x64_sys_sendto+0xdc/0x1b0
       do_syscall_64+0x3d/0x90
       entry_SYSCALL_64_after_hwframe+0x46/0xb0

                           -> #1 (&dev->lock_key#8){+.+.}-{3:3}:
       lock_acquire+0x1c3/0x500
       __mutex_lock+0x14c/0x1b20
       mlx5_init_one_devl_locked+0x45/0x1310 [mlx5_core]
       mlx5_devlink_reload_up+0x147/0x170 [mlx5_core]
       devlink_reload+0x203/0x380
       devlink_nl_cmd_reload+0xb84/0x10e0
       genl_family_rcv_msg_doit+0x1cc/0x2a0
       genl_rcv_msg+0x3c9/0x670
       netlink_rcv_skb+0x12c/0x360
       genl_rcv+0x24/0x40
       netlink_unicast+0x435/0x6f0
       netlink_sendmsg+0x7a0/0xc70
       sock_sendmsg+0xc5/0x190
       __sys_sendto+0x1c8/0x290
       __x64_sys_sendto+0xdc/0x1b0
       do_syscall_64+0x3d/0x90
       entry_SYSCALL_64_after_hwframe+0x46/0xb0

                           -> #0 (&devlink->lock_key#14){+.+.}-{3:3}:
       check_prev_add+0x1af/0x2300
       __lock_acquire+0x31d7/0x4eb0
       lock_acquire+0x1c3/0x500
       __mutex_lock+0x14c/0x1b20
       devlink_rel_devlink_handle_put+0x11e/0x2d0
       devlink_nl_port_fill+0xddf/0x1b00
       devlink_port_notify+0xb5/0x220
       __devlink_port_type_set+0x151/0x510
       devlink_port_netdevice_event+0x17c/0x220
       notifier_call_chain+0x97/0x240
       unregister_netdevice_many_notify+0x876/0x1790
       unregister_netdevice_queue+0x274/0x350
       unregister_netdev+0x18/0x20
       mlx5e_vport_rep_unload+0xc5/0x1c0 [mlx5_core]
       __esw_offloads_unload_rep+0xd8/0x130 [mlx5_core]
       mlx5_esw_offloads_rep_unload+0x52/0x70 [mlx5_core]
       mlx5_esw_offloads_unload_rep+0x85/0xc0 [mlx5_core]
       mlx5_eswitch_unload_sf_vport+0x41/0x90 [mlx5_core]
       mlx5_devlink_sf_port_del+0x120/0x280 [mlx5_core]
       genl_family_rcv_msg_doit+0x1cc/0x2a0
       genl_rcv_msg+0x3c9/0x670
       netlink_rcv_skb+0x12c/0x360
       genl_rcv+0x24/0x40
       netlink_unicast+0x435/0x6f0
       netlink_sendmsg+0x7a0/0xc70
       sock_sendmsg+0xc5/0x190
       __sys_sendto+0x1c8/0x290
       __x64_sys_sendto+0xdc/0x1b0
       do_syscall_64+0x3d/0x90
       entry_SYSCALL_64_after_hwframe+0x46/0xb0

                           other info that might help us debug this:

Chain exists of:
                             &devlink->lock_key#14 --> mlx5_intf_mutex --> rtnl_mutex

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(rtnl_mutex);
                               lock(mlx5_intf_mutex);
                               lock(rtnl_mutex);
  lock(&devlink->lock_key#14);

Problem is taking the devlink instance lock of nested instance when RTNL
is already held.

To fix this, don't take the devlink instance lock when putting nested
handle. Instead, rely on the preparations done by previous two patches
to be able to access device pointer and obtain netns id without devlink
instance lock held.

Fixes: c137743bce ("devlink: introduce object and nested devlink relationship infra")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-18 09:23:01 +01:00
Jiri Pirko
a380687200 devlink: take device reference for devlink object
In preparation to allow to access device pointer without devlink
instance lock held, make sure the device pointer is usable until
devlink_release() is called.

Fixes: c137743bce ("devlink: introduce object and nested devlink relationship infra")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-18 09:23:01 +01:00