Commit Graph

1140851 Commits

Author SHA1 Message Date
Jakub Kicinski
95d1815f09 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

1) Incorrect error check in nft_expr_inner_parse(), from Dan Carpenter.

2) Add DATA_SENT state to SCTP connection tracking helper, from
   Sriram Yagnaraman.

3) Consolidate nf_confirm for ipv4 and ipv6, from Florian Westphal.

4) Add bitmask support for ipset, from Vishwanath Pai.

5) Handle icmpv6 redirects as RELATED, from Florian Westphal.

6) Add WARN_ON_ONCE() to impossible case in flowtable datapath,
   from Li Qiong.

7) A large batch of IPVS updates to replace timer-based estimators by
   kthreads to scale up wrt. CPUs and workload (millions of estimators).

Julian Anastasov says:

	This patchset implements stats estimation in kthread context.
It replaces the code that runs on single CPU in timer context every 2
seconds and causing latency splats as shown in reports [1], [2], [3].
The solution targets setups with thousands of IPVS services,
destinations and multi-CPU boxes.

	Spread the estimation on multiple (configured) CPUs and multiple
time slots (timer ticks) by using multiple chains organized under RCU
rules.  When stats are not needed, it is recommended to use
run_estimation=0 as already implemented before this change.

RCU Locking:

- As stats are now RCU-locked, tot_stats, svc and dest which
hold estimator structures are now always freed from RCU
callback. This ensures RCU grace period after the
ip_vs_stop_estimator() call.

Kthread data:

- every kthread works over its own data structure and all
such structures are attached to array. For now we limit
kthreads depending on the number of CPUs.

- even while there can be a kthread structure, its task
may not be running, eg. before first service is added or
while the sysctl var is set to an empty cpulist or
when run_estimation is set to 0 to disable the estimation.

- the allocated kthread context may grow from 1 to 50
allocated structures for timer ticks which saves memory for
setups with small number of estimators

- a task and its structure may be released if all
estimators are unlinked from its chains, leaving the
slot in the array empty

- every kthread data structure allows limited number
of estimators. Kthread 0 is also used to initially
calculate the max number of estimators to allow in every
chain considering a sub-100 microsecond cond_resched
rate. This number can be from 1 to hundreds.

- kthread 0 has an additional job of optimizing the
adding of estimators: they are first added in
temp list (est_temp_list) and later kthread 0
distributes them to other kthreads. The optimization
is based on the fact that newly added estimator
should be estimated after 2 seconds, so we have the
time to offload the adding to chain from controlling
process to kthread 0.

- to add new estimators we use the last added kthread
context (est_add_ktid). The new estimators are linked to
the chains just before the estimated one, based on add_row.
This ensures their estimation will start after 2 seconds.
If estimators are added in bursts, common case if all
services and dests are initially configured, we may
spread the estimators to more chains and as result,
reducing the initial delay below 2 seconds.

Many thanks to Jiri Wiesner for his valuable comments
and for spending a lot of time reviewing and testing
the changes on different platforms with 48-256 CPUs and
1-8 NUMA nodes under different cpufreq governors.

The new IPVS estimators do not use workqueue infrastructure
because:

- The estimation can take long time when using multiple IPVS rules (eg.
  millions estimator structures) and especially when box has multiple
  CPUs due to the for_each_possible_cpu usage that expects packets from
  any CPU. With est_nice sysctl we have more control how to prioritize the
  estimation kthreads compared to other processes/kthreads that have
  latency requirements (such as servers). As a benefit, we can see these
  kthreads in top and decide if we will need some further control to limit
  their CPU usage (max number of structure to estimate per kthread).

- with kthreads we run code that is read-mostly, no write/lock
  operations to process the estimators in 2-second intervals.

- work items are one-shot: as estimators are processed every
  2 seconds, they need to be re-added every time. This again
  loads the timers (add_timer) if we use delayed works, as there are
  no kthreads to do the timings.

[1] Report from Yunhong Jiang:
    https://lore.kernel.org/netdev/D25792C1-1B89-45DE-9F10-EC350DC04ADC@gmail.com/
[2] https://marc.info/?l=linux-virtual-server&m=159679809118027&w=2
[3] Report from Dust:
    https://archive.linuxvirtualserver.org/html/lvs-devel/2020-12/msg00000.html

* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  ipvs: run_estimation should control the kthread tasks
  ipvs: add est_cpulist and est_nice sysctl vars
  ipvs: use kthreads for stats estimation
  ipvs: use u64_stats_t for the per-cpu counters
  ipvs: use common functions for stats allocation
  ipvs: add rcu protection to stats
  netfilter: flowtable: add a 'default' case to flowtable datapath
  netfilter: conntrack: set icmpv6 redirects as RELATED
  netfilter: ipset: Add support for new bitmask parameter
  netfilter: conntrack: merge ipv4+ipv6 confirm functions
  netfilter: conntrack: add sctp DATA_SENT state
  netfilter: nft_inner: fix IS_ERR() vs NULL check
====================

Link: https://lore.kernel.org/r/20221211101204.1751-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 14:45:36 -08:00
Rob Herring
15eb162176 dt-bindings: net: Convert Socionext NetSec Ethernet to DT schema
Convert the Socionext NetSec Ethernet binding to DT schema format.

Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Jassi Brar <jaswinder.singh@linaro.org>
Link: https://lore.kernel.org/r/20221209171553.3350583-1-robh@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 13:03:45 -08:00
Taras Chornyi
4e426e2534 MAINTAINERS: Update email address for Marvell Prestera Ethernet Switch driver
Taras's Marvell email account will be shut down soon so change it to Plvision.

Signed-off-by: Taras Chornyi <taras.chornyi@plvision.eu>
Signed-off-by: Vadym Kochan <vadym.kochan@plvision.eu>
Link: https://lore.kernel.org/r/20221209154521.1246881-1-vadym.kochan@plvision.eu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 13:03:11 -08:00
Xu Panda
80a464d83f net: hns3: use strscpy() to instead of strncpy()
The implementation of strscpy() is more robust and safer.
That's now the recommended way to copy NUL terminated strings.

Signed-off-by: Xu Panda <xu.panda@zte.com.cn>
Signed-off-by: Yang Yang <yang.yang29@zte.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/202212091538591375035@zte.com.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 13:00:00 -08:00
Firo Yang
da05cecc49 sctp: sysctl: make extra pointers netns aware
Recently, a customer reported that from their container whose
net namespace is different to the host's init_net, they can't set
the container's net.sctp.rto_max to any value smaller than
init_net.sctp.rto_min.

For instance,
Host:
sudo sysctl net.sctp.rto_min
net.sctp.rto_min = 1000

Container:
echo 100 > /mnt/proc-net/sctp/rto_min
echo 400 > /mnt/proc-net/sctp/rto_max
echo: write error: Invalid argument

This is caused by the check made from this'commit 4f3fdf3bc5
("sctp: add check rto_min and rto_max in sysctl")'
When validating the input value, it's always referring the boundary
value set for the init_net namespace.

Having container's rto_max smaller than host's init_net.sctp.rto_min
does make sense. Consider that the rto between two containers on the
same host is very likely smaller than it for two hosts.

So to fix this problem, as suggested by Marcelo, this patch makes the
extra pointers of rto_min, rto_max, pf_retrans, and ps_retrans point
to the corresponding variables from the newly created net namespace while
the new net namespace is being registered in sctp_sysctl_net_register.

Fixes: 4f3fdf3bc5 ("sctp: add check rto_min and rto_max in sysctl")
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Firo Yang <firo.yang@suse.com>
Link: https://lore.kernel.org/r/20221209054854.23889-1-firo.yang@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 12:57:29 -08:00
Roger Quadros
5821504f50 net: ethernet: ti: am65-cpsw: Fix PM runtime leakage in am65_cpsw_nuss_ndo_slave_open()
Ensure pm_runtime_put() is issued in error path.

Reported-by: Jakub Kicinski <kuba@kernel.org>
Fixes: 93a7653031 ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver")
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Saeed Mahameed <saeed@kernel.org>
Link: https://lore.kernel.org/r/20221208105534.63709-1-rogerq@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 12:52:28 -08:00
Jakub Kicinski
fba119cee1 wireless-next patches for v6.2
Fourth set of patches for v6.2. Few final patches, a big change is
 that rtw88 now has USB support.
 
 Major changes:
 
 rtw88
 
 * support USB devices rtw8821cu, rtw8822bu, rtw8822cu and rtw8723du
 -----BEGIN PGP SIGNATURE-----
 
 iQFFBAABCgAvFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmOW9EsRHGt2YWxvQGtl
 cm5lbC5vcmcACgkQbhckVSbrbZup1wf9EO1wK3M+wICOthV9hphaMK/cU5XQe455
 V27N+IGg3KkeLgERYj/35C1Xwa4vXeUxWN31KvNJ/51IQaM14NfUE3OOu+gqUO36
 a/HhECtcmzsjC+22hH1gvnN73/+S8VQEEfOJ//Jcg6exRmXVKRMWZU1o3rFupAR+
 qePWzvYuadobmE8G/Zq2VFB5Eouf5z7MSXLd7Y/8HJMdL812GmA0nhNJXhHDt27A
 /K5QBOSnRQ24ZLOG4P0eFZrJGqFTcI27+w4bv+rzUL2ME/Rgp1hB7wBwQbCwD8sU
 Aus05avcIcVNfsTk3z7kOnYiBqf08WX6Ng6ezb9HunXXfWhLW7JtAw==
 =AzL1
 -----END PGP SIGNATURE-----

Merge tag 'wireless-next-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Kalle Valo says:

====================
wireless-next patches for v6.2

Fourth set of patches for v6.2. Few final patches, a big change is
that rtw88 now has USB support.

Major changes:

rtw88
 * support USB devices rtw8821cu, rtw8822bu, rtw8822cu and rtw8723du

* tag 'wireless-next-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (43 commits)
  wifi: rtl8xxxu: fixing IQK failures for rtl8192eu
  wifi: rtlwifi: btcoexist: fix conditions branches that are never executed
  wifi: rtlwifi: rtl8192se: remove redundant rtl_get_bbreg() call
  wifi: rtw88: Add rtw8723du chipset support
  wifi: rtw88: Add rtw8822cu chipset support
  wifi: rtw88: Add rtw8822bu chipset support
  wifi: rtw88: Add rtw8821cu chipset support
  wifi: rtw88: Add common USB chip support
  wifi: rtw88: iterate over vif/sta list non-atomically
  wifi: rtw88: Drop coex mutex
  wifi: rtw88: Drop h2c.lock
  wifi: rtw88: Drop rf_lock
  wifi: rtw88: Call rtw_fw_beacon_filter_config() with rtwdev->mutex held
  wifi: rtw88: print firmware type in info message
  wifi: rtw89: add join info upon create interface
  wifi: rtw89: fix unsuccessful interface_add flow
  wifi: rtw89: stop mac port function when stop_ap()
  wifi: rtw89: add mac TSF sync function
  wifi: rtw89: request full firmware only once if it's early requested
  wifi: rtw89: don't request partial firmware if SECURITY_LOADPIN_ENFORCE
  ...
====================

Link: https://lore.kernel.org/r/20221212093026.5C5AEC433D2@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 12:15:23 -08:00
Jakub Kicinski
26f708a284 bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+soXsSLHKoYyzcli6rmadz2vbToFAmOWgtsACgkQ6rmadz2v
 bTpT2g//WzQRsODtPVVmg87fEo1GSTXvoXq/fhg95OKNZrVKgx1N6EVlFSLSqEjL
 TAmOuv5cZT28ZpMPMNjnU/c/lFf/6/UWbbTusA+F3MtSCBSbP5DPsWDD0yvNT9DL
 EZbGoQDSyt1M+BakZLzwOV6HPn9oDhj5p/4lMw+gptTY+3IeYUbS50DinM8eLz+Q
 067aF01p3ROF6LNUx9Az0cLPdU05oHzL2MvRsj/F7h/sWoSW5B/1Kx/m1vsT9lwn
 T2vbm6r4Jo0m0ZvpEMeRyKNZgVKIc64C7NH9CV7V66giJaONmxvLwkc0zWFwbXJ2
 V9aPQbbBUx/CZXoC72LEsvVcoAFl7LAL1IALm2HVt1iQjpj1yDlWw3WV0PMQ9Rn7
 xRVDOfQNGZ6jnkv6LB2j7V1z7hVENWQQwM48dgO2pAnJwYmUW9wZaAGE5kadUrZf
 eCD4c1U+qcZkSk4vwvpr8ubJ0PWPMUZqI0FrHUxfPxqkdy78c1h3qNQufZvAHWff
 Ca9NZqraFACTx58ZBsN1V5Xzv7azoK8Zgr9+JwVNahpFxclrbL8xuceThkC4smBl
 fiZJC9fClD9ATquIdj177jNMVC8F4B5yrKF/ehJDcNQhcqUdWx9Sbj461enf+3HI
 nfTP+77ZzyIJ76iRXJBV/jr9wkaPWhAZVeBGxmw5clTvB9/RBbU=
 =fzwv
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Alexei Starovoitov says:

====================
pull-request: bpf-next 2022-12-11

We've added 74 non-merge commits during the last 11 day(s) which contain
a total of 88 files changed, 3362 insertions(+), 789 deletions(-).

The main changes are:

1) Decouple prune and jump points handling in the verifier, from Andrii.

2) Do not rely on ALLOW_ERROR_INJECTION for fmod_ret, from Benjamin.
   Merged from hid tree.

3) Do not zero-extend kfunc return values. Necessary fix for 32-bit archs,
   from Björn.

4) Don't use rcu_users to refcount in task kfuncs, from David.

5) Three reg_state->id fixes in the verifier, from Eduard.

6) Optimize bpf_mem_alloc by reusing elements from free_by_rcu, from Hou.

7) Refactor dynptr handling in the verifier, from Kumar.

8) Remove the "/sys" mount and umount dance in {open,close}_netns
  in bpf selftests, from Martin.

9) Enable sleepable support for cgrp local storage, from Yonghong.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (74 commits)
  selftests/bpf: test case for relaxed prunning of active_lock.id
  selftests/bpf: Add pruning test case for bpf_spin_lock
  bpf: use check_ids() for active_lock comparison
  selftests/bpf: verify states_equal() maintains idmap across all frames
  bpf: states_equal() must build idmap for all function frames
  selftests/bpf: test cases for regsafe() bug skipping check_id()
  bpf: regsafe() must not skip check_ids()
  docs/bpf: Add documentation for BPF_MAP_TYPE_SK_STORAGE
  selftests/bpf: Add test for dynptr reinit in user_ringbuf callback
  bpf: Use memmove for bpf_dynptr_{read,write}
  bpf: Move PTR_TO_STACK alignment check to process_dynptr_func
  bpf: Rework check_func_arg_reg_off
  bpf: Rework process_dynptr_func
  bpf: Propagate errors from process_* checks in check_func_arg
  bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func
  bpf: Skip rcu_barrier() if rcu_trace_implies_rcu_gp() is true
  bpf: Reuse freed element in free_by_rcu during allocation
  selftests/bpf: Bring test_offload.py back to life
  bpf: Fix comment error in fixup_kfunc_call function
  bpf: Do not zero-extend kfunc return values
  ...
====================

Link: https://lore.kernel.org/r/20221212024701.73809-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-12 11:27:42 -08:00
David S. Miller
b2b509fb5a linux-can-next-for-6.2-20221212
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEBsvAIBsPu6mG7thcrX5LkNig010FAmOXCo0THG1rbEBwZW5n
 dXRyb25peC5kZQAKCRCtfkuQ2KDTXUD/B/4m3F/FIpfQjrLR8P/87hkHtu1vLnPV
 uo7SGVmq8aDiMRqWSLBWkiP6ceGememckrplG+qprx9QVZhagbtFN1/kE9jXxYSU
 s+hh4ARmLfpdZmcNCFzFi2S68G4fsQ3rTI8g/itbpQYCGyHt90yA5+PqT+QQZOvo
 J4l4uim4nVyBNEW136Vf13K0iFCmeJr4zr1eSqqHhZVSbK2cQSSN93HlhmZ+ccjT
 c1vZdNL1nd9mCQkNFHC+JPehfcgaI0r6hx8RYECvOCy8EbK1ZXdseWlM/z7IoxZm
 DiT2KhTao9xa9qcnRjI6oqTwO0omlqND2YDBj+SRwfFAs3HogeOaQjRT
 =w/j3
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-next-for-6.2-20221212' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
linux-can-next-for-6.2-20221212

this is a pull request of 39 patches for net-next/master.

The first 2 patches are by me fix a warning and coding style in the
kvaser_usb driver.

Vivek Yadav's patch sorts the includes of the m_can driver.

Biju Das contributes 5 patches for the rcar_canfd driver improve the
support for different IP core variants.

Jean Delvare's patch for the ctucanfd drops the dependency on
COMPILE_TEST.

Vincent Mailhol's patch sorts the includes of the etas_es58x driver.

Haibo Chen's contributes 2 patches that add i.MX93 support to the
flexcan driver.

Lad Prabhakar's patch updates the dt-bindings documentation of the
rcar_canfd driver.

Minghao Chi's patch converts the c_can platform driver to
devm_platform_get_and_ioremap_resource().

In the next 7 patches Vincent Mailhol adds devlink support to the
etas_es58x driver to report firmware, bootloader and hardware version.

Xu Panda's patch converts a strncpy() -> strscpy() in the ucan driver.

Ye Bin's patch removes a useless parameter from the AF_CAN protocol.

The next 2 patches by Vincent Mailhol and remove unneeded or unused
pointers to struct usb_interface in device's priv struct in the ucan
and gs_usb driver.

Vivek Yadav's patch cleans up the usage of the RAM initialization in
the m_can driver.

A patch by me add support for SO_MARK to the AF_CAN protocol.

Geert Uytterhoeven's patch fixes the number of CAN channels in the
rcan_canfd bindings documentation.

In the last 11 patches Markus Schneider-Pargmann optimizes the
register access in the t_can driver and cleans up the tcan glue
driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-12 12:11:37 +00:00
Marc Kleine-Budde
47bf2b2393 Merge patch series "can: m_can: Optimizations for tcan and peripheral chips"
Markus Schneider-Pargmann <msp@baylibre.com> says:

as requested I split the series into two parts. This is the first
parts with simple improvements to reduce the number of SPI transfers.
The second part will be the rest with coalescing support and more
complex optimizations.

Changes since v1: https://lore.kernel.org/all/20221116205308.2996556-1-msp@baylibre.com
- Fixed register ranges
- Added fixes: tag for two patches

Link: https://lore.kernel.org/all/20221206115728.1056014-1-msp@baylibre.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 12:01:26 +01:00
Markus Schneider-Pargmann
39dbb21b6a can: tcan4x5x: Specify separate read/write ranges
Specify exactly which registers are read/writeable in the chip. This
is supposed to help detect any violations in the future.

Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com>
Link: https://lore.kernel.org/all/20221206115728.1056014-12-msp@baylibre.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 12:01:01 +01:00
Markus Schneider-Pargmann
ef5778f708 can: tcan4x5x: Fix register range of first two blocks
According to the datasheet 0x10 is the last register in the first block,
not register 0x2c.

The datasheet lists the last register of the second block as 0x830, not
0x83c.

Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com>
Link: https://lore.kernel.org/all/20221206115728.1056014-11-msp@baylibre.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 12:01:01 +01:00
Markus Schneider-Pargmann
67727a17a6 can: tcan4x5x: Fix use of register error status mask
TCAN4X5X_ERROR_STATUS is not a status register that needs clearing
during interrupt handling. Instead this is a masking register that masks
error interrupts. Writing TCAN4X5X_CLEAR_ALL_INT to this register
effectively masks everything.

Rename the register and mask all error interrupts only once by writing
to the register in tcan4x5x_init.

Fixes: 5443c226ba ("can: tcan4x5x: Add tcan4x5x driver to the kernel")
Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com>
Link: https://lore.kernel.org/all/20221206115728.1056014-10-msp@baylibre.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 12:00:50 +01:00
Markus Schneider-Pargmann
40c9e4f676 can: tcan4x5x: Remove invalid write in clear_interrupts
Register 0x824 TCAN4X5X_MCAN_INT_REG is a read-only register. Any writes
to this register do not have any effect.

Remove this write. The m_can driver aldready clears the interrupts in
m_can_isr() by writing to M_CAN_IR which is translated to register
0x1050 which is a writable version of this register.

Fixes: 5443c226ba ("can: tcan4x5x: Add tcan4x5x driver to the kernel")
Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com>
Link: https://lore.kernel.org/all/20221206115728.1056014-9-msp@baylibre.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:58:33 +01:00
Markus Schneider-Pargmann
e2f1c8cb02 can: m_can: Batch acknowledge rx fifo
Instead of acknowledging every item of the fifo, only acknowledge the
last item read. This behavior is documented in the datasheet. The new
getindex will be the acknowledged item + 1.

Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com>
Link: https://lore.kernel.org/all/20221206115728.1056014-8-msp@baylibre.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:58:33 +01:00
Markus Schneider-Pargmann
e3bff5256a can: m_can: Batch acknowledge transmit events
Transmit events from the txe fifo can be batch acknowledged by
acknowledging the last read txe fifo item. This will save txe_count
writes which is important for peripheral chips.

Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com>
Link: https://lore.kernel.org/all/20221206115728.1056014-7-msp@baylibre.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:58:32 +01:00
Markus Schneider-Pargmann
6355a3c983 can: m_can: Count read getindex in the driver
The getindex gets increased by one every time. We can calculate the
correct getindex in the driver and avoid the additional reads of rxfs.

Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com>
Link: https://lore.kernel.org/all/20221206115728.1056014-6-msp@baylibre.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:58:32 +01:00
Markus Schneider-Pargmann
d4535b90a7 can: m_can: Count TXE FIFO getidx in the driver
The getindex simply increases by one for every iteration. There is no
need to get the current getidx every time from a register. Instead we
can just count and wrap if necessary.

Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com>
Link: https://lore.kernel.org/all/20221206115728.1056014-5-msp@baylibre.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:58:32 +01:00
Markus Schneider-Pargmann
fac52bf786 can: m_can: Read register PSR only on error
Only read register PSR if there is an error indicated in irqstatus.

Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com>
Link: https://lore.kernel.org/all/20221206115728.1056014-4-msp@baylibre.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:58:32 +01:00
Markus Schneider-Pargmann
5775793797 can: m_can: Avoid reading irqstatus twice
For peripheral devices the m_can_rx_handler is called directly after
setting cdev->irqstatus. This means we don't have to read the irqstatus
again in m_can_rx_handler. Avoid this by adding a parameter that is
false for direct calls.

Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com>
Link: https://lore.kernel.org/all/20221206115728.1056014-3-msp@baylibre.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:58:32 +01:00
Markus Schneider-Pargmann
c1eaf8b9bd can: m_can: Eliminate double read of TXFQS in tx_handler
The TXFQS register is read first to check if the fifo is full and then
immediately again to get the putidx. This is unnecessary and adds
significant overhead if read requests are done over a slow bus, for
example SPI with tcan4x5x.

Add a variable to store the value of the register. Split the
m_can_tx_fifo_full function into two to avoid the hidden m_can_read call
if not needed.

Signed-off-by: Markus Schneider-Pargmann <msp@baylibre.com>
Link: https://lore.kernel.org/all/20221206115728.1056014-2-msp@baylibre.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:58:32 +01:00
Geert Uytterhoeven
3abcc01c38 dt-bindings: can: renesas,rcar-canfd: Fix number of channels for R-Car V3U
According to the bindings, only two channels are supported.
However, R-Car V3U supports eight, leading to "make dtbs" failures:

        arch/arm64/boot/dts/renesas/r8a779a0-falcon.dtb: can@e6660000: Unevaluated properties are not allowed ('channel2', 'channel3', 'channel4', 'channel5', 'channel6', 'channel7' were unexpected)

Update the number of channels to 8 on R-Car V3U.
While at it, prevent adding more properties to the channel nodes, as
they must contain no other properties than a status property.

Fixes: d6254d52d7 ("dt-bindings: can: renesas,rcar-canfd: Document r8a779a0 support")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/all/7d41d72cd7db2e90bae069ce57dbb672f17500ae.1670431681.git.geert+renesas@glider.be
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:53:10 +01:00
Marc Kleine-Budde
0826e82b8a can: raw: add support for SO_MARK
Add support for SO_MARK to the CAN_RAW protocol. This makes it
possible to add traffic control filters based on the fwmark.

Link: https://lore.kernel.org/all/20221210113653.170346-1-mkl@pengutronix.de
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:43:16 +01:00
Vivek Yadav
eaacfeaca7 can: m_can: Call the RAM init directly from m_can_chip_config
When we try to access the mcan message ram addresses during the probe,
hclk is gated by any other drivers or disabled, because of that probe
gets failed.

Move the mram init functionality to mcan chip config called by
m_can_start from mcan open function, by that time clocks are
enabled.

Suggested-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Vivek Yadav <vivek.2311@samsung.com>
Link: https://lore.kernel.org/all/20221207100632.96200-2-vivek.2311@samsung.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:42:33 +01:00
Marc Kleine-Budde
bd4a52bf9d Merge patch series "can: usb: remove pointers to struct usb_interface in device's priv structures"
Vincent Mailhol <mailhol.vincent@wanadoo.fr> says:

The gs_can and ucan drivers keep a pointer to struct usb_interface in
their private structure. This is not needed. For gs_can the only use
is to retrieve struct usb_device, which is already available in
gs_usb::udev. For ucan, the field is set but never used.

Remove the struct usb_interface fields and clean up.

Link: https://lore.kernel.org/all/20221208081142.16936-1-mailhol.vincent@wanadoo.fr
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:41:58 +01:00
Vincent Mailhol
56c56a309e can: gs_usb: remove gs_can::iface
The iface field of struct gs_can is only used to retrieve the
usb_device which is already available in gs_can::udev.

Replace each occurrence of interface_to_usbdev(dev->iface) with
dev->udev. This done, remove gs_can::iface.

Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Link: https://lore.kernel.org/all/20221208081142.16936-3-mailhol.vincent@wanadoo.fr
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:41:25 +01:00
Vincent Mailhol
f54b101dde can: ucan: remove unused ucan_priv::intf
Field intf of struct ucan_priv is set but never used. Remove it.

Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Link: https://lore.kernel.org/all/20221208081142.16936-2-mailhol.vincent@wanadoo.fr
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:41:25 +01:00
Ye Bin
f793458bba net: af_can: remove useless parameter 'err' in 'can_rx_register()'
Since commit bdfb5765e4 remove NULL-ptr checks from users of
can_dev_rcv_lists_find(). 'err' parameter is useless, so remove it.

Signed-off-by: Ye Bin <yebin10@huawei.com>
Link: https://lore.kernel.org/all/20221208090940.3695670-1-yebin@huaweicloud.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:41:24 +01:00
Xu Panda
7fdaf8966a can: ucan: use strscpy() to instead of strncpy()
The implementation of strscpy() is more robust and safer.
That's now the recommended way to copy NUL terminated strings.

Signed-off-by: Xu Panda <xu.panda@zte.com.cn>
Signed-off-by: Yang Yang <yang.yang29@zte.com>
Link: https://lore.kernel.org/all/202212070909095189693@zte.com.cn
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:41:24 +01:00
Marc Kleine-Budde
5425094a39 Merge patch series "can: etas_es58x: report firmware, bootloader and hardware version"
Vincent Mailhol <mailhol.vincent@wanadoo.fr> says:

The goal of this series is to report the firmware version, the
bootloader version and the hardware revision of ETAS ES58x devices.

These are already reported in the kernel log but this isn't best
practice. Remove the kernel log and instead export all these through
devlink. The devlink core automatically exports the firmware and the
bootloader version to ethtool, so no need to implement the
ethtool_ops::get_drvinfo() callback anymore.

Patch one and two implement the core support for devlink (at device
level) and devlink port (at the network interface level).

Patch three export usb_cache_string() and patch four add a new info
attribute to devlink.h. Both are prerequisites for patch five.

Patch five is the actual goal: it parses the product information from
a custom usb string returned by the device and expose them through
devlink.

Patch six removes the product information from the kernel log.

Finally, patch seven add a devlink documentation page with list all
the information attributes reported by the driver.

* Sample outputs following this series *

| $ devlink dev info
| usb/1-9:1.1:
|   serial_number 0108954
|   versions:
|       fixed:
|         board.rev B012/000
|       running:
|         fw 04.00.01
|         fw.bootloader 02.00.00

| $ devlink port show can0
| usb/1-9:1.1/0: type eth netdev can0 flavour physical port 0 splittable false

| $ ethtool -i can0
| driver: etas_es58x
| version: 6.1.0-rc7+
| firmware-version: 04.00.01 02.00.00
| expansion-rom-version:
| bus-info: 1-9:1.1
| supports-statistics: no
| supports-test: no
| supports-eeprom-access: no
| supports-register-dump: no
| supports-priv-flags: no

* Changelog *

v4 -> v5: https://lore.kernel.org/all/20221130174658.29282-1-mailhol.vincent@wanadoo.fr

  * [PATH 2/7] add devlink port support. This extends devlink to the
    network interface.

  * thanks to devlink port, 'ethtool -i' is now able to retrieve the
    firmware version from devlink. No need to implement the
    ethtool_ops::get_drvinfo() callback anymore: remove one patch from
    the series.

  * [PATCH 4/7] A new patch to add a new info attribute for the
    bootloader version in devlink.h. This patch was initially sent as
    a standalone patch here:
      https://lore.kernel.org/netdev/20221129031406.3849872-1-mailhol.vincent@wanadoo.fr
    Merging it to this series so that it is both added and used at the
    same time.

  * [PATCH 5/7] use the newly info attribute defined in patch 4/7 to
    report the bootloader version instead of the custom string "bl".

  * [PATCH 5/7] because the series does not implement
    ethtool_ops::get_drvinfo() anymore, the two helper functions
    es58x_sw_version_is_set() and es58x_hw_revision_is_set() are only
    used in devlink.c. Move them from es58x_core.h to es58x_devlink.c.

  * [PATCH 5/7] small rework of the helper function
    es58x_hw_revision_is_set(): it is OK to only check the letter (if
    the letter is '\0', it will not be possible to print the next
    numbers).

  * [PATCH 5/7 and 6/7] add reviewed-by Andrew Lunn tag.

  * [PATCH 7/7] Now, 'ethtool -i' reports both the firmware version
    and the bootloader version (this is how the core export the
    information from devlink to ethtool). Update the documentation to
    reflect this fact.

  * Reoder the patches.

v3 -> v4: https://lore.kernel.org/all/20221126162211.93322-1-mailhol.vincent@wanadoo.fr

  * major rework to use devlink instead of sysfs following Andrew's
    comment.

  * split the series in 6 patches.

  * [PATCH 1/6] add Acked-by: Greg Kroah-Hartman

v2 -> v3: https://lore.kernel.org/all/20221113040108.68249-1-mailhol.vincent@wanadoo.fr

  * patch 2/3: do not spam the kernel log anymore with the product
    number. Instead parse the product information string, extract the
    firmware version, the bootloadar version and the hardware revision
    and export them through sysfs.

  * patch 2/3: rework the parsing in order not to need additional
    fields in struct es58x_parameters.

  * patch 3/3: only populate ethtool_drvinfo::fw_version because since
    commit edaf5df22c ("ethtool: ethtool_get_drvinfo: populate
    drvinfo fields even if callback exits"), there is no need to
    populate ethtool_drvinfo::driver and ethtool_drvinfo::bus_info in
    the driver.

v1 -> v2: https://lore.kernel.org/all/20221104171604.24052-1-mailhol.vincent@wanadoo.fr

  * was a single patch. It is now a series of three patches.
  * add a first new patch to export  usb_cache_string().
  * add a second new patch to apply usb_cache_string() to existing code.
  * add missing check on product info string to prevent a buffer overflow.
  * add comma on the last entry of struct es58x_parameters.

v1: https://lore.kernel.org/all/20221104073659.414147-1-mailhol.vincent@wanadoo.fr

Link: https://lore.kernel.org/all/20221130174658.29282-1-mailhol.vincent@wanadoo.fr
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:40:16 +01:00
Vincent Mailhol
9f63f96aac Documentation: devlink: add devlink documentation for the etas_es58x driver
List all the version information reported by the etas_es58x driver
through devlink. Also, update MAINTAINERS with the newly created file.

Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Link: https://lore.kernel.org/all/20221130174658.29282-8-mailhol.vincent@wanadoo.fr
[mkl: fixed version information table: "bl" -> "fw.bootloader"
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:39:13 +01:00
Vincent Mailhol
d8f26fd689 can: etas_es58x: remove es58x_get_product_info()
Now that the product information are available under devlink, no more
need to print them in the kernel log. Remove es58x_get_product_info().

Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/all/20221130174658.29282-7-mailhol.vincent@wanadoo.fr
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:39:13 +01:00
Vincent Mailhol
9f06631c3f can: etas_es58x: export product information through devlink_ops::info_get()
ES58x devices report below product information through a custom usb
string:

  * the firmware version
  * the bootloader version
  * the hardware revision

Parse this string, store the results in struct es58x_dev, export:

  * the firmware version through devlink's "fw" name
  * the bootloader version through devlink's "fw.bootloader" name
  * the hardware revisionthrough devlink's "board.rev" name

Those devlink entries are not critical to use the device, if parsing
fails, print an informative log message and continue to probe the
device.

In addition to that, use usb_device::serial to report the device
serial number.

Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/all/20221130174658.29282-6-mailhol.vincent@wanadoo.fr
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:39:13 +01:00
Vincent Mailhol
01d8053229 net: devlink: add DEVLINK_INFO_VERSION_GENERIC_FW_BOOTLOADER
As discussed in [1], abbreviating the bootloader to "bl" might not be
well understood. Instead, a bootloader technically being a firmware,
name it "fw.bootloader".

Add a new macro to devlink.h to formalize this new info attribute name
and update the documentation.

[1] https://lore.kernel.org/netdev/20221128142723.2f826d20@kernel.org/

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Link: https://lore.kernel.org/all/20221130174658.29282-5-mailhol.vincent@wanadoo.fr
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:39:13 +01:00
Minghao Chi
74d95352bd can: c_can: use devm_platform_get_and_ioremap_resource()
Convert platform_get_resource(), devm_ioremap_resource() to a single
call to devm_platform_get_and_ioremap_resource(), as this is exactly
what this function does.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Link: https://lore.kernel.org/all/202211111443005202576@zte.com.cn
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:39:12 +01:00
Vincent Mailhol
983055bf83 USB: core: export usb_cache_string()
usb_cache_string() can also be useful for the drivers so export it.

Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/all/20221130174658.29282-4-mailhol.vincent@wanadoo.fr
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:39:12 +01:00
Lad Prabhakar
5237ff4e7d dt-bindings: can: renesas,rcar-canfd: Document RZ/Five SoC
The CANFD block on the RZ/Five SoC is identical to one found on the
RZ/G2UL SoC. "renesas,r9a07g043-canfd" compatible string will be used
on the RZ/Five SoC so to make this clear, update the comment to include
RZ/Five SoC.

No driver changes are required as generic compatible string
"renesas,rzg2l-canfd" will be used as a fallback on RZ/Five SoC.

Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/all/20221115123811.1182922-1-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:39:12 +01:00
Vincent Mailhol
594a25e1ff can: etas_es58x: add devlink port support
Add support for devlink port which extends the devlink support to the
network interface level. For now, the etas_es58x driver will only rely
on the default features that devlink port has to offer and not
implement additional feature ones.

Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Link: https://lore.kernel.org/all/20221130174658.29282-3-mailhol.vincent@wanadoo.fr
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:39:12 +01:00
Haibo Chen
a21cee59b4 dt-bindings: can: fsl,flexcan: add imx93 compatible
Add a new compatible string for imx93.

Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/all/1669116752-4260-2-git-send-email-haibo.chen@nxp.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:39:12 +01:00
Vincent Mailhol
2c4a1efcf6 can: etas_es58x: add devlink support
Add basic support for devlink at the device level. The callbacks of
struct devlink_ops will be implemented next.

Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Link: https://lore.kernel.org/all/20221130174658.29282-2-mailhol.vincent@wanadoo.fr
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:39:12 +01:00
Haibo Chen
8cb53b485f can: flexcan: add auto stop mode for IMX93 to support wakeup
IMX93 do not contain a GPR to config the stop mode, it will set
the flexcan into stop mode automatically once the ARM core go
into low power mode (WFI instruct) and gate off the flexcan
related clock automatically. But to let these logic work as
expect, before ARM core go into low power mode, need to make
sure the flexcan related clock keep on.

To support stop mode and wakeup feature on imx93, this patch
add a new fsl_imx93_devtype_data to separate from imx8mp.

Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Link: https://lore.kernel.org/all/1669116752-4260-1-git-send-email-haibo.chen@nxp.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:39:12 +01:00
Vincent Mailhol
8fd9323ef7 can: etas_es58x: sort the includes by alphabetic order
Follow the best practices, reorder the includes.

While doing so, bump up copyright year of each modified files.

Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Link: https://lore.kernel.org/all/20221126160525.87036-1-mailhol.vincent@wanadoo.fr
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:39:11 +01:00
Jean Delvare
005c54278b can: ctucanfd: Drop obsolete dependency on COMPILE_TEST
Since commit 0166dc11be ("of: make CONFIG_OF user selectable"), it
is possible to test-build any driver which depends on OF on any
architecture by explicitly selecting OF. Therefore depending on
COMPILE_TEST as an alternative is no longer needed.

It is actually better to always build such drivers with OF enabled,
so that the test builds are closer to how each driver will actually be
built on its intended target. Building them without OF may not test
much as the compiler will optimize out potentially large parts of the
code. In the worst case, this could even pop false positive warnings.
Dropping COMPILE_TEST here improves the quality of our testing and
avoids wasting time on non-existent issues.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Cc: Ondrej Ille <ondrej.ille@gmail.com>
Acked-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Link: https://lore.kernel.org/all/20221124141604.4265225f@endymion.delvare
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:39:11 +01:00
Marc Kleine-Budde
a59d65e1be Merge patch series "R-Car CAN FD driver enhancements"
Biju Das <biju.das.jz@bp.renesas.com> says:

The CAN FD IP found on RZ/G2L SoC has some HW features different to
that of R-Car. For example, it has multiple resets, dedicated channel
tx and error interrupts, separate global rx and error interrupts
compared to shared irq for R-Car. it does not s ECC error flag
registers and clk post divider present on R-Car.

Similarly, R-Car V3U has 8 channels whereas other SoCs has only 2
channels. Currently all the HW differences are handled by comparing
with chip_id enum.

This patch series aims to replace chip_id with struct
rcar_canfd_hw_info to handle the HW feature differences and driver
data present on both IPs.

The changes are trivial and tested on RZ/G2L SMARC EVK.

This patch series depend upon [1].

[1] https://lore.kernel.org/all/20221025155657.1426948-1-biju.das.jz@bp.renesas.com

changes since v2: https://lore.kernel.org/all/20221026131732.1843105-1-biju.das.jz@bp.renesas.com
 * Replaced data type of max_channels from unsigned int->u8 to save memory.
 * Replaced data type of postdiv from unsigned int->u8 to save memory.

changes since v1: https://lore.kernel.org/all/20221022104357.1276740-1-biju.das.jz@bp.renesas.com
 * Updated commit description for R-Car V3U SoC detection using
   driver data.
 * Replaced data type of max_channels from u32->unsigned int.
 * Replaced multi_global_irqs->shared_global_irqs to make it
   positive checks.
 * Replaced clk_postdiv->postdiv driver data variable.
 * Simplified the calcualtion for fcan_freq.
 * Replaced info->has_gerfl to gpriv->info->has_gerfl and wrapped
   the ECC error flag checks inside a single if statement.
 * Added Rb tag from Geert patch#1,#2,#3 and #5

Link: https://lore.kernel.org/all/20221027082158.95895-1-biju.das.jz@bp.renesas.com
[mkl: only take patches 1...5 to avoid merge conflict]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2022-12-12 11:37:56 +01:00
David S. Miller
21af0d557e Merge branch 'ovs-tc-dedup'
Xin Long says:

====================
net: eliminate the duplicate code in the ct nat functions of ovs and tc

The changes in the patchset:

  "net: add helper support in tc act_ct for ovs offloading"

had moved some common ct code used by both OVS and TC into netfilter.

There are still some big functions pretty similar defined and used in
each of OVS and TC. It is not good to maintain such big function in 2
places. This patchset is to extract the functions for NAT processing
from OVS and TC to netfilter.

To make this change clear and safe, this patchset gets the common code
out of OVS and TC step by step: The patch 1-4 make some minor changes
in OVS and TC to make the NAT code of them completely the same, then
the patch 5 moves the common code to the netfilter and exports one
function called by each of OVS and TC.

v1->v2:
  - Create nf_nat_ovs.c to include the nat functions, as Pablo suggested.
v2->v3:
  - fix a typo in subject of patch 2/5, as Marcelo noticed.
  - fix in openvswitch to keep OVS ct nat and TC ct nat consistent in
    patch 3/5 instead of in tc, as Marcelo noticed.
  - use BIT(var) macro instead of (1 << var) in patch 5/5, as Marcelo
    suggested.
  - use ifdef in netfilter/Makefile to build nf_nat_ovs only when OVS
    or TC ct action is enabled in patch 5/5, as Marcelo suggested.
v3->v4:
  - add NF_NAT_OVS in netfilter/Kconfig and add select NF_NAT_OVS in
    OVS and TC Kconfig instead of using ifdef in netfilter/Makefile,
    as Pablo suggested.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-12 10:14:03 +00:00
Xin Long
ebddb14049 net: move the nat function to nf_nat_ovs for ovs and tc
There are two nat functions are nearly the same in both OVS and
TC code, (ovs_)ct_nat_execute() and ovs_ct_nat/tcf_ct_act_nat().

This patch creates nf_nat_ovs.c under netfilter and moves them
there then exports nf_ct_nat() so that it can be shared by both
OVS and TC, and keeps the nat (type) check and nat flag update
in OVS and TC's own place, as these parts are different between
OVS and TC.

Note that in OVS nat function it was using skb->protocol to get
the proto as it already skips vlans in key_extract(), while it
doesn't in TC, and TC has to call skb_protocol() to get proto.
So in nf_ct_nat_execute(), we keep using skb_protocol() which
works for both OVS and TC contrack.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-12 10:14:03 +00:00
Xin Long
0564c3e51b net: sched: update the nat flag for icmp error packets in ct_nat_execute
In ovs_ct_nat_execute(), the packet flow key nat flags are updated
when it processes ICMP(v6) error packets translation successfully.

In ct_nat_execute() when processing ICMP(v6) error packets translation
successfully, it should have done the same in ct_nat_execute() to set
post_ct_s/dnat flag, which will be used to update flow key nat flags
in OVS module later.

Reviewed-by: Saeed Mahameed <saeed@kernel.org>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-12 10:14:03 +00:00
Xin Long
2b85144ab3 openvswitch: return NF_DROP when fails to add nat ext in ovs_ct_nat
When it fails to allocate nat ext, the packet should be dropped, like
the memory allocation failures in other places in ovs_ct_nat().

This patch changes to return NF_DROP when fails to add nat ext before
doing NAT in ovs_ct_nat(), also it would keep consistent with tc
action ct' processing in tcf_ct_act_nat().

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-12 10:14:03 +00:00
Xin Long
7795928921 openvswitch: return NF_ACCEPT when OVS_CT_NAT is not set in info nat
Either OVS_CT_SRC_NAT or OVS_CT_DST_NAT is set, OVS_CT_NAT must be
set in info->nat. Thus, if OVS_CT_NAT is not set in info->nat, it
will definitely not do NAT but returns NF_ACCEPT in ovs_ct_nat().

This patch changes nothing funcational but only makes this return
earlier in ovs_ct_nat() to keep consistent with TC's processing
in tcf_ct_act_nat().

Reviewed-by: Saeed Mahameed <saeed@kernel.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-12 10:14:03 +00:00
Xin Long
bf14f4923d openvswitch: delete the unncessary skb_pull_rcsum call in ovs_ct_nat_execute
The calls to ovs_ct_nat_execute() are as below:

  ovs_ct_execute()
    ovs_ct_lookup()
      __ovs_ct_lookup()
        ovs_ct_nat()
          ovs_ct_nat_execute()
    ovs_ct_commit()
      __ovs_ct_lookup()
        ovs_ct_nat()
          ovs_ct_nat_execute()

and since skb_pull_rcsum() and skb_push_rcsum() are already
called in ovs_ct_execute(), there's no need to do it again
in ovs_ct_nat_execute().

Reviewed-by: Saeed Mahameed <saeed@kernel.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-12 10:14:03 +00:00