linux/drivers/net/ethernet/intel/ice
Jacob Keller 4d4a223a86 ice: fix locking for Tx timestamp tracking flush
Commit 4dd0d5c33c ("ice: add lock around Tx timestamp tracker flush")
added a lock around the Tx timestamp tracker flow which is used to
cleanup any left over SKBs and prepare for device removal.

This lock is problematic because it is being held around a call to
ice_clear_phy_tstamp. The clear function takes a mutex to send a PHY
write command to firmware. This could lead to a deadlock if the mutex
actually sleeps, and causes the following warning on a kernel with
preemption debugging enabled:

[  715.419426] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:573
[  715.427900] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 3100, name: rmmod
[  715.435652] INFO: lockdep is turned off.
[  715.439591] Preemption disabled at:
[  715.439594] [<0000000000000000>] 0x0
[  715.446678] CPU: 52 PID: 3100 Comm: rmmod Tainted: G        W  OE     5.15.0-rc4+ #42 bdd7ec3018e725f159ca0d372ce8c2c0e784891c
[  715.458058] Hardware name: Intel Corporation S2600STQ/S2600STQ, BIOS SE5C620.86B.02.01.0010.010620200716 01/06/2020
[  715.468483] Call Trace:
[  715.470940]  dump_stack_lvl+0x6a/0x9a
[  715.474613]  ___might_sleep.cold+0x224/0x26a
[  715.478895]  __mutex_lock+0xb3/0x1440
[  715.482569]  ? stack_depot_save+0x378/0x500
[  715.486763]  ? ice_sq_send_cmd+0x78/0x14c0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[  715.494979]  ? kfree+0xc1/0x520
[  715.498128]  ? mutex_lock_io_nested+0x12a0/0x12a0
[  715.502837]  ? kasan_set_free_info+0x20/0x30
[  715.507110]  ? __kasan_slab_free+0x10b/0x140
[  715.511385]  ? slab_free_freelist_hook+0xc7/0x220
[  715.516092]  ? kfree+0xc1/0x520
[  715.519235]  ? ice_deinit_lag+0x16c/0x220 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[  715.527359]  ? ice_remove+0x1cf/0x6a0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[  715.535133]  ? pci_device_remove+0xab/0x1d0
[  715.539318]  ? __device_release_driver+0x35b/0x690
[  715.544110]  ? driver_detach+0x214/0x2f0
[  715.548035]  ? bus_remove_driver+0x11d/0x2f0
[  715.552309]  ? pci_unregister_driver+0x26/0x250
[  715.556840]  ? ice_module_exit+0xc/0x2f [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[  715.564799]  ? __do_sys_delete_module.constprop.0+0x2d8/0x4e0
[  715.570554]  ? do_syscall_64+0x3b/0x90
[  715.574303]  ? entry_SYSCALL_64_after_hwframe+0x44/0xae
[  715.579529]  ? start_flush_work+0x542/0x8f0
[  715.583719]  ? ice_sq_send_cmd+0x78/0x14c0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[  715.591923]  ice_sq_send_cmd+0x78/0x14c0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[  715.599960]  ? wait_for_completion_io+0x250/0x250
[  715.604662]  ? lock_acquire+0x196/0x200
[  715.608504]  ? do_raw_spin_trylock+0xa5/0x160
[  715.612864]  ice_sbq_rw_reg+0x1e6/0x2f0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[  715.620813]  ? ice_reset+0x130/0x130 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[  715.628497]  ? __debug_check_no_obj_freed+0x1e8/0x3c0
[  715.633550]  ? trace_hardirqs_on+0x1c/0x130
[  715.637748]  ice_write_phy_reg_e810+0x70/0xf0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[  715.646220]  ? do_raw_spin_trylock+0xa5/0x160
[  715.650581]  ? ice_ptp_release+0x910/0x910 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[  715.658797]  ? ice_ptp_release+0x255/0x910 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[  715.667013]  ice_clear_phy_tstamp+0x2c/0x110 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[  715.675403]  ice_ptp_release+0x408/0x910 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[  715.683440]  ice_remove+0x560/0x6a0 [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[  715.691037]  ? _raw_spin_unlock_irqrestore+0x46/0x73
[  715.696005]  pci_device_remove+0xab/0x1d0
[  715.700018]  __device_release_driver+0x35b/0x690
[  715.704637]  driver_detach+0x214/0x2f0
[  715.708389]  bus_remove_driver+0x11d/0x2f0
[  715.712489]  pci_unregister_driver+0x26/0x250
[  715.716857]  ice_module_exit+0xc/0x2f [ice 9a7e1ec00971c89ecd3fe0d4dc7da2b3786a421d]
[  715.724637]  __do_sys_delete_module.constprop.0+0x2d8/0x4e0
[  715.730210]  ? free_module+0x6d0/0x6d0
[  715.733963]  ? task_work_run+0xe1/0x170
[  715.737803]  ? exit_to_user_mode_loop+0x17f/0x1d0
[  715.742509]  ? rcu_read_lock_sched_held+0x12/0x80
[  715.747215]  ? trace_hardirqs_on+0x1c/0x130
[  715.751401]  do_syscall_64+0x3b/0x90
[  715.754981]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  715.760033] RIP: 0033:0x7f4dfe59000b
[  715.763612] Code: 73 01 c3 48 8b 0d 6d 1e 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 3d 1e 0c 00 f7 d8 64 89 01 48
[  715.782357] RSP: 002b:00007ffe8c891708 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[  715.789923] RAX: ffffffffffffffda RBX: 00005558a20468b0 RCX: 00007f4dfe59000b
[  715.797054] RDX: 000000000000000a RSI: 0000000000000800 RDI: 00005558a2046918
[  715.804189] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[  715.811319] R10: 00007f4dfe603ac0 R11: 0000000000000206 R12: 00007ffe8c891940
[  715.818455] R13: 00007ffe8c8920a3 R14: 00005558a20462a0 R15: 00005558a20468b0

Notice that this is the only case where we use the lock in this way. In
the cleanup kthread and work kthread the lock is only taken around the
bit accesses. This was done intentionally to avoid this kind of issue.
The way the lock is used, we only protect ordering of bit sets vs bit
clears. The Tx writers in the hot path don't need to be protected
against the entire kthread loop. The Tx queues threads only need to
ensure that they do not re-use an index that is currently in use. The
cleanup loop does not need to block all new set bits, since it will
re-queue itself if new timestamps are present.

Fix the tracker flow so that it uses the same flow as the standard
cleanup thread. In addition, ensure the in_use bitmap actually gets
cleared properly.

This fixes the warning and also avoids the potential deadlock that might
have occurred otherwise.

Fixes: 4dd0d5c33c ("ice: add lock around Tx timestamp tracker flush")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-12 12:10:39 +01:00
..
ice_adminq_cmd.h ice: add support for set/get of driver-stored firmware parameters 2021-06-11 07:38:00 -07:00
ice_arfs.c ice: Delay netdev registration 2021-03-31 14:21:27 -07:00
ice_arfs.h ice: use static inline for dummy functions 2021-06-07 08:59:01 -07:00
ice_base.c ice: enable transmit timestamps for E810 devices 2021-06-11 08:47:41 -07:00
ice_base.h ice: Refactor ice_setup_rx_ctx 2021-06-07 08:58:56 -07:00
ice_common.c ice: register 1588 PTP clock device object for E810 devices 2021-06-11 08:47:30 -07:00
ice_common.h ice: register 1588 PTP clock device object for E810 devices 2021-06-11 08:47:30 -07:00
ice_controlq.c ice: add support for sideband messages 2021-06-11 07:38:00 -07:00
ice_controlq.h ice: add support for sideband messages 2021-06-11 07:38:00 -07:00
ice_dcb_lib.c ice: Fix a memory leak in an error handling path in 'ice_pf_dcb_cfg()' 2021-06-25 11:30:50 -07:00
ice_dcb_lib.h ice: use static inline for dummy functions 2021-06-07 08:59:01 -07:00
ice_dcb_nl.c ice: remove DCBNL_DEVRESET bit from PF state 2021-03-29 10:37:19 -07:00
ice_dcb_nl.h ice: use static inline for dummy functions 2021-06-07 08:59:01 -07:00
ice_dcb.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-04-17 11:08:07 -07:00
ice_dcb.h ice: replace single-element array used for C struct hack 2020-07-01 16:35:23 -07:00
ice_devids.h ice: fix define for E822 backplane device 2020-02-19 13:39:33 -08:00
ice_devlink.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-08-26 17:57:57 -07:00
ice_devlink.h ice: refactor devlink_port to be per-VSI 2020-10-09 13:14:19 -07:00
ice_ethtool_fdir.c ice: Drop leading underscores in enum ice_pf_state 2021-04-14 17:00:05 -07:00
ice_ethtool.c ethtool: extend coalesce setting uAPI with CQE mode 2021-08-24 07:38:29 -07:00
ice_fdir.c ice: Fix prototype warnings 2021-03-23 11:34:02 -07:00
ice_fdir.h ice: Add more FDIR filter type for AVF 2021-03-22 11:32:12 -07:00
ice_flex_pipe.c ice: suppress false cppcheck issues 2021-04-14 17:12:17 -07:00
ice_flex_pipe.h ice: Support to separate GTP-U uplink and downlink 2021-03-22 11:32:12 -07:00
ice_flex_type.h ice: cleanup style issues 2021-03-31 14:21:28 -07:00
ice_flow.c ice: Support RSS configure removal for AVF 2021-04-22 09:26:22 -07:00
ice_flow.h ice: Support RSS configure removal for AVF 2021-04-22 09:26:22 -07:00
ice_fltr.c ice: refactor filter functions 2020-05-21 22:10:04 -07:00
ice_fltr.h ice: refactor filter functions 2020-05-21 22:10:04 -07:00
ice_fw_update.c ice: add error message when pldmfw_flash_image fails 2021-06-07 08:59:01 -07:00
ice_fw_update.h ice: add support for flash update overwrite mask 2020-09-25 17:20:57 -07:00
ice_hw_autogen.h ice: add support for auxiliary input/output pins 2021-06-25 11:30:49 -07:00
ice_idc_int.h ice: Implement iidc operations 2021-05-28 20:11:13 -07:00
ice_idc.c ice: Correctly deal with PFs that do not support RDMA 2021-09-10 09:58:55 +01:00
ice_lag.c ice: Initialize RDMA support 2021-05-28 20:11:13 -07:00
ice_lag.h ice: Add initial support framework for LAG 2021-02-08 16:27:01 -08:00
ice_lan_tx_rx.h ice: report hash type such as L2/L3/L4 2021-06-18 08:59:46 -07:00
ice_lib.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-06-18 19:47:02 -07:00
ice_lib.h ice: enable receive hardware timestamping 2021-06-11 08:47:41 -07:00
ice_main.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-08-31 09:06:04 -07:00
ice_nvm.c ice: suppress false cppcheck issues 2021-04-14 17:12:17 -07:00
ice_nvm.h ice: display stored UNDI firmware version via devlink info 2021-02-05 11:44:16 -08:00
ice_osdep.h
ice_protocol_type.h ice: Add more advanced protocol support in flow filter 2021-03-22 11:32:12 -07:00
ice_ptp_hw.c ice: remove redundant continue statement in a for-loop 2021-06-17 09:25:06 -07:00
ice_ptp_hw.h ice: add low level PTP clock access functions 2021-06-11 07:38:00 -07:00
ice_ptp.c ice: fix locking for Tx timestamp tracking flush 2021-10-12 12:10:39 +01:00
ice_ptp.h ice: add support for auxiliary input/output pins 2021-06-25 11:30:49 -07:00
ice_sbq_cmd.h ice: add support for sideband messages 2021-06-11 07:38:00 -07:00
ice_sched.c ice: remove the VSI info from previous agg 2021-06-25 11:30:49 -07:00
ice_sched.h ice: Use PSM clock frequency to calculate RL profiles 2021-02-08 16:27:01 -08:00
ice_sriov.c ice: warn about potentially malicious VFs 2021-04-22 09:26:22 -07:00
ice_sriov.h ice: warn about potentially malicious VFs 2021-04-22 09:26:22 -07:00
ice_status.h ice: display stored netlist versions via devlink info 2021-02-05 11:43:37 -08:00
ice_switch.c ice: Implement iidc operations 2021-05-28 20:11:13 -07:00
ice_switch.h ice: Remove the repeated declaration 2021-06-17 09:19:59 -07:00
ice_trace.h ice: add tracepoints 2021-06-25 08:32:18 -07:00
ice_txrx_lib.c ice: report hash type such as L2/L3/L4 2021-06-18 08:59:46 -07:00
ice_txrx_lib.h ice: report hash type such as L2/L3/L4 2021-06-18 08:59:46 -07:00
ice_txrx.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2021-06-28 15:28:03 -07:00
ice_txrx.h ice: enable transmit timestamps for E810 devices 2021-06-11 08:47:41 -07:00
ice_type.h ice: add low level PTP clock access functions 2021-06-11 07:38:00 -07:00
ice_virtchnl_allowlist.c ice: Enable RSS configure for AVF 2021-04-22 09:26:22 -07:00
ice_virtchnl_allowlist.h ice: Allow ignoring opcodes on specific VF 2021-04-22 09:26:22 -07:00
ice_virtchnl_fdir.c ice: Drop leading underscores in enum ice_pf_state 2021-04-14 17:00:05 -07:00
ice_virtchnl_fdir.h ice: Check FDIR program status for AVF 2021-03-22 11:32:12 -07:00
ice_virtchnl_pf.c ice: Stop processing VF messages during teardown 2021-08-09 09:59:23 -07:00
ice_virtchnl_pf.h ice: use static inline for dummy functions 2021-06-07 08:59:01 -07:00
ice_xsk.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2021-06-28 15:28:03 -07:00
ice_xsk.h ice: use static inline for dummy functions 2021-06-07 08:59:01 -07:00
ice.h ice: Correctly deal with PFs that do not support RDMA 2021-09-10 09:58:55 +01:00
Makefile ice: register 1588 PTP clock device object for E810 devices 2021-06-11 08:47:30 -07:00