linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-28 06:34:12 +08:00

Author	SHA1	Message	Date
Martin K. Petersen	6b1c374c45	Merge branch '6.2/scsi-queue' into 6.2/scsi-fixes Pull in remaining patches from the 6.2 queue. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-12-30 16:29:34 +00:00
Linus Torvalds	aa5ad10f6c	SCSI misc on 20221213 Updates to the usual drivers (target, ufs, smartpqi, lpfc). There are some core changes, mostly around reworking some of our user context assumptions in device put and moving some code around. The remaining updates are bug fixes and minor changes. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCY5jjrSYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishR9iAPwN++uF BNlCD36duS8LslKQMPAmFxWt3d/4RWAHsXj2WQEAtu9q8K9PSe1ueb4y+rAEG4oj 2AUQhR3v9ciWBBKlDog= =JYJC -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "Updates to the usual drivers (target, ufs, smartpqi, lpfc). There are some core changes, mostly around reworking some of our user context assumptions in device put and moving some code around. The remaining updates are bug fixes and minor changes" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (138 commits) scsi: sg: Fix get_user() in call sg_scsi_ioctl() scsi: megaraid_sas: Fix some spelling mistakes in comment scsi: core: Use SCSI_SCAN_INITIAL in do_scsi_scan_host() scsi: core: Use SCSI_SCAN_RESCAN in __scsi_add_device() scsi: ufs: ufs-mediatek: Remove unnecessary return code scsi: ufs: core: Fix the polling implementation scsi: libsas: Do not export sas_ata_wait_after_reset() scsi: hisi_sas: Fix SATA devices missing issue during I_T nexus reset scsi: libsas: Add smp_ata_check_ready_type() scsi: Revert "scsi: hisi_sas: Don't send bcast events from HW during nexus HA reset" scsi: Revert "scsi: hisi_sas: Drain bcast events in hisi_sas_rescan_topology()" scsi: ufs: ufs-mediatek: Modify the return value scsi: ufs: ufs-mediatek: Remove unneeded code scsi: device_handler: alua: Call scsi_device_put() from non-atomic context scsi: device_handler: alua: Revert "Move a scsi_device_put() call out of alua_check_vpd()" scsi: snic: Fix possible UAF in snic_tgt_create() scsi: qla2xxx: Initialize vha->unknown_atio_[list, work] for NPIV hosts scsi: qla2xxx: Remove duplicate of vha->iocb_work initialization scsi: fcoe: Fix transport not deattached when fcoe_if_init() fails scsi: sd: Use 16-byte SYNCHRONIZE CACHE on ZBC devices ...	2022-12-14 08:58:51 -08:00
Wenchao Hao	a3be19b91e	scsi: iscsi: Fix multiple iSCSI session unbind events sent to userspace It was observed that the kernel would potentially send ISCSI_KEVENT_UNBIND_SESSION multiple times. Introduce 'target_state' in iscsi_cls_session() to make sure session will send only one unbind session event. This introduces a regression wrt. the issue fixed in commit `13e60d3ba2` ("scsi: iscsi: Report unbind session event when the target has been removed"). If iscsid dies for any reason after sending an unbind session to kernel, once iscsid is restarted, the kernel's ISCSI_KEVENT_UNBIND_SESSION event is lost and userspace is then unable to logout. However, the session is actually in invalid state (its target_id is INVALID) so iscsid should not sync this session during restart. Consequently we need to check the session's target state during iscsid restart. If session is in unbound state, do not sync this session and perform session teardown. This is OK because once a session is unbound, we can not recover it any more (mainly because its target id is INVALID). Signed-off-by: Wenchao Hao <haowenchao@huawei.com> Link: https://lore.kernel.org/r/20221126010752.231917-1-haowenchao@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Wu Bo <wubo40@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-12-14 02:49:19 +00:00
Wenchao Hao	0c26a2d7c9	scsi: iscsi: Rename iscsi_set_param() to iscsi_if_set_param() There are two iscsi_set_param() functions defined in libiscsi.c and scsi_transport_iscsi.c respectively which is confusing. Rename the one in scsi_transport_iscsi.c to iscsi_if_set_param(). Signed-off-by: Wenchao Hao <haowenchao@huawei.com> Link: https://lore.kernel.org/r/20221122181105.4123935-1-haowenchao@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-11-24 03:27:24 +00:00
Zhou Guanghui	f014165faa	scsi: iscsi: Fix possible memory leak when device_register() failed If device_register() returns error, the name allocated by the dev_set_name() need be freed. As described in the comment of device_register(), we should use put_device() to give up the reference in the error path. Fix this by calling put_device(), the name will be freed in the kobject_cleanup(), and this patch modified resources will be released by calling the corresponding callback function in the device_release(). Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com> Link: https://lore.kernel.org/r/20221110033729.1555-1-zhouguanghui1@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-11-17 17:53:36 +00:00
Martin K. Petersen	11e50ed239	Merge branch '5.19/scsi-fixes' into 5.20/scsi-staging Bring in fixes to resolve a merge conflict in the lpfc driver update. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-07-07 17:20:43 -04:00
Mike Christie	bb42856bfd	scsi: iscsi: Add helper to remove a session from the kernel During qedi shutdown we need to stop the iSCSI layer from sending new nops as pings and from responding to target ones and make sure there is no running connection cleanups. Commit `d1f2ce7763` ("scsi: qedi: Fix host removal with running sessions") converted the driver to use the libicsi helper to drive session removal, so the above issues could be handled. The problem is that during system shutdown iscsid will not be running so when we try to remove the root session we will hang waiting for userspace to reply. Add a helper that will drive the destruction of sessions like these during system shutdown. Link: https://lore.kernel.org/r/20220616222738.5722-5-michael.christie@oracle.com Tested-by: Nilesh Javali <njavali@marvell.com> Reviewed-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-06-21 21:14:54 -04:00
Mike Christie	da2f132d00	scsi: iscsi: Clean up bound endpoints during shutdown In the next patch we allow drivers to drive session removal during shutdown. In this case iscsid will not be running, so we need to detect bound endpoints and disconnect them. This moves the bound ep check so we now always check. Link: https://lore.kernel.org/r/20220616222738.5722-4-michael.christie@oracle.com Tested-by: Nilesh Javali <njavali@marvell.com> Reviewed-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-06-21 21:14:53 -04:00
Mike Christie	3328333b47	scsi: iscsi: Allow iscsi_if_stop_conn() to be called from kernel iscsi_if_stop_conn() is only called from the userspace interface but in a subsequent commit we will want to call it from the kernel interface to allow drivers like qedi to remove sessions from inside the kernel during shutdown. This removes the iscsi_uevent code from iscsi_if_stop_conn() so we can call it in a new helper. Link: https://lore.kernel.org/r/20220616222738.5722-3-michael.christie@oracle.com Tested-by: Nilesh Javali <njavali@marvell.com> Reviewed-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-06-21 21:14:53 -04:00
Mike Christie	c577ab7ba5	scsi: iscsi: Fix HW conn removal use after free If qla4xxx doesn't remove the connection before the session, the iSCSI class tries to remove the connection for it. We were doing a iscsi_put_conn() in the iter function which is not needed and will result in a use after free because iscsi_remove_conn() will free the connection. Link: https://lore.kernel.org/r/20220616222738.5722-2-michael.christie@oracle.com Tested-by: Nilesh Javali <njavali@marvell.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-06-21 21:14:53 -04:00
Max Gurtovoy	6a33ed5064	scsi: iscsi: Make iscsi_unregister_transport() return void This function always returns 0. We can make it return void to simplify the code. Also, no caller ever checks the return value of this function. Link: https://lore.kernel.org/r/20220616080210.18531-1-mgurtovoy@nvidia.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-06-16 22:11:18 -04:00
Sergey Gorenko	f6eed15f3e	scsi: iscsi: Exclude zero from the endpoint ID range The kernel returns an endpoint ID as r.ep_connect_ret.handle in the iscsi_uevent. The iscsid validates a received endpoint ID and treats zero as an error. The commit referenced in the fixes line changed the endpoint ID range, and zero is always assigned to the first endpoint ID. So, the first attempt to create a new iSER connection always fails. Link: https://lore.kernel.org/r/20220613123854.55073-1-sergeygo@nvidia.com Fixes: `3c6ae371b8` ("scsi: iscsi: Release endpoint ID when its freed") Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Sergey Gorenko <sergeygo@nvidia.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-06-13 22:11:36 -04:00
keliu	3fd3a52ca6	scsi: core: iscsi: Directly use ida_alloc()/ida_free() Use ida_alloc()/ida_free() instead of the deprecated ida_simple_get()/ida_simple_remove() interface. Link: https://lore.kernel.org/r/20220527083049.2552526-1-liuke94@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: keliu <liuke94@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-06-07 21:58:06 -04:00
Mike Christie	03690d8197	scsi: iscsi: Fix unbound endpoint error handling If a driver raises a connection error before the connection is bound, we can leave a cleanup_work queued that can later run and disconnect/stop a connection that is logged in. The problem is that drivers can call iscsi_conn_error_event for endpoints that are connected but not yet bound when something like the network port they are using is brought down. iscsi_cleanup_conn_work_fn will check for this and exit early, but if the cleanup_work is stuck behind other works, it might not get run until after userspace has done ep_disconnect. Because the endpoint is not yet bound there was no way for ep_disconnect to flush the work. The bug of leaving stop_conns queued was added in: Commit `23d6fefbb3` ("scsi: iscsi: Fix in-kernel conn failure handling") and: Commit `0ab710458d` ("scsi: iscsi: Perform connection failure entirely in kernel space") was supposed to fix it, but left this case. This patch moves the conn state check to before we even queue the work so we can avoid queueing. Link: https://lore.kernel.org/r/20220408001314.5014-7-michael.christie@oracle.com Fixes: `0ab710458d` ("scsi: iscsi: Perform connection failure entirely in kernel space") Tested-by: Manish Rangankar <mrangankar@marvell.com> Reviewed-by: Lee Duncan <lduncan@@suse.com> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-04-11 22:09:35 -04:00
Mike Christie	7c6e99c181	scsi: iscsi: Fix conn cleanup and stop race during iscsid restart If iscsid is doing a stop_conn at the same time the kernel is starting error recovery we can hit a race that allows the cleanup work to run on a valid connection. In the race, iscsi_if_stop_conn sees the cleanup bit set, but it calls flush_work on the clean_work before iscsi_conn_error_event has queued it. The flush then returns before the queueing and so the cleanup_work can run later and disconnect/stop a conn while it's in a connected state. The patch: Commit `0ab710458d` ("scsi: iscsi: Perform connection failure entirely in kernel space") added the late stop_conn call bug originally, and the patch: Commit `23d6fefbb3` ("scsi: iscsi: Fix in-kernel conn failure handling") attempted to fix it but only fixed the normal EH case and left the above race for the iscsid restart case. For the normal EH case we don't hit the race because we only signal userspace to start recovery after we have done the queueing, so the flush will always catch the queued work or see it completed. For iscsid restart cases like boot, we can hit the race because iscsid will call down to the kernel before the kernel has signaled any error, so both code paths can be running at the same time. This adds a lock around the setting of the cleanup bit and queueing so they happen together. Link: https://lore.kernel.org/r/20220408001314.5014-6-michael.christie@oracle.com Fixes: `0ab710458d` ("scsi: iscsi: Perform connection failure entirely in kernel space") Tested-by: Manish Rangankar <mrangankar@marvell.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-04-11 22:09:34 -04:00
Mike Christie	0aadafb5c3	scsi: iscsi: Fix endpoint reuse regression This patch fixes a bug where when using iSCSI offload we can free an endpoint while userspace still thinks it's active. That then causes the endpoint ID to be reused for a new connection's endpoint while userspace still thinks the ID is for the original connection. Userspace will then end up disconnecting a running connection's endpoint or trying to bind to another connection's endpoint. This bug is a regression added in: Commit `23d6fefbb3` ("scsi: iscsi: Fix in-kernel conn failure handling") where we added a in kernel ep_disconnect call to fix a bug in: Commit `0ab710458d` ("scsi: iscsi: Perform connection failure entirely in kernel space") where we would call stop_conn without having done ep_disconnect. This early ep_disconnect call will then free the endpoint and it's ID while userspace still thinks the ID is valid. Fix the early release of the ID by having the in kernel recovery code keep a reference to the endpoint until userspace has called into the kernel to finish cleaning up the endpoint/connection. It requires the previous commit "scsi: iscsi: Release endpoint ID when its freed" which moved the freeing of the ID until when the endpoint is released. Link: https://lore.kernel.org/r/20220408001314.5014-5-michael.christie@oracle.com Fixes: `23d6fefbb3` ("scsi: iscsi: Fix in-kernel conn failure handling") Tested-by: Manish Rangankar <mrangankar@marvell.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-04-11 22:09:34 -04:00
Mike Christie	3c6ae371b8	scsi: iscsi: Release endpoint ID when its freed We can't release the endpoint ID until all references to the endpoint have been dropped or it could be allocated while in use. This has us use an idr instead of looping over all conns to find a free ID and then free the ID when all references have been dropped instead of when the device is only deleted. Link: https://lore.kernel.org/r/20220408001314.5014-4-michael.christie@oracle.com Tested-by: Manish Rangankar <mrangankar@marvell.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Chris Leech <cleech@redhat.com> Reviewed-by: Wu Bo <wubo40@huawei.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-04-11 22:09:34 -04:00
Mike Christie	cbd2283aaf	scsi: iscsi: Fix offload conn cleanup when iscsid restarts When userspace restarts during boot or upgrades it won't know about the offload driver's endpoint and connection mappings. iscsid will start by cleaning up the old session by doing a stop_conn call. Later, if we are able to create a new connection, we clean up the old endpoint during the binding stage. The problem is that if we do stop_conn before doing the ep_disconnect call offload, drivers can still be executing I/O. We then might free tasks from the under the card/driver. This moves the ep_disconnect call to before we do the stop_conn call for this case. It will then work and look like a normal recovery/cleanup procedure from the driver's point of view. Link: https://lore.kernel.org/r/20220408001314.5014-3-michael.christie@oracle.com Tested-by: Manish Rangankar <mrangankar@marvell.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-04-11 22:09:34 -04:00
Mike Christie	c34f95e98d	scsi: iscsi: Move iscsi_ep_disconnect() This patch moves iscsi_ep_disconnect() so it can be called earlier in the next patch. Link: https://lore.kernel.org/r/20220408001314.5014-2-michael.christie@oracle.com Tested-by: Manish Rangankar <mrangankar@marvell.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-04-11 22:09:34 -04:00
Wenchao Hao	8709c32309	scsi: libiscsi: Teardown iscsi_cls_conn gracefully Commit `1b8d0300a3` ("scsi: libiscsi: Fix UAF in iscsi_conn_get_param()/iscsi_conn_teardown()") fixed an UAF in iscsi_conn_get_param() and introduced 2 tmp_xxx varibles. We can gracefully fix this UAF with the help of device_del(). Calling iscsi_remove_conn() at the beginning of iscsi_conn_teardown would make userspace unable to see iscsi_cls_conn. This way we we can free memory safely. Remove iscsi_destroy_conn() since it is no longer used. Link: https://lore.kernel.org/r/20220310015759.3296841-4-haowenchao@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Wenchao Hao <haowenchao@huawei.com> Signed-off-by: Wu Bo <wubo40@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-03-15 00:20:16 -04:00
Wenchao Hao	7dae459f5e	scsi: libiscsi: Add iscsi_cls_conn to sysfs after initialization iscsi_create_conn() exposed iscsi_cls_conn to sysfs prior to initialization of iscsi_conn's dd_data. When userspace tried to access an attribute such as the connect address, a NULL pointer dereference was observed. Do not add iscsi_cls_conn to sysfs until it has been initialized. Remove iscsi_create_conn() since it is no longer used. Link: https://lore.kernel.org/r/20220310015759.3296841-3-haowenchao@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Wenchao Hao <haowenchao@huawei.com> Signed-off-by: Wu Bo <wubo40@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-03-15 00:19:50 -04:00
Wenchao Hao	ad515cada7	scsi: iscsi: Add helper functions to manage iscsi_cls_conn - iscsi_alloc_conn(): Allocate and initialize iscsi_cls_conn - iscsi_add_conn(): Expose iscsi_cls_conn to userspace via sysfs - iscsi_remove_conn(): Remove iscsi_cls_conn from sysfs Link: https://lore.kernel.org/r/20220310015759.3296841-2-haowenchao@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Wenchao Hao <haowenchao@huawei.com> Signed-off-by: Wu Bo <wubo40@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-03-15 00:19:50 -04:00
Mike Christie	7cb6683ce7	scsi: iscsi: Use the session workqueue for recovery Use the session workqueue for recovery and unbinding. If there are delays during device blocking/cleanup then it will no longer affect other sessions. Link: https://lore.kernel.org/r/20220226230435.38733-6-michael.christie@oracle.com Reviewed-by: Chris Leech <cleech@redhat.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-03-01 23:56:28 -05:00
Mike Christie	5842ea3668	scsi: iscsi: ql4xxx: Use per-session workqueue for unbinding We currently allocate a workqueue per host and only use it for removing the target. For the session per host case we could be using this workqueue to be able to do recoveries (block, unblock, timeout handling) in parallel. To also allow offload drivers to do their session recoveries in parallel, this drops the per host workqueue and replaces it with a per session one. Link: https://lore.kernel.org/r/20220226230435.38733-5-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-03-01 23:56:28 -05:00
Mike Christie	d8ec5d67b8	scsi: iscsi: Remove iscsi_scan_finished() qla4xxx does not use iscsi_scan_finished() anymore so remove it. Link: https://lore.kernel.org/r/20220226230435.38733-4-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-03-01 23:56:28 -05:00
Mike Christie	b07c348f8f	scsi: iscsi: Speed up session unblocking and removal When the iSCSI class was added upstream, blocking a queue was fast because it just set some flag bits and didn't handle I/O that was in the process of being sent to the driver. That's no longer the case so blocking a queue is expensive and we can end up with a backlog of blocks by the time we have relogged in and are trying to start the queues. For the session unblock case, this has try to cancel the block and recovery work in case they are still queued so we can avoid unneeded queue manipulations. For removal, we also now try to cancel all the recovery related works since a couple lines down we will set the session and device state so running those functions are not necessary. Link: https://lore.kernel.org/r/20220226230435.38733-3-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-03-01 23:56:28 -05:00
Mike Christie	8dd3dff3bf	scsi: iscsi: Fix recovery and unblocking race If the user sets the iscsi_eh_timer_workq/iscsi_eh workqueue's max_active to greater than 1, the recovery_work could be running when __iscsi_unblock_session() runs. The cancel_delayed_work() will then not wait for the running work and we can race where we end up with the wrong session state and scsi_device state set. This replaces the cancel_delayed_work() with the sync version. Link: https://lore.kernel.org/r/20220226230435.38733-2-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Chris Leech <cleech@redhat.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-03-01 23:56:28 -05:00
Mike Christie	a0c2f8b670	scsi: iscsi: Unblock session then wake up error handler We can race where iscsi_session_recovery_timedout() has woken up the error handler thread and it's now setting the devices to offline, and session_recovery_timedout()'s call to scsi_target_unblock() is also trying to set the device's state to transport-offline. We can then get a mix of states. For the case where we can't relogin we want the devices to be in transport-offline so when we have repaired the connection __iscsi_unblock_session() can set the state back to running. Set the device state then call into libiscsi to wake up the error handler. Link: https://lore.kernel.org/r/20211105221048.6541-2-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-11-16 19:42:30 -05:00
Mike Christie	187a580c9e	scsi: iscsi: Fix set_param() handling In commit `9e67600ed6` ("scsi: iscsi: Fix race condition between login and sync thread") we meant to add a check where before we call ->set_param() we make sure the iscsi_cls_connection is bound. The problem is that between versions 4 and 5 of the patch the deletion of the unchecked set_param() call was dropped so we ended up with 2 calls. As a result we can still hit a crash where we access the unbound connection on the first call. This patch removes that first call. Fixes: `9e67600ed6` ("scsi: iscsi: Fix race condition between login and sync thread") Link: https://lore.kernel.org/r/20211010161904.60471-1-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Li Feng <fengli@smartx.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-10-12 12:51:25 -04:00
Baokun Li	4e28550829	scsi: iscsi: Adjust iface sysfs attr detection ISCSI_NET_PARAM_IFACE_ENABLE belongs to enum iscsi_net_param instead of iscsi_iface_param so move it to ISCSI_NET_PARAM. Otherwise, when we call into the driver, we might not match and return that we don't want attr visible in sysfs. Found in code review. Link: https://lore.kernel.org/r/20210901085336.2264295-1-libaokun1@huawei.com Fixes: `e746f3451e` ("scsi: iscsi: Fix iface sysfs attr detection") Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-09-13 22:15:43 -04:00
Mike Christie	e746f3451e	scsi: iscsi: Fix iface sysfs attr detection A ISCSI_IFACE_PARAM can have the same value as a ISCSI_NET_PARAM so when iscsi_iface_attr_is_visible tries to figure out the type by just checking the value, we can collide and return the wrong type. When we call into the driver we might not match and return that we don't want attr visible in sysfs. The patch fixes this by setting the type when we figure out what the param is. Link: https://lore.kernel.org/r/20210701002559.89533-1-michael.christie@oracle.com Fixes: `3e0f65b34c` ("[SCSI] iscsi_transport: Additional parameters for network settings") Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-07-18 21:07:48 -04:00
Mike Christie	7ce9fc5ecd	scsi: iscsi: Flush block work before unblock We set the max_active iSCSI EH works to 1, so all work is going to execute in order by default. However, userspace can now override this in sysfs. If max_active > 1, we can end up with the block_work on CPU1 and iscsi_unblock_session running the unblock_work on CPU2 and the session and target/device state will end up out of sync with each other. This adds a flush of the block_work in iscsi_unblock_session. Link: https://lore.kernel.org/r/20210525181821.7617-17-michael.christie@oracle.com Fixes: `1d726aa6ef` ("scsi: iscsi: Optimize work queue flush use") Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-06-02 01:28:21 -04:00
Mike Christie	b1d19e8c92	scsi: iscsi: Add iscsi_cls_conn refcount helpers There are a couple places where we could free the iscsi_cls_conn while it's still in use. This adds some helpers to get/put a refcount on the struct and converts an exiting user. Subsequent commits will then use the helpers to fix 2 bugs in the eh code. Link: https://lore.kernel.org/r/20210525181821.7617-11-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-06-02 01:28:20 -04:00
Mike Christie	23d6fefbb3	scsi: iscsi: Fix in-kernel conn failure handling Commit `0ab710458d` ("scsi: iscsi: Perform connection failure entirely in kernel space") has the following regressions/bugs that this patch fixes: 1. It can return cmds to upper layers like dm-multipath where that can retry them. After they are successful the fs/app can send new I/O to the same sectors, but we've left the cmds running in FW or in the net layer. We need to be calling ep_disconnect if userspace is not up. This patch only fixes the issue for offload drivers. iscsi_tcp will be fixed in separate commit because it doesn't have a ep_disconnect call. 2. The drivers that implement ep_disconnect expect that it's called before conn_stop. Besides crashes, if the cleanup_task callout is called before ep_disconnect it might free up driver/card resources for session1 then they could be allocated for session2. But because the driver's ep_disconnect is not called it has not cleaned up the firmware so the card is still using the resources for the original cmd. 3. The stop_conn_work_fn can run after userspace has done its recovery and we are happily using the session. We will then end up with various bugs depending on what is going on at the time. We may also run stop_conn_work_fn late after userspace has called stop_conn and ep_disconnect and is now going to call start/bind conn. If stop_conn_work_fn runs after bind but before start, we would leave the conn in a unbound but sort of started state where IO might be allowed even though the drivers have been set in a state where they no longer expect I/O. 4. Returning -EAGAIN in iscsi_if_destroy_conn if we haven't yet run the in kernel stop_conn function is breaking userspace. We should have been doing this for the caller. Link: https://lore.kernel.org/r/20210525181821.7617-8-michael.christie@oracle.com Fixes: `0ab710458d` ("scsi: iscsi: Perform connection failure entirely in kernel space") Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-06-02 01:28:20 -04:00
Mike Christie	9e5fe17008	scsi: iscsi: Rel ref after iscsi_lookup_endpoint() Subsequent commits allow the kernel to do ep_disconnect. In that case we will have to get a proper refcount on the ep so one thread does not delete it from under another. Link: https://lore.kernel.org/r/20210525181821.7617-7-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-06-02 01:28:20 -04:00
Mike Christie	b25b957d2d	scsi: iscsi: Use system_unbound_wq for destroy_work Use the system_unbound_wq for async session destruction. We don't need a dedicated workqueue for async session destruction because: 1. perf does not seem to be an issue since we only allow 1 active work. 2. it does not have deps with other system works and we can run them in parallel with each other. Link: https://lore.kernel.org/r/20210525181821.7617-6-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-06-02 01:28:20 -04:00
Mike Christie	06c203a556	scsi: iscsi: Force immediate failure during shutdown If the system is not up, we can just fail immediately since iscsid is not going to ever answer our netlink events. We are already setting the recovery_tmo to 0, but by passing stop_conn STOP_CONN_TERM we never will block the session and start the recovery timer, because for that flag userspace will do the unbind and destroy events which would remove the devices and wake up and kill the eh. Since the conn is dead and the system is going dowm this just has us use STOP_CONN_RECOVER with recovery_tmo=0 so we fail immediately. However, if the user has set the recovery_tmo=-1 we let the system hang like they requested since they might have used that setting for specific reasons (one known reason is for buggy cluster software). Link: https://lore.kernel.org/r/20210525181821.7617-5-michael.christie@oracle.com Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-06-02 01:28:19 -04:00
Mike Christie	891e2639de	scsi: iscsi: Stop queueing during ep_disconnect During ep_disconnect we have been doing iscsi_suspend_tx/queue to block new I/O but every driver except cxgbi and iscsi_tcp can still get I/O from __iscsi_conn_send_pdu() if we haven't called iscsi_conn_failure() before ep_disconnect. This could happen if we were terminating the session, and the logout timed out before it was even sent to libiscsi. Fix the issue by adding a helper which reverses the bind_conn call that allows new I/O to be queued. Drivers implementing ep_disconnect can use this to make sure new I/O is not queued to them when handling the disconnect. Link: https://lore.kernel.org/r/20210525181821.7617-3-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-06-02 01:28:19 -04:00
Linus Torvalds	c98ff1d013	SCSI fixes on 20210417 This libsas fix is for a problem that occurs when trying to change the cache type of an ATA device and the libiscsi one is a regression fix from this merge window. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYHuPRSYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pisha52AQCXc8n0 6VAjfc+8aCqjX2Hpw4YCGeW5RYoNj1WXhiDv/AD+L4FVBMdQ4DE9ukH12YW7YBRS qP03aNSHLCl8wfVon8Q= =Btn4 -----END PGP SIGNATURE----- Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two fixes: the libsas fix is for a problem that occurs when trying to change the cache type of an ATA device and the libiscsi one is a regression fix from this merge window" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: libsas: Reset num_scatter if libata marks qc as NODATA scsi: iscsi: Fix iSCSI cls conn state	2021-04-17 20:25:33 -07:00
Mike Christie	0dcf8febcb	scsi: iscsi: Fix iSCSI cls conn state In commit `9e67600ed6` ("scsi: iscsi: Fix race condition between login and sync thread") I missed that libiscsi was now setting the iSCSI class state, and that patch ended up resetting the state during conn stoppage and using the wrong state value during ep_disconnect. This patch moves the setting of the class state to the class module and then fixes the two issues above. Link: https://lore.kernel.org/r/20210406171746.5016-1-michael.christie@oracle.com Fixes: `9e67600ed6` ("scsi: iscsi: Fix race condition between login and sync thread") Cc: Gulam Mohamed <gulam.mohamed@oracle.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-04-07 21:30:59 -04:00
Linus Torvalds	57fbdb15ec	SCSI fixes on 20210402 Single fix to iscsi for a rare race condition which can cause a kernel panic. Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYGe3ZCYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishaxZAQDt/zcv xvK+2qWNsqVse32hknc3RpdMWUh4JE1pKfSvgwD/X7c3goqQ8dEyEK0cpXLNpw9D kOOQxTVVCxFImwActdg= =VlUo -----END PGP SIGNATURE----- Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fix from James Bottomley: "A single fix to iscsi for a rare race condition which can cause a kernel panic" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: iscsi: Fix race condition between login and sync thread	2021-04-03 09:07:35 -07:00
Gulam Mohamed	9e67600ed6	scsi: iscsi: Fix race condition between login and sync thread A kernel panic was observed due to a timing issue between the sync thread and the initiator processing a login response from the target. The session reopen can be invoked both from the session sync thread when iscsid restarts and from iscsid through the error handler. Before the initiator receives the response to a login, another reopen request can be sent from the error handler/sync session. When the initial login response is subsequently processed, the connection has been closed and the socket has been released. To fix this a new connection state, ISCSI_CONN_BOUND, is added: - Set the connection state value to ISCSI_CONN_DOWN upon iscsi_if_ep_disconnect() and iscsi_if_stop_conn() - Set the connection state to the newly created value ISCSI_CONN_BOUND after bind connection (transport->bind_conn()) - In iscsi_set_param(), return -ENOTCONN if the connection state is not either ISCSI_CONN_BOUND or ISCSI_CONN_UP Link: https://lore.kernel.org/r/20210325093248.284678-1-gulam.mohamed@oracle.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Gulam Mohamed <gulam.mohamed@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> index 91074fd97f64..f4bf62b007a0 100644	2021-03-29 21:17:45 -04:00
Chris Leech	f9dbdf97a5	scsi: iscsi: Verify lengths on passthrough PDUs Open-iSCSI sends passthrough PDUs over netlink, but the kernel should be verifying that the provided PDU header and data lengths fall within the netlink message to prevent accessing beyond that in memory. Cc: stable@vger.kernel.org Reported-by: Adam Nichols <adam@grimm-co.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Chris Leech <cleech@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-03-04 20:09:51 -05:00
Chris Leech	ec98ea7070	scsi: iscsi: Ensure sysfs attributes are limited to PAGE_SIZE As the iSCSI parameters are exported back through sysfs, it should be enforcing that they never are more than PAGE_SIZE (which should be more than enough) before accepting updates through netlink. Change all iSCSI sysfs attributes to use sysfs_emit(). Cc: stable@vger.kernel.org Reported-by: Adam Nichols <adam@grimm-co.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Chris Leech <cleech@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-03-04 20:09:51 -05:00
Lee Duncan	688e8128b7	scsi: iscsi: Restrict sessions and handles to admin capabilities Protect the iSCSI transport handle, available in sysfs, by requiring CAP_SYS_ADMIN to read it. Also protect the netlink socket by restricting reception of messages to ones sent with CAP_SYS_ADMIN. This disables normal users from being able to end arbitrary iSCSI sessions. Cc: stable@vger.kernel.org Reported-by: Adam Nichols <adam@grimm-co.com> Reviewed-by: Chris Leech <cleech@redhat.com> Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-03-04 20:09:50 -05:00
Mike Christie	d39bfd0686	scsi: iscsi: Drop session lock in iscsi_session_chkready() The session lock in iscsi_session_chkready() is not needed because when we transition from logged into to another state we will block and/or remove the devices under the session, so no new I/O will be sent to the drivers after the block/remove. I/O that races with the block/removal is cleaned up by the drivers when it handles all outstanding I/O, so this just added an extra lock in the main I/O path. This patch removes the lock like other transport classes. Link: https://lore.kernel.org/r/20210207044608.27585-10-michael.christie@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-02-08 22:39:04 -05:00
Qinglang Miao	6dc1c7ab6f	scsi: iscsi: Fix inappropriate use of put_device() kfree(conn) is called inside put_device(&conn->dev) which could lead to use-after-free. In addition, device_unregister() should be used here rather than put_deviceO(). Link: https://lore.kernel.org/r/20201120074852.31658-1-miaoqinglang@huawei.com Fixes: `f3c893e3db` ("scsi: iscsi: Fail session and connection on transport registration failure") Reported-by: Hulk Robot <hulkci@huawei.com> Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Qinglang Miao <miaoqinglang@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-12-07 17:45:19 -05:00
Linus Torvalds	dfdf16ecfd	SCSI misc on 20200806 This series consists of the usual driver updates (ufs, qla2xxx, tcmu, lpfc, hpsa, zfcp, scsi_debug) and minor bug fixes. We also have a huge docbook fix update like most other subsystems and no major update to the core (the few non trivial updates are either minor fixes or removing an unused feature [scsi_sdb_cache]). Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXyxq1yYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishSoAAQChZ4i8 ZqYW3pL33JO3fA8vdjvLuyC489Hj4wzIsl3/bQEAxYyM6BSLvMoLWR2Plq/JmTLm 4W/LDptarpTiDI3NuDc= =4b0W -----END PGP SIGNATURE----- Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This consists of the usual driver updates (ufs, qla2xxx, tcmu, lpfc, hpsa, zfcp, scsi_debug) and minor bug fixes. We also have a huge docbook fix update like most other subsystems and no major update to the core (the few non trivial updates are either minor fixes or removing an unused feature [scsi_sdb_cache])" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (307 commits) scsi: scsi_transport_srp: Sanitize scsi_target_block/unblock sequences scsi: ufs-mediatek: Apply DELAY_AFTER_LPM quirk to Micron devices scsi: ufs: Introduce device quirk "DELAY_AFTER_LPM" scsi: virtio-scsi: Correctly handle the case where all LUNs are unplugged scsi: scsi_debug: Implement tur_ms_to_ready parameter scsi: scsi_debug: Fix request sense scsi: lpfc: Fix typo in comment for ULP scsi: ufs-mediatek: Prevent LPM operation on undeclared VCC scsi: iscsi: Do not put host in iscsi_set_flashnode_param() scsi: hpsa: Correct ctrl queue depth scsi: target: tcmu: Make TMR notification optional scsi: target: tcmu: Implement tmr_notify callback scsi: target: tcmu: Fix and simplify timeout handling scsi: target: tcmu: Factor out new helper ring_insert_padding scsi: target: tcmu: Do not queue aborted commands scsi: target: tcmu: Use priv pointer in se_cmd scsi: target: Add tmr_notify backend function scsi: target: Modify core_tmr_abort_task() scsi: target: iscsi: Fix inconsistent debug message scsi: target: iscsi: Fix login error when receiving ...	2020-08-06 16:50:07 -07:00
Jing Xiangfeng	68e12e5f61	scsi: iscsi: Do not put host in iscsi_set_flashnode_param() If scsi_host_lookup() fails we will jump to put_host which may cause a panic. Jump to exit_set_fnode instead. Link: https://lore.kernel.org/r/20200615081226.183068-1-jingxiangfeng@huawei.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-07-28 22:56:24 -04:00
Bob Liu	919a295abf	scsi: iscsi: Register sysfs for workqueue iscsi_destroy Register sysfs for workqueue iscsi_destroy so that users can set CPU affinity through "cpumask" for this workqueue to get better isolation in cloud multi-tenant scenario. This patch unfolded create_singlethread_workqueue(), added WQ_SYSFS and drop __WQ_ORDERED_EXPLICIT since __WQ_ORDERED_EXPLICIT workqueue isn't allowed to change "cpumask". Link: https://lore.kernel.org/r/20200703051603.1473-1-bob.liu@oracle.com Suggested-by: Mike Christie <michael.christie@oracle.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-07-08 00:18:51 -04:00

1 2 3 4 5 ...

258 Commits