linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-12-11 13:04:03 +08:00

Author	SHA1	Message	Date
Randy Dunlap	4d516e4952	scsi: aacraid: Fix spelling of "its" Use the possessive "its" instead of the contraction "it's" in user messages. Link: https://lore.kernel.org/r/20211223061119.18304-1-rdunlap@infradead.org Cc: Adaptec OEM Raid Solutions <aacraid@microsemi.com> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-05 00:23:47 -05:00
Jiasheng Jiang	aa7069d840	scsi: qedf: Fix potential dereference of NULL pointer The return value of dma_alloc_coherent() needs to be checked to avoid use of NULL pointer in case of an allocation failure. Link: https://lore.kernel.org/r/20211216101449.375953-1-jiasheng@iscas.ac.cn Fixes: `61d8658b4a` ("scsi: qedf: Add QLogic FastLinQ offload FCoE driver framework.") Acked-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2022-01-05 00:19:28 -05:00
Sreekanth Reddy	c77b1f8a8f	scsi: mpi3mr: Bump driver version to 8.0.0.61.0 Update the driver version to newer version format i.e. 8.0.0.61.0. Link: https://lore.kernel.org/r/20211220141159.16117-26-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:25 -05:00
Sreekanth Reddy	243bcc8efd	scsi: mpi3mr: Fixes around reply request queues Set reply queue depth of 1K for B0 and 4K for A0. While freeing the segmented request queues use the actual queue depth that is used while creating them. Link: https://lore.kernel.org/r/20211220141159.16117-25-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:25 -05:00
Sreekanth Reddy	a91603a5d5	scsi: mpi3mr: Enhanced Task Management Support Reply handling Enhance driver to consider MPI3_IOCSTATUS_SCSI_IOC_TERMINATED as a success for TMs issued by it and check the pending I/Os to decide the success or failure of the task management requests instead of just considering the MPI3_IOCSTATUS_SCSI_IOC_TERMINATED as a failure of the task management request. Link: https://lore.kernel.org/r/20211220141159.16117-24-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:25 -05:00
Sreekanth Reddy	c86651345c	scsi: mpi3mr: Use TM response codes from MPI3 headers Remove locally defined TM response codes and use codes from MPI3 headers. Link: https://lore.kernel.org/r/20211220141159.16117-23-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:25 -05:00
Sreekanth Reddy	afd3a5793f	scsi: mpi3mr: Add io_uring interface support in I/O-polled mode Add support for the io_uring interface in I/O-polled mode. This feature is disabled in the driver by default. To enable the feature, a module parameter "poll_queues" has to be set with the desired number of polling queues. When the feature is enabled, the driver reserves a certain number of operational queue pairs for the poll_queues either from the available queue pairs or creates additional queue pairs based on the operational queue availability. The Polling queues will have corresponding IRQ and ISR functions as similar to default queues. However, the IRQ line is disabled by the driver for poll_queues. Link: https://lore.kernel.org/r/20211220141159.16117-22-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:24 -05:00
Sreekanth Reddy	95cca8d554	scsi: mpi3mr: Print cable mngnt and temp threshold events Print cable management & temperature threshold event data. Use vendor id & device id macro definitions from MPI3 headers. Link: https://lore.kernel.org/r/20211220141159.16117-21-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:24 -05:00
Sreekanth Reddy	78b76a0768	scsi: mpi3mr: Support Prepare for Reset event The IOC sends a Prepare for Reset Event to the host to prepare for a Soft Reset. This event data has two reason codes: 1. Start - The host is expected to gracefully quiesce all I/O within approximately 1 second. 2. Abort - The IOC is requesting to abort a previous Prepare for Reset Event request. Normal I/O may be resumed. Link: https://lore.kernel.org/r/20211220141159.16117-20-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:24 -05:00
Sreekanth Reddy	c1af985d27	scsi: mpi3mr: Add Event acknowledgment logic Add Event acknowledgment logic. Link: https://lore.kernel.org/r/20211220141159.16117-19-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:24 -05:00
Sreekanth Reddy	c5758fc72b	scsi: mpi3mr: Gracefully handle online FW update operation Enhance driver to gracefully handle discrepancies in certain key data sizes between firmware update operations as mentioned below: - The driver displays an error message and marks the controller as unrecoverable if the firmware reports ReplyFrameSize that is greater than the current ReplyFrameSize. - If the firmware reports ReplyFrameSize greater than the current ReplyFrameSize then the driver uses the current ReplyFrameSize while copying the reply messages. - The driver displays an error message and marks the controller as unrecoverable if the firmware reports MaxOperationalReplyQueues less than the currently allocated operational reply queues count. - If the firmware reports MaxOperationalReplyQueues that is greater than the currently allocated operational reply queue count then the driver ignores the new increased value and uses the previously allocated number of operational queues only. - If the firmware reports MaxDevHandle greater than the previously used MaxDevHandle value after a reset then the driver re-allocates the 'device remove pending bitmap' buffer with the newer size using krealloc(). Link: https://lore.kernel.org/r/20211220141159.16117-18-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:24 -05:00
Sreekanth Reddy	b64845a7d4	scsi: mpi3mr: Detect async reset that occurred in firmware Detect asynchronous reset that occurred in the firmware by polling for reset history bit of IOC status register is set and if that bit is set, then the driver waits for the controller to become ready and then re-initializes the controller. Also reduce the time driver is waiting for the controller to acknowledge the reset action after issuing a specific reset action to the controller. The wait time is reduced from 510 seconds to 30 seconds. If the controller didn't acknowledge a specific reset action within the time interval then the driver marks the controller as unrecoverable instead of retrying two more times prior to giving up. Link: https://lore.kernel.org/r/20211220141159.16117-17-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:24 -05:00
Sreekanth Reddy	c0b00a931e	scsi: mpi3mr: Add IOC reinit function Add IOC reinitialization function. Link: https://lore.kernel.org/r/20211220141159.16117-16-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:23 -05:00
Sreekanth Reddy	fe6db61515	scsi: mpi3mr: Handle offline FW activation in graceful manner Currently the driver marks the controller as unrecoverable if there is an asynchronous reset or fault during the initialization, reinitialization post reset, and OS resume. Enhance driver to retry the initialization, re-initialization, and resume sequences for a maximum of 3 times if the controller became faulty or asynchronously reset due to a firmware activation during the initialization sequence. Link: https://lore.kernel.org/r/20211220141159.16117-15-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:23 -05:00
Sreekanth Reddy	59bd9cfe3f	scsi: mpi3mr: Code refactor of IOC init - part2 Move the IOC initialization's bring up logic to mpi3mr_bring_ioc_ready() routine. Link: https://lore.kernel.org/r/20211220141159.16117-14-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:23 -05:00
Sreekanth Reddy	e3605f65ef	scsi: mpi3mr: Code refactor of IOC init - part1 Separate out reply and sense buffer allocation and initialization into two routines and call only initialization routine while issuing the IOC Init request message. Also move out the event enable logic to a separate function. Link: https://lore.kernel.org/r/20211220141159.16117-13-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:23 -05:00
Sreekanth Reddy	a6856cc450	scsi: mpi3mr: Fault IOC when internal command gets timeout Save snapdump and fault the controller with the given reason code if it is already not in the fault or not in asynchronous reset. This ensures that soft reset is issued from the watchdog thread. This will also be used to handle initialization time faults/resets/timeout as in those cases immediate soft reset invocation is not required. Link: https://lore.kernel.org/r/20211220141159.16117-12-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:23 -05:00
Sreekanth Reddy	2ac794baae	scsi: mpi3mr: Display IOC firmware package version Display IOC firmware package version by reading component image upload data. Link: https://lore.kernel.org/r/20211220141159.16117-11-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:23 -05:00
Sreekanth Reddy	13fd7b1555	scsi: mpi3mr: Handle unaligned PLL in unmap cmnds The following special handling is needed for UNMAP commands issued to NVMe drives: - On B0 boards, if the parameter list length is greater than 24 and not a 16-byte multiple, then truncate the parameter list length to a 16-byte multiple. - On A0 boards, if the parameter list length is greater than block descriptor data length + 8, then truncate the parameter list length to block descriptor data length + 8 value. Link: https://lore.kernel.org/r/20211220141159.16117-10-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:22 -05:00
Sreekanth Reddy	4f08b9637f	scsi: mpi3mr: Increase internal cmnds timeout to 60s - Increase internal command timeout to 60 seconds. - Enable 16 device removal handshake processing in parallel in the device removal handshake infrastructure. Link: https://lore.kernel.org/r/20211220141159.16117-9-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:22 -05:00
Sreekanth Reddy	ba68779a51	scsi: mpi3mr: Do access status validation before adding devices Add validation for various access statuses prior to exposing attached target device to the operating system. Link: https://lore.kernel.org/r/20211220141159.16117-8-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:22 -05:00
Sreekanth Reddy	17d6b9cf89	scsi: mpi3mr: Add support for PCIe Managed Switch SES device The SAS4 Controller firmware exposes the SES devices in Managed PCIe Switch as a PCIe Device Type SCSI Device (MPI3_DEVICE0_PCIE_DEVICE_INFO_TYPE_SCSI_DEVICE). Driver is enhanced to handle this device type by: - Exposing the device to the upper layers and - Not updating any hardware sectors & virtual boundary settings as these settings are needed only for NVMe devices. Link: https://lore.kernel.org/r/20211220141159.16117-7-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:22 -05:00
Sreekanth Reddy	ec5ebd2c14	scsi: mpi3mr: Update MPI3 headers - part2 Continued updating MPI3 headers. Link: https://lore.kernel.org/r/20211220141159.16117-6-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:22 -05:00
Sreekanth Reddy	d00ff7c311	scsi: mpi3mr: Update MPI3 headers - part1 Update MPI3 headers. Link: https://lore.kernel.org/r/20211220141159.16117-5-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:22 -05:00
Sreekanth Reddy	fbaa9aa48b	scsi: mpi3mr: Don't reset IOC if cmnds flush with reset status Don't issue the soft reset if internal commands are flushed out with reset status. Soft reset needs to be issued only if commands are really timed out. Link: https://lore.kernel.org/r/20211220141159.16117-4-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:21 -05:00
Sreekanth Reddy	a83ec831b2	scsi: mpi3mr: Replace spin_lock() with spin_lock_irqsave() Use spin_lock_irqsave() instead of spin_lock() while acquiring reply_free_queue_lock & sbq_lock locks. Link: https://lore.kernel.org/r/20211220141159.16117-3-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:21 -05:00
Sreekanth Reddy	9cf0666f34	scsi: mpi3mr: Add debug APIs based on logging_level bits Add debug print functions which will print messages based on logging_level bits enabled. Link: https://lore.kernel.org/r/20211220141159.16117-2-sreekanth.reddy@broadcom.com Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-23 00:04:21 -05:00
Christoph Hellwig	657b44d651	scsi: pmcraid: Don't use GFP_DMA in pmcraid_alloc_sglist() The driver doesn't express DMA addressing limitation under 32-bits anywhere else, so remove the spurious GFP_DMA allocation. Link: https://lore.kernel.org/r/20211222092247.928711-1-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:44:12 -05:00
Christoph Hellwig	1964777e10	scsi: snic: Don't use GFP_DMA in snic_queue_report_tgt_req() The driver doesn't express DMA addressing limitation under 32-bits anywhere else, so remove the spurious GFP_DMA allocation. Link: https://lore.kernel.org/r/20211222092048.925829-1-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:43:23 -05:00
Christoph Hellwig	0298b7daf8	scsi: myrs: Don't use GFP_DMA The myrs devices supports 64-bit addressing, so remove the spurious GFP_DMA allocations. Link: https://lore.kernel.org/r/20211222091935.925624-1-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:42:53 -05:00
Christoph Hellwig	27363ba89f	scsi: myrb: Don't use GFP_DMA in myrb_pdev_slave_alloc() The driver doesn't express DMA addressing limitation under 32-bits anywhere else, so remove the spurious GFP_DMA allocation. Link: https://lore.kernel.org/r/20211222091801.924745-1-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:42:23 -05:00
Christoph Hellwig	c981e9e0f8	scsi: initio: Don't use GFP_DMA in initio_probe_one() The driver doesn't express DMA addressing limitation under 32-bits anywhere else, so remove the spurious GFP_DMA allocation. Link: https://lore.kernel.org/r/20211222091630.922788-1-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:41:53 -05:00
Christoph Hellwig	d94d94969a	scsi: sr: Don't use GFP_DMA The allocated buffers are used as a command payload, for which the block layer and/or DMA API do the proper bounce buffering if needed. Link: https://lore.kernel.org/r/20211222090842.920724-1-hch@lst.de Reported-by: Baoquan He <bhe@redhat.com> Reviewed-by: Baoquan He <bhe@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:41:13 -05:00
Christoph Hellwig	bc7806b395	scsi: ch: Don't use GFP_DMA The allocated buffers are used as a command payload, for which the block layer and/or DMA API do the proper bounce buffering if needed. Link: https://lore.kernel.org/r/20211222090311.916624-1-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:40:02 -05:00
Xiang Chen	b4cc094922	scsi: hisi_sas: Use autosuspend for the host controller The controller may frequently enter and exit suspend for each I/O which we need to deal with. This is inefficient and may cause too much suspend and resume activity for the controller. To avoid this, use a default 5s autosuspend for the controller to stop frequently suspending and resuming. This value may still be modified via sysfs interfaces. Link: https://lore.kernel.org/r/1639999298-244569-16-git-send-email-chenxiang66@hisilicon.com Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:31 -05:00
Xiang Chen	307d9f49cc	scsi: libsas: Keep host active while processing events Processing events such as PORTE_BROADCAST_RCVD may cause dependency issues for runtime power management support. Such a problem would be that handling a PORTE_BROADCAST_RCVD event requires that the host is resumed to send SMP commands. However, in resuming the host, the phyup events generated from re-enabling the phys are processed in the same workqueue as the original PORTE_BROADCAST_RCVD event. As such, the host will never finish resuming (as it waits for the phyup event processing), and then the PORTE_BROADCAST_RCVD event can't be processed as the SMP commands are blocked, and so we have a deadlock. Solve this problem by ensuring that libsas keeps the host active until completely finished phy or port events, such as PORTE_BYTES_DMAED. As such, we don't have to worry about resuming the host for processing individual SMP commands in this example. Link: https://lore.kernel.org/r/1639999298-244569-15-git-send-email-chenxiang66@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:31 -05:00
Xiang Chen	ae9b69e85e	scsi: hisi_sas: Keep controller active between ISR of phyup and the event being processed It is possible that controller may become suspended between processing a phyup interrupt and the event being processed by libsas. As such, we can't ensure the controller is active when processing the phyup event - this may cause the phyup event to be lost or other issues. To avoid any possible issues, add pm_runtime_get_noresume() in phyup interrupt handler and pm_runtime_put_sync() in the work handler exit to ensure that we stay always active. Since we only want to call pm_runtime_get_noresume() for v3 hw, signal this will a new event, HISI_PHYE_PHY_UP_PM. Link: https://lore.kernel.org/r/1639999298-244569-14-git-send-email-chenxiang66@hisilicon.com Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:31 -05:00
Xiang Chen	bf19aea460	scsi: libsas: Defer works of new phys during suspend During the processing of event PORT_BYTES_DMAED, the driver queues work DISCE_DISCOVER_DOMAIN and then flushes workqueue ha->disco_q. If a new phyup event occurs during resuming the controller, the work PORTE_BYTES_DMAED of new phy occurs before suspended phy's. The work DISCE_DISCOVER_DOMAIN of new phy requires an active SAS controller (it needs to resume SAS controller by function scsi_sysfs_add_sdev() and some other functions such as function add_device_link()). However, the activation of the SAS controller requires completion of work PORTE_BYTES_DMAED of suspended phys while it is blocked by new phy's work on ha->event_q. So there is a deadlock and it is released only after resume timeout. To solve the issue, defer works of new phys during suspend and queue those defer works after SAS controller becomes active. Link: https://lore.kernel.org/r/1639999298-244569-13-git-send-email-chenxiang66@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:30 -05:00
Xiang Chen	1bc35475c6	scsi: libsas: Refactor sas_queue_deferred_work() In the second part of function __sas_drain_work(), deferred work is queued. This functionality is required other places so factor it out into the function sas_queue_deferred_work(). Link: https://lore.kernel.org/r/1639999298-244569-12-git-send-email-chenxiang66@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:30 -05:00
Xiang Chen	4ea775abbb	scsi: libsas: Add flag SAS_HA_RESUMING Add a flag SAS_HA_RESUMING and use it to indicate the state of resuming the host controller. Link: https://lore.kernel.org/r/1639999298-244569-11-git-send-email-chenxiang66@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:30 -05:00
Xiang Chen	0da7ca4c4f	scsi: libsas: Resume host while sending SMP I/Os When sending SMP I/Os to the host we need to ensure that the host is not suspended and can process the commands. This is a better approach than replying on the host to resume itself to handle such commands. Use pm_runtime_get_sync() and pm_runtime_put_sync() calls for the host when executing SMP I/Os. Link: https://lore.kernel.org/r/1639999298-244569-10-git-send-email-chenxiang66@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:30 -05:00
Xiang Chen	97f4100939	scsi: hisi_sas: Add more logs for runtime suspend/resume Add some logs at the beginning and end of suspend/resume. Link: https://lore.kernel.org/r/1639999298-244569-9-git-send-email-chenxiang66@hisilicon.com Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:30 -05:00
Xiang Chen	e31e18128e	scsi: libsas: Insert PORTE_BROADCAST_RCVD event for resuming host If a new disk is inserted through an expander when the host was suspended, it will not necessarily be detected as the topology is not re-scanned during resume. To detect possible changes in topology during suspension, insert a PORTE_BROADCAST_RCVD event per port when resuming to trigger a revalidation. Link: https://lore.kernel.org/r/1639999298-244569-8-git-send-email-chenxiang66@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:30 -05:00
Xiang Chen	133b688b2d	scsi: mvsas: Add spin_lock/unlock() to protect asd_sas_port->phy_list phy_list_lock is not held when using asd_sas_port->phy_list in the mvsas driver. Add spin_lock/unlock in those places. Link: https://lore.kernel.org/r/1639999298-244569-7-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:29 -05:00
Xiang Chen	29e2bac874	scsi: hisi_sas: Fix some issues related to asd_sas_port->phy_list Most places that use asd_sas_port->phy_list are protected by spinlock asd_sas_port->phy_list_lock, however there are still some places which miss grabbing the lock. Add it in function hisi_sas_refresh_port_id() when accessing asd_sas_port->phy_list. This carries a risk that list mutates while at the same time dropping the lock in function hisi_sas_send_ata_reset_each_phy(). Read asd_sas_port->phy_mask instead of accessing asd_sas_port->phy_list to avoid this risk. Link: https://lore.kernel.org/r/1639999298-244569-6-git-send-email-chenxiang66@hisilicon.com Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:29 -05:00
Xiang Chen	42159d3c8d	scsi: libsas: Add spin_lock/unlock() to protect asd_sas_port->phy_list Most places that use asd_sas_port->phy_list in libsas are protected by spinlock asd_sas_port->phy_list_lock. However, there are still a few places which miss the lock. Add it in those places. Link: https://lore.kernel.org/r/1639999298-244569-5-git-send-email-chenxiang66@hisilicon.com Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:29 -05:00
Alan Stern	6e1fcab00a	scsi: block: pm: Always set request queue runtime active in blk_post_runtime_resume() John Garry reported a deadlock that occurs when trying to access a runtime-suspended SATA device. For obscure reasons, the rescan procedure causes the link to be hard-reset, which disconnects the device. The rescan tries to carry out a runtime resume when accessing the device. scsi_rescan_device() holds the SCSI device lock and won't release it until it can put commands onto the device's block queue. This can't happen until the queue is successfully runtime-resumed or the device is unregistered. But the runtime resume fails because the device is disconnected, and __scsi_remove_device() can't do the unregistration because it can't get the device lock. The best way to resolve this deadlock appears to be to allow the block queue to start running again even after an unsuccessful runtime resume. The idea is that the driver or the SCSI error handler will need to be able to use the queue to resolve the runtime resume failure. This patch removes the err argument to blk_post_runtime_resume() and makes the routine act as though the resume was successful always. This fixes the deadlock. Link: https://lore.kernel.org/r/1639999298-244569-4-git-send-email-chenxiang66@hisilicon.com Fixes: `e27829dc92` ("scsi: serialize ->rescan against ->remove") Reported-and-tested-by: John Garry <john.garry@huawei.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:29 -05:00
John Garry	6cc7390877	scsi: Revert "scsi: hisi_sas: Filter out new PHY up events during suspend" This reverts commit `b14a37e011`. In that commit, we had to filter out phy-up events during suspend, as it work cause a deadlock between processing the phyup event and the resume HA function try to drain the HA event workqueue to complete the resume process. Now that we no longer try to drain the HA event queue during the HA resume processor, the deadlock would not occur, so remove the special handling for it. Link: https://lore.kernel.org/r/1639999298-244569-3-git-send-email-chenxiang66@hisilicon.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:29 -05:00
John Garry	fbefe22811	scsi: libsas: Don't always drain event workqueue for HA resume For the hisi_sas driver, if a directly attached disk is removed during suspend, a hang will occur in the resume process: The background is that in commit `16fd4a7c59` ("scsi: hisi_sas: Add device link between SCSI devices and hisi_hba"), it is ensured that the HBA device cannot be runtime suspended when any SCSI device associated is active. Other drivers which use libsas don't worry about this as none support runtime suspend. The mentioned hang occurs when an disk is removed during suspend. In the removal process - from PHYE_RESUME_TIMEOUT event processing - we call into scsi_remove_device(), which is being processed in the HA event workqueue. Here we wait for all suppliers of the SCSI device to resume, which includes the HBA device (from the above commit). However the HBA device cannot resume, as it is waiting for the PHYE_RESUME_TIMEOUT to be processed (from calling sas_resume_ha() -> sas_drain_work()). This is the deadlock. There does not appear to be any need for the sas_drain_work() to be called at all in sas_resume_ha() as it is not syncing against anything, so allow LLDDs to avoid this by providing a variant of sas_resume_ha() which does "sync", i.e. doesn't drain the event workqueue. Link: https://lore.kernel.org/r/1639999298-244569-2-git-send-email-chenxiang66@hisilicon.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-22 23:38:29 -05:00
John Garry	4be6181fea	scsi: libsas: Decode SAM status and host byte codes Value 0 is used for SAM status and libsas exec_status bytes codes in sas_end_task() - use defined macros instead. In addition, change to proper enum types. Also replace SAM_STAT_CHECK_CONDITION with SAS_SAM_STAT_CHECK_CONDITION, the former being a proper member of enum exec_status. Link: https://lore.kernel.org/r/1639579061-179473-9-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2021-12-16 22:59:58 -05:00

1 2 3 4 5 ...

22425 Commits