linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-12-11 13:04:03 +08:00

Author	SHA1	Message	Date
Bart Van Assche	8b566edbdb	scsi: core: Only kick the requeue list if necessary Instead of running the request queue of each device associated with a host every 3 ms (BLK_MQ_RESOURCE_DELAY) while host error handling is in progress, run the request queue after error handling has finished. Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Hannes Reinecke <hare@suse.de> Cc: John Garry <john.g.garry@oracle.com> Cc: Mike Christie <michael.christie@oracle.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20230518193159.1166304-4-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-31 11:05:34 -04:00
Bart Van Assche	416dace649	scsi: core: Use min() instead of open-coding it Use min() instead of open-coding it in scsi_normalize_sense(). Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Cc: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20230518193159.1166304-2-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-31 11:05:34 -04:00
Gustavo A. R. Silva	e90644b0ce	scsi: lpfc: Replace one-element array with flexible-array member One-element arrays are deprecated, and we are replacing them with flexible array members instead. So, replace one-element arrays with flexible-array members in a couple of structures, and refactor the rest of the code, accordingly. This helps with the ongoing efforts to tighten the FORTIFY_SOURCE routines on memcpy() and help us make progress towards globally enabling -fstrict-flex-arrays=3 [1]. This results in no differences in binary output. Link: https://github.com/KSPP/linux/issues/79 Link: https://github.com/KSPP/linux/issues/295 Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html [1] Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/6c6dcab88524c14c47fd06b9332bd96162656db5.1684358315.git.gustavoars@kernel.org Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-22 18:02:02 -04:00
Christophe JAILLET	144679dfb5	scsi: mpi3mr: Fix the type used for pointers to bitmap Bitmaps are "unsigned long[]", so better use "unsigned long " instead of a plain "void " when dealing with pointers to bitmaps. This is more informative. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/8bdf9148ce1a5d01aac11c46c8617b477813457e.1683473011.git.christophe.jaillet@wanadoo.fr Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-22 17:40:11 -04:00
Yuchen Yang	2e2fe5ac69	scsi: 3w-xxxx: Add error handling for initialization failure in tw_probe() Smatch complains that: tw_probe() warn: missing error code 'retval' This patch adds error checking to tw_probe() to handle initialization failure. If tw_reset_sequence() function returns a non-zero value, the function will return -EINVAL to indicate initialization failure. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Yuchen Yang <u202114568@hust.edu.cn> Link: https://lore.kernel.org/r/20230505141259.7730-1-u202114568@hust.edu.cn Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-22 17:31:56 -04:00
Martin K. Petersen	8b60e2189f	Merge patch series "Add Command Duration Limits support" Niklas Cassel <nks@flawful.org> says: This series adds support for Command Duration Limits. The series is based on linux tag: v6.4-rc1 The series can also be found in git: https://github.com/floatious/linux/commits/cdl-v7 ================= CDL in ATA / SCSI ================= Command Duration Limits is defined in: T13 ATA Command Set - 5 (ACS-5) and T10 SCSI Primary Commands - 6 (SPC-6) respectively (a simpler version of CDL is defined in T10 SPC-5). CDL defines Duration Limits Descriptors (DLD). 7 DLDs for read commands and 7 DLDs for write commands. Simply put, a DLD contains a limit and a policy. A command can specify that a certain limit should be applied by setting the DLD index field (3 bits, so 0-7) in the command itself. The DLD index points to one of the 7 DLDs. DLD index 0 means no descriptor, so no limit. DLD index 1-7 means DLD 1-7. A DLD can have a few different policies, but the two major ones are: -Policy 0xF (abort), command will be completed with command aborted error (ATA) or status CHECK CONDITION (SCSI), with sense data indicating that the command timed out. -Policy 0xD (complete-unavailable), command will be completed without error (ATA) or status GOOD (SCSI), with sense data indicating that the command timed out. Note that the command will not have transferred any data to/from the device when the command timed out, even though the command returned success. Regardless of the CDL policy, in case of a CDL timeout, the I/O will result in a -ETIME error to user-space. The DLDs are defined in the CDL log page(s) and are readable and writable. Reading and writing the CDL DLDs are outside the scope of the kernel. If a user wants to read or write the descriptors, they can do so using a user-space application that sends passthrough commands, such as cdl-tools: https://github.com/westerndigitalcorporation/cdl-tools ================================ The introduction of ioprio hints ================================ What the kernel does provide, is a method to let I/O use one of the CDL DLDs defined in the device. Note that the kernel will simply forward the DLD index to the device, so the kernel currently does not know, nor does it need to know, how the DLDs are defined inside the device. The way that the CDL DLD index is supplied to the kernel is by introducing a new 10 bit "ioprio hint" field within the existing 16 bit ioprio definition. Currently, only 6 out of the 16 ioprio bits are in use, the remaining 10 bits are unused, and are currently explicitly disallowed to be set by the kernel. For now, we only add ioprio hints representing CDL DLD index 1-7. Additional ioprio hints for other QoS features could be defined in the future. A theoretical future work could be to make an I/O scheduler aware of these hints. E.g. for CDL, an I/O scheduler could make use of the duration limit in each descriptor, and take that information into account while scheduling commands. Right now, the ioprio hints will be ignored by the I/O schedulers. ============================== How to use CDL from user-space ============================== Since CDL is mutually exclusive with NCQ priority (see ncq_prio_enable and sas_ncq_prio_enable in Documentation/ABI/testing/sysfs-block-device), CDL has to be explicitly enabled using: echo 1 > /sys/block/$bdev/device/cdl_enable Since the ioprio hints are supplied through the existing I/O priority API, it should be simple for an application to make use of the ioprio hints. It simply has to reuse one of the new macros defined in include/uapi/linux/ioprio.h: IOPRIO_PRIO_HINT() or IOPRIO_PRIO_VALUE_HINT(), and supply one of the new hints defined in include/uapi/linux/ioprio.h: IOPRIO_HINT_DEV_DURATION_LIMIT_[1-7], which indicates that the I/O should use the corresponding CDL DLD index 1-7. By reusing the I/O priority API, the user can both define a DLD to use per AIO (io_uring sqe->ioprio or libaio iocb->aio_reqprio) or per-thread (ioprio_set()). ======= Testing ======= With the following fio patches: https://github.com/floatious/fio/commits/cdl fio adds support for ioprio hints, such that CDL can be tested using e.g.: fio --ioengine=io_uring --cmdprio_percentage=10 --cmdprio_hint=DLD_index A simple way to test is to use a DLD with a very short duration limit, and send large reads. Regardless of the CDL policy, in case of a CDL timeout, the I/O will result in a -ETIME error to user-space. We also provide a CDL test suite located in the cdl-tools repo, see: https://github.com/westerndigitalcorporation/cdl-tools#testing-a-system-command-duration-limits-support We have tested this patch series using: -real hardware -the following QEMU implementation: https://github.com/floatious/qemu/tree/cdl (NOTE: the QEMU implementation requires you to define the CDL policy at compile time, so you currently need to recompile QEMU when switching between policies.) =================== Further information =================== For further information about CDL, see Damien's slides: Presented at SDC 2021: https://www.snia.org/sites/default/files/SDC/2021/pdfs/SNIA-SDC21-LeMoal-Be-On-Time-command-duration-limits-Feature-Support-in%20Linux.pdf Presented at Lund Linux Con 2022: https://drive.google.com/file/d/1I6ChFc0h4JY9qZdO1bY5oCAdYCSZVqWw/view?usp=sharing ================ Changes since V6 ================ -Rebased series on v6.4-rc1. -Picked up Reviewed-by tags from Hannes (Thank you Hannes!) -Picked up Reviewed-by tag from Christoph (Thank you Christoph!) -Changed KernelVersion from 6.4 to 6.5 for new sysfs attributes. For older change logs, see previous patch series versions: https://lore.kernel.org/linux-scsi/20230406113252.41211-1-nks@flawful.org/ https://lore.kernel.org/linux-scsi/20230404182428.715140-1-nks@flawful.org/ https://lore.kernel.org/linux-scsi/20230309215516.3800571-1-niklas.cassel@wdc.com/ https://lore.kernel.org/linux-scsi/20230124190308.127318-1-niklas.cassel@wdc.com/ https://lore.kernel.org/linux-scsi/20230112140412.667308-1-niklas.cassel@wdc.com/ https://lore.kernel.org/linux-scsi/20221208105947.2399894-1-niklas.cassel@wdc.com/ Link: https://lore.kernel.org/r/20230511011356.227789-1-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-22 17:09:51 -04:00
Niklas Cassel	390e2d1a58	scsi: sd: Handle read/write CDL timeout failures Commands using a duration limit descriptor that has limit policies set to a value other than 0x0 may be failed by the device if one of the limits are exceeded. For such commands, since the failure is the result of the user duration limit configuration and workload, the commands should not be retried and terminated immediately. Furthermore, to allow the user to differentiate these "soft" failures from hard errors due to hardware problem, a different error code than EIO should be returned. There are 2 cases to consider: (1) The failure is due to a limit policy failing the command with a check condition sense key, that is, any limit policy other than 0xD. For this case, scsi_check_sense() is modified to detect failures with the ABORTED COMMAND sense key and the COMMAND TIMEOUT BEFORE PROCESSING or COMMAND TIMEOUT DURING PROCESSING or COMMAND TIMEOUT DURING PROCESSING DUE TO ERROR RECOVERY additional sense code. For these failures, a SUCCESS disposition is returned so that scsi_finish_command() is called to terminate the command. (2) The failure is due to a limit policy set to 0xD, which result in the command being terminated with a GOOD status, COMPLETED sense key, and DATA CURRENTLY UNAVAILABLE additional sense code. To handle this case, the scsi_check_sense() is modified to return a SUCCESS disposition so that scsi_finish_command() is called to terminate the command. In addition, scsi_decide_disposition() has to be modified to see if a command being terminated with GOOD status has sense data. This is as defined in SCSI Primary Commands - 6 (SPC-6), so all according to spec, even if GOOD status commands were not checked before. If scsi_check_sense() detects sense data representing a duration limit, scsi_check_sense() will set the newly introduced SCSI ML byte SCSIML_STAT_DL_TIMEOUT. This SCSI ML byte is checked in scsi_noretry_cmd(), so that a command that failed because of a CDL timeout cannot be retried. The SCSI ML byte is also checked in scsi_result_to_blk_status() to complete the command request with the BLK_STS_DURATION_LIMIT status, which result in the user seeing ETIME errors for the failed commands. Co-developed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-12-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-22 17:05:19 -04:00
Damien Le Moal	e59e80cfef	scsi: sd: Set read/write command CDL index Introduce the command duration limits helper function sd_cdl_dld() to set the DLD bits of READ/WRITE 16 and READ/WRITE 32 commands to indicate to the device the command duration limit descriptor to apply to the commands. When command duration limits are enabled, sd_cdl_dld() obtains the index of the descriptor to apply to the command using the hints field of the request IO priority value (hints IOPRIO_HINT_DEV_DURATION_LIMIT_1 to IOPRIO_HINT_DEV_DURATION_LIMIT_7). If command duration limits is disabled (which is the default), the limit index "0" is always used to indicate "no limit" for a command. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Co-developed-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-11-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-22 17:05:19 -04:00
Damien Le Moal	1b22cfb141	scsi: core: Allow enabling and disabling command duration limits Add the sysfs scsi_device attribute cdl_enable to allow a user to enable or disable a device command duration limits feature. CDL is disabled by default. This feature must be explicitly enabled by a user by setting the cdl_enable attribute to 1. The new function scsi_cdl_enable() does not do anything beside setting the cdl_enable field of struct scsi_device in the case of a (real) SCSI device (e.g. a SAS HDD). For ATA devices, the command duration limits feature needs to be enabled/disabled using the ATA feature sub-page of the control mode page. To do so, the scsi_cdl_enable() function checks if this mode page is supported using scsi_mode_sense(). If it is, scsi_mode_select() is used to enable and disable CDL. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Co-developed-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-10-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-22 17:05:19 -04:00
Damien Le Moal	624885209f	scsi: core: Detect support for command duration limits Introduce the function scsi_cdl_check() to detect if a device supports command duration limits (CDL). Support for the READ 16, WRITE 16, READ 32 and WRITE 32 commands are checked using the function scsi_report_opcode() to probe the rwcdlp and cdlp bits as they indicate the mode page defining the command duration limits descriptors that apply to the command being tested. If any of these commands support CDL, the field cdl_supported of struct scsi_device is set to 1 to indicate that the device supports CDL. Support for CDL for a device is advertizes through sysfs using the new cdl_supported device attribute. This attribute value is 1 for a device supporting CDL and 0 otherwise. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Co-developed-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-9-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-22 17:05:19 -04:00
Damien Le Moal	152e52fb6f	scsi: core: Support Service Action in scsi_report_opcode() The REPORT_SUPPORTED_OPERATION_CODES command allows checking for support of commands that have the same opcode but different service actions, such as READ 32 and WRITE 32. However, the current implementation of scsi_report_opcode() only allows checking an operation code without a service action differentiation. Add the "sa" argument to scsi_report_opcode() to allow passing a service action. If a non-zero service action is specified, the reporting options field value is set to 3 to have the service action field taken into account by the device. If no service action field is specified (zero), the reporting options field is set to 1 as before. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-8-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-22 17:05:19 -04:00
Damien Le Moal	a6cdc35fab	scsi: core: Support retrieving sub-pages of mode pages Allow scsi_mode_sense() to retrieve sub-pages of mode pages by adding the subpage argument. Change all the current caller sites to specify the subpage 0. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-7-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-22 17:05:19 -04:00
Niklas Cassel	734326937b	scsi: core: Rename and move get_scsi_ml_byte() SCSI has two different getters: - get_XXX_byte() (in scsi_cmnd.h) which takes a struct scsi_cmnd *, and - XXX_byte() (in scsi.h) which takes a scmd->result. The proper name for get_scsi_ml_byte() should thus be without the get_ prefix, as it takes a scmd->result. Rename the function to rectify this. (This change was suggested by Mike Christie.) Additionally, move get_scsi_ml_byte() to scsi_priv.h since both scsi_lib.c and scsi_error.c will need to use this helper in a follow-up patch. Cc: Mike Christie <michael.christie@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-6-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-22 17:05:19 -04:00
Niklas Cassel	3d848ca1eb	scsi: core: Allow libata to complete successful commands via EH In SCSI, we get the sense data as part of the completion, for ATA however, we need to fetch the sense data as an extra step. For an aborted ATA command the sense data is fetched via libata's ->eh_strategy_handler(). For Command Duration Limits policy 0xD: The device shall complete the command without error with the additional sense code set to DATA CURRENTLY UNAVAILABLE. In order to handle this policy in libata, we intend to send a successful command via SCSI EH, and let libata's ->eh_strategy_handler() fetch the sense data for the good command. This is similar to how we handle an aborted ATA command, just that we need to read the Successful NCQ Commands log instead of the NCQ Command Error log. When we get a SATA completion with successful commands, ATA_SENSE will be set, indicating that some commands in the completion have sense data. The sense_valid bitmask in the Sense Data for Successful NCQ Commands log will inform exactly which commands that had sense data, which might be a subset of all the commands that was completed in the same completion. (Yet all will have ATA_SENSE set, since the status is per completion.) The successful commands that have e.g. a "DATA CURRENTLY UNAVAILABLE" sense data will have a SCSI ML byte set, so scsi_eh_flush_done_q() will not set the scmd->result to DID_TIME_OUT for these commands. However, the successful commands that did not have sense data, must not get their result marked as DID_TIME_OUT by SCSI EH. Add a new flag SCMD_FORCE_EH_SUCCESS, which tells SCSI EH to not mark a command as DID_TIME_OUT, even if it has scmd->result == SAM_STAT_GOOD. This will be used by libata in a subsequent commit. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Link: https://lore.kernel.org/r/20230511011356.227789-5-nks@flawful.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-22 17:05:18 -04:00
Martin K. Petersen	7907ad748b	Merge patch series "Use block pr_ops in LIO" Mike Christie <michael.christie@oracle.com> says: The patches in this thread allow us to use the block pr_ops with LIO's target_core_iblock module to support cluster applications in VMs. They were built over Linus's tree. They also apply over linux-next and Martin's tree and Jens's trees. Currently, to use windows clustering or linux clustering (pacemaker + cluster labs scsi fence agents) in VMs with LIO and vhost-scsi, you have to use tcmu or pscsi or use a cluster aware FS/framework for the LIO pr file. Setting up a cluster FS/framework is pain and waste when your real backend device is already a distributed device, and pscsi and tcmu are nice for specific use cases, but iblock gives you the best performance and allows you to use stacked devices like dm-multipath. So these patches allow iblock to work like pscsi/tcmu where they can pass a PR command to the backend module. And then iblock will use the pr_ops to pass the PR command to the real devices similar to what we do for unmap today. The patches are separated in the following groups: Patch 1 - 2: - Add block layer callouts for reading reservations and rename reservation error code. Patch 3 - 5: - SCSI support for new callouts. Patch 6: - DM support for new callouts. Patch 7 - 13: - NVMe support for new callouts. Patch 14 - 18: - LIO support for new callouts. This patchset has been tested with the libiscsi PGR ops and with window's failover cluster verification test. Note that for scsi backend devices we need this patchset: https://lore.kernel.org/linux-scsi/20230123221046.125483-1-michael.christie@oracle.com/T/#m4834a643ffb5bac2529d65d40906d3cfbdd9b1b7 to handle UAs. To reduce the size of this patchset that's being done separately to make reviewing easier. And to make merging easier this patchset and the one above do not have any conflicts so can be merged in different trees. Link: https://lore.kernel.org/r/20230407200551.12660-1-michael.christie@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-22 16:35:02 -04:00
Azeem Shaikh	37f1663c91	scsi: qla2xxx: Replace all non-returning strlcpy() with strscpy() strlcpy() reads the entire source buffer first. This read may exceed the destination size limit. This is both inefficient and can lead to linear read overflows if a source string is not NUL-terminated [1]. In an effort to remove strlcpy() completely [2], replace strlcpy() here with strscpy(). No return values were used, so direct replacement is safe. [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy [2] https://github.com/KSPP/linux/issues/89 Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com> Link: https://lore.kernel.org/r/20230516025404.2843867-1-azeemshaikh38@gmail.com Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-16 21:41:34 -04:00
Azeem Shaikh	41300cc989	scsi: qla4xxx: Replace all non-returning strlcpy() with strscpy() strlcpy() reads the entire source buffer first. This read may exceed the destination size limit. This is both inefficient and can lead to linear read overflows if a source string is not NUL-terminated [1]. In an effort to remove strlcpy() completely [2], replace strlcpy() here with strscpy(). No return values were used, so direct replacement is safe. [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy [2] https://github.com/KSPP/linux/issues/89 Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com> Link: https://lore.kernel.org/r/20230516025355.2835898-1-azeemshaikh38@gmail.com Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-16 21:40:41 -04:00
Azeem Shaikh	973464fded	scsi: bfa: Replace all non-returning strlcpy() with strscpy() strlcpy() reads the entire source buffer first. This read may exceed the destination size limit. This is both inefficient and can lead to linear read overflows if a source string is not NUL-terminated [1]. In an effort to remove strlcpy() completely [2], replace strlcpy() here with strscpy(). No return values were used, so direct replacement is safe. [1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy [2] https://github.com/KSPP/linux/issues/89 Signed-off-by: Azeem Shaikh <azeemshaikh38@gmail.com> Link: https://lore.kernel.org/r/20230516013345.723623-1-azeemshaikh38@gmail.com Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-16 21:38:36 -04:00
Martin K. Petersen	8759924ddb	Merge patch series "scsi: hisi_sas: Some misc changes" Xiang Chen <chenxiang66@hisilicon.com> says: This series contains some fixes including: - Configure initial value of some registers according to HBA model - Change DMA setup lock timeout from 100ms to 2.5s - Fix warnings detected by sparse Link: https://lore.kernel.org/r/1684118481-95908-1-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-16 21:36:51 -04:00
Xingui Yang	c0328cc595	scsi: hisi_sas: Fix warnings detected by sparse This patch fixes the following warning: drivers/scsi/hisi_sas/hisi_sas_v3_hw.c:2168:43: sparse: sparse: restricted __le32 degrades to integer Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202304161254.NztCVZIO-lkp@intel.com/ Signed-off-by: Xingui Yang <yangxingui@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1684118481-95908-4-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-16 21:36:40 -04:00
Xingui Yang	a090fc9761	scsi: hisi_sas: Change DMA setup lock timeout to 2.5s DMA setup lock timeout protection is added when DMA setup frames are received. It's a function outside the protocol and used to prevent SATA disk I/Os from being delivered for a long time. The default value is 100ms, it's too strict and easily triggered timeout when the disk is overloaded or faulty. Based on the average I/O latency of 300 disks, we adjust the value to 2.5s. Signed-off-by: Xingui Yang <yangxingui@huawei.com> Signed-off-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1684118481-95908-3-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-16 21:36:40 -04:00
Yihang Li	b68daae966	scsi: hisi_sas: Configure initial value of some registers according to HBA model For SAS HBAs of 920 and previous version, we use init_reg_v3_hw() to set some registers which are related to HW boards. For SAS HBAs of 920B and later version, those HW registers are set through firmware. And different HBA models are distinguished through pci_dev->revision. Signed-off-by: Yihang Li <liyihang9@huawei.com> Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Link: https://lore.kernel.org/r/1684118481-95908-2-git-send-email-chenxiang66@hisilicon.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-16 21:36:39 -04:00
Kees Cook	aa67380056	scsi: megaraid_sas: Convert union megasas_sgl to flex-arrays In the ongoing effort to replace all fake flexible arrays with true flexible arrays, replace the sge32, sge64, and sge_skinny members of union megasas_sgl with true flexible arrays. No binary differences are seen after this change; sizes were already being manually calculated using the member struct sizes directly. Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Sumit Saxena <sumit.saxena@broadcom.com> Cc: Shivasharan S <shivasharan.srikanteshwara@broadcom.com> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Gustavo A. R. Silva <gustavoars@kernel.org> Cc: megaraidlinux.pdl@broadcom.com Cc: linux-scsi@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230511220957.never.919-kees@kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-16 21:30:25 -04:00
Martin K. Petersen	44ef1604ae	Merge patch series "smartpqi updates" Don Brace <don.brace@microchip.com> says: These patches are based on Martin Petersen's 6.4/scsi-queue tree https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git 6.4/scsi-queue This set of changes consists of: * Map entire BAR 0. The driver was mapping up to and including the controller registers, but not all of BAR 0. * Add PCI IDs to support new controllers. * Clean up some code by removing unnecessary NULL checks. This cleanup is a result of a Coverity report. * Correct a rare memory leak whenever pqi_sas_port_add_rhpy() returns an error. This was Suggested by: Yang Yingliang <yangyingliang@huawei.com> * Remove atomic operations on variable raid_bypass_cnt. Accuracy is not required for driver operation. Change type from atomic_t to unsigned int. * Correct a rare drive hot-plug removal issue where we get a NULL io_request. We added a check for this condition. * Turn on NCQ priority for AIO requests to disks comprising RAID devices. * Correct byte aligned writew() operations on some ARM servers. Changed the writew() to two writeb() operations. * Change how the driver checks for a sanitize operation in progress. We were using TEST UNIT READY. We removed the TEST UNIT READY code and are now using the controller's firmware information in order to avoid issues caused by drives failing to complete TEST UNIT READY. * Some customers have been requesting that we add the NUMA node to /sys/block/sd<scsi device>/device like the nvme driver does. * Update the copyright information to match the current year. * Bump the driver version to 2.1.22-040. Link: https://lore.kernel.org/r/20230428153712.297638-1-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:56:37 -04:00
Martin K. Petersen	79c67c54f6	Merge patch series "scsi: pm80xx: Enhanced debug logs for HW events" Pranav Prasad <pranavpp@google.com> says: This patch series enhances debug logs for pm80xx HW events, and provides a minor fix in the case of a hard reset. The log enhancement involves changing the log severity level to enable logging for HW events which consequently help debug disk discovery issues. 1. Changed log severity level from MSG to EVENT for HW events. Enhanced the HW event logs by adding the phyid. 2. Enabled INIT logging. 3. Log portid along with the PHY_UP event. 4. Print phyid and portid sent as part of device registration request. 5. Log port state during HW events. 6. Update phy_state and phy_attached to correct values after a hard reset. Link: https://lore.kernel.org/r/20230418190101.696345-1-pranavpp@google.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:55:25 -04:00
Martin K. Petersen	44fcce6735	Merge patch series "scsi: libsas: remove empty branches and code simplification" Jason Yan <yanaijie@huawei.com> says: Three patches to remove two empty branches and a little code simplification. Link: https://lore.kernel.org/r/20230421093744.1583609-1-yanaijie@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:54:31 -04:00
Martin K. Petersen	92d685a96b	Merge patch series "qla2xxx driver update" Nilesh Javali <njavali@marvell.com> says: Please apply the qla2xxx driver enhancement and bug fixes to the scsi tree at your earliest convenience. Link: https://lore.kernel.org/r/20230428075339.32551-1-njavali@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:52:50 -04:00
Martin K. Petersen	808e87a511	Merge patch series "lpfc: Update lpfc to revision 14.2.0.12" Justin Tee <justintee8345@gmail.com> says: Update lpfc to revision 14.2.0.12 This patch set contains fixes flagged by code analyzer tools, introduces a new CQE status to handle DMA errors, and replaces the usage of blk interrupts with threaded interrupts. The patches were cut against Martin's 6.4/scsi-queue tree. Link: https://lore.kernel.org/r/20230417191558.83100-1-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:51:05 -04:00
Jinhong Zhu	f025312b08	scsi: qedf: Fix NULL dereference in error handling Smatch reported: drivers/scsi/qedf/qedf_main.c:3056 qedf_alloc_global_queues() warn: missing unwind goto? At this point in the function, nothing has been allocated so we can return directly. In particular the "qedf->global_queues" have not been allocated so calling qedf_free_global_queues() will lead to a NULL dereference when we check if (!gl[i]) and "gl" is NULL. Fixes: `61d8658b4a` ("scsi: qedf: Add QLogic FastLinQ offload FCoE driver framework.") Signed-off-by: Jinhong Zhu <jinhongzhu@hust.edu.cn> Link: https://lore.kernel.org/r/20230502140022.2852-1-jinhongzhu@hust.edu.cn Reviewed-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:19:20 -04:00
Harshit Mogalapalli	2a95483201	scsi: mpi3mr: Use -ENOMEM instead of -1 in mpi3mr_expander_add() smatch warnings: drivers/scsi/mpi3mr/mpi3mr_transport.c:1449 mpi3mr_expander_add() warn: returning -1 instead of -ENOMEM is sloppy No functional change. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Link: https://lore.kernel.org/r/202303202027.ZeDQE5Ug-lkp@intel.com/ Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Link: https://lore.kernel.org/r/20230419064256.2532069-1-harshit.m.mogalapalli@oracle.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:19:20 -04:00
Don Brace	fcb405111a	scsi: smartpqi: Update version to 2.1.22-040 Reviewed-by: Gerry Morong <gerry.morong@microchip.com> Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/20230428153712.297638-13-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:17:12 -04:00
Don Brace	49fd52d499	scsi: smartpqi: Update copyright to 2023 Update copyright to current year. Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/20230428153712.297638-12-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:17:12 -04:00
Don Brace	d2c7583f27	scsi: smartpqi: Add sysfs entry for NUMA node in /sys/block/sdX/device Although NUMA node is a PCIe device level attribute, it was requested the NUMA node be added for each exposed device similar to NVMe disks. Example for NVMe: /sys/block/nvme1c1n1/device/numa_node Example for smartpqi: /sys/block/sdh/device/numa_node cat /sys/block/sdh/device/numa_node 0 Reviewed-by: David Strahan <david.strahan@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/20230428153712.297638-11-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:17:12 -04:00
Kevin Barnett	2eddf98d01	scsi: smartpqi: Stop sending driver-initiated TURs Stop sending driver-initiated TURs to physical devices during driver load/rescan. Note: This does not affect SML initiated TURs. Some Linux kernels can cause lengthy delays in OS boot if the kernel detects that a drive is being sanitized/erased. We were using TURs to detect if a sanitize/erase was in progress. Some devices do not return the TUR in a timely manner, causing driver load/rescan stalls. Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/20230428153712.297638-10-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:17:12 -04:00
Don Brace	c23efd9ead	scsi: smartpqi: Fix byte aligned writew for ARM servers Correct OOPs on ARM servers during driver init. The driver attempts to update FW with max_feature_supported value using a writew() kernel call using a byte aligned address. This fails on some ARM systems. Change the writew() to two writeb() calls to update this value. Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/20230428153712.297638-9-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:17:11 -04:00
Gilbert Wu	68f7920492	scsi: smartpqi: Add support for RAID NCQ priority Enable NCQ priority feature for the RAID path when AIO path is disabled. Move function pqi_is_io_high_priority() up to avoid adding a prototype. Remove unused argument ctrl_info. Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Gilbert Wu <Gilbert.Wu@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/20230428153712.297638-8-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:17:11 -04:00
Murthy Bhat	5c9e3c1c52	scsi: smartpqi: Validate block layer host tag Prevent OS crashes when a drive is hot removed during I/O stress test. The I/O request pointer can be invalid if block layer provides incorrect multi-queue host tag. This can lead to invalid I/O request pointer dereference. Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Murthy Bhat <Murthy.Bhat@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/20230428153712.297638-7-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:17:11 -04:00
Mike McGowen	80d560d94f	scsi: smartpqi: Remove contention for raid_bypass_cnt Reduce CPU contention when incrementing variable raid_bypass_cnt. Remove the atomic operations for this variable by changing the atomic to an unsigned int and replace atomic operations with standard operations. The value is only checked that it is increasing and accuracy is not required. Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Mike McGowen <mike.mcgowen@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/20230428153712.297638-6-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:17:11 -04:00
Don Brace	2312e844dc	scsi: smartpqi: Fix rare SAS transport memory leak Free rphy when pqi_sas_port_add_rphy() returns an error. If pqi_sas_port_add_rphy() returns an error, the 'rphy' allocated in sas_end_device_alloc() needs to be freed. It should be noted that no issues were ever reported. Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Suggested-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/20230428153712.297638-5-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:17:11 -04:00
Kevin Barnett	889cda36db	scsi: smartpqi: Remove NULL pointer check Remove an unnecessary check for a NULL pointer. This unnecessary check was flagged by Coverity. Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Signed-off-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/20230428153712.297638-4-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:17:11 -04:00
David Strahan	fe0375d485	scsi: smartpqi: Add new controller PCI IDs All PCI ID entries in Hex. Add PCI IDs for ZTE controllers: VID / DID / SVID / SDID ---- ---- ---- ---- ZTE SmartROC3200 RS344-16i 4G 9005 / 028f / 1cf2 / 0804 ZTE SmartROC3200 RS345-16i 8G 9005 / 028f / 1cf2 / 0805 ZTE SmartIOC2200 RS346-16i 9005 / 028f / 1cf2 / 0806 ZTE SmartROC3200 RM344-16i 4G 9005 / 028f / 1cf2 / 54da ZTE SmartROC3200 RM345-16i 8G 9005 / 028f / 1cf2 / 54db ZTE SmartIOC2200 RM346-16i 9005 / 028f / 1cf2 / 54dc Add PCI IDs for ByteDance controllers: VID / DID / SVID / SDID ---- ---- ---- ---- ByteHBA JGH43014-8 9005 / 028f / 1e93 / 1005 Add PCI IDs for IBM controllers: VID / DID / SVID / SDID ---- ---- ---- ---- IBM 4-Port 24G SAS 9005 / 028f / 1014 / 0718 Add PCI IDs for Cloudnine controllers: VID / DID / SVID / SDID ---- ---- ---- ---- SmartHBA P6600-8i 9005 / 028f / 1f51 / 1001 SmartRAID P7604-8i 9005 / 028f / 1f51 / 1002 SmartHBA P6600-8e 9005 / 028f / 1f51 / 1003 SmartRAID P7604-8e 9005 / 028f / 1f51 / 1004 SmartHBA P6600-16i 9005 / 028f / 1f51 / 1005 SmartRAID P7608-16i 9005 / 028f / 1f51 / 1006 SmartHBA P6600-8i8e 9005 / 028f / 1f51 / 1007 SmartRAID P7608-8i8e 9005 / 028f / 1f51 / 1008 SmartHBA P6600-16e 9005 / 028f / 1f51 / 1009 SmartRAID P7608-16e 9005 / 028f / 1f51 / 100a Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Scott Teel <scott.teel@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Reviewed-by: Mike McGowen <mike.mcgowen@microchip.com> Signed-off-by: David Strahan <David.Strahan@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/20230428153712.297638-3-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:17:11 -04:00
Mike McGowen	3e7e55aa3d	scsi: smartpqi: Map full length of PCI BAR 0 Map full length of PCI BAR 0 at driver init. During driver initialization, the driver must make a kernel call to map the controller registers into kernel address space. A parameter to this call is the length of the memory to be mapped. The driver was specifying the wrong length. Reviewed-by: Scott Benesh <scott.benesh@microchip.com> Reviewed-by: Kevin Barnett <kevin.barnett@microchip.com> Signed-off-by: Mike McGowen <mike.mcgowen@microchip.com> Signed-off-by: Don Brace <don.brace@microchip.com> Link: https://lore.kernel.org/r/20230428153712.297638-2-don.brace@microchip.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:17:11 -04:00
Nilesh Javali	eb91eb809c	scsi: qla2xxx: Update version to 10.02.08.300-k Update version to 10.02.08.300-k. Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230428075339.32551-8-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:16:40 -04:00
Quinn Tran	fc0cba0c7b	scsi: qla2xxx: Wait for io return on terminate rport System crash due to use after free. Current code allows terminate_rport_io to exit before making sure all IOs has returned. For FCP-2 device, IO's can hang on in HW because driver has not tear down the session in FW at first sign of cable pull. When dev_loss_tmo timer pops, terminate_rport_io is called and upper layer is about to free various resources. Terminate_rport_io trigger qla to do the final cleanup, but the cleanup might not be fast enough where it leave qla still holding on to the same resource. Wait for IO's to return to upper layer before resources are freed. Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230428075339.32551-7-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:16:40 -04:00
Quinn Tran	b843adde8d	scsi: qla2xxx: Fix mem access after free System crash, where driver is accessing scsi layer's memory (scsi_cmnd->device->host) to search for a well known internal pointer (vha). The scsi_cmnd was released back to upper layer which could be freed, but the driver is still accessing it. 7 [ffffa8e8d2c3f8d0] page_fault at ffffffff86c010fe [exception RIP: __qla2x00_eh_wait_for_pending_commands+240] RIP: ffffffffc0642350 RSP: ffffa8e8d2c3f988 RFLAGS: 00010286 RAX: 0000000000000165 RBX: 0000000000000002 RCX: 00000000000036d8 RDX: 0000000000000000 RSI: ffff9c5c56535188 RDI: 0000000000000286 RBP: ffff9c5bf7aa4a58 R8: ffff9c589aecdb70 R9: 00000000000003d1 R10: 0000000000000001 R11: 0000000000380000 R12: ffff9c5c5392bc78 R13: ffff9c57044ff5c0 R14: ffff9c56b5a3aa00 R15: 00000000000006db ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 8 [ffffa8e8d2c3f9c8] qla2x00_eh_wait_for_pending_commands at ffffffffc0646dd5 [qla2xxx] 9 [ffffa8e8d2c3fa00] __qla2x00_async_tm_cmd at ffffffffc0658094 [qla2xxx] Remove access of freed memory. Currently the driver was checking to see if scsi_done was called by seeing if the sp->type has changed. Instead, check to see if the command has left the oustanding_cmds[] array as sign of scsi_done was called. Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230428075339.32551-6-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:16:40 -04:00
Quinn Tran	9ae615c5bf	scsi: qla2xxx: Fix hang in task management Task management command hangs where a side band chip reset failed to nudge the TMF from it's current send path. Add additional error check to block TMF from entering during chip reset and along the TMF path to cause it to bail out, skip over abort of marker. Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230428075339.32551-5-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:16:40 -04:00
Quinn Tran	6a87679626	scsi: qla2xxx: Fix task management cmd fail due to unavailable resource Task management command failed with status 2Ch which is a result of too many task management commands sent to the same target. Hence limit task management commands to 8 per target. Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202304271952.NKNmoFzv-lkp@intel.com/ Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230428075339.32551-4-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:16:39 -04:00
Quinn Tran	9803fb5d27	scsi: qla2xxx: Fix task management cmd failure Task management cmd failed with status 30h which means FW is not able to finish processing one task management before another task management for the same lun. Hence add wait for completion of marker to space it out. Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202304271802.uCZfwQC1-lkp@intel.com/ Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230428075339.32551-3-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com <mailto:himanshu.madhani@oracle.com>> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:16:39 -04:00
Quinn Tran	d90171dd0d	scsi: qla2xxx: Multi-que support for TMF Add queue flush for task management command, before placing it on the wire. Do IO flush for all Request Q's. Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/oe-kbuild-all/202304271702.GpIL391S-lkp@intel.com/ Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20230428075339.32551-2-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com <mailto:himanshu.madhani@oracle.com>> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:16:39 -04:00
Jason Yan	cf3cd61e76	scsi: libsas: factor out sas_check_fanout_expander_topo() To be consistent with sas_check_edge_expander_topo(), factor out sas_check_fanout_expander_topo(). And remove the comment since we are not spilling over 80 colums now. Signed-off-by: Jason Yan <yanaijie@huawei.com> Link: https://lore.kernel.org/r/20230421093744.1583609-4-yanaijie@huawei.com Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2023-05-08 07:16:18 -04:00

1 2 3 4 5 ...

23988 Commits