Commit Graph

24552 Commits

Author SHA1 Message Date
Uwe Kleine-König
f0baf76a22 scsi: mvme16x: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

In the error path emit an error message replacing the (less useful)
message by the core. Apart from the improved error message there is no
change in behaviour.

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/1d16e93a498831abd64df8b8cf54fd8872cdd1cd.1701619134.git.u.kleine-koenig@pengutronix.de
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:51:37 -05:00
Uwe Kleine-König
69b43bf38b scsi: mac: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

In the error path emit an error message replacing the (less useful)
message by the core. Apart from the improved error message there is no
change in behaviour.

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/89ce161dad52d99df07135531512ccecb7f25d14.1701619134.git.u.kleine-koenig@pengutronix.de
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:51:37 -05:00
Uwe Kleine-König
0b649224f7 scsi: mac_esp: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

In the error path emit an error message replacing the (less useful)
message by the core. Apart from the improved error message there is no
change in behaviour.

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/9013c84059b8ccd6a5c8305aa35cfdfa314ba74c.1701619134.git.u.kleine-koenig@pengutronix.de
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:51:37 -05:00
Uwe Kleine-König
c71ef3d1fb scsi: jazz_esp: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

In the error path emit an error message replacing the (less useful)
message by the core. Apart from the improved error message there is no
change in behaviour.

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/f71efbe17973c97fd2a1e78f8d7fcf456644510b.1701619134.git.u.kleine-koenig@pengutronix.de
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:51:37 -05:00
Uwe Kleine-König
51a41ec6d3 scsi: bvme6000: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

In the error path emit an error message replacing the (less useful)
message by the core. Apart from the improved error message there is no
change in behaviour.

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/63294479a4e745210c078859afa88904fa0b3be8.1701619134.git.u.kleine-koenig@pengutronix.de
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:51:36 -05:00
Uwe Kleine-König
3becb4cdf1 scsi: atari: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

In the error path emit an error message replacing the (less useful)
message by the core. Apart from the improved error message there is no
change in behaviour.

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/27a2b133b1b88a9baf51353c511e93a5027f9602.1701619134.git.u.kleine-koenig@pengutronix.de
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:51:36 -05:00
Uwe Kleine-König
688bbe398c scsi: a4000t: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

In the error path emit an error message replacing the (less useful)
message by the core. Apart from the improved error message there is no
change in behaviour.

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/c15ffc57efebc5da3f7e6dd558d69181e129cafe.1701619134.git.u.kleine-koenig@pengutronix.de
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:51:36 -05:00
Uwe Kleine-König
5854cdd041 scsi: a3000: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

In the error path emit an error message replacing the (less useful)
message by the core. Apart from the improved error message there is no
change in behaviour.

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Link: https://lore.kernel.org/r/811d180950b76c2d95cd080e3c251757ca011380.1701619134.git.u.kleine-koenig@pengutronix.de
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:51:36 -05:00
Martin K. Petersen
e84d34372e Merge branch '6.8/s/mpi3mr2' into 6.8/scsi-staging
Two driver updates from Chandrakanth patil at Broadcom:

  scsi: mpi3mr: Update driver version to 8.5.1.0.0
  scsi: mpi3mr: Support for preallocation of SGL BSG data buffers part-3
  scsi: mpi3mr: Support for preallocation of SGL BSG data buffers part-2
  scsi: mpi3mr: Support for preallocation of SGL BSG data buffers part-1
  scsi: mpi3mr: Fetch correct device dev handle for status reply descriptor
  scsi: mpi3mr: Block PEL Enable Command on Controller Reset and Unrecoverable State
  scsi: mpi3mr: Clean up block devices post controller reset
  scsi: mpi3mr: Refresh sdev queue depth after controller reset

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:46:32 -05:00
Chandrakanth patil
d0a60e3eda scsi: mpi3mr: Update driver version to 8.5.1.0.0
Update driver version to 8.5.1.0.0

Signed-off-by: Chandrakanth patil <chandrakanth.patil@broadcom.com>
Link: https://lore.kernel.org/r/20231205191630.12201-5-chandrakanth.patil@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:45:53 -05:00
Chandrakanth patil
9536af615d scsi: mpi3mr: Support for preallocation of SGL BSG data buffers part-3
The driver acquires the required NVMe SGLs from the pre-allocated
pool.

Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Chandrakanth patil <chandrakanth.patil@broadcom.com>
Link: https://lore.kernel.org/r/20231205191630.12201-4-chandrakanth.patil@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:45:53 -05:00
Chandrakanth patil
fb231d7def scsi: mpi3mr: Support for preallocation of SGL BSG data buffers part-2
The driver acquires the required SGLs from the pre-allocated pool.

Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Chandrakanth patil <chandrakanth.patil@broadcom.com>
Link: https://lore.kernel.org/r/20231205191630.12201-3-chandrakanth.patil@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:45:53 -05:00
Chandrakanth patil
c432e16752 scsi: mpi3mr: Support for preallocation of SGL BSG data buffers part-1
The driver now supports SGLs for BSG data transfer. Upon loading, the
driver pre-allocates SGLs in chunks of 8k, results in a total of 256 * 8k,
equal to 2MB. These pre-allocated SGLs are reserved for handling BSG
commands and are deallocated during driver unload.

Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Chandrakanth patil <chandrakanth.patil@broadcom.com>
Link: https://lore.kernel.org/r/20231205191630.12201-2-chandrakanth.patil@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:45:53 -05:00
Chandrakanth patil
07ac6adda4 scsi: mpi3mr: Fetch correct device dev handle for status reply descriptor
The current dev handle for the status reply is 0xFFFF, which is invalid.
So fetch the correct value.

Co-developed-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Chandrakanth patil <chandrakanth.patil@broadcom.com>
Link: https://lore.kernel.org/r/20231126053134.10133-5-chandrakanth.patil@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:38:48 -05:00
Chandrakanth patil
f8fb3f3914 scsi: mpi3mr: Block PEL Enable Command on Controller Reset and Unrecoverable State
If a controller reset is underway or the controller is in an unrecoverable
state, the PEL enable management command will be returned as EAGAIN or
EFAULT.

Cc: <stable@vger.kernel.org> # v6.1+
Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Chandrakanth patil <chandrakanth.patil@broadcom.com>
Link: https://lore.kernel.org/r/20231126053134.10133-4-chandrakanth.patil@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:38:48 -05:00
Chandrakanth patil
c01d515687 scsi: mpi3mr: Clean up block devices post controller reset
After a controller reset, if the firmware changes the state of devices to
"hide", then remove those devices from the OS.

Cc: <stable@vger.kernel.org> # v6.6+
Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Chandrakanth patil <chandrakanth.patil@broadcom.com>
Link: https://lore.kernel.org/r/20231126053134.10133-3-chandrakanth.patil@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:38:48 -05:00
Chandrakanth patil
e5aab848df scsi: mpi3mr: Refresh sdev queue depth after controller reset
After a controller reset, the firmware may modify the device queue depth.
Therefore, update the device queue depth accordingly.

Cc: <stable@vger.kernel.org> # v5.15+
Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Chandrakanth patil <chandrakanth.patil@broadcom.com>
Link: https://lore.kernel.org/r/20231126053134.10133-2-chandrakanth.patil@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:38:47 -05:00
Martin K. Petersen
f200dad9f3 Merge patch series "libfc: fixup command abort handling"
Hannes Reinecke <hare@kernel.org> says:

Hi all,

when testing command timeout with the help of XDP I found that
scsi_try_to_abort_cmd() would always return 'SUCCESS' for FCoE, even
if no commands could be sent over the wire.  Which is not only
surprising, but also can lead to data corruption as commands were
never aborted.  Root cause was that aborts had been sent twice, once
from FC error recovery and once from SCSI EH, with the former inducing
the latter to assume that the command was already aborted.

As usual, comments and reviews are welcome.

Link: https://lore.kernel.org/r/20231129165832.224100-1-hare@kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:33:05 -05:00
Hannes Reinecke
be40572c22 scsi: libfc: Map FC_TIMED_OUT to DID_TIME_OUT
When an exchange is completed with FC_TIMED_OUT we should map it to
DID_TIME_OUT to inform the SCSI midlayer that this was a command timeout;
DID_BUS_BUSY implies that the command was never sent which is not the case
here.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231129165832.224100-4-hare@kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:32:11 -05:00
Hannes Reinecke
53122a49f4 scsi: libfc: Fix up timeout error in fc_fcp_rec_error()
We should set the status to FC_TIMED_OUT when a timeout error is passed to
fc_fcp_rec_error().

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231129165832.224100-3-hare@kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:32:11 -05:00
Hannes Reinecke
b57c4db5d2 scsi: libfc: Don't schedule abort twice
The current FC error recovery is sending up to three REC (recovery) frames
in 10 second intervals, and as a final step sending an ABTS after 30
seconds for the command itself.  Unfortunately sending an ABTS is also the
action for the SCSI abort handler, and the default timeout for SCSI
commands is also 30 seconds. This causes two ABTS to be scheduled, with the
libfc one slightly earlier. The ABTS scheduled by SCSI EH then sees the
command to be already aborted, and will always return with a 'GOOD' status
irrespective on the actual result from the first ABTS.  This causes the
SCSI EH abort handler to always succeed, and SCSI EH never to be engaged.
Fix this by not issuing an ABTS when a SCSI command is present for the
exchange, but rather wait for the abort scheduled from SCSI EH.  And warn
if an abort is already scheduled to avoid similar errors in the future.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231129165832.224100-2-hare@kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:32:11 -05:00
Su Hui
aef6ac1236 scsi: aic7xxx: Return negative error codes in aic7770_probe()
aic7770_config() returns both negative and positive error codes.  It's
better to make aic7770_probe() only return negative error codes.

A previous commit made ahc_linux_register_host() return negative error
codes, which makes sure aic7770_probe() returns negative error codes.

Signed-off-by: Su Hui <suhui@nfschina.com>
Link: https://lore.kernel.org/r/20231201025955.1584260-4-suhui@nfschina.com
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:18:40 -05:00
Su Hui
70dfaf84ec scsi: aic7xxx: Return ahc_linux_register_host()'s value rather than zero
ahc_linux_register_host() can return an error code in case of failure. So
ahc_linux_pci_dev_probe() should return the value of
ahc_linux_register_host() rather than zero.

An earlier commit made ahc_linux_register_host() return negative error
codes, which makes sure ahc_linux_pci_dev_probe() returns negative error
codes.

Signed-off-by: Su Hui <suhui@nfschina.com>
Link: https://lore.kernel.org/r/20231201025955.1584260-3-suhui@nfschina.com
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:18:40 -05:00
Su Hui
573eb4a341 scsi: aic7xxx: Return negative error codes in ahc_linux_register_host()
ahc_linux_register_host() returns both positive and negative error codes.
It's better to make this function only return negative error codes.

Signed-off-by: Su Hui <suhui@nfschina.com>
Link: https://lore.kernel.org/r/20231201025955.1584260-2-suhui@nfschina.com
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:18:40 -05:00
Artem Chernyshev
25cba909ad scsi: isci: Remove redundant check in isci_task_request_build()
sci_task_request_construct_ssp() has invariant return. Change this function
to void and get rid of unnecessary checks.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru>
Link: https://lore.kernel.org/r/20231128121159.2373975-1-artem.chernyshev@red-soft.ru
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:13:15 -05:00
Michael Ellerman
84e46978b9 scsi: ipr: Remove obsolete check for old CPUs
The IPR driver has a routine to check whether it's running on certain CPU
versions and if so whether the adapter is supported on that CPU.

But none of the CPUs it checks for are supported by Linux anymore.

The most recent CPU it checks for is Power4+ which was removed in commit
471d7ff8b5 ("powerpc/64s: Remove POWER4 support").

So drop the check. That makes the "testmode" module parameter unused, so
remove it as well.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20231127111740.1288463-1-mpe@ellerman.id.au
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:05:09 -05:00
Justin Stitt
712b3f43ba scsi: ibmvscsi: Replace deprecated strncpy() with strscpy()
strncpy() is deprecated for use on NUL-terminated destination strings [1]
and as such we should prefer more robust and less ambiguous string
interfaces.

We expect partition_name to be NUL-terminated based on its usage with
format strings:
|       dev_info(hostdata->dev, "host srp version: %s, "
|                "host partition %s (%d), OS %d, max io %u\n",
|                hostdata->madapter_info.srp_version,
|                hostdata->madapter_info.partition_name,
|                be32_to_cpu(hostdata->madapter_info.partition_number),
|                be32_to_cpu(hostdata->madapter_info.os_type),
|                be32_to_cpu(hostdata->madapter_info.port_max_txu[0]));
...
|       len = snprintf(buf, PAGE_SIZE, "%s\n",
|                hostdata->madapter_info.partition_name);

Moreover, NUL-padding is not required as madapter_info is explicitly
memset to 0:
|       memset(&hostdata->madapter_info, 0x00,
|                       sizeof(hostdata->madapter_info));

Considering the above, a suitable replacement is strscpy() [2] due to the
fact that it guarantees NUL-termination on the destination buffer without
unnecessarily NUL-padding.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Cc: <linux-hardening@vger.kernel.org>
Signed-off-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20231030-strncpy-drivers-scsi-ibmvscsi-ibmvscsi-c-v1-1-f8b06ae9e3d5@google.com
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:01:52 -05:00
Justin Stitt
a9baa16b4f scsi: ibmvfc: Replace deprecated strncpy() with strscpy()
strncpy() is deprecated for use on NUL-terminated destination strings [1]
and as such we should prefer more robust and less ambiguous string
interfaces.

We expect these fields to be NUL-terminated as the property names from
which they are derived are also NUL-terminated.

Moreover, NUL-padding is not required as our destination buffers are
already NUL-allocated and any future NUL-byte assignments are redundant
(like the ones that strncpy() does).
ibmvfc_probe() ->
|       struct ibmvfc_host *vhost;
|       struct Scsi_Host *shost;
...
| 	shost = scsi_host_alloc(&driver_template, sizeof(*vhost));
... **side note: is this a bug? Looks like a type to me   ^^^^^**
...
|	vhost = shost_priv(shost);

... where shost_priv() is:
|       static inline void *shost_priv(struct Scsi_Host *shost)
|       {
|       	return (void *)shost->hostdata;
|       }

.. and:
scsi_host_alloc() ->
| 	shost = kzalloc(sizeof(struct Scsi_Host) + privsize, GFP_KERNEL);

And for login_info->..., NUL-padding is also not required as it is
explicitly memset to 0:
|	memset(login_info, 0, sizeof(*login_info));

Considering the above, a suitable replacement is strscpy() [2] due to
the fact that it guarantees NUL-termination on the destination buffer
without unnecessarily NUL-padding.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Cc: <linux-hardening@vger.kernel.org>
Signed-off-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20231030-strncpy-drivers-scsi-ibmvscsi-ibmvfc-c-v1-1-5a4909688435@google.com
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:01:52 -05:00
Artem Chernyshev
f5f27a332a scsi: fnic: Return error if vmalloc() failed
In fnic_init_module() exists redundant check for return value from
fnic_debugfs_init(), because at moment it only can return zero. It make
sense to process theoretical vmalloc() failure.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 9730ddfb12 ("scsi: fnic: remove redundant assignment of variable rc")
Signed-off-by: Artem Chernyshev <artem.chernyshev@red-soft.ru>
Link: https://lore.kernel.org/r/20231128111008.2280507-1-artem.chernyshev@red-soft.ru
Reviewed-by: Karan Tilak Kumar <kartilak@cisco.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 21:01:52 -05:00
Dinghao Liu
235f2b548d scsi: be2iscsi: Fix a memleak in beiscsi_init_wrb_handle()
When an error occurs in the for loop of beiscsi_init_wrb_handle(), we
should free phwi_ctxt->be_wrbq before returning an error code to prevent
potential memleak.

Fixes: a7909b396b ("[SCSI] be2iscsi: Fix dynamic CID allocation Mechanism in driver")
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Link: https://lore.kernel.org/r/20231123081941.24854-1-dinghao.liu@zju.edu.cn
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-12-05 20:38:26 -05:00
Ilpo Järvinen
420ac76610 scsi: lpfc: Use PCI_HEADER_TYPE_MFD instead of literal
Replace literal 0x80 with PCI_HEADER_TYPE_MFD.

Link: https://lore.kernel.org/r/20231124090919.23687-4-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-12-01 16:50:59 -06:00
Martin K. Petersen
6bae38ddd3 Merge patch series "scsi: arcmsr: support Areca ARC-1688 Raid controller"
Driver update from ching Huang <ching2048@areca.com.tw>.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-24 21:24:38 -05:00
ching Huang
56610811cc scsi: arcmsr: Update driver version to v1.51.00.14-20230915
Signed-off-by: ching Huang <ching2048@areca.com.tw>
Link: https://lore.kernel.org/r/514898a472dfdf0502afe27d127ed5145a1fb915.camel@areca.com.tw
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-24 21:23:36 -05:00
ching Huang
41c8a1a1e9 scsi: arcmsr: Support new PCI device IDs 1883 and 1886
Add support for Areca RAID controllers with PCI device IDs 1883 and 1886.

Signed-off-by: ching Huang <ching2048@areca.com.tw>
Link: https://lore.kernel.org/r/7732e743eaad57681b1552eec9c6a86c76dbe459.camel@areca.com.tw
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-24 21:22:29 -05:00
ching Huang
14ef4b001a scsi: arcmsr: Support new RAID controller ARC-1688
Add support for new Areca RAID controller ARC-1688

Signed-off-by: ching Huang <ching2048@areca.com.tw>
Link: https://lore.kernel.org/r/110bdc873497d3d5e090b908fb159b6155bb3a2b.camel@areca.com.tw
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-24 21:20:49 -05:00
Abhinav Singh
f38d4eda25 scsi: dc395x: Fix warning using plain integer as NULL
Sparse static analysis tool generates a warning with this message "Using
plain integer as NULL pointer". Fix it.

Signed-off-by: Abhinav Singh <singhabhinav9051571833@gmail.com>
Link: https://lore.kernel.org/r/20231109215049.1466431-1-singhabhinav9051571833@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-24 21:08:23 -05:00
Martin K. Petersen
130fbf45f4 Merge patch series "mpi3mr: Add support for Broadcom SAS5116 IO/RAID controllers"
Sumit Saxena <sumit.saxena@broadcom.com> says:

These patches add support for Broadcom's SAS5116 IO/RAID controllers
in mpi3mr driver.

Link: https://lore.kernel.org/r/20231123160132.4155-1-sumit.saxena@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-24 20:55:24 -05:00
Sumit Saxena
b4d94164ff scsi: mpi3mr: driver version upgrade to 8.5.0.0.50
Update driver version to 8.5.0.0.50.

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Link: https://lore.kernel.org/r/20231123160132.4155-6-sumit.saxena@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-24 20:53:05 -05:00
Sumit Saxena
1193a89d2b scsi: mpi3mr: Add support for status reply descriptor
Inform controller firmware that driver supports status reply descriptor.

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Link: https://lore.kernel.org/r/20231123160132.4155-5-sumit.saxena@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-24 20:53:05 -05:00
Sumit Saxena
cb5b608946 scsi: mpi3mr: Increase maximum number of PHYs to 64 from 32
SAS5116 controllers supports maximum 48 physical PHYs. Modify driver to
accommodate up to 64 PHYs (though current need is to support 48 PHYs).

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Link: https://lore.kernel.org/r/20231123160132.4155-4-sumit.saxena@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-24 20:53:05 -05:00
Sumit Saxena
c9260ff28e scsi: mpi3mr: Add PCI checks where SAS5116 diverges from SAS4116
Add PCI IDs checks for the cases where SAS5116 diverges from SAS4116 in
behavior.

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Link: https://lore.kernel.org/r/20231123160132.4155-3-sumit.saxena@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-24 20:53:05 -05:00
Sumit Saxena
6fa21eab82 scsi: mpi3mr: Add support for SAS5116 PCI IDs
Add support for Broadcom's SAS5116 IO/RAID controllers PCI IDs.

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Link: https://lore.kernel.org/r/20231123160132.4155-2-sumit.saxena@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-24 20:53:05 -05:00
Damien Le Moal
b09d7f8fd5 scsi: sd: Fix system start for ATA devices
It is not always possible to keep a device in the runtime suspended state
when a system level suspend/resume cycle is executed. E.g. for ATA devices
connected to AHCI adapters, system resume resets the ATA ports, which
causes connected devices to spin up. In such case, a runtime suspended disk
will incorrectly be seen with a suspended runtime state because the device
is not resumed by sd_resume_system(). The power state seen by the user is
different than the actual device physical power state.

Fix this issue by introducing the struct scsi_device flag
force_runtime_start_on_system_start. When set, this flag causes
sd_resume_system() to request a runtime resume operation for runtime
suspended devices. This results in the user seeing the device runtime_state
as active after a system resume, thus correctly reflecting the device
physical power state.

Fixes: 9131bff6a9 ("scsi: core: pm: Only runtime resume if necessary")
Cc: <stable@vger.kernel.org>
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20231120225631.37938-3-dlemoal@kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-24 20:44:21 -05:00
Bart Van Assche
10b53db2db scsi: core: Add a precondition check in scsi_eh_scmd_add()
Calling scsi_eh_scmd_add() may cause the error handler never to be woken up
because this may result in shost->host_failed to become larger than
scsi_host_busy(shost). Hence complain if scsi_eh_scmd_add() is called after
SCMD_STATE_INFLIGHT has been cleared.

Cc: Hannes Reinecke <hare@suse.de>
Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20231115193343.2262013-1-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-24 19:23:44 -05:00
Bart Van Assche
0349be31e4 scsi: bfa: Use the proper data type for BLIST flags
Fix the following sparse warning:

drivers/scsi/bfa/bfad_bsg.c:2553:50: sparse: sparse: incorrect type in initializer (different base types)

Fixes: 2e5a6c3bac ("scsi: bfa: Convert bfad_reset_sdev_bflags() from a macro into a function")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202311031255.lmSPisIk-lkp@intel.com/
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20231115193338.2261972-1-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-24 19:23:39 -05:00
Tomas Henzl
6a965ee189 scsi: mpt3sas: Suppress a warning in debug kernel
The mpt3sas_ctl_exit() should be called after communication with the
controller stops but currently it may cause false warnings about not
released memory. Fix this by letting mpt3sas_ctl_exit() handle misc driver
release per driver and release DMA in mpt3sas_ctl_release() per ioc.

Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Link: https://lore.kernel.org/r/20231019153706.7967-1-thenzl@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 10:50:31 -05:00
Martin K. Petersen
2aee050cef Merge patch series "lpfc: Update lpfc to revision 14.2.0.16"
Justin Tee <justintee8345@gmail.com> says:

Update lpfc to revision 14.2.0.16

This patch set contains a user input range check correction, static
code analyzer fixes, refactoring of clean up code, and logging
enhancements.

The patches were cut against Martin's 6.7/scsi-queue tree.

Link: https://lore.kernel.org/r/20231031191224.150862-1-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 10:07:39 -05:00
Martin K. Petersen
b098cc463f Merge patch series "Replace deprecated strncpy() with strscpy()"
A series of patches from Justin Stitt.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 10:06:23 -05:00
Justin Tee
1f86b0d9c7 scsi: lpfc: Copyright updates for 14.2.0.16 patches
Update copyrights to 2023 for files modified in the 14.2.0.16 patch set.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231031191224.150862-10-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 09:52:58 -05:00
Justin Tee
c855e02b57 scsi: lpfc: Update lpfc version to 14.2.0.16
Update lpfc version to 14.2.0.16.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231031191224.150862-9-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 09:52:58 -05:00
Justin Tee
e6af452187 scsi: lpfc: Enhance driver logging for selected discovery events
Typically, debugging discovery issues requires the ndlp reference count,
nlp flags, transport flags, and the io tag for root cause analysis.

Modify important discovery log messages to include one or more of these
attributes to aid in debugging and support.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231031191224.150862-8-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 09:52:58 -05:00
Justin Tee
349b1e2c1b scsi: lpfc: Refactor and clean up mailbox command memory free
A lot of repeated clean up code exists when freeing mailbox commands in
lpfc_mem_free_all().

Introduce a lpfc_mem_free_sli_mbox() helper routine to refactor the
copy-paste code.  Additionally, reinitialize the mailbox command structure
context pointers to NULL in lpfc_sli4_mbox_cmd_free().

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231031191224.150862-7-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 09:52:58 -05:00
Justin Tee
57ea41eb7f scsi: lpfc: Return early in lpfc_poll_eratt() when the driver is unloading
Add a check in lpfc_poll_eratt() when the driver is unloading.  There is no
point to check for error attention events if the driver is rmmod'ed.

If the driver is reloaded, as part of insmod initialization, then a fresh
reset is always asserted to start clean and free of error attention events.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231031191224.150862-6-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 09:52:58 -05:00
Justin Tee
e07ac2d2aa scsi: lpfc: Eliminate unnecessary relocking in lpfc_check_nlp_post_devloss()
In lpfc_check_nlp_post_devloss(), retaking of the ndlp lock in the if
statement is useless because the very next line unlocks. Simply return to
avoid relocking.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231031191224.150862-5-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 09:52:57 -05:00
Justin Tee
1dec1311b9 scsi: lpfc: Fix list_entry null check warning in lpfc_cmpl_els_plogi()
Smatch called out a warning for null checking a ptr that is assigned by
list_entry(). list_entry() does not return null and, if the list is empty,
can return an invalid ptr. Thus, the !psrp check does not execute properly.

 drivers/scsi/lpfc/lpfc_els.c:2133 lpfc_cmpl_els_plogi()
 warn: list_entry() does not return NULL 'prsp'

Replace list_entry() with list_get_first(), which does a list_empty() check
before returning the first entry.

Fixes: a3c3c0a806 ("scsi: lpfc: Validate ELS LS_ACC completion payload")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/linux-scsi/01b7568f-4ab4-4d56-bfa6-9ecc5fc261fe@moroto.mountain/
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231031191224.150862-4-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 09:52:57 -05:00
Justin Tee
f5779b5292 scsi: lpfc: Fix possible file string name overflow when updating firmware
Because file_name and phba->ModelName are both declared a size 80 bytes,
the extra ".grp" file extension could cause an overflow into file_name.

Define a ELX_FW_NAME_SIZE macro with value 84.  84 incorporates the 4 extra
characters from ".grp".  file_name is changed to be declared as a char and
initialized to zeros i.e. null chars.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231031191224.150862-3-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 09:52:57 -05:00
Justin Tee
2fe4b6a677 scsi: lpfc: Correct maximum PCI function value for RAS fw logging
Currently, the ras_fwlog_func sysfs parameter allows users to input a value
greater than three when selecting a PCI function to enable RAS fw logging
feature.

The user's input is sanity checked in lpfc_sli4_ras_init(), but allowing an
input greater than three doesn't make sense because the max number of ports
per HBA is four.

Change the allowable range from [0, 7] to [0, 3].

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231031191224.150862-2-justintee8345@gmail.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 09:52:57 -05:00
Justin Stitt
1057f44137 scsi: elx: libefc: Replace deprecated strncpy() with strscpy_pad()/memcpy()
strncpy() is deprecated for use on NUL-terminated destination strings [1]
and as such we should prefer more robust and less ambiguous string
interfaces.

To keep node->current_state_name and node->prev_state_name NUL-padded and
NUL-terminated let's use strscpy_pad() as this implicitly provides both.

For the swap between the two, a simple memcpy() will suffice.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20231026-strncpy-drivers-scsi-elx-libefc-efc_node-h-v2-1-5c083d0c13f4@google.com
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 09:46:03 -05:00
Justin Stitt
4592411784 scsi: csiostor: Replace deprecated strncpy() with strscpy()
strncpy() is deprecated for use on NUL-terminated destination strings [1]
and as such we should prefer more robust and less ambiguous string
interfaces.

'hw' is kzalloc'd just before this string assignment:
|       hw = kzalloc(sizeof(struct csio_hw), GFP_KERNEL);

... which means any NUL-padding is redundant.

Since CSIO_DRV_VERSION is a small string literal (smaller than
sizeof(dest)):

... there is functionally no change in this swap from strncpy() to
strscpy(). Nonetheless, let's make the change for robustness' sake -- as
it will ensure that drv_version is _always_ NUL-terminated.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20231023-strncpy-drivers-scsi-csiostor-csio_init-c-v1-1-5ea445b56864@google.com
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 09:05:46 -05:00
Justin Stitt
dc7a7f10e6 scsi: ch: Replace deprecated strncpy() with strscpy()
strncpy() is deprecated for use on NUL-terminated destination strings [1]
and as such we should prefer more robust and less ambiguous string
interfaces.

These labels get copied out to the user so lets make sure they are
NUL-terminated and NUL-padded.

vparams is already memset to 0 so we don't need to do any NUL-padding (like
what strncpy() is doing).

Considering the above, a suitable replacement is strscpy() [2] due to the
fact that it guarantees NUL-termination on the destination buffer without
unnecessarily NUL-padding.

Let's also opt to use the more idiomatic strscpy() usage of: (dest, src,
sizeof(dest)) as this more closely ties the destination buffer to the
length.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20231023-strncpy-drivers-scsi-ch-c-v1-1-dc67ba8075a3@google.com
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 09:03:58 -05:00
Justin Stitt
b04a2eff9e scsi: bnx2fc: Replace deprecated strncpy() with strscpy()
strncpy() is deprecated for use on NUL-terminated destination strings [1]
and as such we should prefer more robust and less ambiguous string
interfaces.

We expect hba->chip_num to be NUL-terminated based on its usage with format
strings:

	snprintf(fc_host_symbolic_name(lport->host), 256,
		 "%s (QLogic %s) v%s over %s",
		BNX2FC_NAME, hba->chip_num, BNX2FC_VERSION,
		interface->netdev->name);

Moreover, NUL-padding is not required as hba is zero-allocated from its
callsite:

	hba = kzalloc(sizeof(*hba), GFP_KERNEL);

Considering the above, a suitable replacement is strscpy() [2] due to the
fact that it guarantees NUL-termination on the destination buffer without
unnecessarily NUL-padding.

Regarding stats_addr->version, I've opted to also use strscpy() instead of
strscpy_pad() as I typically see these XYZ_get_strings() pass
zero-allocated data. I couldn't track all of where bnx2fc_ulp_get_stats()
is used and if required, we could opt for strscpy_pad().

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20231023-strncpy-drivers-scsi-bnx2fc-bnx2fc_fcoe-c-v1-1-a3736943cde2@google.com
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 09:01:10 -05:00
Justin Stitt
7936a19e94 scsi: 3w-sas: Replace deprecated strncpy() with strscpy()
strncpy() is deprecated for use on NUL-terminated destination strings
[1] and as such we should prefer more robust and less ambiguous string
interfaces.

This pattern of strncpy(dest, src, strlen(src)) is extremely bug-prone.
This pattern basically never results in NUL-terminated destination
strings unless `dest` was zero-initialized. The current implementation
may be accidentally correct as tw_dev is zero-allocated via:

	host = scsi_host_alloc(&driver_template, sizeof(TW_Device_Extension));
        ...
	tw_dev = shost_priv(host);

... wherein scsi_host_alloc() zero-allocates host:

        shost = kzalloc(sizeof(struct Scsi_Host) + privsize, GFP_KERNEL);

Also, further suggesting this change is worthwhile is another strscpy()
usage in 3w-9xxx.c:

	strscpy(tw_dev->tw_compat_info.driver_version, TW_DRIVER_VERSION,
		sizeof(tw_dev->tw_compat_info.driver_version));

Considering the above, a suitable replacement is strscpy() [2] due to
the fact that it guarantees NUL-termination on the destination buffer
without unnecessarily NUL-padding.

Let's not be accidentally correct, let's be definitely correct.

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1]
Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2]
Link: https://github.com/KSPP/linux/issues/90
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Justin Stitt <justinstitt@google.com>
Link: https://lore.kernel.org/r/20231023-strncpy-drivers-scsi-3w-sas-c-v1-1-4c40a1e99dfc@google.com
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 08:58:26 -05:00
James Seo
e188215562 scsi: mpt3sas: Replace dynamic allocations with local variables
mpt3sas_scsih.c:_scsih_scan_for_devices_after_reset() allocates and fetches
a MPI2_CONFIG_PAGE_RAID_VOL_0 struct (Mpi2RaidVolPage0_t) and a
MPI2_CONFIG_PAGE_RAID_VOL_1 struct (Mpi2RaidVolPage1_t), but does not
include the terminal flexible array members in the struct size
calculations, fetch those members, or otherwise use those members in any
way.

These dynamic allocations can be replaced with local variables.

Signed-off-by: James Seo <james@equiv.tech>
Link: https://lore.kernel.org/r/20230806170604.16143-13-james@equiv.tech
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 08:52:03 -05:00
James Seo
dde41e0c1c scsi: mpt3sas: Replace a dynamic allocation with a local variable
mpt3sas_base.c:_base_update_diag_trigger_pages() allocates and fetches a
MPI2_CONFIG_PAGE_SASIOUNIT_1 struct (Mpi2SasIOUnitPage_t), but does not
include the terminal flexible array member in the struct size calculation,
fetch that member, or otherwise use that member in any way.

This dynamic allocation can be replaced with a local variable.

Signed-off-by: James Seo <james@equiv.tech>
Link: https://lore.kernel.org/r/20230806170604.16143-12-james@equiv.tech
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 08:52:02 -05:00
James Seo
e5035459d3 scsi: mpt3sas: Fix typo of "TRIGGER"
Change "TIGGER" to "TRIGGER" in struct names and typedefs.

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Seo <james@equiv.tech>
Link: https://lore.kernel.org/r/20230806170604.16143-11-james@equiv.tech
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 08:52:02 -05:00
James Seo
8a3db51e01 scsi: mpt3sas: Fix an outdated comment
May reduce confusion for users of MPI2_CONFIG_PAGE_IO_UNIT_3::GPIOVal[].

Fixes: a1c4d77413 ("scsi: mpt3sas: Replace unnecessary dynamic allocation with a static one")
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Seo <james@equiv.tech>
Link: https://lore.kernel.org/r/20230806170604.16143-10-james@equiv.tech
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 08:52:02 -05:00
James Seo
66f2a53fc6 scsi: mpt3sas: Remove the iounit_pg8 member of the per-adapter struct
The per-adapter struct (struct MPT3SAS_ADAPTER) contains a
MPI2_CONFIG_PAGE_IO_UNIT_8 (Mpi2IOUnitPage8_t) iounit_pg8 member that is
populated by mpt3sas_base.c:_base_static_config_pages().

As the name of that function indicates, the iounit_pg8 member represents a
static configuration page data structure that rarely changes, and is among
several such static config pages that are currently being fetched once per
adapter per init (or reset) and copied to the per-adapter struct for later
use.

However, unlike the other static config pages, the iounit_pg8 member is
never actually used outside of _base_static_config_pages(). Also,
Mpi2IOUnitPage8_t has a flexible array member, making its presence in the
_middle_ of the per-adapter struct rather strange.

Remove this member from the per-adapter struct and fix up the portion of
_base_static_config_pages() that uses it.

Signed-off-by: James Seo <james@equiv.tech>
Link: https://lore.kernel.org/r/20230806170604.16143-9-james@equiv.tech
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 08:52:02 -05:00
James Seo
f4f76e1417 scsi: mpt3sas: Use struct_size() for struct size calculations
After converting terminal variable arrays into flexible array members, use
the bounds-checking struct_size() helper when possible to avoid open-coded
arithmetic struct size calculations.

Signed-off-by: James Seo <james@equiv.tech>
Link: https://lore.kernel.org/r/20230806170604.16143-8-james@equiv.tech
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 08:52:02 -05:00
James Seo
1f11266099 scsi: mpt3sas: Make MPI26_CONFIG_PAGE_PIOUNIT_1::PhyData[] a flexible array
This terminal 1-length variable array can be directly converted into a C99
flexible array member.

As all users of MPI26_CONFIG_PAGE_PIOUNIT_1 (Mpi26PCIeIOUnitPage1_t) do not
use PhyData[], no further source changes are required to accommodate its
reduced sizeof():

 - mpt3sas_config.c:mpt3sas_config_get_pcie_iounit_pg1() fetches a
   Mpi26PCIeIOUnitPage1_t into a caller-provided buffer, and may fetch
   and write PhyData[] into that buffer depending on its sz argument.
   It has one caller:

   - mpt3sas_base.c:_base_assign_fw_reported_qd() passes
     sizeof(Mpi26PCIeIOUnitPage1_t) as sz, but does not use PhyData[].

Signed-off-by: James Seo <james@equiv.tech>
Link: https://lore.kernel.org/r/20230806170604.16143-7-james@equiv.tech
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 08:52:02 -05:00
James Seo
e249a957ce scsi: mpt3sas: Make MPI2_CONFIG_PAGE_SASIOUNIT_1::PhyData[] a flexible array
This terminal 1-length variable array can be directly converted into a C99
flexible array member.

As all users of MPI2_CONFIG_PAGE_SASIOUNIT_1 (Mpi2SasIOUnitPage1_t) either
calculate its size without depending on its sizeof() or do not use
PhyData[], no further source changes are required:

 - mpt3sas_config.c:mpt3sas_config_get_sas_iounit_pg1() fetches a
   Mpi2SasIOUnitPage1_t into a caller-provided buffer, and may fetch and
   write PhyData[] into that buffer depending on its sz argument.  Its
   callers:

   - mpt3sas_base.c:_base_assign_fw_reported_qd() passes
     sizeof(Mpi2SasIOUnitPage1_t) as sz, but does not use PhyData[].

   - mpt3sas_base.c:mpt3sas_base_update_missing_delay(),
     mpt3sas_scsih.c:_scsih_sas_host_add(),
     mpt3sas_transport.c:_transport_phy_enable(), and
     mpt3sas_transport.c:_transport_phy_speed() all calculate sz
     independently of sizeof(Mpi2SasIOUnitPage1_t) and allocate a
     suitable buffer before calling mpt3sas_config_get_sas_iounit_pg1()
     and using PhyData[].

 - mpt3sas_config.c:mpt3sas_config_set_sas_iounit_pg1() writes the contents
   of a caller-provided buffer to the adapter, with the size of the write
   depending on its sz argument. Its callers:

   - mpt3sas_base.c:mpt3sas_base_update_missing_delay(),
     mpt3sas_transport.c:_transport_phy_enable(), and
     mpt3sas_transport.c:_transport_phy_speed() have all previously
     called mpt3sas_config_get_sas_iounit_pg1() to obtain a
     Mpi2SasIOUnitPage1_t, and are merely writing back this same
     struct with the same previously calculated sz.

Signed-off-by: James Seo <james@equiv.tech>
Link: https://lore.kernel.org/r/20230806170604.16143-6-james@equiv.tech
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 08:52:02 -05:00
James Seo
dccc1e3ed9 scsi: mpt3sas: Make MPI2_CONFIG_PAGE_SASIOUNIT_0::PhyData[] a flexible array
This terminal 1-length variable array can be directly converted into a C99
flexible array member.

As all users of MPI2_CONFIG_PAGE_SASIOUNIT_0 (Mpi2SasIOUnitPage0_t) either
calculate its size without depending on its sizeof() or do not use
PhyData[], no further source changes are required:

 - mpt3sas_config.c:mpt3sas_config_get_number_hba_phys() fetches a
   Mpi2SasIOUnitPage0_t for itself, but does not use PhyData[].

 - mpt3sas_config.c:mpt3sas_config_get_sas_iounit_pg0() fetches a
   Mpi2SasIOUnitPage0_t into a caller-provided buffer, and may fetch and
   write PhyData[] into that buffer depending on its sz argument.  Its
   callers:

   - mpt3sas_scsih.c:_scsih_update_vphys_after_reset(),
     mpt3sas_scsih.c:_scsih_get_port_table_after_reset(),
     mpt3sas_scsih.c:_scsih_sas_host_refresh(),
     mpt3sas_scsih.c:_scsih_sas_host_add(), and
     mpt3sas_transport.c:_transport_phy_enable() all calculate sz
     independently of sizeof(Mpi2SasIOUnitPage0_t) and allocate a
     suitable buffer before calling mpt3sas_config_get_sas_iounit_pg0()
     and using PhyData[].

Signed-off-by: James Seo <james@equiv.tech>
Link: https://lore.kernel.org/r/20230806170604.16143-5-james@equiv.tech
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 08:52:02 -05:00
James Seo
cb7c03c5d3 scsi: mpt3sas: Make MPI2_CONFIG_PAGE_RAID_VOL_0::PhysDisk[] a flexible array
This terminal 1-length variable array can be directly converted into a C99
flexible array member.

As all users of MPI2_CONFIG_PAGE_RAID_VOL_0 (Mpi2RaidVolPage0_t) either
calculate its size without depending on its sizeof() or do not use
PhysDisk[], no further source changes are required:

 - mpt3sas_config.c:mpt3sas_config_get_number_pds() fetches a
   Mpi2RaidVolPage0_t for itself, but does not use PhysDisk[].

 - mpt3sas_config.c:mpt3sas_config_get_raid_volume_pg0() fetches a
   Mpi2RaidVolPage0_t into a caller-provided buffer, and may fetch and
   write PhysDisk[] into that buffer depending on its sz argument.  Its
   callers:

   - mpt3sas_scsih.c:scsih_get_resync(),
     mpt3sas_scsih.c:scsih_get_state(),
     mpt3sas_scsih.c:_scsih_search_responding_raid_devices(), and
     mpt3sas_scsih.c:_scsih_scan_for_devices_after_reset() all pass
     sizeof(Mpi2RaidVolPage0_t) as sz, but do not use PhysDisk[].

   - mpt3sas_scsih.c:_scsih_get_volume_capabilities() and
     mpt3sas_warpdrive.c:mpt3sas_init_warpdrive_properties()
     both calculate sz independently of sizeof(Mpi2RaidVolPage0_t)
     and allocate a suitable buffer before calling
     mpt3sas_config_get_raid_volume_pg0() and using PhysDisk[].

Signed-off-by: James Seo <james@equiv.tech>
Link: https://lore.kernel.org/r/20230806170604.16143-4-james@equiv.tech
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 08:52:02 -05:00
James Seo
f7830af68e scsi: mpt3sas: Make MPI2_CONFIG_PAGE_IO_UNIT_8::Sensor[] a flexible array
This terminal 1-length variable array can be directly converted into a C99
flexible array member.

As all users of MPI2_CONFIG_PAGE_IO_UNIT_8 (Mpi2IOUnitPage8_t) do not use
Sensor[], no further source changes are required to accommodate its reduced
sizeof():

 - mpt3sas_config.c:mpt3sas_config_get_iounit_pg8() fetches a
   Mpi2IOUnitPage8_t into a caller-provided buffer, assuming
   sizeof(Mpi2IOUnitPage8_t) as the buffer size. It has one caller:

   - mpt3sas_base.c:_base_static_config_pages() passes the address of the
     Mpi2IOUnitPage8_t iounit_pg8 member of the per-adapter struct (struct
     MPT3SAS_ADAPTER *ioc) as the buffer. The assumed buffer size is
     therefore correct.

     However, the only subsequent use in mpt3sas of the thus populated
     ioc->iounit_pg8 is a little further on in the same function, and this
     use does not involve ioc->iounit_pg8.Sensor[].

     Note that iounit_pg8 occurs in the middle of the per-adapter struct,
     not at the end. The per-adapter struct is extensively used throughout
     mpt3sas even if its iounit_pg8 member isn't, resulting in an
     especially large amount of noise when comparing binary changes
     attributable to this commit.

Signed-off-by: James Seo <james@equiv.tech>
Link: https://lore.kernel.org/r/20230806170604.16143-3-james@equiv.tech
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 08:52:01 -05:00
James Seo
aa4db51bbd scsi: mpt3sas: Use flexible arrays when obviously possible
These terminal 1-length variable arrays can be directly converted into C99
flexible array members without any binary changes.

In most cases, they belong to unused structs, or to structs used only by
unused code. The remaining few coincidentally have their sizes calculated
in roundabout ways that do not depend on the sizeof() their structs.

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Seo <james@equiv.tech>
Link: https://lore.kernel.org/r/20230806170604.16143-2-james@equiv.tech
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-15 08:52:01 -05:00
Martin K. Petersen
2a0508d9d0 Merge branch '6.7/scsi-staging' into 6.7/scsi-fixes
Pull in queued fixes for 6.7

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-14 11:40:40 -05:00
Mike Christie
3b83486399 scsi: sd: Fix sshdr use in sd_suspend_common()
If scsi_execute_cmd() returns < 0, it doesn't initialize the sshdr, so we
shouldn't access the sshdr. If it returns 0, then the cmd executed
successfully, so there is no need to check the sshdr. sd_sync_cache() will
only access the sshdr if it's been setup because it calls
scsi_status_is_check_condition() before accessing it. However, the
sd_sync_cache() caller, sd_suspend_common(), does not check.

sd_suspend_common() is only checking for ILLEGAL_REQUEST which it's using
to determine if the command is supported. If it's not it just ignores the
error. So to fix its sshdr use this patch just moves that check to
sd_sync_cache() where it converts ILLEGAL_REQUEST to success/0.
sd_suspend_common() was ignoring that error and sd_shutdown() doesn't check
for errors so there will be no behavior changes.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20231106231304.5694-2-michael.christie@oracle.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-08 21:46:58 -05:00
Dan Carpenter
037fbd3fcf scsi: scsi_debug: Delete some bogus error checking
Smatch complains that "dentry" is never initialized.  These days everyone
initializes all their stack variables to zero so this means that it will
trigger a warning every time this function is run.

Really, debugfs functions are not supposed to be checked for errors in
normal code.  For example, if we updated this code to check the correct
variable then it would print a warning if CONFIG_DEBUGFS was disabled.  We
don't want that.  Just delete the check.

Fixes: f084fe52c6 ("scsi: scsi_debug: Add debugfs interface to fail target reset")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/c602c9ad-5e35-4e18-a47f-87ed956a9ec2@moroto.mountain
Reviewed-by: Wenchao Hao <haowenchao2@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-08 21:42:26 -05:00
Dan Carpenter
860c3d03bb scsi: scsi_debug: Fix some bugs in sdebug_error_write()
There are two bug in this code:

 1) If count is zero, then it will lead to a NULL dereference.  The
    kmalloc() will successfully allocate zero bytes and the test for "if
    (buf[0] == '-')" will read beyond the end of the zero size buffer and
    Oops.

 2) The code does not ensure that the user's string is properly NUL
    terminated which could lead to a read overflow.

Fixes: a9996d722b ("scsi: scsi_debug: Add interface to manage error injection for a single device")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/7733643d-e102-4581-8d29-769472011c97@moroto.mountain
Reviewed-by: Wenchao Hao <haowenchao2@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-08 21:42:26 -05:00
Quinn Tran
19597cad64 scsi: qla2xxx: Fix system crash due to bad pointer access
User experiences system crash when running AER error injection.  The
perturbation causes the abort-all-I/O path to trigger. The driver assumes
all I/O on this path is FCP only. If there is both NVMe & FCP traffic, a
system crash happens. Add additional check to see if I/O is FCP or not
before access.

PID: 999019  TASK: ff35d769f24722c0  CPU: 53  COMMAND: "kworker/53:1"
 0 [ff3f78b964847b58] machine_kexec at ffffffffae86973d
 1 [ff3f78b964847ba8] __crash_kexec at ffffffffae9be29d
 2 [ff3f78b964847c70] crash_kexec at ffffffffae9bf528
 3 [ff3f78b964847c78] oops_end at ffffffffae8282ab
 4 [ff3f78b964847c98] exc_page_fault at ffffffffaf2da502
 5 [ff3f78b964847cc0] asm_exc_page_fault at ffffffffaf400b62
   [exception RIP: qla2x00_abort_srb+444]
   RIP: ffffffffc07b5f8c  RSP: ff3f78b964847d78  RFLAGS: 00010046
   RAX: 0000000000000282  RBX: ff35d74a0195a200  RCX: ff35d76886fd03a0
   RDX: 0000000000000001  RSI: ffffffffc07c5ec8  RDI: ff35d74a0195a200
   RBP: ff35d76913d22080   R8: ff35d7694d103200   R9: ff35d7694d103200
   R10: 0000000100000000  R11: ffffffffb05d6630  R12: 0000000000010000
   R13: ff3f78b964847df8  R14: ff35d768d8754000  R15: ff35d768877248e0
   ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 6 [ff3f78b964847d70] qla2x00_abort_srb at ffffffffc07b5f84 [qla2xxx]
 7 [ff3f78b964847de0] __qla2x00_abort_all_cmds at ffffffffc07b6238 [qla2xxx]
 8 [ff3f78b964847e38] qla2x00_abort_all_cmds at ffffffffc07ba635 [qla2xxx]
 9 [ff3f78b964847e58] qla2x00_terminate_rport_io at ffffffffc08145eb [qla2xxx]
10 [ff3f78b964847e70] fc_terminate_rport_io at ffffffffc045987e [scsi_transport_fc]
11 [ff3f78b964847e88] process_one_work at ffffffffae914f15
12 [ff3f78b964847ed0] worker_thread at ffffffffae9154c0
13 [ff3f78b964847f10] kthread at ffffffffae91c456
14 [ff3f78b964847f50] ret_from_fork at ffffffffae8036ef

Cc: stable@vger.kernel.org
Fixes: f45bca8c50 ("scsi: qla2xxx: Fix double scsi_done for abort path")
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20231030064912.37912-1-njavali@marvell.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-11-08 21:32:41 -05:00
Linus Torvalds
8f6f76a6a2 As usual, lots of singleton and doubleton patches all over the tree and
there's little I can say which isn't in the individual changelogs.
 
 The lengthier patch series are
 
 - "kdump: use generic functions to simplify crashkernel reservation in
   arch", from Baoquan He.  This is mainly cleanups and consolidation of
   the "crashkernel=" kernel parameter handling.
 
 - After much discussion, David Laight's "minmax: Relax type checks in
   min() and max()" is here.  Hopefully reduces some typecasting and the
   use of min_t() and max_t().
 
 - A group of patches from Oleg Nesterov which clean up and slightly fix
   our handling of reads from /proc/PID/task/...  and which remove
   task_struct.therad_group.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZUQP9wAKCRDdBJ7gKXxA
 jmOAAQDh8sxagQYocoVsSm28ICqXFeaY9Co1jzBIDdNesAvYVwD/c2DHRqJHEiS4
 63BNcG3+hM9nwGJHb5lyh5m79nBMRg0=
 =On4u
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2023-11-02-14-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:
 "As usual, lots of singleton and doubleton patches all over the tree
  and there's little I can say which isn't in the individual changelogs.

  The lengthier patch series are

   - 'kdump: use generic functions to simplify crashkernel reservation
     in arch', from Baoquan He. This is mainly cleanups and
     consolidation of the 'crashkernel=' kernel parameter handling

   - After much discussion, David Laight's 'minmax: Relax type checks in
     min() and max()' is here. Hopefully reduces some typecasting and
     the use of min_t() and max_t()

   - A group of patches from Oleg Nesterov which clean up and slightly
     fix our handling of reads from /proc/PID/task/... and which remove
     task_struct.thread_group"

* tag 'mm-nonmm-stable-2023-11-02-14-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (64 commits)
  scripts/gdb/vmalloc: disable on no-MMU
  scripts/gdb: fix usage of MOD_TEXT not defined when CONFIG_MODULES=n
  .mailmap: add address mapping for Tomeu Vizoso
  mailmap: update email address for Claudiu Beznea
  tools/testing/selftests/mm/run_vmtests.sh: lower the ptrace permissions
  .mailmap: map Benjamin Poirier's address
  scripts/gdb: add lx_current support for riscv
  ocfs2: fix a spelling typo in comment
  proc: test ProtectionKey in proc-empty-vm test
  proc: fix proc-empty-vm test with vsyscall
  fs/proc/base.c: remove unneeded semicolon
  do_io_accounting: use sig->stats_lock
  do_io_accounting: use __for_each_thread()
  ocfs2: replace BUG_ON() at ocfs2_num_free_extents() with ocfs2_error()
  ocfs2: fix a typo in a comment
  scripts/show_delta: add __main__ judgement before main code
  treewide: mark stuff as __ro_after_init
  fs: ocfs2: check status values
  proc: test /proc/${pid}/statm
  compiler.h: move __is_constexpr() to compiler.h
  ...
2023-11-02 20:53:31 -10:00
Linus Torvalds
6ed92e559a SCSI misc on 20231102
Updates to the usual drivers (ufs, megaraid_sas, lpfc, target, ibmvfc,
 scsi_debug) plus the usual assorted minor fixes and updates.  The
 major change this time around is a prep patch for rethreading of the
 driver reset handler API not to take a scsi_cmd structure which starts
 to reduce various drivers' dependence on scsi_cmd in error handling.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZUORLiYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishQ4WAQDDIhzp
 /PiJBBtt0U9ii/lYqRLrOVnN0extKEgEGO+FbwEAssKgs+5Jn/7XCgdpSrx8Co3/
 0cPXrZGxs7tFpFWLZjM=
 =AlRU
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "Updates to the usual drivers (ufs, megaraid_sas, lpfc, target, ibmvfc,
  scsi_debug) plus the usual assorted minor fixes and updates.

  The major change this time around is a prep patch for rethreading of
  the driver reset handler API not to take a scsi_cmd structure which
  starts to reduce various drivers' dependence on scsi_cmd in error
  handling"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (132 commits)
  scsi: ufs: core: Leave space for '\0' in utf8 desc string
  scsi: ufs: core: Conversion to bool not necessary
  scsi: ufs: core: Fix race between force complete and ISR
  scsi: megaraid: Fix up debug message in megaraid_abort_and_reset()
  scsi: aic79xx: Fix up NULL command in ahd_done()
  scsi: message: fusion: Initialize return value in mptfc_bus_reset()
  scsi: mpt3sas: Fix loop logic
  scsi: snic: Remove useless code in snic_dr_clean_pending_req()
  scsi: core: Add comment to target_destroy in scsi_host_template
  scsi: core: Clean up scsi_dev_queue_ready()
  scsi: pmcraid: Add missing scsi_device_put() in pmcraid_eh_target_reset_handler()
  scsi: target: core: Fix kernel-doc comment
  scsi: pmcraid: Fix kernel-doc comment
  scsi: core: Handle depopulation and restoration in progress
  scsi: ufs: core: Add support for parsing OPP
  scsi: ufs: core: Add OPP support for scaling clocks and regulators
  scsi: ufs: dt-bindings: common: Add OPP table
  scsi: scsi_debug: Add param to control sdev's allow_restart
  scsi: scsi_debug: Add debugfs interface to fail target reset
  scsi: scsi_debug: Add new error injection type: Reset LUN failed
  ...
2023-11-02 15:13:50 -10:00
Linus Torvalds
27beb3ca34 pci-v6.7-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmVBaU8UHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vwEdxAAo++s98+ZaaTdUuoV0Zpft1fuY6Yr
 mR80jUDxjHDbcI1G4iNVUSWG6pGIdlURnrBp5kU74FV9R2Ps3Fl49XQUHowE0HfH
 D/qmihiJQdnMsQKwzw3XGoTSINrDcF6nLafl9brBItVkgjNxfxSEbnweJMBf+Boc
 rpRXHzxbVHVjwwhBLODF2Wt/8sQ24w9c+wcQkpo7im8ZZReoigNMKgEa4J7tLlqA
 vTyPR/K6QeU8IBUk2ObCY3GeYrVuqi82eRK3Uwzu7IkQwA9orE416Okvq3Z026/h
 TUAivtrcygHaFRdGNvzspYLbc2hd2sEXF+KKKb6GNAjxuDWUhVQW4ObY4FgFkZ65
 Gqz/05D6c1dqTS3vTxp3nZYpvPEbNnO1RaGRL4h0/mbU+QSPSlHXWd9Lfg6noVVd
 3O+CcstQK8RzMiiWLeyctRPV5XIf7nGVQTJW5aCLajlHeJWcvygNpNG4N57j/hXQ
 gyEHrz3idXXHXkBKmyWZfre6YpLkxZtKyONZDHWI/AVhU0TgRdJWmqpRfC1kVVUe
 IUWBRcPUF4/r3jEu6t10N/aDWQN1uQzIsJNnCrKzAddPDTTYQJk8VVzKPo8SVxPD
 X+OjEMgBB/fXUfkJ7IMwgYnWaFJhxthrs6/3j1UqRvGYRoulE4NdWwJDky9UYIHd
 qV3dzuAxC/cpv08=
 =G//C
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.7-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull pci updates from Bjorn Helgaas:
 "Enumeration:

   - Use acpi_evaluate_dsm_typed() instead of open-coding _DSM
     evaluation to learn device characteristics (Andy Shevchenko)

   - Tidy multi-function header checks using new PCI_HEADER_TYPE_MASK
     definition (Ilpo Järvinen)

   - Simplify config access error checking in various drivers (Ilpo
     Järvinen)

   - Use pcie_capability_clear_word() (not
     pcie_capability_clear_and_set_word()) when only clearing (Ilpo
     Järvinen)

   - Add pci_get_base_class() to simplify finding devices using base
     class only (ignoring subclass and programming interface) (Sui
     Jingfeng)

   - Add pci_is_vga(), which includes ancient PCI_CLASS_NOT_DEFINED_VGA
     devices from before the Class Code was added to PCI (Sui Jingfeng)

   - Use pci_is_vga() for vgaarb, sysfs "boot_vga", virtio, qxl to
     include ancient VGA devices (Sui Jingfeng)

  Resource management:

   - Make pci_assign_unassigned_resources() non-init because sparc uses
     it after init (Randy Dunlap)

  Driver binding:

   - Retain .remove() and .probe() callbacks (previously __init) because
     sysfs may cause them to be called later (Uwe Kleine-König)

   - Prevent xHCI driver from claiming AMD VanGogh USB3 DRD device, so
     it can be claimed by dwc3 instead (Vicki Pfau)

  PCI device hotplug:

   - Add Ampere Altra Attention Indicator extension driver for acpiphp
     (D Scott Phillips)

  Power management:

   - Quirk VideoPropulsion Torrent QN16e with longer delay after reset
     (Lukas Wunner)

   - Prevent users from overriding drivers that say we shouldn't use
     D3cold (Lukas Wunner)

   - Avoid PME from D3hot/D3cold for AMD Rembrandt and Phoenix USB4
     because wakeup interrupts from those states don't work if amd-pmc
     has put the platform in a hardware sleep state (Mario Limonciello)

  IOMMU:

   - Disable ATS for Intel IPU E2000 devices with invalidation message
     endianness erratum (Bartosz Pawlowski)

  Error handling:

   - Factor out interrupt enable/disable into helpers (Kai-Heng Feng)

  Peer-to-peer DMA:

   - Fix flexible-array usage in struct pci_p2pdma_pagemap in case we
     ever use pagemaps with multiple entries (Gustavo A. R. Silva)

  ASPM:

   - Revert a change that broke when drivers disabled L1 and users later
     enabled an L1.x substate via sysfs, and fix a similar issue when
     users disabled L1 via sysfs (Heiner Kallweit)

  Endpoint framework:

   - Fix double free in __pci_epc_create() (Dan Carpenter)

   - Use IS_ERR_OR_NULL() to simplify endpoint core (Ruan Jinjie)

  Cadence PCIe controller driver:

   - Drop unused "is_rc" member (Li Chen)

  Freescale Layerscape PCIe controller driver:

   - Enable 64-bit addressing in endpoint mode (Guanhua Gao)

  Intel VMD host bridge driver:

   - Fix multi-function header check (Ilpo Järvinen)

  Microsoft Hyper-V host bridge driver:

   - Annotate struct hv_dr_state with __counted_by (Kees Cook)

  NVIDIA Tegra194 PCIe controller driver:

   - Drop setting of LNKCAP_MLW (max link width) since dw_pcie_setup()
     already does this via dw_pcie_link_set_max_link_width() (Yoshihiro
     Shimoda)

  Qualcomm PCIe controller driver:

   - Use PCIE_SPEED2MBS_ENC() to simplify encoding of link speed
     (Manivannan Sadhasivam)

   - Add a .write_dbi2() callback so DBI2 register writes, e.g., for
     setting the BAR size, work correctly (Manivannan Sadhasivam)

   - Enable ASPM for platforms that use 1.9.0 ops, because the PCI core
     doesn't enable ASPM states that haven't been enabled by the
     firmware (Manivannan Sadhasivam)

  Renesas R-Car Gen4 PCIe controller driver:

   - Add DesignWare core support (set max link width, EDMA_UNROLL flag,
     .pre_init(), .deinit(), etc) for use by R-Car Gen4 driver
     (Yoshihiro Shimoda)

   - Add driver and DT schema for DesignWare-based Renesas R-Car Gen4
     controller in both host and endpoint mode (Yoshihiro Shimoda)

  Xilinx NWL PCIe controller driver:

   - Update ECAM size to support 256 buses (Thippeswamy Havalige)

   - Stop setting bridge primary/secondary/subordinate bus numbers,
     since PCI core does this (Thippeswamy Havalige)

  Xilinx XDMA controller driver:

   - Add driver and DT schema for Zynq UltraScale+ MPSoCs devices with
     Xilinx XDMA Soft IP (Thippeswamy Havalige)

  Miscellaneous:

   - Use FIELD_GET()/FIELD_PREP() to simplify and reduce use of _SHIFT
     macros (Ilpo Järvinen, Bjorn Helgaas)

   - Remove logic_outb(), _outw(), outl() duplicate declarations (John
     Sanpe)

   - Replace unnecessary UTF-8 in Kconfig help text because menuconfig
     doesn't render it correctly (Liu Song)"

* tag 'pci-v6.7-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (102 commits)
  PCI: qcom-ep: Add dedicated callback for writing to DBI2 registers
  PCI: Simplify pcie_capability_clear_and_set_word() to ..._clear_word()
  PCI: endpoint: Fix double free in __pci_epc_create()
  PCI: xilinx-xdma: Add Xilinx XDMA Root Port driver
  dt-bindings: PCI: xilinx-xdma: Add schemas for Xilinx XDMA PCIe Root Port Bridge
  PCI: xilinx-cpm: Move IRQ definitions to a common header
  PCI: xilinx-nwl: Modify ECAM size to enable support for 256 buses
  PCI: xilinx-nwl: Rename the NWL_ECAM_VALUE_DEFAULT macro
  dt-bindings: PCI: xilinx-nwl: Modify ECAM size in the DT example
  PCI: xilinx-nwl: Remove redundant code that sets Type 1 header fields
  PCI: hotplug: Add Ampere Altra Attention Indicator extension driver
  PCI/AER: Factor out interrupt toggling into helpers
  PCI: acpiphp: Allow built-in drivers for Attention Indicators
  PCI/portdrv: Use FIELD_GET()
  PCI/VC: Use FIELD_GET()
  PCI/PTM: Use FIELD_GET()
  PCI/PME: Use FIELD_GET()
  PCI/ATS: Use FIELD_GET()
  PCI/ATS: Show PASID Capability register width in bitmasks
  PCI/ASPM: Fix L1 substate handling in aspm_attr_store_common()
  ...
2023-11-02 14:05:18 -10:00
Linus Torvalds
426ee5196d sysctl-6.7-rc1
To help make the move of sysctls out of kernel/sysctl.c not incur a size
 penalty sysctl has been changed to allow us to not require the sentinel, the
 final empty element on the sysctl array. Joel Granados has been doing all this
 work. On the v6.6 kernel we got the major infrastructure changes required to
 support this. For v6.7-rc1 we have all arch/ and drivers/ modified to remove
 the sentinel. Both arch and driver changes have been on linux-next for a bit
 less than a month. It is worth re-iterating the value:
 
   - this helps reduce the overall build time size of the kernel and run time
      memory consumed by the kernel by about ~64 bytes per array
   - the extra 64-byte penalty is no longer inncurred now when we move sysctls
     out from kernel/sysctl.c to their own files
 
 For v6.8-rc1 expect removal of all the sentinels and also then the unneeded
 check for procname == NULL.
 
 The last 2 patches are fixes recently merged by Krister Johansen which allow
 us again to use softlockup_panic early on boot. This used to work but the
 alias work broke it. This is useful for folks who want to detect softlockups
 super early rather than wait and spend money on cloud solutions with nothing
 but an eventual hung kernel. Although this hadn't gone through linux-next it's
 also a stable fix, so we might as well roll through the fixes now.
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmVCqKsSHG1jZ3JvZkBr
 ZXJuZWwub3JnAAoJEM4jHQowkoinEgYQAIpkqRL85DBwems19Uk9A27lkctwZ6Fc
 HdslQCObQTsbuKVimZFP4IL2beUfUE0cfLZCXlzp+4nRDOf6vyhyf3w19jPQtI0Q
 YdqwTk9y6G5VjDsb35QK0+UBloY/kZ1H3/LW4uCwjXTuksUGmWW2Qvey35696Scv
 hDMLADqKQmdpYxLUaNi9QyYbEAjYtOai2ezg3+i7hTG168t1k/Ab2BxIFrPVsCR2
 FAiq05L4ugWjNskdsWBjck05JZsx9SK/qcAxpIPoUm4nGiFNHApXE0E0hs3vsnmn
 WIHIbxCQw8ZlUDlmw4S+0YH3NFFzFbWfmW8k2b0f2qZTJm/rU4KiJfcJVknkAUVF
 raFox6XDW0AUQ9L/NOUJ9ip5rup57GcFrMYocdJ3PPAvvmHKOb1D1O741p75RRcc
 9j7zwfIRrzjPUqzhsQS/GFjdJu3lJNmEBK1AcgrVry6WoItrAzJHKPPDC7TwaNmD
 eXpjxMl1sYzzHqtVh4hn+xkUYphj/6gTGMV8zdo+/FopFswgeJW9G8kHtlEWKDPk
 MRIKwACmfetP6f3ngHunBg+BOipbjCANL7JI0nOhVOQoaULxCCPx+IPJ6GfSyiuH
 AbcjH8DGI7fJbUkBFoF0dsRFZ2gH8ds1PYMbWUJ6x3FtuCuv5iIuvQYoaWU6itm7
 6f0KvCogg0fU
 =Qf50
 -----END PGP SIGNATURE-----

Merge tag 'sysctl-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull sysctl updates from Luis Chamberlain:
 "To help make the move of sysctls out of kernel/sysctl.c not incur a
  size penalty sysctl has been changed to allow us to not require the
  sentinel, the final empty element on the sysctl array. Joel Granados
  has been doing all this work. On the v6.6 kernel we got the major
  infrastructure changes required to support this. For v6.7-rc1 we have
  all arch/ and drivers/ modified to remove the sentinel. Both arch and
  driver changes have been on linux-next for a bit less than a month. It
  is worth re-iterating the value:

   - this helps reduce the overall build time size of the kernel and run
     time memory consumed by the kernel by about ~64 bytes per array

   - the extra 64-byte penalty is no longer inncurred now when we move
     sysctls out from kernel/sysctl.c to their own files

  For v6.8-rc1 expect removal of all the sentinels and also then the
  unneeded check for procname == NULL.

  The last two patches are fixes recently merged by Krister Johansen
  which allow us again to use softlockup_panic early on boot. This used
  to work but the alias work broke it. This is useful for folks who want
  to detect softlockups super early rather than wait and spend money on
  cloud solutions with nothing but an eventual hung kernel. Although
  this hadn't gone through linux-next it's also a stable fix, so we
  might as well roll through the fixes now"

* tag 'sysctl-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux: (23 commits)
  watchdog: move softlockup_panic back to early_param
  proc: sysctl: prevent aliased sysctls from getting passed to init
  intel drm: Remove now superfluous sentinel element from ctl_table array
  Drivers: hv: Remove now superfluous sentinel element from ctl_table array
  raid: Remove now superfluous sentinel element from ctl_table array
  fw loader: Remove the now superfluous sentinel element from ctl_table array
  sgi-xp: Remove the now superfluous sentinel element from ctl_table array
  vrf: Remove the now superfluous sentinel element from ctl_table array
  char-misc: Remove the now superfluous sentinel element from ctl_table array
  infiniband: Remove the now superfluous sentinel element from ctl_table array
  macintosh: Remove the now superfluous sentinel element from ctl_table array
  parport: Remove the now superfluous sentinel element from ctl_table array
  scsi: Remove now superfluous sentinel element from ctl_table array
  tty: Remove now superfluous sentinel element from ctl_table array
  xen: Remove now superfluous sentinel element from ctl_table array
  hpet: Remove now superfluous sentinel element from ctl_table array
  c-sky: Remove now superfluous sentinel element from ctl_talbe array
  powerpc: Remove now superfluous sentinel element from ctl_table arrays
  riscv: Remove now superfluous sentinel element from ctl_table array
  x86/vdso: Remove now superfluous sentinel element from ctl_table array
  ...
2023-11-01 20:51:41 -10:00
Linus Torvalds
39714efc23 ATA changes for 6.7-rc1
- Modify the AHCI driver to print the link power management policy used
    on scan, to help with debugging issues (Niklas).
 
  - Add support for the ASM2116 series adapters to the AHCI driver
    (Szuying).
 
  - Prepare libata for the coming gcc and Clang __counted_by attribute
    (Kees).
 
  - Following the recent estensive fixing of libata suspend/resume
    handling, several patches further cleanup and improve disk power state
    management (from me).
 
  - Reduce the verbosity of some error messages for non-fatal temporary
    errors, e.g. slow response to device reset when scanning a port, and
    warning messages that are in fact normal, e.g. disabling a device
    on suspend or when removing it (from me).
 
  - Cleanup DMA helper functions (from me).
 
  - Fix sata_mv drive handling of potential errors durring probe (Ma).
 
  - Cleanup the xgene and imx drivers using the functions
    of_device_get_match_data() and device_get_match_data() (Rob).
 
  - Improve the tegra driver device tree (Rob).
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSRPv8tYSvhwAzJdzjdoc3SxdoYdgUCZT9PtwAKCRDdoc3SxdoY
 diCGAQCbSOB3PCpsm1cCMevZeEbt+zUSx+sj6twR12RohtQK6wEA1rEytQllZyqa
 Pk/jIH2UYZaf+lVuV3Vbey4rwDYXTQs=
 =V53T
 -----END PGP SIGNATURE-----

Merge tag 'ata-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata

Pull ATA updates from Damien Le Moal:

 - Modify the AHCI driver to print the link power management policy used
   on scan, to help with debugging issues (Niklas)

 - Add support for the ASM2116 series adapters to the AHCI driver
   (Szuying)

 - Prepare libata for the coming gcc and Clang __counted_by attribute
   (Kees)

 - Following the recent estensive fixing of libata suspend/resume
   handling, several patches further cleanup and improve disk power
   state management (me)

 - Reduce the verbosity of some error messages for non-fatal temporary
   errors, e.g. slow response to device reset when scanning a port, and
   warning messages that are in fact normal, e.g. disabling a device on
   suspend or when removing it (me)

 - Cleanup DMA helper functions (me)

 - Fix sata_mv drive handling of potential errors durring probe (Ma)

 - Cleanup the xgene and imx drivers using the functions
   of_device_get_match_data() and device_get_match_data() (Rob)

 - Improve the tegra driver device tree (Rob)

* tag 'ata-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: (22 commits)
  dt-bindings: ata: tegra: Disallow undefined properties
  ata: libata-core: Improve ata_dev_power_set_active()
  ata: libata-eh: Spinup disk on resume after revalidation
  ata: imx: Use device_get_match_data()
  ata: xgene: Use of_device_get_match_data()
  ata: sata_mv: aspeed: fix value check in mv_platform_probe()
  ata: ahci: Add Intel Alder Lake-P AHCI controller to low power chipsets list
  ata: libata: Cleanup inline DMA helper functions
  ata: libata-eh: Reduce "disable device" message verbosity
  ata: libata-eh: Improve reset error messages
  ata: libata-sata: Improve ata_sas_slave_configure()
  ata: libata-core: Do not resume runtime suspended ports
  ata: libata-core: Do not poweroff runtime suspended ports
  ata: libata-core: Remove ata_port_resume_async()
  ata: libata-core: Remove ata_port_suspend_async()
  ata: libata-core: Detach a port devices on shutdown
  ata: libata-core: Synchronize ata_port_detach() with hotplug
  ata: libata-scsi: Cleanup ata_scsi_start_stop_xlat()
  scsi: Remove scsi device no_start_on_resume flag
  ata: libata: Annotate struct ata_cpr_log with __counted_by
  ...
2023-11-01 12:50:12 -10:00
Linus Torvalds
eb55307e67 X86 core code updates:
- Limit the hardcoded topology quirk for Hygon CPUs to those which have a
     model ID less than 4. The newer models have the topology CPUID leaf 0xB
     correctly implemented and are not affected.
 
   - Make SMT control more robust against enumeration failures
 
     SMT control was added to allow controlling SMT at boottime or
     runtime. The primary purpose was to provide a simple mechanism to
     disable SMT in the light of speculation attack vectors.
 
     It turned out that the code is sensible to enumeration failures and
     worked only by chance for XEN/PV. XEN/PV has no real APIC enumeration
     which means the primary thread mask is not set up correctly. By chance
     a XEN/PV boot ends up with smp_num_siblings == 2, which makes the
     hotplug control stay at its default value "enabled". So the mask is
     never evaluated.
 
     The ongoing rework of the topology evaluation caused XEN/PV to end up
     with smp_num_siblings == 1, which sets the SMT control to "not
     supported" and the empty primary thread mask causes the hotplug core to
     deny the bringup of the APS.
 
     Make the decision logic more robust and take 'not supported' and 'not
     implemented' into account for the decision whether a CPU should be
     booted or not.
 
   - Fake primary thread mask for XEN/PV
 
     Pretend that all XEN/PV vCPUs are primary threads, which makes the
     usage of the primary thread mask valid on XEN/PV. That is consistent
     with because all of the topology information on XEN/PV is fake or even
     non-existent.
 
   - Encapsulate topology information in cpuinfo_x86
 
     Move the randomly scattered topology data into a separate data
     structure for readability and as a preparatory step for the topology
     evaluation overhaul.
 
   - Consolidate APIC ID data type to u32
 
     It's fixed width hardware data and not randomly u16, int, unsigned long
     or whatever developers decided to use.
 
   - Cure the abuse of cpuinfo for persisting logical IDs.
 
     Per CPU cpuinfo is used to persist the logical package and die
     IDs. That's really not the right place simply because cpuinfo is
     subject to be reinitialized when a CPU goes through an offline/online
     cycle.
 
     Use separate per CPU data for the persisting to enable the further
     topology management rework. It will be removed once the new topology
     management is in place.
 
   - Provide a debug interface for inspecting topology information
 
     Useful in general and extremly helpful for validating the topology
     management rework in terms of correctness or "bug" compatibility.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmU+yX0THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoROUD/4vlvKEcpm9rbI5DzLcaq4DFHKbyEZF
 cQtzuOSM/9vTc9DHnuoNNLl9TWSYxiVYnejf3E21evfsqspYlzbTH8bId9XBCUid
 6B68AJW842M2erNuwj0b0HwF1z++zpDmBDyhGOty/KQhoM8pYOHMvntAmbzJbuso
 Dgx6BLVFcboTy6RwlfRa0EE8f9W5V+JbmG/VBDpdyCInal7VrudoVFZmWQnPIft7
 zwOJpAoehkp8OKq7geKDf79yWxu9a1sNPd62HtaVEvfHwehHqE6OaMLss1us+0vT
 SJ/D6gmRQBOwcXaZL0wL1dG7Km9Et4AisOvzhXGvTa5b2D5oljVoqJ7V7FTf5g3u
 y3aqWbeUJzERUbeJt1HoGVAKyA4GtZOvg+TNIysf6F1Z4khl9alfa9jiqjj4g1au
 zgItq/ZMBEBmJ7X4FxQUEUVBG2CDsEidyNBDRcimWQUDfBakV/iCs0suD8uu8ZOD
 K5jMx8Hi2+xFx7r1YqsfsyMBYOf/zUZw65RbNe+kI992JbJ9nhcODbnbo5MlAsyv
 vcqlK5FwXgZ4YAC8dZHU/tyTiqAW7oaOSkqKwTP5gcyNEqsjQHV//q6v+uqtjfYn
 1C4oUsRHT2vJiV9ktNJTA4GQHIYF4geGgpG8Ih2SjXsSzdGtUd3DtX1iq0YiLEOk
 eHhYsnniqsYB5g==
 =xrz8
 -----END PGP SIGNATURE-----

Merge tag 'x86-core-2023-10-29-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 core updates from Thomas Gleixner:

 - Limit the hardcoded topology quirk for Hygon CPUs to those which have
   a model ID less than 4.

   The newer models have the topology CPUID leaf 0xB correctly
   implemented and are not affected.

 - Make SMT control more robust against enumeration failures

   SMT control was added to allow controlling SMT at boottime or
   runtime. The primary purpose was to provide a simple mechanism to
   disable SMT in the light of speculation attack vectors.

   It turned out that the code is sensible to enumeration failures and
   worked only by chance for XEN/PV. XEN/PV has no real APIC enumeration
   which means the primary thread mask is not set up correctly. By
   chance a XEN/PV boot ends up with smp_num_siblings == 2, which makes
   the hotplug control stay at its default value "enabled". So the mask
   is never evaluated.

   The ongoing rework of the topology evaluation caused XEN/PV to end up
   with smp_num_siblings == 1, which sets the SMT control to "not
   supported" and the empty primary thread mask causes the hotplug core
   to deny the bringup of the APS.

   Make the decision logic more robust and take 'not supported' and 'not
   implemented' into account for the decision whether a CPU should be
   booted or not.

 - Fake primary thread mask for XEN/PV

   Pretend that all XEN/PV vCPUs are primary threads, which makes the
   usage of the primary thread mask valid on XEN/PV. That is consistent
   with because all of the topology information on XEN/PV is fake or
   even non-existent.

 - Encapsulate topology information in cpuinfo_x86

   Move the randomly scattered topology data into a separate data
   structure for readability and as a preparatory step for the topology
   evaluation overhaul.

 - Consolidate APIC ID data type to u32

   It's fixed width hardware data and not randomly u16, int, unsigned
   long or whatever developers decided to use.

 - Cure the abuse of cpuinfo for persisting logical IDs.

   Per CPU cpuinfo is used to persist the logical package and die IDs.
   That's really not the right place simply because cpuinfo is subject
   to be reinitialized when a CPU goes through an offline/online cycle.

   Use separate per CPU data for the persisting to enable the further
   topology management rework. It will be removed once the new topology
   management is in place.

 - Provide a debug interface for inspecting topology information

   Useful in general and extremly helpful for validating the topology
   management rework in terms of correctness or "bug" compatibility.

* tag 'x86-core-2023-10-29-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
  x86/apic, x86/hyperv: Use u32 in hv_snp_boot_ap() too
  x86/cpu: Provide debug interface
  x86/cpu/topology: Cure the abuse of cpuinfo for persisting logical ids
  x86/apic: Use u32 for wakeup_secondary_cpu[_64]()
  x86/apic: Use u32 for [gs]et_apic_id()
  x86/apic: Use u32 for phys_pkg_id()
  x86/apic: Use u32 for cpu_present_to_apicid()
  x86/apic: Use u32 for check_apicid_used()
  x86/apic: Use u32 for APIC IDs in global data
  x86/apic: Use BAD_APICID consistently
  x86/cpu: Move cpu_l[l2]c_id into topology info
  x86/cpu: Move logical package and die IDs into topology info
  x86/cpu: Remove pointless evaluation of x86_coreid_bits
  x86/cpu: Move cu_id into topology info
  x86/cpu: Move cpu_core_id into topology info
  hwmon: (fam15h_power) Use topology_core_id()
  scsi: lpfc: Use topology_core_id()
  x86/cpu: Move cpu_die_id into topology info
  x86/cpu: Move phys_proc_id into topology info
  x86/cpu: Encapsulate topology information in cpuinfo_x86
  ...
2023-10-30 17:37:47 -10:00
Linus Torvalds
832328c9f8 ATA fixes for 6.6-final
A single patch to fix a regression introduced by the recent
 suspend/resume fixes. The regression is that ATA disks are not stopped
 on system shutdown, which is not recommended and increases the disks
 SMART counters for unclean power off events. This patch fixes this by
 refining the recent rework of the scsi device manage_xxx flags.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSRPv8tYSvhwAzJdzjdoc3SxdoYdgUCZTtnHAAKCRDdoc3SxdoY
 dqksAP95dLIScPn8PTW2oy2O+LHW1g67XVziTY5nce7byqNJwQEA/q8cRw2LYzB3
 ELPZElQzlAgD47JqC2YFIHbomnqEagU=
 =fC3T
 -----END PGP SIGNATURE-----

Merge tag 'ata-6.6-final' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata

Pull ATA fix from Damien Le Moal:
 "A single patch to fix a regression introduced by the recent
  suspend/resume fixes.

  The regression is that ATA disks are not stopped on system shutdown,
  which is not recommended and increases the disks SMART counters for
  unclean power off events.

  This patch fixes this by refining the recent rework of the scsi device
  manage_xxx flags"

* tag 'ata-6.6-final' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
  scsi: sd: Introduce manage_shutdown device flag
2023-10-27 13:38:59 -10:00
Damien Le Moal
24eca2dce0 scsi: sd: Introduce manage_shutdown device flag
Commit aa3998dbeb ("ata: libata-scsi: Disable scsi device
manage_system_start_stop") change setting the manage_system_start_stop
flag to false for libata managed disks to enable libata internal
management of disk suspend/resume. However, a side effect of this change
is that on system shutdown, disks are no longer being stopped (set to
standby mode with the heads unloaded). While this is not a critical
issue, this unclean shutdown is not recommended and shows up with
increased smart counters (e.g. the unexpected power loss counter
"Unexpect_Power_Loss_Ct").

Instead of defining a shutdown driver method for all ATA adapter
drivers (not all of them define that operation), this patch resolves
this issue by further refining the sd driver start/stop control of disks
using the new flag manage_shutdown. If this new flag is set to true by
a low level driver, the function sd_shutdown() will issue a
START STOP UNIT command with the start argument set to 0 when a disk
needs to be powered off (suspended) on system power off, that is, when
system_state is equal to SYSTEM_POWER_OFF.

Similarly to the other manage_xxx flags, the new manage_shutdown flag is
exposed through sysfs as a read-write device attribute.

To avoid any confusion between manage_shutdown and
manage_system_start_stop, the comments describing these flags in
include/scsi/scsi.h are also improved.

Fixes: aa3998dbeb ("ata: libata-scsi: Disable scsi device manage_system_start_stop")
Cc: stable@vger.kernel.org
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218038
Link: https://lore.kernel.org/all/cd397c88-bf53-4768-9ab8-9d107df9e613@gmail.com/
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-27 10:00:19 +09:00
Hannes Reinecke
f2d79aa16a scsi: megaraid: Fix up debug message in megaraid_abort_and_reset()
Found by Smatch.

Fixes: 5bcd3bfbda ("scsi: megaraid: Pass in NULL scb for host reset")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231023073021.21954-1-hare@suse.de
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-24 22:40:39 -04:00
Hannes Reinecke
c7f4c5dec6 scsi: aic79xx: Fix up NULL command in ahd_done()
Found by smatch.

Fixes: c67e638004 ("scsi: aic79xx: Do not reference SCSI command when resetting device")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231023073014.21438-1-hare@suse.de
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-24 22:38:34 -04:00
Ranjan Kumar
3c978492c3 scsi: mpt3sas: Fix loop logic
The retry loop continues to iterate until the count reaches 30, even after
receiving the correct value. Exit loop when a non-zero value is read.

Fixes: 4ca10f3e31 ("scsi: mpt3sas: Perform additional retries if doorbell read returns 0")
Cc: stable@vger.kernel.org
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20231020105849.6350-1-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-24 22:35:22 -04:00
Su Hui
44a31659ea scsi: snic: Remove useless code in snic_dr_clean_pending_req()
Return error code directly to save space and be more clear.

Signed-off-by: Su Hui <suhui@nfschina.com>
Link: https://lore.kernel.org/r/20231020023326.43898-1-suhui@nfschina.com
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-24 22:33:02 -04:00
Wenchao Hao
3dc985bfbd scsi: core: Clean up scsi_dev_queue_ready()
This is just a cleanup for scsi_dev_queue_ready() to avoid a redundant goto
and if statement. No functional change.

Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
Link: https://lore.kernel.org/r/20231018113746.1940197-2-haowenchao2@huawei.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-24 22:31:04 -04:00
Hannes Reinecke
0b1b4b0444 scsi: pmcraid: Add missing scsi_device_put() in pmcraid_eh_target_reset_handler()
When breaking out of an shost_for_each_device() loop one needs to do an
explicit scsi_device_put().

Fixes: c2a14ab3b9 ("scsi: pmcraid: Select device in pmcraid_eh_target_reset_handler()")
Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231023072957.20191-1-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-24 22:24:32 -04:00
Yang Li
9e1c911ecb scsi: pmcraid: Fix kernel-doc comment
Fix kernel-doc comment to silence the warnings:

drivers/scsi/pmcraid.c:2697: warning: Excess function parameter 'scsi_cmd' description in 'pmcraid_reset_device'
drivers/scsi/pmcraid.c:2697: warning: Function parameter or member 'scsi_dev' not described in 'pmcraid_reset_device'

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=6843
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Link: https://lore.kernel.org/r/20231017025853.67562-1-yang.lee@linux.alibaba.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-24 22:21:51 -04:00
Linus Torvalds
75e167c2f6 SCSI fixes on 20231020
Two small fixes, both in drivers.  The mptsas one is really fixing an
 error path issue where it can leave the misc driver loaded even though
 the sas driver fails to initialize.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZTLCuiYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishU6oAP45oLIr
 P3d4AXt2XGBwsaqyWaQx2OOe8fGps0+jyF3OdQEAzFXKXp7lNmHlA573JVMjH6Sz
 JqAdi+dYksRssC0GMZk=
 =iXXi
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Two small fixes, both in drivers.

  The mptsas one is really fixing an error path issue where it can leave
  the misc driver loaded even though the sas driver fails to initialize"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: qla2xxx: Fix double free of dsd_list during driver load
  scsi: mpt3sas: Fix in error path
2023-10-20 13:24:50 -07:00
Quinn Tran
097c06394c scsi: qla2xxx: Fix double free of dsd_list during driver load
On driver load, scsi_add_host() can fail. This triggers the free path to
call qla2x00_mem_free() multiple times. This causes NULL pointer access of
ha->base_qpair. Add check before access.

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
 IP: [<ffffffffc118f73c>] qla2x00_mem_free+0x51c/0xcb0 [qla2xxx]
 PGD 8000001fcfe4a067 PUD 1fc8f0a067 PMD 0
 Oops: 0000 [#1] SMP
 RIP: 0010:[<ffffffffc118f73c>]  [<ffffffffc118f73c>] qla2x00_mem_free+0x51c/0xcb0 [qla2xxx]
 RSP: 0018:ffff8ace97a93a30  EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffff8ace8efd0000 RCX: 000000000000488f
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
 RBP: ffff8ace97a93a60 R08: 000000000001f040 R09: ffffffff8678209b
 R10: ffff8acf7d6df040 R11: ffffc591c0fcc980 R12: ffffffff87034800
 R13: ffff8acf0e3cc740 R14: ffff8ace8efd0000 R15: 00000000fffffff4
 FS:  00007f4cf5449740(0000) GS:ffff8acf7d6c0000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000030 CR3: 0000001fc2f6c000 CR4: 00000000007607e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  [<ffffffff86781f18>] ? kobject_put+0x28/0x60
  [<ffffffffc119a59c>] qla2x00_probe_one+0x19fc/0x3040 [qla2xxx]

Fixes: efeda3bf91 ("scsi: qla2xxx: Move resource to allow code reuse")
Signed-off-by: Quinn Tran <qutran@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20231016101749.5059-1-njavali@marvell.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16 21:07:05 -04:00
Tomas Henzl
e40c04ade0 scsi: mpt3sas: Fix in error path
The driver should be deregistered as misc driver after PCI registration
failure.

Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Link: https://lore.kernel.org/r/20231015114529.10725-1-thenzl@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16 21:04:54 -04:00
Douglas Gilbert
2bbeb8d124 scsi: core: Handle depopulation and restoration in progress
The default handling of the NOT READY sense key is to wait for the device
to become ready. The "wait" is assumed to be relatively short. However
there is a sub-class of NOT READY that have the "... in progress" phrase in
their additional sense code and these can take much longer.  Following on
from commit 505aa4b6a8 ("scsi: sd: Defer spinning up drive while SANITIZE
is in progress") we now have element depopulation and restoration that can
take a long time.  For example, over 24 hours for a 20 TB, 7200 rpm hard
disk to depopulate 1 of its 20 elements.

Add handling of ASC/ASCQ: 0x4,0x24 (depopulation in progress)
and ASC/ASCQ: 0x4,0x25 (depopulation restoration in progress)
to sd.c . The scsi_lib.c has incomplete handling of these
two messages, so complete it.

Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Link: https://lore.kernel.org/r/20231015050650.131145-1-dgilbert@interlog.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16 21:02:23 -04:00
Martin K. Petersen
058676b513 Merge patch series "scsi: scsi_debug: Add error injection for single device"
Wenchao Hao <haowenchao2@huawei.com> says:

The original error injection mechanism was based on scsi_host which
could not inject fault for a single SCSI device.

This patchset provides the ability to inject errors for a single SCSI
device. Now we support inject timeout errors, queuecommand errors, and
hostbyte, driverbyte, statusbyte, and sense data for specific SCSI
Command. Two new error injection is defined to make abort command or
reset LUN failed.

Besides error injection for single device, this patchset add a new
interface to make reset target failed for each scsi_target.

The first two patch add a debugfs interface to add and inquiry single
device's error injection info; the third patch defined how to remove
an injection which has been added. The following 5 patches use the
injection info and generate the related error type. The last two just
add a new interface to make reset target failed and control
scsi_device's allow_restart flag.

Link: https://lore.kernel.org/r/20231010092051.608007-1-haowenchao2@huawei.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16 20:51:40 -04:00
Wenchao Hao
573c2d066e scsi: scsi_debug: Add param to control sdev's allow_restart
Add new module param "allow_restart" to control scsi_device's allow_restart
flag. This flag determines if EH is triggered after a command completes
with sense_key 0x6, ASC 0x4 and ASCQ 0x2. EH would be triggered if
allow_restart=1 in this condition.

The new param can be used with the error injection capability to test how
commands completing with sense_key 0x6, ASC 0x4 and ASCQ 0x2 are handled.

Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
Link: https://lore.kernel.org/r/20231010092051.608007-11-haowenchao2@huawei.com
Tested-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16 20:50:12 -04:00
Wenchao Hao
f084fe52c6 scsi: scsi_debug: Add debugfs interface to fail target reset
The interface is found at
/sys/kernel/debug/scsi_debug/target<h:c:t>/fail_reset where <h:c:t>
identifies the target to inject errors on. It's a simple bool type
interface which would make this target's reset fail if set to 'Y'.

Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
Link: https://lore.kernel.org/r/20231010092051.608007-10-haowenchao2@huawei.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16 20:50:11 -04:00
Wenchao Hao
0267811625 scsi: scsi_debug: Add new error injection type: Reset LUN failed
Add error injection type 4 to make scsi_debug_device_reset() return FAILED.
Fail abort command format:

  +--------+------+-------------------------------------------------------+
  | Column | Type | Description                                           |
  +--------+------+-------------------------------------------------------+
  |   1    |  u8  | Error type, fixed to 0x4                              |
  +--------+------+-------------------------------------------------------+
  |   2    |  s32 | Error count                                           |
  |        |      |  0: this rule will be ignored                         |
  |        |      |  positive: the rule will always take effect           |
  |        |      |  negative: the rule takes effect n times where -n is  |
  |        |      |            the value given. Ignored after n times     |
  +--------+------+-------------------------------------------------------+
  |   3    |  x8  | SCSI command opcode, 0xff for all commands            |
  +--------+------+-------------------------------------------------------+

Examples:

    error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
    echo "4 -10 0x12" > ${error}

will make the device return FAILED when trying to reset LUN with inquiry
command 10 times.

    error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
    echo "4 -10 0xff" > ${error}

will make the device return FAILED when trying to reset LUN 10 times.

Usually we do not care about what command it is when trying to perform
reset LUN, so 0xff could be applied.

Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
Link: https://lore.kernel.org/r/20231010092051.608007-9-haowenchao2@huawei.com
Tested-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16 20:50:11 -04:00
Wenchao Hao
5551ce9288 scsi: scsi_debug: Add new error injection type: Abort Failed
Add error injection type 3 to make scsi_debug_abort() return FAILED.  Fail
abort command format:

  +--------+------+-------------------------------------------------------+
  | Column | Type | Description                                           |
  +--------+------+-------------------------------------------------------+
  |   1    |  u8  | Error type, fixed to 0x3                              |
  +--------+------+-------------------------------------------------------+
  |   2    |  s32 | Error count                                           |
  |        |      |  0: this rule will be ignored                         |
  |        |      |  positive: the rule will always take effect           |
  |        |      |  negative: the rule takes effect n times where -n is  |
  |        |      |            the value given. Ignored after n times     |
  +--------+------+-------------------------------------------------------+
  |   3    |  x8  | SCSI command opcode, 0xff for all commands            |
  +--------+------+-------------------------------------------------------+

Examples:

    error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
    echo "3 -10 0x12" > ${error}

will make the device return FAILED when aborting inquiry command 10 times.

Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
Link: https://lore.kernel.org/r/20231010092051.608007-8-haowenchao2@huawei.com
Tested-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16 20:50:11 -04:00
Wenchao Hao
3359227432 scsi: scsi_debug: Set command result and sense data if error is injected
If a fail command error is injected, set the command's status and sense
data then finish this SCSI command.

Set SCSI command's status and sense data format:

  +--------+------+-------------------------------------------------------+
  | Column | Type | Description                                           |
  +--------+------+-------------------------------------------------------+
  |   1    |  u8  | Error type, fixed to 0x2                              |
  +--------+------+-------------------------------------------------------+
  |   2    |  s32 | Error Count                                           |
  |        |      |  0: the rule will be ignored                          |
  |        |      |  positive: the rule will always take effect           |
  |        |      |  negative: the rule takes effect n times where -n is  |
  |        |      |            the value given. Ignored after n times     |
  +--------+------+-------------------------------------------------------+
  |   3    |  x8  | SCSI command opcode, 0xff for all commands            |
  +--------+------+-------------------------------------------------------+
  |   4    |  x8  | Host byte in scsi_cmd::status                         |
  |        |      | [scsi_cmd::status has 32 bits holding these 3 bytes]  |
  +--------+------+-------------------------------------------------------+
  |   5    |  x8  | Driver byte in scsi_cmd::status                       |
  +--------+------+-------------------------------------------------------+
  |   6    |  x8  | SCSI Status byte in scsi_cmd::status                  |
  +--------+------+-------------------------------------------------------+
  |   7    |  x8  | SCSI Sense Key in scsi_cmnd                           |
  +--------+------+-------------------------------------------------------+
  |   8    |  x8  | SCSI ASC in scsi_cmnd                                 |
  +--------+------+-------------------------------------------------------+
  |   9    |  x8  | SCSI ASCQ in scsi_cmnd                                |
  +--------+------+-------------------------------------------------------+

Examples:
    error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
    echo "2 -10 0x88 0 0 0x2 0x3 0x11 0x0" >${error}

will make device's read command return with media error with additional
sense of "Unrecovered read error" (UNC):

Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
Link: https://lore.kernel.org/r/20231010092051.608007-7-haowenchao2@huawei.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16 20:50:11 -04:00
Wenchao Hao
33bccf55c2 scsi: scsi_debug: Return failed value if error is injected
If a fail queuecommand error is injected, return the failed value defined
in the rule from queuecommand.

Make queuecommand return format:

  +--------+------+-------------------------------------------------------+
  | Column | Type | Description                                           |
  +--------+------+-------------------------------------------------------+
  |   1    |  u8  | Error type, fixed to 0x1                              |
  +--------+------+-------------------------------------------------------+
  |   2    |  s32 | Error count                                           |
  |        |      |  0: this rule will be ignored                         |
  |        |      |  positive: the rule will always take effect           |
  |        |      |  negative: the rule takes effect n times where -n is  |
  |        |      |            the value given. Ignored after n times     |
  +--------+------+-------------------------------------------------------+
  |   3    |  x8  | SCSI command opcode, 0xff for all commands            |
  +--------+------+-------------------------------------------------------+
  |   4    |  x32 | The queuecommand() return value we want               |
  +--------+------+-------------------------------------------------------+

Examples:

    error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
    echo "1 1 0x12 0x1055" > ${error}

will make each INQUIRY command sent to that device return 0x1055
(SCSI_MLQUEUE_HOST_BUSY).

Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
Link: https://lore.kernel.org/r/20231010092051.608007-6-haowenchao2@huawei.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16 20:50:11 -04:00
Wenchao Hao
32be8b6e22 scsi: scsi_debug: Time out command if the error is injected
If a timeout error is injected, return 0 from scsi_debug_queuecommand to
make the command time out.

Time out SCSI command format:

  +--------+------+-------------------------------------------------------+
  | Column | Type | Description                                           |
  +--------+------+-------------------------------------------------------+
  |   1    |  u8  | Error type, fixed to 0x0                              |
  +--------+------+-------------------------------------------------------+
  |   2    |  s32 | Error count                                           |
  |        |      |  0: this rule will be ignored                         |
  |        |      |  positive: the rule will always take effect           |
  |        |      |  negative: the rule takes effect n times where -n is  |
  |        |      |            the value given. Ignored after n times     |
  +--------+------+-------------------------------------------------------+
  |   3    |  x8  | SCSI command opcode, 0xff for all commands            |
  +--------+------+-------------------------------------------------------+

Examples:

    error=/sys/kernel/debug/scsi_debug/0:0:0:1/error
    echo "0 -10 0x12" > ${error}

will make the device's inquiry command time out 10 times.

    echo "0 1 0x12" > ${error}

will make the device's inquiry time out each time it is invoked on this
device.

Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
Link: https://lore.kernel.org/r/20231010092051.608007-5-haowenchao2@huawei.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16 20:50:11 -04:00
Wenchao Hao
962d77cd4c scsi: scsi_debug: Define grammar to remove added error injection
The grammar to remove error injection is a line with fixed 3 columns
separated by spaces.

First column is fixed to "-". It tells this is a removal operation.  Second
column is the error code to match.  Third column is the scsi command to
match.

For example the following command would remove timeout injection of inquiry
command:

    echo "- 0 0x12" > /sys/kernel/debug/scsi_debug/0:0:0:1/error

Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
Link: https://lore.kernel.org/r/20231010092051.608007-4-haowenchao2@huawei.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16 20:50:11 -04:00
Wenchao Hao
a9996d722b scsi: scsi_debug: Add interface to manage error injection for a single device
This new facility uses the debugfs pseudo file system which is typically
mounted under the /sys/kernel/debug directory and requires root permissions
to access.

The interface file is found at /sys/kernel/debug/scsi_debug/<h:c:t:l>/error
where <h:c:t:l> identifies the device (logical unit (LU)) to inject errors
on.

For the following description the ${error} environment variable is assumed
to be set to/sys/kernel/debug/scsi_debug/1:0:0:0/error where 1:0:0:0 is a
pseudo device (LU) owned by the scsi_debug driver. Rules are written to
${error} in the normal sysfs fashion (e.g. 'echo "0 -2 0x12" > ${error}').

More than one rule can be active on a device at a time and inactive rules
(i.e. those whose error count is 0) remain in the rule listing. The
existing rules can be read with 'cat ${error}' with oneline output for each
rule.

The interface format is line-by-line, each line is an error injection rule.
Each rule contains integers separated by spaces, the first three columns
correspond to "Error code", "Error count" and "SCSI command", other
columns depend on Error code.

General rule format:
  +--------+------+-------------------------------------------------------+
  | Column | Type | Description                                           |
  +--------+------+-------------------------------------------------------+
  |   1    |  u8  | Error code                                            |
  |        |      |  0: timeout SCSI command                              |
  |        |      |  1: fail queuecommand, make queuecommand return       |
  |        |      |     given value                                       |
  |        |      |  2: fail command, finish command with SCSI status,    |
  |        |      |     sense key and ASC/ASCQ values                     |
  |        |      |  3: make abort commands for specific command fail     |
  |        |      |  4: make reset lun for specific command fail          |
  +--------+------+-------------------------------------------------------+
  |   2    |  s32 | Error count                                           |
  |        |      |  0: this rule will be ignored                         |
  |        |      |  positive: the rule will always take effect           |
  |        |      |  negative: the rule takes effect n times where -n is  |
  |        |      |            the value given. Ignored after n times     |
  +--------+------+-------------------------------------------------------+
  |   3    |  x8  | SCSI command opcode, 0xff for all commands            |
  +--------+------+-------------------------------------------------------+
  |  ...   |  xxx | Error type specific fields                            |
  +--------+------+-------------------------------------------------------+

Notes:

 - When multiple error inject rules are added for the same SCSI command,
   the one with smaller error code will take effect (and the others will be
   ignored).

 - If the same error (i.e. same Error code and SCSI command) is added, the
   older one will be overwritten..

 - Currently, the basic types are (u8/u16/u32/u64/s8/s16/s32/s64) and the
   hexadecimal types (x8/x16/x32/x64).

 - Where a hexadecimal value is expected (e.g. Column 3: SCSI command
   opcode) the "0x" prefix is optional on the value (e.g. the INQUIRY
   opcode can be given as '0x12' or '12').

 - When the Error count is negative, reading ${error} will show that value
   incrementing, stopping when it gets to 0.

Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
Link: https://lore.kernel.org/r/20231010092051.608007-3-haowenchao2@huawei.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16 20:50:11 -04:00
Wenchao Hao
6e2d15f59b scsi: scsi_debug: Create scsi_debug directory in the debugfs filesystem
Create directory scsi_debug in the root of the debugfs filesystem.  Prepare
to add interface for manage error injection.

Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
Link: https://lore.kernel.org/r/20231010092051.608007-2-haowenchao2@huawei.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-16 20:50:10 -04:00
Martin K. Petersen
af46076d66 Merge patch series "lpfc: Update lpfc to revision 14.2.0.15"
Justin Tee <justintee8345@gmail.com> says:

Update lpfc to revision 14.2.0.15

This patch set contains error handling fixes, ELS bug fixes, and
logging improvements.

The patches were cut against Martin's 6.7/scsi-queue tree.

Link: https://lore.kernel.org/r/20231009161812.97232-1-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 17:00:47 -04:00
Justin Tee
8a9a690b5a scsi: lpfc: Update lpfc version to 14.2.0.15
Update lpfc version to 14.2.0.15.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231009161812.97232-7-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:58:27 -04:00
Justin Tee
41c831bbb0 scsi: lpfc: Introduce LOG_NODE_VERBOSE messaging flag
The preexisting LOG_NODE message flag frequently spams a subset of the same
log messages during normal FC driver operations.  When analyzing driver
logs, this sometimes leads to difficulty in troubleshooting.

Because LOG_IP log message flag is unused, convert it to a new
LOG_NODE_VERBOSE flag.  The LOG_NODE_VERBOSE shall specifically be used for
diagnosing issues that require precise ndlp tracking detail.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231009161812.97232-6-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:58:27 -04:00
Justin Tee
a3c3c0a806 scsi: lpfc: Validate ELS LS_ACC completion payload
A WCQE success completion status does not guarantee valid LS_ACC receipt
for ELS commands.  So, introduce a small helper routine that validates ELS
LS_ACC frames in ELS cmpl routines.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231009161812.97232-5-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:58:27 -04:00
Justin Tee
12e896c742 scsi: lpfc: Reject received PRLIs with only initiator fcn role for NPIV ports
Currently, NPIV ports send PRLI_ACC to all received unsolicited PRLI
requests.  For an NPIV port, there is no point to PRLI_ACC if the received
PRLI request has the initiator function bit set and the target function bit
unset.  Modify the lpfc_rcv_prli_support_check() routine to send a PRLI_RJT
in such cases.  NPIV ports are expected to send PRLI_ACC only if the Target
function bit is set.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231009161812.97232-4-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:58:27 -04:00
Justin Tee
d472a76603 scsi: lpfc: Treat IOERR_SLI_DOWN I/O completion status the same as pci offline
During receipt of a hardware error attention ACQE, IOERR_SLI_DOWN status is
set by the driver for all outstanding I/Os.

In such hardware error attention cases, we can treat the situation exactly
the same as pci_channel_offline.  Thus, add IOERR_SLI_DOWN status to the
same category as pci_channel_offline handling in lpfc_nvme_io_cmd_cmpl.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231009161812.97232-3-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:58:27 -04:00
Justin Tee
0506814609 scsi: lpfc: Remove unnecessary zero return code assignment in lpfc_sli4_hba_setup
In order to enter the !rc if statement block in question, rc had to have
been zero to begin with.  Thus, the rc = 0 assignment is unnecessary and
can be removed.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20231009161812.97232-2-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:58:27 -04:00
Martin K. Petersen
949189a567 Merge patch series "megaraid_sas: Driver version update to 07.727.03.00-rc1"
Chandrakanth patil <chandrakanth.patil@broadcom.com> says:

This set of patches includes critical fixes, and updates to the
maintainer list.

Link: https://lore.kernel.org/r/20231003110021.168862-1-chandrakanth.patil@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:52:47 -04:00
Chandrakanth patil
0938f9fa42 scsi: megaraid_sas: Driver version update to 07.727.03.00-rc1
Driver version update.

Signed-off-by: Chandrakanth patil <chandrakanth.patil@broadcom.com>
Link: https://lore.kernel.org/r/20231003110021.168862-4-chandrakanth.patil@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:51:56 -04:00
Chandrakanth patil
2d83fb023c scsi: megaraid_sas: Log message when controller reset is requested but not issued
The driver now includes the print message 'IO is completed, no reset is
required' when a reset is requested but not issued. This message is
displayed only when pending SCSI IO is completed before issuing the reset.

Signed-off-by: Chandrakanth patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Link: https://lore.kernel.org/r/20231003110021.168862-3-chandrakanth.patil@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:51:55 -04:00
Chandrakanth patil
8e3ed9e786 scsi: megaraid_sas: Increase register read retry rount from 3 to 30 for selected registers
In BMC environments with concurrent access to multiple registers, certain
registers occasionally yield a value of 0 even after 3 retries due to
hardware errata. As a fix, we have extended the retry count from 3 to 30.

The same errata applies to the mpt3sas driver, and a similar patch has
been accepted. Please find more details in the mpt3sas patch reference
link.

Link: https://lore.kernel.org/r/20230829090020.5417-2-ranjan.kumar@broadcom.com
Fixes: 272652fcbf ("scsi: megaraid_sas: add retry logic in megasas_readl")
Cc: stable@vger.kernel.org
Signed-off-by: Chandrakanth patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Link: https://lore.kernel.org/r/20231003110021.168862-2-chandrakanth.patil@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:51:55 -04:00
Martin K. Petersen
bd7f0ef293 Merge patch series "scsi: sshdr and retry fixes"
Mike Christie <michael.christie@oracle.com> says:

The following patches were made over Linus tree (Martin's 6.7 branch
was missing some changes to sd.c). They only contain the sshdr and
rdac retry fixes from the "Allow scsi_execute users to control
retries" patchset.

The patches in this set are reviewed and tested but the changes to how
we do retries will take a little longer and require more testing, so I
broke up the series to make them easier to review.

Link: https://lore.kernel.org/r/20231004210013.5601-1-michael.christie@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:38:26 -04:00
Mike Christie
f7d7129c6c scsi: sr: Fix sshdr use in sr_get_events
If scsi_execute_cmd returns < 0, it doesn't initialize the sshdr, so we
shouldn't access the sshdr. If it returns 0, then the cmd executed
successfully, so there is no need to check the sshdr. This has us access
the sshdr when we get a return value > 0.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20231004210013.5601-13-michael.christie@oracle.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:36:20 -04:00
Mike Christie
c8b7ef36da scsi: sd: Fix sshdr use in cache_type_store
If scsi_execute_cmd returns < 0, it doesn't initialize the sshdr, so we
shouldn't access the sshdr. If it returns 0, then the cmd executed
successfully, so there is no need to check the sshdr. This has us access
the sshdr when we get a return value > 0.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20231004210013.5601-12-michael.christie@oracle.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:36:20 -04:00
Mike Christie
8f0017694c scsi: Fix sshdr use in scsi_cdl_enable
If scsi_execute_cmd returns < 0, it doesn't initialize the sshdr, so we
shouldn't access the sshdr. If it returns 0, then the cmd executed
successfully, so there is no need to check the sshdr. This has us access
the sshdr when we get a return value > 0.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20231004210013.5601-11-michael.christie@oracle.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:36:20 -04:00
Mike Christie
f43158eefd scsi: Fix sshdr use in scsi_test_unit_ready
If scsi_execute_cmd returns < 0, it doesn't initialize the sshdr, so we
shouldn't access the sshdr. If it returns 0, then the cmd executed
successfully, so there is no need to check the sshdr. This has us access
the sshdr when we get a return value > 0.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20231004210013.5601-10-michael.christie@oracle.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:36:20 -04:00
Mike Christie
add2c24d32 scsi: sd: Fix scsi_mode_sense caller's sshdr use
The sshdr passed into scsi_execute_cmd is only initialized if
scsi_execute_cmd returns >= 0, and scsi_mode_sense will convert all non
good statuses like check conditions to -EIO. This has scsi_mode_sense
callers that were possibly accessing an uninitialized sshdrs to only
access it if we got -EIO.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20231004210013.5601-9-michael.christie@oracle.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:36:20 -04:00
Mike Christie
0b149cee83 scsi: spi: Fix sshdr use
If scsi_execute_cmd returns < 0, it doesn't initialize the sshdr, so we
shouldn't access the sshdr. If it returns 0, then the cmd executed
successfully, so there is no need to check the sshdr. This has us access
the sshdr when we get a return value > 0.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20231004210013.5601-7-michael.christie@oracle.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:36:20 -04:00
Mike Christie
87e145a293 scsi: rdac: Fix sshdr use
If scsi_execute_cmd returns < 0, it doesn't initialize the sshdr, so we
shouldn't access the sshdr. If it returns 0, then the cmd executed
successfully, so there is no need to check the sshdr. This has us access
the sshdr when we get a return value > 0.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20231004210013.5601-6-michael.christie@oracle.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:36:20 -04:00
Mike Christie
2274bd5e3a scsi: rdac: Fix send_mode_select retry handling
If send_mode_select retries scsi_execute_cmd it will leave err set to
SCSI_DH_RETRY/SCSI_DH_IMM_RETRY. If on the retry, the command is
successful, then SCSI_DH_RETRY/SCSI_DH_IMM_RETRY will be returned to the
scsi_dh activation caller. On the retry, we will then detect the previous
MODE SELECT had worked, and so we will return success.

This patch has us return the correct return value, so we can avoid the
extra scsi_dh activation call and to avoid failures if the caller had hit
its activation retry limit and does not end up retrying.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20231004210013.5601-5-michael.christie@oracle.com
Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:36:19 -04:00
Mike Christie
5759a5650d scsi: hp_sw: Fix sshdr use
If scsi_execute_cmd returns < 0, it doesn't initialize the sshdr, so we
shouldn't access the sshdr. If it returns 0, then the cmd executed
successfully, so there is no need to check the sshdr. This has us access
the sshdr when we get a return value > 0.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20231004210013.5601-4-michael.christie@oracle.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:36:19 -04:00
Mike Christie
b4d0c33a32 scsi: sd: Fix sshdr use in sd_spinup_disk
If scsi_execute_cmd returns < 0, it doesn't initialize the sshdr, so we
shouldn't access the sshdr. If it returns 0, then the cmd executed
successfully, so there is no need to check the sshdr. This has us access
the sshdr when we get a return value > 0.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20231004210013.5601-3-michael.christie@oracle.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:36:19 -04:00
Mike Christie
bd593bd2c1 scsi: sd: Fix sshdr use in read_capacity_16
If scsi_execute_cmd returns < 0, it doesn't initialize the sshdr, so we
shouldn't access the sshdr. If it returns 0, then the cmd executed
successfully, so there is no need to check the sshdr. This has us access
the sshdr when we get a return value > 0.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20231004210013.5601-2-michael.christie@oracle.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 16:36:19 -04:00
Martin K. Petersen
1caddfc581 Merge patch series "scsi: target: Allow userspace to config cmd submission"
Mike Christie <michael.christie@oracle.com> says:

The following patches were made over Linus's tree but apply over
Martin's branches. They allow userspace to configure how fabric
drivers submit cmds to backend drivers.

Right now loop and vhost use a worker thread, and the other drivers
submit from the contexts they receive/process the cmd from. For
multiple LUN cases where the target can queue more cmds than the
backend can handle then deferring to a worker thread is safest because
the backend driver can block when doing things like waiting for a free
request/tag. Deferring also helps when the target has to handle
transport level requests from the recv context.

For cases where the backend devices can queue everything the target
sends, then there is no need to defer to a workqueue and you can see a
perf boost of up to 26% for small IO workloads. For a nvme device and
vhost-scsi I can see with 4K IOs:

fio jobs        1       2       4       8       10
--------------------------------------------------
workqueue
submit        94K     190K    394K    770K    890K

direct
submit       128K     252K    488K    950K    -

Link: https://lore.kernel.org/r/1b1f7a5c-0988-45f9-b103-dfed2c0405b1@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 15:56:37 -04:00
Mike Christie
194605d45d scsi: target: Have drivers report if they support direct submissions
In some cases, like with multiple LUN targets or where the target has to
respond to transport level requests from the receiving context it can be
better to defer cmd submission to a helper thread. If the backend driver
blocks on something like request/tag allocation it can block the entire
target submission path and other LUs and transport IO on that session.

In other cases like single LUN targets with storage that can support all
the commands that the target can queue, then it's best to submit the cmd
to the backend from the target's cmd receiving context.

Subsequent commits will allow the user to config what they prefer, but
drivers like loop can't directly submit because they can be called from a
context that can't sleep. And, drivers like vhost-scsi can support direct
submission, but need to keep their default behavior of deferring execution
to avoid possible regressions where the backend can block.

Make the drivers tell LIO core if they support direct submissions and their
current default, so we can prevent users from misconfiguring the system and
initialize devices correctly.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Link: https://lore.kernel.org/r/20230928020907.5730-2-michael.christie@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 15:53:57 -04:00
Martin K. Petersen
9f4c887fe6 Merge patch series "scsi: EH rework prep patches, part 1"
Hannes Reinecke <hare@suse.de> says:

Hi all,

(taking up an old thread:) here's the first batch of patches for my EH
rework.  It modifies the reset callbacks for SCSI drivers such that
the final conversion to drop the 'struct scsi_cmnd' argument and use
the entity in question (host, bus, target, device) as the argument to
the SCSI EH callbacks becomes possible.  The first part covers drivers
which just requires minor tweaks.

Link: https://lore.kernel.org/r/20231002154328.43718-1-hare@suse.de
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 14:25:20 -04:00
Hannes Reinecke
82b2fb52d6 scsi: mpi3mr: Split off bus_reset function from host_reset
SCSI EH host reset is the final callback in the escalation chain; once we
reach this we need to reset the controller.  As such it defeats the purpose
to skip controller reset if no I/Os are pending and the RAID device is to
be reset; especially after kexec there might be stale commands pending in
firmware for which we have no reference whatsoever.  So this patch splits
off the check for pending I/O into a 'bus_reset' function, and leaves the
actual controller reset to the host reset.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-19-hare@suse.de
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>
Cc: Sumit Saxena <sumit.saxena@broadcom.com>
Cc: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 14:23:15 -04:00
Hannes Reinecke
c2a14ab3b9 scsi: pmcraid: Select device in pmcraid_eh_target_reset_handler()
The reset code requires a device to be selected, but we shouldn't rely on
the command to provide a device for us. So select the first device on the
target when sending down a target reset.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-18-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 14:23:15 -04:00
Hannes Reinecke
09df469722 scsi: pmcraid: Select device in pmcraid_eh_bus_reset_handler()
The reset code requires a device to be selected, but we shouldn't rely on
the command to provide a device for us. So select the first device on the
bus when sending down a bus reset.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-17-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 14:23:15 -04:00
Hannes Reinecke
bffebc1993 scsi: qla1280: Separate out host reset function from qla1280_error_action()
There's not much in common between host reset and all other error handlers,
so use a separate function here.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-16-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 14:23:15 -04:00
Hannes Reinecke
c7c559d2b3 scsi: sym53c8xx_2: Rework reset handling
Split off the combined abort and device reset handling into distinct
functions. And rename the current device reset handler into a target reset
handler, seeing that it really is a target reset.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-15-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 14:23:15 -04:00
Hannes Reinecke
4980ae18c3 scsi: sym53c8xx_2: Split off bus reset from host reset
The current handler does both, bus reset and host reset.  So split them off
into two distinct functions.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-14-hare@suse.de
Cc: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 14:23:15 -04:00
Hannes Reinecke
c8102e421e scsi: ips: Do not try to abort command from host reset
The code for aborting an outstanding command is a copy of the functionality
from command abort. As we already have called this function once we reach
host reset there's no point in trying to do so again.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-13-hare@suse.de
Cc: Adaptec OEM Raid Solutions <aacraid@microsemi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 14:23:15 -04:00
Hannes Reinecke
5bcd3bfbda scsi: megaraid: Pass in NULL scb for host reset
When calling a host reset we shouldn't rely on the command triggering the
reset, so allow megaraid_abort_and_reset() to be called with a NULL scb.
And drop the pointless 'bus_reset' and 'target_reset' handlers, which just
call the same function as host_reset.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-12-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 14:23:15 -04:00
Hannes Reinecke
397ff21a96 scsi: ibmvfc: Open-code reset loop for target reset
For target reset we need a device to send the target reset to, so open-code
the loop in target reset to send the target reset TMF to the correct
device.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-11-hare@suse.de
Cc: Tyrel Datwyler <tyreld@linux.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 14:23:14 -04:00
Hannes Reinecke
c67e638004 scsi: aic79xx: Do not reference SCSI command when resetting device
When sending a device reset we should not take a reference to the SCSI
command.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-10-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 14:23:14 -04:00
Hannes Reinecke
9cc9ef2819 scsi: aic79xx: Make BUILD_SCSIID() a function
Convert BUILD_SCSIID() into a function and add a scsi_device argument.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-9-hare@suse.de
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 14:23:14 -04:00
Hannes Reinecke
10f5aa018f scsi: aic7xxx: Do not reference SCSI command when resetting device
When sending a device reset we should not take a reference to the SCSI
command.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-8-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 14:23:14 -04:00
Hannes Reinecke
958230bcdd scsi: aic7xxx: Make BUILD_SCSIID() a function
Convert BUILD_SCSIID() into a function and add a scsi_device argument.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-7-hare@suse.de
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 14:23:14 -04:00
Hannes Reinecke
6a137a967b scsi: bnx2fc: Do not rely on a SCSI command for LUN or target reset
When a LUN or target reset is issued, we should not rely on a SCSI command
to be present; we'll have to reset the entire device or target anyway.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-6-hare@suse.de
Cc: Saurav Kashyap <skashyap@marvell.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 14:23:14 -04:00
Hannes Reinecke
ade4fb9457 scsi: qedf: Use FC rport as argument for qedf_initiate_tmf()
When sending a TMF we're only concerned with the rport and the LUN ID, so
use struct fc_rport as argument for qedf_initiate_tmf().

Signed-off-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20231002154328.43718-5-hare@suse.de
Cc: Saurav Kashyap <skashyap@marvell.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2023-10-13 14:23:14 -04:00