Fix an issue reported when performing a live migration when multipath is
configured with a short fast fail timeout of 5 seconds and also to have
no_path_retry set to fail. In this scenario, all paths would go into the
devloss state while the ibmvfc driver went through discovery to log back
in. On a loaded system, the discovery might take longer than 5 seconds,
which was resulting in all paths being marked failed, which then resulted
in a read only filesystem.
This patch changes the migration code in ibmvfc to avoid deleting rports at
all in this scenario, so we avoid losing all paths.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Link: https://lore.kernel.org/r/20221026181356.148517-1-brking@linux.vnet.ibm.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently the back pointer from a queue to the vhost adapter isn't set
until after subcrq interrupt registration. The value is available when a
queue is first allocated and can/should be also set for primary and async
queues as well as subcrqs.
This fixes a crash observed during kexec/kdump on Power 9 with legacy XICS
interrupt controller where a pending subcrq interrupt from the previous
kernel can be replayed immediately upon IRQ registration resulting in
dereference of a garbage backpointer in ibmvfc_interrupt_scsi().
Kernel attempted to read user page (58) - exploit attempt? (uid: 0)
BUG: Kernel NULL pointer dereference on read at 0x00000058
Faulting instruction address: 0xc008000003216a08
Oops: Kernel access of bad area, sig: 11 [#1]
...
NIP [c008000003216a08] ibmvfc_interrupt_scsi+0x40/0xb0 [ibmvfc]
LR [c0000000082079e8] __handle_irq_event_percpu+0x98/0x270
Call Trace:
[c000000047fa3d80] [c0000000123e6180] 0xc0000000123e6180 (unreliable)
[c000000047fa3df0] [c0000000082079e8] __handle_irq_event_percpu+0x98/0x270
[c000000047fa3ea0] [c000000008207d18] handle_irq_event+0x98/0x188
[c000000047fa3ef0] [c00000000820f564] handle_fasteoi_irq+0xc4/0x310
[c000000047fa3f40] [c000000008205c60] generic_handle_irq+0x50/0x80
[c000000047fa3f60] [c000000008015c40] __do_irq+0x70/0x1a0
[c000000047fa3f90] [c000000008016d7c] __do_IRQ+0x9c/0x130
[c000000014622f60] [0000000020000000] 0x20000000
[c000000014622ff0] [c000000008016e50] do_IRQ+0x40/0xa0
[c000000014623020] [c000000008017044] replay_soft_interrupts+0x194/0x2f0
[c000000014623210] [c0000000080172a8] arch_local_irq_restore+0x108/0x170
[c000000014623240] [c000000008eb1008] _raw_spin_unlock_irqrestore+0x58/0xb0
[c000000014623270] [c00000000820b12c] __setup_irq+0x49c/0x9f0
[c000000014623310] [c00000000820b7c0] request_threaded_irq+0x140/0x230
[c000000014623380] [c008000003212a50] ibmvfc_register_scsi_channel+0x1e8/0x2f0 [ibmvfc]
[c000000014623450] [c008000003213d1c] ibmvfc_init_sub_crqs+0xc4/0x1f0 [ibmvfc]
[c0000000146234d0] [c0080000032145a8] ibmvfc_reset_crq+0x150/0x210 [ibmvfc]
[c000000014623550] [c0080000032147c8] ibmvfc_init_crq+0x160/0x280 [ibmvfc]
[c0000000146235f0] [c00800000321a9cc] ibmvfc_probe+0x2a4/0x530 [ibmvfc]
Link: https://lore.kernel.org/r/20220616191126.1281259-2-tyreld@linux.ibm.com
Fixes: 3034ebe263 ("scsi: ibmvfc: Add alloc/dealloc routines for SCSI Sub-CRQ Channels")
Cc: stable@vger.kernel.org
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently, the sub-queues and event pool resources are allocated/freed for
every CRQ connection event such as reset and LPM. This exposes the driver
to a couple issues. First the inefficiency of freeing and reallocating
memory that can simply be resued after being sanitized. Further, a system
under memory pressue runs the risk of allocation failures that could result
in a crippled driver. Finally, there is a race window where command
submission/compeletion can try to pull/return elements from/to an event
pool that is being deleted or already has been deleted due to the lack of
host state around freeing/allocating resources. The following is an example
of list corruption following a live partition migration (LPM):
Oops: Exception in kernel mode, sig: 5 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: vfat fat isofs cdrom ext4 mbcache jbd2 nft_counter nft_compat nf_tables nfnetlink rpadlpar_io rpaphp xsk_diag nfsv3 nfs_acl nfs lockd grace fscache netfs rfkill bonding tls sunrpc pseries_rng drm drm_panel_orientation_quirks xfs libcrc32c dm_service_time sd_mod t10_pi sg ibmvfc scsi_transport_fc ibmveth vmx_crypto dm_multipath dm_mirror dm_region_hash dm_log dm_mod ipmi_devintf ipmi_msghandler fuse
CPU: 0 PID: 2108 Comm: ibmvfc_0 Kdump: loaded Not tainted 5.14.0-70.9.1.el9_0.ppc64le #1
NIP: c0000000007c4bb0 LR: c0000000007c4bac CTR: 00000000005b9a10
REGS: c00000025c10b760 TRAP: 0700 Not tainted (5.14.0-70.9.1.el9_0.ppc64le)
MSR: 800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 2800028f XER: 0000000f
CFAR: c0000000001f55bc IRQMASK: 0
GPR00: c0000000007c4bac c00000025c10ba00 c000000002a47c00 000000000000004e
GPR04: c0000031e3006f88 c0000031e308bd00 c00000025c10b768 0000000000000027
GPR08: 0000000000000000 c0000031e3009dc0 00000031e0eb0000 0000000000000000
GPR12: c0000031e2ffffa8 c000000002dd0000 c000000000187108 c00000020fcee2c0
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 c008000002f81300
GPR24: 5deadbeef0000100 5deadbeef0000122 c000000263ba6910 c00000024cc88000
GPR28: 000000000000003c c0000002430a0000 c0000002430ac300 000000000000c300
NIP [c0000000007c4bb0] __list_del_entry_valid+0x90/0x100
LR [c0000000007c4bac] __list_del_entry_valid+0x8c/0x100
Call Trace:
[c00000025c10ba00] [c0000000007c4bac] __list_del_entry_valid+0x8c/0x100 (unreliable)
[c00000025c10ba60] [c008000002f42284] ibmvfc_free_queue+0xec/0x210 [ibmvfc]
[c00000025c10bb10] [c008000002f4246c] ibmvfc_deregister_scsi_channel+0xc4/0x160 [ibmvfc]
[c00000025c10bba0] [c008000002f42580] ibmvfc_release_sub_crqs+0x78/0x130 [ibmvfc]
[c00000025c10bc20] [c008000002f4f6cc] ibmvfc_do_work+0x5c4/0xc70 [ibmvfc]
[c00000025c10bce0] [c008000002f4fdec] ibmvfc_work+0x74/0x1e8 [ibmvfc]
[c00000025c10bda0] [c0000000001872b8] kthread+0x1b8/0x1c0
[c00000025c10be10] [c00000000000cd64] ret_from_kernel_thread+0x5c/0x64
Instruction dump:
40820034 38600001 38210060 4e800020 7c0802a6 7c641b78 3c62fe7a 7d254b78
3863b590 f8010070 4ba309cd 60000000 <0fe00000> 7c0802a6 3c62fe7a 3863b640
---[ end trace 11a2b65a92f8b66c ]---
ibmvfc 30000003: Send warning. Receive queue closed, will retry.
Add registration/deregistration helpers that are called instead during
connection resets to sanitize and reconfigure the queues.
Link: https://lore.kernel.org/r/20220616191126.1281259-3-tyreld@linux.ibm.com
Fixes: 3034ebe263 ("scsi: ibmvfc: Add alloc/dealloc routines for SCSI Sub-CRQ Channels")
Cc: stable@vger.kernel.org
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This series consists of the usual driver updates (ufs, smartpqi, lpfc,
target, megaraid_sas, hisi_sas, qla2xxx) and minor updates and bug
fixes. Notable core changes are the removal of scsi->tag which caused
some churn in obsolete drivers and a sweep through all drivers to call
scsi_done() directly instead of scsi->done() which removes a pointer
indirection from the hot path and a move to register core sysfs files
earlier, which means they're available to KOBJ_ADD processing, which
necessitates switching all drivers to using attribute groups.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYYUfBCYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishbUJAQDZt4oc
vUx9JpyrdHxxTCuOzVFd8W1oJn0k5ltCBuz4yAD8DNbGhGm93raMSJ3FOOlzLEbP
RG8vBdpxMudlvxAPi/A=
=BSFz
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This consists of the usual driver updates (ufs, smartpqi, lpfc,
target, megaraid_sas, hisi_sas, qla2xxx) and minor updates and bug
fixes.
Notable core changes are the removal of scsi->tag which caused some
churn in obsolete drivers and a sweep through all drivers to call
scsi_done() directly instead of scsi->done() which removes a pointer
indirection from the hot path and a move to register core sysfs files
earlier, which means they're available to KOBJ_ADD processing, which
necessitates switching all drivers to using attribute groups"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (279 commits)
scsi: lpfc: Update lpfc version to 14.0.0.3
scsi: lpfc: Allow fabric node recovery if recovery is in progress before devloss
scsi: lpfc: Fix link down processing to address NULL pointer dereference
scsi: lpfc: Allow PLOGI retry if previous PLOGI was aborted
scsi: lpfc: Fix use-after-free in lpfc_unreg_rpi() routine
scsi: lpfc: Correct sysfs reporting of loop support after SFP status change
scsi: lpfc: Wait for successful restart of SLI3 adapter during host sg_reset
scsi: lpfc: Revert LOG_TRACE_EVENT back to LOG_INIT prior to driver_resource_setup()
scsi: ufs: ufshcd-pltfrm: Fix memory leak due to probe defer
scsi: ufs: mediatek: Avoid sched_clock() misuse
scsi: mpt3sas: Make mpt3sas_dev_attrs static
scsi: scsi_transport_sas: Add 22.5 Gbps link rate definitions
scsi: target: core: Stop using bdevname()
scsi: aha1542: Use memcpy_{from,to}_bvec()
scsi: sr: Add error handling support for add_disk()
scsi: sd: Add error handling support for add_disk()
scsi: target: Perform ALUA group changes in one step
scsi: target: Replace lun_tg_pt_gp_lock with rcu in I/O path
scsi: target: Fix alua_tg_pt_gps_count tracking
scsi: target: Fix ordered tag handling
...
The end goal of the current buffer overflow detection work[0] is to gain
full compile-time and run-time coverage of all detectable buffer overflows
seen via array indexing or memcpy(), memmove(), and memset(). The str*()
family of functions already have full coverage.
While much of the work for these changes have been on-going for many
releases (i.e. 0-element and 1-element array replacements, as well as
avoiding false positives and fixing discovered overflows[1]), this series
contains the foundational elements of several related buffer overflow
detection improvements by providing new common helpers and FORTIFY_SOURCE
changes needed to gain the introspection required for compiler visibility
into array sizes. Also included are a handful of already Acked instances
using the helpers (or related clean-ups), with many more waiting at the
ready to be taken via subsystem-specific trees[2]. The new helpers are:
- struct_group() for gaining struct member range introspection.
- memset_after() and memset_startat() for clearing to the end of structures.
- DECLARE_FLEX_ARRAY() for using flex arrays in unions or alone in structs.
Also included is the beginning of the refactoring of FORTIFY_SOURCE to
support memcpy() introspection, fix missing and regressed coverage under
GCC, and to prepare to fix the currently broken Clang support. Finishing
this work is part of the larger series[0], but depends on all the false
positives and buffer overflow bug fixes to have landed already and those
that depend on this series to land.
As part of the FORTIFY_SOURCE refactoring, a set of both a compile-time
and run-time tests are added for FORTIFY_SOURCE and the mem*()-family
functions respectively. The compile time tests have found a legitimate
(though corner-case) bug[6] already.
Please note that the appearance of "panic" and "BUG" in the
FORTIFY_SOURCE refactoring are the result of relocating existing code,
and no new use of those code-paths are expected nor desired.
Finally, there are two tree-wide conversions for 0-element arrays and
flexible array unions to gain sane compiler introspection coverage that
result in no known object code differences.
After this series (and the changes that have now landed via netdev
and usb), we are very close to finally being able to build with
-Warray-bounds and -Wzero-length-bounds. However, due corner cases in
GCC[3] and Clang[4], I have not included the last two patches that turn
on these options, as I don't want to introduce any known warnings to
the build. Hopefully these can be solved soon.
[0] https://lore.kernel.org/lkml/20210818060533.3569517-1-keescook@chromium.org/
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?qt=grep&q=FORTIFY_SOURCE
[2] https://lore.kernel.org/lkml/202108220107.3E26FE6C9C@keescook/
[3] https://lore.kernel.org/lkml/3ab153ec-2798-da4c-f7b1-81b0ac8b0c5b@roeck-us.net/
[4] https://bugs.llvm.org/show_bug.cgi?id=51682
[5] https://lore.kernel.org/lkml/202109051257.29B29745C0@keescook/
[6] https://lore.kernel.org/lkml/20211020200039.170424-1-keescook@chromium.org/
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmGAFWcWHGtlZXNjb29r
QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJmKFD/45MJdnvW5MhIEeW5tc5UjfcIPS
ae+YvlEX/2ZwgSlTxocFVocE6hz7b6eCiX3dSAChPkPxsSfgeiuhjxsU+4ROnELR
04RqTA/rwT6JXfJcXbDPXfxDL4huUkgktAW3m1sT771AZspeap2GrSwFyttlTqKA
+kTiZ3lXJVFcw10uyhfp3Lk6eFJxdf5iOjuEou5kBOQfpNKEOduRL2K15hSowOwB
lARiAC+HbmN+E+npvDE7YqK4V7ZQ0/dtB0BlfqgTkn1spQz8N21kBAMpegV5vvIk
A+qGHc7q2oyk4M14TRTidQHGQ4juW1Kkvq3NV6KzwQIVD+mIfz0ESn3d4tnp28Hk
Y+OXTI1BRFlApQU9qGWv33gkNEozeyqMLDRLKhDYRSFPA9UKkpgXQRzeTzoLKyrQ
4B6n5NnUGcu7I6WWhpyZQcZLDsHGyy0vHzjQGs/NXtb1PzXJ5XIGuPdmx9pVMykk
IVKnqRcWyGWahfh3asOnoXvdhi1No4NSHQ/ZHfUM+SrIGYjBMaUisw66qm3Fe8ZU
lbO2CFkCsfGSoKNPHf0lUEGlkyxAiDolazOfflDNxdzzlZo2X1l/a7O/yoO4Pqul
cdL0eDjiNoQ2YR2TSYPnXq5KSL1RI0tlfS8pH8k1hVhZsQx0wpAQ+qki0S+fLePV
PdA9XB82G2tmqKc9cQ==
=9xbT
-----END PGP SIGNATURE-----
Merge tag 'overflow-v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull overflow updates from Kees Cook:
"The end goal of the current buffer overflow detection work[0] is to
gain full compile-time and run-time coverage of all detectable buffer
overflows seen via array indexing or memcpy(), memmove(), and
memset(). The str*() family of functions already have full coverage.
While much of the work for these changes have been on-going for many
releases (i.e. 0-element and 1-element array replacements, as well as
avoiding false positives and fixing discovered overflows[1]), this
series contains the foundational elements of several related buffer
overflow detection improvements by providing new common helpers and
FORTIFY_SOURCE changes needed to gain the introspection required for
compiler visibility into array sizes. Also included are a handful of
already Acked instances using the helpers (or related clean-ups), with
many more waiting at the ready to be taken via subsystem-specific
trees[2].
The new helpers are:
- struct_group() for gaining struct member range introspection
- memset_after() and memset_startat() for clearing to the end of
structures
- DECLARE_FLEX_ARRAY() for using flex arrays in unions or alone in
structs
Also included is the beginning of the refactoring of FORTIFY_SOURCE to
support memcpy() introspection, fix missing and regressed coverage
under GCC, and to prepare to fix the currently broken Clang support.
Finishing this work is part of the larger series[0], but depends on
all the false positives and buffer overflow bug fixes to have landed
already and those that depend on this series to land.
As part of the FORTIFY_SOURCE refactoring, a set of both a
compile-time and run-time tests are added for FORTIFY_SOURCE and the
mem*()-family functions respectively. The compile time tests have
found a legitimate (though corner-case) bug[6] already.
Please note that the appearance of "panic" and "BUG" in the
FORTIFY_SOURCE refactoring are the result of relocating existing code,
and no new use of those code-paths are expected nor desired.
Finally, there are two tree-wide conversions for 0-element arrays and
flexible array unions to gain sane compiler introspection coverage
that result in no known object code differences.
After this series (and the changes that have now landed via netdev and
usb), we are very close to finally being able to build with
-Warray-bounds and -Wzero-length-bounds.
However, due corner cases in GCC[3] and Clang[4], I have not included
the last two patches that turn on these options, as I don't want to
introduce any known warnings to the build. Hopefully these can be
solved soon"
Link: https://lore.kernel.org/lkml/20210818060533.3569517-1-keescook@chromium.org/ [0]
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?qt=grep&q=FORTIFY_SOURCE [1]
Link: https://lore.kernel.org/lkml/202108220107.3E26FE6C9C@keescook/ [2]
Link: https://lore.kernel.org/lkml/3ab153ec-2798-da4c-f7b1-81b0ac8b0c5b@roeck-us.net/ [3]
Link: https://bugs.llvm.org/show_bug.cgi?id=51682 [4]
Link: https://lore.kernel.org/lkml/202109051257.29B29745C0@keescook/ [5]
Link: https://lore.kernel.org/lkml/20211020200039.170424-1-keescook@chromium.org/ [6]
* tag 'overflow-v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (30 commits)
fortify: strlen: Avoid shadowing previous locals
compiler-gcc.h: Define __SANITIZE_ADDRESS__ under hwaddress sanitizer
treewide: Replace 0-element memcpy() destinations with flexible arrays
treewide: Replace open-coded flex arrays in unions
stddef: Introduce DECLARE_FLEX_ARRAY() helper
btrfs: Use memset_startat() to clear end of struct
string.h: Introduce memset_startat() for wiping trailing members and padding
xfrm: Use memset_after() to clear padding
string.h: Introduce memset_after() for wiping trailing members/padding
lib: Introduce CONFIG_MEMCPY_KUNIT_TEST
fortify: Add compile-time FORTIFY_SOURCE tests
fortify: Allow strlen() and strnlen() to pass compile-time known lengths
fortify: Prepare to improve strnlen() and strlen() warnings
fortify: Fix dropped strcpy() compile-time write overflow check
fortify: Explicitly disable Clang support
fortify: Move remaining fortify helpers into fortify-string.h
lib/string: Move helper functions out of string.c
compiler_types.h: Remove __compiletime_object_size()
cm4000_cs: Use struct_group() to zero struct cm4000_dev region
can: flexcan: Use struct_group() to zero struct flexcan_regs regions
...
Commit a264cf5e81 ("scsi: ibmvfc: Fix command state accounting and stale
response detection") introduced a regression in detecting duplicate
responses. This was observed in test where a command was sent to the VIOS
and completed before ibmvfc_send_event() set the active flag to 1, which
resulted in the atomic_dec_if_positive() call in ibmvfc_handle_crq()
thinking this was a duplicate response, which resulted in scsi_done() not
getting called, so we then hit a SCSI command timeout for this command once
the timeout expires. This simply ensures the active flag gets set prior to
making the hcall to send the command to the VIOS, in order to close this
window.
Link: https://lore.kernel.org/r/20211019152129.16558-1-brking@linux.vnet.ibm.com
Fixes: a264cf5e81 ("scsi: ibmvfc: Fix command state accounting and stale response detection")
Cc: stable@vger.kernel.org
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During driver probe we allocate a dma region for our event pool.
Currently, zero is passed for the gfp_flags parameter. Driver probe
callbacks are run in process context and we hold no locks so we can sleep
here if necessary.
Fix by passing GFP_KERNEL explicitly to dma_alloc_coherent().
Link: https://lore.kernel.org/r/1547089149-20577-1-git-send-email-tyreld@linux.vnet.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
struct device supports attribute groups directly but does not support
struct device_attribute directly. Hence switch to attribute groups.
Link: https://lore.kernel.org/r/20211012233558.4066756-25-bvanassche@acm.org
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
struct device supports attribute groups directly but does not support
struct device_attribute directly. Hence switch to attribute groups.
Link: https://lore.kernel.org/r/20211012233558.4066756-24-bvanassche@acm.org
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The DEF_SCSI_QCMD() macro passes the addresses of the SCSI host lock and
also that of the scsi_done function to the queuecommand_lck() function
implementations. Remove the 'scsi_done' argument since its address is
now a constant and instead call 'scsi_done' directly from inside the
queuecommand_lck() functions.
Link: https://lore.kernel.org/r/20211007204618.2196847-14-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.
Instead of writing beyond the end of evt_struct->iu.srp.cmd, target the
upper union (evt_struct->iu.srp) instead, as that's what is being wiped.
Cc: Tyrel Datwyler <tyreld@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/lkml/yq135rzp79c.fsf@ca-mkp.ca.oracle.com
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Link: https://lore.kernel.org/lkml/6eae8434-e9a7-aa74-628b-b515b3695359@linux.ibm.com
The initial device scan might take some time, and there really is no need
to wait for it during probe(). So return immediately from scsi_scan_host()
during probe() and avoid any udev stalls during booting.
Link: https://lore.kernel.org/r/20210817075306.11315-1-mwilck@suse.com
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Prepare for removal of the request pointer by using scsi_cmd_to_rq()
instead. This patch does not change any functionality.
Link: https://lore.kernel.org/r/20210809230355.8186-25-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Prepare for removal of the request pointer by using scsi_cmd_to_rq()
instead. This patch does not change any functionality.
Link: https://lore.kernel.org/r/20210809230355.8186-24-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Prior to commit 1f4a4a1950 ("scsi: ibmvfc: Complete commands outside the
host/queue lock") responses to commands were completed sequentially with
the host lock held such that a command had a basic binary state of active
or free. It was therefore a simple affair of ensuring the assocaiated
ibmvfc_event to a VIOS response was valid by testing that it was not
already free. The lock relexation work to complete commands outside the
lock inadverdently made it a trinary command state such that a command is
either in flight, received and being completed, or completed and now
free. This breaks the stale command detection logic as a command may be
still marked active and been placed on the delayed completion list when a
second stale response for the same command arrives. This can lead to double
completions and list corruption. This issue was exposed by a recent VIOS
regression were a missing memory barrier could occasionally result in the
ibmvfc client receiving a duplicate response for the same command.
Fix the issue by introducing the atomic ibmvfc_event.active to track the
trinary state of a command. The state is explicitly set to 1 when a command
is successfully sent. The CRQ response handlers use
atomic_dec_if_positive() to test for stale responses and correctly
transition to the completion state when a active command is received.
Finally, atomic_dec_and_test() is used to sanity check transistions when
commands are freed as a result of a completion, or moved to the purge list
as a result of error handling or adapter reset.
Link: https://lore.kernel.org/r/20210716205220.1101150-1-tyreld@linux.ibm.com
Fixes: 1f4a4a1950 ("scsi: ibmvfc: Complete commands outside the host/queue lock")
Cc: stable@vger.kernel.org
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This series consists of the usual driver updates (ufs, ibmvfc,
megaraid_sas, lpfc, elx, mpi3mr, qedi, iscsi, storvsc, mpt3sas) with
elx and mpi3mr being new drivers. The major core change is a rework
to drop the status byte handling macros and the old bit shifted
definitions and the rest of the updates are minor fixes.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYN7I6iYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishXpRAQCkngYZ
35yQrqOxgOk2pfrysE95tHrV1MfJm2U49NFTwAEAuZutEvBUTfBF+sbcJ06r6q7i
H0hkJN/Io7enFs5v3WA=
=zwIa
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This series consists of the usual driver updates (ufs, ibmvfc,
megaraid_sas, lpfc, elx, mpi3mr, qedi, iscsi, storvsc, mpt3sas) with
elx and mpi3mr being new drivers.
The major core change is a rework to drop the status byte handling
macros and the old bit shifted definitions and the rest of the updates
are minor fixes"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (287 commits)
scsi: aha1740: Avoid over-read of sense buffer
scsi: arcmsr: Avoid over-read of sense buffer
scsi: ips: Avoid over-read of sense buffer
scsi: ufs: ufs-mediatek: Add missing of_node_put() in ufs_mtk_probe()
scsi: elx: libefc: Fix IRQ restore in efc_domain_dispatch_frame()
scsi: elx: libefc: Fix less than zero comparison of a unsigned int
scsi: elx: efct: Fix pointer error checking in debugfs init
scsi: elx: efct: Fix is_originator return code type
scsi: elx: efct: Fix link error for _bad_cmpxchg
scsi: elx: efct: Eliminate unnecessary boolean check in efct_hw_command_cancel()
scsi: elx: efct: Do not use id uninitialized in efct_lio_setup_session()
scsi: elx: efct: Fix error handling in efct_hw_init()
scsi: elx: efct: Remove redundant initialization of variable lun
scsi: elx: efct: Fix spelling mistake "Unexected" -> "Unexpected"
scsi: lpfc: Fix build error in lpfc_scsi.c
scsi: target: iscsi: Remove redundant continue statement
scsi: qla4xxx: Remove redundant continue statement
scsi: ppa: Switch to use module_parport_driver()
scsi: imm: Switch to use module_parport_driver()
scsi: mpt3sas: Fix error return value in _scsih_expander_add()
...
A couple of ibmvscsi files are missing the inclusion of linux/of.h
and linux/irqdomain.h, relying on transitive inclusion from another
file.
As we are about to break this dependency, make sure these dependencies
are explicit.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Add a helper function scsi_status_is_check_condition() to encapsulate the
frequent checks for SAM_STAT_CHECK_CONDITION.
Link: https://lore.kernel.org/r/20210427083046.31620-9-hare@suse.de
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If rport target discovery commands fail for some reason, they get retried
up to a set number of retries. Once the retry limit is exceeded, the target
is deleted. In order to delete the target, we either need to do an implicit
logout or a move login. In the move login case, if the move login fails, we
want to retry it. This ensures the retry counter gets reinitialized so the
move login will get retried.
Link: https://lore.kernel.org/r/1620756740-7045-4-git-send-email-brking@linux.vnet.ibm.com
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If fast fail is enabled and we encounter a WWPN moving from one port id to
another port id with I/O outstanding, if we use the move login MAD,
although it will work, it will leave any outstanding I/O still outstanding
to the old port id. Eventually, the SCSI command timers will fire and we
will abort these commands, however, this is generally much longer than the
fast fail timeout, which can lead to I/O operations being outstanding for a
long time. This patch changes the behavior to avoid the move login if fast
fail is enabled. Once terminate_rport_io cleans up the rport, then we force
the target back through the delete process, which re-drives the implicit
logout, then kicks us back into discovery where we will discover the WWPN
at the new location and do a PLOGI to it.
Link: https://lore.kernel.org/r/1620756740-7045-3-git-send-email-brking@linux.vnet.ibm.com
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When service is being performed on an SVC with NPIV enabled, the WWPN of
the canister / node being serviced fails over to the another canister /
node. This looks to the ibmvfc driver as a WWPN moving from one SCSI ID to
another. The driver will first attempt to do an implicit logout of the old
SCSI ID. If this works, we simply delete the rport at the old location and
add an rport at the new location and the FC transport class handles
everything. However, if there is I/O outstanding, this implicit logout will
fail, in which case we will send a "move login" request to the VIOS. This
will cancel any outstanding I/O to that port, logout the port, and PLOGI
the new port. Recently we've encountered a scenario where the move login
fails. This was resulting in an attempted plogi to the new scsi id, without
the old scsi id getting logged out, which is a VIOS protocol violation. To
solve this, we want to keep tracking the old scsi id as the current scsi
id. That way, once terminate_rport_io cancels the outstanding i/o, it will
send us back through to do an implicit logout of the old scsi id, rather
than the new scsi id, and then we can plogi the new scsi id.
Link: https://lore.kernel.org/r/1620756740-7045-2-git-send-email-brking@linux.vnet.ibm.com
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This series consists of the usual driver updates (ufs, target, tcmu,
smartpqi, lpfc, zfcp, qla2xxx, mpt3sas, pm80xx). The major core
change is using a sbitmap instead of an atomic for queue tracking.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYInvqCYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishYh2AP0SgqqL
WYZRT2oiyBOKD28v+ceOSiXvgjPlqABwVMC0BAEAn29/wNCxyvzZ1k/b0iPJ4M+S
klkSxLzXKQLzJBgdK5w=
=p5B/
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This consists of the usual driver updates (ufs, target, tcmu,
smartpqi, lpfc, zfcp, qla2xxx, mpt3sas, pm80xx).
The major core change is using a sbitmap instead of an atomic for
queue tracking"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (412 commits)
scsi: target: tcm_fc: Fix a kernel-doc header
scsi: target: Shorten ALUA error messages
scsi: target: Fix two format specifiers
scsi: target: Compare explicitly with SAM_STAT_GOOD
scsi: sd: Introduce a new local variable in sd_check_events()
scsi: dc395x: Open-code status_byte(u8) calls
scsi: 53c700: Open-code status_byte(u8) calls
scsi: smartpqi: Remove unused functions
scsi: qla4xxx: Remove an unused function
scsi: myrs: Remove unused functions
scsi: myrb: Remove unused functions
scsi: mpt3sas: Fix two kernel-doc headers
scsi: fcoe: Suppress a compiler warning
scsi: libfc: Fix a format specifier
scsi: aacraid: Remove an unused function
scsi: core: Introduce enum scsi_disposition
scsi: core: Modify the scsi_send_eh_cmnd() return value for the SDEV_BLOCK case
scsi: core: Rename scsi_softirq_done() into scsi_complete()
scsi: core: Remove an incorrect comment
scsi: core: Make the scsi_alloc_sgtables() documentation more accurate
...
This fixes an issue hitting the BUG_ON() in ibmvfc_do_work(). When going
through a host action of IBMVFC_HOST_ACTION_RESET, we change the action to
IBMVFC_HOST_ACTION_TGT_DEL, then drop the host lock, and reset the CRQ,
which changes the host state to IBMVFC_NO_CRQ. If, prior to setting the
host state to IBMVFC_NO_CRQ, ibmvfc_init_host() is called, it can then end
up changing the host action to IBMVFC_HOST_ACTION_INIT. If we then change
the host state to IBMVFC_NO_CRQ, we will then hit the BUG_ON().
Make a couple of changes to avoid this. Leave the host action to be
IBMVFC_HOST_ACTION_RESET or IBMVFC_HOST_ACTION_REENABLE until after we drop
the host lock and reset or reenable the CRQ. Also harden the host state
machine to ensure we cannot leave the reset / reenable state until we've
finished processing the reset or reenable.
Link: https://lore.kernel.org/r/20210413001009.902400-1-tyreld@linux.ibm.com
Fixes: 73ee5d8672 ("[SCSI] ibmvfc: Fix soft lockup on resume")
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
[tyreld: added fixes tag]
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
[mkp: fix comment checkpatch warnings]
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull 5.12/scsi-fixes into the 5.13 SCSI tree to provide a baseline for
some UFS changes that would otherwise cause conflicts during the
merge.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Seven fixes, all in drivers (qla2xxx, mkt3sas, qedi, target,
ibmvscsi). The most serious are the target pscsi oom and the qla2xxx
revert which can otherwise cause a use after free.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYF/V2yYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pisha+AAQCSO7Wv
Wshi7H9N8SQUJO9EcmkrSApGDtjHUfZYlnse0AD/UhbpCSzSEPzI23lfduMB3QTa
o5wmwHEIG8ULneB57vE=
=f44O
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Seven fixes, all in drivers (qla2xxx, mkt3sas, qedi, target,
ibmvscsi).
The most serious are the target pscsi oom and the qla2xxx revert which
can otherwise cause a use after free"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: target: pscsi: Clean up after failure in pscsi_map_sg()
scsi: target: pscsi: Avoid OOM in pscsi_map_sg()
scsi: mpt3sas: Fix error return code of mpt3sas_base_attach()
scsi: qedi: Fix error return code of qedi_alloc_global_queues()
scsi: Revert "qla2xxx: Make sure that aborted commands are freed"
scsi: ibmvfc: Make ibmvfc_wait_for_ops() MQ aware
scsi: ibmvfc: Fix potential race in ibmvfc_wait_for_ops()
During MQ enablement of the ibmvfc driver ibmvfc_wait_for_ops() was
missed. This function is responsible for waiting on commands to complete
that match a certain criteria such as LUN or cancel key. The implementation
as is only scans the CRQ for events ignoring any sub-queues and as a result
will exit successfully without doing anything when operating in MQ
channelized mode.
Check the MQ and channel use flags to determine which queues are
applicable, and scan each queue accordingly. Note in MQ mode SCSI commands
are only issued down sub-queues and the CRQ is only used for driver
specific management commands. As such the CRQ events are ignored when
operating in MQ mode with channels.
Link: https://lore.kernel.org/r/20210319205029.312969-3-tyreld@linux.ibm.com
Fixes: 9000cb998b ("scsi: ibmvfc: Enable MQ and set reasonable defaults")
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For various EH activities the ibmvfc driver uses ibmvfc_wait_for_ops() to
wait for the completion of commands that match a given criteria be it
cancel key, or specific LUN. With recent changes commands are completed
outside the lock in bulk by removing them from the sent list and adding
them to a private completion list. This introduces a potential race in
ibmvfc_wait_for_ops() since the criteria for a command to be outstanding is
no longer simply being on the sent list, but instead not being on the free
list.
Avoid this race by scanning the entire command event pool and checking that
any matching command that ibmvfc needs to wait on is not already on the
free list.
Link: https://lore.kernel.org/r/20210319205029.312969-2-tyreld@linux.ibm.com
Fixes: 1f4a4a1950 ("scsi: ibmvfc: Complete commands outside the host/queue lock")
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Eight fixes, all in drivers, all fairly minor either being fixes in
error legs, memory leaks on teardown, context errors or semantic
problems.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYFYmeiYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishXQRAPwKUNlT
7KbOMx5MqsBq+/0m7iVUHEDg0kNJwYslEL0jSQEAnauUYfDI34z6cPXx4L+hqOiM
wP5dRGK4rs1u92AJmoY=
=h42D
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Eight fixes, all in drivers, all fairly minor either being fixes in
error legs, memory leaks on teardown, context errors or semantic
problems"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: mpt3sas: Do not use GFP_KERNEL in atomic context
scsi: ufs: ufs-mediatek: Correct operator & -> &&
scsi: sd_zbc: Update write pointer offset cache
scsi: lpfc: Fix some error codes in debugfs
scsi: qla2xxx: Fix broken #endif placement
scsi: st: Fix a use after free in st_open()
scsi: myrs: Fix a double free in myrs_cleanup()
scsi: ibmvfc: Free channel_setup_buf during device tear down
Fixes the following W=1 kernel build warning(s):
drivers/scsi/ibmvscsi/ibmvfc.c:331: warning: Function parameter or member 'vhost' not described in 'ibmvfc_get_err_result'
drivers/scsi/ibmvscsi/ibmvfc.c:653: warning: Excess function parameter 'job_step' description in 'ibmvfc_del_tgt'
drivers/scsi/ibmvscsi/ibmvfc.c:773: warning: Function parameter or member 'queue' not described in 'ibmvfc_init_event_pool'
drivers/scsi/ibmvscsi/ibmvfc.c:773: warning: Function parameter or member 'size' not described in 'ibmvfc_init_event_pool'
drivers/scsi/ibmvscsi/ibmvfc.c:823: warning: Function parameter or member 'queue' not described in 'ibmvfc_free_event_pool'
drivers/scsi/ibmvscsi/ibmvfc.c:1413: warning: Function parameter or member 'vhost' not described in 'ibmvfc_gather_partition_info'
drivers/scsi/ibmvscsi/ibmvfc.c:1483: warning: Function parameter or member 'queue' not described in 'ibmvfc_get_event'
drivers/scsi/ibmvscsi/ibmvfc.c:1483: warning: Excess function parameter 'vhost' description in 'ibmvfc_get_event'
drivers/scsi/ibmvscsi/ibmvfc.c:1630: warning: Function parameter or member 't' not described in 'ibmvfc_timeout'
drivers/scsi/ibmvscsi/ibmvfc.c:1630: warning: Excess function parameter 'evt' description in 'ibmvfc_timeout'
drivers/scsi/ibmvscsi/ibmvfc.c:1893: warning: Function parameter or member 'shost' not described in 'ibmvfc_queuecommand'
drivers/scsi/ibmvscsi/ibmvfc.c:1893: warning: Excess function parameter 'done' description in 'ibmvfc_queuecommand'
drivers/scsi/ibmvscsi/ibmvfc.c:2324: warning: Function parameter or member 'rport' not described in 'ibmvfc_match_rport'
drivers/scsi/ibmvscsi/ibmvfc.c:2324: warning: Excess function parameter 'device' description in 'ibmvfc_match_rport'
drivers/scsi/ibmvscsi/ibmvfc.c:3133: warning: Function parameter or member 'evt_doneq' not described in 'ibmvfc_handle_crq'
drivers/scsi/ibmvscsi/ibmvfc.c:3317: warning: Excess function parameter 'reason' description in 'ibmvfc_change_queue_depth'
drivers/scsi/ibmvscsi/ibmvfc.c:3390: warning: Function parameter or member 'attr' not described in 'ibmvfc_show_log_level'
drivers/scsi/ibmvscsi/ibmvfc.c:3413: warning: Function parameter or member 'attr' not described in 'ibmvfc_store_log_level'
drivers/scsi/ibmvscsi/ibmvfc.c:3413: warning: Function parameter or member 'count' not described in 'ibmvfc_store_log_level'
drivers/scsi/ibmvscsi/ibmvfc.c:4121: warning: Function parameter or member 'done' not described in '__ibmvfc_tgt_get_implicit_logout_evt'
drivers/scsi/ibmvscsi/ibmvfc.c:4438: warning: Function parameter or member 't' not described in 'ibmvfc_adisc_timeout'
drivers/scsi/ibmvscsi/ibmvfc.c:4438: warning: Excess function parameter 'tgt' description in 'ibmvfc_adisc_timeout'
drivers/scsi/ibmvscsi/ibmvfc.c:4641: warning: Function parameter or member 'target' not described in 'ibmvfc_alloc_target'
drivers/scsi/ibmvscsi/ibmvfc.c:4641: warning: Excess function parameter 'scsi_id' description in 'ibmvfc_alloc_target'
drivers/scsi/ibmvscsi/ibmvfc.c:5068: warning: Function parameter or member 'evt' not described in 'ibmvfc_npiv_logout_done'
drivers/scsi/ibmvscsi/ibmvfc.c:5068: warning: Excess function parameter 'vhost' description in 'ibmvfc_npiv_logout_done'
Link: https://lore.kernel.org/r/20210317091230.2912389-35-lee.jones@linaro.org
Cc: Tyrel Datwyler <tyreld@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Brian King <brking@linux.vnet.ibm.com>
Cc: linux-scsi@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fixes the following W=1 kernel build warning(s):
drivers/scsi/ibmvscsi/ibmvscsi.c:143: warning: Function parameter or member 'hostdata' not described in 'ibmvscsi_release_crq_queue'
drivers/scsi/ibmvscsi/ibmvscsi.c:143: warning: Function parameter or member 'max_requests' not described in 'ibmvscsi_release_crq_queue'
drivers/scsi/ibmvscsi/ibmvscsi.c:143: warning: expecting prototype for release_crq_queue(). Prototype was for ibmvscsi_release_crq_queue() instead
drivers/scsi/ibmvscsi/ibmvscsi.c:286: warning: expecting prototype for reset_crq_queue(). Prototype was for ibmvscsi_reset_crq_queue() instead
drivers/scsi/ibmvscsi/ibmvscsi.c:328: warning: Function parameter or member 'max_requests' not described in 'ibmvscsi_init_crq_queue'
drivers/scsi/ibmvscsi/ibmvscsi.c:328: warning: expecting prototype for initialize_crq_queue(). Prototype was for ibmvscsi_init_crq_queue() instead
drivers/scsi/ibmvscsi/ibmvscsi.c:414: warning: expecting prototype for reenable_crq_queue(). Prototype was for ibmvscsi_reenable_crq_queue() instead
drivers/scsi/ibmvscsi/ibmvscsi.c:536: warning: expecting prototype for ibmvscsi_free(). Prototype was for free_event_struct() instead
drivers/scsi/ibmvscsi/ibmvscsi.c:558: warning: expecting prototype for get_evt_struct(). Prototype was for get_event_struct() instead
drivers/scsi/ibmvscsi/ibmvscsi.c:587: warning: Function parameter or member 'evt_struct' not described in 'init_event_struct'
drivers/scsi/ibmvscsi/ibmvscsi.c:587: warning: Excess function parameter 'evt' description in 'init_event_struct'
drivers/scsi/ibmvscsi/ibmvscsi.c:608: warning: Function parameter or member 'cmd' not described in 'set_srp_direction'
drivers/scsi/ibmvscsi/ibmvscsi.c:608: warning: Function parameter or member 'srp_cmd' not described in 'set_srp_direction'
drivers/scsi/ibmvscsi/ibmvscsi.c:608: warning: Function parameter or member 'numbuf' not described in 'set_srp_direction'
drivers/scsi/ibmvscsi/ibmvscsi.c:641: warning: Function parameter or member 'evt_struct' not described in 'unmap_cmd_data'
drivers/scsi/ibmvscsi/ibmvscsi.c:683: warning: Function parameter or member 'evt_struct' not described in 'map_sg_data'
drivers/scsi/ibmvscsi/ibmvscsi.c:757: warning: Function parameter or member 'evt_struct' not described in 'map_data_for_srp_cmd'
drivers/scsi/ibmvscsi/ibmvscsi.c:783: warning: Function parameter or member 'error_code' not described in 'purge_requests'
drivers/scsi/ibmvscsi/ibmvscsi.c:846: warning: Function parameter or member 't' not described in 'ibmvscsi_timeout'
drivers/scsi/ibmvscsi/ibmvscsi.c:846: warning: Excess function parameter 'evt_struct' description in 'ibmvscsi_timeout'
drivers/scsi/ibmvscsi/ibmvscsi.c:1043: warning: Function parameter or member 'cmnd' not described in 'ibmvscsi_queuecommand_lck'
drivers/scsi/ibmvscsi/ibmvscsi.c:1043: warning: expecting prototype for ibmvscsi_queue(). Prototype was for ibmvscsi_queuecommand_lck() instead
drivers/scsi/ibmvscsi/ibmvscsi.c:1351: warning: expecting prototype for init_host(). Prototype was for enable_fast_fail() instead
drivers/scsi/ibmvscsi/ibmvscsi.c:1464: warning: Function parameter or member 'hostdata' not described in 'init_adapter'
drivers/scsi/ibmvscsi/ibmvscsi.c:1475: warning: Function parameter or member 'evt_struct' not described in 'sync_completion'
drivers/scsi/ibmvscsi/ibmvscsi.c:1488: warning: Function parameter or member 'cmd' not described in 'ibmvscsi_eh_abort_handler'
drivers/scsi/ibmvscsi/ibmvscsi.c:1488: warning: expecting prototype for ibmvscsi_abort(). Prototype was for ibmvscsi_eh_abort_handler() instead
drivers/scsi/ibmvscsi/ibmvscsi.c:1627: warning: Function parameter or member 'cmd' not described in 'ibmvscsi_eh_device_reset_handler'
drivers/scsi/ibmvscsi/ibmvscsi.c:1893: warning: Excess function parameter 'reason' description in 'ibmvscsi_change_queue_depth'
drivers/scsi/ibmvscsi/ibmvscsi.c:2221: warning: Function parameter or member 'vdev' not described in 'ibmvscsi_probe'
drivers/scsi/ibmvscsi/ibmvscsi.c:2221: warning: Function parameter or member 'id' not described in 'ibmvscsi_probe'
drivers/scsi/ibmvscsi/ibmvscsi.c:2221: warning: expecting prototype for Called by bus code for each adapter(). Prototype was for ibmvscsi_probe() instead
drivers/scsi/ibmvscsi/ibmvscsi.c:2381: warning: cannot understand function prototype: 'const struct vio_device_id ibmvscsi_device_table[] = '
[mkp: fix checkpatch whitespace warning]
Link: https://lore.kernel.org/r/20210317091230.2912389-34-lee.jones@linaro.org
Cc: Tyrel Datwyler <tyreld@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Colin DeVilbiss <devilbis@us.ibm.com>
Cc: Santiago Leon <santil@us.ibm.com>
Cc: Dave Boutcher <sleddog@us.ibm.com>
Cc: linux-scsi@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The buffer for negotiating channel setup is DMA allocated at device probe
time. However, the remove path fails to free this allocation which will
prevent the hypervisor from releasing the virtual device in the case of a
hotplug remove.
Fix this issue by freeing the buffer allocation in ibmvfc_free_mem().
Link: https://lore.kernel.org/r/20210311012212.428068-1-tyreld@linux.ibm.com
Fixes: e95eef3fc0 ("scsi: ibmvfc: Implement channel enquiry and setup commands")
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
10 updates, a non code maintainer update for vmw_pvscsi, five code
updates for ibmvfc and four for UFS. All updates are either trivial
patches or bug fixes.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYEu5+iYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishVWMAP9Xn5jn
qfb+4W2qy8tydsUfQOO7t5j+yNYtF0wDFOOGqAEAoE+1DNrOfls62Agn29pcde2N
5oQe5TcxwsoC3Bir7B4=
=bsMt
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Ten updates: one non code maintainer update for vmw_pvscsi, five code
updates for ibmvfc and four for UFS.
All are either trivial patches or bug fixes"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: vmw_pvscsi: MAINTAINERS: Update maintainer
scsi: ufs: Convert sysfs sprintf/snprintf family to sysfs_emit
scsi: ufs: Remove redundant checks of !hba in suspend/resume callbacks
scsi: ufs: ufs-qcom: Disable interrupt in reset path
scsi: ufs: Minor adjustments to error handling
scsi: ibmvfc: Reinitialize sub-CRQs and perform channel enquiry after LPM
scsi: ibmvfc: Store return code of H_FREE_SUB_CRQ during cleanup
scsi: ibmvfc: Treat H_CLOSED as success during sub-CRQ registration
scsi: ibmvfc: Fix invalid sub-CRQ handles after hard reset
scsi: ibmvfc: Simplify handling of sub-CRQ initialization
A live partition migration (LPM) results in a CRQ disconnect similar to a
hard reset. In this LPM case the hypervisor mostly preserves the CRQ
transport such that it simply needs to be reenabled. However, the
capabilities may have changed such as fewer channels, or no channels at
all. Further, its possible that there may be sub-CRQ support, but no
channel support. The CRQ reenable path currently doesn't take any of this
into consideration.
For simplicity release and reinitialize sub-CRQs during reenable, and set
do_enquiry and using_channels with the appropriate values to trigger
channel renegotiation.
Link: https://lore.kernel.org/r/20210302230543.9905-6-tyreld@linux.ibm.com
Fixes: 3034ebe263 ("scsi: ibmvfc: Add alloc/dealloc routines for SCSI Sub-CRQ Channels")
Reviewed-by: Brian King <brking@linux.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The H_FREE_SUB_CRQ hypercall can return a retry delay return code that
indicates the call needs to be retried after a specific amount of time
delay. The error path to free a sub-CRQ in case of a failure during channel
registration fails to capture the return code of H_FREE_SUB_CRQ which will
result in the delay loop being skipped in the case of a retry delay return
code.
Store the return code result of the H_FREE_SUB_CRQ call such that the
return code check in the delay loop evaluates a meaningful value. Also, use
the rtas_busy_delay() to check the rc value and delay for the appropriate
amount of time.
Link: https://lore.kernel.org/r/20210302230543.9905-5-tyreld@linux.ibm.com
Fixes: 39e461fddf ("scsi: ibmvfc: Map/request irq and register Sub-CRQ interrupt handler")
Reviewed-by: Brian King <brking@linux.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A non-zero return code for H_REG_SUB_CRQ is currently treated as a failure
resulting in failing sub-CRQ setup. The case of H_CLOSED should not be
treated as a failure. This return code translates to a successful sub-CRQ
registration by the hypervisor, and is meant to communicate back that there
is currently no partner VIOS CRQ connection established as of yet. This is
a common occurrence during a disconnect where the client adapter can
possibly come back up prior to the partner adapter.
For non-zero return code from H_REG_SUB_CRQ treat a H_CLOSED as success so
that sub-CRQs are successfully setup.
Link: https://lore.kernel.org/r/20210302230543.9905-4-tyreld@linux.ibm.com
Fixes: 3034ebe263 ("scsi: ibmvfc: Add alloc/dealloc routines for SCSI Sub-CRQ Channels")
Reviewed-by: Brian King <brking@linux.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A hard reset results in a complete transport disconnect such that the CRQ
connection with the partner VIOS is broken. This has the side effect of
also invalidating the associated sub-CRQs. The current code assumes that
the sub-CRQs are perserved resulting in a protocol violation after trying
to reconnect them with the VIOS. This introduces an infinite loop such that
the VIOS forces a disconnect after each subsequent attempt to re-register
with invalid handles.
Avoid the aforementioned issue by releasing the sub-CRQs prior to CRQ
disconnect, and driving a reinitialization of the sub-CRQs once a new CRQ
is registered with the hypervisor.
Link: https://lore.kernel.org/r/20210302230543.9905-3-tyreld@linux.ibm.com
Fixes: 3034ebe263 ("scsi: ibmvfc: Add alloc/dealloc routines for SCSI Sub-CRQ Channels")
Reviewed-by: Brian King <brking@linux.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If ibmvfc_init_sub_crqs() fails ibmvfc_probe() simply parrots registration
failure reported elsewhere, and futher vhost->scsi_scrq.scrq == NULL is
indication enough to the driver that it has no sub-CRQs available. The
mq_enabled check can also be moved into ibmvfc_init_sub_crqs() such that
each caller doesn't have to gate the call with a mq_enabled check. Finally,
in the case of sub-CRQ setup failure setting do_enquiry can be turned off
to putting the driver into single queue fallback mode.
The aforementioned changes also simplify the next patch in the series that
fixes a hard reset issue, by tying a sub-CRQ setup failure and do_enquiry
logic into ibmvfc_init_sub_crqs().
Link: https://lore.kernel.org/r/20210302230543.9905-2-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver core ignores the return value of struct bus_type::remove()
because there is only little that can be done. To simplify the quest to
make this function return void, let struct vio_driver::remove() return
void, too. All users already unconditionally return 0, this commit makes
it obvious that returning an error code is a bad idea.
Note there are two nominally different implementations for a vio bus:
one in arch/sparc/kernel/vio.c and the other in
arch/powerpc/platforms/pseries/vio.c. This patch only adapts the powerpc
one.
Before this patch for a device that was bound to a driver without a
remove callback vio_cmo_bus_remove(viodev) wasn't called. As the device
core still considers the device unbound after vio_bus_remove() returns
calling this unconditionally is the consistent behaviour which is
implemented here.
Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org>
Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Acked-by: Lijun Pan <ljp@linux.ibm.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[mpe: Drop unneeded hvcs_remove() forward declaration, squash in
change from sfr to drop ibmvnic_remove() forward declaration]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210225221834.160083-1-uwe@kleine-koenig.org
The UFS core has received a substantial rework this cycle. This in
turn has caused a merge conflict in linux-next. Merge 5.11/scsi-fixes
into 5.12/scsi-queue and resolve the conflict.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add the various module parameter toggles for adjusting the MQ
characteristics at boot/load time as well as a device attribute for
changing the client scsi channel request amount.
Link: https://lore.kernel.org/r/20210114203148.246656-22-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Turn on MQ by default and set sane values for the upper limit on hw queues
for the SCSI host, and number of hw SCSI channels to request from the
partner VIOS.
Link: https://lore.kernel.org/r/20210114203148.246656-21-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Grab the queue and list lock for each Sub-CRQ and add any uncompleted
events to the host purge list.
Link: https://lore.kernel.org/r/20210114203148.246656-20-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In general the client needs to send Cancel MADs and task management
commands down the same channel as the command(s) intended to cancel or
abort. The client assigns cancel keys per LUN and thus must send a Cancel
down each channel commands were submitted for that LUN. Further, the client
then must wait for those cancel completions prior to submitting a LUN RESET
or ABORT TASK SET.
Add a cancel rsp iu syncronization field to the ibmvfc_queue struct such
that the cancel routine can sync the cancel response to each queue that
requires a cancel command. Build a list of each cancel event sent and wait
for the completion of each submitted cancel.
Link: https://lore.kernel.org/r/20210114203148.246656-19-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add a helper routine for initializing a Cancel MAD. This will be useful for
a channelized client that needs to send Cancel commands down every channel
commands were sent for a particular LUN.
Link: https://lore.kernel.org/r/20210114203148.246656-18-tyreld@linux.ibm.com
Reviewed-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>