Commit Graph

17678 Commits

Author SHA1 Message Date
Hannes Reinecke
1feb3b0229 scsi: 3w-sas: fix calls to dma_set_mask_and_coherent()
The change to use dma_set_mask_and_coherent() incorrectly made a second
call with the 32 bit DMA mask value when the call with the 64 bit DMA mask
value succeeded.

Fixes: b1fa122930 ("scsi: 3w-sas: fully convert to the generic DMA API")
Cc: <stable@vger.kernel.org>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-25 21:37:26 -05:00
Hannes Reinecke
33d6667416 scsi: 3w-9xxx: fix calls to dma_set_mask_and_coherent()
The change to use dma_set_mask_and_coherent() incorrectly made a second
call with the 32 bit DMA mask value when the call with the 64 bit DMA mask
value succeeded.

Fixes: b000bced57 ("scsi: 3w-9xxx: fully convert to the generic DMA API")
Cc: <stable@vger.kernel.org>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-25 21:37:25 -05:00
Hannes Reinecke
56de835704 scsi: lpfc: fix calls to dma_set_mask_and_coherent()
The change to use dma_set_mask_and_coherent() incorrectly made a second
call with the 32 bit DMA mask value when the call with the 64 bit DMA mask
value succeeded.  This resulted in NVMe/FC connections failing due to
corrupted data buffers, and various other SCSI/FCP I/O errors.

Fixes: f30e1bfd61 ("scsi: lpfc: use dma_set_mask_and_coherent")
Cc: <stable@vger.kernel.org>
Suggested-by: Don Dutile <ddutile@redhat.com>
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-25 21:37:25 -05:00
Linus Torvalds
6089a91fc0 SCSI fixes on 20190222
Four small fixes: three in drivers and one in the core.  The core fix
 is also minor in scope since the bug it fixes is only known to affect
 systems using SCSI reservations.  Of the driver bugs, the libsas one
 is the most major because it can lead to multiple disks on the same
 expander not being exposed.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXHC4uSYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishfYwAP9zX676
 svxUeEQLLyMLXmGyDZ5um8ne8VDAzXDIrkS06gEAhKju7hb7jYvt0pf3jj+utS+v
 KXtT8CpMuj+cffeVXng=
 =OkZL
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Four small fixes: three in drivers and one in the core.

  The core fix is also minor in scope since the bug it fixes is only
  known to affect systems using SCSI reservations. Of the driver bugs,
  the libsas one is the most major because it can lead to multiple disks
  on the same expander not being exposed"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: core: reset host byte in DID_NEXUS_FAILURE case
  scsi: libsas: Fix rphy phy_identifier for PHYs with end devices attached
  scsi: sd_zbc: Fix sd_zbc_report_zones() buffer allocation
  scsi: libiscsi: Fix race between iscsi_xmit_task and iscsi_complete_task
2019-02-23 09:48:01 -08:00
Sedat Dilek
8beb90aaf3 scsi: fcoe: make use of fip_mode enum complete
commit 1917d42d14 ("fcoe: use enum for fip_mode") introduces a separate
enum for the fip_mode that shall be used during initialisation handling
until it is passed to fcoe_ctrl_link_up to set the initial fip_state.  That
change was incomplete and gcc quietly converted in various places between
the fip_mode and the fip_state enum values with implicit enum conversions,
which fortunately cannot cause any issues in the actual code's execution.

clang however warns about these implicit enum conversions in the scsi
drivers. This commit consolidates the use of the two enums, guided by
clang's enum-conversion warnings.

This commit now completes the use of the fip_mode: It expects and uses
fip_mode in {bnx2fc,fcoe}_interface_create and fcoe_ctlr_init, and it calls
fcoe_ctrl_set_set() with the correct values in fcoe_ctlr_link_up().  It
also breaks the association between FIP_MODE_AUTO and FIP_ST_AUTO to
indicate these two enums are distinct.

Link: https://github.com/ClangBuiltLinux/linux/issues/151
Fixes: 1917d42d14 ("fcoe: use enum for fip_mode")
Reported-by: Dmitry Golovin <dima@golovin.in>
Original-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
CC: Lukas Bulwahn <lukas.bulwahn@gmail.com>
CC: Nick Desaulniers <ndesaulniers@google.com>
CC: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Suggested-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-19 18:58:38 -05:00
Jason Yan
bcf3b67d16 scsi: megaraid_sas: return error when create DMA pool failed
when create DMA pool for cmd frames failed, we should return -ENOMEM,
instead of 0.
In some case in:

    megasas_init_adapter_fusion()

    -->megasas_alloc_cmds()
       -->megasas_create_frame_pool
          create DMA pool failed,
        --> megasas_free_cmds() [1]

    -->megasas_alloc_cmds_fusion()
       failed, then goto fail_alloc_cmds.
    -->megasas_free_cmds() [2]

we will call megasas_free_cmds twice, [1] will kfree cmd_list,
[2] will use cmd_list.it will cause a problem:

Unable to handle kernel NULL pointer dereference at virtual address
00000000
pgd = ffffffc000f70000
[00000000] *pgd=0000001fbf893003, *pud=0000001fbf893003,
*pmd=0000001fbf894003, *pte=006000006d000707
Internal error: Oops: 96000005 [#1] SMP
 Modules linked in:
 CPU: 18 PID: 1 Comm: swapper/0 Not tainted
 task: ffffffdfb9290000 ti: ffffffdfb923c000 task.ti: ffffffdfb923c000
 PC is at megasas_free_cmds+0x30/0x70
 LR is at megasas_free_cmds+0x24/0x70
 ...
 Call trace:
 [<ffffffc0005b779c>] megasas_free_cmds+0x30/0x70
 [<ffffffc0005bca74>] megasas_init_adapter_fusion+0x2f4/0x4d8
 [<ffffffc0005b926c>] megasas_init_fw+0x2dc/0x760
 [<ffffffc0005b9ab0>] megasas_probe_one+0x3c0/0xcd8
 [<ffffffc0004a5abc>] local_pci_probe+0x4c/0xb4
 [<ffffffc0004a5c40>] pci_device_probe+0x11c/0x14c
 [<ffffffc00053a5e4>] driver_probe_device+0x1ec/0x430
 [<ffffffc00053a92c>] __driver_attach+0xa8/0xb0
 [<ffffffc000538178>] bus_for_each_dev+0x74/0xc8
  [<ffffffc000539e88>] driver_attach+0x28/0x34
 [<ffffffc000539a18>] bus_add_driver+0x16c/0x248
 [<ffffffc00053b234>] driver_register+0x6c/0x138
 [<ffffffc0004a5350>] __pci_register_driver+0x5c/0x6c
 [<ffffffc000ce3868>] megasas_init+0xc0/0x1a8
 [<ffffffc000082a58>] do_one_initcall+0xe8/0x1ec
 [<ffffffc000ca7be8>] kernel_init_freeable+0x1c8/0x284
 [<ffffffc0008d90b8>] kernel_init+0x1c/0xe4

Signed-off-by: Jason Yan <yanaijie@huawei.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-19 18:58:38 -05:00
Giridhar Malavali
f3e0269517 scsi: qla2xxx: Avoid PCI IRQ affinity mapping when multiqueue is not supported
This patch fixes warning seen when BLK-MQ is enabled and hardware does not
support MQ. This will result into driver requesting MSIx vectors which are
equal or less than pre_desc via PCI IRQ Affinity infrastructure.

    [   19.746300] qla2xxx [0000:00:00.0]-0005: : QLogic Fibre Channel HBA Driver: 10.00.00.12-k.
    [   19.746599] qla2xxx [0000:02:00.0]-001d: : Found an ISP2432 irq 18 iobase 0x(____ptrval____).
    [   20.203186] ------------[ cut here ]------------
    [   20.203306] WARNING: CPU: 8 PID: 268 at drivers/pci/msi.c:1273 pci_irq_get_affinity+0xf4/0x120
    [   20.203481] Modules linked in: tg3 ptp qla2xxx(+) pps_core sg libphy scsi_transport_fc flash loop autofs4
    [   20.203700] CPU: 8 PID: 268 Comm: systemd-udevd Not tainted 5.0.0-rc5-00358-gdf3865f #113
    [   20.203830] Call Trace:
    [   20.203933]  [0000000000461bb0] __warn+0xb0/0xe0
    [   20.204090]  [00000000006c8f34] pci_irq_get_affinity+0xf4/0x120
    [   20.204219]  [000000000068c764] blk_mq_pci_map_queues+0x24/0x120
    [   20.204396]  [00000000007162f4] scsi_map_queues+0x14/0x40
    [   20.204626]  [0000000000673654] blk_mq_update_queue_map+0x94/0xe0
    [   20.204698]  [0000000000676ce0] blk_mq_alloc_tag_set+0x120/0x300
    [   20.204869]  [000000000071077c] scsi_add_host_with_dma+0x7c/0x300
    [   20.205419]  [00000000100ead54] qla2x00_probe_one+0x19d4/0x2640 [qla2xxx]
    [   20.205621]  [00000000006b3c88] pci_device_probe+0xc8/0x160
    [   20.205697]  [0000000000701c0c] really_probe+0x1ac/0x2e0
    [   20.205770]  [0000000000701f90] driver_probe_device+0x50/0x100
    [   20.205843]  [0000000000702134] __driver_attach+0xf4/0x120
    [   20.205913]  [0000000000700644] bus_for_each_dev+0x44/0x80
    [   20.206081]  [0000000000700c98] bus_add_driver+0x198/0x220
    [   20.206300]  [0000000000702950] driver_register+0x70/0x120
    [   20.206582]  [0000000010248224] qla2x00_module_init+0x224/0x284 [qla2xxx]
    [   20.206857] ---[ end trace b1de7a3f79fab2c2 ]---

The fix is to check if the hardware does not have Multi Queue capabiltiy,
use pci_alloc_irq_vectors() call instead of pci_alloc_irq_affinity().

Fixes: f664a3cc17 ("scsi: kill off the legacy IO path")
Cc: stable@vger.kernel.org #4.19
Signed-off-by: Giridhar Malavali <gmalavali@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-19 18:58:38 -05:00
Himanshu Madhani
21497857ef scsi: qla2xxx: Update driver version to 10.00.00.14-k
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-19 18:58:37 -05:00
Joe Carnuccio
64f61d9944 scsi: qla2xxx: Add new FW dump template entry types
This patch adds new firmware dump template entries for ISP27XX firmware
dump.

Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-19 18:58:37 -05:00
Himanshu Madhani
5241f7ca62 scsi: qla2xxx: Fix code indentation for qla27xx_fwdt_entry
This patch fixes following checkpatch ERROR

 ERROR: space prohibited before that ',' (ctx:WxW)

No change is functionality due to this patch.

Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-19 18:58:37 -05:00
Quinn Tran
9eb9c6dc3a scsi: qla2xxx: Move marker request behind QPair
Current code hard codes marker request to use request and response queue
0. This patch make use of the qpair as the path to access the
request/response queues.  It allows marker to be place on any hardware
queue.

Signed-off-by: Quinn Tran <qtran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-19 18:58:37 -05:00
Quinn Tran
b726d99d72 scsi: qla2xxx: Prevent SysFS access when chip is down
Prevent user from sending commands through sysfs while FW is not running or
reset is in progress.

Signed-off-by: Quinn Tran <qtran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-19 18:58:36 -05:00
Anil Gurumurthy
4910b524ac scsi: qla2xxx: Add support for setting port speed
This patch adds sysfs node

1. There is a new sysfs node port_speed
2. The possible values are 2(Auto neg), 8, 16, 32
3. A value outside of the above defaults to Auto neg
4. Any update to the setting causes a link toggle
5. This feature is currently only for ISP27xx

Signed-off-by: Anil Gurumurthy <agurumurthy@marvell.com>
Signed-off-by: Quinn Tran <qtran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-19 18:58:36 -05:00
Quinn Tran
192c4e9b93 scsi: qla2xxx: Prevent multiple ADISC commands per session
Add check to allow 1 discovery command per session to be sent.

Signed-off-by: Quinn Tran <qtran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-19 18:58:36 -05:00
Himanshu Madhani
471f8e03d7 scsi: qla2xxx: Check for FW started flag before aborting
For FC-NVMe, if the fw_started flag is not set or fcport is deleted, then
do not send Abort command

Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-19 18:58:36 -05:00
Himanshu Madhani
e476fe8af5 scsi: qla2xxx: Fix unload when NVMe devices are configured
This patch fixes driver unload issue when FC-NVMe devices are configured.

Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-19 18:58:35 -05:00
Darren Trapp
03aaa89fe4 scsi: qla2xxx: Add First Burst support for FC-NVMe devices
Add Support for First Burst for FC-NVMe protocol. This feature requires
First Burst support in the firmware.

Signed-off-by: Darren Trapp <darren.trapp@cavium.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-19 18:58:35 -05:00
Himanshu Madhani
ec322937a7 scsi: qla2xxx: Fix LUN discovery if loop id is not assigned yet by firmware
This patch fixes LUN discovery when loop ID is not yet assigned by the
firmware during driver load/sg_reset operations. Driver will now search for
new loop id before retrying login.

Fixes: 48acad0990 ("scsi: qla2xxx: Fix N2N link re-connect")
Cc: stable@vger.kernel.org #4.19
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-19 18:58:35 -05:00
Colin Ian King
bb6abdd453 scsi: qla2xxx: remove redundant null check on pointer sess
The null check on pointer sess and the subsequent call is redundant as sess
is null on all the the paths that lead to the out_term2 label.  Hence the
null check and the call can be removed.  Also remove the redundant setting
of sess to NULL as this is not required now.

Detected by CoverityScan, CID#1420663 ("Logically dead code")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-19 18:58:35 -05:00
Bill Kuzeja
f233e8c000 scsi: qla2xxx: Move debug messages before sending srb preventing panic
When sending an srb with qla2x00_start_sp, the sp can complete and be freed
by the time we log the debug message saying we sent it. This can cause a
panic if sp gets reused quickly or when running a kernel that poisons freed
memory.

This was partially fixed by (not every case was addressed):

Commit 9fe278f44b ("scsi: qla2xxx: Move log messages before issuing
command to firmware")

Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-19 18:58:34 -05:00
YueHaibing
59e54d9aab scsi: lpfc: Remove set but not used variable 'phys_id'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/scsi/lpfc/lpfc_init.c: In function 'lpfc_cpu_affinity_check':
drivers/scsi/lpfc/lpfc_init.c:10599:19: warning:
 variable 'phys_id' set but not used [-Wunused-but-set-variable]

It never used since introduction in commit 6a828b0f61 ("scsi: lpfc:
Support non-uniform allocation of MSIX vectors to hardware queues")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-19 18:58:34 -05:00
Manivannan Sadhasivam
653fcb07d9 scsi: ufs: Add HI3670 SoC UFS driver support
Add HI3670 SoC UFS driver support by extending the common ufs-hisi
driver. One major difference between HI3660 ad HI3670 SoCs interms of UFS
is the PHY. HI3670 has a 10nm variant PHY and hence this parameter is used
to distinguish the configuration.

Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Acked-by: Wei Li <liwei213@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-19 18:58:34 -05:00
Ming Lei
c66d4bd110 genirq/affinity: Add new callback for (re)calculating interrupt sets
The interrupt affinity spreading mechanism supports to spread out
affinities for one or more interrupt sets. A interrupt set contains one or
more interrupts. Each set is mapped to a specific functionality of a
device, e.g. general I/O queues and read I/O queus of multiqueue block
devices.

The number of interrupts per set is defined by the driver. It depends on
the total number of available interrupts for the device, which is
determined by the PCI capabilites and the availability of underlying CPU
resources, and the number of queues which the device provides and the
driver wants to instantiate.

The driver passes initial configuration for the interrupt allocation via a
pointer to struct irq_affinity.

Right now the allocation mechanism is complex as it requires to have a loop
in the driver to determine the maximum number of interrupts which are
provided by the PCI capabilities and the underlying CPU resources.  This
loop would have to be replicated in every driver which wants to utilize
this mechanism. That's unwanted code duplication and error prone.

In order to move this into generic facilities it is required to have a
mechanism, which allows the recalculation of the interrupt sets and their
size, in the core code. As the core code does not have any knowledge about the
underlying device, a driver specific callback is required in struct
irq_affinity, which can be invoked by the core code. The callback gets the
number of available interupts as an argument, so the driver can calculate the
corresponding number and size of interrupt sets.

At the moment the struct irq_affinity pointer which is handed in from the
driver and passed through to several core functions is marked 'const', but for
the callback to be able to modify the data in the struct it's required to
remove the 'const' qualifier.

Add the optional callback to struct irq_affinity, which allows drivers to
recalculate the number and size of interrupt sets and remove the 'const'
qualifier.

For simple invocations, which do not supply a callback, a default callback
is installed, which just sets nr_sets to 1 and transfers the number of
spreadable vectors to the set_size array at index 0.

This is for now guarded by a check for nr_sets != 0 to keep the NVME driver
working until it is converted to the callback mechanism.

To make sure that the driver configuration is correct under all circumstances
the callback is invoked even when there are no interrupts for queues left,
i.e. the pre/post requirements already exhaust the numner of available
interrupts.

At the PCI layer irq_create_affinity_masks() has to be invoked even for the
case where the legacy interrupt is used. That ensures that the callback is
invoked and the device driver can adjust to that situation.

[ tglx: Fixed the simple case (no sets required). Moved the sanity check
  	for nr_sets after the invocation of the callback so it catches
  	broken drivers. Fixed the kernel doc comments for struct
  	irq_affinity and de-'This patch'-ed the changelog ]

Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: linux-nvme@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: Keith Busch <keith.busch@intel.com>
Cc: Sumit Saxena <sumit.saxena@broadcom.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Shivasharan Srikanteshwara <shivasharan.srikanteshwara@broadcom.com>
Link: https://lkml.kernel.org/r/20190216172228.512444498@linutronix.de
2019-02-18 11:21:28 +01:00
Martin Wilck
4a067cf823 scsi: core: reset host byte in DID_NEXUS_FAILURE case
Up to 4.12, __scsi_error_from_host_byte() would reset the host byte to
DID_OK for various cases including DID_NEXUS_FAILURE.  Commit
2a842acab1 ("block: introduce new block status code type") replaced this
function with scsi_result_to_blk_status() and removed the host-byte
resetting code for the DID_NEXUS_FAILURE case.  As the line
set_host_byte(cmd, DID_OK) was preserved for the other cases, I suppose
this was an editing mistake.

The fact that the host byte remains set after 4.13 is causing problems with
the sg_persist tool, which now returns success rather then exit status 24
when a RESERVATION CONFLICT error is encountered.

Fixes: 2a842acab1 "block: introduce new block status code type"
Signed-off-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-15 22:17:58 -05:00
John Garry
ffeafdd2bf scsi: libsas: Fix rphy phy_identifier for PHYs with end devices attached
The sysfs phy_identifier attribute for a sas_end_device comes from the rphy
phy_identifier value.

Currently this is not being set for rphys with an end device attached, so
we see incorrect symlinks from systemd disk/by-path:

root@localhost:~# ls -l /dev/disk/by-path/
total 0
lrwxrwxrwx 1 root root  9 Feb 13 12:26 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy0-lun-0 -> ../../sdb
lrwxrwxrwx 1 root root 10 Feb 13 12:26 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy0-lun-0-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Feb 13 12:26 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy0-lun-0-part2 -> ../../sdb2
lrwxrwxrwx 1 root root 10 Feb 13 12:26 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy0-lun-0-part3 -> ../../sdc3

Indeed, each sas_end_device phy_identifier value is 0:

root@localhost:/# more sys/class/sas_device/end_device-0\:0\:2/phy_identifier
0
root@localhost:/# more sys/class/sas_device/end_device-0\:0\:10/phy_identifier
0

This patch fixes the discovery code to set the phy_identifier.  With this,
we now get proper symlinks:

root@localhost:~# ls -l /dev/disk/by-path/
total 0
lrwxrwxrwx 1 root root  9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy10-lun-0 -> ../../sdg
lrwxrwxrwx 1 root root  9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy11-lun-0 -> ../../sdh
lrwxrwxrwx 1 root root  9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy2-lun-0 -> ../../sda
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy2-lun-0-part1 -> ../../sda1
lrwxrwxrwx 1 root root  9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy3-lun-0 -> ../../sdb
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy3-lun-0-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy3-lun-0-part2 -> ../../sdb2
lrwxrwxrwx 1 root root  9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy4-lun-0 -> ../../sdc
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy4-lun-0-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy4-lun-0-part2 -> ../../sdc2
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy4-lun-0-part3 -> ../../sdc3
lrwxrwxrwx 1 root root  9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy5-lun-0 -> ../../sdd
lrwxrwxrwx 1 root root  9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy7-lun-0 -> ../../sde
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy7-lun-0-part1 -> ../../sde1
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy7-lun-0-part2 -> ../../sde2
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy7-lun-0-part3 -> ../../sde3
lrwxrwxrwx 1 root root  9 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy8-lun-0 -> ../../sdf
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy8-lun-0-part1 -> ../../sdf1
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy8-lun-0-part2 -> ../../sdf2
lrwxrwxrwx 1 root root 10 Feb 13 11:53 platform-HISI0162:01-sas-exp0x500e004aaaaaaa1f-phy8-lun-0-part3 -> ../../sdf3

Fixes: 2908d778ab ("[SCSI] aic94xx: new driver")
Reported-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Tested-by: dann frazier <dann.frazier@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-15 22:16:07 -05:00
Masato Suzuki
515ce60613 scsi: sd_zbc: Fix sd_zbc_report_zones() buffer allocation
The function sd_zbc_do_report_zones() issues a REPORT ZONES command with a
buffer size calculated based on the number of zones requested by the
caller. This value should however not exceed the capabilities of the
hardware maximum command size, that is, should not exceed the
max_hw_sectors limit of the device. This problem leads to failures of
report zones commands when re-validating disks with some SAS HBAs.

Fix this by limiting a report zone command buffer size to the minimum of
the device max_hw_sectors and calculated value based on the requested
number of zones. This does not change the semantic of the report_zones file
operation as report zones can always return less zone reports than
requested. Short reports are handled using a loop execution of the
report_zones file operation in the function blk_report_zones().

[Damien]
Before patch 'e76239a3748c ("block: add a report_zones method")', report
zones buffer allocation was limited to max_sectors when allocated in
blk_report_zones(). This however does not consider the actual format of the
device reply which is interface dependent.  Limiting the allocation based
on the size of the expected reply format rather than the size of the array
of generic sturct blkzone passed by blk_report_zones() makes more sense.

Fixes: e76239a374 ("block: add a report_zones method")
Cc: stable@vger.kernel.org
Signed-off-by: Masato Suzuki <masato.suzuki@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-15 22:09:54 -05:00
Anoob Soman
79edd00dc6 scsi: libiscsi: Fix race between iscsi_xmit_task and iscsi_complete_task
When a target sends Check Condition, whilst initiator is busy xmiting
re-queued data, could lead to race between iscsi_complete_task() and
iscsi_xmit_task() and eventually crashing with the following kernel
backtrace.

[3326150.987523] ALERT: BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
[3326150.987549] ALERT: IP: [<ffffffffa05ce70d>] iscsi_xmit_task+0x2d/0xc0 [libiscsi]
[3326150.987571] WARN: PGD 569c8067 PUD 569c9067 PMD 0
[3326150.987582] WARN: Oops: 0002 [#1] SMP
[3326150.987593] WARN: Modules linked in: tun nfsv3 nfs fscache dm_round_robin
[3326150.987762] WARN: CPU: 2 PID: 8399 Comm: kworker/u32:1 Tainted: G O 4.4.0+2 #1
[3326150.987774] WARN: Hardware name: Dell Inc. PowerEdge R720/0W7JN5, BIOS 2.5.4 01/22/2016
[3326150.987790] WARN: Workqueue: iscsi_q_13 iscsi_xmitworker [libiscsi]
[3326150.987799] WARN: task: ffff8801d50f3800 ti: ffff8801f5458000 task.ti: ffff8801f5458000
[3326150.987810] WARN: RIP: e030:[<ffffffffa05ce70d>] [<ffffffffa05ce70d>] iscsi_xmit_task+0x2d/0xc0 [libiscsi]
[3326150.987825] WARN: RSP: e02b:ffff8801f545bdb0 EFLAGS: 00010246
[3326150.987831] WARN: RAX: 00000000ffffffc3 RBX: ffff880282d2ab20 RCX: ffff88026b6ac480
[3326150.987842] WARN: RDX: 0000000000000000 RSI: 00000000fffffe01 RDI: ffff880282d2ab20
[3326150.987852] WARN: RBP: ffff8801f545bdc8 R08: 0000000000000000 R09: 0000000000000008
[3326150.987862] WARN: R10: 0000000000000000 R11: 000000000000fe88 R12: 0000000000000000
[3326150.987872] WARN: R13: ffff880282d2abe8 R14: ffff880282d2abd8 R15: ffff880282d2ac08
[3326150.987890] WARN: FS: 00007f5a866b4840(0000) GS:ffff88028a640000(0000) knlGS:0000000000000000
[3326150.987900] WARN: CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[3326150.987907] WARN: CR2: 0000000000000078 CR3: 0000000070244000 CR4: 0000000000042660
[3326150.987918] WARN: Stack:
[3326150.987924] WARN: ffff880282d2ad58 ffff880282d2ab20 ffff880282d2abe8 ffff8801f545be18
[3326150.987938] WARN: ffffffffa05cea90 ffff880282d2abf8 ffff88026b59cc80 ffff88026b59cc00
[3326150.987951] WARN: ffff88022acf32c0 ffff880289491800 ffff880255a80800 0000000000000400
[3326150.987964] WARN: Call Trace:
[3326150.987975] WARN: [<ffffffffa05cea90>] iscsi_xmitworker+0x2f0/0x360 [libiscsi]
[3326150.987988] WARN: [<ffffffff8108862c>] process_one_work+0x1fc/0x3b0
[3326150.987997] WARN: [<ffffffff81088f95>] worker_thread+0x2a5/0x470
[3326150.988006] WARN: [<ffffffff8159cad8>] ? __schedule+0x648/0x870
[3326150.988015] WARN: [<ffffffff81088cf0>] ? rescuer_thread+0x300/0x300
[3326150.988023] WARN: [<ffffffff8108ddf5>] kthread+0xd5/0xe0
[3326150.988031] WARN: [<ffffffff8108dd20>] ? kthread_stop+0x110/0x110
[3326150.988040] WARN: [<ffffffff815a0bcf>] ret_from_fork+0x3f/0x70
[3326150.988048] WARN: [<ffffffff8108dd20>] ? kthread_stop+0x110/0x110
[3326150.988127] ALERT: RIP [<ffffffffa05ce70d>] iscsi_xmit_task+0x2d/0xc0 [libiscsi]
[3326150.988138] WARN: RSP <ffff8801f545bdb0>
[3326150.988144] WARN: CR2: 0000000000000078
[3326151.020366] WARN: ---[ end trace 1c60974d4678d81b ]---

Commit 6f8830f5bb ("scsi: libiscsi: add lock around task lists to fix
list corruption regression") introduced "taskqueuelock" to fix list
corruption during the race, but this wasn't enough.

Re-setting of conn->task to NULL, could race with iscsi_xmit_task().
iscsi_complete_task()
{
    ....
    if (conn->task == task)
        conn->task = NULL;
}

conn->task in iscsi_xmit_task() could be NULL and so will be task.
__iscsi_get_task(task) will crash (NullPtr de-ref), trying to access
refcount.

iscsi_xmit_task()
{
    struct iscsi_task *task = conn->task;

    __iscsi_get_task(task);
}

This commit will take extra conn->session->back_lock in iscsi_xmit_task()
to ensure iscsi_xmit_task() waits for iscsi_complete_task(), if
iscsi_complete_task() wins the race.  If iscsi_xmit_task() wins the race,
iscsi_xmit_task() increments task->refcount
(__iscsi_get_task) ensuring iscsi_complete_task() will not iscsi_free_task().

Signed-off-by: Anoob Soman <anoob.soman@citrix.com>
Signed-off-by: Bob Liu <bob.liu@oracle.com>
Acked-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-15 22:05:04 -05:00
Linus Torvalds
5ded587103 SCSI fixes on 20190215
Two fairly small fixes: the qla one is a panic inducing use after free
 and the entropy fix may seem minor but it has had huge userspace
 impact thanks to an unrelated change in openssl that causes sshd to
 refuse logins until it has enough entropy for the session keys, which
 causes tens of minutes delay before the affected systems allow logins
 after reboot.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXGb2iiYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishfPmAQD/eR6G
 RkGbnLfXMcP5EfAnEJAYoD8SJsR7UAAV7tdaWwEAihagqOiFmzbDKlceahaZFl27
 mizmOjw4EnpIDG2W3Qw=
 =BGnV
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Two fairly small fixes: the qla one is a panic inducing use after free
  and the entropy fix may seem minor but it has had huge userspace
  impact thanks to an unrelated change in openssl that causes sshd to
  refuse logins until it has enough entropy for the session keys, which
  causes tens of minutes delay before the affected systems allow logins
  after reboot"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: qla2xxx: Fix panic from use after free in qla2x00_async_tm_cmd
  scsi: sd: fix entropy gathering for most rotational disks
2019-02-15 13:36:43 -08:00
Jens Axboe
6fb845f0e7 Linux 5.0-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFRBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAlxgqNUeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGwsoH+OVXu0NQofwTvVru
 8lgF3BSDG2mhf7mxbBBlBizGVy9jnjRNGCFMC+Jq8IwiFLwprja/G27kaDTkpuF1
 PHC3yfjKvjTeUP5aNdHlmxv6j1sSJfZl0y46DQal4UeTG/Giq8TFTi+Tbz7Wb/WV
 yCx4Lr8okAwTuNhnL8ojUCVIpd3c8QsyR9v6nEQ14Mj+MvEbokyTkMJV0bzOrM38
 JOB+/X1XY4JPZ6o3MoXrBca3bxbAJzMneq+9CWw1U5eiIG3msg4a+Ua3++RQMDNr
 8BP0yCZ6wo32S8uu0PI6HrZaBnLYi5g9Wh7Q7yc0mn1Uh1zWFykA6TtqK90agJeR
 A6Ktjw==
 =scY4
 -----END PGP SIGNATURE-----

Merge tag 'v5.0-rc6' into for-5.1/block

Pull in 5.0-rc6 to avoid a dumb merge conflict with fs/iomap.c.
This is needed since io_uring is now based on the block branch,
to avoid a conflict between the multi-page bvecs and the bits
of io_uring that touch the core block parts.

* tag 'v5.0-rc6': (525 commits)
  Linux 5.0-rc6
  x86/mm: Make set_pmd_at() paravirt aware
  MAINTAINERS: Update the ocores i2c bus driver maintainer, etc
  blk-mq: remove duplicated definition of blk_mq_freeze_queue
  Blk-iolatency: warn on negative inflight IO counter
  blk-iolatency: fix IO hang due to negative inflight counter
  MAINTAINERS: unify reference to xen-devel list
  x86/mm/cpa: Fix set_mce_nospec()
  futex: Handle early deadlock return correctly
  futex: Fix barrier comment
  net: dsa: b53: Fix for failure when irq is not defined in dt
  blktrace: Show requests without sector
  mips: cm: reprime error cause
  mips: loongson64: remove unreachable(), fix loongson_poweroff().
  sit: check if IPv6 enabled before calling ip6_err_gen_icmpv6_unreach()
  geneve: should not call rt6_lookup() when ipv6 was disabled
  KVM: nVMX: unconditionally cancel preemption timer in free_nested (CVE-2019-7221)
  KVM: x86: work around leak of uninitialized stack contents (CVE-2019-7222)
  kvm: fix kvm_ioctl_create_device() reference counting (CVE-2019-6974)
  signal: Better detection of synchronous signals
  ...
2019-02-15 08:43:59 -07:00
Ming Lei
56d18f62f5 block: kill BLK_MQ_F_SG_MERGE
QUEUE_FLAG_NO_SG_MERGE has been killed, so kill BLK_MQ_F_SG_MERGE too.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-02-15 08:40:12 -07:00
Colin Ian King
258f84fae3 scsi: lpfc: fix a handful of indentation issues
There are a handful of statements that are indented incorrectly. Fix these.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-13 22:15:42 -05:00
Rob Herring
7f8e12f1e2 scsi: qlogicpti: Use of_node_name_eq for node name comparisons
Convert string compares of DT node names to use of_node_name_eq helper
instead. This removes direct access to the node name pointer.

As prom_name is not used for anything else, remove it.

Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-13 22:07:03 -05:00
Dietmar Hahn
df46cac3f7 scsi: sd: Fix typo in sd_first_printk()
Commit b2bff6ceb6 ("[SCSI] sd: Quiesce mode sense error messages")
added the macro sd_first_printk(). The macro takes "sdsk" as argument
but dereferences "sdkp". This hasn't caused any real issues since all
callers of sd_first_printk() have an sdkp. But fix the typo.

[mkp: Turned this into a real patch and tweaked commit description]

Signed-off-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-12 22:33:00 -05:00
Bill Kuzeja
388a49959e scsi: qla2xxx: Fix panic from use after free in qla2x00_async_tm_cmd
In qla2x00_async_tm_cmd, we reference off sp after it has been freed.  This
caused a panic on a system running a slub debug kernel. Since fcport is
passed in anyways, just use that instead.

Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com>
Acked-by: Giridhar Malavali <gmalavali@marvell.com>
Acked-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-12 22:23:12 -05:00
Shivasharan S
0de0540512 scsi: megaraid_sas: driver version update
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-12 22:20:41 -05:00
Shivasharan S
a3742d6848 scsi: megaraid_sas: Update structures for HOST_DEVICE_LIST DCMD
Add padding to make the structure variables in MR_HOST_DEVICE_LIST_ENTRY
64-bit aligned.  Also, add reserved fields to MR_HOST_DEVICE_LIST for
future firmware usage.

Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-12 22:20:05 -05:00
James Bottomley
e4a056987c scsi: sd: fix entropy gathering for most rotational disks
The problem is that the default for MQ is not to gather entropy, whereas
the default for the legacy queue was always to gather it.  The original
attempt to fix entropy gathering for rotational disks under MQ added an
else branch in sd_read_block_characteristics().  Unfortunately, the entire
check isn't reached if the device has no characteristics VPD page.  Since
this page was only introduced in SBC-3 and its optional anyway, most less
expensive rotational disks don't have one, meaning they all stopped
gathering entropy when we made MQ the default.  In a wholly unrelated
change, openssl and openssh won't function until the random number
generator is initialised, meaning lots of people have been seeing large
delays before they could log into systems with default MQ kernels due to
this lack of entropy, because it now can take tens of minutes to initialise
the kernel random number generator.

The fix is to set the non-rotational and add-randomness flags
unconditionally early on in the disk initialization path, so they can be
reset only if the device actually reports being non-rotational via the VPD
page.

Reported-by: Mikael Pettersson <mikpelinux@gmail.com>
Fixes: 83e32a5910 ("scsi: sd: Contribute to randomness when running rotational device")
Cc: stable@vger.kernel.org
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Xuewei Zhang <xueweiz@google.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-12 22:18:26 -05:00
Dan Carpenter
fad28e3d9a scsi: lpfc: Fix error code if kcalloc() fails
This should return -ENOMEM if kcalloc() fails, but it accidentally
returns success instead.

Fixes: 6a828b0f61 ("scsi: lpfc: Support non-uniform allocation of MSIX vectors to hardware queues")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-12 22:15:54 -05:00
Chengguang Xu
2174b18513 scsi: ufs: fix a typo in comment
poitner -> pointer.

Signed-off-by: Chengguang Xu <cgxu519@gmx.com>
Reviewed-by: Pedro Sousa <pedrom.sousa@synopsys.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-12 22:03:11 -05:00
Martin K. Petersen
9447b6ce94 scsi: scsi_debug: Implement support for write protect
Teach scsi_debug to honor SWP in the Control Mode Page and report the
resulting WP state in the Device-Specific Parameter field.

In check_device_access_params() verify that commands that will write
the medium are permitted to do so.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
2019-02-12 11:15:44 -05:00
Bart Van Assche
9fa505adf9 scsi: core: Move resid from scsi_data_buffer to scsi_cmnd
This patch does not change any functionality but reduces the size of
struct scsi_cmnd.

Cc: Douglas Gilbert <dgilbert@interlog.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-12 11:13:41 -05:00
Bart Van Assche
80f82c169b scsi: sd: Remove superfluous residual assignments
Since commit 26e85fcd15 ("[SCSI] sd: Permit merged discard requests";
kernel v3.10) sd_done() sets the residual not only for failed special
requests but also for special requests that succeeded. Hence remove the
code from functions called by sd_init_command() that sets the residual.
This patch does not change any functionality.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-12 11:13:41 -05:00
Bart Van Assche
42d387be5b scsi: scsi_debug: Use scsi_[gs]et_resid() where appropriate
This patch does not change any functionality.

Cc: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-12 11:13:41 -05:00
Bart Van Assche
960bf87a4f scsi: libiscsi: Use scsi_[gs]et_resid() where appropriate
This patch does not change any functionality.

Cc: Lee Duncan <lduncan@suse.com>
Cc: Chris Leech <cleech@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-12 11:13:41 -05:00
Bart Van Assche
c208556ab3 scsi: scsi_debug: Fix a recently introduced regression
A recent commit removed an element from opcode_info_arr[] but did not
modify opcode_ind_arr[] nor was SDEB_I_XDWRITEREAD removed. Remove
SDEB_I_XDWRITEREAD and bring the two arrays again in sync. This patch
avoids that the following is reported:

BUG: KASAN: null-ptr-deref in scsi_debug_queuecommand+0x60f/0xc90 [scsi_debug]
Read of size 1 at addr 0000000000000001 by task iscsi-test-cu/683
CPU: 3 PID: 683 Comm: iscsi-test-cu Not tainted 5.0.0-rc5-dbg+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Call Trace:
 dump_stack+0x86/0xca
 kasan_report.cold.3+0x5/0x3e
 __asan_load1+0x47/0x50
 scsi_debug_queuecommand+0x60f/0xc90 [scsi_debug]
 scsi_queue_rq+0xc17/0x12e0
 blk_mq_dispatch_rq_list+0x5fc/0xb10
 blk_mq_sched_dispatch_requests+0x2f7/0x300
 __blk_mq_run_hw_queue+0xd6/0x180
 __blk_mq_delay_run_hw_queue+0x25c/0x290
 blk_mq_run_hw_queue+0x119/0x1b0
 blk_mq_sched_insert_request+0x274/0x350
 blk_execute_rq_nowait+0x78/0x90
 blk_execute_rq+0xcc/0x140
 sg_io+0x30f/0x700
 scsi_cmd_ioctl+0x4d4/0x540
 scsi_cmd_blk_ioctl+0x7b/0x8b
 sd_ioctl+0xba/0x150
 blkdev_ioctl+0x6e1/0xea0
 block_ioctl+0x79/0x90
 do_vfs_ioctl+0x12b/0x9b0
 ksys_ioctl+0x41/0x80
 __x64_sys_ioctl+0x43/0x50
 do_syscall_64+0x71/0x210
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Cc: Christoph Hellwig <hch@lst.de>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Fixes: ae3d56d815 ("scsi: remove bidirectional command support")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-12 11:08:06 -05:00
Greg Kroah-Hartman
5c07488d99 Merge 5.0-rc6 into char-misc-next
We need the char-misc fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-02-11 09:05:58 +01:00
Linus Torvalds
3b6e8204a9 SCSI fixes on 20190208
This is a set of five minor fixes (although, tecnhincally, the aicxxx
 fix is for a major problem in that the driver won't load without it,
 but I think the fact it's taken us since 4.10 to discover this
 indicates that the user base for these things has declined).
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXF3VNSYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishUz2AP9L+n9A
 Ma5WutU8gkoNcttX7RJvRmtha9RiwvxRi7cs6QD+OToBDpTbo+kLuzfXz0Gop4Go
 qQziEsBm1P9ShCti3K0=
 =hptI
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "This is a set of five minor fixes (although, tecnhincally, the aicxxx
  fix is for a major problem in that the driver won't load without it,
  but I think the fact it's taken us since 4.10 to discover this
  indicates that the user base for these things has declined)"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: cxlflash: Prevent deadlock when adapter probe fails
  Revert "scsi: libfc: Add WARN_ON() when deleting rports"
  scsi: sd_zbc: Fix zone information messages
  scsi: target: make the pi_prot_format ConfigFS path readable
  scsi: aic94xx: fix module loading
2019-02-08 15:37:17 -08:00
John Garry
4a8bec88f7 scsi: hisi_sas: Do some more tidy-up
Do some very minor tidy-up, for things like needlessly initing variable and
not leaving whitespace before quote endings.

Originally-from: Xiang Chen <chenxiang66@hisilicon.com>
Originally-from: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-08 18:08:22 -05:00
Xiang Chen
4fefe5bbf5 scsi: hisi_sas: Use pci_irq_get_affinity() for v3 hw as experimental
For auto-control irq affinity mode, choose the dq to deliver IO according
to the current CPU.

Then it decreases the performance regression that fio and CQ interrupts are
processed on different node.

For user control irq affinity mode, keep it as before.

To realize it, also need to distinguish the usage of dq lock and sas_dev
lock.

We mark as experimental due to ongoing discussion on managed MSI IRQ
during hotplug:
https://marc.info/?l=linux-scsi&m=154876335707751&w=2

We're almost at the point where we can expose multiple queues to the upper
layer for SCSI MQ, but we need to sort out the per-HBA tags performance
issue.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-08 18:08:22 -05:00
John Garry
795f25a31b scsi: hisi_sas: Issue internal abort on all relevant queues
To support queue mapped to a CPU, it needs to be ensured that issuing an
internal abort is safe, in that it is guaranteed that an internal abort is
processed for a single IO or a device after all the relevant command(s)
which it is attempting to abort have been processed by the controller.

Currently we only deliver commands for any device on a single queue to
solve this problem, as we know that commands issued on the same queue will
be processed in order, and we will not have a scenario where the internal
abort is racing against a command(s) which it is trying to abort.

To enqueue commands on queue mapped to a CPU, choosing a queue for an
command is based on the associated queue for the current CPU, so this is
not safe for internal abort since it would definitely not be guaranteed
that commands for the command devices are issued on the same queue.

To solve this issue, we take a bludgeoning approach, and issue a separate
internal abort on any queue(s) relevant to the command or device, in that
we will be guaranteed that at least one of these internal aborts will be
received last in the controller.

So, for aborting a single command, we can just force the internal abort to
be issued on the same queue as the command which we are trying to abort.

For aborting all commands associated with a device, we issue a separate
internal abort on all relevant queues. Issuing multiple internal aborts in
this fashion would have not side affect.

Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-08 18:08:22 -05:00
Xiang Chen
1273d65f29 scsi: hisi_sas: change queue depth from 512 to 4096
If sending IOs to many disks from single queue, it is possible that the
queue may be full. To avoid the situation, change queue depth from 512 to
4096 which is the max number of IOs for v3 hw.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-08 18:08:22 -05:00
Luo Jiaxing
7c5e136363 scsi: hisi_sas: Add manual trigger for debugfs dump
Add an interface to manually trigger a debugfs dump.

Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-08 18:08:22 -05:00
Xiang Chen
b3cce125cb scsi: hisi_sas: Add support for DIX feature for v3 hw
This patch adds support for DIX to v3 hw driver.

For this, we build upon support for DIF, most significantly is adding new
DMA map and unmap paths.

Some pre-existing macro precedence issues are also tidied. They were
detected by checkpatch --strict.

Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-08 18:08:21 -05:00
Nathan Chancellor
6f4e626fb0 scsi: ata: Use unsigned int for cmd's type in ioctls in scsi_host_template
Clang warns several times in the scsi subsystem (trimmed for brevity):

drivers/scsi/hpsa.c:6209:7: warning: overflow converting case value to
switch condition type (2147762695 to 18446744071562347015) [-Wswitch]
        case CCISS_GETBUSTYPES:
             ^
drivers/scsi/hpsa.c:6208:7: warning: overflow converting case value to
switch condition type (2147762694 to 18446744071562347014) [-Wswitch]
        case CCISS_GETHEARTBEAT:
             ^

The root cause is that the _IOC macro can generate really large numbers,
which don't fit into type 'int', which is used for the cmd parameter in
the ioctls in scsi_host_template. My research into how GCC and Clang are
handling this at a low level didn't prove fruitful. However, looking at
the rest of the kernel tree, all ioctls use an 'unsigned int' for the
cmd parameter, which will fit all of the _IOC values in the scsi/ata
subsystems.

Make that change because none of the ioctls expect a negative value for
any command, it brings the ioctls inline with the reset of the kernel,
and it removes ambiguity, which is never good when dealing with compilers.

Link: https://github.com/ClangBuiltLinux/linux/issues/85
Link: https://github.com/ClangBuiltLinux/linux/issues/154
Link: https://github.com/ClangBuiltLinux/linux/issues/157
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Bradley Grove <bgrove@attotech.com>
Acked-by: Don Brace <don.brace@microsemi.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-08 17:33:00 -05:00
James Smart
42fb055a57 scsi: lpfc: Update lpfc version to 12.2.0.0
Update lpfc version to 12.2.0.0

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:50 -05:00
James Smart
0d041215f0 scsi: lpfc: Update 12.2.0.0 file copyrights to 2019
For files modified as part of 12.2.0.0 patches, update copyright to 2019

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:50 -05:00
James Smart
c160c0f806 scsi: lpfc: Fix nvmet issues when link bounce under IO load
Various null pointer dereference and general protection fault panics occur
when there is a link bounce under load. There are a large number of "error"
message 6413 indicating "bad release".

The issues resolve to list corruptions due to missing or inconsistent lock
protection. Lockups are due to nested locks in the unsolicited abort
path. The unsolicited abort path calls the wrong abort processing
routine. There was also duplicate context release while aborts were still
active in the hardware.

Removed duplicate locks and added lock protection around list item
removal. Commonized lock handling around the abort processing routines.
Prevent context release while still in ABTS list.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:50 -05:00
James Smart
472e146d1c scsi: lpfc: Correct upcalling nvmet_fc transport during io done downcall
When the transport calls into the lpfc target to release an IO job
structure, which corresponds to an exchange, and if the driver was waiting
for an exchange in order to post a previously received command to the
transport, the driver immediately takes the IO job and reuses the context
for the prior command and calls nvmet_fc_rcv_fcp_req() to tell the
transport about a newly received command.

Problem is, the execution of the IO job release may be in the context of
the back end driver and its bio completion handlers, thus it may be in a
irq context and protection code kicks in in the bio and request layers that
are subsequently called.

Rework lpfc so that instead of immediately upcalling, queue it to a
deferred work thread and have the thread make the upcall.

Took advantage of this change to remove duplicated code with the normal
command receive path that preps the IO job and upcalls nvmet_fc. Created a
common routine both paths use.

Also corrected some errors that were found during review of the context
freeing and reuse - basically unlocked operations and a somewhat disjoint
set of calls to release associated job elements. Cleaned up this path and
added locks for coherency.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:50 -05:00
James Smart
f6e8479052 scsi: lpfc: Fix default driver parameter collision for allowing NPIV support
The conversion to enable SCSI and NVME fc4 support ran into an issue with
NPIV support. With NVME, NPIV is not currently supported, but with SCSI it
was. The driver reverted to its lowest setting meaning NPIV with SCSI was
not allowed.

Convert the NPIV checks and implementation so that SCSI can continue to
allow NPIV support.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:50 -05:00
James Smart
c2017260ee scsi: lpfc: Rework locking on SCSI io completion
A scsi host lock is taken on every io completion to check whether the abort
handler is waiting on the io completion. This is an expensive lock to take
on all completion when rarely in an abort condition.

Replace scsi host lock with command-specific lock. Synchronize completion
and abort paths by new cmd lock. Ensure all flag changing and nulling of
context pointers taken under lock.  When adding lock to task management
abort, realized it was missing other synchronization locks. Added that
synchronization to match normal paths.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:50 -05:00
James Smart
b1684a0b42 scsi: lpfc: Enable SCSI and NVME fc4s by default
Now that performance mods don't split resources by protocol and enable both
protocols by default, there's no reason not to enable concurrent SCSI and
NVME fc4 support.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:50 -05:00
James Smart
222e9239c6 scsi: lpfc: Resize cpu maps structures based on possible cpus
The work done to date utilized the number of present cpus when sizing
per-cpu structures. Structures should have been sized based on the max
possible cpu count.

Convert the driver over to possible cpu count for sizing allocation.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:50 -05:00
James Smart
75508a8b8b scsi: lpfc: Utilize new IRQ API when allocating MSI-X vectors
Current driver uses the older IRQ API for MSIX allocation

Change driver to utilize pci_alloc_irq_vectors when allocating IRQ vectors.

Make lpfc_cpu_affinity_check use pci_irq_get_affinity to determine how the
kernel mapped all the IRQs.

Remove msix_entries from SLI4 structure, replaced with pci_irq_vector()
usage.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:49 -05:00
James Smart
32517fc097 scsi: lpfc: Rework EQ/CQ processing to address interrupt coalescing
When driving high iop counts, auto_imax coalescing kicks in and drives the
performance to extremely small iops levels.

There are two issues:

 1) auto_imax is enabled by default. The auto algorithm, when iops gets
    high, divides the iops by the hdwq count and uses that value to
    calculate EQ_Delay. The EQ_Delay is set uniformly on all EQs whether
    they have load or not. The EQ_delay is only manipulated every 5s (a
    long time). Thus there were large 5s swings of no interrupt delay
    followed by large/maximum delay, before repeating.

 2) When processing a CQ, the driver got mixed up on the rate of when
    to ring the doorbell to keep the chip appraised of the eqe or cqe
    consumption as well as how how long to sit in the thread and
    process queue entries. Currently, the driver capped its work at
    64 entries (very small) and exited/rearmed the CQ.  Thus, on heavy
    loads, additional overheads were taken to exit and re-enter the
    interrupt handler. Worse, if in the large/maximum coalescing
    windows,k it could be a while before getting back to servicing.

The issues are corrected by the following:

 - A change in defaults. Auto_imax is turned OFF and fcp_imax is set
   to 0. Thus all interrupts are immediate.

 - Cleanup of field names and their meanings. Existing names were
   non-intuitive or used for duplicate things.

 - Added max_proc_limit field, to control the length of time the
   handlers would service completions.

 - Reworked EQ handling:
    Added common routine that walks eq, applying notify interval and max
      processing limits. Use queue_claimed to claim ownership of the queue
      while processing. Always rearm the queue whenever the common routine
      is called.
    Rework queue element processing, namely to eliminate hba_index vs
      host_index. Only one index is necessary. The queue entry can be
      marked invalid and the host_index updated immediately after eqe
      processing.
    After rework, xx_release routines are now DB write functions. Renamed
      the routines as such.
    Moved lpfc_sli4_eq_flush(), which does similar action, to same area.
    Replaced the 2 individual loops that walk an eq with a call to the
      common routine.
    Slightly revised lpfc_sli4_hba_handle_eqe() calling syntax.
    Added per-cpu counters to detect interrupt rates and scale
      interrupt coalescing values.

 - Reworked CQ handling:
    Added common routine that walks cq, applying notify interval and max
      processing limits. Use queue_claimed to claim ownership of the queue
      while processing. Always rearm the queue whenever the common routine
      is called.
    Rework queue element processing, namely to eliminate hba_index vs
      host_index. Only one index is necessary. The queue entry can be
      marked invalid and the host_index updated immediately after cqe
      processing.
    After rework, xx_release routines are now DB write functions.  Renamed
      the routines as such.
    Replaced the 3 individual loops that walk a cq with a call to the
      common routine.
    Redefined lpfc_sli4_sp_handle_mcqe() to commong handler definition with
      queue reference. Add increment for mbox completion to handler.

 - Added a new module/sysfs attribute: lpfc_cq_max_proc_limit To allow
   dynamic changing of the CQ max_proc_limit value being used.

Although this leaves an EQ as an immediate interrupt, that interrupt will
only occur if a CQ bound to it is in an armed state and has cqe's to
process.  By staying in the cq processing routine longer, high loads will
avoid generating more interrupts as they will only rearm as the processing
thread exits. The immediately interrupt is also beneficial to idle or
lower-processing CQ's as they get serviced immediately without being
penalized by sharing an EQ with a more loaded CQ.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:49 -05:00
James Smart
cb733e3587 scsi: lpfc: cleanup: convert eq_delay to usdelay
Review of the eq coalescing logic showed the code was a bit fragmented.
Sometimes it would save/set via an interrupt max value, while in others it
would do so via a usdelay. There were also two places changing eq delay,
one place that issued mailbox commands, and another that changed via
register writes if supported.

Clean this up by:

 - Standardizing the operation of lpfc_modify_hba_eq_delay() routine so
   that it is always told of a us delay to impose. The routine then chooses
   the best way to set that - via register or via mbx.

 - Rather than two value types stored in eq->q_mode (usdelay if change via
   register, imax if change via mbox) - q_mode always contains usdelay.
   Before any value change, old vs new value is compared and only if
   different is a change done.

 - Revised the dmult calculation. dmult is not set based on overall imax
   divided by hardware queues - instead imax applies to a single cpu and
   the value will be replicated to all cpus.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:49 -05:00
James Smart
6a828b0f61 scsi: lpfc: Support non-uniform allocation of MSIX vectors to hardware queues
So far MSIX vector allocation assumed it would be 1:1 with hardware
queues. However, there are several reasons why fewer MSIX vectors may be
allocated than hardware queues such as the platform being out of vectors or
adapter limits being less than cpu count.

This patch reworks the MSIX/EQ relationships with the per-cpu hardware
queues so they can function independently. MSIX vectors will be equitably
split been cpu sockets/cores and then the per-cpu hardware queues will be
mapped to the vectors most efficient for them.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:49 -05:00
James Smart
b3295c2a75 scsi: lpfc: Fix setting affinity hints to correlate with hardware queues
The desired affinity for the hardware queue behavior is for hdwq 0 to be
affinitized with cpu 0, hdwq 1 to cpu 1, and so on.  The implementation so
far does not do this if the number of cpus is greater than the number of
hardware queues (e.g. hardware queue allocation was administratively
reduced or hardware queue resources could not scale to the cpu count).

Correct the queue affinitization logic when queue count is less than
cpu count.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:09 -05:00
James Smart
45aa312e21 scsi: lpfc: Allow override of hardware queue selection policies
Default behavior is to use the information from the upper IO stacks to
select the hardware queue to use for IO submission.  Which typically has
good cpu affinity.

However, the driver, when used on some variants of the upstream kernel, has
found queuing information to be suboptimal for FCP or IO completion locked
on particular cpus.

For command submission situations, the lpfc_fcp_io_sched module parameter
can be set to specify a hardware queue selection policy that overrides the
os stack information.

For IO completion situations, rather than queing cq processing based on the
cpu servicing the interrupting event, schedule the cq processing on the cpu
associated with the hardware queue's cq.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:09 -05:00
James Smart
c490850a09 scsi: lpfc: Adapt partitioned XRI lists to efficient sharing
The XRI get/put lists were partitioned per hardware queue. However, the
adapter rarely had sufficient resources to give a large number of resources
per queue. As such, it became common for a cpu to encounter a lack of XRI
resource and request the upper io stack to retry after returning a BUSY
condition. This occurred even though other cpus were idle and not using
their resources.

Create as efficient a scheme as possible to move resources to the cpus that
need them. Each cpu maintains a small private pool which it allocates from
for io. There is a watermark that the cpu attempts to keep in the private
pool.  The private pool, when empty, pulls from a global pool from the
cpu. When the cpu's global pool is empty it will pull from other cpu's
global pool. As there many cpu global pools (1 per cpu or hardware queue
count) and as each cpu selects what cpu to pull from at different rates and
at different times, it creates a radomizing effect that minimizes the
number of cpu's that will contend with each other when the steal XRI's from
another cpu's global pool.

On io completion, a cpu will push the XRI back on to its private pool.  A
watermark level is maintained for the private pool such that when it is
exceeded it will move XRI's to the CPU global pool so that other cpu's may
allocate them.

On NVME, as heartbeat commands are critical to get placed on the wire, a
single expedite pool is maintained. When a heartbeat is to be sent, it will
allocate an XRI from the expedite pool rather than the normal cpu
private/global pools. On any io completion, if a reduction in the expedite
pools is seen, it will be replenished before the XRI is placed on the cpu
private pool.

Statistics are added to aid understanding the XRI levels on each cpu and
their behaviors.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:09 -05:00
James Smart
ace44e48b1 scsi: lpfc: Synchronize hardware queues with SCSI MQ interface
Now that the lower half has much better per-cpu parallelization using the
hardware queues, the SCSI MQ support needs to be tied into it.

The involves the following mods:

 - Use the hardware queue info from the midlayer to help select the
   hardware queue to utilize. This required change to the get_scsi-buf_xxx
   routines.

 - Remove lpfc_sli4_scmd_to_wqidx_distr() routine. No longer needed.

 - Includes fix for SLI-3 that does not have multi queue parallelization.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:09 -05:00
James Smart
1fbf974250 scsi: lpfc: Convert ring number to hardware queue for nvme wqe posting.
SLI4 nvme functions are passing the SLI3 ring number when posting wqe to
hardware. This should be indicating the hardware queue to use, not the ring
number.

Replace ring number with the hardware queue that should be used.

Note: SCSI avoided this issue as it utilized an older lfpc_issue_iocb
routine that properly adapts.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:09 -05:00
James Smart
4c47efc140 scsi: lpfc: Move SCSI and NVME Stats to hardware queue structures
Many io statistics were being sampled and saved using adapter-based data
structures. This was creating a lot of contention and cache thrashing in
the I/O path.

Move the statistics to the hardware queue data structures.  Given the
per-queue data structures, use of atomic types is lessened.

Add new sysfs and debugfs stat routines to collate the per hardware queue
values and report at an adapter level.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:08 -05:00
James Smart
63df6d637e scsi: lpfc: Adapt cpucheck debugfs logic to Hardware Queues
Similar to the io execution path that reports cpu context information, the
debugfs routines for cpu information needs to be aligned with new hardware
queue implementation.

Convert debugfs cnd nvme cpucheck statistics to report information per
Hardware Queue.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:28:11 -05:00
James Smart
18c27a6216 scsi: lpfc: cleanup: Remove unused FCP_XRI_ABORT_EVENT slowpath event
Both NVME and SCSI aborts are now processed off the CQ workqueue and do not
generate events for the slowpath any more.

Remove the unused event code.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:24:22 -05:00
James Smart
5e5b511d8b scsi: lpfc: Partition XRI buffer list across Hardware Queues
Once the IO buff allocations were made shared, there was a single XRI
buffer list shared by all hardware queues.  A single list isn't great for
performance when shared across the per-cpu hardware queues.

Create a separate XRI IO buffer get/put list for each Hardware Queue.  As
SGLs and associated IO buffers get allocated/posted to the firmware; round
robin their assignment across all available hardware Queues so that there
is an equitable assignment.

Modify SCSI and NVME IO submit code paths to use the Hardware Queue logic
for XRI allocation.

Add a debugfs interface to display hardware queue statistics

Added new empty_io_bufs counter to track if a cpu runs out of XRIs.

Replace common_ variables/names with io_ to make meanings clearer.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:24:22 -05:00
James Smart
cdb42becdd scsi: lpfc: Replace io_channels for nvme and fcp with general hdw_queues per cpu
Currently, both nvme and fcp each have their own concept of an io_channel,
which is a combination wq/cq and associated msix.  Different cpus would
share an io_channel.

The driver is now moving to per-cpu wq/cq pairs and msix vectors.  The
driver will still use separate wq/cq pairs per protocol on each cpu, but
the protocols will share the msix vector.

Given the elimination of the nvme and fcp io channels, the module
parameters will be removed.  A new parameter, lpfc_hdw_queue is added which
allows the wq/cq pair allocation per cpu to be overridden and allocated to
lesser value. If lpfc_hdw_queue is zero, the number of pairs allocated will
be based on the number of cpus. If non-zero, the parameter specifies the
number of queues to allocate. At this time, the maximum non-zero value is
64.

To manage this new paradigm, a new hardware queue structure is created to
track queue activity and relationships.

As MSIX vector allocation must be known before setting up the
relationships, msix allocation now occurs before queue datastructures are
allocated. If the number of vectors allocated is less than the desired
hardware queues, the hardware queue counts will be reduced to the number of
vectors

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:22:42 -05:00
James Smart
7370d10ac9 scsi: lpfc: Remove extra vector and SLI4 queue for Expresslane
There is a extra queue and msix vector for expresslane. Now that the driver
will be doing queues per cpu, this oddball queue is no longer needed.
Expresslane will utilize the normal per-cpu queues.

Updated debugfs sli4 queue output to go along with the change

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:22:42 -05:00
James Smart
0794d601d1 scsi: lpfc: Implement common IO buffers between NVME and SCSI
Currently, both NVME and SCSI get their IO buffers from separate
pools. XRI's are associated 1:1 with IO buffers, so XRI's are also split
between protocols.

Eliminate the independent pools and use a single pool. Each buffer
structure now has a common section and a protocol section. Per protocol
routines for SGL initialization are removed and replaced by common
routines. Initialization of the buffers is only done on the common area.
All other fields, which are protocol specific, are initialized when the
buffer is allocated for use in the per-protocol allocation routine.

In the past, the SCSI side allocated IO buffers as part of slave_alloc
calls until the maximum XRIs for SCSI was reached. As all XRIs are now
common and may be used for either protocol, allocation for everything is
done as part of adapter initialization and the scsi side has no action in
slave alloc.

As XRI's are no longer split, the lpfc_xri_split module parameter is
removed.

Adapters based on SLI3 will continue to use the older scsi_buf_list_get/put
routines.  All SLI4 adapters utilize the new IO buffer scheme

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:22:42 -05:00
James Smart
e960f5ab40 scsi: lpfc: cleanup: Remove excess check on NVME io submit code path
lpfc_nvme_prep_io_cmd() checks for null pnode, but caller
lpfc_nvme_fcp_io_submit() has already ensured it's non-null.

Remove the pnode null check.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:22:42 -05:00
James Smart
0b05e9fe1f scsi: lpfc: cleanup: remove nrport from nvme command structure
An hba-wide lock is taken in the nvme io completion routine. The lock
covers null'ing of the nrport pointer in the cmd structure.

The nrport member isn't necessary. After extracting the pointer from the
command, the pointer was dereferenced to get the fc discovery node
pointer. But the fc discovery node pointer is alrady in the command
structure so the dereferrence was unnecessary.

Eliminated the nrport structure member and its use, which also eliminates
the port-wide lock.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:22:42 -05:00
Himanshu Madhani
b8837a0f88 scsi: qla2xxx: Update driver version to 10.00.00.13-k
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 21:41:16 -05:00
Quinn Tran
1560bafdff scsi: qla2xxx: Use complete switch scan for RSCN events
This patch removes unnecessary code to handle RSCN, instead performs full
scan everytime driver receives RSCN

Fixes: d4f7a16aec ("scsi: qla2xxx: Remove ASYNC GIDPN switch command")
Cc: stable@vger.kernel.org #4.19
Signed-off-by: Quinn Tran <qtran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 21:41:16 -05:00
Quinn Tran
87d6814a28 scsi: qla2xxx: Fix fw options handle eh_bus_reset()
For eh_bus_reset, driver is supposed to reset the link.  Current option to
reset the link is applicable to Loop only. This patch updates current FW
option with the one that is applicable to all topologies.

Signed-off-by: Quinn Tran <qtran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 21:41:16 -05:00
Sawan Chandak
dcbf8f8087 scsi: qla2xxx: Restore FAWWPN of Physical Port only for loop down
When loop was made down explicitly due to cable pull, then for N2N toplogy,
if FAWWPN BIT is enabled by user, then it would restore some default
(garbage) value for Physical port WWPN, so this show garbage WWPN for the
port. Fix is, to restore physical port WWPN, if it is fabric
configuration. When loop is explicitly made down, and FAWWPN feature is
enabled, then driver need to restore original flashed WWPN.

Signed-off-by: Sawan Chandak <schandak@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 21:41:16 -05:00
Quinn Tran
5e85f6df77 scsi: qla2xxx: Prevent memory leak for CT req/rsp allocation
This patch fixes memory leak by releasing DMA memory in case CT request and
response allocation fails.

Signed-off-by: Quinn Tran <qtran@marvall.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 21:41:16 -05:00
Giridhar Malavali
97a93cea88 scsi: qla2xxx: Fix SRB allocation flag to avoid sleeping in IRQ context
This patch fixes SRB allocation flag from GFP_KERNEL to GFP_ATOMIC, to
prevent sleeping in IRQ context

Signed-off-by: Giridhar Malavali <gmalavali@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 21:41:16 -05:00
Quinn Tran
1021f0bc2f scsi: qla2xxx: allow session delete to finish before create.
This patch flushes del_work and free_work while sending NACK response for
PRLI

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 21:41:16 -05:00
Quinn Tran
9ecd6564d1 scsi: qla2xxx: fix fcport null pointer access.
This patch allocates DMA memory to prevent NULL pointer access for ct_sns
request while sending switch commands.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 21:41:16 -05:00
Quinn Tran
51fd6e6351 scsi: qla2xxx: flush IO on chip reset or sess delete
On Transmit respond in target mode, if the chip is already reset or the
session is already deleted, then advance the command to the free step.
There is no need to abort the command, because the chip has already flushed
it.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 21:41:16 -05:00
Quinn Tran
80676d054e scsi: qla2xxx: Fix session cleanup hang
On session cleanup, either an implicit LOGO or an implicit PRLO is used to
flush IOs.  If the flush command hit Queue Full condition, then it is
dropped.  This patch adds retry code to prevent command drop.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 21:41:16 -05:00
Quinn Tran
4825034afb scsi: qla2xxx: Change default ZIO threshold.
Change default ZIO threshold to an optimized value.

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 21:41:16 -05:00
Quinn Tran
590f806ddd scsi: qla2xxx: Add pci function reset support.
This patch provide call back functions to stop the chip and resume the chip
if the PCI lower level driver wants to perform function level reset/FLR.
Before the FLR, the chip will be stopped to turn off all DMA activity.
After the FLR, the chip is reset to resume previous operation state.

Signed-off-by: Quinn Tran <qtran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 21:41:15 -05:00
Himanshu Madhani
7f147f9bfd scsi: qla2xxx: Fix N2N target discovery with Local loop
This patch fixes the issue where Dell-EMC Target will fail to discover LUNs
if domain and area of port ID is not same as adapter's.

Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 21:41:15 -05:00
Christoph Hellwig
69ed175c19 scsi: block: remove req->special
No users left.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 21:30:09 -05:00
Christoph Hellwig
b9f9199299 scsi: stop setting up request->special
No more need in a blk-mq world where the scsi command and request are
allocated together.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 21:29:49 -05:00
Christoph Hellwig
ae3d56d815 scsi: remove bidirectional command support
No real need for bidi support once the OSD code is gone.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 21:29:21 -05:00
Christoph Hellwig
19fcae3d4f scsi: remove the SCSI OSD library
Now that all the users are gone the SCSI OSD library can be removed as
well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 21:28:52 -05:00
Christoph Hellwig
972248e911 scsi: bsg-lib: handle bidi requests without block layer help
We can just stash away the second request in struct bsg_job instead of
using the block layer req->next_rq field, allowing for the eventual removal
of the latter.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 21:27:40 -05:00
Suganath Prabu S
c6ded86a16 scsi: mpt3sas: Update driver version to 27.102.00.00
Updated driver version to 27.102.00.00 from 27.101.00.00.

Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-04 22:52:21 -05:00
Suganath Prabu S
eb9c7ce560 scsi: mpt3sas: Add support for ATLAS PCIe switch
Add Atlas PCIe Switch Management Port device PNPID,
Vendor Id: 0x1000
device Id: 0x00B2

This device is based on MPI 2.6 spec and it exposes one SES device to
accept management commands for the PCIe switch.

Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-04 22:52:21 -05:00