Commit Graph

730 Commits

Author SHA1 Message Date
James Smart
f6770e7d23 scsi: lpfc: Clean up hba max_lun_queue_depth checks
The current code does some odd +1 over maximum xri count checks and
requires that the lun_queue_count can't be bigger than maximum xri count
divided by 8. These items are bogus.

Clean the code up to cap lun_queue_count to maximum xri count.

Link: https://lore.kernel.org/r/20200128002312.16346-10-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-02-10 22:46:56 -05:00
James Smart
a99c80742a scsi: lpfc: Fix compiler warning on frame size
The following error is see from the compiler:

  drivers/scsi/lpfc/lpfc_init.c: In function
    ‘lpfc_cpuhp_get_eq’: drivers/scsi/lpfc/lpfc_init.c:12660:1:
      error: the frame size of 1032 bytes is larger than 1024 bytes
         [-Werror=frame-larger-than=]

The issue is due to allocating a cpumask on the stack.

Fix by converting to a dynamical allocation of the cpu mask.

Link: https://lore.kernel.org/r/20200128002312.16346-7-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-02-10 22:46:56 -05:00
James Smart
821bc882ac scsi: lpfc: Fix release of hwq to clear the eq relationship
When performing reset testing, the eq's list for related hwqs was getting
corrupted.  In cases where there is not a 1:1 eq to hwq, the eq is
shared. The eq maintains a list of hwqs utilizing it in case of cpu
offlining and polling. During the reset, the hwqs are being torn down so
they can be recreated. The recreation was getting confused by seeing a
non-null eq assignment on the eq and the eq list became corrupt.

Correct by clearing the hdwq eq assignment when the hwq is cleaned up.

Link: https://lore.kernel.org/r/20200128002312.16346-6-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-02-10 22:46:55 -05:00
Martin K. Petersen
1c46a2cf2d block, scsi: final compat_ioctl cleanup
This series concludes the work I did for linux-5.5 on the compat_ioctl()
 cleanup, killing off fs/compat_ioctl.c and block/compat_ioctl.c by moving
 everything into drivers.
 
 Overall this would be a reduction both in complexity and line count, but
 as I'm also adding documentation the overall number of lines increases
 in the end.
 
 My plan was originally to keep the SCSI and block parts separate.
 This did not work easily because of interdependencies: I cannot
 do the final SCSI cleanup in a good way without first addressing the
 CDROM ioctls, so this is one series that I hope could be merged through
 either the block or the scsi git trees, or possibly both if you can
 pull in the same branch.
 
 The series comes in these steps:
 
 1. clean up the sg v3 interface as suggested by Linus. I have
    talked about this with Doug Gilbert as well, and he would
    rebase his sg v4 patches on top of "compat: scsi: sg: fix v3
    compat read/write interface"
 
 2. Actually moving handlers out of block/compat_ioctl.c and
    block/scsi_ioctl.c into drivers, mixed in with cleanup
    patches
 
 3. Document how to do this right. I keep getting asked about this,
    and it helps to point to some documentation file.
 
 The branch is based on another one that fixes a couple of bugs found
 during the creation of this series.
 
 Changes since v3:
   https://lore.kernel.org/lkml/20200102145552.1853992-1-arnd@arndb.de/
 
 - Move sr_compat_ioctl fixup to correct patch (Ben Hutchings)
 - Add Reviewed-by tags
 
 Changes since v2:
   https://lore.kernel.org/lkml/20191217221708.3730997-1-arnd@arndb.de/
 
 - Rebase to v5.5-rc4, which contains the earlier bugfixes
 - Fix sr_block_compat_ioctl() error handling bug found by
   Ben Hutchings
 - Fix idecd_locked_compat_ioctl() compat_ptr() bug
 - Don't try to handle HDIO_DRIVE_TASKFILE in drivers/ide
 - More documentation improvements
 
 Changes since v1:
   https://lore.kernel.org/lkml/20191211204306.1207817-1-arnd@arndb.de/
 
 - move out the bugfixes into a branch for itself
 - clean up scsi sg driver further as suggested by Christoph Hellwig
 - avoid some ifdefs by moving compat_ptr() out of asm/compat.h
 - split out the blkdev_compat_ptr_ioctl function; bug spotted by
   Ben Hutchings
 - Improve formatting of documentation
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJeDv8JAAoJEGCrR//JCVInh/oP/2BHdQvWONxwXXg2BLH7OJHm
 4PFoblxjNH/pwHm2PKh2uj8vUSgTHqID7NJChVgKZaZEJEJR7h26Sx60p+yTAepR
 /ysQiGameacJu2ZzKPYc4/S33Yu8cogQ5+DSz7mI9T5Yw0HSAE0JZ5xd9KIZ+/u8
 6k65ujd9kCxCgmtXrpx+7JFF0xb+urXKCvjdt2EfQ1ZmuMX5rDG/bTNg5JJ50shW
 vb7Z8hCpfW61ux8M/dgIh4WvUf0SA7FOy8WF1Km9gNhKGj41Arb2lmX1Jb4jDgjl
 DGsXQupyMVwigp5N37H3o1MamX/C8S49c16/zJQcJj64xX7WdxhE5kR8JIf+36Tf
 2l4wpaqVukXPvXkdv76Y472fKoOMZATF6kCoEPG3gXW9oxXDs5d2ofALfO3uNfLB
 PC4hzorw6bBlt67qAqERft2cxMMi9xSYfYZ8jD+eSF8WLL7xIcEazZqq8dKz7O00
 Qqx6+jzejT18av7cPfLjnupZg+mEcxDbPeuCgjrbhR8lcUI4DBu379RiTaQanvyR
 W00zwqCZWYnNJoha8u3AKsRcfL8eziF+/K9k+lCuhXeQBI4ipFJ03wAD4TWCigCS
 N7AikOdLzGVxE+2IfeCXPDKpdT6hFnjulnyDEgc/7jwHzcVF3MQBHwXiKhWHEUvT
 /AzAtKAiivp+uaMgAbzd
 =cddL
 -----END PGP SIGNATURE-----

Merge tag 'block-ioctl-cleanup-5.6' into 5.6/scsi-queue

Pull compat_ioctl cleanup from Arnd. Here's his description:

This series concludes the work I did for linux-5.5 on the compat_ioctl()
cleanup, killing off fs/compat_ioctl.c and block/compat_ioctl.c by moving
everything into drivers.

Overall this would be a reduction both in complexity and line count, but
as I'm also adding documentation the overall number of lines increases
in the end.

My plan was originally to keep the SCSI and block parts separate.
This did not work easily because of interdependencies: I cannot
do the final SCSI cleanup in a good way without first addressing the
CDROM ioctls, so this is one series that I hope could be merged through
either the block or the scsi git trees, or possibly both if you can
pull in the same branch.

The series comes in these steps:

1. clean up the sg v3 interface as suggested by Linus. I have
   talked about this with Doug Gilbert as well, and he would
   rebase his sg v4 patches on top of "compat: scsi: sg: fix v3
   compat read/write interface"

2. Actually moving handlers out of block/compat_ioctl.c and
   block/scsi_ioctl.c into drivers, mixed in with cleanup
   patches

3. Document how to do this right. I keep getting asked about this,
   and it helps to point to some documentation file.

The branch is based on another one that fixes a couple of bugs found
during the creation of this series.

Changes since v3:
  https://lore.kernel.org/lkml/20200102145552.1853992-1-arnd@arndb.de/

- Move sr_compat_ioctl fixup to correct patch (Ben Hutchings)
- Add Reviewed-by tags

Changes since v2:
  https://lore.kernel.org/lkml/20191217221708.3730997-1-arnd@arndb.de/

- Rebase to v5.5-rc4, which contains the earlier bugfixes
- Fix sr_block_compat_ioctl() error handling bug found by
  Ben Hutchings
- Fix idecd_locked_compat_ioctl() compat_ptr() bug
- Don't try to handle HDIO_DRIVE_TASKFILE in drivers/ide
- More documentation improvements

Changes since v1:
  https://lore.kernel.org/lkml/20191211204306.1207817-1-arnd@arndb.de/

- move out the bugfixes into a branch for itself
- clean up scsi sg driver further as suggested by Christoph Hellwig
- avoid some ifdefs by moving compat_ptr() out of asm/compat.h
- split out the blkdev_compat_ptr_ioctl function; bug spotted by
  Ben Hutchings
- Improve formatting of documentation

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-01-10 00:14:46 -05:00
James Smart
0b4391946d scsi: lpfc: Fix unmap of dpp bars affecting next driver load
When unattaching, the driver did not unmap the DPP bar. This caused the
next load of the driver, which attempts to enable wc, to not work correctly
and wc to be disabled due to an address mapping overlap.

Fix by unmapping on unattach.

Link: https://lore.kernel.org/r/20191218235808.31922-8-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-21 13:42:42 -05:00
James Smart
a052ce848d scsi: lpfc: Fix disablement of FC-AL on lpe35000 models
The order of the flags/checks for adapters where FC-AL is supported
erroneously excluded lpe35000 adapter models.  Also noted that the G7 flags
for Loop and Persistent topology are incorrect. They should follow the
rules as G6.

Rework the logic to enable LPe35000 FC-AL support.  Collapse G7 support
logic to the same rules as G6.

Link: https://lore.kernel.org/r/20191218235808.31922-7-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-21 13:42:42 -05:00
James Smart
e3ba04c9ba scsi: lpfc: Fix Fabric hostname registration if system hostname changes
There are reports of multiple ports on the same system displaying different
hostnames in fabric FDMI displays.

Currently, the driver registers the hostname at initialization and obtains
the hostname via init_utsname()->nodename queried at the time the FC link
comes up. Unfortunately, if the machine hostname is updated after
initialization, such as via DHCP or admin command, the value registered
initially will be incorrect.

Fix by having the driver save the hostname that was registered with FDMI.
The driver then runs a heartbeat action that will check the hostname.  If
the name changes, reregister the FMDI data.

The hostname is used in RSNN_NN, FDMI RPA and FDMI RHBA.

Link: https://lore.kernel.org/r/20191218235808.31922-5-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-21 13:42:42 -05:00
Colin Ian King
291c254845 scsi: lpfc: fix spelling mistakes of asynchronous
There are spelling mistakes of asynchronous in a lpfc_printf_log message
and comments. Fix these.

Link: https://lore.kernel.org/r/20191218084301.627555-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-12-19 18:47:07 -05:00
Linus Torvalds
138f371ddf SCSI misc on 20191207
11 patches, all in drivers (no core changes) that are either minor
 cleanups or small fixes.  They were late arriving, but still safe for
 -rc1.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXeyDEiYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishbijAQDIuM45
 xeEXgWXF8c/tYuvildK1WjyVzwBO2k563lUmYQEA4rxSzkmhtcaMTDuk4hI4Y4TP
 p87U1bXNSJ7tCpFU15w=
 =W43H
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull more SCSI updates from James Bottomley:
 "Eleven patches, all in drivers (no core changes) that are either minor
  cleanups or small fixes.

  They were late arriving, but still safe for -rc1"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: MAINTAINERS: Add the linux-scsi mailing list to the ISCSI entry
  scsi: megaraid_sas: Make poll_aen_lock static
  scsi: sd_zbc: Improve report zones error printout
  scsi: qla2xxx: Fix qla2x00_request_irqs() for MSI
  scsi: qla2xxx: unregister ports after GPN_FT failure
  scsi: qla2xxx: fix rports not being mark as lost in sync fabric scan
  scsi: pm80xx: Remove unused include of linux/version.h
  scsi: pm80xx: fix logic to break out of loop when register value is 2 or 3
  scsi: scsi_transport_sas: Fix memory leak when removing devices
  scsi: lpfc: size cpu map by last cpu id set
  scsi: ibmvscsi_tgt: Remove unneeded variable rc
2019-12-08 12:23:42 -08:00
Linus Torvalds
ef2cc88e2a SCSI misc on 20191130
This is mostly update of the usual drivers: aacraid, ufs, zfcp,
 NCR5380, lpfc, qla2xxx, smartpqi, hisi_sas, target, mpt3sas, pm80xx
 plus a whole load of minor updates and fixes.  The two major core
 changes are Al Viro's reworking of sg's handling of copy to/from user,
 Ming Lei's removal of the host busy counter to avoid contention in the
 multiqueue case and Damien Le Moal's fixing of residual tracking
 across error handling.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXeKvHCYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishQJMAQDAjlAi
 SNfbyndMqyf+rZGWufDI+43Up1VvW9GeWJHeDwEAxfO5XZsCks2uT8UxXhpEp9L7
 HkiUww3zbcgl0FWFkUM=
 =cdVU
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly update of the usual drivers: aacraid, ufs, zfcp,
  NCR5380, lpfc, qla2xxx, smartpqi, hisi_sas, target, mpt3sas, pm80xx
  plus a whole load of minor updates and fixes.

  The major core changes are Al Viro's reworking of sg's handling of
  copy to/from user, Ming Lei's removal of the host busy counter to
  avoid contention in the multiqueue case and Damien Le Moal's fixing of
  residual tracking across error handling"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (251 commits)
  scsi: bnx2fc: timeout calculation invalid for bnx2fc_eh_abort()
  scsi: target: core: Fix a pr_debug() argument
  scsi: iscsi: Don't send data to unbound connection
  scsi: target: iscsi: Wait for all commands to finish before freeing a session
  scsi: target: core: Release SPC-2 reservations when closing a session
  scsi: target: core: Document target_cmd_size_check()
  scsi: bnx2i: fix potential use after free
  Revert "scsi: qla2xxx: Fix memory leak when sending I/O fails"
  scsi: NCR5380: Add disconnect_mask module parameter
  scsi: NCR5380: Unconditionally clear ICR after do_abort()
  scsi: NCR5380: Call scsi_set_resid() on command completion
  scsi: scsi_debug: num_tgts must be >= 0
  scsi: lpfc: use hdwq assigned cpu for allocation
  scsi: arcmsr: fix indentation issues
  scsi: qla4xxx: fix double free bug
  scsi: pm80xx: Modified the logic to collect fatal dump
  scsi: pm80xx: Tie the interrupt name to the module instance
  scsi: pm80xx: Controller fatal error through sysfs
  scsi: pm80xx: Do not request 12G sas speeds
  scsi: pm80xx: Cleanup command when a reset times out
  ...
2019-12-02 13:37:02 -08:00
James Smart
eede4970fb scsi: lpfc: size cpu map by last cpu id set
Currently the lpfc driver sizes its cpu_map array based on
num_possible_cpus(). However, that can be a value that is less than the
highest cpu id bit that is set. As such, if a thread runs on a cpu with a
larger cpu id, or for_each_possible_cpu() is used, the driver could index
off the end of the array and return garbage or GPF.

The driver maintains its own internal copy of the "num_possible" cpu value
and sizes arrays by it.

Fix by setting the driver's value to the value of the last cpu id bit set
in the possible_mask - plus 1. Thus cpu_map will be sized to allow access
by any cpu id possible.

Link: https://lore.kernel.org/r/20191121175556.18953-1-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-21 20:49:50 -05:00
James Smart
bc227dde0d scsi: lpfc: Initialize cpu_map for not present cpus
Currently, cpu_map[cpu#]->hdwq is left to equal LPFC_VECTOR_MAP_EMPTY for
not present CPUs.  If a CPU is dynamically hot-added, it is possible we may
crash due to not assigning an allocated hdwq.

Correct by assigning a hdwq at initialization for all not-present CPUs.

Fixes: dcaa213679 ("scsi: lpfc: Change default IRQ model on AMD architectures")
Link: https://lore.kernel.org/r/20191111230401.12958-5-jsmart2021@gmail.com
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-12 22:21:33 -05:00
Bart Van Assche
61951a6d31 scsi: lpfc: Fix lpfc_cpumask_of_node_init()
Fix the following kernel warning:

cpumask_of_node(-1): (unsigned)node >= nr_node_ids(1)

Fixes: dcaa213679 ("scsi: lpfc: Change default IRQ model on AMD architectures")
Link: https://lore.kernel.org/r/20191108225947.1395-1-jsmart2021@gmail.com
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-08 21:29:42 -05:00
Bart Van Assche
eea2d396aa scsi: lpfc: Fix a kernel warning triggered by lpfc_sli4_enable_intr()
Fix the following lockdep warning:

============================================
WARNING: possible recursive locking detected
5.4.0-rc6-dbg+ #2 Not tainted
--------------------------------------------
systemd-udevd/130 is trying to acquire lock:
ffffffff826b05d0 (cpu_hotplug_lock.rw_sem){++++}, at: irq_calc_affinity_vectors+0x63/0x90

but task is already holding lock:

ffffffff826b05d0 (cpu_hotplug_lock.rw_sem){++++}, at: lpfc_sli4_enable_intr+0x422/0xd50 [lpfc]

other info that might help us debug this:

 Possible unsafe locking scenario:
       CPU0
       ----
  lock(cpu_hotplug_lock.rw_sem);
  lock(cpu_hotplug_lock.rw_sem);

*** DEADLOCK ***
 May be due to missing lock nesting notation
2 locks held by systemd-udevd/130:
 #0: ffff8880d53fe210 (&dev->mutex){....}, at: __device_driver_lock+0x4a/0x70
 #1: ffffffff826b05d0 (cpu_hotplug_lock.rw_sem){++++}, at: lpfc_sli4_enable_intr+0x422/0xd50 [lpfc]

stack backtrace:
CPU: 1 PID: 130 Comm: systemd-udevd Not tainted 5.4.0-rc6-dbg+ #2
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Call Trace:
 dump_stack+0xa5/0xe6
 __lock_acquire.cold+0xf7/0x23a
 lock_acquire+0x106/0x240
 cpus_read_lock+0x41/0xe0
 irq_calc_affinity_vectors+0x63/0x90
 __pci_enable_msix_range+0x10a/0x950
 pci_alloc_irq_vectors_affinity+0x144/0x210
 lpfc_sli4_enable_intr+0x4b2/0xd50 [lpfc]
 lpfc_pci_probe_one+0x1411/0x22b0 [lpfc]
 local_pci_probe+0x7c/0xc0
 pci_device_probe+0x25d/0x390
 really_probe+0x170/0x510
 driver_probe_device+0x127/0x190
 device_driver_attach+0x98/0xa0
 __driver_attach+0xb6/0x1a0
 bus_for_each_dev+0x100/0x150
 driver_attach+0x31/0x40
 bus_add_driver+0x246/0x300
 driver_register+0xe0/0x170
 __pci_register_driver+0xde/0xf0
 lpfc_init+0x134/0x1000 [lpfc]
 do_one_initcall+0xda/0x47e
 do_init_module+0x10a/0x3b0
 load_module+0x4318/0x47c0
 __do_sys_finit_module+0x134/0x1d0
 __x64_sys_finit_module+0x47/0x50
 do_syscall_64+0x6f/0x2e0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: dcaa213679 ("scsi: lpfc: Change default IRQ model on AMD architectures")
Link: https://lore.kernel.org/r/20191107052158.25788-4-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-08 21:28:53 -05:00
James Smart
dcaa213679 scsi: lpfc: Change default IRQ model on AMD architectures
The current driver attempts to allocate an interrupt vector per cpu using
the systems managed IRQ allocator (flag PCI_IRQ_AFFINITY). The system IRQ
allocator will either provide the per-cpu vector, or return fewer
vectors. When fewer vectors, they are evenly spread between the numa nodes
on the system.  When run on an AMD architecture, if interrupts occur to a
cpu that is not in the same numa node as the adapter generating the
interrupt, there are extreme costs and overheads in performance.  Thus, if
1:1 vector allocation is used, or the "balanced" vectors in the other numa
nodes, performance can be hit significantly.

A much more performant model is to allocate interrupts only on the cpus
that are in the numa node where the adapter resides.  I/O completion is
still performed by the cpu where the I/O was generated. Unfortunately,
there is no flag to request the managed IRQ subsystem allocate vectors only
for the CPUs in the numa node as the adapter.

On AMD architecture, revert the irq allocation to the normal style
(non-managed) and then use irq_set_affinity_hint() to set the cpu
affinity and disable user-space rebalancing.

Tie the support into CPU offline/online. If the cpu being offlined owns a
vector, the vector is re-affinitized to one of the other CPUs on the same
numa node. If there are no more CPUs on the numa node, the vector has all
affinity removed and lets the system determine where it's serviced.
Similarly, when the cpu that owned a vector comes online, the vector is
reaffinitized to the cpu.

Link: https://lore.kernel.org/r/20191105005708.7399-10-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:04 -05:00
James Smart
93a4d6f401 scsi: lpfc: Add registration for CPU Offline/Online events
The recent affinitization didn't address cpu offlining/onlining.  If an
interrupt vector is shared and the low order cpu owning the vector is
offlined, as interrupts are managed, the vector is taken offline. This
causes the other CPUs sharing the vector will hang as they can't get io
completions.

Correct by registering callbacks with the system for Offline/Online
events. When a cpu is taken offline, its eq, which is tied to an interrupt
vector is found. If the cpu is the "owner" of the vector and if the
eq/vector is shared by other CPUs, the eq is placed into a polled mode.
Additionally, code paths that perform io submission on the "sharing CPUs"
will check the eq state and poll for completion after submission of new io
to a wq that uses the eq.

Similarly, when a cpu comes back online and owns an offlined vector, the eq
is taken out of polled mode and rearmed to start driving interrupts for eq.

Link: https://lore.kernel.org/r/20191105005708.7399-9-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-11-06 00:04:04 -05:00
Saurav Girepunje
c3e5aac3e2 scsi: lpfc: Fix NULL check before mempool_destroy is not needed
mempool_destroy has taken null pointer check into account. Remove the
redundant check.

Link: https://lore.kernel.org/r/20191026194712.GA22249@saurav
Signed-off-by: Saurav Girepunje <saurav.girepunje@gmail.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-28 21:52:38 -04:00
James Smart
5792a0e816 scsi: lpfc: fix spelling error in MAGIC_NUMER_xxx
convert MAGIC_NUMER_xxx to MAGIC_NUMBER_xxx

Link: https://lore.kernel.org/r/20191025184342.6623-1-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-28 21:49:22 -04:00
Linus Torvalds
1c4e395cf7 SCSI fixes on 20191025
Nine changes, eight to drivers (qla2xxx, hpsa, lpfc, alua, ch,
 53c710[x2], target) and one core change that tries to close a race
 between sysfs delete and module removal.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXbN1gSYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishWUzAP4tB9Z+
 X5zfnMLmeAtSCnVwIgFX3/GVSFfzEmi+3VxfBQEA3nfs5AAJCPsaTk9z+jLtAKPk
 6uYoHwsyTHal19Ojt9g=
 =IOPn
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Nine changes, eight to drivers (qla2xxx, hpsa, lpfc, alua, ch,
  53c710[x2], target) and one core change that tries to close a race
  between sysfs delete and module removal"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: lpfc: remove left-over BUILD_NVME defines
  scsi: core: try to get module before removing device
  scsi: hpsa: add missing hunks in reset-patch
  scsi: target: core: Do not overwrite CDB byte 1
  scsi: ch: Make it possible to open a ch device multiple times again
  scsi: fix kconfig dependency warning related to 53C700_LE_ON_BE
  scsi: sni_53c710: fix compilation error
  scsi: scsi_dh_alua: handle RTPG sense code correctly during state transitions
  scsi: qla2xxx: fix a potential NULL pointer dereference
2019-10-25 20:11:33 -04:00
James Smart
83c6cb1ae8 scsi: lpfc: Add FC-AL support to lpe32000 models
In the past, the lpe32000 models, based their main support being for 32G,
and as FC-AL is not supported in the FC standards past 8G, did not support
FC-AL operation.

This patch adds private-loop FC-AL support for the LPE32000 adapters
when a link is 8G or below. To avoid conditions where link rate may
change, which would cause non-connectivity to the AL device, FC-AL
mode must become a persistent setting and the link kept at a speed
supporting FC-AL.

The patch:

 - Adds a pls attribute indicating whether the adapter properly supports
   FC-AL.

 - Adds support for the adapter to indicate that topology should be fixed
   and the topology types to be configured.

 - Adds a pt attribute to report the persistent topology if present.

Link: https://lore.kernel.org/r/20191018211832.7917-15-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:02:06 -04:00
James Smart
e7d8595272 scsi: lpfc: Add FA-WWN Async Event reporting
Add decode support for adapter Async Events which report FA-WWN
configuration errors.

Link: https://lore.kernel.org/r/20191018211832.7917-14-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:02:06 -04:00
James Smart
8156d378c4 scsi: lpfc: Revise interrupt coalescing for missing scenarios
The existing "auto eq delay" mechanism was sometimes skipping over an EQ,
not ramping the coalescing down under light load fast enough, and in other
cases never kicked in as cpu sharing by multiple vectors didn't quite add
up right.

Tweak the interrupt mechanism such that:

 - Add a flag to the EQ to force checking for colaescing values when being
   serviced in the interrupt handler.  The flag will be set by any CQ bound
   to the EQ whenever the number of CQ elements process in a single scan
   meets or exceeds the hardware queue notify level. E.g. there's a
   significant number of completions happening.

 - In the heartbeat work item that checks coalescing:

   - Replace the structure that was counting the number of EQs that
     interrupted on a single cpu with a new structure that looks at the EQ
     to see whether EQ currently has a coalescing value (thus it should be
     re-evaluate) or was marked by the new flag indicating heavy
     completions.

   - When a cpu, which may be servicing multiple vectors, had at least 1 EQ
     that should be checked, a new coalescing delay is calculated based on
     the number of interrupts that occurred on the cpu.

   - The new coalescing value is then applied to the EQs that had
     interrupted on the cpu.

Link: https://lore.kernel.org/r/20191018211832.7917-11-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:02:05 -04:00
James Smart
ea85a20cd5 scsi: lpfc: Remove lock contention target write path
Lower IOps performance with write operations. Perf tool shows lock
contention in dma_pool_alloc and dma_pool_free related to the
txrdy_payload_pool.

The allocations are for dma buffers for XFER_RDY's, which actually are not
needed for the FCP_TRECEIVE command as the command contents are used by the
adapter to generate the IU.

Remove the allocations and the associated buffer pool.  Rather than leaving
NULLs in buffer pointer locations, set command and sgl to indicate skipped
SGLE indexes.

Link: https://lore.kernel.org/r/20191018211832.7917-10-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:02:05 -04:00
James Smart
0a5ce73197 scsi: lpfc: Fix reporting of read-only fw error errors
When the adapter FW is administratively set to RO mode, a FW update
triggered by the driver's sysfs attribute will fail. Currently, the
driver's logging mechanism does not properly parse the adapter return codes
and print a meaningful message.  This oversight prevents quick diagnosis in
the field.

Parse the adapter return codes for Write_Object and write an appropriate
message to the system console.

[mkp: typo]

Link: https://lore.kernel.org/r/20191018211832.7917-3-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:02:04 -04:00
James Smart
97a9ed3b3a scsi: lpfc: fix lpfc_nvmet_mrq to be bound by hdw queue count
Currently, lpfc_nvmet_mrq is always scaled back to the min(lpfc_nvmet_mrq,
lpfc_irq_chann). There's no reason to reduce it to the number of interrupt
vectors.  Rather, it should be scaled down based on the number of hardware
queues for the system (if lower than max of 16).

Change scaling to use hardware queue count rather than interrupt vector
count.

Link: https://lore.kernel.org/r/20191018211832.7917-2-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-24 21:02:04 -04:00
Hannes Reinecke
1052b41b25 scsi: lpfc: remove left-over BUILD_NVME defines
The BUILD_NVME define never got defined anywhere, causing NVMe commands to
be treated as SCSI commands when freeing the buffers.  This was causing a
stuck discovery and a horrible crash in lpfc_set_rrq_active() later on.

Link: https://lore.kernel.org/r/20191017150019.75769-1-hare@suse.de
Fixes: c00f62e6c5 ("scsi: lpfc: Merge per-protocol WQ/CQ pairs into single per-cpu pair")
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-10-17 22:01:27 -04:00
James Smart
d11ed16db6 scsi: lpfc: Update async event logging
This patch updates ACQE handling for:

 - an EEPROM failure error reported by the adapter.

 - ensures that all data for any ACQE, recognized or not, is logged.

 - Given that all data is now logged unconditionally, the default case
   (unrecognized) data can be reduced.

Link: https://lore.kernel.org/r/20190922035906.10977-18-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-30 22:07:11 -04:00
James Smart
0f154226d6 scsi: lpfc: Fix device recovery errors after PLOGI failures
When target-side fault injections are made, the driver isn't reconnecting
to the remote port. The driver is logging "2753" error messages which
state:

"PLOGI failure DID:1B2400 Status:x3/xf0240008"

The failures status is indicating a Illegal field error, which points to
the Temporary RPI field being used for the ELS. This error typically means
the driver used an RPI that was already registered (shouldn't be registered
if using it in this context).

Study has found that if the driver were in discovery attempts and
encountered an error, it wouldn't flag the temporary rpi in error.  Yet the
rpi was released for reallocation in these error paths and another ELS
could allocate the rpi. In the failure situation a retry was done on an ELS
that had encountered an error, and as the rpi wasn't marked in error, the
ELS reused the rpi it originally allocated. But that rpi had been allocated
by a different ELS issued after the original error and before the retry
attempt. The different ELS had succeeded and the RPI was registered.

Fix by marking the rpi state for the node to be in error, aka as needing
reallocation, upon an error in the els processing.  Error state marking is
always done prior to release back to the internal rpi free list, which the
driver wasn't doing in cases prior.

Also enhanced some of the logging to help in the next case of problem
troubleshooting.

Link: https://lore.kernel.org/r/20190922035906.10977-7-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-30 22:07:09 -04:00
James Smart
a5f7337f5a scsi: lpfc: Fix NVME io abort failures causing hangs
The nvme-fc transport may call to abort an io on controller reset. If the
driver is out of resources to issue an abort command, it just gives up and
does nothing. The transport expects the lldd to always be able to terminate
an io it has issued.  At that point, the controller hangs waiting for
aborted ios to be returned.  Note: flaged by "6136" and "6176" error
messages.

Root issue was the adapter mis-allocated the number resources it allocated
for command entries for the adapter.

Convert the driver to allocate command resources based on the number of
xris supported by the FC port - 1 resource for the original command and 1
resource for the abort request.

Link: https://lore.kernel.org/r/20190922035906.10977-5-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-09-30 22:07:09 -04:00
Linus Torvalds
10fd71780f SCSI misc on 20190919
This is mostly update of the usual drivers: qla2xxx, ufs, smartpqi,
 lpfc, hisi_sas, qedf, mpt3sas; plus a whole load of minor updates.
 The only core change this time around is the addition of request
 batching for virtio.  Since batching requires an additional flag to
 use, it should be invisible to the rest of the drivers.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXYQE/yYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishXs9AP4usPY5
 OpMlF6OiKFNeJrCdhCScVghf9uHbc7UA6cP+EgD/bCtRgcDe1ZjOTYWdeTwvwWqA
 ltWYonnv6Lg3b1f9yqI=
 =jRC/
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly update of the usual drivers: qla2xxx, ufs, smartpqi,
  lpfc, hisi_sas, qedf, mpt3sas; plus a whole load of minor updates. The
  only core change this time around is the addition of request batching
  for virtio. Since batching requires an additional flag to use, it
  should be invisible to the rest of the drivers"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (264 commits)
  scsi: hisi_sas: Fix the conflict between device gone and host reset
  scsi: hisi_sas: Add BIST support for phy loopback
  scsi: hisi_sas: Add hisi_sas_debugfs_alloc() to centralise allocation
  scsi: hisi_sas: Remove some unused function arguments
  scsi: hisi_sas: Remove redundant work declaration
  scsi: hisi_sas: Remove hisi_sas_hw.slot_complete
  scsi: hisi_sas: Assign NCQ tag for all NCQ commands
  scsi: hisi_sas: Update all the registers after suspend and resume
  scsi: hisi_sas: Retry 3 times TMF IO for SAS disks when init device
  scsi: hisi_sas: Remove sleep after issue phy reset if sas_smp_phy_control() fails
  scsi: hisi_sas: Directly return when running I_T_nexus reset if phy disabled
  scsi: hisi_sas: Use true/false as input parameter of sas_phy_reset()
  scsi: hisi_sas: add debugfs auto-trigger for internal abort time out
  scsi: virtio_scsi: unplug LUNs when events missed
  scsi: scsi_dh_rdac: zero cdb in send_mode_select()
  scsi: fcoe: fix null-ptr-deref Read in fc_release_transport
  scsi: ufs-hisi: use devm_platform_ioremap_resource() to simplify code
  scsi: ufshcd: use devm_platform_ioremap_resource() to simplify code
  scsi: hisi_sas: use devm_platform_ioremap_resource() to simplify code
  scsi: ufs: Use kmemdup in ufshcd_read_string_desc()
  ...
2019-09-21 10:50:15 -07:00
James Smart
9db6c14c36 scsi: lpfc: Remove bg debugfs buffers
Capturing and downloading dif command data and dif data was done a dozen
years ago and no longer being used. Also creates a potential security hole.

Remove the debugfs buffer for dif debugging.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
CC: KyleMahlkuch <kmahlkuc@linux.vnet.ibm.com>
CC: Hannes Reinecke <hare@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-29 18:08:58 -04:00
James Smart
7f9989bace scsi: lpfc: Resolve checker warning for lpfc_new_io_buf()
Per Dan Carpenter:

The patch d79c9e9d4b: "scsi: lpfc: Support dynamic unbounded SGL lists on
G7 hardware." from Aug 14, 2019, leads to the following static checker
warning:

   drivers/scsi/lpfc/lpfc_init.c:4107 lpfc_new_io_buf()
  error: not allocating enough data 784 vs 768

There was no need to compare sizes nor to allocate size based on a define.

Change allocation to use actual structure length

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
CC: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-29 18:07:59 -04:00
James Smart
c00f62e6c5 scsi: lpfc: Merge per-protocol WQ/CQ pairs into single per-cpu pair
Currently, each hardware queue, typically allocated per-cpu, consists of a
WQ/CQ pair per protocol. Meaning if both SCSI and NVMe are supported 2
WQ/CQ pairs will exist for the hardware queue. Separate queues are
unnecessary. The current implementation wastes memory backing the 2nd set
of queues, and the use of double the SLI-4 WQ/CQ's means less hardware
queues can be supported which means there may not always be enough to have
a pair per cpu. If there is only 1 pair per cpu, more cpu's may get their
own WQ/CQ.

Rework the implementation to use a single WQ/CQ pair by both protocols.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:12 -04:00
James Smart
0d8af09643 scsi: lpfc: Add NVMe sequence level error recovery support
FC-NVMe-2 added support for sequence level error recovery in the FC-NVME
protocol. This allows for the detection of errors and lost frames and
immediate retransmission of data to avoid exchange termination, which
escalates into NVMeoFC connection and association failures. A significant
RAS improvement.

The driver is modified to indicate support for SLER in the NVMe PRLI is
issues and to check for support in the PRLI response.  When both sides
support it, the driver will set a bit in the WQE to enable the recovery
behavior on the exchange. The adapter will take care of all detection and
retransmission.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:12 -04:00
James Smart
d79c9e9d4b scsi: lpfc: Support dynamic unbounded SGL lists on G7 hardware.
Typical SLI-4 hardware supports up to 2 4KB pages to be registered per XRI
to contain the exchanges Scatter/Gather List. This caps the number of SGL
elements that can be in the SGL. There are not extensions to extend the
list out of the 2 pages.

The G7 hardware adds a SGE type that allows the SGL to be vectored to a
different scatter/gather list segment. And that segment can contain a SGE
to go to another segment and so on.  The initial segment must still be
pre-registered for the XRI, but it can be a much smaller amount (256Bytes)
as it can now be dynamically grown.  This much smaller allocation can
handle the SG list for most normal I/O, and the dynamic aspect allows it to
support many MB's if needed.

The implementation creates a pool which contains "segments" and which is
initially sized to hold the initial small segment per xri. If an I/O
requires additional segments, they are allocated from the pool.  If the
pool has no more segments, the pool is grown based on what is now
needed. After the I/O completes, the additional segments are returned to
the pool for use by other I/Os. Once allocated, the additional segments are
not released under the assumption of "if needed once, it will be needed
again". Pools are kept on a per-hardware queue basis, which is typically
1:1 per cpu, but may be shared by multiple cpus.

The switch to the smaller initial allocation significantly reduces the
memory footprint of the driver (which only grows if large ios are
issued). Based on the several K of XRIs for the adapter, the 8KB->256B
reduction can conserve 32MBs or more.

It has been observed with per-cpu resource pools that allocating a resource
on CPU A, may be put back on CPU B. While the get routines are distributed
evenly, only a limited subset of CPUs may be handling the put routines.
This can put a strain on the lpfc_put_cmd_rsp_buf_per_cpu routine because
all the resources are being put on a limited subset of CPUs.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:12 -04:00
James Smart
3235066449 scsi: lpfc: Migrate to %px and %pf in kernel print calls
In order to see real addresses, convert %p with %px for kernel addresses
and replace %p with %pf for functions.

While converting, standardize on "x%px" throughout (not %px or 0x%px).

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:11 -04:00
James Smart
07b1b91412 scsi: lpfc: Fix sli4 adapter initialization with MSI
When forcing the use of MSI (vs MSI-X) the driver is crashing in
pci_irq_get_affinity.

The driver was not using the new pci_alloc_irq_vectors interface in the MSI
path.

Fix by using pci_alloc_irq_vectors() with PCI_RQ_MSI in the MSI path.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:11 -04:00
James Smart
6a224b47fd scsi: lpfc: Fix nvme sg_seg_cnt display if HBA does not support NVME
The driver is currently reporting a non-zero nvme sg_seg_cnt value of 256
when nvme is disabled. It should be zero.

Fix by ensuring the value is cleared.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:10 -04:00
James Smart
84f2ddf8cf scsi: lpfc: Fix hang when downloading fw on port enabled for nvme
As part of firmware download, the adapter is reset. On the adapter the
reset causes the function to stop and all outstanding io is terminated
(without responses). The reset path then starts teardown of the adapter,
starting with deregistration of the remote ports with the nvme-fc
transport. The local port is then deregistered and the driver waits for
local port deregistration. This never finishes.

The remote port deregistrations terminated the nvme controllers, causing
them to send aborts for all the outstanding io. The aborts were serviced in
the driver, but stalled due to its state. The nvme layer then stops to
reclaim it's outstanding io before continuing.  The io must be returned
before the reset on the controller is deemed complete and the controller
delete performed.  The remote port deregistration won't complete until all
the controllers are terminated. And the local port deregistration won't
complete until all controllers and remote ports are terminated. Thus things
hang.

The issue is the reset which stopped the adapter also stopped all the
responses that would drive i/o completions, and the aborts were also
stopped that stopped i/o completions. The driver, when resetting the
adapter like this, needs to be generating the completions as part of the
adapter reset so that I/O complete (in error), and any aborts are not
queued.

Fix by adding flush routines whenever the adapter port has been reset or
discovered in error. The flush routines will generate the completions for
the scsi and nvme outstanding io. The abort ios, if waiting, will be caught
and flushed as well.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:10 -04:00
James Smart
8c24a4f643 scsi: lpfc: Fix crash due to port reset racing vs adapter error handling
If the adapter encounters a condition which causes the adapter to fail
(driver must detect the failure) simultaneously to a request to the driver
to reset the adapter (such as a host_reset), the reset path will be racing
with the asynchronously-detect adapter failure path.  In the failing
situation, one path has started to tear down the adapter data structures
(io_wq's) while the other path has initiated a repeat of the teardown and
is in the lpfc_sli_flush_xxx_rings path and attempting to access the
just-freed data structures.

Fix by the following:

 - In cases where an adapter failure is detected, rather than explicitly
   calling offline_eratt() to start the teardown, change the adapter state
   and let the later calls of posted work to the slowpath thread invoke the
   adapter recovery.  In essence, this means all requests to reset are
   serialized on the slowpath thread.

 - Clean up the routine that restarts the adapter. If there is a failure
   from brdreset, don't immediately error and leave things in a partial
   state. Instead, ensure the adapter state is set and finish the teardown
   of structures before returning.

 - If in the scsi host reset handler and the board fails to reset and
   restart (which can be due to parallel reset/recovery paths), instead of
   hard failing and explicitly calling offline_eratt() (which gets into the
   redundant path), just fail out and let the asynchronous path resolve the
   adapter state.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:10 -04:00
James Smart
c26c265b16 scsi: lpfc: Fix sg_seg_cnt for HBAs that don't support NVME
On an SLI-3 adapter which does not support NVMe, but with the driver global
attribute to enable nvme on any adapter if it does support NVMe
(e.g. module parameter lpfc_enable_fc4_type=3), the SGL and total SGE
values are being munged by the protocol enablement when it shouldn't be.

Correct by changing the location of where the NVME sgl information is being
applied, which will avoid any SLI-3-based adapter.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:10 -04:00
James Smart
3ad348d944 scsi: lpfc: Fix oops when fewer hdwqs than cpus
When tearing down the adapter for a reset, online/offline, or driver
unload, the queue free routine would hit a GPF oops.  This only occurs on
conditions where the number of hardware queues created is fewer than the
number of cpus in the system. In this condition cpus share a hardware
queue. And of course, it's the 2nd cpu that shares a hardware that
attempted to free it a second time and hit the oops.

Fix by reworking the cpu to hardware queue mapping such that:
Assignment of hardware queues to cpus occur in two passes:
first pass: is first time assignment of a hardware queue to a cpu.
  This will set the LPFC_CPU_FIRST_IRQ flag for the cpu.
second pass: for cpus that did not get a hardware queue they will
  be assigned one from a primary cpu (one set in first pass).

Deletion of hardware queues is driven by cpu itteration, and queues
will only be deleted if the LPFC_CPU_FIRST_IRQ flag is set.

Also contains a few small cleanup fixes and a little better logging.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:09 -04:00
James Smart
8d34a59cae scsi: lpfc: Fix failure to clear non-zero eq_delay after io rate reduction
Unusually high IO latency can be observed with little IO in progress. The
latency may remain high regardless of amount of IO and can only be cleared
by forcing lpfc_fcp_imax values to non-zero and then back to zero.

The driver's eq_delay mechanism that scales the interrupt coalescing based
on io completion load failed to reduce or turn off coalescing when load
decreased. Specifically, if no io completed on a cpu within an eq_delay
polling window, the eq delay processing was skipped and no change was made
to the coalescing values. This left the coalescing values set when they
were no longer applicable.

Fix by always clearing the percpu counters for each time period and always
run the eq_delay calculations if an eq has a non-zero coalescing value.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:08 -04:00
James Smart
3cee98db26 scsi: lpfc: Fix crash on driver unload in wq free
If a timer routine uses workqueues, it could fire before the workqueue is
allocated.

Fix by allocating the workqueue before the timer routines are setup

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:08 -04:00
James Smart
31f06d2e73 scsi: lpfc: Limit xri count for kdump environment
scsi-mq operation inherently performs pre-allocation of resources for
blk-mq request queues. Even though the kdump environment reduces the
configuration to a single CPU, thus 1 hardware queue, which helps
significantly, the resources are still rather large due to the per request
allocations. blk-mq pre-allocations can be over 4KB per request.  With
adapter can_queue values in the 4k or 8k range, this can easily be 32MBs
before any other driver memory is factored in.  Driver SGL DMA buffer
allocation can be up to 8KB per request as well adding an additional
64MB. Totals are well over 100MB for a single shost.  Given kdump memory
auto-sizing utilities don't accommodate this amount of memory well, it's
very possible for kdump to fail due to lack of memory.

Fix by having the driver recognize that it is booting within a kdump
context and reduce the number of requests it will support to a more
reasonable value.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:41:08 -04:00
James Smart
77ffd3465b scsi: lpfc: Mitigate high memory pre-allocation by SCSI-MQ
When SCSI-MQ is enabled, the SCSI-MQ layers will do pre-allocation of MQ
resources based on shost values set by the driver. In newer cases of the
driver, which attempts to set nr_hw_queues to the cpu count, the
multipliers become excessive, with a single shost having SCSI-MQ
pre-allocation reaching into the multiple GBytes range.  NPIV, which
creates additional shosts, only multiply this overhead. On lower-memory
systems, this can exhaust system memory very quickly, resulting in a system
crash or failures in the driver or elsewhere due to low memory conditions.

After testing several scenarios, the situation can be mitigated by limiting
the value set in shost->nr_hw_queues to 4. Although the shost values were
changed, the driver still had per-cpu hardware queues of its own that
allowed parallelization per-cpu.  Testing revealed that even with the
smallish number for nr_hw_queues for SCSI-MQ, performance levels remained
near maximum with the within-driver affiinitization.

A module parameter was created to allow the value set for the nr_hw_queues
to be tunable.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-19 22:14:10 -04:00
James Smart
a86c71ba30 scsi: lpfc: Fix crash when cpu count is 1 and null irq affinity mask
When a configurations runs with a single cpu (such as a kdump kernel),
which causes the driver to request a single vector, when the driver
subsequently requests an irq affinity mask, the mask comes back null.  The
driver currently does nothing in this scenario, which leaves mappings to
hardware queues incomplete and crashes the system.

Fix by recognizing the null mask and assigning the vector to the first cpu
in the system.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 21:56:41 -04:00
Linus Torvalds
ef8f3d48af Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:
 "Am experimenting with splitting MM up into identifiable subsystems
  perhaps with a view to gitifying it in complex ways. Also with more
  verbose "incoming" emails.

  Most of MM is here and a few other trees.

  Subsystems affected by this patch series:
   - hotfixes
   - iommu
   - scripts
   - arch/sh
   - ocfs2
   - mm:slab-generic
   - mm:slub
   - mm:kmemleak
   - mm:kasan
   - mm:cleanups
   - mm:debug
   - mm:pagecache
   - mm:swap
   - mm:memcg
   - mm:gup
   - mm:pagemap
   - mm:infrastructure
   - mm:vmalloc
   - mm:initialization
   - mm:pagealloc
   - mm:vmscan
   - mm:tools
   - mm:proc
   - mm:ras
   - mm:oom-kill

  hotfixes:
      mm: vmscan: scan anonymous pages on file refaults
      mm/nvdimm: add is_ioremap_addr and use that to check ioremap address
      mm/memcontrol: fix wrong statistics in memory.stat
      mm/z3fold.c: lock z3fold page before  __SetPageMovable()
      nilfs2: do not use unexported cpu_to_le32()/le32_to_cpu() in uapi header
      MAINTAINERS: nilfs2: update email address

  iommu:
      include/linux/dmar.h: replace single-char identifiers in macros

  scripts:
      scripts/decode_stacktrace: match basepath using shell prefix operator, not regex
      scripts/decode_stacktrace: look for modules with .ko.debug extension
      scripts/spelling.txt: drop "sepc" from the misspelling list
      scripts/spelling.txt: add spelling fix for prohibited
      scripts/decode_stacktrace: Accept dash/underscore in modules
      scripts/spelling.txt: add more spellings to spelling.txt

  arch/sh:
      arch/sh/configs/sdk7786_defconfig: remove CONFIG_LOGFS
      sh: config: remove left-over BACKLIGHT_LCD_SUPPORT
      sh: prevent warnings when using iounmap

  ocfs2:
      fs: ocfs: fix spelling mistake "hearbeating" -> "heartbeat"
      ocfs2/dlm: use struct_size() helper
      ocfs2: add last unlock times in locking_state
      ocfs2: add locking filter debugfs file
      ocfs2: add first lock wait time in locking_state
      ocfs: no need to check return value of debugfs_create functions
      fs/ocfs2/dlmglue.c: unneeded variable: "status"
      ocfs2: use kmemdup rather than duplicating its implementation

  mm:slab-generic:
    Patch series "mm/slab: Improved sanity checking":
      mm/slab: validate cache membership under freelist hardening
      mm/slab: sanity-check page type when looking up cache
      lkdtm/heap: add tests for freelist hardening

  mm:slub:
      mm/slub.c: avoid double string traverse in kmem_cache_flags()
      slub: don't panic for memcg kmem cache creation failure

  mm:kmemleak:
      mm/kmemleak.c: fix check for softirq context
      mm/kmemleak.c: change error at _write when kmemleak is disabled
      docs: kmemleak: add more documentation details

  mm:kasan:
      mm/kasan: print frame description for stack bugs
      Patch series "Bitops instrumentation for KASAN", v5:
        lib/test_kasan: add bitops tests
        x86: use static_cpu_has in uaccess region to avoid instrumentation
        asm-generic, x86: add bitops instrumentation for KASAN
      Patch series "mm/kasan: Add object validation in ksize()", v3:
        mm/kasan: introduce __kasan_check_{read,write}
        mm/kasan: change kasan_check_{read,write} to return boolean
        lib/test_kasan: Add test for double-kzfree detection
        mm/slab: refactor common ksize KASAN logic into slab_common.c
        mm/kasan: add object validation in ksize()

  mm:cleanups:
      include/linux/pfn_t.h: remove pfn_t_to_virt()
      Patch series "remove ARCH_SELECT_MEMORY_MODEL where it has no effect":
        arm: remove ARCH_SELECT_MEMORY_MODEL
        s390: remove ARCH_SELECT_MEMORY_MODEL
        sparc: remove ARCH_SELECT_MEMORY_MODEL
      mm/gup.c: make follow_page_mask() static
      mm/memory.c: trivial clean up in insert_page()
      mm: make !CONFIG_HUGE_PAGE wrappers into static inlines
      include/linux/mm_types.h: ifdef struct vm_area_struct::swap_readahead_info
      mm: remove the account_page_dirtied export
      mm/page_isolation.c: change the prototype of undo_isolate_page_range()
      include/linux/vmpressure.h: use spinlock_t instead of struct spinlock
      mm: remove the exporting of totalram_pages
      include/linux/pagemap.h: document trylock_page() return value

  mm:debug:
      mm/failslab.c: by default, do not fail allocations with direct reclaim only
      Patch series "debug_pagealloc improvements":
        mm, debug_pagelloc: use static keys to enable debugging
        mm, page_alloc: more extensive free page checking with debug_pagealloc
        mm, debug_pagealloc: use a page type instead of page_ext flag

  mm:pagecache:
      Patch series "fix filler_t callback type mismatches", v2:
        mm/filemap.c: fix an overly long line in read_cache_page
        mm/filemap: don't cast ->readpage to filler_t for do_read_cache_page
        jffs2: pass the correct prototype to read_cache_page
        9p: pass the correct prototype to read_cache_page
      mm/filemap.c: correct the comment about VM_FAULT_RETRY

  mm:swap:
      mm, swap: fix race between swapoff and some swap operations
      mm/swap_state.c: simplify total_swapcache_pages() with get_swap_device()
      mm, swap: use rbtree for swap_extent
      mm/mincore.c: fix race between swapoff and mincore

  mm:memcg:
      memcg, oom: no oom-kill for __GFP_RETRY_MAYFAIL
      memcg, fsnotify: no oom-kill for remote memcg charging
      mm, memcg: introduce memory.events.local
      mm: memcontrol: dump memory.stat during cgroup OOM
      Patch series "mm: reparent slab memory on cgroup removal", v7:
        mm: memcg/slab: postpone kmem_cache memcg pointer initialization to memcg_link_cache()
        mm: memcg/slab: rename slab delayed deactivation functions and fields
        mm: memcg/slab: generalize postponed non-root kmem_cache deactivation
        mm: memcg/slab: introduce __memcg_kmem_uncharge_memcg()
        mm: memcg/slab: unify SLAB and SLUB page accounting
        mm: memcg/slab: don't check the dying flag on kmem_cache creation
        mm: memcg/slab: synchronize access to kmem_cache dying flag using a spinlock
        mm: memcg/slab: rework non-root kmem_cache lifecycle management
        mm: memcg/slab: stop setting page->mem_cgroup pointer for slab pages
        mm: memcg/slab: reparent memcg kmem_caches on cgroup removal
      mm, memcg: add a memcg_slabinfo debugfs file

  mm:gup:
      Patch series "switch the remaining architectures to use generic GUP", v4:
        mm: use untagged_addr() for get_user_pages_fast addresses
        mm: simplify gup_fast_permitted
        mm: lift the x86_32 PAE version of gup_get_pte to common code
        MIPS: use the generic get_user_pages_fast code
        sh: add the missing pud_page definition
        sh: use the generic get_user_pages_fast code
        sparc64: add the missing pgd_page definition
        sparc64: define untagged_addr()
        sparc64: use the generic get_user_pages_fast code
        mm: rename CONFIG_HAVE_GENERIC_GUP to CONFIG_HAVE_FAST_GUP
        mm: reorder code blocks in gup.c
        mm: consolidate the get_user_pages* implementations
        mm: validate get_user_pages_fast flags
        mm: move the powerpc hugepd code to mm/gup.c
        mm: switch gup_hugepte to use try_get_compound_head
        mm: mark the page referenced in gup_hugepte
      mm/gup: speed up check_and_migrate_cma_pages() on huge page
      mm/gup.c: remove some BUG_ONs from get_gate_page()
      mm/gup.c: mark undo_dev_pagemap as __maybe_unused

  mm:pagemap:
      asm-generic, x86: introduce generic pte_{alloc,free}_one[_kernel]
      alpha: switch to generic version of pte allocation
      arm: switch to generic version of pte allocation
      arm64: switch to generic version of pte allocation
      csky: switch to generic version of pte allocation
      m68k: sun3: switch to generic version of pte allocation
      mips: switch to generic version of pte allocation
      nds32: switch to generic version of pte allocation
      nios2: switch to generic version of pte allocation
      parisc: switch to generic version of pte allocation
      riscv: switch to generic version of pte allocation
      um: switch to generic version of pte allocation
      unicore32: switch to generic version of pte allocation
      mm/pgtable: drop pgtable_t variable from pte_fn_t functions
      mm/memory.c: fail when offset == num in first check of __vm_map_pages()

  mm:infrastructure:
      mm/mmu_notifier: use hlist_add_head_rcu()

  mm:vmalloc:
      Patch series "Some cleanups for the KVA/vmalloc", v5:
        mm/vmalloc.c: remove "node" argument
        mm/vmalloc.c: preload a CPU with one object for split purpose
        mm/vmalloc.c: get rid of one single unlink_va() when merge
        mm/vmalloc.c: switch to WARN_ON() and move it under unlink_va()
      mm/vmalloc.c: spelling> s/informaion/information/

  mm:initialization:
      mm/large system hash: use vmalloc for size > MAX_ORDER when !hashdist
      mm/large system hash: clear hashdist when only one node with memory is booted

  mm:pagealloc:
      arm64: move jump_label_init() before parse_early_param()
      Patch series "add init_on_alloc/init_on_free boot options", v10:
        mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options
        mm: init: report memory auto-initialization features at boot time

  mm:vmscan:
      mm: vmscan: remove double slab pressure by inc'ing sc->nr_scanned
      mm: vmscan: correct some vmscan counters for THP swapout

  mm:tools:
      tools/vm/slabinfo: order command line options
      tools/vm/slabinfo: add partial slab listing to -X
      tools/vm/slabinfo: add option to sort by partial slabs
      tools/vm/slabinfo: add sorting info to help menu

  mm:proc:
      proc: use down_read_killable mmap_sem for /proc/pid/maps
      proc: use down_read_killable mmap_sem for /proc/pid/smaps_rollup
      proc: use down_read_killable mmap_sem for /proc/pid/pagemap
      proc: use down_read_killable mmap_sem for /proc/pid/clear_refs
      proc: use down_read_killable mmap_sem for /proc/pid/map_files
      mm: use down_read_killable for locking mmap_sem in access_remote_vm
      mm: smaps: split PSS into components
      mm: vmalloc: show number of vmalloc pages in /proc/meminfo

  mm:ras:
      mm/memory-failure.c: clarify error message

  mm:oom-kill:
      mm: memcontrol: use CSS_TASK_ITER_PROCS at mem_cgroup_scan_tasks()
      mm, oom: refactor dump_tasks for memcg OOMs
      mm, oom: remove redundant task_in_mem_cgroup() check
      oom: decouple mems_allowed from oom_unkillable_task
      mm/oom_kill.c: remove redundant OOM score normalization in select_bad_process()"

* akpm: (147 commits)
  mm/oom_kill.c: remove redundant OOM score normalization in select_bad_process()
  oom: decouple mems_allowed from oom_unkillable_task
  mm, oom: remove redundant task_in_mem_cgroup() check
  mm, oom: refactor dump_tasks for memcg OOMs
  mm: memcontrol: use CSS_TASK_ITER_PROCS at mem_cgroup_scan_tasks()
  mm/memory-failure.c: clarify error message
  mm: vmalloc: show number of vmalloc pages in /proc/meminfo
  mm: smaps: split PSS into components
  mm: use down_read_killable for locking mmap_sem in access_remote_vm
  proc: use down_read_killable mmap_sem for /proc/pid/map_files
  proc: use down_read_killable mmap_sem for /proc/pid/clear_refs
  proc: use down_read_killable mmap_sem for /proc/pid/pagemap
  proc: use down_read_killable mmap_sem for /proc/pid/smaps_rollup
  proc: use down_read_killable mmap_sem for /proc/pid/maps
  tools/vm/slabinfo: add sorting info to help menu
  tools/vm/slabinfo: add option to sort by partial slabs
  tools/vm/slabinfo: add partial slab listing to -X
  tools/vm/slabinfo: order command line options
  mm: vmscan: correct some vmscan counters for THP swapout
  mm: vmscan: remove double slab pressure by inc'ing sc->nr_scanned
  ...
2019-07-12 11:40:28 -07:00
Paul Walmsley
cc0e5f1ce0 scripts/spelling.txt: drop "sepc" from the misspelling list
The RISC-V architecture has a register named the "Supervisor Exception
Program Counter", or "sepc".  This abbreviation triggers checkpatch.pl's
misspelling detector, resulting in noise in the checkpatch output.  The
risk that this noise could cause more useful warnings to be missed seems
to outweigh the harm of an occasional misspelling of "spec".  Thus drop
the "sepc" entry from the misspelling list.

[akpm@linux-foundation.org: fix existing "sepc" instances, per Joe]
Link: http://lkml.kernel.org/r/20190518210037.13674-1-paul.walmsley@sifive.com
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-12 11:05:41 -07:00
YueHaibing
d7b761b069 scsi: lpfc: Make some symbols static
Fix sparse warnings:

drivers/scsi/lpfc/lpfc_sli.c:115:1: warning: symbol 'lpfc_sli4_pcimem_bcopy' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_sli.c:7854:1: warning: symbol 'lpfc_sli4_process_missed_mbox_completions' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_nvmet.c:223:27: warning: symbol 'lpfc_nvmet_get_ctx_for_xri' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_nvmet.c:245:27: warning: symbol 'lpfc_nvmet_get_ctx_for_oxid' was not declared. Should it be static?
drivers/scsi/lpfc/lpfc_init.c:75:10: warning: symbol 'lpfc_present_cpu' was not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:24 -04:00
YueHaibing
a82b3539dc scsi: lpfc: Remove set but not used variables 'qp'
Fixes gcc '-Wunused-but-set-variable' warnings:

drivers/scsi/lpfc/lpfc_init.c: In function lpfc_setup_cq_lookup:
drivers/scsi/lpfc/lpfc_init.c:9359:30: warning: variable qp set but not used [-Wunused-but-set-variable]

It's not used since commit e70596a60f88 ("scsi: lpfc: Fix poor use of
hardware queues if fewer irq vectors")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:24 -04:00
Thomas Meyer
a5c990eea5 scsi: lpfc: Use *_pool_zalloc rather than *_pool_alloc
Use *_pool_zalloc rather than *_pool_alloc followed by memset with 0.

Signed-off-by: Thomas Meyer <thomas@m3y3r.de>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:23 -04:00
James Smart
aa6ff30918 scsi: lpfc: Fix BFS crash with DIX enabled
Crashes in scsi_queue_rq or in dma_unmap_direct_sg during BFS when lpfc has
lpfc_enable_bg=1.

lpfc is setting DIX and prot sg after scsi_add_host_with_dma() has been
called. The scsi_host_set_prot() and scsi_host_set_guard() routines need to
be called before scsi_add_host_with_dma().

Revise the calling sequence to set the protection/guard data before calling
scsi_add_host_with_dma().

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:22 -04:00
James Smart
657add4e5e scsi: lpfc: Fix poor use of hardware queues if fewer irq vectors
While fixing the resources per socket, realized the driver was not using
hardware queues (up to 1 per cpu) if there were fewer interrupt
vectors. The driver was only using the hardware queue assigned to the cpu
with the vector.

Rework the affinity map check to use the additional hardware queue elements
that had been allocated.  If the cpu count exceeds the hardware queue count
- share, but choose what is shared with by: hyperthread peer, core peer,
socket peer, or finally similar cpu in a different socket.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:22 -04:00
James Smart
d9954a2d18 scsi: lpfc: Fix oops when driver is loaded with 1 interrupt vector
The driver was coded expecting enough hardware queues and interrupt vectors
such that at least there was one per socket. In the case where there were
fewer than sockets, cpus were left unassigned thus null pointers.

Rework the affinity mappings. Map settings for the cpu's that are in the
irq cpu mask. For each cpu not in the mask, map to another cpu that does
have a mask. Choice of the "other" cpu will attempt to map to the same cpu
but differing hyperthread, or cpu within in same core, or cpu within same
socket, or finally cpu in the base socket.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:22 -04:00
James Smart
b8e6f13617 scsi: lpfc: Fix incorrect logical link speed on trunks when links down
Invalid logical speed is displayed for trunk enabled ports when all ports
are down. Also noted that link speed is incorrectly reported for the units
when links are up.

Current code is returning the logical link speed from the last event from
the adapter. In cases where the last link went down, the link speed in the
event was not valid - meaning that although the links where down the field
had a bogus value.

Rework the event handling to qualify the trunk link state before using the
event speed data.

Also correct units on other areas where the logical link speed was taken
from a link event.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:21 -04:00
James Smart
c15e07047e scsi: lpfc: Rework misleading nvme not supported in firmware message
The driver unconditionally says fw doesn't support nvme when in
truth it was a driver parameter settings that disabled nvme support.

Rework the code validating nvme support to accurately report what
condition is disabling nvme support. Save state on whether nvme
fw supports nvme in case sysfs attributes change dynamically.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:21 -04:00
James Smart
79d8c4ce01 scsi: lpfc: Fix nvmet handling of received ABTS for unmapped frames
The driver currently is relying on firmware to match ABTSs to existing
exchanges. This works fine as long as an exchange has been assigned to the
io and work posted to it. However, for unmapped frames (rxid=0xFFFF), the
driver has yet to assign an xri. The driver was blindly saying it couldn't
match the ABTS and sending the BA_xxx. However, the command frame may have
been in queues waiting on xri's before posting to the nvmet_fc layer.  When
xri's became available, the command frame would still be pushed to the
transport and that io would execute, even though the io had been killed by
ABTS. The initiator, seeing the io ABTS'd, would reuse the exchange for a
different io which would be received on the target and pushed up. If the
"zombie" io then came back down and started transmitting, the initiator
would match the oxid and accept erroneous data. Bad things happened.

Add tracking of active exchanges in the target to allow matching of a
received ABTS against active or pending IO requests. If the ABTS is matched
to a pending or active IO, the drive initiates cleanup and conditionally
notifies the transport.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:21 -04:00
YueHaibing
c709297525 scsi: lpfc: Make lpfc_sli4_oas_verify static
Fix sparse warning:

drivers/scsi/lpfc/lpfc_init.c:13091:1: warning:
 symbol 'lpfc_sli4_oas_verify' was not declared. Should it be static?

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-18 20:35:58 -04:00
Bart Van Assche
3999df75bc scsi: lpfc: Declare local functions static
This patch avoids that the compiler complains about missing declarations
when building with W=1.

Cc: James Smart <james.smart@broadcom.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-03 23:11:35 -04:00
James Smart
c835c0854c scsi: lpfc: Fix duplicate log message numbers
Driver had duplicated log message numbers making debug difficult.

Make all messages unique.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-19 13:15:10 -04:00
James Smart
c1a21ebc0f scsi: lpfc: Specify node affinity for queue memory allocation
Change the SLI4 queue creation code to use NUMA node based memory
allocation based on the cpu the queues will be related to.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-19 13:15:09 -04:00
James Smart
c66a919746 scsi: lpfc: Fix io lost on host resets
If the driver undergoes repeated host resets it starts losing exchange
structures and eventually returns SCSI_MLQUEUE_HOST_BUSY and does not
recover. The offline path is not reclaiming the outstanding ios on the fcp
pring txcmplq before calling lpfc_destroy_multixripool, which causes the
txmcplq to be reinit and the resources lost.

Flush the fcp rings before destroying the multixripools.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-19 13:15:09 -04:00
James Smart
4645f7b56a scsi: lpfc: Coordinate adapter error handling with offline handling
The driver periodically checks for adapter error in a background thread. If
the thread detects an error, the adapter will be reset including the
deletion and reallocation of workqueues on the adapter.  Simultaneously,
there may be a user-space request to offline the adapter which may try to
do many of the same steps, in parallel, on a different thread. As memory
was deallocated while unexpected, the parallel offline request hit a bad
pointer.

Add coordination between the two threads.  The error recovery thread has
precedence. So, when an error is detected, a flag is set on the adapter to
indicate the error thread is terminating the adapter. But, before doing
that work, it will look for a flag that is set by the offline flow, and if
set, will wait for it to complete before then processing the error handling
path.  Similarly, in the offline thread, it first checks for whether the
error thread is resetting the adapter, and if so, will then wait for the
error thread to finish. Only after it has finished, will it set its flag
and offline the adapter.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-19 12:57:02 -04:00
James Smart
32a9310076 scsi: lpfc: Stop adapter if pci errors detected
In a couple of cases, the driver detected a pci error (via pci device state
or via failed register reads) but didn't take any action to disable the
device.  Additionally, the driver is ignoring the status of pci
configuration space reads.

Having the driver take the adapter offline whenever the pci error is
detected.  Pay attention to pci_config_space_read status and return failure
if an error is seen.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-19 12:57:02 -04:00
James Smart
731eedcb31 scsi: lpfc: Fix deadlock due to nested hbalock call
If an adapter fails, causing a board reset, the board reset routine
lpfc_hba_down_s4() takes the hbalock out then calls
lpfc_nvmet_ctxbuf_post() who then tries to take out the same lock.  As the
context lists are now protected under the buf_list_locks, there is no need
for the hbalock to be held by the board reset routine.

Fix by no longer taking the hbalock in the board reset routine.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-19 12:57:02 -04:00
James Smart
982ab128dc scsi: lpfc: Fix lpfc_nvmet_mrq attribute handling when 0
Currently, when lpfc_nvmet_mrq is 0 it could mean 2 different things
depending on when its looked at. If at module load time it specifies the
default number of hardware queues to allocate, with 0 meaning default to
the number of CPUs. But post module load, a value of zero means to disable
mrq use.

Changed the driver so that enablement of mrq is based on whether nvme
target mode is enabled or not. When enabled, mrq is enabled.  Thus, the
cfg_nvemt_mrq field only specifies the number of mrq queues to enable, with
0 defaulting to the number of cpus.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-19 12:57:02 -04:00
James Smart
50e3f871fb scsi: lpfc: Resolve irq-unsafe lockdep heirarchy warning in lpfc_io_free
A patch in the 12.2.0.0 set caused a new lockdep warning:

  WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
  5.0.0-rc8-next-20190301-dbg+ #1 Not tainted

  Possible interrupt unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(&(&qp->io_buf_list_put_lock)->rlock);
                               local_irq_disable();
                               lock(&(&phba->hbalock)->rlock);
                               lock(&(&qp->io_buf_list_put_lock)->rlock);
  <Interrupt>
    lock(&(&phba->hbalock)->rlock);

see: https://www.spinics.net/lists/linux-scsi/msg128389.html

In summary, the new patch added taking the io_buf_list_put_lock while under
an irq-disabled hbalock. This created a lock heirarchy dependent upon irq
being disabled, and there are paths that take the io_buf_list_put_lock
without disabling irq.

Looking at the lpfc_io_free routine, which is where the new heirarchy was
introduced, there is no reason to be taking out the hbalock and raising
irq, as the functionality is replaced by the io_buf_list_xxx locks.

Resolve by removing the hbalock/irq calls in lpfc_io_free.

Fixes: 5e5b511d8b ("scsi: lpfc: Partition XRI buffer list across Hardware Queues")
Reported-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-19 12:57:01 -04:00
Linus Torvalds
477558d7e8 SCSI misc on 20190315
This is the final round of mostly small fixes and performance
 improvements to our initial submit.  The main regression fix is the
 ia64 simscsi build failure which was missed in the serial number
 elimination conversion.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXIxBayYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pisherpAP4rxLpX
 bcUnQnEsvoxys/JyoK08Qfv1JebZo1B2MAZ62wD/VZ7LpOuzVLhsM2KhLFGRrs1/
 7D2K4tgtO2dQsFix7H0=
 =pcHl
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull more SCSI updates from James Bottomley:
 "This is the final round of mostly small fixes and performance
  improvements to our initial submit.

  The main regression fix is the ia64 simscsi build failure which was
  missed in the serial number elimination conversion"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (24 commits)
  scsi: ia64: simscsi: use request tag instead of serial_number
  scsi: aacraid: Fix performance issue on logical drives
  scsi: lpfc: Fix error codes in lpfc_sli4_pci_mem_setup()
  scsi: libiscsi: Hold back_lock when calling iscsi_complete_task
  scsi: hisi_sas: Change SERDES_CFG init value to increase reliability of HiLink
  scsi: hisi_sas: Send HARD RESET to clear the previous affiliation of STP target port
  scsi: hisi_sas: Set PHY linkrate when disconnected
  scsi: hisi_sas: print PHY RX errors count for later revision of v3 hw
  scsi: hisi_sas: Fix a timeout race of driver internal and SMP IO
  scsi: hisi_sas: Change return variable type in phy_up_v3_hw()
  scsi: qla2xxx: check for kstrtol() failure
  scsi: lpfc: fix 32-bit format string warning
  scsi: lpfc: fix unused variable warning
  scsi: target: tcmu: Switch to bitmap_zalloc()
  scsi: libiscsi: fall back to sendmsg for slab pages
  scsi: qla2xxx: avoid printf format warning
  scsi: lpfc: resolve static checker warning in lpfc_sli4_hba_unset
  scsi: lpfc: Correct __lpfc_sli_issue_iocb_s4 lockdep check
  scsi: ufs: hisi: fix ufs_hba_variant_ops passing
  scsi: qla2xxx: Fix panic in qla_dfs_tgt_counters_show
  ...
2019-03-16 12:51:50 -07:00
Dan Carpenter
3a487ff78c scsi: lpfc: Fix error codes in lpfc_sli4_pci_mem_setup()
It used to be that "error" was set to -ENODEV at the start of the function
but we shifted some code around an now "error" is set to zero for most
error paths.  There is a mix of direct returns and "goto out" but I changed
everything to direct returns for consistency.

Fixes: 56de835704 ("scsi: lpfc: fix calls to dma_set_mask_and_coherent()")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: James Smart  <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-14 07:01:36 -04:00
Linus Torvalds
92fff53b71 SCSI misc on 20190306
This is mostly update of the usual drivers: arcmsr, qla2xxx, lpfc,
 hisi_sas, target/iscsi and target/core.  Additionally Christoph
 refactored gdth as part of the dma changes.  The major mid-layer
 change this time is the removal of bidi commands and with them the
 whole of the osd/exofs driver and filesystem.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXIC54SYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishT1GAPwJEV23
 ExPiPsnuVgKj49nLTagZ3rILRQcYNbL+MNYqxQEA0cT8FHzSDBfWY5OKPNE+RQ8z
 f69LpXGmMpuagKGvvd4=
 =Fhy1
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly update of the usual drivers: arcmsr, qla2xxx, lpfc,
  hisi_sas, target/iscsi and target/core.

  Additionally Christoph refactored gdth as part of the dma changes. The
  major mid-layer change this time is the removal of bidi commands and
  with them the whole of the osd/exofs driver and filesystem. This is a
  major simplification for block and mq in particular"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (240 commits)
  scsi: cxgb4i: validate tcp sequence number only if chip version <= T5
  scsi: cxgb4i: get pf number from lldi->pf
  scsi: core: replace GFP_ATOMIC with GFP_KERNEL in scsi_scan.c
  scsi: mpt3sas: Add missing breaks in switch statements
  scsi: aacraid: Fix missing break in switch statement
  scsi: kill command serial number
  scsi: csiostor: drop serial_number usage
  scsi: mvumi: use request tag instead of serial_number
  scsi: dpt_i2o: remove serial number usage
  scsi: st: osst: Remove negative constant left-shifts
  scsi: ufs-bsg: Allow reading descriptors
  scsi: ufs: Allow reading descriptor via raw upiu
  scsi: ufs-bsg: Change the calling convention for write descriptor
  scsi: ufs: Remove unused device quirks
  Revert "scsi: ufs: disable vccq if it's not needed by UFS device"
  scsi: megaraid_sas: Remove a bunch of set but not used variables
  scsi: clean obsolete return values of eh_timed_out
  scsi: sd: Optimal I/O size should be a multiple of physical block size
  scsi: MAINTAINERS: SCSI initiator and target tweaks
  scsi: fcoe: make use of fip_mode enum complete
  ...
2019-03-09 16:53:47 -08:00
Arnd Bergmann
f996861be1 scsi: lpfc: fix 32-bit format string warning
On 32-bit architectures, we see a warning when %ld is used to print a
size_t:

In file included from drivers/scsi/lpfc/lpfc_init.c:62:
drivers/scsi/lpfc/lpfc_init.c: In function 'lpfc_new_io_buf':
drivers/scsi/lpfc/lpfc_logmsg.h:62:45: error: format '%ld' expects argument of type 'long int', but argument 5 has type 'unsigned int' [-Werror=format=]

This is harmless, but portable code should just use %zd to avoid the
warning.

Fixes: 0794d601d1 ("scsi: lpfc: Implement common IO buffers between NVME and SCSI")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-06 19:26:46 -05:00
James Smart
1ffdd2c044 scsi: lpfc: resolve static checker warning in lpfc_sli4_hba_unset
The patch that replaced io channels for hdw_queues now reports the
following static checker warning:

drivers/scsi/lpfc/lpfc_init.c:11136 lpfc_sli4_hba_unset()
 error: we previously assumed 'phba->pport' could be null (see line 11074)

Resolve by adding a pport NULL check.

[mkp: tag tweak]

Fixes: cdb42becdd ("scsi: lpfc: Replace io_channels for nvme and fcp with general hdw_queues per cpu"_
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-03-06 19:26:45 -05:00
Linus Torvalds
df49fd0ff8 SCSI fixes on 20190302
Nine small fixes.  The resume fix is a cosmetic removal of a warning
 with an incorrect condition causing it to alarm people wrongly.  The
 other eight patches correct a thinko in Christoph Hellwig's DMA
 conversion series.  Without it all these drivers end up with 32 bit
 DMA masks meaning they bounce any page over 4GB before sending it to
 the controller.  Nowadays, even laptops mostly have memory above 4GB,
 so this can lead to significant performance degradation with all the
 bouncing.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXHql8CYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishQmvAQCSVQRf
 kx3ABDGnaj4Km4/Jzibj44aCYwh+ewwtLCWwFQD9GWaEaDxBkbxQDf/YndQKRhYg
 VJQjjj6a9VlNSmWoW28=
 =L9Fe
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Nine small fixes.

  The resume fix is a cosmetic removal of a warning with an incorrect
  condition causing it to alarm people wrongly.

  The other eight patches correct a thinko in Christoph Hellwig's DMA
  conversion series. Without it all these drivers end up with 32 bit DMA
  masks meaning they bounce any page over 4GB before sending it to the
  controller.

  Nowadays, even laptops mostly have memory above 4GB, so this can lead
  to significant performance degradation with all the bouncing"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: core: Avoid that system resume triggers a kernel warning
  scsi: hptiop: fix calls to dma_set_mask()
  scsi: hisi_sas: fix calls to dma_set_mask_and_coherent()
  scsi: csiostor: fix calls to dma_set_mask_and_coherent()
  scsi: bfa: fix calls to dma_set_mask_and_coherent()
  scsi: aic94xx: fix calls to dma_set_mask_and_coherent()
  scsi: 3w-sas: fix calls to dma_set_mask_and_coherent()
  scsi: 3w-9xxx: fix calls to dma_set_mask_and_coherent()
  scsi: lpfc: fix calls to dma_set_mask_and_coherent()
2019-03-02 11:39:54 -08:00
Hannes Reinecke
56de835704 scsi: lpfc: fix calls to dma_set_mask_and_coherent()
The change to use dma_set_mask_and_coherent() incorrectly made a second
call with the 32 bit DMA mask value when the call with the 64 bit DMA mask
value succeeded.  This resulted in NVMe/FC connections failing due to
corrupted data buffers, and various other SCSI/FCP I/O errors.

Fixes: f30e1bfd61 ("scsi: lpfc: use dma_set_mask_and_coherent")
Cc: <stable@vger.kernel.org>
Suggested-by: Don Dutile <ddutile@redhat.com>
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-25 21:37:25 -05:00
YueHaibing
59e54d9aab scsi: lpfc: Remove set but not used variable 'phys_id'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/scsi/lpfc/lpfc_init.c: In function 'lpfc_cpu_affinity_check':
drivers/scsi/lpfc/lpfc_init.c:10599:19: warning:
 variable 'phys_id' set but not used [-Wunused-but-set-variable]

It never used since introduction in commit 6a828b0f61 ("scsi: lpfc:
Support non-uniform allocation of MSIX vectors to hardware queues")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-19 18:58:34 -05:00
Colin Ian King
258f84fae3 scsi: lpfc: fix a handful of indentation issues
There are a handful of statements that are indented incorrectly. Fix these.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-13 22:15:42 -05:00
Dan Carpenter
fad28e3d9a scsi: lpfc: Fix error code if kcalloc() fails
This should return -ENOMEM if kcalloc() fails, but it accidentally
returns success instead.

Fixes: 6a828b0f61 ("scsi: lpfc: Support non-uniform allocation of MSIX vectors to hardware queues")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-12 22:15:54 -05:00
James Smart
0d041215f0 scsi: lpfc: Update 12.2.0.0 file copyrights to 2019
For files modified as part of 12.2.0.0 patches, update copyright to 2019

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:50 -05:00
James Smart
f6e8479052 scsi: lpfc: Fix default driver parameter collision for allowing NPIV support
The conversion to enable SCSI and NVME fc4 support ran into an issue with
NPIV support. With NVME, NPIV is not currently supported, but with SCSI it
was. The driver reverted to its lowest setting meaning NPIV with SCSI was
not allowed.

Convert the NPIV checks and implementation so that SCSI can continue to
allow NPIV support.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:50 -05:00
James Smart
c2017260ee scsi: lpfc: Rework locking on SCSI io completion
A scsi host lock is taken on every io completion to check whether the abort
handler is waiting on the io completion. This is an expensive lock to take
on all completion when rarely in an abort condition.

Replace scsi host lock with command-specific lock. Synchronize completion
and abort paths by new cmd lock. Ensure all flag changing and nulling of
context pointers taken under lock.  When adding lock to task management
abort, realized it was missing other synchronization locks. Added that
synchronization to match normal paths.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:50 -05:00
James Smart
222e9239c6 scsi: lpfc: Resize cpu maps structures based on possible cpus
The work done to date utilized the number of present cpus when sizing
per-cpu structures. Structures should have been sized based on the max
possible cpu count.

Convert the driver over to possible cpu count for sizing allocation.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:50 -05:00
James Smart
75508a8b8b scsi: lpfc: Utilize new IRQ API when allocating MSI-X vectors
Current driver uses the older IRQ API for MSIX allocation

Change driver to utilize pci_alloc_irq_vectors when allocating IRQ vectors.

Make lpfc_cpu_affinity_check use pci_irq_get_affinity to determine how the
kernel mapped all the IRQs.

Remove msix_entries from SLI4 structure, replaced with pci_irq_vector()
usage.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:49 -05:00
James Smart
32517fc097 scsi: lpfc: Rework EQ/CQ processing to address interrupt coalescing
When driving high iop counts, auto_imax coalescing kicks in and drives the
performance to extremely small iops levels.

There are two issues:

 1) auto_imax is enabled by default. The auto algorithm, when iops gets
    high, divides the iops by the hdwq count and uses that value to
    calculate EQ_Delay. The EQ_Delay is set uniformly on all EQs whether
    they have load or not. The EQ_delay is only manipulated every 5s (a
    long time). Thus there were large 5s swings of no interrupt delay
    followed by large/maximum delay, before repeating.

 2) When processing a CQ, the driver got mixed up on the rate of when
    to ring the doorbell to keep the chip appraised of the eqe or cqe
    consumption as well as how how long to sit in the thread and
    process queue entries. Currently, the driver capped its work at
    64 entries (very small) and exited/rearmed the CQ.  Thus, on heavy
    loads, additional overheads were taken to exit and re-enter the
    interrupt handler. Worse, if in the large/maximum coalescing
    windows,k it could be a while before getting back to servicing.

The issues are corrected by the following:

 - A change in defaults. Auto_imax is turned OFF and fcp_imax is set
   to 0. Thus all interrupts are immediate.

 - Cleanup of field names and their meanings. Existing names were
   non-intuitive or used for duplicate things.

 - Added max_proc_limit field, to control the length of time the
   handlers would service completions.

 - Reworked EQ handling:
    Added common routine that walks eq, applying notify interval and max
      processing limits. Use queue_claimed to claim ownership of the queue
      while processing. Always rearm the queue whenever the common routine
      is called.
    Rework queue element processing, namely to eliminate hba_index vs
      host_index. Only one index is necessary. The queue entry can be
      marked invalid and the host_index updated immediately after eqe
      processing.
    After rework, xx_release routines are now DB write functions. Renamed
      the routines as such.
    Moved lpfc_sli4_eq_flush(), which does similar action, to same area.
    Replaced the 2 individual loops that walk an eq with a call to the
      common routine.
    Slightly revised lpfc_sli4_hba_handle_eqe() calling syntax.
    Added per-cpu counters to detect interrupt rates and scale
      interrupt coalescing values.

 - Reworked CQ handling:
    Added common routine that walks cq, applying notify interval and max
      processing limits. Use queue_claimed to claim ownership of the queue
      while processing. Always rearm the queue whenever the common routine
      is called.
    Rework queue element processing, namely to eliminate hba_index vs
      host_index. Only one index is necessary. The queue entry can be
      marked invalid and the host_index updated immediately after cqe
      processing.
    After rework, xx_release routines are now DB write functions.  Renamed
      the routines as such.
    Replaced the 3 individual loops that walk a cq with a call to the
      common routine.
    Redefined lpfc_sli4_sp_handle_mcqe() to commong handler definition with
      queue reference. Add increment for mbox completion to handler.

 - Added a new module/sysfs attribute: lpfc_cq_max_proc_limit To allow
   dynamic changing of the CQ max_proc_limit value being used.

Although this leaves an EQ as an immediate interrupt, that interrupt will
only occur if a CQ bound to it is in an armed state and has cqe's to
process.  By staying in the cq processing routine longer, high loads will
avoid generating more interrupts as they will only rearm as the processing
thread exits. The immediately interrupt is also beneficial to idle or
lower-processing CQ's as they get serviced immediately without being
penalized by sharing an EQ with a more loaded CQ.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:49 -05:00
James Smart
cb733e3587 scsi: lpfc: cleanup: convert eq_delay to usdelay
Review of the eq coalescing logic showed the code was a bit fragmented.
Sometimes it would save/set via an interrupt max value, while in others it
would do so via a usdelay. There were also two places changing eq delay,
one place that issued mailbox commands, and another that changed via
register writes if supported.

Clean this up by:

 - Standardizing the operation of lpfc_modify_hba_eq_delay() routine so
   that it is always told of a us delay to impose. The routine then chooses
   the best way to set that - via register or via mbx.

 - Rather than two value types stored in eq->q_mode (usdelay if change via
   register, imax if change via mbox) - q_mode always contains usdelay.
   Before any value change, old vs new value is compared and only if
   different is a change done.

 - Revised the dmult calculation. dmult is not set based on overall imax
   divided by hardware queues - instead imax applies to a single cpu and
   the value will be replicated to all cpus.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:49 -05:00
James Smart
6a828b0f61 scsi: lpfc: Support non-uniform allocation of MSIX vectors to hardware queues
So far MSIX vector allocation assumed it would be 1:1 with hardware
queues. However, there are several reasons why fewer MSIX vectors may be
allocated than hardware queues such as the platform being out of vectors or
adapter limits being less than cpu count.

This patch reworks the MSIX/EQ relationships with the per-cpu hardware
queues so they can function independently. MSIX vectors will be equitably
split been cpu sockets/cores and then the per-cpu hardware queues will be
mapped to the vectors most efficient for them.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:49 -05:00
James Smart
b3295c2a75 scsi: lpfc: Fix setting affinity hints to correlate with hardware queues
The desired affinity for the hardware queue behavior is for hdwq 0 to be
affinitized with cpu 0, hdwq 1 to cpu 1, and so on.  The implementation so
far does not do this if the number of cpus is greater than the number of
hardware queues (e.g. hardware queue allocation was administratively
reduced or hardware queue resources could not scale to the cpu count).

Correct the queue affinitization logic when queue count is less than
cpu count.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:09 -05:00
James Smart
c490850a09 scsi: lpfc: Adapt partitioned XRI lists to efficient sharing
The XRI get/put lists were partitioned per hardware queue. However, the
adapter rarely had sufficient resources to give a large number of resources
per queue. As such, it became common for a cpu to encounter a lack of XRI
resource and request the upper io stack to retry after returning a BUSY
condition. This occurred even though other cpus were idle and not using
their resources.

Create as efficient a scheme as possible to move resources to the cpus that
need them. Each cpu maintains a small private pool which it allocates from
for io. There is a watermark that the cpu attempts to keep in the private
pool.  The private pool, when empty, pulls from a global pool from the
cpu. When the cpu's global pool is empty it will pull from other cpu's
global pool. As there many cpu global pools (1 per cpu or hardware queue
count) and as each cpu selects what cpu to pull from at different rates and
at different times, it creates a radomizing effect that minimizes the
number of cpu's that will contend with each other when the steal XRI's from
another cpu's global pool.

On io completion, a cpu will push the XRI back on to its private pool.  A
watermark level is maintained for the private pool such that when it is
exceeded it will move XRI's to the CPU global pool so that other cpu's may
allocate them.

On NVME, as heartbeat commands are critical to get placed on the wire, a
single expedite pool is maintained. When a heartbeat is to be sent, it will
allocate an XRI from the expedite pool rather than the normal cpu
private/global pools. On any io completion, if a reduction in the expedite
pools is seen, it will be replenished before the XRI is placed on the cpu
private pool.

Statistics are added to aid understanding the XRI levels on each cpu and
their behaviors.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:09 -05:00
James Smart
ace44e48b1 scsi: lpfc: Synchronize hardware queues with SCSI MQ interface
Now that the lower half has much better per-cpu parallelization using the
hardware queues, the SCSI MQ support needs to be tied into it.

The involves the following mods:

 - Use the hardware queue info from the midlayer to help select the
   hardware queue to utilize. This required change to the get_scsi-buf_xxx
   routines.

 - Remove lpfc_sli4_scmd_to_wqidx_distr() routine. No longer needed.

 - Includes fix for SLI-3 that does not have multi queue parallelization.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:09 -05:00
James Smart
1fbf974250 scsi: lpfc: Convert ring number to hardware queue for nvme wqe posting.
SLI4 nvme functions are passing the SLI3 ring number when posting wqe to
hardware. This should be indicating the hardware queue to use, not the ring
number.

Replace ring number with the hardware queue that should be used.

Note: SCSI avoided this issue as it utilized an older lfpc_issue_iocb
routine that properly adapts.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:09 -05:00
James Smart
4c47efc140 scsi: lpfc: Move SCSI and NVME Stats to hardware queue structures
Many io statistics were being sampled and saved using adapter-based data
structures. This was creating a lot of contention and cache thrashing in
the I/O path.

Move the statistics to the hardware queue data structures.  Given the
per-queue data structures, use of atomic types is lessened.

Add new sysfs and debugfs stat routines to collate the per hardware queue
values and report at an adapter level.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:29:08 -05:00
James Smart
5e5b511d8b scsi: lpfc: Partition XRI buffer list across Hardware Queues
Once the IO buff allocations were made shared, there was a single XRI
buffer list shared by all hardware queues.  A single list isn't great for
performance when shared across the per-cpu hardware queues.

Create a separate XRI IO buffer get/put list for each Hardware Queue.  As
SGLs and associated IO buffers get allocated/posted to the firmware; round
robin their assignment across all available hardware Queues so that there
is an equitable assignment.

Modify SCSI and NVME IO submit code paths to use the Hardware Queue logic
for XRI allocation.

Add a debugfs interface to display hardware queue statistics

Added new empty_io_bufs counter to track if a cpu runs out of XRIs.

Replace common_ variables/names with io_ to make meanings clearer.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:24:22 -05:00
James Smart
cdb42becdd scsi: lpfc: Replace io_channels for nvme and fcp with general hdw_queues per cpu
Currently, both nvme and fcp each have their own concept of an io_channel,
which is a combination wq/cq and associated msix.  Different cpus would
share an io_channel.

The driver is now moving to per-cpu wq/cq pairs and msix vectors.  The
driver will still use separate wq/cq pairs per protocol on each cpu, but
the protocols will share the msix vector.

Given the elimination of the nvme and fcp io channels, the module
parameters will be removed.  A new parameter, lpfc_hdw_queue is added which
allows the wq/cq pair allocation per cpu to be overridden and allocated to
lesser value. If lpfc_hdw_queue is zero, the number of pairs allocated will
be based on the number of cpus. If non-zero, the parameter specifies the
number of queues to allocate. At this time, the maximum non-zero value is
64.

To manage this new paradigm, a new hardware queue structure is created to
track queue activity and relationships.

As MSIX vector allocation must be known before setting up the
relationships, msix allocation now occurs before queue datastructures are
allocated. If the number of vectors allocated is less than the desired
hardware queues, the hardware queue counts will be reduced to the number of
vectors

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:22:42 -05:00
James Smart
7370d10ac9 scsi: lpfc: Remove extra vector and SLI4 queue for Expresslane
There is a extra queue and msix vector for expresslane. Now that the driver
will be doing queues per cpu, this oddball queue is no longer needed.
Expresslane will utilize the normal per-cpu queues.

Updated debugfs sli4 queue output to go along with the change

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:22:42 -05:00
James Smart
0794d601d1 scsi: lpfc: Implement common IO buffers between NVME and SCSI
Currently, both NVME and SCSI get their IO buffers from separate
pools. XRI's are associated 1:1 with IO buffers, so XRI's are also split
between protocols.

Eliminate the independent pools and use a single pool. Each buffer
structure now has a common section and a protocol section. Per protocol
routines for SGL initialization are removed and replaced by common
routines. Initialization of the buffers is only done on the common area.
All other fields, which are protocol specific, are initialized when the
buffer is allocated for use in the per-protocol allocation routine.

In the past, the SCSI side allocated IO buffers as part of slave_alloc
calls until the maximum XRIs for SCSI was reached. As all XRIs are now
common and may be used for either protocol, allocation for everything is
done as part of adapter initialization and the scsi side has no action in
slave alloc.

As XRI's are no longer split, the lpfc_xri_split module parameter is
removed.

Adapters based on SLI3 will continue to use the older scsi_buf_list_get/put
routines.  All SLI4 adapters utilize the new IO buffer scheme

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-02-05 22:22:42 -05:00
Luis Chamberlain
750afb08ca cross-tree: phase out dma_zalloc_coherent()
We already need to zero out memory for dma_alloc_coherent(), as such
using dma_zalloc_coherent() is superflous. Phase it out.

This change was generated with the following Coccinelle SmPL patch:

@ replace_dma_zalloc_coherent @
expression dev, size, data, handle, flags;
@@

-dma_zalloc_coherent(dev, size, handle, flags)
+dma_alloc_coherent(dev, size, handle, flags)

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
[hch: re-ran the script on the latest tree]
Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-01-08 07:58:37 -05:00
Linus Torvalds
938edb8a31 SCSI misc on 20181224
This is mostly update of the usual drivers: smarpqi, lpfc, qedi,
 megaraid_sas, libsas, zfcp, mpt3sas, hisi_sas.  Additionally, we have
 a pile of annotation, unused variable and minor updates.  The big API
 change is the updates for Christoph's DMA rework which include
 removing the DISABLE_CLUSTERING flag.  And finally there are a couple
 of target tree updates.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXCEUNiYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishdjKAP9vrTTv
 qFaYmAoRSbPq9ZiixaXLMy0K/6o76Uay0gnBqgD/fgn3jg/KQ6alNaCjmfeV3wAj
 u1j3H7tha9j1it+4pUw=
 =GDa+
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly update of the usual drivers: smarpqi, lpfc, qedi,
  megaraid_sas, libsas, zfcp, mpt3sas, hisi_sas.

  Additionally, we have a pile of annotation, unused variable and minor
  updates.

  The big API change is the updates for Christoph's DMA rework which
  include removing the DISABLE_CLUSTERING flag.

  And finally there are a couple of target tree updates"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (259 commits)
  scsi: isci: request: mark expected switch fall-through
  scsi: isci: remote_node_context: mark expected switch fall-throughs
  scsi: isci: remote_device: Mark expected switch fall-throughs
  scsi: isci: phy: Mark expected switch fall-through
  scsi: iscsi: Capture iscsi debug messages using tracepoints
  scsi: myrb: Mark expected switch fall-throughs
  scsi: megaraid: fix out-of-bound array accesses
  scsi: mpt3sas: mpt3sas_scsih: Mark expected switch fall-through
  scsi: fcoe: remove set but not used variable 'port'
  scsi: smartpqi: call pqi_free_interrupts() in pqi_shutdown()
  scsi: smartpqi: fix build warnings
  scsi: smartpqi: update driver version
  scsi: smartpqi: add ofa support
  scsi: smartpqi: increase fw status register read timeout
  scsi: smartpqi: bump driver version
  scsi: smartpqi: add smp_utils support
  scsi: smartpqi: correct lun reset issues
  scsi: smartpqi: correct volume status
  scsi: smartpqi: do not offline disks for transient did no connect conditions
  scsi: smartpqi: allow for larger raid maps
  ...
2018-12-28 14:48:06 -08:00
James Smart
529b3ddcff scsi: lpfc: update fault value on successful trunk events.
Currently, when a trunk link goes down due to some fault, the driver
snapshots the fault code.  If the link then comes back up, meaning there is
no fault, the driver is not clearing the fault code so the sysfs link_state
entry reports old/stale data.

Revise the logic so that on successful link up the fault code is cleared.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-19 22:13:07 -05:00
James Smart
1165a5c220 scsi: lpfc: Fix driver release of fw-logging buffers
On driver termination, after the driver stops fw logging by writing a
register on the chip, the driver immediately unmaps and frees the logging
buffer, without confirming in any way that the chip has received the write
and terminated the logging. As termination on the chip is not immediate,
the chip may issue a dma request to the now unmapped dma buffer, resulting
in a iommu fault.

Change the driver to receive a confirmation that logging ahs been
terminated. As the driver always issues an SLI reset with the device as
part of shutdown, and as part of that is receiving confirmation that the
reset is complete - the driver was modified to perform the write to disable
fw logging prior to the SLI reset and only free the fw log buffer after the
SLI reset is complete. That guarantees use of the fw log buffer is fully
terminated when it is unmapped.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07 22:35:33 -05:00
James Smart
8b47ae69e0 scsi: lpfc: Cap NPIV vports to 256
Depending on the chipset, the number of NPIV vports may vary and be in
excess of what most switches support (256). To avoid confusion with the
users, limit the reported NPIV vports to 256.

Additionally correct the 16G adapter which is reporting a bogus NPIV vport
number if the link is down.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07 22:35:32 -05:00
James Smart
5a9eeff57f scsi: lpfc: Fix kernel Oops due to null pring pointers
Driver is hitting null pring pointers in lpfc_do_work().

Pointer assignment occurs based on SLI-revision. If recovering after an
error, its possible the sli revision for the port was cleared, making the
lpfc_phba_elsring() not return a ring pointer, thus the null pointer.

Add SLI revision checking to lpfc_phba_elsring() and status checking to all
callers.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07 22:35:32 -05:00
James Smart
3e1f071892 scsi: lpfc: refactor mailbox structure context fields
The driver data structure for managing a mailbox command contained two
context fields. Unfortunately, the context were considered "generic" to be
used at the whim of the command code.  Of course, one section of code used
fields this way, while another did it that way, and eventually there were
mixups.

Refactored the structure so that the generic contexts become a node context
and a buffer context and all code standardizes on their use.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07 22:35:32 -05:00
James Smart
cb34990b90 scsi: lpfc: Fix panic when FW-log buffsize is not initialized
While trying to get adapter fw-log for a function whose buffsize was set to
0, kernel panic occurred.

When buffsize is 0, the kernel buffer for the log won't be allocated.  When
fw log usage was enabled, it failed to check the buffer size, and log usage
was started. Eventually the driver referenced the unallocated log buffer.

Added checks of the buffer size before allowing fw logging to be enabled
and added check for valid buffer if enabling fw log.

Performed a couple other minor cleanups while fixing this:
 - clarified log messages
 - re-evaluated log message severity
 - treat any error as an error, not only a couple codes

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-12-07 22:35:31 -05:00
Martin Wilck
dfb7513374 scsi: lpfc: fix block guard enablement on SLI3 adapters
Since f44ac12f1d, BG enablement is tracked with the LPFC_SLI3_BG_ENABLED
bit, which is set in lpfc_get_cfgparam before lpfc_sli_config_sli_port() is
called. The bit shouldn't be cleared before checking the feature.  Based on
problem analysis by David Bond.

Fixes: f44ac12f1d "scsi: lpfc: Memory allocation error during driver start-up on power8"
Tested-by: David Bond <dbond@suse.com>
Signed-off-by: Martin Wilck <mwilck@suse.com>
Cc: stable@vger.kernel.org # 4.17.x
Cc: stable@vger.kernel.org # 4.18.x
Cc: stable@vger.kernel.org # 4.19.x
Reviewed-by: Hannes Reinecke <hare@suse.com>
Acked-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-28 12:14:25 -05:00
Christoph Hellwig
f30e1bfd61 scsi: lpfc: use dma_set_mask_and_coherent
The driver currently uses pci_set_dma_mask despite otherwise using the
generic DMA API.  Switch it over to the better generic DMA API.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-15 14:27:08 -05:00
James Smart
1dc5ec2452 scsi: lpfc: add Trunking support
Add trunking support to the driver. Trunking is found on more recent
asics. In general, trunking appears as a single "port" to the driver
and overall behavior doesn't differ. Link speed is reported as an
aggregate value, while link speed control is done on a per-physical
link basis with all links in the trunk symmetrical. Some commands
returning port information are updated to additionally provide
trunking information. And new ACQEs are generated to report physical
link events relative to the trunk.

This patch contains the following modifications:

- Added link speed settings of 128GB and 256GB.

- Added handling of trunk-related ACQEs, mainly logging and trapping
  of physical link statuses.

- Added additional bsg interface to query trunk state by applications.

- Augment link_state sysfs attribtute to display trunk link status

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-06 20:42:51 -05:00
James Smart
036cad1f1a scsi: lpfc: fcoe: Fix link down issue after 1000+ link bounces
On FCoE adapters, when running link bounce test in a loop, initiator
failed to login with switch switch and required driver reload to
recover. Switch reached a point where all subsequent FLOGIs would be
LS_RJT'd. Further testing showed the condition to be related to not
performing FCF discovery between FLOGI's.

Fix by monitoring FLOGI failures and once a repeated error is seen
repeat FCF discovery.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-06 20:42:51 -05:00
James Smart
3952e91f11 scsi: lpfc: Fix lpfc_sli4_read_config return value check
An error is an error - but not to the existing return value check.

Revise check to handle any failure, not just EIO.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-06 20:42:50 -05:00
James Smart
cd71348ad7 scsi: lpfc: Correct speeds on SFP swap
Supported speeds is not updated when SFP is removed or replaced

Supported speed is obtained from lmt field in READ_CONFIG mailbox
response. Driver updates supported speeds only once from PCI probe
path. After that it is never updated. So, supported speeds remains the
same till reboot or driver reload.

When SFP is removed or inserted, driver gets SLI-Port Event ACQE.  If
SFP is removed, lmt wil have value 0. If a different SFP is inserted,
lmt will have value according to its supported speeds.  So, afterr
SLI-Port Event ACQE handling path, send READ_CONFIG mailbox and update
supported speeds. If READ_CONFIG fails, set supported speeds to
unknown and log.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-11-06 20:42:50 -05:00
Linus Torvalds
d49f8a52b1 SCSI misc on 20181024
This is mostly updates of the usual drivers: UFS, esp_scsi, NCR5380,
 qla2xxx, lpfc, libsas, hisi_sas.  In addition there's a set of mostly
 small updates to the target subsystem a set of conversions to the
 generic DMA API, which do have some potential for issues in the older
 drivers but we'll handle those as case by case fixes. A new myrs for
 the DAC960/mylex raid controllers to replace the block based DAC960
 which is also being removed by Jens in this merge window. Plus the
 usual slew of trivial changes.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCW9BQJSYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishU3MAP41T8yW
 UJQDCprj65pCR+9mOUWzgMvgAW/15ouK89x/7AD/XAEQZqoAgpFUbgnoZWGddZkS
 LykIzSiLHP4qeDOh1TQ=
 =2JMU
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly updates of the usual drivers: UFS, esp_scsi, NCR5380,
  qla2xxx, lpfc, libsas, hisi_sas.

  In addition there's a set of mostly small updates to the target
  subsystem a set of conversions to the generic DMA API, which do have
  some potential for issues in the older drivers but we'll handle those
  as case by case fixes.

  A new myrs driver for the DAC960/mylex raid controllers to replace the
  block based DAC960 which is also being removed by Jens in this merge
  window.

  Plus the usual slew of trivial changes"

[ "myrs" stands for "MYlex Raid Scsi". Obviously. Silly of me to even
  wonder. There's also a "myrb" driver, where the 'b' stands for
  'block'. Truly, somebody has got mad naming skillz. - Linus ]

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (237 commits)
  scsi: myrs: Fix the processor absent message in processor_show()
  scsi: myrs: Fix a logical vs bitwise bug
  scsi: hisi_sas: Fix NULL pointer dereference
  scsi: myrs: fix build failure on 32 bit
  scsi: fnic: replace gross legacy tag hack with blk-mq hack
  scsi: mesh: switch to generic DMA API
  scsi: ips: switch to generic DMA API
  scsi: smartpqi: fully convert to the generic DMA API
  scsi: vmw_pscsi: switch to generic DMA API
  scsi: snic: switch to generic DMA API
  scsi: qla4xxx: fully convert to the generic DMA API
  scsi: qla2xxx: fully convert to the generic DMA API
  scsi: qla1280: switch to generic DMA API
  scsi: qedi: fully convert to the generic DMA API
  scsi: qedf: fully convert to the generic DMA API
  scsi: pm8001: switch to generic DMA API
  scsi: nsp32: switch to generic DMA API
  scsi: mvsas: fully convert to the generic DMA API
  scsi: mvumi: switch to generic DMA API
  scsi: mpt3sas: switch to generic DMA API
  ...
2018-10-25 07:40:30 -07:00
Colin Ian King
c4dba187e6 scsi: lpfc: fix spelling mistake "Resrouce" -> "Resource"
Trivial fix to spelling mistake in lpfc_printf_log message text.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-10-16 17:51:47 -04:00
Oza Pawandeep
62b36c3ea6 PCI/AER: Remove pci_cleanup_aer_uncorrect_error_status() calls
After bfcb79fca1 ("PCI/ERR: Run error recovery callbacks for all affected
devices"), AER errors are always cleared by the PCI core and drivers don't
need to do it themselves.

Remove calls to pci_cleanup_aer_uncorrect_error_status() from device
driver error recovery functions.

Signed-off-by: Oza Pawandeep <poza@codeaurora.org>
[bhelgaas: changelog, remove PCI core changes, remove unused variables]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-10-02 16:04:40 -05:00
James Smart
d2cc9bcd7f scsi: lpfc: add support to retrieve firmware logs
This patch adds the ability to read firmware logs from the adapter. The driver
registers a buffer with the adapter that is then written to by the adapter.
The adapter posts CQEs to indicate content updates in the buffer. While the
adapter is writing to the buffer in a circular fashion, an application will
poll the driver to read the next amount of log data from the buffer.

Driver log buffer size is configurable via the ras_fwlog_buffsize sysfs
attribute. Verbosity to be used by firmware when logging to host memory is
controlled through the ras_fwlog_level attribute.  The ras_fwlog_func
attribute enables or disables loggy by firmware.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-09-11 20:37:33 -04:00
James Smart
523128e53b scsi: lpfc: Correct irq handling via locks when taking adapter offline
When taking the board offline while performing i/o, unsafe locking errors
occurred and irq level isn't properly managed.

In lpfc_sli_hba_down, spin_lock_irqsave(&phba->hbalock, flags) does not
disable softirqs raised from timer expiry.  It is possible that a softirq is
raised from the lpfc_els_retry_delay routine and recursively requests the same
phba->hbalock spinlock causing deadlock.

Address the deadlocks by creating a new port_list lock. The softirq behavior
can then be managed a level deeper into the calling sequences.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-09-11 20:37:33 -04:00
James Smart
5b9e70b22c scsi: lpfc: raise sg count for nvme to use available sg resources
The driver allocates a sg list per io struture based on a fixed maximum
size. When it registers with the protocol transports and indicates the max sg
list size it supports, the driver manipulates the fixed value to report a
lesser amount so that it has reserved space for sg elements that are used for
DIF.

The driver initialization path sets the cfg_sg_seg_cnt field to the
manipulated value for scsi. NVME initialization ran afterward and capped it's
maximum by the manipulated value for SCSI. This erroneously made NVME report
the SCSI-reduce-for-DIF value that reduced the max io size for nvme and wasted
sg elements.

Rework the driver so that cfg_sg_seg_cnt becomes the overall maximum size and
allow the max size to be tunable.  A separate (new) scsi sg count is then
setup with the scsi-modified reduced value. NVME then initializes based off
the overall maximum.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-09-11 20:37:33 -04:00
James Smart
66e9e6bf07 scsi: lpfc: Support duration field in Link Cable Beacon V1 command
Current implementation missed setting the duration field. Correct the code
to set the field.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-07-10 22:15:09 -04:00
James Smart
414abe0ab6 scsi: lpfc: Make PBDE optimizations configurable
The PBDE optimizations aren't supported in all firmware revs.

Make optimizations configurable in case there's a side effect on old
firmware.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-07-10 22:15:09 -04:00
James Smart
68c9b55dee scsi: lpfc: Fix abort error path for NVMET
rmmod of driver hangs

As driver instances were being unloaded, the NVME target port was unloaded
first. During the unload, the NVME initiator port sent a heartbeat
IO. Because of the target port state, that IO was scheduled for an Abort;
however, that abort subsequently failed. The failure was not cleaned up
properly and lpfc_sli4_xri_exchange_busy_wait silently hung forever.

Clean failed abort properly and make lpfc_sli4_xri_exchange_busy_wait not
hangs silently while waiting for aborts to complete.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-07-10 22:15:09 -04:00
Kees Cook
6396bb2215 treewide: kzalloc() -> kcalloc()
The kzalloc() function has a 2-factor argument form, kcalloc(). This
patch replaces cases of:

        kzalloc(a * b, gfp)

with:
        kcalloc(a * b, gfp)

as well as handling cases of:

        kzalloc(a * b * c, gfp)

with:

        kzalloc(array3_size(a, b, c), gfp)

as it's slightly less ugly than:

        kzalloc_array(array_size(a, b), c, gfp)

This does, however, attempt to ignore constant size factors like:

        kzalloc(4 * 1024, gfp)

though any constants defined via macros get caught up in the conversion.

Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.

The Coccinelle script used for this was:

// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@

(
  kzalloc(
-	(sizeof(TYPE)) * E
+	sizeof(TYPE) * E
  , ...)
|
  kzalloc(
-	(sizeof(THING)) * E
+	sizeof(THING) * E
  , ...)
)

// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@

(
  kzalloc(
-	sizeof(u8) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(__u8) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(char) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(unsigned char) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(u8) * COUNT
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(__u8) * COUNT
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(char) * COUNT
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(unsigned char) * COUNT
+	COUNT
  , ...)
)

// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@

(
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * (COUNT_ID)
+	COUNT_ID, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * COUNT_ID
+	COUNT_ID, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * (COUNT_CONST)
+	COUNT_CONST, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * COUNT_CONST
+	COUNT_CONST, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * (COUNT_ID)
+	COUNT_ID, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * COUNT_ID
+	COUNT_ID, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * (COUNT_CONST)
+	COUNT_CONST, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * COUNT_CONST
+	COUNT_CONST, sizeof(THING)
  , ...)
)

// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@

- kzalloc
+ kcalloc
  (
-	SIZE * COUNT
+	COUNT, SIZE
  , ...)

// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@

(
  kzalloc(
-	sizeof(TYPE) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(TYPE) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(TYPE) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(TYPE) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(THING) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kzalloc(
-	sizeof(THING) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kzalloc(
-	sizeof(THING) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kzalloc(
-	sizeof(THING) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
)

// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@

(
  kzalloc(
-	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  kzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  kzalloc(
-	sizeof(THING1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  kzalloc(
-	sizeof(THING1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  kzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
|
  kzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
)

// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@

(
  kzalloc(
-	(COUNT) * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	(COUNT) * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	(COUNT) * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	(COUNT) * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
)

// Any remaining multi-factor products, first at least 3-factor products,
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@

(
  kzalloc(C1 * C2 * C3, ...)
|
  kzalloc(
-	(E1) * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
|
  kzalloc(
-	(E1) * (E2) * E3
+	array3_size(E1, E2, E3)
  , ...)
|
  kzalloc(
-	(E1) * (E2) * (E3)
+	array3_size(E1, E2, E3)
  , ...)
|
  kzalloc(
-	E1 * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
)

// And then all remaining 2 factors products when they're not all constants,
// keeping sizeof() as the second factor argument.
@@
expression THING, E1, E2;
type TYPE;
constant C1, C2, C3;
@@

(
  kzalloc(sizeof(THING) * C2, ...)
|
  kzalloc(sizeof(TYPE) * C2, ...)
|
  kzalloc(C1 * C2 * C3, ...)
|
  kzalloc(C1 * C2, ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * (E2)
+	E2, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * E2
+	E2, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * (E2)
+	E2, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * E2
+	E2, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	(E1) * E2
+	E1, E2
  , ...)
|
- kzalloc
+ kcalloc
  (
-	(E1) * (E2)
+	E1, E2
  , ...)
|
- kzalloc
+ kcalloc
  (
-	E1 * E2
+	E1, E2
  , ...)
)

Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-12 16:19:22 -07:00
James Smart
b92dc72df3 scsi: lpfc: Fix port initialization failure.
The driver exits port setup after failing the lpfc_sli4_get_parameters
command (messages 0356, 2541, & 1412).

The older CNA adapters do not support the MBX command. In the past
the code was allowed to fail and continue on with initialization.
However a nvme change moved a closing bracket and now makes all
failures terminal.

Revise the logic so that terminal failure only occurs if the command
failed on the newer adapters. Additionally, if parameters are set
that require information from the command and the command failed,
the parameters are erroneous and port set up should fail even on
the older adapters.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:33 -04:00
James Smart
c221768bd4 scsi: lpfc: Fix 16gb hbas failing cq create.
The lancer G5 chip family fails the CQ create with 16k page size.  The
hardware incorrectly reports it supports large page sizes when it is
actually limited to 4k pages.

A prior patch resolved this for the A0 chip revision only.  This patch
excludes all revisions of the G5 asic from using large page sizes. As
knowing the actual chip revision is unnecessary, the now unused definitions
are removed

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-28 22:40:33 -04:00
Colin Ian King
7afc0ce912 scsi: lpfc: fix spelling mistakes: "mabilbox" and "maibox"
Trivial fix to spelling mistakes in lpfc_printf_log log message

"mabilbox" -> "mailbox"
"maibox" -> "mailbox"

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:26:44 -04:00
James Smart
3e21d1cb0f scsi: lpfc: Comment cleanup regarding Broadcom copyright header
Fix small formatting and wording nits in Broadcom copyright header

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:03:16 -04:00
James Smart
d38f33b304 scsi: lpfc: Driver NVME load fails when CPU cnt > WQ resource cnt
If the cpu count is larger than the number of WQ resources available,
adapter attachment eventually failes due to a WQ_CREATE failure.

Calculate the number of WQs desired (which initializes to cpu count)
after accounting for the number of queues the adapter supports and the
number allocated to SCSI and the control/ELS path, and scale down if
necessary.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:03:16 -04:00
James Smart
23288b78a1 scsi: lpfc: Handle new link fault code returned by adapter firmware.
The driver encounters a link event ACQE with a fault code it doesn't
recognize, it logs an "Invalid" fault type and futher treats the unknown
value as a mailbox command failure.  First off, there is no "invalid"
value, only values that are unknown. Secondly, the fault code doesn't
indicate status - the rest of the ACQE contains that status so there is
no reason to "fail the commands".

Change the "Invalid" to "Unknown". There is no "invalid" code value.

Separate fault code parsing and message genaration from any mbx handling
status.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:03:16 -04:00
James Smart
a72d56b2a6 scsi: lpfc: Correct fw download error message
In situations when the firmware image in inappropriate for the chip
type, initial validation checks were light, allowing the checks to pass,
thus allowing the firmware to be downloaded.  Eventually, after the
download, the chip rejects the firmware but it is logged as a generic
firmware download error.

Revise the initial checks to validate the image vs asic type so that the
correct message is displayed and the download process is avoided.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-05-08 01:03:16 -04:00
James Smart
bf316c7851 scsi: lpfc: Fix WQ/CQ creation for older asic's.
The patch to enlarge WQ/CQ creation keys off of an adapter response that
indicates support for the larger values. Older adapters return an
incorrect response and are limited in size.  Thus the adapters fail the
WQ creation steps.

Augment the WQ sizing checks with a check on the older adapter types and
limit them to the restricted sizes.

Fixes: c176ffa084 ("scsi: lpfc: Increase CQ and WQ sizes for SCSI")
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:34:04 -04:00
James Smart
0cdb84ec26 scsi: lpfc: Fix lingering lpfc_wq resource after driver unload
After driver unloads, lpfc_wq remains active. The destroy_workqueue
calls were not being made in driver unload.  Additionally, SLI3 is
allocating lpfc_wq resources, but never uses it.

Make the destroy_workqueue calls on driver unload.  Modify the SLI3 code
path no longer allocate lpfc_wq resources.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:34:03 -04:00
James Smart
66a210ffb8 scsi: lpfc: Add per io channel NVME IO statistics
When debugging various issues, per IO channel IO statistics were useful
to understand what was happening. However, many of the stats were on a
port basis rather than an io channel basis.

Move statistics to an io channel basis.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-04-18 19:34:01 -04:00
James Smart
f44ac12f1d scsi: lpfc: Memory allocation error during driver start-up on power8
The driver fails to allocate command buffers in the routine
lpfc_new_scsi_buf_s4

There is an inconsistency between lpfc_mem_alloc(), where the
phba->lpfc_sg_dma_buf_pool is created, and lpfc_new_scsi_buf_s4(),
when we allocate a buffer from the pool and check the alignment.  The
alignment should be on a page boundary, based on LPFC_SLI3_BG_ENABLED in
sli3_options, for both cases.

Fix by explicitly tracking sli4 vs sli3 and BG options.  The result is that
phba->cfg_sg_dma_buf_size is now set correctly for SLI-4.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-12 21:55:24 -04:00
James Smart
bd3061bab3 scsi: lpfc: Streamline NVME Targe6t WQE setup
To reduce latency when initializing WQE content, created templates for the
most common wqes. This reduces the number of operations taken to set the
content. It's not a lot of speed up, but every bit helps.

This patch updates the NVME target path.

[mkp: fixed typo]

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-12 21:55:23 -04:00
James Smart
5fd1108517 scsi: lpfc: Streamline NVME Initiator WQE setup
To reduce latency when initializing WQE content, create templates for the
most common wqes. This reduces the number of operations taken to set the
content. It's not a lot of speed up, but every bit helps.

This patch updates the NVME initiator path.

[mkp: fixed typo]

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-03-12 21:55:23 -04:00
James Smart
4e565cf041 scsi: lpfc: Work around NVME cmd iu SGL type
The hardware offload for NVME commands was created when the
FC-NVME standard was setting SGL Descriptor Type to SGL Data
Block Descriptor (0h) and SGL Descriptor Sub Type to Address (0h).

A late change in NVMe-over-Fabrics obsoleted these values, creating
a transport SGL descriptor type with new values to go into these
fields.

For initial hardware support, in order to be compliant to the spec,
use host-supplied cmd IU buffers instead of the adapter generated
values. Later hardware will correct this.

Add a module parameter to override this offload disablement if looking
for lowest latency. This is reasonable as nothing in FC-NVME uses
the SQE SGL values.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-22 20:39:29 -05:00
James Smart
0bc2b7c531 scsi: lpfc: Add embedded data pointers for enhanced performance
The current driver isn't taking advantage of a performance hint whereby
the initial data buffer descriptor can be placed in the WQE as well as
the SGL.

Add the logic to detect support for the feature and to use it when
supported.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-22 20:39:29 -05:00
James Smart
1feb8204a1 scsi: lpfc: Enable fw download on if_type=6 devices
Current code is very explicit in what it allows to be downloaded.
The driver checking prevented G7 firmware download. The driver
checking is unnecessary as the device will validate what it receives.

Revise the firmware download interface checking.
Added a little debug support in case there is still a failure.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-22 20:39:29 -05:00
James Smart
7365f6fdbb scsi: lpfc: Add if_type=6 support for cycling valid bits
Traditional SLI4 required the driver to clear Valid bits on
EQEs and CQEs after consuming them.

The new if_type=6 hardware will cycle the value for what is
valid on each queue itteration. The driver no longer has to
touch the valid bits. This also means all the cpu cache
dirtying and perhaps flush/refill's done by the hardware
in accessing the EQ/CQ elements is eliminated.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-22 20:39:29 -05:00
James Smart
fbd8a6ba65 scsi: lpfc: Add 64G link speed support
The G7 adapter supports 64G link speeds. Add support to the driver.

In addition, a small cleanup to replace the odd bitmap logic with
a switch case.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-22 20:39:29 -05:00
James Smart
c238b9b6ea scsi: lpfc: Add PCI Ids for if_type=6 hardware
Add PCI ids for the new G7 adapter

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-22 20:39:29 -05:00
James Smart
1351e69fc6 scsi: lpfc: Add push-to-adapter support to sli4
New if_type=6 adapters support an additional BAR that provides
apertures to allow direct WQE to adapter push support - termed
Direct Packet Push (DPP). WQ creation differs slightly to ask for
a WQ to be DPP-ized. When submitting a WQE to a DPP WQ, it is
submitted to the host memory for the WQ normally, but is also
written by the host cpu directly to a BAR aperture.  Write buffer
coalescing in hardware is (hopefully) turned on, enabling single
pci write operation support. The doorbell is thing rung to indicate
the WQE is available and was pushed to the aperture.

This patch:
- Updates the WQ Create commands for the DPP options
- Adds the bar mapping for if_type=6 DPP bar
- Adds the WQE pushing to the DDP aperture received from WQ create
- Adds a new module parameter to disable DPP operation if desired.
  Default is enabled.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-22 20:39:29 -05:00
James Smart
27d6ac0a6e scsi: lpfc: Add SLI-4 if_type=6 support to the code base
New hardware supports a SLI-4 interface, but with a new if_type
variant of 6.

If_type=6 has a different PCI BAR map, separate EQ/CQ doorbells,
and some changes in doorbell formats.

Add the changes for the if_type into headers, adapter initialization
and control flows. Add new eq and cq handlers.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-22 20:39:28 -05:00
James Smart
9dd35425a5 scsi: lpfc: Rework sli4 doorbell infrastructure
Up until now, all SLI-4 devices had the same doorbells at the same
bar locations. With newer hardware, there are now independent EQ and
CQ doorbells and the bar locations differ.

Prepare the code for new hardware by separating the eq/cq doorbell into
separate components. The components can be set based on if_type.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-22 20:39:28 -05:00
James Smart
b71413dd01 scsi: lpfc: Rework lpfc to allow different sli4 cq and eq handlers
Up until now, an SLI-4 device had no variance in the way it handled
its EQs and CQs. With newer hardware, there are now differences in
doorbells and some differences in how entries are valid.

Prepare the code for new hardware by creating a sli4-based callout
table that can be set based on if_type.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-22 20:39:28 -05:00
James Smart
128bddacc4 scsi: lpfc: Update 11.4.0.7 modified files for 2018 Copyright
Updated Copyright in files updated 11.4.0.7

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-12 11:43:24 -05:00
James Smart
20aefac3a9 scsi: lpfc: Validate adapter support for SRIU option
When using the special option to suppress the response iu, ensure the
adapter fully supports the feature by checking feature flags from the
adapter and validating the support when formatting the WQE.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-12 11:43:24 -05:00
James Smart
c1dd9111b7 scsi: lpfc: Fix SCSI io host reset causing kernel crash
During SCSI error handling escalation to host reset, the SCSI io
routines were moved off the txcmplq, but the individual io's ON_CMPLQ
flag wasn't cleared.  Thus, a background thread saw the io and attempted
to access it as if on the txcmplq.

Clear the flag upon removal.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-12 11:43:23 -05:00
James Smart
281d61902f scsi: lpfc: move placement of target destroy on driver detach
Ensure nvme localports/targetports are torn down before dismantling the
adapter sli interface on driver detachment.  This aids leaving
interfaces live while nvme may be making callbacks to abort it.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-12 11:43:22 -05:00
James Smart
c176ffa084 scsi: lpfc: Increase CQ and WQ sizes for SCSI
Increased CQ and WQ sizes for SCSI FCP, matching those used for NVMe
development.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-02-12 11:43:22 -05:00
James Smart
a51e41b671 scsi: lpfc: Increase SCSI CQ and WQ sizes.
Increased the sizes of the SCSI WQ's and CQ's so that SCSI operation is
similar to that used by NVME. However, size increase restricted only to
those newer adapters that can support the larger WQE size, thus bigger
queue sizes.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-20 21:11:46 -05:00
James Smart
cf1a1d3e2d scsi: lpfc: Fix random heartbeat timeouts during heavy IO
NVME targets appear to randomly disconnect from the initiator when
running heavy IO.

The error is due to the host aggregate (across all controllers) io load
was beyond the maximum exchange count for nvme on the adapter. The
driver was properly returning a resource busy status, but the io load
was so great heartbeat commands would be bounced and not have a
successful retry within the fuzz amount for the nvme heartbeat (yes, a
very high io load!). Thus the target was terminating the controller due
to a keep alive failure.

Resolve by reserving a few exchanges (by counters) which can be used
when the adapter is out of normal exchanges and the command is a NVME
heartbeat command. As counters are used, while the reserved command is
outstanding, as soon as any other exchange completes, the counters are
adjusted and the reserved count is replenished. The heartbeat completes
execution in a normal fashion.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-20 21:11:44 -05:00
James Smart
81e6a63728 scsi: lpfc: small sg cnt cleanup
The logic for sg_seg_cnt is a bit convoluted. This patch tries to clean
up a couple of areas, especially around the +2 and +1 logic.

This patch:

- Cleans up the lpfc_sg_seg_cnt attribute to specify a real minimum
  rather than making the minimum be whatever the default is.

- Removes the hardcoding of +2 (for the number of elements we use in a
  sgl for cmd iu and rsp iu) and +1 (an additional entry to compensate
  for nvme's reduction of io size based on a possible partial page)
  logic in sg list initialization. In the case where the +1 logic is
  referenced in host and target io checks, use the values set in the
  transport template as that value was properly set.

There can certainly be more done in this area and it will be addressed
in combined host/target driver effort.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04 20:32:55 -05:00
James Smart
c3725bdcdf scsi: lpfc: Fix driver handling of nvme resources during unload
During driver unload, the driver may crash due to NULL pointers.  The
NULL pointers were due to the driver not protecting itself sufficiently
during some of the teardown paths.  Additionally, the driver was not
waiting for and cleanup up nvme io resources. As such, the driver wasn't
making the callbacks to the transport, stalling the transports
association teardown.

This patch waits for io clean up before tearding down and adds checks
for possible NULL pointers.

Cc: <stable@vger.kernel.org> # 4.12+
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04 20:32:55 -05:00
James Smart
bcb24f6577 scsi: lpfc: Adjust default value of lpfc_nvmet_mrq
The current default for async hw receive queues is 1, which presents
issues under heavy load as number of queues influence the available
async receive buffer limits.

Raise the default to the either the current hw limit (16) or the number
of hw qs configured (io channel value).

Revise the attribute definition for mrq to better reflect what we do for
hw queues. E.g. 0 means default to optimal (# of cpus), non-zero
specifies a specific limit. Before this change, mrq=0 meant target mode
was disabled. As 0 now has a different meaning, rework the if tests to
use the better nvmet_support check.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04 20:32:54 -05:00
James Smart
e4b9794efd scsi: lpfc: Fix crash after bad bar setup on driver attachment
In test cases where an instance of the driver is detached and
reattached, the driver will crash on reattachment. There is a compound
if statement that will skip over the bar setup if the pci_resource_start
call is not successful. The driver erroneously returns success to its
bar setup in this scenario even though the bars aren't properly
configured.

Rework the offending code segment for proper initialization steps.  If
the pci_resource_start call fails, -ENOMEM is now returned.

Sample stack:

rport-5:0-10: blocked FC remote port time out: removing rport
BUG: unable to handle kernel NULL pointer dereference at           (null)
... lpfc_sli4_wait_bmbx_ready+0x32/0x70 [lpfc]
...
...  RIP: 0010:...  ... lpfc_sli4_wait_bmbx_ready+0x32/0x70 [lpfc]
 Call Trace:
  ... lpfc_sli4_post_sync_mbox+0x106/0x4d0 [lpfc]
  ... ? __alloc_pages_nodemask+0x176/0x420
  ... ? __kmalloc+0x2e/0x230
  ... lpfc_sli_issue_mbox_s4+0x533/0x720 [lpfc]
  ... ? mempool_alloc+0x69/0x170
  ... ? dma_generic_alloc_coherent+0x8f/0x140
  ... lpfc_sli_issue_mbox+0xf/0x20 [lpfc]
  ... lpfc_sli4_driver_resource_setup+0xa6f/0x1130 [lpfc]
  ... ? lpfc_pci_probe_one+0x23e/0x16f0 [lpfc]
  ... lpfc_pci_probe_one+0x445/0x16f0 [lpfc]
  ... local_pci_probe+0x45/0xa0
  ... work_for_cpu_fn+0x14/0x20
  ... process_one_work+0x17a/0x440

Cc: <stable@vger.kernel.org> # 4.12+
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04 20:32:54 -05:00
James Smart
8a5ca109a3 scsi: lpfc: Handle XRI_ABORTED_CQE in soft IRQ
XRI_ABORTED_CQE completions were not being handled in the fast path.
They were being queued and deferred to the lpfc worker thread for
processing. This is an artifact of the driver design prior to moving
queue processing out of the isr and into a workq element. Now that queue
processing is already in a deferred context, remove this artifact and
process them directly.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04 20:32:53 -05:00
James Smart
81b96eda5f scsi: lpfc: Expand WQE capability of every NVME hardware queue
Hardware queues are a fast staging area to push commands into the
adapter.  The adapter should drain them extremely quickly. However,
under heavy io load, the host cpu is pushing commands faster than the
drain rate of the adapter causing the driver to resource busy commands.

Enlarge the hardware queue (wq & cq) to support a larger number of queue
entries (4x the prior size) before backpressure. Enlarging the queue
requires larger contiguous buffers (16k) per logical page for the
hardware. This changed calling sequences that were expecting 4K page
sizes that now must pass a parameter with the page sizes. It also
required use of a new version of an adapter command that can vary the
page size values.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-12-04 20:32:53 -05:00
Linus Torvalds
670ffccb2f SCSI misc on 20171114
This is mostly updates of the usual suspects: lpfc, qla2xxx, hisi_sas,
 megaraid_sas, pm80xx, mpt3sas, be2iscsi, hpsa. and a host of minor
 updates.
 
 There's no major behaviour change or additions to the core in all of
 this, so the potential for regressions should be small (biggest
 potential being in the scsi error handler changes).
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJaCxtCAAoJEAVr7HOZEZN4d9EQAI+OHP6ss6zjKKC21c9jNPcH
 NhLrNv37gHg/LA2VXeUEL9RGUjCGLIUrI4HsrxzkFAMLKP4TkshMs8/2RvczY+Sa
 VpayPqVybEKLIS6ipQyM1SLIQff2nvtDVcN/T+8z1lkk45TrbA6ZGuwUwd2aJyEA
 2V2wtg51ObnL0Nr9QPPll0JrtL1AnCZyRlu9XrwTZuuSBZwk93opIuuvbZm/3dVg
 Ir4GSS4Y+PuHIfu4cxqdsPMdzRdY9I2me1YiE4jeFSn1/VTAjL4HBz7fO9eITT42
 VhXSpDz1XvFsa9dJ0ubkqoALpJzCfOcBw+EuGvSydLEvOBoEVwMccdfaD9lT1zc5
 L9e1Z5qqJoq7hTA6xTXCYfWG73I9HYvljtmc8yudKHhADOdnSTUXhaO6uBF0RNqD
 OxPSA1RZwRx3c6lDOcK6BTtvLAkTEuYKdrWSKJi0w+QXJAyQ6etqbmsKpmPdRim7
 Z4ZSpJFro2gyo9gcdJO0ykTG+z3U7Z/ay1sNgnuprsv+eU/QjUdlAPl18o79EkRf
 H54zZggZ4wC6q/cFVVt4Vx+V+oqIeu38s7NDXS9UltLoTZPm2EzDW6pXd/38Z4Tf
 a1oBAUET8kYLC90P8sVZxUIHZjITlpgDbyE2Lq00PMYXhk8S4IxF0aMN5RvVqzUv
 +7N2HrHkSSgG1nhw1t+E
 =3O85
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly updates of the usual suspects: lpfc, qla2xxx, hisi_sas,
  megaraid_sas, pm80xx, mpt3sas, be2iscsi, hpsa. and a host of minor
  updates.

  There's no major behaviour change or additions to the core in all of
  this, so the potential for regressions should be small (biggest
  potential being in the scsi error handler changes)"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (203 commits)
  scsi: lpfc: Fix hard lock up NMI in els timeout handling.
  scsi: mpt3sas: remove a stray KERN_INFO
  scsi: mpt3sas: cleanup _scsih_pcie_enumeration_event()
  scsi: aacraid: use timespec64 instead of timeval
  scsi: scsi_transport_fc: add 64GBIT and 128GBIT port speed definitions
  scsi: qla2xxx: Suppress a kernel complaint in qla_init_base_qpair()
  scsi: mpt3sas: fix dma_addr_t casts
  scsi: be2iscsi: Use kasprintf
  scsi: storvsc: Avoid excessive host scan on controller change
  scsi: lpfc: fix kzalloc-simple.cocci warnings
  scsi: mpt3sas: Update mpt3sas driver version.
  scsi: mpt3sas: Fix sparse warnings
  scsi: mpt3sas: Fix nvme drives checking for tlr.
  scsi: mpt3sas: NVMe drive support for BTDHMAPPING ioctl command and log info
  scsi: mpt3sas: Add-Task-management-debug-info-for-NVMe-drives.
  scsi: mpt3sas: scan and add nvme device after controller reset
  scsi: mpt3sas: Set NVMe device queue depth as 128
  scsi: mpt3sas: Handle NVMe PCIe device related events generated from firmware.
  scsi: mpt3sas: API's to remove nvme drive from sml
  scsi: mpt3sas: API 's to support NVMe drive addition to SML
  ...
2017-11-14 16:23:44 -08:00
Kees Cook
f22eb4d31c scsi: lpfc: Convert timers to use timer_setup()
In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: James Smart <james.smart@broadcom.com>
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-11-01 11:27:07 -07:00
Dick Kennedy
f485c18db2 scsi: lpfc: Move CQ processing to a soft IRQ
Under heavy target nvme load duration, the lpfc irq handler is
encountering cpu lockup warnings.

Convert the driver to a shortened ISR handler which identifies the
interrupting condition then schedules a workq thread to process the
completion queue the interrupt was for. This moves all the real work
into the workq element.

As nvmet_fc upcalls are no longer in ISR context, don't set the feature
flags

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:36 -04:00
Dick Kennedy
1234a6d54f scsi: lpfc: Fix crash receiving ELS while detaching driver
The driver crashes when attempting to use a freed ndpl pointer.

The pci_remove_one handler runs on a separate kernel thread. The order
of the removal is starting by freeing all of the ndlps and then
disabling interrupts. In between these two events the driver can still
receive an ELS and process it. When it tries to use the ndlp pointer
will be NULL

Change the order of the pci_remove_one vs disable interrupts so that
interrupts are disabled before the ndlp's are freed.

Cc: <stable@vger.kernel.org> # 4.12+
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:33 -04:00
Dick Kennedy
1901762f2c scsi: lpfc: fix pci hot plug crash in timer management routines
During pci hot plug, the kernel crashes in timer management code.

The sli4 remove_one handler is not stoping the timers as it starts to
remove the port so that it can be swapped.

Fix: Stop the timers early in the handler routine.

Note: Fix in SLI-4 only. SLI-3 already stopped the timers properly.

Cc: <stable@vger.kernel.org> # 4.12+
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-10-02 22:46:32 -04:00
Stefano Brivio
5c756065e4 scsi: lpfc: Don't return internal MBXERR_ERROR code from probe function
Internal error codes happen to be positive, thus the PCI driver core
won't treat them as failure, but we do. This would cause a crash later
on as lpfc_pci_remove_one() is called (e.g. as shutdown function).

Fixes: 6d368e5321 ("[SCSI] lpfc 8.3.24: Add resource extent support")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-09-15 21:15:32 -04:00
Maurizio Lombardi
286871a666 scsi: lpfc: fix "integer constant too large" error on 32bit archs
cc1: warnings being treated as errors
drivers/scsi/lpfc/lpfc_init.c: In function 'lpfc_get_wwpn':
drivers/scsi/lpfc/lpfc_init.c:3253: error: integer constant is too large for 'long' type

Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:44 -04:00
James Smart
44fd7fe3dd scsi: lpfc: Add Buffer to Buffer credit recovery support
Add Buffer to buffer credit recovery support to the driver.  This is a
negotiated feature with the peer that allows for both sides to detect
dropped RRDY's and FC Frames and recover credit.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:43 -04:00
Dick Kennedy
a145fda381 scsi: lpfc: Fix nvme target failure after 2nd adapter reset
The nonrecovery occurred because the lpfc nvme initiator function did
not reestablish its localport creation with the nvme host transport in
lpfc_oneline.  Because of that, an NVME rport binding could not take
place.

Corrected by recreating the localport in the adapter reset recovery
routine.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:42 -04:00
Dick Kennedy
66d7ce93a0 scsi: lpfc: Fix MRQ > 1 context list handling
Various oops including cpu LOCKUPs were seen.

For asynchronously received ius where the driver must assign exchange
resources, the resources were on a single get (free) list and put list
(finished, waiting to be put on get list). As all cpus are sharing the
lists, an interrupt for a receive frame may have to wait for all the
other cpus to place their done work onto the put list before it can
acquire the lock to pull from the list.

Fix by breaking the resource lists into per-cpu lists or at least more
than 1 list with cpu's sharing the lists). A cpu would allocate from the
free list for its own cpu, and put its done work on the its own put list
- avoiding the contention. As cpu load may vary, when empty, a cpu may
grab from another cpu, thereby changing resource distribution.  But
searching for a resource only occurs on 1 or a few cpus until a single
resource can be allocated. if the condition reoccurs, it starts looking
at a different cpu.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:41 -04:00
Dick Kennedy
4b40d02b8b scsi: lpfc: Fix crash in lpfc nvmet when fc port is reset
In adapter reset tests, an oops was seen with a NULL pointer in
lpfc_free_rq_buffer+0x20/0x60

The driver is failing to properly repost the nvmet sgl list when
recovering from the reset. Thus the driver eventually trys to walk an
errant buffer list.

Corrected the sgl buffer recovery as well as strengthening the
initialization of the bufferlist.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-24 22:29:38 -04:00
Romain Perier
771db5c0e3 scsi: lpfc: Replace PCI pool old API
The PCI pool API is deprecated. This commit replaces the PCI pool old
API by the appropriate function with the DMA pool API. It also updates
some comments, accordingly.

Signed-off-by: Romain Perier <romain.perier@collabora.com>
Reviewed-by: Peter Senna Tschudin <peter.senna@collabora.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-08-07 14:04:01 -04:00
James Smart
cb45e5295a scsi: lpfc: fix refcount error on node list
A change in remote port removal introduced a spurious put which can
cause a premature structure teardown. The affects were most notable when
the driver attempted to unload as a null pointer would be hit.

Fix by removing the unnecessary put.

Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-26 15:01:01 -04:00
James Smart
966bb5b711 scsi: lpfc: Break up IO ctx list into a separate get and put list
Since unsol rcv ISR and command cmpl ISR both access/lock this list,
separate get/put lists will reduce contention.

Replaced
struct list_head lpfc_nvmet_ctx_list;
with
struct list_head lpfc_nvmet_ctx_get_list;
struct list_head lpfc_nvmet_ctx_put_list;
and all correpsonding locks and counters.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-19 21:40:10 -04:00
James Smart
56bc802842 scsi: lpfc: Vport creation is failing with "Link Down" error
Vport creation fails for SLI-3 adapters.

Mailbox submission fails because mailbox interrupt is disabled. Mailbox
interrupt is disabled during port reset.

Do reset only for physical port.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-19 21:39:37 -04:00
James Smart
0cf07f84dd scsi: lpfc: Add auto EQ delay logic
Administrator intervention is currently required to get good numbers
when switching from running latency tests to IOPS tests.

The configured interrupt coalescing values will greatly effect the
results of these tests.  Currently, the driver has a single coalescing
value set by values of the module attribute.  This patch changes the
driver to support auto-configuration of the coalescing value based on
the total number of outstanding IOs and average number of CQEs processed
per interrupt for an EQ.  Values are checked every 5 seconds.

The driver defaults to the automatic selection. Automatic selection can
be disabled by the new lpfc_auto_imax module_parameter.

Older hardware can only change interrupt coalescing by mailbox
command. Newer hardware supports change via a register. The patch
support both.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12 21:37:31 -04:00
James Smart
b83d005e63 scsi: lpfc: Fix System panic after loading the driver
System panic with general protection fault during driver load

The driver uses a static array sli4_hba.handler_name to store the irq
handler names. If the io_channel_irqs exceeds the pre-allocated size
(32+1), then the driver will overwrite other fields of sli4_hba.

Fix: Dynamically allocate handler_name.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12 21:37:31 -04:00
James Smart
2cee780800 scsi: lpfc: Fix counters so outstandng NVME IO count is accurate
NVME FC counters don't reflect actual results

Since counters are not atomic, or protected by a lock, the values often
get screwed up.

Make them atomic, like NVMET.  Fix up sysfs and debugfs display
accordingly Added Outstanding IOs to stats display

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-06-12 21:37:31 -04:00
James Smart
ae9e28f36a scsi: lpfc: Add MDS Diagnostic support.
Added code to support Cisco MDS loopback diagnostic. The diagnostics run
various loopbacks including one which loops-back frame through the
driver.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:24:47 -04:00
James Smart
a8cf5dfeb4 scsi: lpfc: Added recovery logic for running out of NVMET IO context resources
Previous logic would just drop the IO.

Added logic to queue the IO to wait for an IO context resource from an
IO thats already in progress.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:22:22 -04:00
James Smart
6c621a2229 scsi: lpfc: Separate NVMET RQ buffer posting from IO resources SGL/iocbq/context
Currently IO resources are mapped 1 to 1 with RQ buffers posted

Added logic to separate RQE buffers from IO op resources
(sgl/iocbq/context). During initialization, the driver will determine
how many SGLs it will allocate for NVMET (based on what the firmware
reports) and associate a NVMET IOCBq and NVMET context structure with
each one.

Now that hdr/data buffers are immediately reposted back to the RQ, 512
RQEs for each MRQ is sufficient. Also, since NVMET data buffers are now
128 bytes, lpfc_nvmet_mrq_post is not necessary anymore as we will
always post the max (512) buffers per NVMET MRQ.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:21:47 -04:00
James Smart
3c603be979 scsi: lpfc: Separate NVMET data buffer pool fir ELS/CT.
Using 2048 byte buffer and onle 128 bytes is needed.

Create nee LFPC_NVMET_DATA_BUF_SIZE define to use for NVMET RQ/MRQs.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:21:17 -04:00
James Smart
61f3d4bf4f scsi: lpfc: Fix nvmet RQ resource needs for large block writes.
Large block writes to the nvme target were failing because the default
number of RQs posted was insufficient.

Expand the NVMET RQs to 2048 RQEs and ensure a minimum of 512 RQEs are
posted, no matter how many MRQs are configured.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:18:41 -04:00
James Smart
845d9e8df2 scsi: lpfc: Fix used-RPI accounting problem.
With 255 vports created a link trasition can casue a crash.

When going through discovery after a link bounce the driver is using
rpis before the cmd FCOE_POST_HDR_TEMPLATES completes. By doing that the
next rpi bumps the rpi range out of the boundary.

The fix it to increment the next_rpi only when the
FCOE_POST_HDR_TEMPLATE succeeds.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-16 21:17:28 -04:00
Colin Ian King
019c0d66f6 scsi: lpfc: ensure els_wq is being checked before destroying it
I believe there is a typo on the wq destroy of els_wq, currently the
driver is checking if els_cq is not null and I think this should be a
check on els_wq instead.

Detected by CoverityScan, CID#1411629 ("Copy-paste error")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-08 22:10:25 -04:00
James Smart
4492b739c9 scsi: lpfc: Fix panic on BFS configuration
To select the appropriate shost template, the driver is issuing a
mailbox command to retrieve the wwn. Turns out the sending of the
command precedes the reset of the function.  On SLI-4 adapters, this is
inconsequential as the mailbox command location is specified by dma via
the BMBX register. However, on SLI-3 adapters, the location of the
mailbox command submission area changes. When the function is first
powered on or reset, the cmd is submitted via PCI bar memory. Later the
driver changes the function config to use host memory and DMA. The
request to start a mailbox command is the same, a simple doorbell write,
regardless of submission area.  So.. if there has not been a boot driver
run against the adapter, the mailbox command works as defaults are
ok. But, if the boot driver has configured the card and, and if no
platform pci function/slot reset occurs as the os starts, the mailbox
command will fail. The SLI-3 device will use the stale boot driver dma
location. This can cause PCI eeh errors.

Fix is to reset the sli-3 function before sending the mailbox command,
thus synchronizing the function/driver on mailbox location.

Note: The fix uses routines that are typically invoked later in the call
flow to reset the sli-3 device. The issue in using those routines is
that the normal (non-fix) flow does additional initialization, namely
the allocation of the pport structure. So, rather than significantly
reworking the initialization flow so that the pport is alloc'd first,
pointer checks are added to work around it. Checks are limited to the
routines invoked by a sli-3 adapter (s3 routines) as this fix/early call
is only invoked on a sli3 adapter. Nothing changes post the
fix. Subsequent initialization, and another adapter reset, still occur -
both on sli-3 and sli-4 adapters.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Fixes: 96418b5e2c ("scsi: lpfc: Fix eh_deadline setting for sli3 adapters.")
Cc: stable@vger.kernel.org # v4.11+
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-05-08 21:24:34 -04:00
James Smart
7e04e21afa Fix Express lane queue creation.
The older sli4 adapters only supported the 64 byte WQE entry size.
The new adapter (fw) support both 64 and 128 byte WQE entry sizies.
The Express lane WQ was not being created with the 128 byte WQE sizes
when it was supported.

Not having the right WQE size created for the express lane work queue
caused the the firmware to overwrite the lun indentifier in the FCP header.

This patch correctly creates the express lane work queue with the
supported size.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2017-04-24 09:25:49 +02:00
James Smart
86c6737963 Update ABORT processing for NVMET.
The driver with nvme had this routine stubbed.

Right now XRI_ABORTED_CQE is not handled and the FC NVMET
Transport has a new API for the driver.

Missing code path, new NVME abort API
Update ABORT processing for NVMET

There are 3 new FC NVMET Transport API/ template routines for NVMET:

lpfc_nvmet_xmt_fcp_release
This NVMET template callback routine called to release context
associated with an IO This routine is ALWAYS called last, even
if the IO was aborted or completed in error.

lpfc_nvmet_xmt_fcp_abort
This NVMET template callback routine called to abort an exchange that
has an IO in progress

nvmet_fc_rcv_fcp_req
When the lpfc driver receives an ABTS, this NVME FC transport layer
callback routine is called. For this case there are 2 paths thru the
driver: the driver either has an outstanding exchange / context for the
XRI to be aborted or not.  If not, a BA_RJT is issued otherwise a BA_ACC

NVMET Driver abort paths:

There are 2 paths for aborting an IO. The first one is we receive an IO and
decide not to process it because of lack of resources. An unsolicated ABTS
is immediately sent back to the initiator as a response.
lpfc_nvmet_unsol_fcp_buffer
            lpfc_nvmet_unsol_issue_abort  (XMIT_SEQUENCE_WQE)

The second one is we sent the IO up to the NVMET transport layer to
process, and for some reason the NVME Transport layer decided to abort the
IO before it completes all its phases. For this case there are 2 paths
thru the driver:
the driver either has an outstanding TSEND/TRECEIVE/TRSP WQE or no
outstanding WQEs are present for the exchange / context.
lpfc_nvmet_xmt_fcp_abort
    if (LPFC_NVMET_IO_INP)
        lpfc_nvmet_sol_fcp_issue_abort  (ABORT_WQE)
                lpfc_nvmet_sol_fcp_abort_cmp
    else
        lpfc_nvmet_unsol_fcp_issue_abort
                lpfc_nvmet_unsol_issue_abort  (XMIT_SEQUENCE_WQE)
                        lpfc_nvmet_unsol_fcp_abort_cmp

Context flags:
LPFC_NVMET_IOP - his flag signifies an IO is in progress on the exchange.
LPFC_NVMET_XBUSY  - this flag indicates the IO completed but the firmware
is still busy with the corresponding exchange. The exchange should not be
reused until after a XRI_ABORTED_CQE is received for that exchange.
LPFC_NVMET_ABORT_OP - this flag signifies an ABORT_WQE was issued on the
exchange.
LPFC_NVMET_CTX_RLS  - this flag signifies a context free was requested,
but we are deferring it due to an XBUSY or ABORT in progress.

A ctxlock is added to the context structure that is used whenever these
flags are set/read  within the context of an IO.
The LPFC_NVMET_CTX_RLS flag is only set in the defer_relase routine when
the transport has resolved all IO associated with the buffer. The flag is
cleared when the CTX is associated with a new IO.

An exchange can has both an LPFC_NVMET_XBUSY and a LPFC_NVMET_ABORT_OP
condition active simultaneously. Both conditions must complete before the
exchange is freed.
When the abort callback (lpfc_nvmet_xmt_fcp_abort) is envoked:
If there is an outstanding IO, the driver will issue an ABORT_WQE. This
should result in 3 completions for the exchange:
1) IO cmpl with XB bit set
2) Abort WQE cmpl
3) XRI_ABORTED_CQE cmpl
For this scenerio, after completion #1, the NVMET Transport IO rsp
callback is called.  After completion #2, no action is taken with respect
to the exchange / context.  After completion #3, the exchange context is
free for re-use on another IO.

If there is no outstanding activity on the exchange, the driver will send a
ABTS to the Initiator. Upon completion of this WQE, the exchange / context
is freed for re-use on another IO.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2017-04-24 09:25:49 +02:00
James Smart
aeb3c8170b Add Fabric assigned WWN support.
Adding support for Fabric assigned WWPN and WWNN.

Firmware sends first FLOGI to fabric with vendor version changes.
On link up driver gets updated service parameter with FAWWN assigned port
name.  Driver sends 2nd FLOGI with updated fawwpn and modifies the
vport->fc_portname in driver.

Note:
Soft wwpn will not be allowed when fawwpn is enabled.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2017-04-24 09:25:49 +02:00
James Smart
9d3d340d19 Fix crash after issuing lip reset
When RPI is not available, driver sends WQE with invalid RPI value and
rejected by HBA.
lpfc 0000:82:00.3: 1:3154 BLS ABORT RSP failed, data:  x3/xa0320008
and
lpfc :2753 PLOGI failure DID:FFFFFA Status:x3/xa0240008

In this case, driver accesses rpi_ids array out of bounds.

Fix:
Check return value of lpfc_sli4_alloc_rpi(). Do not allocate
lpfc_nodelist entry if RPI is not available.

When RPI is not available, we will get discovery timeouts and
command drops for some of the vports as seen below.

lpfc :0273 Unexpected discovery timeout, vport State x0
lpfc :0230 Unexpected timeout, hba link state x5
lpfc :0111 Dropping received ELS cmd Data: x0 xc90c55 x0

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2017-04-24 09:25:49 +02:00
James Smart
2b7824d00d Fix driver load issues when MRQ=8
The symptom is that the driver will fail to login to the fabric.
The reason is because it is out of iocb resources.

There is a one to one relationship between MRQs
(receive buffers for NVMET-FC) and iocbs and the default number of
IOCBs was not accounting for the number of MRQs that were being created.

This fix aligns the number of MRQ resources with the total resources so
that it can handle fabric events when needed.

Also the initialization of ctxlock to be on FCP commands, NOT LS commands.
And modified log messages so that the log output can be correlated with
the analyzer trace.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2017-04-24 09:25:49 +02:00
James Smart
d1f525aaa4 Fix driver unload/reload operation.
There are couple of different load/unload issues fixed with this patch.
One of the issues was reported by Junichi Nomura, a patch was submitted
by Johannes Thumsrhirn which did fix one of the problems but the fix in
this patch separates the pring free from the queue free and does not set
the parameter passed in to NULL.

issues:
(1) driver could not be unloaded and reloaded without some Oops or
 Panic occurring.
(2) The driver was panicking because of a corruption in the Memory
Manager when the iocb list was getting allocated.

Root cause for the memory corruption was a double free of the Work Queue
ring pointer memory - Freed once in the lpfc_sli4_queue_free when the CQ
was destroyed and again in lpfc_sli4_queue_free when the WQ was destroyed.

The pring free and the queue free were separated, the pring free was moved
to the wq destroy routine because it a better fit logically to delete the
ring with the wq.

The checkpatch flagged several alignmenet issues that were also corrected
with this patch.

The mboxq was never initialed correctly before it was used by the driver
this patch corrects that issue.

Reported-by: Junichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Tested-by: Junichi Nomura <j-nomura@ce.jp.nec.com>
2017-04-24 09:25:48 +02:00
James Smart
e8c0a77935 Add debug messages for nvme/fcp resource allocation.
The xri resources are split into pools for NVME and FCP IO when NVME is
enabled. There was not message in the log that identified this allocation.

Added debug message to log XRI split.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
2017-04-24 09:25:48 +02:00
James Smart
7d7080335f scsi: lpfc: Finalize Kconfig options for nvme
Reviewing the result of what was just added for Kconfig, we made a poor
choice. It worked well for full kernel builds, but not so much for how
it would be deployed on a distro.

Here's the final result:
- lpfc will compile in NVME initiator and/or NVME target support based
  on whether the kernel has the corresponding subsystem support.
  Kconfig is not used to drive this specifically for lpfc.
- There is a module parameter, lpfc_enable_fc4_type, that indicates
  whether the ports will do FCP-only or FCP & NVME (NVME-only not yet
  possible due to dependency on fc transport). As FCP & NVME divvys up
  exchange resources, and given NVME will not be often initially, the
  default is changed to FCP only.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-03-15 13:37:18 -04:00
Anton Blanchard
85e8a23936 scsi: lpfc: Add shutdown method for kexec
We see lpfc devices regularly fail during kexec. Fix this by adding a
shutdown method which mirrors the remove method.

Cc: <stable@vger.kernel.org>
Signed-off-by: Anton Blanchard <anton@samba.org>
Reviewed-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Tested-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-03-07 20:29:10 -05:00
James Smart
ba3bd6e2a9 scsi: lpfc: remove dead sli3 nvme code
Remove nvme teardown calls that should not be there on sli3 devices

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-03-06 23:04:22 -05:00
James Smart
43140ca68d scsi: lpfc: Rename LPFC_MAX_EQ_DELAY to LPFC_MAX_EQ_DELAY_EQID_CNT
Without apriori understanding of what the define is, the name gives
a very different impression of what it is (a max delay value
for an EQ).  Rename the define so it reflects what it is: the number
of EQ IDs that can be set in one instance of the MODIFY_EQ_DELAY
mbx command.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-03-06 23:04:22 -05:00
James Smart
96418b5e2c scsi: lpfc: Fix eh_deadline setting for sli3 adapters.
A previous change unilaterally removed the hba reset entry point
from the sli3 host template. This was done to allow tape devices
being used for back up from being removed. Why was this done ?
When there was non-responding device on the fabric, the error
escalation policy would escalate to the reset handler. When the
reset handler was called, it would reset the adapter, dropping
link, thus logging out and terminating all i/o's - on any target.
If there was a tape device on the same adapter that wasn't in
error, it would kill the tape i/o's, effectively killing the
tape device state.  With the reset point removed, the adapter
reset avoided the fabric logout, allowing the other devices to
continue to operate unaffected. A hack - yes. Hint: we really
need a transport I_T nexus reset callback added to the eh process
(in between the SCSI target reset and hba reset points), so a
fc logout could occur to the one bad target only and stop the error
escalation process.

This patch commonizes the approach so it can be used for sli3 and sli4
adapters, but mandates the admin, via module parameter, specifically
identify which adapters the resets are to be removed for. Additionally,
bus_reset, which sends Target Reset TMFs to all targets, is also removed
from the template as it too has the same effect as the adapter reset.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Tested-by:   Laurence Oberman <loberman@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-03-06 23:04:22 -05:00
James Smart
318083ad92 scsi: lpfc: add NVME exchange aborts
previous code did little more than log a message.

This patch adds abort path support, modeled after the SCSI code paths.
Currently addresses only the initiator path. Target path under
development, but stubbed out.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-03-06 23:04:22 -05:00
Tomas Jasek
33cc559a81 scsi: lpfc: replace init_timer by setup_timer
This patch shortens every init_timer in lpfc module followed by function
and data assignment using setup_timer.  This is purely cleanup patch, it
does not add new functionality nor remove any existing functionality.

An init_timer call in this form:

    init_timer(&vport->fc_disctmo);
    vport->fc_disctmo.function = lpfc_disc_timeout;
    vport->fc_disctmo.data = vport;

is shortened to:

    setup_timer(&vport->fc_disctmo, lpfc_disc_timeout, vport);

It increases readability and reduces chances of mistakes done by
developers.

Signed-off-by: Tomas Jasek <tomsik68@gmail.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: James Smart <james.smart@broadcom.com>
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: <linux-scsi@vger.kernel.org>
Acked-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-03-06 22:48:49 -05:00
James Smart
d080abe0a8 scsi: lpfc: Update copyrights
Update copyrights to 2017 for all files touched in this patch set

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-22 18:41:44 -05:00
James Smart
d613b6a7aa scsi: lpfc: NVME Target: bind to nvmet_fc api
NVME Target: Tie in to NVME Fabrics nvmet_fc LLDD target api

Adds the routines to:
- register and deregister the FC port as a nvmet-fc targetport
- binding of nvme queues to adapter WQs
- receipt and passing of NVME LS's to transport, sending transport response
- receipt of NVME FCP CMD IUs, processing FCP target io data transmission
  commands; transmission of FCP io response
- Abort operations for tgt io exchanges

[mkp: fixed space at end of file warning]

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-22 18:41:43 -05:00
James Smart
2d7dbc4c27 scsi: lpfc: NVME Target: Receive buffer updates
NVME Target: Receive buffer updates

Allocates buffer pools and configures adapter interfaces to handle
receive buffer (asynchronous FCP CMD ius, first burst data)
from the adapter. Splits by protocol, etc.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-22 18:41:43 -05:00
James Smart
f358dd0ca2 scsi: lpfc: NVME Target: Base modifications
NVME Target: Base modifications

This set of patches adds the base modifications for NVME target support

The base modifications consist of:
- Additional module parameters or configuration tuning
- Enablement of configuration mode for NVME target. Ties into the
  queueing model put into place by the initiator basemods patches.
- Target-specific buffer pools, dma pools, sgl pools

[mkp: fixed space at end of file]

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-22 18:41:43 -05:00
James Smart
01649561a8 scsi: lpfc: NVME Initiator: bind to nvme_fc api
NVME Initiator: Tie in to NVME Fabrics nvme_fc LLDD initiator api

Adds the routines to:
- register and deregister the FC port as a nvme-fc initiator localport
- register and deregister remote FC ports as a nvme-fc remoteport
- binding of nvme queues to adapter WQs
- send/perform NVME LS's
- send/perform NVME FCP initiator io operations

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <james.smart@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2017-02-22 18:41:43 -05:00