Commit Graph

262 Commits

Author SHA1 Message Date
kernel test robot
12e3ef8b3e scsi: megaraid: Fix ifnullfree.cocci warnings
NULL check before vfree is not needed.

Generated by: scripts/coccinelle/free/ifnullfree.cocci

Link: https://lore.kernel.org/r/alpine.DEB.2.22.394.2012111113060.2669@hadrien
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: Julia Lawall <julia.lawall@inria.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-01-22 22:24:44 -05:00
Martin K. Petersen
81e7eb5bf0 Revert "Revert "scsi: megaraid_sas: Added support for shared host tagset for cpuhotplug""
This reverts commit 1a0e1943d8.

Commit b3c6a59975 ("block: Fix a lockdep complaint triggered by
request queue flushing") has been reverted and commit fb01a2932e has
been introduced in its place. Consequently, it is now safe to
reinstate the megaraid_sas tagset changes that led to boot problems in
5.10.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-12-16 22:43:44 -05:00
Linus Torvalds
1a0e1943d8 Revert "scsi: megaraid_sas: Added support for shared host tagset for cpuhotplug"
This reverts commit 103fbf8e40.

It turns out that it causes long boot-time latencies (to the point of
timeouts and failed boots).

The cause is the increase in request queues, and a fix for that is
queued up for 5.11, but we're reverting this commit that triggered the
problem for now.

Reported-and-tested-by: John Garry <john.garry@huawei.com>
Reported-and-tested-by: Julia Lawall <julia.lawall@inria.fr>
Reported-by: Qian Cai <cai@redhat.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/linux-scsi/fe3dff7dae4494e5a88caffbb4d877bbf472dceb.camel@redhat.com/
Link: https://lore.kernel.org/lkml/alpine.DEB.2.22.394.2012081813310.2680@hadrien/
Link: https://lore.kernel.org/linux-block/20201203012638.543321-1-ming.lei@redhat.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-12-08 15:00:36 -08:00
Kashyap Desai
103fbf8e40 scsi: megaraid_sas: Added support for shared host tagset for cpuhotplug
Fusion adapters can steer completions to individual queues, and
we now have support for shared host-wide tags.
So we can enable multiqueue support for fusion adapters.

Once driver enable shared host-wide tags, cpu hotplug feature is also
supported as it was enabled using below patchsets -
commit bf0beec060 ("blk-mq: drain I/O when all CPUs in a hctx are
offline")

Currently driver has provision to disable host-wide tags using
"host_tagset_enable" module parameter.

Once we do not have any major performance regression using host-wide
tags, we will drop the hand-crafted interrupt affinity settings.

Performance is also meeting the expecatation - (used both none and
mq-deadline scheduler)
24 Drive SSD on Aero with/without this patch can get 3.1M IOPs
3 VDs consist of 8 SAS SSD on Aero with/without this patch can get 3.1M
IOPs.

Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Tested-by: Douglas Gilbert <dgilbert@interlog.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-10-06 08:33:44 -06:00
Linus Torvalds
d6dc7e0682 SCSI fixes on 20200908
Eleven fixes, mostly in drivers or minor fixes in driver related
 infrastructure libraries (target, libfc and libsas).  Most of the bugs
 fixed only show up under rare circumstances, the exception being the
 endianness problem in qla2xxx which is used as a device on some sparc
 systems.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCX1egVyYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishShgAP9McHKn
 9T/m2mAjBr8IMZRlD4Y6nC+ToxDdRDsYT6iAOQD/ZO1FajxEcGC4xEtWznvXtqGk
 H1Lfr/ta19GSEPqaZ44=
 =rDo9
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Eleven fixes, mostly in drivers or minor fixes in driver related
  infrastructure libraries (target, libfc and libsas).

  Most of the bugs fixed only show up under rare circumstances, the
  exception being the endianness problem in qla2xxx which is used as a
  device on some sparc systems"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: mpt3sas: Don't call disable_irq from IRQ poll handler
  scsi: megaraid_sas: Don't call disable_irq from process IRQ poll
  scsi: target: iscsi: Fix hang in iscsit_access_np() when getting tpg->np_login_sem
  scsi: libsas: Set data_dir as DMA_NONE if libata marks qc as NODATA
  scsi: target: iscsi: Fix data digest calculation
  scsi: lpfc: Update lpfc version to 12.8.0.4
  scsi: lpfc: Extend the RDF FPIN Registration descriptor for additional events
  scsi: lpfc: Fix FLOGI/PLOGI receive race condition in pt2pt discovery
  scsi: lpfc: Fix setting IRQ affinity with an empty CPU mask
  scsi: qla2xxx: Fix regression on sparc64
  scsi: libfc: Fix for double free()
  scsi: pm8001: Fix memleak in pm8001_exec_internal_task_abort
2020-09-08 11:42:58 -07:00
Tomas Henzl
d2af39141e scsi: megaraid_sas: Don't call disable_irq from process IRQ poll
disable_irq() might sleep. Replace it with disable_irq_nosync() which is
sufficient as irq_poll_scheduled protects against concurrently running
complete_cmd_fusion() from megasas_irqpoll() and megasas_isr_fusion().

Link: https://lore.kernel.org/r/20200827165332.8432-1-thenzl@redhat.com
Fixes: a6ffd5bf68 scsi: megaraid_sas: Call disable_irq from process IRQ poll
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-09-02 21:59:44 -04:00
Gustavo A. R. Silva
df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
Linus Torvalds
dfdf16ecfd SCSI misc on 20200806
This series consists of the usual driver updates (ufs, qla2xxx, tcmu,
 lpfc, hpsa, zfcp, scsi_debug) and minor bug fixes.  We also have a
 huge docbook fix update like most other subsystems and no major update
 to the core (the few non trivial updates are either minor fixes or
 removing an unused feature [scsi_sdb_cache]).
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXyxq1yYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishSoAAQChZ4i8
 ZqYW3pL33JO3fA8vdjvLuyC489Hj4wzIsl3/bQEAxYyM6BSLvMoLWR2Plq/JmTLm
 4W/LDptarpTiDI3NuDc=
 =4b0W
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This consists of the usual driver updates (ufs, qla2xxx, tcmu, lpfc,
  hpsa, zfcp, scsi_debug) and minor bug fixes.

  We also have a huge docbook fix update like most other subsystems and
  no major update to the core (the few non trivial updates are either
  minor fixes or removing an unused feature [scsi_sdb_cache])"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (307 commits)
  scsi: scsi_transport_srp: Sanitize scsi_target_block/unblock sequences
  scsi: ufs-mediatek: Apply DELAY_AFTER_LPM quirk to Micron devices
  scsi: ufs: Introduce device quirk "DELAY_AFTER_LPM"
  scsi: virtio-scsi: Correctly handle the case where all LUNs are unplugged
  scsi: scsi_debug: Implement tur_ms_to_ready parameter
  scsi: scsi_debug: Fix request sense
  scsi: lpfc: Fix typo in comment for ULP
  scsi: ufs-mediatek: Prevent LPM operation on undeclared VCC
  scsi: iscsi: Do not put host in iscsi_set_flashnode_param()
  scsi: hpsa: Correct ctrl queue depth
  scsi: target: tcmu: Make TMR notification optional
  scsi: target: tcmu: Implement tmr_notify callback
  scsi: target: tcmu: Fix and simplify timeout handling
  scsi: target: tcmu: Factor out new helper ring_insert_padding
  scsi: target: tcmu: Do not queue aborted commands
  scsi: target: tcmu: Use priv pointer in se_cmd
  scsi: target: Add tmr_notify backend function
  scsi: target: Modify core_tmr_abort_task()
  scsi: target: iscsi: Fix inconsistent debug message
  scsi: target: iscsi: Fix login error when receiving
  ...
2020-08-06 16:50:07 -07:00
Chandrakanth Patil
07d3f04550 scsi: megaraid_sas: Remove undefined ENABLE_IRQ_POLL macro
As the ENABLE_IRQ_POLL macro is undefined, the check for ENABLE_IRQ_POLL
macro in ISR will always be false. This leads to irq polling being
non-functional.

Remove ENABLE_IRQ_POLL check from ISR.

Link: https://lore.kernel.org/r/20200715120153.20512-1-chandrakanth.patil@broadcom.com
Fixes: a6ffd5bf68 ("scsi: megaraid_sas: Call disable_irq from process IRQ")
Cc: <stable@vger.kernel.org> # v5.3+
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-15 16:16:45 -04:00
Damien Le Moal
7b3c103508 scsi: megaraid: Fix compilation warnings
Move function declarations to megaraid_sas.h to avoid warnings such as:

	warning: no previous prototype for ‘xxx'

No functional changes.

Link: https://lore.kernel.org/r/20200706123346.451827-1-damien.lemoal@wdc.com
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-08 01:01:30 -04:00
Damien Le Moal
2b46e5c142 scsi: megaraid: Fix kdoc comments format
Fix kernel documentation comments to avoid various warnings when compiling
with W=1.

No functional changes.

Link: https://lore.kernel.org/r/20200706123345.451783-1-damien.lemoal@wdc.com
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-07-08 01:00:30 -04:00
Sumit Saxena
6fd8525a70 scsi: megaraid_sas: TM command refire leads to controller firmware crash
When TM command times out, driver invokes the controller reset. Post reset,
driver re-fires pended TM commands which leads to firmware crash.

Post controller reset, return pended TM commands back to OS.

Link: https://lore.kernel.org/r/20200508085242.23406-1-chandrakanth.patil@broadcom.com
Cc: stable@vger.kernel.org
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-05-11 23:06:24 -04:00
Sumit Saxena
84badfab0d scsi: megaraid_sas: Remove IO buffer hole detection logic
As blk_queue_virt_boundary() API in slave_configure ensures that no IOs
will come with holes/gaps. Hence, code logic to detect the holes/gaps in IO
buffer is not required.

Link: https://lore.kernel.org/r/20200508083838.22778-3-chandrakanth.patil@broadcom.com
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-05-11 23:06:23 -04:00
Jason Yan
1a5d1d940b scsi: megaraid: Use true, false for bool variables
Fix the following coccicheck warning:

drivers/scsi/megaraid/megaraid_sas_fusion.c:4242:6-16: WARNING:
Assignment of 0/1 to bool variable
drivers/scsi/megaraid/megaraid_sas_fusion.c:4786:1-29: WARNING:
Assignment of 0/1 to bool variable
drivers/scsi/megaraid/megaraid_sas_fusion.c:4791:1-29: WARNING:
Assignment of 0/1 to bool variable
drivers/scsi/megaraid/megaraid_sas_fusion.c:4716:1-29: WARNING:
Assignment of 0/1 to bool variable
drivers/scsi/megaraid/megaraid_sas_fusion.c:4721:1-29: WARNING:
Assignment of 0/1 to bool variable

Link: https://lore.kernel.org/r/20200421034111.28353-1-yanaijie@huawei.com
Acked-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-04-24 18:21:08 -04:00
Jason Yan
057d1c0d1b scsi: megaraid: make some symbols static in megaraid_sas_fusion.c
Fix the following sparse warning:

drivers/scsi/megaraid/megaraid_sas_fusion.c:180:1: warning: symbol
'megasas_enable_intr_fusion' was not declared. Should it be static?
drivers/scsi/megaraid/megaraid_sas_fusion.c:202:1: warning: symbol
'megasas_disable_intr_fusion' was not declared. Should it be static?
drivers/scsi/megaraid/megaraid_sas_fusion.c:4233:6: warning: symbol
'megasas_refire_mgmt_cmd' was not declared. Should it be static?

Link: https://lore.kernel.org/r/20200407092827.18074-4-yanaijie@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-04-14 21:38:41 -04:00
Tomas Henzl
0e99b2c625 scsi: megaraid_sas: silence a warning
Add a flag to DMA memory allocation to silence a warning.

This driver allocates DMA memory for IO frames. This allocation may exceed
MAX_ORDER pages for few megaraid_sas controllers (controllers with very
high queue depth). Consequently, the driver has logic to keep reducing the
controller queue depth until the DMA memory allocation succeeds.

On impacted megaraid_sas controllers there would be multiple DMA allocation
failures until driver settled on an allocation that fit. These failed DMA
allocation requests caused stack traces in system logs. These were not
harmful and this patch silences those warnings/stack traces.

[mkp: clarified commit desc]

Link: https://lore.kernel.org/r/20200204152413.7107-1-thenzl@redhat.com
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-02-12 19:13:48 -05:00
Anand Lodnoor
4d1634b8d1 scsi: megaraid_sas: Use Block layer API to check SCSI device in-flight IO requests
Remove usage of device_busy counter from driver. Instead of device_busy
counter now driver uses 'nr_active' counter of request_queue to get the
number of inflight request for a LUN.

Link: https://lore.kernel.org/r/1579000882-20246-11-git-send-email-anand.lodnoor@broadcom.com
Link : https://patchwork.kernel.org/patch/11249297/
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-01-15 23:21:03 -05:00
Anand Lodnoor
56ee0c5856 scsi: megaraid_sas: Limit the number of retries for the IOCTLs causing firmware fault
IOCTLs causing firmware fault may end up in failed controller resets and
finally killing the adapter.

This patch fixes this problem as stated below:

In OCR sequence, driver will attempt refiring pended IOCTLs upto two times.
If first two attempts fail, then in third attempt driver will return pended
IOCTLs with EBUSY status to application. These changes are done to ensure
if any of pended IOCTLs is causing firmware fault and resulting into OCR
failure, then in last attempt of OCR driver will refrain firing it to
firmware and saving adapter from being killed due to faulty IOCTL.

Link: https://lore.kernel.org/r/1579000882-20246-10-git-send-email-anand.lodnoor@broadcom.com
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-01-15 23:21:03 -05:00
Anand Lodnoor
6d7537270e scsi: megaraid_sas: Do not initiate OCR if controller is not in ready state
Driver initiates OCR if a DCMD command times out. But there is a deadlock
if the driver attempts to invoke another OCR before the mutex lock
(reset_mutex) is released from the previous session of OCR.

This patch takes care of the above scenario using new flag
MEGASAS_FUSION_OCR_NOT_POSSIBLE to indicate if OCR is possible.

Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/1579000882-20246-9-git-send-email-anand.lodnoor@broadcom.com
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-01-15 23:21:03 -05:00
Anand Lodnoor
eeb63c23ff scsi: megaraid_sas: Do not set HBA Operational if FW is not in operational state
After issuing a adapter reset, driver blindly used to set adprecovery flag
to OPERATIONAL state.  Add a check to see if the FW is operational before
setting the flag and marking reset adapter successful.

Link: https://lore.kernel.org/r/1579000882-20246-7-git-send-email-anand.lodnoor@broadcom.com
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-01-15 23:21:03 -05:00
Anand Lodnoor
9330a0fd82 scsi: megaraid_sas: Do not kill HBA if JBOD Seqence map or RAID map is disabled
At the time of firmware initialization, if JBOD map or RAID map is not
available, driver can function without these features in a limited
functionality mode.

Link: https://lore.kernel.org/r/1579000882-20246-6-git-send-email-anand.lodnoor@broadcom.com
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Anand Lodnoor <anand.lodnoor@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-01-15 23:21:03 -05:00
Linus Torvalds
10fd71780f SCSI misc on 20190919
This is mostly update of the usual drivers: qla2xxx, ufs, smartpqi,
 lpfc, hisi_sas, qedf, mpt3sas; plus a whole load of minor updates.
 The only core change this time around is the addition of request
 batching for virtio.  Since batching requires an additional flag to
 use, it should be invisible to the rest of the drivers.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXYQE/yYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishXs9AP4usPY5
 OpMlF6OiKFNeJrCdhCScVghf9uHbc7UA6cP+EgD/bCtRgcDe1ZjOTYWdeTwvwWqA
 ltWYonnv6Lg3b1f9yqI=
 =jRC/
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly update of the usual drivers: qla2xxx, ufs, smartpqi,
  lpfc, hisi_sas, qedf, mpt3sas; plus a whole load of minor updates. The
  only core change this time around is the addition of request batching
  for virtio. Since batching requires an additional flag to use, it
  should be invisible to the rest of the drivers"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (264 commits)
  scsi: hisi_sas: Fix the conflict between device gone and host reset
  scsi: hisi_sas: Add BIST support for phy loopback
  scsi: hisi_sas: Add hisi_sas_debugfs_alloc() to centralise allocation
  scsi: hisi_sas: Remove some unused function arguments
  scsi: hisi_sas: Remove redundant work declaration
  scsi: hisi_sas: Remove hisi_sas_hw.slot_complete
  scsi: hisi_sas: Assign NCQ tag for all NCQ commands
  scsi: hisi_sas: Update all the registers after suspend and resume
  scsi: hisi_sas: Retry 3 times TMF IO for SAS disks when init device
  scsi: hisi_sas: Remove sleep after issue phy reset if sas_smp_phy_control() fails
  scsi: hisi_sas: Directly return when running I_T_nexus reset if phy disabled
  scsi: hisi_sas: Use true/false as input parameter of sas_phy_reset()
  scsi: hisi_sas: add debugfs auto-trigger for internal abort time out
  scsi: virtio_scsi: unplug LUNs when events missed
  scsi: scsi_dh_rdac: zero cdb in send_mode_select()
  scsi: fcoe: fix null-ptr-deref Read in fc_release_transport
  scsi: ufs-hisi: use devm_platform_ioremap_resource() to simplify code
  scsi: ufshcd: use devm_platform_ioremap_resource() to simplify code
  scsi: hisi_sas: use devm_platform_ioremap_resource() to simplify code
  scsi: ufs: Use kmemdup in ufshcd_read_string_desc()
  ...
2019-09-21 10:50:15 -07:00
Qian Cai
e5460f084b scsi: megaraid_sas: Fix a compilation warning
The commit de516379e8 ("scsi: megaraid_sas: changes to function
prototypes") introduced a comilation warning due to it changed the function
prototype of read_fw_status_reg() to take an instance pointer instead, but
forgot to remove an unused variable.

drivers/scsi/megaraid/megaraid_sas_fusion.c: In function
'megasas_fusion_update_can_queue':
drivers/scsi/megaraid/megaraid_sas_fusion.c:326:39: warning: variable
'reg_set' set but not used [-Wunused-but-set-variable]
  struct megasas_register_set __iomem *reg_set;
                                       ^~~~~~~
Fixes: de516379e8 ("scsi: megaraid_sas: changes to function prototypes")
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 21:29:49 -04:00
YueHaibing
88d5c34394 scsi: megaraid_sas: Make a bunch of functions static
Fix sparse warnings:

drivers/scsi/megaraid/megaraid_sas_fusion.c:3369:1: warning: symbol 'complete_cmd_fusion' was not declared. Should it be static?
drivers/scsi/megaraid/megaraid_sas_fusion.c:3535:6: warning: symbol 'megasas_sync_irqs' was not declared. Should it be static?
drivers/scsi/megaraid/megaraid_sas_fusion.c:3554:1: warning: symbol 'megasas_complete_cmd_dpc_fusion' was not declared. Should it be static?
drivers/scsi/megaraid/megaraid_sas_fusion.c:3573:13: warning: symbol 'megasas_isr_fusion' was not declared. Should it be static?
drivers/scsi/megaraid/megaraid_sas_fusion.c:3604:1: warning: symbol 'build_mpt_mfi_pass_thru' was not declared. Should it be static?
drivers/scsi/megaraid/megaraid_sas_fusion.c:3661:40: warning: symbol 'build_mpt_cmd' was not declared. Should it be static?
drivers/scsi/megaraid/megaraid_sas_fusion.c:3688:1: warning: symbol 'megasas_issue_dcmd_fusion' was not declared. Should it be static?
drivers/scsi/megaraid/megaraid_sas_fusion.c:3881:5: warning: symbol 'megasas_wait_for_outstanding_fusion' was not declared. Should it be static?
drivers/scsi/megaraid/megaraid_sas_fusion.c:4005:6: warning: symbol 'megasas_refire_mgmt_cmd' was not declared. Should it be static?
drivers/scsi/megaraid/megaraid_sas_fusion.c:4525:25: warning: symbol 'megasas_get_peer_instance' was not declared. Should it be static?
drivers/scsi/megaraid/megaraid_sas_fusion.c:4825:7: warning: symbol 'megasas_fusion_crash_dump' was not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-08-07 21:27:12 -04:00
YueHaibing
e45ab43b1d scsi: megaraid_sas: Make some functions static
Fix sparse warnings:

drivers/scsi/megaraid/megaraid_sas_fusion.c:541:1: warning: symbol 'megasas_alloc_cmdlist_fusion' was not declared. Should it be static?
drivers/scsi/megaraid/megaraid_sas_fusion.c:580:1: warning: symbol 'megasas_alloc_request_fusion' was not declared. Should it be static?
drivers/scsi/megaraid/megaraid_sas_fusion.c:661:1: warning: symbol 'megasas_alloc_reply_fusion' was not declared. Should it be static?
drivers/scsi/megaraid/megaraid_sas_fusion.c:738:1: warning: symbol 'megasas_alloc_rdpq_fusion' was not declared. Should it be static?
drivers/scsi/megaraid/megaraid_sas_fusion.c:920:1: warning: symbol 'megasas_alloc_cmds_fusion' was not declared. Should it be static?
drivers/scsi/megaraid/megaraid_sas_fusion.c:1740:1: warning: symbol 'megasas_init_adapter_fusion' was not declared. Should it be static?
drivers/scsi/megaraid/megaraid_sas_fusion.c:1966:1: warning: symbol 'map_cmd_status' was not declared. Should it be static?
drivers/scsi/megaraid/megaraid_sas_fusion.c:2379:1: warning: symbol 'megasas_set_pd_lba' was not declared. Should it be static?
drivers/scsi/megaraid/megaraid_sas_fusion.c:2718:1: warning: symbol 'megasas_build_ldio_fusion' was not declared. Should it be static?
drivers/scsi/megaraid/megaraid_sas_fusion.c:3215:1: warning: symbol 'megasas_build_io_fusion' was not declared. Should it be static?
drivers/scsi/megaraid/megaraid_sas_fusion.c:3328:6: warning: symbol 'megasas_prepare_secondRaid1_IO' was not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-23 22:14:06 -04:00
Linus Torvalds
ba6d10ab80 SCSI misc on 20190709
This is mostly update of the usual drivers: qla2xxx, hpsa, lpfc, ufs,
 mpt3sas, ibmvscsi, megaraid_sas, bnx2fc and hisi_sas as well as the
 removal of the osst driver (I heard from Willem privately that he
 would like the driver removed because all his test hardware has
 failed).  Plus number of minor changes, spelling fixes and other
 trivia.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXSTl4yYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishdcxAQDCJVbd
 fPUX76/V1ldupunF97+3DTharxxbst+VnkOnCwD8D4c0KFFFOI9+F36cnMGCPegE
 fjy17dQLvsJ4GsidHy8=
 =aS5B
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly update of the usual drivers: qla2xxx, hpsa, lpfc, ufs,
  mpt3sas, ibmvscsi, megaraid_sas, bnx2fc and hisi_sas as well as the
  removal of the osst driver (I heard from Willem privately that he
  would like the driver removed because all his test hardware has
  failed). Plus number of minor changes, spelling fixes and other
  trivia.

  The big merge conflict this time around is the SPDX licence tags.
  Following discussion on linux-next, we believe our version to be more
  accurate than the one in the tree, so the resolution is to take our
  version for all the SPDX conflicts"

Note on the SPDX license tag conversion conflicts: the SCSI tree had
done its own SPDX conversion, which in some cases conflicted with the
treewide ones done by Thomas & co.

In almost all cases, the conflicts were purely syntactic: the SCSI tree
used the old-style SPDX tags ("GPL-2.0" and "GPL-2.0+") while the
treewide conversion had used the new-style ones ("GPL-2.0-only" and
"GPL-2.0-or-later").

In these cases I picked the new-style one.

In a few cases, the SPDX conversion was actually different, though.  As
explained by James above, and in more detail in a pre-pull-request
thread:

 "The other problem is actually substantive: In the libsas code Luben
  Tuikov originally specified gpl 2.0 only by dint of stating:

  * This file is licensed under GPLv2.

  In all the libsas files, but then muddied the water by quoting GPLv2
  verbatim (which includes the or later than language). So for these
  files Christoph did the conversion to v2 only SPDX tags and Thomas
  converted to v2 or later tags"

So in those cases, where the spdx tag substantially mattered, I took the
SCSI tree conversion of it, but then also took the opportunity to turn
the old-style "GPL-2.0" into a new-style "GPL-2.0-only" tag.

Similarly, when there were whitespace differences or other differences
to the comments around the copyright notices, I took the version from
the SCSI tree as being the more specific conversion.

Finally, in the spdx conversions that had no conflicts (because the
treewide ones hadn't been done for those files), I just took the SCSI
tree version as-is, even if it was old-style.  The old-style conversions
are perfectly valid, even if the "-only" and "-or-later" versions are
perhaps more descriptive.

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (185 commits)
  scsi: qla2xxx: move IO flush to the front of NVME rport unregistration
  scsi: qla2xxx: Fix NVME cmd and LS cmd timeout race condition
  scsi: qla2xxx: on session delete, return nvme cmd
  scsi: qla2xxx: Fix kernel crash after disconnecting NVMe devices
  scsi: megaraid_sas: Update driver version to 07.710.06.00-rc1
  scsi: megaraid_sas: Introduce various Aero performance modes
  scsi: megaraid_sas: Use high IOPS queues based on IO workload
  scsi: megaraid_sas: Set affinity for high IOPS reply queues
  scsi: megaraid_sas: Enable coalescing for high IOPS queues
  scsi: megaraid_sas: Add support for High IOPS queues
  scsi: megaraid_sas: Add support for MPI toolbox commands
  scsi: megaraid_sas: Offload Aero RAID5/6 division calculations to driver
  scsi: megaraid_sas: RAID1 PCI bandwidth limit algorithm is applicable for only Ventura
  scsi: megaraid_sas: megaraid_sas: Add check for count returned by HOST_DEVICE_LIST DCMD
  scsi: megaraid_sas: Handle sequence JBOD map failure at driver level
  scsi: megaraid_sas: Don't send FPIO to RL Bypass queue
  scsi: megaraid_sas: In probe context, retry IOC INIT once if firmware is in fault
  scsi: megaraid_sas: Release Mutex lock before OCR in case of DCMD timeout
  scsi: megaraid_sas: Call disable_irq from process IRQ poll
  scsi: megaraid_sas: Remove few debug counters from IO path
  ...
2019-07-11 15:14:01 -07:00
Chandrakanth Patil
299ee42615 scsi: megaraid_sas: Introduce various Aero performance modes
For Aero adapters, driver provides three different performance modes
controlled through module parameter named 'perf_mode'. Below are those
performance modes:

 0: Balanced - Additional high IOPS reply queues will be enabled along with
    low latency queues. Interrupt coalescing will be enabled only for these
    high IOPS reply queues.

 1: IOPS - No additional high IOPS queues are enabled. Interrupt coalescing
    will be enabled on all reply queues.

 2: Latency - No additional high IOPS queues are enabled. Interrupt
    coalescing will be disabled on all reply queues. This is a legacy
    behavior similar to Ventura & Invader Series.

Default performance mode settings:

 - Performance mode set to 'Balanced', if Aero controller is working in
   16GT/s PCIe speed.

 - Performance mode will be set to 'Latency' mode for all other cases.

Through module parameter 'perf_mode', user can override default performance
mode to desired one.

Captured some performance numbers with these performance modes.  4k Random
Read IO performance numbers on 24 SAS SSD drives for above three
performance modes. Performance data is from Intel Skylake and HGST SS300
(drive model SDLL1DLR400GCCA1).

IOPS:
 -----------------------------------------------------------------------
  |perf_mode    | qd = 1 | qd = 64 |   note                             |
  |-------------|--------|---------|-------------------------------------
  |balanced     |  259K  |  3061k  | Provides max performance numbers   |
  |             |        |         | both on lower QD workload &        |
  |             |        |         | also on higher QD workload         |
  |-------------|--------|---------|-------------------------------------
  |iops         |  220K  |  3100k  | Provides max performance numbers   |
  |             |        |         | only on higher QD workload.        |
  |-------------|--------|---------|-------------------------------------
  |latency      |  246k  |  2226k  | Provides good performance numbers  |
  |             |        |         | only on lower QD worklaod.         |
  -----------------------------------------------------------------------

Average Latency:
  -----------------------------------------------------
  |perf_mode    |  qd = 1      |    qd = 64           |
  |-------------|--------------|----------------------|
  |balanced     |  92.05 usec  |    501.12 usec       |
  |-------------|--------------|----------------------|
  |iops         |  108.40 usec |    498.10 usec       |
  |-------------|--------------|----------------------|
  |latency      |  97.10 usec  |    689.26 usec       |
  -----------------------------------------------------

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-27 00:08:50 -04:00
Chandrakanth Patil
f39e5e52c5 scsi: megaraid_sas: Use high IOPS queues based on IO workload
The driver will use round-robin method for IO submission in batches within
the high IOPS queues when the number of in-flight ios on the target device
is larger than 8. Otherwise the driver will use low latency reply queues.

Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-27 00:07:36 -04:00
Chandrakanth Patil
ea836f40f8 scsi: megaraid_sas: Enable coalescing for high IOPS queues
Driver should enable interrupt coalescing (during driver load and after
Controller Reset) for High IOPS queues by masking appropriate bits in IOC
INIT frame.

Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-27 00:07:36 -04:00
Chandrakanth Patil
132147d7f6 scsi: megaraid_sas: Add support for High IOPS queues
Aero controllers support balanced performance mode through the ability to
configure queues with different properties.

Reply queues with interrupt coalescing enabled are called "high iops reply
queues" and reply queues with interrupt coalescing disabled are called "low
latency reply queues".

The driver configures a combination of high iops and low latency reply
queues if:

 - HBA is an AERO controller;

 - MSI-X vectors supported by the HBA is 128;

 - Total CPU count in the system more than high iops queue count;

 - Driver is loaded with default max_msix_vectors module parameter; and

 - System booted in non-kdump mode.

Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-27 00:07:35 -04:00
Chandrakanth Patil
5813685616 scsi: megaraid_sas: Add support for MPI toolbox commands
Added driver support to allow passthrough MPI toolbox type MFI commands to
firmware based on firmware capability.

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-27 00:07:35 -04:00
Chandrakanth Patil
7fc557005c scsi: megaraid_sas: Offload Aero RAID5/6 division calculations to driver
For RAID5/RAID6 volumes configured behind Aero, driver will be doing 64bit
division operations on behalf of firmware as controller's ARM CPU is very
slow in this division. Later, driver calculates Q-ARM, P-ARM and Log-ARM and
passes those values to firmware by writing these values to RAID_CONTEXT.

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-27 00:07:35 -04:00
Chandrakanth Patil
49f2bf1071 scsi: megaraid_sas: RAID1 PCI bandwidth limit algorithm is applicable for only Ventura
RAID1 PCI bandwidth limit algorithm is not applicable to Aero as it's PCIe
Gen4 adapter.

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-27 00:07:35 -04:00
Chandrakanth Patil
59db5a931b scsi: megaraid_sas: Handle sequence JBOD map failure at driver level
Issue: This issue is applicable to scenario when JBOD sequence map is
unavailable (memory allocation for JBOD sequence map failed) to driver but
feature is supported by firmware.  If the driver sends a JBOD IO by not
adding 255 (MAX_PHYSICAL_DEVICES - 1) to device ID when underlying firmware
supports JBOD sequence map, it will lead to the IO failure.

Fix: For JBOD IOs, driver will not use the RAID map to fetch the devhandle
if JBOD sequence map is unavailable. Driver will set Devhandle to 0xffff
and Target ID to 'device ID + 255 (MAX_PHYSICAL_DEVICES - 1)'.

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-27 00:07:35 -04:00
Chandrakanth Patil
798d44b04f scsi: megaraid_sas: Don't send FPIO to RL Bypass queue
Firmware does not expect FastPath IO sent through Region Lock Bypass queue.
Though firmware never exposes such settings when fastpath IO can be sent to
RL bypass queue but it's safer to remove dead code which directs fastpath
IO to RL Bypass queue.

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-27 00:07:35 -04:00
Chandrakanth Patil
ccf6c1f2e2 scsi: megaraid_sas: In probe context, retry IOC INIT once if firmware is in fault
Issue: Under certain conditions, controller goes in FAULT state after IOC
INIT fired to firmware. Such Fault can be recovered through controller
reset.

Fix: In driver probe context, if firmware fault is observed post IOC INIT,
driver would do controller reset followed by retry logic for IOC INIT
command.

Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-27 00:07:35 -04:00
Chandrakanth Patil
a6ffd5bf68 scsi: megaraid_sas: Call disable_irq from process IRQ poll
On PowerPC architecture, calling disable_irq_nosync from IRQ context is not
providing the required effect.

In current megaraid_sas driver, disable_irq_nosync is being called from IRQ
context before enabling IRQ poll. But due to the issue seen on PPC, after
IRQ poll disable and legacy ISR is enabled, we are not seeing our ISR
getting called.

Fix: Call disable_irq from IRQ poll thread context instead of IRQ context.

Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-27 00:07:35 -04:00
Chandrakanth Patil
2181aacf46 scsi: megaraid_sas: Remove few debug counters from IO path
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-27 00:07:35 -04:00
Chandrakanth Patil
5885571df7 scsi: megaraid_sas: Add 32 bit atomic descriptor support to AERO adapters
Aero adapters provides Atomic Request Descriptor as an alternative method
for posting an entry onto a request queue. The posting of an Atomic Request
Descriptor is an atomic operation, providing a safe mechanism for multiple
processors on the host to post requests without synchronization. This
Atomic Request Descriptor format is identical to first 32 bits of Default
Request Descriptor and uses only 32 bits.

If Aero adapters support Atomic descriptor, driver should use it for
posting IOs and DCMDs to firmware.

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-27 00:07:34 -04:00
Gustavo A. R. Silva
e58ed5002f scsi: megaraid_sas: Use struct_size() helper
One of the more common cases of allocation size calculations is finding the
size of a structure that has a zero-sized array at the end, along with
memory for some number of elements for that array. For example:

struct MR_PD_CFG_SEQ_NUM_SYNC {
	...
        struct MR_PD_CFG_SEQ seq[1];
} __packed;

Make use of the struct_size() helper instead of an open-coded version in
order to avoid any potential type mistakes.

So, replace the following form:

sizeof(struct MR_PD_CFG_SEQ_NUM_SYNC) + (sizeof(struct MR_PD_CFG_SEQ) * (MAX_PHYSICAL_DEVICES - 1))

with:

struct_size(pd_sync, seq, MAX_PHYSICAL_DEVICES - 1)

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-20 15:37:03 -04:00
Shivasharan S
223d5818e7 scsi: megaraid_sas: Print firmware interrupt status
Add a print to dump the interrupt status in system log for debugging.

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:20 -04:00
Shivasharan S
b6661342f2 scsi: megaraid_sas: Print FW fault information
When driver detects a firmware fault during load, dump additional
information on fault code and subcode that will help in debugging.

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:20 -04:00
Shivasharan S
96c9603cf1 scsi: megaraid_sas: Enhance prints in OCR and TM path
This patch enhances the existing debug prints in reset and task management
path.

These debug prints in adapter reset path helps with debugging issues
related to IO timeouts that are seen frequently in the field.  Add
additional debug prints to dump the pending command frames before
initiating an adapter reset.  Also, print FastPath IOs that are
outstanding.

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:19 -04:00
Shivasharan S
1d15d9098a scsi: megaraid_sas: Load balance completions across all MSI-X
Driver will use "reply descriptor post queues" in round robin fashion when
the combined MSI-X mode is not enabled. With this IO completions are
distributed and load balanced across all the available reply descriptor
post queues equally.

This is enabled only if combined MSI-X mode is not enabled in firmware.
This improves performance and also fixes soft lockups.

When load balancing is enabled, IRQ affinity from driver needs to be
disabled.

Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:19 -04:00
Shivasharan S
62a04f81e6 scsi: megaraid_sas: IRQ poll to avoid CPU hard lockups
Issue Description:

We have seen cpu lock up issues from field if system has a large (more than
96) logical cpu count.  SAS3.0 controller (Invader series) supports max 96
MSI-X vector and SAS3.5 product (Ventura) supports max 128 MSI-X vectors.

This may be a generic issue (if PCI device support completion on multiple
reply queues).

Let me explain it w.r.t megaraid_sas supported h/w just to simplify the
problem and possible changes to handle such issues.  MegaRAID controller
supports multiple reply queues in completion path.  Driver creates MSI-X
vectors for controller as "minimum of (FW supported Reply queues, Logical
CPUs)".  If submitter is not interrupted via completion on same CPU, there
is a loop in the IO path. This behavior can cause hard/soft CPU lockups, IO
timeout, system sluggish etc.

Example - one CPU (e.g. CPU A) is busy submitting the IOs and another CPU
(e.g. CPU B) is busy with processing the corresponding IO's reply
descriptors from reply descriptor queue upon receiving the interrupts from
HBA.  If CPU A is continuously pumping the IOs then always CPU B (which is
executing the ISR) will see the valid reply descriptors in the reply
descriptor queue and it will be continuously processing those reply
descriptor in a loop without quitting the ISR handler.

megaraid_sas driver will exit ISR handler if it finds unused reply
descriptor in the reply descriptor queue.  Since CPU A will be continuously
sending the IOs, CPU B may always see a valid reply descriptor (posted by
HBA Firmware after processing the IO) in the reply descriptor queue. In
worst case, driver will not quit from this loop in the ISR handler.
Eventually, CPU lockup will be detected by watchdog.

Above mentioned behavior is not common if "rq_affinity" set to 2 or
affinity_hint is honored by irqbalancer as "exact".  If rq_affinity is set
to 2, submitter will be always interrupted via completion on same CPU.  If
irqbalancer is using "exact" policy, interrupt will be delivered to
submitter CPU.

Problem statement:

If CPU count to MSI-X vectors (reply descriptor Queues) count ratio is not
1:1, we still have exposure of issue explained above and for that we don't
have any solution.

Exposure of soft/hard lockup is seen if CPU count is more than MSI-X
supported by device.

If CPUs count to MSI-X vectors count ratio is not 1:1, (Other way, if
CPU counts to MSI-X vector count ratio is something like X:1, where X > 1)
then 'exact' irqbalance policy OR rq_affinity = 2 won't help to avoid CPU
hard/soft lockups. There won't be any one to one mapping between
CPU to MSI-X vector instead one MSI-X interrupt (or reply descriptor queue)
is shared with group/set of CPUs and there is a possibility of having a
loop in the IO path within that CPU group and may observe lockups.

For example: Consider a system having two NUMA nodes and each node having
four logical CPUs and also consider that number of MSI-X vectors enabled on
the HBA is two, then CPUs count to MSI-X vector count ratio as 4:1.
e.g.
MSI-X vector 0 is affinity to CPU 0, CPU 1, CPU 2 & CPU 3 of NUMA node 0 and
MSI-X vector 1 is affinity to CPU 4, CPU 5, CPU 6 & CPU 7 of NUMA node 1.

numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3                 --> MSI-X 0
node 0 size: 65536 MB
node 0 free: 63176 MB
node 1 cpus: 4 5 6 7                 --> MSI-X 1
node 1 size: 65536 MB
node 1 free: 63176 MB

Assume that user started an application which uses all the CPUs of NUMA
node 0 for issuing the IOs.  Only one CPU from affinity list (it can be any
cpu since this behavior depends upon irqbalance) CPU0 will receive the
interrupts from MSI-X 0 for all the IOs. Eventually, CPU 0 IO submission
percentage will be decreasing and ISR processing percentage will be
increasing as it is more busy with processing the interrupts.  Gradually IO
submission percentage on CPU 0 will be zero and it's ISR processing
percentage will be 100% as IO loop has already formed within the
NUMA node 0, i.e. CPU 1, CPU 2 & CPU 3 will be continuously busy with
submitting the heavy IOs and only CPU 0 is busy in the ISR path as it
always find the valid reply descriptor in the reply descriptor queue.
Eventually, we will observe the hard lockup here.

Chances of occurring of hard/soft lockups are directly proportional to
value of X. If value of X is high, then chances of observing CPU lockups is
high.

Solution:

Use IRQ poll interface defined in "irq_poll.c".

megaraid_sas driver will execute ISR routine in softirq context and it will
always quit the loop based on budget provided in IRQ poll interface.
Driver will switch to IRQ poll only when more than a threshold number of
reply descriptors are handled in one ISR. Currently threshold is set as
1/4th of HBA queue depth.

In these scenarios (i.e. where CPUs count to MSI-X vectors count ratio is
X:1 (where X >  1)), IRQ poll interface will avoid CPU hard lockups due to
voluntary exit from the reply queue processing based on budget.
Note - Only one MSI-X vector is busy doing processing.

Select CONFIG_IRQ_POLL from driver Kconfig for driver compilation.

Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:19 -04:00
Shivasharan S
78409d4b47 scsi: megaraid_sas: Block PCI config space access from userspace during OCR
While an online controller reset(OCR) is in progress, there is short
duration where all access to controller's PCI config space from the host
needs to be blocked.  This is due to a hardware limitation of MegaRAID
controllers.

With this patch, driver will block all access to controller's config space
from userland applications by calling pci_cfg_access_lock() while OCR is in
progress and unlocking after controller comes back to ready state.

Added helper function which locks the config space before initiating OCR
and wait for controller to become READY.

Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:19 -04:00
Shivasharan S
44e8d6930f scsi: megaraid_sas: Rework code around controller reset
No functional change.  This patch reworks code around controller reset path
which gets rid of a couple of goto labels.  This is in preparation for the
next patch which adds PCI config space access locking while controller
reset is in progress.

Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-06-18 19:46:19 -04:00
Thomas Gleixner
1ccea77e2a treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 13
Based on 2 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version this program is distributed in the
  hope that it will be useful but without any warranty without even
  the implied warranty of merchantability or fitness for a particular
  purpose see the gnu general public license for more details you
  should have received a copy of the gnu general public license along
  with this program if not see http www gnu org licenses

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version this program is distributed in the
  hope that it will be useful but without any warranty without even
  the implied warranty of merchantability or fitness for a particular
  purpose see the gnu general public license for more details [based]
  [from] [clk] [highbank] [c] you should have received a copy of the
  gnu general public license along with this program if not see http
  www gnu org licenses

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 355 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Jilayne Lovejoy <opensource@jilayne.com>
Reviewed-by: Steve Winslow <swinslow@gmail.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190519154041.837383322@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 11:28:45 +02:00
Linus Torvalds
d1cd7c85f9 SCSI misc on 20190507
This is mostly update of the usual drivers: qla2xxx, qedf, smartpqi,
 hpsa, lpfc, ufs, mpt3sas, ibmvfc and hisi_sas.  Plus number of minor
 changes, spelling fixes and other trivia.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXNIK0yYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishbeFAP4wsOm6
 BNqDvLbF7gHAH+oSVAeENd9tepXG2xKbUHmLRgEA1wPaUxon8L8v/SSsqwewqMXP
 y6CnU6aO2iaViTNBsPs=
 =RX74
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This is mostly update of the usual drivers: qla2xxx, qedf, smartpqi,
  hpsa, lpfc, ufs, mpt3sas, ibmvfc and hisi_sas. Plus number of minor
  changes, spelling fixes and other trivia"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (298 commits)
  scsi: qla2xxx: Avoid that lockdep complains about unsafe locking in tcm_qla2xxx_close_session()
  scsi: qla2xxx: Avoid that qlt_send_resp_ctio() corrupts memory
  scsi: qla2xxx: Fix hardirq-unsafe locking
  scsi: qla2xxx: Complain loudly about reference count underflow
  scsi: qla2xxx: Use __le64 instead of uint32_t[2] for sending DMA addresses to firmware
  scsi: qla2xxx: Introduce the dsd32 and dsd64 data structures
  scsi: qla2xxx: Check the size of firmware data structures at compile time
  scsi: qla2xxx: Pass little-endian values to the firmware
  scsi: qla2xxx: Fix race conditions in the code for aborting SCSI commands
  scsi: qla2xxx: Use an on-stack completion in qla24xx_control_vp()
  scsi: qla2xxx: Make qla24xx_async_abort_cmd() static
  scsi: qla2xxx: Remove unnecessary locking from the target code
  scsi: qla2xxx: Remove qla_tgt_cmd.released
  scsi: qla2xxx: Complain if a command is released that is owned by the firmware
  scsi: qla2xxx: target: Fix offline port handling and host reset handling
  scsi: qla2xxx: Fix abort handling in tcm_qla2xxx_write_pending()
  scsi: qla2xxx: Fix error handling in qlt_alloc_qfull_cmd()
  scsi: qla2xxx: Simplify qlt_send_term_imm_notif()
  scsi: qla2xxx: Fix use-after-free issues in qla2xxx_qpair_sp_free_dma()
  scsi: qla2xxx: Fix a qla24xx_enable_msix() error path
  ...
2019-05-08 10:12:46 -07:00
Colin Ian King
efc372c1bf scsi: megaraid_sas: fix spelling mistake "oustanding" -> "outstanding"
There are a couple of spelling mistakes in some kernel info and notice
messages. Fix these.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-04-18 20:34:46 -04:00