Prepare for removal of the request pointer by using scsi_cmd_to_rq()
instead. This patch does not change any functionality.
Link: https://lore.kernel.org/r/20210809230355.8186-31-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently, the mpt3sas driver sets the default queue depth based on the
physical interface of the attached device:
- SAS : 254
- SATA: 32
- NVMe: 128
The IOC firmware provides a recommended queue depth for each device through
SAS IO Unit Page1 for SAS/SATA and PCIe IO Unit Page 1 for NVMe devices.
If the host sets the queue depth greater than the firmware recommended
value, then the IOC places the I/Os above the recommended queue depth in an
internal pending queue. This consumes outstanding host-credit/resources,
thereby leading to potential starvation of other devices.
To avoid this, use the device depth recommended by the IOC firmware.
Link: https://lore.kernel.org/r/20210809072639.21228-2-suganath-prabu.subramani@broadcom.com
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Enable the driver to work in non-IRQ mode, i.e. there will not be any MSI-X
vectors associated with queues dedicated to polling. The IOC hardware is
single submission queue and multiple reply queue. However, using the shared
host tagset support it is possible to simulate multiple hardware queues.
When poll_queues are enabled through the module parameter, the driver will
allocate extra reply queues without an MSI-X association. All I/O
completion on these queues will be done through the iopoll interface.
Link: https://lore.kernel.org/r/20210727081212.2742-1-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The IOC firmware assumes that the host driver is still alive after shutdown
and continues to post events to host memory (due to faulty expander phy
links, etc). This leads to 0x2666 (a bus fault occurred during a host-IOC
memory access).
Perform an IOC soft reset as part of shutdown to disable event posting.
Link: https://lore.kernel.org/r/20210705145951.32258-1-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When an expander does not contain any 'phys', an appropriate error code -1
should be returned, as done elsewhere in this function. However, we
currently do not explicitly assign this error code to 'rc'. As a result, 0
was incorrectly returned.
Link: https://lore.kernel.org/r/20210514081300.6650-1-thunder.leizhen@huawei.com
Fixes: f92363d123 ("[SCSI] mpt3sas: add new driver supporting 12GB SAS")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Include Hannes' SCSI command result rework in the staging branch.
[mkp: remove DRIVER_SENSE from mpi3mr]
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In preparation to enable -Wimplicit-fallthrough for Clang, fix a couple
of warnings by explicitly adding break statements instead of just letting
the code fall through to the next case.
Link: https://github.com/KSPP/linux/issues/115
Link: https://lore.kernel.org/r/20210528200828.GA39349@embeddedor
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Introduce scsi_build_sense() as a wrapper around scsi_build_sense_buffer()
to format the buffer and set the correct SCSI status.
Link: https://lore.kernel.org/r/20210427083046.31620-8-hare@suse.de
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If a firmware fault occurs while scanning the devices during IOC
initialization then the driver issues the hard reset operation to recover
the IOC. However, the driver is not issuing a Port enable request
message as part of hard reset operation during IOC initialization. Due to
this, the driver will not receive get any device discovery-related events
and hence devices will not be accessible.
Teach the driver to gracefully handle firmware faults while scanning for
target devices during IOC initialization. Make the driver issue a port
enable request message as part of hard reset operation. This permits
receiving device discovery-related events from the firmware after the hard
reset operation completes.
Link: https://lore.kernel.org/r/20210518051625.1596742-4-suganath-prabu.subramani@broadcom.com
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Do not cancel current running firmware event work if the event type is
different from MPT3SAS_REMOVE_UNRESPONDING_DEVICES. Otherwise a deadlock
can be observed while cancelling the current firmware event work if a hard
reset operation is called as part of processing the current event.
Link: https://lore.kernel.org/r/20210518051625.1596742-2-suganath-prabu.subramani@broadcom.com
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Whenever the driver is adding a vSES to virtual-phys list it is
reinitializing the list head. Hence those vSES devices which were added
previously are lost.
Stop reinitializing the list every time a new vSES device is added.
Link: https://lore.kernel.org/r/20210330105004.20413-1-sreekanth.reddy@broadcom.com
Cc: stable@vger.kernel.org #v5.11.10+
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull 5.12/scsi-fixes into the 5.13 SCSI tree to provide a baseline for
some UFS changes that would otherwise cause conflicts during the
merge.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fixes the following W=1 kernel build warning(s):
drivers/scsi/mpt3sas/mpt3sas_scsih.c:763: warning: Function parameter or member 'sas_address' not described in '__mpt3sas_get_sdev_by_addr'
drivers/scsi/mpt3sas/mpt3sas_scsih.c:763: warning: expecting prototype for mpt3sas_get_sdev_by_addr(). Prototype was for __mpt3sas_get_sdev_by_addr() instead
drivers/scsi/mpt3sas/mpt3sas_scsih.c:4535: warning: expecting prototype for _scsih_check_for_pending_internal_cmds(). Prototype was for mpt3sas_check_for_pending_internal_cmds() instead
drivers/scsi/mpt3sas/mpt3sas_scsih.c:6188: warning: Function parameter or member 'port_entry' not described in '_scsih_look_and_get_matched_port_entry'
drivers/scsi/mpt3sas/mpt3sas_scsih.c:6188: warning: Function parameter or member 'matched_port_entry' not described in '_scsih_look_and_get_matched_port_entry'
drivers/scsi/mpt3sas/mpt3sas_scsih.c:6188: warning: Function parameter or member 'count' not described in '_scsih_look_and_get_matched_port_entry'
drivers/scsi/mpt3sas/mpt3sas_scsih.c:6959: warning: Function parameter or member 'port' not described in 'mpt3sas_expander_remove'
drivers/scsi/mpt3sas/mpt3sas_scsih.c:10494: warning: expecting prototype for mpt3sas_scsih_reset_handler(). Prototype was for mpt3sas_scsih_pre_reset_handler() instead
drivers/scsi/mpt3sas/mpt3sas_scsih.c:10536: warning: expecting prototype for mpt3sas_scsih_reset_handler(). Prototype was for mpt3sas_scsih_reset_done_handler() instead
drivers/scsi/mpt3sas/mpt3sas_scsih.c:12303: warning: expecting prototype for scsih__ncq_prio_supp(). Prototype was for scsih_ncq_prio_supp() instead
Link: https://lore.kernel.org/r/20210317091230.2912389-15-lee.jones@linaro.org
Cc: Sathya Prakash <sathya.prakash@broadcom.com>
Cc: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: MPT-FusionLinux.pdl@avagotech.com
Cc: MPT-FusionLinux.pdl@broadcom.com
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
mpt3sas_get_port_by_id() can be called when a spinlock is held. Use
GFP_ATOMIC instead of GFP_KERNEL when allocating memory.
Issue spotted by call_kern.cocci:
./drivers/scsi/mpt3sas/mpt3sas_scsih.c:416:42-52: ERROR: function mpt3sas_get_port_by_id called on line 7125 inside lock on line 7123 but uses GFP_KERNEL
./drivers/scsi/mpt3sas/mpt3sas_scsih.c:416:42-52: ERROR: function mpt3sas_get_port_by_id called on line 6842 inside lock on line 6839 but uses GFP_KERNEL
./drivers/scsi/mpt3sas/mpt3sas_scsih.c:416:42-52: ERROR: function mpt3sas_get_port_by_id called on line 6854 inside lock on line 6851 but uses GFP_KERNEL
./drivers/scsi/mpt3sas/mpt3sas_scsih.c:416:42-52: ERROR: function mpt3sas_get_port_by_id called on line 7706 inside lock on line 7702 but uses GFP_KERNEL
./drivers/scsi/mpt3sas/mpt3sas_scsih.c:416:42-52: ERROR: function mpt3sas_get_port_by_id called on line 10260 inside lock on line 10256 but uses GFP_KERNEL
Link: https://lore.kernel.org/r/20210220093951.905362-1-christophe.jaillet@wanadoo.fr
Fixes: 324c122fc0 ("scsi: mpt3sas: Add module parameter multipath_on_hba")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fixes the following W=1 kernel build warning(s):
drivers/scsi/mpt3sas/mpt3sas_scsih.c: In function ‘_scsih_scan_for_devices_after_reset’:
drivers/scsi/mpt3sas/mpt3sas_scsih.c:10473:1: warning: the frame size of 1064 bytes is larger than 1024 bytes [-Wframe-larger-than=]
Link: https://lore.kernel.org/r/20210303144631.3175331-31-lee.jones@linaro.org
Cc: Sathya Prakash <sathya.prakash@broadcom.com>
Cc: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: MPT-FusionLinux.pdl@avagotech.com
Cc: MPT-FusionLinux.pdl@broadcom.com
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When a host trace buffer is released, applications never know for what
reason the buffer is released. Add a new IOCTL MPT3ADDNLDIAGQUERY to
provide the trigger information due to which the diag buffer is released.
Link: https://lore.kernel.org/r/20210204033724.1345-2-suganath-prabu.subramani@broadcom.com
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
MPT Fusion adapters can steer completions to individual queues and we now
have support for shared host-wide tags in the I/O stack. The addition of
the host-wide tags allows us to enable multiqueue support for MPT Fusion
adapters. Once host-wise tags are enabled, the CPU hotplug feature is also
supported.
Allow use of host-wide tags to be disabled through the "host_tagset_enable"
module parameter. Once we do not have any major performance regressions
using host-wide tags, we will drop the hand-crafted interrupt affinity
settings.
Performance is meeting expectations. About 3.1M IOPS using 24 Drive SSD on
Aero controllers.
Link: https://lore.kernel.org/r/20210202095832.23072-1-sreekanth.reddy@broadcom.com
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
_scsih_fw_event_cleanup_queue() waits for all outstanding firmware events
wokrqueue handlers to finish. If in_interrupt() is true, it cancels itself
and return early.
That in_interrupt() check is ill-defined and does not provide what the name
suggests: it does not cover all states in which it is safe to block and
call functions like cancel_work_sync().
That check is also not needed: _scsih_fw_event_cleanup_queue() is always
invoked from process context. Below is an analysis of its callers:
- scsih_remove(), bound to PCI ->remove(), process context
- scsih_shutdown(), bound to PCI ->shutdown(), process context
- mpt3sas_scsih_clear_outstanding_scsi_tm_commands(), called by
=> _base_clear_outstanding_commands(), called by
=>_base_fault_reset_work(), workqueue
=> mpt3sas_base_hard_reset_handler(), locks mutex
Remove the in_interrupt() check. Change _scsih_fw_event_cleanup_queue()
specification to a purely process-context function and mark it with
"Context: task, can sleep".
Link: https://lore.kernel.org/r/20201126132952.2287996-10-bigeasy@linutronix.de
Cc: Sathya Prakash <sathya.prakash@broadcom.com>
Cc: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
Cc: <MPT-FusionLinux.pdl@broadcom.com>
Reviewed-by: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Drivers should do only device-specific jobs. But in general, drivers using
legacy PCI PM framework for .suspend()/.resume() have to manage many PCI
PM-related tasks themselves which can be done by PCI Core itself. This
brings extra load on the driver and it directly calls PCI helper functions
to handle them.
Switch to the new generic framework by updating function signatures and
define a "struct dev_pm_ops" variable to bind PM callbacks. Also, remove
unnecessary calls to the PCI Helper functions along with the legacy
.suspend & .resume bindings.
Link: https://lore.kernel.org/r/20201102164730.324035-17-vaibhavgupta40@gmail.com
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver calls pci_enable_wake(...., false) in scsih_resume(), and there
is no corresponding pci_enable_wake(...., true) in scsih_suspend(). Either
it should do enable-wake the device in .suspend() or should not invoke
pci_enable_wake() at all.
Concluding that this driver doesn't support enable-wake and PCI core calls
pci_enable_wake(pci_dev, PCI_D0, false) during resume, drop it from
scsih_resume().
Link: https://lore.kernel.org/r/20201102164730.324035-16-vaibhavgupta40@gmail.com
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fixes the following W=1 kernel build warning(s):
drivers/scsi/mpt3sas/mpt3sas_scsih.c:2778: warning: Function parameter or member 'ioc' not described in 'scsih_tm_cmd_map_status'
drivers/scsi/mpt3sas/mpt3sas_scsih.c:2778: warning: Function parameter or member 'channel' not described in 'scsih_tm_cmd_map_status'
drivers/scsi/mpt3sas/mpt3sas_scsih.c:2829: warning: Function parameter or member 'ioc' not described in 'scsih_tm_post_processing'
drivers/scsi/mpt3sas/mpt3sas_scsih.c:2829: warning: Function parameter or member 'channel' not described in 'scsih_tm_post_processing'
Link: https://lore.kernel.org/r/20201102142359.561122-3-lee.jones@linaro.org
Cc: Sathya Prakash <sathya.prakash@broadcom.com>
Cc: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
Cc: MPT-FusionLinux.pdl@avagotech.com
Cc: MPT-FusionLinux.pdl@broadcom.com
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add module parameter multipath_on_hba to enable/disable multi-port path
topology support. By default this feature is enabled on SAS3.5 HBA device
and disabled on SAS3 &SAS2.5 HBA devices.
When this feature is disabled then driver uses a default
PhysicalPort(PortID) number i.e. 255 instead of the PhysicalPort number
provided by HBA firmware.
Link: https://lore.kernel.org/r/20201027130847.9962-14-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During HBA reset the Port ID of vSES device may change. As a result, it is
necessary to refresh virtual_phy objects after reset.
Each Port's vphy_list table needs to be updated after updating the
HBA port table. The algorithm is as follows:
- Loop over each port entry from HBA port table
* Loop over each virtual phy entry from port's vphys_list table
- Mark virtual phy entry as dirty by setting dirty bit in virtual phy
entry's flags field
- Read SASIOUnitPage0 page
- Loop over each HBA Phy's Phy data from SASIOUnitPage0
* If phy's remote attached device is not SES device then continue with
processing next HBA Phy's Phy data;
* Read SASPhyPage0 data for this Phy number and determine whether
current phy is a virtual phy or not. If it is not a virtual phy then
continue with next Phy data;
* Get the current phy's remote attached vSES device's SAS Address;
* Loop over each port entry from HBA port table
- If Port's vphys_mask field is zero then continue with
next Port entry,
- Loop over each virtual phy entry from Port's vphy_list table
- If the current phy's remote SAS Address is different from
virtual phy entry's SAS Address then continue with next
virtual phy entry,
- Set bit corresponding to current phy number in virtual phy
entry's phy_mask field,
- Get the HBA port table's Port entry corresponding to
Phy data's 'Port' value,
* If there is no Port entry corresponding to Phy data's
'Port' value in HBA port table then create a new port entry
and add it to HBA port table.
- If this retrieved Port entry is the same as the current Port
entry then don't do anything, just clear the dirty bit from
virtual phy entry's flag field and continue with processing
next HBA Phy's Phy data.
- If this retrieved Port entry is different from the current Port
entry then move the current virtual phy entry from current Port's
vphys_list to retrieved Port entry's vphys_list.
* Clear current phy bit in current Port entry's vphys_mask and
set the current phy bit in the retrieved Port entry's
vphys_mask field.
* Clear the dirty bit from virtual phy entry's flag field and
continue with next HBA Phy's Phy data.
- Delete the 'virtual phy' entries and HBA's 'Port table' entries which
are still marked as 'dirty'.
Link: https://lore.kernel.org/r/20201027130847.9962-13-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Added a new parameter bypass_dirty_port_flag in function
mpt3sas_get_port_by_id(). When this parameter is set to one then search for
matching hba port entry from port_table_list even when this hba_port entry
is marked as dirty.
Link: https://lore.kernel.org/r/20201027130847.9962-12-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Each direct attached device will have a unique Port ID, but with an
exception. HBA vSES may use the same Port ID of another direct attached
device Port's ID. As a result, special handling is needed for vSES.
Create a virtual_phy object when a new HBA vSES device is detected and add
this virtual_phy object to vphys_list of port ID's hba_port object. When
the HBA vSES device is removed then remove the corresponding virtual_phy
object from its parent's hba_port's vphy_list and free this virtual_vphy
object.
In hba_port object add vphy_mask field to hold the list of HBA phy bits
which are assigned to vSES devices. Also add vphy_list list to hold list of
virtual_phy objects which holds the same portID of current hba_port's
portID.
Also, add a hba_vphy field in _sas_phy object to determine whether this
_sas_phy object belongs to vSES device or not.
- Allocate a virtual_phy object whenever a virtual phy is detected while
processing the SASIOUnitPage0's phy data. And this allocated virtual_phy
object to corresponding PortID's hba_port's vphy_list.
- When a vSES device is added to the SML then initialize the corresponding
virtual_phy objects's sas_address field with vSES device's SAS Address.
- Free this virtual_phy object during driver unload time and when this
vSES device is removed.
Link: https://lore.kernel.org/r/20201027130847.9962-11-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver currently sets PhysicalPort field to 0xFF for SMPPassthrough
Request message. In zoning topologies this SMPPassthrough command always
operates on devices in one zone (default zone) even when user issues SMP
command for other zone drives.
Define _transport_get_port_id_by_rphy() and
_transport_get_port_id_by_sas_phy() helper functions to get Physical Port
number from sas_rphy & sas_phy respectively for SMPPassthrough request
message so that SMP Passthrough request message is sent to intended zone
device.
Link: https://lore.kernel.org/r/20201027130847.9962-10-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During host reset there is a chance that the Port number allocated by the
firmware for the attached devices may change. Also, it may be possible that
some HBA phy's can go down/come up after reset. As a result, the driver
can't just trust the HBA Port table that it has populated before host reset
as valid. Instead it has to update the HBA Port table in such a way that it
shouldn't disturb the drives which are still accessible even after host
reset.
Use the following algorithm to update the HBA Port table during host reset:
I. After host reset operation and before marking the devices as
responding/non-responding, create a temporary Port table called "New
Port table" by parsing each of the HBA phy's Phy data info read from SAS
IOUnit Page0:
a. Check whether Phy's negotiated link rate is greater than 1.5Gbps, if
not go to next Phy;
b. Get the SAS Address of the attached device;
c. Create a new entry in the "New Port table" with SAS Address field
filled with attached device's SAS Address, port number with Phy's
Port number (read from SAS IOUnit Page0) and enable bit in the 'Phy
mask' field corresponding to current Phy number. New entry is
created only if the driver can't find an entry in the "New Port
table" which matches with attached device 'SAS Address' & 'Port
Number'. If it finds an entry with matches with attached device 'SAS
Address' & 'Port Number' then the driver takes that matched entry and
will enable current Phy number bit in the 'Phy mask' field;
d. After parsing all the HBA phy's info, the driver will have complete
Port table info in "New Port table".
II. Mark all the existing sas_device & sas_expander device structures as
'dirty'.
III. Mark each entry of the HBA Port lists as 'dirty'.
IV. Take each entry from 'New Port table' one by one and check whether the
entry has any corresponding matched entry (which is marked as 'dirty')
in the HBA Port table or not. While looking for a corresponding
matched entry, look for matched entry in the sequence from top row to
bottom row listed in the following table. If you find any matched entry
(according to any of the rules tabulated below) then perform the action
mentioned in the 'Action' column in that matched rule.
===========================================================================
|Search |SAS | Phy Mask | Port | Possibilities| Action |
|every |Address | or | Number | | required |
|entry |matched?| subset of| matched?| | |
|in below| | phy mask | | | |
|sequence| | matched? | | | |
===========================================================================
| 1 |matched | matched | matched | nothing |* unmark HBA port |
| | | | | changed |table entry as |
| | | | | |dirty |
---------------------------------------------------------------------------
| 2 |matched | matched | not | port number |* Update port |
| | | | matched | is changed |number in the |
| | | | | |matched port table |
| | | | | |entry |
| | | | | |* unmask HBA port |
| | | | | |table entry as |
| | | | | |dirty |
---------------------------------------------------------------------------
| 3.a |matched | subset of| matched |some phys |* Add these new |
| | | phy mask | (or) |might have |phys to current |
| | | matched | not |enabled which |port in STL |
| | | | matched |are previously|* Update phy mask |
| | | | (but |disabled |field in HBA's port|
| | | | first | |table's matched |
| | | | look for| |entry, |
| | | | matched | |* Update port |
| | | | one) | |number in the |
| | | | | |matched port |
| | | | | |table entry (if |
| | | | | |port number is |
| | | | | |changed), |
| | | | | |* Unmask HBA port |
| | | | | |table entry as |
| | | | | |dirty |
---------------------------------------------------------------------------
| 3.b |matched | subset of| matched |some phys |*Remove these phys |
| | | phy mask | (or) |might have |from current port |
| | | matched | not |disabled which|in STL |
| | | | matched |are previously|* Update phy mask |
| | | | (but |enabled |field in HBA's port|
| | | | first | |tables's matched |
| | | | look for| |entry, |
| | | | matched | |*Update port number|
| | | | one) | |in the matched port|
| | | | | |table entry (if |
| | | | | |port number is |
| | | | | |changed), |
| | | | | |* Unmask HBA port |
| | | | | |table entry as |
| | | | | |dirty |
---------------------------------------------------------------------------
| 4 |matched | not | matched |A cable |*Remove old phys & |
| | | matched | (or) |attached to an|new phys to current|
| | | | not |expander is |port in STL |
| | | | matched |changed to |* Update phy mask |
| | | | |another HBA |field in HBA's port|
| | | | |port during |tables's matched |
| | | | |reset |entry, |
| | | | | |*Update port number|
| | | | | |in the matched port|
| | | | | |table entry (if |
| | | | | |port number is |
| | | | | |changed), |
| | | | | |* Unmask HBA port |
| | | | | |table entry as |
| | | | | |dirty |
---------------------------------------------------------------------------
V. Delete the hba_port objects which are still marked as dirty.
Link: https://lore.kernel.org/r/20201027130847.9962-9-sreekanth.reddy@broadcom.com
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In the following scsi_host_template and sas_function_template callback
functions the driver does not have PhysicalPort number information to
retrieve the sas_device object using SAS Address & PhysicalPort number. In
these callback functions the device's rphy object is used to retrieve
sas_device object for the device.
.target_alloc,
.get_enclosure_identifier
.get_bay_identifier
When a rphy (of type sas_rphy) object is allocated then its address is
saved in corresponding sas_device object's rphy field. In
__mpt3sas_get_sdev_by_rphy(), the driver loops over all the sas_device
objects from sas_device_list list to retrieve the sas_device objects whose
rphy matches the provided rphy.
Link: https://lore.kernel.org/r/20201027130847.9962-8-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently driver retrieves the sas_device/sas_expander objects from
corresponding object's lists using just device's SAS Address.
Make driver retrieve the objects from the corresponding objects list using
device's SAS Address and PhysicalPort (or PortID) number. PhysicalPort
number is the port number of the HBA through which this device is accessed.
Link: https://lore.kernel.org/r/20201027130847.9962-6-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update hba_port's sas_address & phy_mask fields whenever a direct expander
or sas/sata target devices are added or removed.
When any direct attached device is discovered then driver:
- Gets the hba_port object corresponding to device's PhysicalPort
number;
- Updates the hba_port's sas_address field with device's SAS
Address;
- Updates the hba_port's phy_mask filed with device's narrow/wide
port Phy number bits;
- If a sas/sata end device (not only direct-attached devices) is added
then corresponding sas_device object's port variable is assigned with
hba_port object's address whose port_id matches the device's
PhysicalPort number.
- If an expander device is added then corresponding sas_expander object's
port variable is assigned with hba_port object's address whose port_id
matches the expander device's PhysicalPort number.
When any direct attached device is detached then driver will delete the
hba_port object corresponding to device's PhysicalPort number.
Whenever any HBA phy's link (of direct attached device's port) comes up
then update the phy_mask field of corresponding hba_port object.
Link: https://lore.kernel.org/r/20201027130847.9962-5-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Allocate hba_port object whenever a new HBA's wide/narrow port is
identified while processing the SASIOUnitPage0's phy data and add this
object to port_table_list. Deallocate these objects during driver unload.
Link: https://lore.kernel.org/r/20201027130847.9962-3-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This series consists of the usual driver updates (ufs, qla2xxx, tcmu,
ibmvfc, lpfc, smartpqi, hisi_sas, qedi, qedf, mpt3sas) and minor bug
fixes. There are only three core changes: adding sense codes,
cleaning up noretry and adding an option for limitless retries.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCX4YulyYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishaZDAQCT7rwG
UEZYHgYkU9EX9ERVBQM0SW4mLrxf3g3P5ioJsAEAtkclCM4QsIOP+MIPjIa0EyUY
khu0kcrmeFR2YwA8zhw=
=4w4S
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"The usual driver updates (ufs, qla2xxx, tcmu, ibmvfc, lpfc, smartpqi,
hisi_sas, qedi, qedf, mpt3sas) and minor bug fixes.
There are only three core changes: adding sense codes, cleaning up
noretry and adding an option for limitless retries"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (226 commits)
scsi: hisi_sas: Recover PHY state according to the status before reset
scsi: hisi_sas: Filter out new PHY up events during suspend
scsi: hisi_sas: Add device link between SCSI devices and hisi_hba
scsi: hisi_sas: Add check for methods _PS0 and _PR0
scsi: hisi_sas: Add controller runtime PM support for v3 hw
scsi: hisi_sas: Switch to new framework to support suspend and resume
scsi: hisi_sas: Use hisi_hba->cq_nvecs for calling calling synchronize_irq()
scsi: qedf: Remove redundant assignment to variable 'rc'
scsi: lpfc: Remove unneeded variable 'status' in lpfc_fcp_cpu_map_store()
scsi: snic: Convert to use DEFINE_SEQ_ATTRIBUTE macro
scsi: qla4xxx: Delete unneeded variable 'status' in qla4xxx_process_ddb_changed
scsi: sun_esp: Use module_platform_driver to simplify the code
scsi: sun3x_esp: Use module_platform_driver to simplify the code
scsi: sni_53c710: Use module_platform_driver to simplify the code
scsi: qlogicpti: Use module_platform_driver to simplify the code
scsi: mac_esp: Use module_platform_driver to simplify the code
scsi: jazz_esp: Use module_platform_driver to simplify the code
scsi: mvumi: Fix error return in mvumi_io_attach()
scsi: lpfc: Drop nodelist reference on error in lpfc_gen_req()
scsi: be2iscsi: Fix a theoretical leak in beiscsi_create_eqs()
...
The driver will throw an error message when a tampered type controller
is detected. The intent is to avoid interacting with any firmware
which is not secured/signed by Broadcom. Any tampering on firmware
component will be detected by hardware and it will be communicated to
the driver to avoid any further interaction with that component.
[mkp: switched back to dev_err]
Link: https://lore.kernel.org/r/20200814130426.2741171-1-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If driver has not received the interrupt for the aborted SCSI command
before processing the TM reply, driver polls all the reply descriptor pools
looking for the reply for the aborted SCSI command before marking TM as
FAILED. If it finds the reply, then it marks the TM as SUCCESS otherwise it
marks it FAILED.
scsih_tm_cmd_map_status() checks whether TM has aborted the timed out SCSI
command or not. If TM has aborted the IO, then it returns SUCCESS else it
returns FAILED.
Link: https://lore.kernel.org/r/1596096229-3341-7-git-send-email-suganath-prabu.subramani@broadcom.com
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add helper functions to check whether any SCSI command is outstanding on
particular Target, LUN device.
Also add function parameters 'channel', 'id' to function
mpt3sas_scsih_issue_tm().
Link: https://lore.kernel.org/r/1596096229-3341-6-git-send-email-suganath-prabu.subramani@broadcom.com
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
It is not recommended to issue back-to-back host reset without any delay.
However, if someone issues back-to-back host reset then we observe that
target devices get unregistered and re-register with SML. And if OS drive
is behind the HBA when it gets unregistered, then file-system goes into
read-only mode.
Normally during host reset, driver marks accessible target devices as
responding and triggers the event MPT3SAS_REMOVE_UNRESPONDING_DEVICES to
remove any non-responding devices through FW worker thread. While
processing this event, driver unregisters the non-responding devices and
clears the responding flag for all the devices.
Currently, during host reset, driver is cancelling only those Firmware
event works which are pending in Firmware event workqueue. It is not
cancelling work which is currently running. Change the driver to cancel all
events.
Link: https://lore.kernel.org/r/1596096229-3341-4-git-send-email-suganath-prabu.subramani@broadcom.com
Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
By default DIF Type 1, DIF Type 2 & DIF Type 3 will be enabled. Also,
users can enable either DIF Type 1 or DIF Type 2 or DIF Type 3 or in any
combination using the prot_mask module parameter.
However, when the user provides a prot_mask module parameter value of zero,
then the driver is not disabling the DIF. Instead it enables all three
types.
Modify the driver to disable the DIF support if the user provides a
prot_mask module parameter value of zero.
Link: https://lore.kernel.org/r/1588065902-2726-1-git-send-email-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Information needed to debug driver problems and firmware faults is stored
in the IOC’s MPT3SAS_ADAPTER data structure. Parameters such as IOCFacts,
IOC flags (related to sge, MSI-X, error recovery etc.), performance mode
type, TMs, internal commands reply status, etc. are present.
For debugging purposes, it is therefore helpful to be able to capture this
information so that the fault can be analyzed. Export the MPT3SAS_ADAPTER
data structure in debugfs. The data is available in:
/sys/kernel/debug/mpt3sas/scsi_hostX/ioc_dump
Link: https://lore.kernel.org/r/1588056322-29227-1-git-send-email-suganath-prabu.subramani@broadcom.com
Signed-off-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Generic protection fault type kernel panic is observed when user performs
soft (ordered) HBA unplug operation while IOs are running on drives
connected to HBA.
When user performs ordered HBA removal operation, the kernel calls PCI
device's .remove() call back function where driver is flushing out all the
outstanding SCSI IO commands with DID_NO_CONNECT host byte and also unmaps
sg buffers allocated for these IO commands.
However, in the ordered HBA removal case (unlike of real HBA hot removal),
HBA device is still alive and hence HBA hardware is performing the DMA
operations to those buffers on the system memory which are already unmapped
while flushing out the outstanding SCSI IO commands and this leads to
kernel panic.
Don't flush out the outstanding IOs from .remove() path in case of ordered
removal since HBA will be still alive in this case and it can complete the
outstanding IOs. Flush out the outstanding IOs only in case of 'physical
HBA hot unplug' where there won't be any communication with the HBA.
During shutdown also it is possible that HBA hardware can perform DMA
operations on those outstanding IO buffers which are completed with
DID_NO_CONNECT by the driver from .shutdown(). So same above fix is applied
in shutdown path as well.
It is safe to drop the outstanding commands when HBA is inaccessible such
as when permanent PCI failure happens, when HBA is in non-operational
state, or when someone does a real HBA hot unplug operation. Since driver
knows that HBA is inaccessible during these cases, it is safe to drop the
outstanding commands instead of waiting for SCSI error recovery to kick in
and clear these outstanding commands.
Link: https://lore.kernel.org/r/1585302763-23007-1-git-send-email-sreekanth.reddy@broadcom.com
Fixes: c666d3be99 ("scsi: mpt3sas: wait for and flush running commands on shutdown/unload")
Cc: stable@vger.kernel.org #v4.14.174+
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The current codebase makes use of the zero-length array language extension
to the C90 standard, but the preferred mechanism to declare variable-length
types such as these ones is a flexible array member[1][2], introduced in
C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning in
case the flexible array does not occur last in the structure, which will
help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by this
change:
"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")
Link: https://lore.kernel.org/r/20200224161406.GA21454@embeddedor
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Satish Kharat <satishkh@cisco.com>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Print the function name in which MPT command got timed out. This will
facilitate debugging in which path corresponding MPT command got timeout in
first failure instance of log itself.
Link: https://lore.kernel.org/r/20191226111333.26131-9-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This improves mpt3sas driver default debug information collection and
allows for a higher percentage of issues being able to be resolved with a
first-time data capture. However, this improvement to balance the amount
of debug data captured with the performance of driver.
Enabled below print messages with out affecting the IO performance,
1. When task abort TM is received then print IO commands's timeout value
and how much time this command has been outstanding.
2. Whenever hard reset occurs then print from where this hard reset has
been issued.
3. Failure message should be displayed for failure scenarios without any
logging level.
4. Added a print after driver successfully register or unregistered a
target drive with the SML. This print will be useful for debugging the
issue where the drive addition or deletion is hanging at SML.
5. During driver load time print request, reply, sense and config page
pool's information such as its address, length and size. Also printed
sg_tablesize information.
Link: https://lore.kernel.org/r/20191226111333.26131-8-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When Firmware fault occurs then print in which path firmware fault has
occurred. This will be useful while debugging the firmware fault issues.
Link: https://lore.kernel.org/r/20191226111333.26131-7-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Watchdog thread polls for IOC state every 1 second. If it detects that IOC
state is in CoreDump state then it immediately stops the IOs and also
clears the outstanding commands issued to the HBA firmware and then it will
poll for IOC state to be out of CoreDump state and once it detects that IOC
state is changed from CoreDump state to Fault state (or) CoreDumpTOSec
number of seconds are elapsed then it will issue host reset operation and
moves the IOC state to Operational state and resumes the IOs.
Whenever any TM is received from SML then if driver detects the IOC state
is in CoreDump state then it will wait for CoreDump state to be cleared and
will host reset operation.
Link: https://lore.kernel.org/r/20191226111333.26131-6-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Renamed _base_after_reset_handler function to
_base_clear_outstanding_commands so that it can be used in multiple
scenarios with suitable name which matches with the operation it does.
Also renamed its child functions. No functional changes.
Link: https://lore.kernel.org/r/20191226111333.26131-4-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>