Byte 69 bits 0:1 in the IDENTIFY DEVICE data indicate a
host-aware ZAC device.
Host-managed ZAC devices have their own individual signature,
and do not set the bits in the IDENTIFY DEVICE data.
And whenever we detect a ZAC-compatible device we should
be displaying the zoned block characteristics VPD page.
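
As a rough illustration of the check described above, the sketch below
masks bits 0:1 out of a 16-bit field taken from the IDENTIFY DEVICE data.
The function and enum names are hypothetical, and the mapping of 01b to
host-aware and 10b to device-managed is an assumption based on ACS-4, not
something this message states. Host-managed devices are identified by
their signature instead and are not covered here.

```c
#include <stdint.h>

/* Hypothetical names; the values assume the ACS-4 ZONED field encoding. */
enum zoned_cap {
	ZONED_NOT_REPORTED   = 0x0,
	ZONED_HOST_AWARE     = 0x1,
	ZONED_DEVICE_MANAGED = 0x2,
};

/* Extract bits 0:1 of the field that carries the zoned capabilities. */
static enum zoned_cap zoned_cap_from_field(uint16_t field)
{
	return (enum zoned_cap)(field & 0x3);
}
```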
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Device-managed ZAC devices just set the zoned capabilities field
in INQUIRY byte 69 (cf ACS-4). This corresponds to the 'zoned'
field in the block device characteristics VPD page.
As this is only defined in SPC-5/SBC-4 we also need to update
the supported SCSI version descriptor.
Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Tested-by: Shaun Tancheff <shaun.tancheff@seagate.com>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Add NCQ encapsulation for ZAC MANAGEMENT OUT and evaluate
NCQ Non-Data log pages to figure out if NCQ encapsulation
is supported.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
ZAC drives implement a 'ZAC Management Out' command template,
which maps onto the ZBC OUT command.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
ZAC drives implement a 'ZAC Management In' command template,
which maps onto the ZBC IN command.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
libata device disabling is ... curious. So add the correct
definitions so that we can disable ZAC devices properly.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
If a device is disabled after error recovery it doesn't make
any sense to generate an ATA sense, but we should rather
return a generic sense code indicating the device is gone.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Some commands like FPDMA RECEIVE or NCQ NON DATA can encapsulate
other commands to NCQ transport. So decode the subcmds, too.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
When reading the NCQ Send/Recv log, the log might not actually be
supported, thereby causing irritating messages
'READ LOG DMA EXT failed'.
Instead we should be reading the log directory first to
figure out if the log is actually supported before trying
to access it.
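
A minimal sketch of such a directory check, assuming the log directory
is a 512-byte page in which entry N is a little-endian 16-bit page count
for log address N. The helper name and the buffer layout are assumptions
for illustration, not the libata code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LOG_DIR_SIZE 512	/* one log page */

/*
 * Return true if the directory reports at least one page for 'log'.
 * Entry N is assumed to be a little-endian 16-bit page count stored
 * at byte offset N * 2.
 */
static bool log_supported(const uint8_t dir[LOG_DIR_SIZE], uint8_t log)
{
	size_t off = (size_t)log * 2;
	uint16_t pages = (uint16_t)(dir[off] | (dir[off + 1] << 8));

	return pages != 0;
}
```

A caller would read the directory once and only issue the read for the
NCQ Send/Recv log (assumed here to be log address 0x13) when
log_supported() returns true.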
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Define the NCQ NON DATA command and update libsas to handle it
correctly.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Do not call ata_request_sense() if the sense code is already
present.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
I actually read the error messages in my logs, and successful
initialization is not an error.
Arguably these log lines could be deleted entirely.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Acked-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When a cxlflash adapter goes into EEH recovery and multiple processes
(each having established its own context) are active, the EEH recovery
can hang if the processes attempt to recover in parallel. The symptom
logged after a couple of minutes is:
INFO: task eehd:48 blocked for more than 120 seconds.
Not tainted 4.5.0-491-26f710d+ #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
eehd 0 48 2
Call Trace:
__switch_to+0x2f0/0x410
__schedule+0x300/0x980
schedule+0x48/0xc0
rwsem_down_write_failed+0x294/0x410
down_write+0x88/0xb0
cxlflash_pci_error_detected+0x100/0x1c0 [cxlflash]
cxl_vphb_error_detected+0x88/0x110 [cxl]
cxl_pci_error_detected+0xb0/0x1d0 [cxl]
eeh_report_error+0xbc/0x130
eeh_pe_dev_traverse+0x94/0x160
eeh_handle_normal_event+0x17c/0x450
eeh_handle_event+0x184/0x370
eeh_event_handler+0x1c8/0x1d0
kthread+0x110/0x130
ret_from_kernel_thread+0x5c/0xa4
INFO: task blockio:33215 blocked for more than 120 seconds.
Not tainted 4.5.0-491-26f710d+ #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
blockio 0 33215 33213
Call Trace:
0x1 (unreliable)
__switch_to+0x2f0/0x410
__schedule+0x300/0x980
schedule+0x48/0xc0
rwsem_down_read_failed+0x124/0x1d0
down_read+0x68/0x80
cxlflash_ioctl+0x70/0x6f0 [cxlflash]
scsi_ioctl+0x3b0/0x4c0
sg_ioctl+0x960/0x1010
do_vfs_ioctl+0xd8/0x8c0
SyS_ioctl+0xd4/0xf0
system_call+0x38/0xb4
INFO: task eehd:48 blocked for more than 120 seconds.
The hang is caused by a three-way deadlock:
Process A holds the recovery mutex, and waits for eehd to complete.
Process B holds the semaphore and waits for the recovery mutex.
eehd waits for semaphore.
The fix is to have Process B above release the semaphore before
attempting to acquire the recovery mutex. This will allow
eehd to proceed to completion.
Signed-off-by: Manoj N. Kumar <manoj@linux.vnet.ibm.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Based on "[PATH V2] scsi_debug: rework resp_report_luns" patch
sent by Tomas Winkler on Thursday, 26 Feb 2015. His notes:
1. Remove duplicated boundary checks which simplify the fill-in
loop
2. Use more of scsi generic API
Replace the fixed-length response array with a heap allocation,
allowing up to 256 normal LUNs per target.
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Reviewed-by: Tomas Winkler <tomas.winkler@intel.com>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Use TYPE_* constants for SCSI peripheral device types instead of
numbers. Further cleanups requested by checkpatch.pl.
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The most common commands in normal use are the READ and WRITE SCSI
commands. Use likely and unlikely hints along the path taken by these
commands. Rename check_readiness() to make_ua() and remove associated
dead code. Rename devInfoReg() to find_build_dev_info().
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Group most defines together first; followed by struct definitions and
then table and variable definitions. Normalize all function headers.
[mkp: Corrected hex value in WP/DPOFUA MODE SENSE comment]
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When a negative value was placed in the delay parameter, a tasklet was
scheduled. Change the tasklet to a work queue. Previously a delay of -1
scheduled a high priority tasklet; since there are no high priority work
queues, treat -1 like other negative values in delay and schedule a work
item.
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add 'j' to delay names to make it clearer that its unit is jiffies and
to differentiate it from sdebug_ndelay whose unit is nanoseconds.
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver supports two command delay interfaces, the original one whose
unit is a jiffy, and a newer one whose unit is a nanosecond. Each had
different implementations. Keep both interfaces but simplify the
implementation to use a single delay mechanism based on high resolution
timers.
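
One way to picture the unification is a single helper that turns either
setting into a nanosecond value suitable for one timer. The sketch below
is an assumption-laden illustration: the function name is hypothetical,
HZ is passed in as a parameter, and letting a positive nanosecond delay
take precedence over the jiffy delay is an assumption, not something
this message specifies.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Convert the two delay settings into one nanosecond value.
 * Assumption: a positive ndelay wins over jdelay; non-positive
 * values in both mean "no timer delay".
 */
static uint64_t cmd_delay_ns(int jdelay, int ndelay, unsigned int hz)
{
	if (ndelay > 0)
		return (uint64_t)ndelay;
	if (jdelay > 0 && hz > 0)
		return (uint64_t)jdelay * (NSEC_PER_SEC / hz);
	return 0;
}
```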
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Remove logic to optionally hold host_lock while each command is
queued. Keep module and sysfs host_lock parameters for backward
compatibility. Note in module parameter description that host_lock is
ignored.
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Reviewed-by: Hannes Reinicke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Shorten file scope static and constant names. Use more
get/put_unaligned calls to hide bit banging. Introduce
sdebug_verbose boolean to replace frequent masking of
option bit flags. Add GPL and bump version.
[mkp: Use logical instead of bitwise OR for LBP VPD flags]
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Gerry Morong <gerry.morong@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Need to report HBA device removal faster than the
event handler polling interval.
Stop I/O to the removed disk and wait for all
I/O operations to flush before removing the device.
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Set offload_to_be_enabled to 0 when an ioaccel2 error is processed.
Before, an ioaccel completion error would turn off ioaccel but a rescan
would turn it back on again.
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
offload_to_be_enabled also needs to be set to 0 during a state
change.
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Faulty drives can cause the driver to hang during a
scan operation.
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Some companies have requested a sysfs entry
to obtain the SAS address of a device.
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver was calling scsi_scan_host before enabling interrupts.
This has gone unnoticed except for customers running in intx mode.
Calling scsi_scan_host before interrupts are enabled causes
"irq XX: nobody cared" messages and the driver to hang.
This patch enables interrupts before the call to scsi_scan_host.
Reported-by: Piotr Karbowski <piotr.karbowski@gmail.com>
Reviewed-by: Scott Teel <scott.teel@microsemi.com>
Reviewed-by: Kevin Barnett <kevin.barnett@microsemi.com>
Signed-off-by: Don Brace <don.brace@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When KDUMP is triggered the driver first talks to the firmware in INTX
mode, but the adapter firmware is still in MSIX mode. Therefore the first
driver command hangs since the driver is waiting for an INTX response and
firmware gives an MSIX response. When the OS is installed on a RAID
drive created by the adapter, KDUMP will hang since the driver does not
receive a response in sync mode.
Fixed by: Change the firmware to INTX mode if it is in MSIX mode before
sending the first sync command.
Cc: stable@vger.kernel.org
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently the driver completes fibs that have already been completed or
that were signalled by a spurious interrupt.
This is not necessary and causes the SCSI mid layer to issue aborts and
resets, since completing a fib prematurely might trigger a race condition
resulting in the driver not calling the scsi_done callback.
Fixed by removing the call to fib complete.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Firmware AIF messages about cache loss and data recovery are being missed
by the driver since currently they are not captured but rather let go.
This patch captures those messages and logs them for the user.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Typically under error conditions, it is possible for aac_command_thread()
to miss the wakeup from kthread_stop() and go back to sleep, causing it
to hang aac_shutdown.
In the observed scenario, the adapter is not functioning correctly and so
aac_fib_send() never completes (or times out, depending on how it was
called). Shortly after aac_command_thread() starts it performs
aac_fib_send(SendHostTime), which hangs. When the send from aac_probe_one
/aac_get_adapter_info times out, kthread_stop() is called, which breaks
the command thread out of its hang.
The code will still go back to sleep in schedule_timeout() without
checking kthread_should_stop(), so aac_probe_one hangs until
schedule_timeout() expires, which takes 30 minutes.
Fixed by: Adding another kthread_should_stop() before schedule_timeout()
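
The shape of the fix can be sketched in userspace, with an atomic flag
standing in for kthread_should_stop() and a counter standing in for
schedule_timeout(). The names are hypothetical and this is only an
illustration of re-checking the stop request right before sleeping.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool stop_requested;	/* stand-in for kthread_should_stop() */

/* Run up to max_iters rounds; return how many times the loop "slept". */
static int worker_loop(int max_iters)
{
	int sleeps = 0;

	for (int i = 0; i < max_iters; i++) {
		/* ... periodic work that may block for a long time ... */

		/* Re-check right before sleeping so a stop is not missed. */
		if (atomic_load(&stop_requested))
			break;

		sleeps++;	/* stand-in for schedule_timeout() */
	}
	return sleeps;
}
```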
Cc: stable@vger.kernel.org
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
As the firmware for series 6, 7, 8 cards does not support MSI, remove it
in the driver.
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
aac_fib_send has a special case for initial commands during
driver initialization using wait < 0 (pseudo sync mode). In this case,
the command does not sleep but rather spins checking for timeout. This
loop calls cpu_relax() in an attempt to allow other processes/threads
to use the CPU, but this function does not relinquish the CPU and so the
command will hog the processor. This was observed in a KDUMP
"crashkernel" and that prevented the "command thread" (which is
responsible for completing the command and keeping it from timing out)
from starting because it could not get the CPU.
Fixed by replacing "cpu_relax()" call with "schedule()"
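
A userspace analogue of the change, assuming sched_yield() as a stand-in
for schedule() and a bounded poll count as a stand-in for the timeout.
The names are hypothetical; the point is only that the polling loop
gives up the CPU on every pass so that the thread which completes the
command can run.

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool cmd_done;	/* set by whoever completes the command */

/* Poll for completion, yielding the CPU between checks. */
static bool poll_until_done(unsigned long max_polls)
{
	while (max_polls--) {
		if (atomic_load(&cmd_done))
			return true;
		sched_yield();	/* let other runnable threads make progress */
	}
	return atomic_load(&cmd_done);
}
```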
Cc: stable@vger.kernel.org
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The adapter has to be started after updating the number of MSIX Vectors
Fixes: ecc479e00d (aacraid: Set correct MSIX count for EEH recovery)
Cc: stable@vger.kernel.org
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Suggested-by: Seymour, Shane M <shane.seymour@hpe.com>
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The current driver checks for a NULL return from aac_fib_alloc_tag, but
it is not possible for it to return NULL.
Fixed by: Remove all the checks for NULL returns from aac_fib_alloc_tag
Suggested-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Raghava Aditya Renukunta <RaghavaAditya.Renukunta@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
mptsas_smp_handler() checks for dma mapping errors by comparing the
returned address with zero, while pci_dma_mapping_error() should be
used.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The file atari_NCR5380.c has been removed from the tree so remove it
from the MAINTAINERS file as well.
While we are here, add the file dtc3x80.txt as it is only relevant to
the dtc driver.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Some drives set the ILI flag together with MEDIUM ERROR sense code.
Clear the ILI flag in this case so that the medium error will be
handled. The problem was reported by Maurizio Lombardi.
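
For illustration, the check can be sketched against fixed-format sense
data, where byte 2 carries the ILI bit (0x20) and the sense key in its
low nibble (MEDIUM ERROR is 0x3). The helper name is hypothetical and
operating directly on the raw buffer is an assumption; the driver may
track these flags differently.

```c
#include <stdint.h>

#define SENSE_KEY_MASK   0x0f
#define SENSE_ILI        0x20
#define KEY_MEDIUM_ERROR 0x03

/* Clear ILI in fixed-format sense data when the key is MEDIUM ERROR. */
static void clear_ili_on_medium_error(uint8_t *sense)
{
	if ((sense[2] & SENSE_KEY_MASK) == KEY_MEDIUM_ERROR)
		sense[2] &= (uint8_t)~SENSE_ILI;
}
```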
Signed-off-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Remove incorrect lockdep assertion from lpfc_sli_hbqbuf_find() which
acquires the hbalock itself. Fix the comment which resulted in this
mistake.
Fixes: 1c2ba475eb ("lpfc: Add lockdep assertions")
Signed-off-by: Sebastian Herbszt <herbszt@gmx.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Presumably it isn't possible to have empty lists here, but my static
checker doesn't know that and complains that "ep" can be used
uninitialized.
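
The pattern the checker complains about, and one common way to quiet it,
can be sketched with a plain linked list. The types and names are
hypothetical and this is not the driver code: the pointer is given a
defined value before the loop, so the empty-list case is well defined.

```c
#include <stddef.h>

struct node {
	int id;
	struct node *next;
};

/* Return the last node of the list, or NULL if the list is empty. */
static const struct node *last_node(const struct node *head)
{
	const struct node *ep = NULL;	/* defined even if the loop never runs */

	for (const struct node *n = head; n; n = n->next)
		ep = n;
	return ep;
}
```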
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Nilesh Javali <nilesh.javali@qlogic.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This is only called from show_sas_rphy_enclosure_identifier(). The
caller expects that we set an identifier, otherwise it uses an
uninitialized variable.
[mkp: fixed typo]
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Don Brace <don.brace@microsemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Firmware events are queued up using the fw_event_work's struct work, not
its delayed_work member. The initial driver for SAS2 controllers had
handled firmware reset using the rescan barrier and was later redesigned
through "mpt2sas: [Resend] Host Reset code cleanup". The delayed_work
variables are now unused and may provoke CONFIG_DEBUG_OBJECTS_TIMERS
"assert_init not available" false warnings in
_scsih_fw_event_cleanup_queue.
Clean up fw_event_work's unused entries, update its kerneldoc, and
update _scsih_fw_event_cleanup_queue accordingly.
Fixes: 146b16c807 (mpt3sas: Refcount fw_events and fix unsafe list usage)
Signed-off-by: Joe Lawrence <joe.lawrence@stratus.com>
Acked-by: Chaitra P B <chaitra.basappa@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>