Need to take the lock while accessing the register to check to
see if config table changes have taken effect.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This is to prevent hpsa from resetting older boards
which the cciss driver may be controlling.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This is to conserve memory in a memory-limited kdump scenario
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
and use the doorbell reset method if available (which doesn't
lock up the controller if you properly save and restore all
the PCI registers that you're supposed to.)
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
After a reset, we should first wait for the board to become "not ready",
and then wait for it to become "ready", instead of immediately
waiting for it to become "ready", and do this waiting *after*
restoring PCI config space registers. Also, only wait 10 secs
for board to become "not ready" after a reset (it should quickly
become not ready.)
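The two-phase wait described above can be sketched in plain C. This is only an illustration under stated assumptions: `struct fake_board`, `board_is_ready()` and the poll counts are hypothetical stand-ins for the real hardware and are not taken from the driver.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the controller: it reports "ready" until
 * poll number not_ready_at, then "not ready" until poll number ready_at. */
struct fake_board {
    int polls;         /* how many times the status has been read */
    int not_ready_at;  /* poll index at which the board drops to "not ready" */
    int ready_at;      /* poll index at which the board is "ready" again */
};

static bool board_is_ready(struct fake_board *b)
{
    int n = b->polls++;
    return n < b->not_ready_at || n >= b->ready_at;
}

/* Poll until the board reports the wanted state, at most max_polls times.
 * Returns 0 on success, -1 on timeout. */
static int wait_for_state(struct fake_board *b, bool want_ready, int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        if (board_is_ready(b) == want_ready)
            return 0;
    }
    return -1;
}

/* After a reset: first wait (briefly) for "not ready", then for "ready". */
int wait_after_reset(struct fake_board *b, int not_ready_polls, int ready_polls)
{
    if (wait_for_state(b, false, not_ready_polls) != 0)
        return -1;
    return wait_for_state(b, true, ready_polls);
}
```

Waiting for "not ready" first avoids mistaking the stale pre-reset "ready" status for a completed reset.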
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
They are defined in hpsa_cmd.h
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Some low bits might have been set by the driver, causing
a message like this to come out:
[ 13.288062] ------------[ cut here ]------------
[ 13.293211] WARNING: at lib/dma-debug.c:803 check_unmap+0x1a1/0x654()
[ 13.300387] Hardware name: ProLiant DL180 G6
[ 13.305335] hpsa 0000:06:00.0: DMA-API: device driver tries to free
DMA memory it has not allocated [device address=0x000000007f81e001]
[size=640 bytes]
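The device address in the warning ends in 1, which is consistent with a low tag bit still being set when the address was handed back for unmapping. A minimal sketch of masking such bits off first, assuming a hypothetical 5-bit tag field (`ADDR_TAG_BITS` and both helper names are illustrative, not the driver's):

```c
#include <stdint.h>

/* Assumption: the low ADDR_TAG_BITS bits of an aligned bus address are
 * reused by the driver as a tag. The width is hypothetical. */
#define ADDR_TAG_BITS 5u
#define ADDR_TAG_MASK ((UINT64_C(1) << ADDR_TAG_BITS) - 1)

/* Store a small tag in the low bits of an aligned address. */
uint64_t tag_address(uint64_t dma_addr, unsigned tag)
{
    return dma_addr | (tag & ADDR_TAG_MASK);
}

/* Recover the address that was originally mapped, as needed before unmapping. */
uint64_t untag_address(uint64_t tagged)
{
    return tagged & ~ADDR_TAG_MASK;
}
```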
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
[jejb: fix up patch problems and checkpatch.pl issues]
Signed-off-by: Nick Cheng <nick.cheng@areca.com.tw>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Currently NetApp's VID/PID details in the INQUIRY response show up as
'NETAPP' and 'LUN'. With upcoming scalable SAN ONTAP version on NetApp
controllers, the PID entry alone is being modified to 'LUN C-Mode' (to
distinguish current ONTAP LUNs from scalable ONTAP LUNs).
'LUN' would still suffice for matching 'LUN C-Mode' but best to
explicitly add these new NetApp LUNs to the device list.
Reported-by: Martin George <marting@netapp.com>
Acked-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Adds Promise VTrak devices to the ALUA device handler.
Signed-off-by: Ilgu Hong <ilgu.hong@promise.com>
Signed-off-by: Joseph Gruher <joseph.r.gruher@intel.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Initialize stpg_endio() 'err' to SCSI_DH_OK and only change it to
SCSI_DH_IO accordingly. This allows the switching of target group state
to be properly reported when no error has occurred.
Signed-off-by: Joseph Gruher <joseph.r.gruher@intel.com>
Signed-off-by: Ilgu Hong <ilgu.hong@promise.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The use of blk_execute_rq_nowait() implies __blk_put_request() is needed
in stpg_endio() rather than blk_put_request() -- blk_finish_request() is
called with queue lock already held.
Signed-off-by: Joseph Gruher <joseph.r.gruher@intel.com>
Signed-off-by: Ilgu Hong <ilgu.hong@promise.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
submit_stpg() always returns failure, so alua_activate() reports
failure via the dm-multipath callback function. Even when the STPG fired
successfully, dm-multipath does not know and always fails to change the
valid path.
By returning SCSI_DH_OK we're now skipping alua_activate()'s call to
activate_complete 'fn'. But this is fine because stpg_endio() will call
it via h->callback_fn().
Signed-off-by: Joseph Gruher <joseph.r.gruher@intel.com>
Signed-off-by: Ilgu Hong <ilgu.hong@promise.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Upgrade driver version from 7.100.00.00 to 8.100.00.00
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Basic Code Cleanup:
(1) _base_get_cb_idx and mpt2sas_base_free_smid were reorganized in
similar fashion so the order of obtaining the cbx and smid are scsiio,
hi_priority, and internal.
(2) The hi_priority and internal request queue struct was made smaller
by removing the scmd and chain_tracker, thus saving memory allocation.
(3) For scsiio request, a new structure was created having the same
elements from the former request tracker struct.
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Add support for Customer specific branding messages when device driver loads,
based on specific customer subsystem vendor and device Ids
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Revision P MPI Header Update:
a) Added enable/disable SATA NCQ operations to SAS IO Unit Control
Request.
b) Modified Host Based Discovery Action Request message format.
c) Removed Device Path bit from IO Unit Page 1 Flags field.
d) Added description of ChainOffset field for Diagnostic Data Upload
Tool. Chaining is not allowed.
Removed mpi2_history.txt file
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Issue:
IR shutdown(sending) and IR shutdown(complete) messages not
listed in /var/log/messages when driver is removed.
The driver needs to issue a MPI2_RAID_ACTION_SYSTEM_SHUTDOWN_INITIATED
request when the driver is unloaded so the IR metadata journal is updated.
If this request is not sent, then the volume would need a "check
consistency" issued on the next bootup if the volume was roamed from one
initiator to another. The current driver supports this feature only when the
system is rebooted; however, this also needs to be supported when the driver is
unloaded.
Fix:
To fix this issue, the driver is going
to need to call the _scsih_ir_shutdown prior to reporting
the volumes missing from the OS, hence the device handles
are still present.
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
There was a configuration page timing out during the initial port
enable at driver load time. The port enable would fail, and this would
result in the driver unloading itself, meanwhile the driver was accessing
freed memory in another context resulting in the panic. The fix is to
prevent access to freed memory once the driver had issued the diag reset
which woke up the sleeping port enable process. The routine
_base_reset_handler was reorganized so the last sleeping process woken up was
the port_enable.
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
False timeouts after hard resets. There were two issues which led
to the timeouts:
(1) Panic because of invalid memory access in the broadcast async
event processing routine due to a race between accessing the scsi command
pointer from the broadcast async event processing thread and completing
the same scsi command from the interrupt context.
(2) Broadcast async event notifications are not handled because events
are ignored while a broadcast async event is actively being processed
by the event process kernel thread.
In addition, changed the ABRT_TASK_SET to ABORT_TASK in the
broadcast async event processing routine. This is less disruptive to other
requests that generate Broadcast Async Primitives besides target
reset, e.g. clear reservations, microcode download, and mode select.
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The ioc->hba_queue_depth is not properly resized when the controller
firmware reports that it supports more outstanding IO than what can be fit
inside the reply descriptor pool depth. This is reproduced by setting the
controller global credits larger than 30,000. The bug results in an
incorrect sizing of the queues. The fix is to resize the queue_size by
dividing queue_diff by two.
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The "internal device reset complete" event is not supported
for older firmware prior to MPI Rev K. We added
a check in the driver so the "internal device reset" event is
ignored for older firmware. When ignored, the tm_busy flag is neither
set nor cleared. Without this fix, IO queues would be frozen
indefinitely after the "internal device reset" event, as the "complete" event
is never sent to clear the flag.
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
When zoning end devices, the driver is not sending the device
removal handshake algorithm to firmware. This results in controller
firmware not sending sas topology add events the next time the device is
added. The fix is the driver should be doing the device removal handshake
even though the PHYSTATUS_VACANT bit is set in the PhyStatus of the
event data. The current design is avoiding the handshake when the
VACANT bit is set in the phy status.
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The [vk][cmz]alloc(_node) family of functions return void pointers, which
there is no need to cast to other pointer types since that conversion
happens implicitly.
This patch removes such casts from drivers/scsi/
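The same rule holds for any function returning `void *` in C. A userspace sketch using `calloc()` in place of the kernel allocators (`struct widget` and `widget_alloc()` are made up for illustration):

```c
#include <stdlib.h>

struct widget {
    int id;
    char name[16];
};

struct widget *widget_alloc(int id)
{
    /* No cast needed: in C, void * converts implicitly to struct widget *. */
    struct widget *w = calloc(1, sizeof(*w));

    if (w)
        w->id = id;
    return w;
}
```

Note that this applies to C only; C++ would require the cast.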
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI/PM: Report wakeup events before resuming devices
PCI/PM: Use pm_wakeup_event() directly for reporting wakeup events
PCI: sysfs: Update ROM to include default owner write access
x86/PCI: make Broadcom CNB20LE driver EMBEDDED and EXPERIMENTAL
x86/PCI: don't use native Broadcom CNB20LE driver when ACPI is available
PCI/ACPI: Request _OSC control once for each root bridge (v3)
PCI: enable pci=bfsort by default on future Dell systems
PCI/PCIe: Clear Root PME Status bits early during system resume
PCI: pci-stub: ignore zero-length id parameters
x86/PCI: irq and pci_ids patch for Intel Patsburg
PCI: Skip id checking if no id is passed
PCI: fix __pci_device_probe kernel-doc warning
PCI: make pci_restore_state return void
PCI: Disable ASPM if BIOS asks us to
PCI: Add mask bit definition for MSI-X table
PCI: MSI: Move MSI-X entry definition to pci_regs.h
Fix up trivial conflicts in drivers/net/{skge.c,sky2.c} that had in the
meantime been converted to not use legacy PCI power management, and thus
no longer use pci_restore_state() at all (and that caused trivial
conflicts with the "make pci_restore_state return void" patch)
SDEV_MEDIA_CHANGE event was first added by commit a341cd0f (SCSI: add
asynchronous event notification API) for SATA AN support and then
extended to cover generic media change events by commit 285e9670
([SCSI] sr,sd: send media state change modification events).
This event was mapped to block device in userland with all properties
stripped to simulate CHANGE event on the block device, which, in turn,
was used to trigger further userspace action on media change.
The recent addition of disk event framework kept this event for
backward compatibility but it turns out to be unnecessary and causes
erratic and inefficient behavior. The new disk event generates proper
events on the block devices and the compat events are mapped to block
device with all properties stripped, so the block device ends up
generating multiple duplicate events for single actual event.
This patch removes the compat event generation from both sr and sd as
suggested by Kay Sievers. Both existing and newer versions of udev
and the associated tools will behave better with the removal of these
events as they from the beginning were expecting events on the block
devices.
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Replace sd_media_change() with sd_check_events().
* Move media removed logic into set_media_not_present() and
media_not_present() and set sdev->changed iff an existing media is
removed or the device indicates UNIT_ATTENTION.
* Make sd_check_events() set sdev->changed if previously missing
media becomes present.
* Event is reported only if sdev->changed is set.
With this change, a media presence event is reported only if
scsi_disk->media_present actually changed or the device indicated
UNIT_ATTENTION. For backward compatibility, SDEV_EVT_MEDIA_CHANGE is
generated each time sd_check_events() detects a media change event.
[jejb: fix boot failure]
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
* 'for-2.6.38/core' of git://git.kernel.dk/linux-2.6-block: (43 commits)
block: ensure that completion error gets properly traced
blktrace: add missing probe argument to block_bio_complete
block cfq: don't use atomic_t for cfq_group
block cfq: don't use atomic_t for cfq_queue
block: trace event block fix unassigned field
block: add internal hd part table references
block: fix accounting bug on cross partition merges
kref: add kref_test_and_get
bio-integrity: mark kintegrityd_wq highpri and CPU intensive
block: make kblockd_workqueue smarter
Revert "sd: implement sd_check_events()"
block: Clean up exit_io_context() source code.
Fix compile warnings due to missing removal of a 'ret' variable
fs/block: type signature of major_to_index(int) to major_to_index(unsigned)
block: convert !IS_ERR(p) && p to !IS_ERR_NOR_NULL(p)
cfq-iosched: don't check cfqg in choose_service_tree()
fs/splice: Pull buf->ops->confirm() from splice_from_pipe actors
cdrom: export cdrom_check_events()
sd: implement sd_check_events()
sr: implement sr_check_events()
...
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (147 commits)
[SCSI] arcmsr: fix write to device check
[SCSI] lpfc: lower stack use in lpfc_fc_frame_check
[SCSI] eliminate an unnecessary local variable from scsi_remove_target()
[SCSI] libiscsi: use bh locking instead of irq with session lock
[SCSI] libiscsi: do not take host lock in queuecommand
[SCSI] be2iscsi: fix null ptr when accessing task hdr
[SCSI] be2iscsi: fix gfp use in alloc_pdu
[SCSI] libiscsi: add more informative failure message during iscsi scsi eh
[SCSI] gdth: Add missing call to gdth_ioctl_free
[SCSI] bfa: remove unused defintions and misc cleanups
[SCSI] bfa: remove inactive functions
[SCSI] bfa: replace bfa_assert with WARN_ON
[SCSI] qla2xxx: Use sg_next to fetch next sg element while walking sg list.
[SCSI] qla2xxx: Fix to avoid recursive lock failure during BSG timeout.
[SCSI] qla2xxx: Remove code to not reset ISP82xx on failure.
[SCSI] qla2xxx: Display mailbox register 4 during 8012 AEN for ISP82XX parts.
[SCSI] qla2xxx: Don't perform a BIG_HAMMER if Get-ID (0x20) mailbox command fails on CNAs.
[SCSI] qla2xxx: Remove redundant module parameter permission bits
[SCSI] qla2xxx: Add sysfs node for displaying board temperature.
[SCSI] qla2xxx: Code cleanup to remove unwanted comments and code.
...
Use command->sc_data_direction instead of trying (incorrectly) to
figure it out from the command itself
[jejb: fix up compile failure]
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Acked-by: NickCheng <nick.cheng@areca.com.tw>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
According to checkstack, lpfc_fc_frame_check occupies the first
place in stack usage:
make checkstack
objdump -d vmlinux $(find . -name '*.ko') | \
perl /root/rpmbuild/BUILD/kernel-2.6.32/linux-2.6.32.x86_64/scripts/checkstack.pl x86_64
0x000013f4 lpfc_fc_frame_check [lpfc]: 1936
...
This change makes the rctl_names static, thus not on stack.
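The effect of the `static` qualifier can be illustrated with a small lookup function. The table contents below are hypothetical; the actual rctl_names entries are not shown in this log.

```c
/* Hypothetical name table; only the storage class matters here. */
const char *code_name(unsigned code)
{
    /* static const: a single copy lives in read-only data. Without
     * "static", the array would be rebuilt in the stack frame on every call. */
    static const char *const names[] = { "zero", "one", "two" };

    if (code >= sizeof(names) / sizeof(names[0]))
        return "unknown";
    return names[code];
}
```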
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The below patch fixes a typo "diable" to "disable" and also fixes another typo in a comment.
Please let me know if this is correct or not.
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The session lock is taken in threads, timers, and bottom halves
like softirqs and tasklets. All the code except
iscsi_conn/session_failure takes the session lock with the spin_lock_bh
call. Those two functions used irq locking because I thought some
offload drivers would be calling them from an irq. They never did,
so this patch has iscsi_conn/session_failure use the bh
locking.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
iscsi_tcp, ib_iser, cxgb*, be2iscsi and bnx2i do not use
the host lock and do not take the session lock against
an irq, so this patch drops the DEF_SCSI_QCMD use. Instead
we just take the session lock and disable bhs.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
If alloc_pdu fails then the task->hdr pointer may not be
set. This adds a check for this case in the cleanup callback.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The pdu allocation callout is called with a spin lock held
and in the IO path so we cannot use GFP_KERNEL. This
has the driver use GFP_ATOMIC.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This adds a more informative error code and message
for the iscsi scsi eh session drop paths. This allows
you to distinguish if the session was dropped due to
a connection failure vs the iscsi layer dropping
the session due to scsi eh failure processing.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Add missing call to gdth_ioctl_free before aborting.
The semantic match that finds this problem is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression buf,ha,len,addr,E;
@@
buf = gdth_ioctl_alloc(ha, len, FALSE, &addr)
... when != false buf != NULL
when != true buf == NULL
when != \(E = buf\|buf = E\)
when != gdth_ioctl_free(ha, len, buf, addr)
*return ...;
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch removes unused functions, data structures, and definitions. It
also includes misc comment and formatting cleanups.
Signed-off-by: Jing Huang <huangj@brocade.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch removes some inactive functions and macros.
Signed-off-by: Jing Huang <huangj@brocade.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] fix up documentation for change in ->queuecommand to lockless calling
[SCSI] bfa: rename log_level to bfa_log_level
The semantics we employ now in the driver, performing a
BIG_HAMMER in the event of Get-ID (0x20) mailbox command
failing, should only be done for FC. On FC configurations, it
makes sense since advertising is only really performed once,
so a BIG_HAMMER to reinitiate the process is needed to
restart. Under FCoE, this is not needed, as there's a
continuous stream of advertisements/ACKs at the protocol layer
to initiate a relogin/reinitialization process.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
For driver module parameters that have permission bits set to
(S_IRUGO|S_IRUSR), remove the second term since it is already
included in the first term.
S_IRUGO comes defined as (S_IRUSR|S_IRGRP|S_IROTH).
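The redundancy can be checked in userspace. S_IRUGO is a kernel-only macro, so the sketch defines it locally using the expansion quoted above; `perm_is_redundant()` is an illustrative helper, not a kernel function.

```c
#include <sys/stat.h>

/* Local definition mirroring the kernel's expansion of S_IRUGO. */
#ifndef S_IRUGO
#define S_IRUGO (S_IRUSR | S_IRGRP | S_IROTH)
#endif

/* Returns nonzero if OR-ing "extra" into "combined" changes nothing. */
int perm_is_redundant(unsigned combined, unsigned extra)
{
    return (combined | extra) == combined;
}
```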
Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The FCP priority info was not being updated properly in certain situations.
Here are the changes that need to be made to take care of this issue:
1. No need to check fcport->state for FCS_UNCONFIGURED in
qla24xx_update_fcport_fcp_prio(), since an invalid loop id check is
already performed which is sufficient.
2. Add the missing qla24xx_update_fcport_fcp_prio() function call
within qla2x00_update_fcport() function, so that the priority info
is updated on every port addition or change.
3. Perform proper adapter types checking.
4. Other changes, associated with DEBUG/printk's and parameter passing.
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Fixed the incorrect zero test on array new_config[].
Signed-off-by: Joe Carnuccio <joe.carnuccio@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This is just a cleanup.
The unneeded NULL check annoys static checkers because we already
dereferenced it, and then we check it, and then (if it's not the _safe()
version) we dereference it again without checking. And the static
checker is all, "Wah? Is it null or not?"
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Support is added for quiescence mode. This feature is for P3P
adapters. Any of the functions can put the firmware into quiescence
state. All the others have to ack that request. During quiescence mode
current commands are processed and all the new incoming I/Os are
blocked. Loop resync is performed after firmware comes out of
quiescence state.
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
pci_restore_state only ever returns 0, thus there is no benefit in
having it return any value. Also, a large majority of the callers do
not check the return code of pci_restore_state. Make the
pci_restore_state a void return and avoid the overhead.
Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Jon Mason <jon.mason@exar.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
1. Change first parameter from cnic_dev to ulp_handle which is the hba
pointer. All other similar upcalls are using hba pointer. The callee
can then directly reference the hba without conversion.
2. Change return value from void to int so that an error code can be
passed back. This allows the operation to be retried.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds MegaRAID 9265/9285 (Device id 0x5b) specific code
Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The following patch adds struct megasas_instance_template changes to
the megaraid_sas driver, and changes all code to use the new instance
entries:
irqreturn_t (*service_isr )(int irq, void *devp);
void (*tasklet)(unsigned long);
u32 (*init_adapter)(struct megasas_instance *);
u32 (*build_and_issue_cmd) (struct megasas_instance *, struct scsi_cmnd *);
void (*issue_dcmd) (struct megasas_instance *instance,
struct megasas_cmd *cmd);
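A reduced sketch of how a per-controller table of function pointers like the one above is typically used. Every name here (`struct adapter`, `gen1_ops`, `adapter_bring_up()` and so on) is hypothetical and much simpler than the megaraid_sas template.

```c
#include <stdint.h>

struct adapter;

/* Hypothetical, reduced analogue of an instance template:
 * one table of entry points per controller family. */
struct adapter_ops {
    uint32_t (*init_adapter)(struct adapter *);
    uint32_t (*read_status)(struct adapter *);
};

struct adapter {
    const struct adapter_ops *ops;
    uint32_t status;
};

static uint32_t gen1_init(struct adapter *a) { a->status = 0x10; return 0; }
static uint32_t gen2_init(struct adapter *a) { a->status = 0x20; return 0; }
static uint32_t common_read_status(struct adapter *a) { return a->status; }

static const struct adapter_ops gen1_ops = { gen1_init, common_read_status };
static const struct adapter_ops gen2_ops = { gen2_init, common_read_status };

/* Pick the table once; the rest of the code only calls through a->ops. */
uint32_t adapter_bring_up(struct adapter *a, int generation)
{
    a->ops = (generation == 2) ? &gen2_ops : &gen1_ops;
    if (a->ops->init_adapter(a) != 0)
        return 0;
    return a->ops->read_status(a);
}
```

With this layout, supporting a new controller family means adding one more table rather than branching on the device type at every call site.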
Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The following patch modifies the megaraid_sas driver to select the
lowest memory bar available so the driver will work in SR-IOV VF
environments where the memory bar mapping changes.
Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch adds MSI-X support and 'msix_disable' module parameter to
the megaraid_sas driver.
Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Modify allocation to try the minimum possible page order allowed by the HBA
scatter/gather segment limit in allocation of the driver's internal
buffer. This increases the probability of successful allocation. The
allocation may still fail if this minimum order is > 0.
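One way to compute such a minimum order, as a sketch only: the 4 KiB page size, the function name and the parameters are assumptions, and the real driver logic may differ.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* Smallest order such that max_segs chunks of (page size << order) bytes
 * cover buf_size. Returns -1 if no order up to max_order is enough. */
int min_page_order(size_t buf_size, unsigned max_segs, int max_order)
{
    if (max_segs == 0)
        return -1;
    for (int order = 0; order <= max_order; order++) {
        size_t chunk = (size_t)SKETCH_PAGE_SIZE << order;

        if (chunk * max_segs >= buf_size)
            return order;
    }
    return -1;
}
```

For a 1 MiB buffer, 256 segments allow order 0, while 64 segments force order 2, which is the case where the allocation may still fail.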
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Reported-by: Lukas Kolbe <lkolbe@techfak.uni-bielefeld.de>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The order of the pages allocated for the driver buffer must be stored before
allocation because it is used in freeing already allocated pages if
allocation fails.
Signed-off-by: Kai Makisara <kai.makisara@kolumbus.fi>
Reported-by: Lukas Kolbe <lkolbe@techfak.uni-bielefeld.de>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Set resid to the requested data-in length when a MEDIUM ERROR is
simulated. This implies no valid data is returned in the data-in
buffer.
Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Our current handling of medium error assumes that data is returned up
to the bad sector. This assumption holds good for all disk devices,
all DIF arrays and most ordinary arrays. However, an LSI array engine
was recently discovered which reports a medium error without returning
any data. This means that when we report good data up to the medium
error, we've reported junk originally in the buffer as good. Worse,
if the read consists of requested data plus a readahead, and the error
occurs in readahead, we'll just strip off the readahead and report
junk up to userspace as good data with no error.
The fix for this is to have the error position computation take into
account the amount of data returned by the driver using the scsi
residual data. Unfortunately, not every driver fills in this data,
but for those who don't, it's set to zero, which means we'll think a
full set of data was transferred and the behaviour will be identical
to the prior behaviour of the code (believe the buffer up to the error
sector). All modern drivers seem to set the residual, so that should
fix up the LSI failure/corruption case.
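The computation described above can be sketched as taking the smaller of "bytes up to the bad sector" and "bytes actually transferred". This is a simplified illustration; the function and parameter names are invented and it is not the sd driver's code.

```c
#include <stdint.h>

/* Returns how many bytes may be reported as good data after a medium error.
 * xfer_len: requested transfer length in bytes
 * resid:    residual reported by the driver (0 if the driver does not set it) */
uint32_t good_bytes_on_medium_error(uint32_t xfer_len, uint32_t resid,
                                    uint64_t start_lba, uint64_t bad_lba,
                                    uint32_t sector_size)
{
    uint32_t transferred = (resid < xfer_len) ? xfer_len - resid : 0;
    uint64_t up_to_error;

    if (bad_lba < start_lba)
        return 0;

    up_to_error = (bad_lba - start_lba) * sector_size;
    if (up_to_error > xfer_len)
        up_to_error = xfer_len;

    /* Never trust more of the buffer than the device actually filled. */
    return up_to_error < transferred ? (uint32_t)up_to_error : transferred;
}
```

With resid left at zero the result matches the old behaviour (believe the buffer up to the error sector); with resid equal to the full length, nothing is reported as good.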
Reported-by: Douglas Gilbert <dgilbert@interlog.com>
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Conflicts:
MAINTAINERS
arch/arm/mach-omap2/pm24xx.c
drivers/scsi/bfa/bfa_fcpim.c
Needed to update to apply fixes for which the old branch was too
outdated.
This reverts commit c8d2e93735.
We run into merging problems with the SCSI tree, revert this one
so it can be handled by a postmerge tree there.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
This patch updates the GPL headers in megaraid_sas_base.c and megaraid_sas.h.
Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch renames megaraid_sas.c to megaraid_sas_base.c to facilitate
other files in the compile.
Signed-off-by: Adam Radford <aradford@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
It seems that zero should be returned if scsi_target_is_busy(starget) is
true, no matter if sdev is on the starved list.
Signed-off-by: Hillf Danton <dhillf@gmail.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Currently, when scsi_dh_activate() returns with an error
(e.g. SCSI_DH_NOSYS) the activate_complete callback is not called and
the error is not propagated to DM mpath.
When a SCSI device attached to a device handler is deleted, userland
processes currently performing I/O on the device will have their I/O
hang forever.
- Set SCSI_DH_NOSYS error when the handler is in the process of being
deleted (e.g. the SCSI device is in a SDEV_CANCEL or SDEV_DEL state).
- Set SCSI_DH_DEV_OFFLINED error when device is in SDEV_OFFLINE state.
- Call the activate_complete callback function directly from
scsi_dh_activate if an error has been set (when either the scsi_dh
internal data has already been deleted or is in the process of being
deleted).
The patch was tested in an iSCSI environment, RDAC H/W handler and
multipath. In the following reproduction process, dd's I/O will hang
forever and the only way to release it will be to reboot the machine:
1) Perform I/O on a multipath device:
dd if=/dev/dm-0 of=/dev/zero bs=8k count=1000000 &
2) Delete all slave SCSI devices contained in the mpath device:
I) In an iSCSI environment, the easiest way to do this is by
stopping iSCSI:
/etc/init.d/iscsi stop
II) Another way to delete the devices is by applying the following
bash scriptlet:
dm_devs=$(ls /sys/block/ | grep dm- | xargs)
for dm_dev in $dm_devs; do
devices=$(ls /sys/block/$dm_dev/slaves)
for device in $devices; do
echo 1 > /sys/block/$device/device/delete
done
done
NOTE: when DM mpath's fail_path uses blk_abort_queue this scsi_dh change
isn't strictly required. However, DM mpath's call to blk_abort_queue
will soon be reverted because it has proven to be unsafe due to a race
(between blk_abort_queue and scsi_request_fn) that can lead to list
corruption. Therefore we cannot rely on blk_abort_queue via fail_path,
but even if we could, this scsi_dh change is still preferable.
Signed-off-by: Menny Hamburger <Menny_Hamburger@Dell.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Reviewed-by: Babu Moger <babu.moger@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Otherwise, after doing a RAID level migration, the disk will be
disruptively removed and re-added as a different disk on rescan.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The firmware may have been updated, in which case, it's the same device,
and in that case, we do not want to remove and add the device, we want to
let it continue as is.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Updated commands used for ELS to utilize VPI
Allocate RPI at node creation time and pass in ELS commands.
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Implement new SLI4 init procedures based on if_type:
- Add structure changes for new SLIPORT registers and BAR changes.
- Update register names to be consistent with interface spec terms.
- Added union to encapsulate Hardware error registers.
- Rework lpfc_sli4_post_status_check() around SLI-4's SLI_INTF type
- Removed the lpfc_sli4_fw_cfg_check routine
- Segmented driver logic to include evaluation of the if_type to
engage different behaviors.
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Implement the FC and SLI async event handlers:
- Updated MQ_CREATE_EXT mailbox structure to include fc and SLI async events.
- Added the SLI trailer code.
- Split physical field into type and number to reflect latest SLI spec.
- Changed lpfc_acqe_fcoe to lpfc_acqe_fip to reflect latest Spec changes.
- Added lpfc_acqe_fc_la structure for FC link attention async events.
- Added lpfc_acqe_sli structure for sli async events.
- Added lpfc_sli4_async_fc_evt routine to handle fc la async events.
- Added lpfc_sli4_async_sli routine to handle sli async events.
- Moved LPFC_TRAILER_CODE_FC to be handled by its own handler function.
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
- Use iocbq->context1 to hold the ndlp pointer.
- Set ndlp in all iocbs generated from ioctl functions.
- Turn parity and serr bits back on after performing sli4 board reset.
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Fix iotag handling:
1) Update and check io tag for retry case.
2) Clearing upper 3 bits in io tag when an IO completes.
The 3 upper bits in io tags are used for counting FCP exchange retry.
Un-cleared bits will cause firmware to access invalid memory when the
same io tag is used for an IO to a target that doesn't support FCP
exchange retry.
3) Only check the effective bits when validating an iotag.
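A minimal sketch of the iotag handling described above. The 16-bit tag
width, the 13/3 bit split and all names (IOTAG_*, iotag_*) are assumptions
made for illustration, not the driver's actual definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: 16-bit tag, upper 3 bits count exchange retries. */
#define IOTAG_RETRY_SHIFT 13
#define IOTAG_INDEX_MASK  0x1fffu /* the "effective" low 13 bits */

/* Retry case: bump the retry counter in the upper bits (wraps after 7). */
uint16_t iotag_next_retry(uint16_t tag)
{
	uint16_t retry = (uint16_t)(((tag >> IOTAG_RETRY_SHIFT) + 1u) & 0x7u);

	return (uint16_t)((tag & IOTAG_INDEX_MASK) |
			  (retry << IOTAG_RETRY_SHIFT));
}

/* Completion: clear the retry bits so a later IO starts with a clean tag. */
uint16_t iotag_on_complete(uint16_t tag)
{
	return (uint16_t)(tag & IOTAG_INDEX_MASK);
}

/* Validation: compare only the effective bits against the tag count. */
bool iotag_valid(uint16_t tag, uint16_t num_tags)
{
	assert(num_tags <= IOTAG_INDEX_MASK + 1u);
	return (tag & IOTAG_INDEX_MASK) < num_tags;
}
```

With this layout a tag that has been retried still validates against the
same slot, and the retry count never leaks into the next IO that reuses it.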
Other minor fixes:
1) Added trace to get FC header type with assert of unhandled packet received.
Ignore the type FC_TYPE_FC_FSS (FC_XS).
2) Fixed the adapter info display check - to check for fcmode flag even.
Signed-off-by: Krishna Gudipati <kgudipat@brocade.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
- Direct attach is not working due to the check of PID in fcxp_send request.
- Added logic to set the lps->lp_pid with the PID assigned for n2n mode.
Signed-off-by: Krishna Gudipati <kgudipat@brocade.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
- Made IOC auto_recovery synchronized and not timer based.
- Only one PCI function will attempt to recover and reinitialize
the ASIC on a failure, after all the active PCI fns
acknowledge the IOC failure.
Signed-off-by: Krishna Gudipati <kgudipat@brocade.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
When the bfa driver is loaded a flogi is sent without the knowledge of
trunking configuration. This normal flogi causes the switch ports
which had trunking enabled to go to persistent offline. Solution is
to store the port configuration (which has trunking info) in the flash
for persistency. The firmware will read this configuration when the
very first fcport enable is received.
Signed-off-by: Krishna Gudipati <kgudipat@brocade.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
- Move fw trace save logic to bfa_ioc_sm_fail_entry(),
so that fw trace is saved irrespective of the cause of the failure.
- Make bfa_ioc_sm_fail() a failure parking state.
- Rename bfa_ioc_sm_initfail() to a more appropriate bfa_ioc_sm_fail_retry()
as it is no longer a parking state.
Signed-off-by: Krishna Gudipati <kgudipat@brocade.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Remove OS wrapper functions/macros, and as a result remove bfa_os_inc.h.
Signed-off-by: Maggie Zhang <xmzhang@brocade.com>
Signed-off-by: Jing Huang <huangj@brocade.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Remove SCSI IO callbacks, and as a result remove bfa_cb_ioim.h.
Signed-off-by: Maggie Zhang <xmzhang@brocade.com>
Signed-off-by: Jing Huang <huangj@brocade.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Modified scatter gather processing to use the kernel provided
scsi_for_each_sg() macro.
1) Instead of allocating and setting up sgpg in bfa_ioim_sge_setup(),
we only do allocation. As a result, we remove
bfa_ioim_sgpg_setup() and rename bfa_ioim_sge_setup() to
bfa_ioim_sgpg_alloc().
2) bfa_ioim_send_ioreq() calls scsi_for_each_sg() to handle both inline
and sgpg setup.
Signed-off-by: Maggie Zhang <xmzhang@brocade.com>
Signed-off-by: Jing Huang <huangj@brocade.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Cleaned up one line functions.
Signed-off-by: Maggie Zhang <xmzhang@brocade.com>
Signed-off-by: Jing Huang <huangj@brocade.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
By changing field ordering we can avoid a couple of memory holes in
the tables that use the ibmvfc_async_desc structure.
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The iscsi transport can be built as a module and uses a netlink socket
to communicate, so the module should have an alias to autoload when a
socket of NETLINK_ISCSI type is requested.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Prior to firmware state change from ACQUIRING to READY, an
0x8029 AEN is received. Added code to check previous state
being ACQUIRING in order to update the ip address in the driver.
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Prasanna Mumbai <prasanna.mumbai@qlogic.com>
Signed-off-by: Ravi Anand <ravi.anand@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
If the fw load is failing, running on an incomplete fw load would
be fatal.
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com>
Signed-off-by: Ravi Anand <ravi.anand@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
In a mailbox command, do not process the interrupt unconditionally;
process the interrupt only in polling mode.
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com>
Signed-off-by: Ravi Anand <ravi.anand@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
IRQF_SHARED flag should not be set when calling request_irq for MSI since
this interrupt mechanism cannot be shared like standard INTx
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Shyam Sundar <shyam.sundar@qlogic.com>
Signed-off-by: Ravi Anand <ravi.anand@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
IRQF_DISABLE flag is deprecated and this flag is a NOOP in kernel.
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: Ravi Anand <ravi.anand@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The statistics for InputMegabytes and OutputMegabytes are
misnamed. They're accumulating bytes, not megabytes.
The statistic returned via /sys must be in megabytes, however,
which is what the HBA-API wants. The FCP code needs to accumulate
it in bytes and then divide by 1,000,000 (not 2^20) before it
is presented via sysfs.
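A small sketch of the accumulate-in-bytes, report-in-megabytes approach
described above. The struct and function names are hypothetical and not
taken from libfc:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decimal megabytes (10^6), not 2^20, per the text above. */
#define BYTES_PER_MEGABYTE 1000000ULL

struct io_stats {
	uint64_t input_bytes;  /* accumulated in bytes */
	uint64_t output_bytes; /* accumulated in bytes */
};

/* Called per received payload: add the byte count as-is. */
void io_stats_add_input(struct io_stats *s, uint32_t len)
{
	assert(s != NULL);
	s->input_bytes += len;
}

/* Called when the value is read: convert to megabytes only here. */
uint64_t io_stats_input_megabytes(const struct io_stats *s)
{
	assert(s != NULL);
	return s->input_bytes / BYTES_PER_MEGABYTE;
}
```

Dividing only at read time keeps the sub-megabyte remainder in the
counter, so many small transfers still add up correctly.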
This affects fcoe.ko only, not fnic. The fnic driver
correctly accumulates bytes and then converts to megabytes.
I checked that libhbalinux is using the /sys file directly without
conversion.
BTW, qla2xxx does divide by 2^20, which I'm not fixing here.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Neaten several calls to fip_select() by having it return the
pointer to the new FCF.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
When there are several FCFs to choose from, the one most likely
to accept a FLOGI on certain switches is the one that last
answered a multicast solicit.
So, when receiving an advertisement, move the FCF to the front
of the list so that it gets chosen first among those with the
same priority.
Without this, more FLOGIs need to be sent in a test with
multiple FCFs and a switch in NPV mode, but it still
eventually finds one that accepts the FLOGI.
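The move-to-front idea can be sketched as follows. This uses a plain
array instead of the kernel list, and struct fcf, fcf_move_to_front() and
fcf_select() are illustrative names only. It assumes a numerically lower
pri value is preferred and that selection keeps the first entry among
equal priorities:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct fcf {
	unsigned char mac[6];
	unsigned char pri; /* assumption: lower value is preferred */
};

/* On an advertisement from entry idx, move it to the front of the list. */
void fcf_move_to_front(struct fcf *list, size_t n, size_t idx)
{
	struct fcf tmp;

	assert(idx < n);
	tmp = list[idx];
	/* Shift entries 0..idx-1 down by one, keeping their order. */
	memmove(&list[1], &list[0], idx * sizeof(list[0]));
	list[0] = tmp;
}

/* Pick the best FCF: lowest pri wins, earliest entry wins ties. */
int fcf_select(const struct fcf *list, size_t n)
{
	int best = -1;

	for (size_t i = 0; i < n; i++)
		if (best < 0 || list[i].pri < list[best].pri)
			best = (int)i;
	return best;
}
```

Because the comparison is strict, the most recently advertised FCF is
chosen among those with the same priority, while an FCF with a better
priority still wins regardless of its position.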
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
When multiple FCFs to the same fabric exist, the debug messages
all look alike. Change the message to include the MAC address.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Switches using multiple-FCFs may reject FLOGI in order to
balance the load between multiple FCFs. Even though the FCF
was available, it may have more load at the point we actually
send the FLOGI.
If the FLOGI fails, select a different FCF
if possible, among those with the same priority. If no other
FCF is available, just deliver the reject to libfc for retry.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The check for conflicting fabrics in fcoe_ctlr_select()
ignores any FCFs that aren't usable. This is a minor
problem now but becomes more pronounced after later patches.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Move some of the code in fcoe_ctlr_timer_work() to
fcoe_ctlr_select() so that it can be shared
with another function in a forthcoming patch.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Move the announcement code to a separate function for reuse in
a forthcoming patch.
For messages regarding FCF timeout and selection, use the
previously-announced FCF MAC address (dest_addr) in the fcoe_ctlr struct.
Only print (announce) the FCF if it is new. Print MAC for
timed-out or deselected FCFs.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Frame should be freed in fc_tm_done, this is an updated patch on the one
initially submitted by Hillf Danton.
Signed-off-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The timeout for the exchange carrying REC itself is 2 * R_A_TOV_els.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Should not continue when the abort itself has timed out, since in that case
the exchange will be deleted and released. We still want to call the
associated response handler to let the layer, e.g., fcp, know the exchange
itself is being timed out.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Do not call fc_io_compl() on an fsp w/o any scsi_cmnd, e.g., lun reset is built
inside fc_fcp, not from a scsi command via queuecommand from scsi-ml, so in
case the target is buggy and sets invalid flags in the FCP_RSP, as we have seen
in some SAN Blaze target where all bits in flags are 0, we do not want to call
io_compl on this fsp.
[ Comment block added by Robert Love ]
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This is very helpful to match up the corresponding exchange to the actual I/O
described by the fsp, particularly when you do a side-by-side comparison of
the syslog with your trace.
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Add missing newlines.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
It seems rdata should be put before returning.
Signed-off-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
It seems info should be freed when an error is encountered.
Signed-off-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
It seems info should be freed when an error is encountered.
Signed-off-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
We can easily remove the tgt_flags from fc_fcp_pkt struct
and use rpriv->tgt_flags directly where needed.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Use the rport value for rec_tov for timeout values when
sending fcp commands. Currently, defaults are being used
which may or may not match the advertised values.
The default may cause i/o to timeout on networks that
set this value larger than the default value. To make
the timeout more configurable in the non-REC mode we
remove the FC_SCSI_ER_TIMEOUT completely allowing the
scsi-ml to do the timeout. This removes an unneeded
timer and allows the i/o timeout to be configured
using the scsi-ml knobs.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The fcp packet recovery handler fc_fcp_recovery() is called
when errors occur in a fcp session. Currently it is
generically setting the status code to FC_CMD_RECOVERY for
all error types. This results in DID_BUS_BUSY errors
being returned to the scsi-ml.
DID_BUS_BUSY errors indicate "BUS stayed busy through time
out period" according to scsi.h. Many of the errors reported
by fc_fcp_recovery() are pkt errors. Here we update
fc_fcp_recovery to use better host byte codes.
With certain FAST FAIL flags set, DID_BUS_BUSY and DID_ERROR
have different behaviors. This was causing dm multipath
to fail quickly in some cases where a retry would be a
better action.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
It seems accumulation is needed.
Signed-off-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
A typo that triggers a memory leak is cleaned up.
Signed-off-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The error handler grabs the si->scsi_queue_lock, but
in the case where the fsp pointer is NULL it releases
the scsi_host lock. This can lead to a variety of
system hangs depending on which is used first- the
scsi_host lock or the scsi_queue_lock.
This patch simply unlocks the correct lock when fsp
is NULL.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
For allocating new exch from pool, scanning for free slot in exch
array fluctuates when exch pool is close to exhaustion.
The fluctuation is smoothed, and the scan looks to be O(2).
Signed-off-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
It seems that ep should be released, or it will never be freed.
Signed-off-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This happens when tearing down the fcoe interface with active I/O.
The back trace shows dead000000200200 in RAX, i.e., LIST_POISON2, indicating
that the fsp is already being dequeued, which is probably why no complaining
was seen in fc_fcp_destroy() about outstanding fsp not freed, since we dequeue
it in the end of fc_io_compl() before releasing it. The bug is due to the
fact that we have already destroyed lport's scsi_pkt_pool while on-going i/o
is still accessing it through fc_fcp_pkt_release(), like this trace or the
similar code path from scsi-ml to fc_eh_abort, etc. This is fixed by moving
the fc_fcp_destroy() after lport is detached from scsi-ml since fc_fcp_destroy
is supposed to be called only once where no lport lock is taken, otherwise the
fc_fcp_pkt_release() would have to grab the lport lock.
BUG: unable to handle kernel NULL pointer dereference at (null)
.......
RIP: 0010:[<0000000000000000>]
[<(null)>] (null)
RSP: 0018:ffff8803270f7b88 EFLAGS: 00010282
RAX: dead000000200200 RBX: ffff880197d2fbc0 RCX: 0000000000005908
RDX: ffff880195ea6d08 RSI: 0000000000000282 RDI: ffff880180f4fec0
RBP: ffff8803270f7bc0 R08: ffff880197d2fbe0 R09: 0000000000000000
R10: ffff88032867f090 R11: 0000000000000000 R12: ffff880195ea6d08
R13: 0000000000000282 R14: ffff880180f4fec0 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8801b5820000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 00000001a6eae000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process fc_rport_eq (pid: 5278, threadinfo ffff8803270f6000, task ffff880326254ab0)
Stack:
ffffffffa02c39ca ffff8803270f7ba0 ffff88019331cbc0 ffff880197d2fbc0
0000000000000000 ffff8801a8c895e0 ffff8801a8c895e0 ffff8803270f7c10
ffffffffa02c4962 ffff8803270f7be0 ffffffff814c94ab ffff8803270f7c10
Call Trace:
[<ffffffffa02c39ca>] ? fc_io_compl+0x10a/0x530 [libfc]
[<ffffffffa02c4962>] fc_fcp_complete_locked+0x72/0x150 [libfc]
[<ffffffff814c94ab>] ? _spin_unlock_bh+0x1b/0x20
[<ffffffffa02b98ff>] ? fc_exch_done+0x3f/0x60 [libfc]
[<ffffffffa02c4a8f>] fc_fcp_retry_cmd+0x4f/0x60 [libfc]
[<ffffffffa02c6150>] fc_fcp_recv+0x9b0/0xc30 [libfc]
[<ffffffff8106ba7a>] ? _call_console_drivers+0x4a/0x80
[<ffffffff8107d5ec>] ? lock_timer_base+0x3c/0x70
[<ffffffff8107e06b>] ? try_to_del_timer_sync+0x7b/0xe0
[<ffffffffa02b9dcf>] fc_exch_mgr_reset+0x1df/0x250 [libfc]
[<ffffffffa02c57a0>] ? fc_fcp_recv+0x0/0xc30 [libfc]
[<ffffffffa02c1042>] fc_rport_work+0xf2/0x4e0 [libfc]
[<ffffffff8109203e>] ? prepare_to_wait+0x4e/0x80
[<ffffffffa02c0f50>] ? fc_rport_work+0x0/0x4e0 [libfc]
[<ffffffff8108c6c0>] worker_thread+0x170/0x2a0
[<ffffffff81091d50>] ? autoremove_wake_function+0x0/0x40
[<ffffffff8108c550>] ? worker_thread+0x0/0x2a0
[<ffffffff810919e6>] kthread+0x96/0xa0
[<ffffffff810141ca>] child_rip+0xa/0x20
[<ffffffff81091950>] ? kthread+0x0/0xa0
[<ffffffff810141c0>] ? child_rip+0x0/0x20
Code:
Bad RIP value.
RIP
[<(null)>] (null)
RSP <ffff8803270f7b88>
CR2: 0000000000000000
Signed-off-by: Yi Zou <yi.zou@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The define for fc_seq_exch is unnecessary, since it also appears in scsi/libfc.h
Signed-off-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
First round of fixes for the endianness check warnings from make C=2 CF="-D__CHECK_ENDIAN__".
Signed-off-by: Maggie <xmzhang@brocade.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Various error conditions inside ep_connect and ep_disconnect were
either not being handled or not being handled correctly. This patch
fixes all those issues.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Acked-by: Anil Veerabhadrappa <anilgv@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Added the handling for cases when a chip request is made to the
CNIC module but the hardware is not ready to accept. This would
lead to many unnecessary wait timeouts.
This code adds a check in the connection establishment and destruction
path.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The stop path has been augmented to wait a max of 10s for all in
progress offload and destroy activities to complete before proceeding
to terminate all active connections (via iscsid or forcefully).
Note that any new offload and destroy requests are now blocked and
return to the caller immediately.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Acked-by: Anil Veerabhadrappa <anilgv@broadcom.com>
Reviewed-by: Benjamin Li <benli@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The code no longer needs to dynamically register and unregister
the CNIC device. The CNIC device will be kept registered until
module unload.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Added net_dev mutex lock protection before accessing the csk
parameters.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Benjamin Li <benli@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
In the situation where the connect completion response arrives after
the connect request has already timed out, the connection was not being
aborted but only the resource was being freed. This creates a problem
for 5771X (10g) as the chip flags this with an assertion.
This change properly aborts the connection before freeing the
resource.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Modified the handling of the remote TCP RST code so the chip can now
flush the tx pipe accordingly upon a remote TCP RST reception.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
A cid resource leak was found when the connect destroy request exceeded
the driver's disconnection timeout.
The fix is to allow the cid cleanup even when this happens.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Acked-by: Anil Veerabhadrappa <anilgv@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Added a be32_to_cpu call for the TMF LUN wqe.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The patch fixes the following situations where NOP-Out pkt is called for:
- local unsolicited NOP-Out requests (requesting no NOP-In response)
- local NOP-Out responses to unsolicited NOP-In requests
kernel panic is observed due to double session spin_lock requests; one in the
bnx2i_process_nopin_local_cmpl routine in bnx2i_hwi.c and the other in the
iscsi_put_task routine in libiscsi.c
The proposed fix is to export the currently static __iscsi_put_task() routine
and have bnx2i call it directly instead of the iscsi_put_task() routine which
holds the session spin lock.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Reviewed-by: Anil Veerabhadrappa <anilgv@broadcom.com>
Acked-by: Benjamin Li <benli@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Unsolicited NOP-Ins are placed in the receive queue of the hardware
which must be read out regardless of whether the receive pipe is suspended
or not. This patch adds the disposal of this RQ element under this
condition.
Also fixed the bug in the unsolicited NOP-In handling routine which
checks for the RESERVED_ITT.
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Reviewed-by: Benjamin Li <benli@broadcom.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Fix oops loading driver when there is direct attached
SEP device
The driver set max phys count to the value reported in sas iounit page
zero. However this page doesn't take into account additional virtual
phys. When sas topology event arrives, the phy count is larger than
expected, and the driver accesses memory array beyond the end of
allocated space, then oops. Manufacturing page 8 contains the info
on direct attached phys.
This fix makes sure that the sas topology event handling does not
process phys greater than the expected phy count.
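A minimal sketch of that kind of bound check. The event layout, struct
phy_state and apply_topology_event() are hypothetical and only model the
idea of skipping entries that fall outside the allocated phy array:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct phy_state {
	uint8_t link_rate;
};

/*
 * Apply per-phy entries from a topology event to an array that was sized
 * for num_phys phys. Entries for phys beyond that count are skipped
 * instead of being written past the end of the array.
 * Returns the number of entries actually applied.
 */
size_t apply_topology_event(struct phy_state *phys, size_t num_phys,
			    uint8_t start_phy, const uint8_t *rates,
			    size_t num_entries)
{
	size_t applied = 0;

	assert(phys != NULL && rates != NULL);
	for (size_t i = 0; i < num_entries; i++) {
		size_t phy = (size_t)start_phy + i;

		if (phy >= num_phys)
			continue; /* phy we did not allocate space for */
		phys[phy].link_rate = rates[i];
		applied++;
	}
	return applied;
}
```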
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
change_queue_depth callback API changed
The change_queue_depth callback changed where there is now an additional
parameter called reason, with SCSI_QDEPTH_DEFAULT, SCSI_QDEPTH_QFULL,
and SCSI_QDEPTH_RAMP_UP codes.
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
remove support for MPI2_EVENT_TASK_SET_FULL
This event is obsoleted, so this processing of this event
needs to be removed from the driver. The controller firmware is going
to handle TASK_SET_FULL, the driver doesn't need to do anything.
Even though we are removing the EVENT handling, the behaviour has not
changed between driver versions because firmware will still be handling
queue throttling, and retrying of commands when the target device queues
are full.
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
MPI2 Rev header files.
(1) Removed Task Set Full Event. Modified description of Disable SCSI
Initiator Task Set Full Handling bit in the Flags field of IO Unit
Page 1. Modified the descriptions for the three queue depth fields in
SAS IO Unit Page 1.
(2) Added new value for the Current Operation bits of the Flags field
in the RAID Volume Indicator Structure to indicate that the Make Data
Consistent operation is running.
(3) Added a value of 0x6 to various SAS link rate fields to indicate an
attached PHY that is not using any commonly supported settings.
(4) Added Volume Not Consistent bit to the VolumeStatusFlags field of
RAID Volume Page 0.
(5) Added a new value for the IncompatibleReason field of RAID Physical
Disk Page 0 to indicate an incompatible media type.
(6) Added Diagnostic Data Upload tool for the Toolbox Request.
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Issue : Switch swap doesn't work when device missing delay is enabled.
(1) add support to individually add and remove phys to and from
existing ports. This replaces the routine
_transport_delete_duplicate_port.
(2) _scsih_sas_host_refresh - was modified to change the link rate
from zero to 1.5 GB rate when the firmware reports there is an
attached device with zero link.
(3) add new function mpt2sas_device_remove; this wrapper function
removes some redundant code throughout the driver by combining it into one
subroutine
(4) two subroutines were modified so the sas_device, raid_device, and
port lists are traversed once when objects are deleted from the list.
Previously it was looping back each time an object was deleted from the
list.
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Create a pool of chain buffers, instead of dedicated per IO:
This enhancement addresses a memory allocation failure when asking
for more than 2300 IOs per host. There is just not enough contiguous
DMA physical memory to make one single allocation to hold both message
frames and chain buffers when asking for more than 2300 requests. In order
to address this problem we will have to allocate memory for each chain
buffer in a separate individual memory allocation, placing each chain
element of 128 bytes onto a pool of available chains, which can be
shared among all requests.
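The pool idea can be sketched in user space as below. malloc() stands in
for the DMA allocator, and struct chain, struct chain_pool and the
chain_* helpers are hypothetical names, not the driver's. Only the small
bookkeeping array is allocated in one piece; each 128-byte chain buffer
is its own allocation:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CHAIN_SIZE 128 /* bytes per chain element, as in the text */

struct chain {
	struct chain *next; /* link while on the free list */
	void *buf;          /* individually allocated buffer */
};

struct chain_pool {
	struct chain *slots;     /* bookkeeping only, not the buffers */
	struct chain *free_list; /* chains available to any request */
	size_t count;
};

void chain_pool_destroy(struct chain_pool *p)
{
	for (size_t i = 0; i < p->count; i++)
		free(p->slots[i].buf);
	free(p->slots);
	p->slots = NULL;
	p->free_list = NULL;
	p->count = 0;
}

int chain_pool_init(struct chain_pool *p, size_t count)
{
	p->free_list = NULL;
	p->count = 0;
	p->slots = calloc(count, sizeof(*p->slots));
	if (!p->slots)
		return -1;
	for (size_t i = 0; i < count; i++) {
		/* One small allocation per chain, no large contiguous block. */
		p->slots[i].buf = malloc(CHAIN_SIZE);
		if (!p->slots[i].buf) {
			chain_pool_destroy(p);
			return -1;
		}
		p->slots[i].next = p->free_list;
		p->free_list = &p->slots[i];
		p->count++;
	}
	return 0;
}

/* Take a chain for a request; NULL means the pool is exhausted. */
struct chain *chain_get(struct chain_pool *p)
{
	struct chain *c = p->free_list;

	if (c)
		p->free_list = c->next;
	return c;
}

/* Return a chain to the pool when the request completes. */
void chain_put(struct chain_pool *p, struct chain *c)
{
	assert(c != NULL);
	c->next = p->free_list;
	p->free_list = c;
}
```

A request that needs no chaining never touches the pool, which is what
lets the pool be smaller than one dedicated set of chains per IO.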
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Ability to override/set the ReportDeviceMissingDelay and
IODeviceMissingDelay from driver: Add new command line option missing_delay,
this is an array, where the first element is the device missing delay,
and the second element is io missing delay. The driver will program
sas iounit page 1 with the new setting when the driver loads. This is
programmed to the current and persistent configuration page so this takes
effect immediately, and will be sticky across host reboots.
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Sometimes the controller
firmware returns an invalid system message id (smid).
The oops occurs because the mpt_callbacks pointer refers to
either a null or an invalid virtual address. This is due to cb_idx being set
incorrectly by routine _base_get_cb_idx. The cb_idx was set incorrectly
because there is no check to make sure smid is less than the maximum
anticipated smid. To fix this issue, we add a check in
_base_get_cb_idx to make sure smid is not greater than
ioc->hba_queue_depth. In addition, a similar check was added to make
sure the reply address is less than the largest anticipated address.
Newer firmware has solved this issue; however, it is good to have this sanity
check.
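A sketch of the smid sanity check, under stated assumptions: struct ioc,
the lookup table and CB_IDX_INVALID are hypothetical, and the lower bound
(smid starting at 1) is an assumption of this example rather than
something stated above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CB_IDX_INVALID 0xFF /* hypothetical "no callback" marker */

struct ioc {
	uint16_t hba_queue_depth;      /* largest smid we expect */
	const uint8_t *cb_idx_by_smid; /* indexed by smid - 1 */
};

/*
 * Look up the callback index for a smid reported by firmware. A smid
 * outside 1..hba_queue_depth yields CB_IDX_INVALID so the caller can
 * skip the callback instead of dereferencing a bogus pointer.
 */
uint8_t get_cb_idx(const struct ioc *ioc, uint16_t smid)
{
	assert(ioc != NULL);
	if (smid < 1 || smid > ioc->hba_queue_depth)
		return CB_IDX_INVALID;
	return ioc->cb_idx_by_smid[smid - 1];
}
```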
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The compiler throws warning messages while compiling without
CONFIG_SCSI_MPT2SAS_LOGGING.
Set proper ifdef for CONFIG_SCSI_MPT2SAS_LOGGING to avoid warnings.
Signed-off-by: Kashyap Desai <kashyap.desai@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Added support for ELS RRQ command
- Add new routine lpfc_set_rrq_active() to track XRI qualifier state.
- Add new module parameter lpfc_enable_rrq to control RRQ operation.
- Add logic to ELS RRQ completion handler and xri qualifier timeout
to clear XRI qualifier state.
- Use OX_ID from XRI_ABORTED_CQE for RRQ payload.
- Tie abort and XRI_ABORTED_CQE handler to RRQ generation.
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Add SLI4 FC Discovery support
- Replace READ_LA and READ_LA64 with READ_TOPOLOGY mailbox command.
- Converted the old READ_LA structure to use bf_set/get instead of bit fields.
- Rename HBA_FCOE_SUPPORT flag to HBA_FCOE_MODE. Flag now indicates function
is running as SLI-4 FC or FCoE port. Make sure the flag is reset each time
READ_REV completes as it can dynamically change.
- Removed BDE union in the READ_TOPOLOGY mailbox command and added a define to
define the ALPA MAP SIZE. Added FC Code for async events.
- Added code to support new 16G link speed.
- Define new set of values to keep track of valid user settable link speeds.
- Used new link speed definitions to define link speed max and bitmap.
- Redefined FDMI Port speeds to be hex values and added the 16G value.
- Added new CQE trailer code for FC Events.
- Add lpfc_issue_init_vfi and lpfc_init_vfi_cmpl routines.
- Replace many calls to the initial_flogi routine with lpfc_issue_init_vfi.
- Add vp and vpi fields to the INIT_VFI mailbox command.
- Adapt lpfc_hba_init_link routine for SLI4 use.
- Use lpfc_hba_init_link call from lpfc_sli4_hba_setup.
- Add a check for FC mode to register the FCFI before init link.
- Convert lpfc_sli4_init_vpi to be called without a vpi (get it from vport).
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
- Add the Lancer FC and FCoE PCI IDs
- Add new SLI4 INTF register definitions
- Implement new SLI4 doorbell register
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Fix critical errors and crashes
- Replace LOF_SECURITY with LOG_SECURITY
- When calculating diag test memory size, use full size with header.
- Return LS_RJT with status=UNSUPPORTED on unrecognized ELS's
- Correct NULL pointer dereference when lpfc_create_vport_work_array()
returns NULL.
- Added code to handle CVL when port is in LPFC_VPORT_FAILED state.
- In lpfc_do_scr_ns_plogi, check the nodelist for FDMI_DID and reuse
the resource.
- Check for generic request 64 and calculate the sgl offset for the request
and reply sgls, also calculate the xmit length using only the request bde.
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The current code in scsi_eh_target_reset() has an off by one error
that actually sends spurious extra resets. Since there's no real need
to reset the targets in numerical order, simply chunk up the command
recovery list doing target resets and pulling matching targets out of
the list (that also makes the loop O(N) instead of O(N^2)).
[mike christie found and fixed a list_splice -> list_splice_init problem]
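A minimal userspace sketch of the chunking idea, not the actual
scsi_eh_target_reset() code. It assumes a hypothetical array of failed
commands that carry only a target id. Each pass "resets" the target of the
first remaining command and then pulls every command for that target out of
the list, so no target is reset twice and ordering by target number is not
needed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a failed command; only the target id matters. */
struct failed_cmd {
    unsigned int target_id;
};

/*
 * Reset each distinct target once. cmds[] is consumed in place: after the
 * target of cmds[0] is reset, all entries for that target are removed and
 * the remainder is compacted to the front of the array.
 * order[] (optional) records which target was reset on each pass.
 * Returns the number of resets issued.
 */
size_t reset_targets(struct failed_cmd *cmds, size_t n, unsigned int *order)
{
    size_t resets = 0;

    assert(cmds != NULL || n == 0);

    while (n > 0) {
        unsigned int id = cmds[0].target_id;
        size_t kept = 0;

        if (order)
            order[resets] = id;       /* stands in for issuing the reset */
        resets++;

        /* Pull every command for this target out of the list. */
        for (size_t i = 0; i < n; i++)
            if (cmds[i].target_id != id)
                cmds[kept++] = cmds[i];
        n = kept;
    }
    return resets;
}
```

With commands for targets 3, 1, 3, 2, 1 this issues three resets, in the
order the targets first appear in the list.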
Reported-by: Hillf Danton <dhillf@gmail.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The definition for the mailbox register for new adapters was incorrect. The
value has been updated to the correct offset.
After an adapter reset, the mailbox register on the new adapters takes a
number of seconds to stabilize. A delay has been added before reading the
register.
Signed-off-by: Wayne Boyer <wayneb@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The lun value was not getting set up correctly for all devices attached to the
new 64 bit adapters. The fix is to move the logic to earlier in the
ipr_init_res_entry routine such that the value does get set correctly for all
devices.
Then the ipr_is_same_device comparison function was using the wrong lun value
in the logic for the new adapters. Change this to use the correct lun value.
Signed-off-by: Wayne Boyer <wayneb@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Some kernel transport drivers unconditionally disable
retrieval of the Caching mode page. One such for example is
the BBB/CBI transport over USB. Such a restraint is too
harsh as some devices do support the Caching mode
page. Unconditionally enabling the retrieval of this mode
page over those transports at their transport code level may
result in some devices failing and becoming unusable.
This patch implements a method of retrieving the Caching
mode page without unconditionally enabling it in the
transports which unconditionally disable it. The idea is to
ask for all supported pages, page code 0x3F, and then search
for the Caching mode page in the mode parameter data
returned. The sd driver already asks for all the mode pages
supported by the attached device by setting the page code to
0x3F in order to find out if the media is write protected by
reading the WP bit in the Device Specific Parameter
field. It then attempts to retrieve only the Caching mode
page by setting the page code to 8 and actually attempting
to retrieve it if and only if the transport allows it.
The method implemented here is that if the transport doesn't
allow retrieval of the Caching mode page and the device is
not RBC, then we ask for all pages supported by setting the
page code to 0x3F (similarly to how the WP bit is retrieved
above), and then we search for the Caching mode page in the
mode parameter data returned.
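The search step can be sketched as follows. This is a userspace
illustration, not the sd driver code: find_mode_page() and
write_cache_enabled() are hypothetical names, and the layout assumed is the
usual one (a 4-byte header for MODE SENSE(6) or an 8-byte header for
MODE SENSE(10), optional block descriptors, then a sequence of mode pages
each carrying its own length).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHING_PAGE 0x08

/*
 * Search returned mode parameter data for a mode page (sub-pages ignored).
 * use_10 selects the MODE SENSE(10) header layout, otherwise MODE SENSE(6).
 * Returns the offset of the page, or -1 if it is absent or truncated.
 */
long find_mode_page(const uint8_t *buf, size_t len, int use_10, uint8_t page)
{
    size_t hdr = use_10 ? 8 : 4;
    size_t off;

    assert(buf != NULL);
    if (len < hdr)
        return -1;
    /* Skip the header and any block descriptors that follow it. */
    off = hdr + (use_10 ? (((size_t)buf[6] << 8) | buf[7]) : buf[3]);

    while (off + 2 <= len) {
        int spf = (buf[off] & 0x40) != 0;   /* sub-page format bit */
        size_t plen;

        if (spf) {
            if (off + 4 > len)
                return -1;
            plen = 4 + (((size_t)buf[off + 2] << 8) | buf[off + 3]);
        } else {
            plen = 2 + (size_t)buf[off + 1];
        }
        if (!spf && (buf[off] & 0x3f) == page)
            return off + plen <= len ? (long)off : -1;
        off += plen;
    }
    return -1;
}

/* WCE is bit 2 of byte 2 of the Caching mode page. */
int write_cache_enabled(const uint8_t *caching_page)
{
    return (caching_page[2] & 0x04) != 0;
}
```

A caller would issue the 0x3F request once, then call
find_mode_page(buf, len, use_10, CACHING_PAGE) and fall back to the
"No Caching mode page present" path when it returns -1.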
With this patch, devices over SATA report this (no change):
Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: [sda] 976773168 512-byte logical blocks: (500 GB/465 GiB)
Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: Attached scsi generic sg0 type 0
Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: [sda] Write Protect is off
Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Oct 22 18:45:58 localhost kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Smart devices report their Caching mode page. This is a
change: previously the kernel would assume that the
device's cache was write-through:
Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: Attached scsi generic sg2 type 0
Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: [sdb] 610472646 4096-byte logical blocks: (2.50 TB/2.27 TiB)
Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: [sdb] Write Protect is off
Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: [sdb] Mode Sense: 47 00 10 08
Oct 22 18:45:58 localhost kernel: sd 6:0:0:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA
And "dumb" devices over BBB are correctly shown not to
support reporting the Caching mode page:
Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] 15663104 512-byte logical blocks: (8.01 GB/7.46 GiB)
Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] Write Protect is off
Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] Mode Sense: 23 00 00 00
Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] No Caching mode page present
Oct 22 18:49:06 localhost kernel: sd 7:0:0:0: [sdc] Assuming drive cache: write through
Signed-off-by: Luben Tuikov <ltuikov@yahoo.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
To date libsas has only looked at the attached sas address when
determining the formation of wide ports. The specification and some
hardware expects that phys with different addresses will not form a wide
port unless the local peer phys also match each other. Introduce a flag
to select stricter behavior at sas_register_ha() time. The flag can be
dropped once it is known that all libsas users expect the same behavior.
Current drivers just initialize this field to zero and get the
traditional behavior.
Reported-by: Patrick Thomson <patrick.s.thomson@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch (as1415) improves the formerly incomprehensible logic in
sd_media_changed() (the current code refers to "changed" as a state,
whereas in fact it is a relation between two states). It also adds a
big comment so that everyone can understand what is really going on.
The patch also improves efficiency by not reporting a media change
when no medium was ever present. If no medium was present the last
time we checked and there's still no medium, it's not necessary to
tell the caller that a change occurred. Doing so merely causes the
caller to attempt to revalidate a non-existent disk, which is a waste
of time.
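The "relation between two states" idea can be sketched like this. It is a
simplified userspace model with hypothetical names; it tracks only medium
presence and ignores unit attention handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state: was a medium present at the last check? */
struct media_state {
    bool was_present;
};

/*
 * Report a change only when presence differs from the previous check, then
 * remember the new state. Absent -> absent is deliberately not a change,
 * so the caller never revalidates a non-existent disk.
 */
bool media_changed(struct media_state *st, bool now_present)
{
    bool changed;

    assert(st != NULL);
    changed = st->was_present != now_present;
    st->was_present = now_present;
    return changed;
}
```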
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Rename log_level to bfa_log_level to make the global variable more bfa
specific and avoid clashes with other drivers which was causing a
build failure.
Signed-off-by: Jing Huang <huangj@brocade.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
cciss: fix cciss_revalidate panic
block: max hardware sectors limit wrapper
block: Deprecate QUEUE_FLAG_CLUSTER and use queue_limits instead
blk-throttle: Correct the placement of smp_rmb()
blk-throttle: Trim/adjust slice_end once a bio has been dispatched
block: check for proper length of iov entries earlier in blk_rq_map_user_iov()
drbd: fix for spin_lock_irqsave in endio callback
drbd: don't recvmsg with zero length
When stacking devices, a request_queue is not always available. This
forced us to have a no_cluster flag in the queue_limits that could be
used as a carrier until the request_queue had been set up for a
metadevice.
There were several problems with that approach. First of all it was up
to the stacking device to remember to set queue flag after stacking had
completed. Also, the queue flag and the queue limits had to be kept in
sync at all times. We got that wrong, which could lead to us issuing
commands that went beyond the max scatterlist limit set by the driver.
The proper fix is to avoid having two flags for tracking the same thing.
We deprecate QUEUE_FLAG_CLUSTER and use the queue limit directly in the
block layer merging functions. The queue_limit 'no_cluster' is turned
into 'cluster' to avoid double negatives and to ease stacking.
Clustering defaults to being enabled as before. The queue flag logic is
removed from the stacking function, and explicitly setting the cluster
flag is no longer necessary in DM and MD.
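A cut-down sketch of why carrying the setting in the limits eases stacking.
The struct and helpers below are hypothetical stand-ins, not the block layer
API: clustering starts enabled and survives stacking only if every
component device allows it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cut-down queue limits; only the clustering bit is modeled. */
struct limits_sketch {
    bool cluster;
};

/* Clustering defaults to enabled. */
void limits_init(struct limits_sketch *l)
{
    assert(l != NULL);
    l->cluster = true;
}

/* Stack a bottom device under a top device: with a positive "cluster"
 * field a plain AND is enough, and no separate queue flag has to be
 * kept in sync afterwards. */
void limits_stack(struct limits_sketch *top, const struct limits_sketch *bottom)
{
    assert(top != NULL && bottom != NULL);
    top->cluster = top->cluster && bottom->cluster;
}
```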
Reported-by: Ed Lin <ed.lin@promise.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Replace sd_media_change() with sd_check_events(). sd used to set the
changed state whenever the device is not ready, which can cause an event
loop while the device is not ready. Media presence handling code is
changed such that the changed state is set iff the media presence
actually changes. UA still always sets the changed state and
NOT_READY always (at least where it used to set ->changed) clears
media presence, so no event is lost.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Replace sr_media_change() with sr_check_events(). It normally only
uses GET_EVENT_STATUS_NOTIFICATION to check both media change and
eject request. If @clearing includes DISK_EVENT_MEDIA_CHANGE, it
issues TUR and compares whether media presence has changed. The SCSI
specific media change uevent is kept for compatibility.
sr_media_change() was doing both media change check and revalidation.
The revalidation part is split into sr_block_revalidate_disk().
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
The usage of TUR has been confusing involving several different
commits updating different parts over time. Currently, the only
differences between scsi_test_unit_ready() and sr_test_unit_ready()
are,
* scsi_test_unit_ready() also sets sdev->changed on NOT_READY.
* scsi_test_unit_ready() returns 0 if TUR ended with UNIT_ATTENTION or
NOT_READY.
Due to the above two differences, sr is using its own
sr_test_unit_ready(), but sd - the sole user of the above extra
handling - doesn't even need them.
Where scsi_test_unit_ready() is used in sd_media_changed(), the code
is looking for device ready w/ media present state which is true iff
TUR succeeds w/o sense data or UA, and when the device is not ready
for whatever reason sd_media_changed() explicitly marks media as
missing so there's no reason to set sdev->changed automatically from
scsi_test_unit_ready() on NOT_READY.
Drop both special handlings from scsi_test_unit_ready(), which makes
it equivalent to sr_test_unit_ready(), and replace
sr_test_unit_ready() with scsi_test_unit_ready(). Also, drop the
unnecessary explicit NOT_READY check from sd_media_changed().
Checking return value is enough for testing device readiness.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
sr_test_unit_ready() returns 0 iff TUR succeeded - IOW, when media is
present and the device is actually ready, so the return value wouldn't
be zero when TUR ends with sense data. sr_media_change() incorrectly
tests (retval || (scsi_sense_valid(sshdr)...)) when it tries to test
whether TUR failed without sense data or with sense data indicating
media-not-present.
Fix the test using scsi_status_is_good() and update comments.
- Fixed a comment typo spotted by Eike.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Rolf Eike Beer <eike-kernel@sf-tec.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Make all RTAX_ADVMSS metric accesses go through a new helper function,
dst_metric_advmss().
Leave the actual default metric as "zero" in the real metric slot,
and compute the actual default value dynamically via a new dst_ops
AF specific callback.
For stacked IPSEC routes, we use the advmss of the path which
preserves existing behavior.
Unlike ipv4/ipv6, DecNET ties the advmss to the mtu and thus updates
advmss on pmtu updates. This inconsistency in advmss handling
results in more raw metric accesses than I wish we ended up with.
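A rough userspace model of the accessor pattern described above. All names
ending in _sketch and the "MTU minus 40" default are assumptions made for
illustration; only the shape (zero in the slot, default computed through a
per-family callback) comes from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct dst_sketch;

/* Hypothetical per-address-family ops with a default-advmss callback. */
struct dst_ops_sketch {
    uint32_t (*default_advmss)(const struct dst_sketch *d);
};

struct dst_sketch {
    const struct dst_ops_sketch *ops;
    uint32_t advmss_metric;   /* 0 means "not set, use the family default" */
    uint32_t mtu;
};

/* Single accessor: return the stored metric, or compute the default. */
uint32_t dst_advmss_sketch(const struct dst_sketch *d)
{
    assert(d != NULL && d->ops != NULL && d->ops->default_advmss != NULL);
    return d->advmss_metric ? d->advmss_metric : d->ops->default_advmss(d);
}

/* Example family default (assumed): MTU minus 40 bytes of headers. */
uint32_t ipv4_like_default_advmss(const struct dst_sketch *d)
{
    return d->mtu > 40 ? d->mtu - 40 : 0;
}
```

Because callers go through the accessor, the default can change with the
route's MTU without anyone rewriting the stored metric.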
Signed-off-by: David S. Miller <davem@davemloft.net>
PCI_DEVICE_ID_CISSF is defined as 323b in pci_ids.h but redefined as 3fff in
hpsa.c. The ID of 3fff will _never_ ship as a standalone controller. It is
intended only as part of a complete storage solution. As such, this patch
removes the redefinition and the StorageWorks P1210m from the product table.
It also removes a duplicate line for the "unknown" controller support.
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
A return value is not set for the successful case, so the function returns a
garbage value. This fix sets the default value to SUCCESS and changes it only
in case of a failure.
Signed-off-by: Saurav Kashyap <saurav.kashyap@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This would cause a panic while reading the NPIV-config data.
Cc: stable@kernel.org
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
IRQF_SHARED flag should not be set when calling request_irq for MSI
since this interrupt mechanism cannot be shared like standard INTx.
Signed-off-by: Mike Hernandez <michael.hernandez@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Use the host_to_fcp_swap call to correctly populate the LUN field
in the Command Type 6 path. This field is used during LUN reset
cleanup and must match the field used in the FCP command.
Cc: stable@kernel.org
Signed-off-by: Mike Hernandez <michael.hernandez@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The error handler is using the test cmd->serial_number == 0 in the
abort routines to signal that the command to be aborted has already
completed normally. This design was to close a race window in the
original error handler where a command could go through the normal
completion routines after it timed out but before error handling was
started.
Mike Anderson pointed out that when we converted our timeout and
softirq completions, we picked up atomicity here because the block
layer now mediates this with the REQ_ATOM_COMPLETE flag and guarantees
that *either* the command times out or our done routine is called, but
ensures we can't get both occurring. That makes the serial number
zero check redundant and it can be removed.
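The either/or guarantee can be illustrated with a one-shot atomic marker.
This is a userspace analogy using C11 atomics with hypothetical names, not
the block layer implementation: whichever path marks the request first
wins, and the other path sees that it lost.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical request with a one-shot "completed" marker. */
struct request_sketch {
    atomic_flag complete;
};

void request_init(struct request_sketch *rq)
{
    assert(rq != NULL);
    atomic_flag_clear(&rq->complete);
}

/* Returns true for exactly one caller: whoever marks the request first. */
bool mark_complete(struct request_sketch *rq)
{
    assert(rq != NULL);
    return !atomic_flag_test_and_set(&rq->complete);
}

/* Both paths funnel through the same test-and-set, so at most one of
 * them proceeds for a given request. */
bool timeout_path(struct request_sketch *rq) { return mark_complete(rq); }
bool done_path(struct request_sketch *rq)    { return mark_complete(rq); }
```

Since the marker already arbitrates between the two paths, a second
"has it completed?" check in the abort routine adds nothing.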
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Firmware requires a larger configuration entry size than the driver
currently allows, and MSI-X pretty much doesn't work with current FW,
so disable it for now.
Signed-off-by: Anil Ravindranath <anil_ravindranath@pmc-sierra.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
bio_map_kern() returns ERR_PTRs on failure and never returns NULL.
[jejb: remove redundant unlikely spotted by Tobias Klauser]
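For illustration, a userspace imitation of the error-pointer convention.
The helpers mimic the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() but are
re-implemented here, and map_buffer() is a hypothetical function that, like
bio_map_kern(), reports failure through an error pointer and never through
NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Userspace imitations of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical mapper: returns a buffer or an error pointer, never NULL. */
void *map_buffer(size_t len)
{
    void *p;

    if (len == 0)
        return ERR_PTR(-EINVAL);
    p = malloc(len);
    return p ? p : ERR_PTR(-ENOMEM);
}

/* Correct caller: test IS_ERR(); a NULL check would never fire. */
int use_buffer(size_t len)
{
    void *p = map_buffer(len);

    assert(p != NULL);
    if (IS_ERR(p))
        return (int)PTR_ERR(p);
    free(p);
    return 0;
}
```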
Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This trivial patch (as1338) makes two uninformative error messages in
scsi_sysfs_add_sdev() more explicit.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
f281233 (SCSI host lock push-down) broke the fas216 build:
drivers/scsi/arm/fas216.h: In function 'fas216_noqueue_command':
drivers/scsi/arm/fas216.h:354: error: storage class specified for parameter 'fas216_intr'
drivers/scsi/arm/fas216.h:356: error: storage class specified for parameter 'fas216_remove'
...
Fix it.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The big kernel lock has been removed from all these files at some point,
leaving only the #include.
Remove this too as a cleanup.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Move the mid-layer's ->queuecommand() invocation from being locked
with the host lock to being unlocked to facilitate speeding up the
critical path for drivers who don't need this lock taken anyway.
The patch below presents a simple SCSI host lock push-down as an
equivalent transformation. No locking or other behavior should change
with this patch. All existing bugs and locking orders are preserved.
Additionally, add one parameter to queuecommand,
struct Scsi_Host *
and remove one parameter from queuecommand,
void (*done)(struct scsi_cmnd *)
Scsi_Host* is a convenient pointer that most host drivers need anyway,
and 'done' is redundant to struct scsi_cmnd->scsi_done.
Minimal code disturbance was attempted with this change. Most drivers
needed only two one-line modifications for their host lock push-down.
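The shape of the push-down, as a toy model. Names ending in _sketch are
hypothetical and the "lock" is just a boolean so the example stays
single-threaded and self-contained. The old body keeps assuming the lock is
held, and a thin new entry point with the new signature takes the lock
around it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical host: a boolean stands in for the host lock. */
struct host_sketch {
    bool locked;
    int queued;
};

/* Hypothetical command: completion callback lives in the command itself. */
struct qcmd_sketch {
    void (*scsi_done)(struct qcmd_sketch *);
    bool done;
};

void mark_done(struct qcmd_sketch *c) { c->done = true; }

/* Old-style body, unchanged: relies on the caller holding the lock. */
static int queuecommand_lck_sketch(struct host_sketch *h, struct qcmd_sketch *c)
{
    assert(h->locked);            /* lock must be held here */
    h->queued++;
    c->scsi_done(c);              /* 'done' comes from the command */
    return 0;
}

/* New-style entry point: called unlocked, takes the lock itself. */
int queuecommand_sketch(struct host_sketch *h, struct qcmd_sketch *c)
{
    int rc;

    assert(h != NULL && c != NULL && c->scsi_done != NULL);
    h->locked = true;             /* stands in for taking the host lock */
    rc = queuecommand_lck_sketch(h, c);
    h->locked = false;            /* stands in for releasing it */
    return rc;
}
```

A driver that does not need the lock could later drop the wrapper and do
its work directly in the entry point.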
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Acked-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
REQ_HARDBARRIER is dead now, so remove the leftovers. What's left
at this point is:
- various checks inside the block layer.
- sanity checks in bio based drivers.
- now unused bio_empty_barrier helper.
- Xen blockfront use of BLKIF_OP_WRITE_BARRIER - it's dead for a while,
but Xen really needs to sort out its barrier situation.
- setting of ordered tags in uas - dead code copied from old scsi
drivers.
- scsi different retry for barriers - it's dead and should have been
removed when flushes were converted to FS requests.
- blktrace handling of barriers - removed. Someone who knows blktrace
better should add support for REQ_FLUSH and REQ_FUA, though.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
The following are the fixes in this patch:
1. Added support of set timestamp command in the driver
2. Pass all status code to mgmt application. Earlier we were passing
only failed ones.
3. Call class_destroy after unregister_chrdev and pci_unregister_driver
Signed-off-by: Anil Ravindranath <anil_ravindranath@pmc-sierra.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
sense_buffer is both a direct member of struct pmcraid_cmd as well as
an indirect one via an anonymous union and struct. Fix this clash by
eliminating the direct member in favour of the anonymous struct/union
one. The name duplication apparently isn't noticed by gcc versions
earlier than 4.4
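A cut-down illustration of the fixed layout. The struct below is
hypothetical and much smaller than struct pmcraid_cmd; it only shows that
members of an anonymous union/struct share the enclosing struct's
namespace, so the name may be declared only once.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cut-down command: sense_buffer is reachable only through
 * the anonymous union/struct (C11 anonymous members). */
struct pcmd_sketch {
    int id;
    union {
        unsigned long time_left;
        struct {
            unsigned char *sense_buffer;
            unsigned long sense_buffer_dma;
        };
    };
    /* Adding a second direct member named sense_buffer here would
     * declare the same name twice in one struct. */
};

void set_sense(struct pcmd_sketch *c, unsigned char *buf, unsigned long dma)
{
    assert(c != NULL);
    c->sense_buffer = buf;        /* resolved through the anonymous members */
    c->sense_buffer_dma = dma;
}
```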
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Anil Ravindranath <anil_ravindranath@pmc-sierra.com>
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
If the command has timed out then the block layer has called
blk_mark_rq_complete. If qla4xxx_cmd_wait is then called
from qla4xxx_eh_host_reset, we will always fail, because if
the driver calls scsi_done then the block layer will fail
at blk_complete_request's blk_mark_rq_complete call instead of
calling the normal completion path including the function,
blk_queue_end_tag, which releases the tag.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Vikas Chaudhary <vikas.chaudhary@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
If the fw goes to a failure state without raising the fw state change
interrupt to the driver, the driver will check the fw state in its
timeout routine and issue a reset if needed. The driver will attempt
the OCR up to three times before killing the adapter. The driver will
also issue an OCR before killing the adapter even if the fw is in the
operational state.
Signed-off-by: Bo Yang <bo.yang@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The driver adds input parameter support for max_sectors on megaraid
sas gen2 chips. Customers can set max_sectors to 1MB for gen2 chips
at driver load.
Signed-off-by: Bo Yang <bo.yang@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Driver added the Device update flag to tell LSI application driver
whether to do the device Update. LSI MegaRAID SAS application will
check this flag to decide if it needs to update the Device or not.
Signed-off-by: Bo Yang <bo.yang@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This is a trivial addition to the SG API that can receive kernel
pointers. It is only used by the out-of-tree test module. So
its immediate need is questionable. For maintenance ease it might
just get in, as it's very small.
John.
do you need this in the Kernel, or is it only for osd_ktest.ko?
Signed-off-by: John A. Chandy <john.chandy@uconn.edu>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch adds the Scatter-Gather (sg) API to libosd.
Scatter-gather enables a write/read of multiple non-contiguous
areas of an object, in a single call. The extents may overlap
and/or be in any order.
The Scatter-Gather list is sent to the target in what is called
a "cdb continuation segment". This is yet another possible segment
in the osd-out-buffer. It is unlike all other segments in that it
sits before the actual "data" segment (which until now was always
first), and that it is signed by itself and not part of the data
buffer. This is because the cdb-continuation-segment is considered
a spill-over of the CDB data, and is therefore signed under
OSD_SEC_CAPKEY and higher.
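What a gather-read over a list of extents means can be sketched in plain C.
This is a local, in-memory illustration with hypothetical names; it does
not use the libosd API and does not model the cdb continuation segment or
its signing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical scatter-gather entry: a byte range within the object. */
struct sg_extent {
    size_t offset;
    size_t len;
};

/*
 * Gather-read: copy each extent of obj, in list order, into out.
 * Extents may overlap or be unordered; each one is bounds-checked.
 * Returns total bytes copied, or (size_t)-1 on a bad extent or a
 * too-small output buffer.
 */
size_t sg_read(const unsigned char *obj, size_t obj_len,
               const struct sg_extent *sg, size_t n,
               unsigned char *out, size_t out_len)
{
    size_t total = 0;

    assert(obj != NULL && sg != NULL && out != NULL);

    for (size_t i = 0; i < n; i++) {
        if (sg[i].offset > obj_len || sg[i].len > obj_len - sg[i].offset)
            return (size_t)-1;
        if (sg[i].len > out_len - total)
            return (size_t)-1;
        memcpy(out + total, obj + sg[i].offset, sg[i].len);
        total += sg[i].len;
    }
    return total;
}
```

The point of doing this in one call is that the whole extent list travels
with a single request instead of one request per extent.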
TODO: A new osd_finalize_request_ex version should be supplied so
the @caps received on the network also contains a size parameter
and can be spilled over into the "cdb continuation segment".
Thanks to John Chandy <john.chandy@uconn.edu> for the original
code, and investigations. And the implementation of SG support
in the osd-target.
Original-coded-by: John Chandy <john.chandy@uconn.edu>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
At osd_end_request first free the request that might
point to pages, then free these pages. In reverse order
of allocation. For now it's just anal neatness. When we
use mempools it'll also pay in performance.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The _osd_req_finalize_attr_page was off by a mile, when trying to
append the enc_get_attr segment instead of the proper set_attr segment.
Also properly support when we don't have any attribute to set while
getting a full page. And when clearing an attribute by setting its
size to zero.
Reported-by: John Chandy <john.chandy@uconn.edu>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
- Add new WQE fields as defined by new SLI interface to support new hardware.
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Fix critical errors
- Update send_scsi_event to validate pnode pointer active before copying
the wwpn information.
- Add a message, mailbox_idle, and unlock before failing SECURITY_MGMT
or AUTH_PORT mailbox commands
- Prevent spin_lock_irqsave from being called twice in a row.
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Adapter Shutdown and Unregistration cleanup
- Correct the logic around hba shutdown. Prior to final reset, the
driver must wait for all XRIs to return from the adapter. Added logic
to poll, progressively slowing the poll rate as delay gets longer.
- Correct behavior around the rsvd1 field in UNREG_RPI_ALL mailbox
completion and final rpi cleanup.
- Updated logic to move pending VPI registrations to their completion
in cases where a CVL may be received while registration in progress.
- Added unreg all rpi mailbox command before unreg vpi.
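The polling item above can be sketched as follows. The thresholds, names and
the fake "adapter" are all assumptions made for illustration, and time is
simulated rather than slept so the example is deterministic; the only idea
taken from the text is that the poll interval grows as the wait gets longer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical thresholds: poll fast at first, then progressively slower. */
unsigned int poll_interval_ms(unsigned int waited_ms)
{
    if (waited_ms < 1000)
        return 10;
    if (waited_ms < 10000)
        return 100;
    return 1000;
}

/* Hypothetical count of XRIs still held by the adapter. */
struct xri_sketch {
    int outstanding;
};

/* Stand-in for the hardware: one XRI comes back on every poll. */
static int poll_once(struct xri_sketch *x)
{
    if (x->outstanding > 0)
        x->outstanding--;
    return x->outstanding;
}

/* Returns simulated milliseconds waited, or -1 on timeout. */
long wait_for_xris(struct xri_sketch *x, unsigned int timeout_ms)
{
    unsigned int waited = 0;

    assert(x != NULL);
    while (poll_once(x) > 0) {
        if (waited >= timeout_ms)
            return -1;
        waited += poll_interval_ms(waited);   /* a real driver would sleep */
    }
    return (long)waited;
}
```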
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Added driver logic to detect the last devloss timeout of remote nodes which
were still using the FCF. At that point, the driver should set the last
in-use remote node devloss timeout flag if it was not already set and should
perform the proper action on the in-use FCF and recovery of the FCF from
firmware, depending on the state the driver's FIP engine is in.
Find an eligible FCF through an FCF table rescan, or through the next new FCF
event when the rescan finds no eligible FCF. A successful flogi into an FCF
shall clear the HBA_DEVLOSS_TMO flag, indicating the successful recovery
from devloss timeout.
[jejb: add delay.h include to lpfc_hbadisc.c to fix ppc compile]
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Add support of received ELS commands
- Add support for received RLS ELS command
- Add support for received ECHO ELS command
- Add support for received RTV ELS command
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
FC/FCoE Discovery fixes:
- Call the lpfc_drain_txq only for SLI4 hba
- In lpfc_cmpl_els_fdisc, fix code path that does not free IOCB.
- Treated firmware matching FCF property with different index as error
- Propagate error returns from lpfc_issue_els_flogi()
- Refactored lpfc_unregister_unused_fcf() to create a post
lpfc_dev_loss_tmo handler call for SLI-4 devices. Allows checking of
fcf after last ndlp released so that fcf can be released if no longer
in use.
- Replaced individual FCF_XXXX_DISC flag clearing with aggregate
FCF_DISCOVERY flag clearing upon successful completion of flogi.
- Correct setting of altBbCredit value in sparams to correct issue with
logins with remote loop-based devices.
Signed-off-by: Alex Iannicelli <alex.iannicelli@emulex.com>
Signed-off-by: James Smart <james.smart@emulex.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
There was an addition to the hardware roadmap that includes a new adapter.
This patch adds the new definitions for the adapter.
Signed-off-by: Wayne Boyer <wayneb@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch addresses the comments from Randy Dunlap (Randy.Dunlap@oracle.com)
regarding comment blocks that begin with "/**". bfa driver comments
currently do not follow kernel-doc convention, we hence replace all
/** with /* and **/ with */.
Signed-off-by: Jing Huang <huangj@brocade.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch addresses the comments from Randy Dunlap (Randy.Dunlap@oracle.com)
regarding comment blocks that begin with "/**". bfa driver comments
currently do not follow kernel-doc convention, we hence replace all
/** with /* and **/ with */.
Signed-off-by: Jing Huang <huangj@brocade.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Fix compile warning for frame size over 1024 in gcc 4.4.
Signed-off-by: Jing Huang <huangj@brocade.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch replaces register access functions and macros with the ones
provided by linux.
Signed-off-by: Jing Huang <huangj@brocade.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch removes os wrapper and unused functions.
bfa_os_assign(), bfa_os_memset(), bfa_os_memcpy(), bfa_os_udelay()
bfa_os_vsprintf(), bfa_os_snprintf(), and bfa_os_get_clock() are replaced with
direct assignment or native linux functions. Some unused functions related to VF
(Virtual fabric) are also removed.
Signed-off-by: Jing Huang <huangj@brocade.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Ignore active open reply with status negative advice. This is an
informational message.
Signed-off-by: Karen Xie <kxie@chelsio.com>
Reviewed-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch fixes an issue which causes the firmware to fail with a
'PRLI failed' status code (iop1 = 405). This status triggers the
driver to fall into an incorrect code-path which does not attempt
a login retry.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch fixes a regression introduced by commit
083a469db4
qla2xxx_eh_wait_on_command() is waiting for an srb to
complete, which will never happen as the routine took
a reference to the srb previously and will only drop it
after this function. So every command abort will fail.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: Giridhar Malavali <giridhar.malavali@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch adds a shutdown handler to qla2xxx driver to make sure that all
DMA and firmware activities are stopped, and any associated driver resources
are released. The need for this handler arose when executing kexec in specific
environments caused the data of the 2nd kernel to be corrupted, due to DMA
activities.
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Commit feafb7b171 neglected to initialize
the spinlock.
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch cleans up any printk or debug tracing of the
serial_number field in the qla2xxx driver.
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Currently when we receive a CS_RESET as a response for a SCSI command the
driver will return DID_TRANSPORT_DISRUPTED back to the SCSI mid-layer. There
are certain circumstances where this could cause the mid-layer to exhaust all of
its retries if the FC port goes away for a short time. This will result in
commands being prematurely failed. Moving the CS_RESET return code to be
grouped with other link level events will cause the FC transport layer to block
that target's queue thus preventing the premature exhaustion of retries.
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Using del_timer_sync() in the qla2x00_ctx_sp_free() function may cause a kernel
panic as it is not interrupt context safe and qla2x00_ctx_sp_free() may be
called from a softirq context. Changing the call from del_timer_sync() to
del_timer() will make the function interrupt context safe.
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Add the module parameter ql2xgffidenable to disable/enable the use of the
GFF_ID name server command to prevent non FCP SCSI devices from being added to
the driver's internal fc_port database.
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch removes the use of the port down retry counter as a mechanism to
update a fcport state. The internal driver counter is a residual carry-over
from pre-FC-transport aware driver interaction. The ql2xport_down_retry module
parameter and NVRAM set ha->port_down_retry_count remain in order to seed the
fc-host's default dev-loss-tmo.
Signed-off-by: Chad Dupuis <chad.dupuis@qlogic.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
IRQs are already disabled here so we don't need to disable them again.
But more importantly, the spin_lock_irqsave() overwrites "flags" and
that breaks things when we want to re-enable the IRQs when we call
spin_unlock_irqrestore(&ha->hardware_lock, flags);
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Madhuranath Iyengar <Madhu.Iyengar@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
An sr device that reports sense data with SK/ASC/ASCQ of 2/4/2 (Not ready,
Logical unit not ready, Initializing command required) will be handled
in sr_drive_status as (2/4/!1) and assumed to be a 'format in progress'
which returns CDS_DISC_OK. The drive will not be made ready in this case.
Prior to 210ba1d172 sr_drive_status would
have returned CDS_TRAY_OPEN, which results in a START_STOP_UNIT to
close the tray and so resolves the initialization requirement.
This patch adds handling for SK/ASC/ASCQ of 2/4/2 where it will return
CDS_TRAY_OPEN as a means of triggering a START_STOP_UNIT.
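The sense-code handling described above can be sketched as a small
classifier. This is a minimal illustration, not the sr driver's code: the
struct, the enum and the function name are hypothetical stand-ins, and the
enum values do not claim to match the kernel's CDS_* constants. The 2/4/1
case is left unmodelled because the text above does not say what it returns.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the CDS_* drive status codes. */
enum drive_status {
    STATUS_UNMODELLED,  /* anything this sketch does not cover */
    STATUS_TRAY_OPEN,   /* plays the role of CDS_TRAY_OPEN */
    STATUS_DISC_OK      /* plays the role of CDS_DISC_OK */
};

/* Hypothetical container for the three sense fields discussed above. */
struct sense_triple {
    uint8_t sk;    /* sense key */
    uint8_t asc;   /* additional sense code */
    uint8_t ascq;  /* additional sense code qualifier */
};

#define SK_NOT_READY            0x02
#define ASC_LUN_NOT_READY       0x04
#define ASCQ_INIT_CMD_REQUIRED  0x02

static enum drive_status classify_not_ready(const struct sense_triple *s)
{
    assert(s);

    if (s->sk != SK_NOT_READY || s->asc != ASC_LUN_NOT_READY)
        return STATUS_UNMODELLED;

    /* New case: 2/4/2 reports "tray open" so that the caller issues a
     * START_STOP_UNIT, which satisfies the initialization requirement. */
    if (s->ascq == ASCQ_INIT_CMD_REQUIRED)
        return STATUS_TRAY_OPEN;

    /* Existing 2/4/!1 case: treated as "format in progress". */
    if (s->ascq != 0x01)
        return STATUS_DISC_OK;

    return STATUS_UNMODELLED;
}
```

The 2/4/2 test has to come before the 2/4/!1 test, since 2 is also "not 1"
and would otherwise still be reported as a usable disc.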
This issue is seen on the IBM POWER platform when using a file-backed,
virtual optical device. The device does not support media queries
through the Get Event Status Notification command which could otherwise
trigger a START_STOP_UNIT call to close an open tray.
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
A previous patch attempted to validate the destination
MAC address of a FCoE frame by checking that MAC
address against the received port's MAC address. The
implementation seems fine on the surface, but any
VN_Ports added using the NPIV feature will have their
own MAC addresses and these MACs were not being checked,
which prevented any NPIV VN_Ports from receiving frames.
In other words, the following patch has broken NPIV.
519e5135e2
[SCSI] fcoe: adds src and dest mac address
checking for fcoe frames
Part of the offending patch is correct, but the part
that broke NPIV was attempting to satisfy FC-BB-5
section D.5, 2.1-
(discard frames that) "contain a destination MAC
address/destination N_Port_ID pair that was not
assigned by an FCF to one of the VN_Ports on the ENode"
The language does _not_ say to compare the destination
FC-MAP/destination N_Port_ID, but instead to compare
the destination MAC address/destination N_Port_ID.
From the FC-BB-5 specification,
"A properly formed FPMA is one in which the 24 most
significant bits equal the Fabric’s FC-MAP value and
the least significant 24 bits equal the N_Port_ID
assigned to the VN_Port by the FCF."
This means that we need to compare the FC Frame's
destination FCID against the embedded FCID in the
destination MAC address. This patch checks the lower
24 bits of the destination MAC address against
destination FCID in the Fibre Channel frame.
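The comparison described above can be sketched as follows. This is a
minimal illustration under stated assumptions, not the fcoe driver's code:
the helper names are hypothetical, and the destination FCID is assumed to
be available as the 3 big-endian bytes carried in the FC header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAC_LEN 6  /* bytes in an Ethernet MAC address */

/* Pack a 3-byte big-endian field into a 24-bit integer. */
static uint32_t be24_to_u32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 16) | ((uint32_t)p[1] << 8) | (uint32_t)p[2];
}

/*
 * Hypothetical helper: returns true when the N_Port_ID embedded in the
 * least significant 24 bits of the destination MAC equals the
 * destination FCID taken from the FC frame header.
 */
static bool dest_mac_matches_fcid(const uint8_t dest_mac[MAC_LEN],
                                  const uint8_t dest_fcid[3])
{
    assert(dest_mac && dest_fcid);

    /* Bytes 0..2 of an FPMA hold the FC-MAP; bytes 3..5 hold the N_Port_ID. */
    return be24_to_u32(&dest_mac[3]) == be24_to_u32(dest_fcid);
}
```

A frame for which this returns false would be discarded. The check costs
the same regardless of how many NPIV ports exist, because it never
consults a per-port list.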
For MAC validation the first line of defense is the
hardware MAC filtering. Each VN_Port will have a
unicast MAC address added to the hardware's
filtering table. The Ethernet driver should drop any
frames not destined for a programmed MAC. This patch
adds a second line of defense that very specifically
compares an element in the FC frame against an element
in the Ethernet header, which is appropriate for the
FCoE layer.
Many alternative approaches were considered, including
an LLD callback from libfc. The second most reasonable
approach seemed to be walking the list of NPIV ports
and checking each of their MAC addresses against the
destination MAC address of the received frame. The
problem with this approach was that performance would
likely suffer as more NPIV ports are added
to the system, since every received frame would need to
walk this list, comparing each entry's MAC.
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Fix: When a FIP frame is received, fcoe_ctlr_vn_recv() calls
fcoe_ctlr_vn_parse(), which does a memset on &buf.rdata and thereby
corrupts memory. The code was treating "buf" as a struct, but it was defined
as a union. The fix is to change "buf" in fcoe_ctlr_vn_recv() from a union to a struct.
Technical Details: N/A
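The union-versus-struct problem described above can be reproduced in a few
lines. This is an illustrative sketch only: the member types and the "op"
field are hypothetical stand-ins, not the layout used in
fcoe_ctlr_vn_recv(). It shows that clearing one member of a union also
clears its sibling, while in a struct the two members occupy separate
storage.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the two things kept in "buf". */
struct frame_info  { uint32_t op; };
struct remote_data { uint8_t mac[6]; };

union buf_as_union {
    struct frame_info  frame;
    struct remote_data rdata;   /* shares storage with frame */
};

struct buf_as_struct {
    struct frame_info  frame;
    struct remote_data rdata;   /* has storage of its own */
};

/* Store op, clear rdata, then read op back from a union. */
static uint32_t op_after_clear_union(uint32_t op)
{
    union buf_as_union buf;

    /* The overlap only wipes all of op if rdata is at least as large. */
    assert(sizeof(buf.rdata) >= sizeof(buf.frame));

    buf.frame.op = op;
    memset(&buf.rdata, 0, sizeof(buf.rdata));  /* also wipes frame.op */
    return buf.frame.op;
}

/* Same sequence with a struct: frame.op survives the memset. */
static uint32_t op_after_clear_struct(uint32_t op)
{
    struct buf_as_struct buf;

    buf.frame.op = op;
    memset(&buf.rdata, 0, sizeof(buf.rdata));
    return buf.frame.op;
}
```

With the union, the value stored in frame.op is lost as soon as rdata is
cleared, which is the kind of silent overwrite the fix avoids.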
Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Acked-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
When the number of NPIV ports created is greater than the number of
xids allocated per pool (e.g., creating 255 NPIV ports on a
system with nr_cpu_ids of 32, with each pool containing 128
xids), generating a link event (e.g.,
shutdown/no shutdown) on the switch port causes a hang
with the following stack trace.
Call Trace:
schedule_timeout+0x19d/0x230
wait_for_common+0xc0/0x170
__cancel_work_timer+0xcf/0x1b0
fc_disc_stop+0x16/0x30 [libfc]
fc_lport_reset_locked+0x47/0x90 [libfc]
fc_lport_enter_reset+0x67/0xe0 [libfc]
fc_lport_disc_callback+0xbc/0xe0 [libfc]
fc_disc_done+0xa8/0xf0 [libfc]
fc_disc_timeout+0x29/0x40 [libfc]
run_workqueue+0xb8/0x140
worker_thread+0x96/0x110
kthread+0x96/0xa0
child_rip+0xa/0x20
The fix is to not cancel the disc_work if discovery is already
stopped, thus allowing the lport state machine to restart and try
discovery again.
Signed-off-by: Bhanu Prakash Gollapudi <bprakash@broadcom.com>
Acked-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This is unlikely, but when it is hit it causes a panic
due to a NULL cmd pointer. Only one instance has been seen so far,
recently with ESX, though the bug was introduced long ago by this commit:
commit c1ecb90a66
Author: Chris Leech <christopher.leech@intel.com>
Date: Thu Dec 10 09:59:26 2009 -0800
[SCSI] libfc: reduce hold time on SCSI host lock
Currently fsp->cmd is set to NULL without holding scsi_queue_lock,
before the fsp is dequeued from scsi_pkt_queue. As a result,
fc_fcp_cleanup_each_cmd() can see a NULL fsp->cmd for a cmd that
completes after fc_fcp_cleanup_each_cmd() has taken its
reference. There is no need to set fsp->cmd to NULL, as this is also
protected by fc_fcp_lock_pkt(): for the race above,
fc_fcp_lock_pkt() in fc_fcp_cleanup_each_cmd() will fail
because that cmd is already done.
Mike mentioned the same issue at
http://www.open-fcoe.org/pipermail/devel/2010-September/010533.html
Similarly, moved sc_cmd->SCp.ptr = NULL under scsi_queue_lock so
that the scsi abort error handler won't abort completed cmds.
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The current FIP_MODE_AUTO mode sometimes falls back to non-FIP
mode while the DCB link is still getting ready in fabric mode with
its peer switch; it falls back after a few libfc flogi retries.
That is not what we want when working with FIP-enabled
switches in FABRIC mode, so set the default to FIP_MODE_FABRIC,
as discussed and agreed earlier in this mail thread:
http://www.open-fcoe.org/pipermail/devel/2010-August/010511.html
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
Sometimes a switch in NPV mode rejects a flogi request with a DID
of zero. In that case the flogi is not tried again and the port
remains offline. This patch validates that the DID is non-zero,
in addition to checking for an ACC response, so that an RJT with
DID=0 leads to a flogi retry and the FLOGI can succeed on the next try.
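The validation described above can be sketched as a single predicate. This
is a minimal illustration, not libfc's code: the enum, the struct and the
function name are hypothetical, and the sketch assumes that any response
failing the check is simply retried.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reply kinds; stand-ins for the ELS accept/reject opcodes. */
enum flogi_reply { REPLY_ACC, REPLY_RJT };

/* Hypothetical summary of a received FLOGI response. */
struct flogi_resp {
    enum flogi_reply reply;
    uint32_t did;   /* 24-bit destination ID carried by the response */
};

/*
 * Returns true only for an ACC that carries a non-zero DID. In this
 * sketch, everything else (including an RJT with DID=0) counts as
 * "not logged in yet", so the caller retries the flogi.
 */
static bool flogi_login_ok(const struct flogi_resp *resp)
{
    assert(resp);
    return resp->reply == REPLY_ACC && resp->did != 0;
}
```

Requiring both conditions means a zero DID can never be accepted as an
assigned port ID, whatever the reply type.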
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>