This patch improves handling of TASK ABORTED status by Linux SCSI
mid-layer. Currently, command returned with this status considered
failed and returned to upper layers. It leads to additional error
recovery load on file systems and block layer, which sometimes can
cause undesired side effects, like I/O errors and file systems
corruptions. See http://lkml.org/lkml/2008/11/1/38, for instance.
From other side, TASK ABORTED status is returned by SCSI target if the
corresponding command was aborted by another initiator and the target
has TAS bit set in the control mode page. So, in the majority of cases
commands with TASK ABORTED status should be simply retried. In other
cases, maybe_retry path will not retry if no retries are allowed.
This patch implement suggestion by James Bottomley from
http://marc.info/?l=linux-scsi&m=121932916906009&w=2.
Signed-off-by: Vladislav Bolkhovitin <vst@vlnb.net>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
When the mode select sent to the controller fails with the retryable
error, it is better to retry the mode_select from the hardware handler
itself, instead of propagating the failure to dm-multipath.
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
When the controller ownership is changed (from passive to active),
check_ownership() doesn't set the state of the device to ACTIVE.
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reported-by: "Moger, Babu" <Babu.Moger@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
...and the list of recent breakage goes on and on, this time
it's 242f9dcb8b (block: unify request timeout handling)
which broke it.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
When unsigned, scsi_dma_map may return -ENOMEM without triggering BUG_ON()
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Use the macro DIV_ROUND_UP and eliminate the variable rounded_up, as
suggested by Matthew Wilcox.
Signed-off-by: Julia Lawall <julia@diku.dk>
Cc: David Miller <davem@davemloft.net>
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Stops gcc from complaining about a possible uninitialized
variable being used in ibmvfc_reset_device.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If the ibmvfc driver is in discovery attempting to log into a target
and it encounters an error, the command may get retried one or more
times, depending on the error received. If the retries are
unsuccessful such that the discovery thread gives up on discovery to
that target, the target ends up in a state where, if SCSI core had
previously known about the device, the host will get unblocked but the
host will not be logged into the target, causing any commands sent to
the target to fail. This patch fixes this so that if this occurs, the
target is deleted such that the normal dev_loss processing can occur
instead.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Due to an ambiguity in the VIOS VFC interface specification,
abort/cancel handling is not done correctly and can result in double
completion of commands. In order to cancel all outstanding commands to
a device, a cancel must be sent, followed by an abort task set. After
the responses are received for these commands, there may still be
commands outstanding, in the process of getting flushed back, in which
case, we need to wait for them. This patch removes the assumption that
if the abort and the cancel both complete successfully that the device
queue has been flushed and waits for all the responses to come back.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If either a "transport fault" or a "general transport" error is received
and no other error information is available, the command is improperly
returned as successful. Fix this to return DID_ERROR in this case.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The ibmvfc log level filtering logic was reversed. The log_level scsi
host parameter should result in more verbose logs when log_level is
larger, not smaller.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
We need to check the address that pci_alloc_consistent() returns since
it might fail.
When pci_alloc_consistent() fails, some IOMMUs set the dma_handle
argument to zero. So we can't use fibptr->hw_fib_pa directly here.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Aacraid List <aacraid@adaptec.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Use the newly introduced pci_ioremap_bar() function in drivers/scsi.
pci_ioremap_bar() just takes a pci device and a bar number, with the goal
of making it really hard to get wrong, while also having a central place
to stick sanity checks.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Brian King <brking@us.ibm.com>
Cc: Ed Lin <ed.lin@promise.com>
Cc: Nick Cheng <nick.cheng@areca.com.tw>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Fix kernel-doc parameter warning and correct the function name:
Warning(linux-next-20081022//drivers/scsi/scsi_ioctl.c:281): No description found for parameter 'ndelay'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: use the new byteorder headers
fbcon: Protect free_irq() by MACH_IS_ATARI check
fbcon: remove broken mac vbl handler
m68k: fix trigraph ignored warning in setox.S
macfb annotations and compiler warning fix
m68k: mac baboon interrupt enable/disable
m68k: machw.h cleanup
m68k: Mac via cleanup and commentry
m68k: Reinstate mac rtc
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1429 commits)
net: Allow dependancies of FDDI & Tokenring to be modular.
igb: Fix build warning when DCA is disabled.
net: Fix warning fallout from recent NAPI interface changes.
gro: Fix potential use after free
sfc: If AN is enabled, always read speed/duplex from the AN advertising bits
sfc: When disabling the NIC, close the device rather than unregistering it
sfc: SFT9001: Add cable diagnostics
sfc: Add support for multiple PHY self-tests
sfc: Merge top-level functions for self-tests
sfc: Clean up PHY mode management in loopback self-test
sfc: Fix unreliable link detection in some loopback modes
sfc: Generate unique names for per-NIC workqueues
802.3ad: use standard ethhdr instead of ad_header
802.3ad: generalize out mac address initializer
802.3ad: initialize ports LACPDU from const initializer
802.3ad: remove typedef around ad_system
802.3ad: turn ports is_individual into a bool
802.3ad: turn ports is_enabled into a bool
802.3ad: make ntt bool
ixgbe: Fix set_ringparam in ixgbe to use the same memory pools.
...
Fixed trivial IPv4/6 address printing conflicts in fs/cifs/connect.c due
to the conversion to %pI (in this networking merge) and the addition of
doing IPv6 addresses (from the earlier merge of CIFS).
Remove some more cruft from machw.h and drop the #include where it isn't
needed.
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] mpt fusion: clear list of outstanding commands on host reset
[SCSI] scsi_lib: only call scsi_unprep_request() under queue lock
[SCSI] ibmvstgt: move crq_queue_create to the end of initialization
[SCSI] libiscsi REGRESSION: fix passthrough support with older iscsi tools
[SCSI] aacraid: disable Dell Percraid quirk on Adaptec 2200S and 2120S
It's called under that lock everywhere else and it does alter the
request state, so it should be.
This one occurance in scsi_requeue_command() could open a window where
req->special is set to NULL while the requests is going through either
timeout or completion processing leading to NULL pointer derefs of the
sort complained of in bugzillas 12020 and 12195.
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The test-unit-ready portion of this patch was causing boots to fail on
my test machine (as in http://lkml.org/lkml/2008/12/5/161). With this
patch in place, the system is booting reliably.
Mike Anderson found the same problem in the hp_hw_start_stop code,
and I applied the same solution in cdrom_read_cdda_bpc.
Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com>
Cc: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Calling crq_queue_create could lead to the creation of a rport. We
need to set up everything before creating a rport. This moves
crq_queue_create to the end of initialization to avoid a race which
causes an oops if lost.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reported-by: Olaf Hering <olh@suse.de>
Cc: stable@kernel.org
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Update FMODE_NDELAY before each ioctl call so that we can kill the
magic FMODE_NDELAY_NOW. It would be even better to do this directly
in setfl(), but for that we'd need to have FMODE_NDELAY for all files,
not just block special files.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This regression was added in 2.6.27, when the mtask and ctask were
merged into the the common task struct. The patch applies to
scsi-rc-fixes, but also applies to 2.6.27 with some offsets.
The problem is that __iscsi_conn_send_pdu assumes that userspace was
not sending nops with the format it is checking for in the "if" below.
It turns out that older userspace tools are. This patch moves the
setting of the internal ping_task tracker (it tracks libiscsi current
outstanding nop) to iscsi_send_nopout which is only used by kernel callers.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
A lot of 64bit machines with Adaptec 2200S and 2120S controllers don't
recognize SCSI disks any more with the patch
commit 94cf6ba11b
Author: Salyzyn, Mark <mark_salyzyn@adaptec.com>
Date: Thu Dec 13 16:14:18 2007 -0800
[SCSI] aacraid: fix driver failure with Dell PowerEdge Expandable RAID Controller 3/Di
but fail with tons of "aac_srb: aac_fib_send failed with status: 8195"
instead. This patch disables the quirk introduced in the change cited
above for those two controllers again.
[thenzl: added 2120S Controller]
Signed-off-by: Gernot Hillier <gernot.hillier@siemens.com>
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Acked-by: Matt Domsch <Matt_Domsch@dell.com>
Cc: AACRAID list <aacraid@adaptec.com>
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] stex: switch to block timeout
[SCSI] make scsi_eh_try_stu use block timeout
[SCSI] megaraid_sas: switch to block timeout
[SCSI] ibmvscsi: switch to block timeout
[SCSI] aacraid: switch to block timeout
[SCSI] zfcp: prevent double decrement on host_busy while being busy
[SCSI] zfcp: fix deadlock between wq triggered port scan and ERP
[SCSI] zfcp: eliminate race between validation and locking
[SCSI] zfcp: verify for correct rport state before scanning for SCSI devs
[SCSI] zfcp: returning an ERR_PTR where a NULL value is expected
[SCSI] zfcp: Fix opening of wka ports
[SCSI] zfcp: fix remote port status check
[SCSI] fc_transport: fix old bug on bitflag definitions
[SCSI] Fix hang in starved list processing
stex sets the timeout in its slave configure routine for all devices.
This now needs to update the request queue timeout in block.
Cc: Ed Lin <ed.lin@promise.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
scsi_eh_try_stu() was still using the timeout parameter in the device
which is now not set (i.e. zero filled) meaning that it waited no time
at all for the start unit command to complete (leading the routine to
conclude failure every time). This lead to a 2.6.27 regression:
http://bugzilla.kernel.org/show_bug.cgi?id=12120
Where firewire devices that were non spec compliant wouldn't spin up.
Fix this by using the block queue timeout value instead.
Reported-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
megaraid_sas sets the timeout in its slave configure routine for devices
on special channels. This now needs to update the request queue timeout
in block.
Cc: "Yang, Bo" <Bo.Yang@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
ibmvscsi sets the timeout in its slave configure routine for disk
devices. This now needs to update the request queue timeout in block.
Cc: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
aacraid updates the timeout in its slave configure routine if it is too
small. This now needs to update the request queue timeout in block.
Cc: AACRAID list <aacraid@adaptec.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
PCI side of driver should be devinit, not init
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The code
if (shost->dma_channel != NO_ISA_DMA)
free_dma(shost->dma_channel);
in there is triggerable only if we have CONFIG_ISA (we only set ->dma_channel to
something other than NO_ISA_DMA under #ifdef CONFIG_ISA). OTOH, free_dma() is
not guaranteed to be there in absense of CONFIG_ISA. IOW, driver runs into
undefined symbols on PCI-but-not-ISA configs (e.g. on frv) and it's a false
positive.
Fix: put the entire if () under #ifdef CONFIG_ISA; behaviour doesn't change and
dependency on free_dma() disappears for !CONFIG_ISA.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Close possible infinite loop with interrupts off when devices are
added back to the starved list.
Fixes: http://bugzilla.kernel.org/show_bug.cgi?id=11898
Reported-by: <alex.shi@intel.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
dpt_i2o.c::adpt_i2o_to_scsi() reads the value at (reply+5) which
should contain the length in bytes of the transferred data. This
would be correct if reply was a u32 *. However it is a void * here,
so we need to read the value at (reply+20) instead.
The value at (reply+5) is usually 0xff0000, which is apparently
'large enough' and didn't cause any trouble until 2.6.27 where
commit 427e59f09f
Author: James Bottomley <James.Bottomley@HansenPartnership.com>
Date: Sat Mar 8 18:24:17 2008 -0600
[SCSI] make use of the residue value
caused this to become visible through e.g. iostat -x .
Signed-off-by: Miquel van Smoorenburg <mikevs@xs4all.net>
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Drivers want to be able to return DID_TRANSPORT_DISRUPTED and
have it do the right thing for commands like tape and passthrouh
as far as retries go. The LLDs previously used DID_BUS_BUSY or DID_ERROR
which followed the cmd->retries limit, but DID_TRANSPORT_DISRUPTED
was skipping that check so it could have caused a problem with tape
commands.
This patch has DID_TRANSPORT_DISRUPTED check the cmd->retries/cmd->allowed.
Signed-off-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Mike Reed noted
(https://bugzilla.novell.com/show_bug.cgi?id=421330) that the
driver was incorrectly returning a SUCCESS status if the driver's
request to the firmware to abort a command failed. By doing so,
the mid-layer believed, incorrectly, that the command has
completed and has been returned (ultimately clearing
scsi_cmnd.request_buffer) yet the driver still has the command.
What should correctly happen is a mid-layer escalation
(device-reset, etc.) of recovery during which the driver will
eventually return the outstanding commands to the mid-layer.
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
For 23XX ISPs, max_vports may return an invalid value.
Do not honour it.
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Use correct block size (4K) for erase command 0x20 for Atmel
Flash. Use dword addresses for determining sector boundary.
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: Lalit Chandivade <lalit.chandivade@qlogic.com>
Signed-off-by: Andrew Vasquez <andrew.vasquez@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
scsi_cmnd->cmnd was changed from a static array to a pointer post
2.6.25. It breaks mega_internal_command():
static int
mega_internal_command(adapter_t *adapter, megacmd_t *mc, mega_passthru *pthru)
{
...
scb = &adapter->int_scb;
memset(scb, 0, sizeof(scb_t));
scmd = &adapter->int_scmd;
memset(scmd, 0, sizeof(Scsi_Cmnd));
sdev = kzalloc(sizeof(struct scsi_device), GFP_KERNEL);
scmd->device = sdev;
scmd->device->host = adapter->host;
scmd->host_scribble = (void *)scb;
scmd->cmnd[0] = MEGA_INTERNAL_CMD;
mega_internal_command() uses scsi_cmnd allocated internally so
scmd->cmnd is NULL here. This patch adds a static array for cdb to
adapter_t and uses it here. This also uses
scsi_allocate_command/scsi_free_command, the recommended way to
allocate struct scsi_cmnd since the driver might use sense_buffer in
struct scsi_cmnd.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Reviewed-by: Boaz Harrosh <bharrosh@panasas.com>
Tested-by: Pascal Terjan <pterjan@gmail.com>
Reported-by: Pascal Terjan <pterjan@gmail.com>
Acked-by: "Yang, Bo" <Bo.Yang@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
As it is, all instances of ->release() for files that have ->fasync()
need to remember to evict file from fasync lists; forgetting that
creates a hole and we actually have a bunch that *does* forget.
So let's keep our lives simple - let __fput() check FASYNC in
file->f_flags and call ->fasync() there if it's been set. And lose that
crap in ->release() instances - leaving it there is still valid, but we
don't have to bother anymore.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Using NIPQUAD() with NIPQUAD_FMT, %d.%d.%d.%d or %u.%u.%u.%u
can be replaced with %pI4
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>