If we had multiple tasks on the cmd or requeue lists, and iscsi_tcp
returns a error, the write_space function can still run and queue
iscsi_data_xmit. If it was a legetimate problem and iscsi_conn_failure
was run but we raced and iscsi_data_xmit was run first it could miss
the suspend bit checks, and start trying to send data again and hit
another timeout. A similar problem is present when using cxgb3i.
This has libiscsi check the suspend bit before calling the xmit
task callout, so we at least do not try sending multiple tasks
(one could be sent).
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
If we sent multiple pdus as immediate the target could be
rejecting some and we have just been dropping the rejection
notification. This adds code to handle nop-out rejections,
so if a nop-out was sent as a ping and rejected we do not
mark the connection bad. Instead we just clean up the timers
since we have pdu making a rount trip we know the connection
is good.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
We increment session->cmdsn at the top of iscsi_prep_scsi_cmd_pdu, but
if the prep ecb or prep bidi or init_task calls fails then we leave the
session->cmdsn incremented. This moves the cmdsn manipulation to the end
of the function when we know it has succeeded.
It also adds a session->cmdsn--; in queuecommand for if a driver like
bnx2i tries to send a a task from that context but it fails. We do not
have to do this in the xmit thread context because that code will retry
the same task if the initial call fails.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
The session lock can be held in the scsi eh thread or the completion
paths run from the net softirq. This disables bhs in iscsi_eh_abort when
taking the session lock.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Allow the user to control the debug logs in libiscsi. We will now
have a module param for connection, session & error handling.
[Mike Christie - Fixed up to compile on current code and added
missing ISCSI_DBG_EH conversions]
Signed-off-by: Erez Zilber <erezzi.list@gmail.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If we are sending or receiving data for the task successfully do
not run the scsi eh, because we know the task is making progress.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch just adds some debug statements for the abort
and completion paths.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If a task did not complete normally due to a TMF, libiscsi will
now complete the task with the state ISCSI_TASK_ABRT_TMF. Drivers
like bnx2i that need to free resources if a command did not complete normally
can then check the task state. If a driver does not need to send
a special command if we have dropped the session then they can check
for ISCSI_TASK_ABRT_SESS_RECOV.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Instead of having libiscsi check if the offload bit is set, have
it check if the lld created a work queue. I think this is more
clear.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If the session is failed, but we have not yet fully transitioned
to the recovery stage we were still queueuing IO. The idea is
that for some failures we can recvover at the command level
and still continue to execute other IO. Well, we never have
added the recovery within a command code, so queueing up IO here
just creates the possibility that it might time time out so
this just has us requeue the IO the scsi layer for now.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
bnx2i needs to send a hardware specific cleanup command if
a command has not completed normally (iscsi/scsi response from
target), and the session is still ok (this is the case when we
send a TMF to stop the command).
At this time it will need to drop the session lock. The problem
with the current code is that fail_all_commands assumes we
will hold the lock the entire time, so it uses list_for_each_entry_safe.
If while bnx2i drops the session lock multiple cmds complete then
list_for_each_entry_safe will not handle this correctly.
This patch removes the running lists and just has us loop over
the cmds array (in later patches we will then replace that
array with a block tag map at the session level). It also fixes
up the completion path so that if the TMF code and the normal recv
path were completing the same command then they both do not try
to do release the refcount taken when the task is queued.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If we have not got any pdus for recv_timeout seconds, then we will
send a iscsi ping/nop to make sure the target is still around. The
problem is if this is a slow link, and the ping got queued after
the data for a data_out (read), then the transport code could think
the ping has failed when it is just slowly making its way through
the network. This patch has us check if we are making progress while
the nop is outstanding. If we are still reading in data, then we
do not fail the session at that time.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If we are responding to a nop from the target by sending our nop,
and the session is getting torn down, then iscsi_start_session_recovery
could set the conn stop bits while the recv path is sending the nop
response and we will hit the bug ons in __iscsi_conn_send_pdu.
This has us check the state in __iscsi_conn_send_pdu and fail all
incoming mgmt IO if we are not logged in and if the pdu is not login
related. It also changes the ordering of the setting of conn stop state
bits so they are set after the session state is set (both are set under
the session lock).
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This has iscsi_data_in_rsp call iscsi_update_cmdsn when a pdu is
completed like is done for other pdu's that are don.
For libiscsi_tcp, this means that it calls iscsi_update_cmdsn when
it is handling the pdu internally to only transfer data, but if there is
status then it does not need to call it since the completion handling
will do it.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
bnx2i needs to be able to look up mgmt task like login and nop, because
it does some processing of them on the completion path. This exports
iscsi_itt_to_task so it can look up the task.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If we could not allocate the initiator name or some other id like
the hwaddress or netdev, then userspace could deal with the failure
by just running in a dregraded mode.
Now we want to be able to switch values for the params and we
want some feedback, so this patch will check if a string like
the initiatorname could not be allocated and return an error.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
bnx2i does not have one. It currently preallocates the bdt
when the session is setup.
We probably want to change that to a dma pool, then allocate from
the pool in the alloc pdu. Until then check if there is a alloc
pdu callout.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Set target can queue limit to the number of preallocated
session tasks we have.
This along with the cxgb3i can_queue patch will fix a throughput
problem where it could only queue one LU worth of data at a time.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Le lundi 30 mars 2009, Chris Wright a écrit :
> q->queue could be ERR_PTR(-ENOMEM) which will break unwinding
> on error. Make iscsi_pool_free more defensive.
>
Making the freeing of q->queue dependent on q->pool being set looks
really weird (although it is correct at the moment. But this seems
to be fixable in a much simpler way.
With the benefit that only the error case is slowed down. In both
cases we have a problem if q->queue contains an error value but it's
not -ENOMEM. Apparently this can't happen today, but it doesn't feel
right to assume this will always be true. Maybe it's the right time
to fix this as well.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If the iscsi eh fires when the current task is a nop, then
the task->sc pointer is null. fail_all_commands could
then try to do task->sc->device and oops. We actually do
not need to access the curr task in this path, because
if it is a cmd task the fail_command call will handle
this and if it is mgmt task then the flush of the mgmt
queues will handle that.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The api for conn and session failures is akward because
one takes a conn from the lib and one takes a session
from the class. This syncs up the interfaces to use
structs from the lib.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The qdepth setting was useful when we needed libiscsi to verify
the setting. Now we just need to make sure if older tools
passed in zero then we need to set some default.
So this patch just has us use the sht->cmd_per_lun or if
for LLD does a host per session then we can set it on per
host basis.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
We were using the shost work queue which ended up being
a little akward since all iscsi hosts need a thread for
scanning, but only drivers hooked into libiscsi need
a workqueue for transmitting. So this patch moves the
xmit workqueue to the lib.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
There is no need to cap the queue depth in the modules. We set
this in userspace and can do that there. For performance testing
with ram based targets, this is helpful since we can have very
high queue depths.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This makes the logging a compile time option and replaces
the scsi_debug macro with session and connection ones
that print out a driver model id prefix.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Memory freeing in iscsi_pool_free() looks wrong to me. Either q->pool
can be NULL and this should be tested before dereferencing it, or it
can't be NULL and it shouldn't be tested at all. As far as I can see,
the only case where q->pool is NULL is on early error in
iscsi_pool_init(). One possible way to fix the bug is thus to not
call iscsi_pool_free() in this case (nothing needs to be freed anyway)
and then we can get rid of the q->pool check.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Yanling Qi from LSI found the root cause of the panic, below is his
analysis:
Problem description: the open iscsi driver installs eh_timed_out handler
to the
blank_transport_template of the scsi middle level that causes panic of
timed
out command of other host
Here are the details
Iscsi Session creation
During iscsi session creation time, the iscsi_tcp_session_create() of
iscsi_tpc.c will create a scsi-host for the session. See the statement
marked
with the label A. The statement B replaces the shost->transportt point
with a
local struct variable.
static struct iscsi_cls_session *
iscsi_tcp_session_create(struct iscsi_endpoint *ep, uint16_t cmds_max,
uint16_t qdepth, uint32_t initial_cmdsn,
uint32_t *hostno)
{
struct iscsi_cls_session *cls_session;
struct iscsi_session *session;
struct Scsi_Host *shost;
int cmd_i;
if (ep) {
printk(KERN_ERR "iscsi_tcp: invalid ep %p.\n", ep);
return NULL;
}
A shost = iscsi_host_alloc(&iscsi_sht, 0, qdepth);
if (!shost)
return NULL;
B shost->transportt = iscsi_tcp_scsi_transport;
shost->max_lun = iscsi_max_lun;
Please note the scsi host is allocated by invoking isccsi_host_alloc()
in
libiscsi.c
Polluting the middle level blank_transport_template in
iscsi_host_alloc() of
libiscsi.c
The iscsi_host_alloc() invokes the middle level function
scsi_host_alloc() in
hosts.c for allocating a scsi_host. Then the statement marked with C
assigns
the iscsi_eh_cmd_timed_out handler to the eh_timed_out callback
function.
struct Scsi_Host *iscsi_host_alloc(struct scsi_host_template *sht,
int dd_data_size, uint16_t qdepth)
{
struct Scsi_Host *shost;
struct iscsi_host *ihost;
shost = scsi_host_alloc(sht, sizeof(struct iscsi_host) +
dd_data_size);
if (!shost)
return NULL;
C shost->transportt->eh_timed_out = iscsi_eh_cmd_timed_out;
Please note the shost->transport is the middle level
blank_transport_template
as shown in the code segment below. We see two problems here. 1.
iscsi_eh_cmd_timed_out is installed to the blank_transport_template that
will
cause some body else problem. 2. iscsi_eh_cmd_timed_out will never be
invoked
when iscsi command gets timeout because the statement B resets the
pointer.
Middle level blank_transport_template
In the middle level function scsi_host_alloc() of hosts.c, the middle
level
assigns a blank_transport_template for those hosts not implementing its
transport layer. All HBAs without supporting a specific scsi_transport
will
share the middle level blank_transport_template. Please see the
statement D
struct Scsi_Host *scsi_host_alloc(struct scsi_host_template *sht, int
privsize)
{
struct Scsi_Host *shost;
gfp_t gfp_mask = GFP_KERNEL;
int rval;
if (sht->unchecked_isa_dma && privsize)
gfp_mask |= __GFP_DMA;
shost = kzalloc(sizeof(struct Scsi_Host) + privsize, gfp_mask);
if (!shost)
return NULL;
shost->host_lock = &shost->default_lock;
spin_lock_init(shost->host_lock);
shost->shost_state = SHOST_CREATED;
INIT_LIST_HEAD(&shost->__devices);
INIT_LIST_HEAD(&shost->__targets);
INIT_LIST_HEAD(&shost->eh_cmd_q);
INIT_LIST_HEAD(&shost->starved_list);
init_waitqueue_head(&shost->host_wait);
mutex_init(&shost->scan_mutex);
shost->host_no = scsi_host_next_hn++; /* XXX(hch): still racy */
shost->dma_channel = 0xff;
/* These three are default values which can be overridden */
shost->max_channel = 0;
shost->max_id = 8;
shost->max_lun = 8;
/* Give each shost a default transportt */
D shost->transportt = &blank_transport_template;
Why we see panic at iscsi_eh_cmd_timed_out()
The mpp virtual HBA doesn’t have a specific scsi_transport. Therefore,
the
blank_transport_template will be assigned to the virtual host of the MPP
virtual HBA by SCSI middle level. Please note that the statement C has
assigned
iscsi-transport eh_timedout handler to the blank_transport_template.
When a mpp
virtual command gets timedout, the iscsi_eh_cmd_timed_out() will be
invoked to
handle mpp virtual command timeout from the middle level
scsi_times_out()
function of the scsi_error.c.
enum blk_eh_timer_return scsi_times_out(struct request *req)
{
struct scsi_cmnd *scmd = req->special;
enum blk_eh_timer_return (*eh_timed_out)(struct scsi_cmnd *);
enum blk_eh_timer_return rtn = BLK_EH_NOT_HANDLED;
scsi_log_completion(scmd, TIMEOUT_ERROR);
if (scmd->device->host->transportt->eh_timed_out)
E eh_timed_out =
scmd->device->host->transportt->eh_timed_out;
else if (scmd->device->host->hostt->eh_timed_out)
eh_timed_out = scmd->device->host->hostt->eh_timed_out;
else
eh_timed_out = NULL;
if (eh_timed_out) {
rtn = eh_timed_out(scmd);
It is very easy to understand why we get panic in the
iscsi_eh_cmd_timed_out().
A scsi_cmnd from a no-iscsi device definitely can not resolve out a
session and
session->lock. The panic can be happed anywhere during the differencing.
static enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd
*scmd)
{
struct iscsi_cls_session *cls_session;
struct iscsi_session *session;
struct iscsi_conn *conn;
enum blk_eh_timer_return rc = BLK_EH_NOT_HANDLED;
cls_session = starget_to_session(scsi_target(scmd->device));
session = cls_session->dd_data;
debug_scsi("scsi cmd %p timedout\n", scmd);
spin_lock(&session->lock);
This patch fixes the problem by moving the setting of the
iscsi_eh_cmd_timed_out to iscsi_add_host, which is after the LLDs
have set their transport template to shost->transportt.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
I am not sure what happened. It looks like we have always leaked
the q->queue that is allocated from the kfifo_init call. nab finally
noticed that we were leaking and this patch fixes it by adding a
kfree call to iscsi_pool_free. kfifo_free is not used per kfifo_init's
instructions to use kfree.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Mgmt setup used to not fail so we did not have to check
the return value. Now with cxgb3i it can so this has us
pass up a error.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
We do not need to allocate a itt for data_out, so this
passes the opcode to the alloc_pdu callout.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
bnx2i and cxgb3i need to encode LLD info in the itt so that
the firmware/hardware can process the pdu. This patch allows
the LLDs to encode info in the task->hdr->itt that they
setup in the alloc_pdu callout (any resources that are allocated
can be freed with the pdu in the cleanup_task callout). If
the LLD encodes info in the itt they should implement a
parse_pdu_itt callout. If parse_pdu_itt is not implemented
libiscsi will do the right thing for the LLD.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This modifies the login buffer allocation to use __get_free_pages.
It will allow drivers that want to send this data with zero copy
operations to easily line things up on page boundaries.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
cxgb3i offloads data transfers. It does not offload the entire scsi/iscsi
procssing like qla4xxx and it does not offload the iscsi sequence
processing like how bnx2i does. cxgb3i relies on iscsi_tcp for the
seqeunce handling so this changes how we transfer unsolicitied data by
adding a common r2t struct and helpers.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This regression was added in 2.6.27, when the mtask and ctask were
merged into the the common task struct. The patch applies to
scsi-rc-fixes, but also applies to 2.6.27 with some offsets.
The problem is that __iscsi_conn_send_pdu assumes that userspace was
not sending nops with the format it is checking for in the "if" below.
It turns out that older userspace tools are. This patch moves the
setting of the internal ping_task tracker (it tracks libiscsi current
outstanding nop) to iscsi_send_nopout which is only used by kernel callers.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
We must be using the bh spin locking functions in
iscsi_eh_device_reset becuase the session lock interacts with
a thread and softirq.
This patch also fixes up a bogus comment and check in fail_command,
because no one drops the lock (bnx2i did but it is not going
upstream yet and there were other refcount changes for that).
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Some wires got crossed on some patches and I messed up in the code
below when rebuilding a patch. We want to be checking if flag
equaled the value indicating if we killing the session due to
final logout or if we just trying to relogin.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
I had this in my patchset to add target reset support, but
it got dropped due to patching conflicts. This initial patch
just renames the function and users. We are actually just
dropping the session, and so this does not have anything to do
with the host exactly. It does for software iscsi because
we allocate a host per session, but for cxgb3i this makes no
sense.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If the driver knows when hardware is removed like with cxgb3i,
bnx2i, qla4xxx and iser then we will want to remove the sessions/devices
that are bound to that device before removing the host.
cxgb3i and in the future bnx2i will remove the host and that will
remove all the sessions on the hba. iser can call iscsi_kill_session
when it gets an event that indicates that a hca is removed.
And when qla4xxx is hooked in to the lib (it is only hooked into
the class right now) it can call iscsi remove host like the
partial offload card drivers.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
iscsi_tcp was updating the exp_statsn (exp_statsn acknowledges
status and tells the target it is ok to let the resources for
a iscsi pdu to be reused) before it got all the data for pdu read
into OS buffers. Data corruption was occuring if something happens
to a packet and the network layer requests a retransmit, and the
initiator has told the target about the udpated exp_statsn ack,
then the target may be sending data from a buffer it has reused
for a new iscsi pdu. This fixes the problem by having the LLD
(iscsi_tcp in this case) just handle the transferring of data, and
has libiscsi handle the processing of status (libiscsi completion
processing is done after LLD data transfers are complete).
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch converts the iscsi drivers to the new host byte values.
v2
Drop some conversions. Want to avoid conflicts with other patches.
v1
initial patch.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
For the conditions below we do not want the queuecommand
function to call us right back, so return SCSI_MLQUEUE_TARGET_BUSY.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (37 commits)
[SCSI] zfcp: fix double dbf id usage
[SCSI] zfcp: wait on SCSI work to be finished before proceeding with init dev
[SCSI] zfcp: fix erp list usage without using locks
[SCSI] zfcp: prevent fc_remote_port_delete calls for unregistered rport
[SCSI] zfcp: fix deadlock caused by shared work queue tasks
[SCSI] zfcp: put threshold data in hba trace
[SCSI] zfcp: Simplify zfcp data structures
[SCSI] zfcp: Simplify get_adapter_by_busid
[SCSI] zfcp: remove all typedefs and replace them with standards
[SCSI] zfcp: attach and release SAN nameserver port on demand
[SCSI] zfcp: remove unused references, declarations and flags
[SCSI] zfcp: Update message with input from review
[SCSI] zfcp: add queue_full sysfs attribute
[SCSI] scsi_dh: suppress comparison warning
[SCSI] scsi_dh: add Dell product information into rdac device handler
[SCSI] qla2xxx: remove the unused SCSI_QLOGIC_FC_FIRMWARE option
[SCSI] qla2xxx: fix printk format warnings
[SCSI] qla2xxx: Update version number to 8.02.01-k8.
[SCSI] qla2xxx: Ignore payload reserved-bits during RSCN processing.
[SCSI] qla2xxx: Additional residual-count corrections during UNDERRUN handling.
...
Right now SCSI and others do their own command timeout handling.
Move those bits to the block layer.
Instead of having a timer per command, we try to be a bit more clever
and simply have one per-queue. This avoids the overhead of having to
tear down and setup a timer for each command, so it will result in a lot
less timer fiddling.
Signed-off-by: Mike Anderson <andmike@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Due to patch building error on my side, we are still passing DID_BUS_BUSY
for commands that are running, when we want to return whatever the caller
of fail_all_commands wanted. This replaces the hardcoded error code with
the value that is passed in.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch fixes two bugs that are related.
1. Old tools did not set can_queue/cmds_max. This patch modifies
libiscsi so that when we add the host we catch this and set it
to the default.
2. iscsi_tcp thought that the scsi command that was passed to
the eh functions needed a iscsi_cmd_task allocated for it. It
only needed a mgmt task, and now it does not matter since it
all comes from the same pool and libiscsi handles this for the
drivers. ib_iser had copied iscsi_tcp's code and set can_queue
to its max - 1 to handle this. So this patch removes the max -1,
and just sets it to the max.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The recv lock was defined so the iscsi layer could block
the recv path from processing IO during recovery. It
turns out iser just set a lock to that pointer which was pointless.
We now disconnect the transport connection before doing recovery
so we do not need the recv lock. For iscsi_tcp we still stop
the recv path incase older tools are being used.
This patch also has iscsi_itt_to_ctask user grab the session lock
and has the caller access the task with the lock or get a ref
to it in case the target is broken and sends a tmf success response
then sends data or a response for the command that was supposed to
be affected bty the tmf.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Drivers expect that the cmds_max value they pass to the iscsi layer
is the max scsi commands + mgmt tasks. This patch implements that
and fixes some checks for nr cmd limits.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This adds two new attrs used for creating initiator ports and
binding sessions to hardware.
The session level initiatorname:
Since bnx2i does a scsi_host per host device, we need to add the
iface initiator port settings on the session, so we can create
multiple initiator ports (each with different inames) per device/scsi_host.
The current iname reflects that qla4xxx can have one iname per hba, and we are
allocating a host per session for software. The iname on the host will
remain so we can export and set the hba level qla4xxx setting.
The ifacename attr:
To bind a session to a some peice of hardware in userspace we maintain
some mappings, but during boot or iscsid restart (iscsid contains the user
space part of the driver) we need to be able to figure out which of those
host mappings abstractions maps to certain sessions. This patch adds
a ifacename attr, which userspace can set to id the host side of the
endpoint across pivot_roots and iscsid restarts.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Currently we duplicate the list of sessions, because we were using the
test for if a session was on the host list to indicate if the session
was bound or unbound. We can instead use the target_id and fix up
the class so that drivers like bnx2i do not have to manage the target id
space.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This is the second part of the iscsi task merging, and
all it does it rename iscsi_cmd_task to iscsi_task and
mtask/ctask to just task.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
There is no need to have the mgmt and cmd tasks separate
structs. It used to save a lot of memory when we overprealocated
memory for tasks, but the next patches will set up the
driver so in the future they can use a mempool or some other
common scsi command allocator and common tagging.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch modifies libiscsi, so drivers like bnx2i and iser can execute
a command from queuecommand/send_pdu instead of having to be queued to
be run in a workq.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Currently to get a ctask from the session cmd array, you have to
know to use the itt modifier. To make this easier on LLDs and
so in the future we can easilly kill the session array and use
the host shared map instead, this patch adds a nice wrapper
to strip the itt into a session->cmds index and return a ctask.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
After the stop_conn callback has returned the LLD should not
touch the scsi cmds. iscsi_tcp and libiscsi use the
conn->recv_lock and suspend_rx field to halt recv path
processing, but iser does not have any protection.
This patch modifies iser so that userspace can just
call the ep_disconnect callback, which will halt
all recv IO, before calling the stop_conn callback so
we do not have to worry about the conn->recv_lock and
suspend rx field. iser just needs to stop the send side
from accessing the ib conn.
Fixup to handle when the ep poll fails and ep disconnect
is called from Erez.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This removes the session and conn data_size fields from the iscsi_transport.
Just pass in the value like with host allocation. This patch also makes
it so the LLD iscsi_conn data is allocated with the iscsi_cls_conn.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This finishes the host/session unbinding, by adding some helpers
to add and remove hosts and the session they manage.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
bnx2i allocates a host per netdevice but will use libiscsi,
so this unbinds the session from the host in that code.
This will also be useful for the iser parent device dma settings
fixes.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
max_cmd_len and max_conn are not really used. max_cmd_len is
always 16 and can be set by the LLD. max_conn is always one
since we do not support MCS.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If the ping tmo is longer than the recv tmo then we could miss a window
where we were supposed to check the recv tmo. This happens because
the ping code will set the next timeout for the ping timeout, and if the
ping executes quickly there will be a long chunk of time before the
timer wakes up again.
This patch has the ping processing code kick off a recv
tmo check when getting a nop in response to our ping.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The following patch fixes a bug in the iscsi nop processing.
The target sends iscsi nops to ping the initiator and the
initiator has to send nops to reply and can send nops to
ping the target.
In 2.6.25 we moved the nop processing to the kernel to handle
problems when the userspace daemon is not up, but the target
is pinging us, and to handle when scsi commands timeout, but
the transport may be the cause (we can send a nop to check
the transport). When we added this code we added a bug where
if the transport timer wakes at the exact same time we are supposed to check
for a nop timeout we drop the session instead of checking the transport.
This patch checks if a iscsi ping is outstanding and if the ping has
timed out, to determine if we need to signal a connection problem.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: Stable Tree <stable@kernel.org>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
- prepare the additional bidi_read rlength header.
- access the right scsi_in() and/or scsi_out() side of things.
also for resid.
- Handle BIDI underflow overflow from target
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Reviewed-by: Pete Wyckoff <pw@osc.edu>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Support for extended CDBs in iscsi.
All we need is to check if command spills over 16 bytes then allocate
an iscsi-extended-header for the leftovers.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Reviewed-by: Pete Wyckoff <pw@osc.edu>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The old tools did not set max session cmds. This is a regression.
I removed the check when merging the power of 2 patch.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The session age mask is only 4 bits, but session->age is 32. When
it gets larger then 15 and we try to or the bits some bits get
dropped and the check for session age in iscsi_verify_itt is useless.
The ISCSI_CID_MASK related bits are also useless since cid is always
one.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Some iscsi class messages have the dev_printk prefix and some libiscsi
and iscsi_tcp messages have "iscsi" or the module name as a prefix which
is normally pretty useless when trying to figure out which session
or connection the message is attached to. This patch adds iscsi lib
and class dev_printks so all messages have a common prefix that
can be used to figure out which object printed it.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If we rollover then we could get a next_timeout of zero, so we need
to set the new timer to that value.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This adds a iscsi session state file which exports the session
state for both software and hardware iscsi. It also hooks libiscsi
in.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
__iscsi_complete_pdu() can now become static.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Acked-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Replacing n & (n - 1) for power of 2 check by is_power_of_2(n)
Signed-off-by: vignesh babu <vignesh.babu@wipro.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Older tools will not be setting the tmf time outs since they
did not exists, so set them to a safe default.
And export abort and lu reset timeout values in sysfs.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Convert xmit to iscsi chunks.
from michaelc@cs.wisc.edu:
Bug fixes, more digest integration, sg chaining conversion and other
sg wrapper changes, coding style sync up, and removal of io fields,
like pdu_sent, that are not needed.
Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The driver does not need the host lock in queuecommand so drop it.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If the current ctask is failed early, we legt the conn->ctask pointer
pointing to a invalid task. When the xmit thread would send data for
it, we would then oops.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
If the target requests a logout, then we do not want
to fail commands to scsi-ml right away. This patch just
fails in pending commands for a requeue immediately, and then lets
iscsid handle running commands like normal recovery.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
During root boot and shutdown the target could send us nops.
At this time iscsid cannot be running, so the target will drop
the session and the boot or shutdown will hang.
To handle this and allow us to better control when to check the network
this patch moves the nop handling to the kernel.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
We were using the device delete sysfs file to remove each device
then logout. Now in 2.6.21 this will not work because
the sysfs delete file returns immediately and does not wait for
the device removal to complete. This causes a hang if a cache sync
is needed during shutdown. Before .21, that approach had other
problems, so this patch fixes the shutdown code so that we remove the target
and unbind the session before logging out and shut down the session
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
I thought we may not need the eh mutex during host reset, but that is wrong
with the new shutdown code. When start_session_recovery sets the state to
terminate then drops the session lock. The scsi eh thread could then grab the
session lock see that we are terminating and then return failed to scsi-ml.
scsi-ml's eh then owns the command and will do whatever it wants
with it. But then the iscsi eh thread could grab the session lock
and want to complete the scsi commands that we in the LLD, but
it no longer owns them and kaboom.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
There is not need to block the session during logout. Since
we are going to fail the commands that were blocked just fail them
immediately instead.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
iscsi_pool_init simplified
iscsi_pool_init currently has a lot of duplicate kfree() calls it does
when some allocation fails. This patch simplifies the code a little by
using iscsi_pool_free to tear down the pool in case of an error.
iscsi_pool_init also returns a copy of the item array to the caller.
Not all callers use this array, so we make it optional.
Instead of allocating a second array and return that, allocate just one
array, of twice the size.
Update users of iscsi_pool_{init,free}
This patch drops the (now useless) second argument to
iscsi_pool_free, and updates all callers.
It also removes the ctask->r2ts array, which was never
used anyway. Since the items argument to iscsi_pool_init
is now optional, we can pass NULL instead.
Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
- The default initialization of hdr_max is the minimum -
sizeof(struct iscsi_cmd) - Once this patch goes into iser the default
initialization at libiscsi can be removed.
- This is not yet full support for AHSs at iser end. But it should be easy.
Just allocate more space at iser_desc right after iscsi_hdr. Than
at transmission time use ctask->hdr_len to retrieve the total
size of all iscsi pdu headers. See previous patch at iscsi_tcp.[ch]
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
at libiscsi generic code
- currently code assumes a storage space of pdu header is allocated
at llds ctask and is pointed to by iscsi_cmd_task->hdr. Here I add
a hdr_max field pertaining to that storage, and an hdr_len that
accumulates the current use of the pdu-header.
- Add an iscsi_next_hdr() inline which returns the next free space
to write new Header at. Also iscsi_next_hdr() is used to retrieve
the address at which to write the header-digest.
- Add iscsi_add_hdr(length). What the user do is calls iscsi_next_hdr()
for address of the new header, than calls iscsi_add_hdr(length) with
the size of the new header. iscsi_add_hdr() will check if space is
available and update to the new size. length must be padded according
to standard.
- Add 2 padding inline helpers thanks to Olaf. Current patch does not
use them but Following patches will.
Also moved definition of ISCSI_PAD_LEN to iscsi_proto.h which had
PAD_WORD_LEN that was never used anywhere.
- Let iscsi_prep_scsi_cmd_pdu() signal an Error return since now it is
possible that it will fail.
- I was tired of yet again writing a "this is a digest" comment next to
sizeof(__u32) so I defined a new ISCSI_DIGEST_SIZE. Now I don't need
any comments. Changed all places that used sizeof(__u32) or "4" in
connection to a digest.
iscsi_tcp specific code
- At struct iscsi_tcp_cmd_task allocate maximum space allowed in
standard for all headers following the iscsi_cmd header. and mark
it so in iscsi_tcp_session_create()
- At iscsi_send_cmd_hdr() retrieve the correct headers size and
write header digest at iscsi_next_hdr().
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Olaf Kirch <olaf.kirch@oracle.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
- Check to see that OVERFLOW is not negative indicating
a bug.
- Unify handling of UNDERFLOW and OVERFLOW to the same
code.
- Also handle BIDI_OVERFLOW.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
This patch adds logical unit reset support. This should work for ib_iser,
but I have not finished testing that driver so it is not hooked in yet.
This patch also temporarily reverts the iscsi_tcp r2t write out patch.
That code is completely rewritten in this patchset.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Currently, the iSCSI driver returns the data transfer residual for
data-in commands (e.g. read) but not data-out commands (e.g. write).
This patch makes it return the data transfer residual for both types of
commands.
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
The iscsi eh could be tearing down the session/connection while
the scsi eh is still sending task management functions. If when
we drop the session lock to grab the recv lock, the iscsi eh
tears down the connection we will oops.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
We do not want to send data if we are aborting a task. There is
a check in iscsi_xmit_ctask, but right before calling this we overwrite
the state so we always go right past the test. Sending data causes problems
because when we clean up from a successful abort the LLD assumes that
the task is not running.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
We should not be checking the cmd windown for just handling r2t responses.
And if the window closes in on us, always have scsi-ml requeue the command
from our queuecommand function.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
When we logout we block the session since we are not taking any more
commands, but when we call remove host we want to make sure any
IO that got queued up and blocked gets failed upwards quickly, so
we unblock the session and fail it.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
- remove the unnecessary map_single path.
- convert to use the new accessors for the sg lists and the
parameters.
TODO: use scsi_for_each_sg().
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Acked-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
iSCSI must support software iscsi (iscsi_tcp, iser), hardware iscsi (qla4xxx),
and partial offload (broadcom). To be able to allow each stack or driver
or port (virtual or physical) to be able to log into the same target portal
we use the initiator tuple [[HWADDRESS | NETDEVNAME], INITIATOR_NAME] and
the target tuple [TARGETNAME, CONN_ADDRESS, CONN_PORT] to id a session.
This patch adds the netdev name, which is used by software iscsi when
it binds a session to a netdevice using the SO_BINDTODEVICE sock opt.
It cannot use HWADDRESS because if someone did vlans then the same netdevice
will have the same mac and the initiator,target id will not be unique.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: Roland Dreier <rdreier@cisco.com>
Cc: David C Somayajulu <david.somayajulu@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch allows us to set can_queue and cmds_per_lun from userspace
when we create the session/host. From there we can set it on a per
target basis. The patch fully converts iscsi_tcp, but only hooks
up ib_iser for cmd_per_lun since it currently has a lots of preallocations
based on can_queue.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: Roland Dreier <rdreier@cisco.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The cmdsn allocation and pdu transmit code can race, and we can end
up sending a pdu with cmdsn 10 before a pdu with 5. The target will
then fail the connection/session. This patch fixes the problem by
delaying the cmdsn allocation until we are about to send the pdu.
This also removes the xmitmutex. We were using the connection xmitmutex
during error handling to handle races with mtask and ctask cleanup and
completion. For ctasks we now have nice refcounting and for the mtask,
if we hit the case where the mtask timesout and it is floating
around somewhere in the driver, we end up dropping the session.
And to handle session level cleanup, we use the xmit suspend bit
along with scsi_flush_queue and the session lock to make sure
that the xmit thread is not possibly transmitting a task while
we are trying to kill it.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: Roland Dreier <rdreier@cisco.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
If iscsi_tcp partially sends a header, it would recalculate the
header size and readd the size of the digest (if header digests
are used).This would cause us to send sizeof(digest) extra bytes
when we sent the rest of the header.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The attached patches add sysfs files for the chap settings
to the iscsi transport class, iscsi_tcp and ib_iser. This is
needed for software iscsi because there are times when iscsid
can die and it will need to reread the values it was using.
And it is needed by qla4xxx for basic management opertaions.
This patch does not hook in qla4xxx yet, because I am not sure
the mbx command to use.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: Roland Dreier <rdreier@cisco.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
- Remove shadow of request length from struct iscsi_cmd_task.
- change all users to use scsi_cmnd->request_bufflen directly
(With bidi we will use scsi-ml API to retrieve in/out length)
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: Roland Dreier <rdreier@cisco.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch fixes handling of expected datasn/r2tsn as received from
target. It is done according to: T10 rfc3720 section 3.2.2.3. Data Sequencing.
. unify expected datasn/r2tsn into one counter
. calculate than check expected datasn/r2tsn. On error print a message
and fail the request. (TODO use iscsi retransmits)
. remove the FIXME ;)
. avoid zero length memset
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
For iscsi root boot, software iscsi needs to know what the BIOS/OF
initiator used for the initiator name so this puts it in sysfs
for userspace to be able to pick up.
For hw iscsi, it is nice to see what the card is using.
This patch adds the new param, and hooks in qla4xxx, iscsi_tcp, and ib_iser.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: Roland Dreier <rdreier@cisco.com>
Cc: David C Somayajulu <david.somayajulu@qlogic.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
iscsid and udev need to key off the hw address being
used so add some helpers for iser and iscsi tcp.
Also convert them
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: Roland Dreier <rdreier@cisco.com>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Dave Miller meantioned that the data buffer in a past
sense fixup patch was not gauranteed to be aligned
properly for ia64. This patch has libiscsi use get_unalinged
to make sure. There are a couple more places in the
digest handling we may need to do this, but we are in the middle
of fixing that code for big endien systems so just the sense
access is fixed here.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
This patch renames DEFAULT_MAX_RECV_DATA_SEGMENT_LENGTH to avoid
confusion with the drivers default values (DEFAULT_MAX_RECV_DATA_SEGMENT_LENGTH
is the iscsi RFC specific default).
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
It's possible that we call iscsi_xmitworker after iscsi_conn_release
which causes a oops. This patch flushes the workqueue.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Yanling Qi, noted that when the sense data length of
a check-condition is greater than 0x7f (127), senselen = (data[0] << 8)
| data[1] will become negative. It causes different kinds of panics from
GPF, spin_lock deadlock to spin_lock recursion.
We were also swapping this value on big endien machines.
This patch fixes both issues by using be16_to_cpu().
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Some messages from debug_scsi do not have trailing newlines,
making console messages difficult to read. Fix that.
Signed-off-by: Pete Wyckoff <pw@osc.edu>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
According to the iscsi RFC, we cannot send other requests if
we have sent a logout pdu. This patch enforces this requirement
by blocking the session and suspending the send thread. Userspace
decides if we restart the connection or if we just free everything.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
We have been dropping the pdu. We should just send it to userspace
and let it handle it.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
from bhalevy@gmail.com:
It looks like change 652 to libiscsi.c added some dead code around line
670
if (rc) {
spin_unlock_bh(&conn->session->lock);
goto again;
}
since 5 lines above we goto again if (rc).
It looks like the previous if (rc) should go away if we want to put the
ctask before
breaking out of the while loop with "goto again" (see following patch).
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
If connection creation fails we end up calling list_del
on a invalid struct. This then causes an oops. We are not
acutally using the lists (old MCS code we thought might
be useful elsewhere) so this patch just removes that
code.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
In the normal IO path we should not be calling back
into the LLD since the LLD will have cleaned up the
task before or after calling complete pdu.
For the fail_command path we still need to do this
to force the cleanup.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
If the scsi eh sends a TUR and the session is down we could
return SCSI_ML_HOST_BUSY. scsi eh will ignore this and send
ask us to abort the command and we blindly accesst the
command ptr.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The first burst length is only relevant if immedate data = Yes
or if Initial R2T is No
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
iscsi_tcp calculates padding by using the expected transfer length. This
has the problem where if we have immediate data = no and initial R2T =
yes, and the transfer length ended up needing padding then we send:
1. header
2. padding which should have gone after data
3. data
Besides this bug, we also assume the target will always ask for nice
transfer lengths and the first burst length will always be a nice value.
As far as I can tell form the RFC this is not a requirement. It would be
silly to do this, but if someone did it we will end doing bad things.
Finally the last bug in that bit of code is in our handling of the
recalculation of data digests when we do not send a whole iscsi_buf in
one try. The bug here is that we call crypto_digest_final on a
iscsi_sendpage error, then when we send the rest of the iscsi_buf, we
doiscsi_data_digest_init and this causes the previous data digest to be
lost.
And to make matters worse, some of these bugs are replicated over and
over and over again for immediate data, solicited data and unsolicited
data. So the attached patch made over the iscsi git tree (see
kernel.org/git for details) which I updated today to include the patches
I said I merged, consolidates the sending of data, padding and digests
and calculation of data digests and fixes the above bugs.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
It is possible that a ctask could be completing and getting
cleaned up at the same time, we are finishing up the last
data transfer. This could then result in the data transfer
code using stale or invalid values. This patch adds a refcount
to the ctask. When the count goes to zero then we know the
transmit thread and recv thread or softirq are not touching
it and we can safely release it.
The eh should not need to grab a reference because it only cleans
up a task if it has both the xmit mutex and recv lock (or recv
side suspended).
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
iSCSI RFC states that the first burst length must be smaller than the
max burst length. We currently assume targets will be good, but that may
not be the case, so this patch adds a check.
This patch also moves the unsol data out offset to the lib so the LLDs
do not have to track it.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
We were leaking some strings. This patch just frees them.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Must pass ISCSI_ERR values from the recv path and propogate them
upwards.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
We currently try to allocate a max_recv_data_segment_length
which can be very large (default is 64K), and common uses
are up to 1MB. It is very very difficult to allocte this
much contiguous memory and it turns out we never even use it.
We really only need a couple of pages, so this patch has us
allocates just what we know what we need today.
Later if vendors start adding vendor specific data and
we need to handle large buffers we can do this, but for
the last 4 years we have not seen anyone do this or request
it.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
We are touching the cls_session after we have freed
it. This causes a oops.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
When we enter recovery and flush the running commands
we cannot freee the connection before flushing the commands.
Some commands may have a reference to the connection
that needs to be released before. iscsi_stop was forcing
the term and suspend too early and was causing a oops
in iser, so this patch removes those callbacks all together
and allows the LLD to handle that detail.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Abort handler fixes.
If a connection is dropped and reconnected while an abort is
running then we should assume the recovery code will clean up
the abort. Not doing so causes a oops.
And if a command completes then we get the status for the abort, we do not
need to call into the LLD to cleanup the resources. Doing this causes
and oops in iser because it ends up freeing some resources twice.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
The iscsi tcp code can pluck multiple rt2s from the tasks's r2tqueue
in the xmit code. This can result in the task being queued on the xmit queue
but gettting completed at the same time.
This patch fixes the above bug by making the fifo a list so
we always remove the entry on the list del.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
So the drivers do not use the channel numbers, but some do
use the target numbers. We were just adding some goofy
variable that just increases for the target nr. This is useless
for software iscsi because it is always zero. And for qla4xxx
the target nr is actually the index of the target/session
in its FW or FLASH tables. We needed to expose this to userspace
so apps could access those numbers so this patch just adds the
target nr to the iscsi session creation functions. This way
when qla4xxx's Hw thinks a session is at target nr 4
in its hw, it is exposed as that number in sysfs.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
I do not remember what I was thinking when we added the channel
as a argument to the session create function. It was probably
due to too much cut and paste work from the FC transport class.
The channel is meaningless for iscsi drivers so this patch drops
its usage everywhere in the iscsi related code.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
iscsi_tcp and iser cannot be rmmod from the kernel when sessions
are running because session removal is driven from userspace. For
those modules we get a module reference when a session is
created then drop it when the session is removed.
For qla4xxx, they can jsut remove the sessions from the pci remove
function like normal HW drivers, so this patch moves the module
reference from the transport class functions shared by all
drivers to the libiscsi functions only used be software iscsi
modules.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Reduce duplication in the software iscsi_transport modules by
adding a libiscsi function to handle the common grunt work.
This also has the drivers return specifc -EXXX values for different
errors so userspace can finally handle them in a sane way.
Also just pass the sysfs buffers to the drivers so HW iscsi can
get/set its string values, like targetname, and initiatorname.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
We can race and misset the suspend bit if iscsi_write_space is
called then iscsi_send returns with a failure indicating
there is no space.
To handle this this patch returns a error upwards allowing xmitworker
to decide if we need to try and transmit again. For the no
write space case xmitworker will not retry, and instead
let iscsi_write_space queue it back up if needed (this relies
on the work queue code to properly requeue us if needed).
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
from davidw@netapp.com:
remove task type should return a task on success.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
from davidw@netapp.com:
We must grab the session lock when modifying the running lists.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
If recovery failed or we are in recovery only overwrite the state
if we are going to terminate the session or if we logged back in.
STOP_CONN_SUSPEND and conn_cnt are not used. We only support
a single connection session ATM, so cleanup that code while
we are working around it.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Do not flush queues then block session. This will cause commands
to needlessly swing around on us and remove goofy
recovery_failed field and replace with state value.
And do not start recovery from within the host reset function.
This causeis too many problems becuase open-iscsi was desinged to
call out to userspace then have userpscae decide if we should
go into recovery or kill the session.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
We only use the mtask data buffer for login tasks so we do not
need to preallocate a buffer for every mtask. This saves
8 * 31 KB.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
debugged by Ming and Rohan:
The problem Ming and Rohan debugged was that during a normal session
login, open-iscsi is not incrementing the exp_statsn counter. It was
stuck at zero. From the RFC, it looks like if the login response PDU has
a successful status then we should be incrementing that value. Also from
the RFC, it looks like if when we drop a connection then reconnect, we
should be using the exp_statsn from the old connection in the next
relogin attempt.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
align printk output
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
from patmans@us.ibm.com and michaelc@cs.wisc.edu
Fix bugs when forcing a mgmt task to fail and allow
session recovery to cleanup the session/connection
of any running mgmt tasks. When called during
the in login state.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
There is a lot of code duplcited between iscsi_tcp
and the upcoming iscsi_iser driver. This patch puts
the duplicated code in a lib. There is more code
to move around but this takes care of the
basics. For iscsi_offload if they use the lib we will
probably move some things around. For example in the
queuecommand we will not assume that the LLD wants
to do queue_work, but it is better to handle that
later when we know for sure what iscsi_offload looks
like (we could probably do this for iscsi_iser though to).
Ideally I would like to get the iscsi_transports modules
to a place where all they really have to do is put data
on the wire, but how to do that will hopefully be more clear
when we see other modules like iscsi_offload. Or maybe
iscsi_offload will not use the lib and it will just be
iscsi_iser and iscsi_tcp and maybe the iscsi_tcp_tgt if that
is allowed in mainline.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>