License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier should be applied
to a file was done in a spreadsheet of side-by-side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few thousand files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:

SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0                                               11139

and resulted in the first patch in this series.

If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note", otherwise it was "GPL-2.0". The results of that were:

SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note                         930

and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier                             # files
---------------------------------------------------|------
GPL-2.0 WITH Linux-syscall-note                         270
GPL-2.0+ WITH Linux-syscall-note                        169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)      21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)      17
LGPL-2.1+ WITH Linux-syscall-note                        15
GPL-1.0+ WITH Linux-syscall-note                         14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)      5
LGPL-2.0+ WITH Linux-syscall-note                         4
LGPL-2.1 WITH Linux-syscall-note                          3
((GPL-2.0 WITH Linux-syscall-note) OR MIT)                3
((GPL-2.0 WITH Linux-syscall-note) AND MIT)               1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based in part on an older version of FOSSology, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors; they have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
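
The distinction between source and header files mentioned above can be illustrated with a minimal sketch. This is not the script that was actually used; it only assumes the kernel convention that C source files carry the tag in a `//` comment and C headers in a `/* */` comment. The helper names `has_suffix()` and `spdx_tag_line()` are hypothetical, and other file types (Makefiles, assembly, scripts) are deliberately left unhandled.

```c
#include <stdio.h>
#include <string.h>

/* Return nonzero if path ends with suffix. */
static int has_suffix(const char *path, const char *suffix)
{
	size_t plen = strlen(path);
	size_t slen = strlen(suffix);

	return plen >= slen && strcmp(path + plen - slen, suffix) == 0;
}

/*
 * Write the SPDX tag line for path into buf.
 * Returns 0 on success, -1 if the file type is not handled by this
 * sketch or the buffer is too small.
 */
static int spdx_tag_line(const char *path, const char *license,
			 char *buf, size_t len)
{
	int n;

	if (has_suffix(path, ".c"))
		n = snprintf(buf, len, "// SPDX-License-Identifier: %s", license);
	else if (has_suffix(path, ".h"))
		n = snprintf(buf, len, "/* SPDX-License-Identifier: %s */", license);
	else
		return -1;

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For a `.c` file and the license `GPL-2.0` this yields the same first line that the blamed file below carries.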
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 22:07:57 +08:00
// SPDX-License-Identifier: GPL-2.0
2005-04-17 06:20:36 +08:00
/*
2008-06-11 00:20:58 +08:00
 * zfcp device driver
2005-04-17 06:20:36 +08:00
 *
2008-06-11 00:20:58 +08:00
 * Implementation of FSF commands.
2005-04-17 06:20:36 +08:00
 *
2023-02-22 01:56:00 +08:00
 * Copyright IBM Corp. 2002, 2023
2005-04-17 06:20:36 +08:00
 */

2008-12-25 20:39:53 +08:00
#define KMSG_COMPONENT "zfcp"
#define pr_fmt(fmt) KMSG_COMPONENT ": " fmt

2008-10-16 14:23:39 +08:00
#include <linux/blktrace_api.h>
2019-10-26 00:12:44 +08:00
#include <linux/jiffies.h>
scsi: zfcp: fix request object use-after-free in send path causing seqno errors
With a recent change to our send path for FSF commands we introduced a
possible use-after-free of request-objects, that might further lead to
zfcp crafting bad requests, which the FCP channel correctly complains
about with an error (FSF_PROT_SEQ_NUMB_ERROR). This error is then handled
by an adapter-wide recovery.
The following sequence illustrates the possible use-after-free:
Send Path:

    int zfcp_fsf_open_port(struct zfcp_erp_action *erp_action)
    {
            struct zfcp_fsf_req *req;
            ...
            spin_lock_irq(&qdio->req_q_lock);
    //                     ^^^^^^^^^^^^^^^^
    //                     protects QDIO queue during sending
            ...
            req = zfcp_fsf_req_create(qdio,
                                      FSF_QTCB_OPEN_PORT_WITH_DID,
                                      SBAL_SFLAGS0_TYPE_READ,
                                      qdio->adapter->pool.erp_req);
    //            ^^^^^^^^^^^^^^^^^^^
    //            allocation of the request-object
            ...
            retval = zfcp_fsf_req_send(req);
            ...
            spin_unlock_irq(&qdio->req_q_lock);
            return retval;
    }

    static int zfcp_fsf_req_send(struct zfcp_fsf_req *req)
    {
            struct zfcp_adapter *adapter = req->adapter;
            struct zfcp_qdio *qdio = adapter->qdio;
            ...
            zfcp_reqlist_add(adapter->req_list, req);
    //      ^^^^^^^^^^^^^^^^
    //      add request to our driver-internal hash-table for tracking
    //      (protected by separate lock req_list->lock)
            ...
            if (zfcp_qdio_send(qdio, &req->qdio_req)) {
    //          ^^^^^^^^^^^^^^
    //          hand-off the request to FCP channel;
    //          the request can complete at any point now
                    ...
            }
            /* Don't increase for unsolicited status */
            if (!zfcp_fsf_req_is_status_read_buffer(req))
    //           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    //           possible use-after-free
                    adapter->fsf_req_seq_no++;
    //                       ^^^^^^^^^^^^^^^^
    //                       because of the use-after-free we might
    //                       miss this accounting, and as follow-up
    //                       this results in the FCP channel error
    //                       FSF_PROT_SEQ_NUMB_ERROR
            adapter->req_no++;
            return 0;
    }

    static inline bool
    zfcp_fsf_req_is_status_read_buffer(struct zfcp_fsf_req *req)
    {
            return req->qtcb == NULL;
    //             ^^^^^^^^^
    //             possible use-after-free
    }

Response Path:

    void zfcp_fsf_reqid_check(struct zfcp_qdio *qdio, int sbal_idx)
    {
            ...
            struct zfcp_fsf_req *fsf_req;
            ...
            for (idx = 0; idx < QDIO_MAX_ELEMENTS_PER_BUFFER; idx++) {
                    ...
                    fsf_req = zfcp_reqlist_find_rm(adapter->req_list,
                                                   req_id);
    //                        ^^^^^^^^^^^^^^^^^^^^
    //                        remove request from our driver-internal
    //                        hash-table (lock req_list->lock)
                    ...
                    zfcp_fsf_req_complete(fsf_req);
            }
    }

    static void zfcp_fsf_req_complete(struct zfcp_fsf_req *req)
    {
            ...
            if (likely(req->status & ZFCP_STATUS_FSFREQ_CLEANUP))
                    zfcp_fsf_req_free(req);
    //              ^^^^^^^^^^^^^^^^^
    //              free memory for request-object
            else
                    complete(&req->completion);
    //              ^^^^^^^^
    //              completion notification for code-paths that wait
    //              synchronous for the completion of the request; in
    //              those the memory is freed separately
    }

The result of the use-after-free only affects the send path, and can not
lead to any data corruption. In case we miss the sequence-number
accounting, because the memory was already re-purposed, the next FSF
command will fail with said FCP channel error, and we will recover the
whole adapter. This causes no additional errors, but it slows down
traffic. There is a slight chance of the same thing happening again
recursively after the adapter recovery, but so far this has not been seen.
This was seen under z/VM, where the send path might run on a virtual CPU
that gets scheduled away by z/VM, while the return path might still run,
and so create the necessary timing. Running with KASAN can also slow down
the kernel sufficiently to run into this use-after-free, and then see the
report by KASAN.
To fix this, simply pull the test for the sequence-number accounting in
front of the hand-off to the FCP channel (this information doesn't change
during hand-off), but leave the sequence-number accounting itself where it
is.
To make future regressions of the same kind less likely, add comments to
all closely related code-paths.
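
The shape of the fix described above can be sketched in plain user-space C. This is a simplified model under stated assumptions, not the driver code: `struct demo_req`, `struct demo_adapter`, `demo_hand_off()` and `demo_req_send()` are hypothetical stand-ins, and the hand-off poisons the caller's pointer to model that the request may already have been freed by the response path.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct demo_req {
	void *qtcb;			/* NULL for status read buffers */
};

struct demo_adapter {
	unsigned int fsf_req_seq_no;
	unsigned int req_no;
};

static bool demo_req_is_status_read_buffer(const struct demo_req *req)
{
	return req->qtcb == NULL;
}

/*
 * Stand-in for the hand-off to the FCP channel. Once this succeeds the
 * request may complete and be freed at any time, so the demo clears the
 * caller's pointer to make any later dereference an obvious bug.
 */
static int demo_hand_off(struct demo_req **req)
{
	*req = NULL;
	return 0;
}

static int demo_req_send(struct demo_adapter *adapter, struct demo_req *req)
{
	/* Sample everything needed from req before the hand-off. */
	const bool is_srb = demo_req_is_status_read_buffer(req);

	if (demo_hand_off(&req))
		return -1;

	/* req must not be touched from here on; use the saved value. */
	if (!is_srb)
		adapter->fsf_req_seq_no++;
	adapter->req_no++;
	return 0;
}
```

The accounting stays after the hand-off, as in the description; only the read of the request object moves in front of it.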
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Fixes: f9eca0227600 ("scsi: zfcp: drop duplicate fsf_command from zfcp_fsf_req which is also in QTCB header")
Cc: <stable@vger.kernel.org> #5.0+
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-03 05:02:00 +08:00
#include <linux/types.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch a large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the following.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surroundings. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 16:04:11 +08:00
#include <linux/slab.h>
2009-11-24 23:54:09 +08:00
#include <scsi/fc/fc_els.h>
2005-04-17 06:20:36 +08:00
#include "zfcp_ext.h"
2009-11-24 23:54:08 +08:00
#include "zfcp_fc.h"
2009-08-18 21:43:08 +08:00
#include "zfcp_dbf.h"
2010-02-17 18:18:59 +08:00
#include "zfcp_qdio.h"
2010-02-17 18:18:50 +08:00
#include "zfcp_reqlist.h"
2019-10-26 00:12:44 +08:00
#include "zfcp_diag.h"
2005-04-17 06:20:36 +08:00

2018-11-08 22:44:40 +08:00
/* timeout for FSF requests sent during scsi_eh: abort or FCP TMF */
#define ZFCP_FSF_SCSI_ER_TIMEOUT (10*HZ)
/* timeout for: exchange config/port data outside ERP, or open/close WKA port */
#define ZFCP_FSF_REQUEST_TIMEOUT (60*HZ)

2011-02-23 02:54:44 +08:00
struct kmem_cache *zfcp_fsf_qtcb_cache;

2019-10-01 18:49:49 +08:00
static bool ber_stop = true;
module_param(ber_stop, bool, 0600);
MODULE_PARM_DESC(ber_stop,
		 "Shuts down FCP devices for FCP channels that report a bit-error count in excess of its threshold (default on)");

2017-10-17 07:44:34 +08:00
static void zfcp_fsf_request_timeout_handler(struct timer_list *t)
2008-07-02 16:56:40 +08:00
{
2017-10-17 07:44:34 +08:00
	struct zfcp_fsf_req *fsf_req = from_timer(fsf_req, t, timer);
	struct zfcp_adapter *adapter = fsf_req->adapter;
2017-10-18 00:40:51 +08:00

2010-07-16 21:37:43 +08:00
	zfcp_qdio_siosl(adapter);
2009-03-02 20:09:04 +08:00
	zfcp_erp_adapter_reopen(adapter, ZFCP_STATUS_COMMON_ERP_FAILED,
2010-12-02 22:16:16 +08:00
				"fsrth_1");
2008-07-02 16:56:40 +08:00
}

static void zfcp_fsf_start_timer(struct zfcp_fsf_req *fsf_req,
				 unsigned long timeout)
{
2017-10-23 15:40:42 +08:00
	fsf_req->timer.function = zfcp_fsf_request_timeout_handler;
2008-07-02 16:56:40 +08:00
	fsf_req->timer.expires = jiffies + timeout;
	add_timer(&fsf_req->timer);
}

static void zfcp_fsf_start_erp_timer(struct zfcp_fsf_req *fsf_req)
{
	BUG_ON(!fsf_req->erp_action);
2017-10-23 15:40:42 +08:00
	fsf_req->timer.function = zfcp_erp_timeout_handler;
2008-07-02 16:56:40 +08:00
	fsf_req->timer.expires = jiffies + 30 * HZ;
	add_timer(&fsf_req->timer);
}

2005-04-17 06:20:36 +08:00
/* association between FSF command and FSF QTCB type */
static u32 fsf_qtcb_type[] = {
	[FSF_QTCB_FCP_CMND] = FSF_IO_COMMAND,
	[FSF_QTCB_ABORT_FCP_CMND] = FSF_SUPPORT_COMMAND,
	[FSF_QTCB_OPEN_PORT_WITH_DID] = FSF_SUPPORT_COMMAND,
	[FSF_QTCB_OPEN_LUN] = FSF_SUPPORT_COMMAND,
	[FSF_QTCB_CLOSE_LUN] = FSF_SUPPORT_COMMAND,
	[FSF_QTCB_CLOSE_PORT] = FSF_SUPPORT_COMMAND,
	[FSF_QTCB_CLOSE_PHYSICAL_PORT] = FSF_SUPPORT_COMMAND,
	[FSF_QTCB_SEND_ELS] = FSF_SUPPORT_COMMAND,
	[FSF_QTCB_SEND_GENERIC] = FSF_SUPPORT_COMMAND,
	[FSF_QTCB_EXCHANGE_CONFIG_DATA] = FSF_CONFIG_COMMAND,
	[FSF_QTCB_EXCHANGE_PORT_DATA] = FSF_PORT_COMMAND,
	[FSF_QTCB_DOWNLOAD_CONTROL_FILE] = FSF_SUPPORT_COMMAND,
	[FSF_QTCB_UPLOAD_CONTROL_FILE] = FSF_SUPPORT_COMMAND
};

2008-06-11 00:20:58 +08:00
static void zfcp_fsf_class_not_supp(struct zfcp_fsf_req *req)
{
2008-10-01 18:42:15 +08:00
	dev_err(&req->adapter->ccw_device->dev, "FCP device not "
		"operational because of an unsupported FC class\n");
2010-12-02 22:16:16 +08:00
	zfcp_erp_adapter_shutdown(req->adapter, 0, "fscns_1");
2008-06-11 00:20:58 +08:00
	req->status |= ZFCP_STATUS_FSFREQ_ERROR;
}

2008-07-02 16:56:39 +08:00
/**
 * zfcp_fsf_req_free - free memory used by fsf request
2018-11-08 22:44:54 +08:00
 * @req: pointer to struct zfcp_fsf_req
2005-04-17 06:20:36 +08:00
 */
2008-07-02 16:56:39 +08:00
void zfcp_fsf_req_free(struct zfcp_fsf_req *req)
2005-04-17 06:20:36 +08:00
{
2008-07-02 16:56:39 +08:00
	if (likely(req->pool)) {
2018-11-08 22:44:45 +08:00
		if (likely(!zfcp_fsf_req_is_status_read_buffer(req)))
2009-08-18 21:43:15 +08:00
			mempool_free(req->qtcb, req->adapter->pool.qtcb_pool);
2008-07-02 16:56:39 +08:00
		mempool_free(req, req->pool);
2006-09-19 04:28:49 +08:00
		return;
	}

2018-11-08 22:44:45 +08:00
	if (likely(!zfcp_fsf_req_is_status_read_buffer(req)))
2011-02-23 02:54:44 +08:00
		kmem_cache_free(zfcp_fsf_qtcb_cache, req->qtcb);
2009-08-18 21:43:15 +08:00
	kfree(req);
2005-04-17 06:20:36 +08:00
}

2008-07-02 16:56:39 +08:00
static void zfcp_fsf_status_read_port_closed(struct zfcp_fsf_req *req)
2005-04-17 06:20:36 +08:00
{
2009-11-24 23:53:58 +08:00
	unsigned long flags;
2008-07-02 16:56:39 +08:00
	struct fsf_status_read_buffer *sr_buf = req->data;
	struct zfcp_adapter *adapter = req->adapter;
	struct zfcp_port *port;
2009-11-24 23:54:12 +08:00
	int d_id = ntoh24(sr_buf->d_id);
2005-04-17 06:20:36 +08:00

2009-11-24 23:53:58 +08:00
	read_lock_irqsave(&adapter->port_list_lock, flags);
	list_for_each_entry(port, &adapter->port_list, list)
2008-07-02 16:56:39 +08:00
		if (port->d_id == d_id) {
2010-12-02 22:16:16 +08:00
			zfcp_erp_port_reopen(port, 0, "fssrpc1");
2009-11-24 23:53:58 +08:00
			break;
2008-07-02 16:56:39 +08:00
		}
2009-11-24 23:53:58 +08:00
	read_unlock_irqrestore(&adapter->port_list_lock, flags);
2005-04-17 06:20:36 +08:00
}

scsi: zfcp: Move allocation of the shost object to after xconf- and xport-data
At the moment we allocate and register the Scsi_Host object corresponding
to a zfcp adapter (FCP device) very early in the life cycle of the adapter
- even before we fully discover and initialize the underlying
firmware/hardware. This had the advantage that we could already use the
Scsi_Host object, and fill in all its information during said discover and
initialize.
Due to commit 737eb78e82d5 ("block: Delay default elevator initialization")
(first released in v5.4), we noticed a regression that would prevent us
from using any storage volume if zfcp is configured with support for DIF or
DIX (zfcp.dif=1 || zfcp.dix=1). Doing so would result in an illegal memory
access as soon as the first request is sent with such a configuration. As
an example of a crash resulting from this:
scsi host0: scsi_eh_0: sleeping
scsi host0: zfcp
qdio: 0.0.1900 ZFCP on SC 4bd using AI:1 QEBSM:0 PRI:1 TDD:1 SIGA: W AP
scsi 0:0:0:0: scsi scan: INQUIRY pass 1 length 36
Unable to handle kernel pointer dereference in virtual kernel address space
Failing address: 0000000000000000 TEID: 0000000000000483
Fault in home space mode while using kernel ASCE.
AS:0000000035c7c007 R3:00000001effcc007 S:00000001effd1000 P:000000000000003d
Oops: 0004 ilc:3 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in: ...
CPU: 1 PID: 783 Comm: kworker/u760:5 Kdump: loaded Not tainted 5.6.0-rc2-bb-next+ #1
Hardware name: ...
Workqueue: scsi_wq_0 fc_scsi_scan_rport [scsi_transport_fc]
Krnl PSW : 0704e00180000000 000003ff801fcdae (scsi_queue_rq+0x436/0x740 [scsi_mod])
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
Krnl GPRS: 0fffffffffffffff 0000000000000000 0000000187150120 0000000000000000
000003ff80223d20 000000000000018e 000000018adc6400 0000000187711000
000003e0062337e8 00000001ae719000 0000000187711000 0000000187150000
00000001ab808100 0000000187150120 000003ff801fcd74 000003e0062336a0
Krnl Code: 000003ff801fcd9e: e310a35c0012 lt %r1,860(%r10)
000003ff801fcda4: a7840010 brc 8,000003ff801fcdc4
#000003ff801fcda8: e310b2900004 lg %r1,656(%r11)
>000003ff801fcdae: d71710001000 xc 0(24,%r1),0(%r1)
000003ff801fcdb4: e310b2900004 lg %r1,656(%r11)
000003ff801fcdba: 41201018 la %r2,24(%r1)
000003ff801fcdbe: e32010000024 stg %r2,0(%r1)
000003ff801fcdc4: b904002b lgr %r2,%r11
Call Trace:
[<000003ff801fcdae>] scsi_queue_rq+0x436/0x740 [scsi_mod]
([<000003ff801fcd74>] scsi_queue_rq+0x3fc/0x740 [scsi_mod])
[<00000000349c9970>] blk_mq_dispatch_rq_list+0x390/0x680
[<00000000349d1596>] blk_mq_sched_dispatch_requests+0x196/0x1a8
[<00000000349c7a04>] __blk_mq_run_hw_queue+0x144/0x160
[<00000000349c7ab6>] __blk_mq_delay_run_hw_queue+0x96/0x228
[<00000000349c7d5a>] blk_mq_run_hw_queue+0xd2/0xe0
[<00000000349d194a>] blk_mq_sched_insert_request+0x192/0x1d8
[<00000000349c17b8>] blk_execute_rq_nowait+0x80/0x90
[<00000000349c1856>] blk_execute_rq+0x6e/0xb0
[<000003ff801f8ac2>] __scsi_execute+0xe2/0x1f0 [scsi_mod]
[<000003ff801fef98>] scsi_probe_and_add_lun+0x358/0x840 [scsi_mod]
[<000003ff8020001c>] __scsi_scan_target+0xc4/0x228 [scsi_mod]
[<000003ff80200254>] scsi_scan_target+0xd4/0x100 [scsi_mod]
[<000003ff802d8b96>] fc_scsi_scan_rport+0x96/0xc0 [scsi_transport_fc]
[<0000000034245ce8>] process_one_work+0x458/0x7d0
[<00000000342462a2>] worker_thread+0x242/0x448
[<0000000034250994>] kthread+0x15c/0x170
[<0000000034e1979c>] ret_from_fork+0x30/0x38
INFO: lockdep is turned off.
Last Breaking-Event-Address:
[<000003ff801fbc36>] scsi_add_cmd_to_list+0x9e/0xa8 [scsi_mod]
Kernel panic - not syncing: Fatal exception: panic_on_oops
While this issue is exposed by the commit named above, this is only by
accident. The real issue exists for longer already - basically since it's
possible to use blk-mq via scsi-mq, and blk-mq pre-allocates all requests
for a tag-set during initialization of the same. For a given Scsi_Host
object this is done when adding the object to the midlayer
(`scsi_add_host()` and such). In `scsi_mq_setup_tags()` the midlayer
calculates how much memory is required for a single scsi_cmnd, and its
additional data, which also might include space for additional protection
data - depending on whether the Scsi_Host has any form of protection
capabilities (`scsi_host_get_prot()`).
The problem is now thus, because zfcp does this step before we actually
know whether the firmware/hardware has these capabilities, we don't set any
protection capabilities in the Scsi_Host object. And so, no space is
allocated for additional protection data for requests in the Scsi_Host
tag-set.
Once we go through discover and initialize the FCP device firmware/hardware
fully (this is done via the firmware commands "Exchange Config Data" and
"Exchange Port Data") we find out whether it actually supports DIF and DIX,
and we set the corresponding capabilities in the Scsi_Host object (in
`zfcp_scsi_set_prot()`). Now the Scsi_Host potentially has protection
capabilities, but the already allocated requests in the tag-set don't have
any space allocated for that.
When we then trigger target scanning or add scsi_devices manually, the
midlayer will use requests from that tag-set, and before sending most
requests, it will also call `scsi_mq_prep_fn()`. To prepare the scsi_cmnd
this function will check again whether the used Scsi_Host has any
protection capabilities - and now it potentially has - and if so, it will
try to initialize the assumed to be preallocated structures and thus it
causes the crash, like shown above.
Before delaying the default elevator initialization with the commit named
above, we always would also allocate an elevator for any scsi_device before
ever sending any requests - in contrast to now, where we do it after
device-probing. That elevator in turn would have its own tag-set, and that
is initialized after we went through discovery and initialization of the
underlying firmware/hardware. So requests from that tag-set can be
allocated properly, and if used - unless the user changes/disables the
default elevator - this would hide the underlying issue.
To fix this for any configuration - with or without an elevator - we move
the allocation and registration of the Scsi_Host object for a given FCP
device to after the first complete discovery and initialization of the
underlying firmware/hardware. By doing that we can make all basic
properties of the Scsi_Host known to the midlayer by the time we call
`scsi_add_host()`, including whether we have any protection capabilities.
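
The ordering problem and its fix can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct demo_host`, `demo_add_host()` and `demo_prep_is_safe()` are hypothetical names, and the two size constants are made up; the model only captures that the per-request size is fixed when the host is added, while the protection check happens again later at request preparation.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_CMD_SIZE	64	/* hypothetical base size of one command */
#define DEMO_PROT_SIZE	24	/* hypothetical size of protection data */

struct demo_host {
	bool prot_capable;		/* models "host has protection capabilities" */
	size_t per_request_size;	/* fixed once, when the host is added */
};

/* Models adding the host: all requests are pre-allocated with this size. */
static void demo_add_host(struct demo_host *h)
{
	h->per_request_size = DEMO_CMD_SIZE +
			      (h->prot_capable ? DEMO_PROT_SIZE : 0);
}

/*
 * Models request preparation: the capabilities are checked again, and
 * the protection area is initialized if they are set. Returns true if
 * that area was actually part of the pre-allocated request.
 */
static bool demo_prep_is_safe(const struct demo_host *h)
{
	size_t needed = DEMO_CMD_SIZE +
			(h->prot_capable ? DEMO_PROT_SIZE : 0);

	return h->per_request_size >= needed;
}
```

Setting `prot_capable` after `demo_add_host()` models the old order and leaves the requests too small; setting it before models the order this patch establishes.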
To do that we have to delay all the accesses that we would have done in the
past during discovery and initialization, and do them instead once we are
finished with it. The previous patches ramp up to this by fencing and
factoring out all these accesses, and make it possible to re-do them later
on. In addition we also make use of the diagnostic buffers we recently
added with
commit 92953c6e0aa7 ("scsi: zfcp: signal incomplete or error for sync exchange config/port data")
commit 7e418833e689 ("scsi: zfcp: diagnostics buffer caching and use for exchange port data")
commit 088210233e6f ("scsi: zfcp: add diagnostics buffer for exchange config data")
(first released in v5.5), because these already cache all the information
we need for that "re-do operation" - the cached information is always
updated during xconf or xport data, so it won't be stale.
In addition to the move and re-do, this patch also updates the
function-documentation of `zfcp_scsi_adapter_register()` and changes how it
reports if a Scsi_Host object already exists. In that case future
recovery-operations can skip this step completely and behave much like they
would do in the past - zfcp does not release a once allocated Scsi_Host
object unless the corresponding FCP device is deconstructed completely.
Link: https://lore.kernel.org/r/030dd6da318bbb529f0b5268ec65cebcd20fc0a3.1588956679.git.bblock@linux.ibm.com
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-05-09 01:23:35 +08:00
void zfcp_fsf_fc_host_link_down(struct zfcp_adapter *adapter)
scsi: zfcp: fix fc_host attributes that should be unknown on local link down
When we get an unsolicited notification that the local link went down,
zfcp_fsf_status_read_link_down() calls zfcp_fsf_link_down_info_eval().
This only blocks rports, and sets ZFCP_STATUS_ADAPTER_LINK_UNPLUGGED and
ZFCP_STATUS_COMMON_ERP_FAILED. Only the fc_host port_state changes to
"Linkdown", because zfcp_scsi_get_host_port_state() is an active callback
and uses the adapter status.
Other fc_host attributes model, port_id, port_type, speed, fabric_name (and
zfcp device attributes card_version, peer_wwpn, peer_wwnn, peer_d_id) which
depend on a local link, continued to show their last known "good" value.
Only if something triggered an exchange config data, some values were
updated to their unknown equivalent via case
FSF_EXCHANGE_CONFIG_DATA_INCOMPLETE due to local link down. Triggers for
exchange config data are adapter recovery, or reading any of the following
zfcp-specific scsi host sysfs attributes "requests", "megabytes", or
"seconds_active" in /sys/devices/css*/*.*.*/*.*.*/host*/scsi_host/host*/.
The other fc_host attributes active_fc4s and permanent_port_name continued
to show their last known "good" value. Only if something triggered an
exchange port data, some values changed. Active_fc4s became all zeros as
unknown equivalent during link down. Permanent_port_name does not depend
on a local link. But for non-NPIV FCP devices, permanent_port_name
erroneously became whatever value fc_host port_name had at that point in
time (see previous paragraph). Triggers for exchange port data are the
zfcp-specific scsi host sysfs attribute "utilization", or
[{reset,get}_fc_host_stats] write anything into "reset_statistics" or read
any of the other attributes under
/sys/devices/css*/*.*.*/*.*.*/host*/fc_host/host*/statistics/.
(cf. v4.9 commit bd77befa5bcf ("zfcp: fix fc_host port_type with NPIV"))
This is particularly confusing when using "lszfcp -b <fcpdevbusid> -Ha" or
dbginfo.sh which read fc_host attributes and also scsi_host attributes.
After link down, the first invocation produces (abbreviated):
Class = "fc_host"
active_fc4s = "0x00 0x00 0x01 0x00 ..."
...
fabric_name = "0x10000027f8e04c49"
...
permanent_port_name = "0xc05076e4588059c1"
port_id = "0x244800"
port_state = "Linkdown"
port_type = "NPort (fabric via point-to-point)"
...
speed = "16 Gbit"
Class = "scsi_host"
...
megabytes = "0 0"
...
requests = "0 0 0"
seconds_active = "37"
...
utilization = "0 0 0"
The second and next invocations produce (abbreviated):
Class = "fc_host"
active_fc4s = "0x00 0x00 0x00 0x00 ..."
...
fabric_name = "0x0"
...
permanent_port_name = "0x0"
port_id = "0x000000"
port_state = "Linkdown"
port_type = "Unknown"
...
speed = "unknown"
Class = "scsi_host"
...
megabytes = "0 0"
...
requests = "0 0 0"
seconds_active = "38"
...
utilization = "0 0 0"
Factor out the resetting of local link dependent fc_host attributes from
zfcp_fsf_exchange_config_data_handler() case
FSF_EXCHANGE_CONFIG_DATA_INCOMPLETE into a new helper function
zfcp_fsf_fc_host_link_down(). All code places that detect local link down
(SRB, FSF_PROT_LINK_DOWN, xconf data/port incomplete) call
zfcp_fsf_link_down_info_eval(). Call the new helper from there. This works
because zfcp_fsf_link_down_info_eval() and thus the helper is called before
zfcp_fsf_exchange_{config,port}_evaluate().
Port_name and node_name are always valid, so never reset them.
Get the permanent_port_name from exchange port data unconditionally as it
always has a valid known good value, even during link down.
Note: Rather than hardcode in zfcp_fsf_exchange_config_evaluate(), fc_host
supported_classes could theoretically get its value from
fsf_qtcb_bottom_port.class_of_service in zfcp_fsf_exchange_port_evaluate().
When the link comes back, we get a different notification, perform adapter
recovery, and this triggers an implicit exchange config data followed by
exchange port data filling in the link dependent fc_host attributes with
known good values again.
Link: https://lore.kernel.org/r/20200312174505.51294-5-maier@linux.ibm.com
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
{
	struct Scsi_Host *shost = adapter->scsi_host;

	adapter->hydra_version = 0;
	adapter->peer_wwpn = 0;
	adapter->peer_wwnn = 0;
	adapter->peer_d_id = 0;

	/* if there is no shost yet, we have nothing to zero-out */
	if (shost == NULL)
		return;

	fc_host_port_id(shost) = 0;
	fc_host_fabric_name(shost) = 0;
	fc_host_speed(shost) = FC_PORTSPEED_UNKNOWN;
	fc_host_port_type(shost) = FC_PORTTYPE_UNKNOWN;
	snprintf(fc_host_model(shost), FC_SYMBOLIC_NAME_SIZE, "0x%04x", 0);
	memset(fc_host_active_fc4s(shost), 0, FC_FC4_LIST_SIZE);
}
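
As a rough illustration of the reset performed by the helper that ends above, the following user-space sketch zeroes only the link-dependent attributes and leaves the port name alone, as the commit message requires. The `sketch_host` struct, its field sizes, and every `SKETCH_*` name are assumptions made for this example; they are not the kernel's `fc_host` accessors.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MODEL_SIZE	16	/* assumed size, illustration only */
#define SKETCH_FC4_LIST_SIZE	32	/* assumed size, illustration only */

enum sketch_port_type {
	SKETCH_PORTTYPE_UNKNOWN = 0,
	SKETCH_PORTTYPE_NPORT,
};

/* Hypothetical stand-in for the attributes the helper touches. */
struct sketch_host {
	uint64_t port_name;		/* always valid, never reset */
	uint64_t fabric_name;		/* link dependent */
	uint32_t port_id;		/* link dependent */
	enum sketch_port_type port_type;	/* link dependent */
	char model[SKETCH_MODEL_SIZE];
	uint8_t active_fc4s[SKETCH_FC4_LIST_SIZE];
};

/* Reset only the link-dependent attributes to their "unknown" values. */
static void sketch_host_link_down(struct sketch_host *h)
{
	assert(h != NULL);

	h->port_id = 0;
	h->fabric_name = 0;
	h->port_type = SKETCH_PORTTYPE_UNKNOWN;
	snprintf(h->model, sizeof(h->model), "0x%04x", 0);
	memset(h->active_fc4s, 0, sizeof(h->active_fc4s));
}
```

After `sketch_host_link_down()` the model string reads "0x0000" and the FC-4 list is all zeros, while `port_name` keeps whatever value it had before the link went down.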

static void zfcp_fsf_link_down_info_eval(struct zfcp_fsf_req *req,
					 struct fsf_link_down_info *link_down)
{
	struct zfcp_adapter *adapter = req->adapter;

	if (atomic_read(&adapter->status) & ZFCP_STATUS_ADAPTER_LINK_UNPLUGGED)
		return;

	atomic_or(ZFCP_STATUS_ADAPTER_LINK_UNPLUGGED, &adapter->status);

	zfcp_scsi_schedule_rports_block(adapter);

	zfcp_fsf_fc_host_link_down(adapter);

	if (!link_down)
		goto out;

	switch (link_down->error_code) {
	case FSF_PSQ_LINK_NO_LIGHT:
		dev_warn(&req->adapter->ccw_device->dev,
			 "There is no light signal from the local "
			 "fibre channel cable\n");
		break;
	case FSF_PSQ_LINK_WRAP_PLUG:
		dev_warn(&req->adapter->ccw_device->dev,
			 "There is a wrap plug instead of a fibre "
			 "channel cable\n");
		break;
	case FSF_PSQ_LINK_NO_FCP:
		dev_warn(&req->adapter->ccw_device->dev,
			 "The adjacent fibre channel node does not "
			 "support FCP\n");
		break;
	case FSF_PSQ_LINK_FIRMWARE_UPDATE:
		dev_warn(&req->adapter->ccw_device->dev,
			 "The FCP device is suspended because of a "
			 "firmware update\n");
		break;
	case FSF_PSQ_LINK_INVALID_WWPN:
		dev_warn(&req->adapter->ccw_device->dev,
			 "The FCP device detected a WWPN that is "
			 "duplicate or not valid\n");
		break;
	case FSF_PSQ_LINK_NO_NPIV_SUPPORT:
		dev_warn(&req->adapter->ccw_device->dev,
			 "The fibre channel fabric does not support NPIV\n");
		break;
	case FSF_PSQ_LINK_NO_FCP_RESOURCES:
		dev_warn(&req->adapter->ccw_device->dev,
			 "The FCP adapter cannot support more NPIV ports\n");
		break;
	case FSF_PSQ_LINK_NO_FABRIC_RESOURCES:
		dev_warn(&req->adapter->ccw_device->dev,
			 "The adjacent switch cannot support "
			 "more NPIV ports\n");
		break;
	case FSF_PSQ_LINK_FABRIC_LOGIN_UNABLE:
		dev_warn(&req->adapter->ccw_device->dev,
			 "The FCP adapter could not log in to the "
			 "fibre channel fabric\n");
		break;
	case FSF_PSQ_LINK_WWPN_ASSIGNMENT_CORRUPTED:
		dev_warn(&req->adapter->ccw_device->dev,
			 "The WWPN assignment file on the FCP adapter "
			 "has been damaged\n");
		break;
	case FSF_PSQ_LINK_MODE_TABLE_CURRUPTED:
		dev_warn(&req->adapter->ccw_device->dev,
			 "The mode table on the FCP adapter "
			 "has been damaged\n");
		break;
	case FSF_PSQ_LINK_NO_WWPN_ASSIGNMENT:
		dev_warn(&req->adapter->ccw_device->dev,
			 "All NPIV ports on the FCP adapter have "
			 "been assigned\n");
		break;
	default:
		dev_warn(&req->adapter->ccw_device->dev,
			 "The link between the FCP adapter and "
			 "the FC fabric is down\n");
	}
out:
	zfcp_erp_set_adapter_status(adapter, ZFCP_STATUS_COMMON_ERP_FAILED);
}
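
The guard at the top of zfcp_fsf_link_down_info_eval() makes repeated link-down reports harmless: once ZFCP_STATUS_ADAPTER_LINK_UNPLUGGED is set, later calls return before blocking rports or resetting attributes again. A minimal user-space sketch of that "first report wins" idea, written with C11 atomics rather than the kernel's atomic_read()/atomic_or(), could look like the following. Note that it swaps in a single fetch-or, which merges the check and the set, whereas the driver code performs them as two separate steps. All `sketch_*` and `SKETCH_*` names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_LINK_UNPLUGGED 0x1u	/* hypothetical status bit */

/*
 * Set the "link unplugged" bit and report whether this call was the one
 * that flipped it. Only that caller should run the link-down work.
 */
static bool sketch_mark_link_down(atomic_uint *status)
{
	unsigned int old;

	assert(status != NULL);
	old = atomic_fetch_or(status, SKETCH_LINK_UNPLUGGED);
	return !(old & SKETCH_LINK_UNPLUGGED);
}
```

A caller would run its link-down handling only when `sketch_mark_link_down()` returns true and would clear the bit again once the link is restored.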

static void zfcp_fsf_status_read_link_down(struct zfcp_fsf_req *req)
{
	struct fsf_status_read_buffer *sr_buf = req->data;
	struct fsf_link_down_info *ldi =
		(struct fsf_link_down_info *) &sr_buf->payload;

	switch (sr_buf->status_subtype) {
	case FSF_STATUS_READ_SUB_NO_PHYSICAL_LINK:
	case FSF_STATUS_READ_SUB_FDISC_FAILED:
		zfcp_fsf_link_down_info_eval(req, ldi);
		break;
	case FSF_STATUS_READ_SUB_FIRMWARE_UPDATE:
		zfcp_fsf_link_down_info_eval(req, NULL);
	}
}

static void
zfcp_fsf_status_read_version_change(struct zfcp_adapter *adapter,
				    struct fsf_status_read_buffer *sr_buf)
{
	if (sr_buf->status_subtype == FSF_STATUS_READ_SUB_LIC_CHANGE) {
		u32 version = sr_buf->payload.version_change.current_version;

		WRITE_ONCE(adapter->fsf_lic_version, version);
		snprintf(fc_host_firmware_version(adapter->scsi_host),
			 FC_VERSION_STRING_SIZE, "%#08x", version);
	}
}
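
The "%#08x" conversion used above is easy to misread: in standard C the field width of 8 counts the "0x" prefix, so only six hex digits are zero-padded. The sketch below shows this with the hosted C library's snprintf(); it is an illustration only, `sketch_format_version` is a hypothetical name, and it makes no claim about corner cases of the kernel's own printf implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a 32-bit version word with "%#08x" and return the number of
 * characters snprintf() reports for the full result.
 */
static int sketch_format_version(char *buf, size_t len, uint32_t version)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%#08x", (unsigned int)version);
}
```

For example, 0x1234 is rendered as "0x001234" (eight characters in total), while 0x12345678 needs ten characters and is printed in full as "0x12345678".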

static void zfcp_fsf_status_read_handler(struct zfcp_fsf_req *req)
{
	struct zfcp_adapter *adapter = req->adapter;
	struct fsf_status_read_buffer *sr_buf = req->data;

	if (req->status & ZFCP_STATUS_FSFREQ_DISMISSED) {
		zfcp_dbf_hba_fsf_uss("fssrh_1", req);
		mempool_free(virt_to_page(sr_buf), adapter->pool.sr_data);
		zfcp_fsf_req_free(req);
		return;
	}

	zfcp_dbf_hba_fsf_uss("fssrh_4", req);

	switch (sr_buf->status_type) {
	case FSF_STATUS_READ_PORT_CLOSED:
		zfcp_fsf_status_read_port_closed(req);
		break;
	case FSF_STATUS_READ_INCOMING_ELS:
		zfcp_fc_incoming_els(req);
		break;
	case FSF_STATUS_READ_SENSE_DATA_AVAIL:
		break;
	case FSF_STATUS_READ_BIT_ERROR_THRESHOLD:
		zfcp_dbf_hba_bit_err("fssrh_3", req);
		if (ber_stop) {
			dev_warn(&adapter->ccw_device->dev,
				 "All paths over this FCP device are disused because of excessive bit errors\n");
			zfcp_erp_adapter_shutdown(adapter, 0, "fssrh_b");
		} else {
			dev_warn(&adapter->ccw_device->dev,
				 "The error threshold for checksum statistics has been exceeded\n");
		}
		break;
	case FSF_STATUS_READ_LINK_DOWN:
		zfcp_fsf_status_read_link_down(req);
		zfcp_fc_enqueue_event(adapter, FCH_EVT_LINKDOWN, 0);
		break;
	case FSF_STATUS_READ_LINK_UP:
		dev_info(&adapter->ccw_device->dev,
			 "The local link has been restored\n");
		/* All ports should be marked as ready to run again */
		zfcp_erp_set_adapter_status(adapter,
					    ZFCP_STATUS_COMMON_RUNNING);
		zfcp_erp_adapter_reopen(adapter,
					ZFCP_STATUS_ADAPTER_LINK_UNPLUGGED |
					ZFCP_STATUS_COMMON_ERP_FAILED,
					"fssrh_2");
		zfcp_fc_enqueue_event(adapter, FCH_EVT_LINKUP, 0);

		break;
	case FSF_STATUS_READ_NOTIFICATION_LOST:
		if (sr_buf->status_subtype & FSF_STATUS_READ_SUB_INCOMING_ELS)
			zfcp_fc_conditional_port_scan(adapter);
		if (sr_buf->status_subtype & FSF_STATUS_READ_SUB_VERSION_CHANGE)
			queue_work(adapter->work_queue,
				   &adapter->version_change_lost_work);
		break;
	case FSF_STATUS_READ_FEATURE_UPDATE_ALERT:
		adapter->adapter_features = sr_buf->payload.word[0];
		break;
	case FSF_STATUS_READ_VERSION_CHANGE:
		zfcp_fsf_status_read_version_change(adapter, sr_buf);
		break;
	}

	mempool_free(virt_to_page(sr_buf), adapter->pool.sr_data);
	zfcp_fsf_req_free(req);

	atomic_inc(&adapter->stat_miss);
	queue_work(adapter->work_queue, &adapter->stat_work);
}

static void zfcp_fsf_fsfstatus_qual_eval(struct zfcp_fsf_req *req)
{
	switch (req->qtcb->header.fsf_status_qual.word[0]) {
	case FSF_SQ_FCP_RSP_AVAILABLE:
	case FSF_SQ_INVOKE_LINK_TEST_PROCEDURE:
	case FSF_SQ_NO_RETRY_POSSIBLE:
	case FSF_SQ_ULP_DEPENDENT_ERP_REQUIRED:
		return;
	case FSF_SQ_COMMAND_ABORTED:
		break;
	case FSF_SQ_NO_RECOM:
		dev_err(&req->adapter->ccw_device->dev,
			"The FCP adapter reported a problem "
			"that cannot be recovered\n");
		zfcp_qdio_siosl(req->adapter);
		zfcp_erp_adapter_shutdown(req->adapter, 0, "fsfsqe1");
		break;
	}
	/* all non-return stats set FSFREQ_ERROR*/
	req->status |= ZFCP_STATUS_FSFREQ_ERROR;
}
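
The control flow of zfcp_fsf_fsfstatus_qual_eval() relies on one convention: benign qualifiers leave via `return`, and every path that falls out of the `switch` (including values with no `case` at all) reaches the single statement that sets the error flag. A stripped-down sketch of that shape, with hypothetical `sketch_*` types and constants that are not part of the driver, is shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_REQ_ERROR 0x1u	/* hypothetical error flag */

enum sketch_qual {
	SKETCH_QUAL_RSP_AVAILABLE,
	SKETCH_QUAL_NO_RETRY,
	SKETCH_QUAL_ABORTED,
	SKETCH_QUAL_UNRECOVERABLE,
};

struct sketch_status_req {
	enum sketch_qual qual;
	uint32_t status;
};

static void sketch_qual_eval(struct sketch_status_req *req)
{
	assert(req != NULL);

	switch (req->qual) {
	case SKETCH_QUAL_RSP_AVAILABLE:
	case SKETCH_QUAL_NO_RETRY:
		return;		/* benign: status stays untouched */
	case SKETCH_QUAL_ABORTED:
		break;		/* error without extra handling */
	case SKETCH_QUAL_UNRECOVERABLE:
		/* a real driver would start recovery or shutdown here */
		break;
	}
	/* every path that did not return is treated as an error */
	req->status |= SKETCH_REQ_ERROR;
}
```

Keeping the flag update in one place after the `switch` means a newly added qualifier is treated as an error by default unless it is explicitly listed among the returning cases.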

static void zfcp_fsf_fsfstatus_eval(struct zfcp_fsf_req *req)
{
	if (unlikely(req->status & ZFCP_STATUS_FSFREQ_ERROR))
		return;

	switch (req->qtcb->header.fsf_status) {
	case FSF_UNKNOWN_COMMAND:
		dev_err(&req->adapter->ccw_device->dev,
			"The FCP adapter does not recognize the command 0x%x\n",
			req->qtcb->header.fsf_command);
		zfcp_erp_adapter_shutdown(req->adapter, 0, "fsfse_1");
		req->status |= ZFCP_STATUS_FSFREQ_ERROR;
		break;
	case FSF_ADAPTER_STATUS_AVAILABLE:
		zfcp_fsf_fsfstatus_qual_eval(req);
		break;
	}
}

static void zfcp_fsf_protstatus_eval(struct zfcp_fsf_req *req)
{
	struct zfcp_adapter *adapter = req->adapter;
	struct fsf_qtcb *qtcb = req->qtcb;
	union fsf_prot_status_qual *psq = &qtcb->prefix.prot_status_qual;

	zfcp_dbf_hba_fsf_response(req);

	if (req->status & ZFCP_STATUS_FSFREQ_DISMISSED) {
		req->status |= ZFCP_STATUS_FSFREQ_ERROR;
		return;
	}

	switch (qtcb->prefix.prot_status) {
	case FSF_PROT_GOOD:
	case FSF_PROT_FSF_STATUS_PRESENTED:
		return;
	case FSF_PROT_QTCB_VERSION_ERROR:
		dev_err(&adapter->ccw_device->dev,
			"QTCB version 0x%x not supported by FCP adapter "
			"(0x%x to 0x%x)\n", FSF_QTCB_CURRENT_VERSION,
			psq->word[0], psq->word[1]);
		zfcp_erp_adapter_shutdown(adapter, 0, "fspse_1");
		break;
	case FSF_PROT_ERROR_STATE:
	case FSF_PROT_SEQ_NUMB_ERROR:
		zfcp_erp_adapter_reopen(adapter, 0, "fspse_2");
		req->status |= ZFCP_STATUS_FSFREQ_ERROR;
		break;
	case FSF_PROT_UNSUPP_QTCB_TYPE:
		dev_err(&adapter->ccw_device->dev,
			"The QTCB type is not supported by the FCP adapter\n");
		zfcp_erp_adapter_shutdown(adapter, 0, "fspse_3");
		break;
	case FSF_PROT_HOST_CONNECTION_INITIALIZING:
		atomic_or(ZFCP_STATUS_ADAPTER_HOST_CON_INIT,
				&adapter->status);
		break;
	case FSF_PROT_DUPLICATE_REQUEST_ID:
		dev_err(&adapter->ccw_device->dev,
			"0x%Lx is an ambiguous request identifier\n",
			(unsigned long long)qtcb->bottom.support.req_handle);
		zfcp_erp_adapter_shutdown(adapter, 0, "fspse_4");
		break;
	case FSF_PROT_LINK_DOWN:
		zfcp_fsf_link_down_info_eval(req, &psq->link_down_info);
		/* go through reopen to flush pending requests */
		zfcp_erp_adapter_reopen(adapter, 0, "fspse_6");
		break;
	case FSF_PROT_REEST_QUEUE:
		/* All ports should be marked as ready to run again */
		zfcp_erp_set_adapter_status(adapter,
					    ZFCP_STATUS_COMMON_RUNNING);
		zfcp_erp_adapter_reopen(adapter,
					ZFCP_STATUS_ADAPTER_LINK_UNPLUGGED |
					ZFCP_STATUS_COMMON_ERP_FAILED,
					"fspse_8");
		break;
	default:
		dev_err(&adapter->ccw_device->dev,
			"0x%x is not a valid transfer protocol status\n",
			qtcb->prefix.prot_status);
		zfcp_qdio_siosl(adapter);
		zfcp_erp_adapter_shutdown(adapter, 0, "fspse_9");
	}
	req->status |= ZFCP_STATUS_FSFREQ_ERROR;
}

/**
 * zfcp_fsf_req_complete - process completion of a FSF request
 * @req: The FSF request that has been completed.
 *
 * When a request has been completed either from the FCP adapter,
 * or it has been dismissed due to a queue shutdown, this function
 * is called to process the completion status and trigger further
 * events related to the FSF request.
 * Caller must ensure that the request has been removed from
 * adapter->req_list, to protect against concurrent modification
 * by zfcp_erp_strategy_check_fsfreq().
 */
static void zfcp_fsf_req_complete(struct zfcp_fsf_req *req)
{
	struct zfcp_erp_action *erp_action;

	if (unlikely(zfcp_fsf_req_is_status_read_buffer(req))) {
		zfcp_fsf_status_read_handler(req);
		return;
	}

scsi: zfcp: Fix use-after-free in request timeout handlers
Before v4.15 commit 75492a51568b ("s390/scsi: Convert timers to use
timer_setup()"), we intentionally only passed zfcp_adapter as context
argument to zfcp_fsf_request_timeout_handler(). Since we only trigger
adapter recovery, it was unnecessary to sync against races between timeout
and (late) completion. Likewise, we only passed zfcp_erp_action as context
argument to zfcp_erp_timeout_handler(). Since we only wakeup an ERP action,
it was unnecessary to sync against races between timeout and (late)
completion.
Meanwhile the timeout handlers get timer_list as context argument and do a
timer-specific container-of to zfcp_fsf_req which can have been freed.
Fix it by making sure that any request timeout handlers, that might just
have started before del_timer(), are completed by using del_timer_sync()
instead. This ensures the request free happens afterwards.
Space time diagram of potential use-after-free:
Basic idea is to have 2 or more pending requests whose timeouts run out at
almost the same time.
req 1 timeout     ERP thread        req 2 timeout
----------------  ----------------  ---------------------------------------
zfcp_fsf_request_timeout_handler
fsf_req = from_timer(fsf_req, t, timer)
adapter = fsf_req->adapter
zfcp_qdio_siosl(adapter)
zfcp_erp_adapter_reopen(adapter,...)
                  zfcp_erp_strategy
                  ...
                  zfcp_fsf_req_dismiss_all
                  list_for_each_entry_safe
                  zfcp_fsf_req_complete 1
                  del_timer 1
                  zfcp_fsf_req_free 1
                  zfcp_fsf_req_complete 2
                                    zfcp_fsf_request_timeout_handler
                  del_timer 2
                                    fsf_req = from_timer(fsf_req, t, timer)
                  zfcp_fsf_req_free 2
                                    adapter = fsf_req->adapter
                                              ^^^^^^^ already freed
Link: https://lore.kernel.org/r/20200813152856.50088-1-maier@linux.ibm.com
Fixes: 75492a51568b ("s390/scsi: Convert timers to use timer_setup()")
Cc: <stable@vger.kernel.org> #4.15+
Suggested-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
	del_timer_sync(&req->timer);
	zfcp_fsf_protstatus_eval(req);
	zfcp_fsf_fsfstatus_eval(req);
	req->handler(req);

	erp_action = req->erp_action;
	if (erp_action)
		zfcp_erp_notify(erp_action, 0);

	if (likely(req->status & ZFCP_STATUS_FSFREQ_CLEANUP))
		zfcp_fsf_req_free(req);
	else
		complete(&req->completion);
}

/**
 * zfcp_fsf_req_dismiss_all - dismiss all fsf requests
 * @adapter: pointer to struct zfcp_adapter
 *
 * Never ever call this without shutting down the adapter first.
 * Otherwise the adapter would continue using and corrupting s390 storage.
 * Included BUG_ON() call to ensure this is done.
 * ERP is supposed to be the only user of this function.
 */
void zfcp_fsf_req_dismiss_all(struct zfcp_adapter *adapter)
{
	struct zfcp_fsf_req *req, *tmp;
	LIST_HEAD(remove_queue);

	BUG_ON(atomic_read(&adapter->status) & ZFCP_STATUS_ADAPTER_QDIOUP);
	zfcp_reqlist_move(adapter->req_list, &remove_queue);

	list_for_each_entry_safe(req, tmp, &remove_queue, list) {
		list_del(&req->list);
		req->status |= ZFCP_STATUS_FSFREQ_DISMISSED;
		zfcp_fsf_req_complete(req);
	}
}
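
zfcp_fsf_req_dismiss_all() first moves all pending requests onto a private list and only then walks that list, saving the next pointer before each request is completed and possibly freed. The sketch below shows the same "detach, then drain" pattern in plain C. It assumes a simple singly linked list instead of the kernel's `struct list_head`, does no locking, and every `sketch_*` name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_REQ_DISMISSED 0x1u	/* hypothetical status flag */

struct sketch_pending {
	struct sketch_pending *next;
	unsigned int status;
};

struct sketch_pending_list {
	struct sketch_pending *head;
};

/* Detach every pending request from the shared list in one step. */
static struct sketch_pending *
sketch_list_take_all(struct sketch_pending_list *shared)
{
	struct sketch_pending *head;

	assert(shared != NULL);
	head = shared->head;
	shared->head = NULL;
	return head;
}

/* Drain the private chain; returns how many requests were dismissed. */
static size_t sketch_dismiss_all(struct sketch_pending_list *shared)
{
	struct sketch_pending *req = sketch_list_take_all(shared);
	size_t count = 0;

	while (req) {
		/* save the successor first: completion may free req */
		struct sketch_pending *next = req->next;

		req->next = NULL;
		req->status |= SKETCH_REQ_DISMISSED;
		count++;
		req = next;
	}
	return count;
}
```

Because the shared list is already empty when the drain loop starts, nothing that looks at the shared list during the loop can find a request that is in the middle of being completed.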

#define ZFCP_FSF_PORTSPEED_1GBIT	(1 << 0)
#define ZFCP_FSF_PORTSPEED_2GBIT	(1 << 1)
#define ZFCP_FSF_PORTSPEED_4GBIT	(1 << 2)
#define ZFCP_FSF_PORTSPEED_10GBIT	(1 << 3)
#define ZFCP_FSF_PORTSPEED_8GBIT	(1 << 4)
#define ZFCP_FSF_PORTSPEED_16GBIT	(1 << 5)
#define ZFCP_FSF_PORTSPEED_32GBIT	(1 << 6)
#define ZFCP_FSF_PORTSPEED_64GBIT	(1 << 7)
#define ZFCP_FSF_PORTSPEED_128GBIT	(1 << 8)
#define ZFCP_FSF_PORTSPEED_NOT_NEGOTIATED (1 << 15)

u32 zfcp_fsf_convert_portspeed(u32 fsf_speed)
{
	u32 fdmi_speed = 0;
	if (fsf_speed & ZFCP_FSF_PORTSPEED_1GBIT)
		fdmi_speed |= FC_PORTSPEED_1GBIT;
	if (fsf_speed & ZFCP_FSF_PORTSPEED_2GBIT)
		fdmi_speed |= FC_PORTSPEED_2GBIT;
	if (fsf_speed & ZFCP_FSF_PORTSPEED_4GBIT)
		fdmi_speed |= FC_PORTSPEED_4GBIT;
	if (fsf_speed & ZFCP_FSF_PORTSPEED_10GBIT)
		fdmi_speed |= FC_PORTSPEED_10GBIT;
	if (fsf_speed & ZFCP_FSF_PORTSPEED_8GBIT)
		fdmi_speed |= FC_PORTSPEED_8GBIT;
	if (fsf_speed & ZFCP_FSF_PORTSPEED_16GBIT)
		fdmi_speed |= FC_PORTSPEED_16GBIT;
	if (fsf_speed & ZFCP_FSF_PORTSPEED_32GBIT)
		fdmi_speed |= FC_PORTSPEED_32GBIT;
	if (fsf_speed & ZFCP_FSF_PORTSPEED_64GBIT)
		fdmi_speed |= FC_PORTSPEED_64GBIT;
	if (fsf_speed & ZFCP_FSF_PORTSPEED_128GBIT)
		fdmi_speed |= FC_PORTSPEED_128GBIT;
	if (fsf_speed & ZFCP_FSF_PORTSPEED_NOT_NEGOTIATED)
		fdmi_speed |= FC_PORTSPEED_NOT_NEGOTIATED;
	return fdmi_speed;
}
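
The bit-by-bit translation in zfcp_fsf_convert_portspeed() can also be pictured as a lookup table. The following is only a userspace sketch of that idea, not the driver's code: the SRC_SPEED_* and DST_SPEED_* names and all of their values are hypothetical placeholders and do not claim to match ZFCP_FSF_PORTSPEED_* or FC_PORTSPEED_*.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical source-side speed bits (placeholders, not the real FSF values). */
#define SRC_SPEED_1GBIT   (1u << 0)
#define SRC_SPEED_2GBIT   (1u << 1)
#define SRC_SPEED_4GBIT   (1u << 2)
#define SRC_SPEED_10GBIT  (1u << 3)
#define SRC_SPEED_8GBIT   (1u << 4)

/* Hypothetical destination-side speed bits (placeholders, not FC_PORTSPEED_*). */
#define DST_SPEED_1GBIT   0x01u
#define DST_SPEED_2GBIT   0x02u
#define DST_SPEED_10GBIT  0x04u
#define DST_SPEED_4GBIT   0x08u
#define DST_SPEED_8GBIT   0x10u

struct speed_map {
	uint32_t src;	/* bit as reported by the source interface */
	uint32_t dst;	/* bit expected by the destination interface */
};

static const struct speed_map speed_table[] = {
	{ SRC_SPEED_1GBIT,  DST_SPEED_1GBIT  },
	{ SRC_SPEED_2GBIT,  DST_SPEED_2GBIT  },
	{ SRC_SPEED_4GBIT,  DST_SPEED_4GBIT  },
	{ SRC_SPEED_10GBIT, DST_SPEED_10GBIT },
	{ SRC_SPEED_8GBIT,  DST_SPEED_8GBIT  },
};

/* Translate every known source bit; unknown source bits are silently dropped. */
uint32_t convert_speed(uint32_t src_speed)
{
	uint32_t out = 0;
	size_t i;

	for (i = 0; i < sizeof(speed_table) / sizeof(speed_table[0]); i++)
		if (src_speed & speed_table[i].src)
			out |= speed_table[i].dst;
	return out;
}
```

Like the if-chain above, the sketch handles several speed bits set at once, because each table row is tested independently of the others.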

static int zfcp_fsf_exchange_config_evaluate(struct zfcp_fsf_req *req)
{
	struct fsf_qtcb_bottom_config *bottom = &req->qtcb->bottom.config;
	struct zfcp_adapter *adapter = req->adapter;
	struct fc_els_flogi *plogi;

	/* adjust pointers for missing command code */
	plogi = (struct fc_els_flogi *) ((u8 *)&bottom->plogi_payload
					- sizeof(u32));

	if (req->data)
		memcpy(req->data, bottom, sizeof(*bottom));

	adapter->timer_ticks = bottom->timer_interval & ZFCP_FSF_TIMER_INT_MASK;
	adapter->stat_read_buf_num = max(bottom->status_read_buf_num,
					 (u16)FSF_STATUS_READS_RECOM);

	/* no error return above here, otherwise must fix call chains */
	/* do not evaluate invalid fields */
	if (req->qtcb->header.fsf_status == FSF_EXCHANGE_CONFIG_DATA_INCOMPLETE)
		return 0;

	adapter->hydra_version = bottom->adapter_type;

	switch (bottom->fc_topology) {
	case FSF_TOPO_P2P:
		adapter->peer_d_id = ntoh24(bottom->peer_d_id);
scsi: zfcp: use endianness conversions with common FC(P) struct fields
Just to silence sparse. Since zfcp only exists for s390 and
s390 is big endian, this has been working correctly without conversions
and all the new conversions are NOPs so no performance impact.
Nonetheless, use the conversion on the constant expression where possible.
NB: N_Port-IDs have always been handled with hton24 or ntoh24 conversions
because they also convert to / from character array.
Affected common code structs and .fields are:
HOT I/O PATH:
fcp_cmnd .fc_dl
FCP command: regular SCSI I/O, including DIX case
SEMI-HOT I/O PATH:
fcp_cmnd .fc_dl
recovery FCP command: task management function (LUN / target reset)
fcp_resp_ext
FCP response having FCP_SNS_LEN_VAL with .fr_rsp_len .fr_sns_len
FCP response having FCP_RESID_UNDER with .fr_resid
RECOVERY / DISCOVERY PATHS:
fc_ct_hdr .ct_cmd .ct_mr_size
zfcp auto port scan [GPN_FT] with fc_gpn_ft_resp.fp_wwpn,
recovery for returned port [GID_PN] with fc_ns_gid_pn.fn_wwpn,
get symbolic port name [GSPN],
register symbolic port name [RSPN] (NPIV only).
fc_els_rscn .rscn_plen
incoming ELS (RSCN).
fc_els_flogi .fl_wwpn .fl_wwnn
incoming ELS (PLOGI),
port open response with .fl_csp.sp_bb_data .fl_cssp[0..3].cp_class,
FCP channel physical port,
point-to-point peer (P2P only).
fc_els_logo .fl_n_port_wwn
incoming ELS (LOGO).
fc_els_adisc .adisc_wwnn .adisc_wwpn
path test after RSCN for gone target port.
Since v4.10 commit 05de97003c77 ("linux/types.h: enable endian checks for
all sparse builds"), below sparse endianness reports appear by default.
Previously, one needed to pass argument CF="-D__CHECK_ENDIAN__" to make
as in: $ make C=1 CF="-D__CHECK_ENDIAN__" M=drivers/s390/scsi.
Silenced sparse warnings and one error:
$ make C=1 M=drivers/s390/scsi
...
CHECK drivers/s390/scsi/zfcp_dbf.c
drivers/s390/scsi/zfcp_dbf.c:463:22: warning: restricted __be16 degrades to integer
drivers/s390/scsi/zfcp_dbf.c:476:28: warning: restricted __be16 degrades to integer
CC drivers/s390/scsi/zfcp_dbf.o
...
CHECK drivers/s390/scsi/zfcp_fc.c
drivers/s390/scsi/zfcp_fc.c:263:26: warning: restricted __be16 degrades to integer
drivers/s390/scsi/zfcp_fc.c:299:41: warning: incorrect type in argument 2 (different base types)
drivers/s390/scsi/zfcp_fc.c:299:41: expected unsigned long long [unsigned] [usertype] wwpn
drivers/s390/scsi/zfcp_fc.c:299:41: got restricted __be64 [usertype] fl_wwpn
drivers/s390/scsi/zfcp_fc.c:309:40: warning: incorrect type in argument 2 (different base types)
drivers/s390/scsi/zfcp_fc.c:309:40: expected unsigned long long [unsigned] [usertype] wwpn
drivers/s390/scsi/zfcp_fc.c:309:40: got restricted __be64 [usertype] fl_n_port_wwn
drivers/s390/scsi/zfcp_fc.c:338:31: warning: restricted __be16 degrades to integer
drivers/s390/scsi/zfcp_fc.c:355:24: warning: incorrect type in assignment (different base types)
drivers/s390/scsi/zfcp_fc.c:355:24: expected restricted __be16 [usertype] ct_cmd
drivers/s390/scsi/zfcp_fc.c:355:24: got unsigned short [unsigned] [usertype] cmd
drivers/s390/scsi/zfcp_fc.c:356:28: warning: incorrect type in assignment (different base types)
drivers/s390/scsi/zfcp_fc.c:356:28: expected restricted __be16 [usertype] ct_mr_size
drivers/s390/scsi/zfcp_fc.c:356:28: got int
drivers/s390/scsi/zfcp_fc.c:379:36: warning: incorrect type in assignment (different base types)
drivers/s390/scsi/zfcp_fc.c:379:36: expected restricted __be64 [usertype] fn_wwpn
drivers/s390/scsi/zfcp_fc.c:379:36: got unsigned long long [unsigned] [usertype] wwpn
drivers/s390/scsi/zfcp_fc.c:463:18: warning: restricted __be64 degrades to integer
drivers/s390/scsi/zfcp_fc.c:465:17: warning: cast from restricted __be64
drivers/s390/scsi/zfcp_fc.c:473:20: warning: incorrect type in assignment (different base types)
drivers/s390/scsi/zfcp_fc.c:473:20: expected unsigned long long [unsigned] [usertype] wwnn
drivers/s390/scsi/zfcp_fc.c:473:20: got restricted __be64 [usertype] fl_wwnn
drivers/s390/scsi/zfcp_fc.c:474:29: warning: incorrect type in assignment (different base types)
drivers/s390/scsi/zfcp_fc.c:474:29: expected unsigned int [unsigned] [usertype] maxframe_size
drivers/s390/scsi/zfcp_fc.c:474:29: got restricted __be16 [usertype] sp_bb_data
drivers/s390/scsi/zfcp_fc.c:476:30: warning: restricted __be16 degrades to integer
drivers/s390/scsi/zfcp_fc.c:478:30: warning: restricted __be16 degrades to integer
drivers/s390/scsi/zfcp_fc.c:480:30: warning: restricted __be16 degrades to integer
drivers/s390/scsi/zfcp_fc.c:482:30: warning: restricted __be16 degrades to integer
drivers/s390/scsi/zfcp_fc.c:500:28: warning: incorrect type in assignment (different base types)
drivers/s390/scsi/zfcp_fc.c:500:28: expected unsigned long long [unsigned] [usertype] wwnn
drivers/s390/scsi/zfcp_fc.c:500:28: got restricted __be64 [usertype] adisc_wwnn
drivers/s390/scsi/zfcp_fc.c:502:38: warning: restricted __be64 degrades to integer
drivers/s390/scsi/zfcp_fc.c:541:40: warning: incorrect type in assignment (different base types)
drivers/s390/scsi/zfcp_fc.c:541:40: expected restricted __be64 [usertype] adisc_wwpn
drivers/s390/scsi/zfcp_fc.c:541:40: got unsigned long long [unsigned] [usertype] port_name
drivers/s390/scsi/zfcp_fc.c:542:40: warning: incorrect type in assignment (different base types)
drivers/s390/scsi/zfcp_fc.c:542:40: expected restricted __be64 [usertype] adisc_wwnn
drivers/s390/scsi/zfcp_fc.c:542:40: got unsigned long long [unsigned] [usertype] node_name
drivers/s390/scsi/zfcp_fc.c:669:16: warning: restricted __be16 degrades to integer
drivers/s390/scsi/zfcp_fc.c:696:24: warning: restricted __be64 degrades to integer
drivers/s390/scsi/zfcp_fc.c:699:54: warning: incorrect type in argument 2 (different base types)
drivers/s390/scsi/zfcp_fc.c:699:54: expected unsigned long long [unsigned] [usertype] <noident>
drivers/s390/scsi/zfcp_fc.c:699:54: got restricted __be64 [usertype] fp_wwpn
CC drivers/s390/scsi/zfcp_fc.o
CHECK drivers/s390/scsi/zfcp_fsf.c
drivers/s390/scsi/zfcp_fsf.c:479:34: warning: incorrect type in assignment (different base types)
drivers/s390/scsi/zfcp_fsf.c:479:34: expected unsigned long long [unsigned] [usertype] port_name
drivers/s390/scsi/zfcp_fsf.c:479:34: got restricted __be64 [usertype] fl_wwpn
drivers/s390/scsi/zfcp_fsf.c:480:34: warning: incorrect type in assignment (different base types)
drivers/s390/scsi/zfcp_fsf.c:480:34: expected unsigned long long [unsigned] [usertype] node_name
drivers/s390/scsi/zfcp_fsf.c:480:34: got restricted __be64 [usertype] fl_wwnn
drivers/s390/scsi/zfcp_fsf.c:506:36: warning: incorrect type in assignment (different base types)
drivers/s390/scsi/zfcp_fsf.c:506:36: expected unsigned long long [unsigned] [usertype] peer_wwpn
drivers/s390/scsi/zfcp_fsf.c:506:36: got restricted __be64 [usertype] fl_wwpn
drivers/s390/scsi/zfcp_fsf.c:507:36: warning: incorrect type in assignment (different base types)
drivers/s390/scsi/zfcp_fsf.c:507:36: expected unsigned long long [unsigned] [usertype] peer_wwnn
drivers/s390/scsi/zfcp_fsf.c:507:36: got restricted __be64 [usertype] fl_wwnn
drivers/s390/scsi/zfcp_fc.h:269:46: warning: restricted __be32 degrades to integer
drivers/s390/scsi/zfcp_fc.h:270:29: error: incompatible types in comparison expression (different base types)
Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Benjamin Block <bblock@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
		adapter->peer_wwpn = be64_to_cpu(plogi->fl_wwpn);
		adapter->peer_wwnn = be64_to_cpu(plogi->fl_wwnn);
		break;
	case FSF_TOPO_FABRIC:
		break;
	case FSF_TOPO_AL:
	default:
		dev_err(&adapter->ccw_device->dev,
			"Unknown or unsupported arbitrated loop "
			"fibre channel topology detected\n");
		zfcp_erp_adapter_shutdown(adapter, 0, "fsece_1");
		return -EIO;
	}

	return 0;
}
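
The P2P case above reads the peer WWPN and WWNN through be64_to_cpu(). On big-endian s390 that conversion is a no-op, but what it stands for can be shown with a small userspace sketch that decodes eight big-endian bytes into a host integer regardless of host byte order. The helper name load_be64() is hypothetical and is not a kernel API.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace helper: interpret 8 bytes in big-endian (wire)
 * order as a host-order 64-bit value. Works on any host byte order because
 * it only uses shifts, never a cast of the byte buffer.
 */
uint64_t load_be64(const unsigned char *p)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v = (v << 8) | p[i];	/* most significant byte comes first */
	return v;
}
```

Decoding with shifts avoids alignment and aliasing concerns that a pointer cast would raise in portable code; the kernel instead relies on typed __be64 fields, which sparse can check.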

static void zfcp_fsf_exchange_config_data_handler(struct zfcp_fsf_req *req)
{
	struct zfcp_adapter *adapter = req->adapter;
	struct zfcp_diag_header *const diag_hdr =
		&adapter->diagnostics->config_data.header;
	struct fsf_qtcb *qtcb = req->qtcb;
	struct fsf_qtcb_bottom_config *bottom = &qtcb->bottom.config;

	if (req->status & ZFCP_STATUS_FSFREQ_ERROR)
		return;

	adapter->fsf_lic_version = bottom->lic_version;
	adapter->adapter_features = bottom->adapter_features;
	adapter->connection_features = bottom->connection_features;
	adapter->peer_wwpn = 0;
	adapter->peer_wwnn = 0;
	adapter->peer_d_id = 0;

	switch (qtcb->header.fsf_status) {
	case FSF_GOOD:
		/*
		 * usually we wait with an update till the cache is too old,
		 * but because we have the data available, update it anyway
		 */
		zfcp_diag_update_xdata(diag_hdr, bottom, false);

		zfcp_scsi_shost_update_config_data(adapter, bottom, false);
		if (zfcp_fsf_exchange_config_evaluate(req))
			return;

		if (bottom->max_qtcb_size < sizeof(struct fsf_qtcb)) {
			dev_err(&adapter->ccw_device->dev,
				"FCP adapter maximum QTCB size (%d bytes) "
				"is too small\n",
				bottom->max_qtcb_size);
			zfcp_erp_adapter_shutdown(adapter, 0, "fsecdh1");
			return;
		}
		atomic_or(ZFCP_STATUS_ADAPTER_XCONFIG_OK,
			  &adapter->status);
		break;
	case FSF_EXCHANGE_CONFIG_DATA_INCOMPLETE:
		zfcp_diag_update_xdata(diag_hdr, bottom, true);
		req->status |= ZFCP_STATUS_FSFREQ_XDATAINCOMPLETE;

		/* avoids adapter shutdown to be able to recognize
		 * events such as LINK UP */
		atomic_or(ZFCP_STATUS_ADAPTER_XCONFIG_OK,
			  &adapter->status);
		zfcp_fsf_link_down_info_eval(req,
			&qtcb->header.fsf_status_qual.link_down_info);

		zfcp_scsi_shost_update_config_data(adapter, bottom, true);
		if (zfcp_fsf_exchange_config_evaluate(req))
			return;
		break;
	default:
		zfcp_erp_adapter_shutdown(adapter, 0, "fsecdh3");
		return;
	}

	if (adapter->adapter_features & FSF_FEATURE_HBAAPI_MANAGEMENT)
		adapter->hardware_version = bottom->hardware_version;

	if (FSF_QTCB_CURRENT_VERSION < bottom->low_qtcb_version) {
		dev_err(&adapter->ccw_device->dev,
			"The FCP adapter only supports newer "
			"control block versions\n");
		zfcp_erp_adapter_shutdown(adapter, 0, "fsecdh4");
		return;
	}
	if (FSF_QTCB_CURRENT_VERSION > bottom->high_qtcb_version) {
		dev_err(&adapter->ccw_device->dev,
			"The FCP adapter only supports older "
			"control block versions\n");
		zfcp_erp_adapter_shutdown(adapter, 0, "fsecdh5");
	}
}

/*
 * Mapping of FC Endpoint Security flag masks to mnemonics
 *
 * NOTE: Update macro ZFCP_FSF_MAX_FC_SECURITY_MNEMONIC_LENGTH when making any
 *       changes.
 */
static const struct {
	u32	mask;
	char	*name;
} zfcp_fsf_fc_security_mnemonics[] = {
	{ FSF_FC_SECURITY_AUTH,		"Authentication" },
	{ FSF_FC_SECURITY_ENC_FCSP2 |
	  FSF_FC_SECURITY_ENC_ERAS,	"Encryption" },
};

/* maximum strlen(zfcp_fsf_fc_security_mnemonics[...].name) + 1 */
#define ZFCP_FSF_MAX_FC_SECURITY_MNEMONIC_LENGTH 15

/**
 * zfcp_fsf_scnprint_fc_security() - translate FC Endpoint Security flags into
 *                                   mnemonics and place in a buffer
 * @buf        : the buffer to place the translated FC Endpoint Security flag(s)
 *               into
 * @size       : the size of the buffer, including the trailing null space
 * @fc_security: one or more FC Endpoint Security flags, or zero
 * @fmt        : specifies whether a list or a single item is to be put into the
 *               buffer
 *
 * The Fibre Channel (FC) Endpoint Security flags are translated into mnemonics.
 * If the FC Endpoint Security flags are zero "none" is placed into the buffer.
 *
 * With ZFCP_FSF_PRINT_FMT_LIST the mnemonics are placed as a list separated by
 * a comma followed by a space into the buffer. If one or more FC Endpoint
 * Security flags cannot be translated into a mnemonic, as they are undefined
 * in zfcp_fsf_fc_security_mnemonics, their bitwise ORed value in hexadecimal
 * representation is placed into the buffer.
 *
 * With ZFCP_FSF_PRINT_FMT_SINGLEITEM only one single mnemonic is placed into
 * the buffer. If the FC Endpoint Security flag cannot be translated, as it is
 * undefined in zfcp_fsf_fc_security_mnemonics, its value in hexadecimal
 * representation is placed into the buffer. If more than one FC Endpoint
 * Security flag was specified, their value in hexadecimal representation is
 * placed into the buffer. The macro ZFCP_FSF_MAX_FC_SECURITY_MNEMONIC_LENGTH
 * can be used to define a buffer that is large enough to hold one mnemonic.
 *
 * Return: The number of characters written into buf not including the trailing
 *         '\0'. If size is == 0 the function returns 0.
 */
ssize_t zfcp_fsf_scnprint_fc_security(char *buf, size_t size, u32 fc_security,
				      enum zfcp_fsf_print_fmt fmt)
{
	const char *prefix = "";
	ssize_t len = 0;
	int i;

	if (fc_security == 0)
		return scnprintf(buf, size, "none");
	if (fmt == ZFCP_FSF_PRINT_FMT_SINGLEITEM && hweight32(fc_security) != 1)
		return scnprintf(buf, size, "0x%08x", fc_security);

	for (i = 0; i < ARRAY_SIZE(zfcp_fsf_fc_security_mnemonics); i++) {
		if (!(fc_security & zfcp_fsf_fc_security_mnemonics[i].mask))
			continue;

		len += scnprintf(buf + len, size - len, "%s%s", prefix,
				 zfcp_fsf_fc_security_mnemonics[i].name);
		prefix = ", ";
		fc_security &= ~zfcp_fsf_fc_security_mnemonics[i].mask;
	}

	if (fc_security != 0)
		len += scnprintf(buf + len, size - len, "%s0x%08x",
				 prefix, fc_security);

	return len;
}
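
To make the list and single-item behaviour documented above easier to try out, here is a userspace sketch of the same flag-to-mnemonic loop. It is an illustration under assumptions, not the driver code: the SEC_* flag values, the enum print_fmt, and the helpers sk_scnprintf() and popcount32() are hypothetical stand-ins for FSF_FC_SECURITY_*, enum zfcp_fsf_print_fmt, scnprintf() and hweight32().

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flag values; the real FSF_FC_SECURITY_* constants may differ. */
#define SEC_AUTH   0x1u
#define SEC_ENC_A  0x2u
#define SEC_ENC_B  0x4u

enum print_fmt { FMT_LIST, FMT_SINGLEITEM };

static const struct {
	uint32_t mask;
	const char *name;
} mnemonics[] = {
	{ SEC_AUTH,              "Authentication" },
	{ SEC_ENC_A | SEC_ENC_B, "Encryption" },
};

/* Stand-in for scnprintf(): returns the characters actually stored. */
static size_t sk_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (size == 0)
		return 0;
	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);
	if (n < 0)
		return 0;
	return (size_t)n >= size ? size - 1 : (size_t)n;
}

/* Stand-in for hweight32(): number of set bits. */
static int popcount32(uint32_t v)
{
	int n = 0;

	for (; v; v &= v - 1)
		n++;
	return n;
}

size_t print_security(char *buf, size_t size, uint32_t flags,
		      enum print_fmt fmt)
{
	const char *prefix = "";
	size_t len = 0, i;

	if (flags == 0)
		return sk_scnprintf(buf, size, "none");
	if (fmt == FMT_SINGLEITEM && popcount32(flags) != 1)
		return sk_scnprintf(buf, size, "0x%08x", (unsigned int)flags);

	for (i = 0; i < sizeof(mnemonics) / sizeof(mnemonics[0]); i++) {
		if (!(flags & mnemonics[i].mask))
			continue;
		len += sk_scnprintf(buf + len, size - len, "%s%s", prefix,
				    mnemonics[i].name);
		prefix = ", ";
		flags &= ~mnemonics[i].mask;	/* consume translated bits */
	}

	/* whatever is left has no mnemonic: print it as one hex value */
	if (flags != 0)
		len += sk_scnprintf(buf + len, size - len, "%s0x%08x",
				    prefix, (unsigned int)flags);
	return len;
}
```

Because the helper never reports more characters than it stored, the running length stays below the buffer size, so the buf + len and size - len arguments remain valid even when the output is truncated.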

static void zfcp_fsf_dbf_adapter_fc_security(struct zfcp_adapter *adapter,
					     struct zfcp_fsf_req *req)
{
	if (adapter->fc_security_algorithms ==
	    adapter->fc_security_algorithms_old) {
		/* no change, no trace */
		return;
	}

	zfcp_dbf_hba_fsf_fces("fsfcesa", req, ZFCP_DBF_INVALID_WWPN,
			      adapter->fc_security_algorithms_old,
			      adapter->fc_security_algorithms);

	adapter->fc_security_algorithms_old = adapter->fc_security_algorithms;
}

static void zfcp_fsf_exchange_port_evaluate(struct zfcp_fsf_req *req)
{
	struct zfcp_adapter *adapter = req->adapter;
	struct fsf_qtcb_bottom_port *bottom = &req->qtcb->bottom.port;

	if (req->data)
		memcpy(req->data, bottom, sizeof(*bottom));

	if (adapter->adapter_features & FSF_FEATURE_FC_SECURITY)
		adapter->fc_security_algorithms =
			bottom->fc_security_algorithms;
	else
		adapter->fc_security_algorithms = 0;
	zfcp_fsf_dbf_adapter_fc_security(adapter, req);
}

static void zfcp_fsf_exchange_port_data_handler(struct zfcp_fsf_req *req)
{
	struct zfcp_diag_header *const diag_hdr =
		&req->adapter->diagnostics->port_data.header;
	struct fsf_qtcb *qtcb = req->qtcb;
	struct fsf_qtcb_bottom_port *bottom = &qtcb->bottom.port;

	if (req->status & ZFCP_STATUS_FSFREQ_ERROR)
		return;

	switch (qtcb->header.fsf_status) {
	case FSF_GOOD:
		/*
		 * usually we wait with an update till the cache is too old,
		 * but because we have the data available, update it anyway
		 */
		zfcp_diag_update_xdata(diag_hdr, bottom, false);

		zfcp_scsi_shost_update_port_data(req->adapter, bottom);
		zfcp_fsf_exchange_port_evaluate(req);
		break;
	case FSF_EXCHANGE_CONFIG_DATA_INCOMPLETE:
		zfcp_diag_update_xdata(diag_hdr, bottom, true);
		req->status |= ZFCP_STATUS_FSFREQ_XDATAINCOMPLETE;

		zfcp_fsf_link_down_info_eval(req,
			&qtcb->header.fsf_status_qual.link_down_info);

		zfcp_scsi_shost_update_port_data(req->adapter, bottom);
scsi: zfcp: fix fc_host attributes that should be unknown on local link down
When we get an unsolicited notification on local link went down,
zfcp_fsf_status_read_link_down() calls zfcp_fsf_link_down_info_eval().
This only blocks rports, and sets ZFCP_STATUS_ADAPTER_LINK_UNPLUGGED and
ZFCP_STATUS_COMMON_ERP_FAILED. Only the fc_host port_state changes to
"Linkdown", because zfcp_scsi_get_host_port_state() is an active callback
and uses the adapter status.
Other fc_host attributes model, port_id, port_type, speed, fabric_name (and
zfcp device attributes card_version, peer_wwpn, peer_wwnn, peer_d_id) which
depend on a local link, continued to show their last known "good" value.
Only if something triggered an exchange config data, some values were
updated to their unknown equivalent via case
FSF_EXCHANGE_CONFIG_DATA_INCOMPLETE due to local link down. Triggers for
exchange config data are adapter recovery, or reading any of the following
zfcp-specific scsi host sysfs attributes "requests", "megabytes", or
"seconds_active" in /sys/devices/css*/*.*.*/*.*.*/host*/scsi_host/host*/.
The other fc_host attributes active_fc4s and permanent_port_name continued
to show their last known "good" value. Only if something triggered an
exchange port data, some values changed. Active_fc4s became all zeros as
unknown equivalent during link down. Permanent_port_name does not depend
on a local link. But for non-NPIV FCP devices, permanent_port_name
erroneously became whatever value fc_host port_name had at that point in
time (see previous paragraph). Triggers for exchange port data are the
zfcp-specific scsi host sysfs attribute "utilization", or
[{reset,get}_fc_host_stats] write anything into "reset_statistics" or read
any of the other attributes under
/sys/devices/css*/*.*.*/*.*.*/host*/fc_host/host*/statistics/.
(cf. v4.9 commit bd77befa5bcf ("zfcp: fix fc_host port_type with NPIV"))
This is particularly confusing when using "lszfcp -b <fcpdevbusid> -Ha" or
dbginfo.sh which read fc_host attributes and also scsi_host attributes.
After link down, the first invocation produces (abbreviated):
Class = "fc_host"
active_fc4s = "0x00 0x00 0x01 0x00 ..."
...
fabric_name = "0x10000027f8e04c49"
...
permanent_port_name = "0xc05076e4588059c1"
port_id = "0x244800"
port_state = "Linkdown"
port_type = "NPort (fabric via point-to-point)"
...
speed = "16 Gbit"
Class = "scsi_host"
...
megabytes = "0 0"
...
requests = "0 0 0"
seconds_active = "37"
...
utilization = "0 0 0"
The second and next invocations produce (abbreviated):
Class = "fc_host"
active_fc4s = "0x00 0x00 0x00 0x00 ..."
...
fabric_name = "0x0"
...
permanent_port_name = "0x0"
port_id = "0x000000"
port_state = "Linkdown"
port_type = "Unknown"
...
speed = "unknown"
Class = "scsi_host"
...
megabytes = "0 0"
...
requests = "0 0 0"
seconds_active = "38"
...
utilization = "0 0 0"
Factor out the resetting of local link dependent fc_host attributes from
zfcp_fsf_exchange_config_data_handler() case
FSF_EXCHANGE_CONFIG_DATA_INCOMPLETE into a new helper function
zfcp_fsf_fc_host_link_down(). All code places that detect local link down
(SRB, FSF_PROT_LINK_DOWN, xconf data/port incomplete) call
zfcp_fsf_link_down_info_eval(). Call the new helper from there. This works
because zfcp_fsf_link_down_info_eval() and thus the helper is called before
zfcp_fsf_exchange_{config,port}_evaluate().
Port_name and node_name are always valid, so never reset them.
Get the permanent_port_name from exchange port data unconditionally as it
always has a valid known good value, even during link down.
Note: Rather than hardcode in zfcp_fsf_exchange_config_evaluate(), fc_host
supported_classes could theoretically get its value from
fsf_qtcb_bottom_port.class_of_service in zfcp_fsf_exchange_port_evaluate().
When the link comes back, we get a different notification, perform adapter
recovery, and this triggers an implicit exchange config data followed by
exchange port data filling in the link dependent fc_host attributes with
known good values again.
Link: https://lore.kernel.org/r/20200312174505.51294-5-maier@linux.ibm.com
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
		zfcp_fsf_exchange_port_evaluate(req);
		break;
	}
}

static struct zfcp_fsf_req *zfcp_fsf_alloc(mempool_t *pool)
{
	struct zfcp_fsf_req *req;

	if (likely(pool))
		req = mempool_alloc(pool, GFP_ATOMIC);
	else
		req = kmalloc(sizeof(*req), GFP_ATOMIC);

	if (unlikely(!req))
		return NULL;

	memset(req, 0, sizeof(*req));
	req->pool = pool;
	return req;
}

static struct fsf_qtcb *zfcp_fsf_qtcb_alloc(mempool_t *pool)
{
	struct fsf_qtcb *qtcb;

	if (likely(pool))
		qtcb = mempool_alloc(pool, GFP_ATOMIC);
	else
		qtcb = kmem_cache_alloc(zfcp_fsf_qtcb_cache, GFP_ATOMIC);

	if (unlikely(!qtcb))
		return NULL;

	memset(qtcb, 0, sizeof(*qtcb));
	return qtcb;
}

static struct zfcp_fsf_req *zfcp_fsf_req_create(struct zfcp_qdio *qdio,
						u32 fsf_cmd, u8 sbtype,
						mempool_t *pool)
{
	struct zfcp_adapter *adapter = qdio->adapter;
	struct zfcp_fsf_req *req = zfcp_fsf_alloc(pool);

	if (unlikely(!req))
		return ERR_PTR(-ENOMEM);

	if (adapter->req_no == 0)
		adapter->req_no++;

	timer_setup(&req->timer, NULL, 0);
	init_completion(&req->completion);

	req->adapter = adapter;
	req->req_id = adapter->req_no;

	if (likely(fsf_cmd != FSF_QTCB_UNSOLICITED_STATUS)) {
		if (likely(pool))
			req->qtcb = zfcp_fsf_qtcb_alloc(
				adapter->pool.qtcb_pool);
		else
			req->qtcb = zfcp_fsf_qtcb_alloc(NULL);

		if (unlikely(!req->qtcb)) {
			zfcp_fsf_req_free(req);
			return ERR_PTR(-ENOMEM);
		}

		req->qtcb->prefix.req_seq_no = adapter->fsf_req_seq_no;
		req->qtcb->prefix.req_id = req->req_id;
		req->qtcb->prefix.ulp_info = 26;
		req->qtcb->prefix.qtcb_type = fsf_qtcb_type[fsf_cmd];
		req->qtcb->prefix.qtcb_version = FSF_QTCB_CURRENT_VERSION;
		req->qtcb->header.req_handle = req->req_id;
		req->qtcb->header.fsf_command = fsf_cmd;
	}

	zfcp_qdio_req_init(adapter->qdio, &req->qdio_req, req->req_id, sbtype,
			   req->qtcb, sizeof(struct fsf_qtcb));

	return req;
}

static int zfcp_fsf_req_send(struct zfcp_fsf_req *req)
{
scsi: zfcp: fix request object use-after-free in send path causing seqno errors
With a recent change to our send path for FSF commands we introduced a
possible use-after-free of request-objects, that might further lead to
zfcp crafting bad requests, which the FCP channel correctly complains
about with an error (FSF_PROT_SEQ_NUMB_ERROR). This error is then handled
by an adapter-wide recovery.
The following sequence illustrates the possible use-after-free:
Send Path:
int zfcp_fsf_open_port(struct zfcp_erp_action *erp_action)
{
struct zfcp_fsf_req *req;
...
spin_lock_irq(&qdio->req_q_lock);
// ^^^^^^^^^^^^^^^^
// protects QDIO queue during sending
...
req = zfcp_fsf_req_create(qdio,
FSF_QTCB_OPEN_PORT_WITH_DID,
SBAL_SFLAGS0_TYPE_READ,
qdio->adapter->pool.erp_req);
// ^^^^^^^^^^^^^^^^^^^
// allocation of the request-object
...
retval = zfcp_fsf_req_send(req);
...
spin_unlock_irq(&qdio->req_q_lock);
return retval;
}
static int zfcp_fsf_req_send(struct zfcp_fsf_req *req)
{
struct zfcp_adapter *adapter = req->adapter;
struct zfcp_qdio *qdio = adapter->qdio;
...
zfcp_reqlist_add(adapter->req_list, req);
// ^^^^^^^^^^^^^^^^
// add request to our driver-internal hash-table for tracking
// (protected by separate lock req_list->lock)
...
if (zfcp_qdio_send(qdio, &req->qdio_req)) {
// ^^^^^^^^^^^^^^
// hand-off the request to FCP channel;
// the request can complete at any point now
...
}
/* Don't increase for unsolicited status */
if (!zfcp_fsf_req_is_status_read_buffer(req))
// ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
// possible use-after-free
adapter->fsf_req_seq_no++;
// ^^^^^^^^^^^^^^^^
// because of the use-after-free we might
// miss this accounting, and as follow-up
// this results in the FCP channel error
// FSF_PROT_SEQ_NUMB_ERROR
adapter->req_no++;
return 0;
}
static inline bool
zfcp_fsf_req_is_status_read_buffer(struct zfcp_fsf_req *req)
{
return req->qtcb == NULL;
// ^^^^^^^^^
// possible use-after-free
}
Response Path:
void zfcp_fsf_reqid_check(struct zfcp_qdio *qdio, int sbal_idx)
{
...
struct zfcp_fsf_req *fsf_req;
...
for (idx = 0; idx < QDIO_MAX_ELEMENTS_PER_BUFFER; idx++) {
...
fsf_req = zfcp_reqlist_find_rm(adapter->req_list,
req_id);
// ^^^^^^^^^^^^^^^^^^^^
// remove request from our driver-internal
// hash-table (lock req_list->lock)
...
zfcp_fsf_req_complete(fsf_req);
}
}
static void zfcp_fsf_req_complete(struct zfcp_fsf_req *req)
{
...
if (likely(req->status & ZFCP_STATUS_FSFREQ_CLEANUP))
zfcp_fsf_req_free(req);
// ^^^^^^^^^^^^^^^^^
// free memory for request-object
else
complete(&req->completion);
// ^^^^^^^^
// completion notification for code-paths that wait
// synchronously for the completion of the request; in
// those the memory is freed separately
}
The result of the use-after-free only affects the send path, and cannot
lead to any data corruption. In case we miss the sequence-number
accounting, because the memory was already re-purposed, the next FSF
command will fail with said FCP channel error, and we will recover the
whole adapter. This causes no additional errors, but it slows down
traffic. There is a slight chance of the same thing happening again
recursively after the adapter recovery, but so far this has not been seen.
This was seen under z/VM, where the send path might run on a virtual CPU
that gets scheduled away by z/VM, while the return path might still run,
and so create the necessary timing. Running with KASAN can also slow down
the kernel sufficiently to run into this use-after-free, and then see the
report by KASAN.
To fix this, simply pull the test for the sequence-number accounting in
front of the hand-off to the FCP channel (this information doesn't change
during hand-off), but leave the sequence-number accounting itself where it
is.
To make future regressions of the same kind less likely, add comments to
all closely related code-paths.
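The reordering described above can be illustrated outside the kernel. The following is a minimal user-space sketch, not the driver code: demo_req, demo_adapter, demo_handoff, demo_req_send and demo_seq_delta are hypothetical stand-ins, and the hand-off is modelled as freeing the request immediately, which is the worst case the commit message describes.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the zfcp structures. */
struct demo_req {
	void *qtcb;	/* NULL marks an unsolicited status read buffer */
};

struct demo_adapter {
	unsigned int fsf_req_seq_no;
	unsigned int req_no;
};

/*
 * Models the hand-off to the channel. In the worst case the request
 * completes and is freed before the sender continues.
 */
static void demo_handoff(struct demo_req *req)
{
	free(req->qtcb);
	free(req);
}

/* Fixed ordering: read everything needed from req before the hand-off. */
static int demo_req_send(struct demo_adapter *adapter, struct demo_req *req)
{
	const bool is_srb = (req->qtcb == NULL);

	demo_handoff(req);
	/* req must not be dereferenced past this point. */

	/* Don't increase for unsolicited status */
	if (!is_srb)
		adapter->fsf_req_seq_no++;
	adapter->req_no++;
	return 0;
}

/* Sends one request and returns the resulting sequence number. */
static unsigned int demo_seq_delta(bool with_qtcb)
{
	struct demo_adapter adapter = { 0, 0 };
	struct demo_req *req = calloc(1, sizeof(*req));

	if (!req)
		return 0;
	if (with_qtcb)
		req->qtcb = malloc(16);
	demo_req_send(&adapter, req);
	return adapter.fsf_req_seq_no;
}
```

Calling demo_seq_delta(true) sends a request that carries a QTCB, so the sequence number is expected to advance; demo_seq_delta(false) models a status read buffer, for which it should stay unchanged. In both cases the decision is made from a value captured before the request was given away.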
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Fixes: f9eca0227600 ("scsi: zfcp: drop duplicate fsf_command from zfcp_fsf_req which is also in QTCB header")
Cc: <stable@vger.kernel.org> #5.0+
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-03 05:02:00 +08:00
|
|
|
const bool is_srb = zfcp_fsf_req_is_status_read_buffer(req);
|
2008-07-02 16:56:39 +08:00
|
|
|
struct zfcp_adapter *adapter = req->adapter;
|
2009-08-18 21:43:19 +08:00
|
|
|
struct zfcp_qdio *qdio = adapter->qdio;
|
2023-02-22 01:55:59 +08:00
|
|
|
u64 req_id = req->req_id;
|
2008-07-02 16:56:39 +08:00
|
|
|
|
2010-02-17 18:18:50 +08:00
|
|
|
zfcp_reqlist_add(adapter->req_list, req);
|
2008-07-02 16:56:39 +08:00
|
|
|
|
2010-07-16 21:37:38 +08:00
|
|
|
req->qdio_req.qdio_outb_usage = atomic_read(&qdio->req_q_free);
|
2013-01-30 16:49:40 +08:00
|
|
|
req->issued = get_tod_clock();
|
2010-02-17 18:18:59 +08:00
|
|
|
if (zfcp_qdio_send(qdio, &req->qdio_req)) {
|
scsi: zfcp: Fix use-after-free in request timeout handlers
Before v4.15 commit 75492a51568b ("s390/scsi: Convert timers to use
timer_setup()"), we intentionally only passed zfcp_adapter as context
argument to zfcp_fsf_request_timeout_handler(). Since we only trigger
adapter recovery, it was unnecessary to sync against races between timeout
and (late) completion. Likewise, we only passed zfcp_erp_action as context
argument to zfcp_erp_timeout_handler(). Since we only wake up an ERP action,
it was unnecessary to sync against races between timeout and (late)
completion.
Meanwhile the timeout handlers get timer_list as context argument and do a
timer-specific container-of to zfcp_fsf_req which may already have been freed.
Fix it by making sure that any request timeout handlers, that might just
have started before del_timer(), are completed by using del_timer_sync()
instead. This ensures the request free happens afterwards.
Space time diagram of potential use-after-free:
Basic idea is to have 2 or more pending requests whose timeouts run out at
almost the same time.
  req 1 timeout     ERP thread        req 2 timeout
  ----------------  ----------------  ---------------------------------------
  zfcp_fsf_request_timeout_handler
  fsf_req = from_timer(fsf_req, t, timer)
  adapter = fsf_req->adapter
  zfcp_qdio_siosl(adapter)
  zfcp_erp_adapter_reopen(adapter,...)
                    zfcp_erp_strategy
                    ...
                    zfcp_fsf_req_dismiss_all
                    list_for_each_entry_safe
                    zfcp_fsf_req_complete 1
                    del_timer 1
                    zfcp_fsf_req_free 1
                    zfcp_fsf_req_complete 2
                                      zfcp_fsf_request_timeout_handler
                    del_timer 2
                                      fsf_req = from_timer(fsf_req, t, timer)
                    zfcp_fsf_req_free 2
                                      adapter = fsf_req->adapter
                                                ^^^^^^^ already freed
Link: https://lore.kernel.org/r/20200813152856.50088-1-maier@linux.ibm.com
Fixes: 75492a51568b ("s390/scsi: Convert timers to use timer_setup()")
Cc: <stable@vger.kernel.org> #4.15+
Suggested-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2020-08-13 23:28:56 +08:00
|
|
|
del_timer_sync(&req->timer);
|
2023-02-22 01:56:00 +08:00
|
|
|
|
2008-11-04 23:35:08 +08:00
|
|
|
/* lookup request again, list might have changed */
|
2023-02-22 01:56:00 +08:00
|
|
|
if (zfcp_reqlist_find_rm(adapter->req_list, req_id) == NULL)
|
|
|
|
zfcp_dbf_hba_fsf_reqid("fsrsrmf", 1, adapter, req_id);
|
|
|
|
|
2010-12-02 22:16:16 +08:00
|
|
|
zfcp_erp_adapter_reopen(adapter, 0, "fsrs__1");
|
2008-07-02 16:56:39 +08:00
|
|
|
return -EIO;
|
|
|
|
}
|
|
|
|
|
2019-07-03 05:02:00 +08:00
|
|
|
/*
|
|
|
|
* NOTE: DO NOT TOUCH ASYNC req PAST THIS POINT.
|
|
|
|
* ONLY TOUCH SYNC req AGAIN ON req->completion.
|
|
|
|
*
|
|
|
|
* The request might complete and be freed concurrently at any point
|
|
|
|
* now. This is not protected by the QDIO-lock (req_q_lock). So any
|
|
|
|
* uncontrolled access after this might result in an use-after-free bug.
|
|
|
|
* Only if the request doesn't have ZFCP_STATUS_FSFREQ_CLEANUP set, and
|
|
|
|
* when it is completed via req->completion, is it safe to use req
|
|
|
|
* again.
|
|
|
|
*/
|
|
|
|
|
2008-07-02 16:56:39 +08:00
|
|
|
/* Don't increase for unsolicited status */
|
2019-07-03 05:02:00 +08:00
|
|
|
if (!is_srb)
|
2008-07-02 16:56:39 +08:00
|
|
|
adapter->fsf_req_seq_no++;
|
2009-03-02 20:08:58 +08:00
|
|
|
adapter->req_no++;
|
2008-07-02 16:56:39 +08:00
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* zfcp_fsf_status_read - send status read request
|
2018-11-08 22:44:54 +08:00
|
|
|
* @qdio: pointer to struct zfcp_qdio
|
2008-07-02 16:56:39 +08:00
|
|
|
* Returns: 0 on success, ERROR otherwise
|
2005-04-17 06:20:36 +08:00
|
|
|
*/
|
2009-08-18 21:43:19 +08:00
|
|
|
int zfcp_fsf_status_read(struct zfcp_qdio *qdio)
|
2005-04-17 06:20:36 +08:00
|
|
|
{
|
2009-08-18 21:43:19 +08:00
|
|
|
struct zfcp_adapter *adapter = qdio->adapter;
|
2008-07-02 16:56:39 +08:00
|
|
|
struct zfcp_fsf_req *req;
|
|
|
|
struct fsf_status_read_buffer *sr_buf;
|
2011-02-23 02:54:40 +08:00
|
|
|
struct page *page;
|
2008-07-02 16:56:39 +08:00
|
|
|
int retval = -EIO;
|
2005-04-17 06:20:36 +08:00
|
|
|
|
[SCSI] zfcp: Change spin_lock_bh to spin_lock_irq to fix lockdep warning
With the change to use the data on the SCSI device, iterating through
all LUNs/scsi_devices takes the SCSI host_lock. This triggers warnings
from the lock dependency checker:
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.34.1 #97
---------------------------------------------------------
chchp/3224 just changed the state of lock:
(&(shost->host_lock)->rlock){-.-...}, at: [<00000000003a73f4>] __scsi_iterate_devices+0x38/0xbc
but this lock took another, HARDIRQ-unsafe lock in the past:
(&(&qdio->req_q_lock)->rlock){+.-...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this: [ 24.972394] 2 locks held by chchp/3224:
#0: (&(sch->lock)->rlock){-.-...}, at: [<0000000000401efa>] do_IRQ+0xb2/0x1e4
#1: (&adapter->port_list_lock){.-....}, at: [<0000000000490302>] zfcp_erp_modify_adapter_status+0x9e/0x16c
[...]
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.34.1 #98
---------------------------------------------------------
chchp/3235 just changed the state of lock:
(&(shost->host_lock)->rlock){-.-...}, at: [<00000000003a73f4>] __scsi_iterate_devices+0x38/0xbc
but this lock took another, HARDIRQ-unsafe lock in the past:
(&(&qdio->stat_lock)->rlock){+.-...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
2 locks held by chchp/3235:
#0: (&(sch->lock)->rlock){-.-...}, at: [<0000000000401efa>] do_IRQ+0xb2/0x1e4
#1: (&adapter->port_list_lock){.-.-..}, at: [<00000000004902f6>] zfcp_erp_modify_adapter_status+0x9e/0x16c
[...]
To stop this warning, change the request queue lock to disable irqs,
not only softirq. The changes are required only outside of the
critical "send fcp command" path.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-09-08 20:39:57 +08:00
|
|
|
spin_lock_irq(&qdio->req_q_lock);
|
2010-05-01 00:09:35 +08:00
|
|
|
if (zfcp_qdio_sbal_get(qdio))
|
2008-07-02 16:56:39 +08:00
|
|
|
goto out;
|
|
|
|
|
2013-08-22 23:49:31 +08:00
|
|
|
req = zfcp_fsf_req_create(qdio, FSF_QTCB_UNSOLICITED_STATUS,
|
|
|
|
SBAL_SFLAGS0_TYPE_STATUS,
|
2009-08-18 21:43:15 +08:00
|
|
|
adapter->pool.status_read_req);
|
2008-08-21 19:43:37 +08:00
|
|
|
if (IS_ERR(req)) {
|
2008-07-02 16:56:39 +08:00
|
|
|
retval = PTR_ERR(req);
|
|
|
|
goto out;
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
|
|
|
|
2011-02-23 02:54:40 +08:00
|
|
|
page = mempool_alloc(adapter->pool.sr_data, GFP_ATOMIC);
|
|
|
|
if (!page) {
|
2008-07-02 16:56:39 +08:00
|
|
|
retval = -ENOMEM;
|
|
|
|
goto failed_buf;
|
|
|
|
}
|
2011-02-23 02:54:40 +08:00
|
|
|
sr_buf = page_address(page);
|
2008-07-02 16:56:39 +08:00
|
|
|
memset(sr_buf, 0, sizeof(*sr_buf));
|
|
|
|
req->data = sr_buf;
|
2010-05-01 00:09:34 +08:00
|
|
|
|
|
|
|
zfcp_qdio_fill_next(qdio, &req->qdio_req, sr_buf, sizeof(*sr_buf));
|
|
|
|
zfcp_qdio_set_sbale_last(qdio, &req->qdio_req);
|
2005-09-14 03:47:52 +08:00
|
|
|
|
2008-07-02 16:56:39 +08:00
|
|
|
retval = zfcp_fsf_req_send(req);
|
|
|
|
if (retval)
|
|
|
|
goto failed_req_send;
|
2019-07-03 05:02:00 +08:00
|
|
|
/* NOTE: DO NOT TOUCH req PAST THIS POINT! */
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-07-02 16:56:39 +08:00
|
|
|
goto out;
|
|
|
|
|
|
|
|
failed_req_send:
|
2010-12-02 22:16:14 +08:00
|
|
|
req->data = NULL;
|
2011-02-23 02:54:40 +08:00
|
|
|
mempool_free(virt_to_page(sr_buf), adapter->pool.sr_data);
|
2008-07-02 16:56:39 +08:00
|
|
|
failed_buf:
|
2010-12-02 22:16:14 +08:00
|
|
|
zfcp_dbf_hba_fsf_uss("fssr__1", req);
|
2008-07-02 16:56:39 +08:00
|
|
|
zfcp_fsf_req_free(req);
|
|
|
|
out:
|
2010-09-08 20:39:57 +08:00
|
|
|
spin_unlock_irq(&qdio->req_q_lock);
|
2008-07-02 16:56:39 +08:00
|
|
|
return retval;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void zfcp_fsf_abort_fcp_command_handler(struct zfcp_fsf_req *req)
|
|
|
|
{
|
2010-09-08 20:39:55 +08:00
|
|
|
struct scsi_device *sdev = req->data;
|
2012-09-04 21:23:36 +08:00
|
|
|
struct zfcp_scsi_dev *zfcp_sdev;
|
2008-07-02 16:56:39 +08:00
|
|
|
union fsf_status_qual *fsq = &req->qtcb->header.fsf_status_qual;
|
|
|
|
|
|
|
|
if (req->status & ZFCP_STATUS_FSFREQ_ERROR)
|
|
|
|
return;
|
|
|
|
|
2012-09-04 21:23:36 +08:00
|
|
|
zfcp_sdev = sdev_to_zfcp(sdev);
|
|
|
|
|
2008-07-02 16:56:39 +08:00
|
|
|
switch (req->qtcb->header.fsf_status) {
|
2005-04-17 06:20:36 +08:00
|
|
|
case FSF_PORT_HANDLE_NOT_VALID:
|
2008-07-02 16:56:39 +08:00
|
|
|
if (fsq->word[0] == fsq->word[1]) {
|
2010-09-08 20:39:55 +08:00
|
|
|
zfcp_erp_adapter_reopen(zfcp_sdev->port->adapter, 0,
|
2010-12-02 22:16:16 +08:00
|
|
|
"fsafch1");
|
2008-07-02 16:56:39 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
|
|
|
break;
|
|
|
|
case FSF_LUN_HANDLE_NOT_VALID:
|
2008-07-02 16:56:39 +08:00
|
|
|
if (fsq->word[0] == fsq->word[1]) {
|
2010-12-02 22:16:16 +08:00
|
|
|
zfcp_erp_port_reopen(zfcp_sdev->port, 0, "fsafch2");
|
2008-07-02 16:56:39 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
|
|
|
break;
|
|
|
|
case FSF_FCP_COMMAND_DOES_NOT_EXIST:
|
2008-07-02 16:56:39 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_ABORTNOTNEEDED;
|
2005-04-17 06:20:36 +08:00
|
|
|
break;
|
|
|
|
case FSF_PORT_BOXED:
|
2010-09-08 20:40:01 +08:00
|
|
|
zfcp_erp_set_port_status(zfcp_sdev->port,
|
|
|
|
ZFCP_STATUS_COMMON_ACCESS_BOXED);
|
|
|
|
zfcp_erp_port_reopen(zfcp_sdev->port,
|
2010-12-02 22:16:16 +08:00
|
|
|
ZFCP_STATUS_COMMON_ERP_FAILED, "fsafch3");
|
2009-11-24 23:54:15 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
|
2005-04-17 06:20:36 +08:00
|
|
|
break;
|
|
|
|
case FSF_LUN_BOXED:
|
2010-09-08 20:40:01 +08:00
|
|
|
zfcp_erp_set_lun_status(sdev, ZFCP_STATUS_COMMON_ACCESS_BOXED);
|
|
|
|
zfcp_erp_lun_reopen(sdev, ZFCP_STATUS_COMMON_ERP_FAILED,
|
2010-12-02 22:16:16 +08:00
|
|
|
"fsafch4");
|
2009-11-24 23:54:15 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
|
2005-04-17 06:20:36 +08:00
|
|
|
break;
|
|
|
|
case FSF_ADAPTER_STATUS_AVAILABLE:
|
2008-07-02 16:56:39 +08:00
|
|
|
switch (fsq->word[0]) {
|
2005-04-17 06:20:36 +08:00
|
|
|
case FSF_SQ_INVOKE_LINK_TEST_PROCEDURE:
|
2010-09-08 20:39:55 +08:00
|
|
|
zfcp_fc_test_link(zfcp_sdev->port);
|
2020-03-31 22:21:48 +08:00
|
|
|
fallthrough;
|
2005-04-17 06:20:36 +08:00
|
|
|
case FSF_SQ_ULP_DEPENDENT_ERP_REQUIRED:
|
2008-07-02 16:56:39 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
|
2005-04-17 06:20:36 +08:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
break;
|
|
|
|
case FSF_GOOD:
|
2008-07-02 16:56:39 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_ABORTSUCCEEDED;
|
2005-04-17 06:20:36 +08:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
2010-09-08 20:39:55 +08:00
|
|
|
* zfcp_fsf_abort_fcp_cmnd - abort running SCSI command
|
|
|
|
* @scmnd: The SCSI command to abort
|
2008-07-02 16:56:39 +08:00
|
|
|
* Returns: pointer to struct zfcp_fsf_req
|
2005-04-17 06:20:36 +08:00
|
|
|
*/
|
|
|
|
|
2010-09-08 20:39:55 +08:00
|
|
|
struct zfcp_fsf_req *zfcp_fsf_abort_fcp_cmnd(struct scsi_cmnd *scmnd)
|
2005-04-17 06:20:36 +08:00
|
|
|
{
|
2008-07-02 16:56:39 +08:00
|
|
|
struct zfcp_fsf_req *req = NULL;
|
2010-09-08 20:39:55 +08:00
|
|
|
struct scsi_device *sdev = scmnd->device;
|
|
|
|
struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(sdev);
|
|
|
|
struct zfcp_qdio *qdio = zfcp_sdev->port->adapter->qdio;
|
2023-02-22 01:55:59 +08:00
|
|
|
u64 old_req_id = (u64) scmnd->host_scribble;
|
2005-09-14 03:50:38 +08:00
|
|
|
|
2010-09-08 20:39:57 +08:00
	spin_lock_irq(&qdio->req_q_lock);
2010-05-01 00:09:35 +08:00
	if (zfcp_qdio_sbal_get(qdio))
2008-07-02 16:56:39 +08:00
		goto out;
2009-08-18 21:43:19 +08:00
	req = zfcp_fsf_req_create(qdio, FSF_QTCB_ABORT_FCP_CMND,
2011-06-06 20:14:40 +08:00
				  SBAL_SFLAGS0_TYPE_READ,
2009-08-18 21:43:19 +08:00
				  qdio->adapter->pool.scsi_abort);
2008-11-27 01:07:37 +08:00
	if (IS_ERR(req)) {
		req = NULL;
2008-07-02 16:56:39 +08:00
		goto out;
2008-11-27 01:07:37 +08:00
	}
2006-09-19 04:29:56 +08:00

2010-09-08 20:39:55 +08:00
	if (unlikely(!(atomic_read(&zfcp_sdev->status) &
2008-07-02 16:56:39 +08:00
		       ZFCP_STATUS_COMMON_UNBLOCKED)))
		goto out_error_free;
2005-04-17 06:20:36 +08:00

2010-05-01 00:09:34 +08:00
	zfcp_qdio_set_sbale_last(qdio, &req->qdio_req);
2005-04-17 06:20:36 +08:00

2010-11-17 21:23:41 +08:00
	req->data = sdev;
2008-07-02 16:56:39 +08:00
	req->handler = zfcp_fsf_abort_fcp_command_handler;
2010-09-08 20:39:55 +08:00
	req->qtcb->header.lun_handle = zfcp_sdev->lun_handle;
	req->qtcb->header.port_handle = zfcp_sdev->port->handle;
2023-02-22 01:55:59 +08:00
	req->qtcb->bottom.support.req_handle = old_req_id;
2008-07-02 16:56:39 +08:00

2018-11-08 22:44:40 +08:00
	zfcp_fsf_start_timer(req, ZFCP_FSF_SCSI_ER_TIMEOUT);
scsi: zfcp: fix request object use-after-free in send path causing seqno errors
With a recent change to our send path for FSF commands we introduced a
possible use-after-free of request-objects, that might further lead to
zfcp crafting bad requests, which the FCP channel correctly complains
about with an error (FSF_PROT_SEQ_NUMB_ERROR). This error is then handled
by an adapter-wide recovery.
The following sequence illustrates the possible use-after-free:
Send Path:
int zfcp_fsf_open_port(struct zfcp_erp_action *erp_action)
{
	struct zfcp_fsf_req *req;
	...
	spin_lock_irq(&qdio->req_q_lock);
//	^^^^^^^^^^^^^^^^
//	protects QDIO queue during sending
	...
	req = zfcp_fsf_req_create(qdio,
				  FSF_QTCB_OPEN_PORT_WITH_DID,
				  SBAL_SFLAGS0_TYPE_READ,
				  qdio->adapter->pool.erp_req);
//	      ^^^^^^^^^^^^^^^^^^^
//	      allocation of the request-object
	...
	retval = zfcp_fsf_req_send(req);
	...
	spin_unlock_irq(&qdio->req_q_lock);
	return retval;
}
static int zfcp_fsf_req_send(struct zfcp_fsf_req *req)
{
	struct zfcp_adapter *adapter = req->adapter;
	struct zfcp_qdio *qdio = adapter->qdio;
	...
	zfcp_reqlist_add(adapter->req_list, req);
//	^^^^^^^^^^^^^^^^
//	add request to our driver-internal hash-table for tracking
//	(protected by separate lock req_list->lock)
	...
	if (zfcp_qdio_send(qdio, &req->qdio_req)) {
//	    ^^^^^^^^^^^^^^
//	    hand-off the request to FCP channel;
//	    the request can complete at any point now
		...
	}
	/* Don't increase for unsolicited status */
	if (!zfcp_fsf_req_is_status_read_buffer(req))
//	     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
//	     possible use-after-free
		adapter->fsf_req_seq_no++;
//		         ^^^^^^^^^^^^^^^^
//		         because of the use-after-free we might
//		         miss this accounting, and as follow-up
//		         this results in the FCP channel error
//		         FSF_PROT_SEQ_NUMB_ERROR
	adapter->req_no++;
	return 0;
}
static inline bool
zfcp_fsf_req_is_status_read_buffer(struct zfcp_fsf_req *req)
{
	return req->qtcb == NULL;
//	       ^^^^^^^^^
//	       possible use-after-free
}
Response Path:
void zfcp_fsf_reqid_check(struct zfcp_qdio *qdio, int sbal_idx)
{
	...
	struct zfcp_fsf_req *fsf_req;
	...
	for (idx = 0; idx < QDIO_MAX_ELEMENTS_PER_BUFFER; idx++) {
		...
		fsf_req = zfcp_reqlist_find_rm(adapter->req_list,
					       req_id);
//		          ^^^^^^^^^^^^^^^^^^^^
//		          remove request from our driver-internal
//		          hash-table (lock req_list->lock)
		...
		zfcp_fsf_req_complete(fsf_req);
	}
}
static void zfcp_fsf_req_complete(struct zfcp_fsf_req *req)
{
	...
	if (likely(req->status & ZFCP_STATUS_FSFREQ_CLEANUP))
		zfcp_fsf_req_free(req);
//		^^^^^^^^^^^^^^^^^
//		free memory for request-object
	else
		complete(&req->completion);
//		^^^^^^^^
//		completion notification for code-paths that wait
//		synchronous for the completion of the request; in
//		those the memory is freed separately
}
The result of the use-after-free only affects the send path, and cannot
lead to any data corruption. In case we miss the sequence-number
accounting, because the memory was already re-purposed, the next FSF
command will fail with said FCP channel error, and we will recover the
whole adapter. This causes no additional errors, but it slows down
traffic. There is a slight chance of the same thing happening again
recursively after the adapter recovery, but so far this has not been seen.
This was seen under z/VM, where the send path might run on a virtual CPU
that gets scheduled away by z/VM, while the return path might still run,
and so create the necessary timing. Running with KASAN can also slow down
the kernel sufficiently to run into this use-after-free, and then see the
report by KASAN.
To fix this, simply pull the test for the sequence-number accounting in
front of the hand-off to the FCP channel (this information doesn't change
during hand-off), but leave the sequence-number accounting itself where it
is.
To make future regressions of the same kind less likely, add comments to
all closely related code-paths.
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Fixes: f9eca0227600 ("scsi: zfcp: drop duplicate fsf_command from zfcp_fsf_req which is also in QTCB header")
Cc: <stable@vger.kernel.org> #5.0+
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
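The reordering described in the commit message above can be sketched in user space. This is a minimal illustration under stated assumptions, not the zfcp driver's code: `toy_adapter`, `toy_req`, `toy_req_new()`, `toy_hand_off()` and `toy_req_send()` are hypothetical stand-ins. The hand-off frees the request immediately to model a response path that completes before the sender continues, so the sender samples what it needs from the request beforehand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver structures, for illustration only. */
struct toy_adapter {
	unsigned int seq_no;	/* counts solicited requests only */
	unsigned int req_no;	/* counts every request */
};

struct toy_req {
	struct toy_adapter *adapter;
	void *qtcb;		/* NULL marks an unsolicited status read buffer */
};

static struct toy_req *toy_req_new(struct toy_adapter *adapter, void *qtcb)
{
	struct toy_req *req = malloc(sizeof(*req));

	assert(req != NULL);
	req->adapter = adapter;
	req->qtcb = qtcb;
	return req;
}

/* Models the hand-off: the response path may free the request at once. */
static int toy_hand_off(struct toy_req *req)
{
	free(req);
	return 0;
}

static int toy_req_send(struct toy_req *req)
{
	struct toy_adapter *adapter = req->adapter;
	/* Sample everything needed from req before the hand-off. */
	const bool is_srb = (req->qtcb == NULL);

	if (toy_hand_off(req))
		return -1;
	/* req must not be dereferenced from here on. */

	/* The accounting itself stays after the hand-off. */
	if (!is_srb)
		adapter->seq_no++;
	adapter->req_no++;
	return 0;
}
```

Only the test moves in front of the hand-off; the counters are still updated afterwards, and they are reached through the `adapter` pointer that was copied out of the request earlier.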
2019-07-03 05:02:00 +08:00
	if (!zfcp_fsf_req_send(req)) {
		/* NOTE: DO NOT TOUCH req, UNTIL IT COMPLETES! */
2008-07-02 16:56:39 +08:00
		goto out;
2019-07-03 05:02:00 +08:00
	}
2008-07-02 16:56:39 +08:00

out_error_free:
	zfcp_fsf_req_free(req);
	req = NULL;
out:
[SCSI] zfcp: Change spin_lock_bh to spin_lock_irq to fix lockdep warning
With the change to use the data on the SCSI device, iterating through
all LUNs/scsi_devices takes the SCSI host_lock. This triggers warnings
from the lock dependency checker:
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.34.1 #97
---------------------------------------------------------
chchp/3224 just changed the state of lock:
(&(shost->host_lock)->rlock){-.-...}, at: [<00000000003a73f4>] __scsi_iterate_devices+0x38/0xbc
but this lock took another, HARDIRQ-unsafe lock in the past:
(&(&qdio->req_q_lock)->rlock){+.-...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
[ 24.972394] 2 locks held by chchp/3224:
#0: (&(sch->lock)->rlock){-.-...}, at: [<0000000000401efa>] do_IRQ+0xb2/0x1e4
#1: (&adapter->port_list_lock){.-....}, at: [<0000000000490302>] zfcp_erp_modify_adapter_status+0x9e/0x16c
[...]
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.34.1 #98
---------------------------------------------------------
chchp/3235 just changed the state of lock:
(&(shost->host_lock)->rlock){-.-...}, at: [<00000000003a73f4>] __scsi_iterate_devices+0x38/0xbc
but this lock took another, HARDIRQ-unsafe lock in the past:
(&(&qdio->stat_lock)->rlock){+.-...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
2 locks held by chchp/3235:
#0: (&(sch->lock)->rlock){-.-...}, at: [<0000000000401efa>] do_IRQ+0xb2/0x1e4
#1: (&adapter->port_list_lock){.-.-..}, at: [<00000000004902f6>] zfcp_erp_modify_adapter_status+0x9e/0x16c
[...]
To stop this warning, change the request queue lock to disable irqs,
not only softirq. The changes are required only outside of the
critical "send fcp command" path.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-09-08 20:39:57 +08:00
	spin_unlock_irq(&qdio->req_q_lock);
2008-07-02 16:56:39 +08:00
	return req;
2005-04-17 06:20:36 +08:00
}

2008-07-02 16:56:39 +08:00
static void zfcp_fsf_send_ct_handler(struct zfcp_fsf_req *req)
2005-04-17 06:20:36 +08:00
{
2008-07-02 16:56:39 +08:00
	struct zfcp_adapter *adapter = req->adapter;
2009-11-24 23:54:13 +08:00
	struct zfcp_fsf_ct_els *ct = req->data;
2008-07-02 16:56:39 +08:00
	struct fsf_qtcb_header *header = &req->qtcb->header;
2005-04-17 06:20:36 +08:00

2009-11-24 23:54:13 +08:00
	ct->status = -EINVAL;
2005-04-17 06:20:36 +08:00

2008-07-02 16:56:39 +08:00
	if (req->status & ZFCP_STATUS_FSFREQ_ERROR)
2005-04-17 06:20:36 +08:00
		goto skip_fsfstatus;

	switch (header->fsf_status) {
	case FSF_GOOD:
2009-11-24 23:54:13 +08:00
		ct->status = 0;
2017-07-28 18:30:53 +08:00
		zfcp_dbf_san_res("fsscth2", req);
2005-04-17 06:20:36 +08:00
		break;
	case FSF_SERVICE_CLASS_NOT_SUPPORTED:
2008-07-02 16:56:39 +08:00
		zfcp_fsf_class_not_supp(req);
2005-04-17 06:20:36 +08:00
		break;
	case FSF_ADAPTER_STATUS_AVAILABLE:
		switch (header->fsf_status_qual.word[0]){
		case FSF_SQ_INVOKE_LINK_TEST_PROCEDURE:
		case FSF_SQ_ULP_DEPENDENT_ERP_REQUIRED:
2008-07-02 16:56:39 +08:00
			req->status |= ZFCP_STATUS_FSFREQ_ERROR;
2005-04-17 06:20:36 +08:00
			break;
		}
		break;
	case FSF_PORT_BOXED:
2009-11-24 23:54:15 +08:00
		req->status |= ZFCP_STATUS_FSFREQ_ERROR;
2005-04-17 06:20:36 +08:00
		break;
2008-07-02 16:56:39 +08:00
	case FSF_PORT_HANDLE_NOT_VALID:
2010-12-02 22:16:16 +08:00
		zfcp_erp_adapter_reopen(adapter, 0, "fsscth1");
2020-03-31 22:21:48 +08:00
		fallthrough;
2008-07-02 16:56:39 +08:00
	case FSF_GENERIC_COMMAND_REJECTED:
2005-04-17 06:20:36 +08:00
	case FSF_PAYLOAD_SIZE_MISMATCH:
	case FSF_REQUEST_SIZE_TOO_LARGE:
	case FSF_RESPONSE_SIZE_TOO_LARGE:
	case FSF_SBAL_MISMATCH:
2008-07-02 16:56:39 +08:00
		req->status |= ZFCP_STATUS_FSFREQ_ERROR;
2005-04-17 06:20:36 +08:00
		break;
	}

skip_fsfstatus:
2009-11-24 23:54:13 +08:00
	if (ct->handler)
		ct->handler(ct->handler_data);
2008-07-02 16:56:39 +08:00
}
2005-04-17 06:20:36 +08:00

2010-05-01 00:09:34 +08:00
static void zfcp_fsf_setup_ct_els_unchained(struct zfcp_qdio *qdio,
					    struct zfcp_qdio_req *q_req,
2009-07-13 21:06:06 +08:00
					    struct scatterlist *sg_req,
					    struct scatterlist *sg_resp)
{
2010-05-01 00:09:34 +08:00
	zfcp_qdio_fill_next(qdio, q_req, sg_virt(sg_req), sg_req->length);
	zfcp_qdio_fill_next(qdio, q_req, sg_virt(sg_resp), sg_resp->length);
	zfcp_qdio_set_sbale_last(qdio, q_req);
2009-07-13 21:06:06 +08:00
}

2008-12-19 23:57:01 +08:00
static int zfcp_fsf_setup_ct_els_sbals(struct zfcp_fsf_req *req,
				       struct scatterlist *sg_req,
2010-07-16 21:37:37 +08:00
				       struct scatterlist *sg_resp)
2008-07-02 16:56:39 +08:00
{
2009-08-18 21:43:18 +08:00
	struct zfcp_adapter *adapter = req->adapter;
2011-08-15 20:40:32 +08:00
	struct zfcp_qdio *qdio = adapter->qdio;
	struct fsf_qtcb *qtcb = req->qtcb;
2009-08-18 21:43:18 +08:00
	u32 feat = adapter->adapter_features;
2008-07-02 16:56:39 +08:00

2011-08-15 20:40:32 +08:00
	if (zfcp_adapter_multi_buffer_active(adapter)) {
		if (zfcp_qdio_sbals_from_sg(qdio, &req->qdio_req, sg_req))
			return -EIO;
2016-08-11 00:30:45 +08:00
		qtcb->bottom.support.req_buf_length =
			zfcp_qdio_real_bytes(sg_req);
2011-08-15 20:40:32 +08:00
		if (zfcp_qdio_sbals_from_sg(qdio, &req->qdio_req, sg_resp))
			return -EIO;
2016-08-11 00:30:45 +08:00
		qtcb->bottom.support.resp_buf_length =
			zfcp_qdio_real_bytes(sg_resp);
2008-12-19 23:57:01 +08:00

2017-07-28 18:30:47 +08:00
		zfcp_qdio_set_data_div(qdio, &req->qdio_req, sg_nents(sg_req));
2011-08-15 20:40:32 +08:00
		zfcp_qdio_set_sbale_last(qdio, &req->qdio_req);
		zfcp_qdio_set_scount(qdio, &req->qdio_req);
2009-07-13 21:06:06 +08:00
		return 0;
	}

	/* use single, unchained SBAL if it can hold the request */
2010-06-21 16:11:31 +08:00
	if (zfcp_qdio_sg_one_sbale(sg_req) && zfcp_qdio_sg_one_sbale(sg_resp)) {
2011-08-15 20:40:32 +08:00
		zfcp_fsf_setup_ct_els_unchained(qdio, &req->qdio_req,
2010-05-01 00:09:34 +08:00
						sg_req, sg_resp);
2008-12-19 23:57:01 +08:00
		return 0;
	}

2011-08-15 20:40:32 +08:00
	if (!(feat & FSF_FEATURE_ELS_CT_CHAINED_SBALS))
		return -EOPNOTSUPP;

	if (zfcp_qdio_sbals_from_sg(qdio, &req->qdio_req, sg_req))
2009-07-13 21:06:07 +08:00
		return -EIO;
2008-07-02 16:56:39 +08:00

2011-08-15 20:40:32 +08:00
	qtcb->bottom.support.req_buf_length = zfcp_qdio_real_bytes(sg_req);

	zfcp_qdio_set_sbale_last(qdio, &req->qdio_req);
	zfcp_qdio_skip_to_last_sbale(qdio, &req->qdio_req);

	if (zfcp_qdio_sbals_from_sg(qdio, &req->qdio_req, sg_resp))
2009-07-13 21:06:07 +08:00
		return -EIO;
2011-08-15 20:40:32 +08:00

	qtcb->bottom.support.resp_buf_length = zfcp_qdio_real_bytes(sg_resp);

	zfcp_qdio_set_sbale_last(qdio, &req->qdio_req);
2009-08-18 21:43:26 +08:00

2009-09-24 16:23:21 +08:00
	return 0;
}

static int zfcp_fsf_setup_ct_els(struct zfcp_fsf_req *req,
				 struct scatterlist *sg_req,
				 struct scatterlist *sg_resp,
2010-07-16 21:37:37 +08:00
				 unsigned int timeout)
2009-09-24 16:23:21 +08:00
{
	int ret;

2010-07-16 21:37:37 +08:00
	ret = zfcp_fsf_setup_ct_els_sbals(req, sg_req, sg_resp);
2009-09-24 16:23:21 +08:00
	if (ret)
		return ret;

2009-08-18 21:43:26 +08:00
	/* common settings for ct/gs and els requests */
2010-01-15 00:19:02 +08:00
	if (timeout > 255)
		timeout = 255; /* max value accepted by hardware */
2009-08-18 21:43:26 +08:00
	req->qtcb->bottom.support.service_class = FSF_CLASS_3;
2010-01-15 00:19:02 +08:00
	req->qtcb->bottom.support.timeout = timeout;
	zfcp_fsf_start_timer(req, (timeout + 10) * HZ);
2008-07-02 16:56:39 +08:00

	return 0;
2005-04-17 06:20:36 +08:00
}

/**
2008-07-02 16:56:39 +08:00
 * zfcp_fsf_send_ct - initiate a Generic Service request (FC-GS)
2018-11-08 22:44:54 +08:00
 * @wka_port: pointer to zfcp WKA port to send CT/GS to
2008-07-02 16:56:39 +08:00
 * @ct: pointer to struct zfcp_send_ct with data for request
 * @pool: if non-null this mempool is used to allocate struct zfcp_fsf_req
2018-11-08 22:44:54 +08:00
 * @timeout: timeout that hardware should use, and a later software timeout
2005-04-17 06:20:36 +08:00
 */
2009-11-24 23:54:13 +08:00
int zfcp_fsf_send_ct(struct zfcp_fc_wka_port *wka_port,
2010-01-15 00:19:02 +08:00
		     struct zfcp_fsf_ct_els *ct, mempool_t *pool,
		     unsigned int timeout)
2005-04-17 06:20:36 +08:00
{
2009-08-18 21:43:19 +08:00
	struct zfcp_qdio *qdio = wka_port->adapter->qdio;
2008-07-02 16:56:39 +08:00
	struct zfcp_fsf_req *req;
	int ret = -EIO;
2005-04-17 06:20:36 +08:00

2010-09-08 20:39:57 +08:00
	spin_lock_irq(&qdio->req_q_lock);
2010-05-01 00:09:35 +08:00
	if (zfcp_qdio_sbal_get(qdio))
2008-07-02 16:56:39 +08:00
		goto out;
2005-04-17 06:20:36 +08:00

2010-05-01 00:09:34 +08:00
	req = zfcp_fsf_req_create(qdio, FSF_QTCB_SEND_GENERIC,
2011-06-06 20:14:40 +08:00
				  SBAL_SFLAGS0_TYPE_WRITE_READ, pool);
2009-08-18 21:43:16 +08:00

2008-08-21 19:43:37 +08:00
	if (IS_ERR(req)) {
2008-07-02 16:56:39 +08:00
		ret = PTR_ERR(req);
		goto out;
2007-12-20 19:30:25 +08:00
	}

2009-08-18 21:43:16 +08:00
	req->status |= ZFCP_STATUS_FSFREQ_CLEANUP;
2010-07-16 21:37:37 +08:00
	ret = zfcp_fsf_setup_ct_els(req, ct->req, ct->resp, timeout);
2008-06-11 00:20:58 +08:00
	if (ret)
2005-04-17 06:20:36 +08:00
		goto failed_send;

2008-07-02 16:56:39 +08:00
	req->handler = zfcp_fsf_send_ct_handler;
2008-10-01 18:42:17 +08:00
	req->qtcb->header.port_handle = wka_port->handle;
2016-08-11 00:30:51 +08:00
	ct->d_id = wka_port->d_id;
2008-07-02 16:56:39 +08:00
	req->data = ct;

2010-12-02 22:16:13 +08:00
	zfcp_dbf_san_req("fssct_1", req, wka_port->d_id);
2005-04-17 06:20:36 +08:00

2008-07-02 16:56:39 +08:00
	ret = zfcp_fsf_req_send(req);
	if (ret)
		goto failed_send;
2019-07-03 05:02:00 +08:00
	/* NOTE: DO NOT TOUCH req PAST THIS POINT! */
2005-04-17 06:20:36 +08:00

2008-07-02 16:56:39 +08:00
	goto out;
2005-04-17 06:20:36 +08:00

2008-07-02 16:56:39 +08:00
failed_send:
	zfcp_fsf_req_free(req);
out:
2010-09-08 20:39:57 +08:00
	spin_unlock_irq(&qdio->req_q_lock);
2008-07-02 16:56:39 +08:00
	return ret;
2005-04-17 06:20:36 +08:00
}

2008-07-02 16:56:39 +08:00
static void zfcp_fsf_send_els_handler(struct zfcp_fsf_req *req)
2005-04-17 06:20:36 +08:00
{
2009-11-24 23:54:13 +08:00
	struct zfcp_fsf_ct_els *send_els = req->data;
2008-07-02 16:56:39 +08:00
	struct fsf_qtcb_header *header = &req->qtcb->header;
2005-04-17 06:20:36 +08:00

2008-07-02 16:56:39 +08:00
	send_els->status = -EINVAL;
2005-04-17 06:20:36 +08:00

2008-07-02 16:56:39 +08:00
	if (req->status & ZFCP_STATUS_FSFREQ_ERROR)
2005-04-17 06:20:36 +08:00
		goto skip_fsfstatus;

	switch (header->fsf_status) {
	case FSF_GOOD:
2008-07-02 16:56:39 +08:00
		send_els->status = 0;
2017-07-28 18:30:53 +08:00
		zfcp_dbf_san_res("fsselh1", req);
2005-04-17 06:20:36 +08:00
		break;
	case FSF_SERVICE_CLASS_NOT_SUPPORTED:
2008-07-02 16:56:39 +08:00
		zfcp_fsf_class_not_supp(req);
2005-04-17 06:20:36 +08:00
		break;
	case FSF_ADAPTER_STATUS_AVAILABLE:
		switch (header->fsf_status_qual.word[0]){
		case FSF_SQ_INVOKE_LINK_TEST_PROCEDURE:
		case FSF_SQ_ULP_DEPENDENT_ERP_REQUIRED:
		case FSF_SQ_RETRY_IF_POSSIBLE:
2008-07-02 16:56:39 +08:00
			req->status |= ZFCP_STATUS_FSFREQ_ERROR;
2005-04-17 06:20:36 +08:00
			break;
		}
		break;
	case FSF_ELS_COMMAND_REJECTED:
	case FSF_PAYLOAD_SIZE_MISMATCH:
	case FSF_REQUEST_SIZE_TOO_LARGE:
	case FSF_RESPONSE_SIZE_TOO_LARGE:
		break;
2008-07-02 16:56:39 +08:00
	case FSF_SBAL_MISMATCH:
2011-03-31 09:57:33 +08:00
		/* should never occur, avoided in zfcp_fsf_send_els */
2020-03-31 22:21:48 +08:00
		fallthrough;
2005-04-17 06:20:36 +08:00
	default:
2008-07-02 16:56:39 +08:00
		req->status |= ZFCP_STATUS_FSFREQ_ERROR;
2005-04-17 06:20:36 +08:00
		break;
	}
skip_fsfstatus:
2007-07-18 16:55:10 +08:00
	if (send_els->handler)
2005-04-17 06:20:36 +08:00
		send_els->handler(send_els->handler_data);
2008-07-02 16:56:39 +08:00
}
2005-04-17 06:20:36 +08:00

2008-07-02 16:56:39 +08:00
/**
 * zfcp_fsf_send_els - initiate an ELS command (FC-FS)
2018-11-08 22:44:54 +08:00
 * @adapter: pointer to zfcp adapter
 * @d_id: N_Port_ID to send ELS to
2008-07-02 16:56:39 +08:00
 * @els: pointer to struct zfcp_send_els with data for the command
2018-11-08 22:44:54 +08:00
 * @timeout: timeout that hardware should use, and a later software timeout
2008-07-02 16:56:39 +08:00
 */
2009-11-24 23:54:13 +08:00
int zfcp_fsf_send_els(struct zfcp_adapter *adapter, u32 d_id,
2010-01-15 00:19:02 +08:00
		      struct zfcp_fsf_ct_els *els, unsigned int timeout)
2008-07-02 16:56:39 +08:00
{
	struct zfcp_fsf_req *req;
2009-11-24 23:54:13 +08:00
	struct zfcp_qdio *qdio = adapter->qdio;
2008-07-02 16:56:39 +08:00
	int ret = -EIO;

[SCSI] zfcp: Change spin_lock_bh to spin_lock_irq to fix lockdep warning
With the change to use the data on the SCSI device, iterating through
all LUNs/scsi_devices takes the SCSI host_lock. This triggers warnings
from the lock dependency checker:
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.34.1 #97
---------------------------------------------------------
chchp/3224 just changed the state of lock:
(&(shost->host_lock)->rlock){-.-...}, at: [<00000000003a73f4>] __scsi_iterate_devices+0x38/0xbc
but this lock took another, HARDIRQ-unsafe lock in the past:
(&(&qdio->req_q_lock)->rlock){+.-...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
[ 24.972394] 2 locks held by chchp/3224:
#0: (&(sch->lock)->rlock){-.-...}, at: [<0000000000401efa>] do_IRQ+0xb2/0x1e4
#1: (&adapter->port_list_lock){.-....}, at: [<0000000000490302>] zfcp_erp_modify_adapter_status+0x9e/0x16c
[...]
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.34.1 #98
---------------------------------------------------------
chchp/3235 just changed the state of lock:
(&(shost->host_lock)->rlock){-.-...}, at: [<00000000003a73f4>] __scsi_iterate_devices+0x38/0xbc
but this lock took another, HARDIRQ-unsafe lock in the past:
(&(&qdio->stat_lock)->rlock){+.-...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
2 locks held by chchp/3235:
#0: (&(sch->lock)->rlock){-.-...}, at: [<0000000000401efa>] do_IRQ+0xb2/0x1e4
#1: (&adapter->port_list_lock){.-.-..}, at: [<00000000004902f6>] zfcp_erp_modify_adapter_status+0x9e/0x16c
[...]
To stop this warning, change the request queue lock to disable irqs,
not only softirq. The changes are required only outside of the
critical "send fcp command" path.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-09-08 20:39:57 +08:00
|
|
|
spin_lock_irq(&qdio->req_q_lock);
|
2010-05-01 00:09:35 +08:00
|
|
|
if (zfcp_qdio_sbal_get(qdio))
|
2008-07-02 16:56:39 +08:00
|
|
|
goto out;
|
2009-08-18 21:43:16 +08:00
|
|
|
|
2010-05-01 00:09:34 +08:00
|
|
|
req = zfcp_fsf_req_create(qdio, FSF_QTCB_SEND_ELS,
|
2011-06-06 20:14:40 +08:00
|
|
|
SBAL_SFLAGS0_TYPE_WRITE_READ, NULL);
|
2009-08-18 21:43:16 +08:00
|
|
|
|
2008-08-21 19:43:37 +08:00
|
|
|
if (IS_ERR(req)) {
|
2008-07-02 16:56:39 +08:00
|
|
|
ret = PTR_ERR(req);
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2009-08-18 21:43:16 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_CLEANUP;
|
2010-07-16 21:37:37 +08:00
|
|
|
|
2011-08-15 20:40:32 +08:00
|
|
|
if (!zfcp_adapter_multi_buffer_active(adapter))
|
|
|
|
zfcp_qdio_sbal_limit(qdio, &req->qdio_req, 2);
|
2010-07-16 21:37:37 +08:00
|
|
|
|
|
|
|
ret = zfcp_fsf_setup_ct_els(req, els->req, els->resp, timeout);
|
2008-10-01 18:42:16 +08:00
|
|
|
|
2008-07-02 16:56:39 +08:00
|
|
|
if (ret)
|
|
|
|
goto failed_send;
|
|
|
|
|
2009-11-24 23:54:13 +08:00
|
|
|
hton24(req->qtcb->bottom.support.d_id, d_id);
|
2008-07-02 16:56:39 +08:00
|
|
|
req->handler = zfcp_fsf_send_els_handler;
|
2016-08-11 00:30:51 +08:00
|
|
|
els->d_id = d_id;
|
2008-07-02 16:56:39 +08:00
|
|
|
req->data = els;
|
|
|
|
|
2010-12-02 22:16:13 +08:00
|
|
|
zfcp_dbf_san_req("fssels1", req, d_id);
|
2008-07-02 16:56:39 +08:00
|
|
|
|
|
|
|
ret = zfcp_fsf_req_send(req);
|
|
|
|
if (ret)
|
|
|
|
goto failed_send;
|
scsi: zfcp: fix request object use-after-free in send path causing seqno errors
With a recent change to our send path for FSF commands we introduced a
possible use-after-free of request-objects, that might further lead to
zfcp crafting bad requests, which the FCP channel correctly complains
about with an error (FSF_PROT_SEQ_NUMB_ERROR). This error is then handled
by an adapter-wide recovery.
The following sequence illustrates the possible use-after-free:
Send Path:
int zfcp_fsf_open_port(struct zfcp_erp_action *erp_action)
{
struct zfcp_fsf_req *req;
...
spin_lock_irq(&qdio->req_q_lock);
// ^^^^^^^^^^^^^^^^
// protects QDIO queue during sending
...
req = zfcp_fsf_req_create(qdio,
FSF_QTCB_OPEN_PORT_WITH_DID,
SBAL_SFLAGS0_TYPE_READ,
qdio->adapter->pool.erp_req);
// ^^^^^^^^^^^^^^^^^^^
// allocation of the request-object
...
retval = zfcp_fsf_req_send(req);
...
spin_unlock_irq(&qdio->req_q_lock);
return retval;
}
static int zfcp_fsf_req_send(struct zfcp_fsf_req *req)
{
struct zfcp_adapter *adapter = req->adapter;
struct zfcp_qdio *qdio = adapter->qdio;
...
zfcp_reqlist_add(adapter->req_list, req);
// ^^^^^^^^^^^^^^^^
// add request to our driver-internal hash-table for tracking
// (protected by separate lock req_list->lock)
...
if (zfcp_qdio_send(qdio, &req->qdio_req)) {
// ^^^^^^^^^^^^^^
// hand-off the request to FCP channel;
// the request can complete at any point now
...
}
/* Don't increase for unsolicited status */
if (!zfcp_fsf_req_is_status_read_buffer(req))
// ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
// possible use-after-free
adapter->fsf_req_seq_no++;
// ^^^^^^^^^^^^^^^^
// because of the use-after-free we might
// miss this accounting, and as follow-up
// this results in the FCP channel error
// FSF_PROT_SEQ_NUMB_ERROR
adapter->req_no++;
return 0;
}
static inline bool
zfcp_fsf_req_is_status_read_buffer(struct zfcp_fsf_req *req)
{
return req->qtcb == NULL;
// ^^^^^^^^^
// possible use-after-free
}
Response Path:
void zfcp_fsf_reqid_check(struct zfcp_qdio *qdio, int sbal_idx)
{
...
struct zfcp_fsf_req *fsf_req;
...
for (idx = 0; idx < QDIO_MAX_ELEMENTS_PER_BUFFER; idx++) {
...
fsf_req = zfcp_reqlist_find_rm(adapter->req_list,
req_id);
// ^^^^^^^^^^^^^^^^^^^^
// remove request from our driver-internal
// hash-table (lock req_list->lock)
...
zfcp_fsf_req_complete(fsf_req);
}
}
static void zfcp_fsf_req_complete(struct zfcp_fsf_req *req)
{
...
if (likely(req->status & ZFCP_STATUS_FSFREQ_CLEANUP))
zfcp_fsf_req_free(req);
// ^^^^^^^^^^^^^^^^^
// free memory for request-object
else
complete(&req->completion);
// ^^^^^^^^
// completion notification for code-paths that wait
// synchronously for the completion of the request; in
// those the memory is freed separately
}
The result of the use-after-free only affects the send path, and can not
lead to any data corruption. In case we miss the sequence-number
accounting, because the memory was already re-purposed, the next FSF
command will fail with said FCP channel error, and we will recover the
whole adapter. This causes no additional errors, but it slows down
traffic. There is a slight chance of the same thing happening again
recursively after the adapter recovery, but so far this has not been seen.
This was seen under z/VM, where the send path might run on a virtual CPU
that gets scheduled away by z/VM, while the return path might still run,
and so create the necessary timing. Running with KASAN can also slow down
the kernel sufficiently to run into this use-after-free, and then see the
report by KASAN.
To fix this, simply pull the test for the sequence-number accounting in
front of the hand-off to the FCP channel (this information doesn't change
during hand-off), but leave the sequence-number accounting itself where it
is.
To make future regressions of the same kind less likely, add comments to
all closely related code-paths.
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Fixes: f9eca0227600 ("scsi: zfcp: drop duplicate fsf_command from zfcp_fsf_req which is also in QTCB header")
Cc: <stable@vger.kernel.org> #5.0+
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
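The fix described in this commit message can be illustrated with a small userspace sketch. It is not the driver code: `struct demo_req`, `struct demo_adapter`, `demo_hand_off()` and `demo_req_send()` are hypothetical stand-ins, and the hand-off simply frees the request at once to model a completion path that wins the race. The point is only the ordering: the "is this a status read buffer" test is sampled into a local before the hand-off, while the sequence-number accounting stays after it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver structures; not the real zfcp types. */
struct demo_req {
	void *qtcb;			/* NULL marks a status read buffer */
};

struct demo_adapter {
	unsigned int fsf_req_seq_no;
	unsigned int req_no;
};

/*
 * Models the hand-off to the channel. To make the race deterministic,
 * the "response path" frees the request immediately.
 */
static int demo_hand_off(struct demo_req *req)
{
	free(req);
	return 0;
}

static int demo_req_send(struct demo_adapter *adapter, struct demo_req *req)
{
	/* Sample everything we need from req while we still own it. */
	const bool is_srb = (req->qtcb == NULL);

	if (demo_hand_off(req))
		return -1;

	/* req must not be dereferenced from here on; use the local copy. */
	if (!is_srb)
		adapter->fsf_req_seq_no++;
	adapter->req_no++;

	return 0;
}
```

Had the `req->qtcb` test been left after `demo_hand_off()`, it would read freed memory, which is the situation the commit message describes for the send path.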
2019-07-03 05:02:00 +08:00
|
|
|
/* NOTE: DO NOT TOUCH req PAST THIS POINT! */
|
2008-07-02 16:56:39 +08:00
|
|
|
|
|
|
|
goto out;
|
|
|
|
|
|
|
|
failed_send:
|
|
|
|
zfcp_fsf_req_free(req);
|
|
|
|
out:
|
2010-09-08 20:39:57 +08:00
|
|
|
spin_unlock_irq(&qdio->req_q_lock);
|
2008-07-02 16:56:39 +08:00
|
|
|
return ret;
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
|
|
|
|
2008-07-02 16:56:39 +08:00
|
|
|
int zfcp_fsf_exchange_config_data(struct zfcp_erp_action *erp_action)
|
2005-04-17 06:20:36 +08:00
|
|
|
{
|
2008-07-02 16:56:39 +08:00
|
|
|
struct zfcp_fsf_req *req;
|
2009-08-18 21:43:19 +08:00
|
|
|
struct zfcp_qdio *qdio = erp_action->adapter->qdio;
|
2008-07-02 16:56:39 +08:00
|
|
|
int retval = -EIO;
|
|
|
|
|
2010-09-08 20:39:57 +08:00
|
|
|
spin_lock_irq(&qdio->req_q_lock);
|
2010-05-01 00:09:35 +08:00
|
|
|
if (zfcp_qdio_sbal_get(qdio))
|
2008-07-02 16:56:39 +08:00
|
|
|
goto out;
|
2009-08-18 21:43:16 +08:00
|
|
|
|
2009-08-18 21:43:19 +08:00
|
|
|
req = zfcp_fsf_req_create(qdio, FSF_QTCB_EXCHANGE_CONFIG_DATA,
|
2011-06-06 20:14:40 +08:00
|
|
|
SBAL_SFLAGS0_TYPE_READ,
|
2009-08-18 21:43:19 +08:00
|
|
|
qdio->adapter->pool.erp_req);
|
2009-08-18 21:43:16 +08:00
|
|
|
|
2008-08-21 19:43:37 +08:00
|
|
|
if (IS_ERR(req)) {
|
2008-07-02 16:56:39 +08:00
|
|
|
retval = PTR_ERR(req);
|
|
|
|
goto out;
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
|
|
|
|
2009-08-18 21:43:16 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_CLEANUP;
|
2010-05-01 00:09:34 +08:00
|
|
|
zfcp_qdio_set_sbale_last(qdio, &req->qdio_req);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-07-02 16:56:39 +08:00
|
|
|
req->qtcb->bottom.config.feature_selection =
|
2006-01-05 16:56:47 +08:00
|
|
|
FSF_FEATURE_NOTIFICATION_LOST |
|
2019-10-26 00:12:46 +08:00
|
|
|
FSF_FEATURE_UPDATE_ALERT |
|
2020-03-13 01:45:04 +08:00
|
|
|
FSF_FEATURE_REQUEST_SFP_DATA |
|
|
|
|
FSF_FEATURE_FC_SECURITY;
|
2008-07-02 16:56:39 +08:00
|
|
|
req->erp_action = erp_action;
|
|
|
|
req->handler = zfcp_fsf_exchange_config_data_handler;
|
2010-02-17 18:18:49 +08:00
|
|
|
erp_action->fsf_req_id = req->req_id;
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-07-02 16:56:40 +08:00
|
|
|
zfcp_fsf_start_erp_timer(req);
|
2008-07-02 16:56:39 +08:00
|
|
|
retval = zfcp_fsf_req_send(req);
|
2005-04-17 06:20:36 +08:00
|
|
|
if (retval) {
|
2008-07-02 16:56:39 +08:00
|
|
|
zfcp_fsf_req_free(req);
|
2010-02-17 18:18:49 +08:00
|
|
|
erp_action->fsf_req_id = 0;
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
2019-07-03 05:02:00 +08:00
|
|
|
/* NOTE: DO NOT TOUCH req PAST THIS POINT! */
|
2008-07-02 16:56:39 +08:00
|
|
|
out:
|
2010-09-08 20:39:57 +08:00
|
|
|
spin_unlock_irq(&qdio->req_q_lock);
|
2007-08-28 15:31:09 +08:00
|
|
|
return retval;
|
|
|
|
}
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2019-10-26 00:12:43 +08:00
|
|
|
|
|
|
|
/**
|
|
|
|
* zfcp_fsf_exchange_config_data_sync() - Request information about FCP channel.
|
|
|
|
* @qdio: pointer to the QDIO-Queue to use for sending the command.
|
|
|
|
* @data: pointer to the QTCB-Bottom for storing the result of the command,
|
|
|
|
* might be %NULL.
|
|
|
|
*
|
|
|
|
* Returns:
|
|
|
|
* * 0 - Exchange Config Data was successful, @data is complete
|
|
|
|
* * -EIO - Exchange Config Data was not successful, @data is invalid
|
|
|
|
* * -EAGAIN - @data contains incomplete data
|
|
|
|
* * -ENOMEM - Some memory allocation failed along the way
|
|
|
|
*/
|
2009-08-18 21:43:19 +08:00
|
|
|
int zfcp_fsf_exchange_config_data_sync(struct zfcp_qdio *qdio,
|
2008-07-02 16:56:39 +08:00
|
|
|
struct fsf_qtcb_bottom_config *data)
|
2007-08-28 15:31:09 +08:00
|
|
|
{
|
2008-07-02 16:56:39 +08:00
|
|
|
struct zfcp_fsf_req *req = NULL;
|
|
|
|
int retval = -EIO;
|
|
|
|
|
2010-09-08 20:39:57 +08:00
|
|
|
spin_lock_irq(&qdio->req_q_lock);
|
2010-05-01 00:09:35 +08:00
|
|
|
if (zfcp_qdio_sbal_get(qdio))
|
2009-04-17 21:08:03 +08:00
|
|
|
goto out_unlock;
|
2008-07-02 16:56:39 +08:00
|
|
|
|
2010-05-01 00:09:34 +08:00
|
|
|
req = zfcp_fsf_req_create(qdio, FSF_QTCB_EXCHANGE_CONFIG_DATA,
|
2011-06-06 20:14:40 +08:00
|
|
|
SBAL_SFLAGS0_TYPE_READ, NULL);
|
2009-08-18 21:43:16 +08:00
|
|
|
|
2008-08-21 19:43:37 +08:00
|
|
|
if (IS_ERR(req)) {
|
2008-07-02 16:56:39 +08:00
|
|
|
retval = PTR_ERR(req);
|
2009-04-17 21:08:03 +08:00
|
|
|
goto out_unlock;
|
2007-08-28 15:31:09 +08:00
|
|
|
}
|
|
|
|
|
2010-05-01 00:09:34 +08:00
|
|
|
zfcp_qdio_set_sbale_last(qdio, &req->qdio_req);
|
2008-07-02 16:56:39 +08:00
|
|
|
req->handler = zfcp_fsf_exchange_config_data_handler;
|
2007-08-28 15:31:09 +08:00
|
|
|
|
2008-07-02 16:56:39 +08:00
|
|
|
req->qtcb->bottom.config.feature_selection =
|
2007-08-28 15:31:09 +08:00
|
|
|
FSF_FEATURE_NOTIFICATION_LOST |
|
2019-10-26 00:12:46 +08:00
|
|
|
FSF_FEATURE_UPDATE_ALERT |
|
2020-03-13 01:45:04 +08:00
|
|
|
FSF_FEATURE_REQUEST_SFP_DATA |
|
|
|
|
FSF_FEATURE_FC_SECURITY;
|
2007-08-28 15:31:09 +08:00
|
|
|
|
|
|
|
if (data)
|
2008-07-02 16:56:39 +08:00
|
|
|
req->data = data;
|
2007-08-28 15:31:09 +08:00
|
|
|
|
2008-07-02 16:56:39 +08:00
|
|
|
zfcp_fsf_start_timer(req, ZFCP_FSF_REQUEST_TIMEOUT);
|
|
|
|
retval = zfcp_fsf_req_send(req);
|
2010-09-08 20:39:57 +08:00
|
|
|
spin_unlock_irq(&qdio->req_q_lock);
|
2019-10-26 00:12:43 +08:00
|
|
|
|
2019-07-03 05:02:00 +08:00
|
|
|
if (!retval) {
|
|
|
|
/* NOTE: ONLY TOUCH SYNC req AGAIN ON req->completion. */
|
2009-08-18 21:43:14 +08:00
|
|
|
wait_for_completion(&req->completion);
|
2019-10-26 00:12:43 +08:00
|
|
|
|
|
|
|
if (req->status &
|
|
|
|
(ZFCP_STATUS_FSFREQ_ERROR | ZFCP_STATUS_FSFREQ_DISMISSED))
|
|
|
|
retval = -EIO;
|
|
|
|
else if (req->status & ZFCP_STATUS_FSFREQ_XDATAINCOMPLETE)
|
|
|
|
retval = -EAGAIN;
|
scsi: zfcp: fix request object use-after-free in send path causing seqno errors
With a recent change to our send path for FSF commands we introduced a
possible use-after-free of request-objects, that might further lead to
zfcp crafting bad requests, which the FCP channel correctly complains
about with an error (FSF_PROT_SEQ_NUMB_ERROR). This error is then handled
by an adapter-wide recovery.
The following sequence illustrates the possible use-after-free:
Send Path:
int zfcp_fsf_open_port(struct zfcp_erp_action *erp_action)
{
struct zfcp_fsf_req *req;
...
spin_lock_irq(&qdio->req_q_lock);
// ^^^^^^^^^^^^^^^^
// protects QDIO queue during sending
...
req = zfcp_fsf_req_create(qdio,
FSF_QTCB_OPEN_PORT_WITH_DID,
SBAL_SFLAGS0_TYPE_READ,
qdio->adapter->pool.erp_req);
// ^^^^^^^^^^^^^^^^^^^
// allocation of the request-object
...
retval = zfcp_fsf_req_send(req);
...
spin_unlock_irq(&qdio->req_q_lock);
return retval;
}
static int zfcp_fsf_req_send(struct zfcp_fsf_req *req)
{
struct zfcp_adapter *adapter = req->adapter;
struct zfcp_qdio *qdio = adapter->qdio;
...
zfcp_reqlist_add(adapter->req_list, req);
// ^^^^^^^^^^^^^^^^
// add request to our driver-internal hash-table for tracking
// (protected by separate lock req_list->lock)
...
if (zfcp_qdio_send(qdio, &req->qdio_req)) {
// ^^^^^^^^^^^^^^
// hand-off the request to FCP channel;
// the request can complete at any point now
...
}
/* Don't increase for unsolicited status */
if (!zfcp_fsf_req_is_status_read_buffer(req))
// ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
// possible use-after-free
adapter->fsf_req_seq_no++;
// ^^^^^^^^^^^^^^^^
// because of the use-after-free we might
// miss this accounting, and as follow-up
// this results in the FCP channel error
// FSF_PROT_SEQ_NUMB_ERROR
adapter->req_no++;
return 0;
}
static inline bool
zfcp_fsf_req_is_status_read_buffer(struct zfcp_fsf_req *req)
{
return req->qtcb == NULL;
// ^^^^^^^^^
// possible use-after-free
}
Response Path:
void zfcp_fsf_reqid_check(struct zfcp_qdio *qdio, int sbal_idx)
{
...
struct zfcp_fsf_req *fsf_req;
...
for (idx = 0; idx < QDIO_MAX_ELEMENTS_PER_BUFFER; idx++) {
...
fsf_req = zfcp_reqlist_find_rm(adapter->req_list,
req_id);
// ^^^^^^^^^^^^^^^^^^^^
// remove request from our driver-internal
// hash-table (lock req_list->lock)
...
zfcp_fsf_req_complete(fsf_req);
}
}
static void zfcp_fsf_req_complete(struct zfcp_fsf_req *req)
{
...
if (likely(req->status & ZFCP_STATUS_FSFREQ_CLEANUP))
zfcp_fsf_req_free(req);
// ^^^^^^^^^^^^^^^^^
// free memory for request-object
else
complete(&req->completion);
// ^^^^^^^^
// completion notification for code-paths that wait
// synchronously for the completion of the request; in
// those the memory is freed separately
}
The result of the use-after-free only affects the send path, and cannot
lead to any data corruption. If we miss the sequence-number
accounting because the memory was already re-purposed, the next FSF
command will fail with said FCP channel error, and we will recover the
whole adapter. This causes no additional errors, but it slows down
traffic. There is a slight chance of the same thing happening again
recursively after the adapter recovery, but so far this has not been seen.
This was seen under z/VM, where the send path might run on a virtual CPU
that gets scheduled away by z/VM, while the return path might still run,
and so create the necessary timing. Running with KASAN can also slow down
the kernel sufficiently to run into this use-after-free, and then see the
report by KASAN.
To fix this, simply pull the test for the sequence-number accounting in
front of the hand-off to the FCP channel (this information doesn't change
during hand-off), but leave the sequence-number accounting itself where it
is.
To make future regressions of the same kind less likely, add comments to
all closely related code-paths.
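The fix described above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the driver code: every `sketch_*` name is hypothetical, the hand-off stand-in frees the request immediately to model the worst-case timing, and no locking or request list is modelled. The point it shows is that the status-read-buffer test is evaluated while the sender still owns the request, and only the cached result is used after the hand-off.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the driver structures. */
struct sketch_req {
	void *qtcb;			/* NULL marks a status read buffer */
};

struct sketch_adapter {
	unsigned int fsf_req_seq_no;	/* sequence-number accounting */
	unsigned int req_no;
};

/* Allocate a request, optionally with a (dummy) QTCB attached. */
static struct sketch_req *sketch_req_create(bool with_qtcb)
{
	struct sketch_req *req = malloc(sizeof(*req));

	if (!req)
		return NULL;
	req->qtcb = with_qtcb ? malloc(16) : NULL;
	if (with_qtcb && !req->qtcb) {
		free(req);
		return NULL;
	}
	return req;
}

static bool sketch_req_is_status_read_buffer(const struct sketch_req *req)
{
	return req->qtcb == NULL;
}

/*
 * Stand-in for the hand-off to the channel. To model the worst case,
 * the request "completes" and is freed before this function returns.
 */
static int sketch_hand_off(struct sketch_req *req)
{
	free(req->qtcb);
	free(req);
	return 0;
}

static int sketch_req_send(struct sketch_adapter *adapter,
			   struct sketch_req *req)
{
	bool is_srb;

	assert(req != NULL);

	/* Evaluate the test while we still own the request. */
	is_srb = sketch_req_is_status_read_buffer(req);

	if (sketch_hand_off(req))
		return -1;

	/* req must not be dereferenced past this point. */
	if (!is_srb)
		adapter->fsf_req_seq_no++;
	adapter->req_no++;
	return 0;
}
```

Calling `sketch_req_send()` once with a request that has a QTCB and once with one that does not should advance `req_no` twice but `fsf_req_seq_no` only once, without ever reading freed memory.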
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Fixes: f9eca0227600 ("scsi: zfcp: drop duplicate fsf_command from zfcp_fsf_req which is also in QTCB header")
Cc: <stable@vger.kernel.org> #5.0+
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-03 05:02:00 +08:00
|
|
|
}
|
2007-08-28 15:31:09 +08:00
|
|
|
|
2008-07-02 16:56:39 +08:00
|
|
|
zfcp_fsf_req_free(req);
|
2009-04-17 21:08:03 +08:00
|
|
|
return retval;
|
2007-08-28 15:31:09 +08:00
|
|
|
|
2009-04-17 21:08:03 +08:00
|
|
|
out_unlock:
|
[SCSI] zfcp: Change spin_lock_bh to spin_lock_irq to fix lockdep warning
With the change to use the data on the SCSI device, iterating through
all LUNs/scsi_devices takes the SCSI host_lock. This triggers warnings
from the lock dependency checker:
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.34.1 #97
---------------------------------------------------------
chchp/3224 just changed the state of lock:
(&(shost->host_lock)->rlock){-.-...}, at: [<00000000003a73f4>] __scsi_iterate_devices+0x38/0xbc
but this lock took another, HARDIRQ-unsafe lock in the past:
(&(&qdio->req_q_lock)->rlock){+.-...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
2 locks held by chchp/3224:
#0: (&(sch->lock)->rlock){-.-...}, at: [<0000000000401efa>] do_IRQ+0xb2/0x1e4
#1: (&adapter->port_list_lock){.-....}, at: [<0000000000490302>] zfcp_erp_modify_adapter_status+0x9e/0x16c
[...]
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.34.1 #98
---------------------------------------------------------
chchp/3235 just changed the state of lock:
(&(shost->host_lock)->rlock){-.-...}, at: [<00000000003a73f4>] __scsi_iterate_devices+0x38/0xbc
but this lock took another, HARDIRQ-unsafe lock in the past:
(&(&qdio->stat_lock)->rlock){+.-...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
2 locks held by chchp/3235:
#0: (&(sch->lock)->rlock){-.-...}, at: [<0000000000401efa>] do_IRQ+0xb2/0x1e4
#1: (&adapter->port_list_lock){.-.-..}, at: [<00000000004902f6>] zfcp_erp_modify_adapter_status+0x9e/0x16c
[...]
To stop this warning, change the request queue lock to disable irqs,
not only softirq. The changes are required only outside of the
critical "send fcp command" path.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-09-08 20:39:57 +08:00
|
|
|
spin_unlock_irq(&qdio->req_q_lock);
|
2005-04-17 06:20:36 +08:00
|
|
|
return retval;
|
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* zfcp_fsf_exchange_port_data - request information about local port
|
2005-09-14 03:51:16 +08:00
|
|
|
* @erp_action: ERP action for the adapter for which port data is requested
|
2008-07-02 16:56:39 +08:00
|
|
|
* Returns: 0 on success, error otherwise
|
2005-04-17 06:20:36 +08:00
|
|
|
*/
|
2008-07-02 16:56:39 +08:00
|
|
|
int zfcp_fsf_exchange_port_data(struct zfcp_erp_action *erp_action)
|
2005-04-17 06:20:36 +08:00
|
|
|
{
|
2009-08-18 21:43:19 +08:00
|
|
|
struct zfcp_qdio *qdio = erp_action->adapter->qdio;
|
2008-07-02 16:56:39 +08:00
|
|
|
struct zfcp_fsf_req *req;
|
|
|
|
int retval = -EIO;
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2009-08-18 21:43:19 +08:00
|
|
|
if (!(qdio->adapter->adapter_features & FSF_FEATURE_HBAAPI_MANAGEMENT))
|
2007-08-28 15:31:09 +08:00
|
|
|
return -EOPNOTSUPP;
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2010-09-08 20:39:57 +08:00
|
|
|
spin_lock_irq(&qdio->req_q_lock);
|
2010-05-01 00:09:35 +08:00
|
|
|
if (zfcp_qdio_sbal_get(qdio))
|
2008-07-02 16:56:39 +08:00
|
|
|
goto out;
|
2009-08-18 21:43:16 +08:00
|
|
|
|
2009-08-18 21:43:19 +08:00
|
|
|
req = zfcp_fsf_req_create(qdio, FSF_QTCB_EXCHANGE_PORT_DATA,
|
2011-06-06 20:14:40 +08:00
|
|
|
SBAL_SFLAGS0_TYPE_READ,
|
2009-08-18 21:43:19 +08:00
|
|
|
qdio->adapter->pool.erp_req);
|
2009-08-18 21:43:16 +08:00
|
|
|
|
2008-08-21 19:43:37 +08:00
|
|
|
if (IS_ERR(req)) {
|
2008-07-02 16:56:39 +08:00
|
|
|
retval = PTR_ERR(req);
|
|
|
|
goto out;
|
2005-09-14 03:51:16 +08:00
|
|
|
}
|
|
|
|
|
2009-08-18 21:43:16 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_CLEANUP;
|
2010-05-01 00:09:34 +08:00
|
|
|
zfcp_qdio_set_sbale_last(qdio, &req->qdio_req);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-07-02 16:56:39 +08:00
|
|
|
req->handler = zfcp_fsf_exchange_port_data_handler;
|
|
|
|
req->erp_action = erp_action;
|
2010-02-17 18:18:49 +08:00
|
|
|
erp_action->fsf_req_id = req->req_id;
|
2007-08-28 15:31:09 +08:00
|
|
|
|
2008-07-02 16:56:40 +08:00
|
|
|
zfcp_fsf_start_erp_timer(req);
|
2008-07-02 16:56:39 +08:00
|
|
|
retval = zfcp_fsf_req_send(req);
|
2005-04-17 06:20:36 +08:00
|
|
|
if (retval) {
|
2008-07-02 16:56:39 +08:00
|
|
|
zfcp_fsf_req_free(req);
|
2010-02-17 18:18:49 +08:00
|
|
|
erp_action->fsf_req_id = 0;
|
2007-08-28 15:31:09 +08:00
|
|
|
}
|
2019-07-03 05:02:00 +08:00
|
|
|
/* NOTE: DO NOT TOUCH req PAST THIS POINT! */
|
2008-07-02 16:56:39 +08:00
|
|
|
out:
|
2010-09-08 20:39:57 +08:00
|
|
|
spin_unlock_irq(&qdio->req_q_lock);
|
2007-08-28 15:31:09 +08:00
|
|
|
return retval;
|
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
2019-10-26 00:12:43 +08:00
|
|
|
* zfcp_fsf_exchange_port_data_sync() - Request information about local port.
|
|
|
|
* @qdio: pointer to the QDIO-Queue to use for sending the command.
|
|
|
|
* @data: pointer to the QTCB-Bottom for storing the result of the command,
|
|
|
|
* might be %NULL.
|
|
|
|
*
|
|
|
|
* Returns:
|
|
|
|
* * 0 - Exchange Port Data was successful, @data is complete
|
|
|
|
* * -EIO - Exchange Port Data was not successful, @data is invalid
|
|
|
|
* * -EAGAIN - @data contains incomplete data
|
|
|
|
* * -ENOMEM - Some memory allocation failed along the way
|
|
|
|
* * -EOPNOTSUPP - This operation is not supported
|
2007-08-28 15:31:09 +08:00
|
|
|
*/
|
2009-08-18 21:43:19 +08:00
|
|
|
int zfcp_fsf_exchange_port_data_sync(struct zfcp_qdio *qdio,
|
2008-07-02 16:56:39 +08:00
|
|
|
struct fsf_qtcb_bottom_port *data)
|
2007-08-28 15:31:09 +08:00
|
|
|
{
|
2008-07-02 16:56:39 +08:00
|
|
|
struct zfcp_fsf_req *req = NULL;
|
|
|
|
int retval = -EIO;
|
2007-08-28 15:31:09 +08:00
|
|
|
|
2009-08-18 21:43:19 +08:00
|
|
|
if (!(qdio->adapter->adapter_features & FSF_FEATURE_HBAAPI_MANAGEMENT))
|
2007-08-28 15:31:09 +08:00
|
|
|
return -EOPNOTSUPP;
|
|
|
|
|
2010-09-08 20:39:57 +08:00
|
|
|
spin_lock_irq(&qdio->req_q_lock);
|
2010-05-01 00:09:35 +08:00
|
|
|
if (zfcp_qdio_sbal_get(qdio))
|
2009-04-17 21:08:03 +08:00
|
|
|
goto out_unlock;
|
2008-07-02 16:56:39 +08:00
|
|
|
|
2010-05-01 00:09:34 +08:00
|
|
|
req = zfcp_fsf_req_create(qdio, FSF_QTCB_EXCHANGE_PORT_DATA,
|
2011-06-06 20:14:40 +08:00
|
|
|
SBAL_SFLAGS0_TYPE_READ, NULL);
|
2009-08-18 21:43:16 +08:00
|
|
|
|
2008-08-21 19:43:37 +08:00
|
|
|
if (IS_ERR(req)) {
|
2008-07-02 16:56:39 +08:00
|
|
|
retval = PTR_ERR(req);
|
2009-04-17 21:08:03 +08:00
|
|
|
goto out_unlock;
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
|
|
|
|
2007-08-28 15:31:09 +08:00
|
|
|
if (data)
|
2008-07-02 16:56:39 +08:00
|
|
|
req->data = data;
|
2007-08-28 15:31:09 +08:00
|
|
|
|
2010-05-01 00:09:34 +08:00
|
|
|
zfcp_qdio_set_sbale_last(qdio, &req->qdio_req);
|
2007-08-28 15:31:09 +08:00
|
|
|
|
2008-07-02 16:56:39 +08:00
|
|
|
req->handler = zfcp_fsf_exchange_port_data_handler;
|
|
|
|
zfcp_fsf_start_timer(req, ZFCP_FSF_REQUEST_TIMEOUT);
|
|
|
|
retval = zfcp_fsf_req_send(req);
|
2010-09-08 20:39:57 +08:00
|
|
|
spin_unlock_irq(&qdio->req_q_lock);
|
2009-04-17 21:08:03 +08:00
|
|
|
|
2019-07-03 05:02:00 +08:00
|
|
|
if (!retval) {
|
|
|
|
/* NOTE: ONLY TOUCH SYNC req AGAIN ON req->completion. */
|
2009-08-18 21:43:14 +08:00
|
|
|
wait_for_completion(&req->completion);
|
2019-10-26 00:12:43 +08:00
|
|
|
|
|
|
|
if (req->status &
|
|
|
|
(ZFCP_STATUS_FSFREQ_ERROR | ZFCP_STATUS_FSFREQ_DISMISSED))
|
|
|
|
retval = -EIO;
|
|
|
|
else if (req->status & ZFCP_STATUS_FSFREQ_XDATAINCOMPLETE)
|
|
|
|
retval = -EAGAIN;
|
scsi: zfcp: fix request object use-after-free in send path causing seqno errors
With a recent change to our send path for FSF commands we introduced a
possible use-after-free of request-objects, that might further lead to
zfcp crafting bad requests, which the FCP channel correctly complains
about with an error (FSF_PROT_SEQ_NUMB_ERROR). This error is then handled
by an adapter-wide recovery.
The following sequence illustrates the possible use-after-free:
Send Path:
int zfcp_fsf_open_port(struct zfcp_erp_action *erp_action)
{
struct zfcp_fsf_req *req;
...
spin_lock_irq(&qdio->req_q_lock);
// ^^^^^^^^^^^^^^^^
// protects QDIO queue during sending
...
req = zfcp_fsf_req_create(qdio,
FSF_QTCB_OPEN_PORT_WITH_DID,
SBAL_SFLAGS0_TYPE_READ,
qdio->adapter->pool.erp_req);
// ^^^^^^^^^^^^^^^^^^^
// allocation of the request-object
...
retval = zfcp_fsf_req_send(req);
...
spin_unlock_irq(&qdio->req_q_lock);
return retval;
}
static int zfcp_fsf_req_send(struct zfcp_fsf_req *req)
{
struct zfcp_adapter *adapter = req->adapter;
struct zfcp_qdio *qdio = adapter->qdio;
...
zfcp_reqlist_add(adapter->req_list, req);
// ^^^^^^^^^^^^^^^^
// add request to our driver-internal hash-table for tracking
// (protected by separate lock req_list->lock)
...
if (zfcp_qdio_send(qdio, &req->qdio_req)) {
// ^^^^^^^^^^^^^^
// hand-off the request to FCP channel;
// the request can complete at any point now
...
}
/* Don't increase for unsolicited status */
if (!zfcp_fsf_req_is_status_read_buffer(req))
// ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
// possible use-after-free
adapter->fsf_req_seq_no++;
// ^^^^^^^^^^^^^^^^
// because of the use-after-free we might
// miss this accounting, and as follow-up
// this results in the FCP channel error
// FSF_PROT_SEQ_NUMB_ERROR
adapter->req_no++;
return 0;
}
static inline bool
zfcp_fsf_req_is_status_read_buffer(struct zfcp_fsf_req *req)
{
return req->qtcb == NULL;
// ^^^^^^^^^
// possible use-after-free
}
Response Path:

void zfcp_fsf_reqid_check(struct zfcp_qdio *qdio, int sbal_idx)
{
        ...
        struct zfcp_fsf_req *fsf_req;
        ...
        for (idx = 0; idx < QDIO_MAX_ELEMENTS_PER_BUFFER; idx++) {
                ...
                fsf_req = zfcp_reqlist_find_rm(adapter->req_list,
                                               req_id);
//                        ^^^^^^^^^^^^^^^^^^^^
//                        remove request from our driver-internal
//                        hash-table (lock req_list->lock)
                ...
                zfcp_fsf_req_complete(fsf_req);
        }
}

static void zfcp_fsf_req_complete(struct zfcp_fsf_req *req)
{
        ...
        if (likely(req->status & ZFCP_STATUS_FSFREQ_CLEANUP))
                zfcp_fsf_req_free(req);
//              ^^^^^^^^^^^^^^^^^
//              free memory for request-object
        else
                complete(&req->completion);
//              ^^^^^^^^
//              completion notification for code-paths that wait
//              synchronously for the completion of the request; in
//              those the memory is freed separately
}
The result of the use-after-free only affects the send path, and cannot
lead to any data corruption. In case we miss the sequence-number
accounting, because the memory was already re-purposed, the next FSF
command will fail with said FCP channel error, and we will recover the
whole adapter. This causes no additional errors, but it slows down
traffic. There is a slight chance of the same thing happening again
recursively after the adapter recovery, but so far this has not been seen.
This was seen under z/VM, where the send path might run on a virtual CPU
that gets scheduled away by z/VM, while the return path might still run,
and so create the necessary timing. Running with KASAN can also slow down
the kernel sufficiently to run into this use-after-free, and then see the
report by KASAN.
To fix this, simply pull the test for the sequence-number accounting in
front of the hand-off to the FCP channel (this information doesn't change
during hand-off), but leave the sequence-number accounting itself where it
is.
To make future regressions of the same kind less likely, add comments to
all closely related code-paths.
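As a rough userspace illustration of that reordering (a sketch only, not the driver code): the names `sketch_req`, `sketch_adapter`, `sketch_req_send()`, `sketch_handoff_and_free()` and `sketch_demo()` are hypothetical stand-ins, and the hand-off deliberately frees the request to mimic a response that completes immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's structures; not the real zfcp types. */
struct sketch_req {
	void *qtcb;			/* NULL marks a status read buffer */
};

struct sketch_adapter {
	unsigned int fsf_req_seq_no;
	unsigned int req_no;
};

/* Hypothetical hand-off; once it returns 0 the request may already be freed. */
typedef int (*sketch_handoff_fn)(struct sketch_req *req);

static int sketch_req_send(struct sketch_adapter *adapter,
			   struct sketch_req *req,
			   sketch_handoff_fn handoff)
{
	/* Read everything needed from req BEFORE the hand-off. */
	const bool is_srb = (req->qtcb == NULL);

	if (handoff(req))
		return -1;		/* hand-off failed */

	/* req must not be dereferenced from here on. */
	if (!is_srb)
		adapter->fsf_req_seq_no++;
	adapter->req_no++;
	return 0;
}

/* Mimics a response path that completes and frees the request right away. */
static int sketch_handoff_and_free(struct sketch_req *req)
{
	free(req);
	return 0;
}

/* Sends one request and returns the resulting sequence number (-1 on error). */
static int sketch_demo(bool with_qtcb)
{
	static int dummy_qtcb;
	struct sketch_adapter adapter = { 0, 0 };
	struct sketch_req *req = malloc(sizeof(*req));

	if (!req)
		return -1;
	req->qtcb = with_qtcb ? (void *)&dummy_qtcb : NULL;
	if (sketch_req_send(&adapter, req, sketch_handoff_and_free))
		return -1;
	assert(adapter.req_no == 1);
	return (int)adapter.fsf_req_seq_no;
}
```

In this sketch `sketch_demo(true)` models a request with a QTCB and ends with a sequence number of 1, while `sketch_demo(false)` models a status read buffer and leaves it at 0; in both cases the freed request is never read after the hand-off.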
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Fixes: f9eca0227600 ("scsi: zfcp: drop duplicate fsf_command from zfcp_fsf_req which is also in QTCB header")
Cc: <stable@vger.kernel.org> #5.0+
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-03 05:02:00 +08:00
	}
2009-08-18 21:43:14 +08:00

2008-07-02 16:56:39 +08:00
	zfcp_fsf_req_free(req);
2005-04-17 06:20:36 +08:00
	return retval;
2009-04-17 21:08:03 +08:00

out_unlock:
[SCSI] zfcp: Change spin_lock_bh to spin_lock_irq to fix lockdep warning
With the change to use the data on the SCSI device, iterating through
all LUNs/scsi_devices takes the SCSI host_lock. This triggers warnings
from the lock dependency checker:
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.34.1 #97
---------------------------------------------------------
chchp/3224 just changed the state of lock:
(&(shost->host_lock)->rlock){-.-...}, at: [<00000000003a73f4>] __scsi_iterate_devices+0x38/0xbc
but this lock took another, HARDIRQ-unsafe lock in the past:
(&(&qdio->req_q_lock)->rlock){+.-...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this: [ 24.972394] 2 locks held by chchp/3224:
#0: (&(sch->lock)->rlock){-.-...}, at: [<0000000000401efa>] do_IRQ+0xb2/0x1e4
#1: (&adapter->port_list_lock){.-....}, at: [<0000000000490302>] zfcp_erp_modify_adapter_status+0x9e/0x16c
[...]
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.34.1 #98
---------------------------------------------------------
chchp/3235 just changed the state of lock:
(&(shost->host_lock)->rlock){-.-...}, at: [<00000000003a73f4>] __scsi_iterate_devices+0x38/0xbc
but this lock took another, HARDIRQ-unsafe lock in the past:
(&(&qdio->stat_lock)->rlock){+.-...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
2 locks held by chchp/3235:
#0: (&(sch->lock)->rlock){-.-...}, at: [<0000000000401efa>] do_IRQ+0xb2/0x1e4
#1: (&adapter->port_list_lock){.-.-..}, at: [<00000000004902f6>] zfcp_erp_modify_adapter_status+0x9e/0x16c
[...]
To stop this warning, change the request queue lock to disable irqs,
not only softirq. The changes are required only outside of the
critical "send fcp command" path.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-09-08 20:39:57 +08:00
	spin_unlock_irq(&qdio->req_q_lock);
2009-04-17 21:08:03 +08:00
	return retval;
2005-04-17 06:20:36 +08:00
}

2020-03-13 01:45:03 +08:00
static void zfcp_fsf_log_port_fc_security(struct zfcp_port *port,
					  struct zfcp_fsf_req *req)
2020-03-13 01:45:02 +08:00
{
	char mnemonic_old[ZFCP_FSF_MAX_FC_SECURITY_MNEMONIC_LENGTH];
	char mnemonic_new[ZFCP_FSF_MAX_FC_SECURITY_MNEMONIC_LENGTH];

	if (port->connection_info == port->connection_info_old) {
2020-03-13 01:45:03 +08:00
		/* no change, no log nor trace */
2020-03-13 01:45:02 +08:00
		return;
	}

2020-03-13 01:45:03 +08:00
	zfcp_dbf_hba_fsf_fces("fsfcesp", req, port->wwpn,
			      port->connection_info_old,
			      port->connection_info);

2020-03-13 01:45:02 +08:00
	zfcp_fsf_scnprint_fc_security(mnemonic_old, sizeof(mnemonic_old),
				      port->connection_info_old,
				      ZFCP_FSF_PRINT_FMT_SINGLEITEM);
	zfcp_fsf_scnprint_fc_security(mnemonic_new, sizeof(mnemonic_new),
				      port->connection_info,
				      ZFCP_FSF_PRINT_FMT_SINGLEITEM);

	if (strncmp(mnemonic_old, mnemonic_new,
		    ZFCP_FSF_MAX_FC_SECURITY_MNEMONIC_LENGTH) == 0) {
		/* no change in string representation, no log */
		goto out;
	}

	if (port->connection_info_old == 0) {
		/* activation */
		dev_info(&port->adapter->ccw_device->dev,
			 "FC Endpoint Security of connection to remote port 0x%16llx enabled: %s\n",
			 port->wwpn, mnemonic_new);
	} else if (port->connection_info == 0) {
		/* deactivation */
		dev_warn(&port->adapter->ccw_device->dev,
			 "FC Endpoint Security of connection to remote port 0x%16llx disabled: was %s\n",
			 port->wwpn, mnemonic_old);
	} else {
		/* change */
		dev_warn(&port->adapter->ccw_device->dev,
			 "FC Endpoint Security of connection to remote port 0x%16llx changed: from %s to %s\n",
			 port->wwpn, mnemonic_old, mnemonic_new);
	}

out:
	port->connection_info_old = port->connection_info;
}
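
The three log cases in the function above reduce to a small decision on the old and new `connection_info` values. A minimal sketch, assuming 32-bit values and using hypothetical names (`sketch_fces_event`, `sketch_classify_fces()`) that do not exist in the driver; it leaves out the mnemonic string comparison that additionally suppresses the log message:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical labels for the log cases; not identifiers from the driver. */
enum sketch_fces_event {
	SKETCH_FCES_UNCHANGED,	/* no log, no trace */
	SKETCH_FCES_ENABLED,	/* old value was 0: activation */
	SKETCH_FCES_DISABLED,	/* new value is 0: deactivation */
	SKETCH_FCES_CHANGED,	/* both non-zero and different */
};

/* Classify a transition of the (assumed 32-bit) connection info value. */
static enum sketch_fces_event sketch_classify_fces(uint32_t old_info,
						   uint32_t new_info)
{
	if (old_info == new_info)
		return SKETCH_FCES_UNCHANGED;
	if (old_info == 0)
		return SKETCH_FCES_ENABLED;
	if (new_info == 0)
		return SKETCH_FCES_DISABLED;
	return SKETCH_FCES_CHANGED;
}

/* Small self-check of the four cases, using arbitrary example values. */
static void sketch_fces_selfcheck(void)
{
	assert(sketch_classify_fces(0, 0) == SKETCH_FCES_UNCHANGED);
	assert(sketch_classify_fces(0, 2) == SKETCH_FCES_ENABLED);
	assert(sketch_classify_fces(2, 0) == SKETCH_FCES_DISABLED);
	assert(sketch_classify_fces(1, 2) == SKETCH_FCES_CHANGED);
}
```

The order of the checks mirrors the function: equality is tested first, so the activation and deactivation branches only ever see values that differ.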

2020-03-13 01:45:05 +08:00
static void zfcp_fsf_log_security_error(const struct device *dev, u32 fsf_sqw0,
					u64 wwpn)
{
	switch (fsf_sqw0) {

	/*
	 * Open Port command error codes
	 */

	case FSF_SQ_SECURITY_REQUIRED:
		dev_warn_ratelimited(dev,
				     "FC Endpoint Security error: FC security is required but not supported or configured on remote port 0x%016llx\n",
				     wwpn);
		break;
	case FSF_SQ_SECURITY_TIMEOUT:
		dev_warn_ratelimited(dev,
				     "FC Endpoint Security error: a timeout prevented opening remote port 0x%016llx\n",
				     wwpn);
		break;
	case FSF_SQ_SECURITY_KM_UNAVAILABLE:
		dev_warn_ratelimited(dev,
				     "FC Endpoint Security error: opening remote port 0x%016llx failed because local and external key manager cannot communicate\n",
				     wwpn);
		break;
	case FSF_SQ_SECURITY_RKM_UNAVAILABLE:
		dev_warn_ratelimited(dev,
				     "FC Endpoint Security error: opening remote port 0x%016llx failed because it cannot communicate with the external key manager\n",
				     wwpn);
		break;
	case FSF_SQ_SECURITY_AUTH_FAILURE:
		dev_warn_ratelimited(dev,
				     "FC Endpoint Security error: the device could not verify the identity of remote port 0x%016llx\n",
				     wwpn);
		break;

	/*
	 * Send FCP command error codes
	 */

	case FSF_SQ_SECURITY_ENC_FAILURE:
		dev_warn_ratelimited(dev,
				     "FC Endpoint Security error: FC connection to remote port 0x%016llx closed because encryption broke down\n",
				     wwpn);
		break;

	/*
	 * Unknown error codes
	 */

	default:
		dev_warn_ratelimited(dev,
				     "FC Endpoint Security error: the device issued an unknown error code 0x%08x related to the FC connection to remote port 0x%016llx\n",
				     fsf_sqw0, wwpn);
	}
}

2008-07-02 16:56:39 +08:00
static void zfcp_fsf_open_port_handler(struct zfcp_fsf_req *req)
2005-04-17 06:20:36 +08:00
{
2020-03-13 01:45:00 +08:00
	struct zfcp_adapter *adapter = req->adapter;
2008-07-02 16:56:39 +08:00
	struct zfcp_port *port = req->data;
	struct fsf_qtcb_header *header = &req->qtcb->header;
2020-03-13 01:45:00 +08:00
	struct fsf_qtcb_bottom_support *bottom = &req->qtcb->bottom.support;
2009-11-24 23:54:09 +08:00
	struct fc_els_flogi *plogi;
2005-04-17 06:20:36 +08:00

2008-07-02 16:56:39 +08:00
	if (req->status & ZFCP_STATUS_FSFREQ_ERROR)
2009-05-15 19:18:19 +08:00
		goto out;
2005-04-17 06:20:36 +08:00

	switch (header->fsf_status) {
	case FSF_PORT_ALREADY_OPEN:
		break;
	case FSF_MAXIMUM_NUMBER_OF_PORTS_EXCEEDED:
2020-03-13 01:45:00 +08:00
		dev_warn(&adapter->ccw_device->dev,
2008-10-01 18:42:15 +08:00
			 "Not enough FCP adapter resources to open "
2008-10-01 18:42:18 +08:00
			 "remote port 0x%016Lx\n",
			 (unsigned long long)port->wwpn);
2010-09-08 20:40:01 +08:00
		zfcp_erp_set_port_status(port,
					 ZFCP_STATUS_COMMON_ERP_FAILED);
2008-07-02 16:56:39 +08:00
		req->status |= ZFCP_STATUS_FSFREQ_ERROR;
2005-04-17 06:20:36 +08:00
		break;
2020-03-13 01:45:04 +08:00
	case FSF_SECURITY_ERROR:
2020-03-13 01:45:05 +08:00
		zfcp_fsf_log_security_error(&req->adapter->ccw_device->dev,
					    header->fsf_status_qual.word[0],
					    port->wwpn);
2020-03-13 01:45:04 +08:00
		req->status |= ZFCP_STATUS_FSFREQ_ERROR;
		break;
2005-04-17 06:20:36 +08:00
	case FSF_ADAPTER_STATUS_AVAILABLE:
		switch (header->fsf_status_qual.word[0]) {
		case FSF_SQ_INVOKE_LINK_TEST_PROCEDURE:
2017-07-28 18:31:00 +08:00
			/* no zfcp_fc_test_link() with failed open port */
2020-03-31 22:21:48 +08:00
			fallthrough;
2005-04-17 06:20:36 +08:00
		case FSF_SQ_ULP_DEPENDENT_ERP_REQUIRED:
		case FSF_SQ_NO_RETRY_POSSIBLE:
2008-07-02 16:56:39 +08:00
			req->status |= ZFCP_STATUS_FSFREQ_ERROR;
2005-04-17 06:20:36 +08:00
			break;
		}
		break;
	case FSF_GOOD:
		port->handle = header->port_handle;
2020-03-13 01:45:01 +08:00
		if (adapter->adapter_features & FSF_FEATURE_FC_SECURITY)
			port->connection_info = bottom->connection_info;
		else
			port->connection_info = 0;
2020-03-13 01:45:03 +08:00
		zfcp_fsf_log_port_fc_security(port, req);
2015-04-24 07:12:32 +08:00
		atomic_or(ZFCP_STATUS_COMMON_OPEN |
2005-04-17 06:20:36 +08:00
			  ZFCP_STATUS_PORT_PHYS_OPEN, &port->status);
2015-04-24 07:12:32 +08:00
		atomic_andnot(ZFCP_STATUS_COMMON_ACCESS_BOXED,
2005-06-13 19:23:57 +08:00
			      &port->status);
2005-04-17 06:20:36 +08:00
		/* check whether D_ID has changed during open */
		/*
		 * FIXME: This check is not airtight, as the FCP channel does
		 * not monitor closures of target port connections caused on
		 * the remote side. Thus, they might miss out on invalidating
		 * locally cached WWPNs (and other N_Port parameters) of gone
		 * target ports. So, our heroic attempt to make things safe
		 * could be undermined by 'open port' response data tagged with
		 * obsolete WWPNs. Another reason to monitor potential
		 * connection closures ourself at least (by interpreting
		 * incoming ELS' and unsolicited status). It just crosses my
		 * mind that one should be able to cross-check by means of
		 * another GID_PN straight after a port has been opened.
		 * Alternately, an ADISC/PDISC ELS should suffice, as well.
		 */
2020-03-13 01:45:00 +08:00
		plogi = (struct fc_els_flogi *) bottom->els;
		if (bottom->els1_length >= FSF_PLOGI_MIN_LEN)
			zfcp_fc_plogi_evaluate(port, plogi);
2005-04-17 06:20:36 +08:00
		break;
	case FSF_UNKNOWN_OP_SUBTYPE:
2008-07-02 16:56:39 +08:00
		req->status |= ZFCP_STATUS_FSFREQ_ERROR;
2005-04-17 06:20:36 +08:00
		break;
	}
2009-05-15 19:18:19 +08:00

out:
2010-02-17 18:18:56 +08:00
	put_device(&port->dev);
2005-04-17 06:20:36 +08:00
}

2008-07-02 16:56:39 +08:00
/**
 * zfcp_fsf_open_port - create and send open port request
 * @erp_action: pointer to struct zfcp_erp_action
 * Returns: 0 on success, error otherwise
2005-04-17 06:20:36 +08:00
 */
2008-07-02 16:56:39 +08:00
int zfcp_fsf_open_port(struct zfcp_erp_action *erp_action)
2005-04-17 06:20:36 +08:00
{
2009-08-18 21:43:19 +08:00
	struct zfcp_qdio *qdio = erp_action->adapter->qdio;
2009-05-15 19:18:19 +08:00
	struct zfcp_port *port = erp_action->port;
2009-08-18 21:43:19 +08:00
	struct zfcp_fsf_req *req;
2008-07-02 16:56:39 +08:00
	int retval = -EIO;

2010-09-08 20:39:57 +08:00
	spin_lock_irq(&qdio->req_q_lock);
2010-05-01 00:09:35 +08:00
	if (zfcp_qdio_sbal_get(qdio))
2008-07-02 16:56:39 +08:00
		goto out;

2009-08-18 21:43:19 +08:00
	req = zfcp_fsf_req_create(qdio, FSF_QTCB_OPEN_PORT_WITH_DID,
2011-06-06 20:14:40 +08:00
				  SBAL_SFLAGS0_TYPE_READ,
2009-08-18 21:43:19 +08:00
				  qdio->adapter->pool.erp_req);
2009-08-18 21:43:16 +08:00

2008-08-21 19:43:37 +08:00
	if (IS_ERR(req)) {
2008-07-02 16:56:39 +08:00
		retval = PTR_ERR(req);
2005-04-17 06:20:36 +08:00
		goto out;
2008-07-02 16:56:39 +08:00
	}
2005-04-17 06:20:36 +08:00

2009-08-18 21:43:16 +08:00
	req->status |= ZFCP_STATUS_FSFREQ_CLEANUP;
2010-05-01 00:09:34 +08:00
	zfcp_qdio_set_sbale_last(qdio, &req->qdio_req);
2005-04-17 06:20:36 +08:00

2008-07-02 16:56:39 +08:00
	req->handler = zfcp_fsf_open_port_handler;
2009-11-24 23:54:12 +08:00
	hton24(req->qtcb->bottom.support.d_id, port->d_id);
2009-05-15 19:18:19 +08:00
	req->data = port;
2008-07-02 16:56:39 +08:00
	req->erp_action = erp_action;
2010-02-17 18:18:49 +08:00
	erp_action->fsf_req_id = req->req_id;
2010-02-17 18:18:56 +08:00
	get_device(&port->dev);
2008-07-02 16:56:39 +08:00

2008-07-02 16:56:40 +08:00
	zfcp_fsf_start_erp_timer(req);
2008-07-02 16:56:39 +08:00
	retval = zfcp_fsf_req_send(req);
2005-04-17 06:20:36 +08:00
	if (retval) {
2008-07-02 16:56:39 +08:00
		zfcp_fsf_req_free(req);
2010-02-17 18:18:49 +08:00
		erp_action->fsf_req_id = 0;
2010-02-17 18:18:56 +08:00
		put_device(&port->dev);
2005-04-17 06:20:36 +08:00
	}
2019-07-03 05:02:00 +08:00
	/* NOTE: DO NOT TOUCH req PAST THIS POINT! */
2008-07-02 16:56:39 +08:00
out:
2010-09-08 20:39:57 +08:00
	spin_unlock_irq(&qdio->req_q_lock);
2005-04-17 06:20:36 +08:00
	return retval;
}

2008-07-02 16:56:39 +08:00
static void zfcp_fsf_close_port_handler(struct zfcp_fsf_req *req)
2005-04-17 06:20:36 +08:00
{
2008-07-02 16:56:39 +08:00
	struct zfcp_port *port = req->data;
2005-04-17 06:20:36 +08:00

2008-07-02 16:56:39 +08:00
	if (req->status & ZFCP_STATUS_FSFREQ_ERROR)
2008-10-01 18:42:16 +08:00
		return;
2005-04-17 06:20:36 +08:00

2008-07-02 16:56:39 +08:00
	switch (req->qtcb->header.fsf_status) {
2005-04-17 06:20:36 +08:00
	case FSF_PORT_HANDLE_NOT_VALID:
2010-12-02 22:16:16 +08:00
		zfcp_erp_adapter_reopen(port->adapter, 0, "fscph_1");
2008-07-02 16:56:39 +08:00
		req->status |= ZFCP_STATUS_FSFREQ_ERROR;
2005-04-17 06:20:36 +08:00
		break;
	case FSF_ADAPTER_STATUS_AVAILABLE:
		break;
	case FSF_GOOD:
2010-09-08 20:40:01 +08:00
		zfcp_erp_clear_port_status(port, ZFCP_STATUS_COMMON_OPEN);
2005-04-17 06:20:36 +08:00
		break;
	}
}

2008-07-02 16:56:39 +08:00
/**
 * zfcp_fsf_close_port - create and send close port request
 * @erp_action: pointer to struct zfcp_erp_action
 * Returns: 0 on success, error otherwise
2005-04-17 06:20:36 +08:00
 */
2008-07-02 16:56:39 +08:00
int zfcp_fsf_close_port(struct zfcp_erp_action *erp_action)
2005-04-17 06:20:36 +08:00
{
2009-08-18 21:43:19 +08:00
	struct zfcp_qdio *qdio = erp_action->adapter->qdio;
2008-07-02 16:56:39 +08:00
	struct zfcp_fsf_req *req;
	int retval = -EIO;

2010-09-08 20:39:57 +08:00
	spin_lock_irq(&qdio->req_q_lock);
2010-05-01 00:09:35 +08:00
	if (zfcp_qdio_sbal_get(qdio))
2008-07-02 16:56:39 +08:00
		goto out;

2009-08-18 21:43:19 +08:00
	req = zfcp_fsf_req_create(qdio, FSF_QTCB_CLOSE_PORT,
2011-06-06 20:14:40 +08:00
				  SBAL_SFLAGS0_TYPE_READ,
2009-08-18 21:43:19 +08:00
				  qdio->adapter->pool.erp_req);
2009-08-18 21:43:16 +08:00

2008-08-21 19:43:37 +08:00
	if (IS_ERR(req)) {
2008-07-02 16:56:39 +08:00
		retval = PTR_ERR(req);
2005-04-17 06:20:36 +08:00
		goto out;
2008-07-02 16:56:39 +08:00
	}
2005-04-17 06:20:36 +08:00

2009-08-18 21:43:16 +08:00
	req->status |= ZFCP_STATUS_FSFREQ_CLEANUP;
2010-05-01 00:09:34 +08:00
	zfcp_qdio_set_sbale_last(qdio, &req->qdio_req);
2005-04-17 06:20:36 +08:00

2008-07-02 16:56:39 +08:00
	req->handler = zfcp_fsf_close_port_handler;
	req->data = erp_action->port;
	req->erp_action = erp_action;
	req->qtcb->header.port_handle = erp_action->port->handle;
2010-02-17 18:18:49 +08:00
	erp_action->fsf_req_id = req->req_id;
2008-07-02 16:56:39 +08:00

2008-07-02 16:56:40 +08:00
	zfcp_fsf_start_erp_timer(req);
2008-07-02 16:56:39 +08:00
	retval = zfcp_fsf_req_send(req);
2005-04-17 06:20:36 +08:00
	if (retval) {
2008-07-02 16:56:39 +08:00
		zfcp_fsf_req_free(req);
2010-02-17 18:18:49 +08:00
		erp_action->fsf_req_id = 0;
2005-04-17 06:20:36 +08:00
	}
scsi: zfcp: fix request object use-after-free in send path causing seqno errors
With a recent change to our send path for FSF commands we introduced a
possible use-after-free of request-objects, that might further lead to
zfcp crafting bad requests, which the FCP channel correctly complains
about with an error (FSF_PROT_SEQ_NUMB_ERROR). This error is then handled
by an adapter-wide recovery.
The following sequence illustrates the possible use-after-free:
Send Path:
int zfcp_fsf_open_port(struct zfcp_erp_action *erp_action)
{
struct zfcp_fsf_req *req;
...
spin_lock_irq(&qdio->req_q_lock);
// ^^^^^^^^^^^^^^^^
// protects QDIO queue during sending
...
req = zfcp_fsf_req_create(qdio,
FSF_QTCB_OPEN_PORT_WITH_DID,
SBAL_SFLAGS0_TYPE_READ,
qdio->adapter->pool.erp_req);
// ^^^^^^^^^^^^^^^^^^^
// allocation of the request-object
...
retval = zfcp_fsf_req_send(req);
...
spin_unlock_irq(&qdio->req_q_lock);
return retval;
}
static int zfcp_fsf_req_send(struct zfcp_fsf_req *req)
{
struct zfcp_adapter *adapter = req->adapter;
struct zfcp_qdio *qdio = adapter->qdio;
...
zfcp_reqlist_add(adapter->req_list, req);
// ^^^^^^^^^^^^^^^^
// add request to our driver-internal hash-table for tracking
// (protected by separate lock req_list->lock)
...
if (zfcp_qdio_send(qdio, &req->qdio_req)) {
// ^^^^^^^^^^^^^^
// hand-off the request to FCP channel;
// the request can complete at any point now
...
}
/* Don't increase for unsolicited status */
if (!zfcp_fsf_req_is_status_read_buffer(req))
// ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
// possible use-after-free
adapter->fsf_req_seq_no++;
// ^^^^^^^^^^^^^^^^
// because of the use-after-free we might
// miss this accounting, and as follow-up
// this results in the FCP channel error
// FSF_PROT_SEQ_NUMB_ERROR
adapter->req_no++;
return 0;
}
static inline bool
zfcp_fsf_req_is_status_read_buffer(struct zfcp_fsf_req *req)
{
return req->qtcb == NULL;
// ^^^^^^^^^
// possible use-after-free
}
Response Path:
void zfcp_fsf_reqid_check(struct zfcp_qdio *qdio, int sbal_idx)
{
...
struct zfcp_fsf_req *fsf_req;
...
for (idx = 0; idx < QDIO_MAX_ELEMENTS_PER_BUFFER; idx++) {
...
fsf_req = zfcp_reqlist_find_rm(adapter->req_list,
req_id);
// ^^^^^^^^^^^^^^^^^^^^
// remove request from our driver-internal
// hash-table (lock req_list->lock)
...
zfcp_fsf_req_complete(fsf_req);
}
}
static void zfcp_fsf_req_complete(struct zfcp_fsf_req *req)
{
...
if (likely(req->status & ZFCP_STATUS_FSFREQ_CLEANUP))
zfcp_fsf_req_free(req);
// ^^^^^^^^^^^^^^^^^
// free memory for request-object
else
complete(&req->completion);
// ^^^^^^^^
// completion notification for code-paths that wait
// synchronous for the completion of the request; in
// those the memory is freed separately
}
The result of the use-after-free only affects the send path, and can not
lead to any data corruption. In case we miss the sequence-number
accounting, because the memory was already re-purposed, the next FSF
command will fail with said FCP channel error, and we will recover the
whole adapter. This causes no additional errors, but it slows down
traffic. There is a slight chance of the same thing happen again
recursively after the adapter recovery, but so far this has not been seen.
This was seen under z/VM, where the send path might run on a virtual CPU
that gets scheduled away by z/VM, while the return path might still run,
and so create the necessary timing. Running with KASAN can also slow down
the kernel sufficiently to run into this use-after-free, and then see the
report by KASAN.
To fix this, simply pull the test for the sequence-number accounting in
front of the hand-off to the FCP channel (this information doesn't change
during hand-off), but leave the sequence-number accounting itself where it
is.
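The shape of that fix can be sketched in user-space C as follows. This is a minimal sketch, not the driver code: the toy_* types and functions are hypothetical stand-ins, and the hand-off simply frees the request immediately to mimic a response path that completes before the sender continues.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's types; not the zfcp definitions. */
struct toy_req {
	void *qtcb;	/* NULL marks an unsolicited status read buffer */
};

struct toy_adapter {
	unsigned int fsf_req_seq_no;
	unsigned int req_no;
};

/* Simulated hand-off: the "channel" completes and frees the request at once. */
static int toy_hand_off(struct toy_req *req)
{
	free(req->qtcb);
	free(req);
	return 0;
}

static int toy_req_send(struct toy_adapter *adapter, struct toy_req *req)
{
	bool is_srb;

	assert(req != NULL);
	/* Evaluate the test while we still own the request. */
	is_srb = (req->qtcb == NULL);

	if (toy_hand_off(req))
		return -1;

	/* req must not be touched past this point; use the saved result. */
	if (!is_srb)
		adapter->fsf_req_seq_no++;
	adapter->req_no++;
	return 0;
}

/* Allocate a request, optionally with a QTCB, and send it. */
static int toy_send_one(struct toy_adapter *adapter, bool with_qtcb)
{
	struct toy_req *req = malloc(sizeof(*req));

	if (!req)
		return -1;
	req->qtcb = NULL;
	if (with_qtcb) {
		req->qtcb = malloc(16);
		if (!req->qtcb) {
			free(req);
			return -1;
		}
	}
	return toy_req_send(adapter, req);
}
```

Only the test moves in front of the hand-off; the accounting stays after it and works from the saved boolean, so the freed request is never dereferenced.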
To make future regressions of the same kind less likely, add comments to
all closely related code-paths.
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Fixes: f9eca0227600 ("scsi: zfcp: drop duplicate fsf_command from zfcp_fsf_req which is also in QTCB header")
Cc: <stable@vger.kernel.org> #5.0+
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-03 05:02:00 +08:00
	/* NOTE: DO NOT TOUCH req PAST THIS POINT! */
2008-07-02 16:56:39 +08:00
out:
[SCSI] zfcp: Change spin_lock_bh to spin_lock_irq to fix lockdep warning
With the change to use the data on the SCSI device, iterating through
all LUNs/scsi_devices takes the SCSI host_lock. This triggers warnings
from the lock dependency checker:
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.34.1 #97
---------------------------------------------------------
chchp/3224 just changed the state of lock:
(&(shost->host_lock)->rlock){-.-...}, at: [<00000000003a73f4>] __scsi_iterate_devices+0x38/0xbc
but this lock took another, HARDIRQ-unsafe lock in the past:
(&(&qdio->req_q_lock)->rlock){+.-...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
[ 24.972394] 2 locks held by chchp/3224:
#0: (&(sch->lock)->rlock){-.-...}, at: [<0000000000401efa>] do_IRQ+0xb2/0x1e4
#1: (&adapter->port_list_lock){.-....}, at: [<0000000000490302>] zfcp_erp_modify_adapter_status+0x9e/0x16c
[...]
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.34.1 #98
---------------------------------------------------------
chchp/3235 just changed the state of lock:
(&(shost->host_lock)->rlock){-.-...}, at: [<00000000003a73f4>] __scsi_iterate_devices+0x38/0xbc
but this lock took another, HARDIRQ-unsafe lock in the past:
(&(&qdio->stat_lock)->rlock){+.-...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
2 locks held by chchp/3235:
#0: (&(sch->lock)->rlock){-.-...}, at: [<0000000000401efa>] do_IRQ+0xb2/0x1e4
#1: (&adapter->port_list_lock){.-.-..}, at: [<00000000004902f6>] zfcp_erp_modify_adapter_status+0x9e/0x16c
[...]
To stop this warning, change the request queue lock to disable irqs,
not only softirq. The changes are required only outside of the
critical "send fcp command" path.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-09-08 20:39:57 +08:00
	spin_unlock_irq(&qdio->req_q_lock);
2005-04-17 06:20:36 +08:00
	return retval;
}

2008-10-01 18:42:17 +08:00
static void zfcp_fsf_open_wka_port_handler(struct zfcp_fsf_req *req)
{
2009-11-24 23:54:11 +08:00
	struct zfcp_fc_wka_port *wka_port = req->data;
2008-10-01 18:42:17 +08:00
	struct fsf_qtcb_header *header = &req->qtcb->header;

	if (req->status & ZFCP_STATUS_FSFREQ_ERROR) {
2009-11-24 23:54:11 +08:00
		wka_port->status = ZFCP_FC_WKA_PORT_OFFLINE;
2008-10-01 18:42:17 +08:00
		goto out;
	}

	switch (header->fsf_status) {
	case FSF_MAXIMUM_NUMBER_OF_PORTS_EXCEEDED:
		dev_warn(&req->adapter->ccw_device->dev,
			 "Opening WKA port 0x%x failed\n", wka_port->d_id);
2020-03-31 22:21:48 +08:00
		fallthrough;
2008-10-01 18:42:17 +08:00
	case FSF_ADAPTER_STATUS_AVAILABLE:
		req->status |= ZFCP_STATUS_FSFREQ_ERROR;
2009-11-24 23:54:11 +08:00
		wka_port->status = ZFCP_FC_WKA_PORT_OFFLINE;
2008-10-01 18:42:17 +08:00
		break;
	case FSF_GOOD:
		wka_port->handle = header->port_handle;
2020-03-31 22:21:48 +08:00
		fallthrough;
2009-07-13 21:06:13 +08:00
	case FSF_PORT_ALREADY_OPEN:
2009-11-24 23:54:11 +08:00
		wka_port->status = ZFCP_FC_WKA_PORT_ONLINE;
2008-10-01 18:42:17 +08:00
	}
out:
2022-07-30 00:25:29 +08:00
	wake_up(&wka_port->opened);
2008-10-01 18:42:17 +08:00
}

/**
 * zfcp_fsf_open_wka_port - create and send open wka-port request
2009-11-24 23:54:11 +08:00
 * @wka_port: pointer to struct zfcp_fc_wka_port
2008-10-01 18:42:17 +08:00
 * Returns: 0 on success, error otherwise
 */
2009-11-24 23:54:11 +08:00
int zfcp_fsf_open_wka_port(struct zfcp_fc_wka_port *wka_port)
2008-10-01 18:42:17 +08:00
{
2009-08-18 21:43:19 +08:00
	struct zfcp_qdio *qdio = wka_port->adapter->qdio;
2017-02-08 22:34:22 +08:00
	struct zfcp_fsf_req *req;
2023-02-22 01:55:59 +08:00
	u64 req_id = 0;
2008-10-01 18:42:17 +08:00
	int retval = -EIO;

2010-09-08 20:39:57 +08:00
	spin_lock_irq(&qdio->req_q_lock);
2010-05-01 00:09:35 +08:00
	if (zfcp_qdio_sbal_get(qdio))
2008-10-01 18:42:17 +08:00
		goto out;

2009-08-18 21:43:19 +08:00
	req = zfcp_fsf_req_create(qdio, FSF_QTCB_OPEN_PORT_WITH_DID,
2011-06-06 20:14:40 +08:00
				  SBAL_SFLAGS0_TYPE_READ,
2009-08-18 21:43:19 +08:00
				  qdio->adapter->pool.erp_req);
2009-08-18 21:43:16 +08:00

2011-02-23 02:54:38 +08:00
	if (IS_ERR(req)) {
2008-10-01 18:42:17 +08:00
		retval = PTR_ERR(req);
		goto out;
	}

2009-08-18 21:43:16 +08:00
	req->status |= ZFCP_STATUS_FSFREQ_CLEANUP;
2010-05-01 00:09:34 +08:00
	zfcp_qdio_set_sbale_last(qdio, &req->qdio_req);
2008-10-01 18:42:17 +08:00

	req->handler = zfcp_fsf_open_wka_port_handler;
2009-11-24 23:54:12 +08:00
	hton24(req->qtcb->bottom.support.d_id, wka_port->d_id);
2008-10-01 18:42:17 +08:00
	req->data = wka_port;

2019-07-03 05:02:01 +08:00
	req_id = req->req_id;

2008-10-01 18:42:17 +08:00
	zfcp_fsf_start_timer(req, ZFCP_FSF_REQUEST_TIMEOUT);
	retval = zfcp_fsf_req_send(req);
	if (retval)
		zfcp_fsf_req_free(req);
2019-07-03 05:02:00 +08:00
	/* NOTE: DO NOT TOUCH req PAST THIS POINT! */
2008-10-01 18:42:17 +08:00
out:
2010-09-08 20:39:57 +08:00
	spin_unlock_irq(&qdio->req_q_lock);
2017-02-08 22:34:22 +08:00
	if (!retval)
2019-07-03 05:02:01 +08:00
		zfcp_dbf_rec_run_wka("fsowp_1", wka_port, req_id);
2008-10-01 18:42:17 +08:00
	return retval;
}

static void zfcp_fsf_close_wka_port_handler(struct zfcp_fsf_req *req)
{
2009-11-24 23:54:11 +08:00
	struct zfcp_fc_wka_port *wka_port = req->data;
2008-10-01 18:42:17 +08:00

	if (req->qtcb->header.fsf_status == FSF_PORT_HANDLE_NOT_VALID) {
		req->status |= ZFCP_STATUS_FSFREQ_ERROR;
2010-12-02 22:16:16 +08:00
		zfcp_erp_adapter_reopen(wka_port->adapter, 0, "fscwph1");
2008-10-01 18:42:17 +08:00
	}

2009-11-24 23:54:11 +08:00
	wka_port->status = ZFCP_FC_WKA_PORT_OFFLINE;
2022-07-30 00:25:29 +08:00
	wake_up(&wka_port->closed);
2008-10-01 18:42:17 +08:00
}

/**
 * zfcp_fsf_close_wka_port - create and send close wka port request
2009-11-24 23:54:11 +08:00
 * @wka_port: WKA port to close
2008-10-01 18:42:17 +08:00
 * Returns: 0 on success, error otherwise
 */
2009-11-24 23:54:11 +08:00
int zfcp_fsf_close_wka_port(struct zfcp_fc_wka_port *wka_port)
2008-10-01 18:42:17 +08:00
{
2009-08-18 21:43:19 +08:00
	struct zfcp_qdio *qdio = wka_port->adapter->qdio;
2017-02-08 22:34:22 +08:00
	struct zfcp_fsf_req *req;
2023-02-22 01:55:59 +08:00
	u64 req_id = 0;
2008-10-01 18:42:17 +08:00
	int retval = -EIO;

2010-09-08 20:39:57 +08:00
	spin_lock_irq(&qdio->req_q_lock);
2010-05-01 00:09:35 +08:00
	if (zfcp_qdio_sbal_get(qdio))
2008-10-01 18:42:17 +08:00
		goto out;

2009-08-18 21:43:19 +08:00
	req = zfcp_fsf_req_create(qdio, FSF_QTCB_CLOSE_PORT,
2011-06-06 20:14:40 +08:00
				  SBAL_SFLAGS0_TYPE_READ,
2009-08-18 21:43:19 +08:00
				  qdio->adapter->pool.erp_req);
2009-08-18 21:43:16 +08:00

2011-02-23 02:54:38 +08:00
	if (IS_ERR(req)) {
2008-10-01 18:42:17 +08:00
		retval = PTR_ERR(req);
		goto out;
	}

2009-08-18 21:43:16 +08:00
	req->status |= ZFCP_STATUS_FSFREQ_CLEANUP;
2010-05-01 00:09:34 +08:00
	zfcp_qdio_set_sbale_last(qdio, &req->qdio_req);
2008-10-01 18:42:17 +08:00

	req->handler = zfcp_fsf_close_wka_port_handler;
	req->data = wka_port;
	req->qtcb->header.port_handle = wka_port->handle;

2019-07-03 05:02:01 +08:00
	req_id = req->req_id;

2008-10-01 18:42:17 +08:00
	zfcp_fsf_start_timer(req, ZFCP_FSF_REQUEST_TIMEOUT);
	retval = zfcp_fsf_req_send(req);
	if (retval)
		zfcp_fsf_req_free(req);
2019-07-03 05:02:00 +08:00
	/* NOTE: DO NOT TOUCH req PAST THIS POINT! */
2008-10-01 18:42:17 +08:00
out:
2010-09-08 20:39:57 +08:00
	spin_unlock_irq(&qdio->req_q_lock);
2017-02-08 22:34:22 +08:00
	if (!retval)
2019-07-03 05:02:01 +08:00
		zfcp_dbf_rec_run_wka("fscwp_1", wka_port, req_id);
2008-10-01 18:42:17 +08:00
	return retval;
}

2008-07-02 16:56:39 +08:00
static void zfcp_fsf_close_physical_port_handler(struct zfcp_fsf_req *req)
2005-04-17 06:20:36 +08:00
{
2008-07-02 16:56:39 +08:00
	struct zfcp_port *port = req->data;
	struct fsf_qtcb_header *header = &req->qtcb->header;
2010-09-08 20:39:55 +08:00
	struct scsi_device *sdev;
2005-04-17 06:20:36 +08:00

2008-07-02 16:56:39 +08:00
	if (req->status & ZFCP_STATUS_FSFREQ_ERROR)
2009-03-02 20:08:54 +08:00
		return;
2005-04-17 06:20:36 +08:00

	switch (header->fsf_status) {
	case FSF_PORT_HANDLE_NOT_VALID:
2010-12-02 22:16:16 +08:00
		zfcp_erp_adapter_reopen(port->adapter, 0, "fscpph1");
2008-07-02 16:56:39 +08:00
		req->status |= ZFCP_STATUS_FSFREQ_ERROR;
2005-04-17 06:20:36 +08:00
		break;
	case FSF_PORT_BOXED:
2008-03-10 23:18:54 +08:00
		/* can't use generic zfcp_erp_modify_port_status because
		 * ZFCP_STATUS_COMMON_OPEN must not be reset for the port */
2015-04-24 07:12:32 +08:00
		atomic_andnot(ZFCP_STATUS_PORT_PHYS_OPEN, &port->status);
2010-09-08 20:39:55 +08:00
		shost_for_each_device(sdev, port->adapter->scsi_host)
			if (sdev_to_zfcp(sdev)->port == port)
2015-04-24 07:12:32 +08:00
				atomic_andnot(ZFCP_STATUS_COMMON_OPEN,
2010-09-08 20:39:55 +08:00
					      &sdev_to_zfcp(sdev)->status);
2010-09-08 20:40:01 +08:00
		zfcp_erp_set_port_status(port, ZFCP_STATUS_COMMON_ACCESS_BOXED);
		zfcp_erp_port_reopen(port, ZFCP_STATUS_COMMON_ERP_FAILED,
2010-12-02 22:16:16 +08:00
				     "fscpph2");
2009-11-24 23:54:15 +08:00
		req->status |= ZFCP_STATUS_FSFREQ_ERROR;
2005-04-17 06:20:36 +08:00
		break;
	case FSF_ADAPTER_STATUS_AVAILABLE:
		switch (header->fsf_status_qual.word[0]) {
		case FSF_SQ_INVOKE_LINK_TEST_PROCEDURE:
		case FSF_SQ_ULP_DEPENDENT_ERP_REQUIRED:
2008-07-02 16:56:39 +08:00
			req->status |= ZFCP_STATUS_FSFREQ_ERROR;
2005-04-17 06:20:36 +08:00
			break;
		}
		break;
	case FSF_GOOD:
		/* can't use generic zfcp_erp_modify_port_status because
		 * ZFCP_STATUS_COMMON_OPEN must not be reset for the port
		 */
2015-04-24 07:12:32 +08:00
		atomic_andnot(ZFCP_STATUS_PORT_PHYS_OPEN, &port->status);
2010-09-08 20:39:55 +08:00
		shost_for_each_device(sdev, port->adapter->scsi_host)
			if (sdev_to_zfcp(sdev)->port == port)
2015-04-24 07:12:32 +08:00
				atomic_andnot(ZFCP_STATUS_COMMON_OPEN,
2010-09-08 20:39:55 +08:00
					      &sdev_to_zfcp(sdev)->status);
2005-04-17 06:20:36 +08:00
		break;
	}
}

2008-07-02 16:56:39 +08:00
/**
 * zfcp_fsf_close_physical_port - close physical port
 * @erp_action: pointer to struct zfcp_erp_action
 * Returns: 0 on success
2005-04-17 06:20:36 +08:00
 */
2008-07-02 16:56:39 +08:00
int zfcp_fsf_close_physical_port(struct zfcp_erp_action *erp_action)
2005-04-17 06:20:36 +08:00
{
2009-08-18 21:43:19 +08:00
	struct zfcp_qdio *qdio = erp_action->adapter->qdio;
2008-07-02 16:56:39 +08:00
	struct zfcp_fsf_req *req;
	int retval = -EIO;

2010-09-08 20:39:57 +08:00
	spin_lock_irq(&qdio->req_q_lock);
2010-05-01 00:09:35 +08:00
	if (zfcp_qdio_sbal_get(qdio))
2005-04-17 06:20:36 +08:00
		goto out;

2009-08-18 21:43:19 +08:00
	req = zfcp_fsf_req_create(qdio, FSF_QTCB_CLOSE_PHYSICAL_PORT,
2011-06-06 20:14:40 +08:00
				  SBAL_SFLAGS0_TYPE_READ,
2009-08-18 21:43:19 +08:00
				  qdio->adapter->pool.erp_req);
2009-08-18 21:43:16 +08:00

2008-08-21 19:43:37 +08:00
	if (IS_ERR(req)) {
2008-07-02 16:56:39 +08:00
		retval = PTR_ERR(req);
		goto out;
	}
2005-04-17 06:20:36 +08:00

2009-08-18 21:43:16 +08:00
	req->status |= ZFCP_STATUS_FSFREQ_CLEANUP;
2010-05-01 00:09:34 +08:00
	zfcp_qdio_set_sbale_last(qdio, &req->qdio_req);
2005-04-17 06:20:36 +08:00

2008-07-02 16:56:39 +08:00
	req->data = erp_action->port;
	req->qtcb->header.port_handle = erp_action->port->handle;
	req->erp_action = erp_action;
	req->handler = zfcp_fsf_close_physical_port_handler;
2010-02-17 18:18:49 +08:00
	erp_action->fsf_req_id = req->req_id;
2008-07-02 16:56:39 +08:00

2008-07-02 16:56:40 +08:00
	zfcp_fsf_start_erp_timer(req);
2008-07-02 16:56:39 +08:00
	retval = zfcp_fsf_req_send(req);
2005-04-17 06:20:36 +08:00
	if (retval) {
2008-07-02 16:56:39 +08:00
		zfcp_fsf_req_free(req);
2010-02-17 18:18:49 +08:00
		erp_action->fsf_req_id = 0;
2005-04-17 06:20:36 +08:00
	}
scsi: zfcp: fix request object use-after-free in send path causing seqno errors
With a recent change to our send path for FSF commands we introduced a
possible use-after-free of request-objects, that might further lead to
zfcp crafting bad requests, which the FCP channel correctly complains
about with an error (FSF_PROT_SEQ_NUMB_ERROR). This error is then handled
by an adapter-wide recovery.
The following sequence illustrates the possible use-after-free:
Send Path:

    int zfcp_fsf_open_port(struct zfcp_erp_action *erp_action)
    {
            struct zfcp_fsf_req *req;
            ...
            spin_lock_irq(&qdio->req_q_lock);
    //      ^^^^^^^^^^^^^^^^
    //      protects QDIO queue during sending
            ...
            req = zfcp_fsf_req_create(qdio,
                                      FSF_QTCB_OPEN_PORT_WITH_DID,
                                      SBAL_SFLAGS0_TYPE_READ,
                                      qdio->adapter->pool.erp_req);
    //            ^^^^^^^^^^^^^^^^^^^
    //            allocation of the request-object
            ...
            retval = zfcp_fsf_req_send(req);
            ...
            spin_unlock_irq(&qdio->req_q_lock);
            return retval;
    }

    static int zfcp_fsf_req_send(struct zfcp_fsf_req *req)
    {
            struct zfcp_adapter *adapter = req->adapter;
            struct zfcp_qdio *qdio = adapter->qdio;
            ...
            zfcp_reqlist_add(adapter->req_list, req);
    //      ^^^^^^^^^^^^^^^^
    //      add request to our driver-internal hash-table for tracking
    //      (protected by separate lock req_list->lock)
            ...
            if (zfcp_qdio_send(qdio, &req->qdio_req)) {
    //          ^^^^^^^^^^^^^^
    //          hand-off the request to FCP channel;
    //          the request can complete at any point now
                    ...
            }

            /* Don't increase for unsolicited status */
            if (!zfcp_fsf_req_is_status_read_buffer(req))
    //           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    //           possible use-after-free
                    adapter->fsf_req_seq_no++;
    //                       ^^^^^^^^^^^^^^^^
    //                       because of the use-after-free we might
    //                       miss this accounting, and as follow-up
    //                       this results in the FCP channel error
    //                       FSF_PROT_SEQ_NUMB_ERROR
            adapter->req_no++;

            return 0;
    }

    static inline bool
    zfcp_fsf_req_is_status_read_buffer(struct zfcp_fsf_req *req)
    {
            return req->qtcb == NULL;
    //             ^^^^^^^^^
    //             possible use-after-free
    }

Response Path:

    void zfcp_fsf_reqid_check(struct zfcp_qdio *qdio, int sbal_idx)
    {
            ...
            struct zfcp_fsf_req *fsf_req;
            ...
            for (idx = 0; idx < QDIO_MAX_ELEMENTS_PER_BUFFER; idx++) {
                    ...
                    fsf_req = zfcp_reqlist_find_rm(adapter->req_list,
                                                   req_id);
    //                        ^^^^^^^^^^^^^^^^^^^^
    //                        remove request from our driver-internal
    //                        hash-table (lock req_list->lock)
                    ...
                    zfcp_fsf_req_complete(fsf_req);
            }
    }

    static void zfcp_fsf_req_complete(struct zfcp_fsf_req *req)
    {
            ...
            if (likely(req->status & ZFCP_STATUS_FSFREQ_CLEANUP))
                    zfcp_fsf_req_free(req);
    //              ^^^^^^^^^^^^^^^^^
    //              free memory for request-object
            else
                    complete(&req->completion);
    //              ^^^^^^^^
    //              completion notification for code-paths that wait
    //              synchronously for the completion of the request; in
    //              those the memory is freed separately
    }

The result of the use-after-free only affects the send path, and cannot
lead to any data corruption. In case we miss the sequence-number
accounting, because the memory was already re-purposed, the next FSF
command will fail with said FCP channel error, and we will recover the
whole adapter. This causes no additional errors, but it slows down
traffic. There is a slight chance of the same thing happening again
recursively after the adapter recovery, but so far this has not been seen.
This was seen under z/VM, where the send path might run on a virtual CPU
that gets scheduled away by z/VM, while the return path might still run,
and so create the necessary timing. Running with KASAN can also slow down
the kernel sufficiently to run into this use-after-free, and then see the
report by KASAN.
To fix this, simply pull the test for the sequence-number accounting in
front of the hand-off to the FCP channel (this information doesn't change
during hand-off), but leave the sequence-number accounting itself where it
is.
To make future regressions of the same kind less likely, add comments to
all closely related code-paths.
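
The shape of the fix described above can be sketched in plain userspace C.
This is only an illustration under stated assumptions, not the driver code:
demo_req, demo_adapter, demo_hand_off(), demo_req_send() and
demo_req_create() are hypothetical stand-ins, and the hand-off frees the
request immediately to model the worst-case timing in which the response
path completes before the send path continues.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver structures; not the real zfcp types. */
struct demo_req {
	void *qtcb;		/* NULL marks an unsolicited status read buffer */
};

struct demo_adapter {
	unsigned int seq_no;	/* plays the role of fsf_req_seq_no */
	unsigned int req_no;
};

/*
 * Hand-off: from here on the response path may complete and free the
 * request at any time. This sketch frees it right away (worst case).
 */
static int demo_hand_off(struct demo_req *req)
{
	free(req->qtcb);
	free(req);
	return 0;
}

static int demo_req_send(struct demo_adapter *adapter, struct demo_req *req)
{
	/* Evaluate the test BEFORE the hand-off, while req is still ours. */
	const bool is_srb = (req->qtcb == NULL);

	if (demo_hand_off(req))
		return -1;

	/* Do not touch req past this point; use only the cached value. */
	if (!is_srb)
		adapter->seq_no++;
	adapter->req_no++;
	return 0;
}

/* Allocate a request, optionally with a (dummy) QTCB attached. */
static struct demo_req *demo_req_create(bool with_qtcb)
{
	struct demo_req *req = malloc(sizeof(*req));

	assert(req != NULL);
	req->qtcb = with_qtcb ? malloc(16) : NULL;
	assert(!with_qtcb || req->qtcb != NULL);
	return req;
}
```

Sending a request created with demo_req_create(true) bumps both counters,
while one created with demo_req_create(false) bumps only req_no; in neither
case is the request dereferenced after the hand-off.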
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Fixes: f9eca0227600 ("scsi: zfcp: drop duplicate fsf_command from zfcp_fsf_req which is also in QTCB header")
Cc: <stable@vger.kernel.org> #5.0+
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-03 05:02:00 +08:00
|
|
|
/* NOTE: DO NOT TOUCH req PAST THIS POINT! */
|
2008-07-02 16:56:39 +08:00
|
|
|
out:
|
[SCSI] zfcp: Change spin_lock_bh to spin_lock_irq to fix lockdep warning
With the change to use the data on the SCSI device, iterating through
all LUNs/scsi_devices takes the SCSI host_lock. This triggers warnings
from the lock dependency checker:
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.34.1 #97
---------------------------------------------------------
chchp/3224 just changed the state of lock:
(&(shost->host_lock)->rlock){-.-...}, at: [<00000000003a73f4>] __scsi_iterate_devices+0x38/0xbc
but this lock took another, HARDIRQ-unsafe lock in the past:
(&(&qdio->req_q_lock)->rlock){+.-...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
[ 24.972394] 2 locks held by chchp/3224:
#0: (&(sch->lock)->rlock){-.-...}, at: [<0000000000401efa>] do_IRQ+0xb2/0x1e4
#1: (&adapter->port_list_lock){.-....}, at: [<0000000000490302>] zfcp_erp_modify_adapter_status+0x9e/0x16c
[...]
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.34.1 #98
---------------------------------------------------------
chchp/3235 just changed the state of lock:
(&(shost->host_lock)->rlock){-.-...}, at: [<00000000003a73f4>] __scsi_iterate_devices+0x38/0xbc
but this lock took another, HARDIRQ-unsafe lock in the past:
(&(&qdio->stat_lock)->rlock){+.-...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
2 locks held by chchp/3235:
#0: (&(sch->lock)->rlock){-.-...}, at: [<0000000000401efa>] do_IRQ+0xb2/0x1e4
#1: (&adapter->port_list_lock){.-.-..}, at: [<00000000004902f6>] zfcp_erp_modify_adapter_status+0x9e/0x16c
[...]
To stop this warning, change the request queue lock to disable irqs,
not only softirq. The changes are required only outside of the
critical "send fcp command" path.
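
As a loose userspace analogy only (this is not kernel code and does not use
the kernel locking API), the difference between leaving interrupts enabled
and disabling them around a critical section can be sketched with POSIX
signals. The assumptions are: a signal handler stands in for a hard
interrupt handler, blocking SIGUSR1 with sigprocmask() stands in for
spin_lock_irq(), and the names demo_handler() and demo_critical_section()
are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t in_critical;	/* 1 while the "lock" is held */
static volatile sig_atomic_t handler_runs;
static volatile sig_atomic_t violations;	/* handler ran inside the section */

static void demo_handler(int signo)
{
	(void)signo;
	handler_runs++;
	if (in_critical)
		violations++;
}

/*
 * Run a critical section during which a signal arrives. If block_signal is
 * nonzero, SIGUSR1 is blocked for the duration of the section, so the
 * handler only runs after the section has ended. Returns 0 on success.
 */
static int demo_critical_section(int block_signal)
{
	struct sigaction sa;
	sigset_t set, old;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = demo_handler;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGUSR1, &sa, NULL))
		return -1;

	sigemptyset(&set);
	sigaddset(&set, SIGUSR1);
	if (block_signal && sigprocmask(SIG_BLOCK, &set, &old))
		return -1;

	in_critical = 1;
	raise(SIGUSR1);		/* the "interrupt" arrives while the lock is held */
	in_critical = 0;

	/* Restoring the mask delivers the pending signal outside the section. */
	if (block_signal && sigprocmask(SIG_SETMASK, &old, NULL))
		return -1;
	return 0;
}
```

In a single-threaded program, calling demo_critical_section(1) lets the
handler run exactly once and never inside the section, whereas
demo_critical_section(0) records a violation. The real lockdep report
concerns lock ordering across CPUs, which this sketch does not model.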
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-09-08 20:39:57 +08:00
|
|
|
spin_unlock_irq(&qdio->req_q_lock);
|
2005-04-17 06:20:36 +08:00
|
|
|
return retval;
|
|
|
|
}
|
|
|
|
|
2010-09-08 20:39:55 +08:00
|
|
|
static void zfcp_fsf_open_lun_handler(struct zfcp_fsf_req *req)
|
2005-04-17 06:20:36 +08:00
|
|
|
{
|
2008-07-02 16:56:39 +08:00
|
|
|
struct zfcp_adapter *adapter = req->adapter;
|
2010-09-08 20:39:55 +08:00
|
|
|
struct scsi_device *sdev = req->data;
|
2012-09-04 21:23:36 +08:00
|
|
|
struct zfcp_scsi_dev *zfcp_sdev;
|
2008-07-02 16:56:39 +08:00
|
|
|
struct fsf_qtcb_header *header = &req->qtcb->header;
|
2013-04-26 22:13:54 +08:00
|
|
|
union fsf_status_qual *qual = &header->fsf_status_qual;
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-07-02 16:56:39 +08:00
|
|
|
if (req->status & ZFCP_STATUS_FSFREQ_ERROR)
|
2008-10-01 18:42:16 +08:00
|
|
|
return;
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2012-09-04 21:23:36 +08:00
|
|
|
zfcp_sdev = sdev_to_zfcp(sdev);
|
|
|
|
|
2015-04-24 07:12:32 +08:00
|
|
|
atomic_andnot(ZFCP_STATUS_COMMON_ACCESS_DENIED |
|
2013-04-26 22:13:54 +08:00
|
|
|
ZFCP_STATUS_COMMON_ACCESS_BOXED,
|
2010-09-08 20:39:55 +08:00
|
|
|
&zfcp_sdev->status);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
switch (header->fsf_status) {
|
|
|
|
|
|
|
|
case FSF_PORT_HANDLE_NOT_VALID:
|
2010-12-02 22:16:16 +08:00
|
|
|
zfcp_erp_adapter_reopen(adapter, 0, "fsouh_1");
|
2020-03-31 22:21:48 +08:00
|
|
|
fallthrough;
|
2005-04-17 06:20:36 +08:00
|
|
|
case FSF_LUN_ALREADY_OPEN:
|
|
|
|
break;
|
|
|
|
case FSF_PORT_BOXED:
|
2010-09-08 20:40:01 +08:00
|
|
|
zfcp_erp_set_port_status(zfcp_sdev->port,
|
|
|
|
ZFCP_STATUS_COMMON_ACCESS_BOXED);
|
|
|
|
zfcp_erp_port_reopen(zfcp_sdev->port,
|
2010-12-02 22:16:16 +08:00
|
|
|
ZFCP_STATUS_COMMON_ERP_FAILED, "fsouh_2");
|
2009-11-24 23:54:15 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
|
2005-04-17 06:20:36 +08:00
|
|
|
break;
|
|
|
|
case FSF_LUN_SHARING_VIOLATION:
|
2013-04-26 22:13:54 +08:00
|
|
|
if (qual->word[0])
|
|
|
|
dev_warn(&zfcp_sdev->port->adapter->ccw_device->dev,
|
2018-11-08 22:44:53 +08:00
|
|
|
"LUN 0x%016Lx on port 0x%016Lx is already in "
|
2013-04-26 22:13:54 +08:00
|
|
|
"use by CSS%d, MIF Image ID %x\n",
|
|
|
|
zfcp_scsi_dev_lun(sdev),
|
|
|
|
(unsigned long long)zfcp_sdev->port->wwpn,
|
|
|
|
qual->fsf_queue_designator.cssid,
|
|
|
|
qual->fsf_queue_designator.hla);
|
|
|
|
zfcp_erp_set_lun_status(sdev,
|
|
|
|
ZFCP_STATUS_COMMON_ERP_FAILED |
|
|
|
|
ZFCP_STATUS_COMMON_ACCESS_DENIED);
|
2008-07-02 16:56:39 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
|
2005-04-17 06:20:36 +08:00
|
|
|
break;
|
|
|
|
case FSF_MAXIMUM_NUMBER_OF_LUNS_EXCEEDED:
|
2008-07-02 16:56:39 +08:00
|
|
|
dev_warn(&adapter->ccw_device->dev,
|
2008-10-01 18:42:15 +08:00
|
|
|
"No handle is available for LUN "
|
|
|
|
"0x%016Lx on port 0x%016Lx\n",
|
2010-09-08 20:39:55 +08:00
|
|
|
(unsigned long long)zfcp_scsi_dev_lun(sdev),
|
|
|
|
(unsigned long long)zfcp_sdev->port->wwpn);
|
2010-09-08 20:40:01 +08:00
|
|
|
zfcp_erp_set_lun_status(sdev, ZFCP_STATUS_COMMON_ERP_FAILED);
|
2020-03-31 22:21:48 +08:00
|
|
|
fallthrough;
|
2008-07-02 16:56:39 +08:00
|
|
|
case FSF_INVALID_COMMAND_OPTION:
|
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
|
2005-04-17 06:20:36 +08:00
|
|
|
break;
|
|
|
|
case FSF_ADAPTER_STATUS_AVAILABLE:
|
|
|
|
switch (header->fsf_status_qual.word[0]) {
|
|
|
|
case FSF_SQ_INVOKE_LINK_TEST_PROCEDURE:
|
2010-09-08 20:39:55 +08:00
|
|
|
zfcp_fc_test_link(zfcp_sdev->port);
|
2020-03-31 22:21:48 +08:00
|
|
|
fallthrough;
|
2005-04-17 06:20:36 +08:00
|
|
|
case FSF_SQ_ULP_DEPENDENT_ERP_REQUIRED:
|
2008-07-02 16:56:39 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
|
2005-04-17 06:20:36 +08:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
break;
|
|
|
|
|
|
|
|
case FSF_GOOD:
|
2010-09-08 20:39:55 +08:00
|
|
|
zfcp_sdev->lun_handle = header->lun_handle;
|
2015-04-24 07:12:32 +08:00
|
|
|
atomic_or(ZFCP_STATUS_COMMON_OPEN, &zfcp_sdev->status);
|
2005-04-17 06:20:36 +08:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2008-07-02 16:56:39 +08:00
|
|
|
/**
|
2010-09-08 20:39:55 +08:00
|
|
|
* zfcp_fsf_open_lun - open LUN
|
2008-07-02 16:56:39 +08:00
|
|
|
* @erp_action: pointer to struct zfcp_erp_action
|
|
|
|
* Returns: 0 on success, error otherwise
|
2005-04-17 06:20:36 +08:00
|
|
|
*/
|
2010-09-08 20:39:55 +08:00
|
|
|
int zfcp_fsf_open_lun(struct zfcp_erp_action *erp_action)
|
2005-04-17 06:20:36 +08:00
|
|
|
{
|
2008-07-02 16:56:39 +08:00
|
|
|
struct zfcp_adapter *adapter = erp_action->adapter;
|
2009-08-18 21:43:19 +08:00
|
|
|
struct zfcp_qdio *qdio = adapter->qdio;
|
2008-07-02 16:56:39 +08:00
|
|
|
struct zfcp_fsf_req *req;
|
|
|
|
int retval = -EIO;
|
|
|
|
|
2010-09-08 20:39:57 +08:00
|
|
|
spin_lock_irq(&qdio->req_q_lock);
|
2010-05-01 00:09:35 +08:00
|
|
|
if (zfcp_qdio_sbal_get(qdio))
|
2005-04-17 06:20:36 +08:00
|
|
|
goto out;
|
|
|
|
|
2009-08-18 21:43:19 +08:00
|
|
|
req = zfcp_fsf_req_create(qdio, FSF_QTCB_OPEN_LUN,
|
2011-06-06 20:14:40 +08:00
|
|
|
SBAL_SFLAGS0_TYPE_READ,
|
2009-08-18 21:43:15 +08:00
|
|
|
adapter->pool.erp_req);
|
2009-08-18 21:43:16 +08:00
|
|
|
|
2008-08-21 19:43:37 +08:00
|
|
|
if (IS_ERR(req)) {
|
2008-07-02 16:56:39 +08:00
|
|
|
retval = PTR_ERR(req);
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2009-08-18 21:43:16 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_CLEANUP;
|
2010-05-01 00:09:34 +08:00
|
|
|
zfcp_qdio_set_sbale_last(qdio, &req->qdio_req);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-07-02 16:56:39 +08:00
|
|
|
req->qtcb->header.port_handle = erp_action->port->handle;
|
2010-09-08 20:39:55 +08:00
|
|
|
req->qtcb->bottom.support.fcp_lun = zfcp_scsi_dev_lun(erp_action->sdev);
|
|
|
|
req->handler = zfcp_fsf_open_lun_handler;
|
|
|
|
req->data = erp_action->sdev;
|
2008-07-02 16:56:39 +08:00
|
|
|
req->erp_action = erp_action;
|
2010-02-17 18:18:49 +08:00
|
|
|
erp_action->fsf_req_id = req->req_id;
|
2008-07-02 16:56:39 +08:00
|
|
|
|
|
|
|
if (!(adapter->connection_features & FSF_FEATURE_NPIV_MODE))
|
|
|
|
req->qtcb->bottom.support.option = FSF_OPEN_LUN_SUPPRESS_BOXING;
|
|
|
|
|
2008-07-02 16:56:40 +08:00
|
|
|
zfcp_fsf_start_erp_timer(req);
|
2008-07-02 16:56:39 +08:00
|
|
|
retval = zfcp_fsf_req_send(req);
|
2005-04-17 06:20:36 +08:00
|
|
|
if (retval) {
|
2008-07-02 16:56:39 +08:00
|
|
|
zfcp_fsf_req_free(req);
|
2010-02-17 18:18:49 +08:00
|
|
|
erp_action->fsf_req_id = 0;
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
2019-07-03 05:02:00 +08:00
|
|
|
/* NOTE: DO NOT TOUCH req PAST THIS POINT! */
|
2008-07-02 16:56:39 +08:00
|
|
|
out:
|
2010-09-08 20:39:57 +08:00
|
|
|
spin_unlock_irq(&qdio->req_q_lock);
|
2005-04-17 06:20:36 +08:00
|
|
|
return retval;
|
|
|
|
}
|
|
|
|
|
2010-09-08 20:39:55 +08:00
|
|
|
static void zfcp_fsf_close_lun_handler(struct zfcp_fsf_req *req)
|
2005-04-17 06:20:36 +08:00
|
|
|
{
|
2010-09-08 20:39:55 +08:00
|
|
|
struct scsi_device *sdev = req->data;
|
2012-09-04 21:23:36 +08:00
|
|
|
struct zfcp_scsi_dev *zfcp_sdev;
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-07-02 16:56:39 +08:00
|
|
|
if (req->status & ZFCP_STATUS_FSFREQ_ERROR)
|
2008-10-01 18:42:16 +08:00
|
|
|
return;
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2012-09-04 21:23:36 +08:00
|
|
|
zfcp_sdev = sdev_to_zfcp(sdev);
|
|
|
|
|
2008-07-02 16:56:39 +08:00
|
|
|
switch (req->qtcb->header.fsf_status) {
|
2005-04-17 06:20:36 +08:00
|
|
|
case FSF_PORT_HANDLE_NOT_VALID:
|
2010-12-02 22:16:16 +08:00
|
|
|
zfcp_erp_adapter_reopen(zfcp_sdev->port->adapter, 0, "fscuh_1");
|
2008-07-02 16:56:39 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
|
2005-04-17 06:20:36 +08:00
|
|
|
break;
|
|
|
|
case FSF_LUN_HANDLE_NOT_VALID:
|
2010-12-02 22:16:16 +08:00
|
|
|
zfcp_erp_port_reopen(zfcp_sdev->port, 0, "fscuh_2");
|
2008-07-02 16:56:39 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
|
2005-04-17 06:20:36 +08:00
|
|
|
break;
|
|
|
|
case FSF_PORT_BOXED:
|
2010-09-08 20:40:01 +08:00
|
|
|
zfcp_erp_set_port_status(zfcp_sdev->port,
|
|
|
|
ZFCP_STATUS_COMMON_ACCESS_BOXED);
|
|
|
|
zfcp_erp_port_reopen(zfcp_sdev->port,
|
2010-12-02 22:16:16 +08:00
|
|
|
ZFCP_STATUS_COMMON_ERP_FAILED, "fscuh_3");
|
2009-11-24 23:54:15 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
|
2005-04-17 06:20:36 +08:00
|
|
|
break;
|
|
|
|
case FSF_ADAPTER_STATUS_AVAILABLE:
|
2008-07-02 16:56:39 +08:00
|
|
|
switch (req->qtcb->header.fsf_status_qual.word[0]) {
|
2005-04-17 06:20:36 +08:00
|
|
|
case FSF_SQ_INVOKE_LINK_TEST_PROCEDURE:
|
2010-09-08 20:39:55 +08:00
|
|
|
zfcp_fc_test_link(zfcp_sdev->port);
|
2020-03-31 22:21:48 +08:00
|
|
|
fallthrough;
|
2005-04-17 06:20:36 +08:00
|
|
|
case FSF_SQ_ULP_DEPENDENT_ERP_REQUIRED:
|
2008-07-02 16:56:39 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
|
2005-04-17 06:20:36 +08:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
break;
|
|
|
|
case FSF_GOOD:
|
2015-04-24 07:12:32 +08:00
|
|
|
atomic_andnot(ZFCP_STATUS_COMMON_OPEN, &zfcp_sdev->status);
|
2005-04-17 06:20:36 +08:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
2021-09-07 04:28:16 +08:00
|
|
|
* zfcp_fsf_close_lun - close LUN
|
2010-09-08 20:39:55 +08:00
|
|
|
* @erp_action: pointer to erp_action triggering the "close LUN"
|
2008-07-02 16:56:39 +08:00
|
|
|
* Returns: 0 on success, error otherwise
|
2005-04-17 06:20:36 +08:00
|
|
|
*/
|
2010-09-08 20:39:55 +08:00
|
|
|
int zfcp_fsf_close_lun(struct zfcp_erp_action *erp_action)
|
2005-04-17 06:20:36 +08:00
|
|
|
{
|
2009-08-18 21:43:19 +08:00
|
|
|
struct zfcp_qdio *qdio = erp_action->adapter->qdio;
|
2010-09-08 20:39:55 +08:00
|
|
|
struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(erp_action->sdev);
|
2008-07-02 16:56:39 +08:00
|
|
|
struct zfcp_fsf_req *req;
|
|
|
|
int retval = -EIO;
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2010-09-08 20:39:57 +08:00
|
|
|
spin_lock_irq(&qdio->req_q_lock);
|
2010-05-01 00:09:35 +08:00
|
|
|
if (zfcp_qdio_sbal_get(qdio))
|
2005-04-17 06:20:36 +08:00
|
|
|
goto out;
|
2009-08-18 21:43:16 +08:00
|
|
|
|
2009-08-18 21:43:19 +08:00
|
|
|
req = zfcp_fsf_req_create(qdio, FSF_QTCB_CLOSE_LUN,
|
2011-06-06 20:14:40 +08:00
|
|
|
SBAL_SFLAGS0_TYPE_READ,
|
2009-08-18 21:43:19 +08:00
|
|
|
qdio->adapter->pool.erp_req);
|
2009-08-18 21:43:16 +08:00
|
|
|
|
2008-08-21 19:43:37 +08:00
|
|
|
if (IS_ERR(req)) {
|
2008-07-02 16:56:39 +08:00
|
|
|
retval = PTR_ERR(req);
|
|
|
|
goto out;
|
|
|
|
}
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2009-08-18 21:43:16 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_CLEANUP;
|
2010-05-01 00:09:34 +08:00
|
|
|
zfcp_qdio_set_sbale_last(qdio, &req->qdio_req);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-07-02 16:56:39 +08:00
|
|
|
req->qtcb->header.port_handle = erp_action->port->handle;
|
2010-09-08 20:39:55 +08:00
|
|
|
req->qtcb->header.lun_handle = zfcp_sdev->lun_handle;
|
|
|
|
req->handler = zfcp_fsf_close_lun_handler;
|
|
|
|
req->data = erp_action->sdev;
|
2008-07-02 16:56:39 +08:00
|
|
|
req->erp_action = erp_action;
|
2010-02-17 18:18:49 +08:00
|
|
|
erp_action->fsf_req_id = req->req_id;
|
2007-12-20 19:30:27 +08:00
|
|
|
|
2008-07-02 16:56:40 +08:00
|
|
|
zfcp_fsf_start_erp_timer(req);
|
2008-07-02 16:56:39 +08:00
|
|
|
retval = zfcp_fsf_req_send(req);
|
|
|
|
if (retval) {
|
|
|
|
zfcp_fsf_req_free(req);
|
2010-02-17 18:18:49 +08:00
|
|
|
erp_action->fsf_req_id = 0;
|
2008-07-02 16:56:39 +08:00
|
|
|
}
|
scsi: zfcp: fix request object use-after-free in send path causing seqno errors
With a recent change to our send path for FSF commands we introduced a
possible use-after-free of request-objects, that might further lead to
zfcp crafting bad requests, which the FCP channel correctly complains
about with an error (FSF_PROT_SEQ_NUMB_ERROR). This error is then handled
by an adapter-wide recovery.
The following sequence illustrates the possible use-after-free:
Send Path:
int zfcp_fsf_open_port(struct zfcp_erp_action *erp_action)
{
struct zfcp_fsf_req *req;
...
spin_lock_irq(&qdio->req_q_lock);
// ^^^^^^^^^^^^^^^^
// protects QDIO queue during sending
...
req = zfcp_fsf_req_create(qdio,
FSF_QTCB_OPEN_PORT_WITH_DID,
SBAL_SFLAGS0_TYPE_READ,
qdio->adapter->pool.erp_req);
// ^^^^^^^^^^^^^^^^^^^
// allocation of the request-object
...
retval = zfcp_fsf_req_send(req);
...
spin_unlock_irq(&qdio->req_q_lock);
return retval;
}
static int zfcp_fsf_req_send(struct zfcp_fsf_req *req)
{
struct zfcp_adapter *adapter = req->adapter;
struct zfcp_qdio *qdio = adapter->qdio;
...
zfcp_reqlist_add(adapter->req_list, req);
// ^^^^^^^^^^^^^^^^
// add request to our driver-internal hash-table for tracking
// (protected by separate lock req_list->lock)
...
if (zfcp_qdio_send(qdio, &req->qdio_req)) {
// ^^^^^^^^^^^^^^
// hand-off the request to FCP channel;
// the request can complete at any point now
...
}
/* Don't increase for unsolicited status */
if (!zfcp_fsf_req_is_status_read_buffer(req))
// ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
// possible use-after-free
adapter->fsf_req_seq_no++;
// ^^^^^^^^^^^^^^^^
// because of the use-after-free we might
// miss this accounting, and as follow-up
// this results in the FCP channel error
// FSF_PROT_SEQ_NUMB_ERROR
adapter->req_no++;
return 0;
}
static inline bool
zfcp_fsf_req_is_status_read_buffer(struct zfcp_fsf_req *req)
{
return req->qtcb == NULL;
// ^^^^^^^^^
// possible use-after-free
}
Response Path:
void zfcp_fsf_reqid_check(struct zfcp_qdio *qdio, int sbal_idx)
{
...
struct zfcp_fsf_req *fsf_req;
...
for (idx = 0; idx < QDIO_MAX_ELEMENTS_PER_BUFFER; idx++) {
...
fsf_req = zfcp_reqlist_find_rm(adapter->req_list,
req_id);
// ^^^^^^^^^^^^^^^^^^^^
// remove request from our driver-internal
// hash-table (lock req_list->lock)
...
zfcp_fsf_req_complete(fsf_req);
}
}
static void zfcp_fsf_req_complete(struct zfcp_fsf_req *req)
{
...
if (likely(req->status & ZFCP_STATUS_FSFREQ_CLEANUP))
zfcp_fsf_req_free(req);
// ^^^^^^^^^^^^^^^^^
// free memory for request-object
else
complete(&req->completion);
// ^^^^^^^^
// completion notification for code-paths that wait
// synchronous for the completion of the request; in
// those the memory is freed separately
}
The result of the use-after-free only affects the send path, and cannot
lead to any data corruption. In case we miss the sequence-number
accounting, because the memory was already re-purposed, the next FSF
command will fail with said FCP channel error, and we will recover the
whole adapter. This causes no additional errors, but it slows down
traffic. There is a slight chance of the same thing happening again
recursively after the adapter recovery, but so far this has not been seen.
This was seen under z/VM, where the send path might run on a virtual CPU
that gets scheduled away by z/VM, while the return path might still run,
and so create the necessary timing. Running with KASAN can also slow down
the kernel sufficiently to run into this use-after-free, and then see the
report by KASAN.
To fix this, simply pull the test for the sequence-number accounting in
front of the hand-off to the FCP channel (this information doesn't change
during hand-off), but leave the sequence-number accounting itself where it
is.
To make future regressions of the same kind less likely, add comments to
all closely related code-paths.
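The fix described above can be sketched in plain user-space C. This is only an illustration under stated assumptions: `demo_req`, `demo_adapter`, `demo_req_send()` and `demo_send_and_poison()` are hypothetical stand-ins, not the driver's real types or functions. The point it shows is that the "is this a status read buffer" test is evaluated before the hand-off, so the accounting afterwards never dereferences the request again.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the driver structures (assumption). */
struct demo_req {
	void *qtcb;	/* NULL marks an unsolicited status read buffer */
};

struct demo_adapter {
	unsigned int fsf_req_seq_no;
	unsigned int req_no;
};

/* Hypothetical hand-off callback; once it returns 0 the request may be gone. */
typedef int (*demo_send_fn)(struct demo_req *req);

/* Simulates a completion path that re-purposes the request memory at once. */
static int demo_send_and_poison(struct demo_req *req)
{
	memset(req, 0, sizeof(*req));
	return 0;
}

static int demo_req_send(struct demo_adapter *adapter, struct demo_req *req,
			 demo_send_fn send)
{
	/* Read everything needed from req BEFORE the hand-off. */
	const bool is_srb = (req->qtcb == NULL);

	if (send(req))
		return -1;	/* hand-off failed; the caller still owns req */

	/* req must not be dereferenced past this point. */
	if (!is_srb)
		adapter->fsf_req_seq_no++;
	adapter->req_no++;
	return 0;
}
```

Calling `demo_req_send()` with `demo_send_and_poison` as the hand-off still increments the sequence number for a regular request, because the decision was cached before the memory was overwritten.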
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Fixes: f9eca0227600 ("scsi: zfcp: drop duplicate fsf_command from zfcp_fsf_req which is also in QTCB header")
Cc: <stable@vger.kernel.org> #5.0+
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-03 05:02:00 +08:00
|
|
|
/* NOTE: DO NOT TOUCH req PAST THIS POINT! */
|
2008-07-02 16:56:39 +08:00
|
|
|
out:
|
[SCSI] zfcp: Change spin_lock_bh to spin_lock_irq to fix lockdep warning
With the change to use the data on the SCSI device, iterating through
all LUNs/scsi_devices takes the SCSI host_lock. This triggers warnings
from the lock dependency checker:
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.34.1 #97
---------------------------------------------------------
chchp/3224 just changed the state of lock:
(&(shost->host_lock)->rlock){-.-...}, at: [<00000000003a73f4>] __scsi_iterate_devices+0x38/0xbc
but this lock took another, HARDIRQ-unsafe lock in the past:
(&(&qdio->req_q_lock)->rlock){+.-...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this: [ 24.972394] 2 locks held by chchp/3224:
#0: (&(sch->lock)->rlock){-.-...}, at: [<0000000000401efa>] do_IRQ+0xb2/0x1e4
#1: (&adapter->port_list_lock){.-....}, at: [<0000000000490302>] zfcp_erp_modify_adapter_status+0x9e/0x16c
[...]
=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.34.1 #98
---------------------------------------------------------
chchp/3235 just changed the state of lock:
(&(shost->host_lock)->rlock){-.-...}, at: [<00000000003a73f4>] __scsi_iterate_devices+0x38/0xbc
but this lock took another, HARDIRQ-unsafe lock in the past:
(&(&qdio->stat_lock)->rlock){+.-...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
2 locks held by chchp/3235:
#0: (&(sch->lock)->rlock){-.-...}, at: [<0000000000401efa>] do_IRQ+0xb2/0x1e4
#1: (&adapter->port_list_lock){.-.-..}, at: [<00000000004902f6>] zfcp_erp_modify_adapter_status+0x9e/0x16c
[...]
To stop this warning, change the request queue lock to disable irqs,
not only softirq. The changes are required only outside of the
critical "send fcp command" path.
Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-09-08 20:39:57 +08:00
|
|
|
spin_unlock_irq(&qdio->req_q_lock);
|
2008-07-02 16:56:39 +08:00
|
|
|
return retval;
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
|
|
|
|
2018-11-08 22:44:42 +08:00
|
|
|
static void zfcp_fsf_update_lat(struct zfcp_latency_record *lat_rec, u32 lat)
|
2008-05-06 17:00:05 +08:00
|
|
|
{
|
|
|
|
lat_rec->sum += lat;
|
2008-07-02 16:56:39 +08:00
|
|
|
lat_rec->min = min(lat_rec->min, lat);
|
|
|
|
lat_rec->max = max(lat_rec->max, lat);
|
2008-05-06 17:00:05 +08:00
|
|
|
}
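As a rough user-space illustration of the running sum/min/max bookkeeping done by `zfcp_fsf_update_lat()`, the sketch below uses a hypothetical `demo_latency_record`. The initial values chosen in `demo_lat_init()` (minimum at `UINT32_MAX`, maximum and sum at zero) are an assumption; the listing here does not show how the driver initialises its records.

```c
#include <stdint.h>

/* Hypothetical record, loosely modelled on the fields used above. */
struct demo_latency_record {
	uint32_t min;
	uint32_t max;
	uint64_t sum;
};

/* Assumed initial state: min starts high so the first sample replaces it. */
static void demo_lat_init(struct demo_latency_record *rec)
{
	rec->min = UINT32_MAX;
	rec->max = 0;
	rec->sum = 0;
}

/* Fold one latency sample into the running statistics. */
static void demo_lat_update(struct demo_latency_record *rec, uint32_t lat)
{
	rec->sum += lat;
	if (lat < rec->min)
		rec->min = lat;
	if (lat > rec->max)
		rec->max = lat;
}
```

An average can be derived later by dividing `sum` by a separately kept sample counter, which is what the `lat->counter++` in the trace function suggests.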
|
|
|
|
|
2009-11-24 23:54:03 +08:00
|
|
|
static void zfcp_fsf_req_trace(struct zfcp_fsf_req *req, struct scsi_cmnd *scsi)
|
2008-05-06 17:00:05 +08:00
|
|
|
{
|
2009-11-24 23:54:03 +08:00
|
|
|
struct fsf_qual_latency_info *lat_in;
|
2018-11-08 22:44:42 +08:00
|
|
|
struct zfcp_latency_cont *lat = NULL;
|
2012-09-04 21:23:36 +08:00
|
|
|
struct zfcp_scsi_dev *zfcp_sdev;
|
2009-11-24 23:54:03 +08:00
|
|
|
struct zfcp_blk_drv_data blktrc;
|
|
|
|
int ticks = req->adapter->timer_ticks;
|
2008-05-06 17:00:05 +08:00
|
|
|
|
2009-11-24 23:54:03 +08:00
|
|
|
lat_in = &req->qtcb->prefix.prot_status_qual.latency_info;
|
2008-05-06 17:00:05 +08:00
|
|
|
|
2009-11-24 23:54:03 +08:00
|
|
|
blktrc.flags = 0;
|
|
|
|
blktrc.magic = ZFCP_BLK_DRV_DATA_MAGIC;
|
|
|
|
if (req->status & ZFCP_STATUS_FSFREQ_ERROR)
|
|
|
|
blktrc.flags |= ZFCP_BLK_REQ_ERROR;
|
2010-07-16 21:37:38 +08:00
|
|
|
blktrc.inb_usage = 0;
|
2010-02-17 18:18:59 +08:00
|
|
|
blktrc.outb_usage = req->qdio_req.qdio_outb_usage;
|
2009-11-24 23:54:03 +08:00
|
|
|
|
2010-04-01 19:04:08 +08:00
|
|
|
if (req->adapter->adapter_features & FSF_FEATURE_MEASUREMENT_DATA &&
|
|
|
|
!(req->status & ZFCP_STATUS_FSFREQ_ERROR)) {
|
2012-09-04 21:23:36 +08:00
|
|
|
zfcp_sdev = sdev_to_zfcp(scsi->device);
|
2009-11-24 23:54:03 +08:00
|
|
|
blktrc.flags |= ZFCP_BLK_LAT_VALID;
|
|
|
|
blktrc.channel_lat = lat_in->channel_lat * ticks;
|
|
|
|
blktrc.fabric_lat = lat_in->fabric_lat * ticks;
|
|
|
|
|
|
|
|
switch (req->qtcb->bottom.io.data_direction) {
|
2010-07-16 21:37:42 +08:00
|
|
|
case FSF_DATADIR_DIF_READ_STRIP:
|
|
|
|
case FSF_DATADIR_DIF_READ_CONVERT:
|
2009-11-24 23:54:03 +08:00
|
|
|
case FSF_DATADIR_READ:
|
2010-09-08 20:39:55 +08:00
|
|
|
lat = &zfcp_sdev->latencies.read;
|
2009-11-24 23:54:03 +08:00
|
|
|
break;
|
2010-07-16 21:37:42 +08:00
|
|
|
case FSF_DATADIR_DIF_WRITE_INSERT:
|
|
|
|
case FSF_DATADIR_DIF_WRITE_CONVERT:
|
2009-11-24 23:54:03 +08:00
|
|
|
case FSF_DATADIR_WRITE:
|
2010-09-08 20:39:55 +08:00
|
|
|
lat = &zfcp_sdev->latencies.write;
|
2009-11-24 23:54:03 +08:00
|
|
|
break;
|
|
|
|
case FSF_DATADIR_CMND:
|
2010-09-08 20:39:55 +08:00
|
|
|
lat = &zfcp_sdev->latencies.cmd;
|
2009-11-24 23:54:03 +08:00
|
|
|
break;
|
|
|
|
}
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2009-11-24 23:54:03 +08:00
|
|
|
if (lat) {
|
2010-09-08 20:39:55 +08:00
|
|
|
spin_lock(&zfcp_sdev->latencies.lock);
|
2009-11-24 23:54:03 +08:00
|
|
|
zfcp_fsf_update_lat(&lat->channel, lat_in->channel_lat);
|
|
|
|
zfcp_fsf_update_lat(&lat->fabric, lat_in->fabric_lat);
|
|
|
|
lat->counter++;
|
2010-09-08 20:39:55 +08:00
|
|
|
spin_unlock(&zfcp_sdev->latencies.lock);
|
2009-11-24 23:54:03 +08:00
|
|
|
}
|
2008-10-16 14:23:39 +08:00
|
|
|
}
|
|
|
|
|
2021-08-10 07:03:13 +08:00
|
|
|
blk_add_driver_data(scsi_cmd_to_rq(scsi), &blktrc, sizeof(blktrc));
|
2008-10-16 14:23:39 +08:00
|
|
|
}
|
|
|
|
|
2018-05-18 01:14:51 +08:00
|
|
|
/**
|
|
|
|
* zfcp_fsf_fcp_handler_common() - FCP response handler common to I/O and TMF.
|
|
|
|
* @req: Pointer to FSF request.
|
|
|
|
* @sdev: Pointer to SCSI device as request context.
|
|
|
|
*/
|
|
|
|
static void zfcp_fsf_fcp_handler_common(struct zfcp_fsf_req *req,
|
|
|
|
struct scsi_device *sdev)
|
2008-07-02 16:56:39 +08:00
|
|
|
{
|
2012-09-04 21:23:36 +08:00
|
|
|
struct zfcp_scsi_dev *zfcp_sdev;
|
2008-07-02 16:56:39 +08:00
|
|
|
struct fsf_qtcb_header *header = &req->qtcb->header;
|
|
|
|
|
|
|
|
if (unlikely(req->status & ZFCP_STATUS_FSFREQ_ERROR))
|
2010-09-08 20:39:58 +08:00
|
|
|
return;
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2012-09-04 21:23:36 +08:00
|
|
|
zfcp_sdev = sdev_to_zfcp(sdev);
|
|
|
|
|
2008-07-02 16:56:39 +08:00
|
|
|
switch (header->fsf_status) {
|
|
|
|
case FSF_HANDLE_MISMATCH:
|
|
|
|
case FSF_PORT_HANDLE_NOT_VALID:
|
2018-05-18 01:14:51 +08:00
|
|
|
zfcp_erp_adapter_reopen(req->adapter, 0, "fssfch1");
|
2008-07-02 16:56:39 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
|
|
|
|
break;
|
|
|
|
case FSF_FCPLUN_NOT_VALID:
|
|
|
|
case FSF_LUN_HANDLE_NOT_VALID:
|
2010-12-02 22:16:16 +08:00
|
|
|
zfcp_erp_port_reopen(zfcp_sdev->port, 0, "fssfch2");
|
2008-07-02 16:56:39 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
|
2005-04-17 06:20:36 +08:00
|
|
|
break;
|
2008-07-02 16:56:39 +08:00
|
|
|
case FSF_SERVICE_CLASS_NOT_SUPPORTED:
|
|
|
|
zfcp_fsf_class_not_supp(req);
|
2005-04-17 06:20:36 +08:00
|
|
|
break;
|
2008-07-02 16:56:39 +08:00
|
|
|
case FSF_DIRECTION_INDICATOR_NOT_VALID:
|
|
|
|
dev_err(&req->adapter->ccw_device->dev,
|
2010-09-08 20:39:55 +08:00
|
|
|
"Incorrect direction %d, LUN 0x%016Lx on port "
|
2008-10-01 18:42:15 +08:00
|
|
|
"0x%016Lx closed\n",
|
2008-07-02 16:56:39 +08:00
|
|
|
req->qtcb->bottom.io.data_direction,
|
2010-09-08 20:39:55 +08:00
|
|
|
(unsigned long long)zfcp_scsi_dev_lun(sdev),
|
|
|
|
(unsigned long long)zfcp_sdev->port->wwpn);
|
2018-05-18 01:14:51 +08:00
|
|
|
zfcp_erp_adapter_shutdown(req->adapter, 0, "fssfch3");
|
2008-07-02 16:56:39 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
|
|
|
|
break;
|
|
|
|
case FSF_CMND_LENGTH_NOT_VALID:
|
|
|
|
dev_err(&req->adapter->ccw_device->dev,
|
2018-11-08 22:44:47 +08:00
|
|
|
"Incorrect FCP_CMND length %d, FCP device closed\n",
|
|
|
|
req->qtcb->bottom.io.fcp_cmnd_length);
|
2018-05-18 01:14:51 +08:00
|
|
|
zfcp_erp_adapter_shutdown(req->adapter, 0, "fssfch4");
|
2008-07-02 16:56:39 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
|
|
|
|
break;
|
|
|
|
case FSF_PORT_BOXED:
|
2010-09-08 20:40:01 +08:00
|
|
|
zfcp_erp_set_port_status(zfcp_sdev->port,
|
|
|
|
ZFCP_STATUS_COMMON_ACCESS_BOXED);
|
|
|
|
zfcp_erp_port_reopen(zfcp_sdev->port,
|
2010-12-02 22:16:16 +08:00
|
|
|
ZFCP_STATUS_COMMON_ERP_FAILED, "fssfch5");
|
2009-11-24 23:54:15 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
|
2008-07-02 16:56:39 +08:00
|
|
|
break;
|
|
|
|
case FSF_LUN_BOXED:
|
2010-09-08 20:40:01 +08:00
|
|
|
zfcp_erp_set_lun_status(sdev, ZFCP_STATUS_COMMON_ACCESS_BOXED);
|
|
|
|
zfcp_erp_lun_reopen(sdev, ZFCP_STATUS_COMMON_ERP_FAILED,
|
2010-12-02 22:16:16 +08:00
|
|
|
"fssfch6");
|
2009-11-24 23:54:15 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
|
2008-07-02 16:56:39 +08:00
|
|
|
break;
|
|
|
|
case FSF_ADAPTER_STATUS_AVAILABLE:
|
|
|
|
if (header->fsf_status_qual.word[0] ==
|
|
|
|
FSF_SQ_INVOKE_LINK_TEST_PROCEDURE)
|
2010-09-08 20:39:55 +08:00
|
|
|
zfcp_fc_test_link(zfcp_sdev->port);
|
2008-07-02 16:56:39 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
|
2005-04-17 06:20:36 +08:00
|
|
|
break;
|
2020-03-13 01:45:04 +08:00
|
|
|
case FSF_SECURITY_ERROR:
|
2020-03-13 01:45:05 +08:00
|
|
|
zfcp_fsf_log_security_error(&req->adapter->ccw_device->dev,
|
|
|
|
header->fsf_status_qual.word[0],
|
|
|
|
zfcp_sdev->port->wwpn);
|
2020-03-13 01:45:04 +08:00
|
|
|
zfcp_erp_port_forced_reopen(zfcp_sdev->port, 0, "fssfch7");
|
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_ERROR;
|
|
|
|
break;
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
2010-09-08 20:39:58 +08:00
|
|
|
}

static void zfcp_fsf_fcp_cmnd_handler(struct zfcp_fsf_req *req)
{
	struct scsi_cmnd *scpnt;
	struct fcp_resp_with_ext *fcp_rsp;
	unsigned long flags;

	read_lock_irqsave(&req->adapter->abort_lock, flags);

	scpnt = req->data;
	if (unlikely(!scpnt)) {
		read_unlock_irqrestore(&req->adapter->abort_lock, flags);
		return;
|
2008-07-02 16:56:39 +08:00
|
|
|
}
|
2010-09-08 20:39:58 +08:00
|
|
|
|
2018-05-18 01:14:51 +08:00
|
|
|
zfcp_fsf_fcp_handler_common(req, scpnt->device);
|
[SCSI] zfcp: Fix common FCP request reception
The reception of a common FCP request should only be evaluated if the
corresponding SCSI request data is available. Therefore put the
information under the lock protection and verify the existence before
processing. This fixes the following kernel panic.
Unable to handle kernel pointer dereference at virtual kernel address 0000000180000000
Oops: 003b [#1] PREEMPT SMP DEBUG_PAGEALLOC
CPU: 0 Not tainted 2.6.35.7-45.x.20101007-s390xdefault #1
Process blast (pid: 9711, task: 00000000a3be8e40, ksp: 00000000b221bac0)
Krnl PSW : 0704300180000000 0000000000489878 (zfcp_fsf_fcp_handler_common+0x4c/0x3a0)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:3 PM:0 EA:3
Krnl GPRS: 00000000b663c1b8 0000000180000000 000000007ab5bdf0 0000000000000000
00000000b0ccd800 0000000000000018 07000000a3be8e78 00000000b5d3e600
000000007ab5bdf0 0000000000000066 00000000b72137f0 00000000b72137f0
0000000000000000 00000000005a8178 00000000bdf37a60 00000000bdf379f0
Krnl Code: 0000000000489866: e3c030000004 lg %r12,0(%r3)
000000000048986c: e310c0000004 lg %r1,0(%r12)
0000000000489872: e31011e00004 lg %r1,480(%r1)
>0000000000489878: 581011ec l %r1,492(%r1)
000000000048987c: a774001c brc 7,4898b4
0000000000489880: b91400b1 lgfr %r11,%r1
0000000000489884: 5810405c l %r1,92(%r4)
0000000000489888: 5510d00c cl %r1,12(%r13)
Call Trace:
([<000000000010d344>] debug_event_common+0x22c/0x244)
[<000000000048a0b4>] zfcp_fsf_fcp_cmnd_handler+0x2c/0x3b4
[<000000000048b5b6>] zfcp_fsf_req_complete+0x1b6/0x9dc
[<000000000048bede>] zfcp_fsf_reqid_check+0x102/0x138
[<000000000048e478>] zfcp_qdio_int_resp+0x70/0x110
[<000000000044a1ec>] qdio_kick_handler+0xb0/0x19c
[<000000000044c228>] __tiqdio_inbound_processing+0x30c/0xebc
[<000000000014a5fc>] tasklet_action+0x1b4/0x1e8
[<000000000014b676>] __do_softirq+0x106/0x1cc
[<000000000010d91a>] do_softirq+0xe6/0xec
[<000000000014b0c8>] irq_exit+0xd4/0xd8
[<00000000004307ec>] do_IRQ+0x7c0/0xf54
[<0000000000114d28>] io_return+0x0/0x16
[<000000000055fef0>] sub_preempt_count+0x50/0xe4
([<00000000b1f873c0>] 0xb1f873c0)
[<000000000055e25a>] _raw_spin_unlock+0x46/0x74
[<0000000000241c40>] __d_lookup+0x288/0x2c8
[<000000000023502c>] do_lookup+0x7c/0x25c
[<0000000000237fa8>] link_path_walk+0x5e4/0xe2c
[<0000000000238a00>] path_walk+0x98/0x148
[<0000000000238c98>] do_path_lookup+0x74/0xc0
[<000000000023989c>] user_path_at+0x64/0xa4
[<000000000022e366>] vfs_fstatat+0x4e/0xb0
[<000000000022e4d6>] SyS_newstat+0x2e/0x54
[<00000000001146de>] sysc_noemu+0x10/0x16
[<0000020000153456>] 0x20000153456
INFO: lockdep is turned off.
Last Breaking-Event-Address:
[<000000000048a0ae>] zfcp_fsf_fcp_cmnd_handler+0x26/0x3b4
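The pattern behind this fix (fetch the command pointer only while holding the lock, and bail out if it is already gone) can be sketched in user space as follows. This is a simplified illustration, not the driver code: `demo_cmd`, `demo_req` and `demo_cmnd_handler()` are hypothetical names, and a `pthread_mutex_t` stands in for the adapter's `abort_lock`, which in the driver is a reader/writer lock.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the SCSI command (assumption). */
struct demo_cmd {
	int result;
};

/* Hypothetical stand-in for the FSF request (assumption). */
struct demo_req {
	pthread_mutex_t *abort_lock;	/* stand-in for adapter->abort_lock */
	struct demo_cmd *data;		/* may be cleared by a concurrent abort */
};

/* Returns 1 if the command was completed, 0 if it had already been taken away. */
static int demo_cmnd_handler(struct demo_req *req)
{
	struct demo_cmd *cmd;

	pthread_mutex_lock(req->abort_lock);

	/* Read the pointer only under the lock and verify it before use. */
	cmd = req->data;
	if (cmd == NULL) {
		pthread_mutex_unlock(req->abort_lock);
		return 0;
	}

	/* All processing that touches cmd happens while the lock is held. */
	cmd->result = 1;

	pthread_mutex_unlock(req->abort_lock);
	return 1;
}
```

The important property is ordering: nothing derived from the command is evaluated before the NULL check, which is what the crash in the trace above violated.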
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2010-11-17 21:23:40 +08:00
|
|
|
|
2010-09-08 20:39:58 +08:00
	if (unlikely(req->status & ZFCP_STATUS_FSFREQ_ERROR)) {
		set_host_byte(scpnt, DID_TRANSPORT_DISRUPTED);
		goto skip_fsfstatus;
	}

	switch (req->qtcb->header.fsf_status) {
	case FSF_INCONSISTENT_PROT_DATA:
	case FSF_INVALID_PROT_PARM:
		set_host_byte(scpnt, DID_ERROR);
		goto skip_fsfstatus;
	case FSF_BLOCK_GUARD_CHECK_FAILURE:
		zfcp_scsi_dif_sense_error(scpnt, 0x1);
		goto skip_fsfstatus;
	case FSF_APP_TAG_CHECK_FAILURE:
		zfcp_scsi_dif_sense_error(scpnt, 0x2);
		goto skip_fsfstatus;
	case FSF_REF_TAG_CHECK_FAILURE:
		zfcp_scsi_dif_sense_error(scpnt, 0x3);
		goto skip_fsfstatus;
	}
|
2017-07-28 18:31:01 +08:00
|
|
|
BUILD_BUG_ON(sizeof(struct fcp_resp_with_ext) > FSF_FCP_RSP_SIZE);
|
|
|
|
fcp_rsp = &req->qtcb->bottom.io.fcp_rsp.iu;
|
2010-09-08 20:39:58 +08:00
|
|
|
zfcp_fc_eval_fcp_rsp(fcp_rsp, scpnt);
|
|
|
|
|
|
|
|
skip_fsfstatus:
|
|
|
|
zfcp_fsf_req_trace(req, scpnt);
|
2010-12-02 22:16:15 +08:00
|
|
|
zfcp_dbf_scsi_result(scpnt, req);
|
2010-09-08 20:39:58 +08:00
|
|
|
|
|
|
|
scpnt->host_scribble = NULL;
|
2021-10-08 04:28:02 +08:00
|
|
|
scsi_done(scpnt);
|
2010-09-08 20:39:58 +08:00
	/*
	 * We must hold this lock until scsi_done has been called.
	 * Otherwise we may call scsi_done after abort regarding this
	 * command has completed.
	 * Note: scsi_done must not block!
	 */
	read_unlock_irqrestore(&req->adapter->abort_lock, flags);
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
|
|
|
|
2010-07-16 21:37:42 +08:00
static int zfcp_fsf_set_data_dir(struct scsi_cmnd *scsi_cmnd, u32 *data_dir)
{
	switch (scsi_get_prot_op(scsi_cmnd)) {
	case SCSI_PROT_NORMAL:
		switch (scsi_cmnd->sc_data_direction) {
		case DMA_NONE:
			*data_dir = FSF_DATADIR_CMND;
			break;
		case DMA_FROM_DEVICE:
			*data_dir = FSF_DATADIR_READ;
			break;
		case DMA_TO_DEVICE:
			*data_dir = FSF_DATADIR_WRITE;
			break;
		case DMA_BIDIRECTIONAL:
			return -EINVAL;
		}
		break;

	case SCSI_PROT_READ_STRIP:
		*data_dir = FSF_DATADIR_DIF_READ_STRIP;
		break;
	case SCSI_PROT_WRITE_INSERT:
		*data_dir = FSF_DATADIR_DIF_WRITE_INSERT;
		break;
	case SCSI_PROT_READ_PASS:
		*data_dir = FSF_DATADIR_DIF_READ_CONVERT;
		break;
	case SCSI_PROT_WRITE_PASS:
		*data_dir = FSF_DATADIR_DIF_WRITE_CONVERT;
		break;
	default:
		return -EINVAL;
	}

	return 0;
}
|
|
|
|
|
2008-07-02 16:56:39 +08:00
|
|
|
/**
|
2010-09-08 20:39:55 +08:00
|
|
|
* zfcp_fsf_fcp_cmnd - initiate an FCP command (for a SCSI command)
|
2008-07-02 16:56:39 +08:00
|
|
|
* @scsi_cmnd: scsi command to be sent
|
2005-04-17 06:20:36 +08:00
|
|
|
*/
|
2010-09-08 20:39:55 +08:00
|
|
|
int zfcp_fsf_fcp_cmnd(struct scsi_cmnd *scsi_cmnd)
|
2005-04-17 06:20:36 +08:00
|
|
|
{
|
2008-07-02 16:56:39 +08:00
|
|
|
struct zfcp_fsf_req *req;
|
2009-11-24 23:54:08 +08:00
|
|
|
struct fcp_cmnd *fcp_cmnd;
|
2011-06-06 20:14:40 +08:00
|
|
|
u8 sbtype = SBAL_SFLAGS0_TYPE_READ;
|
2011-08-15 20:40:32 +08:00
|
|
|
int retval = -EIO;
|
2010-09-08 20:39:55 +08:00
|
|
|
struct scsi_device *sdev = scsi_cmnd->device;
|
|
|
|
struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(sdev);
|
|
|
|
struct zfcp_adapter *adapter = zfcp_sdev->port->adapter;
|
2009-08-18 21:43:19 +08:00
|
|
|
struct zfcp_qdio *qdio = adapter->qdio;
|
2010-07-16 21:37:42 +08:00
|
|
|
struct fsf_qtcb_bottom_io *io;
|
2010-11-18 21:53:18 +08:00
|
|
|
unsigned long flags;
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2010-09-08 20:39:55 +08:00
|
|
|
if (unlikely(!(atomic_read(&zfcp_sdev->status) &
|
2008-07-02 16:56:39 +08:00
|
|
|
ZFCP_STATUS_COMMON_UNBLOCKED)))
|
|
|
|
return -EBUSY;
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2010-11-18 21:53:18 +08:00
|
|
|
spin_lock_irqsave(&qdio->req_q_lock, flags);
|
2010-07-16 21:37:38 +08:00
|
|
|
if (atomic_read(&qdio->req_q_free) <= 0) {
|
2009-08-18 21:43:19 +08:00
|
|
|
atomic_inc(&qdio->req_q_full);
|
2008-07-02 16:56:39 +08:00
|
|
|
goto out;
|
2009-03-02 20:09:01 +08:00
|
|
|
}
|
2009-08-18 21:43:16 +08:00
|
|
|
|
2010-05-01 00:09:34 +08:00
|
|
|
if (scsi_cmnd->sc_data_direction == DMA_TO_DEVICE)
|
2011-06-06 20:14:40 +08:00
|
|
|
sbtype = SBAL_SFLAGS0_TYPE_WRITE;
|
2010-05-01 00:09:34 +08:00
|
|
|
|
2009-08-18 21:43:19 +08:00
|
|
|
req = zfcp_fsf_req_create(qdio, FSF_QTCB_FCP_CMND,
|
2010-05-01 00:09:34 +08:00
|
|
|
sbtype, adapter->pool.scsi_req);
|
2009-08-18 21:43:16 +08:00
|
|
|
|
2008-08-21 19:43:37 +08:00
|
|
|
if (IS_ERR(req)) {
|
2008-07-02 16:56:39 +08:00
|
|
|
retval = PTR_ERR(req);
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
2023-02-22 01:55:59 +08:00
|
|
|
BUILD_BUG_ON(sizeof(scsi_cmnd->host_scribble) < sizeof(req->req_id));
|
2010-07-16 21:37:42 +08:00
|
|
|
scsi_cmnd->host_scribble = (unsigned char *) req->req_id;
|
|
|
|
|
|
|
|
io = &req->qtcb->bottom.io;
|
2009-08-18 21:43:16 +08:00
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_CLEANUP;
|
2008-07-02 16:56:39 +08:00
|
|
|
req->data = scsi_cmnd;
|
2010-09-08 20:39:58 +08:00
|
|
|
req->handler = zfcp_fsf_fcp_cmnd_handler;
|
2010-09-08 20:39:55 +08:00
|
|
|
req->qtcb->header.lun_handle = zfcp_sdev->lun_handle;
|
|
|
|
req->qtcb->header.port_handle = zfcp_sdev->port->handle;
|
2010-07-16 21:37:42 +08:00
|
|
|
io->service_class = FSF_CLASS_3;
|
|
|
|
io->fcp_cmnd_length = FCP_CMND_LEN;
|
2008-07-02 16:56:39 +08:00
|
|
|
|
2010-07-16 21:37:42 +08:00
|
|
|
if (scsi_get_prot_op(scsi_cmnd) != SCSI_PROT_NORMAL) {
|
2021-06-09 11:39:20 +08:00
|
|
|
io->data_block_length = scsi_prot_interval(scsi_cmnd);
|
|
|
|
io->ref_tag_value = scsi_prot_ref_tag(scsi_cmnd);
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
|
|
|
|
2011-08-15 20:40:30 +08:00
|
|
|
if (zfcp_fsf_set_data_dir(scsi_cmnd, &io->data_direction))
|
|
|
|
goto failed_scsi_cmnd;
|
2010-07-16 21:37:42 +08:00
|
|
|
|
2017-07-28 18:31:01 +08:00
|
|
|
BUILD_BUG_ON(sizeof(struct fcp_cmnd) > FSF_FCP_CMND_SIZE);
|
|
|
|
fcp_cmnd = &req->qtcb->bottom.io.fcp_cmnd.iu;
|
2018-05-18 01:14:52 +08:00
|
|
|
zfcp_fc_scsi_to_fcp(fcp_cmnd, scsi_cmnd);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2017-07-28 18:30:51 +08:00
|
|
|
if ((scsi_get_prot_op(scsi_cmnd) != SCSI_PROT_NORMAL) &&
|
|
|
|
scsi_prot_sg_count(scsi_cmnd)) {
|
2010-07-16 21:37:42 +08:00
|
|
|
zfcp_qdio_set_data_div(qdio, &req->qdio_req,
|
|
|
|
scsi_prot_sg_count(scsi_cmnd));
|
2011-08-15 20:40:32 +08:00
|
|
|
retval = zfcp_qdio_sbals_from_sg(qdio, &req->qdio_req,
|
|
|
|
scsi_prot_sglist(scsi_cmnd));
|
|
|
|
if (retval)
|
|
|
|
goto failed_scsi_cmnd;
|
|
|
|
io->prot_data_length = zfcp_qdio_real_bytes(
|
2010-07-16 21:37:42 +08:00
|
|
|
scsi_prot_sglist(scsi_cmnd));
|
|
|
|
}
|
|
|
|
|
2011-08-15 20:40:32 +08:00
|
|
|
retval = zfcp_qdio_sbals_from_sg(qdio, &req->qdio_req,
|
|
|
|
scsi_sglist(scsi_cmnd));
|
|
|
|
if (unlikely(retval))
|
2008-07-02 16:56:39 +08:00
|
|
|
goto failed_scsi_cmnd;
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2010-07-16 21:37:42 +08:00
|
|
|
zfcp_qdio_set_sbale_last(adapter->qdio, &req->qdio_req);
|
2011-08-15 20:40:32 +08:00
|
|
|
if (zfcp_adapter_multi_buffer_active(adapter))
|
|
|
|
zfcp_qdio_set_scount(qdio, &req->qdio_req);
|
2010-07-16 21:37:42 +08:00
|
|
|
|
2008-07-02 16:56:39 +08:00
|
|
|
retval = zfcp_fsf_req_send(req);
|
|
|
|
if (unlikely(retval))
|
|
|
|
goto failed_scsi_cmnd;
|
2019-07-03 05:02:00 +08:00
|
|
|
/* NOTE: DO NOT TOUCH req PAST THIS POINT! */
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-07-02 16:56:39 +08:00
|
|
|
goto out;
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-07-02 16:56:39 +08:00
|
|
|
failed_scsi_cmnd:
|
|
|
|
zfcp_fsf_req_free(req);
|
|
|
|
scsi_cmnd->host_scribble = NULL;
|
|
|
|
out:
|
2010-11-18 21:53:18 +08:00
|
|
|
spin_unlock_irqrestore(&qdio->req_q_lock, flags);
|
2008-07-02 16:56:39 +08:00
|
|
|
return retval;
|
2005-04-17 06:20:36 +08:00
|
|
|
}
|
|
|
|
|
2010-09-08 20:39:58 +08:00
|
|
|
static void zfcp_fsf_fcp_task_mgmt_handler(struct zfcp_fsf_req *req)
|
|
|
|
{
|
2018-05-18 01:14:51 +08:00
|
|
|
struct scsi_device *sdev = req->data;
|
2010-09-08 20:39:58 +08:00
|
|
|
struct fcp_resp_with_ext *fcp_rsp;
|
|
|
|
struct fcp_resp_rsp_info *rsp_info;
|
|
|
|
|
2018-05-18 01:14:51 +08:00
|
|
|
zfcp_fsf_fcp_handler_common(req, sdev);
|
2010-09-08 20:39:58 +08:00
|
|
|
|
2017-07-28 18:31:01 +08:00
|
|
|
fcp_rsp = &req->qtcb->bottom.io.fcp_rsp.iu;
|
2010-09-08 20:39:58 +08:00
|
|
|
rsp_info = (struct fcp_resp_rsp_info *) &fcp_rsp[1];
|
|
|
|
|
|
|
|
if ((rsp_info->rsp_code != FCP_TMF_CMPL) ||
|
|
|
|
(req->status & ZFCP_STATUS_FSFREQ_ERROR))
|
|
|
|
req->status |= ZFCP_STATUS_FSFREQ_TMFUNCFAILED;
|
|
|
|
}
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
/**
|
2018-05-18 01:14:54 +08:00
|
|
|
* zfcp_fsf_fcp_task_mgmt() - Send SCSI task management command (TMF).
|
|
|
|
* @sdev: Pointer to SCSI device to send the task management command to.
|
|
|
|
* @tm_flags: Unsigned byte for task management flags.
|
|
|
|
*
|
|
|
|
* Return: On success pointer to struct zfcp_fsf_req, %NULL otherwise.
|
2005-04-17 06:20:36 +08:00
|
|
|
*/
|
2018-05-18 01:14:54 +08:00
|
|
|
struct zfcp_fsf_req *zfcp_fsf_fcp_task_mgmt(struct scsi_device *sdev,
|
2010-09-08 20:39:55 +08:00
|
|
|
u8 tm_flags)
|
2005-04-17 06:20:36 +08:00
|
|
|
{
|
2008-07-02 16:56:39 +08:00
|
|
|
struct zfcp_fsf_req *req = NULL;
|
2009-11-24 23:54:08 +08:00
|
|
|
struct fcp_cmnd *fcp_cmnd;
|
2018-05-18 01:14:53 +08:00
|
|
|
struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(sdev);
|
2010-09-08 20:39:55 +08:00
|
|
|
struct zfcp_qdio *qdio = zfcp_sdev->port->adapter->qdio;
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2010-09-08 20:39:55 +08:00
|
|
|
if (unlikely(!(atomic_read(&zfcp_sdev->status) &
|
2008-07-02 16:56:39 +08:00
|
|
|
ZFCP_STATUS_COMMON_UNBLOCKED)))
|
|
|
|
return NULL;
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2010-09-08 20:39:57 +08:00
|
|
|
spin_lock_irq(&qdio->req_q_lock);
|
2010-05-01 00:09:35 +08:00
|
|
|
if (zfcp_qdio_sbal_get(qdio))
|
2008-07-02 16:56:39 +08:00
|
|
|
goto out;
|
2009-08-18 21:43:16 +08:00
|
|
|
|
2009-08-18 21:43:19 +08:00
|
|
|
req = zfcp_fsf_req_create(qdio, FSF_QTCB_FCP_CMND,
|
2011-06-06 20:14:40 +08:00
|
|
|
SBAL_SFLAGS0_TYPE_WRITE,
|
2009-08-18 21:43:19 +08:00
|
|
|
qdio->adapter->pool.scsi_req);
|
2009-08-18 21:43:16 +08:00
|
|
|
|
2008-11-27 01:07:37 +08:00
|
|
|
if (IS_ERR(req)) {
|
|
|
|
req = NULL;
|
2008-07-02 16:56:39 +08:00
|
|
|
goto out;
|
2008-11-27 01:07:37 +08:00
|
|
|
}
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2018-05-18 01:14:53 +08:00
|
|
|
req->data = sdev;
|
|
|
|
|
2010-09-08 20:39:58 +08:00
|
|
|
req->handler = zfcp_fsf_fcp_task_mgmt_handler;
|
2010-09-08 20:39:55 +08:00
|
|
|
req->qtcb->header.lun_handle = zfcp_sdev->lun_handle;
|
|
|
|
req->qtcb->header.port_handle = zfcp_sdev->port->handle;
|
2008-07-02 16:56:39 +08:00
|
|
|
req->qtcb->bottom.io.data_direction = FSF_DATADIR_CMND;
|
|
|
|
req->qtcb->bottom.io.service_class = FSF_CLASS_3;
|
2009-11-24 23:54:08 +08:00
|
|
|
req->qtcb->bottom.io.fcp_cmnd_length = FCP_CMND_LEN;
|
2008-07-02 16:56:39 +08:00
|
|
|
|
2010-05-01 00:09:34 +08:00
|
|
|
zfcp_qdio_set_sbale_last(qdio, &req->qdio_req);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2017-07-28 18:31:01 +08:00
|
|
|
fcp_cmnd = &req->qtcb->bottom.io.fcp_cmnd.iu;
|
2018-05-18 01:14:53 +08:00
|
|
|
zfcp_fc_fcp_tm(fcp_cmnd, sdev, tm_flags);
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2018-11-08 22:44:40 +08:00
|
|
|
zfcp_fsf_start_timer(req, ZFCP_FSF_SCSI_ER_TIMEOUT);
|
scsi: zfcp: fix request object use-after-free in send path causing seqno errors
With a recent change to our send path for FSF commands we introduced a
possible use-after-free of request-objects, that might further lead to
zfcp crafting bad requests, which the FCP channel correctly complains
about with an error (FSF_PROT_SEQ_NUMB_ERROR). This error is then handled
by an adapter-wide recovery.
The following sequence illustrates the possible use-after-free:
Send Path:
int zfcp_fsf_open_port(struct zfcp_erp_action *erp_action)
{
struct zfcp_fsf_req *req;
...
spin_lock_irq(&qdio->req_q_lock);
// ^^^^^^^^^^^^^^^^
// protects QDIO queue during sending
...
req = zfcp_fsf_req_create(qdio,
FSF_QTCB_OPEN_PORT_WITH_DID,
SBAL_SFLAGS0_TYPE_READ,
qdio->adapter->pool.erp_req);
// ^^^^^^^^^^^^^^^^^^^
// allocation of the request-object
...
retval = zfcp_fsf_req_send(req);
...
spin_unlock_irq(&qdio->req_q_lock);
return retval;
}
static int zfcp_fsf_req_send(struct zfcp_fsf_req *req)
{
struct zfcp_adapter *adapter = req->adapter;
struct zfcp_qdio *qdio = adapter->qdio;
...
zfcp_reqlist_add(adapter->req_list, req);
// ^^^^^^^^^^^^^^^^
// add request to our driver-internal hash-table for tracking
// (protected by separate lock req_list->lock)
...
if (zfcp_qdio_send(qdio, &req->qdio_req)) {
// ^^^^^^^^^^^^^^
// hand-off the request to FCP channel;
// the request can complete at any point now
...
}
/* Don't increase for unsolicited status */
if (!zfcp_fsf_req_is_status_read_buffer(req))
// ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
// possible use-after-free
adapter->fsf_req_seq_no++;
// ^^^^^^^^^^^^^^^^
// because of the use-after-free we might
// miss this accounting, and as follow-up
// this results in the FCP channel error
// FSF_PROT_SEQ_NUMB_ERROR
adapter->req_no++;
return 0;
}
static inline bool
zfcp_fsf_req_is_status_read_buffer(struct zfcp_fsf_req *req)
{
return req->qtcb == NULL;
// ^^^^^^^^^
// possible use-after-free
}
Response Path:
void zfcp_fsf_reqid_check(struct zfcp_qdio *qdio, int sbal_idx)
{
...
struct zfcp_fsf_req *fsf_req;
...
for (idx = 0; idx < QDIO_MAX_ELEMENTS_PER_BUFFER; idx++) {
...
fsf_req = zfcp_reqlist_find_rm(adapter->req_list,
req_id);
// ^^^^^^^^^^^^^^^^^^^^
// remove request from our driver-internal
// hash-table (lock req_list->lock)
...
zfcp_fsf_req_complete(fsf_req);
}
}
static void zfcp_fsf_req_complete(struct zfcp_fsf_req *req)
{
...
if (likely(req->status & ZFCP_STATUS_FSFREQ_CLEANUP))
zfcp_fsf_req_free(req);
// ^^^^^^^^^^^^^^^^^
// free memory for request-object
else
complete(&req->completion);
// ^^^^^^^^
// completion notification for code-paths that wait
// synchronously for the completion of the request; in
// those the memory is freed separately
}
The result of the use-after-free only affects the send path, and cannot
lead to any data corruption. In case we miss the sequence-number
accounting, because the memory was already re-purposed, the next FSF
command will fail with said FCP channel error, and we will recover the
whole adapter. This causes no additional errors, but it slows down
traffic. There is a slight chance of the same thing happening again
recursively after the adapter recovery, but so far this has not been seen.
This was seen under z/VM, where the send path might run on a virtual CPU
that gets scheduled away by z/VM, while the return path might still run,
and so create the necessary timing. Running with KASAN can also slow down
the kernel sufficiently to run into this use-after-free, and then see the
report by KASAN.
To fix this, simply pull the test for the sequence-number accounting in
front of the hand-off to the FCP channel (this information doesn't change
during hand-off), but leave the sequence-number accounting itself where it
is.
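The reordering described above can be sketched in userspace as follows. This is only a minimal model, not the driver code: `demo_req`, `demo_adapter`, `demo_hand_off` and `demo_req_send` are hypothetical stand-ins, and the hand-off frees the request immediately to mimic the worst case in which the response path completes before the send path continues.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; not the zfcp driver's real types. */
struct demo_req {
	void *qtcb;	/* NULL marks a status-read buffer in this model */
};

struct demo_adapter {
	unsigned int fsf_req_seq_no;
	unsigned int req_no;
};

/*
 * Models the hand-off to the channel. The request is freed right here to
 * simulate the response path completing it before the sender resumes.
 */
static int demo_hand_off(struct demo_req *req)
{
	free(req);
	return 0;
}

static int demo_req_send(struct demo_adapter *adapter, struct demo_req *req)
{
	/* Read everything that is needed from req BEFORE the hand-off. */
	const bool is_srb = (req->qtcb == NULL);

	if (demo_hand_off(req))
		return -1;

	/* req must not be dereferenced from here on; use the saved copy. */
	if (!is_srb)
		adapter->fsf_req_seq_no++;
	adapter->req_no++;
	return 0;
}
```

The accounting stays after the hand-off, as in the description; only the read of the request moves in front of it, so the counters are updated from a local copy rather than from memory that may already have been released.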
To make future regressions of the same kind less likely, add comments to
all closely related code-paths.
Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Fixes: f9eca0227600 ("scsi: zfcp: drop duplicate fsf_command from zfcp_fsf_req which is also in QTCB header")
Cc: <stable@vger.kernel.org> #5.0+
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-03 05:02:00 +08:00
|
|
|
if (!zfcp_fsf_req_send(req)) {
|
|
|
|
/* NOTE: DO NOT TOUCH req, UNTIL IT COMPLETES! */
|
2008-07-02 16:56:39 +08:00
|
|
|
goto out;
|
2019-07-03 05:02:00 +08:00
|
|
|
}
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-07-02 16:56:39 +08:00
|
|
|
zfcp_fsf_req_free(req);
|
|
|
|
req = NULL;
|
|
|
|
out:
|
2010-09-08 20:39:57 +08:00
|
|
|
spin_unlock_irq(&qdio->req_q_lock);
|
2008-07-02 16:56:39 +08:00
|
|
|
return req;
|
|
|
|
}
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2009-08-18 21:43:13 +08:00
|
|
|
/**
|
|
|
|
* zfcp_fsf_reqid_check - validate req_id contained in SBAL returned by QDIO
|
2018-11-08 22:44:54 +08:00
|
|
|
* @qdio: pointer to struct zfcp_qdio
|
2009-08-18 21:43:13 +08:00
|
|
|
* @sbal_idx: response queue index of SBAL to be processed
|
|
|
|
*/
|
2009-08-18 21:43:19 +08:00
|
|
|
void zfcp_fsf_reqid_check(struct zfcp_qdio *qdio, int sbal_idx)
|
2009-08-18 21:43:13 +08:00
|
|
|
{
|
2009-08-18 21:43:19 +08:00
|
|
|
struct zfcp_adapter *adapter = qdio->adapter;
|
2010-07-16 21:37:38 +08:00
|
|
|
struct qdio_buffer *sbal = qdio->res_q[sbal_idx];
|
2009-08-18 21:43:13 +08:00
|
|
|
struct qdio_buffer_element *sbale;
|
|
|
|
struct zfcp_fsf_req *fsf_req;
|
2023-02-22 01:55:59 +08:00
|
|
|
u64 req_id;
|
2009-08-18 21:43:13 +08:00
|
|
|
int idx;
|
|
|
|
|
|
|
|
for (idx = 0; idx < QDIO_MAX_ELEMENTS_PER_BUFFER; idx++) {
|
|
|
|
|
|
|
|
sbale = &sbal->element[idx];
|
2024-03-07 20:28:21 +08:00
|
|
|
req_id = dma64_to_u64(sbale->addr);
|
2010-02-17 18:18:50 +08:00
|
|
|
fsf_req = zfcp_reqlist_find_rm(adapter->req_list, req_id);
|
2009-08-18 21:43:13 +08:00
|
|
|
|
2010-07-16 21:37:43 +08:00
|
|
|
if (!fsf_req) {
|
2009-08-18 21:43:13 +08:00
|
|
|
/*
|
|
|
|
* Unknown request means that we have potentially memory
|
|
|
|
* corruption and must stop the machine immediately.
|
|
|
|
*/
|
2010-07-16 21:37:43 +08:00
|
|
|
zfcp_qdio_siosl(adapter);
|
2023-02-22 01:55:59 +08:00
|
|
|
panic("error: unknown req_id (%llx) on adapter %s.\n",
|
2009-08-18 21:43:13 +08:00
|
|
|
req_id, dev_name(&adapter->ccw_device->dev));
|
2010-07-16 21:37:43 +08:00
|
|
|
}
|
2009-08-18 21:43:13 +08:00
|
|
|
|
|
|
|
zfcp_fsf_req_complete(fsf_req);
|
|
|
|
|
2011-06-06 20:14:40 +08:00
|
|
|
if (likely(sbale->eflags & SBAL_EFLAGS_LAST_ENTRY))
|
2009-08-18 21:43:13 +08:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|