This patch converts the handful of se_device statistics to type
atomic_long_t, instead of using se_device->stats_lock when
incrementing these values.
More importantly, go ahead and drop the spinlock usage within
transport_lookup_cmd_lun() fast-path code.
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds a se_device->xcopy_lun that is used for local
copy offload I/O, instead of allocating + initializing a pseudo
se_lun for each received EXTENDED_COPY operation.
Also, move declaration of struct se_lun + struct se_port_stat_grps
ahead of struct se_device.
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds percpu refcounting for se_lun access that allows the
association of an se_lun + se_cmd in transport_lookup_cmd_lun() to
occur without an extra list_head for tracking outstanding I/O during
se_lun shutdown.
This effectively changes se_lun shutdown logic to wait for outstanding
I/O percpu references to complete in transport_lun_remove_cmd() using
se_lun->lun_ref_comp, instead of explicitly draining the per se_lun
command list and waiting for individual se_cmd descriptor processing
to complete.
Cc: Kent Overstreet <kmo@daterainc.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Update copyright ownership/year information for target-core,
loopback, iscsi-target, tcm_qla2xx, vhost and iser-target.
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds the Third Party Copy (3PC) bit to signal support
for EXTENDED_COPY within standard inquiry response data.
Also add emulate_3pc device attribute in configfs (enabled by default)
to allow the exposure of this bit to be disabled, if necessary.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Martin Petersen <martin.petersen@oracle.com>
Cc: Chris Mason <chris.mason@fusionio.com>
Cc: Roland Dreier <roland@purestorage.com>
Cc: Zach Brown <zab@redhat.com>
Cc: James Bottomley <JBottomley@Parallels.com>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Nicholas Bellinger <nab@daterainc.com>
EXTENDED_COPY needs to be able to search a global list of devices
based on NAA WWN device identifiers, so add a simple g_device_list
protected by g_device_mutex.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Martin Petersen <martin.petersen@oracle.com>
Cc: Chris Mason <chris.mason@fusionio.com>
Cc: Roland Dreier <roland@purestorage.com>
Cc: Zach Brown <zab@redhat.com>
Cc: James Bottomley <JBottomley@Parallels.com>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Nicholas Bellinger <nab@daterainc.com>
This patch adds support for COMPARE_AND_WRITE emulation on a per block
basis. This logic is used as an atomic test and set primative currently
used by VMWare ESX VAAI for performing array side locking of individual
VMFS extent ownership.
This includes the COMPARE_AND_WRITE CDB parsing within sbc_parse_cdb(),
and does the majority of the work within the compare_and_write_callback()
to perform the verify instance user data comparision, and subsequent
write instance user data I/O submission upon a successfull comparision.
The synchronization is enforced by se_device->caw_sem, that is obtained
before the initial READ I/O submission in sbc_compare_and_write(). The
mutex is then released upon MISCOMPARE in compare_and_write_callback(),
or upon WRITE instance user-data completion in compare_and_write_post().
The implementation currently assumes a single logical block (NoLB=1).
v4 changes:
- Explicitly clear cmd->transport_complete_callback for two failure
cases in sbc_compare_and_write() in order to avoid double unlock
of ->caw_sem in compare_and_write_callback() (Dan Carpenter)
v3 changes:
- Convert se_device->caw_mutex to ->caw_sem
v2 changes:
- Set SCF_COMPARE_AND_WRITE and cmd->execute_cmd() to
sbc_compare_and_write() during setup in sbc_parse_cdb()
- Use sbc_compare_and_write() for initial READ submission with
DMA_FROM_DEVICE
- Reset cmd->execute_cmd() to sbc_execute_rw() for write instance
user-data in compare_and_write_callback()
- Drop SCF_BIDI command flag usage
- Set TRANSPORT_PROCESSING + transport_state flags before write
instance submission, and convert to __target_execute_cmd()
- Prevent sbc_get_size() from being being called twice to
generate incorrect size in sbc_parse_cdb()
- Enforce se_device->caw_mutex synchronization between initial
READ I/O submission, and final WRITE I/O completion.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Martin Petersen <martin.petersen@oracle.com>
Cc: Chris Mason <chris.mason@fusionio.com>
Cc: James Bottomley <JBottomley@Parallels.com>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Nicholas Bellinger <nab@daterainc.com>
This patch adds the MAXIMUM COMPARE AND WRITE LENGTH bit, currently
hardcoded to a single logical block (NoLB=1) within the Block Limits
VPD in spc_emulate_evpd_b0().
Also add emulate_caw device attribute in configfs (enabled by default)
to allow the exposure of this bit to be disabled, if necessary.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Martin Petersen <martin.petersen@oracle.com>
Cc: Chris Mason <chris.mason@fusionio.com>
Cc: James Bottomley <JBottomley@Parallels.com>
Cc: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Nicholas Bellinger <nab@daterainc.com>
Nobody should be expecting to read or write virtual_lun0.
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
It's only ever set to PR_APTPL_BUF_LEN, so we don't need a variable.
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Some were incremented, but never used anywhere from what I could tell.
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch fixes a bug in core_tpg_check_initiator_node_acl() ->
core_tpg_get_initiator_node_acl() where a dynamically created
se_node_acl generated during session login would be skipped during
subsequent lookup due to the '!acl->dynamic_node_acl' check, causing
a new se_node_acl to be created with a duplicate ->initiatorname.
This would occur when a fabric endpoint was configured with
TFO->tpg_check_demo_mode()=1 + TPF->tpg_check_demo_mode_cache()=1
preventing the release of an existing se_node_acl during se_session
shutdown.
Also, drop the unnecessary usage of core_tpg_get_initiator_node_acl()
within core_dev_init_initiator_node_lun_acl() that originally
required the extra '!acl->dynamic_node_acl' check, and just pass
the configfs provided se_node_acl pointer instead.
Cc: <stable@vger.kernel.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch changes LIO to use the configfs backend device name as the
model if you echo '1' to an individual device's emulate_model_alias attribute.
This is a valid operation only on devices with an export count of 0.
Signed-off-by: Tregaron Bayly <tbayly@bluehost.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch allows IBLOCK to check block hints in request_queue->flush_flags
when reporting current backend device WriteCacheEnabled status to a remote
SCSI initiator port.
This is done via a se_subsystem_api->get_write_cache() call instead of a
backend se_device creation time flag, as we expect REQ_FLUSH bits to possibly
change from an underlying blk_queue_flush() by the SCSI disk driver, or
internal raw struct block_device driver usage.
Also go ahead and update iblock_execute_rw() bio I/O path code to use
REQ_FLUSH + REQ_FUA hints when determining WRITE_FUA usage, and make SPC
emulation code use a spc_check_dev_wce() helper to handle both types of
cases for virtual backend subsystem drivers.
(asias: Drop unnecessary comparsion operators)
Reported-by: majianpeng <majianpeng@gmail.com>
Cc: majianpeng <majianpeng@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch fixes a possible divide by zero bug when the fabric_max_sectors
device attribute is written and backend se_device failed to be successfully
configured -> enabled.
Go ahead and use block_size=512 within se_dev_set_fabric_max_sectors()
in the event of a target_configure_device() failure case, as no valid
dev->dev_attrib.block_size value will have been setup yet.
Cc: stable@vger.kernel.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds [dev,lun]_link_magic value assignment + checks within generic
target_fabric_port_link() and target_fabric_mappedlun_link() code to ensure
destination config_item *target_item sent from configfs_symlink() ->
config_item_operations->allow_link() is the underlying se_device->dev_group
and se_lun->lun_group that we expect to symlink.
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
v2: Use correct target_core_stat.c 2006 copyright year
v3: Drop extra unnessary legal verbage from header (hch)
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch adds a new max_write_same_len device attribute for use with
WRITE_SAME w/ UNMAP=0 backend emulation. This can be useful for
lowering the default backend value (IBLOCK uses 0xFFFF).
Also, update block limits VPD emulation code in spc_emulate_evpd_b0() to
report MAXIMUM WRITE SAME LENGTH, and enforce max_write_same_len during
sbc_parse() -> sbc_setup_write_same() CDB sanity checking for all emulated
WRITE_SAME w/ UNMAP=0 cases.
(Robert: Move max_write_same_len check in sbc_setup_write_same() to
check both WRITE_SAME w/ UNMAP=1 and w/ UNMAP=0 cases)
Cc: Christoph Hellwig <hch@lst.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Robert Elliott <Elliott@hp.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Pass the sense reason as an explicit return value from the I/O submission
path instead of storing it in struct se_cmd and using negative return
values. This cleans up a lot of the code pathes, and with the sparse
annotations for the new sense_reason_t type allows for much better
error checking.
(nab: Convert spc_emulate_modesense + spc_emulate_modeselect to use
sense_reason_t with Roland's MODE SELECT changes)
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Roland Dreier <roland@purestorage.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Now that the reservations and ALUA code have been cleaned up there is no need
for the get_device_rev method, as we only need the standards revision in the
inquiry data, where we can hardcode it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
We always support ALUA for virtual backends, and never for physical ones. Simplify
the code to just deal with these two cases and remove the superflous abstractions.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
We do not support host-level reservations for the pscsi backend, and all
virtual backends are newere than SCSI-2, so just make the combined
SPC-3 + SCSI-2 support the only supported variant and kill the switches
for the different implementations, given that this code handles the no-op
version just fine.
(hch: Update DRF_SPC2_RESERVATIONS lock usage)
Signed-off-by: Christoph Hellwig <hch@lst.de>
We can just key off ordered tag emulation of the transport_type field.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Simplify the code a lot by killing the superflous struct se_subsystem_dev.
Instead se_device is allocated early on by the backend driver, which allocates
it as part of its own per-device structure, borrowing the scheme that is for
example used for inode allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The expression (max_sectors * block_size) might overflow a u32
(indeed, since iblock sets max_hw_sectors to UINT_MAX, it is
guaranteed to overflow and end up with a much-too-small result in many
common cases). Fix this by doing an equivalent calculation that
doesn't require multiplication.
While we're touching this code, avoid splitting a printk format across
two lines and use pr_info(...) instead of printk(KERN_INFO ...).
Signed-off-by: Roland Dreier <roland@purestorage.com>
Cc: stable@vger.kernel.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch drops se_subsystem_api->[write_cache,fua_write]_emulated flags
set by viritual FILEIO/IBLOCK/RD_MCP backend drivers in favor of explict
TRANSPORT_PLUGIN_PHBA_PDEV checks to know when to fail if userspace is
attempting to set virtual emulation bits for an pSCSI (passthrough)
backend device.
Reported-by: Christoph Hellwig <hch@lst.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Correct spelling typo in printk and comment within drivers/target.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
SPC says:
"The ALLOCATION LENGTH field is defined in 4.3.5.6. The allocation length
should be at least 16. Device servers compliant with SPC return CHECK
CONDITION status, with the sense key set to ILLEGAL REQUEST, and the
additional sense code set to INVALID FIELD IN CDB when the allocation
length is less than 16 bytes".
Testcase: sg_raw -r8 /dev/sdb a0 00 00 00 00 00 00 00 00 08 00 00
should fail with ILLEGAL REQUEST / INVALID FIELD IN CDB sense
does not fail without the patch
fails correctly with the patch
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Code was almost entirely divided based on value of bool param "enable".
Split it into two functions.
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Only used in a debugprint, and function signature is cleaner now.
Signed-off-by: Andy Grover <agrover@redhat.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
The last functionality of the target processing thread is offloading possibly
long running task management requests from the submitter context. To keep
TMR semantics the same we need a single threaded ordered queue, which can
be provided by a per-device workqueue with the right flags.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This patch removes the original usage of dev_attr->max_sectors in favor of
dev_attr->hw_max_sectors that is now being enforced by target core from
within transport_generic_cmd_sequencer() for SCF_SCSI_DATA_SG_IO_CDB ops.
After the recent se_task removal patches from hch, this value for IBLOCK
backends being set via configfs by userspace from an saved max_sectors
value that is turning out to be problematic, so it makes sense to go ahead
and remove this now legacy attribute all-together.
This patch also continues to make se_dev_set_default_attribs() do
(sectors / block_size) alignment for what actually get used by
target_core_mod to be safe here, following the same alignment currently
used by fabric_max_sectors.
Reported-by: Andy Grover <agrover@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Roland Dreier <roland@purestorage.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Make CDB emulation work on commands instead of tasks again as a preparation
of removing tasks completely.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
This was used at one time as a hack by FILEIO backend registration to
allow a struct block_device that was claimed with blkdev_get (by a local
filesystem mount for example) to be exported as read-only (SCSI WP=1).
Since FILEIO backend registration will no longer attempt to obtain
exclusive access to an underlying struct block_device here, this flag is
now obsolete.
Reported-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Get rid of a bunch of write-only variables. In a number of cases I
suspect actual bugs to be present, so I left all of those for a second
look.
(nab: fix lio-core patch fuzz)
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Turns an order-8 allocation into slab-sized ones, thereby preventing
allocation failures with memory fragmentation.
This likely saves memory as well, as the slab allocator can pack objects
more tightly than the buddy allocator.
(nab: Fix lio-core patch fuzz)
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Turns an order-10 allocation into slab-sized ones, thereby preventing
allocation failures with memory fragmentation.
This likely saves memory as well, as the slab allocator can pack objects
more tightly than the buddy allocator.
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
cdb_offset is always equal to offset - 8, so remove that one. More
importantly, the existing code only worked correct if
se_cmd->data_length is a multiple of 8. Pass in a length of, say, 9 and
we will happily overwrite 7 bytes of "unallocated" memory.
Now, afaics this bug is currently harmless, as allocations will
implicitly be padded to multiples of 8 bytes. But depending on such a
fact wouldn't qualify as sound engineering practice.
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
transport_kmap_data_sg can return NULL. I never saw this trigger, but
returning -ENOMEM seems better than a crash. Also removes a pointless
case while at it.
Signed-off-by: Joern Engel <joern@logfs.org>
Cc: stable@vger.kernel.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Fix possible NULL pointer dereference in target_report_luns failure path.
Signed-off-by: Joern Engel <joern@logfs.org>
Cc: stable@vger.kernel.org
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
se_dev_attrib.max_sectors currently has two independent meanings:
- It is reported in the block limits VPD page as the maximum transfer
length, ie the largest IO that the front-end (fabric) can handle.
Also the target core doesn't enforce this maximum transfer length.
- It is used to hold the size of the largest IO that the back-end can
handle, so we know when to split SCSI commands into multiple tasks.
Fix this by adding a new se_dev_attrib.fabric_max_sectors to hold the
maximum transfer length, and checking incoming IOs against that limit.
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
There is no reason to have a flag telling if a command is on the per-lun list,
we can simply do a list_empty check before removing it as long as we're careful
to always use list_del_init.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
We need to handle >1 page control cdbs, so extend the code to do a vmap
if bigger than 1 page. It seems like kmap() is still preferable if just
a page, fewer TLB shootdowns(?), so keep using that when possible.
Rename function pair for their new scope.
Signed-off-by: Andy Grover <agrover@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
- core_tpg_pre_addlun()
returns always ERR_PTR() or the pointer, never NULL. The additional
check for NULL in core_dev_add_lun() is not required.
- core_tpg_pre_dellun()
returns always ERR_PTR() or the pointer, never NULL. The check for NULL
in core_dev_del_lun() is wrong. The third argument (int *) is never
used, remove it.
- core_dev_add_lun()
returns always NULL or the pointer, never ERR_PTR. The check for
IS_ERR() is not required.
(nab: Convert core_dev_add_lun() use err.h macros for failure
handling to be consistent with the rest of target_core_fabric_configfs.c
callers)
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
It may happen that uasp will free the request in irq conntext, the
callchain:
uasp_cmd_release() -> transport_generic_free_cmd() -> core_dec_lacl_count()
where the last function enables the IRQ. Those irqs are re-disabled
later (due to the spin.*irq_restore) but in between we could get hurt.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Historically, pSCSI devices have been the ones that required target-core
to enforce a per se_device->depth_left. This patch changes target-core
to no longer (by default) enforce a per se_device->depth_left or sleep in
transport_tcq_window_closed() when we out of queue slots for all backend
export cases.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Roland Dreier <roland@purestorage.com>
Cc: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>