Factor out the mapping of scsi-tags -> uas-tags/stream-ids to a helper function
so that there is a single place where this "magic" happens.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
- Make sure we always hold the lock when setting / checking resetting
- Check resetting before checking urb->status
- Add missing check for resetting to uas_data_cmplt
- Add missing check for resetting to uas_do_work
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are various bug reports about oopses / hangs with the uas driver,
which all point to the abort-command and logical-unit-reset (task-management)
error handling paths.
Getting these right is very hard, there are quite a few corner cases, and
testing is almost impossible since under normal operation these code paths
are not used at all.
Another problem is that there are also some cases where it simply is not clear
what to do at all. E.g. over usb-2 multiple outstanding commands share the same
endpoint. What if a command gets aborted while its sense urb is half way
through completing (so some data has been transfered but not all). Since the
urb is not yet complete we don't know if the sense urb is actually for this
command, or for one of the other oustanding commands. If it is for one of the
other commands and we cancel it, then we end up in an undefined state. But if
it is actually for the command we're aborting, and the abort succeeds, then it
may never complete...
This exact same problem applies to logical unit resets too, if there are
multiple luns, then commands outstanding on both luns share the sense
endpoint. If there is only a single lun, then doing a logical unit reset is
little better then doing a full usb device reset.
So summarizing because:
1) abort / lun-reset is very tricky to get right
2) Not being able to test the tricky code, which means it will have bugs
3) This being a code path which under normal operation will never happen,
so being slow / sub-optimal here is not really an issue
4) Under error conditions we will still be able to recover through usb
device resets.
5) This may be a bit slower in some cases, but this is actually faster in
cases where the bridge ship has locked up, which seems to be the most
common error case sofar.
This commit removes the abort / lun-reset error handling paths, and also the
taks-mgmt code since those are the only 2 task-mgmt users. Leaving only the
(tested and testable) usb-device-reset error handling path in place.
Note I realize that this is somewhat of a big hammer, but currently people
are seeing very hard to debug oopses with uas. First let focus on making uas
work reliable, then we can later look into adding more fine grained error
handling.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Besides the ASM1051 (*) needing sdev->no_report_opcodes = 1, it turns out that
the JMicron JMS567 also needs it to work properly with uas (usb-storage always
sets it). Since some of the scsi devs were not to keen on the idea to
outrightly set sdev->no_report_opcodes = 1 for all uas devices, so add a quirk
for this, and set it for the JMS567.
*) Which has become a non-issue since we've completely blacklisted uas on
the ASM1051 for other reasons
Cc: stable@vger.kernel.org
Reported-and-tested-by: Claudio Bizzarri <claudio.bizzarri@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
And set this quirk for the Seagate Expansion Desk (0bc2:2312), as that one
seems to hang upon receiving an ATA_12 or ATA_16 command.
https://bugzilla.kernel.org/show_bug.cgi?id=79511https://bbs.archlinux.org/viewtopic.php?id=183190
While at it also add missing documentation for the u value for usb-storage
quirks.
Cc: stable@vger.kernel.org # 3.16, 3.17
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
--
Changes in v2: Add documentation for new t and u usb-storage.quirks flags
Changes in v3: Fix typo in documentation
Changes in v4: Also apply the quirk to (0bc2:3312)
Changes in v5: Rebased on 3.17-rc5, drop u documentation, already upstream
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
on some architecture spin_is_locked() always return false in
uniprocessor configuration and therefore it would be advise
to replace with lockdep_assert_held().
Signed-off-by: Sanjeev Sharma <Sanjeev_Sharma@mentor.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some jmicron uas chipsets act up (they disconnect from the bus) when sending
more then 32 commands to them at once.
Rather then building an ever growing list with usb-id based quirks for
devices using this chipset, simply reduce the qdepth to 32 when connected
over usb-2. 32 should be plenty to keep things close to maximum
possible throughput on usb-2.
Cc: stable@vger.kernel.org
Tested-and-reported-by: Laszlo T. <tlacix@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are also two allocations with GFP_KERNEL in the pre-/post_reset
code paths. That is no good because that is a part of the SCSI error handler.
Signed-off-by: Oliver Neukum <oliver@neukum.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
intfdata is set only after scsi_scan(). uas_pre_reset() however
needs intfdata to be valid and will follow the NULL pointer
killing khubd. intfdata must be preemptively set before the
host is registered and undone in the error case.
Signed-off-by: Oliver Neukum <oliver@neukum.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Quote Dan:
The patch e36e64930c: "uas: Use GFP_NOIO rather then GFP_ATOMIC
where possible" from Nov 7, 2013, leads to the following static
checker warning:
drivers/usb/storage/uas.c:806 uas_eh_task_mgmt()
error: scheduling with locks held: 'spin_lock:lock'
Some other allocations under spinlock are not caught.
The fix essentially reverts e36e64930c
Signed-off-by: Oliver Neukum <oliver@neukum.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Although an interesting concept, I don't think that this is a good idea:
-This will result in lots of "virtual" scsi controllers confusing users
-If we get a scsi-bus-reset we will now need to do a usb-device-reset of all
uas devices on the same usb bus, which is something to avoid if possible
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
At the kernel-summit Sarah Sharp asked me if I was willing to become the
uas maintainer. I said yes, and here is a patch to make this official.
Also remove Matthew Wilcox and Sarah Sharp as maintainers at their request.
I've also added myself to the module's author tag, so that if people look there
rather then in maintainers they will know they should bug me about uas too.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Copy the sg alignment trick from the usb-storage driver, without this I'm
seeing intermittent errors when using uas devices with an ehci controller.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
The scsi error handling path re-uses previously queued up (and errored-out)
cmds. If such a re-used cmd had a data-phase then cmdinfo will have
data_in_urb / data_out_urb still set to the free-ed urbs from the errored-out
cmd, and they will get free-ed a second time when the error handling cmd
completes, corrupting the kernel heap.
Clearing cmdinfo on command queue-ing fixes this, and seems like a good idea
in general.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
The scsi-host structure is refcounted, scsi_remove_host tears down the
scsi-host but does not decrement the refcount, so we need to call
scsi_put_host on disconnect to get the underlying memory to be freed.
After calling scsi_remove_host, the scsi-core may still hold a reference to
the scsi-host, iow we may still get called after uas_disconnect, but we
do our own life cycle management of uas_devinfo, freeing it on disconnect,
and thus may end up using devinfo after it has been freed. Switch to letting
scsi_host_alloc allocate and manage the memory for us.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
cmds are either on the inflight list or on the dead list, never both, so
we only need one list head.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Before this commit the uas driver would keep track of scsi commands which still
need to have some urbs submitted to the device, and complete this with an
ABORT result code on bus-reset or disconnect, but in flight scsi commands
which have all their urbs submitted, and thus are not part of the work list,
would never get their done callback called.
The problem is killed sense urbs don't have any tag info, so it is impossible
to tell which scsi cmd they belong to, so merely making sure all the urbs
have completed one way or the other is not enough.
This commit fixes this by changing the work list to an inflight list, which
keeps tracks of all inflight scsi cmnds, using the IS_IN_WORK_LIST flag to
determine if actual work needs to be done in uas_do_work(), and by moving
marking all inflight scsi commands as aborted and moving them to the dead list
on bus-reset or disconnect.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
uas_alloc_data_urb always gets called with a stream_id value of 0 when not
using streams. Removing the check makes it consistent with uas_alloc_sense_urb.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
For USB-2 connections the stream-id must always be 0.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Some BIOS-es will hang on reboot when an uas device is attached and left in
uas mode on reboot.
This commit adds a shutdown handler which on reboot puts the device back into
usb-storage mode, fixing the hang on reboot on these systems.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
We can sleep in our own workqueue (which is the whole reason for having
it), and scsi error handlers are also always called from a context which
may sleep.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Since we use a fixed tag / stream for tasks we cannot allow more then one
to run at the same time. This could happen before this time if a task timed
out.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
The fixed endpoint config code was only necessary to deal with an early
uas prototype which has never been released, so lets drop it and enforce
proper uas endpoint descriptors.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
The loop uses up to 3 bytes of the endpoint extra data.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
This is a preparation patch for adding better descriptor validation.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Once we start supporting uas hardware, and as more and more uas devices
become available, we will likely start seeing broken devices. This patch
prepares for the inevitable need for blacklisting those devices from
using the uas driver (they will use usb-storage instead).
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
This is a preparation patch for teaching usb-storage to not bind to
uas devices.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
This is a preparation patch for teaching usb-storage to not bind to
uas devices.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
If we get ie 16 streams we can use stream-id 1-16, not 1-15.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Handle usb-device resets not triggered from uas_eh_bus_reset_handler(), when
this happens, disable cmd queuing during the reset, and wait for existing
requests to finish in pre_reset.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Fix the uas_eh_bus_reset_handler not properly taking the usbdev lock
before calling usb_device_reset, the usb-core expects this lock to be
taken when usb_device_reset is called.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
I thought it would be a good idea to also test uas with usb-2, and it turns out
it was, as it did not work. The problem is that the uas driver was passing the
bEndpointAddress' direction bit to usb_rcvbulkpipe, the xhci code seems to not
care about this, but with the ehci code this causes usb_submit_urb failure.
With this fixed the uas code works nicely with an uas device plugged into
an ehci port.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
The cmd endpoint never has streams, so the stream_id parameter is unused.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
All callers of unlink_data_urbs drop devinfo->lock before calling it, and
then immediately take it again after the call. And the first thing
unlink_data_urbs does is take the lock again, and the last thing it does
is drop it. This commit removes all the unnecessary lock dropping and taking.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
- Rename labels to properly reflect this
- Don't skip free-ing the streams when scsi_init_shared_tag_map fails
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Otherwise they may complete before they get anchored and thus never get
unanchored (as the unanchoring is done by the usb core on completion).
This commit also remove the usb_get_urb / usb_put_urb around cmd submission +
anchoring, since if done in the proper order this is not necessary.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
This patch adds a new list where all requests which are canceled are
added to, so we don't loose them. Then, after killing all inflight
urbs on bus reset (and disconnect) we'll walk over the list and clean
them up.
Without this we can end up with aborted requests lingering around in
case of status pipe transfer errors.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Simplifies locking, we'll protect the list with the device spin lock.
Also plugs races which can happen when two devices operate on the
global list.
While being at it rename the list head from "list" to "work", preparing
for the addition of a second list.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>