The name ATA_QCFLAG_FAILED is misleading since it does not mean that a
QC completed in error, or that it didn't complete at all. It means that
libata decided to schedule EH for the QC, so the QC is now owned by the
libata error handler (EH).
The normal execution path is responsible for not accessing a QC owned
by EH. libata core enforces the rule by returning NULL from
ata_qc_from_tag() for QCs owned by EH.
It is quite easy to mistakenly assume that a QC marked with
ATA_QCFLAG_FAILED was an error. However, a QC that actually failed is
instead indicated by having qc->err_mask set. E.g. when we have an NCQ
error, we abort all QCs, which currently marks all QCs as
ATA_QCFLAG_FAILED. However, only a single QC will be an error (i.e. will
have qc->err_mask set).
Rename ATA_QCFLAG_FAILED to ATA_QCFLAG_EH to more clearly highlight that
this flag simply means that a QC is now owned by EH. The new name will
not mislead readers into thinking that the QC was an error (which is
instead indicated by having qc->err_mask set).
This also makes it more obvious that the EH code skips all QCs that do
not have ATA_QCFLAG_EH set (rather than ATA_QCFLAG_FAILED), since the EH
code should simply only care about QCs that are owned by EH itself.
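The distinction between "owned by EH" and "actually failed" can be
illustrated with a small user-space sketch. This is not libata code: the
struct layout, the flag value and the helper names are hypothetical and
only model the two properties described above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the libata flag; the value is illustrative. */
#define QCFLAG_EH (1u << 16)

struct qc {
	unsigned int flags;     /* QCFLAG_EH set: owned by EH */
	unsigned int err_mask;  /* non-zero: this command actually failed */
};

/* A QC owned by EH must not be touched by the normal execution path. */
static bool qc_owned_by_eh(const struct qc *qc)
{
	return (qc->flags & QCFLAG_EH) != 0;
}

/* Only err_mask says whether the command itself was an error. */
static bool qc_is_error(const struct qc *qc)
{
	return qc->err_mask != 0;
}

/* Count EH-owned QCs and, separately, those that really failed. */
static void count_qcs(const struct qc *qcs, size_t n,
		      size_t *owned, size_t *failed)
{
	*owned = *failed = 0;
	for (size_t i = 0; i < n; i++) {
		if (!qc_owned_by_eh(&qcs[i]))
			continue;
		(*owned)++;
		if (qc_is_error(&qcs[i]))
			(*failed)++;
	}
}
```

After an NCQ error, count_qcs() would typically report several owned
QCs but only one failed QC.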
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
If ap->ops->error_handler is NULL just return. This patch also
fixes some comment style issues.
Signed-off-by: Wenchao Hao <haowenchao@huawei.com>
Reviewed-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
An NCQ error means that the device has aborted processing of all active
commands.
To get the single NCQ command that caused the NCQ error, host software has
to read the NCQ error log, which also takes the device out of error state.
When the device encounters an NCQ error, we receive an error interrupt from
the HBA, and call ata_do_link_abort() to mark all outstanding commands on
the link as ATA_QCFLAG_FAILED (which means that these commands are owned
by libata EH), and then call ata_qc_complete() on them.
ata_qc_complete() will call fill_result_tf() for all commands marked as
ATA_QCFLAG_FAILED.
The taskfile is simply the latest status/error as seen from the device's
perspective. The taskfile will have ATA_ERR set in the status field and
ATA_ABORTED set in the error field.
When we fill the current taskfile values for all outstanding commands,
that means that qc->result_tf will have ATA_ERR set for all commands
owned by libata EH.
When ata_eh_link_autopsy() later analyzes all commands owned by libata EH,
it will call ata_eh_analyze_tf(), which will check if qc->result_tf has
ATA_ERR set. If it does, it will set qc->err_mask (which marks the command
as an error).
When ata_eh_finish() later calls __ata_qc_complete() on all commands owned
by libata EH, it will call qc->complete_fn() (ata_scsi_qc_complete()).
ata_scsi_qc_complete() will call ata_gen_ata_sense() to generate sense
data if qc->err_mask is set.
This means that we will generate sense data for commands that should not
have any sense data set. Having sense data set for the non-failed commands
will cause SCSI to finish these commands instead of retrying them.
While this incorrect behavior has existed for a long time, this first
became a problem once we started reading the correct taskfile register in
commit 4ba09d2026 ("ata: libahci: read correct status and error field
for NCQ commands").
Before this commit, NCQ commands would read the taskfile values received
from the last non-NCQ command completion, which most likely did not have
ATA_ERR set, since the last non-NCQ command was most likely not an error.
Fix this by changing ata_eh_analyze_ncq_error() to mark all non-failed
commands as ATA_QCFLAG_RETRY, and change the loop in ata_eh_link_autopsy()
to skip commands marked as ATA_QCFLAG_RETRY.
While at it, make sure that we clear ATA_ERR and any error bits for all
commands except the actual command that caused the NCQ error, so that no
other libata code will be able to misinterpret these commands as errors.
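The fix described above can be modelled roughly as follows. This is a
user-space sketch under stated assumptions, not the kernel change itself:
the struct, the QCFLAG_* values, the helper names and the placeholder
error mask are hypothetical; only the ATA_ERR status bit (bit 0) is the
standard register bit.

```c
#include <stddef.h>

#define ATA_ERR       (1u << 0)   /* status register: error bit */
#define QCFLAG_EH     (1u << 0)   /* hypothetical flag values */
#define QCFLAG_RETRY  (1u << 1)

struct qc {
	unsigned int  flags;
	unsigned int  err_mask;
	unsigned char status;   /* result taskfile status */
	unsigned char error;    /* result taskfile error */
};

/*
 * 'failed_tag' is the tag reported by the NCQ error log. Every other
 * EH-owned command is marked for retry and has its error bits cleared,
 * so later code cannot mistake it for a failed command.
 */
static void mark_innocent_for_retry(struct qc *qcs, size_t n, size_t failed_tag)
{
	for (size_t tag = 0; tag < n; tag++) {
		struct qc *qc = &qcs[tag];

		if (!(qc->flags & QCFLAG_EH) || tag == failed_tag)
			continue;
		qc->status &= (unsigned char)~ATA_ERR;
		qc->error = 0;
		qc->flags |= QCFLAG_RETRY;
	}
}

/* Autopsy loop: skip commands that only need a retry. */
static size_t autopsy(struct qc *qcs, size_t n)
{
	size_t errors = 0;

	for (size_t tag = 0; tag < n; tag++) {
		struct qc *qc = &qcs[tag];

		if (!(qc->flags & QCFLAG_EH) || (qc->flags & QCFLAG_RETRY))
			continue;
		if (qc->status & ATA_ERR) {
			qc->err_mask |= 1u;   /* placeholder for a device error mask */
			errors++;
		}
	}
	return errors;
}
```

With three aborted commands and failed_tag == 1, only the command with
tag 1 ends up with err_mask set; the other two are left for SCSI to retry.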
Fixes: 4ba09d2026 ("ata: libahci: read correct status and error field for NCQ commands")
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Clean up the code by making use of the newly introduced
ata_port_is_frozen() helper function.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Currently, the sense data reporting feature set is enabled for all ATA
devices which support the feature set (ata_id_has_sense_reporting()),
see ata_dev_config_sense_reporting().
However, even if sense data reporting is enabled, and the device
indicates that sense data is available, the sense data is only fetched
for ATA ZAC devices. For regular ATA devices, the available sense data
is never fetched; it is simply ignored. Instead, libata will use the
ERROR + STATUS fields and map them to a very generic and reduced set
of sense data, see ata_gen_ata_sense() and ata_to_sense_error().
When sense data reporting was first implemented, regular ATA devices
did fetch the sense data from the device. However, this was restricted
to only ATA ZAC devices in commit ca156e006a ("libata: don't request
sense data on !ZAC ATA devices").
With recent changes related to sense data and NCQ autosense, we want
to, once again, fetch the sense data for all ATA devices supporting
sense reporting.
ata_gen_ata_sense() should only be used for devices that don't support
the sense data reporting feature set.
Hopefully the features will be more robust this time around.
It is not just ZAC; many new ATA features, e.g. Command Duration
Limits, rely on working NCQ autosense and sense data. Therefore,
it is not really an option to avoid fetching the sense data forever.
If we encounter a device that is misbehaving because the sense data is
actually fetched, then that device should be quirked such that it
never enables the sense data reporting feature set in the first place,
since such a device is obviously not compliant with the specification.
The order in which we will try to add sense data to a scsi_cmnd:
1) NCQ autosense (if supported) - ata_eh_analyze_ncq_error()
2) REQUEST SENSE DATA EXT (if supported) - ata_eh_request_sense()
3) error + status field translation - ata_gen_ata_sense(), called
by ata_scsi_qc_complete() if neither 1) nor 2) is supported.
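The order of preference listed above can be sketched as a simple
selection function. The enum, the struct and the function name are
hypothetical, and the is_ncq_cmd parameter is an assumption made for the
sketch (NCQ autosense can only apply to a failed NCQ command).

```c
#include <stdbool.h>

/* Hypothetical names for the three sources listed above. */
enum sense_source {
	SENSE_FROM_NCQ_AUTOSENSE,    /* 1) NCQ command error log */
	SENSE_FROM_REQUEST_SENSE,    /* 2) REQUEST SENSE DATA EXT */
	SENSE_FROM_TF_TRANSLATION,   /* 3) error + status field translation */
};

struct dev_caps {
	bool ncq_autosense;     /* device supports NCQ autosense */
	bool sense_reporting;   /* sense data reporting feature set enabled */
};

static enum sense_source pick_sense_source(const struct dev_caps *caps,
					   bool is_ncq_cmd)
{
	if (is_ncq_cmd && caps->ncq_autosense)
		return SENSE_FROM_NCQ_AUTOSENSE;
	if (caps->sense_reporting)
		return SENSE_FROM_REQUEST_SENSE;
	return SENSE_FROM_TF_TRANSLATION;
}
```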
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
This shouldn't be needed if all devices that claim that they
support NCQ autosense (ata_id_has_ncq_autosense()) and/or the sense
data reporting feature (ata_id_has_sense_reporting()) actually
supported those features.
However, there might be some old ATA devices that either have these
bits set, even when they don't support those features, or they simply
return malformed data when using those features.
These devices should be quirked, but in order to try to minimize the
impact for the users of such devices, it was suggested by Damien
Le Moal that it might be a good idea to sanity check the sense data
received from the device. If the sense data looks bogus, then the
sense data is never added to the scsi_cmnd command.
Introduce a new function, ata_scsi_sense_is_valid(), and use it in all
places where sense data is received from the device.
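As an illustration of what such a sanity check might look like, here is
a minimal sketch. The function name and the exact criteria are
assumptions for the example, not a copy of ata_scsi_sense_is_valid(); it
only relies on the fact that SCSI sense keys are 4-bit values (0x0 to 0xf).

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_NO_SENSE   0x00   /* SCSI sense key: NO SENSE */
#define SK_COMPLETED  0x0f   /* highest defined SCSI sense key */

/* Hypothetical check: true if (sk, asc, ascq) is worth reporting. */
static bool sense_looks_valid(uint8_t sk, uint8_t asc, uint8_t ascq)
{
	/* NO SENSE with no additional sense information: nothing to report. */
	if (sk == SK_NO_SENSE && asc == 0 && ascq == 0)
		return false;
	/* Sense keys are 4 bits wide; anything larger is treated as bogus. */
	if (sk > SK_COMPLETED)
		return false;
	return true;
}
```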
Suggested-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
ata_eh_request_sense() returns early when flag ATA_QCFLAG_SENSE_VALID is
set. However, since the call to ata_eh_request_sense() is guarded by an
ATA_SENSE bit conditional, the logical conclusion for the reader is that
all checks are performed at the call site.
Highlight the fact that the sense data will not be fetched if flag
ATA_QCFLAG_SENSE_VALID is already set by adding an additional check to
the existing guarding conditional. No functional change.
Additionally, add a comment explaining that ata_eh_analyze_tf() will
only fetch the sense data if:
- It was a non-NCQ command that failed, or
- It was an NCQ command that failed, but the sense data was not included
  in the NCQ command error log (i.e. NCQ autosense is not supported).
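The combined guard can be sketched as a single predicate. The helper
name and the QCFLAG_SENSE_VALID value are hypothetical; ATA_SENSE is the
"sense data available" bit (bit 1) of the status register.

```c
#include <stdbool.h>

#define ATA_SENSE          (1u << 1)  /* status register: sense data available */
#define QCFLAG_SENSE_VALID (1u << 0)  /* hypothetical value: sense already attached */

/* Fetch sense data only if the device offers it and none is attached yet. */
static bool should_request_sense(unsigned char status, unsigned int qc_flags)
{
	return (status & ATA_SENSE) && !(qc_flags & QCFLAG_SENSE_VALID);
}
```

For an NCQ command whose sense data already came from the NCQ command
error log, the second half of the condition is false and no extra
command is sent.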
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
* Print the timeout value for internal command failures due to a
timeout (from Tomas).
* Improve parameter names in ata_dev_set_feature() to clarify this
function use (from Niklas).
* Improve the ahci driver low power mode setting initialization to allow
more flexibility for the user (from Rafael).
* Several patches to remove redundant variables in libata-core,
libata-eh and the pata_macio driver and to fix typos in comments (from
Jinpeng, Shaomin, Ye).
* Some code simplifications and macro renaming (for clarity) in various
functions of libata-core (from me).
* Add a missing check for a potential failure of sata_scr_read() in
sata_print_link_status() (from Li).
* Cleanup of libata Kconfig PATA_PLATFORM and PATA_OF_PLATFORM options
(from Lukas).
* Cleanups of ata dt-bindings and improvements of libahci_platform, ahci
and libahci code (from Serge)
* New driver for Synopsys AHCI SATA controllers, based on the generic
ahci code (from Serge). One compilation warning fix is added for this
driver (from me).
* Several fixes to macros used to discover a drive's capabilities to be
consistent with the ACS specifications (from Niklas).
* A couple of simplifications to some libata functions, removing
unnecessary arguments (from Niklas).
* An improvement to libata-eh code to avoid unnecessary link reset when
revalidating a drive after a failed command. In practice, this extra,
unneeded reset does not cause any harm beyond slightly slowing
down error recovery (from Niklas).
Merge tag 'ata-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull ata updates from Damien Le Moal:
- Print the timeout value for internal command failures due to a
timeout (from Tomas)
- Improve parameter names in ata_dev_set_feature() to clarify this
function use (from Niklas)
- Improve the ahci driver low power mode setting initialization to
allow more flexibility for the user (from Rafael)
- Several patches to remove redundant variables in libata-core,
libata-eh and the pata_macio driver and to fix typos in comments
(from Jinpeng, Shaomin, Ye)
- Some code simplifications and macro renaming (for clarity) in various
functions of libata-core (from me)
- Add a missing check for a potential failure of sata_scr_read() in
sata_print_link_status() (from Li)
- Cleanup of libata Kconfig PATA_PLATFORM and PATA_OF_PLATFORM options
(from Lukas)
- Cleanups of ata dt-bindings and improvements of libahci_platform,
ahci and libahci code (from Serge)
- New driver for Synopsys AHCI SATA controllers, based on the generic
ahci code (from Serge). One compilation warning fix is added for this
driver (from me)
- Several fixes to macros used to discover a drive's capabilities to be
consistent with the ACS specifications (from Niklas)
- A couple of simplifications to some libata functions, removing
unnecessary arguments (from Niklas)
- An improvement to libata-eh code to avoid unnecessary link reset
when revalidating a drive after a failed command. In practice, this
extra, unneeded reset does not cause any harm beyond slightly
slowing down error recovery (from Niklas)
* tag 'ata-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata: (45 commits)
ata: libata-eh: avoid needless hard reset when revalidating link
ata: libata: drop superfluous ata_eh_analyze_tf() parameter
ata: libata: drop superfluous ata_eh_request_sense() parameter
ata: fix ata_id_has_dipm()
ata: fix ata_id_has_ncq_autosense()
ata: fix ata_id_has_devslp()
ata: fix ata_id_sense_reporting_enabled() and ata_id_has_sense_reporting()
ata: libata-eh: Remove the unneeded result variable
ata: ahci_st: Enable compile test
ata: ahci_st: Fix compilation warning
MAINTAINERS: Add maintainers for DWC AHCI SATA driver
ata: ahci-dwc: Add Baikal-T1 AHCI SATA interface support
ata: ahci-dwc: Add platform-specific quirks support
dt-bindings: ata: ahci: Add Baikal-T1 AHCI SATA controller DT schema
ata: ahci: Add DWC AHCI SATA controller support
ata: libahci_platform: Add function returning a clock-handle by id
dt-bindings: ata: ahci: Add DWC AHCI SATA controller DT schema
ata: ahci: Introduce firmware-specific caps initialization
ata: ahci: Convert __ahci_port_base to accepting hpriv as arguments
ata: libahci: Don't read AHCI version twice in the save-config method
...
Performing a revalidation on an AHCI controller supporting LPM,
while using an LPM mode of e.g. med_power_with_dipm (hipm + dipm) or
medium_power (hipm), will currently always lead to a hard reset.
The expected behavior is that a hard reset is only performed when
revalidate fails, because the properties of the drive have changed.
A revalidate performed after e.g. an NCQ error, or such a simple thing
as disabling write-caching (hdparm -W 0 /dev/sda), should succeed on
the first try (and should therefore not cause the link to be reset).
This unwarranted hard reset happens because ata_phys_link_offline()
returns true for a link that is in deep sleep. Thus the call to
ata_phys_link_offline() in ata_eh_revalidate_and_attach() will cause
the revalidation to fail, which causes ata_eh_handle_dev_fail() to be
called, which will set ehc->i.action |= ATA_EH_RESET, such that the
link is reset before retrying revalidation.
When the link is reset, the link is reestablished, so when
ata_eh_revalidate_and_attach() is called the second time, directly
after the link has been reset, ata_phys_link_offline() will return
false, and the revalidation will succeed.
Looking at "8.3.1.3 HBA Initiated" in the AHCI 1.3.1 specification,
it is clear that when host software writes a new command to memory,
by setting a bit in the PxCI/PxSACT HBA port registers, the HBA will
automatically bring back the link before sending out the Command FIS.
However, simply reading a SCR (like ata_phys_link_offline() does),
will not cause the HBA to automatically bring back the link.
As long as hipm is enabled, the HBA will put an idle link into deep
sleep. Avoid this needless hard reset on revalidation by temporarily
disabling hipm, by setting the LPM mode to ATA_LPM_MAX_POWER.
After revalidation is complete, ata_eh_recover() will restore the link
policy by setting the LPM mode to ap->target_lpm_policy.
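The reasoning above can be captured in a toy model. This is only a
sketch of the cause and effect, under the simplifying assumption that an
idle link is asleep whenever hipm is enabled; all types and names are
hypothetical and no real hardware behavior is implemented.

```c
#include <stdbool.h>

enum lpm_policy { LPM_MAX_POWER, LPM_MED_POWER };  /* hypothetical subset */

struct link {
	enum lpm_policy policy;
	bool asleep;   /* an idle link under hipm is put into deep sleep */
};

/* Reading link status does not wake the link, so a sleeping link looks offline. */
static bool link_looks_offline(const struct link *l)
{
	return l->asleep;
}

static void set_lpm(struct link *l, enum lpm_policy policy)
{
	l->policy = policy;
	/* Max power disables hipm and wakes the link; other policies let it sleep. */
	l->asleep = (policy != LPM_MAX_POWER);
}

/* Returns true if revalidation would fall back to a hard reset. */
static bool revalidate_needs_reset(struct link *l, bool disable_lpm_first)
{
	if (disable_lpm_first)
		set_lpm(l, LPM_MAX_POWER);
	return link_looks_offline(l);
}
```

In this model, revalidating with disable_lpm_first set never sees an
offline link, and the caller can restore the target policy afterwards
with set_lpm().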
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
The parameter can easily be derived from struct ata_queued_cmd.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
The parameter can easily be derived from struct ata_queued_cmd.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Return the value of ata_port_abort() directly instead of storing it in
another redundant variable.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Add the missing command name for ATA_CMD_NCQ_NON_DATA to
ata_get_cmd_name().
Fixes: 661ce1f0c4 ("libata/libsas: Define ATA_CMD_NCQ_NON_DATA")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
ata_internal_cmd_timeout() returns *unsigned long* timeout in ms, however
ata_exec_internal_sg() passes that timeout to msecs_to_jiffies() that takes
just *unsigned int*. Change ata_internal_cmd_timeout()'s result type to
*unsigned int* as well, also updating the *struct* ata_eh_cmd_timeout_ent
and the command timeout tables -- all timeouts fit into *unsigned int* but
we have to change ULONG_MAX to UINT_MAX...
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
ata_eh_nr_in_flight() counts the # of the active tagged commands and
thus cannot return a negative value but the result type is nevertheless
int. Switching it to unsigned int (along with the local variables
receiving the function's result) helps avoid the sign extension
instructions when comparing with or assigning to unsigned long
ata_port::fastdrain_cnt and thus results in more compact 64-bit
code.
Found by Linux Verification Center (linuxtesting.org) with the SVACE
static analysis tool.
[Damien]
Fixed commit message.
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Add the explicit error and status register fields to 'struct ata_taskfile'
using the anonymous *union*s ('struct ide_taskfile' had that for ages!) and
update the libata taskfile code accordingly. There should be no object code
changes resulting from that...
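The idea behind the anonymous unions can be shown with a trimmed-down
struct. This is an illustrative sketch, not the full 'struct
ata_taskfile': the struct name and the helper are hypothetical, and only
the two register pairs that share an address are modelled (anonymous
unions require C11).

```c
#include <stdint.h>

/* Trimmed, illustrative taskfile; the real struct has more fields. */
struct taskfile_sketch {
	union {
		uint8_t error;    /* read: error register */
		uint8_t feature;  /* write: feature register */
	};
	union {
		uint8_t status;   /* read: status register */
		uint8_t command;  /* write: command register */
	};
};

/* Result-taskfile code can now say what it means: tf->status, not tf->command. */
static int tf_has_error(const struct taskfile_sketch *tf)
{
	return tf->status & 0x01;   /* ATA_ERR is bit 0 of the status register */
}
```

Since both names in each union refer to the same byte, the struct size
and the generated code stay the same; only readability changes.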
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Since the commit c05e6ff035 ("libata-acpi: implement
and use ata_acpi_init_gtm()") ata_acpi_on_suspend() just returns 0, so
its call from ata_eh_handle_port_suspend() doesn't make sense anymore.
Remove the function completely, at last...
Found by Linux Verification Center (linuxtesting.org) with the SVACE
static analysis tool.
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Callers are already protected by ata_dev_print_info(), so no need
to have an additional configuration parameter here.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Rename ata_get_cmd_descrip() to ata_get_cmd_name() and simplify
it to return "unknown" instead of NULL.
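A minimal sketch of such a lookup, assuming a simple opcode-to-name
table: the table contents are a small illustrative subset and the
struct and function names are hypothetical. The point is that callers
never have to check for NULL.

```c
#include <stddef.h>
#include <stdint.h>

struct cmd_name {
	uint8_t     command;
	const char *text;
};

/* A few opcodes for illustration; a real table would be much longer. */
static const struct cmd_name cmd_names[] = {
	{ 0x2f, "READ LOG EXT" },
	{ 0x60, "READ FPDMA QUEUED" },
	{ 0x61, "WRITE FPDMA QUEUED" },
	{ 0x63, "NCQ NON-DATA" },
};

static const char *cmd_name_sketch(uint8_t command)
{
	for (size_t i = 0; i < sizeof(cmd_names) / sizeof(cmd_names[0]); i++)
		if (cmd_names[i].command == command)
			return cmd_names[i].text;
	return "unknown";   /* never NULL, so callers can print it directly */
}
```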
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
To follow the flow of control we should be using tracepoints, as
they will tie in with the actual I/O flow and deliver a better
overview of what is happening.
This patch adds tracepoints for hard reset, soft reset, and postreset
and adds them in the libata-eh control flow.
With that we can drop the reset DPRINTK calls in the various drivers.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Some ATA drives are very slow to respond to READ_LOG_EXT and
READ_LOG_DMA_EXT commands issued from ata_dev_configure() when the
device is revalidated right after resuming a system or inserting the
ATA adapter driver (e.g. ahci). The default 5s timeout
(ATA_EH_CMD_DFL_TIMEOUT) used for these commands is too short, causing
errors during the device configuration. Ex:
...
ata9: SATA max UDMA/133 abar m524288@0x9d200000 port 0x9d200400 irq 209
ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
ata9.00: ATA-9: XXX XXXXXXXXXXXXXXX, XXXXXXXX, max UDMA/133
ata9.00: qc timeout (cmd 0x2f)
ata9.00: Read log page 0x00 failed, Emask 0x4
ata9.00: Read log page 0x00 failed, Emask 0x40
ata9.00: NCQ Send/Recv Log not supported
ata9.00: Read log page 0x08 failed, Emask 0x40
ata9.00: 27344764928 sectors, multi 16: LBA48 NCQ (depth 32), AA
ata9.00: Read log page 0x00 failed, Emask 0x40
ata9.00: ATA Identify Device Log not supported
ata9.00: failed to set xfermode (err_mask=0x40)
ata9: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
ata9.00: configured for UDMA/133
...
The timeout error causes a soft reset of the drive link, followed in
most cases by a successful revalidation as that gives enough time to the
drive to become fully ready to quickly process the read log commands.
However, in some cases, this also fails resulting in the device being
dropped.
Fix this by adding the ata_eh_revalidate_timeouts entries for the
READ_LOG_EXT and READ_LOG_DMA_EXT commands. This defines a timeout
increased to 15s, retriable one time.
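A per-command timeout table of this kind can be sketched as below. The
struct layout, the lookup helper and the use of UINT_MAX as a list
terminator are assumptions made for the example; the opcodes (0x2f for
READ LOG EXT, 0x47 for READ LOG DMA EXT) and the 5s and 15s values come
from the description above.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define DFL_TIMEOUT_MS 5000u   /* the default 5s timeout mentioned above */

/* 15s, retriable one time; UINT_MAX terminates the list. */
static const unsigned int revalidate_timeouts[] = { 15000, 15000, UINT_MAX };

struct timeout_ent {
	uint8_t             commands[2];
	const unsigned int *timeouts;
};

static const struct timeout_ent timeout_table[] = {
	{ { 0x2f /* READ LOG EXT */, 0x47 /* READ LOG DMA EXT */ },
	  revalidate_timeouts },
};

/* 'attempt' counts earlier timeouts of this command, starting at 0. */
static unsigned int cmd_timeout_ms(uint8_t cmd, unsigned int attempt)
{
	for (size_t i = 0; i < sizeof(timeout_table) / sizeof(timeout_table[0]); i++) {
		const struct timeout_ent *ent = &timeout_table[i];

		if (ent->commands[0] != cmd && ent->commands[1] != cmd)
			continue;
		/* Walk to the requested attempt, stopping before the terminator. */
		unsigned int idx = 0;
		while (idx < attempt && ent->timeouts[idx + 1] != UINT_MAX)
			idx++;
		return ent->timeouts[idx];
	}
	return DFL_TIMEOUT_MS;   /* commands without an entry keep the default */
}
```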
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Prepare for removal of the request pointer by using scsi_cmd_to_rq()
instead. This patch does not change any functionality.
Link: https://lore.kernel.org/r/20210809230355.8186-8-bvanassche@acm.org
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This series consists of the usual driver updates (ufs, target, tcmu,
smartpqi, lpfc, zfcp, qla2xxx, mpt3sas, pm80xx). The major core
change is using a sbitmap instead of an atomic for queue tracking.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This consists of the usual driver updates (ufs, target, tcmu,
smartpqi, lpfc, zfcp, qla2xxx, mpt3sas, pm80xx).
The major core change is using a sbitmap instead of an atomic for
queue tracking"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (412 commits)
scsi: target: tcm_fc: Fix a kernel-doc header
scsi: target: Shorten ALUA error messages
scsi: target: Fix two format specifiers
scsi: target: Compare explicitly with SAM_STAT_GOOD
scsi: sd: Introduce a new local variable in sd_check_events()
scsi: dc395x: Open-code status_byte(u8) calls
scsi: 53c700: Open-code status_byte(u8) calls
scsi: smartpqi: Remove unused functions
scsi: qla4xxx: Remove an unused function
scsi: myrs: Remove unused functions
scsi: myrb: Remove unused functions
scsi: mpt3sas: Fix two kernel-doc headers
scsi: fcoe: Suppress a compiler warning
scsi: libfc: Fix a format specifier
scsi: aacraid: Remove an unused function
scsi: core: Introduce enum scsi_disposition
scsi: core: Modify the scsi_send_eh_cmnd() return value for the SDEV_BLOCK case
scsi: core: Rename scsi_softirq_done() into scsi_complete()
scsi: core: Remove an incorrect comment
scsi: core: Make the scsi_alloc_sgtables() documentation more accurate
...
In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning
by explicitly adding a break statement instead of letting the code fall
through to the next case.
Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Improve readability of the code in the SCSI core by introducing an
enumeration type for the values used internally that decide how to continue
processing a SCSI command. The eh_*_handler return values have not been
changed because that would involve modifying all SCSI drivers.
The output of the following command has been inspected to verify that no
out-of-range values are assigned to a variable of type enum
scsi_disposition:
KCFLAGS=-Wassign-enum make CC=clang W=1 drivers/scsi/
Link: https://lore.kernel.org/r/20210415220826.29438-6-bvanassche@acm.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Daniel Wagner <dwagner@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Some functions have different names between their prototypes
and the kernel-doc markup.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
* move ata_eh_analyze_ncq_error() and ata_eh_read_log_10h() to
libata-sata.c
* add static inline for ata_eh_analyze_ncq_error() for
CONFIG_SATA_HOST=n case (link->sactive is non-zero only if
NCQ commands are actually queued so empty function body is
sufficient)
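The static inline stub pattern can be sketched in user space as follows.
SKETCH_SATA_HOST is a hypothetical macro standing in for the
CONFIG_SATA_HOST option, and the struct and function names are
illustrative only.

```c
/* Define SKETCH_SATA_HOST to model a CONFIG_SATA_HOST=y build. */
struct link_sketch {
	unsigned int sactive;   /* non-zero only while NCQ commands are queued */
	unsigned int analyzed;  /* would be set by the real analysis */
};

#ifdef SKETCH_SATA_HOST
/* SATA build: the real implementation lives in the SATA-only file. */
void analyze_ncq_error(struct link_sketch *link);
#else
/* PATA-only build: NCQ is never used, so an empty body is sufficient. */
static inline void analyze_ncq_error(struct link_sketch *link)
{
	(void)link;
}
#endif

/* The caller stays free of #ifdefs in both configurations. */
static void autopsy_sketch(struct link_sketch *link)
{
	if (link->sactive)
		analyze_ncq_error(link);
}
```

With the stub in place, the compiler can drop the call entirely in
PATA-only builds, which is where the size savings come from.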
Code size savings on m68k arch using (modified) atari_defconfig:
   text    data     bss     dec     hex filename
before:
  16164      18       0   16182    3f36 drivers/ata/libata-eh.o
after:
  15446      18       0   15464    3c68 drivers/ata/libata-eh.o
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Add !IS_ENABLED(CONFIG_SATA_HOST) to ata_eh_set_lpm() to allow
compiler to optimize out the function for non-SATA configs (for
PATA hosts "ap && !ap->ops->set_lpm" condition is always true so
it's sufficient for the function to return zero).
Code size savings on m68k arch using (modified) atari_defconfig:
   text    data     bss     dec     hex filename
before:
  17353      18       0   17371    43db drivers/ata/libata-eh.o
after:
  16607      18       0   16625    40f1 drivers/ata/libata-eh.o
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Move EXPORT_SYMBOL_GPL()s close to exported code like it is
done in other kernel subsystems. As a nice side effect this
results in the removal of a few ifdefs.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In commit 7634ccd2da ("libata: maintainership update") from 2018
Jens has officially taken over libata maintainership from Tejun so
remove stale information from core libata code.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
ZAC support added sense data requesting on error for both ZAC and ATA
devices. This seems to cause erratic error handling behaviors on some
SSDs where the device reports sense data availability and then
delivers the wrong content making EH take the wrong actions. The
failure mode was sporadic on a LITE-ON ssd and couldn't be reliably
reproduced.
There is no value in requesting sense data from non-ZAC ATA devices
while there's a significant risk of introducing EH misbehaviors which
are difficult to reproduce and fix. Let's do the sense data dancing
only for ZAC devices.
Reviewed-by: Hannes Reinecke <hare@suse.com>
Tested-by: Masato Suzuki <masato.suzuki@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation either version 2 or at your option any
later version this program is distributed in the hope that it will
be useful but without any warranty without even the implied warranty
of merchantability or fitness for a particular purpose see the gnu
general public license for more details you should have received a
copy of the gnu general public license along with this program see
the file copying if not write to the free software foundation 675
mass ave cambridge ma 02139 usa
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-or-later
has been chosen to replace the boilerplate/reference in 52 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jilayne Lovejoy <opensource@jilayne.com>
Reviewed-by: Steve Winslow <swinslow@gmail.com>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190519154042.342335923@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is nothing it could synchronize against, so don't go through
the pains of acquiring the lock.
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull libata updates from Tejun Heo:
- libata has always been limiting the maximum queue depth to 31, with
one entry set aside mostly for historical reasons. This didn't use to
make much difference but Jens found out that modern hard drives can
actually perform measurably better with the extra one queue depth.
Jens updated libata core so that it can make use of full 32 queue
depth
- Damien updated command retry logic in error handling so that it
doesn't unnecessarily retry when upper layer (SCSI) is gonna handle
them
- A couple misc changes
* 'for-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
sata_fsl: use the right type for tag bitshift
ahci: enable full queue depth of 32
libata: don't clamp queue depth to ATA_MAX_QUEUE - 1
libata: add extra internal command
sata_nv: set host can_queue count appropriately
libata: remove assumption that ATA_MAX_QUEUE - 1 is the max
libata: use ata_tag_internal() consistently
libata: bump ->qc_active to a 64-bit type
libata: convert core and drivers to ->hw_tag usage
libata: introduce notion of separate hardware tags
libata: Fix command retry decision
libata: Honor RQF_QUIET flag
libata: Make ata_dev_set_mode() less verbose
libata: Fix ata_err_string()
libata: Fix comment typo in ata_eh_analyze_tf()
sata_nv: don't use block layer bounce buffer
ata: hpt37x: Convert to use match_string() helper
Merge tag 'for-4.18/block-20180603' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe:
- clean up how we pass around gfp_t and
blk_mq_req_flags_t (Christoph)
- prepare us to defer scheduler attach (Christoph)
- clean up drivers handling of bounce buffers (Christoph)
- fix timeout handling corner cases (Christoph/Bart/Keith)
- bcache fixes (Coly)
- prep work for bcachefs and some block layer optimizations (Kent).
- convert users of bio_sets to using embedded structs (Kent).
- fixes for the BFQ io scheduler (Paolo/Davide/Filippo)
- lightnvm fixes and improvements (Matias, with contributions from Hans
and Javier)
- adding discard throttling to blk-wbt (me)
- sbitmap blk-mq-tag handling (me/Omar/Ming).
- remove the sparc jsflash block driver, acked by DaveM.
- Kyber scheduler improvement from Jianchao, making it more friendly
wrt merging.
- conversion of symbolic proc permissions to octal, from Joe Perches.
Previously the block parts were a mix of both.
- nbd fixes (Josef and Kevin Vigor)
- unify how we handle the various kinds of timestamps that the block
core and utility code uses (Omar)
- three NVMe pull requests from Keith and Christoph, bringing AEN to
feature completeness, file backed namespaces, cq/sq lock split, and
various fixes
- various little fixes and improvements all over the map
* tag 'for-4.18/block-20180603' of git://git.kernel.dk/linux-block: (196 commits)
blk-mq: update nr_requests when switching to 'none' scheduler
block: don't use blocking queue entered for recursive bio submits
dm-crypt: fix warning in shutdown path
lightnvm: pblk: take bitmap alloc. out of critical section
lightnvm: pblk: kick writer on new flush points
lightnvm: pblk: only try to recover lines with written smeta
lightnvm: pblk: remove unnecessary bio_get/put
lightnvm: pblk: add possibility to set write buffer size manually
lightnvm: fix partial read error path
lightnvm: proper error handling for pblk_bio_add_pages
lightnvm: pblk: fix smeta write error path
lightnvm: pblk: garbage collect lines with failed writes
lightnvm: pblk: rework write error recovery path
lightnvm: pblk: remove dead function
lightnvm: pass flag on graceful teardown to targets
lightnvm: pblk: check for chunk size before allocating it
lightnvm: pblk: remove unnecessary argument
lightnvm: pblk: remove unnecessary indirection
lightnvm: pblk: return NVM_ error on failed submission
lightnvm: pblk: warn in case of corrupted write buffer
...
As far as I can tell this function can't even be called any more, given
that ATA implements its own eh_strategy_handler with ata_scsi_error, which
never calls ->eh_timed_out.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Bump the internal tag to 32, instead of stealing the last tag in
our regular command space. This works just fine, since we don't
actually need a separate hardware tag for this. Internal commands
cannot coexist with NCQ commands.
As a bonus, we get rid of the special casing of what tag to use
for the internal command.
This is in preparation for utilizing all 32 commands for normal IO.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tejun Heo <tj@kernel.org>
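The tag layout described above can be sketched as follows. This is a minimal, standalone illustration and not the libata code: all sketch_* and SKETCH_* names are hypothetical, and the choice of hardware tag 0 for the internal command is an assumption made here for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants mirroring the layout described in the commit. */
#define SKETCH_MAX_QUEUE    32u               /* tags 0..31: regular commands */
#define SKETCH_TAG_INTERNAL SKETCH_MAX_QUEUE  /* tag 32: internal command */

static bool sketch_tag_internal(unsigned int tag)
{
	return tag == SKETCH_TAG_INTERNAL;
}

static bool sketch_tag_valid(unsigned int tag)
{
	return tag < SKETCH_MAX_QUEUE || sketch_tag_internal(tag);
}

/* 33 distinct tags no longer fit in a 32-bit mask, so use 64 bits. */
static uint64_t sketch_tag_bit(unsigned int tag)
{
	return UINT64_C(1) << tag;
}

/*
 * Assumption: the internal command never coexists with NCQ commands,
 * so it can borrow hardware tag 0 instead of needing its own.
 */
static unsigned int sketch_hw_tag(unsigned int tag)
{
	return sketch_tag_internal(tag) ? 0 : tag;
}
```

With this layout all 32 regular tags stay available for normal IO, at the cost of a wider active-command mask.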
In a few spots we iterate to ATA_MAX_QUEUE - 1, relying on internal
knowledge that the last tag is the internal tag. Remove this
assumption.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tejun Heo <tj@kernel.org>
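One way to drop the assumption is to iterate over the full regular tag range and skip the internal tag with an explicit comparison. The sketch below is illustrative only; sketch_count_regular() and SKETCH_MAX_QUEUE are hypothetical names, and the internal tag is passed in as a parameter to show that the loop does not depend on where it sits.

```c
#include <stdint.h>

#define SKETCH_MAX_QUEUE 32u	/* hypothetical: number of regular tags */

/*
 * Count active regular commands in a bitmask. The internal tag is
 * excluded by an explicit check, not by shortening the loop bound.
 */
static unsigned int sketch_count_regular(uint64_t active_mask,
					 unsigned int internal_tag)
{
	unsigned int tag, n = 0;

	for (tag = 0; tag < SKETCH_MAX_QUEUE; tag++) {
		if (tag == internal_tag)
			continue;
		if (active_mask & (UINT64_C(1) << tag))
			n++;
	}
	return n;
}
```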
For failed commands with valid sense data (e.g. NCQ commands),
scsi_check_sense() is used in ata_analyze_tf() to determine if the
command can be retried. In such a case, rely on this decision and ignore
the command error mask based decision done in ata_worth_retry().
This fixes useless retries of commands such as unaligned writes on zoned
disks (TYPE_ZAC).
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
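The precedence rule described above can be sketched like this. It is a simplified model under stated assumptions, not the libata implementation: struct sketch_cmd, the SKETCH_ERR_TIMEOUT bit and the mask-based policy are all hypothetical stand-ins.

```c
#include <stdbool.h>

#define SKETCH_ERR_TIMEOUT (1u << 0)	/* hypothetical error mask bit */

struct sketch_cmd {
	bool sense_valid;	/* device returned valid sense data */
	bool sense_says_retry;	/* verdict derived from the sense data */
	unsigned int err_mask;	/* error mask collected for the command */
};

/* Hypothetical mask-based policy: retry only after a timeout. */
static bool sketch_worth_retry_by_mask(unsigned int err_mask)
{
	return (err_mask & SKETCH_ERR_TIMEOUT) != 0;
}

/* Valid sense data takes precedence over the error mask. */
static bool sketch_should_retry(const struct sketch_cmd *cmd)
{
	if (cmd->sense_valid)
		return cmd->sense_says_retry;
	return sketch_worth_retry_by_mask(cmd->err_mask);
}
```

For example, an unaligned write to a zoned disk would carry valid sense data with a "do not retry" verdict, so it is not retried even if the error mask alone would have allowed it.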
Currently, libata ignores the RQF_QUIET flag of requests and prints error
messages for failed commands regardless of whether this flag is set in the
command request. Fix this by introducing the ata_eh_quiet() function and
using this function in ata_eh_link_autopsy() to determine if the EH
context should be quiet. This works by counting the number of failed
commands and the number of commands with the quiet flag set. If both
numbers are equal, the EH context can be set to quiet and all error
messages suppressed. Otherwise, only the error messages for the failed
commands are suppressed and the link Emask and irq_stat messages are printed.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
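The counting rule described above can be sketched as follows. This is a minimal standalone model, not the libata code: struct sketch_qc and sketch_eh_quiet() are hypothetical names, and the flags are reduced to two booleans.

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_qc {
	bool failed;	/* command failed and is being examined by EH */
	bool quiet;	/* request asked for quiet error handling */
};

/*
 * Return true if the EH context as a whole may be quiet, i.e. the
 * number of failed commands equals the number of failed commands
 * that requested quiet handling. Note that with zero failed
 * commands both counts are zero and therefore equal.
 */
static bool sketch_eh_quiet(const struct sketch_qc *qcs, size_t n)
{
	size_t i, nr_failed = 0, nr_quiet = 0;

	for (i = 0; i < n; i++) {
		if (!qcs[i].failed)
			continue;
		nr_failed++;
		if (qcs[i].quiet)
			nr_quiet++;
	}
	return nr_failed == nr_quiet;
}
```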
Add proper error string output for ATA_ERR_NCQ and ATA_ERR_NODEV_HINT
instead of returning "unknown error".
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
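A mask-to-string lookup of the kind this change extends might look like the sketch below. The bit values and the message strings are hypothetical and chosen only for illustration; they are not taken from libata.

```c
/* Hypothetical error mask bits. */
#define SKETCH_ERR_TIMEOUT    (1u << 0)
#define SKETCH_ERR_NCQ        (1u << 1)
#define SKETCH_ERR_NODEV_HINT (1u << 2)

/*
 * Map an error mask to a human-readable string. Without the NCQ and
 * no-device-hint cases, both would fall through to "unknown error".
 */
static const char *sketch_err_string(unsigned int err_mask)
{
	if (err_mask & SKETCH_ERR_TIMEOUT)
		return "timeout";
	if (err_mask & SKETCH_ERR_NCQ)
		return "NCQ error";
	if (err_mask & SKETCH_ERR_NODEV_HINT)
		return "no-device hint";
	return "unknown error";
}
```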
__printf is useful to verify format and arguments. Remove the following
warning (with W=1):
drivers/ata/libata-eh.c:183:10: warning: function might be possible candidate for ‘gnu_printf’ format attribute [-Wsuggest-attribute=format]
Signed-off-by: Mathieu Malaterre <malat@debian.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
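As an illustration of what the annotation buys, the sketch below marks a variadic helper with the GCC/Clang format attribute so the compiler can check the format string against its arguments. The sketch_printf() macro and sketch_desc() helper are hypothetical names used only for this example.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical shorthand for the compiler's printf format attribute. */
#define sketch_printf(a, b) __attribute__((format(printf, a, b)))

/*
 * Format a description into buf. Argument 3 is the format string and
 * the variadic arguments start at position 4, so a mismatch such as
 * passing a string for "%d" is reported at compile time.
 */
static sketch_printf(3, 4)
int sketch_desc(char *buf, size_t len, const char *fmt, ...)
{
	va_list args;
	int n;

	va_start(args, fmt);
	n = vsnprintf(buf, len, fmt, args);
	va_end(args);
	return n;
}
```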
We got a kernel panic when using a SATA disk with a SAS controller:
[115946.152283] Unable to handle kernel NULL pointer dereference at virtual address 000007d8
[115946.223963] CPU: 0 PID: 22175 Comm: kworker/0:1 Tainted: G W OEL 4.14.0 #1
[115946.232925] Workqueue: events ata_scsi_hotplug
[115946.237938] task: ffff8021ee50b180 task.stack: ffff00000d5d0000
[115946.244717] PC is at sas_find_dev_by_rphy+0x44/0x114
[115946.250224] LR is at sas_find_dev_by_rphy+0x3c/0x114
......
[115946.355701] Process kworker/0:1 (pid: 22175, stack limit = 0xffff00000d5d0000)
[115946.363369] Call trace:
[115946.456356] [<ffff000008878a9c>] sas_find_dev_by_rphy+0x44/0x114
[115946.462908] [<ffff000008878b8c>] sas_target_alloc+0x20/0x5c
[115946.469408] [<ffff00000885a31c>] scsi_alloc_target+0x250/0x308
[115946.475781] [<ffff00000885ba30>] __scsi_add_device+0xb0/0x154
[115946.481991] [<ffff0000088b520c>] ata_scsi_scan_host+0x180/0x218
[115946.488367] [<ffff0000088b53d8>] ata_scsi_hotplug+0xb0/0xcc
[115946.494801] [<ffff0000080ebd70>] process_one_work+0x144/0x390
[115946.501115] [<ffff0000080ec100>] worker_thread+0x144/0x418
[115946.507093] [<ffff0000080f2c98>] kthread+0x10c/0x138
[115946.512792] [<ffff0000080855dc>] ret_from_fork+0x10/0x18
We found that Ding Xiang has reported a similar bug before:
https://patchwork.kernel.org/patch/9179817/
And this bug still exists in mainline. Since libsas handles hotplug and
device adding/removing itself, there is no need to schedule the ATA
hotplug task here if it is a SAS host.
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Cc: Ding Xiang <dingxiang@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
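The guard described above can be sketched as follows. This is a simplified model with hypothetical names (struct sketch_port, SKETCH_FLAG_SAS_HOST, sketch_finish_eh()); scheduling the real work item is reduced to setting a boolean.

```c
#include <stdbool.h>

#define SKETCH_FLAG_SAS_HOST (1u << 0)	/* hypothetical port flag */

struct sketch_port {
	unsigned int flags;
	bool hotplug_scheduled;	/* stands in for queuing the hotplug work */
};

/*
 * Schedule the ATA hotplug task only for non-SAS hosts. On a SAS
 * host, libsas adds and removes devices itself, so scheduling the
 * ATA task as well would scan a target that libsas does not expect.
 */
static void sketch_finish_eh(struct sketch_port *ap, bool hotplug_wanted)
{
	if (!hotplug_wanted)
		return;
	if (ap->flags & SKETCH_FLAG_SAS_HOST)
		return;
	ap->hotplug_scheduled = true;
}
```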
Pull libata updates from Tejun Heo:
"Nothing too interesting or alarming. Other than a new power saving
mode addition to ahci and crash fix on a tracepoint, all changes are
trivial or device-specific"
* 'for-4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: (22 commits)
ahci: imx: Handle increased read failures for IMX53 temperature sensor in low frequency mode.
ata: sata_dwc_460ex: Propagate platform device ID to DMA driver
ata: fixes kernel crash while tracing ata_eh_link_autopsy event
ata: pata_pdc2027x: Fix space before '[' error.
libata: fix spelling mistake: 'ambigious' -> 'ambiguous'
ata: ceva: Add SMMU support for SATA IP
ata: ceva: Correct the suspend and resume logic for SATA
ata: ceva: Correct the AXI bus configuration for SATA ports
ata: ceva: Add CCI support for SATA if CCI is enabled
ata: ceva: Make RxWaterMark value as module parameter
ata: ceva: Disable Device Sleep capability
ata: ceva: Add gen 3 mode support in driver
ata: ceva: Move sata port phy oob settings to device-tree
devicetree: bindings: Add sata port phy config parameters in ahci-ceva
ata: mark expected switch fall-throughs
ata: sata_mv: remove a redundant assignment to pointer ehi
ahci: Add support for Cavium's fifth generation SATA controller
ata: sata_rcar: Use of_device_get_match_data() helper
libata: make ata_port_type const
libata: make static arrays const, reduces object code size
...