John Garry <john.g.garry@oracle.com> says:
It's easy to get scsi_debug to error on throughput testing when we have
multiple shosts:
$ lsscsi
[7:0:0:0] disk Linux scsi_debug 0191
[0:0:0:0] disk Linux scsi_debug 0191
$ fio --filename=/dev/sda --filename=/dev/sdb --direct=1 --rw=read
--bs=4k --iodepth=256 --runtime=60 --numjobs=40 --time_based --name=jpg
--eta-newline=1 --readonly --ioengine=io_uring --hipri --exitall_on_error
jpg: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=256
...
fio-3.28
Starting 40 processes
[ 27.521809] hrtimer: interrupt took 33067 ns
[ 27.904660] sd 7:0:0:0: [sdb] tag#171 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=0s
[ 27.904660] sd 0:0:0:0: [sda] tag#58 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=0s
fio: io_u error [ 27.904667] sd 0:0:0:0: [sda] tag#58 CDB: Read(10) 28 00 00 00 27 00 00 01 18 00
on file /dev/sda[ 27.904670] sd 0:0:0:0: [sda] tag#62 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=0s
The issue is related to how the driver manages submit queues and tags. A
single array of submit queues - sdebug_q_arr - with its own set of tags is
shared among all shosts. As such, for occasions when we have more than one
host it is possible to overload the submit queues and run out of tags.
Another separate issue that we may reduce the shost submit queue depth,
sdebug_max_queue, dynamically causing the shost to be overloaded. How many
IOs which the shost may be sent is fixed at can_queue at init time, which
is the same initial value for sdebug_max_queue. So reducing
sdebug_max_queue means that the shost may be sent more IOs than it is
configured to handle, causing overloading.
This series removes the scsi_debug submit queue concept and uses
pre-existing APIs to manage and examine tags, like scsi_block_requests()
and blk_mq_tagset_busy_iter(). Using standard APIs makes the driver more
maintainable and extensible in future.
A restriction is also added to allow sdebug_max_queue only be modified when
no shosts are present, i.e. we need to remove shosts, modify
sdebug_max_queue, and then re-add the shosts.
Link: https://lore.kernel.org/r/20230327074310.1862889-1-john.g.garry@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
It's easy to get scsi_debug to error on throughput testing when we have
multiple shosts:
$ lsscsi
[7:0:0:0] disk Linux scsi_debug 0191
[0:0:0:0] disk Linux scsi_debug 0191
$ fio --filename=/dev/sda --filename=/dev/sdb --direct=1 --rw=read --bs=4k
--iodepth=256 --runtime=60 --numjobs=40 --time_based --name=jpg
--eta-newline=1 --readonly --ioengine=io_uring --hipri --exitall_on_error
jpg: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=256
...
fio-3.28
Starting 40 processes
[ 27.521809] hrtimer: interrupt took 33067 ns
[ 27.904660] sd 7:0:0:0: [sdb] tag#171 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=0s
[ 27.904660] sd 0:0:0:0: [sda] tag#58 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=0s
fio: io_u error [ 27.904667] sd 0:0:0:0: [sda] tag#58 CDB: Read(10) 28 00 00 00 27 00 00 01 18 00
on file /dev/sda[ 27.904670] sd 0:0:0:0: [sda] tag#62 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=0s
The issue is related to how the driver manages submit queues and tags. A
single array of submit queues - sdebug_q_arr - with its own set of tags is
shared among all shosts. As such, for occasions when we have more than one
shost it is possible to overload the submit queues and run out of tags.
The struct sdebug_queue is to manage tags and hold the associated
queued command entry pointer (for that tag).
Since the tagset iters are now used for functions like
sdebug_blk_mq_poll(), there is no need to manage these queues. Indeed,
blk-mq already provides what we need for managing tags and queues.
Drop sdebug_queue and all its usage in the driver.
Signed-off-by: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20230327074310.1862889-12-john.g.garry@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The shost->can_queue value is initially used to set per-HW queue context
tag depth in the block layer. This ensures that the shost is not sent too
many commands which it can deal with. However lowering sdebug_max_queue
separately means that we can easily overload the shost, as in the following
example:
$ cat /sys/bus/pseudo/drivers/scsi_debug/max_queue
192
$ cat /sys/class/scsi_host/host0/can_queue
192
$ echo 100 > /sys/bus/pseudo/drivers/scsi_debug/max_queue
$ cat /sys/class/scsi_host/host0/can_queue
192
$ fio --filename=/dev/sda --direct=1 --rw=read --bs=4k --iodepth=256
--runtime=1200 --numjobs=10 --time_based --group_reporting
--name=iops-test-job --eta-newline=1 --readonly --ioengine=io_uring
--hipri --exitall_on_error
iops-test-job: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=io_uring, iodepth=256
...
fio-3.28
Starting 10 processes
[ 111.269885] scsi_io_completion_action: 400 callbacks suppressed
[ 111.269885] blk_print_req_error: 400 callbacks suppressed
[ 111.269889] I/O error, dev sda, sector 440 op 0x0:(READ) flags 0x1200000 phys_seg 1 prio class 2
[ 111.269892] sd 0:0:0:0: [sda] tag#132 FAILED Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK cmd_age=0s
[ 111.269897] sd 0:0:0:0: [sda] tag#132 CDB: Read(10) 28 00 00 00 01 68 00 00 08 00
[ 111.277058] I/O error, dev sda, sector 360 op 0x0:(READ) flags 0x1200000 phys_seg 1 prio class 2
[...]
Ensure that this cannot happen by allowing sdebug_max_queue be modified
only when we have no shosts. As such, any shost->can_queue value will match
sdebug_max_queue, and sdebug_max_queue cannot be modified separately.
Since retired_max_queue is no longer set, remove support.
Continue to apply the restriction that sdebug_host_max_queue cannot be
modified when sdebug_host_max_queue is set. Adding support for that would
mean extra code, and no one has complained about this restriction
previously.
A command like the following may be used to remove a shost:
echo -1 > /sys/bus/pseudo/drivers/scsi_debug/add_host
Signed-off-by: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20230327074310.1862889-11-john.g.garry@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The functions to update ndelay and delay value first check whether we have
any in-flight IO for any host. It does this by checking if any tag is used
in the global submit queues.
We can achieve the same by setting the host as blocked and then ensuring
that we have no in-flight commands with scsi_host_busy().
Note that scsi_host_busy() checks SCMD_STATE_INFLIGHT flag, which is only
set per command after we ensure that the host is not blocked, i.e. we see
more commands active after the check for scsi_host_busy() returns 0.
Signed-off-by: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20230327074310.1862889-10-john.g.garry@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Instead of iterating all deferred commands in the submission queue
structures, use blk_mq_tagset_busy_iter(), which is a standard API for
this.
Signed-off-by: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20230327074310.1862889-9-john.g.garry@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Instead of iterating all deferred commands in the submission queue
structures, use blk_mq_tagset_busy_iter(), which is a standard API for
this.
Signed-off-by: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20230327074310.1862889-8-john.g.garry@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Eventually we will drop the sdebug_queue struct as it is not really
required, so start with making the sdebug_queued_cmd dynamically allocated
for the lifetime of the scsi_cmnd in the driver.
As an interim measure, make sdebug_queued_cmd.sd_dp a pointer to struct
sdebug_defer. Also keep a value of the index allocated in
sdebug_queued_cmd.qc_arr in struct sdebug_queued_cmd.
To deal with an races in accessing the scsi cmnd allocated struct
sdebug_queued_cmd, add a spinlock for the scsi command in its priv area.
Races may be between scheduling a command for completion, aborting a
command, and the command actually completing and freeing the struct
sdebug_queued_cmd.
[mkp: typo fix]
Signed-off-by: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20230327074310.1862889-7-john.g.garry@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The feature to block queues is quite dubious, since it races with in-flight
IO. Indeed, it seems unnecessary for block queues for any times we do so.
Anyway, to keep the same behaviour, use standard SCSI API to stop IO being
sent - scsi_{un}block_requests().
Signed-off-by: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20230327074310.1862889-6-john.g.garry@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is no reason that calls to block_unblock_all_queues() from different
context can't race with one another, so protect with the
sdebug_host_list_mutex. There's no need for a more fine-grained per shost
locking here (and we don't have a per-host lock anyway).
Also simplify some touched code in sdebug_change_qdepth().
Signed-off-by: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20230327074310.1862889-5-john.g.garry@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The shost list lock, sdebug_host_list_lock, is a spinlock. We would only
lock in non-atomic context in this driver, so use a mutex instead, which is
friendlier if we need to schedule when iterating.
Signed-off-by: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20230327074310.1862889-4-john.g.garry@oracle.com
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In clear_luns_changed_on_target(), we iter all devices for all shosts to
conditionally clear the SDEBUG_UA_LUNS_CHANGED flag in the per-device
uas_bm.
One condition to see whether we clear the flag is to test whether the host
for the device under consideration is the same as the matching device's
(devip) host. This check will only ever pass for devices for the same
shost, so only iter the devices for the matching device shost.
We can now drop the spinlock'ing of the sdebug_host_list_lock in the same
function. This will allow us to use a mutex instead of the spinlock for the
global shost lock, as clear_luns_changed_on_target() could be called in
non-blocking context, in scsi_debug_queuecommand() -> make_ua() ->
clear_luns_changed_on_target() (which is why required a spinlock).
Signed-off-by: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20230327074310.1862889-3-john.g.garry@oracle.com
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is a report that the blktests scsi/004 test for "TASK SET FULL" (TSF)
now fails.
The condition upon we should issue this TSF is when the sdev queue is
full. The check for a full queue has an off-by-1 error. Previously we would
increment the number of requests in the queue after testing if the queue
would be full, i.e. test if one less than full. Since we now use
scsi_device_busy() to count the number of requests in the queue, this would
already account for the current request, so fix the test for queue full
accordingly.
Fixes: 151f0ec9dd ("scsi: scsi_debug: Drop sdebug_dev_info.num_in_q")
Reported-by: kernel test robot <oliver.sang@intel.com>
Link: https://lore.kernel.org/oe-lkp/202303201334.18b30edc-oliver.sang@intel.com
Signed-off-by: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20230327074310.1862889-2-john.g.garry@oracle.com
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
chenxiang <chenxiang66@hisilicon.com> says:
This series contain some fixes including:
- Grab sas_dev lock when traversing sas_dev list to avoid NULL
pointer
- Handle NCQ error when IPTT is valid
- Ensure all enabled PHYs up during controller reset
- Exit suspend state when usage count of runtime PM is greater than 0
https://lore.kernel.org/r/1679283265-115066-1-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When the current status of the host controller is suspended, enabling a
local PHY just after disabling all local PHYs in expander environment, a
hang as follows occurs:
[ 486.854655] INFO: task kworker/u256:1:899 blocked for more than 120 seconds.
[ 486.862207] Not tainted 6.1.0-rc4+ #1
[ 486.870545] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 486.878893] task:kworker/u256:1 state:D stack:0 pid:899 ppid:2 flags:0x00000008
[ 486.887745] Workqueue: 0000:74:02.0_disco_q sas_discover_domain [libsas]
[ 486.894704] Call trace:
[ 486.897400] __switch_to+0xf0/0x170
[ 486.901146] __schedule+0x3e4/0x1160
[ 486.904970] schedule+0x64/0x104
[ 486.908442] rpm_resume+0x158/0x6a0
[ 486.912163] __pm_runtime_resume+0x5c/0x84
[ 486.916489] smp_execute_task_sg+0x1f8/0x264 [libsas]
[ 486.921773] sas_discover_expander.part.0+0xbc/0x720 [libsas]
[ 486.927750] sas_discover_root_expander+0x90/0x154 [libsas]
[ 486.933552] sas_discover_domain+0x444/0x6d0 [libsas]
[ 486.938826] process_one_work+0x1e0/0x450
[ 486.943057] worker_thread+0x150/0x44c
[ 486.947015] kthread+0x114/0x120
[ 486.950447] ret_from_fork+0x10/0x20
[ 486.954292] INFO: task kworker/u256:2:1780 blocked for more than 120 seconds.
[ 486.961637] Not tainted 6.1.0-rc4+ #1
[ 486.966087] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 486.974356] task:kworker/u256:2 state:D stack:0 pid:1780 ppid:2 flags:0x00000208
[ 486.983141] Workqueue: 0000:74:02.0_event_q sas_port_event_worker [libsas]
[ 486.990252] Call trace:
[ 486.992930] __switch_to+0xf0/0x170
[ 486.996645] __schedule+0x3e4/0x1160
[ 487.000439] schedule+0x64/0x104
[ 487.003886] schedule_timeout+0x17c/0x1c0
[ 487.008102] wait_for_completion+0x7c/0x160
[ 487.012488] __flush_workqueue+0x104/0x3e0
[ 487.016782] sas_porte_bytes_dmaed+0x414/0x454 [libsas]
[ 487.022203] sas_port_event_worker+0x38/0x60 [libsas]
[ 487.027449] process_one_work+0x1e0/0x450
[ 487.031645] worker_thread+0x150/0x44c
[ 487.035594] kthread+0x114/0x120
[ 487.039017] ret_from_fork+0x10/0x20
[ 487.042828] INFO: task bash:11488 blocked for more than 121 seconds.
[ 487.049366] Not tainted 6.1.0-rc4+ #1
[ 487.053746] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 487.061953] task:bash state:D stack:0 pid:11488 ppid:10977 flags:0x00000204
[ 487.070698] Call trace:
[ 487.073355] __switch_to+0xf0/0x170
[ 487.077050] __schedule+0x3e4/0x1160
[ 487.080833] schedule+0x64/0x104
[ 487.084270] schedule_timeout+0x17c/0x1c0
[ 487.088474] wait_for_completion+0x7c/0x160
[ 487.092851] __flush_workqueue+0x104/0x3e0
[ 487.097137] drain_workqueue+0xb8/0x160
[ 487.101159] __sas_drain_work+0x50/0x90 [libsas]
[ 487.105963] sas_suspend_ha+0x64/0xd4 [libsas]
[ 487.110590] suspend_v3_hw+0x198/0x1e8 [hisi_sas_v3_hw]
[ 487.115989] pci_pm_runtime_suspend+0x5c/0x1d0
[ 487.120606] __rpm_callback+0x50/0x150
[ 487.124535] rpm_callback+0x74/0x80
[ 487.128204] rpm_suspend+0x110/0x640
[ 487.131955] rpm_idle+0x1f4/0x2d0
[ 487.135447] __pm_runtime_idle+0x58/0x94
[ 487.139538] queue_phy_enable+0xcc/0xf0 [libsas]
[ 487.144330] store_sas_phy_enable+0x74/0x100
[ 487.148770] dev_attr_store+0x20/0x34
[ 487.152606] sysfs_kf_write+0x4c/0x5c
[ 487.156437] kernfs_fop_write_iter+0x120/0x1b0
[ 487.161049] vfs_write+0x2d0/0x36c
[ 487.164625] ksys_write+0x70/0x100
[ 487.168194] __arm64_sys_write+0x24/0x30
[ 487.172280] invoke_syscall+0x50/0x120
[ 487.176186] el0_svc_common.constprop.0+0x168/0x190
[ 487.181214] do_el0_svc+0x34/0xc0
[ 487.184680] el0_svc+0x2c/0xb4
[ 487.187879] el0t_64_sync_handler+0xb8/0xbc
[ 487.192205] el0t_64_sync+0x19c/0x1a0
We find that when all local PHYs are disabled, all the devices will be
removed, the ->runtime_suspend() callback suspend_v3_hw() directly execute
since the controller usage count drop to 0. On the other side, the first
local PHY is enabled through the sysfs interface, and ensures that function
phy_up_v3_hw() is completed due to suspend_v3_hw()->
interrupt_disable_v3_hw(). In the expander scenario,
sas_discover_root_expander() is executed in event work
DISCE_DISCOVER_DOMAIN, which will increases the controller usage count and
carry out a resume and sends SMPIO, it cannot be completed because the
runtime PM status of the controller is RPM_SUSPENDING. At the same time,
the ->runtime_suspend() callback suspend_v3_hw() also cannot complete the
process because of drain libsas event queue in sas_suspend_ha(), so hung
occurs.
(thread 1) | (thread 2)
... |
rpm_idle() |
... |
__update_runtime_status(RPM_SUSPENDING)|
... | ...
suspend_v3_hw() | smp_execute_task_sg()
... | ...
interrupt_disable_v3_hw() | pm_runtime_get_sync()
| ...
... | rpm_resume() //RPM_SUSPENDING
|
__sas_drain_work() |
To fix this, check if the current runtime PM status of the controller
allows to be suspended continue after interrupt_disable_v3_hw(), return
immediately if not.
Signed-off-by: Yihang Li <liyihang9@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hislicon.com>
Link: https://lore.kernel.org/r/1679283265-115066-5-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For the controller reset operation, hisi_sas_phy_enable() is executed for
each enabled local PHY, and refresh the port id of each device based on the
latest hisi_sas_phy->port_id after 1 second sleep, hisi_sas_phy->port_id is
configured in the interrupt processing function phy_up_v3_hw(). However, in
directly attached scenario, for some SATA disks the amount of time for
phyup more than 1s sometimes. In this case, incorrect port id may be
configured in hisi_sas_refresh_port_id(). As a result, all the internal
IOs fail and disk lost, such as follows:
[10717.666565] hisi_sas_v3_hw 0000:74:02.0: phyup: phy1 link_rate=10(sata)
[10718.826813] hisi_sas_v3_hw 0000:74:02.0: erroneous completion iptt=63
task=00000000c1ab1c2b dev id=200 addr=5000000000000501 CQ hdr: 0x8000007 0xc8003f 0x0
0x0 Error info: 0x0 0x0 0x0 0x0
[10718.843428] sas: TMF task open reject failed 5000000000000501
[10718.849242] hisi_sas_v3_hw 0000:74:02.0: erroneous completion iptt=64
task=00000000c1ab1c2b dev id=200 addr=5000000000000501 CQ hdr: 0x8000007 0xc80040 0x0
0x0 Error info: 0x0 0x0 0x0 0x0
[10718.865856] sas: TMF task open reject failed 5000000000000501
[10718.871670] hisi_sas_v3_hw 0000:74:02.0: erroneous completion iptt=65
task=00000000c1ab1c2b dev id=200 addr=5000000000000501 CQ hdr: 0x8000007 0xc80041 0x0
0x0 Error info: 0x0 0x0 0x0 0x0
[10718.888284] sas: TMF task open reject failed 5000000000000501
[10718.894093] sas: executing TMF for 5000000000000501 failed after 3 attempts!
[10718.901114] hisi_sas_v3_hw 0000:74:02.0: ata disk 5000000000000501 reset failed
[10718.908410] hisi_sas_v3_hw 0000:74:02.0: controller reset complete
.....
[10773.298633] ata216.00: revalidation failed (errno=-19)
[10773.303753] ata216.00: disable device
So the time of waitting for PHYs up is 1s which may be not enough. To solve
the issue, running hisi_sas_phy_enable() in parallel through async
operations and use wait_for_completion_timeout() to wait for PHYs come up
instead of directly sleep for 1 second.
Signed-off-by: Yihang Li <liyihang9@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Link: https://lore.kernel.org/r/1679283265-115066-4-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If an NCQ error occurs when the IPTT is valid and slot->abort flag is set
in completion path, sas_task_abort() will be called to abort only one NCQ
command now, and the host would be set to SHOST_RECOVERY state. But this
may not kick-off EH Immediately until other outstanding QCs timeouts. As a
result, the host may remain in the SHOST_RECOVERY state for up to 30
seconds, such as follows:
[7972317.645234] hisi_sas_v3_hw 0000:74:04.0: erroneous completion iptt=3264 task=00000000466116b8 dev id=2 sas_addr=0x5000000000000502 CQ hdr: 0x1883 0x20cc0 0x40000 0x20420000 Error info: 0x0 0x0 0x200000 0x0
[7972341.508264] sas: Enter sas_scsi_recover_host busy: 32 failed: 32
[7972341.984731] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 32 tries: 1
All NCQ commands that are in the queue should be aborted when an NCQ error
occurs in this scenario.
Fixes: 05d91b557a ("scsi: hisi_sas: Directly trigger SCSI error handling for completion errors")
Signed-off-by: Xingui Yang <yangxingui@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Link: https://lore.kernel.org/r/1679283265-115066-3-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
clang with W=1 reports:
drivers/scsi/qla4xxx/ql4_isr.c:475:11: error: variable
'count' set but not used [-Werror,-Wunused-but-set-variable]
uint32_t count = 0;
^
This variable is not used so remove it.
Signed-off-by: Tom Rix <trix@redhat.com>
Link: https://lore.kernel.org/r/20230331175757.1860780-1-trix@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
clang with W=1 reports:
drivers/scsi/snic/snic_scsi.c:490:6: error: variable
'xfer_len' set but not used [-Werror,-Wunused-but-set-variable]
u64 xfer_len = 0;
^
This variable is not used so remove it.
Signed-off-by: Tom Rix <trix@redhat.com>
Link: https://lore.kernel.org/r/20230328001647.1778448-1-trix@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
clang with W=1 reports:
drivers/scsi/qedf/qedf_main.c:2227:6: error: variable
'num_handled' set but not used [-Werror,-Wunused-but-set-variable]
int num_handled = 0;
^
This variable is not used so remove it.
Signed-off-by: Tom Rix <trix@redhat.com>
Link: https://lore.kernel.org/r/20230330203444.1842425-1-trix@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The validity of sock should be checked before assignment to avoid incorrect
values. Commit 57569c37f0 ("scsi: iscsi: iscsi_tcp: Fix null-ptr-deref
while calling getpeername()") introduced this change which may lead to
inconsistent values of tcp_sw_conn->sendpage and conn->datadgst_en.
Fix the issue by moving the position of the assignment.
Fixes: 57569c37f0 ("scsi: iscsi: iscsi_tcp: Fix null-ptr-deref while calling getpeername()")
Signed-off-by: Zhong Jinghua <zhongjinghua@huawei.com>
Link: https://lore.kernel.org/r/20230329071739.2175268-1-zhongjinghua@huaweicloud.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is a memory leak reported by kmemleak:
unreferenced object 0xffffc900003f0000 (size 12288):
comm "modprobe", pid 19117, jiffies 4299751452 (age 42490.264s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<00000000629261a8>] __vmalloc_node_range+0xe56/0x1110
[<0000000001906886>] __vmalloc_node+0xbd/0x150
[<000000005bb4dc34>] vmalloc+0x25/0x30
[<00000000a2dc1194>] qla2x00_create_host+0x7a0/0xe30 [qla2xxx]
[<0000000062b14b47>] qla2x00_probe_one+0x2eb8/0xd160 [qla2xxx]
[<00000000641ccc04>] local_pci_probe+0xeb/0x1a0
The root cause is traced to an error-handling path in qla2x00_probe_one()
when the adapter "base_vha" initialize failed. The fab_scan_rp "scan.l" is
used to record the port information and it is allocated in
qla2x00_create_host(). However, it is not released in the error handling
path "probe_failed".
Fix this by freeing the memory of "scan.l" when an error occurs in the
adapter initialization process.
Fixes: a4239945b8 ("scsi: qla2xxx: Add switch command to simplify fabric discovery")
Signed-off-by: Li Zetao <lizetao1@huawei.com>
Link: https://lore.kernel.org/r/20230325110004.363898-1-lizetao1@huawei.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
clang with W=1 reports:
drivers/scsi/scsi_transport_fc.c:908:6: error: variable
'desc_cnt' set but not used [-Werror,-Wunused-but-set-variable]
u32 desc_cnt = 0, bytes_remain;
^
This variable is not used so remove it.
Signed-off-by: Tom Rix <trix@redhat.com>
Link: https://lore.kernel.org/r/20230326003222.1343671-1-trix@redhat.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Simplify the sr_open() by removing the goto label as the function only
returns one error code.
Signed-off-by: Enze Li <lienze@kylinos.cn>
Link: https://lore.kernel.org/r/20230327030237.3407253-1-lienze@kylinos.cn
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The driver is exiting from the fault watchdog thread if it sees the 0xF002
(Soft reset in progress) fault code.
If the driver initiates the soft reset, then the driver restarts the
watchdog at the end of the soft reset completion. However, if the soft
reset is initiated by the firmware asynchronously, then the driver will
never restart the watchdog and never re-initialize the controller after the
asynchronous soft reset completion.
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20230331122317.11391-1-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When a disk is removed with in-flight I/O, the application needs to wait
for 30 seconds (depending on the timeout configuration) to hear back from
the kernel. Xingui tried to fix this issue by aborting the ATA link for
SATA devices[1], however this approach left the SAS devices unresolved.
Try to fix this issue by aborting all in-flight requests when the device is
gone. This is implemented by iterating over the tagset.
[1] https://lore.kernel.org/lkml/234e04db-7539-07e4-a6b8-c6b05f78193d@opensource.wdc.com/T/
Cc: Xingui Yang <yangxingui@huawei.com>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Link: https://lore.kernel.org/r/20230330110930.175539-1-yanaijie@huawei.com
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Four small fixes, three in drivers. The core fix is yet another
attempt to insulate us from UFS devices' weird behaviour for VPD
pages.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZCX0NCYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishSz5AQDiKfGd
tTA/Ff24pGzvqFfsU9R8vbV/ROKxiC9yAf9OGwEA9racOhJOi0FWNwkM0M3hwBG6
I4T0jp+ItbdSwbONV+Q=
=ltqk
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Four small fixes, three in drivers. The core fix is yet another
attempt to insulate us from UFS devices' weird behaviour for VPD
pages"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: mpt3sas: Don't print sense pool info twice
scsi: core: Improve scsi_vpd_inquiry() checks
scsi: megaraid_sas: Fix crash after a double completion
scsi: megaraid_sas: Fix fw_crash_buffer_show()
Currently, MAX_SKB_FRAGS value is 17.
For standard tcp sendmsg() traffic, no big deal because tcp_sendmsg()
attempts order-3 allocations, stuffing 32768 bytes per frag.
But with zero copy, we use order-0 pages.
For BIG TCP to show its full potential, we add a config option
to be able to fit up to 45 segments per skb.
This is also needed for BIG TCP rx zerocopy, as zerocopy currently
does not support skbs with frag list.
We have used MAX_SKB_FRAGS=45 value for years at Google before
we deployed 4K MTU, with no adverse effect, other than
a recent issue in mlx4, fixed in commit 26782aad00
("net/mlx4: MLX4_TX_BOUNCE_BUFFER_SIZE depends on MAX_SKB_FRAGS")
Back then, goal was to be able to receive full size (64KB) GRO
packets without the frag_list overhead.
Note that /proc/sys/net/core/max_skb_frags can also be used to limit
the number of fragments TCP can use in tx packets.
By default we keep the old/legacy value of 17 until we get
more coverage for the updated values.
Sizes of struct skb_shared_info on 64bit arches
MAX_SKB_FRAGS | sizeof(struct skb_shared_info):
==============================================
17 320
21 320+64 = 384
25 320+128 = 448
29 320+192 = 512
33 320+256 = 576
37 320+320 = 640
41 320+384 = 704
45 320+448 = 768
This inflation might cause problems for drivers assuming they could pack
both the incoming packet (for MTU=1500) and skb_shared_info in half a page,
using build_skb().
v3: fix build error when CONFIG_NET=n
v2: fix two build errors assuming MAX_SKB_FRAGS was "unsigned long"
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://lore.kernel.org/r/20230323162842.1935061-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
_base_allocate_sense_dma_pool() already prints out the sense pool
information, so don't print it a second time after calling it in
_base_allocate_memory_pools(). In addition the version in
_base_allocate_memory_pools() was using the wrong size value, sz, which was
last assigned when doing some nvme calculations instead of sense_sz to
determine the pool size in kilobytes.
Cc: Sathya Prakash <sathya.prakash@broadcom.com>
Cc: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
Cc: MPT-FusionLinux.pdl@broadcom.com
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Fixes: 970ac2bb70 ("scsi: mpt3sas: Force sense buffer allocations to be within same 4 GB region")
Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/20230324193204.567932-1-jsnitsel@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Some USB-SATA adapters have broken behavior when an unsupported VPD page is
probed: Depending on the VPD page number, a 4-byte header with a valid VPD
page number but with a 0 length is returned. Currently, scsi_vpd_inquiry()
only checks that the page number is valid to determine if the page is
valid, which results in receiving only the 4-byte header for the
non-existent page. This error manifests itself very often with page 0xb9
for the Concurrent Positioning Ranges detection done by sd_read_cpr(),
resulting in the following error message:
sd 0:0:0:0: [sda] Invalid Concurrent Positioning Ranges VPD page
Prevent such misleading error message by adding a check in
scsi_vpd_inquiry() to verify that the page length is not 0.
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Link: https://lore.kernel.org/r/20230322022211.116327-1-damien.lemoal@opensource.wdc.com
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When a physical disk is attached directly "without JBOD MAP support" (see
megasas_get_tm_devhandle()) then there is no real error handling in the
driver. Return FAILED instead of SUCCESS.
Fixes: 18365b1385 ("megaraid_sas: Task management support")
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Link: https://lore.kernel.org/r/20230324150134.14696-1-thenzl@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If crash_dump_buf is not allocated then crash dump can't be available.
Replace logical 'and' with 'or'.
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Link: https://lore.kernel.org/r/20230324135249.9733-1-thenzl@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When cmdid == CMDID_INT_CMDS, the 'cmds' pointer is NULL but is
dereferenced below.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 0f2bb84d2a ("[SCSI] megaraid: simplify internal command handling")
Signed-off-by: Danila Chernetsov <listdansp@mail.ru>
Link: https://lore.kernel.org/r/20230317175109.18585-1-listdansp@mail.ru
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Bart Van Assche <bvanassche@acm.org> says:
It helps humans and the compiler if it is made explicit that SCSI host
templates are not modified. Hence this patch series that constifies most
SCSI host templates. Please consider this patch series for the next merge
window.
Link: https://lore.kernel.org/r/20230322195515.1267197-1-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Make it explicit that the SCSI host template is not modified.
Reviewed-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230322195515.1267197-77-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Make it explicit that the SCSI host template is not modified.
Tested-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230322195515.1267197-66-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Make it explicit that the SCSI host template is not modified.
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230322195515.1267197-63-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Make it explicit that the SCSI host template is not modified.
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230322195515.1267197-57-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Make it explicit that the SCSI host template is not modified.
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230322195515.1267197-50-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Make it explicit that the SCSI host template is not modified.
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230322195515.1267197-48-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Make it explicit that the SCSI host template is not modified.
Acked-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230322195515.1267197-45-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Make it explicit that the SCSI host template is not modified.
Acked-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230322195515.1267197-42-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Make it explicit that the SCSI host template is not modified.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230322195515.1267197-31-bvanassche@acm.org
Acked-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Make it explicit that the SCSI host template is not modified.
Acked-by: Khalid Aziz <khalid@gonehiking.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230322195515.1267197-13-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Declare the SCSI host template pointer members const and also the remaining
SCSI host template pointers in the SCSI core.
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230322195515.1267197-4-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Prepare for constifying most SCSI host template pointers by constifying the
SCSI host template pointer arguments and variables in the SCSI core.
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230322195515.1267197-3-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Access the qla2xxx_driver_template data structure directly instead of via
the host pointer. This patch prepares for declaring the 'hostt' pointer
const.
Cc: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20230322195515.1267197-2-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Ching Huang <ching2048@areca.com.tw> says:
The following patches were made over mkp's 6.4/scsi-staging
This series comtain some fixes:
- Deprecate arcmsr_pci_unmap_dma()
- Fix ADAPTER_TYPE_B 64-bit DMA compatibility issue
- Fix reading buffer empty length error
- Add driver proc_name
- Update driver's version to v1.50.00.13-20230206
Link: https://lore.kernel.org/r/6f3eb04dbe89d2b9f239600dd2c575227f3c0afc.camel@areca.com.tw
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If there is no driver match function, the driver core assumes that each
candidate pair (driver, device) matches, see driver_match_device().
Drop the pseudo_lld bus match function that always returned 1. This results
in the same behaviour as when there is no match function.
[mkp+jgg: patch description]
Signed-off-by: Lizhe <sensor1010@163.com>
Link: https://lore.kernel.org/r/20230319042732.278691-1-sensor1010@163.com
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Four small fixes, three in drivers. The core fix adds a UFS device to
an existing quirk to avoid a huge delay on boot.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZBzeWSYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishZ0DAP0f7AM7
fo9rUeihF0/eOGOWPuw83lUOl74g6yRkiCb3JQD+OBJO7k+0CwUqKdC1Nscfdq0K
67EApAuLsYw/PN9DaCM=
=tnVs
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Four small fixes, three in drivers.
The core fix adds a UFS device to an existing quirk to avoid a huge
delay on boot"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: scsi_dh_alua: Fix memleak for 'qdata' in alua_activate()
scsi: qla2xxx: Synchronize the IOCB count to be in order
scsi: qla2xxx: Perform lockless command completion in abort path
scsi: core: Add BLIST_SKIP_VPD_PAGES for SKhynix H28U74301AMR
struct bus_type should never be modified in a sysfs callback as there is
nothing in the structure to modify, and frankly, the structure is almost
never used in a sysfs callback, so mark it as constant to allow struct
bus_type to be moved to read-only memory.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Bounine <alex.bou9@gmail.com>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Ben Widawsky <bwidawsk@kernel.org>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Harald Freudenberger <freude@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Hu Haowen <src.res@email.cn>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Stuart Yoder <stuyoder@gmail.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Acked-by: Ilya Dryomov <idryomov@gmail.com> # rbd
Acked-by: Ira Weiny <ira.weiny@intel.com> # cxl
Reviewed-by: Alex Shi <alexs@kernel.org>
Acked-by: Iwona Winiarska <iwona.winiarska@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci
Acked-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com> # scsi
Link: https://lore.kernel.org/r/20230313182918.1312597-23-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ten patches, eight in drivers and two in the core, which correct a
regression from directory removal and add a no VPD size quirk also to
fix a regression. All pretty small as you can see from the diffstat.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZBPW8yYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishdl+AQDyZieE
4Ma7ARkNKD+RGsW0qjJ/xMD2kHgkIRuH4vSG5wD+I8a4rXXy5EysOM/z49InCOD1
D2v7CyVEZjTF2qpEcB4=
=5HIW
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Ten patches, eight in drivers and two in the core, which correct a
regression from directory removal and add a no VPD size quirk also to
fix a regression. All pretty small"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: ufs: mcq: Use active_reqs to check busy in clock scaling
scsi: core: Fix a procfs host directory removal regression
scsi: core: Add BLIST_NO_VPD_SIZE for some VDASD
scsi: mpi3mr: Fix expander node leak in mpi3mr_remove()
scsi: mpi3mr: Fix memory leaks in mpi3mr_init_ioc()
scsi: mpi3mr: Fix sas_hba.phy memory leak in mpi3mr_remove()
scsi: mpi3mr: Fix mpi3mr_hba_port memory leak in mpi3mr_remove()
scsi: mpi3mr: Fix config page DMA memory leak
scsi: mpi3mr: Fix throttle_groups memory leak
scsi: mpt3sas: Fix NULL pointer access in mpt3sas_transport_port_add()
The module pointer in class_create() never actually did anything, and it
shouldn't have been requred to be set as a parameter even if it did
something. So just remove it and fix up all callers of the function in
the kernel tree at the same time.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Link: https://lore.kernel.org/r/20230313181843.1207845-4-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There is no need to manually set the owner of a struct class, as the
registering function does it automatically, so remove all of the
explicit settings from various drivers that did so as it is unneeded.
This allows us to remove this pointer entirely from this structure going
forward.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Link: https://lore.kernel.org/r/20230313181843.1207845-2-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Update driver version to 8.4.1.0.0.
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Link: https://lore.kernel.org/r/20230316110209.60145-9-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Update copyright year from 2022 to 2023.
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Link: https://lore.kernel.org/r/20230316110209.60145-8-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
SCSI error handling has taken place for timed out I/Os on a drive and the
corresponding drive is removed. Stop escalating to higher level of reset by
returning the TUR with "I_T NEXUS LOSS OCCURRED" sense key.
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Link: https://lore.kernel.org/r/20230316110209.60145-5-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Modify Message Unit Reset timeout value to 120 seconds from the previous
value of 30 seconds.
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Sreekant Reddy <sreekanth.reddy@broadcom.com>
Link: https://lore.kernel.org/r/20230316110209.60145-4-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
After a soft reset, while setting up admin queue pairs, the driver
initially sets admin request base and admin reply base addresses to
NULL. This leads to DMA memory pointed by these pointers getting leaked.
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Link: https://lore.kernel.org/r/20230316110209.60145-3-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Upon Virtual disk removal, firmware sends device status change event
(Virtual disk remove event) and expects the driver to start device remove
handshake (by sending target reset and IOU control command to firmware).
However, the driver does not initiate the device remove handshake which
leads to the firmware fault.
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Link: https://lore.kernel.org/r/20230316110209.60145-2-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
It is preferred to use typed property access functions (i.e.
of_property_read_<type> functions) rather than low-level
of_get_property()/of_find_property() functions for reading properties.
Convert reading boolean properties to of_property_read_bool().
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20230310144723.1544999-1-robh@kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Dmitry Bogdanov <d.bogdanov@yadro.com> says:
The patchset is based on 6.4/scsi-staging branch.
The first 11 patches are just a refactoring to reduce code duplication
in fabric drivers. They make several callouts be optional in fabric
ops. Make a default implementation of the optional ops and remove
such implementations in the fabric drivers.
The last patch is a new virtual remote fabric driver. It has a
valueble sence with patchset "scsi: target: make RTPI an TPG
identifier" to configure RPTI on remote/tpgt_x same as on tpgt_y on
other nodes in a storage cluster. That allows to report the same ports
in RTPG from each node and to have a clusterwide tpg/acl/lun view in
kernel.
Link: https://lore.kernel.org/r/20230313181110.20566-1-d.bogdanov@yadro.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
John Garry <john.g.garry@oracle.com> says:
This series contains a bunch of minor improvements to the driver. I
have another bunch waiting with more major changes.
Most of the changes are quite straightforward, and the only patches of
note are as follows:
- Fix the command abort feature, enabled with host option
SDEBUG_OPT_CMD_ABORT.
- Drop driver count of queued commands per device.
- Add poll mode completions to statistics. We already have poll mode
callback call count, so maybe it was intentional to omit poll mode
from the statistics.
Link: https://lore.kernel.org/r/20230313093114.1498305-1-john.g.garry@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently commands completed via poll mode are not included in the
statistics gathering for deferred completions and missed CPUs.
Poll mode completions should be treated the same as other deferred
completion types, so add poll mode completions to the statistics.
Signed-off-by: John Garry <john.g.garry@oracle.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Link: https://lore.kernel.org/r/20230313093114.1498305-12-john.g.garry@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The command abort feature allows us to test aborting a command which has
timed-out.
The idea is that for specific commands we just don't call scsi_done() and
allow the request to timeout, which ensures SCSI EH kicks-in we try to
abort the command.
Since commit 4a0c6f432d ("scsi: scsi_debug: Add new defer type for
mq_poll") this does not seem to work. The issue is that we clear the
sd_dp->aborted flag in schedule_resp() before the completion callback has
run. When the completion callback actually runs, it calls scsi_done() as
normal as sd_dp->aborted unset. This is all very racy.
Fix by not clearing sd_dp->aborted in schedule_resp(). Also move the call
to blk_abort_request() from schedule_resp() to sdebug_q_cmd_complete(),
which makes the code have a more logical sequence.
I also note that this feature only works for commands which are classed as
"SDEG_RES_IMMED_MASK", but only practically triggered with prior RW
commands. So for my experiment I need to run fio to trigger the error on
the "nth" command (see inject_on_this_cmd()), and then run something like
sg_sync to queue a command to actually trigger the abort.
Signed-off-by: John Garry <john.g.garry@oracle.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Link: https://lore.kernel.org/r/20230313093114.1498305-11-john.g.garry@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In schedule_resp(), under certain conditions we check whether the
per-device queue is full (num_in_q == queue depth - 1) and we may inject a
"task set full" (TSF) error if it is.
However how we read num_in_q is racy - many threads may see the same "queue
is full" value (and also issue a TSF).
There is per-queue locking in reading per-device num_in_q, but that would
not help.
Replace how we read num_in_q at this location with a call to
scsi_device_busy(). Calling scsi_device_busy() is likewise racy (as reading
num_in_q), so nothing lost or gained. Calling scsi_device_busy() is also
slow as it needs to read all bits in the per-device budget bitmap, but we
can live with that since we're just a simulator and it's only under a
certain configs which we would see this.
Also move the "task set full" print earlier as it would only be called now
under this condition. However, previously it may not have been called -
like returning early - but keep it simple and always call it.
At this point we can drop sdebug_dev_info.num_in_q - it is difficult to
maintain properly and adds extra normal case command processing.
Signed-off-by: John Garry <john.g.garry@oracle.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Link: https://lore.kernel.org/r/20230313093114.1498305-10-john.g.garry@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The per-device num_in_q value cannot exceed the device queue depth, so drop
the check.
Signed-off-by: John Garry <john.g.garry@oracle.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Link: https://lore.kernel.org/r/20230313093114.1498305-9-john.g.garry@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The check for device pointer for the SCSI command is unnecessary, so drop
it.
The only caller is scsi_try_host_reset() -> eh_host_reset_handler(), and
there that pointer cannot be NULL.
Indeed, there is already code later in the same function which does not
check the device pointer for the SCSI command.
Signed-off-by: John Garry <john.g.garry@oracle.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Link: https://lore.kernel.org/r/20230313093114.1498305-8-john.g.garry@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The checks for SCSI cmnd, SCSI device, and SCSI host are unnecessary, so
drop them. Likewise, drop the NULL check for sdbg_host.
The only caller is scsi_try_bus_reset() -> eh_bus_reset_handler(), and
there those pointers cannot be NULL.
Signed-off-by: John Garry <john.g.garry@oracle.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Link: https://lore.kernel.org/r/20230313093114.1498305-7-john.g.garry@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The checks for SCSI cmnd, SCSI device, and SCSI host are unnecessary, so
drop them. Likewise, drop the NULL check for sdbg_host.
The only caller is scsi_try_target_reset() -> eh_target_reset_handler(),
and there those pointers cannot be NULL.
Signed-off-by: John Garry <john.g.garry@oracle.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Link: https://lore.kernel.org/r/20230313093114.1498305-6-john.g.garry@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The SCSI cmnd pointer arg would never be NULL, so drop the check. In
addition, its SCSI device pointer would never be NULL (so drop that check
also).
The only caller is scsi_try_bus_device_reset(), and the command and its
device pointer could not be NULL when calling eh_device_reset_handler()
there.
Signed-off-by: John Garry <john.g.garry@oracle.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Link: https://lore.kernel.org/r/20230313093114.1498305-5-john.g.garry@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The SCSI cmnd pointer arg would never be NULL, so drop the check. In
addition, its SCSI device pointer would never be NULL.
The only caller is scsi_send_eh_cmnd() -> scsi_abort_eh_cmnd() ->
scsi_try_to_abort_cmd() -> scsi_try_to_abort_cmd(), and in the origin of
that chain those pointers cannot be NULL.
Signed-off-by: John Garry <john.g.garry@oracle.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Link: https://lore.kernel.org/r/20230313093114.1498305-4-john.g.garry@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In sdebug_device_create(), the devip->sdbg_host pointer is needlessly set
twice, so stop doing that.
Signed-off-by: John Garry <john.g.garry@oracle.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Link: https://lore.kernel.org/r/20230313093114.1498305-3-john.g.garry@oracle.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This driver stores just a pointer to the driver host structure in
host->hostdata[]. Most other drivers actually have the driver host
structure allocated in host->hostdata[], but this driver is different as we
allocate that memory separately before allocating the shost memory.
However there is no need to allocate this memory only in host->hostdata[]
when we can already look up the driver host structure from shost->dma_dev,
so add a macro for this - shost_to_sdebug_host(). Rename to_sdebug_host()
-> dev_to_sdebug_host() to avoid ambiguity.
Also remove a check for !sdbg_host in find_build_dev_info(), as this cannot
be true. Other similar checks will be later removed.
Signed-off-by: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20230313093114.1498305-2-john.g.garry@oracle.com
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
smatch reports several warnings:
drivers/scsi/be2iscsi/be_main.c:148:1: warning:
symbol 'dev_attr_beiscsi_log_enable' was not declared. Should it be static?
drivers/scsi/be2iscsi/be_main.c:158:1: warning:
symbol 'dev_attr_beiscsi_drvr_ver' was not declared. Should it be static?
drivers/scsi/be2iscsi/be_main.c:159:1: warning:
symbol 'dev_attr_beiscsi_adapter_family' was not declared. Should it be static?
drivers/scsi/be2iscsi/be_main.c:160:1: warning:
symbol 'dev_attr_beiscsi_fw_ver' was not declared. Should it be static?
drivers/scsi/be2iscsi/be_main.c:161:1: warning:
symbol 'dev_attr_beiscsi_phys_port' was not declared. Should it be static?
drivers/scsi/be2iscsi/be_main.c:162:1: warning:
symbol 'dev_attr_beiscsi_active_session_count' was not declared. Should it be static?
drivers/scsi/be2iscsi/be_main.c:164:1: warning:
symbol 'dev_attr_beiscsi_free_session_count' was not declared. Should it be static ?
These variables are only used in be_main.c, so should be static.
Signed-off-by: Tom Rix <trix@redhat.com>
Link: https://lore.kernel.org/r/20230314005157.536918-1-trix@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
While adding and removing the controller, the following call trace was
observed:
WARNING: CPU: 3 PID: 623596 at kernel/dma/mapping.c:532 dma_free_attrs+0x33/0x50
CPU: 3 PID: 623596 Comm: sh Kdump: loaded Not tainted 5.14.0-96.el9.x86_64 #1
RIP: 0010:dma_free_attrs+0x33/0x50
Call Trace:
qla2x00_async_sns_sp_done+0x107/0x1b0 [qla2xxx]
qla2x00_abort_srb+0x8e/0x250 [qla2xxx]
? ql_dbg+0x70/0x100 [qla2xxx]
__qla2x00_abort_all_cmds+0x108/0x190 [qla2xxx]
qla2x00_abort_all_cmds+0x24/0x70 [qla2xxx]
qla2x00_abort_isp_cleanup+0x305/0x3e0 [qla2xxx]
qla2x00_remove_one+0x364/0x400 [qla2xxx]
pci_device_remove+0x36/0xa0
__device_release_driver+0x17a/0x230
device_release_driver+0x24/0x30
pci_stop_bus_device+0x68/0x90
pci_stop_and_remove_bus_device_locked+0x16/0x30
remove_store+0x75/0x90
kernfs_fop_write_iter+0x11c/0x1b0
new_sync_write+0x11f/0x1b0
vfs_write+0x1eb/0x280
ksys_write+0x5f/0xe0
do_syscall_64+0x5c/0x80
? do_user_addr_fault+0x1d8/0x680
? do_syscall_64+0x69/0x80
? exc_page_fault+0x62/0x140
? asm_exc_page_fault+0x8/0x30
entry_SYSCALL_64_after_hwframe+0x44/0xae
The command was completed in the abort path during driver unload with a
lock held, causing the warning in abort path. Hence complete the command
without any lock held.
Reported-by: Lin Li <lilin@redhat.com>
Tested-by: Lin Li <lilin@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Link: https://lore.kernel.org/r/20230313043711.13500-2-njavali@marvell.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: John Meneghini <jmeneghi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Xiaomi Poco F1 (qcom/sdm845-xiaomi-beryllium*.dts) comes with a SKhynix
H28U74301AMR UFS. The sd_read_cpr() operation leads to a 120 second
timeout, making the device bootup very slow:
[ 121.457736] sd 0:0:0:1: [sdb] tag#23 timing out command, waited 120s
Setting the BLIST_SKIP_VPD_PAGES allows the device to skip the failing
sd_read_cpr operation and boot normally.
Signed-off-by: Joel Selvaraj <joelselvaraj.oss@gmail.com>
Link: https://lore.kernel.org/r/20230313041402.39330-1-joelselvaraj.oss@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
20 Fixes all in drivers except the one zone storage revalidation fix
to sd. The megaraid_sas fixes are more on the level of a driver
update (enabling crash dump and increasing lun number) but I thought
you could let this slide on -rc1 and the next most extensive update is
a load of fixes to mpi3mr.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZAuqwyYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishVm/AP4t1E3w
Jc19sSzMV6OZc5x1eTSGzso80fkTeFgVXWOmcQD9G6ymH78hS0cb7FRIwtnHcV5r
6TwbInHF2LwNKzPhuMI=
=BdqN
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Twenty fixes all in drivers except the one zone storage revalidation
fix to sd.
The megaraid_sas fixes are more on the level of a driver update
(enabling crash dump and increasing lun number) but I thought you
could let this slide on -rc1 and the next most extensive update is a
load of fixes to mpi3mr"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: sd: Fix wrong zone_write_granularity value during revalidate
scsi: storvsc: Handle BlockSize change in Hyper-V VHD/VHDX file
scsi: megaraid_sas: Driver version update to 07.725.01.00-rc1
scsi: megaraid_sas: Add crash dump mode capability bit in MFI capabilities
scsi: megaraid_sas: Update max supported LD IDs to 240
scsi: mpi3mr: Bad drive in topology results kernel crash
scsi: mpi3mr: NVMe command size greater than 8K fails
scsi: mpi3mr: Return proper values for failures in firmware init path
scsi: mpi3mr: Wait for diagnostic save during controller init
scsi: mpi3mr: Driver unload crashes host when enhanced logging is enabled
scsi: mpi3mr: ioctl timeout when disabling/enabling interrupt
scsi: lpfc: Avoid usage of list iterator variable after loop
scsi: lpfc: Check kzalloc() in lpfc_sli4_cgn_params_read()
scsi: ufs: mcq: qcom: Clean the return path of ufs_qcom_mcq_config_resource()
scsi: ufs: mcq: qcom: Fix passing zero to PTR_ERR
scsi: ufs: ufs-qcom: Remove impossible check
scsi: ufs: core: Add soft dependency on governor_simpleondemand
scsi: hisi_sas: Check devm_add_action() return value
scsi: qla2xxx: Add option to disable FC2 Target support
scsi: target: iscsi: Fix an error message in iscsi_check_key()
Bjorn Helgaas <helgaas@kernel.org> says:
Since f26e58bf6f ("PCI/AER: Enable error reporting when AER is native"),
which appeared in v6.0, the PCI core has enabled PCIe error reporting for
all devices during enumeration.
Remove driver code to do this and remove unnecessary includes of
<linux/aer.h> from several other drivers.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages. Since commit f26e58bf6f ("PCI/AER: Enable error reporting when
AER is native"), the PCI core does this for all devices during enumeration,
so the driver doesn't need to do it itself.
Remove the redundant pci_enable_pcie_error_reporting() call from the
driver. Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.
Note that this only controls ERR_* Messages from the device. An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20230307182842.870378-11-helgaas@kernel.org
Cc: Nilesh Javali <njavali@marvell.com>
Cc: Manish Rangankar <mrangankar@marvell.com>
Cc: GR-QLogic-Storage-Upstream@marvell.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages. Since commit f26e58bf6f ("PCI/AER: Enable error reporting when
AER is native"), the PCI core does this for all devices during enumeration,
so the driver doesn't need to do it itself.
Remove the redundant pci_enable_pcie_error_reporting() call from the
driver. Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.
Note that this only controls ERR_* Messages from the device. An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20230307182842.870378-10-helgaas@kernel.org
Cc: Nilesh Javali <njavali@marvell.com>
Cc: GR-QLogic-Storage-Upstream@marvell.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages. Since commit f26e58bf6f ("PCI/AER: Enable error reporting when
AER is native"), the PCI core does this for all devices during enumeration,
so the driver doesn't need to do it itself.
Remove the redundant pci_enable_pcie_error_reporting() call from the
driver. Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.
Note that this only controls ERR_* Messages from the device. An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20230307182842.870378-9-helgaas@kernel.org
Cc: Sathya Prakash <sathya.prakash@broadcom.com>
Cc: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Cc: Suganath Prabu Subramani <suganath-prabu.subramani@broadcom.com>
Cc: MPT-FusionLinux.pdl@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages. Since commit f26e58bf6f ("PCI/AER: Enable error reporting when
AER is native"), the PCI core does this for all devices during enumeration,
so the driver doesn't need to do it itself.
Remove the redundant pci_enable_pcie_error_reporting() call from the
driver. Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.
Note that this only controls ERR_* Messages from the device. An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20230307182842.870378-8-helgaas@kernel.org
Cc: James Smart <james.smart@broadcom.com>
Cc: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit 105a3dbc74 ("hpsa: clean up driver init") added a comment about
pci_disable_pcie_error_reporting(), but hpsa has never called either
pci_enable_pcie_error_reporting() or pci_disable_pcie_error_reporting().
Since commit f26e58bf6f ("PCI/AER: Enable error reporting when AER is
native"), the PCI core is responsible for managing PCIe device error
reporting.
Remove the comment about pci_disable_pcie_error_reporting().
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20230307182842.870378-7-helgaas@kernel.org
Cc: Don Brace <don.brace@microchip.com>
Cc: storagedev@microchip.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages. Since commit f26e58bf6f ("PCI/AER: Enable error reporting when
AER is native"), the PCI core does this for all devices during enumeration,
so the driver doesn't need to do it itself.
Remove the redundant pci_enable_pcie_error_reporting() call from the
driver. Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.
Note that this only controls ERR_* Messages from the device. An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20230307182842.870378-5-helgaas@kernel.org
Cc: Anil Gurumurthy <anil.gurumurthy@qlogic.com>
Cc: Sudarsana Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages. Since commit f26e58bf6f ("PCI/AER: Enable error reporting when
AER is native"), the PCI core does this for all devices during enumeration,
so the driver doesn't need to do it itself.
Remove the redundant pci_enable_pcie_error_reporting() call from the
driver. Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.
Note that this only controls ERR_* Messages from the device. An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20230307182842.870378-4-helgaas@kernel.org
Cc: Ketan Mukadam <ketan.mukadam@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages. Since commit f26e58bf6f ("PCI/AER: Enable error reporting when
AER is native"), the PCI core does this for all devices during enumeration,
so the driver doesn't need to do it itself.
Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.
Note that this only controls ERR_* Messages from the device. An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.
Also remove pci_disable_pcie_error_reporting() from the .error_detected()
path, which was added by commit 5c63f7f710 ("aacraid: Added EEH support")
but looks unnecessary. Error reporting will be disabled by the device
reset and will be re-enabled by the pci_restore_state() in
aac_pci_slot_reset().
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20230307182842.870378-2-helgaas@kernel.org
Cc: Adaptec OEM Raid Solutions <aacraid@microsemi.com>
Cc: Raghava Aditya Renukunta <raghavaaditya.renukunta@pmcs.com>
Cc: Tomas Henzl <thenzl@redhat.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
chenxiang <chenxiang66@hisilicon.com> says:
To support IO_URING IOPOLL support for hisi_sas, we need to:
- Add and fill mq_poll interface to poll queue;
- Ensure internal I/Os (including internal abort I/Os) are delivered and
completed through non-iopoll queue (queue 0);
Sending internal abort commands to non-poll queue actually requires to
sending the abort command to every queue. This carries a a risk. Make iopoll
support module parameter "experimental".
I have tested performance on v3 hw with different modes as follows. 4K
READs and 4K WRITEs both see an improvement when enabling poll mode:
4K READ 4K RANDREAD 4K WRITE 4K RANDWRITE
interrupt + libaio 1770k 1316k 1197k 831k
interrupt + io_uring 1848k 1390k 1238k 857k
iopoll + io_uring 2117k 1364k 1874k 849k
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently we sync irq to avoid freeing task before using task in I/O
completion. After adding io_uring support, we need to do something similar
for poll queues. As the process of CQ entries on poll queue are protected
by spinlock cq->lock, we can use spin_lock() + spin_unlock() on cq->lock to
make sure that CQ entries are processed to completion and then the complete
queue is synced.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Link: https://lore.kernel.org/r/1678169355-76215-4-git-send-email-chenxiang66@hisilicon.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add a module parameter to set how many queues are used for iopoll. Also
fill the interface mq_poll. For internal I/Os from libsas and libata we use
non-iopoll queue (queue 0) to deliver and complete them. But for internal
abort I/Os, just don't send them for poll queues.
There is still a risk associated as this sends internal abort commands to
non-iopoll queues which actually requires sending an internal abort command
to every queue. As a result, make the module parameter as "experimental"
for now.
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Link: https://lore.kernel.org/r/1678169355-76215-3-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Put the work of processing cq slots in a separate function,
complete_v3_hw(), which can then be used by cq_thread_v3_hw() and other
functions when adding poll support.
Co-developed-by: John Garry <john.garry@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Link: https://lore.kernel.org/r/1678169355-76215-2-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Justin Tee <justin.tee@broadcom.com> says:
Update lpfc to revision 14.2.0.11
This patch set contains bug fixes for buffer overflow, resource
management, discovery, and HBA error status handling.
The patches were cut against Martin's 6.3/scsi-queue tree.
Justin Tee (10):
lpfc: Protect against potential lpfc_debugfs_lockstat_write buffer overflow
lpfc: Reorder freeing of various dma buffers and their list removal
lpfc: Fix lockdep warning for rx_monitor lock when unloading driver
lpfc: Record LOGO state with discovery engine even if aborted
lpfc: Defer issuing new PLOGI if received RSCN before completing REG_LOGIN
lpfc: Correct used_rpi count when devloss tmo fires with no recovery
lpfc: Skip waiting for register ready bits when in unrecoverable state
lpfc: Revise lpfc_error_lost_link reason code evaluation logic
lpfc: Update lpfc version to 14.2.0.11
lpfc: Copyright updates for 14.2.0.11 patches
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Extended status reason code errors should mask off the IOERR_PARAM_MASK
before checking strict equalities for IOERR values.
Update the lpfc_error_lost_link() routine as such.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230301231626.9621-9-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During tolerance tests that force an HBA to become unresponsive, rmmod
hangs resulting in the inability to remove the driver.
The lpfc_pci_remove_one_s4() routine attempts to submit a clean up mailbox
command via the lpfc_sli4_post_sync_mbox() routine, but ends up waiting
forever for a mailbox register to set its ready bit. Because the HBA is in
an unrecoverable and unresponsive state, the ready bit will never be set.
Create a new routine called lpfc_sli4_unrecoverable_port(), which checks a
port status register's error notification bits.
Use the lpfc_sli4_unrecoverable_port() routine in ready bit check routines
to early return error if port is deemed unrecoverable.
Also, when the lpfc_handle_eratt_s4() handler detects an unrecoverable
state, call the lpfc_sli4_offline_eratt() routine to kick off flushing
outstanding I/O.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230301231626.9621-8-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A fabric controller can sometimes send an RDP request right before a link
down event. Because of this outstanding RDP request, the driver does not
remove the last reference count on its ndlp causing a potential leak of RPI
resources when devloss tmo fires.
In lpfc_cmpl_els_rsp(), modify the NPIV clause to always allow the
lpfc_drop_node() routine to execute when not registered with SCSI
transport.
This relaxes the contraint that an NPIV ndlp must be in a specific state in
order to call lpfc_drop node. Logic is revised such that the
lpfc_drop_node() routine is always called to ensure the last ndlp decrement
occurs.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230301231626.9621-7-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When mapped to a target with multiple virtual ports, a link bounce
sometimes results in unsuccessful rediscovery of all of the target's
virtual ports. This is because a succession of repeat RSCNs for the
virtual target ports leaves ndlps in the REG_LOGIN state with the
NLP_REG_LOGIN_SEND flag set. With NLP_REG_LOGIN_SEND set, during the next
PLOGI, the driver will UNREG_RPI. When UNREG_RPI is processed, the driver
can be in the middle of PRLI_ISSUE or MAPPED state resulting in an illegal
state transition by the discovery engine and stalling.
Fix by calling the discovery state machine with DEVICE_RECOVERY event
during RSCN processing. This will set the NLP_IGNR_REG_CMPL bit and
prevent the old REG_LOGIN state from advancing. Then for the new PLOGI
issue, add the check for the NLP_IGNR_REG_CMPL bit to delay issuing the new
PLOGI until the queued REG_LOGIN and UNREG_LOGIN have been processed.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230301231626.9621-6-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A target vendor array reboot in P2P topology can sometimes result in
unsuccessful rediscovery.
Rework the lpfc_cmpl_els_logo() routine such that when the LOGO completes
as a failure because of driver abort, the LOGO state is still recorded with
the discovery state machine.
This is a small rework to set LOGO completion without forcing a device
removal state change.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230301231626.9621-5-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Lockdep enabled kernels report a theoretical deadlock state where the
cmf_timer interrupt occurs while the rx_monitor ring is being destroyed.
During rmmod, the cmf_timer is cancelled prior to the
lpfc_rx_monitor_destroy_ring call. This actually eliminates the need to
take the rx_monitor ring lock in lpfc_rx_monitor_destroy_ring. Thus, just
remove lock/unlock of rx_monitor in lpfc_rx_monitor_destroy_ring.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230301231626.9621-4-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Code sections where DMA resources are freed before list removal are
reworked to ensure item removal before being freed.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230301231626.9621-3-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A static code analysis tool flagged the possibility of buffer overflow when
using copy_from_user() for a debugfs entry.
Currently, it is possible that copy_from_user() copies more bytes than what
would fit in the mybuf char array. Add a min() restriction check between
sizeof(mybuf) - 1 and nbytes passed from the userspace buffer to protect
against buffer overflow.
Link: https://lore.kernel.org/r/20230301231626.9621-2-justintee8345@gmail.com
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Some storage, such as AIX VDASD (virtual storage) and IBM 2076 (front
end), fail as a result of commit c92a6b5d63 ("scsi: core: Query VPD
size before getting full page").
That commit changed getting SCSI VPD pages so that we now read just
enough of the page to get the actual page size, then read the whole
page in a second read. The problem is that the above mentioned
hardware returns zero for the page size, because of a firmware
error. In such cases, until the firmware is fixed, this new blacklist
flag says to revert to the original method of reading the VPD pages,
i.e. try to read a whole buffer's worth on the first try.
[mkp: reworked somewhat]
Fixes: c92a6b5d63 ("scsi: core: Query VPD size before getting full page")
Reported-by: Martin Wilck <mwilck@suse.com>
Suggested-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Lee Duncan <lduncan@suse.com>
Link: https://lore.kernel.org/r/20220928181350.9948-1-leeman.duncan@gmail.com
Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add a missing resource clean up in .remove.
Fixes: e22bae3066 ("scsi: mpi3mr: Add expander devices to STL")
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Link: https://lore.kernel.org/r/20230302234336.25456-7-thenzl@redhat.com
Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Don't allocate memory again when IOC is being reinitialized.
Fixes: fe6db61515 ("scsi: mpi3mr: Handle offline FW activation in graceful manner")
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Link: https://lore.kernel.org/r/20230302234336.25456-6-thenzl@redhat.com
Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A fix for:
DMA-API: pci 0000:83:00.0: device driver has pending DMA allocations while released from device [count=1]
Fixes: 32d457d5a2 ("scsi: mpi3mr: Add framework to issue config requests")
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Link: https://lore.kernel.org/r/20230302234336.25456-3-thenzl@redhat.com
Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Port is allocated by sas_port_alloc_num() and rphy is allocated by either
sas_end_device_alloc() or sas_expander_alloc(), all of which may return
NULL. So we need to check the rphy to avoid possible NULL pointer access.
If sas_rphy_add() returned with failure, rphy is set to NULL. We would
access the rphy in the following lines which would also result NULL pointer
access.
Fixes: 78316e9dfc ("scsi: mpt3sas: Fix possible resource leaks in mpt3sas_transport_port_add()")
Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
Link: https://lore.kernel.org/r/20230225100135.2109330-1-haowenchao2@huawei.com
Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When the sd driver revalidates host-managed SMR disks, it calls
disk_set_zoned() which changes the zone_write_granularity attribute value
to the logical block size regardless of the device type. After that, the sd
driver overwrites the value in sd_zbc_read_zone() with the physical block
size, since ZBC/ZAC requires this for host-managed disks. Between the calls
to disk_set_zoned() and sd_zbc_read_zone(), there exists a window where the
attribute shows the logical block size as the zone_write_granularity value,
which is wrong for host-managed disks. The duration of the window is from
20ms to 200ms, depending on report zone command execution time.
To avoid the wrong zone_write_granularity value between disk_set_zoned()
and sd_zbc_read_zone(), modify the value not in sd_zbc_read_zone() but
just after disk_set_zoned() call.
Fixes: a805a4fa4f ("block: introduce zone_write_granularity limit")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Link: https://lore.kernel.org/r/20230306063024.3376959-1-shinichiro.kawasaki@wdc.com
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Hyper-V uses a VHD or VHDX file on the host as the underlying storage for a
virtual disk. The VHD/VHDX file format is a sparse format where real disk
space on the host is assigned in chunks that the VHD/VHDX file format calls
the BlockSize. This BlockSize is not to be confused with the 512-byte (or
4096-byte) sector size of the underlying storage device. The default block
size for a new VHD/VHDX file is 32 Mbytes. When a guest VM touches any
disk space within a 32 Mbyte chunk of the VHD/VHDX file, Hyper-V allocates
32 Mbytes of real disk space for that section of the VHD/VHDX. Similarly,
if a discard operation is done that covers an entire 32 Mbyte chunk,
Hyper-V will free the real disk space for that portion of the VHD/VHDX.
This BlockSize is surfaced in Linux as the "discard_granularity" in
/sys/block/sd<x>/queue, which makes sense.
Hyper-V also has differencing disks that can overlay a VHD/VHDX file to
capture changes to the VHD/VHDX while preserving the original VHD/VHDX.
One example of this differencing functionality is for VM snapshots. When a
snapshot is created, a differencing disk is created. If the snapshot is
rolled back, Hyper-V can just delete the differencing disk, and the VM will
see the original disk contents at the time the snapshot was taken.
Differencing disks are used in other scenarios as well.
The BlockSize for a differencing disk defaults to 2 Mbytes, not 32 Mbytes.
The smaller default is used because changes to differencing disks are
typically scattered all over, and Hyper-V doesn't want to allocate 32
Mbytes of real disk space for a stray write here or there. The smaller
BlockSize provides more efficient use of real disk space.
When a differencing disk is added to a VHD/VHDX, Hyper-V reports
UNIT_ATTENTION with a sense code indicating "Operating parameters have
changed", because the value of discard_granularity should be changed to 2
Mbytes. When the differencing disk is removed, discard_granularity should
be changed back to 32 Mbytes. However, current code simply reports a
message from scsi_report_sense() and the value of
/sys/block/sd<x>/queue/discard_granularity is not updated. The message
isn't very actionable by a sysadmin.
Fix this by having the storvsc driver check for the sense code indicating
that the underly VHD/VHDX block size has changed, and do a rescan of the
device to pick up the new discard_granularity. With this change the entire
transition to/from differencing disks is handled automatically and
transparently, with no confusing messages being output.
Link: https://lore.kernel.org/r/1677516514-86060-1-git-send-email-mikelley@microsoft.com
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In kdump kernel mode, the driver works in reduced functionality mode with
some features disabled such as reduced MSI-X count and RDPQ disabled, etc.
However, the firmware is not aware of this mode in some cases, which
results in undefined behavior.
To address this, the driver informs the firmware about the kdump mode
through MPI capabilities bit during driver initialization. This allows
firmware to adjust its behavior accordingly.
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Link: https://lore.kernel.org/r/20230302105342.34933-3-chandrakanth.patil@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The firmware only supports Logical Disk IDs up to 240 and LD ID 255 (0xFF)
is reserved for deleted LDs. However, in some cases, firmware was assigning
LD ID 254 (0xFE) to deleted LDs and this was causing the driver to mark the
wrong disk as deleted. This in turn caused the wrong disk device to be
taken offline by the SCSI midlayer.
To address this issue, limit the LD ID range from 255 to 240. This ensures
the deleted LD ID is properly identified and removed by the driver without
accidently deleting any valid LDs.
Fixes: ae6874ba4b ("scsi: megaraid_sas: Early detection of VD deletion through RaidMap update")
Reported-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Chandrakanth Patil <chandrakanth.patil@broadcom.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Link: https://lore.kernel.org/r/20230302105342.34933-2-chandrakanth.patil@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When the SAS Transport Layer support is enabled and a device exposed to
the OS by the driver fails INQUIRY commands, the driver frees up the memory
allocated for an internal HBA port data structure. However, in some places,
the reference to the freed memory is not cleared. When the firmware sends
the Device Info change event for the same device again, the freed memory is
accessed and that leads to memory corruption and OS crash.
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Link: https://lore.kernel.org/r/20230228140835.4075-7-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A wrong variable is checked while populating PRP entries in the PRP page
and this results in failure. No PRP entries in the PRP page were
successfully created and any NVMe Encapsulated commands with PRP of size
greater than 8K failed.
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Link: https://lore.kernel.org/r/20230228140835.4075-6-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Return proper non-zero return values for all the cases when the controller
initialization and re-initialization fails.
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Link: https://lore.kernel.org/r/20230228140835.4075-5-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If a controller reset operation is triggered to recover the controller from
a fault state, then wait for the snapdump to be saved in the firmware
region before proceeding to reset the controller.
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Link: https://lore.kernel.org/r/20230228140835.4075-4-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Prevent driver from trying to dereference a NULL pointer in a debug print
while removing a device during driver unload.
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Link: https://lore.kernel.org/r/20230228140835.4075-3-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
As part of Task Management handling, the driver will disable and enable the
MSIx index zero which belongs to the Admin reply queue. During this
transition the driver loses some interrupts and this leads to Admin request
and ioctl timeouts.
After enabling the interrupts, poll the Admin reply queue to avoid
timeouts.
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Link: https://lore.kernel.org/r/20230228140835.4075-2-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If the &epd_pool->list is empty when executing
lpfc_get_io_buf_from_expedite_pool() the function would return an invalid
pointer. Even in the case if the list is guaranteed to be populated, the
iterator variable should not be used after the loop to be more robust for
future changes.
Linus proposed to avoid any use of the list iterator variable after the
loop, in the attempt to move the list iterator variable declaration into
the macro to avoid any potential misuse after the loop [1].
Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/ [1]
Signed-off-by: Jakob Koschel <jkl820.git@gmail.com>
Link: https://lore.kernel.org/r/20230301-scsi-lpfc-avoid-list-iterator-after-loop-v1-1-325578ae7561@gmail.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If kzalloc() fails in lpfc_sli4_cgn_params_read(), then we rely on
lpfc_read_object()'s routine to NULL check pdata.
Currently, an early return error is thrown from lpfc_read_object() to
protect us from NULL ptr dereference, but the errno code is -ENODEV.
Change the errno code to a more appropriate -ENOMEM.
Reported-by: Kang Chen <void0red@gmail.com>
Link: https://lore.kernel.org/all/20230226102338.3362585-1-void0red@gmail.com
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20230228044336.5195-1-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In case devm_add_action() fails, check it in the caller of
interrupt_preinit_v3_hw().
Link: https://lore.kernel.org/r/20230227031030.893324-1-void0red@gmail.com
Signed-off-by: Kang Chen <void0red@gmail.com>
Acked-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit 44c57f2058 ("scsi: qla2xxx: Changes to support FCP2 Target") added
support for FC2 Targets. Unfortunately, there are older setups which break
with this new feature enabled.
Allow to disable it via module option.
Link: https://lore.kernel.org/r/20230208152014.109214-1-dwagner@suse.de
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
It turns out that commit 596ff4a09b ("cpumask: re-introduce
constant-sized cpumask optimizations") exposed a number of cases of
drivers not checking the result of "cpumask_next()" and friends
correctly.
The documented correct check for "no more cpus in the cpumask" is to
check for the result being equal or larger than the number of possible
CPU ids, exactly _because_ we've always done those constant-sized
cpumask scans using a widened type before. So the return value of a
cpumask scan should be checked with
if (cpu >= nr_cpu_ids)
...
because the cpumask scan did not necessarily stop exactly *at* that
maximum CPU id.
But a few cases ended up instead using checks like
if (cpu == nr_cpumask_bits)
...
which used that internal "widened" number of bits. And that used to
work pretty much by accident (ok, in this case "by accident" is simply
because it matched the historical internal implementation of the cpumask
scanning, so it was more of a "intentionally using implementation
details rather than an accident").
But the extended constant-sized optimizations then did that internal
implementation differently, and now that code that did things wrong but
matched the old implementation no longer worked at all.
Which then causes subsequent odd problems due to using what ends up
being an invalid CPU ID.
Most of these cases require either unusual hardware or special uses to
hit, but the random.c one triggers quite easily.
All you really need is to have a sufficiently small CONFIG_NR_CPUS value
for the bit scanning optimization to be triggered, but not enough CPUs
to then actually fill that widened cpumask. At that point, the cpumask
scanning will return the NR_CPUS constant, which is _not_ the same as
nr_cpumask_bits.
This just does the mindless fix with
sed -i 's/== nr_cpumask_bits/>= nr_cpu_ids/'
to fix the incorrect uses.
The ones in the SCSI lpfc driver in particular could probably be fixed
more cleanly by just removing that repeated pattern entirely, but I am
not emptionally invested enough in that driver to care.
Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/lkml/481b19b5-83a0-4793-b4fd-194ad7b978c3@roeck-us.net/
Reported-and-tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/lkml/CAMuHMdUKo_Sf7TjKzcNDa8Ve+6QrK+P8nSQrSQ=6LTRmcBKNww@mail.gmail.com/
Reported-by: Vernon Yang <vernon2gm@gmail.com>
Link: https://lore.kernel.org/lkml/20230306160651.2016767-1-vernon2gm@gmail.com/
Cc: Yury Norov <yury.norov@gmail.com>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Updates that missed the first pull, mostly because of needing more
soak time. Driver updates (zfcp, ufs, mpi3mr, plus two ipr bug
fixes), an enclosure services (ses) update (mostly bug fixes) and
other minor bug fixes and changes.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZAJf2SYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishWYeAP9u6umX
5Trq6mtfdPyhSxIhgD6AwJDgVkeApSKAHZRu0AD/dfMTQmeaJory4LD4JW+uAgVl
yFPVz9VPlKTiaM2lwUI=
=BKzr
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
"Updates that missed the first pull, mostly because of needing more
soak time.
Driver updates (zfcp, ufs, mpi3mr, plus two ipr bug fixes), an
enclosure services (ses) update (mostly bug fixes) and other minor bug
fixes and changes"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (32 commits)
scsi: zfcp: Trace when request remove fails after qdio send fails
scsi: zfcp: Change the type of all fsf request id fields and variables to u64
scsi: zfcp: Make the type for accessing request hashtable buckets size_t
scsi: ufs: core: Simplify ufshcd_execute_start_stop()
scsi: ufs: core: Rely on the block layer for setting RQF_PM
scsi: core: Extend struct scsi_exec_args
scsi: lpfc: Fix double word in comments
scsi: core: Remove the /proc/scsi/${proc_name} directory earlier
scsi: core: Fix a source code comment
scsi: cxgbi: Remove unneeded version.h include
scsi: qedi: Remove unneeded version.h include
scsi: mpi3mr: Remove unneeded version.h include
scsi: mpi3mr: Fix missing mrioc->evtack_cmds initialization
scsi: mpi3mr: Use number of bits to manage bitmap sizes
scsi: mpi3mr: Remove unnecessary memcpy() to alltgt_info->dmi
scsi: mpi3mr: Fix issues in mpi3mr_get_all_tgt_info()
scsi: mpi3mr: Fix an issue found by KASAN
scsi: mpi3mr: Replace 1-element array with flex-array
scsi: ipr: Work around fortify-string warning
scsi: ipr: Make ipr_probe_ioa_part2() return void
...
device feature provisioning in ifcvf, mlx5
new SolidNET driver
support for zoned block device in virtio blk
numa support in virtio pmem
VIRTIO_F_RING_RESET support in vhost-net
more debugfs entries in mlx5
resume support in vdpa
completion batching in virtio blk
cleanup of dma api use in vdpa
now simulating more features in vdpa-sim
documentation, features, fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmP0D98PHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpV6IH/iecRgLMWWjp3n31IFdu31f/J4HpF7dczVjK
qtV98eJ1N2pkgeJkdCfmB5XszfvFBeAurrS7++FTHiJhrRfR3Z+2ml/Qtvh5DEyP
qxz6wOw6VVsi/txdUxM1wsxLeEmmzkmFdAmPM+FXeIjhWj76GOgy/4A3eaj6TgzV
W8ShsBve/UZ5qMOC3XbIscvdOrudHJ18tH90Tiz3NZfH1fAs5E4uWbU6Mrz9DJVr
canGvf4kAI9z8qram5HSgzPIXRJEYiF4q/eiStdtiiME8gL1mHLRZDNP1I1LeCAb
q6Q6RCRKi3Ek+LGdH6u+nR1Swu03N2b/g+vgKtv30kJo06oZVzw=
=EasV
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:
- device feature provisioning in ifcvf, mlx5
- new SolidNET driver
- support for zoned block device in virtio blk
- numa support in virtio pmem
- VIRTIO_F_RING_RESET support in vhost-net
- more debugfs entries in mlx5
- resume support in vdpa
- completion batching in virtio blk
- cleanup of dma api use in vdpa
- now simulating more features in vdpa-sim
- documentation, features, fixes all over the place
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (64 commits)
vdpa/mlx5: support device features provisioning
vdpa/mlx5: make MTU/STATUS presence conditional on feature bits
vdpa: validate device feature provisioning against supported class
vdpa: validate provisioned device features against specified attribute
vdpa: conditionally read STATUS in config space
vdpa: fix improper error message when adding vdpa dev
vdpa/mlx5: Initialize CVQ iotlb spinlock
vdpa/mlx5: Don't clear mr struct on destroy MR
vdpa/mlx5: Directly assign memory key
tools/virtio: enable to build with retpoline
vringh: fix a typo in comments for vringh_kiov
vhost-vdpa: print warning when vhost_vdpa_alloc_domain fails
scsi: virtio_scsi: fix handling of kmalloc failure
vdpa: Fix a couple of spelling mistakes in some messages
vhost-net: support VIRTIO_F_RING_RESET
vhost-scsi: convert sysfs snprintf and sprintf to sysfs_emit
vdpa: mlx5: support per virtqueue dma device
vdpa: set dma mask for vDPA device
virtio-vdpa: support per vq dma device
vdpa: introduce get_vq_dma_device()
...
Here is the large set of driver core changes for 6.3-rc1.
There's a lot of changes this development cycle, most of the work falls
into two different categories:
- fw_devlink fixes and updates. This has gone through numerous review
cycles and lots of review and testing by lots of different devices.
Hopefully all should be good now, and Saravana will be keeping a
watch for any potential regression on odd embedded systems.
- driver core changes to work to make struct bus_type able to be moved
into read-only memory (i.e. const) The recent work with Rust has
pointed out a number of areas in the driver core where we are
passing around and working with structures that really do not have
to be dynamic at all, and they should be able to be read-only making
things safer overall. This is the contuation of that work (started
last release with kobject changes) in moving struct bus_type to be
constant. We didn't quite make it for this release, but the
remaining patches will be finished up for the release after this
one, but the groundwork has been laid for this effort.
Other than that we have in here:
- debugfs memory leak fixes in some subsystems
- error path cleanups and fixes for some never-able-to-be-hit
codepaths.
- cacheinfo rework and fixes
- Other tiny fixes, full details are in the shortlog
All of these have been in linux-next for a while with no reported
problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY/ipdg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ynL3gCgwzbcWu0So3piZyLiJKxsVo9C2EsAn3sZ9gN6
6oeFOjD3JDju3cQsfGgd
=Su6W
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the large set of driver core changes for 6.3-rc1.
There's a lot of changes this development cycle, most of the work
falls into two different categories:
- fw_devlink fixes and updates. This has gone through numerous review
cycles and lots of review and testing by lots of different devices.
Hopefully all should be good now, and Saravana will be keeping a
watch for any potential regression on odd embedded systems.
- driver core changes to work to make struct bus_type able to be
moved into read-only memory (i.e. const) The recent work with Rust
has pointed out a number of areas in the driver core where we are
passing around and working with structures that really do not have
to be dynamic at all, and they should be able to be read-only
making things safer overall. This is the contuation of that work
(started last release with kobject changes) in moving struct
bus_type to be constant. We didn't quite make it for this release,
but the remaining patches will be finished up for the release after
this one, but the groundwork has been laid for this effort.
Other than that we have in here:
- debugfs memory leak fixes in some subsystems
- error path cleanups and fixes for some never-able-to-be-hit
codepaths.
- cacheinfo rework and fixes
- Other tiny fixes, full details are in the shortlog
All of these have been in linux-next for a while with no reported
problems"
[ Geert Uytterhoeven points out that that last sentence isn't true, and
that there's a pending report that has a fix that is queued up - Linus ]
* tag 'driver-core-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (124 commits)
debugfs: drop inline constant formatting for ERR_PTR(-ERROR)
OPP: fix error checking in opp_migrate_dentry()
debugfs: update comment of debugfs_rename()
i3c: fix device.h kernel-doc warnings
dma-mapping: no need to pass a bus_type into get_arch_dma_ops()
driver core: class: move EXPORT_SYMBOL_GPL() lines to the correct place
Revert "driver core: add error handling for devtmpfs_create_node()"
Revert "devtmpfs: add debug info to handle()"
Revert "devtmpfs: remove return value of devtmpfs_delete_node()"
driver core: cpu: don't hand-override the uevent bus_type callback.
devtmpfs: remove return value of devtmpfs_delete_node()
devtmpfs: add debug info to handle()
driver core: add error handling for devtmpfs_create_node()
driver core: bus: update my copyright notice
driver core: bus: add bus_get_dev_root() function
driver core: bus: constify bus_unregister()
driver core: bus: constify some internal functions
driver core: bus: constify bus_get_kset()
driver core: bus: constify bus_register/unregister_notifier()
driver core: remove private pointer from struct bus_type
...
F_SEAL_EXEC") which permits the setting of the memfd execute bit at
memfd creation time, with the option of sealing the state of the X bit.
- Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset()
thread-safe for pmd unshare") which addresses a rare race condition
related to PMD unsharing.
- Several folioification patch serieses from Matthew Wilcox, Vishal
Moola, Sidhartha Kumar and Lorenzo Stoakes
- Johannes Weiner has a series ("mm: push down lock_page_memcg()") which
does perform some memcg maintenance and cleanup work.
- SeongJae Park has added DAMOS filtering to DAMON, with the series
"mm/damon/core: implement damos filter". These filters provide users
with finer-grained control over DAMOS's actions. SeongJae has also done
some DAMON cleanup work.
- Kairui Song adds a series ("Clean up and fixes for swap").
- Vernon Yang contributed the series "Clean up and refinement for maple
tree".
- Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It
adds to MGLRU an LRU of memcgs, to improve the scalability of global
reclaim.
- David Hildenbrand has added some userfaultfd cleanup work in the
series "mm: uffd-wp + change_protection() cleanups".
- Christoph Hellwig has removed the generic_writepages() library
function in the series "remove generic_writepages".
- Baolin Wang has performed some maintenance on the compaction code in
his series "Some small improvements for compaction".
- Sidhartha Kumar is doing some maintenance work on struct page in his
series "Get rid of tail page fields".
- David Hildenbrand contributed some cleanup, bugfixing and
generalization of pte management and of pte debugging in his series "mm:
support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with swap
PTEs".
- Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation
flag in the series "Discard __GFP_ATOMIC".
- Sergey Senozhatsky has improved zsmalloc's memory utilization with his
series "zsmalloc: make zspage chain size configurable".
- Joey Gouly has added prctl() support for prohibiting the creation of
writeable+executable mappings. The previous BPF-based approach had
shortcomings. See "mm: In-kernel support for memory-deny-write-execute
(MDWE)".
- Waiman Long did some kmemleak cleanup and bugfixing in the series
"mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF".
- T.J. Alumbaugh has contributed some MGLRU cleanup work in his series
"mm: multi-gen LRU: improve".
- Jiaqi Yan has provided some enhancements to our memory error
statistics reporting, mainly by presenting the statistics on a per-node
basis. See the series "Introduce per NUMA node memory error
statistics".
- Mel Gorman has a second and hopefully final shot at fixing a CPU-hog
regression in compaction via his series "Fix excessive CPU usage during
compaction".
- Christoph Hellwig does some vmalloc maintenance work in the series
"cleanup vfree and vunmap".
- Christoph Hellwig has removed block_device_operations.rw_page() in ths
series "remove ->rw_page".
- We get some maple_tree improvements and cleanups in Liam Howlett's
series "VMA tree type safety and remove __vma_adjust()".
- Suren Baghdasaryan has done some work on the maintainability of our
vm_flags handling in the series "introduce vm_flags modifier functions".
- Some pagemap cleanup and generalization work in Mike Rapoport's series
"mm, arch: add generic implementation of pfn_valid() for FLATMEM" and
"fixups for generic implementation of pfn_valid()"
- Baoquan He has done some work to make /proc/vmallocinfo and
/proc/kcore better represent the real state of things in his series
"mm/vmalloc.c: allow vread() to read out vm_map_ram areas".
- Jason Gunthorpe rationalized the GUP system's interface to the rest of
the kernel in the series "Simplify the external interface for GUP".
- SeongJae Park wishes to migrate people from DAMON's debugfs interface
over to its sysfs interface. To support this, we'll temporarily be
printing warnings when people use the debugfs interface. See the series
"mm/damon: deprecate DAMON debugfs interface".
- Andrey Konovalov provided the accurately named "lib/stackdepot: fixes
and clean-ups" series.
- Huang Ying has provided a dramatic reduction in migration's TLB flush
IPI rates with the series "migrate_pages(): batch TLB flushing".
- Arnd Bergmann has some objtool fixups in "objtool warning fixes".
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY/PoPQAKCRDdBJ7gKXxA
jlvpAPsFECUBBl20qSue2zCYWnHC7Yk4q9ytTkPB/MMDrFEN9wD/SNKEm2UoK6/K
DmxHkn0LAitGgJRS/W9w81yrgig9tAQ=
=MlGs
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Daniel Verkamp has contributed a memfd series ("mm/memfd: add
F_SEAL_EXEC") which permits the setting of the memfd execute bit at
memfd creation time, with the option of sealing the state of the X
bit.
- Peter Xu adds a patch series ("mm/hugetlb: Make huge_pte_offset()
thread-safe for pmd unshare") which addresses a rare race condition
related to PMD unsharing.
- Several folioification patch serieses from Matthew Wilcox, Vishal
Moola, Sidhartha Kumar and Lorenzo Stoakes
- Johannes Weiner has a series ("mm: push down lock_page_memcg()")
which does perform some memcg maintenance and cleanup work.
- SeongJae Park has added DAMOS filtering to DAMON, with the series
"mm/damon/core: implement damos filter".
These filters provide users with finer-grained control over DAMOS's
actions. SeongJae has also done some DAMON cleanup work.
- Kairui Song adds a series ("Clean up and fixes for swap").
- Vernon Yang contributed the series "Clean up and refinement for maple
tree".
- Yu Zhao has contributed the "mm: multi-gen LRU: memcg LRU" series. It
adds to MGLRU an LRU of memcgs, to improve the scalability of global
reclaim.
- David Hildenbrand has added some userfaultfd cleanup work in the
series "mm: uffd-wp + change_protection() cleanups".
- Christoph Hellwig has removed the generic_writepages() library
function in the series "remove generic_writepages".
- Baolin Wang has performed some maintenance on the compaction code in
his series "Some small improvements for compaction".
- Sidhartha Kumar is doing some maintenance work on struct page in his
series "Get rid of tail page fields".
- David Hildenbrand contributed some cleanup, bugfixing and
generalization of pte management and of pte debugging in his series
"mm: support __HAVE_ARCH_PTE_SWP_EXCLUSIVE on all architectures with
swap PTEs".
- Mel Gorman and Neil Brown have removed the __GFP_ATOMIC allocation
flag in the series "Discard __GFP_ATOMIC".
- Sergey Senozhatsky has improved zsmalloc's memory utilization with
his series "zsmalloc: make zspage chain size configurable".
- Joey Gouly has added prctl() support for prohibiting the creation of
writeable+executable mappings.
The previous BPF-based approach had shortcomings. See "mm: In-kernel
support for memory-deny-write-execute (MDWE)".
- Waiman Long did some kmemleak cleanup and bugfixing in the series
"mm/kmemleak: Simplify kmemleak_cond_resched() & fix UAF".
- T.J. Alumbaugh has contributed some MGLRU cleanup work in his series
"mm: multi-gen LRU: improve".
- Jiaqi Yan has provided some enhancements to our memory error
statistics reporting, mainly by presenting the statistics on a
per-node basis. See the series "Introduce per NUMA node memory error
statistics".
- Mel Gorman has a second and hopefully final shot at fixing a CPU-hog
regression in compaction via his series "Fix excessive CPU usage
during compaction".
- Christoph Hellwig does some vmalloc maintenance work in the series
"cleanup vfree and vunmap".
- Christoph Hellwig has removed block_device_operations.rw_page() in
ths series "remove ->rw_page".
- We get some maple_tree improvements and cleanups in Liam Howlett's
series "VMA tree type safety and remove __vma_adjust()".
- Suren Baghdasaryan has done some work on the maintainability of our
vm_flags handling in the series "introduce vm_flags modifier
functions".
- Some pagemap cleanup and generalization work in Mike Rapoport's
series "mm, arch: add generic implementation of pfn_valid() for
FLATMEM" and "fixups for generic implementation of pfn_valid()"
- Baoquan He has done some work to make /proc/vmallocinfo and
/proc/kcore better represent the real state of things in his series
"mm/vmalloc.c: allow vread() to read out vm_map_ram areas".
- Jason Gunthorpe rationalized the GUP system's interface to the rest
of the kernel in the series "Simplify the external interface for
GUP".
- SeongJae Park wishes to migrate people from DAMON's debugfs interface
over to its sysfs interface. To support this, we'll temporarily be
printing warnings when people use the debugfs interface. See the
series "mm/damon: deprecate DAMON debugfs interface".
- Andrey Konovalov provided the accurately named "lib/stackdepot: fixes
and clean-ups" series.
- Huang Ying has provided a dramatic reduction in migration's TLB flush
IPI rates with the series "migrate_pages(): batch TLB flushing".
- Arnd Bergmann has some objtool fixups in "objtool warning fixes".
* tag 'mm-stable-2023-02-20-13-37' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (505 commits)
include/linux/migrate.h: remove unneeded externs
mm/memory_hotplug: cleanup return value handing in do_migrate_range()
mm/uffd: fix comment in handling pte markers
mm: change to return bool for isolate_movable_page()
mm: hugetlb: change to return bool for isolate_hugetlb()
mm: change to return bool for isolate_lru_page()
mm: change to return bool for folio_isolate_lru()
objtool: add UACCESS exceptions for __tsan_volatile_read/write
kmsan: disable ftrace in kmsan core code
kasan: mark addr_has_metadata __always_inline
mm: memcontrol: rename memcg_kmem_enabled()
sh: initialize max_mapnr
m68k/nommu: add missing definition of ARCH_PFN_OFFSET
mm: percpu: fix incorrect size in pcpu_obj_full_size()
maple_tree: reduce stack usage with gcc-9 and earlier
mm: page_alloc: call panic() when memoryless node allocation fails
mm: multi-gen LRU: avoid futile retries
migrate_pages: move THP/hugetlb migration support check to simplify code
migrate_pages: batch flushing TLB
migrate_pages: share more code between _unmap and _move
...
Updates to the usual drivers (ufs, lpfc, qla2xxx, libsas). The major
core change is a rework to remove the two helpers around
scsi_execute_cmd and use it as the only submission interface along
with other minor fixes and updates.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCY/Z7BSYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishfF9AP9HzZaZ
0XumuxjchcJHRntcIAzb9kE62SSWFxBrgZQCtAEAyhAO1zK284esgCa+arkzEC7p
uhxpufQSGnYvbasStsU=
=VsH1
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"Updates to the usual drivers (ufs, lpfc, qla2xxx, libsas).
The major core change is a rework to remove the two helpers around
scsi_execute_cmd and use it as the only submission interface along
with other minor fixes and updates"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (142 commits)
scsi: ufs: core: Fix an error handling path in ufshcd_read_desc_param()
scsi: ufs: core: Fix device management cmd timeout flow
scsi: aic94xx: Add missing check for dma_map_single()
scsi: smartpqi: Replace one-element array with flexible-array member
scsi: mpt3sas: Fix a memory leak
scsi: qla2xxx: Remove the unused variable wwn
scsi: ufs: core: Fix kernel-doc syntax
scsi: ufs: core: Add hibernation callbacks
scsi: snic: Fix memory leak with using debugfs_lookup()
scsi: ufs: core: Limit DMA alignment check
scsi: Documentation: Correct spelling
scsi: Documentation: Correct spelling
scsi: target: Documentation: Correct spelling
scsi: aacraid: Allocate cmd_priv with scsicmd
scsi: ufs: qcom: dt-bindings: Add SM8550 compatible string
scsi: ufs: ufs-qcom: Clear qunipro_g4_sel for HW version major 5
scsi: ufs: qcom: fix platform_msi_domain_free_irqs() reference
scsi: ufs: core: Enable DMA clustering
scsi: ufs: exynos: Fix the maximum segment size
scsi: ufs: exynos: Fix DMA alignment for PAGE_SIZE != 4096
...
* Small cleanup of the pata_octeon driver to drop a useless platform
callback, from Uwe.
* Simplify ata_scsi_cmd_error_handler() code using the fact that
ap->ops->error_handler is NULL most of the time, from Wenchao.
* Several patches improving libata error handling. This is in
preparation for supporting the command duration limits (CDL)
feature. The changes allow handling corner cases of ATA NCQ errors
which do not happen with regular drives but will be triggered with
CDL drives. From Niklas.
* Simplify the qc_fill_rtf operation, from me.
* Improve SCSI command translation for the
REPORT_SUPPORTED_OPERATION_CODES command, from me.
* Cleanup of libata FUA handling. This falls short of enabling FUA for
ATA drives that support it by default as there were concerns that
old drives would break. The series howeverfixes several issues with
the FUA support to ensure that FUA is reported as being supported
only for drives that can handle all possible write cases (NCQ and
non-NCQ). A check in the block layer is also added to ensure that we
never see read FUA commands (current behavior). From me.
* Several patches to move the old PARIDE (parallel port IDE) driver to
libata as pata_parport. Given that this driver also needs protocol
modules, the driver code resides in its own pata_parport directoy
under drivers/ata. From Ondrej.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQSRPv8tYSvhwAzJdzjdoc3SxdoYdgUCY/VTnQAKCRDdoc3SxdoY
dk77AQCA1frczKhcOFe2PK/FsFAiO9Nlx/snk7V95JdjVG8GlwEAkey7mvbXMfX0
fDbqpaCkWFb6SvwxdMSATlqUvwEpSQ8=
=tqQP
-----END PGP SIGNATURE-----
Merge tag 'ata-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull ATA updates from Damien Le Moal:
- Small cleanup of the pata_octeon driver to drop a useless platform
callback (Uwe)
- Simplify ata_scsi_cmd_error_handler() code using the fact that
ap->ops->error_handler is NULL most of the time (Wenchao)
- Several patches improving libata error handling. This is in
preparation for supporting the command duration limits (CDL) feature.
The changes allow handling corner cases of ATA NCQ errors which do
not happen with regular drives but will be triggered with CDL drives
(Niklas)
- Simplify the qc_fill_rtf operation (me)
- Improve SCSI command translation for REPORT_SUPPORTED_OPERATION_CODES
command (me)
- Cleanup of libata FUA handling.
This falls short of enabling FUA for ATA drives that support it by
default as there were concerns that old drives would break. The
series however fixes several issues with the FUA support to ensure
that FUA is reported as being supported only for drives that can
handle all possible write cases (NCQ and non-NCQ). A check in the
block layer is also added to ensure that we never see read FUA
commands (current behavior) (me)
- Several patches to move the old PARIDE (parallel port IDE) driver to
libata as pata_parport. Given that this driver also needs protocol
modules, the driver code resides in its own pata_parport directoy
under drivers/ata (Ondrej)
* tag 'ata-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
ata: pata_parport: Fix ida_alloc return value error check
drivers/block: Move PARIDE protocol modules to drivers/ata/pata_parport
drivers/block: Remove PARIDE core and high-level protocols
ata: pata_parport: add driver (PARIDE replacement)
ata: libata: exclude FUA support for known buggy drives
ata: libata: Fix FUA handling in ata_build_rw_tf()
ata: libata: cleanup fua support detection
ata: libata: Rename and cleanup ata_rwcmd_protocol()
ata: libata: Introduce ata_ncq_supported()
block: add a sanity check for non-write flush/fua bios
ata: libata-scsi: improve ata_scsiop_maint_in()
ata: libata-scsi: do not overwrite SCSI ML and status bytes
ata: libata: move NCQ related ATA_DFLAGs
ata: libata: respect successfully completed commands during errors
ata: libata: read the shared status for successful NCQ commands once
ata: libata: simplify qc_fill_rtf port operation interface
ata: scsi: rename flag ATA_QCFLAG_FAILED to ATA_QCFLAG_EH
ata: libata-eh: Cleanup ata_scsi_cmd_error_handler()
ata: octeon: Drop empty platform remove function
Allow SCSI LLDs to specify SCMD_* flags.
Link: https://lore.kernel.org/r/20230210193258.4004923-2-bvanassche@acm.org
Cc: Mike Christie <michael.christie@oracle.com>
Cc: John Garry <john.g.garry@oracle.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Remove the repeated word "the" in comments.
[mkp: fixed additional typos in the changed lines]
Link: https://lore.kernel.org/r/20230217083046.4090-1-liubo03@inspur.com
Signed-off-by: Bo Liu <liubo03@inspur.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix an incorrect reference to the scsi_remove_host() function in a source
code comment.
Link: https://lore.kernel.org/r/20230210205200.36973-2-bvanassche@acm.org
Fixes: b49493f996 ("Fix a memory leak in scsi_host_dev_release()")
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Commit c1af985d27 ("scsi: mpi3mr: Add Event acknowledgment logic")
introduced an array mrioc->evtack_cmds but initialization of the array
elements was missed. They are just zero cleared. The function
mpi3mr_complete_evt_ack() refers host_tag field of the elements. Due to the
zero value of the host_tag field, the function calls clear_bit() for
mrico->evtack_cmds_bitmap with wrong bit index. This results in memory
access to invalid address and "BUG: KASAN: use-after-free". This BUG was
observed at eHBA-9600 firmware update to version 8.3.1.0. To fix it, add
the missing initialization of mrioc->evtack_cmds.
Link: https://lore.kernel.org/r/20230214005019.1897251-5-shinichiro.kawasaki@wdc.com
Cc: stable@vger.kernel.org
Fixes: c1af985d27 ("scsi: mpi3mr: Add Event acknowledgment logic")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
To allocate bitmaps, the mpi3mr driver calculates sizes of bitmaps using
byte as unit. However, bitmap helper functions assume that bitmaps are
allocated using unsigned long as unit. This gap causes memory access beyond
the bitmap sizes and results in "BUG: KASAN: slab-out-of-bounds". The BUG
was observed at firmware download to eHBA-9600. Call trace indicated that
the out-of-bounds access happened in find_first_zero_bit() called from
mpi3mr_send_event_ack() for miroc->evtack_cmds_bitmap.
To fix the BUG, do not use bytes to manage bitmap sizes. Instead, use
number of bits, and call bitmap helper functions which take number of bits
as arguments. For memory allocation, call bitmap_zalloc() instead of
kzalloc() and krealloc(). For memory free, call bitmap_free() instead of
kfree(). For zero clear, call bitmap_clear() instead of memset().
Remove three fields for bitmap byte sizes in struct scmd_priv which are no
longer required. Replace the field dev_handle_bitmap_sz with
dev_handle_bitmap_bits to keep number of bits of removepend_bitmap across
resize.
Link: https://lore.kernel.org/r/20230214005019.1897251-4-shinichiro.kawasaki@wdc.com
Fixes: c5758fc72b ("scsi: mpi3mr: Gracefully handle online FW update operation")
Fixes: e844adb1fb ("scsi: mpi3mr: Implement SCSI error handler hooks")
Fixes: c1af985d27 ("scsi: mpi3mr: Add Event acknowledgment logic")
Fixes: 824a156633 ("scsi: mpi3mr: Base driver code")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In the function mpi3mr_get_all_tgt_info(), devmap_info points to
alltgt_info->dmi then there is no need to memcpy() data from devmap_info to
alltgt_info->dmi. Remove the unnecessary memcpy(). This also allows to
remove the local variable 'rval' and the goto label 'out'.
Link: https://lore.kernel.org/r/20230214005019.1897251-3-shinichiro.kawasaki@wdc.com
Cc: stable@vger.kernel.org
Fixes: f5e6d5a343 ("scsi: mpi3mr: Add support for driver commands")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The function mpi3mr_get_all_tgt_info() has four issues:
1) It calculates valid entry length in alltgt_info assuming the header part
of the struct mpi3mr_device_map_info would equal to sizeof(u32). The
correct size is sizeof(u64).
2) When it calculates the valid entry length kern_entrylen, it excludes one
entry by subtracting 1 from num_devices.
3) It copies num_device by calling memcpy(). Substitution is enough.
4) It does not specify the calculated length to sg_copy_from_buffer().
Instead, it specifies the payload length which is larger than the
alltgt_info size. It causes "BUG: KASAN: slab-out-of-bounds".
Fix the issues by using the correct header size, removing the subtraction
from num_devices, replacing the memcpy() with substitution and specifying
the correct length to sg_copy_from_buffer().
Link: https://lore.kernel.org/r/20230214005019.1897251-2-shinichiro.kawasaki@wdc.com
Cc: stable@vger.kernel.org
Fixes: f5e6d5a343 ("scsi: mpi3mr: Add support for driver commands")
Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Acked-by: Sathya Prakash Veerichetty <sathya.prakash@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The ipr_log_vpd_compact() function triggers a fortified memcpy() warning
about a potential string overflow with all versions of clang:
In file included from drivers/scsi/ipr.c:43:
In file included from include/linux/string.h:254:
include/linux/fortify-string.h:520:4: error: call to '__write_overflow_field' declared with 'warning' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror,-Wattribute-warning]
__write_overflow_field(p_size_field, size);
^
include/linux/fortify-string.h:520:4: error: call to '__write_overflow_field' declared with 'warning' attribute: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror,-Wattribute-warning]
2 errors generated.
I don't see anything actually wrong with the function, but this is the only
instance I can reproduce of the fortification going wrong in the kernel at
the moment, so the easiest solution may be to rewrite the function into
something that does not trigger the warning.
Instead of having a combined buffer for vendor/device/serial strings, use
three separate local variables and just truncate the whitespace
individually.
Link: https://lore.kernel.org/r/20230214132831.2118392-1-arnd@kernel.org
Cc: Kees Cook <keescook@chromium.org>
Fixes: 8cf093e275 ("[SCSI] ipr: Improved dual adapter errors")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Convert function ipr_probe_ioa_part2() to return void instead of int since
the current implementation always returns 0 to the caller. The
transformation also eliminates the dead code when calling
ipr_probe_ioa_part2() function. Issue identified using returnvar
Coccinelle semantic patch.
Link: https://lore.kernel.org/r/Y7rvQyMOGcPKPTv8@ubun2204.myguest.virtualbox.org
Signed-off-by: Deepak R Varma <drv@mailo.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If a controller has DIX is enabled and an attached disk is formatted using
a protection type supported by the controller, a block integrity profile is
registered to enable protected transfers.
If the disk is subsequently reformatted to disable PI, and the controller
does not support DIX Type 0, this can lead to failures such as this:
[142829.032340] hisi_sas_v3_hw 0000:b4:04.0: erroneous completion iptt=2375 task=00000000bea0970c dev id=5 direct-attached phy4 addr=51c20dbaf642a000 CQ hdr: 0x1023 0x50947 0x0 0x20000 Error info: 0x0 0x0 0x4 0x0
[142829.073883] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[142829.079783] sas: sas_scsi_find_task: aborting task 0x00000000bea0970c
[142829.102342] sas: Internal abort: task to dev 51c20dbaf642a000 response: 0x0 status 0x5
[142829.110319] sas: sas_eh_handle_sas_errors: task 0x00000000bea0970c is done
[142829.117275] sd 7:0:5:0: [sdc] tag#2375 UNKNOWN(0x2003) Result: hostbyte=0x05 driverbyte=DRIVER_OK cmd_age=0s
[142829.127171] sd 7:0:5:0: [sdc] tag#2375 CDB: opcode=0x2a 2a 00 00 00 00 00 00 00 08 00
[142829.135059] I/O error, dev sdc, sector 0 op 0x1:(WRITE) flags 0x18800 phys_seg 1 prio class 2
This is because the block layer integrity profile is currently only set up
the first time a disk is discovered.
To address this, remove the first_scan check when configuring protection
information during revalidate. Also unregister the block integrity profile
if DIX is not supported with a given protection type.
[mkp: commit description + printk dedup]
Link: https://lore.kernel.org/r/20230221081026.24736-1-yangxingui@huawei.com
Signed-off-by: Xingui Yang <yangxingui@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Core
----
- Add dedicated kmem_cache for typical/small skb->head, avoid having
to access struct page at kfree time, and improve memory use.
- Introduce sysctl to set default RPS configuration for new netdevs.
- Define Netlink protocol specification format which can be used
to describe messages used by each family and auto-generate parsers.
Add tools for generating kernel data structures and uAPI headers.
- Expose all net/core sysctls inside netns.
- Remove 4s sleep in netpoll if carrier is instantly detected on boot.
- Add configurable limit of MDB entries per port, and port-vlan.
- Continue populating drop reasons throughout the stack.
- Retire a handful of legacy Qdiscs and classifiers.
Protocols
---------
- Support IPv4 big TCP (TSO frames larger than 64kB).
- Add IP_LOCAL_PORT_RANGE socket option, to control local port range
on socket by socket basis.
- Track and report in procfs number of MPTCP sockets used.
- Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP
path manager.
- IPv6: don't check net.ipv6.route.max_size and rely on garbage
collection to free memory (similarly to IPv4).
- Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986).
- ICMP: add per-rate limit counters.
- Add support for user scanning requests in ieee802154.
- Remove static WEP support.
- Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate
reporting.
- WiFi 7 EHT channel puncturing support (client & AP).
BPF
---
- Add a rbtree data structure following the "next-gen data structure"
precedent set by recently added linked list, that is, by using
kfunc + kptr instead of adding a new BPF map type.
- Expose XDP hints via kfuncs with initial support for RX hash and
timestamp metadata.
- Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key
to better support decap on GRE tunnel devices not operating
in collect metadata.
- Improve x86 JIT's codegen for PROBE_MEM runtime error checks.
- Remove the need for trace_printk_lock for bpf_trace_printk
and bpf_trace_vprintk helpers.
- Extend libbpf's bpf_tracing.h support for tracing arguments of
kprobes/uprobes and syscall as a special case.
- Significantly reduce the search time for module symbols
by livepatch and BPF.
- Enable cpumasks to be used as kptrs, which is useful for tracing
programs tracking which tasks end up running on which CPUs in
different time intervals.
- Add support for BPF trampoline on s390x and riscv64.
- Add capability to export the XDP features supported by the NIC.
- Add __bpf_kfunc tag for marking kernel functions as kfuncs.
- Add cgroup.memory=nobpf kernel parameter option to disable BPF
memory accounting for container environments.
Netfilter
---------
- Remove the CLUSTERIP target. It has been marked as obsolete
for years, and we still have WARN splats wrt. races of
the out-of-band /proc interface installed by this target.
- Add 'destroy' commands to nf_tables. They are identical to
the existing 'delete' commands, but do not return an error if
the referenced object (set, chain, rule...) did not exist.
Driver API
----------
- Improve cpumask_local_spread() locality to help NICs set the right
IRQ affinity on AMD platforms.
- Separate C22 and C45 MDIO bus transactions more clearly.
- Introduce new DCB table to control DSCP rewrite on egress.
- Support configuration of Physical Layer Collision Avoidance (PLCA)
Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of
shared medium Ethernet.
- Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing
preemption of low priority frames by high priority frames.
- Add support for controlling MACSec offload using netlink SET.
- Rework devlink instance refcounts to allow registration and
de-registration under the instance lock. Split the code into multiple
files, drop some of the unnecessarily granular locks and factor out
common parts of netlink operation handling.
- Add TX frame aggregation parameters (for USB drivers).
- Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning
messages with notifications for debug.
- Allow offloading of UDP NEW connections via act_ct.
- Add support for per action HW stats in TC.
- Support hardware miss to TC action (continue processing in SW from
a specific point in the action chain).
- Warn if old Wireless Extension user space interface is used with
modern cfg80211/mac80211 drivers. Do not support Wireless Extensions
for Wi-Fi 7 devices at all. Everyone should switch to using nl80211
interface instead.
- Improve the CAN bit timing configuration. Use extack to return error
messages directly to user space, update the SJW handling, including
the definition of a new default value that will benefit CAN-FD
controllers, by increasing their oscillator tolerance.
New hardware / drivers
----------------------
- Ethernet:
- nVidia BlueField-3 support (control traffic driver)
- Ethernet support for imx93 SoCs
- Motorcomm yt8531 gigabit Ethernet PHY
- onsemi NCN26000 10BASE-T1S PHY (with support for PLCA)
- Microchip LAN8841 PHY (incl. cable diagnostics and PTP)
- Amlogic gxl MDIO mux
- WiFi:
- RealTek RTL8188EU (rtl8xxxu)
- Qualcomm Wi-Fi 7 devices (ath12k)
- CAN:
- Renesas R-Car V4H
Drivers
-------
- Bluetooth:
- Set Per Platform Antenna Gain (PPAG) for Intel controllers.
- Ethernet NICs:
- Intel (1G, igc):
- support TSN / Qbv / packet scheduling features of i226 model
- Intel (100G, ice):
- use GNSS subsystem instead of TTY
- multi-buffer XDP support
- extend support for GPIO pins to E823 devices
- nVidia/Mellanox:
- update the shared buffer configuration on PFC commands
- implement PTP adjphase function for HW offset control
- TC support for Geneve and GRE with VF tunnel offload
- more efficient crypto key management method
- multi-port eswitch support
- Netronome/Corigine:
- add DCB IEEE support
- support IPsec offloading for NFP3800
- Freescale/NXP (enetc):
- enetc: support XDP_REDIRECT for XDP non-linear buffers
- enetc: improve reconfig, avoid link flap and waiting for idle
- enetc: support MAC Merge layer
- Other NICs:
- sfc/ef100: add basic devlink support for ef100
- ionic: rx_push mode operation (writing descriptors via MMIO)
- bnxt: use the auxiliary bus abstraction for RDMA
- r8169: disable ASPM and reset bus in case of tx timeout
- cpsw: support QSGMII mode for J721e CPSW9G
- cpts: support pulse-per-second output
- ngbe: add an mdio bus driver
- usbnet: optimize usbnet_bh() by avoiding unnecessary queuing
- r8152: handle devices with FW with NCM support
- amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation
- virtio-net: support multi buffer XDP
- virtio/vsock: replace virtio_vsock_pkt with sk_buff
- tsnep: XDP support
- Ethernet high-speed switches:
- nVidia/Mellanox (mlxsw):
- add support for latency TLV (in FW control messages)
- Microchip (sparx5):
- separate explicit and implicit traffic forwarding rules, make
the implicit rules always active
- add support for egress DSCP rewrite
- IS0 VCAP support (Ingress Classification)
- IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS etc.)
- ES2 VCAP support (Egress Access Control)
- support for Per-Stream Filtering and Policing (802.1Q, 8.6.5.1)
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- add MAB (port auth) offload support
- enable PTP receive for mv88e6390
- NXP (ocelot):
- support MAC Merge layer
- support for the the vsc7512 internal copper phys
- Microchip:
- lan9303: convert to PHYLINK
- lan966x: support TC flower filter statistics
- lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x
- lan937x: support Credit Based Shaper configuration
- ksz9477: support Energy Efficient Ethernet
- other:
- qca8k: convert to regmap read/write API, use bulk operations
- rswitch: Improve TX timestamp accuracy
- Intel WiFi (iwlwifi):
- EHT (Wi-Fi 7) rate reporting
- STEP equalizer support: transfer some STEP (connection to radio
on platforms with integrated wifi) related parameters from the
BIOS to the firmware.
- Qualcomm 802.11ax WiFi (ath11k):
- IPQ5018 support
- Fine Timing Measurement (FTM) responder role support
- channel 177 support
- MediaTek WiFi (mt76):
- per-PHY LED support
- mt7996: EHT (Wi-Fi 7) support
- Wireless Ethernet Dispatch (WED) reset support
- switch to using page pool allocator
- RealTek WiFi (rtw89):
- support new version of Bluetooth co-existance
- Mobile:
- rmnet: support TX aggregation.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmP1VIYACgkQMUZtbf5S
IrvsChAApz0rNL/sPKxXTEfxZ1tN7D3sYxYKQPomxvl5BV+MvicrLddJy3KmzEFK
nnJNO3nuRNuH422JQ/ylZ4mGX1opa6+5QJb0UINImXUI7Fm8HHBIuPGkv7d5CheZ
7JexFqjPJXUy9nPyh1Rra+IA9AcRd2U7jeGEZR38wb99bHJQj5Bzdk20WArEB0el
n44aqg49LXH71bSeXRz77x5SjkwVtYiccQxLcnmTbjLU2xVraLvI2J+wAhHnVXWW
9lrU1+V4Ex2Xcd1xR0L0cHeK+meP1TrPRAeF+JDpVI3a/zJiE7cZjfHdG/jH5xWl
leZJqghVozrZQNtewWWO7XhUFhMDgFu3W/1vNLjSHPZEqaz1JpM67J1+ql6s63l4
LMWoXbcYZz+SL9ZRCoPkbGue/5fKSHv8/Jl9Sh58+eTS+c/zgN8uFGRNFXLX1+EP
n8uvt985PxMd6x1+dHumhOUzxnY4Sfi1vjitSunTsNFQ3Cmp4SO0IfBVJWfLUCuC
xz5hbJGJJbSpvUsO+HWyCg83E5OWghRE/Onpt2jsQSZCrO9HDg4FRTEf3WAMgaqc
edb5KfbRZPTJQM08gWdluXzSk1nw3FNP2tXW4XlgUrEbjb+fOk0V9dQg2gyYTxQ1
Nhvn8ZQPi6/GMMELHAIPGmmW1allyOGiAzGlQsv8EmL+OFM6WDI=
=xXhC
-----END PGP SIGNATURE-----
Merge tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- Add dedicated kmem_cache for typical/small skb->head, avoid having
to access struct page at kfree time, and improve memory use.
- Introduce sysctl to set default RPS configuration for new netdevs.
- Define Netlink protocol specification format which can be used to
describe messages used by each family and auto-generate parsers.
Add tools for generating kernel data structures and uAPI headers.
- Expose all net/core sysctls inside netns.
- Remove 4s sleep in netpoll if carrier is instantly detected on
boot.
- Add configurable limit of MDB entries per port, and port-vlan.
- Continue populating drop reasons throughout the stack.
- Retire a handful of legacy Qdiscs and classifiers.
Protocols:
- Support IPv4 big TCP (TSO frames larger than 64kB).
- Add IP_LOCAL_PORT_RANGE socket option, to control local port range
on socket by socket basis.
- Track and report in procfs number of MPTCP sockets used.
- Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP path
manager.
- IPv6: don't check net.ipv6.route.max_size and rely on garbage
collection to free memory (similarly to IPv4).
- Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986).
- ICMP: add per-rate limit counters.
- Add support for user scanning requests in ieee802154.
- Remove static WEP support.
- Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate
reporting.
- WiFi 7 EHT channel puncturing support (client & AP).
BPF:
- Add a rbtree data structure following the "next-gen data structure"
precedent set by recently added linked list, that is, by using
kfunc + kptr instead of adding a new BPF map type.
- Expose XDP hints via kfuncs with initial support for RX hash and
timestamp metadata.
- Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to
better support decap on GRE tunnel devices not operating in collect
metadata.
- Improve x86 JIT's codegen for PROBE_MEM runtime error checks.
- Remove the need for trace_printk_lock for bpf_trace_printk and
bpf_trace_vprintk helpers.
- Extend libbpf's bpf_tracing.h support for tracing arguments of
kprobes/uprobes and syscall as a special case.
- Significantly reduce the search time for module symbols by
livepatch and BPF.
- Enable cpumasks to be used as kptrs, which is useful for tracing
programs tracking which tasks end up running on which CPUs in
different time intervals.
- Add support for BPF trampoline on s390x and riscv64.
- Add capability to export the XDP features supported by the NIC.
- Add __bpf_kfunc tag for marking kernel functions as kfuncs.
- Add cgroup.memory=nobpf kernel parameter option to disable BPF
memory accounting for container environments.
Netfilter:
- Remove the CLUSTERIP target. It has been marked as obsolete for
years, and we still have WARN splats wrt races of the out-of-band
/proc interface installed by this target.
- Add 'destroy' commands to nf_tables. They are identical to the
existing 'delete' commands, but do not return an error if the
referenced object (set, chain, rule...) did not exist.
Driver API:
- Improve cpumask_local_spread() locality to help NICs set the right
IRQ affinity on AMD platforms.
- Separate C22 and C45 MDIO bus transactions more clearly.
- Introduce new DCB table to control DSCP rewrite on egress.
- Support configuration of Physical Layer Collision Avoidance (PLCA)
Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of
shared medium Ethernet.
- Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing
preemption of low priority frames by high priority frames.
- Add support for controlling MACSec offload using netlink SET.
- Rework devlink instance refcounts to allow registration and
de-registration under the instance lock. Split the code into
multiple files, drop some of the unnecessarily granular locks and
factor out common parts of netlink operation handling.
- Add TX frame aggregation parameters (for USB drivers).
- Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning
messages with notifications for debug.
- Allow offloading of UDP NEW connections via act_ct.
- Add support for per action HW stats in TC.
- Support hardware miss to TC action (continue processing in SW from
a specific point in the action chain).
- Warn if old Wireless Extension user space interface is used with
modern cfg80211/mac80211 drivers. Do not support Wireless
Extensions for Wi-Fi 7 devices at all. Everyone should switch to
using nl80211 interface instead.
- Improve the CAN bit timing configuration. Use extack to return
error messages directly to user space, update the SJW handling,
including the definition of a new default value that will benefit
CAN-FD controllers, by increasing their oscillator tolerance.
New hardware / drivers:
- Ethernet:
- nVidia BlueField-3 support (control traffic driver)
- Ethernet support for imx93 SoCs
- Motorcomm yt8531 gigabit Ethernet PHY
- onsemi NCN26000 10BASE-T1S PHY (with support for PLCA)
- Microchip LAN8841 PHY (incl. cable diagnostics and PTP)
- Amlogic gxl MDIO mux
- WiFi:
- RealTek RTL8188EU (rtl8xxxu)
- Qualcomm Wi-Fi 7 devices (ath12k)
- CAN:
- Renesas R-Car V4H
Drivers:
- Bluetooth:
- Set Per Platform Antenna Gain (PPAG) for Intel controllers.
- Ethernet NICs:
- Intel (1G, igc):
- support TSN / Qbv / packet scheduling features of i226 model
- Intel (100G, ice):
- use GNSS subsystem instead of TTY
- multi-buffer XDP support
- extend support for GPIO pins to E823 devices
- nVidia/Mellanox:
- update the shared buffer configuration on PFC commands
- implement PTP adjphase function for HW offset control
- TC support for Geneve and GRE with VF tunnel offload
- more efficient crypto key management method
- multi-port eswitch support
- Netronome/Corigine:
- add DCB IEEE support
- support IPsec offloading for NFP3800
- Freescale/NXP (enetc):
- support XDP_REDIRECT for XDP non-linear buffers
- improve reconfig, avoid link flap and waiting for idle
- support MAC Merge layer
- Other NICs:
- sfc/ef100: add basic devlink support for ef100
- ionic: rx_push mode operation (writing descriptors via MMIO)
- bnxt: use the auxiliary bus abstraction for RDMA
- r8169: disable ASPM and reset bus in case of tx timeout
- cpsw: support QSGMII mode for J721e CPSW9G
- cpts: support pulse-per-second output
- ngbe: add an mdio bus driver
- usbnet: optimize usbnet_bh() by avoiding unnecessary queuing
- r8152: handle devices with FW with NCM support
- amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation
- virtio-net: support multi buffer XDP
- virtio/vsock: replace virtio_vsock_pkt with sk_buff
- tsnep: XDP support
- Ethernet high-speed switches:
- nVidia/Mellanox (mlxsw):
- add support for latency TLV (in FW control messages)
- Microchip (sparx5):
- separate explicit and implicit traffic forwarding rules, make
the implicit rules always active
- add support for egress DSCP rewrite
- IS0 VCAP support (Ingress Classification)
- IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS
etc.)
- ES2 VCAP support (Egress Access Control)
- support for Per-Stream Filtering and Policing (802.1Q,
8.6.5.1)
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- add MAB (port auth) offload support
- enable PTP receive for mv88e6390
- NXP (ocelot):
- support MAC Merge layer
- support for the the vsc7512 internal copper phys
- Microchip:
- lan9303: convert to PHYLINK
- lan966x: support TC flower filter statistics
- lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x
- lan937x: support Credit Based Shaper configuration
- ksz9477: support Energy Efficient Ethernet
- other:
- qca8k: convert to regmap read/write API, use bulk operations
- rswitch: Improve TX timestamp accuracy
- Intel WiFi (iwlwifi):
- EHT (Wi-Fi 7) rate reporting
- STEP equalizer support: transfer some STEP (connection to radio
on platforms with integrated wifi) related parameters from the
BIOS to the firmware.
- Qualcomm 802.11ax WiFi (ath11k):
- IPQ5018 support
- Fine Timing Measurement (FTM) responder role support
- channel 177 support
- MediaTek WiFi (mt76):
- per-PHY LED support
- mt7996: EHT (Wi-Fi 7) support
- Wireless Ethernet Dispatch (WED) reset support
- switch to using page pool allocator
- RealTek WiFi (rtw89):
- support new version of Bluetooth co-existance
- Mobile:
- rmnet: support TX aggregation"
* tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1872 commits)
page_pool: add a comment explaining the fragment counter usage
net: ethtool: fix __ethtool_dev_mm_supported() implementation
ethtool: pse-pd: Fix double word in comments
xsk: add linux/vmalloc.h to xsk.c
sefltests: netdevsim: wait for devlink instance after netns removal
selftest: fib_tests: Always cleanup before exit
net/mlx5e: Align IPsec ASO result memory to be as required by hardware
net/mlx5e: TC, Set CT miss to the specific ct action instance
net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REG
net/mlx5: Refactor tc miss handling to a single function
net/mlx5: Kconfig: Make tc offload depend on tc skb extension
net/sched: flower: Support hardware miss to tc action
net/sched: flower: Move filter handle initialization earlier
net/sched: cls_api: Support hardware miss to tc action
net/sched: Rename user cookie and act cookie
sfc: fix builds without CONFIG_RTC_LIB
sfc: clean up some inconsistent indentings
net/mlx4_en: Introduce flexible array to silence overflow warning
net: lan966x: Fix possible deadlock inside PTP
net/ulp: Remove redundant ->clone() test in inet_clone_ulp().
...
-----BEGIN PGP SIGNATURE-----
iQFHBAABCAAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmPzgDgTHHdlaS5saXVA
a2VybmVsLm9yZwAKCRB2FHBfkEGgXrc7CACfG4SSd8KkWU/y8Q66Irxdau0a3ETD
KL4UNRKGIyKujufgFsme79O6xVSSsCNSay449wk20hqn8lnwbSRi9pUwmLn29hfd
CMFleWIqgwGFfC1do5DRF1vrt1siuG/jVE07mWsEwuY2iHx/es+H7LiQKidhkndZ
DhXRqoi7VYiJv5fRSumpkUJrMZiI96o9Mk09HUksdMwCn3+7RQEqHnlTH5KOozKF
iMroDB72iNw5Na/USZwWL2EDRptENam3lFkPBeDPqNw0SbG4g65JGPR9DSa0Lkbq
AGCJQkdU33mcYQG5MY7R4K1evufpOl/apqLW7h92j45Znr9ok6Vr2c1R
=J1VT
-----END PGP SIGNATURE-----
Merge tag 'hyperv-next-signed-20230220' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv updates from Wei Liu:
- allow Linux to run as the nested root partition for Microsoft
Hypervisor (Jinank Jain and Nuno Das Neves)
- clean up the return type of callback functions (Dawei Li)
* tag 'hyperv-next-signed-20230220' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
x86/hyperv: Fix hv_get/set_register for nested bringup
Drivers: hv: Make remove callback of hyperv driver void returned
Drivers: hv: Enable vmbus driver for nested root partition
x86/hyperv: Add an interface to do nested hypercalls
Drivers: hv: Setup synic registers in case of nested root partition
x86/hyperv: Add support for detecting nested hypervisor
Since commit ce70fd9a55 ("scsi: core: Remove the cmd field from struct
scsi_request") sd_cdb_cache is unused. Remove it.
Link: https://lore.kernel.org/r/20230221115340.21201-1-changfengnan@bytedance.com
Signed-off-by: Fengnan Chang <changfengnan@bytedance.com>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The LLDD and the stack currently process FPINs received from the fabric,
but the stack is not aware of any action taken by the driver to alleviate
congestion. The current interface between the driver and the SCSI stack is
limited to passing the notification mainly for statistics and heuristics.
The reaction to an FPIN could be handled either by the driver or by the
stack (marginal path and failover). Amend the interface to indicate if
action on an FPIN has already been reacted to by the LLDDs or not. Add an
additional flag to fc_host_fpin_rcv() to indicate if the FPIN has been
acknowledged/reacted to by the driver.
Also added a new event code FCH_EVT_LINK_FPIN_ACK to notify to the user
that the event has been acknowledged/reacted by the LLDD driver
Link: https://lore.kernel.org/r/20230209034326.882514-1-muneendra.kumar@broadcom.com
Co-developed-by: Anil Gurumurthy <agurumurthy@marvell.com>
Signed-off-by: Anil Gurumurthy <agurumurthy@marvell.com>
Co-developed-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Nilesh Javali <njavali@marvell.com>
Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
Reviewed-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
An enclosure with no components can't usefully be operated by the driver
(since effectively it has nothing to manage), so report the problem and
don't attach. Not attaching also fixes an oops which could occur if the
driver tries to manage a zero component enclosure.
[mkp: Switched to KERN_WARNING since this scenario is common]
Link: https://lore.kernel.org/r/c5deac044ac409e32d9ad9968ce0dcbc996bfc7a.camel@linux.ibm.com
Cc: stable@vger.kernel.org
Reported-by: Ding Hui <dinghui@sangfor.com.cn>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A fix for:
BUG: KASAN: slab-out-of-bounds in ses_intf_remove+0x23f/0x270 [ses]
Read of size 8 at addr ffff88a10d32e5d8 by task rmmod/12013
When edev->components is zero, accessing edev->component[0] members is
wrong.
Link: https://lore.kernel.org/r/20230202162451.15346-5-thenzl@redhat.com
Cc: stable@vger.kernel.org
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A fix for:
BUG: KASAN: slab-out-of-bounds in ses_enclosure_data_process+0x949/0xe30 [ses]
Read of size 1 at addr ffff88a1b043a451 by task systemd-udevd/3271
Checking after (and before in next loop) addl_desc_ptr[1] is sufficient, we
expect the size to be sanitized before first access to addl_desc_ptr[1].
Make sure we don't walk beyond end of page.
Link: https://lore.kernel.org/r/20230202162451.15346-2-thenzl@redhat.com
Cc: stable@vger.kernel.org
Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is no check about the return value of kmalloc in
virtscsi_rescan_hotunplug. Add the check to avoid use
of null pointer 'inq_result' in case of the failure
of kmalloc.
Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
Message-Id: <20230202064124.22277-1-zyytlz.wz@163.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmPvfncQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpob2EADXJxcr2jjYHm/7cjKkyuVX8fr80dNdMeuY
JFdsjG1k6Uj73BVhQQWYTcs/PsrWBHWRsv6uz4WgOELj55eXmf5Q0kJszyUeJW33
/DjqLvtoppVcYf80xE13wKvCfn73BjwQo6xkGM0qAYn15eaXiD/Ax3xC6eJlsBeK
PEw7EJyhacbSxZa/1D2B6+mqII1jUQWProTCc3udZ4JHi3WvdWa3Rda0qCqHl4a1
+K2aP2YTFIRPxBzfMNa/CafWVIFubTdht+4Ds6R60RImzB9e0VUBfcsiUyW5Zg7L
Fwv7ptXuWrALwVNdW56Oz1QikBxn2pdRR2HMLwKJW1MD8kP9r8LMm2jV5Rhiwe0B
OQsGRYkOzBvw+bxeP5fvk0iPGVMz6ActH4gkraA5QdLqayDaFYOadlhqz0uRo5SH
Fb42Vl658K/MHDSIk8U58TNkmrsIJsBGohXI9DOGINPPvv3XOPi4Q1HmXkGRmii0
y+lNU/QEGh7xXXew29SPP76uQpQaYfC7NxXCMw/OpOMwehzjsjshmM2lpxi8zsgt
PJUmfHv5qxCplNmTJXmUpmX7sS7550HUdu9FJb13DM+gzKg8bk9jWVuLrzqrVlG5
1hKWEl1+heg1heRfaIuJVLbPI0au6Sb4uqhih/PHyrP9TWIoAruDbDJM65GKTxyE
2uEgcHzHQw==
=poRc
-----END PGP SIGNATURE-----
Merge tag 'for-6.3/block-2023-02-16' of git://git.kernel.dk/linux
Pull block updates from Jens Axboe:
- NVMe updates via Christoph:
- Small improvements to the logging functionality (Amit Engel)
- Authentication cleanups (Hannes Reinecke)
- Cleanup and optimize the DMA mapping cod in the PCIe driver
(Keith Busch)
- Work around the command effects for Format NVM (Keith Busch)
- Misc cleanups (Keith Busch, Christoph Hellwig)
- Fix and cleanup freeing single sgl (Keith Busch)
- MD updates via Song:
- Fix a rare crash during the takeover process
- Don't update recovery_cp when curr_resync is ACTIVE
- Free writes_pending in md_stop
- Change active_io to percpu
- Updates to drbd, inching us closer to unifying the out-of-tree driver
with the in-tree one (Andreas, Christoph, Lars, Robert)
- BFQ update adding support for multi-actuator drives (Paolo, Federico,
Davide)
- Make brd compliant with REQ_NOWAIT (me)
- Fix for IOPOLL and queue entering, fixing stalled IO waiting on
timeouts (me)
- Fix for REQ_NOWAIT with multiple bios (me)
- Fix memory leak in blktrace cleanup (Greg)
- Clean up sbitmap and fix a potential hang (Kemeng)
- Clean up some bits in BFQ, and fix a bug in the request injection
(Kemeng)
- Clean up the request allocation and issue code, and fix some bugs
related to that (Kemeng)
- ublk updates and fixes:
- Add support for unprivileged ublk (Ming)
- Improve device deletion handling (Ming)
- Misc (Liu, Ziyang)
- s390 dasd fixes (Alexander, Qiheng)
- Improve utility of request caching and fixes (Anuj, Xiao)
- zoned cleanups (Pankaj)
- More constification for kobjs (Thomas)
- blk-iocost cleanups (Yu)
- Remove bio splitting from drivers that don't need it (Christoph)
- Switch blk-cgroups to use struct gendisk. Some of this is now
incomplete as select late reverts were done. (Christoph)
- Add bvec initialization helpers, and convert callers to use that
rather than open-coding it (Christoph)
- Misc fixes and cleanups (Jinke, Keith, Arnd, Bart, Li, Martin,
Matthew, Ulf, Zhong)
* tag 'for-6.3/block-2023-02-16' of git://git.kernel.dk/linux: (169 commits)
brd: use radix_tree_maybe_preload instead of radix_tree_preload
block: use proper return value from bio_failfast()
block: bio-integrity: Copy flags when bio_integrity_payload is cloned
block: Fix io statistics for cgroup in throttle path
brd: mark as nowait compatible
brd: check for REQ_NOWAIT and set correct page allocation mask
brd: return 0/-error from brd_insert_page()
block: sync mixed merged request's failfast with 1st bio's
Revert "blk-cgroup: pin the gendisk in struct blkcg_gq"
Revert "blk-cgroup: pass a gendisk to blkg_lookup"
Revert "blk-cgroup: delay blk-cgroup initialization until add_disk"
Revert "blk-cgroup: delay calling blkcg_exit_disk until disk_release"
Revert "blk-cgroup: move the cgroup information to struct gendisk"
nvme-pci: remove iod use_sgls
nvme-pci: fix freeing single sgl
block: ublk: check IO buffer based on flag need_get_data
s390/dasd: Fix potential memleak in dasd_eckd_init()
s390/dasd: sort out physical vs virtual pointers usage
block: Remove the ALLOC_CACHE_SLACK constant
block: make kobj_type structures constant
...
Replace direct modifications to vma->vm_flags with calls to modifier
functions to be able to track flag changes and to keep vma locking
correctness.
[akpm@linux-foundation.org: fix drivers/misc/open-dice.c, per Hyeonggon Yoo]
Link: https://lkml.kernel.org/r/20230126193752.297968-5-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Acked-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@Oracle.com>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arjun Roy <arjunroy@google.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Laurent Dufour <ldufour@linux.ibm.com>
Cc: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Minchan Kim <minchan@google.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Oskolkov <posk@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Punit Agrawal <punit.agrawal@bytedance.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add check for dma_map_single() and return error if it fails in order to
avoid invalid DMA address.
Fixes: 2908d778ab ("[SCSI] aic94xx: new driver")
Link: https://lore.kernel.org/r/20230128110832.6792-1-jiasheng@iscas.ac.cn
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Reviewed-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
One-element arrays are deprecated, and we are replacing them with flexible
array members instead. So, replace one-element array with flexible-array
member in struct report_log_lun_list.
This helps with the ongoing efforts to tighten the FORTIFY_SOURCE routines
on memcpy().
Link: https://lore.kernel.org/r/Y+LJz/r6+UeLqnV3@work
Link: https://github.com/KSPP/linux/issues/79
Link: https://github.com/KSPP/linux/issues/204
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Acked-by: Don Brace <don.brace@microchip.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Variable wwn is not used. Delete it.
drivers/scsi/qla2xxx/qla_init.c:1657:6: warning: variable 'wwn' set but not used.
Link: https://lore.kernel.org/r/20230207052234.24535-1-jiapeng.chong@linux.alibaba.com
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When calling debugfs_lookup() the result must have dput() called on it,
otherwise the memory will leak over time. To make things simpler, just
call debugfs_lookup_and_remove() instead which handles all of the logic at
once.
Link: https://lore.kernel.org/r/20230202141009.2290380-1-gregkh@linuxfoundation.org
Cc: Karan Tilak Kumar <kartilak@cisco.com>
Cc: Sesidhar Baddela <sebaddel@cisco.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The aac_priv() helper assumes that the private cmd area immediately follows
struct scsi_cmnd. Allocate this space as part of scsicmd, else there is a
risk of heap overflow. Seen with GCC 13:
../drivers/scsi/aacraid/aachba.c: In function 'aac_probe_container':
../drivers/scsi/aacraid/aachba.c:841:26: warning: array subscript 16 is outside array bounds of 'void[392]' [-Warray-bounds=]
841 | status = cmd_priv->status;
| ^~
In file included from ../include/linux/resource_ext.h:11,
from ../include/linux/pci.h:40,
from ../drivers/scsi/aacraid/aachba.c:22:
In function 'kmalloc',
inlined from 'kzalloc' at ../include/linux/slab.h:720:9,
inlined from 'aac_probe_container' at ../drivers/scsi/aacraid/aachba.c:821:30:
../include/linux/slab.h:580:24: note: at offset 392 into object of size 392 allocated by 'kmalloc_trace'
580 | return kmalloc_trace(
| ^~~~~~~~~~~~~~
581 | kmalloc_caches[kmalloc_type(flags)][index],
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
582 | flags, size);
| ~~~~~~~~~~~~
Fixes: 76a3451b64 ("scsi: aacraid: Move the SCSI pointer to private command data")
Link: https://lore.kernel.org/r/20230128000409.never.976-kees@kernel.org
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Himanshu Madhani <himanshu.madhani@oracle.com>
Cc: Adaptec OEM Raid Solutions <aacraid@microsemi.com>
Cc: "James E.J. Bottomley" <jejb@linux.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Vegard Nossum <vegard.nossum@oracle.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Add a helper for setting up the special_bvec instead of open coding it
in three place, and use the new bvec_set_page helper to initialize
special_vec.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20230203150634.3199647-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Two core fixes. One simply moves an annotation from put to release to
avoid the warning triggering needlessly in alua, but to keep it in
case release is ever called from that path (which we don't think will
happen). The other reverts a change to the PQ=1 target scanning
behaviour that's under intense discussion at the moment.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCY9lCeiYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishX7uAQDhcLu9
0YYWJfkEj41fJqcLOsai0UvSoY0oJQVBSvvyOwEA7KjLz4XNWBNE0fw3TkijOio8
YdIVb2K2sKlDo8oKM0U=
=fNIn
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two core fixes.
One simply moves an annotation from put to release to avoid the
warning triggering needlessly in alua, but to keep it in case release
is ever called from that path (which we don't think will happen).
The other reverts a change to the PQ=1 target scanning behaviour
that's under intense discussion at the moment"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: Revert "scsi: core: map PQ=1, PDT=other values to SCSI_SCAN_TARGET_PRESENT"
scsi: core: Fix the scsi_device_put() might_sleep annotation
The uevent() callback in struct bus_type should not be modifying the
device that is passed into it, so mark it as a const * and propagate the
function signature changes out into all relevant subsystems that use
this callback.
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20230111113018.459199-16-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Number of files depend on linux/sched/clock.h getting included
by linux/skbuff.h which soon will no longer be the case.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>