docs: block: remove queue-sysfs.rst
This has been replaced by Documentation/ABI/stable/sysfs-block, which is the
correct place for sysfs documentation.

Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20211209003833.6396-8-ebiggers@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
parent 8bc2f7c670
commit 208e4f9c00
@@ -20,7 +20,6 @@ Block
    kyber-iosched
    null_blk
    pr
-   queue-sysfs
    request
    stat
    switching-sched
@@ -1,321 +0,0 @@

=================
Queue sysfs files
=================

This text file details the queue files that are located in the sysfs tree
for each block device. Note that stacked devices typically do not export
any settings, since their queue merely functions as a remapping target.
These files are the ones found in the /sys/block/xxx/queue/ directory.

Files denoted with a RO postfix are read-only and the RW postfix means
read-write.
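These attributes can be inspected with ordinary shell tools; writing the RW
files requires root privileges. A minimal sketch, assuming a hypothetical
device named sda::

    $ ls /sys/block/sda/queue/
    $ cat /sys/block/sda/queue/logical_block_size
    $ cat /sys/block/sda/queue/rotational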
add_random (RW)
---------------
This file allows turning off the disk entropy contribution. The default
value of this file is '1' (on).

chunk_sectors (RO)
------------------
This has a different meaning depending on the type of the block device.
For a RAID device (dm-raid), chunk_sectors indicates the size in 512B sectors
of the RAID volume stripe segment. For a zoned block device, either host-aware
or host-managed, chunk_sectors indicates the size in 512B sectors of the zones
of the device, with the possible exception of the last zone of the device,
which may be smaller.

dax (RO)
--------
This file indicates whether the device supports Direct Access (DAX),
used by CPU-addressable storage to bypass the pagecache. It shows '1'
if true, '0' if not.

discard_granularity (RO)
------------------------
This shows the size of the internal allocation unit of the device in bytes,
if reported by the device. A value of '0' means the device does not support
the discard functionality.

discard_max_hw_bytes (RO)
-------------------------
Devices that support discard functionality may have internal limits on
the number of bytes that can be trimmed or unmapped in a single operation.
The `discard_max_hw_bytes` parameter is set by the device driver to the
maximum number of bytes that can be discarded in a single operation.
Discard requests issued to the device must not exceed this limit.
A `discard_max_hw_bytes` value of 0 means that the device does not support
discard functionality.

discard_max_bytes (RW)
----------------------
While discard_max_hw_bytes is the hardware limit for the device, this
setting is the software limit. Some devices exhibit large latencies when
large discards are issued; setting this value lower will make Linux issue
smaller discards and potentially help reduce latencies induced by large
discard operations.
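For example, the hardware discard limit can be checked and the software limit
lowered (a sketch; sda and the 128 MiB value are only illustrative, the write
requires root, and the device is assumed to support discard)::

    $ cat /sys/block/sda/queue/discard_max_hw_bytes
    $ echo 134217728 > /sys/block/sda/queue/discard_max_bytes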
discard_zeroes_data (RO)
------------------------
Obsolete. Always zero.

fua (RO)
--------
Whether or not the block driver supports the FUA flag for write requests.
FUA stands for Force Unit Access. If the FUA flag is set, that means that
write requests must bypass the volatile cache of the storage device.

hw_sector_size (RO)
-------------------
This is the hardware sector size of the device, in bytes.

io_poll (RW)
------------
When read, this file shows whether polling is enabled (1) or disabled
(0). Writing '0' to this file will disable polling for this device.
Writing any non-zero value will enable this feature.

io_poll_delay (RW)
------------------
If polling is enabled, this controls what kind of polling will be
performed. It defaults to -1, which is classic polling. In this mode,
the CPU will repeatedly ask for completions without giving up any time.
If set to 0, a hybrid polling mode is used, where the kernel will attempt
to make an educated guess at when the IO will complete. Based on this
guess, the kernel will put the process issuing IO to sleep for an amount
of time, before entering a classic poll loop. This mode might be a
little slower than pure classic polling, but it will be more efficient.
If set to a value larger than 0, the kernel will put the process issuing
IO to sleep for this number of microseconds before entering classic
polling.
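A sketch of enabling polling and selecting hybrid polling (sda is hypothetical,
the writes require root, and whether polling can be enabled at all depends on
the device and driver)::

    $ cat /sys/block/sda/queue/io_poll
    $ echo 1 > /sys/block/sda/queue/io_poll
    $ echo 0 > /sys/block/sda/queue/io_poll_delay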
io_timeout (RW)
---------------
io_timeout is the request timeout in milliseconds. If a request does not
complete in this time, then the block driver timeout handler is invoked.
That timeout handler can decide to retry the request, to fail it, or to start
a device recovery strategy.

iostats (RW)
------------
This file is used to control (on/off) the iostats accounting of the
disk.

logical_block_size (RO)
-----------------------
This is the logical block size of the device, in bytes.

max_discard_segments (RO)
-------------------------
The maximum number of DMA scatter/gather entries in a discard request.

max_hw_sectors_kb (RO)
----------------------
This is the maximum number of kilobytes supported in a single data transfer.

max_integrity_segments (RO)
---------------------------
Maximum number of elements in a DMA scatter/gather list with integrity
data that will be submitted by the block layer core to the associated
block driver.

max_active_zones (RO)
---------------------
For zoned block devices (zoned attribute indicating "host-managed" or
"host-aware"), the sum of zones belonging to any of the zone states:
EXPLICIT OPEN, IMPLICIT OPEN or CLOSED, is limited by this value.
If this value is 0, there is no limit.

If the host attempts to exceed this limit, the driver should report this error
with BLK_STS_ZONE_ACTIVE_RESOURCE, which user space may see as the EOVERFLOW
errno.

max_open_zones (RO)
-------------------
For zoned block devices (zoned attribute indicating "host-managed" or
"host-aware"), the sum of zones belonging to any of the zone states:
EXPLICIT OPEN or IMPLICIT OPEN, is limited by this value.
If this value is 0, there is no limit.

If the host attempts to exceed this limit, the driver should report this error
with BLK_STS_ZONE_OPEN_RESOURCE, which user space may see as the ETOOMANYREFS
errno.

max_sectors_kb (RW)
-------------------
This is the maximum number of kilobytes that the block layer will allow
for a filesystem request. It must be smaller than or equal to the maximum
size allowed by the hardware.
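For example, the software limit can be lowered below the hardware limit
(sda and the 128 KB value are hypothetical; the write requires root and must
not exceed max_hw_sectors_kb)::

    $ cat /sys/block/sda/queue/max_hw_sectors_kb
    $ echo 128 > /sys/block/sda/queue/max_sectors_kb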
max_segments (RO)
-----------------
Maximum number of elements in a DMA scatter/gather list that is submitted
to the associated block driver.

max_segment_size (RO)
---------------------
Maximum size in bytes of a single element in a DMA scatter/gather list.

minimum_io_size (RO)
--------------------
This is the smallest preferred IO size reported by the device.

nomerges (RW)
-------------
This enables the user to disable the lookup logic involved with IO merging
of requests in the block layer. By default (0) all merges are enabled.
When set to 1, only simple one-hit merges will be tried. When set to 2,
no merge algorithms will be tried (including one-hit or more complex
tree/hash lookups).
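For example, to disable all merge lookups and verify the setting (sda is
hypothetical; the write requires root)::

    $ echo 2 > /sys/block/sda/queue/nomerges
    $ cat /sys/block/sda/queue/nomerges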
nr_requests (RW)
----------------
This controls how many requests may be allocated in the block layer for
read or write requests. Note that the total allocated number may be twice
this amount, since it applies only to reads or writes (not the accumulated
sum).

To avoid priority inversion through request starvation, a request
queue maintains a separate request pool per cgroup when
CONFIG_BLK_CGROUP is enabled, and this parameter applies to each such
per-block-cgroup request pool. IOW, if there are N block cgroups,
each request queue may have up to N request pools, each independently
regulated by nr_requests.
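A quick sketch of reading and raising the per-queue request count (sda and the
value 256 are hypothetical; the accepted range depends on the driver and the
active IO scheduler)::

    $ cat /sys/block/sda/queue/nr_requests
    $ echo 256 > /sys/block/sda/queue/nr_requests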
nr_zones (RO)
-------------
For zoned block devices (zoned attribute indicating "host-managed" or
"host-aware"), this indicates the total number of zones of the device.
This is always 0 for regular block devices.

optimal_io_size (RO)
--------------------
This is the optimal IO size reported by the device.

physical_block_size (RO)
------------------------
This is the physical block size of the device, in bytes.

read_ahead_kb (RW)
------------------
Maximum number of kilobytes to read-ahead for filesystems on this block
device.
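For example, raising the readahead size to 512 KB (sda and the value are only
illustrative; the write requires root)::

    $ echo 512 > /sys/block/sda/queue/read_ahead_kb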
rotational (RW)
---------------
This file is used to state whether the device is of rotational type or
non-rotational type.

rq_affinity (RW)
----------------
If this option is '1', the block layer will migrate request completions to the
cpu "group" that originally submitted the request. For some workloads this
provides a significant reduction in CPU cycles due to caching effects.

For storage configurations that need to maximize the distribution of completion
processing, setting this option to '2' forces the completion to run on the
requesting cpu (bypassing the "group" aggregation logic).
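For example, forcing completions to run on the submitting cpu (sda is
hypothetical; the write requires root)::

    $ echo 2 > /sys/block/sda/queue/rq_affinity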
scheduler (RW)
--------------
When read, this file will display the current and available IO schedulers
for this block device. The currently active IO scheduler will be enclosed
in [] brackets. Writing an IO scheduler name to this file will switch
control of this block device to that new IO scheduler. Note that writing
an IO scheduler name to this file will attempt to load that IO scheduler
module, if it isn't already present in the system.
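For instance (a sketch; sda is hypothetical, the available scheduler list
varies with the kernel configuration, and the write requires root)::

    $ cat /sys/block/sda/queue/scheduler
    [mq-deadline] kyber bfq none
    $ echo none > /sys/block/sda/queue/scheduler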
write_cache (RW)
----------------
When read, this file will display whether the device has write back
caching enabled or not. It will return "write back" for the former
case, and "write through" for the latter. Writing to this file can
change the kernel's view of the device, but it doesn't alter the
device state. This means that it might not be safe to toggle the
setting from "write back" to "write through", since that will also
eliminate cache flushes issued by the kernel.

write_same_max_bytes (RO)
-------------------------
This is the number of bytes the device can write in a single write-same
command. A value of '0' means write-same is not supported by this
device.

wbt_lat_usec (RW)
-----------------
If the device is registered for writeback throttling, then this file shows
the target minimum read latency. If this latency is exceeded in a given
window of time (see wb_window_usec), then the writeback throttling will start
scaling back writes. Writing a value of '0' to this file disables the
feature. Writing a value of '-1' to this file resets the value to the
default setting.
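For example, setting a 75 ms target and then restoring the default (sda and
the value are hypothetical; the writes require root)::

    $ echo 75000 > /sys/block/sda/queue/wbt_lat_usec
    $ echo -1 > /sys/block/sda/queue/wbt_lat_usec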
throttle_sample_time (RW)
-------------------------
This is the time window over which blk-throttle samples data, in milliseconds.
blk-throttle makes decisions based on these samplings. A lower time means
cgroups have smoother throughput, but higher CPU overhead. This exists only
when CONFIG_BLK_DEV_THROTTLING_LOW is enabled.

write_zeroes_max_bytes (RO)
---------------------------
For block drivers that support REQ_OP_WRITE_ZEROES, the maximum number of
bytes that can be zeroed at once. The value 0 means that REQ_OP_WRITE_ZEROES
is not supported.

zone_append_max_bytes (RO)
--------------------------
This is the maximum number of bytes that can be written to a sequential
zone of a zoned block device using a zone append write operation
(REQ_OP_ZONE_APPEND). This value is always 0 for regular block devices.

zoned (RO)
----------
This indicates whether the device is a zoned block device and, if it is, the
zone model of the device. The possible values indicated by zoned are
"none" for regular block devices and "host-aware" or "host-managed" for zoned
block devices. The characteristics of host-aware and host-managed zoned block
devices are described in the ZBC (Zoned Block Commands) and ZAC
(Zoned Device ATA Command Set) standards. These standards also define the
"drive-managed" zone model. However, since drive-managed zoned block devices
do not support zone commands, they will be treated as regular block devices
and zoned will report "none".
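The zone-related attributes can be read together to characterize a zoned
device, for example (a sketch; sdb is assumed to be a hypothetical
host-managed device)::

    $ cat /sys/block/sdb/queue/zoned
    $ cat /sys/block/sdb/queue/nr_zones
    $ cat /sys/block/sdb/queue/chunk_sectors
    $ cat /sys/block/sdb/queue/max_open_zones
    $ cat /sys/block/sdb/queue/max_active_zones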
zone_write_granularity (RO)
---------------------------
This indicates the alignment constraint, in bytes, for write operations in
sequential zones of zoned block devices (devices with a zoned attribute
that reports "host-managed" or "host-aware"). This value is always 0 for
regular block devices.

independent_access_ranges (RO)
------------------------------

The presence of this sub-directory of the /sys/block/xxx/queue/ directory
indicates that the device is capable of executing requests targeting
different sector ranges in parallel. For instance, single LUN multi-actuator
hard-disks will have an independent_access_ranges directory if the device
correctly advertises the sector ranges of its actuators.

The independent_access_ranges directory contains one directory per access
range, with each range described using the sector (RO) attribute file to
indicate the first sector of the range and the nr_sectors (RO) attribute file
to indicate the total number of sectors in the range starting from the first
sector of the range. For example, a dual-actuator hard-disk will have the
following independent_access_ranges entries::

    $ tree /sys/block/<device>/queue/independent_access_ranges/
    /sys/block/<device>/queue/independent_access_ranges/
    |-- 0
    |   |-- nr_sectors
    |   `-- sector
    `-- 1
        |-- nr_sectors
        `-- sector

The sector and nr_sectors attributes use 512B sector units, regardless of
the actual block size of the device. Independent access ranges do not
overlap and include all sectors within the device capacity. The access
ranges are numbered in increasing order of the range start sector,
that is, the sector attribute of range 0 always has the value 0.
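The start and size of each range can then be read directly, for instance
(using the same <device> placeholder as above)::

    $ cat /sys/block/<device>/queue/independent_access_ranges/0/sector
    $ cat /sys/block/<device>/queue/independent_access_ranges/0/nr_sectors
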
Jens Axboe <jens.axboe@oracle.com>, February 2009