linux/block
Paolo Valente 2341d662e9 block, bfq: tune service injection basing on request service times
The processes associated with a bfq_queue, say Q, may happen to
generate their cumulative I/O at a lower rate than the rate at which
the device could serve the same I/O. This is quite likely, e.g., if
only one process is associated with Q and the device is an SSD. As a
result, Q often becomes empty while in service. If BFQ is not
allowed to switch to another queue when Q becomes empty, then, during
the service of Q, there will be frequent "service holes", i.e., time
intervals during which Q gets empty and the device can only consume
the I/O already queued in its hardware queues. This easily causes
considerable losses of throughput.

To counter this problem, BFQ implements a request injection mechanism,
which tries to fill the above service holes with I/O requests taken
from other bfq_queues. The hard part in this mechanism is finding the
right amount of I/O to inject, so as to both boost throughput and not
break Q's bandwidth and latency guarantees. To this end, the current
version of this mechanism measures the bandwidth enjoyed by Q while it
is being served, and tries to inject the maximum possible amount of
extra service that does not cause Q's bandwidth to decrease too
much.

This solution has an important shortcoming. For bandwidth measurements
to be stable and reliable, Q must remain in service for a much longer
time than that needed to serve a single I/O request. Unfortunately,
this does not hold with many workloads. This commit addresses this
issue by changing the way the amount of injection allowed is
dynamically computed. It tunes injection as a function of the service
times of single I/O requests of Q, instead of Q's
bandwidth. Single-request service times are evidently meaningful even
if Q gets very few I/O requests completed while it is in service.
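
As a rough illustration of the idea described above, the following
userspace sketch adjusts an injection limit from single-request service
times. It is not the code added by this commit: the struct, the function
name, and the 1.5x threshold are all assumptions made for the example.
A sample taken with no injected I/O in flight refreshes the baseline;
a sample taken with injection is compared against that baseline, and
the limit is lowered or raised accordingly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-queue state; field names are illustrative only. */
struct inject_state {
	uint64_t base_serv_time_ns; /* service time seen without injection, 0 = no sample yet */
	unsigned int limit;         /* max number of injected requests allowed in flight */
	unsigned int max_limit;     /* upper bound for limit */
};

/*
 * Update the injection limit after one request of Q completes.
 * serv_time_ns: measured service time of that request.
 * injected:     number of injected requests in flight during the measurement.
 */
static void inject_update(struct inject_state *s, uint64_t serv_time_ns,
			  unsigned int injected)
{
	uint64_t threshold;

	assert(s != NULL);

	if (injected == 0) {
		/* Clean sample: keep the smallest service time as baseline. */
		if (s->base_serv_time_ns == 0 ||
		    serv_time_ns < s->base_serv_time_ns)
			s->base_serv_time_ns = serv_time_ns;
		/* With a baseline available, allow probing with one request. */
		if (s->limit == 0)
			s->limit = 1;
		return;
	}

	/* Assumed tolerance: 1.5 times the baseline service time. */
	threshold = s->base_serv_time_ns + s->base_serv_time_ns / 2;

	if (serv_time_ns >= threshold) {
		/* Injection slowed Q's request too much: back off. */
		if (s->limit > 0)
			s->limit--;
	} else if (s->limit < s->max_limit) {
		/* Q's request was barely affected: try injecting more. */
		s->limit++;
	}
}
```

Because each decision needs only one completed request, a scheme of
this kind can adapt even when Q stays in service very briefly, which is
the property the commit message relies on.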

As a testbed for this new solution, we measured the throughput reached
by BFQ for one of the nastiest workloads and configurations for this
scheduler: the workload generated by the dbench test (in the Phoronix
suite), with 6 clients, on a filesystem with journaling, and with the
journaling daemon enjoying a higher weight than normal processes.
With this commit, the throughput grows from ~100 MB/s to ~150 MB/s on
a PLEXTOR PX-256M5.

Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Francesco Pollicino <fra.fra.800@gmail.com>
Signed-off-by: Paolo Valente <paolo.valente@linaro.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-01 08:15:39 -06:00
partitions partitions/aix: append null character to print data from disk 2018-07-27 09:17:41 -06:00
badblocks.c badblocks: fix wrong return value in badblocks_set if badblocks are disabled 2017-11-03 11:29:50 -07:00
bfq-cgroup.c blkcg: fix ref count issue with bio_blkcg() using task_css 2018-12-07 22:26:36 -07:00
bfq-iosched.c block, bfq: tune service injection basing on request service times 2019-04-01 08:15:39 -06:00
bfq-iosched.h block, bfq: tune service injection basing on request service times 2019-04-01 08:15:39 -06:00
bfq-wf2q.c block, bfq: do not idle for lowest-weight queues 2019-04-01 08:15:39 -06:00
bio-integrity.c block: remove the bio_integrity_advance export 2018-12-16 08:33:57 -07:00
bio.c block: add BIO_NO_PAGE_REF flag 2019-03-18 10:44:48 -06:00
blk-cgroup.c blkcg: Fix kernel-doc warnings 2019-03-20 14:39:09 -06:00
blk-core.c mm: refactor readahead defines in mm.h 2019-03-12 10:04:01 -07:00
blk-exec.c block: remove dead elevator code 2018-11-07 13:42:32 -07:00
blk-flush.c blk-mq: use blk_mq_put_driver_tag() to put tag 2019-03-24 10:26:16 -06:00
blk-integrity.c block: merge BIOVEC_SEG_BOUNDARY into biovec_phys_mergeable 2018-09-24 12:33:57 -06:00
blk-ioc.c block: remove the queue_lock indirection 2018-11-15 12:17:28 -07:00
blk-iolatency.c blk-iolatency: #include "blk.h" 2019-03-20 14:19:38 -06:00
blk-lib.c block: fix 32 bit overflow in __blkdev_issue_discard() 2018-11-14 08:17:18 -07:00
blk-map.c Merge branch 'for-4.16/block' of git://git.kernel.dk/linux-block 2018-01-29 11:51:49 -08:00
blk-merge.c block: fix segment calculation for passthrough IO 2019-03-06 09:42:54 -07:00
blk-mq-cpumap.c blk-mq: initial support for multiple queue maps 2018-11-07 13:45:00 -07:00
blk-mq-debugfs-zoned.c block: Cleanup license notice 2019-01-17 21:21:40 -07:00
blk-mq-debugfs.c SCSI misc on 20190306 2019-03-09 16:53:47 -08:00
blk-mq-debugfs.h blk-mq-debugfs: support rq_qos 2018-12-16 19:53:47 -07:00
blk-mq-pci.c blk-mq: initial support for multiple queue maps 2018-11-07 13:45:00 -07:00
blk-mq-rdma.c blk-mq-rdma: pass in queue map to blk_mq_rdma_map_queues 2018-12-13 09:59:08 +01:00
blk-mq-sched.c blk-mq: save queue mapping result into ctx directly 2019-02-01 08:33:04 -07:00
blk-mq-sched.h block: mq-deadline: Fix write completion handling 2018-12-17 11:19:39 -07:00
blk-mq-sysfs.c blk-mq: export hctx->type in debugfs instead of sysfs 2018-12-17 05:44:45 -07:00
blk-mq-tag.c blk-mq: save queue mapping result into ctx directly 2019-02-01 08:33:04 -07:00
blk-mq-tag.h Merge branch 'for-4.15/block' of git://git.kernel.dk/linux-block 2017-11-14 15:32:19 -08:00
blk-mq-virtio.c blk-mq: initial support for multiple queue maps 2018-11-07 13:45:00 -07:00
blk-mq.c blk-mq: fix sbitmap ws_active for shared tags 2019-03-25 13:05:47 -06:00
blk-mq.h blk-mq: use blk_mq_put_driver_tag() to put tag 2019-03-24 10:26:16 -06:00
blk-pm.c block: remove the queue_lock indirection 2018-11-15 12:17:28 -07:00
blk-pm.h block: remove the queue_lock indirection 2018-11-15 12:17:28 -07:00
blk-rq-qos.c blk-mq-debugfs: support rq_qos 2018-12-16 19:53:47 -07:00
blk-rq-qos.h block: fix blk-iolatency accounting underflow 2018-12-17 11:19:54 -07:00
blk-settings.c block: kill QUEUE_FLAG_FLUSH_NQ 2019-02-09 15:40:24 -07:00
blk-softirq.c block: remove a few unused exports 2018-11-15 12:13:25 -07:00
blk-stat.c block: remove a few unused exports 2018-11-15 12:13:25 -07:00
blk-stat.h block: deactivate blk_stat timer in wbt_disable_default() 2018-12-12 06:47:51 -07:00
blk-sysfs.c block: add BLK_MQ_POLL_CLASSIC for hybrid poll and return EINVAL for unexpected value 2019-03-20 14:02:07 -06:00
blk-throttle.c blkcg: consolidate bio_issue_init() to be a part of core 2018-12-07 22:26:37 -07:00
blk-timeout.c block: don't hold the queue_lock over blk_abort_request 2018-11-15 12:13:18 -07:00
blk-wbt.c blk-wbt: Declare local functions static 2019-01-24 11:09:21 -07:00
blk-wbt.h block: remove external dependency on wbt_flags 2018-07-09 09:07:54 -06:00
blk-zoned.c for-4.21/block-20181221 2018-12-28 13:19:59 -08:00
blk.h blk-mq: save queue mapping result into ctx directly 2019-02-01 08:33:04 -07:00
bounce.c block: bounce: make sure that bvec table is updated 2019-02-21 10:58:44 -07:00
bsg-lib.c scsi: bsg-lib: handle bidi requests without block layer help 2019-02-05 21:27:40 -05:00
bsg.c scsi: bsg-lib: handle bidi requests without block layer help 2019-02-05 21:27:40 -05:00
cmdline-parser.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
compat_ioctl.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
elevator.c block: avoid setting none scheduler if it's already none 2019-02-11 08:21:40 -07:00
genhd.c block: Replace function name in string with __func__ 2019-02-28 14:09:08 -07:00
ioctl.c block: Introduce BLKGETNRZONES ioctl 2018-10-25 11:17:40 -06:00
ioprio.c block: add ioprio_check_cap function 2018-05-31 10:50:54 -04:00
Kconfig Kconfig updates for v4.21 2018-12-29 13:03:29 -08:00
Kconfig.iosched block: remove legacy IO schedulers 2018-11-07 13:42:32 -07:00
kyber-iosched.c kyber: use sbitmap add_wait_queue/list_del wait helpers 2018-12-20 12:17:21 -07:00
Makefile block: remove legacy IO schedulers 2018-11-07 13:42:32 -07:00
mq-deadline.c block: mq-deadline: Fix write completion handling 2018-12-17 11:19:39 -07:00
opal_proto.h block: sed-opal: Set MBRDone on S3 resume path if TPER is MBREnabled 2017-09-11 09:45:52 -06:00
partition-generic.c block: return just one value from part_in_flight 2018-12-10 08:30:38 -07:00
scsi_ioctl.c block: consistently use GFP_NOIO instead of __GFP_NORECLAIM 2018-05-14 08:55:18 -06:00
sed-opal.c block: sed-opal: Fix a couple off by one bugs 2018-06-20 12:04:06 -06:00
t10-pi.c block: move dif_prepare/dif_complete functions to block layer 2018-07-30 08:27:02 -06:00