linux/block
Ritesh Harjani b3193bc0dc cfq: Give a chance for arming slice idle timer in case of group_idle
In below scenario blkio cgroup does not work as per their assigned
weights :-
1. When the underlying device is nonrotational with a single HW queue
with depth of >= CFQ_HW_QUEUE_MIN
2. When the use case is forming two blkio cgroups cg1(weight 1000) &
cg2(wight 100) and two processes(file1 and file2) doing sync IO in
their respective blkio cgroups.

For above usecase result of fio (without this patch):-
file1: (groupid=0, jobs=1): err= 0: pid=685: Thu Jan  1 19:41:49 1970
  write: IOPS=1315, BW=41.1MiB/s (43.1MB/s)(1024MiB/24906msec)
<...>
file2: (groupid=0, jobs=1): err= 0: pid=686: Thu Jan  1 19:41:49 1970
  write: IOPS=1295, BW=40.5MiB/s (42.5MB/s)(1024MiB/25293msec)
<...>
// both the process BW is equal even though they belong to diff.
cgroups with weight of 1000(cg1) and 100(cg2)

In above case (for non rotational NCQ devices),
as soon as the request from cg1 is completed and even
though it is provided with higher set_slice=10, because of CFQ
algorithm when the driver tries to fetch the request, CFQ expires
this group without providing any idle time nor weight priority
and schedules another cfq group (in this case cg2).
And thus both cfq groups(cg1 & cg2) keep alternating to get the
disk time and hence loses the cgroup weight based scheduling.

Below patch gives a chance to cfq algorithm (cfq_arm_slice_timer)
to arm the slice timer in case group_idle is enabled.
In case if group_idle is also not required (including for nonrotational
NCQ drives), we need to explicitly set group_idle = 0 from sysfs for
such cases.

With this patch result of fio(for above usecase) :-
file1: (groupid=0, jobs=1): err= 0: pid=690: Thu Jan  1 00:06:08 1970
  write: IOPS=1706, BW=53.3MiB/s (55.9MB/s)(1024MiB/19197msec)
<..>
file2: (groupid=0, jobs=1): err= 0: pid=691: Thu Jan  1 00:06:08 1970
  write: IOPS=1043, BW=32.6MiB/s (34.2MB/s)(1024MiB/31401msec)
<..>
// In this processes BW is as per their respective cgroups weight.

Signed-off-by: Ritesh Harjani <riteshh@codeaurora.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-08-11 09:00:26 -06:00
..
partitions partitions/ldm: switch to use uuid_t 2017-06-05 16:59:14 +02:00
badblocks.c block: Add fallthrough markers to switch statements 2017-06-21 11:46:07 -06:00
bfq-cgroup.c block, bfq: access and cache blkg data only when safe 2017-06-08 09:51:10 -06:00
bfq-iosched.c block, bfq: boost throughput with flash-based non-queueing devices 2017-08-11 08:58:03 -06:00
bfq-iosched.h block,bfq: refactor device-idling logic 2017-08-11 08:58:02 -06:00
bfq-wf2q.c bfq: fix typos in comments about B-WF2Q+ algorithm 2017-07-12 08:32:02 -06:00
bio-integrity.c bio-integrity: move the bio integrity profile check earlier in bio_integrity_prep 2017-08-09 10:00:29 -06:00
bio.c block: pass in queue to inflight accounting 2017-08-09 13:09:16 -06:00
blk-cgroup.c block: Avoid that blk_exit_rl() triggers a use-after-free 2017-06-01 13:07:55 -06:00
blk-core.c blk-mq: enable checking two part inflight counts at the same time 2017-08-09 13:09:33 -06:00
blk-exec.c block: introduce new block status code type 2017-06-09 09:27:32 -06:00
blk-flush.c block: Check locking assumptions at runtime 2017-06-20 19:27:14 -06:00
blk-integrity.c block: switch bios to blk_status_t 2017-06-09 09:27:32 -06:00
blk-ioc.c Merge branch 'for-linus' of git://git.kernel.dk/linux-block 2017-03-03 10:53:35 -08:00
blk-lib.c block: Fix __blkdev_issue_zeroout loop 2017-07-06 09:43:20 -06:00
blk-map.c blk-map: call blk_queue_bounce from blk_rq_append_bio 2017-06-27 12:13:21 -06:00
blk-merge.c block: pass in queue to inflight accounting 2017-08-09 13:09:16 -06:00
blk-mq-cpumap.c blk-mq: map queues to all present CPUs 2017-07-24 10:01:31 -06:00
blk-mq-debugfs.c block: remove unused syncfull/asyncfull queue flags 2017-08-10 08:25:38 -06:00
blk-mq-debugfs.h mq-deadline: add debugfs attributes 2017-05-04 08:25:17 -06:00
blk-mq-pci.c blk-mq-pci: Fix two spelling mistakes 2017-03-29 11:09:51 -06:00
blk-mq-sched.c blk-mq-sched: fix performance regression of mq-deadline 2017-07-03 16:54:09 -06:00
blk-mq-sched.h Merge commit '8e8320c9315c' into for-4.13/block 2017-06-22 21:55:24 -06:00
blk-mq-sysfs.c blk-mq: untangle debugfs and sysfs 2017-05-04 08:24:13 -06:00
blk-mq-tag.c blk-mq-tag: check for NULL rq when iterating tags 2017-08-09 13:09:08 -06:00
blk-mq-tag.h blk-mq-sched: Allocate sched reserved tags as specified in the original queue tagset 2017-03-02 08:56:04 -07:00
blk-mq-virtio.c blk-mq: provide a default queue mapping for virtio device 2017-02-27 20:54:05 +02:00
blk-mq.c blk-mq: enable checking two part inflight counts at the same time 2017-08-09 13:09:33 -06:00
blk-mq.h blk-mq: provide internal in-flight variant 2017-08-09 13:09:28 -06:00
blk-settings.c block: don't bother with bounce limits for make_request drivers 2017-06-27 12:13:45 -06:00
blk-softirq.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/topology.h> 2017-03-02 08:42:26 +01:00
blk-stat.c blk-stat: don't use this_cpu_ptr() in a preemptable section 2017-05-10 07:40:18 -06:00
blk-stat.h blk-stat: kill blk_stat_rq_ddir() 2017-04-21 07:56:23 -06:00
blk-sysfs.c block: Fix a blk_exit_rl() regression 2017-06-14 13:27:50 -06:00
blk-tag.c block: Check locking assumptions at runtime 2017-06-20 19:27:14 -06:00
blk-throttle.c block: use standard blktrace API to output cgroup info for debug notes 2017-07-29 09:00:03 -06:00
blk-timeout.c block: Check locking assumptions at runtime 2017-06-20 19:27:14 -06:00
blk-wbt.c sched/wait: Disambiguate wq_entry->task_list and wq_head->task_list naming 2017-06-20 12:19:14 +02:00
blk-wbt.h block: Make writeback throttling defaults consistent for SQ devices 2017-04-19 08:49:03 -06:00
blk-zoned.c block: Rename blk_queue_zone_size and bdev_zone_size 2017-01-12 07:58:32 -07:00
blk.h bio-integrity: stop abusing bi_end_io 2017-07-03 17:00:59 -06:00
bounce.c block: remove the queue_bounce_pfn helper 2017-06-27 12:13:45 -06:00
bsg-lib.c block: introduce new block status code type 2017-06-09 09:27:32 -06:00
bsg.c block: Make most scsi_req_init() calls implicit 2017-06-20 19:27:14 -06:00
cfq-iosched.c cfq: Give a chance for arming slice idle timer in case of group_idle 2017-08-11 09:00:26 -06:00
cmdline-parser.c block: remove unrelated header files and export symbol 2014-01-21 20:18:26 -08:00
compat_ioctl.c compat_hdio_ioctl: get rid of set_fs() 2017-06-29 18:17:52 -04:00
deadline-iosched.c block: enumify ELEVATOR_*_MERGE 2017-02-08 13:43:06 -07:00
elevator.c block: Add fallthrough markers to switch statements 2017-06-21 11:46:07 -06:00
genhd.c blk-mq: provide internal in-flight variant 2017-08-09 13:09:28 -06:00
ioctl.c block: remove the discard_zeroes_data flag 2017-04-08 11:25:38 -06:00
ioprio.c block: Add fallthrough markers to switch statements 2017-06-21 11:46:07 -06:00
Kconfig Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm 2017-05-12 15:43:10 -07:00
Kconfig.iosched block, bfq: add full hierarchical scheduling and cgroups support 2017-04-19 08:30:26 -06:00
kyber-iosched.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-07-03 13:08:04 -07:00
Makefile block, bfq: split bfq-iosched.c into multiple source files 2017-04-19 08:48:24 -06:00
mq-deadline.c mq-deadline: add debugfs attributes 2017-05-04 08:25:17 -06:00
noop-iosched.c block: move existing elevator ops to union 2017-01-17 10:03:33 -07:00
opal_proto.h block/sed-opal: allocate struct opal_dev dynamically 2017-02-17 12:41:47 -07:00
partition-generic.c block: make part_in_flight() take an array of two ints 2017-08-09 13:09:20 -06:00
scsi_ioctl.c block: Change argument type of scsi_req_init() 2017-06-20 19:27:14 -06:00
sed-opal.c block: sed-opal: Tone down all the pr_* to debugs 2017-04-07 14:24:16 -06:00
t10-pi.c t10-pi: Move opencoded contants to common header 2017-07-03 16:56:25 -06:00