linux/block
Tejun Heo e9f4eee9a0 blk-iocost: fix weight updates of inner active iocgs
When the weight of an active iocg is updated, weight_updated() is called
which in turn calls __propagate_weights() to update the active and inuse
weights so that the effective hierarchical weights are updated accordingly.

The current implementation is incorrect for inner active nodes. For an
active leaf iocg, inuse can be any value between 1 and active and the
difference represents how much the iocg is donating. When weight is updated,
as long as inuse is clamped between 1 and the new weight, we're alright and
this is what __propagate_weights() currently implements.

However, that's not how an active inner node's inuse is set. An inner node's
inuse is solely determined by the ratio between the sums of inuse's and
active's of its children - ie. they're results of propagating the leaves'
active and inuse weights upwards. __propagate_weights() incorrectly applies
the same clamping as for a leaf when an active inner node's weight is
updated. Consider a hierarchy which looks like the following with saturating
workloads in AA and BB.

     R
   /   \
  A     B
  |     |
 AA     BB

1. For both A and B, active=100, inuse=100, hwa=0.5, hwi=0.5.

2. echo 200 > A/io.weight

3. __propagate_weights() updates A's active to 200 and leaves inuse at 100 as
   it's already between 1 and the new active, making A:active=200,
   A:inuse=100. As R's active_sum is updated along with A's active,
   A:hwa=2/3, B:hwa=1/3. However, because the inuses didn't change, the
   hwi's remain unchanged at 0.5.

4. The weight of A is now twice that of B but AA and BB still have the same
   hwi of 0.5 and thus are doing the same amount of IOs.
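
The arithmetic in steps 1-4 can be reproduced with a small user-space model.
This is only a sketch: struct node_weights, share() and
update_weight_clamped() are hypothetical names made up for illustration and
are not the kernel's data structures or functions. It assumes a node directly
under the root, so its hierarchical weight is simply its own weight divided by
the sum over its siblings.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a child of the root; not the kernel's iocg. */
struct node_weights {
	uint32_t active;
	uint32_t inuse;
};

/* Share of the parent held by a depth-one node: own weight over sibling sum. */
static double share(uint32_t own, uint32_t sibling_sum)
{
	assert(sibling_sum > 0);
	return (double)own / (double)sibling_sum;
}

/*
 * Leaf-style update as described above: take the new active weight and
 * only clamp inuse into [1, active].
 */
static void update_weight_clamped(struct node_weights *n, uint32_t new_active)
{
	assert(new_active >= 1);
	n->active = new_active;
	if (n->inuse < 1)
		n->inuse = 1;
	if (n->inuse > n->active)
		n->inuse = n->active;
}
```

Starting from A and B at active=100, inuse=100 and calling
update_weight_clamped(&a, 200) leaves a.inuse at 100, so
share(a.active, a.active + b.active) is 2/3 while
share(a.inuse, a.inuse + b.inuse) stays at 0.5, matching step 3.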

Fix it by making __propagate_weights() always calculate the inuse of an
active inner iocg based on the ratio of child_inuse_sum to child_active_sum.
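
A minimal user-space sketch of that rule follows. It is not the kernel code:
struct iocg_model, clamp_u32() and set_weights() are hypothetical names, the
child_active_sum and child_inuse_sum fields merely borrow the names used
above, and rounding the scaled value up is an assumption of this sketch. A
node with a nonzero child_active_sum is treated as an active inner node and
gets its inuse scaled from its children; any other node keeps the leaf
clamping.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of an iocg; not the kernel's struct. */
struct iocg_model {
	uint32_t active;
	uint32_t inuse;
	uint32_t child_active_sum;	/* sum of the children's active weights */
	uint32_t child_inuse_sum;	/* sum of the children's inuse weights */
};

static uint32_t clamp_u32(uint32_t v, uint32_t lo, uint32_t hi)
{
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/* Apply a new active weight and pick inuse according to the node type. */
static void set_weights(struct iocg_model *n, uint32_t active, uint32_t inuse)
{
	assert(active >= 1);
	if (n->child_active_sum) {
		/* Inner node: inuse follows the children's inuse/active ratio. */
		uint64_t num = (uint64_t)active * n->child_inuse_sum;

		/* Round up (assumption of this sketch). */
		inuse = (uint32_t)((num + n->child_active_sum - 1) /
				   n->child_active_sum);
	} else {
		/* Leaf: any value in [1, active] is valid, so only clamp. */
		inuse = clamp_u32(inuse, 1, active);
	}
	n->active = active;
	n->inuse = inuse;
}
```

With A modelled as active=100, inuse=100, child_active_sum=100 and
child_inuse_sum=100 (AA saturating), set_weights(&a, 200, a.inuse) yields
A:inuse=200. Against B:inuse=100 that gives A an hwi of 2/3, in line with its
hwa.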

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Dan Schatzberg <dschatzberg@fb.com>
Fixes: 7caa47151a ("blkcg: implement blk-iocost")
Cc: stable@vger.kernel.org # v5.4+
Link: https://lore.kernel.org/r/YJsxnLZV1MnBcqjj@slm.duckdns.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-05-11 20:50:35 -06:00
partitions for-5.13/block-2021-04-27 2021-04-28 14:27:12 -07:00
badblocks.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
bfq-cgroup.c block, bfq: merge bursts of newly-created queues 2021-03-25 10:50:07 -06:00
bfq-iosched.c kyber: fix out of bounds access when preempted 2021-05-11 08:12:14 -06:00
bfq-iosched.h block, bfq: merge bursts of newly-created queues 2021-03-25 10:50:07 -06:00
bfq-wf2q.c block, bfq: always inject I/O of queues blocked by wakers 2021-03-25 10:50:07 -06:00
bio-integrity.c block: remove BLK_BOUNCE_ISA support 2021-04-06 09:28:17 -06:00
bio.c Revert "bio: limit bio max size" 2021-05-08 21:49:48 -06:00
blk-cgroup-rwstat.c blk-cgroup: Fix the recursive blkg rwstat 2021-03-05 11:32:15 -07:00
blk-cgroup-rwstat.h blk-cgroup: separate out blkg_rwstat under CONFIG_BLK_CGROUP_RWSTAT 2019-11-07 12:28:13 -07:00
blk-cgroup.c for-5.12/block-2021-02-17 2021-02-21 11:02:48 -08:00
blk-core.c block: refactor the bounce buffering code 2021-04-06 09:28:17 -06:00
blk-crypto-fallback.c block: rename BIO_MAX_PAGES to BIO_MAX_VECS 2021-03-11 07:47:48 -07:00
blk-crypto-internal.h block: make blk_crypto_rq_bio_prep() able to fail 2020-10-05 10:47:43 -06:00
blk-crypto.c dm: support key eviction from keyslot managers of underlying devices 2021-02-11 09:45:25 -05:00
blk-exec.c block: drop removed argument from kernel-doc of blk_execute_rq() 2021-01-29 07:43:29 -07:00
blk-flush.c block: use an on-stack bio in blkdev_issue_flush 2021-01-27 09:51:48 -07:00
blk-integrity.c block: remove the unused blk_integrity_merge_bio export 2020-10-06 07:29:53 -06:00
blk-ioc.c block: remove retry loop in ioc_release_fn() 2020-07-16 10:22:15 -06:00
blk-iocost.c blk-iocost: fix weight updates of inner active iocgs 2021-05-11 20:50:35 -06:00
blk-iolatency.c block: Remove redundant 'return' statement 2020-10-08 07:59:48 -06:00
blk-lib.c block: rename BIO_MAX_PAGES to BIO_MAX_VECS 2021-03-11 07:47:48 -07:00
blk-map.c block: remove an incorrect check from blk_rq_append_bio 2021-04-12 06:45:12 -06:00
blk-merge.c block: recalculate segment count for multi-segment discards correctly 2021-03-23 10:39:57 -06:00
blk-mq-cpumap.c blk-mq: remove the calling of local_memory_node() 2020-10-20 07:08:17 -06:00
blk-mq-debugfs-zoned.c block: Cleanup license notice 2019-01-17 21:21:40 -07:00
blk-mq-debugfs.c for-5.13/block-2021-04-27 2021-04-28 14:27:12 -07:00
blk-mq-debugfs.h blk-mq: no need to check return value of debugfs_create functions 2019-06-13 03:00:30 -06:00
blk-mq-pci.c block: Fix blk_mq_*_map_queues() kernel-doc headers 2019-05-31 15:12:34 -06:00
blk-mq-rdma.c block: Fix blk_mq_*_map_queues() kernel-doc headers 2019-05-31 15:12:34 -06:00
blk-mq-sched.c kyber: fix out of bounds access when preempted 2021-05-11 08:12:14 -06:00
blk-mq-sched.h block: get rid of the trace rq insert wrapper 2021-02-22 06:37:41 -07:00
blk-mq-sysfs.c blk-mq: move cancel of hctx->run_work to the front of blk_exit_queue 2020-10-09 12:46:28 -06:00
blk-mq-tag.c blk-mq: Always use blk_mq_is_sbitmap_shared 2021-04-06 09:24:07 -06:00
blk-mq-tag.h blk-mq: Relocate hctx_may_queue() 2020-09-03 15:20:47 -06:00
blk-mq-virtio.c blk-mq: Fix typo in comment 2020-03-17 20:55:21 +01:00
blk-mq.c SCSI misc on 20210428 2021-04-28 17:22:10 -07:00
blk-mq.h scsi: blk-mq: Return budget token from .get_budget callback 2021-03-04 17:36:59 -05:00
blk-pm.c scsi: block: Fix a race in the runtime power management code 2020-12-09 11:41:41 -05:00
blk-pm.h block: Remove unused blk_pm_*() function definitions 2021-02-22 06:33:48 -07:00
blk-rq-qos.c Revert "blk-rq-qos: remove redundant finish_wait to rq_qos_wait." 2020-07-15 09:33:37 -06:00
blk-rq-qos.h blk-rq-qos: fix first node deletion of rq_qos_del() 2019-10-15 10:13:13 -06:00
blk-settings.c Revert "bio: limit bio max size" 2021-05-08 21:49:48 -06:00
blk-stat.c blk-stat: make q->stats->lock irqsafe 2020-09-01 16:48:46 -06:00
blk-stat.h block: deactivate blk_stat timer in wbt_disable_default() 2018-12-12 06:47:51 -07:00
blk-sysfs.c block: add sysfs entry for virt boundary mask 2021-04-06 09:23:23 -06:00
blk-throttle.c block: store a block_device pointer in struct bio 2021-01-24 18:17:20 -07:00
blk-timeout.c block: blk-timeout: delete duplicated word 2020-07-31 16:29:47 -06:00
blk-wbt.c blk: wbt: remove unused parameter from wbt_should_throttle 2021-01-26 13:13:00 -07:00
blk-wbt.h blk-wbt: remove wbt_update_limits 2020-05-29 16:30:39 -06:00
blk-zoned.c blk-zoned: Remove the definition of blk_zone_start() 2021-04-07 14:31:45 -06:00
blk.h block: refactor blk_drop_partitions 2021-04-08 10:24:36 -06:00
bounce.c block: stop calling blk_queue_bounce for passthrough requests 2021-04-06 09:28:18 -06:00
bsg-lib.c block: drop double zeroing 2020-09-23 09:18:13 -06:00
bsg.c block: remove unnecessary argument from blk_execute_rq 2021-01-24 21:52:39 -07:00
cmdline-parser.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
elevator.c blk-mq: set default elevator as deadline in case of hctx shared tagset 2021-04-07 10:15:23 -06:00
genhd.c block: remove disk_part_iter 2021-04-08 10:24:36 -06:00
ioctl.c block: return -EBUSY when there are open partitions in blkdev_reread_part 2021-04-21 10:49:37 -06:00
ioprio.c block: Fix sys_ioprio_set(.which=IOPRIO_WHO_PGRP) task iteration 2021-04-08 13:43:53 -06:00
Kconfig blk-wbt: Remove obsolete multiqueue I/O scheduling comment 2020-09-01 16:49:26 -06:00
Kconfig.iosched treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
keyslot-manager.c - Fix DM integrity's HMAC support to provide enhanced security of 2021-02-22 10:22:54 -08:00
kyber-iosched.c kyber: fix out of bounds access when preempted 2021-05-11 08:12:14 -06:00
Makefile blk-mq: merge blk-softirq.c into blk-mq.c 2020-06-24 09:15:56 -06:00
mq-deadline.c kyber: fix out of bounds access when preempted 2021-05-11 08:12:14 -06:00
opal_proto.h block: sed-opal: Change the check condition for regular session validity 2020-03-12 08:00:10 -06:00
scsi_ioctl.c block: Remove an obsolete comment from sg_io() 2021-04-13 11:23:52 -06:00
sed-opal.c block: sed-opal: Change the check condition for regular session validity 2020-03-12 08:00:10 -06:00
t10-pi.c block: Allow t10-pi to be modular 2020-01-06 20:59:04 -07:00