Go to file
Michael Wang dd39eadc71 sched: Avoid scale real weight down to zero
[ Upstream commit 26cf52229e ]

During our testing, we found a case that shares no longer
working correctly, the cgroup topology is like:

  /sys/fs/cgroup/cpu/A		(shares=102400)
  /sys/fs/cgroup/cpu/A/B	(shares=2)
  /sys/fs/cgroup/cpu/A/B/C	(shares=1024)

  /sys/fs/cgroup/cpu/D		(shares=1024)
  /sys/fs/cgroup/cpu/D/E	(shares=1024)
  /sys/fs/cgroup/cpu/D/E/F	(shares=1024)

The same benchmark is running in group C & F, no other tasks are
running, the benchmark is capable to consumed all the CPUs.

We suppose the group C will win more CPU resources since it could
enjoy all the shares of group A, but it's F who wins much more.

The reason is because we have group B with shares as 2, since
A->cfs_rq.load.weight == B->se.load.weight == B->shares/nr_cpus,
so A->cfs_rq.load.weight become very small.

And in calc_group_shares() we calculate shares as:

  load = max(scale_load_down(cfs_rq->load.weight), cfs_rq->avg.load_avg);
  shares = (tg_shares * load) / tg_weight;

Since the 'cfs_rq->load.weight' is too small, the load become 0
after scale down, although 'tg_shares' is 102400, shares of the se
which stand for group A on root cfs_rq become 2.

While the se of D on root cfs_rq is far more bigger than 2, so it
wins the battle.

Thus when scale_load_down() scale real weight down to 0, it's no
longer telling the real story, the caller will have the wrong
information and the calculation will be buggy.

This patch add check in scale_load_down(), so the real weight will
be >= MIN_SHARES after scale, after applied the group C wins as
expected.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lkml.kernel.org/r/38e8e212-59a1-64b2-b247-b6d0b52d8dc1@linux.alibaba.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-17 10:50:02 +02:00
arch x86: Don't let pgprot_modify() change the page encryption bit 2020-04-17 10:50:01 +02:00
block block: keep bdi->io_pages in sync with max_sectors_kb for stacked devices 2020-04-17 10:50:01 +02:00
certs PKCS#7: Refactor verify_pkcs7_signature() 2019-08-05 18:40:18 -04:00
crypto crypto: rename sm3-256 to sm3 in hash_algo_name 2020-02-28 17:22:26 +01:00
Documentation dt-bindings: net: FMan erratum A050385 2020-04-01 11:01:52 +02:00
drivers media: allegro: fix type of gop_length in channel_create message 2020-04-17 10:50:02 +02:00
fs debugfs: Check module state before warning in {full/open}_proxy_open() 2020-04-17 10:50:02 +02:00
include media: rc: add keymap for Videostrong KII Pro 2020-04-17 10:49:59 +02:00
init kbuild: remove header compile test 2020-03-05 16:43:47 +01:00
ipc Revert "ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()" 2020-02-28 17:22:20 +01:00
kernel sched: Avoid scale real weight down to zero 2020-04-17 10:50:02 +02:00
lib uapi: rename ext2_swab() to swab() and share globally in swab.h 2020-04-13 10:48:07 +02:00
LICENSES LICENSES: Rename other to deprecated 2019-05-03 06:34:32 -06:00
mm slub: improve bit diffusion for freelist ptr obfuscation 2020-04-13 10:48:07 +02:00
net cfg80211: Do not warn on same channel at the end of CSA 2020-04-17 10:49:59 +02:00
samples samples/bpf: Set -fno-stack-protector when building BPF programs 2020-02-24 08:36:36 +01:00
scripts kconfig: introduce m32-flag and m64-flag 2020-04-08 09:08:37 +02:00
security efi: Only print errors about failing to get certs if EFI vars are found 2020-03-12 13:00:14 +01:00
sound ASoC: jz4740-i2s: Fix divider written at incorrect offset in register 2020-04-13 10:48:09 +02:00
tools selftests/net: add definition for SOL_DCCP to fix compilation errors for old libc 2020-04-17 10:49:59 +02:00
usr initramfs: restore default compression behavior 2020-04-08 09:08:38 +02:00
virt KVM: Check for a bad hva before dropping into the ghc slow path 2020-03-05 16:43:48 +01:00
.clang-format clang-format: Update with the latest for_each macro list 2019-08-31 10:00:51 +02:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore Modules updates for v5.4 2019-09-22 10:34:46 -07:00
.mailmap ARM: SoC fixes 2019-11-10 13:41:59 -08:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS MAINTAINERS: Remove Simon as Renesas SoC Co-Maintainer 2019-10-10 08:12:51 -07:00
Kbuild kbuild: do not descend to ./Kbuild when cleaning 2019-08-21 21:03:58 +09:00
Kconfig docs: kbuild: convert docs to ReST and rename to *.rst 2019-06-14 14:21:21 -06:00
MAINTAINERS MAINTAINERS: Update drm/i915 bug filing URL 2020-02-28 17:22:19 +01:00
Makefile Linux 5.4.32 2020-04-13 10:48:18 +02:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.