mirror of
https://github.com/edk2-porting/linux-next.git
synced 2025-01-08 05:34:29 +08:00
db1b1fefc2
Currently, count_active_tasks() calls both nr_running() & nr_interruptible(). Each of these functions does a "for_each_cpu" & reads values from the runqueue of each cpu. Although this is not a lot of instructions, each runqueue may be located on different node. Depending on the architecture, a unique TLB entry may be required to access each runqueue. Since there may be more runqueues than cpu TLB entries, a scan of all runqueues can trash the TLB. Each memory reference incurs a TLB miss & refill. In addition, the runqueue cacheline that contains nr_running & nr_uninterruptible may be evicted from the cache between the two passes. This causes unnecessary cache misses. Combining nr_running() & nr_interruptible() into a single function substantially reduces the TLB & cache misses on large systems. This should have no measureable effect on smaller systems. On a 128p IA64 system running a memory stress workload, the new function reduced the overhead of calc_load() from 605 usec/call to 324 usec/call. Signed-off-by: Jack Steiner <steiner@sgi.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org> |
||
---|---|---|
.. | ||
acpi | ||
asm-alpha | ||
asm-arm | ||
asm-arm26 | ||
asm-cris | ||
asm-frv | ||
asm-generic | ||
asm-h8300 | ||
asm-i386 | ||
asm-ia64 | ||
asm-m32r | ||
asm-m68k | ||
asm-m68knommu | ||
asm-mips | ||
asm-parisc | ||
asm-powerpc | ||
asm-ppc | ||
asm-s390 | ||
asm-sh | ||
asm-sh64 | ||
asm-sparc | ||
asm-sparc64 | ||
asm-um | ||
asm-v850 | ||
asm-x86_64 | ||
asm-xtensa | ||
keys | ||
linux | ||
math-emu | ||
media | ||
mtd | ||
net | ||
pcmcia | ||
rdma | ||
rxrpc | ||
scsi | ||
sound | ||
video |