2
0
mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-29 07:34:06 +08:00
linux-next/arch/arm/mm
Lorenzo Pieralisi 70f665fe77 ARM: 7919/1: mm: refactor v7 cache cleaning ops to use way/index sequence
Set-associative caches on all v7 implementations map the index bits
to physical addresses LSBs and tag bits to MSBs. As the last level
of cache on current and upcoming ARM systems grows in size,
this means that under normal DRAM controller configurations, the
current v7 cache flush routine using set/way operations triggers a
DRAM memory controller precharge/activate for every cache line
writeback since the cache routine cleans lines by first fixing the
index and then looping through ways (index bits are mapped to lower
physical addresses on all v7 cache implementations; this means that,
with last level cache sizes in the order of MBytes, lines belonging
to the same set but different ways map to different DRAM pages).

Given the random content of cache tags, swapping the order between
indexes and ways loops do not prevent DRAM pages precharge and
activate cycles but at least, on average, improves the chances that
either multiple lines hit the same page or multiple lines belong to
different DRAM banks, improving throughput significantly.

This patch swaps the inner loops in the v7 cache flushing routine
to carry out the clean operations first on all sets belonging to
a given way (looping through sets) and then decrementing the way.

Benchmarks showed that by swapping the ordering in which sets and
ways are decremented in the v7 cache flushing routine, that uses
set/way operations, time required to flush caches is reduced
significantly, owing to improved writebacks throughput to the DRAM
controller.

Benchmarks results vary and depend heavily on the last level of
cache tag RAM content when cache is cleaned and invalidated, ranging
from 2x throughput when all tag RAM entries contain dirty lines
mapping to sequential pages of RAM to 1x (ie no improvement) when
all tag RAM accesses trigger a DRAM precharge/activate cycle, as the
current code implies on most DRAM controller configurations.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-12-29 12:32:40 +00:00
..
abort-ev4.S ARM: entry: data abort: tail-call the main data abort handler 2011-07-02 10:56:11 +01:00
abort-ev4t.S ARM: entry: data abort: tail-call the main data abort handler 2011-07-02 10:56:11 +01:00
abort-ev5t.S ARM: entry: data abort: tail-call the main data abort handler 2011-07-02 10:56:11 +01:00
abort-ev5tj.S ARM: entry: data abort: tail-call the main data abort handler 2011-07-02 10:56:11 +01:00
abort-ev6.S ARM: asm: Add ARM_BE8() assembly helper 2013-10-19 20:46:33 +01:00
abort-ev7.S ARM: entry: data abort: tail-call the main data abort handler 2011-07-02 10:56:11 +01:00
abort-lv4t.S ARM: entry: data abort: ensure r5 is preserved by abort functions 2011-07-02 10:56:12 +01:00
abort-macro.S ARM: 7088/1: entry: fix wrong parameter name used in do_thumb_abort 2011-09-10 23:39:56 +01:00
abort-nommu.S ARM: entry: data abort: tail-call the main data abort handler 2011-07-02 10:56:11 +01:00
alignment.c ARM: alignment: correctly decode instructions in BE8 mode. 2013-10-19 20:46:34 +01:00
cache-aurora-l2.h ARM: 7547/4: cache-l2x0: add support for Aurora L2 cache ctrl 2012-11-06 19:47:35 +00:00
cache-fa.S ARM: mm: implement LoUIS API for cache maintenance ops 2012-09-25 11:20:25 +01:00
cache-feroceon-l2.c ARM: 7696/1: Fix kexec by setting outer_cache.inv_all for Feroceon 2013-04-17 16:53:27 +01:00
cache-l2x0.c Merge branches 'debug-choice', 'devel-stable' and 'misc' into for-linus 2013-09-05 10:34:15 +01:00
cache-nop.S ARM: Add base support for ARMv7-M 2013-04-17 21:38:10 +02:00
cache-tauros2.c ARM: cache: add dt support for tauros2 cache 2012-08-16 16:16:50 +08:00
cache-v4.S ARM: mm: remove broken condition check for v4 flushing 2013-03-26 09:55:34 +00:00
cache-v4wb.S ARM: mm: implement LoUIS API for cache maintenance ops 2012-09-25 11:20:25 +01:00
cache-v4wt.S ARM: mm: implement LoUIS API for cache maintenance ops 2012-09-25 11:20:25 +01:00
cache-v6.S ARM: mm: implement LoUIS API for cache maintenance ops 2012-09-25 11:20:25 +01:00
cache-v7.S ARM: 7919/1: mm: refactor v7 cache cleaning ops to use way/index sequence 2013-12-29 12:32:40 +00:00
cache-xsc3l2.c ARM: move CP15 definitions to separate header file 2012-03-28 18:30:01 +01:00
context.c ARM: tlb: don't perform inner-shareable invalidation for local TLB ops 2013-08-12 12:25:44 +01:00
copypage-fa.c arm: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:14 +08:00
copypage-feroceon.c arm: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:14 +08:00
copypage-v4mc.c Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm 2012-03-29 16:53:48 -07:00
copypage-v4wb.c arm: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:14 +08:00
copypage-v4wt.c arm: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:14 +08:00
copypage-v6.c Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm 2012-03-29 16:53:48 -07:00
copypage-xsc3.c arm: remove the second argument of k[un]map_atomic() 2012-03-20 21:48:14 +08:00
copypage-xscale.c Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm 2012-03-29 16:53:48 -07:00
dma-mapping.c ARM: another fix for the DMA mapping checks 2013-12-09 23:24:26 +00:00
extable.c ARM: 7876/1: clear Thumb-2 IT state on exception handling 2013-11-07 00:15:49 +00:00
fault-armv.c mm: rename USE_SPLIT_PTLOCKS to USE_SPLIT_PTE_PTLOCKS 2013-11-15 09:32:14 +09:00
fault.c arch: mm: pass userspace fault flag to generic fault handler 2013-09-12 15:38:01 -07:00
fault.h ARM: LPAE: Add fault handling support 2011-12-08 10:30:40 +00:00
flush.c Merge branch 'devel-stable' into for-next 2013-06-29 11:44:43 +01:00
fsr-2level.c ARM: LPAE: Move the FSR definitions to separate files 2011-12-08 10:30:37 +00:00
fsr-3level.c ARM: mm: Transparent huge page support for LPAE systems. 2013-06-04 16:52:38 +01:00
highmem.c Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm 2012-03-29 16:53:48 -07:00
hugetlbpage.c mm: migrate: check movability of hugepage in unmap_and_move_huge_page() 2013-09-11 15:57:49 -07:00
idmap.c ARM: mm: Move the idmap print to appropriate place in the code 2013-10-10 20:25:06 -04:00
init.c ARM: 7908/1: mm: Fix the arm_dma_limit calculation 2013-12-09 23:24:28 +00:00
iomap.c arm/PCI: remove arch pci_flags definition 2012-02-23 20:18:56 -07:00
ioremap.c ARM: 7728/1: mm: Use phys_addr_t properly for ioremap functions 2013-05-23 00:09:44 +01:00
Kconfig ARM: fix ARCH_IXP4xx usage of ARCH_SUPPORTS_BIG_ENDIAN 2013-10-19 20:46:32 +01:00
Makefile Merge branch 'for-rmk/hugepages' of git://git.linaro.org/people/stevecapper/linux into devel-stable 2013-06-18 20:05:48 +01:00
mm.h ARM: DMA-API: better handing of DMA masks for coherent allocations 2013-10-31 14:49:21 +00:00
mmap.c ARM: fix booting low-vectors machines 2013-11-30 14:45:31 +00:00
mmu.c ARM: 7884/1: mm: Fix ECC mem policy printk 2013-11-14 11:13:10 +00:00
nommu.c ARM: Fix nommu.c build warning 2013-11-14 10:59:50 +00:00
pabort-legacy.S ARM: entry: prefetch abort: tail-call the main prefetch abort handler 2011-07-02 10:56:10 +01:00
pabort-v6.S ARM: entry: prefetch abort: tail-call the main prefetch abort handler 2011-07-02 10:56:10 +01:00
pabort-v7.S ARM: entry: prefetch abort: tail-call the main prefetch abort handler 2011-07-02 10:56:10 +01:00
pgd.c ARM: fix booting low-vectors machines 2013-11-30 14:45:31 +00:00
proc-arm7tdmi.S arm: delete __cpuinit/__CPUINIT usage from all ARM users 2013-07-14 19:36:52 -04:00
proc-arm9tdmi.S arm: delete __cpuinit/__CPUINIT usage from all ARM users 2013-07-14 19:36:52 -04:00
proc-arm720.S arm: delete __cpuinit/__CPUINIT usage from all ARM users 2013-07-14 19:36:52 -04:00
proc-arm740.S arm: delete __cpuinit/__CPUINIT usage from all ARM users 2013-07-14 19:36:52 -04:00
proc-arm920.S arm: delete __cpuinit/__CPUINIT usage from all ARM users 2013-07-14 19:36:52 -04:00
proc-arm922.S arm: delete __cpuinit/__CPUINIT usage from all ARM users 2013-07-14 19:36:52 -04:00
proc-arm925.S arm: delete __cpuinit/__CPUINIT usage from all ARM users 2013-07-14 19:36:52 -04:00
proc-arm926.S arm: delete __cpuinit/__CPUINIT usage from all ARM users 2013-07-14 19:36:52 -04:00
proc-arm940.S arm: delete __cpuinit/__CPUINIT usage from all ARM users 2013-07-14 19:36:52 -04:00
proc-arm946.S arm: delete __cpuinit/__CPUINIT usage from all ARM users 2013-07-14 19:36:52 -04:00
proc-arm1020.S arm: delete __cpuinit/__CPUINIT usage from all ARM users 2013-07-14 19:36:52 -04:00
proc-arm1020e.S arm: delete __cpuinit/__CPUINIT usage from all ARM users 2013-07-14 19:36:52 -04:00
proc-arm1022.S arm: delete __cpuinit/__CPUINIT usage from all ARM users 2013-07-14 19:36:52 -04:00
proc-arm1026.S arm: delete __cpuinit/__CPUINIT usage from all ARM users 2013-07-14 19:36:52 -04:00
proc-fa526.S arm: delete __cpuinit/__CPUINIT usage from all ARM users 2013-07-14 19:36:52 -04:00
proc-feroceon.S ARM: 7818/1: feroceon: Add suspend/resume operation 2013-08-20 00:12:52 +01:00
proc-macros.S ARM: 7773/1: PJ4B: Add support for errata 4742 2013-06-24 14:28:46 +01:00
proc-mohawk.S arm: delete __cpuinit/__CPUINIT usage from all ARM users 2013-07-14 19:36:52 -04:00
proc-sa110.S arm: delete __cpuinit/__CPUINIT usage from all ARM users 2013-07-14 19:36:52 -04:00
proc-sa1100.S arm: delete __cpuinit/__CPUINIT usage from all ARM users 2013-07-14 19:36:52 -04:00
proc-syms.c ARM: modules: don't export cpu_set_pte_ext when !MMU 2013-03-26 09:55:34 +00:00
proc-v6.S ARM: asm: Add ARM_BE8() assembly helper 2013-10-19 20:46:33 +01:00
proc-v7-2level.S ARM: 7784/1: mm: ensure SMP alternates assemble to exactly 4 bytes with Thumb-2 2013-07-22 14:29:09 +01:00
proc-v7-3level.S ARM: 7784/1: mm: ensure SMP alternates assemble to exactly 4 bytes with Thumb-2 2013-07-22 14:29:09 +01:00
proc-v7.S ARM: 7885/1: Save/Restore 64-bit TTBR registers on LPAE suspend/resume 2013-11-14 11:13:11 +00:00
proc-v7m.S ARM: ARMv7-M: implement read_cpuid_ext 2013-05-17 11:44:40 +02:00
proc-xsc3.S arm: delete __cpuinit/__CPUINIT usage from all ARM users 2013-07-14 19:36:52 -04:00
proc-xscale.S arm: delete __cpuinit/__CPUINIT usage from all ARM users 2013-07-14 19:36:52 -04:00
tcm.h ARM: 7694/1: ARM, TCM: initialize TCM in paging_init(), instead of setup_arch() 2013-04-17 16:53:24 +01:00
tlb-fa.S Merge branch 'devel-stable' into for-next 2011-07-22 23:09:07 +01:00
tlb-v4.S ARM: mm: tlb-v4: Use the new processor struct macros 2011-07-07 15:31:12 +01:00
tlb-v4wb.S ARM: mm: tlb-v4wb: Use the new processor struct macros 2011-07-07 15:31:12 +01:00
tlb-v4wbi.S ARM: mm: tlb-v4wbi: Use the new processor struct macros 2011-07-07 15:31:12 +01:00
tlb-v6.S Merge branch 'devel-stable' into for-next 2011-07-22 23:09:07 +01:00
tlb-v7.S ARM: mm: use inner-shareable barriers for TLB and user cache operations 2013-08-12 12:25:45 +01:00