Commit Graph

39 Commits

Author SHA1 Message Date
Mike Rapoport
e31cf2f4ca mm: don't include asm/pgtable.h if linux/mm.h is already included
Patch series "mm: consolidate definitions of page table accessors", v2.

The low level page table accessors (pXY_index(), pXY_offset()) are
duplicated across all architectures and sometimes more than once.  For
instance, we have 31 definition of pgd_offset() for 25 supported
architectures.

Most of these definitions are actually identical and typically it boils
down to, e.g.

static inline unsigned long pmd_index(unsigned long address)
{
        return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
}

static inline pmd_t *pmd_offset(pud_t *pud, unsigned long address)
{
        return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(address);
}

These definitions can be shared among 90% of the arches provided
XYZ_SHIFT, PTRS_PER_XYZ and xyz_page_vaddr() are defined.

For architectures that really need a custom version there is always
possibility to override the generic version with the usual ifdefs magic.

These patches introduce include/linux/pgtable.h that replaces
include/asm-generic/pgtable.h and add the definitions of the page table
accessors to the new header.

This patch (of 12):

The linux/mm.h header includes <asm/pgtable.h> to allow inlining of the
functions involving page table manipulations, e.g.  pte_alloc() and
pmd_alloc().  So, there is no point to explicitly include <asm/pgtable.h>
in the files that include <linux/mm.h>.

The include statements in such cases are remove with a simple loop:

	for f in $(git grep -l "include <linux/mm.h>") ; do
		sed -i -e '/include <asm\/pgtable.h>/ d' $f
	done

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200514170327.31389-1-rppt@kernel.org
Link: http://lkml.kernel.org/r/20200514170327.31389-2-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09 09:39:13 -07:00
Mike Rapoport
84e6ffb2c4 arm: add support for folded p4d page tables
Implement primitives necessary for the 4th level folding, add walks of p4d
level where appropriate, and remove __ARCH_USE_5LEVEL_HACK.

[rppt@linux.ibm.com: fix kexec]
  Link: http://lkml.kernel.org/r/20200508174232.GA759899@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: James Morse <james.morse@arm.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200414153455.21744-3-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-04 19:06:21 -07:00
Thomas Gleixner
d2912cb15b treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500
Based on 2 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license version 2 as
  published by the free software foundation #

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 4122 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Enrico Weigelt <info@metux.net>
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-19 17:09:55 +02:00
Huang Ying
cb9f753a37 mm: fix races between swapoff and flush dcache
Thanks to commit 4b3ef9daa4 ("mm/swap: split swap cache into 64MB
trunks"), after swapoff the address_space associated with the swap
device will be freed.  So page_mapping() users which may touch the
address_space need some kind of mechanism to prevent the address_space
from being freed during accessing.

The dcache flushing functions (flush_dcache_page(), etc) in architecture
specific code may access the address_space of swap device for anonymous
pages in swap cache via page_mapping() function.  But in some cases
there are no mechanisms to prevent the swap device from being swapoff,
for example,

  CPU1					CPU2
  __get_user_pages()			swapoff()
    flush_dcache_page()
      mapping = page_mapping()
        ...				  exit_swap_address_space()
        ...				    kvfree(spaces)
        mapping_mapped(mapping)

The address space may be accessed after being freed.

But from cachetlb.txt and Russell King, flush_dcache_page() only care
about file cache pages, for anonymous pages, flush_anon_page() should be
used.  The implementation of flush_dcache_page() in all architectures
follows this too.  They will check whether page_mapping() is NULL and
whether mapping_mapped() is true to determine whether to flush the
dcache immediately.  And they will use interval tree (mapping->i_mmap)
to find all user space mappings.  While mapping_mapped() and
mapping->i_mmap isn't used by anonymous pages in swap cache at all.

So, to fix the race between swapoff and flush dcache, __page_mapping()
is add to return the address_space for file cache pages and NULL
otherwise.  All page_mapping() invoking in flush dcache functions are
replaced with page_mapping_file().

[akpm@linux-foundation.org: simplify page_mapping_file(), per Mike]
Link: http://lkml.kernel.org/r/20180305083634.15174-1-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-05 21:36:26 -07:00
Russell King
4ed89f2228 ARM: convert printk(KERN_* to pr_*
Convert many (but not all) printk(KERN_* to pr_* to simplify the code.
We take the opportunity to join some printk lines together so we don't
split the message across several lines, and we also add a few levels
to some messages which were previously missing them.

Tested-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-11-21 15:24:50 +00:00
Kirill A. Shutemov
57c1ffcefb mm: rename USE_SPLIT_PTLOCKS to USE_SPLIT_PTE_PTLOCKS
We're going to introduce split page table lock for PMD level.  Let's
rename existing split ptlock for PTE level to avoid confusion.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Alex Thorlton <athorlton@sgi.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Eric W . Biederman" <ebiederm@xmission.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-15 09:32:14 +09:00
Michel Lespinasse
6b2dbba8b6 mm: replace vma prio_tree with an interval tree
Implement an interval tree as a replacement for the VMA prio_tree.  The
algorithms are similar to lib/interval_tree.c; however that code can't be
directly reused as the interval endpoints are not explicitly stored in the
VMA.  So instead, the common algorithm is moved into a template and the
details (node type, how to get interval endpoints from the node, etc) are
filled in using the C preprocessor.

Once the interval tree functions are available, using them as a
replacement to the VMA prio tree is a relatively simple, mechanical job.

Signed-off-by: Michel Lespinasse <walken@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:22:39 +09:00
Paul Gortmaker
d91ef63bd5 arm: remove several unnecessary module.h include instances
Building these files does not reveal a hidden need for
any of these.  Since module.h brings in the whole kitchen
sink, it just needlessly adds 30k+ lines to the cpp burden.

There are probably lots more, but ARM files of mach-* and plat-*
don't get coverage via a simple yesconfig build.  They will have
to be cleaned up and tested via using their respective configs.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2011-10-31 19:30:48 -04:00
Russell King
516295e5ab ARM: pgtable: add pud-level code
Add pud_offset() et.al. between the pgd and pmd code in preparation of
using pgtable-nopud.h rather than 4level-fixup.h.

This incorporates a fix from Jamie Iles <jamie@jamieiles.com> for
uaccess_with_memcpy.c.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-02-21 19:24:14 +00:00
Russell King
f6e3354d02 ARM: pgtable: introduce pteval_t to represent a pte value
This makes everywhere dealing with pte values use the same type.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-11-26 20:45:47 +00:00
Mika Westerberg
4e54d93d3c ARM: 6464/2: fix spinlock recursion in adjust_pte()
When running following code in a machine which has VIVT caches and
USE_SPLIT_PTLOCKS is not defined:

  fd = open("/etc/passwd", O_RDONLY);
  addr = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
  addr2 = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);

  v = *((int *)addr);

we will hang in spinlock recursion in the page fault handler:

  BUG: spinlock recursion on CPU#0, mmap_test/717
  lock: c5e295d8, .magic: dead4ead, .owner: mmap_test/717,
                  .owner_cpu: 0
  [<c0026604>] (unwind_backtrace+0x0/0xec)
  [<c014ee48>] (do_raw_spin_lock+0x40/0x140)
  [<c0027f68>] (update_mmu_cache+0x208/0x250)
  [<c0079db4>] (__do_fault+0x320/0x3ec)
  [<c007af7c>] (handle_mm_fault+0x2f0/0x6d8)
  [<c0027834>] (do_page_fault+0xdc/0x1cc)
  [<c00202d0>] (do_DataAbort+0x34/0x94)

This comes from the fact that when USE_SPLIT_PTLOCKS is not defined,
the only lock protecting the page tables is mm->page_table_lock
which is already locked before update_mmu_cache() is called.

Signed-off-by: Mika Westerberg <mika.westerberg@iki.fi>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-10-28 13:53:47 +01:00
Peter Zijlstra
ece0e2b640 mm: remove pte_*map_nested()
Since we no longer need to provide KM_type, the whole pte_*map_nested()
API is now redundant, remove it.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Miller <davem@davemloft.net>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-26 16:52:08 -07:00
Catalin Marinas
6012191aa9 ARM: 6380/1: Introduce __sync_icache_dcache() for VIPT caches
On SMP systems, there is a small chance of a PTE becoming visible to a
different CPU before the current cache maintenance operations in
update_mmu_cache(). To avoid this, cache maintenance must be handled in
set_pte_at() (similar to IA-64 and PowerPC).

This patch provides a unified VIPT cache handling mechanism and
implements the __sync_icache_dcache() function for ARMv6 onwards
architectures. It is called from set_pte_at() and replaces the
update_mmu_cache(). The latter is still used on VIVT hardware where a
vm_area_struct is required.

Tested-by: Rabin Vincent <rabin.vincent@stericsson.com>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-19 12:17:44 +01:00
Catalin Marinas
c01778001a ARM: 6379/1: Assume new page cache pages have dirty D-cache
There are places in Linux where writes to newly allocated page cache
pages happen without a subsequent call to flush_dcache_page() (several
PIO drivers including USB HCD). This patch changes the meaning of
PG_arch_1 to be PG_dcache_clean and always flush the D-cache for a newly
mapped page in update_mmu_cache().

The patch also sets the PG_arch_1 bit in the DMA cache maintenance
function to avoid additional cache flushing in update_mmu_cache().

Tested-by: Rabin Vincent <rabin.vincent@stericsson.com>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-09-19 12:17:43 +01:00
Russell King
ac1d426e82 Merge branch 'devel-stable' into devel
Conflicts:
	arch/arm/Kconfig
	arch/arm/include/asm/system.h
	arch/arm/mm/Kconfig
2010-05-17 17:24:04 +01:00
Russell King
f76348a360 ARM: remove unnecessary cache flush
This cache flush occurs when we first insert a page into the page
tables, where a page did not exist previously.  There can be no
cache lines associated with this virtual mapping, so this cache
flush is redundant.

Tested-by: Mike Rapoport <mike@compulab.co.il>
Tested-by: Mikael Pettersson <mikpe at it.uu.se>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-04-14 13:13:25 +01:00
Tejun Heo
5a0e3ad6af include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -> slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-30 22:02:32 +09:00
Russell King
ae1402022e ARM: make_coherent(): fix problems with highpte, part 2
update_mmu_cache() is called with the page table for the faulted-in
page still mapped.  We need to modify the PTE for this page to ensure
coherency with other shared mappings when multiple shared mappings
exist within a MM.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-02-20 16:42:51 +00:00
Russell King
4b3073e1c5 MM: Pass a PTE pointer to update_mmu_cache() rather than the PTE itself
On VIVT ARM, when we have multiple shared mappings of the same file
in the same MM, we need to ensure that we have coherency across all
copies.  We do this via make_coherent() by making the pages
uncacheable.

This used to work fine, until we allowed highmem with highpte - we
now have a page table which is mapped as required, and is not available
for modification via update_mmu_cache().

Ralf Beache suggested getting rid of the PTE value passed to
update_mmu_cache():

  On MIPS update_mmu_cache() calls __update_tlb() which walks pagetables
  to construct a pointer to the pte again.  Passing a pte_t * is much
  more elegant.  Maybe we might even replace the pte argument with the
  pte_t?

Ben Herrenschmidt would also like the pte pointer for PowerPC:

  Passing the ptep in there is exactly what I want.  I want that
  -instead- of the PTE value, because I have issue on some ppc cases,
  for I$/D$ coherency, where set_pte_at() may decide to mask out the
  _PAGE_EXEC.

So, pass in the mapped page table pointer into update_mmu_cache(), and
remove the PTE value, updating all implementations and call sites to
suit.

Includes a fix from Stephen Rothwell:

  sparc: fix fallout from update_mmu_cache API change

  Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-02-20 16:41:46 +00:00
Russell King
ed42acaef1 ARM: make_coherent: avoid recalculating the pfn for the modified page
We already know the pfn for the page to be modified in make_coherent,
so let's stop recalculating it unnecessarily.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-01-20 13:48:30 +00:00
Russell King
56dd47098a ARM: make_coherent: fix problems with highpte, part 1
update_mmu_cache() is called with a page table already mapped.  We
call make_coherent(), which then calls adjust_pte() which wants to
map other page tables.  This causes kmap_atomic() to BUG() because
the slot its trying to use is already taken.

Since do_adjust_pte() modifies the page tables, we are also missing
any form of locking, so we're risking corrupting the page tables.

Fix this by using pte_offset_map_nested(), and taking the pte page
table lock around do_adjust_pte().

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-01-20 13:48:30 +00:00
Russell King
f8a85f1164 ARM: make_coherent: convert adjust_pte() to use p*d_none_or_clear_bad()
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-01-20 13:48:29 +00:00
Russell King
c26c20b823 ARM: make_coherent: split adjust_pte() in two
adjust_pte() walks the page tables, and do_adjust_pte() does the
page table manipulation.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2010-01-20 13:48:29 +00:00
Russell King
52e8bfd81a ARM: Fix wrong shared bit for CPU write buffer bug test
It is unpredictable to have the same memory mapped using different
shared bit settings for ARMv6 and ARMv7 CPUs.  Fix this for the CPU
write buffer bug test.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-12-23 19:54:31 +00:00
Russell King
7b0a1003e7 ARM: Reduce __flush_dcache_page() visibility
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-12-04 14:58:50 +00:00
Russell King
421fe93cc4 ARM: ZERO_PAGE: Avoid flush_dcache_page() for zero page
The zero page is read-only, and has its cache state cleared during
boot.  No further maintanence for this page is required.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-12-01 18:20:07 +00:00
Nitin Gupta
787b2faadc ARM: force dcache flush if dcache_dirty bit set
On ARM, update_mmu_cache() does dcache flush for a page only if
it has a kernel mapping (page_mapping(page) != NULL). The correct
behavior would be to force the flush based on dcache_dirty bit only.

One of the cases where present logic would be a problem is when
a RAM based block device[1] is used as a swap disk. In this case,
we would have in-memory data corruption as shown in steps below:

do_swap_page()
{
    - Allocate a new page (if not already in swap cache)
    - Issue read from swap disk
        - Block driver issues flush_dcache_page()
        - flush_dcache_page() simply sets PG_dcache_dirty bit and does not
          actually issue a flush since this page has no user space mapping yet.
    - Now, if swap disk is almost full, this newly read page is removed
      from swap cache and corrsponding swap slot is freed.
    - Map this page anonymously in user space.
    - update_mmu_cache()
        - Since this page does not have kernel mapping (its not in page/swap
          cache and is mapped anonymously), it does not issue dcache flush
          even if dcache_dirty bit is set by flush_dcache_page() above.

    <user now gets stale data since dcache was never flushed>
}

Same problem exists on mips too.

[1] example:
 - brd (RAM based block device)
 - ramzswap (RAM based compressed swap device)

Signed-off-by: Nitin Gupta <ngupta@vflare.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-10-12 17:52:26 +01:00
Nicolas Pitre
08e445bd6a [ARM] 5366/1: fix shared memory coherency with VIVT L1 + L2 caches
When there are multiple L1-aliasing userland mappings of the same physical
page, we currently remap each of them uncached, to prevent VIVT cache
aliasing issues. (E.g. writes to one of the mappings not being immediately
visible via another mapping.)  However, when we do this remapping, there
could still be stale data in the L2 cache, and an uncached mapping might
bypass L2 and go straight to RAM.  This would cause reads from such
mappings to see old data (until the dirty L2 line is eventually evicted.)

This issue is solved by forcing a L2 cache flush whenever the shared page
is made L1 uncacheable.

Ideally, we would make L1 uncacheable and L2 cacheable as L2 is PIPT. But
Feroceon does not support that combination, and the TEX=5 C=0 B=0 encoding
for XSc3 doesn't appear to work in practice.

Signed-off-by: Nicolas Pitre <nico@marvell.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-01-28 16:55:00 +00:00
Russell King
6a4690c22f Merge branch 'ptebits' into devel
Conflicts:

	arch/arm/Kconfig
2008-10-09 21:31:56 +01:00
Russell King
bb30f36f9b [ARM] Introduce new PTE memory type bits
Provide L_PTE_MT_xxx definitions to describe the memory types that we
use in Linux/ARM.  These definitions are carefully picked such that:

1. their LSBs match what is required for pre-ARMv6 CPUs.
2. they all have a unique encoding, including after modification
   by build_mem_type_table() (the result being that some have more
   than one combination.)

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-10-01 16:40:56 +01:00
Russell King
09d9bae064 [ARM] sparse: fix several warnings
arch/arm/kernel/process.c:270:6: warning: symbol 'show_fpregs' was not declared. Should it be static?

This function isn't used, so can be removed.

arch/arm/kernel/setup.c:532:9: warning: symbol 'len' shadows an earlier one
arch/arm/kernel/setup.c:524:6: originally declared here

A function containing two 'len's.

arch/arm/mm/fault-armv.c:188:13: warning: symbol 'check_writebuffer_bugs' was not declared. Should it be static?
arch/arm/mm/mmap.c:122:5: warning: symbol 'valid_phys_addr_range' was not declared. Should it be static?
arch/arm/mm/mmap.c:137:5: warning: symbol 'valid_mmap_phys_addr_range' was not declared. Should it be static?

Missing includes.

arch/arm/kernel/traps.c:71:77: warning: Using plain integer as NULL pointer
arch/arm/mm/ioremap.c:355:46: error: incompatible types in comparison expression (different address spaces)

Sillies.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-05 14:11:24 +01:00
Russell King
46097c7dd8 [ARM] cachetype: move definitions to separate header
Rather than pollute asm/cacheflush.h with the cache type definitions,
move them to asm/cachetype.h, and include this new header where
necessary.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-09-01 12:06:24 +01:00
Russell King
53cdb27a93 [ARM] Fix shared mmap when more than two maps of the same file exist
The shared mmap code works fine for the test case, which only checked
for two shared maps of the same file.  However, three shared maps
result in one mapping remaining cached, resulting in stale data being
visible via that mapping.  Fix this.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-07-27 10:35:54 +01:00
Catalin Marinas
826cbdaff2 [ARM] 5092/1: Fix the I-cache invalidation on ARMv6 and later CPUs
This patch adds the I-cache invalidation in update_mmu_cache if the
corresponding vma is marked as executable. It also invalidates the
I-cache if a thread migrates to a CPU it never ran on.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-07-03 16:39:57 +01:00
George G. Davis
cb36bb7516 [ARM] 4191/1: Remove redundant __flush_dcache_page() function prototype
Commit 1c9d3df5e8 added function prototype
__flush_dcache_page() in include/asm-arm/cacheflush.h.  So we can remove
the prototype for same in arch/arm/mm/fault-armv.c since it is now
redundant to have it there.

Signed-off-by: George G. Davis <gdavis@mvista.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2007-02-16 12:57:55 +00:00
Russell King
ad1ae2fe7f [ARM] Unuse another Linux PTE bit
L_PTE_ASID is not really required to be stored in every PTE, since we
can identify it via the address passed to set_pte_at().  So, create
set_pte_ext() which takes the address of the PTE to set, the Linux
PTE value, and the additional CPU PTE bits which aren't encoded in
the Linux PTE value.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2006-12-13 14:34:43 +00:00
Hugh Dickins
69b0475456 [PATCH] mm: arm ready for split ptlock
Prepare arm for the split page_table_lock: three issues.

Signal handling's preserve and restore of iwmmxt context currently involves
reading and writing that context to and from user space, while holding
page_table_lock to secure the user page(s) against kswapd.  If we split the
lock, then the structure might span two pages, secured by to read into and
write from a kernel stack buffer, copying that out and in without locking (the
structure is 160 bytes in size, and here we're near the top of the kernel
stack).  Or would the overhead be noticeable?

arm_syscall's cmpxchg emulation use pte_offset_map_lock, instead of
pte_offset_map and mm-wide page_table_lock; and strictly, it should now also
take mmap_sem before descending to pmd, to guard against another thread
munmapping, and the page table pulled out beneath this thread.

Updated two comments in fault-armv.c.  adjust_pte is interesting, since its
modification of a pte in one part of the mm depends on the lock held when
calling update_mmu_cache for a pte in some other part of that mm.  This can't
be done with a split page_table_lock (and we've already taken the lowest lock
in the hierarchy here): so we'll have to disable split on arm, unless
CONFIG_CPU_CACHE_VIPT to ensures adjust_pte never used.

Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-29 21:40:42 -07:00
Russell King
8830f04a09 [PATCH] ARM: Fix delayed dcache flush for ARMv6 non-aliasing caches
flush_dcache_page() did nothing for these caches, but since they
suffer from I/D cache coherency issues, we need to ensure that data
is written back to RAM.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2005-06-20 09:51:03 +01:00
Linus Torvalds
1da177e4c3 Linux-2.6.12-rc2
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.

Let it rip!
2005-04-16 15:20:36 -07:00