Commit Graph

84 Commits

Author SHA1 Message Date
Marc Zyngier
d0e22b4ac3 KVM: arm/arm64: Limit icache invalidation to prefetch aborts
We've so far eagerly invalidated the icache, no matter how
the page was faulted in (data or prefetch abort).

But we can easily track execution by setting the XN bits
in the S2 page tables, get the prefetch abort at HYP and
perform the icache invalidation at that time only.

As for most VMs, the instruction working set is pretty
small compared to the data set, this is likely to save
some traffic (specially as the invalidation is broadcast).

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2018-01-08 15:20:45 +01:00
Russell King
1ee5e87f86 ARM: fix get_user_pages_fast
Ensure that get_user_pages_fast() is not able to access memory which
has been mapped with PROT_NONE.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-11-21 14:45:36 +00:00
Kirill A. Shutemov
9849a5697d arch, mm: convert all architectures to use 5level-fixup.h
If an architecture uses 4level-fixup.h we don't need to do anything as
it includes 5level-fixup.h.

If an architecture uses pgtable-nop*d.h, define __ARCH_USE_5LEVEL_HACK
before inclusion of the header. It makes asm-generic code to use
5level-fixup.h.

If an architecture has 4-level paging or folds levels on its own,
include 5level-fixup.h directly.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-03-09 11:48:47 -08:00
Linus Torvalds
221bb8a46e - ARM: GICv3 ITS emulation and various fixes. Removal of the old
VGIC implementation.
 
 - s390: support for trapping software breakpoints, nested virtualization
 (vSIE), the STHYI opcode, initial extensions for CPU model support.
 
 - MIPS: support for MIPS64 hosts (32-bit guests only) and lots of cleanups,
 preliminary to this and the upcoming support for hardware virtualization
 extensions.
 
 - x86: support for execute-only mappings in nested EPT; reduced vmexit
 latency for TSC deadline timer (by about 30%) on Intel hosts; support for
 more than 255 vCPUs.
 
 - PPC: bugfixes.
 
 The ugly bit is the conflicts.  A couple of them are simple conflicts due
 to 4.7 fixes, but most of them are with other trees. There was definitely
 too much reliance on Acked-by here.  Some conflicts are for KVM patches
 where _I_ gave my Acked-by, but the worst are for this pull request's
 patches that touch files outside arch/*/kvm.  KVM submaintainers should
 probably learn to synchronize better with arch maintainers, with the
 latter providing topic branches whenever possible instead of Acked-by.
 This is what we do with arch/x86.  And I should learn to refuse pull
 requests when linux-next sends scary signals, even if that means that
 submaintainers have to rebase their branches.
 
 Anyhow, here's the list:
 
 - arch/x86/kvm/vmx.c: handle_pcommit and EXIT_REASON_PCOMMIT was removed
 by the nvdimm tree.  This tree adds handle_preemption_timer and
 EXIT_REASON_PREEMPTION_TIMER at the same place.  In general all mentions
 of pcommit have to go.
 
 There is also a conflict between a stable fix and this patch, where the
 stable fix removed the vmx_create_pml_buffer function and its call.
 
 - virt/kvm/kvm_main.c: kvm_cpu_notifier was removed by the hotplug tree.
 This tree adds kvm_io_bus_get_dev at the same place.
 
 - virt/kvm/arm/vgic.c: a few final bugfixes went into 4.7 before the
 file was completely removed for 4.8.
 
 - include/linux/irqchip/arm-gic-v3.h: this one is entirely our fault;
 this is a change that should have gone in through the irqchip tree and
 pulled by kvm-arm.  I think I would have rejected this kvm-arm pull
 request.  The KVM version is the right one, except that it lacks
 GITS_BASER_PAGES_SHIFT.
 
 - arch/powerpc: what a mess.  For the idle_book3s.S conflict, the KVM
 tree is the right one; everything else is trivial.  In this case I am
 not quite sure what went wrong.  The commit that is causing the mess
 (fd7bacbca4, "KVM: PPC: Book3S HV: Fix TB corruption in guest exit
 path on HMI interrupt", 2016-05-15) touches both arch/powerpc/kernel/
 and arch/powerpc/kvm/.  It's large, but at 396 insertions/5 deletions
 I guessed that it wasn't really possible to split it and that the 5
 deletions wouldn't conflict.  That wasn't the case.
 
 - arch/s390: also messy.  First is hypfs_diag.c where the KVM tree
 moved some code and the s390 tree patched it.  You have to reapply the
 relevant part of commits 6c22c98637, plus all of e030c1125e, to
 arch/s390/kernel/diag.c.  Or pick the linux-next conflict
 resolution from http://marc.info/?l=kvm&m=146717549531603&w=2.
 Second, there is a conflict in gmap.c between a stable fix and 4.8.
 The KVM version here is the correct one.
 
 I have pushed my resolution at refs/heads/merge-20160802 (commit
 3d1f53419842) at git://git.kernel.org/pub/scm/virt/kvm/kvm.git.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJXoGm7AAoJEL/70l94x66DugQIAIj703ePAFepB/fCrKHkZZia
 SGrsBdvAtNsOhr7FQ5qvvjLxiv/cv7CymeuJivX8H+4kuUHUllDzey+RPHYHD9X7
 U6n1PdCH9F15a3IXc8tDjlDdOMNIKJixYuq1UyNZMU6NFwl00+TZf9JF8A2US65b
 x/41W98ilL6nNBAsoDVmCLtPNWAqQ3lajaZELGfcqRQ9ZGKcAYOaLFXHv2YHf2XC
 qIDMf+slBGSQ66UoATnYV2gAopNlWbZ7n0vO6tE2KyvhHZ1m399aBX1+k8la/0JI
 69r+Tz7ZHUSFtmlmyByi5IAB87myy2WQHyAPwj+4vwJkDGPcl0TrupzbG7+T05Y=
 =42ti
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:

 - ARM: GICv3 ITS emulation and various fixes.  Removal of the
   old VGIC implementation.

 - s390: support for trapping software breakpoints, nested
   virtualization (vSIE), the STHYI opcode, initial extensions
   for CPU model support.

 - MIPS: support for MIPS64 hosts (32-bit guests only) and lots
   of cleanups, preliminary to this and the upcoming support for
   hardware virtualization extensions.

 - x86: support for execute-only mappings in nested EPT; reduced
   vmexit latency for TSC deadline timer (by about 30%) on Intel
   hosts; support for more than 255 vCPUs.

 - PPC: bugfixes.

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (302 commits)
  KVM: PPC: Introduce KVM_CAP_PPC_HTM
  MIPS: Select HAVE_KVM for MIPS64_R{2,6}
  MIPS: KVM: Reset CP0_PageMask during host TLB flush
  MIPS: KVM: Fix ptr->int cast via KVM_GUEST_KSEGX()
  MIPS: KVM: Sign extend MFC0/RDHWR results
  MIPS: KVM: Fix 64-bit big endian dynamic translation
  MIPS: KVM: Fail if ebase doesn't fit in CP0_EBase
  MIPS: KVM: Use 64-bit CP0_EBase when appropriate
  MIPS: KVM: Set CP0_Status.KX on MIPS64
  MIPS: KVM: Make entry code MIPS64 friendly
  MIPS: KVM: Use kmap instead of CKSEG0ADDR()
  MIPS: KVM: Use virt_to_phys() to get commpage PFN
  MIPS: Fix definition of KSEGX() for 64-bit
  KVM: VMX: Add VMCS to CPU's loaded VMCSs before VMPTRLD
  kvm: x86: nVMX: maintain internal copy of current VMCS
  KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE
  KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures
  KVM: arm64: vgic-its: Simplify MAPI error handling
  KVM: arm64: vgic-its: Make vgic_its_cmd_handle_mapi similar to other handlers
  KVM: arm64: vgic-its: Turn device_id validation into generic ID validation
  ...
2016-08-02 16:11:27 -04:00
Marc Zyngier
0996353f8e arm/arm64: KVM: Make default HYP mappings non-excutable
Structures that can be generally written to don't have any requirement
to be executable (quite the opposite). This includes the kvm and vcpu
structures, as well as the stacks.

Let's change the default to incorporate the XN flag.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-06-29 14:01:34 +02:00
Marc Zyngier
5900270550 arm/arm64: KVM: Map the HYP text as read-only
There should be no reason for mapping the HYP text read/write.

As such, let's have a new set of flags (PAGE_HYP_EXEC) that allows
execution, but makes the page as read-only, and update the two call
sites that deal with mapping code.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-06-29 14:01:34 +02:00
Marc Zyngier
74a6b8885f arm/arm64: KVM: Enforce HYP read-only mapping of the kernel's rodata section
In order to be able to use C code in HYP, we're now mapping the kernel's
rodata in HYP. It works absolutely fine, except that we're mapping it RWX,
which is not what it should be.

Add a new HYP_PAGE_RO protection, and pass it as the protection flags
when mapping the rodata section.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2016-06-29 13:59:14 +02:00
Will Deacon
6245318869 ARM: 8578/1: mm: ensure pmd_present only checks the valid bit
In a subsequent patch, pmd_mknotpresent will clear the valid bit of the
pmd entry, resulting in a not-present entry from the hardware's
perspective. Unfortunately, pmd_present simply checks for a non-zero pmd
value and will therefore continue to return true even after a
pmd_mknotpresent operation. Since pmd_mknotpresent is only used for
managing huge entries, this is only an issue for the 3-level case.

This patch fixes the 3-level pmd_present implementation to take into
account the valid bit. For bisectability, the change is made before the
fix to pmd_mknotpresent.

[catalin.marinas@arm.com: comment update regarding pmd_mknotpresent patch]

Fixes: 8d96250700 ("ARM: mm: Transparent huge page support for LPAE systems.")
Cc: <stable@vger.kernel.org> # 3.11+
Cc: Russell King <linux@armlinux.org.uk>
Cc: Steve Capper <Steve.Capper@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2016-06-09 17:51:47 +01:00
Nicolas Pitre
6ff0966052 ARM: 8432/1: move VMALLOC_END from 0xff000000 to 0xff800000
There is a 12MB unused region in our memory map between the vmalloc and
fixmap areas. This became unused with commit e9da6e9905, confirmed
with commit 64d3b6a3f4.

We also have a 8MB guard area before the vmalloc area.  With the default
240MB vmalloc area size and the current VMALLOC_END definition, that
means the end of low memory ends up at 0xef800000 which is unfortunate
for 768MB machines where 8MB of RAM is lost to himem.

Let's move VMALLOC_END to 0xff800000 so the guard area won't chop the
top of the 768MB low memory area while keeping the default vmalloc area
size unchanged and still preserving a gap between the vmalloc and fixmap
areas.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2015-09-22 08:13:57 +01:00
Kirill A. Shutemov
b007ea798f arm: drop L_PTE_FILE and pte_file()-related helpers
We've replaced remap_file_pages(2) implementation with emulation.  Nobody
creates non-linear mapping anymore.

This patch also adjust __SWP_TYPE_SHIFT, effectively increase size of
possible swap file to 128G.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Russell King <linux@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-02-10 14:30:31 -08:00
Jungseung Lee
1f92f77ab6 ARM: 8239/1: Introduce {set,clear}_pte_bit
Introduce helper functions for pte_mk* functions and it would be
used to change individual bits in ptes at times.

Signed-off-by: Jungseung Lee <js07.lee@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-12-03 16:00:06 +00:00
Linus Torvalds
8a5de18239 Second batch of changes for KVM/{arm,arm64} for 3.18
- Support for 48bit IPA and VA (EL2)
 - A number of fixes for devices mapped into guests
 - Yet another VGIC fix for BE
 - A fix for CPU hotplug
 - A few compile fixes (disabled VGIC, strict mm checks)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJUQkyTAAoJECPQ0LrRPXpDwrUP/1WbELgB74W35CJ1zIc4KuBi
 unP1muW3QAr9Vmp/KovRKyLKFiRaTRlQsszaI78f4ZQ++0vzivU8dZwV81Gn1y/v
 0qF63OB0UYsOgXMRrh5JTEqzUyNyNBLUH+FAQiEO/srDoH5WLp3Zq7ThjzjwGn7Q
 K2ArxFiml+p2BGIGKWe3XIrxNgpW4oWhfe1kW4WU7sshuJlut3Nee+q2lSIg9mZx
 2VXYnLNzSsHizgQHuVEyXIqn8HA5FSCvjBYIUcLERlWB0I66WvzOqg9rH/BmNNR2
 H+cBDY+9D8KBUBG9zZSG7hZ0mAONKcOnxGZWGzte3Oi7FMZkB3Y/zrIs0na4iB5Y
 FxE8j+2qclZk9fkHQ7wn9Ws8hpGR2OrFlc2O5ZoBJJ2KJ4wMRHMeEbYRBCRQbTCN
 +81SUW7mh2j/La0JBqZ6DhhTiymUdIB+6v78im9WGlHsFRAIHBt0Q0u/pIyY+GJs
 OH7FoswI3vF5iODlHeRO1yjaO3rkj+IJwqTuUuhAIGu9+qnof3ge+eM1cOqrudNa
 u2kDz+BC21+Q8dflOF99Ryz7cMWqMiwtR+N+OUYpxc7RL7mCeHVANJxdWIFHKa2Z
 XJaHmbKjmw8AoR0fbS6YWOl2xmIhqU+FAngI+mow/Hz4pJDpR2K3w17ASXIVVQVX
 go2bvGHONkdn8Ji3Asap
 =3fl5
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-for-3.18-take-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm

Pull second batch of changes for KVM/{arm,arm64} from Marc Zyngier:
 "The most obvious thing is the sizeable MMU changes to support 48bit
  VAs on arm64.

  Summary:

   - support for 48bit IPA and VA (EL2)
   - a number of fixes for devices mapped into guests
   - yet another VGIC fix for BE
   - a fix for CPU hotplug
   - a few compile fixes (disabled VGIC, strict mm checks)"

[ I'm pulling directly from Marc at the request of Paolo Bonzini, whose
  backpack was stolen at Düsseldorf airport and will do new keys and
  rebuild his web of trust.    - Linus ]

* tag 'kvm-arm-for-3.18-take-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm:
  arm/arm64: KVM: Fix BE accesses to GICv2 EISR and ELRSR regs
  arm: kvm: STRICT_MM_TYPECHECKS fix for user_mem_abort
  arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE
  arm64: KVM: Implement 48 VA support for KVM EL2 and Stage-2
  arm/arm64: KVM: map MMIO regions at creation time
  arm64: kvm: define PAGE_S2_DEVICE as read-only by default
  ARM: kvm: define PAGE_S2_DEVICE as read-only by default
  arm/arm64: KVM: add 'writable' parameter to kvm_phys_addr_ioremap
  arm/arm64: KVM: fix potential NULL dereference in user_mem_abort()
  arm/arm64: KVM: use __GFP_ZERO not memset() to get zeroed pages
  ARM: KVM: fix vgic-disabled build
  arm: kvm: fix CPU hotplug
2014-10-18 14:32:31 -07:00
Ard Biesheuvel
903ed3a54d ARM: kvm: define PAGE_S2_DEVICE as read-only by default
Now that we support read-only memslots, we need to make sure that
pass-through device mappings are not mapped writable if the guest
has requested them to be read-only. The existing implementation
already honours this by calling kvm_set_s2pte_writable() on the new
pte in case of writable mappings, so all we need to do is define
the default pgprot_t value used for devices to be PTE_S2_RDONLY.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2014-10-10 13:07:37 +02:00
Steve Capper
bd951303be arm: mm: introduce special ptes for LPAE
We need a mechanism to tag ptes as being special, this indicates that no
attempt should be made to access the underlying struct page * associated
with the pte.  This is used by the fast_gup when operating on ptes as it
has no means to access VMAs (that also contain this information)
locklessly.

The L_PTE_SPECIAL bit is already allocated for LPAE, this patch modifies
pte_special and pte_mkspecial to make use of it, and defines
__HAVE_ARCH_PTE_SPECIAL.

This patch also excludes special ptes from the icache/dcache sync logic.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dann Frazier <dann.frazier@canonical.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-10-09 22:26:00 -04:00
Steven Capper
f295070687 ARM: 8108/1: mm: Introduce {pte,pmd}_isset and {pte,pmd}_isclear
Long descriptors on ARM are 64 bits, and some pte functions such as
pte_dirty return a bitwise-and of a flag with the pte value. If the
flag to be tested resides in the upper 32 bits of the pte, then we run
into the danger of the result being dropped if downcast.

For example:
	gather_stats(page, md, pte_dirty(*pte), 1);
where pte_dirty(*pte) is downcast to an int.

This patch introduces a new macro pte_isset which performs the bitwise
and, then performs a double logical invert (where needed) to ensure
predictable downcasting. The logical inverse pte_isclear is also
introduced.

Equivalent pmd functions for Transparent HugePages have also been
added.

Signed-off-by: Steve Capper <steve.capper@linaro.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-07-24 14:27:07 +01:00
Will Deacon
1971188aa1 ARM: 7985/1: mm: implement pte_accessible for faulting mappings
The pte_accessible macro can be used to identify page table entries
capable of being cached by a TLB. In principle, this differs from
pte_present, since PROT_NONE mappings are mapped using invalid entries
identified as present and ptes designated as `old' can use either
invalid entries or those with the access flag cleared (guaranteed not to
be in the TLB). However, there is a race to take care of, as described
in 2084140594 ("mm: fix TLB flush race between migration, and
change_protection_range"), between a page being migrated and mprotected
at the same time. In this case, we can check whether a TLB invalidation
is pending for the mm and if so, temporarily consider PROT_NONE mappings
as valid.

This patch implements a quick pte_accessible macro for ARM by simply
checking if the pte is valid/present depending on the mm. For classic
MMU, these checks are identical and will generate some false positives
for PROT_NONE mappings, but this is better than the current asm-generic
definition of ((void)(pte),1).

Finally, pte_present_user is moved to use pte_valid (and renamed
appropriately) since we don't care about cache flushing for faulting
mappings.

Acked-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2014-02-25 11:32:40 +00:00
Russell King
6f14d778c1 Merge branches 'amba', 'fixes', 'kees', 'misc' and 'unstable/sa11x0' into for-next 2014-01-21 21:26:33 +00:00
Laura Abbott
27ec8da4ac ARM: add definitions for pte_mkexec/pte_mknexec
Other architectures define pte_mkexec to mark a pte as executable.
Add pte_mkexec for ARM to get the same functionality. Although no
other architectures currently define it, also add pte_mknexec to
explicitly allow a pte to be marked as non executable.

Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-12-11 09:53:18 +00:00
Russell King
d8aa712c30 ARM: fix booting low-vectors machines
Commit f6f91b0d9f (ARM: allow kuser helpers to be removed from the
vector page) required two pages for the vectors code.  Although the
code setting up the initial page tables was updated, the code which
allocates page tables for new processes wasn't, neither was the code
which tears down the mappings.  Fix this.

Fixes: f6f91b0d9f ("ARM: allow kuser helpers to be removed from the vector page")
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: <stable@vger.kernel.org>
2013-11-30 14:45:31 +00:00
Christoffer Dall
8947c09d05 ARM: 7808/1: KVM: mm: Get rid of L_PTE_USER ref from PAGE_S2_DEVICE
THe L_PTE_USER actually has nothing to do with stage 2 mappings and the
L_PTE_S2_RDWR value sets the readable bit, which was what L_PTE_USER
was used for before proper handling of stage 2 memory defines.

Changelog:
  [v3]: Drop call to kvm_set_s2pte_writable in mmu.c
  [v2]: Change default mappings to be r/w instead of r/o, as per Marc
     Zyngier's suggestion.

Cc: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-08-13 20:25:06 +01:00
Linus Torvalds
fb2af0020a Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm
Pull ARM updates from Russell King:
 "This contains the usual updates from other people (listed below) and
  the usual random muddle of miscellaneous ARM updates which cover some
  low priority bug fixes and performance improvements.

  I've started to put the pull request wording into the merge commits,
  which are:

   - NoMMU stuff:

     This includes the following series sent earlier to the list:
      - nommu-fixes
      - R7 Support
      - MPU support

     I've left out the ARCH_MULTIPLATFORM/!MMU stuff that Arnd and I
     were discussing today until we've reached a conclusion/that's had
     some more review.

     This is rebased (and re-tested) on your devel-stable branch because
     otherwise there were going to be conflicts with Uwe's V7M work now
     that you've merged that.  I've included the fix for limiting MPU to
     CPU_V7.

   - Huge page support

     These changes bring both HugeTLB support and Transparent HugePage
     (THP) support to ARM.  Only long descriptors (LPAE) are supported
     in this series.

     The code has been tested on an Arndale board (Exynos 5250).

   - LPAE updates

     Please pull these miscellaneous LPAE fixes I've been collecting for
     a while now for 3.11.  They've been tested and reviewed by quite a
     few people, and most of the patches are pretty trivial.  -- Will Deacon.

   - arch_timer cleanups

     Please pull these arch_timer cleanups I've been holding onto for a
     while.  They're the same as my last posting, but have been rebased
     to v3.10-rc3.

   - mpidr linearisation (multiprocessor id register - identifies which
     CPU number we are in the system)

     This patch series that implements MPIDR linearization through a
     simple hashing algorithm and updates current cpu_{suspend}/{resume}
     code to use the newly created hash structures to retrieve context
     pointers.  It represents a stepping stone for the implementation of
     power management code on forthcoming multi-cluster ARM systems.

     It has been tested on TC2 (dual cluster A15xA7 system), iMX6q,
     OMAP4 and Tegra, with processors hitting low-power states requiring
     warm-boot resume through the cpu_resume code path"

* 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (77 commits)
  ARM: 7775/1: mm: Remove do_sect_fault from LPAE code
  ARM: 7777/1: Avoid extra calls to the C compiler
  ARM: 7774/1: Fix dtb dependency to use order-only prerequisites
  ARM: 7770/1: remove residual ARMv2 support from decompressor
  ARM: 7769/1: Cortex-A15: fix erratum 798181 implementation
  ARM: 7768/1: prevent risks of out-of-bound access in ASID allocator
  ARM: 7767/1: let the ASID allocator handle suspended animation
  ARM: 7766/1: versatile: don't mark pen as __INIT
  ARM: 7765/1: perf: Record the user-mode PC in the call chain.
  ARM: 7735/2: Preserve the user r/w register TPIDRURW on context switch and fork
  ARM: kernel: implement stack pointer save array through MPIDR hashing
  ARM: kernel: build MPIDR hash function data structure
  ARM: mpu: Ensure that MPU depends on CPU_V7
  ARM: mpu: protect the vectors page with an MPU region
  ARM: mpu: Allow enabling of the MPU via kconfig
  ARM: 7758/1: introduce config HAS_BANDGAP
  ARM: 7757/1: mm: don't flush icache in switch_mm with hardware broadcasting
  ARM: 7751/1: zImage: don't overwrite ourself with a page table
  ARM: 7749/1: spinlock: retry trylock operation if strex fails on free lock
  ARM: 7748/1: oabi: handle faults when loading swi instruction from userspace
  ...
2013-07-03 09:46:29 -07:00
Al Viro
40d158e618 consolidate io_remap_pfn_range definitions
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2013-06-29 12:46:35 +04:00
Catalin Marinas
8d96250700 ARM: mm: Transparent huge page support for LPAE systems.
The patch adds support for THP (transparent huge pages) to LPAE
systems. When this feature is enabled, the kernel tries to map
anonymous pages as 2MB sections where possible.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
[steve.capper@linaro.org: symbolic constants used, value of
PMD_SECT_SPLITTING adjusted, tlbflush.h included in pgtable.h,
added PROT_NONE support.]
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
2013-06-04 16:52:38 +01:00
Catalin Marinas
104ad3b32d arm: set the page table freeing ceiling to TASK_SIZE
ARM processors with LPAE enabled use 3 levels of page tables, with an
entry in the top level (pgd) covering 1GB of virtual space.  Because of
the branch relocation limitations on ARM, the loadable modules are
mapped 16MB below PAGE_OFFSET, making the corresponding 1GB pgd shared
between kernel modules and user space.

If free_pgtables() is called with the default ceiling 0,
free_pgd_range() (and subsequently called functions) also frees the page
table shared between user space and kernel modules (which is normally
handled by the ARM-specific pgd_free() function).  This patch changes
defines the ARM USER_PGTABLES_CEILING to TASK_SIZE when CONFIG_ARM_LPAE
is enabled.

Note that the pgd_free() function already checks the presence of the
shared pmd page allocated by pgd_alloc() and frees it, though with
ceiling 0 this wasn't necessary.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>	[3.3+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-29 15:54:34 -07:00
Russell King
16af43fef8 Merge branches 'devel-stable', 'fixes' and 'mmci' into for-linus 2013-03-03 00:32:50 +00:00
Catalin Marinas
69dde4c52d ARM: 7654/1: Preserve L_PTE_VALID in pte_modify()
Following commit 26ffd0d4 (ARM: mm: introduce present, faulting entries
for PAGE_NONE), if a page has been mapped as PROT_NONE, the L_PTE_VALID
bit is cleared by the set_pte_ext() code. With LPAE the software and
hardware pte share the same location and subsequent modifications of pte
range (change_protection()) will leave the L_PTE_VALID bit cleared.

This patch adds the L_PTE_VALID bit to the newprot mask in pte_modify().

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Subash Patel <subash.rp@samsung.com>
Tested-by: Subash Patel <subash.rp@samsung.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: <stable@vger.kernel.org> # 3.8.x
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2013-02-21 13:25:37 +00:00
Christoffer Dall
cc577c26e2 ARM: Add page table and page defines needed by KVM
KVM uses the stage-2 page tables and the Hyp page table format,
so we define the fields and page protection flags needed by KVM.

The nomenclature is this:
 - page_hyp:        PL2 code/data mappings
 - page_hyp_device: PL2 device mappings (vgic access)
 - page_s2:         Stage-2 code/data page mappings
 - page_s2_device:  Stage-2 device mappings (vgic access)

Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Christoffer Dall <c.dall@virtualopensystems.com>
2013-01-23 13:29:08 -05:00
Will Deacon
26ffd0d43b ARM: mm: introduce present, faulting entries for PAGE_NONE
PROT_NONE mappings apply the page protection attributes defined by _P000
which translate to PAGE_NONE for ARM. These attributes specify an XN,
RDONLY pte that is inaccessible to userspace. However, on kernels
configured without support for domains, such a pte *is* accessible to
the kernel and can be read via get_user, allowing tasks to read
PROT_NONE pages via syscalls such as read/write over a pipe.

This patch introduces a new software pte flag, L_PTE_NONE, that is set
to identify faulting, present entries.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2012-11-09 14:13:20 +00:00
Will Deacon
dbf62d5006 ARM: mm: introduce L_PTE_VALID for page table entries
For long-descriptor translation table formats, the ARMv7 architecture
defines the last two bits of the second- and third-level descriptors to
be:

	x0b	- Invalid
	01b	- Block (second-level), Reserved (third-level)
	11b	- Table (second-level), Page (third-level)

This allows us to define L_PTE_PRESENT as (3 << 0) and use this value to
create ptes directly. However, when determining whether a given pte
value is present in the low-level page table accessors, we only need to
check the least significant bit of the descriptor, allowing us to write
faulting, present entries which are required for PROT_NONE mappings.

This patch introduces L_PTE_VALID, which can be used to test whether a
pte should fault, and updates the low-level page table accessors
accordingly.

Signed-off-by: Will Deacon <will.deacon@arm.com>
2012-11-09 14:13:19 +00:00
David Howells
a1ce39288e UAPI: (Scripted) Convert #include "..." to #include <path/...> in kernel system headers
Convert #include "..." to #include <path/...> in kernel system headers.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>
2012-10-02 18:01:25 +01:00
Will Deacon
f5f2025ef3 ARM: 7488/1: mm: use 5 bits for swapfile type encoding
Page migration encodes the pfn in the offset field of a swp_entry_t.
For LPAE, we support physical addresses of up to 36 bits (due to
sparsemem limitations with the size of page flags), requiring 24 bits
to represent a pfn. A further 3 bits are used to encode a swp_entry into
a pte, leaving 5 bits for the type field. Furthermore, the core code
defines MAX_SWAPFILES_SHIFT as 5, so the additional type bit does not
get used.

This patch reduces the width of the type field to 5 bits, allowing us
to create up to 31 swapfiles of 64GB each.

Cc: <stable@vger.kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-08-11 09:15:59 +01:00
Will Deacon
47f1204329 ARM: 7487/1: mm: avoid setting nG bit for user mappings that aren't present
Swap entries are encoding in ptes such that !pte_present(pte) and
pte_file(pte). The remaining bits of the descriptor are used to identify
the swapfile and offset within it to the swap entry.

When writing such a pte for a user virtual address, set_pte_at
unconditionally sets the nG bit, which (in the case of LPAE) will
corrupt the swapfile offset and lead to a BUG:

[  140.494067] swap_free: Unused swap offset entry 000763b4
[  140.509989] BUG: Bad page map in process rs:main Q:Reg  pte:0ec76800 pmd:8f92e003

This patch fixes the problem by only setting the nG bit for user
mappings that are actually present.

Cc: <stable@vger.kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-08-11 09:15:59 +01:00
Russell King
2e0e943436 Merge branch 'devel-stable' into for-linus
Conflicts:
	arch/arm/kernel/setup.c
	arch/arm/mach-shmobile/board-kota2.c
2012-01-05 13:24:33 +00:00
Russell King
e0b58ee8c4 Merge branch 'vmalloc' of git://git.linaro.org/people/nico/linux into devel-stable 2012-01-04 09:01:51 +00:00
Nicolas Pitre
9561f4e052 Revert "ARM: move VMALLOC_END down temporarily for shmobile"
This reverts commit 0af362f844 as shmobile
is not using a non-standard memory layout anymore.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
2012-01-02 23:14:35 -05:00
Russell King
6ae25a5b9d Merge branch 'for-rmk' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux into devel-stable
Conflicts:
	arch/arm/mm/ioremap.c
2011-12-08 18:02:04 +00:00
Catalin Marinas
dcfdae04bd ARM: LPAE: Introduce the 3-level page table format definitions
This patch introduces the pgtable-3level*.h files with definitions
specific to the LPAE page table format (3 levels of page tables).

Each table is 4KB and has 512 64-bit entries. An entry can point to a
40-bit physical address. The young, write and exec software bits share
the corresponding hardware bits (negated). Other software bits use spare
bits in the PTE.

The patch also changes some variable types from unsigned long or int to
pteval_t or pgprot_t.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2011-12-08 10:30:39 +00:00
Catalin Marinas
e0c0313bd7 ARM: LPAE: Move page table maintenance macros to pgtable-2level.h
The page table maintenance macros need to be duplicated between the
classic and the LPAE MMU so this patch moves those that are not common
to the pgtable-2level.h file.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2011-12-08 10:30:37 +00:00
Russell King
a32618d28d ARM: pgtable: switch to use pgtable-nopud.h
Nick Piggin noted upon introducing 4level-fixup.h:

| Add a temporary "fallback" header so architectures can run with
| the 4level pagetables patch without modification. All architectures
| should be converted to use the folding headers (include/asm-generic/
| pgtable-nop?d.h) as soon as possible, and the fallback header removed.

This makes ARM compliant with this statement.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2011-12-08 10:30:36 +00:00
Russell King
3ee0fc5ca1 Merge branch 'kexec/idmap' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into devel-stable 2011-12-06 20:27:54 +00:00
Will Deacon
8903826d0c ARM: idmap: populate identity map pgd at init time using .init.text
When disabling and re-enabling the MMU, it is necessary to take out an
identity mapping for the code that manipulates the SCTLR in order to
avoid it disappearing from under our feet. This is useful when soft
rebooting and returning from CPU suspend.

This patch allocates a set of page tables during boot and populates them
with an identity mapping for the .idmap.text section. This means that
users of the identity map do not need to manage their own pgd and can
instead annotate their functions with __idmap or, in the case of assembly
code, place them in the correct section.

Acked-by: Dave Martin <dave.martin@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Lorenzo Pieralisi <Lorenzo.Pieralisi@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2011-12-06 14:04:14 +00:00
Rob Herring
7dbaa46678 ARM: 7169/1: topdown mmap support
Similar to other architectures, this adds topdown mmap support in user
process address space allocation policy. This allows mmap sizes greater
than 2GB. This support is largely copied from MIPS and the generic
implementations.

The address space randomization is moved into arch_pick_mmap_layout.

Tested on V-Express with ubuntu and a mmap test from here:
https://bugs.launchpad.net/bugs/861296

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-12-06 11:15:25 +00:00
Nicolas Pitre
0af362f844 ARM: move VMALLOC_END down temporarily for shmobile
THIS IS A TEMPORARY HACK.  The purpose of this is _only_ to avoid a
regression on an existing machine while a better fix is implemented.

On shmobile the consistent DMA memory area was set to 158MB in commit
28f0721a79 with no explanation.  The documented size for this area should
vary between 2MB and 14MB, and none of the other ARM targets exceed that.

The included #warning is therefore meant to be noisy on purpose to get
shmobile maintainers attention and this commit reverted once this
consistent DMA size conflict is resolved.

Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Magnus Damm <damm@opensource.se>
Cc: Paul Mundt <lethal@linux-sh.org>
2011-11-26 19:21:30 -05:00
Nicolas Pitre
0536bdf33f ARM: move iotable mappings within the vmalloc region
In order to remove the build time variation between different SOCs with
regards to VMALLOC_END, the iotable mappings are now allocated inside
the vmalloc region.  This allows for VMALLOC_END to be identical across
all machines.

The value for VMALLOC_END is now set to 0xff000000 which is right where
the consistent DMA area starts.

To accommodate all static mappings on machines with possible highmem usage,
the default vmalloc area size is changed to 240 MB so that VMALLOC_START
is no higher than 0xf0000000 by default.

Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Tested-by: Stephen Warren <swarren@nvidia.com>
Tested-by: Kevin Hilman <khilman@ti.com>
Tested-by: Jamie Iles <jamie@jamieiles.com>
2011-11-26 19:21:26 -05:00
Linus Torvalds
1fdb24e969 Merge branch 'devel-stable' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm
* 'devel-stable' of http://ftp.arm.linux.org.uk/pub/linux/arm/kernel/git-cur/linux-2.6-arm: (178 commits)
  ARM: 7139/1: fix compilation with CONFIG_ARM_ATAG_DTB_COMPAT and large TEXT_OFFSET
  ARM: gic, local timers: use the request_percpu_irq() interface
  ARM: gic: consolidate PPI handling
  ARM: switch from NO_MACH_MEMORY_H to NEED_MACH_MEMORY_H
  ARM: mach-s5p64x0: remove mach/memory.h
  ARM: mach-s3c64xx: remove mach/memory.h
  ARM: plat-mxc: remove mach/memory.h
  ARM: mach-prima2: remove mach/memory.h
  ARM: mach-zynq: remove mach/memory.h
  ARM: mach-bcmring: remove mach/memory.h
  ARM: mach-davinci: remove mach/memory.h
  ARM: mach-pxa: remove mach/memory.h
  ARM: mach-ixp4xx: remove mach/memory.h
  ARM: mach-h720x: remove mach/memory.h
  ARM: mach-vt8500: remove mach/memory.h
  ARM: mach-s5pc100: remove mach/memory.h
  ARM: mach-tegra: remove mach/memory.h
  ARM: plat-tcc: remove mach/memory.h
  ARM: mach-mmp: remove mach/memory.h
  ARM: mach-cns3xxx: remove mach/memory.h
  ...

Fix up mostly pretty trivial conflicts in:
 - arch/arm/Kconfig
 - arch/arm/include/asm/localtimer.h
 - arch/arm/kernel/Makefile
 - arch/arm/mach-shmobile/board-ap4evb.c
 - arch/arm/mach-u300/core.c
 - arch/arm/mm/dma-mapping.c
 - arch/arm/mm/proc-v7.S
 - arch/arm/plat-omap/Kconfig
largely due to some CONFIG option renaming (ie CONFIG_PM_SLEEP ->
CONFIG_ARM_CPU_SUSPEND for the arm-specific suspend code etc) and
addition of NEED_MACH_MEMORY_H next to HAVE_IDE.
2011-10-28 12:02:27 -07:00
Catalin Marinas
d7c5d0dcff ARM: 7077/1: LPAE: Use a mask for physical addresses in page table entries
With LPAE, the physical address mask is 40-bit while the page table
entry is 64-bit. This patch introduces PHYS_MASK for the 2-level page
table format, defined as ~0UL.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-06 15:40:06 +01:00
Catalin Marinas
17f5721196 ARM: 7075/1: LPAE: Factor out 2-level page table definitions into separate files
This patch moves page table definitions from asm/page.h, asm/pgtable.h
and asm/ptgable-hwdef.h into corresponding *-2level* files.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-10-06 15:40:05 +01:00
Santosh Shilimkar
8fb54284ba ARM: mm: Add strongly ordered descriptor support.
On certain architectures, there might be a need to mark certain
addresses with strongly ordered memory attributes to avoid ordering
issues at the interconnect level.

On OMAP4, the asynchronous bridge buffers can only be drained
with strongly ordered accesses and hence the need to mark the
memory strongly ordered.

Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Woodruff Richard <r-woodruff2@ti.com>
Tested-by: Vishwanath BS <vishwanath.bs@ti.com>
2011-09-23 12:05:30 +05:30
Russell King
516295e5ab ARM: pgtable: add pud-level code
Add pud_offset() et.al. between the pgd and pmd code in preparation of
using pgtable-nopud.h rather than 4level-fixup.h.

This incorporates a fix from Jamie Iles <jamie@jamieiles.com> for
uaccess_with_memcpy.c.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-02-21 19:24:14 +00:00
Will Deacon
cae6292b65 ARM: 6672/1: LPAE: use phys_addr_t instead of unsigned long in mapping functions
The unsigned long datatype is not sufficient for mapping physical addresses
>= 4GB.

This patch ensures that the phys_addr_t datatype is used to represent physical
addresses when converting from a PFN.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2011-02-15 14:20:23 +00:00