linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-12-15 15:04:27 +08:00

Author	SHA1	Message	Date
Kirill A. Shutemov	ef298cc567	arm/mm: provide pmdp_establish() helper ARM LPAE doesn't have hardware dirty/accessed bits. generic_pmdp_establish() is the right implementation of pmdp_establish for this case. Link: http://lkml.kernel.org/r/20171213105756.69879-4-kirill.shutemov@linux.intel.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-01-31 17:18:37 -08:00
Dan Williams	e4e40e0263	mm: switch to 'define pmd_write' instead of __HAVE_ARCH_PMD_WRITE In response to compile breakage introduced by a series that added the pud_write helper to x86, Stephen notes: did you consider using the other paradigm: In arch include files: #define pud_write pud_write static inline int pud_write(pud_t pud) ..... Then in include/asm-generic/pgtable.h: #ifndef pud_write tatic inline int pud_write(pud_t pud) { .... } #endif If you had, then the powerpc code would have worked ... ;-) and many of the other interfaces in include/asm-generic/pgtable.h are protected that way ... Given that some architecture already define pmd_write() as a macro, it's a net reduction to drop the definition of __HAVE_ARCH_PMD_WRITE. Link: http://lkml.kernel.org/r/151129126721.37405.13339850900081557813.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Suggested-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: Oliver OHalloran <oliveroh@au1.ibm.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-11-29 18:40:42 -08:00
Steve Capper	56530f5d2d	ARM: 8579/1: mm: Fix definition of pmd_mknotpresent Currently pmd_mknotpresent will use a zero entry to respresent an invalidated pmd. Unfortunately this definition clashes with pmd_none, thus it is possible for a race condition to occur if zap_pmd_range sees pmd_none whilst __split_huge_pmd_locked is running too with pmdp_invalidate just called. This patch fixes the race condition by modifying pmd_mknotpresent to create non-zero faulting entries (as is done in other architectures), removing the ambiguity with pmd_none. [catalin.marinas@arm.com: using L_PMD_SECT_VALID instead of PMD_TYPE_SECT] Fixes: `8d96250700` ("ARM: mm: Transparent huge page support for LPAE systems.") Cc: <stable@vger.kernel.org> # 3.11+ Reported-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Russell King <linux@armlinux.org.uk> Signed-off-by: Steve Capper <steve.capper@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2016-06-09 17:51:47 +01:00
Will Deacon	6245318869	ARM: 8578/1: mm: ensure pmd_present only checks the valid bit In a subsequent patch, pmd_mknotpresent will clear the valid bit of the pmd entry, resulting in a not-present entry from the hardware's perspective. Unfortunately, pmd_present simply checks for a non-zero pmd value and will therefore continue to return true even after a pmd_mknotpresent operation. Since pmd_mknotpresent is only used for managing huge entries, this is only an issue for the 3-level case. This patch fixes the 3-level pmd_present implementation to take into account the valid bit. For bisectability, the change is made before the fix to pmd_mknotpresent. [catalin.marinas@arm.com: comment update regarding pmd_mknotpresent patch] Fixes: `8d96250700` ("ARM: mm: Transparent huge page support for LPAE systems.") Cc: <stable@vger.kernel.org> # 3.11+ Cc: Russell King <linux@armlinux.org.uk> Cc: Steve Capper <Steve.Capper@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2016-06-09 17:51:47 +01:00
Hugh Dickins	fd8cfd3000	arch: fix has_transparent_hugepage() I've just discovered that the useful-sounding has_transparent_hugepage() is actually an architecture-dependent minefield: on some arches it only builds if CONFIG_TRANSPARENT_HUGEPAGE=y, on others it's also there when not, but on some of those (arm and arm64) it then gives the wrong answer; and on mips alone it's marked __init, which would crash if called later (but so far it has not been called later). Straighten this out: make it available to all configs, with a sensible default in asm-generic/pgtable.h, removing its definitions from those arches (arc, arm, arm64, sparc, tile) which are served by the default, adding #define has_transparent_hugepage has_transparent_hugepage to those (mips, powerpc, s390, x86) which need to override the default at runtime, and removing the __init from mips (but maybe that kind of code should be avoided after init: set a static variable the first time it's called). Signed-off-by: Hugh Dickins <hughd@google.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andres Lagar-Cavilla <andreslc@google.com> Cc: Yang Shi <yang.shi@linaro.org> Cc: Ning Qu <quning@gmail.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Vineet Gupta <vgupta@synopsys.com> [arch/arc] Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [arch/s390] Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-05-19 19:12:14 -07:00
Minchan Kim	44842045e4	arch/arm/include/asm/pgtable-3level.h: add pmd_mkclean for THP MADV_FREE needs pmd_dirty and pmd_mkclean for detecting recent overwrite of the contents since MADV_FREE syscall is called for THP page. This patch adds pmd_mkclean for THP page MADV_FREE support. Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Shaohua Li <shli@kernel.org> Cc: <yalin.wang2010@gmail.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Chen Gang <gang.chen.5i5j@gmail.com> Cc: Chris Zankel <chris@zankel.net> Cc: Daniel Micay <danielmicay@gmail.com> Cc: Darrick J. Wong <darrick.wong@oracle.com> Cc: David S. Miller <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: Hugh Dickins <hughd@google.com> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Jason Evans <je@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Kirill A. Shutemov <kirill@shutemov.name> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mika Penttil <mika.penttila@nextfour.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Rik van Riel <riel@redhat.com> Cc: Roland Dreier <roland@kernel.org> Cc: Shaohua Li <shli@kernel.org> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-01-15 17:56:32 -08:00
Kirill A. Shutemov	0ebd744615	arm, thp: remove infrastructure for handling splitting PMDs With new refcounting we don't need to mark PMDs splitting. Let's drop code to handle this. pmdp_splitting_flush() is not needed too: on splitting PMD we will do pmdp_clear_flush() + set_pte_at(). pmdp_clear_flush() will do IPI as needed for fast_gup. [arnd@arndb.de: fix unterminated ifdef in header file] Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Steve Capper <steve.capper@linaro.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-01-15 17:56:32 -08:00
Linus Torvalds	b9085bcbf5	Fairly small update, but there are some interesting new features. Common: Optional support for adding a small amount of polling on each HLT instruction executed in the guest (or equivalent for other architectures). This can improve latency up to 50% on some scenarios (e.g. O_DSYNC writes or TCP_RR netperf tests). This also has to be enabled manually for now, but the plan is to auto-tune this in the future. ARM/ARM64: the highlights are support for GICv3 emulation and dirty page tracking s390: several optimizations and bugfixes. Also a first: a feature exposed by KVM (UUID and long guest name in /proc/sysinfo) before it is available in IBM's hypervisor! :) MIPS: Bugfixes. x86: Support for PML (page modification logging, a new feature in Broadwell Xeons that speeds up dirty page tracking), nested virtualization improvements (nested APICv---a nice optimization), usual round of emulation fixes. There is also a new option to reduce latency of the TSC deadline timer in the guest; this needs to be tuned manually. Some commits are common between this pull and Catalin's; I see you have already included his tree. ARM has other conflicts where functions are added in the same place by 3.19-rc and 3.20 patches. These are not large though, and entirely within KVM. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJU28rkAAoJEL/70l94x66DXqQH/1TDOfJIjW7P2kb0Sw7Fy1wi cEX1KO/VFxAqc8R0E/0Wb55CXyPjQJM6xBXuFr5cUDaIjQ8ULSktL4pEwXyyv/s5 DBDkN65mriry2w5VuEaRLVcuX9Wy+tqLQXWNkEySfyb4uhZChWWHvKEcgw5SqCyg NlpeHurYESIoNyov3jWqvBjr4OmaQENyv7t2c6q5ErIgG02V+iCux5QGbphM2IC9 LFtPKxoqhfeB2xFxTOIt8HJiXrZNwflsTejIlCl/NSEiDVLLxxHCxK2tWK/tUXMn JfLD9ytXBWtNMwInvtFm4fPmDouv2VDyR0xnK2db+/axsJZnbxqjGu1um4Dqbak= =7gdx -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM update from Paolo Bonzini: "Fairly small update, but there are some interesting new features. Common: Optional support for adding a small amount of polling on each HLT instruction executed in the guest (or equivalent for other architectures). This can improve latency up to 50% on some scenarios (e.g. O_DSYNC writes or TCP_RR netperf tests). This also has to be enabled manually for now, but the plan is to auto-tune this in the future. ARM/ARM64: The highlights are support for GICv3 emulation and dirty page tracking s390: Several optimizations and bugfixes. Also a first: a feature exposed by KVM (UUID and long guest name in /proc/sysinfo) before it is available in IBM's hypervisor! :) MIPS: Bugfixes. x86: Support for PML (page modification logging, a new feature in Broadwell Xeons that speeds up dirty page tracking), nested virtualization improvements (nested APICv---a nice optimization), usual round of emulation fixes. There is also a new option to reduce latency of the TSC deadline timer in the guest; this needs to be tuned manually. Some commits are common between this pull and Catalin's; I see you have already included his tree. Powerpc: Nothing yet. The KVM/PPC changes will come in through the PPC maintainers, because I haven't received them yet and I might end up being offline for some part of next week" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (130 commits) KVM: ia64: drop kvm.h from installed user headers KVM: x86: fix build with !CONFIG_SMP KVM: x86: emulate: correct page fault error code for NoWrite instructions KVM: Disable compat ioctl for s390 KVM: s390: add cpu model support KVM: s390: use facilities and cpu_id per KVM KVM: s390/CPACF: Choose crypto control block format s390/kernel: Update /proc/sysinfo file with Extended Name and UUID KVM: s390: reenable LPP facility KVM: s390: floating irqs: fix user triggerable endless loop kvm: add halt_poll_ns module parameter kvm: remove KVM_MMIO_SIZE KVM: MIPS: Don't leak FPU/DSP to guest KVM: MIPS: Disable HTW while in guest KVM: nVMX: Enable nested posted interrupt processing KVM: nVMX: Enable nested virtual interrupt delivery KVM: nVMX: Enable nested apic register virtualization KVM: nVMX: Make nested control MSRs per-cpu KVM: nVMX: Enable nested virtualize x2apic mode KVM: nVMX: Prepare for using hardware MSR bitmap ...	2015-02-13 09:55:09 -08:00
Mel Gorman	4d94246699	mm: convert p[te\|md]_mknonnuma and remaining page table manipulations With PROT_NONE, the traditional page table manipulation functions are sufficient. [andre.przywara@arm.com: fix compiler warning in pmdp_invalidate()] [akpm@linux-foundation.org: fix build with STRICT_MM_TYPECHECKS] Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com> Tested-by: Sasha Levin <sasha.levin@oracle.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Dave Jones <davej@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-12 18:54:08 -08:00
Kirill A. Shutemov	b007ea798f	arm: drop L_PTE_FILE and pte_file()-related helpers We've replaced remap_file_pages(2) implementation with emulation. Nobody creates non-linear mapping anymore. This patch also adjust __SWP_TYPE_SHIFT, effectively increase size of possible swap file to 128G. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2015-02-10 14:30:31 -08:00
Mario Smarduch	c64735554c	KVM: arm: Add initial dirty page locking support Add support for initial write protection of VM memslots. This patch series assumes that huge PUDs will not be used in 2nd stage tables, which is always valid on ARMv7 Acked-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>	2015-01-16 14:40:14 +01:00
Steve Capper	b8cd51afe0	arm: mm: enable RCU fast_gup Activate the RCU fast_gup for ARM. We also need to force THP splits to broadcast an IPI s.t. we block in the fast_gup page walker. As THP splits are comparatively rare, this should not lead to a noticeable performance degradation. Some pre-requisite functions pud_write and pud_page are also added. Signed-off-by: Steve Capper <steve.capper@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Dann Frazier <dann.frazier@canonical.com> Cc: Hugh Dickins <hughd@google.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Will Deacon <will.deacon@arm.com> Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-09 22:26:01 -04:00
Steve Capper	bd951303be	arm: mm: introduce special ptes for LPAE We need a mechanism to tag ptes as being special, this indicates that no attempt should be made to access the underlying struct page * associated with the pte. This is used by the fast_gup when operating on ptes as it has no means to access VMAs (that also contain this information) locklessly. The L_PTE_SPECIAL bit is already allocated for LPAE, this patch modifies pte_special and pte_mkspecial to make use of it, and defines __HAVE_ARCH_PTE_SPECIAL. This patch also excludes special ptes from the icache/dcache sync logic. Signed-off-by: Steve Capper <steve.capper@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Dann Frazier <dann.frazier@canonical.com> Cc: Hugh Dickins <hughd@google.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Will Deacon <will.deacon@arm.com> Cc: Christoffer Dall <christoffer.dall@linaro.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-10-09 22:26:00 -04:00
Steven Capper	ded9477984	ARM: 8109/1: mm: Modify pte_write and pmd_write logic for LPAE For LPAE, we have the following means for encoding writable or dirty ptes: L_PTE_DIRTY L_PTE_RDONLY !pte_dirty && !pte_write 0 1 !pte_dirty && pte_write 0 1 pte_dirty && !pte_write 1 1 pte_dirty && pte_write 1 0 So we can't distinguish between writeable clean ptes and read only ptes. This can cause problems with ptes being incorrectly flagged as read only when they are writeable but not dirty. This patch renumbers L_PTE_RDONLY from AP[2] to a software bit #58, and adds additional logic to set AP[2] whenever the pte is read only or not dirty. That way we can distinguish between clean writeable ptes and read only ptes. HugeTLB pages will use this new logic automatically. We need to add some logic to Transparent HugePages to ensure that they correctly interpret the revised pgprot permissions (L_PTE_RDONLY has moved and no longer matches PMD_SECT_AP2). In the process of revising THP, the names of the PMD software bits have been prefixed with L_ to make them easier to distinguish from their hardware bit counterparts. Signed-off-by: Steve Capper <steve.capper@linaro.org> Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-07-24 14:27:08 +01:00
Steven Capper	f295070687	ARM: 8108/1: mm: Introduce {pte,pmd}_isset and {pte,pmd}_isclear Long descriptors on ARM are 64 bits, and some pte functions such as pte_dirty return a bitwise-and of a flag with the pte value. If the flag to be tested resides in the upper 32 bits of the pte, then we run into the danger of the result being dropped if downcast. For example: gather_stats(page, md, pte_dirty(pte), 1); where pte_dirty(pte) is downcast to an int. This patch introduces a new macro pte_isset which performs the bitwise and, then performs a double logical invert (where needed) to ensure predictable downcasting. The logical inverse pte_isclear is also introduced. Equivalent pmd functions for Transparent HugePages have also been added. Signed-off-by: Steve Capper <steve.capper@linaro.org> Reviewed-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-07-24 14:27:07 +01:00
Christoffer Dall	4d9c5b89cf	ARM: 7950/1: mm: Fix stage-2 device memory attributes The stage-2 memory attributes are distinct from the Hyp memory attributes and the Stage-1 memory attributes. We were using the stage-1 memory attributes for stage-2 mappings causing device mappings to be mapped as normal memory. Add the S2 equivalent defines for memory attributes and fix the comments explaining the defines while at it. Add a prot_pte_s2 field to the mem_type struct and fill out the field for device mappings accordingly. Cc: <stable@vger.kernel.org> [3.9+] Acked-by: Marc Zyngier <marc.zyngier@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2014-02-10 11:44:05 +00:00
Russell King	1fd15b879d	ARM: add support to dump the kernel page tables This patch allows the kernel page tables to be dumped via a debugfs file, allowing kernel developers to check the layout of the kernel page tables and the verify the various permissions and type settings. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2013-12-11 09:53:13 +00:00
Linus Torvalds	f080480488	Here are the 3.13 KVM changes. There was a lot of work on the PPC side: the HV and emulation flavors can now coexist in a single kernel is probably the most interesting change from a user point of view. On the x86 side there are nested virtualization improvements and a few bugfixes. ARM got transparent huge page support, improved overcommit, and support for big endian guests. Finally, there is a new interface to connect KVM with VFIO. This helps with devices that use NoSnoop PCI transactions, letting the driver in the guest execute WBINVD instructions. This includes some nVidia cards on Windows, that fail to start without these patches and the corresponding userspace changes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJShPAhAAoJEBvWZb6bTYbyl48P/297GgmELHAGBgjvb6q7yyGu L8+eHjKbh4XBAkPwyzbvUjuww5z2hM0N3JQ0BDV9oeXlO+zwwCEns/sg2Q5/NJXq XxnTeShaKnp9lqVBnE6G9rAOUWKoyLJ2wItlvUL8JlaO9xJ0Vmk0ta4n2Nv5GqDp db6UD7vju6rHtIAhNpvvAO51kAOwc01xxRixCVb7KUYOnmO9nvpixzoI/S0Rp1gu w/OWMfCosDzBoT+cOe79Yx1OKcpaVW94X6CH1s+ShCw3wcbCL2f13Ka8/E3FIcuq vkZaLBxio7vjUAHRjPObw0XBW4InXEbhI1DjzIvm8dmc4VsgmtLQkTCG8fj+jINc dlHQUq6Do+1F4zy6WMBUj8tNeP1Z9DsABp98rQwR8+BwHoQpGQBpAxW0TE0ZMngC t1caqyvjZ5pPpFUxSrAV+8Kg4AvobXPYOim0vqV7Qea07KhFcBXLCfF7BWdwq/Jc 0CAOlsLL4mHGIQWZJuVGw0YGP7oATDCyewlBuDObx+szYCoV4fQGZVBEL0KwJx/1 7lrLN7JWzRyw6xTgJ5VVwgYE1tUY4IFQcHu7/5N+dw8/xg9KWA3f4PeMavIKSf+R qteewbtmQsxUnvuQIBHLs8NRWPnBPy+F3Sc2ckeOLIe4pmfTte6shtTXcLDL+LqH NTmT/cfmYp2BRkiCfCiS =rWNf -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM changes from Paolo Bonzini: "Here are the 3.13 KVM changes. There was a lot of work on the PPC side: the HV and emulation flavors can now coexist in a single kernel is probably the most interesting change from a user point of view. On the x86 side there are nested virtualization improvements and a few bugfixes. ARM got transparent huge page support, improved overcommit, and support for big endian guests. Finally, there is a new interface to connect KVM with VFIO. This helps with devices that use NoSnoop PCI transactions, letting the driver in the guest execute WBINVD instructions. This includes some nVidia cards on Windows, that fail to start without these patches and the corresponding userspace changes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (146 commits) kvm, vmx: Fix lazy FPU on nested guest arm/arm64: KVM: PSCI: propagate caller endianness to the incoming vcpu arm/arm64: KVM: MMIO support for BE guest kvm, cpuid: Fix sparse warning kvm: Delete prototype for non-existent function kvm_check_iopl kvm: Delete prototype for non-existent function complete_pio hung_task: add method to reset detector pvclock: detect watchdog reset at pvclock read kvm: optimize out smp_mb after srcu_read_unlock srcu: API for barrier after srcu read unlock KVM: remove vm mmap method KVM: IOMMU: hva align mapping page size KVM: x86: trace cpuid emulation when called from emulator KVM: emulator: cleanup decode_register_operand() a bit KVM: emulator: check rex prefix inside decode_register() KVM: x86: fix emulation of "movzbl %bpl, %eax" kvm_host: typo fix KVM: x86: emulate SAHF instruction MAINTAINERS: add tree for kvm.git Documentation/kvm: add a 00-INDEX file ...	2013-11-15 13:51:36 +09:00
Steven Capper	a3a9ea656d	ARM: 7858/1: mm: make UACCESS_WITH_MEMCPY huge page aware The memory pinning code in uaccess_with_memcpy.c does not check for HugeTLB or THP pmds, and will enter an infinite loop should a __copy_to_user or __clear_user occur against a huge page. This patch adds detection code for huge pages to pin_page_for_write. As this code can be executed in a fast path it refers to the actual pmds rather than the vma. If a HugeTLB or THP is found (they have the same pmd representation on ARM), the page table spinlock is taken to prevent modification whilst the page is pinned. On ARM, huge pages are only represented as pmds, thus no huge pud checks are performed. (For huge puds one would lock the page table in a similar manner as in the pmd case). Two helper functions are introduced; pmd_thp_or_huge will check whether or not a page is huge or transparent huge (which have the same pmd layout on ARM), and pmd_hugewillfault will detect whether or not a page fault will occur on write to the page. Running the following test (with the chunking from read_zero removed): $ dd if=/dev/zero of=/dev/null bs=10M count=1024 Gave: 2.3 GB/s backed by normal pages, 2.9 GB/s backed by huge pages, 5.1 GB/s backed by huge pages, with page mask=HPAGE_MASK. After some discussion, it was decided not to adopt the HPAGE_MASK, as this would have a significant detrimental effect on the overall system latency due to page_table_lock being held for too long. This could be revisited if split huge page locks are adopted. Signed-off-by: Steve Capper <steve.capper@linaro.org> Reviewed-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2013-10-29 11:06:15 +00:00
Christoffer Dall	ad361f093c	KVM: ARM: Support hugetlbfs backed huge pages Support huge pages in KVM/ARM and KVM/ARM64. The pud_huge checking on the unmap path may feel a bit silly as the pud_huge check is always defined to false, but the compiler should be smart about this. Note: This deals only with VMAs marked as huge which are allocated by users through hugetlbfs only. Transparent huge pages can only be detected by looking at the underlying pages (or the page tables themselves) and this patch so far simply maps these on a page-by-page level in the Stage-2 page tables. Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>	2013-10-17 17:06:20 -07:00
Russell King	3fbd55ec21	Merge branch 'for-rmk/lpae' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into devel-stable Conflicts: arch/arm/kernel/smp.c Please pull these miscellaneous LPAE fixes I've been collecting for a while now for 3.11. They've been tested and reviewed by quite a few people, and most of the patches are pretty trivial. -- Will Deacon.	2013-06-18 20:11:32 +01:00
Catalin Marinas	8d96250700	ARM: mm: Transparent huge page support for LPAE systems. The patch adds support for THP (transparent huge pages) to LPAE systems. When this feature is enabled, the kernel tries to map anonymous pages as 2MB sections where possible. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> [steve.capper@linaro.org: symbolic constants used, value of PMD_SECT_SPLITTING adjusted, tlbflush.h included in pgtable.h, added PROT_NONE support.] Signed-off-by: Steve Capper <steve.capper@linaro.org> Reviewed-by: Will Deacon <will.deacon@arm.com>	2013-06-04 16:52:38 +01:00
Catalin Marinas	1355e2a6eb	ARM: mm: HugeTLB support for LPAE systems. This patch adds support for hugetlbfs based on the x86 implementation. It allows mapping of 2MB sections (see Documentation/vm/hugetlbpage.txt for usage). The 64K pages configuration is not supported (section size is 512MB in this case). Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> [steve.capper@linaro.org: symbolic constants replace numbers in places. Split up into multiple files, to simplify future non-LPAE support, removed huge_pmd_share code, as this is very rarely executed, Added PROT_NONE support]. Signed-off-by: Steve Capper <steve.capper@linaro.org> Reviewed-by: Will Deacon <will.deacon@arm.com>	2013-06-04 16:52:37 +01:00
Steve Capper	dde1b65110	ARM: mm: correct pte_same behaviour for LPAE. For 3 levels of paging the PTE_EXT_NG bit will be set for user address ptes that are written to a page table but not for ptes created with mk_pte. This can cause some comparison tests made by pte_same to fail spuriously and lead to other problems. To correct this behaviour, we mask off PTE_EXT_NG for any pte that is present before running the comparison. Signed-off-by: Steve Capper <steve.capper@linaro.org> Reviewed-by: Will Deacon <will.deacon@arm.com>	2013-06-04 16:52:37 +01:00
Will Deacon	e38a517578	ARM: lpae: fix definition of PTE_HWTABLE_PTRS For 2-level page tables, PTE_HWTABLE_PTRS describes the offset between Linux PTEs and hardware PTEs. On LPAE, there is no distinction (since we have 64-bit descriptors with plenty of space) so PTE_HWTABLE_PTRS should be 0. Unfortunately, it is wrongly defined as PTRS_PER_PTE, meaning that current pte table flushing is off by a page. Luckily, all current LPAE implementations are SMP, so the hardware walker can snoop L1. This patch fixes the broken definition. Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2013-05-30 16:02:33 +01:00
Cyril Chemparathy	926edcc747	ARM: LPAE: use signed arithmetic for mask definitions This patch applies to PAGE_MASK, PMD_MASK, and PGDIR_MASK, where forcing unsigned long math truncates the mask at the 32-bits. This clearly does bad things on PAE systems. This patch fixes this problem by defining these masks as signed quantities. We then rely on sign extension to do the right thing. Signed-off-by: Cyril Chemparathy <cyril@ti.com> Signed-off-by: Vitaly Andrianov <vitalya@ti.com> Reviewed-by: Nicolas Pitre <nico@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Tested-by: Subash Patel <subash.rp@samsung.com> Signed-off-by: Will Deacon <will.deacon@arm.com>	2013-05-30 16:01:30 +01:00
Marc Zyngier	865499ea90	ARM: KVM: fix L_PTE_S2_RDWR to actually be Read/Write Looks like our L_PTE_S2_RDWR definition is slightly wrong, and is actually write only (see ARM ARM Table B3-9, Stage 2 control of access permissions). Didn't make a difference for normal pages, as we OR the flags together, but I'm still wondering how it worked for Stage-2 mapped devices, such as the GIC. Brown paper bag time, again. Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>	2013-04-16 16:21:25 -07:00
Christoffer Dall	cc577c26e2	ARM: Add page table and page defines needed by KVM KVM uses the stage-2 page tables and the Hyp page table format, so we define the fields and page protection flags needed by KVM. The nomenclature is this: - page_hyp: PL2 code/data mappings - page_hyp_device: PL2 device mappings (vgic access) - page_s2: Stage-2 code/data page mappings - page_s2_device: Stage-2 device mappings (vgic access) Reviewed-by: Will Deacon <will.deacon@arm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Christoffer Dall <c.dall@virtualopensystems.com>	2013-01-23 13:29:08 -05:00
Will Deacon	26ffd0d43b	ARM: mm: introduce present, faulting entries for PAGE_NONE PROT_NONE mappings apply the page protection attributes defined by _P000 which translate to PAGE_NONE for ARM. These attributes specify an XN, RDONLY pte that is inaccessible to userspace. However, on kernels configured without support for domains, such a pte is accessible to the kernel and can be read via get_user, allowing tasks to read PROT_NONE pages via syscalls such as read/write over a pipe. This patch introduces a new software pte flag, L_PTE_NONE, that is set to identify faulting, present entries. Signed-off-by: Will Deacon <will.deacon@arm.com>	2012-11-09 14:13:20 +00:00
Will Deacon	dbf62d5006	ARM: mm: introduce L_PTE_VALID for page table entries For long-descriptor translation table formats, the ARMv7 architecture defines the last two bits of the second- and third-level descriptors to be: x0b - Invalid 01b - Block (second-level), Reserved (third-level) 11b - Table (second-level), Page (third-level) This allows us to define L_PTE_PRESENT as (3 << 0) and use this value to create ptes directly. However, when determining whether a given pte value is present in the low-level page table accessors, we only need to check the least significant bit of the descriptor, allowing us to write faulting, present entries which are required for PROT_NONE mappings. This patch introduces L_PTE_VALID, which can be used to test whether a pte should fault, and updates the low-level page table accessors accordingly. Signed-off-by: Will Deacon <will.deacon@arm.com>	2012-11-09 14:13:19 +00:00
Catalin Marinas	0ec8e7aa8f	ARM: 7416/1: LPAE: Remove unused L_PTE_(BUFFERABLE\|CACHEABLE) macros These have already been removed from the classic MMU in favour of L_PTE_MT_* macros. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2012-05-12 14:38:21 +01:00
Catalin Marinas	da02877987	ARM: LPAE: Page table maintenance for the 3-level format This patch modifies the pgd/pmd/pte manipulation functions to support the 3-level page table format. Since there is no need for an 'ext' argument to cpu_set_pte_ext(), this patch conditionally defines a different prototype for this function when CONFIG_ARM_LPAE. The patch also introduces the L_PGD_SWAPPER flag to mark pgd entries pointing to pmd tables pre-allocated in the swapper_pg_dir and avoid trying to free them at run-time. This flag is 0 with the classic page table format. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2011-12-08 10:30:39 +00:00
Catalin Marinas	dcfdae04bd	ARM: LPAE: Introduce the 3-level page table format definitions This patch introduces the pgtable-3level*.h files with definitions specific to the LPAE page table format (3 levels of page tables). Each table is 4KB and has 512 64-bit entries. An entry can point to a 40-bit physical address. The young, write and exec software bits share the corresponding hardware bits (negated). Other software bits use spare bits in the PTE. The patch also changes some variable types from unsigned long or int to pteval_t or pgprot_t. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>	2011-12-08 10:30:39 +00:00

33 Commits