linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-29 07:04:10 +08:00

Author	SHA1	Message	Date
Aneesh Kumar K.V	64168f4296	powerpc/mm/coproc: Handle bad address on coproc slb fault VSID 0 is bad address. Don't create slb entries on coproc fault for bad address Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Reviewed-by: Balbir Singh <bsingharora@gmail.com> Reviewed-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-22 11:57:08 +11:00
Russell Currey	6654c9368a	powerpc/eeh: Refactor EEH PE reset functions eeh_pe_reset and eeh_reset_pe are two different functions in the same file which do mostly the same thing. Not only is this confusing, but potentially causes disrepancies in functionality, notably eeh_reset_pe as it does not check return values for failure. Refactor this into the following: - eeh_pe_reset(): stays as is, performs a single operation, exported - eeh_pe_reset_full(): new, full reset process that calls eeh_pe_reset() - eeh_reset_pe(): removed and replaced by eeh_pe_reset_full() - eeh_reset_pe_once(): removed Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-22 11:57:08 +11:00
Russell Currey	1f52f17614	powerpc/pci: Always print PHB and PE numbers as hexadecimal PHB, PE (and by association MVE) numbers are printed as a mix of decimal and hexadecimal throughout the kernel. This can be misleading, so make them all hexadecimal. Standardising on hex instead of dec because: - PHB numbers are presented in hex in sysfs/debugfs (and lspci, etc) - PE numbers are presented as hex in sysfs and parsed in hex in debugfs The only place I think this could cause confusing are the messages during boot, i.e. pci 000a:01 : [PE# 000] Secondary bus 1 associated with PE#0 which can be a quick way to check PE numbers. pe_level_printk() will only print two characters instead of three, so the above would be pci 000a:01 : [PE# 00] Secondary bus 1 associated with PE#0 which gives a hint it's in hex. Signed-off-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-22 11:57:07 +11:00
Russell Currey	d4791db527	powerpc/powernv: Don't warn on PE init if unfreeze is unsupported Whenever a PE is initialised in powernv, opal_pci_eeh_freeze_clear() is called. This is to remove any existing freeze, and has no negative side effects if the PE is already in an unfrozen state. On PHB backends that don't support this operation and return OPAL_UNSUPPORTED, this creates a scary and misleading warning message. Skip the warning message on init if OPAL_UNSUPPORTED is returned. As far as I'm aware, this currently only affects NPUs. Fixes: `313483d` ("powerpc/powernv: Unfreeze PE on allocation") Signed-off-by: Russell Currey <ruscur@russell.cc> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-22 11:57:07 +11:00
Paul Mackerras	7fd317f8c3	powerpc/64: Add some more SPRs and SPR bits for POWER9 These definitions will be needed by KVM. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-22 10:19:31 +11:00
Geliang Tang	68b8b72bb2	KVM: PPC: Book3S HV: Drop duplicate header asm/iommu.h Drop duplicate header asm/iommu.h from book3s_64_vio_hv.c. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2016-11-21 15:29:26 +11:00
Paul Mackerras	f064a0de15	KVM: PPC: Book3S HV: Don't lose hardware R/C bit updates in H_PROTECT The hashed page table MMU in POWER processors can update the R (reference) and C (change) bits in a HPTE at any time until the HPTE has been invalidated and the TLB invalidation sequence has completed. In kvmppc_h_protect, which implements the H_PROTECT hypercall, we read the HPTE, modify the second doubleword, invalidate the HPTE in memory, do the TLB invalidation sequence, and then write the modified value of the second doubleword back to memory. In doing so we could overwrite an R/C bit update done by hardware between when we read the HPTE and when the TLB invalidation completed. To fix this we re-read the second doubleword after the TLB invalidation and OR in the (possibly) new values of R and C. We can use an OR since hardware only ever sets R and C, never clears them. This race was found by code inspection. In principle this bug could cause occasional guest memory corruption under host memory pressure. Fixes: `a8606e20e4` ("KVM: PPC: Handle some PAPR hcalls in the kernel", 2011-06-29) Cc: stable@vger.kernel.org # v3.19+ Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2016-11-21 15:29:20 +11:00
Paul Mackerras	0d808df06a	KVM: PPC: Book3S HV: Save/restore XER in checkpointed register state When switching from/to a guest that has a transaction in progress, we need to save/restore the checkpointed register state. Although XER is part of the CPU state that gets checkpointed, the code that does this saving and restoring doesn't save/restore XER. This fixes it by saving and restoring the XER. To allow userspace to read/write the checkpointed XER value, we also add a new ONE_REG specifier. The visible effect of this bug is that the guest may see its XER value being corrupted when it uses transactions. Fixes: `e4e3812150` ("KVM: PPC: Book3S HV: Add transactional memory support") Fixes: `0a8eccefcb` ("KVM: PPC: Book3S HV: Add missing code for transaction reclaim on guest exit") Cc: stable@vger.kernel.org # v3.15+ Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2016-11-21 15:17:55 +11:00
Yongji Xie	a56ee9f8f0	KVM: PPC: Book3S HV: Add a per vcpu cache for recently page faulted MMIO entries This keeps a per vcpu cache for recently page faulted MMIO entries. On a page fault, if the entry exists in the cache, we can avoid some time-consuming paths, for example, looking up HPT, locking HPTE twice and searching mmio gfn from memslots, then directly call kvmppc_hv_emulate_mmio(). In current implenment, we limit the size of cache to four. We think it's enough to cover the high-frequency MMIO HPTEs in most case. For example, considering the case of using virtio device, for virtio legacy devices, one HPTE could handle notifications from up to 1024 (64K page / 64 byte Port IO register) devices, so one cache entry is enough; for virtio modern devices, we always need one HPTE to handle notification for each device because modern device would use a 8M MMIO register to notify host instead of Port IO register, typically the system's configuration should not exceed four virtio devices per vcpu, four cache entry is also enough in this case. Of course, if needed, we could also modify the macro to a module parameter in the future. Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2016-11-21 15:17:55 +11:00
Yongji Xie	f05859827d	KVM: PPC: Book3S HV: Clear the key field of HPTE when the page is paged out Currently we mark a HPTE for emulated MMIO with HPTE_V_ABSENT bit set as well as key 0x1f. However, those HPTEs may be conflicted with the HPTE for real guest RAM page HPTE with key 0x1f when the page get paged out. This patch clears the key field of HPTE when the page is paged out, then recover it when HPTE is re-established. Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2016-11-21 15:17:55 +11:00
Wei Yongjun	28d057c897	KVM: PPC: Book3S HV: Use list_move_tail instead of list_del/list_add_tail Using list_move_tail() instead of list_del() + list_add_tail(). Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2016-11-21 15:17:55 +11:00
Daniel Axtens	ebe4535fbe	KVM: PPC: Book3S HV: sparse: prototypes for functions called from assembler A bunch of KVM functions are only called from assembler. Give them prototypes in asm-prototypes.h This reduces sparse warnings. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2016-11-21 15:17:54 +11:00
Daniel Axtens	025c951138	KVM: PPC: Book3S HV: Fix sparse static warning Squash a couple of sparse warnings by making things static. Build tested. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Paul Mackerras <paulus@ozlabs.org>	2016-11-21 15:16:53 +11:00
Linus Torvalds	f6918382c7	powerpc fixes for 4.9 #5 Fixes marked for stable: - Fix system reset interrupt winkle wakeups (Nicholas Piggin) - Fix setting of AIL in hypervisor mode (Benjamin Herrenschmidt) Fixes for code merged this cycle: - Fix exception vector build with 2.23 era binutils (Hugh Dickins) - Fix missing update of HID register on secondary CPUs (Aneesh Kumar K.V) Other: - Fix missing pr_cont()s in show_stack() (Michael Ellerman) - Fix missing pr_cont()s in print_msr_bits() et. al. (Michael Ellerman) - Fix missing pr_cont()s in show_regs() (Michael Ellerman) - Fix missing pr_cont()s in instruction dump (Andrew Donnellan) - Invalidate ERAT on tlbiel for POWER9 DD1 (Michael Neuling) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJYMBJ5AAoJEFHr6jzI4aWA7hcP/1y8rTxNE+QYFMgkAVOJRDNL t11jhvzWd+IQKCQnp+UtxlVUsMunwcE57nLu/gSndTwd801yBshslFhPjCljKt7o g2oO4C+j90Vm6/0pg/HN51QPaCESwzZd8N6Xf0ApLfnxJ8elY9FSKfVmxWOfZnxo heKWCjQTw+LVH04sIB09vo4Jf6djhC1mlVyxpH+6pG5rP6ftgse82wtTQQR2dVlk tgfPNP2+wXF1Yl5vGFv/Q8p73RgcHUHok3spvmVQ1sZ+a8ezh2F/FhHeUlfyfuaq s35MMgF3JAxXizNZ4I7oqCDpI6M1NCmuQI9QULHHKRMVunV3x8Zf3/FeFpWDD3y/ RCqk5oWIeemYbtX9i9suVYJVLr3Qz6tCjN9jlIl8EnIhsDAKrKOjkrCP4ke9Nzv1 eQMmtAQJC4dib0DqNbAfuvEtnLFbL83xmmBHKG/GY77iKtvJEB2Wx5rC5LZ6Dw9a Ua1cBN+d1gBU1gBIKwa/fCkLxS0o+6LBGrZOd39r931Zw0ETl4miTuFdQiNJ2PnG BMnUK0I6FfKRgAFa0d4UXbqLv4HI6Nh8MEMTpoQ+oCK9Rbn0ZcmFfdzHWzLZmHg4 NQ/1CiS17IKEHYSRI/r4M7jq6obem3x7wPJWsfySu0cs8YG2BjdfUcs+ff5TR/xV jEGarBJgZ4bArqOw4TEI =+6XC -----END PGP SIGNATURE----- Merge tag 'powerpc-4.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Fixes marked for stable: - fix system reset interrupt winkle wakeups - fix setting of AIL in hypervisor mode Fixes for code merged this cycle: - fix exception vector build with 2.23 era binutils - fix missing update of HID register on secondary CPUs Other: - fix missing pr_cont()s - invalidate ERAT on tlbiel for POWER9 DD1" * tag 'powerpc-4.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/mm: Fix missing update of HID register on secondary CPUs powerpc/mm/radix: Invalidate ERAT on tlbiel for POWER9 DD1 powerpc/64: Fix setting of AIL in hypervisor mode powerpc/oops: Fix missing pr_cont()s in instruction dump powerpc/oops: Fix missing pr_cont()s in show_regs() powerpc/oops: Fix missing pr_cont()s in print_msr_bits() et. al. powerpc/oops: Fix missing pr_cont()s in show_stack() powerpc: Fix exception vector build with 2.23 era binutils powerpc/64s: Fix system reset interrupt winkle wakeups	2016-11-19 11:21:59 -08:00
Aneesh Kumar K.V	cac4a18540	powerpc/mm: Fix missing update of HID register on secondary CPUs We need to update on secondaries for the selected MMU mode. Fixes: `ad410674f5` ("powerpc/mm: Update the HID bit when switching from radix to hash") Reported-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-18 23:16:58 +11:00
Michael Ellerman	e9eb0278da	powerpc/64: Used named initialisers for ibm_pa_features The ibm_pa_features array consists of structures that describe which bit and byte in the ibm,pa-features property toggles one or more flags in either the CPU, MMU, or user visible feature flags. Each one consists of 7 values, which are all unsigned long, int or char, meaning the compiler gives us no warning if we assign the wrong values to the wrong elements. In fact we have had a bug here in the past, where we were setting incorrect bits, see commit `6997e57d69` ("powerpc: scan_features() updates incorrect bits for REAL_LE"). So switch to using named initialisers for the structure elements, to reduce the likelihood of future bugs, and hopefully improve readability also. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Balbir Singh <bsingharora@gmail.com>	2016-11-18 23:02:19 +11:00
Michael Ellerman	3baad97067	powerpc/configs: Turn on PPC crypto implementations in the server defconfigs These are the PPC optimised versions of various crypto algorithms, so we should turn them on by default to get test coverage. Suggested-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-18 23:02:19 +11:00
Michael Ellerman	90ee8762e2	powerpc/pseries: Disable IBMEBUS on little endian builds The IBMEBUS code supports the GX bus found on Power7 and earlier CPUs. On Power8 it has been replaced, and so we have no need for it. We don't actually have a config symbol for Power8 vs Power7 etc., but we only support booting little endian on Power8 or later, so use that as a reasonable approximation. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-18 23:02:18 +11:00
Michael Ellerman	30757de203	powerpc/pseries: Move ibmebus.c into platforms pseries ibmebus.c is pseries only code, so move it in there. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-18 23:02:18 +11:00
Michael Ellerman	139ac5afe3	powerpc/pseries: Move vio.c into platforms pseries vio.c is pseries only code, so move it in there. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-18 23:02:17 +11:00
Nicholas Piggin	7458e8b2ce	powerpc: Fix second nested oops hang When ending an oops, don't clear die_owner unless the nest count went to zero. This prevents a second nested oops from hanging forever on the die_lock. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-18 22:40:42 +11:00
Nicholas Piggin	6f44b20ee9	powerpc: Fix graceful debugger recovery When exiting xmon with 'x' (exit and recover), oops_begin bails out immediately, but die then calls __die() and oops_end(), which cause a lot of bad things to happen. If the debugger was attached then went to graceful recovery, exit from die() immediately. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-18 22:40:42 +11:00
Nicholas Piggin	43c9127d94	powerpc: Add option to use thin archives Add an option to use thin archives to build the kernel. Thin archives are explained in commit `a5967db9af` ("kbuild: allow architectures to use thin archives instead of ld -r"). This is a gradual way to introduce the option to testers. Some change to the way we invoke ar is required so it can be used by scripts/link-vmlinux.sh. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Make it an explicit option not dependant on COMPILE_TEST] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-18 22:40:42 +11:00
Michael Ellerman	5e9d0e3d9e	powerpc/lib: Fix randconfig build failure in sstep.c Under some configs we need to explicitly include cpu_has_feature.h, otherwise we fail with: arch/powerpc/lib/sstep.c:1992:7: error: implicit declaration of function 'cpu_has_feature' Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-18 22:40:42 +11:00
Nathan Fontenot	25b587fba9	powerpc/pseries: Correct possible read beyond dlpar sysfs buffer The pasrsing of data written to the dlpar file in sysfs does not correctly account for the possibility of reading past the end of the buffer. The code assumes that all pieces of the command witten to the sysfs file are present in the form "<resource> <action> <id_type> <id>". Correct this by updating the buffer parsing code to make a local copy and use the strsep() and sysfs_streq() routines to parse the buffer. This patch also separates the parsing code into subroutines for each piece of the command. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-18 22:40:38 +11:00
Tobias Klauser	d6d56ec738	powerpc/mce: Remove unused but set variable Remove the unused but set variable srr1 in save_mce_event() to fix the following GCC warning when building with 'W=1': arch/powerpc/kernel/mce.c:75:11: warning: variable 'srr1' set but not used It has never been used. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-18 22:40:38 +11:00
Tobias Klauser	60d862e531	powerpc: Fix old style declaration GCC warnings Fix two [-Wold-style-declaration] GCC warnings by moving the inline keyword before the return type. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-18 22:40:38 +11:00
Christophe Leroy	c05f69a3dc	powerpc/64: get rid of MIN_HUGEPTE_SHIFT MIN_HUGEPTE_SHIFT hasn't been used since commit `d1837cba5d` ("powerpc/mm: Cleanup initialization of hugepages on powerpc") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-18 22:40:37 +11:00
Michael Neuling	96ed1fe511	powerpc/mm/radix: Invalidate ERAT on tlbiel for POWER9 DD1 On POWER9 DD1, when we do a local TLB invalidate we also need to explicitly invalidate the ERAT. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-18 15:12:24 +11:00
Christian Borntraeger	6d0d287891	locking/core: Provide common cpu_relax_yield() definition No need to duplicate the same define everywhere. Since the only user is stop-machine and the only provider is s390, we can use a default implementation of cpu_relax_yield() in sched.h. Suggested-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Noam Camus <noamc@ezchip.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: kvm@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: linux-s390 <linux-s390@vger.kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Cc: sparclinux@vger.kernel.org Cc: virtualization@lists.linux-foundation.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1479298985-191589-1-git-send-email-borntraeger@de.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-11-17 08:17:36 +01:00
Suraj Jitindar Singh	555c16328a	powerpc/mm: Correct process and partition table max size Version 3.00 of the ISA states that the PATS (partition table size) field of the PTCR (partition table control register) and the PRTS (process table size) field of the partition table entry must both be less than or equal to 24. However the actual size of the partition and process tables is equal to 2 to the power of 12 plus the PATS and PRTS fields, respectively. This means that the max allowable size of each of these tables is 2^36 or 64GB for both. Thus when checking the size shift for each we should be checking for values of greater than 36 instead of the current check for shifts larger than 24 and 23. Fixes: `2bfd65e45e` Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Reviewed-by: Balbir Singh <bsingharora@gmail.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-17 17:11:53 +11:00
Rashmica Gupta	1515ab9321	powerpc/mm: Dump hash table Useful to be able to dump the kernel hash page table to check which pages are hashed along with their sizes and other details. Add a debugfs file to check the hash page table. If radix is enabled (and so there is no hash page table) then this file doesn't exist. To use this the PPC_PTDUMP config option must be selected. Signed-off-by: Rashmica Gupta <rashmicy@gmail.com> [mpe: Fix build with SPARSEMEM_VMEMMAP=n & PSERIES=n] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-17 17:11:47 +11:00
Rashmica Gupta	8eb07b1870	powerpc/mm: Dump linux pagetables Useful to be able to dump the kernels page tables to check permissions and memory types - derived from arm64's implementation. Add a debugfs file to check the page tables. To use this the PPC_PTDUMP config option must be selected. Signed-off-by: Rashmica Gupta <rashmicy@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-17 17:11:46 +11:00
Michael Ellerman	8f272a5dd6	powerpc/pseries: Move CMO code from plapr_wrappers.h to platforms/pseries Currently there's some CMO (Cooperative Memory Overcommit) code, in plpar_wrappers.h. Some of it is #ifdef CONFIG_PSERIES and some of it isn't. The end result being if a file includes plpar_wrappers.h it won't build with CONFIG_PSERIES=n. Fix it by moving the CMO code into platforms/pseries. The two hcall wrappers can just be moved into their only caller, cmm.c, and the accessors can go in pseries.h. Note we need the accessors because cmm.c can be built as a module, so there needs to be a split between the built-in code vs the module, and that's achieved by using those accessors. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-17 17:11:46 +11:00
Balbir Singh	ac8d3818aa	powerpc/mm: Fix typo in radix encodings print Rename "sift" to "shift". Signed-off-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-17 17:11:45 +11:00
Christian Borntraeger	5bd0b85ba8	locking/core, arch: Remove cpu_relax_lowlatency() As there are no users left, we can remove cpu_relax_lowlatency() implementations from every architecture. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Noam Camus <noamc@ezchip.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: virtualization@lists.linux-foundation.org Cc: xen-devel@lists.xenproject.org Cc: <linux-arch@vger.kernel.org> Link: http://lkml.kernel.org/r/1477386195-32736-6-git-send-email-borntraeger@de.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-11-16 10:15:11 +01:00
Christian Borntraeger	79ab11cdb9	locking/core: Introduce cpu_relax_yield() For spinning loops people do often use barrier() or cpu_relax(). For most architectures cpu_relax and barrier are the same, but on some architectures cpu_relax can add some latency. For example on power,sparc64 and arc, cpu_relax can shift the CPU towards other hardware threads in an SMT environment. On s390 cpu_relax does even more, it uses an hypercall to the hypervisor to give up the timeslice. In contrast to the SMT yielding this can result in larger latencies. In some places this latency is unwanted, so another variant "cpu_relax_lowlatency" was introduced. Before this is used in more and more places, lets revert the logic and provide a cpu_relax_yield that can be called in places where yielding is more important than latency. By default this is the same as cpu_relax on all architectures. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Noam Camus <noamc@ezchip.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: virtualization@lists.linux-foundation.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1477386195-32736-2-git-send-email-borntraeger@de.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-11-16 10:15:09 +01:00
Paul Mackerras	6b243fcfb5	powerpc/64: Simplify adaptation to new ISA v3.00 HPTE format This changes the way that we support the new ISA v3.00 HPTE format. Instead of adapting everything that uses HPTE values to handle either the old format or the new format, depending on which CPU we are on, we now convert explicitly between old and new formats if necessary in the low-level routines that actually access HPTEs in memory. This limits the amount of code that needs to know about the new format and makes the conversions explicit. This is OK because the old format contains all the information that is in the new format. This also fixes operation under a hypervisor, because the H_ENTER hypercall (and other hypercalls that deal with HPTEs) will continue to require the HPTE value to be supplied in the old format. At present the kernel will not boot in HPT mode on POWER9 under a hypervisor. This fixes and partially reverts commit `50de596de8` ("powerpc/mm/hash: Add support for Power9 Hash", 2016-04-29). Fixes: `50de596de8` ("powerpc/mm/hash: Add support for Power9 Hash") Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-16 15:43:25 +11:00
Madalin Bucur	e0bd03d0ea	arch/powerpc: Enable dpaa_eth Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-15 22:34:25 -05:00
Madalin Bucur	9256f21e50	arch/powerpc: Enable FSL_FMAN Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-15 22:34:25 -05:00
Madalin Bucur	8f2840197a	arch/powerpc: Enable FSL_PAMU Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-15 22:34:25 -05:00
Benjamin Herrenschmidt	c0a3601363	powerpc/64: Fix setting of AIL in hypervisor mode Commit `d3cbff1b5` "powerpc: Put exception configuration in a common place" broke the setting of the AIL bit (which enables taking exceptions with the MMU still on) on all processors, moving it incorrectly to a function called only on the boot CPU. This was correct for the guest case but not when running in hypervisor mode. This fixes it by partially reverting that commit, putting the setting back in cpu_ready_for_interrupts() Fixes: `d3cbff1b5a` ("powerpc: Put exception configuration in a common place") Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-15 20:43:59 +11:00
Stanislaw Gruszka	40565b5aed	sched/cputime, powerpc, s390: Make scaled cputime arch specific Only s390 and powerpc have hardware facilities allowing to measure cputimes scaled by frequency. On all other architectures utimescaled/stimescaled are equal to utime/stime (however they are accounted separately). Remove {u,s}timescaled accounting on all architectures except powerpc and s390, where those values are explicitly accounted in the proper places. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20161031162143.GB12646@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-11-15 09:51:05 +01:00
Stanislaw Gruszka	981ee2d444	sched/cputime, powerpc: Remove cputime_to_scaled() Currently cputime_to_scaled() just return it's argument on all implementations, we don't need to call this function. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Reviewed-by: Paul Mackerras <paulus@ozlabs.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1479175612-14718-3-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-11-15 09:51:04 +01:00
Stanislaw Gruszka	7008eb997b	sched/cputime, powerpc: Remove cputime_last_delta global variable Since commit: `cf9efce0ce` ("powerpc: Account time using timebase rather than PURR") cputime_last_delta is not initialized to other value than 0, hence it's not used except zero check and cputime_to_scaled() just returns the argument. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1479175612-14718-2-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-11-15 09:51:04 +01:00
Bartlomiej Zolnierkiewicz	1aed61f9c9	powerpc: convert storcenter_defconfig to use libata PATA drivers IDE subsystem has been deprecated since 2009 and the majority (if not all) of Linux distributions have switched to use libata for ATA support exclusively. However there are still some users (mostly old or/and embedded non-x86 systems) that have not converted from using IDE subsystem to libata PATA drivers. This doesn't seem to be good thing in the long-term for Linux as while there is less and less PATA systems left in use: * testing efforts are divided between two subsystems * having duplicate drivers for same hardware confuses users This patch converts storcenter_defconfig to use libata PATA drivers. Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>	2016-11-14 20:09:32 +11:00
Bartlomiej Zolnierkiewicz	80269fe420	powerpc: convert pseries_defconfig to use libata PATA drivers IDE subsystem has been deprecated since 2009 and the majority (if not all) of Linux distributions have switched to use libata for ATA support exclusively. However there are still some users (mostly old or/and embedded non-x86 systems) that have not converted from using IDE subsystem to libata PATA drivers. This doesn't seem to be good thing in the long-term for Linux as while there is less and less PATA systems left in use: * testing efforts are divided between two subsystems * having duplicate drivers for same hardware confuses users This patch converts pseries_defconfig to use libata PATA drivers. Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>	2016-11-14 20:09:32 +11:00
Bartlomiej Zolnierkiewicz	7f10fe3636	powerpc: convert ppc6xx_defconfig to use libata PATA drivers IDE subsystem has been deprecated since 2009 and the majority (if not all) of Linux distributions have switched to use libata for ATA support exclusively. However there are still some users (mostly old or/and embedded non-x86 systems) that have not converted from using IDE subsystem to libata PATA drivers. This doesn't seem to be good thing in the long-term for Linux as while there is less and less PATA systems left in use: * testing efforts are divided between two subsystems * having duplicate drivers for same hardware confuses users This patch converts ppc6xx_defconfig to use libata PATA drivers. Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>	2016-11-14 20:09:32 +11:00
Bartlomiej Zolnierkiewicz	4a0b4bfe01	powerpc: convert ppc64e_defconfig to use libata PATA drivers IDE subsystem has been deprecated since 2009 and the majority (if not all) of Linux distributions have switched to use libata for ATA support exclusively. However there are still some users (mostly old or/and embedded non-x86 systems) that have not converted from using IDE subsystem to libata PATA drivers. This doesn't seem to be good thing in the long-term for Linux as while there is less and less PATA systems left in use: * testing efforts are divided between two subsystems * having duplicate drivers for same hardware confuses users This patch converts ppc64e_defconfig to use libata PATA drivers. Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>	2016-11-14 20:09:32 +11:00
Bartlomiej Zolnierkiewicz	7ddf3db3d7	powerpc: convert ppc64_defconfig to use libata PATA drivers IDE subsystem has been deprecated since 2009 and the majority (if not all) of Linux distributions have switched to use libata for ATA support exclusively. However there are still some users (mostly old or/and embedded non-x86 systems) that have not converted from using IDE subsystem to libata PATA drivers. This doesn't seem to be good thing in the long-term for Linux as while there is less and less PATA systems left in use: * testing efforts are divided between two subsystems * having duplicate drivers for same hardware confuses users This patch converts ppc64_defconfig to use libata PATA drivers. Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>	2016-11-14 20:09:32 +11:00
Bartlomiej Zolnierkiewicz	b0d3a07417	powerpc: convert pmac32_defconfig to use libata PATA drivers IDE subsystem has been deprecated since 2009 and the majority (if not all) of Linux distributions have switched to use libata for ATA support exclusively. However there are still some users (mostly old or/and embedded non-x86 systems) that have not converted from using IDE subsystem to libata PATA drivers. This doesn't seem to be good thing in the long-term for Linux as while there is less and less PATA systems left in use: * testing efforts are divided between two subsystems * having duplicate drivers for same hardware confuses users This patch converts pmac32_defconfig to use libata PATA drivers. Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>	2016-11-14 20:09:32 +11:00
Bartlomiej Zolnierkiewicz	23bf36cd12	powerpc: disable IDE subsystem in pasemi_defconfig This patch disables deprecated IDE subsystem in pasemi_defconfig (no IDE host drivers are selected in this config so there is no valid reason to enable IDE subsystem itself). Cc: Olof Johansson <olof@lixom.net> Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>	2016-11-14 20:09:32 +11:00
Bartlomiej Zolnierkiewicz	c1f9d2938b	powerpc: convert maple_defconfig to use libata PATA drivers IDE subsystem has been deprecated since 2009 and the majority (if not all) of Linux distributions have switched to use libata for ATA support exclusively. However there are still some users (mostly old or/and embedded non-x86 systems) that have not converted from using IDE subsystem to libata PATA drivers. This doesn't seem to be good thing in the long-term for Linux as while there is less and less PATA systems left in use: * testing efforts are divided between two subsystems * having duplicate drivers for same hardware confuses users This patch converts maple_defconfig to use libata PATA drivers. Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>	2016-11-14 20:09:32 +11:00
Bartlomiej Zolnierkiewicz	b10fd39637	powerpc: convert g5_defconfig to use libata PATA drivers IDE subsystem has been deprecated since 2009 and the majority (if not all) of Linux distributions have switched to use libata for ATA support exclusively. However there are still some users (mostly old or/and embedded non-x86 systems) that have not converted from using IDE subsystem to libata PATA drivers. This doesn't seem to be good thing in the long-term for Linux as while there is less and less PATA systems left in use: * testing efforts are divided between two subsystems * having duplicate drivers for same hardware confuses users This patch converts g5_defconfig to use libata PATA drivers. Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>	2016-11-14 20:09:32 +11:00
Bartlomiej Zolnierkiewicz	f22cc4d7bb	powerpc: convert chrp32_defconfig to use libata PATA drivers IDE subsystem has been deprecated since 2009 and the majority (if not all) of Linux distributions have switched to use libata for ATA support exclusively. However there are still some users (mostly old or/and embedded non-x86 systems) that have not converted from using IDE subsystem to libata PATA drivers. This doesn't seem to be good thing in the long-term for Linux as while there is less and less PATA systems left in use: * testing efforts are divided between two subsystems * having duplicate drivers for same hardware confuses users This patch converts chrp32_defconfig to use libata PATA drivers. Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>	2016-11-14 20:09:32 +11:00
Bartlomiej Zolnierkiewicz	8020c1220d	powerpc: convert cell_defconfig to use libata PATA drivers IDE subsystem has been deprecated since 2009 and the majority (if not all) of Linux distributions have switched to use libata for ATA support exclusively. However there are still some users (mostly old or/and embedded non-x86 systems) that have not converted from using IDE subsystem to libata PATA drivers. This doesn't seem to be good thing in the long-term for Linux as while there is less and less PATA systems left in use: * testing efforts are divided between two subsystems * having duplicate drivers for same hardware confuses users This patch converts cell_defconfig to use libata PATA drivers. Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>	2016-11-14 20:09:32 +11:00
Bartlomiej Zolnierkiewicz	67f6d66559	powerpc: convert amigaone_defconfig to use libata PATA drivers IDE subsystem has been deprecated since 2009 and the majority (if not all) of Linux distributions have switched to use libata for ATA support exclusively. However there are still some users (mostly old or/and embedded non-x86 systems) that have not converted from using IDE subsystem to libata PATA drivers. This doesn't seem to be good thing in the long-term for Linux as while there is less and less PATA systems left in use: * testing efforts are divided between two subsystems * having duplicate drivers for same hardware confuses users This patch converts amigaone_defconfig to use libata PATA drivers. Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>	2016-11-14 20:09:32 +11:00
Johan Hovold	e8cfb7e7c3	powerpc/vio: Clarify vio_find_node() reference counting Add comment clarifying that vio_find_node() takes a reference to the embedded struct device which needs to be dropped after use. Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-14 20:05:59 +11:00
Johan Hovold	815a7141c4	powerpc/ibmebus: Fix further device reference leaks Make sure to drop any reference taken by bus_find_device() when creating devices during init and driver registration. Fixes: `55347cc996` ("[POWERPC] ibmebus: Add device creation and bus probing based on of_device") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-14 20:05:58 +11:00
Johan Hovold	fe0f316816	powerpc/ibmebus: Fix device reference leaks in sysfs interface Make sure to drop any reference taken by bus_find_device() in the sysfs callbacks that are used to create and destroy devices based on device-tree entries. Fixes: `6bccf755ff` ("[POWERPC] ibmebus: dynamic addition/removal of adapters, some code cleanup") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-14 20:05:58 +11:00
Jack Miller	9e4f51bdaf	powerpc/powernv: Simplify searching for compatible device nodes This condenses the opal node searching into a single function that finds all compatible nodes, instead of just searching the ibm,opal children, for ipmi, flash, and prd similar to how opal-i2c nodes are found. Signed-off-by: Jack Miller <jack@codezen.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-14 20:05:57 +11:00
Michael Neuling	29a969b764	powerpc: Revert Load Monitor Register Support Load monitored is no longer supported on POWER9 so let's remove the code. This reverts commit `bd3ea317fd` ("powerpc: Load Monitor Register Support"). Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-14 20:05:57 +11:00
Michael Ellerman	7a53ef5ebe	powerpc/configs: Drop REISERFS from pseries & powernv No one uses reiserfs much these days, or is likely to in future. So drop it from pseries and powernv defconfigs to save time and space. It's still enabled in ppc64_defconfig so we get some build coverage. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-14 11:11:51 +11:00
Balbir Singh	f923efbcfd	powerpc/hash64: Be more careful when generating tlbiel In ISA v2.05, the tlbiel instruction takes two arguments, RB and L: tlbiel RB,L +---------+---------+----+---------+---------+---------+----+ \| 31 \| / \| L \| / \| RB \| 274 \| / \| \| 31 - 26 \| 25 - 22 \| 21 \| 20 - 16 \| 15 - 11 \| 10 - 1 \| 0 \| +---------+---------+----+---------+---------+---------+----+ In ISA v2.06 tlbiel takes only one argument, RB: tlbiel RB +---------+---------+---------+---------+---------+----+ \| 31 \| / \| / \| RB \| 274 \| / \| \| 31 - 26 \| 25 - 21 \| 20 - 16 \| 15 - 11 \| 10 - 1 \| 0 \| +---------+---------+---------+---------+---------+----+ And in ISA v3.00 tlbiel takes five arguments: tlbiel RB,RS,RIC,PRS,R +---------+---------+----+---------+----+----+---------+---------+----+ \| 31 \| RS \| / \| RIC \|PRS \| R \| RB \| 274 \| / \| \| 31 - 26 \| 25 - 21 \| 20 \| 19 - 18 \| 17 \| 16 \| 15 - 11 \| 10 - 1 \| 0 \| +---------+---------+----+---------+----+----+---------+---------+----+ However the assembler also accepts "tlbiel RB", and generates "tlbiel RB,r0,0,0,0". As you can see above the L field from the v2.05 encoding overlaps with the reserved field of the v2.06 encoding, and the low bit of the RS field of the v3.00 encoding. Currently in __tlbiel() we generate two tlbiel instructions manually using hex constants. In the first case, for MMU_PAGE_4K, we generate "tlbiel RB,0", which is safe in all cases, because the L bit is zero. However in the default case we generate "tlbiel RB,1", therefore setting bit 21 to 1. This is not an actual bug on v2.06 processors, because the CPU ignores the value of the reserved field. However software is supposed to encode the reserved fields as zero to enable forward compatibility. On v3.00 processors setting bit 21 to 1 and no other bits of RS, means we are using r1 for the value of RS. Although it's not obvious, the code sets the IS field (bits 10-11) to 0 (by omission), and L=1, in the va value, which is passed as RB. We also pass R=0 in the instruction. The combination of IS=0, L=1 and R=0 means the value of RS is not used, so even on ISA v3.00 there is no actual bug. We should still fix it, as setting a reserved bit on v2.06 is naughty, and we are only avoiding a bug on v3.00 by accident rather than design. Use ASM_FTR_IFSET() to generate the single argument form on ISA v2.06 and later, and the two argument form on pre v2.06. Although there may be very old toolchains which don't understand tlbiel, we have other code in the tree which has been using tlbiel for over five years, and no one has reported any build failures, so just let the assembler generate the instructions. Signed-off-by: Balbir Singh <bsingharora@gmail.com> [mpe: Rewrite change log, use IFSET instead of IFCLR] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-14 11:11:51 +11:00
Michael Ellerman	3a849815a1	powerpc/book3s64: Always build for power4 or later When we're not compiling for a specific CPU, ie. none of the CONFIG_POWERx_CPU options are set, and CONFIG_GENERIC_CPU is set, we currently don't pass any -mcpu option to the compiler. This means the compiler builds for a "generic" Power CPU. But back in 2014 we dropped support for pre power4 CPUs in commit `468a33028e` ("powerpc: Drop support for pre-POWER4 cpus"). Given that, there's no point in building the kernel to run on pre power4 cpus. So update the flags we pass to the compiler when CONFIG_GENERIC_CPU is set, to specify -mcpu=power4. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-14 11:11:51 +11:00
Nicholas Piggin	70839d2077	powerpc/64: Add an option to force run-at-load to test relocation This adds a config option that can help exercise the case when the kernel is not running at PAGE_OFFSET. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-14 11:11:51 +11:00
Anton Blanchard	5246adec59	powerpc/pseries: Use H_CLEAR_HPT to clear MMU hash table during kexec An hcall was recently added that does exactly what we need during kexec - it clears the entire MMU hash table, ignoring any VRMA mappings. Try it and fall back to the old method if we get a failure. On a POWER8 box with 5TB of memory, this reduces the time it takes to kexec a new kernel from from 4 minutes to 1 minute. Signed-off-by: Anton Blanchard <anton@samba.org> Tested-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> [mpe: Split into separate functions and tweak function naming] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-14 11:11:51 +11:00
Denis Kirjanov	9e607f7274	i2c_powermac: shut up lockdep warning That's unclear why lockdep shows the following warning but adding a lockdep class to struct pmac_i2c_bus solves it [ 20.507795] ====================================================== [ 20.507796] [ INFO: possible circular locking dependency detected ] [ 20.507800] 4.8.0-rc7-00037-gd2ffb01 #21 Not tainted [ 20.507801] ------------------------------------------------------- [ 20.507803] swapper/0/1 is trying to acquire lock: [ 20.507818] (&bus->mutex){+.+.+.}, at: [<c000000000052830>] .pmac_i2c_open+0x30/0x100 [ 20.507819] [ 20.507819] but task is already holding lock: [ 20.507829] (&policy->rwsem){+.+.+.}, at: [<c00000000068adcc>] .cpufreq_online+0x1ac/0x9d0 [ 20.507830] [ 20.507830] which lock already depends on the new lock. [ 20.507830] [ 20.507832] [ 20.507832] the existing dependency chain (in reverse order) is: [ 20.507837] [ 20.507837] -> #4 (&policy->rwsem){+.+.+.}: [ 20.507844] [<c00000000082385c>] .down_write+0x6c/0x110 [ 20.507849] [<c00000000068adcc>] .cpufreq_online+0x1ac/0x9d0 [ 20.507855] [<c0000000004d76d8>] .subsys_interface_register+0xb8/0x110 [ 20.507860] [<c000000000689bb0>] .cpufreq_register_driver+0x1d0/0x250 [ 20.507866] [<c000000000b4f8f4>] .g5_cpufreq_init+0x9cc/0xa28 [ 20.507872] [<c00000000000a98c>] .do_one_initcall+0x5c/0x1d0 [ 20.507878] [<c000000000b0f86c>] .kernel_init_freeable+0x1ac/0x28c [ 20.507883] [<c00000000000b3bc>] .kernel_init+0x1c/0x140 [ 20.507887] [<c0000000000098f4>] .ret_from_kernel_thread+0x58/0x64 [ 20.507894] [ 20.507894] -> #3 (subsys mutex#2){+.+.+.}: [ 20.507899] [<c000000000820448>] .mutex_lock_nested+0xa8/0x590 [ 20.507903] [<c0000000004d7f24>] .bus_probe_device+0x44/0xe0 [ 20.507907] [<c0000000004d5208>] .device_add+0x508/0x730 [ 20.507911] [<c0000000004dd528>] .register_cpu+0x118/0x190 [ 20.507916] [<c000000000b14450>] .topology_init+0x148/0x248 [ 20.507921] [<c00000000000a98c>] .do_one_initcall+0x5c/0x1d0 [ 20.507925] [<c000000000b0f86c>] .kernel_init_freeable+0x1ac/0x28c [ 20.507929] [<c00000000000b3bc>] .kernel_init+0x1c/0x140 [ 20.507934] [<c0000000000098f4>] .ret_from_kernel_thread+0x58/0x64 [ 20.507939] [ 20.507939] -> #2 (cpu_add_remove_lock){+.+.+.}: [ 20.507944] [<c000000000820448>] .mutex_lock_nested+0xa8/0x590 [ 20.507950] [<c000000000087a9c>] .register_cpu_notifier+0x2c/0x70 [ 20.507955] [<c000000000b267e0>] .spawn_ksoftirqd+0x18/0x4c [ 20.507959] [<c00000000000a98c>] .do_one_initcall+0x5c/0x1d0 [ 20.507964] [<c000000000b0f770>] .kernel_init_freeable+0xb0/0x28c [ 20.507968] [<c00000000000b3bc>] .kernel_init+0x1c/0x140 [ 20.507972] [<c0000000000098f4>] .ret_from_kernel_thread+0x58/0x64 [ 20.507978] [ 20.507978] -> #1 (&host->mutex){+.+.+.}: [ 20.507982] [<c000000000820448>] .mutex_lock_nested+0xa8/0x590 [ 20.507987] [<c0000000000527e8>] .kw_i2c_open+0x18/0x30 [ 20.507991] [<c000000000052894>] .pmac_i2c_open+0x94/0x100 [ 20.507995] [<c000000000b220a0>] .smp_core99_probe+0x260/0x410 [ 20.507999] [<c000000000b185bc>] .smp_prepare_cpus+0x280/0x2ac [ 20.508003] [<c000000000b0f748>] .kernel_init_freeable+0x88/0x28c [ 20.508008] [<c00000000000b3bc>] .kernel_init+0x1c/0x140 [ 20.508012] [<c0000000000098f4>] .ret_from_kernel_thread+0x58/0x64 [ 20.508018] [ 20.508018] -> #0 (&bus->mutex){+.+.+.}: [ 20.508023] [<c0000000000ed5b4>] .lock_acquire+0x84/0x100 [ 20.508027] [<c000000000820448>] .mutex_lock_nested+0xa8/0x590 [ 20.508032] [<c000000000052830>] .pmac_i2c_open+0x30/0x100 [ 20.508037] [<c000000000052e14>] .pmac_i2c_do_begin+0x34/0x120 [ 20.508040] [<c000000000056bc0>] .pmf_call_one+0x50/0xd0 [ 20.508045] [<c00000000068ff1c>] .g5_pfunc_switch_volt+0x2c/0xc0 [ 20.508050] [<c00000000068fecc>] .g5_pfunc_switch_freq+0x1cc/0x1f0 [ 20.508054] [<c00000000068fc2c>] .g5_cpufreq_target+0x2c/0x40 [ 20.508058] [<c0000000006873ec>] .__cpufreq_driver_target+0x23c/0x840 [ 20.508062] [<c00000000068c798>] .cpufreq_gov_performance_limits+0x18/0x30 [ 20.508067] [<c00000000068915c>] .cpufreq_start_governor+0xac/0x100 [ 20.508071] [<c00000000068a788>] .cpufreq_set_policy+0x208/0x260 [ 20.508076] [<c00000000068abdc>] .cpufreq_init_policy+0x6c/0xb0 [ 20.508081] [<c00000000068ae70>] .cpufreq_online+0x250/0x9d0 [ 20.508085] [<c0000000004d76d8>] .subsys_interface_register+0xb8/0x110 [ 20.508090] [<c000000000689bb0>] .cpufreq_register_driver+0x1d0/0x250 [ 20.508094] [<c000000000b4f8f4>] .g5_cpufreq_init+0x9cc/0xa28 [ 20.508099] [<c00000000000a98c>] .do_one_initcall+0x5c/0x1d0 [ 20.508103] [<c000000000b0f86c>] .kernel_init_freeable+0x1ac/0x28c [ 20.508107] [<c00000000000b3bc>] .kernel_init+0x1c/0x140 [ 20.508112] [<c0000000000098f4>] .ret_from_kernel_thread+0x58/0x64 [ 20.508113] [ 20.508113] other info that might help us debug this: [ 20.508113] [ 20.508121] Chain exists of: [ 20.508121] &bus->mutex --> subsys mutex#2 --> &policy->rwsem [ 20.508121] [ 20.508123] Possible unsafe locking scenario: [ 20.508123] [ 20.508124] CPU0 CPU1 [ 20.508125] ---- ---- [ 20.508128] lock(&policy->rwsem); [ 20.508132] lock(subsys mutex#2); [ 20.508135] lock(&policy->rwsem); [ 20.508138] lock(&bus->mutex); [ 20.508139] [ 20.508139] * DEADLOCK * [ 20.508139] [ 20.508141] 3 locks held by swapper/0/1: [ 20.508150] #0: (cpu_hotplug.lock){++++++}, at: [<c000000000087838>] .get_online_cpus+0x48/0xc0 [ 20.508159] #1: (subsys mutex#2){+.+.+.}, at: [<c0000000004d7670>] .subsys_interface_register+0x50/0x110 [ 20.508168] #2: (&policy->rwsem){+.+.+.}, at: [<c00000000068adcc>] .cpufreq_online+0x1ac/0x9d0 [ 20.508169] [ 20.508169] stack backtrace: [ 20.508173] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.8.0-rc7-00037-gd2ffb01 #21 [ 20.508175] Call Trace: [ 20.508180] [c0000000790c2b90] [c00000000082cc70] .dump_stack+0xe0/0x14c (unreliable) [ 20.508184] [c0000000790c2c20] [c000000000828c88] .print_circular_bug+0x350/0x388 [ 20.508188] [c0000000790c2cd0] [c0000000000ecb0c] .__lock_acquire+0x196c/0x1d30 [ 20.508192] [c0000000790c2e50] [c0000000000ed5b4] .lock_acquire+0x84/0x100 [ 20.508196] [c0000000790c2f20] [c000000000820448] .mutex_lock_nested+0xa8/0x590 [ 20.508201] [c0000000790c3030] [c000000000052830] .pmac_i2c_open+0x30/0x100 [ 20.508206] [c0000000790c30c0] [c000000000052e14] .pmac_i2c_do_begin+0x34/0x120 [ 20.508209] [c0000000790c3150] [c000000000056bc0] .pmf_call_one+0x50/0xd0 [ 20.508213] [c0000000790c31e0] [c00000000068ff1c] .g5_pfunc_switch_volt+0x2c/0xc0 [ 20.508217] [c0000000790c3250] [c00000000068fecc] .g5_pfunc_switch_freq+0x1cc/0x1f0 [ 20.508221] [c0000000790c3320] [c00000000068fc2c] .g5_cpufreq_target+0x2c/0x40 [ 20.508226] [c0000000790c3390] [c0000000006873ec] .__cpufreq_driver_target+0x23c/0x840 [ 20.508230] [c0000000790c3440] [c00000000068c798] .cpufreq_gov_performance_limits+0x18/0x30 [ 20.508235] [c0000000790c34b0] [c00000000068915c] .cpufreq_start_governor+0xac/0x100 [ 20.508239] [c0000000790c3530] [c00000000068a788] .cpufreq_set_policy+0x208/0x260 [ 20.508244] [c0000000790c35d0] [c00000000068abdc] .cpufreq_init_policy+0x6c/0xb0 [ 20.508249] [c0000000790c3940] [c00000000068ae70] .cpufreq_online+0x250/0x9d0 [ 20.508253] [c0000000790c3a30] [c0000000004d76d8] .subsys_interface_register+0xb8/0x110 [ 20.508258] [c0000000790c3ad0] [c000000000689bb0] .cpufreq_register_driver+0x1d0/0x250 [ 20.508262] [c0000000790c3b60] [c000000000b4f8f4] .g5_cpufreq_init+0x9cc/0xa28 [ 20.508267] [c0000000790c3c20] [c00000000000a98c] .do_one_initcall+0x5c/0x1d0 [ 20.508271] [c0000000790c3d00] [c000000000b0f86c] .kernel_init_freeable+0x1ac/0x28c [ 20.508276] [c0000000790c3db0] [c00000000000b3bc] .kernel_init+0x1c/0x140 [ 20.508280] [c0000000790c3e30] [c0000000000098f4] .ret_from_kernel_thread+0x58/0x64 Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-14 11:11:51 +11:00
Nicholas Piggin	c0a5149105	powerpc: Make _ASM_NOKPROBE_SYMBOL a noop when KPROBES not defined Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-14 11:11:51 +11:00
Nicholas Piggin	5b9ff02785	powerpc: Build-time sort the exception table Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-14 11:11:51 +11:00
Nicholas Piggin	61a92f7031	powerpc: Add support for relative exception tables This halves the exception table size on 64-bit builds, and it allows build-time sorting of exception tables to work on relocated kernels. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> [mpe: Minor asm fixups and bits to keep the selftests working] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-14 11:11:51 +11:00
Nicholas Piggin	24bfa6a9e0	powerpc: EX_TABLE macro for exception tables This macro is taken from s390, and allows more flexibility in changing exception table format. mpe: Put it in ppc_asm.h and only define one version using stringinfy_in_c(). Add some empty definitions and headers to keep the selftests happy. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-14 11:11:51 +11:00
Michael Ellerman	9f751b82b4	powerpc/module: Add support for R_PPC64_REL32 relocations We haven't seen these before, but the soon to be merged relative exception tables support causes them to be generated. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-14 11:11:51 +11:00
Michael Ellerman	e3f2c6c393	powerpc/asm: Allow including ppc_asm.h in asm files There's no reason to #error if we include ppc_asm.h in asm files, the ifdef already prevents any problems. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-14 11:11:51 +11:00
Nicholas Piggin	f4329f2ecb	powerpc/64s: Reduce exception alignment Exception handlers are aligned to 128 bytes (L1 cache) on 64s, which is overkill. It can reduce the icache footprint of any individual exception path. However taken as a whole, the expansion in icache footprint seems likely to be counter-productive and cause more total misses. Create IFETCH_ALIGN_SHIFT/BYTES, which should give optimal ifetch alignment with much more reasonable alignment. This saves 1792 bytes from head_64.o text with an allmodconfig build. Other subarchitectures should define appropriate IFETCH_ALIGN_SHIFT values if this becomes more widely used. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-14 11:11:51 +11:00
Rui Teng	fda0440d82	powerpc: Remove suspect CONFIG_PPC_BOOK3E #ifdefs in nohash/64/pgtable.h There are three #ifdef CONFIG_PPC_BOOK3E sections in nohash/64/pgtable.h. And there should be no configurations possible which use nohash/64/pgtable.h but don't also enable CONFIG_PPC_BOOK3E. Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Rui Teng <rui.teng@linux.vnet.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-14 11:11:51 +11:00
Andrew Donnellan	2ffd04dee0	powerpc/oops: Fix missing pr_cont()s in instruction dump Since the KERN_CONT changes, the current code in show_instructions() prints out a whole bunch of unnecessary newlines. Change occurrences of printk("\n") to pr_cont("\n"). While we're here, change all the other cases of printk(KERN_CONT ...) to pr_cont() as well. Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-12 20:12:51 +11:00
Michael Ellerman	7dae865f58	powerpc/oops: Fix missing pr_cont()s in show_regs() Fix up our oops output by converting continuation lines to use pr_cont(). Some of these are dubious, eg. printing a continuation line which starts with a newline, but seem to work OK for now. This whole function needs a rewrite in the next release. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-12 20:12:50 +11:00
Michael Ellerman	db5ba5ae6e	powerpc/oops: Fix missing pr_cont()s in print_msr_bits() et. al. Since the KERN_CONT changes these are being horribly split across lines, for example: MSR: 8000000000009033 < SF,EE ,ME,IR ,DR,RI ,LE> So fix it by using pr_cont() where appropriate. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-12 20:12:50 +11:00
Michael Ellerman	9a1f490f35	powerpc/oops: Fix missing pr_cont()s in show_stack() Previously we got away with printing the stack trace in multiple pieces and it usually looked right. But since commit `4bcc595ccd` ("printk: reinstate KERN_CONT for printing continuation lines"), KERN_CONT is now required when printing continuation lines. Use pr_cont() as appropriate. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-12 20:12:49 +11:00
Hugh Dickins	e6740ae631	powerpc: Fix exception vector build with 2.23 era binutils The changes to use gas sections for constructing the exception vectors causes a build break when using binutils 2.23: arch/powerpc/kernel/exceptions-64s.S:770: Error: operand out of range (0xffffffffffff8100 is not between 0x0000000000000000 and 0x000000000000ffff) And so on. Reported by Hugh with binutils-2.23.2-8.1.4.ppc64 from openSUSE 13.1 and also Naveen & Denis using 2.23.52.0.1-26.el7 from RHEL 7. Strangely binutils 2.22 (what I test with) is not affected. This is caused by the use of @l in LOAD_HANDLER(). The @l was only recently added in commit `a24553dd02` ("powerpc/pseries: Remove unnecessary syscall trampoline"). Luckily the gas section changes split out the LOAD_SYSCALL_HANDLER() macro, which means we actually don't need to use @l in LOAD_HANDLER() any more, only in LOAD_SYSCALL_HANDLER(). So drop the @l from LOAD_HANDLER(). Fixes: `57f266497d` ("powerpc: Use gas sections for arranging exception vectors") Signed-off-by: Hugh Dickins <hughd@google.com> [mpe: Add gory details to change log] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-12 20:12:49 +11:00
Nicholas Piggin	f23ed166f2	powerpc/64s: Fix system reset interrupt winkle wakeups Wakeups from winkle set the low bit of the HSPRG0 register, to distinguish it from other sleep states. This is also the PACA pointer. The system reset exception handler fails to mask this bit away before using this value before using it as the PACA pointer. Fix this by adding a new type of exception prolog macro where we already have the PACA set in r13, and have the system reset vector mask it out. The winkle wakeup handler will store the masked value back into HSPRG0. Fixes: `fb479e44a9` ("powerpc/64s: relocation, register save fixes for system reset interrupt") Cc: stable@vger.kernel.org # v3.0+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-12 20:12:42 +11:00
Ingo Molnar	4c8ee71620	Merge branch 'linus' into locking/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-11-11 08:25:07 +01:00
Linus Torvalds	2a26d99b25	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: "Lots of fixes, mostly drivers as is usually the case. 1) Don't treat zero DMA address as invalid in vmxnet3, from Alexey Khoroshilov. 2) Fix element timeouts in netfilter's nft_dynset, from Anders K. Pedersen. 3) Don't put aead_req crypto struct on the stack in mac80211, from Ard Biesheuvel. 4) Several uninitialized variable warning fixes from Arnd Bergmann. 5) Fix memory leak in cxgb4, from Colin Ian King. 6) Fix bpf handling of VLAN header push/pop, from Daniel Borkmann. 7) Several VRF semantic fixes from David Ahern. 8) Set skb->protocol properly in ip6_tnl_xmit(), from Eli Cooper. 9) Socket needs to be locked in udp_disconnect(), from Eric Dumazet. 10) Div-by-zero on 32-bit fix in mlx4 driver, from Eugenia Emantayev. 11) Fix stale link state during failover in NCSCI driver, from Gavin Shan. 12) Fix netdev lower adjacency list traversal, from Ido Schimmel. 13) Propvide proper handle when emitting notifications of filter deletes, from Jamal Hadi Salim. 14) Memory leaks and big-endian issues in rtl8xxxu, from Jes Sorensen. 15) Fix DESYNC_FACTOR handling in ipv6, from Jiri Bohac. 16) Several routing offload fixes in mlxsw driver, from Jiri Pirko. 17) Fix broadcast sync problem in TIPC, from Jon Paul Maloy. 18) Validate chunk len before using it in SCTP, from Marcelo Ricardo Leitner. 19) Revert a netns locking change that causes regressions, from Paul Moore. 20) Add recursion limit to GRO handling, from Sabrina Dubroca. 21) GFP_KERNEL in irq context fix in ibmvnic, from Thomas Falcon. 22) Avoid accessing stale vxlan/geneve socket in data path, from Pravin Shelar" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (189 commits) geneve: avoid using stale geneve socket. vxlan: avoid using stale vxlan socket. qede: Fix out-of-bound fastpath memory access net: phy: dp83848: add dp83822 PHY support enic: fix rq disable tipc: fix broadcast link synchronization problem ibmvnic: Fix missing brackets in init_sub_crq_irqs ibmvnic: Fix releasing of sub-CRQ IRQs in interrupt context Revert "ibmvnic: Fix releasing of sub-CRQ IRQs in interrupt context" arch/powerpc: Update parameters for csum_tcpudp_magic & csum_tcpudp_nofold net/mlx4_en: Save slave ethtool stats command net/mlx4_en: Fix potential deadlock in port statistics flow net/mlx4: Fix firmware command timeout during interrupt test net/mlx4_core: Do not access comm channel if it has not yet been initialized net/mlx4_en: Fix panic during reboot net/mlx4_en: Process all completions in RX rings after port goes up net/mlx4_en: Resolve dividing by zero in 32-bit system net/mlx4_core: Change the default value of enable_qos net/mlx4_core: Avoid setting ports to auto when only one port type is supported net/mlx4_core: Fix the resource-type enum in res tracker to conform to FW spec ...	2016-10-29 20:33:20 -07:00
Ivan Vecera	f9d4286b95	arch/powerpc: Update parameters for csum_tcpudp_magic & csum_tcpudp_nofold Commit `01cfbad` "ipv4: Update parameters for csum_tcpudp_magic to their original types" changed parameters for csum_tcpudp_magic and csum_tcpudp_nofold for many platforms but not for PowerPC. Fixes: `01cfbad` "ipv4: Update parameters for csum_tcpudp_magic to their original types" Cc: Alexander Duyck <aduyck@mirantis.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-10-29 17:06:23 -04:00
Linus Torvalds	6fcc8cea82	powerpc fixes for 4.9 #4 Fixes marked for stable: - Convert cmp to cmpd in idle enter sequence (Segher Boessenkool) - cxl: Fix leaking pid refs in some error paths (Vaibhav Jain) - Re-fix race condition between going idle and entering guest (Paul Mackerras) - Fix race condition in setting lock bit in idle/wakeup code (Paul Mackerras) - radix: Use tlbiel only if we ever ran on the current cpu (Aneesh Kumar K.V) - relocation, register save fixes for system reset interrupt (Nicholas Piggin) Fixes for code merged this cycle: - Fix CONFIG_ALIVEC typo in restore_tm_state() (Valentin Rothberg) - KVM: PPC: Book3S HV: Fix build error when SMP=n (Michael Ellerman) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJYE8erAAoJEFHr6jzI4aWA7/0QALRrO64SzxMmKzratYCgqJ3Q vfLI7nOFKWSKKV6DlGjQxqDlcvYVjnl4QpWHD+y+8eLgJBdF2HiDl3fG/q1afaGd so+eoHhbb/ZZ7xcT/5IaSkNS6QdHu8gxrRa4yXuTB8Zktl6j/ZVBFW0Skeuyd7ic yOwh6S38gKKNRq3q0rVeKFp4ONvg3PYIWKCUjIZssU5FB4lpMJ/i3cN5dTQ13Umz JjMxhZqyLjOBnTdSHEzTD0sJR8a0HwrubkPySwWSVxSKpEPAhva2Yjt9mW3PXE77 Pzz9qRnDJo28F7K5C2mpWrk6RuiYlnrzVmsFKO45BJrLXbzAzDC6bSeS3LYOyYII j5FpMrEnq5DeGXE12cfmcrZokws6SkCbYE0zOkqRkGm4q1HPkdZRLGV9DVMGkAcW kjmIC7+V0Gnak4fGbAjy4TpK0PhuyNEl+noCLpI9SRSdbrIa8ax9Ug0VfaatOChC ZOKDokW6STzuH+dBr+IUyfAJVGUfyvKq5GGqdTr1ejo816ILi+c0oKYVBQAanPWW nuR22j1pa2MaTRHhBpNBTGeuchWLFNUxTWEuG1Mu0FaM+NNoF9oy7AyathXWnrn0 OanDuFPynKzMQ0IsJTNbTbfOuEnGMtFNfqavuPjHje3l5kv9489ZeqpbLJmMvQIK 1RnAlzXUh+GcWfIT8qr4 =dnRQ -----END PGP SIGNATURE----- Merge tag 'powerpc-4.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Fixes marked for stable: - Convert cmp to cmpd in idle enter sequence (Segher Boessenkool) - cxl: Fix leaking pid refs in some error paths (Vaibhav Jain) - Re-fix race condition between going idle and entering guest (Paul Mackerras) - Fix race condition in setting lock bit in idle/wakeup code (Paul Mackerras) - radix: Use tlbiel only if we ever ran on the current cpu (Aneesh Kumar K.V) - relocation, register save fixes for system reset interrupt (Nicholas Piggin) Fixes for code merged this cycle: - Fix CONFIG_ALIVEC typo in restore_tm_state() (Valentin Rothberg) - KVM: PPC: Book3S HV: Fix build error when SMP=n (Michael Ellerman)" * tag 'powerpc-4.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s: relocation, register save fixes for system reset interrupt powerpc/mm/radix: Use tlbiel only if we ever ran on the current cpu powerpc/process: Fix CONFIG_ALIVEC typo in restore_tm_state() powerpc/64: Fix race condition in setting lock bit in idle/wakeup code powerpc/64: Re-fix race condition between going idle and entering guest cxl: Fix leaking pid refs in some error paths powerpc: Convert cmp to cmpd in idle enter sequence KVM: PPC: Book3S HV: Fix build error when SMP=n	2016-10-28 16:52:28 -07:00
Linus Torvalds	b49c3170bf	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Misc kernel fixes: a virtualization environment related fix, an uncore PMU driver removal handling fix, a PowerPC fix and new events for Knights Landing" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Honour the CPUID for number of fixed counters in hypervisors perf/powerpc: Don't call perf_event_disable() from atomic context perf/core: Protect PMU device removal with a 'pmu_bus_running' check, to fix CONFIG_DEBUG_TEST_DRIVER_REMOVE=y kernel panic perf/x86/intel/cstate: Add C-state residency events for Knights Landing	2016-10-28 16:27:16 -07:00
Jiri Olsa	5aab90ce1e	perf/powerpc: Don't call perf_event_disable() from atomic context The trinity syscall fuzzer triggered following WARN() on powerpc: WARNING: CPU: 9 PID: 2998 at arch/powerpc/kernel/hw_breakpoint.c:278 ... NIP [c00000000093aedc] .hw_breakpoint_handler+0x28c/0x2b0 LR [c00000000093aed8] .hw_breakpoint_handler+0x288/0x2b0 Call Trace: [c0000002f7933580] [c00000000093aed8] .hw_breakpoint_handler+0x288/0x2b0 (unreliable) [c0000002f7933630] [c0000000000f671c] .notifier_call_chain+0x7c/0xf0 [c0000002f79336d0] [c0000000000f6abc] .__atomic_notifier_call_chain+0xbc/0x1c0 [c0000002f7933780] [c0000000000f6c40] .notify_die+0x70/0xd0 [c0000002f7933820] [c00000000001a74c] .do_break+0x4c/0x100 [c0000002f7933920] [c0000000000089fc] handle_dabr_fault+0x14/0x48 Followed by a lockdep warning: =============================== [ INFO: suspicious RCU usage. ] 4.8.0-rc5+ #7 Tainted: G W ------------------------------- ./include/linux/rcupdate.h:556 Illegal context switch in RCU read-side critical section! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by ls/2998: #0: (rcu_read_lock){......}, at: [<c0000000000f6a00>] .__atomic_notifier_call_chain+0x0/0x1c0 #1: (rcu_read_lock){......}, at: [<c00000000093ac50>] .hw_breakpoint_handler+0x0/0x2b0 stack backtrace: CPU: 9 PID: 2998 Comm: ls Tainted: G W 4.8.0-rc5+ #7 Call Trace: [c0000002f7933150] [c00000000094b1f8] .dump_stack+0xe0/0x14c (unreliable) [c0000002f79331e0] [c00000000013c468] .lockdep_rcu_suspicious+0x138/0x180 [c0000002f7933270] [c0000000001005d8] .___might_sleep+0x278/0x2e0 [c0000002f7933300] [c000000000935584] .mutex_lock_nested+0x64/0x5a0 [c0000002f7933410] [c00000000023084c] .perf_event_ctx_lock_nested+0x16c/0x380 [c0000002f7933500] [c000000000230a80] .perf_event_disable+0x20/0x60 [c0000002f7933580] [c00000000093aeec] .hw_breakpoint_handler+0x29c/0x2b0 [c0000002f7933630] [c0000000000f671c] .notifier_call_chain+0x7c/0xf0 [c0000002f79336d0] [c0000000000f6abc] .__atomic_notifier_call_chain+0xbc/0x1c0 [c0000002f7933780] [c0000000000f6c40] .notify_die+0x70/0xd0 [c0000002f7933820] [c00000000001a74c] .do_break+0x4c/0x100 [c0000002f7933920] [c0000000000089fc] handle_dabr_fault+0x14/0x48 While it looks like the first WARN() is probably valid, the other one is triggered by disabling event via perf_event_disable() from atomic context. The event is disabled here in case we were not able to emulate the instruction that hit the breakpoint. By disabling the event we unschedule the event and make sure it's not scheduled back. But we can't call perf_event_disable() from atomic context, instead we need to use the event's pending_disable irq_work method to disable it. Reported-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Huang Ying <ying.huang@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Neuling <mikey@neuling.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20161026094824.GA21397@krava Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-10-28 11:06:25 +02:00
Nicholas Piggin	fb479e44a9	powerpc/64s: relocation, register save fixes for system reset interrupt This patch does a couple of things. First of all, powernv immediately explodes when running a relocated kernel, because the system reset exception for handling sleeps does not do correct relocated branches. Secondly, the sleep handling code trashes the condition and cfar registers, which we would like to preserve for debugging purposes (for non-sleep case exception). This patch changes the exception to use the standard format that saves registers before any tests or branches are made. It adds the test for idle-wakeup as an "extra" to break out of the normal exception path. Then it branches to a relocated idle handler that calls the various idle handling functions. After this patch, POWER8 CPU simulator now boots powernv kernel that is running at non-zero. Fixes: `948cf67c47` ("powerpc: Add NAP mode support on Power7 in HV mode") Cc: stable@vger.kernel.org # v3.0+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-27 21:55:14 +11:00
Aneesh Kumar K.V	bd77c44986	powerpc/mm/radix: Use tlbiel only if we ever ran on the current cpu Before this patch, we used tlbiel, if we ever ran only on this core. That was mostly derived from the nohash usage of the same. But is incorrect, the ISA 3.0 clarifies tlbiel such that: "All TLB entries that have all of the following properties are made invalid on the thread executing the tlbiel instruction" ie. tlbiel only invalidates TLB entries on the current thread. So if the mm has been used on any other thread (aka. cpu) then we must broadcast the invalidate. This bug could lead to invalid TLB entries if a program runs on multiple threads of a core. Hence use tlbiel, if we only ever ran on only the current cpu. Fixes: `1a472c9dba` ("powerpc/mm/radix: Add tlbflush routines") Cc: stable@vger.kernel.org # v4.7+ Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-27 21:55:13 +11:00
Valentin Rothberg	39715bf972	powerpc/process: Fix CONFIG_ALIVEC typo in restore_tm_state() It should be ALTIVEC, not ALIVEC. Cyril explains: If a thread performs a transaction with altivec and then gets preempted for whatever reason, this bug may cause the kernel to not re-enable altivec when that thread runs again. This will result in an altivec unavailable fault, when that fault happens inside a user transaction the kernel has no choice but to enable altivec and doom the transaction. The result is that transactions using altivec may get aborted more often than they should. The difficulty in catching this with a selftest is my deliberate use of the word may above. Optimisations to avoid FPU/altivec/VSX faults mean that the kernel will always leave them on for 255 switches. This code prevents the kernel turning it off if it got to the 256th switch (and userspace was transactional). Fixes: `dc16b553c9` ("powerpc: Always restore FPU/VEC/VSX if hardware transactional memory in use") Reviewed-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Valentin Rothberg <valentinrothberg@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-27 21:52:59 +11:00
Peter Zijlstra	890658b7ab	locking/mutex: Kill arch specific code Its all generic atomic_long_t stuff now. Tested-by: Jason Low <jason.low2@hpe.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2016-10-25 11:31:51 +02:00
Paul Mackerras	09b7e37b18	powerpc/64: Fix race condition in setting lock bit in idle/wakeup code This fixes a race condition where one thread that is entering or leaving a power-saving state can inadvertently ignore the lock bit that was set by another thread, and potentially also clear it. The core_idle_lock_held function is called when the lock bit is seen to be set. It polls the lock bit until it is clear, then does a lwarx to load the word containing the lock bit and thread idle bits so it can be updated. However, it is possible that the value loaded with the lwarx has the lock bit set, even though an immediately preceding lwz loaded a value with the lock bit clear. If this happens then we go ahead and update the word despite the lock bit being set, and when called from pnv_enter_arch207_idle_mode, we will subsequently clear the lock bit. No identifiable misbehaviour has been attributed to this race. This fixes it by checking the lock bit in the value loaded by the lwarx. If it is set then we just go back and keep on polling. Fixes: `b32aadc1a8` ("powerpc/powernv: Fix race in updating core_idle_state") Cc: stable@vger.kernel.org # v4.2+ Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-24 19:29:47 +11:00
Paul Mackerras	56c46222af	powerpc/64: Re-fix race condition between going idle and entering guest Commit `8117ac6a6c` ("powerpc/powernv: Switch off MMU before entering nap/sleep/rvwinkle mode", 2014-12-10) fixed a race condition where one thread entering a KVM guest could switch the MMU context to the guest while another thread was still in host kernel context with the MMU on. That commit moved the point where a thread entering a power-saving mode set its kvm_hstate.hwthread_state field in its PACA to KVM_HWTHREAD_IN_IDLE from a point where the MMU was on to after the MMU had been switched off. That commit also added a comment explaining that we have to switch to real mode before setting hwthread_state to avoid this race. Nevertheless, commit `4eae2c9ae5` ("powerpc/powernv: Make pnv_powersave_common more generic", 2016-07-08) subsequently moved the setting of hwthread_state back to a point where the MMU is on, thus reintroducing the race, despite the comment saying that this should not be done being included in full in the context lines of the patch that did it. This fixes the race again and adds a bigger and shoutier comment explaining the potential race condition. Fixes: `4eae2c9ae5` ("powerpc/powernv: Make pnv_powersave_common more generic") Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Reviewed-by: Shreyas B. Prabhu <shreyasbp@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-24 19:29:36 +11:00
Linus Torvalds	dcd4693cf4	powerpc fixes for 4.9 #3 Fixes marked for stable: - Prevent unlikely crash in copro_calculate_slb() (Frederic Barrat) - cxl: Prevent adapter reset if an active context exists (Vaibhav Jain) Fixes for code merged this cycle: - Fix boot on systems with uncompressed kernel image (Heiner Kallweit) - Drop dump_numa_memory_topology() (Michael Ellerman) - Fix numa topology console print (Aneesh Kumar K.V) - Ignore the pkey system calls for now (Stephen Rothwell) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJYCpJqAAoJEFHr6jzI4aWABS8QAJXuCjXrfNdQoiNmSHTOOUuj Z1KFIU/WjLa42VD2KIvW/OiTjzmrA9yl/PkNYD185yXu5DAE1h+lH0gBCA3KlSUc LwNneqn+3aGqmAX7jTm1HaWFCQt6mF0z3hwDPvEXhC4hcNjhe3mp3Q9/Q8idVfAJ f48vBa8qgJ5gpD5zVva5ujh1F2RUA+RQmhaR+LS19B+OH6xPzRp7VGUdsKRp75pI ILVCsjxA+DoaMOUK9quE5/9n9IK+N10QLfCqJu6HxJJ47nBxkiDPtdcndv0WTA9m kYTdqcv5o7A4+SrdXOkNBBHjj09UhdHBmhIrEt6286wyJ3thvDIhjMrX6OwJSCyb oB8PhXwjyUQrws19h4RNDToPG2Hr9A8BXVTofyPV4ku6gvucI03WFcVbHMWhAiLh lwR3Ppg4mHHAndL4oRlRhpvEVmBGwMuKEbisTa82T5RK4iPVWRcGqN6bltj9g6QX VXc8KQzKM+qEKQmDzdjExr0ZFq+USea96JmCJs6l9+M1nwe5CRCJAZyjp5LhVYRf ky9DSmp+nwIUxAQ73rv/NrjvRNZXCaUn4G+vpcSix7jrq6DqJoLSTEqpfw3Lfejj oJ1YxqD9SrNYhXChj071zLoDznZIviCxitLbQYVLt1Y72iLUXgt+s/y3JZWuxGrt EAmIXJq8fJHhHEd0TEW9 =39+z -----END PGP SIGNATURE----- Merge tag 'powerpc-4.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Fixes marked for stable: - Prevent unlikely crash in copro_calculate_slb() (Frederic Barrat) - cxl: Prevent adapter reset if an active context exists (Vaibhav Jain) Fixes for code merged this cycle: - Fix boot on systems with uncompressed kernel image (Heiner Kallweit) - Drop dump_numa_memory_topology() (Michael Ellerman) - Fix numa topology console print (Aneesh Kumar K.V) - Ignore the pkey system calls for now (Stephen Rothwell)" * tag 'powerpc-4.9-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc: Ignore the pkey system calls for now powerpc: Fix numa topology console print powerpc/mm: Drop dump_numa_memory_topology() cxl: Prevent adapter reset if an active context exists powerpc/boot: Fix boot on systems with uncompressed kernel image powerpc/mm: Prevent unlikely crash in copro_calculate_slb()	2016-10-21 19:13:00 -07:00
Segher Boessenkool	80f23935ca	powerpc: Convert cmp to cmpd in idle enter sequence PowerPC's "cmp" instruction has four operands. Normally people write "cmpw" or "cmpd" for the second cmp operand 0 or 1. But, frequently people forget, and write "cmp" with just three operands. With older binutils this is silently accepted as if this was "cmpw", while often "cmpd" is wanted. With newer binutils GAS will complain about this for 64-bit code. For 32-bit code it still silently assumes "cmpw" is what is meant. In this instance the code comes directly from ISA v2.07, including the cmp, but cmpd is correct. Backport to stable so that new toolchains can build old kernels. Fixes: `948cf67c47` ("powerpc: Add NAP mode support on Power7 in HV mode") Cc: stable@vger.kernel.org # v3.0 Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-22 08:44:38 +11:00
Michael Ellerman	62623d5f91	KVM: PPC: Book3S HV: Fix build error when SMP=n Commit `5d375199ea` ("KVM: PPC: Book3S HV: Set server for passed-through interrupts") broke the SMP=n build: arch/powerpc/kvm/book3s_hv_rm_xics.c:758:2: error: implicit declaration of function 'get_hard_smp_processor_id' That is because we lost the implicit include of asm/smp.h, so include it explicitly to get the definition for get_hard_smp_processor_id(). Fixes: `5d375199ea` ("KVM: PPC: Book3S HV: Set server for passed-through interrupts") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-22 08:44:37 +11:00
Linus Torvalds	63ae602cea	Merge branch 'gup_flag-cleanups' Merge the gup_flags cleanups from Lorenzo Stoakes: "This patch series adjusts functions in the get_user_pages* family such that desired FOLL_* flags are passed as an argument rather than implied by flags. The purpose of this change is to make the use of FOLL_FORCE explicit so it is easier to grep for and clearer to callers that this flag is being used. The use of FOLL_FORCE is an issue as it overrides missing VM_READ/VM_WRITE flags for the VMA whose pages we are reading from/writing to, which can result in surprising behaviour. The patch series came out of the discussion around commit `38e0885465` ("mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing"), which addressed a BUG_ON() being triggered when a page was faulted in with PROT_NONE set but having been overridden by FOLL_FORCE. do_numa_page() was run on the assumption the page _must_ be one marked for NUMA node migration as an actual PROT_NONE page would have been dealt with prior to this code path, however FOLL_FORCE introduced a situation where this assumption did not hold. See https://marc.info/?l=linux-mm&m=147585445805166 for the patch proposal" Additionally, there's a fix for an ancient bug related to FOLL_FORCE and FOLL_WRITE by me. [ This branch was rebased recently to add a few more acked-by's and reviewed-by's ] * gup_flag-cleanups: mm: replace access_process_vm() write parameter with gup_flags mm: replace access_remote_vm() write parameter with gup_flags mm: replace __access_remote_vm() write parameter with gup_flags mm: replace get_user_pages_remote() write/force parameters with gup_flags mm: replace get_user_pages() write/force parameters with gup_flags mm: replace get_vaddr_frames() write/force parameters with gup_flags mm: replace get_user_pages_locked() write/force parameters with gup_flags mm: replace get_user_pages_unlocked() write/force parameters with gup_flags mm: remove write/force parameters from __get_user_pages_unlocked() mm: remove write/force parameters from __get_user_pages_locked() mm: remove gup_flags FOLL_WRITE games from __get_user_pages()	2016-10-19 08:39:47 -07:00
Lorenzo Stoakes	f307ab6dce	mm: replace access_process_vm() write parameter with gup_flags This removes the 'write' argument from access_process_vm() and replaces it with 'gup_flags' as use of this function previously silently implied FOLL_FORCE, whereas after this patch callers explicitly pass this flag. We make this explicit as use of FOLL_FORCE can result in surprising behaviour (and hence bugs) within the mm subsystem. Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com> Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-10-19 08:31:25 -07:00
Stephen Rothwell	78914ff084	powerpc: Ignore the pkey system calls for now Eliminates warning messages: <stdin>:1316:2: warning: #warning syscall pkey_mprotect not implemented [-Wcpp] <stdin>:1319:2: warning: #warning syscall pkey_alloc not implemented [-Wcpp] <stdin>:1322:2: warning: #warning syscall pkey_free not implemented [-Wcpp] Hopefully we will remember to revert this commit if we ever implement them. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-19 20:36:24 +11:00
Aneesh Kumar K.V	8467801cc8	powerpc: Fix numa topology console print With recent update to printk, we get console output like below: [ 0.550639] Brought up 160 CPUs [ 0.550718] Node 0 CPUs: [ 0.550721] 0 [ 0.550754] -39 [ 0.550794] Node 1 CPUs: [ 0.550798] 40 [ 0.550817] -79 [ 0.550856] Node 16 CPUs: [ 0.550860] 80 [ 0.550880] -119 [ 0.550917] Node 17 CPUs: [ 0.550923] 120 [ 0.550942] -159 Fix this by properly using pr_cont(), ie. KERN_CONT. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-19 20:35:41 +11:00
Michael Ellerman	08b5e79ebd	powerpc/mm: Drop dump_numa_memory_topology() At boot we dump the NUMA memory topology in dump_numa_memory_topology(), at KERN_DEBUG level, resulting in output like: Node 0 Memory: 0x0-0x100000000 Node 1 Memory: 0x100000000-0x200000000 Which is nice enough, but immediately after that we iterate over each node and call setup_node_data(), which also prints out the node ranges, at KERN_INFO, giving eg: numa: Initmem setup node 0 [mem 0x00000000-0xffffffff] numa: Initmem setup node 1 [mem 0x100000000-0x1ffffffff] Additionally dump_numa_memory_topology() does not use KERN_CONT correctly, resulting in split output lines on recent kernels. So drop dump_numa_memory_topology() as superfluous chatter. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-19 20:35:40 +11:00
Heiner Kallweit	65bc3ece84	powerpc/boot: Fix boot on systems with uncompressed kernel image This commit broke boot on systems with an uncompressed kernel image, namely systems using a cuImage. On such systems the compressed boot image (boot wrapper, uncompressed kernel image, ..) is decompressed by u-boot already, therefore the boot wrapper code sees an uncompressed kernel image. The old decompression code silently assumed an uncompressed kernel image if it found no valid gzip signature, whilst the new code bailed out in this case. Fix this by re-introducing such a fallback if no valid compressed image is found. Fixes: `1b7898ee27` ("Use the pre-boot decompression API") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-19 20:35:34 +11:00
Frederic Barrat	d2cf909cda	powerpc/mm: Prevent unlikely crash in copro_calculate_slb() If a cxl adapter faults on an invalid address for a kernel context, we may enter copro_calculate_slb() with a NULL mm pointer (kernel context) and an effective address which looks like a user address. Which will cause a crash when dereferencing mm. It is clearly an AFU bug, but there's no reason to crash either. So return an error, so that cxl can ack the interrupt with an address error. Fixes: `73d16a6e0e` ("powerpc/cell: Move data segment faulting code out of cell platform") Cc: stable@vger.kernel.org # v3.18+ Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-19 20:32:49 +11:00
Linus Torvalds	9ffc66941d	This adds a new gcc plugin named "latent_entropy". It is designed to extract as much possible uncertainty from a running system at boot time as possible, hoping to capitalize on any possible variation in CPU operation (due to runtime data differences, hardware differences, SMP ordering, thermal timing variation, cache behavior, etc). At the very least, this plugin is a much more comprehensive example for how to manipulate kernel code using the gcc plugin internals. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 Comment: Kees Cook <kees@outflux.net> iQIcBAABCgAGBQJX/BAFAAoJEIly9N/cbcAmzW8QALFbCs7EFFkML+M/M/9d8zEk 1QbUs/z8covJTTT1PjSdw7JUrAMulI3S00owpcQVd/PcWjRPU80QwfsXBgIB0tvC Kub2qxn6Oaf+kTB646zwjFgjdCecw/USJP+90nfcu2+LCnE8ReclKd1aUee+Bnhm iDEUyH2ONIoWq6ta2Z9sA7+E4y2ZgOlmW0iga3Mnf+OcPtLE70fWPoe5E4g9DpYk B+kiPDrD9ql5zsHaEnKG1ldjiAZ1L6Grk8rGgLEXmbOWtTOFmnUhR+raK5NA/RCw MXNuyPay5aYPpqDHFm+OuaWQAiPWfPNWM3Ett4k0d9ZWLixTcD1z68AciExwk7aW SEA8b1Jwbg05ZNYM7NJB6t6suKC4dGPxWzKFOhmBicsh2Ni5f+Az0BQL6q8/V8/4 8UEqDLuFlPJBB50A3z5ngCVeYJKZe8Bg/Swb4zXl6mIzZ9darLzXDEV6ystfPXxJ e1AdBb41WC+O2SAI4l64yyeswkGo3Iw2oMbXG5jmFl6wY/xGp7dWxw7gfnhC6oOh afOT54p2OUDfSAbJaO0IHliWoIdmE5ZYdVYVU9Ek+uWyaIwcXhNmqRg+Uqmo32jf cP5J9x2kF3RdOcbSHXmFp++fU+wkhBtEcjkNpvkjpi4xyA47IWS7lrVBBebrCq9R pa/A7CNQwibIV6YD8+/p =1dUK -----END PGP SIGNATURE----- Merge tag 'gcc-plugins-v4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull gcc plugins update from Kees Cook: "This adds a new gcc plugin named "latent_entropy". It is designed to extract as much possible uncertainty from a running system at boot time as possible, hoping to capitalize on any possible variation in CPU operation (due to runtime data differences, hardware differences, SMP ordering, thermal timing variation, cache behavior, etc). At the very least, this plugin is a much more comprehensive example for how to manipulate kernel code using the gcc plugin internals" * tag 'gcc-plugins-v4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: latent_entropy: Mark functions with __latent_entropy gcc-plugins: Add latent_entropy plugin	2016-10-15 10:03:15 -07:00
Linus Torvalds	84d69848c9	Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull kbuild updates from Michal Marek: - EXPORT_SYMBOL for asm source by Al Viro. This does bring a regression, because genksyms no longer generates checksums for these symbols (CONFIG_MODVERSIONS). Nick Piggin is working on a patch to fix this. Plus, we are talking about functions like strcpy(), which rarely change prototypes. - Fixes for PPC fallout of the above by Stephen Rothwell and Nick Piggin - fixdep speedup by Alexey Dobriyan. - preparatory work by Nick Piggin to allow architectures to build with -ffunction-sections, -fdata-sections and --gc-sections - CONFIG_THIN_ARCHIVES support by Stephen Rothwell - fix for filenames with colons in the initramfs source by me. * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: (22 commits) initramfs: Escape colons in depfile ppc: there is no clear_pages to export powerpc/64: whitelist unresolved modversions CRCs kbuild: -ffunction-sections fix for archs with conflicting sections kbuild: add arch specific post-link Makefile kbuild: allow archs to select link dead code/data elimination kbuild: allow architectures to use thin archives instead of ld -r kbuild: Regenerate genksyms lexer kbuild: genksyms fix for typeof handling fixdep: faster CONFIG_ search ia64: move exports to definitions sparc32: debride memcpy.S a bit [sparc] unify 32bit and 64bit string.h sparc: move exports to definitions ppc: move exports to definitions arm: move exports to definitions s390: move exports to definitions m68k: move exports to definitions alpha: move exports to actual definitions x86: move exports to actual definitions ...	2016-10-14 14:26:58 -07:00
Linus Torvalds	f96ed26122	Merge branch 'for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata updates from Tejun Heo: - Write same support added - Minor ahci MSIX irq handling updates - Non-critical SCSI command translation fixes - Controller specific changes * 'for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: ahci: qoriq: Revert "ahci: qoriq: Disable NCQ on ls2080a SoC" libata: remove <asm-generic/libata-portmap.h> libata: remove unused definitions from <asm/libata-portmap.h> pata_at91: Use PTR_ERR_OR_ZERO rather than if(IS_ERR(...)) + PTR_ERR ata: Replace BUG() with BUG_ON(). ata: sata_mv: Replacing dma_pool_alloc and memset with a single call dma_pool_zalloc. libata: Some drives failing on SCT Write Same ahci: use pci_alloc_irq_vectors libata: SCT Write Same handle ATA_DFLAG_PIO libata: SCT Write Same / DSM Trim libata: Add support for SCT Write Same libata: Safely overwrite attached page in WRITE SAME xlat ahci: also use a per-port lock for the multi-MSIX case ARM: dts: STiH407-family: Add ports-implemented property in sata nodes ahci: st: Add ports-implemented property in support ahci: qoriq: enable snoopable sata read and write ahci: qoriq: adjust sata parameter libata-scsi: fix MODE SELECT translation for Control mode page libata-scsi: use u8 array to store mode page copy	2016-10-14 11:41:28 -07:00
Linus Torvalds	d8bfb96a2e	powerpc updates for 4.9 #2 Freescale updates from Scott: "Highlights include qbman support (a prerequisite for datapath drivers such as ethernet), a PCI DMA fix+improvement, reset handler changes, more 8xx optimizations, and some cleanups and fixes." Fixes: - selftests/powerpc: Add missing binaries to .gitignores (Michael Ellerman) - selftests/powerpc: Fix build break caused by EXPORT_SYMBOL changes (Michael Ellerman) - powerpc/pseries: Fix stack corruption in htpe code (Laurent Dufour) - powerpc/64s: Fix power4_fixup_nap placement (Nicholas Piggin) - powerpc/64: Fix incorrect return value from __copy_tofrom_user (Paul Mackerras) - powerpc/mm/hash64: Fix might_have_hea() check (Michael Ellerman) Other: - MAINTAINERS: Remove myself from PA Semi entries (Olof Johansson) - MAINTAINERS: Drop separate pseries entry (Michael Ellerman) - MAINTAINERS: Update powerpc website & add selftests (Michael Ellerman) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJX/1EOAAoJEFHr6jzI4aWAu0UQAIsmnc361a4xOl1ODRzJNWSD OqBbuGQPZ3I3XPMxB6BBK4mnpR507nb+L1yxMDidDebJal/pRJdXkGi7I5pe9uq+ 12XJcaePtVfmrHKKWAhC/fef0gQHSusBDpIIDquN5QE1BVvUDGbynG4GjnpX9ZaT gmXGL03u/yJvUoUNexG7lrMAJ7bZgU8BzFKyojzWtoEDF4SM7rpWKs1hGwojW4/T EYcek5uTNo01UsN/WNrtBkHA8eC9unnLk9NisOxvBXu7eJfEq38Bz71fhoowFO+C FDRboPdkXxySzzNTBb3hROontLZS2S13upzjcrRo2/f4gxvcimRJtDzxuRKrYX5n xdXcZVdFSRsKanbuV0Dwjki05IU4zeOhsHUqYqaS2UD+QlAbNCu0N9DZOhPMn+H2 8uT3cOOrBLBrhIH3e7DMK9Rx97FBeuCvwrbjnZp8My7s55VXXd2CZTFYf5/wW0b2 VEf5eoXM1BB2zuh9kFZ785Sq5iYnsKoNhKjoXULkBrf3m7WtmjPIbHzRTJM5ltwt YUvFMG6nncQB0ERVOvDIXXNzwVB0JkJTVX2BBZ2a7Fr+8KHE6rTYkgcQiosibUmq gLV9M59MFamAgJlna3A1OmGIpEiZ3RrriYL2mgraWwuLUn/qW3yPiPE6hBk0uomL cARvlIjGf8rWhi+3qAb1 =lwsd -----END PGP SIGNATURE----- Merge tag 'powerpc-4.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull more powerpc updates from Michael Ellerman: "Some more powerpc updates for 4.9: Freescale updates from Scott Wood: - qbman support (a prerequisite for datapath drivers such as ethernet) - a PCI DMA fix+improvement - reset handler changes - more 8xx optimizations - some cleanups and fixes.' Fixes: - selftests/powerpc: Add missing binaries to .gitignores (Michael Ellerman) - selftests/powerpc: Fix build break caused by EXPORT_SYMBOL changes (Michael Ellerman) - powerpc/pseries: Fix stack corruption in htpe code (Laurent Dufour) - powerpc/64s: Fix power4_fixup_nap placement (Nicholas Piggin) - powerpc/64: Fix incorrect return value from __copy_tofrom_user (Paul Mackerras) - powerpc/mm/hash64: Fix might_have_hea() check (Michael Ellerman) Other: - MAINTAINERS: Remove myself from PA Semi entries (Olof Johansson) - MAINTAINERS: Drop separate pseries entry (Michael Ellerman) - MAINTAINERS: Update powerpc website & add selftests (Michael Ellerman): * tag 'powerpc-4.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (35 commits) powerpc/mm/hash64: Fix might_have_hea() check powerpc/64: Fix incorrect return value from __copy_tofrom_user powerpc/64s: Fix power4_fixup_nap placement powerpc/pseries: Fix stack corruption in htpe code selftests/powerpc: Fix build break caused by EXPORT_SYMBOL changes MAINTAINERS: Update powerpc website & add selftests MAINTAINERS: Drop separate pseries entry MAINTAINERS: Remove myself from PA Semi entries selftests/powerpc: Add missing binaries to .gitignores arch/powerpc: Add CONFIG_FSL_DPAA to corenetXX_smp_defconfig soc/qman: Add self-test for QMan driver soc/bman: Add self-test for BMan driver soc/fsl: Introduce DPAA 1.x QMan device driver soc/fsl: Introduce DPAA 1.x BMan device driver powerpc/8xx: make user addr DTLB miss the short path powerpc/8xx: Move additional DTLBMiss handlers out of exception area powerpc/8xx: use r3 to scratch CR in ITLBmiss soc/fsl/qe: fix gpio save_regs functions powerpc/8xx: add dedicated machine check handler powerpc/8xx: add system_reset_exception ...	2016-10-14 11:07:42 -07:00
Mauricio Faria de Oliveira	af8a24988e	powerpc: implement the DMA_ATTR_NO_WARN attribute Add support for the DMA_ATTR_NO_WARN attribute on powerpc iommu code. Link: http://lkml.kernel.org/r/1470092390-25451-3-git-send-email-mauricfo@linux.vnet.ibm.com Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Keith Busch <keith.busch@intel.com> Cc: Jens Axboe <axboe@fb.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-10-11 15:06:32 -07:00
Michael Ellerman	08bf75ba85	powerpc/mm/hash64: Fix might_have_hea() check In commit `2b4e3ad8f5` ("powerpc/mm/hash64: Don't test for machine type to detect HEA special case") we changed the logic in might_have_hea() to check FW_FEATURE_SPLPAR rather than machine_is(pseries). However the check was incorrectly negated, leading to crashes on machines with HEA adapters, such as: mm: Hashing failure ! EA=0xd000080080004040 access=0x800000000000000e current=NetworkManager trap=0x300 vsid=0x13d349c ssize=1 base psize=2 psize 2 pte=0xc0003cc033e701ae Unable to handle kernel paging request for data at address 0xd000080080004040 Call Trace: .ehea_create_cq+0x148/0x340 [ehea] (unreliable) .ehea_up+0x258/0x1200 [ehea] .ehea_open+0x44/0x1a0 [ehea] ... Fix it by removing the negation. Fixes: `2b4e3ad8f5` ("powerpc/mm/hash64: Don't test for machine type to detect HEA special case") Cc: stable@vger.kernel.org # v4.8+ Reported-by: Denis Kirjanov <kda@linux-powerpc.org> Reported-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-12 08:32:28 +11:00
Paul Mackerras	1a34439e5a	powerpc/64: Fix incorrect return value from __copy_tofrom_user Debugging a data corruption issue with virtio-net/vhost-net led to the observation that __copy_tofrom_user was occasionally returning a value 16 larger than it should. Since the return value from __copy_tofrom_user is the number of bytes not copied, this means that __copy_tofrom_user can occasionally return a value larger than the number of bytes it was asked to copy. In turn this can cause higher-level copy functions such as copy_page_to_iter_iovec to corrupt memory by copying data into the wrong memory locations. It turns out that the failing case involves a fault on the store at label 79, and at that point the first unmodified byte of the destination is at R3 + 16. Consequently the exception handler for that store needs to add 16 to R3 before using it to work out how many bytes were not copied, but in this one case it was not adding the offset to R3. To fix it, this moves the label 179 to the point where we add 16 to R3. I have checked manually all the exception handlers for the loads and stores in this code and the rest of them are correct (it would be excellent to have an automated test of all the exception cases). This bug has been present since this code was initially committed in May 2002 to Linux version 2.5.20. Cc: stable@vger.kernel.org Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-12 08:31:37 +11:00
Nicholas Piggin	7c8cb4b50f	powerpc/64s: Fix power4_fixup_nap placement power4_fixup_nap is called from the "common" handlers, not the virt/real handlers, therefore it should itself be a common handler. Placing it down in the trampoline space caused it to go out of reach of its callers, requiring a trampoline inserted at the start of the text section, which breaks the fixed section address calculations. Fixes: `da2bc4644c` ("powerpc/64s: Add new exception vector macros") Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-11 21:28:30 +11:00
Laurent Dufour	05af40e885	powerpc/pseries: Fix stack corruption in htpe code This commit fixes a stack corruption in the pseries specific code dealing with the huge pages. In __pSeries_lpar_hugepage_invalidate() the buffer used to pass arguments to the hypervisor is not large enough. This leads to a stack corruption where a previously saved register could be corrupted leading to unexpected result in the caller, like the following panic: Oops: Kernel access of bad area, sig: 11 [#1] SMP NR_CPUS=2048 NUMA pSeries Modules linked in: virtio_balloon ip_tables x_tables autofs4 virtio_blk 8139too virtio_pci virtio_ring 8139cp virtio CPU: 11 PID: 1916 Comm: mmstress Not tainted 4.8.0 #76 task: c000000005394880 task.stack: c000000005570000 NIP: c00000000027bf6c LR: c00000000027bf64 CTR: 0000000000000000 REGS: c000000005573820 TRAP: 0300 Not tainted (4.8.0) MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 84822884 XER: 20000000 CFAR: c00000000010a924 DAR: 420000000014e5e0 DSISR: 40000000 SOFTE: 1 GPR00: c00000000027bf64 c000000005573aa0 c000000000e02800 c000000004447964 GPR04: c00000000404de18 c000000004d38810 00000000042100f5 00000000f5002104 GPR08: e0000000f5002104 0000000000000001 042100f5000000e0 00000000042100f5 GPR12: 0000000000002200 c00000000fe02c00 c00000000404de18 0000000000000000 GPR16: c1ffffffffffe7ff 00003fff62000000 420000000014e5e0 00003fff63000000 GPR20: 0008000000000000 c0000000f7014800 0405e600000000e0 0000000000010000 GPR24: c000000004d38810 c000000004447c10 c00000000404de18 c000000004447964 GPR28: c000000005573b10 c000000004d38810 00003fff62000000 420000000014e5e0 NIP [c00000000027bf6c] zap_huge_pmd+0x4c/0x470 LR [c00000000027bf64] zap_huge_pmd+0x44/0x470 Call Trace: [c000000005573aa0] [c00000000027bf64] zap_huge_pmd+0x44/0x470 (unreliable) [c000000005573af0] [c00000000022bbd8] unmap_page_range+0xcf8/0xed0 [c000000005573c30] [c00000000022c2d4] unmap_vmas+0x84/0x120 [c000000005573c80] [c000000000235448] unmap_region+0xd8/0x1b0 [c000000005573d80] [c0000000002378f0] do_munmap+0x2d0/0x4c0 [c000000005573df0] [c000000000237be4] SyS_munmap+0x64/0xb0 [c000000005573e30] [c000000000009560] system_call+0x38/0x108 Instruction dump: fbe1fff8 fb81ffe0 7c7f1b78 7ca32b78 7cbd2b78 f8010010 7c9a2378 f821ffb1 7cde3378 4bfffea9 7c7b1b79 41820298 <e87f0000> 48000130 7fa5eb78 7fc4f378 Most of the time, the bug is surfacing in a caller up in the stack from __pSeries_lpar_hugepage_invalidate() which is quite confusing. This bug is pending since v3.11 but was hidden if a caller of the caller of __pSeries_lpar_hugepage_invalidate() has pushed the corruped register (r18 in this case) in the stack and is not using it until restoring it. GCC 6.2.0 seems to raise it more frequently. This commit also change the definition of the parameter buffer in pSeries_lpar_flush_hash_range() to rely on the global define PLPAR_HCALL9_BUFSIZE (no functional change here). Fixes: `1a5272866f` ("powerpc: Optimize hugepage invalidate") Cc: stable@vger.kernel.org # v3.11+ Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-11 21:23:23 +11:00
Michael Ellerman	065397a969	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next Freescale updates from Scott: "Highlights include qbman support (a prerequisite for datapath drivers such as ethernet), a PCI DMA fix+improvement, reset handler changes, more 8xx optimizations, and some cleanups and fixes."	2016-10-11 20:07:56 +11:00
Linus Torvalds	101105b171	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more vfs updates from Al Viro: ">rename2() work from Miklos + current_time() from Deepa" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: Replace current_fs_time() with current_time() fs: Replace CURRENT_TIME_SEC with current_time() for inode timestamps fs: Replace CURRENT_TIME with current_time() for inode timestamps fs: proc: Delete inode time initializations in proc_alloc_inode() vfs: Add current_time() api vfs: add note about i_op->rename changes to porting fs: rename "rename2" i_op to "rename" vfs: remove unused i_op->rename fs: make remaining filesystems use .rename2 libfs: support RENAME_NOREPLACE in simple_rename() fs: support RENAME_NOREPLACE for local filesystems ncpfs: fix unused variable warning	2016-10-10 20:16:43 -07:00
Al Viro	3873691e5a	Merge remote-tracking branch 'ovl/rename2' into for-linus	2016-10-10 23:02:51 -04:00
Emese Revfy	38addce8b6	gcc-plugins: Add latent_entropy plugin This adds a new gcc plugin named "latent_entropy". It is designed to extract as much possible uncertainty from a running system at boot time as possible, hoping to capitalize on any possible variation in CPU operation (due to runtime data differences, hardware differences, SMP ordering, thermal timing variation, cache behavior, etc). At the very least, this plugin is a much more comprehensive example for how to manipulate kernel code using the gcc plugin internals. The need for very-early boot entropy tends to be very architecture or system design specific, so this plugin is more suited for those sorts of special cases. The existing kernel RNG already attempts to extract entropy from reliable runtime variation, but this plugin takes the idea to a logical extreme by permuting a global variable based on any variation in code execution (e.g. a different value (and permutation function) is used to permute the global based on loop count, case statement, if/then/else branching, etc). To do this, the plugin starts by inserting a local variable in every marked function. The plugin then adds logic so that the value of this variable is modified by randomly chosen operations (add, xor and rol) and random values (gcc generates separate static values for each location at compile time and also injects the stack pointer at runtime). The resulting value depends on the control flow path (e.g., loops and branches taken). Before the function returns, the plugin mixes this local variable into the latent_entropy global variable. The value of this global variable is added to the kernel entropy pool in do_one_initcall() and _do_fork(), though it does not credit any bytes of entropy to the pool; the contents of the global are just used to mix the pool. Additionally, the plugin can pre-initialize arrays with build-time random contents, so that two different kernel builds running on identical hardware will not have the same starting values. Signed-off-by: Emese Revfy <re.emese@gmail.com> [kees: expanded commit message and code comments] Signed-off-by: Kees Cook <keescook@chromium.org>	2016-10-10 14:51:44 -07:00
Linus Torvalds	30066ce675	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "Here is the crypto update for 4.9: API: - The crypto engine code now supports hashes. Algorithms: - Allow keys >= 2048 bits in FIPS mode for RSA. Drivers: - Memory overwrite fix for vmx ghash. - Add support for building ARM sha1-neon in Thumb2 mode. - Reenable ARM ghash-ce code by adding import/export. - Reenable img-hash by adding import/export. - Add support for multiple cores in omap-aes. - Add little-endian support for sha1-powerpc. - Add Cavium HWRNG driver for ThunderX SoC" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (137 commits) crypto: caam - treat SGT address pointer as u64 crypto: ccp - Make syslog errors human-readable crypto: ccp - clean up data structure crypto: vmx - Ensure ghash-generic is enabled crypto: testmgr - add guard to dst buffer for ahash_export crypto: caam - Unmap region obtained by of_iomap crypto: sha1-powerpc - little-endian support crypto: gcm - Fix IV buffer size in crypto_gcm_setkey crypto: vmx - Fix memory corruption caused by p8_ghash crypto: ghash-generic - move common definitions to a new header file crypto: caam - fix sg dump hwrng: omap - Only fail if pm_runtime_get_sync returns < 0 crypto: omap-sham - shrink the internal buffer size crypto: omap-sham - add support for export/import crypto: omap-sham - convert driver logic to use sgs for data xmit crypto: omap-sham - change the DMA threshold value to a define crypto: omap-sham - add support functions for sg based data handling crypto: omap-sham - rename sgl to sgl_tmp for deprecation crypto: omap-sham - align algorithms on word offset crypto: omap-sham - add context export/import stubs ...	2016-10-10 14:04:16 -07:00
Linus Torvalds	b66484cd74	Merge branch 'akpm' (patches from Andrew) Merge updates from Andrew Morton: - fsnotify updates - ocfs2 updates - all of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (127 commits) console: don't prefer first registered if DT specifies stdout-path cred: simpler, 1D supplementary groups CREDITS: update Pavel's information, add GPG key, remove snail mail address mailmap: add Johan Hovold .gitattributes: set git diff driver for C source code files uprobes: remove function declarations from arch/{mips,s390} spelling.txt: "modeled" is spelt correctly nmi_backtrace: generate one-line reports for idle cpus arch/tile: adopt the new nmi_backtrace framework nmi_backtrace: do a local dump_stack() instead of a self-NMI nmi_backtrace: add more trigger__cpu_backtrace() methods min/max: remove sparse warnings when they're nested Documentation/filesystems/proc.txt: add more description for maps/smaps mm, proc: fix region lost in /proc/self/smaps proc: fix timerslack_ns CAP_SYS_NICE check when adjusting self proc: add LSM hook checks to /proc/<tid>/timerslack_ns proc: relax /proc/<tid>/timerslack_ns capability requirements meminfo: break apart a very long seq_printf with #ifdefs seq/proc: modify seq_put_decimal_[u]ll to take a const char , not char proc: faster /proc/*/status ...	2016-10-07 21:38:00 -07:00
Linus Torvalds	07021b4359	powerpc updates for 4.9 Highlights: - Major rework of Book3S 64-bit exception vectors (Nicholas Piggin) - Use gas sections for arranging exception vectors et. al. - Large set of TM cleanups and selftests (Cyril Bur) - Enable transactional memory (TM) lazily for userspace (Cyril Bur) - Support for XZ compression in the zImage wrapper (Oliver O'Halloran) - Add support for bpf constant blinding (Naveen N. Rao) - Beginnings of upstream support for PA Semi Nemo motherboards (Darren Stevens) Fixes: - Ensure .mem(init\|exit).text are within _stext/_etext (Michael Ellerman) - xmon: Don't use ld on 32-bit (Michael Ellerman) - vdso64: Use double word compare on pointers (Anton Blanchard) - powerpc/nvram: Fix an incorrect partition merge (Pan Xinhui) - powerpc: Fix usage of _PAGE_RO in hugepage (Christophe Leroy) - powerpc/mm: Update FORCE_MAX_ZONEORDER range to allow hugetlb w/4K (Aneesh Kumar K.V) - Fix memory leak in queue_hotplug_event() error path (Andrew Donnellan) - Replay hypervisor maintenance interrupt first (Nicholas Piggin) Cleanups & features: - Sparse fixes/cleanups (Daniel Axtens) - Preserve CFAR value on SLB miss caused by access to bogus address (Paul Mackerras) - Radix MMU fixups for POWER9 (Aneesh Kumar K.V) - Support for setting used_(vsr\|vr\|spe) in sigreturn path (for CRIU) (Simon Guo) - Optimise syscall entry for virtual, relocatable case (Nicholas Piggin) - Optimise MSR handling in exception handling (Nicholas Piggin) - Support for kexec with Radix MMU (Benjamin Herrenschmidt) - powernv EEH fixes (Russell Currey) - Suprise PCI hotplug support for powernv (Gavin Shan) - Endian/sparse fixes for powernv PCI (Gavin Shan) - Defconfig updates (Anton Blanchard) - Various performance optimisations (Anton Blanchard) - Align hot loops of memset() and backwards_memcpy() - During context switch, check before setting mm_cpumask - Remove static branch prediction in atomic{, 64}_add_unless - Only disable HAVE_EFFICIENT_UNALIGNED_ACCESS on POWER7 little endian - Set default CPU type to POWER8 for little endian builds - KVM: PPC: Book3S HV: Migrate pinned pages out of CMA (Balbir Singh) - cxl: Flush PSL cache before resetting the adapter (Frederic Barrat) - cxl: replace loop with for_each_child_of_node(), remove unneeded of_node_put() (Andrew Donnellan) - Fix HV facility unavailable to use correct handler (Nicholas Piggin) - Remove unnecessary syscall trampoline (Nicholas Piggin) - fadump: Fix build break when CONFIG_PROC_VMCORE=n (Michael Ellerman) - Quieten EEH message when no adapters are found (Anton Blanchard) - powernv: Add PHB register dump debugfs handle (Russell Currey) - Use kprobe blacklist for exception handlers & asm functions (Nicholas Piggin) - Document the syscall ABI (Nicholas Piggin) - MAINTAINERS: Update cxl maintainers (Michael Neuling) - powerpc: Remove all usages of NO_IRQ (Michael Ellerman) Minor cleanups: - Andrew Donnellan, Christophe Leroy, Colin Ian King, Cyril Bur, Frederic Barrat, Pan Xinhui, PrasannaKumar Muralidharan, Rui Teng, Simon Guo. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJX9x5ZAAoJEFHr6jzI4aWAWQ0P+gOhdtayMsRY0k0dzPmYaFr0 Ha5v968RJaNIyGGM9ARJg8h27PGMaSlBp/9zaYdk1G7xfv/DMR0uq8d8l5pjy/Zw Jm72WE4PEX/zAcQxry6Y2fDdumO09crTBA/W0hM1UZzqu0bcVUfD+E51ZFYWW7yh fyhT2YnlucxIcT34pxsLqwTIiZYG4xgN3+YGo0wohY1D1GHE3UZ7SXIglb49yM6v ZeXrL7SOdERR1w88rC+g99P/cWng5HDS0wPLUbxGT5KIpoOSXOs7EbZwFqQBUy5O 37PB07K5dDyUbrm++l5lUigldF3W1OZQBN5+n8PciulxxwFX84pllTlAxv1p60JR piEKZ8pl023IF7zMGatUG9qcNOcnbxdMsAhoEhlcFi9ulM/yLzbmRTKVfDYm+O/J UI+YtcbsgdyOXMdGXCqdpeBNuuypgLG/g7gC8bnk3taS0LUUZLcXtRNuE4tcPJJe v8FnszaLkjAi83Lmzt3fgZo7DI1RIPwDSw6fY+nBrxCRfEPRVx3f7KhmUXvSeol5 Ln9xpk4AtyQt1RHhckxXwWSUgvXVg2ltmz7ElqK4sQ9mO/D2ZIs6R6fPY4VlJLc4 /2yIV4RLIsbHmdv9IbJ8PBp0VTugSNdicZ904QiAHSZQv/i1mgYuXw3tjR6kuy9f bKOzNJTwLV1WUsOlUpiq =Jnn8 -----END PGP SIGNATURE----- Merge tag 'powerpc-4.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Highlights: - Major rework of Book3S 64-bit exception vectors (Nicholas Piggin) - Use gas sections for arranging exception vectors et. al. - Large set of TM cleanups and selftests (Cyril Bur) - Enable transactional memory (TM) lazily for userspace (Cyril Bur) - Support for XZ compression in the zImage wrapper (Oliver O'Halloran) - Add support for bpf constant blinding (Naveen N. Rao) - Beginnings of upstream support for PA Semi Nemo motherboards (Darren Stevens) Fixes: - Ensure .mem(init\|exit).text are within _stext/_etext (Michael Ellerman) - xmon: Don't use ld on 32-bit (Michael Ellerman) - vdso64: Use double word compare on pointers (Anton Blanchard) - powerpc/nvram: Fix an incorrect partition merge (Pan Xinhui) - powerpc: Fix usage of _PAGE_RO in hugepage (Christophe Leroy) - powerpc/mm: Update FORCE_MAX_ZONEORDER range to allow hugetlb w/4K (Aneesh Kumar K.V) - Fix memory leak in queue_hotplug_event() error path (Andrew Donnellan) - Replay hypervisor maintenance interrupt first (Nicholas Piggin) Various performance optimisations (Anton Blanchard): - Align hot loops of memset() and backwards_memcpy() - During context switch, check before setting mm_cpumask - Remove static branch prediction in atomic{, 64}_add_unless - Only disable HAVE_EFFICIENT_UNALIGNED_ACCESS on POWER7 little endian - Set default CPU type to POWER8 for little endian builds Cleanups & features: - Sparse fixes/cleanups (Daniel Axtens) - Preserve CFAR value on SLB miss caused by access to bogus address (Paul Mackerras) - Radix MMU fixups for POWER9 (Aneesh Kumar K.V) - Support for setting used_(vsr\|vr\|spe) in sigreturn path (for CRIU) (Simon Guo) - Optimise syscall entry for virtual, relocatable case (Nicholas Piggin) - Optimise MSR handling in exception handling (Nicholas Piggin) - Support for kexec with Radix MMU (Benjamin Herrenschmidt) - powernv EEH fixes (Russell Currey) - Suprise PCI hotplug support for powernv (Gavin Shan) - Endian/sparse fixes for powernv PCI (Gavin Shan) - Defconfig updates (Anton Blanchard) - KVM: PPC: Book3S HV: Migrate pinned pages out of CMA (Balbir Singh) - cxl: Flush PSL cache before resetting the adapter (Frederic Barrat) - cxl: replace loop with for_each_child_of_node(), remove unneeded of_node_put() (Andrew Donnellan) - Fix HV facility unavailable to use correct handler (Nicholas Piggin) - Remove unnecessary syscall trampoline (Nicholas Piggin) - fadump: Fix build break when CONFIG_PROC_VMCORE=n (Michael Ellerman) - Quieten EEH message when no adapters are found (Anton Blanchard) - powernv: Add PHB register dump debugfs handle (Russell Currey) - Use kprobe blacklist for exception handlers & asm functions (Nicholas Piggin) - Document the syscall ABI (Nicholas Piggin) - MAINTAINERS: Update cxl maintainers (Michael Neuling) - powerpc: Remove all usages of NO_IRQ (Michael Ellerman) Minor cleanups: - Andrew Donnellan, Christophe Leroy, Colin Ian King, Cyril Bur, Frederic Barrat, Pan Xinhui, PrasannaKumar Muralidharan, Rui Teng, Simon Guo" * tag 'powerpc-4.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (156 commits) powerpc/bpf: Add support for bpf constant blinding powerpc/bpf: Implement support for tail calls powerpc/bpf: Introduce accessors for using the tmp local stack space powerpc/fadump: Fix build break when CONFIG_PROC_VMCORE=n powerpc: tm: Enable transactional memory (TM) lazily for userspace powerpc/tm: Add TM Unavailable Exception powerpc: Remove do_load_up_transact_{fpu,altivec} powerpc: tm: Rename transct_(*) to ck(\1)_state powerpc: tm: Always use fp_state and vr_state to store live registers selftests/powerpc: Add checks for transactional VSXs in signal contexts selftests/powerpc: Add checks for transactional VMXs in signal contexts selftests/powerpc: Add checks for transactional FPUs in signal contexts selftests/powerpc: Add checks for transactional GPRs in signal contexts selftests/powerpc: Check that signals always get delivered selftests/powerpc: Add TM tcheck helpers in C selftests/powerpc: Allow tests to extend their kill timeout selftests/powerpc: Introduce GPR asm helper header file selftests/powerpc: Move VMX stack frame macros to header file selftests/powerpc: Rework FPU stack placement macros and move to header file selftests/powerpc: Check for VSX preservation across userspace preemption ...	2016-10-07 20:19:31 -07:00
Chris Metcalf	6727ad9e20	nmi_backtrace: generate one-line reports for idle cpus When doing an nmi backtrace of many cores, most of which are idle, the output is a little overwhelming and very uninformative. Suppress messages for cpus that are idling when they are interrupted and just emit one line, "NMI backtrace for N skipped: idling at pc 0xNNN". We do this by grouping all the cpuidle code together into a new .cpuidle.text section, and then checking the address of the interrupted PC to see if it lies within that section. This commit suitably tags x86 and tile idle routines, and only adds in the minimal framework for other architectures. Link: http://lkml.kernel.org/r/1472487169-14923-5-git-send-email-cmetcalf@mellanox.com Signed-off-by: Chris Metcalf <cmetcalf@mellanox.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Daniel Thompson <daniel.thompson@linaro.org> [arm] Tested-by: Petr Mladek <pmladek@suse.com> Cc: Aaron Tomlin <atomlin@redhat.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Russell King <linux@arm.linux.org.uk> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-10-07 18:46:30 -07:00
Vineet Gupta	51a021244b	atomic64: no need for CONFIG_ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE This came to light when implementing native 64-bit atomics for ARCv2. The atomic64 self-test code uses CONFIG_ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE to check whether atomic64_dec_if_positive() is available. It seems it was needed when not every arch defined it. However as of current code the Kconfig option seems needless - for CONFIG_GENERIC_ATOMIC64 it is auto-enabled in lib/Kconfig and a generic definition of API is present lib/atomic64.c - arches with native 64-bit atomics select it in arch/*/Kconfig and define the API in their headers So I see no point in keeping the Kconfig option Compile tested for: - blackfin (CONFIG_GENERIC_ATOMIC64) - x86 (!CONFIG_GENERIC_ATOMIC64) - ia64 Link: http://lkml.kernel.org/r/1473703083-8625-3-git-send-email-vgupta@synopsys.com Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@mellanox.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Zhaoxiu Zeng <zhaoxiu.zeng@gmail.com> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Alexander Potapenko <glider@google.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ming Lin <ming.l@ssi.samsung.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@suse.de> Cc: Andi Kleen <ak@linux.intel.com> Cc: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-10-07 18:46:30 -07:00
Srikar Dronamraju	1e76609cc1	powerpc: implement arch_reserved_kernel_pages Currently significant amount of memory is reserved only in kernel booted to capture kernel dump using the fa_dump method. Kernels compiled with CONFIG_DEFERRED_STRUCT_PAGE_INIT will initialize only certain size memory per node. The certain size takes into account the dentry and inode cache sizes. Currently the cache sizes are calculated based on the total system memory including the reserved memory. However such a kernel when booting the same kernel as fadump kernel will not be able to allocate the required amount of memory to suffice for the dentry and inode caches. This results in crashes like Hence only implement arch_reserved_kernel_pages() for CONFIG_FA_DUMP configurations. The amount reserved will be reduced while calculating the large caches and will avoid crashes like the below on large systems such as 32 TB systems. Dentry cache hash table entries: 536870912 (order: 16, 4294967296 bytes) vmalloc: allocation failure, allocated 4097114112 of 17179934720 bytes swapper/0: page allocation failure: order:0, mode:0x2080020(GFP_ATOMIC) CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6-master+ #3 Call Trace: dump_stack+0xb0/0xf0 (unreliable) warn_alloc_failed+0x114/0x160 __vmalloc_node_range+0x304/0x340 __vmalloc+0x6c/0x90 alloc_large_system_hash+0x1b8/0x2c0 inode_init+0x94/0xe4 vfs_caches_init+0x8c/0x13c start_kernel+0x50c/0x578 start_here_common+0x20/0xa8 Link: http://lkml.kernel.org/r/1472476010-4709-4-git-send-email-srikar@linux.vnet.ibm.com Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Suggested-by: Mel Gorman <mgorman@techsingularity.net> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Michal Hocko <mhocko@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-10-07 18:46:28 -07:00
Linus Torvalds	6218590bcb	KVM updates for v4.9-rc1 All architectures: Move `make kvmconfig` stubs from x86; use 64 bits for debugfs stats. ARM: Important fixes for not using an in-kernel irqchip; handle SError exceptions and present them to guests if appropriate; proxying of GICV access at EL2 if guest mappings are unsafe; GICv3 on AArch32 on ARMv8; preparations for GICv3 save/restore, including ABI docs; cleanups and a bit of optimizations. MIPS: A couple of fixes in preparation for supporting MIPS EVA host kernels; MIPS SMP host & TLB invalidation fixes. PPC: Fix the bug which caused guests to falsely report lockups; other minor fixes; a small optimization. s390: Lazy enablement of runtime instrumentation; up to 255 CPUs for nested guests; rework of machine check deliver; cleanups and fixes. x86: IOMMU part of AMD's AVIC for vmexit-less interrupt delivery; Hyper-V TSC page; per-vcpu tsc_offset in debugfs; accelerated INS/OUTS in nVMX; cleanups and fixes. -----BEGIN PGP SIGNATURE----- iQEcBAABCAAGBQJX9iDrAAoJEED/6hsPKofoOPoIAIUlgojkb9l2l1XVDgsXdgQL sRVhYSVv7/c8sk9vFImrD5ElOPZd+CEAIqFOu45+NM3cNi7gxip9yftUVs7wI5aC eDZRWm1E4trDZLe54ZM9ThcqZzZZiELVGMfR1+ZndUycybwyWzafpXYsYyaXp3BW hyHM3qVkoWO3dxBWFwHIoO/AUJrWYkRHEByKyvlC6KPxSdBPSa5c1AQwMCoE0Mo4 K/xUj4gBn9eMelNhg4Oqu/uh49/q+dtdoP2C+sVM8bSdquD+PmIeOhPFIcuGbGFI B+oRpUhIuntN39gz8wInJ4/GRSeTuR2faNPxMn4E1i1u4LiuJvipcsOjPfe0a18= =fZRB -----END PGP SIGNATURE----- Merge tag 'kvm-4.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM updates from Radim Krčmář: "All architectures: - move `make kvmconfig` stubs from x86 - use 64 bits for debugfs stats ARM: - Important fixes for not using an in-kernel irqchip - handle SError exceptions and present them to guests if appropriate - proxying of GICV access at EL2 if guest mappings are unsafe - GICv3 on AArch32 on ARMv8 - preparations for GICv3 save/restore, including ABI docs - cleanups and a bit of optimizations MIPS: - A couple of fixes in preparation for supporting MIPS EVA host kernels - MIPS SMP host & TLB invalidation fixes PPC: - Fix the bug which caused guests to falsely report lockups - other minor fixes - a small optimization s390: - Lazy enablement of runtime instrumentation - up to 255 CPUs for nested guests - rework of machine check deliver - cleanups and fixes x86: - IOMMU part of AMD's AVIC for vmexit-less interrupt delivery - Hyper-V TSC page - per-vcpu tsc_offset in debugfs - accelerated INS/OUTS in nVMX - cleanups and fixes" * tag 'kvm-4.9-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (140 commits) KVM: MIPS: Drop dubious EntryHi optimisation KVM: MIPS: Invalidate TLB by regenerating ASIDs KVM: MIPS: Split kernel/user ASID regeneration KVM: MIPS: Drop other CPU ASIDs on guest MMU changes KVM: arm/arm64: vgic: Don't flush/sync without a working vgic KVM: arm64: Require in-kernel irqchip for PMU support KVM: PPC: Book3s PR: Allow access to unprivileged MMCR2 register KVM: PPC: Book3S PR: Support 64kB page size on POWER8E and POWER8NVL KVM: PPC: Book3S: Remove duplicate setting of the B field in tlbie KVM: PPC: BookE: Fix a sanity check KVM: PPC: Book3S HV: Take out virtual core piggybacking code KVM: PPC: Book3S: Treat VTB as a per-subcore register, not per-thread ARM: gic-v3: Work around definition of gic_write_bpr1 KVM: nVMX: Fix the NMI IDT-vectoring handling KVM: VMX: Enable MSR-BASED TPR shadow even if APICv is inactive KVM: nVMX: Fix reload apic access page warning kvmconfig: add virtio-gpu to config fragment config: move x86 kvm_guest.config to a common location arm64: KVM: Remove duplicating init code for setting VMID ARM: KVM: Support vgic-v3 ...	2016-10-06 10:49:01 -07:00
Naveen N. Rao	b7b7013cac	powerpc/bpf: Add support for bpf constant blinding In line with similar support for other architectures by Daniel Borkmann. 'MOD Default X' from test_bpf without constant blinding: 84 bytes emitted from JIT compiler (pass:3, flen:7) d0000000058a4688 + <x>: 0: nop 4: nop 8: std r27,-40(r1) c: std r28,-32(r1) 10: xor r8,r8,r8 14: xor r28,r28,r28 18: mr r27,r3 1c: li r8,66 20: cmpwi r28,0 24: bne 0x0000000000000030 28: li r8,0 2c: b 0x0000000000000044 30: divwu r9,r8,r28 34: mullw r9,r28,r9 38: subf r8,r9,r8 3c: rotlwi r8,r8,0 40: li r8,66 44: ld r27,-40(r1) 48: ld r28,-32(r1) 4c: mr r3,r8 50: blr ... and with constant blinding: 140 bytes emitted from JIT compiler (pass:3, flen:11) d00000000bd6ab24 + <x>: 0: nop 4: nop 8: std r27,-40(r1) c: std r28,-32(r1) 10: xor r8,r8,r8 14: xor r28,r28,r28 18: mr r27,r3 1c: lis r2,-22834 20: ori r2,r2,36083 24: rotlwi r2,r2,0 28: xori r2,r2,36017 2c: xoris r2,r2,42702 30: rotlwi r2,r2,0 34: mr r8,r2 38: rotlwi r8,r8,0 3c: cmpwi r28,0 40: bne 0x000000000000004c 44: li r8,0 48: b 0x000000000000007c 4c: divwu r9,r8,r28 50: mullw r9,r28,r9 54: subf r8,r9,r8 58: rotlwi r8,r8,0 5c: lis r2,-17137 60: ori r2,r2,39065 64: rotlwi r2,r2,0 68: xori r2,r2,39131 6c: xoris r2,r2,48399 70: rotlwi r2,r2,0 74: mr r8,r2 78: rotlwi r8,r8,0 7c: ld r27,-40(r1) 80: ld r28,-32(r1) 84: mr r3,r8 88: blr Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 20:33:20 +11:00
Naveen N. Rao	ce0761419f	powerpc/bpf: Implement support for tail calls Tail calls allow JIT'ed eBPF programs to call into other JIT'ed eBPF programs. This can be achieved either by: (1) retaining the stack setup by the first eBPF program and having all subsequent eBPF programs re-using it, or, (2) by unwinding/tearing down the stack and having each eBPF program deal with its own stack as it sees fit. To ensure that this does not create loops, there is a limit to how many tail calls can be done (currently 32). This requires the JIT'ed code to maintain a count of the number of tail calls done so far. Approach (1) is simple, but requires every eBPF program to have (almost) the same prologue/epilogue, regardless of whether they need it. This is inefficient for small eBPF programs which may not sometimes need a prologue at all. As such, to minimize impact of tail call implementation, we use approach (2) here which needs each eBPF program in the chain to use its own prologue/epilogue. This is not ideal when many tail calls are involved and when all the eBPF programs in the chain have similar prologue/epilogue. However, the impact is restricted to programs that do tail calls. Individual eBPF programs are not affected. We maintain the tail call count in a fixed location on the stack and updated tail call count values are passed in through this. The very first eBPF program in a chain sets this up to 0 (the first 2 instructions). Subsequent tail calls skip the first two eBPF JIT instructions to maintain the count. For programs that don't do tail calls themselves, the first two instructions are NOPs. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 20:33:19 +11:00
Naveen N. Rao	7b847f523f	powerpc/bpf: Introduce accessors for using the tmp local stack space While at it, ensure that the location of the local save area is consistent whether or not we setup our own stackframe. This property is utilised in the next patch that adds support for tail calls. Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 20:33:19 +11:00
Michael Ellerman	2685f826e5	powerpc/fadump: Fix build break when CONFIG_PROC_VMCORE=n The fadump code calls vmcore_cleanup() which only exists if CONFIG_PROC_VMCORE=y. We don't want to depend on CONFIG_PROC_VMCORE, because it's user selectable, so just wrap the call in an #ifdef. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 20:33:18 +11:00
Cyril Bur	5d176f751e	powerpc: tm: Enable transactional memory (TM) lazily for userspace Currently the MSR TM bit is always set if the hardware is TM capable. This adds extra overhead as it means the TM SPRS (TFHAR, TEXASR and TFAIR) must be swapped for each process regardless of if they use TM. For processes that don't use TM the TM MSR bit can be turned off allowing the kernel to avoid the expensive swap of the TM registers. A TM unavailable exception will occur if a thread does use TM and the kernel will enable MSR_TM and leave it so for some time afterwards. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 20:33:17 +11:00
Cyril Bur	172f7aaa75	powerpc/tm: Add TM Unavailable Exception If the kernel disables transactional memory (TM) and userspace still tries TM related actions (TM instructions or TM SPR accesses) TM aware hardware will cause the kernel to take a facility unavailable exception. Add checks for the exception being caused by illegal TM access in userspace. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> [mpe: Rewrite comment entirely, bugs in it are mine] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 20:33:17 +11:00
Cyril Bur	d986d6f4d0	powerpc: Remove do_load_up_transact_{fpu,altivec} Previous rework of TM code leaves these functions unused Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 20:33:16 +11:00
Cyril Bur	000ec280e3	powerpc: tm: Rename transct_(*) to ck(\1)_state Make the structures being used for checkpointed state named consistently with the pt_regs/ckpt_regs. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 20:33:16 +11:00
Cyril Bur	dc3106690b	powerpc: tm: Always use fp_state and vr_state to store live registers There is currently an inconsistency as to how the entire CPU register state is saved and restored when a thread uses transactional memory (TM). Using transactional memory results in the CPU having duplicated (almost) all of its register state. This duplication results in a set of registers which can be considered 'live', those being currently modified by the instructions being executed and another set that is frozen at a point in time. On context switch, both sets of state have to be saved and (later) restored. These two states are often called a variety of different things. Common terms for the state which only exists after the CPU has entered a transaction (performed a TBEGIN instruction) in hardware are 'transactional' or 'speculative'. Between a TBEGIN and a TEND or TABORT (or an event that causes the hardware to abort), regardless of the use of TSUSPEND the transactional state can be referred to as the live state. The second state is often to referred to as the 'checkpointed' state and is a duplication of the live state when the TBEGIN instruction is executed. This state is kept in the hardware and will be rolled back to on transaction failure. Currently all the registers stored in pt_regs are ALWAYS the live registers, that is, when a thread has transactional registers their values are stored in pt_regs and the checkpointed state is in ckpt_regs. A strange opposite is true for fp_state/vr_state. When a thread is non transactional fp_state/vr_state holds the live registers. When a thread has initiated a transaction fp_state/vr_state holds the checkpointed state and transact_fp/transact_vr become the structure which holds the live state (at this point it is a transactional state). This method creates confusion as to where the live state is, in some circumstances it requires extra work to determine where to put the live state and prevents the use of common functions designed (probably before TM) to save the live state. With this patch pt_regs, fp_state and vr_state all represent the same thing and the other structures [pending rename] are for checkpointed state. Acked-by: Simon Guo <wei.guo.simon@gmail.com> Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 20:33:15 +11:00
Cyril Bur	d11994314b	powerpc: signals: Stop using current in signal code Much of the signal code takes a pt_regs on which it operates. Over time the signal code has needed to know more about the thread than what pt_regs can supply, this information is obtained as needed by using 'current'. This approach is not strictly incorrect however it does mean that there is now a hard requirement that the pt_regs being passed around does belong to current, this is never checked. A safer approach is for the majority of the signal functions to take a task_struct from which they can obtain pt_regs and any other information they need. The caveat that the task_struct they are passed must be current doesn't go away but can more easily be checked for. Functions called from outside powerpc signal code are passed a pt_regs and they can confirm that the pt_regs is that of current and pass current to other functions, furthurmore, powerpc signal functions can check that the task_struct they are passed is the same as current avoiding possible corruption of current (or the task they are passed) if this assertion ever fails. CC: paulus@samba.org Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 16:43:07 +11:00
Cyril Bur	e909fb83d3	powerpc: Never giveup a reclaimed thread when enabling kernel {fp, altivec, vsx} After a thread is reclaimed from its active or suspended transactional state the checkpointed state exists on CPU, this state (along with the live/transactional state) has been saved in its entirety by the reclaiming process. There exists a sequence of events that would cause the kernel to call one of enable_kernel_fp(), enable_kernel_altivec() or enable_kernel_vsx() after a thread has been reclaimed. These functions save away any user state on the CPU so that the kernel can use the registers. Not only is this saving away unnecessary at this point, it is actually incorrect. It causes a save of the checkpointed state to the live structures within the thread struct thus destroying the true live state for that thread. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 16:43:07 +11:00
Cyril Bur	3cee070a13	powerpc: Return the new MSR from msr_check_and_set() msr_check_and_set() always performs a mfmsr() to determine if it needs to perform an mtmsr(), as mfmsr() can be a costly operation msr_check_and_set() could return the MSR now on the CPU to avoid callers of msr_check_and_set having to make their own mfmsr() call. Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 16:43:06 +11:00
Cyril Bur	b0f16b4698	powerpc: Add check_if_tm_restore_required() to giveup_all() giveup_all() causes FPU/VMX/VSX facilities to be disabled in a threads MSR. If the thread performing the giveup was transactional, the kernel must record which facilities were in use before the giveup as the thread must have these facilities re-enabled on return to userspace. >From process.c: /* * This is called if we are on the way out to userspace and the * TIF_RESTORE_TM flag is set. It checks if we need to reload * FP and/or vector state and does so if necessary. * If userspace is inside a transaction (whether active or * suspended) and FP/VMX/VSX instructions have ever been enabled * inside that transaction, then we have to keep them enabled * and keep the FP/VMX/VSX state loaded while ever the transaction * continues. The reason is that if we didn't, and subsequently * got a FP/VMX/VSX unavailable interrupt inside a transaction, * we don't know whether it's the same transaction, and thus we * don't know which of the checkpointed state and the transactional * state to use. */ Calling check_if_tm_restore_required() will set TIF_RESTORE_TM and save the MSR if needed. Fixes: `c208505` ("powerpc: create giveup_all()") Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 16:43:06 +11:00
Cyril Bur	dc16b553c9	powerpc: Always restore FPU/VEC/VSX if hardware transactional memory in use Comment from arch/powerpc/kernel/process.c:967: If userspace is inside a transaction (whether active or suspended) and FP/VMX/VSX instructions have ever been enabled inside that transaction, then we have to keep them enabled and keep the FP/VMX/VSX state loaded while ever the transaction continues. The reason is that if we didn't, and subsequently got a FP/VMX/VSX unavailable interrupt inside a transaction, we don't know whether it's the same transaction, and thus we don't know which of the checkpointed state and the ransactional state to use. restore_math() restore_fp() and restore_altivec() currently may not restore the registers. It doesn't appear that this is more serious than a performance penalty. If the math registers aren't restored the userspace thread will still be run with the facility disabled. Userspace will not be able to read invalid values. On the first access it will take an facility unavailable exception and the kernel will detected an active transaction, at which point it will abort the transaction. There is the possibility for a pathological case preventing any progress by transactions, however, transactions are never guaranteed to make progress. Fixes: `70fe3d9` ("powerpc: Restore FPU/VEC/VSX if previously used") Signed-off-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 16:43:05 +11:00
Gavin Shan	0e7736c6b8	powerpc/powernv: Fix data type for @r in pnv_ioda_parse_m64_window() This fixes warning reported from sparse: pci-ioda.c:451:49: warning: incorrect type in argument 2 (different base types) Fixes: `262af557dd` ("powerpc/powernv: Enable M64 aperatus for PHB3") Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 16:30:28 +11:00
Gavin Shan	5adaf8629b	powerpc/powernv: Use CPU-endian PEST in pnv_pci_dump_p7ioc_diag_data() This fixes the warnings reported from sparse: pci.c:312:33: warning: restricted __be64 degrades to integer pci.c:313:33: warning: restricted __be64 degrades to integer Fixes: `cee72d5bb4` ("powerpc/powernv: Display diag data on p7ioc EEH errors") Cc: stable@vger.kernel.org # v3.3+ Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 16:29:59 +11:00
Gavin Shan	066bcd785a	powerpc/powernv: Specify proper data type for PCI_SLOT_ID_PREFIX This fixes the warning reported from sparse: eeh-powernv.c:875:23: warning: constant 0x8000000000000000 is so big it is unsigned long Fixes: `ebe2253127` ("powerpc/powernv: Support PCI slot ID") Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 16:29:46 +11:00
Gavin Shan	a7032132d7	powerpc/powernv: Use CPU-endian hub diag-data type in pnv_eeh_get_and_dump_hub_diag() The hub diag-data type is filled with big-endian data by OPAL call opal_pci_get_hub_diag_data(). We need convert it to CPU-endian value before using it. The issue is reported by sparse as pointed by Michael Ellerman: eeh-powernv.c:1309:21: warning: restricted __be16 degrades to integer This converts hub diag-data type to CPU-endian before using it in pnv_eeh_get_and_dump_hub_diag(). Fixes: `2a485ad7c8` ("powerpc/powernv: Drop PHB operation next_error()") Cc: stable@vger.kernel.org # v4.1+ Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 16:29:23 +11:00
Gavin Shan	d63e51b31e	powerpc/powernv: Pass CPU-endian PE number to opal_pci_eeh_freeze_clear() The PE number (@frozen_pe_no), filled by opal_pci_next_error() is in big-endian format. It should be converted to CPU-endian before it is passed to opal_pci_eeh_freeze_clear() when clearing the frozen state if the PE is invalid one. As Michael Ellerman pointed out, the issue is also detected by sparse: eeh-powernv.c:1541:41: warning: incorrect type in argument 2 (different base types) This passes CPU-endian PE number to opal_pci_eeh_freeze_clear() and it should be part of commit <0f36db77643b> ("powerpc/eeh: Fix wrong printed PE number"), which was merged to 4.3 kernel. Fixes: `71b540adff` ("powerpc/powernv: Don't escalate non-existing frozen PE") Cc: stable@vger.kernel.org # v4.3+ Suggested-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Reviewed-by: Russell Currey <ruscur@russell.cc> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 16:28:18 +11:00
Anton Blanchard	e2ad477cb2	powerpc: Set default CPU type to POWER8 for little endian builds We supported POWER7 CPUs for bootstrapping little endian, but the target was always POWER8. Now that POWER7 specific issues are impacting performance, change the default target to POWER8. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 16:15:00 +11:00
Anton Blanchard	8a18cc0c2c	powerpc: Only disable HAVE_EFFICIENT_UNALIGNED_ACCESS on POWER7 little endian POWER8 handles unaligned accesses in little endian mode, but commit `0b5e6661ac` ("powerpc: Don't set HAVE_EFFICIENT_UNALIGNED_ACCESS on little endian builds") disabled it for all. The issue with unaligned little endian accesses is specific to POWER7, so update the Kconfig check to match. Using the stat() testcase from commit `a75c380c71` ("powerpc: Enable DCACHE_WORD_ACCESS on ppc64le"), performance improves 15% on POWER8. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 16:15:00 +11:00
Anton Blanchard	61e98ebff3	powerpc: Remove static branch prediction in atomic{, 64}_add_unless I see quite a lot of static branch mispredictions on a simple web serving workload. The issue is in __atomic_add_unless(), called from _atomic_dec_and_lock(). There is no obvious common case, so it is better to let the hardware predict the branch. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 16:13:13 +11:00
Anton Blanchard	bb85fb5803	powerpc: During context switch, check before setting mm_cpumask During context switch, switch_mm() sets our current CPU in mm_cpumask. We can avoid this atomic sequence in most cases by checking before setting the bit. Testing on a POWER8 using our context switch microbenchmark: tools/testing/selftests/powerpc/benchmarks/context_switch \ --process --no-fp --no-altivec --no-vector Performance improves 2%. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 16:12:16 +11:00
Anton Blanchard	91ac730b8b	powerpc/eeh: Quieten EEH message when no adapters are found No real need for this to be pr_warn(), reduce it to pr_info(). Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 16:11:48 +11:00
Anton Blanchard	9eda65fb82	powerpc/configs: Enable Intel i40e on 64 bit configs We are starting to see i40e adapters in recent machines, so enable it in our configs. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 16:10:56 +11:00
Anton Blanchard	d3eb34a312	powerpc/configs: Change a few things from built in to modules Change a few devices and filesystems that are seldom used any more from built in to modules. This reduces our vmlinux about 500kB. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 16:10:55 +11:00
Anton Blanchard	32eab6c9e1	powerpc/configs: Bump kernel ring buffer size on 64 bit configs When we issue a system reset, every CPU in the box prints an Oops, including a backtrace. Each of these can be quite large (over 4kB) and we may end up wrapping the ring buffer and losing important information. Bump the base size from 128kB to 256kB and the per CPU size from 4kB to 8kB. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 16:10:54 +11:00
Anton Blanchard	43c2394fc1	powerpc/configs: Enable VMX crypto We see big improvements with the VMX crypto functions (often 10x or more), so enable it as a module. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 16:10:54 +11:00
Anton Blanchard	12ab11a2c0	powerpc/64: Align hot loops of memset() and backwards_memcpy() Align the hot loops in our assembly implementation of memset() and backwards_memcpy(). backwards_memcpy() is called from tcp_v4_rcv(), so we might want to optimise this a little more. Signed-off-by: Anton Blanchard <anton@samba.org> Reviewed-by: Nick Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 16:08:19 +11:00
Linus Torvalds	597f03f9d1	Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull CPU hotplug updates from Thomas Gleixner: "Yet another batch of cpu hotplug core updates and conversions: - Provide core infrastructure for multi instance drivers so the drivers do not have to keep custom lists. - Convert custom lists to the new infrastructure. The block-mq custom list conversion comes through the block tree and makes the diffstat tip over to more lines removed than added. - Handle unbalanced hotplug enable/disable calls more gracefully. - Remove the obsolete CPU_STARTING/DYING notifier support. - Convert another batch of notifier users. The relayfs changes which conflicted with the conversion have been shipped to me by Andrew. The remaining lot is targeted for 4.10 so that we finally can remove the rest of the notifiers" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits) cpufreq: Fix up conversion to hotplug state machine blk/mq: Reserve hotplug states for block multiqueue x86/apic/uv: Convert to hotplug state machine s390/mm/pfault: Convert to hotplug state machine mips/loongson/smp: Convert to hotplug state machine mips/octeon/smp: Convert to hotplug state machine fault-injection/cpu: Convert to hotplug state machine padata: Convert to hotplug state machine cpufreq: Convert to hotplug state machine ACPI/processor: Convert to hotplug state machine virtio scsi: Convert to hotplug state machine oprofile/timer: Convert to hotplug state machine block/softirq: Convert to hotplug state machine lib/irq_poll: Convert to hotplug state machine x86/microcode: Convert to hotplug state machine sh/SH-X3 SMP: Convert to hotplug state machine ia64/mca: Convert to hotplug state machine ARM/OMAP/wakeupgen: Convert to hotplug state machine ARM/shmobile: Convert to hotplug state machine arm64/FP/SIMD: Convert to hotplug state machine ...	2016-10-03 19:43:08 -07:00
Nicholas Piggin	e0319829a9	powerpc/64s: Remove unused exception code, small cleanups This was not done before the big patches because I only noticed them afterwards. It has become much easier to see which handlers are branched to from which exception vectors now, and to see exactly what vector space is being used for what. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:16 +11:00
Nicholas Piggin	a33532af18	powerpc/64s: Use a single macro for both parts of OOL exception Simple substitution. This is possible now that both parts of the OOL initial handler get linked into their correct location. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:16 +11:00
Nicholas Piggin	0f0c6ca194	powerpc/64s: Move __replay_interrupt function below handlers This is not an exception handler as such, it's called from local_irq_enable(), not exception entry. Also clean up some now redundant comments at the end of the consolidation series. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:15 +11:00
Nicholas Piggin	3965f8ab77	powerpc/64s: Consolidate CBE Thermal 0x1800 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:15 +11:00
Nicholas Piggin	b51c079ed4	powerpc/64s: Consolidate Altivec 0x1700 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:14 +11:00
Nicholas Piggin	69a793444c	powerpc/64s: Consolidate Debug 0x1600 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:14 +11:00
Nicholas Piggin	d7e898491c	powerpc/64s: Consolidate Softpatch 0x1500 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:13 +11:00
Nicholas Piggin	4e96dbbfe3	powerpc/64s: Consolidate Instruction Breakpoint 0x1300 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:13 +11:00
Nicholas Piggin	ff1b320640	powerpc/64s: Consolidate CBE System Error 0x1200 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:12 +11:00
Nicholas Piggin	e46b964c1a	powerpc/64s: Consolidate Reserved 0xfa0-0x1200 interrupts Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:12 +11:00
Nicholas Piggin	14b0072cfd	powerpc/64s: Consolidate Hypervisor Facility Unavailable 0xf80 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:11 +11:00
Nicholas Piggin	1134713c26	powerpc/64s: Consolidate Facility Unavailable 0xf60 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:11 +11:00
Nicholas Piggin	792cbddd62	powerpc/64s: Consolidate VSX Unavailable 0xf40 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:10 +11:00
Nicholas Piggin	d1a0ca9c8b	powerpc/64s: Consolidate Vector Unavailable 0xf20 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:09 +11:00
Nicholas Piggin	b1c7f150a9	powerpc/64s: Consolidate Performance Monitor 0xf00 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:09 +11:00
Nicholas Piggin	bda7fea2b8	powerpc/64s: Consolidate Reserved 0xec0, 0xee0 interrupts Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:08 +11:00
Nicholas Piggin	7440877675	powerpc/64s: Consolidate Hypervisor Virtualization 0xea0 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:08 +11:00
Nicholas Piggin	9bcb81bf68	powerpc/64s: Consolidate Directed Hypervisor Doorbell 0xe80 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:07 +11:00
Nicholas Piggin	62f9b03b06	powerpc/64s: Consolidate Hypervisor Maintenance 0xe60 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:07 +11:00
Nicholas Piggin	031b4026a8	powerpc/64s: Consolidate Hypervisor Emulation Assistance 0xe40 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:06 +11:00
Nicholas Piggin	82517cabc5	powerpc/64s: Consolidate Hypervisor Instruction Storage 0xe20 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:06 +11:00
Nicholas Piggin	f5c32c1d9a	powerpc/64s: Consolidate Hypervisor Data Storage 0xe00 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:05 +11:00
Nicholas Piggin	bc6675c608	powerpc/64s: Consolidate Trace 0xd00 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:05 +11:00
Nicholas Piggin	d807ad37e8	powerpc/64s: Consolidate System Call 0xc00 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:04 +11:00
Nicholas Piggin	341215dc12	powerpc/64s: Consolidate Reserved 0xb00 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:04 +11:00
Nicholas Piggin	ca2431633b	powerpc/64s: Consolidate Directed Privileged Doorbell 0xa00 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:03 +11:00
Nicholas Piggin	facc6d7424	powerpc/64s: Consolidate Hypervisor Decrementer 0x980 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:02 +11:00
Nicholas Piggin	39c0da57a9	powerpc/64s: Consolidate Decrementer 0x900 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:02 +11:00
Nicholas Piggin	c78d9b9747	powerpc/64s: Consolidate FP Unavailable 0x800 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:01 +11:00
Nicholas Piggin	11e87346b9	powerpc/64s: Consolidate Program 0x700 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:01 +11:00
Nicholas Piggin	f9aa67142e	powerpc/64s: Consolidate Alignment 0x600 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:00 +11:00
Nicholas Piggin	c138e58890	powerpc/64s: Consolidate External 0x500 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:07:00 +11:00
Nicholas Piggin	8d04631ad7	powerpc/64s: Consolidate Instruction Segment 0x480 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:06:59 +11:00
Nicholas Piggin	27ce77df60	powerpc/64s: Consolidate Instruction Storage 0x400 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:06:58 +11:00
Nicholas Piggin	2b9af6e40e	powerpc/64s: Consolidate Data Segment 0x380 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:06:58 +11:00
Nicholas Piggin	80795e6cbe	powerpc/64s: Consolidate Data Storage 0x300 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:06:57 +11:00
Nicholas Piggin	afcf009548	powerpc/64s: Consolidate Machine Check 0x200 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:06:57 +11:00
Nicholas Piggin	582baf44f9	powerpc/64s: Consolidate System Reset 0x100 interrupt Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:06:56 +11:00
Nicholas Piggin	57f266497d	powerpc: Use gas sections for arranging exception vectors Use assembler sections of fixed size and location to arrange the 64-bit Book3S exception vector code (64-bit Book3E also uses it in head_64.S for 0x0..0x100). This allows better flexibility in arranging exception code and hiding unimportant details behind macros. Gas sections can be a bit painful to use this way, mainly because the assembler does not know where they will be finally linked. Taking absolute addresses requires a bit of trickery for example, but it can be hidden behind macros for the most part. Generated code is mostly the same except locations, offsets, alignments. The "+ 0x2" is only required for the trap number / kvm exit number, which gets loaded as a constant into a register. Previously, code also used + 0x2 for label names, but we changed to using "H" to distinguish HV case for that. Remove the last vestiges of that. __after_prom_start is taking absolute address of a label in another fixed section. Newer toolchains seemed to compile this okay, but older ones do not. FIXED_SYMBOL_ABS_ADDR is more foolproof, it just takes an additional line to define. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:06:56 +11:00
Nicholas Piggin	573819e343	powerpc/64: Change the way relocation copy is calculated With a subsequent patch to put text into different sections, (_end - _stext) can no longer be computed at link time to determine the end of the copy. Instead, calculate it at runtime with (copy_to_here - _stext) + (_end - copy_to_here). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:06:55 +11:00
Nicholas Piggin	be642c3457	powerpc/64s: Consolidate exception handler alignment Move exception handler alignment directives into the head-64.h macros, beause they will no longer work in-place after the next patch. This slightly changes functions that have alignments applied and therefore code generation, which is why it was not done initially (see earlier patch). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:06:55 +11:00
Michael Ellerman	da2bc4644c	powerpc/64s: Add new exception vector macros Create arch/powerpc/include/asm/head-64.h with macros that specify an exception vector (name, type, location), which will be used to label and lay out exceptions into the object file. Naming is moved out of exception-64s.h, which is used to specify the implementation of exception handlers. objdump of generated code in exception vectors is unchanged except for names. Alignment directives scattered around are annoying, but done this way so that disassembly can verify identical instruction generation before and after patch. These get cleaned up in future patch. We change the way KVMTEST works, explicitly passing EXC_HV or EXC_STD rather than overloading the trap number. This removes the need to have SOFTEN values for the overloaded trap numbers, eg. 0x502. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 13:06:36 +11:00
Marcelo Cerri	74ff6cb3aa	crypto: sha1-powerpc - little-endian support The driver does not handle endianness properly when loading the input data. Signed-off-by: Marcelo Cerri <marcelo.cerri@canonical.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2016-10-02 22:31:53 +08:00
Thomas Gleixner	d7e25c66c9	Merge branch 'x86/urgent' into x86/asm Get the cr4 fixes so we can apply the final cleanup	2016-09-30 12:38:28 +02:00
Anton Blanchard	5045ea3737	powerpc/vdso64: Use double word compare on pointers __kernel_get_syscall_map() and __kernel_clock_getres() use cmpli to check if the passed in pointer is non zero. cmpli maps to a 32 bit compare on binutils, so we ignore the top 32 bits. A simple test case can be created by passing in a bogus pointer with the bottom 32 bits clear. Using a clk_id that is handled by the VDSO, then one that is handled by the kernel shows the problem: printf("%d\n", clock_getres(CLOCK_REALTIME, (void )0x100000000)); printf("%d\n", clock_getres(CLOCK_BOOTTIME, (void )0x100000000)); And we get: 0 -1 The bigger issue is if we pass a valid pointer with the bottom 32 bits clear, in this case we will return success but won't write any data to the pointer. I stumbled across this issue because the LLVM integrated assembler doesn't accept cmpli with 3 arguments. Fix this by converting them to cmpldi. Fixes: `a7f290dad3` ("[PATCH] powerpc: Merge vdso's and add vdso support to 32 bits kernel") Cc: stable@vger.kernel.org # v2.6.15+ Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-09-29 15:17:57 +10:00
Balbir Singh	2e5bbb5461	KVM: PPC: Book3S HV: Migrate pinned pages out of CMA When PCI Device pass-through is enabled via VFIO, KVM-PPC will pin pages using get_user_pages_fast(). One of the downsides of the pinning is that the page could be in CMA region. The CMA region is used for other allocations like the hash page table. Ideally we want the pinned pages to be from non CMA region. This patch (currently only for KVM PPC with VFIO) forcefully migrates the pages out (huge pages are omitted for the moment). There are more efficient ways of doing this, but that might be elaborate and might impact a larger audience beyond just the kvm ppc implementation. The magic is in new_iommu_non_cma_page() which allocates the new page from a non CMA region. I've tested the patches lightly at my end. The full solution requires migration of THP pages in the CMA region. That work will be done incrementally on top of this. Signed-off-by: Balbir Singh <bsingharora@gmail.com> Acked-by: Alexey Kardashevskiy <aik@ozlabs.ru> [mpe: Merged via powerpc tree as that's where the changes are] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-09-29 15:14:44 +10:00

... 2 3 4 5 6 ...

15828 Commits