linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-12-22 18:44:44 +08:00

Author	SHA1	Message	Date
Bharat Bhushan	99e99d19a8	kvm: ppc: bookehv: Save restore SPRN_SPRG9 on guest entry exit SPRN_SPRG is used by debug interrupt handler, so this is required for debug support. Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:14 +02:00
Mihai Caraman	f5250471b2	KVM: PPC: Bookehv: Get vcpu's last instruction for emulation On book3e, KVM uses load external pid (lwepx) dedicated instruction to read guest last instruction on the exit path. lwepx exceptions (DTLB_MISS, DSI and LRAT), generated by loading a guest address, needs to be handled by KVM. These exceptions are generated in a substituted guest translation context (EPLC[EGS] = 1) from host context (MSR[GS] = 0). Currently, KVM hooks only interrupts generated from guest context (MSR[GS] = 1), doing minimal checks on the fast path to avoid host performance degradation. lwepx exceptions originate from host state (MSR[GS] = 0) which implies additional checks in DO_KVM macro (beside the current MSR[GS] = 1) by looking at the Exception Syndrome Register (ESR[EPID]) and the External PID Load Context Register (EPLC[EGS]). Doing this on each Data TLB miss exception is obvious too intrusive for the host. Read guest last instruction from kvmppc_load_last_inst() by searching for the physical address and kmap it. This address the TODO for TLB eviction and execute-but-not-read entries, and allow us to get rid of lwepx until we are able to handle failures. A simple stress benchmark shows a 1% sys performance degradation compared with previous approach (lwepx without failure handling): time for i in `seq 1 10000`; do /bin/echo > /dev/null; done real 0m 8.85s user 0m 4.34s sys 0m 4.48s vs real 0m 8.84s user 0m 4.36s sys 0m 4.44s A solution to use lwepx and to handle its exceptions in KVM would be to temporary highjack the interrupt vector from host. This imposes additional synchronizations for cores like FSL e6500 that shares host IVOR registers between hardware threads. This optimized solution can be later developed on top of this patch. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:14 +02:00
Mihai Caraman	51f047261e	KVM: PPC: Allow kvmppc_get_last_inst() to fail On book3e, guest last instruction is read on the exit path using load external pid (lwepx) dedicated instruction. This load operation may fail due to TLB eviction and execute-but-not-read entries. This patch lay down the path for an alternative solution to read the guest last instruction, by allowing kvmppc_get_lat_inst() function to fail. Architecture specific implmentations of kvmppc_load_last_inst() may read last guest instruction and instruct the emulation layer to re-execute the guest in case of failure. Make kvmppc_get_last_inst() definition common between architectures. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:14 +02:00
Mihai Caraman	9a26af64d6	KVM: PPC: Book3s: Remove kvmppc_read_inst() function In the context of replacing kvmppc_ld() function calls with a version of kvmppc_get_last_inst() which allow to fail, Alex Graf suggested this: "If we get EMULATE_AGAIN, we just have to make sure we go back into the guest. No need to inject an ISI into the guest - it'll do that all by itself. With an error returning kvmppc_get_last_inst we can just use completely get rid of kvmppc_read_inst() and only use kvmppc_get_last_inst() instead." As a intermediate step get rid of kvmppc_read_inst() and only use kvmppc_ld() instead. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:13 +02:00
Mihai Caraman	9c0d4e0dcf	KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1 Add mising defines MAS0_GET_TLBSEL() and MAS1_GET_TSIZE() for Book3E. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:13 +02:00
Mihai Caraman	b5741bb3d4	KVM: PPC: e500mc: Revert "add load inst fixup" The commit `1d628af7` "add load inst fixup" made an attempt to handle failures generated by reading the guest current instruction. The fixup code that was added works by chance hiding the real issue. Load external pid (lwepx) instruction, used by KVM to read guest instructions, is executed in a subsituted guest translation context (EPLC[EGS] = 1). In consequence lwepx's TLB error and data storage interrupts need to be handled by KVM, even though these interrupts are generated from host context (MSR[GS] = 0) where lwepx is executed. Currently, KVM hooks only interrupts generated from guest context (MSR[GS] = 1), doing minimal checks on the fast path to avoid host performance degradation. As a result, the host kernel handles lwepx faults searching the faulting guest data address (loaded in DEAR) in its own Logical Partition ID (LPID) 0 context. In case a host translation is found the execution returns to the lwepx instruction instead of the fixup, the host ending up in an infinite loop. Revert the commit "add load inst fixup". lwepx issue will be addressed in a subsequent patch without needing fixup code. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:13 +02:00
Bharat Bhushan	34f754b99e	kvm: ppc: Add SPRN_EPR get helper function kvmppc_set_epr() is already defined in asm/kvm_ppc.h, So rename and move get_epr helper function to same file. Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> [agraf: remove duplicate return] Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:13 +02:00
Bharat Bhushan	c1b8a01bf9	kvm: ppc: booke: Use the shared struct helpers for SPRN_SPRG0-7 Use kvmppc_set_sprg[0-7]() and kvmppc_get_sprg[0-7]() helper functions Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:12 +02:00
Bharat Bhushan	dc168549d9	kvm: ppc: booke: Add shared struct helpers of SPRN_ESR Add and use kvmppc_set_esr() and kvmppc_get_esr() helper functions Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:12 +02:00
Bharat Bhushan	a5414d4b5e	kvm: ppc: booke: Use the shared struct helpers of SPRN_DEAR Uses kvmppc_set_dar() and kvmppc_get_dar() helper functions Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:12 +02:00
Bharat Bhushan	31579eea69	kvm: ppc: booke: Use the shared struct helpers of SRR0 and SRR1 Use kvmppc_set_srr0/srr1() and kvmppc_get_srr0/srr1() helper functions Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:11 +02:00
Bharat Bhushan	1dc0c5b88c	kvm: ppc: bookehv: Added wrapper macros for shadow registers There are shadow registers like, GSPRG[0-3], GSRR0, GSRR1 etc on BOOKE-HV and these shadow registers are guest accessible. So these shadow registers needs to be updated on BOOKE-HV. This patch adds new macro for get/set helper of shadow register . Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:11 +02:00
Alexander Graf	89b68c96a2	KVM: PPC: Book3S: Make magic page properly 4k mappable The magic page is defined as a 4k page of per-vCPU data that is shared between the guest and the host to accelerate accesses to privileged registers. However, when the host is using 64k page size granularity we weren't quite as strict about that rule anymore. Instead, we partially treated all of the upper 64k as magic page and mapped only the uppermost 4k with the actual magic contents. This works well enough for Linux which doesn't use any memory in kernel space in the upper 64k, but Mac OS X got upset. So this patch makes magic page actually stay in a 4k range even on 64k page size hosts. This patch fixes magic page usage with Mac OS X (using MOL) on 64k PAGE_SIZE hosts for me. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:11 +02:00
Alexander Graf	c01e3f66cd	KVM: PPC: Book3S: Add hack for split real mode Today we handle split real mode by mapping both instruction and data faults into a special virtual address space that only exists during the split mode phase. This is good enough to catch 32bit Linux guests that use split real mode for copy_from/to_user. In this case we're always prefixed with 0xc0000000 for our instruction pointer and can map the user space process freely below there. However, that approach fails when we're running KVM inside of KVM. Here the 1st level last_inst reader may well be in the same virtual page as a 2nd level interrupt handler. It also fails when running Mac OS X guests. Here we have a 4G/4G split, so a kernel copy_from/to_user implementation can easily overlap with user space addresses. The architecturally correct way to fix this would be to implement an instruction interpreter in KVM that kicks in whenever we go into split real mode. This interpreter however would not receive a great amount of testing and be a lot of bloat for a reasonably isolated corner case. So I went back to the drawing board and tried to come up with a way to make split real mode work with a single flat address space. And then I realized that we could get away with the same trick that makes it work for Linux: Whenever we see an instruction address during split real mode that may collide, we just move it higher up the virtual address space to a place that hopefully does not collide (keep your fingers crossed!). That approach does work surprisingly well. I am able to successfully run Mac OS X guests with KVM and QEMU (no split real mode hacks like MOL) when I apply a tiny timing probe hack to QEMU. I'd say this is a win over even more broken split real mode :). Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:10 +02:00
Alexander Graf	2e27ecc961	KVM: PPC: Book3S: Stop PTE lookup on write errors When a page lookup failed because we're not allowed to write to the page, we should not overwrite that value with another lookup on the second PTEG which will return "page not found". Instead, we should just tell the caller that we had a permission problem. This fixes Mac OS X guests looping endlessly in page lookup code for me. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:10 +02:00
Alexander Graf	17824b5afc	KVM: PPC: Deflect page write faults properly in kvmppc_st When we have a page that we're not allowed to write to, xlate() will already tell us -EPERM on lookup of that page. With the code as is we change it into a "page missing" error which a guest may get confused about. Instead, just tell the caller about the -EPERM directly. This fixes Mac OS X guests when run with DCBZ32 emulation. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:10 +02:00
Alexander Graf	1287cb3fa8	KVM: PPC: Book3S: Move vcore definition to end of kvm_arch struct When building KVM with a lot of vcores (NR_CPUS is big), we can potentially get out of the ld immediate range for dereferences inside that struct. Move the array to the end of our kvm_arch struct. This fixes compilation issues with NR_CPUS=2048 for me. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:27 +02:00
Mihai Caraman	debf27d6b9	KVM: PPC: e500: Emulate power management control SPR For FSL e6500 core the kernel uses power management SPR register (PWRMGTCR0) to enable idle power down for cores and devices by setting up the idle count period at boot time. With the host already controlling the power management configuration the guest could simply benefit from it, so emulate guest request as a general store. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:27 +02:00
Alexander Graf	6947f948f0	KVM: PPC: Book3S HV: Enable for little endian hosts Now that we've fixed all the issues that HV KVM code had on little endian hosts, we can enable it in the kernel configuration for users to play with. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:26 +02:00
Alexander Graf	9bf163f86d	KVM: PPC: Book3S HV: Fix ABIv2 on LE For code that doesn't live in modules we can just branch to the real function names, giving us compatibility with ABIv1 and ABIv2. Do this for the compiled-in code of HV KVM. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:25 +02:00
Alexander Graf	76d072fb05	KVM: PPC: Book3S HV: Access XICS in BE On the exit path from the guest we check what type of interrupt we received if we received one. This means we're doing hardware access to the XICS interrupt controller. However, when running on a little endian system, this access is byte reversed. So let's make sure to swizzle the bytes back again and virtually make XICS accesses big endian. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:24 +02:00
Alexander Graf	0865a583a4	KVM: PPC: Book3S HV: Access host lppaca and shadow slb in BE Some data structures are always stored in big endian. Among those are the LPPACA fields as well as the shadow slb. These structures might be shared with a hypervisor. So whenever we access those fields, make sure we do so in big endian byte order. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:23 +02:00
Alexander Graf	0240755225	KVM: PPC: Book3S HV: Access guest VPA in BE There are a few shared data structures between the host and the guest. Most of them get registered through the VPA interface. These data structures are defined to always be in big endian byte order, so let's make sure we always access them in big endian. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:22 +02:00
Alexander Graf	6f22bd3265	KVM: PPC: Book3S HV: Make HTAB code LE host aware When running on an LE host all data structures are kept in little endian byte order. However, the HTAB still needs to be maintained in big endian. So every time we access any HTAB we need to make sure we do so in the right byte order. Fix up all accesses to manually byte swap. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:22 +02:00
Alexander Graf	8f6822c4b9	PPC: Add asm helpers for BE 32bit load/store From assembly code we might not only have to explicitly BE access 64bit values, but sometimes also 32bit ones. Add helpers that allow for easy use of lwzx/stwx in their respective byte-reverse or native form. Signed-off-by: Alexander Graf <agraf@suse.de> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 15:22:21 +02:00
Mihai Caraman	d57cef91a0	KVM: PPC: e500: Fix default tlb for victim hint Tlb search operation used for victim hint relies on the default tlb set by the host. When hardware tablewalk support is enabled in the host, the default tlb is TLB1 which leads KVM to evict the bolted entry. Set and restore the default tlb when searching for victim hint. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Reviewed-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:20 +02:00
Michael Neuling	9642382e82	KVM: PPC: Book3S HV: Add H_SET_MODE hcall handling This adds support for the H_SET_MODE hcall. This hcall is a multiplexer that has several functions, some of which are called rarely, and some which are potentially called very frequently. Here we add support for the functions that set the debug registers CIABR (Completed Instruction Address Breakpoint Register) and DAWR/DAWRX (Data Address Watchpoint Register and eXtension), since they could be updated by the guest as often as every context switch. This also adds a kvmppc_power8_compatible() function to test to see if a guest is compatible with POWER8 or not. The CIABR and DAWR/X only exist on POWER8. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:19 +02:00
Paul Mackerras	ae2113a4f1	KVM: PPC: Book3S: Allow only implemented hcalls to be enabled or disabled This adds code to check that when the KVM_CAP_PPC_ENABLE_HCALL capability is used to enable or disable in-kernel handling of an hcall, that the hcall is actually implemented by the kernel. If not an EINVAL error is returned. This also checks the default-enabled list of hcalls and prints a warning if any hcall there is not actually implemented. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:18 +02:00
Paul Mackerras	699a0ea082	KVM: PPC: Book3S: Controls for in-kernel sPAPR hypercall handling This provides a way for userspace controls which sPAPR hcalls get handled in the kernel. Each hcall can be individually enabled or disabled for in-kernel handling, except for H_RTAS. The exception for H_RTAS is because userspace can already control whether individual RTAS functions are handled in-kernel or not via the KVM_PPC_RTAS_DEFINE_TOKEN ioctl, and because the numeric value for H_RTAS is out of the normal sequence of hcall numbers. Hcalls are enabled or disabled using the KVM_ENABLE_CAP ioctl for the KVM_CAP_PPC_ENABLE_HCALL capability on the file descriptor for the VM. The args field of the struct kvm_enable_cap specifies the hcall number in args[0] and the enable/disable flag in args[1]; 0 means disable in-kernel handling (so that the hcall will always cause an exit to userspace) and 1 means enable. Enabling or disabling in-kernel handling of an hcall is effective across the whole VM. The ability for KVM_ENABLE_CAP to be used on a VM file descriptor on PowerPC is new, added by this commit. The KVM_CAP_ENABLE_CAP_VM capability advertises that this ability exists. When a VM is created, an initial set of hcalls are enabled for in-kernel handling. The set that is enabled is the set that have an in-kernel implementation at this point. Any new hcall implementations from this point onwards should not be added to the default set without a good reason. No distinction is made between real-mode and virtual-mode hcall implementations; the one setting controls them both. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:17 +02:00
Mihai Caraman	1f0eeb7e1a	KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule On vcpu schedule, the condition checked for tlb pollution is too loose. The tlb entries of a vcpu become polluted (vs stale) only when a different vcpu within the same logical partition runs in-between. Optimize the tlb invalidation condition keeping last_vcpu per logical partition id. With the new invalidation condition, a guest shows 4% performance improvement on P5020DS while running a memory stress application with the cpu oversubscribed, the other guest running a cpu intensive workload. Guest - old invalidation condition real 3.89 user 3.87 sys 0.01 Guest - enhanced invalidation condition real 3.75 user 3.73 sys 0.01 Host real 3.70 user 1.85 sys 0.00 The memory stress application accesses 4KB pages backed by 75% of available TLB0 entries: char foo[ENTRIES][4096] __attribute__ ((aligned (4096))); int main() { char bar; int i, j; for (i = 0; i < ITERATIONS; i++) for (j = 0; j < ENTRIES; j++) bar = foo[j][0]; return 0; } Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Reviewed-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:16 +02:00
Alexander Graf	f396df3518	KVM: PPC: Book3S PR: Fix sparse endian checks While sending sparse with endian checks over the code base, it triggered at some places that were missing casts or had wrong types. Fix them up. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:16 +02:00
Alexander Graf	da166facd4	KVM: PPC: Book3S PR: Fix ABIv2 on LE We switched to ABIv2 on Little Endian systems now which gets rid of the dotted function names. Branch to the actual functions when we see such a system. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:15 +02:00
Anton Blanchard	ad7d4584a2	KVM: PPC: Assembly functions exported to modules need _GLOBAL_TOC() Both kvmppc_hv_entry_trampoline and kvmppc_entry_trampoline are assembly functions that are exported to modules and also require a valid r2. As such we need to use _GLOBAL_TOC so we provide a global entry point that establishes the TOC (r2). Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:14 +02:00
Anton Blanchard	05a308c722	KVM: PPC: Book3S HV: Fix ABIv2 indirect branch issue To establish addressability quickly, ABIv2 requires the target address of the function being called to be in r12. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:13 +02:00
Alexander Graf	568fccc43f	KVM: PPC: Book3S PR: Handle hyp doorbell exits If we're running PR KVM in HV mode, we may get hypervisor doorbell interrupts. Handle those the same way we treat normal doorbells. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:12 +02:00
Alexander Graf	f6bf3a6622	KVM: PPC: Book3s HV: Fix tlbie compile error Some compilers complain about uninitialized variables in the compute_tlbie_rb function. When you follow the code path you'll realize that we'll never get to that point, but the compiler isn't all that smart. So just default to 4k page sizes for everything, making the compiler happy and the code slightly easier to read. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Paul Mackerras <paulus@samba.org>	2014-07-28 15:22:12 +02:00
Alexander Graf	fb4188bad0	KVM: PPC: Book3s PR: Disable AIL mode with OPAL When we're using PR KVM we must not allow the CPU to take interrupts in virtual mode, as the SLB does not contain host kernel mappings when running inside the guest context. To make sure we get good performance for non-KVM tasks but still properly functioning PR KVM, let's just disable AIL whenever a vcpu is scheduled in. This is fundamentally different from how we deal with AIL on pSeries type machines where we disable AIL for the whole machine as soon as a single KVM VM is up. The reason for that is easy - on pSeries we do not have control over per-cpu configuration of AIL. We also don't want to mess with CPU hotplug races and AIL configuration, so setting it per CPU is easier and more flexible. This patch fixes running PR KVM on POWER8 bare metal for me. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Paul Mackerras <paulus@samba.org>	2014-07-28 15:22:11 +02:00
Aneesh Kumar K.V	06da28e76b	KVM: PPC: BOOK3S: PR: Emulate instruction counter Writing to IC is not allowed in the privileged mode. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:10 +02:00
Aneesh Kumar K.V	8f42ab2749	KVM: PPC: BOOK3S: PR: Emulate virtual timebase register virtual time base register is a per VM, per cpu register that needs to be saved and restored on vm exit and entry. Writing to VTB is not allowed in the privileged mode. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [agraf: fix compile error] Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:21:50 +02:00
Ingo Molnar	5030c69755	Linux 3.16-rc7 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJT1VYNAAoJEHm+PkMAQRiGQJwIAKSYp1Uqz5O/e5r0V1TlZKT4 1B4Njopl57PwSrJQWcGEuH2yHyM896vfPO4L6BJIOfyWzh8kwpQqclDt6uhXoF/v OsO1zb/7/j+n/pDZsePqP9AyIgErsHEBgUbhecDqzjN++ITPcZjQ6TIMPglZaumN jFAdAZuAaEwqAk8jqN2wlm689Fh9MuUEarHXbXLCqu5RgLrWhFGhp/cTWY62aqnZ XfEeQ9KtpRZmlR/IYjerbb1eRH7ZdJsZ88WngLX9dj/JdNxHWBkWQBXGAusXk5Fk y6LsIV3TjyBdrRKJ1Ifyg/2EIXHNBs8HxTFGXpjtp2HPuMLDxZOWOWikb9URtNg= =Fjf4 -----END PGP SIGNATURE----- Merge tag 'v3.16-rc7' into perf/core, to merge in the latest fixes before applying new changes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-07-28 10:00:33 +02:00
Michael Ellerman	9de5cb0f6d	powerpc/perf: Add per-event excludes on Power8 Power8 has a new register (MMCR2), which contains individual freeze bits for each counter. This is an improvement on previous chips as it means we can have multiple events on the PMU at the same time with different exclude_{user,kernel,hv} settings. Previously we had to ensure all events on the PMU had the same exclude settings. The core of the patch is fairly simple. We use the 207S feature flag to indicate that the PMU backend supports per-event excludes, if it's set we skip the generic logic that enforces the equality of excludes between events. We also use that flag to skip setting the freeze bits in MMCR0, the PMU backend is expected to have handled setting them in MMCR2. The complication arises with EBB. The FCxP bits in MMCR2 are accessible R/W to a task using EBB. Which means a task using EBB will be able to see that we are using MMCR2 for freezing, whereas the old logic which used MMCR0 is not user visible. The task can not see or affect exclude_kernel & exclude_hv, so we only need to consider exclude_user. The table below summarises the behaviour both before and after this commit is applied: exclude_user true false ------------------------------------ \| User visible \| N N Before \| Can freeze \| Y Y \| Can unfreeze \| N Y ------------------------------------ \| User visible \| Y Y After \| Can freeze \| Y Y \| Can unfreeze \| Y/N Y ------------------------------------ So firstly I assert that the simple visibility of the exclude_user setting in MMCR2 is a non-issue. The event belongs to the task, and was most likely created by the task. So the exclude_user setting is not privileged information in any way. Secondly, the behaviour in the exclude_user = false case is unchanged. This is important as it is the case that is actually useful, ie. the event is created with no exclude setting and the task uses MMCR2 to implement exclusion manually. For exclude_user = true there is no meaningful change to freezing the event. Previously the task could use MMCR2 to freeze the event, though it was already frozen with MMCR0. With the new code the task can use MMCR2 to freeze the event, though it was already frozen with MMCR2. The only real change is when exclude_user = true and the task tries to use MMCR2 to unfreeze the event. Previously this had no effect, because the event was already frozen in MMCR0. With the new code the task can unfreeze the event in MMCR2, but at some indeterminate time in the future the kernel will overwrite its setting and refreeze the event. Therefore my final assertion is that any task using exclude_user = true and also fiddling with MMCR2 was deeply confused before this change, and remains so after it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:30:58 +10:00
Michael Ellerman	8abd818fc7	powerpc/perf: Pass the struct perf_events down to compute_mmcr() To support per-event exclude settings on Power8 we need access to the struct perf_events in compute_mmcr(). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:30:47 +10:00
Michael Ellerman	79a4cb28a0	powerpc/perf: Clear all MMCR settings before calling compute_mmcr() Because we reuse cpuhw->mmcr on each call to compute_mmcr() there's a risk that we could forget to set one of the values and use whatever value was in there previously. Currently all the implementations are careful to set all the values, but it's safer to clear them all before we call compute_mmcr(). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:11:34 +10:00
Michael Ellerman	633440f18f	powerpc: Document how we set AIL on guest kernels I spent ten minutes scratching my head, trying to work out where we enabled relocation on interrupts for guest kernels. Expand the doco to make it clear. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:11:27 +10:00
Michael Ellerman	8e83e9053f	powerpc/pseries: Switch pseries drivers to use machine_xxx_initcall() A lot of the code in platforms/pseries is using non-machine initcalls. That means if a kernel built with pseries support runs on another platform, for example powernv, the initcalls will still run. Most of these cases are OK, though sometimes only due to luck. Some were having more effect: * hcall_inst_init - Checking FW_FEATURE_LPAR which is set on ps3 & celleb. * mobility_sysfs_init - created sysfs files unconditionally - but no effect due to ENOSYS from rtas_ibm_suspend_me() * apo_pm_init - created sysfs, allows write - nothing checks the value written to though * alloc_dispatch_log_kmem_cache - creating kmem_cache on non-pseries machines Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:11:26 +10:00
Michael Ellerman	b14726c51c	powerpc/powernv: Switch powernv drivers to use machine_xxx_initcall() A lot of the code in platforms/powernv is using non-machine initcalls. That means if a kernel built with powernv support runs on another platform, for example pseries, the initcalls will still run. That is usually OK, because the initcalls will check for something in the device tree or elsewhere before doing anything, so on other platforms they will usually just return. But it's fishy for powernv code to be running on other platforms, so switch them all to be machine initcalls. If we want any of them to run on other platforms in future they should move to sysdev. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:11:26 +10:00
Michael Ellerman	8d3c941e24	powerpc: Add machine_early_initcall() Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:11:25 +10:00
Michael Ellerman	9daf112bd4	powerpc: Remove misleading DISABLE_INTS DISABLE_INTS has a long and storied history, but for some time now it has not actually disabled interrupts. For the open-coded exception handlers, just stop using it, instead call RECONCILE_IRQ_STATE directly. This has the benefit of removing a level of indirection, and making it clear that r10 & r11 are used at that point. For the addition case we still need a macro, so rename it to clarify what it actually does. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:11:24 +10:00
Michael Ellerman	a1d711c53f	powerpc: Document register clobbering in EXCEPTION_COMMON() Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:11:24 +10:00
Michael Ellerman	144beb2f53	powerpc: Update comments in irqflags.h The comment on TRACE_ENABLE_INTS is incorrect, and appears to have always been incorrect since the code was merged. It probably came from an original out-of-tree patch. Replace it with something that's correct. Also propagate the message to RECONCILE_IRQ_STATE(), because it's potentially subtle. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:11:23 +10:00
Michael Ellerman	4e2bf01b21	powerpc: Move bad_stack() below the fwnmi_data_area At the moment the allmodconfig build is failing because we run out of space between altivec_assist() at 0x5700 and the fwnmi_data_area at 0x7000. Fixing it permanently will take some more work, but a quick fix is to move bad_stack() below the fwnmi_data_area. That gives us just enough room with everything enabled. bad_stack() is called from the common exception handlers, but it's a non-conditional branch, so we have plenty of scope to move it further way. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:11:22 +10:00
Michael Ellerman	1e07a0a033	powerpc: Remove CLASSIC_PPC We have a strange #define in cputable.h called CLASSIC_PPC. Although it is defined for 32 & 64bit, it's only used for 32bit and it's basically a duplicate of CONFIG_PPC_BOOK3S_32, so let's use the latter. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:11:22 +10:00
Michael Ellerman	804ece07e9	powerpc: Remove CONFIG_POWER4 Although the name CONFIG_POWER4 suggests that it controls support for power4 cpus, this symbol is actually misnamed. It is a historical wart from the powermac code, which used to support building a 32-bit kernel for power4. CONFIG_POWER4 was used in that context to guard code that was 64-bit only. In the powermac code we can just use CONFIG_PPC64 instead, and in other places it is a synonym for CONFIG_PPC_BOOK3S_64. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:10:26 +10:00
Michael Ellerman	0f369103ce	powerpc: Remove power3 from comments There are still a few occurences where it remains, because it helps to explain something that persists. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:10:26 +10:00
Michael Ellerman	086dddc15f	powerpc: Remove oprofile RS64 support We no longer support these cpus, so we don't need oprofile support for them either. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:10:25 +10:00
Michael Ellerman	c3993f1007	powerpc: Remove CONFIG_POWER3 Now that we have dropped power3 support we can remove CONFIG_POWER3. The usage in pgtable_32.c was already dead code as CONFIG_POWER3 was not selectable on PPC32. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:10:24 +10:00
Michael Ellerman	cec15488c7	powerpc: Pull out ksp_vsid logic into a helper The previous patch left a bit of a wart in copy_process(). Clean it up a bit by moving the logic out into a helper. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:10:24 +10:00
Michael Ellerman	13b3d13b81	powerpc: Remove MMU_FTR_SLB We now only support cpus that use an SLB, so we don't need an MMU feature to indicate that. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:10:23 +10:00
Michael Ellerman	376af5947c	powerpc: Remove STAB code Old cpus didn't have a Segment Lookaside Buffer (SLB), instead they had a Segment Table (STAB). Now that we've dropped support for those cpus, we can remove the STAB support entirely. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:10:22 +10:00
Michael Ellerman	468a33028e	powerpc: Drop support for pre-POWER4 cpus We inadvertently broke power3 support back in 3.4 with commit `f5339277eb` "powerpc: Remove FW_FEATURE ISERIES from arch code". No one noticed until at least 3.9. By then we'd also broken it with the optimised memcpy, copy_to/from_user and clear_user routines. We don't want to add any more complexity to those just to support ancient cpus, so it seems like it's a good time to drop support for power3 and earlier. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:09:23 +10:00
Michael Ellerman	2061f7beaa	powerpc: Use standard macros for sys_sigpending() & sys_old_getrlimit() Currently we have sys_sigpending and sys_old_getrlimit defined to use COMPAT_SYS() in systbl.h, but then both are #defined to sys_ni_syscall in systbl.S. This seems to have been done when ppc and ppc64 were merged, in commit `9994a33` "Introduce entry_{32,64}.S, misc_{32,64}.S, systbl.S". AFAICS there's no longer (or never was) any need for this, we can just use SYSX() for both and remove the #defines to sys_ni_syscall. The expansion before was: #define COMPAT_SYS(func) .llong .sys_##func,.compat_sys_##func #define sys_old_getrlimit sys_ni_syscall COMPAT_SYS(old_getrlimit) => .llong .sys_old_getrlimit,.compat_sys_old_getrlimit => .llong .sys_ni_syscall,.compat_sys_old_getrlimit After is: #define SYSX(f, f3264, f32) .llong .f,.f3264 SYSX(sys_ni_syscall, compat_sys_old_getrlimit, sys_old_getrlimit) => .llong .sys_ni_syscall,.compat_sys_old_getrlimit ie. they are equivalent. Finally both COMPAT_SYS() and SYSX() evaluate to sys_ni_syscall in the Cell SPU code. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:09:23 +10:00
Benjamin Herrenschmidt	cdc2652ee5	Merge branch 'merge' into next Bring in some important fixes from the 3.16 branch	2014-07-28 13:41:12 +10:00
Thomas Falcon	396a34340c	powerpc: Fix endianness of flash_block_list in rtas_flash The function rtas_flash_firmware passes the address of a data structure, flash_block_list, when making the update-flash-64-and-reboot rtas call. While the endianness of the address is handled correctly, the endianness of the data is not. This patch ensures that the data in flash_block_list is big endian when passed to rtas on little endian hosts. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 11:30:54 +10:00
Vasant Hegde	fa952c54ba	powerpc/powernv: Change BUG_ON to WARN_ON in elog code We can continue to read the error log (up to MAX size) even if we get the elog size more than MAX size. Hence change BUG_ON to WARN_ON. Also updated error message. Reported-by: Gopesh Kumar Chaudhary <gopchaud@in.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Acked-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com> Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 11:30:54 +10:00
Kristina Martšenko	ad8c12eea0	staging: silicom: remove driver The driver hasn't been cleaned up and it doesn't look like anyone is working on it anymore (including the original author). So remove it. If someone wants to work on cleaning the driver up and moving it out of staging, this commit can be reverted. In addition, since this removes the CONFIG_NET_VENDOR_SILICOM config symbol, remove the symbol from all defconfig files that reference it. Signed-off-by: Kristina Martšenko <kristina.martsenko@gmail.com> Cc: Daniel Cotey <puff65537@bansheeslibrary.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-07-27 11:24:37 -07:00
Alexander Popov	ec1f0c9666	dmaengine: mpc512x: register for device tree channel lookup Register the controller for device tree based lookup of DMA channels (non-fatal for backwards compatibility with older device trees) and provide the '#dma-cells' property in the shared mpc5121.dtsi file Signed-off-by: Alexander Popov <a13xp0p0v88@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>	2014-07-26 00:21:42 +05:30
Grant Likely	259092a35c	of: Reorder device tree changes and notifiers Currently, devicetree reconfig notifiers get emitted before the change is applied to the tree, but that behaviour is problematic if the receiver wants the determine the new state of the tree. The current users don't care, but the changeset code to follow will be making multiple changes at once. Reorder notifiers to get emitted after the change has been applied to the tree so that callbacks see the new tree state. At the same time, fixup the existing callbacks to expect the new order. There are a few callbacks that compare the old and new values of a changed property. Put both property pointers into the of_prop_reconfig structure. The current notifiers also allow the notifier callback to fail and cancel the change to the tree, but that feature isn't actually used. It really isn't valid to ignore a tree modification provided by firmware anyway, so remove the ability to cancel a change to the tree. Signed-off-by: Grant Likely <grant.likely@linaro.org> Cc: Nathan Fontenot <nfont@austin.ibm.com>	2014-07-23 17:08:13 -06:00
Grant Likely	a25095d451	of: Move dynamic node fixups out of powerpc and into common code PowerPC does an odd thing with dynamic nodes. It uses a notifier to catch new node additions and set some of the values like name and type. This makes no sense since that same code can be put directly into of_attach_node(). Besides, all dynamic node users need this, not just powerpc. Fix this problem by moving the logic out of arch/powerpc and into drivers/of/dynamic.c. It is also important to remove this notifier because we want to move the firing of notifiers from before the tree is modified to after so that the receiver gets a consistent view of the tree, but that is incompatible with notifiers that modify the node. Signed-off-by: Grant Likely <grant.likely@linaro.org> Cc: Nathan Fontenot <nfont@austin.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-23 17:05:46 -06:00
Linus Torvalds	7442cf9ac2	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc fixes from Ben Herrenschmidt: "Here is a handful of powerpc fixes for 3.16. They are all pretty simple and self contained and should still make this release" * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc: use _GLOBAL_TOC for memmove powerpc/pseries: dynamically added OF nodes need to call of_node_init powerpc: subpage_protect: Increase the array size to take care of 64TB powerpc: Fix bugs in emulate_step() powerpc: Disable doorbells on Power8 DD1.x	2014-07-23 15:34:13 -07:00
Thomas Gleixner	4a0e637738	clocksource: Get rid of cycle_last cycle_last was added to the clocksource to support the TSC validation. We moved that to the core code, so we can get rid of the extra copy. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	2014-07-23 15:01:52 -07:00
Thomas Gleixner	f2dec1eae8	powerpc: cell: Use ktime_get_ns() Replace the ever recurring: ts = ktime_get_ts(); ns = timespec_to_ns(&ts); with ns = ktime_get_ns(); Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	2014-07-23 10:18:07 -07:00
Michael Ellerman	8903461c9b	powerpc/perf: Fix MMCR2 handling for EBB In the recent commit `b50a6c584b` "Clear MMCR2 when enabling PMU", I screwed up the handling of MMCR2 for tasks using EBB. We must make sure we set MMCR2 before ebb_switch_in(), otherwise we overwrite the value of MMCR2 that userspace may have written. That potentially breaks a task that uses EBB and manually uses MMCR2 for event freezing. Fixes: `b50a6c584b` ("powerpc/perf: Clear MMCR2 when enabling PMU") Cc: stable@vger.kernel.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-23 17:16:47 +10:00
Li Zhong	6f5405bc2e	powerpc: use _GLOBAL_TOC for memmove memmove may be called from module code copy_pages(btrfs), and it may call memcpy, which may call back to C code, so it needs to use _GLOBAL_TOC to set up r2 correctly. This fixes following error when I tried to boot an le guest: Vector: 300 (Data Access) at [c000000073f97210] pc: c000000000015004: enable_kernel_altivec+0x24/0x80 lr: c000000000058fbc: enter_vmx_copy+0x3c/0x60 sp: c000000073f97490 msr: 8000000002009033 dar: d000000001d50170 dsisr: 40000000 current = 0xc0000000734c0000 paca = 0xc00000000fff0000 softe: 0 irq_happened: 0x01 pid = 815, comm = mktemp enter ? for help [c000000073f974f0] c000000000058fbc enter_vmx_copy+0x3c/0x60 [c000000073f97510] c000000000057d34 memcpy_power7+0x274/0x840 [c000000073f97610] d000000001c3179c copy_pages+0xfc/0x110 [btrfs] [c000000073f97660] d000000001c3c248 memcpy_extent_buffer+0xe8/0x160 [btrfs] [c000000073f97700] d000000001be4be8 setup_items_for_insert+0x208/0x4a0 [btrfs] [c000000073f97820] d000000001be50b4 btrfs_insert_empty_items+0xf4/0x140 [btrfs] [c000000073f97890] d000000001bfed30 insert_with_overflow+0x70/0x180 [btrfs] [c000000073f97900] d000000001bff174 btrfs_insert_dir_item+0x114/0x2f0 [btrfs] [c000000073f979a0] d000000001c1f92c btrfs_add_link+0x10c/0x370 [btrfs] [c000000073f97a40] d000000001c20e94 btrfs_create+0x204/0x270 [btrfs] [c000000073f97b00] c00000000026d438 vfs_create+0x178/0x210 [c000000073f97b50] c000000000270a70 do_last+0x9f0/0xe90 [c000000073f97c20] c000000000271010 path_openat+0x100/0x810 [c000000073f97ce0] c000000000272ea8 do_filp_open+0x58/0xd0 [c000000073f97dc0] c00000000025ade8 do_sys_open+0x1b8/0x300 [c000000073f97e30] c00000000000a008 syscall_exit+0x0/0x7c Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-22 15:56:04 +10:00
Tyrel Datwyler	97a9a7179a	powerpc/pseries: dynamically added OF nodes need to call of_node_init Commit `75b57ecf9` refactored device tree nodes to use kobjects such that they can be exposed via /sysfs. A secondary commit `0829f6d1f` furthered this rework by moving the kobect initialization logic out of of_node_add into its own of_node_init function. The inital commit removed the existing kref_init calls in the pseries dlpar code with the assumption kobject initialization would occur in of_node_add. The second commit had the side effect of triggering a BUG_ON during DLPAR, migration and suspend/resume operations as a result of dynamically added nodes being uninitialized. This patch fixes this by adding of_node_init calls in place of the previously removed kref_init calls. Fixes: `0829f6d1f6` ("of: device_node kobject lifecycle fixes") Cc: stable@vger.kernel.org Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Acked-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Acked-by: Grant Likely <grant.likely@linaro.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-22 15:55:59 +10:00
Aneesh Kumar K.V	dad6f37c26	powerpc: subpage_protect: Increase the array size to take care of 64TB We now support TASK_SIZE of 16TB, hence the array should be 8. Fixes the below crash: Unable to handle kernel paging request for data at address 0x000100bd Faulting instruction address: 0xc00000000004f914 cpu 0x13: Vector: 300 (Data Access) at [c000000fea75fa90] pc: c00000000004f914: .sys_subpage_prot+0x2d4/0x5c0 lr: c00000000004fb5c: .sys_subpage_prot+0x51c/0x5c0 sp: c000000fea75fd10 msr: 9000000000009032 dar: 100bd dsisr: 40000000 current = 0xc000000fea6ae490 paca = 0xc00000000fb8ab00 softe: 0 irq_happened: 0x00 pid = 8237, comm = a.out enter ? for help [c000000fea75fe30] c00000000000a164 syscall_exit+0x0/0x98 Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-22 15:55:55 +10:00
Paul Mackerras	e698b96678	powerpc: Fix bugs in emulate_step() This fixes some bugs in emulate_step(). First, the setting of the carry bit for the arithmetic right-shift instructions was not correct on 64-bit machines because we were masking with a mask of type int rather than unsigned long. Secondly, the sld (shift left doubleword) instruction was using the wrong instruction field for the register containing the shift count. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-22 15:55:51 +10:00
Joel Stanley	bd6ba3518f	powerpc: Disable doorbells on Power8 DD1.x These processors do not currently support doorbell IPIs, so remove them from the feature list if we are at DD 1.xx for the 0x004d part. This fixes a regression caused by `d4e58e5928` (powerpc/powernv: Enable POWER8 doorbell IPIs). With that patch the kernel would hang at boot when calling smp_call_function_many, as the doorbell would not be received by the target CPUs: .smp_call_function_many+0x2bc/0x3c0 (unreliable) .on_each_cpu_mask+0x30/0x100 .cpuidle_register_driver+0x158/0x1a0 .cpuidle_register+0x2c/0x110 .powernv_processor_idle_init+0x23c/0x2c0 .do_one_initcall+0xd4/0x260 .kernel_init_freeable+0x25c/0x33c .kernel_init+0x1c/0x120 .ret_from_kernel_thread+0x58/0x7c Fixes: `d4e58e5928` (powerpc/powernv: Enable POWER8 doorbell IPIs) Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-22 15:55:24 +10:00
Linus Torvalds	5b2b9d7761	These are mostly PPC changes for 3.16-new things. However, there is an x86 change too and it is a regression from 3.14. As it only affects nested virtualization and there were other changes in this area in 3.16, I am not nominating it for 3.15-stable. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJTyTTfAAoJEBvWZb6bTYbyytIQAJare/EWQmNBDK57EcJBIlJS 6MW2XnASEW+KCoUw0+u3sm9eaRXQdmJRb1Aw5zxTiUIR3ZSI8MDSQr1XxEgTAOtE vFZjonPwlbnE8edLMhH3v/6/v9oO7bwNTDYeOE2pKPRfgPRjFmj1QUOJkvzRnRwj kS5M4RtI+VqhdyJW8f4HaWqoRaOAISp3ZjQUJQdab3DWsf9ZpNjwLNjKzGZKNvIN Klcpi7JH32zawUfqnAvph/7NsrBGrpFRE+j+JU9LLnD9PehuXwqZbWh01g2Anbq2 TKVrmXW+YnoD1IZsDw7r/14FaeRweV7yALA/eA9F4KfSyF2Qm9RbjVVdrUYz0CHV aIl0cZeZM8xRCLy/ZWj+dOQ23RWelZaslHSpshKOznoRsuuvVwpx93zVtRwlw2dx 4WJ2A5gYA+ZUQ7eWjk83381JXkbRDUb3cO+NL8t9GnFctCJzT/gQHjqu15f7TJ2Q gKhmeciKOS3xY4sQ+ti6gv8CwIFYqgdTzkxedxSgS9xpiAmw9v57V7WukXoXB6zl AyjEAk9FFOeBZ5nXs0ObK5LKjI+MJoZ3X0bin7PCuT6dFrIA2yHvo5EgMvOcUua9 8Tu9L8sWv/JsKjuqebkKxekAKvv0CV35Q8OsLpEF6RI0eXyiXy2extk1LzUuK9cx ZVYbN263++En/tgH2AJM =Vdqn -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "These are mostly PPC changes for 3.16-new things. However, there is an x86 change too and it is a regression from 3.14. As it only affects nested virtualization and there were other changes in this area in 3.16, I am not nominating it for 3.15-stable" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Check for nested events if there is an injectable interrupt KVM: PPC: RTAS: Do byte swaps explicitly KVM: PPC: Book3S PR: Fix ABIv2 on LE KVM: PPC: Assembly functions exported to modules need _GLOBAL_TOC() PPC: Add _GLOBAL_TOC for 32bit KVM: PPC: BOOK3S: HV: Use base page size when comparing against slb value KVM: PPC: Book3E: Unlock mmu_lock when setting caching atttribute	2014-07-21 11:19:18 -07:00
Linus Torvalds	d057190925	Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Thomas Gleixner: "The locking department delivers: - A rather large and intrusive bundle of fixes to address serious performance regressions introduced by the new rwsem / mcs technology. Simpler solutions have been discussed, but they would have been ugly bandaids with more risk than doing the right thing. - Make the rwsem spin on owner technology opt-in for architectures and enable it only on the known to work ones. - A few fixes to the lockdep userspace library" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/rwsem: Add CONFIG_RWSEM_SPIN_ON_OWNER locking/mutex: Disable optimistic spinning on some architectures locking/rwsem: Reduce the size of struct rw_semaphore locking/rwsem: Rename 'activity' to 'count' locking/spinlocks/mcs: Micro-optimize osq_unlock() locking/spinlocks/mcs: Introduce and use init macro and function for osq locks locking/spinlocks/mcs: Convert osq lock to atomic_t to reduce overhead locking/spinlocks/mcs: Rename optimistic_spin_queue() to optimistic_spin_node() locking/rwsem: Allow conservative optimistic spinning when readers have lock tools/liblockdep: Account for bitfield changes in lockdeps lock_acquire tools/liblockdep: Remove debug print left over from development tools/liblockdep: Fix comparison of a boolean value with a value of 2	2014-07-19 06:27:55 -10:00
Steven Rostedt (Red Hat)	96d4f43e3d	powerpc/ftrace: Add call to ftrace_graph_is_dead() in function graph code ftrace_stop() is going away as it disables parts of function tracing that affects users that should not be affected. But ftrace_graph_stop() is built on ftrace_stop(). Here's another example of killing all of function tracing because something went wrong with function graph tracing. Instead of disabling all users of function tracing on function graph error, disable only function graph tracing. To do this, the arch code must call ftrace_graph_is_dead() before it implements function graph. Cc: Anton Blanchard <anton@samba.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-07-18 13:56:56 -04:00
Bart Van Assche	aa3fc09078	tgt: defconfig cleanup Because of the removal of the scsi_tgt kernel module, the kbuild variables CONFIG_SCSI_TGT, CONFIG_SCSI_SRP_TGT_ATTRS and CONFIG_SCSI_FC_TGT_ATTRS are obsolete. This patch removes these variables. This patch is the result of the following command: find -name '*defconfig' \| while read f; do grep -vwE 'CONFIG_SCSI_TGT\|CONFIG_SCSI_SRP_TGT_ATTRS\|CONFIG_SCSI_FC_TGT_ATTRS\|CONFIG_SRP' $f >/tmp/t && mv /tmp/t $f; done Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-07-17 22:07:44 +02:00
Davidlohr Bueso	3a6bfbc91d	arch, locking: Ciao arch_mutex_cpu_relax() The arch_mutex_cpu_relax() function, introduced by `34b133f`, is hacky and ugly. It was added a few years ago to address the fact that common cpu_relax() calls include yielding on s390, and thus impact the optimistic spinning functionality of mutexes. Nowadays we use this function well beyond mutexes: rwsem, qrwlock, mcs and lockref. Since the macro that defines the call is in the mutex header, any users must include mutex.h and the naming is misleading as well. This patch (i) renames the call to cpu_relax_lowlatency ("relax, but only if you can do it with very low latency") and (ii) defines it in each arch's asm/processor.h local header, just like for regular cpu_relax functions. On all archs, except s390, cpu_relax_lowlatency is simply cpu_relax, and thus we can take it out of mutex.h. While this can seem redundant, I believe it is a good choice as it allows us to move out arch specific logic from generic locking primitives and enables future(?) archs to transparently define it, similarly to System Z. Signed-off-by: Davidlohr Bueso <davidlohr@hp.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Anton Blanchard <anton@samba.org> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Bharat Bhushan <r65777@freescale.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chen Liqin <liqin.linux@gmail.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Chris Zankel <chris@zankel.net> Cc: David Howells <dhowells@redhat.com> Cc: David S. Miller <davem@davemloft.net> Cc: Deepthi Dharwar <deepthi@linux.vnet.ibm.com> Cc: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Joe Perches <joe@perches.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Joseph Myers <joseph@codesourcery.com> Cc: Kees Cook <keescook@chromium.org> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Mikael Starvik <starvik@axis.com> Cc: Nicolas Pitre <nico@linaro.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Qais Yousef <qais.yousef@imgtec.com> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: Rafael Wysocki <rafael.j.wysocki@intel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Steven Miao <realmz6@gmail.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Stratos Karafotis <stratosk@semaphore.gr> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Vasily Kulikov <segoon@openwall.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com> Cc: Waiman Long <Waiman.Long@hp.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Wolfram Sang <wsa@the-dreams.de> Cc: adi-buildroot-devel@lists.sourceforge.net Cc: linux390@de.ibm.com Cc: linux-alpha@vger.kernel.org Cc: linux-am33-list@redhat.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-cris-kernel@axis.com Cc: linux-hexagon@vger.kernel.org Cc: linux-ia64@vger.kernel.org Cc: linux@lists.openrisc.net Cc: linux-m32r-ja@ml.linux-m32r.org Cc: linux-m32r@ml.linux-m32r.org Cc: linux-m68k@lists.linux-m68k.org Cc: linux-metag@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: linux-parisc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s390@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: linux-xtensa@linux-xtensa.org Cc: sparclinux@vger.kernel.org Link: http://lkml.kernel.org/r/1404079773.2619.4.camel@buesod1.americas.hpqcorp.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-07-17 12:32:47 +02:00
Linus Torvalds	bcf44bfe5e	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "A cpufreq lockup fix and a compiler warning fix" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix compiler warnings x86, tsc: Fix cpufreq lockup	2014-07-16 10:11:02 -10:00
Peter Zijlstra	4badad352a	locking/mutex: Disable optimistic spinning on some architectures The optimistic spin code assumes regular stores and cmpxchg() play nice; this is found to not be true for at least: parisc, sparc32, tile32, metag-lock1, arc-!llsc and hexagon. There is further wreckage, but this in particular seemed easy to trigger, so blacklist this. Opt in for known good archs. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Reported-by: Mikulas Patocka <mpatocka@redhat.com> Cc: David Miller <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: James Bottomley <James.Bottomley@hansenpartnership.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Jason Low <jason.low2@hp.com> Cc: Waiman Long <waiman.long@hp.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: John David Anglin <dave.anglin@bell.net> Cc: James Hogan <james.hogan@imgtec.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: stable@vger.kernel.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: sparclinux@vger.kernel.org Link: http://lkml.kernel.org/r/20140606175316.GV13930@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-07-16 14:57:07 +02:00
Linus Torvalds	5615f9f822	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Bluetooth pairing fixes from Johan Hedberg. 2) ieee80211_send_auth() doesn't allocate enough tail room for the SKB, from Max Stepanov. 3) New iwlwifi chip IDs, from Oren Givon. 4) bnx2x driver reads wrong PCI config space MSI register, from Yijing Wang. 5) IPV6 MLD Query validation isn't strong enough, from Hangbin Liu. 6) Fix double SKB free in openvswitch, from Andy Zhou. 7) Fix sk_dst_set() being racey with UDP sockets, leading to strange crashes, from Eric Dumazet. 8) Interpret the NAPI budget correctly in the new systemport driver, from Florian Fainelli. 9) VLAN code frees percpu stats in the wrong place, leading to crashes in the get stats handler. From Eric Dumazet. 10) TCP sockets doing a repair can crash with a divide by zero, because we invoke tcp_push() with an MSS value of zero. Just skip that part of the sendmsg paths in repair mode. From Christoph Paasch. 11) IRQ affinity bug fixes in mlx4 driver from Amir Vadai. 12) Don't ignore path MTU icmp messages with a zero mtu, machines out there still spit them out, and all of our per-protocol handlers for PMTU can cope with it just fine. From Edward Allcutt. 13) Some NETDEV_CHANGE notifier invocations were not passing in the correct kind of cookie as the argument, from Loic Prylli. 14) Fix crashes in long multicast/broadcast reassembly, from Jon Paul Maloy. 15) ip_tunnel_lookup() doesn't interpret wildcard keys correctly, fix from Dmitry Popov. 16) Fix skb->sk assigned without taking a reference to 'sk' in appletalk, from Andrey Utkin. 17) Fix some info leaks in ULP event signalling to userspace in SCTP, from Daniel Borkmann. 18) Fix deadlocks in HSO driver, from Olivier Sobrie. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (93 commits) hso: fix deadlock when receiving bursts of data hso: remove unused workqueue net: ppp: don't call sk_chk_filter twice mlx4: mark napi id for gro_skb bonding: fix ad_select module param check net: pppoe: use correct channel MTU when using Multilink PPP neigh: sysctl - simplify address calculation of gc_* variables net: sctp: fix information leaks in ulpevent layer MAINTAINERS: update r8169 maintainer net: bcmgenet: fix RGMII_MODE_EN bit tipc: clear 'next'-pointer of message fragments before reassembly r8152: fix r8152_csum_workaround function be2net: set EQ DB clear-intr bit in be_open() GRE: enable offloads for GRE farsync: fix invalid memory accesses in fst_add_one() and fst_init_card() igb: do a reset on SR-IOV re-init if device is down igb: Workaround for i210 Errata 25: Slow System Clock usbnet: smsc95xx: add reset_resume function with reset operation dp83640: Always decode received status frames r8169: disable L23 ...	2014-07-15 08:42:52 -07:00
Anton Blanchard	c49f63530b	powernv: Add OPAL tracepoints Knowing how long we spend in firmware calls is an important part of minimising OS jitter. This patch adds tracepoints to each OPAL call. If tracepoints are enabled we branch out to a common routine that calls an entry and exit tracepoint. This allows us to write tools that monitor the frequency and duration of OPAL calls, eg: name count total(ms) min(ms) max(ms) avg(ms) period(ms) OPAL_HANDLE_INTERRUPT 5 0.199 0.037 0.042 0.040 12547.545 OPAL_POLL_EVENTS 204 2.590 0.012 0.036 0.013 2264.899 OPAL_PCI_MSI_EOI 2830 3.066 0.001 0.005 0.001 81.166 We use jump labels if configured, which means we only add a single nop instruction to every OPAL call when the tracepoints are disabled. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 16:06:08 +10:00
Anton Blanchard	aaad422482	powerpc/pseries: Optimise hcall tracepoints Now that we execute the hcall tracepoint entry and exit code out of line, we can use the same stack across both functions. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 16:06:03 +10:00
Anton Blanchard	cc1adb5f32	powerpc/pseries: Use jump labels for hcall tracepoints hcall tracepoints add quite a few instructions to our hcall path: plpar_hcall: mr r2,r2 mfcr r0 stw r0,8(r1) b 164 <---- start ld r12,0(r2) std r12,32(r1) cmpdi r12,0 beq 164 <---- end ... We have an unconditional branch that gets noped out during boot and a load/compare/branch. We also store the tracepoint value to the stack for the hcall_exit path to use. By using jump labels we can simplify this to just a single nop that gets replaced with a branch when the tracepoint is enabled: plpar_hcall: mr r2,r2 mfcr r0 stw r0,8(r1) nop <---- ... If jump labels are not enabled, we fall back to the old method. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 16:05:58 +10:00
Alexey Kardashevskiy	8fa5d4547e	powerpc/powernv: Add a page size parameter to pnv_pci_setup_iommu_table() Since a TCE page size can be other than 4K, make it configurable for P5IOC2 and IODA PHBs. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 16:05:53 +10:00
Alexey Kardashevskiy	bc32057ed5	powerpc/powernv: Use it_page_shift in TCE build This makes use of iommu_table::it_page_shift instead of TCE_SHIFT and TCE_RPN_SHIFT hardcoded values. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 16:05:48 +10:00
Alexey Kardashevskiy	b0376c9b08	powerpc/powernv: Use it_page_shift for TCE invalidation This fixes IODA1/2 to use it_page_shift as it may be bigger than 4K. This changes involved constant values to use "ull" modifier. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 16:05:43 +10:00
Benjamin Herrenschmidt	5b97259220	Merge branch 'merge' into next	2014-07-11 15:38:12 +10:00
Anton Blanchard	f56029410a	powerpc/perf: Never program book3s PMCs with values >= 0x80000000 We are seeing a lot of PMU warnings on POWER8: Can't find PMC that caused IRQ Looking closer, the active PMC is 0 at this point and we took a PMU exception on the transition from negative to 0. Some versions of POWER8 have an issue where they edge detect and not level detect PMC overflows. A number of places program the PMC with (0x80000000 - period_left), where period_left can be negative. We can either fix all of these or just ensure that period_left is always >= 1. This patch takes the second option. Cc: <stable@vger.kernel.org> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 13:50:47 +10:00
Guenter Roeck	fb43e8477e	powerpc: Disable RELOCATABLE for COMPILE_TEST with PPC64 powerpc:allmodconfig has been failing for some time with the following error. arch/powerpc/kernel/exceptions-64s.S: Assembler messages: arch/powerpc/kernel/exceptions-64s.S:1312: Error: attempt to move .org backwards make[1]: *** [arch/powerpc/kernel/head_64.o] Error 1 A number of attempts to fix the problem by moving around code have been unsuccessful and resulted in failed builds for some configurations and the discovery of toolchain bugs. Fix the problem by disabling RELOCATABLE for COMPILE_TEST builds instead. While this is less than perfect, it avoids substantial code changes which would otherwise be necessary just to make COMPILE_TEST builds happy and might have undesired side effects. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 12:55:09 +10:00
Joel Stanley	b50a6c584b	powerpc/perf: Clear MMCR2 when enabling PMU On POWER8 when switching to a KVM guest we set bits in MMCR2 to freeze the PMU counters. Aside from on boot they are then never reset, resulting in stuck perf counters for any user in the guest or host. We now set MMCR2 to 0 whenever enabling the PMU, which provides a sane state for perf to use the PMU counters under either the guest or the host. This was manifesting as a bug with ppc64_cpu --frequency: $ sudo ppc64_cpu --frequency WARNING: couldn't run on cpu 0 WARNING: couldn't run on cpu 8 ... WARNING: couldn't run on cpu 144 WARNING: couldn't run on cpu 152 min: 18446744073.710 GHz (cpu -1) max: 0.000 GHz (cpu -1) avg: 0.000 GHz The command uses a perf counter to measure CPU cycles over a fixed amount of time, in order to approximate the frequency of the machine. The counters were returning zero once a guest was started, regardless of weather it was still running or had been shut down. By dumping the value of MMCR2, it was observed that once a guest is running MMCR2 is set to 1s - which stops counters from running: $ sudo sh -c 'echo p > /proc/sysrq-trigger' CPU: 0 PMU registers, ppmu = POWER8 n_counters = 6 PMC1: 5b635e38 PMC2: 00000000 PMC3: 00000000 PMC4: 00000000 PMC5: 1bf5a646 PMC6: 5793d378 PMC7: deadbeef PMC8: deadbeef MMCR0: 0000000080000000 MMCR1: 000000001e000000 MMCRA: 0000040000000000 MMCR2: fffffffffffffc00 EBBHR: 0000000000000000 EBBRR: 0000000000000000 BESCR: 0000000000000000 SIAR: 00000000000a51cc SDAR: c00000000fc40000 SIER: 0000000001000000 This is done unconditionally in book3s_hv_interrupts.S upon entering the guest, and the original value is only save/restored if the host has indicated it was using the PMU. This is okay, however the user of the PMU needs to ensure that it is in a defined state when it starts using it. Fixes: `e05b9b9e5c` ("powerpc/perf: Power8 PMU support") Cc: stable@vger.kernel.org Signed-off-by: Joel Stanley <joel@jms.id.au> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 12:55:08 +10:00
Joel Stanley	4d9690dd56	powerpc/perf: Add PPMU_ARCH_207S define Instead of separate bits for every POWER8 PMU feature, have a single one for v2.07 of the architecture. This saves us adding a MMCR2 define for a future patch. Cc: stable@vger.kernel.org Signed-off-by: Joel Stanley <joel@jms.id.au> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 12:55:07 +10:00
Joel Stanley	f73128f4f6	powerpc/kvm: Remove redundant save of SIER AND MMCR2 These two registers are already saved in the block above. Aside from being unnecessary, by the time we get down to the second save location r8 no longer contains MMCR2, so we are clobbering the saved value with PMC5. MMCR2 primarily consists of counter freeze bits. So restoring the value of PMC5 into MMCR2 will most likely have the effect of freezing counters. Fixes: `72cde5a88d` ("KVM: PPC: Book3S HV: Save/restore host PMU registers that are new in POWER8") Cc: stable@vger.kernel.org Signed-off-by: Joel Stanley <joel@jms.id.au> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Paul Mackerras <paulus@samba.org> Reviewed-by: Alexander Graf <agraf@suse.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 12:55:07 +10:00
Preeti U Murthy	c733cf83bb	powerpc/powernv: Check for IRQHAPPENED before sleeping Commit `8d6f7c5a`: "powerpc/powernv: Make it possible to skip the IRQHAPPENED check in power7_nap()" added code that prevents cpus from checking for pending interrupts just before entering sleep state, which is wrong. These interrupts are delivered during the soft irq disabled state of the cpu. A cpu cannot enter any idle state with pending interrupts because they will never be serviced until the next time the cpu is woken up by some other interrupt. Its only then that the pending interrupts are replayed. This can result in device timeouts or warnings about this cpu being stuck. This patch fixes ths issue by ensuring that cpus check for pending interrupts just before entering any idle state as long as they are not in the path of split core operations. Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 12:55:06 +10:00
Michael Ellerman	cd68098bce	powerpc: Clean up MMU_FTRS_A2 and MMU_FTR_TYPE_3E In `fb5a515704` "powerpc: Remove platforms/wsp and associated pieces", we removed the last user of MMU_FTRS_A2. So remove it. MMU_FTRS_A2 was the last user of MMU_FTR_TYPE_3E, so remove it also. This leaves some unreachable code in mmu_context_nohash.c, so remove that also. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 12:55:05 +10:00
Michael Ellerman	e623fbf1c4	powerpc/cell: Fix compilation with CONFIG_COREDUMP=n Commit `046d662f48` "coredump: make core dump functionality optional" made the coredump optional, but didn't update the spufs code that depends on it. That leads to build errors such as: arch/powerpc/platforms/built-in.o: In function `.spufs_arch_write_note': coredump.c:(.text+0x22cd4): undefined reference to `.dump_emit' coredump.c:(.text+0x22cf4): undefined reference to `.dump_emit' coredump.c:(.text+0x22d0c): undefined reference to `.dump_align' coredump.c:(.text+0x22d48): undefined reference to `.dump_emit' coredump.c:(.text+0x22e7c): undefined reference to `.dump_skip' Fix it by adding some ifdefs in the cell code. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 12:55:05 +10:00
Nitesh Narayan Lal	6f39da1cad	crypto: dts - Addition of missing SEC compatibile property in c29x device tree The driver is compatible with SEC version 4.0, which was missing from device tree resulting that the caam driver doesn't gets probed. Since SEC is backward compatible with older versions, so this patch adds those missing versions in c29x device tree. Signed-off-by: Nitesh Narayan Lal <b44382@freescale.com> Signed-off-by: Vakul Garg <b16394@freescale.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2014-07-08 21:06:32 +08:00
Paolo Bonzini	bb18b526a9	Patch queue for 3.16 - 2014-07-08 A few bug fixes to make 3.16 work well with KVM on PowerPC: - Fix ppc32 module builds - Fix Little Endian hosts - Fix Book3S HV HPTE lookup with huge pages in guest - Fix BookE lock leak -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQIcBAABAgAGBQJTu8HqAAoJECszeR4D/txg7u0QAMk3SC2+yBVsOAusB0YvERyU Y9X4Lz9fALdeZf2Fd2Qk9BD5y283LDppZqyoy+dmef9DXopfCv0Kh4rl/GrlG9ny aHeiBfJGpIpjqZnvkZP0Ln9zpyg7gMLRVNfNJvZWji8RcHly6m9/bxEkG0HnX6Hn /2UkUzOdk2aymjzMqFXdHODdC0JsGtWtGBiVC+HOtIf1D3TX42R4KI+ieOSKjGDp OYgN2XskOMgiXvPtEx2yMyHAAw5OTCVNdFt6Co1x0qUsz560Wy3Hy6QCwiroLrPH rjxkHhcQN0GJJLXs/jajdDJoEp5wYLRomReZbdrKgBj+zGvQQgGRD+RO9iyfedlm 4hTw98tgmHcPgFTIXQlG5U8Cn0/oPr/k7FWBZJDpiUCTNRI/rsL6eHX7Wu/ylUfm uvcwdl5tXdM2OMHE2wEB4pEwSAK4TNGjx237txNgaeLu4ZT8yk4TQnOXlxyMJQe7 /Bfh8oUKBqRlWAymwut8y/cazZCRDFAx88ovwqAW9GXxgB+tiCeIDLNnLYEkjEmV 8l+viAjZz3LbzLeFxCxHnNha9WhK7A7kNGhYaWn1+N2Zlz1F3u3mQm5QoZ1UJgIH TtbwWsfM7jYrlUsJB1xTeL5Hs8JhOTp+kgLpMbRXe1sNX1xqh+OQZHsJ16VB6zU9 RiOjHnv2D9/icH0B2DsW =+sQF -----END PGP SIGNATURE----- Merge tag 'signed-for-3.16' of git://github.com/agraf/linux-2.6 into kvm-master Patch queue for 3.16 - 2014-07-08 A few bug fixes to make 3.16 work well with KVM on PowerPC: - Fix ppc32 module builds - Fix Little Endian hosts - Fix Book3S HV HPTE lookup with huge pages in guest - Fix BookE lock leak	2014-07-08 12:08:58 +02:00
Alexander Graf	19a44ecff5	KVM: PPC: RTAS: Do byte swaps explicitly In commit `b59d9d26b` we introduced implicit byte swaps for RTAS calls. Unfortunately we messed up and didn't swizzle return values properly. Also the old approach wasn't "sparse" compatible - we were randomly reading __be32 values on an LE system. Let's just do all of the swizzling explicitly with byte swaps right where values get used. That way we can at least catch bugs using sparse. This patch fixes XICS RTAS emulation on little endian hosts for me. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-07 23:17:20 +02:00
Alexander Graf	55ab169b7b	KVM: PPC: Book3S PR: Fix ABIv2 on LE We switched to ABIv2 on Little Endian systems now which gets rid of the dotted function names. Branch to the actual functions when we see such a system. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-07 12:02:51 +02:00
Anton Blanchard	6ed179b67c	KVM: PPC: Assembly functions exported to modules need _GLOBAL_TOC() Both kvmppc_hv_entry_trampoline and kvmppc_entry_trampoline are assembly functions that are exported to modules and also require a valid r2. As such we need to use _GLOBAL_TOC so we provide a global entry point that establishes the TOC (r2). Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-07 12:02:32 +02:00
Aneesh Kumar K.V	3cd60e3118	KVM: PPC: BOOK3S: PR: Fix PURR and SPURR emulation We use time base for PURR and SPURR emulation with PR KVM since we are emulating a single threaded core. When using time base we need to make sure that we don't accumulate time spent in the host in PURR and SPURR value. Also we don't need to emulate mtspr because both the registers are hypervisor resource. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-06 13:56:49 +02:00
Vince Weaver	cc56d673a9	powerpc, perf: Use common PMU interrupt disabled code Transition to using the new generic PERF_PMU_CAP_NO_INTERRUPT method for failing a sampling event when no PMU interrupt is available. Signed-off-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1406191435440.27913@vincent-weaver-1.umelst.maine.edu Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Cody P Schafer <cody@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: linux-kernel@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-07-05 11:21:51 +02:00
Gavin Shan	21dd5a43d0	powerpc/pci: Remove duplicate logic Since the logic to reset PCI secondary bus by PCI config register PCI_BRIDGE_CTL_BUS_RESET is included in pci_reset_secondary_bus(), we needn't implement another one. Remove the duplicate implementation and call pci_reset_secondary_bus(). Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2014-07-03 16:45:04 -06:00
Laurentiu TUDOR	cd1154770b	powerpc/85xx: drop hypervisor specific board compatibles They're almost a duplicate of the boards array and we can build them at run-time. Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2014-07-02 17:33:10 -05:00
Shengzhou Liu	4c18be2bf5	powerpc/fsl-booke: Add initial T208x QDS board support Add support for Freescale T2080/T2081 QDS Development System Board. The T2080QDS Development System is a high-performance computing, evaluation, and development platform that supports T2080 QorIQ Power Architecture processor, with following major features: T2080QDS feature overview: Processor: - T2080 SoC integrating four 64-bit dual-threads e6500 cores up to 1.8GHz Memory: - Single memory controller capable of supporting DDR3 and DDR3-LP - Dual DIMM slots up 2133MT/s with ECC Ethernet interfaces: - Two 1Gbps RGMII on-board ports - Four 10Gbps XFI on-board cages - 1Gbps/2.5Gbps SGMII Riser card - 10Gbps XAUI Riser card Accelerator: - DPAA components consist of FMan, BMan, QMan, PME, DCE and SEC SerDes: - 16 lanes up to 10.3125GHz - Supports Aurora debug, PEX, SATA, SGMII, sRIO, HiGig, XFI and XAUI IFC: - 128MB NOR Flash, 512MB NAND Flash, PromJet debug port and FPGA eSPI: - Three SPI flash (16MB N25Q128A + 8MB EN25S64 + 512KB SST25WF040) USB: - Two USB2.0 ports with internal PHY (one Type-A + one micro Type-AB) PCIE: - Four PCI Express controllers (two PCIe 2.0 and two PCIe 3.0, SR-IOV) SATA: - Two SATA 2.0 ports on-board SRIO: - Two Serial RapidIO 2.0 ports up to 5 GHz eSDHC: - Supports SD/MMC/eMMC Card DMA: - Three 8-channels DMA controllers I2C: - Four I2C controllers. UART: - Dual 4-pins UART serial ports System Logic: - QIXIS-II FPGA system controll T2081QDS board shares the same PCB with T1040QDS with some differences. Signed-off-by: Shengzhou Liu <Shengzhou.Liu@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2014-07-02 17:33:09 -05:00
Shengzhou Liu	1d8de8fced	powerpc/fsl-booke: Add support for T2080/T2081 SoC The T2080 QorIQ multicore processor combines four dual-threaded e6500 Power Architecture processor cores with high-performance datapath acceleration logic and network and peripheral bus interfaces required for networking, telecom/datacom, wireless infrastructure, and mil/aerospace applications. The T2080 SoC includes the following function and features: - Four dual-threaded 64-bit Power architecture e6500 cores, up to 1.8GHz - 2MB L2 cache and 512KB CoreNet platform cache (CPC) - Hierarchical interconnect fabric - One 32-/64-bit DDR3/3L SDRAM memory controllers with ECC and interleaving - Data Path Acceleration Architecture (DPAA) incorporating acceleration - 16 SerDes lanes up to 10.3125 GHz - 8 Ethernet interfaces (multiple 1G/2.5G/10G MACs) - High-speed peripheral interfaces - Four PCI Express controllers (two PCIe 2.0 and two PCIe 3.0) - Two Serial RapidIO 2.0 controllers/ports running at up to 5 GHz - Additional peripheral interfaces - Two serial ATA (SATA 2.0) controllers - Two high-speed USB 2.0 controllers with integrated PHY - Enhanced secure digital host controller (SD/SDXC/eMMC) - Enhanced serial peripheral interface (eSPI) - Four I2C controllers - Four 2-pin UARTs or two 4-pin UARTs - Integrated Flash Controller supporting NAND and NOR flash - Three eight-channel DMA engines - Support for hardware virtualization and partitioning enforcement - QorIQ Platform's Trust Architecture 2.0 T2081 is a reduced personality of T2080 with following difference: Feature T2080 T2081 1G Ethernet numbers: 8 6 10G Ethernet numbers: 4 2 SerDes lanes: 16 8 Serial RapidIO,RMan: 2 no SATA Controller: 2 no Aurora: yes no SoC Package: 896-pins 780-pins Signed-off-by: Shengzhou Liu <Shengzhou.Liu@freescale.com> [scottwood@freescale.com: added fsl,qoriq-pci-v3.0 for U-Boot compat] Signed-off-by: Scott Wood <scottwood@freescale.com>	2014-07-02 17:32:41 -05:00
Guenter Roeck	b6220ad66b	sched: Fix compiler warnings Commit `143e1e28cb` (sched: Rework sched_domain topology definition) introduced a number of functions with a return value of 'const int'. gcc doesn't know what to do with that and, if the kernel is compiled with W=1, complains with the following warnings whenever sched.h is included. include/linux/sched.h:875:25: warning: type qualifiers ignored on function return type include/linux/sched.h:882:25: warning: type qualifiers ignored on function return type include/linux/sched.h:889:25: warning: type qualifiers ignored on function return type include/linux/sched.h:1002:21: warning: type qualifiers ignored on function return type Commits `fb2aa855` (sched, ARM: Create a dedicated scheduler topology table) and `607b45e9a` (sched, powerpc: Create a dedicated topology table) introduce the same warning in the arm and powerpc code. Drop 'const' from the function declarations to fix the problem. The fix for all three patches has to be applied together to avoid compilation failures for the affected architectures. Acked-by: Vincent Guittot <vincent.guittot@linaro.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Cc: Russell King <linux@arm.linux.org.uk> Cc: Paul Mackerras <paulus@samba.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1403658329-13196-1-git-send-email-linux@roeck-us.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-07-02 08:33:48 +02:00
Denis Kirjanov	dba63115ce	powerpc: bpf: Fix the broken LD_VLAN_TAG_PRESENT test We have to return the boolean here if the tag presents or not, not just ANDing the TCI with the mask which results to: [ 709.412097] test_bpf: #18 LD_VLAN_TAG_PRESENT [ 709.412245] ret 4096 != 1 [ 709.412332] ret 4096 != 1 [ 709.412333] FAIL (2 times) Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-27 16:14:12 -07:00
Denis Kirjanov	3fc60aa097	powerpc: bpf: Use correct mask while accessing the VLAN tag To get a full tag (and not just a VID) we should access the TCI except the VLAN_TAG_PRESENT field (which means that 802.1q header is present). Also ensure that the VLAN_TAG_PRESENT stay on its place Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-06-27 16:14:12 -07:00
Grant Likely	ccdb8ed3b3	of: Migrate of_find_node_by_name() users to for_each_node_by_name() There are a bunch of users open coding the for_each_node_by_name() by calling of_find_node_by_name() directly instead of using the macro. This is getting in the way of some cleanups, and the possibility of removing of_find_node_by_name() entirely. Clean it up so that all the users are consistent. Signed-off-by: Grant Likely <grant.likely@linaro.org> Cc: Rob Herring <robh+dt@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Takashi Iwai <tiwai@suse.de>	2014-06-26 17:12:24 +01:00
Alexander Graf	9715a2e851	PPC: Add _GLOBAL_TOC for 32bit Commit ac5a8ee8 started using _GLOBAL_TOC on ppc32 code. Unfortunately it's only defined for 64bit targets though. Define it for ppc32 as well, fixing the build breakage that commit introduced. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-26 13:19:42 +02:00
Scott Wood	087dfae3fe	powerpc/8xx: Remove empty asm/mpc8xx.h m8xx_pcmcia_ops was the only thing in this file (other than a comment that describes a usage that doesn't match the file's contents); now that m8xx_pcmcia_ops is gone, remove the empty file. Signed-off-by: Scott Wood <scottwood@freescale.com> Cc: Pantelis Antoniou <pantelis.antoniou@gmail.com> Cc: Vitaly Bordug <vitb@kernel.crashing.org> Cc: netdev@vger.kernel.org	2014-06-25 18:49:40 -05:00
Scott Wood	39eb56da2b	pcmcia: Remove m8xx_pcmcia driver This driver doesn't build, and apparently has not built since arch/ppc was removed in 2008 (when mk_int_int_mask was removed from asm/irq.h, among other build errors). A few weeks ago I asked whether anyone was actively maintaining this code, and got no positive response: http://patchwork.ozlabs.org/patch/352082/ So, let's remove it. Signed-off-by: Scott Wood <scottwood@freescale.com> Cc: Vitaly Bordug <vitb@kernel.crashing.org> Cc: linux-pcmcia@lists.infradead.org Cc: Paul Bolle <pebolle@tiscali.nl>	2014-06-25 18:49:39 -05:00
Bharat Bhushan	2759a7f13d	booke/powerpc: define wimge shift mask to fix compilation error This fixes below compilation error on SOCs where CONFIG_PHYS_64BIT is not defined: arch/powerpc/kvm/e500_mmu_host.c: In function 'kvmppc_e500_shadow_map': \| arch/powerpc/kvm/e500_mmu_host.c:631:20: error: 'PTE_WIMGE_SHIFT' undeclared (first use in this function) \| wimg = (ptep >> PTE_WIMGE_SHIFT) & MAS2_WIMGE_MASK; \| ^ \| arch/powerpc/kvm/e500_mmu_host.c:631:20: note: each undeclared identifier is reported only once for each function it appears in \| make[1]: ** [arch/powerpc/kvm/e500_mmu_host.o] Error 1 Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2014-06-25 18:49:39 -05:00
Wladislav Wiebe	c152833949	powerpc/traps/e500: fix misleading error output In machine_check_e500 exception handler is a wrong indication in case of MCSR_BUS_WBERR - so print "Write" instead of "Read". Signed-off-by: Wladislav Wiebe <wladislav.kw@gmail.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2014-06-25 18:49:38 -05:00
Chunhe Lan	36a2a09d57	powerpc/85xx: Add T4240RDB board support T4240RDB board Specification ---------------------------- Memory subsystem: 6GB DDR3 128MB NOR flash 2GB NAND flash Ethernet: Eight 1G SGMII ports Four 10Gbps SFP+ ports PCIe: Two PCIe slots USB: Two USB2.0 Type A ports SDHC: One SD-card port SATA: One SATA port UART: Dual RJ45 ports Signed-off-by: Chunhe Lan <Chunhe.Lan@freescale.com> Cc: Scott Wood <scottwood@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2014-06-25 18:49:35 -05:00
Aneesh Kumar K.V	341acbb3aa	KVM: PPC: BOOK3S: HV: Use base page size when comparing against slb value With guests supporting Multiple page size per segment (MPSS), hpte_page_size returns the actual page size used. Add a new function to return base page size and use that to compare against the the page size calculated from SLB. Without this patch a hpte lookup can fail since we are comparing wrong page size in kvmppc_hv_find_lock_hpte. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-25 14:07:06 +02:00
Scott Wood	6663a4fa67	powerpc: Don't skip ePAPR spin-table CPUs Commit `59a53afe70` "powerpc: Don't setup CPUs with bad status" broke ePAPR SMP booting. ePAPR says that CPUs that aren't presently running shall have status of disabled, with enable-method being used to determine whether the CPU can be enabled. Fix by checking for spin-table, which is currently the only supported enable-method. Signed-off-by: Scott Wood <scottwood@freescale.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Emil Medve <Emilian.Medve@Freescale.com> Cc: stable@vger.kernel.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-25 13:10:49 +10:00
Laurent Dufour	c2cbcf533a	powerpc/module: Fix TOC symbol CRC The commit `71ec7c55ed` introduced the magic symbol ".TOC." for ELFv2 ABI. This symbol is built manually and has no CRC value computed. A zero value is put in the CRC section to avoid modpost complaining about a missing CRC. Unfortunately, this breaks the kernel module loading when the kernel is relocated (kdump case for instance) because of the relocation applied to the kcrctab values. This patch compute a CRC value for the TOC symbol which will match the one compute by the kernel when it is relocated - aka '0 - relocate_start' done in maybe_relocated called by check_version (module.c). Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com> Cc: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-25 13:10:48 +10:00
Michael Ellerman	e2500be2b8	powerpc/powernv: Remove OPAL v1 takeover In commit `27f4488872` "Add OPAL takeover from PowerVM" we added support for "takeover" on OPAL v1 machines. This was a mode of operation where we would boot under pHyp, and query for the presence of OPAL. If detected we would then do a special sequence to take over the machine, and the kernel would end up running in hypervisor mode. OPAL v1 was never a supported product, and was never shipped outside IBM. As far as we know no one is still using it. Newer versions of OPAL do not use the takeover mechanism. Although the query for OPAL should be harmless on machines with newer OPAL, we have seen a machine where it causes a crash in Open Firmware. The code in early_init_devtree() to copy boot_command_line into cmd_line was added in commit `817c21ad9a` "Get kernel command line accross OPAL takeover", and AFAIK is only used by takeover, so should also be removed. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-25 13:10:47 +10:00
Mihai Caraman	511c66818d	KVM: PPC: Book3E: Unlock mmu_lock when setting caching atttribute The patch `08c9a188d0` kvm: powerpc: use caching attributes as per linux pte do not handle properly the error case, letting mmu_lock locked. The lock will further generate a RCU stall from kvmppc_e500_emul_tlbwe() caller. In case of an error go to out label. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-06-24 15:37:25 +02:00
Catalin Marinas	a1d23d5c94	powerpc/kmemleak: Do not scan the DART table The DART table allocation is registered to kmemleak via the memblock_alloc_base() call. However, the DART table is later unmapped and dart_tablebase VA no longer accessible. This patch tells kmemleak not to scan this block and avoid an unhandled paging request. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-24 14:29:46 +10:00
Rickard Strandqvist	f5fc82290c	powerpc/cell: cbe_thermal.c: Cleaning up a variable is of the wrong type This variable is of the wrong type, everywhere it is used it should be an unsigned int rather than a int. Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-24 14:05:59 +10:00
Michael Ellerman	2f0143c91d	powerpc/kprobes: Fix jprobes on ABI v2 (LE) In commit `721aeaa9` "Build little endian ppc64 kernel with ABIv2", we missed some updates required in the kprobes code to make jprobes work when the kernel is built with ABI v2. Firstly update arch_deref_entry_point() to do the right thing. Now that we have added ppc_global_function_entry() we can just always use that, it will do the right thing for 32 & 64 bit and ABI v1 & v2. Secondly we need to update the code that sets up the register state before calling the jprobe handler. On ABI v1 we setup r2 to hold the TOC, on ABI v2 we need to populate r12 with the function entry point address. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-24 14:05:55 +10:00
Michael Ellerman	072c4c018e	powerpc/ftrace: Use pr_fmt() to namespace error messages The printks() in our ftrace code have no prefix, so they appear on the console with very little context, eg: Branch out of range Use pr_fmt() & pr_err() to add a prefix. While we're at it, collapse a few split lines that don't need to be, and add a missing newline to one message. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-24 14:05:50 +10:00
Michael Ellerman	d84e0d69c2	powerpc/ftrace: Fix nop of modules on 64bit LE (ABIv2) There is a bug in the handling of the function entry when we are nopping out a branch from a module in ftrace. We compare the result of module_trampoline_target() with the value of ppc_function_entry(), and expect them to be true. But they never will be. module_trampoline_target() will always return the global entry point of the function, whereas ppc_function_entry() will always return the local. Fix it by using the newly added ppc_global_function_entry(). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-24 14:05:46 +10:00
Michael Ellerman	b7b348c682	powerpc/ftrace: Fix inverted check of create_branch() In commit `24a1bdc35`, "Fix ABIv2 issues with __ftrace_make_call", Anton changed the logic that creates and patches the branch, and added a thinko in the check of create_branch(). create_branch() returns the instruction that was generated, so if we get zero then it succeeded. The result is we can't ftrace modules: Branch out of range WARNING: at ../kernel/trace/ftrace.c:1638 ftrace failed to modify [<d000000004ba001c>] fuse_req_init_context+0x1c/0x90 [fuse] We should probably fix patch_instruction() to do that check and make the API saner, but that's a separate patch. For now just invert the test. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-24 14:05:41 +10:00
Michael Ellerman	dfc382a19a	powerpc/ftrace: Fix typo in mask of opcode In commit `24a1bdc35`, "Fix ABIv2 issues with __ftrace_make_call", Anton changed the logic that checks for the expected code sequence when patching a module. We missed the typo in the mask, 0xffff00000 should be 0xffff0000, which has the effect of making the test always true. That makes it impossible to ftrace against modules, eg: Unexpected call sequence: 48000008 e8410018 WARNING: at ../kernel/trace/ftrace.c:1638 ftrace failed to modify [<d000000007cf001c>] rng_dev_open+0x1c/0x70 [rng_core] Reported-by: David Binderman <dcb314@hotmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-24 14:05:37 +10:00
Michael Ellerman	d997c00c5a	powerpc: Add ppc_global_function_entry() ABIv2 has the concept of a global and local entry point to a function. In most cases we are interested in the local entry point, and so that is what ppc_function_entry() returns. However we have a case in the ftrace code where we want the global entry point, and there may be other places we need it too. Rather than special casing each, add an accessor. For ABIv1 and 32-bit there is only a single entry point, so we return that. That means it's safe for the caller to use this without also checking the ABI version. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-24 14:05:32 +10:00
Benjamin Herrenschmidt	716821c943	powerpc: Remove __arch_swab* The generic code uses gcc built-ins which work fine so there's no benefit in implementing our own anymore. We can't completely remove the ld/st_le* functions as some historical cruft still uses them, but that's next on the radar Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-24 12:43:15 +10:00
Michael Ellerman	bf77ee2a7a	powerpc: Remove ancient DEBUG_SIG code We have some compile-time disabled debug code in signal_xx.c. It's from some ancient time BG, almost certainly part of the original port, given the very similar code on other arches. The show_unhandled_signal logic, added in `d0c3d534a4` (2.6.24) is cleaner and prints more useful information, so drop the debug code. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-24 12:43:14 +10:00
Gavin Shan	0eb5736828	powerpc/kerenl: Enable EEH for IO accessors In arch/powerpc/kernel/iomap.c, lots of IO reading accessors missed to check EEH error as Ben pointed. The patch fixes it. For the writing accessors, we change the called functions only for making them look similar to the reading counterparts. Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-24 12:43:13 +10:00
Valentin Longchamp	db75c22f1a	powerpc/mpc85xx: fix fsl/p2041-post.dtsi clockgen mux2 The mux2 node is missing the clock-output-names field that is required by the clk-ppc-corenet driver. Signed-off-by: Valentin Longchamp <valentin.longchamp@keymile.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2014-06-20 18:48:36 -05:00
Laurentiu Tudor	7c48005036	powerpc/booke64: wrap tlb lock and search in htw miss with FTR_SMT Virtualized environments may expose a e6500 dual-threaded core as two single-threaded e6500 cores. Take advantage of this and get rid of the tlb lock and the trap-causing tlbsx in the htw miss handler by guarding with CPU_FTR_SMT, as it's already being done in the bolted tlb1 miss handler. As seen in the results below, measurements done with lmbench random memory access latency test running under Freescale's Embedded Hypervisor, there is a ~34% improvement. Memory latencies in nanoseconds - smaller is better (WARNING - may not be correct, check graphs) ---------------------------------------------------- Host Mhz L1 $ L2 $ Main mem Rand mem --------- --- ---- ---- -------- -------- smt 1665 1.8020 13.2 83.0 1149.7 nosmt 1665 1.8020 13.2 83.0 758.1 Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com> Cc: Scott Wood <scottwood@freescale.com> [scottwood@freescale.com: commit message tweak] Signed-off-by: Scott Wood <scottwood@freescale.com>	2014-06-20 18:48:34 -05:00
Chunhe Lan	db65025d8e	t4240/dts: Enable third elo3 DMA engine support T4240 has a third DMA engine controller, so add the corresponding DMA node into the dts file. Signed-off-by: Chunhe Lan <Chunhe.Lan@freescale.com> Cc: Scott Wood <scottwood@freescale.com> [scottwood@freescale.com: reword commit message] Signed-off-by: Scott Wood <scottwood@freescale.com>	2014-06-20 18:48:32 -05:00
Shengzhou Liu	a95e8c28b3	powerpc/defconfig: update RTC support - remove CONFIG_RTC_DRV_CMOS in corenet32_smp_defconfig(it's unused), reserve CONFIG_RTC_DRV_CMOS in mpc85xx_defconfig(needed on some CDS boards) - enable CONFIG_RTC_DRV_DS1307, CONFIG_RTC_DRV_DS1374, CONFIG_RTC_DRV_DS3232 in mpc85xx_defconfig, mpc85xx_smp_defconfig - enable RTC support in corenet64_smp_defconfig Signed-off-by: Shengzhou Liu <Shengzhou.Liu@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2014-06-20 18:48:32 -05:00
Scott Wood	5d1bf1e2c0	powerpc/e500mc: Fix wrong value of MCSR_L2MMU_MHIT Signed-off-by: Scott Wood <scottwood@freescale.com> Reported-by: Ed Swarthout <ed.swarthout@freescale.com>	2014-06-20 18:48:31 -05:00
Scott Wood	1cb4ed92f6	powerpc/e6500: hw tablewalk: fix recursive tlb lock on cpu 0 Commit `82d86de25b` "TLB lock recursive" introduced a bug whereby cpu 0 uses the same value for "lock held" as is used to indicate that the lock is free. This means that cpu 1 can acquire the lock whenever it wants, regardless of whether cpu 0 has it locked, which in turn means we can get duplicate TLB entries. Add one to the CPU value to ensure we do not use zero as a "lock held" value. Signed-off-by: Scott Wood <scottwood@freescale.com> Reported-by: Ed Swarthout <ed.swarthout@freescale.com>	2014-06-20 18:48:30 -05:00
Scott Wood	bbd08c72c6	powerpc/e6500: hw tablewalk: clear TID in kernel indirect entries Previously TID was being cleared before the tlbsx, but not after. This can lead to a multiway hit between a TLB entry with TID=0 (previously inserted when PID=0) and a TLB entry with TID!=0 that matches PID. This can theoretically result in undefined behavior, though we probably get lucky due to the details of the overlap. It also results in the inability to use multihit detection to detect other conflicting TLB entries, as well as poorer TLB utilization due to duplicating kernel TLB entries. Rather than try to patch up MAS1 after tlbsx, the entire value is saved/restored as with MAS2. I observed a slight improvement in TLB miss performance with this patch applied. Signed-off-by: Scott Wood <scottwood@freescale.com> Reported-by: Ed Swarthout <ed.swarthout@freescale.com>	2014-06-20 18:48:29 -05:00
Linus Torvalds	1700ff823b	Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild Pull kbuild updates from Michal Marek: "Kbuild changes for v3.16-rc1: - cross-compilation fix so that cc-option is testing the right compiler - Fix for make defconfig all - Using relative paths to the object and source directory where possible, plus fixes for the fallout of the change - several cleanups in the Makefiles and scripts The powerpc fix is from today, because it was only discovered recently. The rest has been in linux-next for some time" * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: powerpc: Avoid circular dependency with zImage.% kbuild: create include/config directory in scripts/kconfig/Makefile kbuild: do not create include/linux directory Makefile: Fix unrecognized cross-compiler command line options kbuild: do not add "selinux" to subdir- twice um: Fix for relative objtree when generating x86 headers kbuild: Use relative path when building in a subdir of the source tree kbuild: Use relative path when building in the source tree kbuild: Use relative path for $(objtree) firmware: Use $(quote) in the Makefile firmware: Simplify directory creation kbuild: trivial - fix comment block indent kbuild: trivial - remove trailing spaces kbuild: support simultaneous "make %config" and "make all" kbuild: move extra gcc checks to scripts/Makefile.extrawarn	2014-06-12 21:23:38 -07:00
Linus Torvalds	b7c8c1945c	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull more powerpc updates from Ben Herrenschmidt: "Here are the remaining bits I was mentioning earlier. Mostly bug fixes and new selftests from Michael (yay !). He also removed the WSP platform and A2 core support which were dead before release, so less clutter. One little "feature" I snuck in is the doorbell IPI support for non-virtualized P8 which speeds up IPIs significantly between threads of a core" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (34 commits) powerpc/book3s: Fix some ABIv2 issues in machine check code powerpc/book3s: Fix guest MC delivery mechanism to avoid soft lockups in guest. powerpc/book3s: Increment the mce counter during machine_check_early call. powerpc/book3s: Add stack overflow check in machine check handler. powerpc/book3s: Fix machine check handling for unhandled errors powerpc/eeh: Dump PE location code powerpc/powernv: Enable POWER8 doorbell IPIs powerpc/cpuidle: Only clear LPCR decrementer wakeup bit on fast sleep entry powerpc/powernv: Fix killed EEH event powerpc: fix typo 'CONFIG_PMAC' powerpc: fix typo 'CONFIG_PPC_CPU' powerpc/powernv: Don't escalate non-existing frozen PE powerpc/eeh: Report frozen parent PE prior to child PE powerpc/eeh: Clear frozen state for child PE powerpc/powernv: Reduce panic timeout from 180s to 10s powerpc/xmon: avoid format string leaking to printk selftests/powerpc: Add tests of PMU EBBs selftests/powerpc: Add support for skipping tests selftests/powerpc: Put the test in a separate process group selftests/powerpc: Fix instruction loop for ABIv2 (LE) ...	2014-06-12 20:11:38 -07:00
Linus Torvalds	b2e09f633a	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull more scheduler updates from Ingo Molnar: "Second round of scheduler changes: - try-to-wakeup and IPI reduction speedups, from Andy Lutomirski - continued power scheduling cleanups and refactorings, from Nicolas Pitre - misc fixes and enhancements" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/deadline: Delete extraneous extern for to_ratio() sched/idle: Optimize try-to-wake-up IPI sched/idle: Simplify wake_up_idle_cpu() sched/idle: Clear polling before descheduling the idle thread sched, trace: Add a tracepoint for IPI-less remote wakeups cpuidle: Set polling in poll_idle sched: Remove redundant assignment to "rt_rq" in update_curr_rt(...) sched: Rename capacity related flags sched: Final power vs. capacity cleanups sched: Remove remaining dubious usage of "power" sched: Let 'struct sched_group_power' care about CPU capacity sched/fair: Disambiguate existing/remaining "capacity" usage sched/fair: Change "has_capacity" to "has_free_capacity" sched/fair: Remove "power" from 'struct numa_stats' sched: Fix signedness bug in yield_to() sched/fair: Use time_after() in record_wakee() sched/balancing: Reduce the rate of needless idle load balancing sched/fair: Fix unlocked reads of some cfs_b->quota/period	2014-06-12 19:42:15 -07:00
Linus Torvalds	f9da455b93	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: 1) Seccomp BPF filters can now be JIT'd, from Alexei Starovoitov. 2) Multiqueue support in xen-netback and xen-netfront, from Andrew J Benniston. 3) Allow tweaking of aggregation settings in cdc_ncm driver, from Bjørn Mork. 4) BPF now has a "random" opcode, from Chema Gonzalez. 5) Add more BPF documentation and improve test framework, from Daniel Borkmann. 6) Support TCP fastopen over ipv6, from Daniel Lee. 7) Add software TSO helper functions and use them to support software TSO in mvneta and mv643xx_eth drivers. From Ezequiel Garcia. 8) Support software TSO in fec driver too, from Nimrod Andy. 9) Add Broadcom SYSTEMPORT driver, from Florian Fainelli. 10) Handle broadcasts more gracefully over macvlan when there are large numbers of interfaces configured, from Herbert Xu. 11) Allow more control over fwmark used for non-socket based responses, from Lorenzo Colitti. 12) Do TCP congestion window limiting based upon measurements, from Neal Cardwell. 13) Support busy polling in SCTP, from Neal Horman. 14) Allow RSS key to be configured via ethtool, from Venkata Duvvuru. 15) Bridge promisc mode handling improvements from Vlad Yasevich. 16) Don't use inetpeer entries to implement ID generation any more, it performs poorly, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits) rtnetlink: fix userspace API breakage for iproute2 < v3.9.0 tcp: fixing TLP's FIN recovery net: fec: Add software TSO support net: fec: Add Scatter/gather support net: fec: Increase buffer descriptor entry number net: fec: Factorize feature setting net: fec: Enable IP header hardware checksum net: fec: Factorize the .xmit transmit function bridge: fix compile error when compiling without IPv6 support bridge: fix smatch warning / potential null pointer dereference via-rhine: fix full-duplex with autoneg disable bnx2x: Enlarge the dorq threshold for VFs bnx2x: Check for UNDI in uncommon branch bnx2x: Fix 1G-baseT link bnx2x: Fix link for KR with swapped polarity lane sctp: Fix sk_ack_backlog wrap-around problem net/core: Add VF link state control policy net/fsl: xgmac_mdio is dependent on OF_MDIO net/fsl: Make xgmac_mdio read error message useful net_sched: drr: warn when qdisc is not work conserving ...	2014-06-12 14:27:40 -07:00
Ingo Molnar	535560d841	Merge commit '3cf2f34' into sched/core, to fix build error Fix this dependency on the locking tree's smp_mb*() API changes: kernel/sched/idle.c:247:3: error: implicit declaration of function ‘smp_mb__after_atomic’ [-Werror=implicit-function-declaration] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-06-12 13:46:37 +02:00
Michal Marek	699c659b49	powerpc: Avoid circular dependency with zImage.% The rule to create the final images uses a zImage.% pattern. Unfortunately, this also matches the names of the zImage.*.lds linker scripts, which appear as a dependency of the final images. This somehow worked when $(srctree) used to be an absolute path, but now the pattern matches too much. List only the images from $(image-y) as the target of the rule, to avoid the circular dependency. Reported-and-tested-by: Mike Qiu <qiudayu@linux.vnet.ibm.com> Signed-off-by: Michal Marek <mmarek@suse.cz>	2014-06-12 10:07:35 +02:00
Anton Blanchard	ad718622ab	powerpc/book3s: Fix some ABIv2 issues in machine check code Commit `2749a2f26a` (powerpc/book3s: Fix machine check handling for unhandled errors) introduced a few ABIv2 issues. We can maintain ABIv1 and ABIv2 compatibility by branching to the function rather than the dot symbol. Fixes: `2749a2f26a` ("powerpc/book3s: Fix machine check handling for unhandled errors") Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-12 09:41:33 +10:00
Mahesh Salgaonkar	74845bc2fa	powerpc/book3s: Fix guest MC delivery mechanism to avoid soft lockups in guest. Currently we forward MCEs to guest which have been recovered by guest. And for unhandled errors we do not deliver the MCE to guest. It looks like with no support of FWNMI in qemu, guest just panics whenever we deliver the recovered MCEs to guest. Also, the existig code used to return to host for unhandled errors which was casuing guest to hang with soft lockups inside guest and makes it difficult to recover guest instance. This patch now forwards all fatal MCEs to guest causing guest to crash/panic. And, for recovered errors we just go back to normal functioning of guest instead of returning to host. This fixes soft lockup issues in guest. This patch also fixes an issue where guest MCE events were not logged to host console. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 19:15:15 +10:00
Mahesh Salgaonkar	e6654d5b42	powerpc/book3s: Increment the mce counter during machine_check_early call. We don't see MCE counter getting increased in /proc/interrupts which gives false impression of no MCE occurred even when there were MCE events. The machine check early handling was added for PowerKVM and we missed to increment the MCE count in the early handler. We also increment mce counters in the machine_check_exception call, but in most cases where we handle the error hypervisor never reaches there unless its fatal and we want to crash. Only during fatal situation we may see double increment of mce count. We need to fix that. But for now it always good to have some count increased instead of zero. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 19:15:14 +10:00
Mahesh Salgaonkar	e75ad93afc	powerpc/book3s: Add stack overflow check in machine check handler. Currently machine check handler does not check for stack overflow for nested machine check. If we hit another MCE while inside the machine check handler repeatedly from same address then we get into risk of stack overflow which can cause huge memory corruption. This patch limits the nested MCE level to 4 and panic when we cross level 4. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 19:15:13 +10:00
Mahesh Salgaonkar	2749a2f26a	powerpc/book3s: Fix machine check handling for unhandled errors Current code does not check for unhandled/unrecovered errors and return from interrupt if it is recoverable exception which in-turn triggers same machine check exception in a loop causing hypervisor to be unresponsive. This patch fixes this situation and forces hypervisor to panic for unhandled/unrecovered errors. This patch also fixes another issue where unrecoverable_exception routine was called in real mode in case of unrecoverable exception (MSR_RI = 0). This causes another exception vector 0x300 (data access) during system crash leading to confusion while debugging cause of the system crash. Also turn ME bit off while going down, so that when another MCE is hit during panic path, system will checkstop and hypervisor will get restarted cleanly by SP. With the above fixes we now throw correct console messages (see below) while crashing the system in case of unhandled/unrecoverable machine checks. -------------- Severe Machine check interrupt [[Not recovered] Initiator: CPU Error type: UE [Instruction fetch] Effective address: 0000000030002864 Oops: Machine check, sig: 7 [#1] SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: bork(O) bridge stp llc kvm [last unloaded: bork] CPU: 36 PID: 55162 Comm: bash Tainted: G O 3.14.0mce #1 task: c000002d72d022d0 ti: c000000007ec0000 task.ti: c000002d72de4000 NIP: 0000000030002864 LR: 00000000300151a4 CTR: 000000003001518c REGS: c000000007ec3d80 TRAP: 0200 Tainted: G O (3.14.0mce) MSR: 9000000000041002 <SF,HV,ME,RI> CR: 28222848 XER: 20000000 CFAR: 0000000030002838 DAR: d0000000004d0000 DSISR: 00000000 SOFTE: 1 GPR00: 000000003001512c 0000000031f92cb0 0000000030078af0 0000000030002864 GPR04: d0000000004d0000 0000000000000000 0000000030002864 ffffffffffffffc9 GPR08: 0000000000000024 0000000030008af0 000000000000002c c00000000150e728 GPR12: 9000000000041002 0000000031f90000 0000000010142550 0000000040000000 GPR16: 0000000010143cdc 0000000000000000 00000000101306fc 00000000101424dc GPR20: 00000000101424e0 000000001013c6f0 0000000000000000 0000000000000000 GPR24: 0000000010143ce0 00000000100f6440 c000002d72de7e00 c000002d72860250 GPR28: c000002d72860240 c000002d72ac0038 0000000000000008 0000000000040000 NIP [0000000030002864] 0x30002864 LR [00000000300151a4] 0x300151a4 Call Trace: Instruction dump: XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX ---[ end trace 7285f0beac1e29d3 ]--- Sending IPI to other CPUs IPI complete OPAL V3 detected ! -------------- Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 19:15:13 +10:00
Gavin Shan	357b2f3dd9	powerpc/eeh: Dump PE location code As Ben suggested, it's meaningful to dump PE's location code for site engineers when hitting EEH errors. The patch introduces function eeh_pe_loc_get() to retireve the location code from dev-tree so that we can output it when hitting EEH errors. If primary PE bus is root bus, the PHB's dev-node would be tried prior to root port's dev-node. Otherwise, the upstream bridge's dev-node of the primary PE bus will be check for the location code directly. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 19:12:23 +10:00
Michael Neuling	d4e58e5928	powerpc/powernv: Enable POWER8 doorbell IPIs This patch enables POWER8 doorbell IPIs on powernv. Since doorbells can only IPI within a core, we test to see when we can use doorbells and if not we fall back to XICS. This also enables hypervisor doorbells to wakeup us up from nap/sleep via the LPCR PECEDH bit. Based on tests by Anton, the best case IPI latency between two threads dropped from 894ns to 512ns. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 17:05:12 +10:00
Gavin Shan	5c7a35e3e2	powerpc/powernv: Fix killed EEH event On PowerNV platform, EEH errors are reported by IO accessors or poller driven by interrupt. After the PE is isolated, we won't produce EEH event for the PE. The current implementation has possibility of EEH event lost in this way: The interrupt handler queues one "special" event, which drives the poller. EEH thread doesn't pick the special event yet. IO accessors kicks in, the frozen PE is marked as "isolated" and EEH event is queued to the list. EEH thread runs because of special event and purge all existing EEH events. However, we never produce an other EEH event for the frozen PE. Eventually, the PE is marked as "isolated" and we don't have EEH event to recover it. The patch fixes the issue to keep EEH events for PEs that have been marked as "isolated" with the help of additional "force" help to eeh_remove_event(). Reported-by: Rolf Brudeseth <rolfb@us.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 17:04:33 +10:00
Paul Bolle	6e0fdf9af2	powerpc: fix typo 'CONFIG_PMAC' Commit `b0d278b7d3` ("powerpc/perf_event: Reduce latency of calling perf_event_do_pending") added a check for CONFIG_PMAC were a check for CONFIG_PPC_PMAC was clearly intended. Fixes: `b0d278b7d3` ("powerpc/perf_event: Reduce latency of calling perf_event_do_pending") Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 17:04:29 +10:00
Paul Bolle	b69a1da94f	powerpc: fix typo 'CONFIG_PPC_CPU' Commit `cd64d1697c` ("powerpc: mtmsrd not defined") added a check for CONFIG_PPC_CPU were a check for CONFIG_PPC_FPU was clearly intended. Fixes: `cd64d1697c` ("powerpc: mtmsrd not defined") Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 17:04:25 +10:00
Gavin Shan	71b540adff	powerpc/powernv: Don't escalate non-existing frozen PE Commit `cb5b242c` ("powerpc/eeh: Escalate error on non-existing PE") escalates the frozen state on non-existing PE to fenced PHB. It was to improve kdump reliability. After that, commit `361f2a2a` ("powrpc/powernv: Reset PHB in kdump kernel") was introduced to issue complete reset on all PHBs to increase the reliability of kdump kernel. Commit `cb5b242c` becomes unuseful and it would be reverted. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 17:04:20 +10:00
Gavin Shan	1ad7a72c5e	powerpc/eeh: Report frozen parent PE prior to child PE When we have the corner case of frozen parent and child PE at the same time, we have to handle the frozen parent PE prior to the child. Without clearning the frozen state on parent PE, the child PE can't be recovered successfully. The patch searches the EEH PE hierarchy tree and returns the toppest frozen PE to be handled. It ensures the frozen parent PE will be handled prior to child PE. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 17:04:16 +10:00
Gavin Shan	2c66599206	powerpc/eeh: Clear frozen state for child PE Since commit cb523e09 ("powerpc/eeh: Avoid I/O access during PE reset"), the PE is kept as frozen state on hardware level until the PE reset is done completely. After that, we explicitly clear the frozen state of the affected PE. However, there might have frozen child PEs of the affected PE and we also need clear their frozen state as well. Otherwise, the recovery is going to fail. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 17:04:12 +10:00
Anton Blanchard	4817fc323d	powerpc/powernv: Reduce panic timeout from 180s to 10s We've already dropped the default pseries timeout to 10s, do the same for powernv. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 17:04:07 +10:00
Kees Cook	50b66dbf87	powerpc/xmon: avoid format string leaking to printk This makes sure format strings cannot leak into printk (the string has already been correctly processed for format arguments). Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 17:04:03 +10:00
Michael Ellerman	3df48c981d	powerpc/perf: Ensure all EBB register state is cleared on fork() In commit `330a1eb` "Core EBB support for 64-bit book3s" I messed up clear_task_ebb(). It clears some but not all of the task's Event Based Branch (EBB) registers when we duplicate a task struct. That allows a child task to observe the EBBHR & EBBRR of its parent, which it should not be able to do. Fix it by clearing EBBHR & EBBRR. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Cc: stable@vger.kernel.org [v3.11+] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 17:03:41 +10:00
Joel Stanley	caf69ba627	powerpc/powernv: Fix reading of OPAL msglog memory_return_from_buffer returns a signed value, so ret should be ssize_t. Fixes the following issue reported by David Binderman: [linux-3.15/arch/powerpc/platforms/powernv/opal-msglog.c:65]: (style) Checking if unsigned variable 'ret' is less than zero. [linux-3.15/arch/powerpc/platforms/powernv/opal-msglog.c:82]: (style) Checking if unsigned variable 'ret' is less than zero. Local variable "ret" is of type size_t. This is always unsigned, so it is pointless to check if it is less than zero. https://bugzilla.kernel.org/show_bug.cgi?id=77551 Fixing this exposes a real bug for the case where the entire count bytes is successfully read from the POS_WRAP case. The second memory_read_from_buffer will return EINVAL, causing the entire read to return EINVAL to userspace, despite the data being copied correctly. The fix is to test for the case where the data has been read and return early. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 17:03:36 +10:00
Dan Carpenter	be8f9642a0	powerpc/spufs: Remove duplicate SPUFS_CNTL_MAP_SIZE define The SPUFS_CNTL_MAP_SIZE define is cut and pasted twice so we can delete the second instance. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Jeremy Kerr <jk@ozlabs.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 17:03:32 +10:00
Dan Carpenter	d9d8212319	powerpc/cpm: Remove duplicate FCC_GFMR_TTX define The FCC_GFMR_TTX define is cut and pasted twice so we can remove the second instance. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 17:03:28 +10:00
Guo Chao	ddf0322a3f	powerpc/powernv: Fix endianness problems in EEH EEH information fetched from OPAL need fix before using in LE environment. To be included in sparse's endian check, declare them as __beXX and access them by accessors. Cc: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 17:03:23 +10:00
Anton Blanchard	1bd098903f	powernv: Fix permissions on sysparam sysfs entries Everyone can write to these files, which is not what we want. Cc: stable@vger.kernel.org # 3.15 Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 17:03:15 +10:00
Shreyas B. Prabhu	ad41733062	powerpc/powernv : Disable subcore for UP configs Build throws following errors when CONFIG_SMP=n arch/powerpc/platforms/powernv/subcore.c: In function ‘cpu_update_split_mode’: arch/powerpc/platforms/powernv/subcore.c:274:15: error: ‘setup_max_cpus’ undeclared (first use in this function) arch/powerpc/platforms/powernv/subcore.c:285:5: error: lvalue required as left operand of assignment 'setup_max_cpus' variable is relevant only on SMP, so there is no point working around it for UP. Furthermore, subcore itself is relevant only on SMP and hence the better solution is to exclude subcore.o and subcore-asm.o for UP builds. Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 17:03:10 +10:00
Shreyas B. Prabhu	b2a8087869	powerpc/powernv: Include asm/smp.h to fix UP build failure Build throws following errors when CONFIG_SMP=n arch/powerpc/platforms/powernv/setup.c: In function ‘pnv_kexec_wait_secondaries_down’: arch/powerpc/platforms/powernv/setup.c:179:4: error: implicit declaration of function ‘get_hard_smp_processor_id’ rc = opal_query_cpu_status(get_hard_smp_processor_id(i), The usage of get_hard_smp_processor_id() needs the declaration from <asm/smp.h>. The file setup.c includes <linux/sched.h>, which in-turn includes <linux/smp.h>. However, <linux/smp.h> includes <asm/smp.h> only on SMP configs and hence UP builds fail. Fix this by directly including <asm/smp.h> in setup.c unconditionally. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Shreyas B. Prabhu <shreyas@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 17:03:06 +10:00
Michael Neuling	59a53afe70	powerpc: Don't setup CPUs with bad status OPAL will mark a CPU that is guarded as "bad" in the status property of the CPU node. Unfortunatley Linux doesn't check this property and will put the bad CPU in the present map. This has caused hangs on booting when we try to unsplit the core. This patch checks the CPU is avaliable via this status property before putting it in the present map. Signed-off-by: Michael Neuling <mikey@neuling.org> Tested-by: Anton Blanchard <anton@samba.org> cc: stable@vger.kernel.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 17:03:01 +10:00
Sam bobroff	96d0161086	powerpc: Correct DSCR during TM context switch Correct the DSCR SPR becoming temporarily corrupted if a task is context switched during a transaction. The problem occurs while suspending the task and is caused by saving the DSCR to thread.dscr after it has already been set to the CPU's default value: __switch_to() calls __switch_to_tm() which calls tm_reclaim_task() which calls tm_reclaim_thread() which calls tm_reclaim() where the DSCR is set to the CPU's default __switch_to() calls _switch() where thread.dscr is set to the DSCR When the task is resumed, it's transaction will be doomed (as usual) and the DSCR SPR will be corrupted, although the checkpointed value will be correct. Therefore the DSCR will be immediately corrected by the transaction aborting, unless it has been suspended. In that case the incorrect value can be seen by the task until it resumes the transaction. The fix is to treat the DSCR similarly to the TAR and save it early in __switch_to(). A program exposing the problem is added to the kernel self tests as: tools/testing/selftests/powerpc/tm/tm-resched-dscr. Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com> CC: <stable@vger.kernel.org> [v3.10+] Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 17:02:56 +10:00
Michael Ellerman	fb5a515704	powerpc: Remove platforms/wsp and associated pieces __attribute__ ((unused)) WSP is the last user of CONFIG_PPC_A2, so we remove that as well. Although CONFIG_PPC_ICSWX still exists, it's no longer selectable for any Book3E platform, so we can remove the code in mmu-book3e.h that depended on it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 16:35:38 +10:00
Paul Bolle	94314290ed	powerpc: Remove check for CONFIG_SERIAL_TEXT_DEBUG The Kconfig symbol SERIAL_TEXT_DEBUG was removed from arch/powerpc/Kconfig.debug in v2.6.22. (In v2.6.27 it was also removed from arch/ppc/Kconfig.debug.) So the check for its macro has evaluated to false for over five years now. Remove that check and the few lines of code hidden behind it. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-11 16:31:21 +10:00
Benjamin Herrenschmidt	dd58a092c4	powerpc: Add AT_HWCAP2 to indicate V.CRYPTO category support The Vector Crypto category instructions are supported by current POWER8 chips, advertise them to userspace using a specific bit to properly differentiate with chips of the same architecture level that might not have them. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: <stable@vger.kernel.org> [v3.10+]	2014-06-11 16:31:20 +10:00
Linus Torvalds	dfb945473a	Merge git://www.linux-watchdog.org/linux-watchdog Pull watchdog updates from Wim Van Sebroeck: "This contains: - addition of the Intel MID watchdog - removal of W83697HF and W83697UG drivers (code was merged into w83627hf_wdt driver) - addition of Armada 375/380 SoC support - conversion of imx2_wdt to regmap API and to watchdog core API - lots of other small improvements and fixes" [ Wim was also tagged by gmail as a spammer, but not delayed by days unlike Ben ] * git://www.linux-watchdog.org/linux-watchdog: (25 commits) x86: intel-mid: add watchdog platform code for Merrifield watchdog: add Intel MID watchdog driver support watchdog: sp805: Set watchdog_device->timeout from ->set_timeout() booke/watchdog: refine and clean up the codes watchdog: iop_wdt only builds for mach-iop13xx watchdog: Remove drivers for W83697HF and W83697UG watchdog: w83627hf_wdt: Add early_disable module parameter ARM: mvebu: Add A375/A380 watchdog binding documentation watchdog: orion: Add Armada 375/380 SoC support watchdog: orion: Introduce per-SoC enabled() function watchdog: orion: Introduce per-SoC stop() function watchdog: orion: Remove unneeded atomic access watchdog: orion: Introduce a SoC-specific RSTOUT mapping watchdog: orion: Move the register ioremap'ing to its own function watchdog: xilinx: Make of_device_id array const watchdog: imx2_wdt: convert to watchdog core api watchdog: imx2_wdt: convert to use regmap API. watchdog: imx2_wdt: Sort the header files alphabetically watchdog: ath79_wdt: switch to clk_prepare/clk_disable watchdog: ath79_wdt: avoid spurious restarts on AR934x ...	2014-06-10 19:16:36 -07:00
Linus Torvalds	c5aec4c76a	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc updates from Ben Herrenschmidt: "Here is the bulk of the powerpc changes for this merge window. It got a bit delayed in part because I wasn't paying attention, and in part because I discovered I had a core PCI change without a PCI maintainer ack in it. Bjorn eventually agreed it was ok to merge it though we'll probably improve it later and I didn't want to rebase to add his ack. There is going to be a bit more next week, essentially fixes that I still want to sort through and test. The biggest item this time is the support to build the ppc64 LE kernel with our new v2 ABI. We previously supported v2 userspace but the kernel itself was a tougher nut to crack. This is now sorted mostly thanks to Anton and Rusty. We also have a fairly big series from Cedric that add support for 64-bit LE zImage boot wrapper. This was made harder by the fact that traditionally our zImage wrapper was always 32-bit, but our new LE toolchains don't really support 32-bit anymore (it's somewhat there but not really "supported") so we didn't want to rely on it. This meant more churn that just endian fixes. This brings some more LE bits as well, such as the ability to run in LE mode without a hypervisor (ie. under OPAL firmware) by doing the right OPAL call to reinitialize the CPU to take HV interrupts in the right mode and the usual pile of endian fixes. There's another series from Gavin adding EEH improvements (one day we will have a release with less than 20 EEH patches, I promise!). Another highlight is the support for the "Split core" functionality on P8 by Michael. This allows a P8 core to be split into "sub cores" of 4 threads which allows the subcores to run different guests under KVM (the HW still doesn't support a partition per thread). And then the usual misc bits and fixes ..." [ Further delayed by gmail deciding that BenH is a dirty spammer. Google knows. ] * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (155 commits) powerpc/powernv: Add missing include to LPC code selftests/powerpc: Test the THP bug we fixed in the previous commit powerpc/mm: Check paca psize is up to date for huge mappings powerpc/powernv: Pass buffer size to OPAL validate flash call powerpc/pseries: hcall functions are exported to modules, need _GLOBAL_TOC() powerpc: Exported functions __clear_user and copy_page use r2 so need _GLOBAL_TOC() powerpc/powernv: Set memory_block_size_bytes to 256MB powerpc: Allow ppc_md platform hook to override memory_block_size_bytes powerpc/powernv: Fix endian issues in memory error handling code powerpc/eeh: Skip eeh sysfs when eeh is disabled powerpc: 64bit sendfile is capped at 2GB powerpc/powernv: Provide debugfs access to the LPC bus via OPAL powerpc/serial: Use saner flags when creating legacy ports powerpc: Add cpu family documentation powerpc/xmon: Fix up xmon format strings powerpc/powernv: Add calls to support little endian host powerpc: Document sysfs DSCR interface powerpc: Fix regression of per-CPU DSCR setting powerpc: Split __SYSFS_SPRSETUP macro arch: powerpc/fadump: Cleaning up inconsistent NULL checks ...	2014-06-10 18:54:22 -07:00
Tang Yuantian	d2deebabae	booke/watchdog: refine and clean up the codes Basically, this patch does the following: 1. Move the codes of parsing boot parameters from setup-common.c to driver. In this way, code reader can know directly that there are boot parameters that can change the timeout. 2. Make boot parameter 'booke_wdt_period' effective. currently, when driver is loaded, default timeout is always being used in stead of booke_wdt_period. 3. Wrap up the watchdog timeout in device struct and clean up unnecessary codes. Signed-off-by: Tang Yuantian <yuantian.tang@freescale.com> Acked-by: Scott Wood <scottwood@freescale.com> Reviewed-by: Li Yang <leoli@freescale.com> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>	2014-06-10 21:47:26 +02:00
Linus Torvalds	77c32bbbe0	Merge branch 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma Pull slave-dmaengine updates from Vinod Koul: - new Xilixn VDMA driver from Srikanth - bunch of updates for edma driver by Thomas, Joel and Peter - fixes and updates on dw, ste_dma, freescale, mpc512x, sudmac etc * 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma: (45 commits) dmaengine: sh: don't use dynamic static allocation dmaengine: sh: fix print specifier warnings dmaengine: sh: make shdma_prep_dma_cyclic static dmaengine: Kconfig: Update MXS_DMA help text to include MX6Q/MX6DL of: dma: Grammar s/requests/request/, s/used required/required/ dmaengine: shdma: Enable driver compilation with COMPILE_TEST dmaengine: rcar-hpbdma: Include linux/err.h dmaengine: sudmac: Include linux/err.h dmaengine: sudmac: Keep #include sorted alphabetically dmaengine: shdmac: Include linux/err.h dmaengine: shdmac: Keep #include sorted alphabetically dmaengine: s3c24xx-dma: Add cyclic transfer support dmaengine: s3c24xx-dma: Process whole SG chain dmaengine: imx: correct sdmac->status for cyclic dma tx dmaengine: pch: fix compilation for alpha target dmaengine: dw: check return code of dma_async_device_register() dmaengine: dw: fix regression in dw_probe() function dmaengine: dw: enable clock before access dma: pch_dma: Fix Kconfig dependencies dmaengine: mpc512x: add support for peripheral transfers ...	2014-06-10 10:28:45 -07:00
Geert Uytterhoeven	0d2b7ea928	powerpc: update comments for generic idle conversion As of commit `799fef0612` ("powerpc: Use generic idle loop"), this applies to arch_cpu_idle() instead of cpu_idle(). Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-06-06 16:08:18 -07:00
Benjamin Herrenschmidt	0c0a3e5a10	powerpc/powernv: Add missing include to LPC code kbuild bot spotted that one: arch/powerpc/platforms/powernv/opal-lpc.c: In function 'opal_lpc_init_debugfs': >> arch/powerpc/platforms/powernv/opal-lpc.c:319:35: error: 'powerpc_debugfs_root' undeclared (first use in this function) root = debugfs_create_dir("lpc", powerpc_debugfs_root); ^ We neet to include the definition explicitely. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-07 08:57:21 +10:00
Michael Ellerman	09567e7fd4	powerpc/mm: Check paca psize is up to date for huge mappings We have a bug in our hugepage handling which exhibits as an infinite loop of hash faults. If the fault is being taken in the kernel it will typically trigger the softlockup detector, or the RCU stall detector. The bug is as follows: 1. mmap(0xa0000000, ..., MAP_FIXED \| MAP_HUGE_TLB \| MAP_ANONYMOUS ..) 2. Slice code converts the slice psize to 16M. 3. The code on lines 539-540 of slice.c in slice_get_unmapped_area() synchronises the mm->context with the paca->context. So the paca slice mask is updated to include the 16M slice. 3. Either: * mmap() fails because there are no huge pages available. * mmap() succeeds and the mapping is then munmapped. In both cases the slice psize remains at 16M in both the paca & mm. 4. mmap(0xa0000000, ..., MAP_FIXED \| MAP_ANONYMOUS ..) 5. The slice psize is converted back to 64K. Because of the check on line 539 of slice.c we DO NOT update the paca->context. The paca slice mask is now out of sync with the mm slice mask. 6. User/kernel accesses 0xa0000000. 7. The SLB miss handler slb_allocate_realmode() uses the paca slice mask to create an SLB entry and inserts it in the SLB. 18. With the 16M SLB entry in place the hardware does a hash lookup, no entry is found so a data access exception is generated. 19. The data access handler calls do_page_fault() -> handle_mm_fault(). 10. __handle_mm_fault() creates a THP mapping with do_huge_pmd_anonymous_page(). 11. The hardware retries the access, there is still nothing in the hash table so once again a data access exception is generated. 12. hash_page() calls into __hash_page_thp() and inserts a mapping in the hash. Although the THP mapping maps 16M the hashing is done using 64K as the segment page size. 13. hash_page() returns immediately after calling __hash_page_thp(), skipping over the code at line 1125. Resulting in the mismatch between the paca->context and mm->context not being detected. 14. The hardware retries the access, the hash it generates using the 16M SLB entry does NOT match the hash we inserted. 15. We take another data access and go into __hash_page_thp(). 16. We see a valid entry in the hpte_slot_array and so we call updatepp() which succeeds. 17. Goto 14. We could fix this in two ways. The first would be to remove or modify the check on line 539 of slice.c. The second option is to cause the check of paca psize in hash_page() on line 1125 to also be done for THP pages. We prefer the latter, because the check & update of the paca psize is not done until we know it's necessary. It's also done only on the current cpu, so we don't need to IPI all other cpus. Without further rearranging the code, the simplest fix is to pull out the code that checks paca psize and call it in two places. Firstly for THP/hugetlb, and secondly for other mappings as before. Thanks to Dave Jones for trinity, which originally found this bug. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: stable@vger.kernel.org [v3.11+]	2014-06-06 13:54:26 +10:00
Nicolas Pitre	5d4dfddd4f	sched: Rename capacity related flags It is better not to think about compute capacity as being equivalent to "CPU power". The upcoming "power aware" scheduler work may create confusion with the notion of energy consumption if "power" is used too liberally. Let's rename the following feature flags since they do relate to capacity: SD_SHARE_CPUPOWER -> SD_SHARE_CPUCAPACITY ARCH_POWER -> ARCH_CAPACITY NONTASK_POWER -> NONTASK_CAPACITY Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: linaro-kernel@lists.linaro.org Cc: Andy Fleming <afleming@freescale.com> Cc: Anton Blanchard <anton@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Grant Likely <grant.likely@linaro.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com> Cc: Rob Herring <robh+dt@kernel.org> Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: devicetree@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/n/tip-e93lpnxb87owfievqatey6b5@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-06-05 11:52:32 +02:00
Vasant Hegde	8b8f7bf4c2	powerpc/powernv: Pass buffer size to OPAL validate flash call We pass actual buffer size to opal_validate_flash() OPAL API call and in return it contains output buffer size. Commit `cc146d1d` (Fix little endian issues) missed to set the size param before making OPAL call. So firmware image validation fails. This patch sets size variable before making OPAL call. Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Tested-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-05 14:54:04 +10:00
Anton Blanchard	c1931e2181	powerpc/pseries: hcall functions are exported to modules, need _GLOBAL_TOC() The hcall macros may call out to c code for tracing, so we need to set up a valid r2. This fixes an oops found when testing ibmvscsi as a module. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-05 13:21:28 +10:00
Anton Blanchard	2ac7b0166a	powerpc: Exported functions __clear_user and copy_page use r2 so need _GLOBAL_TOC() __clear_user and copy_page load from the TOC and are also exported to modules. This means we have to use _GLOBAL_TOC() so that we create the global entry point that sets up the TOC. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-05 13:20:41 +10:00
Anton Blanchard	6d97d7a28f	powerpc/powernv: Set memory_block_size_bytes to 256MB powerpc sets a low SECTION_SIZE_BITS to accomodate small pseries boxes. We default to 16MB memory blocks, and boxes with a lot of memory end up with enormous numbers of sysfs memory nodes. Set a more reasonable default for powernv of 256MB. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-05 13:20:40 +10:00
Anton Blanchard	a5d862576a	powerpc: Allow ppc_md platform hook to override memory_block_size_bytes The pseries platform code unconditionally overrides memory_block_size_bytes regardless of the running platform. Create a ppc_md hook that so each platform can choose to do what it wants. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-05 13:20:39 +10:00
Anton Blanchard	223ca9d855	powerpc/powernv: Fix endian issues in memory error handling code struct OpalMemoryErrorData is passed to us from firmware, so we have to byteswap it. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-05 13:20:39 +10:00
Wei Yang	2213fb142f	powerpc/eeh: Skip eeh sysfs when eeh is disabled When eeh is not enabled, and hotplug two pci devices on the same bus, eeh related sysfs would be added twice for the first added pci device. Since the eeh_dev is not created when eeh is not enabled. This patch adds the check, if eeh is not enabled, eeh sysfs will not be created. After applying this patch, following warnings are reduced: sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:00.0/eeh_mode' sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:00.0/eeh_config_addr' sysfs: cannot create duplicate filename '/devices/pci0000:00/0000:00:00.0/eeh_pe_config_addr' Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-05 13:20:38 +10:00
Anton Blanchard	5d73320a96	powerpc: 64bit sendfile is capped at 2GB commit `8f9c0119d7` (compat: fs: Generic compat_sys_sendfile implementation) changed the PowerPC 64bit sendfile call from sys_sendile64 to sys_sendfile. Unfortunately this broke sendfile of lengths greater than 2G because sys_sendfile caps at MAX_NON_LFS. Restore what we had previously which fixes the bug. Cc: stable@vger.kernel.org Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-05 13:20:38 +10:00
Benjamin Herrenschmidt	fa2dbe2e0f	powerpc/powernv: Provide debugfs access to the LPC bus via OPAL This provides debugfs files to access the LPC bus on Power8 non-virtualized using the appropriate OPAL firmware calls. The usage is simple: one file per space (IO, MEM and FW), lseek to the address and read/write the data. IO and MEM always generate series of byte accesses. FW can generate word and dword accesses if aligned properly. Based on an original patch from Rob Lippert and reworked. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-05 13:20:37 +10:00
Benjamin Herrenschmidt	c4cad90f9e	powerpc/serial: Use saner flags when creating legacy ports We had a mix & match of flags used when creating legacy ports depending on where we found them in the device-tree. Among others we were missing UPF_SKIP_TEST for some kind of ISA ports which is a problem as quite a few UARTs out there don't support the loopback test (such as a lot of BMCs). Let's pick the set of flags used by the SoC code and generalize it which means autoconf, no loopback test, irq maybe shared and fixed port. Sending to stable as the lack of UPF_SKIP_TEST is breaking serial on some machines so I want this back into distros Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> CC: stable@vger.kernel.org	2014-06-05 13:20:36 +10:00
Michael Ellerman	736256e4f1	powerpc/xmon: Fix up xmon format strings There are a couple of places where xmon is using %x to print values that are unsigned long. I found this out the hard way recently: 0:mon> p c000000000d0e7c8 c00000033dc90000 00000000a0000089 c000000000000000 return value is 0x96300500 Which is calling find_linux_pte_or_hugepte(), the result should be a kernel pointer. After decoding the page tables by hand I discovered the correct value was c000000396300500. So fix up that case and a few others. We also use a mix of 0x%x, %x and %u to print cpu numbers. So standardise on 0x%x. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-05 13:20:00 +10:00
Benjamin Herrenschmidt	4926616c77	powerpc/powernv: Add calls to support little endian host When running as a powernv "host" system on P8, we need to switch the endianness of interrupt handlers. This does it via the appropriate call to the OPAL firmware which may result in just switching HID0:HILE but depending on the processor version might need to do a few more things. This call must be done early before any other processor has been brought out of firmware. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-06-05 13:19:59 +10:00
Fabian Frederick	f6187769da	sys_sgetmask/sys_ssetmask: add CONFIG_SGETMASK_SYSCALL sys_sgetmask and sys_ssetmask are obsolete system calls no longer supported in libc. This patch replaces architecture related __ARCH_WANT_SYS_SGETMAX by expert mode configuration.That option is enabled by default for those architectures. Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Steven Miao <realmz6@gmail.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Greg Ungerer <gerg@uclinux.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-06-04 16:54:14 -07:00
Mel Gorman	4f9b16a647	mm: disable zone_reclaim_mode by default When it was introduced, zone_reclaim_mode made sense as NUMA distances punished and workloads were generally partitioned to fit into a NUMA node. NUMA machines are now common but few of the workloads are NUMA-aware and it's routine to see major performance degradation due to zone_reclaim_mode being enabled but relatively few can identify the problem. Those that require zone_reclaim_mode are likely to be able to detect when it needs to be enabled and tune appropriately so lets have a sensible default for the bulk of users. This patch (of 2): zone_reclaim_mode causes processes to prefer reclaiming memory from local node instead of spilling over to other nodes. This made sense initially when NUMA machines were almost exclusively HPC and the workload was partitioned into nodes. The NUMA penalties were sufficiently high to justify reclaiming the memory. On current machines and workloads it is often the case that zone_reclaim_mode destroys performance but not all users know how to detect this. Favour the common case and disable it by default. Users that are sophisticated enough to know they need zone_reclaim_mode will detect it. Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Acked-by: Michal Hocko <mhocko@suse.cz> Reviewed-by: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-06-04 16:53:59 -07:00

... 2 3 4 5 6 ...

12843 Commits