linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-12-22 18:44:44 +08:00

Author	SHA1	Message	Date
Gavin Shan	05b1721d9f	powerpc/eeh: Refactor EEH flag accessors There are multiple global EEH flags. Almost each flag has its own accessor, which doesn't make sense. The patch refactors EEH flag accessors so that they look unified: eeh_add_flag(): Add EEH flag eeh_clear_flag(): Clear EEH flag eeh_has_flag(): Check if one specific flag has been set eeh_enabled(): Check if EEH functionality has been enabled Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-08-05 15:41:21 +10:00
Gavin Shan	a3032ca9f8	powerpc/eeh: Fetch IOMMU table in reliable way Function eeh_iommu_group_to_pe() iterates each PCI device to check the binding IOMMU group with get_iommu_table_base(), which possibly fetches pdev->dev.archdata.dma_data.dma_offset. It's (0x1 << 59) for "bypass" cases. The patch fixes the issue by iterating devices hooked to the IOMMU group and fetch IOMMU table there. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-08-05 15:41:16 +10:00
Gavin Shan	dff4a39e88	powerpc/powernv: Fix IOMMU table for VFIO dev On PHB3, PCI devices can bypass IOMMU for DMA access. If we pass through one PCI device, whose hose driver ever enable the bypass mode, pdev->dev.archdata.dma_data.iommu_table_base isn't IOMMU table. However, EEH needs access the IOMMU table when the device is owned by guest. The patch fixes pdev->dev.archdata.dma_data.iommu_table when passing through the device to guest in pnv_pci_ioda2_set_bypass(). Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-08-05 15:41:12 +10:00
Mike Qiu	9e5c6e5a3b	powerpc/eeh: Wrong place to call pci_get_slot() pci_get_slot() is called with hold of PCI bus semaphore and it's not safe to be called in interrupt context. However, we possibly checks EEH error and calls the function in interrupt context. To avoid using pci_get_slot(), we turn into device tree for fetching location code. Otherwise, we might run into WARN_ON() as following messages indicate: WARNING: at drivers/pci/search.c:223 CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-rc3+ #72 task: c000000001367af0 ti: c000000001444000 task.ti: c000000001444000 NIP: c000000000497b70 LR: c000000000037530 CTR: 000000003003d114 REGS: c000000001446fa0 TRAP: 0700 Not tainted (3.16.0-rc3+) MSR: 9000000000029032 <SF,HV,EE,ME,IR,DR,RI> CR: 48002422 XER: 20000000 CFAR: c00000000003752c SOFTE: 0 : NIP [c000000000497b70] .pci_get_slot+0x40/0x110 LR [c000000000037530] .eeh_pe_loc_get+0x150/0x190 Call Trace: .of_get_property+0x30/0x60 (unreliable) .eeh_pe_loc_get+0x150/0x190 .eeh_dev_check_failure+0x1b4/0x550 .eeh_check_failure+0x90/0xf0 .lpfc_sli_check_eratt+0x504/0x7c0 [lpfc] .lpfc_poll_eratt+0x64/0x100 [lpfc] .call_timer_fn+0x64/0x190 .run_timer_softirq+0x2cc/0x3e0 Cc: stable@vger.kernel.org Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-08-05 15:41:07 +10:00
Lucas Tanure	c03719ef23	powerpc: Fix wrong defintion in boot/io.h Fix wrong __IO_H definition in boot/io.h Reported-by: Fernando Silveira <fsilveira@gmail.com> Signed-off-by: Lucas Tanure <tanure@linux.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-08-05 15:41:03 +10:00
Tyrel Datwyler	7340056567	powerpc/pci: Reorder pci bus/bridge unregistration during PHB removal Commit `bcdde7e` made __sysfs_remove_dir() recursive and introduced a BUG_ON during PHB removal while attempting to delete the power managment attribute group of the bus. This is a result of tearing the bridge and bus devices down out of order in remove_phb_dynamic. Since, the the bus resides below the bridge in the sysfs device tree it should be torn down first. This patch simply moves the device_unregister call for the PHB bridge device after the device_unregister call for the PHB bus. Fixes: `bcdde7e221` ("sysfs: make __sysfs_remove_dir() recursive") Cc: stable@vger.kernel.org Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-08-05 15:40:58 +10:00
Brian W Hart	a32305bf90	powerpc/powernv: Update dev->dma_mask in pci_set_dma_mask() path powerpc defines various machine-specific routines for handling pci_set_dma_mask(). The routines for machine "PowerNV" may neglect to set dev->dma_mask. This could confuse anyone (e.g. drivers) that consult dev->dma_mask to find the current mask. Set the dma_mask in the PowerNV leaf routine. Signed-off-by: Brian W. Hart <hartb@linux.vnet.ibm.com> CC: <stable@vger.kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-08-05 15:40:54 +10:00
Scott Wood	7d17622159	powerpc/64e: Add __ref to early_alloc_pgtable() This silences a section mismatch warning. early_alloc_pgtable() is called from map_kernel_page() which cannot be __init, but only when slab_is_available() returns false which can only happen during early boot. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-08-05 15:40:49 +10:00
Andrey Utkin	b00fc6ec1f	powerpc/mm/numa: Fix break placement Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=81631 Reported-by: David Binderman <dcb314@hotmail.com> Signed-off-by: Andrey Utkin <andrey.krieger.utkin@gmail.com> CC: <stable@vger.kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-08-05 15:39:50 +10:00
Mike Qiu	dadcd6d6e7	powerpc/eeh: sysfs entries lost The sysfs entries are lost because of commit `2213fb1` ("powerpc/eeh: Skip eeh sysfs when eeh is disabled"). That commit added condition to create sysfs entries with EEH_ENABLED, which isn't populated when trying to create sysfs entries on PowerNV platform during system boot time. The patch fixes the issue by: * Reoder EEH initialization functions so that they're same on PowerNV/pSeries. * Cache PE's primary bus by PowerNV platform instead of EEH core to avoid kernel crash caused by the function reorder. Another benefit with this is to avoid one eeh_probe_mode_dev() in EEH core. Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-08-05 15:28:49 +10:00
Gavin Shan	212d16cdca	powerpc/eeh: EEH support for VFIO PCI device The patch exports functions to be used by new VFIO ioctl command, which will be introduced in subsequent patch, to support EEH functinality for VFIO PCI devices. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Alexander Graf <agraf@suse.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-08-05 15:28:48 +10:00
Gavin Shan	05ec424e38	powerpc/eeh: Avoid event on passed PE We must not handle EEH error on devices which are passed to somebody else. Instead, we expect that the frozen device owner detects an EEH error and recovers from it. This avoids EEH error handling on passed through devices so the device owner gets a chance to handle them. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Alexander Graf <agraf@suse.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-08-05 15:28:47 +10:00
Benjamin Herrenschmidt	9287b95ec9	Merge remote-tracking branch 'scott/next' into next Scott writes: Highlights include e6500 hardware threading support, an e6500 TLB erratum workaround, corenet error reporting, support for a new board, and some minor fixes.	2014-08-05 14:13:41 +10:00
Linus Torvalds	53ee983378	Staging driver patches for 3.17-rc1 Here's the big pull request for the staging driver tree for 3.17-rc1. Lots of things in here, over 2000 patches, but the best part is this: 1480 files changed, 39070 insertions(+), 254659 deletions(-) Thanks to the great work of Kristina Martšenko, 14 different staging drivers have been removed from the tree as they were obsolete and no one was willing to work on cleaning them up. Other than the driver removals, loads of cleanups are in here (comedi, lustre, etc.) as well as the usual IIO driver updates and additions. All of this has been in the linux-next tree for a while. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEABECAAYFAlPf1wYACgkQMUfUDdst+ykrNwCgswPkRSAPQ3C8WvLhzUYRZZ/L AqEAoJP0Q8Fz8unXjlSMcx7pgcqUaJ8G =mrTQ -----END PGP SIGNATURE----- Merge tag 'staging-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging driver updates from Greg KH: "Here's the big pull request for the staging driver tree for 3.17-rc1. Lots of things in here, over 2000 patches, but the best part is this: 1480 files changed, 39070 insertions(+), 254659 deletions(-) Thanks to the great work of Kristina Martšenko, 14 different staging drivers have been removed from the tree as they were obsolete and no one was willing to work on cleaning them up. Other than the driver removals, loads of cleanups are in here (comedi, lustre, etc.) as well as the usual IIO driver updates and additions. All of this has been in the linux-next tree for a while" * tag 'staging-3.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (2199 commits) staging: comedi: addi_apci_1564: remove diagnostic interrupt support code staging: comedi: addi_apci_1564: add subdevice to check diagnostic status staging: wlan-ng: coding style problem fix staging: wlan-ng: fixing coding style problems staging: comedi: ii_pci20kc: request and ioremap memory staging: lustre: bitwise vs logical typo staging: dgnc: Remove unneeded dgnc_trace.c and dgnc_trace.h staging: dgnc: rephrase comment staging: comedi: ni_tio: remove some dead code staging: rtl8723au: Fix static symbol sparse warning staging: rtl8723au: usb_dvobj_init(): Remove unused variable 'pdev_desc' staging: rtl8723au: Do not duplicate kernel provided USB macros staging: rtl8723au: Remove never set struct pwrctrl_priv.bHWPowerdown staging: rtl8723au: Remove two never set variables staging: rtl8723au: RSSI_test is never set staging:r8190: coding style: Fixed checkpatch reported Error staging:r8180: coding style: Fixed too long lines staging:r8180: coding style: Fixed commenting style staging: lustre: ptlrpc: lproc_ptlrpc.c - fix dereferenceing user space buffer staging: lustre: ldlm: ldlm_resource.c - fix dereferenceing user space buffer ...	2014-08-04 18:36:12 -07:00
Linus Torvalds	ef35ad26f8	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf changes from Ingo Molnar: "Kernel side changes: - Consolidate the PMU interrupt-disabled code amongst architectures (Vince Weaver) - misc fixes Tooling changes (new features, user visible changes): - Add support for pagefault tracing in 'trace', please see multiple examples in the changeset messages (Stanislav Fomichev). - Add pagefault statistics in 'trace' (Stanislav Fomichev) - Add header for columns in 'top' and 'report' TUI browsers (Jiri Olsa) - Add pagefault statistics in 'trace' (Stanislav Fomichev) - Add IO mode into timechart command (Stanislav Fomichev) - Fallback to syscalls:* when raw_syscalls:* is not available in the perl and python perf scripts. (Daniel Bristot de Oliveira) - Add --repeat global option to 'perf bench' to be used in benchmarks such as the existing 'futex' one, that was modified to use it instead of a local option. (Davidlohr Bueso) - Fix fd -> pathname resolution in 'trace', be it using /proc or a vfs_getname probe point. (Arnaldo Carvalho de Melo) - Add suggestion of how to set perf_event_paranoid sysctl, to help non-root users trying tools like 'trace' to get a working environment. (Arnaldo Carvalho de Melo) - Updates from trace-cmd for traceevent plugin_kvm plus args cleanup (Steven Rostedt, Jan Kiszka) - Support S/390 in 'perf kvm stat' (Alexander Yarygin) Tooling infrastructure changes: - Allow reserving a row for header purposes in the hists browser (Arnaldo Carvalho de Melo) - Various fixes and prep work related to supporting Intel PT (Adrian Hunter) - Introduce multiple debug variables control (Jiri Olsa) - Add callchain and additional sample information for python scripts (Joseph Schuchart) - More prep work to support Intel PT: (Adrian Hunter) - Polishing 'script' BTS output - 'inject' can specify --kallsym - VDSO is per machine, not a global var - Expose data addr lookup functions previously private to 'script' - Large mmap fixes in events processing - Include standard stringify macros in power pc code (Sukadev Bhattiprolu) Tooling cleanups: - Convert open coded equivalents to asprintf() (Andy Shevchenko) - Remove needless reassignments in 'trace' (Arnaldo Carvalho de Melo) - Cache the is_exit syscall test in 'trace) (Arnaldo Carvalho de Melo) - No need to reimplement err() in 'perf bench sched-messaging', drop barf(). (Davidlohr Bueso). - Remove ev_name argument from perf_evsel__hists_browse, can be obtained from the other parameters. (Jiri Olsa) Tooling fixes: - Fix memory leak in the 'sched-messaging' perf bench test. (Davidlohr Bueso) - The -o and -n 'perf bench mem' options are mutually exclusive, emit error when both are specified. (Davidlohr Bueso) - Fix scrollbar refresh row index in the ui browser, problem exposed now that headers will be added and will be allowed to be switched on/off. (Jiri Olsa) - Handle the num array type in python properly (Sebastian Andrzej Siewior) - Fix wrong condition for allocation failure (Jiri Olsa) - Adjust callchain based on DWARF debug info on powerpc (Sukadev Bhattiprolu) - Fix a risk for doing free on uninitialized pointer in traceevent lib (Rickard Strandqvist) - Update attr test with PERF_FLAG_FD_CLOEXEC flag (Jiri Olsa) - Enable close-on-exec flag on perf file descriptor (Yann Droneaud) - Fix build on gcc 4.4.7 (Arnaldo Carvalho de Melo) - Event ordering fixes (Jiri Olsa)" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (123 commits) Revert "perf tools: Fix jump label always changing during tracing" perf tools: Fix perf usage string leftover perf: Check permission only for parent tracepoint event perf record: Store PERF_RECORD_FINISHED_ROUND only for nonempty rounds perf record: Always force PERF_RECORD_FINISHED_ROUND event perf inject: Add --kallsyms parameter perf tools: Expose 'addr' functions so they can be reused perf session: Fix accounting of ordered samples queue perf powerpc: Include util/util.h and remove stringify macros perf tools: Fix build on gcc 4.4.7 perf tools: Add thread parameter to vdso__dso_findnew() perf tools: Add dso__type() perf tools: Separate the VDSO map name from the VDSO dso name perf tools: Add vdso__new() perf machine: Fix the lifetime of the VDSO temporary file perf tools: Group VDSO global variables into a structure perf session: Add ability to skip 4GiB or more perf session: Add ability to 'skip' a non-piped event stream perf tools: Pass machine to vdso__dso_findnew() perf tools: Add dso__data_size() ...	2014-08-04 16:09:53 -07:00
Linus Torvalds	8efb90cf1e	Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes in this cycle are: - big rtmutex and futex cleanup and robustification from Thomas Gleixner - mutex optimizations and refinements from Jason Low - arch_mutex_cpu_relax() removal and related cleanups - smaller lockdep tweaks" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) arch, locking: Ciao arch_mutex_cpu_relax() locking/lockdep: Only ask for /proc/lock_stat output when available locking/mutexes: Optimize mutex trylock slowpath locking/mutexes: Try to acquire mutex only if it is unlocked locking/mutexes: Delete the MUTEX_SHOW_NO_WAITER macro locking/mutexes: Correct documentation on mutex optimistic spinning rtmutex: Make the rtmutex tester depend on BROKEN futex: Simplify futex_lock_pi_atomic() and make it more robust futex: Split out the first waiter attachment from lookup_pi_state() futex: Split out the waiter check from lookup_pi_state() futex: Use futex_top_waiter() in lookup_pi_state() futex: Make unlock_pi more robust rtmutex: Avoid pointless requeueing in the deadlock detection chain walk rtmutex: Cleanup deadlock detector debug logic rtmutex: Confine deadlock logic to futex rtmutex: Simplify remove_waiter() rtmutex: Document pi chain walk rtmutex: Clarify the boost/deboost part rtmutex: No need to keep task ref for lock owner check rtmutex: Simplify and document try_to_take_rtmutex() ...	2014-08-04 16:09:06 -07:00
Linus Torvalds	b8c0aa46b3	This pull request has a lot of work done. The main thing is the changes to the ftrace function callback infrastructure. It's introducing a way to allow different functions to call directly different trampolines instead of all calling the same "mcount" one. The only user of this for now is the function graph tracer, which always had a different trampoline, but the function tracer trampoline was called and did basically nothing, and then the function graph tracer trampoline was called. The difference now, is that the function graph tracer trampoline can be called directly if a function is only being traced by the function graph trampoline. If function tracing is also happening on the same function, the old way is still done. The accounting for this takes up more memory when function graph tracing is activated, as it needs to keep track of which functions it uses. I have a new way that wont take as much memory, but it's not ready yet for this merge window, and will have to wait for the next one. Another big change was the removal of the ftrace_start/stop() calls that were used by the suspend/resume code that stopped function tracing when entering into suspend and resume paths. The stop of ftrace was done because there was some function that would crash the system if one called smp_processor_id()! The stop/start was a big hammer to solve the issue at the time, which was when ftrace was first introduced into Linux. Now ftrace has better infrastructure to debug such issues, and I found the problem function and labeled it with "notrace" and function tracing can now safely be activated all the way down into the guts of suspend and resume. Other changes include clean ups of uprobe code. Clean up of the trace_seq() code. And other various small fixes and clean ups to ftrace and tracing. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJT35zXAAoJEKQekfcNnQGuOz0H/38zqM0nLFhrgvz3EPk2UOjn xqpX8qyb2V7TJZL+IqeXU2a5cQZl5ba0D4WtBGpxbTae3CJYiuQ87iKUNFoH0om5 FDpn80igb368k8V3qRdRsziKVCCf0XBd/NkHJXc0ZkfXGyzB2Ga4bBxALxp2gj9y bnO+vKo6+tWYKG4hyQb4P3LRXUrK8/LWEsPr39cH2QH1Rdj69Lx9CgrCdUVJmwcb Bj8hEiLXL/RYCFNn79A3wNTUvW0rG/AOIf4SLqXtasSRZ0ToaU0ZyDnrNv+0Ol47 rX8tSk+LfXchL9hpIvjCf1vlAYq3pO02favteR/jip3lx/dTjEDE4RJ9qtJzZ4Q= =fwQY -----END PGP SIGNATURE----- Merge tag 'trace-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "This pull request has a lot of work done. The main thing is the changes to the ftrace function callback infrastructure. It's introducing a way to allow different functions to call directly different trampolines instead of all calling the same "mcount" one. The only user of this for now is the function graph tracer, which always had a different trampoline, but the function tracer trampoline was called and did basically nothing, and then the function graph tracer trampoline was called. The difference now, is that the function graph tracer trampoline can be called directly if a function is only being traced by the function graph trampoline. If function tracing is also happening on the same function, the old way is still done. The accounting for this takes up more memory when function graph tracing is activated, as it needs to keep track of which functions it uses. I have a new way that wont take as much memory, but it's not ready yet for this merge window, and will have to wait for the next one. Another big change was the removal of the ftrace_start/stop() calls that were used by the suspend/resume code that stopped function tracing when entering into suspend and resume paths. The stop of ftrace was done because there was some function that would crash the system if one called smp_processor_id()! The stop/start was a big hammer to solve the issue at the time, which was when ftrace was first introduced into Linux. Now ftrace has better infrastructure to debug such issues, and I found the problem function and labeled it with "notrace" and function tracing can now safely be activated all the way down into the guts of suspend and resume Other changes include clean ups of uprobe code, clean up of the trace_seq() code, and other various small fixes and clean ups to ftrace and tracing" * tag 'trace-3.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (57 commits) ftrace: Add warning if tramp hash does not match nr_trampolines ftrace: Fix trampoline hash update check on rec->flags ring-buffer: Use rb_page_size() instead of open coded head_page size ftrace: Rename ftrace_ops field from trampolines to nr_trampolines tracing: Convert local function_graph functions to static ftrace: Do not copy old hash when resetting tracing: let user specify tracing_thresh after selecting function_graph ring-buffer: Always run per-cpu ring buffer resize with schedule_work_on() tracing: Remove function_trace_stop and HAVE_FUNCTION_TRACE_MCOUNT_TEST s390/ftrace: remove check of obsolete variable function_trace_stop arm64, ftrace: Remove check of obsolete variable function_trace_stop Blackfin: ftrace: Remove check of obsolete variable function_trace_stop metag: ftrace: Remove check of obsolete variable function_trace_stop microblaze: ftrace: Remove check of obsolete variable function_trace_stop MIPS: ftrace: Remove check of obsolete variable function_trace_stop parisc: ftrace: Remove check of obsolete variable function_trace_stop sh: ftrace: Remove check of obsolete variable function_trace_stop sparc64,ftrace: Remove check of obsolete variable function_trace_stop tile: ftrace: Remove check of obsolete variable function_trace_stop ftrace: x86: Remove check of obsolete variable function_trace_stop ...	2014-08-04 11:50:00 -07:00
Linus Torvalds	3e7a716a92	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto update from Herbert Xu: - CTR(AES) optimisation on x86_64 using "by8" AVX. - arm64 support to ccp - Intel QAT crypto driver - Qualcomm crypto engine driver - x86-64 assembly optimisation for 3DES - CTR(3DES) speed test - move FIPS panic from module.c so that it only triggers on crypto modules - SP800-90A Deterministic Random Bit Generator (drbg). - more test vectors for ghash. - tweak self tests to catch partial block bugs. - misc fixes. * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (94 commits) crypto: drbg - fix failure of generating multiple of 2**16 bytes crypto: ccp - Do not sign extend input data to CCP crypto: testmgr - add missing spaces to drbg error strings crypto: atmel-tdes - Switch to managed version of kzalloc crypto: atmel-sha - Switch to managed version of kzalloc crypto: testmgr - use chunks smaller than algo block size in chunk tests crypto: qat - Fixed SKU1 dev issue crypto: qat - Use hweight for bit counting crypto: qat - Updated print outputs crypto: qat - change ae_num to ae_id crypto: qat - change slice->regions to slice->region crypto: qat - use min_t macro crypto: qat - remove unnecessary parentheses crypto: qat - remove unneeded header crypto: qat - checkpatch blank lines crypto: qat - remove unnecessary return codes crypto: Resolve shadow warnings crypto: ccp - Remove "select OF" from Kconfig crypto: caam - fix DECO RSR polling crypto: qce - Let 'DEV_QCE' depend on both HAS_DMA and HAS_IOMEM ...	2014-08-04 09:52:51 -07:00
Linus Torvalds	f74ad8df4e	PCI changes for the v3.17 merge window: Resource management - Support BAR sizes up to 128GB (Yinghai Lu) - Keep original resource if we fail to expand it (Guo Chao) - Return conventional error values from pci_revert_fw_address() (Bjorn Helgaas) - Tidy resource assignment messages (Bjorn Helgaas) - Don't exclude low BIOS area for non-PCI cards (Christoph Schulz) PCI device hotplug - Prevent NULL dereference during pciehp probe (Andreas Noever) - Make pciehp pcie_wait_cmd() self-contained (Bjorn Helgaas) - Wait for pciehp hotplug command completion lazily (Bjorn Helgaas) - Compute pciehp timeout from hotplug command start time (Bjorn Helgaas) - Remove pciehp assumptions about which commands cause completion events (Bjorn Helgaas) - Clear pciehp Data Link Layer State Changed during init (Myron Stowe) - Remove pciehp struct controller.no_cmd_complete (Rajat Jain) - Remove cpqphp unnecessary null test (Fabian Frederick) - Remove "invalid IRQ" warning for hot-added PCIe ports (Jiang Liu) IOMMU - Add DMA alias quirk for Intel 82801 bridge (Alex Williamson) MSI - Add internal msix_clear_and_set_ctrl() (Yijing Wang) - Remove unused msi_enabled_mask() (Yijing Wang) - Cache Multiple Message Capable in struct msi_desc (Yijing Wang) - Add msi_setup_entry() to clean up initialization (Yijing Wang) - Remove unused msi_remove_pci_irq_vectors() (Yijing Wang) - Retrieve first MSI IRQ from msi_desc rather than pci_dev (Yijing Wang) - Remove unused list access in __pci_restore_msix_state() (Yijing Wang) - Use irq_get_msi_desc() to simplify code (Yijing Wang) Generic host bridge driver - Fix GPL v2 license string typo (Bjorn Helgaas) Marvell MVEBU - Fix GPL v2 license string typo (Thierry Reding) NVIDIA Tegra - Use correct initial HW settings (Phil Edworthy) - Remove rcar_pcie_setup_window() resource argument (Phil Edworthy) - Fix GPL v2 license string typo (Thierry Reding) Renesas R-Car - Remove redundant config accessor register checks (Sergei Shtylyov) - Fix GPL v2 license string typo (Bjorn Helgaas) Virtualization - Factor secondary bus reset logic (Gavin Shan) - Remove duplicate powerpc reset logic (Gavin Shan) Miscellaneous - Rework default VGA detection for EFI (Bruno Prémont) - Fix sysfs "acpi_index" and "label" errors for NIC renaming (Simone Gotti) - Configure ASPM at pci_enable_device()-time (Vidya Sagar) - Add include/linux/pci_ids.h include guard (Rasmus Villemoes) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJTzppBAAoJEFmIoMA60/r8mtQP/jgVWCSU+0ulHjoxVSRLu4Lc UGKQFjS03oWWflHdvW6wZFqN82Ynva9fYCLMtiKdPg7cgTosSRT3I4DjAIm80ZI/ kZvHxSmi6DBYmchZBsWzj60zxNiYZeEgd7CevzcJRHuwbKNMr2y12s6hjJbyl5lF ygaXWpDveKjsEDjyk9vKjUGwul/NJKynar253Yh178XaoypdGuiEIw3D1lQFMZZp ADcRijIi+CD2BENtDr6fbldbj+yQ93yyUSloEnaKtWZD+Ao5IsHngN0IyRu+l1Wl LFob0AsopeYVFKdw22Gn1KAq9Jj01acsSBRXjgrauU+tLY512Vkbp1lFYl85B/38 /Z0VNHncmIh29rq9Tl2xQwEeI3Ja27FfnMjC70dLM5YjWf8vsYnDEQZHyxAAe15D p3H3YuuDjmvHkoSrHY/68DLfDl9ubw3/BFUlCMqijL7444ZWLXathrnCV8ZJimmr PlF/m7GtXYF4wIw19m9KQqNBUPJJEsVHExKzICOY4v5/nMlvx4ZkBDR3tPNEH1sk 3AYKjLDw21Nle7yKcAlxDI/TYWZqxuph23UpevzlQd16tutq2i2FqpauiqI3DFm4 VfYVbOVQwfeUJt11VOCgxvE7RsTxCk5QefB+YKVAdVK6vMZHeZxsetYvrCDptnea cId/NfiEFnmr+u3mAyPM =U5Ip -----END PGP SIGNATURE----- Merge tag 'pci-v3.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "I'll be on vacation until Aug 11, and I suspect the merge window will open before then, so I'm sending this to you early. There are more things I'd like to get into v3.17, so I hope to send another pull request soon after I return. The most notable pieces here are: - Support BARs up to 128GB (up from 8GB) - Fix SR-IOV resource assignment when we fail to expand a resource - Rework pciehp to handle a common hardware erratum - Cleanup MSI - Fix NIC renaming issue - Fix VGA default device issue on EFI systems - Fix ASPM configuration (previously we didn't enable it as expected) Alex Williamson has graciously agreed to take care of any major issues with this if you take it before I return. Details: Resource management - Support BAR sizes up to 128GB (Yinghai Lu) - Keep original resource if we fail to expand it (Guo Chao) - Return conventional error values from pci_revert_fw_address() (Bjorn Helgaas) - Tidy resource assignment messages (Bjorn Helgaas) - Don't exclude low BIOS area for non-PCI cards (Christoph Schulz) PCI device hotplug - Prevent NULL dereference during pciehp probe (Andreas Noever) - Make pciehp pcie_wait_cmd() self-contained (Bjorn Helgaas) - Wait for pciehp hotplug command completion lazily (Bjorn Helgaas) - Compute pciehp timeout from hotplug command start time (Bjorn Helgaas) - Remove pciehp assumptions about which commands cause completion events (Bjorn Helgaas) - Clear pciehp Data Link Layer State Changed during init (Myron Stowe) - Remove pciehp struct controller.no_cmd_complete (Rajat Jain) - Remove cpqphp unnecessary null test (Fabian Frederick) - Remove "invalid IRQ" warning for hot-added PCIe ports (Jiang Liu) IOMMU - Add DMA alias quirk for Intel 82801 bridge (Alex Williamson) MSI - Add internal msix_clear_and_set_ctrl() (Yijing Wang) - Remove unused msi_enabled_mask() (Yijing Wang) - Cache Multiple Message Capable in struct msi_desc (Yijing Wang) - Add msi_setup_entry() to clean up initialization (Yijing Wang) - Remove unused msi_remove_pci_irq_vectors() (Yijing Wang) - Retrieve first MSI IRQ from msi_desc rather than pci_dev (Yijing Wang) - Remove unused list access in __pci_restore_msix_state() (Yijing Wang) - Use irq_get_msi_desc() to simplify code (Yijing Wang) Generic host bridge driver - Fix GPL v2 license string typo (Bjorn Helgaas) Marvell MVEBU - Fix GPL v2 license string typo (Thierry Reding) NVIDIA Tegra - Use correct initial HW settings (Phil Edworthy) - Remove rcar_pcie_setup_window() resource argument (Phil Edworthy) - Fix GPL v2 license string typo (Thierry Reding) Renesas R-Car - Remove redundant config accessor register checks (Sergei Shtylyov) - Fix GPL v2 license string typo (Bjorn Helgaas) Virtualization - Factor secondary bus reset logic (Gavin Shan) - Remove duplicate powerpc reset logic (Gavin Shan) Miscellaneous - Rework default VGA detection for EFI (Bruno Prémont) - Fix sysfs "acpi_index" and "label" errors for NIC renaming (Simone Gotti) - Configure ASPM at pci_enable_device()-time (Vidya Sagar) - Add include/linux/pci_ids.h include guard (Rasmus Villemoes)" * tag 'pci-v3.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (38 commits) PCI/MSI: Use irq_get_msi_desc() to simplify code PCI/MSI: Remove unused list access in __pci_restore_msix_state() PCI/MSI: Retrieve first MSI IRQ from msi_desc rather than pci_dev PCI/MSI: Remove unused function msi_remove_pci_irq_vectors() PCI/MSI: Add msi_setup_entry() to clean up MSI initialization PCI: Configure ASPM when enabling device x86: don't exclude low BIOS area when allocating address space for non-PCI cards PCI: generic: Fix GPL v2 license string typo PCI: rcar: Fix GPL v2 license string typo PCI: tegra: Fix GPL v2 license string typo PCI: mvebu: Fix GPL v2 license string typo PCI: Add include guard to include/linux/pci_ids.h x86, ia64: Move EFI_FB vga_default_device() initialization to pci_vga_fixup() PCI: Tidy resource assignment messages PCI: Return conventional error values from pci_revert_fw_address() PCI: Cleanup control flow PCI: Support BAR sizes up to 128GB PCI: cpqphp: Remove unnecessary null test before debugfs_remove() PCI: pciehp: Clear Data Link Layer State Changed during init PCI: Add bridge DMA alias quirk for Intel 82801 bridge ...	2014-08-04 09:29:37 -07:00
Alexei Starovoitov	7ae457c1e5	net: filter: split 'struct sk_filter' into socket and bpf parts clean up names related to socket filtering and bpf in the following way: - everything that deals with sockets keeps 'sk_' prefix - everything that is pure BPF is changed to 'bpf_' prefix split 'struct sk_filter' into struct sk_filter { atomic_t refcnt; struct rcu_head rcu; struct bpf_prog prog; }; and struct bpf_prog { u32 jited:1, len:31; struct sock_fprog_kern orig_prog; unsigned int (bpf_func)(const struct sk_buff skb, const struct bpf_insn filter); union { struct sock_filter insns[0]; struct bpf_insn insnsi[0]; struct work_struct work; }; }; so that 'struct bpf_prog' can be used independent of sockets and cleans up 'unattached' bpf use cases split SK_RUN_FILTER macro into: SK_RUN_FILTER to be used with 'struct sk_filter ' and BPF_PROG_RUN to be used with 'struct bpf_prog ' __sk_filter_release(struct sk_filter ) gains __bpf_prog_release(struct bpf_prog ) helper function also perform related renames for the functions that work with 'struct bpf_prog ', since they're on the same lines: sk_filter_size -> bpf_prog_size sk_filter_select_runtime -> bpf_prog_select_runtime sk_filter_free -> bpf_prog_free sk_unattached_filter_create -> bpf_prog_create sk_unattached_filter_destroy -> bpf_prog_destroy sk_store_orig_filter -> bpf_prog_store_orig_filter sk_release_orig_filter -> bpf_release_orig_filter __sk_migrate_filter -> bpf_migrate_filter __sk_prepare_filter -> bpf_prepare_filter API for attaching classic BPF to a socket stays the same: sk_attach_filter(prog, struct sock )/sk_detach_filter(struct sock ) and SK_RUN_FILTER(struct sk_filter , ctx) to execute a program which is used by sockets, tun, af_packet API for 'unattached' BPF programs becomes: bpf_prog_create(struct bpf_prog )/bpf_prog_destroy(struct bpf_prog ) and BPF_PROG_RUN(struct bpf_prog *, ctx) to execute a program which is used by isdn, ppp, team, seccomp, ptp, xt_bpf, cls_bpf, test_bpf Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-08-02 15:03:58 -07:00
Alexander Graf	8e6afa36e7	KVM: PPC: PR: Handle FSCR feature deselects We handle FSCR feature bits (well, TAR only really today) lazily when the guest starts using them. So when a guest activates the bit and later uses that feature we enable it for real in hardware. However, when the guest stops using that bit we don't stop setting it in hardware. That means we can potentially lose a trap that the guest expects to happen because it thinks a feature is not active. This patch adds support to drop TAR when then guest turns it off in FSCR. While at it it also restricts FSCR access to 64bit systems - 32bit ones don't have it. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-31 10:23:46 +02:00
Shengzhou Liu	78eb9094ca	powerpc/t2080rdb: Add T2080RDB board support T2080PCIe-RDB is a Freescale Reference Design Board that hosts T2080 SoC. The board feature overview: Processor: - T2080 SoC integrating four 64-bit dual-threads e6500 cores up to 1.8GHz DDR Memory: - Single memory controller capable of supporting DDR3 and DDR3-LP devices - 72bit 4GB DDR3-LP SODIMM in slot Ethernet interfaces: - Two 1Gbps RGMII ports on-board - Two 10Gbps SFP+ ports on-board - Two 10Gbps Base-T ports on-board Accelerator: - DPAA components consist of FMan, BMan, QMan, PME, DCE and SEC IFC/Local Bus - NOR: 128MB 16-bit NOR flash - NAND: 1GB 8-bit NAND flash - CPLD: for system controlling with programable header on-board eSPI: - 64MB N25Q512 SPI flash USB: - Two USB2.0 ports with internal PHY (both Type-A) PCIe: - One PCIe x4 goldfinger(support SR-IOV) - One PCIe x4 slot - One PCIe x2 end-point device (C293 crypto co-processor) SATA: - Two SATA 2.0 ports on-board SDHC: - support a MicroSD/TF card on-board I2C: - Four I2C controllers. UART: - Dual 4-pins UART serial ports Signed-off-by: Shengzhou Liu <Shengzhou.Liu@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2014-07-31 00:11:10 -05:00
Alexander Graf	29577fc00b	KVM: PPC: HV: Remove generic instruction emulation Now that we have properly split load/store instruction emulation and generic instruction emulation, we can move the generic one from kvm.ko to kvm-pr.ko on book3s_64. This reduces the attack surface and amount of code loaded on HV KVM kernels. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-30 15:25:49 +02:00
Bharat Bhushan	5a484c7c1e	KVM: PPC: BOOKEHV: rename e500hv_spr to bookehv_spr This are not specific to e500hv but applicable for bookehv (As per comment from Scott Wood on my patch "kvm: ppc: bookehv: Added wrapper macros for shadow registers") Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-30 11:39:52 +02:00
Himangi Saraogi	3894817fb1	powerpc/fsl-pci: Correct use of ! and & In commit `ae91d60ba8`, a bug was fixed that involved converting !x & y to !(x & y). The code below shows the same pattern, and thus should perhaps be fixed in the same way. This is not tested and clearly changes the semantics, so it is only something to consider. The Coccinelle semantic patch that makes this change is as follows: // <smpl> @@ expression E1,E2; @@ ( !E1 & !E2 \| - !E1 & E2 + !(E1 & E2) ) // </smpl> Signed-off-by: Himangi Saraogi <himangi774@gmail.com> Signed-off-by: Scott Wood <scottwood@freescale.com>	2014-07-29 19:26:31 -05:00
Himangi Saraogi	983e244410	powerpc/mpic_msgr: Use kcalloc and correct the argument to sizeof mpic_msgrs has type struct mpic_msgr *, not struct mpic_msgr , so the elements of the array should have pointer type, not structure type. The advantage of kcalloc is, that will prevent integer overflows which could result from the multiplication of number of elements and size and it is also a bit nicer to read. The Coccinelle semantic patch that makes the first change is as follows: // <smpl> @disable sizeof_type_expr@ type T; T *x; @@ x = <+...sizeof( - T + x )...+> // </smpl> Signed-off-by: Himangi Saraogi <himangi774@gmail.com> Acked-by: Julia Lawall <julia.lawall@lip6.fr> Signed-off-by: Scott Wood <scottwood@freescale.com>	2014-07-29 19:26:31 -05:00
Scott Wood	54afbec0d5	memory: Freescale CoreNet Coherency Fabric error reporting driver The CoreNet Coherency Fabric is part of the memory subsystem on some Freescale QorIQ chips. It can report coherency violations (e.g. due to misusing memory that is mapped noncoherent) as well as transactions that do not hit any local access window, or which hit a local access window with an invalid target ID. Signed-off-by: Scott Wood <scottwood@freescale.com> Reviewed-by: Bharat Bhushan <bharat.bhushan@freescale.com>	2014-07-29 19:26:30 -05:00
Scott Wood	48cd9b5d59	powerpc/e6500: Work around erratum A-008139 Erratum A-008139 can cause duplicate TLB entries if an indirect entry is overwritten using tlbwe while the other thread is using it to do a lookup. Work around this by using tlbilx to invalidate prior to overwriting. To avoid the need to save another register to hold MAS1 during the workaround code, TID clearing has been moved from tlb_miss_kernel_e6500 until after the SMT section. Signed-off-by: Scott Wood <scottwood@freescale.com>	2014-07-29 19:26:29 -05:00
Andy Fleming	e16c876553	powerpc/e6500: Add support for hardware threads The general idea is that each core will release all of its threads into the secondary thread startup code, which will eventually wait in the secondary core holding area, for the appropriate bit in the PACA to be set. The kick_cpu function pointer will set that bit in the PACA, and thus "release" the core/thread to boot. We also need to do a few things that U-Boot normally does for CPUs (like enable branch prediction). Signed-off-by: Andy Fleming <afleming@freescale.com> [scottwood@freescale.com: various changes, including only enabling threads if Linux wants to kick them] Signed-off-by: Scott Wood <scottwood@freescale.com>	2014-07-29 19:26:20 -05:00
Scott Wood	7251a24e4d	powerpc/booke: Define MSR bits the same way as reg.h This ensures that all MSR definitions are consistently unsigned long, and that MSR_CM does not become 0xffffffff80000000 (this is usually harmless because MSR is 32-bit on booke and is mainly noticeable when debugging, but still I'd rather avoid it). Signed-off-by: Scott Wood <scottwood@freescale.com>	2014-07-29 19:24:38 -05:00
Linus Torvalds	e8a91e0e87	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc fixes from Ben Herrenschmidt: "Here are 3 more small powerpc fixes that should still go into .16. One is a recent regression (MMCR2 business), the other is a trivial endian fix without which FW updates won't work on LE in IBM machines, and the 3rd one turns a BUG_ON into a WARN_ON which is definitely a LOT more friendly especially when the whole thing is about retrieving error logs ..." * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc: Fix endianness of flash_block_list in rtas_flash powerpc/powernv: Change BUG_ON to WARN_ON in elog code powerpc/perf: Fix MMCR2 handling for EBB	2014-07-28 11:34:31 -07:00
Alexander Graf	ce91ddc471	KVM: PPC: Remove DCR handling DCR handling was only needed for 440 KVM. Since we removed it, we can also remove handling of DCR accesses. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 19:29:15 +02:00
Alexander Graf	8de12015ff	KVM: PPC: Expose helper functions for data/inst faults We're going to implement guest code interpretation in KVM for some rare corner cases. This code needs to be able to inject data and instruction faults into the guest when it encounters them. Expose generic APIs to do this in a reasonably subarch agnostic fashion. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 18:30:18 +02:00
Alexander Graf	d69614a295	KVM: PPC: Separate loadstore emulation from priv emulation Today the instruction emulator can get called via 2 separate code paths. It can either be called by MMIO emulation detection code or by privileged instruction traps. This is bad, as both code paths prepare the environment differently. For MMIO emulation we already know the virtual address we faulted on, so instructions there don't have to actually fetch that information. Split out the two separate use cases into separate files. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 18:30:10 +02:00
Alexander Graf	c12fb43c2f	KVM: PPC: Handle magic page in kvmppc_ld/st We use kvmppc_ld and kvmppc_st to emulate load/store instructions that may as well access the magic page. Special case it out so that we can properly access it. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 16:35:53 +02:00
Alexander Graf	c45c551403	KVM: PPC: Use kvm_read_guest in kvmppc_ld We have a nice and handy helper to read from guest physical address space, so we should make use of it in kvmppc_ld as we already do for its counterpart in kvmppc_st. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 16:33:54 +02:00
Alexander Graf	9897e88a79	KVM: PPC: Remove kvmppc_bad_hva() We have a proper define for invalid HVA numbers. Use those instead of the ppc specific kvmppc_bad_hva(). Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 16:28:51 +02:00
Alexander Graf	35c4a7330d	KVM: PPC: Move kvmppc_ld/st to common code We have enough common infrastructure now to resolve GVA->GPA mappings at runtime. With this we can move our book3s specific helpers to load / store in guest virtual address space to common code as well. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 16:27:12 +02:00
Alexander Graf	7d15c06f1a	KVM: PPC: Implement kvmppc_xlate for all targets We have a nice API to find the translated GPAs of a GVA including protection flags. So far we only use it on Book3S, but there's no reason the same shouldn't be used on BookE as well. Implement a kvmppc_xlate() version for BookE and clean it up to make it more readable in general. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 16:15:50 +02:00
Aneesh Kumar K.V	63fff5c1e3	KVM: PPC: BOOK3S: HV: Update compute_tlbie_rb to handle 16MB base page When calculating the lower bits of AVA field, use the shift count based on the base page size. Also add the missing segment size and remove stale comment. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 16:09:17 +02:00
Alexander Graf	7a58777a33	KVM: PPC: Book3S: Provide different CAPs based on HV or PR mode With Book3S KVM we can create both PR and HV VMs in parallel on the same machine. That gives us new challenges on the CAPs we return - both have different capabilities. When we get asked about CAPs on the kvm fd, there's nothing we can do. We can try to be smart and assume we're running HV if HV is available, PR otherwise. However with the newly added VM CHECK_EXTENSION we can now ask for capabilities directly on a VM which knows whether it's PR or HV. With this patch I can successfully expose KVM PVINFO data to user space in the PR case, fixing magic page mapping for PAPR guests. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Paolo Bonzini <pbonzini@redhat.com>	2014-07-28 15:23:18 +02:00
Alexander Graf	784aa3d7fb	KVM: Rename and add argument to check_extension In preparation to make the check_extension function available to VM scope we add a struct kvm * argument to the function header and rename the function accordingly. It will still be called from the /dev/kvm fd, but with a NULL argument for struct kvm *. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Paolo Bonzini <pbonzini@redhat.com>	2014-07-28 15:23:17 +02:00
Stewart Smith	9678cdaae9	Use the POWER8 Micro Partition Prefetch Engine in KVM HV on POWER8 The POWER8 processor has a Micro Partition Prefetch Engine, which is a fancy way of saying "has way to store and load contents of L2 or L2+MRU way of L3 cache". We initiate the storing of the log (list of addresses) using the logmpp instruction and start restore by writing to a SPR. The logmpp instruction takes parameters in a single 64bit register: - starting address of the table to store log of L2/L2+L3 cache contents - 32kb for L2 - 128kb for L2+L3 - Aligned relative to maximum size of the table (32kb or 128kb) - Log control (no-op, L2 only, L2 and L3, abort logout) We should abort any ongoing logging before initiating one. To initiate restore, we write to the MPPR SPR. The format of what to write to the SPR is similar to the logmpp instruction parameter: - starting address of the table to read from (same alignment requirements) - table size (no data, until end of table) - prefetch rate (from fastest possible to slower. about every 8, 16, 24 or 32 cycles) The idea behind loading and storing the contents of L2/L3 cache is to reduce memory latency in a system that is frequently swapping vcores on a physical CPU. The best case scenario for doing this is when some vcores are doing very cache heavy workloads. The worst case is when they have about 0 cache hits, so we just generate needless memory operations. This implementation just does L2 store/load. In my benchmarks this proves to be useful. Benchmark 1: - 16 core POWER8 - 3x Ubuntu 14.04LTS guests (LE) with 8 VCPUs each - No split core/SMT - two guests running sysbench memory test. sysbench --test=memory --num-threads=8 run - one guest running apache bench (of default HTML page) ab -n 490000 -c 400 http://localhost/ This benchmark aims to measure performance of real world application (apache) where other guests are cache hot with their own workloads. The sysbench memory benchmark does pointer sized writes to a (small) memory buffer in a loop. In this benchmark with this patch I can see an improvement both in requests per second (~5%) and in mean and median response times (again, about 5%). The spread of minimum and maximum response times were largely unchanged. benchmark 2: - Same VM config as benchmark 1 - all three guests running sysbench memory benchmark This benchmark aims to see if there is a positive or negative affect to this cache heavy benchmark. Although due to the nature of the benchmark (stores) we may not see a difference in performance, but rather hopefully an improvement in consistency of performance (when vcore switched in, don't have to wait many times for cachelines to be pulled in) The results of this benchmark are improvements in consistency of performance rather than performance itself. With this patch, the few outliers in duration go away and we get more consistent performance in each guest. benchmark 3: - same 3 guests and CPU configuration as benchmark 1 and 2. - two idle guests - 1 guest running STREAM benchmark This scenario also saw performance improvement with this patch. On Copy and Scale workloads from STREAM, I got 5-6% improvement with this patch. For Add and triad, it was around 10% (or more). benchmark 4: - same 3 guests as previous benchmarks - two guests running sysbench --memory, distinctly different cache heavy workload - one guest running STREAM benchmark. Similar improvements to benchmark 3. benchmark 5: - 1 guest, 8 VCPUs, Ubuntu 14.04 - Host configured with split core (SMT8, subcores-per-core=4) - STREAM benchmark In this benchmark, we see a 10-20% performance improvement across the board of STREAM benchmark results with this patch. Based on preliminary investigation and microbenchmarks by Prerna Saxena <prerna@linux.vnet.ibm.com> Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:17 +02:00
Stewart Smith	de9bdd1a60	Split out struct kvmppc_vcore creation to separate function No code changes, just split it out to a function so that with the addition of micro partition prefetch buffer allocation (in subsequent patch) looks neater and doesn't require excessive indentation. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:16 +02:00
Paul Mackerras	1b2e33b071	KVM: PPC: Book3S: Make kvmppc_ld return a more accurate error indication At present, kvmppc_ld calls kvmppc_xlate, and if kvmppc_xlate returns any error indication, it returns -ENOENT, which is taken to mean an HPTE not found error. However, the error could have been a segment found (no SLB entry) or a permission error. Similarly, kvmppc_pte_to_hva currently does permission checking, but any error from it is taken by kvmppc_ld to mean that the access is an emulated MMIO access. Also, kvmppc_ld does no execute permission checking. This fixes these problems by (a) returning any error from kvmppc_xlate directly, (b) moving the permission check from kvmppc_pte_to_hva into kvmppc_ld, and (c) adding an execute permission check to kvmppc_ld. This is similar to what was done for kvmppc_st() by commit 82ff911317c3 ("KVM: PPC: Deflect page write faults properly in kvmppc_st"). Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:16 +02:00
Paul Mackerras	ef1af2e296	KVM: PPC: Book3S PR: Take SRCU read lock around RTAS kvm_read_guest() call This does for PR KVM what `c9438092ca` ("KVM: PPC: Book3S HV: Take SRCU read lock around kvm_read_guest() call") did for HV KVM, that is, eliminate a "suspicious rcu_dereference_check() usage!" warning by taking the SRCU lock around the call to kvmppc_rtas_hcall(). It also fixes a return of RESUME_HOST to return EMULATE_FAIL instead, since kvmppc_h_pr() is supposed to return EMULATE_* values. Signed-off-by: Paul Mackerras <paulus@samba.org> Cc: stable@vger.kernel.org Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:16 +02:00
Alexey Kardashevskiy	a0840240c0	KVM: PPC: Book3S: Fix LPCR one_reg interface Unfortunately, the LPCR got defined as a 32-bit register in the one_reg interface. This is unfortunate because KVM allows userspace to control the DPFD (default prefetch depth) field, which is in the upper 32 bits. The result is that DPFD always get set to 0, which reduces performance in the guest. We can't just change KVM_REG_PPC_LPCR to be a 64-bit register ID, since that would break existing userspace binaries. Instead we define a new KVM_REG_PPC_LPCR_64 id which is 64-bit. Userspace can still use the old KVM_REG_PPC_LPCR id, but it now only modifies those fields in the bottom 32 bits that userspace can modify (ILE, TC and AIL). If userspace uses the new KVM_REG_PPC_LPCR_64 id, it can modify DPFD as well. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Paul Mackerras <paulus@samba.org> Cc: stable@vger.kernel.org Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:16 +02:00
Alexander Graf	b2677b8dd8	KVM: PPC: Remove 440 support The 440 target hasn't been properly functioning for a few releases and before I was the only one who fixes a very serious bug that indicates to me that nobody used it before either. Furthermore KVM on 440 is slow to the extent of unusable. We don't have to carry along completely unused code. Remove 440 and give us one less thing to worry about. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:15 +02:00
Bharat Bhushan	8c95ead603	KVM: PPC: Remove comment saying SPRG1 is used for vcpu pointer Scott Wood pointed out that We are no longer using SPRG1 for vcpu pointer, but using SPRN_SPRG_THREAD <=> SPRG3 (thread->vcpu). So this comment is not valid now. Note: SPRN_SPRG3R is not supported (do not see any need as of now), and if we want to support this in future then we have to shift to using SPRG1 for VCPU pointer. Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:15 +02:00
Bharat Bhushan	28d2f421bc	KVM: PPC: Booke-hv: Add one reg interface for SPRG9 We now support SPRG9 for guest, so also add a one reg interface for same Note: Changes are in bookehv code only as we do not have SPRG9 on booke-pr. Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:15 +02:00
Bharat Bhushan	99e99d19a8	kvm: ppc: bookehv: Save restore SPRN_SPRG9 on guest entry exit SPRN_SPRG is used by debug interrupt handler, so this is required for debug support. Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:14 +02:00
Mihai Caraman	f5250471b2	KVM: PPC: Bookehv: Get vcpu's last instruction for emulation On book3e, KVM uses load external pid (lwepx) dedicated instruction to read guest last instruction on the exit path. lwepx exceptions (DTLB_MISS, DSI and LRAT), generated by loading a guest address, needs to be handled by KVM. These exceptions are generated in a substituted guest translation context (EPLC[EGS] = 1) from host context (MSR[GS] = 0). Currently, KVM hooks only interrupts generated from guest context (MSR[GS] = 1), doing minimal checks on the fast path to avoid host performance degradation. lwepx exceptions originate from host state (MSR[GS] = 0) which implies additional checks in DO_KVM macro (beside the current MSR[GS] = 1) by looking at the Exception Syndrome Register (ESR[EPID]) and the External PID Load Context Register (EPLC[EGS]). Doing this on each Data TLB miss exception is obvious too intrusive for the host. Read guest last instruction from kvmppc_load_last_inst() by searching for the physical address and kmap it. This address the TODO for TLB eviction and execute-but-not-read entries, and allow us to get rid of lwepx until we are able to handle failures. A simple stress benchmark shows a 1% sys performance degradation compared with previous approach (lwepx without failure handling): time for i in `seq 1 10000`; do /bin/echo > /dev/null; done real 0m 8.85s user 0m 4.34s sys 0m 4.48s vs real 0m 8.84s user 0m 4.36s sys 0m 4.44s A solution to use lwepx and to handle its exceptions in KVM would be to temporary highjack the interrupt vector from host. This imposes additional synchronizations for cores like FSL e6500 that shares host IVOR registers between hardware threads. This optimized solution can be later developed on top of this patch. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:14 +02:00
Mihai Caraman	51f047261e	KVM: PPC: Allow kvmppc_get_last_inst() to fail On book3e, guest last instruction is read on the exit path using load external pid (lwepx) dedicated instruction. This load operation may fail due to TLB eviction and execute-but-not-read entries. This patch lay down the path for an alternative solution to read the guest last instruction, by allowing kvmppc_get_lat_inst() function to fail. Architecture specific implmentations of kvmppc_load_last_inst() may read last guest instruction and instruct the emulation layer to re-execute the guest in case of failure. Make kvmppc_get_last_inst() definition common between architectures. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:14 +02:00
Mihai Caraman	9a26af64d6	KVM: PPC: Book3s: Remove kvmppc_read_inst() function In the context of replacing kvmppc_ld() function calls with a version of kvmppc_get_last_inst() which allow to fail, Alex Graf suggested this: "If we get EMULATE_AGAIN, we just have to make sure we go back into the guest. No need to inject an ISI into the guest - it'll do that all by itself. With an error returning kvmppc_get_last_inst we can just use completely get rid of kvmppc_read_inst() and only use kvmppc_get_last_inst() instead." As a intermediate step get rid of kvmppc_read_inst() and only use kvmppc_ld() instead. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:13 +02:00
Mihai Caraman	9c0d4e0dcf	KVM: PPC: Book3e: Add TLBSEL/TSIZE defines for MAS0/1 Add mising defines MAS0_GET_TLBSEL() and MAS1_GET_TSIZE() for Book3E. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:13 +02:00
Mihai Caraman	b5741bb3d4	KVM: PPC: e500mc: Revert "add load inst fixup" The commit `1d628af7` "add load inst fixup" made an attempt to handle failures generated by reading the guest current instruction. The fixup code that was added works by chance hiding the real issue. Load external pid (lwepx) instruction, used by KVM to read guest instructions, is executed in a subsituted guest translation context (EPLC[EGS] = 1). In consequence lwepx's TLB error and data storage interrupts need to be handled by KVM, even though these interrupts are generated from host context (MSR[GS] = 0) where lwepx is executed. Currently, KVM hooks only interrupts generated from guest context (MSR[GS] = 1), doing minimal checks on the fast path to avoid host performance degradation. As a result, the host kernel handles lwepx faults searching the faulting guest data address (loaded in DEAR) in its own Logical Partition ID (LPID) 0 context. In case a host translation is found the execution returns to the lwepx instruction instead of the fixup, the host ending up in an infinite loop. Revert the commit "add load inst fixup". lwepx issue will be addressed in a subsequent patch without needing fixup code. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:13 +02:00
Bharat Bhushan	34f754b99e	kvm: ppc: Add SPRN_EPR get helper function kvmppc_set_epr() is already defined in asm/kvm_ppc.h, So rename and move get_epr helper function to same file. Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> [agraf: remove duplicate return] Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:13 +02:00
Bharat Bhushan	c1b8a01bf9	kvm: ppc: booke: Use the shared struct helpers for SPRN_SPRG0-7 Use kvmppc_set_sprg[0-7]() and kvmppc_get_sprg[0-7]() helper functions Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:12 +02:00
Bharat Bhushan	dc168549d9	kvm: ppc: booke: Add shared struct helpers of SPRN_ESR Add and use kvmppc_set_esr() and kvmppc_get_esr() helper functions Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:12 +02:00
Bharat Bhushan	a5414d4b5e	kvm: ppc: booke: Use the shared struct helpers of SPRN_DEAR Uses kvmppc_set_dar() and kvmppc_get_dar() helper functions Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:12 +02:00
Bharat Bhushan	31579eea69	kvm: ppc: booke: Use the shared struct helpers of SRR0 and SRR1 Use kvmppc_set_srr0/srr1() and kvmppc_get_srr0/srr1() helper functions Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:11 +02:00
Bharat Bhushan	1dc0c5b88c	kvm: ppc: bookehv: Added wrapper macros for shadow registers There are shadow registers like, GSPRG[0-3], GSRR0, GSRR1 etc on BOOKE-HV and these shadow registers are guest accessible. So these shadow registers needs to be updated on BOOKE-HV. This patch adds new macro for get/set helper of shadow register . Signed-off-by: Bharat Bhushan <Bharat.Bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:11 +02:00
Alexander Graf	89b68c96a2	KVM: PPC: Book3S: Make magic page properly 4k mappable The magic page is defined as a 4k page of per-vCPU data that is shared between the guest and the host to accelerate accesses to privileged registers. However, when the host is using 64k page size granularity we weren't quite as strict about that rule anymore. Instead, we partially treated all of the upper 64k as magic page and mapped only the uppermost 4k with the actual magic contents. This works well enough for Linux which doesn't use any memory in kernel space in the upper 64k, but Mac OS X got upset. So this patch makes magic page actually stay in a 4k range even on 64k page size hosts. This patch fixes magic page usage with Mac OS X (using MOL) on 64k PAGE_SIZE hosts for me. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:11 +02:00
Alexander Graf	c01e3f66cd	KVM: PPC: Book3S: Add hack for split real mode Today we handle split real mode by mapping both instruction and data faults into a special virtual address space that only exists during the split mode phase. This is good enough to catch 32bit Linux guests that use split real mode for copy_from/to_user. In this case we're always prefixed with 0xc0000000 for our instruction pointer and can map the user space process freely below there. However, that approach fails when we're running KVM inside of KVM. Here the 1st level last_inst reader may well be in the same virtual page as a 2nd level interrupt handler. It also fails when running Mac OS X guests. Here we have a 4G/4G split, so a kernel copy_from/to_user implementation can easily overlap with user space addresses. The architecturally correct way to fix this would be to implement an instruction interpreter in KVM that kicks in whenever we go into split real mode. This interpreter however would not receive a great amount of testing and be a lot of bloat for a reasonably isolated corner case. So I went back to the drawing board and tried to come up with a way to make split real mode work with a single flat address space. And then I realized that we could get away with the same trick that makes it work for Linux: Whenever we see an instruction address during split real mode that may collide, we just move it higher up the virtual address space to a place that hopefully does not collide (keep your fingers crossed!). That approach does work surprisingly well. I am able to successfully run Mac OS X guests with KVM and QEMU (no split real mode hacks like MOL) when I apply a tiny timing probe hack to QEMU. I'd say this is a win over even more broken split real mode :). Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:10 +02:00
Alexander Graf	2e27ecc961	KVM: PPC: Book3S: Stop PTE lookup on write errors When a page lookup failed because we're not allowed to write to the page, we should not overwrite that value with another lookup on the second PTEG which will return "page not found". Instead, we should just tell the caller that we had a permission problem. This fixes Mac OS X guests looping endlessly in page lookup code for me. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:10 +02:00
Alexander Graf	17824b5afc	KVM: PPC: Deflect page write faults properly in kvmppc_st When we have a page that we're not allowed to write to, xlate() will already tell us -EPERM on lookup of that page. With the code as is we change it into a "page missing" error which a guest may get confused about. Instead, just tell the caller about the -EPERM directly. This fixes Mac OS X guests when run with DCBZ32 emulation. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:23:10 +02:00
Alexander Graf	1287cb3fa8	KVM: PPC: Book3S: Move vcore definition to end of kvm_arch struct When building KVM with a lot of vcores (NR_CPUS is big), we can potentially get out of the ld immediate range for dereferences inside that struct. Move the array to the end of our kvm_arch struct. This fixes compilation issues with NR_CPUS=2048 for me. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:27 +02:00
Mihai Caraman	debf27d6b9	KVM: PPC: e500: Emulate power management control SPR For FSL e6500 core the kernel uses power management SPR register (PWRMGTCR0) to enable idle power down for cores and devices by setting up the idle count period at boot time. With the host already controlling the power management configuration the guest could simply benefit from it, so emulate guest request as a general store. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:27 +02:00
Alexander Graf	6947f948f0	KVM: PPC: Book3S HV: Enable for little endian hosts Now that we've fixed all the issues that HV KVM code had on little endian hosts, we can enable it in the kernel configuration for users to play with. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:26 +02:00
Alexander Graf	9bf163f86d	KVM: PPC: Book3S HV: Fix ABIv2 on LE For code that doesn't live in modules we can just branch to the real function names, giving us compatibility with ABIv1 and ABIv2. Do this for the compiled-in code of HV KVM. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:25 +02:00
Alexander Graf	76d072fb05	KVM: PPC: Book3S HV: Access XICS in BE On the exit path from the guest we check what type of interrupt we received if we received one. This means we're doing hardware access to the XICS interrupt controller. However, when running on a little endian system, this access is byte reversed. So let's make sure to swizzle the bytes back again and virtually make XICS accesses big endian. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:24 +02:00
Alexander Graf	0865a583a4	KVM: PPC: Book3S HV: Access host lppaca and shadow slb in BE Some data structures are always stored in big endian. Among those are the LPPACA fields as well as the shadow slb. These structures might be shared with a hypervisor. So whenever we access those fields, make sure we do so in big endian byte order. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:23 +02:00
Alexander Graf	0240755225	KVM: PPC: Book3S HV: Access guest VPA in BE There are a few shared data structures between the host and the guest. Most of them get registered through the VPA interface. These data structures are defined to always be in big endian byte order, so let's make sure we always access them in big endian. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:22 +02:00
Alexander Graf	6f22bd3265	KVM: PPC: Book3S HV: Make HTAB code LE host aware When running on an LE host all data structures are kept in little endian byte order. However, the HTAB still needs to be maintained in big endian. So every time we access any HTAB we need to make sure we do so in the right byte order. Fix up all accesses to manually byte swap. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:22 +02:00
Alexander Graf	8f6822c4b9	PPC: Add asm helpers for BE 32bit load/store From assembly code we might not only have to explicitly BE access 64bit values, but sometimes also 32bit ones. Add helpers that allow for easy use of lwzx/stwx in their respective byte-reverse or native form. Signed-off-by: Alexander Graf <agraf@suse.de> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 15:22:21 +02:00
Mihai Caraman	d57cef91a0	KVM: PPC: e500: Fix default tlb for victim hint Tlb search operation used for victim hint relies on the default tlb set by the host. When hardware tablewalk support is enabled in the host, the default tlb is TLB1 which leads KVM to evict the bolted entry. Set and restore the default tlb when searching for victim hint. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Reviewed-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:20 +02:00
Michael Neuling	9642382e82	KVM: PPC: Book3S HV: Add H_SET_MODE hcall handling This adds support for the H_SET_MODE hcall. This hcall is a multiplexer that has several functions, some of which are called rarely, and some which are potentially called very frequently. Here we add support for the functions that set the debug registers CIABR (Completed Instruction Address Breakpoint Register) and DAWR/DAWRX (Data Address Watchpoint Register and eXtension), since they could be updated by the guest as often as every context switch. This also adds a kvmppc_power8_compatible() function to test to see if a guest is compatible with POWER8 or not. The CIABR and DAWR/X only exist on POWER8. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:19 +02:00
Paul Mackerras	ae2113a4f1	KVM: PPC: Book3S: Allow only implemented hcalls to be enabled or disabled This adds code to check that when the KVM_CAP_PPC_ENABLE_HCALL capability is used to enable or disable in-kernel handling of an hcall, that the hcall is actually implemented by the kernel. If not an EINVAL error is returned. This also checks the default-enabled list of hcalls and prints a warning if any hcall there is not actually implemented. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:18 +02:00
Paul Mackerras	699a0ea082	KVM: PPC: Book3S: Controls for in-kernel sPAPR hypercall handling This provides a way for userspace controls which sPAPR hcalls get handled in the kernel. Each hcall can be individually enabled or disabled for in-kernel handling, except for H_RTAS. The exception for H_RTAS is because userspace can already control whether individual RTAS functions are handled in-kernel or not via the KVM_PPC_RTAS_DEFINE_TOKEN ioctl, and because the numeric value for H_RTAS is out of the normal sequence of hcall numbers. Hcalls are enabled or disabled using the KVM_ENABLE_CAP ioctl for the KVM_CAP_PPC_ENABLE_HCALL capability on the file descriptor for the VM. The args field of the struct kvm_enable_cap specifies the hcall number in args[0] and the enable/disable flag in args[1]; 0 means disable in-kernel handling (so that the hcall will always cause an exit to userspace) and 1 means enable. Enabling or disabling in-kernel handling of an hcall is effective across the whole VM. The ability for KVM_ENABLE_CAP to be used on a VM file descriptor on PowerPC is new, added by this commit. The KVM_CAP_ENABLE_CAP_VM capability advertises that this ability exists. When a VM is created, an initial set of hcalls are enabled for in-kernel handling. The set that is enabled is the set that have an in-kernel implementation at this point. Any new hcall implementations from this point onwards should not be added to the default set without a good reason. No distinction is made between real-mode and virtual-mode hcall implementations; the one setting controls them both. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:17 +02:00
Mihai Caraman	1f0eeb7e1a	KVM: PPC: e500mc: Enhance tlb invalidation condition on vcpu schedule On vcpu schedule, the condition checked for tlb pollution is too loose. The tlb entries of a vcpu become polluted (vs stale) only when a different vcpu within the same logical partition runs in-between. Optimize the tlb invalidation condition keeping last_vcpu per logical partition id. With the new invalidation condition, a guest shows 4% performance improvement on P5020DS while running a memory stress application with the cpu oversubscribed, the other guest running a cpu intensive workload. Guest - old invalidation condition real 3.89 user 3.87 sys 0.01 Guest - enhanced invalidation condition real 3.75 user 3.73 sys 0.01 Host real 3.70 user 1.85 sys 0.00 The memory stress application accesses 4KB pages backed by 75% of available TLB0 entries: char foo[ENTRIES][4096] __attribute__ ((aligned (4096))); int main() { char bar; int i, j; for (i = 0; i < ITERATIONS; i++) for (j = 0; j < ENTRIES; j++) bar = foo[j][0]; return 0; } Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Reviewed-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:16 +02:00
Alexander Graf	f396df3518	KVM: PPC: Book3S PR: Fix sparse endian checks While sending sparse with endian checks over the code base, it triggered at some places that were missing casts or had wrong types. Fix them up. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:16 +02:00
Alexander Graf	da166facd4	KVM: PPC: Book3S PR: Fix ABIv2 on LE We switched to ABIv2 on Little Endian systems now which gets rid of the dotted function names. Branch to the actual functions when we see such a system. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:15 +02:00
Anton Blanchard	ad7d4584a2	KVM: PPC: Assembly functions exported to modules need _GLOBAL_TOC() Both kvmppc_hv_entry_trampoline and kvmppc_entry_trampoline are assembly functions that are exported to modules and also require a valid r2. As such we need to use _GLOBAL_TOC so we provide a global entry point that establishes the TOC (r2). Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:14 +02:00
Anton Blanchard	05a308c722	KVM: PPC: Book3S HV: Fix ABIv2 indirect branch issue To establish addressability quickly, ABIv2 requires the target address of the function being called to be in r12. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:13 +02:00
Alexander Graf	568fccc43f	KVM: PPC: Book3S PR: Handle hyp doorbell exits If we're running PR KVM in HV mode, we may get hypervisor doorbell interrupts. Handle those the same way we treat normal doorbells. Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:12 +02:00
Alexander Graf	f6bf3a6622	KVM: PPC: Book3s HV: Fix tlbie compile error Some compilers complain about uninitialized variables in the compute_tlbie_rb function. When you follow the code path you'll realize that we'll never get to that point, but the compiler isn't all that smart. So just default to 4k page sizes for everything, making the compiler happy and the code slightly easier to read. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Paul Mackerras <paulus@samba.org>	2014-07-28 15:22:12 +02:00
Alexander Graf	fb4188bad0	KVM: PPC: Book3s PR: Disable AIL mode with OPAL When we're using PR KVM we must not allow the CPU to take interrupts in virtual mode, as the SLB does not contain host kernel mappings when running inside the guest context. To make sure we get good performance for non-KVM tasks but still properly functioning PR KVM, let's just disable AIL whenever a vcpu is scheduled in. This is fundamentally different from how we deal with AIL on pSeries type machines where we disable AIL for the whole machine as soon as a single KVM VM is up. The reason for that is easy - on pSeries we do not have control over per-cpu configuration of AIL. We also don't want to mess with CPU hotplug races and AIL configuration, so setting it per CPU is easier and more flexible. This patch fixes running PR KVM on POWER8 bare metal for me. Signed-off-by: Alexander Graf <agraf@suse.de> Acked-by: Paul Mackerras <paulus@samba.org>	2014-07-28 15:22:11 +02:00
Aneesh Kumar K.V	06da28e76b	KVM: PPC: BOOK3S: PR: Emulate instruction counter Writing to IC is not allowed in the privileged mode. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:22:10 +02:00
Aneesh Kumar K.V	8f42ab2749	KVM: PPC: BOOK3S: PR: Emulate virtual timebase register virtual time base register is a per VM, per cpu register that needs to be saved and restored on vm exit and entry. Writing to VTB is not allowed in the privileged mode. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> [agraf: fix compile error] Signed-off-by: Alexander Graf <agraf@suse.de>	2014-07-28 15:21:50 +02:00
Ingo Molnar	5030c69755	Linux 3.16-rc7 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJT1VYNAAoJEHm+PkMAQRiGQJwIAKSYp1Uqz5O/e5r0V1TlZKT4 1B4Njopl57PwSrJQWcGEuH2yHyM896vfPO4L6BJIOfyWzh8kwpQqclDt6uhXoF/v OsO1zb/7/j+n/pDZsePqP9AyIgErsHEBgUbhecDqzjN++ITPcZjQ6TIMPglZaumN jFAdAZuAaEwqAk8jqN2wlm689Fh9MuUEarHXbXLCqu5RgLrWhFGhp/cTWY62aqnZ XfEeQ9KtpRZmlR/IYjerbb1eRH7ZdJsZ88WngLX9dj/JdNxHWBkWQBXGAusXk5Fk y6LsIV3TjyBdrRKJ1Ifyg/2EIXHNBs8HxTFGXpjtp2HPuMLDxZOWOWikb9URtNg= =Fjf4 -----END PGP SIGNATURE----- Merge tag 'v3.16-rc7' into perf/core, to merge in the latest fixes before applying new changes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-07-28 10:00:33 +02:00
Michael Ellerman	9de5cb0f6d	powerpc/perf: Add per-event excludes on Power8 Power8 has a new register (MMCR2), which contains individual freeze bits for each counter. This is an improvement on previous chips as it means we can have multiple events on the PMU at the same time with different exclude_{user,kernel,hv} settings. Previously we had to ensure all events on the PMU had the same exclude settings. The core of the patch is fairly simple. We use the 207S feature flag to indicate that the PMU backend supports per-event excludes, if it's set we skip the generic logic that enforces the equality of excludes between events. We also use that flag to skip setting the freeze bits in MMCR0, the PMU backend is expected to have handled setting them in MMCR2. The complication arises with EBB. The FCxP bits in MMCR2 are accessible R/W to a task using EBB. Which means a task using EBB will be able to see that we are using MMCR2 for freezing, whereas the old logic which used MMCR0 is not user visible. The task can not see or affect exclude_kernel & exclude_hv, so we only need to consider exclude_user. The table below summarises the behaviour both before and after this commit is applied: exclude_user true false ------------------------------------ \| User visible \| N N Before \| Can freeze \| Y Y \| Can unfreeze \| N Y ------------------------------------ \| User visible \| Y Y After \| Can freeze \| Y Y \| Can unfreeze \| Y/N Y ------------------------------------ So firstly I assert that the simple visibility of the exclude_user setting in MMCR2 is a non-issue. The event belongs to the task, and was most likely created by the task. So the exclude_user setting is not privileged information in any way. Secondly, the behaviour in the exclude_user = false case is unchanged. This is important as it is the case that is actually useful, ie. the event is created with no exclude setting and the task uses MMCR2 to implement exclusion manually. For exclude_user = true there is no meaningful change to freezing the event. Previously the task could use MMCR2 to freeze the event, though it was already frozen with MMCR0. With the new code the task can use MMCR2 to freeze the event, though it was already frozen with MMCR2. The only real change is when exclude_user = true and the task tries to use MMCR2 to unfreeze the event. Previously this had no effect, because the event was already frozen in MMCR0. With the new code the task can unfreeze the event in MMCR2, but at some indeterminate time in the future the kernel will overwrite its setting and refreeze the event. Therefore my final assertion is that any task using exclude_user = true and also fiddling with MMCR2 was deeply confused before this change, and remains so after it. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:30:58 +10:00
Michael Ellerman	8abd818fc7	powerpc/perf: Pass the struct perf_events down to compute_mmcr() To support per-event exclude settings on Power8 we need access to the struct perf_events in compute_mmcr(). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:30:47 +10:00
Michael Ellerman	79a4cb28a0	powerpc/perf: Clear all MMCR settings before calling compute_mmcr() Because we reuse cpuhw->mmcr on each call to compute_mmcr() there's a risk that we could forget to set one of the values and use whatever value was in there previously. Currently all the implementations are careful to set all the values, but it's safer to clear them all before we call compute_mmcr(). Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:11:34 +10:00
Michael Ellerman	633440f18f	powerpc: Document how we set AIL on guest kernels I spent ten minutes scratching my head, trying to work out where we enabled relocation on interrupts for guest kernels. Expand the doco to make it clear. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:11:27 +10:00
Michael Ellerman	8e83e9053f	powerpc/pseries: Switch pseries drivers to use machine_xxx_initcall() A lot of the code in platforms/pseries is using non-machine initcalls. That means if a kernel built with pseries support runs on another platform, for example powernv, the initcalls will still run. Most of these cases are OK, though sometimes only due to luck. Some were having more effect: * hcall_inst_init - Checking FW_FEATURE_LPAR which is set on ps3 & celleb. * mobility_sysfs_init - created sysfs files unconditionally - but no effect due to ENOSYS from rtas_ibm_suspend_me() * apo_pm_init - created sysfs, allows write - nothing checks the value written to though * alloc_dispatch_log_kmem_cache - creating kmem_cache on non-pseries machines Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:11:26 +10:00
Michael Ellerman	b14726c51c	powerpc/powernv: Switch powernv drivers to use machine_xxx_initcall() A lot of the code in platforms/powernv is using non-machine initcalls. That means if a kernel built with powernv support runs on another platform, for example pseries, the initcalls will still run. That is usually OK, because the initcalls will check for something in the device tree or elsewhere before doing anything, so on other platforms they will usually just return. But it's fishy for powernv code to be running on other platforms, so switch them all to be machine initcalls. If we want any of them to run on other platforms in future they should move to sysdev. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:11:26 +10:00
Michael Ellerman	8d3c941e24	powerpc: Add machine_early_initcall() Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:11:25 +10:00
Michael Ellerman	9daf112bd4	powerpc: Remove misleading DISABLE_INTS DISABLE_INTS has a long and storied history, but for some time now it has not actually disabled interrupts. For the open-coded exception handlers, just stop using it, instead call RECONCILE_IRQ_STATE directly. This has the benefit of removing a level of indirection, and making it clear that r10 & r11 are used at that point. For the addition case we still need a macro, so rename it to clarify what it actually does. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:11:24 +10:00
Michael Ellerman	a1d711c53f	powerpc: Document register clobbering in EXCEPTION_COMMON() Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:11:24 +10:00
Michael Ellerman	144beb2f53	powerpc: Update comments in irqflags.h The comment on TRACE_ENABLE_INTS is incorrect, and appears to have always been incorrect since the code was merged. It probably came from an original out-of-tree patch. Replace it with something that's correct. Also propagate the message to RECONCILE_IRQ_STATE(), because it's potentially subtle. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:11:23 +10:00
Michael Ellerman	4e2bf01b21	powerpc: Move bad_stack() below the fwnmi_data_area At the moment the allmodconfig build is failing because we run out of space between altivec_assist() at 0x5700 and the fwnmi_data_area at 0x7000. Fixing it permanently will take some more work, but a quick fix is to move bad_stack() below the fwnmi_data_area. That gives us just enough room with everything enabled. bad_stack() is called from the common exception handlers, but it's a non-conditional branch, so we have plenty of scope to move it further way. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:11:22 +10:00
Michael Ellerman	1e07a0a033	powerpc: Remove CLASSIC_PPC We have a strange #define in cputable.h called CLASSIC_PPC. Although it is defined for 32 & 64bit, it's only used for 32bit and it's basically a duplicate of CONFIG_PPC_BOOK3S_32, so let's use the latter. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:11:22 +10:00
Michael Ellerman	804ece07e9	powerpc: Remove CONFIG_POWER4 Although the name CONFIG_POWER4 suggests that it controls support for power4 cpus, this symbol is actually misnamed. It is a historical wart from the powermac code, which used to support building a 32-bit kernel for power4. CONFIG_POWER4 was used in that context to guard code that was 64-bit only. In the powermac code we can just use CONFIG_PPC64 instead, and in other places it is a synonym for CONFIG_PPC_BOOK3S_64. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:10:26 +10:00
Michael Ellerman	0f369103ce	powerpc: Remove power3 from comments There are still a few occurences where it remains, because it helps to explain something that persists. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:10:26 +10:00
Michael Ellerman	086dddc15f	powerpc: Remove oprofile RS64 support We no longer support these cpus, so we don't need oprofile support for them either. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:10:25 +10:00
Michael Ellerman	c3993f1007	powerpc: Remove CONFIG_POWER3 Now that we have dropped power3 support we can remove CONFIG_POWER3. The usage in pgtable_32.c was already dead code as CONFIG_POWER3 was not selectable on PPC32. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:10:24 +10:00
Michael Ellerman	cec15488c7	powerpc: Pull out ksp_vsid logic into a helper The previous patch left a bit of a wart in copy_process(). Clean it up a bit by moving the logic out into a helper. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:10:24 +10:00
Michael Ellerman	13b3d13b81	powerpc: Remove MMU_FTR_SLB We now only support cpus that use an SLB, so we don't need an MMU feature to indicate that. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:10:23 +10:00
Michael Ellerman	376af5947c	powerpc: Remove STAB code Old cpus didn't have a Segment Lookaside Buffer (SLB), instead they had a Segment Table (STAB). Now that we've dropped support for those cpus, we can remove the STAB support entirely. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:10:22 +10:00
Michael Ellerman	468a33028e	powerpc: Drop support for pre-POWER4 cpus We inadvertently broke power3 support back in 3.4 with commit `f5339277eb` "powerpc: Remove FW_FEATURE ISERIES from arch code". No one noticed until at least 3.9. By then we'd also broken it with the optimised memcpy, copy_to/from_user and clear_user routines. We don't want to add any more complexity to those just to support ancient cpus, so it seems like it's a good time to drop support for power3 and earlier. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:09:23 +10:00
Michael Ellerman	2061f7beaa	powerpc: Use standard macros for sys_sigpending() & sys_old_getrlimit() Currently we have sys_sigpending and sys_old_getrlimit defined to use COMPAT_SYS() in systbl.h, but then both are #defined to sys_ni_syscall in systbl.S. This seems to have been done when ppc and ppc64 were merged, in commit `9994a33` "Introduce entry_{32,64}.S, misc_{32,64}.S, systbl.S". AFAICS there's no longer (or never was) any need for this, we can just use SYSX() for both and remove the #defines to sys_ni_syscall. The expansion before was: #define COMPAT_SYS(func) .llong .sys_##func,.compat_sys_##func #define sys_old_getrlimit sys_ni_syscall COMPAT_SYS(old_getrlimit) => .llong .sys_old_getrlimit,.compat_sys_old_getrlimit => .llong .sys_ni_syscall,.compat_sys_old_getrlimit After is: #define SYSX(f, f3264, f32) .llong .f,.f3264 SYSX(sys_ni_syscall, compat_sys_old_getrlimit, sys_old_getrlimit) => .llong .sys_ni_syscall,.compat_sys_old_getrlimit ie. they are equivalent. Finally both COMPAT_SYS() and SYSX() evaluate to sys_ni_syscall in the Cell SPU code. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 14:09:23 +10:00
Benjamin Herrenschmidt	cdc2652ee5	Merge branch 'merge' into next Bring in some important fixes from the 3.16 branch	2014-07-28 13:41:12 +10:00
Thomas Falcon	396a34340c	powerpc: Fix endianness of flash_block_list in rtas_flash The function rtas_flash_firmware passes the address of a data structure, flash_block_list, when making the update-flash-64-and-reboot rtas call. While the endianness of the address is handled correctly, the endianness of the data is not. This patch ensures that the data in flash_block_list is big endian when passed to rtas on little endian hosts. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 11:30:54 +10:00
Vasant Hegde	fa952c54ba	powerpc/powernv: Change BUG_ON to WARN_ON in elog code We can continue to read the error log (up to MAX size) even if we get the elog size more than MAX size. Hence change BUG_ON to WARN_ON. Also updated error message. Reported-by: Gopesh Kumar Chaudhary <gopchaud@in.ibm.com> Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Acked-by: Deepthi Dharwar <deepthi@linux.vnet.ibm.com> Acked-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-28 11:30:54 +10:00
Kristina Martšenko	ad8c12eea0	staging: silicom: remove driver The driver hasn't been cleaned up and it doesn't look like anyone is working on it anymore (including the original author). So remove it. If someone wants to work on cleaning the driver up and moving it out of staging, this commit can be reverted. In addition, since this removes the CONFIG_NET_VENDOR_SILICOM config symbol, remove the symbol from all defconfig files that reference it. Signed-off-by: Kristina Martšenko <kristina.martsenko@gmail.com> Cc: Daniel Cotey <puff65537@bansheeslibrary.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-07-27 11:24:37 -07:00
Alexander Popov	ec1f0c9666	dmaengine: mpc512x: register for device tree channel lookup Register the controller for device tree based lookup of DMA channels (non-fatal for backwards compatibility with older device trees) and provide the '#dma-cells' property in the shared mpc5121.dtsi file Signed-off-by: Alexander Popov <a13xp0p0v88@gmail.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com>	2014-07-26 00:21:42 +05:30
Grant Likely	259092a35c	of: Reorder device tree changes and notifiers Currently, devicetree reconfig notifiers get emitted before the change is applied to the tree, but that behaviour is problematic if the receiver wants the determine the new state of the tree. The current users don't care, but the changeset code to follow will be making multiple changes at once. Reorder notifiers to get emitted after the change has been applied to the tree so that callbacks see the new tree state. At the same time, fixup the existing callbacks to expect the new order. There are a few callbacks that compare the old and new values of a changed property. Put both property pointers into the of_prop_reconfig structure. The current notifiers also allow the notifier callback to fail and cancel the change to the tree, but that feature isn't actually used. It really isn't valid to ignore a tree modification provided by firmware anyway, so remove the ability to cancel a change to the tree. Signed-off-by: Grant Likely <grant.likely@linaro.org> Cc: Nathan Fontenot <nfont@austin.ibm.com>	2014-07-23 17:08:13 -06:00
Grant Likely	a25095d451	of: Move dynamic node fixups out of powerpc and into common code PowerPC does an odd thing with dynamic nodes. It uses a notifier to catch new node additions and set some of the values like name and type. This makes no sense since that same code can be put directly into of_attach_node(). Besides, all dynamic node users need this, not just powerpc. Fix this problem by moving the logic out of arch/powerpc and into drivers/of/dynamic.c. It is also important to remove this notifier because we want to move the firing of notifiers from before the tree is modified to after so that the receiver gets a consistent view of the tree, but that is incompatible with notifiers that modify the node. Signed-off-by: Grant Likely <grant.likely@linaro.org> Cc: Nathan Fontenot <nfont@austin.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-23 17:05:46 -06:00
Linus Torvalds	7442cf9ac2	Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc fixes from Ben Herrenschmidt: "Here is a handful of powerpc fixes for 3.16. They are all pretty simple and self contained and should still make this release" * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc: use _GLOBAL_TOC for memmove powerpc/pseries: dynamically added OF nodes need to call of_node_init powerpc: subpage_protect: Increase the array size to take care of 64TB powerpc: Fix bugs in emulate_step() powerpc: Disable doorbells on Power8 DD1.x	2014-07-23 15:34:13 -07:00
Thomas Gleixner	4a0e637738	clocksource: Get rid of cycle_last cycle_last was added to the clocksource to support the TSC validation. We moved that to the core code, so we can get rid of the extra copy. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	2014-07-23 15:01:52 -07:00
Thomas Gleixner	f2dec1eae8	powerpc: cell: Use ktime_get_ns() Replace the ever recurring: ts = ktime_get_ts(); ns = timespec_to_ns(&ts); with ns = ktime_get_ns(); Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: John Stultz <john.stultz@linaro.org>	2014-07-23 10:18:07 -07:00
Michael Ellerman	8903461c9b	powerpc/perf: Fix MMCR2 handling for EBB In the recent commit `b50a6c584b` "Clear MMCR2 when enabling PMU", I screwed up the handling of MMCR2 for tasks using EBB. We must make sure we set MMCR2 before ebb_switch_in(), otherwise we overwrite the value of MMCR2 that userspace may have written. That potentially breaks a task that uses EBB and manually uses MMCR2 for event freezing. Fixes: `b50a6c584b` ("powerpc/perf: Clear MMCR2 when enabling PMU") Cc: stable@vger.kernel.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-23 17:16:47 +10:00
Li Zhong	6f5405bc2e	powerpc: use _GLOBAL_TOC for memmove memmove may be called from module code copy_pages(btrfs), and it may call memcpy, which may call back to C code, so it needs to use _GLOBAL_TOC to set up r2 correctly. This fixes following error when I tried to boot an le guest: Vector: 300 (Data Access) at [c000000073f97210] pc: c000000000015004: enable_kernel_altivec+0x24/0x80 lr: c000000000058fbc: enter_vmx_copy+0x3c/0x60 sp: c000000073f97490 msr: 8000000002009033 dar: d000000001d50170 dsisr: 40000000 current = 0xc0000000734c0000 paca = 0xc00000000fff0000 softe: 0 irq_happened: 0x01 pid = 815, comm = mktemp enter ? for help [c000000073f974f0] c000000000058fbc enter_vmx_copy+0x3c/0x60 [c000000073f97510] c000000000057d34 memcpy_power7+0x274/0x840 [c000000073f97610] d000000001c3179c copy_pages+0xfc/0x110 [btrfs] [c000000073f97660] d000000001c3c248 memcpy_extent_buffer+0xe8/0x160 [btrfs] [c000000073f97700] d000000001be4be8 setup_items_for_insert+0x208/0x4a0 [btrfs] [c000000073f97820] d000000001be50b4 btrfs_insert_empty_items+0xf4/0x140 [btrfs] [c000000073f97890] d000000001bfed30 insert_with_overflow+0x70/0x180 [btrfs] [c000000073f97900] d000000001bff174 btrfs_insert_dir_item+0x114/0x2f0 [btrfs] [c000000073f979a0] d000000001c1f92c btrfs_add_link+0x10c/0x370 [btrfs] [c000000073f97a40] d000000001c20e94 btrfs_create+0x204/0x270 [btrfs] [c000000073f97b00] c00000000026d438 vfs_create+0x178/0x210 [c000000073f97b50] c000000000270a70 do_last+0x9f0/0xe90 [c000000073f97c20] c000000000271010 path_openat+0x100/0x810 [c000000073f97ce0] c000000000272ea8 do_filp_open+0x58/0xd0 [c000000073f97dc0] c00000000025ade8 do_sys_open+0x1b8/0x300 [c000000073f97e30] c00000000000a008 syscall_exit+0x0/0x7c Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-22 15:56:04 +10:00
Tyrel Datwyler	97a9a7179a	powerpc/pseries: dynamically added OF nodes need to call of_node_init Commit `75b57ecf9` refactored device tree nodes to use kobjects such that they can be exposed via /sysfs. A secondary commit `0829f6d1f` furthered this rework by moving the kobect initialization logic out of of_node_add into its own of_node_init function. The inital commit removed the existing kref_init calls in the pseries dlpar code with the assumption kobject initialization would occur in of_node_add. The second commit had the side effect of triggering a BUG_ON during DLPAR, migration and suspend/resume operations as a result of dynamically added nodes being uninitialized. This patch fixes this by adding of_node_init calls in place of the previously removed kref_init calls. Fixes: `0829f6d1f6` ("of: device_node kobject lifecycle fixes") Cc: stable@vger.kernel.org Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Acked-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Acked-by: Grant Likely <grant.likely@linaro.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-22 15:55:59 +10:00
Aneesh Kumar K.V	dad6f37c26	powerpc: subpage_protect: Increase the array size to take care of 64TB We now support TASK_SIZE of 16TB, hence the array should be 8. Fixes the below crash: Unable to handle kernel paging request for data at address 0x000100bd Faulting instruction address: 0xc00000000004f914 cpu 0x13: Vector: 300 (Data Access) at [c000000fea75fa90] pc: c00000000004f914: .sys_subpage_prot+0x2d4/0x5c0 lr: c00000000004fb5c: .sys_subpage_prot+0x51c/0x5c0 sp: c000000fea75fd10 msr: 9000000000009032 dar: 100bd dsisr: 40000000 current = 0xc000000fea6ae490 paca = 0xc00000000fb8ab00 softe: 0 irq_happened: 0x00 pid = 8237, comm = a.out enter ? for help [c000000fea75fe30] c00000000000a164 syscall_exit+0x0/0x98 Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-22 15:55:55 +10:00
Paul Mackerras	e698b96678	powerpc: Fix bugs in emulate_step() This fixes some bugs in emulate_step(). First, the setting of the carry bit for the arithmetic right-shift instructions was not correct on 64-bit machines because we were masking with a mask of type int rather than unsigned long. Secondly, the sld (shift left doubleword) instruction was using the wrong instruction field for the register containing the shift count. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-22 15:55:51 +10:00
Joel Stanley	bd6ba3518f	powerpc: Disable doorbells on Power8 DD1.x These processors do not currently support doorbell IPIs, so remove them from the feature list if we are at DD 1.xx for the 0x004d part. This fixes a regression caused by `d4e58e5928` (powerpc/powernv: Enable POWER8 doorbell IPIs). With that patch the kernel would hang at boot when calling smp_call_function_many, as the doorbell would not be received by the target CPUs: .smp_call_function_many+0x2bc/0x3c0 (unreliable) .on_each_cpu_mask+0x30/0x100 .cpuidle_register_driver+0x158/0x1a0 .cpuidle_register+0x2c/0x110 .powernv_processor_idle_init+0x23c/0x2c0 .do_one_initcall+0xd4/0x260 .kernel_init_freeable+0x25c/0x33c .kernel_init+0x1c/0x120 .ret_from_kernel_thread+0x58/0x7c Fixes: `d4e58e5928` (powerpc/powernv: Enable POWER8 doorbell IPIs) Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-22 15:55:24 +10:00
Linus Torvalds	5b2b9d7761	These are mostly PPC changes for 3.16-new things. However, there is an x86 change too and it is a regression from 3.14. As it only affects nested virtualization and there were other changes in this area in 3.16, I am not nominating it for 3.15-stable. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJTyTTfAAoJEBvWZb6bTYbyytIQAJare/EWQmNBDK57EcJBIlJS 6MW2XnASEW+KCoUw0+u3sm9eaRXQdmJRb1Aw5zxTiUIR3ZSI8MDSQr1XxEgTAOtE vFZjonPwlbnE8edLMhH3v/6/v9oO7bwNTDYeOE2pKPRfgPRjFmj1QUOJkvzRnRwj kS5M4RtI+VqhdyJW8f4HaWqoRaOAISp3ZjQUJQdab3DWsf9ZpNjwLNjKzGZKNvIN Klcpi7JH32zawUfqnAvph/7NsrBGrpFRE+j+JU9LLnD9PehuXwqZbWh01g2Anbq2 TKVrmXW+YnoD1IZsDw7r/14FaeRweV7yALA/eA9F4KfSyF2Qm9RbjVVdrUYz0CHV aIl0cZeZM8xRCLy/ZWj+dOQ23RWelZaslHSpshKOznoRsuuvVwpx93zVtRwlw2dx 4WJ2A5gYA+ZUQ7eWjk83381JXkbRDUb3cO+NL8t9GnFctCJzT/gQHjqu15f7TJ2Q gKhmeciKOS3xY4sQ+ti6gv8CwIFYqgdTzkxedxSgS9xpiAmw9v57V7WukXoXB6zl AyjEAk9FFOeBZ5nXs0ObK5LKjI+MJoZ3X0bin7PCuT6dFrIA2yHvo5EgMvOcUua9 8Tu9L8sWv/JsKjuqebkKxekAKvv0CV35Q8OsLpEF6RI0eXyiXy2extk1LzUuK9cx ZVYbN263++En/tgH2AJM =Vdqn -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "These are mostly PPC changes for 3.16-new things. However, there is an x86 change too and it is a regression from 3.14. As it only affects nested virtualization and there were other changes in this area in 3.16, I am not nominating it for 3.15-stable" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Check for nested events if there is an injectable interrupt KVM: PPC: RTAS: Do byte swaps explicitly KVM: PPC: Book3S PR: Fix ABIv2 on LE KVM: PPC: Assembly functions exported to modules need _GLOBAL_TOC() PPC: Add _GLOBAL_TOC for 32bit KVM: PPC: BOOK3S: HV: Use base page size when comparing against slb value KVM: PPC: Book3E: Unlock mmu_lock when setting caching atttribute	2014-07-21 11:19:18 -07:00
Linus Torvalds	d057190925	Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Thomas Gleixner: "The locking department delivers: - A rather large and intrusive bundle of fixes to address serious performance regressions introduced by the new rwsem / mcs technology. Simpler solutions have been discussed, but they would have been ugly bandaids with more risk than doing the right thing. - Make the rwsem spin on owner technology opt-in for architectures and enable it only on the known to work ones. - A few fixes to the lockdep userspace library" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/rwsem: Add CONFIG_RWSEM_SPIN_ON_OWNER locking/mutex: Disable optimistic spinning on some architectures locking/rwsem: Reduce the size of struct rw_semaphore locking/rwsem: Rename 'activity' to 'count' locking/spinlocks/mcs: Micro-optimize osq_unlock() locking/spinlocks/mcs: Introduce and use init macro and function for osq locks locking/spinlocks/mcs: Convert osq lock to atomic_t to reduce overhead locking/spinlocks/mcs: Rename optimistic_spin_queue() to optimistic_spin_node() locking/rwsem: Allow conservative optimistic spinning when readers have lock tools/liblockdep: Account for bitfield changes in lockdeps lock_acquire tools/liblockdep: Remove debug print left over from development tools/liblockdep: Fix comparison of a boolean value with a value of 2	2014-07-19 06:27:55 -10:00
Steven Rostedt (Red Hat)	96d4f43e3d	powerpc/ftrace: Add call to ftrace_graph_is_dead() in function graph code ftrace_stop() is going away as it disables parts of function tracing that affects users that should not be affected. But ftrace_graph_stop() is built on ftrace_stop(). Here's another example of killing all of function tracing because something went wrong with function graph tracing. Instead of disabling all users of function tracing on function graph error, disable only function graph tracing. To do this, the arch code must call ftrace_graph_is_dead() before it implements function graph. Cc: Anton Blanchard <anton@samba.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-07-18 13:56:56 -04:00
Bart Van Assche	aa3fc09078	tgt: defconfig cleanup Because of the removal of the scsi_tgt kernel module, the kbuild variables CONFIG_SCSI_TGT, CONFIG_SCSI_SRP_TGT_ATTRS and CONFIG_SCSI_FC_TGT_ATTRS are obsolete. This patch removes these variables. This patch is the result of the following command: find -name '*defconfig' \| while read f; do grep -vwE 'CONFIG_SCSI_TGT\|CONFIG_SCSI_SRP_TGT_ATTRS\|CONFIG_SCSI_FC_TGT_ATTRS\|CONFIG_SRP' $f >/tmp/t && mv /tmp/t $f; done Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de>	2014-07-17 22:07:44 +02:00
Davidlohr Bueso	3a6bfbc91d	arch, locking: Ciao arch_mutex_cpu_relax() The arch_mutex_cpu_relax() function, introduced by `34b133f`, is hacky and ugly. It was added a few years ago to address the fact that common cpu_relax() calls include yielding on s390, and thus impact the optimistic spinning functionality of mutexes. Nowadays we use this function well beyond mutexes: rwsem, qrwlock, mcs and lockref. Since the macro that defines the call is in the mutex header, any users must include mutex.h and the naming is misleading as well. This patch (i) renames the call to cpu_relax_lowlatency ("relax, but only if you can do it with very low latency") and (ii) defines it in each arch's asm/processor.h local header, just like for regular cpu_relax functions. On all archs, except s390, cpu_relax_lowlatency is simply cpu_relax, and thus we can take it out of mutex.h. While this can seem redundant, I believe it is a good choice as it allows us to move out arch specific logic from generic locking primitives and enables future(?) archs to transparently define it, similarly to System Z. Signed-off-by: Davidlohr Bueso <davidlohr@hp.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Anton Blanchard <anton@samba.org> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Bharat Bhushan <r65777@freescale.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chen Liqin <liqin.linux@gmail.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Chris Zankel <chris@zankel.net> Cc: David Howells <dhowells@redhat.com> Cc: David S. Miller <davem@davemloft.net> Cc: Deepthi Dharwar <deepthi@linux.vnet.ibm.com> Cc: Dominik Dingel <dingel@linux.vnet.ibm.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: James Hogan <james.hogan@imgtec.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Joe Perches <joe@perches.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Joseph Myers <joseph@codesourcery.com> Cc: Kees Cook <keescook@chromium.org> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Neuling <mikey@neuling.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Mikael Starvik <starvik@axis.com> Cc: Nicolas Pitre <nico@linaro.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Paul Burton <paul.burton@imgtec.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Qais Yousef <qais.yousef@imgtec.com> Cc: Qiaowei Ren <qiaowei.ren@intel.com> Cc: Rafael Wysocki <rafael.j.wysocki@intel.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Steven Miao <realmz6@gmail.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Stratos Karafotis <stratosk@semaphore.gr> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Vasily Kulikov <segoon@openwall.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com> Cc: Waiman Long <Waiman.Long@hp.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Wolfram Sang <wsa@the-dreams.de> Cc: adi-buildroot-devel@lists.sourceforge.net Cc: linux390@de.ibm.com Cc: linux-alpha@vger.kernel.org Cc: linux-am33-list@redhat.com Cc: linux-arm-kernel@lists.infradead.org Cc: linux-c6x-dev@linux-c6x.org Cc: linux-cris-kernel@axis.com Cc: linux-hexagon@vger.kernel.org Cc: linux-ia64@vger.kernel.org Cc: linux@lists.openrisc.net Cc: linux-m32r-ja@ml.linux-m32r.org Cc: linux-m32r@ml.linux-m32r.org Cc: linux-m68k@lists.linux-m68k.org Cc: linux-metag@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: linux-parisc@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-s390@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: linux-xtensa@linux-xtensa.org Cc: sparclinux@vger.kernel.org Link: http://lkml.kernel.org/r/1404079773.2619.4.camel@buesod1.americas.hpqcorp.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-07-17 12:32:47 +02:00
Linus Torvalds	bcf44bfe5e	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "A cpufreq lockup fix and a compiler warning fix" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix compiler warnings x86, tsc: Fix cpufreq lockup	2014-07-16 10:11:02 -10:00
Peter Zijlstra	4badad352a	locking/mutex: Disable optimistic spinning on some architectures The optimistic spin code assumes regular stores and cmpxchg() play nice; this is found to not be true for at least: parisc, sparc32, tile32, metag-lock1, arc-!llsc and hexagon. There is further wreckage, but this in particular seemed easy to trigger, so blacklist this. Opt in for known good archs. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Reported-by: Mikulas Patocka <mpatocka@redhat.com> Cc: David Miller <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: James Bottomley <James.Bottomley@hansenpartnership.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Jason Low <jason.low2@hp.com> Cc: Waiman Long <waiman.long@hp.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: John David Anglin <dave.anglin@bell.net> Cc: James Hogan <james.hogan@imgtec.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: stable@vger.kernel.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Russell King <linux@arm.linux.org.uk> Cc: Will Deacon <will.deacon@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: sparclinux@vger.kernel.org Link: http://lkml.kernel.org/r/20140606175316.GV13930@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-07-16 14:57:07 +02:00
Linus Torvalds	5615f9f822	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Bluetooth pairing fixes from Johan Hedberg. 2) ieee80211_send_auth() doesn't allocate enough tail room for the SKB, from Max Stepanov. 3) New iwlwifi chip IDs, from Oren Givon. 4) bnx2x driver reads wrong PCI config space MSI register, from Yijing Wang. 5) IPV6 MLD Query validation isn't strong enough, from Hangbin Liu. 6) Fix double SKB free in openvswitch, from Andy Zhou. 7) Fix sk_dst_set() being racey with UDP sockets, leading to strange crashes, from Eric Dumazet. 8) Interpret the NAPI budget correctly in the new systemport driver, from Florian Fainelli. 9) VLAN code frees percpu stats in the wrong place, leading to crashes in the get stats handler. From Eric Dumazet. 10) TCP sockets doing a repair can crash with a divide by zero, because we invoke tcp_push() with an MSS value of zero. Just skip that part of the sendmsg paths in repair mode. From Christoph Paasch. 11) IRQ affinity bug fixes in mlx4 driver from Amir Vadai. 12) Don't ignore path MTU icmp messages with a zero mtu, machines out there still spit them out, and all of our per-protocol handlers for PMTU can cope with it just fine. From Edward Allcutt. 13) Some NETDEV_CHANGE notifier invocations were not passing in the correct kind of cookie as the argument, from Loic Prylli. 14) Fix crashes in long multicast/broadcast reassembly, from Jon Paul Maloy. 15) ip_tunnel_lookup() doesn't interpret wildcard keys correctly, fix from Dmitry Popov. 16) Fix skb->sk assigned without taking a reference to 'sk' in appletalk, from Andrey Utkin. 17) Fix some info leaks in ULP event signalling to userspace in SCTP, from Daniel Borkmann. 18) Fix deadlocks in HSO driver, from Olivier Sobrie. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (93 commits) hso: fix deadlock when receiving bursts of data hso: remove unused workqueue net: ppp: don't call sk_chk_filter twice mlx4: mark napi id for gro_skb bonding: fix ad_select module param check net: pppoe: use correct channel MTU when using Multilink PPP neigh: sysctl - simplify address calculation of gc_* variables net: sctp: fix information leaks in ulpevent layer MAINTAINERS: update r8169 maintainer net: bcmgenet: fix RGMII_MODE_EN bit tipc: clear 'next'-pointer of message fragments before reassembly r8152: fix r8152_csum_workaround function be2net: set EQ DB clear-intr bit in be_open() GRE: enable offloads for GRE farsync: fix invalid memory accesses in fst_add_one() and fst_init_card() igb: do a reset on SR-IOV re-init if device is down igb: Workaround for i210 Errata 25: Slow System Clock usbnet: smsc95xx: add reset_resume function with reset operation dp83640: Always decode received status frames r8169: disable L23 ...	2014-07-15 08:42:52 -07:00
Anton Blanchard	c49f63530b	powernv: Add OPAL tracepoints Knowing how long we spend in firmware calls is an important part of minimising OS jitter. This patch adds tracepoints to each OPAL call. If tracepoints are enabled we branch out to a common routine that calls an entry and exit tracepoint. This allows us to write tools that monitor the frequency and duration of OPAL calls, eg: name count total(ms) min(ms) max(ms) avg(ms) period(ms) OPAL_HANDLE_INTERRUPT 5 0.199 0.037 0.042 0.040 12547.545 OPAL_POLL_EVENTS 204 2.590 0.012 0.036 0.013 2264.899 OPAL_PCI_MSI_EOI 2830 3.066 0.001 0.005 0.001 81.166 We use jump labels if configured, which means we only add a single nop instruction to every OPAL call when the tracepoints are disabled. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 16:06:08 +10:00
Anton Blanchard	aaad422482	powerpc/pseries: Optimise hcall tracepoints Now that we execute the hcall tracepoint entry and exit code out of line, we can use the same stack across both functions. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 16:06:03 +10:00
Anton Blanchard	cc1adb5f32	powerpc/pseries: Use jump labels for hcall tracepoints hcall tracepoints add quite a few instructions to our hcall path: plpar_hcall: mr r2,r2 mfcr r0 stw r0,8(r1) b 164 <---- start ld r12,0(r2) std r12,32(r1) cmpdi r12,0 beq 164 <---- end ... We have an unconditional branch that gets noped out during boot and a load/compare/branch. We also store the tracepoint value to the stack for the hcall_exit path to use. By using jump labels we can simplify this to just a single nop that gets replaced with a branch when the tracepoint is enabled: plpar_hcall: mr r2,r2 mfcr r0 stw r0,8(r1) nop <---- ... If jump labels are not enabled, we fall back to the old method. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 16:05:58 +10:00
Alexey Kardashevskiy	8fa5d4547e	powerpc/powernv: Add a page size parameter to pnv_pci_setup_iommu_table() Since a TCE page size can be other than 4K, make it configurable for P5IOC2 and IODA PHBs. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 16:05:53 +10:00
Alexey Kardashevskiy	bc32057ed5	powerpc/powernv: Use it_page_shift in TCE build This makes use of iommu_table::it_page_shift instead of TCE_SHIFT and TCE_RPN_SHIFT hardcoded values. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 16:05:48 +10:00
Alexey Kardashevskiy	b0376c9b08	powerpc/powernv: Use it_page_shift for TCE invalidation This fixes IODA1/2 to use it_page_shift as it may be bigger than 4K. This changes involved constant values to use "ull" modifier. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 16:05:43 +10:00
Benjamin Herrenschmidt	5b97259220	Merge branch 'merge' into next	2014-07-11 15:38:12 +10:00
Anton Blanchard	f56029410a	powerpc/perf: Never program book3s PMCs with values >= 0x80000000 We are seeing a lot of PMU warnings on POWER8: Can't find PMC that caused IRQ Looking closer, the active PMC is 0 at this point and we took a PMU exception on the transition from negative to 0. Some versions of POWER8 have an issue where they edge detect and not level detect PMC overflows. A number of places program the PMC with (0x80000000 - period_left), where period_left can be negative. We can either fix all of these or just ensure that period_left is always >= 1. This patch takes the second option. Cc: <stable@vger.kernel.org> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 13:50:47 +10:00
Guenter Roeck	fb43e8477e	powerpc: Disable RELOCATABLE for COMPILE_TEST with PPC64 powerpc:allmodconfig has been failing for some time with the following error. arch/powerpc/kernel/exceptions-64s.S: Assembler messages: arch/powerpc/kernel/exceptions-64s.S:1312: Error: attempt to move .org backwards make[1]: *** [arch/powerpc/kernel/head_64.o] Error 1 A number of attempts to fix the problem by moving around code have been unsuccessful and resulted in failed builds for some configurations and the discovery of toolchain bugs. Fix the problem by disabling RELOCATABLE for COMPILE_TEST builds instead. While this is less than perfect, it avoids substantial code changes which would otherwise be necessary just to make COMPILE_TEST builds happy and might have undesired side effects. Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 12:55:09 +10:00
Joel Stanley	b50a6c584b	powerpc/perf: Clear MMCR2 when enabling PMU On POWER8 when switching to a KVM guest we set bits in MMCR2 to freeze the PMU counters. Aside from on boot they are then never reset, resulting in stuck perf counters for any user in the guest or host. We now set MMCR2 to 0 whenever enabling the PMU, which provides a sane state for perf to use the PMU counters under either the guest or the host. This was manifesting as a bug with ppc64_cpu --frequency: $ sudo ppc64_cpu --frequency WARNING: couldn't run on cpu 0 WARNING: couldn't run on cpu 8 ... WARNING: couldn't run on cpu 144 WARNING: couldn't run on cpu 152 min: 18446744073.710 GHz (cpu -1) max: 0.000 GHz (cpu -1) avg: 0.000 GHz The command uses a perf counter to measure CPU cycles over a fixed amount of time, in order to approximate the frequency of the machine. The counters were returning zero once a guest was started, regardless of weather it was still running or had been shut down. By dumping the value of MMCR2, it was observed that once a guest is running MMCR2 is set to 1s - which stops counters from running: $ sudo sh -c 'echo p > /proc/sysrq-trigger' CPU: 0 PMU registers, ppmu = POWER8 n_counters = 6 PMC1: 5b635e38 PMC2: 00000000 PMC3: 00000000 PMC4: 00000000 PMC5: 1bf5a646 PMC6: 5793d378 PMC7: deadbeef PMC8: deadbeef MMCR0: 0000000080000000 MMCR1: 000000001e000000 MMCRA: 0000040000000000 MMCR2: fffffffffffffc00 EBBHR: 0000000000000000 EBBRR: 0000000000000000 BESCR: 0000000000000000 SIAR: 00000000000a51cc SDAR: c00000000fc40000 SIER: 0000000001000000 This is done unconditionally in book3s_hv_interrupts.S upon entering the guest, and the original value is only save/restored if the host has indicated it was using the PMU. This is okay, however the user of the PMU needs to ensure that it is in a defined state when it starts using it. Fixes: `e05b9b9e5c` ("powerpc/perf: Power8 PMU support") Cc: stable@vger.kernel.org Signed-off-by: Joel Stanley <joel@jms.id.au> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 12:55:08 +10:00
Joel Stanley	4d9690dd56	powerpc/perf: Add PPMU_ARCH_207S define Instead of separate bits for every POWER8 PMU feature, have a single one for v2.07 of the architecture. This saves us adding a MMCR2 define for a future patch. Cc: stable@vger.kernel.org Signed-off-by: Joel Stanley <joel@jms.id.au> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 12:55:07 +10:00
Joel Stanley	f73128f4f6	powerpc/kvm: Remove redundant save of SIER AND MMCR2 These two registers are already saved in the block above. Aside from being unnecessary, by the time we get down to the second save location r8 no longer contains MMCR2, so we are clobbering the saved value with PMC5. MMCR2 primarily consists of counter freeze bits. So restoring the value of PMC5 into MMCR2 will most likely have the effect of freezing counters. Fixes: `72cde5a88d` ("KVM: PPC: Book3S HV: Save/restore host PMU registers that are new in POWER8") Cc: stable@vger.kernel.org Signed-off-by: Joel Stanley <joel@jms.id.au> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Paul Mackerras <paulus@samba.org> Reviewed-by: Alexander Graf <agraf@suse.de> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 12:55:07 +10:00
Preeti U Murthy	c733cf83bb	powerpc/powernv: Check for IRQHAPPENED before sleeping Commit `8d6f7c5a`: "powerpc/powernv: Make it possible to skip the IRQHAPPENED check in power7_nap()" added code that prevents cpus from checking for pending interrupts just before entering sleep state, which is wrong. These interrupts are delivered during the soft irq disabled state of the cpu. A cpu cannot enter any idle state with pending interrupts because they will never be serviced until the next time the cpu is woken up by some other interrupt. Its only then that the pending interrupts are replayed. This can result in device timeouts or warnings about this cpu being stuck. This patch fixes ths issue by ensuring that cpus check for pending interrupts just before entering any idle state as long as they are not in the path of split core operations. Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 12:55:06 +10:00
Michael Ellerman	cd68098bce	powerpc: Clean up MMU_FTRS_A2 and MMU_FTR_TYPE_3E In `fb5a515704` "powerpc: Remove platforms/wsp and associated pieces", we removed the last user of MMU_FTRS_A2. So remove it. MMU_FTRS_A2 was the last user of MMU_FTR_TYPE_3E, so remove it also. This leaves some unreachable code in mmu_context_nohash.c, so remove that also. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 12:55:05 +10:00
Michael Ellerman	e623fbf1c4	powerpc/cell: Fix compilation with CONFIG_COREDUMP=n Commit `046d662f48` "coredump: make core dump functionality optional" made the coredump optional, but didn't update the spufs code that depends on it. That leads to build errors such as: arch/powerpc/platforms/built-in.o: In function `.spufs_arch_write_note': coredump.c:(.text+0x22cd4): undefined reference to `.dump_emit' coredump.c:(.text+0x22cf4): undefined reference to `.dump_emit' coredump.c:(.text+0x22d0c): undefined reference to `.dump_align' coredump.c:(.text+0x22d48): undefined reference to `.dump_emit' coredump.c:(.text+0x22e7c): undefined reference to `.dump_skip' Fix it by adding some ifdefs in the cell code. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-07-11 12:55:05 +10:00

1 2 3 4 5 ...

12843 Commits