korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2025-01-08 14:54:23 +08:00

Author	SHA1	Message	Date
Gautham R. Shenoy	fb5153d05a	powerpc: powernv: Implement ppc_md.get_proc_freq() Implement a method named pnv_get_proc_freq(unsigned int cpu) which returns the current clock rate on the 'cpu' in Hz to be reported in /proc/cpuinfo. This method uses the value reported by cpufreq when such a value is sane. Otherwise it falls back to old way of reporting the clockrate, i.e. ppc_proc_freq. Set the ppc_md.get_proc_freq() hook to pnv_get_proc_freq() on the PowerNV platform. Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 17:36:43 +10:00
Vasant Hegde	2196c6f1ed	powerpc/powernv: Return secondary CPUs to firmware before FW update Firmware update on PowerNV platform takes several minutes. During this time one CPU is stuck in FW and the kernel complains about "soft lockups". This patch returns all secondary CPUs to firmware before starting firmware update process. [ Reworked a bit and cleaned up -- BenH ] Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 17:36:34 +10:00
Gavin Shan	e9bc03fe22	powerpc/powernv: Don't use pe->pbus to get the domain number If the PE contains single PCI function, "pe->pbus" would be NULL. It's not reliable to be used by pci_domain_nr(). We just grab the PCI domain number from the PCI host controller (struct pci_controller) instance. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 17:35:14 +10:00
Gavin Shan	65fd766b99	powerpc/powernv: Missed IOMMU table type In function pnv_pci_ioda2_setup_dma_pe(), the IOMMU table type is set to (TCE_PCI_SWINV_CREATE \| TCE_PCI_SWINV_FREE) unconditionally. It was just set to TCE_PCI by pnv_pci_setup_iommu_table(). So the primary IOMMU table type (TCE_PCI) is lost. The patch fixes it. Also, pnv_pci_setup_iommu_table() already set "tbl->it_busno" to zero and we needn't do it again. The patch removes the redundant assignment. The patch also fixes similar issues in pnv_pci_ioda_setup_dma_pe(). Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 17:35:10 +10:00
Gavin Shan	b2b5efcf20	powerpc/powernv: Fundamental reset on PLX ports The patch adds support for fundamental reset on PLX downstream ports. If the PCI device matches an entry in the internal table, which lists the PLX vendor ID, bridge device ID, and the register offset and bit for fundamental reset, a fundamental reset is done accordingly. Otherwise, it falls back to hot reset. An additional flag (EEH_DEV_FRESET) is introduced to record the last reset type on the PCI bridge. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 17:35:05 +10:00
Gavin Shan	361f2a2a15	powerpc/powernv: Reset PHB in kdump kernel In the kdump scenario, the first kernel doesn't shut down PCI devices and the kdump kernel cleans the PHB IODA table at early probe time. That means the kdump kernel can't support PCI transactions piled up by the first kernel. Otherwise, lots of EEH errors and frozen PEs will be detected. In order to avoid the EEH errors, the PHB is reset to drop all PCI transactions from the first kernel. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 17:34:57 +10:00
Gavin Shan	d92a208d08	powerpc/pci: Mask linkDown on resetting PCI bus The problem was initially reported by Wendy, who tried to pass through an IPR adapter, which was connected to the PHB root port directly, to a KVM based guest. When doing that, pci_reset_bridge_secondary_bus() was called by the VFIO driver and linkDown was detected by the root port. That caused all PEs to be frozen. The patch fixes the issue by routing the reset for the secondary bus of the root port to the underlying firmware. For that, one more weak function pci_reset_secondary_bus() is introduced so that individual platforms can override it and do a platform-specific reset for the bridge's secondary bus. Reported-by: Wendy Xiong <wenxiong@linux.vnet.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 17:34:53 +10:00
Gavin Shan	26833a5029	powerpc/eeh: Make the delay for PE reset unified Basically, we have 3 types of resets to fulfil PE reset: fundamental, hot and PHB reset. For the latter 2 cases, we need the PCI bus reset hold and settlement delays as specified by the PCI spec. PowerNV and pSeries platforms run on top of different firmware and some of the delays are already covered by the underlying firmware (PowerNV). The patch unifies the delays so that they are done in the backend instead of the EEH core. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 17:34:48 +10:00
Gavin Shan	fd5cee7ce8	powerpc/powernv: Reset root port in firmware Resetting root port has more stuff to do than that for PCIe switch ports and we should have resetting root port done in firmware instead of the kernel itself. The problem was introduced by commit `5b2e198e` ("powerpc/powernv: Rework EEH reset"). Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 17:34:44 +10:00
Gavin Shan	63796558d4	powerpc/powernv: Fix endless reporting frozen PE Once one specific PE has been marked as EEH_PE_ISOLATED, it's in the middle of recovery or removed permanently. We needn't report the frozen PE again. Otherwise, we will endlessly report the same frozen PE. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 17:34:36 +10:00
Gavin Shan	d2b0f6f77e	powerpc/eeh: No hotplug on permanently removed dev The issue was detected in a somewhat complicated test case with multiple hierarchical PEs: PE#4 (which has 2 PCI devices, pdev#0 and pdev#1) is the child of PE#3, which has 2 p2p bridges (p2p#0 and p2p#1). We accidentally hit a less-known scenario: PE#4 was removed permanently from the system because of permanent failure (e.g. exceeding the max allowed failure times in the last hour), then we detected EEH errors on PE#3 and tried to recover it. However, eeh_dev instances for pdev#0/1 were not detached from PE#4, which was still connected to PE#3. All of that was because we rely on count-based pcibios_release_device(), which isn't reliable enough. When doing recovery for PE#3, we still applied hotplug on PE#4 and pdev#0/1, which were not valid any more. Eventually, we ran into a kernel crash. The patch fixes the above issue from two aspects. For unplug, we simply skip those permanently removed PEs, whose state is (EEH_PE_STATE_ISOLATED && !EEH_PE_STATE_RECOVERING) and whose frozen count is greater than EEH_MAX_ALLOWED_FREEZES. For plug, we mark all permanently removed EEH devices with EEH_DEV_REMOVED and return 0xFF's when reading their PCI config so that the PCI core will omit them. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 17:34:32 +10:00
Gavin Shan	7f52a526f6	powerpc/eeh: Allow to disable EEH The patch introduces bootarg "eeh=off" to disable EEH functionality. Also, it creates /sys/kernel/debug/powerpc/eeh_enable to disable or enable EEH functionality. By default, the functionality is enabled. For the PowerNV platform, we fall back to the conventional mechanism of clearing frozen PEs during PCI config access if EEH functionality is disabled. Otherwise, we rely on EEH for error recovery. The patch also fixes the issue that function ioda_eeh_event() did not cover the case of disabled EEH functionality. Those interrupt-driven events should be cleared to avoid endless reporting. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 17:34:27 +10:00
Gavin Shan	2a18dfc6ee	powerpc/eeh: Use cached capability for log dump When calling into eeh_gather_pci_data() on the pSeries platform, we possibly don't have a pci_dev instance yet, but eeh_dev is always ready. So we use the cached capability from eeh_dev instead of pci_dev for the log dump there. In order to keep things unified, we also cache PCI capability positions in eeh_dev for PowerNV. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 17:34:19 +10:00
Gavin Shan	7895470063	powerpc/eeh: Avoid I/O access during PE reset We have suffered from recursive frozen PEs a lot, caused by IO accesses during the PE reset. Ben came up with the good idea to keep the PE frozen until recovery (BAR restore) is done. With that, IO accesses during PE reset are dropped by hardware and no longer incur recursive frozen PEs. The patch implements the idea. We don't clear the frozen state until PE reset is done completely. During that period, the EEH core expects an unfrozen state from the backend to keep going. So we have to reuse the EEH_PE_RESET flag, which is set during PE reset, to return a normal state from the backend. The side effect is that we have to clear the frozen state twice (by PE reset, and explicitly), but that's harmless. We have some limitations on pHyp. pHyp doesn't allow enabling IO or DMA for an unfrozen PE. So we don't enable them on an unfrozen PE in eeh_pci_enable(). We have to enable IO before grabbing logs on pHyp. Otherwise, 0xFF's are always returned from PCI config space. Also, we had a wrong return value from eeh_pci_enable() for the EEH_OPT_THAW_DMA case. The patch fixes that too. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 17:34:10 +10:00
Gavin Shan	1d9a544646	powerpc/powernv: Use EEH PCI config accessors The EEH PowerNV backends need to use their own PCI config accessors, as the normal ones could be blocked during PE reset. The patch also removes the unnecessary parameter "hose" from the function ioda_eeh_bridge_reset(). Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 17:34:06 +10:00
Gavin Shan	d0914f503f	powerpc/eeh: Block PCI-CFG access during PE reset We've observed multiple PE reset failures because of PCI-CFG access during that period. Some device drivers don't support EEH very well and can't put the device into a motionless state before PE reset. So those device drivers might produce PCI-CFG accesses during PE reset. Also, we could have PCI-CFG access from user space (e.g. "lspci"). Since access to a frozen PE should return 0xFF's, we can block PCI-CFG access during the period of PE reset so that we won't get recursive EEH errors. The patch adds the flag EEH_PE_RESET, which is kept during PE reset. The PowerNV/pSeries PCI-CFG accessors reuse the flag to block PCI-CFG accordingly. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 17:34:02 +10:00
Gavin Shan	b34497d184	powerpc/powernv: Remove fields in PHB diag-data dump For some fields (e.g. LEM, MMIO, DMA) in the PHB diag-data dump, it's meaningless to print them based on non-zero values in the corresponding mask registers, because we always have non-zero values in the mask registers. The patch only prints those fields if we have non-zero values in the primary registers (e.g. LEM, MMIO, DMA status) so that we can save a couple of lines. The patch also removes the unnecessary spare line before "brdgCtl:" and the two leading spaces prefixed to each line, as Ben suggested. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 17:33:52 +10:00
Gavin Shan	f5bc6b70d2	powerpc/powernv: Move PNV_EEH_STATE_ENABLED around The flag PNV_EEH_STATE_ENABLED is put into pnv_phb::eeh_state, which is protected by CONFIG_EEH. We don't need that. Instead, we can have pnv_phb::flags and maintain all flags there, which is the purpose of the patch. The patch also renames PNV_EEH_STATE_ENABLED to PNV_PHB_FLAG_EEH. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 17:33:48 +10:00
Gavin Shan	467f79a956	powerpc/powernv: Remove PNV_EEH_STATE_REMOVED The PHB state PNV_EEH_STATE_REMOVED maintained in pnv_phb isn't so useful any more and it's duplicated to EEH_PE_ISOLATED. The patch replaces PNV_EEH_STATE_REMOVED with EEH_PE_ISOLATED. Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 17:33:44 +10:00
Preeti U Murthy	f203891117	ppc/powernv: Set the runlatch bits correctly for offline cpus Up until now we have been setting the runlatch bits for a busy CPU and clearing it when a CPU enters idle state. The runlatch bit has thus been consistent with the utilization of a CPU as long as the CPU is online. However when a CPU is hotplugged out the runlatch bit is not cleared. It needs to be cleared to indicate an unused CPU. Hence this patch has the runlatch bit cleared for an offline CPU just before entering an idle state and sets it immediately after it exits the idle state. Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 16:32:40 +10:00
Anton Blanchard	2d6b63bbdd	powerpc/powernv: Fix little endian issues in OPAL dump code Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 13:11:24 +10:00
Anton Blanchard	3441f04b4b	powerpc/powernv: Create OPAL sglist helper functions and fix endian issues We have two copies of code that creates an OPAL sg list. Consolidate these into a common set of helpers and fix the endian issues. The flash interface embedded a version number in the num_entries field, whereas the dump interface did not. Since versioning wasn't added to the flash interface and it is impossible to add this in a backwards compatible way, just remove it. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 13:11:23 +10:00
Anton Blanchard	14ad0c58d5	powerpc/powernv: Fix little endian issues in OPAL error log code Fix little endian issues with the OPAL error log code. Signed-off-by: Anton Blanchard <anton@samba.org> Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 13:11:23 +10:00
Anton Blanchard	56b4c99312	powerpc/powernv: Fix little endian issues with opal_do_notifier calls The bitmap in opal_poll_events and opal_handle_interrupt is big endian, so we need to byteswap it on little endian builds. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 13:11:22 +10:00
Anton Blanchard	2bad742388	powerpc/powernv: Use uint64_t instead of size_t in OPAL APIs Using size_t in our APIs is asking for trouble, especially when some OPAL calls use size_t pointers. Signed-off-by: Anton Blanchard <anton@samba.org> Reviewed-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 13:11:21 +10:00
Wei Yang	4966bfa1b3	powerpc/powernv: Release the refcount for pci_dev On the PowerNV platform, we are holding an unnecessary refcount on a pci_dev, so the pci_dev is not destroyed when hotplugging a PCI device. This patch releases the unnecessary refcount. Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 13:11:20 +10:00
Wei Yang	3f28c5af39	powerpc/powernv: Reduce multi-hit of iommu_add_device() During the EEH hotplug event, iommu_add_device() will be invoked three times and two of them will trigger warning or error. The three times to invoke the iommu_add_device() are: pci_device_add ... set_iommu_table_base_and_group <- 1st time, fail device_add ... tce_iommu_bus_notifier <- 2nd time, succees pcibios_add_pci_devices ... pcibios_setup_bus_devices <- 3rd time, re-attach The first time fails, since the dev->kobj->sd is not initialized. The dev->kobj->sd is initialized in device_add(). The third time's warning is triggered by the re-attach of the iommu_group. After applying this patch, the error iommu_tce: 0003:05:00.0 has not been added, ret=-14 and the warning [ 204.123609] ------------[ cut here ]------------ [ 204.123645] WARNING: at arch/powerpc/kernel/iommu.c:1125 [ 204.123680] Modules linked in: xt_CHECKSUM nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT bnep bluetooth 6lowpan_iphc rfkill xt_conntrack ebtable_nat ebtable_broute bridge stp llc mlx4_ib ib_sa ib_mad ib_core ib_addr ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw bnx2x tg3 mlx4_core nfsd ptp mdio ses libcrc32c nfs_acl enclosure be2net pps_core shpchp lockd kvm uinput sunrpc binfmt_misc lpfc scsi_transport_fc ipr scsi_tgt [ 204.124356] CPU: 18 PID: 650 Comm: eehd Not tainted 3.14.0-rc5yw+ #102 [ 204.124400] task: c0000027ed485670 ti: c0000027ed50c000 task.ti: c0000027ed50c000 [ 204.124453] NIP: c00000000003cf80 LR: c00000000006c648 CTR: c00000000006c5c0 [ 204.124506] REGS: c0000027ed50f440 TRAP: 0700 Not tainted (3.14.0-rc5yw+) [ 204.124558] MSR: 9000000000029032 <SF,HV,EE,ME,IR,DR,RI> CR: 88008084 XER: 20000000 [ 204.124682] CFAR: c00000000006c644 SOFTE: 1 GPR00: 
c00000000006c648 c0000027ed50f6c0 c000000001398380 c0000027ec260300 GPR04: c0000027ea92c000 c00000000006ad00 c0000000016e41b0 0000000000000110 GPR08: c0000000012cd4c0 0000000000000001 c0000027ec2602ff 0000000000000062 GPR12: 0000000028008084 c00000000fdca200 c0000000000d1d90 c0000027ec281a80 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000001 GPR24: 000000005342697b 0000000000002906 c000001fe6ac9800 c000001fe6ac9800 GPR28: 0000000000000000 c0000000016e3a80 c0000027ea92c090 c0000027ea92c000 [ 204.125353] NIP [c00000000003cf80] .iommu_add_device+0x30/0x1f0 [ 204.125399] LR [c00000000006c648] .pnv_pci_ioda_dma_dev_setup+0x88/0xb0 [ 204.125443] Call Trace: [ 204.125464] [c0000027ed50f6c0] [c0000027ed50f750] 0xc0000027ed50f750 (unreliable) [ 204.125526] [c0000027ed50f750] [c00000000006c648] .pnv_pci_ioda_dma_dev_setup+0x88/0xb0 [ 204.125588] [c0000027ed50f7d0] [c000000000069cc8] .pnv_pci_dma_dev_setup+0x78/0x340 [ 204.125650] [c0000027ed50f870] [c000000000044408] .pcibios_setup_device+0x88/0x2f0 [ 204.125712] [c0000027ed50f940] [c000000000046040] .pcibios_setup_bus_devices+0x60/0xd0 [ 204.125774] [c0000027ed50f9c0] [c000000000043acc] .pcibios_add_pci_devices+0xdc/0x1c0 [ 204.125837] [c0000027ed50fa50] [c00000000086f970] .eeh_reset_device+0x36c/0x4f0 [ 204.125939] [c0000027ed50fb20] [c00000000003a2d8] .eeh_handle_normal_event+0x448/0x480 [ 204.126068] [c0000027ed50fbc0] [c00000000003a35c] .eeh_handle_event+0x4c/0x340 [ 204.126192] [c0000027ed50fc80] [c00000000003a74c] .eeh_event_handler+0xfc/0x1b0 [ 204.126319] [c0000027ed50fd30] [c0000000000d1ea0] .kthread+0x110/0x130 [ 204.126430] [c0000027ed50fe30] [c00000000000a460] .ret_from_kernel_thread+0x5c/0x7c [ 204.126556] Instruction dump: [ 204.126610] 7c0802a6 fba1ffe8 fbc1fff0 fbe1fff8 f8010010 f821ff71 7c7e1b78 60000000 [ 204.126787] 60000000 e87e0298 3143ffff 7d2a1910 <0b090000> 2fa90000 40de00c8 ebfe0218 [ 204.126966] ---[ 
end trace 6e7aefd80add2973 ]--- are cleared. This patch removes iommu_add_device() in pnv_pci_ioda_dma_dev_setup(), which revert part of the change in commit d905c5df(PPC: POWERNV: move iommu_add_device earlier). Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 13:11:20 +10:00
Anton Blanchard	cc146d1db0	powerpc/powernv: Fix little endian issues in OPAL flash code With this patch I was able to update firmware on an LE kernel. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 13:08:50 +10:00
Benjamin Herrenschmidt	298b34d7d5	powerpc/powernv: Fix kexec races going back to OPAL We have a subtle race when sending CPUs back to OPAL on kexec. We mark them as "in real mode" right before we send them down. Once we've booted the new kernel, it might try to call opal_reinit_cpus() to change endianness, and that requires all CPUs to be spinning inside OPAL. However there is no synchronization here and we've observed cases where the returning CPUs hadn't established their new state inside OPAL before opal_reinit_cpus() is called, causing it to fail. The proper fix is to actually wait for them to go down all the way from the kexec'ing kernel. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 13:08:50 +10:00
Joel Stanley	63aecfb20a	powerpc/powernv: Check sysparam size before creation The size of the sysparam sysfs files is determined from the device tree at boot. However the buffer is hard coded to 64 bytes. If we encounter a parameter that is larger than 64 bytes, or misparse the device tree, the buffer will overflow when reading or writing to the parameter. Check it at discovery time, and if the parameter is too large, do not create a sysfs entry for it. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 13:08:49 +10:00
Joel Stanley	16003d235b	powerpc/powernv: Fix typos in sysparam code Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 13:08:49 +10:00
Joel Stanley	85390378f0	powerpc/powernv: Check sysfs size before copying The sysparam code currently uses the userspace supplied number of bytes when memcpy()ing in to a local 64-byte buffer. Limit the maximum number of bytes by the size of the buffer. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 13:08:48 +10:00
Joel Stanley	b8569d2304	powerpc/powernv: Use ssize_t for sysparam return values The OPAL calls are returning int64_t values, which the sysparam code stores in an int, and the sysfs callback returns ssize_t. Make the code easier to read by consistently using ssize_t. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 13:08:48 +10:00
Joel Stanley	ba9a32b176	powerpc/powernv: Fix sysparam sysfs error handling When a sysparam query in OPAL returned a negative value (error code), sysfs would spew out a decent chunk of memory; almost 64K more than expected. This was traced to a sign/unsigned mix up in the OPAL sysparam sysfs code at sys_param_show. The return value of sys_param_show is a ssize_t, calculated using return ret ? ret : attr->param_size; Alan Modra explains: "attr->param_size" is an unsigned int, "ret" an int, so the overall expression has type unsigned int. Result is that ret is cast to unsigned int before being cast to ssize_t. Instead of using the ternary operator, set ret to the param_size if an error is not detected. The same bug exists in the sysfs write callback; this patch fixes it in the same way. A note on debugging this next time: on my system gcc will warn about this if compiled with -Wsign-compare, which is not enabled by -Wall, only -Wextra. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-28 13:08:47 +10:00
Linus Torvalds	eeb91e4f9d	Merge tag 'pm+acpi-3.15-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm More ACPI and power management fixes and updates for 3.15-rc1 Pull more ACPI and power management fixes and updates from Rafael Wysocki: "This is PM and ACPI material that has emerged over the last two weeks and one fix for a CPU hotplug regression introduced by the recent CPU hotplug notifiers registration series. Included are intel_idle and turbostat updates from Len Brown (these have been in linux-next for quite some time), a new cpufreq driver for powernv (that might spend some more time in linux-next, but BenH was asking me so nicely to push it for 3.15 that I couldn't resist), some cpufreq fixes and cleanups (including fixes for some silly breakage in a couple of cpufreq drivers introduced during the 3.14 cycle), assorted ACPI cleanups, wakeup framework documentation fixes, a new sysfs attribute for cpuidle and a new command line argument for power domains diagnostics. Specifics: - Fix for a recently introduced CPU hotplug regression in ARM KVM from Ming Lei. - Fixes for breakage in the at32ap, loongson2_cpufreq, and unicore32 cpufreq drivers introduced during the 3.14 cycle (-stable material) from Chen Gang and Viresh Kumar. - New powernv cpufreq driver from Vaidyanathan Srinivasan, with bits from Gautham R Shenoy and Srivatsa S Bhat. - Exynos cpufreq driver fix preventing it from being included into multiplatform builds that aren't supported by it from Sachin Kamat. - cpufreq cleanups related to the usage of the driver_data field in struct cpufreq_frequency_table from Viresh Kumar. - cpufreq ppc driver cleanup from Sachin Kamat. - Intel BayTrail support for intel_idle and ACPI idle from Len Brown. - Intel CPU model 54 (Atom N2000 series) support for intel_idle from Jan Kiszka. - intel_idle fix for Intel Ivy Town residency targets from Len Brown. - turbostat updates (Intel Broadwell support and output cleanups) from Len Brown. - New cpuidle sysfs attribute for exporting C-states' target residency information to user space from Daniel Lezcano. - New kernel command line argument to prevent power domains enabled by the bootloader from being turned off even if they are not in use (for diagnostics purposes) from Tushar Behera. - Fixes for wakeup sysfs attributes documentation from Geert Uytterhoeven. - New ACPI video blacklist entry for ThinkPad Helix from Stephen Chandler Paul. - Assorted ACPI cleanups and a Kconfig help update from Jonghwan Choi, Zhihui Zhang, Hanjun Guo" * tag 'pm+acpi-3.15-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (28 commits) ACPI: Update the ACPI spec information in Kconfig arm, kvm: fix double lock on cpu_add_remove_lock cpuidle: sysfs: Export target residency information cpufreq: ppc: Remove duplicate inclusion of fsl_soc.h cpufreq: create another field .flags in cpufreq_frequency_table cpufreq: use kzalloc() to allocate memory for cpufreq_frequency_table cpufreq: don't print value of .driver_data from core cpufreq: ia64: don't set .driver_data to index cpufreq: powernv: Select CPUFreq related Kconfig options for powernv cpufreq: powernv: Use cpufreq_frequency_table.driver_data to store pstate ids cpufreq: powernv: cpufreq driver for powernv platform cpufreq: at32ap: don't declare local variable as static cpufreq: loongson2_cpufreq: don't declare local variable as static cpufreq: unicore32: fix typo issue for 'clk' cpufreq: exynos: Disable on multiplatform build PM / wakeup: Correct presence vs. emptiness of wakeup_* attributes PM / domains: Add pd_ignore_unused to keep power domains enabled ACPI / dock: Drop dock_device_ids[] table ACPI / video: Favor native backlight interface for ThinkPad Helix ACPI / thermal: Fix wrong variable usage in debug statement ...	2014-04-11 13:20:04 -07:00
Stewart Smith	cc4f265ad9	powerpc/powernv Adapt opal-elog and opal-dump to new sysfs_remove_file_self We are currently using sysfs_schedule_callback() which is deprecated and about to be removed. Switch to the new interface instead. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-09 13:51:50 +10:00
Joel Stanley	e28b05e7ae	powerpc/powernv: Add invalid OPAL call This call will not be understood by OPAL, and will cause it to add an error to its log. Among other things, this is useful for testing the behaviour of the log as it fills up. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-09 12:53:23 +10:00
Joel Stanley	bfc36894a4	powerpc/powernv: Add OPAL message log interface OPAL provides an in-memory circular buffer containing a message log populated with various runtime messages produced by the firmware. Provide a sysfs interface /sys/firmware/opal/msglog for userspace to view the messages. Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-09 12:53:19 +10:00
Mahesh Salgaonkar	6e556b4710	powerpc/book3s: Fix mc_recoverable_range buffer overrun issue. Currently we wrongly allocate mc_recoverable_range buffer (to hold recoverable ranges) based on size of the property "mcheck-recoverable-ranges". This results in allocating less memory to hold available recoverable range entries from /proc/device-tree/ibm,opal/mcheck-recoverable-ranges. This patch fixes this issue by allocating mc_recoverable_range buffer based on number of entries of recoverable ranges instead of device property size. Without this change we end up allocating less memory and run into memory corruption issue. Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-09 12:53:15 +10:00
Anton Blanchard	9000c17dc0	powerpc/powernv: Fix endian issues with sensor code One OPAL call and one device tree property needed byte swapping. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-09 12:52:49 +10:00
Gautham R. Shenoy	81f359027a	cpufreq: powernv: Select CPUFreq related Kconfig options for powernv Enable CPUFreq for PowerNV. Select "performance", "powersave", "userspace" and "ondemand" governors. Choose "ondemand" to be the default governor. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2014-04-07 14:35:28 +02:00
Anton Blanchard	bb4398e1de	powerpc/powernv: Fix endian issues with OPAL async code OPAL defines opal_msg as a big endian struct so we have to byte swap it on little endian builds. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-07 10:34:27 +10:00
Benjamin Herrenschmidt	798af00c4d	powerpc/powernv: Add opal_notifier_unregister() and export to modules opal_notifier_register() is missing a pending "unregister" variant and should be exposed to modules. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-04-07 10:33:16 +10:00
Linus Torvalds	e6d9bfc638	Merge branch 'powernv-cpuidle' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc non-virtualized cpuidle from Ben Herrenschmidt: "This is the branch I mentioned in my other pull request which contains our improved cpuidle support for the "powernv" platform (non-virtualized). It adds support for the "fast sleep" feature of the processor which provides higher power savings than our usual "nap" mode but at the cost of losing the timers while asleep, and thus exploits the new timer broadcast framework to work around that limitation. It's based on a tip timer tree that you seem to have already merged" * 'powernv-cpuidle' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: cpuidle/powernv: Parse device tree to setup idle states cpuidle/powernv: Add "Fast-Sleep" CPU idle state powerpc/powernv: Add OPAL call to resync timebase on wakeup powerpc/powernv: Add context management for Fast Sleep powerpc: Split timer_interrupt() into timer handling and interrupt handling routines powerpc: Implement tick broadcast IPI as a fixed IPI message powerpc: Free up the slot of PPC_MSG_CALL_FUNC_SINGLE IPI message	2014-04-02 13:47:29 -07:00
Linus Torvalds	235c7b9feb	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull main powerpc updates from Ben Herrenschmidt: "This time around, the powerpc merges are going to be a little bit more complicated than usual. This is the main pull request with most of the work for this merge window. I will describe it a bit more further down. There is some additional cpuidle driver work, however I haven't included it in this tree as it depends on some work in tip/timer-core which Thomas accidentally forgot to put in a topic branch. Since I didn't want to carry all of that tip timer stuff in powerpc -next, I setup a separate branch on top of Thomas tree with just that cpuidle driver in it, and Stephen has been carrying that in next separately for a while now. I'll send a separate pull request for it. Additionally, two new pieces in this tree add users for a sysfs API that Tejun and Greg have been deprecating in drivers-core-next. Thankfully Greg reverted the patch that removes the old API so this merge can happen cleanly, but once merged, I will send a patch adjusting our new code to the new API so that Greg can send you the removal patch. Now as for the content of this branch, we have a lot of perf work for power8 new counters including support for our new "nest" counters (also called 24x7) under pHyp (not natively yet). We have new functionality when running under the OPAL firmware (non-virtualized or KVM host), such as access to the firmware error logs and service processor dumps, system parameters and sensors, along with a hwmon driver for the latter. There's also a bunch of bug fixes across the board, some LE fixes, and a nice set of selftests for validating our various types of copy loops. On the Freescale side, we see mostly new chip/board revisions, some clock updates, better support for machine checks and debug exceptions, etc..." 
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (70 commits) powerpc/book3s: Fix CFAR clobbering issue in machine check handler. powerpc/compat: 32-bit little endian machine name is ppcle, not ppc powerpc/le: Big endian arguments for ppc_rtas() powerpc: Use default set of netfilter modules (CONFIG_NETFILTER_ADVANCED=n) powerpc/defconfigs: Enable THP in pseries defconfig powerpc/mm: Make sure a local_irq_disable prevent a parallel THP split powerpc: Rate-limit users spamming kernel log buffer powerpc/perf: Fix handling of L3 events with bank == 1 powerpc/perf/hv_{gpci, 24x7}: Add documentation of device attributes powerpc/perf: Add kconfig option for hypervisor provided counters powerpc/perf: Add support for the hv 24x7 interface powerpc/perf: Add support for the hv gpci (get performance counter info) interface powerpc/perf: Add macros for defining event fields & formats powerpc/perf: Add a shared interface to get gpci version and capabilities powerpc/perf: Add 24x7 interface headers powerpc/perf: Add hv_gpci interface header powerpc: Add hvcalls for 24x7 and gpci (Get Performance Counter Info) sysfs: create bin_attributes under the requested group powerpc/perf: Enable BHRB access for EBB events powerpc/perf: Add BHRB constraint and IFM MMCRA handling for EBB ...	2014-04-02 13:42:59 -07:00
Neelesh Gupta	7224adbbb8	powerpc/powernv: Enable fetching of platform sensor data This patch enables fetching of various platform sensor data through OPAL and expects a sensor handle from the driver to pass to OPAL. Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-03-24 09:48:21 +11:00
Neelesh Gupta	4029cd6654	powerpc/powernv: Enable reading and updating of system parameters This patch enables reading and updating of system parameters through OPAL call. Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-03-24 09:47:30 +11:00
Neelesh Gupta	8d72482322	powerpc/powernv: Infrastructure to support OPAL async completion This patch adds support for notifying the clients of their request completion. Clients request for the token before making OPAL call and then wait for the response. This patch uses messaging infrastructure to pull the data to linux by registering itself for the message type OPAL_MSG_ASYNC_COMP. Signed-off-by: Neelesh Gupta <neelegup@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-03-24 09:45:22 +11:00
Ingo Molnar	a02ed5e3e0	Merge branch 'sched/urgent' into sched/core Pick up fixes before queueing up new changes. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-03-11 11:34:27 +01:00
Stewart Smith	c7e64b9ce0	powerpc/powernv Platform dump interface This enables support for userspace to fetch and initiate FSP and Platform dumps from the service processor (via firmware) through sysfs. Based on original patch from Vasant Hegde <hegdevasant@linux.vnet.ibm.com> Flow: - We register for OPAL notification events. - OPAL sends new dump available notification. - We make information on dump available via sysfs - Userspace requests dump contents - We retrieve the dump via OPAL interface - User copies the dump data - userspace sends ack for dump - We send ACK to OPAL. sysfs files: - We add the /sys/firmware/opal/dump directory - echoing 1 (well, anything, but in future we may support different dump types) to /sys/firmware/opal/dump/initiate_dump will initiate a dump. - Each dump that we've been notified of gets a directory in /sys/firmware/opal/dump/ with a name of the dump type and ID (in hex, as this is what's used elsewhere to identify the dump). - Each dump has files: id, type, dump and acknowledge dump is binary and is the dump itself. echoing 'ack' to acknowledge (currently any string will do) will acknowledge the dump and it will soon after disappear from sysfs. OPAL APIs: - opal_dump_init() - opal_dump_info() - opal_dump_read() - opal_dump_ack() - opal_dump_resend_notification() Currently we are only ever notified for one dump at a time (until the user explicitly acks the current dump, then we get a notification of the next dump), but this kernel code should "just work" when OPAL starts notifying us of all the dumps present. Signed-off-by: Stewart Smith <stewart@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2014-03-07 16:19:10 +11:00


234 Commits