Here's the set of driver core patches for 3.19-rc1.
They are dominated by the removal of the .owner field in platform
drivers. They touch a lot of files, but they are "simple" changes, just
removing a line in a structure.
Other than that, a few minor driver core and debugfs changes. There are
some ath9k patches coming in through this tree that have been acked by
the wireless maintainers as they relied on the debugfs changes.
Everything has been in linux-next for a while.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iEYEABECAAYFAlSOD20ACgkQMUfUDdst+ylLPACg2QrW1oHhdTMT9WI8jihlHVRM
53kAoLeteByQ3iVwWurwwseRPiWa8+MI
=OVRS
-----END PGP SIGNATURE-----
Merge tag 'driver-core-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core update from Greg KH:
"Here's the set of driver core patches for 3.19-rc1.
They are dominated by the removal of the .owner field in platform
drivers. They touch a lot of files, but they are "simple" changes,
just removing a line in a structure.
Other than that, a few minor driver core and debugfs changes. There
are some ath9k patches coming in through this tree that have been
acked by the wireless maintainers as they relied on the debugfs
changes.
Everything has been in linux-next for a while"
* tag 'driver-core-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (324 commits)
Revert "ath: ath9k: use debugfs_create_devm_seqfile() helper for seq_file entries"
fs: debugfs: add forward declaration for struct device type
firmware class: Deletion of an unnecessary check before the function call "vunmap"
firmware loader: fix hung task warning dump
devcoredump: provide a one-way disable function
device: Add dev_<level>_once variants
ath: ath9k: use debugfs_create_devm_seqfile() helper for seq_file entries
ath: use seq_file api for ath9k debugfs files
debugfs: add helper function to create device related seq_file
drivers/base: cacheinfo: remove noisy error boot message
Revert "core: platform: add warning if driver has no owner"
drivers: base: support cpu cache information interface to userspace via sysfs
drivers: base: add cpu_device_create to support per-cpu devices
topology: replace custom attribute macros with standard DEVICE_ATTR*
cpumask: factor out show_cpumap into separate helper function
driver core: Fix unbalanced device reference in drivers_probe
driver core: fix race with userland in device_add()
sysfs/kernfs: make read requests on pre-alloc files use the buffer.
sysfs/kernfs: allow attributes to request write buffer be pre-allocated.
fs: sysfs: return EGBIG on write if offset is larger than file size
...
Some nice cleanups like removing bootmem, and removal of __get_cpu_var().
There is one patch to mm/gup.c. This is the generic GUP implementation, but is
only used by us and arm(64). We have an ack from Steve Capper, and although we
didn't get an ack from Andrew he told us to take the patch through the powerpc
tree.
There's one cxl patch. This is in drivers/misc, but Greg said he was happy for
us to manage fixes for it.
There is an infrastructure patch to support an IPMI driver for OPAL. That patch
also appears in Corey Minyard's IPMI tree, you may see a conflict there.
There is also an RTC driver for OPAL. We weren't able to get any response from
the RTC maintainer, Alessandro Zummo, so in the end we just merged the driver.
The usual batch of Freescale updates from Scott.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJUiSTSAAoJEFHr6jzI4aWAirQP/3rIEng0LzLu5kW2zkGylIaM
SNDum1vze3mHiTFl+CFcSIGpC1UEULoB49HA+2oE/ExKpIceG6lpL2LP+wNh2FW5
mozjMjS6mZt4w1Fu1D2ZtgQc3O1T1pxkqsnZmPa8gVf5k5d5IQNPY6yB0pgVWwbV
gwBKxe4VwPAzJjppE9i9MDhNTJwmHZq0lI8XuoTXOOU/f+4G1WxmjrbyveQ7cRP5
i/sq2cKjxpWA+KDeIXo0GR0DpXR7qMeAvFX5xXY7oKuUJIFDM4kSHfmMYP6qLf5c
2vlsJqHVqfOgQdve41z1ooaPzNtg7ezVo+VqqguSgtSgwy2JUo/uHpnzz3gD1Olo
AP5+6xj8LZac0rTPxF4n4Hoyrp7AaaFjEFt1zqT9PWniZW4B41wtia0QORBNUf1S
UEmKAC9T3WZJ47mH7WMSadtOPF9E3Yd/zuiPD4udtptCNKPbr6/k1MpJPIW2D4Rn
BJ0QZTRd7V0yRofXxZtHxaMxq8pWd/Tip7J/zr/ghz+ulnH8BuFamuhCCLuJlESU
+A2PMfuseyTMpH9sMAmmTwSGPDKjaUFWvmFvY/n88NZL7r2LlomNrDWFSSQOIHUP
FxjYmjUMpZeexsfyRdgFV/INhYC3o3cso2fRGO45YK6nkxNnjNFEBS6WhQLvNLBu
sknd1WjXkuJtoMC15SrQ
=jvyT
-----END PGP SIGNATURE-----
Merge tag 'powerpc-3.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux
Pull powerpc updates from Michael Ellerman:
"Some nice cleanups like removing bootmem, and removal of
__get_cpu_var().
There is one patch to mm/gup.c. This is the generic GUP
implementation, but is only used by us and arm(64). We have an ack
from Steve Capper, and although we didn't get an ack from Andrew he
told us to take the patch through the powerpc tree.
There's one cxl patch. This is in drivers/misc, but Greg said he was
happy for us to manage fixes for it.
There is an infrastructure patch to support an IPMI driver for OPAL.
There is also an RTC driver for OPAL. We weren't able to get any
response from the RTC maintainer, Alessandro Zummo, so in the end we
just merged the driver.
The usual batch of Freescale updates from Scott"
* tag 'powerpc-3.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (101 commits)
powerpc/powernv: Return to cpu offline loop when finished in KVM guest
powerpc/book3s: Fix partial invalidation of TLBs in MCE code.
powerpc/mm: don't do tlbie for updatepp request with NO HPTE fault
powerpc/xmon: Cleanup the breakpoint flags
powerpc/xmon: Enable HW instruction breakpoint on POWER8
powerpc/mm/thp: Use tlbiel if possible
powerpc/mm/thp: Remove code duplication
powerpc/mm/hugetlb: Sanity check gigantic hugepage count
powerpc/oprofile: Disable pagefaults during user stack read
powerpc/mm: Check for matching hpte without taking hpte lock
powerpc: Drop useless warning in eeh_init()
powerpc/powernv: Cleanup unused MCE definitions/declarations.
powerpc/eeh: Dump PHB diag-data early
powerpc/eeh: Recover EEH error on ownership change for BCM5719
powerpc/eeh: Set EEH_PE_RESET on PE reset
powerpc/eeh: Refactor eeh_reset_pe()
powerpc: Remove more traces of bootmem
powerpc/pseries: Initialise nvram_pstore_info's buf_lock
cxl: Name interrupts in /proc/interrupt
cxl: Return error to PSL if IRQ demultiplexing fails & print clearer warning
...
Pull irq domain updates from Thomas Gleixner:
"The real interesting irq updates:
- Support for hierarchical irq domains:
For complex interrupt routing scenarios where more than one
interrupt related chip is involved we had no proper representation
in the generic interrupt infrastructure so far. That made people
implement rather ugly constructs in their nested irq chip
implementations. The main offenders are x86 and arm/gic.
To distangle that mess we have now hierarchical irqdomains which
seperate the various interrupt chips and connect them via the
hierarchical domains. That keeps the domain specific details
internal to the particular hierarchy level and removes the
criss/cross referencing of chip internals. The resulting hierarchy
for a complex x86 system will look like this:
vector mapped: 74
msi-0 mapped: 2
dmar-ir-1 mapped: 69
ioapic-1 mapped: 4
ioapic-0 mapped: 20
pci-msi-2 mapped: 45
dmar-ir-0 mapped: 3
ioapic-2 mapped: 1
pci-msi-1 mapped: 2
htirq mapped: 0
Neither ioapic nor pci-msi know about the dmar interrupt remapping
between themself and the vector domain. If interrupt remapping is
disabled ioapic and pci-msi become direct childs of the vector
domain.
In hindsight we should have done that years ago, but in hindsight
we always know better :)
- Support for generic MSI interrupt domain handling
We have more and more non PCI related MSI interrupts, so providing
a generic infrastructure for this is better than having all
affected architectures implementing their own private hacks.
- Support for PCI-MSI interrupt domain handling, based on the generic
MSI support.
This part carries the pci/msi branch from Bjorn Helgaas pci tree to
avoid a massive conflict. The PCI/MSI parts are acked by Bjorn.
I have two more branches on top of this. The full conversion of x86
to hierarchical domains and a partial conversion of arm/gic"
* 'irq-irqdomain-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
genirq: Move irq_chip_write_msi_msg() helper to core
PCI/MSI: Allow an msi_controller to be associated to an irq domain
PCI/MSI: Provide mechanism to alloc/free MSI/MSIX interrupt from irqdomain
PCI/MSI: Enhance core to support hierarchy irqdomain
PCI/MSI: Move cached entry functions to irq core
genirq: Provide default callbacks for msi_domain_ops
genirq: Introduce msi_domain_alloc/free_irqs()
asm-generic: Add msi.h
genirq: Add generic msi irq domain support
genirq: Introduce callback irq_chip.irq_write_msi_msg
genirq: Work around __irq_set_handler vs stacked domains ordering issues
irqdomain: Introduce helper function irq_domain_add_hierarchy()
irqdomain: Implement a method to automatically call parent domains alloc/free
genirq: Introduce helper irq_domain_set_info() to reduce duplicated code
genirq: Split out flow handler typedefs into seperate header file
genirq: Add IRQ_SET_MASK_OK_DONE to support stacked irqchip
genirq: Introduce irq_chip.irq_compose_msi_msg() to support stacked irqchip
genirq: Add more helper functions to support stacked irq_chip
genirq: Introduce helper functions to support stacked irq_chip
irqdomain: Do irq_find_mapping and set_type for hierarchy irqdomain in case OF
...
The PCI/MSI irq chip callbacks mask/unmask_msi_irq have been renamed
to pci_msi_mask/unmask_irq to mark them PCI specific. Rename all usage
sites. The conversion helper functions are kept around to avoid
conflicts in next and will be removed after merging into mainline.
Coccinelle assisted conversion. No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: x86@kernel.org
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Murali Karicheri <m-karicheri2@ti.com>
Cc: Thierry Reding <thierry.reding@gmail.com>
Cc: Mohit Kumar <mohit.kumar@st.com>
Cc: Simon Horman <horms@verge.net.au>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Yijing Wang <wangyijing@huawei.com>
Rename write_msi_msg() to pci_write_msi_msg() to mark it as PCI
specific.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
Cc: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Although we are now selecting NO_BOOTMEM, we still have some traces of
bootmem lying around. That is because even with NO_BOOTMEM there is
still a shim that converts bootmem calls into memblock calls, but
ultimately we want to remove all traces of bootmem.
Most of the patch is conversions from alloc_bootmem() to
memblock_virt_alloc(). In general a call such as:
p = (struct foo *)alloc_bootmem(x);
Becomes:
p = memblock_virt_alloc(x, 0);
We don't need the cast because memblock_virt_alloc() returns a void *.
The alignment value of zero tells memblock to use the default alignment,
which is SMP_CACHE_BYTES, the same value alloc_bootmem() uses.
We remove a number of NULL checks on the result of
memblock_virt_alloc(). That is because memblock_virt_alloc() will panic
if it can't allocate, in exactly the same way as alloc_bootmem(), so the
NULL checks are and always have been redundant.
The memory returned by memblock_virt_alloc() is already zeroed, so we
remove several memsets of the result of memblock_virt_alloc().
Finally we convert a few uses of __alloc_bootmem(x, y, MAX_DMA_ADDRESS)
to just plain memblock_virt_alloc(). We don't use memblock_alloc_base()
because MAX_DMA_ADDRESS is ~0ul on powerpc, so limiting the allocation
to that is pointless, 16XB ought to be enough for anyone.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Scott says:
"Highlights include a bunch of 8xx optimizations, device tree bindings
for Freescale BMan, QMan, and FMan datapath components, misc device tree
updates, and inbound rio window support."
The commit 543c043cba ("powerpc/fsl_msi: change the irq handler from
chained to normal") changes the msi cascade handler from chained to
normal. Since cascade handler must run in hard interrupt context, this
will cause kernel panic if we force threading of all the interrupt
handler via kernel command parameter 'threadirqs'. So mark the irq
handler IRQF_NO_THREAD explicitly.
Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Add support for mapping and unmapping of inbound rapidio windows. This
allows for drivers to open up a part of local memory on the rapidio
network. Also applications can use this and tranfer blocks of data
over the network.
Signed-off-by: Martijn de Gouw <martijn.de.gouw@prodrive-technologies.com>
[scottwood@freescale.com: updated commit message based on review]
Signed-off-by: Scott Wood <scottwood@freescale.com>
Of_node_put supports NULL as its argument, so the initial test is not
necessary.
Suggested by Uwe Kleine-König.
The semantic patch that fixes this problem is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression e;
@@
-if (e)
of_node_put(e);
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Of_node_put supports NULL as its argument, so the initial test is not
necessary.
Suggested by Uwe Kleine-König.
The semantic patch that fixes this problem is as follows:
(http://coccinelle.lip6.fr/)
// <smpl>
@@
expression e;
@@
-if (e)
of_node_put(e);
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Simplify the error path to avoid calling of_node_put when it is not needed.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
I'm seeing a build warning in mm/nobootmem.c after removing
bootmem:
mm/nobootmem.c: In function '__free_pages_memory':
include/linux/kernel.h:713:17: warning: comparison of distinct pointer types lacks a cast [enabled by default]
(void) (&_min1 == &_min2); \
^
mm/nobootmem.c:90:11: note: in expansion of macro 'min'
order = min(MAX_ORDER - 1UL, __ffs(start));
^
The rest of the worlds seems to define __ffs as returning unsigned long,
so lets do that.
Signed-off-by: Anton Blanchard <anton@samba.org>
Tested-by: Emil Medve <Emilian.Medve@Freescale.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Lots of places included bootmem.h even when not using bootmem.
Signed-off-by: Anton Blanchard <anton@samba.org>
Tested-by: Emil Medve <Emilian.Medve@Freescale.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
The generic Linux framework to power off the machine is a function pointer
called pm_power_off. The trick about this pointer is that device drivers can
potentially implement it rather than board files.
Today on powerpc we set pm_power_off to invoke our generic full machine power
off logic which then calls ppc_md.power_off to invoke machine specific power
off.
However, when we want to add a power off GPIO via the "gpio-poweroff" driver,
this card house falls apart. That driver only registers itself if pm_power_off
is NULL to ensure it doesn't override board specific logic. However, since we
always set pm_power_off to the generic power off logic (which will just not
power off the machine if no ppc_md.power_off call is implemented), we can't
implement power off via the generic GPIO power off driver.
To fix this up, let's get rid of the ppc_md.power_off logic and just always use
pm_power_off as was intended. Then individual drivers such as the GPIO power off
driver can implement power off logic via that function pointer.
With this patch set applied and a few patches on top of QEMU that implement a
power off GPIO on the virt e500 machine, I can successfully turn off my virtual
machine after halt.
Signed-off-by: Alexander Graf <agraf@suse.de>
[mpe: Squash into one patch and update changelog based on cover letter]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This still has not been merged and now powerpc is the only arch that does
not have this change. Sorry about missing linuxppc-dev before.
V2->V2
- Fix up to work against 3.18-rc1
__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x). This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.
Other use cases are for storing and retrieving data from the current
processors percpu area. __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.
__get_cpu_var() is defined as :
__get_cpu_var() always only does an address determination. However, store
and retrieve operations could use a segment prefix (or global register on
other platforms) to avoid the address calculation.
this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.
This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset. Thereby address calculations are avoided and less registers
are used when code is generated.
At the end of the patch set all uses of __get_cpu_var have been removed so
the macro is removed too.
The patch set includes passes over all arches as well. Once these operations
are used throughout then specialized macros can be defined in non -x86
arches as well in order to optimize per cpu access by f.e. using a global
register that may be set to the per cpu base.
Transformations done to __get_cpu_var()
1. Determine the address of the percpu instance of the current processor.
DEFINE_PER_CPU(int, y);
int *x = &__get_cpu_var(y);
Converts to
int *x = this_cpu_ptr(&y);
2. Same as #1 but this time an array structure is involved.
DEFINE_PER_CPU(int, y[20]);
int *x = __get_cpu_var(y);
Converts to
int *x = this_cpu_ptr(y);
3. Retrieve the content of the current processors instance of a per cpu
variable.
DEFINE_PER_CPU(int, y);
int x = __get_cpu_var(y)
Converts to
int x = __this_cpu_read(y);
4. Retrieve the content of a percpu struct
DEFINE_PER_CPU(struct mystruct, y);
struct mystruct x = __get_cpu_var(y);
Converts to
memcpy(&x, this_cpu_ptr(&y), sizeof(x));
5. Assignment to a per cpu variable
DEFINE_PER_CPU(int, y)
__get_cpu_var(y) = x;
Converts to
__this_cpu_write(y, x);
6. Increment/Decrement etc of a per cpu variable
DEFINE_PER_CPU(int, y);
__get_cpu_var(y)++
Converts to
__this_cpu_inc(y)
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
Signed-off-by: Christoph Lameter <cl@linux.com>
[mpe: Fix build errors caused by set/or_softirq_pending(), and rework
assignment in __set_breakpoint() to use memcpy().]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull more powerpc updates from Michael Ellerman:
"Here's some more updates for powerpc for 3.18.
They are a bit late I know, though must are actually bug fixes. In my
defence I nearly cut the top of my finger off last weekend in a
gruesome bike maintenance accident, so I spent a good part of the week
waiting around for doctors. True story, I can send photos if you like :)
Probably the most interesting fix is the sys_call_table one, which
enables syscall tracing for powerpc. There's a fix for HMI handling
for old firmware, more endian fixes for firmware interfaces, more EEH
fixes, Anton fixed our routine that gets the current stack pointer,
and a few other misc bits"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (22 commits)
powerpc: Only do dynamic DMA zone limits on platforms that need it
powerpc: sync pseries_le_defconfig with pseries_defconfig
powerpc: Add printk levels to setup_system output
powerpc/vphn: NUMA node code expects big-endian
powerpc/msi: Use WARN_ON() in msi bitmap selftests
powerpc/msi: Fix the msi bitmap alignment tests
powerpc/eeh: Block CFG upon frozen Shiner adapter
powerpc/eeh: Don't collect logs on PE with blocked config space
powerpc/eeh: Block PCI config access upon frozen PE
powerpc/pseries: Drop config requests in EEH accessors
powerpc/powernv: Drop config requests in EEH accessors
powerpc/eeh: Rename flag EEH_PE_RESET to EEH_PE_CFG_BLOCKED
powerpc/eeh: Fix condition for isolated state
powerpc/pseries: Make CPU hotplug path endian safe
powerpc/pseries: Use dump_stack instead of show_stack
powerpc: Rename __get_SP() to current_stack_pointer()
powerpc: Reimplement __get_SP() as a function not a define
powerpc/numa: Add ability to disable and debug topology updates
powerpc/numa: check error return from proc_create
powerpc/powernv: Fallback to old HMI handling behavior for old firmware
...
As demonstrated in the previous commit, the failure message from the msi
bitmap selftests is a bit subtle, it's easy to miss a failure in a busy
boot log.
So drop our check() macro and use WARN_ON() instead. This necessitates
inverting all the conditions as well.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
When we added the alignment tests recently we failed to check they were
actually passing - oops.
They weren't passing, because the bitmap was full. We should also be a
bit more careful when checking the return code, a negative error return
could by divisible by our alignment value.
Fixes: b0345bbc6d ("powerpc/msi: Improve IRQ bitmap allocator")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull powerpc updates from Michael Ellerman:
"Here's a first pull request for powerpc updates for 3.18.
The bulk of the additions are for the "cxl" driver, for IBM's Coherent
Accelerator Processor Interface (CAPI). Most of it's in drivers/misc,
which Greg & Arnd maintain, Greg said he was happy for us to take it
through our tree.
There's the usual minor cleanups and fixes, including a bit of noise
in drivers from some of those. A bunch of updates to our EEH code,
which has been getting more testing. Several nice speedups from
Anton, including 20% in clear_page().
And a bunch of updates for freescale from Scott"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (130 commits)
cxl: Fix afu_read() not doing finish_wait() on signal or non-blocking
cxl: Add documentation for userspace APIs
cxl: Add driver to Kbuild and Makefiles
cxl: Add userspace header file
cxl: Driver code for powernv PCIe based cards for userspace access
cxl: Add base builtin support
powerpc/mm: Add hooks for cxl
powerpc/opal: Add PHB to cxl mode call
powerpc/mm: Add new hash_page_mm()
powerpc/powerpc: Add new PCIe functions for allocating cxl interrupts
cxl: Add new header for call backs and structs
powerpc/powernv: Split out set MSI IRQ chip code
powerpc/mm: Export mmu_kernel_ssize and mmu_linear_psize
powerpc/msi: Improve IRQ bitmap allocator
powerpc/cell: Make spu_flush_all_slbs() generic
powerpc/cell: Move data segment faulting code out of cell platform
powerpc/cell: Move spu_handle_mm_fault() out of cell platform
powerpc/pseries: Use new defines when calling H_SET_MODE
powerpc: Update contact info in Documentation files
powerpc/perf/hv-24x7: Simplify catalog_read()
...
Currently msi_bitmap_alloc_hwirqs() will round up any IRQ allocation requests
to the nearest power of 2. eg. ask for 5 IRQs and you'll get 8. This wastes a
lot of IRQs which can be a scarce resource.
For cxl we may require multiple IRQs for every context that is attached to the
accelerator. There may be 1000s of contexts attached, hence we can easily run
out of IRQs, especially if we are needlessly wasting them.
This changes the msi_bitmap_alloc_hwirqs() to allocate only the required number
of IRQs, hence avoiding this wastage. It keeps the natural alignment
requirement though.
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Freescale updates from Scott (27 commits):
"Highlights include DMA32 zone support (SATA, USB, etc now works on 64-bit
FSL kernels), MSI changes, 8xx optimizations and cleanup, t104x board
support, and PrPMC PCI enumeration."
Move MSI checks from arch_msi_check_device() to arch_setup_msi_irqs().
This makes the code more compact and allows removing
arch_msi_check_device() from generic MSI code.
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
On PowerNV platforms, when a CPU is offline, we put it into nap mode.
It's possible that the CPU wakes up from nap mode while it is still
offline due to a stray IPI. A misdirected device interrupt could also
potentially cause it to wake up. In that circumstance, we need to clear
the interrupt so that the CPU can go back to nap mode.
In the past the clearing of the interrupt was accomplished by briefly
enabling interrupts and allowing the normal interrupt handling code
(do_IRQ() etc.) to handle the interrupt. This has the problem that
this code calls irq_enter() and irq_exit(), which call functions such
as account_system_vtime() which use RCU internally. Use of RCU is not
permitted on offline CPUs and will trigger errors if RCU checking is
enabled.
To avoid calling into any generic code which might use RCU, we adopt
a different method of clearing interrupts on offline CPUs. Since we
are on the PowerNV platform, we know that the system interrupt
controller is a XICS being driven directly (i.e. not via hcalls) by
the kernel. Hence this adds a new icp_native_flush_interrupt()
function to the native-mode XICS driver and arranges to call that
when an offline CPU is woken from nap. This new function reads the
interrupt from the XICS. If it is an IPI, it clears the IPI; if it
is a device interrupt, it prints a warning and disables the source.
Then it does the end-of-interrupt processing for the interrupt.
The other thing that briefly enabling interrupts did was to check and
clear the irq_happened flag in this CPU's PACA. Therefore, after
flushing the interrupt from the XICS, we also clear all bits except
the PACA_IRQ_HARD_DIS (interrupts are hard disabled) bit from the
irq_happened flag. The PACA_IRQ_HARD_DIS flag is set by power7_nap()
and is left set to indicate that interrupts are hard disabled. This
means we then have to ignore that flag in power7_nap(), which is
reasonable since it doesn't indicate that any interrupt event needs
servicing.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
of_device_ids (i.e. compatible strings and the respective data) are not
supposed to change at runtime. All functions working with of_device_ids
provided by <linux/of.h> work with const of_device_ids. This allows to
mark all struct of_device_id const, too.
While touching these line also put the __init annotation at the right
position where necessary.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
This reverts commit c822e73731.
This commit conflicted with a bitmap allocator change that partially
accomplishes the same thing, but which does so more correctly. Revert
this one until it can be respun on top of the correct change.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Allocate msis such that each time a new interrupt is requested,
the SRS (MSIR register select) to be used is allocated in a
round-robin fashion.
The end result is that the msi interrupts will be spread across
distinct MSIRs with the main benefit that now users can set
affinity to each msi int through the mpic irq backing up the
MSIR register.
This is achieved with the help of a newly introduced msi bitmap
api that allows specifying the starting point when searching
for a free msi interrupt.
Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com>
Cc: Scott Wood <scottwood@freescale.com>
Cc: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Rename the irq controller associated with a MSI
interrupt to fsl-msi-<V>, where <V> is the virq
of the cascade irq backing up this MSI interrupt.
This way, one can set the affinity of a MSI
through the cascade irq associated with said MSI
interrupt.
Given this example /proc/interrupts snippet:
CPU0 CPU1 CPU2 CPU3
16: 0 0 0 0 OpenPIC 16 Edge mpic-error-int
17: 0 4 0 0 fsl-msi-224 0 Edge eth0-rx-0
18: 0 5 0 0 fsl-msi-225 1 Edge eth0-tx-0
19: 0 2 0 0 fsl-msi-226 2 Edge eth0
[...]
224: 0 11 0 0 OpenPIC 224 Edge fsl-msi-cascade
225: 0 0 0 0 OpenPIC 225 Edge fsl-msi-cascade
226: 0 0 0 0 OpenPIC 226 Edge fsl-msi-cascade
[...]
To change the affinity of MSI interrupt 17
(having the irq controller named "fsl-msi-224")
instead of writing /proc/irq/17/smp_affinity, use
the associated MSI cascade irq, in this case,
interrupt 224, e.g.:
echo 6 > /proc/irq/224/smp_affinity
Note that a MSI cascade irq covers several MSI
interrupts, so changing the affinity on the
cascade will impact all of the associated MSI
interrupts.
Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com>
Cc: Scott Wood <scottwood@freescale.com>
Cc: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
As we do for other fsl-mpic related cascaded irqchips
(e.g. error ints, mpic timers), use a normal irq handler
for msi irqs too.
This brings some advantages such as mask/unmask/ack/eoi
and irq state taken care behind the scenes, kstats
updates a.s.o plus access to features provided by mpic,
such as affinity.
Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com>
Cc: Scott Wood <scottwood@freescale.com>
Cc: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Store cascade_data in an array inside the driver
data for later use.
Get rid of the msi_virq array since now we can
encapsulate the virqs in the cascade_data
directly and access them through the array
mentioned earlier.
Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com>
Cc: Scott Wood <scottwood@freescale.com>
Cc: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
The following commit prevents the MPC8548E on the XPedite5200 PrPMC
module from enumerating its PCI/PCI-X bus:
powerpc/fsl-pci: use 'Header Type' to identify PCIE mode
The previous patch prevents any Freescale PCI-X bridge from enumerating
the bus, if it is hardware strapped into Agent mode.
In PCI-X, the Host is responsible for driving the PCI-X initialization
pattern to devices on the bus, so that they know whether to operate in
conventional PCI or PCI-X mode as well as what the bus timing will be.
For a PCI-X PrPMC, the pattern is driven by the mezzanine carrier it is
installed onto. Therefore, PrPMCs are PCI-X Agents, but one per system
may still enumerate the bus.
This patch causes the device node of any PCI/PCI-X bridge strapped into
Agent mode to be checked for the fsl,pci-agent-force-enum property. If
the property is present in the node, the bridge will be allowed to
enumerate the bus.
Cc: Minghuan Lian <Minghuan.Lian@freescale.com>
Signed-off-by: Aaron Sierra <asierra@xes-inc.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
The new MSI block in MPIC 4.3 added the MSIIR1 register,
with a different layout, in order to support 16 MSIR
registers. The msi binding was also updated so that
the "reg" reflects the newly introduced MSIIR1 register.
Virtual machines advertise these msi nodes by using the
compatible "fsl,vmpic-msi-v4.3" so add support for it.
Signed-off-by: Laurentiu Tudor <Laurentiu.Tudor@freescale.com>
Cc: Scott Wood <scottwood@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
Scott writes:
Highlights include e6500 hardware threading support, an e6500 TLB erratum
workaround, corenet error reporting, support for a new board, and some
minor fixes.
In commit ae91d60ba8, a bug was fixed that
involved converting !x & y to !(x & y). The code below shows the same
pattern, and thus should perhaps be fixed in the same way.
This is not tested and clearly changes the semantics, so it is only
something to consider.
The Coccinelle semantic patch that makes this change is as follows:
// <smpl>
@@ expression E1,E2; @@
(
!E1 & !E2
|
- !E1 & E2
+ !(E1 & E2)
)
// </smpl>
Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
mpic_msgrs has type struct mpic_msgr **, not struct mpic_msgr *, so the
elements of the array should have pointer type, not structure type.
The advantage of kcalloc is, that will prevent integer overflows which
could result from the multiplication of number of elements and size and
it is also a bit nicer to read.
The Coccinelle semantic patch that makes the first change is as follows:
// <smpl>
@disable sizeof_type_expr@
type T;
T **x;
@@
x =
<+...sizeof(
- T
+ *x
)...+>
// </smpl>
Signed-off-by: Himangi Saraogi <himangi774@gmail.com>
Acked-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Scott Wood <scottwood@freescale.com>
m8xx_pcmcia_ops was the only thing in this file (other than a comment
that describes a usage that doesn't match the file's contents); now
that m8xx_pcmcia_ops is gone, remove the empty file.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Cc: Pantelis Antoniou <pantelis.antoniou@gmail.com>
Cc: Vitaly Bordug <vitb@kernel.crashing.org>
Cc: netdev@vger.kernel.org
The DART table allocation is registered to kmemleak via the
memblock_alloc_base() call. However, the DART table is later unmapped
and dart_tablebase VA no longer accessible. This patch tells kmemleak
not to scan this block and avoid an unhandled paging request.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Pull more powerpc updates from Ben Herrenschmidt:
"Here are the remaining bits I was mentioning earlier. Mostly bug
fixes and new selftests from Michael (yay !). He also removed the WSP
platform and A2 core support which were dead before release, so less
clutter.
One little "feature" I snuck in is the doorbell IPI support for
non-virtualized P8 which speeds up IPIs significantly between threads
of a core"
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (34 commits)
powerpc/book3s: Fix some ABIv2 issues in machine check code
powerpc/book3s: Fix guest MC delivery mechanism to avoid soft lockups in guest.
powerpc/book3s: Increment the mce counter during machine_check_early call.
powerpc/book3s: Add stack overflow check in machine check handler.
powerpc/book3s: Fix machine check handling for unhandled errors
powerpc/eeh: Dump PE location code
powerpc/powernv: Enable POWER8 doorbell IPIs
powerpc/cpuidle: Only clear LPCR decrementer wakeup bit on fast sleep entry
powerpc/powernv: Fix killed EEH event
powerpc: fix typo 'CONFIG_PMAC'
powerpc: fix typo 'CONFIG_PPC_CPU'
powerpc/powernv: Don't escalate non-existing frozen PE
powerpc/eeh: Report frozen parent PE prior to child PE
powerpc/eeh: Clear frozen state for child PE
powerpc/powernv: Reduce panic timeout from 180s to 10s
powerpc/xmon: avoid format string leaking to printk
selftests/powerpc: Add tests of PMU EBBs
selftests/powerpc: Add support for skipping tests
selftests/powerpc: Put the test in a separate process group
selftests/powerpc: Fix instruction loop for ABIv2 (LE)
...
Pull networking updates from David Miller:
1) Seccomp BPF filters can now be JIT'd, from Alexei Starovoitov.
2) Multiqueue support in xen-netback and xen-netfront, from Andrew J
Benniston.
3) Allow tweaking of aggregation settings in cdc_ncm driver, from Bjørn
Mork.
4) BPF now has a "random" opcode, from Chema Gonzalez.
5) Add more BPF documentation and improve test framework, from Daniel
Borkmann.
6) Support TCP fastopen over ipv6, from Daniel Lee.
7) Add software TSO helper functions and use them to support software
TSO in mvneta and mv643xx_eth drivers. From Ezequiel Garcia.
8) Support software TSO in fec driver too, from Nimrod Andy.
9) Add Broadcom SYSTEMPORT driver, from Florian Fainelli.
10) Handle broadcasts more gracefully over macvlan when there are large
numbers of interfaces configured, from Herbert Xu.
11) Allow more control over fwmark used for non-socket based responses,
from Lorenzo Colitti.
12) Do TCP congestion window limiting based upon measurements, from Neal
Cardwell.
13) Support busy polling in SCTP, from Neal Horman.
14) Allow RSS key to be configured via ethtool, from Venkata Duvvuru.
15) Bridge promisc mode handling improvements from Vlad Yasevich.
16) Don't use inetpeer entries to implement ID generation any more, it
performs poorly, from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits)
rtnetlink: fix userspace API breakage for iproute2 < v3.9.0
tcp: fixing TLP's FIN recovery
net: fec: Add software TSO support
net: fec: Add Scatter/gather support
net: fec: Increase buffer descriptor entry number
net: fec: Factorize feature setting
net: fec: Enable IP header hardware checksum
net: fec: Factorize the .xmit transmit function
bridge: fix compile error when compiling without IPv6 support
bridge: fix smatch warning / potential null pointer dereference
via-rhine: fix full-duplex with autoneg disable
bnx2x: Enlarge the dorq threshold for VFs
bnx2x: Check for UNDI in uncommon branch
bnx2x: Fix 1G-baseT link
bnx2x: Fix link for KR with swapped polarity lane
sctp: Fix sk_ack_backlog wrap-around problem
net/core: Add VF link state control policy
net/fsl: xgmac_mdio is dependent on OF_MDIO
net/fsl: Make xgmac_mdio read error message useful
net_sched: drr: warn when qdisc is not work conserving
...
This patch enables POWER8 doorbell IPIs on powernv.
Since doorbells can only IPI within a core, we test to see when we can use
doorbells and if not we fall back to XICS. This also enables hypervisor
doorbells to wakeup us up from nap/sleep via the LPCR PECEDH bit.
Based on tests by Anton, the best case IPI latency between two threads dropped
from 894ns to 512ns.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Pull powerpc updates from Ben Herrenschmidt:
"Here is the bulk of the powerpc changes for this merge window. It got
a bit delayed in part because I wasn't paying attention, and in part
because I discovered I had a core PCI change without a PCI maintainer
ack in it. Bjorn eventually agreed it was ok to merge it though we'll
probably improve it later and I didn't want to rebase to add his ack.
There is going to be a bit more next week, essentially fixes that I
still want to sort through and test.
The biggest item this time is the support to build the ppc64 LE kernel
with our new v2 ABI. We previously supported v2 userspace but the
kernel itself was a tougher nut to crack. This is now sorted mostly
thanks to Anton and Rusty.
We also have a fairly big series from Cedric that add support for
64-bit LE zImage boot wrapper. This was made harder by the fact that
traditionally our zImage wrapper was always 32-bit, but our new LE
toolchains don't really support 32-bit anymore (it's somewhat there
but not really "supported") so we didn't want to rely on it. This
meant more churn that just endian fixes.
This brings some more LE bits as well, such as the ability to run in
LE mode without a hypervisor (ie. under OPAL firmware) by doing the
right OPAL call to reinitialize the CPU to take HV interrupts in the
right mode and the usual pile of endian fixes.
There's another series from Gavin adding EEH improvements (one day we
*will* have a release with less than 20 EEH patches, I promise!).
Another highlight is the support for the "Split core" functionality on
P8 by Michael. This allows a P8 core to be split into "sub cores" of
4 threads which allows the subcores to run different guests under KVM
(the HW still doesn't support a partition per thread).
And then the usual misc bits and fixes ..."
[ Further delayed by gmail deciding that BenH is a dirty spammer.
Google knows. ]
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (155 commits)
powerpc/powernv: Add missing include to LPC code
selftests/powerpc: Test the THP bug we fixed in the previous commit
powerpc/mm: Check paca psize is up to date for huge mappings
powerpc/powernv: Pass buffer size to OPAL validate flash call
powerpc/pseries: hcall functions are exported to modules, need _GLOBAL_TOC()
powerpc: Exported functions __clear_user and copy_page use r2 so need _GLOBAL_TOC()
powerpc/powernv: Set memory_block_size_bytes to 256MB
powerpc: Allow ppc_md platform hook to override memory_block_size_bytes
powerpc/powernv: Fix endian issues in memory error handling code
powerpc/eeh: Skip eeh sysfs when eeh is disabled
powerpc: 64bit sendfile is capped at 2GB
powerpc/powernv: Provide debugfs access to the LPC bus via OPAL
powerpc/serial: Use saner flags when creating legacy ports
powerpc: Add cpu family documentation
powerpc/xmon: Fix up xmon format strings
powerpc/powernv: Add calls to support little endian host
powerpc: Document sysfs DSCR interface
powerpc: Fix regression of per-CPU DSCR setting
powerpc: Split __SYSFS_SPRSETUP macro
arch: powerpc/fadump: Cleaning up inconsistent NULL checks
...
There is now a way to ensure all platform devices get a unique name when
populated from the device tree, and the DCR_NATIVE code path is broken
anyway. PowerPC Cell (PS3) is the only platform that actually uses this
path. Most likely nobody will notice if it is killed. Remove the code
and associated ugly #ifdef.
The user-visible impact of this patch is that any DCR device on Cell
will get a new name in the /sys/devices hierarchy.
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Besides other potential problems, if MPIC_NO_RESET is not set,
the error interrupt will be masked after it is requested.
Signed-off-by: Scott Wood <scottwood@freescale.com>