- Patrice Chotard contributed a new configuration debugfs interface
and reintroduced fine-grained locking into the core: instead of
having a "big pinctrl lock" we have a per-controller lock and
specialized locks for the global controller and pinctrl handle
lists.
- Haoijan Zhuang deleted all the PXA and MMP2 pinctrl drivers and
replaced them with pinctrl-single (which is also used by other SoCs)
so we are gaining consolidation. The platform particulars now come
in through the device tree.
- Haoijan also added support for generic pin config into the
pinctrl-single driver which is another big consolidation win.
- Finally also GPIO ranges are now supported by the pinctrl-single
driver.
- Tomasz Figa contributed a new Samsung S3C pinctrl driver, bringing
more of the older Samsung platforms under the pinctrl umbrella and
out of arch/arm.
- Maxime Ripard contributed new Allwinner A10/A13 drivers.
- Sachin Kamat, Wei Yongjun and Axel Lin did a lot of cleanups.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
iQIcBAABAgAGBQJRfmmSAAoJEEEQszewGV1z2BEQAKZ5RhNu+Rc0AoDxYlVWg6bf
GBepmdjHXvYSlFltIu7/ti0ttGGfU/F5j8eqGQDBZl9WnYu9WEglPp7EdBNn4/oo
fVtqwWH0tAXbk3BNn7tpufWQh0HpFnmkMqtqtfM9HqvRLnw7HISLYzSjBO41rTiw
+Cpx/FMogRK1ABKyWvoddXQnQ2ApJwLlgaJ95vkuaQe3i0PVz39e/GdKW4aM7Y6N
ksim0GXGLuOObdp42b1Ib5/MdHJKCQfwmtOd+d017uREXxR5u5y49oyjYRTeHcjk
qMqKUL3c+HNN1b39I9O2LoWuKAALN6WSOl+o+o+GXLHHe96i8nwp59f7Q7No2v3J
x1bCCfbJ1yoRfmdpA/P9DYFj7taiEhCfVKf34NJcw5ZhRk3soJ9BpyYqhPAlEdgD
pDjalTeMRJORo8ZuOpmR8f3sFNd6XnO84NWFoPOPiRQJYKYCUehY5bOVP8nfYTXJ
A9VWS0ZCFFLS6wU2kSU++gVg0sFToCbYCEP6fSnl9n48U56jXUgNt36+HL69bGw/
Du7WdHilNuMfGM3HhAf2dM7NlzSyIbUwagBJHPVa4iF/pDF5BeCdsP983INfomuR
SQv25JfOMGB1iIyC87j9+1Mzuc2ouFV/QXBsTtmiyYmIaWhf8SGeioyXGfqUiQmE
6sZKS27MYtAKQHmwvf5U
=CyMU
-----END PGP SIGNATURE-----
Merge tag 'pinctrl-for-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pinctrl update from Linus Walleij:
"These are the pinctrl changes for v3.10:
- Patrice Chotard contributed a new configuration debugfs interface
and reintroduced fine-grained locking into the core: instead of
having a "big pinctrl lock" we have a per-controller lock and
specialized locks for the global controller and pinctrl handle
lists.
- Haoijan Zhuang deleted all the PXA and MMP2 pinctrl drivers and
replaced them with pinctrl-single (which is also used by other
SoCs) so we are gaining consolidation. The platform particulars
now come in through the device tree.
- Haoijan also added support for generic pin config into the
pinctrl-single driver which is another big consolidation win.
- Finally also GPIO ranges are now supported by the pinctrl-single
driver.
- Tomasz Figa contributed a new Samsung S3C pinctrl driver, bringing
more of the older Samsung platforms under the pinctrl umbrella and
out of arch/arm.
- Maxime Ripard contributed new Allwinner A10/A13 drivers.
- Sachin Kamat, Wei Yongjun and Axel Lin did a lot of cleanups."
* tag 'pinctrl-for-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (66 commits)
pinctrl: move subsystem mutex to pinctrl_dev struct
pinctrl/pinconfig: fix misplaced goto
pinctrl: s3c64xx: Fix build error caused by undefined chained_irq_enter
pinctrl/pinconfig: add debug interface
pinctrl: abx500: fix issue when no pdata
pinctrl: pinctrl-single: add missing double quote
pinctrl: sunxi: Rename wemac functions to emac
pinctrl: exynos5440: add gpio interrupt support
pinctrl: exynos5440: fix probe failure due to missing pin-list in config nodes
pinctrl: ab8505: Staticize some symbols
pinctrl: ab8540: Staticize some symbols
pinctrl: ab9540: Staticize some symbols
pinctrl: ab8500: Staticize some symbols
pinctrl: abx500: Staticize some symbols
pinctrl: Add pinctrl-s3c64xx driver
pinctrl: samsung: Handle banks with two configuration registers
pinctrl: samsung: Remove hardcoded register offsets
pinctrl: samsung: Split pin bank description into two structures
pinctrl: samsung: Include pinctrl-exynos driver data conditionally
pinctrl: samsung: Protect bank registers with a spinlock
...
* use vm_iomap_memory() in various fb drivers to map the fb memory to userspace
* Cleanups for the videomode and display_timing features
* Updates to vt8500, wm8505 and auo-k190x fb drivers
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJRflTCAAoJEPo9qoy8lh71w+8P/RBU/twhd0S1aqPC1Lb7H9Tb
mfcIFzqpvIwItqw6dIe5c6WC3eggDpj0O74PnXtbxeQSYQOfoZgzayPkaYF2TmN6
5S3bVeHhQmybho3LZUjIclaBIF3mwNNmsVfBjfObCp6Gy9ziybDBiZhF0/GcUVGt
h/D2QlmCDM7QrheYehVhquiPI5+7oxHqGbl+eYMjjNL7+jIiCCeqrXuDQJqx6Gxk
Q88xicao8TCDLRniMrHAJv/e99x6cQ1Fa8grsbnFghLr6PITu088OZNNuYdUdIV3
WNfqTp7AGHHvja1evfj8kgTVwP/eeourkPIrRMFHNaHNOsTfu6HP//L5rynAQ5vI
EwIGbZNzjI11zeJAljgXL4Lqgo7mavV24DoHHliQm79OdWrXLm+l8AFfA2OF6BNf
10H4m60deX4O6ALfjvLMx89gxXcPfb6pdnnWoyJW3DLU2BHv3V+tMNLRA+g7L7LC
FE+Z23UCxSHk4S+WwtfrwHjPL+R260kOB9xkaal6ekc4utyg2KFqGyaOigWKGdCO
jZZJOXMuQQ1dZhFh3Yh9Hy/Sw0FI+40g7HhqdtENlqlJVa7LCt5PQEI6lGtiNkCA
MrH+5KnW2Pbut+DK8ONePhHEkc6PKGXxH3pD8IjRYZRufwBtEtmzaZk8kC0M2BHJ
p3fSEPwBuaz84fM20Du1
=L5rT
-----END PGP SIGNATURE-----
Merge tag 'fbdev-for-3.10' of git://gitorious.org/linux-omap-dss2/linux
Pull fbdev updates from Tomi Valkeinen:
- use vm_iomap_memory() in various fb drivers to map the fb memory to
userspace
- Cleanups for the videomode and display_timing features
- Updates to vt8500, wm8505 and auo-k190x fb drivers
* tag 'fbdev-for-3.10' of git://gitorious.org/linux-omap-dss2/linux: (36 commits)
fbdev: fix check for fb_mmap's mmio availability
fbdev: improve fb_mmap bounds checks
fbdev/ps3fb: use vm_iomap_memory()
fbdev/sgivwfb: use vm_iomap_memory()
fbdev/vermillion: use vm_iomap_memory()
fbdev/sa1100fb: use vm_iomap_memory()
fbdev/fb-puv3: use vm_iomap_memory()
fbdev/controlfb: use vm_iomap_memory()
fbdev/omapfb: use vm_iomap_memory()
video: vt8500: fix Kconfig for videomode
video/s3c: move platform_data out of arch/arm
video/exynos: remove unnecessary header inclusions
drivers/video: fsl-diu-fb: add hardware cursor support
drivers: video: use module_platform_driver_probe()
ARM: OMAP: remove "config FB_OMAP_CONSISTENT_DMA_SIZE"
video: wm8505fb: Convert to devm_ioremap_resource()
AUO-K190x: Add resolutions for portrait displays
AUO-K190x: add framebuffer rotation support
AUO-K190x: add a 16bit truecolor mode
AUO-K190x: make color handling more flexible
...
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJReu6gAAoJEKurIx+X31iB9aoQAItXGa7kGdtPXBBgnQVx7Q/T
B9LbO3qVAnEEQ7x73C8NUAiDfT8KKwDbHZDyMVqlyZSfckAtNcZikXGuWZfHIktc
7Xg1ogLEAPDiUCKLXV/zGMCwn08ew+QNycwJrxjSehXfYZFdj4JTCIDswJ1RUNCT
D0U+afhsWuZRH6yZOEFzap2ZwX+o2m4jSa0tHO0eiXAcOTTF8Nxy8mDZKBJ7kNrg
2WxYvNjTnNR3csRRmdRnQEKHtvI/snGRkMS4UoOKDN3+IZo8OsKNbWfFhysFoYVq
gGXSTI8lqG7j+POCQdp2k4XvmoOhVyll79Bo0bYOSMO0gkqUp86+2i/IWx7L/njS
4wbi5nMSzhX4t4GBpEo99NV76KLKaSwjAUVTWhbM7L40NFjSz3rXm5qfbh8Y+k/v
ngi41yx1nOkOYJCSRvjjFAUGnHNE/k/7aSGb+Nba13RnJlJ36tpnifszqxhOax7+
BmXn5S2lOLh1wrM/nBOKmbfrk/+N9QI5MhVoyA/Fg3SCzl96T7gVVEAfCfTmC2vQ
r/6/djRVfIhiLcHhdV3ev6TC41AynMAAuOuClU+lw2ZO8kOn9fHoKV9Mz2J1flQq
gJEfIuKToYbizDDXa+wf/k4G5GD77D7AKlhlZw6XcKJImCWktCzzlwOqwMIj27sj
F00m4RtrlFHUBTT+tvc2
=Y2HM
-----END PGP SIGNATURE-----
Merge tag 'please-pull-misc-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux
Pull ia64 fixes from Tony Luck:
"Bundle of miscellaneous ia64 fixes for 3.10 merge window."
* tag 'please-pull-misc-3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
Add size restriction to the kdump documentation
Fix example error_injection_tool
Fix build error for numa_clear_node() under IA64
Fix initialization of CMCI/CMCP interrupts
Change "select DMAR" to "select INTEL_IOMMU"
Wrong asm register contraints in the kvm implementation
Wrong asm register contraints in the futex implementation
Remove cast for kmalloc return value
Fix kexec oops when iosapic was removed
iosapic: fix a minor typo in comments
Add WB/UC check for early_ioremap
Fix broken fsys_getppid()
tiocx: check retval from bus_register()
Pull s390 update from Martin Schwidefsky:
"This is the first batch of s390 patches for the 3.10 merge window.
Included are some performance enhancements: storage key
initialization, zero page cache synonyms, system call micro
optimization and the speedup patches for dasdfmt. Sebastian managed
to get rid of the special casing for the console device in the cio
layer. And the usual bunch of bug fixes."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (59 commits)
s390/pci: use pci_scan_root_bus
s390/scm_blk: fix memleak in init function
s390/scm_blk: allow more cluster size values
s390/cio: fix irq statistics
s390/memory hotplug: prevent offline of active memory increments
s390: remove small stack config option
s390: system call path micro optimization
s390: lowcore stack pointer offsets
s390/uapi: change struct statfs[64] member types to unsigned values
s390/pci: return correct dma address for offset > PAGE_SIZE
s390/ptrace: remove empty ifdefs
s390/compat: remove ptrace compat definitions from uapi header file
s390/compat: fix compile error for !COMPAT
s390/compat: fix compat_sys_statfs() memory corruption
s390/zcore: Fix HSA copy length for last block
s390/mm,gmap: segment mapping race
s390/mm,gmap: implement gmap_translate()
s390/pci: remove disable_device implementation
s390/pci: disable per default
s390/pci: return error after failed pci ops
...
- Populate the boot_params with EDD data.
- Cleanups in the IRQ code.
Bug-fixes:
- CPU hotplug offline/online in PVHVM mode.
- Re-upload processor PM data after ACPI S3 suspend/resume cycle.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (GNU/Linux)
iQEcBAABAgAGBQJRcWFDAAoJEFjIrFwIi8fJQScIAJmnsPzoqd/USahqgbOvc0Nb
fJgzvno0Lfuvi5NipMpMDNCpCxRuikoJe6ocoF6E0+poucckRGDFSC3R2KVgLR2O
pXkO7bzggkUYkLehW7gXmM4IlvhLxnsfEofDWxG2VJy6RfWxFk+84v/uimQSIB0D
jI9FB3oUhfA+IT8/3Iofyv2OiPH39zNLlzidAnVXmJ8SoJFDLt3l1IymYqVdq6Lt
AKL0IXa31f+yfj9Vli1mkBsxuEIvIo5tVQNn250B4cMlR8mcWsQBnxIeA2M+8Jhl
d2ereTBwhjohCrFR/TWlXYrMlL+XVucI7J7agf3tF4kmDUqjg+c9hUkszSOPjEc=
=yJWr
-----END PGP SIGNATURE-----
Merge tag 'stable/for-linus-3.10-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull Xen updates from Konrad Rzeszutek Wilk:
"Features:
- Populate the boot_params with EDD data.
- Cleanups in the IRQ code.
Bug-fixes:
- CPU hotplug offline/online in PVHVM mode.
- Re-upload processor PM data after ACPI S3 suspend/resume cycle."
And Konrad gets a gold star for sending the pull request early when he
thought he'd be away for the first week of the merge window (but because
of 3.9 dragging out to -rc8 he then re-sent the reminder on the first
day of the merge window anyway)
* tag 'stable/for-linus-3.10-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen: resolve section mismatch warnings in xen-acpi-processor
xen: Re-upload processor PM data to hypervisor after S3 resume (v2)
xen/smp: Unifiy some of the PVs and PVHVM offline CPU path
xen/smp/pvhvm: Don't initialize IRQ_WORKER as we are using the native one.
xen/spinlock: Disable IRQ spinlock (PV) allocation on PVHVM
xen/spinlock: Check against default value of -1 for IRQ line.
xen/time: Add default value of -1 for IRQ and check for that.
xen/events: Check that IRQ value passed in is valid.
xen/time: Fix kasprintf splat when allocating timer%d IRQ line.
xen/smp/spinlock: Fix leakage of the spinlock interrupt line for every CPU online/offline
xen/smp: Fix leakage of timer interrupt line for every CPU online/offline.
xen kconfig: fix select INPUT_XEN_KBDDEV_FRONTEND
xen: drop tracking of IRQ vector
x86/xen: populate boot_params with EDD data
The linux/cpu.h header is no longer implictly included in this
file, so we need to an #include statement to avoid this build
warning:
arch/arm/mach-orion5x/common.c:339:3: error: implicit declaration of function 'cpu_idle_poll_ctrl' [-Werror=implicit-function-declaration]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Cc: Andrew Lunn <andrew@lunn.ch>
When building a kernel for multiple CPU architecture levels,
cpu_do_idle() is a macro for an indirect function call, which
cannot be called from assembly code as Tegra does.
Adding a trivial C wrapper for this function lets us build
a tegra kernel with ARMv6 support enabled.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Joseph Lo <josephl@nvidia.com>
Cc: Stephen Warren <swarren@nvidia.com>
The Kconfig entry for USB_EHCI_MSM unconditionally selects USB_MSM_OTG,
which is now only visible when USB_PHY is also enabled.
A dependency for this has been added in the USB tree, this adds the
now missing bit in the defconfig.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Use the new vsprintf extension to avoid any possible
message interleaving.
Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
kvm_target_cpus() checks the compatibility of the used CPU with
KVM, which is currently limited to ARM Cortex-A15 cores.
However by calling it only once on any random CPU it assumes that
all cores are the same, which is not necessarily the case (for example
in Big.Little).
[ I cut some of the commit message and changed the formatting of the
code slightly to pass checkpatch and look more like the rest of the
kvm/arm init code - Christoffer ]
Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
The CONFIG_KVM_ARM_MAX_VCPUS symbol is needed in order to build the
kernel/context_tracking.c code, which includes the vgic data structures
implictly through the kvm headers. Definining the symbol to zero
on builds without KVM resolves this build error:
In file included from include/linux/kvm_host.h:33:0,
from kernel/context_tracking.c:18:
arch/arm/include/asm/kvm_host.h:28:23: warning: "CONFIG_KVM_ARM_MAX_VCPUS" is not defined [-Wundef]
#define KVM_MAX_VCPUS CONFIG_KVM_ARM_MAX_VCPUS
^
arch/arm/include/asm/kvm_vgic.h:34:24: note: in expansion of macro 'KVM_MAX_VCPUS'
#define VGIC_MAX_CPUS KVM_MAX_VCPUS
^
arch/arm/include/asm/kvm_vgic.h:38:6: note: in expansion of macro 'VGIC_MAX_CPUS'
#if (VGIC_MAX_CPUS > 8)
^
In file included from arch/arm/include/asm/kvm_host.h:41:0,
from include/linux/kvm_host.h:33,
from kernel/context_tracking.c:18:
arch/arm/include/asm/kvm_vgic.h:59:11: error: 'CONFIG_KVM_ARM_MAX_VCPUS' undeclared here (not in a function)
} percpu[VGIC_MAX_CPUS];
^
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <cdall@cs.columbia.edu>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
We use the vfp_host pointer to store the host VFP context, should
the guest start using VFP itself.
Actually, we can use this pointer in a more generic way to store
CPU speficic data, and arm64 is using it to dump the whole host
state before switching to the guest.
Simply rename the vfp_host field to host_cpu_context, and the
corresponding type to kvm_cpu_context_t. No change in functionnality.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Most of the capabilities are common to both arm and arm64, but
we still need to handle the exceptions.
Introduce kvm_arch_dev_ioctl_check_extension, which both architectures
implement (in the 32bit case, it just returns 0).
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Now that we have the necessary infrastructure to boot a hotplugged CPU
at any point in time, wire a CPU notifier that will perform the HYP
init for the incoming CPU.
Note that this depends on the platform code and/or firmware to boot the
incoming CPU with HYP mode enabled and return to the kernel by following
the normal boot path (HYP stub installed).
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Our HYP init code suffers from two major design issues:
- it cannot support CPU hotplug, as we tear down the idmap very early
- it cannot perform a TLB invalidation when switching from init to
runtime mappings, as pages are manipulated from PL1 exclusively
The hotplug problem mandates that we keep two sets of page tables
(boot and runtime). The TLB problem mandates that we're able to
transition from one PGD to another while in HYP, invalidating the TLBs
in the process.
To be able to do this, we need to share a page between the two page
tables. A page that will have the same VA in both configurations. All we
need is a VA that has the following properties:
- This VA can't be used to represent a kernel mapping.
- This VA will not conflict with the physical address of the kernel text
The vectors page seems to satisfy this requirement:
- The kernel never maps anything else there
- The kernel text being copied at the beginning of the physical memory,
it is unlikely to use the last 64kB (I doubt we'll ever support KVM
on a system with something like 4MB of RAM, but patches are very
welcome).
Let's call this VA the trampoline VA.
Now, we map our init page at 3 locations:
- idmap in the boot pgd
- trampoline VA in the boot pgd
- trampoline VA in the runtime pgd
The init scenario is now the following:
- We jump in HYP with four parameters: boot HYP pgd, runtime HYP pgd,
runtime stack, runtime vectors
- Enable the MMU with the boot pgd
- Jump to a target into the trampoline page (remember, this is the same
physical page!)
- Now switch to the runtime pgd (same VA, and still the same physical
page!)
- Invalidate TLBs
- Set stack and vectors
- Profit! (or eret, if you only care about the code).
Note that we keep the boot mapping permanently (it is not strictly an
idmap anymore) to allow for CPU hotplug in later patches.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
There is no point in freeing HYP page tables differently from Stage-2.
They now have the same requirements, and should be dealt with the same way.
Promote unmap_stage2_range to be The One True Way, and get rid of a number
of nasty bugs in the process (good thing we never actually called free_hyp_pmds
before...).
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
We're about to move to an init procedure where we rely on the
fact that the init code fits in a single page. Make sure we
align the idmap text on a vector alignment, and that the code is
not bigger than a single page.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
After the HYP page table rework, it is pretty easy to let the KVM
code provide its own idmap, rather than expecting the kernel to
provide it. It takes actually less code to do so.
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
The current code for creating HYP mapping doesn't like to wrap
around zero, which prevents from mapping anything into the last
page of the virtual address space.
It doesn't take much effort to remove this limitation, making
the code more consistent with the rest of the kernel in the process.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
The way we populate HYP mappings is a bit convoluted, to say the least.
Passing a pointer around to keep track of the current PFN is quite
odd, and we end-up having two different PTE accessors for no good
reason.
Simplify the whole thing by unifying the two PTE accessors, passing
a pgprot_t around, and moving the various validity checks to the
upper layers.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
In clocksource/arm_arch_timer.h we define useful symbolic constants.
Let's use them to make the KVM arch_timer code clearer.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoffer Dall <cdall@cs.columbia.edu>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
In order to be able to correctly profile what is happening on the
host, we need to be able to identify when we're running on the guest,
and log these events differently.
Perf offers a simple way to register callbacks into KVM. Mimic what
x86 does and enjoy being able to profile your KVM host.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <cdall@cs.columbia.edu>
Merge in the gic cleanup since it has a handful of annoying internal conflicts
with soc development branches. All of them are delete/delete conflicts.
* gic/cleanup:
irqchip: vic: add include of linux/irq.h
irqchip: gic: Perform the gic_secondary_init() call via CPU notifier
irqchip: gic: Call handle_bad_irq() directly
arm: Move chained_irq_(enter|exit) to a generic file
arm: Move the set_handle_irq and handle_arch_irq declarations to asm/irq.h
Signed-off-by: Olof Johansson <olof@lixom.net>
Conflicts:
arch/arm/mach-shmobile/smp-emev2.c
arch/arm/mach-shmobile/smp-r8a7779.c
arch/arm/mach-shmobile/smp-sh73a0.c
arch/arm/mach-socfpga/platsmp.c
Merging in fixes since there's a conflict in the omap4 clock tables caused by
it.
* fixes: (245 commits)
ARM: highbank: fix cache flush ordering for cpu hotplug
ARM: OMAP4: hwmod data: make 'ocp2scp_usb_phy_phy_48m" as the main clock
arm: mvebu: Fix the irq map function in SMP mode
Fix GE0/GE1 init on ix2-200 as GE0 has no PHY
ARM: S3C24XX: Fix interrupt pending register offset of the EINT controller
ARM: S3C24XX: Correct NR_IRQS definition for s3c2440
ARM i.MX6: Fix ldb_di clock selection
ARM: imx: provide twd clock lookup from device tree
ARM: imx35 Bugfix admux clock
ARM: clk-imx35: Bugfix iomux clock
+ Linux 3.9-rc6
Signed-off-by: Olof Johansson <olof@lixom.net>
Conflicts:
arch/arm/mach-omap2/cclock44xx_data.c
While a nested run is pending, vmx_queue_exception is only called to
requeue exceptions that were previously picked up via
vmx_cancel_injection. Therefore, we must not check for PF interception
by L1, possibly causing a bogus nested vmexit.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
KVM guests today use 8bit APIC ids allowing for 256 ID's. Reserving one
ID for Broadcast interrupts should leave 255 ID's. In case of KVM there
is no need for reserving another ID for IO-APIC so the hard max limit for
VCPUS can be increased from 254 to 255. (This was confirmed by Gleb Natapov
http://article.gmane.org/gmane.comp.emulators.kvm.devel/99713 )
Signed-off-by: Chegu Vinod <chegu_vinod@hp.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
We hope to at some point deprecate KVM legacy device assignment in
favor of VFIO-based assignment. Towards that end, allow legacy
device assignment to be deconfigured.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
The VMX implementation of enable_irq_window raised
KVM_REQ_IMMEDIATE_EXIT after we checked it in vcpu_enter_guest. This
caused infinite loops on vmentry. Fix it by letting enable_irq_window
signal the need for an immediate exit via its return value and drop
KVM_REQ_IMMEDIATE_EXIT.
This issue only affects nested VMX scenarios.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
It is "exit_int_info". It is actually EXITINTINFO in the official docs
but we don't like screaming docs.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
* pm-cpufreq: (57 commits)
cpufreq: MAINTAINERS: Add co-maintainer
cpufreq: pxa2xx: initialize variables
ARM: S5pv210: compiling issue, ARM_S5PV210_CPUFREQ needs CONFIG_CPU_FREQ_TABLE=y
cpufreq: cpu0: Put cpu parent node after using it
cpufreq: ARM big LITTLE: Adapt to latest cpufreq updates
cpufreq: ARM big LITTLE: put DT nodes after using them
cpufreq: Don't call __cpufreq_governor() for drivers without target()
cpufreq: exynos5440: Protect OPP search calls with RCU lock
cpufreq: dbx500: Round to closest available freq
cpufreq: Call __cpufreq_governor() with correct policy->cpus mask
cpufreq / intel_pstate: Optimize intel_pstate_set_policy
cpufreq: OMAP: instantiate omap-cpufreq as a platform_driver
arm: exynos: Enable OPP library support for exynos5440
cpufreq: exynos: Remove error return even if no soc is found
cpufreq: exynos: Add cpufreq driver for exynos5440
cpufreq: AMD "frequency sensitivity feedback" powersave bias for ondemand governor
cpufreq: ondemand: allow custom powersave_bias_target handler to be registered
cpufreq: convert cpufreq_driver to using RCU
cpufreq: powerpc/platforms/cell: move cpufreq driver to drivers/cpufreq
cpufreq: sparc: move cpufreq driver to drivers/cpufreq
...
Conflicts:
MAINTAINERS (with commit a8e39c3 from pm-cpuidle)
drivers/cpufreq/cpufreq_governor.h (with commit beb0ff3)
* pm-cpuidle: (51 commits)
cpuidle: add maintainer entry
ARM: s3c64xx: cpuidle: use init/exit common routine
SH: cpuidle: use init/exit common routine
cpuidle: fix comment format
ARM: imx: cpuidle: use init/exit common routine
ARM: davinci: cpuidle: use init/exit common routine
ARM: kirkwood: cpuidle: use init/exit common routine
ARM: calxeda: cpuidle: use init/exit common routine
ARM: tegra: cpuidle: use init/exit common routine for tegra3
ARM: tegra: cpuidle: use init/exit common routine for tegra2
ARM: OMAP4: cpuidle: use init/exit common routine
ARM: shmobile: cpuidle: use init/exit common routine
ARM: tegra: cpuidle: use init/exit common routine
ARM: OMAP3: cpuidle: use init/exit common routine
ARM: at91: cpuidle: use init/exit common routine
ARM: ux500: cpuidle: use init/exit common routine
cpuidle: make a single register function for all
ARM: ux500: cpuidle: replace for_each_online_cpu by for_each_possible_cpu
cpuidle: remove en_core_tk_irqen flag
ARM: OMAP3: remove cpuidle_wrap_enter
...
A late-arriving fix for musb on OMAP4, resolving an issue where the musb
IP won't be clocked and thus not functional. Small in scope, most of the
lines changed is a longish comment.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJRfDlOAAoJEIwa5zzehBx3DQAQAJ/paiDUJBvjXc+UmoZm1L3X
mjBBNdGTJgN8MKnnQrDdPjjYalht42Avu3UXFJMggpETVbYD0IOApv/lvhj+BVbL
sdDDJ9+xr5ojz8h80yU+7lsflqbvJVaPVDdVVpfSHysXEWjnfggblF8HrFNOgLaB
q1BwnvEIs7kwCb1icUrUL0/NIAEFRHDACkssoME0QMLyfvE+L8kvgqiD61/i1LCL
70Vy73VkMzLevJlcI9AMPhip0D8HxNUGQZbwhN/6qcX+GiFY8IeVVru4i3ux261B
RlRrM8UOh6gHtmjzd8f8YOg4IG5ys4YEB9T7La2riEin91aay2ID1lGfaPlj5jZl
Y9HGpSO+CcbminBdnpPQR91iI5qpB3WXtkOoISr2f9kOPLofmk7SsK8lIfVBzqhD
HzalNoZfeFlfT64Zg2nK+ocIZSvCLr6YfcNs22jD1KfxYRMehJ3J8u620TDWZKRI
iW3SVVi4XDWvX7g7Ja6Grbs/d/LBOrTgnzDJGbrkgcC0sAovnLVmVP2XoQQdKX7M
z+qhJPbKwQZcnDzNp7VTwZwkt0ph7anxLxDGm6rNTgU/C3ECq59WmRL9UwLdakqu
6gnix3NyQ3V9x1DVp0OsDnDARJ/jq4qPg7tbv0eZtEVjYjNPa0OXidRGK6yIyCGo
puAufO6bkpFFZj8/Wfz3
=pHY5
-----END PGP SIGNATURE-----
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fix from Olof Johansson:
"A late-arriving fix for musb on OMAP4, resolving an issue where the
musb IP won't be clocked and thus not functional. Small in scope,
most of the lines changed is a longish comment."
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: OMAP4: hwmod data: make 'ocp2scp_usb_phy_phy_48m" as the main clock
This change makes the rtc on the exynos5250 and 5440 disabled by
default to match exynos4.
Ever since the common clock framework came in, exynos5250 boards
have dumped lots of warnings in the boot log. It turns out that
we don't see those on exynos4 since the rtc is disabled by default.
While we need to get to the bottom of the problems with the RTC,
it still makes sense to have the default state of the RTC on exynos
boards match.
For the record, warnings look like this:
------------[ cut here ]------------
WARNING: at drivers/clk/clk.c:771 __clk_enable+0x34/0xb0()
Modules linked in:
[<80015bfc>] (unwind_backtrace+0x0/0xec) from [<804717f0>] (dump_stack+0x20/0x24)
[<804717f0>] (dump_stack+0x20/0x24) from [<80023cd0>] (warn_slowpath_common+0x5c/0x7c)
[<80023cd0>] (warn_slowpath_common+0x5c/0x7c) from [<80023d1c>] (warn_slowpath_null+0x2c/0x34)
[<80023d1c>] (warn_slowpath_null+0x2c/0x34) from [<8035ddb0>] (__clk_enable+0x34/0xb0)
[<8035ddb0>] (__clk_enable+0x34/0xb0) from [<8035de54>] (clk_enable+0x28/0x3c)
[<8035de54>] (clk_enable+0x28/0x3c) from [<8031a160>] (s3c_rtc_probe+0xf4/0x434)
[<8031a160>] (s3c_rtc_probe+0xf4/0x434) from [<8028e288>] (platform_drv_probe+0x24/0x28)
[<8028e288>] (platform_drv_probe+0x24/0x28) from [<8028ce10>] (driver_probe_device+0xbc/0x22c)
[<8028ce10>] (driver_probe_device+0xbc/0x22c) from [<8028cff8>] (__driver_attach+0x78/0x9c)
[<8028cff8>] (__driver_attach+0x78/0x9c) from [<8028bdfc>] (bus_for_each_dev+0x64/0xac)
[<8028bdfc>] (bus_for_each_dev+0x64/0xac) from [<8028c7e0>] (driver_attach+0x28/0x30)
[<8028c7e0>] (driver_attach+0x28/0x30) from [<8028c43c>] (bus_add_driver+0xe0/0x234)
[<8028c43c>] (bus_add_driver+0xe0/0x234) from [<8028d55c>] (driver_register+0xac/0x13c)
[<8028d55c>] (driver_register+0xac/0x13c) from [<8028e4f4>] (platform_driver_register+0x54/0x68)
[<8028e4f4>] (platform_driver_register+0x54/0x68) from [<8065c944>] (s3c_rtc_driver_init+0x14/0x1c)
[<8065c944>] (s3c_rtc_driver_init+0x14/0x1c) from [<800086d8>] (do_one_initcall+0x60/0x138)
[<800086d8>] (do_one_initcall+0x60/0x138) from [<80633a8c>] (kernel_init_freeable+0x108/0x1d0)
[<80633a8c>] (kernel_init_freeable+0x108/0x1d0) from [<8046d2f8>] (kernel_init+0x1c/0xf4)
[<8046d2f8>] (kernel_init+0x1c/0xf4) from [<8000e358>] (ret_from_fork+0x14/0x20)
---[ end trace 4bcdc801c868d73f ]---
Signed-off-by: Doug Anderson <dianders@chromium.org>
Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
this MUSB no longer works on omap4 based devices.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJRdv6TAAoJEBvUPslcq6VzgtcP/jTfSTszSv0vkxivkeyvWN31
59TSAOTOgTxLS2MIncHhq61pAcvCe9xuM2a3tjrEB4KqOyOS+xcahD3RdVgSI75y
FmHGWC22d5jjza07OM4NbAXIUJ6xvTAPaLaNPeQ/hu1+p7MAaJdDVtz4crHgSRqs
8LMQeZE1j/B51xnBXuJ05sv+7uOZ63w47+mGlUDsGxpuOBYVfj1lN/305yvK5iWP
HNArH/5L0AgQSN50A6Ra+mGdC1PfzD6vStxtjWSftJp37J+ti504gvOzeCcSmYXz
oOJNp9nKcH8KuBFbY5pclevaadQKvPq5Prl0GIklf7df5bqQSHycagCsN76iRT5N
HawOGuzImFPn5IVeEWJBbBS5E7MK96WBOpHWCGmzKcgaLBAwwG4gJz8OYwyW+ofr
X3bhRKKd8DJjmFJV92ItGC/EncL5gOTbJsFQ4/DZoCwMtDA1A+rd2FIQf5rhGUe+
TLpQu0XKRs/abRXmSF+TRAI64evBTjSa8SMysSUBU5ptPFeZlRAhC3fiPV/NUvHD
gUGX5NUbLJ50Jk3B9EAinclPjK+PkoPLmrvTdyV+Fno8BHq5YLYpZHRg1qzvxrlo
JkluCnWYEZL7s331KjTME9BliLHbOQQw1kF/YSnecOktbI+N9u9t2eHnRgSjasBd
dSir3/pyCcKym5keyycW
=MBEd
-----END PGP SIGNATURE-----
Merge tag 'omap-for-v3.9-rc6/fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
From Tony Lindgren:
One MUSB regression fix that I forgot to send earlier. Without
this MUSB no longer works on omap4 based devices.
* tag 'omap-for-v3.9-rc6/fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: OMAP4: hwmod data: make 'ocp2scp_usb_phy_phy_48m" as the main clock
Signed-off-by: Olof Johansson <olof@lixom.net>
The Nomadik clocksource driver has had a bad define making it
impossible to use it for sched_clock() for a while. Fix this
and also enable it for the Nomadik.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
The UART1 is on the fast AHB bridge, not on the slow bus.
Cc: stable@vger.kernel.org
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
This merges in the revert of multiplatform support for exynos.
Trivial conflicts on removed code. Also, needed to add "select COMMON_CLK"
to the non-multiplatform EXYNOS config option.
* samsung/exynos-multiplatform:
Revert "ARM: exynos: enable multiplatform support"
Signed-off-by: Olof Johansson <olof@lixom.net>
This just merges in the revert of multiplatform support. Not doing it by
cherry-pick since we need the same revert in the next/drivers branch.
* samsung/exynos-multiplatform:
Revert "ARM: exynos: enable multiplatform support"
Signed-off-by: Olof Johansson <olof@lixom.net>
This reverts commit bd51de53e1.
Turns out that multiplatform breaks some uses cases, such as when you
have an existing defconfig, since it adds the new EXYNOS_SINGLE config
option as a dependecy. As a result, nearly all exynos config options
will be disabled by default.
Reverting instead of rebasing since this branch is pulled in as a
dependency elsewhere.
Signed-off-by: Olof Johansson <olof@lixom.net>
This adds the ability for userspace to save and restore the state
of the XICS interrupt presentation controllers (ICPs) via the
KVM_GET/SET_ONE_REG interface. Since there is one ICP per vcpu, we
simply define a new 64-bit register in the ONE_REG space for the ICP
state. The state includes the CPU priority setting, the pending IPI
priority, and the priority and source number of any pending external
interrupt.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
This adds support for the ibm,int-on and ibm,int-off RTAS calls to the
in-kernel XICS emulation and corrects the handling of the saved
priority by the ibm,set-xive RTAS call. With this, ibm,int-off sets
the specified interrupt's priority in its saved_priority field and
sets the priority to 0xff (the least favoured value). ibm,int-on
restores the saved_priority to the priority field, and ibm,set-xive
sets both the priority and the saved_priority to the specified
priority value.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
This streamlines our handling of external interrupts that come in
while we're in the guest. First, when waking up a hardware thread
that was napping, we split off the "napping due to H_CEDE" case
earlier, and use the code that handles an external interrupt (0x500)
in the guest to handle that too. Secondly, the code that handles
those external interrupts now checks if any other thread is exiting
to the host before bouncing an external interrupt to the guest, and
also checks that there is actually an external interrupt pending for
the guest before setting the LPCR MER bit (mediated external request).
This also makes sure that we clear the "ceded" flag when we handle a
wakeup from cede in real mode, and fixes a potential infinite loop
in kvmppc_run_vcpu() which can occur if we ever end up with the ceded
flag set but MSR[EE] off.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
This adds an implementation of the XICS hypercalls in real mode for HV
KVM, which allows us to avoid exiting the guest MMU context on all
threads for a variety of operations such as fetching a pending
interrupt, EOI of messages, IPIs, etc.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently, we wake up a CPU by sending a host IPI with
smp_send_reschedule() to thread 0 of that core, which will take all
threads out of the guest, and cause them to re-evaluate their
interrupt status on the way back in.
This adds a mechanism to differentiate real host IPIs from IPIs sent
by KVM for guest threads to poke each other, in order to target the
guest threads precisely when possible and avoid that global switch of
the core to host state.
We then use this new facility in the in-kernel XICS code.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
This adds in-kernel emulation of the XICS (eXternal Interrupt
Controller Specification) interrupt controller specified by PAPR, for
both HV and PR KVM guests.
The XICS emulation supports up to 1048560 interrupt sources.
Interrupt source numbers below 16 are reserved; 0 is used to mean no
interrupt and 2 is used for IPIs. Internally these are represented in
blocks of 1024, called ICS (interrupt controller source) entities, but
that is not visible to userspace.
Each vcpu gets one ICP (interrupt controller presentation) entity,
used to store the per-vcpu state such as vcpu priority, pending
interrupt state, IPI request, etc.
This does not include any API or any way to connect vcpus to their
ICP state; that will be added in later patches.
This is based on an initial implementation by Michael Ellerman
<michael@ellerman.id.au> reworked by Benjamin Herrenschmidt and
Paul Mackerras.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
[agraf: fix typo, add dependency on !KVM_MPIC]
Signed-off-by: Alexander Graf <agraf@suse.de>
For pseries machine emulation, in order to move the interrupt
controller code to the kernel, we need to intercept some RTAS
calls in the kernel itself. This adds an infrastructure to allow
in-kernel handlers to be registered for RTAS services by name.
A new ioctl, KVM_PPC_RTAS_DEFINE_TOKEN, then allows userspace to
associate token values with those service names. Then, when the
guest requests an RTAS service with one of those token values, it
will be handled by the relevant in-kernel handler rather than being
passed up to userspace as at present.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
[agraf: fix warning]
Signed-off-by: Alexander Graf <agraf@suse.de>
We no longer need to keep track of this now that MPIC destruction
always happens either during VM destruction (after MMIO has been
destroyed) or during a failed creation (before the fd has been exposed
to userspace, and thus before the MMIO region could have been
registered).
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
The hassle of getting refcounting right was greater than the hassle
of keeping a list of devices to destroy on VM exit.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
We changed a few things in non-ia64 code paths. This patch blindly applies
the changes to the ia64 code as well, hoping it proves useful in case anyone
revives the ia64 kvm code.
Signed-off-by: Alexander Graf <agraf@suse.de>
The code as is doesn't make any sense on non-e500 platforms. Restrict it
there, so that people don't get wrong ideas on what would actually work.
This patch should get reverted as soon as it's possible to either run e500
guests on non-e500 hosts or the MPIC emulation gains support for non-e500
modes.
Signed-off-by: Alexander Graf <agraf@suse.de>
Now that all pieces are in place for reusing generic irq infrastructure,
we can copy x86's implementation of KVM_IRQ_LINE irq injection and simply
reuse it for PPC, as it will work there just as well.
Signed-off-by: Alexander Graf <agraf@suse.de>
Now that all the irq routing and irqfd pieces are generic, we can expose
real irqchip support to all of KVM's internal helpers.
This allows us to use irqfd with the in-kernel MPIC.
Signed-off-by: Alexander Graf <agraf@suse.de>
Enabling this capability connects the vcpu to the designated in-kernel
MPIC. Using explicit connections between vcpus and irqchips allows
for flexibility, but the main benefit at the moment is that it
simplifies the code -- KVM doesn't need vm-global state to remember
which MPIC object is associated with this vm, and it doesn't need to
care about ordering between irqchip creation and vcpu creation.
Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: add stub functions for kvmppc_mpic_{dis,}connect_vcpu]
Signed-off-by: Alexander Graf <agraf@suse.de>
Hook the MPIC code up to the KVM interfaces, add locking, etc.
Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: add stub function for kvmppc_mpic_set_epr, non-booke, 64bit]
Signed-off-by: Alexander Graf <agraf@suse.de>
Remove braces that Linux style doesn't permit, remove space after
'*' that Lindent added, keep error/debug strings contiguous, etc.
Substitute type names, debug prints, etc.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Remove some parts of the code that are obviously QEMU or Raven specific
before fixing style issues, to reduce the style issues that need to be
fixed.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
This is QEMU's hw/openpic.c from commit
abd8d4a4d6dfea7ddea72f095f993e1de941614e ("Update version for
1.4.0-rc0"), run through Lindent with no other changes to ease merging
future changes between Linux and QEMU. Remaining style issues
(including those introduced by Lindent) will be fixed in a later patch.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Now that we have most irqfd code completely platform agnostic, let's move
irqfd's resample capability return to generic code as well.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
The current irq_comm.c file contains pieces of code that are generic
across different irqchip implementations, as well as code that is
fully IOAPIC specific.
Split the generic bits out into irqchip.c.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Quite a bit of code in KVM has been conditionalized on availability of
IOAPIC emulation. However, most of it is generically applicable to
platforms that don't have an IOPIC, but a different type of irq chip.
Make code that only relies on IRQ routing, not an APIC itself, on
CONFIG_HAVE_KVM_IRQ_ROUTING, so that we can reuse it later.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
The concept of routing interrupt lines to an irqchip is nothing
that is IOAPIC specific. Every irqchip has a maximum number of pins
that can be linked to irq lines.
So let's add a new define that allows us to reuse generic code for
non-IOAPIC platforms.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
At present, the KVM_GET_DIRTY_LOG ioctl doesn't report modifications
done by the host to the virtual processor areas (VPAs) and dispatch
trace logs (DTLs) registered by the guest. This is because those
modifications are done either in real mode or in the host kernel
context, and in neither case does the access go through the guest's
HPT, and thus no change (C) bit gets set in the guest's HPT.
However, the changes done by the host do need to be tracked so that
the modified pages get transferred when doing live migration. In
order to track these modifications, this adds a dirty flag to the
struct representing the VPA/DTL areas, and arranges to set the flag
when the VPA/DTL gets modified by the host. Then, when we are
collecting the dirty log, we also check the dirty flags for the
VPA and DTL for each vcpu and set the relevant bit in the dirty log
if necessary. Doing this also means we now need to keep track of
the guest physical address of the VPA/DTL areas.
So as not to lose track of modifications to a VPA/DTL area when it gets
unregistered, or when a new area gets registered in its place, we need
to transfer the dirty state to the rmap chain. This adds code to
kvmppc_unpin_guest_page() to do that if the area was dirty. To simplify
that code, we now require that all VPA, DTL and SLB shadow buffer areas
fit within a single host page. Guests already comply with this
requirement because pHyp requires that these areas not cross a 4k
boundary.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
At present, the code that determines whether a HPT entry has changed,
and thus needs to be sent to userspace when it is copying the HPT,
doesn't consider a hardware update to the reference and change bits
(R and C) in the HPT entries to constitute a change that needs to
be sent to userspace. This adds code to check for changes in R and C
when we are scanning the HPT to find changed entries, and adds code
to set the changed flag for the HPTE when we update the R and C bits
in the guest view of the HPTE.
Since we now need to set the HPTE changed flag in book3s_64_mmu_hv.c
as well as book3s_hv_rm_mmu.c, we move the note_hpte_modification()
function into kvm_book3s_64.h.
Current Linux guest kernels don't use the hardware updates of R and C
in the HPT, so this change won't affect them. Linux (or other) kernels
might in future want to use the R and C bits and have them correctly
transferred across when a guest is migrated, so it is better to correct
this deficiency.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Extend processor compatibility names to e6500 cores.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Embedded.Page Table (E.PT) category is not supported yet in e6500 kernel.
Configure TLBnCFG to remove E.PT and E.HV.LRAT categories from VCPUs.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
EPTCFG register defined by E.PT is accessed unconditionally by Linux guests
in the presence of MAV 2.0. Emulate it now.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Add support for TLBnPS registers available in MMU Architecture Version
(MAV) 2.0.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Vcpu's MMU default configuration and geometry update logic was buried in
a chunk of code. Move them to dedicated functions to add more clarity.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
MMU registers were exposed to user-space using sregs interface. Add them
to ONE_REG interface using kvmppc_get_one_reg/kvmppc_set_one_reg delegation
mechanism.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Refactor Book3E ONE_REG ioctl implementation to use kvmppc_get_one_reg/
kvmppc_set_one_reg delegation interface introduced by Book3S. This is
necessary for MMU SPRs which are platform specifics.
Get rid of useless case braces in the process.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
This allows the exit to user space if emulator request by returning
EMULATE_EXIT_USER. This will be used in subsequent patches in list
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently the instruction emulator code returns EMULATE_EXIT_USER
and common code initializes the "run->exit_reason = .." and
"vcpu->arch.hcall_needed = .." with one fixed reason.
But there can be different reasons when emulator need to exit
to user space. To support that the "run->exit_reason = .."
and "vcpu->arch.hcall_needed = .." initialization is moved a
level up to emulator.
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Instruction emulation return EMULATE_DO_PAPR when it requires
exit to userspace on book3s. Similar return is required
for booke. EMULATE_DO_PAPR reads out to be confusing so it is
renamed to EMULATE_EXIT_USER.
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
This patch defines the interface parameter for KVM_SET_GUEST_DEBUG
ioctl support. Follow up patches will use this for setting up
hardware breakpoints, watchpoints and software breakpoints.
Also kvm_arch_vcpu_ioctl_set_guest_debug() is brought one level below.
This is because I am not sure what is required for book3s. So this ioctl
behaviour will not change for book3s.
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Kernel can only access pages which maps as memory.
So flush only the valid kernel pages.
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Pull late parisc fixes from Helge Deller:
"I know it's *very* late in the 3.9 release cycle, but since there
aren't that many people testing the parisc linux kernel, a few (for
our port) critical issues just showed up a few days back for the first
time.
What's in it?
- add missing __ucmpdi2 symbol, which is required for btrfs on 32bit
kernel.
- change kunmap() macro to static inline function. This fixes a
debian/gcc-4.4 build error.
- add locking when doing PTE updates. This fixes random userspace
crashes.
- disable (optional) -mlong-calls compiler option for modules, else
modules can't be loaded at runtime.
- a smart patch by Will Deacon which fixes 64bit put_user() warnings
on 32bit kernel."
* 'fixes-3.9-late' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: use spin_lock_irqsave/spin_unlock_irqrestore for PTE updates
parisc: disable -mlong-calls compiler option for kernel modules
parisc: uaccess: fix compiler warnings caused by __put_user casting
parisc: Change kunmap macro to static inline function
parisc: Provide __ucmpdi2 to resolve undefined references in 32 bit builds.
This patch adds the necessary Kconfig entries to enable support for the
ARMv8 software model (Versatile Express platform) together with the
defconfig update.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
This patch adds the DTS files for the ARMv8 RTSM and Foundation models.
Signed-off-by: Pawel Moll <Pawel.Moll@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Remove the duplicated code and use the cpuidle common code for initialization.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Kukjin Kim <kgene.kim@samsung.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
On resume from CPU power down any trace hooks enabled in cpu_init()
will get called before that function has done set_my_cpu_offset(),
so any use of per-cpu variables by trace hook code will cause bad
things to happen. Prevent this by marking the function notrace.
This fixes lockups/crashes seen when enabling function tracer on TC2
with the not yet mainlined cpuidle driver.
Signed-off-by: Jon Medhurst <tixy@linaro.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Linus found, while extending integer type extension checks in the
sparse static code checker, various fragile patterns of mixed
signed/unsigned 64-bit/32-bit integer use in perf_events_p4.c.
The relevant hardware register ABI is 64 bit wide on 32-bit
kernels as well, so clean it all up a bit, remove unnecessary
casts, and make sure we use 64-bit unsigned integers in these
places.
[ Unfortunately this patch was not tested on real P4 hardware,
those are pretty rare already. If this patch causes any
problems on P4 hardware then please holler ... ]
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Miller <davem@davemloft.net>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20130424072630.GB1780@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The pci config space accessors on s390 are (now) smart enough to
figure out if a pci function is available. So instead of calling
pci_create_root_bus and then pci_scan_single_device for each
available function just call pci_scan_root_bus and let the pci core
do the scanning (via config reads on all possible functions) and
device creation.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
We've seen repeatedly that 8KB stack size on 64 bit kernels
is not sufficient.
So simply remove the config option.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Add a pointer to the system call table to the thread_info structure.
The TIF_31BIT bit is set or cleared by SET_PERSONALITY exactly once
for the lifetime of a process. With the pointer to the correct system
call table in thread_info the system call code in entry64.S path can
drop the check for TIF_31BIT which saves a couple of instructions.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Store the stack pointers in the lowcore for the kernel stack, the async
stack and the panic stack with the offset required for the first user.
This avoids an unnecessary add instruction on the system call path.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Provides basic enablement for perf branch stack sampling framework on
POWER8 processor based platforms. Adds new BHRB related elements into
cpu_hw_event structure to represent current BHRB config, BHRB filter
configuration, manage context and to hold output BHRB buffer during
PMU interrupt before passing to the user space. This also enables
processing of BHRB data and converts them into generic perf branch
stack data format.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch populates BHRB specific data for power_pmu structure. It
also implements POWER8 specific BHRB filter and configuration functions.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch adds couple of generic functions to power_pmu structure
which would configure the BHRB and it's filters. It also adds
representation of the number of BHRB entries present on the PMU.
A new PMU flag PPMU_BHRB would indicate presence of BHRB feature.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch adds the basic assembly code to read BHRB buffer. BHRB entries
are valid only after a PMU interrupt has happened (when MMCR0[PMAO]=1)
and BHRB has been freezed. BHRB read should not be attempted when it is
still enabled (MMCR0[PMAE]=1) and getting updated, as this can produce
non-deterministic results.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch adds new POWER8 instruction encoding for reading
and clearing Branch History Rolling Buffer entries. The new
instruction 'mfbhrbe' (move from branch history rolling buffer
entry) is used to read BHRB buffer entries and instruction
'clrbhrb' (clear branch history rolling buffer) is used to
clear the entire buffer. The instruction 'clrbhrb' has straight
forward encoding. But the instruction encoding format for
reading the BHRB entries is like 'mfbhrbe RT, BHRBE' where it
takes two arguments, i.e the index for the BHRB buffer entry to
read and a general purpose register to put the value which was
read from the buffer entry.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch adds support for the power8 PMU to perf.
Work is ongoing to add generic cache events.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On power8 we have a new SIER (Sampled Instruction Event Register), which
captures information about instructions when we have random sampling
enabled.
Add support for loading the SIER into pt_regs, overloading regs->dar.
Also set the new NO_SIPR flag in regs->result if we don't have SIPR.
Update regs_sihv/sipr() to look for SIPR/SIHV in SIER.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On power8 the presence or absence of SIPR depends on settings at runtime,
so convert to using a dynamic flag for NO_SIPR. Existing backends that
set NO_SIPR unconditionally set the dynamic flag obviously.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add an accessor for regs->result so we can use it to store more flags in
future.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On power8 the SIPR and SIHV are not in MMCRA, so convert the routines
to take regs and change the names accordingly.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
In perf_ip_adjust() we potentially use the MMCRA[SLOT] field to adjust
the reported IP of a sampled instruction.
Currently the logic is written so that if the backend does NOT have
the PPMU_ALT_SIPR flag set then we assume MMCRA[SLOT] exists.
However on power8 we do not want to set ALT_SIPR (it's in a third
location), and we also do not have MMCRA[SLOT].
So add a new flag which only indicates whether MMCRA[SLOT] exists.
Naively we'd set it on everything except power6/7, because they set
ALT_SIPR, and we've reversed the polarity of the flag. But it's more
complicated than that.
mpc7450 is 32-bit, and uses its own version of perf_ip_adjust()
which doesn't use MMCRA[SLOT], so it doesn't need the new flag set and
the behaviour is unchanged.
PPC970 (and I assume power4) don't have MMCRA[SLOT], so shouldn't have
the new flag set. This is a behaviour change on those cpus, though we
were probably getting lucky and the bits in question were 0.
power5 and power5+ set the new flag, behaviour unchanged.
power6 & power7 do not set the new flag, behaviour unchanged.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
For both HV and guest kernels, intialise PMU regs to something sane.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Ben found the root cause. Commit 37f02195be
("powerpc/pci: fix PCI-e devices rescan issue on powerpc platform")
overwrites the IOMMU table of PCI device while enabling PCI device.
The patch intends to fix the IOMMU table after that point.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch intends to build 32-bits DMA space for individual PEs on
PHB3. The TVE# is recognized by the combo of PE# and fixed bits
from DMA address, which is zero for 32-bits DMA space.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The TCE should be invalidated while it's created or free'd. The
approach to do that for IODA1 and IODA2 compliant PHBs are different.
So the patch differentiate them with different functions called to
do that for IODA1 and IODA2 compliant PHBs. It's notable that the
PCI address is used to invalidate the corresponding TCE on IODA2
compliant PHB3.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The EOI handler of MSI/MSI-X interrupts for P8 (PHB3) need additional
steps to handle the P/Q bits in IVE before EOIing the corresponding
interrupt. The patch changes the EOI handler to cover that. we have
individual IRQ chip in each PHB instance. During the MSI IRQ setup
time, the IRQ chip is copied over from the original one for that IRQ,
and the EOI handler is patched with the one that will handle the P/Q
bits (As Ben suggested).
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
As Michael Ellerman suggested, to add CONFIG_POWERNV_MSI for PowerNV
platform. That's similar to CONFIG_PSERIES_MSI for pSeries platform.
For now, we don't make it dependent on CONFIG_EEH since it's not ready
to enable that yet.
Apart from that, we also enable CONFIG_PPC_MSI_BITMAP on selecting
CONFIG_POWERNV_MSI.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The patch intends to initialize PHB3 during system boot stage. The
flag "PNV_PHB_MODEL_PHB3" is introduced to differentiate IODA2
compatible PHB3 from other types of PHBs.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Building a 64-bit powerpc kernel with PR KVM enabled currently gives
this error:
AS arch/powerpc/kernel/head_64.o
arch/powerpc/kernel/exceptions-64s.S: Assembler messages:
arch/powerpc/kernel/exceptions-64s.S:258: Error: attempt to move .org backwards
make[2]: *** [arch/powerpc/kernel/head_64.o] Error 1
This happens because the MASKABLE_EXCEPTION_PSERIES macro turns into
33 instructions, but we only have space for 32 at the decrementer
interrupt vector (from 0x900 to 0x980).
In the code generated by the MASKABLE_EXCEPTION_PSERIES macro, we
currently have two instances of the HMT_MEDIUM macro, which has the
effect of setting the SMT thread priority to medium. One is the
first instruction, and is overwritten by a no-op on processors where
we save the PPR (processor priority register), that is, POWER7 or
later. The other is after we have saved the PPR.
In order to reduce the code at 0x900 by one instruction, we omit the
first HMT_MEDIUM. On processors without SMT this will have no effect
since HMT_MEDIUM is a no-op there. On POWER5 and RS64 machines this
will mean that the first few instructions take a little longer in the
case where a decrementer interrupt occurs when the hardware thread is
running at low SMT priority. On POWER6 and later machines, the
hardware automatically boosts the thread priority when a decrementer
interrupt is taken if the thread priority was below medium, so this
change won't make any difference.
The alternative would be to branch out of line after saving the CFAR.
However, that would incur an extra overhead on all processors, whereas
the approach adopted here only adds overhead on older threaded processors.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
There are instances in which we do not want topology updates to occur.
In order to allow this a /proc interface (/proc/powerpc/topology_updates)
is introduced so that topology updates can be enabled and disabled.
This patch also adds a prrn_is_enabled() call so that PRRN events are
handled in the kernel only if topology updating is enabled.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The Linux kernel and platform firmware negotiate their mutual support
of the PRRN option via the ibm,client-architecture-support interface.
This patch simply sets the appropriate fields in the client architecture
vector to indicate Linux support for PRRN and will allow the firmware to
report PRRN events via the RTAS event-scan mechanism.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The new PRRN firmware feature provides a more convenient and event-driven
interface than VPHN for notifying Linux of changes to the NUMA affinity of
platform resources. However, for practical reasons, it may not be feasible
for some customers to update to the latest firmware. For these customers,
the VPHN feature supported on previous firmware versions may still be the
best option.
The VPHN feature was previously disabled due to races with the load
balancing code when accessing the NUMA cpu maps, but the new stop_machine()
approach protects the NUMA cpu maps from these concurrent accesses. It
should be safe to re-enable this feature now.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The following patch adds vdso_getcpu_init(), which stores the NUMA node for
a cpu in SPRG3:
Commit 18ad51dd34 ("powerpc: Add VDSO version of getcpu") adds
vdso_getcpu_init(), which stores the NUMA node for a cpu in SPRG3.
This patch ensures that this information is also updated when the NUMA
affinity of a cpu changes.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The new PRRN firmware feature allows CPU and memory resources to be
transparently reassigned across NUMA boundaries. When this happens, the
kernel must update the node maps to reflect the new affinity information.
Although the NUMA maps can be protected by locking primitives during the
update itself, this is insufficient to prevent concurrent accesses to these
structures. Since cpumask_of_node() hands out a pointer to these
structures, they can still be modified outside of the lock. Furthermore,
tracking down each usage of these pointers and adding locks would be quite
invasive and difficult to maintain.
The approach used is to make a list of affected cpus and call stop_machine
to have the update routine run on each of the affected cpus allowing them
to update themselves. Each cpu finds itself in the list of cpus and makes
the appropriate updates. We need to have each cpu do this for themselves to
handle calls to vdso_getcpu_init() added in a subsequent patch.
Situations like these are best handled using stop_machine(). Since the NUMA
affinity updates are exceptionally rare events, this approach has the
benefit of not adding any overhead while accessing the NUMA maps during
normal operation.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Platform events such as partition migration or the new PRRN firmware
feature can cause the NUMA characteristics of a CPU to change, and these
changes will be reflected in the device tree nodes for the affected
CPUs.
This patch registers a handler for Open Firmware device tree updates
and reconfigures the CPU and node maps whenever the associativity
changes. Currently, this is accomplished by marking the affected CPUs in
the cpu_associativity_changes_mask and allowing
arch_update_cpu_topology() to retrieve the new associativity information
using hcall_vphn().
Protecting the NUMA cpu maps from concurrent access during an update
operation will be addressed in a subsequent patch in this series.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Update the numa code to use the updated firmware_has_feature() when checking
for type 1 affinity.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The firmware_has_feature() function makes it easy to check for supported
features of the hypervisor. This patch extends the capability of
firmware_has_feature() to include checking for specified bits
in vector 5 of the architecture vector as reported in the device tree.
As part of this the #defines used for the architecture vector are re-defined
such that each option has the index into vector 5 and the feature bit encoded
into it. This makes checking for architecture bits when initiating data
for firmware_has_feature much easier.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
When iterating over the entries in firmware_features_table we only need
to go over the actual number of entries in the array instead of declaring
it to be bigger and checking to make sure there is a valid entry in every
slot.
This patch removes the FIRMWARE_MAX_FEATURES #define and replaces the
array looping with the use of ARRAY_SIZE().
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
As part of handling of PRRN events we need to check vector 5 of the
architecture vector bits reported in the device tree to ensure PRRN event
handling is enabled. To do this firmware_has_feature() is updated (in a
subsequent patch) to make this check vector 5 bits. To avoid having to
re-define bits in the architecture vector the bit definitions are moved
to prom.h.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
A PRRN event is signaled via the RTAS event-scan mechanism, which
returns a Hot Plug Event message "fixed part" indicating "Platform
Resource Reassignment". In response to the Hot Plug Event message,
we must call ibm,update-nodes to determine which resources were
reassigned and then ibm,update-properties to obtain the new affinity
information about those resources.
The PRRN event-scan RTAS message contains only the "fixed part" with
the "Type" field set to the value 160 and no Extended Event Log. The
four-byte Extended Event Log Length field is re-purposed (since no
Extended Event Log message is included) to pass the "scope" parameter
that causes the ibm,update-nodes to return the nodes affected by the
specific resource reassignment.
This patch adds a handler for RTAS events. The function
pseries_devicetree_update() (from mobility.c) is used to make the
ibm,update-nodes/ibm,update-properties RTAS calls. Updating the NUMA maps
(handled by a subsequent patch) will require significant processing,
so pseries_devicetree_update() is called from an asynchronous workqueue
to allow event processing to continue.
PRRN RTAS events on pseries systems are rare events that have to be
initiated from the HMC console for the system by an IBM tech. This allows
us to assume that these events are widely spaced. Additionally, all work
on the queue is flushed before handling any new work to ensure we only have
one event in flight being handled at a time.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Correct parsing of the buffer returned from ibm,update-properties. The first
element is a length and the path to the property which is slightly different
from the list of properties in the buffer so we need to specifically
handle this.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Newer firmware on Power systems can transparently reassign platform resources
(CPU and Memory) in use. For instance, if a processor or memory unit is
predicted to fail, the platform may transparently move the processing to an
equivalent unused processor or the memory state to an equivalent unused
memory unit. However, reassigning resources across NUMA boundaries may alter
the performance of the partition. When such reassignment is necessary, the
Platform Resource Reassignment Notification (PRRN) option provides a
mechanism to inform the Linux kernel of changes to the NUMA affinity of
its platform resources.
When rtasd receives a PRRN event, it needs to make a series of RTAS
calls (ibm,update-nodes and ibm,update-properties) to retrieve the
updated device tree information. These calls are already handled in the
pseries_devicetree_update() routine used in partition migration.
This patch exposes pseries_devicetree_update() to make it accessible
to other pseries routines, this patch also updates pseries_devicetree_update()
to take a 32-bit scope parameter. The scope value, which was previously hard
coded to 1 for partition migration, is used for the RTAS calls
ibm,update-nodes/properties to update the device tree.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
POWER8 allows us to take interrupts with the MMU on. This gives us a
second set of vectors offset at 0x4000.
Unfortunately when coping these vectors we missed checking for MSR HV
for hardware interrupts (0x500). This results in us trying to use
HSRR0/1 when HV=0, rather than SRR0/1 on HW IRQs
The below fixes this to check CPU_FTR_HVMODE when patching the code at
0x4500.
Also we remove the check for CPU_FTR_ARCH_206 since relocation on IRQs
are only available in arch 2.07 and beyond.
Thanks to benh for helping find this.
Signed-off-by: Michael Neuling <mikey@neuling.org>
CC: <stable@vger.kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
In __restore_cpu_power8 we determine if we are HV and if not, we return
before setting HV only resources.
Unfortunately we forgot to restore the link register from r11 before
returning.
This will happen on boot and with secondary CPUs not coming online.
This adds the missing link register restore.
Signed-off-by: Michael Neuling <mikey@neuling.org>
CC: <stable@vger.kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
In __after_prom_start we copy the kernel down to zero in two calls to
copy_and_flush. After the first call (copy from 0 to copy_to_here:)
we jump to the newly copied code soon after.
Unfortunately there's no isync between the copy of this code and the
jump to it. Hence it's possible that stale instructions could still be
in the icache or pipeline before we branch to it.
We've seen this on real machines and it's results in no console output
after:
calling quiesce...
returning from prom_init
The below adds an isync to ensure that the copy and flushing has
completed before any branching to the new instructions occurs.
Signed-off-by: Michael Neuling <mikey@neuling.org>
CC: <stable@vger.kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We are currently out of free bits in AT_HWCAP. With POWER8, we have
several hardware features that we need to advertise.
Tested on POWER and x86.
Signed-off-by: Michael Neuling <michael@neuling.org>
Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
on some Apple machines because they implement EFI spec 1.10, which
doesn't provide a QueryVariableInfo() runtime function and the logic
used to check for the existence of that function was insufficient.
Fix from Josh Boyer.
* The anti-bricking algorithm also introduced a compiler warning on
32-bit. Fix from Borislav Petkov.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJReOtLAAoJEC84WcCNIz1VFZgP/Aws1NdPo/RdyI6/oGkI7ZV4
+5O79pLcaJt7ESuWjx2/9pto/qTzsWMri40HZivGbgxw+ViEdprGjJUFqSTn1LyJ
QrYamP40jBdLFfh1oDHvsub8HiC72sjB/ILSoDvooHEniDmajrL6zZK7C66gP+na
Q4ZN/Jp3x3XAW0s1mVJC4VnL60489Q/ndR3SH01hr2gqMSvmjwnhfiio6n9gYvdd
egmoalTIst94+X0nW1VHA4HT3SRM7cuwCA/kDxtG6qitbsQMUKUoa+DOpMNfE8mD
QdzmzZL115O+7ORj8Ki/JNS2CSyI83IRSQ3kcM1J5026mWIBMiM3h9Vlu5NwAyFA
bapZSaYr7S5u9BU/vICGnpyYnSsLfjuB3CnAuJFyM0YVFjR6n7moUpnP1LNifGHX
E/Qr1HDyIwwxE8K0f/n86a7BfstoMjzE74an6wOVXKDUY/RnH+FdWG/HDBPd8iG4
Avei1bK2zLLcXK4Kqmx8EkXTK7VSFx6StCPjAVlpgYOAMpRmQEmNpd/3lF7Y70gp
yXIBTSTKaPZ+/5SaeOPL2sgW37Uo9fFMphww2mLXGIdgO3L0BHD5hIq9pZQ7g0VK
noDN7f6ViCuNYuZIrTAtLo9Oc+KKgqOXa0TovUhORkJ8Gk93moL4fgYyFVPvsYnD
rQuTRJ3pZEEHlCmyZzBl
=l/fT
-----END PGP SIGNATURE-----
Merge tag 'efi-urgent' into x86/urgent
* The EFI variable anti-bricking algorithm merged in -rc8 broke booting
on some Apple machines because they implement EFI spec 1.10, which
doesn't provide a QueryVariableInfo() runtime function and the logic
used to check for the existence of that function was insufficient.
Fix from Josh Boyer.
* The anti-bricking algorithm also introduced a compiler warning on
32-bit. Fix from Borislav Petkov.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
User applications running on SMP kernels have long suffered from instability
and random segmentation faults. This patch improves the situation although
there is more work to be done.
One of the problems is the various routines in pgtable.h that update page table
entries use different locking mechanisms, or no lock at all (set_pte_at). This
change modifies the routines to all use the same lock pa_dbit_lock. This lock
is used for dirty bit updates in the interruption code. The patch also purges
the TLB entries associated with the PTE to ensure that inconsistent values are
not used after the page table entry is updated. The UP and SMP code are now
identical.
The change also includes a minor update to the purge_tlb_entries function in
cache.c to improve its efficiency.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: Helge Deller <deller@gmx.de>
Signed-off-by: Helge Deller <deller@gmx.de>
CONFIG_MLONGCALLS was introduced in commit
ec758f9832 to overcome linker issues when linking
huge linux kernels, e.g. with many modules linked in.
But in the kernel module loader there is no support yet for the new relocation
types, which is why modules built with -mlong-calls can't be loaded.
Furthermore, for modules long calls are not really necessary, since we already
use stub sections which resolve long distance calls.
So, let's just disable this compiler option when compiling kernel modules.
Signed-off-by: Helge Deller <deller@gmx.de>
When targetting 32-bit processors, __put_user emits a pair of stw
instructions for the 8-byte case. If the type of __val is a pointer, the
marshalling code casts it to the wider integer type of u64, resulting
in the following compiler warnings:
kernel/signal.c: In function 'copy_siginfo_to_user':
kernel/signal.c:2752:11: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
kernel/signal.c:2752:11: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
[...]
This patch fixes the warnings by removing the marshalling code and using
the correct output modifiers in the __put_{user,kernel}_asm64 macros
so that GCC will allocate the right registers without the need to
extract the two words explicitly.
Cc: Helge Deller <deller@gmx.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Change kunmap macro to static inline function to fix build error
compiling drivers/base/dma-buf.c.
Without the change, the following error can occur:
CC drivers/base/dma-buf.o
drivers/base/dma-buf.c: In function 'dma_buf_kunmap':
drivers/base/dma-buf.c:427:46:
error: macro "kunmap" passed 3 arguments, but takes just 1
I believe parisc is the only arch to implement kunmap using a macro.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Signed-off-by: Helge Deller <deller@gmx.de>
The Debian experimental linux source package (3.8.5-1) build fails
with the following errors:
...
MODPOST 2016 modules
ERROR: "__ucmpdi2" [fs/btrfs/btrfs.ko] undefined!
ERROR: "__ucmpdi2" [drivers/md/dm-verity.ko] undefined!
The attached patch resolves this problem. It is based on the s390
implementation of ucmpdi2.c.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Signed-off-by: Helge Deller <deller@gmx.de>
The Kconfig entry for USB_OMAP unconditionally selects USB_ISP1301,
which is now only visible when USB_PHY is also enabled.
This adds an appropriate dependency and enables USB_PHY in the omap1
defconfig, avoiding these build warnings:
warning: (USB_OHCI_HCD && USB_OMAP) selects ISP1301_OMAP which has unmet direct dependencies (USB_SUPPORT && USB_PHY && I2C && ARCH_OMAP_OTG)
Also fix a Makefile typo while we're at it.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The Kconfig entry for USB_LPC32XX unconditionally selects USB_ISP1301,
which is now only visible when USB_PHY is also enabled.
This adds an appropriate dependency and enables USB_PHY in the msm
defconfig, avoiding these build errors:
warning: (USB_LPC32XX) selects USB_ISP1301 which has unmet direct dependencies (USB_SUPPORT && USB_PHY && (USB || USB_GADGET) && I2C)
drivers/built-in.o: In function `usb_hcd_nxp_probe':
drivers/usb/host/ohci-nxp.c:224: undefined reference to `isp1301_get_client'
drivers/built-in.o: In function `lpc32xx_udc_probe':
drivers/usb/gadget/lpc32xx_udc.c:3071: undefined reference to `isp1301_get_client'
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Roland Stigge <stigge@antcom.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently, if you pass the kernel a dtb where a cpu node has an
unsupported enable-method property (e.g. "not-psci"), it'll explode
horribly, as it iterates over the enable_ops array incorrectly. It
increments the pointer *at* the current element, rather than
incrementing the pointer *to* the current element. As the first two
elements pointed to structures that were contiguous in memory, this
happened to be equivalent. However the third element is NULL, so when
the list is exhausted, smp_get_enable_ops generates the wrong pointer,
and dereferences an arbitrary portion of memory, which currently happens
to contain zero.
This patch fixes this by indirecting the pointer one level, so we
iterate over the array elements correctly, avoiding the below panic:
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
show_pte makes use of the *_none_or_clear_bad style functions. If a
pgd, pud or pmd is identified as being bad, it will then be cleared.
As show_pte appears to be called from either the user or kernel
fault handlers this side effect can lead to unpredictable behaviour;
especially as TLB entries are not invalidated.
This patch removes the page table sanitisation from show_pte. If a
bad pgd, pud or pmd is encountered it is left unmodified.
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The compat_stat structure doesn't match the arch/arm/ struct stat
definition. This patch fixes the compat types and struct compat_stat
definition accordingly.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
The DSB following TLB or cache maintenance ops must be run on the same
CPU. With kernel preemption enabled or for user-space cache maintenance
this may not be the case. This patch adds an explicit DSB in the
__switch_to() function.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
For compiling with allmodconfig, need vga.h file, so generate it which
just only include the asm-generic one.
It is firstly used by drivers/gpu/drm/drm_irq.c.
The related error:
include/video/vga.h:22:21: fatal error: asm/vga.h: No such file or directory
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
For systems where the top 32-bits of the MPIDR are all zero, we should
allow the device-tree to specify an #address-size of 0x1 for the CPU reg
property and then zero extend the value there.
Without this patch, kvmtool breaks with the recent mpidr parsing code
introduced in 4c7aa00213 ("arm64: kernel: initialise cpu_logical_map
from the DT").
Acked-by: Javi Merino <javi.merino@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Patch adds AVX2/AES-NI/x86-64 implementation of Camellia cipher, requiring
32 parallel blocks for input (512 bytes). Compared to AVX implementation, this
version is extended to use the 256-bit wide YMM registers. For AES-NI
instructions data is split to two 128-bit registers and merged afterwards.
Even with this additional handling, performance should be higher compared
to the AES-NI/AVX implementation.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Patch adds AVX2/x86-64 implementation of Serpent cipher, requiring 16 parallel
blocks for input (256 bytes). Implementation is based on the AVX implementation
and extends to use the 256-bit wide YMM registers. Since serpent does not use
table look-ups, this implementation should be close to two times faster than
the AVX implementation.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Patch adds AVX2/x86-64 implementation of Twofish cipher, requiring 16 parallel
blocks for input (256 bytes). Table look-ups are performed using vpgatherdd
instruction directly from vector registers and thus should be faster than
earlier implementations. Implementation also uses 256-bit wide YMM registers,
which should give additional speed up compared to the AVX implementation.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Patch adds AVX2/x86-64 implementation of Blowfish cipher, requiring 32 parallel
blocks for input (256 bytes). Table look-ups are performed using vpgatherdd
instruction directly from vector registers and thus should be faster than
earlier implementations.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add more optimized XTS code for aesni_intel in 64-bit mode, for smaller stack
usage and boost for speed.
tcrypt results, with Intel i5-2450M:
256-bit key
enc dec
16B 0.98x 0.99x
64B 0.64x 0.63x
256B 1.29x 1.32x
1024B 1.54x 1.58x
8192B 1.57x 1.60x
512-bit key
enc dec
16B 0.98x 0.99x
64B 0.60x 0.59x
256B 1.24x 1.25x
1024B 1.39x 1.42x
8192B 1.38x 1.42x
I chose not to optimize smaller than block size of 256 bytes, since XTS is
practically always used with data blocks of size 512 bytes. This is why
performance is reduced in tcrypt for 64 byte long blocks.
Cc: Huang Ying <ying.huang@intel.com>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add more optimized XTS code for camellia-aesni-avx, for smaller stack usage
and small boost for speed.
tcrypt results, with Intel i5-2450M:
enc dec
16B 1.10x 1.01x
64B 0.82x 0.77x
256B 1.14x 1.10x
1024B 1.17x 1.16x
8192B 1.10x 1.11x
Since XTS is practically always used with data blocks of size 512 bytes or
more, I chose to not make use of camellia-2way for block sized smaller than
256 bytes. This causes slower result in tcrypt for 64 bytes.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Change cast6-avx to use the new XTS code, for smaller stack usage and small
boost to performance.
tcrypt results, with Intel i5-2450M:
enc dec
16B 1.01x 1.01x
64B 1.01x 1.00x
256B 1.09x 1.02x
1024B 1.08x 1.06x
8192B 1.08x 1.07x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Change twofish-avx to use the new XTS code, for smaller stack usage and small
boost to performance.
tcrypt results, with Intel i5-2450M:
enc dec
16B 1.03x 1.02x
64B 0.91x 0.91x
256B 1.10x 1.09x
1024B 1.12x 1.11x
8192B 1.12x 1.11x
Since XTS is practically always used with data blocks of size 512 bytes or
more, I chose to not make use of twofish-3way for block sized smaller than
128 bytes. This causes slower result in tcrypt for 64 bytes.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds AVX optimized XTS-mode helper functions/macros and converts
serpent-avx to use the new facilities. Benefits are slightly improved speed
and reduced stack usage as use of temporary IV-array is avoided.
tcrypt results, with Intel i5-2450M:
enc dec
16B 1.00x 1.00x
64B 1.00x 1.00x
256B 1.04x 1.06x
1024B 1.09x 1.09x
8192B 1.10x 1.09x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Occurs when CONFIG_CRYPTO_CRC32C_INTEL=y and CONFIG_CRYPTO_CRC32C_INTEL=y.
Older versions of bintuils do not support the pclmulqdq instruction. The
PCLMULQDQ gas macro is used instead.
Signed-off-by: Sandy Wu <sandyw@twitter.com>
Cc: stable@vger.kernel.org # 3.8+
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We added glue code and config options to create crypto
module that uses SSE/AVX/AVX2 optimized SHA512 x86_64 assembly routines.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Provides SHA512 x86_64 assembly routine optimized with SSE, AVX and
AVX2's RORX instructions. Speedup of 70% or more has been
measured over the generic implementation.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Provides SHA512 x86_64 assembly routine optimized with SSE and AVX instructions.
Speedup of 60% or more has been measured over the generic implementation.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Provides SHA512 x86_64 assembly routine optimized with SSSE3 instructions.
Speedup of 40% or more has been measured over the generic implementation.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
We added glue code and config options to create crypto
module that uses SSE/AVX/AVX2 optimized SHA256 x86_64 assembly routines.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
ARM processors with LPAE enabled use 3 levels of page tables, with an
entry in the top level (pgd) covering 1GB of virtual space. Because of
the branch relocation limitations on ARM, the loadable modules are
mapped 16MB below PAGE_OFFSET, making the corresponding 1GB pgd shared
between kernel modules and user space.
If free_pgtables() is called with the default ceiling 0,
free_pgd_range() (and subsequently called functions) also frees the page
table shared between user space and kernel modules (which is normally
handled by the ARM-specific pgd_free() function). This patch changes
defines the ARM USER_PGTABLES_CEILING to TASK_SIZE when CONFIG_ARM_LPAE
is enabled.
Note that the pgd_free() function already checks the presence of the
shared pmd page allocated by pgd_alloc() and frees it, though with
ceiling 0 this wasn't necessary.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@vger.kernel.org> # 3.3+
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit 93dc688 (ARM: 7684/1: errata: Workaround for Cortex-A15 erratum
798181 (TLBI/DSB operations)) introduces calls to smp_processor_id() and
smp_call_function_many() with preemption enabled. This patch disables
preemption and also optimises the smp_processor_id() call in
broadcast_tlb_mm_a15_erratum(). The broadcast_tlb_a15_erratum() function
is changed to use smp_call_function() which disables preemption.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Geoff Levand <geoff@infradead.org>
Reported-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The Meta defconfigs set the log buffer size to just 8KiB, but with the
fairly recent conversion of the kernel log buffer into a structured
binary format, log messages appear to consume more space in the buffer,
and in some cases it's not big enough to store the entire boot log.
Therefore switch all the defconfigs to use the default size of 128KiB.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
arch/x86/kernel/setup.c includes <asm/dmi.h> but it doesn't look
like it needs it, <linux/dmi.h> is sufficient.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Link: http://lkml.kernel.org/r/1366881845.4186.65.camel@chaos.site
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Source operand for one byte mov[zs]x is decoded incorrectly if it is in
high byte register. Fix that.
Cc: stable@vger.kernel.org
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Pull sparc fix from David Miller:
"Brown paper bag fix for sparc64"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Fix missing put_cpu_var() in tlb_batch_add_one() when not batching.
Currently the syscall number is passed, but it should be the return
value, which is kept in r0.
Signed-off-by: Simon Marchi <simon.marchi@polymtl.ca>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> [using a raw 0 value]
We need to check the runtime sys_table for the EFI version the firmware
specifies instead of just checking for a NULL QueryVariableInfo. Older
implementations of EFI don't have QueryVariableInfo but the runtime is
a smaller structure, so the pointer to it may be pointing off into garbage.
This is apparently the case with several Apple firmwares that support EFI
1.10, and the current check causes them to no longer boot. Fix based on
a suggestion from Matthew Garrett.
Signed-off-by: Josh Boyer <jwboyer@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Fix typo in printk and comments within various drivers.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Now that the cluster power API is in place, we can use it for SMP secondary
bringup and CPU hotplug in a generic fashion.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Instead of requiring the first man to be elected in advance (which
can be suboptimal in some situations), this patch uses a per-
cluster mutex to co-ordinate selection of the first man.
This should also make it more feasible to reuse this code path for
asynchronous cluster resume (as in CPUidle scenarios).
We must ensure that the vlock data doesn't share a cacheline with
anything else, or dirty cache eviction could corrupt it.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
This patch adds a simple low-level voting mutex implementation
to be used to arbitrate during first man selection when no load/store
exclusive instructions are usable.
For want of a better name, these are called "vlocks". (I was
tempted to call them ballot locks, but "block" is way too confusing
an abbreviation...)
There is no function to wait for the lock to be released, and no
vlock_lock() function since we don't need these at the moment.
These could straightforwardly be added if vlocks get used for other
purposes.
For architectural correctness even Strongly-Ordered memory accesses
require barriers in order to guarantee that multiple CPUs have a
coherent view of the ordering of memory accesses. Whether or not
this matters depends on hardware implementation details of the
memory system. Since the purpose of this code is to provide a clean,
generic locking mechanism with no platform-specific dependencies the
barriers should be present to avoid unpleasant surprises on future
platforms.
Note:
* When taking the lock, we don't care about implicit background
memory operations and other signalling which may be pending,
because those are not part of the critical section anyway.
A DMB is sufficient to ensure correctly observed ordering if
the explicit memory accesses in vlock_trylock.
* No barrier is required after checking the election result,
because the result is determined by the store to
VLOCK_OWNER_OFFSET and is already globally observed due to the
barriers in voting_end. This means that global agreement on
the winner is guaranteed, even before the winner is known
locally.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Nicolas Pitre <nicolas.pitre@linaro.org>
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
This provides helper methods to coordinate between CPUs coming down
and CPUs going up, as well as documentation on the used algorithms,
so that cluster teardown and setup
operations are not done for a cluster simultaneously.
For use in the power_down() implementation:
* __mcpm_cpu_going_down(unsigned int cluster, unsigned int cpu)
* __mcpm_outbound_enter_critical(unsigned int cluster)
* __mcpm_outbound_leave_critical(unsigned int cluster)
* __mcpm_cpu_down(unsigned int cluster, unsigned int cpu)
The power_up_setup() helper should do platform-specific setup in
preparation for turning the CPU on, such as invalidating local caches
or entering coherency. It must be assembler for now, since it must
run before the MMU can be switched on. It is passed the affinity level
for which initialization should be performed.
Because the mcpm_sync_struct content is looked-up and modified
with the cache enabled or disabled depending on the code path, it is
crucial to always ensure proper cache maintenance to update main memory
right away. The sync_cache_*() helpers are used to that end.
Also, in order to prevent a cached writer from interfering with an
adjacent non-cached writer, we ensure each state variable is located to
a separate cache line.
Thanks to Nicolas Pitre and Achin Gupta for the help with this
patch.
Signed-off-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
This is the basic API used to handle the powering up/down of individual
CPUs in a (multi-)cluster system. The platform specific backend
implementation has the responsibility to also handle the cluster level
power as well when the first/last CPU in a cluster is brought up/down.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
CPUs in cluster based systems, such as big.LITTLE, have special needs
when entering the kernel due to a hotplug event, or when resuming from
a deep sleep mode.
This is vectorized so multiple CPUs can enter the kernel in parallel
without serialization.
The mcpm prefix stands for "multi cluster power management", however
this is usable on single cluster systems as well. Only the basic
structure is introduced here. This will be extended with later patches.
In order not to complexify things more than they currently have to,
the planned work to make runtime adjusted MPIDR based indexing and
dynamic memory allocation for cluster states is postponed to a later
cycle. The MAX_NR_CLUSTERS and MAX_CPUS_PER_CLUSTER static definitions
should be sufficient for those systems expected to be available in the
near future.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Algorithms used by the MCPM layer rely on state variables which are
accessed while the cache is either active or inactive, depending
on the code path and the active state.
This patch introduces generic cache maintenance helpers to provide the
necessary cache synchronization for such state variables to always hit
main memory in an ordered way.
Signed-off-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Dave Martin <dave.martin@linaro.org>
Remove the duplicated code and use the cpuidle common code for initialization.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Fix this:
arch/x86/boot/compressed/eboot.c: In function ‘setup_efi_vars’:
arch/x86/boot/compressed/eboot.c:269:2: warning: passing argument 1 of ‘efi_call_phys’ makes pointer from integer without a cast [enabled by default]
In file included from arch/x86/boot/compressed/eboot.c:12:0:
/w/kernel/linux/arch/x86/include/asm/efi.h:8:33: note: expected ‘void *’ but argument is of type ‘long unsigned int’
after cc5a080c5d ("efi: Pass boot services variable info to runtime
code").
Reported-by: Paul Bolle <pebolle@tiscali.nl>
Cc: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Wenyou Yang <wenyou.yang@atmel.com>
Tested-by: Richard Genoud <richard.genoud@gmail.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
[wenyou.yang@atmel.com: added spi nodes for the sam9263ek, sam9g20ek, sam9m10g45ek and sam9n12ek boards]
[wenyou.yang@atmel.com: submit the patch]
Signed-off-by: Wenyou Yang <wenyou.yang@atmel.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
[wenyou.yang@atmel.com: add spi nodes for sam9260, sam9263, sam9g45 and sam9n12]
[wenyou.yang@atmel.com: remove spi property "cs-gpios" to the board dts files]
[wenyou.yang@atmel.com: submit the patch]
Signed-off-by: Wenyou Yang <wenyou.yang@atmel.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
[<wenyou.yang@atmel.com: declare the spi clocks for sam9260, at91sam9g45, and at91sam9n12]
[wenyou.yang@atmel.com: submit the patch]
Signed-off-by: Wenyou Yang <wenyou.yang@atmel.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
In commit 85fe402 (fs: do not assign default i_ino in new_inode), the
initialisation of i_ino was removed from new_inode() and pushed down
into the callers. However spufs_new_inode() was not updated.
This exhibits as no files appearing in /spu, because all our dirents
have a zero inode, which readdir() seems to dislike.
Cc: stable@vger.kernel.org
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add new return code to rtas_flash to indicate firmware entitlement
expiry. Strictly we don't need this update. But to keep it in sync
with PAPR, this was added.
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Add proper comment to ibm,validate-flash-image RTAS call
update result tokens.
Note: Only comment section is modified, no code change.
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
None of the cell platforms support CPU hotplug, so we should iterate
only over online nodes when setting PMU interrupts.
This also fixes a warning during boot when NODES_SHIFT is large enough:
WARNING: at /scratch/michael/src/kmk/linus/kernel/irq/irqdomain.c:766
...
NIP [c0000000000db278] .irq_linear_revmap+0x30/0x58
LR [c0000000000dc2a0] .irq_create_mapping+0x38/0x1a8
Call Trace:
[c0000003fc9c3af0] [c0000000000dc2a0] .irq_create_mapping+0x38/0x1a8 (unreliable)
[c0000003fc9c3b80] [c000000000655c1c] .__machine_initcall_cell_cbe_init_pm_irq+0x84/0x158
[c0000003fc9c3c20] [c00000000000afb4] .do_one_initcall+0x5c/0x1e0
[c0000003fc9c3cd0] [c000000000644580] .kernel_init_freeable+0x238/0x328
[c0000003fc9c3db0] [c00000000000b784] .kernel_init+0x1c/0x120
[c0000003fc9c3e30] [c000000000009fb8] .ret_from_kernel_thread+0x64/0xac
This is caused by us overflowing our linear revmap because we're
requesting too many interrupts.
Reported-by: Dennis Schridde <devurandom@gmx.net>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CONFIG_ARCH_OMAP2PLUS depends on (ARCH_MULTI_V6 || ARCH_MULTI_V7) as of
a0694861 "ARM: OMAP2+: Enable ARCH_MULTIPLATFORM support", but the
individual OMAP2/3/4/5 and AM33XX platforms can all be selected independent
of what we are building for, which is a bug and prevents us from easily
building e.g. an ARMv7-only defconfig.
This makes ARCH_OMAP2 depend on ARCH_MULTI_V6 and the others depend on
ARCH_MULTI_V7, to ensure we really only build the platforms for the
CPUs we have enabled in the global multiplatform configuration step.
Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Here are some late urgent fixes for v3.10 merge window.
All of these errors were introduced by recent commits
which are in linux-next.
f_obex, multi and cdc2 gadget drivers have learned to
return a proper error code when something goes wrong.
usb_bind_phy() was mistakenly placed into .init.text
section which caused Section mismatch warnings and undefined
reference compile errors.
f_source_sink had a copy-paste error which is now corrected.
g_zero got a memory leak plugged.
Two defconfigs got fixed to enable the newly introduced
CONFIG_USB_PHY.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJRdnDhAAoJEIaOsuA1yqRE2UoP/RakEoDFlC58D1FtiKYtedT+
/dyq6BA0riqLcOXbLKtvgG58G7KZGKzMWxCvsGALrbUSWRjRptN7Z7okx0onJgPr
Wfc6tgJwEQdBaH25FYWDJSZ+zoB7l+LYuEO5XstFE6A26Fa4xKudEf6OFr5/odgm
DRxwV0dOSsNzw9zjrDoTnEfLJe8aolGkQNOf/1FUCpLNGFRBt9qg1uMOs5P7bKdp
M544LVQSXjwWlqW1oyysUXBbjQtCK2QvELqa+3EK8yMCQQZ27lJzdgHml02NbyfE
ktoNB8w9hv4xxUIICgWcWQGPpx/1keE4K0xIycRpCTfQobNOcLv1gzxlZBmaXKQE
uBUnXqZ52DgSRMEBmqOOuZqvy5fYOX2e5WRlZNlXAYtnHWMTVf86SwHSr1EWhP7M
JLv2dvukagfkg6vlim4pkLEesLEqdzDK3aUMHCYZTQSTATUrV81CAeXZp1ZGVqCw
y8jF+UPGTCfnar9V9sGBWmye8Aj5Y1XMacWxsrUNiWXgI4kUTyFK3wwGKbeCihyI
z8uZnkTSTUck95S67UBol2dnAyJaEi5N+ttEW8TTyWmCkXmbCORT7wj8CQ6RAMnF
3cLRhU1BKqENaRaBYvVPH28kNSmjNxHQFoxesfDVmzzqUoEvVnzRP+NqLi8y3aSl
5HO3rg8pSc2j056sLLg9
=UKZz
-----END PGP SIGNATURE-----
Merge tag 'usb-for-v3.10-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-next
Felipe writes:
usb: urgent fixes for v3.10 merge window
Here are some late urgent fixes for v3.10 merge window.
All of these errors were introduced by recent commits
which are in linux-next.
f_obex, multi and cdc2 gadget drivers have learned to
return a proper error code when something goes wrong.
usb_bind_phy() was mistakenly placed into .init.text
section which caused Section mismatch warnings and undefined
reference compile errors.
f_source_sink had a copy-paste error which is now corrected.
g_zero got a memory leak plugged.
Two defconfigs got fixed to enable the newly introduced
CONFIG_USB_PHY.
The bcm_kona_smc_init function references the bcm_kona_smc_ids variable
that is marked __initconst, so the function itself has to be __init
to avoid this build error:
WARNING: arch/arm/mach-bcm/built-in.o(.text+0x12c): Section mismatch in reference from the function bcm_kona_smc_init() to the (unknown reference) .init.rodata:(unknown)
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Daudt <csd@broadcom.com>
The code intializes the cpuidle driver at different places.
The cpuidle driver for :
* imx5 : is in the pm-imx5.c, the init function is in cpuidle.c
* imx6 : is in cpuidle-imx6q.c, the init function is in cpuidle.c
and cpuidle-imx6q.c
Instead of having the cpuidle code spread across different files,
let's create a driver for each SoC and use the common register function.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove the duplicated code and use the cpuidle common code for initialization.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove the duplicated code and use the cpuidle common code for initialization.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove the duplicated code and use the cpuidle common code for initialization.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove the duplicated code and use the cpuidle common code for initialization.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Kevin Hilman <khilman@linaro.org>
Tested-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove the duplicated code and use the cpuidle common code for initialization.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove the duplicated code and use the cpuidle common code for initialization.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove the duplicated code and use the cpuidle common code for initialization.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Reviewed-by: Kevin Hilman <khilman@linaro.org>
Tested-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove the duplicated code and use the cpuidle common code for initialization.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove the duplicate code and use the cpuidle common code for initialization.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
All the drivers are using, in their initialization function, the
for_each_possible_cpu macro.
Using for_each_online_cpu means the driver must handle the initialization
of the cpuidle device when a cpu is up which is not the case here.
Change the macro to for_each_possible_cpu as that fix the hotplug
initialization and make the initialization routine consistent with the
rest of the code in the different drivers.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The en_core_tk_irqen flag is set in all the cpuidle driver which
means it is not necessary to specify this flag.
Remove the flag and the code related to it.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Kevin Hilman <khilman@linaro.org> # for mach-omap2/*
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit edc7cb2 (usb: phy: make it a menuconfig) makes USB_MXS_PHY
be a sub-item of menuconfig symbol USB_PHY. This change gets the
selection of CONFIG_USB_MXS_PHY in mxs_defconfig lost. Hence the
boot stops at the point below.
[ 1.600867] ci_hdrc ci_hdrc.0: doesn't support gadget
[ 1.606282] ci_hdrc ci_hdrc.0: EHCI Host Controller
[ 1.613522] ci_hdrc ci_hdrc.0: new USB bus registered, assigned bus number 1
Add CONFIG_USB_PHY to have the CONFIG_USB_MXS_PHY selection back to
work.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Commit edc7cb2 (usb: phy: make it a menuconfig) makes USB_MXS_PHY
be a sub-item of menuconfig symbol USB_PHY. This change gets the
selection of CONFIG_USB_MXS_PHY in imx_v6_v7_defconfig lost. Hence the
boot stops at the point below.
ci_hdrc ci_hdrc.0: doesn't support gadget
ci_hdrc ci_hdrc.0: EHCI Host Controller
ci_hdrc ci_hdrc.0: new USB bus registered, assigned bus number 1
Add CONFIG_USB_PHY to have the CONFIG_USB_MXS_PHY selection back to
work.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Drivers use cmpxchg64, cmpxchg64_local to perform 64-bit operation, so
they can cross 32-bit and 64-bit platforms (it is a standard way).
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Kay Sievers reported that coreutils' stat tool has a problem with
s390's statfs[64] definition:
> The definition of struct statfs::f_type needs a fix. s390 is the only
> architecture in the kernel that uses an int and expects magic
> constants lager than INT_MAX to fit into.
>
> A fix is needed to make Fedora boot on s390, it currently fails to do
> so. Userspace does not want to add code to paper-over this issue.
[...]
> Even coreutils cannot handle it:
> #define RAMFS_MAGIC 0x858458f6
> # stat -f -c%t /
> ffffffff858458f6
>
> #define BTRFS_SUPER_MAGIC 0x9123683E
> # stat -f -c%t /mnt
> ffffffff9123683e
The bug is caused by an implicit sign extension within the stat tool:
out_uint_x (pformat, prefix_len, statfsbuf->f_type);
where the format finally will be "%lx".
A similar problem can be found in the 'tail' tool.
s390 is the only architecture which has an int type f_type member in
struct statfs[64]. Other architectures have either unsigned ints or
long values, so that the problem doesn't occur there.
Therefore change the type of the f_type member to unsigned int, so
that we get zero extension instead of sign extension when assignment to
a long value happens.
This patch changes the s390 uapi struct stafs[64] definition in the kernel
to contain only unsigned values.
This was true for 32 bit builds anyway, since we use the generic uapi
header file in that case. So lets not include conditionally the generic
uapi header file but have the s390 implementation completely independent.
Also fix the types of struct compat_stafs to match reality and move the
definition of struct compat_statfs64 to asm/compat.h since it is not part
of the api.
Reported-by: Kay Sievers <kay@vrfy.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
For offset > PAGE_SIZE, s390_dma_map_pages() will issue a warning
and return a wrong dma address.
This patch removes the warning and fixes the dma return address
calculation.
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The compat definitions are not part of the uapi. So move them to
s390's private compat header file.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fix this one for !COMPAT:
compat.h: In function ‘arch_compat_alloc_user_space’:
compat.h:292:2: error: implicit declaration of function ‘is_compat_task’
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The f_spare field within struct compat_statfs is four bytes larger
than within the native 31 bit struct statfs.
compat_sys_statfs() clears the f_spare field in user space which
means that in compat mode four bytes that are behind the user space
supplied struct compat_statfs will be corrupted (zeroed).
According to Thomas Gleixner's Linux 2.6 history tree this bug is
present since v2.5.74 87880da124 "[PATCH] s390: 31 bit compat.".
So it get's fixed shortly before its 10th anniversary. Tough luck.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The gmap_map_segment function creates a special invalid segment table
entry with the address of the requested target location in the process
address space. The first access will create the connection between the
gmap segment table and the target page table of the main process.
If two threads do this concurrently both will walk the page tables and
allocate a gmap_rmap structure for the same segment table entry.
To avoid the race recheck the segment table entry after taking to page
table lock.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Implement gmap_translate() function which translates a guest absolute address
to a user space process address without establishing the guest page table
entries.
This is useful for kvm guest address translations where no memory access
is expected to happen soon (e.g. tprot exception handler).
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
need set '\0' for 'local_buffer'.
SPLPAR_MAXLENGTH is 1026, RTAS_DATA_BUF_SIZE is 4096. so the contents of
rtas_data_buf may truncated in memcpy.
if contents are really truncated.
the splpar_strlen is more than 1026. the next while loop checking will
not find the end of buffer. that will cause memory access violation.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
arch_dup_task_struct() does flush_ptrace_hw_breakpoint(src), this
destroys the parent's breakpoints for no reason. We should clear
child->thread.ptrace_bps[] copied by dup_task_struct() instead.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On 04/18/2013 07:38 PM, Anton Blanchard wrote:
> Since you are only reading one long you shouldn't need to check the
> update count and loop, you will always see a consistent value. The
> system call version of time() just does an unprotected load for example.
Fixed.
> With the above change and with Michael's comments covered (decent
> changelog entry and Signed-off-by):
>
> Acked-by: Anton Blanchard <anton@samba.org>
Thanks for the review, below the updated patch:
From: Adhemerval Zanella <azanella@linux.vnet.ibm.com>
This patch implement the time syscall as vDSO. The performance speedups
are:
Baseline PPC32: 380 nsec
Baseline PPC64: 350 nsec
vdso PPC32: 20 nsec
vsdo PPC64: 20 nsec
Tested on 64 bit build with both 32 bit and 64 bit userland.
Acked-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Adhemerval Zanella <azanella@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Conflicts:
drivers/net/ethernet/emulex/benet/be_main.c
drivers/net/ethernet/intel/igb/igb_main.c
drivers/net/wireless/brcm80211/brcmsmac/mac80211_if.c
include/net/scm.h
net/batman-adv/routing.c
net/ipv4/tcp_input.c
The e{uid,gid} --> {uid,gid} credentials fix conflicted with the
cleanup in net-next to now pass cred structs around.
The be2net driver had a bug fix in 'net' that overlapped with the VLAN
interface changes by Patrick McHardy in net-next.
An IGB conflict existed because in 'net' the build_skb() support was
reverted, and in 'net-next' there was a comment style fix within that
code.
Several batman-adv conflicts were resolved by making sure that all
calls to batadv_is_my_mac() are changed to have a new bat_priv first
argument.
Eric Dumazet's TS ECR fix in TCP in 'net' conflicted with the F-RTO
rewrite in 'net-next', mostly overlapping changes.
Thanks to Stephen Rothwell and Antonio Quartulli for help with several
of these merge resolutions.
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull MIPS fix from Ralf Baechle:
"Revert the change of the definition of PAGE_MASK which was prettier
but broke a few relativly rare platforms"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
Revert "MIPS: page.h: Provide more readable definition for PAGE_MASK."
This reverts commit c17a655478.
Manuel Lauss writes:
lmo commit c17a6554 (MIPS: page.h: Provide more readable definition for
PAGE_MASK) apparently breaks ioremap of 36-bit addresses on my Alchemy
systems (PCI and PCMCIA) The reason is that in arch/mips/mm/ioremap.c
line 157 (phys_addr &= PAGE_MASK) bits 32-35 are cut off. Seems the
new PAGE_MASK is explicitly 32bit, or one could make it signed instead
of unsigned long.
The builtin .dtb.S intermediate file needs to be marked with .SECONDARY
so that it isn't automatically deleted (which causes it to be
regenerated on every build). Also add *.dtb.S to clean-files so it gets
cleaned up by make clean.
Similarly, if the specified builtin dtb isn't already in dtb-y (e.g.
imported into the tree and specified in CONFIG_METAG_BUILTIN_DTB_NAME)
it too will be treated as an intermediate and deleted automatically
(again causing it to be regenerated on every build), so add it to dtb-y
so it gets added to targets and the dtbs target.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Cc: Grant Likely <grant.likely@secretlab.ca>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Stephen Warren <swarren@nvidia.com>
If we load the complete EFER MSR on entry or exit, EFER.LMA (and LME)
loading is skipped. Their consistency is already checked now before
starting the transition.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
As we may emulate the loading of EFER on VM-entry and VM-exit, implement
the checks that VMX performs on the guest and host values on vmlaunch/
vmresume. Factor out kvm_valid_efer for this purpose which checks for
set reserved bits.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Borislav Petkov reported a lockdep splat warning about kzalloc()
done in an IPI (hardirq) handler.
This is a real bug, do not call kzalloc() in a smp_call_function_single()
handler because it can schedule and crash.
Reported-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Tested-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: <eranian@google.com>
Cc: <a.p.zijlstra@chello.nl>
Cc: <acme@ghostprotocols.net>
Cc: <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20130421180627.GA21049@jshin-Toonie
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The logic for checking if interrupts can be injected has to be applied
also on NMIs. The difference is that if NMI interception is on these
events are consumed and blocked by the VM exit.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
vmx_set_nmi_mask will soon be used by vmx_nmi_allowed. No functional
changes.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
If userspace creates and destroys multiple VMs within the same process
we leak 20k of memory in the userspace process context per VM. This
patch frees the memory in kvm_arch_destroy_vm. If the process exits
without closing the VM file descriptor or the file descriptor has been
shared with another process then we don't free the memory.
It's still possible for a user space process to leak memory if the last
process to close the fd for the VM is not the process that created it.
However, this is an unexpected case that's only caused by a user space
process that's misbehaving.
Signed-off-by: Andrew Honig <ahonig@google.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Fix to return a negative error code from the error handling
case instead of 0, as returned elsewhere in this function.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Once L1 loads VMCS12 we enable shadow-vmcs capability and copy all the VMCS12
shadowed fields to the shadow vmcs. When we release the VMCS12, we also
disable shadow-vmcs capability.
Signed-off-by: Abel Gordon <abelg@il.ibm.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Synchronize between the VMCS12 software controlled structure and the
processor-specific shadow vmcs
Signed-off-by: Abel Gordon <abelg@il.ibm.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Introduce a function used to copy fields from the software controlled VMCS12
to the processor-specific shadow vmcs
Signed-off-by: Abel Gordon <abelg@il.ibm.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Introduce a function used to copy fields from the processor-specific shadow
vmcs to the software controlled VMCS12
Signed-off-by: Abel Gordon <abelg@il.ibm.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Unmap vmcs12 and release the corresponding shadow vmcs
Signed-off-by: Abel Gordon <abelg@il.ibm.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Allocate a shadow vmcs used by the processor to shadow part of the fields
stored in the software defined VMCS12 (let L1 access fields without causing
exits). Note we keep a shadow vmcs only for the current vmcs12. Once a vmcs12
becomes non-current, its shadow vmcs is released.
Signed-off-by: Abel Gordon <abelg@il.ibm.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
handle_vmon doesn't check if L1 is already in root mode (VMXON
was previously called). This patch adds this missing check and calls
nested_vmx_failValid if VMX is already ON.
We need this check because L0 will allocate the shadow vmcs when L1
executes VMXON and we want to avoid host leaks (due to shadow vmcs
allocation) if L1 executes VMXON repeatedly.
Signed-off-by: Abel Gordon <abelg@il.ibm.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Refactor existent code so we re-use vmcs12_write_any to copy fields from the
shadow vmcs specified by the link pointer (used by the processor,
implementation-specific) to the VMCS12 software format used by L0 to hold
the fields in L1 memory address space.
Signed-off-by: Abel Gordon <abelg@il.ibm.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Prepare vmread and vmwrite bitmaps according to a pre-specified list of fields.
These lists are intended to specifiy most frequent accessed fields so we can
minimize the number of fields that are copied from/to the software controlled
VMCS12 format to/from to processor-specific shadow vmcs. The lists were built
measuring the VMCS fields access rate after L2 Ubuntu 12.04 booted when it was
running on top of L1 KVM, also Ubuntu 12.04. Note that during boot there were
additional fields which were frequently modified but they were not added to
these lists because after boot these fields were not longer accessed by L1.
Signed-off-by: Abel Gordon <abelg@il.ibm.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Add logic required to detect if shadow-vmcs is supported by the
processor. Introduce a new kernel module parameter to specify if L0 should use
shadow vmcs (or not) to run L1.
Signed-off-by: Abel Gordon <abelg@il.ibm.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Add definitions for all the vmcs control fields/bits
required to enable vmcs-shadowing
Signed-off-by: Abel Gordon <abelg@il.ibm.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Now we've adjusted all the code, we can simply set switcher_addr to
wherever it needs to go below the fixmaps, rather than asserting that
it should be so.
With large NR_CPUS and PAE, people were hitting the "mapping switcher
would thwack fixmap" message.
Reported-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
ie. SHARED_SWITCHER_PAGES == 1. It is well under a page, and it's a
minor simplification: it's nice to have *one* simplification in a
patch series!
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
We currently use the whole top PGD entry for the switcher, but that's
hitting the fixmap in some configurations (mainly, large NR_CPUS).
Introduce a variable, currently set to the constant.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
In a previous commit the en_core_tk_irqen flag has been added but we missed
the cpuidle_wrap_enter which was doing the job to measure the time for the
'omap3_enter_idle' function.
Actually, I don't see any reason to use this wrapper in the code. In the better
case, the time computation is not correctly done because of the different
operations done in omap3_enter_idle_bm which were not taken into account
before the en_core_tk_irqen flag was set.
As the time is reflected for the state overridden by the omap3_enter_idle_bm,
using the wrapper is pointless now, so removing it.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Kevin Hilman <khilman@linaro.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 688036b538 removed the function
'shmobile_enter_wfi' but we forgot to remove the definition in the header file.
Note this function is just an alias to 'cpu_do_idle()' wrapped into a cpuidle
function callback prototype which already exists with the default WFI state
and the arm_simple_enter function.
Remove the function prototype.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Remove the shmobile_enter_wfi function which is the same as the
common WFI enter function from the arm cpuidle driver defined
with the ARM_CPUIDLE_WFI_STATE macro.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Registering the driver, or the device, can fail, let's check the return code
and return the error code to the PM layer.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Like all the other drivers, let's initialize the structure a compile time
instead of init time.
The states #1 and #2 are not enabled by default. The init function will
check the features of the board in order to enable the state.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The CPUIDLE_DRIVER_STATE_START constant is only set when the kernel compilation
option CONFIG_ARCH_HAS_CPU_RELAX is set, but this is only relatated to x86, so
it is always zero.
Remove the reference to this constant in the code.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The driver is a global static variable automatically initialized to zero.
Removing the useless initialization in the init function.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Support for NB counters, MSRs 0xc0010240 ~ 0xc0010247, got
moved to perf_event_amd_uncore.c in the following commit:
c43ca5091a perf/x86/amd: Add support for AMD NB and L2I "uncore" counters
AMD Family 10h NB events (events 0xe0 ~ 0xff, on MSRs 0xc001000 ~
0xc001007) will still continue to be handled by perf_event_amd.c
Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jacob Shin <jacob.shin@amd.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/1366046483-1765-2-git-send-email-jacob.shin@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
check_hw_exists() has a number of checks which go to two exit
paths: msr_fail and bios_fail. Checks classified as msr_fail
will cause check_hw_exists() to return false, causing the PMU
not to be used; bios_fail checks will only cause a warning to be
printed, but will return true.
The problem is that if there are both msr failures and bios
failures, and the routine hits a bios_fail check first, it will
exit early and return true, not finishing the rest of the msr
checks. If those msrs are in fact broken, it will cause them to
be used erroneously.
In the case of a Xen PV VM, the guest OS has read access to all
the MSRs, but write access is white-listed to supported
features. Writes to unsupported MSRs have no effect. The PMU
MSRs are not (typically) supported, because they are expensive
to save and restore on a VM context switch. One of the
"msr_fail" checks is supposed to detect this circumstance (ether
for Xen or KVM) and disable the harware PMU.
However, on one of my AMD boxen, there is (apparently) a broken
BIOS which triggers one of the bios_fail checks. In particular,
MSR_K7_EVNTSEL0 has the ARCH_PERFMON_EVENTSEL_ENABLE bit set.
The guest kernel detects this because it has read access to all
MSRs, and causes it to skip the rest of the checks and try to
use the non-existent hardware PMU. This minimally causes a lot
of useless instruction emulation and Xen console spam; it may
cause other issues with the watchdog as well.
This changset causes check_hw_exists() to go through all of the
msr checks, failing and returning false if any of them fail.
This makes sure that a guest running under Xen without a virtual
PMU will detect that there is no functioning PMU and not attempt
to use it.
This problem affects kernels as far back as 3.2, and should thus
be considered for backport.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Konrad Wilk <konrad.wilk@oracle.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Link: http://lkml.kernel.org/r/1365000388-32448-1-git-send-email-george.dunlap@eu.citrix.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add support for AMD Family 15h [and above] northbridge
performance counters. MSRs 0xc0010240 ~ 0xc0010247 are shared
across all cores that share a common northbridge.
Add support for AMD Family 16h L2 performance counters. MSRs
0xc0010230 ~ 0xc0010237 are shared across all cores that share a
common L2 cache.
We do not enable counter overflow interrupts. Sampling mode and
per-thread events are not supported.
Signed-off-by: Jacob Shin <jacob.shin@amd.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Stephane Eranian <eranian@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20130419213428.GA8229@jshin-Toonie
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The existing code assumes all Cbox and PCU events are using
filter, but actually the filter is event specific. Furthermore
the filter is sub-divided into multiple fields which are used
by different events.
Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: peterz@infradead.org
Cc: ak@linux.intel.com
Link: http://lkml.kernel.org/r/1366113067-3262-3-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reported-by: Stephane Eranian <eranian@google.com>
Conflicts:
arch/x86/kernel/cpu/perf_event_intel.c
Merge in the latest fixes before applying new patches, resolve the conflict.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull kdump fixes from Peter Anvin:
"The kexec/kdump people have found several problems with the support
for loading over 4 GiB that was introduced in this merge cycle. This
is partly due to a number of design problems inherent in the way the
various pieces of kdump fit together (it is pretty horrifically manual
in many places.)
After a *lot* of iterations this is the patchset that was agreed upon,
but of course it is now very late in the cycle. However, because it
changes both the syntax and semantics of the crashkernel option, it
would be desirable to avoid a stable release with the broken
interfaces."
I'm not happy with the timing, since originally the plan was to release
the final 3.9 tomorrow. But apparently I'm doing an -rc8 instead...
* 'x86-kdump-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
kexec: use Crash kernel for Crash kernel low
x86, kdump: Change crashkernel_high/low= to crashkernel=,high/low
x86, kdump: Retore crashkernel= to allocate under 896M
x86, kdump: Set crashkernel_low automatically
Pull x86 fixes from Peter Anvin:
"Three groups of fixes:
1. Make sure we don't execute the early microcode patching if family
< 6, since it would touch MSRs which don't exist on those
families, causing crashes.
2. The Xen partial emulation of HyperV can be dealt with more
gracefully than just disabling the driver.
3. More EFI variable space magic. In particular, variables hidden
from runtime code need to be taken into account too."
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, microcode: Verify the family before dispatching microcode patching
x86, hyperv: Handle Xen emulation of Hyper-V more gracefully
x86,efi: Implement efi_no_storage_paranoia parameter
efi: Export efi_query_variable_store() for efivars.ko
x86/Kconfig: Make EFI select UCS2_STRING
efi: Distinguish between "remaining space" and actually used space
efi: Pass boot services variable info to runtime code
Move utf16 functions to kernel core and rename
x86,efi: Check max_size only if it is non-zero.
x86, efivars: firmware bug workarounds should be in platform code
Pull ARM fixes from Russell King:
"A set of fixes from various people - Will Deacon gets a prize for
removing code this time around. The biggest fix in this lot is
sorting out the ARM740T mess. The rest are relatively small fixes."
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
ARM: 7699/1: sched_clock: Add more notrace to prevent recursion
ARM: 7698/1: perf: fix group validation when using enable_on_exec
ARM: 7697/1: hw_breakpoint: do not use __cpuinitdata for dbg_cpu_pm_nb
ARM: 7696/1: Fix kexec by setting outer_cache.inv_all for Feroceon
ARM: 7694/1: ARM, TCM: initialize TCM in paging_init(), instead of setup_arch()
ARM: 7692/1: iop3xx: move IOP3XX_PERIPHERAL_VIRT_BASE
ARM: modules: don't export cpu_set_pte_ext when !MMU
ARM: mm: remove broken condition check for v4 flushing
ARM: mm: fix numerous hideous errors in proc-arm740.S
ARM: cache: remove ARMv3 support code
ARM: tlbflush: remove ARMv3 support
Pull sparc fixes from David Miller:
1) Fix race in sparc64 TLB shootdowns, we have to synchronize with the
sibling cpus completing if we are passing them a reference via
pointer to a data structure.
2) Fix cleaning of bitmaps in sparc32, from Akinobu Mita.
3) Fix various sparc header mistakes, some of which resulted in
userland build breakage. From Sam Ravnborg.
4) Kill ghost declarations and defines missed when several bits of code
got deleted recently.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Fix race in TLB batch processing.
sparc: use asm-generic version of types.h
bbc_i2c: fix section mismatch warning
sparc: use generic headers
sparc:cleanup unused code in smp_32.h
sparc/iommu: fix typo s/265KB/256KB/
sparc/srmmu: clear trailing edge of bitmap properly
sparc:remove unused declaration smp_boot_cpus()
Matt Fleming (1):
x86, efivars: firmware bug workarounds should be in platform
code
Matthew Garrett (3):
Move utf16 functions to kernel core and rename
efi: Pass boot services variable info to runtime code
efi: Distinguish between "remaining space" and actually used
space
Richard Weinberger (2):
x86,efi: Check max_size only if it is non-zero.
x86,efi: Implement efi_no_storage_paranoia parameter
Sergey Vlasov (2):
x86/Kconfig: Make EFI select UCS2_STRING
efi: Export efi_query_variable_store() for efivars.ko
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
For each CPU vendor that implements CPU microcode patching, there will
be a minimum family for which this is implemented. Verify this
minimum level of support.
This can be done in the dispatch function or early in the application
functions. Doing the latter turned out to be somewhat awkward because
of the ineviable split between the BSP and the AP paths, and rather
than pushing deep into the application functions, do this in
the dispatch function.
Reported-by: "Bryan O'Donoghue" <bryan.odonoghue.lkml@nexus-software.ie>
Suggested-by: Borislav Petkov <bp@alien8.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1366392183-4149-1-git-send-email-bryan.odonoghue.lkml@nexus-software.ie
As reported by Dave Kleikamp, when we emit cross calls to do batched
TLB flush processing we have a race because we do not synchronize on
the sibling cpus completing the cross call.
So meanwhile the TLB batch can be reset (tb->tlb_nr set to zero, etc.)
and either flushes are missed or flushes will flush the wrong
addresses.
Fix this by using generic infrastructure to synchonize on the
completion of the cross call.
This first required getting the flush_tlb_pending() call out from
switch_to() which operates with locks held and interrupts disabled.
The problem is that smp_call_function_many() cannot be invoked with
IRQs disabled and this is explicitly checked for with WARN_ON_ONCE().
We get the batch processing outside of locked IRQ disabled sections by
using some ideas from the powerpc port. Namely, we only batch inside
of arch_{enter,leave}_lazy_mmu_mode() calls. If we're not in such a
region, we flush TLBs synchronously.
1) Get rid of xcall_flush_tlb_pending and per-cpu type
implementations.
2) Do TLB batch cross calls instead via:
smp_call_function_many()
tlb_pending_func()
__flush_tlb_pending()
3) Batch only in lazy mmu sequences:
a) Add 'active' member to struct tlb_batch
b) Define __HAVE_ARCH_ENTER_LAZY_MMU_MODE
c) Set 'active' in arch_enter_lazy_mmu_mode()
d) Run batch and clear 'active' in arch_leave_lazy_mmu_mode()
e) Check 'active' in tlb_batch_add_one() and do a synchronous
flush if it's clear.
4) Add infrastructure for synchronous TLB page flushes.
a) Implement __flush_tlb_page and per-cpu variants, patch
as needed.
b) Likewise for xcall_flush_tlb_page.
c) Implement smp_flush_tlb_page() to invoke the cross-call.
d) Wire up global_flush_tlb_page() to the right routine based
upon CONFIG_SMP
5) It turns out that singleton batches are very common, 2 out of every
3 batch flushes have only a single entry in them.
The batch flush waiting is very expensive, both because of the poll
on sibling cpu completeion, as well as because passing the tlb batch
pointer to the sibling cpus invokes a shared memory dereference.
Therefore, in flush_tlb_pending(), if there is only one entry in
the batch perform a completely asynchronous global_flush_tlb_page()
instead.
Reported-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Dave Kleikamp <dave.kleikamp@oracle.com>
cyc_to_sched_clock() is called by sched_clock() and cyc_to_ns()
is called by cyc_to_sched_clock(). I suspect that some compilers
inline both of these functions into sched_clock() and so we've
been getting away without having a notrace marking. It seems that
my compiler isn't inlining cyc_to_sched_clock() though, so I'm
hitting a recursion bug when I enable the function graph tracer,
causing my system to crash. Marking these functions notrace fixes
it. Technically cyc_to_ns() doesn't need the notrace because it's
already marked inline, but let's just add it so that if we ever
remove inline from that function it doesn't blow up.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Commit 6e6aac75 "ARM: EXYNOS: Migrate clock support to common
clock framework" from Thomas Abraham removed the Exynos5 specific
register definitions as they were unused at the time, but the
cpufreq driver actually still uses them.
Cc: Sylwester Nawrocki <s.nawrocki@samsung.com>
Cc: Tomasz Figa <t.figa@samsung.com>
Cc: Thomas Abraham <thomas.abraham@linaro.org>
Cc: Kukjin Kim <kgene.kim@samsung.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Only one remaining fix for arm-soc platforms at this time, a small
bugfix for cpu hotplug on highbank platforms that has become much
easier to hit as of late. Details in the patch description, but it's
small and well-contained and definitely impacts users of the platform,
so 3.9 seems appropriate.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJRcYGyAAoJEIwa5zzehBx3sYYP/iiongz+96eVHVxBMGMWHg0f
BcTtHw2ef0h4vCDMNUXmPEmVu74hN5VMCutUd2qeKvckyC7gEIrbVnLFwej05RiA
AFwIJ0A2pAjdvsAkRfTNXPpFs7bvZ+8r6cUJd4JGCP7IC8Zi4sj4dET1ZNEsaccD
vMQ+Y8O3dvVu1lEFfGT87huppQgCr/jzc6O9oc3eHDZ242v8tVLS31PpuZe8Qt25
8QvKKsY/UG82/aiz+ijlcddDJz132byNzrOvmY0DkcZ5ZMxbTnGUNFUqFszFkBYu
FrtAyJyQEyUYwY7r6geogfgtj17mQzbq1/4azcApmDQHadBhVbXdDuTW0EPO63QC
sQzf9JgWR66H5hOYeDp2ka1RbBh00k6byvh7T5adzgoJDtbHtJJ8OxW16OR/eoCQ
umCZ2rQxAQCpw11qRnA8LDwnmujr5qXFMuj5NqepUaDCLbWAq2VWeNTicu0LMgCN
RJZ7ifk94o+uCQETd0D2ZrZZtqvrbLtaLAgfa54PiMYecV0rRF44FUVDCIbUBRKq
5ouN76JfvbOQutLj41nA/QBr0ATCAw4mPoxaeaIwie1G7lXvZvMRRkjKLxDhe/qs
+3gZivBhQQuKEe3CYooLx3xoC/bs1VoMsyiRXa8YgdKsZbY0bIsQAUZp8dYjAgoW
hYf3zRgRd69+k0uqWA1f
=j3Hq
-----END PGP SIGNATURE-----
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"Only one remaining fix for arm-soc platforms at this time, a small
bugfix for cpu hotplug on highbank platforms that has become much
easier to hit as of late.
Details in the patch description, but it's small and well-contained
and definitely impacts users of the platform, so 3.9 seems
appropriate."
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: highbank: fix cache flush ordering for cpu hotplug
when compiling with allmodconfig, CONFIG_64BIT=y the file
drivers/base/regmap/regmap-mmio.c will use readq and writeq so we need
implement these functions.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
These patches get us closer to adding multiplatform support on
the Exynos platform, they are part of a longer series of
patches. This would get all the simple stuff out of the
way, and I don't think there is a big risk of introducing
regressions with these.
A lot of the other patches have already been merged into
subsystem trees. After this series in in arm-soc, what is
left comes down to
* The ASoC conversion to dmaengine won't make it unless someone
who knows that code better steps up to do it right away. This
means that we won't have audio in a 3.10 multiplatform kernel
on Exynos, but it will still work for users that don't enable
multiplatform.
* The irqchip (combiner), clk and clksource patches are all based
on top of other changesets we pulled in from your trees, so I
would not make them part of the next/multiplatform branch. We can
apply them on top of the next/drivers branch once they are
tested successfully.
* A trivial patch is needed in the end to actually make
CONFIG_ARCH_EXYNOS visible in multiplatform configurations.
We will do that as a separate patch once everything else is
there.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
This makes it possible to enable the exynos platform as part of a
multiplatform kernel, in addition to keeping the single-platform
exynos support.
The multiplatform variant has a number of limitations at the moment:
* It only supports DT-enabled machines. This is not a problem in
the long run, as non-DT machines for exynos are going away.
The main problem here is that the gpio code and the exynos_eint
irqchip are not multiplatform capable but still required for
ATAGS based boot.
* The watchdog driver is still missing a conversion.
* sparsemem and memory_holes are currently not supported in
multiplatform.
The the multiplatform aware ARCH_EXYNOS Kconfig symbol is disabled
for now, as dependent patches are still pending in other
subsystem trees. We will enable it once everything comes together.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Nothing outside of the rtc driver includes plat/regs-rtc.h,
so we can simply move the file into the same directory,
which allows us to build the file as platform-independent
code.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: rtc-linux@googlegroups.com
Cc: Alessandro Zummo <a.zummo@towertech.it>
Nothing uses the NAND register definitions other than the
actual driver, so we can move the header file into the
same local directory, which lets us build it in a multiplatform
configuration.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Cc: linux-mtd@lists.infradead.org
Cc: David Woodhouse <dwmw2@infradead.org>
plat/regs-sdhci.h is not used anywhere but in the sdhci-s3c
driver, so it can become a local file there and all other
inclusions removed.
plat/sdhci.h is used only to define the platform devices,
and with the exception of the platform_data structure not
needed by the driver, so we can split out the platform_data
definition instead and leave the rest to platform code.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Chris Ball <cjb@laptop.org>
For a DT-only build we don't want to compile devs.c, but we do need
the mfc device, which is also referenced by the DT based platforms,
so move it all into one place.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The move is necessary to support early debug output on exynos
with multiplatform configurations. This implies also moving the
plat/debug-macro.S file, but we are leaving the remaining users of that
file in place, to avoid adding large numbers of extra configuration
options to Kconfig.debug
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
When we enable CONFIG_SPARSE_IRQ, we have to set the value of NR_IRQS in
the machine_desc for legacy IRQ domains, and any file referring to the
number of interrupts or a specific number must include the mach/irqs.h
header file explicitly.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
As a preparation for multiplatform support, this introduces
a new Kconfig symbol to split the ATAGS based EXYNOS platforms
from the DT based ones. Turning off CONFIG_EXYNOS_ATAGS disables
all platforms that are not yet converted to DT, and we can
have code that relies on DT checking for this symbol being
disabled.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Add code to handle DRAM ECC errors decoding for Fam16h.
Tested on Fam16h with ECC turned on using the mce_amd_inj facility and
works fine.
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
[ Boris: cleanups and clarifications ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Remove the majority of cache flushing calls from the individual platform
files. This is now handled by the core code.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Flush the L1 cache for the CPU which is going down in cpu_die() so
that we don't end up with all platforms doing this. This ensures
that any cache lines we own are pushed out before the cache becomes
inaccessible.
We may end up subsequently creating some dirty cache lines - for
example, with the complete() call, but this update must become
visible to other CPUs before __cpu_die() can proceed. Subsequent
accesses from the platforms cpu_die() function should _not_ matter.
Also place a mb() after the complete() call to ensure that this is
visible to other CPUs.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Factor out the SP810 clocking code into a separate driver,
selecting better (faster) parent at clk_prepare() time.
This is to avoid problems with clocking infrastructure
initialisation order, in particular to avoid dependency
of fixed clock being initialized before SP810. It also
makes vexpress platform OF-based clock initialisation code
unnecessary.
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
[mturquette@linaro.org: add .unprepare, FIXME comment, cleaned up code]
The tegra cpu_disable() function is the same as the generic version
in arch/arm/kernel/smp.c. Therefore, it can be removed.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The L1 data cache flush needs to be after highbank_set_cpu_jump call which
pollutes the cache with the l2x0_lock. This causes other cores to deadlock
waiting for the l2x0_lock. Moving the flush of the entire data cache after
highbank_set_cpu_jump fixes the problem. Use flush_cache_louis instead of
flush_cache_all are that is sufficient to flush only the L1 data cache.
flush_cache_louis did not exist when highbank_cpu_die was originally
written.
With PL310 errata 769419 enabled, a wmb is inserted into idle which takes
the l2x0_lock. This makes the problem much more easily hit and causes
reset to hang.
Reported-by: Paolo Pisati <p.pisati@gmail.com>
Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
When building the kernel with CONFIG_THUMB2_KERNEL enabled, older
assemblers may emit the following error:
reset-handler.S:78: Error: invalid immediate for address calculation (value = 0x00000004)
Using an explicit adr.w instruction will solve this. Newer assemblers do
this automatically. Use the W() macro to do this under Thumb mode only.
Inspired-by: Joseph Lo <josephl@nvidia.com>
Suggested-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
This fixes the building error when the PM_SLEEP is disabled. The fucntional
defintion of "tegra_pm_validate_suspend_mode" without "static inline"
would become a multiple definition error.
Reported-by: Rhyland Klein <rklein@nvidia.com>
Signed-off-by: Joseph Lo <josephl@nvidia.com>
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Olof Johansson <olof@lixom.net>
The conditional branch instruction in Thumb2 only available to short range.
The linker will fail when the conditional branch over the range. Then
resulting in link error when generating kernel image. e.g.:
arch/arm/mach-tegra/reset-handler.S:47:(.text+0xf8e):
relocation truncated to fit: R_ARM_THM_JUMP19 against symbol
`cpu_resume' defined in .data section in arch/arm/kernel/built-in.o
This patch using a Thumb2 instruction IT (if-then) to have a longer branch
range.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Joseph Lo <josephl@nvidia.com>
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Reviewed-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>
This patch fix the build failure when CONFIG_THUBM2_KERNEL enabled. You
clould see the error message below:
arch/arm/mach-tegra/sleep-tegra30.S:69: Error: shift must be constant --
`orr r12,r12,r4,lsl r3'
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Joseph Lo <josephl@nvidia.com>
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Reviewed-by: Dave Martin <dave.martin@linaro.org>
Signed-off-by: Olof Johansson <olof@lixom.net>